Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python auto doc #665

Merged
merged 8 commits into from
Dec 26, 2023
Merged

Python auto doc #665

merged 8 commits into from
Dec 26, 2023

Conversation

wrongbad
Copy link
Contributor

This is based on top of the "more numpy" branch, so look at just the top commit for now.

I threw together a quick proof of concept. Obviously the script can use a lot of refactoring to add more sources with less redundancy.

One pain point already is method overloading and how to generate keys/ids to reference them. I put a simple "_1", "_2" mechanism in, but this can cause problems when adding new overloads and docstrings silently get shifted to wrong functions. I looked at doxygen XML outputs and it seems they generate some arbitrary hash for overloads, so it would likely not be a stable reference from the binding cpp file.

I could also try parentheses matching to scoop the full function name, remove all whitespace and hash it, then you only need to update the key if the function signature changes.

Copy link

codecov bot commented Dec 18, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (7d564cf) 91.66% compared to head (50ab9f7) 91.66%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #665   +/-   ##
=======================================
  Coverage   91.66%   91.66%           
=======================================
  Files          37       37           
  Lines        4737     4737           
=======================================
  Hits         4342     4342           
  Misses        395      395           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pca006132
Copy link
Collaborator

For overloads, we can probably use the parameter names as keys. It will be nice if we can make some macro that set the nb::arg and fill in the parameter name in the docstring.

@wrongbad
Copy link
Contributor Author

While parsing parameters names and types would be nice, I fear it quickly spirals into general-purpose c++ parsing, which is more than a 100-line python script can handle

@pca006132
Copy link
Collaborator

I think the function signature is simple in our case, we can make sure that there are no parenthesis in the parameter type. This can make it readable using regex. We can just change our source file to make it easy to parse, instead of changing the script to parse whatever c++ madness.

@wrongbad
Copy link
Contributor Author

Ah I'm reminded why I prefer to amend and force-push when working on PR. Rebasing this branch on master is a huge mess

@elalish
Copy link
Owner

elalish commented Dec 20, 2023

No need to bother with rebasing - just use a regular merge. That's why we squash.

#pragma once
#include <string>
namespace manifold_docs {
const auto cross_section__area_f5458809be32 = R"___(Return the total area covered by complex polygons making up the
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file generated by your script? Would be great to have some documentation about how the system works and the right way to maintain it. Looks promising though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated a comment in the file, and added comments in the gen_docs.py script.

@elalish
Copy link
Owner

elalish commented Dec 21, 2023

This is looking great - is it ready for review?

@zalo
Copy link
Contributor

zalo commented Dec 21, 2023

Is the “_1”, “_2” overload handling mechanism still in there; how does it work? (I would be kind of sad if callers had to disambiguate overloads by appending the number of the overload to the function name… OpenCascade.js eventually went this direction for typescript annotations)

If that’s the case, perhaps a lesser evil would be marking non-intersecting args as optional?

@pca006132
Copy link
Collaborator

I think the auto doc is just for documentation, it does not affect overload resolution in the runtime. And it is using the argument names as suffices to disambiguate different overloads, e.g. manifold__manifold vs manifold__manifold__mesh_gl__property_tolerance.

@wrongbad
Copy link
Contributor Author

wrongbad commented Dec 21, 2023

Yes I think it's ready for review.

And yes the _1 is gone and these overload names don't affect API, only binding sources for maintainers.

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me, thank you! So the CI builds these docs automatically and they'll get packaged up into our Python releases? How do we test them?


# we don't handle inline functions in classes properly
# so instead just white-list functions we want
def select_functions(s):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add something in a README somewhere that when a new non-member function is added to the API it needs to also be added to this list?



collect(f"{base}/src/manifold/src/manifold.cpp", lambda s: method_re.search(s))
collect(f"{base}/src/manifold/src/constructors.cpp", lambda s: method_re.search(s))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps also add to a README that when new files are added with API documentation, they need to also be added as a collect line?



def method_key(name):
name = re.sub("\+", "_plus", name)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need to add a substitution of MeshGL to mesh, since several docs refer to this data structure by its C++ name.

@wrongbad
Copy link
Contributor Author

wrongbad commented Dec 22, 2023

@elalish What specifically are you hoping to test? At least right now empty doc-strings are omitted, which would result in compile errors if a code change caused one of them to get lost somehow. But other than that, I'm not sure automated testing can know if a doc-string is "correct".

I guess we could also run the nanobind-stubgen script on the bindings and check that the param names match the function signature.

@elalish
Copy link
Owner

elalish commented Dec 22, 2023

Well, it doesn't have to be automation. I just feel like before I publish a release I'd like to be able to see the python docs in context and verify they look right. More of some README steps; maybe run the script and then hover over some python in VSCode and see if the doc string looks right? I just want to make sure I'm looking at the most recent version and such.

@zalo
Copy link
Contributor

zalo commented Dec 22, 2023

If all it takes to build locally is to just run CMake on this branch, then, I think I have a few examples:

image
image

Here's an example of Github Copilot giving suggestions based on context alongside the docs:
ScaleDocs

Also, it seems like sphere(), cube(), and hull_points() don't have docs... There are probably more missing...
Which is weird, because I see them here, so I know the script ran...
docstrings.inl.txt

@pca006132
Copy link
Collaborator

I think for some language servers you need the stub file.

@wrongbad
Copy link
Contributor Author

For basic human QA of python docs, I would recommend the following:

pip install nanobind-stubgen
# in manifold repo root
pip install . 
nanobind-stubgen manifold3d

Open manifold3d.pyi and read it.

@elalish
Copy link
Owner

elalish commented Dec 23, 2023

Thanks @wrongbad, but nanobind-stubgen manifold3d gave me WARNING:root:Function is not valid python code: Initialize self. See help(type(self)) for accurate signature.

Would you mind adding these directions to your README? I'm guessing there's a step missing, or my python setup is weird somehow. @zalo does it work for you now? Would also be great if you could paste manifold3d.pyi into a comment here so others can review it easily.

@pca006132
Copy link
Collaborator

pca006132 commented Dec 23, 2023

I also got the same warning, it seems that it is related to enum codegen, which I guess is complaining the lack of self argument in __init__.

class JoinType:
    """
    <attribute '__doc__' of 'JoinType' objects>
    """

    @entries: dict
    
    Miter: JoinType
    
    Round: JoinType
    
    Square: JoinType
    
    def __init__(*args, **kwargs):
        """
        Initialize self.  See help(type(self)) for accurate signature.
        """
        ...

There are also other issues, copying (and updating) my previous comment to here:

  • Does not annotate static methods.
  • Does not annotate every overloaded method. I think it does not annotate the first one and jedi is not happy about this.
  • Cannot figure out the type for things like Floatx3
  • If the return type is a tuple, e.g. tuple with 6 floats, it can only infer tuple as the return type.

@pca006132
Copy link
Collaborator

I think for now we can use a postprocessor to fix some of the issues, and open a PR in the upstream to make the real fix later.

@wrongbad
Copy link
Contributor Author

Yeah I get the warnings too but it seems non-critical. I updated the readme and included a simpler option as well:

pip install .
python -c 'import manifold3d; help(manifold3d)'

@elalish
Copy link
Owner

elalish commented Dec 24, 2023

@pca006132 It seems like most of what you're referring to is working for me, unless I'm not understanding this right:

overloads:

    class Manifold(builtins.object)
     |  Methods defined here:
     |  
     |  __add__(...)
     |      __add__(self, arg: manifold3d.Manifold, /) -> manifold3d.Manifold
     |      
     |      Shorthand for Boolean Union.
     |  
     |  __init__(...)
     |      __init__(self) -> None
     |      __init__(self, mesh: manifold3d.Mesh, property_tolerance: list[float] = []) -> None
     |      
     |      Overloaded function.
     |      
     |      1. ``__init__(self) -> None``
     |      
     |      Construct an empty Manifold.
     |      
     |      2. ``__init__(self, mesh: manifold3d.Mesh, property_tolerance: list[float] = []) -> None
``
     |      
     |      Convert a Mesh into a Manifold, retaining its properties and merging only
     |      the positions according to the merge vectors. Will return an empty Manifold
     |      and set an Error Status if the result is not an oriented 2-manifold. Will
     |      collapse degenerate triangles and unnecessary vertices.
     |      All fields are read, making this structure suitable for a lossless round-trip
     |      of data from GetMesh. For multi-material input, use ReserveIDs to set a
     |      unique originalID for each material, and sort the materials into triangle
     |      runs.
     |      :param mesh: The input Mesh.
     |      :param property_tolerance: A vector of precision values for each property
     |      beyond position. If specified, the propertyTolerance vector must have size =
     |      numProp - 3. This is the amount of interpolation error allowed before two
     |      neighboring triangles are considered to be on a property boundary edge.
     |      Property boundary edges will be retained across operations even if the
     |      triangles are coplanar. Defaults to 1e-5, which works well for most
     |      properties in the [-1, 1] range.

static methods:

     |  Static methods defined here:
     |  
     |  __new__(*args, **kwargs) from nanobind.nb_type_0
     |      Create and return a new object.  See help(type) for accurate signature.
     |  
     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |  
     |  batch_hull = <nanobind.nb_func object>
     |      batch_hull(manifolds: list[manifold3d.Manifold]) -> manifold3d.Manifold
     |      
     |      Compute the convex hull enveloping a set of manifolds.
     |      :param manifolds: A vector of manifolds over which to compute a convex hull.

This is looking pretty decent to me - @pca006132 should we merge this, or is there some kind of regression you see?

@pca006132
Copy link
Collaborator

ok, maybe it is a version problem on my side, I installed it earlier. yeah we can merge this.

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to assume this is all good - it certainly cleans up our code nicely. I wish I knew Python well enough to feel some confidence in verifying it, but I'll assume our users will file issues if they notice problems. Thank you!

@elalish elalish merged commit 54fa822 into elalish:master Dec 26, 2023
@elalish elalish mentioned this pull request Jan 7, 2024
cartesian-theatrics pushed a commit to SovereignShop/manifold that referenced this pull request Mar 11, 2024
* python autogen docstrings

* use function signature hashes in docstring key

* remove sys.path debug code

* docstrings use param names

* autogen docstrings during cmake

* document gen_docs.py

* more readme, remap mesh_gl references

* readme about verify python docstrings, autogen depend on sources
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants