Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Building sdist is slow with wide exclude pattern #603

Closed
1 task done
Lothiraldan opened this issue Nov 6, 2018 · 3 comments · Fixed by #604
Closed
1 task done

Building sdist is slow with wide exclude pattern #603

Lothiraldan opened this issue Nov 6, 2018 · 3 comments · Fixed by #604
Labels
area/build-system Related to PEP 517 packaging (see poetry-core) kind/bug Something isn't working as expected

Comments

@Lothiraldan
Copy link
Contributor

  • I have searched the issues of this repo and believe that this is not a duplicate.

Issue

I'm working on a project with the following pyproject.toml: https://gist.github.com/Lothiraldan/2c0b5ce0171e8450490e3b493e7c2960 and I want to ship a React project inside my package.

Include and exclude are working great, far easier to configure than MANIFEST.IN IMHO.

My issue is that the builder code is walking all of the directories and for each file check that it's not in the excluded list. One of my exclude pattern is "balto/web_interfaces/balto_react/node_modules/**/*" which generates a lot of matching files. The length of the excluded file list is 28761 in my case.

This makes the following line https://github.com/sdispater/poetry/blob/master/poetry/masonry/builders/sdist.py#L281 quite slow. A build takes about 4 minutes on my laptop.

Here is a py-spy dump of the process:

Collecting samples from 'pid: 31302' (python v3.6.6)
Total Samples 5100
GIL: 0.00%, Active: 95.50%, Threads: 1

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)                                                                                                                                                        
 38.00%  51.00%   10.55s    14.14s   __eq__ (4/python3.6/pathlib.py:736)
 29.00%  93.00%    6.64s    24.87s   <listcomp> (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/poetry/masonry/builders/sdist.py:281)
 16.50%  16.50%    4.38s     4.38s   _cparts (4/python3.6/pathlib.py:728)
  7.50%  11.00%    2.81s     3.65s   __eq__ (4/python3.6/pathlib.py:734)
  1.50%   1.50%   0.015s    0.015s   run (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/cleo/application.py:104)
  1.00%   1.00%   0.130s    0.130s   _cparts (4/python3.6/pathlib.py:724)
  1.00%   2.00%   0.315s    0.435s   __eq__ (4/python3.6/pathlib.py:733)
  0.50%   0.50%   0.025s    0.035s   parse_parts (4/python3.6/pathlib.py:87)
  0.50%   0.50%   0.165s    0.180s   wrapped (4/python3.6/pathlib.py:387)
  0.00%   1.00%   0.000s    0.355s   find_packages (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/poetry/masonry/builders/sdist.py:277)
  0.00%   0.00%   0.030s    0.030s   _get_sep (4/python3.6/posixpath.py:45)
  0.00%  94.00%   0.000s    25.36s   execute (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/cleo/commands/command.py:107)
  0.00%   0.00%   0.010s    0.075s   <listcomp> (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/poetry/masonry/builders/sdist.py:276)
  0.00%   0.00%   0.025s    0.025s   _select_from (4/python3.6/pathlib.py:529)
  0.00%  94.00%   0.000s    25.36s   do_run (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/cleo/application.py:197)
  0.00%  94.00%   0.000s    25.36s   build (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/poetry/masonry/builder.py:21)
  0.00%  94.00%   0.000s    25.36s   do_run (/home/lothiraldan/.local/pipx/venvs/poetry/lib64/python3.6/site-packages/poetry/console/application.py:88)

Press Control-C to quit, or ? for help.

I have some ideas about how to make it faster, I will send some patches if that's okay.

Lothiraldan added a commit to Lothiraldan/poetry that referenced this issue Nov 6, 2018
Before we build a potentially very big list of all the excluded file matching
every exclude pattern, which could be a lot and then do a containment check on
the resulting list.

A first step to speed up the process is to returns a set instead of a list so
containment checks will not be dependent to the number of matching files
anymore.

If the set still ends up being too large, introducing an intermediary solution
that check if a file name match a glob pattern instead of expanding the glob
pattern might be a good solution. Right now, it seems overkill in comparison
of this patch.

With this patch, building my project that exclude `nodes_modules/**/*` went
from 4 minutes to 7 seconds.

Fix python-poetry#603
@sdispater sdispater added kind/bug Something isn't working as expected area/build-system Related to PEP 517 packaging (see poetry-core) labels Nov 7, 2018
sdispater pushed a commit that referenced this issue Nov 8, 2018
Before we build a potentially very big list of all the excluded file matching
every exclude pattern, which could be a lot and then do a containment check on
the resulting list.

A first step to speed up the process is to returns a set instead of a list so
containment checks will not be dependent to the number of matching files
anymore.

If the set still ends up being too large, introducing an intermediary solution
that check if a file name match a glob pattern instead of expanding the glob
pattern might be a good solution. Right now, it seems overkill in comparison
of this patch.

With this patch, building my project that exclude `nodes_modules/**/*` went
from 4 minutes to 7 seconds.

Fix #603
@sdispater
Copy link
Member

Just for information, commit cb79ffb further improve performances

@Lothiraldan
Copy link
Contributor Author

Cool! I will make sure to test it against my package!

Copy link

github-actions bot commented Mar 3, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/build-system Related to PEP 517 packaging (see poetry-core) kind/bug Something isn't working as expected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants