Prefer requirements with upper bounds #13273

Open: notatallshaw wants to merge 6 commits into main from prefer-requirements-with-upper-bounds
Conversation

@notatallshaw (Member) commented Mar 8, 2025

Fixes: #12993
Fixes: #12990

This is the last PR related to get_preference I am making this release cycle. It provides significant performance improvements for some pathological resolutions, and I don't have any examples where it makes the resolution worse.

There are a number of intuitive ways to think about why this helps find a resolution faster. The one I like most is that operators which disallow versions after a certain point limit the possible versions to some finite set (ignoring that, technically, a package could simply never bump its major/minor version number), whereas other operators still allow unbounded future versions to be selected.
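As an illustrative sketch of that distinction (not pip's actual implementation; the function name and operator grouping are assumptions for the example), classifying which specifier operators cap versions from above might look like this:

```python
def has_upper_bound(specifiers: str) -> bool:
    """Return True if any comma-separated specifier clause rules out
    all sufficiently new versions of a package.

    Hypothetical helper for illustration only, not pip's code.
    """
    # <, <=, ==, ~= and === all exclude future versions;
    # >, >= and != leave the future unbounded.
    capping = {"<", "<=", "==", "~=", "==="}
    for clause in specifiers.split(","):
        clause = clause.strip()
        # Extract the leading operator characters of the clause.
        op = ""
        for ch in clause:
            if ch in "<>=!~":
                op += ch
            else:
                break
        if op in capping:
            return True
    return False

print(has_upper_bound(">=1.0,<2.0"))  # True
print(has_upper_bound(">=1.0"))       # False
```

A real implementation would parse specifiers with the `packaging` library rather than by hand; the point here is only which operators bound the version space from above.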

Checking against the scenarios in Pip-Resolution-Scenarios-and-Benchmarks, there are two problematic scenarios that go from failing to succeeding compared to the main branch:

Difference for scenario scenarios/problematic.toml - autogluon:
    	Success: False -> True
    	Failure Reason: Build Failure -> None
    	Distinct Sdists visited: 1 -> 2 (200.00%)
    	Distinct Wheels visited: 162 -> 259 (159.88%)
    	Total visited packages: 2398 -> 1254 (52.29%)
    	Total visited requirements: 23945 -> 13006 (54.32%)
    	Total rejected requirements: 12791 -> 4101 (32.06%)
    	Total pinned packages: 729 -> 536 (73.53%)
    	Total rounds: 1283 -> 768 (59.86%)

Difference for scenario scenarios/problematic.toml - sphinx:
    	Success: False -> True
    	Failure Reason: Resolution Too Deep -> None
    	Distinct Sdists visited: 1 -> 0 (0.00%)
    	Distinct Wheels visited: 124 -> 79 (63.71%)
    	Total visited packages: 97707 -> 80 (0.08%)
    	Total visited requirements: 225360 -> 225 (0.10%)
    	Total rejected requirements: 137840 -> 6 (0.00%)
    	Total pinned packages: 8113 -> 82 (1.01%)
    	Total rounds: 15005 -> 82 (0.55%)

Note, for the latter scenario I have the benchmark tool stop at ~15k rounds, whereas normally pip won't stop until 200k rounds.

@notatallshaw force-pushed the prefer-requirements-with-upper-bounds branch from 2d2ad3c to de28f96 on March 8, 2025 at 20:31
@notatallshaw notatallshaw added this to the 25.1 milestone Mar 8, 2025
@pfmoore (Member) commented Mar 8, 2025

My only concern with this is that it could result in people thinking that upper bounds are good, whereas it's generally acknowledged that they are a bad idea in practice. I can imagine people seeing this change and thinking that it means that if they add upper version caps to their requirements, they will get faster resolution times "magically".

While the performance numbers can't lie, I would be much happier if we could give a clear explanation of why and when this improves resolution times, in a way that doesn't make it appear that you'll always be better with an upper cap. In particular, the intuition you suggest makes no sense to me - upper version limits don't change an infinite (or even large) set of possibilities into a finite one, the set of possibilities is always finite.

I think this is more a matter of how we present the change than a problem with the actual logic, though. If I understand correctly, the real saving is due to the fact that, because pip always tries the latest version available first, requirements that guarantee that one or more of the newest versions will not work allow us to skip those newer versions sooner, so we should prefer them over requirements that don't. The key is that an upper cap is only useful to us if it actually excludes new versions, and therefore it's an explicit "the project is known not to work with recent versions of this dependency" rather than the pre-emptive "we've only tested with dependency versions up to X" cap that's called out as problematic in the article I linked.
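The ordering idea described above can be sketched as a sort key (illustrative only; the `(name, has_cap)` pairs and function are invented for the example and are not pip's actual get_preference):

```python
# Requirements known to exclude the newest versions are handled first,
# so guaranteed-incompatible new versions are skipped sooner.
def preference_key(requirement):
    name, has_cap = requirement
    # False sorts before True, so negate: capped requirements come first.
    return (not has_cap, name)

reqs = [("urllib3", False), ("requests", True), ("idna", False)]
print(sorted(reqs, key=preference_key))
# [('requests', True), ('idna', False), ('urllib3', False)]
```

In pip itself, get_preference returns a richer key covering several heuristics; the upper-bound signal discussed in this PR would be one component of that ordering.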

I don't know how we get that much nuance[1] in a release note, though. Maybe we'd be best simply not explaining in the news item - just saying something like "Improve the heuristics for determining the order for resolving dependencies", and adding a short explanatory note ("because this allows us to skip guaranteed-incompatible later versions sooner" or similar) to the docstring and documentation.

Footnotes

[1] About something that's frankly an obscure internal detail.

@notatallshaw (Member, Author)

@pfmoore I would say your whole comment is on point and I agree with your concern. In fact it's exactly because adding upper bound requirements makes resolution hard that it's useful to consider them before other requirements, so I definitely don't want to encourage a proliferation of them.

I will rethink the documentation, explaining more of the nuance, and the news items, simplifying it to not go into the nuance.

@notatallshaw (Member, Author) commented Mar 8, 2025

In particular, the intuition you suggest makes no sense to me - upper version limits don't change an infinite (or even large) set of possibilities into a finite one, the set of possibilities is always finite.

It's not really important, but the thought process goes like this:

Let's say versions are released in a strictly monotonically increasing fashion (not always true, but often true), e.g. 1.0, 1.1, 1.2, 1.3, ..., 1.99, and there is a requirement <1.100. Then that upper bound limits the number of possibilities to 100 versions for the rest of time, whereas an unbounded number of versions can be released after this, e.g. 1.100, ..., 1.10000, ..., 2.0, 2.1, etc.
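A quick numeric illustration of that intuition, using toy version tuples rather than real releases:

```python
# Toy illustration: with 150 1.x releases plus two 2.x releases already
# out, a <1.100 cap admits a fixed 100 versions no matter what is
# released later, while >=1.0 admits every future release as well.
released = [(1, minor) for minor in range(150)] + [(2, 0), (2, 1)]

capped = [v for v in released if v < (1, 100)]
uncapped = [v for v in released if v >= (1, 0)]

print(len(capped))    # 100: fixed forever by the upper bound
print(len(uncapped))  # 152: grows with every new release
```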

Another, related way of thinking about it is that upper bounds make statements about the future of a package, which usually is not known for sure, and can therefore be problematic; lower bounds say something about the past, which can be known, so they tend not to be problematic. Resolving problematic requirements first is usually better.

Your intuitive explanation is probably a better one, and I'll update the docs to reflect it.

@potiuk (Contributor) commented Mar 9, 2025

@pfmoore I would say your whole comment is on point and I agree with your concern. In fact it's exactly because adding upper bound requirements makes resolution hard that it's useful to consider them before other requirements, so I definitely don't want to encourage a proliferation of them.

100%. And thanks for the article link @pfmoore - I have not seen it before, but this is pure gold.

@notatallshaw (Member, Author)

I've updated docs and news entry based on feedback.

I'm limiting this PR to just touching the get_preference docs, but I do think it's worthwhile, in a separate PR, to take a look over the dependency resolution docs in general. I'm planning to raise another PR soon that makes ResolutionTooDeep a diagnostic error and, as part of that, points users to the resolution docs, so I will reread the docs and see what can be improved.

@notatallshaw force-pushed the prefer-requirements-with-upper-bounds branch from a5bc3d2 to 1e68d9c on March 9, 2025 at 20:57
@ichard26 (Member) left a comment

While I'm far from being knowledgeable on dependency resolution, I did spend some time reading the pip documentation on dependency resolution. This makes a lot more sense. LGTM (but this isn't a formal approval).
