Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce overhead in converting resolutions #11660

Merged
merged 2 commits into from
Feb 20, 2025

Conversation

konstin
Copy link
Member

@konstin konstin commented Feb 20, 2025

Solving spent a chunk of its time just converting resolutions, the left two blocks:

image

These blocks are ResolverOutput::from_state with 1.3% and ForkState::into_resolution with 4.1% of resolver thread runtime for apache airflow universal.

We reduce the overhead spent in those functions, to now 1.1% and 2.1% of resolver time spend in those functions by:

Commit 1: Replace the hash set for the edges with a vec in ForkState::into_resolution. We deduplicate edges anyway when collecting them, and the hash-and-insert was slow.

Commit 2: Reduce the distribution clonign in ResolverOutput::from_state by using an Arc.

The same profile excerpt for the resolver with the branch (note that there is now an unrelated block between the two we optimized):

image

Wall times are noisy, but the profiles show those changes as improvements.

$ hyperfine --warmup 2 "./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal" "./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal"
Benchmark 1: ./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal
  Time (mean ± σ):      99.1 ms ±   3.8 ms    [User: 111.8 ms, System: 115.5 ms]
  Range (min … max):    93.6 ms … 110.4 ms    29 runs
 
Benchmark 2: ./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal
  Time (mean ± σ):      97.1 ms ±   4.3 ms    [User: 114.8 ms, System: 112.0 ms]
  Range (min … max):    90.9 ms … 112.4 ms    29 runs
 
Summary
  ./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal ran
    1.02 ± 0.06 times faster than ./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal

Comment on lines -3070 to +3080
pub(crate) edges: FxHashSet<ResolutionDependencyEdge>,
pub(crate) edges: Vec<ResolutionDependencyEdge>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the big perf improvement

Building the `HashSet` was a block in the profile (2.6% of the resolver thread). We don't need to hash at all since we're deduplicating edges when reading them anyway.

The benchmark isn't too clear but the profile show it as an improvement.

```
Benchmark 1: ./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal
  Time (mean ± σ):     102.0 ms ±   6.8 ms    [User: 116.4 ms, System: 114.6 ms]
  Range (min … max):    93.1 ms … 115.8 ms    32 runs

Benchmark 2: ./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal
  Time (mean ± σ):      98.6 ms ±   4.4 ms    [User: 114.0 ms, System: 112.7 ms]
  Range (min … max):    93.3 ms … 112.4 ms    31 runs

Summary
  ./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal ran
    1.03 ± 0.08 times faster than ./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal
```
@konstin konstin force-pushed the konsti/reduce-edge-hashing branch from be3ea7e to 1d7b6f9 Compare February 20, 2025 11:03
@@ -393,7 +394,7 @@ pub struct Plan {

/// The distributions that are not already installed in the current environment, and are
/// not available in the local cache.
pub remote: Vec<Dist>,
pub remote: Vec<Arc<Dist>>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is isn't necessary but we avoid the clone as we already have Arcs here.

In `ResolverOutput::from_state`, we spend a lot of time in cloning `ResolvedDist`. We can reduce that time by using an `Arc` around the `Dist`s in `ResolvedDist`.
@konstin konstin force-pushed the konsti/reduce-edge-hashing branch from 1d7b6f9 to f465616 Compare February 20, 2025 11:21
@zanieb zanieb added the performance Potential performance improvement label Feb 20, 2025
@charliermarsh charliermarsh merged commit ae916cf into main Feb 20, 2025
74 checks passed
@charliermarsh charliermarsh deleted the konsti/reduce-edge-hashing branch February 20, 2025 20:13
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Feb 25, 2025
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.6.0` -> `0.6.3` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.6.3`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#063)

[Compare Source](astral-sh/uv@0.6.2...0.6.3)

##### Enhancements

-   Allow quotes around command-line options in `requirement.txt files` ([#&#8203;11644](astral-sh/uv#11644))
-   Initialize PEP 723 script in `uv lock --script` ([#&#8203;11717](astral-sh/uv#11717))

##### Configuration

-   Accept multiple `.env` files in `UV_ENV_FILE` ([#&#8203;11665](astral-sh/uv#11665))

##### Performance

-   Reduce overhead in converting resolutions ([#&#8203;11660](astral-sh/uv#11660))
-   Use `SmallString` on `Hashes` ([#&#8203;11756](astral-sh/uv#11756))
-   Use a `Box` for `Yanked` on `File` ([#&#8203;11755](astral-sh/uv#11755))
-   Use a `SmallString` for the `Yanked` enum ([#&#8203;11715](astral-sh/uv#11715))
-   Use boxed slices for hash vector ([#&#8203;11714](astral-sh/uv#11714))
-   Use install concurrency for bytecode compilation too ([#&#8203;11615](astral-sh/uv#11615))

##### Bug fixes

-   Avoid installing duplicate dependencies across conflicting groups ([#&#8203;11653](astral-sh/uv#11653))
-   Check subdirectory existence after cache heal ([#&#8203;11719](astral-sh/uv#11719))
-   Include uppercase platforms for Windows wheels ([#&#8203;11681](astral-sh/uv#11681))
-   Respect existing PEP 723 script settings in `uv add` ([#&#8203;11716](astral-sh/uv#11716))
-   Reuse refined interpreter to create tool environment ([#&#8203;11680](astral-sh/uv#11680))
-   Skip removed directories during bytecode compilation ([#&#8203;11633](astral-sh/uv#11633))
-   Support conflict markers in `uv export` ([#&#8203;11643](astral-sh/uv#11643))
-   Treat lockfile as outdated if (empty) extras are added ([#&#8203;11702](astral-sh/uv#11702))
-   Display path separators as backslashes on Windows ([#&#8203;11667](astral-sh/uv#11667))
-   Display the built file name instead of the canonicalized name in `uv build` ([#&#8203;11593](astral-sh/uv#11593))
-   Fix message when there are no buildable packages ([#&#8203;11722](astral-sh/uv#11722))
-   Re-allow HTTP schemes for Git dependencies ([#&#8203;11687](astral-sh/uv#11687))

##### Documentation

-   Add anchor links to arguments and options in the CLI reference ([#&#8203;11754](astral-sh/uv#11754))
-   Add link to environment marker specification ([#&#8203;11748](astral-sh/uv#11748))
-   Fix missing a closing bracket in the `cache-keys` setting ([#&#8203;11669](astral-sh/uv#11669))
-   Remove the last edited date from documentation pages ([#&#8203;11753](astral-sh/uv#11753))
-   Fix readme typo ([#&#8203;11742](astral-sh/uv#11742))

### [`v0.6.2`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#062)

[Compare Source](astral-sh/uv@0.6.1...0.6.2)

##### Enhancements

-   Add support for constraining build dependencies with `tool.uv.build-constraint-dependencies` ([#&#8203;11585](astral-sh/uv#11585))
-   Sort dependency group keys when adding new group ([#&#8203;11591](astral-sh/uv#11591))

##### Performance

-   Use an `Arc` for index URLs ([#&#8203;11586](astral-sh/uv#11586))

##### Bug fixes

-   Allow use of x86-64 Python on ARM Windows ([#&#8203;11625](astral-sh/uv#11625))
-   Fix an issue where conflict markers could instigate a very large lock file ([#&#8203;11293](astral-sh/uv#11293))
-   Fix duplicate packages with multiple conflicting extras declared ([#&#8203;11513](astral-sh/uv#11513))
-   Respect color settings for log messages ([#&#8203;11604](astral-sh/uv#11604))
-   Eagerly reject unsupported Git schemes ([#&#8203;11514](astral-sh/uv#11514))

##### Documentation

-   Add documentation for specifying Python versions in tool commands ([#&#8203;11598](astral-sh/uv#11598))

### [`v0.6.1`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#061)

[Compare Source](astral-sh/uv@0.6.0...0.6.1)

##### Enhancements

-   Allow users to mark platforms as "required" for wheel coverage ([#&#8203;10067](astral-sh/uv#10067))
-   Warn for builds in non-build and workspace root pyproject.toml ([#&#8203;11394](astral-sh/uv#11394))

##### Bug fixes

-   Add `--all` to `uvx --reinstall` message ([#&#8203;11535](astral-sh/uv#11535))
-   Fallback to `GET` on HTTP 400 when attempting to use range requests for wheel download ([#&#8203;11539](astral-sh/uv#11539))
-   Prefer local variants in preference selection ([#&#8203;11546](astral-sh/uv#11546))
-   Respect verbatim executable name in `uvx` ([#&#8203;11524](astral-sh/uv#11524))

##### Documentation

-   Add documentation for required environments ([#&#8203;11542](astral-sh/uv#11542))
-   Note that `main.py` used to be `hello.py` ([#&#8203;11519](astral-sh/uv#11519))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOS4xNzEuMiIsInVwZGF0ZWRJblZlciI6IjM5LjE3OS4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
loic-lescoat pushed a commit to loic-lescoat/uv that referenced this pull request Mar 2, 2025
Solving spent a chunk of its time just converting resolutions, the left
two blocks:


![image](https://github.com/user-attachments/assets/6f266440-c6e2-447c-ad7f-f92244f9d09b)

These blocks are `ResolverOutput::from_state` with 1.3% and
`ForkState::into_resolution` with 4.1% of resolver thread runtime for
apache airflow universal.

We reduce the overhead spent in those functions, to now 1.1% and 2.1% of
resolver time spend in those functions by:

Commit 1: Replace the hash set for the edges with a vec in
`ForkState::into_resolution`. We deduplicate edges anyway when
collecting them, and the hash-and-insert was slow.

Commit 2: Reduce the distribution clonign in
`ResolverOutput::from_state` by using an `Arc`.

The same profile excerpt for the resolver with the branch (note that
there is now an unrelated block between the two we optimized):


![image](https://github.com/user-attachments/assets/e36c205d-2cf8-4fe6-a2dd-3020c0515922)

Wall times are noisy, but the profiles show those changes as
improvements.

```
$ hyperfine --warmup 2 "./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal" "./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal"
Benchmark 1: ./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal
  Time (mean ± σ):      99.1 ms ±   3.8 ms    [User: 111.8 ms, System: 115.5 ms]
  Range (min … max):    93.6 ms … 110.4 ms    29 runs
 
Benchmark 2: ./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal
  Time (mean ± σ):      97.1 ms ±   4.3 ms    [User: 114.8 ms, System: 112.0 ms]
  Range (min … max):    90.9 ms … 112.4 ms    29 runs
 
Summary
  ./uv-branch pip compile --no-progress scripts/requirements/airflow.in --universal ran
    1.02 ± 0.06 times faster than ./uv-main pip compile --no-progress scripts/requirements/airflow.in --universal
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Potential performance improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants