-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement a fast path for #attributes_to_hash
#424
Draft
byroot
wants to merge
2
commits into
okuramasafumi:main
Choose a base branch
from
byroot:cache-keys
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I couldn't find a reason for this pattern to call a method to get a similar proc every time. It does incur an extra method call, and a Proc object allocation. In addition, in the case of `collection_converter`, the relatively complex code can be replaced by a simpler and faster `map`. Map is preferable over shifting objects into a new Array, because it allows Ruby to directly allocate an Array of the right size rather than to have to potentially resize it multiple times. This isn't a big gain, but I think it makes the code easier to read anyway. Before: ``` Calculating ------------------------------------- alba 801.321 (± 0.6%) i/s (1.25 ms/i) - 4.080k in 5.091760s Calculating ------------------------------------- alba 826.321k memsize ( 0.000 retained) 9.908k objects ( 0.000 retained) 6.000 strings ( 0.000 retained) ``` After: ``` Calculating ------------------------------------- alba 830.039 (± 0.8%) i/s (1.20 ms/i) - 4.182k in 5.038669s Calculating ------------------------------------- alba 818.241k memsize ( 0.000 retained) 9.807k objects ( 0.000 retained) 6.000 strings ( 0.000 retained) ```
This uses a number of performance tricks. Rather than to call `transform_key` for every attribute of every instances, we do it once statically, and then use `transform_values` to fetch the values. This remove a big overhead, but is made a bit tricky as there are a number of global configurations that can invalidate that cache. I made it work, but things like using `ObjectSpace.each_object` isn't reasonable. `Hash#transform_values` is also good because it allow Ruby to directly allocate a Hash of the right size, and skip the need to re-hash keys. This doesn't make a big difference on the existing benchmark because there aren't many attributes, but I expect the gains would be more important the more attributes you have. Makes `@_resource_methods` a Hash, so we can check whether we need to call the method on ourself or on `obj` in `O(1)`. Inline the `Symbol` case for fetching attributes as it's the most common. More cases could be inlined. Specialize a fast path when `@_on_error` isn't set, because `rescue` clause cause an overhead (need to call `setjmp(3)`. So it's best not to rescue when we don't have to. Before: ``` ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- alba 82.000 i/100ms Calculating ------------------------------------- alba 830.409 (± 0.8%) i/s (1.20 ms/i) - 4.182k in 5.036433s Calculating ------------------------------------- alba 818.241k memsize ( 0.000 retained) 9.807k objects ( 0.000 retained) 6.000 strings ( 0.000 retained) ``` After: ``` ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- alba 107.000 i/100ms Calculating ------------------------------------- alba 1.074k (± 0.7%) i/s (930.85 μs/i) - 5.457k in 5.079952s Calculating ------------------------------------- alba 818.241k memsize ( 0.000 retained) 9.807k objects ( 0.000 retained) 6.000 strings ( 0.000 retained) ```
Merged
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #424 +/- ##
===========================================
- Coverage 99.66% 46.22% -53.45%
===========================================
Files 14 13 -1
Lines 603 649 +46
Branches 156 174 +18
===========================================
- Hits 601 300 -301
- Misses 2 349 +347 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This uses a number of performance tricks.
Rather than to call
transform_key
for every attribute of every instances, we do it once statically, and then usetransform_values
to fetch the values. This remove a big overhead, but is made a bit tricky as there are a number of global configurations that can invalidate that cache. I made it work, but things like usingObjectSpace.each_object
isn't reasonable.Hash#transform_values
is also good because it allow Ruby to directly allocate a Hash of the right size, and skip the need to re-hash keys.This doesn't make a big difference on the existing benchmark because there aren't many attributes, but I expect the gains would be more important the more attributes you have.
Makes
@_resource_methods
a Hash, so we can check whether we need to call the method on ourself or onobj
inO(1)
.Inline the
Symbol
case for fetching attributes as it's the most common.More cases could be inlined.
Specialize a fast path when
@_on_error
isn't set, becauserescue
clause cause an overhead (need to callsetjmp(3)
.So it's best not to rescue when we don't have to.
Before:
After:
Draft
Opening this as a draft because as noted above and in code comments, there are a number of Alba feature / flexibility that conflict a bit with this approach. So there would be some extra work needed to not break compatibility in some corner cases.
I mostly wanted to see how far we can theoretically go.
Next Step?
After that, looking at the profile, the only significant gain left I could think of would be to generate customized code for
attributes_to_hash
so that instead of relying on__send__
, we could have regular method calls, hence help YJIT and the VM cache calls.