Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a fast path for #attributes_to_hash #424

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

byroot
Copy link
Contributor

@byroot byroot commented Mar 8, 2025

This uses a number of performance tricks.

Rather than to call transform_key for every attribute of every instances, we do it once statically, and then use transform_values to fetch the values. This remove a big overhead, but is made a bit tricky as there are a number of global configurations that can invalidate that cache. I made it work, but things like using ObjectSpace.each_object isn't reasonable.

Hash#transform_values is also good because it allow Ruby to directly allocate a Hash of the right size, and skip the need to re-hash keys.
This doesn't make a big difference on the existing benchmark because there aren't many attributes, but I expect the gains would be more important the more attributes you have.

Makes @_resource_methods a Hash, so we can check whether we need to call the method on ourself or on obj in O(1).

Inline the Symbol case for fetching attributes as it's the most common.
More cases could be inlined.

Specialize a fast path when @_on_error isn't set, because rescue clause cause an overhead (need to call setjmp(3).
So it's best not to rescue when we don't have to.

Before:

ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                alba    82.000 i/100ms
Calculating -------------------------------------
                alba    830.409 (± 0.8%) i/s    (1.20 ms/i) -      4.182k in   5.036433s
Calculating -------------------------------------
                alba   818.241k memsize (     0.000  retained)
                         9.807k objects (     0.000  retained)
                         6.000  strings (     0.000  retained)

After:

ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                alba   107.000 i/100ms
Calculating -------------------------------------
                alba      1.074k (± 0.7%) i/s  (930.85 μs/i) -      5.457k in   5.079952s
Calculating -------------------------------------
                alba   818.241k memsize (     0.000  retained)
                         9.807k objects (     0.000  retained)
                         6.000  strings (     0.000  retained)

Draft

Opening this as a draft because as noted above and in code comments, there are a number of Alba feature / flexibility that conflict a bit with this approach. So there would be some extra work needed to not break compatibility in some corner cases.

I mostly wanted to see how far we can theoretically go.

Next Step?

After that, looking at the profile, the only significant gain left I could think of would be to generate customized code for attributes_to_hash so that instead of relying on __send__, we could have regular method calls, hence help YJIT and the VM cache calls.

byroot added 2 commits March 7, 2025 14:27
I couldn't find a reason for this pattern to call a method to get
a similar proc every time.

It does incur an extra method call, and a Proc object allocation.

In addition, in the case of `collection_converter`, the relatively
complex code can be replaced by a simpler and faster `map`.

Map is preferable over shifting objects into a new Array, because
it allows Ruby to directly allocate an Array of the right size
rather than to have to potentially resize it multiple times.

This isn't a big gain, but I think it makes the code easier to
read anyway.

Before:

```
Calculating -------------------------------------
                alba    801.321 (± 0.6%) i/s    (1.25 ms/i) -      4.080k in   5.091760s
Calculating -------------------------------------
                alba   826.321k memsize (     0.000  retained)
                         9.908k objects (     0.000  retained)
                         6.000  strings (     0.000  retained)
```

After:

```
Calculating -------------------------------------
                alba    830.039 (± 0.8%) i/s    (1.20 ms/i) -      4.182k in   5.038669s
Calculating -------------------------------------
                alba   818.241k memsize (     0.000  retained)
                         9.807k objects (     0.000  retained)
                         6.000  strings (     0.000  retained)
```
This uses a number of performance tricks.

Rather than to call `transform_key` for every attribute of every
instances, we do it once statically, and then use `transform_values`
to fetch the values. This remove a big overhead, but is made a bit tricky
as there are a number of global configurations that can invalidate that
cache. I made it work, but things like using `ObjectSpace.each_object`
isn't reasonable.

`Hash#transform_values` is also good because it allow Ruby to directly
allocate a Hash of the right size, and skip the need to re-hash keys.
This doesn't make a big difference on the existing benchmark because
there aren't many attributes, but I expect the gains would be more
important the more attributes you have.

Makes `@_resource_methods` a Hash, so we can check whether we need to
call the method on ourself or on `obj` in `O(1)`.

Inline the `Symbol` case for fetching attributes as it's the most common.
More cases could be inlined.

Specialize a fast path when `@_on_error` isn't set, because `rescue`
clause cause an overhead (need to call `setjmp(3)`.
So it's best not to rescue when we don't have to.

Before:

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                alba    82.000 i/100ms
Calculating -------------------------------------
                alba    830.409 (± 0.8%) i/s    (1.20 ms/i) -      4.182k in   5.036433s
Calculating -------------------------------------
                alba   818.241k memsize (     0.000  retained)
                         9.807k objects (     0.000  retained)
                         6.000  strings (     0.000  retained)
```

After:

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
                alba   107.000 i/100ms
Calculating -------------------------------------
                alba      1.074k (± 0.7%) i/s  (930.85 μs/i) -      5.457k in   5.079952s
Calculating -------------------------------------
                alba   818.241k memsize (     0.000  retained)
                         9.807k objects (     0.000  retained)
                         6.000  strings (     0.000  retained)
```
@byroot byroot mentioned this pull request Mar 8, 2025
Copy link

codecov bot commented Mar 8, 2025

Codecov Report

Attention: Patch coverage is 30.98592% with 49 lines in your changes missing coverage. Please review.

Project coverage is 46.22%. Comparing base (ad700cd) to head (ceee5da).

Files with missing lines Patch % Lines
lib/alba/resource.rb 30.00% 49 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (ad700cd) and HEAD (ceee5da). Click for more details.

HEAD has 21 uploads less than BASE
Flag BASE (ad700cd) HEAD (ceee5da)
63 42
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #424       +/-   ##
===========================================
- Coverage   99.66%   46.22%   -53.45%     
===========================================
  Files          14       13        -1     
  Lines         603      649       +46     
  Branches      156      174       +18     
===========================================
- Hits          601      300      -301     
- Misses          2      349      +347     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant