Do not store JSON Yul ASTs and Yul CFG in `CompilerStack` #15451

cameel · 2024-09-24T15:15:45Z

~~Depends on #15457.~~ Merged.

This fixes excessive memory usage due to JSON representation of Yul ASTs and Yul CFG being always calculated from IR, even if not requested as an output.

The PR fixes the problem by changing CompilerStack so that these artifacts are only prepared when requested by CLI or StandardCompiler. This requires reparsing the IR from scratch each time (and the PR does not go out of its way to ensure it's parsed only once if multiple ASTs are requested), but Yul parsing is generally fast and these artifacts are not needed in the typical Solidity development workflow anyway.

Effects

This has a huge impact on memory usage and also non-negligible impact on running time.

For example when compiling Uniswap using our external benchmarks, disabling generation of this data makes it go from 143 s and 5 GB RAM down to 115 s and 1.5 GB RAM.

I've been noticing the aggressive RAM usage for a while now, to the point that I had to be careful when running the external benchmarks - compiling Eigenlayer or Sablier sometimes required more than 30 GB RAM and would crash on my machine.

cameel · 2024-09-24T15:25:28Z

Hmm... tests seem to be failing due to different native locations in the optimized AST:

-            "nativeSrc": "58:298:0",
+            "nativeSrc": "58:315:0",
             "nodeType": "YulBlock",

-                    "nativeSrc": "68:282:0",
+                    "nativeSrc": "68:299:0",
                     "nodeType": "YulBlock",

-                            "nativeSrc": "111:27:0",
+                            "nativeSrc": "128:27:0",
                             "nodeType": "YulVariableDeclaration",

cameel · 2024-09-24T15:58:16Z

Benchmarks

Using the improved benchmarking scripts from #15450.

IR pipeline

File	Time (0.8.27)	Time (this PR)	Improvement	Exit code
openzeppelin	40 s	32 s	-20%	0
uniswap-v4	157 s	116 s	-26%	0
eigenlayer	716 s	531 s	-26%	0
seaport	13 s	12 s	-8%	1
sablier-v2	935 s	702 s	-25%	1

File	Memory (0.8.27)	Memory (this PR)	Improvement	Exit code
openzeppelin	1220 MiB	495 MiB	-59%	0
uniswap-v4	4805 MiB	1494 MiB	-69%	0
eigenlayer	20346 MiB	4470 MiB	-78%	0
seaport	1042 MiB	950 MiB	-9%	1
sablier-v2	25535 MiB	6268 MiB	-75%	1

Note: For the last two the numbers do not represent the whole project, because in each there is a contract that fails to compile due to a StackTooDeep error.

Legacy pipeline

There should be zero difference here since these artifacts are only generated when IR is requested.
They're still potentially useful to see what the variance of these numbers is (they come from a single run).

File	Time (0.8.27)	Time (this PR)	Improvement
openzeppelin	10 s	10 s	0%
uniswap-v4	24 s	20 s	-17%
eigenlayer	86 s	72 s	-16%
seaport	183 s	159 s	-13%
sablier-v2	155 s	149 s	-4%

File	Memory (0.8.27)	Memory (this PR)	Improvement
openzeppelin	519 MiB	517 MiB	0%
uniswap-v4	1013 MiB	1011 MiB	0%
eigenlayer	2833 MiB	2835 MiB	0%
seaport	4476 MiB	4476 MiB	0%
sablier-v2	4882 MiB	5213 MiB	+7%

cameel · 2024-09-24T23:04:19Z

An interesting thing is that this overhead of JSON processing did not show up in profiling I did recently: #15179 (comment). I'd expect to see YulStack::astJson() there, bit I can't see any JSON-related functions at all on any of the diagrams. I wonder if it might be something along the lines of the overhead coming from system memory management and the profiler does not counting that?

One thing I do see is YulStack::parseAndAnalyze() taking 6% of execution for Uniswap on develop, which might indicate that all that extra reparsing may be starting to add up. Or not - this is the only diagram where I see it so it could also be a fluke.

clonker · 2024-09-25T09:33:28Z

Hmm... tests seem to be failing due to different native locations in the optimized AST:

Not sure if it is the problem leading to it, but it may be a problem regardless:
I did a little experiment and compared the output of

contract(_contractName).yulIROptimized

with the output of

loadGeneratedIR(loadGeneratedIR(loadGeneratedIR(contract(_contractName).yulIROptimized).print()).print()).print()

just to see if it stays consistent under repeated loading and printing operations. It doesn't:

@@ -1,10 +1,10 @@
 /// @use-src 0:"test/cmdlineTests/ast_ir/input.sol"
 object "C_2" {
     code {
         {
-            /// @src 0:60:73  "contract C {}"
+            /// @src 0:60:73
             let _1 := memoryguard(0x80)
             mstore(64, _1)
             if callvalue() { revert(0, 0) }
             let _2 := datasize("C_2_deployed")
             codecopy(_1, dataoffset("C_2_deployed"), _2)
@@ -13,11 +13,11 @@
     }
     /// @use-src 0:"test/cmdlineTests/ast_ir/input.sol"
     object "C_2_deployed" {
         code {
             {
-                /// @src 0:60:73  "contract C {}"
+                /// @src 0:60:73
                 revert(0, 0)
             }
         }
         data ".metadata" hex"a2646970667358221220b0b8ededf61719cebc357d865ee54b27e1d49d69b9c45f8107b7210d698521d864736f6c63782c302e382e32382d646576656c6f702e323032342e392e32352b636f6d6d69742e34663630383164662e6d6f64005d"
     }

clonker · 2024-09-25T09:57:43Z

Also the difference is multiples of 17, which is consistent with the length of "contract C {}" - so I recon that it is the problem after all

cameel · 2024-09-25T12:58:03Z

loadGeneratedIR(loadGeneratedIR(loadGeneratedIR(contract(_contractName).yulIROptimized).print()).print()).print()

There probably is something to it, the exact 17 char difference is too specific, but at least in the snippet above the cause is just the way you print it.

yulIROptimized was generated with print(*this), where this is the CompilerStack. The parameter is an object that can provide access to sources, which are needed if you want to print debug snippets. Plain print() without an argument does not have access to it and won't print those snippets.

I'm trying to debug it but, oddly, after rebasing this I cannot reproduce this difference locally. The test passes for me somehow. I haven't tried it before the rebase, so I'm not sure if it's just not reproducible on my machine in general or if it's the rebase that changed something...

EDIT: It still fails in CI so it's not the rebase.

cameel · 2024-09-25T17:51:20Z

I think I figured it out. Looks like my previous fix for native locations (#15457) did not properly account for code snippets in debug info and I did not notice that. TBH it's quite hard to notice when these offsets are wrong in test output and verifying them all manually is not feasible - which is the reason why we had the original issue in the first place.

New fix: #15457.

libsolidity/interface/CompilerStack.cpp

Changelog.md

- The JSON artifacts take a lot of memory so always generating them is very wasteful. - Even when requested, they are only relevant while the current contract is being copmiled. We only really want to store the serialized JSON that will be returned to the user. - This actually also speeds up compilation quite a bit.

cameel · 2024-09-30T10:09:20Z

I did some more benchmarking and profiling on top of this.

First I ran our profiling (PROFILE_OPTIMIZER_STEPS CMake option) to see how long each step takes. See #15465 (comment) for detailed results.

I also checked how long each part of the pipeline takes now:

	OpenZeppelin	Uniswap	Eigenlayer
Compilation (optimized)	32 s	117 s	514 s
Optimizer steps	18 s	61 s	261 s
Compilation (unoptimized)	15 s	49 s	212 s
Compilation (only IR generation)	4 s	10 s	37 s
Compilation (only analysis)	1 s	2 s	2 s

The numbers are from running full compilation with only specific outputs selected, except for "Optimizer steps", which is the time I got from profiling optimization steps in #15465 (comment).

The same in relative terms:

	OpenZeppelin	Uniswap	Eigenlayer
Compilation (optimized)	100%	100%	100%
Optimizer steps	56%	52%	51%
Compilation (unoptimized)	47%	42%	41%
Compilation (only IR generation)	13%	9%	7%
Compilation (only analysis)	3%	2%	0%

Conclusions

Optimization time is now down to 50-60% of the total.
IR generation and analysis are only ~10% of the total. The rest is optimization and bytecode generation.
CSE is the single slowest step, taking 20% of the optimization time.
Results are pretty consistent between projects. Top 10 steps are (almost) the same and relative running times are very similar too.

cameel added the performance 🐎 label Sep 24, 2024

cameel requested review from r0qs and clonker September 24, 2024 15:15

cameel self-assigned this Sep 24, 2024

cameel changed the title ~~Compiler stack do not store serialized json~~ Do not store JSON Yul ASTs and Yul CFG in CompilerStack Sep 24, 2024

ekpyron added this to the 0.8.28 milestone Sep 24, 2024

cameel added the 🟡 PR review label label Sep 25, 2024

cameel force-pushed the compiler-stack-do-not-store-serialized-json branch from 4f6081d to 7ff5fbd Compare September 25, 2024 12:44

cameel force-pushed the compiler-stack-do-not-store-serialized-json branch from 7ff5fbd to c23cac8 Compare September 25, 2024 17:30

cameel mentioned this pull request Sep 25, 2024

Fix shift in Yul AST native src locations due to code snippets in debug info #15457

Merged

cameel changed the base branch from develop to fix-code-snippets-shifting-yul-ast-native-src-locations September 25, 2024 17:45

cameel added the has dependencies The PR depends on other PRs that must be merged first label Sep 25, 2024

cameel force-pushed the compiler-stack-do-not-store-serialized-json branch from c23cac8 to 2360c54 Compare September 25, 2024 18:08

cameel force-pushed the fix-code-snippets-shifting-yul-ast-native-src-locations branch from 995da44 to 614f312 Compare September 25, 2024 18:08

clonker reviewed Sep 26, 2024

View reviewed changes

libsolidity/interface/CompilerStack.cpp Show resolved Hide resolved

clonker reviewed Sep 26, 2024

View reviewed changes

Changelog.md Outdated Show resolved Hide resolved

cameel force-pushed the fix-code-snippets-shifting-yul-ast-native-src-locations branch from 614f312 to 4216aa7 Compare September 26, 2024 15:16

cameel force-pushed the compiler-stack-do-not-store-serialized-json branch from 2360c54 to c78a8d2 Compare September 26, 2024 15:28

Base automatically changed from fix-code-snippets-shifting-yul-ast-native-src-locations to develop September 26, 2024 15:51

cameel marked this pull request as ready for review September 26, 2024 16:46

cameel added 2 commits September 26, 2024 18:46

CompilerStack: Helper for Yul parsing

578e1bb

cameel force-pushed the compiler-stack-do-not-store-serialized-json branch from c78a8d2 to 9ccc44a Compare September 26, 2024 16:46

cameel removed the has dependencies The PR depends on other PRs that must be merged first label Sep 26, 2024

This was referenced Sep 26, 2024

Stop caching outputs with LazyInit in CompilerStack #15459

Closed

Do not cache instances of Compiler in CompilerStack #15460

Merged

clonker approved these changes Sep 27, 2024

View reviewed changes

cameel merged commit 26d5b3f into develop Sep 27, 2024
72 checks passed

cameel deleted the compiler-stack-do-not-store-serialized-json branch September 27, 2024 13:17

cameel mentioned this pull request Sep 30, 2024

Fix missing std:: prefix for PROFILE_OPTIMIZER_STEPS #15465

Merged

This was referenced Sep 30, 2024

Requesting bytecode/IR for one contract results in it being generated for all selected contracts #15373

Closed

Add support for instructions and source ranges. #15368

Open

aarlt mentioned this pull request Oct 1, 2024

Enable ethdebug debug info and output selection. #15289

Merged

This was referenced Oct 11, 2024

Compiler stack: Protect loadGeneratedIR from contracts that cannot be deployed #15505

Merged

Treat builtins separately in Yul AST #15347

Merged

cameel mentioned this pull request Nov 3, 2024

RuntimeError: memory access out of bounds when compiling a large input file with soljson.js (but not solc) #13966

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do not store JSON Yul ASTs and Yul CFG in `CompilerStack` #15451

Do not store JSON Yul ASTs and Yul CFG in `CompilerStack` #15451

cameel commented Sep 24, 2024 •

edited

Loading

cameel commented Sep 24, 2024

cameel commented Sep 24, 2024 •

edited

Loading

cameel commented Sep 24, 2024

clonker commented Sep 25, 2024

clonker commented Sep 25, 2024

cameel commented Sep 25, 2024 •

edited

Loading

cameel commented Sep 25, 2024

cameel commented Sep 30, 2024 •

edited

Loading

Do not store JSON Yul ASTs and Yul CFG in CompilerStack #15451

Do not store JSON Yul ASTs and Yul CFG in CompilerStack #15451

Conversation

cameel commented Sep 24, 2024 • edited Loading

Effects

cameel commented Sep 24, 2024

cameel commented Sep 24, 2024 • edited Loading

Benchmarks

IR pipeline

Legacy pipeline

cameel commented Sep 24, 2024

clonker commented Sep 25, 2024

clonker commented Sep 25, 2024

cameel commented Sep 25, 2024 • edited Loading

cameel commented Sep 25, 2024

cameel commented Sep 30, 2024 • edited Loading

Conclusions

Do not store JSON Yul ASTs and Yul CFG in `CompilerStack` #15451

Do not store JSON Yul ASTs and Yul CFG in `CompilerStack` #15451

cameel commented Sep 24, 2024 •

edited

Loading

cameel commented Sep 24, 2024 •

edited

Loading

cameel commented Sep 25, 2024 •

edited

Loading

cameel commented Sep 30, 2024 •

edited

Loading