
geth API performance degradation #15739

Closed
richiela opened this issue Dec 22, 2017 · 24 comments

@richiela

Hi there,

please note that this is an issue tracker reserved for bug reports and feature requests.

For general questions please use the gitter channel or the Ethereum stack exchange at https://ethereum.stackexchange.com.

System information

Geth version: "Geth/v1.7.2-stable/linux-amd64/go1.8"
OS & Version: Ubuntu 16.04
Commit hash : (if develop)

Expected behaviour

The daemon responds to API calls in a timely manner.

Actual behaviour

After running for 10-12 hours, the daemon's JSON-RPC response times for certain calls start to degrade. getBalance calls seem to take the same time as before, as does eth_call, but eth_sendTransaction, whether sent from a locked or an unlocked (coinbase) account, slows down to the point where it's unusable. Restarting the daemon fixes the issue.
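The calls being timed are plain JSON-RPC requests; for example, a single getBalance can be timed in isolation with curl roughly like this (endpoint and account are placeholders, not our actual setup):

time curl -s -X POST http://127.0.0.1:8545 -H "Content-Type: application/json" -d '{"id":1,"jsonrpc":"2.0","method":"eth_getBalance","params":["0xc3cc155e320bb7da40b65c7a36ad4d944aafaa86","latest"]}'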

This is what it looks like after a restart of the daemon and for the first few hours:

EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xc3cc155e320bb7da40b65c7a36ad4d944aafaa86 2056 / 2816: getBalanceDuration 55 ms sendDuration 802 ms 2017-12-22 05:56:20.350
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x6045ed31b513c80c44e31594c4feb27333022bd7 2055 / 2816: getBalanceDuration 51 ms sendDuration 436 ms 2017-12-22 05:56:19.337
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x5cf085326ad6e992409806bef9d93b99eb968261 2054 / 2816: getBalanceDuration 48 ms sendDuration 494 ms 2017-12-22 05:56:18.807
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x6034f9fecdca8355dbfe971f80e82ba42236f5ef 2053 / 2816: getBalanceDuration 52 ms sendDuration 463 ms 2017-12-22 05:56:18.243
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x30277fc62f48a2b54903cb427628acef6c33f287 2052 / 2816: getBalanceDuration 41 ms sendDuration 1884 ms 2017-12-22 05:56:17.697
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xab8b9e9d6cf1cd0cefac392d8a87d48e0597fc58 2051 / 2816: getBalanceDuration 44 ms sendDuration 364 ms 2017-12-22 05:56:15.730

Here are the response times after about 10 hours:

EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xfbb30fd8cb6869a711455348b01f740ecfcc7a1f 575 / 618: getBalanceDuration 40 ms sendDuration 11508 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x40b378794490d77acd52e188619bf0aff5178202 574 / 618: getBalanceDuration 2 ms sendDuration 149232 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x83aea7637dd722283a2f86e3037716a57ec8082e 573 / 618: getBalanceDuration 40 ms sendDuration 49735 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xab744eab6f50422b44b30b55a599f277a61f7465 572 / 618: getBalanceDuration 44 ms sendDuration 144540 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xe0a21dfac0182b71850b90b0a72e8b2f8c8114e9 571 / 618: getBalanceDuration 40 ms sendDuration 90995 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x635b5f9af7edabc406f4b7afc0ebb9d52f3e8875 570 / 618: getBalanceDuration 41 ms sendDuration 152650 ms

geth is not resource constrained, though it keeps climbing in memory usage:

47771 101001 20 0 22.521g 0.016t 45992 S 161.8 30.3 823:12.48 geth

Steps to reproduce the behaviour

Just let the daemon run and use it normally.

Backtrace

[backtrace]
@karalabe
Member

karalabe commented Dec 22, 2017

When this happens again, could you do a debug.cpuProfile("geth.cpu", 60) and send the generated file to us? It would be invaluable to see where Geth is spending its time.
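A rough sketch of how to capture it, assuming geth was started with the default data directory (otherwise pass the IPC path to attach):

geth attach
> debug.cpuProfile("geth.cpu", 60)   // samples the CPU for 60 seconds and writes geth.cpu in geth's working directory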

@richiela
Author

I restarted and it's acting badly already :(

geth.zip

@mtbitcoin

We're seeing the same thing, mainly when the node is used to serve a lot of RPC requests.

@holiman
Contributor

holiman commented Dec 22, 2017

Graph

Edit: the image was so large that I made it a link instead of displaying it inline.

@karalabe
Member

Could background database compaction be the culprit?

@holiman
Contributor

holiman commented Dec 29, 2017

It does not really look like compaction to me, more like many simultaneous db requests (during state.copy?) and IO lag when it tries to open database files.

@TinyCalf

TinyCalf commented Jan 3, 2018

Same problem when I send more than 5 RPC requests at the same time: memory gets consumed and is not released. In Bitcoin Core, bitcoind can limit the RPC threads to avoid this problem:
-rpcthreads= Set the number of threads to service RPC calls (default: 4)
but geth does not have such an option. Any solutions or advice?

@richiela
Author

richiela commented Jan 3, 2018

geth.zip

Adding another debug.cpuProfile("geth.cpu", 60), captured during a period when estimateGas took over 1 minute to return.

@richiela
Author

richiela commented Jan 4, 2018

geth.cpu.zip

And another. geth is under zero load: everything is offline and response times are still in the tens of seconds. This should be easy to track down. Please help.

@fjl
Contributor

fjl commented Jan 4, 2018

@richiela which version of geth are you running?

@fjl
Contributor

fjl commented Jan 4, 2018

I'm asking because I can't make sense of the profile without having the binary that created it. It doesn't seem to be geth 1.7.2 from geth.ethereum.org/downloads.

@richiela
Author

richiela commented Jan 4, 2018

name: "Geth/v1.7.2-stable/linux-amd64/go1.8",

Compiled from source.

@fjl
Contributor

fjl commented Jan 5, 2018

In that case it would be good to have the actual geth binary.

@fjl
Contributor

fjl commented Jan 5, 2018

Note: you should consider compiling with go 1.9.2 instead of go 1.8. They added debug symbols to profiles in go 1.9. We still need the "Geth/v1.7.2-stable/linux-amd64/go1.8" binary to look at the profiles you posted.

@holiman
Contributor

holiman commented Jan 5, 2018

@richiela could you provide us with the binary?

@holiman
Contributor

holiman commented Jan 5, 2018

Regarding estimateGas, I have a hunch that if you provide a "gas" value (which is the maximum gas) in the call, instead of letting geth supply it, you'll get back to very quick responses. @richiela could you please test that? Specify something like gas=8000000 (8M).

@holiman
Contributor

holiman commented Jan 5, 2018

I'm not able to repro myself, but I would very much like to know if this "solves" it, because if so I know what the underlying issue is.

@holiman
Contributor

holiman commented Jan 5, 2018

@richiela also, when you measure sendDuration -- which specific method is used to send? Does it include unlock and signing, or sending a raw signed transaction?

@richiela
Author

richiela commented Jan 5, 2018

time curl -X POST http://10.0.0.23:8101 -d @test.json
{"jsonrpc":"2.0","id":9,"result":"0x5208"}

real 0m5.283s
user 0m0.004s
sys 0m0.000s

cat test.json
{"id":9,"jsonrpc":"2.0","method":"eth_estimateGas","params":[{"from":"0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98","to":"0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98","value":"0x16345785D8A0000", "gas" : "0x7A1200"}]}

That doesn't solve the problem. As for sendDuration, it is not constant because in some cases we have to unlock and in others we don't. Following Nick's suggestion of using a static gas value, our problem seems to have been solved, so the majority of it was caused by that single call.

This is what it looks like today:

EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xcca1917d40043229c82e44cf9afc239dbea027a1 192 / 198: getBalanceDuration 38 ms sendDuration 1260 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xc382db0633f17234cfbaaf791909f9a3b6eff417 191 / 198: getBalanceDuration 38 ms sendDuration 135 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xe40dad060b5a034374462eb411d311d90baa005a 190 / 198: getBalanceDuration 43 ms sendDuration 126 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xb6e4989121ca00b3512360612106749c5bee738b 189 / 198: getBalanceDuration 39 ms sendDuration 553 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x83a835c4097fe9a05276bdd0c21ea145ee2c0c4e 188 / 198: getBalanceDuration 40 ms sendDuration 126 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x6c6386772e8c872e2bda834bccbd800bef47827d 187 / 198: getBalanceDuration 1 ms sendDuration 1276 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x4311f1050c414afc5984cffc6e83f879b0872717 186 / 198: getBalanceDuration 38 ms sendDuration 216 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x3cabc3e29a7e9b681078f06a00aa5403cab66097 185 / 198: getBalanceDuration 41 ms sendDuration 125 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xf045c1a21b4ac448781d9c6ee3f5df6df3a2aa1c 184 / 198: getBalanceDuration 41 ms sendDuration 0 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x217b8a24200d67c56f3927e9bec723f3d01518e6 183 / 198: getBalanceDuration 39 ms sendDuration 1245 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xab4e1fc208851823df8e20da9a3088c5ac7e4e3b 182 / 198: getBalanceDuration 41 ms sendDuration 125 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x4418d1ff48c42ecc2bf96beab707566b82615671 181 / 198: getBalanceDuration 38 ms sendDuration 0 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x0bfb1c2890dbea6a17648a5b910b9929df67dcae 180 / 198: getBalanceDuration 38 ms sendDuration 125 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x606c0ad30959095c910939f8ae71eb38d3d88f6d 179 / 198: getBalanceDuration 40 ms sendDuration 786 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x0cf5982f58e8361efaa31dd5e4eca277066206f7 178 / 198: getBalanceDuration 41 ms sendDuration 129 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0xfc89d292c9d9624352783ebfbffb69d6a1e4c6a2 177 / 198: getBalanceDuration 42 ms sendDuration 129 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x5f7095761da7fad1af52d6a0c80b7771ec8f1341 176 / 198: getBalanceDuration 47 ms sendDuration 128 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x756fd71df547188b39498368384feeab75113bdd 175 / 198: getBalanceDuration 41 ms sendDuration 129 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x3bf03643bcb8d68f4be2ce9efc6ce3a560e3770f 174 / 198: getBalanceDuration 40 ms sendDuration 125 ms
EthereumDeposit [ETH] : Ethereum.Deposit: swept 0x24abcbebcf7ebff159d6f1a1a6472e83517934e7 173 / 198: getBalanceDuration 40 ms sendDuration 477 ms

Any call > 1s is usually an unlock. Anything under 1s is a deposit contract.
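For reference, the static gas value just means passing an explicit "gas" field in the eth_sendTransaction params instead of letting geth estimate it, roughly like this (same placeholder addresses and value as the test.json above; 0x7A1200 is 8,000,000 gas):

{"id":1,"jsonrpc":"2.0","method":"eth_sendTransaction","params":[{"from":"0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98","to":"0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98","value":"0x16345785D8A0000","gas":"0x7A1200"}]}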

I will upload a binary as soon as I get a second to go back onto that server.

@richiela
Author

richiela commented Jan 7, 2018

geth (2).zip
Binary as requested.

@ryanschneider
Contributor

@richiela when you say eth_call is fine during these slowdowns, are those calls targeting an existing block (e.g. latest) or pending? I'd be curious if eth_calls on pending are just as slow (see potentially related #15859).

@richiela
Author

Hmm, eth_call is done on latest and usually just for a small number of calls. I don't have a good way to test eth_call on pending.

@ryanschneider
Contributor

Ok, I just took what was a slow estimateGas RPC and turned it into an eth_call where I targeted first latest and then pending. So something like:

PAYLOAD='{ ... }'
time curl http://0.0.0.0:8545 -X POST -H "Content-Type: application/json" -d '{"id":1,"jsonrpc":"2.0","params":[ '"$PAYLOAD"', "latest" ], "method":"eth_call"}'

time curl http://0.0.0.0:8545 -X POST -H "Content-Type: application/json" -d '{"id":2,"jsonrpc":"2.0","params":[ '"$PAYLOAD"', "pending" ], "method":"eth_call"}'

That is, I ran the same eth_call payload twice in a row, once on latest and once on pending, and got back these results:

{"jsonrpc":"2.0","id":1,"result":"..."}

real	0m0.014s
user	0m0.004s
sys	0m0.000s
{"jsonrpc":"2.0","id":2,"result":"..."}

real	0m6.572s
user	0m0.004s
sys	0m0.000s

This is on a node that's processing other estimateGas calls, as well as normal devp2p traffic.

Sometimes they both take roughly the same time (tens of milliseconds), but very often the 'pending' one takes 6-10 seconds.

@fjl
Contributor

fjl commented Jan 29, 2019

This is too old to do anything meaningful with. Some amount of fluctuation is to be expected when interacting with the pending state, because access to it must synchronize with the generation of the pending state.
