This repository was archived by the owner on Dec 4, 2024. It is now read-only.

Discussion: Questions on exponentially-increasing IBFT randomTimeout() #245

Closed
dankostiuk opened this issue Nov 25, 2021 · 0 comments · Fixed by #263

Description

While investigating a chain halt that occurred on our testnet earlier this week, we found that restarting all validators fixed the chain: more messages were sent in a shorter timespan, allowing nodes to transition out of each round faster (since the required messageQueue length thresholds were met sooner). This prompted some internal discussion and led to a few questions about randomTimeout():

  1. Why use exponentially-increasing timeouts in the first place? We realize the original geth fork you've based the SDK on uses this calculation as well, but we couldn't find any indication of why they didn't use a fixed or linearly-increasing timeout instead (see https://github.com/getamis/go-ethereum/blob/c7547381b2ea8999e423970d619835c662176790/consensus/istanbul/core/core.go#L316-L329). Was this to prevent multiple nodes from changing state at the same time?

  2. Could introducing a new flag that caps the timeout at a specified maximum value alleviate the problem? If the maximum randomTimeout() were 10 minutes, for example, a chain would be able to recover from a halt within hours instead of what could be days (or more). Shouldn't the goal be to recover the chain as quickly as possible if a chain halt occurs?

Left this open as a discussion - thanks in advance and as always we appreciate the hard work.

Your environment

  • OS and version: Ubuntu 20
  • version of the Polygon SDK
  • branch that causes this issue: develop