spec: Fork Detector + Evidence (proof of fork) for discussion. #479
Conversation
Hey Josef, I just skimmed through the document yesterday. The biggest outstanding question I have concerns the PoF with the light block traces. There is not much said about the reasoning behind how the PoF actually proves that there is a fork (this might be discussed in more depth in a document that I haven't yet seen), or, more specifically, about what I feel is the second half of the problem: what the full node does with the PoF, how it verifies it, and how it extracts the respective evidence that it will need to commit on chain (assuming that is the ultimate goal of fork accountability). I thought I'd drop my thoughts in this PR now so that we can discuss them on our call later today.
Thanks, @cmwaters, for looking at the document! Indeed, this document does not capture the questions you raised. The plan is that, together with @milosevic, we will address these questions in a different specification. We have already created the following issue for that: tendermint/tendermint#5149.
Just initial comments from starting review. Will keep working through it.
> to the secondary. However, assuming that in this scenario the
> primary is faulty it may not respond.

As the main goal is to catch misbehavior of the primary,
Should this be indented?
Also, why do we say the main goal is to catch the primary? Don't we want to catch any of them, even if the primary is correct?
We want to catch all of them. I guess this remark came out of a discussion with @milosevic that also underlies the comments in #494: we may use whatever is verified, and roll back if we discover a problem. In this view, the attack on the light client we have to worry about is the one done by the primary.
different from the one the full node knows (we do not send the trace
the primary gave us back to the primary)

- The light client attack is via the primary. Thus we try to
Confusing: the primary may be correct, and the attack may be coming from one or many of the secondaries.
As said above, it may only lead to a safety issue if it comes from the primary and we gave the verified header to the user. If the verified header that comes from the primary is OK, the secondary may just force the light client to shut down and report evidence, but it will not make the light client have a bad verified block.
The key property detection should ensure is the following: if a bad light block is "consumed" (either verified, in the light node case, or installed on chain, in the IBC case), it will be detected within bounded time. In the case of the light node, this is ensured by checking the verified block against witnesses, while in the IBC case it is done by a correct relayer checking every installed header. This ensures that exposed misbehaviour is always slashable, which incentivises correct behaviour. In the scenario with a correct primary and a fork coming from a secondary we will, as Josef pointed out, also submit the fork and shut down, but this scenario is less severe as the "consumed" block is from the correct branch.
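To make the witness cross-check concrete, here is a minimal, hedged sketch in Go. All names (`LightBlock`, `Witness`, `CrossCheck`, the `Hash` field) are illustrative assumptions, not the spec's actual API; it only shows the shape of "check the verified block against every witness within a bounded detection time".

```go
package detector

import (
	"bytes"
	"errors"
	"time"
)

// Hypothetical, simplified light block: only what the cross-check needs.
type LightBlock struct {
	Height int64
	Hash   []byte
}

// Witness abstracts a secondary full node we can query for a header at a height.
type Witness interface {
	LightBlock(height int64) (LightBlock, error)
}

var ErrConflict = errors.New("conflicting header: possible fork, generate evidence")

// CrossCheck compares a verified block against all witnesses. If any witness
// returns a header with a different hash at the same height, a (potential)
// fork has been detected and evidence generation should start.
func CrossCheck(verified LightBlock, witnesses []Witness, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for _, w := range witnesses {
		if time.Now().After(deadline) {
			return errors.New("detection timeout exceeded")
		}
		wb, err := w.LightBlock(verified.Height)
		if err != nil {
			continue // unresponsive witness: we cannot conclude anything from it
		}
		if !bytes.Equal(wb.Hash, verified.Hash) {
			return ErrConflict
		}
	}
	return nil
}
```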
it with a flag "trusted" that can be used
- internally for efficiency reasons (to maintain
  [LCD-INV-TRUSTED-AGREED.1] until a fork is detected)
- by light client based on the "one correct full node" assumption
See also the comment in #494 (comment) about introducing a local time parameter which the client waits out before trusting (where during this time it is cross-referencing the header with existing and new peers, assuming a more dynamic peer set and peer exchange).
I like the idea a lot. In some sense, it makes the requirements about the trusting and unbonding periods more concrete. The unbonding period must contain the following times:
- download and verify a block
- cross check a block (= the time parameter you are suggesting)
- compute and report proof of fork
- process proof of fork, and punish misbehavior.
Right now, all of this is somewhat hidden in the assumption that the trusting period << unbonding period (see the sketch below).
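A small illustrative sketch of that budget, in Go. The parameter names and the exact inequality are assumptions for illustration only; the point is just that the unbonding period must cover verification, cross-checking, reporting, and punishment on top of the trusting period.

```go
package timing

import "time"

// Params collects the time components listed above (names assumed).
type Params struct {
	TrustingPeriod   time.Duration // how long a header is trusted
	DownloadVerify   time.Duration // download and verify a block
	CrossCheck       time.Duration // fork-detection timeout (the proposed parameter)
	ComputeReportPoF time.Duration // compute and report proof of fork
	ProcessPunish    time.Duration // process proof of fork and punish misbehavior
}

// FitsUnbonding makes explicit what is otherwise hidden in
// "trusting period << unbonding period": evidence must still be
// actionable by the time detection and reporting complete.
func FitsUnbonding(p Params, unbonding time.Duration) bool {
	budget := p.DownloadVerify + p.CrossCheck + p.ComputeReportPoF + p.ProcessPunish
	return p.TrustingPeriod+budget < unbonding
}
```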
I think that this time parameter should be part of the model and the API. In that sense it does not make StateTrusted obsolete: we just need to change its semantics. The right semantics could be the following: a light block is in StateTrusted if it was StateVerified and Fork_Detection_Timeout has passed without finding a valid fork. Note that one possible implementation of this is what we currently have, with always assuming a correct node in the peer list, as it provides bounded detection time. Another possible implementation would involve a dynamic peer set and assumptions on the periodic presence of correct nodes in the peer set. For efficiency reasons we could also imagine probabilistic variants where not every light block is checked but only a random one.
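A hedged sketch of this proposed StateTrusted semantics (all names, including the Go spelling of the timeout, are illustrative): a block is promoted from StateVerified to StateTrusted only once the detection timeout has elapsed without a valid fork being found.

```go
package states

import "time"

type VerificationState int

const (
	StateUnverified VerificationState = iota
	StateVerified
	StateTrusted
)

// TrackedBlock is a minimal stand-in for a light block entry in the store.
type TrackedBlock struct {
	State      VerificationState
	VerifiedAt time.Time // when the block entered StateVerified
	ForkFound  bool      // set if a valid proof of fork involving this block was found
}

// Promote implements the proposed rule: StateVerified + timeout elapsed +
// no fork detected  =>  StateTrusted.
func Promote(b *TrackedBlock, forkDetectionTimeout time.Duration, now time.Time) {
	if b.State == StateVerified && !b.ForkFound &&
		now.Sub(b.VerifiedAt) >= forkDetectionTimeout {
		b.State = StateTrusted
	}
}
```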
Co-authored-by: Ethan Buchman <[email protected]>
> [LCV-DATA-POF.1] mirrors the definition [TMBC-SIGN-FORK.1]:
> *TrustedBlock* corresponds to *a*, and *PrimaryTrace* and *SecondaryTrace*
> are traces to two blocks *b* and *c*. The traces establish that both
> *skip-root(a,b,t)* and *skip-root(a,c,t)* are satisfied.
Do we need to enforce that the heights used in verification via skipping must be the same for both skip-root(a,b,t) and skip-root(a,c,t)?
No. Just the blocks b and c need to be at the same height. The other blocks on the traces need not be at the same height. Due to the dynamic changes in the validator set, skipping via the same heights might in general not work on different branches.
Actually, if there were two earlier conflicting blocks with the same height on the two different traces, we could use those earlier blocks as proof of fork. Therefore, attackers will most likely try to avoid an earlier conflicting block.
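For illustration, a small Go sketch of the height check implied by this answer (types and field names are assumptions): only the final blocks *b* and *c* of the two traces must be at the same height; intermediate heights are unconstrained.

```go
package pof

import "errors"

type LightBlock struct{ Height int64 }

// ProofOfFork mirrors the shape of [LCV-DATA-POF.1] (field names assumed).
type ProofOfFork struct {
	TrustedBlock   LightBlock
	PrimaryTrace   []LightBlock // ends in block b
	SecondaryTrace []LightBlock // ends in block c
}

// checkHeights requires only that the trace endpoints have equal height.
func checkHeights(p ProofOfFork) error {
	if len(p.PrimaryTrace) == 0 || len(p.SecondaryTrace) == 0 {
		return errors.New("empty trace")
	}
	b := p.PrimaryTrace[len(p.PrimaryTrace)-1]
	c := p.SecondaryTrace[len(p.SecondaryTrace)-1]
	if b.Height != c.Height {
		return errors.New("conflicting blocks must be at the same height")
	}
	// Note: intermediate trace heights are deliberately not compared across branches.
	return nil
}
```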
> #### **[TMBC-LC-FORK.1]**
> Let *a*, *b*, *c*, be light blocks and *t* a time. We define
> *light-client-fork(a,b,c,t)* iff
> - *sign-skip-match(a,b,c,t) = false* and
I think the condition should be slightly different: sign-skip-match(a,b,c,t) = true, sequ-rooted(b), b is unique and b != c.
Co-authored-by: Zarko Milosevic <[email protected]>
I have added a file (rendered) with sketches of functions for fork detection in the IBC context. They contain suggestions for updates to the on-chain IBC component and for the relayer.
> The above requires us that before we pick a new secondary, we have to
> query the secondary for the header of height
> *lightStore.LatestTrusted().Height*.
Note that this doesn't ensure that the trusted store of the secondary is consistent with the primary, i.e., only the latest trusted light block must be consistent. Could this potentially be problematic if a secondary (with an invalid store except for the latest trusted light block) is later promoted to primary?
This is subtle. First, the current supervisor is simplified as it considers only going forward (VerifyToTarget only goes forward). In this case, it is not a problem.
If we want the supervisor to be more general, it needs to use Main instead of VerifyToTarget. Then we would need to query the (new) primary for an older trusted block as reference. This older trusted block then also needs to be passed into fork detection (which also assumes going forward). This is less critical as we are anyway discussing passing in only one block in the future. @OStevan has prepared a Rust code snippet and will create a PR on data structures and what information to pass around.
generated during detection.

```go
type PoFStore struct {
```
We probably don't need this; using an array of LightNodeProofOfFork should be sufficient.
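A minimal sketch of the suggested simplification (the field names of `LightNodeProofOfFork` are assumptions): the detector just accumulates proofs in a slice, with no dedicated store type.

```go
package detector

type LightBlock struct{ Height int64 }

// Hypothetical shape; the point of the comment above is only that a plain
// slice of proofs can replace the dedicated PoFStore wrapper.
type LightNodeProofOfFork struct {
	TrustedBlock   LightBlock
	PrimaryTrace   []LightBlock
	SecondaryTrace []LightBlock
}

// Proofs collected during detection.
var proofs []LightNodeProofOfFork
```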
- Expected postcondition
  - returns a lightstore that contains all lightblocks *b* from *ls*
    that satisfy: *from < b.Header.Height <= to*
Do the returned light blocks need to satisfy that each next block is verified from the previous one?
Yes, actually subtrace should be replaced by something more clever, as you suggest. The problem with consecutive blocks not supporting each other occurred to me only later, and I didn't want to change anything before we agree that there is a problem and on the solution.
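A hedged sketch of the stricter postcondition suggested here, in Go. The `verifies` predicate is a placeholder for the real light-client verification check; everything else is illustrative.

```go
package detector

type LightBlock struct{ Height int64 }

// verifies is a stand-in for the light-client verification predicate
// ("the previous block supports the next one"); the real check validates
// the commit of `next` against the validators known from `b`.
func verifies(b, next LightBlock) bool {
	return next.Height > b.Height // placeholder only
}

// SupportedSubtrace returns the blocks with from < Height <= to, additionally
// requiring that each returned block is verified from the previous one
// (the stricter condition suggested in the comment above).
func SupportedSubtrace(trace []LightBlock, from, to int64) ([]LightBlock, bool) {
	var out []LightBlock
	for _, b := range trace {
		if b.Height <= from || b.Height > to {
			continue
		}
		if len(out) > 0 && !verifies(out[len(out)-1], b) {
			return nil, false // trace does not support itself
		}
		out = append(out, b)
	}
	return out, true
}
```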
**TODO:** formalize conditions
- Implementation remark
  - the primary is replaced by a secondary, and lightblocks above
    trusted blocks are removed
Maybe those blocks just need to go through the fork detection procedure instead of being removed.
We could do that. It will just make the control flow a bit more complicated.
result := VerifyToTarget(s, auxLS, sh.Header.Height)
if result = (_,ResultSuccess) || (_,EXPIRED) {
LS,result := VerifyToTarget(secondary, auxLS, sh.Header.Height)
if (result = ResultSuccess || result = EXPIRED) {
In case result = EXPIRED, that means the secondary is not responsive. But as this does not allow us to verify whether the conflicting header is a fork, we shouldn't report it. Note that this is not a problem, as the important scenario is a faulty primary with a correct secondary, in which case we will not have a timeout.
Actually, EXPIRED referred to some older version of verification.md. The error code does not exist anymore, and we should remove it and just check for ResultSuccess.
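For clarity, the corrected check would then look roughly as follows (pseudocode in the style of the quoted snippet, not the final spec text):

```go
// EXPIRED no longer exists as a result code; only a successful verification
// of the conflicting header justifies generating and reporting a proof of fork.
LS, result := VerifyToTarget(secondary, auxLS, sh.Header.Height)
if result == ResultSuccess {
    // construct and submit the proof of fork from LS
}
```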
Dr. Widder held a gun against my head and asked me to approve it %)
This document (rendered) can be a basis for follow-up discussions. It contains the following sections:
- … in which fork detection happens
- the "proof of fork" definition, that is, the data structure to submit evidence to full nodes

Addresses point 2. in #461.
Defines "proof of fork" as required by tendermint/tendermint#5149.