/eth/v1/node/syncing is not really up to the task #77

rolfyone · 2020-09-03T04:57:24Z

Sorry for the late flagging of this, I've just started actually moving some of the teku endpoints to serve new endpoints.

Syncing is a slightly more complicated domain than it probably appears on first glance.

A given node has a good idea of what slot it's at, and can pretty easily find what slot other peers are at, and based on trailing distances or some other metric decide it actually needs to sync data from a peer.

The current endpoint arguably over-simplifies the scope of the problem, by only having 2 fields:

head_slot: uint64,
sync_distance: uint64

Head slot describes 'head slot the node is trying to reach', and sync_distance is 0 if the node is in sync.

It's probably not uncommon for nodes to be in some kind of state where they're a little behind and just processing gossip to catch up on the state of play.

Lighthouse/teku nodes had syncing endpoint returning an explicit syncing flag, the head_slot of the best peer, the current_slot that the node is up to, and the start_slot (which i have never really needed, but is basically the first non final slot from the nodes perspective, so could potentially be winding back to that point if it ends up syncing a different node)

The value in this is that it builds a picture on the state of play of the node you're querying.

explicitly told if the node is syncing via a boolean
have the slot the node is at, and the slot of its best peer.

To best map currently, I would basically have to look at whether our node is syncing, and if it is immediately pass back our current slot, and sync_distance 0 to say we're not syncing.
If we are syncing, head_slot would then be the head_slot of the best peer that I have, and sync_distance would be head_slot - current_slot

Is this something we should consider adding more context to before going too far down the implementation route for this endpoint?

If we do stay with current description, we should probably at the very least rename the 'head_slot' due to that being a term currently in use by ethstats, and I think we're actually wanting 'target_slot' or similar.

The text was updated successfully, but these errors were encountered:

mpetrunic · 2020-09-03T07:34:33Z

head_slot is your nodes current calculated head slot. syncing_distance is amount of slots between you head_slot and calculated network head_slot. If your node is synced or gossiping blocks, you can just put sync_distance=0. Target slot is easy to calculate -> head_slot + sync_distance.

rolfyone · 2020-09-03T07:36:57Z

the comment for head_slot is Head slot node is trying to reach

so from what you're saying the comment is completely misleading? if it's YOUR head slot, that's the exact opposite of what I understood it to be..

mpetrunic · 2020-09-03T07:45:30Z

This is embarrassing, I even implemented it like I said in lodestar but spec clearly says otherwise. 😞
Should we change api description to says "Beacon nodes head slot" or rename to target_slot?

mcdee · 2020-09-03T08:23:01Z

Given the confusion, perhaps rename "head_slot" to "chain_head_slot"?

Also unsure about the idea of a target slot outlined above. Surely the target slot is the chain head slot, in which case sync distance is the chain head slot minus the node's highest synced slot.

I'm not a fan of trying to keep track of the peers' highest synced slot; it is a very volatile number and I'm not sure that knowing it will give much value to an end user. Perhaps I'm missing the use case here.

ajsutton · 2020-09-03T23:44:20Z

I think it's pretty set that "head slot" means the slot of the block at the node's current chain head because that's how eth2stats labels it.

In terms of the API I think it probably needs to provide more explicit information. There's three questions I'd normally call an API like this to answer:

Is the node in sync or not?
Where is it up to?
How much longer does it have to go?

The ETH1 syncing APIs (https://besu.hyperledger.org/en/stable/Reference/API-Methods/#eth_syncing) are quite useful for answering these questions, providing:

syncing: boolean
current slot (probably should be head slot in ETH2 terminology - slot of the node's current best block)
highest slot (or maybe target slot - the slot the node is trying to sync up to)

ETH1 also provide starting block which I've never understood the reasoning for and a few fast sync specific fields, so I'd omit those. I would also avoid the slightly weird behaviour of only returning the head slot etc when syncing and just syncing: false when not syncing. When in sync I'd expect a response like:

{
  "syncing": false,
  "head_slot": 1000,
  "highest_slot": 1000
}

I think the explicit syncing flag is particularly useful, though perhaps better to flip it to "in_sync". A node is not necessarily out of sync because it has peers with a block at a later slot as they may be for non-canonical forks. It may even be that the node considers itself in sync but is actively running a batch sync to pull in a different fork but which it believes is minority (eg small number of peers on it, more than 50% of attestations on it's current canonical fork etc). I guess you can communicate that by returning a distance of 0 in that case but that requires a bunch more understanding to interpret the result instead of having more explicit fields.

rolfyone · 2020-09-06T21:15:59Z

if we were to update the description of head slot to something like suggested that'd be fine. I was just completely confused with how to work it given what the description was.
Also happy with Adrians suggestion if that ends up the solution, I might work on a different endpoint for a while and see how this pans out :)

rolfyone · 2020-09-24T20:35:44Z

i think we've all moved past this now, closing.

paulhauner · 2021-01-13T02:37:26Z

Lighthouse is facing this issue as well: sigp/lighthouse#2148 (review). We also have the is_syncing flag that Teku has. We can work around the standard API removing it by declaring that syncing = sync_distance > 0.

However, I find this unattractive. Firstly, it's a pain for users. Consider a user who wants to use curl to ask the age old question of "is my node syncing?" Seeing "is_syncing: true" is going to make their lives a whole lot easier.

Secondly, the API is being opinionated by saying: It's not possible to be "not synced" with a sync distance greater than zero . This seems to be reasonable at first, but it's also reasonable for a BN to say "I might only have a few blocks left to sync and I'm synced-enough, so lets turn on all the duties endpoints and health endpoint". In this case we're unable to communicate that we're still expecting a few more blocks but are considering ourselves to be synced.

I think it's pretty clear-cut that UX is better with an is_syncing value. It's subjective if you want to do the sync tolerance stuff I talk about, but regardless I think it makes it makes the case that adding is_syncing makes the API more explicit and expressive in ways that are useful to some.

EDIT: I opened a PR at #121

rolfyone closed this as completed Sep 24, 2020

realbigsean mentioned this issue Jan 12, 2021

Remove the is_syncing flag from node/syncing endpoint. sigp/lighthouse#2148

Closed

paulhauner mentioned this issue Jan 13, 2021

Add is_syncing to /node/syncing #121

Merged

mehdi-aouadi mentioned this issue Dec 11, 2024

Revisit isInSync meaning in SyncState Consensys/teku#8890

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

/eth/v1/node/syncing is not really up to the task #77

/eth/v1/node/syncing is not really up to the task #77

rolfyone commented Sep 3, 2020

mpetrunic commented Sep 3, 2020

rolfyone commented Sep 3, 2020

mpetrunic commented Sep 3, 2020

mcdee commented Sep 3, 2020

ajsutton commented Sep 3, 2020

rolfyone commented Sep 6, 2020

rolfyone commented Sep 24, 2020

paulhauner commented Jan 13, 2021 •

edited

Loading

/eth/v1/node/syncing is not really up to the task #77

/eth/v1/node/syncing is not really up to the task #77

Comments

rolfyone commented Sep 3, 2020

mpetrunic commented Sep 3, 2020

rolfyone commented Sep 3, 2020

mpetrunic commented Sep 3, 2020

mcdee commented Sep 3, 2020

ajsutton commented Sep 3, 2020

rolfyone commented Sep 6, 2020

rolfyone commented Sep 24, 2020

paulhauner commented Jan 13, 2021 • edited Loading

paulhauner commented Jan 13, 2021 •

edited

Loading