-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
channeldb+routing+discovery: reject zombie announcements #2777
channeldb+routing+discovery: reject zombie announcements #2777
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excited to finally exterminate the Zombies in a permanent manner....LOK TAR OGAR!!!! ✊🏾
Solid set of changes, completed first pass and will start testing on various nodes. No major comments other than the sig validation and node pruning comments you'll find in-line.
routing/router.go
Outdated
MarkEdgeLive(chanID lnwire.ShortChannelID) error | ||
|
||
// IsZombieEdge returns whether the edge is considered zombie. | ||
IsZombieEdge(chanID lnwire.ShortChannelID) bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This interface is getting a bit thicc...in the future I think we should slim it unbundle it a bit by breaking it into distinct interfaces that have very narrow purposes. As is now, many of the callers that accept this interface don't use most of the methods they're forced to know about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slimmed it down a bit since the gossiper now only requires knowing of MarkEdgeLive
.
discovery/gossiper_test.go
Outdated
case <-time.After(2 * trickleDelay): | ||
} | ||
|
||
time.Sleep(time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why's this sleep needed? Perhaps we should instead return to the caller immediately even if we get an update for an ann we don't know of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have to re-process it, if we return to the caller immediately then they won't be informed of any errors that occurred while re-processing it. Turns out the sleep wasn't needed though.
routing/router.go
Outdated
"as zombie: %v", chanToPrune.ChannelPoint, err) | ||
} | ||
|
||
err = r.cfg.Graph.DeleteChannelEdge(&chanToPrune.ChannelPoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should combine both operations into a single call? If we do, then this means that someone can't give us a super old channel and make us re-scan the graph with GetUTXO or the like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah the more i think about this, i think its a good idea to have the transition from live -> zombie be atomic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Fixed.
routing/router.go
Outdated
// channels from the graph based on their spentness, but whether they | ||
// are considered zombies or not. | ||
if r.cfg.AssumeChannelValid { | ||
if err := r.pruneZombieChans(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This skips calling PruneGraphNodes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be including these changes in another PR.
Comments addressed, PTAL @cfromknecht @Roasbeef. |
if err != nil { | ||
return errors.Errorf("unable to reconstruct message: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 one day, we will kill go-errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will begin testing this on my node 🔥
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💀🔫
In this commit, we add a zombie edge index to the database. This allows us to quickly determine across restarts whether we're attempting to process an edge we've previously deemed as zombie.
We mark the edges as zombies when pruning them to ensure we don't attempt to reprocess them later on. This also applies to channels that have been removed from the graph due to being stale.
In this commit, we extend the graph's FetchChannelEdgesByID and HasChannelEdge methods to also check the zombie index whenever the edge to be looked up doesn't exist within the edge index. We do this to signal to callers that the edge is known, but only as a zombie, and the only information that we have about the edge are the node public keys of the two parties involved in the edge. In the event that an edge does exist within the zombie index, we make an additional check on edge policies to ensure they are not within the router's pruning window, indicating that it is a fresh update.
In this commit, we leverage the recently introduced zombie edge index to quickly reject announcements for edges we've previously deemed as zombies. Care has been taken to ensure we don't reject fresh updates for edges we've considered zombies.
// signed by the correct party. The least-significant | ||
// bit in the flag on the channel update tells us which | ||
// edge is being updated. | ||
var pubKey *btcec.PublicKey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a blocker, just noting that this same block of code exists in many places w/in the code base. would be great to have a method like
func (c *ChannelEdgeInfo) DirectionKey(flags lnwire.ChannelFlags) *btcec.PublicKey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🎎
return nil | ||
} | ||
|
||
exists = true | ||
isZombie = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non blocking nit: not actually needed since the default value of a bool
is false.
"graph(shortChanID=%v), saving for "+ | ||
"reprocessing later", shortChanID) | ||
|
||
// NOTE: We don't return anything on the error channel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this potentially cause the gossiper to deadlock if the announcement never comes through? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No because we return. The errChan
in which we're sending the error back is given to the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was just wondering who waits for the value on the channel, but looks like we never actually read it for remote annonucements...
In this PR, we introduce a new persistent index for zombie edges to quickly reject announcements for edges we've previously deemed as zombies. Care has been taken to ensure we don't reject fresh updates for edges we've considered zombies.