Fix queue cleanup in proposer #93

Merged 4 commits on Oct 14, 2021

petuhovskiy
Member

The proposer could move msgQueueHead further than truncateLsn when minQuorumLsn matched the end of a WAL record that fell in the middle of a queue message. When this happened, the safekeeper started receiving WAL from the middle of a WAL record and could not parse it.

This is fixed by separating truncateLsn advancement from queue cleanup. Closes #91
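
To illustrate the idea of separating the two steps, here is a minimal sketch. It is not the actual proposer code: it is a toy model written in Python, and `WalMessage`, `ProposerQueue`, `advance_truncate_lsn` and `cleanup` are hypothetical names chosen for this example. It assumes each queue message covers a byte range of WAL and that a record can end in the middle of a message.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WalMessage:
    """Hypothetical queue entry covering WAL bytes [begin_lsn, end_lsn)."""
    begin_lsn: int
    end_lsn: int


class ProposerQueue:
    """Toy model; names echo the PR text, the logic is an assumption."""

    def __init__(self, start_lsn: int) -> None:
        self.truncate_lsn = start_lsn
        self.messages = deque()  # WalMessage entries, oldest first

    def push(self, begin_lsn: int, end_lsn: int) -> None:
        self.messages.append(WalMessage(begin_lsn, end_lsn))

    def advance_truncate_lsn(self, quorum_lsn: int) -> None:
        # Step 1: only move the LSN forward; the queue is not touched here.
        if quorum_lsn > self.truncate_lsn:
            self.truncate_lsn = quorum_lsn

    def cleanup(self) -> None:
        # Step 2: drop a head message only if it ends at or before
        # truncate_lsn, so no byte at or after truncate_lsn is discarded.
        while self.messages and self.messages[0].end_lsn <= self.truncate_lsn:
            self.messages.popleft()
        # The head must still cover truncate_lsn, so streaming can start there.
        assert not self.messages or self.messages[0].begin_lsn <= self.truncate_lsn


# Usage: a record ends at LSN 150, in the middle of the second message.
q = ProposerQueue(start_lsn=0)
q.push(0, 100)
q.push(100, 200)
q.advance_truncate_lsn(150)
q.cleanup()
# The first message is gone; the second stays because it still holds
# bytes at and after truncate_lsn.
```

In this model the queue head can never pass truncateLsn, because cleanup compares against truncateLsn itself rather than against whichever LSN triggered the advancement.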

Queue was moved further than truncateLsn, when quorumLsn matched end of wal record in the middle of queue message
petuhovskiy requested a review from arssher October 8, 2021 13:21
@petuhovskiy
Member Author

The fix seems to work, but it is hard to test, because the original bug (a violation of the streaming starting point) is hard to reproduce reliably.

Here are CI runs with this fix applied. They mostly pass, but some tests failed several times for unrelated reasons. Pipeline #2523 failed with "WAL streaming connection failed (error connecting to server: Cannot assign requested address (os error 99))" in pageserver.log. This error occurred in the test_restarts_under_load and test_many_timelines tests at the same time. It looks like a known problem, discussed in neondatabase/neon#467

@arssher

arssher commented Oct 14, 2021

I have added a minor comment and an assertion. Let's merge if they are OK (and once the question above is sorted out).

@petuhovskiy
Member Author

I pushed the branch to zenith to test this patch in CI. I think we can merge it after the pipeline succeeds.

petuhovskiy merged commit 8e82cb4 into main Oct 14, 2021
petuhovskiy deleted the fix_proposer_queue_cleanup branch October 14, 2021 12:03
ololobus pushed a commit that referenced this pull request Nov 10, 2021
Queue was moved further than truncateLsn, when quorumLsn matched end of wal record in the middle of queue message. Fix cleanup of unreceived messages.

Co-authored-by: Arseny Sher <[email protected]>
Later references carried the same commit message:
lubennikovaav pushed a commit that referenced this pull request Feb 9, 2022
MMeent pushed a commit that referenced this pull request Jul 7, 2022
MMeent pushed a commit that referenced this pull request Aug 18, 2022
lubennikovaav pushed a commit that referenced this pull request Nov 21, 2022
MMeent pushed a commit that referenced this pull request Feb 10, 2023
MMeent pushed a commit that referenced this pull request Feb 10, 2023
MMeent pushed a commit that referenced this pull request May 11, 2023
tristan957 pushed a commit that referenced this pull request Aug 10, 2023
tristan957 pushed a commit that referenced this pull request Nov 8, 2023
tristan957 pushed a commit that referenced this pull request Nov 8, 2023
tristan957 pushed a commit that referenced this pull request Feb 5, 2024
tristan957 pushed a commit that referenced this pull request Feb 5, 2024
tristan957 pushed a commit that referenced this pull request Feb 6, 2024
tristan957 pushed a commit that referenced this pull request May 10, 2024
Linked issue that merging this pull request may close: Fix starting streaming since record boundary for safekeepers joining after voting