-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add write queue #736
base: master
Are you sure you want to change the base?
Conversation
Draft as there is no way for now to set any timeout other than |
d650c94
to
4c88c58
Compare
This PR is ready to be tested and benchmarked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first pass.
I read all the code. I could understand the goal of changing the flow of exec
request.
IMHO, the following is missing:
-
A short but concise description of the nature of change. Motivation why do we need this new feature, what wasn't so efficient before having it, main technical decisions made.
-
I'm not convinced that exec request shall start from
WAITING
state however I could be wrong and if it's notENQUEUED
orINITIED
I'm OK withWAITING
but please explain why. -
Please make a few comments about changes around barriers, so far I don't fully understand this part.
size_t n; | ||
char *cursor; | ||
failure.code = (uint64_t)code; | ||
failure.message = message; | ||
struct response_failure failure = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice.
@@ -71,7 +72,7 @@ static void gateway_handle_cb(struct handle *req, | |||
return; | |||
} | |||
|
|||
if (status != 0) { | |||
if (status == SQLITE_IOERR_NOT_LEADER || status == SQLITE_IOERR_LEADERSHIP_LOST) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem an equivalent code to if (status != 0)
. There're certain reasons not to change this check this way, for example, in case if someone will fire or add a new SQLITE_IOERR_
this will lead to wrong handing in this place. Still, if you want to enforce that error types observed here are limited to SQLITE_IOERR_NOT_LEADER | SQLITE_IOERR_LEADERSHIP_LOST
, it's better to put an assert inside the condition:
if (status != 0) {
assert(status == SQLITE_IOERR_NOT_LEADER || status == SQLITE_IOERR_LEADERSHIP_LOST);
return 0; | ||
} | ||
|
||
/** | ||
* State machine for exec requests. | ||
*/ | ||
enum { | ||
EXEC_WAITING, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This state semantically looks like EXEC_INITED
. Curious, what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you provide more docs about the new state and its interaction with other states here. It'll be curious to read about the purpose of introduced queue, the use of the timer and how it reflects on request state transitions.
* but also start a timer as this statement should not sit | ||
* in the queue for too long. In the case the timer expires | ||
* the statement will just fail with SQLITE_BUSY. */ | ||
err = raft_timer_start(l->raft, &req->timer, l->db->config->busy_timeout, 0, exec_timeout_cb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is too long. Please try to fit it in 80 chars.
* in the queue for too long. In the case the timer expires | ||
* the statement will just fail with SQLITE_BUSY. */ | ||
err = raft_timer_start(l->raft, &req->timer, l->db->config->busy_timeout, 0, exec_timeout_cb); | ||
if (err != RAFT_RESULT_OK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ignore][sigh] Also, I can't believe that our time_start()
function allocates.
t = raft_malloc(sizeof *t); | ||
assert(t != NULL); | ||
t->type = TIMER; | ||
t->completion_time = *io->time + (timeout ? timeout : 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just + timeout
?
This PR adds a write queue for the execution of statements.
To add that I had to change a lot the execution model which was, depending on the state, either sync or async. It is now async only. This is a good thing IMHO as the handling of the result can be done always in the callback while before this change it had to be managed both from the caller (in case of
LEADER_NOT_ASYNC
) and in the callback, spreading the handling logic and creating weird dependencies in the caller logic and in the tests.This PR also depends on
sqlite3_txn_state
which appeared in sqlite 3.34. Given that20.04
is stuck with version3.31
, this is dropping the tests for that version. This is not a big deal as I see unlikely that20.04
will ever get dqlite1.18
as it would be a new version in a frozen release.