Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

errors: final cleanups (restructure ExecutionError and introduce SchemaAgreementError) #1204

Merged
merged 13 commits into from
Feb 10, 2025

Conversation

muzarski
Copy link
Contributor

@muzarski muzarski commented Feb 6, 2025

Fixes: #519

This PR originated from our offline discussion with @Lorak-mmk .

Changes

ProtocolError

We remove this type altogether. We noticed that this error type was a catch-all for sub-errors/variants we did not know where to put otherwise.

ExecutionError

Before this PR, it would contain a lot of top-level variants without providing some meaningful context for them. Most of the variants came from the RequestAttemptError::into_execution_error method.

Inlining RequestAttemptError

We decided that it's best to inline RequestAttemptError, so users have additional context about how request execution looks like when they match the error. It allows us to remove a lot of variants from ExecutionError and ProtocolError (which is later removed).

SchemaAgreementError

New error type - returned by public API related to awaiting schema agreement (Session::await/check_schema_agreement).

BadQuery

Me and @Lorak-mmk agreed that final version (after this PR) of BadQuery is just fine. I did not do any changes to this error type, apart from moving one of the ProtocolError variants to it. I remember that @wprzytula did not like the name of this error - any suggestions?

Potential improvements for the future

SerializationError appears in two places

  1. BadQuery::SerializationError
    This is an error that appears on standard execution path for Session::execute and Session::batch, when we serialize the values (before doing the retries etc.)

  2. ExecutionError::LastAttemptError(RequestAttemptError::SerializationError)
    This error is constructed ONLY for Session::query with non-empty values. Why do we currently need a SerializationError in RequestAttemptError - i.e. why do we serialize on each attempt? This is because we need to firstly prepare the statement during each attempt - the statement needs to be firstly prepared on the node during current attempt. Only after preparation, have we access to the metadata and are able to serialize the values.

After discussion, we decided to currently leave it as is. This is something to be potentially fixed in the future. The fix requires non-trivial changes to execution logic (which is quite complex). The idea is to prepare the statement only once before the retries - then we can serialize the values and pass the serialized values to the context of each attempt. I think it deserves a separate issue in the repository, since it not only will fix the hierarchy of errors, but will improve the overall performance as well - we won't need to prepare the statement on each attempt (we can send QUERY cql request to the node).

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • [ ] I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • [ ] I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@muzarski muzarski self-assigned this Feb 6, 2025
@muzarski muzarski marked this pull request as draft February 6, 2025 14:08
@muzarski muzarski added the API-stability Part of the effort to stabilize the API label Feb 6, 2025
@muzarski muzarski added this to the 1.0.0 milestone Feb 6, 2025
@github-actions github-actions bot added the semver-checks-breaking cargo-semver-checks reports that this PR introduces breaking API changes label Feb 6, 2025
Copy link

github-actions bot commented Feb 6, 2025

cargo semver-checks detected some API incompatibilities in this PR.
Checked commit: 34afac7

See the following report for details:

cargo semver-checks output
./scripts/semver-checks.sh --baseline-rev bceac6aaa852d51699b73820fc5317d75e3e21d4
+ cargo semver-checks -p scylla -p scylla-cql --baseline-rev bceac6aaa852d51699b73820fc5317d75e3e21d4
     Cloning bceac6aaa852d51699b73820fc5317d75e3e21d4
    Building scylla v0.15.0 (current)
       Built [  22.994s] (current)
     Parsing scylla v0.15.0 (current)
      Parsed [   0.047s] (current)
    Building scylla v0.15.0 (baseline)
       Built [  22.263s] (baseline)
     Parsing scylla v0.15.0 (baseline)
      Parsed [   0.044s] (baseline)
    Checking scylla v0.15.0 -> v0.15.0 (no change)
     Checked [   0.119s] 127 checks: 124 pass, 3 fail, 0 warn, 0 skip

--- failure enum_missing: pub enum removed or renamed ---

Description:
A publicly-visible enum cannot be imported by its prior path. A `pub use` may have been removed, or the enum itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.39.0/src/lints/enum_missing.ron

Failed in:
  enum scylla::errors::ProtocolError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:194
  enum scylla::errors::SchemaVersionFetchError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:260

--- failure enum_variant_missing: pub enum variant removed or renamed ---

Description:
A publicly-visible enum has at least one variant that is no longer available under its prior name. It may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.39.0/src/lints/enum_variant_missing.ron

Failed in:
  variant ExecutionError::DbError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:49
  variant ExecutionError::CqlRequestSerialization, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:57
  variant ExecutionError::BodyExtensionsParseError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:61
  variant ExecutionError::CqlResultParseError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:74
  variant ExecutionError::CqlErrorParseError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:78
  variant ExecutionError::ProtocolError, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:90
  variant ExecutionError::BrokenConnection, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:94
  variant ExecutionError::UnableToAllocStreamId, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:98
  variant ExecutionError::SchemaAgreementTimeout, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:109

--- failure inherent_method_missing: pub method removed or renamed ---

Description:
A publicly-visible method or associated fn is no longer available under its prior name. It may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.39.0/src/lints/inherent_method_missing.ron

Failed in:
  RequestAttemptError::into_execution_error, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-bceac6aaa852d51699b73820fc5317d75e3e21d4/a2020bf1cde8f2c41199b73119bb64096056f0b9/scylla/src/errors.rs:908

     Summary semver requires new major version: 3 major and 0 minor checks failed
    Finished [  46.624s] scylla
    Building scylla-cql v0.4.0 (current)
       Built [  11.339s] (current)
     Parsing scylla-cql v0.4.0 (current)
      Parsed [   0.028s] (current)
    Building scylla-cql v0.4.0 (baseline)
       Built [  11.215s] (baseline)
     Parsing scylla-cql v0.4.0 (baseline)
      Parsed [   0.026s] (baseline)
    Checking scylla-cql v0.4.0 -> v0.4.0 (no change)
     Checked [   0.112s] 127 checks: 126 pass, 1 fail, 0 warn, 0 skip

--- failure enum_marked_non_exhaustive: enum marked #[non_exhaustive] ---

Description:
A public enum has been marked #[non_exhaustive]. Pattern-matching on it outside of its crate must now include a wildcard pattern like `_`, or it will fail to compile.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.39.0/src/lints/enum_marked_non_exhaustive.ron

Failed in:
  enum DbError in /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/frame/response/error.rs:155

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  23.236s] scylla-cql
make: *** [Makefile:61: semver-rev] Error 1

@muzarski muzarski force-pushed the inline-request-error branch 2 times, most recently from 8a7a9e7 to f23611c Compare February 6, 2025 15:46
@muzarski muzarski marked this pull request as ready for review February 6, 2025 16:17
Comment on lines 3 to 36
// Re-export error types from pager module.
pub use crate::client::pager::{NextPageError, NextRowError};

// Re-export error types from query_result module.
pub use crate::response::query_result::{
FirstRowError, IntoRowsResultError, MaybeFirstRowError, ResultNotRowsError, RowsError,
SingleRowError,
};

// Re-export error type from authentication module.
pub use crate::authentication::AuthError;

// Re-export error types from scylla-cql.
pub use scylla_cql::deserialize::{DeserializationError, TypeCheckError};
pub use scylla_cql::frame::frame_errors::{
CqlAuthChallengeParseError, CqlAuthSuccessParseError, CqlAuthenticateParseError,
CqlErrorParseError, CqlEventParseError, CqlRequestSerializationError, CqlResponseParseError,
CqlResultParseError, CqlSupportedParseError, FrameBodyExtensionsParseError,
FrameHeaderParseError,
};
pub use scylla_cql::frame::request::CqlRequestKind;
pub use scylla_cql::frame::response::error::{DbError, OperationType, WriteType};
pub use scylla_cql::frame::response::CqlResponseKind;
pub use scylla_cql::serialize::SerializationError;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Re-exporting errors from scylla_cql of course makes sense. I'm wondering about errors from scylla. Should we re-export them here, or maybe move them here?
And if we decide to use re-exports, which errors should be defined in the errors module, and which whould only be re-exported here?

@wprzytula @muzarski

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 In any case, please move re-exports after imports.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 In any case, please move re-exports after imports.

Done

Copy link
Contributor Author

@muzarski muzarski Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Re-exporting errors from scylla_cql of course makes sense. I'm wondering about errors from scylla. Should we re-export them here, or maybe move them here?

I'm fine with either approach.

And if we decide to use re-exports, which errors should be defined in the errors module, and which whould only be re-exported here?

I believe errors such as NextRowError/IntoRowsResultError were introduced in different module, since they are strictly tied to one object (in these cases it's QueryPager and QueryResult). Errors in errors module are rather related to methods on Session. But I'm pretty sure this does not apply to all errors in this module (e.g. RequestError used only in RetrySession API). So, actually, I think we could move ALL error types to scylla::errors module. Current division is a bit misleading.

This commit inlines the RequestAttemptError in ExecutionError.

It also adjusts some tests which otherwise are failing. This is because
some variants land in other place than they would before this commit.

In the next commits we will start cleaning up the ExecutionError - i.e.
removing unused variants and moving some of them to other places.

The purpose why we inline RequestAttemptError is to remove some
error variants from the top level. For example, `CqlErrorParseError` should
not be placed on the top level of the error hierarchy. It is a very specific
and users should get some more context about it - i.e. that this is the cause
of failure of the last attempt.
I removed the variants that can be removed "for free", i.e. no further changes in
the codebase are required.
Removed the variant and adjusted the tests.
This commit moves the NonfinishedPagingState variant from the ProtocolError
to RequestAttemptError (apologize for the short names in the title, otherwise
it wouldn't fit in one line in GitHub ui).

Why we do this? The conversion (e.g. NonErrorQueryResponse::into_query_result)
methods for response kinds are currently a bit problematic. Some of them
return `RequestAttemptError` (e.g. QueryResponse::into_non_error_query_response),
while some of them are forced to return `ExecutionError` (e.g. [NonError]QueryResponse::into_query_result)
because of the aforementioned variant, which was placed in ProtocolError.

Why is this problematic? Well, these conversion methods are called from different
contexts:
- in Session layer, at the end of execution path - it makes sense to return
  ExecutionError there, because it's the final error type returned from the
  execution API.
- in Connection layer, in pub(crate) methods (e.g. Connection::query_unpaged)
  that are used internally by e.g. schema agreement logic. It's not a good
  idea to extend the error type to ExecutionError in such case. Connection methods
  should return errors representing an error of a single attempt (i.e. RequestAttemptError).

Our goal is to narrow the return error type of `Connection::fetch_schema_version` to
RequestAttemptError. This method calls `Connection::query_unpaged`, which then
calls `QueryResponse::into_query_result`. In result, `Connection::fetch_schema_version`
is forced to return ExecutionError.

Starting from the following commit, we will narrow the return type of methods related to schema agreement.
We will introduce a self-contained error type, i.e. SchemaAgreementError, which
can be returned from public API (Session::await_schema_agreement).
The idea behind it was explained in the previous commit.
Few things happen here.

1. Introducing new error type SchemaAgreementTimeout
This is rather a rename of `SchemaVersionFetch` error type. We extend the type
by additional variants.

2. Removing ProtocolError::SchemaVersionFetch variant

3. Narrowing the error type of schema agreement methods
Thanks to that,  public API methods (e.g. Session::await_schema_agreement)
can return a self-contained error type providing more context than previously
returned ExecutionError.

4. Replacing ExecutionError::SchemaAgreementTimeout with SchemaAgreementError variant
SchemaAgreementError dependency is still needed, because of auto await schema agreement
configuration. By default, the driver awaits schema agreement on schema altering
queries.

5. Adjusting the example
Previously, the example would (unjustifiably) match against ExecutionError::RequestTimeout
error. Even before this PR, it should match against ExecutionError::SchemaAgreementTimeout.
Thanks to this commit, we were able to spot and fix this issue.
I believe that this is currently the best fitting place for this variant.
We have removed the ProtocolError type, which was previously used as a
catch-all for various sub-errors and variants. It was something like
"I don't know where it belongs, so I'm gonna put it under ProtocolError".
So the order resembles the order of variants construction during execution.
So they are accessible via scylla::errors
@muzarski muzarski force-pushed the inline-request-error branch from f23611c to 6b271fb Compare February 7, 2025 13:53
Copy link
Collaborator

@Lorak-mmk Lorak-mmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed some pub types without non_exhaustive. Please tell me if they should remain that way, have non_exhaustive added, or be unpubed.

  • FirstRowError
  • IntoRowsResultError
  • MaybeFirstRowError
  • ResultNotRowsError
  • RowsError
  • SingleRowError
    cc @wprzytula because they come from deserialization refactor.

Also:

  • DbError (!)
  • OperationType
  • WriteType

We also have

/// Type to represent an authentication error message.
pub type AuthError = String;

is it indentional?

@muzarski
Copy link
Contributor Author

muzarski commented Feb 7, 2025

I noticed some pub types without non_exhaustive. Please tell me if they should remain that way, have non_exhaustive added, or be unpubed.

* `FirstRowError`

* `IntoRowsResultError`

* `MaybeFirstRowError`

* `ResultNotRowsError`

* `RowsError`

* `SingleRowError`
  cc @wprzytula because they come from deserialization refactor.

I remember asking about it during deserialization PRs. I don't remember why we decided not to mark them as non_exhaustive.

Also:

* `DbError` (!)

* `OperationType`

* `WriteType`

These should definitely be non_exhaustive. Especially the DbError (if Scylla ever introduces new error types).

We also have

/// Type to represent an authentication error message.
pub type AuthError = String;

is it indentional?

Why wouldn't it be? We could always convert it to Arc<dyn Error> or something like that. I.e. do something like with Serialization/DeserializationErrors. It has to be a dynamic error type (in this case just a String), because users need to be able to define their custom types - the error is returned from a public trait.

@Lorak-mmk
Copy link
Collaborator

Why wouldn't it be?

I'm not saying it is wrong, I'm just asking if it is intentionally this way or was just overlooked.

@muzarski
Copy link
Contributor Author

muzarski commented Feb 7, 2025

Why wouldn't it be?

I'm not saying it is wrong, I'm just asking if it is intentionally this way or was just overlooked.

Ok. So would you leave it as a String, or do something like:

pub AuthError(Arc<dyn Error>);

// Errors returned by driver's  default implementations of the trait (if there are any - I'm not sure)
pub enum BuiltinAuthErrorKind {
...
}

This would look like the [De]SerializationError.

@Lorak-mmk
Copy link
Collaborator

Let's leave it as string.

I extracted the part of speculative_execution::can_be_ignored as
DbError method. This way, we don't have to match against "_" (because
method is defined in the same module as the type - namely scylla_cql).
@muzarski
Copy link
Contributor Author

muzarski commented Feb 7, 2025

v1.1:

  • marked DbError as non_exhaustive
  • narrow the return error type of public ClusterState methods related to token calculation

We still need to discuss and agree on whether we want to pack all the error types used in the driver to scylla::errors or not.

@wprzytula wprzytula merged commit 86efc40 into scylladb:main Feb 10, 2025
11 checks passed
@Lorak-mmk Lorak-mmk mentioned this pull request Mar 4, 2025
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API-stability Part of the effort to stabilize the API semver-checks-breaking cargo-semver-checks reports that this PR introduces breaking API changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better names and higher granularity for errors
3 participants