Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cancellation example to Cosmos #457

Merged
merged 1 commit into from
Oct 29, 2021

Conversation

yoshuawuyts
Copy link
Contributor

This PR was co-authored with @rylev.

This PR is a follow-up to #392 (review), adding a cancellation example for the SDK. This PR shows off two methods of cancellation: using a timeout, and using a stop token. But in practice users are free to use any kind of cancellation method they want; what the SDK guarantees is that if the corresponding future is dropped, the request is in fact terminated.

As you can probably tell by reading the code, ensuring cancellation is propagated does not require any changes to the codebase itself. This is because both our hyper and reqwest backends cancel their corresponding requests when dropped. We're including this example solely to show how users can cancel requests themselves.

Differences with Golang

The Go context package has three distinct cancellation methods: cancel after a certain time, cancel at a specific time, and cancel based on a signal. In our example we show how to cancel in response to a signal, and at a specific time. This is the core of the time-based example:

let deadline = Instant::now() + Duration::from_secs(1);
match future.until(deadline).await {

Hypothetically the until method could also take the relative Duration type directly. But internally it would still need to convert the relative type to a concrete instance 1. Because both are equivalent, we've only chosen to include a single time-based example.

The final difference with the Golang Context object is that our context does is not yet able to store key-value pairs. We should consider adapting our context to be able to store objects through a typemap interface in a future PR. But that's out of scope for this PR, and we should address that separately.

Cancellation Trees and Cancellation Propagation

As we talked about during our sync meetings and in prior PRs, the main property we want to guarantee is that cancellation propagates. As far as I understand it, this is what we're referring to when we're talking about trees in the context of cancellation.

The SDK as it currently exists ensures that when the client or any of its outgoing requests are dropped, its internal resources are also cleaned up 2. For a request this means the request is cancelled, and the TCP connection is handed back to the connection pool. For the client it means that when it's dropped the pool of open TCP connections is closed down, and all of the file descriptors are closed.

This ensures that however end-users choose to cancel any part of the SDK, it should always just work. Whether we're returning early from a function because of an error, awaiting multiple requests concurrently, or respond to a cancellation token -- the SDK ensures that it itself is well-behaved: it will not leak resources, and close any ongoing operations the moment it knows the user is no longer interested in the results.

Unlike our Golang and C# SDKs we do not need to manually thread through a Context to ensure our cancellation propagates throughout our program. This means that our SDK's interface will differ slightly from what existing Azure customers will be used to from other languages; but only because Rust as a language provides stronger guarantees around cancellation out of the box. But this will also have the consequence that for users of the Rust language, the Azure SDK will likely feel more natural and pleasant to use. Which is something we've in the past said is something that we're especially keen on doing too.

Conclusion

This example shows how to cancel in-flight requests in various manners using the Azure SDK. Because our dependencies already support "cancellation on drop", and we're not doing anything that's incompatible with that, our SDK transitively has that property as well.

Overall we hope this example is useful for people looking to learn how to cancel requests made by the SDK, and act as a useful reference for possible future documentation. Thanks!

cc/ @heaths @JeffreyRichter

Footnotes

  1. The difference between a "duration" and an "instant" is: "5 seconds" vs "5 seconds from now". We cannot meaningfully say anything about time elapsed without comparing two instants. Even in Golang, internally the implementation of withTimeout converts a Duration into Time before calling the withDeadline method.

  2. Note that requests cannot outlive clients due to the borrow checker. So we're guaranteed we have no in-flight requests when the client cleans up its resources.

Copy link
Contributor

@rylev rylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this solution, and it seems like we should be able to handle the use cases that other SDKs use the Context for (with the exception of the key/value store which we should tackle next).

One thing I'd like to see if documentation for the HttpClient trait that documents the requirement that when outgoing request futures are dropped, we expect in-flight requests to be canceled.

env_logger::init();
// First we retrieve the account name and master key from environment variables, and
// create an authorization token.
let account = std::env::var("COSMOS_ACCOUNT").expect("Set env variable COSMOS_ACCOUNT first!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually call this an "endpoint" as it includes the scheme (https) and host info. Our XxxClient constructors require the full endpoint; not just an account.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that the other SDKs take the full HTTP url instead of just the account name? It seems easier on the user to just provide the thing unique to them (their account). We already support other non-azure.com "endpoints" but for those who use the default public cloud, the process is pretty straight forward.

(by the way, this isn't really an issue with this PR, and we probably want to open an issue to discuss this so that it doesn't get lost in this PR)

let account = std::env::var("COSMOS_ACCOUNT").expect("Set env variable COSMOS_ACCOUNT first!");
let master_key =
std::env::var("COSMOS_MASTER_KEY").expect("Set env variable COSMOS_MASTER_KEY first!");
let authorization_token = AuthorizationToken::primary_from_base64(&master_key)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to clean up the auth types (which we call Credentials). So what kind of credential is this? I think we'd normally call this a KeyCredential which you'd construct with the master_key.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heaths were you working on cleaning up credentials? It might be worth having someone deeply familiar with how credentials work in other SDKs try to clean this up a bit.


// Create a new Cosmos client.
let options = CosmosOptions::default();
let client = CosmosClient::new(account.clone(), authorization_token.clone(), options);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Is there any way in Rust to let a customer pass nil for options or do they always have to create it and pass it? Usually, customers don't care about the options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented down below about the same issue with operations. In this case, removing the options argument from CosmosClient::new and adding a CosmosClient::new_with_options would likely be the best approach.

let client = CosmosClient::new(account.clone(), authorization_token.clone(), options);

// Create a new database, and time out if it takes more than 1 second.
let options = CreateDatabaseOptions::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume in Rust this options variable "hides" the previous options field (for client options), right?

Also, again, options are usually not specified by customers so is there any way to just them pass null or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume in Rust this options variable "hides" the previous options field (for client options), right?

Yes - the technical name for this is called "shadowing", and it's fairly common in Rust code.

Also, again, options are usually not specified by customers so is there any way to just them pass null or something?

Unfortunately, Rust does not offer method overloading or default parameters which limits our options considerably. There are two ways we can handle this:

Make the options argument an Option<CreateDatabaseOptions>

This would make the user able to pass None instead of creating an options struct.

client.create_database(Context::new(), "my_database", None); 

When the user wants to pass options they would need to wrap it in the Some variant:

let options = CreateDatabaseOptions::new();
client.create_database(Context::new(), "my_database", Some(options)); 

Have two functions one which takes options and the other which doesn't

This is actually fairly common in Rust.

// When the user doesn't want to specify any options
client.create_database(Context::new(), "my_database"); 

// When the user wants to specify options
let options = CreateDatabaseOptions::new();
client.create_database_with_options(Context::new(), "my_database", options); 

This is somewhat ugly and a tiny bit of a maintenance burden, but given that the user usually would not want to specify options, it's likely the best we can do.

}

tokio::time::sleep(Duration::from_secs(5)).await;
// This causes all cancel tokens to fire. Any request tied to a stop token created
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Could a compute-bound loop get this signal and prematurely stop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in the other PR, but the most straight-forward way to handle compute-bound workloads is to explicitly yield back to the runtime (a la cooperative multithreading). In tokio, this is done with yield_now.

Copy link
Contributor

@MindFlavor MindFlavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. It's simple and clean. ❤️

Copy link
Member

@cataggar cataggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this is pretty! ❤️

@rylev rylev added the Cosmos The azure_cosmos crate label Oct 29, 2021
@rylev
Copy link
Contributor

rylev commented Oct 29, 2021

I'm going to merge this, but we should continue the discussion!

@rylev rylev merged commit cab39cd into Azure:main Oct 29, 2021
@yoshuawuyts yoshuawuyts deleted the cancellation-example branch November 1, 2021 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cosmos The azure_cosmos crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants