Skip to content
This repository was archived by the owner on Oct 9, 2023. It is now read-only.

Apply interruptible override from workflow execution config #minor #429

Merged
merged 5 commits into from
May 2, 2022

Conversation

MorpheusXAUT
Copy link
Contributor

TL;DR

Applies interruptible override from workflow execution spec to nodes, allowing for workflows to be flagged as interruptible for a single execution.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

The ExecutionConfig of a FlyteWorkflow now contains an Interruptible flag, indicating a workflow should be marked as interruptible for a single execution. The IsInterruptible getter of FlyteWorkflow has been adapted to return either the new Interruptible flag of the ExecutionConfig or the already existing one from the NodeDefaults, maintaining backward compatibility. Furthermore, the existing defaults logic for interruptible tasks still applies - a task defined with interruptible=False will still remain non-schedulable for spot instances.
Two minor "interruptible" typos in the NodeSpec and nodeExecMetadata types have been fixed - changes are backward compatible since the json representation had already been spelled correctly.

Tests have been added to verify correct application of interruptible "inheritance" for nodes, smoke testing has been performed manually for workflow/single task executions and relaunches together with the flyteadmin changes.

Tracking Issue

flyteorg/flyte#2284

Follow-up issue

NA

Nick Müller added 2 commits April 21, 2022 10:55
Allows for override of interruptible flag for a single execution, NodeDefaults still apply if not set in config
Fixed minor typos in NodeSpec/nodeExecMetadata

Signed-off-by: Nick Müller <[email protected]>
Signed-off-by: Nick Müller <[email protected]>
@welcome
Copy link

welcome bot commented Apr 21, 2022

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

Allows for workflow interruptible flag to be overwritten to false as well (task settings still take preference)

Signed-off-by: Nick Müller <[email protected]>
@MorpheusXAUT
Copy link
Contributor Author

Whilst a workflow can now have its interruptible setting overwritten to false, the task settings still take preference - if a task is defined with interruptible=True, it will not be overwritten to false, even if the workflow execution config has Interruptible: false specified.
If perform a single execution with an override disabling spot-instances should also overrule all task preferences, some further changes are needed.

@EngHabu EngHabu requested review from hamersaw and removed request for kumare3 April 26, 2022 11:55
@EngHabu
Copy link
Contributor

EngHabu commented Apr 26, 2022

@MorpheusXAUT thank you for your contribution!

@hamersaw, mind taking a look at this?

Copy link
Contributor

@hamersaw hamersaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the workflow interruptible flag only applies as a default in the case that interruptible is not explicitly set on the node. Looks like this functionality is quite old. I don't think this is necessarily intuitive in some cases. For example, if the workflow interruptible is set to false some nodes may still be set as interruptible.

Not saying this is a change we should make, but if we are further exposing this functionality (in the UI changes) perhaps it warrants a brief discussion. Have you thought about this? My first thought is it may make more sense to handle as follows:
workflow interruptible = true then all nodes are interruptible
workflow interruptible not set then use the interruptible set on nodes
workflow interruptible = false then all nodes are non-interruptible

I believe it would only require a simple change to this conditional.

@@ -143,7 +143,7 @@ type NodeSpec struct {
ActiveDeadline *v1.Duration `json:"activeDeadline,omitempty"`
// The value set to True means task is OK with getting interrupted
// +optional
Interruptibe *bool `json:"interruptible,omitempty"`
Interruptible *bool `json:"interruptible,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@MorpheusXAUT
Copy link
Contributor Author

If I understand correctly, the workflow interruptible flag only applies as a default in the case that interruptible is not explicitly set on the node. Looks like this functionality is quite old. I don't think this is necessarily intuitive in some cases. For example, if the workflow interruptible is set to false some nodes may still be set as interruptible.

That's correct - the interruptible flag on the node currently has the "last call" when it comes to picking a setting.

Not saying this is a change we should make, but if we are further exposing this functionality (in the UI changes) perhaps it warrants a brief discussion. Have you thought about this? My first thought is it may make more sense to handle as follows:
workflow interruptible = true then all nodes are interruptible
workflow interruptible not set then use the interruptible set on nodes
workflow interruptible = false then all nodes are non-interruptible

I did think about that and initially had the override "skip" the node settings just like you suggested, but actually chose to keep the current behavior after talking to our team about what their expectations would be.
Our main concern point was with tasks that cannot properly be run as interruptible (e.g. because of side effects) being overriden because the person running them might not fully know about these pitfalls. Allowing the person writing the workflow to explicitly disabling the interruptible setting no matter what lets them ensure there's no issues in the future.

For that reason, I wouldn't force every node to run as interruptible when using the workflow override, although I agree that it might be a bit unintuitive to users in some case.
The other way around - forcing some nodes that "want" to be interruptible to not be - would not be as "problematic" I'd say, but the question is if adding just that wouldn't be even more inconsistent...

I believe it would only require a simple change to this conditional.

Yeah, that should do it, although we'd probably also need to change the signature of IsInterruptible() to return a *bool so we can distinguish between no override being provided and it being set to false - not sure how many other changes that would require off the top of my head.

@hamersaw
Copy link
Contributor

If I understand correctly, the workflow interruptible flag only applies as a default in the case that interruptible is not explicitly set on the node. Looks like this functionality is quite old. I don't think this is necessarily intuitive in some cases. For example, if the workflow interruptible is set to false some nodes may still be set as interruptible.

That's correct - the interruptible flag on the node currently has the "last call" when it comes to picking a setting.

Not saying this is a change we should make, but if we are further exposing this functionality (in the UI changes) perhaps it warrants a brief discussion. Have you thought about this? My first thought is it may make more sense to handle as follows:
workflow interruptible = true then all nodes are interruptible
workflow interruptible not set then use the interruptible set on nodes
workflow interruptible = false then all nodes are non-interruptible

I did think about that and initially had the override "skip" the node settings just like you suggested, but actually chose to keep the current behavior after talking to our team about what their expectations would be. Our main concern point was with tasks that cannot properly be run as interruptible (e.g. because of side effects) being overriden because the person running them might not fully know about these pitfalls. Allowing the person writing the workflow to explicitly disabling the interruptible setting no matter what lets them ensure there's no issues in the future.

Reasonable and very interesting, I wrongly assumed your use-case was the opposite, setting the workflow interruptible to false where a node is set to true.

For that reason, I wouldn't force every node to run as interruptible when using the workflow override, although I agree that it might be a bit unintuitive to users in some case. The other way around - forcing some nodes that "want" to be interruptible to not be - would not be as "problematic" I'd say, but the question is if adding just that wouldn't be even more inconsistent...

Agreed, I would much rather remain consistent.

I believe it would only require a simple change to this conditional.

Yeah, that should do it, although we'd probably also need to change the signature of IsInterruptible() to return a *bool so we can distinguish between no override being provided and it being set to false - not sure how many other changes that would require off the top of my head.

Once the IDL PR is merged and go.mod updated I can approve this. Just a note for future updates. If there is a general want for overriding interruptible on all nodes rather than defaulting if unset we could explore updating the workflow field to something like an InterruptbileStyle enum that has AllNonInterruptible, AllInterruptible, DefaultNonInterruptible, DefaultInterruptible, etc to remove any confusion and provide a feature-full implementation. I think at this moment these changes may be excessive.

@codecov
Copy link

codecov bot commented Apr 27, 2022

Codecov Report

Merging #429 (beb6c0e) into master (24c2200) will increase coverage by 0.02%.
The diff coverage is 62.50%.

❗ Current head beb6c0e differs from pull request most recent head c5a74f4. Consider uploading reports for the commit c5a74f4 to get more accurate results

Signed-off-by: Nick Müller <[email protected]>
@hamersaw hamersaw self-requested a review May 2, 2022 21:15
@hamersaw hamersaw merged commit 6967837 into flyteorg:master May 2, 2022
@welcome
Copy link

welcome bot commented May 2, 2022

Congrats on merging your first pull request! 🎉

eapolinario pushed a commit to eapolinario/flytepropeller that referenced this pull request Aug 9, 2023
…lyteorg#429)

* Workflows can now be marked as interruptible via execution config
Allows for override of interruptible flag for a single execution, NodeDefaults still apply if not set in config
Fixed minor typos in NodeSpec/nodeExecMetadata

Signed-off-by: Nick Müller <[email protected]>

* Updated golden files

Signed-off-by: Nick Müller <[email protected]>

* Interruptible override from execution config is now optional
Allows for workflow interruptible flag to be overwritten to false as well (task settings still take preference)

Signed-off-by: Nick Müller <[email protected]>

* Use released flyteidl version

Signed-off-by: Nick Müller <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants