Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for worker persistence with KEDA v2.0.0 in helm chart #13209

Merged
merged 2 commits into from
Mar 10, 2021

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented Dec 21, 2020

Here we upgrade keda to v2.0.0 in helm chart update the docs since things have changed a bit.

Notably, since 2.0.0 adds support for statefulsets, the ScaledObject definition is updated to add support for the case where workers.persistence.enabled=True.

@boring-cyborg boring-cyborg bot added the area:helm-chart Airflow Helm Chart label Dec 21, 2020
@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch 2 times, most recently from 5c3b003 to 2c8fc60 Compare December 21, 2020 16:45
@dstandish dstandish changed the title Upgrade keda to v2.0.0 Upgrade keda to v2.0.0 and support worker persistence Dec 21, 2020
@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch 2 times, most recently from 2d1191c to 7480a7b Compare December 24, 2020 08:52
@dstandish dstandish changed the title Upgrade keda to v2.0.0 and support worker persistence Add support for worker persistence with upgrade to KEDA v2.0.0 in helm chart Dec 24, 2020
@dstandish dstandish changed the title Add support for worker persistence with upgrade to KEDA v2.0.0 in helm chart Add support for worker persistence with KEDA v2.0.0 in helm chart Dec 24, 2020
@mik-laj mik-laj self-requested a review December 25, 2020 17:25
@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch from 7480a7b to 1e15128 Compare December 26, 2020 08:08
@dstandish
Copy link
Contributor Author

@mik-laj following your comment here #13183 (comment) i added a mechanism for validating CRD objects

i enabled schema validation for the keda scaledojbect, and also removed the option to disable schema validation since it's no longer necessary

@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch 2 times, most recently from 75fcd02 to 3e8525d Compare January 9, 2021 17:41
@github-actions
Copy link

github-actions bot commented Jan 9, 2021

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.

@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch from 3e8525d to 9576585 Compare February 5, 2021 06:01
@github-actions
Copy link

github-actions bot commented Feb 5, 2021

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch 4 times, most recently from ba0e1cb to 8df0757 Compare February 11, 2021 02:40
Copy link
Contributor

@ianstanton ianstanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dstandish Minor fix requested in README

@mik-laj
Copy link
Member

mik-laj commented Mar 6, 2021

Can you do a rebase?

Copy link
Member

@mik-laj mik-laj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dstandish
Copy link
Contributor Author

Great -- will do

@@ -24,27 +24,23 @@ KEDA stands for Kubernetes Event Driven Autoscaling.
`KEDA <https://github.com/kedacore/keda>`__ is a custom controller that
allows users to create custom bindings to the Kubernetes `Horizontal Pod
Autoscaler <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>`__.
We have built scalers that allows users to create scalers based on
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed this sentence because if i recall correctly the "we" is astronomer, so in this repo it doesn't really have the same meaning and i don't think the statement belongs. but it's also sortof beside the point.

Copy link
Member

@turbaszek turbaszek Mar 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly it was me and Daniel, so not only Astronomer. Also Airflow is on KEDA users page. But I agree that it creates not value here.

@@ -61,7 +88,7 @@ def validate_k8s_object(instance):
validate.validate(instance)


def render_chart(name="RELEASE-NAME", values=None, show_only=None, validate_schema=True):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_schema was added recently to allow tests for CRDs but with this PR we have a way of pulling in the definition.
it does require a network call here fwiw: https://github.com/apache/airflow/pull/13209/files#diff-91361141bde70cbfc90142ebe2377ce918025d3343d9216e00e50c66f083b12cR65

pollingInterval: {{ .Values.workers.keda.pollingInterval }} # Optional. Default: 30 seconds
cooldownPeriod: {{ .Values.workers.keda.cooldownPeriod }} # Optional. Default: 300 seconds
maxReplicaCount: {{ .Values.workers.keda.maxReplicaCount }} # Optional. Default: 100
triggers:
- type: postgresql
metadata:
targetQueryValue: "1"
connection: AIRFLOW_CONN_AIRFLOW_DB
query: "SELECT ceil(COUNT(*)::decimal / {{ .Values.config.celery.worker_concurrency }}) FROM task_instance WHERE state='running' OR state='queued'"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while updating connectionFromEnv just decided to make this a bit more readable by splitting onto separate lines

@github-actions
Copy link

github-actions bot commented Mar 7, 2021

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Mar 7, 2021
@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch 2 times, most recently from 063d9ae to cc33c3e Compare March 7, 2021 02:35
@dstandish
Copy link
Contributor Author

@mik-laj do you want to have another look post-rebase?

Also what is the convention for committer-authors? Should the author merge it themselves after there is approval or is it expected that you wait for the reviewer to merge?

Thank you 🙏

@ashb
Copy link
Member

ashb commented Mar 8, 2021

Also what is the convention for committer-authors? Should the author merge it themselves after there is approval or is it expected that you wait for the reviewer to merge?

Generally we can merge our selves after a review. Use your judgment if you need a re-review.

@dstandish
Copy link
Contributor Author

Generally we can merge our selves after a review. Use your judgment if you need a re-review

OK in that case since the rebase was just about docs and nothing materially changed i'll go ahead and merge.

Thanks all

@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch from cc33c3e to b4f61d8 Compare March 8, 2021 16:15
@dstandish
Copy link
Contributor Author

forgot to push fixup adding autoscaler to spelling wordlist will merge afterward

KEDA 2.0.0 introduces support for StatefulSets, so we can now remove the constraint that
worker persistence must be disabled.
@dstandish dstandish force-pushed the upgrade-keda-to-2.0.0 branch from b4f61d8 to 3c814cf Compare March 8, 2021 16:19
@dstandish
Copy link
Contributor Author

dstandish commented Mar 8, 2021

@mik-laj @ianstanton after taking another look at this i thought that adding the namedtuple in this case was maybe a little overkill / confusing because the elements are never accessed it's just a way constructing a key in the dict for the URL lookup, given the two components -- api version and kind.

can you please take a look and let me know what you think of this perhaps simpler (though less explicit) alternative.

and maybe you have a better suggestion.

crd_lookup = {
    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',
}
...
def get_schema_crd(api_version, kind):
    url = crd_lookup.get(f"{api_version}::{kind}")
    if not url:
        return None
    response = requests.get(url)
    yaml_schema = response.content.decode('utf-8')
    schema = yaml.safe_load(StringIO(yaml_schema))
    return schema

so i'm just joining the two components with :: to make a string that's pretty straightforward to understand so that we don't need the extra K8sObject namedtuple. Also leaving the key as a regular tuple does work but end up much uglier with line breaks.

i'm gonna just go ahead and add this in a second commit and request review and go from there

@dstandish dstandish requested a review from mik-laj March 8, 2021 16:42
@@ -31,9 +32,12 @@

BASE_URL_SPEC = "https://raw.githubusercontent.com/instrumenta/kubernetes-json-schema/master/v1.14.0"

crd_lookup = {
'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml', # noqa: E501 # pylint: disable=line-too-long
}
Copy link
Contributor Author

@dstandish dstandish Mar 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the above concatenation of api version and kind in this way, keda.sh/v1alpha1::ScaledObject is a made-up syntax that only serves to provide a simple way to look up the url for a particular combination of api version + kind for CRDs

it is only referenced in one place:

url = crd_lookup.get(f"{api_version}::{kind}")

@ianstanton
Copy link
Contributor

@mik-laj @ianstanton after taking another look at this i thought that adding the namedtuple in this case was maybe a little overkill / confusing because the elements are never accessed it's just a way constructing a key in the dict for the URL lookup, given the two components -- api version and kind.

can you please take a look and let me know what you think of this perhaps simpler (though less explicit) alternative.

and maybe you have a better suggestion.

crd_lookup = {
    'keda.sh/v1alpha1::ScaledObject': 'https://raw.githubusercontent.com/kedacore/keda/v2.0.0/config/crd/bases/keda.sh_scaledobjects.yaml',
}
...
def get_schema_crd(api_version, kind):
    url = crd_lookup.get(f"{api_version}::{kind}")
    if not url:
        return None
    response = requests.get(url)
    yaml_schema = response.content.decode('utf-8')
    schema = yaml.safe_load(StringIO(yaml_schema))
    return schema

so i'm just joining the two components with :: to make a string that's pretty straightforward to understand so that we don't need the extra K8sObject namedtuple. Also leaving the key as a regular tuple does work but end up much uglier with line breaks.

i'm gonna just go ahead and add this in a second commit and request review and go from there

@dstandish this approach seems acceptable to me 👍

@dstandish dstandish merged commit 91edde6 into apache:master Mar 10, 2021
@dstandish dstandish deleted the upgrade-keda-to-2.0.0 branch March 10, 2021 19:07
@dstandish
Copy link
Contributor Author

@mik-laj @kaxil One question about this.... This helm chart isn't "released" ... so I guess there's no need to put anything in updating.md for this update at this time.... but let me know if that's not true and you want me to add something somewhere....

@kaxil
Copy link
Member

kaxil commented Mar 10, 2021

@mik-laj @kaxil One question about this.... This helm chart isn't "released" ... so I guess there's no need to put anything in updating.md for this update at this time.... but let me know if that's not true and you want me to add something somewhere....

Yup that is true, currently we don't need to put anything in Updating.md

cognifloyd pushed a commit to cognifloyd/stackstorm-k8s that referenced this pull request Feb 16, 2022
…ache/airflow#13209)

* Upgrade KEDA to v2.0.0 and add support for worker persistence

KEDA 2.0.0 introduces support for StatefulSets, so we can now remove the constraint that
worker persistence must be disabled.

Co-authored-by: Daniel Standish <[email protected]>

Partial Commit Extracted From: https://github.com/apache/airflow
@nirroz93
Copy link
Contributor

Hi, sorry for commenting an old PR. I see here https://github.com/apache/airflow/blob/main/chart/values.yaml#LL501C4-L501C15 there is still a comment
# Persistence.enabled must be set to false to use KEDA.

as far as I understand from this PR, the comment is a leftover (since this PR), right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:helm-chart Airflow Helm Chart okay to merge It's ok to merge this PR as it does not require more tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants