Commit 8313523

Turn optional-dependencies in pyproject.toml into dynamic property
While hatchling and pip currently handle dynamic replacement of dependencies even when they are statically defined, this is not correct according to PEP 621. A project property that is declared dynamic must not also contain static values: it is either static or dynamic. This is not a problem for wheel packages installed by any standard tool, because a wheel carries all of its metadata (and does not contain pyproject.toml). But in various cases (such as installing airflow via a GitHub URL or from an sdist) it can make a difference, depending on whether the tool installing airflow reads pyproject.toml directly as an optimization or runs build hooks to prepare the dependencies.

This change makes all optional dependencies dynamic. Rather than baking them into pyproject.toml, we mark them as dynamic, so that any tool that uses pyproject.toml or the sdist PKG-INFO knows that it has to run build hooks to get the actual optional dependencies.

There are a few consequences of that:

* our pyproject.toml will no longer contain an automatically generated part, which is actually good, as it caused some confusion
* all of our dynamic optional dependencies are either present in hatch_build.py or calculated there; this is a bit new but sounds reasonable, and those dynamic dependencies are not updated often, so maintaining them there is not an issue
* the pre-commits that manage the optional dependencies got a lot simpler; a lot of code has been removed.
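The "either static or dynamic" rule from the commit message can be checked mechanically. The following is a minimal sketch, not part of this commit: the function name `static_and_dynamic_fields` and both TOML fragments are hypothetical and do not reproduce Airflow's actual pyproject.toml. It lists every `[project]` field that is declared in `dynamic` and also given a static value.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ImportError:  # older interpreters: assumes the tomli backport is installed
    import tomli as tomllib


def static_and_dynamic_fields(pyproject_text: str) -> list[str]:
    """Return [project] fields that are listed in `dynamic` but also set statically."""
    project = tomllib.loads(pyproject_text).get("project", {})
    dynamic = project.get("dynamic", [])
    return sorted(field for field in dynamic if field in project)


# Hypothetical fragment: optional-dependencies is both dynamic and static.
CONFLICTING = """
[project]
name = "example"
dynamic = ["version", "optional-dependencies"]

[project.optional-dependencies]
devel = ["pytest"]
"""

# Hypothetical fragment: optional-dependencies is dynamic only,
# so a build hook has to supply the values.
CLEAN = """
[project]
name = "example"
dynamic = ["version", "optional-dependencies"]
"""

print(static_and_dynamic_fields(CONFLICTING))
print(static_and_dynamic_fields(CLEAN))
```

The first fragment is reported because `optional-dependencies` appears in both places; the second is what the commit message describes as the target state, where a tool reading pyproject.toml can tell that it must run the build hooks.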
1 parent 25e5d54 commit 8313523

35 files changed, +1331 -1840 lines changed

.dockerignore

-1
Original file line number | Diff line number | Diff line change
@@ -54,7 +54,6 @@
5454
!Dockerfile
5555
!hatch_build.py
5656
!prod_image_installed_providers.txt
57-
!airflow_pre_installed_providers.txt
5857

5958
# This folder is for you if you want to add any packages to the docker context when you build your own
6059
# docker image. most of other files and any new folder you add will be excluded by default

.pre-commit-config.yaml

+9-9
@@ -432,26 +432,26 @@ repos:
432432
additional_dependencies: ['setuptools', 'rich>=12.4.4', 'pyyaml', 'tomli']
433433
- id: check-extra-packages-references
434434
name: Checks setup extra packages
435-
description: Checks if all the extras defined in pyproject.toml are listed in extra-packages-ref.rst file
435+
description: Checks if all the extras defined in hatch_build.py are listed in extra-packages-ref.rst file
436436
language: python
437-
files: ^docs/apache-airflow/extra-packages-ref\.rst$|^pyproject.toml
437+
files: ^docs/apache-airflow/extra-packages-ref\.rst$|^hatch_build.py
438438
pass_filenames: false
439439
entry: ./scripts/ci/pre_commit/pre_commit_check_extra_packages_ref.py
440-
additional_dependencies: ['rich>=12.4.4', 'tomli', 'tabulate']
441-
- id: check-pyproject-toml-order
442-
name: Check order of dependencies in pyproject.toml
440+
additional_dependencies: ['rich>=12.4.4', 'hatchling==1.22.4', 'tabulate']
441+
- id: check-hatch-build-order
442+
name: Check order of dependencies in hatch_build.py
443443
language: python
444-
files: ^pyproject\.toml$
444+
files: ^hatch_build.py$
445445
pass_filenames: false
446-
entry: ./scripts/ci/pre_commit/pre_commit_check_order_pyproject_toml.py
447-
additional_dependencies: ['rich>=12.4.4']
446+
entry: ./scripts/ci/pre_commit/pre_commit_check_order_hatch_build.py
447+
additional_dependencies: ['rich>=12.4.4', 'hatchling==1.22.4']
448448
- id: update-extras
449449
name: Update extras in documentation
450450
entry: ./scripts/ci/pre_commit/pre_commit_insert_extras.py
451451
language: python
452452
files: ^contributing-docs/12_airflow_dependencies_and_extras.rst$|^INSTALL$|^airflow/providers/.*/provider\.yaml$|^Dockerfile.*
453453
pass_filenames: false
454-
additional_dependencies: ['rich>=12.4.4', 'tomli']
454+
additional_dependencies: ['rich>=12.4.4', 'hatchling==1.22.4']
455455
- id: check-extras-order
456456
name: Check order of extras in Dockerfile
457457
entry: ./scripts/ci/pre_commit/pre_commit_check_order_dockerfile_extras.py

Dockerfile

+8-3
@@ -455,13 +455,17 @@ function install_airflow_dependencies_from_branch_tip() {
455455
if [[ ${INSTALL_POSTGRES_CLIENT} != "true" ]]; then
456456
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/postgres,}
457457
fi
458+
local TEMP_AIRFLOW_DIR
459+
TEMP_AIRFLOW_DIR=$(mktemp -d)
458460
# Install latest set of dependencies - without constraints. This is to download a "base" set of
459461
# dependencies that we can cache and reuse when installing airflow using constraints and latest
460462
# pyproject.toml in the next step (when we install regular airflow).
461463
set -x
462-
${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} \
463-
${ADDITIONAL_PIP_INSTALL_FLAGS} \
464-
"apache-airflow[${AIRFLOW_EXTRAS}] @ https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz"
464+
curl -fsSL "https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz" | \
465+
tar xvz -C "${TEMP_AIRFLOW_DIR}" --strip 1
466+
# Make sure editable dependencies are calculated when devel-ci dependencies are installed
467+
${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${ADDITIONAL_PIP_INSTALL_FLAGS} \
468+
--editable "${TEMP_AIRFLOW_DIR}[${AIRFLOW_EXTRAS}]"
465469
set +x
466470
common::install_packaging_tools
467471
set -x
@@ -477,6 +481,7 @@ function install_airflow_dependencies_from_branch_tip() {
477481
set +x
478482
${PACKAGING_TOOL_CMD} uninstall ${EXTRA_UNINSTALL_FLAGS} apache-airflow
479483
set -x
484+
rm -rvf "${TEMP_AIRFLOW_DIR}"
480485
# If you want to make sure dependency is removed from cache in your PR when you removed it from
481486
# pyproject.toml - please add your dependency here as a list of strings
482487
# for example:

Dockerfile.ci

+8-4
@@ -402,13 +402,17 @@ function install_airflow_dependencies_from_branch_tip() {
402402
if [[ ${INSTALL_POSTGRES_CLIENT} != "true" ]]; then
403403
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/postgres,}
404404
fi
405+
local TEMP_AIRFLOW_DIR
406+
TEMP_AIRFLOW_DIR=$(mktemp -d)
405407
# Install latest set of dependencies - without constraints. This is to download a "base" set of
406408
# dependencies that we can cache and reuse when installing airflow using constraints and latest
407409
# pyproject.toml in the next step (when we install regular airflow).
408410
set -x
409-
${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} \
410-
${ADDITIONAL_PIP_INSTALL_FLAGS} \
411-
"apache-airflow[${AIRFLOW_EXTRAS}] @ https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz"
411+
curl -fsSL "https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz" | \
412+
tar xvz -C "${TEMP_AIRFLOW_DIR}" --strip 1
413+
# Make sure editable dependencies are calculated when devel-ci dependencies are installed
414+
${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${ADDITIONAL_PIP_INSTALL_FLAGS} \
415+
--editable "${TEMP_AIRFLOW_DIR}[${AIRFLOW_EXTRAS}]"
412416
set +x
413417
common::install_packaging_tools
414418
set -x
@@ -424,6 +428,7 @@ function install_airflow_dependencies_from_branch_tip() {
424428
set +x
425429
${PACKAGING_TOOL_CMD} uninstall ${EXTRA_UNINSTALL_FLAGS} apache-airflow
426430
set -x
431+
rm -rvf "${TEMP_AIRFLOW_DIR}"
427432
# If you want to make sure dependency is removed from cache in your PR when you removed it from
428433
# pyproject.toml - please add your dependency here as a list of strings
429434
# for example:
@@ -1309,7 +1314,6 @@ COPY airflow/__init__.py ${AIRFLOW_SOURCES}/airflow/
13091314
COPY generated/* ${AIRFLOW_SOURCES}/generated/
13101315
COPY constraints/* ${AIRFLOW_SOURCES}/constraints/
13111316
COPY LICENSE ${AIRFLOW_SOURCES}/LICENSE
1312-
COPY airflow_pre_installed_providers.txt ${AIRFLOW_SOURCES}/
13131317
COPY hatch_build.py ${AIRFLOW_SOURCES}/
13141318
COPY --from=scripts install_airflow.sh /scripts/docker/
13151319

INSTALL

+101-39
@@ -1,6 +1,7 @@
1-
# INSTALL / BUILD instructions for Apache Airflow
1+
INSTALL / BUILD instructions for Apache Airflow
22

3-
## Basic installation of Airflow from sources and development environment setup
3+
Basic installation of Airflow from sources and development environment setup
4+
============================================================================
45

56
This is a generic installation method that requires minimum starndard tools to develop airflow and
67
test it in local virtual environment (using standard CPyhon installation and `pip`).
@@ -23,7 +24,18 @@ MacOS (Mojave/Catalina) you might need to to install XCode command line tools an
2324

2425
brew install sqlite mysql postgresql
2526

26-
## Downloading and installing Airflow from sources
27+
The `pip` is one of the build packaging front-ends that might be used to install Airflow. It's the one
28+
that we recommend (see below) for reproducible installation of specific versions of Airflow.
29+
30+
As of version 2.8 Airflow follows PEP 517/518 and uses `pyproject.toml` file to define build dependencies
31+
and build process and it requires relatively modern versions of packaging tools to get airflow built from
32+
local sources or sdist packages, as PEP 517 compliant build hooks are used to determine dynamic build
33+
dependencies. In case of `pip` it means that at least version 22.1.0 is needed (released at the beginning of
34+
2022) to build or install Airflow from sources. This does not affect the ability of installing Airflow from
35+
released wheel packages.
36+
37+
Downloading and installing Airflow from sources
38+
-----------------------------------------------
2739

2840
While you can get Airflow sources in various ways (including cloning https://github.com/apache/airflow/), the
2941
canonical way to download it is to fetch the tarball published at https://downloads.apache.org where you can
@@ -95,7 +107,8 @@ Airflow project contains some pre-defined virtualenv definitions in ``pyproject.
95107
easily used by hatch to create your local venvs. This is not necessary for you to develop and test
96108
Airflow, but it is a convenient way to manage your local Python versions and virtualenvs.
97109

98-
## Installing Hatch
110+
Installing Hatch
111+
----------------
99112

100113
You can install hat using various other ways (including Gui installers).
101114

@@ -128,19 +141,21 @@ You can see the list of available envs with:
128141

129142
This is what it shows currently:
130143

131-
┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
132-
┃ Name ┃ Type ┃ Features ┃ Description ┃
133-
┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
134-
│ default │ virtual │ devel │ Default environment with Python 3.8 for maximum compatibility │
135-
├─────────────┼─────────┼──────────┼───────────────────────────────────────────────────────────────┤
136-
│ airflow-38 │ virtual │ │ Environment with Python 3.8. No devel installed. │
137-
├─────────────┼─────────┼──────────┼───────────────────────────────────────────────────────────────┤
138-
│ airflow-39 │ virtual │ │ Environment with Python 3.9. No devel installed. │
139-
├─────────────┼─────────┼──────────┼───────────────────────────────────────────────────────────────┤
140-
│ airflow-310 │ virtual │ │ Environment with Python 3.10. No devel installed. │
141-
├─────────────┼─────────┼──────────┼───────────────────────────────────────────────────────────────┤
142-
│ airflow-311 │ virtual │ │ Environment with Python 3.11. No devel installed │
143-
└─────────────┴─────────┴──────────┴───────────────────────────────────────────────────────────────┘
144+
┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
145+
┃ Name ┃ Type ┃ Description ┃
146+
┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
147+
│ default │ virtual │ Default environment with Python 3.8 for maximum compatibility │
148+
├─────────────┼─────────┼───────────────────────────────────────────────────────────────┤
149+
│ airflow-38 │ virtual │ Environment with Python 3.8. No devel installed. │
150+
├─────────────┼─────────┼───────────────────────────────────────────────────────────────┤
151+
│ airflow-39 │ virtual │ Environment with Python 3.9. No devel installed. │
152+
├─────────────┼─────────┼───────────────────────────────────────────────────────────────┤
153+
│ airflow-310 │ virtual │ Environment with Python 3.10. No devel installed. │
154+
├─────────────┼─────────┼───────────────────────────────────────────────────────────────┤
155+
│ airflow-311 │ virtual │ Environment with Python 3.11. No devel installed │
156+
├─────────────┼─────────┼───────────────────────────────────────────────────────────────┤
157+
│ airflow-312 │ virtual │ Environment with Python 3.11. No devel installed │
158+
└─────────────┴─────────┴───────────────────────────────────────────────────────────────┘
144159

145160
The default env (if you have not used one explicitly) is `default` and it is a Python 3.8
146161
virtualenv for maximum compatibility with `devel` extra installed - this devel extra contains the minimum set
@@ -229,7 +244,8 @@ and install to latest supported ones by pure airflow core.
229244
pip install -e ".[devel]" \
230245
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-no-providers-3.8.txt"
231246

232-
## All airflow extras
247+
Airflow extras
248+
==============
233249

234250
Airflow has a number of extras that you can install to get additional dependencies. They sometimes install
235251
providers, sometimes enable other features where packages are not installed by default.
@@ -239,36 +255,69 @@ https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html
239255

240256
The list of available extras is below.
241257

242-
Regular extras that are available for users in the Airflow package.
258+
Core extras
259+
-----------
260+
261+
Those extras are available as regular core airflow extras - they install optional features of Airflow.
262+
263+
# START CORE EXTRAS HERE
264+
265+
aiobotocore, apache-atlas, apache-webhdfs, async, cgroups, deprecated-api, github-enterprise,
266+
google-auth, graphviz, kerberos, ldap, leveldb, otel, pandas, password, pydantic, rabbitmq, s3fs,
267+
saml, sentry, statsd, uv, virtualenv
268+
269+
# END CORE EXTRAS HERE
243270

244-
# START REGULAR EXTRAS HERE
271+
Provider extras
272+
---------------
245273

246-
aiobotocore, airbyte, alibaba, all, all-core, all-dbs, amazon, apache-atlas, apache-beam, apache-
247-
cassandra, apache-drill, apache-druid, apache-flink, apache-hdfs, apache-hive, apache-impala,
248-
apache-kafka, apache-kylin, apache-livy, apache-pig, apache-pinot, apache-spark, apache-webhdfs,
249-
apprise, arangodb, asana, async, atlas, atlassian-jira, aws, azure, cassandra, celery, cgroups,
250-
cloudant, cncf-kubernetes, cohere, common-io, common-sql, crypto, databricks, datadog, dbt-cloud,
251-
deprecated-api, dingding, discord, docker, druid, elasticsearch, exasol, fab, facebook, ftp, gcp,
252-
gcp_api, github, github-enterprise, google, google-auth, graphviz, grpc, hashicorp, hdfs, hive,
253-
http, imap, influxdb, jdbc, jenkins, kerberos, kubernetes, ldap, leveldb, microsoft-azure,
254-
microsoft-mssql, microsoft-psrp, microsoft-winrm, mongo, mssql, mysql, neo4j, odbc, openai,
255-
openfaas, openlineage, opensearch, opsgenie, oracle, otel, pagerduty, pandas, papermill, password,
256-
pgvector, pinecone, pinot, postgres, presto, pydantic, qdrant, rabbitmq, redis, s3, s3fs,
257-
salesforce, samba, saml, segment, sendgrid, sentry, sftp, singularity, slack, smtp, snowflake,
258-
spark, sqlite, ssh, statsd, tableau, tabular, telegram, teradata, trino, uv, vertica, virtualenv,
259-
weaviate, webhdfs, winrm, yandex, zendesk
274+
Those extras are available as regular Airflow extras, they install provider packages in standard builds
275+
or dependencies that are necessary to enable the feature in editable build.
260276

261-
# END REGULAR EXTRAS HERE
277+
# START PROVIDER EXTRAS HERE
262278

263-
Devel extras - used to install development-related tools. Only available during editable install.
279+
airbyte, alibaba, amazon, apache.beam, apache.cassandra, apache.drill, apache.druid, apache.flink,
280+
apache.hdfs, apache.hive, apache.impala, apache.kafka, apache.kylin, apache.livy, apache.pig,
281+
apache.pinot, apache.spark, apprise, arangodb, asana, atlassian.jira, celery, cloudant,
282+
cncf.kubernetes, cohere, common.io, common.sql, databricks, datadog, dbt.cloud, dingding, discord,
283+
docker, elasticsearch, exasol, fab, facebook, ftp, github, google, grpc, hashicorp, http, imap,
284+
influxdb, jdbc, jenkins, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
285+
mysql, neo4j, odbc, openai, openfaas, openlineage, opensearch, opsgenie, oracle, pagerduty,
286+
papermill, pgvector, pinecone, postgres, presto, qdrant, redis, salesforce, samba, segment,
287+
sendgrid, sftp, singularity, slack, smtp, snowflake, sqlite, ssh, tableau, tabular, telegram,
288+
teradata, trino, vertica, weaviate, yandex, zendesk
289+
290+
# END PROVIDER EXTRAS HERE
291+
292+
Devel extras
293+
------------
294+
295+
The `devel` extras are not available in the released packages. They are only available when you install
296+
Airflow from sources in `editable` installation - i.e. one that you are usually using to contribute to
297+
Airflow. They provide tools such as `pytest` and `mypy` for general purpose development and testing.
264298

265299
# START DEVEL EXTRAS HERE
266300

267-
devel, devel-all, devel-all-dbs, devel-ci, devel-debuggers, devel-devscripts, devel-duckdb, devel-
268-
hadoop, devel-mypy, devel-sentry, devel-static-checks, devel-tests
301+
devel, devel-all-dbs, devel-ci, devel-debuggers, devel-devscripts, devel-duckdb, devel-hadoop,
302+
devel-mypy, devel-sentry, devel-static-checks, devel-tests
269303

270304
# END DEVEL EXTRAS HERE
271305

306+
Bundle extras
307+
-------------
308+
309+
Those extras are bundles dynamically generated from other extras.
310+
311+
# START BUNDLE EXTRAS HERE
312+
313+
all, all-core, all-dbs, devel-all, devel-ci
314+
315+
# END BUNDLE EXTRAS HERE
316+
317+
318+
Doc extras
319+
----------
320+
272321
Doc extras - used to install dependencies that are needed to build documentation. Only available during
273322
editable install.
274323

@@ -278,7 +327,20 @@ doc, doc-gen
278327

279328
# END DOC EXTRAS HERE
280329

281-
## Compiling front end assets
330+
Deprecated extras
331+
-----------------
332+
333+
The `deprecated` extras are deprecated extras from Airflow 1 that will be removed in future versions.
334+
335+
# START DEPRECATED EXTRAS HERE
336+
337+
atlas, aws, azure, cassandra, crypto, druid, gcp, gcp-api, hdfs, hive, kubernetes, mssql, pinot, s3,
338+
spark, webhdfs, winrm
339+
340+
# END DEPRECATED EXTRAS HERE
341+
342+
Compiling front end assets
343+
--------------------------
282344

283345
Sometimes you can see that front-end assets are missing and website looks broken. This is because
284346
you need to compile front-end assets. This is done automatically when you create a virtualenv

airflow_pre_installed_providers.txt

+1-1
@@ -1,7 +1,7 @@
11
# List of all the providers that are pre-installed when you run `pip install apache-airflow` without extras
22
common.io
33
common.sql
4-
fab>=1.0.2dev0
4+
fab>=1.0.2dev1
55
ftp
66
http
77
imap

clients/python/pyproject.toml

+1-1
@@ -16,7 +16,7 @@
1616
# under the License.
1717

1818
[build-system]
19-
requires = ["hatchling"]
19+
requires = ["hatchling==1.22.4"]
2020
build-backend = "hatchling.build"
2121

2222
[project]
