👖 Conformal Tights

Conformal Tights is a Python package for Coherent Conformal Prediction^✦ that exports:

🍬 a scikit-learn meta-estimator that adds coherent conformal prediction of quantiles and intervals to any scikit-learn regressor
🔮 a Darts forecaster that adds coherent conformal probabilistic time series forecasting to any scikit-learn regressor

Features

Tip

^✦Coherent Conformal Prediction (CCP): what makes Conformal Tights unique is that it produces so-called coherent conformally calibrated quantile predictions. Without coherence, a model's predicted quantiles may cross each other in practice. For instance, the 25th percentile prediction may be higher than the 75th percentile prediction. With coherence, the predicted quantiles increase monotonically as you would expect.

🚦 Coherent: quantiles increase monotonically instead of crossing each other
🌡️ Conformal: prediction intervals with reliable coverage and accurate quantile predictions
🪜 Dynamic: two-level conformal calibration of both absolute and relative residuals
👖 Tight: selects the lowest dispersion that provides the desired coverage
🎁 Data efficient: requires only a small number of calibration examples to fit
🐼 Pandas support: optionally predict on DataFrames and receive DataFrame output
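
To make the idea of coherence concrete, here is a minimal sketch of what quantile crossing looks like and one simple way to repair it. The numbers are made up, `is_coherent` is a hypothetical helper, and sorting each row is a generic rearrangement trick used purely for illustration. It is not a description of how Conformal Tights enforces coherence internally.

```python
import numpy as np

# Hypothetical raw quantile predictions for 3 examples at levels 0.25, 0.5, 0.75.
# The second row crosses (25th > 50th percentile) and so does the third (50th > 75th).
quantile_levels = np.array([0.25, 0.5, 0.75])
raw = np.array(
    [
        [10.0, 12.0, 15.0],
        [21.0, 20.0, 24.0],
        [30.0, 33.0, 31.0],
    ]
)


def is_coherent(q: np.ndarray) -> bool:
    """Return True if every row is non-decreasing across the quantile levels."""
    return bool(np.all(np.diff(q, axis=1) >= 0))


# Sorting each row is one simple way to restore monotonicity
rearranged = np.sort(raw, axis=1)

print(is_coherent(raw))         # False
print(is_coherent(rearranged))  # True
```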

Using

Installing

pip install conformal-tights

Predicting quantiles

Conformal Tights exports a meta-estimator called ConformalCoherentQuantileRegressor that you can use to equip any scikit-learn regressor with a predict_quantiles method that predicts conformally calibrated quantiles. Example usage:

from conformal_tights import ConformalCoherentQuantileRegressor
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Fetch dataset and split in train and test
X, y = fetch_openml("ames_housing", version=1, return_X_y=True, as_frame=True, parser="auto")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

# Create a regressor, equip it with conformal prediction, and fit on the train set
my_regressor = XGBRegressor(objective="reg:absoluteerror")
conformal_predictor = ConformalCoherentQuantileRegressor(estimator=my_regressor)
conformal_predictor.fit(X_train, y_train)

# Predict with the underlying regressor
ŷ_test = conformal_predictor.predict(X_test)

# Predict quantiles with the conformal predictor
ŷ_test_quantiles = conformal_predictor.predict_quantiles(
    X_test, quantiles=(0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975)
)

When the input data is a pandas DataFrame, the output is also a pandas DataFrame. For example, printing the head of ŷ_test_quantiles yields:

house_id	0.025	0.05	0.1	0.5	0.9	0.95	0.975
1357	114743.7	120917.9	131752.6	156708.2	175907.8	187996.1	205443.4
2367	67382.7	80191.7	86871.8	105807.1	118465.3	127581.2	142419.1
2822	119068.0	131864.8	138541.6	159447.7	179227.2	197337.0	214134.1
2126	93885.8	100040.7	111345.5	134292.7	150557.1	164595.8	182524.1
1544	68959.8	81648.8	88364.1	108298.3	122329.6	132421.1	147225.6
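
Because the quantile columns are coherent, symmetric pairs of columns can be read as nested central intervals (0.025 with 0.975, 0.05 with 0.95, and so on). The sketch below rebuilds the first two rows of the output above by hand and computes the width of each nominal interval. `central_interval_widths` is a hypothetical helper written for this example, not part of the package.

```python
import pandas as pd

# First two rows of the predicted quantiles shown above, keyed by house_id
quantiles_head = pd.DataFrame(
    {
        0.025: [114743.7, 67382.7],
        0.05: [120917.9, 80191.7],
        0.1: [131752.6, 86871.8],
        0.5: [156708.2, 105807.1],
        0.9: [175907.8, 118465.3],
        0.95: [187996.1, 127581.2],
        0.975: [205443.4, 142419.1],
    },
    index=pd.Index([1357, 2367], name="house_id"),
)


def central_interval_widths(df: pd.DataFrame) -> pd.DataFrame:
    """Pair the j-th lowest with the j-th highest quantile level and return the widths."""
    levels = sorted(df.columns)
    widths = {}
    for j in range(len(levels) // 2):  # The median has no partner and is skipped
        lower, upper = levels[j], levels[-(j + 1)]
        nominal_coverage = round(100 * (upper - lower))  # e.g. 0.975 - 0.025 -> 95
        widths[f"{nominal_coverage}%"] = df[upper] - df[lower]
    return pd.DataFrame(widths)


print(central_interval_widths(quantiles_head).round(1))
```

With coherent quantiles, the widths shrink from left to right in every row, since each interval is nested inside the previous one.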

Let's visualize the predicted quantiles on the test set:

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# IPython/Jupyter magic: remove this line when running outside a notebook
%config InlineBackend.figure_format = "retina"
plt.rc("font", family="DejaVu Sans", size=10)
plt.figure(figsize=(8, 4.5))
idx = ŷ_test_quantiles[0.5].sample(50, random_state=42).sort_values().index
x = list(range(1, len(idx) + 1))
x_ticks = [1, *list(range(5, len(idx) + 1, 5))]
for j in range(3):
    coverage = round(100 * (ŷ_test_quantiles.columns[-(j + 1)] - ŷ_test_quantiles.columns[j]))
    plt.bar(
        x,
        ŷ_test_quantiles.loc[idx].iloc[:, -(j + 1)] - ŷ_test_quantiles.loc[idx].iloc[:, j],
        bottom=ŷ_test_quantiles.loc[idx].iloc[:, j],
        color=["#b3d9ff", "#86bfff", "#4da6ff"][j],
        label=f"{coverage}% Prediction interval",
    )
plt.plot(
    x,
    y_test.loc[idx],
    "s",
    label="Actual (test)",
    markeredgecolor="#e74c3c",
    markeredgewidth=1.414,
    markerfacecolor="none",
    markersize=4,
)
plt.plot(x, ŷ_test.loc[idx], "s", color="blue", label="Predicted (test)", markersize=2)
plt.xlabel("House")
plt.xticks(x_ticks, x_ticks)
plt.gca().yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, _: f"${x/1000:,.0f}k"))
plt.gca().tick_params(axis="both", labelsize=10)
plt.gca().spines["top"].set_visible(False)
plt.gca().spines["right"].set_visible(False)
plt.grid(False)
plt.grid(axis="y")
plt.legend(loc="upper left", title="House price", title_fontproperties={"weight": "bold"})
plt.tight_layout()

Predicting intervals

In addition to quantile prediction, you can use predict_interval to predict conformally calibrated prediction intervals. Compared to quantiles, these focus on reliable coverage over quantile accuracy. Example usage:

# Predict an interval for each example with the conformal predictor
ŷ_test_interval = conformal_predictor.predict_interval(X_test, coverage=0.95)

# Measure the coverage of the prediction intervals on the test set
coverage = ((ŷ_test_interval.iloc[:, 0] <= y_test) & (y_test <= ŷ_test_interval.iloc[:, 1])).mean()
print(f"{coverage:.1%}")  # 96.6%

When the input data is a pandas DataFrame, the output is also a pandas DataFrame. For example, printing the head of ŷ_test_interval yields:

house_id	0.025	0.975
1357	107202.8	206290.4
2367	66665.1	146004.8
2822	115591.8	220314.8
2126	85288.1	183037.8
1544	67889.9	150646.2
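
For intuition about what "conformally calibrated" means for an interval, here is a minimal sketch of plain split conformal prediction on synthetic data. It calibrates a single absolute-residual radius on a held-out calibration split. This is a textbook simplification and not the package's method, which according to the feature list calibrates both absolute and relative residuals in two levels. All data and names below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic regression data: y = 3x + noise (made up for illustration)
x = rng.uniform(0, 10, size=3000)
y = 3 * x + rng.normal(0, 2, size=3000)
x_train, y_train = x[:1000], y[:1000]
x_cal, y_cal = x[1000:2000], y[1000:2000]
x_test, y_test = x[2000:], y[2000:]

# Fit a simple point predictor on the training split
slope, intercept = np.polyfit(x_train, y_train, deg=1)


def predict(x_new: np.ndarray) -> np.ndarray:
    return slope * x_new + intercept


# Calibrate: pick the absolute-residual radius at the finite-sample corrected
# quantile level ceil((n + 1) * coverage) / n of the calibration residuals
coverage = 0.95
n = len(y_cal)
level = min(1.0, np.ceil((n + 1) * coverage) / n)
radius = np.quantile(np.abs(y_cal - predict(x_cal)), level, method="higher")

# Predict intervals on the test split and measure their empirical coverage
lower, upper = predict(x_test) - radius, predict(x_test) + radius
empirical_coverage = np.mean((lower <= y_test) & (y_test <= upper))
print(f"{empirical_coverage:.1%}")
```

Note that this sketch gives every example the same interval width, whereas the intervals in the table above vary in width from house to house.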

Forecasting time series

Conformal Tights also exports a Darts forecaster called DartsForecaster that uses a ConformalCoherentQuantileRegressor to make conformally calibrated probabilistic time series forecasts. To demonstrate its usage, let's begin by loading a time series dataset:

from darts.datasets import ElectricityConsumptionZurichDataset

# Load a forecasting dataset
ts = ElectricityConsumptionZurichDataset().load()
ts = ts.resample("h")

# Split the dataset into covariates X and target y
X = ts.drop_columns(["Value_NE5", "Value_NE7"])
y = ts["Value_NE5"]  # NE5 = Household energy consumption

# Add categorical covariates to X
X = X.add_holidays(country_code="CH")
X = X.add_datetime_attribute("month")
X = X.add_datetime_attribute("dayofweek")
X = X.add_datetime_attribute("hour")
X_categoricals = ["holidays", "month", "dayofweek", "hour"]

Printing the tail of the covariates time series X.pd_dataframe() yields:

Timestamp	Hr [%Hr]	RainDur [min]	T [°C]	WD [°]	WVs [m/s]	WVv [m/s]	p [hPa]	month	dayofweek	hour
2022‑08‑30 20h	70.2	0.0	19.9	290.2	1.7	1.5	968.5	7.0	1.0	20.0
2022‑08‑30 21h	70.1	0.0	19.5	239.2	1.0	0.7	968.1	7.0	1.0	21.0
2022‑08‑30 22h	71.3	0.0	19.5	28.9	1.5	1.3	967.9	7.0	1.0	22.0
2022‑08‑30 23h	80.4	0.0	18.9	24.3	1.6	1.1	967.9	7.0	1.0	23.0
2022‑08‑31 00h	81.6	1.0	18.7	293.5	0.9	0.3	967.8	7.0	2.0	0.0

We can now equip a scikit-learn regressor with conformal prediction using ConformalCoherentQuantileRegressor as before, and then equip that conformal predictor with probabilistic time series forecasting using DartsForecaster:

from conformal_tights import DartsForecaster, ConformalCoherentQuantileRegressor
from pandas import Timestamp
from xgboost import XGBRegressor

# Split the dataset into train and test
test_cutoff = Timestamp("2022-06-01")
y_train, y_test = y.split_after(test_cutoff)
X_train, X_test = X.split_after(test_cutoff)

# Now let's:
# 1. Create an sklearn regressor of our choosing, in this case `XGBRegressor`
# 2. Add conformal quantile prediction to the regressor with `ConformalCoherentQuantileRegressor`
# 3. Add probabilistic forecasting to the conformal predictor with `DartsForecaster`
my_regressor = XGBRegressor()
conformal_predictor = ConformalCoherentQuantileRegressor(estimator=my_regressor)
forecaster = DartsForecaster(
    model=conformal_predictor,
    lags=5 * 24,  # Add the last 5 days of the target to the prediction features
    lags_future_covariates=[0],  # Add the current timestamp's covariates to the prediction features
    categorical_future_covariates=X_categoricals,  # Convert these covariates to pd.Categorical
)

# Fit the forecaster
forecaster.fit(y_train, future_covariates=X_train)

# Make a probabilistic forecast 5 days into the future by predicting a set of conformally calibrated
# quantiles at each time step and drawing 500 samples from them
quantiles = (0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.975)
forecast = forecaster.predict(
    n=5 * 24, future_covariates=X_test, num_samples=500, quantiles=quantiles
)
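
To make the `lags` and `lags_future_covariates` arguments above more tangible, here is a rough pandas sketch of the kind of feature table they imply: each row combines the previous values of the target with the covariates at the current timestamp. `make_lag_features` is a hypothetical helper and the series are made up. Darts builds its features internally, so treat this as an illustration of the idea only.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly target and a single covariate (made-up values)
index = pd.date_range("2022-01-01", periods=8, freq=pd.Timedelta(hours=1))
target = pd.Series(np.arange(8, dtype=float), index=index, name="y")
covariate = pd.Series(np.arange(8, dtype=float) * 10, index=index, name="temp")


def make_lag_features(target: pd.Series, covariate: pd.Series, n_lags: int) -> pd.DataFrame:
    """Build one row per timestamp from the last n_lags target values and the current covariate."""
    features = {f"y_lag_{k}": target.shift(k) for k in range(1, n_lags + 1)}
    features[f"{covariate.name}_lag_0"] = covariate  # Analogous to lags_future_covariates=[0]
    # Timestamps without a full lag history are dropped
    return pd.DataFrame(features).dropna()


features = make_lag_features(target, covariate, n_lags=3)
print(features.head())
```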

Printing the head of the forecast quantiles time series forecast.quantiles_df(quantiles=quantiles) yields:

Timestamp	Value_NE5_0.025	Value_NE5_0.05	Value_NE5_0.1	Value_NE5_0.25	Value_NE5_0.5	Value_NE5_0.75	Value_NE5_0.9	Value_NE5_0.95	Value_NE5_0.975
2022‑06‑01 01h	19165.2	19268.3	19435.7	19663.0	19861.7	20062.2	20237.9	20337.7	20453.2
2022‑06‑01 02h	19004.0	19099.0	19226.3	19453.7	19710.7	19966.1	20170.1	20272.8	20366.9
2022‑06‑01 03h	19372.6	19493.0	19679.4	20027.6	20324.6	20546.3	20773.2	20910.3	21014.1
2022‑06‑01 04h	21936.2	22105.6	22436.0	22917.5	23308.6	23604.8	23871.0	24121.7	24351.5
2022‑06‑01 05h	25040.5	25330.5	25531.1	25910.4	26439.4	26903.2	27287.4	27493.9	27633.9
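
The forecast is described above as drawing samples from a set of predicted quantiles. One generic way to do that is inverse transform sampling: treat the quantiles as points on a quantile function, interpolate between them, and evaluate that function at uniform random draws. The sketch below applies this to the first row of the table above. It is an assumption about how such sampling can be done, not a statement of how `DartsForecaster` implements it, and `sample_from_quantiles` is a hypothetical helper.

```python
import numpy as np

# Quantile levels and the forecast quantiles for 2022-06-01 01h from the table above
levels = np.array([0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.975])
values = np.array(
    [19165.2, 19268.3, 19435.7, 19663.0, 19861.7, 20062.2, 20237.9, 20337.7, 20453.2]
)


def sample_from_quantiles(
    levels: np.ndarray, values: np.ndarray, num_samples: int, rng: np.random.Generator
) -> np.ndarray:
    """Inverse transform sampling with a piecewise-linear quantile function.

    Uniform draws outside [levels[0], levels[-1]] are clipped to the outermost
    quantiles by np.interp, so the tails are truncated (a simplification).
    """
    u = rng.uniform(0.0, 1.0, size=num_samples)
    return np.interp(u, levels, values)


rng = np.random.default_rng(0)
samples = sample_from_quantiles(levels, values, num_samples=500, rng=rng)
print(np.quantile(samples, [0.05, 0.5, 0.95]).round(1))
```

This only yields a valid distribution when the quantiles are coherent: if `values` were not non-decreasing, the interpolated curve would no longer be a quantile function.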

Let's visualize the forecast and its prediction interval on the test set:

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# IPython/Jupyter magic: remove this line when running outside a notebook
%config InlineBackend.figure_format = "retina"
plt.rc("font", family="DejaVu Sans", size=10)
plt.figure(figsize=(8, 4.5))
y_train[-2 * 24 :].plot(label="Actual (train)")
y_test[: len(forecast)].plot(label="Actual (test)")
forecast.plot(label="Forecast with\n90% Prediction interval", low_quantile=0.05, high_quantile=0.95)
plt.gca().set_xlabel("")
plt.gca().yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, _: f"{x/1000:,.0f} MWh"))
plt.gca().tick_params(axis="both", labelsize=10)
plt.legend(loc="upper right", title="Energy consumption", title_fontproperties={"weight": "bold"})
plt.tight_layout()

Contributing

Prerequisites

Generate an SSH key and add the SSH key to your GitHub account.

Configure SSH to automatically load your SSH keys:

cat << EOF >> ~/.ssh/config

Host *
  AddKeysToAgent yes
  IgnoreUnknown UseKeychain
  UseKeychain yes
  ForwardAgent yes
EOF

Install Docker Desktop.
Install VS Code and VS Code's Dev Containers extension. Alternatively, install PyCharm.
Optional: install a Nerd Font such as FiraCode Nerd Font and configure VS Code or PyCharm to use it.

Development environments

The following development environments are supported:

⭐️ GitHub Codespaces: click on Open in GitHub Codespaces to start developing in your browser.
⭐️ VS Code Dev Container (with container volume): click on Open in Dev Containers to clone this repository in a container volume and create a Dev Container with VS Code.

⭐️ uv: clone this repository and run the following from the root of the repository:

# Create and install a virtual environment
uv sync --python 3.10 --all-extras

# Activate the virtual environment
source .venv/bin/activate

# Install the pre-commit hooks
pre-commit install --install-hooks

VS Code Dev Container: clone this repository, open it with VS Code, and run Ctrl/⌘ + ⇧ + P → Dev Containers: Reopen in Container.
PyCharm Dev Container: clone this repository, open it with PyCharm, create a Dev Container with Mount Sources, and configure an existing Python interpreter at /opt/venv/bin/python.

Developing

This project follows the Conventional Commits standard to automate Semantic Versioning and Keep A Changelog with Commitizen.
Run poe from within the development environment to print a list of Poe the Poet tasks available to run on this project.
Run uv add {package} from within the development environment to install a run time dependency and add it to pyproject.toml and uv.lock. Add --dev to install a development dependency.
Run uv sync --upgrade from within the development environment to upgrade all dependencies to the latest versions allowed by pyproject.toml. Add --only-dev to upgrade the development dependencies only.
Run cz bump to bump the package's version, update the CHANGELOG.md, and create a git tag. Then push the changes and the git tag with git push origin main --tags.
