Commit

feat: Fingerprint datapoints python package (#353)
Add a simple Python package that contains the fingerprint data files and
helpers that return their paths, plus a release workflow that publishes it to PyPI.

---------

Co-authored-by: Jan Buchar <[email protected]>
Pijukatel and janbuchar authored Mar 10, 2025
1 parent d68c889 commit 344a7d0
Showing 5 changed files with 115 additions and 0 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/release_python.yaml
@@ -0,0 +1,26 @@
name: Create a release
on: workflow_dispatch

jobs:
  publish_to_pypi:
    name: Publish to PyPI
    runs-on: ubuntu-latest
    permissions:
      contents: write
      id-token: write # Required for OIDC authentication.
    environment:
      name: pypi
      url: https://pypi.org/project/apify_fingerprint_datapoints
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.13'
      - uses: astral-sh/setup-uv@v5
      - name: Build project
        shell: bash
        run: uv build

      # Publishes the package to PyPI using PyPA official GitHub action with OIDC authentication.
      - name: Publish package to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
7 changes: 7 additions & 0 deletions apify_fingerprint_datapoints/CHANGELOG.md
@@ -0,0 +1,7 @@
# Changelog

All notable changes to this project will be documented in this file.

## 0.0.1 - (2025-03-05)

- Initial version.
7 changes: 7 additions & 0 deletions apify_fingerprint_datapoints/README.md
@@ -0,0 +1,7 @@
Fingerprint datapoint files collected by Apify and originally stored at https://github.com/apify/fingerprint-suite.
This package contains the data files and helper functions that return the paths to them.


## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](https://github.com/apify/fingerprint-suite/blob/master/fingerprint_datapoints/LICENSE) file for details.
26 changes: 26 additions & 0 deletions apify_fingerprint_datapoints/__init__.py
@@ -0,0 +1,26 @@
from pathlib import Path


def get_browser_helper_file() -> Path:
    """Get path of `browser-helper-file.json`."""
    return Path(__file__).parent / "data" / "browser-helper-file.json"


def get_header_network() -> Path:
    """Get path of `header-network-definition.zip`."""
    return Path(__file__).parent / "data" / "header-network-definition.zip"


def get_headers_order() -> Path:
    """Get path of `headers-order.json`."""
    return Path(__file__).parent / "data" / "headers-order.json"


def get_input_network() -> Path:
    """Get path of `input-network-definition.zip`."""
    return Path(__file__).parent / "data" / "input-network-definition.zip"


def get_fingerprint_network() -> Path:
    """Get path of `fingerprint-network-definition.zip`."""
    return Path(__file__).parent / "data" / "fingerprint-network-definition.zip"
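
The helpers above return only `Path` objects, so reading the data is left to the caller. The following is a minimal sketch of how a consumer might open the files. `load_json_datafile`, `list_zip_members` and `show_packaged_data` are hypothetical names that are not part of this commit, and the sketch makes no assumption about the internal structure of the JSON or zip contents.

```python
import json
import zipfile
from pathlib import Path
from typing import Any


def load_json_datafile(path: Path) -> Any:
    """Parse a JSON data file such as `headers-order.json`."""
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def list_zip_members(path: Path) -> list[str]:
    """Return the names of the files stored in a `*-definition.zip` archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def show_packaged_data() -> None:
    """Print a short summary of the packaged files (needs the package installed)."""
    # Imported lazily so the two helpers above stay usable without the package.
    from apify_fingerprint_datapoints import get_header_network, get_headers_order

    headers_order = load_json_datafile(get_headers_order())
    print(type(headers_order).__name__)
    print(list_zip_members(get_header_network()))
```

Calling `show_packaged_data()` only works in an environment where `apify_fingerprint_datapoints` is installed; the other two functions accept any path.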
49 changes: 49 additions & 0 deletions pyproject.toml
@@ -0,0 +1,49 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "apify_fingerprint_datapoints"
version = "0.0.1"
description = "Browser fingerprint datapoints collected by Apify"
authors = [{ name = "Apify Technologies s.r.o.", email = "[email protected]" }]
license = { file = "LICENSE.md" }
readme = "apify_fingerprint_datapoints/README.md"
requires-python = ">=3.9"
classifiers = [
    "Development Status :: 5 - Production/Stable",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: Apache Software License",
    "Operating System :: OS Independent",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Topic :: Software Development :: Libraries",
]
keywords = [
    "apify",
    "chrome",
    "crawlee",
    "crawler",
    "scraper",
    "scraping",
    "fingerprints",
]

[project.urls]
"Homepage" = "https://docs.apify.com/academy/anti-scraping/techniques/fingerprinting"
"Apify homepage" = "https://apify.com"
"Changelog" = "https://github.com/apify/fingerprint-suite/blob/master/fingerprint_datapoints/CHANGELOG.md"
"Documentation" = "https://docs.apify.com/academy/anti-scraping/techniques/fingerprinting"
"Issue tracker" = "https://github.com/apify/fingerprint-suite/issues"
"Repository" = "https://github.com/apify/fingerprint-suite"

[tool.hatch.build.targets.wheel.force-include]
"./packages/fingerprint-generator/src/data_files/fingerprint-network-definition.zip" = "apify_fingerprint_datapoints/data/fingerprint-network-definition.zip"
"./packages/header-generator/src/data_files/browser-helper-file.json" = "apify_fingerprint_datapoints/data/browser-helper-file.json"
"./packages/header-generator/src/data_files/header-network-definition.zip" = "apify_fingerprint_datapoints/data/header-network-definition.zip"
"./packages/header-generator/src/data_files/headers-order.json" = "apify_fingerprint_datapoints/data/headers-order.json"
"./packages/header-generator/src/data_files/input-network-definition.zip" = "apify_fingerprint_datapoints/data/input-network-definition.zip"
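
The `force-include` table maps files from the JavaScript packages to destination paths inside the wheel. Because a wheel is a zip archive, one way to check the result of `uv build` is to list the archive and compare it with those destinations. This is only a sketch: `missing_data_files` is a hypothetical helper that is not part of this commit, and the wheel path passed to it is whatever file the build produces.

```python
import zipfile
from pathlib import Path

# Destination paths copied from the force-include table in pyproject.toml.
EXPECTED_DATA_FILES = {
    "apify_fingerprint_datapoints/data/fingerprint-network-definition.zip",
    "apify_fingerprint_datapoints/data/browser-helper-file.json",
    "apify_fingerprint_datapoints/data/header-network-definition.zip",
    "apify_fingerprint_datapoints/data/headers-order.json",
    "apify_fingerprint_datapoints/data/input-network-definition.zip",
}


def missing_data_files(wheel_path: Path) -> set[str]:
    """Return the expected data files that are absent from the given wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return EXPECTED_DATA_FILES - set(wheel.namelist())
```

An empty result means every data file that the helper functions point to was packaged.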
