Skip to content

v0.0.7

Latest
Compare
Choose a tag to compare
@github-actions github-actions released this 15 Jan 04:45
· 8 commits to main since this release

tabben v0.0.7

New Features

  • added the CIFAR10 dataset as a standardized dataset (i.e. can be accessed directly using OpenTabularDataset)
  • added a DatasetCollection class for bulk processing of datasets
  • multiple splits can be passed into the OpenTabularDataset constructor, e.g. OpenTabularDataset(dir, name, split=['train', 'valid'])

Breaking Changes

  • removed the TabularCIFAR10Dataset as it is no longer needed
  • categorical columns for all datasets start counting at 0
  • these OpenTabularDataset constructor parameters are now keyword only: download, transform, target_transform

Non-Breaking Changes

  • many datasets have additional "extras" (every current dataset has extras) available, such as
    • training data profiles (and full-data profiles that should not be used for model selection)
    • bibtex, licenses when available
  • as a result of the many changes to several datasets, the version for all datasets has been incremented
    • version numbers are now 3 element integer arrays

Bugfixes

Non-Code Updates

  • the documentation website is now generated with sphinx, and has API reference info as well
  • dataset pages for amazon and rossman datasets added to docs website
  • there is now also a Julia package Tabben.jl for loading datasets and evaluating models (not at feature parity yet)