Releases: DS4SD/docling

v2.25.2

05 Mar 14:51

Fix

  • Proper handling of orphan IDs in layout postprocessing (#1118) (c56ab3a)

Documentation

v2.25.1

03 Mar 00:56

Fix

  • Enable locks for threadsafe pdfium (#1052) (8dc0562)
  • html: Use 'start' attribute when parsing ordered lists from HTML docs (#1062) (de7b963)
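
To illustrate the 'start' fix above, here is a minimal sketch of numbering ordered-list items from their `start` attribute. It uses Python's standard `html.parser` and is not docling's HTML backend; the names `OrderedListParser` and `parse_ordered_items` are hypothetical, and the sketch only handles text directly inside each `<li>`.

```python
from html.parser import HTMLParser


class OrderedListParser(HTMLParser):
    """Collect (number, text) pairs from <ol> lists, honoring 'start'."""

    def __init__(self):
        super().__init__()
        self.items = []      # collected [number, text] entries
        self._counters = []  # one running counter per open <ol>
        self._in_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "ol":
            start = dict(attrs).get("start")
            # Fall back to 1 when 'start' is missing or not an integer.
            try:
                self._counters.append(int(start))
            except (TypeError, ValueError):
                self._counters.append(1)
        elif tag == "li" and self._counters:
            self._in_li = True
            self.items.append([self._counters[-1], ""])
            self._counters[-1] += 1

    def handle_endtag(self, tag):
        if tag == "ol" and self._counters:
            self._counters.pop()
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        # Attach text to the most recently opened list item.
        if self._in_li and self.items:
            self.items[-1][1] += data.strip()


def parse_ordered_items(html):
    """Return a list of (number, text) tuples for all <ol> items."""
    parser = OrderedListParser()
    parser.feed(html)
    parser.close()
    return [(number, text) for number, text in parser.items]
```

With this sketch, a list declared as `<ol start="5">` numbers its first item 5 rather than 1, and a missing or malformed `start` falls back to 1.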

Documentation

  • Improve docs on token limit warning triggered by HybridChunker (#1077) (db3ceef)

v2.25.0

26 Feb 14:16

Feature

  • [Experimental] Introduce VLM pipeline using HF AutoModelForVision2Seq, featuring SmolDocling model (#1054) (3c9fe76)
  • cli: Add option for downloading all models, refine help messages (#1061) (ab683e4)

Fix

Documentation

  • Extend chunking docs, add FAQ on token limit (#1053) (c84b973)

v2.24.0

20 Feb 18:31

Feature

v2.23.1

20 Feb 16:26

Fix

  • Runtime error when Pandas Series is not always of string type (#1024) (6796f0a)

Documentation

v2.23.0

17 Feb 14:22

Feature

Fix

  • Revise DocTags, fix iterate_items to output content_layer in items (#965) (6e75f0b)

v2.22.0

14 Feb 08:53

Feature

  • Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945) (00d9405)
  • Introduce the enable_remote_services option to allow remote connections while processing (#941) (2716c7d)
  • Allow artifacts_path to be defined as ENV (#940) (5101e25)
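
As a rough illustration of the artifacts_path item above, the following sketch resolves a model directory from either an explicit argument or an environment variable. It is not docling's implementation: the helper `resolve_artifacts_path` is hypothetical, the variable name `DOCLING_ARTIFACTS_PATH` is an assumption to be checked against the docling documentation, and the precedence (explicit argument over environment) is also assumed.

```python
import os
from pathlib import Path

# Assumed variable name; confirm against the docling documentation.
ENV_VAR = "DOCLING_ARTIFACTS_PATH"


def resolve_artifacts_path(explicit=None, environ=None):
    """Return the artifacts directory as a Path, or None if unset.

    An explicit argument wins over the environment variable
    (assumed precedence for this sketch).
    """
    environ = os.environ if environ is None else environ
    value = explicit if explicit is not None else environ.get(ENV_VAR)
    return Path(value).expanduser() if value else None
```

Passing `environ` as a plain dict keeps the helper easy to test without touching the real process environment.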

Fix

Documentation

  • Update example Dockerfile with download CLI (#929) (7493d5b)
  • Examples for picture descriptions (#951) (2d66e99)

v2.21.0

10 Feb 11:43

Feature

  • Add content_layer property to items to address body, furniture and other roles (#735) (cf78d5b)

v2.20.0

07 Feb 17:46

Feature

  • Describe pictures using vision models (#259) (4cc6e3e)

Fix

v2.19.0

07 Feb 13:36

Feature

Fix

  • markdown: Handle nested lists (#910) (90b766e)
  • Test cases for RTL programmatic PDFs and fixes for the formula model (#903) (9114ada)
  • msword_backend: Handle conversion error in label parsing (#896) (722a6eb)
  • Enrichment models batch size and expose picture classifier (#878) (5ad6de0)

Documentation

  • Introduce example with custom models for RapidOCR (#874) (6d3fea0)