Update parse options to reflect modern libxml usage #3439

Open
flavorjones opened this issue Feb 19, 2025 · 2 comments

Comments

@flavorjones
Member

Context

In #3360 the libxml2 maintainer left some suggestions about how we're exposing and documenting some of the parse options.

He mentioned:

  • DTDATTR and DTDVALID imply DTDLOAD and are unsafe as well.
  • SAX1 should probably not be exposed.
  • NODICT should probably not be exposed.
  • XINCLUDE, NOXINCNODE and NOBASEFIX are only used by the XML Reader and XInclude API.
  • HUGE is safe these days (since 2.10)

and some forward-looking statements about the upcoming 2.14 release:

  • UNZIP: Enable decompression. This option has no real effect for now. The plan is that users who really need decompression start to add the option. At a later point, it will be required to enable decompression.
  • NO_SYS_CATALOG: Don't use system catalogs when resolving DTDs or entities.
  • CATALOG_PI: Enable oasis-xml-catalog PIs. This is a really obscure feature that should have never been enabled by default. I don't think your users need it.

Actions

The documentation changes I'd like to make:

  • Make the following bits :nodoc:: SAX1, NODICT
  • Update documentation for DTDATTR and DTDVALID to state that they imply DTDLOAD, and include safety warnings
    • And double-check that these are all off by default
  • Update documentation for the XINCLUDE set to specify they're only used by Reader and Node#process_xincludes
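
For the "off by default" double-check, something like the following could work as a rough sketch. It is plain Ruby and does not load Nokogiri: the flag values mirror libxml2's `xmlParserOption` enum, while `dtd_flags_in` and `illustrative_default` are hypothetical names made up for illustration, not Nokogiri's actual defaults.

```ruby
# Flag values mirror libxml2's xmlParserOption enum.
DTD_FLAGS = {
  DTDLOAD:  1 << 2,
  DTDATTR:  1 << 3,
  DTDVALID: 1 << 4,
}.freeze

# Hypothetical helper: returns the names of the DTD-related flags set in +bitset+.
def dtd_flags_in(bitset)
  DTD_FLAGS.select { |_name, bit| (bitset & bit) != 0 }.keys
end

# RECOVER | NONET, chosen for illustration only (not a real default bitset).
illustrative_default = (1 << 0) | (1 << 11)

p dtd_flags_in(illustrative_default)                        # => []
p dtd_flags_in(illustrative_default | DTD_FLAGS[:DTDATTR])  # => [:DTDATTR]
```

A real check would run the same test against each of the actual default bitsets rather than the illustrative one.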

And the functional action I'd like to take:

  • Add HUGE to all the default bitsets if the libxml2 version is >= 2.10.0

I'd like to wait until the UNZIP bit is useful before adding it. We don't expose the catalog bits, so nothing to do there.
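
A minimal sketch of the version-gated HUGE change, assuming the loaded libxml2 version is available as a string. It is plain Ruby for illustration: the flag values mirror libxml2's `xmlParserOption` enum, while `with_huge_if_safe` and the base bitset are hypothetical and not taken from the Nokogiri source.

```ruby
require "rubygems" # for Gem::Version; loaded by default in most Ruby setups

# Flag values mirror libxml2's xmlParserOption enum.
RECOVER   = 1 << 0
NONET     = 1 << 11
HUGE      = 1 << 19
BIG_LINES = 1 << 22

# Per the maintainer's note, HUGE is considered safe since libxml2 2.10.
HUGE_SAFE_SINCE = Gem::Version.new("2.10.0")

# Hypothetical helper: adds HUGE to +bitset+ only when the given
# libxml2 version string is new enough.
def with_huge_if_safe(bitset, libxml_version)
  if Gem::Version.new(libxml_version) >= HUGE_SAFE_SINCE
    bitset | HUGE
  else
    bitset
  end
end

# Illustrative base bitset, not necessarily any real default.
base = RECOVER | NONET | BIG_LINES

puts (with_huge_if_safe(base, "2.9.14") & HUGE) != 0  # prints false
puts (with_huge_if_safe(base, "2.13.5") & HUGE) != 0  # prints true
```

Comparing with `Gem::Version` rather than raw strings matters here, because a plain string comparison would sort "2.9.14" after "2.10.0".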

@BurdetteLamar
Contributor

Assign to me, if you like. I can make the first pass at these changes.

@flavorjones
Member Author

@BurdetteLamar Assigned you, thanks! If you want to skip the HUGE bit, that would be fine. I may want to do some additional testing, and I can spin it out into a separate issue.
