Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor internals #110

Open
wants to merge 88 commits into
base: main
Choose a base branch
from
Open

Refactor internals #110

wants to merge 88 commits into from

Conversation

cbourjau
Copy link
Collaborator

@cbourjau cbourjau commented Mar 10, 2025

This PR is a large refactor of the internal workings of ndonnx. The newly introduced concepts are described in an explicit file.

Breaking changes are limited to functions not found in the array-api standard and can likely be reduced further.

Closes #75.
Closes #74.

Copy link
Member

@adityagoel4512 adityagoel4512 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 🚀. We've agreed the general design already a few times so I go into various other directions here. I'm mainly concerned about the conflation of some subtle breaking changes (and I'm sure I cannot catch every single one given the size of this PR), obvious user API breakages, and internal refactoring. Many (though not all) of the breaking changes feel uncorrelated with "Refactor internals"; I'm hoping to minimally get an exhaustive enumeration for a change log and as a reviewing aide for me.

# Copyright (c) QuantCo 2023-2024
# SPDX-License-Identifier: BSD-3-Clause

import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intention behind these tests? If it is to maintain precise understanding of upstream dependencies (Spox, onnx shape inference or ort) then it seems quite dubious to run this as part of the ndonnx test suite. Changes to ndonnx have no bearing on these tests so this will always do the same thing until an external change.

A better strategy would be to have this run in CI weekly or on a pixi update that change Spox or ONNX or onnxruntime. We can then immediately exploit any fixes/improvements or mitigate anything that is broken.

Comment on lines +87 to +88
# Only set to `True` temporarily and only if there was a
# deliberate update to the schema.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need schema updating code (and if we did, it would be an independent script and not in the test suite). Updating the schema breaks backwards compatibility. Any evolutions should go in a "v2" schema so as to not break users' input/output wrangling code. Do you have a use case in mind for editing existing schemas?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I backported this code to #111 . You are right that the schemas should not be updated. It might be that we add more schemas, though, or that one may want to regenerate the files for some other reasons. If nothing else it is good to see how the files were generated.

@@ -5,7 +5,7 @@
[![conda-forge](https://img.shields.io/conda/vn/conda-forge/ndonnx?style=flat-square&logoColor=white&logo=conda-forge)](https://anaconda.org/conda-forge/ndonnx)
[![pypi](https://img.shields.io/pypi/v/ndonnx.svg?logo=pypi&logoColor=white)](https://pypi.org/project/ndonnx)

An ONNX-backed array library that is compliant with the [Array API](https://data-apis.org/array-api/) standard.
An ONNX-backed implementation of the [Array API](https://data-apis.org/array-api/) standard.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree the rewrite of the README. Implementing the Array API is a very nice feature worth mentioning, but is not the entire point of the library, nor its biggest utility at the present moment! If that was the only goal, the library wouldn't have an extension module, datetime/timedelta/nullable/user-defined data types, Spox interoperability and probably other features.

In any case, it might be easier if you move the README change proposal to a new pull request as the edits are mostly independent of the refactoring exercise (except the Array).

#
# Union types are exhaustive and don't create ambiguities with respect to user-defined subtypes.
# TODO: Rename
NCoreIntegerDTypes = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NPrimitiveIntegerDTypes?

)
self._data = data

def disassemble(self) -> dict[str, Var]:
Copy link
Member

@adityagoel4512 adityagoel4512 Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optional mask creates a subtle but important breaking change: inference handling code that previously assumed that there is a one-two-one, static relationship between data type and "fields" see this invariant violated (I believe). This means you can read the metadata from an ONNX model but that does NOT tell you everything about the way to split up your inputs for the ONNX backend being used. i.e. inspecting the metadata, I have no idea if the "null" field is present or not. I also have to check all the inputs to the model.

Do I have this correct?

I think there are two aspects to address before merging this:

  • This is a very subtle but undocumented breaking change. Is it truly so necessary that it gets incorporated along with a substantial refactoring of internal behaviour? I understand that there will be perf wins.
  • If it is urgent, can we continue to have this very nice property that the dtype schema/metadata is sufficient to know exactly how to pass things to your ONNX model?

[f"{k}: {v}" for k, v in self._tyarray.__ndx_value_repr__().items()]
)
shape = self._tyarray.shape
return f"array({value_repr}, shape={shape}, dtype={self.dtype})"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new print formatting gets a little long for constant arrays. Can we remove the shape for these?

return x.endpoints.shape[:-1]


class List(StructType):
Copy link
Member

@adityagoel4512 adityagoel4512 Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was very valuable for testing the limits of the dtype system. I'm sure it's implementable again - can you please bring back the List test and/or have some custom dtype tests that illustrate how dtypes get defined from outside of ndonnx. Right now, everything (categoricals, object dtype) has been inlined into the library which won't be the long term way to support niche dtypes without a clear numpy counterpart.

(it might also be a nice one for the docs in the user-defined dtypes example that's currently got a ...!)

@adityagoel4512
Copy link
Member

adityagoel4512 commented Mar 12, 2025

Breaking changes are limited to functions not found in the array-api standard and can likely be reduced further.

I also think enumerating the breaking changes (some of which I've commented on already) will help with this.

@@ -1,52 +1,27 @@
Experimental
============
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for already taking care of parts of the docs :). The Spox Integration page looks unedited.

@adityagoel4512
Copy link
Member

adityagoel4512 commented Mar 12, 2025

I also think we should indeed ship datetime64, timedelta64 but not the categorical or "object" dtype. It would be good to remove the pandas dependency added to the test environment and leave that to downstream libraries.

cbourjau added a commit that referenced this pull request Mar 13, 2025
This test is back-ported from #110 since it is important that that PR
does not break the current schemas.
cbourjau added a commit that referenced this pull request Mar 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Naming of Array.spox_var() and ndonnx.from_spox_var Handling compatibility with ONNX Runtime
2 participants