Add new abstract methods for reporting source provenance #1997
This is an initial implementation for the proposal up at: https://lists.apache.org/thread/q6gxjpld2vb1c9rqlsv24m12c087snc4
Some thoughts about this approach:
This prioritizes machine readability and standardization of source provenance and version information
Sources have a lot of freedom in how they implement things, and so we may very well need to expand on the types and constants added here, such as `SourceInfoMedium`, `SourceVersionType`, etc.
The idea here is to have greater certainty about how sources are obtained, even if this cannot be covered by all currently existing source implementations (e.g. I didn't initially add a `bzr` medium for which we have a plugin, or a `cvs` medium for which we do not yet have a plugin).
Aspirationally, forcing this data to be precise can allow adjacent tooling to do useful things.
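To make the "machine readable, constant shape" idea concrete, here is a minimal sketch. Only the names `SourceInfoMedium`, `SourceVersionType`, `SourceInfo`, `Source` and `collect_source_info()` come from this description. The enum members, the fields of the record, the `ExampleGitSource` class and the `to_record()` helper are assumptions made up for illustration, and do not reflect the actual code in this branch.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class SourceInfoMedium(Enum):
    # Hypothetical members: the real set of media may differ.
    GIT = "git"
    REMOTE_FILE = "remote-file"
    LOCAL = "local"


class SourceVersionType(Enum):
    # Hypothetical members: the real set of version types may differ.
    COMMIT = "commit"
    SHA256 = "sha256"


@dataclass(frozen=True)
class SourceInfo:
    # Assumed fields: every record has the same shape, with no freeform data.
    url: str
    medium: SourceInfoMedium
    version_type: SourceVersionType
    version: str


class Source(ABC):
    @abstractmethod
    def collect_source_info(self) -> List[SourceInfo]:
        """Report the provenance of this source as a list of SourceInfo."""


class ExampleGitSource(Source):
    # Hypothetical plugin, used only to exercise the sketch.
    def __init__(self, url: str, commit: str) -> None:
        self.url = url
        self.commit = commit

    def collect_source_info(self) -> List[SourceInfo]:
        return [
            SourceInfo(
                url=self.url,
                medium=SourceInfoMedium.GIT,
                version_type=SourceVersionType.COMMIT,
                version=self.commit,
            )
        ]


def to_record(info: SourceInfo) -> Dict[str, str]:
    # Hypothetical serialization that adjacent tooling could consume.
    return {
        "url": info.url,
        "medium": info.medium.value,
        "version-type": info.version_type.value,
        "version": info.version,
    }
```

Because the medium and version type are restricted to enumerated values, a consumer can branch on them without parsing free text. The cost is that a new kind of source may require a new constant.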
This drops the freeform "public data" mentioned in the proposal discussion
My rationale for this choice in this branch is that ultimately we want data with a constant shape. If, for example, we want the user to be able to override or assist a source in determining the reported "version", then the `Source` implementation already has everything it needs to do so: it can implement `collect_source_info()` however it wants.
This does not yet attempt to cover the concept of tracking information
I would like to consider this, but we should think carefully about how this can be useful. For instance, some git plugins have different interpretations of what their "tracking" strings mean, sometimes following a branch head, sometimes looking for the latest tag in history which matches a given regular expression.
If we export this tracking information, it should probably be enough for external tooling to figure out how to do the tracking and come to the same conclusion; otherwise it is unclear what purpose it serves.
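The ambiguity described above can be illustrated with a toy model. This is only a sketch: the `history` list and both functions are hypothetical and do not correspond to any existing plugin. It shows two interpretations of "tracking" resolving to different commits for the same repository.

```python
import re
from typing import List, Optional, Tuple

# Toy history, oldest first: (commit id, tags pointing at that commit).
History = List[Tuple[str, List[str]]]


def track_branch_head(history: History) -> str:
    # Interpretation 1: tracking follows the newest commit on the branch.
    return history[-1][0]


def track_latest_matching_tag(history: History, pattern: str) -> Optional[str]:
    # Interpretation 2: tracking finds the newest commit carrying a tag
    # that fully matches the given regular expression.
    regex = re.compile(pattern)
    for commit, tags in reversed(history):
        if any(regex.fullmatch(tag) for tag in tags):
            return commit
    return None


history: History = [
    ("a1", ["v1.0"]),
    ("b2", []),
    ("c3", ["v1.1"]),
    ("d4", ["nightly"]),
]

head = track_branch_head(history)
tagged = track_latest_matching_tag(history, r"v\d+\.\d+")
```

Here `head` is `"d4"` while `tagged` is `"c3"`. An exported tracking string would therefore also need to say which interpretation applies before external tooling could reproduce the result.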
This does not cover the CVE information
While the `SourceInfo` objects representing a source's provenance are a list, I believe that the CVE information continues to be a per-element concept.
For example, when we have applied security patches to a module, those security patches are, themselves, sources, with provenance of being revisioned in the local project