[Meta] GA for searchable snapshots reading legacy versions #5948

Open
1 of 3 tasks
kartg opened this issue Jan 20, 2023 · 2 comments
Labels
enhancement Enhancement or improvement to existing feature or request Roadmap:Cost/Performance/Scale Project-wide roadmap label Search:Searchable Snapshots

kartg commented Jan 20, 2023

Is your feature request related to a problem? Please describe.
This meta-issue tracks follow-up items to #5451.

Support for searchable snapshots reading snapshots written by legacy versions has been merged to main (#5812) and 2.x (#5429) behind a feature flag. This is useful for users who store data for long periods (and therefore across major version upgrades) for log analysis or audit use cases. For an example, see #5443.

What is a legacy version? (adapted from here)

IMO a "legacy" version is any version older than OpenSearch's standard backwards compatibility guarantee (one prior major version). So:

  • For 3.x: OpenSearch 1.x (or ElasticSearch 7.10.2) and lower
  • For 2.x: ElasticSearch 6.x and lower
  • For 1.x: ElasticSearch 5.x and lower

I think these should also have a lower bound, but I don't have a value in mind at this time. The current implementation only goes as far back as ElasticSearch 6.0.
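
The rule above can be sketched in code. This is only an illustration of the definition in this issue, not how OpenSearch implements it: the class name, the enum, and the "lineage ordinal" mapping (ElasticSearch major N maps to N, OpenSearch major N maps to N + 6, so OpenSearch 1.x and ElasticSearch 7.x share a slot) are all assumptions made up for this example. It also works at major-version granularity only, so it does not distinguish ElasticSearch 7.10.2 from later 7.x releases.

```java
// Hypothetical sketch of the "legacy version" rule described in this issue.
// None of these names exist in OpenSearch; they are for illustration only.
final class LegacyVersionSketch {

    enum Distribution { ELASTICSEARCH, OPENSEARCH }

    // Assumption from this issue: the current implementation reads back to ElasticSearch 6.0.
    static final int OLDEST_READABLE_ORDINAL = 6;

    // Places both product lines on one axis: ES 5 -> 5, ES 6 -> 6, ES 7 -> 7,
    // OpenSearch 1 -> 7 (forked from ES 7.10.2), OpenSearch 2 -> 8, OpenSearch 3 -> 9.
    static int lineageOrdinal(Distribution distribution, int major) {
        return distribution == Distribution.OPENSEARCH ? major + 6 : major;
    }

    // A version is "legacy" if it is older than the one prior major version
    // covered by the standard backwards compatibility guarantee.
    static boolean isLegacy(int currentOpenSearchMajor, Distribution distribution, int major) {
        int current = lineageOrdinal(Distribution.OPENSEARCH, currentOpenSearchMajor);
        return lineageOrdinal(distribution, major) < current - 1;
    }

    // Legacy and still within the lower bound of the current implementation.
    static boolean isReadableLegacy(int currentOpenSearchMajor, Distribution distribution, int major) {
        return isLegacy(currentOpenSearchMajor, distribution, major)
            && lineageOrdinal(distribution, major) >= OLDEST_READABLE_ORDINAL;
    }

    public static void main(String[] args) {
        System.out.println(isLegacy(3, Distribution.OPENSEARCH, 1));            // 3.x reading OpenSearch 1.x
        System.out.println(isLegacy(2, Distribution.ELASTICSEARCH, 6));         // 2.x reading ElasticSearch 6.x
        System.out.println(isReadableLegacy(2, Distribution.ELASTICSEARCH, 5)); // below the 6.0 lower bound
    }
}
```

If a different lower bound is chosen later, only `OLDEST_READABLE_ORDINAL` would need to change in this sketch.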


Follow-up items

  • Implement a testing strategy for searchable snapshots reading legacy versions.

Currently, no integration tests exist for the feature. Individual classes are unit-tested using a compressed ElasticSearch 6.0 index directory (README). Ideally, we should be able to produce tests that provide confidence beyond simply "best-effort" (as is currently the case with the Lucene 9 implementation).

  • Remove the workarounds in Lucene access

Currently, the "expert" readLatestCommit API is package-private in Lucene, which prevents OpenSearch from accessing it directly. Thus, the code path uses a workaround. We should seek to remove this workaround and use the expert API directly, by submitting upstream changes to Lucene if need be.

  • Expand support to ElasticSearch 5.x

The current implementation only goes as far back as ElasticSearch 6.0. Users would benefit from expanding this feature to support ElasticSearch 5.x snapshots as well.


kartg commented Apr 7, 2023

> Expand support to ElasticSearch 5.x
>
> The current implementation only goes as far back as ElasticSearch 6.0. Users would benefit from expanding this feature to support ElasticSearch 5.x snapshots as well.

This may not be viable because Lucene does not support reading indices older than 7.0 due to the Segment Infos format.


kartg commented Apr 11, 2023

> Remove the workarounds in Lucene access

This is being tracked in #7084.
