Ingesters should handle TSDB corruption on startup #2815

Open
aknuds1 opened this issue Aug 23, 2022 · 0 comments

Is your feature request related to a problem? Please describe.

An ingester sometimes fails to start because it detects TSDB corruption. The ingester then typically crash-loops until an engineer steps in and cleans up the corrupted data.

The corruption might, for example, be caused by a physical disk failure in the data center.

An example log message from when this happens:

unable to open TSDB for user <redacted>: failed to open TSDB: /data/tsdb/<redacted>: /data/tsdb/<redacted>/chunks_head/<redacted>: invalid magic number 0 

This can also appear together with corrupted shipper files. Example log message:

failed to parse /data/tsdb/<redacted>/thanos.shipper.json as JSON: "": unexpected end of JSON input

Describe the solution you'd like

I would like ingesters to clean up corrupted TSDB files automatically, so that engineers don't have to step in and clean up manually.

According to @pracucci, dealing with TSDB corruption could be tricky, but corrupted thanos.shipper.json files should be easy to handle.
