Duplicate versions for same chart #450
Comments
Perhaps this is a symptom of
|
To be very clear, we are eventually seeing hundreds or thousands of duplicate versions after Harbor replications. Is there a more robust approach needed to ensure that duplicate versions do not exist in the index? |
I think we are running into the same issue. We have a simple workflow that creates test charts for each PR on our test environment. For a short while now (after an upgrade), whenever the PR builder resyncs the chart, it creates a new version. |
Hi, thanks for reporting. Do you bootstrap with overwrites, and do you run in a mostly concurrent environment? |
@scbizu we do it like this: |
we use a curl POST and Harbor replication with overwrite |
I think I know why. The bugfix will be scheduled for the next release. |
@hobti01 sorry to disturb you again, but can you tell me how many duplicate charts you have? According to the original design, as you pasted:
your duplicate charts should not increase, since the latest stored version must be equal to the new chart you upload. So if your chart version is the same as the latest stored chart version, that stored version is updated to the chart you uploaded. However, as #220 said, the chart version equality check had some bugs before v0.13.1. The old duplicate charts will still exist even after we fix this issue, so you will need to delete them manually after the new version is released. Thanks for your patience : ) |
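To make the intended overwrite behaviour concrete, here is a minimal Python sketch. It is not ChartMuseum's actual Go code: the index is modelled as a plain dict from chart name to a list of version entries, and the name `upsert_chart_version` and the `digest` field values are assumptions for illustration. Note one deliberate difference from the design described above: instead of comparing only against the latest stored version, this sketch scans every stored entry, so an overwrite can never leave an older duplicate behind.

```python
from typing import Dict, List

# Hypothetical in-memory index: chart name -> list of version entries.
Index = Dict[str, List[dict]]


def upsert_chart_version(index: Index, name: str, entry: dict) -> None:
    """Store `entry` for chart `name`, replacing any entry with the same version."""
    versions = index.setdefault(name, [])
    # Drop every existing entry with the same version string, then add the new one.
    versions[:] = [v for v in versions if v["version"] != entry["version"]]
    versions.append(entry)


index: Index = {}
# Three overwriting uploads of the same version, then one new version.
for digest in ("aaa", "bbb", "ccc"):
    upsert_chart_version(index, "mychart", {"version": "1.0.0", "digest": digest})
upsert_chart_version(index, "mychart", {"version": "1.1.0", "digest": "ddd"})
```

After these four uploads the list for `mychart` holds two entries, one per version, and the `1.0.0` entry carries the digest of the last upload.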
In some cases we had > 40000 duplicates for a single version. For many other chart versions we have 10s or 100s of duplicates. If the upload is only checking the last 5 charts in the local memory cache, then it seems to me that it is not checking the shared index in redis or the charts on disk. With multiple chartmuseum replicas, this seems like an opportunity for each replica to have its own memory cache being correct but the redis stored index is not. However, I have not dived into the code since there are several complicated interactions there. I've tried to manually delete the duplicate charts but unfortunately the first deletion removes the file on storage and the entries remain in the index with no file in storage (S3 in our case). This causes subsequent 404s when attempting to download the chart. I have worked around the issue by forcing chartmuseum to rescan every 5s so that the index is rebuilt from the files on disk, but this does not seem like a good solution since the memory cache and redis index cache should make this unnecessary. |
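For anyone trying to measure how bad the duplication is, the following Python sketch counts and collapses duplicate versions. It assumes the `entries` section of `index.yaml` has already been parsed into a dict of chart name to list of version entries, which is the layout Helm uses for repository indexes. The helper names `count_duplicates` and `dedupe` are hypothetical. The sketch works purely in memory: it does not touch storage, the redis cache, or ChartMuseum itself.

```python
from collections import Counter
from typing import Dict, List, Tuple

Entries = Dict[str, List[dict]]


def count_duplicates(entries: Entries) -> Dict[Tuple[str, str], int]:
    """Return {(chart name, version): count} for versions listed more than once."""
    counts = Counter(
        (name, v["version"]) for name, versions in entries.items() for v in versions
    )
    return {key: n for key, n in counts.items() if n > 1}


def dedupe(entries: Entries) -> Entries:
    """Keep one entry per (chart, version); the last entry seen wins."""
    result: Entries = {}
    for name, versions in entries.items():
        latest: Dict[str, dict] = {}
        for v in versions:
            latest[v["version"]] = v  # a later duplicate replaces the earlier one
        result[name] = list(latest.values())
    return result


entries = {
    "mychart": [
        {"version": "1.0.0", "digest": "a"},
        {"version": "1.0.0", "digest": "b"},
        {"version": "1.1.0", "digest": "c"},
    ]
}
duplicates = count_duplicates(entries)
cleaned = dedupe(entries)
```

Here `duplicates` reports version `1.0.0` of `mychart` as listed twice, and `cleaned` keeps only the second `1.0.0` entry plus the `1.1.0` entry. Whether "last entry wins" is the right policy depends on how the index was produced, so treat that choice as an assumption.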
It seems like this is mostly what #220 fixed.
Upload updates the index entry directly and regenerates the repo index by emitting an addChart event (before) or an updateChart event (now, for overwrite cases). You can see some of these changes in my PR #454 and the event listener, but I only tested it with local storage, without redis caching. If you can bump to v0.13.1, delete the duplicate charts from the index entries, and test again, it will help a lot in detecting what the real issue is. |
We've had all these issues with 0.13.1 so I'm not sure that #220 solves the issue that the file is removed but not all duplicate entries in the index are removed. |
Thank you for the fix @scbizu , do you have an estimate on when 0.13.2 can be released and used? |
I am going to see if there are any other urgent issues that can be fixed in v0.13.2 🤔 and will talk with @jdolitsky about when we should release it. |
Hi @scbizu, I tested in my environment and it seems this issue still exists. The chartmuseum versions in my Harbor instances are the following:
|
Hi @ninjadq, can you give me some more details about the chartmuseum configuration? And do these two Harbor instances share the same index file? |
This is my test environment and these are my test steps
This is the env file for chartmuseum in Harbor
PS: the chartmuseum binary inside Harbor was replaced by one I built from the source code of chartmuseum's main branch |
Thanks for such a detailed report, I will try to reproduce this over the weekend. |
* The detailed issue is described in helm#450
* And there is a PR helm#454 that fixed one scenario of this issue
* But there is another occasion in which users upload a chart with prov
* This PR handles this situation in a way similar to helm#454

Signed-off-by: DQ <[email protected]>
I think I got the reason why this duplicate still exists. The harbor replication job replicates charts with the |
Thank you @ninjadq, I forgot the handler for |
* Fix duplicate versions for same chart
* The detailed issue is described in #450
* And there is a PR #454 that fixed one scenario of this issue
* But there is another occasion in which users upload a chart with prov
* This PR handles this situation in a way similar to #454

Signed-off-by: DQ <[email protected]>

* Enhance: optimize loop in `getChartAndProvFiles`
* On conflict there is no need to run the remaining logic, just return the file
* Move the file format check logic out of `validateChartOrProv`
* These changes are discussed in #492 (comment)

Signed-off-by: DQ <[email protected]>
We are seeing many duplicate entries in `index.yaml` for the same chart version. In some cases there are thousands of duplicates for the same version. We expect only one entry per version/tag.
ChartMuseum version: 0.13.1
Running as part of Harbor with multiple instances of chartmuseum and settings:
Is there a concurrency issue that we should be aware of?
Example: