
WebAPI v2.11 fails to start with many sources configured #2031 #2032

Merged · 2 commits merged into master from issue-2031_cache_warm_job_name_fix on May 23, 2022

Conversation

anton-abushkevich
Contributor

Quick fix for #2031

@anthonysena
Collaborator

Tagging @anton-abushkevich @ssuvorov-fls here. I added a commit where I'm trying to add back control of the caching to honor the priority of the daimon.

Prior to this change, we'd only cache record counts where the priority is > 0. The code I added tries to re-apply this rule, but making the update raised a few questions:

  • The changes done in Complete caching of Achilles results in Data Sources #1978 #1993 add some new rules for caching. As I understand it, any source configured with both a vocabulary and results daimon will have a job started with 2 steps: one to run the CDMResultsCacheTasklet and another to run the newly added AchillesCacheTasklet. With the changes I made, I only run the CDMResultsCacheTasklet if the source's results daimon has a priority > 0. I stopped short of applying this same rule to the AchillesCacheTasklet since I was unsure if this would cause any problems. If a source with a results daimon does not run the AchillesCacheTasklet, will the Data Sources functionality still work without the cache?
  • It seems that every start of WebAPI requires the AchillesCacheTasklet to run and presumably copy the Achilles data from the CDM to the WebAPI DB (the newly added cdm_cache table). My question is: do we need to do this on every startup? This feels like it should be a one-time operation to make that copy. Should the tasklet instead try to determine whether the caching needs to be re-applied? I think that was the idea behind cdm.cache.cron.warming.enable, but I'm unsure why you'd need to run this on a scheduled basis.
  • The behavior of cdm.cache.cron.warming.enable differs from cdm.result.cache.warming.enable. cdm.cache.cron.warming.enable controls whether the scheduled task will run, based on the cron expression specified in cdm.cache.cron.expression. Originally, cdm.result.cache.warming.enable controlled whether we warmed the results cache at all, which only included the record counts. That setting has now expanded to control both the record count cache and the Achilles results cache. I'd prefer that we have a mechanism to control both of these caches independently, per the point above.
  • There is a mechanism to "bucket" the jobs (see getBucketsSizes) and, as I understand it, it is used to create a single job with multiple steps based on the sources involved. I'm unclear, though, whether Spring Batch is honoring this, since it appears to launch all jobs immediately. In the case where we have a large number of sources configured, that will leave WebAPI & Atlas unresponsive to user requests (i.e. cohort generations) until the queue is cleared. Again, this will require us to think about the first point about controlling which sources are immediately cached.
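The priority rule described in the first point above could be sketched roughly as follows. This is a hypothetical simplification, not WebAPI's actual classes: each source is reduced to a name mapped to its Results-daimon priority, whereas WebAPI's real Source/SourceDaimon model is richer.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the rule: a source's record counts are warmed
// only if its Results daimon has priority > 0.
public class PrioritySketch {

    /** Keep only sources whose Results daimon priority is greater than zero. */
    static List<String> sourcesToWarm(Map<String, Integer> resultsPriorityBySource) {
        return resultsPriorityBySource.entrySet().stream()
                .filter(e -> e.getValue() > 0)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Under this rule, a source whose Results daimon has priority 0 is simply never selected for warming, which is the pre-change behavior the commit tries to restore.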

@anton-abushkevich
Contributor Author

@anthonysena

  • If a source with a results daimon does not run the AchillesCacheTasklet, will the Data Sources functionality still work without the cache?

Yes, it should still work.

  • It seems that every start of WebAPI requires the AchillesCacheTasklet to run and presumably copy the Achilles data from the CDM to the WebAPI DB (the newly added cdm_cache table). My question is: do we need to do this on every startup? This feels like it should be a one-time operation to make that copy. Should the tasklet instead try to determine whether the caching needs to be re-applied? I think that was the idea behind cdm.cache.cron.warming.enable, but I'm unsure why you'd need to run this on a scheduled basis.
  • The behavior of cdm.cache.cron.warming.enable differs from cdm.result.cache.warming.enable. cdm.cache.cron.warming.enable controls whether the scheduled task will run, based on the cron expression specified in cdm.cache.cron.expression. Originally, cdm.result.cache.warming.enable controlled whether we warmed the results cache at all, which only included the record counts. That setting has now expanded to control both the record count cache and the Achilles results cache. I'd prefer that we have a mechanism to control both of these caches independently, per the point above.
  • There is a mechanism to "bucket" the jobs (see getBucketsSizes) and, as I understand it, it is used to create a single job with multiple steps based on the sources involved. I'm unclear, though, whether Spring Batch is honoring this, since it appears to launch all jobs immediately. In the case where we have a large number of sources configured, that will leave WebAPI & Atlas unresponsive to user requests (i.e. cohort generations) until the queue is cleared. Again, this will require us to think about the first point about controlling which sources are immediately cached.

I think you're right, but it is not in the scope of this PR. Let's create another task for this and do the discussion/research there.

I will approve your changes in this PR and create a new task to discuss these topics. I hope Sergey (@ssuvorov-fls) will help us with it.
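For reference on the bucketing point being deferred to the new task: the idea behind getBucketsSizes, as described above, could be illustrated like this. Names and shapes here are illustrative, not WebAPI's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "bucketing": split N sources into fixed-size
// buckets, each bucket intended to become one job whose steps run
// sequentially, rather than launching a separate job per source all at once.
public class BucketSketch {

    /** Partition items into buckets of at most bucketSize elements each. */
    static <T> List<List<T>> bucket(List<T> items, int bucketSize) {
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < items.size(); i += bucketSize) {
            // subList is a view; copy so each bucket is an independent list
            buckets.add(new ArrayList<>(items.subList(i, Math.min(i + bucketSize, items.size()))));
        }
        return buckets;
    }
}
```

Whether Spring Batch actually serializes the resulting jobs (rather than launching them all immediately) is exactly the open question raised above.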

@anthonysena
Collaborator

Yes, it should still work.

It will be important to preserve backwards compatibility here: in our environment, it is not feasible to cache all Achilles results.

I think you're right, but it is not in the scope of this PR. Let's create another task for this and do the discussion/research there.

I will approve your changes in this PR and create a new task to discuss these topics. I hope Sergey (@ssuvorov-fls) will help us with it.

Could you please open a new issue with these details? We'll keep the focus of this PR on the name-length issue, but my preference is to add a configuration setting that allows us to disable the Achilles cache functionality for now.
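The kind of configuration switch suggested here could be as simple as a single boolean flag consulted before running the Achilles cache step. This is a hypothetical sketch; the property name below is illustrative, not an actual WebAPI setting.

```java
import java.util.Properties;

// Hypothetical kill switch: when the flag is false, the Achilles cache step
// would simply be skipped, leaving record-count caching behavior unchanged.
public class AchillesCacheSwitch {

    static boolean achillesCacheEnabled(Properties props) {
        // Default to true so existing deployments keep the current behavior.
        return Boolean.parseBoolean(props.getProperty("cdm.cache.achilles.warming.enable", "true"));
    }
}
```

Defaulting the flag to enabled preserves backwards compatibility for deployments that already rely on the cache, while giving environments with many sources an escape hatch.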

@anthonysena
Collaborator

This addresses the root of the problem described in #2031 by making sure the job name does not exceed the character limit. I'll approve this PR since this satisfies the bug fix required for that issue. Let's use #2034 to address the questions I've raised here. Thanks!

@anton-abushkevich anton-abushkevich merged commit fad3dab into master May 23, 2022
@delete-merged-branch delete-merged-branch bot deleted the issue-2031_cache_warm_job_name_fix branch May 23, 2022 07:44
anton-abushkevich added a commit that referenced this pull request Jun 9, 2022
* fails to start with many sources configured #2031

* Starting to add back the ability to control caching based on daimon priority

Co-authored-by: Anthony Sena <[email protected]>
(cherry picked from commit fad3dab)