Question: Is it possible to utilize multiple cores when training (adding measurements)? #344

Closed
brandon-holt opened this issue Aug 15, 2024 · 4 comments
Labels
question Further information is requested

Comments

@brandon-holt
Contributor

brandon-holt commented Aug 15, 2024

Hi, I noticed that when adding measurements to a campaign object, only one core is being utilized. Is there a way to parallelize this process to decrease the runtime? It is currently very slow for me.

By contrast, I noticed that when running the simulate_experiment module, all cores are in use. I know these are different processes, but I was curious why that module can utilize multiple cores.

Thanks!

@AdrianSosic
Collaborator

Hi @brandon-holt, as always, thanks for reporting the issue. The fact that the mere addition of measurements (i.e., without even recommending) causes delays is clearly suboptimal and needs to be fixed. Ideally, this should not be noticeable at all. The current overhead stems from a design choice that we might need to rethink: it is probably caused by the process of "marking" parameter configurations as measured in the search space metadata. This process is currently by no means optimized for speed, and I see several potential ways around it that we'd need to discuss in our team:

  • Making the involved fuzzy matching more efficient
  • Switching to a more performant backend like polars
  • Following an entirely different approach to metadata handling
  • ...

I suspect your search space is quite big, which is causing the delays? Can you give me a rough estimate of its dimensions so that I have something to work with?
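
For illustration, here is a minimal sketch (not BayBE's actual implementation) of what such a row-wise "marking" step can amount to: each incoming measurement is matched against the discrete search space, exactly on categorical columns and by nearest value on numerical ones. The function name, the `cat_cols`/`num_cols` split, and the matching rule are all illustrative assumptions; the point is that every measurement triggers a full scan of the search space.

```python
import numpy as np
import pandas as pd

def mark_measured_rows_naive(
    searchspace: pd.DataFrame,
    measurements: pd.DataFrame,
    cat_cols: list[str],
    num_cols: list[str],
) -> np.ndarray:
    """Loop-based matching: one full scan of the search space per measurement."""
    matched = []
    for _, row in measurements.iterrows():
        # Exact match on all categorical columns.
        mask = np.ones(len(searchspace), dtype=bool)
        for col in cat_cols:
            mask &= searchspace[col].to_numpy() == row[col]
        # Among the remaining rows, pick the numerically closest one.
        candidates = searchspace.loc[mask, num_cols]
        distance = (candidates - row[num_cols]).abs().sum(axis=1)
        matched.append(distance.idxmin())
    return np.asarray(matched)
```

With tens of millions of search-space rows, each pass of this kind is expensive on its own, which is consistent with the delays described above.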

@brandon-holt
Contributor Author

@AdrianSosic I see, that's an interesting insight!

Here is the size of a typical campaign search space I am working with:

campaign.searchspace.discrete.comp_rep.shape = (37324800, 191)
campaign.searchspace.discrete.exp_rep.shape = (37324800, 8)
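
For scale, a back-of-the-envelope estimate (assuming 8-byte float64 entries, which may not match BayBE's actual dtypes) shows why any per-row operation over this frame is costly:

```python
# Rough memory footprint of the computational representation above,
# assuming float64 entries (the actual dtypes in BayBE may differ).
rows, cols = 37_324_800, 191
total_bytes = rows * cols * 8
print(f"{rows * cols:,} cells ≈ {total_bytes / 2**30:.1f} GiB")
# -> 7,129,036,800 cells ≈ 53.1 GiB
```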

@AdrianSosic
Collaborator

Thanks for sharing. That is indeed already quite a lot. I'll take this to our team meeting and see what we can do about it. Perhaps we can find a quick fix for you... But priority-wise, a full fix could take a while, since my focus is currently still on the surrogate/SHAP issue 😋

Scienfitz added the question (Further information is requested) label on Feb 5, 2025
Scienfitz added a commit that referenced this issue Mar 5, 2025
- use vectorized operations instead of the for loop
- fixed column validations
- I tested that the result of the new version is always exactly equal to the old version
- added some basic pytests for the utility
- related to #344

Here is a resulting test looking at the speedup:

[Screenshot: benchmark plot of the speedup, https://github.com/user-attachments/assets/094bd96c-1e0f-4c4b-a10e-fdd5d680eb16]

- speedup for the most realistic cases (`left_df` large versus `right_df`) approaches 4x from above
- for less relevant cases (`left_df` and `right_df` comparable in size or overall very small) the speedup can even be 40x
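
The `left_df`/`right_df` naming above suggests a merge-style matching utility. Below is a hedged sketch of the vectorization idea, one merge instead of a per-measurement scan, continuing the illustrative example from earlier in the thread; it mirrors the approach described in the commit, not necessarily the exact code that was merged.

```python
import numpy as np
import pandas as pd

def mark_measured_rows_vectorized(
    left_df: pd.DataFrame,   # search space (large)
    right_df: pd.DataFrame,  # measurements (small)
    cat_cols: list[str],
    num_cols: list[str],
) -> np.ndarray:
    """Match all measurements at once via a single merge on the categorical columns."""
    merged = left_df.reset_index().merge(
        right_df.reset_index(), on=cat_cols, suffixes=("_l", "_r")
    )
    # Vectorized distance over the numerical columns for all candidate pairs.
    merged["dist"] = sum(
        (merged[f"{c}_l"] - merged[f"{c}_r"]).abs() for c in num_cols
    )
    # For each measurement, keep the search-space row with the smallest distance.
    best = merged.loc[merged.groupby("index_r")["dist"].idxmin()]
    return best.sort_values("index_r")["index_l"].to_numpy()
```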
@Scienfitz
Collaborator

Scienfitz commented Mar 6, 2025

@brandon-holt We recently merged an improvement to the measurement-addition logic in #489.

It vectorizes the operations that were likely behind the slowdown you experienced in this issue. This should automatically improve CPU usage, but it might come at a memory cost.

We noticed a speedup between 4x and 40x.

While it is possible that the operation can be improved even further in the future, I will close this issue for now.
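
A quick way to verify the improvement on your own data is to time the call directly. `campaign` and `measurements_df` below are placeholders for your existing Campaign object and measurement DataFrame:

```python
import time

start = time.perf_counter()
campaign.add_measurements(measurements_df)  # the call whose runtime #489 targets
print(f"add_measurements took {time.perf_counter() - start:.2f} s")
```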
