-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: update example notebooks to pip install the fmeval package #158
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
@@ -11,9 +11,9 @@ | |||
"tags": [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we keep below commented, add an additional comment mentioning the use case of this envirnment variable?
os.environ["PARALLELIZATION_FACTOR"] = "1"
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that almost all users (aside from contributors who are very familiar with the internals of our codebase) no longer need to be aware of this env variable. We originally needed to include this as a hacky way to avoid encountering some niche Ray bugs, but I have since fixed those bugs in this PR. I think it's best not to expose details about infra internals (i.e. Ray details) as much as possible, so as to not overwhelm the average user.
This env variable is currently only being used to determine the number of actors in the ActorPools
that we create. While too many actors can lead to OOM issues in theory, in our use case, OOM issues are almost never caused by having too many ActorPool
actors, since those actors are always cheap (memory and CPU usage-wise).
The only time OOM would be an issue for us in the example notebooks is if we're dealing with an eval algo that uses the BertscoreHelperModel
, which is a relatively expensive (memory-wise) actor. But in this cases, we don't use ActorPool
, so PARALLELIZATION_FACTOR
doesn't apply. In such cases, the user just needs to use a notebook instance that has enough memory to load the Bert model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with minor comments/questions
Issue #, if available:
Description of changes:
This PR updates how the example notebooks install the
fmeval
package now that it's available through PyPI. It also cleans up the notebooks by adding more explanatory comments and using a uniform style so they all have the same look and feel. I've also deleted the JS toxicity example notebook as I feel it does not add any value given that we already have a JS stereotyping notebook.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.