[bnb] Introducing BitsAndBytesConfig #21579
Conversation
- add v1
- add tests
- more user-friendly API
- add docs
@younesbelkada great job again. :)
Thanks for working on this! I have thoughts on the API, left a couple of comments!
@@ -0,0 +1,19 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
This should be a quantization page, not just bitsandbytes. We may support more configs in the future (for instance coming from optimum).
specific language governing permissions and limitations under the License.
-->

# `bitsandbytes` Integration
This should thus be one section, but not the whole title.
TODO: write documentation here

## BitsandbytesConfig
Camel-cased name please (BitsAndBytesConfig)
src/transformers/modeling_utils.py
Outdated
load_in_8bit_skip_modules (`List[str]`, *optional*):
    An explicit list of the modules that we do not want to convert in 8-bit. This is useful for models such
    as Jukebox that has several heads in different places and not necessarily at the last position.
bitsandbytes_config (`Dict`, *optional*):
This should be `quantization_config` here. And actually, what about `load_in_8bit` supporting both `True` and a config?
src/transformers/modeling_utils.py
Outdated
if load_in_8bit and bitsandbytes_config is not None:
    raise ValueError(
        "You can't pass both `load_in_8bit=True` and `bitsandbytes_config` as they are mutually exclusive."
    )
Yeah, this makes me think supporting both a bool and a config for `load_in_8bit` is a better API.
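For illustration, a minimal sketch of what that API shape could look like; `_resolve_8bit_config` is a hypothetical helper and not code from this PR:

```python
# Hypothetical sketch: let `load_in_8bit` accept either a plain bool or a config
# object, instead of adding a second, mutually exclusive argument.
from transformers import BitsAndBytesConfig


def _resolve_8bit_config(load_in_8bit):
    """Normalize the user-facing argument into a BitsAndBytesConfig (or None)."""
    if not load_in_8bit:
        return None
    if load_in_8bit is True:
        # a bare `True` keeps the simple path and uses the default config
        return BitsAndBytesConfig(load_in_8bit=True)
    if isinstance(load_in_8bit, BitsAndBytesConfig):
        return load_in_8bit
    raise ValueError("`load_in_8bit` must be a bool or a `BitsAndBytesConfig`.")
```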
src/transformers/modeling_utils.py
Outdated
logger.warning(
    "The `load_in_8bit` argument will be deprecated in the future (v5). Please use "
    "`BitsandbytesConfig` instead."
)
Ah, so that's your plan. I actually like having a simple `load_in_8bit=True` that loads the most basic config.
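For reference, the simple path being discussed looks like this; the checkpoint is illustrative and `bitsandbytes`/`accelerate` must be installed for it to run:

```python
# A bare `load_in_8bit=True`, no config object needed.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",
    device_map="auto",
    load_in_8bit=True,
)
```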
PR title changed from "[bnb] Introducing BitsandbytesConfig" to "[bnb] Introducing BitsAndBytesConfig".
The documentation is not available anymore as the PR was closed or merged.
@@ -4081,6 +4085,9 @@
    logging,
)

# bitsandbytes config
from .utils.quantization_config import BitsAndBytesConfig
Is this the right approach? If I protect this import with `is_bitsandbytes_available`, it seems the docs won't be able to build.
Yes, especially since all bnb imports are protected, so you can import this directly here.
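A rough sketch of the layering being described here, assuming the module layout from this PR (`utils/quantization_config.py`): the config module itself has no `bitsandbytes` dependency, so it can be imported unconditionally, while anything that actually uses `bnb` stays behind the availability check.

```python
from transformers.utils import is_bitsandbytes_available
from transformers.utils.quantization_config import BitsAndBytesConfig  # pure-Python config, no bnb import

if is_bitsandbytes_available():
    import bitsandbytes as bnb  # optional heavy dependency, only imported when present
```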
This PR is now ready for review! Would love to have a round of review @sgugger
Thanks for iterating. I was thinking that maybe the `**kwargs` in `from_pretrained` could be used to update any attribute of the config when you instantiate it (like for regular model config arguments); this way `load_in_8bit` could become an attribute of this config and there is no need to deprecate it.
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
- also add tests
- change logic
Thanks for the extensive review! I should have addressed the comments now 💪
One last comment and we should be good to go!
src/transformers/modeling_utils.py
Outdated
quantization_config_kwargs = {
    k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters
}
It would be easier to have `BitsAndBytesConfig` have a `from_dict` method with `return_unused_kwargs`, like `PretrainedConfig`. We can then do

quantization_config, kwargs = BitsAndBytesConfig.from_dict(return_unused_kwargs=True, **kwargs)
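A minimal sketch of the suggested `from_dict` pattern, mirroring how `PretrainedConfig.from_dict` handles `return_unused_kwargs`; the class name and attribute names below are illustrative, and the real signature in the library may differ.

```python
class BitsAndBytesConfigSketch:
    def __init__(self, load_in_8bit=False, llm_int8_threshold=6.0, llm_int8_skip_modules=None):
        self.load_in_8bit = load_in_8bit
        self.llm_int8_threshold = llm_int8_threshold
        self.llm_int8_skip_modules = llm_int8_skip_modules

    @classmethod
    def from_dict(cls, config_dict, return_unused_kwargs=False, **kwargs):
        config = cls(**config_dict)
        # kwargs that match config attributes override the dict; the rest are handed back
        consumed = []
        for key, value in kwargs.items():
            if hasattr(config, key):
                setattr(config, key, value)
                consumed.append(key)
        unused_kwargs = {k: v for k, v in kwargs.items() if k not in consumed}
        if return_unused_kwargs:
            return config, unused_kwargs
        return config


# e.g. inside from_pretrained (sketch):
# quantization_config, kwargs = BitsAndBytesConfigSketch.from_dict(
#     {}, return_unused_kwargs=True, **kwargs
# )
```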
Agreed!
I think we still need to add:

quantization_config_kwargs = {
    k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters
}

to check if a user has passed `quantization_config` together with some other associated kwargs inside `from_pretrained`. But this check is done only in the case `elif quantization_config is not None:`.
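A rough sketch of how that check could sit in `from_pretrained`; the control flow is simplified and `_check_quantization_kwargs` is a hypothetical helper, not the PR's actual code.

```python
import inspect

from transformers import BitsAndBytesConfig


def _check_quantization_kwargs(quantization_config, kwargs):
    # kwargs that belong to BitsAndBytesConfig but were passed loose to from_pretrained
    overlapping = {
        k: v
        for k, v in kwargs.items()
        if k in inspect.signature(BitsAndBytesConfig).parameters
    }
    if quantization_config is not None and overlapping:
        # an explicit config plus loose quantization kwargs is ambiguous
        raise ValueError(
            f"Arguments {sorted(overlapping)} conflict with the `quantization_config` "
            "that was also passed; set them on the config instead."
        )
```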
Thanks for your work on this!
Next step would be to save this in the model config and be able to re-load a checkpoint pushed to the Hub quantized!
Thanks! Exactly! Looking forward to it!
What does this PR do?
This PR introduces a new feature termed `BitsandbytesConfig`, enabling users to play more flexibly with the `transformers` + `bitsandbytes` API. This is also a first step of one of the major refactors we are planning, to possibly support more quantization features.
This PR also addresses: #20281 (comment)
With this PR it will also be possible to apply advanced use cases for `bnb` models, such as offloading parameters between CPU & GPU to run some parts in int8 and the rest on CPU (but in `fp32`).
Draft for now, will update the docs and fix the CI tests after a first pass 💪
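A hedged example of that advanced use case: most of the model stays in int8 on the GPU while modules mapped to `"cpu"` run in fp32. Parameter names such as `llm_int8_enable_fp32_cpu_offload` follow the released API and may differ slightly from this PR's draft; the checkpoint and `device_map` are illustrative.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True,  # modules placed on "cpu" are kept in fp32
)

device_map = {
    "transformer.word_embeddings": 0,
    "transformer.word_embeddings_layernorm": 0,
    "lm_head": "cpu",  # offloaded, runs in fp32 on CPU
    "transformer.h": 0,
    "transformer.ln_f": 0,
}

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-1b7",
    device_map=device_map,
    quantization_config=quantization_config,
)
```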
One of my comments being that we should keep the `load_in_8bit` argument as it is for now (as I think this argument is quite powerful) and progressively replace it entirely with the config in the future if more quantization methods are supported in `bitsandbytes`.

cc @sgugger