
[bnb] Introducing BitsAndBytesConfig #21579

Merged (25 commits) on Feb 17, 2023

Conversation

younesbelkada
Contributor

@younesbelkada younesbelkada commented Feb 11, 2023

What does this PR do?

This PR introduces a new feature, BitsAndBytesConfig, enabling users to work more flexibly with the transformers + bitsandbytes API.
It is also a first step in a larger refactor we are planning, in order to possibly support more quantization features.

This PR also addresses: #20281 (comment)

With this PR it will also be possible to cover advanced use cases for bnb models, such as offloading parameters across CPU and GPU to run part of the model in int8 on GPU and the other part on CPU (but in fp32).

Draft for now, will update the docs and fix the CI tests after a first pass 💪

One comment from my side: we should keep the load_in_8bit argument as it is for now (I think this argument is quite powerful) and progressively replace it entirely with the config in the future, if more quantization methods get supported in bitsandbytes.

cc @sgugger

- add v1
- add tests
- more user-friendly API
- add docs
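As a rough illustration of the user-facing shape this PR aims for, here is a self-contained stand-in for the config object. Field names mirror the PR discussion (`load_in_8bit`, `load_in_8bit_skip_modules`) and the bitsandbytes integration (`llm_int8_threshold`); they are assumptions for illustration, not the final transformers signature.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional

# Illustrative stand-in for the config object this PR introduces;
# not the actual transformers class.
@dataclass
class BitsAndBytesConfigSketch:
    load_in_8bit: bool = False
    llm_int8_threshold: float = 6.0
    load_in_8bit_skip_modules: Optional[List[str]] = None

# One config object gathers all bnb options in one place...
config = BitsAndBytesConfigSketch(load_in_8bit=True,
                                  load_in_8bit_skip_modules=["lm_head"])
# ...and would then be handed to from_pretrained, e.g.:
# model = AutoModelForCausalLM.from_pretrained(ckpt, quantization_config=config)
```

The point of the refactor is that every new quantization knob becomes a field on this object instead of a new loose keyword argument on from_pretrained.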
@alexconstant9108

@younesbelkada great job again. :)
I really like the refactored API, which supports passing a whole config object containing all relevant BnB options in one place.
I am curious: are the computations on the parameters offloaded (in fp32) to CPU and disk actually executed on the CPU, or are they transferred to one of the GPUs on every forward pass?

Collaborator

@sgugger sgugger left a comment


Thanks for working on this! I have thoughts on the API, left a couple of comments!

@@ -0,0 +1,19 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Collaborator


This should be a quantization page, not just bitsandbytes. We may support more configs in the future (for instance coming from optimum).

specific language governing permissions and limitations under the License.
-->

# `bitsandbytes` Integration
Collaborator


This should thus be one section, but not the whole title.


TODO: write documentation here

## BitsandbytesConfig
Collaborator


Camel-cased name please (BitsAndBytesConfig)

load_in_8bit_skip_modules (`List[str]`, *optional*):
An explicit list of the modules that we do not want to convert to 8-bit. This is useful for models such
as Jukebox, which has several heads in different places and not necessarily at the last position.
bitsandbytes_config (`Dict`, *optional*):
Collaborator


This should be quantization_config here. And actually, what about load_in_8bit supporting both True and a config?

Comment on lines 2101 to 2104
if load_in_8bit and bitsandbytes_config is not None:
raise ValueError(
"You can't pass both `load_in_8bit=True` and `bitsandbytes_config` as they are mutually exclusive."
)
Collaborator


Yeah this makes me think supporting both bool and config to load_in_8bit is a better API.
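The "bool or config" idea discussed here could look like the following self-contained sketch; the helper name and the stand-in config class are hypothetical, not the actual transformers code.

```python
# Illustrative stand-in config, not the actual transformers class.
class BitsAndBytesConfigSketch:
    def __init__(self, load_in_8bit=False, llm_int8_threshold=6.0):
        self.load_in_8bit = load_in_8bit
        self.llm_int8_threshold = llm_int8_threshold

def resolve_quantization_config(load_in_8bit=False, quantization_config=None):
    """Return a single config object, however the user asked for 8-bit loading."""
    if isinstance(load_in_8bit, BitsAndBytesConfigSketch):
        return load_in_8bit                         # full config passed directly
    if quantization_config is not None:
        if load_in_8bit:
            raise ValueError(
                "You can't pass both `load_in_8bit=True` and `quantization_config`."
            )
        return quantization_config                  # explicit config wins
    if load_in_8bit:
        return BitsAndBytesConfigSketch(load_in_8bit=True)  # most basic config
    return None                                     # no quantization requested
```

This keeps the simple `load_in_8bit=True` path working while still allowing a full config for advanced use cases.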

Comment on lines 2116 to 2119
logger.warning(
"The `load_in_8bit` argument will be deprecated in the future (v5). Please use "
"`BitsandbytesConfig` instead."
)
Collaborator


Ah, so that's your plan. I actually like having a simple load_in_8bit=True that loads the most basic config.

@younesbelkada younesbelkada changed the title [bnb] Introducing BitsandbytesConfig [bnb] Introducing BitsAndBytesConfig Feb 14, 2023
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Feb 14, 2023

The documentation is not available anymore as the PR was closed or merged.

@@ -4081,6 +4085,9 @@
logging,
)

# bitsandbytes config
from .utils.quantization_config import BitsAndBytesConfig
Contributor Author


Is this the right approach? If I protect this import with is_bitsandbytes_available, it seems the docs won't build.

Collaborator


Yes, especially since all bnb imports are protected, so you can import this directly here.

@younesbelkada younesbelkada marked this pull request as ready for review February 14, 2023 16:51
@younesbelkada
Contributor Author

This PR is now ready for review! Would love to have a round of review @sgugger

Collaborator

@sgugger sgugger left a comment


Thanks for iterating. I was thinking that maybe the **kwargs in from_pretrained could be used to update any attribute of the config when you instantiate it (as for regular model config arguments); this way load_in_8bit could become an attribute of this config and there would be no need to deprecate it.
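The kwargs-update idea above could be sketched like this; the helper and the stand-in config class are hypothetical, not the actual transformers implementation.

```python
# Illustrative stand-in config, not the actual transformers class.
class QuantConfigSketch:
    def __init__(self):
        self.load_in_8bit = False
        self.llm_int8_threshold = 6.0

def update_config_from_kwargs(config, kwargs):
    """Pop any kwarg naming an existing config attribute and set it on the config."""
    for key in list(kwargs):
        if hasattr(config, key):
            setattr(config, key, kwargs.pop(key))  # consume config kwargs
    return config, kwargs                          # leftovers flow to the model

config, rest = update_config_from_kwargs(
    QuantConfigSketch(), {"load_in_8bit": True, "device_map": "auto"}
)
```

With this pattern, load_in_8bit stays a plain keyword argument from the user's point of view while becoming an attribute of the config internally.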


@younesbelkada
Contributor Author

Thanks for the extensive review! I should have addressed the comments now 💪

Collaborator

@sgugger sgugger left a comment


One last comment and we should be good to go!

Comment on lines 2119 to 2121
quantization_config_kwargs = {
k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters
}
Collaborator


It would be easier to have BitsAndBytesConfig expose a from_dict method with return_unused_kwargs, like PretrainedConfig. We can then do

quantization_config, kwargs = BitsAndBytesConfig.from_dict(kwargs, return_unused_kwargs=True)

Contributor Author

@younesbelkada younesbelkada Feb 16, 2023


Agreed!
I think we still need to add:

quantization_config_kwargs = {
    k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters
}

to check whether a user has passed quantization_config together with some other associated kwargs inside from_pretrained. But this check is only done in the `elif quantization_config is not None:` branch.
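The from_dict(..., return_unused_kwargs=True) pattern discussed above, mirroring PretrainedConfig, could be sketched as follows; class and parameter names here are illustrative stand-ins, not the final implementation.

```python
import inspect

# Illustrative stand-in config, not the actual transformers class.
class BitsAndBytesConfigSketch:
    def __init__(self, load_in_8bit=False, llm_int8_threshold=6.0):
        self.load_in_8bit = load_in_8bit
        self.llm_int8_threshold = llm_int8_threshold

    @classmethod
    def from_dict(cls, config_dict, return_unused_kwargs=False):
        # Split the incoming dict into kwargs the constructor accepts and the rest.
        accepted = inspect.signature(cls.__init__).parameters
        used = {k: v for k, v in config_dict.items() if k in accepted}
        unused = {k: v for k, v in config_dict.items() if k not in accepted}
        config = cls(**used)
        return (config, unused) if return_unused_kwargs else config

quantization_config, kwargs = BitsAndBytesConfigSketch.from_dict(
    {"load_in_8bit": True, "device_map": "auto"}, return_unused_kwargs=True
)
```

The unused kwargs (here `device_map`) are returned to the caller so from_pretrained can keep processing them as usual.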

Collaborator

@sgugger sgugger left a comment


Thanks for your work on this!

Next step would be to save this in the model config and be able to re-load a checkpoint pushed to the Hub quantized!
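The next step described above (persisting the quantization config with the model and rebuilding it at load time) could be sketched roughly as follows; the class and the config.json layout are assumptions for illustration only.

```python
import json

# Illustrative stand-in config, not the actual transformers class.
class BitsAndBytesConfigSketch:
    def __init__(self, load_in_8bit=False, llm_int8_threshold=6.0):
        self.load_in_8bit = load_in_8bit
        self.llm_int8_threshold = llm_int8_threshold

    def to_dict(self):
        return vars(self).copy()

# Hypothetical shape of what save_pretrained could add to config.json:
saved = json.dumps({
    "model_type": "opt",
    "quantization_config": BitsAndBytesConfigSketch(load_in_8bit=True).to_dict(),
})
# At load time, the config would be rebuilt from the stored dict:
reloaded = BitsAndBytesConfigSketch(**json.loads(saved)["quantization_config"])
```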

@younesbelkada
Contributor Author

Thanks! Exactly! Looking forward to it!
