Error when load model in 4bit #1638

rin2401 · 2024-07-31T06:38:34Z

There is no need to set kwargs['load_in_4bit'] = True when quantization_config is already passed.

Lines 34 to 40 in c121f04

    
    elif load_4bit:
        kwargs['load_in_4bit'] = True
        kwargs['quantization_config'] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type='nf4'


drzraf · 2024-09-01T04:18:29Z

This happens with transformers 4.44.2. The ValueError was introduced in transformers by huggingface/transformers#21579, as a hard error rather than a gradual deprecation.

That change was merged in huggingface/transformers@3668ec1 and is part of v4.27.0 and later.

Since this project requires transformers==4.37.2, there is no reason to keep passing the deprecated booleans that trigger this error.
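For illustration, here is a minimal sketch of the cleanup described above: if quantization_config is present in the kwargs, the redundant load_in_4bit / load_in_8bit booleans are dropped before the kwargs reach from_pretrained. The helper name drop_legacy_quant_flags and the placeholder config object are assumptions made for this example; they are not part of LLaVA or transformers. In the real builder the simpler fix is probably just to delete the kwargs['load_in_4bit'] = True line.

```python
from typing import Any, Dict

# Legacy boolean flags that duplicate what quantization_config already states.
LEGACY_QUANT_FLAGS = ("load_in_4bit", "load_in_8bit")


def drop_legacy_quant_flags(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without the legacy booleans when a
    quantization_config is present.

    Hypothetical helper for illustration; not part of LLaVA or transformers.
    """
    cleaned = dict(kwargs)  # copy so the caller's dict is left untouched
    if cleaned.get("quantization_config") is not None:
        for flag in LEGACY_QUANT_FLAGS:
            cleaned.pop(flag, None)
    return cleaned


# Stand-in for a BitsAndBytesConfig instance, so the sketch runs without
# transformers or bitsandbytes installed.
placeholder_config = object()

kwargs = {"load_in_4bit": True, "quantization_config": placeholder_config}
cleaned = drop_legacy_quant_flags(kwargs)
print(sorted(cleaned))
```

The helper leaves kwargs alone when no quantization_config is given, so callers that still rely only on the booleans are unaffected.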

drzraf mentioned this issue Sep 1, 2024

[Question] Can LLava inference on CPU? #865

Open
