Error when I try to evaluate pretrained Qwen 2.5 0.5B model #1936

Open
mtasic85 opened this issue Feb 12, 2025 · 3 comments

@mtasic85
Contributor

Hi,

I pretrained a Qwen 2.5 0.5B base model with a single layer (on purpose), and when I chat with the model it "works."

However, when I try to evaluate the model, it fails:

litgpt evaluate \
    --tasks 'leaderboard' \
    --out_dir 'evaluate/pretrain-core/leaderboard/' \
    --batch_size 4 \
    --dtype 'bfloat16' \
    'out/pretrain-core/final'

Error:

{'access_token': None,
 'batch_size': 4,
 'checkpoint_dir': PosixPath('checkpoints/../out/pretrain-core/final'),
 'device': None,
 'dtype': 'bfloat16',
 'force_conversion': False,
 'limit': None,
 'num_fewshot': None,
 'out_dir': PosixPath('../evaluate/pretrain-core/leaderboard'),
 'save_filepath': None,
 'seed': 1234,
 'tasks': 'leaderboard'}
{'checkpoint_dir': PosixPath('checkpoints/../out/pretrain-core/final'),
 'output_dir': PosixPath('../evaluate/pretrain-core/leaderboard')}
Traceback (most recent call last):
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/bin/litgpt", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/litgpt/__main__.py", line 71, in main
    CLI(parser_data)
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/jsonargparse/_cli.py", line 119, in CLI
    return _run_component(component, init.get(subcommand))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/jsonargparse/_cli.py", line 204, in _run_component
    return component(**cfg)
           ^^^^^^^^^^^^^^^^
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/litgpt/eval/evaluate.py", line 95, in convert_and_evaluate
    convert_lit_checkpoint(checkpoint_dir=checkpoint_dir, output_dir=out_dir)
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/litgpt/scripts/convert_lit_checkpoint.py", line 398, in convert_lit_checkpoint
    copy_fn(sd, lit_weights, saver=saver)
  File "/home/tangled/tangled-1.0-0.5b-base/scripts/venv/lib/python3.12/site-packages/litgpt/scripts/convert_lit_checkpoint.py", line 160, in copy_weights_llama
    to_names = (weight_map[name_template].format(*ids),)
                ~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'transformer.h.{}.attn.qkv.bias'
mtasic85 added the question label on Feb 12, 2025
@t-vi
Collaborator

t-vi commented Feb 12, 2025

Hi,
can you post your model_config.yaml please?
The traceback points to the Llama conversion path, but you should have gotten the Qwen one if your model name starts with qwen2.5 or qwq:

elif config.name.lower().startswith(("qwen2.5","qwq")):
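
For context, here is roughly how the dispatch appears to work (a simplified sketch based on the traceback, not the library's exact code): the checkpoint conversion picks its weight-copy function from config.name, so an empty name falls through to the generic Llama copier, whose weight map has no entry for Qwen's attention bias, hence the KeyError on transformer.h.{}.attn.qkv.bias.

# Simplified sketch of the name-based dispatch in convert_lit_checkpoint
# (illustrative only; the branch order and fallback here are assumptions).
from dataclasses import dataclass

@dataclass
class Config:
    name: str

def pick_copy_fn(config: Config) -> str:
    # Qwen 2.5 / QwQ checkpoints use attention biases (attn_bias: true),
    # which the Llama weight map does not cover, so they need their own
    # copy function; other names fall back to copy_weights_llama.
    if config.name.lower().startswith(("qwen2.5", "qwq")):
        return "copy_weights_qwen_2_5"
    return "copy_weights_llama"

print(pick_copy_fn(Config(name="")))              # copy_weights_llama -> KeyError on attn.qkv.bias
print(pick_copy_fn(Config(name="Qwen2.5-0.5B")))  # copy_weights_qwen_2_5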

@mtasic85
Contributor Author

model_config.yaml

attention_logit_softcapping: null
attention_scores_scalar: null
attn_bias: true
bias: false
block_size: 32768
final_logit_softcapping: null
gelu_approximate: none
head_size: 64
hf_config: {}
intermediate_size: 4864
lm_head_bias: false
mlp_class_name: LLaMAMLP
n_embd: 896
n_expert: 0
n_expert_per_token: 0
n_head: 14
n_layer: 1
n_query_groups: 2
name: ''
norm_class_name: RMSNorm
norm_eps: 1.0e-06
norm_qk: false
padded_vocab_size: 151936
padding_multiple: 512
parallel_residual: false
post_attention_norm: false
post_mlp_norm: false
rope_adjustments: null
rope_base: 1000000
rope_condense_ratio: 1
rotary_percentage: 1.0
scale_embeddings: false
shared_attention_norm: false
sliding_window_layer_placing: null
sliding_window_size: null
vocab_size: 151643

pretrain-core-model.yaml

model_name: "Qwen2.5-0.5B"

model_config:
  block_size: 32768
  vocab_size: 151643
  padded_vocab_size: 151936
  n_layer: 1
  n_head: 14
  n_embd: 896
  n_query_groups: 2
  rotary_percentage: 1.0
  parallel_residual: False
  bias: False
  attn_bias: True
  norm_class_name: "RMSNorm"
  mlp_class_name: "LLaMAMLP"
  intermediate_size: 4864
  norm_eps: 1e-6
  rope_base: 1000000
  # head_size: 64 # n_embd / n_head

@mtasic85
Contributor Author

I added the "qwen2.5" prefix to the model's "name" field and it worked. Thanks @t-vi!
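
For anyone else hitting this: the generated model_config.yaml had name: '', so the conversion never took the Qwen path. A minimal sketch of the change that fixed it for me (the exact value is from my setup; any name starting with "qwen2.5" should do):

model_config:
  # ... other fields unchanged ...
  name: "qwen2.5-0.5b"  # was '' before; the qwen2.5 prefix selects the Qwen weight-copy function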
