[QUESTION] How to convert the weight file format of the MAMBA model from pt to safetensors format? #1339

@fxnie

MODEL_PATH="/workspace/mnt/xxx/ckpt/mamba2"
SAVE_PATH="/workspace/mnt/xxx/models/convert_ckpt/mamba2"

python tools/checkpoint/hybrid_conversion.py \
  --load-dir "${MODEL_PATH}" \
  --save-dir "${SAVE_PATH}" \
  --target-tp-size 2 \
  --target-pp-size 1 \
  --d-model 2560 \
  --mamba-version 2 \
  --mamba-d-state 128 \
  --mamba2-n-groups 8 \
  --mamba2-head-dim 64

After running the conversion script, the output checkpoint is still in .pt format. How can I convert it to safetensors format?
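
For reference, one possible approach is to load each converted .pt file with PyTorch and re-save its tensors with the safetensors library. The sketch below is not part of Megatron-LM and rests on assumptions: that the weights sit under a "model" key in the checkpoint dict (with a fallback to the top-level dict), and that non-tensor entries can be skipped. The function names `flatten` and `convert` are made up for illustration. It changes only the container format; the parameter names stay as Megatron wrote them, so the result is not automatically loadable by other frameworks.

```python
def flatten(tree, prefix=""):
    """Yield (dotted_name, leaf) pairs from a possibly nested dict."""
    for key, value in tree.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, prefix=f"{name}.")
        else:
            yield name, value


def convert(pt_path, out_path):
    """Re-save the tensors found in a .pt checkpoint as one safetensors file."""
    # Imported here so that flatten() can be used without these packages.
    import torch
    from safetensors.torch import save_file

    # weights_only=False because the checkpoint may pickle non-tensor
    # objects alongside the weights. Only load files you trust.
    ckpt = torch.load(pt_path, map_location="cpu", weights_only=False)

    # Assumption: weights live under "model"; otherwise use the dict itself.
    state = ckpt.get("model", ckpt)

    # safetensors stores a flat name -> tensor mapping and rejects tensors
    # that share memory, so each tensor is copied and made contiguous.
    tensors = {
        name: value.detach().clone().contiguous()
        for name, value in flatten(state)
        if isinstance(value, torch.Tensor)
    }
    save_file(tensors, str(out_path))
    return len(tensors)
```

A call would look like `convert(src, dst)`, where `src` is one of the .pt files under the save directory and `dst` is the output .safetensors path. With --target-tp-size 2 there should be one .pt file per tensor-parallel rank, so each file would need its own call; merging the ranks into a single set of weights is a separate step that this sketch does not attempt.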
