Description
The bos_token_id doesn't match between the model config and its tokenizer. This happens with the distills that use Qwen as the base model. Opened a discussion here: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/discussions/25
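A minimal sketch to reproduce the mismatch with `transformers` (standard `AutoConfig`/`AutoTokenizer` API; the exact ids reported may vary by model revision):

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# Load the model config and the tokenizer independently and compare BOS ids.
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print("config.bos_token_id:   ", config.bos_token_id)
print("tokenizer.bos_token_id:", tokenizer.bos_token_id)

# Fails on the Qwen-based distills because the two ids disagree.
assert config.bos_token_id == tokenizer.bos_token_id, "bos_token_id mismatch"
```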
It may not fit on device without quantization, but exporting the Llama-based DeepSeek-R1 to ExecuTorch works just fine, e.g. setting model_id to deepseek-ai/DeepSeek-R1-Distill-Llama-8B.