[QUESTION] save_checkpoint with expert_tensor_parallel_size #1719

@jeromeku

Description

Your question
How do I save a sharded checkpoint when using MoE parallel folding -- specifically, when the expert tensor parallel size (ETP) differs from the tensor parallel size (TP)? save_checkpoint appears to support only PP, TP, and EP (as in expert_model_parallel, not expert_tensor_parallel).
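To illustrate why this matters, here is a minimal sketch (hypothetical helper functions, not the Megatron-Core API) of how parallel folding splits the rank space: dense (attention/MLP) weights are sharded over the TP dimension, while expert weights fold the same ranks into (EP, ETP) coordinates. When ETP != TP, a checkpoint layout keyed only on (PP, TP, EP) cannot describe where the expert shards live.

```python
def dense_shard(rank: int, tp: int) -> int:
    """Dense (attention/MLP) weights: sharded over the TP dimension."""
    return rank % tp

def expert_shard(rank: int, etp: int, ep: int) -> tuple:
    """Expert weights: the rank folds into (expert-parallel, expert-TP) coords."""
    etp_rank = rank % etp
    ep_rank = (rank // etp) % ep
    return ep_rank, etp_rank

# Assumed sizes for illustration; ETP != TP is the case in question.
world_size = 8
tp, etp, ep = 4, 2, 2

for rank in range(world_size):
    d = dense_shard(rank, tp)
    e = expert_shard(rank, etp, ep)
    print(f"rank {rank}: dense TP shard {d}, expert (EP, ETP) shard {e}")
```

Note that ranks 0 and 1 hold different dense TP shards but, under folding, rank 2 already repeats expert ETP shard 0 in a different EP group; the two shardings do not line up rank-for-rank, which is exactly what a (PP, TP, EP)-only checkpoint key cannot express.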
