Releases: NVIDIA/Megatron-LM

NVIDIA Megatron Core 0.13.1

12 Aug 18:33
Merge branch 'cherry-pick-f36e1705' into 'core_r0.13.0'

Cherry-pick 'Use ruff linter (3627)' into 'core_r0.13.0'

See merge request ADLR/megatron-lm!3793

NVIDIA Megatron Core 0.14.0rc5

11 Aug 04:12

Prerelease: NVIDIA Megatron Core 0.14.0rc5 (2025-08-11)

NVIDIA Megatron Core 0.12.3

12 Aug 18:12
Merge branch 'chtruong/cherry-pick-3627' into 'core_r0.12.0'

Cherry-pick 'use yaml safe load (3627)' into 'core_r0.12.0'

See merge request ADLR/megatron-lm!3795

NVIDIA Megatron Core 0.14.0rc4

04 Aug 04:12

Prerelease: NVIDIA Megatron Core 0.14.0rc4 (2025-08-04)

NVIDIA Megatron Core 0.14.0rc3

28 Jul 04:13

Prerelease: NVIDIA Megatron Core 0.14.0rc3 (2025-07-28)

NVIDIA Megatron Core 0.13.0

25 Jul 18:04
  • Support bf16 dtype for optimizer states to use the precision-aware optimizer in TransformerEngine
  • MoE
    • Features:
      • Flexible Asymmetric Virtual Pipeline Parallelism with Custom Pipeline Layout (--pipeline-model-parallel-layout)
      • Add support for passing custom parallelism groups to MoE modules.
      • Add Hybrid Shard Data-Parallel support for MoE models (--num-distributed-optimizer-instances)
      • Support EP + custom FSDP training for DeepSeek-V3
      • FP8 support for Multi-Token-Prediction
    • Memory Optimization
      • Fine-grained recomputation to reduce activation memory. (--recompute-modules with --recompute-granularity selective)
      • Memory-efficient token permutation by moving the probs multiplication from unpermutation into the activation function of GroupedMLP.
    • Performance Optimization
      • MLA RoPE fusion kernel and YARN embedding cache.
      • FP8 padding optimization of MoE models by padding the routing map.
    • Bug fixes:
      • Fix the aux loss calculation when expert_bias or group-limited routing is used. This changes load_balancing_loss values compared to the previous version.
      • Fix packed sequence support for MLA
    • Known Issues:
      • MTP is not compatible with the flexible pipeline layout; this will be fixed in !3594.
      • MTP has a convergence issue with TP2; this will be fixed in !3594.
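
A hypothetical launch fragment combining the new 0.13.0 flags listed above. The flag names come from these release notes, but the layout string, module names, and values below are illustrative assumptions, not recommendations from the release:

```shell
# Hypothetical example: only the flag names are from the 0.13.0 notes;
# all values are placeholders and would need tuning for a real run.
PP_LAYOUT="..."   # placeholder: custom (possibly asymmetric) pipeline stage layout string

torchrun --nproc_per_node=8 pretrain_gpt.py \
    --pipeline-model-parallel-layout "${PP_LAYOUT}" \
    --recompute-granularity selective \
    --recompute-modules moe_act layernorm \
    --num-distributed-optimizer-instances 2
```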

NVIDIA Megatron Core 0.14.0rc2

21 Jul 04:12

Prerelease: NVIDIA Megatron Core 0.14.0rc2 (2025-07-21)

NVIDIA Megatron Core 0.13.0rc4

22 Jul 08:03

Prerelease: NVIDIA Megatron Core 0.13.0rc4 (2025-07-22)

NVIDIA Megatron Core 0.13.0rc3

17 Jul 15:04
9b9ea83

Prerelease: NVIDIA Megatron Core 0.13.0rc3 (2025-07-17)

NVIDIA Megatron Core 0.14.0rc1

14 Jul 04:12

Prerelease: NVIDIA Megatron Core 0.14.0rc1 (2025-07-14)