fix(mtp logging): Correctly accumulate MTP loss for logging when log_interval > 1 #1684


Status: Open — wants to merge 1 commit into main

Conversation

Luowaterbi (Contributor)

In `multi_token_prediction.py`, the MTP loss is not correctly accumulated for logging when the `log_interval` is greater than 1.

The current implementation overwrites `total_loss_dict[f"mtp_{i+1} loss"]` at each step. This means only the loss from the logging step itself (e.g., the 10th step if `log_interval=10`) is stored. This single value is then incorrectly divided by `log_interval`, resulting in a reported MTP loss that is artificially low and does not reflect the true average.

This commit modifies the logic to correctly accumulate `f"mtp_{i+1} loss"` across all steps within a logging interval by checking for the key's existence and using addition (`+=`) instead of assignment.

This fix ensures the reported MTP loss is accurate.
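The accumulate-then-average pattern described above can be sketched as follows. This is a minimal illustration of the fix, not the actual Megatron-LM code: the helper names `track_mtp_loss` and `report_avg` and the dict layout are hypothetical, though the `f"mtp_{i+1} loss"` key format matches the PR description.

```python
def track_mtp_loss(total_loss_dict, mtp_depth, step_losses):
    """Accumulate per-depth MTP losses across every step in a logging interval.

    Buggy version: total_loss_dict[key] = step_losses[i]
    (overwrites the running value, so only the last step survives).
    """
    for i in range(mtp_depth):
        key = f"mtp_{i+1} loss"
        if key in total_loss_dict:
            total_loss_dict[key] += step_losses[i]  # accumulate across steps
        else:
            total_loss_dict[key] = step_losses[i]   # first step of the interval


def report_avg(total_loss_dict, key, log_interval):
    """At a logging step, average the accumulated sum and reset for the next interval."""
    avg = total_loss_dict[key] / log_interval
    total_loss_dict[key] = 0.0
    return avg
```

With accumulation, ten steps of constant loss 2.0 at `log_interval=10` report an average of 2.0; the overwriting version would divide a single step's 2.0 by 10 and report 0.2.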

yanring (Collaborator) commented Jul 21, 2025

Thank you for the PR. We'll help merge it.
