Hi, could someone explain why during training we use `optimization_strategy="qlora"`, but during prediction we load the trained checkpoint and then use `optimization_strategy="None"`?
Hi @dcfabian 👋🏻 Good question!
During training, we use an optimization strategy like LoRA or QLoRA to help the model learn efficiently. QLoRA quantizes the frozen base weights (typically to 4-bit) and trains only small low-rank adapter matrices, which reduces memory usage and speeds up training without a meaningful loss in performance. This makes the training process much more resource-friendly.
Once training is complete, however, the model has already learned the necessary patterns and relationships. For prediction (inference), we simply load the trained checkpoint and run the model as-is; there's no need for the training-specific optimizations. That's why `optimization_strategy` is set to `"None"` during prediction.
In short, QLoRA is useful for improving the training process, but once training is done, the model doesn't need those extra adjustments during inference.
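
To make the contrast concrete, here's a minimal sketch of the same pattern using Hugging Face `transformers` + `peft`. This is not necessarily the exact API of this project, and the model name and hyperparameters are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel, get_peft_model

# --- Training: QLoRA = 4-bit quantized base model + trainable LoRA adapters ---
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # do the matmuls in bf16
)
base = AutoModelForCausalLM.from_pretrained(
    "some-base-model",                      # placeholder model name
    quantization_config=bnb_config,
)
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, lora)          # only the small adapter matrices train
# ... run your training loop, then:
model.save_pretrained("checkpoint/")

# --- Prediction: load the checkpoint as-is, no QLoRA setup required ---
base = AutoModelForCausalLM.from_pretrained("some-base-model")
model = PeftModel.from_pretrained(base, "checkpoint/")
model = model.merge_and_unload()            # fold adapters into the base weights
# model.generate(...) now behaves like an ordinary model
```

The quantization config and adapter setup only matter while gradients are flowing; after merging, the checkpoint is just a regular model, which is exactly why inference can run with `optimization_strategy="None"`.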