
Commit 21cb353

casinca and stevhliu authored
[docs] Typos - Single GPU efficient training features (huggingface#38964)
* Typos
  - corrected bf16 training argument
  - corrected header for SDPA
* improved readability for SDPA, suggested by @stevhliu

Co-authored-by: Steven Liu <[email protected]>
1 parent f9be71b commit 21cb353

1 file changed (+2, -2 lines)

docs/source/en/perf_train_gpu_one.md

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ Refer to the table below to quickly help you identify the features relevant to y
 | data preloading | yes | no |
 | torch_empty_cache_steps | no | yes |
 | torch.compile | yes | no |
-| PEFT | no | yes |
+| scaled dot production attention (SDPA) | yes | yes |

 ## Trainer
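The new table row refers to PyTorch's scaled dot product attention. For reference, a minimal sketch of how SDPA is typically requested when loading a model; the checkpoint name below is only a placeholder and is not part of this commit:

```py
from transformers import AutoModelForCausalLM

# Explicitly request PyTorch's scaled dot product attention kernel;
# "openai-community/gpt2" is just an example checkpoint for illustration.
model = AutoModelForCausalLM.from_pretrained(
    "openai-community/gpt2",
    attn_implementation="sdpa",
)
```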

@@ -128,7 +128,7 @@ fp16 isn't memory-optimized because the gradients that are computed in fp16 are

 [bf16](https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus) trades off some precision for a much larger dynamic range, which is helpful for avoiding overflow and underflow errors. You can use bf16 without adding any loss scaling methods like you would with fp16. bf16 is supported by NVIDIAs Ampere architecture or newer.

-Configure [`~TrainingArguments.fp16`] in [`TrainingArguments`] to enable mixed precision training with the bf16 data type.
+Configure [`~TrainingArguments.bf16`] in [`TrainingArguments`] to enable mixed precision training with the bf16 data type.

 ```py
 from transformers import TrainingArguments
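For context, the corrected `bf16` argument is used as follows; a minimal sketch assuming a Trainer-based setup, with a placeholder output directory:

```py
from transformers import TrainingArguments

# bf16=True enables bfloat16 mixed precision training
# (supported on NVIDIA Ampere GPUs or newer);
# "output_dir" is only a placeholder path for this sketch.
training_args = TrainingArguments(
    output_dir="output_dir",
    bf16=True,
)
```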
