Commit 906cac7 (parent 8e9f80b)

Improve setup guide (#31)

Authored-by: guangy10 (Guang Yang)
Co-authored-by: Guang Yang <[email protected]>
Co-authored-by: Ella Charlaix <[email protected]>

1 file changed: README.md (82 additions, 20 deletions)

## ⚡ Quick Installation

### 1. Create a virtual environment
Install [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) on your machine. Then, create a virtual environment to manage our dependencies.
```
conda create -n optimum-executorch python=3.11
conda activate optimum-executorch
```
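
To double-check that the new environment is active before installing anything, a quick interpreter check (nothing project-specific is assumed here) helps:

```
# Should report Python 3.11.x from the optimum-executorch environment
python --version
```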

### 2. Install optimum-executorch from source
```
git clone https://github.com/huggingface/optimum-executorch.git
cd optimum-executorch
pip install .
```

- 🔜 Install from PyPI coming soon...
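
As a quick smoke test (assuming the `pip install` step above succeeded), verify that the package imports from the active environment:

```
# The import only succeeds if optimum-executorch was installed correctly
python -c "from optimum.executorch import ExecuTorchModelForCausalLM; print('ok')"
```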

### [Optional] 3. Install dependencies in dev mode
You can install `executorch` and `transformers` from source, giving you access to newly added ExecuTorch-compatible models in `transformers` and to the latest `executorch` features, as both repos are under rapid development.

Follow these steps manually:

#### 3.1. Clone and Install ExecuTorch from Source
From the root directory where `optimum-executorch` is cloned:
```
# Clone the ExecuTorch repository
git clone https://github.com/pytorch/executorch.git
cd executorch
# Checkout the stable branch to ensure stability
git checkout viable/strict
# Install ExecuTorch
bash ./install_executorch.sh
cd ..
```

#### 3.2. Clone and Install Transformers from Source
From the root directory where `optimum-executorch` is cloned:
```
# Clone the Transformers repository
git clone https://github.com/huggingface/transformers.git
cd transformers
# Install Transformers in editable mode
pip install -e .
cd ..
```
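
After both source installs, a version check (a minimal sketch using standard package metadata; the pip distribution names are assumed to be `executorch` and `transformers`) shows what the environment actually resolved:

```
# Source installs typically report a dev or nightly-style version suffix
python -c "from importlib.metadata import version; print(version('executorch'), version('transformers'))"
```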

## 🎯 Quick Start

There are two ways to use Optimum ExecuTorch:

### Option 1: Export and Load in One Python API
```python
from optimum.executorch import ExecuTorchModelForCausalLM
from transformers import AutoTokenizer

# Load and export the model on-the-fly
model_id = "meta-llama/Llama-3.2-1B"
model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")

# Generate text right away
tokenizer = AutoTokenizer.from_pretrained(model_id)
generated_text = model.text_generation(
    tokenizer=tokenizer,
    prompt="Simply put, the theory of relativity states that",
    max_seq_len=128
)
print(generated_text)
```

> **Note:** If an ExecuTorch model is already cached on the Hugging Face Hub, the API will automatically skip the export step and load the cached `.pte` file. To test this, replace the `model_id` in the example above with `"executorch-community/SmolLM2-135M"`, where the `.pte` file is pre-cached. Additionally, the `.pte` file can be directly associated with the eager model, as demonstrated in this [example](https://huggingface.co/optimum-internal-testing/tiny-random-llama/tree/executorch).
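
For instance, this minimal variation of the example above (the same `from_pretrained` call; the cached artifact is assumed to still be published under that model id) loads the pre-cached model:

```python
from optimum.executorch import ExecuTorchModelForCausalLM

# The .pte for this model id is pre-cached on the Hub, so the export step is skipped
model = ExecuTorchModelForCausalLM.from_pretrained(
    "executorch-community/SmolLM2-135M", recipe="xnnpack"
)
```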

### Option 2: Export and Load Separately

#### Step 1: Export your model
Use the CLI tool to convert your model to ExecuTorch format:
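
The command itself falls in the unchanged region this diff skips; as a rough sketch only (subcommand and flags assumed from the usual `optimum-cli export` pattern, not taken from this commit):

```
# Hypothetical invocation; check `optimum-cli export executorch --help` for the real flags
optimum-cli export executorch --model "meta-llama/Llama-3.2-1B" --recipe "xnnpack" --output_dir "llama3_2_1b"
```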

```python
# ...(the unchanged beginning of the Step 2 example, starting at
# "generated_text = model.text_generation(", is elided by the diff)
    prompt="Simply put, the theory of relativity states that",
    max_seq_len=128
)
print(generated_text)
```

## Supported Models and Backend

**Optimum-ExecuTorch** currently supports the following transformer models:

- [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) and its variants
- [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) and its variants
- [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) and its variants
- [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) and its variants
- [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) and its variants
- [allenai/OLMo-1B-hf](https://huggingface.co/allenai/OLMo-1B-hf) and its variants

*Note: This list is continuously expanding; more models and variants will be added as support grows.*

**Supported Backend:**

Currently, **Optimum-ExecuTorch** supports only the [XNNPACK Backend](https://pytorch.org/executorch/main/backends-xnnpack.html) for efficient CPU execution on mobile devices. Quantization support for XNNPACK will be added shortly.

For a comprehensive overview of all backends supported by ExecuTorch, please refer to the [ExecuTorch Backend Overview](https://pytorch.org/executorch/main/backends-overview.html).

## 🛠️ Advanced Usage

Check our [ExecuTorch GitHub repo](https://github.com/pytorch/executorch) directly for:
- More backends and performance optimization options
- Deployment guides for Android, iOS, and embedded devices
- Additional examples and benchmarks

## 🤝 Contributing