Commit 1a26031

Author: Guang Yang (committed)

Improve setup guide

1 parent 6a7e83f commit 1a26031

File tree

2 files changed: +88 −26 lines changed


README.md

Lines changed: 81 additions & 20 deletions
````diff
@@ -20,7 +20,14 @@ Optimum ExecuTorch enables efficient deployment of transformer models using Meta
 
 ## ⚡ Quick Installation
 
-Install from source:
+### 1. Create a virtual environment:
+Install [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) on your machine. Then, create a virtual environment to manage our dependencies.
+```
+conda create -n optimum-executorch python=3.11
+conda activate optimum-executorch
+```
+
+### 2. Install optimum-executorch from source:
 ```
 git clone https://github.com/huggingface/optimum-executorch.git
 cd optimum-executorch
````
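A quick way to confirm the install landed in the new environment (a hedged sketch; the distribution name `optimum-executorch` is assumed from the repository name, and the interpreter check mirrors the `python=3.11` pin above):

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# The conda env created above pins Python 3.11.
assert sys.version_info[:2] == (3, 11), f"unexpected interpreter: {sys.version}"

# Distribution name assumed to match the repository name.
try:
    print("optimum-executorch", version("optimum-executorch"))
except PackageNotFoundError:
    print("optimum-executorch is not installed in this environment")
```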
````diff
@@ -29,11 +36,64 @@ pip install .
 
 - 🔜 Install from pypi coming soon...
 
+### [Optional] 3. Install dependencies in dev mode
+You can install `executorch` and `transformers` from source, giving you access to new
+ExecuTorch-compatible models from `transformers` and new features from `executorch`,
+as both repos are under rapid development.
+
+Follow these steps manually:
+
+#### 3.1. Clone and Install ExecuTorch from Source
+From the root directory where `optimum-executorch` is cloned:
+```
+# Clone the ExecuTorch repository
+git clone https://github.com/pytorch/executorch.git
+cd executorch
+# Check out the stable branch to ensure stability
+git checkout viable/strict
+# Install ExecuTorch
+bash ./install_executorch.sh
+cd ..
+```
+
+#### 3.2. Clone and Install Transformers from Source
+From the root directory where `optimum-executorch` is cloned:
+```
+# Clone the Transformers repository
+git clone https://github.com/huggingface/transformers.git
+cd transformers
+# Install Transformers in editable mode
+pip install -e .
+cd ..
+```
````
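If you take the optional dev-mode route, a small check like the one below can confirm that both source installs are visible to the environment (a sketch; `importlib.metadata` is standard library, and both packages register ordinary pip metadata even when installed from source):

```python
from importlib.metadata import version

# Report the versions picked up from the two source installs above.
for dist in ("executorch", "transformers"):
    print(f"{dist}: {version(dist)}")
```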
````diff
+
 ## 🎯 Quick Start
 
 There are two ways to use Optimum ExecuTorch:
 
-### Option 1: Export and Load Separately
+### Option 1: Export and Load in One Python API
+```python
+from optimum.executorch import ExecuTorchModelForCausalLM
+from transformers import AutoTokenizer
+
+# Load and export the model on-the-fly
+model_id = "meta-llama/Llama-3.2-1B"
+model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")
+
+# Generate text right away
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+generated_text = model.text_generation(
+    tokenizer=tokenizer,
+    prompt="Simply put, the theory of relativity states that",
+    max_seq_len=128
+)
+print(generated_text)
+```
+
+> **Note:** If an ExecuTorch model is already cached on the Hugging Face Hub, the API will automatically skip the export step and load the cached `.pte` file. To test this, replace the `model_id` in the example above with `"executorch-community/SmolLM2-135M"`, where the `.pte` file is pre-cached. Additionally, the `.pte` file can be directly associated with the eager model, as demonstrated in this [example](https://huggingface.co/optimum-internal-testing/tiny-random-llama/tree/executorch).
````
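To make the note above concrete, here is the same Option 1 flow pointed at the pre-cached repo (a sketch; it assumes `recipe="xnnpack"` is still passed exactly as in Option 1):

```python
from optimum.executorch import ExecuTorchModelForCausalLM
from transformers import AutoTokenizer

# A .pte file is pre-cached on the Hub for this repo, so the export step is skipped.
model_id = "executorch-community/SmolLM2-135M"
model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(model.text_generation(
    tokenizer=tokenizer,
    prompt="Simply put, the theory of relativity states that",
    max_seq_len=128,
))
```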
````diff
+
+
+### Option 2: Export and Load Separately
 
 #### Step 1: Export your model
 Use the CLI tool to convert your model to ExecuTorch format:
````
````diff
@@ -61,33 +121,34 @@ generated_text = model.text_generation(
     prompt="Simply put, the theory of relativity states that",
     max_seq_len=128
 )
+print(generated_text)
 ```
 
-### Option 2: Python API
-```python
-from optimum.executorch import ExecuTorchModelForCausalLM
-from transformers import AutoTokenizer
+## Supported Models
 
-# Load and export the model on-the-fly
-model_id = "meta-llama/Llama-3.2-1B"
-model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")
+Optimum ExecuTorch currently supports the following transformer models:
+
+- [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) (and its variants)
+- [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) (and its variants)
+- [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) (and its variants)
+- [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) (and its variants)
+- [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) (and its variants)
+- [allenai/OLMo-1B-hf](https://huggingface.co/allenai/OLMo-1B-hf) (and its variants)
+
+*Note: This list is continuously expanding; more models and variants will be added.*
+
+
+## Supported Recipes
+
+Optimum ExecuTorch currently supports only the [`XNNPACK` Backend](https://pytorch.org/executorch/main/backends-xnnpack.html).
 
-# Generate text right away
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-generated_text = model.text_generation(
-    tokenizer=tokenizer,
-    prompt="Simply put, the theory of relativity states that",
-    max_seq_len=128
-)
-```
 
 ## 🛠️ Advanced Usage
 
 Check our [ExecuTorch GitHub repo](https://github.com/pytorch/executorch) directly for:
-- Custom model export configurations
-- Performance optimization guides
+- More backends and performance optimization options
 - Deployment guides for Android, iOS, and embedded devices
-- Additional examples
+- Additional examples and benchmarks
 
 ## 🤝 Contributing
````
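Because every entry in the Supported Models list above flows through the same `from_pretrained(..., recipe="xnnpack")` path shown in Option 1, a smoke test over the whole list can be sketched as follows (model IDs copied from the list; gated repos such as the Llama and Gemma variants additionally require Hub credentials):

```python
from optimum.executorch import ExecuTorchModelForCausalLM

# Model IDs copied from the "Supported Models" list in the README diff above.
SUPPORTED_MODELS = [
    "meta-llama/Llama-3.2-1B",
    "HuggingFaceTB/SmolLM2-135M",
    "Qwen/Qwen2.5-0.5B",
    "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    "google/gemma-2-2b",
    "allenai/OLMo-1B-hf",
]

for model_id in SUPPORTED_MODELS:
    # XNNPACK is currently the only supported recipe.
    model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")
    print(f"exported and loaded: {model_id}")
```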
setup.py

Lines changed: 7 additions & 6 deletions
```diff
@@ -12,18 +12,19 @@
         assert False, "Error: Could not open '%s' due %s\n" % (filepath, error)
 
 INSTALL_REQUIRE = [
-    "optimum~=1.24",
+    "accelerate>=0.26.0",
+    "datasets",
     "executorch>=0.4.0",
+    "optimum~=1.24",
+    "safetensors",
+    "sentencepiece",
+    "tiktoken",
     "transformers>=4.46",
 ]
 
 TESTS_REQUIRE = [
-    "accelerate>=0.26.0",
-    "pytest",
     "parameterized",
-    "sentencepiece",
-    "datasets",
-    "safetensors",
+    "pytest",
 ]
```
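For context, these two lists are conventionally wired into `setuptools` roughly as below (a sketch; the actual `setup()` call is outside this hunk, and the package name and extras key are assumptions):

```python
from setuptools import find_packages, setup

# Dependency lists as reorganized in the diff above (abbreviated here).
INSTALL_REQUIRE = ["accelerate>=0.26.0", "datasets", "executorch>=0.4.0",
                   "optimum~=1.24", "safetensors", "sentencepiece",
                   "tiktoken", "transformers>=4.46"]
TESTS_REQUIRE = ["parameterized", "pytest"]

setup(
    name="optimum-executorch",                # assumed; not shown in this hunk
    packages=find_packages(),
    install_requires=INSTALL_REQUIRE,         # runtime deps, now including accelerate
    extras_require={"tests": TESTS_REQUIRE},  # extras key "tests" is an assumption
)
```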
