### Description
When loading the cached model for inference with the example code below:

```python
from optimum.executorch import ExecuTorchModelForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(f"tokenizer: bos_token_id={tokenizer.bos_token_id}, eos_token_id={tokenizer.eos_token_id}")
model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack", revision="executorch")
print(f"model: bos_token_id={model.bos_token_id}, eos_token_id={model.eos_token_id}")
```
I got:

```
  File "/Users/guangyang/optimum-executorch/optimum/executorch/modeling.py", line 351, in text_generation
    raise ValueError(
ValueError: The tokenizer's bos_token_id=1 must be the same as the model's bos_token_id=0.
```
The debug prints show `tokenizer: bos_token_id=1, eos_token_id=2` and `model: bos_token_id=0, eos_token_id=1`. @echarlaix this looks like a bug in the tokenizer for this model.
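To narrow down which side is wrong, a quick check is to look at which tokens the tokenizer actually maps to the disputed IDs. This is just a sketch run against the same `model_id` as above; the vocab layout mentioned in the comment is an assumption, not something verified for this model:

```python
# Print the tokens sitting at the disputed IDs. In a Llama-style vocab the
# layout is typically <unk>=0, <s>=1, </s>=2; if that holds here, the
# tokenizer's bos=1/eos=2 would be correct and the exported model's
# metadata (bos=0, eos=1) would be the off-by-one side.
print(tokenizer.convert_ids_to_tokens([0, 1, 2]))
print(tokenizer.bos_token, tokenizer.bos_token_id)
print(tokenizer.eos_token, tokenizer.eos_token_id)
```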