Description
🚀 Feature
Another UX improvement, similar to loading a cached model from the hub (#15).
Today, when users run `ExecuTorchModelForXxx.from_pretrained(model_id, export=True)`, the converted model is saved to the local filesystem. When users rerun this command, it goes through the entire stack again: fetching the model from the hub, exporting it to ExecuTorch, saving it to the local filesystem, etc. This is inefficient because users commonly invoke the `from_pretrained` API multiple times in different places in their scripts. Just like loading a `from_pretrained` model in transformers, I'm wondering if it's possible to load the cached model directly from the local filesystem when it exists. From the API perspective, I think we can introduce a new flag for users to control this explicitly.
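To illustrate the idea, here is a minimal sketch of the proposed caching behavior. The flag name (`use_local_cache`), the cache layout, and the helper function are all assumptions for illustration, not the real optimum-executorch API; the export step is stubbed out.

```python
import os
import tempfile

def load_or_export(model_id: str, cache_dir: str, use_local_cache: bool = True) -> str:
    """Return the path to an exported model, reusing a local copy if present.

    Hypothetical sketch: `use_local_cache` is the proposed opt-in flag.
    """
    # One cache entry per model id, mirroring the hub's "org--name" convention.
    model_path = os.path.join(cache_dir, model_id.replace("/", "--"), "model.pte")
    if use_local_cache and os.path.exists(model_path):
        # Cache hit: skip fetching from the hub and re-exporting entirely.
        return model_path
    # Cache miss (or caching disabled): fetch + export, then persist for reuse.
    os.makedirs(os.path.dirname(model_path), exist_ok=True)
    with open(model_path, "wb") as f:
        f.write(b"\x00")  # stand-in for the exported ExecuTorch program
    return model_path
```

With this, a second call with the same `model_id` would return immediately from the local cache instead of re-running the export pipeline.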
CC: @echarlaix