Thanks for the great work!
It's a very clever way to compute the embeddings beforehand and use them directly as target values during the backpropagation step.
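Just to make sure I understand the approach correctly, here is a minimal sketch of how I picture one training step, assuming a PyTorch setup; the function and variable names (`distill_step`, `cached_teacher_embeddings`, etc.) are my own placeholders, not the actual API of this repo:

```python
import torch
import torch.nn as nn

def distill_step(student, images, cached_teacher_embeddings, optimizer):
    """One training step: regress student outputs onto precomputed teacher embeddings."""
    student_embeddings = student(images)          # (batch, dim)
    # Teacher embeddings were computed once, offline, and are just looked up here.
    loss = nn.functional.mse_loss(student_embeddings, cached_teacher_embeddings)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Please correct me if the actual loss or target handling differs.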
Questions
- Have you done any testing to find out how well the distilled model performs compared to the original teacher model?
- If we use Vision Transformer (ViT) models as the base, should we expect any improvement in embedding quality?
- Instead of using the distilled model for a classification task by computing the `probs`, how well does it perform if we want to use the raw embeddings to rank images by cosine distance (roughly what I sketch below)?
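For the last question, this is the kind of ranking I have in mind; it is only an illustrative sketch with hypothetical names and shapes, not code from this repo:

```python
import torch
import torch.nn.functional as F

def rank_by_cosine(query_embedding, gallery_embeddings):
    """Return gallery indices sorted from most to least similar to the query."""
    query = F.normalize(query_embedding, dim=-1)        # (dim,)
    gallery = F.normalize(gallery_embeddings, dim=-1)   # (num_images, dim)
    similarities = gallery @ query                      # cosine similarity per gallery image
    return torch.argsort(similarities, descending=True)
```

I'm mainly wondering whether the distilled embeddings preserve the teacher's neighborhood structure well enough for this kind of retrieval, or whether they are only reliable after the classification head.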