- the data source is the [Cohere/wikipedia-22-12-de-embeddings](https://huggingface.co/datasets/Cohere/wikipedia-22-12-de-embeddings) dataset on the Hugging Face Hub
- we took the `wiki_id`, `title` and `text` fields
- did some normalization and filtering
- and merged the texts to an appropriate token count
- details can be found in the respective notebooks
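
The steps listed above could look roughly like the following. This is a minimal sketch, not the code from the notebooks: the helper names `normalize`, `count_tokens` and `merge_texts`, the whitespace-based token count, and the `max_tokens` / `min_tokens` defaults are all assumptions made for illustration. It also assumes that rows belonging to the same `wiki_id` arrive consecutively.

```python
import re
from itertools import groupby


def normalize(text: str) -> str:
    """Collapse whitespace runs (a stand-in for the real normalization)."""
    return re.sub(r"\s+", " ", text).strip()


def count_tokens(text: str) -> int:
    """Whitespace token count (an assumption, not the tokenizer actually used)."""
    return len(text.split())


def merge_texts(rows, max_tokens=64, min_tokens=1):
    """Merge consecutive texts of one article into chunks of at most max_tokens.

    A single text longer than max_tokens is kept as its own chunk, not split.
    """
    merged = []
    # groupby only groups adjacent rows, so rows must be ordered by wiki_id
    for wiki_id, group in groupby(rows, key=lambda r: r["wiki_id"]):
        title, buffer, size = None, [], 0
        for row in group:
            title = row["title"]
            text = normalize(row["text"])
            n = count_tokens(text)
            if n < min_tokens:
                continue  # hypothetical filter: drop empty or too-short texts
            if buffer and size + n > max_tokens:
                # adding this text would exceed the budget: flush the chunk
                merged.append({"wiki_id": wiki_id, "title": title, "text": " ".join(buffer)})
                buffer, size = [], 0
            buffer.append(text)
            size += n
        if buffer:
            merged.append({"wiki_id": wiki_id, "title": title, "text": " ".join(buffer)})
    return merged
```

In practice `rows` would be built from the `wiki_id`, `title` and `text` fields of the dataset, and `count_tokens` would be swapped for the tokenizer of the target model.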