Manually replacing vocabulary entries #861
SerenNoble
started this conversation in
General
Replies: 2 comments 1 reply
-
You can modify the vocabulary manually by following the approach in https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_tokenizer/merge_tokenizers.py
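A minimal sketch of an in-place replacement, assuming the tokenizer is stored as a SentencePiece model file, as in the linked script. `replace_piece` and `patch_tokenizer` are hypothetical helpers, not part of that script, and the file paths passed to them are placeholders. The idea is to overwrite one existing entry instead of appending a new one, so the vocabulary size does not change.

```python
def replace_piece(model_proto, old_piece, new_piece, new_score=None):
    """Overwrite one vocabulary entry in place; the vocabulary size is unchanged.

    `model_proto` only needs a `.pieces` sequence whose items have `.piece`
    and `.score` attributes, as sentencepiece's ModelProto does.
    """
    existing = {p.piece for p in model_proto.pieces}
    if new_piece in existing:
        raise ValueError(f"{new_piece!r} is already in the vocabulary")
    for index, entry in enumerate(model_proto.pieces):
        if entry.piece == old_piece:
            entry.piece = new_piece
            if new_score is not None:
                entry.score = new_score
            return index  # the new piece reuses this token id
    raise KeyError(f"{old_piece!r} not found in the vocabulary")


def patch_tokenizer(src_path, dst_path, old_piece, new_piece, new_score=None):
    """Load a SentencePiece model file, replace one piece, and save a copy."""
    # Imported lazily so replace_piece works without sentencepiece installed.
    from sentencepiece import sentencepiece_model_pb2 as sp_pb2

    proto = sp_pb2.ModelProto()
    with open(src_path, "rb") as f:
        proto.ParseFromString(f.read())
    index = replace_piece(proto, old_piece, new_piece, new_score)
    with open(dst_path, "wb") as f:
        f.write(proto.SerializeToString())
    return index
```

The returned index is the token id that the new piece inherits. Its embedding row still holds the weights learned for the old token, so it needs further training before it is useful.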
1 reply
-
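Thank you for your reply. I followed your code and replaced entries in the vocabulary, but one problem remains. For example, the original vocabulary contains "U", "L", and "ONG". I replaced a low-frequency token in the old vocabulary with "ULONG" and changed its score, but with the new vocabulary "ULONG" is still split into "U", "L", "ONG".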
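One possible explanation, offered as an assumption rather than a confirmed diagnosis: the LLaMA tokenizer is a SentencePiece BPE model, and BPE encoding builds a token by repeatedly merging two adjacent symbols whose concatenation is in the vocabulary, taking the highest-scoring candidate each time. If no pair of existing pieces concatenates to "ULONG" (for example "UL" + "ONG" or "U" + "LONG"), no merge step can ever produce it, whatever its score. The toy encoder below is a simplified illustration of that merge rule, not SentencePiece's implementation, and the scores are made up.

```python
def bpe_encode(text, scores):
    """Greedy BPE: repeatedly merge the adjacent pair with the best score."""
    symbols = list(text)  # start from single characters
    while True:
        best = None  # (position, merged piece) of the best candidate so far
        for i in range(len(symbols) - 1):
            merged = symbols[i] + symbols[i + 1]
            if merged in scores and (best is None or scores[merged] > scores[best[1]]):
                best = (i, merged)
        if best is None:
            return symbols  # no adjacent pair forms a known piece
        i, merged = best
        symbols[i:i + 2] = [merged]


# Hypothetical scores; a higher score means the merge is preferred.
vocab = {"U": -1.0, "L": -1.0, "O": -1.0, "N": -1.0, "G": -1.0,
         "NG": -2.0, "ONG": -3.0, "ULONG": -0.5}

print(bpe_encode("ULONG", vocab))  # ['U', 'L', 'ONG']

# Adding an intermediate piece makes "ULONG" reachable.
vocab_with_bridge = dict(vocab, LONG=-4.0)
print(bpe_encode("ULONG", vocab_with_bridge))  # ['ULONG']
```

If this is indeed the cause, a fixed-size vocabulary would have to give up additional low-frequency slots for the intermediate pieces.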
-
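Is it possible to manually replace low-frequency tokens in the llama2 vocabulary and then train the model without changing the vocabulary size?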