[QUESTION] Why does GPTDataset not directly cache the document_index and sample_index for all samples, and then construct a different shuffle_index for each set of parameters? #102
