[QUESTION] Why does GPTDataset not cache document_index and sample_index once for all samples, and then construct a different shuffle_index for each set of parameters? #1139

@eviler007

Your question
Why does GPTDataset not cache document_index and sample_index once for all samples, and then construct a different shuffle_index for each set of parameters (for example, a different seed)?
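To make the proposal concrete, here is a minimal sketch of the idea. It is not Megatron-LM's actual implementation: `build_cached_indices` and `build_shuffle_index` are hypothetical names, and the sketch assumes a single epoch, an unshuffled document order, and non-overlapping samples of exactly `seq_length` tokens.

```python
import bisect
import random
from itertools import accumulate


def build_cached_indices(doc_lengths, seq_length):
    """Hypothetical helper: build the two seed-independent indices once."""
    # Document order for one epoch, left unshuffled (assumption of this sketch).
    document_index = list(range(len(doc_lengths)))
    # Global token offset at which each document starts.
    doc_starts = [0] + list(accumulate(doc_lengths))[:-1]
    num_samples = sum(doc_lengths) // seq_length
    sample_index = []
    for i in range(num_samples):
        start = i * seq_length
        # Position in document_index of the document holding the first token.
        pos = bisect.bisect_right(doc_starts, start) - 1
        sample_index.append((pos, start - doc_starts[pos]))
    return document_index, sample_index


def build_shuffle_index(num_samples, seed):
    """Hypothetical helper: the only piece rebuilt per seed."""
    order = list(range(num_samples))
    random.Random(seed).shuffle(order)
    return order


# Build (and, in practice, cache) the expensive indices once ...
document_index, sample_index = build_cached_indices([5, 3, 4], seq_length=4)
# ... then derive a cheap shuffle_index for each seed.
shuffle_a = build_shuffle_index(len(sample_index), seed=1234)
shuffle_b = build_shuffle_index(len(sample_index), seed=5678)
```

In this sketch only `shuffle_index` depends on the seed, so the cached `document_index` and `sample_index` could be shared across runs that differ only in that parameter.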
