Bug in PLE #52

Open
@Xiaopengli1

Description

Running PLE with the aliccp MTL example exposes the following problem:

        for ple_out, tower, predict_layer in zip(ple_outs, self.towers, self.predict_layers):
            tower_out = tower(ple_out)  #[batch_size, 1]

However, the actual shape of tower_out is [batch_size, 8].

So I wonder whether, in the PLE model initialization, the following

        self.towers = nn.ModuleList(
            MLP(expert_params["dims"][-1], output_layer=False, **tower_params_list[i]) for i in range(self.n_task))

should be changed to

        self.towers = nn.ModuleList(
            MLP(expert_params["dims"][-1], output_layer=True, **tower_params_list[i]) for i in range(self.n_task))
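To make the shape reasoning concrete, here is a minimal pure-Python sketch. It rests on an assumption that is not confirmed in this issue: that `MLP` stacks one linear layer per entry in `dims` and appends a final width-1 linear layer only when `output_layer=True`. The helper `mlp_output_width` and the values `expert_out_dim = 16` and `tower_dims = [8]` are hypothetical, chosen only to match the observed `[batch_size, 8]`.

```python
def mlp_output_width(input_dim, dims, output_layer):
    """Last-dimension width produced by a hypothetical MLP.

    Assumption: one linear layer per entry in `dims`, plus a final
    linear layer of width 1 when `output_layer` is True.
    """
    width = input_dim
    for hidden in dims:
        width = hidden  # each hidden layer sets the new width
    if output_layer:
        width = 1  # the extra output layer projects to a single logit
    return width


expert_out_dim = 16  # assumed value of expert_params["dims"][-1]
tower_dims = [8]     # assumed tower hidden sizes

# Width of tower_out per sample without and with the output layer.
print(mlp_output_width(expert_out_dim, tower_dims, output_layer=False))
print(mlp_output_width(expert_out_dim, tower_dims, output_layer=True))
```

Under these assumptions, the first call yields the tower's last hidden width and the second yields 1, which would match the `[batch_size, 1]` noted in the comment.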

@morningsky Corrections are welcome.
