A specialized reasoning-focused MoE model based on Qwen3-30B-A3B
Mini-Hydra is a Mixture-of-Experts (MoE) language model designed for efficient reasoning and faster conclusion generation. Built upon the Qwen3-30B-A3B architecture, this model aims to bridge the performance gap between sparse MoE models and their dense counterparts while maintaining computational efficiency.
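For orientation, here is a minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and loads through standard `transformers` support for Qwen3-style MoE models; the repository ID used below is a placeholder, not confirmed by this README:

```python
# Minimal inference sketch. The repository ID is a placeholder;
# substitute the actual Mini-Hydra model ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Daemontatox/Mini-Hydra"  # hypothetical ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # place layers across available devices
)

messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that Qwen3-30B-A3B activates only about 3B of its roughly 30B parameters per token, so the memory footprint is that of a 30B model while per-token compute is closer to a 3B dense model.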
The model was trained on a carefully curated combination of reasoning-focused datasets (a data-mix sketch follows the list):
- **Tesslate/Gradient-Reasoning**: Advanced reasoning problems with step-by-step solutions
- **Daemontatox/curated_thoughts_convs**: Curated conversational data emphasizing thoughtful responses
- **Daemontatox/natural_reasoning**: Natural language reasoning examples and explanations
- **Daemontatox/numina_math_cconvs**: Mathematical conversation and problem-solving data
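As a rough illustration of how such a mix might be assembled with the Hugging Face `datasets` library; the split names, uniform sampling weights, and schema handling below are assumptions, not the actual training recipe:

```python
# Illustrative data-mix assembly; weights, splits, and schema handling
# are assumptions, not Mini-Hydra's actual training recipe.
from datasets import load_dataset, interleave_datasets

sources = [
    "Tesslate/Gradient-Reasoning",
    "Daemontatox/curated_thoughts_convs",
    "Daemontatox/natural_reasoning",
    "Daemontatox/numina_math_cconvs",
]

# Load each source's train split (split name is an assumption).
parts = [load_dataset(name, split="train") for name in sources]

# interleave_datasets requires identical features across sources, so in
# practice each dataset would first be mapped into a shared schema
# (e.g. a single chat-format column); that step is source-specific and
# omitted here.
mixed = interleave_datasets(
    parts,
    probabilities=[0.25, 0.25, 0.25, 0.25],  # uniform sampling weights
    seed=42,
    stopping_strategy="all_exhausted",  # oversample shorter sets until all are seen
)
print(mixed)
```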