Model Merging, Mixtures of Experts, and Towards Smaller LLMs

Load Balancing (Auxiliary Loss): an auxiliary loss term added during training to prevent the router from collapsing onto the same few experts
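One widely used form of this auxiliary loss comes from the Switch Transformer: it multiplies, per expert, the fraction of tokens routed to that expert by the mean routing probability assigned to it, so the loss is minimized when tokens are spread evenly. A minimal NumPy sketch (function name and shapes are illustrative, not from the original post):

```python
import numpy as np

def load_balancing_loss(router_logits, num_experts):
    """Switch Transformer style auxiliary load-balancing loss.

    router_logits: array of shape (num_tokens, num_experts) with raw gate scores.
    Returns N * sum_i f_i * P_i, where f_i is the fraction of tokens whose
    top-1 expert is i and P_i is the mean routing probability for expert i.
    The value is ~1.0 for perfectly balanced routing and approaches N
    when all tokens collapse onto a single expert.
    """
    # Numerically stable softmax over the expert dimension
    z = router_logits - router_logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # f_i: fraction of tokens dispatched (top-1) to each expert
    top1 = probs.argmax(axis=-1)
    f = np.bincount(top1, minlength=num_experts) / len(top1)
    # P_i: mean routing probability mass on each expert
    P = probs.mean(axis=0)
    return num_experts * float(np.sum(f * P))
```

Because the loss grows as routing concentrates, adding it (scaled by a small coefficient) to the main training objective pushes the gate toward using all experts.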