
Mixture-of-Experts

Dec 29, 2025 · 1 min read

Model Merging, Mixtures of Experts, and Towards Smaller LLMs


Load Balancing:

Prevents the router from collapsing onto the same few experts, which would leave the remaining experts under-trained.

Auxiliary Loss

An extra loss term added to the training objective that penalizes uneven token-to-expert assignment, encouraging the router to spread tokens across all experts.
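A minimal sketch of a Switch-Transformer-style auxiliary load-balancing loss (the function name and the example usage coefficient are illustrative, not from the note itself). For each expert it multiplies the fraction of tokens actually routed there by the mean router probability for that expert; the sum is minimized when routing is uniform.

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, num_experts: int) -> torch.Tensor:
    """Auxiliary loss encouraging uniform token-to-expert routing.

    router_logits: (num_tokens, num_experts) raw router scores.
    Returns a scalar >= 1.0; equals 1.0 when routing is perfectly balanced.
    """
    probs = F.softmax(router_logits, dim=-1)            # (T, E) router probabilities
    top1 = probs.argmax(dim=-1)                         # expert chosen per token (top-1 routing)
    # f_i: fraction of tokens dispatched to expert i
    f = F.one_hot(top1, num_experts).float().mean(dim=0)
    # P_i: mean router probability assigned to expert i
    P = probs.mean(dim=0)
    return num_experts * torch.sum(f * P)
```

In training this term is typically added to the main loss with a small coefficient (e.g. `loss = task_loss + 0.01 * load_balancing_loss(logits, E)`), so balancing pressure does not overwhelm the task objective.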

