The payback math is settled in implementation, not theory. Baidu's application US20250036920A1 (“Mixture-of-experts model implementation method and system,” published 2025-01-30) concerns the production engineering of mixture-of-experts (MoE) models. Assigned to Beijing Baidu Netcom and classified under CPC G06N 3/045 and 3/0495, it covers how such a model is actually implemented and run.
The distinction matters financially. Mixture-of-experts promises more model capacity without a proportional rise in compute cost, but only if the implementation delivers on that promise. Expert routing, load balancing, and data movement all have to work efficiently in production, or the architecture's theoretical savings evaporate into idle hardware and communication overhead.
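To make the idle-hardware point concrete, here is a toy sketch of my own. It is not drawn from the Baidu application. It assumes one expert per device, top-1 routing, and a step that lasts as long as the busiest expert needs. The function name `expert_utilization` and the token assignments are hypothetical.

```python
from collections import Counter

def expert_utilization(assignments, num_experts):
    """Fraction of expert-device capacity doing useful work in one step.

    Assumption (illustrative only): one expert per device, and the step
    lasts as long as the busiest expert needs, so every other device
    idles until that expert finishes.
    """
    counts = Counter(assignments)
    loads = [counts.get(e, 0) for e in range(num_experts)]
    busiest = max(loads)
    if busiest == 0:
        return 0.0
    return sum(loads) / (num_experts * busiest)

# Hypothetical routing decisions: each entry is the expert chosen for a token.
balanced = [t % 4 for t in range(16)]    # 4 tokens per expert
skewed = [0] * 10 + [1, 1, 2, 2, 3, 3]   # expert 0 takes 10 of 16 tokens

print(expert_utilization(balanced, 4))   # 1.0
print(expert_utilization(skewed, 4))     # 0.4
```

Under these assumptions the same 16 tokens keep the hardware fully busy when routing is balanced and only 40% busy when one expert takes most of the traffic. Real systems add communication cost on top, which this sketch ignores.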
Baidu reports AI as part of its search, cloud, and content businesses, so model-serving efficiency feeds those lines rather than appearing as standalone economics. The application is the granular record beneath that reporting: dated 2025, assigned to a Baidu entity, and aimed at the implementation layer that decides whether MoE is cheap in practice.
A published application is not a granted patent, so its scope is unsettled. I attach no number to it, because no filing isolates implementation savings. What it documents is that the production-engineering layer of MoE, where efficiency is won or lost, was an explicit 2025 IP target for a major platform.
For the infrastructure desk, the durable lesson is that architecture is a promise and implementation is the receipt. An MoE implementation patent application is primary-source evidence that the gap between theoretical and realized efficiency is being engineered closed, and that gap is exactly where serving cost lives.