Microsoft Sparse MoE Encoding Patent | AlgorithmLedger

Routing inputs to experts is only half of mixture-of-experts; moving data efficiently is the other half. A 2024 Microsoft application on sparse MoE encoding tackles that.

The payback math for mixture-of-experts has a hidden term: data movement. Microsoft's application US20240086719A1 (“Sparse encoding and decoding at mixture-of-experts layer,” published 2024-03-14) goes after it. Assigned to Microsoft Technology Licensing, LLC and classified CPC G06N 3/098, it covers sparse encoding and decoding at the layer where a mixture-of-experts model routes inputs to experts.

Mixture-of-experts saves compute by activating only some experts per input — but those experts often live on different devices, so routing inputs to them and gathering results back means moving data across the system. That communication can become the new bottleneck, eating the savings the routing was supposed to deliver. Sparse encoding at that layer is about cutting the data moved.
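
To make the data-movement term concrete, here is a minimal Python sketch. It is not the encoding claimed in the application, whose details are not quoted here. It only contrasts a dense one-hot dispatch mask with a sparse list of the token-to-expert pairs actually used, then counts how many of those pairs cross a device boundary. Every name and size in it (route_tokens, 1024 tokens, 64 experts, 8 devices, top-2 routing, random stand-in gate scores) is an illustrative assumption.

```python
import random

# All sizes below are hypothetical, chosen only for illustration.
NUM_TOKENS = 1024
NUM_EXPERTS = 64
TOP_K = 2
NUM_DEVICES = 8


def route_tokens(num_tokens, num_experts, k, seed=0):
    """Pick k experts per token from random stand-in gate scores."""
    rng = random.Random(seed)
    assignments = []
    for _ in range(num_tokens):
        scores = [rng.random() for _ in range(num_experts)]
        ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
        assignments.append(ranked[:k])
    return assignments


def dense_mask_entries(num_tokens, num_experts):
    """A dense one-hot mask holds one entry per (token, expert) pair."""
    return num_tokens * num_experts


def sparse_pair_entries(assignments):
    """A sparse encoding holds only the (token, expert) pairs in use."""
    return sum(len(experts) for experts in assignments)


def remote_pairs(assignments, num_experts, num_devices):
    """Count pairs whose token and expert sit on different devices.

    Assumes tokens and experts are split evenly and contiguously
    across devices, which is a simplification.
    """
    tokens_per_device = len(assignments) // num_devices
    experts_per_device = num_experts // num_devices
    remote = 0
    for token, experts in enumerate(assignments):
        token_device = token // tokens_per_device
        for expert in experts:
            if expert // experts_per_device != token_device:
                remote += 1
    return remote


assignments = route_tokens(NUM_TOKENS, NUM_EXPERTS, TOP_K)
dense = dense_mask_entries(NUM_TOKENS, NUM_EXPERTS)
sparse = sparse_pair_entries(assignments)
remote = remote_pairs(assignments, NUM_EXPERTS, NUM_DEVICES)
print(f"dense mask entries:  {dense}")
print(f"sparse pair entries: {sparse}")
print(f"pairs crossing devices: {remote} of {sparse}")
```

With these assumed sizes the dense mask holds 65,536 entries and the sparse list 2,048, a 32x gap that comes purely from how the routing decision is represented. The number of pairs crossing devices depends on the random scores, so no figure is given for it here.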

“A computing system including a plurality of processing devices configured to execute a Mixture-of-Experts (MoE) layer. The processing devices are configured to execute the MoE layer at least in part by receiving an input tensor including input tokens.” (U.S. Patent Application 2024/0086719 A1)

Microsoft routes AI revenue through cloud and productivity segments with no technique-level economics disclosed, as always. The application is the granular record under the cost story: dated 2024, assigned to Microsoft, and aimed at the communication overhead of serving the increasingly standard MoE architecture.

Published is not granted, so the scope of any eventual claims is unsettled, and I attach no number: no filing isolates communication savings. What it documents is that Microsoft was seeking to patent the less-obvious half of MoE efficiency in 2024: not the routing, but the cost of moving data through it.

For the infrastructure desk, the reusable lesson is that compute savings can be silently eaten by communication. The companies serving MoE models cheaply will be the ones that solved data movement, not just routing, and a dated patent application on sparse MoE encoding is evidence Microsoft was working that exact corner.

