The arithmetic underneath modern AI is matrix multiplication, repeated at enormous scale, and increasingly on matrices that are mostly zeros. On May 26, 2026, Intel was issued US12639398B2, a grant covering a graphics processor with "sparse matrix multiply acceleration hardware including a systolic processing array with feedback inputs." Strip away the phrasing and the claim is about doing the central operation of a neural network faster by skipping the zeros — and doing it in dedicated silicon rather than general-purpose logic.
What did Intel actually lock in here? The grant describes a graphics processor built from processing clusters, each with multiple multiprocessors connected over a data interconnect, and each multiprocessor carrying the sparse matrix-multiply unit. The detail that matters for coverage is the systolic array with feedback inputs: a specific hardware arrangement for streaming operands through a grid of multiply-accumulate cells. A granted claim on that arrangement is enforceable coverage on a particular way of accelerating the most compute-intensive part of AI workloads.
"each multiprocessor comprising sparse matrix multiply acceleration hardware including a systolic processing array with feedback inputs" (Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs, US12639398B2)
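To make "streaming operands through a grid of multiply-accumulate cells" concrete, here is a minimal software sketch of a generic output-stationary systolic array. It is not taken from the patent. The function name `systolic_matmul` and the `acc_in` parameter are hypothetical, and treating "feedback" as feeding one pass's accumulators into the next pass is only an assumed reading of the phrase, since the article does not describe the actual wiring.

```python
def systolic_matmul(A, B, acc_in=None):
    """Simulate an n x m grid of multiply-accumulate cells, cycle by cycle.

    Rows of A enter from the left edge and move right; columns of B enter
    from the top edge and move down. Inputs are skewed in time so matching
    operands meet in the right cell. acc_in (hypothetical) seeds the
    accumulators, which lets a previous pass's output be fed back in.
    """
    n, k, m = len(A), len(A[0]), len(B[0])
    acc = [row[:] for row in acc_in] if acc_in else [[0] * m for _ in range(n)]
    a_reg = [[0] * m for _ in range(n)]  # A operand currently held by each cell
    b_reg = [[0] * m for _ in range(n)]  # B operand currently held by each cell

    for t in range(n + m + k - 2):  # cycles until the last operands drain
        # Walk from the bottom-right so each cell reads its neighbours'
        # values from the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j > 0:
                    a_in = a_reg[i][j - 1]
                else:  # left edge: row i is delayed by i cycles
                    a_in = A[i][t - i] if 0 <= t - i < k else 0
                if i > 0:
                    b_in = b_reg[i - 1][j]
                else:  # top edge: column j is delayed by j cycles
                    b_in = B[t - j][j] if 0 <= t - j < k else 0
                a_reg[i][j], b_reg[i][j] = a_in, b_in
                acc[i][j] += a_in * b_in
    return acc


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
full = systolic_matmul(A, B)

# Same product in two passes over halves of the inner dimension, with the
# first pass's accumulators fed back in as the second pass's starting state.
first = systolic_matmul([[1], [3]], [[5, 6]])
tiled = systolic_matmul([[2], [4]], [[7, 8]], acc_in=first)
```

Under these assumptions, `full` and `tiled` hold the same result, which shows how a fixed-size array could handle a larger product by reusing its own outputs.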
The grant matters because of where it sits in the stack. Sparsity, meaning the fact that many weights in a trained model are zero or near zero, is one of the levers hardware vendors pull to raise effective throughput without raising clock speed or power. A claim that ties sparse matrix multiplication to a systolic array is coverage on a named technique inside the part of the chip that competitors also build. It puts on the record that Intel holds enforceable ground in AI compute primitives, not only in the surrounding silicon.
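The throughput argument can be illustrated with a small software sketch of zero skipping. It is a generic illustration, not the patented hardware: the helper names `compress_rows` and `sparse_matvec`, the weight values, and the choice to count multiply-accumulates as a stand-in for work are all assumptions made for the example.

```python
def compress_rows(W):
    """Keep only the nonzero weights of each row, tagged with their column."""
    return [[(j, w) for j, w in enumerate(row) if w != 0] for row in W]


def sparse_matvec(W_compressed, x):
    """Multiply compressed weights by x, counting the multiply-accumulates."""
    out, macs = [], 0
    for row in W_compressed:
        total = 0
        for j, w in row:  # zeros were dropped, so they cost nothing here
            total += w * x[j]
            macs += 1
        out.append(total)
    return out, macs


# Hypothetical 3 x 4 weight matrix in which 9 of the 12 entries are zero.
W = [[0, 2, 0, 0],
     [1, 0, 0, 3],
     [0, 0, 0, 0]]
x = [1, 2, 3, 4]

y, macs_done = sparse_matvec(compress_rows(W), x)
dense_macs = len(W) * len(W[0])  # what a dense unit would perform
```

Here the compressed path performs 3 multiply-accumulates where a dense pass would perform 12, with no change in the result. A hardware unit that skips zeros aims at the same saving per clock cycle.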
The same week, the package around the compute
Intel's grants that week did not stop at the math unit. A striking share covered the multi-die packaging that the AI buildout has made central. US12642071B2 claims a scalable architecture for multi-die semiconductor packages, with cores on one die directly coupled by through-silicon vias to local DRAM portions on a second die. US12642130B2 covers hybrid bonding a die to a substrate with vias on both sides of the die, and US12642132B2 claims a die-stacking package architecture for high-speed I/O using through-dielectric vias.
These are not AI patents in the model sense, but they are the physical substrate the AI accelerator depends on: how dies stack, how memory sits next to compute, how signals cross between them. As single monolithic chips give way to multi-die packages, coverage on the bonding and interconnect techniques is coverage on how the accelerator is assembled. Intel was issued a cluster of such patents in a single week.
Memory, cache and the rest of the footprint
Closer to the compute, US12639779B2 covers merging partial cache-line writes inside a graphics processor's memory path — a data-movement efficiency claim of the sort that compounds across large workloads. And the week's grants reach into the facility too: US12641747B2 claims a dual-fan cooling arrangement that maintains positive pressure inside an electronics chassis.
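For readers unfamiliar with partial cache-line writes, the general idea of merging them can be sketched in a few lines. This is a generic write-combining illustration, not the mechanism claimed in US12639779B2. The 64-byte line size, the function name `merge_partial_writes`, and the byte-mask representation are assumptions.

```python
LINE_BYTES = 64  # assumed cache-line size, not taken from the patent


def merge_partial_writes(writes):
    """Combine small writes that land in the same cache line.

    writes: iterable of (address, payload_bytes).
    Returns {line_base_address: (line_data, valid_byte_mask)}, so each
    touched line can be sent to memory once instead of once per write.
    """
    lines = {}
    for addr, payload in writes:
        for offset, byte in enumerate(payload):
            a = addr + offset
            base = a - a % LINE_BYTES
            data, mask = lines.setdefault(
                base, (bytearray(LINE_BYTES), [False] * LINE_BYTES)
            )
            data[a - base] = byte  # later writes overwrite earlier ones
            mask[a - base] = True
    return lines


# Three small writes: two touch line 0, one touches the line at address 64.
pending = [(0, b"\x01\x02"), (2, b"\x03\x04"), (64, b"\xff")]
merged = merge_partial_writes(pending)
```

In this sketch three partial writes collapse into two line-sized transactions. Across a large workload, that kind of reduction in memory traffic is where the efficiency gain would come from.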
Lay the week's issuances end to end and a coverage map emerges that runs from the multiply-accumulate cell out to the chassis fan: the sparse matrix unit that does the AI arithmetic, the cache path that feeds it, the through-silicon vias that connect it to memory, the hybrid-bonding and die-stacking methods that build the package, and the cooling that keeps it running. Each is documented in a patent number issued the same day. For a reader assessing the competitive ground in AI silicon, the record shows Intel holding enforceable coverage at multiple layers of that stack simultaneously — the compute primitive and the plumbing that surrounds it, granted together.