The most revealing thing about OpenAI's patent filings is the subject. The company is known to the public for models. The three applications published under its name on May 28, 2026, are about transistors. All three describe compute-in-memory (CIM) accelerator hardware aimed, in the filings' own framing, at "heavy AI training and inference workloads." A published application is a delayed look at where a company was spending engineering effort roughly 18 months earlier; this set points at the chip, not the model.

The lead filing, US20260148791A1, describes a compute-in-memory device that performs vector-matrix multiplication — the core AI operation — directly inside the memory that holds the weights, rather than shuttling data back and forth to a separate compute unit. The design lets the chip read and write some weight values while simultaneously running the multiplication on others.

"a compute-in-memory (CIM) device or macro is configured to perform vector matrix multiplication (VMM) operations between a vector of activation values and a matrix of weight values" (from "Parallel operations in compute systems," US20260148791A1)
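To make the operation concrete, here is a minimal sketch of a vector-matrix multiplication in plain Python. It shows only the arithmetic, not the patented circuit: the function name `vmm`, the list-of-rows layout, and the idea that each output column maps to one accumulation chain inside the memory array are illustrative assumptions, not details taken from the filing.

```python
def vmm(activations, weights):
    """Multiply a length-N activation vector by an N x M weight matrix.

    Each output column is one multiply-accumulate chain. In a CIM macro
    that chain would run inside the array storing the column's weights
    (an assumption made for illustration, not the filing's design).
    """
    n_rows = len(weights)
    if len(activations) != n_rows:
        raise ValueError("activation length must match weight row count")
    n_cols = len(weights[0]) if n_rows else 0

    outputs = [0] * n_cols
    for col in range(n_cols):
        acc = 0
        for row in range(n_rows):
            # One multiply-accumulate step per stored weight.
            acc += activations[row] * weights[row][col]
        outputs[col] = acc
    return outputs


# Hypothetical 3-element activation vector and 3 x 2 weight matrix.
activations = [1, 2, 3]
weights = [
    [1, 0],
    [0, 1],
    [2, 2],
]
result = vmm(activations, weights)
print(result)  # [7, 8]
```

Every weight is touched exactly once per output, which is why keeping the weights in place and moving only the much smaller activation vector is attractive.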

The other two applications, published the same day, extend the same architecture. US20260147854A1 covers floating-point multiplications in a CIM macro, with a mode-decoding unit that adapts to different floating-point formats for the activation and weight values. This is the kind of mixed-precision handling that matters when a model uses several numeric formats to save memory and power. US20260147536A1 describes aligning the mantissa bits of the resulting products and summing them through an adder tree to produce an integer-format accumulation. Together the three read as a coherent body of work on one accelerator design, filed by a shared set of inventors.
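The align-then-sum step can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the filing's circuit: each product is represented as a hypothetical `(mantissa, exponent)` pair with value `mantissa * 2**exponent`, mantissas are aligned to the largest exponent by right-shifting (which drops low-order bits), and the helper names `align_mantissas` and `adder_tree` are invented for illustration. Real bit widths, rounding, and format decoding are not modeled.

```python
def align_mantissas(products):
    """Shift every mantissa so all products share the largest exponent.

    products: list of (mantissa, exponent) pairs, value = mantissa * 2**exponent.
    Right shifts discard low-order bits, so alignment can lose precision.
    """
    shared_exp = max(exp for _, exp in products)
    aligned = [mant >> (shared_exp - exp) for mant, exp in products]
    return aligned, shared_exp


def adder_tree(values):
    """Sum integers pairwise, level by level, the way an adder tree would."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad an odd level with a zero input
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical products: 12 * 2**0, 5 * 2**2, -8 * 2**1.
products = [(12, 0), (5, 2), (-8, 1)]
aligned, shared_exp = align_mantissas(products)
total = adder_tree(aligned)
print(aligned, shared_exp, total)  # [3, 5, -4] 2 4
```

Once the mantissas share an exponent, the accumulation is plain integer addition; the result here is the integer 4 with a shared exponent of 2, that is, 16. In this example no bits are lost because the shifted mantissas divide evenly, which will not hold in general.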

What the filings point to

The forward-looking read is straightforward because the filings are so consistent. A company whose public identity is a foundation-model lab is documented, in its earliest published patent applications, working on the floating-point datapath of a memory-resident matrix-multiply engine. That is silicon-design work: the layer below the model, which OpenAI has until now been understood to rent from others. The applications point to a company quietly investing in the hardware its models run on.

The inventor names reinforce the read. The filings list engineers with deep accelerator and chip-design backgrounds, not model researchers, working across all three applications. A single application about memory hardware could be incidental. Three, published the same day, sharing inventors and a single architectural thread, describe a sustained effort rather than a one-off.

Why compute-in-memory, and why it matters

The technical choice carries its own signal. Conventional accelerators spend a large share of their energy moving weight data between memory and compute units — the "memory wall" that dominates the power budget of large-model workloads. Compute-in-memory attacks that directly by doing the multiplication where the weights already sit. A model company filing on CIM is filing on the specific bottleneck that determines how expensive its own inference is to serve.

For a reader tracking the AI sector as a business, the through-line is the cost and supply of compute, the dependency that runs through every model company's economics. These applications indicate OpenAI is doing original engineering at that layer rather than treating accelerators purely as a purchased input. Whether any of these designs reaches silicon at scale is not something the filings establish; an application is a record of R&D direction, not a product. But the direction is unambiguous. The first patent applications the public sees from this company are not about how its models think. They are about the multiply-accumulate cell, the floating-point format, and the adder tree: the machinery of running a model at all. The filings point downward, into the hardware.