As consumers increasingly ask a chatbot what to buy instead of scrolling a search page, a quiet question becomes a competitive one: when a large language model recommends a product, whose product does it recommend, and why? A paper posted to arXiv on June 16, 2026 by Xi Chu and Yupeng Hou tries to answer that empirically, and its findings should interest anyone who treats LLM recommendation as a new distribution channel rather than a novelty. The study examines brand dynamics across three commercial models — GPT-4o-mini, Claude Sonnet, and Gemini 3 Flash — using skincare, a category the authors chose precisely because consumers cannot judge quality before buying and must lean on brand reputation.
The headline result is a brand advantage that is both enormous and brittle. When competing products carry identical specifications, the authors report that well-known brands get recommended 100 percent of the time. They label this a "Conditional Monopoly." But the condition is doing all the work: that total dominance, they find, "disappears with less than a +0.1-star rating advantage for a competitor." In other words, the incumbent wins by default when nothing distinguishes the products, but the moment a challenger has even a sliver of measurable quality signal, the monopoly breaks.
"Our results suggest that generative engine optimization (GEO) should be studied not only as a security risk, but also as an emerging marketing practice that shapes market competition." (arXiv:2606.17443)
That framing — GEO as a marketing practice that shapes competition — is the part with commercial teeth. Search engine optimization spawned an entire industry; the paper is arguing that its successor, generative engine optimization, is already doing the same thing inside model outputs, and that the discipline studying it has been looking at the wrong risk. The security literature treats prompt manipulation as an attack. This study reframes the same maneuvers as ordinary competitive behavior — firms optimizing how they are described so that a model surfaces them — which means the relevant question is not just "is this an exploit?" but "who captures the recommendation, and what did it cost them?"
The fabricated-evidence finding
The second experiment is the one that should make any brand-safety or trust-and-safety reader sit up. The authors report that "authority-style marketing language, including fabricated clinical-evidence claims," breaks the incumbent monopoly, and they quantify the effect: a "Bias Surplus Value equal to +0.17 rating points," with each model responding differently. Translated, dressing a product in confident, clinical-sounding language, fabricated claims included, was worth about as much to the model's recommendation as a real +0.17-star rating advantage. That is a direct measurement of how a model can be nudged by rhetoric it cannot verify, and a sharp illustration of why a recommendation channel that rewards authoritative-sounding text is structurally exposed to manipulation.
For the business side, this is the uncomfortable core of the GEO economy. If authority-style language pays off in recommendation share, the incentive to produce it is obvious, and the incentive to fabricate it is only one step further. The fact that "each model responds differently" matters too: it means GEO is not a single optimization target but a per-platform one, the same fragmentation that made SEO a perpetual cat-and-mouse game across search engines.
The social dilemma when everyone optimizes
The third experiment is, to my eye, the most strategically important and the easiest to under-read. The authors model what happens when every brand adopts the same GEO strategy at once, and they find a collapse: "when all brands adopt the same optimization strategy, individual payoff falls from +0.802 to +0.007 in our payoff proxy, and non-participating brands receive zero recommendations." That is a textbook social dilemma. Optimizing pays handsomely when you are one of the few doing it. When everyone does it, the gains are competed away to nearly nothing, yet anyone who opts out is shut out entirely and receives zero recommendations. The result is an equilibrium in which every brand must spend on GEO to stay visible while almost none of them captures a lasting advantage from it. It is a tax on participation.
That dynamic, if it generalizes, reshapes marketing economics in the LLM channel. It implies GEO spending will become a competitive necessity rather than a differentiator, with the value accruing less to any individual brand than to the channel and its optimizers, the same pattern that played out in paid search.
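To put the size of that collapse in proportion, here is a minimal arithmetic sketch in Python. Only the two payoff figures come from the paper as quoted in this section; the constant and function names are my own, and the calculation is a plain relative-change computation, not a reconstruction of the authors' payoff proxy.

```python
# The two figures below are the payoff-proxy values quoted from the paper.
# Everything else (names, helper function) is illustrative, not the authors' code.
PAYOFF_BEFORE = 0.802  # individual payoff before all brands adopt the strategy
PAYOFF_AFTER = 0.007   # individual payoff once all brands adopt the strategy


def relative_decline(before: float, after: float) -> float:
    """Return the fraction of the original payoff that is lost."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before


decline = relative_decline(PAYOFF_BEFORE, PAYOFF_AFTER)
retained = PAYOFF_AFTER / PAYOFF_BEFORE

print(f"Share of payoff lost under universal adoption: {decline:.1%}")
print(f"Share of payoff retained: {retained:.1%}")
```

On these figures, roughly 99 percent of the measured payoff disappears once the strategy is universal. That is a statement about the paper's proxy inside its own experimental setup, not a forecast of real marketing returns.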
The caveats that keep this honest
This is a preprint, not peer-reviewed at the time of writing, and the study is deliberately narrow: a single product category chosen for its reliance on reputation, three specific model versions, and controlled experiments with a stated payoff proxy and engineered rating deltas. The authors include a robustness check on search goods, but the precise figures — the +0.1-star threshold, the +0.17 Bias Surplus Value, the 0.802-to-0.007 payoff collapse — are properties of their experimental setup, not laws of the marketplace. A reader should treat them as well-defined measurements within a designed study rather than as universal constants, and watch for replication across categories and newer model versions.
What survives those caveats is the conceptual contribution, and it is a sturdy one. LLM recommendation has an incumbency bias that is real but fragile, it can be moved by language a model cannot fact-check, and when optimization becomes universal it turns into a cost everyone pays and few escape. For brands, agencies, and the platforms hosting these models, that is a market structure worth understanding now — before the GEO industry that the paper diagnoses fully prices itself in. The full study is on arXiv.