Adtech Ran This Experiment First

The most valuable advertising audience in the world, at least as measured by CPM, is middle-aged North American women with iPhones. This is not a marketing intuition or a demographic hunch. It is the output of a measurement system that has run billions of auctions, tracked billions of clicks, and continuously adjusted its model of which attention converts to purchase. The system did not decide this. It discovered it, incrementally, by measuring everything it could and adjusting toward whatever the measurement rewarded.

This is the same loop now running through AI. Adtech ran it first, at scale, for two decades. The pathologies it produced are worth examining carefully – not as cautionary tales, but as data.

The CPM auction encodes a theory of value

Cost per thousand impressions (CPM) is a proxy. The advertising system does not actually know whether a given viewer will buy a given product. It knows what viewers in similar contexts have done historically, and it prices attention accordingly. A viewer in North Carolina watching a cooking video generates roughly 30 to 45 times more ad revenue than a viewer in rural India watching the same video, because the North Carolina viewer is statistically more likely to be reachable by the purchase economy behind the ad.

That pricing is not arbitrary. It emerges from measurement: show ads, track clicks, track conversions, adjust bids, repeat. The system converges on a model of attention value that reflects observed purchasing behaviour across millions of transactions. Middle-aged North American women with iPhones rank highly because they convert. The measurement found them.
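The show-track-adjust loop can be sketched directly. The following is a minimal simulation, not a description of any real auction system: segment names, conversion rates, and dollar values are all invented. The loop never sees the true rates; it only observes impressions and purchases, and its learned prices converge on the segment that converts.

```python
import random

random.seed(0)

# Hypothetical audience segments with true (unobserved) conversion rates.
# The loop never reads these directly; it only observes outcomes.
TRUE_CONVERSION = {"segment_a": 0.040, "segment_b": 0.008, "segment_c": 0.001}
VALUE_PER_CONVERSION = 50.0  # illustrative advertiser value of one purchase

# Learned estimate of value per impression, built from observations alone.
estimated_value = {seg: 0.0 for seg in TRUE_CONVERSION}
impressions = {seg: 0 for seg in TRUE_CONVERSION}
conversions = {seg: 0 for seg in TRUE_CONVERSION}

for _ in range(200_000):
    seg = random.choice(list(TRUE_CONVERSION))
    impressions[seg] += 1
    if random.random() < TRUE_CONVERSION[seg]:
        conversions[seg] += 1
    # Re-price the segment from observed history: show, track, adjust, repeat.
    estimated_value[seg] = (conversions[seg] / impressions[seg]) * VALUE_PER_CONVERSION

# CPM is value per impression scaled to a thousand impressions.
cpm = {seg: value * 1000 for seg, value in estimated_value.items()}
for seg, price in sorted(cpm.items(), key=lambda kv: -kv[1]):
    print(f"{seg}: learned CPM ${price:,.2f}")
```

The point of the sketch is that no one decides segment_a is valuable. The measurement finds it.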

Optimising a proxy detaches the metric from the outcome

The gap between the measured proxy (click) and the intended outcome (purchase) is where the system’s incentives diverge from the advertiser’s interests. A click generates revenue for the platform regardless of whether a purchase follows. This gap is not a flaw that was overlooked. It is a structural property of any system that measures a correlate of the thing it wants rather than the thing itself.

Click fraud fills that gap from the other side. If clicks generate revenue for publishers regardless of purchase intent, then generating clicks without purchase intent is economically rational behaviour for anyone on the publisher side of the transaction. The click farming industry – artificial traffic sourced from low-cost labour markets or bot networks, routed through VPNs to simulate high-value geographic origins – exists because the measurement gap is wide enough to exploit. The system created the arbitrage by pricing the proxy rather than the outcome.
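The publisher-side arithmetic is simple enough to write down. In this sketch every number (cost per click, margin, conversion rates) is illustrative rather than a market figure, but the structure is the point: farmed clicks raise publisher revenue and lower advertiser profit by exactly the width of the measurement gap.

```python
# Illustrative numbers: the platform pays publishers per recorded click,
# while the advertiser only profits when a click converts to a purchase.
CPC = 0.50                    # dollars paid out per recorded click
MARGIN_PER_CONVERSION = 40.0  # advertiser margin on one purchase
REAL_CONVERSION_RATE = 0.02
FRAUD_CONVERSION_RATE = 0.0   # farmed clicks never purchase

def outcomes(real_clicks: int, fraud_clicks: int):
    clicks = real_clicks + fraud_clicks
    publisher_revenue = clicks * CPC  # the proxy pays out either way
    purchases = (real_clicks * REAL_CONVERSION_RATE
                 + fraud_clicks * FRAUD_CONVERSION_RATE)
    advertiser_profit = purchases * MARGIN_PER_CONVERSION - publisher_revenue
    return publisher_revenue, advertiser_profit

# Honest traffic versus the same traffic padded with farmed clicks.
honest = outcomes(10_000, 0)
padded = outcomes(10_000, 5_000)
print("honest:", honest)
print("padded:", padded)
```

Padding the traffic moves money from the advertiser to the publisher without creating a single purchase, which is why pricing the proxy rather than the outcome creates the arbitrage.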

This is reward hacking with a balance sheet. The LLM training article describes the same dynamic in RLHF: a model trained to maximise a reward signal learns to maximise the signal rather than the underlying behaviour the signal was meant to capture. The mechanism is identical. The adtech version simply has more history and more money attached to it.

The geographic disparity is the system working correctly

The CPM gap between the US and India does not represent a failure of the advertising system. The system prices attention accurately according to what its current model of purchaser behaviour predicts will convert. An Indian viewer watching a video aimed at North American consumers is, by that model, less likely to generate a purchase. The pricing reflects that prediction.

What the system cannot account for is the possibility that its model of purchaser behaviour is incomplete. If the Indian viewer represents an emerging market that will be highly valuable in five years, the current pricing is wrong – but wrong in a direction the system has no incentive to correct, because the correction would require sacrificing current revenue for a future model update. Systems optimised on historical data systematically discount futures that differ from the past.
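The discounting can be made concrete with a toy model. The numbers below are invented: an estimate that weights accumulated history heavily tracks a conversion rate growing 8% a month, and after five years the estimate still sits an order of magnitude below the true rate.

```python
# Sketch: a history-weighted estimate lags a rising conversion rate.
# All figures are illustrative; the point is the lag, not the magnitudes.
HISTORY_WEIGHT = 0.999  # heavy weight on accumulated history

true_rate = 0.001   # conversion rate in an emerging market
estimate = 0.001    # the system's learned estimate, seeded from history

for month in range(60):
    true_rate *= 1.08  # assumed ~8% monthly growth in conversions
    # The estimate updates from observations, but history dominates.
    estimate = HISTORY_WEIGHT * estimate + (1 - HISTORY_WEIGHT) * true_rate

print(f"true rate after 5 years : {true_rate:.4f}")
print(f"model estimate          : {estimate:.4f}")
```

The estimator is doing exactly what it was built to do. Correcting the lag would mean weighting recent evidence over accumulated history, which is the sacrifice of current revenue the paragraph above describes.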

The data maturity article identifies the same problem in enterprise ML: organisations whose data reflects historical operations rather than current or future conditions build models that encode the past rather than describe the present. The advertising system is a mature example of what that looks like at scale.

Fraud degrades the signal that the system depends on

Click fraud does not just steal from advertisers. It corrupts the training data for the model that prices attention. If a meaningful fraction of recorded clicks are fraudulent, the correlation between clicks and conversions weakens, and the system’s model of what converts becomes less accurate. The fraud that exploits the system simultaneously erodes the measurement quality that the system depends on to function.
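This degradation is easy to demonstrate in a simulation. The setup is assumed, not empirical: invented per-site click volumes, a 5% conversion rate on genuine clicks, and fraud distributed unevenly across sites. As fraudulent clicks enter the record, the correlation between recorded clicks and purchases collapses.

```python
import random
import statistics

random.seed(1)

def pearson(xs, ys):
    """Pearson correlation coefficient, computed directly."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def click_conversion_correlation(max_fraud_clicks, sites=500, conv_rate=0.05):
    """Correlation between recorded clicks and purchases across sites."""
    recorded_clicks, purchases = [], []
    for _ in range(sites):
        real = random.randint(100, 2000)               # genuine clicks
        fraud = random.randint(0, max_fraud_clicks)    # fraud varies by site
        recorded_clicks.append(real + fraud)           # what the system measures
        # Only genuine clicks ever convert to purchases.
        purchases.append(sum(random.random() < conv_rate for _ in range(real)))
    return pearson(recorded_clicks, purchases)

c_clean = click_conversion_correlation(0)
c_fraud = click_conversion_correlation(5_000)
print(f"correlation, no fraud      : {c_clean:.2f}")
print(f"correlation, fraud present : {c_fraud:.2f}")
```

A model trained to price clicks against that second, weaker correlation learns less about what converts on every update, which is the sense in which the fraud erodes the measurement it exploits.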

This is the long-run cost of proxy metric gaming: the proxy loses its predictive value. A sufficiently gamed metric stops measuring what it originally measured and starts measuring gaming behaviour. The advertising industry has spent twenty years fighting this, with considerable investment in fraud detection, traffic quality scoring, and viewability standards. Each countermeasure is a response to a new form of adversarial optimisation against the previous measurement.

The same cycle is beginning in AI systems. Prompt injection, jailbreaks, and adversarial inputs are the early-stage equivalent. As AI systems are used to make consequential decisions, the incentive to game their inputs grows, and the measurement infrastructure required to maintain signal quality will have to scale with it.

The attention economy is surprisingly global

A baker in Uzbekistan and a grandmother cooking for her village in India now reach North American audiences through the same platform. The platform prices their content by the audience it attracts, not by where the content was made. The CPM flows from US advertisers, through Google’s auction system, to creators in economies where it converts to a materially different standard of living.

This is not a designed outcome of the advertising system. It is an emergent consequence of the measurement loop running continuously at scale. The system found that certain categories of content – unpolished, slow, visually distant from anything produced inside the Western creator economy – attract sustained attention from high-CPM audiences. It promoted more of it. The promotion attracted more attention. The loop converged on a distribution of content that no editorial team would have selected.

The PMFOps article describes the same dynamic in product development: the measure-adjust loop surfaces what the audience responds to regardless of what the product team intended to build. The advertising system has been running that loop over content for two decades. What it has found about human attention – that authenticity is scarce, that scarcity attracts, and that the system’s own promotion of a thing erodes the scarcity that made it attractive – is a result that every system running the same loop will eventually reach.

The loop is not neutral

The measure-optimise-adjust cycle does not optimise toward any particular set of values. It optimises toward whatever the measurement rewards. In adtech, the measurement rewards attention that converts to purchase within a short attribution window. That objective function has no term for the quality of the attention, the accuracy of the content that captured it, or the long-run effects of the content distribution it produces.

The cost of ML article frames machine learning as F(X) = Y: a model applied to data producing output. The Y – the label, the definition of correct – is the point at which human judgement enters the system. In adtech, Y was defined as conversion, tracked over a short window, within an attribution model that over-credits the last click. Two decades of optimisation toward that Y has produced a system that is extremely good at generating short-window last-click conversions, and whose side effects – fraud, gaming, geographic arbitrage, content homogenisation – are the signature of that objective function rather than failures of the system.
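Last-click over-crediting is mechanical, not subtle. The sketch below uses a hypothetical conversion path with invented channel names: the last-click model assigns the entire conversion value to the final touchpoint, however much the earlier ones contributed, while a linear model (one common alternative) splits it evenly.

```python
# A single purchase preceded by several touchpoints. Channel names are invented.
path = ["search_ad", "social_ad", "email", "retargeting_ad"]
CONVERSION_VALUE = 100.0  # illustrative value of the purchase

def last_click(path):
    # All credit to the final touchpoint before the purchase.
    return {channel: (CONVERSION_VALUE if i == len(path) - 1 else 0.0)
            for i, channel in enumerate(path)}

def linear(path):
    # Equal credit to every touchpoint: one alternative attribution model.
    share = CONVERSION_VALUE / len(path)
    return {channel: share for channel in path}

print("last-click:", last_click(path))
print("linear    :", linear(path))
```

Optimising against the first dictionary rather than the second is a choice of Y, and everything downstream of it, including which channels the budget flows to, inherits that choice.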

Every system that adopts the measure-optimise-adjust loop will produce the same signature, scaled to whatever Y was chosen. The question worth asking before the loop starts is what the loop is actually optimising for, and whether that Y is close enough to the thing you want to remain a valid proxy as the system scales. Adtech answered this question empirically, over two decades, at considerable cost. The answer is available.