Surrogate Endpoints: When Lab Values Don't Equal Health

Surrogate endpoints are biomarkers (CRP, cholesterol, microbial diversity) that stand in for health outcomes but don't always predict them. HRT increased bone density (the surrogate) yet caused net harm in the Women's Health Initiative trial. The Prentice criteria can validate a surrogate, but meeting them requires rigorous evidence.


Biomarkers Don't Always Mean Benefit

Hormone replacement therapy (HRT) is a well-known example of surrogate endpoint failure. The clinical logic seemed sound: postmenopausal women have lower estrogen, lower bone mineral density (BMD), and higher fracture risk, and HRT increases BMD, an accepted surrogate for bone strength. Surely HRT would prevent fractures and improve overall health?

The Women's Health Initiative trial, whose estrogen-plus-progestin arm was stopped early in 2002, revealed the error. HRT did raise BMD and reduce fractures, but it also increased breast cancer, stroke, and blood clots, so the overall harms outweighed the benefits. The drug hit the surrogate target while harming patients on balance. BMD alone didn't predict the net clinical outcome.

Similarly, in the Cardiac Arrhythmia Suppression Trial (CAST), anti-arrhythmic drugs given after heart attack suppressed abnormal heartbeats (a surrogate for lower risk of cardiac death) yet increased mortality. The drugs achieved their biomarker goal while killing patients.

Surrogate endpoints are measurements intended to predict clinical outcomes. They're valuable for efficiently testing treatments: measuring CRP reduction is faster than waiting years for heart attack reduction; measuring FEV1 improvement is faster than waiting for asthma mortality reduction. But surrogates are only valid if changes truly reflect health improvement.

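Prentice (1989) proposed formal criteria for validating a surrogate: (1) the surrogate must be strongly associated with the clinical outcome; (2) the treatment must affect both the surrogate and the outcome; (3) the surrogate must capture the treatment's full effect on the outcome, so that once the surrogate is known, treatment assignment adds no further information about the outcome. Even when a surrogate appears to meet these criteria for one treatment, another treatment can affect the outcome through mechanisms the surrogate doesn't capture.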
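A small simulation can make this failure mode concrete. The sketch below is illustrative only: the trial, the effect sizes, and the function names (`simulate_trial`, `arm_summary`) are invented assumptions, not data from any real study. It models a treatment that improves a surrogate while raising event risk through a separate pathway the surrogate does not measure.

```python
import random
from statistics import mean

def simulate_trial(n_per_arm=5000, seed=0):
    """Simulate a two-arm trial. All effect sizes are invented for illustration."""
    rng = random.Random(seed)
    rows = []
    for treated in (0, 1):
        for _ in range(n_per_arm):
            # Surrogate (a hypothetical biomarker score): treatment raises it by 1.0 on average.
            surrogate = rng.gauss(0.0, 1.0) + 1.0 * treated
            # Event risk falls as the surrogate rises, but treatment also adds
            # risk through an off-target pathway the surrogate does not capture.
            risk = 0.20 - 0.03 * surrogate + 0.08 * treated
            risk = min(max(risk, 0.0), 1.0)
            event = 1 if rng.random() < risk else 0
            rows.append((treated, surrogate, event))
    return rows

def arm_summary(rows, arm):
    """Return (mean surrogate, event rate) for one arm (0 = control, 1 = treated)."""
    surrogate = mean(s for t, s, _ in rows if t == arm)
    event_rate = mean(e for t, _, e in rows if t == arm)
    return surrogate, event_rate

rows = simulate_trial()
control_s, control_rate = arm_summary(rows, 0)
treated_s, treated_rate = arm_summary(rows, 1)

surrogate_effect = treated_s - control_s       # positive: the biomarker "improved"
outcome_effect = treated_rate - control_rate   # positive: more adverse events

print(f"Change in surrogate:  {surrogate_effect:+.2f}")
print(f"Change in event rate: {outcome_effect:+.3f}")
```

In this toy setup the surrogate genuinely correlates with lower risk within each arm, so it passes the first criterion, yet treated patients fare worse overall because the third criterion fails: the surrogate does not carry the treatment's full effect.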

Microbiome research relies heavily on surrogates. Studies measure alpha diversity (the richness and evenness of bacterial taxa within a sample), abundance of specific taxa (Faecalibacterium prausnitzii, Akkermansia), short-chain fatty acid production, and inflammation markers (fecal calprotectin, CRP). But do these surrogates predict clinical outcomes?

Evidence is mixed. Increased Faecalibacterium abundance appears protective in IBD; low abundance associates with active disease. But does a probiotic increasing Faecalibacterium abundance translate to symptom relief or remission? A few trials show yes; most show modest effects. The surrogate (Faecalibacterium) predicts outcome poorly.

Alpha diversity provides another example. Low diversity, often taken as a marker of dysbiosis, associates with disease, so restoring diversity seems beneficial. Some studies show probiotic-induced diversity increases correlate with symptom improvement; others show diversity increases without symptom change. The surrogate-outcome relationship remains weak.
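For readers unfamiliar with how this surrogate is computed, here is a minimal sketch of two common alpha diversity summaries, richness and the Shannon index. The taxon counts are hypothetical, and real pipelines typically add steps such as rarefaction that are omitted here.

```python
import math

def richness(counts):
    """Number of taxa observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical read counts for four taxa in two samples.
even_sample = [25, 25, 25, 25]    # taxa equally abundant
skewed_sample = [97, 1, 1, 1]     # one taxon dominates

for name, counts in [("even", even_sample), ("skewed", skewed_sample)]:
    print(f"{name}: richness={richness(counts)}, Shannon={shannon_index(counts):.3f}")
```

Both samples contain the same four taxa, so richness is identical, but the Shannon index is lower for the skewed sample. Which number a study reports as "diversity" changes what the surrogate is actually measuring, and neither number says anything about symptoms.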

Short-chain fatty acids (butyrate, propionate, acetate) produced by beneficial bacteria theoretically promote gut health. Higher butyrate seems beneficial. Yet clinical trials increasing dietary fiber (which increases butyrate-producing bacteria) show variable symptom improvement. The surrogate (butyrate) doesn't consistently predict outcomes.

The FDA's accelerated approval pathway lets drugs reach the market on the basis of surrogate endpoints, on the condition that confirmatory trials later verify clinical benefit. This approach speeds drug availability but risks adopting ineffective treatments. Some accelerated-approval drugs were later withdrawn when outcome trials contradicted the surrogate endpoint's promise.

Trying to improve surrogates without evidence for outcome benefit wastes resources. A probiotic trial showing increased Faecalibacterium abundance but no symptom improvement proves little about clinical benefit—the surrogate changed without translating to health.

Better microbiome research would prioritize clinical outcomes: symptom relief, quality of life, hospitalization prevention, disease remission. Biomarker changes are secondary endpoints supporting mechanistic understanding. Without clinical outcome improvement, biomarker changes alone don't justify treatment recommendations.

The lesson: measuring something is not the same as improving health. Interventions must pass the ultimate test—do patients feel better, live longer, or have fewer complications? Surrogate improvements alone are insufficient.
