Architected for sensitivity in long disease-modifying trials

Disease-modifying therapies for Alzheimer's disease (AD) produce small effect sizes against slow-progressing endpoints. The risk that a Phase 2 trial fails because measurement noise across sites and time drowns the signal is real.

Our framework is a complementary scientific approach for AD-focused programs: classical interpretable features preserved for face validity, deep-learning embeddings added in parallel for sensitivity to subtle change, with harmonized measurement across languages and study duration.

Current evidence is retrospective; prospective partner validation is the next step.

Why biomarker sensitivity is the bottleneck in AD trials

The cognitive composites used as primary endpoints — PACC5, ADCOMS, CDR-SB — are by design slow-moving in the early disease stages where disease-modifying therapies are tested. Trials are long, expensive, and prone to false negatives when the underlying signal is subtle.
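
To make the noise argument concrete, here is a minimal sketch of the textbook normal-approximation sample-size formula for a two-arm comparison of means. It is not the framework's method, and every number in it is a hypothetical value chosen for illustration: the function name `n_per_arm`, the effect of 0.5 score points, and the standard deviations are all assumptions. It shows only that measurement-noise variance adds to between-subject variance, so the required sample size grows with it.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(true_effect, between_subject_sd, measurement_sd,
              alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-sided, two-arm test of means.

    Uses the standard normal approximation:
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / effect^2
    where sigma^2 is the total outcome variance. Here the total variance is
    assumed to be the sum of independent between-subject and measurement-noise
    components (an illustrative simplification).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    total_variance = between_subject_sd ** 2 + measurement_sd ** 2
    return ceil(2 * (z_alpha + z_power) ** 2 * total_variance / true_effect ** 2)


# Hypothetical values, in arbitrary score units.
quiet = n_per_arm(true_effect=0.5, between_subject_sd=1.0, measurement_sd=0.0)
noisy = n_per_arm(true_effect=0.5, between_subject_sd=1.0, measurement_sd=1.0)
print(f"per-arm n without measurement noise: {quiet}")
print(f"per-arm n with measurement noise:    {noisy}")
```

Under these assumed values, measurement noise with the same variance as the between-subject spread doubles the participants needed per arm for the same power. Equivalently, a trial sized without accounting for that noise is underpowered.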

Most published speech-biomarker work has been optimized for discrimination — distinguishing AD from healthy controls — a fundamentally easier statistical problem than detecting small longitudinal change under treatment. Three architectural choices matter most for the latter: representation richness, multilingual consistency, and stability over study duration.

Three architectural choices for sensitivity to small effects

Each choice addresses a specific failure mode in long, multi-site disease-modifying AD trials.

01 — Sensitivity

Deep-learning embeddings designed to preserve subtle longitudinal change

Hand-engineered features distinguish clinical groups well, but compress out the within-subject change a treatment effect produces. We keep classical features for face validity and add deep-learning embeddings in parallel for sensitivity to longitudinal change — complementary, not a replacement.

02 — Harmonization

Multilingual, harmonized — same model across trial sites

When each language is bolted onto an English-led model, site-to-site variation adds noise to your treatment-effect estimate. Our framework uses a single shared architecture across the major European trial languages, with construct definitions harmonized across sites, so that the remaining site variability reflects clinical heterogeneity rather than measurement noise.

03 — Stability

Designed for stable measurement over the trial timeline

Over an 18-month trial, OS updates can silently change the on-device noise cancellation applied to recordings. For a biomarker built purely on deep-learning embeddings, this can introduce measurement drift that is confounded with the treatment effect. Our dual-pipeline architecture is designed to remain stable across these preprocessing changes.

See our evaluation

Initial retrospective evidence

Across four datasets and five tasks — including retrospective decline classification on ADReSSo — the framework discriminates clinical groups and classifies decline. Prospective validation in trial-relevant populations is the next step.

Discriminative and longitudinal performance
0.93 AUC · External clinical evaluation · N = 1,142
0.77 AUC · AD vs MCI differentiation · N = 161
0.75 AUC · Retrospective decline classification · ADReSSo · N = 105
2.74 MAE · Cross-sectional MMSE estimation · N = 156 · R² = 0.63

Current evidence is retrospective on public and partner-curated benchmarks. Prospective validation in trial-relevant cohorts is the active next step.

AD/PD™ 2026 · ePoster

Validation of fully explainable speech biomarkers for AD/MCI screening

Adrian Ortiz, Hanyu Peng · Cephalgo SAS

Evaluation on 44 AD/MCI cases (mean age 80.3) and 44 age-matched healthy controls (mean age 80.5). The speech features are fully interpretable and traceable to specific acoustic and linguistic constructs.

0.94 AUC
89.3% Balanced accuracy
92.4% Sensitivity
86.3% Specificity

Where this fits in AD trials

High-value entry points for AD-focused programs.

Enrichment

Potential functional progression layer on amyloid-positive cohorts

Among amyloid-positive candidates, identify those most likely to show functional progression over the trial timeline.

Longitudinal monitoring

Trajectory information between visits

High-frequency speech assessment captures trajectory information that infrequent composite-based visits cannot. The sensitivity, harmonization, and stability properties described above are most directly relevant here.
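
One way to see why assessment frequency matters is the standard error of a per-participant slope. The sketch below uses the ordinary least-squares result SE = noise SD / sqrt(sum of squared time deviations). It is an illustration under stated assumptions, not a description of the framework: the function name `slope_standard_error`, the three-visit and monthly schedules, and the unit noise SD are hypothetical, and the formula assumes independent, equal-variance errors at every time point.

```python
from math import sqrt


def slope_standard_error(times_months, noise_sd):
    """Standard error of an OLS slope fitted to one participant's scores.

    Assumes each score carries independent noise with the same SD, which is
    a simplification: closely spaced real assessments are usually correlated.
    """
    mean_t = sum(times_months) / len(times_months)
    sxx = sum((t - mean_t) ** 2 for t in times_months)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    return noise_sd / sqrt(sxx)


# Hypothetical schedules over an 18-month trial.
clinic_schedule = [0, 9, 18]          # three clinic visits
monthly_schedule = list(range(19))    # one remote assessment per month

se_clinic = slope_standard_error(clinic_schedule, noise_sd=1.0)
se_monthly = slope_standard_error(monthly_schedule, noise_sd=1.0)
print(f"slope SE, three visits:       {se_clinic:.4f} units/month")
print(f"slope SE, monthly assessment: {se_monthly:.4f} units/month")
print(f"ratio (monthly / visits):     {se_monthly / se_clinic:.2f}")
```

With the same per-assessment noise, the denser schedule gives a tighter slope estimate over the same 18 months. Because real repeated measurements are correlated, the gain in practice would be smaller than this independent-error sketch suggests.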

[Figure: cognitive-speech score over time, comparing daily speech assessment with clinic visits (Visit 1, Visit 2, Visit 3).]

Neuropsychiatric symptoms

Prosodic markers of agitation, apathy, affect

Prosodic and acoustic features can carry information about agitation, apathy, and affective state, which makes them directly relevant to trials targeting AD-related neuropsychiatric symptoms.

Collaboration formats

We do not require new data collection to begin. If your AD program has previously collected speech data, we can apply our framework to it under a research collaboration.

Retrospective analysis

On legacy speech data from completed or ongoing AD trials.

Exploratory endpoint integration

In active protocols, with no impact on primary endpoint design.

Joint validation studies

On partner-curated cohorts, with shared design and reporting.

Contact

Let's discuss your AD program.

Tell us about the studies you're running and the questions you'd like speech to help answer.

Get in touch