Speech biomarkers for Parkinson's trials

For PD progression monitoring, treatment-response exploration, and combined motor + cognitive assessment. Our framework brings quantitative speech measures into trials with a property that pure deep-learning approaches lack: stability across the acoustic variability of real-world smartphone data collection.

The hidden problem in remote PD monitoring

A longitudinal smartphone-based PD trial spans months or years. Over that time, participants' phones receive OS updates that can silently change the noise-cancellation algorithm applied to recordings. The biomarker model is then evaluated on audio preprocessed by a noise-cancellation version that was absent from its training data.

Because PD biomarkers are designed to detect within-subject change over time, any systematic shift in model output caused by a noise-cancellation change is confounded with the underlying disease-progression signal — the very signal the biomarker is designed to measure.

Motor speech features (jitter, shimmer, harmonics-to-noise ratio, formant stability) are particularly sensitive to this kind of preprocessing variability. End-to-end deep-learning models trained on clean speech are at known risk of learning patterns that disappear when data is collected through different smartphone preprocessing pipelines.
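To make the sensitivity concrete, here is a minimal sketch of two of the features named above, using the common "local" definitions: the mean absolute difference between consecutive cycles divided by the mean value. The cycle periods and peak amplitudes are hypothetical numbers chosen for illustration, and this is not Cephalgo's extraction pipeline. Because both measures are built from cycle-to-cycle differences of a fraction of a millisecond, any preprocessing that smooths or reshapes the waveform can move them.

```python
from statistics import mean

def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)

def jitter_local(periods_s):
    """Cycle-to-cycle variation of the glottal period (frequency perturbation)."""
    return local_perturbation(periods_s)

def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (amplitude perturbation)."""
    return local_perturbation(peak_amplitudes)

# Hypothetical measurements from five consecutive cycles of a sustained vowel.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]   # seconds
amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]          # arbitrary linear units

jitter = jitter_local(periods)
shimmer = shimmer_local(amplitudes)
print(f"jitter (local):  {jitter:.2%}")
print(f"shimmer (local): {shimmer:.2%}")
```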

How OS updates confound longitudinal measurement

OS updates arrive at irregular intervals over the study; each one may change the on-device noise-cancellation algorithm, introducing a step in observed model output that is indistinguishable from true disease progression.

[Figure: PD severity score over study months M0 to M18, showing true PD progression, observed model output, and three OS update events.]
Schematic illustration. In a longitudinal smartphone-based trial, the OS noise-cancellation (NC) algorithm can change with each system update; updates arrive at irregular intervals and their effects may compound over the study duration. A pure deep-learning model trained on the initial NC generation produces a step change in output at each update, and these steps are confounded with the disease-progression signal the biomarker is designed to detect.
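The confound can be sketched numerically. The toy simulation below assumes a linear true progression and a fixed positive offset in model output after each update; the slope, update months, and step size are all hypothetical, and real offsets could differ in size and direction. It then fits a least-squares trend to both series, as a naive progression analysis might.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

months = list(range(0, 19))      # M0 .. M18, one assessment per month
TRUE_SLOPE = 0.5                 # hypothetical severity points per month
UPDATE_MONTHS = [5, 11, 16]      # hypothetical OS update times
STEP = 1.5                       # hypothetical output offset added by each update

true_scores = [TRUE_SLOPE * m for m in months]
observed = [
    score + STEP * sum(1 for u in UPDATE_MONTHS if m >= u)
    for m, score in zip(months, true_scores)
]

true_fit = ols_slope(months, true_scores)
observed_fit = ols_slope(months, observed)
print(f"true progression rate:     {true_fit:.3f} points/month")
print(f"apparent progression rate: {observed_fit:.3f} points/month")
```

In this setup the fitted trend on the observed series is steeper than the true rate, and nothing in the score alone separates the two contributions.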

A 4× tighter sensitivity spread

In an internal benchmark on the Italian Parkinson's Voice and Speech Dataset, we evaluated four classifier architectures across six scenarios of mid-study noise-cancellation algorithm change. Our dual-pipeline architecture was the most stable across all six conditions in this benchmark.

Sensitivity spread across 6 deployment scenarios

Baselines (pure DL and feature-only): 42–46 pp
Cephalgo (dual-pipeline architecture): 11.3 pp
Cephalgo sensitivity range: 86.4%–97.7% across every condition (floor above 86%)

Robustness evaluation

Sensitivity spread (max − min) of each classifier across six deployment scenarios in which the training noise-cancellation algorithm differs from the test algorithm. Cephalgo's dual-pipeline architecture maintained sensitivity within 11.3 pp across all six scenarios, a roughly 4× tighter spread than the baselines (42–46 pp).
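The spread metric is simple arithmetic, sketched below with the figures reported on this page. Only the lowest and highest dual-pipeline sensitivities are published here, so the four intermediate scenario values are left out rather than guessed; the function name is ours for illustration.

```python
def sensitivity_spread(sensitivities_pct):
    """Spread in percentage points: highest minus lowest sensitivity across scenarios."""
    return max(sensitivities_pct) - min(sensitivities_pct)

# Reported extremes for the dual-pipeline architecture (percent).
# The intermediate scenario values are not reported and do not affect max - min.
dual_pipeline_extremes = [86.4, 97.7]
dual_spread = sensitivity_spread(dual_pipeline_extremes)

# Reported range of baseline spreads (percentage points).
baseline_spreads = (42.0, 46.0)
ratios = [b / dual_spread for b in baseline_spreads]

print(f"dual-pipeline spread: {dual_spread:.1f} pp")
print("baseline / dual-pipeline spread ratio:",
      " to ".join(f"{r:.2f}x" for r in ratios))
```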

Italian PD Voice and Speech Dataset · n = 50 · 10 held-out test subjects · Reporting follows TRIPOD+AI

See the full chart on Speech analysis
This evaluation was accepted as a selected poster for the 9th Annual Digital Biomarkers in Clinical Trials Summit (Roche, Basel, June 2026), a precompetitive consortium of pharma digital biomarker leads.
Limitation note: The evaluation used a single public cohort (n = 50), 10 held-out test subjects, and four software-simulated noise-cancellation (NC) algorithms rather than direct OS-update replicas. Results are exploratory evidence of preprocessing sensitivity rather than direct validation against real OS-update effects. Prospective evaluation on partner cohorts is part of our active research collaboration agenda.

Our approach

Our curated acoustic pipeline preserves PD-relevant features — jitter, shimmer, harmonics-to-noise ratio, formant stability, prosodic timing — through validated, transparent extraction methods. Deep learning adds complementary pattern recognition where it helps, without making the platform fragile to the acoustic conditions of remote acquisition.

The framework provides feature-level explainability: each output is traceable to specific acoustic features rather than an opaque score. The architecture is aligned with the DiME V3 verification and validation framework; evaluation reporting follows TRIPOD+AI guidance.
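As a generic illustration of what feature-level traceability can look like, the sketch below scores a recording with a simple additive model over standardized features, so the output decomposes exactly into one contribution per feature. This is an assumption made for illustration and is not Cephalgo's model: the reference statistics, weights, and feature values are all hypothetical.

```python
# Hypothetical reference statistics (mean, standard deviation) per feature.
REFERENCE = {
    "jitter": (0.010, 0.004),
    "shimmer": (0.040, 0.015),
    "hnr_db": (20.0, 4.0),
}
# Hypothetical weights; a negative weight means lower values raise the score.
WEIGHTS = {"jitter": 0.8, "shimmer": 0.6, "hnr_db": -0.7}

def explain_score(features):
    """Return the additive score and each feature's contribution to it."""
    contributions = {}
    for name, value in features.items():
        ref_mean, ref_std = REFERENCE[name]
        z = (value - ref_mean) / ref_std          # standardize against the reference
        contributions[name] = WEIGHTS[name] * z   # this feature's share of the score
    return sum(contributions.values()), contributions

# Hypothetical feature values for one recording.
score, parts = explain_score({"jitter": 0.018, "shimmer": 0.055, "hnr_db": 14.0})

# List contributions from largest to smallest magnitude.
for name, contribution in sorted(parts.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:8s} {contribution:+.2f}")
print(f"{'score':8s} {score:+.2f}")
```

With an additive form like this, a change in the score between two visits can be attributed to the specific features that moved, which is harder to do with an opaque end-to-end score.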

Validation work on PD-specific clinical outcome correlates is ongoing. We treat current PD clinical-validation results as initial, and are actively seeking collaborators for prospective and retrospective validation on longitudinal partner cohorts.

Retrospective PD screening evaluation

Selected ePoster, AD/PD™ 2026.

Validation of fully explainable speech biomarkers for PD/HC screening

Adrian Ortiz, Hanyu Peng · Cephalgo SAS

Evaluated on 28 PD cases (mean age 67.2) and 37 healthy controls (mean age 48.3); this public dataset has a known age imbalance between the two groups. All speech features are fully interpretable and traceable to specific acoustic and linguistic constructs.

AUC: 0.97
Balanced accuracy: 91.4%
Sensitivity: 88.5%
Specificity: 96.3%

Where this fits in PD trials

Four high-value entry points for PD-focused programs.

Progression in placebo arms

Natural-history data between clinic visits

Continuous high-frequency speech acquisition provides natural-history progression data that complements infrequent in-clinic UPDRS assessment.

Treatment-response exploration

Quantitative motor-speech change tracking

Tracking motor-speech changes following intervention, including in disease-modifying programs where conventional rating scales may be too coarse.

Decentralized & hybrid trials

Remote acquisition with reduced drift risk

Remote speech acquisition supports decentralized trial designs, with reduced preprocessing-drift risk relative to the pure deep-learning and feature-only baselines in our internal benchmark.

Motor + Cognitive combined

PD-MCI and PDD comorbidity signal

Speech captures both motor speech impairment and cognitive change — relevant for trials concerned with cognitive comorbidity (PD-MCI, PDD).

Collaboration formats

We are particularly interested in collaborations that include longitudinal data on disease-modifying therapies, where the value of stable remote measurement is highest.

Retrospective analysis

On legacy speech data from completed PD trials.

Exploratory endpoint integration

In active protocols.

Joint validation studies

On partner-curated longitudinal cohorts.

Contact

Let's discuss your PD program.

Tell us about the studies you're running and the questions you'd like speech to help answer.

Get in touch