Evidence & Methodology

What TDS Measures

The design questions that decide whether a trial can see its own signal

Every protocol is evaluated across four temporal dimensions of its design. The result is the Temporal Design Score (TDS), a single read on how well a study's measurement architecture matches the temporal structure of the disease it is studying.

Is measurement happening when the biology actually changes?

Whether the protocol's visit schedule and assay timing land at the moments when meaningful biological change is expected — rather than at intervals chosen for operational convenience.

Does the design see across the timescales that matter?

Whether biology unfolding over each relevant time horizon gets a chance to show up in the data, rather than only the parts that are easiest to capture.

Is the data rich enough to tell each patient's story?

Whether measurements are dense enough to understand how individual patients are actually responding — not just group averages that can hide the signal.

Does the design capture change as it happens?

Whether the protocol is built to observe how the biology is moving over time, rather than inferring it after the fact from a few scattered snapshots.

The same framework applies at every stage of development, including preclinical study design.

Triage Patterns

Six timepoint mismatches we see again and again

Most Phase 2 protocols are operationally sound and still measure the wrong thing at the wrong moment. The biology moves on one clock; the visit schedule runs on another. When those clocks don't line up, a trial can read as a drug failure when it was really a measurement failure. The composites below are anonymized patterns drawn from publicly registered trials that did not meet their endpoints — the kinds of gaps a Trial Readiness assessment surfaces during triage, before a single patient is enrolled.

Immuno-oncology · Phase 2

Reading the result before the mechanism fires

A checkpoint-inhibitor protocol assessed tumor response on a conventional imaging cadence. The relevant immune activation occurred weeks earlier and was never measured — so early responders and true non-responders looked identical at the only timepoint that counted.

Mismatch: primary readout scheduled long after the biological event it was meant to capture.
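One simple screen for this pattern is to ask whether any scheduled assessment falls inside the window in which the biological event is expected. The sketch below is illustrative only: the function name, the week-based units, and the example schedule and window are assumptions, not values from any trial described here.

```python
def visits_in_window(visit_weeks, window_start, window_end):
    """Return the scheduled visits that fall inside the expected biological window."""
    return [w for w in visit_weeks if window_start <= w <= window_end]


# Hypothetical imaging-only schedule (weeks) and an assumed early activation window.
imaging_schedule = [0, 12, 24]
covered = visits_in_window(imaging_schedule, window_start=2, window_end=6)

if not covered:
    print("No assessment lands inside the expected window")
```

An empty result means the schedule never observes the window at all, which is the situation this composite describes.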

Metabolic · Phase 2

Endpoint timed to the clinic, not the disease

Visits fell at weeks 4, 8 and 12 “because that's when patients come in.” The effect being studied stabilized on a slower arc, so the final measurement landed in a transitional window that flattered the placebo arm and muddied the signal.

Mismatch: sampling cadence inherited from logistics rather than process duration.

CNS / Neuro · Phase 2

Measuring position, never velocity

The protocol captured a single severity score at baseline and at end-of-study. With only two points, rate of change — the variable most associated with durable benefit — could not be estimated at all.

Mismatch: too few timepoints to resolve a trajectory, only a start and an end.
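A short sketch makes the limitation concrete. A straight line always passes exactly through two points, so a baseline-plus-end design leaves zero residual degrees of freedom: there is nothing left over to judge how uncertain a per-patient slope is or whether the trajectory was linear at all. The function and the severity scores below are hypothetical and are not drawn from any trial.

```python
def ols_slope(times, scores):
    """Least-squares slope of score against time, plus residual degrees of freedom."""
    n = len(times)
    if n < 2 or len(scores) != n:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_s = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("timepoints must not all be identical")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, scores))
    slope = sxy / sxx
    residual_df = n - 2  # zero when only baseline and end-of-study exist
    return slope, residual_df


# Hypothetical severity scores for one patient, indexed by study week.
two_point = ols_slope([0, 24], [30.0, 24.0])
five_point = ols_slope([0, 6, 12, 18, 24], [30.0, 29.0, 27.0, 25.5, 24.0])
```

Here the two-point design returns a slope of -0.25 points per week with 0 residual degrees of freedom, while the five-point design returns a similar slope with 3, enough to begin estimating how well a line describes the patient's course.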

Oncology · Phase 2

A single scale, no mechanistic bridge

Only a late-stage anatomic measure was collected. With no early molecular or functional readouts in between, there was no way to tell a non-responding patient from one whose mechanism was working but whose anatomy hadn't caught up.

Mismatch: coverage concentrated on one timescale, leaving the mechanistic middle blank.

Infectious disease · Phase 2

Sampling slower than the process moves

The biological process turned over faster than the gap between visits. By the time the next sample was drawn, the informative window had already opened and closed — so the curve was reconstructed from points too far apart to be reliable.

Mismatch: intervals wider than the dynamics they were meant to capture.
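A minimal check for this mismatch compares each gap between consecutive visits with the timescale of the process being measured. The sketch below assumes a hypothetical schedule in study days and a placeholder turnover time of 7 days; in practice the threshold would have to come from the biology, not from this example.

```python
def gaps_wider_than(visit_days, max_gap_days):
    """Return (start, end) pairs of consecutive visits spaced more than max_gap_days apart."""
    ordered = sorted(visit_days)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap_days]


# Hypothetical schedule in study days and an assumed process turnover of about 7 days.
schedule = [0, 14, 28, 56]
assumed_turnover_days = 7
too_wide = gaps_wider_than(schedule, assumed_turnover_days)
```

With these assumed numbers every interval is flagged, since the shortest gap (14 days) is already twice the assumed turnover time.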

Autoimmune · Phase 2

Mechanism and outcome on different clocks

The mechanistic biomarker and the clinical endpoint were each sampled on schedules that never overlapped. The two could not be linked in time, so a clean biomarker signal could not be tied to — or used to interpret — the clinical result.

Mismatch: mechanism and outcome measured on uncoordinated timelines.
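Whether two schedules can be linked in time can be checked by pairing biomarker samples with endpoint assessments that fall within some tolerance of each other. The sketch below is a simplified illustration: the function name, the day-based units, the tolerance, and both schedules are assumptions made for the example.

```python
def paired_timepoints(biomarker_days, endpoint_days, tolerance_days=0):
    """Pair each biomarker sample with endpoint assessments within tolerance_days."""
    return [
        (b, e)
        for b in sorted(biomarker_days)
        for e in sorted(endpoint_days)
        if abs(b - e) <= tolerance_days
    ]


# Hypothetical, uncoordinated schedules in study days.
biomarker_days = [7, 21, 49]
endpoint_days = [0, 28, 84]
pairs = paired_timepoints(biomarker_days, endpoint_days, tolerance_days=3)
```

With a 3-day tolerance these two schedules produce no pairs, so no biomarker value can be read alongside a clinical assessment from the same period. Widening the tolerance recovers some pairs, at the cost of comparing measurements taken further apart.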

Every one of these reads as procedurally fine on paper. None is a dosing error or a statistical mistake — they're timing decisions that only look wrong once you map the schedule against the biology. That mapping is what a Trial Readiness assessment does before enrollment, when the schedule is still cheap to change.

Retrospective Evidence

TDS is associated with trial outcome across every indication tested

A retrospective analysis of 275,000 interventional trials from ClinicalTrials.gov found a statistically significant association between estimated temporal design quality and trial outcomes (p = 2.7 × 10⁻¹⁰¹; Cohen's d = 0.19 across all indications and d = 0.40 in oncology). The association replicates across 5 global regions and 10 therapeutic areas.
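For readers less familiar with the effect-size measure, Cohen's d is the difference between two group means divided by their pooled standard deviation. By Cohen's conventional benchmarks, a d near 0.2 is a small effect and a d near 0.5 a medium one, and with very large samples a small effect can still carry an extremely small p-value. The sketch below uses the textbook pooled-variance definition; how the retrospective analysis computed its values is not specified here, and the scores are toy numbers invented only to exercise the function.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


# Toy scores, invented purely for illustration (not TDS data).
successful = [52.0, 55.0, 58.0, 61.0]
failed = [50.0, 53.0, 56.0, 59.0]
d = cohens_d(successful, failed)
```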

Mean TDS Difference: Successful vs. Failed Trials

Indication            Difference    n
Oncology              +3.2 pts      6,978
Dose escalation       +2.3 pts      1,696
CNS                   +2.2 pts      1,132
Cardiovascular        +2.0 pts      1,356
Metabolic             +1.5 pts      864
Autoimmune            +1.4 pts      1,175
Respiratory           +1.4 pts      995
Infectious disease    +1.0 pts      970

What the data shows

Trajectory modeling is associated with higher success

Trials with sufficient per-patient data density to model individual trajectories succeeded at 58% vs. 27% for those relying on group-level analysis.
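The gap between the two success rates can be expressed in a few standard ways. The sketch below derives them from the two quoted percentages alone; it is unadjusted arithmetic, not a re-analysis, and the function name is an assumption made for the example.

```python
def compare_rates(rate_a, rate_b):
    """Absolute difference, ratio, and odds ratio for two success proportions."""
    if not (0 < rate_a < 1 and 0 < rate_b < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    difference = rate_a - rate_b
    ratio = rate_a / rate_b
    odds_ratio = (rate_a / (1 - rate_a)) / (rate_b / (1 - rate_b))
    return difference, ratio, odds_ratio


# The two success rates quoted above: 58% and 27%.
difference, ratio, odds_ratio = compare_rates(0.58, 0.27)
print(f"difference: {difference * 100:.0f} percentage points")
print(f"ratio: {ratio:.2f}, odds ratio: {odds_ratio:.2f}")
```

That works out to a difference of about 31 percentage points, a success-rate ratio of roughly 2.1, and an odds ratio of roughly 3.7. Like the underlying analysis, these figures describe an association and not a causal effect.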

Rate-of-change capture is the strongest associative factor

Whether a trial captures how fast things change — not just where they are at a given moment — is the strongest associative factor in 8 of 10 indications tested.

The association is specific to design quality

TDS is associated with Scientific and Design outcomes but not Operational or Commercial failures — the pattern expected if it captures temporal design adequacy rather than overall sponsor sophistication.

Replicates across regions

The TDS–outcome association holds across North America, Europe, Asia-Pacific, and multi-regional trials with no regional exceptions.

Validation Methodology

Signal Robustness Across Outcome Heterogeneity

TDS measures temporal design quality, meaning the timing and sequencing of data collection. It cannot tell you whether a molecule works; what it speaks to is whether a design is positioned to detect an effect if one is present. Trial Readiness therefore validates the framework against outcomes where design quality is informative. Registry-derived outcome classifications include both design-driven failures and efficacy failures, in which the drug or molecule itself lacked efficacy.

When validation is restricted to design-driven and operational failures, where temporal design has explanatory relevance, effect sizes strengthen. That pattern is consistent with a signal that is robust to outcome heterogeneity: mixing in failures that temporal design cannot explain dilutes the all-comers estimate. The reported effect sizes are therefore likely conservative, and the TDS signal in design-driven contexts is stronger than the all-comers estimates suggest.

Transparency

What we know and what we're still testing

What we've established

A retrospective association between estimated temporal design quality and trial outcomes across 275,000 interventional trials, with effect sizes ranging from d = 0.19 (all indications) to d = 0.40 (oncology). The association replicates across 5 regions, 10 therapeutic areas, and all sponsor types.

From association to prospective test

The retrospective analysis shows a clear and consistent association between temporal design quality and trial outcomes. The next phase tests prospectively whether protocols revised on TDS guidance carry that advantage forward — the Validation Partner program is running that now, with Phase 2 sponsors implementing TDS recommendations.

Strengthening through independent review

An inter-rater reliability study with independent expert reviewers is underway, corroborating TDS scoring against blinded expert assessment under a pre-specified analysis plan. The retrospective foundation supports use today; this work strengthens it further.

Addressing Protocol Burden

Stepwise triage: not every improvement requires more visits

Our recommendations are organized into three operational tiers so sponsors implement only what fits their constraints.

Tier 1

Zero Additional Patient Burden

Use specimens already collected. Add statistical analyses to existing data. Recompute existing measurements differently.

Tier 2

New Assays at Existing Visits

One additional tube at an existing draw, or running an extra panel on an existing biopsy.

Tier 3

New Visit Windows

New timepoints, additional imaging, or on-treatment sampling — the highest-impact changes, for teams with room to add them.

Many of the most valuable improvements fall in Tier 1 — gains available at near-zero incremental cost, using data a trial is already collecting.
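The tiering logic can be sketched as a simple rule: a recommendation is assigned the highest burden it requires. The function, its parameters, and the example recommendations below are hypothetical and simplify how an actual assessment would classify changes.

```python
def burden_tier(needs_new_visit=False, needs_new_assay=False):
    """Map a recommendation to tier 1, 2, or 3 by the highest burden it requires."""
    if needs_new_visit:
        return 3  # new visit windows or timepoints
    if needs_new_assay:
        return 2  # new assay at an existing visit
    return 1  # reuses data or specimens already collected


# Hypothetical recommendations, loosely mirroring the tier descriptions above.
recommendations = [
    ("Reanalyze stored specimens", {}),
    ("Run an extra panel on an existing biopsy", {"needs_new_assay": True}),
    ("Add an on-treatment sampling visit", {"needs_new_visit": True, "needs_new_assay": True}),
]
tiers = [burden_tier(**flags) for _, flags in recommendations]
```

Sorting recommendations by tier gives a sponsor the lowest-burden changes first, which matches the stepwise approach described in this section.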

References & Sources

TDS retrospective analysis: Scientari LLC, 275,000 interventional trials from ClinicalTrials.gov. Association is retrospective and correlational.