Long-Read Sequencing Is Finally Moving Into the Clinic

The case for long-read sequencing in the clinic has been clear for a long time. The problem was that clear arguments and economic reality do not always move at the same speed.

Short-read platforms, where reads typically run 150-300 base pairs, have powered genomics since the mid-2000s and clinical testing for more than a decade. They are cheap and fast, and the bioinformatics pipelines supporting them are mature. For the vast majority of clinical use cases, particularly small variant detection in well-characterized disease genes, short-read sequencing works well enough that switching costs have been hard to justify.

Long-read platforms produce reads of 10,000 base pairs or more, sometimes exceeding 100,000. That difference matters for a specific but important class of genomic problems: structural variants, repeat expansions, phasing of compound heterozygotes, full-length transcript isoforms, and methylation detection in a single pass. These are exactly the problems that remain unsolved in a meaningful fraction of patients who undergo standard short-read whole genome sequencing and come back with no diagnosis.

The Diagnostic Yield Problem

The diagnostic yield of short-read whole genome sequencing in rare disease is typically cited at 25-40%, depending on the indication and the phenotypic precision of the cohort. That number has not moved substantially in five years despite improvements in variant interpretation. For a meaningful share of the remaining 60-75% of cases, better bioinformatics will not help: the diagnosis depends on data that short-read sequencing structurally cannot generate.

Structural variants (large deletions, duplications, inversions, and translocations) account for a meaningful fraction of the rare disease diagnoses that short-read sequencing routinely misses or mischaracterizes. Repeat expansions are another major category. More than 50 neurological and neuromuscular disorders are caused by repeat expansions, and most of those expansions are difficult or impossible to resolve on short-read platforms because the repeat tract is often longer than the reads themselves. Long-read sequencing handles both far more reliably, since a single read can span the entire event.

In our experience evaluating long-read clinical studies, the incremental diagnostic yield over short-read in undiagnosed rare disease cohorts ranges from 8% to 17% — a number that sounds modest until you consider what a diagnosis means to a patient who has spent years without one.
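
To put that range in concrete terms, here is a back-of-envelope sketch in Python. It assumes a hypothetical cohort of 1,000 patients and assumes the 8-17% incremental yield applies to the patients left undiagnosed after short-read WGS. Both assumptions are ours for illustration, and the function name is made up for this example.

```python
def additional_diagnoses(cohort_size, short_read_yield, incremental_yield):
    """Return (undiagnosed after short-read, extra diagnoses from long-read)."""
    undiagnosed = cohort_size * (1 - short_read_yield)
    extra = undiagnosed * incremental_yield
    return undiagnosed, extra


COHORT = 1000  # hypothetical cohort size, chosen only for illustration

# Bounds quoted in the text: short-read yield 25-40%, incremental yield 8-17%.
# Assumption: the incremental yield applies to the still-undiagnosed patients.
low = additional_diagnoses(COHORT, 0.40, 0.08)   # fewest patients left, lowest yield
high = additional_diagnoses(COHORT, 0.25, 0.17)  # most patients left, highest yield

print(f"Low case:  {low[1]:.1f} extra diagnoses among {low[0]:.0f} undiagnosed")
print(f"High case: {high[1]:.1f} extra diagnoses among {high[0]:.0f} undiagnosed")
```

Under those assumptions, the range works out to roughly 50 to 130 additional diagnoses for every 1,000 patients who enter short-read testing.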

What Changed on the Economics

Two years ago, the cost per whole genome on long-read platforms was roughly 10-15 times higher than short-read. The turnaround time was also significantly longer. Both have compressed substantially. Current pricing for clinical long-read WGS is approaching 3-4 times short-read cost, and sequencing run times have dropped to the point where same-day sequencing and analysis is feasible in urgent clinical settings like NICU applications.
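
One rough way to see why the compression matters is to express long-read spend per additional diagnosis in units of one short-read genome. The sketch below is an illustration under simplifying assumptions, not a health economic model: it assumes every patient in an undiagnosed cohort is re-sequenced on long-read, uses an incremental yield of 8% to 17%, and ignores interpretation, confirmation, and downstream care costs. The function and variable names are hypothetical.

```python
def cost_per_extra_diagnosis(cost_multiple, incremental_yield):
    """Long-read spend per additional diagnosis, in short-read-genome units.

    cost_multiple: long-read cost per genome as a multiple of short-read cost.
    incremental_yield: fraction of re-sequenced patients who gain a diagnosis.
    """
    return cost_multiple / incremental_yield


YIELDS = (0.08, 0.17)  # assumed incremental yield range in undiagnosed cohorts

# Cost multiples quoted in the text (long-read relative to short-read).
SCENARIOS = {
    "two years ago (10-15x)": (10, 15),
    "current (3-4x)": (3, 4),
}

for label, (low_multiple, high_multiple) in SCENARIOS.items():
    best = cost_per_extra_diagnosis(low_multiple, max(YIELDS))
    worst = cost_per_extra_diagnosis(high_multiple, min(YIELDS))
    print(f"{label}: {best:.1f} to {worst:.1f} short-read genomes per extra diagnosis")
```

On these assumptions, the spend per additional diagnosis falls from roughly 60 to 190 short-read genome equivalents to roughly 18 to 50.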

The platform fragmentation is also resolving. There are now two dominant long-read platforms with credible clinical-grade workflows: nanopore sequencing, which produces very long reads with native methylation detection but requires careful quality control, and SMRT-based sequencing, which offers higher accuracy per read with strong indel performance. Both have made substantial investments in clinical software, quality systems, and laboratory information management integration over the last three years.

The bioinformatics gap — which was genuinely a problem as recently as 2022 — has largely closed for core clinical applications. Structural variant calling algorithms, repeat expansion detection, and methylation analysis pipelines are now sufficiently validated that clinical laboratories can deploy them with acceptable false positive rates.

Where Clinical Adoption Is Actually Happening

The adoption curve is not uniform. Three areas are moving fastest.

Rare and undiagnosed disease is the leading edge. The economics are favorable because the patient population is high-acuity, prior short-read testing has often already been performed and returned negative, and the incremental diagnostic yield justifies premium pricing. Several large pediatric medical centers have now published clinical validation data demonstrating long-read outperformance in specific diagnostic categories.

NICU genomics is a second area of rapid movement. Critically ill neonates with possible genetic diagnoses represent a time-sensitive clinical problem where turnaround time directly affects treatment decisions. Rapid WGS programs, most of them currently built on short-read, are beginning to evaluate long-read as run times and costs improve. The case for a 12-hour long-read WGS turnaround in the NICU is compelling if it closes the diagnostic gap meaningfully.

Pharmacogenomics is a third area that receives less attention but has real near-term commercial logic. Many clinically relevant pharmacogenomic variants — particularly in CYP2D6 and other complex loci with structural variation — are not reliably genotyped by short-read. Long-read resolves these cleanly and could enable more confident drug dosing decisions in polypharmacy populations.

Investment Implications

We have watched the long-read clinical thesis develop for several years. The platform companies themselves are public and priced accordingly. The more interesting private investment landscape is in the application layer: clinical software built specifically for long-read data interpretation, laboratory services businesses with long-read-first infrastructure, and disease-specific testing companies that can build proprietary clinical validation datasets for specific indications.

The companies we find most credible in this space are not the ones claiming long-read will replace short-read across all clinical sequencing. That is unlikely in the near term and probably not the right frame. The more defensible position is that long-read is the right tool for specific problem classes, and building around those specific problem classes — rare disease, complex pharmacogenomics, certain oncology applications — creates a more focused and fundable thesis than "long-read for everything."

The clinical transition is happening. The pace will not be uniform across applications, health systems, or geographies. But the questions we were asking two years ago — whether long-read can ever reach clinical cost competitiveness, whether the bioinformatics would mature — are largely answered. The current questions are about reimbursement pathways, laboratory accreditation timelines, and which applications have the clearest health economic case to make to payers. Those are solvable problems.
