Architecting the future of predictive genomics – a blueprint for technology

Photo by Adi Goldstein on Unsplash

This blog is the third in a series from Zetta Genomics CEO, Mark Bailley, on the opportunities of predictive medicine. Parts one and two looked at health systems and economies, and public education. Part three turns to the challenges of technology.

Genomics has the potential to deliver predictive medicine that democratises healthcare, creates viable health economies and improves the health outcomes of billions of people across the globe. Genomic technologies will be a key enabler in turning this potential into reality. Yet not all genomic technologies are created equal. While some of the more headline-grabbing fields have been the focus of significant investment, others have not. Genomic data management has been something of a Cinderella technology – often overlooked and under-resourced – but if we are to realise genomics’ potential, genomic data must go to the ball.



It’s worth remembering that it’s only been 20 years since the first human genome was sequenced. A decade later, the global total could probably still be counted in the low thousands.

Everything changed in 2013 when Genomics England committed to sequence a then unprecedented 100,000 genomes within five years. It was a moonshot that demanded the industrialisation of what had previously been a cottage industry.   

Sequencing technology has been a key beneficiary, with significant investment driving growing scale and shrinking costs. Today, the latest sequencing machines cost around US$1 million, while a genome can be sequenced for as little as US$200. As a result, the global store of genomes is now numbered in the tens of millions. As health systems around the world embark on large-scale sequencing projects such as neonatal testing, this figure will rise exponentially.


The data challenge 

To truly harness predictive genomic medicine at population scale, however, we also need a fundamental transformation of the technologies that allow us to manage and analyse data at speed and scale.

As Genomics England worked towards 100,000 genomes, it realised that genomic data management needed an overhaul. While contemporary data technologies could just about cope with tens or hundreds of genomes, they were woefully inadequate when faced with the population-level demands of tens or hundreds of millions.

One of the primary barriers is that pre-genomic health data, such as a biopsy result, was – and only needed to be – a snapshot of a moment in time. Genomic data, by contrast, is dynamic: it requires constant access and reanalysis as our understanding of the genome grows and as diseases progress.
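The distinction can be sketched in a few lines of code. This is an illustrative toy model only – the variant names and the `reanalyse` function are hypothetical, not any real genomics API – but it shows why a genome sequenced once keeps yielding new findings as the shared knowledge base evolves:

```python
def reanalyse(patient_variants, knowledge_base):
    """Return the stored variants currently classified as pathogenic.

    The patient's data never changes; only the knowledge base does.
    """
    return [v for v in patient_variants
            if knowledge_base.get(v) == "pathogenic"]

# A patient's variants are sequenced once and stored...
stored_variants = ["BRCA2:c.1310_1313del", "TP53:c.215C>G",
                   "CFTR:c.1521_1523del"]

# ...but the classification of each variant evolves over time.
kb_2020 = {"CFTR:c.1521_1523del": "pathogenic"}
kb_2024 = {"CFTR:c.1521_1523del": "pathogenic",
           "BRCA2:c.1310_1313del": "pathogenic"}  # reclassified later

print(reanalyse(stored_variants, kb_2020))  # one finding
print(reanalyse(stored_variants, kb_2024))  # two findings: same data, new insight
```

A snapshot-style system would have reported the 2020 result and filed the case away; a dynamic platform re-runs this kind of check continually, surfacing the 2024 finding without re-sequencing the patient.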

Today, this lack of a genomic-native data platform hampers health systems as they try to embed genomics in routine care. Identifying variants that could provide a diagnosis, for example, still relies on largely manual processes that absorb huge amounts of time from highly skilled specialists such as clinical scientists and bioinformaticians. While genomics races into the 21st century, its data foundations languish in the 20th. 


The data opportunity 

Genomics England and the University of Cambridge accepted the initial data challenge, harnessing the expertise of the global open-source community to sweep away the old flat-file systems.

The technologies that they pioneered are advancing fast. Researchers and clinicians can now access genomic-native, highly automated tertiary analysis platforms to store, aggregate, retrieve, annotate, interpret and continually reanalyse data – in real time at the laboratory bench or the patient’s bedside.  

Where scale once hampered genomics, these technologies become more powerful as data volumes grow. That means, for example, that analysis times can be cut from weeks or months to days, hours or even minutes. Now we can increase diagnostic yields and end diagnostic odysseys for many more patients.

Finding their way into clinical settings, these data management platforms are already making an impact.  

Dr Caoimhe McKenna, a clinical genetics doctor at the Northern Ireland Regional Genomics Service, now uses our XetaBase technologies. She said, “Variation in the human genome is vast, so I need a solution to help me find the needle in the haystack. XetaBase is unique – a technology that allows me to visualise patient variants in ways that are user friendly and intuitive. The platform incorporates variant prioritisation tools such as Exomiser, alongside easy links to browsers such as DECIPHER. Without XetaBase, this process is slow and laborious – with XetaBase I can find that needle and identify diagnoses faster.”

Today, genomic success is often measured in numbers of genomes sequenced. Yet, what benefits do the big numbers bring if we can’t transform the data they capture into actionable insight and improved patient outcomes?  

The challenge for genomic medicine is to architect a future where technologies advance in step – where vast libraries of sequences are met with a greater capacity to store, analyse, interpret – and apply – the treasure trove of data that they hold. 

Using next generation genomic data technologies, we can amplify insight, drive discovery and accelerate the introduction of precision and predictive medicine at scale.