With advances in genome sequencing technology, the life sciences industry is experiencing unprecedented growth in biomedical data. Between 100 million and 2 billion human genomes could be sequenced by 2025. Storing that sequencing data would require between 2 and 40 exabytes (one exabyte is 10^18 bytes). Analyzing such massive volumes of genome alignment data could take up to 10,000 trillion CPU hours.
Augmenting this massive explosion of genomics data is the digitalization of healthcare – including electronic medical records (EMRs), electronic case report forms (eCRFs) in clinical trials, and patient-reported outcomes from e-diaries, mobile apps, and wearables. All of it is rich biomedical data sitting in various data repositories, ripe for integration and processing.
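As a rough sanity check on the sequencing estimates above, the short Python sketch below divides each storage figure by the matching genome count. It assumes the low storage estimate pairs with the low genome count and the high with the high; the function name and constants are illustrative only.

```python
EXABYTE = 10**18   # bytes in one exabyte
GIGABYTE = 10**9   # bytes in one gigabyte


def storage_per_genome_gb(total_exabytes, genomes):
    """Average storage per genome, in gigabytes, implied by the totals."""
    return total_exabytes * EXABYTE / genomes / GIGABYTE


# Assumed pairing: 2 EB with 100 million genomes, 40 EB with 2 billion genomes.
low_estimate = storage_per_genome_gb(2, 100_000_000)
high_estimate = storage_per_genome_gb(40, 2_000_000_000)

print(f"Low scenario:  {low_estimate:.0f} GB per genome")
print(f"High scenario: {high_estimate:.0f} GB per genome")
```

Under that pairing, both ends of the range work out to about 20 GB per genome, so the two estimates are consistent with each other.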
Realizing value from biomedical big data
Given its volume and complexity, all this biomedical data is akin to dark data locked away in different silos. Without a comprehensive, holistic view of it, stakeholders along the healthcare continuum find it difficult to extract insights, even though those insights could lead to breakthroughs in research and development as well as other medical advancements.
Making sense of this data involves three key steps:
- Automate the integration of both structured and unstructured data from a variety of disparate sources into one clinical data warehouse. Clinical data warehouses help optimize internal resources, standardize data exchange, and streamline data analyses between stakeholders. A critical requirement for this step is that systems are interoperable and that data can be compressed with no loss of data integrity.
- Ensure data privacy and security are not compromised, especially when storing and transmitting sensitive patient data. This is possible with role-based authorization, encryption, and anonymization of data both on premises and in the cloud.
- Connect the dots with powerful data visualization, analytics, and reporting tools to glean meaningful and actionable insights. These tools need to be flexible and efficient enough to produce near real-time results that cater to the needs of different stakeholders, whether researchers, biostatisticians, medical writers, or business users.
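To make the first two steps more concrete, here is a minimal Python sketch using only the standard library. It merges rows from two hypothetical sources into one record per patient, replaces the patient identifier with a keyed hash, and filters fields by role. Every name in it (`SECRET_KEY`, `ROLE_FIELDS`, `pseudonymize`, `integrate`, `view_for`) and all sample rows are assumptions made up for illustration, not part of any SAP product. Keyed hashing is pseudonymization rather than full anonymization, and encryption in storage and transit is left out for brevity.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key management system.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical mapping of roles to the fields each role may see.
ROLE_FIELDS = {
    "researcher": {"pseudonym", "diagnosis", "variant"},
    "business_user": {"pseudonym", "diagnosis"},
}


def pseudonymize(patient_id):
    """Replace a patient identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def integrate(*sources):
    """Merge rows from several sources into one record per pseudonymized patient."""
    merged = {}
    for rows in sources:
        for row in rows:
            pseudonym = pseudonymize(row["patient_id"])
            record = merged.setdefault(pseudonym, {"pseudonym": pseudonym})
            # Copy every field except the raw identifier.
            record.update({k: v for k, v in row.items() if k != "patient_id"})
    return list(merged.values())


def view_for(role, records):
    """Return only the fields the given role is allowed to see."""
    allowed = ROLE_FIELDS.get(role, set())
    return [{k: v for k, v in r.items() if k in allowed} for r in records]


# Made-up sample rows standing in for an EMR extract and a genomics extract.
emr_rows = [{"patient_id": "P001", "diagnosis": "example-diagnosis"}]
genomic_rows = [{"patient_id": "P001", "variant": "example-variant"}]

warehouse = integrate(emr_rows, genomic_rows)
print(view_for("business_user", warehouse))
```

Because the raw identifier is dropped at integration time, downstream reporting tools only ever see the pseudonym, and an unknown role receives no fields at all.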
More volume, complexity – and opportunity
Over the next decade, even more data is expected to enter the mix. In addition to genome sequencing, advances in ingestible biosensors, nanotechnology, and pharmacogenomics will fuel further growth in biomedical big data. Solutions that provide robust, comprehensive, and secure data storage with near real-time answers to research questions will help harness this big data potential.
Companies like SAP are committed to providing innovative solutions that help researchers and physicians keep up with the deluge and derive value from it, advancing research and development and ultimately improving health outcomes. For more information, visit our SAP Personalized Medicine hub and continue this discussion by following @SAP_Healthcare on Twitter.