
Healthcare Big Data and the Promise of Value-Based Care

Big Data is essential to every significant healthcare undertaking. Read about the challenges, applications, and potential brilliant future for healthcare big data.
NEJM Catalyst
January 1, 2018
This article appeared in NEJM Catalyst prior to the launch of the NEJM Catalyst Innovations in Care Delivery journal.
What Is Big Data in Healthcare?
“Big data in healthcare” refers to the abundant health data amassed from numerous sources including electronic health records (EHRs), medical imaging, genomic sequencing, payor records, pharmaceutical research, wearables, and medical devices, to name a few. Three characteristics distinguish it from traditional electronic medical and human health data used for decision-making: It is available in extraordinarily high volume; it moves at high velocity and spans the health industry’s massive digital universe; and, because it derives from many sources, it is highly variable in structure and nature. These characteristics (volume, velocity, and variety) are known as the 3Vs of Big Data.
Figure 1. Sources of Big Data in Healthcare: EHRs, payer records, smart devices, genetic databases, and the government.
Because big healthcare data is so diverse in format, type, and context, it is difficult to merge into conventional databases. That makes it enormously challenging to process, and hard for industry leaders to harness its significant promise to transform the industry.
Despite these challenges, several new technological improvements are allowing healthcare big data to be converted to useful, actionable information. By leveraging appropriate software tools, big data is informing the movement toward value-based healthcare and is opening the door to remarkable advancements, even while reducing costs. With the wealth of information that healthcare data analytics provides, caregivers and administrators can now make better medical and financial decisions while still delivering an ever-increasing quality of patient care.
But adoption of big data analysis in healthcare has lagged behind other industries due to challenges such as privacy of health information, security, siloed data, and budget constraints. In the meantime, 80 percent of executives from financial services, insurance, media, entertainment, manufacturing, and logistics companies surveyed report their investments in big data processing as “successful,” and more than one in five declare their big data initiatives have been “transformational” for their firms.
There are at least two trends today that encourage the healthcare industry to embrace big data. The first is the aforementioned move from a fee-for-service model, which financially rewards caregivers for performing procedures, to a value-based care model, which rewards them based on the health of their patient populations. Healthcare data analytics makes it possible to measure and track population health, and so makes this switch practical. The second trend involves using big data analysis to deliver information that is evidence-based and will, over time, increase efficiencies and help sharpen our understanding of the best practices associated with any disease, injury or illness.
Undoubtedly, adopting the use of healthcare big data can transform the industry, driving it away from a fee-for-service model toward value-based care. In short, it can deliver on the promise of lowering healthcare costs while revealing ways to deliver superior patient experiences, treatments, and outcomes.
Applications for Big Data in Healthcare
Keeping patients healthy and avoiding illness and disease stands at the front of any priority list. Consumer products like the Fitbit activity tracker and the Apple Watch keep tabs on the physical activity levels of individuals and can also report on specific health-related trends. The resulting data is already being sent to cloud servers, providing information to physicians who use it as part of their overall health and wellness programs.
Already, Fitbit has partnered with UnitedHealthcare, which rewards insured members with up to $1,500 per year for exercising regularly. Informed Data Systems’ One Drop app for Android and Apple is bringing about dramatic changes in A1c for people with diabetes. Meanwhile, Apple’s HealthKit, CareKit, and ResearchKit leverage the technology embedded in Apple’s mobile devices to help patients manage their conditions and enable researchers to collect data from hundreds of millions of users worldwide.
Figure 2. Applications for Big Data in Healthcare: diagnostics, prevention, precision medicine, research, and reduced costs.
Expanding diagnostic service gives patients greater access to professional care. Apps for mobile devices, such as Aetna’s Triage, advise patients on their medical condition using aggregated data and can recommend patients seek medical care based on input to the app.
In yet another of its healthcare data initiatives, Apple has teamed up with researchers at Stanford to determine if the Apple Watch’s heart sensor can be used to detect atrial fibrillation, a condition that causes the death of approximately 130,000 Americans each year. If the device proves successful at spotting the malady, Apple can notify wearers that they need to seek medical attention.
Propeller Health uses a Bluetooth-enabled sensor that attaches to inhalers and spirometers for people with asthma or COPD. The company tracks the environmental conditions at sensor locations and sends reports to patients’ phones, so they can better understand the causes of their symptoms and take measures to prevent attacks. The company also sends reminders about when to take medications. With 34 peer-reviewed articles to date, Propeller reports patients are experiencing 79 percent fewer asthma attacks and are enjoying 50 percent more symptom-free days.
Reducing prescription errors improves outcomes and saves lives. According to the Network for Excellence in Health Innovation, prescription errors cost some $21 billion per year, affecting more than 7 million U.S. patients and leading to 7,000 deaths. Israeli startup MedAware is partnering with healthcare organizations to deploy its decision support tool, which uses big data to spot prescription errors before they occur.
Reducing costs. The greater insight that medical data gives physicians translates to better patient care, shorter hospital stays, and fewer admissions and re-admissions.
The Mayo Clinic uses big data analytics to identify patients with more than one chronic condition (comorbidity) who are likely to benefit from early interventions at care homes, thereby sparing them visits to the emergency department.
Knowledge derived from big data analysis gives healthcare providers clinical insights not otherwise available. It allows them to prescribe treatments and make clinical decisions with greater accuracy, eliminating the guesswork often involved in treatment, resulting in lower costs and enhanced patient care.
Analysis of healthcare big data also contributes to greater insight into patient cohorts that are at greatest risk for illness, thereby permitting a proactive approach to prevention. In short, analysis of healthcare big data can identify outlier patients who consume health services far beyond the norm. It can pinpoint protocols and processes that deliver substandard results or whose costs are excessive in contrast to outcomes. It can be used to educate, inform and motivate patients to take responsibility for their own wellness. By bringing financial and clinical data together, it can highlight efficiencies and effectiveness of treatment plans.
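As a rough illustration of how outlier patients might be identified, the sketch below flags patients whose yearly visit count sits far above the rest of the group, using Tukey's interquartile-range rule. The function name, the patient IDs, the visit counts, and the choice of this particular rule are all assumptions made for the example; the article does not say which method any organization actually uses.

```python
from statistics import quantiles


def flag_high_utilizers(annual_visits, k=1.5):
    """Return patient IDs whose visit count exceeds Q3 + k * IQR (Tukey's rule).

    `annual_visits` maps a patient ID to a yearly visit count. Both the
    structure and the threshold rule are illustrative assumptions.
    """
    counts = list(annual_visits.values())
    q1, _, q3 = quantiles(counts, n=4)  # quartile cut points
    threshold = q3 + k * (q3 - q1)
    return sorted(pid for pid, n in annual_visits.items() if n > threshold)


# Made-up visit counts, for illustration only.
visits = {
    "p01": 2, "p02": 3, "p03": 1, "p04": 4,
    "p05": 2, "p06": 3, "p07": 2, "p08": 25,
}
print(flag_high_utilizers(visits))  # ['p08']
```

A rule like this only surfaces candidates for review. Whether a flagged patient's utilization is actually avoidable is a clinical question the numbers alone cannot answer.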
Healthcare Big Data Lakes Become “Oceans”
Just as a researcher prefers to work with sample sizes of, say, millions of values rather than only hundreds, the more information contained in a big data sample, the better. While the term “data lake” is often used to describe a collection of raw big data, several events are underway that promise to build what might be called “data oceans” brimming with research and analysis opportunities.
Researchers and funding agencies recognize the benefit of integrating and sharing clinical research data to fill such “oceans.” For example, the Li Ka Shing Centre for Health Information and Discovery of the University of Oxford provides access to the UK Biobank and plans to add 50 million electronic patient records. In addition:
The European Medical Information Framework (EMIF) aims to improve access to health data derived from the electronic health records of some 50 million Europeans, as well as cohort datasets from participating research communities.
Open PHACTS is a platform for researchers and others who need access to pharmacological data. It was built in cooperation with academic and commercial organizations and allows users to extract information and make decisions on complex pharmacologic matters.
A division of the Dutch multi-national company, N.V. Philips, has aggregated more than 15 petabytes of data taken from 390 million medical records, patient inputs, and imaging studies. Healthcare personnel can access this massive collection to obtain critical data for informing the clinical decision-making process.
In the U.S., the National Institutes of Health established the Big Data to Knowledge (BD2K) program, designed to bring biomedical big data to researchers, clinicians, and others. Initiatives such as these will increasingly empower healthcare providers to improve patient care while simultaneously countering the unsustainable cost trajectory. They will also provide researchers with a rich universe of accessible data and information for disease prevention and cure.
Challenges for Implementing Big Data in Healthcare
Healthcare organizations face challenges with healthcare data that fall into several major categories including data aggregation, policy and process, and management. Let’s explore these further.
Data Aggregation Challenges. First, patient and financial data are often spread across many payors, hospitals, administrative offices, government agencies, servers and file cabinets. Pulling it together and arranging for all data producers to collaborate in the future as new data is produced requires a lot of planning. In addition, every participating organization must understand and agree upon the types and formats of big data they intend to analyze.  Looking beyond issues of the format in which it is stored (paper, film, traditional databases, EHR, etc.), the accuracy and quality of such data need to be established. This requires not only data cleansing (usually a largely manual process), but also a review of data governance: Was the data recorded accurately, or have errors crept in, perhaps over a period of many years?
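To make the aggregation problem concrete, here is a minimal sketch of agreeing on a shared format and then checking data quality across sources. It maps records from two sources onto one schema and reports patients whose birth dates disagree. The field names (`member_id`, `dob`, `mrn`, `birthDate`), the two source layouts, and the assumption that both sources share a patient identifier are all hypothetical; real-world patient matching is usually much harder.

```python
from datetime import datetime


def normalize_payer(rec):
    # Hypothetical payer layout: {"member_id": "A1", "dob": "03/15/1970"}
    return {
        "patient_id": rec["member_id"].strip().upper(),
        "birth_date": datetime.strptime(rec["dob"], "%m/%d/%Y").date(),
    }


def normalize_ehr(rec):
    # Hypothetical EHR layout: {"mrn": "a1", "birthDate": "1970-03-15"}
    return {
        "patient_id": rec["mrn"].strip().upper(),
        "birth_date": datetime.strptime(rec["birthDate"], "%Y-%m-%d").date(),
    }


def merge_sources(payer_rows, ehr_rows):
    """Merge both sources by patient ID and list IDs with conflicting birth dates."""
    merged, conflicts = {}, []
    normalized = [normalize_payer(r) for r in payer_rows]
    normalized += [normalize_ehr(r) for r in ehr_rows]
    for rec in normalized:
        # Keep the first record seen for each patient; compare later ones to it.
        existing = merged.setdefault(rec["patient_id"], rec)
        if existing["birth_date"] != rec["birth_date"]:
            conflicts.append(rec["patient_id"])
    return merged, conflicts


# Made-up records, for illustration only.
payer = [{"member_id": " a1 ", "dob": "03/15/1970"},
         {"member_id": "B2", "dob": "07/04/1985"}]
ehr = [{"mrn": "A1", "birthDate": "1970-03-15"},
       {"mrn": "b2", "birthDate": "1985-04-07"}]

merged, conflicts = merge_sources(payer, ehr)
print(sorted(merged))  # ['A1', 'B2']
print(conflicts)       # ['B2']
```

The conflict list is where the manual cleansing described above would begin: the code can detect that two sources disagree, but not which one recorded the date correctly.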
Policy and Process Challenges. Once data is validated and aggregated, various process- and policy-related issues need to be addressed. The HIPAA regulations demand that policies and procedures protect health information. Access control, authentication, security during transmission, and other rules complicate the task. This multifaceted issue has been solved to some extent by cloud service providers, perhaps most notably Amazon Web Services (AWS), which offers cloud services designed to support HIPAA compliance and the handling of protected health information (PHI).
Management Challenges. Finally, realizing the promises of big data analytics in healthcare requires organizations to adjust their ways of doing business. Data scientists will likely be needed along with IT staff that have the required skills to run the analytics. Some organizations may struggle with the thought of having to “rip and replace” much of their IT infrastructure, although cloud service providers mitigate some of those concerns. Physicians and administrators may need time before they trust the heretofore unseen advice big data can provide.
The Brilliant Future for Big Data in Healthcare
Just as executives in commerce and industrial sectors declare their big data initiatives have been successful and transformational, the outlook for healthcare is even more exciting. Below are a few areas where big data is destined to transform healthcare.
Precision medicine, as envisioned by the National Institutes of Health, is the focus of the All of Us research program, which seeks to enroll one million people who volunteer their health information. That program is part of the NIH Precision Medicine Initiative. According to the NIH, the initiative intends to “understand how a person’s genetics, environment, and lifestyle can help determine the best approach to prevent or treat disease. The long-term goals of the Precision Medicine Initiative focus on bringing precision medicine to all areas of health and healthcare on a large scale.”
Wearables and IoT sensors, already noted above, have the potential to revolutionize healthcare for many patient populations—and to help people remain healthy. A wearable device or sensor may one day provide a direct, real-time feed to a patient’s electronic health records, which allows medical staff to monitor and then consult with the patient, either face-to-face or remotely.
Machine learning, a component of artificial intelligence and one that depends on big data, is already helping physicians improve patient care. IBM, with its Watson Health computer system, has already partnered with Mayo Clinic, CVS Health, Memorial Sloan Kettering Cancer Center, and others. Machine learning, together with healthcare big data analytics, multiplies caregivers’ ability to enhance patient care.
Fueling the Big Data Healthcare Revolution
Big data is just beginning to revolutionize healthcare and move the industry forward on many fronts. The changes in medicine, technology, and financing that big data in healthcare promises offer solutions that improve patient care and drive value in healthcare organizations. But realizing them will require stakeholders (providers, payers, pharmaceutical producers, government and policymakers, and the scientific and research communities) to collaborate and innovate to reinvent the design and performance of their systems. They must build the technological infrastructure to house and converge the massive volume of healthcare data, which industry analysts estimate will grow to 2,314 exabytes by 2020. Furthermore, they need to invest in the human capital (IT experts, data scientists, data architects, and big data engineers) to guide us into this new and exciting frontier of human health and well-being.

Published online: January 1, 2018
Published in issue: February 13, 2018



