Across all medical specialties, there is a severe lack of high-quality clinical evidence, in part because the gold standard for evidence is large-scale randomized controlled clinical trials. Such trials are on an unsustainable cost trajectory, as they require expensive, stand-alone data capture infrastructures. Furthermore, they typically enroll highly selected populations that are not necessarily representative of real-world patients. Although the emergence of the electronic health record (EHR) holds great promise for generating much-needed evidence, medical research lags far behind other industries in its ability to use big data to deliver the answers that decision makers in health care need. The ability to harness good-quality, usable data from EHRs will likely be as revolutionary to health care as the Internet was to other industries.
The problem is complex, and one facet of the issue is that data from health systems are not interoperable; for example, information such as date of birth, blood pressure, or diagnoses can be recorded in a myriad of ways. Although the Centers for Medicare and Medicaid Services encourages and incentivizes “meaningful use” of EHRs, these systems are customizable to each institution’s needs, and as a result, data from individual health care systems and providers are housed in silos of babel, with limited ability to exchange information among them. Compounding the issue, most organizations erected proprietary systems of digital health data capture before standardized formats were developed and before thoughtful consideration of reusing these data for research gained traction. As a result, it has been infeasible to ask questions as seemingly simple and important as “Which dose of aspirin is associated with better outcomes?”
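To make the interoperability problem concrete, consider a data element as simple as date of birth: one system may export it as “1970-03-05,” another as “03/05/1970,” and a third as “March 5, 1970,” so records for the same patient cannot be linked until the values are normalized. The following Python sketch illustrates the idea; the example strings and candidate formats are hypothetical, and a real EHR feed would require a site-specific format list.

```python
from datetime import datetime

# Hypothetical exports of the SAME date of birth from three EHR systems.
RAW_DOB = ["1970-03-05", "03/05/1970", "March 5, 1970"]

# Candidate formats to try in order; assumed for illustration only.
FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y"]

def normalize_dob(raw: str) -> str:
    """Coerce a date-of-birth string to ISO 8601 (YYYY-MM-DD)."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"Unrecognized date format: {raw!r}")

# All three representations collapse to a single normalized value.
normalized = {normalize_dob(r) for r in RAW_DOB}
print(normalized)  # → {'1970-03-05'}
```

Standards such as HL7 FHIR aim to make this kind of ad hoc normalization unnecessary by prescribing a common representation at the point of exchange.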