Data Revolution Overview: Massive Data Access Initiatives in Health and Science Ramp Up

by Justin Q Taylor


If you work in the health or science sectors, you have probably noticed a flurry of “big data” initiatives. Finding ways to standardize and make accessible the massive amounts of data we accumulate is a growing challenge. Let’s look at some of the recent progress in this area and how we might move forward.

Healthcare has undergone massive policy changes in the last few years, but the data revolution in this field has just begun. 2010’s Affordable Care Act laid the policy groundwork for expanding the use and usefulness of Health Information Technology, notably by encouraging Electronic Medical Records. Standardizing health information is a huge boon to public health research because it gives researchers easier access to reliable data.

In April, McKinsey & Company released a comprehensive report calling for increased integration of big data into healthcare processes. The report estimates that expanding current integration trends could reduce US healthcare spending by $300 billion to $450 billion a year, a 12 to 17 percent overall decrease. On the technological side, it points out that more than 200 new businesses have developed “innovative health-data applications” since 2010. A recent Technorati article describes the same trend of innovation driven by big data, highlighting Samsung’s work with hospitals on new technology that integrates Electronic Medical Records more closely into patient care.

The data revolution is also becoming a greater part of scientific research. With last year’s launch of the Obama administration’s $200 million “Big Data Research and Development Initiative,” the government has accelerated many projects that will allow scientists to pool big data. The president’s FY14 budget proposal includes an additional $40 million to expand NIH involvement in the initiative.

As an example of this work, cancer research took a big step forward last week when the University of Chicago launched “The Bionimbus Protected Data Cloud” to allow secure access to genetic cancer information from The Cancer Genome Atlas (TCGA). Previously, using The Cancer Genome Atlas involved weeks of downloading data, plus additional time setting up methods to manipulate that data. Cloud access to many of the giant data initiatives in the science world will make it far easier for researchers to use data that is otherwise difficult to reach.

Readers: How can we push smaller research communities to start using more big data methods to standardize and share data? Are you involved with a project that uses big data in an innovative way?