November 12, 2014

‘Big Data’ analyses successfully reveal important patient insights

Researchers have successfully analyzed “Big Data,” in this case enormous amounts of detail from electronic health records (EHRs), to gain accurate insights into patients’ conditions and predict diseases.

Isaac S. Kohane, MD, PhD, professor of pediatrics and health sciences technology at Harvard Medical School’s Center for Biomedical Informatics (CBMI), spoke Monday about his research into analyzing large amounts of data from EHRs at the Exponential Medicine conference in San Diego.

“For the past 15 years, I’ve been trying to use health care systems as living laboratories,” Kohane said.

He said that in one of his projects, he and his team used longitudinal data to identify patients at high risk for domestic abuse based on the types of other clinical visits the patients made. Identifying the relationships among emergency department (ED) visits for pain, gastrointestinal complaints and other issues could predict the risk for domestic abuse, something that could not be done in a short clinical visit with one physician. He said predicting risk was achievable through analysis of a larger data set.

In other research into autism, Kohane and colleagues identified more than 5,000 comorbidities in a database of 13,750 children with autism across multiple health care systems. The cohort was divided into three groups with autism based on age at diagnosis: 0 to 6 months, 6 to 12 months and 12 to 18 months.

“We wanted to see if any natural clustering was present,” Kohane said.

The researchers said one group had a high incidence of inflammatory diseases such as inflammatory bowel disease. Another group had a greater incidence of autism, and the third had a higher prevalence of other psychiatric and behavioral disorders.

“When we overcome the problem of sharing data between all of these health systems, all of a sudden, we see a granularity in these diseases that is going to change the way we diagnose and treat these individuals,” Kohane said.

He and his group developed i2b2 (Informatics for Integrating Biology and the Bedside), an open-source database and query system that was designed to maintain patient privacy and has been expanded to include genetic information when possible. Kohane said it has been adopted by more than 100 hospitals, including about 80 in the US.

Earlier this week, Harvard Medical School announced it received a government grant aimed at expanding i2b2. The $11.3 million grant is from the NIH’s Big Data to Knowledge program, and the CBMI is one of 12 recipients of the award, according to a press release.

“If you really want to understand where people stand diagnostically, you need to bring all these various elements together,” Kohane said in the release. “People say many important things about their health in social media, and if a critical mass of individuals grant their physicians permission to access this and we can then start matching information from tens of thousands of patients, meaningful patterns will emerge that could not have been ascertained through conventional means.”

The data will include information from conventional sources, such as EHRs, but also unconventional streams, including common social media platforms.

“This work really helps facilitate the waltz between basic and translational science,” Kohane said. “We want to do everything we can to support and further the work of all our colleagues by creating tools that effectively make sense out of big data sets, where everyone can play a role and contribute to this ‘commons’ writ large.”

Once compiled, the data “representing populations from a multidimensional perspective” will be available at no charge through the open-source software program called “information commons.”

– By Shirley Pulawski