Wikipedia page views used to forecast disease outbreaks
Monitoring Wikipedia page views could lead to accurate and timely outbreak forecasts, according to recently published data.
Researchers from Los Alamos National Laboratory in New Mexico analyzed Wikipedia access logs from March 7, 2010, to Feb. 1, 2014, for page views of various infectious disease articles, and used proxy data to determine user locations worldwide. They also used disease incidence data collected from WHO epidemiological reports to create models intended to monitor current outbreaks and forecast future ones.
“A global disease-forecasting system will change the way we respond to epidemics,” researcher Sara Y. Del Valle, PhD, said in a press release. “In the same way we check the weather each morning, individuals and public health officials can monitor disease incidence and plan for the future based on today’s forecast.”
Of the 14 disease-location contexts analyzed, eight were successful and six were not. Cases that researchers considered successful had r² values ranging from 0.66 to 0.92 and could forecast values up to 28 days in advance. Researchers suspect that people use Wikipedia to gather information about diseases before seeking medical attention.
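To make the r² and forecast-horizon figures more concrete, here is a minimal sketch of how one might score page views against incidence reported some days later. It is an illustration only and not the researchers' model, which the article does not describe; the function names `r_squared` and `lagged_r_squared` and all numbers in the example are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear signal
    return (sxy * sxy) / (sxx * syy)


def lagged_r_squared(page_views, incidence, lag_days):
    """r^2 between page views on day t and incidence on day t + lag_days."""
    if lag_days < 0 or lag_days >= len(page_views):
        raise ValueError("lag_days must be in [0, len(page_views))")
    if lag_days == 0:
        return r_squared(page_views, incidence)
    # Drop the last lag_days views and the first lag_days incidence values
    # so that each view is paired with the incidence lag_days later.
    return r_squared(page_views[:-lag_days], incidence[lag_days:])


# Made-up daily series for illustration only (not data from the study).
views = [10, 12, 15, 20, 30, 45, 40, 30, 20, 15]
cases = [4, 5, 5, 6, 8, 10, 15, 22, 20, 15]
for lag in (0, 1, 2):
    print(lag, round(lagged_r_squared(views, cases, lag), 3))
```

A value near 1 at a given lag would suggest that page views track incidence reported that many days later; under this simplified reading, the study's successful cases correspond to values between 0.66 and 0.92 at horizons of up to 28 days.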
In three of the failed cases, patterns in the official data were too subtle for the model to recognize; in the other three, the signal-to-noise ratio in the Wikipedia data was too low.
Despite these failures, the researchers outlined further areas of study and revisions that they said could improve this method of outbreak monitoring.
“The goal is to build an operational disease-monitoring and forecasting system with open data and open source code,” Del Valle said. “This paper shows we can achieve that goal.”
Disclosure: The researchers report no relevant financial disclosures.