While flu season is a fact of life, the severity of a given flu season can vary significantly. Consider that at the end of 2011, the Centers for Disease Control and Prevention (CDC) reported 849 flu cases from September 30th through the end of the year. In 2012, this figure skyrocketed to 22,048 during the same time frame.
As these widely divergent numbers show, some flu seasons clearly have a greater impact on public health than others. This variability makes it difficult to estimate the number of flu cases from year to year. To better gauge how influenza affects Americans each year, some researchers have turned to a well-known online encyclopedia.
Searching for Answers
Regardless of how much time you spend on the internet, you’ve probably heard of Wikipedia. In short, Wikipedia is an online encyclopedia that allows users to edit most of its articles. Using this source, two Boston-based researchers have created a new type of flu prediction system.
Detailing their efforts in the journal PLOS Computational Biology, the research team examined Wikipedia articles with flu-related terms, and documented how frequently they were viewed during flu season. To determine when flu activity was at its worst, the study relied on the number of hourly visits for such articles. This information was then added to statistics compiled by the CDC regarding influenza cases. Six separate flu seasons were examined overall, from December 2007 to August 2013.
The authors compared their new system against Google’s Flu Trends, a web service designed to monitor the spread of flu in the United States and other countries.
In three of the six flu seasons covered by the study, the Wikipedia-based model correctly identified the week in which influenza was most prevalent. In contrast, the Google Flu Trends system accurately determined the peak of just two influenza seasons.
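For readers curious how a "correct peak week" comparison might be scored, here is a minimal Python sketch. It is not the researchers' code: the function names and the sample numbers are hypothetical, and it simply assumes each season is available as a mapping from week label to a count (reported cases for the reference, model estimates for the system being tested).

```python
def peak_week(weekly_counts):
    """Return the week label with the highest count in one season."""
    return max(weekly_counts, key=weekly_counts.get)


def count_peak_matches(reference_by_season, estimate_by_season):
    """Count the seasons where the estimated peak week equals the reference peak week."""
    return sum(
        peak_week(reference_by_season[season]) == peak_week(estimate_by_season[season])
        for season in reference_by_season
    )


# Made-up illustrative numbers, not data from the study.
reference = {
    "season_a": {"week_1": 120, "week_2": 340, "week_3": 210},
    "season_b": {"week_1": 90, "week_2": 150, "week_3": 400},
}
estimate = {
    "season_a": {"week_1": 0.8, "week_2": 1.9, "week_3": 1.1},
    "season_b": {"week_1": 1.4, "week_2": 2.2, "week_3": 1.7},
}

matches = count_peak_matches(reference, estimate)
print(f"Peak week matched in {matches} of {len(reference)} seasons")
```

In this toy example the estimate identifies the peak in the first season but misses it in the second, which is the kind of tally behind a "three of six" result.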
Explaining the Results
The problem with Google’s flu-tracking system, argued the researchers, is that it bases its conclusions on search engine queries. When flu season starts to rear its ugly head, the number of internet searches for influenza rises sharply. Of course, simply searching for flu articles hardly means that a person has this illness.
This fact can throw off the Google flu model when influenza becomes a hot news item. Two recent examples of such a scenario include the 2009 swine flu pandemic and the 2012-13 flu season. In both of these cases, Google Flu Trends overestimated the number of people who had been infected with influenza.
To get around this problem, the new model utilizes certain Wikipedia articles, which serve as markers for time spent on the website. In other words, these articles help the researchers estimate how much time people normally spend on Wikipedia. The new system is not without flaws, as it still cannot determine exactly why someone is reading an article related to flu. Because of this shortcoming, the researchers stress that those concerned about influenza should also monitor data released by the CDC.
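The idea of using marker articles as a baseline can be illustrated with a short Python sketch. The study's exact adjustment is not described here, so this is only one plausible approach under a stated assumption: flu-article views are divided by views of the marker articles for the same period, so that a general rise in Wikipedia traffic does not look like a rise in flu interest. The function name and the numbers are hypothetical.

```python
def normalized_views(flu_views, baseline_views):
    """Express flu-article views as a share of baseline (marker) article views.

    Assumption: both lists cover the same periods in the same order.
    Periods with no baseline traffic are reported as 0.0.
    """
    return [
        flu / base if base else 0.0
        for flu, base in zip(flu_views, baseline_views)
    ]


# Made-up illustrative numbers, not data from the study.
flu_views = [100, 200, 300]
baseline_views = [1000, 1000, 2000]

shares = normalized_views(flu_views, baseline_views)
print(shares)
```

Here the raw flu views are highest in the third period, but once overall traffic is taken into account, the second period shows the strongest flu-related interest.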