Judging when to tighten, or loosen, the local economy has become the world’s most consequential guessing game, and each policymaker has his or her own instincts and benchmarks. The point when hospitals reach 70 per cent capacity is a red flag, for instance; so are upticks in coronavirus case counts and deaths.
But as the governors of states like Florida, California and Texas have learned in recent days, such benchmarks make for a poor alarm system. Once the coronavirus finds an opening in the population, it gains a two-week head start on health officials, circulating and multiplying swiftly before its re-emergence becomes apparent at hospitals, testing clinics and elsewhere. Now, an international team of scientists has developed a model — or, at minimum, the template for a model — that could predict outbreaks about two weeks before they occur, in time to put effective containment measures in place.
In a paper posted on arXiv.org, the team, led by Mauricio Santillana and Nicole Kogan of Harvard, presented an algorithm that registered danger 14 days or more before case counts began to increase. The system uses real-time monitoring of Twitter, Google searches and mobility data from smartphones, among other data streams. The algorithm, the researchers write, could function “as a thermostat, in a cooling or heating system, to guide intermittent activation or relaxation of public health interventions.” In practice, that would mean a smoother, safer reopening.
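To make the thermostat idea concrete, here is a minimal Python sketch. It is not the team's algorithm, which this article does not spell out. It assumes, purely for illustration, that each data stream is a list of daily counts, that a stream is "rising" when its latest seven-day average exceeds the previous seven-day average by more than 20 per cent, and that an alarm sounds when at least two streams rise together. The function names, stream names, thresholds and numbers are all hypothetical.

```python
from statistics import mean


def growth_rate(series, window=7):
    """Relative change between the mean of the last `window` values
    and the mean of the `window` values before them."""
    recent = mean(series[-window:])
    previous = mean(series[-2 * window:-window])
    if previous == 0:
        return 0.0  # avoid dividing by zero on an empty baseline
    return (recent - previous) / previous


def early_warning(streams, threshold=0.2, min_agreeing=2, window=7):
    """Return (alarm, rising): `rising` lists the streams whose growth
    exceeds `threshold`; `alarm` is True when at least `min_agreeing`
    streams are rising at once. All defaults are illustrative assumptions."""
    rising = [
        name
        for name, series in streams.items()
        if len(series) >= 2 * window and growth_rate(series, window) > threshold
    ]
    return len(rising) >= min_agreeing, rising


# Made-up daily counts for three hypothetical streams.
streams = {
    "symptom_searches": [100] * 7 + [130] * 7,  # up 30 per cent
    "symptom_tweets": [50] * 7 + [70] * 7,      # up 40 per cent
    "mobility_index": [10] * 14,                # flat
}
alarm, rising = early_warning(streams)
print(alarm, rising)
```

Requiring agreement between streams reflects a point the experts make about this kind of data: a single source, such as search traffic, can spike for reasons unrelated to illness, so the sketch does not let one stream trigger the alarm on its own.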
“In most infectious-disease modeling, you project different scenarios based on assumptions made up front,” said Dr Santillana, director of the Machine Intelligence Lab at Boston Children’s Hospital and an assistant professor of pediatrics and epidemiology at Harvard. “What we’re doing here is observing, without making assumptions. The difference is that our methods are responsive to immediate changes in behavior and we can incorporate those.”
Outside experts who were shown the new analysis, which has not yet been peer reviewed, said it demonstrated the increasing value of real-time data, like social media, in improving existing models.
The study shows “alternative, next-gen data sources may provide early signals of rising COVID-19 prevalence,” said Lauren Ancel Meyers, a biologist and statistician at the University of Texas, Austin. “Particularly if confirmed case counts are lagged by delays in seeking treatment and obtaining test results.”
The use of real-time data analysis to gauge disease progression goes back at least to 2008, when engineers at Google began estimating doctor visits for the flu by tracking search trends for words like “feeling exhausted,” “joints aching,” “Tamiflu dosage” and many others.
The Google Flu Trends algorithm, as it is known, performed poorly. For instance, it continually overestimated doctor visits, later evaluations found, because of limitations of the data and the influence of outside factors such as media attention, which can drive up searches that are unrelated to actual illness.
Since then, researchers have made multiple adjustments to this approach, combining Google searches with other kinds of data. Teams at Carnegie Mellon University, University College London and the University of Texas, among others, have models incorporating some real-time data analysis.
“We know that no single data stream is useful in isolation,” said Madhav Marathe, a computer scientist at the University of Virginia. “The contribution of this new paper is that they have a good, wide variety of streams.”
For all its appeal, big-data analytics cannot anticipate sudden changes in mass behaviour any better than other, traditional models can, experts said. There is no algorithm that might have predicted the nationwide protests in the wake of George Floyd’s killing, for instance. Those mass gatherings may have seeded new outbreaks, despite precautions taken by protesters.
Social media and search engines also can become less sensitive with time; the more familiar with a pathogen people become, the less they will search with selected key words.
Benedict Carey is a science reporter for NYT.
© 2020 The New York Times