TY - JOUR
T1 - Regional influenza prediction with sampling Twitter data and PDE model
AU - Wang, Yufang
AU - Xu, Kuai
AU - Kang, Yun
AU - Wang, Haiyan
AU - Wang, Feng
AU - Avram, Adrian
N1 - Funding Information:
This research was funded by the Humanities and Social Sciences Research of the Ministry of Education of China (18YJCZH184), the National Social Science Fund of China (19CGL002), the Natural Science Foundation of Tianjin (19JCQNJC14800), the China Postdoctoral Science Foundation (2018M640232), the National Science Foundation (DMS‐1737861, DMS‐1558127), and the James S. McDonnell Foundation 21st Century Science Initiative in Studying Complex Systems Scholar Award (UHC Scholar Award 220020472).
Publisher Copyright:
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2020/2/1
Y1 - 2020/2/1
N2 - The large volume of geotagged Twitter streaming data on flu epidemics gives researchers the opportunity to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on real‐time tweet data from social media; the method ensures real‐time prediction and is applicable to sampled data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects of the sampling process: it requires less historical data but achieves stronger prediction results, with a relative accuracy of over 90% on the 1% sampled data. Even for more aggressive sampling ratios such as 0.1% and 0.01%, our model still achieves relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model to predict temporal–spatial patterns of flu trends even from small samples of Twitter data.
AB - The large volume of geotagged Twitter streaming data on flu epidemics gives researchers the opportunity to explore, model, and predict the trends of flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on real‐time tweet data from social media; the method ensures real‐time prediction and is applicable to sampled data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects of the sampling process: it requires less historical data but achieves stronger prediction results, with a relative accuracy of over 90% on the 1% sampled data. Even for more aggressive sampling ratios such as 0.1% and 0.01%, our model still achieves relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model to predict temporal–spatial patterns of flu trends even from small samples of Twitter data.
KW - Flu prediction
KW - PDE model
KW - Sampling tweets data
UR - http://www.scopus.com/inward/record.url?scp=85078228568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078228568&partnerID=8YFLogxK
U2 - 10.3390/ijerph17030678
DO - 10.3390/ijerph17030678
M3 - Article
C2 - 31973008
AN - SCOPUS:85078228568
SN - 1661-7827
VL - 17
JO - International Journal of Environmental Research and Public Health
JF - International Journal of Environmental Research and Public Health
IS - 3
M1 - 678
ER -