Volume 35 Issue 6
Jun.  2022

JI Tian Jiao, CHENG Qiang, ZHANG Yong, ZENG Han Ri, WANG Jian Xing, YANG Guan Yu, XU Wen Bo, LIU Hong Tu. A Novel Early Warning Model for Hand, Foot and Mouth Disease Prediction Based on a Graph Convolutional Network[J]. Biomedical and Environmental Sciences, 2022, 35(6): 494-503. doi: 10.3967/bes2022.065

A Novel Early Warning Model for Hand, Foot and Mouth Disease Prediction Based on a Graph Convolutional Network

doi: 10.3967/bes2022.065
Funds:  The work was supported by grants from the Key Technologies Research and Development Program from the Ministry of Science and Technology [grant number: ZDZX-2018ZX102001002-003-003] and the Beijing Natural Science Foundation [project number: L192014]
  • Author Bio:

    JI Tian Jiao, female, born in 1986, PhD Candidate, majoring in Etiology of HFMD

  • Corresponding author: LIU Hong Tu, Professor, PhD, Tel: 86-10-58900888, E-mail: liuht@ivdc.chinacdc.cn; XU Wen Bo, Professor, Tel: 86-10-58900187, E-mail: xuwb@ivdc.chinacdc.cn; YANG Guan Yu, Professor, PhD, Tel: 86-25-83794249, E-mail: yang.list@seu.edu.cn
  • Received Date: 2022-01-05
  • Accepted Date: 2022-04-12
  • Abstract: Objectives: Hand, foot and mouth disease (HFMD) is a widespread infectious disease that imposes a significant burden on society. To enable early intervention and prevent outbreaks, we propose a novel warning model that can accurately predict the incidence of HFMD. Methods: We propose a spatial-temporal graph convolutional network (STGCN) that combines spatial factors for surrounding cities with historical incidence over a certain time period to predict the future occurrence of HFMD in Guangdong and Shandong between 2011 and 2019. The 2011–2018 data served as the training and verification set, while data from 2019 served as the prediction set. Six important parameters were selected and verified in this model, and the deviation was measured by the root mean square error and the mean absolute error. Results: In this first application of an STGCN to disease forecasting, the model accurately predicted the incidence of HFMD over a 12-week period at the prefecture level, especially for cities of significant concern. Conclusions: This model provides a novel approach for infectious disease prediction and may help health administrative departments implement effective control measures up to 3 months in advance, which may significantly reduce the morbidity associated with HFMD in the future.
  • [1] Cobbin JCA, Britton PN, Burrell R, et al. A complex mosaic of enteroviruses shapes community-acquired hand, foot and mouth disease transmission and evolution within a single hospital. Virus Evol, 2018; 4, vey020.
    [2] Xing WJ, Liao QH, Viboud C, et al. Hand, foot, and mouth disease in China, 2008-12: an epidemiological study. Lancet Infect Dis, 2014; 14, 308−18. doi:  10.1016/S1473-3099(13)70342-6
    [3] Li XW, Ni X, Qian SY, et al. Chinese guidelines for the diagnosis and treatment of hand, foot and mouth disease (2018 edition). World J Pediatr, 2018; 14, 437−47. doi:  10.1007/s12519-018-0189-8
    [4] Koh WM, Bogich T, Siegel K, et al. The epidemiology of hand, foot and mouth disease in Asia: a systematic review and analysis. Pediatr Infect Dis J, 2016; 35, e285−300. doi:  10.1097/INF.0000000000001242
    [5] Iwai M, Masaki A, Hasegawa S, et al. Genetic changes of coxsackievirus A16 and enterovirus 71 isolated from hand, foot, and mouth disease patients in Toyama, Japan between 1981 and 2007. Jpn J Infect Dis, 2009; 62, 254−9.
    [6] Chua KB, Kasri AR. Hand foot and mouth disease due to enterovirus 71 in Malaysia. Virol Sin, 2011; 26, 221−8. doi:  10.1007/s12250-011-3195-8
    [7] Wu Y, Yeo A, Phoon MC, et al. The largest outbreak of hand; foot and mouth disease in Singapore in 2008: the role of enterovirus 71 and coxsackievirus A strains. Int J Infect Dis, 2010; 14, e1076−81. doi:  10.1016/j.ijid.2010.07.006
    [8] Geoghegan JL, Van Tan L, Kühnert D, et al. Phylodynamics of enterovirus A71-associated hand, foot, and mouth disease in Viet Nam. J Virol, 2015; 89, 8871−9. doi:  10.1128/JVI.00706-15
    [9] Biswas T. Enterovirus 71 causes hand, foot and mouth disease outbreak in Cambodia. Natl Med J India, 2012; 25, 316.
    [10] Zhang Y, Zhu Z, Yang WZ, et al. An emerging recombinant human enterovirus 71 responsible for the 2008 outbreak of hand foot and mouth disease in Fuyang city of China. Virol J, 2010; 7, 94. doi:  10.1186/1743-422X-7-94
    [11] Ji TJ, Han TL, Tan XJ, et al. Surveillance, epidemiology, and pathogen spectrum of hand, foot, and mouth disease in mainland of China from 2008 to 2017. Biosaf Health, 2019; 1, 32−40. doi:  10.1016/j.bsheal.2019.02.005
    [12] Du ZC, Lawrence WR, Zhang WJ, et al. Bayesian spatiotemporal analysis for association of environmental factors with hand, foot, and mouth disease in Guangdong, China. Sci Rep, 2018; 8, 15147. doi:  10.1038/s41598-018-33109-3
    [13] Liu YX, Wang XJ, Liu YX, et al. Detecting spatial-temporal clusters of HFMD from 2007 to 2011 in Shandong Province, China. PLoS One, 2013; 8, e63447. doi:  10.1371/journal.pone.0063447
    [14] Yang BY, Liu FF, Liao QH, et al. Epidemiology of hand, foot and mouth disease in China, 2008 to 2015 prior to the introduction of EV-A71 vaccine. Euro Surveill, 2017; 22, 16−00824.
    [15] Tian L, Liang FC, Xu MM, et al. Spatio-temporal analysis of the relationship between meteorological factors and hand-foot-mouth disease in Beijing, China. BMC Infect Dis, 2018; 18, 158. doi:  10.1186/s12879-018-3071-3
    [16] Pons-Salort M, Grassly NC. Serotype-specific immunity explains the incidence of diseases caused by human enteroviruses. Science, 2018; 361, 800−3. doi:  10.1126/science.aat6777
    [17] Liu SJ, Chen JP, Wang JM, et al. Predicting the outbreak of hand, foot, and mouth disease in Nanjing, China: a time-series model based on weather variability. Int J Biometeorol, 2018; 62, 565−74. doi:  10.1007/s00484-017-1465-3
    [18] Chadsuthi S, Iamsirithaworn S, Triampo W, et al. Modeling seasonal influenza transmission and its association with climate factors in Thailand using time-series and ARIMAX analyses. Comput Math Methods Med, 2015; 2015, 436495.
    [19] Lai GK, Chang WC, Yang YM, et al. Modeling long-and short-term temporal patterns with deep neural networks. In: Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM. 2018, 95-104.
    [20] Lee CCD, Tang JH, Hwang JS, et al. Effect of meteorological and geographical factors on the epidemics of hand, foot, and mouth disease in island-type territory, East Asia. Biomed Res Int, 2015; 2015, 805039.
    [21] Xu CD, Xiao GX. Spatiotemporal risk mapping of hand, foot and mouth disease and its association with meteorological variables in children under 5 years. Epidemiol Infect, 2017; 145, 2912−20. doi:  10.1017/S0950268817001984
    [22] Yu B, Yin HT, Zhu ZX. Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. IJCAI. org. 2018, 3634-40.
    [23] Wang HW, Zhang FZ, Wang JL, et al. Exploring high-order user preference on the knowledge graph for recommender systems. ACM Trans Inf Syst, 2019; 37, 32.
  • Supplementary-Guangdong_video.mp4
    Supplementary-Shandong_video.mp4
    22007Supplementary Materials.pdf
    • Hand, foot and mouth disease (HFMD) is a common infectious disorder caused by various enteroviruses[1], and children younger than 5 years are especially prone to infection[2]. Most cases are self-limiting; however, some patients rapidly develop neurological or cardiopulmonary complications, which can even lead to death[3].

      In recent years, outbreaks of this disease have been reported frequently in most parts of the world, including the Asia-Pacific region[4], and especially in East and Southeast Asia[5,6]. In 2008, Singapore experienced its largest outbreak of HFMD, with 29,686 cases[7]. Subsequently, 170 deaths were reported in Vietnam in 2011[8] and 98 deaths were reported in Cambodia in 2012[9]. In mainland China, an epidemic of HFMD started in Fuyang city, Anhui Province, in 2008, resulting in 353 severe cases and 22 deaths[10]. More than 18 million HFMD-related cases were reported in mainland China between 2008 and 2017, and since 2010 the number of deaths has ranked in the top three among all notifiable diseases[11]. HFMD morbidity and mortality impose a significant economic and psychological burden on patients and society.

      Epidemiological surveillance and an improved understanding of the spatiotemporal characteristics of HFMD may inform local epidemic control measures and resource allocation. Accordingly, the epidemiological characteristics, risk factors and spatiotemporal patterns of HFMD have been studied on a national scale[12-15], prompting researchers to design and optimize HFMD warning models for different situations. The classic approach is a dynamic transmission model, namely the Susceptible-Infective-Recovered (SIR) model[16]. However, such traditional models require the construction of complex systems based on unrealistic assumptions and simplifications. With the increasing amount of data, data-driven prediction models of infectious diseases, such as the autoregressive integrated moving average (ARIMA) model[17,18], have been widely studied. However, these models rely on differencing to extract the linear components of a sequence and cannot capture the nonlinear factors that drive changes in a time series, which leads to low prediction accuracy. By contrast, data-driven deep learning models, such as long short-term memory (LSTM) networks and related deep neural networks for long- and short-term temporal patterns[19], offer good fault tolerance and large-scale nonlinear parallel processing, together with strong self-learning and adaptive capabilities.

      However, such models only consider disease parameters in the time dimension, so a spatial-temporal prediction model that describes the real situation more accurately is needed. In this study, we used the spatial-temporal graph convolutional network (STGCN), which accounts for spatial effects, to predict the future occurrence of HFMD. The hidden time series features are extracted by stacking convolution modules that process the number of patients. To predict future incidence, the STGCN refers not only to the recent incidence in a city but also, through the graph convolution module, to the incidence in neighboring cities.

      Guangdong and Shandong Provinces, the most populous major economic provinces in southern and northern China, respectively, have suffered greatly from the HFMD epidemic. Therefore, we designed this model to help local health administrative departments take timely and effective control measures to reduce the morbidity and mortality associated with HFMD in the future.

    • Considering the extensive area and diverse demographic, economic and climatic characteristics of China, the distribution of cases and the risk factors for HFMD likely vary among regions. Therefore, we chose two representative provinces with a heavy burden of HFMD as the research area: Guangdong and Shandong, which represent southern and northern China, respectively.

      Guangdong Province comprises 21 administrative districts and can be divided into four regions according to population and area (the Pearl River Delta region, eastern Guangdong, western Guangdong and northern Guangdong). Most regions have a subtropical monsoon climate, and the typically hot and rainy conditions favor the epidemic spread of HFMD. Shandong Province comprises 140 counties (sub-districts) belonging to 17 administrative districts and can also be divided into four regions (eastern Shandong, central Shandong, southern Shandong and northwest Shandong). With a warm temperate monsoon climate, Shandong has four distinct seasons, and the occurrence of HFMD there shows marked seasonality.

    • Data were acquired for all reported HFMD cases from January 1, 2011 to December 31, 2019 in Guangdong and Shandong from the National Notifiable Disease Surveillance System (NNDSS) of the Chinese Center for Disease Control and Prevention.

      Spatial information for each city, mainly including longitude and latitude data, was downloaded from the National Catalogue Service for Geographic Information of the Ministry of Natural Resources of the People’s Republic of China to construct the graph structure. The 2011–2018 data served as the training and verification set, while data from 2019 served as the prediction set.
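      The graph structure mentioned above can be illustrated with a minimal sketch. The paper does not state how edges between cities are defined, so this example assumes a k-nearest-neighbor rule on great-circle distance; the function names, the value of k and the approximate city coordinates are illustrative assumptions, not the authors' data or method.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def knn_adjacency(coords, k):
    """Symmetric 0/1 adjacency matrix linking each city to its k nearest cities.

    Assumption: the k-nearest-neighbor rule is illustrative only.
    """
    names = list(coords)
    n = len(names)
    adj = [[0] * n for _ in range(n)]
    for i, a in enumerate(names):
        dists = sorted(
            (haversine_km(*coords[a], *coords[b]), j)
            for j, b in enumerate(names) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1  # undirected edge
    return names, adj

# Approximate coordinates (latitude, longitude), for illustration only
coords = {
    "Jinan": (36.65, 117.12),
    "Qingdao": (36.07, 120.38),
    "Zibo": (36.81, 118.05),
    "Weifang": (36.71, 119.16),
}
names, adj = knn_adjacency(coords, k=1)
```

      Because edges are made symmetric, a city can end up with more than k neighbors; a distance threshold or administrative adjacency would be equally plausible alternatives.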

      A total of 3,257,285 symptomatic HFMD cases were reported in Guangdong. The number of reported cases fluctuated, with a high incidence observed every 2 years. There were two epidemic peaks each year: a summer peak in May and June (Supplementary Figure S1, available at www.besjournal.com) and a second, smaller autumn peak in October and November, with the exception of 2017, when the autumn peak exceeded the summer peak. High-risk areas of HFMD in Guangdong were located in the Pearl River Delta region, especially Zhuhai city and Guangzhou city, which had the highest incidence rates and numbers of reported cases over the 9-year period (Supplementary Figures S2 and S3, available at www.besjournal.com).

      Figure S1.  (A) Epidemic curve of reported HFMD cases between 2011 and 2019 in Guangdong. (B) Geographic distribution of the average number of probable and laboratory-confirmed cases. (C) Geographic distribution of the average incidence rates of probable and laboratory-confirmed cases.

      Figure S2.  Number of reported cases of HFMD in cities of Guangdong Province between 2011 and 2019. Data were obtained from the Chinese Disease Prevention and Control Information System (http://10.249.1.170:81).

      Figure S3.  Incidence of HFMD in cities of Guangdong Province between 2011 and 2019. Data were obtained from the Chinese Disease Prevention and Control Information System (http://10.249.1.170:81).

      A total of 832,065 HFMD cases were reported by the surveillance system from 2011 to 2019 in Shandong. The incidence of HFMD showed one major peak each year: the number of reported cases began to increase in March and peaked from May to July (Supplementary Figure S4, available at www.besjournal.com). The highest average number of reported cases occurred in the provincial capital and the northwest region of Shandong, while the highest average incidence rate was detected in Dongying city on the northern coast of Shandong, and the lowest incidence rate was in Linyi city, located inland in southern Shandong (Supplementary Figures S5 and S6, available at www.besjournal.com).

      Figure S4.  (A) Epidemic curve of reported HFMD cases between 2011 and 2019 in Shandong. (B) Geographic distribution of the average number of probable and laboratory-confirmed cases. (C) Geographic distribution of the average incidence rates of probable and laboratory-confirmed cases.

      Figure S5.  Number of reported cases of HFMD in cities of Shandong Province between 2011 and 2019. Data were obtained from the Chinese Disease Prevention and Control Information System (http://10.249.1.170:81).

      Figure S6.  Incidence of HFMD in cities of Shandong Province between 2011 and 2019. Data were obtained from the Chinese Disease Prevention and Control Information System (http://10.249.1.170:81).

    • As shown in Figure 1, the STGCN model consists of three components: two spatiotemporal convolution blocks with the same structure and a fully connected layer that serves as the output layer. Each spatiotemporal convolution block has a sandwich structure made up of two temporal convolution layers with one spatial convolution layer between them. First, the feature information in the time dimension was obtained through the temporal convolution layer (annual incidence data), and then the feature information in space was mixed through the spatial convolution layer (incidence in various cities extracted by the graph convolution operation). After that, high-dimensional information was obtained by a second feature extraction in the time dimension. At the output layer, we took the incidence data for the same period of the previous year as one of the reference factors for the prediction results. Details about the specific parameter settings and principles are provided in the Supplementary Materials (Supplementary Text S1.2), available at www.besjournal.com.

      Figure 1.  Overview of the proposed STGCN model.
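      The sandwich structure of one spatiotemporal convolution block can be sketched as follows. This is a simplified illustration, not the authors' implementation: a fixed moving-average kernel stands in for the learned temporal weights, a plain neighborhood mean stands in for the learned graph convolution, and the toy graph and data are invented for the example.

```python
def temporal_conv(x, kernel):
    """Valid 1-D convolution along time, applied to every city independently.

    x: list of T time steps, each a list of N city values.
    Returns T - len(kernel) + 1 time steps.
    """
    kt = len(kernel)
    n_cities = len(x[0])
    return [
        [sum(kernel[d] * x[t + d][c] for d in range(kt)) for c in range(n_cities)]
        for t in range(len(x) - kt + 1)
    ]

def spatial_conv(x, adj):
    """Replace each city's value by the mean over itself and its graph neighbours."""
    n = len(adj)
    out = []
    for row in x:
        mixed = []
        for i in range(n):
            group = [i] + [j for j in range(n) if adj[i][j] and j != i]
            mixed.append(sum(row[j] for j in group) / len(group))
        out.append(mixed)
    return out

def st_conv_block(x, adj, kernel):
    """Sandwich: temporal conv -> spatial (graph) conv -> temporal conv."""
    return temporal_conv(spatial_conv(temporal_conv(x, kernel), adj), kernel)

# Toy example: 3 cities on a path graph 0 - 1 - 2, 7 weekly observations each
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[float(t), float(2 * t), 0.0] for t in range(7)]
kernel = [1 / 3, 1 / 3, 1 / 3]  # kt = 3; stands in for learned weights
out = st_conv_block(x, adj, kernel)
```

      Each temporal convolution shortens the sequence by kt - 1 steps, so one block with kt = 3 turns 7 input weeks into 3 output steps, which is why the historical data length H constrains how many blocks and how large a kernel can be used.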

      Province    H (weeks)   4 weeks             8 weeks             12 weeks
                              MAE      RMSE       MAE      RMSE       MAE      RMSE
      Shandong    12          57.47    96.41      50.71    83.95      41.09    68.81
      Shandong    24          35.49    62.17      35.46    62.18      36.08    62.19
      Shandong    36          41.60    75.89      44.98    79.99      45.78    74.17
      Guangdong   12          249.81   460.71     254.86   496.17     240.10   471.97
      Guangdong   24          215.51   449.10     210.66   413.80     200.35   387.81
      Guangdong   36          210.21   408.58     248.48   482.16     233.58   438.41

      Table S1.  Evaluation metrics for different historical data lengths (H)

      To build a more stable and accurate model, six important parameters were selected and verified in this model: the forecast time (predicting 4 weeks, 8 weeks or 12 weeks in advance), the historical data length (historically reported cases in one city, H = 12 weeks, 24 weeks or 36 weeks), the data channel size (used to determine the convolution kernels in each convolution layer), the time convolution kernel size (used to determine the size of the receptive field in each extraction process in the temporal dimension, kt = 3 or 5), the neighborhood number (the fusion range of spatial information around one city, ks) and the inclusion or exclusion of graph convolution. The deviation was measured by the root mean square error (RMSE) and the mean absolute error (MAE). Moreover, the consistency between the true value and the predicted value was verified by R2.
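      How the historical data length and the forecast time interact can be sketched with a sliding-window split of a weekly series. This is an assumption-laden illustration: the paper does not specify whether the model predicts a single week at the forecast horizon or a whole sequence, and the function name and toy series are hypothetical.

```python
def make_windows(series, history, horizon):
    """Split a weekly series into (input, target) pairs.

    input  = `history` consecutive weeks (H in the text)
    target = the value `horizon` weeks after the last input week
    Assumption: a single target per window, for illustration only.
    """
    pairs = []
    for start in range(len(series) - history - horizon + 1):
        window = series[start:start + history]
        target = series[start + history + horizon - 1]
        pairs.append((window, target))
    return pairs

weekly_cases = list(range(100, 160))  # 60 weeks of toy counts
pairs = make_windows(weekly_cases, history=24, horizon=12)
```

      Longer histories and horizons both reduce the number of training pairs that a fixed-length series can supply, which is one practical reason to compare several values of H.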

    • The deviation between the predicted value and the true value was measured by the RMSE and MAE, and the detailed results of the parameter comparisons are provided in Supplementary Tables S1–S4 and Supplementary Figure S7, available at www.besjournal.com.
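      For reference, the two error measures can be written out directly. The sketch below uses the standard definitions of MAE and RMSE; the sample values are invented for illustration.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [10, 20, 30, 40]       # toy observed weekly cases
predicted = [12, 18, 33, 36]    # toy model output
error_mae = mae(actual, predicted)
error_rmse = rmse(actual, predicted)
```

      RMSE is never smaller than MAE on the same data and grows faster when a few predictions miss by a wide margin, so reporting both gives a sense of how evenly the errors are spread.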

      Convolution   Forecast   Metric   Shandong                  Guangdong
      kernel size   weeks               With      Without         With      Without
                                        spatial   spatial         spatial   spatial
      kt = 3        4          MAE      35.49     44.95           211.31    214.43
                               RMSE     62.17     82.34           407.89    457.11
                    8          MAE      35.43     53.17           235.43    231.67
                               RMSE     62.19     101.31          459.17    462.32
                    12         MAE      41.48     43.96           226.98    232.07
                               RMSE     71.29     77.46           450.50    452.03
      kt = 5        4          MAE      35.49     45.85           215.51    210.03
                               RMSE     62.17     66.61           449.10    436.94
                    8          MAE      35.46     43.69           210.66    246.84
                               RMSE     62.18     65.36           413.80    495.93
                    12         MAE      36.08     45.77           200.35    225.48
                               RMSE     62.19     66.56           387.81    427.83

      Table S4.  Influence of the graph convolution module in Shandong and Guangdong

      Figure S7.  Geospatial maps for different forecast lengths. (A) Geospatial map of Shandong Province in week 29 with forecast lengths of 4 weeks, 8 weeks, and 12 weeks. (B) Geospatial map of Guangdong Province in week 25 with forecast lengths of 4 weeks, 8 weeks, and 12 weeks.

      Because the epidemiological data for Shandong followed a relatively simple pattern, the optimum prediction model could achieve 12 weeks of early warning based on the following parameters: 24 weeks of historical data length (H = 24), channel size of (1, 4, 8), time convolution kernel size of 5 (kt = 5) and neighborhood number of 4 (ks = 4) with graph convolution.

      However, Guangdong showed more complex epidemiological data, and the demographic data among cities showed great disparity. Therefore, the result of the prediction model was different from that of Shandong, and the optimum prediction model was based on the following parameters: 24 weeks of historical data length (H = 24), channel size of (1, 3, 1), time convolution kernel size of 5 (kt = 5) and neighborhood number of 5 (ks = 5).

    • We selected the cities of Qingdao, Liaocheng, Jinan and Zaozhuang to represent east Shandong, west Shandong, north Shandong and south Shandong, respectively (Figure 2A–D and Supplementary Figure S8, available at www.besjournal.com). The blue line is the actual incidence, and the predicted data are shown in orange. The consistency between the true value and the predicted value was verified by R2, and the correlation between the average disease data from 2011–2018 and the predicted curve was also compared (Supplementary Table S5, available at www.besjournal.com).

      Figure 2.  Predicted epidemic curves based on city-level data of HFMD incidence in 2019. (A)–(D) show data from four representative cities in Shandong Province: Qingdao, Liaocheng, Jinan and Zaozhuang, respectively. (E)–(H) show data from four representative cities in Guangdong Province: Dongguan, Jieyang, Qingyuan, and Zhanjiang, respectively.

      Figure S8.  Predicted epidemic curves of HFMD in Guangdong Province, 2019. (A) Pearl River Delta region. (B) Eastern Guangdong. (C) Western Guangdong. (D) Northern Guangdong.

      City        R2 (predicted     Average number of      Average morbidity
                  epidemic curve)   reported cases         (/million)
                                    (2011−2018)            (2011−2018)
      Laiwu       0.03              1,266.50               9.51
      Linyi       0.23              3,378.88               3.30
      Heze        0.28              4,797.75               5.68
      Zibo        0.52              4,158.38               9.01
      Rizhao      0.61              2,986.63               10.45
      Dongying    0.65              3,926.50               18.76
      Liaocheng   0.67              3,482.50               5.86
      Dezhou      0.68              3,580.25               6.30
      Binzhou     0.75              5,213.25               13.63
      Qingdao     0.75              9,439.25               10.50
      Weihai      0.80              4,465.22               13.03
      Jining      0.81              4,920.22               5.97
      Jinan       0.86              10,718.50              15.21
      Tai'an      0.91              6,152.38               11.04
      Weifang     0.91              5,114.75               6.79
      Zaozhuang   0.95              4,209.75               11.02

      Table S5.  The goodness of fit between the predicted curve and the actual incidence curve in Shandong

      For most cities in Shandong, the model captured the timing of the disease outbreak, and the peak height was consistent with the observed data (R2 > 0.5), which shows that the model has good predictive ability after training. Furthermore, the best predictions were obtained for cities with more than 4,000 reported cases or an incidence higher than 10/million people (R2 > 0.75; details are shown in Supplementary Table S5).
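      The goodness of fit reported as R2 can be computed in more than one way, and the text does not say which was used. The sketch below assumes the coefficient of determination, 1 - SS_res / SS_tot; a squared Pearson correlation would be another common choice. The sample values are invented.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this definition is used for illustration; the paper does not
    state how R2 was calculated.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [10, 20, 30, 40]       # toy observed weekly cases
predicted = [12, 18, 33, 36]    # toy predicted weekly cases
score = r_squared(actual, predicted)
```

      Under this definition a perfect prediction gives 1, and a prediction no better than the mean of the observed series gives 0.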

      Correspondingly, we selected the cities of Dongguan, Jieyang, Qingyuan and Zhanjiang to represent the Pearl River Delta region, eastern Guangdong, northern Guangdong and western Guangdong, respectively (Figure 2E–H). In general, the prediction curve in the Pearl River Delta region matched the actual incidence curve (R2 > 0.5, Supplementary Figure S9, available at www.besjournal.com). However, the precision seemed lower in northern and western Guangdong, which showed low incidence rates and disease burden. Despite this, most prediction curves rose slightly earlier than the actual increase in incidence, which means that the model can play a role in early warning.

      Figure S9.  Predicted epidemic curves of HFMD in Shandong Province, 2019. (A) Eastern Shandong. (B) Central Shandong. (C) Southern Shandong. (D) Northwest Shandong.

      Given the more complicated incidence rate characteristics, the predictive model for Guangdong requires more disease data to train and verify the results, and the optimum prediction model was appropriate for cities with more than 10,000 reported cases or an incidence higher than 30/million people (R2 > 0.6; details are shown in Supplementary Table S6, available at www.besjournal.com).

      City        R2 (predicted     Average number of      Average morbidity
                  epidemic curve)   reported cases         (/million)
                                    (2011−2018)            (2011−2018)
      Yangjiang   0.01              5402.88                17.69
      Shanwei     0.01              2932.38                9.81
      Shantou     0.02              7168.88                12.82
      Maoming     0.04              5157.50                8.64
      Zhanjiang   0.04              6609.50                9.27
      Chaozhou    0.05              2588.88                9.66
      Yunfu       0.15              8478.38                34.90
      Meizhou     0.30              8592.13                19.88
      Zhongshan   0.33              6609.50                44.10
      Heyuan      0.34              5519.38                22.11
      Jieyang     0.41              4335.00                7.22
      Shaoguan    0.46              6432.63                22.14
      Shenzhen    0.47              41759.13               37.67
      Zhuhai      0.57              10453.00               64.52
      Huizhou     0.58              20951.63               44.79
      Qingyuan    0.60              10853.75               34.64
      Jiangmen    0.62              10017.25               28.21
      Dongguan    0.65              32108.13               38.71
      Zhaoqing    0.68              12463.63               30.97
      Guangzhou   0.75              52876.38               39.84
      Foshan      0.77              32410.25               44.00

      Table S6.  The goodness of fit between the predicted curve and the actual incidence curve in Guangdong

    • To account for the uneven distribution of the population among prefecture-level cities, we produced geospatial maps according to incidence (per million people), with the severity of the disease represented by color brightness (Figure 3).

      Figure 3.  Geospatial maps showing HFMD incidence in Shandong Province and Guangdong Province. (A) Shandong on week 29. (B) Shandong on week 33. (C) Guangdong on week 26. (D) Guangdong on week 39.
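      The per-million normalization behind these maps can be sketched as below. The city names, case counts and populations are hypothetical, and the linear brightness scaling is an assumption; the paper does not describe the color mapping in detail.

```python
def incidence_per_million(cases, population):
    """Reported cases per one million residents."""
    return cases * 1_000_000 / population

def to_brightness(rates):
    """Scale incidence rates to [0, 1] so the highest rate maps to full brightness.

    Assumption: linear scaling, chosen only for illustration.
    """
    top = max(rates.values())
    return {city: (rate / top if top > 0 else 0.0) for city, rate in rates.items()}

# Hypothetical weekly counts and populations, for illustration only
weekly = {"City A": (450, 9_000_000), "City B": (60, 3_000_000)}
rates = {city: incidence_per_million(c, p) for city, (c, p) in weekly.items()}
shades = to_brightness(rates)
```

      Normalizing by population lets a small city with a high attack rate stand out on the map, which raw case counts would hide.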

      The overall peak period for Shandong was at approximately weeks 25–33, and the most seriously affected areas were around Dongying, Jinan and Qingdao, from which the disease spread to surrounding areas. In the week 29 comparison, the prediction results of the model were basically consistent with the actual high-incidence areas, and the actual rates were also accurately predicted for marginal cities such as Weihai and Zaozhuang. By weeks 30–33, the early warning effect of the model was more obvious. Jinan, Weihai, Qingdao and Dongying were shaded more intensely than the other cities around Jinan, as the epidemic intensified after the summer season. This shows that the model can capture spatial information and use it in its prediction.

      For Guangdong, the high risk areas were concentrated in the Pearl River Delta region, while the summer and autumn peaks occurred at approximately 24–28 weeks and 36–40 weeks, respectively. As the maps show, we found that week 26 in summer and week 39 in autumn gave representative predictions for the major cities that were close to the real incidence rates. However, the early warning effect was not observed for the other regions, as their incidence rates were significantly lower than that in the Pearl River Delta region, and changes in morbidity rates were not obvious in the geospatial map.

    • Four common disease prediction models were compared with our STGCN model in this study, and the prediction performances of these models for Shandong and Guangdong, with different prediction lengths (4 weeks, 8 weeks, and 12 weeks), are summarized in Table 1, showing the MAE and RMSE values.

      Area        Model       4 weeks             8 weeks             12 weeks
                              MAE      RMSE       MAE      RMSE       MAE      RMSE
      Shandong    HA          124.98   196.05     141.65   219.14     153.74   233.09
                  SVR         107.12   130.73     110.23   150.69     122.80   160.88
                  LSTM        55.83    107.53     142.51   172.98     149.33   183.26
                  CONV-LSTM   58.08    97.67      58.95    106.39     61.38    111.59
                  STGCN       50.38    95.07      51.55    96.18      51.29    96.01
      Guangdong   HA          286.79   502.38     322.68   550.92     344.92   569.47
                  SVR         177.68   294.34     233.00   460.24     244.48   383.71
                  LSTM        158.46   28.43      170.99   339.68     184.52   390.29
                  CONV-LSTM   147.28   266.25     163.51   338.87     168.30   348.65
                  STGCN       144.79   262.15     158.48   306.87     159.92   320.76
        Note. HA, historical average model. SVR, support vector regression. LSTM, long short-term memory. CONV-LSTM, convolutional LSTM. STGCN, spatial-temporal graph convolutional network. RMSE, root mean square error. MAE, mean absolute error.

      Table 1.  Comparison of the five prediction models

      The historical average (HA) model treats incidence as a seasonal process and uses the average of previous seasons as the prediction. In this study, we used the incidence data of HFMD for each city for 24 consecutive weeks as the input, calculated the average value and used this as the predicted value of the subsequent incidence.

      The support vector regression (SVR) model is a common time series prediction model that maps low-dimensional data to a higher-dimensional space and then fits a regression hyperplane in that space. In this study, a radial basis function was used as the kernel function, and the penalty parameter and the degree of the polynomial kernel function were set to 1.0 and 3, respectively.

      The LSTM model can fully extract time information by stacking multiple LSTM cell structures. In this study, the model consisted of two LSTM units with activation functions, followed by a fully connected layer and an output layer. One hundred neurons were set in the fully connected layer, and the number of neurons in the output layer was the number of cities, corresponding to the prediction results for each city.

      The CONV-LSTM model can not only establish temporal relations as an LSTM model does, but also has the capability of a CNN model to capture spatial features hidden within the data. It has a convolution structure between different states and forms a predictive structure by stacking CONV-LSTM layers. The model consists of one CONV-LSTM layer, one LSTM unit, one fully connected layer and one output layer. The number of neurons in the output layer is the number of cities, corresponding to the prediction results for each city.
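      Of the four baselines, the historical average is simple enough to show in full. The sketch below follows the description above (mean of the most recent 24 weeks used as the forecast); the function name and the toy series are illustrative, and applying the same value to every forecast week is an assumption.

```python
def historical_average_forecast(series, history=24):
    """Predict subsequent incidence as the mean of the last `history` weeks."""
    window = series[-history:]
    return sum(window) / len(window)

# Toy weekly case counts for one city: 6 quiet weeks, then 12 + 12 busier weeks
weekly_cases = [0] * 6 + [10] * 12 + [20] * 12
forecast = historical_average_forecast(weekly_cases, history=24)
```

      Because the forecast is a flat average, this baseline cannot anticipate a rising or falling trend, which is consistent with it having the largest errors in Table 1.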

      As shown in Table 1, the STGCN achieved excellent results on both provincial datasets. It greatly outperformed the purely temporal models, including the HA, SVR and LSTM models. Among the spatiotemporal models, the STGCN also surpassed CONV-LSTM, which is based on convolution and gated networks. Whereas CONV-LSTM uses a recurrent architecture in which the height and width of each layer remain constant, the STGCN uses multiple convolution kernels of different sizes and an encoder-decoder-like architecture, so that the model can learn the characteristics of incidence rates over different time spans. Moreover, the STGCN can learn correlations between different cities by changing the width of the convolution kernel, which can greatly increase its predictive power.

    • Epidemiological studies have indicated diverse seasonal patterns of HFMD incidence in southern and northern China. Different climatic conditions may lead to different seasonal characteristics between the northern and southern regions. Discriminant analysis confirmed that climate factors[20-21] were the main predictors of the epidemiological distribution of HFMD throughout mainland China. Therefore, pooled national incidence data are unsuitable for establishing an incidence model, as they would introduce deviations and fluctuations into the model. In this study, Guangdong Province was selected as a representative low-latitude region, standing for a high-incidence province in the subtropical zone, and Shandong Province was selected as a representative high-latitude region with a temperate monsoon climate, to establish the HFMD prediction model.

      Over the past few years, graph convolutional networks have attracted widespread attention because of their powerful modeling capabilities, and they have been successfully applied to areas such as traffic prediction[22] and recommender systems[23]. By making better use of topological structure, they have achieved a significant improvement over traditional machine learning methods in mid- and long-term traffic prediction. Based on this, we applied the deep neural network STGCN to an early warning system for infectious diseases. Temporal information was extracted by a convolutional neural network, while spatial information was captured by a graph convolution algorithm. Finally, by stacking the space-time convolution blocks, deeper space-time features were extracted successively. Thus, the reliability of trend prediction by this model was greatly increased.
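To make the spatial step more concrete, the sketch below shows a single graph convolution over a small city graph. It assumes the widely used first-order propagation rule, ReLU(D^{-1/2}(A + I)D^{-1/2} X W); the graph convolution inside the STGCN used in this study may differ (for example, a Chebyshev polynomial approximation). The three-city adjacency matrix, feature values and weights are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each city keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def graph_conv(adj, features, weights):
    """One graph convolution layer: ReLU(A_norm @ X @ W)."""
    mixed = matmul(normalized_adjacency(adj), features)
    out = matmul(mixed, weights)
    return [[max(0.0, value) for value in row] for row in out]


# Three hypothetical cities connected in a chain: 0 - 1 - 2.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
# One feature per city: an illustrative weekly incidence value.
x = [[4.0], [0.0], [0.0]]
w = [[1.0]]  # identity weight, so only the spatial mixing is visible
h = graph_conv(adjacency, x, w)
```

In this toy case the signal from city 0 spreads to its neighbor, city 1, but not to city 2, which is two hops away. Stacking several such layers, as in the space-time convolution blocks, lets information travel across more distant cities.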

      According to the RMSE and MAE values, the optimum parameters for Shandong and Guangdong were clearly different, which indicated that the stability and consistency of the data may influence model construction and output. If the input data set is too small, the model cannot effectively capture the hidden features, which leads to large prediction errors. However, if the input data set is too large, there may be too much interference for the model, resulting in learning difficulties. Guangdong showed a more complicated incidence profile. Although most cities in Guangdong showed two peaks in the incidence data, the form of the autumn peak varied from year to year. Moreover, the second (autumn) peak gradually converged with the first (summer) peak after 2017, with the second peak exceeding the first in some cities. Thus, the predictive model needs more disease data to train and verify the results.
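For reference, the two error measures used to compare parameter settings can be computed as in the following sketch, which uses the standard definitions of RMSE and MAE. The observed and forecast values are illustrative only and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative weekly case counts for one hypothetical city.
observed = [100.0, 120.0, 90.0, 110.0]
forecast = [110.0, 120.0, 80.0, 110.0]
errors = (rmse(observed, forecast), mae(observed, forecast))
```

RMSE penalizes large individual misses more heavily than MAE, so reporting both helps distinguish a model that is slightly off every week from one that is accurate in most weeks but misses a peak badly.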

      After a series of parameter calibrations, the forecast data became more consistent with the actual incidence, especially in cities with a high disease burden. For the northern cities with distinct seasons, the model works well, particularly when there are more than 4,000 reported cases or an incidence higher than 10/million people. For the southern cities with more variable epidemic features, the predictive model requires more data, and the optimum curve was obtained with more than 10,000 reported cases or an incidence higher than 30/million people. In general, for cities with a lower population density and incidence, the occurrence of infectious diseases tends to be occasional and sporadic. For example, in Yangjiang, Shantou and Shanwei in Guangdong Province, which have low morbidity and population density, the HFMD epidemic data are atypical and irregular, which leads to poor simulation results with the model. For cities with a high incidence of HFMD, the model can support preventive measures up to 3 months in advance, allowing health administrative departments to initiate control measures (akin to those from pandemic preparedness plans) including surveillance, mandatory reporting, isolation, school closures and social distancing.

      In conclusion, we present a pioneering early warning model with the potential to greatly reduce the incidence of HFMD by allowing effective prevention and control measures to be put in place in advance. This model could also be applied to the prevention and control of other infectious diseases. However, it will be important to consider seasonal differences in climatic conditions that may affect disease epidemics in different regions, such as the average temperature, rainfall, wind speed, relative humidity and daylight duration. Moreover, studies on a variety of disease epidemics indicate that the characteristics of region-varying and time-varying parameters, and spatially stratified heterogeneity, should also be fully considered. Therefore, we believe that the model can be optimized by the addition of more ancillary factors in the future.

    • LIU Hong Tu, YANG Guan Yu, and XU Wen Bo conceived and designed the study. JI Tian Jiao and CHENG Qiang analyzed and interpreted the data. ZENG Han Ri and WANG Jian Xing performed the experiments and entered previous cases into the database. JI Tian Jiao and ZHANG Yong prepared the manuscript. All authors approved the manuscript.

    • The authors declare that there are no conflicts of interest.

    • The individual participant-level data that underlie the results reported in this article will be shared after de-identification (text, tables, figures and appendices).

    • Our heartfelt thanks to the Laboratory of Imaging, Southeast University, for the equipment and hardware they provided. We also appreciate the pediatricians who reported cases to the surveillance system for HFMD and all of the staff members responsible for specimen collection, virus isolation and shipment in Guangdong and Shandong Province Centers for Diseases Prevention and Control.

Supplements:
Supplementary-Guangdong_video.mp4
Supplementary-Shandong_video.mp4
22007Supplementary Materials.pdf
