Influenza Detection Using Twitter and Word Embeddings 


Vol. 45,  No. 1, pp. 96-104, Jan.  2020
10.7840/kics.2020.45.1.96


  Abstract

Influenza causes between 3 and 5 million cases of severe illness and between 290,000 and 650,000 deaths worldwide each year. To minimize its impact, the Korea Centers for Disease Control and Prevention (KCDC) provides influenza surveillance data, but there is a reporting delay of 1 to 2 weeks between an actual outbreak and the release of the data. It is therefore important to detect influenza early by using real-time web data, such as search queries and social network services, to reduce this delay. Twitter is a social network service well suited to predicting influenza outbreaks in real time, and word embeddings can improve the accuracy of predictive models by learning from tweets and extracting words that are highly related to influenza. This study proposes a regression model that learns from tweets using word embeddings, extracts words that are highly related to influenza, and detects signs of influenza in real time from the information in tweets that contain the extracted words. We compared the accuracy of regression models using words extracted by Word2vec, GloVe, and fastText, which are state-of-the-art word embedding methods, and found that the regression models using words extracted by Word2vec had the highest correlation with the surveillance data, 0.9718.
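
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding vectors, tweets, and surveillance values below are invented toy data, and the helper names (related_words, weekly_counts, fit_line, pearson) are hypothetical. It assumes that related words are chosen by cosine similarity to a seed word, that the predictor is the weekly count of tweets containing those words, and that the model is a simple least-squares line evaluated by Pearson correlation.

```python
import math

# Hypothetical toy embeddings. A real pipeline would train Word2vec, GloVe,
# or fastText on a tweet corpus instead of hard-coding vectors.
EMBEDDINGS = {
    "flu": [0.9, 0.1, 0.0],
    "fever": [0.8, 0.2, 0.1],
    "cough": [0.7, 0.3, 0.0],
    "soccer": [0.0, 0.1, 0.9],
    "coffee": [0.1, 0.9, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def related_words(seed, embeddings, top_n):
    """Return the top_n words most similar to the seed word."""
    scored = [(cosine(embeddings[seed], vec), word)
              for word, vec in embeddings.items() if word != seed]
    return [word for _, word in sorted(scored, reverse=True)[:top_n]]

def weekly_counts(tweets_by_week, keywords):
    """Count, per week, the tweets containing at least one keyword."""
    keyset = set(keywords)
    return [sum(1 for tweet in week if keyset & set(tweet.lower().split()))
            for week in tweets_by_week]

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson(x, y):
    """Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented example data: three weeks of tweets and surveillance values.
TWEETS_BY_WEEK = [
    ["I have a fever today", "great soccer match"],
    ["flu season again", "bad cough and fever", "coffee time"],
    ["flu flu flu", "my cough is worse", "fever all night", "soccer"],
]
SURVEILLANCE = [2.1, 3.9, 6.2]  # hypothetical weekly surveillance values

keywords = ["flu"] + related_words("flu", EMBEDDINGS, top_n=2)
counts = weekly_counts(TWEETS_BY_WEEK, keywords)
slope, intercept = fit_line(counts, SURVEILLANCE)
predicted = [slope * c + intercept for c in counts]
r = pearson(predicted, SURVEILLANCE)
print(keywords, counts, round(r, 4))
```

In this sketch the seed word and top_n are arbitrary choices; in practice they would be tuned, and the model would be evaluated on weeks it was not fitted to.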



  Cite this article

[IEEE Style]

I. Kim and B. Jang, "Influenza Detection Using Twitter and Word Embeddings," The Journal of Korean Institute of Communications and Information Sciences, vol. 45, no. 1, pp. 96-104, 2020. DOI: 10.7840/kics.2020.45.1.96.

[ACM Style]

Inhwan Kim and Beakcheol Jang. 2020. Influenza Detection Using Twitter and Word Embeddings. The Journal of Korean Institute of Communications and Information Sciences, 45, 1, (2020), 96-104. DOI: 10.7840/kics.2020.45.1.96.

[KICS Style]

Inhwan Kim and Beakcheol Jang, "Influenza Detection Using Twitter and Word Embeddings," The Journal of Korean Institute of Communications and Information Sciences, vol. 45, no. 1, pp. 96-104, Jan. 2020. (https://doi.org/10.7840/kics.2020.45.1.96)