Detecting Spam Data for Securing the Reliability of Text Analysis 


Vol. 42, No. 2, pp. 493-504, Feb. 2017


  Abstract

Recently, the tremendous amount of unstructured text data distributed through news, blogs, and social media has attracted considerable attention from researchers and practitioners, because this data contains abundant information about consumers’ opinions. However, as text data becomes more useful, attempts to profit by distorting it, whether maliciously or not, are also increasing. The resulting growth in spam text data not only burdens users seeking useful information with large amounts of inappropriate content, but also damages the reliability of both the information and its providers. Efforts must therefore be made to improve the reliability of information and the quality of analysis results by detecting and removing spam data in advance. To this end, spam detection has been actively studied in areas such as opinion spam detection, spam e-mail detection, and web spam detection. In this study, we introduce the core concepts and current research trends of spam detection, and propose a methodology for detecting spam tags in blogs as one challenging attempt to improve the reliability of blog information.


  Cite this article

[IEEE Style]

Y. Hyun and N. Kim, "Detecting Spam Data for Securing the Reliability of Text Analysis," The Journal of Korean Institute of Communications and Information Sciences, vol. 42, no. 2, pp. 493-504, 2017.

[ACM Style]

Yoonjin Hyun and Namgyu Kim. 2017. Detecting Spam Data for Securing the Reliability of Text Analysis. The Journal of Korean Institute of Communications and Information Sciences, 42, 2, (2017), 493-504.

[KICS Style]

Yoonjin Hyun and Namgyu Kim, "Detecting Spam Data for Securing the Reliability of Text Analysis," The Journal of Korean Institute of Communications and Information Sciences, vol. 42, no. 2, pp. 493-504, 2. 2017.