An Experimental Study on Opinion Classification Using Supervised Latent Semantic Indexing (LSI)

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2009, v.26 no.3, pp.451-462
https://doi.org/10.3743/KOSIM.2009.26.3.451



Abstract

The aim of this study is to apply latent semantic indexing (LSI) techniques to the efficient automatic classification of opinionated documents. For the experiments, we collected 1,000 opinionated documents such as reviews and news, 500 of them labelled as positive and the remaining 500 as negative. Sets of content words and sentiment words were extracted using a POS tagger in order to identify the optimal feature set for opinion classification. The findings showed that LSI techniques were more effective than a term indexing method for sentiment classification. The best performance was achieved by a supervised LSI technique.
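
As a rough illustration of how supervised LSI can differ from plain term indexing at the input stage, the sketch below builds a raw term-document matrix and then appends class-label pseudo-terms to it, an approach known in the literature as sprinkling. Whether the study's supervised LSI follows this variant is an assumption; the abstract does not say. The toy documents, the function names `term_document_matrix` and `sprinkle`, and the `copies` parameter are hypothetical, and the SVD step that LSI applies to the resulting matrix is omitted.

```python
from collections import Counter


def term_document_matrix(docs, vocab):
    """Term indexing: rows are terms, columns are documents, cells are raw counts."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    return [[counts[j][term] for j in range(len(docs))] for term in vocab]


def sprinkle(matrix, labels, classes, copies=2):
    """Append `copies` pseudo-term rows per class (hypothetical helper).

    A pseudo-term row holds 1 for every training document that carries the
    class label and 0 elsewhere, so same-class documents share extra "terms".
    """
    label_rows = [[1 if label == cls else 0 for label in labels]
                  for cls in classes
                  for _ in range(copies)]
    return matrix + label_rows


# Toy training data, invented for illustration (not taken from the study).
docs = ["great acting great story", "dull story awful acting"]
labels = ["positive", "negative"]
vocab = sorted({word for doc in docs for word in doc.split()})

plain = term_document_matrix(docs, vocab)
sprinkled = sprinkle(plain, labels, ["positive", "negative"], copies=2)

for term, row in zip(vocab, plain):
    print(term, row)
print("rows after sprinkling:", len(sprinkled))
```

In a full pipeline, a truncated SVD of `sprinkled` would yield the latent space, and the pseudo-term rows would typically be dropped before unlabeled test documents are compared against the training documents.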

Keywords
opinion mining, automatic opinion classification, sentiment classification, concept indexing, term indexing, latent semantic indexing, LSI, supervised latent semantic indexing

