Mining Semantically Similar Tags from Delicious

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2009, v.26 no.2, pp.127-147
https://doi.org/10.3743/KOSIM.2009.26.2.127

Abstract

The synonym problem is an inherent barrier in human-computer communication, and it is even more challenging in Web 2.0 applications, especially social tagging applications. To help address the problem, this study tests the feasibility of a Web 2.0 application as a potential source of synonyms. It investigates a way of identifying similar tags in a popular collaborative tagging application, Delicious. Specifically, we propose an algorithm (FolkSim) for measuring the similarity of social tags from Delicious. We compared FolkSim to a cosine-based similarity method (CosSim) and observed that the top-ranked tags on the similar-tag lists generated by FolkSim tend to be among the best possible similar tags among the given choices. The FolkSim lists also appear to be relatively better than those created by CosSim. We also observed that the tag folksonomy and the similar-tag list resemble each other to a certain degree, so the folksonomy may serve as an alternative, especially when the FolkSim-based list is unavailable or infeasible to compute.
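As a rough illustration of what a cosine-based baseline of this kind involves, the sketch below ranks tags by the cosine similarity of their usage vectors. It is not the paper's implementation: the abstract does not say how tags are represented, so the choice of per-URL usage counts as vector components, the function names `cosine_similarity` and `rank_similar_tags`, and the toy data are all assumptions made for the example. FolkSim itself is not sketched, since the abstract does not describe it.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (dict-like)."""
    # Only dimensions present in both vectors contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar_tags(target, tag_vectors):
    """Return (tag, score) pairs for every other tag, most similar first."""
    target_vec = tag_vectors[target]
    scores = [
        (tag, cosine_similarity(target_vec, vec))
        for tag, vec in tag_vectors.items()
        if tag != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: how often each tag was applied to each bookmarked URL.
TAG_VECTORS = {
    "python": Counter({"url1": 3, "url2": 1}),
    "programming": Counter({"url1": 2, "url2": 1, "url3": 1}),
    "recipes": Counter({"url4": 5}),
}

if __name__ == "__main__":
    for tag, score in rank_similar_tags("python", TAG_VECTORS):
        print(f"{tag}: {score:.3f}")
```

In this toy data, "programming" shares bookmarked URLs with "python" and so ranks above "recipes", which shares none and scores zero.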

Keywords
tag similarity, Delicious, synonym extraction, Folksonomy, web mining

