Hierarchic Document Clustering in OPAC

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2004, v.21 no.1, pp.93-117
https://doi.org/10.3743/KOSIM.2004.21.1.093

Abstract

This study develops a hierarchic clustering model for document classification and browsing in OPAC systems. Two automatic indexing techniques (with and without controlled terms), two term weighting methods (term frequency and binary weight), five similarity coefficients (Dice, Jaccard, Pearson, Cosine, and Squared Euclidean), and three hierarchic clustering algorithms (Between Average Linkage, Within Average Linkage, and Complete Linkage) were tested on a collection of 175 books and theses on library and information science. The best document clusters resulted from the Between Average Linkage or Complete Linkage method with the Jaccard or Dice coefficient, applied to binary vectors produced by automatic indexing with controlled terms. The clusters from Between Average Linkage with the Jaccard coefficient were more likely to show a decimal classification structure.
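As a rough illustration of the best-performing combination named in the abstract, the sketch below computes Jaccard and Dice coefficients on binary index-term sets and merges documents by average linkage. It is not the study's implementation: the four toy documents and their terms are invented, the function names are hypothetical, and "Between Average Linkage" is assumed here to mean the mean similarity over all cross-cluster document pairs.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A & B| / |A | B| for binary term sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a: set, b: set) -> float:
    """Dice coefficient 2|A & B| / (|A| + |B|) for binary term sets."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def between_average_linkage(docs: dict, sim=jaccard) -> list:
    """Agglomerative clustering by average cross-cluster similarity.

    Returns the merge history as (cluster_a, cluster_b, similarity)
    tuples, most similar pair first. Assumed reading of the method,
    written for clarity rather than speed.
    """
    def avg_sim(c1, c2):
        # Mean similarity over every pair with one document in each cluster.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(sim(docs[x], docs[y]) for x, y in pairs) / len(pairs)

    clusters = [(name,) for name in docs]  # start with singleton clusters
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average similarity.
        best = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        score = avg_sim(*best)
        clusters = [c for c in clusters if c not in best]
        clusters.append(best[0] + best[1])
        merges.append((best[0], best[1], score))
    return merges


# Invented example data: each document is a set of controlled index terms.
docs = {
    "d1": {"cataloging", "classification", "opac"},
    "d2": {"cataloging", "classification", "metadata"},
    "d3": {"retrieval", "indexing", "clustering"},
    "d4": {"retrieval", "indexing", "thesaurus"},
}

merges = between_average_linkage(docs)
for left, right, score in merges:
    print(left, right, round(score, 3))
```

On this toy data the two cataloging documents and the two retrieval documents pair up first, and the two resulting clusters merge last because they share no terms. Passing `sim=dice` swaps in the Dice coefficient without changing the linkage step.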

Keywords
online catalog, OPAC, document clustering, hierarchic clustering, automatic classification, browsing, similarity coefficient

