
유사문헌집단에서 적합/부적합정보의 유용성에 관한 연구

A Study on the Utility of Relevance/Non-relevance Information in Homogeneous Documents

정보관리학회지 / Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2015, v.32 no.3, pp.277-293
https://doi.org/10.3743/KOSIM.2015.32.3.277
문성빈 (Yonsei University)

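Abstract (translated from Korean)

This paper divides document relevance into four levels by degree of relevance (not relevant, marginally relevant, relevant, highly relevant). Using four relevance judgment sets (A, B, C, D), each produced by a different assessor, it examines whether the Title/Abstract system or the Full-text retrieval system gains more in retrieval effectiveness from relevance feedback when “marginally relevant” documents are classified as non-relevant and when they are classified as relevant. When “marginally relevant” documents were included among the relevant documents, the Title/Abstract system showed a higher rate of increase in retrieval effectiveness than the Full-text retrieval system for every relevance judgment set. In the Full-text retrieval system, by contrast, the rate of increase was consistently higher when “marginally relevant” documents were excluded from the relevant group. This suggests that, in the Full-text retrieval system, relevance feedback information obtained from “marginally relevant” documents that are counted as relevant acts as noise and does not help improve retrieval effectiveness. In particular, for full-text retrieval systems that index and search highly homogeneous documents, research must continue on sophisticated retrieval techniques that improve the low precision caused by noise.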

Keywords
relevance, relevance judgment, relevance feedback, full-text retrieval system, retrieval effectiveness

Abstract

This study compared the retrieval effectiveness of two systems (Title/Abstract and Full-text) after relevance feedback, using four different sets of relevance judgments. Documents were assigned to one of four relevance levels (not relevant, marginally relevant, relevant, highly relevant) according to the relevance scores given by referees. The study also investigated how much average precision improved after relevance feedback when “marginally relevant” documents were included in the relevant class, for both the Title/Abstract system and the Full-text retrieval system. The Title/Abstract system benefited from relevance feedback that included the marginally relevant documents: its percentage improvement was consistently higher when those documents were included in the relevant class. The Full-text retrieval system showed the opposite pattern, with consistently higher improvement when the marginally relevant documents were excluded. This implies that marginally relevant documents in the relevant class acted as noise in the Full-text retrieval system.
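The comparison described in the abstract can be illustrated with a minimal sketch: binarize graded judgments with “marginally relevant” either included in or excluded from the relevant class, then measure the percentage change in average precision between a ranking before feedback and one after it. The abstract does not specify the feedback algorithm or the exact evaluation protocol, so the document ids, grades, and both rankings below are hypothetical, and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Set

# The four relevance levels named in the abstract, encoded as integers.
NOT_RELEVANT, MARGINAL, RELEVANT, HIGHLY_RELEVANT = 0, 1, 2, 3


def binarize(grades: Dict[str, int], include_marginal: bool) -> Set[str]:
    """Return the ids treated as relevant under the chosen cut-off."""
    threshold = MARGINAL if include_marginal else RELEVANT
    return {doc for doc, grade in grades.items() if grade >= threshold}


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision of one ranked list."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def percent_improvement(before: float, after: float) -> float:
    """Relative change in effectiveness, in percent."""
    if before == 0:
        raise ValueError("baseline effectiveness must be non-zero")
    return 100.0 * (after - before) / before


# Hypothetical judgments and rankings (not data from the study).
grades = {"d1": HIGHLY_RELEVANT, "d2": MARGINAL, "d3": NOT_RELEVANT,
          "d4": RELEVANT, "d5": MARGINAL}
initial_ranking = ["d3", "d2", "d1", "d5", "d4"]
feedback_ranking = ["d1", "d4", "d2", "d3", "d5"]

for include_marginal in (True, False):
    relevant = binarize(grades, include_marginal)
    ap_before = average_precision(initial_ranking, relevant)
    ap_after = average_precision(feedback_ranking, relevant)
    gain = percent_improvement(ap_before, ap_after)
    print(f"include_marginal={include_marginal}: "
          f"AP {ap_before:.3f} -> {ap_after:.3f} ({gain:.1f}% change)")
```

In the study itself the relevant class would also determine which documents feed the feedback step, so the post-feedback ranking would differ between the two conditions. The sketch holds the rankings fixed only to isolate how the cut-off changes the measured improvement.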


