Journal of the Korean Society for Information Management, Korean Society for Information Management

11

김성진 (Inha Technical College) 2004, Vol.21, No.2, pp.211-233 https://doi.org/10.3743/KOSIM.2004.21.2.211

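Abstract (Korean)

Because the web is a new information environment distinct from the traditional information retrieval systems studied to date, understanding the interaction between users and information retrieval systems on the web requires sufficient research from a new perspective, and a web-based information seeking paradigm that supports such research needs to be established. In this context, this study reviewed and comparatively analyzed the theoretical models proposed in the literature on web information seeking behavior. The models proposed in the studies of Wang, Hawk, and Tenopir; Hsieh-Yee; Choo, Detlor, and Turnbull; Chun and Cooper; Rieh; and Spink were discussed. The analysis shows that web information seeking models fall into three broad groups (interaction models, information seeking behavior models, and evaluation models) and that, compared with traditional information seeking process models, they emphasize the interaction of multiple factors and a nonlinear view of information seeking behavior.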

Abstract

The web is a new information environment with characteristics different from those of a traditional IR environment. More research from a new point of view, together with the adoption of a new research paradigm, is needed to understand user-system interaction on the web. The purpose of this study is to review and analyze the models of web-based information seeking behavior proposed by Wang, Hawk & Tenopir; Hsieh-Yee; Choo, Detlor & Turnbull; Chun & Cooper; Rieh; and Spink. The comparative analysis indicates that web-based information seeking models fall into three areas (interaction models, information seeking behavior models, and evaluation models) and that they are based on multifaceted interaction and a nonlinear perspective.

12

Web Information Retrieval Based on Natural Language Query Analysis and Keyword Expansion

윤성희 (Sangmyung University) 2004, Vol.21, No.2, pp.235-248 https://doi.org/10.3743/KOSIM.2004.21.2.235

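Abstract (Korean)

Compared with keywords and Boolean expressions, entering a natural language query sentence is a far more ideal interface for users of a web document retrieval system. This paper proposes a multi-search technique that parses the natural language query entered by the user and expands the search terms based on its syntactic structure. By traversing the parse tree to combine or split structurally related compound nouns, and by running expanded multiple searches over variant spellings and abbreviated forms, the recall and precision of a web information retrieval system can be improved.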

Abstract

For users of information retrieval systems, a natural language query is a more ideal interface than keywords and Boolean expressions. This paper proposes a retrieval technique that expands keywords from the syntactically analyzed structure of the natural language query entered by the user. By combining or splitting compound nouns during a traversal of the query's syntax tree, and by expanding variant spellings and abbreviated forms into multiple keywords, the technique can improve the recall and precision of the retrieval system.
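The expansion step described above can be illustrated with a minimal sketch. It assumes the syntactic analysis has already produced groups of adjacent nouns, and it uses a small hypothetical variant table; the function names `expand_compound` and `expand_query` and the table contents are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical variant table mapping a term to alternate or abbreviated forms.
# The entries are illustrative assumptions only.
VARIANTS = {
    "database": ["DB"],
    "information retrieval": ["IR"],
}


def expand_compound(nouns):
    """Return the full compound, every contiguous sub-phrase, and each single noun.

    Longer phrases come first, so the most specific term leads the list.
    """
    terms = []
    n = len(nouns)
    for length in range(n, 0, -1):
        for start in range(0, n - length + 1):
            phrase = " ".join(nouns[start:start + length])
            if phrase not in terms:
                terms.append(phrase)
    return terms


def expand_query(noun_groups):
    """Expand each noun group and add any known variant of each resulting term."""
    expanded = []
    for nouns in noun_groups:
        for term in expand_compound(nouns):
            for candidate in [term] + VARIANTS.get(term, []):
                if candidate not in expanded:
                    expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    # Noun groups as a parser might return them for a query about
    # "web information retrieval" (assumed input).
    print(expand_query([["web", "information", "retrieval"]]))
```

Each expanded term would then be submitted as its own search and the results merged, which is one way to read the "multiple keyword" retrieval the abstract mentions.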

13

A Study on the Development of a Standard Management Model for Theses and Dissertations in Korea

윤희윤 (Daegu University) 2004, Vol.21, No.3, pp.99-123 https://doi.org/10.3743/KOSIM.2004.21.3.099

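Abstract (Korean)

Traditionally, theses and dissertations have been undervalued because of their lack of physical availability, even though they hold considerable value as scholarly information. Resolving this requires the development of electronic thesis and dissertation systems, which will provide an opportunity for theses and dissertations to be recognized as a basic channel for disseminating research results and as a core resource in the information seeking process. For this reason, many universities (and their libraries) around the world have recently been converting traditional (print) theses and dissertations into electronic versions to meet the demands of the digital environment. Against this background, this study proposes a standard management model for theses and dissertations in Korea, taking as its theoretical basis prior research that analyzed the current state of thesis and dissertation management systems through web questionnaires and surveys of home pages.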

Abstract

Traditionally, theses and dissertations have been extremely underutilized information sources because of their lack of physical availability. The development of electronic thesis and dissertation systems will provide the opportunity for theses and dissertations to be recognized as a basic channel for the dissemination of research findings and as an essential resource in the discovery process. In the digital environment, many universities and libraries throughout the world are now making digitized versions of traditional (print) dissertations available online. The purpose of this paper is to analyze current thesis and dissertation management systems, based on web questionnaires and a survey of home pages, and to suggest a standardized thesis and dissertation management model for Korea.

14

A Study on the Visual Representation of TREC Text Documents in the Construction of Digital Libraries

정기태 (Assistant Professor, University of Oklahoma School of Library and Information Studies); 박일종 (Keimyung University) 2004, Vol.21, No.3, pp.1-14 https://doi.org/10.3743/KOSIM.2004.21.3.001

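Abstract (Korean)

When searching for similar documents, users are helped by visual representations of the documents, and all research on information retrieval offers various solutions for meeting users' diverse needs. The proposed solutions range from alphabetically ordered papyrus documents to card catalogs, microfilm storage, and files kept on computer disks. In addition, most information retrieval systems use document surrogates, that is, bibliographic data such as summaries, tables of contents, abstracts, reviews, and machine-readable cataloging (MARC) records, in place of the full text.

This paper investigates another form of document surrogate, obtained by applying a grouping method to term lists. These document surrogates are represented as coordinates on a two-dimensional graph using multidimensional scaling (MDS). The distance between coordinates on the graph can be interpreted as indicating the similarity of the documents: the closer two points are, the more similar the content of the two documents.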

Abstract

Visualization of documents helps users when they search for similar documents, and all research in information retrieval addresses the problem of a user with an information need facing a data source that contains an acceptable solution to that need. In various contexts, adequate solutions to this problem have included alphabetized cubbyholes housing papyrus rolls, microfilm registers, card catalogs, and inverted files coded onto discs. Many information retrieval systems rely on the use of a document surrogate. Though they might be surprised to discover it, nearly every information seeker uses an array of document surrogates. Summaries, tables of contents, abstracts, reviews, and MARC records are all document surrogates. That is, they stand in for a document, allowing a user to make some decision regarding it: whether to retrieve a book from the stacks, whether to read an entire article, and so on.

In this paper another type of document surrogate is investigated, using a grouping method applied to term lists. Using multidimensional scaling (MDS), those surrogates are visualized on a two-dimensional graph. The distances between dots on the graph can be interpreted as the similarity of the documents: the closer the dots, the more similar the documents.
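To make the mapping from document dissimilarities to two-dimensional coordinates concrete, the following is a minimal sketch of classical (Torgerson) MDS in plain Python. It is an assumption that this variant resembles what the authors used; the abstract names only "MDS". The input distance matrix, the function names, and the power-iteration eigen-solver are all illustrative choices.

```python
import math


def _matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def _top_eigenpair(matrix, iters=500):
    """Approximate the dominant eigenpair of a symmetric matrix by power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # fixed start vector keeps runs repeatable
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iters):
        nxt = _matvec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # matrix maps the vector to (almost) zero
            return 0.0, vec
        vec = [x / norm for x in nxt]
    # The Rayleigh quotient gives the signed eigenvalue for the unit vector.
    eigval = sum(x * y for x, y in zip(vec, _matvec(matrix, vec)))
    return eigval, vec


def classical_mds(dist, dims=2):
    """Embed n items in `dims` dimensions from an n x n symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        eigval, vec = _top_eigenpair(b)
        if eigval <= 1e-12:  # no further positive eigenvalue to use
            break
        scale = math.sqrt(eigval)
        for i in range(n):
            coords[i][d] = scale * vec[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - eigval * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


if __name__ == "__main__":
    # Assumed dissimilarities between three documents (a 3-4-5 triangle).
    distances = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
    for point in classical_mds(distances):
        print(point)
```

In practice the distance matrix would be derived from the term-list surrogates (for example, one minus a similarity coefficient), and documents plotted close together would be read as similar.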

15

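Design and Implementation of an Identifier-Based Electronic Full-Text Linking System

이상환 (Korea Institute of Science and Technology Information); 신동구 (Korea Institute of Science and Technology Information); 김재수 (Korea Institute of Science and Technology); 정택영 (Korea Institute of Science and Technology Information); 최진영 (Korea University) 2004, Vol.21, No.3, pp.15-29 https://doi.org/10.3743/KOSIM.2004.21.3.015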

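Abstract (Korean)

Author notes: Senior Researcher, Korea Institute of Science and Technology Information (sanglee@kisti.re.kr); Researcher, Korea Institute of Science and Technology Information (lovesin@kisti.re.kr); Senior Researcher, Korea Institute of Science and Technology Information (jaesoo@kisti.re.kr); Professor, Department of Computer Science, Korea University (choi@formal.korea.re.kr); Principal Researcher, Korea Institute of Science and Technology Information (tychung@kisti.re.kr). Received: June 24, 2004. Accepted: September 17, 2004.

With the rapid development of information and communication technology and the internet, existing physical works are rapidly being converted into digital content, and current ways of accessing and serving digital content resources, along with existing identifiers, are inadequate and limited in identifying digital content in a way that matches its characteristics. In addition, the DOI identifier system, which satisfies the URN specification, is used only for academic journal forms such as journals and conference materials, so an identifier system applicable to the various non-academic-journal forms is needed. Accordingly, by analyzing how major overseas digital content service institutions use identifier systems, and by analyzing five types of material held by KISTI (two academic journal types and three non-academic-journal types), we aim to develop a unique identifier applicable not only to academic journals but also to non-academic journals, and to design and implement an electronic full-text linking system based on that identifier.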

Abstract

With the rapid growth of information technology and the internet, physical content is being transformed into digital content at a fast rate. With this change, the methods for accessing and serving digital content, and the identifiers used for it, are unsystematic and limited in use. The DOI identifier system, which conforms to the URN specification, is also limited to academic journals and magazines and is not adequately applicable to non-academic journals or other digital content. Therefore, based on an analysis of the systems adopted by foreign digital content service institutes and of two types of academic journals and three types of non-academic journals held by KISTI, we have developed a unique identifier that can also be adopted by non-academic journals. The identifier is to be used to design and implement a digital content service system.

16

Evaluating the Retrieval Performance of XML Documents Using an Object-Relational Database

김희섭 (Kyungpook National University) 2004, Vol.21, No.2, pp.189-210 https://doi.org/10.3743/KOSIM.2004.21.2.189

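Abstract (Korean)

The purpose of this study is to evaluate the retrieval performance of XML documents using an object-relational database approach. This paper describes the indexing and retrieval methods for XML documents at INEX (Initiative for the Evaluation of XML retrieval) and the experimental methodologies. As in most traditional information retrieval evaluation experiments, the test collection used in this study consisted of documents (that is, XML documents), topics, ad hoc retrieval, relevance judgments, and evaluation. EXIMA™ Supply, a kind of dedicated XML database developed on the basis of ORDBMS technologies, was used to store and search the large collection of XML documents provided by INEX. This paper discusses an overview of the functions of the system used in the experiment, the indexing and retrieval process, the performance evaluation results at INEX 2002, and the functions that need to be improved in the future.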

Abstract

The purpose of this study is to evaluate the performance of XML retrieval based on an ORDBMS (Object-Relational Database Management System) approach. This paper describes indexing and retrieval methods for XML documents and the experimental methodologies at INEX (Initiative for the Evaluation of XML retrieval). As in any other traditional information retrieval experiment, the test collection consisted of documents, topics/queries, tasks, relevance assessments, and evaluation. EXIMA™ Supply, a kind of native XML DB based on ORDBMS technologies, was used for this experiment. Although this approach has many benefits, for example no delay in storing and searching XML documents, it showed relatively disappointing retrieval performance at INEX 2002. This result may have arisen because the given topics had to be decomposed and modified to be processed by the XPath processor; during this modification the original meaning of a topic can inevitably change and some important information may be lost.

17

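Cluster Hierarchy Depth and Initial Value Selection in Hierarchical Clustering Using the K-Means Algorithm

이신원 (Jungwon University); 안동언 (Chonbuk National University); 정성종 (Chonbuk National University) 2004, Vol.21, No.4, pp.173-185 https://doi.org/10.3743/KOSIM.2004.21.4.173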

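Abstract (Korean)

As information and communication technology advances, the volume of information grows and the result lists returned for user queries become long, so fast, high-quality document clustering algorithms play an important role. Many papers report good performance using hierarchical clustering methods, but these are time-consuming. The K-means algorithm, by contrast, can reduce time complexity. In this paper we implemented Condor, a hierarchical clustering system, so that information can be retrieved simply, efficiently, and with high quality. The system uses the K-means algorithm and achieved a precision of 88% by adjusting the cluster hierarchy depth and the initial values.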

Abstract

Fast, high-quality document clustering algorithms play an important role in data exploration by organizing large amounts of information into a small number of meaningful clusters. Many papers have shown that hierarchical clustering performs well but is limited by its quadratic time complexity. In contrast, with a large number of variables, K-means has a time complexity that is linear in the number of documents, but it is thought to produce inferior clusters. In this paper, the Condor system, which uses the K-means algorithm, is compared with the regular method in which the initial centroids are established in advance, and the performance of our method is considerably improved.
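For readers unfamiliar with the algorithm, the following is a minimal sketch of standard K-means (Lloyd's algorithm) with caller-supplied initial centroids, which is the baseline setting the abstract refers to. It is not the Condor system; the function name `kmeans` and the sample points are illustrative assumptions. Each pass costs time proportional to the number of points times the number of centroids, which is the source of the linear-in-documents behavior mentioned above.

```python
def kmeans(points, centroids, max_iters=100):
    """Cluster `points` around the given initial `centroids`.

    Returns (assignment, centroids), where assignment[i] is the index of the
    centroid nearest to points[i] after convergence.
    """
    centroids = [list(c) for c in centroids]
    assignment = []
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance).
        new_assignment = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2
                                  for p, c in zip(point, centroids[k])))
            for point in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [pt for pt, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```

A hierarchy of the kind the title describes could be built by applying such a routine recursively to each resulting cluster down to a chosen depth, though the abstract does not state how Condor does this.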

18

A Study on Hierarchical Clustering for Automatic Classified Browsing in OPACs

노정순 (Hannam University) 2004, Vol.21, No.1, pp.93-117 https://doi.org/10.3743/KOSIM.2004.21.1.093

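Abstract (Korean)

This study sought the optimal hierarchical clustering model for applying hierarchical clustering in an OPAC to classify holdings into a hierarchy for browsing. A test collection was built from monographs and theses in library and information science, and experiments were run with the following variables: indexing techniques (automatic indexing of title words, and combined indexing with controlled terms), term weighting techniques (absolute frequency and binary frequency), similarity coefficients (Dice, Jaccard, Pearson, cosine, and squared Euclidean), and clustering techniques (between-groups average linkage, within-groups average linkage, and complete linkage). The results show that, apart from between-groups average linkage and squared Euclidean similarity, the remaining similarity coefficients and clustering techniques produced relatively good clusters, and that the best clusters were produced when combined controlled-term indexing was weighted by binary frequency and clustered with complete linkage or between-groups average linkage. However, between-groups average linkage with the Jaccard coefficient was more similar to the decimal structure.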

Abstract

This study aims to develop a hierarchic clustering model for document classification and browsing in OPAC systems. Two automatic indexing techniques (with and without controlled terms), two term weighting methods (term frequency and binary weight), five similarity coefficients (Dice, Jaccard, Pearson, Cosine, and Squared Euclidean), and three hierarchic clustering algorithms (Between Average Linkage, Within Average Linkage, and Complete Linkage) were tested on a document collection of 175 books and theses on library and information science. The best document clusters resulted from the Between Average Linkage or Complete Linkage method with the Jaccard or Dice coefficient on automatic indexing with controlled terms in binary vectors. The clusters from Between Average Linkage with Jaccard more closely resembled the decimal classification structure.
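As an illustration of two of the ingredients named above, the sketch below computes the Jaccard and Dice coefficients on binary term sets and runs a naive between-groups average linkage clustering. The sample documents and the function names are assumptions for illustration; the study itself presumably used a statistical package, which the abstract does not name.

```python
def jaccard(a, b):
    """Jaccard coefficient: shared terms divided by all distinct terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice coefficient: twice the shared terms divided by the total term count."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def average_linkage(docs, similarity, n_clusters):
    """Merge the two clusters with the highest mean pairwise similarity
    (between-groups average linkage) until `n_clusters` remain."""
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                score = sum(similarity(docs[i], docs[j])
                            for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


if __name__ == "__main__":
    # Hypothetical binary index terms for four records.
    records = [
        {"library", "catalog"},
        {"library", "catalog", "opac"},
        {"cluster", "kmeans"},
        {"cluster", "kmeans", "centroid"},
    ]
    print(average_linkage(records, jaccard, 2))
```

Recording the order of merges, instead of stopping at a fixed cluster count, would give the hierarchy that could then be compared against a decimal classification structure.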
