Journal of the Korean Society for Information Management, Korean Society for Information Management

1

배희진 (Soongsil University); 박소연 (Duksung Women's University); 이준호 (Soongsil University); 이진숙 (Soongsil University) 2004, Vol.21, No.1, pp.173-186 https://doi.org/10.3743/KOSIM.2004.21.1.173

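Korean Abstract

This study analyzes the coverage and coverage overlap of the web directories provided by Naver, Yahoo Korea, and Empas, the major web search portals in Korea. To this end, the study develops a method for collecting the sites registered in the web directories and presents the methodology needed to analyze coverage and coverage overlap, including main-category mapping and the handling of duplicate classification and reference links. The results show that whether reference links are allowed has a very large effect on a web directory's coverage, and that coverage overlap among Korean web directories is very low. By broadening the understanding of Korean web directories and presenting a methodology for analyzing their coverage and coverage overlap, this study is expected to contribute to research on web directories.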

Abstract

This study examines the coverage and coverage overlap of three major Korean web directories: Naver, Yahoo Korea, and Empas. It also proposes a methodology for collecting and processing the web sites registered in these directories, including a method for mapping their main categories. Each directory presented its registered web pages in a slightly different way. Reference links had a significant influence on the coverage of each web directory, and the overlap of pages among the three directories was quite low. This study is expected to contribute to web research by providing insight into how directories provide web pages and by suggesting a methodology for analyzing directory coverage.
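
Coverage overlap of the kind measured here can be illustrated with plain set operations. The following is a minimal sketch, not the authors' procedure: the URL normalization rule, the function names, and the sample site lists are all assumptions made for illustration, and Jaccard overlap is only one of several possible overlap measures.

```python
from itertools import combinations
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparable key.

    Assumed rule: lowercase host without a leading 'www.', plus the path
    without a trailing slash. The study's own matching rule is not given.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def pairwise_overlap(directories: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Return the Jaccard overlap |A ∩ B| / |A ∪ B| for every pair of directories."""
    keys = {name: {normalize(u) for u in urls} for name, urls in directories.items()}
    result = {}
    for a, b in combinations(sorted(keys), 2):
        union = keys[a] | keys[b]
        result[(a, b)] = len(keys[a] & keys[b]) / len(union) if union else 0.0
    return result


# Hypothetical sample data, not taken from the study.
sample = {
    "naver": {"http://www.example.com/", "http://a.example.org/x"},
    "yahoo": {"http://example.com", "http://b.example.org/y"},
    "empas": {"http://c.example.org/z"},
}
overlaps = pairwise_overlap(sample)
```

In this toy data the two spellings of the `example.com` home page collapse to one key, so only the `naver` and `yahoo` sets share a site. How strictly URLs are normalized directly changes the measured overlap, which is why the rule is flagged as an assumption.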

2

Efficient Classification of User Natural-Language Question Types Using Word Semantic Information

윤성희 (Sangmyung University); 백선욱 (Sangmyung University) 2004, Vol.21, No.4, pp.251-263 https://doi.org/10.3743/KOSIM.2004.21.4.251

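Korean Abstract

In a question-answering system, question analysis identifies the intent of a user's natural-language question, classifies its type, and obtains the information needed for answer extraction. Instead of a complex set of classification rules or large-scale linguistic knowledge resources, this study proposes a method that extracts the question focus word from the user's question and classifies the question type efficiently based on the semantic information of syntactically related words. It also proposes ways to handle questions in which the focus word is omitted and to improve classification performance by using synonym and suffix information.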

Abstract

In a question-answering system, the question analysis module identifies the point of a user's natural-language question, classifies the question type, and extracts information useful for finding the answer. This paper proposes a technique for classifying question types based on focus words extracted from the questions and on word semantic information, instead of complicated rules or huge knowledge resources. It also shows how to determine the question type when the focus word is missing and how synonym and suffix information can improve the performance of the classification module.

3

Evaluating the Retrieval Performance of XML Documents Using an Object-Relational Database

김희섭 (Kyungpook National University) 2004, Vol.21, No.2, pp.189-210 https://doi.org/10.3743/KOSIM.2004.21.2.189

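Korean Abstract

The purpose of this study is to evaluate the retrieval performance of XML documents using an object-relational database approach. The paper describes the indexing and retrieval methods for XML documents at INEX (Initiative for the Evaluation of XML retrieval) and the experimental methodologies. As in most traditional information retrieval evaluation experiments, the test collection used in this study consisted of documents (that is, XML documents), topics, ad hoc retrieval, relevance judgments, and evaluation. The large collection of XML documents provided by INEX was stored and searched using EXIMA™ Supply, a kind of native XML database developed on ORDBMS technologies. The paper discusses the general functions of the system used in the experiment, the indexing and retrieval process, the evaluation results at INEX 2002, and the functions that need to be improved.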

Abstract

The purpose of this study is to evaluate the performance of XML retrieval based on an ORDBMS (Object-Relational Database Management System) approach. This paper describes indexing and retrieval methods for XML documents and the experimental methodologies used at INEX (Initiative for the Evaluation of XML retrieval). As in any traditional information retrieval experiment, the test collection consisted of documents, topics/queries, a task, relevance assessments, and evaluation. EXIMA™ Supply, a kind of native XML database based on ORDBMS technologies, was used for this experiment. Although this approach has many benefits, such as no delay in storing and searching XML documents, it showed relatively disappointing retrieval performance at INEX 2002. This result may have arisen because the given topics had to be decomposed and modified to be processed by the XPath processor; during this modification the original meaning of a topic can inevitably change and some important information may be lost.

4

A Study on the Frequency Level Preference Tendency of Association Measures

이재윤 (Kyonggi University) 2004, Vol.21, No.4, pp.281-294 https://doi.org/10.3743/KOSIM.2004.21.4.281

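Korean Abstract

Association measures are used in a variety of fields, including information retrieval and data mining. The frequency level preference tendency, which indicates whether an association measure favors high or low frequencies, has an important effect on the results of applying the measure and therefore requires close examination. This study analyzes the frequency level preference tendency of major association measures using artificial data and presents the results. It also proposes a method for adjusting the frequency level preference tendency of representative association measures, including the cosine coefficient. Applying this adjustment method to co-occurrence-based query expansion in information retrieval confirmed its usefulness. Finally, the implications of the analysis and experimental results for related fields are discussed.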

Abstract

Association measures are used in various applications, including information retrieval and data mining. Each association measure's tendency to prefer high or low frequency levels deserves close examination because it has a significant impact on application performance. This paper examines the frequency level preference (FLP) tendency of some popular association measures using artificially generated co-occurrence data and evaluates the results. It then proposes a method for adjusting the FLP tendency of major association measures such as the cosine coefficient. The method is tested on co-occurrence-based query expansion in information retrieval, and the results suggest that it is useful. Based on the analysis and experiments, implications for related disciplines are identified.
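
What a frequency level preference looks like can be seen on two artificial term pairs that have the same relative overlap but very different frequencies. The sketch below is only an illustration: the choice of pointwise mutual information as the second measure, the collection size, and the counts are assumptions, and the paper's adjustment method is not reproduced here.

```python
import math


def cosine(cooc: int, fx: int, fy: int) -> float:
    """Cosine coefficient from a co-occurrence count and two term frequencies."""
    return cooc / math.sqrt(fx * fy)


def pmi(cooc: int, fx: int, fy: int, n: int) -> float:
    """Pointwise mutual information (log base 2) over a collection of n documents."""
    return math.log2(cooc * n / (fx * fy))


N = 10_000                # assumed collection size
low = (5, 10, 10)         # rare terms that co-occur in half of their documents
high = (500, 1000, 1000)  # frequent terms with the same relative overlap

cos_low, cos_high = cosine(*low), cosine(*high)
pmi_low, pmi_high = pmi(*low, N), pmi(*high, N)
```

For these counts the cosine coefficient is 0.5 for both pairs, while pointwise mutual information scores the rare pair far higher than the frequent one. That difference in behavior across frequency levels is the kind of tendency the abstract refers to.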

5

A Study on Improving the Performance of an SDI System Through a Machine-Learning-Based Feedback Process

노영희 (Konkuk University) 2004, Vol.21, No.4, pp.133-152 https://doi.org/10.3743/KOSIM.2004.21.4.133

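Korean Abstract

With the arrival of the information age, the volume of information has grown exponentially. SDI services have been researched and developed as a way to deliver, in a timely manner, information suited to each individual user from this mass of information, but surveys show that their actual use is very low. This paper therefore analyzes the causes and aims to develop a relevance-feedback-based SDI system that can improve SDI performance. The experimental systems developed for this study are an SDI system based on minimum-user-intervention feedback, an SDI system based on fully automatic feedback, and an SDI system based on maximum-user-intervention feedback. To evaluate the degree of improvement of these three new systems, a fourth system was developed using the method employed in traditional SDI services. In the experiments, the maximum-user-intervention feedback system showed the highest performance, followed by the fully automatic feedback system, the minimum-user-intervention feedback system, and the traditional SDI system, in that order. The performance of the feedback-based systems improved as feedback progressed.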

Abstract

As the Internet has rapidly increased the availability of information, SDI services that provide users with relevant documents in a timely manner have been studied and developed. However, the practical use of these services has been low. This paper analyzes the reasons for this and develops a relevance-feedback-based SDI system to improve on the performance of existing SDI systems. Three experimental systems were developed for this study: an SDI system based on minimum-user-intervention feedback, an SDI system based on fully automatic feedback, and an SDI system based on maximum-user-intervention feedback. A fourth system that follows the traditional SDI method was also built to evaluate the level of performance improvement of the three new systems. The SDI system based on maximum-user-intervention feedback showed the highest performance, followed in order by the system based on fully automatic feedback, the system based on minimum-user-intervention feedback, and the traditional SDI system. The feedback-based systems showed greater performance improvement as they went through more feedback cycles.

6

An XML Schema-Based Metadata Representation Scheme for Digital Special Materials

오삼균 (Sungkyunkwan University); 채진석 (University of Incheon) 2004, Vol.21, No.4, pp.109-131 https://doi.org/10.3743/KOSIM.2004.21.4.109

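Korean Abstract

This study was supported by the Seoul National University Digital Library Project. Associate Professor, Department of Library and Information Science, Sungkyunkwan University (samoh@skku.ac.kr). Associate Professor, Department of Computer Engineering, University of Incheon (jschae@incheon.ac.kr). Received: November 13, 2004. Accepted: December 19, 2004.

As the delivery media and forms of information resources have diversified, so have the methods of managing them. In the library environment, cataloging rules such as AACR and KCR were established as methods for managing information resources, and MARC was developed as a result of efforts to automate resource management based on these rules. However, because of a structural rigidity that cannot express the semantic relationships inherent in bibliographic records, MARC records are limited in adequately describing information resources with diverse and differing descriptive characteristics. That is, even if the basic design of MARC suits some types of information reasonably well, it has difficulty supporting the diversity of new types. In addition, resource management based on MARC does not support expressing the relationships among information resources. Because the MARC data model is a single-layer model that treats the object of description as a single entity, it is not suitable when resource description requires a multi-layer data model in which relationships among multiple entities can be established. Based on the IFLA FRBR basic model, which supports a multi-layer data model, this study presents an optimal metadata model that accommodates users' information needs as fully as possible in managing the old books, old documents, music materials, and conference and seminar materials used in digital libraries, together with an XML Schema-based representation scheme for it.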

Abstract

As the delivery media and forms of information resources have diversified, so have the schemes for managing them. In the library community, cataloguing rules for describing information resources, such as AACR and KCR, have been developed. Efforts to automate the management of information resources based on these rules resulted in the development of MARC. However, MARC records are limited in describing information resources that have various and distinct characteristics, because of a structural rigidity that does not support the representation of the extended semantic structures that exist among bibliographic entities. Moreover, since the data model for MARC is a single-layer data model, it is not appropriate for describing information resources that call for a multi-layer data model, which can be used to set up relationships among various objects in digital libraries. In this paper, we propose a metadata model for digital libraries based on the IFLA FRBR basic model, which supports a multi-layer data model, and a representation scheme based on XML Schema to manage metadata about old books, old documents, music-related resources, and conferences and seminars.

7

A Study on the Efficiency and Use of Theory in Library and Information Science

김성진 (Inha Technical College); 정동열 (Ewha Womans University) 2004, Vol.21, No.1, pp.23-53 https://doi.org/10.3743/KOSIM.2004.21.1.023

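Korean Abstract

By examining theory research in which theory building and theory use took place in Korean and international journals, this study aims to analyze the efficiency and use of theory in library and information science and, on that basis, to clarify the academic nature of the field. Two representative LIS journals each from Korea and abroad were selected, and content analysis was conducted on 1,661 research articles published from 1984 through the first half of 2003. For qualitative evaluation of theory building and theory use, a four-level theory efficiency model and a five-level theory use model were used as the respective analytic scales. For detailed analysis of the theory research, the study examined background attributes (journal, country of publication, period), content attributes (research topic, research method), and researcher attributes (affiliation, major, research experience), and analyzed the source disciplines and use cycles of the theories used. In addition, by applying the author co-citation method to analyze theory co-use, it identified the intellectual structure of the theoretical base formed by LIS researchers over 20 years.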

Abstract

The purpose of this study is to analyze the identity and relationship of library and information science by exploring theoretical aspects of LIS research, including theory building and theory use. The sample consists of 1,661 research articles published from 1984 to 2003 in two Korean and two American core LIS journals. Theory articles are analyzed with two scales: a four-degree theory efficiency scale and a five-degree theory use scale. Each article is coded by journal, country, publication year, subfield, and methodology, and by the affiliation, department, and research experience of the first author. The theories used are coded according to their origin and age. In addition, an author co-citation technique is applied to represent, on a two-dimensional map, the intellectual structure that has been constructed through theory use by LIS authors over 20 years.

8

A Study on the Visual Representation of TREC Text Documents in the Construction of Digital Libraries

정기태 (Assistant Professor, University of Oklahoma School of Library and Information Studies); 박일종 (Keimyung University) 2004, Vol.21, No.3, pp.1-14 https://doi.org/10.3743/KOSIM.2004.21.3.001

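Korean Abstract

When searching for similar documents, users are helped by various visual representations of documents, and all research in information retrieval offers solutions for meeting users' diverse needs. The proposed solutions range from alphabetically arranged papyrus documents to card catalogs, microfilm storage, and files kept on computer disks. Most information retrieval systems also use document surrogates, that is, bibliographic data such as summaries, tables of contents, abstracts, reviews, and machine-readable cataloging (MARC) records, in place of the full document. This paper investigates another type of document surrogate using a method of grouping term lists. These surrogates are represented as coordinates on a two-dimensional graph using multidimensional scaling (MDS). The distance between coordinates on the graph can be interpreted as the similarity of the documents: the closer two points are, the more similar the content of the two documents.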

Abstract

Visualizing documents helps users search for similar documents. All research in information retrieval addresses the problem of a user with an information need facing a data source that contains an acceptable solution to that need. In various contexts, adequate solutions to this problem have included alphabetized cubbyholes housing papyrus rolls, microfilm registers, card catalogs, and inverted files coded onto discs. Many information retrieval systems rely on the use of a document surrogate. Though they might be surprised to discover it, nearly every information seeker uses an array of document surrogates. Summaries, tables of contents, abstracts, reviews, and MARC records are all document surrogates. That is, they stand in for a document, allowing a user to make some decision regarding it: whether to retrieve a book from the stacks, whether to read an entire article, and so on. In this paper, another type of document surrogate is investigated using a method of grouping term lists. Using multidimensional scaling (MDS), these surrogates are visualized on a two-dimensional graph. The distances between dots on the graph can be interpreted as the similarity of the documents: the closer the dots, the more similar the documents.
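
The step from pairwise document dissimilarities to points on a two-dimensional graph can be sketched with classical (Torgerson) MDS. This is an assumption for illustration: the abstract does not say which MDS variant was used, and the function name, the power-iteration eigen-solver, and the three-document dissimilarity matrix are all hypothetical.

```python
import math


def classical_mds(dist, dims=2, iters=500):
    """Place items in `dims` dimensions so that point distances approximate `dist`."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the dominant eigenvector of the (deflated) matrix.
        v = [float(i + 1) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        vv = sum(x * x for x in v)
        eigval = sum(v[i] * bv[i] for i in range(n)) / vv  # Rayleigh quotient
        scale = math.sqrt(max(eigval, 0.0))
        unit = [x / math.sqrt(vv) for x in v]
        for i in range(n):
            coords[i][k] = unit[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * unit[i] * unit[j]
    return coords


# Hypothetical dissimilarities between three document surrogates.
distances = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]
points = classical_mds(distances)
```

Each row of `points` is an (x, y) position for one document. Because the sample dissimilarities happen to fit exactly in a plane, the distances between the returned points match the input; with real term-list dissimilarities the two-dimensional layout is only an approximation.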

9

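An Analysis of the Current State and Operation of Digital Information Rooms in Major University Libraries in the Daegu and Kyoungpook Area

오동근 (Keimyung University); 김숙찬 (Keimyung University) 2004, Vol.21, No.4, pp.89-107 https://doi.org/10.3743/KOSIM.2004.21.4.089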

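Korean Abstract

This study attempts to identify operational problems and seek improvements, based on a questionnaire survey of and interviews with the staff in charge, at the digital information rooms of five major university libraries in the Daegu and Kyoungpook area that operate a digital information room or a similar facility. It analyzes the current state of staffing, size, and equipment, operating methods, tasks performed, operational problems, and future operating plans and long-term development plans. Based on this analysis and the researchers' own experience working in digital information rooms, it proposes improvements for their effective operation and increased use. Although the digital information rooms of the university libraries in this area currently vary widely, they are alike in that stakeholders have not reached a shared understanding of what a digital information room is, and in that they are operated under short-term plans and policies without comprehensive development goals and policies of their own.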

Abstract

This study analyzes the present conditions and operations of digital information rooms in five selected university libraries located in the Daegu and Kyoungpook area, with special regard to personnel, room size, facilities, collections, user instruction and public relations, and related tasks. It concludes with suggestions and recommendations, based on the results of the study, for improving existing practices and the work done in these rooms.

10

A Study on the Design of an Ontology-Based Cataloging System Using OWL

이현실 (Wonkwang University); 한성국 (Wonkwang University) 2004, Vol.21, No.2, pp.249-267 https://doi.org/10.3743/KOSIM.2004.21.2.249

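Korean Abstract

MARC has the advantage of defining cataloging data in detail, but because its conceptual elements are not structured and its representation system is complex, its structure is difficult to model with XML DTD or RDF/S, which support only simple hierarchical vocabularies. In this study, a bibliographic ontology that expresses the conceptual structure of cataloging data was built by abstracting the data elements of MARC, and the resulting catalog ontology was implemented by modeling the compound structure of MARC fields with OWL, which can add logical relations between concepts, property cardinality, and logical restrictions on property values. Describing MARC data with an ontology language provides a basic means of constructing metadata for cataloging data and resolving catalog compatibility problems, and will serve as a foundation for implementing next-generation bibliographic information service systems based on semantic web services.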

Abstract

Although MARC can define cataloguing data in detail, it has complex structures and frameworks for representing bibliographic information. Because of these idiosyncratic features of MARC, XML DTD and RDF/S, which support only a simple hierarchy of conceptual vocabularies, cannot capture the MARC formalism effectively. This study implements a bibliographic ontology by abstracting the conceptual relationships between the bibliographic vocabularies of MARC. The bibliographic ontology is formalized in OWL, which can represent the logical relations between conceptual elements and specify cardinality and property value restrictions. The bibliographic ontology in this study will provide metadata for cataloguing data and resolve compatibility problems between cataloguing systems. It can also contribute to the development of next-generation bibliographic information systems using semantic web services.
