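정보관리학회지 (Journal of the Korean Society for Information Management), 한국정보관리학회 (Korean Society for Information Management)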

31

윤소영(국사편찬위원회) ; 문성빈(연세대학교) 2006, Vol.23, No.1, pp.201-219 https://doi.org/10.3743/KOSIM.2006.23.1.201

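Abstract (translated from Korean)

To identify the most suitable technique for element-based XML document retrieval, this study ran experiments evaluating the retrieval performance of three language-model approaches: the divergence method, the smoothing method, and the hierarchical language model. Based on the results, hierarchical language model retrieval, which applies the structural information of documents, is proposed as the most effective approach. In particular, the hierarchical language model performed best at the top ranks, which matter most in actual retrieval.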

Abstract

This experimental study suggests an element-based XML document retrieval method that surfaces highly relevant elements. The models compared are the divergence method, the smoothing method, and the hierarchical language model. The hierarchical language model proved most effective for element-based XML document retrieval, improving exhaustivity at the expense of specificity.
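The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to use document structure in a language model: the element, its parent document, and the whole collection each contribute a unigram model, and the three are linearly interpolated. The function names and the weights `(0.5, 0.3, 0.2)` are assumptions made for illustration, not values from the paper.

```python
import math
from collections import Counter

def lm(tokens):
    """Maximum-likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return {t: c / total for t, c in counts.items()}

def score_element(query, element, document, collection, weights=(0.5, 0.3, 0.2)):
    """Log query likelihood with element, document, and collection models interpolated.

    The weights are illustrative assumptions; they should sum to 1.
    """
    w_e, w_d, w_c = weights
    p_e, p_d, p_c = lm(element), lm(document), lm(collection)
    score = 0.0
    for term in query:
        p = (w_e * p_e.get(term, 0.0)
             + w_d * p_d.get(term, 0.0)
             + w_c * p_c.get(term, 0.0))
        if p == 0.0:
            return float("-inf")  # term unseen even in the collection
        score += math.log(p)
    return score
```

With such a scheme, an element that lacks a query term can still be ranked because its parent document and the collection back it off, while elements that contain the term themselves score higher.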

32

질의로그 데이터에 기반한 특허 및 상표검색에 관한 연구 (A Study on Patent and Trademark Search Based on Query Log Data)

이지연(연세대학교) ; 백우진(건국대학교) 2006, Vol.23, No.2, pp.61-79 https://doi.org/10.3743/KOSIM.2006.23.2.061

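Abstract (translated from Korean)

This study set out to propose ways to improve patent and trademark search. To that end, it analyzed log data for 100,016 queries written by 17,559 users of the patent technology information service of the Korea Institute of Patent Information over 193 days. Beyond the analysis of individual query logs, an analysis of 2,202 search sessions containing multiple queries revealed further clues for improving retrieval. According to the results, patent and trademark search resembles general web search, most closely in that queries are short. However, patent and trademark searchers used Boolean operators more than general web searchers did. The multi-query analysis made it possible to propose search features that could help users rewrite queries. The analysis of sessions composed of multiple queries showed that users rewrite queries by paraphrasing, specializing, generalizing, replacing, and interrupting.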

Abstract

To develop recommendations for improving patent and trademark retrieval efficiency, 100,016 patent and trademark search requests issued by 17,559 unique users over a period of 193 days were analyzed. By analyzing 2,202 multi-query sessions, in which one user issued two or more queries consecutively, we discovered a number of clues for improving retrieval efficiency. The session analysis also led to suggestions for new system features that help users reformulate queries. Patent and trademark retrieval users were found to resemble typical web users in certain respects, especially in issuing short queries. However, they used Boolean operators more than typical web search users. By analyzing the multi-query sessions, we found that users had five intentions in reformulating queries: paraphrasing, specialization, generalization, alternation, and interruption, which are also found among web search engine users.
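The kind of descriptive statistics reported above (query length, Boolean operator use, users issuing more than one query) can be sketched as follows. The record layout `(user_id, query)`, the operator set `AND`/`OR`/`NOT`, and the function name are assumptions for illustration; the actual log format and operator syntax of the service are not described in the abstract.

```python
from collections import defaultdict

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}  # assumed operator syntax

def summarize_log(records):
    """Summarize an iterable of (user_id, query) pairs given in chronological order."""
    by_user = defaultdict(list)
    for user_id, query in records:
        by_user[user_id].append(query.split())

    queries = [q for user_queries in by_user.values() for q in user_queries]
    n = len(queries)
    # Count only content terms, not operators, toward query length.
    term_counts = [sum(1 for t in q if t.upper() not in BOOLEAN_OPERATORS)
                   for q in queries]
    boolean_queries = sum(1 for q in queries
                          if any(t.upper() in BOOLEAN_OPERATORS for t in q))
    return {
        "queries": n,
        "users": len(by_user),
        "multi_query_users": sum(1 for qs in by_user.values() if len(qs) >= 2),
        "mean_terms_per_query": sum(term_counts) / n if n else 0.0,
        "boolean_query_share": boolean_queries / n if n else 0.0,
    }
```

A real session analysis would also need a rule for where one session ends and the next begins (for example a time gap), which this sketch leaves out.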

33

북마크릿을 활용한 LibraryLookup 서비스 제공방안에 관한 연구 (A Study on Providing a LibraryLookup Service Using Bookmarklets)

구중억(한국기초과학지원연구원) ; 이응봉(충남대학교) 2006, Vol.23, No.3, pp.49-68 https://doi.org/10.3743/KOSIM.2006.23.3.049

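Abstract (translated from Korean)

To give library users barrier-free information services, the accessibility, usability, and searchability of the OPAC need to improve, and the ISBN needs to become more useful as a tool for searching, identifying, and browsing books. A bookmarklet is a small piece of JavaScript that can be saved by adding it to a web browser's favorites or by dragging it to the toolbar. The open-source bookmarklet is a simple but powerful search tool that extracts an ISBN from a web page and then searches the library's OPAC for the book with that ISBN. Overseas, library system vendors, libraries, OCLC, and others provide bookmarklets that let users check a library's holdings and loan information in real time while browsing an online bookstore's web pages. This study therefore analyzed applications of four types of bookmarklets developed and used abroad and summarized their characteristics, advantages, and disadvantages. From this it derived basic requirements and an application model for bookmarklets, and proposed a way to provide a LibraryLookup service using bookmarklets in the OPACs of Korean libraries and in online bookstores.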

Abstract

The value of the ISBN as a tool for book search, identification, and browsing needs to be enhanced, and the accessibility and search capability of the library OPAC need to be improved. A bookmarklet is a small piece of JavaScript that can be saved as a URL in a web browser bookmark or a web page hyperlink. An open-source bookmarklet can extract an ISBN from a web page and search the library OPAC for the book using that ISBN, so it is recognized as a simple but powerful search tool. Abroad, commercial library system vendors, libraries, OCLC, and others provide bookmarklets that allow a user to look up library holdings and loan information in real time while browsing an online bookshop's web page. This paper therefore compared and analyzed international examples of bookmarklet applications and proposed a LibraryLookup service in which library OPACs and online bookshops can make use of bookmarklets.
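The lookup logic described above (find an ISBN in the page, then query the OPAC with it) can be sketched as follows. A real bookmarklet would be written in JavaScript and run in the browser; this Python version only illustrates the steps. The candidate pattern, the function names, and the `https://opac.example.org/search` endpoint with its `isbn` parameter are hypothetical. The check-digit rules are the standard ISBN-10 (weighted sum modulo 11) and ISBN-13 (alternating weights 1 and 3, modulo 10) rules.

```python
import re
from urllib.parse import urlencode

# Loose candidate pattern: digits, hyphens, and spaces ending in a digit or X.
# Adjacent codes separated only by spaces may merge; a production pattern
# would need to be stricter.
CANDIDATE = re.compile(r"[0-9][0-9\- ]{8,15}[0-9Xx]")

def is_valid_isbn(code):
    """Validate a normalized ISBN-10 or ISBN-13 string by its check digit."""
    if len(code) == 10 and code[:9].isdigit() and (code[9].isdigit() or code[9] == "X"):
        digits = [int(c) for c in code[:9]] + [10 if code[9] == "X" else int(code[9])]
        return sum((10 - i) * d for i, d in enumerate(digits)) % 11 == 0
    if len(code) == 13 and code.isdigit():
        return sum((1 if i % 2 == 0 else 3) * int(c) for i, c in enumerate(code)) % 10 == 0
    return False

def extract_isbns(page_text):
    """Return the valid ISBNs found in the text, in order of first appearance."""
    found = []
    for match in CANDIDATE.finditer(page_text):
        code = re.sub(r"[- ]", "", match.group()).upper()
        if is_valid_isbn(code) and code not in found:
            found.append(code)
    return found

def opac_search_url(isbn, base="https://opac.example.org/search"):
    """Build a search URL for a hypothetical OPAC that accepts an `isbn` parameter."""
    return base + "?" + urlencode({"isbn": isbn})
```

Validating the check digit before querying keeps prices, phone numbers, and other digit runs on a bookshop page from being sent to the catalog.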

34

텍스트 마이닝 기법을 이용한 연관용어 선정에 관한 실험적 연구 (An Experimental Study on Selecting Association Terms Using Text Mining Techniques)

김수연(연세대학교) ; 정영미(연세대학교) 2006, Vol.23, No.3, pp.147-165 https://doi.org/10.3743/KOSIM.2006.23.3.147

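Abstract (translated from Korean)

To find the best technique for selecting terms associated with an initial query term from an entire document collection, this study ran association-term selection experiments using association rule mining and term clustering. The association rule mining used the Apriori algorithm; the term clustering used the GSS coefficient, Jaccard coefficient, cosine coefficient, Sokal & Sneath 5, and mutual information as association measures. Performance was evaluated by association-term precision and association-term overlap ratio; the Apriori algorithm and the GSS coefficient performed best.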

Abstract

In this study, experiments on the selection of association terms were conducted to find the best method for selecting additional terms related to an initial query term. Association term sets were generated using the support, confidence, and lift measures of the Apriori algorithm, and using similarity measures: GSS, the Jaccard coefficient, the cosine coefficient, Sokal & Sneath 5, and mutual information. Term selection methods were evaluated by the precision of association terms and by the overlap ratio between association terms and the indexing terms of relevant documents. The Apriori algorithm and GSS achieved the highest performance.
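Several of the measures named above have simple textbook definitions over document-level co-occurrence counts. The sketch below computes support, confidence, and lift for the rule "a implies b", plus the Jaccard and cosine coefficients, treating each document as a set of terms. The function name and input layout are assumptions, and the paper's exact formulations (as well as GSS, Sokal & Sneath 5, and mutual information, which are omitted here) may differ.

```python
import math

def association_measures(docs, a, b):
    """Association between terms a and b over docs, a list of term sets."""
    n = len(docs)
    n_a = sum(1 for d in docs if a in d)               # documents containing a
    n_b = sum(1 for d in docs if b in d)               # documents containing b
    n_ab = sum(1 for d in docs if a in d and b in d)   # documents containing both
    union = n_a + n_b - n_ab

    support = n_ab / n if n else 0.0
    confidence = n_ab / n_a if n_a else 0.0            # estimate of P(b | a)
    lift = confidence / (n_b / n) if n_b else 0.0      # P(b | a) / P(b)
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "jaccard": n_ab / union if union else 0.0,
        "cosine": n_ab / math.sqrt(n_a * n_b) if n_a and n_b else 0.0,
    }
```

A lift above 1 indicates that the two terms co-occur more often than independence would predict, which is the usual reason for preferring it over raw confidence when ranking candidate terms.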

35

대학도서관의 전통적 기능에 대한 이용자 평가 (User Evaluation of the Traditional Functions of University Libraries)

박일종(계명대학교) ; 신상헌(계명대학교) 2006, Vol.23, No.1, pp.243-259 https://doi.org/10.3743/KOSIM.2006.23.1.243

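Abstract (translated from Korean)

This study surveyed the various functions that university libraries offer their users and measured their value. Because the study emphasizes users' judgments, the situations and conditions that could constitute specific usage factors or functions of a university library were set by users themselves, and their importance ratings were collected by questionnaire. The collected data were analyzed in three stages. First, factor analysis was performed to divide the factors into domains according to the relatedness, uniqueness, and statistical significance of the measured variables. Second, binary logistic regression was performed to derive the research model and to test the improvement in discriminating power. Third, differences in group means were analyzed for the model's independent variables to provide additional explanation, such as variable values by group. The analysis showed that a book factor, a competition and efficiency factor, and a payment (free-of-charge) factor explain not only the purposes for which users use university libraries but also the knowledge, information, and library facilities that affect users. The correlation between digital library functions and the payment factor was also statistically significant.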

Abstract

This paper examines the value of various library functions from the users' point of view. For this study, several 'circumstance' and 'condition' variables that lead to factors or functions of academic libraries were measured. Analysis was carried out in three stages. In the first, factor analysis was applied to the three multivariable dimensions to ensure that the groups of variables loaded significantly and uniquely on their respective dimensions. The second stage used binary logistic regression analysis to complete the research models. In the third stage, t-tests were used to identify significant differences in the independent variables, providing additional explanation of the models. Books, competition and effectiveness, and fee versus free (fee-free hereafter) are the three main factors that distinguish not only the purpose of using an academic library but also the degree of influence on knowledge, information, and library facilities for the users. In addition, a relationship between the fee-free factor and digital library facilities was found.

36

소설 주제 접근체계의 확장 연구 - 상징과 모티프를 중심으로 - (Extending the Subject Access Scheme for Fiction: Focusing on Symbols and Motifs)

김나름(연세대학교) ; 김태수(연세대학교) 2006, Vol.23, No.4, pp.69-87 https://doi.org/10.3743/KOSIM.2006.23.4.069

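Abstract (translated from Korean)

Access to literary works, including fiction, has centered on descriptive elements, and subject access has likewise been limited to formal elements that appear in a work, such as subject matter, character names, and place names. This practice misses the essence of a novel's subject and fails to reflect the subject needs of users seeking aesthetic experience. To extend the subject access scheme for fiction, this study examined the concepts of symbol and motif and their potential as subject access points. It also built symbol and motif schemes using the relevant terminology dictionaries as information sources, applied them to twentieth-century Korean fiction, and discussed their usability and limitations.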

Abstract

Access to literary works, including fiction, has focused on descriptive elements, and subject access has been confined to denotative elements that appear in the work, such as subject matter, character names, and geographical names. This practice misses the essence of a work's subject and does not reflect the subject needs of users who pursue aesthetic experience. This study considers the concepts of symbol and motif and their potential as subject access points in order to enhance the subject access scheme. It also builds symbol and motif schemes using glossaries as the source of information. The resulting schemes are applied to twentieth-century Korean fiction, and their usability and limits are discussed.

37

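디렉토리 서비스 중개 게이트웨이 모형 구축 -주요 검색포털의 뉴스, 미디어 분야를 중심으로- (Building a Model of an Intermediary Gateway for Directory Services: Focusing on the News and Media Categories of Major Search Portals)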

김성원(공군사관학교) ; 김태수(연세대학교) 2006, Vol.23, No.1, pp.99-119 https://doi.org/10.3743/KOSIM.2006.23.1.099

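Abstract (translated from Korean)

Keyword search is the most commonly used method of Internet information retrieval. It has several weaknesses in terms of precision and recall. Many web portals offer directory search services to compensate for these weaknesses. Because each portal uses a different classification scheme, however, these directory services inconvenience users, and this has raised the need for an intermediary gateway that provides integrated search across directory services. This study therefore built a model of an intermediary gateway that provides integrated search over the directory services of major Korean portals such as Naver, Yahoo, and Empas, and evaluated its performance.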

Abstract

The most widely used information searching method in the current Internet environment is keyword-based search, which has certain limitations in terms of precision and recall. Most major Internet portals provide directory-based searching to complement these limitations. However, because the portals adopt different classification schemes, users face significant inconvenience, which suggests a need for a mapping gateway that provides cross-portal, or cross-directory, information searching. In this context, this study develops a prototype intermediary gateway for integrated search, using the directory services of three major portals, Naver, Yahoo, and Empas, and tests its performance.
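One simple way to picture such a gateway is a concordance table that ties each shared concept to the directory path every portal uses for it. The sketch below is illustrative only: the abstract does not describe the prototype's data model, and the portal labels (`portal_a`, `portal_b`, `portal_c`), the category paths, and the function name are all hypothetical.

```python
# Hypothetical concordance: one shared concept -> each portal's own directory path.
CONCORDANCE = {
    "newspapers": {
        "portal_a": "News/Newspapers",
        "portal_b": "Media/Press/Dailies",
        "portal_c": "News & Media/Newspaper",
    },
    "broadcasting": {
        "portal_a": "News/TV & Radio",
        "portal_b": "Media/Broadcasting",
        "portal_c": "News & Media/Broadcast",
    },
}

def map_category(portal, path):
    """Return the equivalent directory paths in the other portals, or {} if unmapped."""
    for paths in CONCORDANCE.values():
        if paths.get(portal) == path:
            return {p: v for p, v in paths.items() if p != portal}
    return {}
```

A user browsing one portal's category could then be shown the matching categories elsewhere, for example `map_category("portal_a", "News/Newspapers")`.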

38

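장기보존기록물 선별을 위한 업무분석적 평가방안 국회를 중심으로 (A Business-Analysis Appraisal Method for Selecting Records for Long-Term Preservation: Focusing on the National Assembly)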

이원영(국회기록보존소) ; 임효정(이화여자대학교) 2006, Vol.23, No.3, pp.187-204 https://doi.org/10.3743/KOSIM.2006.23.3.187

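Abstract (translated from Korean)

The ultimate goal of records management is to preserve an institution's functions and activities as history. Among the many records that reflect an institution's activities, objectively appraising which are worth preserving for the long term and selecting only those of value is the core of records management and also a very difficult task. Modern records are characterized by explosive growth in volume, increasing complexity, and digitization, which, together with the information environment, have given rise to the continuum concept of managing and controlling records throughout their life cycle. As an objective guideline for selecting records worth long-term preservation from the earliest stage of records management, this study proposes a business-analysis appraisal method that combines macro-level elements, namely the institution's functions and organization, with a micro-level element, namely content appraisal (evidential value) of individual records.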

Abstract

The main purpose of archives is to maintain a history of an organization's functions and activities. Selecting valuable records for permanent preservation through objective appraisal, from among the many records that reflect an organization's activities, is very important but also very difficult. The quantity and complexity of contemporary records have grown rapidly with electronic storage, and in the current information environment it has become possible to manage and control records over their entire life cycle. This study proposes an appraisal method based on business analysis that combines macro-appraisal and micro-appraisal factors: the former covers functions and organizations, as an objective guideline for selecting valuable records from the beginning, and the latter is content appraisal (evidential value) of individual records.

39

복수의 신문기사 자동요약에 관한 실험적 연구 (An Experimental Study on Automatic Summarization of Multiple News Articles)

김용광(연세대학교) ; 정영미(연세대학교) 2006, Vol.23, No.1, pp.83-98 https://doi.org/10.3743/KOSIM.2006.23.1.083

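Abstract (translated from Korean)

This study presents a template-based summarization technique that uses the semantic categories of sentences to summarize multiple news articles automatically. In the training phase, the semantic categories of the core information to include in summaries of event/accident news articles are identified, and cue words are selected for each slot of the template. In the summarization phase, the input news articles are categorized by event/accident, and key sentences are extracted from each article to fill the template slots. Finally, the sentences are split into simple sentences, the template contents are revised, and a summary is written from them. Evaluation of the automatically generated summaries gave a summary precision of 0.541, a summary recall of 0.581, and a summary sentence redundancy rate of 0.116.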

Abstract

This study proposes a template-based method for automatic summarization of multiple news articles using the semantic categories of sentences. First, the semantic categories of the core information to be included in a summary are identified from a training set of documents and their summaries. Then, cue words for each slot of the template are selected for later classification of news sentences into the relevant slots. When a news article is input, its event/accident category is identified, and key sentences are extracted from the article and placed in the relevant slots. The template, filled with simple sentences rather than the original long sentences, is used to generate a summary of the event/accident. In the user evaluation of the generated summaries, the results showed a recall ratio of 54.1% and a precision ratio of 58.1% in essential information extraction, and a redundancy ratio of 11.6%.
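The slot-filling step can be illustrated with a toy sketch: each template slot has cue words, each sentence goes to the first slot whose cue word it contains, and sentences repeated across articles are kept only once. The slot names, the English cue words, and the function names below are hypothetical; the paper's actual template, its cue words learned from training data, and its sentence-splitting step are not reproduced here.

```python
# Hypothetical template: slot name -> cue words that signal the slot.
TEMPLATE_CUES = {
    "when_where": ["occurred", "broke out"],
    "casualties": ["killed", "injured"],
    "cause": ["caused by", "because"],
}

def fill_template(articles):
    """Assign each sentence to the first slot whose cue word it contains."""
    slots = {slot: [] for slot in TEMPLATE_CUES}
    for article in articles:
        for sentence in article.split("."):
            sentence = sentence.strip()
            if not sentence:
                continue
            for slot, cues in TEMPLATE_CUES.items():
                if any(cue in sentence.lower() for cue in cues):
                    if sentence not in slots[slot]:  # skip repeats across articles
                        slots[slot].append(sentence)
                    break
    return slots

def summarize(slots):
    """Build a summary from the first sentence collected in each filled slot."""
    return " ".join(sentences[0] + "." for sentences in slots.values() if sentences)
```

Dropping sentences already present in a slot is a crude stand-in for redundancy control, which matters when several articles report the same event.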
