Journal of the Korean Society for Information Management, Korean Society for Information Management

1

김나름 (Yonsei University); 김태수 (Yonsei University). 2006, Vol. 23, No. 4, pp. 69-87. https://doi.org/10.3743/KOSIM.2006.23.4.069

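Abstract (translated from Korean)

Access to literary works, including fiction, has centered on descriptive elements, and subject access has likewise been confined to formal elements appearing in the work, such as subject matter, character names, and place names. This practice misses the essence of a novel's subject and fails to reflect the subject needs of users who seek aesthetic experience. To extend the subject access scheme for fiction, this study examined the concepts of symbol and motif and their potential as subject access points. It also built symbol and motif schemes using the relevant terminology dictionaries as information sources, applied them to 20th-century Korean fiction, and discussed their usability and limitations.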

Abstract

Access to literary works, including fiction, has focused on descriptive elements, and subject access has been confined to denotative elements that appear in the work, such as subject matter, character names, and geographical names. This practice does not reach the essence of the subject of fiction and does not reflect the subject needs of users who pursue aesthetic experience. In this study, the concepts of symbol and motif and their potential as subject access points are considered in order to enhance the subject access scheme. In addition, this study builds schemes of symbols and motifs using glossaries as the source of information. The resulting schemes are applied to 20th-century Korean fiction, and their usability and limits are discussed.

2

Correctness of Pre-Assigned Categories in the Training Document Set and Text Categorization Performance

심경 (Systems R&D Center, Iris.Net); 정영미 (Yonsei University). 2006, Vol. 23, No. 2, pp. 265-285. https://doi.org/10.3743/KOSIM.2006.23.2.265

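Abstract (translated from Korean)

Text categorization assumes that the subject categories assigned to the training document set are correct to a certain level. However, this assumption is made without knowledge of real-world document collections. This study attempted to determine the level of correctness of pre-assigned subject categories in a real-world collection and to identify the relationship between the correctness of the categories pre-assigned to the training set and categorization performance. In particular, it sought to find out how far categorization performance can be improved by raising the quality of the assigned subject categories through manual reindexing. To this end, 1,150 abstract records in science and technology were reindexed by an expert group; after 15 duplicate documents were removed, they were divided into a training set of 907 documents and a test set of 227 documents. The categorization performance of the collections before and after reindexing (the initial collection, Recat-1, and Recat-2) was compared using a kNN classifier. The average correctness of category assignment in the initial collection was 16%, and its categorization performance was 17% in F1. By contrast, the Recat-1 collection, with improved category correctness, scored an F1 of 61%, 3.6 times the performance of the initial collection.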

Abstract

In text categorization, a certain level of correctness of the labels assigned to training documents is assumed without solid knowledge of the correctness found in real-world collections. Our research explores the quality of pre-assigned subject categories in a real-world collection and identifies the relationship between the quality of category assignment in the training set and text categorization performance. In particular, we examine to what extent performance can be improved by enhancing the quality (i.e., correctness) of category assignment in training documents. A collection of 1,150 abstracts in computer science is re-classified by an expert group and divided into 907 training documents and 227 test documents (15 duplicates are removed). The performance of the collections before and after re-classification, called the Initial set and the Recat-1/Recat-2 sets respectively, is compared using a kNN classifier. The average correctness of subject categories in the Initial set is 16%, and the categorization performance with the Initial set is 17% in F1 value. By contrast, the Recat-1 set achieves an F1 value of 61%, about 3.6 times that of the Initial set.
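The evaluation setup described in this abstract, a kNN classifier scored by F1, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the cosine similarity over raw token counts, the names `knn_predict` and `f1`, and the toy documents in `TOY_TRAIN` are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy training set: (text, category) pairs, not data from the study.
TOY_TRAIN = [
    ("neural network training", "ai"),
    ("deep neural model", "ai"),
    ("protein gene sequence", "bio"),
    ("gene expression cell", "bio"),
]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, text, k=3):
    """Return the majority category among the k most similar training documents."""
    query = Counter(text.lower().split())
    nearest = sorted(
        train,
        key=lambda doc: cosine(Counter(doc[0].lower().split()), query),
        reverse=True,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def f1(gold, pred, label):
    """F1 for one category: harmonic mean of precision and recall."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

Because the nearest neighbours vote with their assigned labels, mislabeled training documents directly distort the vote, which is the dependency the study measures. As a check on the reported figures, 61 / 17 ≈ 3.59, consistent with the stated 3.6-fold difference.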

3

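Types of User Uncertainty in the Web Search Term Selection Process: An Examination of the Information-Seeking Environment of Natural Science Researchers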

김양우 (Hansung University). 2006, Vol. 23, No. 2, pp. 287-309. https://doi.org/10.3743/KOSIM.2006.23.2.287

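Abstract (translated from Korean)

Although many studies have pointed out the importance of uncertainty in the information-seeking process, few have studied users' uncertainty during searches with an actual information retrieval system. This study investigated the perceptions of uncertainty of users who were actually seeking information as they selected Web search terms, and identified various types of uncertainty in the information search process. The principal origins of uncertainty found on the basis of these types offer implications for the development of information retrieval systems and services.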

Abstract

While numerous studies have suggested the significance of uncertainty during the process of information seeking, less research has investigated user uncertainty in the actual search process using a real system. This study investigated user perceptions of uncertainty during the selection of Web search terms in a real information-seeking process. The subjects, all at the doctoral or postdoctoral level, were limited to the sciences in order to understand user perceptions in this field. The findings revealed various dimensions, types, and incidents of uncertainty. The typology of uncertainty facilitated an understanding of the subjects' information-seeking context by identifying the aspects of that context that constituted the subjects' uncertainty. The identification of two principal origins of uncertainty, based on the different types of uncertainty, generated implications for improving information systems and services.

4

Design and Evaluation of a Personalized Search Service Model Based on Web Portal User Log Data

이소영 (Daum Communications); 정영미 (Yonsei University). 2006, Vol. 23, No. 4, pp. 179-196. https://doi.org/10.3743/KOSIM.2006.23.4.179

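Abstract (translated from Korean)

This study proposed a community-based personalized search service model suited to Korean-style portals. The model consists of a process that identifies a user's subjects of interest, followed by re-ranking of search results to reflect those interests and recommendation of related subject categories and queries. The experiments to verify the usefulness of the model used user log data collected over 12 days from the portal site Daum. The results showed that the cafe activity analysis and 신지식 activity analysis data used to select each user's subject categories were very useful, and that satisfaction with the personalized search results and the recommendation service was also relatively high.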

Abstract

This study proposes an expanded model of a personalized search service based on community activities on a Korean Web portal. The model consists of defining users' subject categories, providing personalized search results, and recommending additional subject categories and queries. Several experiments were performed to verify the feasibility and effectiveness of the proposed model. It was found that users' activities on community services provide valuable data for identifying their interests, and that the personalized search service increases users' satisfaction.
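The re-ranking step of such a model can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything here is an assumption for illustration: interest weights are taken as each category's share of a user's community activity, and a result's score is boosted by `1 + alpha * weight`. The names `interest_weights`, `rerank`, and `alpha` are hypothetical.

```python
from typing import Dict, List, Tuple

# (document id, subject category, base relevance score); hypothetical layout.
Result = Tuple[str, str, float]


def interest_weights(activity_counts: Dict[str, int]) -> Dict[str, float]:
    """Turn per-category activity counts into shares that sum to 1."""
    total = sum(activity_counts.values())
    if total == 0:
        return {}
    return {cat: n / total for cat, n in activity_counts.items()}


def rerank(results: List[Result], weights: Dict[str, float],
           alpha: float = 0.5) -> List[Result]:
    """Sort results by base score boosted by the user's interest in their category."""
    def boosted(result: Result) -> float:
        _, category, score = result
        return score * (1.0 + alpha * weights.get(category, 0.0))

    return sorted(results, key=boosted, reverse=True)


# Usage with made-up values: a user who is mostly active in music communities.
weights = interest_weights({"music": 8, "sports": 2})
ranked = rerank([("doc_a", "sports", 0.9), ("doc_b", "music", 0.8)], weights)
```

With these made-up values, `doc_b` moves ahead of `doc_a` because the user's music share outweighs the small gap in base score. A multiplicative boost keeps the original ranking intact for users with no recorded activity.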

5

A Study on the Rutgers Information Retrieval Evaluation Project

이혁진 (Texas Woman’s University). 2006, Vol. 23, No. 2, pp. 97-111. https://doi.org/10.3743/KOSIM.2006.23.2.097

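Abstract (translated from Korean)

The main purpose of this paper is to find out at what level of difference in precision information users perceive a significant difference. Several interesting results were obtained in this regard. In addition, relevance judgments were found to be unrelated to the time users took to make them. The relationship between users' background knowledge of the topic and their relevance judgments was pronounced. Users also had more difficulty judging relevance when the number of relevant documents was small. Finally, the importance of the top N documents in a search result list for relevance judgment was confirmed.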

Abstract

The purpose of this study is to investigate what level of difference in precision is significantly perceived by a human user of an information retrieval system. Little research has been conducted on this issue in the information retrieval field. Despite the non-significant results, there were several interesting findings on how users recognize different levels of precision. The correctness of relevance judgments had little to do with the time taken for the task. In addition, the strong relationship between the subjects' topic familiarity and their rate of correct judgments is one of the most interesting results of this study. The subjects had more difficulty judging between two lists containing more non-relevant documents than between two lists containing more relevant documents. Finally, the strong influence of the top N documents in a list on the relevance judgment task was confirmed.
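Since the study turns on differences in precision between result lists and on the weight of the top N documents, a short sketch of precision at N may help. This is the standard definition, not code from the study; the function name `precision_at_n` and the two example lists are made up for illustration.

```python
from typing import Sequence


def precision_at_n(relevant: Sequence[bool], n: int) -> float:
    """Fraction of the top n ranked documents that are relevant.

    `relevant[i]` is True when the document at rank i + 1 is relevant.
    By convention the denominator is n even if the list is shorter.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(relevant[:n]) / n


# Two hypothetical ranked lists that a user might be asked to compare.
list_a = [True, True, False, True, False]
list_b = [True, False, False, True, False]
difference = precision_at_n(list_a, 5) - precision_at_n(list_b, 5)
```

Here the lists differ by 0.2 in precision at 5; the study's question is how large such a difference must be before users notice it.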

6

Building a Web 2.0-Based Open Archiving Community for the Life Sciences

안부영 (Korea Institute of Science and Technology Information); 이응봉 (Chungnam National University); 한정민 (KISTI). 2006, Vol. 23, No. 4, pp. 89-110. https://doi.org/10.3743/KOSIM.2006.23.4.089

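Abstract (translated from Korean)

Life science is one of the important academic fields that directly affect human life. Life science researchers in Korea are scattered across industry, academia, and research institutes, carrying out important research, and their results are produced in various forms (actual research outputs, papers, research notes, seminar materials, monographs, textbooks, and so on). To enable rapid acquisition of life science research information, KISTI has built an open archiving community (BioInfoNet) where such information can be shared and exchanged, providing infrastructure so that researchers can develop the community themselves. In this study, based on Web 2.0, the recent notion of the Web as a platform, open-access life science bibliographic information was collected to build a metadata database, and the community was designed and implemented so that users can voluntarily set up and run public bulletin boards by subject (BioBBS).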

Abstract

Life science is one of the most important fields that directly influence human life. Many domestic life scientists in industry, educational organizations, and research institutes have been producing important results in a variety of forms, such as papers, research notes, presentation materials, books, and teaching materials. An open archiving community has been constructed so that researchers can share and exchange research information related to life science. Domestic life scientists can acquire valuable information through the community quickly and efficiently. In this study, the community system has been designed and implemented to provide free access to all data, including a metadata registry of bibliographic information on life science and research results contributed voluntarily by researchers. The community system is based on Web 2.0 and provides users with bulletin boards (BBS) organized by subject.

7

A Study on the Use of Web-Based E-Books by University Students

장혜란 (Sangmyung University). 2006, Vol. 23, No. 4, pp. 233-256. https://doi.org/10.3743/KOSIM.2006.23.4.233

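Abstract (translated from Korean)

To aid understanding of university students' e-book use and to grasp its current state, students at University A were sampled for a questionnaire survey and interviews. Analysis based on 466 responses shows that students' awareness of e-books and e-book services is low, that about 30% have used them, and that the dominant access route is the university library site. Of the users, 73% had read three or fewer e-books; the fields of use are varied but skewed toward literature and genre fiction, and purposes are split between academic and personal reading. Awareness and use of additional functions are weak. User satisfaction is also low, with more than 50% holding a neutral view. Among students with no experience of use, the main reasons for non-use were inconvenience and lack of relevant knowledge. However, about 88% of non-users expressed an intention to use e-books in the future. The interviews show that active users recognize the usefulness of e-books, are comfortable with on-screen reading, and use practical books. Yet their awareness and use of additional functions, and their satisfaction, are also low. Based on the analysis, the study suggests the need for publicity to promote use, diversified production, education, and service evaluation.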

Abstract

To understand the use of ebooks among undergraduate students, a questionnaire was devised and data were collected from 466 respondents. Awareness of ebooks and ebook services appears to be low, and only about 30% of the students have used ebooks in the past. Students access ebooks primarily through the library homepage. Of the users, 73% had read three or fewer ebooks. The subjects and areas of reading are fairly widely spread, but literary works and genre fiction were the most popular. The purpose of use is split between academic and private reading. Most of the users lack knowledge of additional functions. The overall satisfaction level is low. Discomfort and a lack of ebook literacy constitute the major reasons for nonuse, but about 88% of the nonusers show willingness to use ebooks in the future. According to the interviews, active users are familiar with screen reading and perceive the advantages of ebooks. Nonetheless, their satisfaction level is still low. Based on the results, recommendations for creating awareness, education, production development, and service evaluation are suggested to promote ebook use.

8

A Study on Optimizing the Number of Training Documents in Text Categorization

심경 (Iris.Net). 2006, Vol. 23, No. 4, pp. 277-294. https://doi.org/10.3743/KOSIM.2006.23.4.277

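Abstract (translated from Korean)

This study ran experiments with a kNN classifier to determine what level of categorization performance is obtained when categorization techniques are applied to document classification in a real system environment, and what the ideal size of the training document set per category is for training a classifier to achieve adequate performance. As the experimental collection, eight categories with at least 2,556 documents each were selected from a database of about 150,000 records in actual service. From these, sub-collections were formed at eight training set sizes, increasing the number of training documents per category in steps from 20 (Tr20) to 2,000 (Tr2000), and categorization performance was compared across the sub-collections by training set size. The macro-averaged performance of the eight sub-collections was an F1 of 30%, a low figure that falls short of the typical performance of kNN classifiers reported in previous studies. Among the eight collections tested, the Tr100 collection, with 100 training documents per category, had an F1 of 31% and was judged the optimal document set size for training the classifier in terms of cost-effectiveness. In addition, the correctness of the subject categories assigned to the experimental collection was checked by manual reclassification, and its relationship to per-category categorization performance was used to strengthen the credibility of this conclusion.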

Abstract

This paper examines the level of categorization performance in a real-life collection of abstracts in the fields of science and technology, and tests the optimal number of documents per category in a training set using a kNN classifier. The corpus is built by first choosing categories that hold more than 2,556 documents and then randomly selecting 2,556 documents per category. It is further divided into eight subsets with different numbers of training documents: each subset is randomly selected to provide between 20 (Tr20) and 2,000 (Tr2000) training documents per category. The categorization performance of the eight subsets is compared. The average performance of the eight subsets is 30% in F1 measure, which is relatively poor compared with the findings of previous studies. The experimental results suggest that, among the eight subsets, Tr100 is the optimal size for training a kNN classifier. In addition, the correctness of the subject categories assigned to the training sets is probed by manually reclassifying the training sets, in order to support the above conclusion by establishing a relation between correctness and categorization performance.
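Two calculations underlie a comparison like this: macro-averaging F1 over categories, and choosing a training set size that balances cost against performance. The sketch below illustrates both under stated assumptions. The abstract does not say how cost-effectiveness was judged, so the rule used here (the smallest size whose F1 is within a tolerance of the best) is only one plausible reading, and the function names and all numbers in the usage lines are invented for the example.

```python
from typing import Dict, Iterable, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for one category from true positive, false positive, false negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: Iterable[Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-category F1, so every category counts equally."""
    scores = [f1_from_counts(*c) for c in counts]
    return sum(scores) / len(scores) if scores else 0.0


def smallest_adequate_size(f1_by_size: Dict[int, float], tolerance: float) -> int:
    """Smallest training set size whose F1 is within `tolerance` of the best F1.

    Assumed selection rule for illustration; not taken from the paper.
    """
    if not f1_by_size:
        raise ValueError("f1_by_size must not be empty")
    best = max(f1_by_size.values())
    return min(size for size, score in f1_by_size.items()
               if score >= best - tolerance)


# Usage with made-up numbers (not the paper's results).
toy_macro = macro_f1([(2, 1, 0), (1, 0, 1)])
toy_choice = smallest_adequate_size({10: 0.40, 50: 0.58, 200: 0.60}, tolerance=0.03)
```

In the toy data the rule picks size 50, since moving to 200 documents per category gains only 0.02 in F1.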
