Journal of the Korean Society for Information Management, Korean Society for Information Management

1

남영준(중앙대학교) ; 정의섭(한국과학기술정보) 2006, Vol.23, No.1, pp.221-241 https://doi.org/10.3743/KOSIM.2006.23.1.221

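Abstract (translated from Korean)

This study uses citation information to analyze patent citation indexing techniques and, on that basis, proposes new patent indices. To this end, it compares the citation indices provided by citation index databases for bibliographic and patent information. In particular, it reinterprets the informational value and meaning of the JCR impact factor and the CHI technology impact index. The former uses relative citation frequency and emphasizes the value of the medium, such as a serial publication. The latter evaluates the intrinsic value of a patent and therefore emphasizes only the patent's own information. To overcome this difference, a relative value was reassigned in the latter case by using the technology impact index of the institution holding the patent. To complement this, three new indices based on citation information are proposed: a cited index for a specific patent, a relative half-life evaluation index, and an integrated index of patent technology utilization. The comparison was limited to U.S. patent information, which presents citation information in the patent application details, so no comparative analysis of domestic patent information could be carried out.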

Abstract

This study proposes new patent indices based on patent citation indexing techniques that use citation information. To this end, it compares the citation indices provided by citation databases for bibliographic and patent information. The informational value and meaning of the JCR impact factor and the CHI technology impact factor are reinterpreted. The former emphasizes the value of the publication medium, such as a serial, by using relative citation frequency. The latter emphasizes only the patent's own information because it assesses the value intrinsic to the patent. To overcome this difference, a relative value was reassigned in the latter case by using the technology impact factor of the organization holding the patent. To complement this, three new patent indices based on citation information are proposed. However, the comparison was limited to U.S. patent information, which presents citation information for patent applications, so domestic patent information could not be compared.

2

A Study on Improving the h-index for Measuring Research Performance

이재윤(명지대학교) 2006, Vol.23, No.3, pp.167-186 https://doi.org/10.3743/KOSIM.2006.23.3.167

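Abstract (translated from Korean)

The h-index proposed by Hirsch (2005) is an attempt to measure an individual's research performance through citations. Because the h-index is easy to compute and has been recognized as robust, studies applying or refining it have been active in many fields since its publication. This study first reviews the discussion to date on the h-index and on variant indices, including the g-index, that were proposed to complement it. It then proposes several improved indices that can make up for the shortcomings of the h-index and the g-index and measures them on hypothetical and real data. The results confirm that the proposed indices have the potential to replace the h-index and the g-index.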

Abstract

The h-index, also called the Hirsch index, is a new tool for measuring research output through citations. It is not only easy to calculate but also robust enough to handle various citation data. Since Hirsch proposed it in 2005, many researchers have applied the h-index to their own fields, and others have tried to remedy its weak points, such as low discriminating power. This article first reviews several of these efforts and then suggests novel indices that measure research output by citations more fairly and reasonably. Calculating these indices on both artificial and real data showed that the newly suggested indices could replace the h-index and its variants.
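As an illustration of the two baseline measures discussed in this abstract, the following is a minimal Python sketch of the standard h-index and g-index definitions. It does not implement the improved indices proposed in the article, which the abstract does not specify. The function names and the sample citation counts are made up for the example, and the g-index here is the basic variant capped at the number of papers.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations.

    Assumption: g is capped at the number of papers (no fictitious zero-cited papers).
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for two authors (illustrative data only).
author_a = [4, 4, 4, 4, 1, 1, 1]
author_b = [40, 4, 4, 4, 1, 1, 1]

for name, cites in (("A", author_a), ("B", author_b)):
    print(name, "h =", h_index(cites), "g =", g_index(cites))
```

In this sample both authors have h = 4, while the g-index separates them because it rewards the one highly cited paper. That is one example of the low discriminating power of the h-index mentioned in the abstract.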

3

A Study on Automatic Descriptor Assignment for Journal Articles Using the Rocchio Algorithm

김판준(신라대학교) 2006, Vol.23, No.3, pp.69-89 https://doi.org/10.3743/KOSIM.2006.23.3.069

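Abstract (translated from Korean)

This study re-examines several performance factors that have been applied in controlled-vocabulary automatic indexing and text categorization based on the Rocchio algorithm, and looks for basic ways to improve performance. It also compares, under equal conditions, the performance of the Rocchio-based method for controlled-vocabulary automatic indexing with that of other learning-based methods. According to the results, the Rocchio-based profile method retains its existing advantages of easy implementation and economical computer processing time, while performing nearly as well as or better than the other learning-based methods (SVM, VPT, NB). In particular, for semi-automatic indexing that supports the work of professional indexers, the Rocchio-based method deserves first consideration, because it maintains a relatively high level of recall while its precision improves substantially as the training data grow.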

Abstract

Several performance factors that have been applied to automatic indexing with controlled vocabulary and to text categorization based on the Rocchio algorithm were examined, and simple methods for improving performance were tried. The results of the Rocchio-based methods were also compared with those of other learning-based methods under the same conditions. While retaining their strengths of easy implementation and computational efficiency, the Rocchio-based methods showed results equivalent to or better than those of the other learning-based methods (SVM, VPT, NB). In particular, for semi-automatic indexing (computer-aided indexing), the Rocchio-based methods, which maintain a high recall level, could be given preference.
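The profile idea behind a Rocchio-based indexer can be sketched as follows, assuming documents are sparse term-weight dictionaries. The weights `beta` and `gamma`, the clipping of negative weights, the cosine ranking, and all names and sample data are assumptions chosen for illustration. They are not the settings or performance factors examined in the study.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Average a list of sparse term-weight dicts."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {t: w / len(vectors) for t, w in total.items()} if vectors else {}


def rocchio_profile(positives, negatives, beta=16.0, gamma=4.0):
    """Profile = beta * positive centroid - gamma * negative centroid (negatives clipped)."""
    pos, neg = centroid(positives), centroid(negatives)
    profile = {}
    for term in set(pos) | set(neg):
        weight = beta * pos.get(term, 0.0) - gamma * neg.get(term, 0.0)
        if weight > 0:
            profile[term] = weight
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(train):
    """Build one profile per descriptor from (vector, descriptor-set) pairs."""
    descriptors = set().union(*(labels for _, labels in train))
    profiles = {}
    for descriptor in descriptors:
        pos = [vec for vec, labels in train if descriptor in labels]
        neg = [vec for vec, labels in train if descriptor not in labels]
        profiles[descriptor] = rocchio_profile(pos, neg)
    return profiles


def rank_descriptors(doc, profiles, top_k=3):
    """Return the top_k descriptors whose profiles are most similar to doc."""
    scored = sorted(((cosine(doc, p), d) for d, p in profiles.items()), reverse=True)
    return [d for _, d in scored[:top_k]]


# Hypothetical training data: (term weights, assigned descriptors).
train = [
    ({"index": 2, "thesaurus": 1}, {"Indexing"}),
    ({"citation": 3, "journal": 1}, {"Bibliometrics"}),
    ({"index": 1, "vocabulary": 2}, {"Indexing"}),
]
profiles = build_profiles(train)
print(rank_descriptors({"index": 1, "thesaurus": 1}, profiles, top_k=1))
```

For semi-automatic indexing, a ranked list like this would be shown to the indexer as candidates, so a generous `top_k` favors recall over precision.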

4

A Study on Automatic Descriptor Assignment through Machine Learning

김판준(신라대학교) 2006, Vol.23, No.1, pp.279-299 https://doi.org/10.3743/KOSIM.2006.23.1.279

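Abstract (translated from Korean)

A machine learning approach was applied to assign descriptors automatically to journal articles. Core journals in the field of information science were selected, a document collection was built from the articles they published over the past 11 years, and performance was examined with respect to feature selection and training set size. For feature selection, the best performance was obtained by reducing the features with the chi-square statistic (CHI) and with criteria that favor high-frequency features (COS, GSS, JAC) and then training a support vector machine (SVM). As for training set size, the support vector machine (SVM) and the voted perceptron (VPT) were considerably affected, whereas naive Bayes (NB) was hardly affected.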

Abstract

This study applies machine learning approaches to the automatic assignment of descriptors to journal articles. After core journals in the field of information science were selected and a test collection was built from their articles of the past 11 years, the effects of feature selection and training set size were examined. With regard to feature selection, the support vector machine (SVM) trained after the feature set was reduced by the χ2 statistic (CHI) and by criteria that prefer high-frequency features (COS, GSS, JAC) performed best. With respect to training set size, it significantly influences the performance of the support vector machine (SVM) and the voted perceptron (VPT), but it scarcely affects that of naive Bayes (NB).
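The chi-square (CHI) criterion mentioned above can be sketched as follows, using the usual 2x2 contingency table of term presence against category membership. This is a generic illustration, not the experimental setup of the study. The function names, the tie-breaking rule, and the toy documents are assumptions.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a term/category 2x2 table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score (ties broken alphabetically)."""
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and not y)
        c = sum(1 for doc, y in zip(docs, labels) if term not in doc and y)
        d = sum(1 for doc, y in zip(docs, labels) if term not in doc and not y)
        scores[term] = chi_square(a, b, c, d)
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Toy collection: each document is a set of terms; labels mark category membership.
docs = [{"svm", "kernel"}, {"svm", "margin"}, {"library", "catalog"}, {"library", "kernel"}]
labels = [True, True, False, False]
print(select_features(docs, labels, k=2))
```

Note that the statistic scores strong negative association as highly as strong positive association, so a term that never occurs in the category can still be selected. The reduced feature set would then be passed to a learner such as an SVM.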

5

A Study on Extending the Subject Access System for Fiction - Focusing on Symbols and Motifs -

김나름(연세대학교) ; 김태수(연세대학교) 2006, Vol.23, No.4, pp.69-87 https://doi.org/10.3743/KOSIM.2006.23.4.069

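Abstract (translated from Korean)

Access to literary works, including fiction, has centered on descriptive elements, and subject access has likewise been confined to formal elements appearing in a work, such as subject matter, character names, and place names. This practice misses the essence of the subject of fiction and fails to reflect the subject needs of users who seek an aesthetic experience. To extend the subject access system for fiction, this study examines the concepts of symbol and motif and their potential as subject access points. It also builds symbol and motif schemes using the relevant glossaries as information sources, applies them to 20th-century Korean fiction, and discusses their usability and limitations.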

Abstract

Access to literary works, including fiction, has focused on descriptive elements, and subject access has been confined to denotative elements that appear in the work, such as subject matter, character names, and geographical names. This practice does not lead to the essence of the subject of fiction and does not reflect the subject needs of users who pursue an aesthetic experience. In this study, the concepts of symbol and motif and their potential as subject access points are considered in order to enhance the subject access scheme. In addition, the study builds schemes of symbols and motifs by using glossaries as the source of information. The resulting schemes are applied to 20th-century Korean fiction, and their usability and limits are discussed.

6

Development of a Thesaurus for Building a Large-Scale Korean Ontology

최석두(한성대학교) ; 이우범(한성대학교) ; 김이겸(광주대학교) ; 이정연(한국학술진흥재단 지식정보센터) ; 최상기(전북대학교) ; 한상길(대림대학교) 2006, Vol.23, No.4, pp.147-164 https://doi.org/10.3743/KOSIM.2006.23.4.147

Abstract

This paper reports an effort to construct a large-scale Korean thesaurus that can be used to enhance retrieval performance in various fields. The thesaurus is currently being used for indexing and retrieval, and new terms are being added to it. As demands on retrieval performance increase in Korea, developing a large-scale ontology appears necessary, so a project has been undertaken to convert the current thesaurus into an ontology system. The paper describes how the thesaurus was constructed and prepared to serve as the base for an ontology system.

7

An Experimental Study on Selecting Association Terms Using Text Mining Techniques

김수연(연세대학교) ; 정영미(연세대학교) 2006, Vol.23, No.3, pp.147-165 https://doi.org/10.3743/KOSIM.2006.23.3.147

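Abstract (translated from Korean)

In this study, association term selection experiments were conducted using association rule mining and term clustering techniques in order to find the optimal technique for selecting terms associated with an initial query term from the entire document collection. The Apriori algorithm was used for association rule mining, and the GSS coefficient, Jaccard coefficient, cosine coefficient, Sokal & Sneath 5, and mutual information were used as association measures for term clustering. Association term precision and association term overlap ratio were used as performance measures, and the experiments showed that the Apriori algorithm and the GSS coefficient performed best.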

Abstract

In this study, experiments on the selection of association terms were conducted in order to find the optimal method for selecting additional terms related to an initial query term. Association term sets were generated by using the support, confidence, and lift measures of the Apriori algorithm and by using similarity measures such as GSS, the Jaccard coefficient, the cosine coefficient, Sokal & Sneath 5, and mutual information. The term selection methods were evaluated by the precision of association terms and by the overlap ratio between association terms and the indexing terms of relevant documents. The Apriori algorithm and GSS achieved the highest performance.
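Some of the measures named in this abstract can be sketched from document frequencies alone. The example below computes support, confidence, and lift for a rule "query term -> candidate term", plus the Jaccard and cosine coefficients for binary term occurrence. It omits the Apriori candidate generation itself and the GSS, Sokal & Sneath 5, and mutual information measures. The function names and the toy documents are assumptions for illustration.

```python
import math


def doc_freq(docs, *terms):
    """Number of documents that contain all the given terms."""
    return sum(1 for doc in docs if all(t in doc for t in terms))


def rule_measures(docs, query, candidate):
    """Support, confidence and lift of the rule query -> candidate."""
    n = len(docs)
    n_q = doc_freq(docs, query)
    n_c = doc_freq(docs, candidate)
    n_qc = doc_freq(docs, query, candidate)
    support = n_qc / n if n else 0.0
    confidence = n_qc / n_q if n_q else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def jaccard(docs, a, b):
    """Jaccard coefficient: co-occurrences over documents containing either term."""
    n_ab = doc_freq(docs, a, b)
    union = doc_freq(docs, a) + doc_freq(docs, b) - n_ab
    return n_ab / union if union else 0.0


def cosine(docs, a, b):
    """Cosine coefficient for binary term occurrence."""
    n_a, n_b = doc_freq(docs, a), doc_freq(docs, b)
    return doc_freq(docs, a, b) / math.sqrt(n_a * n_b) if n_a and n_b else 0.0


# Toy collection: each document is a set of index terms.
docs = [
    {"retrieval", "query", "index"},
    {"retrieval", "query"},
    {"retrieval", "thesaurus"},
    {"catalog", "index"},
]
print(rule_measures(docs, "retrieval", "query"))
print(jaccard(docs, "retrieval", "query"), cosine(docs, "retrieval", "query"))
```

Candidate terms would then be ranked by one of these scores for a given query term, and the top-ranked terms kept as its association terms.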
