Journal of the Korean Society for Information Management, Korean Society for Information Management

1

이재윤 (Myongji University) 2017, Vol.34, No.3, pp.209-228 https://doi.org/10.3743/KOSIM.2017.34.3.209

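Abstract (translated from Korean)

This study reinterprets the existing h-index and g-index and proposes a new Hirsch-type composite indicator, the transposed g-index. Under the new interpretation, the axes of the graph used to derive the h-index and g-index are transposed so that the horizontal axis corresponds to the citation-frequency threshold and the vertical axis to the number of papers, and the new transposed g-index is derived from this graph. When applied to library and information science researchers in the Korean Citation Index (KCI), the proposed index showed higher discriminating power than the h-index and g-index and was more sensitive to differences in research quantity than in research quality. Because it distinguishes steady researchers who keep publishing from those who do not, the transposed g-index is expected to help with multifaceted evaluation of research performance.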

Abstract

This study suggests a new Hirsch-type composite index, the 'transposed g-index', based on a different view of the h-index and g-index. From this new point of view, the axes of the graph describing the h-index and g-index are transposed so that the horizontal axis corresponds to the citation frequency threshold and the vertical axis corresponds to the number of documents. Based on this transposed graph, the transposed g-index is proposed as a new indicator and applied to the output of library and information science researchers in the Korean Citation Index database. The results show that the new index has more discriminating power than the h-index and g-index, and is more sensitive to differences in the quantity of research than in its quality. Because it distinguishes consistent researchers who continue to publish from those who do not, the transposed g-index is expected to be helpful for multifaceted evaluation of research outcomes.

2

A Study on Improving Document Clustering Performance Using Distributional Similarity

이재윤 (Kyonggi University) 2007, Vol.24, No.4, pp.267-283 https://doi.org/10.3743/KOSIM.2007.24.4.267

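Abstract (translated from Korean)

This study explored whether distributional similarity can replace the traditional cosine similarity formula in document clustering. We devised ways to apply three variants of KL divergence, a representative distributional similarity measure, to document vectors: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify clustering performance with distributional similarity, two experiments were designed and run on three test collections. In the first experiment, minimum skew divergence clearly outperformed not only cosine similarity but also the other divergence formulas. In the second experiment, second-order distributional similarity was computed from the first-order similarity matrix using the Pearson correlation coefficient and then used for document clustering. The results showed that second-order distributional similarity gave better clustering performance overall. When both processing time and classification performance matter, the minimum skew divergence proposed in this study is preferable; when only classification performance matters, the second-order distributional similarity approach is preferable.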

Abstract

In this study, measures of distributional similarity such as KL-divergence are applied to document clustering in place of the traditional cosine measure, the most prevalent vector similarity measure for this task. Three variations of KL-divergence are investigated: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the contribution of distributional similarities to document clustering, two experiments are designed and carried out on three test collections. In the first experiment, the clustering performance of the three divergence measures is compared with that of the cosine measure. The results show that minimum skew divergence outperforms the other divergence measures as well as the cosine measure. In the second experiment, second-order distributional similarities are calculated with the Pearson correlation coefficient from the first-order similarity matrices. Second-order distributional similarities were found to improve the overall performance of document clustering. These results suggest that minimum skew divergence should be selected as the document vector similarity measure when both time and accuracy are considered, and that second-order similarity is a good choice when only clustering accuracy is considered.
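
As a rough illustration of the kinds of measures this abstract compares, the sketch below computes cosine similarity and Jensen-Shannon divergence for two term-frequency vectors. It is not the paper's implementation: the function names are hypothetical, base-2 logarithms are an assumption, and the symmetric and minimum skew divergences are left out because the abstract does not give their definitions.

```python
import math

def normalize(freqs):
    """Turn raw term frequencies into a probability distribution."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("vector has no non-zero term frequencies")
    return [f / total for f in freqs]

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing.

    Assumes q_i > 0 wherever p_i > 0 (always true inside js_divergence).
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def cosine_similarity(x, y):
    """Cosine of the angle between two raw term-frequency vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)

# Two toy documents over a shared four-term vocabulary (made-up counts).
doc_a = [3, 1, 0, 2]
doc_b = [1, 0, 4, 2]
print(cosine_similarity(doc_a, doc_b))
print(js_divergence(normalize(doc_a), normalize(doc_b)))
```

Cosine is a similarity (higher means closer), while the divergence is a dissimilarity (lower means closer), so a clustering routine has to treat the two in opposite directions.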

3

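Analysis of the Relationship between Co-authorship Network Centrality and Research Performance Considering the Number of Co-authors

이재윤 (Myongji University) 2016, Vol.33, No.4, pp.175-199 https://doi.org/10.3743/KOSIM.2016.33.4.175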

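Abstract (translated from Korean)

We analyzed the relationship between centrality in the co-authorship network and research performance indicators, using the authors and citation counts of papers published over 10 years in Korean library and information science. In particular, the analysis distinguished research performance indicators calculated without regard to co-authorship from those calculated with co-authorship taken into account. In addition, by analyzing correlations among the indicators for author groups defined by different paper-count thresholds, we were able to explain the inconsistent results of previous studies on the correlation between researchers' citation indicators and their centrality. Overall, the degree of co-authorship activity was not significantly correlated with research performance, and in some cases the correlation was negative. Centrality and research performance showed a statistically significant positive correlation, but the correlation was not significant when only the top 30 authors were analyzed.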

Abstract

We analyzed the relationships between co-authorship network centralities and research performance indicators, using the authors and citation counts of papers published over 10 years in Korean library and information science journals. In particular, the research performance indicators were calculated both with normal counting and with fractional counting. By running correlation analyses on author groups defined by different thresholds for the number of articles, we were able to explain the inconsistent results of previous studies on the correlations between researchers' citation indicators and their co-authorship network centralities. Overall, the degree of co-authorship activity measured by the collaboration coefficient showed no correlation or a negative correlation with research performance. There were statistically significant positive correlations between the centralities and the research performance indicators, but the correlation was not significant in the analysis of the top 30 authors by number of articles.

4

Calculating the h-index Considering the Number of Co-authors

이재윤 (Myongji University) 2016, Vol.33, No.3, pp.7-29 https://doi.org/10.3743/KOSIM.2016.33.3.007

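Abstract (translated from Korean)

The h-index, widely used to evaluate researchers' performance, suffers from a lack of consistency and does not take the number of co-authors into account. To address these problems, we reviewed the h-index, the g-index, and methods of correcting for co-authorship, and analyzed actual KCI data from 2004 to 2013, with the following results. First, the g-index is preferable for resolving the lack of consistency. Second, to preserve the distinctive character of the h-index and g-index as composite indices that reflect both the quantitative and qualitative aspects of research performance, the indices must be measured with a correction for co-authorship. Third, applying the hC-index and gC-index, which use citation counts divided by the number of co-authors, makes it possible to evaluate fairly humanities researchers, who have a high share of single-authored work, and prevents researchers from particular fields or institutions from dominating the top ranks.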

Abstract

The h-index is a popular bibliometric indicator for evaluating individual researchers. However, it has been criticized for its inconsistency in reflecting increases in the number of citations and for disregarding the number of co-authors of a paper. To overcome these problems, we examined the g-index and other Hirsch-type indices that consider the number of co-authors. The test data were extracted from the Korean Citation Index database and cover publications from 2004 to 2013. The results of this study are as follows. First, the g-index is a more reliable indicator than the h-index in terms of consistency. Second, the number of co-authors must be considered in order to maintain the h-index as a composite indicator reflecting both the quality and the quantity of research performance. Finally, the hc-index and gc-index, with fractionalised counting of the papers, can fairly measure the research performance of humanities researchers and successfully prevent specific disciplines or institutions from occupying the majority of top rankings.
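
For readers unfamiliar with the indices discussed here, the following sketch computes the h-index and g-index under their usual definitions, plus co-author-corrected variants that divide each paper's citation count by its number of authors, as the Korean abstract describes. The function names are hypothetical, the g-index is capped at the number of papers (an assumption; some variants allow it to exceed that), and the paper's exact procedure may differ.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g

def fractional_citations(papers):
    """Divide each paper's citations by its number of authors.

    `papers` is a list of (citations, n_authors) pairs, a hypothetical
    input format chosen for this sketch.
    """
    return [citations / n_authors for citations, n_authors in papers]

citations = [10, 8, 5, 4, 3]
print(h_index(citations))  # 4
print(g_index(citations))  # 5

# Same papers with made-up author counts: co-authored papers weigh less.
papers = [(10, 2), (8, 1), (5, 5), (4, 1), (3, 3)]
print(h_index(fractional_citations(papers)))  # 3
print(g_index(fractional_citations(papers)))  # 4
```

In this toy case the corrected values drop because three of the five papers are co-authored, which is the effect the abstract relies on when comparing single-author-heavy fields with others.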

5

A Study on Improving the Document Classification Performance of SVM Classifiers Using Inter-document Similarity

이재윤 (Kyonggi University) 2005, Vol.22, No.3, pp.261-287 https://doi.org/10.3743/KOSIM.2005.22.3.261

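Abstract (translated from Korean)

The purpose of this paper is to improve the performance of SVM (support vector machine) classifiers by using inter-document similarity. It proposes SVM-based automatic document classification built on a document-vector feature representation. The proposed method uses document vectors instead of index terms as classification features, and vector similarities instead of term weights as feature values. Experiments showed that the proposed method can improve the performance of the SVM classifier. To improve run-time efficiency, a method for selecting document-vector features and a method using category centroid vectors were also proposed. Experiments showed that even a small set of vector features performs better than the conventional approach using index-term features.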

Abstract

The purpose of this paper is to explore ways to improve the performance of SVM (Support Vector Machines) text classifiers using inter-document similarities. SVMs are a powerful machine learning technique for automatic document classification. In this paper, an approach to text categorization via SVMs based on feature representation with document vectors is suggested. In this approach, document vectors are used as features instead of index terms, and vector similarities are used as feature values instead of term weights. Experiments show that an SVM classifier with document vector features can improve document classification performance. For the sake of run-time efficiency, two methods are developed: one is to select document vector features, and the other is to use category centroid vector features instead. Experiments on these two methods show that even a small set of vector features can outperform conventional methods with index term features.
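
The feature representation described in this abstract can be pictured with a small sketch: each document is re-expressed as a vector of its similarities to a set of reference documents (or to category centroids), and that vector would then be fed to a classifier. The code below only builds the features. Cosine similarity, the sparse-dictionary document format, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import math

def cosine(x, y):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * y.get(term, 0.0) for term, weight in x.items())
    norm_x = math.sqrt(sum(w * w for w in x.values()))
    norm_y = math.sqrt(sum(w * w for w in y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)

def similarity_features(doc, reference_vectors):
    """One feature per reference vector: the document's similarity to it."""
    return [cosine(doc, ref) for ref in reference_vectors]

def centroid(docs):
    """Average term weights over a non-empty group of documents."""
    totals = {}
    for doc in docs:
        for term, weight in doc.items():
            totals[term] = totals.get(term, 0.0) + weight
    return {term: weight / len(docs) for term, weight in totals.items()}

# Toy training documents for two made-up categories.
category_a = [{"svm": 2.0, "kernel": 1.0}, {"svm": 1.0, "margin": 1.0}]
category_b = [{"cluster": 2.0, "centroid": 1.0}]

new_doc = {"svm": 1.0, "kernel": 1.0}

# Variant 1: every training document is a feature.
print(similarity_features(new_doc, category_a + category_b))

# Variant 2: one feature per category centroid (far fewer features).
print(similarity_features(new_doc, [centroid(category_a), centroid(category_b)]))
```

The centroid variant trades some detail for a much shorter feature vector, which matches the run-time motivation the abstract gives for it.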

6

A Study on Classifying Researcher Types Based on Research Collaboration Characteristics

이재윤 (Myongji University) 2023, Vol.40, No.2, pp.59-80 https://doi.org/10.3743/KOSIM.2023.40.2.059

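Abstract (translated from Korean)

Existing models for classifying researcher types have mostly relied on research performance indicators. Considering that citation impact is related to research collaboration, this study explored a new method that analyzes researcher types using only collaboration indicators, without citation data. We proposed a model that classifies researchers into four types based on collaboration pattern and collaboration scope: Sparse & Wide (SW), Dense & Wide (DW), Dense & Narrow (DN), and Sparse & Narrow (SN). Applying the proposed model to the field of quantum metrology, we statistically verified that citation indicators and co-authorship network indicators differ across the researcher types. Because the proposed model does not require citation information, it is expected to be widely applicable to research management policy and research support services.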

Abstract

Traditional models for categorizing researcher types have mostly utilized research output metrics. This study proposes a new model that classifies researchers based on the characteristics of research collaboration. The model uses only research collaboration indicators and does not rely on citation data, taking into account that citation impact is related to collaborative research. The model categorizes researchers into four types based on their collaborative research pattern and scope: Sparse & Wide (SW) type, Dense & Wide (DW) type, Dense & Narrow (DN) type, Sparse & Narrow (SN) type. When applied to the quantum metrology field, the proposed model was statistically verified to show differences in citation indicators and co-author network indicators according to the classified researcher types. The proposed researcher type classification model does not require citation information. Therefore, it is expected to be widely used in research management policies and research support services.

7

A Study on the Usefulness of Structured Abstracts of Journal Articles for Document Clustering

최상희 (Daegu Catholic University) ; 이재윤 (Kyonggi University) 2012, Vol.29, No.1, pp.331-349 https://doi.org/10.3743/KOSIM.2012.29.1.331

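Abstract (translated from Korean)

Structured abstracts express the topics of scholarly articles and have been regarded as an important element in processing them. This study analyzed the fields that make up a structured abstract into four attributes and examined whether the structure of the abstract can be exploited in document clustering. When the field attributes of structured abstracts were applied to document clustering, results varied across clustering methods, but the research purpose field, which is highly topical relative to the amount of information it provides, had the greatest influence on clustering performance. The analysis also found field-dependent words that occur specifically in particular fields; excluding these field-dependent words and applying the within-group average linkage method improved clustering performance.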

Abstract

Structured abstracts have been regarded as an essential information factor for representing the topics of journal articles. This study aims to provide an unconventional view of how to utilize structured abstracts through an in-depth analysis of the subfields of a structured abstract. In this study, a structured abstract was segmented into four fields, namely purpose, design, findings, and values/implications. The fields were compared in a performance analysis of document clustering. As a result, the purpose statement of an abstract affected the performance of journal article clustering more than any other field. Furthermore, certain types of keywords were identified that should be excluded from document clustering to improve clustering performance, especially with the within-group average clustering method. These keywords had a stronger relationship to a specific abstract field, such as research design, than to the topic of an article.

8

A Study on Using Unlabeled Documents in Automatic Classification Based on Inter-document Similarity

김판준 (Silla University) ; 이재윤 (Kyonggi University) 2007, Vol.24, No.1, pp.251-271 https://doi.org/10.3743/KOSIM.2007.24.1.251

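Abstract (translated from Korean)

We explored ways to improve classification performance by using unlabeled documents for training in classifiers that use inter-document similarity as features. Because manually securing a large number of training documents for automatic classification is costly, using unlabeled documents is important in practice. Most semi-supervised learning algorithms that use unlabeled documents consist of a first step, in which manually classified documents serve as training data for classifying the unlabeled documents, and a second step, in which a classifier is trained on both the manually and the automatically classified documents. This paper examined two semi-supervised learning algorithms for the setting in which inter-document similarity features are applied. Of these, one-step semi-supervised learning is simple because it uses unlabeled documents only to generate document-similarity features, whereas two-step semi-supervised learning uses unlabeled documents both to generate document-similarity features and as training examples. In experiments with support vector machines and naive Bayes classifiers, both semi-supervised approaches outperformed supervised learning that does not use unlabeled documents. In particular, when run-time efficiency is considered, we concluded that the proposed one-step semi-supervised approach is a good way to improve classification performance with unlabeled documents.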

Abstract

This paper studies the problem of classifying documents with labeled and unlabeled training data, especially with regard to using document similarity features. The problem of using unlabeled data is practically important because in many information systems obtaining training labels is expensive, while large quantities of unlabeled documents are readily available. A general semi-supervised learning algorithm has two steps. First, it trains a classifier using the available labeled documents and classifies the unlabeled documents. Then, it trains a new classifier using all the training documents, which were labeled either manually or automatically. We suggest two types of semi-supervised learning algorithm that use document similarity features. One is one-step semi-supervised learning, which uses unlabeled documents only to generate document similarity features. The other is two-step semi-supervised learning, which uses unlabeled documents as learning examples as well as for similarity features. Experimental results, obtained using support vector machines and naive Bayes classifiers, show that using a small set of labeled documents together with a large set of unlabeled documents yields better performance than supervised learning with labeled data only. When the efficiency of a classifier system is considered, the one-step semi-supervised learning algorithm suggested in this study could be a good solution for improving classification performance with unlabeled documents.

9

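Measuring the Goodness of Fit of Link Reduction Algorithms for Mapping Intellectual Structures in Bibliometric Analysis

이재윤 (Department of Library and Information Science, Myongji University) 2022, Vol.39, No.2, pp.233-254 https://doi.org/10.3743/KOSIM.2022.39.2.233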

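Abstract (translated from Korean)

Link reduction algorithms such as the pathfinder network are widely used when weighted networks must be visualized for intellectual structure analysis. This study proposed NetRSQ as an indicator for measuring the goodness of fit of link reduction algorithms for network visualization. NetRSQ measures the fit of a network based on the rank correlation between the association data for pairs of entities and the path lengths in the generated network. To check the validity of NetRSQ, we measured it on data from a previous study that had qualitatively evaluated several network generation methods. The result confirmed that networks judged to be of better quality had higher NetRSQ. In an experiment that used NetRSQ to measure the quality of the results of applying 4 link reduction algorithms to 40 bibliometric datasets, no algorithm's network representation was always of good quality, nor was any always of poor quality. The proposed NetRSQ can therefore serve as a basis for measuring the quality of generated bibliometric networks and selecting the best technique.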

Abstract

Link reduction algorithms such as the pathfinder network are widely used to overcome problems with the visualization of weighted networks for knowledge domain analysis. This study proposed NetRSQ, an indicator that measures the goodness of fit of a link reduction algorithm for network visualization. NetRSQ calculates the fitness of a network based on the rank correlation between the path length and the degree of association between entities. The validity of NetRSQ was investigated with data from previous research that qualitatively evaluated several network generation algorithms. In this preliminary test, networks that had been judged to show better intellectual structures in the quality evaluation also had higher NetRSQ values. The performance of 4 link reduction algorithms was then tested on 40 datasets from various domains and compared using NetRSQ. The test shows that no specific link reduction algorithm performs better than the others in all cases. Therefore, NetRSQ can be a useful and reliable basis for selecting the most fitting algorithm for the network visualization of intellectual structures.

10

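A Study on a Model for the Development and Operation of Research Evaluation Services in Korean University Libraries

김수정 (Professor, Department of Library and Information Science, Jeonbuk National University; Researcher, Institute of Culture Convergence Archiving) ; 이재윤 (Professor, Department of Library and Information Science, Myongji University) ; 이지원 (Associate Professor, Department of Library Science, Daegu Catholic University) 2021, Vol.38, No.3, pp.287-309 https://doi.org/10.3743/KOSIM.2021.38.3.287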

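Abstract (translated from Korean)

The purpose of this study is to examine in detail the introduction, growth, current operation, and future plans of research evaluation services in Korean university libraries, so that libraries considering introducing such services can use the findings as a reference. To this end, in-depth interviews were conducted with the service staff of four university libraries that are leading providers of research evaluation services. The interview content was organized into five categories, including growth, operation, and services. The study found that research evaluation services began around 2010, in response to requests from members of the university or as internal library initiatives to expand services, with the aim of supporting the university's research competitiveness, and that they have been continuously strengthened through system improvements and broader service content. The study also presents a comprehensive model of research evaluation services that can be consulted when developing and operating them.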

Abstract

This study describes the introduction, growth, current practices and future plans of research evaluation services performed in domestic academic libraries, with a view to informing other libraries considering similar endeavours. To that end, in-depth interviews were conducted with four librarians from academic libraries leading in research evaluation services. The contents of the interviews were grouped into five categories including growth, management, and services. The study found that their research evaluation services were launched around 2010 by demands of members of a university or as a library’s initiative to expand the existing services for the purpose of enhancing the university’s research competitiveness. The research evaluation services have been strengthened by extending the service scope and improving related systems. Also, the study suggests a comprehensive model that can guide the development and operation of research evaluation services.
