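Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)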

21

이혁진(Texas Woman’s University) 2006, Vol.23, No.2, pp.97-111 https://doi.org/10.3743/KOSIM.2006.23.2.097

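Abstract (translated from Korean)

The main purpose of this paper is to determine at what level of difference in precision information users perceive a meaningful difference. Several interesting results were obtained in this regard. In addition, the correctness of relevance judgments turned out to be unrelated to the time users took to make them. The relationship between users' background knowledge of the topic and their relevance judgments was pronounced. Users also had more difficulty with relevance judgments when the number of relevant documents was small. Finally, the importance of the top N documents in a result list for relevance judgment was confirmed.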

Abstract

The purpose of this study is to investigate what level of difference in precision is significantly perceived by human users of an information retrieval system. Little research has been conducted on this issue in the information retrieval field. Although the results were not statistically significant, there were several interesting findings on how users recognize different levels of precision. The correctness of relevance judgments had little to do with the time taken for the task. In addition, the strong relationship between the subjects' topic familiarity and their rate of correct judgments is one of the most interesting results of this study. The subjects had more difficulty judging between two lists that contained more non-relevant documents than between two lists that contained more relevant documents. Finally, the strong influence of the top N documents in a list on the relevance judgment task was confirmed.
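For readers unfamiliar with the measure, precision over the top N documents of a ranked list can be sketched as follows. This is a minimal illustration, not the study's procedure: the function name `precision_at_n` and the two toy relevance lists are hypothetical.

```python
def precision_at_n(relevance, n):
    """Fraction of the top-n ranked documents that are relevant.

    `relevance` holds 0/1 judgments in ranked order (hypothetical data).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(relevance[:n]) / n


# Two hypothetical result lists for the same query, 1 = relevant.
list_a = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
list_b = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]

# The difference in precision that a user would be asked to perceive.
diff = precision_at_n(list_a, 10) - precision_at_n(list_b, 10)
print(precision_at_n(list_a, 10), precision_at_n(list_b, 10), round(diff, 2))
```

With these toy lists the two precision values differ by 0.2 at N = 10; the question the study asks is whether a human user notices a gap of that size.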

22

The Effect of Interfaces on Novice Users' Performance in Social Virtual Worlds

정윤혁(울산과학기술대학교) ; 주보령(Louisiana State Univ.) ; Lisl Zach(Drexel University) 2012, Vol.29, No.4, pp.7-23 https://doi.org/10.3743/KOSIM.2012.29.4.007

Abstract

This paper explores how interface environments influence novice users' performance in social virtual worlds (SVWs), which are emerging user-centric three-dimensional cyberspaces. Despite their early popularity, SVWs have seen numerous new users leave soon after joining, before they become long-term users. One possible reason is that the unfamiliar interfaces of SVWs can be a barrier to novice users' adaptation to the technology. To understand the role of interfaces in users' assimilation of SVWs, we examine the impact of three interface factors (presence, affordance, and feedback) on performance, which is regarded as a yardstick for users' adaptation to SVWs. Forty participants were recruited and completed one-hour experimental sessions with seven tasks in Second Life; they were also asked to answer a questionnaire. Findings indicate that while affordance and feedback are significant factors influencing novice users' performance, presence has no impact on their performance.

23

A Study on the Performance of Document Retrieval Using Node Information

윤소영(국사편찬위원회) 2007, Vol.24, No.1, pp.103-120 https://doi.org/10.3743/KOSIM.2007.24.1.103

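Abstract (translated from Korean)

With advances in communication technology and information devices, the ways universities use information in their curricula are changing rapidly, and there are more and more matters to be aware of when copyrighted information is used ethically and legally as educational material. This study analyzed the copyright law and various guidelines on fair use of information for educational purposes at universities, derived the key concepts that academic libraries should make educators and students aware of, and proposed the key areas that academic libraries should include in fair use guidelines. It also examined whether the key concepts derived for each area are adequately provided to educators and students on the websites of Korean university libraries.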

Abstract

Due to radical changes in information technology, it has become indispensable for university educators and students to learn how to use copyrighted works ethically and legally without violating copyright law. As a result, academic libraries need to take responsibility for informing them of fair use criteria and for providing proper fair use guidelines. This study analysed copyright law and various fair use guidelines to identify the key areas that fair use guidelines for academic libraries should cover. It also investigated the web sites of 10 university libraries to find out whether the identified key areas are delivered to educators and students.

24

A Study on Improving Document Clustering Performance Using Distributional Similarity

이재윤(경기대학교) 2007, Vol.24, No.4, pp.267-283 https://doi.org/10.3743/KOSIM.2007.24.4.267

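Abstract (translated from Korean)

This study explored whether distributional similarity can replace the traditional cosine similarity formula in document clustering. It devised ways to apply to document vectors three formulas that modify KL divergence, a representative distributional similarity measure: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the performance of document clustering with distributional similarity, two experiments were prepared and run on three test collections. In the first document clustering experiment, minimum skew divergence clearly outperformed not only cosine similarity but also the other divergence formulas. In the second experiment, second-order distributional similarity was computed from the first-order similarity matrix using the Pearson correlation coefficient and then used for document clustering. The results showed that second-order distributional similarity yields better document clustering performance overall. When both processing time and classification performance are considered, the minimum skew divergence formula proposed in this study is preferable; when only classification performance is considered, the second-order distributional similarity approach is preferable.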

Abstract

In this study, measures of distributional similarity such as KL-divergence are applied to document clustering in place of the traditional cosine measure, which is the most prevalent vector similarity measure for document clustering. Three variations of KL-divergence are investigated: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the contribution of distributional similarities to document clustering, two experiments were designed and carried out on three test collections. In the first experiment, the clustering performance of the three divergence measures was compared with that of the cosine measure. The results showed that minimum skew divergence outperformed the other divergence measures as well as the cosine measure. In the second experiment, second-order distributional similarities were calculated with the Pearson correlation coefficient from the first-order similarity matrices. Second-order distributional similarities were found to improve the overall performance of document clustering. These results suggest that minimum skew divergence is the preferable document vector similarity measure when both time and accuracy are considered, and that second-order similarity is a good choice when only clustering accuracy is considered.
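The three divergence measures named above can be sketched on term distributions as follows. The abstract does not give the paper's exact formulas, so this sketch makes labeled assumptions: skew divergence is taken in the common form D(p || αq + (1 − α)p), the symmetric variant is assumed to average the two directions, the minimum variant is assumed to take the smaller of the two, and α = 0.99 and the toy term counts are illustrative only.

```python
import math


def normalize(counts):
    """Turn raw term counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl(p, q):
    """KL divergence D(p || q); q must be nonzero wherever p is nonzero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: average KL to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def skew(p, q, alpha=0.99):
    """Skew divergence: KL from p to q smoothed with a little of p (assumed form)."""
    mix = [alpha * qi + (1 - alpha) * pi for pi, qi in zip(p, q)]
    return kl(p, mix)


def symmetric_skew(p, q, alpha=0.99):
    """Assumed definition: mean of the two directed skew divergences."""
    return 0.5 * (skew(p, q, alpha) + skew(q, p, alpha))


def minimum_skew(p, q, alpha=0.99):
    """Assumed definition: smaller of the two directed skew divergences."""
    return min(skew(p, q, alpha), skew(q, p, alpha))


# Hypothetical term-count vectors for two documents over a 4-term vocabulary.
doc_a = normalize([3, 1, 0, 2])
doc_b = normalize([1, 0, 4, 1])
print(jensen_shannon(doc_a, doc_b), symmetric_skew(doc_a, doc_b), minimum_skew(doc_a, doc_b))
```

These are dissimilarities, so lower values mean more similar documents. Unlike plain KL divergence, all three stay finite when one document lacks a term the other contains, which is why such variants are usable on sparse document vectors.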

25

An Analysis of the Relationship between Co-authorship Network Centrality and Research Performance Considering the Number of Co-authors

이재윤(명지대학교) 2016, Vol.33, No.4, pp.175-199 https://doi.org/10.3743/KOSIM.2016.33.4.175

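Abstract (translated from Korean)

Using the authors and citation counts of papers published over 10 years in Korean library and information science, this study analyzed the relationship between centrality in the co-authorship network and research performance indicators. In particular, the analysis was carried out separately for research performance indicators computed without considering co-authorship and for those computed with co-authorship taken into account. Correlations among the indicators were also analyzed with author groups defined differently according to number of papers, which made it possible to explain the inconsistent results of previous studies on the correlation between researchers' citation indicators and their centrality. Overall, the degree of co-authorship activity was not significantly correlated with research performance, and in some cases showed a negative correlation. A statistically significant positive correlation was found between centrality and research performance, but the correlation was not significant when only the top 30 authors were analyzed.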

Abstract

We analyzed the relationships between co-authorship network centralities and research performance indicators, using the authors and citation counts of papers published over 10 years in Korean library and information science journals. In particular, the research performance indicators were calculated with both normal counting and fractional counting. By running the correlation analysis on author groups of different ranges, defined by number of articles, it was possible to explain the inconsistent results of previous studies on the correlations between researchers' citation indicators and their co-authorship network centralities. Overall, the degree of co-authorship activity, measured by the collaboration coefficient, showed no correlation or a negative correlation with research performance. There were statistically significant positive correlations between the centralities and the research performance indicators, but the correlation was not significant in the analysis of the top 30 authors by number of articles.
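The difference between normal and fractional counting can be illustrated with a small sketch. The paper list, author labels, and function names below are hypothetical, and the per-author collaboration coefficient is assumed to be one minus the mean of 1/(number of authors) over that author's papers; the abstract does not state the exact formulas used in the study.

```python
from collections import defaultdict

# Hypothetical papers: (author list, citation count).
papers = [
    (["A", "B"], 10),
    (["A"], 4),
    (["A", "B", "C"], 6),
]


def count_credit(papers, fractional):
    """Return per-author paper counts and citation counts.

    Normal counting gives every co-author full credit; fractional counting
    splits each paper and its citations equally among its authors.
    """
    pubs = defaultdict(float)
    cites = defaultdict(float)
    for authors, citations in papers:
        share = 1 / len(authors) if fractional else 1.0
        for author in authors:
            pubs[author] += share
            cites[author] += citations * share
    return dict(pubs), dict(cites)


def collaboration_coefficient(papers, author):
    """Assumed form: 1 - mean(1 / n_authors) over the author's papers."""
    shares = [1 / len(authors) for authors, _ in papers if author in authors]
    return 1 - sum(shares) / len(shares)


normal_pubs, normal_cites = count_credit(papers, fractional=False)
frac_pubs, frac_cites = count_credit(papers, fractional=True)
print(normal_cites["A"], round(frac_cites["A"], 2), round(collaboration_coefficient(papers, "A"), 3))
```

In this toy data author A drops from 20 citations under normal counting to 11 under fractional counting, which shows how the choice of counting method can change an author's rank before any correlation with centrality is computed.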

26

An Experimental Study on the Performance of Element-based XML Document Retrieval

윤소영(국사편찬위원회) ; 문성빈(연세대학교) 2006, Vol.23, No.1, pp.201-219 https://doi.org/10.3743/KOSIM.2006.23.1.201

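Abstract (translated from Korean)

To identify the most suitable element-based XML document retrieval method, this study ran experiments evaluating the retrieval performance of the divergence method, the smoothing method, and the hierarchical language model as language-model retrieval approaches. Based on the results, hierarchical language model retrieval, which applies the structural information of documents, was proposed as the most effective retrieval approach. In particular, the hierarchical language model performed very well at the top ranks, which matter most in actual retrieval.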

Abstract

This experimental study proposes an element-based XML document retrieval method that surfaces highly relevant elements. The models compared are the divergence method, the smoothing method, and the hierarchical language model. In conclusion, the hierarchical language model proved to be the most effective in element-based XML document retrieval, improving exhaustivity at some cost to specificity.

27

An Analysis of Copyright Issues for Libraries in the Use of Cinematographic Works

정경희(한성대학교) ; 이호신(한성대학교) ; 최상희(대구가톨릭대학교) 2014, Vol.31, No.4, pp.179-200 https://doi.org/10.3743/KOSIM.2014.31.4.179

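Abstract (translated from Korean)

This study investigated the current use of audiovisual materials in university and public libraries and the copyright problems that arise from it. To this end, a survey of public and university libraries was conducted, and librarians' copyright problems were examined through the survey together with an analysis of questions posted to 도메리 and copyright-related websites. The results showed that, as facilities and services for audiovisual materials have diversified, copyright problems arise not only in public performance but also in lending, copying for preservation, digitization, and service over the internet. Copyright questions in each area also ranged from basic questions to questions on detailed issues. Given the complexity of copyright problems, the study proposed education for a basic understanding of copyright in regular university curricula for training librarians, supplementary education on revisions of the copyright law in job training programs, and an online question-and-answer service for librarians to resolve detailed copyright problems.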

Abstract

This study investigated the current use of cinematographic works in libraries and the associated copyright problems. Surveys were conducted in public and university libraries for this purpose. Content analyses were also conducted to understand copyright problems in libraries. The study found that copyright problems had arisen in various areas, including public performance, lending, digitization, and internet services, as the facilities for watching cinematographic works and the related library services diversified. The librarians' copyright questions also varied widely, from the basic level to the specific level. The study suggested that regular courses in library schools should provide a basic understanding of copyright law and that occupational training programs for librarians should provide supplementary education as copyright law is revised. It also suggested that an online Q&A service should be started for librarians who have detailed copyright problems.

28

Designing the Functions of a Participatory Archive Portal for Classical Music Performance Information in Korea

장재우(명지대학교 기록정보과학전문대학원) ; 이해영(명지대학교) ; 이승휘(명지대학교 기록정보과학전문대학원) ; 김인택(명지대학교) ; 백대원(㈜세미콘네트웍스) 2019, Vol.36, No.2, pp.223-254 https://doi.org/10.3743/KOSIM.2019.36.2.223

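Abstract (translated from Korean)

Research on archiving the performing arts has recently been conducted from many angles in Korea, but no digital archive that allows classical music performance information to be checked systematically in one place and past performance information to be retrieved has been attempted, nor has any research been done on one. This study proposed a participatory archive portal that can provide, preserve, and serve information on all classical music performances, and sought to design its functions based on the requirements for a performance archive portal identified in previous research. To this end, the functional requirements of the archive portal were identified and reorganized to fit the characteristics of classical music performance information, and metadata elements were designed. The design and implementation direction of search and facet navigation and of user participation and communication functions were also presented. Finally, a prototype implementing the interfaces for these functions was developed.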

Abstract

Research on archiving the performing arts has recently been conducted from various angles in Korea, but there has been no research on, or attempt to build, a digital archive in which information on classical music performances can be systematically identified in one place and past performance information can be retrieved. In this study, we proposed a participatory archive portal for providing, preserving, and serving information on all classical music performances, and sought to design its functions based on the requirements of a performance archive portal identified in previous research. We identified the functional requirements of the portal, reorganized them to fit the characteristics of classical music performance information, and designed metadata elements. We then presented the design and implementation direction of search and facet navigation, as well as user participation and communication functions. Lastly, we developed a prototype that implemented the interfaces for these functions.

29

A Study on Performance Factors of Automatic Classification Based on Machine Learning

김판준(신라대학교) 2016, Vol.33, No.2, pp.33-59 https://doi.org/10.3743/KOSIM.2016.33.2.033

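Abstract (translated from Korean)

Using a document collection of Korean conference papers, this study examined the factors that affect the performance of automatic classification based on machine learning. In particular, using the Rocchio algorithm, which is easy to implement and fast to compute, it examined through a range of experiments the characteristics of key factors, namely classifier generation method, training set size, weighting scheme, and category assignment method, in terms of classification performance when automatically assigning subject categories to papers in the Proceedings of the Conference of the Korean Society for Information Management (『한국정보관리학회 학술대회 논문집』). The results showed that it is effective to apply the parameters (β, λ) and training set size (5 years or more) appropriately for the classification environment and the characteristics of the document collection, and that, at an equivalent level of performance, using a simpler single weighting scheme can raise classification efficiency. In addition, because multi-label classification, in which one or more categories are assigned to a given paper, matches the real environment for classifying Korean conference papers, an optimal classification model based on the characteristics of the key performance factors needs to be developed with this environment in mind.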

Abstract

This study examined the factors affecting the performance of automatic classification of domestic conference papers based on machine learning techniques. In particular, with respect to classification performance when automatically assigning class labels to papers in the Proceedings of the Conference of Korean Society for Information Management using the Rocchio algorithm, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, label assigning methods) through diversified experiments. Consequently, it is more effective to apply proper parameters (β, λ) and training set size (more than 5 years) according to the classification environment and the properties of the document set, and I found that, if the performance is equivalent, using simpler methods (single weighting schemes) makes classification more efficient. Also, because the classification of domestic papers corresponds to multi-label classification, which assigns more than one label to an article, it is necessary to develop an optimal classification model based on the characteristics of the key factors in consideration of this environment.
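A Rocchio-style classifier with threshold-based multi-label assignment can be sketched as below. This is a generic textbook form, not the study's configuration: the abstract does not define its parameters (β, λ), so the sketch uses the conventional β (weight on positive examples) and γ (weight on negative examples) with illustrative values, and the toy term vectors, category names, and similarity threshold are all hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length term-weight vectors."""
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def rocchio_prototype(positives, negatives, beta=16.0, gamma=4.0):
    """Class prototype: beta * positive centroid - gamma * negative centroid.

    Negative components are clipped to zero, a common convention.
    beta and gamma here are illustrative values, not the paper's settings.
    """
    pos = centroid(positives)
    neg = centroid(negatives)
    return [max(0.0, beta * p - gamma * n) for p, n in zip(pos, neg)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_labels(doc, prototypes, threshold=0.5):
    """Multi-label assignment: every category whose prototype is similar enough."""
    return sorted(label for label, proto in prototypes.items()
                  if cosine(doc, proto) >= threshold)


# Hypothetical training vectors over a 3-term vocabulary.
training = {
    "retrieval": [[2, 1, 0], [3, 0, 0]],
    "classification": [[0, 1, 3], [0, 0, 2]],
}

prototypes = {}
for label, positives in training.items():
    negatives = [v for other, vs in training.items() if other != label for v in vs]
    prototypes[label] = rocchio_prototype(positives, negatives)

print(assign_labels([1, 1, 0], prototypes))
print(assign_labels([1, 0, 1], prototypes))
```

The threshold-based assignment is one simple way to let a paper receive more than one category, which is the multi-label setting the abstract argues for; assigning only the single best-scoring category would be the single-label alternative.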

30

A Performance Comparison of Automatic Classification by Word Embeddings of Book Titles

이용구(경북대학교 문헌정보학과) 2023, Vol.40, No.4, pp.307-327 https://doi.org/10.3743/KOSIM.2023.40.4.307

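Abstract (translated from Korean)

To analyze the effect of word embedding on titles, which are short texts, this study used the Word2vec, GloVe, and fastText models to generate embedding vectors from monograph titles and applied them as classification features in automatic classification. The classifier was the k-nearest neighbors (kNN) algorithm, and the categories for automatic classification were based on the divisions of the DDC 300 class that libraries had assigned to the books. In the automatic classification experiment applying word embedding to titles, the Skip-gram models of Word2vec and fastText outperformed TF-IDF features with the kNN classifier. In experiments optimizing various hyperparameters of the three models, the Skip-gram model of fastText performed well overall. In particular, this model's performance improved with hierarchical softmax and with larger embedding dimensions. In terms of performance, fastText can generate embeddings for substrings or subwords using the n-gram approach, which was found to raise recall. In contrast, the Skip-gram model of Word2vec performed well mainly at a low dimension (size 300) and small negative sampling sizes (3 or 5).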

Abstract

To analyze the impact of word embedding on book titles, this study used word embedding models (Word2vec, GloVe, fastText) to generate embedding vectors from book titles. These vectors were then used as classification features for automatic classification. The classifier was the k-nearest neighbors (kNN) algorithm, and the categories for automatic classification were based on the DDC (Dewey Decimal Classification) main class 300 assigned by libraries to books. In the automatic classification experiment applying word embeddings to book titles, the Skip-gram architectures of Word2vec and fastText performed better with the kNN classifier than TF-IDF features did. In the optimization of various hyperparameters across the three models, the Skip-gram architecture of the fastText model demonstrated good performance overall. Specifically, this model performed better with hierarchical softmax and with larger embedding dimensions. From a performance perspective, fastText can generate embeddings for substrings or subwords using the n-gram method, which was shown to increase recall. In contrast, the Skip-gram architecture of the Word2vec model generally performed well at low dimensions (size 300) and with small negative sampling sizes (3 or 5).
