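Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)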

61

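An Integrated Study on Semi-Automatic Training Set Construction and Automatic Classification for Academic Literature in the Technical Sciences

김선우 (Department of Library and Information Science, Kyonggi University); 고건우 (Department of Library and Information Science, Kyonggi University); 최원준 (Content Curation Center, Korea Institute of Science and Technology Information); 정희석 (Content Curation Center, Korea Institute of Science and Technology Information); 윤화묵 (Content Curation Center, Korea Institute of Science and Technology Information); 최성필 (Kyonggi University). 2018, Vol. 35, No. 4, pp. 141-164. https://doi.org/10.3743/KOSIM.2018.35.4.141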

Abstract

The volume of academic literature has grown rapidly and interdisciplinary research has become more active, making it difficult for researchers to analyze trends in prior work. Addressing this requires classification information at the level of individual papers, but no academic database in Korea provides it. This study therefore proposes an automatic classification system that can classify Korean academic literature into multiple classes. We first collected academic documents in the technical sciences written in Korean and used K-Means clustering to map them to the divisions of DDC class 600, building a training set that supports multiple classification. After excluding records with missing metadata, the resulting training set comprised 63,915 Korean documents in the technical sciences. Using this set, we implemented and trained a deep-learning-based automatic classification engine for academic documents. In an evaluation on a manually built test set, the engine achieved 78.32% accuracy and an F1 score of 72.45% on multiple classification.

62

An Empirical Study on Differences in Perception of the Information Management System in Korean University Libraries

박재용 (Silla University). 2006, Vol. 23, No. 1, pp. 139-158. https://doi.org/10.3743/KOSIM.2006.23.1.139

Abstract

University libraries, as part of the university organization, increasingly need management innovation through an information management system. This study surveyed perceptions of the "information management system" (IMS) in university libraries, focusing on differences in perception between two-year colleges and four-year universities. The survey sample comprised 67 responses, and most respondents did not regard the introduction of an IMS as important. In a one-way analysis of variance (ANOVA), the differences concerning the timing of and the need for IMS introduction were not significant at the 0.05 level (F=0.469, p=0.497 and F=2.410, p=0.125), whereas the difference concerning the need for education was significant at the 0.01 level (F=7.470, p=0.008). The results provide standardized and consistent baseline data for university library policy on introducing an IMS. By presenting the points university libraries should consider when introducing an IMS, the study also explores a new direction for library management.
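
The F values reported above come from a one-way ANOVA. As a minimal sketch of how such a statistic is computed, the following uses made-up response scores for two groups; the function name `one_way_anova_f` and the sample data are hypothetical and are not taken from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of numeric observations, one list per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical survey scores (illustrative only, not the study's data).
two_year = [1, 2, 3]
four_year = [2, 3, 4]

f_stat, df_b, df_w = one_way_anova_f([two_year, four_year])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The p-value is then read from the F distribution with these degrees of freedom; a result is called significant when p falls below the chosen level, which is how the abstract separates the two non-significant items from the significant one.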

63

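A Study on a Paper Recommendation System Applying User Profiles Computed with TPIPF

장령령 (Department of Library and Information Science, Chonnam National University); 장우권 (Chonnam National University). 2016, Vol. 33, No. 1, pp. 317-336. https://doi.org/10.3743/KOSIM.2016.33.1.317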

Abstract

With today's explosive growth of information, users must spend enormous time and effort to find what they want. To address this problem, paper recommendation systems that analyze users' information needs and recommend suitable papers have emerged. Most of them, however, overlook the user profile, which is the core of such a system. This study therefore proposes computing the user profile, which determines the performance of a paper recommendation system, with a new measure, TPIPF (Topic Proportion-Inverse Paper Frequency), instead of the conventional mean. Both the proposed and the conventional method were applied to a paper recommendation system, and their performance was compared in experiments on public data from CiteULike, an online reference manager. The system using the proposed TPIPF method performed better.

64

An Analysis of the Characteristics of Copyright Management in Korean Scholarly Journals

정경희 (Hansung University); 김규환 (Jeonju University). 2016, Vol. 33, No. 4, pp. 269-291. https://doi.org/10.3743/KOSIM.2016.33.4.269

Abstract

This study analyzed the copyright-related documents of 1,890 journals listed by the National Research Foundation of Korea (한국연구재단) to identify the current state and problems of copyright management in Korean scholarly journals. The findings are as follows. First, 32.6% of the journals gave no notice about copyright ownership. Second, of the 1,141 journals that gave notice through "regulations", 77.1% did not specify which rights were to be transferred. Third, among the journals whose publisher holds the copyright (61%), very few stated what uses are permitted to authors. To resolve these problems, journal publishers should clearly define the purpose and method of distributing articles, and then establish and publish a copyright policy that fits them.

65

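A Study on the Disposal of Monographs in University Library Collections

이용민 (Department of Library and Information Science, Graduate School, Yonsei University); 이지연 (Department of Library and Information Science, Yonsei University). 2021, Vol. 38, No. 1, pp. 71-86. https://doi.org/10.3743/KOSIM.2021.38.1.071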

Abstract

This study addresses the steadily increasing disposal of monographs in university libraries by analyzing disposal over the past ten years and interviewing librarians to identify the related issues. The analysis used disposal data for 2010 through 2019 provided by the academic information system of the Korea Education and Research Information Service. Disposal was largest among university libraries holding more than 2 million volumes, and in terms of collection composition the decline in duplicate copies of monographs was most pronounced. Interviews with librarians at institutions that had carried out disposal three or more times, including at least one mass disposal, in the past ten years showed that materials still worth using were among the many duplicate monographs discarded during mass disposal. The librarians stressed the need to digitize materials with valuable content and also pointed to the constraints that copyright law places on doing so. The study concludes that disposal can change collection composition in the long run and suggests that university libraries pay attention to these changes from a broad perspective. It also proposes efforts on several fronts, such as assessing the value of discarded monographs and building them into a digital collection, and retaining the number of duplicate copies held as usage rights for the digital collection.

66

A Study on Using Unlabeled Documents in Automatic Classification Based on Inter-Document Similarity

김판준 (Silla University); 이재윤 (Kyonggi University). 2007, Vol. 24, No. 1, pp. 251-271. https://doi.org/10.3743/KOSIM.2007.24.1.251

Abstract

This paper explores how a classifier that uses inter-document similarity as features can improve its performance by using unlabeled documents in training. Because obtaining a large set of manually labeled training documents is costly, while unlabeled documents are readily available in large quantities, using unlabeled documents is of practical importance. Most semi-supervised learning algorithms that use unlabeled documents consist of two stages: first, a classifier trained on the manually labeled documents classifies the unlabeled documents; second, a new classifier is trained on both the manually and the automatically labeled documents. Considering the setting in which document similarity features are applied, this paper examines two semi-supervised learning algorithms. The one-step method is simple because it uses unlabeled documents only to generate document similarity features; the two-step method uses them both to generate those features and as training examples. In experiments with support vector machines and a naive Bayes classifier, both semi-supervised methods outperformed supervised learning that uses only labeled data. Taking execution efficiency into account, the proposed one-step method is a good way to improve classification performance with unlabeled documents.
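
The general two-stage procedure described above (label the unlabeled documents with a first classifier, then retrain on everything) can be sketched as follows. This is only an illustration under stated assumptions: it swaps in a simple nearest-centroid classifier in place of the support vector machine and naive Bayes classifiers used in the paper, works on raw toy vectors rather than document similarity features, and all function names and data are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train(examples):
    """Train a nearest-centroid model from (vector, label) pairs."""
    by_label = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, vec):
    """Return the label whose centroid is closest to `vec`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: sq_dist(model[label], vec))


def two_stage_semi_supervised(labeled, unlabeled):
    # Stage 1: train on manually labeled data, then label the unlabeled data.
    first_model = train(labeled)
    auto_labeled = [(vec, predict(first_model, vec)) for vec in unlabeled]
    # Stage 2: retrain on manually and automatically labeled data together.
    return train(labeled + auto_labeled)


# Toy data (illustrative only): two labeled and four unlabeled "documents".
labeled = [([0.0, 0.0], "A"), ([10.0, 10.0], "B")]
unlabeled = [[1.0, 1.0], [2.0, 0.0], [9.0, 9.0], [8.0, 10.0]]

model = two_stage_semi_supervised(labeled, unlabeled)
print(predict(model, [3.0, 3.0]), predict(model, [7.0, 7.0]))
```

The one-step method in the abstract differs from this sketch: it never adds automatically labeled documents as training examples and uses the unlabeled documents only when building the similarity features.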

67

Distinguishing Influential Journals from Mass-Producing Journals by Improving the Perfectionism Index (PI)

이재윤 (Myongji University). 2019, Vol. 36, No. 2, pp. 201-222. https://doi.org/10.3743/KOSIM.2019.36.2.201

Abstract

The recently proposed Perfectionism Index (PI) is an indicator that distinguishes influential researchers from mass producers. This study proposes the Near Perfectionism Index (NPI), a new indicator that improves on PI. In particular, NPI corrects PI's practice of penalizing all low-cited papers by a single uniform criterion, regardless of publication date or other factors. NPI applies the penalty to the tail complement area while taking the citation distribution curve into account, which prevents an increase in the h-index from counting against the influence indicator. Applied on a trial basis to library and information science journals in Web of Science, NPI successfully distinguished influential journals from mass-producing journals, which the h-index and the mean citation count could not.
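
To make the role of the h-index concrete, here is a minimal sketch of its standard definition (the largest h such that h papers each have at least h citations). It does not implement PI or NPI, whose formulas are not given in this abstract; the citation counts below are hypothetical.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts (illustrative only).
few_strong_papers = [30, 25, 20, 4, 3]
many_weak_papers = [5, 4, 4, 4, 1, 1, 0, 0, 0, 0, 0, 0]

for name, cites in [("few strong", few_strong_papers),
                    ("many weak", many_weak_papers)]:
    h = h_index(cites)
    tail = len(cites) - h  # papers outside the h-core
    print(f"{name}: h = {h}, papers in tail = {tail}")
```

Both toy profiles have the same h-index although one has a long tail of rarely cited papers, which illustrates why an h-index alone may not separate the two kinds of source and why tail-sensitive indicators such as PI and NPI were proposed.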

68

A Study on the Content and Methodology of Evidence-Based Library and Information Practice

표순희 (Ewha Womans University). 2009, Vol. 26, No. 1, pp. 351-370. https://doi.org/10.3743/KOSIM.2009.26.1.351

Abstract

This is a foundational study that analyzes the concepts, research methods, and trends of evidence-based library and information practice (EBLIP) in order to examine its applicability in Korea. EBLIP is a movement to narrow the inherent gap between research and practice by promoting the use of reliable research findings, in conjunction with a pragmatic perspective developed from working experience in librarianship, to solve problems and improve performance in practice. It was first practiced mainly in medical libraries and has gradually spread to academic, special, and school libraries. Its research topics, once limited to user studies and evaluation, now extend to all areas of library service, including management, collections, and services. Carrying out EBLIP involves searching for, selecting, appraising, and applying research findings to solve a specific problem; the process can be described in stages: formulate a question, find evidence, critically appraise the evidence, apply the results of the appraisal, evaluate the change, and redefine the problem. By standardizing the production of the systematic review, which is the highest level of evidence, EBLIP both evaluates existing research systematically and offers a new method of research synthesis.

69

A Study on a Data Quality Management Maturity Model

김찬수; 박주석. 2003, Vol. 20, No. 4, pp. 249-275. https://doi.org/10.3743/KOSIM.2003.20.4.249

Abstract

For companies competing in today's information society, deteriorating data quality acts as a negative factor that lowers competitiveness and generates new costs. Many earlier studies have addressed this problem, focusing mainly on the quality of data values and the quality of data services, which are outcome-oriented, phenomenal notions of quality. This study instead examines the structural quality of data, a causal notion of data quality, from the standpoint of metadata management, and on that basis presents a data quality management maturity model that incorporates a management perspective for assessment and improvement. To validate the proposed model, the study empirically verified that the level of data quality rises as the data quality management stage matures.

70

A Study on the Future Direction of Government Electronic Document Interchange

송병호 (Sangmyung University). 2004, Vol. 21, No. 3, pp. 185-202. https://doi.org/10.3743/KOSIM.2004.21.3.185

Abstract

Electronic documents have both a documentary aspect, readable by humans, and an electronic aspect, which systems can interpret and process automatically, and this combination gives them high usability. If usability is the reason for using electronic documents, ways should be sought to exploit this characteristic for effective and efficient electronic document interchange (EDI). Government EDI in Korea has not yet moved beyond a paper-centered way of thinking and a ministry-by-ministry, task-centered viewpoint, so it does not take full advantage of this strength. This paper proposes a direction for EDI in order to increase the usefulness of the large volume of electronic documents that will be produced in the future. It first defines the concept of an electronic document, then explains the reasons for and limitations of representing document structure with XML and the problems of government EDI. It then proposes approaches to representing information, composing documents, and managing standards.
