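Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)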

1

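A Study on Metadata Interoperability between the National Research Data Platform and the Bio Research Data Platform

박성은 (Ph.D. program, Department of Library and Information Science, Sungkyunkwan University) ; 고영만 (Department of Library and Information Science, Sungkyunkwan University) 2022, Vol.39, No.2, pp.159-202 https://doi.org/10.3743/KOSIM.2022.39.2.159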

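Abstract (translated from Korean)

The ‘National Research Data Platform’ and the ‘Bio Research Data Platform’ were built relatively recently, and each is actively developing its own ecosystem. Because they were built independently on different metadata standards, interoperability problems may arise in the future. The purpose of this study is to map the metadata elements of each platform, validate the mapping, and thereby propose a basis for securing interoperability. To this end, the metadata standards of each platform were analyzed and crosswalk targets were selected and mapped; experts in the bio field then verified the suitability of the mapped elements and recommended more appropriate mapping elements, from which metadata elements for datasets and files were derived. The results confirmed that the metadata of the two platforms can be semantically linked and that there is a basis for securing interoperability.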

Abstract

The ‘National Research Data Platform’ and the ‘Bio Research Data Platform’ were built recently, and each is actively creating its own ecosystem. Because they were built independently on different metadata standards, interoperability issues may arise in the future. The purpose of this study is to propose a basis for metadata interoperability between the two platforms. To this end, the metadata standards of each platform were analyzed, crosswalk targets were selected and mapped, and the suitability of the mapped elements was verified by experts in the bio field, who also recommended more appropriate mapping elements; from these, metadata elements for datasets and files were derived. The results confirmed that the metadata of the two platforms can be semantically linked and that there is a basis for securing interoperability.

2

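A Study on Factors Affecting Biotechnology Researchers’ Intention to Share Research Data: Focusing on the Moderating Effect of Academic Reputation

김선 (Master’s graduate, Department of Library and Information Science, Sungkyunkwan University) 2022, Vol.39, No.1, pp.45-68 https://doi.org/10.3743/KOSIM.2022.39.1.045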

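Abstract (translated from Korean)

This study aims to understand researchers’ data sharing behavior and examined the factors affecting data sharing intention among researchers and research students in the biotechnology field in Korea. A total of 411 valid responses collected by e-mail were analyzed using PLS-SEM. The results are as follows. First, data sharing norms and academic reciprocity had a direct positive effect on data sharing intention. Second, community trust had a significant effect on data sharing intention when academic reciprocity mediated the relationship between community trust and data sharing intention. Third, academic reputation showed a significant moderating effect on the relationship between data sharing norms and academic reciprocity, on the relationship between data sharing norms and data sharing intention, and on the relationship between academic reciprocity and data sharing intention. The study is significant in that it applied Ostrom’s theory of collective action to the factors affecting the data sharing intention of biotechnology researchers in Korea, and in that it found a moderating effect of academic reputation within the relationships among the variables. These results suggest the need to develop an academic reward system as a way to promote researchers’ data sharing behavior.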

Abstract

The objective of this study is to investigate the factors that influence biotechnology scientists’ data sharing intention, drawing on Ostrom’s theory of collective action. The target population comprises scientists and students in the biotechnology field in South Korea. A total of 411 responses collected by e-mail were used for the final data analysis. The findings are as follows. First, the norm of data sharing and academic reciprocity were found to have significant direct positive influences on data sharing intention. Second, perceived community trust was found to have a significant positive influence on data sharing intention when academic reciprocity was the mediator. Third, academic reputation showed moderating effects on the relationship between the norm of data sharing and academic reciprocity, and between the norm of data sharing and data sharing intention. These findings show that researchers’ data sharing behaviors can be approached through the mechanisms of trust, norms, reciprocity, and reputation, and they indicate the need to develop an academic reputation system to promote more data sharing by researchers.

3

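A Case Study on Applying AI-OCR to Convert Paper Records into Data

안세진 (Administration Division, Gimpo City) ; 황현호 (㈜악어디지털) ; 임진희 (Department of Policy Science, Ewha Womans University) 2022, Vol.39, No.3, pp.165-193 https://doi.org/10.3743/KOSIM.2022.39.3.165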

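Abstract (translated from Korean)

Digital technology can be said to be at the center of change in the modern work environment. In particular, in typical public institutions, which evidence their work through records produced in business management systems and document production systems, the records management system is the work environment itself. To respond proactively to the era of Fourth Industrial Revolution technologies and to innovate its work environment, Gimpo City applied to the 2021 Public Sector Cloud Leading Project of the National Information Society Agency (NIA), was selected as a leading institution, and received 330 million won in support to carry out a project to strengthen record search and use functions through public cloud-based AI-OCR. Through this project, non-electronic records, previously limited to searches dependent on standardized index values and to image viewing, were converted into data, and a recognition rate of 98% was achieved by applying the new AI-OCR technology. Because the use of digital technology in a public institution improved work efficiency and productivity, reduced development costs, and raised the level of records management services for internal and external users, this case study of combining new technology with records management aims to share a direction for raising the inherent expertise of the records management field and an example of work environment innovation.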

Abstract

Digital technology can be said to be at the center of change in the modern work environment. In particular, in typical public institutions, which evidence their work with records produced by business management systems and document production systems, the records management system is the work environment itself. To respond proactively to the era of Fourth Industrial Revolution technologies, Gimpo City applied to the 2021 public cloud leading project of the National Information Society Agency (NIA) and, with 330 million won in support, carried out a project to enhance record search and use through public cloud-based AI-OCR. Through this project, non-electronic records, previously limited to searches that depend on standardized index values and to image viewing, were converted into data. In addition, a 98% recognition rate was achieved by applying a new technology called AI-OCR. Because digital technology improved work efficiency and productivity, reduced development costs, and raised the level of records management services for internal and external users, we share this case as a direction for enhancing expertise in records management and as an example of work environment innovation.

4

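An Analysis of Syllabi for Data Science-Related Courses: Focusing on ALA-Accredited Library and Information Science Programs

박형주 (Department of Library and Information Science, Chungnam National University) 2022, Vol.39, No.1, pp.119-143 https://doi.org/10.3743/KOSIM.2022.39.1.119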

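Abstract (translated from Korean)

This study investigated the content of data science-related courses offered by library and information science programs accredited by the American Library Association (ALA). The purpose of the study is to examine, through content analysis of syllabi, the course titles, course descriptions, learning objectives, and weekly topics covered in these courses. The required and elective data science-related courses in library and information science programs spanned the whole of data science, including introduction to data science, data mining, databases, data analysis, data visualization, data curation and management, machine learning, metadata, and computer programming. The results are expected to serve as basic data that can be a starting point for discussion when library and information science programs establish or revise data science curricula, and to be used to strengthen their operational capacity.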

Abstract

This preliminary study examined data science-related course syllabi in American Library Association (ALA)-accredited Library and Information Science (LIS) programs. The purpose of the study was to explore the content of these syllabi, namely course titles, course descriptions, learning outcomes, and weekly topics. LIS programs offer courses across data science, including introduction to data science, data mining, databases, data analysis, data visualization, data curation and management, machine learning, metadata, and computer programming. The study is intended to help instructors develop or revise course materials and improve course competencies related to data science in ALA-accredited LIS programs.

5

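Improvement Measures for SIARD_KR, a Transfer Tool for Administrative Information Datasets

변우영 (Department of Records and Information Management, Myongji University) ; 임진희 (Graduate School of Records and Information Science, Myongji University) 2022, Vol.39, No.1, pp.195-217 https://doi.org/10.3743/KOSIM.2022.39.1.195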

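Abstract (translated from Korean)

SIARD_KR is a tool for preserving administrative information datasets; it is a version of SIARD, a technology developed by the Swiss Federal Archives for the long-term preservation of relational database content, partially modified to suit conditions in Korea. Previous studies have focused on how well SIARD can extract all the data in a relational database without loss. However, not all of the data in a database is meaningful information, that is, an administrative information dataset. This paper therefore starts from the question of whether SIARD_KR reflects the characteristics of administrative information datasets. It seeks to determine whether SIARD_KR is not simply a tool for extracting data stored in a DB but can identify and extract only meaningful information, and whether that information can remain meaningful even when separated from the original system. The purpose of this paper is to analyze the structure of SIARD_KR, identify expected problems, and propose improvements.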

Abstract

SIARD_KR is a preservation tool for administrative information datasets. It is a partially modified version of SIARD, a technology developed by the Swiss Federal Archives for the long-term preservation of relational databases, adapted to suit Korea’s situation. Previous studies have focused on how well SIARD can extract all the data contained in a relational database without loss. However, not all data contained in a database is meaningful information, that is, an administrative information dataset. This paper therefore begins from the question of whether SIARD_KR reflects the characteristics of administrative information datasets. It examines whether SIARD_KR is more than a tool for extracting data stored in a DB: whether it can identify and extract only meaningful information, and whether that information remains meaningful once separated from the original system. The purpose of this paper is to analyze the structure of SIARD_KR, identify expected problems, and suggest improvement measures for them.

6

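Automatic Classification of Academic Documents Using a Deep Learning-Based BERT Model

김인후 (Graduate School, Department of Library and Information Science, Chung-Ang University) ; 김성희 (Department of Library and Information Science, Chung-Ang University) 2022, Vol.39, No.3, pp.293-310 https://doi.org/10.3743/KOSIM.2022.39.3.293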

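Abstract (translated from Korean)

This study automatically classified documents in the field of library and information science using a BERT model trained on Korean data, and analyzed its performance. To this end, abstract data from 5,357 papers in seven journals in the field were used to analyze and evaluate how automatic classification performance differs according to the size of the training data. Precision, recall, and the F-measure were used as performance measures. In the evaluation, subject areas with large amounts of high-quality data showed a high level of performance, with an F-measure of 90% or more. In contrast, when data quality was low, content was highly similar to that of other subject areas, and few features clearly distinguished the subject, a meaningfully high level of performance was not obtained. This research is expected to serve as basic data for demonstrating the potential of pre-trained models that can continue to be applied to academic literature in the future.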

Abstract

In this study, we automatically classified documents in the field of library and information science using KoBERT, a BERT-based model, and analyzed the classification performance. For this purpose, abstracts of 5,357 papers from 7 journals in the field were used to analyze and evaluate how automatic classification performance differs according to the size of the training data. Precision, recall, and the F-measure were used as performance measures. Subject areas with large amounts of high-quality data showed a high level of performance, with an F-measure of 90% or more. On the other hand, when data quality was low, similarity with other subject areas was high, and few features clearly distinguished the subject, a meaningfully high level of performance was not obtained. This study is expected to serve as basic data for suggesting the possibility of using a pre-trained model to automatically classify academic documents.
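
For readers unfamiliar with the evaluation measures named in this abstract, the following is a minimal sketch of how per-class precision, recall, and the F-measure (here the balanced F1) can be computed for single-label classification. It is not the authors' code; the function name `per_class_prf` and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class received a wrong document
            fn[truth] += 1   # true class missed one of its documents
    scores = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical subject labels, for illustration only
y_true = ["IR", "IR", "Archives", "Archives", "Bibliometrics"]
y_pred = ["IR", "Archives", "Archives", "Archives", "IR"]
scores = per_class_prf(y_true, y_pred)
```

In this toy example the class with no correct predictions scores zero on all three measures, which mirrors the abstract's observation that classes with few distinguishing features are hard to classify well.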

7

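An Analysis of Research Trends in Open Access through Topic Growth Analysis

정재민 (AccessON Development Team, Open Access Center, Korea Institute of Science and Technology Information) ; 김완종 (AccessON Development Team, Open Access Center, Korea Institute of Science and Technology Information) 2022, Vol.39, No.4, pp.75-97 https://doi.org/10.3743/KOSIM.2022.39.4.075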

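Abstract (translated from Korean)

International interest in, and the spread of, the open access paradigm continue as an alternative for solving the problems of the traditional scholarly communication system. However, efforts to identify global trends and growth in the open access field through data-based quantitative methods remain insufficient. This study applies topic modeling to scholarly article data in the open access field to identify detailed research topics, and fits growth curves to calculate the maturity and expected remaining life of each research topic. The study identified 14 topics related to open access, open data, and open collaboration, the three key elements of open science, and projected that the open access field will grow steadily for about the next 65 years. The results are expected to help researchers and policy decision-makers understand trends and growth in the open access field.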

Abstract

Global interest in the open access paradigm continues as an alternative for solving the problems of the traditional scholarly communication system. Nevertheless, there is still a lack of research that uses data-based quantitative methods to understand global research and growth trends in the field of open access. This study aims to identify which sub-fields exist in open access and to analyze how long each research field will continue to grow. To this end, topic modeling and growth curve analysis were applied to global academic papers in the field of open access. The study identified 14 research topics related to open access, open data, and open collaboration, the three key elements of open science, and projected that the field of open access will grow steadily over about the next 65 years. The results of this study are expected to support researchers and policymakers in understanding global research trends in open access.

8

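Measuring the Goodness of Fit of Link Reduction Algorithms for Mapping Intellectual Structures in Bibliometric Analysis

이재윤 (Department of Library and Information Science, Myongji University) 2022, Vol.39, No.2, pp.233-254 https://doi.org/10.3743/KOSIM.2022.39.2.233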

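Abstract (translated from Korean)

Link reduction algorithms such as the pathfinder network are widely used when weighted networks must be visualized for intellectual structure analysis. This study proposed NetRSQ as an indicator for measuring the goodness of fit of link reduction algorithms for network visualization. NetRSQ measures the goodness of fit of a network based on the rank correlation between inter-entity association data and path lengths in the generated network. To check the validity of NetRSQ, it was measured on a trial basis using data from a previous study that had qualitatively evaluated several network generation methods. The result confirmed that networks rated as higher in quality had higher NetRSQ values. In an experiment that applied four link reduction algorithms to 40 sets of bibliometric data and measured the quality of the results with NetRSQ, the network representation produced by a given algorithm was neither always of good quality nor always of poor quality. NetRSQ, as proposed in this study, can therefore be used to measure the quality of generated bibliometric networks and as a basis for selecting the optimal technique.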

Abstract

Link reduction algorithms such as the pathfinder network are widely used to overcome problems with the visualization of weighted networks for knowledge domain analysis. This study proposes NetRSQ, an indicator for measuring the goodness of fit of a link reduction algorithm for network visualization. NetRSQ measures the fitness of a network based on the rank correlation between the degree of association between entities and the path length in the generated network. The validity of NetRSQ was investigated with data from previous research that qualitatively evaluated several network generation algorithms. In this preliminary test, networks that had been rated as higher in quality also showed higher NetRSQ values. Four link reduction algorithms were then applied to 40 datasets from various domains, and the quality of the results was measured with NetRSQ. The test shows that no single link reduction algorithm performs better than the others in all cases. NetRSQ can therefore serve as a basis for selecting the best-fitting algorithm for the network visualization of intellectual structures.

9

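A Study on the Automatic Classification of Domestic Journal Articles through Feature Selection

김판준 (Department of Library and Information Science, Silla University) 2022, Vol.39, No.1, pp.69-90 https://doi.org/10.3743/KOSIM.2022.39.1.069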

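Abstract (translated from Korean)

This study sought an efficient way to assign standardized subject categories (controlled keywords) to individual journal articles, as basic data for identifying specific trends in domestic academic research, systematically supporting and evaluating R&D activities, and setting current and future research directions. To this end, in the process of automatically assigning the classification categories of the National Research Foundation of Korea’s Academic Research Classification Scheme (｢학술연구분야분류표｣) to domestic journal articles, multifaceted experiments were conducted on the major factors affecting automatic classification performance, focusing on feature selection techniques. The results showed that, in the automatic classification of domestic journal articles, which form an imbalanced real-world dataset, a fairly good level of performance can be expected using simpler classifiers and feature selection techniques and a relatively small training set.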

Abstract

To provide basic data for identifying specific trends in domestic academic research, systematically supporting and evaluating R&D activities, and setting current and future research directions, this study sought an efficient way to assign standardized subject categories (controlled keywords) to individual journal papers. To this end, various experiments were conducted on the major factors affecting automatic classification performance, focusing on feature selection techniques, in the process of automatically assigning the classification categories of the National Research Foundation of Korea’s Academic Research Classification Scheme to domestic journal papers. The results showed that, for the automatic classification of domestic journal papers, which form an imbalanced real-world dataset, a fairly good level of performance can be expected using simpler classifiers and feature selection techniques and relatively small training sets.

10

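A Study on the Development of a School Library Book Recommendation System Using Association Rules

임정훈 (Teacher, Daejeon Science High School) ; 조창제 (R&D Department, NeuroEars) ; 김종헌 (Teacher, Daejeon Science High School) 2022, Vol.39, No.3, pp.1-22 https://doi.org/10.3743/KOSIM.2022.39.3.001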

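Abstract (translated from Korean)

The purpose of this study is to propose a book recommendation system that can be used in school libraries. The system applies an association rule-based algorithm to DLS loan data and is designed to provide personalized book recommendation services to school library users. To this end, association rules based on the Apriori algorithm and betweenness centrality analysis were applied, and detailed functions such as descriptive statistics, association rule generation, student-centered recommendation, and book-centered recommendation were implemented. Opinions on using the system were then collected through in-depth interviews with teacher librarians. The interviews yielded opinions on the need for and difficulty of book recommendation, student responses, differences from existing recommendation methods and ways of using the system, and points for improvement, on the basis of which the following points for discussion are proposed. First, long-term loan data must be provided in order to understand the characteristics of individual schools. Second, ways of integrating data by region or by school characteristics need to be discussed. Third, a book recommendation system provided through the Comprehensive Support System for Reading Education (독서교육종합시스템) needs to be established. It is hoped that the proposals in this study will prompt further discussion of applying personalized recommendation systems in school libraries.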

Abstract

The purpose of this study is to propose a book recommendation system that can be used in school libraries. The book recommendation system applies an algorithm based on association rules to DLS lending data and is designed to provide personalized book recommendation services to school library users. For this purpose, association rules based on the Apriori algorithm and betweenness centrality analysis were applied, and detailed functions such as descriptive statistics, generation of association rules, student-centered recommendation, and book-centered recommendation were implemented. Subsequently, opinions on the use of the book recommendation system were collected through in-depth interviews with teacher librarians. The interviews yielded opinions on the necessity and difficulty of book recommendation, student responses, differences from existing recommendation methods, ways of using the system, and improvements, and on this basis the following points for discussion were proposed. First, it is necessary to provide long-term lending data to understand the characteristics of each school. Second, it is necessary to discuss how to integrate data by region or by school characteristics. Third, it is necessary to establish a book recommendation system provided by the Comprehensive Support System for Reading Education. Based on the proposals in this study, it is expected that further discussion will take place on applying a personalized recommendation system that can be used in school libraries.
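
To make the idea of association rules over lending data concrete, the sketch below computes support and confidence for rules between pairs of books. It is a simplified illustration, not the system described in the abstract: it covers only two-item rules rather than the full Apriori candidate generation, and the function name `pair_rules`, the thresholds, and the loan records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules for pairs of items.

    support    = share of transactions containing both items
    confidence = share of transactions with lhs that also contain rhs
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for basket in transactions:
        items = sorted(set(basket))          # ignore repeat loans in one basket
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue                          # prune infrequent pairs
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical loan histories, one list of borrowed titles per student
loans = [
    ["book_A", "book_B"],
    ["book_A", "book_B", "book_C"],
    ["book_A", "book_C"],
    ["book_B"],
    ["book_A", "book_B"],
]
rules = pair_rules(loans)
```

A rule such as `book_C -> book_A` can then drive a book-centered recommendation: a student who borrows the left-hand title is shown the right-hand title. With short loan histories few pairs pass the support threshold, which is consistent with the abstract's call for long-term lending data.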
