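Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)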

1

한나은 (Korea Institute of Science and Technology Information) 2023, Vol.40, No.1, pp.51-71 https://doi.org/10.3743/KOSIM.2023.40.1.051

Abstract

This study analyzed public (government) data quality management models, big data quality management models, and data lifecycle models for research data management in order to identify the components common to them. Quality management models are built and proposed either along the data lifecycle or on the basis of the PDCA model, depending on the characteristics of the target data that is the object of quality management, and they commonly include four components: planning, collection and construction, operation and utilization, and preservation and disposal. Based on this analysis, the study proposed a process model for research data quality management. In particular, it discussed the quality management that a research data service platform should perform across the series of processes from collecting data to serving it, divided into three stages: planning, construction and operation, and utilization. The study is significant in that it provides a knowledge base for implementing research data quality management.

2

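A Study on Considerations in Implementing Research Data Management Services

김성훈 (Department of Library and Information Science, Sungkyunkwan University) ; 오삼균 (Department of Library and Information Science, Sungkyunkwan University) 2018, Vol.35, No.2, pp.141-165 https://doi.org/10.3743/KOSIM.2018.35.2.141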

Abstract

The purpose of this study is to derive the factors that should be considered for research data management (RDM) services to be implemented successfully. The study first identified the areas of RDM services from prior research. It then collected answers to questions about research data services by e-mail from eight staff members responsible for RDM services at six university libraries and one institution in the United States, Germany, and Australia. Because the resulting considerations originated in overseas cases, domestic RDM service experts reviewed whether they could be applied in Korea. The service areas were analyzed in nine categories. The considerations derived for building RDM services are: linking research services with RDM services; agreements at the national, university, and institutional levels; who enters metadata and which elements are required; strategies for staff specialization; selection of major service areas through user needs analysis; effective linkage between research data and research outputs; and close cooperation with users and related organizations.

3

A Schema Class Inheritance Model for Research Data Management and Search

김선태 (Korea Institute of Science and Technology Information) 2014, Vol.31, No.2, pp.41-56 https://doi.org/10.3743/KOSIM.2014.31.2.041

Abstract

As the recognition that research data is a national asset spreads, the need to manage and reuse raw data has become an issue. For the systematic management of data, this study proposed a metadata design model based on schema class inheritance and an integrated metadata search model over the schema objects created through inheritance. The data architecture was designed so that each schema object inheriting from a schema class has a 1:1 relation to a data collection. To validate the proposed model, the study demonstrated that a virtual schema class and the objects inheriting from it can be implemented in a system. The proposed schema class inheritance and integrated search model overcomes the weak points of the commonly used 'top-down hierarchy model' for metadata schema design, and it could be used to manage data produced by government agencies independently.
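The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea: a base schema class whose elements are inherited by schema objects, each bound to exactly one data collection, with a search that runs over the inherited elements. All names here (`SchemaClass`, `SchemaObject`, `integrated_search`, the element names, and the sample collections) are hypothetical illustrations and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SchemaClass:
    """Base metadata schema holding the elements shared by all collections (hypothetical)."""
    name: str
    elements: tuple[str, ...]


@dataclass
class SchemaObject:
    """A schema derived from a SchemaClass and bound to exactly one data collection."""
    parent: SchemaClass
    collection: str
    extra_elements: tuple[str, ...] = ()
    records: list[dict[str, str]] = field(default_factory=list)

    @property
    def elements(self) -> tuple[str, ...]:
        # Inherited elements first, then the collection-specific ones.
        return self.parent.elements + self.extra_elements

    def add_record(self, record: dict[str, str]) -> None:
        # Reject metadata elements that this schema object does not define.
        unknown = set(record) - set(self.elements)
        if unknown:
            raise ValueError(f"elements not in schema: {sorted(unknown)}")
        self.records.append(record)


def integrated_search(objects: list[SchemaObject], element: str, term: str):
    """Search one inherited element across every schema object (case-insensitive)."""
    hits = []
    for obj in objects:
        if element not in obj.parent.elements:
            continue  # only elements inherited from the schema class are searchable here
        for rec in obj.records:
            if term.lower() in rec.get(element, "").lower():
                hits.append((obj.collection, rec))
    return hits


# Usage with made-up collections and records.
base = SchemaClass("core", ("title", "creator", "date"))
climate = SchemaObject(base, "climate-observations", ("station",))
survey = SchemaObject(base, "household-survey", ("sample_size",))
climate.add_record({"title": "Rainfall 2013", "creator": "Agency A", "station": "S-01"})
survey.add_record({"title": "Rainfall perception survey", "creator": "Agency B", "sample_size": "500"})
results = integrated_search([climate, survey], "title", "rainfall")
```

Because every schema object shares the inherited elements, one query can span collections whose collection-specific elements differ, which is one plausible reading of how such a design avoids a single top-down hierarchy.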

4

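A Study on University Constituents' Perceptions and Experiences of Research Data Management

채현수 (Yonsei University) ; 전정현 (Department of Library and Information Science, Yonsei University) ; 김기영 (Yonsei University) ; 이지연 (Yonsei University) 2021, Vol.38, No.4, pp.173-198 https://doi.org/10.3743/KOSIM.2021.38.4.173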

Abstract

This study aims to understand university constituents' perceptions and experiences of research data management and sharing, and to explore the key factors to consider in establishing effective research data management plans for them. The major issues in research data management and sharing were summarized from a literature review, and on that basis a survey asked university constituents about their perceptions and experiences of research data management and sharing. The survey results were synthesized into implications in three areas: perceptions of research data, perceptions and experiences of research data management, and perceptions and experiences of sharing and opening research data. The study is significant in that it lays the foundation for the long-term development of research data management policies and services.

5

Current Status and Challenges of Open Data Policy in the Digital Age

신은자 (Sejong University) 2015, Vol.32, No.3, pp.49-68 https://doi.org/10.3743/KOSIM.2015.32.3.049

Abstract

In the past there were few practical ways to share research data, even for those who agreed with the idea of open data, but information technology has now made it easy to share research data in digital form. Nevertheless, many researchers are concerned about the side effects of open data and the burden of the additional work it requires, and other problems also remain to be solved, so open data is not practiced as actively as expected. It is actively pursued in a few disciplines such as earth science and meteorology, while other disciplines appear to show little interest. This study found that overseas learned societies, non-profit organizations, universities, and research funders promote open data in pursuit of the public interest, whereas major publishers promote it as a complementary measure for rigorous peer review, so that other researchers can verify the data. Open data is important because it leads to follow-up studies and serves as a stepping stone for scholarly progress, and it is clearly the direction to move in. Korea should therefore examine overseas cases thoroughly and reflect them in policy, and researchers, universities, academic libraries, governments, and other stakeholders should take an interest in the need for open data and cooperate more actively. The study describes specific ways to do so.

6

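A Study on Research Data Management in the Field of Condensed Matter Physics

김성욱 (Master's student, Department of Library and Information Science, Jeonbuk National University) ; 김선태 (Department of Library and Information Science, Jeonbuk National University) 2020, Vol.37, No.3, pp.77-106 https://doi.org/10.3743/KOSIM.2020.37.3.077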

Abstract

This study proposed improvements for systematically managing research data in condensed matter physics, the field in which interdisciplinary research is most active and the potential for application is highest. To this end, a questionnaire was constructed based on the Data Asset Framework (DAF), a research data management tool, and the FAIR principles for data sharing and reuse, and the current state of research data management in condensed matter physics was collected from 14 researchers. The collected data covered the characteristics and basic information of the responding researchers, data preservation and management, and data sharing and access. Analysis of the responses yielded nine problems concerning the characteristics of research data in condensed matter physics, data collection and production, data preservation and management, and data sharing and access, and the study suggested improvements for the problems identified in each aspect.

7

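Content Analysis of Research Data Management Training Programs at North American University Libraries: Focusing on Data Literacy Competencies

김지현 (Ewha Womans University) 2018, Vol.35, No.4, pp.7-36 https://doi.org/10.3743/KOSIM.2018.35.4.007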

Abstract

This study analyzed the content of research data management (RDM) training programs offered by 51 of the 121 university libraries in North America that provide RDM services, based on 12 data literacy competencies, and drew implications from the results. For the content analysis, 317 titles of classroom training programs and 42 top-level headings from the tables of contents of online tutorials were collected and coded according to the 12 data literacy competencies identified in previous studies. Among classroom training programs, those on data processing and analysis were the most numerous, while the largest number of libraries offered training on data management and organization. Data visualization and representation was the third most frequently covered competency among classroom programs. Programs on the remaining nine competencies were very few, which indicates that training content is concentrated on particular competencies. Five libraries offered self-developed online tutorials without classroom training. Analysis of their headings showed that they mainly covered data preservation, ethics and data citation, and data management and organization, which differs from the competencies emphasized in classroom programs. For RDM training programs to be effective, libraries need to understand and support not only the competencies that university librarians have traditionally taught and emphasized but also the data literacy competencies researchers need to produce research results, such as data processing and analysis and data visualization and representation. Educational resources that support continuing education for librarians involved in RDM services also need to be developed.

8

A Study on a Data Quality Management Maturity Model

김찬수 ; 박주석 2003, Vol.20, No.4, pp.249-275 https://doi.org/10.3743/KOSIM.2003.20.4.249

Abstract

For companies competing in today's information society, poor data quality acts as a negative factor that lowers competitiveness and generates new costs. Many previous studies have addressed this problem, but they have mainly examined the quality of data values and the quality of data services, which are result-oriented, phenomenal quality concepts. In contrast, this study examined the structural quality of data, a causal quality concept, from the viewpoint of metadata management, and on that basis presented a data quality management maturity model that incorporates a management perspective for assessment and improvement. To validate the model, the study empirically verified that the level of data quality rises as the data quality management stage matures.

9

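A Study on Research Data Managers' Perceptions for Establishing the Korea Research Data Commons System

박성은 (Senior Engineer, Research Data Sharing Center, Korea Institute of Science and Technology Information) ; 이미경 (Principal Researcher, Research Data Sharing Center, Korea Institute of Science and Technology Information) ; 조민희 (Principal Researcher, Research Data Sharing Center, Korea Institute of Science and Technology Information) ; 송사광 (Principal Researcher, Research Data Sharing Center, Korea Institute of Science and Technology Information; Professor, Department of Applied AI, UST) ; 김다솔 (Engineer, Research Data Sharing Center, Korea Institute of Science and Technology Information) ; 임형준 (Director, Research Data Sharing Center, Korea Institute of Science and Technology Information) 2024, Vol.41, No.1, pp.465-486 https://doi.org/10.3743/KOSIM.2024.41.1.465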

Abstract

The purpose of this study is to identify the current status of infrastructure and services for analyzing research data, and to investigate perceptions regarding the establishment of the KRDC system, among research data managers at government-funded research institutes under the National Research Council for Science and Technology (NST), who will be the actual users of the Korea Research Data Commons (KRDC) being developed by the Korea Institute of Science and Technology Information (KISTI). A survey was conducted of the 24 government-funded research institutes other than KISTI, and research data managers from 9 of the 15 responding institutes that agreed to follow-up interviews were interviewed. The survey showed that most institutes provide related services, and that willingness was high both to adopt an integrated analysis framework for using research data and to provide a system in which externally released analysis software can be used. However, the follow-up interviews showed that very few institutes make their analysis services available externally. Taken together, the findings indicate that there is demand to use analysis infrastructure and services if they are provided through the framework, but that institutes find it difficult to open and share the analysis resources they hold. Because sharing analysis infrastructure and services in the research field is essential to establishing the KRDC system, a change in perception in the research field and, further, institutional change are needed, as are policies that take into account the system convenience, security, and reward schemes raised in the follow-up interviews.

10

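A Proposal of Quality Evaluation Criteria for Large Language Models from a Research Data Perspective

한나은 (Korea Institute of Science and Technology Information) ; 서수정 (Korea Institute of Science and Technology Information) ; 엄정호 (Korea Institute of Science and Technology Information) 2023, Vol.40, No.3, pp.77-98 https://doi.org/10.3743/KOSIM.2023.40.3.077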

Abstract

Large language models (LLMs) are becoming the major trend in natural language processing. These models are built on research data, but information such as the types of research data used and the limitations and risks of using it is unknown. Focusing on the data quality of models that use research data as their main pre-training data, such as LLaMA and LLaMA-based models, this study analyzed current evaluation criteria and proposed quality evaluation criteria from the perspective of research data. The discussion centers on validity, functionality, and reliability among the factors of data quality evaluation in Data Quality Management (DQM), and LLaMA, Alpaca (Stanford), Vicuna (Large Model Systems Organization), and ChatGPT (OpenAI) were compared to understand the characteristics and limitations of LLMs. To analyze the evaluation criteria now widely applied to LLMs, the study examined those of Holistic Evaluation of Language Models (HELM) and discussed their limitations. On this basis, it presented quality evaluation criteria for LLMs that use research data as their main pre-training data and discussed future directions for development, thereby providing a knowledge base for the future development of LLMs.
