Journal of the Korean Society for Information Management, Korean Society for Information Management

1

A Study on the Current Status of Data Papers through an Analysis of the Journal Scientific Data

정은경 (Ewha Womans University) 2019, Vol.36, No.1, pp.117-135 https://doi.org/10.3743/KOSIM.2019.36.1.117

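Abstract (Korean, translated)

Data journals and data papers have emerged from the scholarly practices of data sharing and reuse in the open science paradigm and continue to grow. This paper analyzes a total of 713 papers published in Scientific Data, an influential multidisciplinary data journal, in terms of authors, citations, and subject areas. The results show that the authors' main subject areas are biotechnology and physics, and that the average number of co-authors is 12. Viewed as a network, co-authorship shows specific groups of researchers collaborating in a closed manner. The subject areas of the cited works do not differ greatly from those of the data paper authors, but the high share of citations to journals that deal mainly with methodology can be seen as a characteristic of data papers. A co-word analysis network built from the authors' keywords shows that the subject areas of the data papers center on biology, with more specific topics including marine ecology, cancer, genomes, databases, and temperature. These results show that, although Scientific Data is a multidisciplinary data journal, its content is concentrated in biotechnology, a field that began discussing data journal publication early on.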

Abstract

Data journals and data papers have grown and come to be considered an important scholarly practice in the open science paradigm, in the context of data sharing and data reuse. This study investigates a total of 713 data papers published in Scientific Data in terms of authors, citations, and subject areas. The findings show that the main subject areas of the authors are biotechnology and physics. The average number of co-authors is 12, and the co-authorship patterns form several closed sub-networks. The subject areas of the cited publications are highly similar to those of the data paper authors. However, the citation analysis indicates a considerable share of citations to journals specializing in methodology. The network built from authors’ keywords identifies more detailed areas such as marine ecology, cancer, genome, database, and temperature. This result indicates that biology-oriented subjects are the primary areas in the journal, although Scientific Data is categorized under multidisciplinary sciences in the Web of Science database.

2

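Identifying the Intellectual Structure of Data Papers Published in Web of Science Data Journals

정은경 (Professor, Department of Library and Information Science, College of Social Sciences, Ewha Womans University) 2020, Vol.37, No.1, pp.153-177 https://doi.org/10.3743/KOSIM.2020.37.1.153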

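Abstract (Korean, translated)

In the open science movement, data sharing and reuse are becoming important researcher activities. Among the many discussions of data sharing and reuse, the publication of data journals and data papers has produced visible results. Data journals are published in many disciplines, and the number of papers is steadily increasing. Unlike data itself, data papers give and receive citations, and so they form an intellectual structure of their own. To identify the intellectual structure that data journals and data papers form in the scholarly community, this study analyzed 14 data journals indexed in Web of Science, 6,086 data papers, and 84,908 cited references. Together with author information, co-citation analysis and bibliographic coupling analysis were visualized as networks to identify the detailed subject areas formed by data papers. The frequencies of authors, author affiliations, and countries show a pattern different from that of traditional journal articles. This can be interpreted as a result of data papers being published mainly by institutions and countries where data are easy to produce. In both the co-citation analysis and the bibliographic coupling analysis, analysis tools, databases, and genome composition emerged as the main detailed subject areas. The co-citation analysis formed nine clusters, in which the areas that emerged as specific subject fields were water quality and climate. The bibliographic coupling analysis consisted of 27 components, in which ocean and atmosphere were identified in addition to water quality and climate. Notably, subject areas in the social sciences also appeared.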

Abstract

In the context of open science, data sharing and reuse are becoming important activities for researchers. Among the discussions about data sharing and reuse, data journals and data papers show visible results. Data journals are published in many academic fields, and the number of papers is increasing. Unlike the data itself, data papers give and receive citations, thus creating their own intellectual structures. This study analyzed 14 data journals indexed by Web of Science, 6,086 data papers, and 84,908 cited references to examine the intellectual structure of data journals and data papers in the academic community. Along with author details, the co-citation analysis and bibliographic coupling analysis were visualized as networks to identify the detailed subject areas. The results show that the most frequent authors, affiliated institutions, and countries differ from those of traditional journal papers. This can be interpreted as a consequence of data papers being published mainly by those who can easily produce data. In both the co-citation and the bibliographic coupling analysis, analytical tools, databases, and genome composition were the main subtopic areas. The co-citation analysis resulted in nine clusters, with water quality and climate as the specific subject areas. The bibliographic coupling analysis consisted of a total of 27 components, and detailed subject areas such as ocean and atmosphere were identified in addition to water quality and climate. Notably, subject areas in the social sciences also emerged.
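The two citation-based measures named in this abstract can be illustrated with a minimal sketch. The data below are hypothetical toy values, not the study's 6,086 papers, and the function names are illustrative only. Co-citation counts how often two references are cited together by the same paper; bibliographic coupling counts how many references two citing papers share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of references it cites.
citations = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R1", "R2"},
    "P3": {"R2", "R3"},
}

def co_citation_counts(citations):
    """Count how often each pair of references is cited together by one paper."""
    counts = Counter()
    for refs in citations.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts

def bibliographic_coupling(citations):
    """Count the references shared by each pair of citing papers."""
    strengths = {}
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:
            strengths[(a, b)] = shared
    return strengths

co_cited = co_citation_counts(citations)
coupled = bibliographic_coupling(citations)
print(co_cited[("R1", "R2")])  # R1 and R2 are cited together by P1 and P2
print(coupled[("P1", "P2")])   # P1 and P2 share R1 and R2
```

Either set of pair counts can serve as edge weights for the kind of network visualization the abstract describes; the clustering step itself is not sketched here.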

3

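A Case Study of Applying AI-OCR to Convert Paper Records into Data

안세진 (Administration Division, Gimpo City) ; 황현호 (㈜악어디지털) ; 임진희 (Department of Policy Science, Ewha Womans University) 2022, Vol.39, No.3, pp.165-193 https://doi.org/10.3743/KOSIM.2022.39.3.165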

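Abstract (Korean, translated)

Digital technology can be said to be at the center of change in the modern work environment. In particular, in typical public institutions that document their work through records produced in business management systems and document production systems, the records management system is the work environment itself. To respond proactively to the era of Fourth Industrial Revolution technologies and to innovate its work environment, Gimpo City applied to the 2021 Public Sector Cloud Leading Project of the National Information Society Agency (NIA), was selected as a leading institution, and with 330 million won in support carried out a project to strengthen record search and use through public cloud-based AI-OCR. The project converted non-electronic records into data, moving beyond the earlier limits of search that depended on standardized index values and of image viewing only, and achieved a 98% recognition rate by applying AI-OCR. Because the use of digital technology in a public institution improved work efficiency and productivity, reduced development costs, and raised the level of records management services for internal and external users, this case study of combining new technology with records management aims to share a direction for strengthening the field's own expertise and an example of work environment innovation.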

Abstract

Digital technology can be said to be at the center of change in the modern work environment. In particular, in typical public institutions that document their work with records produced by business management systems and document production systems, the records management system is the work environment itself. To respond proactively to the era of Fourth Industrial Revolution technologies, Gimpo City applied for the 2021 public sector cloud leading project of the National Information Society Agency (NIA), was selected as a leading institution, and carried out a public cloud-based AI-OCR project to enhance record search and use with 330 million won in support. Through this project, non-electronic records were converted into data, overcoming the limitations of search that depends on standardized index values and of image-only viewing. In addition, a 98% recognition rate was achieved by applying AI-OCR. Since digital technology was used to improve work efficiency and productivity, reduce development costs, and raise the level of records management services for internal and external users, we would like to share a direction for enhancing expertise in records management and a case of work environment innovation.

4

A Study on Research Data Sharing in Library and Information Science

조재인 (Incheon National University) 2017, Vol.34, No.4, pp.59-79 https://doi.org/10.3743/KOSIM.2017.34.4.059

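Abstract (Korean, translated)

This study analyzed the types, subjects, and levels of openness of library and information science research data shared through Figshare, and statistically interpreted the characteristics of data with relatively high reusability. Dataset and paper were the most common data types, and open access and research data were the most common subject areas; nearly 70% of the research data were published in formats such as pdf that do not lend themselves to editing and reuse. An analysis of the relationship between the characteristics of research data and their degree of use showed that, by subject, the open access area, including APC (Article Processing Charge), was the most used, and that, by data type, paper was the most used.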

Abstract

This study analyzed the type, subject, and level of openness of research data in the field of library and information science shared through Figshare, and statistically analyzed the characteristics of data with relatively high reusability. The results showed that datasets and papers were the most common data types, that open access and research data were the most common subjects, and that nearly 70% of the data were published in formats such as pdf that are not easy to edit or reuse. An analysis of the relationship between the characteristics of research data and their degree of use found that, by subject, open access topics such as APC (Article Processing Charge) were the most used. By data type, however, gray literature such as papers was found to be used more than datasets.

5

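A Study on the Current Status of Data Availability Statements (DAS) by Researchers Affiliated with Korean Institutions: Focusing on the Journal PLOS ONE

안병군 (Korea Institute of Science and Technology Information) ; 변제연 (Department of Library and Information Science, Sungkyunkwan University) 2023, Vol.40, No.1, pp.225-258 https://doi.org/10.3743/KOSIM.2023.40.1.225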

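Abstract (Korean, translated)

This study aims to explore the current status and characteristics of research data sharing by Korean researchers by examining the data sharing mechanisms and repositories specified in the data availability statements (DAS) of papers they authored. Papers by researchers affiliated with Korean institutions published in PLOS ONE from 2014 to 2022 were selected as the study sample. The study first determined whether a DAS was present in each paper, classified the types of data sharing mechanisms using previous studies, and examined how each mechanism changed over time. It found that 99.6% of the papers contained a DAS and that the pattern of mentions by mechanism type was similar to the international pattern, although the preferred types were changing over time. The study then focused on repositories among the data sharing mechanisms, counting the number and share of repositories mentioned in DAS and analyzing, as a time series, changes in the use of the five most frequently mentioned repositories. It also examined the presentation, type, and validity of the data access points given along with the repositories. The top five repositories accounted for 60% of all repository mentions, the use of repositories that handle data code was increasing, and most of the data access points given with the repositories were valid.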

Abstract

The purpose of this study is to investigate the current status and characteristics of research data sharing by domestic researchers by analyzing the data sharing mechanisms and repositories specified in the data availability statements (DAS) of papers they authored. To this end, papers by researchers affiliated with domestic institutions published in PLOS ONE from 2014 to 2022 were selected as the subject of the study. First, the presence of a DAS in each paper was identified, the types of data sharing mechanisms were analyzed using previous studies, and the trend of changes in each data sharing mechanism over time was investigated. As a result, it was found that a DAS was written in 99.6% of the target papers and that the types of data sharing mechanisms were similar to international patterns, but the preferred types were changing over time. Afterward, focusing on repositories among the data sharing mechanisms, the number and ratio of repositories mentioned in DAS were identified, and changes in the use of the five most frequently mentioned repositories were analyzed as a time series. In addition, the presentation method, type, and validity of the data access points mentioned along with the repositories were investigated. It was confirmed that the top five repositories account for 60% of all repository mentions and that the use of repositories dealing with data code is increasing; in addition, most of the data access points presented with the repositories were found to be valid.

6

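An Analysis of the Intellectual Structure of the Open Data Field Using Co-word Analysis

이혜경 (Lecturer, Department of Library and Information Science, Daegu Catholic University) ; 이용구 (Department of Library and Information Science, Kyungpook National University) 2023, Vol.40, No.4, pp.429-450 https://doi.org/10.3743/KOSIM.2023.40.4.429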

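Abstract (Korean, translated)

The purpose of this study is to examine recent trends and the intellectual structure of research on open data. The study searched Scopus for ‘open data’ as an author keyword and collected a total of 6,543 papers from 1999 to 2023. After data preprocessing, the author keywords of 5,589 papers were used to derive centrality measures and perform network analysis for research on open data and for research on linked open data. In open data research, ‘big data’ showed the highest centrality, and the main topics were the use of open data in the sense of public data and its application to policy, data analysis using open data as a concept associated with big data, and the use of open data, such as its reproduction, utilization, and access. In linked open data research, ‘semantic web’ ranked highest in both triangle betweenness centrality and nearest neighbor centrality, and more studies focused on data linkage and relationship formation than on public data in government policy.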

Abstract

The purpose of this study is to examine recent trends and intellectual structures in research related to open data. To achieve this, the study searched Scopus for the author keyword “open data” and collected a total of 6,543 papers from 1999 to 2023. After data preprocessing, the study used the author keywords of 5,589 papers to perform network analysis and derive centrality measures for open data research and linked open data research. The study found that “big data” exhibited the highest centrality in research related to open data. Research in this area mainly covers the use of open data in the sense of public data and its application to policy, data analysis using open data as a concept associated with big data, and topics related to the use of open data, such as its reproduction, utilization, and access. In linked open data research, “semantic web” ranked highest in both triangle betweenness centrality and nearest neighbor centrality. Moreover, research emphasizing data linkage and relationship formation, rather than public data in government policy, was more prevalent in this field.

7

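A Study on the Quality of Library Public Data: Focusing on the Book Detail Lookup API of 도서관 정보나루

양수완 (Doctoral course completed, Department of Library and Information Science, Chung-Ang University) 2020, Vol.37, No.4, pp.181-206 https://doi.org/10.3743/KOSIM.2020.37.4.181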

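Abstract (Korean, translated)

With the growing opening and provision of public data, data that public libraries produce in the course of their work, such as bibliographic data and loan histories, are being provided as library public data. This paper diagnoses the quality of library public data and, based on the results, proposes measures to improve it. First, it reviews research on public data in library and information science. Next, it diagnoses the completeness and accuracy of library public data obtained through the open API of 도서관 정보나루, an open platform for library public data. Finally, it derives improvement measures based on the diagnosis results. The completeness diagnosis found many blanks in bibliographic elements essential for identifying and searching for books. The accuracy diagnosis found inaccurate bibliographic elements that did not follow the value type, value range, or constraints. Based on these results, the study recommends improving the data collection procedure of 도서관 정보나루, building a schema for each dataset, providing guidance on data collection and data processing, and releasing the raw data.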

Abstract

With the popularization of open government data, library-related open government data is also being opened to the public and used. The purpose of this paper is to diagnose the quality of library-related open government data and to propose measures to improve its quality based on the diagnosis results. The completeness diagnosis identified a number of blanks in bibliographic elements essential for identifying and searching for a book. The accuracy diagnosis identified bibliographic elements that do not comply with the data schema. Based on the results of the data quality diagnosis, this study suggests improving the data collection procedure, establishing dataset schemas, providing details on data collection and data processing, and publishing the raw data.
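The two diagnoses described in this abstract, completeness (blank values) and accuracy (values that violate type, range, or constraint rules), can be sketched roughly as follows. The record layout, field names, and year range are assumptions for illustration; the abstract does not give the API's actual schema or the study's rules. The ISBN-13 check digit rule is the standard one.

```python
# Hypothetical records and field names; the API's actual schema is not given in the abstract.
records = [
    {"isbn13": "9780306406157", "title": "Book A", "publication_year": "2019"},
    {"isbn13": "", "title": "Book B", "publication_year": "2020"},
    {"isbn13": "97803064", "title": "Book C", "publication_year": "20XX"},
]

REQUIRED_FIELDS = ("isbn13", "title", "publication_year")

def completeness(records, fields=REQUIRED_FIELDS):
    """Return the share of non-blank values for each field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if str(r.get(field) or "").strip()) / total
        for field in fields
    }

def is_valid_isbn13(value):
    """Check type and length, then the standard ISBN-13 check digit."""
    if len(value) != 13 or not value.isdigit():
        return False
    weighted = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(value[:12]))
    return (10 - weighted % 10) % 10 == int(value[12])

def is_valid_year(value, low=1000, high=2100):
    """Check that the year is numeric and inside an assumed plausible range."""
    return value.isdigit() and low <= int(value) <= high

def inaccurate_fields(record):
    """List non-blank fields whose values break the rules above."""
    problems = []
    isbn = record.get("isbn13") or ""
    year = record.get("publication_year") or ""
    if isbn and not is_valid_isbn13(isbn):
        problems.append("isbn13")
    if year and not is_valid_year(year):
        problems.append("publication_year")
    return problems

report = completeness(records)
flagged = {r["title"]: inaccurate_fields(r) for r in records}
```

Blank values are counted only under completeness and skipped under accuracy, so the two diagnoses do not double-count the same defect.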

8

A Study on Content-Based Image Retrieval Using Data Fusion

백우진 (Konkuk University) ; Sun-Eun Jung (Konkuk U) ; Euigun Ahn (Yonsei U) ; 김기용 (Konkuk University) ; 신문선 (Konkuk University) 2008, Vol.25, No.2, pp.49-68 https://doi.org/10.3743/KOSIM.2008.25.2.049

Abstract

In many information retrieval experiments, data fusion techniques have been used to achieve higher effectiveness than retrieval based on a single source of evidence. However, there have not been many image retrieval studies using data fusion techniques, especially ones that combine retrieval results from multiple retrieval methods. In this paper, we describe how image retrieval effectiveness can be improved by combining two sets of retrieval results obtained with Sobel operator-based edge detection and the Self-Organizing Map (SOM) algorithm. We used clip art images from a commercial collection to develop a test data set. The main advantage of using this type of data set was the clear-cut relevance judgment, which did not require any human intervention.
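As a rough illustration of combining two sets of retrieval results, the sketch below uses CombSUM with min-max score normalization. This is one common fusion rule chosen here as an assumption; the abstract does not state which combination method the authors used. The image identifiers and scores are hypothetical.

```python
def min_max(scores):
    """Rescale one run's scores to [0, 1] so runs on different scales are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def comb_sum(*runs):
    """Fuse runs by summing normalized scores; items missing from a run add 0."""
    fused = {}
    for run in runs:
        for doc, score in min_max(run).items():
            fused[doc] = fused.get(doc, 0.0) + score
    # Sort by descending fused score, breaking ties by identifier.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical scores from an edge-based run and a SOM-based run.
edge_run = {"img1": 0.9, "img2": 0.5, "img3": 0.1}
som_run = {"img2": 12.0, "img3": 8.0, "img4": 4.0}

ranking = comb_sum(edge_run, som_run)
print([doc for doc, _ in ranking])
```

An image retrieved by both methods accumulates score from each, which is why fusion can promote items that neither single method ranks first.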

9

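An Analysis of Research Fronts in Korean Women's Studies Using Citation Image Makers Profiling

김조아 (Department of Library and Information Science, Graduate School, Myongji University) ; 이재윤 (Myongji University) 2016, Vol.33, No.2, pp.201-225 https://doi.org/10.3743/KOSIM.2016.33.2.201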

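Abstract (Korean, translated)

This study proposes citation image makers profiling as a new technique for analyzing the research fronts of interdisciplinary fields. Citation image makers profiling identifies subject relationships between documents using the title words of the documents that cite them as clues. As a trial, the technique was applied to women's studies in Korea to identify research fronts and major research topics. The analysis covered the set of core documents in women's studies with 10 or more citations in KCI as of 2015. Document co-citation analysis of the field ran into difficulty because of insufficient citation data, whereas citation image makers profiling successfully identified 2 major areas and 6 sub-areas. The proposed technique is expected to help identify trends in interdisciplinary research fields.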

Abstract

A new technique for revealing the research fronts of an interdisciplinary field has been developed. Citation image makers profiling (CIMP) determines the relationships between research papers using the title words of the documents that cite them. We applied this new technique to analyze the research fronts and hot topics in women’s studies in Korea. Using Korea Citation Index (KCI) data as of 2015, we selected 148 papers cited more than 9 times as the core documents of women’s studies. The intellectual structure was analyzed with citation image makers profiling, using the 148 core documents and the papers citing them. Document co-citation analysis was hindered by citation data sparsity, while the CIMP method successfully revealed the structure of the research fronts of Korean women’s studies, including 2 divisions and 6 subdivisions. The CIMP method suggested in this study has good potential to reveal the characteristics of research fronts in interdisciplinary research domains.

10

Identifying the Research Fronts of Korean Library and Information Science through Document Co-citation Analysis

이재윤 (Myongji University) 2015, Vol.32, No.4, pp.77-106 https://doi.org/10.3743/KOSIM.2015.32.4.077

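Abstract (Korean, translated)

Through document co-citation analysis of data from the Korea Citation Index (KCI), this study identified in detail the research fronts of Korean library and information science over the ten years from 2004 to 2013. Information on 159 core papers in library and information science and the papers citing them was collected manually from the KCI website. Cluster analysis and network analysis yielded 27 multi-paper clusters and 8 single-paper clusters. Among the 27 multi-paper clusters, the ‘LIS education’ cluster had the most papers, and the ‘citation analysis & intellectual structure analysis’ cluster had the greatest citation impact. Of the citations to the core document set, 67.5% came from within library and information science and the remaining 32.5% from other disciplines. Considering both the share of citations from within the field and the citation impact growth index, the three research front topics where research has recently become most active were ‘local documentation’, ‘citation analysis & intellectual structure analysis’, and ‘research trends analysis’. The analytical approach used in this study is expected to be effective for analyzing research fronts in interdisciplinary fields in Korea.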

Abstract

Using document co-citation analysis of Korea Citation Index (KCI) data, this study identified in detail the research fronts and hot topics in Korean library and information science (LIS) from 2004 to 2013. The 159 core papers in the LIS domain and their citations were collected manually from the Korea Citation Index web site. In the cluster analysis and network analysis, the 159 core papers were grouped into 27 clusters with multiple papers and 8 singleton clusters. Among the 27 clusters with multiple papers, the ‘LIS education’ cluster was the largest with 16 core papers, and the ‘citation analysis & intellectual structure analysis’ cluster had the strongest citation impact according to the ehs-index. Closer observation of the citations to the core papers in each research front showed that 67.5% of the citations were made by LIS research papers and 32.5% by non-LIS research papers. Considering the share of citations and the citation impact growth index, ‘local documentation’, ‘citation analysis & intellectual structure analysis’, and ‘research trends analysis’ were identified as the most rapidly emerging research fronts in Korean library and information science. The analytical methods used in this study have great potential for revealing the characteristics of research fronts in Korean interdisciplinary research domains.
