Journal of the Korean Society for Information Management, Korean Society for Information Management

41

Ontology Design for Improving Cultural Heritage-Centered Records Services: Focusing on Records Related to Hwangnyongsa

김시정 (Daegu University, records management specialist) ; 최상희 (Daegu Catholic University) 2022, Vol.39, No.4, pp.241-268 https://doi.org/10.3743/KOSIM.2022.39.4.241

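Abstract (translated from Korean)

Records related to cultural heritage are concrete evidence of that heritage and serve as important supporting material for its preservation, so they are as significant as the heritage itself. In particular, when a specific cultural property has high national or social value, it often becomes a research theme in its own right, and programs themed on it are frequently planned. However, records produced around well-known cultural properties have accumulated over a long period, have been managed in a dispersed manner, and appear in diverse forms, which makes it difficult to grasp their scope, location, and content. To address these problems, this study collected records related to a major cultural property of social and historical value, such as Hwangnyongsa, from 11 public institutions and web services. It analyzed the record types, the activities associated with the records, and their metadata in order to design an ontology that captures the scope and relationships of the full body of records, so that records can be understood with a specific cultural property at the center.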

Abstract

Records related to a cultural heritage are concrete evidence that proves the value of that heritage and serve as a basis for long-term preservation. The value of the records is as important as the value of the cultural heritage itself. When a specific cultural heritage has national or social importance, it is often studied as a theme in its own right, and various programs about it are developed. However, it is difficult to grasp the scope, record types, and contents of the records because they have been distributed and managed in many institutes and appear in various forms. As a solution to these problems, this study collected records of a major cultural heritage with social and historical value, such as Hwangnyongsa, from 11 public institutions and web services and analyzed the types of records, the activities related to the records, and their metadata. Based on this analysis, an ontology for understanding the scope and relationships of the entire body of records was proposed so that records can be understood with a focus on a specific cultural heritage.

42

A Study on Metadata Modeling and Standard Element Development for Building an Oral History Records Archive

이정연 (Kyonggi University, Institute of Humanities) 2009, Vol.26, No.1, pp.163-184 https://doi.org/10.3743/KOSIM.2009.26.1.163

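Abstract (translated from Korean)

This study presents the importance of oral history, which stands in contrast to document-based history, as archival material, and sets out to develop a standard metadata model and design elements for structuring oral history records. To this end, it analyzed standard metadata description elements that can represent the content and form of oral history records as information sources, and designed a metadata model for digital oral history archiving consisting of four domains: Project, Management, Record, and Related Record. The model was then implemented and applied to actual oral history records, following design principles for basic elements, sub-elements, and qualifier elements.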

Abstract

This study aims to develop a standard metadata model and its elements, and to present the importance of oral history archives, which contrast with literature-based history. The study analyzed standard metadata description elements that can express the contents and forms of oral history archives as information. Furthermore, it designed a metadata model with Project, Management, Record, and Related Record domains for building digital oral archives. Finally, the study applied the model to real oral history archives, based on design principles for basic elements, detailed elements, and division elements.

43

Quality Evaluation of Dublin Core Metadata Automatically Generated by ChatGPT: Focusing on Korean Books

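김선욱 (Kyungpook National University, College of Social Sciences, Department of Library and Information Science) ; 이혜경 (Kyungpook National University, Department of Library and Information Science) ; 이용구 (Kyungpook National University) 2023, Vol.40, No.2, pp.183-209 https://doi.org/10.3743/KOSIM.2023.40.2.183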

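Abstract (translated from Korean)

The purpose of this study is to assess ChatGPT's ability to generate metadata, and the potential of that ability, by evaluating the quality of Dublin Core records that ChatGPT generated from book cover, title page, and colophon data. To this end, cover, title page, and colophon data for 90 books were collected and input to ChatGPT to generate Dublin Core records, and the output was measured for completeness and accuracy. Across the whole dataset, completeness was 0.87 and accuracy was 0.71, a respectable level. By element, Title, Creator, Publisher, Date, Identifier, Rights, and Language performed relatively better than the other elements. Subject and Description scored somewhat lower on completeness and accuracy, but these elements demonstrated the generative ability known to be ChatGPT's strength. Meanwhile, the accuracy of the Contributor element was somewhat low in the DDC main classes of social sciences and technology. This was attributed to ChatGPT's errors in extracting statements of responsibility, the absence from the data itself of bibliographic description content for the metadata element, and the English-centered composition of ChatGPT's training data.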

Abstract

The purpose of this study is to evaluate the Dublin Core metadata generated by ChatGPT using book covers, title pages, and colophons from a collection of books. To achieve this, we collected book covers, title pages, and colophons from 90 books and input them into ChatGPT to generate Dublin Core metadata. The performance was evaluated in terms of completeness and accuracy. The overall results showed a satisfactory level of completeness at 0.87 and accuracy at 0.71. Among the individual elements, Title, Creator, Publisher, Date, Identifier, Rights, and Language exhibited higher performance. The Subject and Description elements showed relatively lower performance in terms of completeness and accuracy, but they confirmed the generative capability known as the inherent strength of ChatGPT. On the other hand, books in the social sciences and technology classes of DDC showed slightly lower accuracy in the Contributor element. This was attributed to ChatGPT’s errors in extracting statements of responsibility, omissions in the original bibliographic description contents for metadata, and the language composition of the training data used by ChatGPT.

44

A Study on Reference Metadata Recognition Using a Bidirectional Gated Recurrent Unit Model and a Conditional Random Field Model Based on Pre-trained Language Models

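지선영 (Kyonggi University Graduate School, Department of Library and Information Science) ; 최성필 (Kyonggi University, Department of Library and Information Science) 2021, Vol.38, No.1, pp.221-242 https://doi.org/10.3743/KOSIM.2021.38.1.221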

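Abstract (translated from Korean)

This study investigated automatic recognition of the metadata that make up bibliographic references, using a bidirectional gated recurrent unit (GRU) model and a conditional random field (CRF) model on top of a pre-trained language model. The experimental data consist of 161,315 references extracted by rule-based analysis from 53,562 scholarly documents in PDF format, collected from 40 journals published in 2018. To build the experimental set, the study also carried out work on automatically extracting reference metadata by analyzing the references in PDF-format scholarly documents. The study identified the language model with the highest performance, ran additional experiments on that model to compare recognition performance by training set size, and finally checked performance for each metadata type.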

Abstract

This study addressed reference metadata recognition using a bidirectional GRU-CRF model based on a pre-trained language model. The experimental data consist of 161,315 references extracted by rule-based analysis from 53,562 academic documents in PDF format, collected from 40 journals published in 2018. To construct the experiment set, the study also worked on automatically extracting reference metadata from academic literature in PDF format. Through this study, the language model with the highest performance was identified, and additional experiments were conducted on the model to compare the recognition performance according to the size of the training set. Finally, the performance for each metadata type was confirmed.

45

A Study on a Standardization Model for Knowledge and Information Resources

이창열 (Dong-Eui University) ; 정의석 (Korea University) 2006, Vol.23, No.4, pp.165-177 https://doi.org/10.3743/KOSIM.2006.23.4.165

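Abstract (translated from Korean)

The national knowledge and information resources managed by the Korea Agency for Digital Opportunity and Promotion (KADO) are distributed across several institutions, and the metadata specification was a conceptual-level recommended standard intended for retrieval rather than integration. As a result, many problems arise when linking or integrating the data. To integrate knowledge and information resources distributed across several institutions, this paper analyzes the existing metadata held by those institutions, identifies and remedies the problems, and proposes a standard model that supports ongoing linkage and integration.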

Abstract

The National Knowledge and Information Resources of KADO (Korea Agency for Digital Opportunity and Promotion) are distributed across several data centers. The metadata standard for the resources was a conceptual-level recommended standard, intended for retrieval rather than integration. It is therefore not easy to integrate the metadata into a central metadata DB or to connect metadata among the data centers. In this paper, we analysed the metadata of the several data centers and proposed an integrated standard model for the central metadata DB.

46

A Study on Developing a Metadata Search System Based on the Text Structure of Korean Studies Research Articles

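송민선 (Sungkyunkwan University, Information Management Research Institute) ; 고영만 (Sungkyunkwan University) ; 이승준 (Sungkyunkwan University, Information Management Research Institute) 2016, Vol.33, No.3, pp.155-176 https://doi.org/10.3743/KOSIM.2016.33.3.155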

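Abstract (translated from Korean)

This study aims to confirm the usefulness of text-structure-based metadata by building a scholarly information system that applies metadata based on the semantic structure of Korean studies research article texts and comparing it with an existing similar system. To this end, a pilot search system (Korean Studies Metadata Database, KMD) applying semantic structure metadata items was built for Korean studies research articles in the Korea Citation Index (KCI) that met certain criteria, and the same search keywords were applied to examine how it differs from the existing KCI system. The results confirmed that the KMD system shows results matching the user's search intent more efficiently than KCI. That is, even when the keyword combinations or query expressions are the same as in the existing system, KMD can present varied results according to what the user ultimately seeks for their research, such as the research purpose, the target data, or the spatio-temporal background.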

Abstract

This study aims to develop a scholarly metadata information system based on conceptual elements of the text structure of Korean studies research articles and to assess the applicability of text-structure-based metadata in comparison with an existing similar system. For the study, we constructed a database (Korean Studies Metadata Database, KMD) with text-structure-based metadata of Korean studies journal articles selected from the Korea Citation Index (KCI). We then examined differences between the KCI system and the KMD system by comparing search results for the same keywords. The KMD system returned results that met users’ search intentions more efficiently than the KCI system. In other words, even when the keyword combinations and conditional expressions are the same, the KMD system can directly present research purposes, research data, spatio-temporal contexts of the research, and so on as search results.

47

Comparison of Performance Factors in Automatic Classification of Records Using Metadata

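김영범 (Chonnam National University Graduate School, M.A. in Records Management) ; 장우권 (Professor, Department of Library and Information Science, Chonnam National University) 2023, Vol.40, No.3, pp.99-118 https://doi.org/10.3743/KOSIM.2023.40.3.099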

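Abstract (translated from Korean)

The purpose of this study is to identify performance factors in the automatic classification of records using metadata that carry the records' contextual information. For the study, about 97,064 items of original-text information from central administrative agencies in 2022 were collected. Various classification algorithms, data selection methods, and document representation techniques were applied to the collected data, and the results were compared to identify the best-performing factors for automatic record classification. The results showed that Random Forest performed best among the classification algorithms and the TF technique among the document representation techniques. The minimum number of data items per unit task had only a marginal effect on performance, while features had a clear effect on performance changes.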

Abstract

The objective of this study is to identify performance factors in the automatic classification of records by utilizing metadata that contains the contextual information of records. For this study, we collected 97,064 records of original textual information from Korean central administrative agencies in 2022. Various classification algorithms, data selection methods, and feature extraction techniques were applied and compared to identify the factors that yield the best performance. The results showed that Random Forest performed best among the classification algorithms and the TF method among the feature extraction techniques. The minimum data quantity of unit tasks had a minimal influence on performance, and the addition of features positively affected performance, while their removal had a discernible negative impact.

48

Sentence Structure Analysis for Automatic Identification of Semantic-Structure-Based Metadata Items in Research Articles

송민선 (Daelim University) 2018, Vol.35, No.3, pp.101-121 https://doi.org/10.3743/KOSIM.2018.35.3.101

Abstract

This study proposes a method of analyzing sentence semantics so that sentences in the data corresponding to the logical semantic structure metadata of research papers can be automatically identified and processed as the appropriate items in the system, according to how the sentences are composed. To this end, the structure of sentences corresponding to ‘Research Objectives’ and ‘Research Outcomes’ among the semantic structure metadata was analyzed in terms of the number of words, the types of linking words, the roles of frequently occurring words in sentences, and the types of word endings. As a result, the number of words in the sentences was 38 in ‘Research Objectives’ and 212 in ‘Research Outcomes’. The linking word types in ‘Research Objectives’ occurred in the order Causality, Sequence, Equivalence, In-other-word/Summary relation, and those in ‘Research Outcomes’ in the order Causality, Equivalence, Sequence, In-other-word/Summary relation. Analysis target words such as ‘역할(Role)’, ‘요인(Factor)’ and ‘관계(Relation)’ played a similar role in both the objectives and outcomes parts, but the role of ‘연구(Study)’ differed slightly. Finally, the verb endings that appeared most often were ‘∼고자’ and ‘∼였다’ in ‘Research Objectives’, and ‘∼었다’, ‘∼있다’, and ‘∼였다’ in ‘Research Outcomes’. This study is significant as fundamental research that can be used to automatically identify and input metadata elements reflecting the common logical semantics of research papers, in order to support researchers’ scholarly sensemaking.

49

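A Study on the Implementation of a Multimedia Metadatabase and an Integrated Digital Library Information System for an Oceanographic Information Center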

한종엽 (Korea Ocean Research and Development Institute) ; 최영준 (㈜킨스, e-Business Division) 2004, Vol.21, No.4, pp.5-26 https://doi.org/10.3743/KOSIM.2004.21.4.005

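Abstract (translated from Korean)

This study surveyed and analyzed prior research with the aim of implementing the multimedia metadatabase and integrated digital library information system needed for efficient information services at a domestic oceanographic information center. The resources studied covered print media, network resources, full-text files, and video in the ocean field. The study sought to provide an integrated information retrieval service based on MODS, which is used as a Library of Congress (LC) standard, for describing and organizing multimedia content resources of all kinds, including print media. To that end, it surveyed information resources in the ocean field and studied multimedia information processing, metadata description elements such as MODS, a metadata classification scheme, system configuration, and approaches to implementing retrieval.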

Abstract

A literature analysis was carried out for the planning and implementation of a multimedia meta database and an integrated digital library information system, in order to organize the various oceanographic resources of the Oceanographic Information Center, the first in Korea. The study covered resources ranging from printed matter, network resources, and full text to VOD. The analysis focused on providing a practical integrated information retrieval service for oceanographic resources, based on the MODS metadata framework with network resource description. The analyses covered oceanographic resources, multimedia information processing, MODS metadata descriptive elements, metadata classification, system organization, and retrieval, for the planning and implementation of the multimedia meta database system.

50

A Study on Building a Personal Name Access Point Control System to Improve Search in Institutional Repositories

김미향 (Seoul National University) ; 김태수 (Yonsei University) 2010, Vol.27, No.3, pp.125-146 https://doi.org/10.3743/KOSIM.2010.27.3.125

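Abstract (translated from Korean)

To solve the problems of personal name searching in institutional repositories, where metadata are built primarily through self-archiving, this study constructed personal name access point control data. It drew on existing library authority data but did not designate an authorized form. Instead, it used a grouping method in which every form appearing in the information sources serves as an access point and, to handle different persons with the same name, a new method that extends the data with the author's subject field and work information. Personal name access point control data built on this basis were applied to the system, and search performance improved. The approach could also be used in future to improve search across all metadata overseen by libraries, beyond institutional repositories.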

Abstract

This study developed a name access point control system to improve information retrieval from institutional repositories, in which metadata are generated by authors through self-archiving. In developing name access point control data for the system, the primary data were created from existing authority data. However, unlike the existing authority data, the primary data did not use any authorized forms. Instead, the data used all the forms found in the resources as access points. In addition, the author's field of activity (subject) and title information were used to distinguish between persons who have the same name. The results showed that the system improved information retrieval performance. The system is also expected to be applicable to other metadata provided by libraries, in addition to institutional repositories, in order to provide better quality information.
