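정보관리학회지 (Journal of the Korean Society for Information Management), 한국정보관리학회 (Korean Society for Information Management)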

1

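한나은(Korea Institute of Science and Technology Information) ; 서수정(Korea Institute of Science and Technology Information) ; 엄정호(Korea Institute of Science and Technology Information) 2023, Vol.40, No.3, pp.77-98 https://doi.org/10.3743/KOSIM.2023.40.3.077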

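Abstract (translated from the Korean)

Among the large language models (LLMs) proposed to date, this study focuses on the data quality of models that use research data as their main pre-training data, such as LLaMA and LLaMA-based models. It analyzes current evaluation criteria and proposes quality evaluation criteria from the perspective of research data. To this end, quality evaluation is discussed in terms of validity, functionality, and reliability among the data quality evaluation factors, and the LLaMA, Alpaca, Vicuna, and ChatGPT models are compared to understand the characteristics and limitations of LLMs. To analyze the evaluation criteria now widely applied to LLMs, the study examines the criteria of Holistic Evaluation of Language Models and then discusses their limitations. On this basis, it presents quality evaluation criteria for LLMs that use research data as their main pre-training data and discusses directions for future development, contributing a knowledge base for the further development of LLMs.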

Abstract

Large Language Models (LLMs) are becoming a major trend in natural language processing. These models are built on research data, but information such as the types, limitations, and risks of using that research data is unknown. This research presents how to analyze and evaluate, from the perspective of research data, LLMs built with research data: LLaMA and LLaMA-based models such as Stanford's Alpaca and Vicuna from the Large Model Systems Organization, compared alongside OpenAI's ChatGPT. The quality evaluation focuses on the validity, functionality, and reliability factors of Data Quality Management (DQM). Furthermore, we examined the Holistic Evaluation of Language Models (HELM) to understand its evaluation criteria and then discussed its limitations. This study presents quality evaluation criteria for LLMs using research data and future development directions.

2

A Study on the Analysis of Factors Affecting the Development of Specialized Libraries

김은형(Department of Library and Information Science, Konkuk University) ; 노영희(Konkuk University) 2023, Vol.40, No.2, pp.81-114 https://doi.org/10.3743/KOSIM.2023.40.2.081

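Abstract (translated from the Korean)

This study surveyed librarians at specialized libraries to analyze how changes in the internal and external environment and policy support measures affect their areas of work. Based on the perceptions of working librarians identified in the survey, it sought to derive the factors that affect the development of specialized libraries and corresponding policy proposals. The results are as follows. First, regarding perceptions of the library's status and role within individual institutions, 58.3% of opinions were negative on the importance of library development plans, whereas perceptions were positive on whether the library performs its inherent role. Second, raising the status of specialized libraries requires recognizing the importance of their main functions and roles and, with it, awareness of academic research activities. Third, regarding perceptions of specialized libraries and operational evaluation in the Comprehensive Library Development Plan, respondents rated the expansion of public services for national public information highest. In addition, among the five-year development strategies, respondents preferred building a system for updating and surveying the current status of specialized libraries as the policy to be implemented first. Fourth, in the analysis of effective alternatives and improved indicators for raising the participation rate in library operation evaluation, assigning a weight to the “institutional library operation evaluation” item among the public enterprise evaluation items scored highest, with a mean of 4.01. Accordingly, the most urgent tasks for the development of specialized libraries were found to be building a system that can comprehensively capture the current status of specialized libraries and actively supporting academic research.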

Abstract

In this study, a survey was conducted among librarians at specialized libraries to analyze how changes in the internal and external environment and policy support measures affect their areas of work. Based on the perceptions of working librarians, we tried to derive the factors that affect the development of specialized libraries and corresponding policy suggestions. The results were as follows. First, regarding the status and role of the library within individual institutions, 58.3% of opinions were negative on the importance of library development plans, while perceptions of whether the library fulfills its inherent role were positive. Second, to raise the status of specialized libraries, it was found that the importance of their major functions and roles must be recognized and that awareness of academic research activities is necessary. Third, with respect to specialized libraries and operational evaluation in the Comprehensive Library Development Plan, the expansion of national public information services to the public was rated highest. In addition, among the five-year development strategies, establishing a system for updating and surveying the status of specialized libraries was preferred as the policy to be implemented first. Fourth, in the analysis of effective alternatives and improvement indicators to increase the participation rate in library operation evaluation, weighting the “institutional library operation evaluation” item among the evaluation items for public enterprises scored highest, at 4.01 on average. Therefore, the most urgent tasks for the development of specialized libraries were found to be establishing a system that can comprehensively grasp the current status of specialized libraries and actively supporting academic research.

3

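A Study on Reading Culture Environment Analysis and Performance Evaluation of Related Projects for Establishing the Goyang City Comprehensive Reading Culture Promotion Plan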

송민선(Daelim University) ; 김수경(Changwon Moonsung University) ; 황금숙(Daelim University) ; 장인호(Daejin University) 2023, Vol.40, No.2, pp.1-31 https://doi.org/10.3743/KOSIM.2023.40.2.001

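Abstract (translated from the Korean)

This study was conducted to propose the main directions for a reading culture promotion policy that reflects the characteristics of Goyang City, by evaluating and analyzing the performance of the major reading culture promotion projects the city has carried out so far, in preparation for the 2nd Comprehensive Reading Culture Promotion Plan of Goyang City. To achieve this purpose, the study analyzed the reading culture environment of Goyang City on the basis of literature and statistical data, identified whether the major projects carried out under the ‘1st Comprehensive Reading Culture Promotion Plan’ established in 2017 were implemented and what they achieved, and then constructed a questionnaire for evaluating that performance and surveyed librarians in Goyang City. Finally, the study combined the results of the reading culture environment analysis, the evaluation of project performance derived from the librarian survey, and the results of the opinion-gathering items on reading promotion projects from a reading survey of Goyang citizens conducted in an earlier related study. From these it proposed six main policy directions that Goyang City libraries should consider in establishing the 2nd Comprehensive Reading Culture Promotion Plan.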

Abstract

The purpose of this study is to evaluate and analyze the performance of the major reading culture promotion projects implemented in Goyang City, in preparation for the establishment of the 2nd Comprehensive Reading Culture Promotion Plan in Goyang City. To achieve this objective, the study analyzed the reading culture environment in Goyang City using literature and statistical data, and identified whether the major projects under the 1st Comprehensive Reading Culture Promotion Plan, established in 2017, were carried out and what they achieved. In addition, a questionnaire was constructed to evaluate the performance of these projects, and a survey was conducted among librarians in Goyang City. Then, by combining the analysis of the reading culture environment in Goyang City, the evaluation results for the reading culture promotion projects obtained from the librarian survey, and the opinions on reading promotion projects gathered in a reading survey of Goyang citizens in a previous related study, six major policy directions were proposed for consideration in establishing the 2nd Comprehensive Reading Culture Promotion Plan in Goyang City.

4

A Study on Automatic Extraction of References from Research Reports Using Deep Learning Language Models

한유경(Korea Information Society Development Institute) ; 최원석(Korea Information Society Development Institute) ; 이민철(Kakao Enterprise) 2023, Vol.40, No.2, pp.115-135 https://doi.org/10.3743/KOSIM.2023.40.2.115

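Abstract (translated from the Korean)

This study aims to build, efficiently, a reference database for research reports, whose references comprise many kinds of publications such as monographs, journals, and reports. To that end it compares and analyzes the performance of automatic reference extraction using deep learning language models. Unlike academic journals, research reports follow formats that differ by institution, which makes automatic reference extraction difficult. This study addressed the problem through metadata extraction, a task widely used in automatic reference extraction, together with a text separation task that isolates references in an environment where references and non-reference phrases are mixed. To build the automatic extraction models, we constructed a reference set from the research reports of a specific research institution, a reference set of the academic journal type, and a dataset merging academic journal references with non-reference phrases. We then trained the deep learning language models RoBERTa+CRF and ChatGPT and measured performance on metadata extraction and on material type classification and text separation. The results were meaningful, with a maximum F1-score of 95.41% for metadata extraction and 98.91% for material type classification and text separation. On this basis, the study proposes deep learning language models for extracting references from research reports that contain non-reference phrases, and directions for building reference data by dataset type.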

Abstract

The purpose of this study is to assess the effectiveness of using deep learning language models to extract references automatically and create a reference database for research reports in an efficient manner. Unlike academic journals, research reports present difficulties in automatically extracting references due to variations in formatting across institutions. In this study, we addressed this issue by introducing the task of separating references from non-reference phrases, in addition to the commonly used metadata extraction task for reference extraction. The study employed datasets that included various types of references, such as those from research reports of a particular institution, academic journals, and a combination of academic journal references and non-reference texts. Two deep learning language models, namely RoBERTa+CRF and ChatGPT, were compared to evaluate their performance in automatic extraction. They were used to extract metadata, categorize data types, and separate original text. The research findings showed that the deep learning language models were highly effective, achieving maximum F1-scores of 95.41% for metadata extraction and 98.91% for categorization of data types and separation of the original text. These results provide valuable insights into the use of deep learning language models and different types of datasets for constructing reference databases for research reports including both reference and non-reference texts.
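
The F1-scores reported above combine precision and recall into a single number. As a minimal sketch of how such a score can be computed for field-level metadata extraction, the Python example below assumes that each reference yields a set of (field, value) pairs. The function name `f1_score` and the sample fields are illustrative assumptions, not the authors' evaluation code.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall over two sets of items."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted metadata fields for one reference.
gold = {("author", "Kim"), ("year", "2020"), ("title", "Data Quality")}
predicted = {("author", "Kim"), ("year", "2020"), ("title", "Data")}

score = f1_score(predicted, gold)
print(f"F1 = {score:.4f}")
```

Here two of the three predicted fields match the gold fields exactly, so precision and recall are both 2/3. Exact matching is only one possible criterion; a partial-match scheme would give a different score.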

5

User Experience Evaluation of Menstrual Cycle Measurement Applications Using Text Mining Analysis Techniques

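정우경(Master's, Department of Library and Information Science, Sookmyung Women's University) ; 신동희(Department of Library and Information Science, Sookmyung Women's University) 2023, Vol.40, No.4, pp.1-31 https://doi.org/10.3743/KOSIM.2023.40.4.001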

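Abstract (translated from the Korean)

This study evaluated the user experience of mobile menstrual cycle measurement applications, which are closely tied to women's health, by applying topic modeling together with several other text mining techniques, and analyzed the results in combination with the Honeycomb model. To evaluate the user experience expressed in application reviews, 47,117 Korean-language reviews of menstrual cycle measurement applications were collected. Topic modeling was performed to identify the overall discourse on user experience in the reviews, and text network analysis built on co-occurrence network relations was performed to identify the specific experiences within each topic. Sentiment analysis was also performed to capture users' emotional experience. On this basis, development strategies for menstrual cycle measurement applications were presented in terms of accuracy, design, monitoring, data management, and user management. The results confirmed that the applications' cycle measurement accuracy and monitoring functions need improvement and that a variety of design approaches are needed. The need to strengthen how personal information and users' biometric data are managed was also confirmed. The study explored the user experience (UX) of menstrual cycle measurement applications in depth, identified the various factors users experienced, and offered practical improvements for providing a better experience. It is also significant in presenting a methodology that combines topic modeling and text network analysis so that researchers can examine a vast amount of review data closely when evaluating user experience.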

Abstract

This study evaluated the user experience of mobile menstrual cycle measurement applications, which are closely related to women’s health, by applying various text mining techniques along with topic modeling, and analyzed the results in combination with the Honeycomb model. To evaluate the user experience revealed in application reviews, 47,117 Korean reviews of menstrual cycle measurement applications were collected. Topic modeling was conducted to identify the overall discourse on user experience in the reviews, and text network analysis based on co-occurrence relations was conducted to identify the specific experiences within each topic. In addition, sentiment analysis was conducted to understand the emotional experience of users. Based on this, development strategies for menstrual cycle measurement applications were presented in terms of accuracy, design, monitoring, data management, and user management. The results confirmed that the accuracy and monitoring functions of the applications should be improved and that a variety of design approaches are required. The need to strengthen the management of personal information and users’ biometric data was also confirmed. By exploring the user experience (UX) of menstrual cycle measurement applications in depth, this study revealed various factors experienced by users and suggested practical improvements to provide a better experience. It is also significant in that it presents a methodology combining topic modeling and text network analysis so that researchers can closely examine vast amounts of review data when evaluating user experience.
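
The text network analysis mentioned above rests on co-occurrence counts between words. The following Python sketch shows one simple way to derive weighted edges from tokenized reviews. It assumes that tokenization has already been done and that two words co-occur when they appear in the same review; the function name `cooccurrence_edges` and the sample tokens are hypothetical and do not reproduce the authors' pipeline or data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents):
    """Count how often each unordered word pair appears in the same document."""
    edges = Counter()
    for tokens in documents:
        # Sort the unique tokens so each pair has one canonical order
        # and is counted at most once per document.
        for pair in combinations(sorted(set(tokens)), 2):
            edges[pair] += 1
    return edges


# Hypothetical, already-tokenized reviews.
reviews = [
    ["cycle", "prediction", "accurate"],
    ["cycle", "prediction", "wrong"],
    ["design", "cute"],
]

edges = cooccurrence_edges(reviews)
for (word_a, word_b), weight in edges.most_common(3):
    print(word_a, word_b, weight)
```

The resulting counts can serve as edge weights in a word network, where heavier edges mark word pairs that users mention together more often.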

6

A Study on a Scholarly Literature Recommendation Technique Using a Multidimensional Metadata Space

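감미아(Department of Library and Information Science, Yonsei University) ; 이지연(Department of Library and Information Science, Yonsei University) 2023, Vol.40, No.1, pp.121-148 https://doi.org/10.3743/KOSIM.2023.40.1.121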

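Abstract (translated from the Korean)

The purpose of this study is to propose ‘a high-performing scholarly literature recommendation system based on metadata attribute similarity.’ The study set out to propose a recommendation technique grounded in Library and Information Science that draws on the use of metadata from information organization and on the concepts of co-citation, author bibliographic coupling, co-occurrence frequency, and cosine similarity from informetrics. For the experiments, metadata for a total of 9,643 papers related to ‘inequality’ and ‘divide’ were collected and cleaned, relative coordinate values among the author, keyword, and title attributes were derived using cosine similarity, and experiments were run to select well-performing weight conditions and numbers of dimensions. The experimental results were presented to users for evaluation. On that basis the study analyzed the characteristics of reference nodes and recommendation combinations, performed conjoint analysis, and compared results, organizing the discussion around the research questions. Overall, performance was strong for combinations that excluded author-related attributes and when only title-related attributes were used. If the proposed technique is applied and a broad sample is secured, it could help improve recommendation performance not only in literature recommendation for information services but also in various other fields of society.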

Abstract

The purpose of this study is to propose a high-performing scholarly paper recommendation system based on metadata attribute similarity. The study suggests a recommendation method that combines techniques from two sub-fields of Library and Information Science: metadata use from Information Organization, and co-citation analysis, author bibliographic coupling, co-occurrence frequency, and cosine similarity from Bibliometrics. For the experiments, metadata for a total of 9,643 papers related to “inequality” and “divide” were collected and refined to derive relative coordinate values between author, keyword, and title attributes using cosine similarity. The study then conducted experiments to select the weight conditions and number of dimensions that gave good performance. The results were presented to and evaluated by users, and on this basis the study discussed the research questions through analysis of reference node and recommendation combination characteristics, conjoint analysis, and comparative analysis of the results. Overall, performance was excellent for combinations that excluded author-related attributes and when only title-related attributes were used. If the technique proposed in this study is utilized and a wide range of samples is secured, it could help improve the performance of recommendation techniques not only in literature recommendation for information services but also in various other fields of society.
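
Cosine similarity between attribute vectors, combined under attribute weights, is central to the approach described above. The Python sketch below illustrates the general idea only. The binary term vectors, the attribute names, the equal weights, and the helper `weighted_similarity` are assumptions made for illustration and are not the authors' coordinate derivation or weight conditions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_a * norm_b)


def weighted_similarity(similarities, weights):
    """Weighted average of per-attribute similarities."""
    total_weight = sum(weights.values())
    return sum(weights[k] * similarities[k] for k in weights) / total_weight


# Hypothetical binary term vectors for two papers, one vector per attribute.
paper_a = {"keyword": [1, 1, 0], "title": [1, 0]}
paper_b = {"keyword": [1, 0, 1], "title": [1, 0]}

similarities = {
    attr: cosine_similarity(paper_a[attr], paper_b[attr]) for attr in paper_a
}
weights = {"keyword": 1.0, "title": 1.0}  # assumed equal weights

combined = weighted_similarity(similarities, weights)
print(similarities, round(combined, 2))
```

Dropping an attribute from `weights`, or setting its weight to zero, is one way to compare attribute combinations such as title-only recommendations.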
