Journal of the Korean Society for Information Management, Korean Society for Information Management

31

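A Study on the Operational Status of University Institutional Repositories and Directions for Improvement: Focusing on dCollection

김현희 (Myongji University); 정경희 (Hansung University); 김용호 (Pukyong National University) 2006, Vol.23, No.4, pp.17-39 https://doi.org/10.3743/KOSIM.2006.23.4.017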

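Korean Abstract

Institutional repositories are known as one of the core mechanisms for realizing the open access movement. The Korea Education and Research Information Service (KERIS) formed dCollection, a consortium of university institutional repositories, in 2003 as a shared space for scholarly information, and 62 national and private universities currently participate as members. Using the dCollection evaluation model built in 2005 as the survey instrument, this study examines the operational status of 40 university institutional repositories. Based on the survey results, it proposes development directions for Korean institutional repositories, focusing on improving the registration and usage rates of dCollection materials, with the aim of building the infrastructure for an efficient national knowledge and information distribution system.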

Abstract

Building institutional repositories is known as one of the most powerful methods for realizing the open access movement. In 2003 the Korea Education and Research Information Service (KERIS) organized institutional repositories into a consortium called "dCollection (Digital Collection)," which now comprises 62 universities. The purpose of this study is to investigate the current state of 40 member universities of dCollection using an evaluation model comprising 4 categories and 39 indicators and, based on the survey outcomes, to pinpoint the procedural and performance weaknesses of the dCollection systems in order to find tailored solutions focused on improving use and self-archiving rates.

32

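A Proposal of Quality Evaluation Criteria for Large Language Models from the Perspective of Research Data

한나은 (Korea Institute of Science and Technology Information); 서수정 (Korea Institute of Science and Technology Information); 엄정호 (Korea Institute of Science and Technology Information) 2023, Vol.40, No.3, pp.77-98 https://doi.org/10.3743/KOSIM.2023.40.3.077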

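Korean Abstract

Among the large language models proposed to date, this study focuses on the data quality of models that use research data as their main pre-training data, such as LLaMA and LLaMA-based models. It analyzes the current evaluation criteria and proposes quality evaluation criteria from the perspective of research data. To this end, quality evaluation is discussed with a focus on validity, functionality, and reliability among the data quality evaluation factors, and the LLaMA, Alpaca, Vicuna, and ChatGPT models are compared to understand the characteristics and limitations of large language models. To analyze the evaluation criteria of widely used large language models, the study examines the criteria of the Holistic Evaluation of Language Models and then discusses their limitations. On this basis, the study presents quality evaluation criteria for large language models that use research data as their main pre-training data and discusses directions for future development. Its significance lies in providing a knowledge base for the further development of large language models.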

Abstract

Large Language Models (LLMs) are becoming the major trend in the natural language processing field. These models were built on research data, but information such as the types, limitations, and risks of using that research data is unknown. This research presents how to analyze and evaluate, from the perspective of research data, LLMs that were built with research data: LLaMA, LLaMA-based models such as Stanford's Alpaca and Vicuna from the Large Model Systems Organization, and OpenAI's ChatGPT. The quality evaluation focuses on the validity, functionality, and reliability dimensions of Data Quality Management (DQM). Furthermore, we examined the Holistic Evaluation of Language Models (HELM) to understand its evaluation criteria and then discussed its limitations. This study presents quality evaluation criteria for LLMs that use research data, along with future development directions.

33

Semantic Web Technologies and Their Applications

오삼균 (Sungkyunkwan University) 2002, Vol.19, No.4, pp.298-319 https://doi.org/10.3743/KOSIM.2002.19.4.298

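Korean Abstract

The Semantic Web is a new technology that seeks efficient retrieval, integration, and reuse of information by turning web resources into knowledge, linking information on the basis of machine-readable definitions. Building the Semantic Web requires several core concepts and technologies: the URI scheme, which assigns persistent unique identifiers to resources; XML namespaces, which prevent collisions in the meaning of elements and attributes created by different information institutions; RDF, which enables interoperable resource description using metadata schemas; RDF Schema, which provides the basis for defining metadata elements and their associated classes and property relationships; the web ontology language DAML+OIL, which strengthens logical inference and expressive power on top of RDF Schema; and OWL (Web Ontology Language), which removes, revises, or supplements the DAML+OIL constructors. This paper outlines the gradual development of these concepts and technologies and examines both the interoperability that can be achieved when metadata elements are defined on the basis of XML/RDF schemas and the various ways ontologies can be used.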

Abstract

The Semantic Web is a new technology that attempts to achieve effective retrieval, automation, integration, and reuse of web resources by constructing knowledge bases composed of machine-readable definitions of resources and associations that express the relationships among them. To put this kind of Semantic Web in place, the following infrastructure is necessary: the capability to assign an unchangeable and unique identifier (URI) to each resource; adoption of the XML namespace concept to prevent collisions between element and attribute names defined by various institutions; widespread use of RDF to describe resources so that diverse metadata can be interoperable; use of RDF Schema to define the meaning of metadata elements and the relationships among them; adoption of DAML+OIL, which is built upon RDF(S) to increase reasoning capability and expressive power; and finally adoption of OWL, which is built upon DAML+OIL by removing unnecessary constructors and adding new ones based on experience with DAML+OIL. The purpose of this study is to describe the central concepts and technologies related to the Semantic Web and to discuss the benefits of metadata interoperability based on XML/RDF schemas and the potential applications of diverse ontologies.

34

A Study on the Use of Web-Based E-books by University Students

장혜란 (Sangmyung University) 2006, Vol.23, No.4, pp.233-256 https://doi.org/10.3743/KOSIM.2006.23.4.233

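Korean Abstract

To better understand and assess how university students use e-books, students at University A were sampled for a questionnaire survey and interviews. Analysis of 466 responses shows that students' awareness of e-books and e-book services is low, that about 30% have used e-books, and that the university library website is the dominant access route. Of the users, 73% have read three or fewer e-books. The subject areas used are varied but skewed toward literature and genre fiction, and the purposes are split between academic and personal reading. Awareness and use of additional functions are weak. User satisfaction is also low, with more than 50% expressing a neutral view. Among students with no experience of use, the reasons for non-use are mainly inconvenience and lack of relevant knowledge. However, about 88% of non-users express an intention to use e-books in the future. The interviews show that active users recognize the usefulness of e-books, are comfortable with on-screen reading, and use practical books. However, their awareness and use of additional functions and their satisfaction are also low. Based on the analysis, the study suggests the need for promotion to encourage use, diversification of production, education, and service evaluation.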

Abstract

To understand the use of ebooks among undergraduate students, a questionnaire was devised and data were collected from 466 respondents. Awareness of ebooks and ebook services appears to be low, and only about 30% of the students have used ebooks in the past. Students access ebooks primarily through the library homepage. Of the users, 73% have read three or fewer ebooks. The subject areas read are fairly varied, although literary works and genre fiction are the most popular, and the purpose is split between academic and private reading. Most users lack knowledge of the additional functions. Overall satisfaction is low. Discomfort and ebook illiteracy are the major reasons for non-use; however, about 88% of the non-users show willingness to use ebooks in the future. According to the interviews, active users are familiar with screen reading and perceive the advantages of ebooks. Nonetheless, their satisfaction level is still low. Based on the results, recommendations for creating awareness, education, production development, and service evaluation are suggested to promote ebook use.

35

A Network Analysis of Global Reader Countries of Korea-Related Articles Using Mendeley Co-Readership Data

조재인 (Incheon National University); 박종도 (Incheon National University) 2018, Vol.35, No.4, pp.107-124 https://doi.org/10.3743/KOSIM.2018.35.4.107

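Korean Abstract

Mendeley readership information can be used to understand, from multiple angles, how scholarly outputs are consumed outside academia, and thus to anticipate an unknown world that citation counts could not explain. Using Mendeley co-readership data, this study performs a network analysis of the reader countries of Korea-related articles to understand the clusters of countries that share common academic interests and to identify the influence these countries have in the network. The results show that, across all fields, the United States and other advanced countries generally have high global centrality, indicating overall cooperation on and potential for exchange around Korea-related research, while some developing countries have high local centrality, indicating that they are linked to one another by common academic interests. In medicine and the social sciences, OECD countries and developing countries form separate readerships, while in engineering, emerging economies form a large reader cluster. Engineering also shows a relatively high network density, suggesting a high potential for academic exchange, knowledge diffusion, and cooperation among countries.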

Abstract

Mendeley readership data can be used to understand, in multiple ways, how research outcomes are consumed outside academia. It can therefore be used to explore unknown territory that citation rates have so far been unable to explain. By conducting a country network analysis using Mendeley's co-readership data on Korea-related research articles, this study clusters countries that share common academic interests. The results show that the US and other advanced countries had high overall and regional centrality in all fields, indicating overall cooperation on and potential for exchange of Korea-related studies. Some developing countries showed high regional centrality and are linked by common academic interests. In the medical and social sciences, OECD countries and developing countries formed separate groups of readers, while in engineering, emerging economies formed a large community of readers. In addition, the engineering field showed relatively high network density, suggesting a high possibility of academic exchange, knowledge dissemination, and cooperation among countries.

36

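Building a Knowledge Graph for Quality Improvement and Linking of Basic Information about Culture and Arts Institutions

선은택 (Master's student, Information Science major, Department of Library and Information Science, Graduate School, Chung-Ang University); 김학래 (Department of Library and Information Science, Chung-Ang University) 2023, Vol.40, No.4, pp.329-349 https://doi.org/10.3743/KOSIM.2023.40.4.329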

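Korean Abstract

As information and communication technology has advanced rapidly, the pace of data production has surged, a trend represented by the concept of big data. The quality and reliability of big data, whose scale has grown sharply in a short time, are also being debated. Small data, by contrast, refers to a minimal amount of high-quality data needed for a specific problem situation. The culture and arts field contains data of many types and topics, and research using big data technology is under way. However, few studies have explored whether the basic information of culture and arts institutions is accurately provided and used. An institution's basic information can be essential evidence in most big data analyses, and it is the starting point for identifying the institution. This study collected data covering the basic information of culture and arts institutions, defined common metadata, and built small data in the form of a knowledge graph that links institutions around that common metadata. This can serve as a way to explore the types and characteristics of culture and arts institutions in an integrated manner.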

Abstract

With the rapid development of information and communication technology, the speed of data production has increased sharply, a phenomenon represented by the concept of big data. The quality and reliability of big data, whose scale has grown rapidly in a short period of time, are also under discussion. Small data, by contrast, is a minimal set of high-quality data needed for a specific problem situation. In the field of culture and arts, data of various types and topics exist, and research using big data technology is being conducted. However, there has been little research on whether basic information about culture and arts institutions is accurately provided and used. The basic information of an institution can be an essential basis for most big data analyses and is the starting point for identifying an institution. This study collected data containing the basic information of culture and arts institutions, defined common metadata, and constructed small data in the form of a knowledge graph that links institutions around the common metadata. This offers a way to explore the types and characteristics of culture and arts institutions in an integrated manner.

37

A Survey of Academic Librarians' Perceptions of Customized User Instruction

정미정 (Myongji University); 권나현 (Myongji University) 2014, Vol.31, No.4, pp.133-159 https://doi.org/10.3743/KOSIM.2014.31.4.133

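Korean Abstract

The purpose of this study is to investigate the perceptions of librarians in charge of customized user instruction at Korean university libraries that provide such instruction. In-depth interviews were first conducted with 8 librarians who had experience providing customized user instruction. A questionnaire developed from their responses and a literature review was then administered to 94 librarians at 94 university libraries nationwide that had provided such instruction. The results show that most respondents viewed customized user instruction positively and regarded cooperation with faculty as the most important factor in running the instruction. Respondents also considered attitudinal aspects, such as commitment to instruction and a spirit of service, to be more important for customized-instruction librarians than a master's degree or subject knowledge, and they identified a shortage of staff as the greatest difficulty in running the instruction. The findings can be used to expand and deliver customized user instruction in the field.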

Abstract

The purpose of this study was to investigate librarians' perceptions of customized bibliographic instruction (CBI) at four-year academic libraries in Korea. The study also examined instruction librarians' perceptions of the obstacles to, and the factors associated with, the effective adoption and delivery of CBI. The study used both a survey and in-depth interviews. The findings revealed the librarians' perceptions of the obstacles and facilitators in adopting and operating CBI, which suggest useful strategies for delivering CBI in academic libraries.

38

Relevance Feedback Based on a Medicine Ontology for Improving Retrieval Performance

임수연 (Kyungpook National University) 2005, Vol.22, No.2, pp.41-56 https://doi.org/10.3743/KOSIM.2005.22.2.041

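Korean Abstract

The Semantic Web, which aims to extend the existing web so that machines can understand and process the meaning of information, shares knowledge by means of ontologies. This paper proposes a method that uses the semantic relations in an ontology as relevance feedback information for query expansion in order to process sophisticated queries. Experiments were conducted on the Medicine ontology, a domain ontology, comparing the performance of keyword-based document retrieval, which uses only the frequency information of occurring terms, with the proposed ontology-based document retrieval. The precision and recall of the two systems were used as the performance evaluation criteria. The results show that the search engine, by exploiting the concepts and rules defined in the ontology, helped improve retrieval precision, and that the ontology could also serve as a basis for inference to improve retrieval performance.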

Abstract

With the aim of extending the Web so that machines can understand and process information, the Semantic Web shares knowledge in the form of ontologies. For sophisticated query processing, this paper proposes a method that uses the semantic relations in an ontology as relevance feedback information for query expansion. We conducted experiments in the pharmacy domain. To verify the effectiveness of the semantic relations in the ontology, we compared a keyword-based document retrieval system that assigns weights using frequency information with an ontology-based document retrieval system that uses relevant information in the ontology as relevance feedback. From the evaluation of retrieval performance, we found that the search engine used the concepts and relations in the ontology effectively to improve precision. They could also be used as the basis of inference for improving retrieval performance.

39

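A Study on Developing Preservation Metadata Elements for Research Information: Focusing on the Research Management System of the National Research Council for Economics, Humanities and Social Sciences

김판준 (National Research Council for Economics, Humanities and Social Sciences) 2010, Vol.27, No.4, pp.169-191 https://doi.org/10.3743/KOSIM.2010.27.4.169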

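Korean Abstract

This study developed preservation metadata elements for research information as a valuable digital information resource. In particular, the elements were developed as a foundation for the long-term preservation and use of the research information of government-funded research institutes in the economics, humanities, and social science fields, which are the leading producers of national policy knowledge. To secure the interoperability of research information that is managed in a distributed manner across various departments and institutions, the study compared and analyzed, on the basis of the OAIS reference model, the elements of CERIF, a European standard, and those of the PREMIS Data Dictionary, and then developed complementary preservation metadata elements that reflect the characteristics of both. As a result, the study presents preservation metadata elements for research information, together with application examples, that are practically implementable rather than merely conceptual and that presuppose compatibility between systems.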

Abstract

This study aimed to develop preservation metadata elements, and applications of them, for research information, which is now considered a valuable digital resource. Specifically, the preservation metadata is intended to provide a basis for the research information of government-funded research institutes in the economics and social science fields, which are major producers of national policy knowledge. To ensure the interoperability of research information across various departments and organizations, this study compared the elements of CERIF (a European standard) with those of the PREMIS Data Dictionary, which is based on the OAIS reference model (ISO 14721). Based on this comparative analysis, the study developed complementary preservation metadata elements that reflect the characteristics of the two standards. Consequently, the study proposed new preservation metadata elements and applications that are compatible between systems and can be implemented in practice.

40

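Deep Learning-Based Analysis of Depression Tendency in Korean Social Media Text

박서정 (Department of Library and Information Science, Yonsei University); 이수빈 (Department of Library and Information Science, Yonsei University); 김우정 (Department of Psychiatry, Yongin Severance Hospital, Yonsei University College of Medicine); 송민 (Department of Library and Information Science, Yonsei University) 2022, Vol.39, No.1, pp.91-117 https://doi.org/10.3743/KOSIM.2022.39.1.091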

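Korean Abstract

The number of patients with depression is increasing every year, both in Korea and worldwide. However, most patients with mental illness do not recognize that they are ill and therefore do not receive appropriate treatment. Because untreated depressive symptoms can develop into suicide, anxiety, and other psychological problems, early detection and treatment of depression are very important for promoting mental health. To address this problem, this study presents a deep learning-based depression tendency model using Korean social media text. Data were collected from Naver Knowledge iN, Naver Blog, Hidoc, and Twitter, and were then annotated with classes according to the number of depressive symptoms, using the DSM-5 diagnostic criteria for major depressive disorder. TF-IDF analysis and co-occurrence word analysis were then carried out to examine the characteristics of each class in the resulting corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to build a depression tendency classification model that uses a variety of text features. In this way, the embedded text, sentiment score, and topic number were computed for each document and used as text features. The results confirmed that the highest accuracy, 83.28%, was achieved when depression tendency was classified with the KorBERT algorithm using the embedded text combined with both the document's sentiment score and its topic. By building a Korean depression tendency classification model with improved performance from a variety of text features, this study is significant in that it lays a foundation for detecting potential depression patients among Korean online community users at an early stage, enabling prompt treatment and prevention and thereby helping to promote the mental health of Korean society.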

Abstract

The number of depressed patients in Korea and around the world is rapidly increasing every year. However, most mentally ill patients are not aware that they are suffering from the disease, so they do not receive adequate treatment. If depressive symptoms are neglected, they can lead to suicide, anxiety, and other psychological problems. Therefore, early detection and treatment of depression are very important in improving mental health. To address this problem, this study presented a deep learning-based depression tendency model using Korean social media text. After data were collected from Naver Knowledge iN, Naver Blog, Hidoc, and Twitter, the DSM-5 diagnostic criteria for major depressive disorder were used to classify and annotate classes according to the number of depressive symptoms. TF-IDF analysis and co-occurrence word analysis were then performed to examine the characteristics of each class of the constructed corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to generate a depression tendency classification model using various text features. Through this, the embedded text, sentiment score, and topic number for each document were calculated and used as text features. As a result, it was confirmed that the highest accuracy, 83.28%, was achieved when depression tendency was classified with the KorBERT algorithm by combining both the sentiment score and the topic of the document with the embedded text. This study is significant in that, by establishing a Korean depression tendency classification model with improved performance using various text features, it lays a foundation for detecting potential depression patients early among Korean online community users, enabling prompt treatment and prevention and thereby helping to promote the mental health of Korean society.
