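Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)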

P-ISSN 1013-0799
E-ISSN 2586-2073
KCI

Search term: 연구주제분석 (research topic analysis), search results: 13

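김선욱 (Department of Library and Information Science, College of Social Sciences, Kyungpook National University) ; 양기덕 (Yeongnam Old Documents Archive Center) 2022, Vol.39, No.3, pp.99-132 https://doi.org/10.3743/KOSIM.2022.39.3.099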

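Abstract (translated from Korean)

The purpose of this study is to propose Augmented and Extended Topics (AET), a methodology for synthesizing LDA topic modeling results and BERTopic topic modeling results, and to use it to analyze research topics in the field of library and information science (LIS). To examine AET in practice, bibliographic data for 55,442 journal articles published between January 2001 and October 2021 in 85 LIS journals indexed in Web of Science were analyzed. AET builds a WORD2VEC-based cosine similarity matrix that captures the relationships between the two topic modeling results, extracts Augmented Topics (AT) by repeatedly reordering and partitioning the matrix within the range where the semantic relationships in the matrix remain valid, and then determines Extended Topics (ET) from the remaining region using the harmonic mean of the mean cosine similarity rank and the BERTopic topic size rank. A comparison of the AET results with the LDA topic modeling result derived as the optimal standard showed that AT made the LDA topics more concrete and fine-grained, while ET discovered valid topics. The performance of AT (Augmented Topics) was at or above that of LDA, and ET (Extended Topics) performed at a level similar to LDA in most cases, with a few exceptions.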

Abstract

The purpose of this study is to propose AET (Augmented and Extended Topics), a novel method for synthesizing LDA and BERTopic results, and to apply it experimentally to an analysis of recently published LIS articles. To this end, 55,442 abstracts from 85 LIS journals in the WoS database, published between January 2001 and October 2021, were analyzed. AET first constructs a WORD2VEC-based cosine similarity matrix between the LDA and BERTopic results, then extracts AT (Augmented Topics) by repeating the matrix reordering and segmentation procedures for as long as the semantic relations remain valid, and finally determines ET (Extended Topics) by removing any LDA-related residual subtopics from the matrix and ordering the remainder by (BERTopic topic size rank, inverse cosine similarity rank). A comparison with the baseline LDA result shows that AT effectively concretized the original LDA topic model and that ET discovered new meaningful topics that LDA did not. In the qualitative performance evaluation, AT performs better than LDA, while ET shows similar performance except in a few cases.
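
The two computations named in the abstract, a cosine similarity matrix between LDA and BERTopic topics and a rank-based ordering of the remaining BERTopic topics, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the topic vectors are toy 2-D values standing in for Word2Vec-derived topic vectors (how those vectors are built is not specified in the abstract), and the rank directions (rank 1 for the lowest mean similarity and for the largest topic) are one reading of "inverse cosine similarity rank". The matrix reordering and segmentation step is not covered.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(lda_vecs, bertopic_vecs):
    """Rows are LDA topics, columns are BERTopic topics."""
    return [[cosine(u, v) for v in bertopic_vecs] for u in lda_vecs]


def column_means(matrix):
    """Mean similarity of each BERTopic topic (column) to all LDA topics."""
    n_rows = len(matrix)
    return [sum(row[j] for row in matrix) / n_rows for j in range(len(matrix[0]))]


def ranks(values, descending):
    """1-based ranks; tied values keep their original order."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rank_extended_candidates(mean_sims, topic_sizes):
    """Order candidate indices by the harmonic mean of two ranks (smaller is better).

    Assumption: rank 1 goes to the lowest mean similarity to the LDA topics
    and to the largest BERTopic topic.
    """
    sim_rank = ranks(mean_sims, descending=False)
    size_rank = ranks(topic_sizes, descending=True)
    scores = [2 * s * z / (s + z) for s, z in zip(sim_rank, size_rank)]
    return sorted(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    # Toy vectors and sizes, invented for illustration only.
    lda = [[1.0, 0.0], [0.0, 1.0]]
    bertopic = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
    sims = similarity_matrix(lda, bertopic)
    print(sims)
    print(rank_extended_candidates(column_means(sims), [100, 80, 300]))
```

The harmonic mean rewards candidates that rank well on both criteria at once: a topic that is large but already close to the LDA topics, or novel but tiny, is pushed down the list.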

A Study on Status Analysis and Operation Strategies for Establishing a Mid- to Long-Term Development Plan for Anyang City Public Libraries

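장인호 (Department of Library and Information Science, Daejin University) ; 황금숙 (Department of Library and Media Information, Daelim University) ; 송민선 (Department of Library and Media Information, Daelim University) 2022, Vol.39, No.1, pp.145-170 https://doi.org/10.3743/KOSIM.2022.39.1.145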

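Abstract (translated from Korean)

This study was conducted to derive operation strategies for establishing a mid- to long-term comprehensive library development plan that takes the identity and distinctive characteristics of Anyang City into account. To this end, the study proceeded as follows. First, to understand Anyang City's internal and external environment and its regional characteristics, various documents and statistical data related to Anyang City were collected, analyzed, and organized. Second, the operational status of Anyang City's public libraries was examined on the basis of the ｢National Library Statistics System｣, the ｢Gyeonggido Public Library Yearbook｣, and various library-related laws and regulations. Third, a survey consisting of open-ended questions was administered to 26 librarians working in Anyang City public libraries and analyzed, in order to gather opinions ranging from the identity of Anyang City to current library operations as a whole. Finally, reflecting the status analysis above together with recent library trends, policies, and the sociocultural environment, specific operation strategies that can serve as a foundation for a future mid- to long-term development plan for Anyang City public libraries were proposed in six areas: ① organizational structure and staffing, ② facility planning for balanced regional development, ③ direction of collection development and preservation, ④ specialized subject service plans, ⑤ establishment of a cooperation system, and ⑥ public relations plans.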

Abstract

The purpose of this study is to derive operation strategies for establishing a mid- to long-term comprehensive library development plan that considers the identity and specific characteristics of Anyang City. For this purpose, the study proceeded as follows. First, to understand the internal and external environment and regional characteristics of Anyang City, various literature and statistical data related to Anyang City were collected, analyzed, and organized. Second, the operational status of Anyang municipal libraries was analyzed using data such as the ｢National Library Statistics System｣, the ｢Gyeonggido Public Library Yearbook｣, and various library-related laws. Third, a survey with open-ended questions was conducted with twenty-six librarians working in Anyang municipal libraries to collect opinions on the identity of Anyang and the overall operation of the libraries. Lastly, reflecting the status analysis, the latest library trends, policies, and the sociocultural environment, detailed operation strategies that can serve as a basis for establishing a future mid- to long-term development plan for Anyang municipal libraries were proposed. These strategies were divided into six areas: (1) plans for the organizational system and staffing, (2) facility plans for balanced regional development, (3) the direction of collection development and preservation, (4) service plans for special subject materials, (5) plans for establishing a cooperation system, and (6) public relations plans.

Topic Modeling of Insomnia Social Data Using BERTopic and Construction of a Deep Learning Automatic Classification Model for Insomnia-Tendency Documents

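고영수 (Master's student, Department of Library and Information Science, Yonsei University) ; 이수빈 (Doctoral student, Department of Library and Information Science, Yonsei University) ; 차민정 (Social Omics Research Center, Yonsei University) ; 김성덕 (Master's student, Department of Library and Information Science, Yonsei University) ; 이주희 (Master's student, Department of Library and Information Science, Yonsei University) ; 한지영 (Master's student, Department of Library and Information Science, Yonsei University) ; 송민 (Department of Library and Information Science, Yonsei University) 2022, Vol.39, No.2, pp.111-129 https://doi.org/10.3743/KOSIM.2022.39.2.111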

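Abstract (translated from Korean)

Insomnia is a chronic disease of modern society, with the number of patients increasing by more than 20% over the last five years. It is a condition for which diagnosis and treatment are important, because the individual and social problems caused by a lack of sleep are serious and the factors that trigger insomnia act in combination. This study collected 5,699 data items from ‘insomnia’, the insomnia community on ‘Reddit’, a social media platform where opinions are expressed freely, and built an insomnia corpus by tagging the items as insomnia-tendency or non-tendency documents, based on the International Classification of Sleep Disorders (ICSD-3) criteria and guidelines prepared in consultation with a psychiatrist. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained with the corpus as training data, and in the performance evaluation RoBERTa showed the highest performance in accuracy, precision, recall, and F1 score. To analyze the insomnia social data in depth, topic modeling was performed with BERTopic, a newly introduced method that compensates for weaknesses of the previously widely used LDA. Hierarchical clustering analysis identified eight topic groups (‘negative emotions’, ‘advice, help, and gratitude’, ‘insomnia-related diseases’, ‘sleeping pills’, ‘exercise and eating habits’, ‘physical characteristics’, ‘activity characteristics’, ‘environmental characteristics’). Users expressed negative emotions and sought help and advice in the insomnia community. They also mentioned diseases related to insomnia, discussed the use of sleeping pills, and expressed interest in exercise and eating habits. The insomnia-related characteristics identified were physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as zombies, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.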

Abstract

Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% in the last 5 years. It is a serious disease that requires diagnosis and treatment, because the individual and social problems caused by a lack of sleep are severe and the triggers of insomnia are complex. This study collected 5,699 data items from ‘insomnia’, a community on ‘Reddit’, a social media platform where opinions are expressed freely. Based on the International Classification of Sleep Disorders (ICSD-3) standard and guidelines prepared with the help of experts, an insomnia corpus was constructed by tagging the items as insomnia-tendency or non-insomnia-tendency documents. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. In the performance evaluation, RoBERTa showed the highest performance, with an accuracy of 81.33%. To analyze the insomnia social data in depth, topic modeling was performed using BERTopic, a newly emerged method that compensates for the weaknesses of LDA, which has been widely used in the past. The analysis identified 8 subject groups (‘Negative emotions’, ‘Advice and help and gratitude’, ‘Insomnia-related diseases’, ‘Sleeping pills’, ‘Exercise and eating habits’, ‘Physical characteristics’, ‘Activity characteristics’, ‘Environmental characteristics’). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. In addition, they mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. As insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as zombies, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.
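
For readers unfamiliar with the evaluation measures named for the classifier comparison (accuracy, precision, recall, F1 score), the following is a minimal sketch of how they are computed for a two-class tagging task such as insomnia-tendency versus non-tendency. The function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration; they are not the study's data or evaluation code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels: 1 = insomnia-tendency document, 0 = non-tendency document.
    gold = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(gold, predicted))
```

Reporting all four measures matters when the two classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.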
