정보관리학회지, 한국정보관리학회

1

A Study on Factors Affecting Biotechnology Researchers’ Intention to Share Research Data: Focusing on the Moderating Effect of Academic Reputation

김선(성균관대학교 문헌정보학과 석사졸업) 2022, Vol.39, No.1, pp.45-68 https://doi.org/10.3743/KOSIM.2022.39.1.045

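Abstract (translated from Korean)

To better understand researchers’ data sharing behavior, this study examined the factors that influence data sharing intention among biotechnology researchers and research students in South Korea. The 411 valid responses collected by e-mail were analyzed using PLS-SEM. The results are as follows. First, the norm of data sharing and academic reciprocity had direct positive effects on data sharing intention. Second, community trust had a significant effect on data sharing intention when academic reciprocity mediated the relationship between community trust and data sharing intention. Third, academic reputation showed significant moderating effects on the relationship between the norm of data sharing and academic reciprocity, on the relationship between the norm of data sharing and data sharing intention, and on the relationship between academic reciprocity and data sharing intention. The study is significant in that it applied Ostrom’s theory of collective action to the factors influencing Korean biotechnology researchers’ data sharing intention and found a moderating effect of academic reputation within the relationships among the variables. These results suggest the need to develop an academic reward system as a way to promote researchers’ data sharing behavior.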

Abstract

The objective of this study is to investigate the factors that influence biotechnology scientists’ data sharing intention. The study employed Ostrom’s theory of collective action. The target population comprised scientists and students in the biotechnology field in South Korea. A total of 411 responses collected by e-mail were used for the final data analysis. The findings are summarized as follows. First, the norm of data sharing and academic reciprocity were found to have significant, direct positive influences on data sharing intention. Second, perceived community trust was found to have a significant positive influence on data sharing intention when academic reciprocity was the mediator. Third, academic reputation showed moderating effects on the relationship between the norm of data sharing and academic reciprocity, and between the norm of data sharing and data sharing intention. These findings show that researchers’ data sharing behaviors can be approached through the mechanisms of trust, norms, reciprocity, and reputation, and they indicate the need to develop an academic reputation system to promote more data sharing by researchers.

2

Improvement Measures for SIARD_KR, a Transfer Tool for Administrative Information Datasets

변우영(명지대학교 기록정보관리학과) ; 임진희(명지대학교 기록정보과학전문대학원) 2022, Vol.39, No.1, pp.195-217 https://doi.org/10.3743/KOSIM.2022.39.1.195

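Abstract (translated from Korean)

SIARD_KR is a preservation tool for administrative information datasets, created by partially modifying SIARD, a technology developed by the Swiss Federal Archives for the long-term preservation of relational database content, to suit conditions in Korea. Previous studies have focused on how well SIARD can extract all the data in a relational database without loss. However, not all data in a database is meaningful information, that is, an administrative information dataset. This paper therefore starts from the question of whether SIARD_KR reflects the characteristics of administrative information datasets. It seeks to determine whether SIARD_KR is more than a tool that simply extracts data stored in a DB: whether it can identify and extract only meaningful information, and whether that information remains meaningful once separated from the original system. The purpose of this paper is to analyze the structure of SIARD_KR, identify the problems that can be expected, and propose improvement measures for them.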

Abstract

SIARD_KR is an administrative information dataset preservation tool. It is a partially modified version of SIARD, a technology developed by the Swiss Federal Archives for the long-term preservation of relational databases, adapted to suit Korea’s situation. Previous studies have focused on how effectively SIARD can extract all data contained in a relational database without loss. However, not all data contained in a database is meaningful information, that is, an administrative information dataset. This paper therefore begins from the question of whether SIARD_KR reflects the characteristics of administrative information datasets. We want to see whether SIARD_KR, rather than merely extracting data stored in the DB, is capable of identifying and extracting only meaningful information and of preserving its meaning even when it is separated from the original system. The purpose of this paper is to analyze the structure of SIARD_KR, identify expected problems, and suggest improvement measures for them.

3

Exploring Contextual Elements of Book Use to Improve Book Recommender Systems

심지영(연세대학교 대학도서관발전연구소) 2022, Vol.39, No.2, pp.299-324 https://doi.org/10.3743/KOSIM.2022.39.2.299

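Abstract (translated from Korean)

To identify the contextual elements of book use that have been overlooked in existing research on book recommender systems, this study used a think-aloud protocol to collect what 15 active book users with diverse book search backgrounds produced in six book search situations. Through content analysis grounded in the ‘appeal factor’, a theoretical concept from readers’ advisory services, the collected material was used to identify the internal and external appeal factors that affect book use, and concepts related to the information sources and search methods used in book searching were also broken down in detail. The results of this study can be used to extract and incorporate meaningful attribute data in the future design of book recommender systems.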

Abstract

To explore the contextual elements of book use that have been overlooked in existing book recommender system research, this study collected, through a think-aloud protocol, the content generated by 15 avid readers with various book search backgrounds in 6 book search situations. Content analysis of the collected material identified and categorized not only the internal and external appeal factors affecting book use, based on the ‘appeal factor’, a theoretical concept of the readers’ advisory service, but also the information sources and search methods used in book searching. The results of this study can be used to extract and reflect meaningful attribute data in the future design of book recommender systems.

4

A Study on the Development of a School Library Book Recommendation System Using Association Rules

임정훈(대전과학고등학교 교사) ; 조창제(NeuroEars 연구개발전담부서) ; 김종헌(대전과학고등학교 교사) 2022, Vol.39, No.3, pp.1-22 https://doi.org/10.3743/KOSIM.2022.39.3.001

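Abstract (translated from Korean)

The purpose of this study is to propose a book recommendation system that school libraries can use. The system applies an association rule-based algorithm to DLS loan data and is designed to provide personalized book recommendation services to school library users. To this end, association rules based on the Apriori algorithm and betweenness centrality analysis were applied, and detailed functions such as descriptive statistics, association rule generation, student-centered recommendation, and book-centered recommendation were implemented. In-depth interviews with teacher librarians were then conducted to gather opinions on using the system. The interviews yielded opinions on the necessity and difficulty of book recommendation, student responses, differences from existing recommendation methods and ways to use the system, and points for improvement, and the following points for discussion were proposed on that basis. First, long-term loan data needs to be provided in order to understand the characteristics of individual schools. Second, ways to integrate data by region or by school characteristics need to be discussed. Third, a book recommendation system provided by the Comprehensive Support System for Reading Education needs to be built. It is hoped that the proposals in this study will prompt varied discussion on applying personalized recommendation systems that can be used in school libraries.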

Abstract

The purpose of this study is to propose a book recommendation system that can be used in school libraries. The system applies an association rule-based algorithm to DLS lending data and is designed to provide personalized book recommendation services to school library users. To this end, association rules based on the Apriori algorithm and betweenness centrality analysis were applied, and detailed functions such as descriptive statistics, generation of association rules, student-centered recommendation, and book-centered recommendation were implemented. Opinions on the use of the system were then gathered through in-depth interviews with teacher librarians. The interviews yielded opinions on the necessity and difficulty of book recommendation, student responses, differences from existing recommendation methods, ways of using the system, and improvements, and on this basis the following points were proposed for discussion. First, long-term lending data must be provided in order to understand the characteristics of each school. Second, plans for integrating data by region or by school characteristics need to be discussed. Third, a book recommendation system provided by the Comprehensive Support System for Reading Education needs to be established. It is hoped that the proposals in this study will lead to further discussion on applying personalized recommendation systems that can be used in school libraries.
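
As an illustration of the association-rule idea named in this abstract, the following is a minimal sketch rather than the authors' system. It covers only single-book rules (the first two levels of an Apriori-style search), and the loan histories, the function name `pair_rules`, and the threshold values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Mine rules lhs -> rhs between single books from loan histories.

    Returns tuples (lhs, rhs, support, confidence, lift).
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    item_counts = Counter(item for b in baskets for item in b)

    # Apriori pruning: a pair can only be frequent if both of its items are.
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for b in baskets:
        for pair in combinations(sorted(b & frequent), 2):
            pair_counts[pair] += 1

    rules = []
    for (a, b), c in pair_counts.items():
        support = c / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = c / item_counts[lhs]
            if confidence >= min_confidence:
                lift = confidence / (item_counts[rhs] / n)
                rules.append((lhs, rhs, support, confidence, lift))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical loan histories: one list of borrowed titles per student.
loans = [
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "C"],
    ["B", "C"],
    ["A", "B", "D"],
]
rules = pair_rules(loans)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A rule such as `A -> B` can then be read as "students who borrowed A often also borrowed B". A full Apriori run would continue to larger itemsets in the same level-wise manner.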

5

A Study on Identifying User Preference Factors Using Review Information

송성전(독립연구자) ; 심지영(연세대학교 대학도서관발전연구소) 2022, Vol.39, No.3, pp.311-336 https://doi.org/10.3743/KOSIM.2022.39.3.311

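Abstract (translated from Korean)

To identify the preference factors that influence book users’ book recommendations in the library information service environment, this study conducted a content analysis of review data from Goodreads, a social cataloging service built on the participation of book users worldwide. To examine user preferences in finer detail, sub-datasets by rating group, by book, and by user were constructed during sample selection, and stratified sampling based on topic modeling of the review text was performed so that diverse topics would be evenly represented. As a result, a total of 90 concepts related to preference factors were identified across 7 categories (‘Content’, ‘Character’, ‘Writing’, ‘Reading’, ‘Author’, ‘Story’, ‘Form’), and the study identified both the general preference factors that appear according to rating and the patterns of preference factors that appear for books and users with pronounced likes and dislikes. By identifying the specific patterns of user preference factors, the results are expected to contribute to more refined recommendations in future recommender systems.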

Abstract

This study conducted a content analysis of review data from Goodreads, a social cataloging service built on the participation of book users around the world, to identify the preference factors that affect book users’ book recommendations in the library information service environment. To understand user preferences in more detail, sub-datasets for each rating group, each book, and each user were constructed in the sample selection process. Stratified sampling was also performed based on the results of topic modeling of the review text so that various topics would be included. As a result, a total of 90 preference factors belonging to 7 categories (‘Content’, ‘Character’, ‘Writing’, ‘Reading’, ‘Author’, ‘Story’, ‘Form’) were identified. In addition, the general preference factors revealed according to the ratings, as well as the patterns of preference factors revealed in books and users with clear likes and dislikes, were identified. By identifying specific aspects of user preference factors, the results of this study are expected to contribute to more sophisticated recommendations in future recommendation systems.

6

Topic Modeling of Insomnia Social Data Using BERTopic and Construction of a Deep Learning Model for Automatic Classification of Insomnia-Tendency Documents

고영수(연세대학교 문헌정보학과 석사과정) ; 이수빈(연세대학교 문헌정보학과 박사과정) ; 차민정(연세대학교 소셜오믹스 연구센터) ; 김성덕(연세대학교 문헌정보학과 석사과정) ; 이주희(연세대학교 문헌정보학과 석사과정) ; 한지영(연세대학교 문헌정보학과 석사과정) ; 송민(연세대학교 문헌정보학과) 2022, Vol.39, No.2, pp.111-129 https://doi.org/10.3743/KOSIM.2022.39.2.111

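Abstract (translated from Korean)

Insomnia is a chronic disease of modern society, with the number of patients having risen by more than 20% over the past five years. Because the personal and social problems caused by lack of sleep are serious and the triggers of insomnia act in combination, diagnosis and treatment are important. This study collected 5,699 data items from ‘insomnia’, the insomnia community on ‘Reddit’, a social media platform where opinions are freely expressed, and built an insomnia corpus by tagging them as insomnia-tendency or non-tendency documents on the basis of the International Classification of Sleep Disorders (ICSD-3) criteria and guidelines prepared in consultation with a psychiatrist. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained on the corpus, and in the performance evaluation RoBERTa performed best in accuracy, precision, recall, and F1 score. For an in-depth analysis of the insomnia social data, topic modeling was carried out with BERTopic, a newer method that addresses the weaknesses of the widely used LDA. Hierarchical clustering analysis revealed 8 topic groups (‘Negative emotions’, ‘Advice, help, and gratitude’, ‘Insomnia-related diseases’, ‘Sleeping pills’, ‘Exercise and eating habits’, ‘Physical characteristics’, ‘Activity characteristics’, ‘Environmental characteristics’). Users expressed negative emotions and sought help and advice in the insomnia community. They also mentioned diseases related to insomnia, discussed the use of sleeping pills, and expressed interest in exercise and eating habits. The insomnia-related characteristics found were physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as zombies, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.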

Abstract

Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% in the last 5 years. It requires diagnosis and treatment because the individual and social problems that arise from a lack of sleep are serious and the triggers of insomnia are complex. This study collected 5,699 data items from ‘insomnia’, a community on ‘Reddit’, a social media platform where opinions are freely expressed. Based on the International Classification of Sleep Disorders (ICSD-3) standard and guidelines prepared with the help of experts, an insomnia corpus was constructed by tagging the items as insomnia tendency documents or non-insomnia tendency documents. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. In the performance evaluation, RoBERTa showed the highest performance, with an accuracy of 81.33%. To analyze the insomnia social data in depth, topic modeling was performed using the newly emerged BERTopic method, which compensates for the weaknesses of LDA, a method widely used in the past. The analysis identified 8 subject groups (‘Negative emotions’, ‘Advice and help and gratitude’, ‘Insomnia-related diseases’, ‘Sleeping pills’, ‘Exercise and eating habits’, ‘Physical characteristics’, ‘Activity characteristics’, ‘Environmental characteristics’). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. In addition, they mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. As insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as zombies, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.

7

A Case Study on Applying AI-OCR to Convert Paper Records into Data

안세진(김포시 행정과) ; 황현호(㈜악어디지털) ; 임진희(이화여자대학교 정책과학과) 2022, Vol.39, No.3, pp.165-193 https://doi.org/10.3743/KOSIM.2022.39.3.165

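Abstract (translated from Korean)

Digital technology can be said to be at the center of change in the modern work environment. In particular, in ordinary public institutions, which evidence their work through records produced in business management systems and document production systems, the records management system is the work environment itself. To respond proactively to the era of Fourth Industrial Revolution technology and to innovate its work environment, Gimpo City applied to the 2021 public-sector cloud leading project of the National Information Society Agency (NIA), was confirmed as a leading institution, and with 330 million won in support carried out a project to strengthen record search and use functions through public cloud-based AI-OCR. Through this project, non-electronic records, previously limited to searches dependent on standardized index values and to image viewing, were converted into data, and a 98% recognition rate was achieved by applying the new AI-OCR technology. Because the use of digital technology in a public institution improved work efficiency, raised productivity, reduced development costs, and raised the level of records management services for internal and external users, this case study of combining new technology with records management aims to share a direction for strengthening the intrinsic expertise of the records management field and a case of work environment innovation.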

Abstract

It can be said that digital technology is at the center of the change in the modern work environment. In particular, in general public institutions that prove their work with records produced by business management systems and document production systems, the record management system is the work environment itself. Gimpo City applied for the 2021 public-sector cloud leading project of the National Information Society Agency (NIA) in order to respond proactively to the era of 4th industrial revolution technology, and implemented a public cloud-based AI-OCR technology enhancement project with 330 million won in support. Through this project, non-electronic records, which had been limited to image viewing and to searches that depend on standardized index values, were converted into data. In addition, a 98% recognition rate was realized by applying a new technology called AI-OCR. Since digital technology has been used to improve work efficiency and productivity, reduce development costs, and raise the level of record management services for internal and external users, we would like to share a direction for enhancing expertise in record management and a case of implementing work environment innovation.

8

A Study on the Effect of Non-Face-to-Face Services on Relieving Library Anxiety: Focusing on Users of K University Library

이경화(건국대학교 일반대학원 문헌정보학과) ; 노영희(건국대학교 문헌정보학과) 2022, Vol.39, No.1, pp.17-44 https://doi.org/10.3743/KOSIM.2022.39.1.017

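Abstract (translated from Korean)

This study aimed to analyze the library anxiety factors of university library users and to suggest measures concerning the effect of non-face-to-face services on relieving library anxiety. To this end, it reviewed cases of university libraries’ user service responses to the COVID-19 situation; selected the 40 institutions with the highest number of library visits per enrolled student among Korean four-year university libraries with between 5,000 and 10,000 enrolled students and analyzed their non-face-to-face information services and programs; and surveyed enrolled students who use K University Library with a reconstructed K-LAS. Frequency analysis, descriptive statistics, exploratory factor analysis, reliability analysis, correlation analysis, and multiple regression analysis were applied to the collected data to analyze users’ library anxiety factors. The study identified the relationships between five library anxiety factors (physical and environmental factors of the library, material search and selection factors, digital information system factors, librarian (staff) factors, and psychological and emotional factors) and non-face-to-face service activation factors, and examined the effect of the activation factors on the anxiety factors. The results showed that non-face-to-face service activation factors had the greatest influence on the digital information system anxiety factor. Based on the analysis, the study sought to derive ways to relieve users’ library anxiety by activating non-face-to-face services.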

Abstract

The purpose of this study was to analyze the library anxiety factors of university library users and to propose measures concerning the effect of non-face-to-face services on relieving library anxiety. To this end, we reviewed cases of university library user service responses to the COVID-19 crisis, selected the 40 schools with the highest number of library visitors per student from among domestic four-year university libraries with between 5,000 and 10,000 students, and analyzed their non-face-to-face information services and programs. K-LAS was reconstructed and used to survey current students using the K university library, and frequency analysis, descriptive statistical analysis, exploratory factor analysis, reliability analysis, correlation analysis, and multiple regression analysis were applied to analyze users’ library anxiety factors. We identified the relationship between non-face-to-face service activation factors and 5 library anxiety factors (physical/environmental factors of the library, material search and selection factors, digital information system factors, librarian (staff) factors, and psychological/emotional factors) and examined the influence of the activation factors on the anxiety factors. The results showed that non-face-to-face service activation factors had the greatest influence on the library digital information system anxiety factor. Based on the analysis results, the study sought to derive a plan to relieve users’ library anxiety by activating non-face-to-face services.

9

Deep Learning-Based Analysis of Depression Tendency in Korean Social Media Text

박서정(연세대학교 문헌정보학과) ; 이수빈(연세대학교 문헌정보학과) ; 김우정(연세대학교 의과대학 용인세브란스병원 정신건강의학교실) ; 송민(연세대학교 문헌정보학과) 2022, Vol.39, No.1, pp.91-117 https://doi.org/10.3743/KOSIM.2022.39.1.091

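Abstract (translated from Korean)

The number of patients with depression is increasing every year, in Korea and around the world. However, most patients with mental illness do not recognize that they are ill, so appropriate treatment is not provided. Because neglected depressive symptoms can develop into suicide, anxiety, and other psychological problems, early detection and treatment of depression are very important for promoting mental health. To address this problem, this study proposed a deep learning-based depression tendency model using Korean social media text. Data were collected from Naver KnowledgeiN, Naver Blog, Hidoc, and Twitter and annotated with classes according to the number of depressive symptoms, using the DSM-5 diagnostic criteria for major depressive disorder. TF-IDF analysis and co-occurrence word analysis were then conducted to examine the characteristics of each class in the corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed in order to build a depression tendency classification model that uses diverse text features. From these, the embedded text, sentiment score, and topic number of each document were derived and used as text features. The highest accuracy, 83.28%, was achieved when depression tendency was classified with the KorBERT algorithm using the embedded text combined with both the document’s sentiment score and its topic. The study is significant in that, by building a Korean depression tendency classification model with improved performance from diverse text features, it lays a foundation for detecting potential depression patients among Korean online community users early, enabling prompt treatment and prevention, and thereby helping to promote mental health in Korean society.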

Abstract

The number of depressed patients in Korea and around the world is rapidly increasing every year. However, most mentally ill patients are not aware that they are suffering from the disease, so adequate treatment is not being provided. If depressive symptoms are neglected, they can lead to suicide, anxiety, and other psychological problems. Therefore, early detection and treatment of depression are very important in improving mental health. To address this problem, this study presented a deep learning-based depression tendency model using Korean social media text. After data were collected from Naver KnowledgeiN, Naver Blog, Hidoc, and Twitter, the DSM-5 major depressive disorder diagnosis criteria were used to classify and annotate classes according to the number of depressive symptoms. Afterwards, TF-IDF analysis and word co-occurrence analysis were performed to examine the characteristics of each class of the constructed corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to generate a depression tendency classification model using various text features. Through this, the embedded text, sentiment score, and topic number for each document were calculated and used as text features. As a result, it was confirmed that the highest accuracy rate of 83.28% was achieved when depression tendency was classified based on the KorBERT algorithm by combining both the sentiment score and the topic of the document with the embedded text. This study is significant in that, by establishing a Korean depression tendency classification model with improved performance using various text features, it lays a foundation for detecting potential depression patients among Korean online community users early, enabling rapid treatment and prevention, and thereby helping to promote mental health in Korean society.
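
To make the TF-IDF step mentioned in this abstract concrete, here is a small sketch under stated assumptions. It uses one common weighting variant (relative term frequency times log(N/df)), which is not necessarily the variant the authors used, and the tokenized documents and the function name `tf_idf` are hypothetical English stand-ins for the Korean corpus.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return scores


# Hypothetical tokenized posts.
docs = [
    ["sleep", "tired", "sleep"],
    ["tired", "work"],
    ["work", "stress", "tired"],
]
scores = tf_idf(docs)
top_term = max(scores[0], key=scores[0].get)
print(top_term, round(scores[0][top_term], 4))
```

A term that appears in every document, such as `tired` here, receives a score of zero, which is why TF-IDF is useful for surfacing the words that distinguish one class of documents from another.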

10

Exploring Incivility Topics Related to COVID-19 on Twitter: An Analysis of Hate Targets and Keywords

김규리(성균관대학교 문헌정보학과 석사과정) ; 오찬희(성균관대학교 문헌정보학과 석사과정) ; 주영준(연세대학교 문헌정보학과) 2022, Vol.39, No.1, pp.331-350 https://doi.org/10.3743/KOSIM.2022.39.1.331

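Abstract (translated from Korean)

To identify COVID-19 incivility topics and COVID-19 hate sentiment arising from coronavirus disease 2019 (hereafter COVID-19), this study analyzed COVID-19-related posts on Twitter, a social media platform. Posts written over the 21 months from December 1, 2019 to August 31, 2021 were collected by hate target (hate toward regions and public facilities, hate toward specific population groups, and hate toward religion) and preprocessed, and a total of 63,802 posts were analyzed. Incivility topics and hate keywords for each target were identified through frequency analysis by hate target, dynamic topic modeling, and keyword co-occurrence network analysis. First, frequency analysis showed a relatively increasing trend in hate toward regions and public facilities and a relatively decreasing trend in hate toward specific population groups and religion. Second, dynamic topic modeling showed that hate toward regions and public facilities appeared as ‘hate toward Daegu and the Gyeongbuk region’, ‘interregional hate’, and ‘hate toward public facilities’; hate toward specific population groups as ‘hate toward China’, ‘virus spreaders’, and ‘sanctions on outdoor activities’; and hate toward religion as ‘Shincheonji’, ‘Christianity’, ‘infection within religious groups’, ‘refusal of quarantine obligations’, and ‘criticism of confirmed cases’ movements’. Third, keyword co-occurrence network analysis identified the following core keywords: for regions and public facilities (Corona, Daegu, confirmed cases, Shincheonji, Gyeongbuk, region); for specific population groups (Coronavirus, Wuhan pneumonia, Wuhan, China, Chinese, people, entry, ban); and for religion (Shincheonji, Corona, church, Daegu, confirmed cases, infection). By identifying COVID-19 hate targets and keywords in Korea through social media, this study sought to understand uncivil public opinion related to COVID-19. In particular, applying data mining techniques to COVID-19-related hate, a topic not attempted in previous studies, to explore the incivility topics and hate sentiment the public expresses on social media is meaningful for understanding public opinion. The results also have practical implications in that they can serve as basic data for establishing institutions and policies on cultural communication in preparation for the post-COVID-19 era.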

Abstract

This study aims to understand topics of incivility related to COVID-19 by analyzing Twitter posts that include COVID-19-related hate speech. To this end, a total of 63,802 tweets created between December 1st, 2019, and August 31st, 2021, covering three targets of hate speech (regions and public facilities, specific groups of people, and religion), were analyzed. Frequency analysis, dynamic topic modeling, and keyword co-occurrence network analysis were used to explore topics and keywords. 1) Results of frequency analysis revealed that hate against regions and public facilities showed a relatively increasing trend while hate against specific groups of people and religion showed a relatively decreasing trend. 2) Results of dynamic topic modeling analysis showed topics for each of the three targets of hate speech. Topics for regions and public facilities included “Daegu, Gyeongbuk local hate”, “interregional hate”, and “public facility hate”; those for groups of people included “China hate”, “virus spreaders”, and “outdoor activity sanctions”; and those for religion included “Shincheonji”, “Christianity”, “religious infection”, “refusal of quarantine”, and “places visited by confirmed cases”. 3) Similarly, results of keyword co-occurrence network analysis revealed keywords for the three targets: regions and public facilities (Corona, Daegu, confirmed cases, Shincheonji, Gyeongbuk, region); specific groups of people (Coronavirus, Wuhan pneumonia, Wuhan, China, Chinese, People, Entry, Banned); and religion (Shincheonji, Corona, Church, Daegu, confirmed cases, infection). This study attempted to grasp uncivil public opinion related to COVID-19 by identifying domestic COVID-19 hate targets and keywords using social media. In particular, it is meaningful for understanding public opinion that data mining techniques were used to explore the incivility topics and hate sentiment expressed on social media regarding COVID-19-related hate, which had not been attempted in previous studies. In addition, the results of this study have practical implications in that they can serve as basic data for establishing systems and policies for cultural communication in preparation for the post-COVID-19 era.
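
The keyword co-occurrence network analysis named in this abstract can be sketched roughly as follows. This is only an illustration of the general technique, not the authors' pipeline: the tokenized posts are invented toy data that reuse a few of the keywords reported above, and the helper names `cooccurrence` and `weighted_degree` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(posts):
    """Count how many posts each unordered keyword pair appears in together."""
    edges = Counter()
    for tokens in posts:
        # Deduplicate and sort so each pair has one canonical key per post.
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights attached to each keyword (a simple centrality)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical tokenized posts.
posts = [
    ["corona", "daegu", "confirmed"],
    ["corona", "daegu"],
    ["corona", "church", "confirmed"],
]
edges = cooccurrence(posts)
degree = weighted_degree(edges)
print(degree.most_common(3))
```

Keywords with the highest weighted degree sit at the center of the network, which is one simple way to pick out core keywords for each hate target.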
