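Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)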

P-ISSN 1013-0799
E-ISSN 2586-2073
KCI

Search term: 도서 범주화 (book categorization); search results: 3

A Study on Automatic Classification of Social Science Books Using Table of Contents Information and a kNN Classifier

이용구 (Associate Professor, Department of Library and Information Science, Keimyung University) 2020, Vol.37, No.1, pp.1-21 https://doi.org/10.3743/KOSIM.2020.37.1.001


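Abstract (translated from Korean)

This study applied automatic classification, using table of contents information, to 6,253 social science books from a university library's list of new arrivals. The classifier used the kNN algorithm, and the classification categories were the divisions of DDC class 300 that the library had assigned to the books. The classification features were the books' titles and tables of contents; the tables of contents were obtained from an Internet bookstore through an Open API. The automatic classification experiments showed that table of contents features are good features that improve both classification recall and classification precision. As a rich source of features, the table of contents was also found to mitigate the overfitting problem of imbalanced data. The study also found that law and education have high specificity within the social sciences, so title features alone yield good classification performance in those fields.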

Abstract

This study applied automatic classification using table of contents (TOC) text to 6,253 social science books from a university library's list of new arrivals. The k-nearest neighbors (kNN) algorithm was used as the classifier, and the ten second-level divisions of DDC main class 300, as assigned to the books by the library, were used as the classes (labels). The features were keywords extracted from the titles and TOCs of the books. The TOCs were obtained from an Internet bookstore through its OpenAPI. The results showed that TOC features improve both classification recall and precision. With its rich features, the TOC was also shown to reduce the overfitting problem of imbalanced data. Law and education have high topic specificity within the social sciences, so title features alone can yield good classification performance in these fields.
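The classification setup described in the abstract above can be illustrated with a minimal sketch. It is not the author's implementation: the toy book records, the English-only tokenizer, the raw term-frequency weighting, the cosine similarity measure, and the value k=3 are all assumptions made for illustration, and the function names are hypothetical. Only the general idea (kNN over title and TOC keywords, with DDC 300 divisions as labels) comes from the abstract.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for real keyword extraction)."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def vectorize(title, toc=""):
    """Term-frequency vector built from the title and, optionally, the TOC text."""
    return Counter(tokenize(title) + tokenize(toc))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy records: (title, TOC text, DDC division). Not data from the study.
TOY_BOOKS = [
    ("Contract law", "offer acceptance breach remedies court", "340"),
    ("Criminal procedure", "arrest trial evidence court appeal", "340"),
    ("Classroom teaching", "curriculum lesson students assessment", "370"),
    ("Learning theory", "students motivation curriculum teachers", "370"),
    ("Market economics", "supply demand prices inflation trade", "330"),
    ("Public finance", "taxes budget inflation trade spending", "330"),
]
train = [(vectorize(title, toc), label) for title, toc, label in TOY_BOOKS]

# A title that says little on its own; the TOC terms carry the topical signal.
query = vectorize("Introduction", "students curriculum assessment")
print(knn_predict(train, query, k=3))
```

In this toy run the uninformative title "Introduction" shares no terms with any training record, so the TOC terms alone decide the neighbors. That mirrors, in miniature, the abstract's point that TOC text enriches sparse title features. Korean text would need a morphological analyzer in place of the regular-expression tokenizer.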

An Analysis of Research Trends on the Digital Divide in Library and Information Science

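강인서 (Doctoral Student, Department of Library and Information Science Education, Kongju National University; Teacher, Daejeon Jaun Elementary School) ; 김혜진 (Assistant Professor, Department of Library and Information Science Education, Kongju National University) 2020, Vol.37, No.2, pp.333-352 https://doi.org/10.3743/KOSIM.2020.37.2.333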


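Abstract (translated from Korean)

To analyze research trends on the digital divide in library and information science, this study collected 195 papers published in four library and information science journals and coded them by research subject (11 subcategories), research purpose (4 subcategories), and research method (4 subcategories). These codes, together with author keywords, were used to build a keyword network with the Pathfinder algorithm, which was then analyzed. The analysis showed that, among information-vulnerable groups (research subjects), studies on people with disabilities, multicultural families, and the elderly accounted for 79.5%, indicating a concentration on specific groups. Research centered on the digital divide, people with disabilities, and public libraries, and extending to multicultural families and the elderly, actively addressed the state of information vulnerability and ways to resolve it. However, research aimed at the effects of resolving information vulnerability or at its influencing factors was limited to studies that designed and applied interventions involving the elderly, bibliotherapy, informatization education, information use, and reading programs, and then verified their effects. Finally, the most frequently used research method in digital divide research was a literature review combined with a case study or a survey.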

Abstract

This study analyzed research trends on the 'digital divide' in Library and Information Science. To this end, we coded research subjects into 11 subcategories, research objectives into 4 subcategories, and research methods into 4 subcategories, and constructed keyword networks using the Pathfinder algorithm. The analysis found that 79.5% of the studies addressed people with disabilities, multicultural families, and the elderly among information-vulnerable groups, showing that the research is concentrated on specific groups. In addition, digital divide studies have been actively conducted with the aim of resolving the information vulnerability of groups such as people with disabilities. We also found that these studies focused on verifying effectiveness by designing and applying treatments such as informatization education, information utilization, and reading programs. Lastly, the most frequently used research method in digital divide research was literature research combined with a case study or a questionnaire survey.
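The keyword-network step mentioned in the abstract above can be sketched as follows. The abstract does not state the Pathfinder parameters or how edge weights were derived, so this sketch assumes the commonly used PFNET(r = infinity, q = n - 1) variant and assumes that distance is the reciprocal of the keyword co-occurrence count. The function names and the toy keyword lists are hypothetical and are not data from the study.

```python
import math
from itertools import combinations

def cooccurrence(docs):
    """Count, for each keyword pair, how many documents contain both keywords."""
    counts = {}
    for keywords in docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts

def pathfinder(counts):
    """Edges kept by PFNET(r = infinity, q = n - 1); distance = 1 / co-occurrence (assumed)."""
    nodes = sorted({n for pair in counts for n in pair})
    dist = {(a, b): math.inf for a in nodes for b in nodes}
    for (a, b), c in counts.items():
        dist[(a, b)] = dist[(b, a)] = 1.0 / c
    # Minimax path distance: the smallest achievable "largest single step"
    # over all paths between two nodes (Floyd-Warshall with max in place of +).
    best = dict(dist)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via = max(best[(i, k)], best[(k, j)])
                if via < best[(i, j)]:
                    best[(i, j)] = via
    # Keep a direct link only if no indirect path beats it.
    return sorted(pair for pair in counts if dist[pair] <= best[pair])

# Hypothetical author-keyword lists, one per paper.
docs = [
    ["digital divide", "disabled", "public library"],
    ["digital divide", "disabled"],
    ["digital divide", "public library"],
    ["digital divide", "elderly"],
]
for edge in pathfinder(cooccurrence(docs)):
    print(edge)
```

In this toy network the weak direct link between "disabled" and "public library" is pruned, because the path through "digital divide" consists of two stronger links. What remains is the backbone of the most salient keyword associations, which is the usual reason for applying Pathfinder scaling to co-occurrence networks.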

Developing Facet Types for Fiction Retrieval Based on User-Generated Book Tags

심지영 (Independent Researcher) 2020, Vol.37, No.2, pp.225-249 https://doi.org/10.3743/KOSIM.2020.37.2.225


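Abstract (translated from Korean)

To improve the fiction search environment, this study aims to identify and systematize, from book tags, the various facet elements that fiction readers require when searching for fiction. Based on Ranganathan's PMEST fundamental facets, the basic facet system for fiction was defined as 1) the agents that form the fiction material, 2) the content and external properties that make up the fiction, 3) the acts through which readers interact with books, 4) spatial information related to fiction and reading activities, and 5) temporal information related to fiction and reading activities. Of approximately 310,000 tags assigned to 7,174 works of fiction, 3,730 core tags were selected and content-analyzed. As a result, various attributes were systematized around 25 top-level categories of fiction facets. The results of this study are expected to be applicable, in the form of facet navigation, to library OPACs and fiction databases.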

Abstract

The purpose of this study is to improve the fiction search environment by identifying and systematizing, from book tags, the various facet elements that users require in fiction search situations. Based on Ranganathan's PMEST formula, the basic facet system for fiction was defined as 1) the personality that forms the fiction material, 2) the content and external characteristics that compose the fiction, 3) reader interaction with books, 4) spatial information related to fiction and reading activities, and 5) time information related to fiction and reading activities. Out of approximately 310,000 tags assigned to 7,174 works of fiction, 3,730 core tags were selected and content-analyzed. As a result, various attributes were systematized around 25 top-level categories of fiction facets. The results of this study can be applied to facet navigation in OPACs and fiction databases in the future.
