Journal of the Korean Society for Information Management, Korean Society for Information Management

1

Yang, Kiduk (Kyungpook National University) 2015, Vol.32, No.1, pp.7-22 https://doi.org/10.3743/KOSIM.2015.32.1.007

Abstract

This paper describes a Web search optimization study that investigates both static and dynamic tuning methods for optimizing system performance. We extended the conventional fusion approach by introducing a “dynamic tuning” process with which to optimize the fusion formula that combines the contributions of diverse sources of evidence on the Web. By engaging in an iterative dynamic tuning process, in which we successively fine-tuned the fusion parameters based on cognitive analysis of immediate system feedback, we were able to significantly increase retrieval performance. Our results show that exploiting the richness of the Web search environment by combining multiple sources of evidence is an effective strategy.
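
The abstract does not give the fusion formula itself. As a minimal sketch only, assuming a weighted linear combination of normalized per-source scores (a common fusion form, not necessarily the one used in the paper), the effect of tuning fusion parameters on a ranking can be illustrated as follows. The source names, scores, and weights are hypothetical.

```python
from typing import Dict, List, Tuple


def fuse_scores(
    evidence: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank documents by a weighted sum of per-source scores.

    `evidence` maps a source name to {doc_id: score}; a document missing
    from a source simply contributes nothing for that source.
    """
    fused: Dict[str, float] = {}
    for source, scores in evidence.items():
        weight = weights.get(source, 0.0)
        for doc_id, score in scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    # Highest fused score first; ties broken by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical normalized scores from three sources of evidence.
evidence = {
    "text": {"d1": 0.9, "d2": 0.4, "d3": 0.1},
    "links": {"d1": 0.1, "d2": 0.8, "d3": 0.3},
    "anchor": {"d2": 0.5, "d3": 0.9},
}

# Starting point: text evidence only.
static_weights = {"text": 1.0, "links": 0.0, "anchor": 0.0}
# One hypothetical tuning step that gives the other sources some weight.
tuned_weights = {"text": 0.5, "links": 0.3, "anchor": 0.2}

static_ranking = [doc for doc, _ in fuse_scores(evidence, static_weights)]
tuned_ranking = [doc for doc, _ in fuse_scores(evidence, tuned_weights)]
```

In an iterative tuning loop of the kind the abstract describes, the weights would be adjusted after inspecting each ranking and the fusion re-run; how that inspection was done is not specified here.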

2

A Study on Optimizing the Number of Training Documents in Text Categorization

심경 (아이리스닷넷, Iris.net) 2006, Vol.23, No.4, pp.277-294 https://doi.org/10.3743/KOSIM.2006.23.4.277

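Abstract (translated from the Korean)

This study used a kNN classifier to determine what level of categorization performance can be expected when categorization techniques are applied to document classification in a real system environment, and what per-category training set size is best for training the classifier to reach adequate performance. As the experimental collection, 8 categories holding at least 2,556 documents each were selected from a database of some 150,000 records in live service. From these, eight sub-collections were built with the number of training documents per category increasing stepwise from 20 (Tr20) to 2,000 (Tr2000), and categorization performance was compared across the sub-collections by training set size. The macro-averaged performance of the eight sub-collections was an F1 of 30%, lower than the typical kNN performance reported in previous studies. Of the eight sub-collections, Tr100, with 100 training documents per category, reached an F1 of 31% and was judged the optimal training set size in terms of cost-effectiveness. In addition, the accuracy of the subject categories assigned to the experimental collection was checked by manual reclassification, and its relationship with per-category categorization performance was used to strengthen the credibility of this conclusion.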

Abstract

This paper examines the level of categorization performance in a real-life collection of article abstracts in the fields of science and technology, and tests the optimal number of documents per category in a training set using a kNN classifier. The corpus is built by first choosing categories that hold more than 2,556 documents and then randomly selecting 2,556 documents per category. It is further divided into eight subsets with different training set sizes: each set is randomly selected to build training documents ranging from 20 documents (Tr20) to 2,000 documents (Tr2000) per category. The categorization performances of the eight subsets are compared. The average performance of the eight subsets is 30% in F1 measure, which is relatively poor compared to the findings of previous studies. The experimental results suggest that, among the eight subsets, Tr100 is the optimal size for training a kNN classifier. In addition, the correctness of the subject categories assigned to the training sets is probed by manually reclassifying the training sets, in order to support the above conclusion by establishing a relation between that correctness and categorization performance.
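
The Korean abstract describes the reported 30% figure as a macro average of F1. As a rough illustration of what that measure computes, the sketch below calculates per-category F1 from gold and predicted labels and averages the values with equal weight per category. The labels are hypothetical and the code is not taken from the paper.

```python
from typing import List


def macro_f1(gold: List[str], predicted: List[str]) -> float:
    """Average the per-category F1 scores, weighting each category equally."""
    categories = sorted(set(gold) | set(predicted))
    f1_values = []
    for category in categories:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == category and p == category)
        fp = sum(1 for g, p in pairs if g != category and p == category)
        fn = sum(1 for g, p in pairs if g == category and p != category)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the category never occurs.
        denominator = 2 * tp + fp + fn
        f1_values.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_values) / len(f1_values)


# Hypothetical gold labels and classifier output for four documents.
gold_labels = ["a", "a", "b", "b"]
predicted_labels = ["a", "b", "b", "b"]
score = macro_f1(gold_labels, predicted_labels)
```

Because every category counts equally, a macro average is pulled down by categories that are classified poorly, regardless of how many documents they contain.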

3

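A Study on Implementing a Prescriptive Analytics System That Fuses Automatic Classification and Intellectual Structure Analysis Techniques

정도헌 (Duksung Women's University) 2017, Vol.34, No.4, pp.33-57 https://doi.org/10.3743/KOSIM.2017.34.4.033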

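Abstract (translated from the Korean)

This study introduces prescriptive analytics, an emerging analysis method, and proposes a way to apply it efficiently to classification-based systems. Prescriptive analytics presents the results of an analysis together with the process that led to the optimized result and the alternative options. By adopting this new concept, the study presents a way to optimize application systems built on document classification more easily and to operate them more efficiently. To simulate the optimization process, a large volume of scholarly documents was collected and automatically classified according to a reference classification scheme. In applying the prescriptive analytics concept, a dynamic automatic classification technique for large-scale document classification and an intellectual structure analysis technique for academic fields were used together. The experiment showed that several optimization scenarios for effectively revising and reapplying a service classification scheme can be derived efficiently.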

Abstract

This study aims to introduce an emerging prescriptive analytics method and suggest its efficient application to a category-based service system. The prescriptive analytics method provides the whole process of analysis and the available alternatives as well as the results of the analysis. To simulate the process of optimization, a large collection of journal articles was gathered and categorized according to a classification scheme. In applying the concept of prescriptive analytics to a real system, we fused a dynamic automatic-categorization method for large-scale document collections with an intellectual structure analysis method for scholarly subject fields. The test results show that some optimized scenarios can be generated efficiently and used effectively for reorganizing the classification-based service system.

4

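A Study on Optimizing the Establishment of the National Science and Technology Literature Center

홍현진 (Chonnam National University) ; 정준민 (Chonnam National University) ; 강미희() ; 정대근 (Chonnam National University) 2003, Vol.20, No.2, pp.285-318 https://doi.org/10.3743/KOSIM.2003.20.2.285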

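Abstract (translated from the Korean)

This study establishes a theoretical basis for building the National Science and Technology Literature Center and aims to derive an optimal model of appropriate scale that matches the center's functions and roles. To this end, in order to calculate the building size and plan its facilities, the study reviews the development of a condominium program for the center as a shared repository for materials, along with the functions and roles of a hybrid library and a clearinghouse. On that basis, it presents a space program for the center's building, an analysis of the current state of the planned site, and a basic concept.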

Abstract

The purpose of this research is to suggest a theoretical base and guidelines for the national scientific and technical information infrastructure for national science and technology. The objective of this study is to help strengthen the case for building the national scientific and technical information center and to provide operating programs and a vision for the information center to be established later. This study suggests a plan and strategy that make it possible to perform functions as the national repository, clearinghouse, and portal gateway for electronic resources, and proposes a space program for optimal building construction. The contents of this study therefore cover the building of the national scientific and technical information center as well as the basic plan for its scale and space program and the validity analysis of the site location and environment.

5

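A Study on Developing Information Inequality Measurement Indicators Applicable to Libraries

노영희 (Professor, Department of Library and Information Science, Konkuk University) ; 장로사 (Lecturer, Department of Library and Information Science, Chung-Ang University) 2019, Vol.36, No.4, pp.53-81 https://doi.org/10.3743/KOSIM.2019.36.4.053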

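Abstract (translated from the Korean)

As IFLA's UN 2030 Agenda and the 3rd Comprehensive Library Development Plan (2019-2023) of the Committee on Library and Information Policy emphasize the role of libraries in practicing social inclusion, libraries have recently drawn renewed attention, in Korea and abroad, as public service institutions for resolving information inequality. This study therefore developed information inequality measurement indicators suited to libraries, using a focus group interview (FGI) and the Delphi technique as the expert verification stage. The final set comprises 3 evaluation areas, 12 evaluation items, and 30 evaluation indicators. Specifically, the access area yielded 3 evaluation items and 8 evaluation indicators; the competency area, 5 items and 12 indicators; and the utilization area, 4 items and 10 indicators. The study is considered especially meaningful and valuable because it was developed when no information inequality measurement indicators applicable to libraries existed anywhere in the world.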

Abstract

IFLA's UN 2030 Agenda and the 3rd Comprehensive Library Development Plan (2019-2023) of the Committee on Library and Information Policy emphasize the role of libraries in practicing social inclusion. At home and abroad, this is shedding new light on libraries as public service institutions aimed at resolving information inequality. This study thus developed information inequality measurement indicators optimized for libraries. For this purpose, an FGI and the Delphi technique were used as the expert verification stage. As a result, the final indicators were derived in 3 evaluation areas, 12 evaluation items, and 30 evaluation indicators. Specifically, first, 3 evaluation items and 8 evaluation indicators were derived in the access evaluation area; second, 5 evaluation items and 12 evaluation indicators were derived in the competency evaluation area; and third, 4 evaluation items and 10 evaluation indicators were derived in the utilization evaluation area. This study is considered to be of great significance in that it developed information inequality measurement indicators optimized for libraries, the first of their kind.

6

A Study on Developing Library Privacy Protection Guidelines

노영희 (Konkuk University) ; 김태경 (Library Research Institute, National Library of Korea) 2015, Vol.32, No.2, pp.25-61 https://doi.org/10.3743/KOSIM.2015.32.2.025

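Abstract (translated from the Korean)

This study proposes draft library privacy guidelines that can be applied in any library regardless of library type. Each library can revise and supplement the draft to suit its own circumstances. The guidelines were developed under the following headings: purpose, definitions of terms, scope of personal information, related laws and policies, general matters, handling of personal information in library operations, and library outsourcing contractors. Their development took into account preparing libraries' response to the enforcement of the amended Personal Information Protection Act, optimizing personal information processing guidelines for libraries, reflecting the guidelines in related laws and regulations, and working toward standardized library privacy guidelines.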

Abstract

This study was designed to propose library privacy guidelines that are applicable in any library regardless of library type. Individual libraries can refine, modify, and use them to fit their situation, using the guidelines as a base. The library privacy protection guidelines developed in this study are composed of purposes, definitions, scope of privacy, law and policy, general information, the handling of personal information in the library's job performance, and library subcontractors. In developing the guidelines, the study considered how libraries should respond to the implementation of the amended “Privacy Act,” the optimization of the privacy directive for libraries, reflection in the relevant laws and regulations, and the standardization of library privacy guidelines.

7

An Experimental Study on Multi-Document Summarization for Question Answering

최상희 (Daegu Catholic University) ; 정영미 (Yonsei University) 2004, Vol.21, No.3, pp.289-303 https://doi.org/10.3743/KOSIM.2004.21.3.289

Abstract

This experimental study proposes a multi-document summarization method that produces optimal summaries in which users can find answers to their queries. In order to identify the most effective method for this purpose, the performance of three summarization methods was compared. The investigated methods are sentence clustering, passage extraction through spreading activation, and a clustering-passage extraction hybrid method. The effectiveness of each summarization method was evaluated by two criteria used to measure the accuracy and the redundancy of a summary. The passage extraction method using the sequential bnb search algorithm proved to be most effective in summarizing multiple documents with regard to summarization precision. This study proposes the passage extraction method as the optimal multi-document summarization method.

Note: This study summarizes part of a doctoral dissertation submitted to the Graduate School of Yonsei University.
Author notes: Part-time lecturer, Department of Library and Information Science, Yonsei University (shchoi@lis.yonsei.ac.kr); Professor, Department of Library and Information Science, Yonsei University (ymchung@yonsei.ac.kr)
Received: August 27, 2004. Accepted: September 13, 2004.

8

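A Comparison of Performance Factors in the Automatic Classification of Records Using Metadata

김영범 (Master of Records Management, Graduate School, Chonnam National University) ; 장우권 (Professor, Department of Library and Information Science, Chonnam National University) 2023, Vol.40, No.3, pp.99-118 https://doi.org/10.3743/KOSIM.2023.40.3.099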

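Abstract (translated from the Korean)

The purpose of this study is to identify performance factors in the automatic classification of records by using metadata that carries the records' contextual information. About 97,064 original-text information records from central administrative agencies in 2022 were collected. Various classification algorithms, data selection methods, and document representation techniques were applied to the collected data, and the results were compared to identify the best-performing factors for automatic record classification. Random Forest performed best among the classification algorithms and the TF technique performed best among the document representation techniques. The minimum amount of data per unit task had little effect on performance, while features were confirmed to have a clear effect on changes in performance.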

Abstract

The objective of this study is to identify performance factors in the automatic classification of records by utilizing metadata that contains the contextual information of records. For this study, we collected 97,064 records of original textual information from Korean central administrative agencies in 2022. Various classification algorithms, data selection methods, and feature extraction techniques were applied and compared in order to identify the best-performing technique. The results showed that Random Forest had the highest performance among the classification algorithms and that the TF method was the most effective among the feature extraction techniques. The minimum data quantity of unit tasks had a minimal influence on performance, and the addition of features positively affected performance, while their removal had a discernible negative impact.
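
The abstract names the TF (term frequency) method as the best-performing document representation but does not say how terms were tokenized or counted. The sketch below shows one plain reading of TF, raw token counts over a shared vocabulary, under the assumption of whitespace tokenization. The sample record strings are hypothetical, and Korean metadata would in practice likely need morphological analysis instead of a simple split.

```python
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tf_matrix(documents: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Build raw term-frequency vectors over a shared, sorted vocabulary."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted({term for count in counts for term in count})
    # A Counter returns 0 for terms that do not occur in a document.
    rows = [[count[term] for term in vocabulary] for count in counts]
    return vocabulary, rows


# Hypothetical metadata strings (for example, record titles).
records = ["budget report budget", "meeting minutes report"]
vocabulary, rows = tf_matrix(records)
```

Rows of this kind could then be passed, together with category labels, to a classifier such as a random forest; that step is omitted here because the abstract gives no configuration details.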

9

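Building an LRM-Based Bibliographic Structure for Describing Old Materials: Focusing on the Agent, Place, and Time Entities

박민정 (Department of Library and Information Science, Graduate School, Chung-Ang University) ; 이승민 (Professor, Department of Library and Information Science, Chung-Ang University) 2023, Vol.40, No.3, pp.197-219 https://doi.org/10.3743/KOSIM.2023.40.3.197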

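Abstract (translated from the Korean)

The AACR-family cataloging rules and bibliographic structures generally used to describe resources are limited in how specifically they can reflect the bibliographic characteristics unique to Korean old materials. This study therefore analyzed the bibliographic aspects of old materials and, based on the FRBR LRM conceptual model, formed relationships among descriptive elements to propose a bibliographic structure optimized for the distinctive characteristics of old materials. The relationships should be set so that related old materials can be linked bibliographically, which requires relationships that sufficiently reflect the bibliographic characteristics and the variations in form and content of old materials, Korean ones in particular. Applying the LRM structure moves away from the existing bibliographic environment, which generates only fragmentary bibliographic records in the form of unit entries, and makes it possible to separate and integrate descriptive elements at the level of bibliographic data units. The resulting bibliographic environment allows old materials to be organized, managed, and used more efficiently, and lays a foundation for generating bibliographic data in BIBFRAME format.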

Abstract

The cataloging rules of the AACR family and the associated bibliographic structure, which are broadly used in describing resources, show limitations in reflecting the unique bibliographic characteristics of Korean old materials. This research therefore proposed a bibliographic structure optimized for the unique bibliographic characteristics of Korean old materials by establishing bibliographic relationships between bibliographic entities based on the FRBR LRM conceptual model. The bibliographic relationships should be established in a way that connects related materials in the bibliographic structure. These relationships should sufficiently reflect the bibliographic characteristics of the materials, their physical variations, and their content variations. Through this structure, the bibliographic description can be separated and integrated at the level of the bibliographic unit by applying the LRM conceptual model. By using the proposed structure, the organization, management, and utilization of Korean old materials can become more efficient. It can also provide a new bibliographic environment that serves as the foundation for creating BIBFRAME records for Korean old materials.

10

A Study on Extracting Article Body Text from News Web Pages

이용구 (University of Pittsburgh) 2009, Vol.26, No.1, pp.305-320 https://doi.org/10.3743/KOSIM.2009.26.1.305

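Abstract (translated from the Korean)

News pages provided on the web contain not only the needed information but also a great deal of unnecessary information. This unnecessary information degrades the performance and efficiency of systems that process news. This study presents sentence-based and block-based news article extraction methods for extracting news content from web pages, and explores how to combine them for optimal performance. In the experiments, applying the sentence-based extraction method after removing hyperlink text from the web page was effective, and combining it with the block-based extraction method produced better results. The sentence-based extraction method was found to raise extraction recall.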

Abstract

News pages provided through the web contain unnecessary information. This causes low performance and inefficiency in news processing systems. In this study, news content extraction methods based on sentence identification and on block-level tags in news web pages were suggested. To obtain optimal performance, combinations of these methods were applied. The results showed good performance when using an extraction method that eliminated hyperlink text from web pages and then applied sentence identification. Moreover, this method showed better results when combined with the extraction method that used block-level tags. Extraction methods that used sentence identification were effective in raising the extraction recall ratio.
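
The abstract names the ingredients (removing hyperlink text, identifying sentences, and using block-level tags) but not the exact rules. The following is a minimal sketch of how those three steps could fit together using Python's standard `html.parser`. The set of block tags, the sentence heuristic, and the five-word threshold are all assumptions made for illustration, not the paper's method.

```python
from html.parser import HTMLParser
from typing import List

# Assumed set of block-level tags that delimit candidate text blocks.
BLOCK_TAGS = {"p", "div", "td", "li"}


class ArticleTextExtractor(HTMLParser):
    """Collect text per block-level element, skipping text inside <a> links."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: List[str] = []
        self._buffer: List[str] = []
        self._link_depth = 0

    def flush(self) -> None:
        # Close the current block, normalizing runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.blocks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._link_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag == "a":
            self._link_depth = max(0, self._link_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._link_depth == 0:  # drop hyperlink text
            self._buffer.append(data)


def looks_like_sentence(block: str, min_words: int = 5) -> bool:
    """Assumed heuristic: ends with sentence punctuation and is long enough."""
    return block.endswith((".", "!", "?")) and len(block.split()) >= min_words


def extract_article(html: str) -> List[str]:
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return [block for block in parser.blocks if looks_like_sentence(block)]


# Hypothetical news page with a menu, body text, a link teaser, and a footer.
page = (
    '<div><a href="/home">Home</a> | <a href="/sports">Sports</a></div>'
    "<p>The city council approved the new budget on Monday.</p>"
    '<p>Read more: <a href="/x">Related story about the budget vote.</a></p>'
    "<div>Copyright 2009</div>"
)
article = extract_article(page)
```

On this sample, the navigation menu, the link teaser, and the footer are discarded and only the body sentence is kept. A stricter sentence filter would tend to favor precision and a looser one recall, which is one reason the thresholds above should be treated as tunable.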
