Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management

51

Effects of Expectations of Online Science and Technology Information Service Quality and Perceived Performance on User Satisfaction and Loyalty

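김완종 (Korea Institute of Science and Technology Information); 김혜선 (Korea Institute of Science and Technology Information); 현미환 (Korea Institute of Science and Technology Information) 2013, Vol.30, No.3, pp.207-228 https://doi.org/10.3743/KOSIM.2013.30.3.207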

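Abstract (translated from the Korean)

This study analyzed how expectations of online science and technology information service quality, and perceptions of its performance, affect user satisfaction and loyalty. To this end, a survey and analysis were conducted using NDSLQual, the service quality measurement model for NDSL, the online science and technology information service provided by the Korea Institute of Science and Technology Information. The analysis showed, first, that four of the seven expectation factors (reliability, convenience, system usability, and information quality) had a significant positive (+) effect on user satisfaction. Second, among the seven expectation factors, service recovery had a significant negative (-) effect on loyalty, while the other six (reliability, convenience, system usability, responsiveness, security, and information quality) all had a significant positive (+) effect on loyalty. Third, three of the seven perceived performance factors (reliability, convenience, and information quality) had a significant positive (+) effect on user satisfaction. Fourth, the same three perceived performance factors (reliability, convenience, and information quality) had a significant positive (+) effect on loyalty. Overall, among the seven service factors, expectations and perceived performance regarding information quality, reliability, and convenience emerged as the common factors that can raise NDSL user satisfaction and loyalty.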

Abstract

The purpose of this study is to examine how expectations and perceived performance of online science and technology information service quality influence user satisfaction and loyalty. To this end, we used the NDSLQual model to measure the quality of the NDSL service. The results were as follows. First, among the seven expectation factors, four (reliability, convenience, system usability, and information quality) had a positive effect on user satisfaction. Second, service recovery had a negative effect on loyalty, while the other six expectation factors (reliability, convenience, system usability, responsiveness, security, and information quality) had a positive effect on loyalty. Third, among the seven perceived performance factors, three (reliability, convenience, and information quality) had a positive effect on user satisfaction. Fourth, the same three perceived performance factors (reliability, convenience, and information quality) had a positive effect on loyalty. In sum, information quality, reliability, and convenience, in terms of both expectation and perceived performance, are the common factors influencing user satisfaction and loyalty.

52

A Study on Automatic Classification of Domestic Journal Articles Based on Machine Learning

김판준 (Silla University) 2018, Vol.35, No.2, pp.37-62 https://doi.org/10.3743/KOSIM.2018.35.2.037

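Abstract (translated from the Korean)

This study examined the factors that affect the performance of machine learning-based automatic classification, using a document collection composed of domestic journal articles in library and information science. In particular, in terms of the performance of automatically assigning subject categories to articles published in the Journal of the Korean Society for Information Management, the characteristics of key factors such as term weighting schemes, training set size, classification algorithms, and category assignment methods were examined through a range of experiments. The results show that it is effective to apply each factor appropriately according to the classification environment and the characteristics of the document collection, and that a fairly good level of performance could be obtained with simpler models. In addition, for domestic journal articles, multi-label classification, which assigns one or more categories to a given article, fits the real environment. Taking this environment into account, an optimal classification model that uses a simple, fast classification algorithm and a small training set was proposed.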

Abstract

This study examined the factors affecting the performance of machine learning-based automatic classification of domestic journal articles in the field of LIS. In particular, with respect to the performance of automatically assigning class labels to articles in the 「Journal of the Korean Society for Information Management」, I investigated the characteristics of the key factors (weighting schemes, training set size, classification algorithms, and label assignment methods) through a range of experiments. The results show that it is effective to apply each factor appropriately according to the classification environment and the characteristics of the document set, and that fairly good performance can be obtained with a simpler model. In addition, the classification of domestic journal articles is best treated as multi-label classification, which assigns one or more categories to a given article. Therefore, considering this environment, I proposed an optimal classification model that uses a simple and fast classification algorithm and a small training set.

53

A Study on Constructing an Ontology for Library and Information Science Journals

노영희 (Konkuk University) 2011, Vol.28, No.2, pp.177-193 https://doi.org/10.3743/KOSIM.2011.28.2.177

Abstract

This study constructed an ontology for journal articles and evaluated its performance. The performance of the triple-structured ontology was also compared with that of an inverted index file knowledge base designed for a simple keyword search engine. The coverage was three years of articles published in the Journal of the Korean Society for Information Management, from 2007 to 2009. Protégé was used to construct the ontology, and an inverted index file was used as the basis for comparison. The concept ontology was built manually and the bibliography ontology was built automatically, producing an OWL concept ontology and an OWL bibliography ontology, respectively. The study compared the performance of the ontology knowledge base, searched with the Jena search engine, against that of the inverted index file, searched with the Lucene search engine. As a result, Lucene showed higher precision, whereas Jena showed higher recall.

54

An Analysis of the Influence of School Library Variables on PISA 2009 Academic Achievement

박주현 (Singa Elementary School); 장우권 (Chonnam National University) 2014, Vol.31, No.3, pp.331-351 https://doi.org/10.3743/KOSIM.2014.31.3.331

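Abstract (translated from the Korean)

This study analyzed the survey data of Korean students, parents, and school principals who took part in OECD PISA, together with achievement in reading, mathematical, and scientific literacy, to identify the influence of school library variables on academic achievement. In doing so, it sought to obtain baseline data on the educational influence and accountability of school libraries and to draw implications for improving school library education. The analysis showed, first, that the more students enjoyed reading, the higher their academic achievement. Second, the more books there were at home and the more reading resources were available, the higher students' academic achievement. Third, the more often students visited the school library to use the internet, the lower their academic achievement. Fourth, the greater the shortage of school library staff, the lower students' academic achievement; that is, school library staff had a positive effect on academic achievement. Fifth, experience of private tutoring in reading had a negative effect on achievement in reading and scientific literacy. Sixth, borrowing from the school library out of enjoyment of reading, and school library classes related to the Korean language subject, were found to have no effect on academic achievement.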

Abstract

This study analyzed survey data from students, parents, and school principals, together with reading, mathematics, and science literacy scores, from the Korean data of OECD PISA 2009, in order to identify the impact of school library variables on academic achievement. It also sought implications for the educational influence and accountability of school libraries and for improving school library education. The results were as follows. First, the more students enjoyed reading, the higher their academic achievement. Second, the more books and reading resources students had at home, the higher their academic achievement. Third, the more often students visited the school library to use the internet, the lower their academic achievement. Fourth, the greater the shortage of school library staff, the lower students' academic achievement. Fifth, experience of private tutoring in reading had a negative influence on reading and science achievement. Sixth, borrowing from the school library for enjoyment of reading, and school library classes related to the Korean language curriculum, had no statistically significant effect on academic achievement.

55

Automatic Classification of Academic Articles Using a Deep Learning-Based BERT Model

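김인후 (Graduate School, Department of Library and Information Science, Chung-Ang University); 김성희 (Department of Library and Information Science, Chung-Ang University) 2022, Vol.39, No.3, pp.293-310 https://doi.org/10.3743/KOSIM.2022.39.3.293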

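Abstract (translated from the Korean)

This study automatically classified documents in library and information science using a BERT model trained on Korean data, and analyzed its performance. To this end, abstract data from 5,357 articles in seven library and information science journals were used to analyze and evaluate how automatic classification performance differs according to the size of the training data. Precision, recall, and the F-measure were used as performance measures. The evaluation showed that subject areas with a large amount of high-quality data achieved a high level of performance, with an F-measure of 90% or more. In contrast, where data quality was low, content was highly similar to that of other subject areas, and there were few features that clearly distinguished the subject, a meaningfully high level of performance was not obtained. This study is expected to serve as basic data for showing the potential of pre-trained models that can continue to be used on future academic literature.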

Abstract

In this study, we analyzed the performance of a BERT-based document classification model by automatically classifying documents in the field of library and information science with KoBERT. For this purpose, abstract data from 5,357 papers in 7 journals in the field of library and information science were analyzed to evaluate how automatic classification performance differs according to the size of the training data. Precision, recall, and the F-measure were used as performance measures. The evaluation showed that subject areas with large amounts of high-quality data reached a high level of performance, with an F-measure of 90% or more. On the other hand, where data quality was low, similarity with other subject areas was high, and few features clearly distinguished the subject, a meaningfully high level of performance could not be obtained. This study is expected to serve as basic data for showing the potential of pre-trained models for automatically classifying academic documents.
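The three measures named in this abstract can be illustrated with a minimal sketch. It assumes single-label predictions and computes per-class precision, recall, and F1 (the balanced form of the F-measure); the labels and predictions are made up for illustration and are not data from the study.

```python
from collections import Counter


def per_class_prf(gold, predicted):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct assignment
        else:
            fp[p] += 1          # predicted label was wrong
            fn[g] += 1          # true label was missed
    scores = {}
    for label in set(gold) | set(predicted):
        assigned = tp[label] + fp[label]
        relevant = tp[label] + fn[label]
        precision = tp[label] / assigned if assigned else 0.0
        recall = tp[label] / relevant if relevant else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical subject labels, for illustration only.
gold = ["IR", "IR", "IR", "LIB", "LIB", "META"]
pred = ["IR", "IR", "LIB", "LIB", "LIB", "IR"]
scores = per_class_prf(gold, pred)
```

In this toy run the class with a single example ("META") scores zero on all three measures, which mirrors the abstract's observation that sparsely represented subject areas are the hard ones.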

56

A Study on Automatic Classification of Domestic Journal Articles Using Random Forest

김판준 (Silla University) 2019, Vol.36, No.2, pp.57-77 https://doi.org/10.3743/KOSIM.2019.36.2.057

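Abstract (translated from the Korean)

Random forest (RF), a representative ensemble technique, was applied to the automatic classification of journal articles in library and information science. In particular, a range of experiments was carried out on key factors such as the number of trees, feature selection, and training set size, in terms of the performance of automatically assigning subject categories to domestic journal articles. Through these experiments, ways to optimize the performance of RF on a real-world imbalanced dataset were explored. The results indicate that, for automatic classification of domestic journal articles, RF can be expected to perform best when using the tree-number interval 100-1000 (C), a small feature set (10%) selected by the chi-square statistic (CHI), and most of the training set (9-10 years).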

Abstract

Random Forest (RF), a representative ensemble technique, was applied to the automatic classification of journal articles in the field of library and information science. In particular, I performed a range of experiments on the main factors, such as the number of trees, feature selection, and training set size, in terms of the performance of automatically assigning class labels to domestic journal articles. Through these experiments, I explored ways to optimize the performance of RF on imbalanced datasets in real environments. Consequently, for the automatic classification of domestic journal articles, RF can be expected to give the best classification performance when using the tree-number interval 100-1000 (C), a small feature set (10%) selected by the chi-square statistic (CHI), and most of the training set (9-10 years).

57

A Study on a Scholarly Paper Recommendation Technique Using a Multidimensional Metadata Space

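감미아 (Department of Library and Information Science, Yonsei University); 이지연 (Department of Library and Information Science, Yonsei University) 2023, Vol.40, No.1, pp.121-148 https://doi.org/10.3743/KOSIM.2023.40.1.121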

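Abstract (translated from the Korean)

This study aims to propose a "high-performing scholarly paper recommendation system based on metadata attribute similarity". It sought to propose a library and information science-based scholarly paper recommendation technique that draws on the use of metadata, as treated in information organization, and on the concepts of co-citation, author bibliographic coupling, co-occurrence frequency, and cosine similarity, as treated in informetrics. For the experiments, metadata for a total of 9,643 papers related to "inequality" and "divide" were collected and refined, relative coordinate values among the author, keyword, and title attributes were derived using cosine similarity, and experiments were run to select well-performing weight conditions and numbers of dimensions. The experimental results were presented to users for evaluation, and on that basis an analysis of the characteristics of reference nodes and recommendation combinations, a conjoint analysis, and a comparative analysis of results were carried out, with the discussion organized around the research questions. Overall, performance was found to be strong for combinations that excluded author-related attributes or that used only title-related attributes. If the technique presented here is applied and a broad sample is secured, it could help improve the performance of recommendation techniques not only in literature recommendation for information services but also in various other areas of society.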

Abstract

The purpose of this study is to propose a scholarly paper recommendation system based on metadata attribute similarity with excellent performance. This study suggests a scholarly paper recommendation method that combines techniques from two sub-fields of Library and Information Science, namely metadata use in Information Organization and co-citation analysis, author bibliographic coupling, co-occurrence frequency, and cosine similarity in Bibliometrics. To conduct experiments, a total of 9,643 paper metadata related to “inequality” and “divide” were collected and refined to derive relative coordinate values between author, keyword, and title attributes using cosine similarity. The study then conducted experiments to select weight conditions and dimension numbers that resulted in a good performance. The results were presented and evaluated by users, and based on this, the study conducted discussions centered on the research questions through reference node and recommendation combination characteristic analysis, conjoint analysis, and results from comparative analysis. Overall, the study showed that the performance was excellent when author-related attributes were used alone or in combination with title-related attributes. If the technique proposed in this study is utilized and a wide range of samples are secured, it could help improve the performance of recommendation techniques not only in the field of literature recommendation in information services but also in various other fields in society.
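As a rough illustration of attribute-level cosine similarity with weight conditions, the sketch below compares two paper records attribute by attribute and combines the results with weights. The record layout, the weights, and the `weighted_similarity` helper are assumptions made for this example; the abstract does not describe the study's actual coordinate derivation or weighting procedure.

```python
import math
from collections import Counter


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bags of tokens."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_similarity(paper_a, paper_b, weights):
    """Weighted mean of per-attribute cosine similarities (hypothetical scheme)."""
    total = sum(weights.values())
    if not total:
        return 0.0
    return sum(
        w * cosine(paper_a[attr], paper_b[attr]) for attr, w in weights.items()
    ) / total


# Made-up metadata records, for illustration only.
base = {
    "author": ["kim"],
    "keyword": ["inequality", "education"],
    "title": ["digital", "divide", "education"],
}
candidate = {
    "author": ["lee"],
    "keyword": ["inequality", "income"],
    "title": ["digital", "divide", "income"],
}

# One possible weight condition: ignore authors, weight keywords and titles equally.
similarity = weighted_similarity(
    base, candidate, {"author": 0.0, "keyword": 1.0, "title": 1.0}
)
```

Changing the weight dictionary is the only thing needed to try a different attribute combination, which is the kind of comparison the abstract describes.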

58

A Study on Feature Ranking Schemes for Text Classification

김판준 (Department of Library and Information Science, Silla University) 2023, Vol.40, No.1, pp.1-21 https://doi.org/10.3743/KOSIM.2023.40.1.001

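Abstract (translated from the Korean)

This study examined in detail the performance of feature ranking schemes as an efficient feature selection method for text classification. To date, most feature ranking schemes have been based on document frequency, and comparatively few have used term frequency. Accordingly, the performance of single ranking schemes that apply term frequency and document frequency individually was examined as a feature selection method for text classification, and then the performance of combination ranking schemes that use both was reviewed. Specifically, classification experiments were run with two test collections (Reuters-21578, 20NG) and five classifiers (SVM, NB, ROC, TRA, RNN), and 5-fold cross validation and the t-test were applied to ensure the reliability of the results. As a result, among single ranking schemes, the document frequency-based scheme (chi) performed well overall. In addition, no significant performance difference was found between the best single ranking scheme and the combination ranking schemes. Therefore, where enough training documents can be secured, using the document frequency-based single ranking scheme (chi) as the feature selection method for text classification is the more efficient choice.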

Abstract

This study examined in detail the performance of feature ranking schemes as an efficient feature selection method for text classification. Until now, feature ranking schemes have mostly been based on document frequency, and relatively few have used term frequency. Therefore, the performance of single ranking schemes using term frequency and document frequency individually was examined as a feature selection method for text classification, and then the performance of combination ranking schemes using both was reviewed. Specifically, classification experiments were conducted with two data sets (Reuters-21578, 20NG) and five classifiers (SVM, NB, ROC, TRA, RNN), and 5-fold cross-validation and the t-test were applied to ensure the reliability of the results. As a result, among single ranking schemes, the document frequency-based scheme (chi) showed good performance overall. In addition, no significant difference was found between the best-performing single ranking scheme and the combination ranking schemes. Therefore, where sufficient training documents can be secured, it is more efficient to use the document frequency-based single ranking scheme (chi) as the feature selection method for text classification.
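A minimal sketch of document frequency-based chi-square ranking is given below, using the standard 2x2 contingency formulation and taking the maximum score over classes as each term's rank score. The max-over-classes aggregation, the `rank_terms` helper, and the toy documents are assumptions for illustration; the abstract does not specify how the study aggregates scores.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def rank_terms(docs):
    """Rank terms by their highest chi-square score over all classes.

    docs is a list of (set_of_terms, label) pairs; only the presence of a
    term in a document is used, so the ranking is document frequency-based.
    """
    labels = {label for _, label in docs}
    vocab = set().union(*(terms for terms, _ in docs))
    scores = {}
    for term in vocab:
        best = 0.0
        for label in labels:
            a = sum(1 for terms, l in docs if l == label and term in terms)
            b = sum(1 for terms, l in docs if l != label and term in terms)
            c = sum(1 for terms, l in docs if l == label and term not in terms)
            d = sum(1 for terms, l in docs if l != label and term not in terms)
            best = max(best, chi_square(a, b, c, d))
        scores[term] = best
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked, scores


# Toy collection, for illustration only.
docs = [
    ({"oil", "price"}, "energy"),
    ({"oil", "export"}, "energy"),
    ({"wheat", "price"}, "grain"),
    ({"wheat", "export"}, "grain"),
]
ranked, scores = rank_terms(docs)
selected = ranked[:2]  # keep the top-ranked terms as the feature set
```

Terms that occur evenly across classes ("price", "export") score zero here, while class-specific terms ("oil", "wheat") rise to the top, which is the behavior a feature selection step relies on.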

59

Development of a Thesaurus for Constructing a Large-Scale Korean Ontology

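최석두 (Hansung University); 이우범 (Hansung University); 김이겸 (Gwangju University); 이정연 (Knowledge Information Center, Korea Research Foundation); 최상기 (Chonbuk National University); 한상길 (Daelim University) 2006, Vol.23, No.4, pp.147-164 https://doi.org/10.3743/KOSIM.2006.23.4.147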

Abstract

This paper reports an effort to construct a large-scale Korean thesaurus that can be used to enhance retrieval performance in various fields. The thesaurus is currently being used for indexing and retrieval, and new terms are being added to it. As demands on retrieval performance increase in Korea, developing a large-scale ontology appears to be necessary, so a project has been undertaken to convert the current thesaurus into an ontology system. The paper describes how the thesaurus was constructed and prepared to serve as the base for an ontology system.

60

A Study on Extracting Article Body Text from News Web Pages

이용구 (University of Pittsburgh) 2009, Vol.26, No.1, pp.305-320 https://doi.org/10.3743/KOSIM.2009.26.1.305

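Abstract (translated from the Korean)

News pages provided over the web contain not only the information needed but also a great deal of unnecessary information. Such unnecessary information degrades the performance and efficiency of systems that process news. This study presented news article extraction methods based on sentences and on blocks for extracting news content from web pages, and explored how to combine them for optimal performance. In the experiments, applying the sentence-based extraction method after removing hyperlink text from web pages was effective, and combining it with the block-based extraction method produced better results. The sentence-based extraction method was found to raise extraction recall.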

Abstract

News pages provided through the web contain a great deal of unnecessary information alongside the content readers need. This lowers the performance and efficiency of news processing systems. In this study, methods for extracting news content from news web pages, based on sentence identification and on block-level tags, were proposed. To obtain optimal performance, combinations of these methods were applied. The results showed good performance when the sentence identification method was applied after hyperlink text had been removed from the web pages. Moreover, this method showed better results when combined with the extraction method based on block-level tags. The extraction methods that used sentence identification were effective in raising extraction recall.
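The combination described above (drop hyperlink text, split the page by block-level tags, keep blocks that look like sentences) can be sketched with Python's standard `html.parser`. The list of block tags, the sentence heuristic, and the `min_words` threshold are assumptions chosen for this example, not the rules used in the study.

```python
import re
from html.parser import HTMLParser

# Assumed set of block-level tags used to split the page into text blocks.
BLOCK_TAGS = {"p", "div", "td", "li", "article", "section"}


class BlockTextCollector(HTMLParser):
    """Collect text per block-level tag, skipping hyperlink and script text."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._current = []
        self._link_depth = 0
        self._skip_depth = 0

    def _flush(self):
        text = " ".join("".join(self._current).split())
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._link_depth += 1
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag == "a" and self._link_depth:
            self._link_depth -= 1
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        # Hyperlink text is dropped here, before any sentence test is applied.
        if not self._link_depth and not self._skip_depth:
            self._current.append(data)

    def close(self):
        super().close()
        self._flush()


def looks_like_sentence(text, min_words):
    """Hypothetical heuristic: enough words and sentence-ending punctuation."""
    return len(text.split()) >= min_words and bool(re.search(r"[.!?](\s|$)", text))


def extract_article(html, min_words=5):
    parser = BlockTextCollector()
    parser.feed(html)
    parser.close()
    return [b for b in parser.blocks if looks_like_sentence(b, min_words)]


sample_html = """
<div><a href="/">Home</a> | <a href="/sports">Sports</a></div>
<div><p>The city council approved the new budget on Monday.</p>
<p>Officials said the plan takes effect next year. <a href="/more">Read more</a></p></div>
<div><a href="/ad">Subscribe now for daily updates.</a></div>
"""
article = extract_article(sample_html)
```

On the sample page the navigation bar and the linked advertisement are discarded, and only the two body paragraphs remain, with the trailing "Read more" link text removed.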
