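Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management (한국정보관리학회)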

61

이성숙(충남대학교) 2005, Vol.22, No.2, pp.205-228 https://doi.org/10.3743/KOSIM.2005.22.2.205

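Abstract (translated from the Korean)

This study used co-link analysis to examine the intellectual structure of web information sources, focusing on changes over time and differences between search engines. The analysis of changes over time showed that the clusters and positions of web information sources on the two-dimensional maps reflected changes in the intellectual structure over a six-year span. A comparison of the AltaVista and MSN Search engines showed that the overall intellectual structure on the web information source maps was similar, although some web information sources were assigned to different clusters. The study confirmed that the co-citation technique previously applied to print information sources can also be applied to the diachronic analysis of web information sources.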

Abstract

This research analyzed the intellectual structure of web information sources using co-link analysis, examining changes over time and differences between search engines. The clusters of co-linked web information sources on the two maps reflected changes in the intellectual structure between the two time periods. The intellectual structures shown on the information maps for the AltaVista and MSN Search engines were relatively similar; however, some web information sources fell into different clusters. The results indicate that co-citation analysis can also be applied to the diachronic analysis of web information.
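As a rough illustration of the counting step behind co-link analysis, the sketch below tallies how many source pages link to both members of each pair of sites. The page and site names are hypothetical, and the abstract does not describe the author's actual data collection or mapping procedure, so this is only a minimal sketch of the general idea.

```python
from collections import Counter
from itertools import combinations


def colink_counts(outlinks):
    """Count, for each pair of target sites, how many source pages link to both."""
    counts = Counter()
    for targets in outlinks.values():
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical data: source page -> sites it links to.
outlinks = {
    "page1": {"siteA", "siteB", "siteC"},
    "page2": {"siteA", "siteB"},
    "page3": {"siteB", "siteC"},
}

counts = colink_counts(outlinks)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

A matrix of such counts is the usual input to clustering or two-dimensional mapping of the kind the abstract mentions.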

62

A Study on Automatic Alignment between Distributed Ontologies Using a Semantic Distance Measure

황상규(홍익대학교 컴퓨터공학과) ; 변영태(홍익대학교) 2009, Vol.26, No.4, pp.319-336 https://doi.org/10.3743/KOSIM.2009.26.4.319

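Abstract (translated from the Korean)

The Semantic Web is an evolved form of the current World Wide Web that uses ontologies, machine-understandable knowledge bases, so that computers and people can work together. However, even when data representation formats are standardized so that the meaning of information can be understood and processed through ontologies, data written against ontologies that different developers built under different conceptualizations can be mutually inconsistent. Ontologies built under different conceptualizations therefore need to be aligned with one another. When concept nodes of different ontologies are aligned semantically by automated processing, alignment errors arise from false negatives, cases in which a fact that a human expert judges to be true is wrongly judged to be false. This study developed and presented a new algorithm that minimizes the false negative errors occurring during semantic alignment between concept nodes of different ontologies.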

Abstract

Semantic Web technology is an evolution of the current World Wide Web that incorporates ontologies, machine-understandable knowledge bases, to enable machines and people to work together. However, problems arise when exchanging data annotated with different ontologies that were created by different people under different conceptualizations. Communicating across such ontologies therefore requires aligning the heterogeneous ontologies. When concept nodes of heterogeneous ontologies are aligned automatically, one of the main problems is misalignment caused by false negatives in the automatic ontology mapping. In this paper, we present a new method that minimizes false negative errors when aligning concept nodes of different ontologies.

63

A Study on Automatic Classification of Biomedical Literature Using Topic Modeling and Deep Learning

육지희(연세대학교 일반대학원 문헌정보학과) ; 송민(연세대학교) 2018, Vol.35, No.2, pp.63-88 https://doi.org/10.3743/KOSIM.2018.35.2.063

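Abstract (translated from the Korean)

This study selected features using an LDA topic model and Doc2Vec, a word-embedding technique based on deep learning, and evaluated how classification performance differs with the size and type of the feature set and with the classification algorithm. It also determined an appropriate feature set size and, by composing feature sets of different types according to location within the document, identified which feature sets perform best for classification. Finally, the deep learning experiments compared classification performance by the number of training iterations and by the presence or absence of context inference information. For the test collection, biomedical scholarly documents provided by PMC were collected and divided according to a disease category scheme to build Disease-35083. The study identified the type and size of the feature set that performed best and, by demonstrating efficiency in training time, presented feature sets with potential for extension as features. It also compared deep learning with existing methods and proposed a suitable method for each classification environment.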

Abstract

This research evaluated how classification performance differs across feature selection methods (an LDA topic model and Doc2Vec, a word-embedding technique based on deep learning), feature set sizes and types, and classification algorithms. To find the feature sets that yield high classification performance, experiments varied the size of the feature set and composed feature sets differently according to location within the document. Finally, the deep learning experiments compared classification performance by the number of training iterations and by whether context inference information was used. For the experiments, the study constructed a biomedical document collection, Disease-35083, from biomedical scholarly documents provided by PMC and categorized by disease category. The study identifies the type and size of feature set that produced the highest performance and, by demonstrating efficiency in training time, suggests feature sets that have potential for extension as features. It also compares deep learning with existing methods and suggests an appropriate method for each classification environment.

64

A Study on the Development of a Book Recommendation System for School Libraries Using Association Rules

임정훈(대전과학고등학교 교사) ; 조창제(NeuroEars 연구개발전담부서) ; 김종헌(대전과학고등학교 교사) 2022, Vol.39, No.3, pp.1-22 https://doi.org/10.3743/KOSIM.2022.39.3.001

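Abstract (translated from the Korean)

This study aims to propose a book recommendation system for use in school libraries. The system applies an association-rule-based algorithm to DLS loan data and is designed to provide personalized book recommendations to school library users. To this end, association rules based on the Apriori algorithm and betweenness centrality analysis were applied, and detailed functions were implemented, including descriptive statistics, association rule generation, student-centered recommendation, and book-centered recommendation. Teacher librarians were then interviewed in depth about using the system. The interviews yielded opinions on the need for and difficulty of book recommendation, student responses, differences from existing recommendation methods and ways to use the system, and improvements, on the basis of which the following points for discussion are proposed. First, long-term loan data must be provided in order to capture the characteristics of individual schools. Second, ways of integrating data by region or by school characteristics need to be discussed. Third, a book recommendation system provided through the Comprehensive Support System for Reading Education (독서교육종합시스템) needs to be built. We hope that the proposals in this study will prompt further discussion on applying personalized recommendation systems in school libraries.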

Abstract

The purpose of this study is to propose a book recommendation system that can be used in school libraries. The system applies an algorithm based on association rules to DLS lending data and is designed to provide personalized book recommendation services to school library users. For this purpose, association rules based on the Apriori algorithm and betweenness centrality analysis were applied, and detailed functions such as descriptive statistics, generation of association rules, student-centered recommendation, and book-centered recommendation were implemented. Opinions on the use of the system were then gathered through in-depth interviews with teacher librarians. The interviews yielded opinions on the necessity and difficulty of book recommendation, student responses, differences from existing recommendation methods, ways to use the system, and improvements; based on these, the following points for discussion were proposed. First, long-term lending data must be provided to understand the characteristics of each school. Second, plans for integrating data by region or by school characteristics need to be discussed. Third, a book recommendation system provided by the Comprehensive Support System for Reading Education needs to be established. We expect the proposals in this study to prompt further discussion on applying personalized recommendation systems in school libraries.
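To make the association-rule idea concrete, the sketch below derives rules between pairs of books from loan records, using the standard definitions of support and confidence. It is a simplified pairwise count rather than a full Apriori implementation, and the loan records, thresholds, and function name are hypothetical; the abstract does not specify the system's actual parameters.

```python
from collections import Counter
from itertools import combinations


def pair_rules(loans, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) rules for book pairs."""
    n = len(loans)
    item_counts = Counter()
    pair_counts = Counter()
    for books in loans.values():
        unique = sorted(set(books))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / n  # share of students who borrowed both books
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            # Confidence of x -> y: borrowers of x who also borrowed y.
            confidence = together / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Hypothetical loan records: student -> borrowed books.
loans = {
    "s1": ["Book A", "Book B", "Book C"],
    "s2": ["Book A", "Book B"],
    "s3": ["Book A", "Book C"],
    "s4": ["Book B"],
}

rules = pair_rules(loans)
for x, y, support, confidence in rules:
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

A book-centered recommendation could then list the consequents of rules whose antecedent is the book a student has just borrowed.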

65

A Study on Using Role Indicators to Improve the Description of Statements of Responsibility

박지영(한성대학교) 2011, Vol.28, No.3, pp.65-82 https://doi.org/10.3743/KOSIM.2011.28.3.065

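Abstract (translated from the Korean)

In describing bibliographic records, the statement of responsibility identifies where intellectual responsibility for a work lies and forms the basis for constructing access points. Cataloging rules, however, concentrate on dividing responsibility into principal and secondary roles and varying the description accordingly. This is a problem, because structuring the roles themselves should come before judging their importance and ranking them. Moreover, the selection of a main author in catalogs was a practical decision tied to preparing and arranging entries rather than one based on responsibility. This study therefore sought to improve the description of statements of responsibility by structuring the roles themselves; that is, by describing role indicators systematically so as to collocate statements of responsibility that are scattered across the bibliographic record or excluded from access points. It further proposed that doing so can raise the quality of responsibility information and that role indicators can also serve as search facets or as additional identifying information in authority records.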

Abstract

The statement of responsibility in bibliographic records plays a key role in clarifying intellectual responsibility for a work, and it also serves as the basis for access points. However, cataloging rules for the statement of responsibility mostly deal with the distinction between the principal role and minor roles. This is a problem because the type of responsibility itself is more important than the order of the types. For this reason, in this paper I explore improvements to the description of statements of responsibility by organizing the role indicators. Namely, by using the role indicators more effectively than current description methods do, we can collocate dispersed statements of responsibility. The role indicators can also be used as an author facet in information retrieval and can provide additional information for authority control.

66

A Study on Automatic Assignment of Descriptors to Journal Articles Using the Rocchio Algorithm

김판준(신라대학교) 2006, Vol.23, No.3, pp.69-89 https://doi.org/10.3743/KOSIM.2006.23.3.069

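Abstract (translated from the Korean)

This study re-examined several performance factors that have been applied in controlled-vocabulary automatic indexing and text categorization based on the Rocchio algorithm, and looked for basic ways to improve performance. It also compared, under equal conditions, the performance of Rocchio-based methods for controlled-vocabulary automatic indexing with that of other learning-based methods. The results show that the Rocchio-based profile method retained its established advantages, ease of implementation and economy of computer processing time, while performing almost as well as or better than the other learning-based methods (SVM, VPT, NB). In particular, for semi-automatic indexing that supports the work of professional indexers, the Rocchio-based method can be considered first, because it maintains a relatively high level of recall while its precision improves greatly as the training data grows.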

Abstract

Several performance factors that have been applied in Rocchio-based automatic indexing with controlled vocabulary and text categorization were re-examined, and basic methods for improving performance were explored. The performance of the Rocchio-based methods was also compared with that of other learning-based methods under the same conditions. The Rocchio-based profile methods retained their established advantages, ease of implementation and computational efficiency, while performing about as well as or better than the other learning-based methods (SVM, VPT, NB). In particular, for semi-automatic (computer-aided) indexing, the Rocchio-based methods merit first consideration, because they maintain relatively high recall while precision improves substantially as the training data grows.
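For readers unfamiliar with Rocchio-style profiles, the sketch below builds a descriptor profile as a weighted difference of the centroids of positive and negative training documents and scores a new document by cosine similarity. The term weights, the `beta` and `gamma` values, and the example documents are assumptions made for illustration; the abstract does not report the settings used in the study.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Average a list of sparse term-weight vectors (dicts)."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def rocchio_profile(positives, negatives, beta=1.0, gamma=0.25):
    """Profile = beta * centroid(positives) - gamma * centroid(negatives).

    Terms whose resulting weight is not positive are dropped.
    """
    pos, neg = centroid(positives), centroid(negatives)
    profile = {}
    for term in set(pos) | set(neg):
        weight = beta * pos.get(term, 0.0) - gamma * neg.get(term, 0.0)
        if weight > 0:
            profile[term] = weight
    return profile


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical training documents for one descriptor.
positives = [{"index": 1.0, "term": 1.0}, {"index": 1.0, "vocabulary": 1.0}]
negatives = [{"library": 1.0, "loan": 1.0}]

profile = rocchio_profile(positives, negatives)
new_doc = {"index": 1.0, "vocabulary": 1.0}
score = cosine(profile, new_doc)
print(round(score, 3))
```

In a semi-automatic setting, descriptors whose profiles score above some threshold could be offered to the indexer as candidates rather than assigned outright.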

67

A Study on Building an Open-Source Records Management System for the Catholic Church

황지민(대구가톨릭대학교 도서관학과 석사) ; 이지원(대구가톨릭대학교 도서관학과 부교수) 2021, Vol.38, No.1, pp.263-291 https://doi.org/10.3743/KOSIM.2021.38.1.263

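Abstract (translated from the Korean)

This study focuses on sacramental records, which make up a large share of the records produced by the Catholic Church and are regarded as important, and seeks a way to build a Catholic Church records management system with open-source software so that these records can be systematically managed and preserved. To this end, Catholic Church regulations, reference literature, and the websites of open-source software were analyzed, and interviews were conducted at G Cathedral, the cathedral of the Catholic Archdiocese of Daegu, to survey the current state of records management in the Catholic Church. Based on the findings, four levels of description reflecting the characteristics of sacramental records, a metadata structure for each level, and authority data were designed. To implement the design experimentally in an open-source records management system, AtoM was selected as a suitable system and the design was applied to it.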

Abstract

This study aims to devise a way to build a records management system for Catholic churches using open-source software, so that sacramental records, which account for a large share of the records produced by Catholic churches and hold great significance for them, can be systematically managed and preserved. To that end, the researcher analyzed Catholic Church regulations and reference materials, as well as the websites of open-source software. The researcher also interviewed members of G Cathedral of the Catholic Archdiocese of Daegu to examine the current status of records management at Catholic churches. Based on the investigation, the researcher designed four levels of description with their metadata, and authority data, that reflect the characteristics of sacramental records. The design was then experimentally implemented using AtoM, an open-source records management system.

68

A Study on Using Criminal Justice Information as Big Data: From the Perspective of Structuring and Categorization

김미령(서울지방경찰청 사서) ; 노윤주(경찰청 사서) ; 김성훈(성균관대학교 문헌정보학과 초빙교수) 2019, Vol.36, No.4, pp.253-277 https://doi.org/10.3743/KOSIM.2019.36.4.253

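Abstract (translated from the Korean)

In the era of the Fourth Industrial Revolution, data are becoming ever more important, yet in many cases they are hard to use because of issues such as personal information protection. Criminal justice information is expected to be valuable in many ways, including crime prediction and prevention, more scientific criminal investigation, and more rational sentencing, but its use is currently quite limited by questions of legal interpretation concerning personal information protection and criminal justice information. This study proposed converting criminal justice information into 'crime data' through structuring and categorization so that it can be used as big data. Legal issues in using 'crime data', its value in use, and considerations for creating and using the data were verified with experts, and strategic directions for future development were derived. The results showed that 'crime data' appears to resolve the personal information protection problem but needs to be specified explicitly in laws related to criminal justice information, and that organizing the data in a standardized, analyzable form for big data use is urgent. Proposed next steps include deriving data elements, building a glossary and thesaurus, defining and grading sensitive personal information for data classification, and developing algorithms for structuring unstructured data.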

Abstract

In the era of the Fourth Industrial Revolution, the importance of data is growing, but in many cases data are not easy to use because of personal information protection. Although criminal justice information is expected to be valuable in various ways, such as crime prediction and prevention, more scientific criminal investigation, and rationalization of sentencing, its use is currently limited by questions of legal interpretation related to privacy protection and criminal justice information. This study proposed converting criminal justice information into 'crime data' through structuring and categorization and using it as big data. Legal issues in using 'crime data', its value in use, and considerations for data generation and use were verified by experts, and future strategic development plans were identified. We found that 'crime data' appears to resolve the privacy problem, but that it needs to be specified in the laws related to criminal justice information and that it urgently needs to be organized in a standardized form suitable for big data analysis. Future directions are to derive data elements, construct a glossary and thesaurus, define and grade sensitive personal information for data classification, and develop algorithms for structuring unstructured data.

69

An Experimental Study on a Word Sense Disambiguation Model Using Dictionary Information

이용구(계명대학교) ; 정영미(연세대학교) 2007, Vol.24, No.1, pp.321-342 https://doi.org/10.3743/KOSIM.2007.24.1.321

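Abstract (translated from the Korean)

This study presents a word sense disambiguation model in which senses are tagged automatically using a machine-readable dictionary, without manual tagging, and a classifier built from the resulting training data then classifies word senses. A dictionary information-based method and a collocation co-occurrence-based method were applied for automatic tagging. In the experiments, the dictionary information-based method with multiple feature reduction achieved a tagging accuracy of 70.06%, a 24.37% improvement over the 56.33% of the collocation co-occurrence-based method. The classifier built with the dictionary information-based method achieved a classification accuracy of 68.11%, a 9.7% improvement over the 62.09% of the collocation co-occurrence-based method. Combining the two automatic tagging methods yielded a tagging accuracy of 76.09% and a classification accuracy of 76.16%.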

Abstract

This study presents an effective word sense disambiguation model that requires no manual sense tagging: senses are tagged automatically using a machine-readable dictionary, and a classifier built from the resulting training data then classifies the senses of words. Automatic tagging was implemented with a dictionary information-based method and a collocation co-occurrence-based method. The dictionary information-based method with multiple feature selection achieved a tagging accuracy of 70.06%, compared with 56.33% for the collocation co-occurrence-based method. The sense classifier using dictionary information-based tagging achieved a classification accuracy of 68.11%, compared with 62.09% for the classifier using collocation co-occurrence-based tagging. The combined tagging method, which applies a data fusion technique, achieved a higher tagging accuracy of 76.09%, resulting in a classification accuracy of 76.16%.
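The improvement figures quoted in the Korean-language version of this abstract (24.37% and 9.7%) are relative gains over the baseline accuracy, not percentage-point differences. A short check, using only the accuracies reported above (the function name is ours):

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


# Tagging accuracy: dictionary information-based vs. collocation co-occurrence-based.
tagging_gain = relative_improvement(70.06, 56.33)
# Classification accuracy for classifiers trained on each tagging method.
classification_gain = relative_improvement(68.11, 62.09)

print(f"tagging: {tagging_gain:.2f}%, classification: {classification_gain:.2f}%")
```

In percentage points, the corresponding differences are 13.73 and 6.02.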

70

A Study on Promoting the Use of Public Data in Korea: Focusing on a Linked Open Data Strategy

이현정(중앙대학교) ; 남영준(중앙대학교) 2014, Vol.31, No.4, pp.249-266 https://doi.org/10.3743/KOSIM.2014.31.4.249

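Abstract (translated from the Korean)

With the recent enactment of legislation on the provision of public data in Korea, policy is shifting toward actively opening and providing data held by public institutions such as government agencies and local governments. Opening data serves two broad purposes. One is to secure transparency in government operations and satisfy the public's right to know. The other is to create national benefit by using public data as a national asset. This study analyzed the current state of public data opening and proposed improvements. Because the scope of the study was public data provided by local governments, a complete survey was made of the data held by the 17 metropolitan cities and provinces, including Seoul, and the 228 basic local governments (si, gun, and gu). The results show that local governments were relatively passive about identifying and disclosing lists of the public data that each institution produces and holds, and that the formats of the opened data depended on specific software. To address these problems, the study presented the need for, and a way of, opening data as linked open data in order ultimately to increase the opening and use of local public data, and proposed comprehensive opening procedures and measures through an integrated platform for opening national public data.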

Abstract

In South Korea, legislation on the provision of public data was recently enacted. As a result, policy is moving toward actively providing open access to data held by public institutions, such as government agencies and local governments. Opening data serves two broad purposes. One is to ensure the transparency of government operations and thereby satisfy the people's right to know. The other is to create national benefit by using public data as a national asset. In this study, we analyzed the current state of public data opening and presented measures for improvement. The scope of the research was public data owned by local governments, which was judged to have limitations and characteristics different from those of central government data; we therefore conducted a complete survey of the data held by the 17 metropolitan and provincial governments, beginning with Seoul, and the 228 basic local governments (cities, counties, and districts). According to the results, local governments were relatively reluctant to identify and disclose lists of the public data they produce and hold, and they emphasized the preservation value of the data over its use value. We therefore concluded that these issues need to be resolved by disclosing data in a linked data format as a strategy to increase the openness and use of local public data.
