Journal of the Korean Society for Information Management, Korean Society for Information Management

1

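An Experimental Study on Automatic Classification of Opinionated Documents Using Supervised Latent Semantic Indexing (LSI)

이지혜(Yonsei University) ; 정영미(Yonsei University) 2009, Vol.26, No.3, pp.451-462 https://doi.org/10.3743/KOSIM.2009.26.3.451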

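Abstract (translated from Korean)

This study ran classification experiments using latent semantic indexing, a form of concept indexing, to improve the automatic classification of opinionated documents, that is, documents expressing opinions or sentiment. The 1,000 opinionated documents collected for the experiments comprise 500 positive and 500 negative documents. Morphological analysis of the document text produced a content-word set consisting of nouns and an opinion-word set consisting of predicates, adverbs, and roots. When documents were classified with the different feature sets, the opinion-word set performed best under term indexing, the combined content-word and opinion-word set performed best under latent semantic indexing, and the content-word set performed best under supervised latent semantic indexing. Overall, latent semantic indexing classified opinionated documents better than term indexing, and supervised latent semantic indexing gave the best classification performance.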

Abstract

The aim of this study is to apply latent semantic indexing (LSI) techniques to the efficient automatic classification of opinionated documents. For the experiments, we collected 1,000 opinionated documents such as reviews and news, with 500 labelled as positive and the remaining 500 as negative. Sets of content words and sentiment words were extracted using a POS tagger in order to identify the optimal feature set for opinion classification. The findings showed that LSI techniques were more effective than a term indexing method in sentiment classification. The best performance was achieved by a supervised LSI technique.
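For readers unfamiliar with LSI, the standard unsupervised formulation can be written as below. The abstract gives neither the authors' notation nor the details of their supervised variant, so the symbols here (a term-by-document matrix X, a rank k, and a document vector d) are assumptions used only for illustration.

```latex
X \approx X_k = U_k \Sigma_k V_k^{\top},
\qquad
\hat{d} = \Sigma_k^{-1} U_k^{\top} d
```

Here U_k, \Sigma_k, and V_k hold the k largest singular triplets of X, and \hat{d} is the k-dimensional representation of a document that a classifier would receive in place of the raw term vector d.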

2

A Study on Extracting Similar Tags from Delicious

이관(Univ. of Kentucky) 2009, Vol.26, No.2, pp.127-147 https://doi.org/10.3743/KOSIM.2009.26.2.127

Abstract

The synonym issue is an inherent barrier in human-computer communication, and it is more challenging in Web 2.0 applications, especially social tagging applications. In an effort to resolve the issue, this study tests the feasibility of a Web 2.0 application as a potential source of synonyms. It investigates a way of identifying similar tags from a popular collaborative tagging application, Delicious. Specifically, we propose an algorithm (FolkSim) for measuring the similarity of social tags from Delicious. We compared FolkSim to a cosine-based similarity method (CosSim) and observed that the top-ranked tags on the similar-tag list generated by FolkSim tend to be among the best possible similar tags among the given choices. The lists also appear to be relatively better than the ones created by CosSim. We further observed that the tag folksonomy and the similar-tag list resemble each other to a certain degree, so the folksonomy may serve as an alternative, especially when the FolkSim-based list is unavailable or infeasible.
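The cosine-based baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not FolkSim, which the abstract does not describe, and it is not necessarily the authors' CosSim setup: the co-occurrence representation, the function names, and the toy bookmark data are all assumptions.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_vectors(bookmarks):
    """Map each tag to a Counter of the tags it co-occurs with (assumed representation)."""
    vectors = {}
    for tags in bookmarks:
        unique = set(tags)
        for tag in unique:
            vec = vectors.setdefault(tag, Counter())
            for other in unique - {tag}:
                vec[other] += 1
    return vectors


def similar_tags(target, bookmarks, top_n=3):
    """Rank the other tags by cosine similarity to the target tag's co-occurrence vector."""
    vectors = tag_vectors(bookmarks)
    scores = [
        (other, cosine_similarity(vectors[target], vec))
        for other, vec in vectors.items()
        if other != target
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical bookmarks: each inner list is the tag set of one bookmarked page.
bookmarks = [
    ["python", "programming", "tutorial"],
    ["py", "programming", "tutorial"],
    ["python", "programming", "code"],
    ["py", "code"],
    ["recipes", "cooking"],
]

for tag, score in similar_tags("python", bookmarks):
    print(f"{tag}: {score:.3f}")
```

In this toy data "python" and "py" never appear on the same bookmark, yet they rank as most similar because they share co-occurring tags, which is the kind of synonym-like pair the study is looking for.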

3

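A Study on Automatic Alignment of Distributed Ontologies Using Semantic Distance Measures

황상규(Department of Computer Engineering, Hongik University) ; 변영태(Hongik University) 2009, Vol.26, No.4, pp.319-336 https://doi.org/10.3743/KOSIM.2009.26.4.319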

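Abstract (translated from Korean)

The Semantic Web, an evolution of the current World Wide Web, uses ontologies, knowledge bases that computers can understand, so that computers and people can work together. However, even when ontologies standardize data representation so that the meaning of information can be understood and processed, data written against ontologies that different developers built under different conceptualizations can be mutually inconsistent. Ontologies built under different conceptualizations therefore need to be aligned with one another. In automated semantic alignment between concept nodes of different ontologies, alignment errors arise from false negatives, cases in which a fact that a human expert judges true is wrongly judged false. This study develops and presents a new algorithm that minimizes false negative errors in the semantic alignment of concept nodes across different ontologies.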

Abstract

Semantic Web technology is an evolution of the current World Wide Web that includes ontologies, machine-understandable knowledge databases, and may enable machines and people to work together. However, problems arise when we try to exchange data annotated with different ontologies created by different people with different concepts. Communication between heterogeneous ontologies therefore requires aligning them. When concept nodes of heterogeneous ontologies are aligned, one of the main problems is misalignment caused by false negatives in automatic ontology mapping. In this paper, we present a new method to minimize false negative errors in the process of aligning concept nodes of different ontologies.

4

A Study on Semantic-Based Feature Expansion for Improving Document Categorization Performance

정은경(Ewha Womans University) 2009, Vol.26, No.3, pp.261-278 https://doi.org/10.3743/KOSIM.2009.26.3.261

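Abstract (translated from Korean)

In machine-learning-based document categorization, constructing an optimal feature set is important for improving performance. This study conducted feature expansion experiments on author-provided keywords and article titles, which are essential components of journal articles. Feature expansion generally starts from the selected features and draws on a semantic lexical tool such as WordNet. This study expanded features from keywords and article titles using WordNet synonym terms, and the experiments showed that categorization performance improved markedly compared with the results without feature expansion. The factors found to contribute to this improvement were a refined feature base and synonym expansion based on category terms. Performance improved both with and without word sense disambiguation. The results suggest that synonym feature expansion based on category terms, using keywords and article titles, contributes positively to document categorization performance.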

Abstract

Identifying optimal feature sets in text categorization (TC) is crucial for improving effectiveness. In this study, experiments on feature expansion were conducted using author-provided keyword sets and article titles from typical scientific journal articles. The tool used for expanding feature sets is WordNet, a lexical database of English words. Given a data set and a lexical tool, this study showed that feature expansion with synonym relationships significantly improved TC results. The experimental results indicated that when feature sets were expanded with synonyms based on classifier names, the effectiveness of TC improved considerably regardless of word sense disambiguation.
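Synonym-based feature expansion of the kind described above can be sketched in a few lines. The sketch below is only illustrative: a small hand-made dictionary (`SYNONYMS`) stands in for WordNet, and the optional `category_terms` filter is one assumed reading of expansion "based on classifier names"; neither is taken from the paper.

```python
# Hypothetical stand-in for WordNet synonym lookups.
SYNONYMS = {
    "retrieval": ["search"],
    "categorization": ["classification"],
    "automobile": ["car"],
}


def expand_features(tokens, synonyms, category_terms=None):
    """Append synonyms of each token to the feature list.

    If category_terms is given, only tokens that appear in it are expanded
    (an assumed interpretation of category-term-based expansion).
    """
    expanded = list(tokens)
    seen = set(tokens)
    for token in tokens:
        if category_terms is not None and token not in category_terms:
            continue
        for synonym in synonyms.get(token, []):
            if synonym not in seen:
                seen.add(synonym)
                expanded.append(synonym)
    return expanded


title_tokens = ["text", "categorization", "retrieval"]

# Expand every token that has an entry in the synonym dictionary.
print(expand_features(title_tokens, SYNONYMS))

# Expand only tokens that match a (hypothetical) category label.
print(expand_features(title_tokens, SYNONYMS, category_terms={"retrieval"}))
```

Restricting expansion to category terms keeps the feature set from growing with synonyms of every keyword, which is one plausible reason such a filter could help; the abstract itself does not explain the mechanism.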

5

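A Logical Model of a Knowledge-Service-Oriented Library System

이현실(Wonkwang University) ; 배창섭(Mapo-gu Municipal Seogang Library) ; 이은주(Mapo-gu Municipal Seogang Library) ; 한성국(Wonkwang University) 2009, Vol.26, No.3, pp.47-67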

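Abstract (translated from Korean)

With ubiquitous information service technology becoming commonplace, the library ecosystem is undergoing major change. The change is felt not only in information media, as with the digital library, but also in a shift toward a user-centered, service-oriented perspective, as with Library 2.0 and the social semantic digital library. Focusing on the evolution of library systems, this study analyzed the environmental factors involved in realizing knowledge services and presented a logical model of a knowledge-service-oriented library system. The logical model can serve as a framework for carrying out the library's fundamental mission by harmonizing diverse knowledge and information resources, active users who participate and collaborate, innovation in library operations, and ubiquitous information technology.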

Abstract

The ecosystem of the library has been changing radically with the advent of ubiquitous information service technology. We are already familiar with the digital library, which arose from the spread of digital information resources, and we are now seeing user-centered, service-oriented libraries such as Library 2.0 and the Social Semantic Digital Library. We characterize the ultimate goal of the evolution of library systems as knowledge services and propose a logical model of a library system for realizing them. This logical model can serve as a library framework that harmonizes diverse knowledge resources, active users who participate and collaborate, innovation in library business, and ubiquitous information service technologies to achieve the mission of the library in a knowledge-intensive society.

6

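Building a Bibliographic Ontology for Semantic Digital Library Services

이유진(아이네크 Co., Ltd.) ; 양성권(Biomedical Knowledge Engineering Laboratory, School of Dentistry, Seoul National University) ; 송민아(Biomedical Knowledge Engineering Laboratory, School of Dentistry, Seoul National University) ; 김홍기(Seoul National University) 2009, Vol.26, No.1, pp.215-230 https://doi.org/10.3743/KOSIM.2009.26.1.215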

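Abstract (translated from Korean)

By analyzing bibliographic metadata models such as MARC, DC, MODS, and MarcOnt, the metadata model of JeromeDL, an example of a social semantic digital library, and FRBR, a bibliographic conceptual model, we propose an ontology model applicable to the bibliographic metadata of domestic digital libraries. The model takes into account the varied resource formats and characteristics of digital libraries and subsumes and extends existing bibliographic metadata, with the aim of building a bibliographic ontology that is general-purpose and highly interoperable for bibliographic resources and that supports semantic search and services.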

Abstract

We propose a semantic model applicable to the bibliographic metadata of domestic digital libraries by analysing bibliographic metadata models such as MARC, DC, and MODS; JeromeDL's metadata model MarcOnt, as a representative case of a semantic digital library; and the FRBR model, as a conceptual model.

7

Design of an Ontology-Based Retrieval Interface for Researcher Information

서은경(Hansung University) ; 박미향() 2009, Vol.26, No.2, pp.173-194 https://doi.org/10.3743/KOSIM.2009.26.2.173

Abstract

Recently, semantic search techniques that treat the information space as consisting of unambiguous, non-redundant, formal pieces of ontological knowledge have been developed so that users can exploit large knowledge bases. The purpose of this study is to design a smarter, more user-friendly retrieval interface based on ontological analysis, one that can provide more precise information by reducing semantic ambiguity and richer linked information based on well-defined relationships. The study therefore first focuses on an ontological analysis of researcher information: selecting descriptive elements, defining the classes and properties of those elements, and identifying the relationships between the properties and the restrictions on those relationships. Next, the study designs a prototype retrieval interface based on the ontology-based representation, which supports semantic searching and browsing of researchers as a full-fledged domain. On the proposed interface, users can search for various facts about researchers, such as research outputs, personal information, or career history, and browse the social connections of researchers, such as groups of researchers who lecture or do research on the same subject or take part in the same intellectual communication.

8

A Study on a Story Writing Helper That Provides Suitable Sentences in the Semantic Web Environment

이태영(Jeonbuk National University) 2009, Vol.26, No.4, pp.7-34 https://doi.org/10.3743/KOSIM.2009.26.4.007

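Abstract (translated from Korean)

To build a full-text and sentence retrieval system that supports story writing, this study (1) analyzed the structures of stories, paragraphs, and sentences and (2) studied the linguistic inference applied to indexing and search queries. A database of the stories, paragraphs, and sentences needed for story writing was devised, along with a knowledge base of the required inference rules and an ontology. The instance files on which inference is based were written in a markup language format that runs in the Semantic Web environment. For the system to be practical in the Semantic Web environment, an indexing methodology that accurately represents paragraphs and sentences and a markup language that can turn them precisely into a knowledge base are considered essential.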

Abstract

The structures of stories, paragraphs, and sentences, and the inferences applied to indexing and searching, were studied in order to construct a full-text and sentence retrieval system for storytelling. A database of stories, paragraphs, and sentences and a knowledge base of inference rules were designed to aid story writing. The knowledge base comprised files of story frames, paragraph scripts, and sentence logics written in markup languages such as SWRL that can operate on the Semantic Web. It is necessary to establish a more precise indexing language that represents the sentences and to create a markup language able to express more accurate inference rules.

9

A Study on the Attributes of Next-Generation Search Services

이수상(Pusan National University) ; 이순영(Pusan National University) 2009, Vol.26, No.4, pp.93-112 https://doi.org/10.3743/KOSIM.2009.26.4.093

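Abstract (translated from Korean)

Discussion of next-generation search services, typified by Search 2.0, has recently become active in information retrieval. Drawing on the various discussions of how information retrieval has developed and evolved, this study divided that development into stages. It then examined the background, main concepts, related cases, and attributes of the next-generation search services now under discussion, and ran a cluster analysis on data about those attributes and cases to identify the key keywords that describe next-generation search services. The cluster analysis showed that the main keywords representing next-generation search services are social search, intelligent semantic search, and relation-based search.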

Abstract

Recently, in the information retrieval environment, there have been lively discussions about Search 2.0, which is representative of next-generation search services. In this study, we divide information search models into matching and linking models according to their developmental stages. We then analyze the background, main concepts, related attributes, and cases of next-generation search services, and identify the representative keywords through a cluster analysis of those attributes and cases. The results show that social search, artificial intelligence and semantic search, and relation/network-based search are the main keywords representing Search 2.0.

10

A Study on Updating Thesauri Using Collective Intelligence: Focusing on Wikipedia

한승희(Seoul Women's University) 2009, Vol.26, No.3, pp.25-43 https://doi.org/10.3743/KOSIM.2009.26.3.025

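Abstract (translated from Korean)

This study updated a thesaurus using Wikipedia and evaluated the result in order to confirm whether collective intelligence can be used for thesaurus updating. Updating the ASIS&T thesaurus showed that the resulting wiki thesaurus was superior to the ASIS&T thesaurus in term coverage. Evaluation of the updated thesaurus also demonstrated that Wikipedia can be used for thesaurus updating. In particular, the structural characteristics of Wikipedia, summarized as redirects, categories, and mutual links, were confirmed to be suitable for extracting the semantic relations of a thesaurus. To generalize these results, the approach needs to be applied to a variety of thesauri, including multilingual ones.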

Abstract

The purpose of this study is to suggest how the classic thesaurus structure of terms and links can be mined and updated from Wikipedia, a leading example of collective intelligence. A comparison with the ASIS&T thesaurus found that Wikipedia offers substantial coverage of domain-specific concepts and semantic relations. Furthermore, the structural characteristics of Wikipedia, such as redirects, categories, and mutual links, were found to be suitable for extracting the semantic relationships of a thesaurus. To generalize the results of this research, the approach needs to be applied to updating various thesauri, including multilingual ones.
