Journal of the Korean Society for Information Management, Korean Society for Information Management

41

A Study on the Current State of Data Papers through an Analysis of the Journal Scientific Data

정은경(이화여자대학교) 2019, Vol.36, No.1, pp.117-135 https://doi.org/10.3743/KOSIM.2019.36.1.117

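Abstract (from the Korean)

Data journals and data papers emerged from the scholarly practice of data sharing and reuse under the open science paradigm, and they continue to grow. This paper analyzes a total of 713 papers published in Scientific Data, an influential multidisciplinary data journal, in terms of authors, citations, and subject areas. The results show that the authors' main subject areas are biotechnology, physics, and related fields, and that the average number of co-authors is 12. Viewed as a network, the co-authorship patterns show specific groups of researchers co-authoring in a closed manner. The subject areas of the cited works do not differ greatly from those of the data paper authors, but the high share of citations to journals that mainly deal with methodology can be seen as a characteristic of data papers. The subject areas of the data papers, examined through a co-word analysis network built from author keywords, center on biology, with more specific subject areas such as marine ecology, cancer, genome, database, and temperature. These results show that, although Scientific Data is a data journal covering multidisciplinary fields, it is concentrated in biotechnology, a field that began discussing data journal publication early on.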

Abstract

Data journals and data papers have grown and are now considered an important scholarly practice in the open science paradigm, in the context of data sharing and data reuse. This study investigates a total of 713 data papers published in Scientific Data in terms of authors, citations, and subject areas. The findings show that the core authors' subject areas are Biotechnology and Physics. The average number of co-authors is 12, and the co-authorship patterns form several closed sub-networks. In terms of citation status, the subject areas of the cited publications are highly similar to those of the data paper authors. However, the citation analysis indicates a considerable number of citations to journals specializing in methodology. The network of authors’ keywords identifies more detailed areas such as marine ecology, cancer, genome, database, and temperature. This result indicates that biology-oriented subjects are the primary areas in the journal, although Scientific Data is categorized as multidisciplinary science in the Web of Science database.

42

A Study on Metadata Elements for Developing KORMARC Data Fields for Archival Records

박진희(전북대학교) 2005, Vol.22, No.3, pp.351-378 https://doi.org/10.3743/KOSIM.2005.22.3.351

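Abstract (from the Korean)

This study defined metadata elements for developing KORMARC data fields for archival records, so that records can be searched and used in existing library information systems. The results are summarized as follows. First, in addition to the seven areas presented in ISAD(G)2, the study added a conservation area and a physical description area. Also, to address the problem raised in earlier research that ISAD(G)2 presents only 26 elements, which is insufficient for institutions that need detailed descriptive elements, the study selected sub-elements for each area by synthesizing the results of its analysis. Second, to reflect the particular characteristics of Korean records in the descriptive elements, the study added record description elements selected through an analysis of the paper official document and electronic document forms set out in the Enforcement Rules of the Office Management Regulations and the Act on Promotion of the Digitalization of Administrative Affairs, etc. for the Realization of E-Government. It also added the elements for disclosure status and grade, disclosure date, scope of disclosure, retention period, preservation grade, preservation value, and description of the condition of records, as prescribed in the Enforcement Decree of the Act on the Management of Records by Public Institutions. Third, for records management, the study developed the 512 creation dates note, 5 finding aids note, 583 action note, and 584 accumulation note data fields, and reorganized and added subfield codes for the 245 title statement, 300 physical description, 306 playing time, 506 restriction on access note, 534 original version note, 535 location of originals/duplicates note, 540 terms governing use and reproduction notes, 541 immediate source of acquisition note, biographical or historical note, 581 publication note, and 850 holding institution data fields.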

Abstract

The study intended to develop KORMARC for archives in order to integrate archives with library materials. The results of the study can be summarized as follows. (1) Two areas, conservation and physical description, are added to the seven areas of ISAD(G)2. The study has also shown that the existing 26 elements of ISAD(G)2 are not sufficient to satisfy the information demands of institutions and their users. (2) For the use of domestic archives in particular, the study has added the description elements of archives that appear in the Government Regulations of Office Management and in the document forms that are specified by law for the sake of computerization. The study has also added the elements for possible release and grade, release dates, release range, conservation periods, conservation grade, conservation value, and the status description of archives that are specified in the Public Record Management Law. (3) The study has developed the following data fields to be added into KORMARC: the 512 creation dates note, the finding aids note, the 583 action note, and the 584 accumulation note. It also reorganizes and adds the indicators of the 245 title statement, 300 physical description, 306 playing time, 506 restriction on access note, 534 original version note, 535 location of originals/duplicates note, 540 terms governing use and reproduction notes, 541 immediate source of acquisition note, biographical or historical note, 581 publication note, and 850 holding institution data fields.

43

An Analysis of Transaction Data of Document Delivery Services in Science and Technology

김홍렬(전주대학교) 2004, Vol.21, No.2, pp.169-187 https://doi.org/10.3743/KOSIM.2004.21.2.169

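Abstract (from the Korean)

As the paradigm of future libraries shifts from holding information to providing access to it, interlibrary cooperation and document delivery services are becoming more important. Document delivery services offer considerable advantages in that individual libraries can reduce their acquisitions budgets, improve the quality of their information services, and raise user satisfaction with those services. The purpose of this study is to forecast usage trends in document delivery services by analyzing the transaction data of domestic users, to examine the changes in user needs that the data reveal, and to provide evidence that domestic libraries and information centers can use to improve the quality of document delivery services and raise user satisfaction. To this end, actual usage data from KISTI-DDS were used to analyze differences in document delivery service use by year, region, and user group, and trends in photocopying by document type were also observed. The study also examined the supplying institutions that provided the photocopies and the methods by which users obtained the full text, and analyzed whether there were meaningful differences by year and user group.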

Abstract

The purpose of this study is to analyze the usage patterns of document delivery services among domestic users, based on transaction data from the photocopying services of KISTI-DDS, the most important document delivery organization in Korea. To this end, the study investigated the number of processed documents, the preferred document types, the ordering coverage for photocopying, and the delivery methods for photocopied documents in the DDS (document delivery service), using DDS transaction data for the four years from 2000 to 2003.

44

An Analysis of Associations among Subject Areas of Big Data Research Papers: Applying Co-citation Relationships

곽철완(강남대학교) 2018, Vol.35, No.1, pp.13-32 https://doi.org/10.3743/KOSIM.2018.35.1.013

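Abstract (from the Korean)

The purpose of this study is to analyze the associations among the subject areas of big data research papers. The subject areas of the papers under analysis were extracted by applying co-citation relationships, the association rules were analyzed using the Apriori algorithm in R, and the results were visualized using the arulesViz package. As a result, 22 subject areas were extracted, and these fell into three clusters. An analysis of the types of association among subject areas classified them, according to the complexity of the associations, as ‘specialized’, ‘general’, and ‘expanded’. The specialized type included library and information science, journalism and broadcasting, and others; the general type included political science and diplomacy, trade, tourism, and others; and the expanded type included other humanities, general social sciences, general tourism, and others. These associations show that big data researchers who cite one subject area tend to cite other, related subject areas, and libraries need to consider services that make use of these associations in their academic information services.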

Abstract

The purpose of this study is to analyze the associations among the subject areas of big data research papers. The subject groups of the units of analysis were extracted by applying co-citation networks, and the association rules were analyzed using the Apriori algorithm in R and visualized using the arulesViz package. As a result, 22 subject areas were extracted, and these subjects were divided into three clusters. An analysis of the association types classified the subjects into ‘professional type’, ‘general type’, and ‘expanded type’, depending on the complexity of the associations. The professional type included library and information science and journalism. The general type included politics & diplomacy, trade, and tourism. The expanded type included other humanities, general social sciences, and general tourism. These association networks show a tendency to cite other relevant subject areas when citing a given subject field, and libraries should consider services that use these associations in their academic information services.
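To illustrate the measures behind association rules between subject areas, the following is a minimal Python sketch, not the R Apriori and arulesViz workflow the study reports. It is restricted to item pairs, so it is simpler than the full level-wise Apriori search. The function name `pairwise_rules`, the thresholds, and the sample data are all hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def pairwise_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        # Evaluate the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence, lift))
    return sorted(rules)


# Hypothetical data: each set holds the subject areas cited by one paper.
citing_papers = [
    {"library science", "journalism"},
    {"library science", "journalism", "tourism"},
    {"library science", "tourism"},
    {"journalism", "library science"},
    {"tourism", "trade"},
]

rules = pairwise_rules(citing_papers)
for antecedent, consequent, support, confidence, lift in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests that two subject areas are cited together more often than their individual frequencies would predict, which is the kind of tendency the abstract describes.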

45

A Case Study on Applying AI-OCR to Convert Paper Records into Data

안세진(김포시 행정과) ; 황현호(㈜악어디지털) ; 임진희(이화여자대학교 정책과학과) 2022, Vol.39, No.3, pp.165-193 https://doi.org/10.3743/KOSIM.2022.39.3.165

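Abstract (from the Korean)

Digital technology can be said to be at the center of change in the modern work environment. In particular, in typical public institutions, which document their work through records produced in business management systems and document production systems, the records management system is the work environment itself. To respond proactively to the era of Fourth Industrial Revolution technologies and to innovate its work environment, Gimpo City applied to the 2021 Public Sector Cloud Leading Project of the National Information Society Agency (NIA), was selected as a leading institution, and, with 330 million won in support, carried out a project to strengthen record search and use through AI-OCR on a public cloud. Through this project, non-electronic records, which had been limited to image viewing and to searches dependent on standardized index values, were converted into data, and a recognition rate of 98% was achieved by applying the new AI-OCR technology. Because the use of digital technology in a public institution improved work efficiency and productivity, reduced development costs, and raised the level of records management services for internal and external users, this case study of combining new technology with records management aims to share a direction for enhancing the core expertise of the records management field and a case of work environment innovation.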

Abstract

Digital technology can be said to be at the center of change in the modern work environment. In particular, in typical public institutions that document their work with records produced by business management systems and document production systems, the record management system is the work environment itself. Gimpo City applied for the 2021 public cloud leading project of the National Information Society Agency (NIA) to respond proactively to the era of 4th industrial revolution technology, and implemented a public cloud-based AI-OCR technology enhancement project with 330 million won in support. Through this project, non-electronic records, which had been limited to image viewing and to searches that depend on standardized index values, were converted into data. In addition, a 98% recognition rate was achieved by applying a new technology called AI-OCR. Since digital technology has been used to improve work efficiency and productivity, reduce development costs, and raise record management service levels for internal and external users, we would like to share a direction for enhancing expertise in record management and a case of work environment innovation.

46

A Study on Patent and Trademark Search Based on Query Log Data

이지연(연세대학교) ; 백우진(건국대학교) 2006, Vol.23, No.2, pp.61-79 https://doi.org/10.3743/KOSIM.2006.23.2.061

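Abstract (from the Korean)

This study set out to propose ways to improve patent and trademark search. To this end, it analyzed log data for 100,016 queries written by 17,559 users of the patent technology information service of the Korea Institute of Patent Information over 193 days. In addition to analyzing individual query logs, it analyzed 2,202 search sessions involving multiple queries and found additional clues for improving retrieval. According to the analysis, patent and trademark search resembles general web search, and is especially similar in that queries are short. However, patent and trademark searchers used Boolean operators more than general web searchers did. The analysis of multiple queries made it possible to propose search functions that could help users reformulate queries. The analysis of search sessions consisting of multiple queries showed that users reformulate queries by paraphrasing, specializing, generalizing, replacing, and stopping.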

Abstract

To develop recommendations for improving patent & trademark retrieval efficiency, 100,016 patent & trademark search requests by 17,559 unique users over a period of 193 days were analyzed. By analyzing 2,202 multi-query sessions, in which one user issued two or more queries consecutively, we discovered a number of clues for improving retrieval efficiency. The session analysis also led to suggestions for new system features to help users reformulate queries. The patent & trademark retrieval users were found to be similar to typical web users in certain respects, especially in issuing short queries. However, we also found that the patent & trademark retrieval users used Boolean operators more than typical web search users. By analyzing the multi-query sessions, we found that the users had five intentions in reformulating queries: paraphrasing, specialization, generalization, alternation, and interruption, which were also used by web search engine users.
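The kind of log statistics described above (query length, Boolean operator use, multi-query sessions) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the record layout `(user, timestamp_seconds, query)`, the 30-minute session gap, the uppercase `AND`/`OR`/`NOT` operator syntax, and the sample log are all hypothetical.

```python
from collections import defaultdict

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}  # assumed operator syntax


def split_sessions(log, max_gap_seconds=1800):
    """Group (user, timestamp, query) records into per-user sessions.

    A new session starts when the same user is idle longer than max_gap_seconds.
    """
    by_user = defaultdict(list)
    for user, timestamp, query in sorted(log, key=lambda r: (r[0], r[1])):
        by_user[user].append((timestamp, query))

    sessions = []
    for user, records in by_user.items():
        current = [records[0][1]]
        for (previous_ts, _), (timestamp, query) in zip(records, records[1:]):
            if timestamp - previous_ts > max_gap_seconds:
                sessions.append((user, current))
                current = []
            current.append(query)
        sessions.append((user, current))
    return sessions


def summarize(log):
    """Compute mean query length, Boolean usage share, and multi-query session count."""
    queries = [query for _, _, query in log]
    sessions = split_sessions(log)
    with_boolean = sum(
        any(token in BOOLEAN_OPERATORS for token in query.split())
        for query in queries
    )
    return {
        # Operator tokens are counted as terms in this simple sketch.
        "mean_terms": sum(len(q.split()) for q in queries) / len(queries),
        "boolean_share": with_boolean / len(queries),
        "multi_query_sessions": sum(len(qs) > 1 for _, qs in sessions),
    }


# Hypothetical log records: (user id, timestamp in seconds, query text).
log = [
    ("u1", 0, "engine"),
    ("u1", 60, "engine AND diesel"),
    ("u1", 7200, "trademark logo"),
    ("u2", 10, "battery OR cell"),
]

stats = summarize(log)
print(stats)
```

Consecutive queries inside one multi-query session are the natural unit for studying how a user reformulates a query.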

47

A Study on Improving the Construction and Publication of Linked Open Data in Korean Libraries

이성숙(충남대학교) 2020, Vol.37, No.2, pp.145-169 https://doi.org/10.3743/KOSIM.2020.37.2.145

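Abstract (from the Korean)

At a time when library LOD has not spread widely, the purpose of this study is to examine the current state of LOD publication and construction in Korean libraries and to seek ways to improve it. The research methods were a literature review, case studies, and expert interviews. The improvements proposed in this study are as follows. First, libraries need to avoid duplicating LOD construction targets and build unique, specialized data. Second, libraries need to develop LOD services that reflect user needs and implement convenient LOD interfaces. Third, libraries need to establish an identification scheme for data and build authority files. Fourth, libraries need to make librarians and users aware of the need to open and link data, and provide opportunities for education and promotion to that end. Fifth, libraries need to use LOD for integrated search and establish an integrated platform for searching library LOD. Sixth, libraries need to strengthen cooperation on publishing and using LOD and form a working-level consultative body. Seventh, the government should pursue strong policies with a sustained commitment to LOD and needs to continue providing budget support.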

Abstract

The purpose of this study is to find the causes of, and solutions to, the situation in which library LOD has not spread since its introduction. Research methods include literature research, case studies, and expert interviews. The improvements presented in this study are as follows. First, libraries need to avoid redundancy in LOD construction targets and build unique, specialized data. Second, libraries need to develop LOD services that reflect user needs and implement convenient LOD interfaces. Third, libraries need to establish an identification system for data and build an authority file. Fourth, libraries need to make librarians and users aware of the necessity of opening and linking data, and provide opportunities for education and publicity. Fifth, libraries need to use LOD for integrated search and establish an integrated platform for searching library LOD. Sixth, libraries need to strengthen cooperation for LOD issuance and utilization, and form a working-level consultative body. Seventh, the government should pursue strong policies with a continuous commitment to LOD promotion and needs to continue providing budget support.

48

A Study on the Interdisciplinary Structure of the Big Data Research Field through Journal-Level Bibliographic Coupling Analysis

이보람(이화여자대학교) ; 정은경(이화여자대학교) 2016, Vol.33, No.3, pp.133-154 https://doi.org/10.3743/KOSIM.2016.33.3.133

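Abstract (from the Korean)

Interdisciplinary research that crosses disciplinary boundaries has emerged to solve the diverse and complex problems of modern society. This study sought to establish the interdisciplinarity of the big data field, which has recently drawn attention in many areas, and to identify its interdisciplinary structure. To this end, data were collected on a total of 1,083 journals that have dealt with big data. Of these, 420 journals (38.8%) were assigned two or more Web of Science subject categories (SC), and for 239 journals (22.1%) the assigned SCs belonged to different disciplines, confirming the relatively high interdisciplinarity of the big data field. In addition, a bibliographic coupling network built for the top 56 journals by number of published papers yielded a total of 10 clusters. Seven of the 10 clusters belonged to computer science, with most research concentrated on technical aspects such as storing, processing, and analyzing big data. The cluster analysis also confirmed that research on analyzing and using big data is under way in a variety of fields, including science and technology, engineering, communication, law, geography, and biotechnology. Finally, measuring betweenness centrality, nearest centrality, and triangle betweenness centrality in the network showed that computer science journals have a large influence on the network and strong topical relatedness.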

Abstract

The interdisciplinary approach has been recognized as one of the key strategies for addressing diverse and complex research problems in modern science. The purpose of this study is to investigate the interdisciplinary characteristics and structure of the field of big data. Among the 1,083 journals related to the field of big data, multiple Subject Categories (SC) from the Web of Science were assigned to 420 journals (38.8%), and 239 journals (22.1%) were assigned SCs from different fields. These results show that the field of big data is interdisciplinary in character. In addition, a bibliographic coupling network analysis of the top 56 journals identified 10 clusters in the network. Among the 10 clusters, 7 were from the computer science field, focusing on technical aspects such as storing, processing, and analyzing data. The cluster analysis also identified research on analyzing and utilizing big data in various fields such as science & technology, engineering, communication, law, geography, and bio-engineering. Finally, measurements of three types of journal centrality (betweenness centrality, nearest centrality, triangle betweenness centrality) showed that computer science journals have a strong impact on, and strong topical relations to, other fields in the network.
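As a small illustration of journal-level bibliographic coupling, the sketch below counts the cited sources that two journals share, which is the usual basis for edge weights in a coupling network. It is a minimal Python example under stated assumptions, not the study's pipeline: the function name `coupling_strengths`, the journal names, and the reference identifiers are hypothetical, and the clustering and centrality steps are not reproduced.

```python
from itertools import combinations


def coupling_strengths(references):
    """Map each journal pair to the number of cited sources they share."""
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:  # keep only pairs that are actually coupled
            strengths[(a, b)] = shared
    return strengths


# Hypothetical data: the set of sources cited by each journal's papers.
journal_refs = {
    "Journal A": {"r1", "r2", "r3"},
    "Journal B": {"r2", "r3", "r4"},
    "Journal C": {"r5"},
}

strengths = coupling_strengths(journal_refs)
for (a, b), weight in strengths.items():
    print(f"{a} - {b}: {weight} shared references")
```

The resulting weighted pairs can be loaded as edges into a network tool for clustering and centrality measurement.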

49

A Study on the Development of Library Big Data Service Models: Focusing on Public Libraries

표순희(성균관대학교 정보관리연구소) ; 김윤형((주)기술과가치) ; 김혜선(한국과학기술정보연구원) ; 김완종(한국과학기술정보연구원) 2015, Vol.32, No.2, pp.63-86 https://doi.org/10.3743/KOSIM.2015.32.2.063

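Abstract (from the Korean)

This study aims to apply big data, which has recently become a major issue, to the library field in order to deepen understanding of the value of using the various forms of library big data, and to develop public library big data service models based on an analysis of user demand. To this end, it examined the concept, content, and value of library big data and developed library big data service models based on an analysis of demand for such services. To develop the service models, the available library big data were analyzed by type, and user needs were elicited through various methods. The needs analysis was carried out through in-depth interviews with library researchers and practicing librarians, focus group interviews (FGI), and surveys of librarians and users. On this basis, a total of 16 library big data service models were defined, and, considering the necessity, urgency, and feasibility of each service, two models were finally developed: a decision support service for librarians and a book recommendation and reading history management service for users.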

Abstract

Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze, and it is now considered to create new opportunities in every industry. The purpose of this study is to develop big data services in public libraries to improve library services. To this end, we analyzed the types of library big data and the needs of stakeholders through various methods such as in-depth interviews, focus group interviews, and questionnaires. In the first step, we defined 16 big data service models from interviews with librarians and LIS professionals. In the second step, we considered the necessity, timeliness, and development feasibility of each service. We developed the final two services, called ‘Decision Support Services for Public Librarians’ and ‘Book Recommendation Services for Users.’

50

A Study on Diversifying Transfer Specifications and Reproduction Methods for Administrative Information Datasets

양동민(전북대학교 기록관리학과) ; 최광훈(알엠소프트) ; 김지혜(전북대학교 기록관리학과 박사과정) ; 유남희(전북대학교 기록관리학과) 2023, Vol.40, No.4, pp.167-200 https://doi.org/10.3743/KOSIM.2023.40.4.167

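Abstract (from the Korean)

In Korea, records management of administrative information datasets recommends using SIARD as the transfer specification when such datasets are transferred. However, there are many cases in which SIARD is not suitable, owing to the records management unit of administrative information datasets, the technical limitations of tools that support SIARD, and the practical circumstances of public institutions. This study proposes ways to diversify the transfer specifications for administrative information datasets beyond SIARD. In the records management of administrative information datasets, the need to reproduce the user interface linked to a dataset continues to be discussed, but no concrete approach has been presented. From the perspective of significant properties, this study confirms that the user interface is also a property that must be preserved, presents a method for reproducing the user interface effectively, and provides a case in which the method was actually verified.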

Abstract

For the record management of administrative information datasets in Korea, it is recommended to utilize SIARD as a transfer specification when transferring administrative information datasets. However, there are many cases where the application of SIARD is not suitable due to the record management unit of administrative information datasets, technical limitations of tools that support SIARD, and the realistic situation of public institutions. In this study, we propose a plan to diversify the transfer specifications of administrative information datasets other than SIARD. In the record management of administrative information datasets, the need to reproduce the user interface associated with the dataset has been discussed but not specifically presented. This study confirms that the user interface is a property to be preserved from the perspective of Significant Properties, proposes a method to effectively reproduce the user interface, and provides an example of actual verification.
