
Construction of Research Fronts Using Factor Graph Model in the Biomedical Literature

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2017, v.34 no.1, pp.177-195
https://doi.org/10.3743/KOSIM.2017.34.1.177



Abstract

This study infers research fronts (RFs) using a factor graph (FG) model based on heterogeneous features. The proposed model identifies research fronts whose documents have the potential to be cited many times in the future. To this end, each document is represented by bibliographic, network, and content features. Bibliographic features include the number of authors, the number of institutions to which the authors belong, proceedings, the number of author-provided keywords, funds, the number of references, the number of pages, and the journal impact factor. Network features include degree centrality, betweenness, and closeness in the document network. Content features are keywords extracted from the title and abstract using keyphrase extraction techniques. The model learns these features of a publication and infers whether the document would be an RF, using the sum-product algorithm and the junction tree algorithm on a factor graph. Our experiments show that the documents predicted by the FG model are more densely connected than those in RFs constructed using a traditional bibliometric approach. The results also indicate that FG-predicted documents exhibit higher degree centrality and betweenness among RFs.
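The inference step named in the abstract can be illustrated with a minimal sketch of the sum-product algorithm. It is not the model from the study: it assumes a toy chain of three documents, one binary label per document (1 meaning "research front"), and made-up factor values. The names `unary`, `pairwise`, `sum_product_chain`, and `brute_force` are hypothetical. On a chain, which is a tree, sum-product yields exact marginals, so the result is checked against full enumeration.

```python
from itertools import product

# Hypothetical unary factors: unary[i][y] scores document i having label y
# (y = 1 means "research front"). In the study these would be learned from
# bibliographic, network, and content features; the numbers here are made up.
unary = [
    [0.7, 0.3],
    [0.4, 0.6],
    [0.2, 0.8],
]

# Hypothetical pairwise factor shared by every link (i, i + 1):
# linked documents are assumed more likely to carry the same label.
pairwise = [
    [0.8, 0.2],
    [0.2, 0.8],
]


def sum_product_chain(unary, pairwise):
    """Return P(y_i = 1) for each variable on a chain factor graph."""
    n = len(unary)
    # fwd[i][y]: message reaching variable i from the left part of the chain
    fwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(1, n):
        for y in (0, 1):
            fwd[i][y] = sum(
                fwd[i - 1][x] * unary[i - 1][x] * pairwise[x][y] for x in (0, 1)
            )
    # bwd[i][y]: message reaching variable i from the right part of the chain
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for y in (0, 1):
            bwd[i][y] = sum(
                bwd[i + 1][x] * unary[i + 1][x] * pairwise[y][x] for x in (0, 1)
            )
    # Belief at each variable: product of incoming messages and its own factor
    marginals = []
    for i in range(n):
        belief = [fwd[i][y] * unary[i][y] * bwd[i][y] for y in (0, 1)]
        marginals.append(belief[1] / (belief[0] + belief[1]))
    return marginals


def brute_force(unary, pairwise):
    """Reference marginals obtained by enumerating every joint labeling."""
    n = len(unary)
    total = 0.0
    on = [0.0] * n
    for labels in product((0, 1), repeat=n):
        weight = 1.0
        for i, y in enumerate(labels):
            weight *= unary[i][y]
        for i in range(n - 1):
            weight *= pairwise[labels[i]][labels[i + 1]]
        total += weight
        for i, y in enumerate(labels):
            if y == 1:
                on[i] += weight
    return [v / total for v in on]


marginals = sum_product_chain(unary, pairwise)
reference = brute_force(unary, pairwise)
```

Message passing costs time linear in the number of documents, while enumeration grows exponentially. For graphs with cycles, exact inference requires something like the junction tree algorithm that the abstract also mentions, which this sketch does not cover.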

Keywords
Bibliographic features, content features, factor graph model, network features, probabilistic graphical model (PGM), research front, deep learning, citation analysis

