ISSN : 1013-0799
As basic data that can systematically support and evaluate R&D activities as well as set current and future research directions by grasping specific trends in domestic academic research, I sought efficient ways to assign standardized subject categories (control keywords) to individual journal papers. To this end, I conducted various experiments on major factors affecting the performance of automatic classification, focusing on feature selection techniques, for the purpose of automatically allocating the classification categories on the National Research Foundation of Korea’s Academic Research Classification Scheme to domestic journal papers. As a result, the automatic classification of domestic journal papers, which are imbalanced datasets of the real environment, showed that a fairly good level of performance can be expected using more simple classifiers, feature selection techniques, and relatively small training sets.