
Ontology-based Integration and Refinement of Evaluation-Committee Data from Heterogeneous Data Sources


Affiliations
1 NTIS Center, KISTI, Korea, Republic of
 

The Korean National Science and Technology Information Service (NTIS) provides a search service for national R&D projects and information about their participating researchers. It also provides a service for recommending and selecting evaluation committees for R&D projects. The underlying R&D data are collected from 17 Korean government ministries and agencies and integrated into NTIS. As a result, duplicate R&D data and researcher records can be inserted, because different organizations may enter the title of the same R&D accomplishment differently. Furthermore, the names of researchers and of related objects such as organizations and journals can also be entered in various forms, since names generally have multiple aliases. In this research, we present an ontology-based data integration and refinement system that integrates researcher information with the researchers' R&D accomplishment data, which would be useful for the recommendation and selection services. We also use the Jaro-Winkler distance algorithm to find and eliminate duplicate accomplishment data. In addition, incorrectly entered data are corrected during the duplicate-elimination process using information obtained from authoritative science libraries.
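
To make the duplicate-detection step concrete, the following is a minimal sketch of how Jaro-Winkler similarity could be used to flag near-identical accomplishment titles. It is an illustration only, not the system described in the paper: the function names, the title normalization, the example titles, and the 0.9 threshold are all assumptions chosen for the example.

```python
from itertools import combinations


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1.0 - j)


def normalize(title: str) -> str:
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(title.lower().split())


def find_duplicate_pairs(titles, threshold=0.9):
    """Return index pairs whose normalized titles meet the assumed threshold."""
    norm = [normalize(t) for t in titles]
    return [
        (i, j)
        for i, j in combinations(range(len(norm)), 2)
        if jaro_winkler(norm[i], norm[j]) >= threshold
    ]


if __name__ == "__main__":
    sample = [
        "Ontology-based Data Integration",
        "Ontology based  Data Integration",
        "A Study of Protein Folding Dynamics",
    ]
    print(find_duplicate_pairs(sample))
```

The pairwise comparison is quadratic in the number of titles, so a real deployment would likely restrict comparisons to candidate groups first (for example, records attributed to the same researcher); that grouping is likewise an assumption and is not taken from the abstract.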

Keywords

Data Integration, Data Refinement, Jaro-Winkler Distance, National R&D Data, Ontology.

Authors

Heeseok Jeong
NTIS Center, KISTI, Korea, Republic of
Hanjo Jeong
NTIS Center, KISTI, Korea, Republic of




DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i23%2F136780