
Visualizing the Domain in 3-Dimension Using Semantic Clustering


Affiliations
1 Computer Science and Engineering, Thapar University, Patiala-147004, India
2 Information Technology, Banasthali University, Banasthali, Rajasthan, India
3 Computer Science and Engineering Department, Thapar University, Patiala-147004, India
     



Many approaches have been developed for understanding software source code. Most of them focus on the program's structural information, which loses the crucial domain semantics contained in the text and symbols of the source code. To understand software as a whole, these approaches need to be enriched with conceptual insights gained from the domain semantics. This paper proposes mapping the domain to the code by applying information retrieval techniques to linguistic information, such as identifier names and comments in the source code. We introduce the concept of Semantic Clustering, along with an algorithm that groups source artifacts based on how their synonymy and polysemy are related. After the clusters are detected, the program code is labeled automatically based on semantic similarity and explored visually in a 3-dimensional format. The most important feature of this approach is that it works at the textual level of the source code, which makes it language independent. The approach correlates the semantics with structural information and applies at different levels of abstraction (e.g., packages, classes, methods).
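
To make the idea of comparing source artifacts by their vocabulary concrete, here is a minimal sketch in Python. It is not the authors' implementation: it only extracts terms from identifiers and comments and compares artifacts by cosine similarity of raw term counts, and it omits the dimensionality reduction that Latent Semantic Indexing would apply before clustering. The function names, the stop-word list, and the three sample artifacts are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop words: language keywords that carry no domain meaning.
STOP_WORDS = {"class", "void", "string", "into"}


def extract_terms(source: str) -> Counter:
    """Count lowercase terms found in identifiers and comments."""
    terms = []
    for token in re.findall(r"[A-Za-z]+", source):
        # Split camelCase / PascalCase tokens, e.g. "getAccountBalance".
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", token)
        terms.extend(p.lower() for p in parts)
    return Counter(t for t in terms if t not in STOP_WORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical source artifacts (assumed sample data, not from the paper).
artifacts = {
    "AccountService": "class AccountService { // deposit money into account\n"
                      " void depositMoney(Account account) {} }",
    "AccountReport": "class AccountReport { // print account balance\n"
                     " void printBalance(Account account) {} }",
    "XmlParser": "class XmlParser { // parse xml node\n"
                 " Node parseNode(String xml) {} }",
}

vectors = {name: extract_terms(text) for name, text in artifacts.items()}

# Print the pairwise similarities between artifacts.
names = sorted(vectors)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        print(first, second, round(cosine(vectors[first], vectors[second]), 3))
```

Because the sketch works only on text, the same functions could be applied to packages, classes, or methods by changing what each entry in `artifacts` contains. A grouping step would then cluster artifacts whose pairwise similarity is high.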

Keywords

Information Retrieval, Latent Semantic Indexing, Semantic Clustering, Software Reverse Engineering.

Authors

Sanjay Madan
Computer Science and Engineering, Thapar University, Patiala-147004, India
Purnima Ahuja
Information Technology, Banasthali University, Banasthali, Rajasthan, India
Shalini Batra
Computer Science and Engineering Department, Thapar University, Patiala-147004, India
