
Training the SVM to Larger Dataset Applications Using the SVM Sampling Technique


Affiliations
1 Computer Science and Engineering Department, VTU University, Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India
2 Computer Science and Engineering Department, VTU University, Veerappa Nisty Engineering College, Shorapurk, Karnataka, India
     



Abstract

With increasing amounts of data being generated by businesses and researchers, there is a need for fast, accurate, and robust algorithms for data analysis. Improvements in database technology, computing performance, and artificial intelligence have contributed to the development of intelligent data analysis. The primary aim of data mining is to discover patterns in the data that lead to a better understanding of the data-generating process and to useful predictions. Applications of data mining include detecting fraudulent credit card transactions, character recognition in automated zip code reading, and predicting compound activity in drug discovery. Real-world data sets are often characterized by large numbers of examples, e.g. billions of credit card transactions and potential 'drug-like' compounds; by being highly unbalanced, e.g. most transactions are not fraudulent and most compounds are not active against a given biological target; and by being corrupted by noise. The relationship between predictive variables, e.g. physical descriptors, and the target concept, e.g. compound activity, is often highly non-linear. One recent technique developed to address these issues is the support vector machine (SVM), a robust tool for classification and regression in noisy, complex domains. The two key features of support vector machines are generalization theory, which leads to a principled way to choose a hypothesis, and kernel functions, which introduce non-linearity into the hypothesis space without explicitly requiring a non-linear algorithm. In this paper we introduce support vector machines, the cascade SVM, and a randomized sampling technique; highlight their advantages over existing data analysis techniques; and note some important points for the data mining practitioner who wishes to use support vector machines.

Keywords

Support Vector Machine, SVM, Machine Learning, Multiprocessing, Scalability and Accurate Performance, Randomized Algorithm.

Authors

G. M. Sangeetha
Computer Science and Engineering Department, VTU University, Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India
Prashanth
Computer Science and Engineering Department, VTU University, Veerappa Nisty Engineering College, Shorapurk, Karnataka, India
