
Bio Inspired Algorithms for Dimensionality Reduction and Outlier Detection in Medical Datasets


Affiliations
1 Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore, India
2 Department of Computer Science, PSG College of Arts & Science, Coimbatore, India
3 Department of Computer Science, Nirmala College for Women, Coimbatore, India
 

Abstract

Dimensionality reduction is a technique used in many applications to reduce the number of features and thereby improve the productivity and efficiency of a task. It is applied in data mining, image processing, networking, mobile computing, and other areas. Clustering is one of the most influential tasks in data mining. The aim of this work is to apply dimensionality reduction algorithms and then cluster the reduced datasets to detect outliers. A bio-inspired Ant Colony Optimization (ACO) algorithm is proposed to reduce dimensionality, and a second bio-inspired algorithm, the Firefly Algorithm (FA), is proposed to detect outliers. Three medical datasets are used in the experiments: a thyroid dataset, an oesophageal dataset, and a heart disease dataset.

Keywords

Dimensionality Reduction, Clustering, Outlier Detection, ACO (Ant Colony Optimization) Algorithm, FA (Firefly Algorithm).
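
To make the dimensionality reduction step more concrete, the following is a minimal sketch of how an ACO-style search over feature subsets might look. It is not the authors' implementation: the function names, the parameter values, the toy data, and the variance-based quality score are all assumptions chosen for illustration. Each artificial ant picks a subset of features with probability proportional to the pheromone on each feature; pheromone then evaporates and is reinforced in proportion to the quality of the subsets that used the feature.

```python
import random
from statistics import pvariance

# Hypothetical toy data: 4 samples x 4 features; features 0 and 2 vary the most.
TOY_ROWS = [
    [1.0, 0.5, 10.0, 0.1],
    [5.0, 0.5, 2.0, 0.1],
    [9.0, 0.6, 6.0, 0.1],
    [3.0, 0.5, 14.0, 0.2],
]


def variance_score(subset, rows=TOY_ROWS):
    """Stand-in quality measure (an assumption): total variance of the chosen columns."""
    return sum(pvariance([row[f] for row in rows]) for f in subset)


def aco_select_features(n_features, score, n_ants=10, n_iters=30,
                        subset_size=2, evaporation=0.2, seed=0):
    """ACO-style feature subset search (illustrative sketch only)."""
    rng = random.Random(seed)
    pheromone = [1.0] * n_features
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iters):
        trails = []
        for _ in range(n_ants):
            # Each ant picks features without replacement, biased by pheromone.
            candidates = list(range(n_features))
            subset = []
            for _ in range(subset_size):
                weights = [pheromone[f] for f in candidates]
                chosen = rng.choices(candidates, weights=weights, k=1)[0]
                subset.append(chosen)
                candidates.remove(chosen)
            s = score(subset)
            trails.append((subset, s))
            if s > best_score:
                best_subset, best_score = sorted(subset), s

        # Evaporate old pheromone, then deposit in proportion to subset quality.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
        for subset, s in trails:
            for f in subset:
                pheromone[f] += max(s, 0.0)

    return best_subset, best_score, pheromone


best, best_score, pheromone = aco_select_features(4, variance_score)
print(best, round(best_score, 2))
```

In practice the score would more plausibly be a clustering or classification quality measure computed on the reduced dataset; total variance is used here only to keep the sketch self-contained.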



Authors

S. Vijayarani
Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore, India
C. Sivamathi
Department of Computer Science, PSG College of Arts & Science, Coimbatore, India
S. Maria Sylviaa
Department of Computer Science, Nirmala College for Women, Coimbatore, India

