Optimized Binning Technique in Decision Tree Model for Predicting The Helicoverpa armigera (Hubner) Incidence on Cotton


Affiliations
1 ICAR-National Bureau of Agricultural Insect Resources, Bengaluru – 560024, Karnataka, India
2 Department of Computer Science, Jain University, Bengaluru – 560011, Karnataka, India
3 University of Agricultural Sciences, Agricultural Research Station, Raichur - 584102, Karnataka, India
 

Decision tree induction is a popular data mining technique for prediction and classification problems. Decision tree analysis is the most suitable model for pest forewarning systems because pest surveillance data contain biotic, abiotic and environmental variables, and IF-THEN rules can be easily framed from the tree. Abiotic factors such as maximum and minimum temperature, rainfall and relative humidity are continuous numerical variables and are important in climate-change studies. The decision tree model is built after the data have been pre-processed into a form suitable for analysis. Data discretization is a pre-processing technique that transforms continuous numerical data into categorical data, with the resulting intervals treated as nominal values. The most commonly used binning methods are equal-width partitioning and equal-depth partitioning. The number of bins created for each variable is important because either too many or too few bins reduces the accuracy of the resulting IF-THEN rules. Hence, an optimized binning technique based on the Mean Integrated Squared Error (MISE) is proposed for forming accurate IF-THEN rules to predict the incidence of the pest Helicoverpa armigera on cotton using decision tree analysis.
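
The binning ideas named in the abstract can be illustrated with a short sketch. It is not the authors' implementation, which this page does not give. It assumes the MISE-based criterion is the histogram cost function of Shimazaki and Shinomoto (2007), C = (2k - v) / w^2, where w is the bin width, k is the mean count per bin and v is the biased variance of the counts. All function names and the sample temperatures are hypothetical.

```python
from statistics import mean


def equal_width_edges(values, n_bins):
    """Equal-width partitioning: n_bins intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def equal_depth_bins(values, n_bins):
    """Equal-depth partitioning: bins hold (nearly) the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]


def bin_counts(values, n_bins):
    """Count values per equal-width bin; the maximum falls in the last bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be identical")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts, width


def mise_cost(values, n_bins):
    """Assumed MISE-based cost (2k - v) / w^2; lower is better."""
    counts, width = bin_counts(values, n_bins)
    k = mean(counts)
    v = sum((c - k) ** 2 for c in counts) / n_bins  # biased variance
    return (2 * k - v) / width ** 2


def optimal_bin_count(values, candidates=range(2, 21)):
    """Pick the candidate bin count with the smallest cost."""
    return min(candidates, key=lambda n: mise_cost(values, n))


# Hypothetical maximum temperatures (deg C), for illustration only.
max_temp = [29.5, 30.1, 30.4, 31.0, 31.2, 31.8, 32.0, 32.3, 32.9, 33.4,
            33.6, 34.0, 34.1, 34.7, 35.2, 35.5, 36.0, 36.8, 37.1, 38.2]

n_best = optimal_bin_count(max_temp)
print("chosen number of bins:", n_best)
print("interval edges:", equal_width_edges(max_temp, n_best))
```

The resulting intervals would then serve as the nominal values of the attribute during decision tree induction, so each IF-THEN rule tests membership in one interval.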

Keywords

Bin Optimization, Decision Tree, Discretization, Helicoverpa armigera, If-Then Rules, Pest Prediction.


Authors

M. Pratheepa
ICAR-National Bureau of Agricultural Insect Resources, Bengaluru – 560024, Karnataka, India
J. Cruz Antony
Department of Computer Science, Jain University, Bengaluru – 560011, Karnataka, India
Chandish R. Ballal
ICAR-National Bureau of Agricultural Insect Resources, Bengaluru – 560024, Karnataka, India
H. Bheemanna
University of Agricultural Sciences, Agricultural Research Station, Raichur - 584102, Karnataka, India


DOI: https://doi.org/10.18311/jbc%2F2018%2F18163