Image Classification using Model Ensembling


Affiliations
1 Department of Computer Science, St. Xavier’s College, India
2 Department of Computer Application, RCC Institute of Information Technology, India
3 Department of Computer Science and Engineering, RCC Institute of Information Technology, India
     


Abstract

Efficient image classification is increasingly useful as the field of computer vision grows rapidly. This paper trains several models independently to classify images and then combines them into an ensemble that performs better than each individual model. The dataset consists of 200 classes with 90,000 training images, 10,000 validation images and 10,000 test images. Data preparation involves resizing the images, shuffling them and wrapping them in a data generator that feeds the models. The images were also augmented using two different sets of image transformations to give the models more data to train on. These data were then used to train five models independently: one trained from scratch and four using pre-trained weights and transfer learning. Each model was evaluated with two metrics, F1-score and categorical accuracy. Fine-tuning of the models was also attempted to improve performance, and finally the models were ensembled to obtain better categorical accuracy and F1-score on unseen (validation and test) data.
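The combination and evaluation steps described in the abstract can be sketched as follows. The abstract does not state how the models' outputs are merged or how the F1-score is averaged, so this minimal sketch assumes soft voting (averaging each model's class probabilities) and a macro-averaged F1. The function names, the three toy models and their probabilities are hypothetical and chosen only for illustration; the paper itself uses five models and 200 classes.

```python
from typing import List, Sequence

Probs = List[List[float]]  # one row of class probabilities per image


def soft_vote(model_probs: Sequence[Probs]) -> Probs:
    """Average per-class probabilities across models (assumed combination rule)."""
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def predict_labels(probs: Probs) -> List[int]:
    """Pick the most probable class index for each image."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def categorical_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of images whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Hypothetical toy data: 4 images, 3 classes, 3 models.
y_true = [0, 1, 2, 1]
model_a = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.1, 0.2, 0.7], [0.2, 0.7, 0.1]]
model_b = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.5, 0.3], [0.1, 0.6, 0.3]]
model_c = [[0.3, 0.5, 0.2], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8], [0.3, 0.6, 0.1]]

single_acc = [
    categorical_accuracy(y_true, predict_labels(m))
    for m in (model_a, model_b, model_c)
]
ensemble_pred = predict_labels(soft_vote([model_a, model_b, model_c]))
ensemble_acc = categorical_accuracy(y_true, ensemble_pred)
ensemble_f1 = macro_f1(y_true, ensemble_pred, n_classes=3)

print("single-model accuracy:", single_acc)
print("ensemble accuracy:", ensemble_acc, "macro F1:", ensemble_f1)
```

On this toy data each model classifies three of the four images correctly, while the averaged prediction gets all four right. The gain arises because the three models err on different images; with real models the improvement depends on how uncorrelated their errors are.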

Keywords

Image Classification, Convolutional Neural Networks, Image Augmentation, Model Ensembling, F1-Score







Authors

Debabrata Datta
Department of Computer Science, St. Xavier’s College, India
Anweshan Mukherjee
Department of Computer Science, St. Xavier’s College, India
Soumen Mukherjee
Department of Computer Application, RCC Institute of Information Technology, India
Arup Kr. Bhattacharjee
Department of Computer Science and Engineering, RCC Institute of Information Technology, India
Anal Acharya
Department of Computer Science, St. Xavier’s College, India

