
Video Segmentation and Object Tracking using Improvised Deep Learning Algorithms







Authors

G. Shanmugapriya
Department of Artificial Intelligence and Data Science, Adhiyamaan College of Engineering, India
G. Pavithra
Department of Computer Science and Engineering, RVS College of Engineering and Technology, India
M.K. Anandkumar
Department of Electrical and Electronics Engineering, Excel Engineering College, India
D. Pavankumar
Department of Electronics and Communication Engineering, Bapuji Institute of Engineering and Technology, India

Abstract


Video segmentation and object tracking are critical tasks in computer vision, with applications ranging from autonomous driving to surveillance and video analytics. Traditional approaches often struggle with challenges such as occlusion, background clutter, and high computational cost, limiting their accuracy and efficiency in real-world scenarios. This research addresses these issues by employing improvised deep learning algorithms, specifically Convolutional Neural Networks (CNN), VGG, and AlexNet, to enhance the precision and speed of video segmentation and object tracking. The proposed method integrates the feature extraction capabilities of CNNs with the deeper architecture of VGG for improved feature representation, and leverages AlexNet's computational efficiency to ensure scalability. A novel multi-stage training process is implemented, in which a CNN provides initial object localization, VGG refines segmentation boundaries, and AlexNet accelerates real-time tracking. The framework was trained and evaluated on benchmark datasets such as DAVIS and MOT17, covering diverse scenarios of varying complexity. The results show significant improvements in accuracy and speed over existing methods. On the DAVIS dataset, the approach achieved a segmentation accuracy of 89.7% and an Intersection over Union (IoU) score of 86.5%. For object tracking on MOT17, the system attained a Multi-Object Tracking Accuracy (MOTA) of 82.3% and an average frame processing rate of 35 frames per second (FPS), outperforming baseline methods by 8.5% in accuracy and 15% in computational efficiency. Uniting CNN, VGG, and AlexNet in a single framework offers a robust solution for video segmentation and object tracking, demonstrating enhanced accuracy, adaptability, and real-time performance. These findings hold promise for applications requiring reliable and efficient visual analysis.
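As context for the numbers reported above, the two headline metrics can be sketched in a few lines. This is a minimal illustration of the standard definitions, not code from the paper; the function and parameter names are ours.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2).

    The same ratio applies to segmentation masks, with pixel counts in
    place of box areas.
    """
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def mota(false_negatives, false_positives, id_switches, ground_truth_count):
    """MOTA = 1 - (FN + FP + IDSW) / GT, per the CLEAR MOT metrics.

    Counts are accumulated over all frames of a sequence; GT is the total
    number of ground-truth object instances.
    """
    return 1.0 - (false_negatives + false_positives
                  + id_switches) / ground_truth_count
```

For example, two unit-overlap 2x2 boxes give `iou((0, 0, 2, 2), (1, 1, 3, 3)) == 1/7`, and a sequence with 10 misses, 5 false positives, and 2 identity switches over 100 ground-truth instances gives a MOTA of 0.83.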

Keywords


Video segmentation, object tracking, deep learning, CNN, VGG, AlexNet