Patch Based Stereo Matching Using Convolutional Neural Network


Affiliations
1 Department of Computer Science and Engineering, Jai Narain Vyas University, India
2 Department of Production and Industrial Engineering, Jai Narain Vyas University, India
     



The paper presents a new Convolutional Neural Network (CNN) architecture, called stacked stereo CNN, for computing a disparity map from stereo images. In stacked stereo CNN, the left and right image patches are stacked back-to-back and fed to a single-tower CNN. This is in contrast to a Siamese network, which uses two towers, one for the left patch and the other for the right patch. The proposed network is trained on a large set of similar and dissimilar image patches, which are generated from stereo images and their ground-truth images from the Middlebury stereo datasets. The network returns a dissimilarity score for a pair of image patches, which is used to compute the cost volume. The cost volume is further refined using post-processing steps before the final disparity map is generated. The proposed network is evaluated on the Middlebury datasets and achieves results comparable to those of state-of-the-art algorithms.

Keywords

Stereo Vision, Patch Matching, Disparity Map, CNN.
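
The pipeline outlined in the abstract (stack a left and a right patch, score the pair, fill a cost volume, pick a disparity) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the patch size, and the choice to stack the two patches along a channel axis are assumptions made for illustration, and `placeholder_score` is a simple mean absolute difference standing in for the trained CNN, whose architecture is not given here. The post-processing steps are omitted, and a plain winner-take-all selection is used in their place.

```python
import numpy as np

def stack_patches(left_patch, right_patch):
    """Stack two HxW grayscale patches into one 2xHxW array.
    Assumption: 'stacked back-to-back' is modelled as channel-wise stacking."""
    return np.stack([left_patch, right_patch], axis=0)

def placeholder_score(stacked):
    """Hypothetical stand-in for the trained network: returns a
    dissimilarity score (mean absolute difference of the two channels)."""
    return float(np.mean(np.abs(stacked[0] - stacked[1])))

def cost_volume(left, right, max_disp, half=2, score_fn=placeholder_score):
    """Build a (max_disp+1) x H x W cost volume from rectified grayscale images.
    cost[d, y, x] scores the left patch at (y, x) against the right patch at (y, x-d)."""
    h, w = left.shape
    size = 2 * half + 1
    pad_l = np.pad(left, half, mode="edge")
    pad_r = np.pad(right, half, mode="edge")
    cost = np.full((max_disp + 1, h, w), np.inf)  # inf marks invalid candidates
    for d in range(max_disp + 1):
        for y in range(h):
            for x in range(d, w):
                lp = pad_l[y:y + size, x:x + size]
                rp = pad_r[y:y + size, x - d:x - d + size]
                cost[d, y, x] = score_fn(stack_patches(lp, rp))
    return cost

def winner_take_all(cost):
    """Pick, per pixel, the disparity with the lowest dissimilarity score."""
    return np.argmin(cost, axis=0)
```

In use, `winner_take_all(cost_volume(left, right, max_disp))` returns an integer disparity map with the same height and width as the input images. Swapping `score_fn` for a learned model that accepts the stacked patch would be the natural extension point, under the same assumptions.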

Authors

Rachna Verma
Department of Computer Science and Engineering, Jai Narain Vyas University, India
Arvind Kumar Verma
Department of Production and Industrial Engineering, Jai Narain Vyas University, India

