Unsupervised monocular depth estimation with multi-scale structural similarity powered loss function / Ali Kohan

Depth estimation refers to a set of techniques and algorithms that aim to obtain a representation of the spatial information of a scene. Today, dedicated hardware such as sensors, radars, and multi-view cameras is used to acquire depth data. Modern approaches use...


Bibliographic Details
Main Author: Ali, Kohan
Format: Thesis
Published: 2020
Subjects: QA75 Electronic computers. Computer science; QA76 Computer software
author Ali, Kohan
description Depth estimation refers to a set of techniques and algorithms that aim to obtain a representation of the spatial information of a scene. Today, dedicated hardware such as sensors, radars, and multi-view cameras is used to acquire depth data. Modern approaches use deep learning to address this task by learning depth in a supervised manner. However, supervised training requires a large amount of ground-truth data for a particular scene before a model can be trained successfully, and preparing ground-truth data for a range of environments is challenging and expensive. Most recent works have therefore proposed self-supervised approaches, which implicitly infer the target data from a stereo pair of images and use it to train a deep neural network to predict the disparities between the two views. The disparity between two horizontally offset views of the same object measures how far that object shifts along the horizontal line from one view to the other. Given the predicted disparities, the depth of the scene can be calculated using simple geometric formulas. This approach, however, has shown flaws on specular and transparent surfaces, for which it tends to predict inconsistent depth. This work proposes a novel training objective with which a deep convolutional neural network learns to predict depth from a single image and which improves the quality of depth prediction on specular and transparent surfaces. The proposed method follows previous works that try to reconstruct the right view of a scene given the left one. In addition, considering the importance of loss layers to the performance of neural networks, it proposes a new image reconstruction and matching loss function aimed at improving depth estimation consistency on specular and transparent surfaces. The proposed loss function is perceptually motivated by the human visual system, on the assumption that it will increase image reconstruction quality while preserving the key structures of a scene, and that this in turn will directly improve depth prediction and resolve the aforementioned deficiencies of preceding works.
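The "simple geometric formulas" mentioned in the description are, for a rectified stereo pair, usually the standard relation depth = focal length × baseline / disparity. The sketch below illustrates that relation only; it is not taken from the thesis, and the function name, the `eps` guard, and the numeric values are assumptions chosen for illustration.

```python
import numpy as np


def disparity_to_depth(disparity_px, focal_length_px, baseline_m, eps=1e-6):
    """Convert disparity (pixels) to depth (metres) for a rectified stereo pair.

    Uses the standard pinhole-stereo relation depth = f * B / d.
    The name and the eps guard are illustrative assumptions.
    """
    disparity = np.asarray(disparity_px, dtype=np.float64)
    # Clamp near-zero disparities (very distant points) to avoid division by zero.
    return focal_length_px * baseline_m / np.maximum(disparity, eps)


# Hypothetical camera parameters and disparities, not values from the thesis.
depth = disparity_to_depth(
    np.array([[72.0, 36.0], [18.0, 9.0]]),
    focal_length_px=720.0,
    baseline_m=0.5,
)
print(depth)
```

Halving the disparity doubles the estimated depth, which is why small disparity errors on distant or reflective surfaces translate into large depth errors.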
format Thesis
id oai:studentsrepo.um.edu.my:14369
institution Universiti Malaya
publishDate 2020
record_format eprints
spelling oai:studentsrepo.um.edu.my:14369 2023-05-08T20:31:39Z Unsupervised monocular depth estimation with multi-scale structural similarity powered loss function / Ali Kohan Ali, Kohan QA75 Electronic computers. Computer science QA76 Computer software 2020-07 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/14369/2/Ali_Kohan.pdf application/pdf http://studentsrepo.um.edu.my/14369/1/Ali_Kohan.pdf Ali, Kohan (2020) Unsupervised monocular depth estimation with multi-scale structural similarity powered loss function / Ali Kohan. Masters thesis, Universiti Malaya. http://studentsrepo.um.edu.my/14369/
title Unsupervised monocular depth estimation with multi-scale structural similarity powered loss function / Ali Kohan
topic QA75 Electronic computers. Computer science
QA76 Computer software
url-record http://studentsrepo.um.edu.my/14369/