Unsupervised Confidence for LiDAR Depth Maps and Applications
🎉 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022) 🎉
Andrea Conti · Matteo Poggi · Filippo Aleotti · Stefano Mattoccia
Overview
Depth perception is pivotal in many fields, such as robotics and autonomous driving, to name a few. Consequently, depth sensors such as LiDARs have rapidly spread across many applications. The 3D point clouds generated by these sensors must often be coupled with an RGB camera to understand the framed scene semantically. Usually, the former is projected onto the camera image plane, leading to a sparse depth map. Unfortunately, this process, coupled with the intrinsic issues affecting all depth sensors, yields noise and gross outliers in the final output. As an example, the image below shows the outlier formation process caused by visual occlusions between the camera and the depth sensor.
We propose an effective unsupervised framework that explicitly addresses this issue by learning to estimate the confidence of the LiDAR sparse depth map, thus allowing us to filter out the outliers.
To train our framework, we model the confidence of the LiDAR depth $d$ by assuming a Gaussian distribution and minimizing the negative log-likelihood function
$$ \mathcal{L}_G = - \ln \left( \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(d - d^*)^2}{2\sigma^2}} \right) $$
which, discarding the constant term $\ln\sqrt{2\pi}$, can be rewritten as
$$ \mathcal{L}_G \approx \ln(\sigma) + \frac{(d - d^*)^2}{2\sigma^2} $$
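As a minimal sketch of this loss, assuming a PyTorch implementation with `(B, 1, H, W)` tensors (both assumptions, not details given in this document), where `sigma` is the network output and `d_star` is the proxy label introduced below:

```python
import torch

def gaussian_nll_loss(sigma, d, d_star, eps=1e-6):
    """Negative log-likelihood of d under N(d_star, sigma^2), up to a constant.

    sigma:  predicted uncertainty map, shape (B, 1, H, W)
    d:      sparse LiDAR depth projected on the image plane, same shape
    d_star: proxy depth label (see below), same shape, zero where undefined
    """
    valid = (d > 0) & (d_star > 0)  # supervise only valid LiDAR points
    sigma = sigma.clamp(min=eps)    # numerical safety for log and division
    nll = torch.log(sigma) + (d - d_star) ** 2 / (2 * sigma ** 2)
    return nll[valid].mean()
```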
Applying the loss function above requires predicting both $d^*$ and $\sigma$. However, doing so would mean learning the confidence $\sigma$ of the network output $d^*$, and this is not our goal. Thus, instead of predicting $d^*$, we employ a proxy label, computed as follows, representing a plausibly correct depth for each original LiDAR depth value.
$$ d^*_x = \min \ \{ d : d \in P(x), d > 0 \} $$
Where $x$ is a valid coordinate and $P(x)$ a patch of size $N \times N$ centered at $x$. Using the minimum depth value correctly selects foreground points as reliable in the presence of occlusions. As a drawback, it may indiscriminately flag most background pixels as outliers, even when they are not occluded. However, in practice, we will show that the network is not severely affected by this approximation, which is in turn fast and unsupervised. Further details are described in our paper.
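The minimum over the valid depths inside each patch can be computed efficiently as a negated max-pooling; below is a sketch under the same PyTorch assumption, with the patch size `N = 5` chosen purely for illustration:

```python
import torch
import torch.nn.functional as F

def proxy_depth(d, patch_size=5):
    """Proxy label d*: per-pixel minimum of the valid depths in an N x N patch.

    d: sparse LiDAR depth map, shape (B, 1, H, W), zero where no point projects.
    """
    # Invalid pixels are set to +inf so they never win the minimum.
    d_inf = torch.where(d > 0, d, torch.full_like(d, float("inf")))
    # Min-pooling implemented as -max_pool(-x) over the N x N window.
    d_min = -F.max_pool2d(-d_inf, kernel_size=patch_size,
                          stride=1, padding=patch_size // 2)
    # d* is defined only at valid LiDAR coordinates.
    return torch.where(d > 0, d_min, torch.zeros_like(d))
```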
Qualitative Results
In this section we report a small set of examples; for each one we show, respectively, the image, the raw LiDAR depth map, the LiDAR depth map filtered with our approach, and our sparse confidence map.
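The filtering shown in these examples amounts to thresholding the predicted uncertainty; the sketch below illustrates one way to do it, where both the simple thresholding rule and the value of `tau` are assumptions for illustration rather than the paper's exact procedure:

```python
import torch

def filter_lidar(d, sigma, tau=1.0):
    """Keep LiDAR points whose predicted uncertainty is below tau.

    d, sigma: sparse depth map and predicted uncertainty, shape (B, 1, H, W).
    tau: illustrative threshold (hypothetical value, to be tuned).
    """
    keep = (d > 0) & (sigma < tau)
    return torch.where(keep, d, torch.zeros_like(d))
```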
Reference
@inproceedings{aconti2022lidarconf,
title={Unsupervised confidence for LiDAR depth maps and applications},
author={Conti, Andrea and Poggi, Matteo and Aleotti, Filippo and Mattoccia, Stefano},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems},
note={IROS},
year={2022}
}