Robust Depth Enhancement via Polarization Prompt Fusion Tuning


CVPR 2024

1 CAIR, HKISI-CAS, 2 Princeton University, 3 HKUST, 4 CASIA, 5 CUHK, 6 KTH
* Equal Contribution, Corresponding Authors



Polarization Prompt Fusion Tuning (PPFT) leverages the dense shape cues from polarization and produces accurate results on challenging depth enhancement problems.

Abstract

Existing depth sensors are imperfect and may provide inaccurate depth values in challenging scenarios, such as in the presence of transparent or reflective objects. In this work, we present a general framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors. Previous polarization-based depth enhancement methods focus on utilizing pure physics-based formulas for a single sensor. In contrast, our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors. To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets, as the size of the polarization dataset is limited to train a strong model from scratch. We conducted extensive experiments on a public dataset, and the results demonstrate that the proposed method performs favorably compared to existing depth enhancement baselines.


Architecture

Polarization Prompt Fusion Tuning (PPFT). We sequentially fuse polarization embeddings into the features extracted from the pre-trained layers using our proposed Polarization Prompt Fusion Block (PPFB). Specifically, the polarization features enter each PPFB as the prompt M, while the features from the pre-trained foundation model serve as the input X. Both are then updated and passed on to the next foundation encoder stage and the next PPFB, respectively.
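The cascade described above can be sketched as follows. This is a minimal illustration of the prompt-fusion pattern, not the authors' implementation: the gated-addition fusion, the projection, and all shapes and names here are assumptions made for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class PromptFusionBlockSketch:
    """Illustrative PPFB-style block: a per-channel learned gate decides how
    much of the polarization prompt M to inject into the pre-trained
    features X. Weights are random/zero here purely for demonstration."""

    def __init__(self, channels, rng=None):
        rng = rng or np.random.default_rng(0)
        # 1x1-conv-like projection of the prompt, plus a channel gate
        self.proj = rng.standard_normal((channels, channels)) * 0.01
        self.gate = np.zeros(channels)  # sigmoid(0) = 0.5 at initialization

    def __call__(self, x, m):
        # x, m: (channels, H, W) feature maps of matching shape
        m_proj = np.einsum('oc,chw->ohw', self.proj, m)
        g = sigmoid(self.gate)[:, None, None]
        x_new = x + g * m_proj   # fused features -> next encoder stage
        m_new = m + m_proj       # updated prompt -> next fusion block
        return x_new, m_new

# Toy usage: two sequential stages, mirroring the cascade described above.
c, h, w = 8, 4, 4
x = np.ones((c, h, w))
m = np.ones((c, h, w))
block1, block2 = PromptFusionBlockSketch(c), PromptFusionBlockSketch(c)
x, m = block1(x, m)
x, m = block2(x, m)
assert x.shape == (c, h, w) and m.shape == (c, h, w)
```

The key design point is that the prompt and the backbone features are kept as two parallel streams that exchange information at every stage, so the frozen pre-trained pathway is steered by polarization cues rather than overwritten by them.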


Visualization

Point Cloud Visualization


Comparison between our approach and baselines using point cloud visualization. In addition to restoring challenging irregularities (e.g., the transparent bottle highlighted in row 2), our method produces a higher degree of regularity (e.g., the bottom-left corner highlighted in row 2, which is misaligned in the sensor depth and therefore yields blank point clouds), demonstrating the strong surface geometry information provided by polarization.

Depth Map Visualization


Qualitative comparison on the HAMMER dataset. We present results for different depth sensors; the red boxes highlight regions of interest. Our method yields significantly better results on transparent surfaces (e.g., the water bottle) and in detailed regions, such as the stack of objects on the left in row 2.


Results on HAMMER



BibTeX

@misc{ikemura2024robust,
  title={Robust Depth Enhancement via Polarization Prompt Fusion Tuning},
  author={Kei Ikemura and Yiming Huang and Felix Heide and Zhaoxiang Zhang and Qifeng Chen and Chenyang Lei},
  year={2024},
  eprint={2404.04318},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Acknowledgement

We sincerely thank the authors of CompletionFormer for their open-source code, and the authors of HAMMER for the open-source dataset.