Neural 3D Representations for View Synthesis with Sparse Input Views
UC San Diego Electronic Theses and Dissertations

Abstract

Reconstructing scenes or objects from observed images has long been a critical problem in the graphics and vision communities. Traditional methods solve this inverse problem by capturing a large number of images to resolve the ambiguity in geometry and appearance. However, capturing and storing such large amounts of data consumes extensive compute and storage resources and is often infeasible on consumer-grade hardware. This dissertation presents several algorithms that reconstruct 3D geometry and appearance from a handful of input views, enabling efficient data capture and storage as well as generalization to unseen scenes.

Starting with scene reconstruction, we first target data captured by 360° cameras. We introduce multi-depth panoramas, a compact representation that enables translational and rotational movement within a 3D scene. We leverage multi-view stereo (MVS) techniques and deep neural networks to lift 16 input views into a panoramic representation that renders convincing novel views efficiently while keeping storage requirements small.
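
As a rough illustration only, and not the dissertation's actual pipeline, a layered panoramic representation of this kind can be rendered by alpha-compositing its RGBA depth layers from back to front; the layer count and image sizes below are hypothetical.

    import numpy as np

    def composite_layers(layers):
        # layers: (D, H, W, 4) RGBA panorama layers, ordered far -> near,
        # with all values in [0, 1].
        height, width = layers.shape[1:3]
        out = np.zeros((height, width, 3), dtype=np.float32)
        for rgba in layers:                          # iterate far -> near
            rgb, alpha = rgba[..., :3], rgba[..., 3:4]
            out = rgb * alpha + out * (1.0 - alpha)  # standard "over" compositing
        return out

    # Hypothetical sizes: 16 source views distilled into 8 panoramic depth layers.
    panorama_layers = np.random.rand(8, 256, 512, 4).astype(np.float32)
    novel_view = composite_layers(panorama_layers)
    print(novel_view.shape)  # (256, 512, 3)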

Furthermore, we explore a harder problem by reducing the input view count to only two and capturing scenes with dynamic content. We present the deep 3D mask volume, a novel representation that ensures temporally stable renderings for view extrapolation. Our network uses information from the video frames to infer disocclusions caused by moving objects, and it then produces a 3D mask volume that replaces the disoccluded regions with temporally stable background content, yielding flicker-free results.
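
A minimal sketch of how such a mask might be applied, assuming, purely for illustration, that the predicted mask volume holds per-voxel weights that blend a per-frame volume with a temporally stable background volume:

    import numpy as np

    def blend_with_mask_volume(frame_volume, background_volume, mask_volume):
        # frame_volume, background_volume: (D, H, W, 4) RGBA volumes built from
        # the current frames and from the static background, respectively.
        # mask_volume: (D, H, W, 1) weights in [0, 1]; 1 keeps the current frame,
        # 0 falls back to the stable background (e.g. in disoccluded regions).
        return mask_volume * frame_volume + (1.0 - mask_volume) * background_volume

    # Hypothetical shapes, for illustration only.
    frame = np.random.rand(32, 128, 128, 4).astype(np.float32)
    background = np.random.rand(32, 128, 128, 4).astype(np.float32)
    mask = np.random.rand(32, 128, 128, 1).astype(np.float32)
    blended = blend_with_mask_volume(frame, background, mask)
    print(blended.shape)  # (32, 128, 128, 4)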

Next, we focus on human portraits and seek to change the viewpoint and the lighting at the same time. We develop the neural light-transport field (NeLF), a representation trained on synthetic human portraits that generates novel views under novel lighting from only five input images.
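
In spirit, and with details that differ from the actual NeLF formulation, relighting with a light-transport representation reduces to weighting an environment map by predicted per-pixel transport coefficients; all quantities below are hypothetical stand-ins for network outputs.

    import numpy as np

    # Hypothetical example: assume a network has predicted, for one pixel, a
    # transport vector with one coefficient per environment-map direction. The
    # relit pixel color is then the transport-weighted sum of the environment
    # light, computed independently per RGB channel.
    num_directions = 16 * 32                       # assumed environment-map resolution
    transport = np.random.rand(num_directions)     # predicted transport coefficients
    env_light = np.random.rand(num_directions, 3)  # RGB radiance per direction

    pixel_rgb = transport @ env_light              # (3,) relit pixel color
    print(pixel_rgb)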

Finally, we investigate the 3D reconstruction problem in which only a single image is given. To this end, we present VisionNeRF, an algorithm that combines the expressiveness and capacity of vision transformers with the high-fidelity rendering of volumetric representations to synthesize unseen views of a given object.
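
For context, the volumetric rendering that approaches like VisionNeRF build on composites color and density samples along each camera ray. The sketch below shows the standard quadrature with hypothetical sample values, not the dissertation's implementation.

    import numpy as np

    def render_ray(densities, colors, deltas):
        # Standard volume-rendering quadrature along one ray.
        # densities: (N,) non-negative density at each sample
        # colors:    (N, 3) RGB at each sample
        # deltas:    (N,) distances between consecutive samples
        alphas = 1.0 - np.exp(-densities * deltas)                       # per-sample opacity
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))   # transmittance
        weights = trans * alphas                                         # per-sample contribution
        return (weights[:, None] * colors).sum(axis=0)                   # composited RGB

    # Hypothetical sample values for a single camera ray.
    n = 64
    rgb = render_ray(np.random.rand(n), np.random.rand(n, 3), np.full(n, 0.02))
    print(rgb)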
