THESIS
2024
1 online resource (xviii, 135 pages) : illustrations (some color)
Abstract
Generative models have advanced considerably in recent years, particularly in 2D image and video synthesis. However, evident inconsistencies, such as those in lighting and geometry, persist in 2D and video generation. Incorporating 3D modeling has the potential to improve the coherence and realism of 2D and video generation, underscoring the need for advances in 3D generation.
Given the difficulty of collecting large-scale 3D data for direct generative modeling, a practical approach to 3D generation is to learn 3D distributions from single-view images. This approach is viable because single-view image data are abundant, unstructured, high-quality, and diverse. A common strategy for 3D generation from single-view images is to adopt generative adversarial networks (GANs), with the conventional 2D generator replaced by a 3D renderer.
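For illustration, a minimal sketch of this 3D-aware GAN setup, assuming a NeRF-style radiance field as the 3D representation and a simple volume renderer; the names RadianceFieldGenerator and volume_render are hypothetical placeholders, not taken from the thesis:

```python
import torch
import torch.nn as nn

class RadianceFieldGenerator(nn.Module):
    """Maps a latent code z to color and density at 3D sample points."""
    def __init__(self, z_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(z_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB (3) + density (1) per point
        )

    def forward(self, z, points):
        # z: (B, z_dim); points: (B, R, S, 3) = S samples on each of R rays
        B, R, S, _ = points.shape
        z = z.view(B, 1, 1, -1).expand(-1, R, S, -1)
        return self.mlp(torch.cat([z, points], dim=-1))  # (B, R, S, 4)

def volume_render(rgb_sigma, deltas):
    # Alpha-composite the samples along each ray into a pixel color.
    # rgb_sigma: (B, R, S, 4); deltas: (B, R, S, 1) spacing between samples.
    rgb, sigma = rgb_sigma[..., :3], torch.relu(rgb_sigma[..., 3:])
    alpha = 1.0 - torch.exp(-sigma * deltas)                   # (B, R, S, 1)
    ones = torch.ones_like(alpha[..., :1, :])
    trans = torch.cumprod(torch.cat([ones, 1 - alpha + 1e-10], dim=-2),
                          dim=-2)[..., :-1, :]                 # transmittance
    return (trans * alpha * rgb).sum(dim=-2)                   # (B, R, 3)
```

A standard 2D image discriminator then scores the rendered views against real photographs, so the 3D representation is trained with single-view 2D supervision only.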
This thesis explores 3D generation from four perspectives. First, we examine the generated geometry and propose to improve the learned geometry by injecting 3D awareness not only into the generator but also into the discriminator. Second, we analyze the pose requirements for training 3D generative models and free the generator from the constraints of pose priors, yielding a more flexible 3D generative model. Third, for complex scene synthesis, we analyze the shortcomings of existing methods and propose to leverage 3D priors to facilitate 3D modeling from single-view scene images. Fourth, we discuss incorporating efficient representations into 3D generation, especially Gaussian Splatting, as sketched below. Finally, we present potential future directions in 3D generation.
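As a pointer to the fourth item, a minimal sketch of the learnable parameters of a 3D Gaussian Splatting representation; the parameterization follows common 3DGS conventions rather than the thesis's specific implementation, and the rasterizer that splats the Gaussians into an image is omitted:

```python
import torch
import torch.nn as nn

class GaussianCloud(nn.Module):
    """N anisotropic 3D Gaussians, each carrying geometry and appearance."""
    def __init__(self, n=100_000):
        super().__init__()
        self.means = nn.Parameter(torch.randn(n, 3))        # centers
        self.log_scales = nn.Parameter(torch.zeros(n, 3))   # per-axis extent
        self.quats = nn.Parameter(torch.randn(n, 4))        # rotation quaternion
        self.colors = nn.Parameter(torch.rand(n, 3))        # RGB (or SH coeffs)
        self.opacity_logits = nn.Parameter(torch.zeros(n, 1))

    def activated(self):
        # Activations keep scales positive, quaternions unit-norm,
        # and opacities in (0, 1), as in standard 3DGS training.
        return dict(
            means=self.means,
            scales=self.log_scales.exp(),
            rotations=nn.functional.normalize(self.quats, dim=-1),
            colors=self.colors,
            opacities=torch.sigmoid(self.opacity_logits),
        )
```

Because every primitive is explicit and rasterization avoids per-ray MLP queries, such a representation renders far faster than volumetric fields, which is what makes it attractive for generation.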