
Requirement:
Complete any one of the options below for the computer vision research paper implementation:
Option 1:
A) Select a paper published in CVPR, ICCV, or ECCV (published/accepted in 2019, 2020, or 2021) whose code is not available online (ANYWHERE) and code the paper from scratch.
B) Recreate the metrics using the available dataset.
C) Use either TensorFlow or PyTorch, and STRICTLY NO KERAS.
D) It should be able to run on Google Colab.
E) Provide the report.
Option 2:
A) Select a paper published in CVPR, ICCV, or ECCV (published/accepted in 2019, 2020, or 2021) whose code is available online (preferably the official code).
B) Bring it onto Google Colab and recreate the metrics using the available dataset.
C) Improve the model [MUST].
D) Compare the metrics between the paper and your improvements.
E) Provide the report.
Solution:
The paper was taken from CVPR 2020: "3D Photography Using Context-Aware Layered Depth Inpainting". Paper link: https://ieeexplore.ieee.org/document/9157537
Published in: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
Abstract:
In this paper, a method is proposed for converting a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view. A Layered Depth Image with explicit pixel connectivity is used as the underlying representation, together with a learning-based inpainting model that synthesizes new local color-and-depth content into the occluded regions in a spatially context-aware manner. The resulting 3D photos can be rendered efficiently with motion parallax using standard graphics engines. The effectiveness of the method is validated on a wide range of challenging everyday scenes, showing fewer artifacts compared with the state of the art.
Improvements made:
1) A single color RGB-D image is used as input and a 3D photo is created in mp4 format. The 3D photo generation is based on a multi-layer representation, in which color and depth structures are hallucinated in regions occluded in the original view.
2) The implementation uses a Layered Depth Image with explicit pixel connectivity as the underlying representation and a learning-based inpainting model that synthesizes new local content into the occluded regions in a spatially context-aware way.
3) The final 3D photo results can be rendered efficiently with motion parallax using standard graphics engines.
● We cloned the paper's original Git repository (published by the authors), implemented the different metrics, and ran the code on additional datasets, such as different images and videos of monkeys, obtaining interesting results; a Colab setup sketch is shown below.
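The following is a minimal Colab cell sketch for this setup. It assumes the authors' official repository (vt-vl-lab/3d-photo-inpainting on GitHub) and its layout at the time of writing (download.sh for pretrained weights, argument.yml for configuration, an image/ input folder, and mp4 outputs under video/); check the repository README before running, since paths and script names may have changed, and the copied input file name here is only a placeholder.

# Run in a Google Colab cell (the leading "!" executes shell commands, "%cd" changes directory).
# Clone the authors' official repository and move into it.
!git clone https://github.com/vt-vl-lab/3d-photo-inpainting.git
%cd 3d-photo-inpainting

# Fetch the pretrained model weights (depth estimation and inpainting networks).
!sh download.sh

# Place your own test images (e.g., the monkey photos) into the image/ folder.
!cp /content/my_monkey_photo.jpg image/

# Generate the 3D photos; the rendered .mp4 camera-trajectory videos land in video/.
!python main.py --config argument.yml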
Comparison of Results in the Original Paper and Our Implementation:
We compare our model with MPI-based approaches on the RealEstate10K dataset. DPSNet is used to obtain the input depth maps for our method. We evaluate the MPI-based approaches using the authors' pre-trained weights. Two challenging examples with complex depth structures are shown below from our results: our method synthesizes plausible structures around depth boundaries, while StereoMag and PB-MPI produce artifacts around depth discontinuities, and LLFF suffers from ghosting artifacts when extrapolating novel views.


Compared to MPI models, we test how well our model can extrapolate views [56, 77, 4, 40]. We randomly sample 1500 RealEstate10K video sequences to generate testing triplets. For each triplet, we set the target view at t = 10 so that it extrapolates beyond the source (t = 0) and reference (t = 4) views. DPSNet is used to generate the required input depth maps. We quantify the quality of each model's synthesized target view against the ground truth using SSIM and PSNR, as in the sketch below.
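As an illustration of this evaluation step, the following is a minimal sketch of how SSIM and PSNR can be computed between a synthesized view and its ground truth using scikit-image (version 0.19+ assumed for the channel_axis argument); the file names are placeholders, not paths from the actual experiment.

import numpy as np
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder file names; substitute the synthesized target view and its ground truth.
pred = imread("synthesized_view.png").astype(np.float64) / 255.0
gt = imread("ground_truth_view.png").astype(np.float64) / 255.0

# PSNR: higher is better; data_range matches the [0, 1] normalization above.
psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)

# SSIM: computed per-channel over the last (RGB) axis and averaged internally.
ssim = structural_similarity(gt, pred, data_range=1.0, channel_axis=-1)

print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")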
Conclusions:
In this paper, we presented an algorithm for creating compelling 3D photos from a single RGB-D image. Our key technical innovation is a complete layered depth representation obtained through context-aware color and depth inpainting. Our method is validated on a broad range of everyday scenes. Our experimental results show that, compared with state-of-the-art novel view synthesis methods, our algorithm produces considerably fewer visual artifacts.
Implementation
If you need the code implementation of this advanced Machine Learning project, then send your request and get the solution from our experts.
Send your machine learning requirement details to realcode4you@gmail.com to get help with any machine learning project related to:
Deep Learning
Natural Language Processing
Research Paper Implementation
Advanced Deep Learning
Data Visualization
And many more