PixelSynth: Generating a 3D-Consistent Experience from a Single Image
Chris Rockwell
David Fouhey
Justin Johnson
University of Michigan
ICCV 2021

Interactive Demo



[Interactive demo viewer. Labeled comparison panels: SynSin (6x), No 3D Accum.]

Demo Instructions.
View in widescreen or zoom out until all images fit on one line.
Look around by clicking and dragging on an image above or using [W,A,S,D] keys.
Use the buttons above or the shift key to toggle between translation and rotation.
Click "Granular Scene Movement" to load more images for smoother movement.
This probably won't work on Internet Explorer or on mobile.


Recent advances in differentiable rendering and 3D reasoning have driven exciting results in novel view synthesis from a single image. Despite producing realistic results, existing methods are limited to relatively small view changes. To synthesize immersive scenes, models must also be able to extrapolate beyond the input view. We present an approach that fuses 3D reasoning with autoregressive modeling to outpaint large view changes in a 3D-consistent manner, enabling scene synthesis. We demonstrate considerable improvement in single-image, large-angle view synthesis compared to a variety of methods and possible variants, across simulated and real datasets. In addition, we show increased 3D consistency compared to alternative accumulation methods.
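To make the need for extrapolation concrete, here is a minimal sketch (not the PixelSynth method, and not code from the paper) that estimates how much of a rotated target view is never seen by the input image. It assumes an idealized pinhole camera, a pure yaw rotation, and a single row of pixels; the function name `hole_fraction` and the default width and focal length are hypothetical choices made only for this illustration.

```python
import math


def hole_fraction(yaw_rad, width=64, focal=32.0):
    """Fraction of target-view pixel columns with no source-image content.

    Assumes a pinhole camera rotated by `yaw_rad` about the vertical axis.
    Under pure rotation depth does not matter, so each target column maps
    to exactly one source column (or falls outside the source image).
    """
    cx = width / 2.0
    cos_t, sin_t = math.cos(yaw_rad), math.sin(yaw_rad)
    holes = 0
    for u_t in range(width):
        # Ray through the center of the target pixel, in target camera coords.
        x, z = (u_t + 0.5 - cx) / focal, 1.0
        # Express the same ray in the source camera frame.
        x_s = x * cos_t + z * sin_t
        z_s = -x * sin_t + z * cos_t
        if z_s <= 0.0:
            holes += 1  # ray points behind the source camera
            continue
        u_s = focal * x_s / z_s + cx
        if not (0.0 <= u_s < width):
            holes += 1  # ray leaves the source image: must be outpainted
    return holes / width


if __name__ == "__main__":
    for degrees in (0, 15, 30, 60, 90):
        frac = hole_fraction(math.radians(degrees))
        print(f"yaw {degrees:3d} deg: {frac:.2f} of columns unseen")
```

With the default 90-degree field of view, small rotations leave only a narrow unseen strip, while large rotations leave most of the target view with no source content at all. Filling that region plausibly is the extrapolation problem the abstract refers to.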

Paper and Supplemental Material

Rockwell, Fouhey and Johnson.
PixelSynth: Generating a 3D-Consistent Experience from a Single Image.
In ICCV 2021. (Hosted on arXiv)

[Paper] [Supplemental] [Code]
author = {Chris Rockwell and David F. Fouhey and Justin Johnson},
title = {PixelSynth: Generating a 3D-Consistent Experience from a Single Image},
booktitle = {ICCV},
year = 2021

Video Results

We display rendered scenes from PixelSynth and baselines on RealEstate10K and Matterport.

Recent & Concurrent Work

There has been a variety of exciting recent and concurrent work on single-image novel view synthesis. In addition to SynSin, here is a partial list:


Thanks to Angel Chang, Angela Dai, Richard Tucker, and Noah Snavely for allowing us to share frames from their datasets. Thanks to Olivia Wiles and Ajay Jain for their polished model repositories, which were very helpful in this work. Thanks to Shengyi Qian, Karan Desai, Mohamed El Banani, Linyi Jin, and Richard Higgins for helpful discussions. Special thanks to the Michigan Help Desk (DCO) for after-hours help with machines. The webpage template originally came from some colorful folks.