We're happy for you to find your own papers to read and present;
in fact, we encourage you to do so, since you'll enjoy the work more.
On this page, we offer a few pointers to places to find good papers,
along with a short list of example papers for your consideration.
Once you've narrowed your choices down to a few possibilities,
you can send your
ideas to me (Craig) in an
email or stop by my office. Be sure also to include a few dates on which
you'd be willing to present. We'll make a final paper selection and
assign you a date.
You can start by looking at papers in some of the important graphics
conferences: SIGGRAPH, SIGGRAPH Asia, Eurographics and Graphics Interface.
There might also be relevant photography-related papers in some of the
computer vision conferences; the big ones are ICCV and CVPR. Consider
also some recent specialty conferences, most importantly the
Projector-Camera Systems Workshop and the International
Conference on Computational Photography. Generally speaking, you
can find lots of links to graphics authors and project pages on the
website maintained
by Ke-Sen Huang.
If you come across other good links, please send them to me and I'll
include them here.
Here is a list of potential papers for presentations. In some cases,
it may make sense to combine two or more papers into a single presentation,
to show the historical development of a technique or to contrast different
approaches. The DOI links will take you to the publisher's page for the
paper; if you're on campus, you'll be able to download the paper from
these pages. You should also check out the project pages. They'll
often have freely available copies of the paper, as well as additional
resources (images, video, slides, source code, etc.).
You can click on the BibTeX and Abstract links to expand the information
on any paper inline.
For your convenience, the references below are also available for
download as a single BibTeX file.
Key | Citation | Details |
[AAC+2006]
|
Agarwala, A.,
Agrawala, M.,
Cohen, M.,
Salesin, D. and
Szeliski, R. 2006.
Photographing long scenes with multi-viewpoint panoramas.
In SIGGRAPH '06: ACM SIGGRAPH 2006 Papers,
ACM,
New York, NY, USA, 853–861.
We present a system for producing multi-viewpoint panoramas of long, roughly planar scenes, such as the facades of buildings along a city street, from a relatively sparse set of photographs captured with a handheld still camera that is moved along the scene. Our work is a significant departure from previous methods for creating multi-viewpoint panoramas, which composite thin vertical strips from a video sequence captured by a translating video camera, in that the resulting panoramas are composed of relatively large regions of ordinary perspective. In our system, the only user input required beyond capturing the photographs themselves is to identify the dominant plane of the photographed scene; our system then computes a panorama automatically using Markov Random Field optimization. Users may exert additional control over the appearance of the result by drawing rough strokes that indicate various high-level goals. We demonstrate the results of our system on several scenes, including urban streets, a river bank, and a grocery store aisle.
@inproceedings{AAC+2006,
author = {Agarwala, Aseem and Agrawala, Maneesh and Cohen, Michael and Salesin, David and Szeliski, Richard},
title = {Photographing long scenes with multi-viewpoint panoramas},
booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Papers},
year = {2006},
isbn = {1-59593-364-6},
pages = {853--861},
location = {Boston, Massachusetts},
doi = {10.1145/1179352.1141966},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://grail.cs.washington.edu/projects/multipano/},
abstract = {We present a system for producing multi-viewpoint panoramas of long, roughly planar scenes, such as the facades of buildings along a city street, from a relatively sparse set of photographs captured with a handheld still camera that is moved along the scene. Our work is a significant departure from previous methods for creating multi-viewpoint panoramas, which composite thin vertical strips from a video sequence captured by a translating video camera, in that the resulting panoramas are composed of relatively large regions of ordinary perspective. In our system, the only user input required beyond capturing the photographs themselves is to identify the dominant plane of the photographed scene; our system then computes a panorama automatically using Markov Random Field optimization. Users may exert additional control over the appearance of the result by drawing rough strokes that indicate various high-level goals. We demonstrate the results of our system on several scenes, including urban streets, a river bank, and a grocery store aisle.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[ADA+2004]
|
Agarwala, A.,
Dontcheva, M.,
Agrawala, M.,
Drucker, S.,
Colburn, A.,
Curless, B.,
Salesin, D. and
Cohen, M. 2004.
Interactive digital photomontage.
In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers,
ACM,
New York, NY, USA, 294–302.
We describe an interactive, computer-assisted framework for combining parts of a set of photographs into a single composite picture, a process we call "digital photomontage." Our framework makes use of two techniques primarily: graph-cut optimization, to choose good seams within the constituent images so that they can be combined as seamlessly as possible; and gradient-domain fusion, a process based on Poisson equations, to further reduce any remaining visible artifacts in the composite. Also central to the framework is a suite of interactive tools that allow the user to specify a variety of high-level image objectives, either globally across the image, or locally through a painting-style interface. Image objectives are applied independently at each pixel location and generally involve a function of the pixel values (such as "maximum contrast") drawn from that same location in the set of source images. Typically, a user applies a series of image objectives iteratively in order to create a finished composite. The power of this framework lies in its generality; we show how it can be used for a wide variety of applications, including "selective composites" (for instance, group photos in which everyone looks their best), relighting, extended depth of field, panoramic stitching, clean-plate production, stroboscopic visualization of movement, and time-lapse mosaics.
@inproceedings{ADA+2004,
author = {Agarwala, Aseem and Dontcheva, Mira and Agrawala, Maneesh and Drucker, Steven and Colburn, Alex and Curless, Brian and Salesin, David and Cohen, Michael},
title = {Interactive digital photomontage},
booktitle = {SIGGRAPH '04: ACM SIGGRAPH 2004 Papers},
year = {2004},
pages = {294--302},
location = {Los Angeles, California},
doi = {10.1145/1186562.1015718},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://grail.cs.washington.edu/projects/photomontage/},
abstract = {We describe an interactive, computer-assisted framework for combining parts of a set of photographs into a single composite picture, a process we call "digital photomontage." Our framework makes use of two techniques primarily: graph-cut optimization, to choose good seams within the constituent images so that they can be combined as seamlessly as possible; and gradient-domain fusion, a process based on Poisson equations, to further reduce any remaining visible artifacts in the composite. Also central to the framework is a suite of interactive tools that allow the user to specify a variety of high-level image objectives, either globally across the image, or locally through a painting-style interface. Image objectives are applied independently at each pixel location and generally involve a function of the pixel values (such as "maximum contrast") drawn from that same location in the set of source images. Typically, a user applies a series of image objectives iteratively in order to create a finished composite. The power of this framework lies in its generality; we show how it can be used for a wide variety of applications, including "selective composites" (for instance, group photos in which everyone looks their best), relighting, extended depth of field, panoramic stitching, clean-plate production, stroboscopic visualization of movement, and time-lapse mosaics.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[Agarwala2007]
|
Agarwala, A. 2007.
Efficient gradient-domain compositing using quadtrees.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 94.
We describe a hierarchical approach to improving the efficiency of gradient-domain compositing, a technique that constructs seamless composites by combining the gradients of images into a vector field that is then integrated to form a composite. While gradient-domain compositing is powerful and widely used, it suffers from poor scalability. Computing an n pixel composite requires solving a linear system with n variables; solving such a large system quickly overwhelms the main memory of a standard computer when performed for multi-megapixel composites, which are common in practice. In this paper we show how to perform gradient-domain compositing approximately by solving an O(p) linear system, where p is the total length of the seams between image regions in the composite; for typical cases, p is O(√n). We achieve this reduction by transforming the problem into a space where much of the solution is smooth, and then utilize the pattern of this smoothness to adaptively subdivide the problem domain using quadtrees. We demonstrate the merits of our approach by performing panoramic stitching and image region copy-and-paste in significantly reduced time and memory while achieving visually identical results.
@inproceedings{Agarwala2007,
author = {Agarwala, Aseem},
title = {Efficient gradient-domain compositing using quadtrees},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {94},
location = {San Diego, California},
doi = {10.1145/1275808.1276495},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://agarwala.org/efficient_gdc/},
abstract = {We describe a hierarchical approach to improving the efficiency of gradient-domain compositing, a technique that constructs seamless composites by combining the gradients of images into a vector field that is then integrated to form a composite. While gradient-domain compositing is powerful and widely used, it suffers from poor scalability. Computing an n pixel composite requires solving a linear system with n variables; solving such a large system quickly overwhelms the main memory of a standard computer when performed for multi-megapixel composites, which are common in practice. In this paper we show how to perform gradient-domain compositing approximately by solving an O(p) linear system, where p is the total length of the seams between image regions in the composite; for typical cases, p is O(√n). We achieve this reduction by transforming the problem into a space where much of the solution is smooth, and then utilize the pattern of this smoothness to adaptively subdivide the problem domain using quadtrees. We demonstrate the merits of our approach by performing panoramic stitching and image region copy-and-paste in significantly reduced time and memory while achieving visually identical results.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[BSF+2009]
|
Barnes, C.,
Shechtman, E.,
Finkelstein, A. and
Goldman, D.B. 2009.
PatchMatch: a randomized correspondence algorithm for structural image editing.
In SIGGRAPH '09: ACM SIGGRAPH 2009 papers,
ACM,
New York, NY, USA, 1–11.
This paper presents interactive image editing tools using a new randomized algorithm for quickly finding approximate nearest-neighbor matches between image patches. Previous research in graphics and vision has leveraged such nearest-neighbor searches to provide a variety of high-level digital image editing tools. However, the cost of computing a field of such matches for an entire image has eluded previous efforts to provide interactive performance. Our algorithm offers substantial performance improvements over the previous state of the art (20-100x), enabling its use in interactive editing tools. The key insights driving the algorithm are that some good patch matches can be found via random sampling, and that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. We offer theoretical analysis of the convergence properties of the algorithm, as well as empirical and practical evidence for its high quality and performance. This one simple algorithm forms the basis for a variety of tools – image retargeting, completion and reshuffling – that can be used together in the context of a high-level image editing application. Finally, we propose additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.
@inproceedings{BSF+2009,
author = {Barnes, Connelly and Shechtman, Eli and Finkelstein, Adam and Goldman, Dan B},
title = {PatchMatch: a randomized correspondence algorithm for structural image editing},
booktitle = {SIGGRAPH '09: ACM SIGGRAPH 2009 papers},
year = {2009},
isbn = {978-1-60558-726-4},
pages = {1--11},
location = {New Orleans, Louisiana},
doi = {10.1145/1576246.1531330},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.princeton.edu/gfx/pubs/Barnes_2009_PAR/index.php},
abstract = {This paper presents interactive image editing tools using a new randomized algorithm for quickly finding approximate nearest-neighbor matches between image patches. Previous research in graphics and vision has leveraged such nearest-neighbor searches to provide a variety of high-level digital image editing tools. However, the cost of computing a field of such matches for an entire image has eluded previous efforts to provide interactive performance. Our algorithm offers substantial performance improvements over the previous state of the art (20-100x), enabling its use in interactive editing tools. The key insights driving the algorithm are that some good patch matches can be found via random sampling, and that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. We offer theoretical analysis of the convergence properties of the algorithm, as well as empirical and practical evidence for its high quality and performance. This one simple algorithm forms the basis for a variety of tools -- image retargeting, completion and reshuffling -- that can be used together in the context of a high-level image editing application. Finally, we propose additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
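A note on [BSF+2009]: if you'd like a feel for the algorithm before reading the paper, the sketch below is my own minimal, single-scale Python/NumPy rendition of its three ingredients (random initialization, propagation, and random search) using a plain SSD patch distance. It is not the authors' implementation and is far slower than the real thing; the patch size, iteration count, and function names are arbitrary choices of mine, and the image-editing tools built on top of the nearest-neighbour field are omitted.

import numpy as np

def patch_dist(A, B, ax, ay, bx, by, p):
    # SSD between the p x p patches whose top-left corners are
    # (ay, ax) in A and (by, bx) in B.
    d = A[ay:ay + p, ax:ax + p] - B[by:by + p, bx:bx + p]
    return float(np.sum(d * d))

def patchmatch(A, B, p=7, iters=5, rng=None):
    # Approximate nearest-neighbour field from greyscale float image A to B,
    # both at least p x p.  nnf[y, x] = (by, bx) is the top-left corner of
    # the best matching patch found so far in B for the patch at (y, x) in A.
    rng = np.random.default_rng() if rng is None else rng
    ah, aw = A.shape[0] - p + 1, A.shape[1] - p + 1
    bh, bw = B.shape[0] - p + 1, B.shape[1] - p + 1

    # Random initialization.
    nnf = np.stack([rng.integers(0, bh, (ah, aw)),
                    rng.integers(0, bw, (ah, aw))], axis=-1)
    cost = np.array([[patch_dist(A, B, x, y, nnf[y, x, 1], nnf[y, x, 0], p)
                      for x in range(aw)] for y in range(ah)])

    def try_match(y, x, by, bx):
        if 0 <= by < bh and 0 <= bx < bw:
            c = patch_dist(A, B, x, y, bx, by, p)
            if c < cost[y, x]:
                cost[y, x] = c
                nnf[y, x] = (by, bx)

    for it in range(iters):
        # Alternate the scan direction between iterations.
        step = 1 if it % 2 == 0 else -1
        ys = range(ah) if step == 1 else range(ah - 1, -1, -1)
        xs = range(aw) if step == 1 else range(aw - 1, -1, -1)
        for y in ys:
            for x in xs:
                # Propagation: shift the match of the previous neighbour
                # in the scan direction and see if it is better here.
                if 0 <= y - step < ah:
                    by, bx = nnf[y - step, x]
                    try_match(y, x, by + step, bx)
                if 0 <= x - step < aw:
                    by, bx = nnf[y, x - step]
                    try_match(y, x, by, bx + step)
                # Random search around the current best match at
                # exponentially shrinking radii.
                r = max(bh, bw)
                while r >= 1:
                    try_match(y, x,
                              nnf[y, x, 0] + rng.integers(-r, r + 1),
                              nnf[y, x, 1] + rng.integers(-r, r + 1))
                    r //= 2
    return nnf, cost

Running patchmatch on two small greyscale arrays gives a dense correspondence field; the retargeting, completion, and reshuffling tools described in the paper are built on top of such fields.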
[HE2007]
|
Hays, J. and
Efros, A.A. 2007.
Scene completion using millions of photographs.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 4.
What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user. Unlike existing image completion methods, our algorithm can generate a diverse set of results for each input image and we allow users to select among them. We demonstrate the superiority of our algorithm over existing image completion approaches.
@inproceedings{HE2007,
author = {Hays, James and Efros, Alexei A.},
title = {Scene completion using millions of photographs},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {4},
location = {San Diego, California},
doi = {10.1145/1275808.1276382},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://graphics.cs.cmu.edu/projects/scene-completion/},
abstract = {What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user. Unlike existing image completion methods, our algorithm can generate a diverse set of results for each input image and we allow users to select among them. We demonstrate the superiority of our algorithm over existing image completion approaches.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LHE+2007]
|
Lalonde, J.-F.,
Hoiem, D.,
Efros, A.A.,
Rother, C.,
Winn, J. and
Criminisi, A. 2007.
Photo clip art.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 3.
We present a system for inserting new objects into existing photographs by querying a vast image-based object library, pre-computed using a publicly available Internet object database. The central goal is to shield the user from all of the arduous tasks typically involved in image compositing. The user is only asked to do two simple things: 1) pick a 3D location in the scene to place a new object; 2) select an object to insert using a hierarchical menu. We pose the problem of object insertion as a data-driven, 3D-based, context-sensitive object retrieval task. Instead of trying to manipulate the object to change its orientation, color distribution, etc. to fit the new image, we simply retrieve an object of a specified class that has all the required properties (camera pose, lighting, resolution, etc) from our large object library. We present new automatic algorithms for improving object segmentation and blending, estimating true 3D object size and orientation, and estimating scene lighting conditions. We also present an intuitive user interface that makes object insertion fast and simple even for the artistically challenged.
@inproceedings{LHE+2007,
author = {Lalonde, Jean-Fran\c{c}ois and Hoiem, Derek and Efros, Alexei A. and Rother, Carsten and Winn, John and Criminisi, Antonio},
title = {Photo clip art},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {3},
location = {San Diego, California},
doi = {10.1145/1275808.1276381},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://graphics.cs.cmu.edu/projects/photoclipart/},
abstract = {We present a system for inserting new objects into existing photographs by querying a vast image-based object library, pre-computed using a publicly available Internet object database. The central goal is to shield the user from all of the arduous tasks typically involved in image compositing. The user is only asked to do two simple things: 1) pick a 3D location in the scene to place a new object; 2) select an object to insert using a hierarchical menu. We pose the problem of object insertion as a data-driven, 3D-based, context-sensitive object retrieval task. Instead of trying to manipulate the object to change its orientation, color distribution, etc. to fit the new image, we simply retrieve an object of a specified class that has all the required properties (camera pose, lighting, resolution, etc) from our large object library. We present new automatic algorithms for improving object segmentation and blending, estimating true 3D object size and orientation, and estimating scene lighting conditions. We also present an intuitive user interface that makes object insertion fast and simple even for the artistically challenged.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[PGB2003]
|
Pérez, P.,
Gangnet, M. and
Blake, A. 2003.
Poisson image editing.
In SIGGRAPH '03: ACM SIGGRAPH 2003 Papers,
ACM,
New York, NY, USA, 313–318.
Using generic interpolation machinery based on solving Poisson equations, a variety of novel tools are introduced for seamless editing of image regions. The first set of tools permits the seamless importation of both opaque and transparent source image regions into a destination region. The second set is based on similar mathematical ideas and allows the user to modify the appearance of the image seamlessly, within a selected region. These changes can be arranged to affect the texture, the illumination, and the color of objects lying in the region, or to make tileable a rectangular selection.
@inproceedings{PGB2003,
author = {P\'{e}rez, Patrick and Gangnet, Michel and Blake, Andrew},
title = {Poisson image editing},
booktitle = {SIGGRAPH '03: ACM SIGGRAPH 2003 Papers},
year = {2003},
isbn = {1-58113-709-5},
pages = {313--318},
location = {San Diego, California},
doi = {10.1145/1201775.882269},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.irisa.fr/vista/Publis/Auteur/Patrick.Perez.english.html},
abstract = {Using generic interpolation machinery based on solving Poisson equations, a variety of novel tools are introduced for seamless editing of image regions. The first set of tools permits the seamless importation of both opaque and transparent source image regions into a destination region. The second set is based on similar mathematical ideas and allows the user to modify the appearance of the image seamlessly, within a selected region. These changes can be arranged to affect the texture, the illumination, and the color of objects lying in the region, or to make tileable a rectangular selection.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
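A note on [PGB2003]: the heart of the paper is a discrete Poisson equation solved over the selected region, with the source image's gradients as the guidance field and the target's values as Dirichlet boundary conditions. The sketch below is a minimal greyscale seamless-cloning example I wrote with NumPy/SciPy to make that concrete; it is not the authors' code, and the function and variable names are my own.

import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def poisson_clone(src, dst, mask):
    # Seamless cloning of a greyscale float `src` into `dst` over the boolean
    # `mask` (all the same shape): solve the discrete Poisson equation
    # Laplacian(f) = Laplacian(src) inside the mask, with f = dst outside,
    # i.e. the source's gradients are the guidance field and the target
    # supplies the Dirichlet boundary values.
    h, w = dst.shape
    idx = -np.ones((h, w), dtype=int)
    ys, xs = np.nonzero(mask)
    idx[ys, xs] = np.arange(len(ys))
    n = len(ys)

    A = lil_matrix((n, n))
    b = np.zeros(n)
    for k, (y, x) in enumerate(zip(ys, xs)):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            qy, qx = y + dy, x + dx
            if not (0 <= qy < h and 0 <= qx < w):
                continue
            A[k, k] += 1.0
            b[k] += src[y, x] - src[qy, qx]   # guidance: source gradient
            if mask[qy, qx]:
                A[k, idx[qy, qx]] = -1.0      # unknown neighbour
            else:
                b[k] += dst[qy, qx]           # boundary value from target

    f = spsolve(A.tocsr(), b)
    out = dst.astype(float)
    out[ys, xs] = f
    return out

For colour images you solve the same system once per channel; the paper's other tools (mixed gradients, texture flattening, and so on) amount to different choices of guidance field in the same framework.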
[SYJ+2005]
|
Sun, J.,
Yuan, L.,
Jia, J. and
Shum, H.-Y. 2005.
Image completion with structure propagation.
In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers,
ACM,
New York, NY, USA, 861–868.
In this paper, we introduce a novel approach to image completion, which we call structure propagation. In our system, the user manually specifies important missing structure information by extending a few curves or line segments from the known to the unknown regions. Our approach synthesizes image patches along these user-specified curves in the unknown region using patches selected around the curves in the known region. Structure propagation is formulated as a global optimization problem by enforcing structure and consistency constraints. If only a single curve is specified, structure propagation is solved using Dynamic Programming. When multiple intersecting curves are specified, we adopt the Belief Propagation algorithm to find the optimal patches. After completing structure propagation, we fill in the remaining unknown regions using patch-based texture synthesis. We show that our approach works well on a number of examples that are challenging to state-of-the-art techniques.
@inproceedings{SYJ+2005,
author = {Sun, Jian and Yuan, Lu and Jia, Jiaya and Shum, Heung-Yeung},
title = {Image completion with structure propagation},
booktitle = {SIGGRAPH '05: ACM SIGGRAPH 2005 Papers},
year = {2005},
pages = {861--868},
location = {Los Angeles, California},
doi = {10.1145/1186822.1073274},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://research.microsoft.com/apps/pubs/default.aspx?id=69211},
abstract = {In this paper, we introduce a novel approach to image completion, which we call structure propagation. In our system, the user manually specifies important missing structure information by extending a few curves or line segments from the known to the unknown regions. Our approach synthesizes image patches along these user-specified curves in the unknown region using patches selected around the curves in the known region. Structure propagation is formulated as a global optimization problem by enforcing structure and consistency constraints. If only a single curve is specified, structure propagation is solved using Dynamic Programming. When multiple intersecting curves are specified, we adopt the Belief Propagation algorithm to find the optimal patches. After completing structure propagation, we fill in the remaining unknown regions using patch-based texture synthesis. We show that our approach works well on a number of examples that are challenging to state-of-the-art techniques.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
Key | Citation | Details |
[FSH+2006]
|
Fergus, R.,
Singh, B.,
Hertzmann, A.,
Roweis, S.T. and
Freeman, W.T. 2006.
Removing camera shake from a single photograph.
In SIGGRAPH '06: ACM SIGGRAPH 2006 Papers,
ACM,
New York, NY, USA, 787–794.
Camera shake during exposure leads to objectionable image blur and ruins many photographs. Conventional blind deconvolution methods typically assume frequency-domain constraints on images, or overly simplified parametric forms for the motion path during camera shake. Real camera motions can follow convoluted paths, and a spatial domain prior can better maintain visually salient image characteristics. We introduce a method to remove the effects of camera shake from seriously blurred images. The method assumes a uniform camera blur over the image and negligible in-plane camera rotation. In order to estimate the blur from the camera shake, the user must specify an image region without saturation effects. We show results for a variety of digital photographs taken from personal photo collections.
@inproceedings{FSH+2006,
author = {Fergus, Rob and Singh, Barun and Hertzmann, Aaron and Roweis, Sam T. and Freeman, William T.},
title = {Removing camera shake from a single photograph},
booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Papers},
year = {2006},
isbn = {1-59593-364-6},
pages = {787--794},
location = {Boston, Massachusetts},
doi = {10.1145/1179352.1141956},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.nyu.edu/~fergus/research/deblur.html},
abstract = {Camera shake during exposure leads to objectionable image blur and ruins many photographs. Conventional blind deconvolution methods typically assume frequency-domain constraints on images, or overly simplified parametric forms for the motion path during camera shake. Real camera motions can follow convoluted paths, and a spatial domain prior can better maintain visually salient image characteristics. We introduce a method to remove the effects of camera shake from seriously blurred images. The method assumes a uniform camera blur over the image and negligible in-plane camera rotation. In order to estimate the blur from the camera shake, the user must specify an image region without saturation effects. We show results for a variety of digital photographs taken from personal photo collections.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
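A note on [FSH+2006]: the paper's contribution is the blind estimation of the blur kernel, which is well beyond a code snippet. Once a kernel is known, though, the sharp image is typically recovered with a standard non-blind deconvolution such as Richardson-Lucy; the sketch below shows only that step, in NumPy/SciPy, so you can see what an estimated kernel is ultimately used for. The iteration count and names are my own choices.

import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(blurred, psf, iters=30, eps=1e-8):
    # Standard non-blind Richardson-Lucy deconvolution: given a non-negative
    # blurred image and a known blur kernel (PSF), iteratively refine an
    # estimate x via  x <- x * (flip(K) * (b / (K * x))),  where * denotes
    # convolution.
    x = np.full_like(blurred, blurred.mean(), dtype=float)
    k = psf / psf.sum()
    k_flip = k[::-1, ::-1]
    for _ in range(iters):
        reblurred = fftconvolve(x, k, mode='same')
        ratio = blurred / np.maximum(reblurred, eps)
        x *= fftconvolve(ratio, k_flip, mode='same')
    return x

To test it, blur a sharp image with a small normalized kernel using fftconvolve and feed the result back in; with the true kernel, a few dozen iterations recover most of the detail, which mirrors the final non-blind step such methods perform once the kernel has been estimated.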
[Fattal2008]
|
Fattal, R. 2008.
Single image dehazing.
In SIGGRAPH '08: ACM SIGGRAPH 2008 papers,
ACM,
New York, NY, USA, 1–9.
In this paper we present a new method for estimating the optical transmission in hazy scenes given a single input image. Based on this estimation, the scattered light is eliminated to increase scene visibility and recover haze-free scene contrasts. In this new approach we formulate a refined image formation model that accounts for surface shading in addition to the transmission function. This allows us to resolve ambiguities in the data by searching for a solution in which the resulting shading and transmission functions are locally statistically uncorrelated. A similar principle is used to estimate the color of the haze. Results demonstrate the new method abilities to remove the haze layer as well as provide a reliable transmission estimate which can be used for additional applications such as image refocusing and novel view synthesis.
@inproceedings{Fattal2008,
author = {Fattal, Raanan},
title = {Single image dehazing},
booktitle = {SIGGRAPH '08: ACM SIGGRAPH 2008 papers},
year = {2008},
pages = {1--9},
location = {Los Angeles, California},
doi = {10.1145/1399504.1360671},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.huji.ac.il/~raananf/projects/defog/index.html},
abstract = {In this paper we present a new method for estimating the optical transmission in hazy scenes given a single input image. Based on this estimation, the scattered light is eliminated to increase scene visibility and recover haze-free scene contrasts. In this new approach we formulate a refined image formation model that accounts for surface shading in addition to the transmission function. This allows us to resolve ambiguities in the data by searching for a solution in which the resulting shading and transmission functions are locally statistically uncorrelated. A similar principle is used to estimate the color of the haze. Results demonstrate the new method abilities to remove the haze layer as well as provide a reliable transmission estimate which can be used for additional applications such as image refocusing and novel view synthesis.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LWD+2009]
|
Levin, A.,
Weiss, Y.,
Durand, F. and
Freeman, W.T. 2009.
Understanding and evaluating blind deconvolution algorithms.
In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 1964–1971.
Blind deconvolution is the recovery of a sharp version of a blurred image when the blur kernel is unknown. Recent algorithms have afforded dramatic progress, yet many aspects of the problem remain challenging and hard to understand. The goal of this paper is to analyze and evaluate recent blind deconvolution algorithms both theoretically and experimentally. We explain the previously reported failure of the naive MAP approach by demonstrating that it mostly favors no-blur explanations. On the other hand we show that since the kernel size is often smaller than the image size a MAP estimation of the kernel alone can be well constrained and accurately recover the true blur. The plethora of recent deconvolution techniques makes an experimental evaluation on ground-truth data important. We have collected blur data with ground truth and compared recent algorithms under equal settings. Additionally, our data demonstrates that the shift-invariant blur assumption made by most algorithms is often violated.
@inproceedings{LWD+2009,
author = {Levin, Anat and Weiss, Yair and Durand, Fr\'{e}do and Freeman, William T.},
title = {Understanding and evaluating blind deconvolution algorithms},
booktitle = {IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
year = {2009},
month = {June},
pages = {1964--1971},
doi = {10.1109/CVPRW.2009.5206815},
project = {http://www.wisdom.weizmann.ac.il/~levina/},
abstract = {Blind deconvolution is the recovery of a sharp version of a blurred image when the blur kernel is unknown. Recent algorithms have afforded dramatic progress, yet many aspects of the problem remain challenging and hard to understand. The goal of this paper is to analyze and evaluate recent blind deconvolution algorithms both theoretically and experimentally. We explain the previously reported failure of the naive MAP approach by demonstrating that it mostly favors no-blur explanations. On the other hand we show that since the kernel size is often smaller than the image size a MAP estimation of the kernel alone can be well constrained and accurately recover the true blur. The plethora of recent deconvolution techniques makes an experimental evaluation on ground-truth data important. We have collected blur data with ground truth and compared recent algorithms under equal settings. Additionally, our data demonstrates that the shift-invariant blur assumption made by most algorithms is often violated.},
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LSK+2008]
|
Liu, C.,
Szeliski, R.,
Kang, S.B.,
Zitnick, C.L. and
Freeman, W.T. 2008.
Automatic Estimation and Removal of Noise from a Single Image.
IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 2 (February), 299–314.
Most existing image denoising work assumes additive white Gaussian noise (AWGN) and removes the noise independent of the RGB channels. Therefore, the current approaches are not fully automatic and cannot effectively remove color noise produced by CCD digital camera. In this paper, we propose a framework for two tasks, automatically estimating and removing color noise from a single image using piecewise smooth image models. We estimate noise level function (NLF), a continuous function of noise level to image brightness, as the upper bound of the real noise level by fitting the lower envelope to the standard deviations of the per-segment image variance. In the denoising module, the chrominance of color noise is significantly removed by projecting pixel values to a line fit to the RGB space in each segment. Then, a Gaussian conditional random field (GCRF) is constructed to obtain the underlying clean image from the noisy input. Extensive experiments are conducted to test the proposed algorithms, which are proven to outperform the state-of-the-art denoising algorithms with promising and convincing results.
@article{LSK+2008,
author = {Liu, Ce and Szeliski, Richard and Kang, Sing Bing and Zitnick, C. Lawrence and Freeman, William T.},
title = {Automatic Estimation and Removal of Noise from a Single Image},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume = {30},
number = {2},
pages = {299--314},
year = {2008},
month = {February},
doi = {10.1109/TPAMI.2007.1176},
project = {http://research.microsoft.com/apps/pubs/default.aspx?id=70384},
abstract = {Most existing image denoising work assumes additive white Gaussian noise (AWGN) and removes the noise independent of the RGB channels. Therefore, the current approaches are not fully automatic and cannot effectively remove color noise produced by CCD digital camera. In this paper, we propose a framework for two tasks, automatically estimating and removing color noise from a single image using piecewise smooth image models. We estimate noise level function (NLF), a continuous function of noise level to image brightness, as the upper bound of the real noise level by fitting the lower envelope to the standard deviations of the per-segment image variance. In the denoising module, the chrominance of color noise is significantly removed by projecting pixel values to a line fit to the RGB space in each segment. Then, a Gaussian conditional random field (GCRF) is constructed to obtain the underlying clean image from the noisy input. Extensive experiments are conducted to test the proposed algorithms, which are proven to outperform the state-of-the-art denoising algorithms with promising and convincing results.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[YSQ+2007]
|
Yuan, L.,
Sun, J.,
Quan, L. and
Shum, H.-Y. 2007.
Image deblurring with blurred/noisy image pairs.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 1.
Taking satisfactory photos under dim lighting conditions using a hand-held camera is challenging. If the camera is set to a long exposure time, the image is blurred due to camera shake. On the other hand, the image is dark and noisy if it is taken with a short exposure time but with a high camera gain. By combining information extracted from both blurred and noisy images, however, we show in this paper how to produce a high quality image that cannot be obtained by simply denoising the noisy image, or deblurring the blurred image alone. Our approach is image deblurring with the help of the noisy image. First, both images are used to estimate an accurate blur kernel, which otherwise is difficult to obtain from a single blurred image. Second, and again using both images, a residual deconvolution is proposed to significantly reduce ringing artifacts inherent to image deconvolution. Third, the remaining ringing artifacts in smooth image regions are further suppressed by a gain-controlled deconvolution process. We demonstrate the effectiveness of our approach using a number of indoor and outdoor images taken by off-the-shelf hand-held cameras in poor lighting environments.
@inproceedings{YSQ+2007,
author = {Yuan, Lu and Sun, Jian and Quan, Long and Shum, Heung-Yeung},
title = {Image deblurring with blurred/noisy image pairs},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {1},
location = {San Diego, California},
doi = {10.1145/1275808.1276379},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://research.microsoft.com/en-us/um/people/jiansun/},
abstract = {Taking satisfactory photos under dim lighting conditions using a hand-held camera is challenging. If the camera is set to a long exposure time, the image is blurred due to camera shake. On the other hand, the image is dark and noisy if it is taken with a short exposure time but with a high camera gain. By combining information extracted from both blurred and noisy images, however, we show in this paper how to produce a high quality image that cannot be obtained by simply denoising the noisy image, or deblurring the blurred image alone. Our approach is image deblurring with the help of the noisy image. First, both images are used to estimate an accurate blur kernel, which otherwise is difficult to obtain from a single blurred image. Second, and again using both images, a residual deconvolution is proposed to significantly reduce ringing artifacts inherent to image deconvolution. Third, the remaining ringing artifacts in smooth image regions are further suppressed by a gain-controlled deconvolution process. We demonstrate the effectiveness of our approach using a number of indoor and outdoor images taken by off-the-shelf hand-held cameras in poor lighting environments.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
Key | Citation | Details |
[BPD2006]
|
Bae, S.,
Paris, S. and
Durand, F. 2006.
Two-scale tone management for photographic look.
In SIGGRAPH '06: ACM SIGGRAPH 2006 Papers,
ACM,
New York, NY, USA, 637–645.
We introduce a new approach to tone management for photographs. Whereas traditional tone-mapping operators target a neutral and faithful rendition of the input image, we explore pictorial looks by controlling visual qualities such as the tonal balance and the amount of detail. Our method is based on a two-scale non-linear decomposition of an image. We modify the different layers based on their histograms and introduce a technique that controls the spatial variation of detail. We introduce a Poisson correction that prevents potential gradient reversal and preserves detail. In addition to directly controlling the parameters, the user can transfer the look of a model photograph to the picture being edited.
@inproceedings{BPD2006,
author = {Bae, Soonmin and Paris, Sylvain and Durand, Fr\'{e}do},
title = {Two-scale tone management for photographic look},
booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Papers},
year = {2006},
isbn = {1-59593-364-6},
pages = {637--645},
location = {Boston, Massachusetts},
doi = {10.1145/1179352.1141935},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://people.csail.mit.edu/soonmin/photolook/},
abstract = {We introduce a new approach to tone management for photographs. Whereas traditional tone-mapping operators target a neutral and faithful rendition of the input image, we explore pictorial looks by controlling visual qualities such as the tonal balance and the amount of detail. Our method is based on a two-scale non-linear decomposition of an image. We modify the different layers based on their histograms and introduce a technique that controls the spatial variation of detail. We introduce a Poisson correction that prevents potential gradient reversal and preserves detail. In addition to directly controlling the parameters, the user can transfer the look of a model photograph to the picture being edited.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[CSG+2006]
|
Cohen-Or, D.,
Sorkine, O.,
Gal, R.,
Leyvand, T. and
Xu, Y.-Q. 2006.
Color harmonization.
In SIGGRAPH '06: ACM SIGGRAPH 2006 Papers,
ACM,
New York, NY, USA, 624–630.
Harmonic colors are sets of colors that are aesthetically pleasing in terms of human visual perception. In this paper, we present a method that enhances the harmony among the colors of a given photograph or of a general image, while remaining faithful, as much as possible, to the original colors. Given a color image, our method finds the best harmonic scheme for the image colors. It then allows a graceful shifting of hue values so as to fit the harmonic scheme while considering spatial coherence among colors of neighboring pixels using an optimization technique. The results demonstrate that our method is capable of automatically enhancing the color "look-and-feel" of an ordinary image. In particular, we show the results of harmonizing the background image to accommodate the colors of a foreground image, or the foreground with respect to the background, in a cut-and-paste setting. Our color harmonization technique proves to be useful in adjusting the colors of an image composed of several parts taken from different sources.
@inproceedings{CSG+2006,
author = {Cohen-Or, Daniel and Sorkine, Olga and Gal, Ran and Leyvand, Tommer and Xu, Ying-Qing},
title = {Color harmonization},
booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Papers},
year = {2006},
isbn = {1-59593-364-6},
pages = {624--630},
location = {Boston, Massachusetts},
doi = {10.1145/1179352.1141933},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.nyu.edu/~sorkine/ProjectPages/Harmonization/},
abstract = {Harmonic colors are sets of colors that are aesthetically pleasing in terms of human visual perception. In this paper, we present a method that enhances the harmony among the colors of a given photograph or of a general image, while remaining faithful, as much as possible, to the original colors. Given a color image, our method finds the best harmonic scheme for the image colors. It then allows a graceful shifting of hue values so as to fit the harmonic scheme while considering spatial coherence among colors of neighboring pixels using an optimization technique. The results demonstrate that our method is capable of automatically enhancing the color "look-and-feel" of an ordinary image. In particular, we show the results of harmonizing the background image to accommodate the colors of a foreground image, or the foreground with respect to the background, in a cut-and-paste setting. Our color harmonization technique proves to be useful in adjusting the colors of an image composed of several parts taken from different sources.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[DD2002]
|
Durand, F. and
Dorsey, J. 2002.
Fast bilateral filtering for the display of high-dynamic-range images.
In SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques,
ACM,
New York, NY, USA, 257–266.
We present a new technique for the display of high-dynamic-range images, which reduces the contrast while preserving detail. It is based on a two-scale decomposition of the image into a base layer, encoding large-scale variations, and a detail layer. Only the base layer has its contrast reduced, thereby preserving detail. The base layer is obtained using an edge-preserving filter called the bilateral filter. This is a non-linear filter, where the weight of each pixel is computed using a Gaussian in the spatial domain multiplied by an influence function in the intensity domain that decreases the weight of pixels with large intensity differences. We express bilateral filtering in the framework of robust statistics and show how it relates to anisotropic diffusion. We then accelerate bilateral filtering by using a piecewise-linear approximation in the intensity domain and appropriate subsampling. This results in a speed-up of two orders of magnitude. The method is fast and requires no parameter setting.
@inproceedings{DD2002,
author = {Durand, Fr\'{e}do and Dorsey, Julie},
title = {Fast bilateral filtering for the display of high-dynamic-range images},
booktitle = {SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques},
year = {2002},
isbn = {1-58113-521-1},
pages = {257--266},
location = {San Antonio, Texas},
doi = {10.1145/566570.566574},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://people.csail.mit.edu/fredo/PUBLI/Siggraph2002/},
abstract = {We present a new technique for the display of high-dynamic-range images, which reduces the contrast while preserving detail. It is based on a two-scale decomposition of the image into a base layer, encoding large-scale variations, and a detail layer. Only the base layer has its contrast reduced, thereby preserving detail. The base layer is obtained using an edge-preserving filter called the bilateral filter. This is a non-linear filter, where the weight of each pixel is computed using a Gaussian in the spatial domain multiplied by an influence function in the intensity domain that decreases the weight of pixels with large intensity differences. We express bilateral filtering in the framework of robust statistics and show how it relates to anisotropic diffusion. We then accelerate bilateral filtering by using a piecewise-linear approximation in the intensity domain and appropriate subsampling. This results in a speed-up of two orders of magnitude. The method is fast and requires no parameter setting.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
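A note on [DD2002]: the sketch below is a brute-force NumPy bilateral filter plus the paper's base/detail split of the log luminance for tone mapping. It deliberately skips the fast piecewise-linear approximation that is the paper's main contribution, and the compression factor is a fixed constant of my choosing rather than the paper's target-contrast formula, so treat it only as an illustration of the idea.

import numpy as np

def bilateral(img, sigma_s=4.0, sigma_r=0.4, radius=None):
    # Brute-force bilateral filter of a greyscale float image: each output
    # pixel is a normalized sum of its neighbours, weighted by a spatial
    # Gaussian (sigma_s, in pixels) times a range Gaussian (sigma_r, in
    # intensity).  This is the direct O(radius^2) form, not the paper's
    # fast piecewise-linear approximation.
    radius = int(3 * sigma_s) if radius is None else radius
    h, w = img.shape
    padded = np.pad(img, radius, mode='edge')
    acc = np.zeros((h, w))
    norm = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            wgt = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2)
                         - (shifted - img) ** 2 / (2 * sigma_r ** 2))
            acc += wgt * shifted
            norm += wgt
    return acc / norm

def tone_map(luminance, compression=0.4):
    # Two-scale contrast reduction in the spirit of Durand and Dorsey:
    # split the log luminance into a bilateral-filtered base and a detail
    # layer, compress only the base, and anchor its maximum at zero so the
    # highlights stay put.  (A fixed compression factor is used here for
    # simplicity, rather than the paper's target-contrast formula.)
    log_l = np.log10(np.maximum(luminance, 1e-6))
    base = bilateral(log_l)
    detail = log_l - base
    return 10.0 ** (compression * (base - base.max()) + detail)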
[HMP+2008]
|
Hsu, E.,
Mertens, T.,
Paris, S.,
Avidan, S. and
Durand, F. 2008.
Light mixture estimation for spatially varying white balance.
In SIGGRAPH '08: ACM SIGGRAPH 2008 papers,
ACM,
New York, NY, USA, 1–7.
White balance is a crucial step in the photographic pipeline. It ensures the proper rendition of images by eliminating color casts due to differing illuminants. Digital cameras and editing programs provide white balance tools that assume a single type of light per image, such as daylight. However, many photos are taken under mixed lighting. We propose a white balance technique for scenes with two light types that are specified by the user. This covers many typical situations involving indoor/outdoor or flash/ambient light mixtures. Since we work from a single image, the problem is highly underconstrained. Our method recovers a set of dominant material colors which allows us to estimate the local intensity mixture of the two light types. Using this mixture, we can neutralize the light colors and render visually pleasing images. Our method can also be used to achieve post-exposure relighting effects.
@inproceedings{HMP+2008,
author = {Hsu, Eugene and Mertens, Tom and Paris, Sylvain and Avidan, Shai and Durand, Fr\'{e}do},
title = {Light mixture estimation for spatially varying white balance},
booktitle = {SIGGRAPH '08: ACM SIGGRAPH 2008 papers},
year = {2008},
pages = {1--7},
location = {Los Angeles, California},
doi = {10.1145/1399504.1360669},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://people.csail.mit.edu/ehsu/work/sig08lme/},
abstract = {White balance is a crucial step in the photographic pipeline. It ensures the proper rendition of images by eliminating color casts due to differing illuminants. Digital cameras and editing programs provide white balance tools that assume a single type of light per image, such as daylight. However, many photos are taken under mixed lighting. We propose a white balance technique for scenes with two light types that are specified by the user. This covers many typical situations involving indoor/outdoor or flash/ambient light mixtures. Since we work from a single image, the problem is highly underconstrained. Our method recovers a set of dominant material colors which allows us to estimate the local intensity mixture of the two light types. Using this mixture, we can neutralize the light colors and render visually pleasing images. Our method can also be used to achieve post-exposure relighting effects.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LCT+2005]
|
Ledda, P.,
Chalmers, A.,
Troscianko, T. and
Seetzen, H. 2005.
Evaluation of tone mapping operators using a High Dynamic Range display.
In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers,
ACM,
New York, NY, USA, 640–648.
Tone mapping operators are designed to reproduce visibility and the overall impression of brightness, contrast and color of the real world onto limited dynamic range displays and printers. Although many tone mapping operators have been published in recent years, no thorough psychophysical experiments have yet been undertaken to compare such operators against the real scenes they are purporting to depict. In this paper, we present the results of a series of psychophysical experiments to validate six frequently used tone mapping operators against linearly mapped High Dynamic Range (HDR) scenes displayed on a novel HDR device. Individual operators address the tone mapping issue using a variety of approaches and the goals of these techniques are often quite different from one another. Therefore, the purpose of this investigation was not simply to determine which is the "best" algorithm, but more generally to propose an experimental methodology to validate such operators and to determine the participants' impressions of the images produced compared to what is visible on a high contrast ratio display.
@inproceedings{LCT+2005,
author = {Ledda, Patrick and Chalmers, Alan and Troscianko, Tom and Seetzen, Helge},
title = {Evaluation of tone mapping operators using a High Dynamic Range display},
booktitle = {SIGGRAPH '05: ACM SIGGRAPH 2005 Papers},
year = {2005},
pages = {640--648},
location = {Los Angeles, California},
doi = {10.1145/1186822.1073242},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.bris.ac.uk/Publications/pub_master.jsp?id=2000255},
abstract = {Tone mapping operators are designed to reproduce visibility and the overall impression of brightness, contrast and color of the real world onto limited dynamic range displays and printers. Although many tone mapping operators have been published in recent years, no thorough psychophysical experiments have yet been undertaken to compare such operators against the real scenes they are purporting to depict. In this paper, we present the results of a series of psychophysical experiments to validate six frequently used tone mapping operators against linearly mapped High Dynamic Range (HDR) scenes displayed on a novel HDR device. Individual operators address the tone mapping issue using a variety of approaches and the goals of these techniques are often quite different from one another. Therefore, the purpose of this investigation was not simply to determine which is the "best" algorithm, but more generally to propose an experimental methodology to validate such operators and to determine the participants' impressions of the images produced compared to what is visible on a high contrast ratio display.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LLW2004]
|
Levin, A.,
Lischinski, D. and
Weiss, Y. 2004.
Colorization using optimization.
In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers,
ACM,
New York, NY, USA, 689–694.
Colorization is a computer-assisted process of adding color to a monochrome image or movie. The process typically involves segmenting images into regions and tracking these regions across image sequences. Neither of these tasks can be performed reliably in practice; consequently, colorization requires considerable user intervention and remains a tedious, time-consuming, and expensive task.In this paper we present a simple colorization method that requires neither precise image segmentation, nor accurate region tracking. Our method is based on a simple premise; neighboring pixels in space-time that have similar intensities should have similar colors. We formalize this premise using a quadratic cost function and obtain an optimization problem that can be solved efficiently using standard techniques. In our approach an artist only needs to annotate the image with a few color scribbles, and the indicated colors are automatically propagated in both space and time to produce a fully colorized image or sequence. We demonstrate that high quality colorizations of stills and movie clips may be obtained from a relatively modest amount of user input.
@inproceedings{LLW2004,
author = {Levin, Anat and Lischinski, Dani and Weiss, Yair},
title = {Colorization using optimization},
booktitle = {SIGGRAPH '04: ACM SIGGRAPH 2004 Papers},
year = {2004},
pages = {689--694},
location = {Los Angeles, California},
doi = {10.1145/1186562.1015780},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.huji.ac.il/~yweiss/Colorization/},
abstract = {Colorization is a computer-assisted process of adding color to a monochrome image or movie. The process typically involves segmenting images into regions and tracking these regions across image sequences. Neither of these tasks can be performed reliably in practice; consequently, colorization requires considerable user intervention and remains a tedious, time-consuming, and expensive task.In this paper we present a simple colorization method that requires neither precise image segmentation, nor accurate region tracking. Our method is based on a simple premise; neighboring pixels in space-time that have similar intensities should have similar colors. We formalize this premise using a quadratic cost function and obtain an optimization problem that can be solved efficiently using standard techniques. In our approach an artist only needs to annotate the image with a few color scribbles, and the indicated colors are automatically propagated in both space and time to produce a fully colorized image or sequence. We demonstrate that high quality colorizations of stills and movie clips may be obtained from a relatively modest amount of user input.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
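A note on [LLW2004]: the optimization boils down to one sparse linear solve per chrominance channel, with each unconstrained pixel tied to a weighted average of its window neighbours and scribbled pixels held fixed. The sketch below is my own small NumPy/SciPy version using the paper's Gaussian intensity-similarity weighting; the window size, constants, and names are arbitrary choices, and the naive loop is only practical for small images.

import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def colorize_channel(Y, scribble_vals, scribble_mask, radius=1):
    # Propagate one chrominance channel over the greyscale image Y from user
    # scribbles: scribble_mask marks constrained pixels, scribble_vals holds
    # their chroma.  Every other pixel is constrained to equal a weighted
    # average of its window neighbours, with weights from intensity
    # similarity, giving one sparse linear system to solve.
    h, w = Y.shape
    A = lil_matrix((h * w, h * w))
    b = np.zeros(h * w)
    for y in range(h):
        for x in range(w):
            k = y * w + x
            if scribble_mask[y, x]:
                A[k, k] = 1.0
                b[k] = scribble_vals[y, x]
                continue
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            var = max(Y[y0:y1, x0:x1].var(), 1e-4)
            neigh = []
            for ny in range(y0, y1):
                for nx in range(x0, x1):
                    if (ny, nx) != (y, x):
                        wgt = np.exp(-(Y[y, x] - Y[ny, nx]) ** 2 / (2 * var))
                        neigh.append((ny * w + nx, wgt))
            wsum = sum(wgt for _, wgt in neigh)
            A[k, k] = 1.0
            for j, wgt in neigh:
                A[k, j] = -wgt / wsum   # row: U(p) - sum_q w_pq U(q) = 0
    return spsolve(A.tocsr(), b).reshape(h, w)

Run it once for each chrominance channel (say U and V of a YUV image), then convert (Y, U, V) back to RGB; anything larger than a small image calls for a faster sparse solver than this straightforward loop.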
[LSA2005]
|
Li, Y.,
Sharan, L. and
Adelson, E.H. 2005.
Compressing and companding high dynamic range images with subband architectures.
In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers,
ACM,
New York, NY, USA, 836–844.
High dynamic range (HDR) imaging is an area of increasing importance, but most display devices still have limited dynamic range (LDR). Various techniques have been proposed for compressing the dynamic range while retaining important visual information. Multi-scale image processing techniques, which are widely used for many image processing tasks, have a reputation of causing halo artifacts when used for range compression. However, we demonstrate that they can work when properly implemented. We use a symmetrical analysis-synthesis filter bank, and apply local gain control to the subbands. We also show that the technique can be adapted for the related problem of "companding", in which an HDR image is converted to an LDR image, and later expanded back to high dynamic range.
@inproceedings{LSA2005,
author = {Li, Yuanzhen and Sharan, Lavanya and Adelson, Edward H.},
title = {Compressing and companding high dynamic range images with subband architectures},
booktitle = {SIGGRAPH '05: ACM SIGGRAPH 2005 Papers},
year = {2005},
pages = {836--844},
location = {Los Angeles, California},
doi = {10.1145/1186822.1073271},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://publications.csail.mit.edu/abstracts/abstracts05/adelson/adelson.html},
abstract = {High dynamic range (HDR) imaging is an area of increasing importance, but most display devices still have limited dynamic range (LDR). Various techniques have been proposed for compressing the dynamic range while retaining important visual information. Multi-scale image processing techniques, which are widely used for many image processing tasks, have a reputation of causing halo artifacts when used for range compression. However, we demonstrate that they can work when properly implemented. We use a symmetrical analysis-synthesis filter bank, and apply local gain control to the subbands. We also show that the technique can be adapted for the related problem of "companding", in which an HDR image is converted to an LDR image, and later expanded back to high dynamic range.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[TAH+2007]
|
Talvala, E.-V.,
Adams, A.,
Horowitz, M. and
Levoy, M. 2007.
Veiling glare in high dynamic range imaging.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 37.
The ability of a camera to record a high dynamic range image, whether by taking one snapshot or a sequence, is limited by the presence of veiling glare - the tendency of bright objects in the scene to reduce the contrast everywhere within the field of view. Veiling glare is a global illumination effect that arises from multiple scattering of light inside the camera's body and lens optics. By measuring separately the direct and indirect components of the intra-camera light transport, one can increase the maximum dynamic range a particular camera is capable of recording. In this paper, we quantify the presence of veiling glare and related optical artifacts for several types of digital cameras, and we describe two methods for removing them: deconvolution by a measured glare spread function, and a novel direct-indirect separation of the lens transport using a structured occlusion mask. In the second method, we selectively block the light that contributes to veiling glare, thereby attaining significantly higher signal-to-noise ratios than with deconvolution. Finally, we demonstrate our separation method for several combinations of cameras and realistic scenes.
@inproceedings{TAH+2007,
author = {Talvala, Eino-Ville and Adams, Andrew and Horowitz, Mark and Levoy, Marc},
title = {Veiling glare in high dynamic range imaging},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {37},
location = {San Diego, California},
doi = {10.1145/1275808.1276424},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://graphics.stanford.edu/papers/glare_removal/},
abstract = {The ability of a camera to record a high dynamic range image, whether by taking one snapshot or a sequence, is limited by the presence of veiling glare - the tendency of bright objects in the scene to reduce the contrast everywhere within the field of view. Veiling glare is a global illumination effect that arises from multiple scattering of light inside the camera's body and lens optics. By measuring separately the direct and indirect components of the intra-camera light transport, one can increase the maximum dynamic range a particular camera is capable of recording. In this paper, we quantify the presence of veiling glare and related optical artifacts for several types of digital cameras, and we describe two methods for removing them: deconvolution by a measured glare spread function, and a novel direct-indirect separation of the lens transport using a structured occlusion mask. In the second method, we selectively block the light that contributes to veiling glare, thereby attaining significantly higher signal-to-noise ratios than with deconvolution. Finally, we demonstrate our separation method for several combinations of cameras and realistic scenes.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
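The first of the paper's two removal methods is deconvolution by a measured glare spread function (GSF). As a rough illustration only (not the authors' pipeline, and ignoring their noise analysis), a frequency-domain Wiener-style deconvolution with an assumed, centred, unit-sum glare kernel might look like this; reg is a made-up regularization constant.

import numpy as np

def deconvolve_glare(image, gsf, reg=1e-3):
    # Wiener-style division in the Fourier domain by the glare spread
    # function (gsf), assumed the same size as the image and summing to 1.
    F_img = np.fft.fft2(image)
    F_gsf = np.fft.fft2(np.fft.ifftshift(gsf))
    wiener = np.conj(F_gsf) / (np.abs(F_gsf) ** 2 + reg)
    return np.real(np.fft.ifft2(F_img * wiener))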
Key | Citation | Details |
[GSM+2007]
|
Green, P.,
Sun, W.,
Matusik, W. and
Durand, F. 2007.
Multi-aperture photography.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 68.
The emergent field of computational photography is proving that, by coupling generalized imaging optics with software processing, the quality and flexibility of imaging systems can be increased. In this paper, we capture and manipulate multiple images of a scene taken with different aperture settings (f-numbers). We design and implement a prototype optical system and associated algorithms to capture four images of the scene in a single exposure, each taken with a different aperture setting. Our system can be used with commercially available DSLR cameras and photographic lenses without modification to either. We leverage the fact that defocus blur is a function of scene depth and f/# to estimate a depth map. We demonstrate several applications of our multi-aperture camera, such as post-exposure editing of the depth of field, including extrapolation beyond the physical limits of the lens, synthetic refocusing, and depth-guided deconvolution.
@inproceedings{GSM+2007,
author = {Green, Paul and Sun, Wenyang and Matusik, Wojciech and Durand, Fr\'{e}do},
title = {Multi-aperture photography},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {68},
location = {San Diego, California},
doi = {10.1145/1275808.1276462},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://people.csail.mit.edu/green/multiaperture/},
abstract = {The emergent field of computational photography is proving that, by coupling generalized imaging optics with software processing, the quality and flexibility of imaging systems can be increased. In this paper, we capture and manipulate multiple images of a scene taken with different aperture settings (f-numbers). We design and implement a prototype optical system and associated algorithms to capture four images of the scene in a single exposure, each taken with a different aperture setting. Our system can be used with commercially available DSLR cameras and photographic lenses without modification to either. We leverage the fact that defocus blur is a function of scene depth and f/# to estimate a depth map. We demonstrate several applications of our multi-aperture camera, such as post-exposure editing of the depth of field, including extrapolation beyond the physical limits of the lens, synthetic refocusing, and depth-guided deconvolution.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[HK2008]
|
Hasinoff, S.W. and
Kutulakos, K.N. 2008.
Light-Efficient Photography.
In ECCV '08: Proceedings of the 10th European Conference on Computer Vision,
Springer-Verlag,
Berlin, Heidelberg, 45–59.
We consider the problem of imaging a scene with a given depth of field at a given exposure level in the shortest amount of time possible. We show that by (1) collecting a sequence of photos and (2) controlling the aperture, focus and exposure time of each photo individually, we can span the given depth of field in less total time than it takes to expose a single narrower-aperture photo. Using this as a starting point, we obtain two key results. First, for lenses with continuously-variable apertures, we derive a closed-form solution for the globally optimal capture sequence, i.e., that collects light from the specified depth of field in the most efficient way possible. Second, for lenses with discrete apertures, we derive an integer programming problem whose solution is the optimal sequence. Our results are applicable to off-the-shelf cameras and typical photography conditions, and advocate the use of dense, wide-aperture photo sequences as a light-efficient alternative to single-shot, narrow-aperture photography.
@inproceedings{HK2008,
author = {Hasinoff, Samuel W. and Kutulakos, Kiriakos N.},
title = {Light-Efficient Photography},
booktitle = {ECCV '08: Proceedings of the 10th European Conference on Computer Vision},
year = {2008},
isbn = {978-3-540-88692-1},
pages = {45--59},
location = {Marseille, France},
doi = {10.1007/978-3-540-88693-8_4},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
project = {http://people.csail.mit.edu/hasinoff/lightefficient/},
abstract = {We consider the problem of imaging a scene with a given depth of field at a given exposure level in the shortest amount of time possible. We show that by (1) collecting a sequence of photos and (2) controlling the aperture, focus and exposure time of each photo individually, we can span the given depth of field in less total time than it takes to expose a single narrower-aperture photo. Using this as a starting point, we obtain two key results. First, for lenses with continuously-variable apertures, we derive a closed-form solution for the globally optimal capture sequence, i.e., that collects light from the specified depth of field in the most efficient way possible. Second, for lenses with discrete apertures, we derive an integer programming problem whose solution is the optimal sequence. Our results are applicable to off-the-shelf cameras and typical photography conditions, and advocate the use of dense, wide-aperture photo sequences as a light-efficient alternative to single-shot, narrow-aperture photography. }
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
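The headline claim is easy to sanity-check with back-of-the-envelope photography arithmetic (illustrative numbers only, not taken from the paper): at a fixed exposure level the exposure time scales with the square of the f-number, while depth of field grows only roughly linearly with it, so several fast wide-aperture slices can beat one slow narrow-aperture shot.

# Illustrative numbers only: exposure level ~ t / N^2 (N = f-number), so
# holding exposure fixed gives t proportional to N^2, while DOF grows
# roughly linearly with N.
t_f16 = 1.0                       # single f/16 exposure (arbitrary time units)
t_f4 = t_f16 * (4.0 / 16.0) ** 2  # one f/4 shot at the same exposure level
shots = 4                         # ~number of f/4 slices to span the f/16 DOF
print(shots * t_f4, "vs", t_f16)  # 0.25 vs 1.0: the sequence is ~4x faster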
[HKD+2009]
|
Hasinoff, S.W.,
Kutulakos, K.N.,
Durand, F. and
Freeman, W.T. 2009.
Time-Constrained Photography.
In Proceedings of the IEEE International Conference on Computer Vision (ICCV '09).
Capturing multiple photos at different focus settings is a powerful approach for reducing optical blur, but how many photos should we capture within a fixed time budget? We develop a framework to analyze optimal capture strategies balancing the tradeoff between defocus and sensor noise, incorporating uncertainty in resolving scene depth. We derive analytic formulas for restoration error and use Monte Carlo integration over depth to derive optimal capture strategies for different camera designs, under a wide range of photographic scenarios. We also derive a new upper bound on how well spatial frequencies can be preserved over the depth of field. Our results show that by capturing the optimal number of photos, a standard camera can achieve performance at the level of more complex computational cameras, in all but the most demanding of cases. We also show that computational cameras, although specifically designed to improve one-shot performance, generally benefit from capturing multiple photos as well.
@inproceedings{HKD+2009,
author = {Samuel W. Hasinoff and Kiriakos N. Kutulakos and Fr\'{e}do Durand and William T. Freeman},
title = {Time-Constrained Photography},
booktitle = {Proceedings of the {IEEE} International Conference on Computer Vision ({ICCV '09})},
year = {2009},
project = {http://people.csail.mit.edu/hasinoff/timecon/},
abstract = {Capturing multiple photos at different focus settings is a powerful approach for reducing optical blur, but how many photos should we capture within a fixed time budget? We develop a framework to analyze optimal capture strategies balancing the tradeoff between defocus and sensor noise, incorporating uncertainty in resolving scene depth. We derive analytic formulas for restoration error and use Monte Carlo integration over depth to derive optimal capture strategies for different camera designs, under a wide range of photographic scenarios. We also derive a new upper bound on how well spatial frequencies can be preserved over the depth of field. Our results show that by capturing the optimal number of photos, a standard camera can achieve performance at the level of more complex computational cameras, in all but the most demanding of cases. We also show that computational cameras, although specifically designed to improve one-shot performance, generally benefit from capturing multiple photos as well. }
}
|
[BibTeX]
[Abstract]
[Project]
|
[KF2009]
|
Krishnan, D. and
Fergus, R. 2009.
Dark flash photography.
In SIGGRAPH '09: ACM SIGGRAPH 2009 papers,
ACM,
New York, NY, USA, 1–11.
Camera flashes produce intrusive bursts of light that disturb or dazzle. We present a prototype camera and flash that uses infra-red and ultra-violet light mostly outside the visible range to capture pictures in low-light conditions. This "dark" flash is at least two orders of magnitude dimmer than conventional flashes for a comparable exposure. Building on ideas from flash/no-flash photography, we capture a pair of images, one using the dark flash, the other using the dim ambient illumination alone. We then exploit the correlations between images recorded at different wavelengths to denoise the ambient image and restore fine details to give a high quality result, even in very weak illumination. The processing techniques can also be used to denoise images captured with conventional cameras.
@inproceedings{KF2009,
author = {Krishnan, Dilip and Fergus, Rob},
title = {Dark flash photography},
booktitle = {SIGGRAPH '09: ACM SIGGRAPH 2009 papers},
year = {2009},
isbn = {978-1-60558-726-4},
pages = {1--11},
location = {New Orleans, Louisiana},
doi = {10.1145/1576246.1531402},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.nyu.edu/~fergus/research/dark_flash.html},
abstract = {Camera flashes produce intrusive bursts of light that disturb or dazzle. We present a prototype camera and flash that uses infra-red and ultra-violet light mostly outside the visible range to capture pictures in low-light conditions. This "dark" flash is at least two orders of magnitude dimmer than conventional flashes for a comparable exposure. Building on ideas from flash/no-flash photography, we capture a pair of images, one using the dark flash, the other using the dim ambient illumination alone. We then exploit the correlations between images recorded at different wavelengths to denoise the ambient image and restore fine details to give a high quality result, even in very weak illumination. The processing techniques can also be used to denoise images captured with conventional cameras.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LFD+2007]
|
Levin, A.,
Fergus, R.,
Durand, F. and
Freeman, W.T. 2007.
Image and depth from a conventional camera with a coded aperture.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 70.
A conventional camera captures blurred versions of scene information away from the plane of focus. Camera systems have been proposed that allow for recording all-focus images, or for extracting depth, but to record both simultaneously has required more extensive hardware and reduced spatial resolution. We propose a simple modification to a conventional camera that allows for the simultaneous recovery of both (a) high resolution image information and (b) depth information adequate for semi-automatic extraction of a layered depth representation of the image. Our modification is to insert a patterned occluder within the aperture of the camera lens, creating a coded aperture. We introduce a criterion for depth discriminability which we use to design the preferred aperture pattern. Using a statistical model of images, we can recover both depth information and an all-focus image from single photographs taken with the modified camera. A layered depth map is then extracted, requiring user-drawn strokes to clarify layer assignments in some cases. The resulting sharp image and layered depth map can be combined for various photographic applications, including automatic scene segmentation, post-exposure refocusing, or re-rendering of the scene from an alternate viewpoint.
@inproceedings{LFD+2007,
author = {Levin, Anat and Fergus, Rob and Durand, Fr\'{e}do and Freeman, William T.},
title = {Image and depth from a conventional camera with a coded aperture},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {70},
location = {San Diego, California},
doi = {10.1145/1275808.1276464},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://groups.csail.mit.edu/graphics/CodedAperture/},
abstract = {A conventional camera captures blurred versions of scene information away from the plane of focus. Camera systems have been proposed that allow for recording all-focus images, or for extracting depth, but to record both simultaneously has required more extensive hardware and reduced spatial resolution. We propose a simple modification to a conventional camera that allows for the simultaneous recovery of both (a) high resolution image information and (b) depth information adequate for semi-automatic extraction of a layered depth representation of the image. Our modification is to insert a patterned occluder within the aperture of the camera lens, creating a coded aperture. We introduce a criterion for depth discriminability which we use to design the preferred aperture pattern. Using a statistical model of images, we can recover both depth information and an all-focus image from single photographs taken with the modified camera. A layered depth map is then extracted, requiring user-drawn strokes to clarify layer assignments in some cases. The resulting sharp image and layered depth map can be combined for various photographic applications, including automatic scene segmentation, post-exposure refocusing, or re-rendering of the scene from an alternate viewpoint.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LHG+2009]
|
Levin, A.,
Hasinoff, S.W.,
Green, P.,
Durand, F. and
Freeman, W.T. 2009.
4D frequency analysis of computational cameras for depth of field extension.
In SIGGRAPH '09: ACM SIGGRAPH 2009 papers,
ACM,
New York, NY, USA, 1–14.
Depth of field (DOF), the range of scene depths that appear sharp in a photograph, poses a fundamental tradeoff in photography—wide apertures are important to reduce imaging noise, but they also increase defocus blur. Recent advances in computational imaging modify the acquisition process to extend the DOF through deconvolution. Because deconvolution quality is a tight function of the frequency power spectrum of the defocus kernel, designs with high spectra are desirable. In this paper we study how to design effective extended-DOF systems, and show an upper bound on the maximal power spectrum that can be achieved. We analyze defocus kernels in the 4D light field space and show that in the frequency domain, only a low-dimensional 3D manifold contributes to focus. Thus, to maximize the defocus spectrum, imaging systems should concentrate their limited energy on this manifold. We review several computational imaging systems and show either that they spend energy outside the focal manifold or do not achieve a high spectrum over the DOF. Guided by this analysis we introduce the lattice-focal lens, which concentrates energy at the low-dimensional focal manifold and achieves a higher power spectrum than previous designs. We have built a prototype lattice-focal lens and present extended depth of field results.
@inproceedings{LHG+2009,
author = {Levin, Anat and Hasinoff, Samuel W. and Green, Paul and Durand, Fr\'{e}do and Freeman, William T.},
title = {4D frequency analysis of computational cameras for depth of field extension},
booktitle = {SIGGRAPH '09: ACM SIGGRAPH 2009 papers},
year = {2009},
isbn = {978-1-60558-726-4},
pages = {1--14},
location = {New Orleans, Louisiana},
doi = {10.1145/1576246.1531403},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.wisdom.weizmann.ac.il/~levina/papers/lattice/},
abstract = {Depth of field (DOF), the range of scene depths that appear sharp in a photograph, poses a fundamental tradeoff in photography---wide apertures are important to reduce imaging noise, but they also increase defocus blur. Recent advances in computational imaging modify the acquisition process to extend the DOF through deconvolution. Because deconvolution quality is a tight function of the frequency power spectrum of the defocus kernel, designs with high spectra are desirable. In this paper we study how to design effective extended-DOF systems, and show an upper bound on the maximal power spectrum that can be achieved. We analyze defocus kernels in the 4D light field space and show that in the frequency domain, only a low-dimensional 3D manifold contributes to focus. Thus, to maximize the defocus spectrum, imaging systems should concentrate their limited energy on this manifold. We review several computational imaging systems and show either that they spend energy outside the focal manifold or do not achieve a high spectrum over the DOF. Guided by this analysis we introduce the lattice-focal lens, which concentrates energy at the low-dimensional focal manifold and achieves a higher power spectrum than previous designs. We have built a prototype lattice-focal lens and present extended depth of field results.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[LSC+2008]
|
Levin, A.,
Sand, P.,
Cho, T.S.,
Durand, F. and
Freeman, W.T. 2008.
Motion-invariant photography.
In SIGGRAPH '08: ACM SIGGRAPH 2008 papers,
ACM,
New York, NY, USA, 1–9.
Object motion during camera exposure often leads to noticeable blurring artifacts. Proper elimination of this blur is challenging because the blur kernel is unknown, varies over the image as a function of object velocity, and destroys high frequencies. In the case of motions along a 1D direction (e.g. horizontal) we show that these challenges can be addressed using a camera that moves during the exposure. Through the analysis of motion blur as space-time integration, we show that a parabolic integration (corresponding to constant sensor acceleration) leads to motion blur that is invariant to object velocity. Thus, a single deconvolution kernel can be used to remove blur and create sharp images of scenes with objects moving at different speeds, without requiring any segmentation and without knowledge of the object speeds. Apart from motion invariance, we prove that the derived parabolic motion preserves image frequency content nearly optimally. That is, while static objects are degraded relative to their image from a static camera, a reliable reconstruction of all moving objects within a given velocities range is made possible. We have built a prototype camera and present successful deblurring results over a wide variety of human motions.
@inproceedings{LSC+2008,
author = {Levin, Anat and Sand, Peter and Cho, Taeg Sang and Durand, Fr\'{e}do and Freeman, William T.},
title = {Motion-invariant photography},
booktitle = {SIGGRAPH '08: ACM SIGGRAPH 2008 papers},
year = {2008},
pages = {1--9},
location = {Los Angeles, California},
doi = {10.1145/1399504.1360670},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://groups.csail.mit.edu/graphics/pubs/MotionInvariant/},
abstract = {Object motion during camera exposure often leads to noticeable blurring artifacts. Proper elimination of this blur is challenging because the blur kernel is unknown, varies over the image as a function of object velocity, and destroys high frequencies. In the case of motions along a 1D direction (e.g. horizontal) we show that these challenges can be addressed using a camera that moves during the exposure. Through the analysis of motion blur as space-time integration, we show that a parabolic integration (corresponding to constant sensor acceleration) leads to motion blur that is invariant to object velocity. Thus, a single deconvolution kernel can be used to remove blur and create sharp images of scenes with objects moving at different speeds, without requiring any segmentation and without knowledge of the object speeds. Apart from motion invariance, we prove that the derived parabolic motion preserves image frequency content nearly optimally. That is, while static objects are degraded relative to their image from a static camera, a reliable reconstruction of all moving objects within a given velocities range is made possible. We have built a prototype camera and present successful deblurring results over a wide variety of human motions.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
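To see why parabolic (constant-acceleration) sensor motion gives a velocity-invariant blur kernel, you can simulate the 1D point spread function as the histogram of the object-minus-sensor displacement over the exposure. This is a toy numerical check, not the paper's derivation; all units are made up, and the invariance only holds for object speeds within the sensor's velocity sweep.

import numpy as np

def parabolic_psf(obj_vel, accel=1.0, T=1.0, nbins=80, samples=200000):
    # Blur kernel for an object moving at obj_vel while the sensor follows
    # a parabolic sweep: histogram of relative displacement over time.
    t = np.linspace(-T / 2.0, T / 2.0, samples)
    rel = obj_vel * t - accel * t ** 2   # object position minus sensor position
    hist, _ = np.histogram(rel, bins=nbins, range=(-1.0, 1.0))
    return hist / hist.sum()

# Kernels for different object speeds come out as (nearly) shifted copies
# of one another, so a single deconvolution kernel serves the whole range.
k_static = parabolic_psf(0.0)
k_moving = parabolic_psf(0.5)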
[MBN2007]
|
Moreno-Noguer, F.,
Belhumeur, P.N. and
Nayar, S.K. 2007.
Active refocusing of images and videos.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 67.
We present a system for refocusing images and videos of dynamic scenes using a novel, single-view depth estimation method. Our method for obtaining depth is based on the defocus of a sparse set of dots projected onto the scene. In contrast to other active illumination techniques, the projected pattern of dots can be removed from each captured image and its brightness easily controlled in order to avoid under- or over-exposure. The depths corresponding to the projected dots and a color segmentation of the image are used to compute an approximate depth map of the scene with clean region boundaries. The depth map is used to refocus the acquired image after the dots are removed, simulating realistic depth of field effects. Experiments on a wide variety of scenes, including close-ups and live action, demonstrate the effectiveness of our method.
@inproceedings{MBN2007,
author = {Moreno-Noguer, Francesc and Belhumeur, Peter N. and Nayar, Shree K.},
title = {Active refocusing of images and videos},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {67},
location = {San Diego, California},
doi = {10.1145/1275808.1276461},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.columbia.edu/CAVE/projects/active_refocus/},
abstract = {We present a system for refocusing images and videos of dynamic scenes using a novel, single-view depth estimation method. Our method for obtaining depth is based on the defocus of a sparse set of dots projected onto the scene. In contrast to other active illumination techniques, the projected pattern of dots can be removed from each captured image and its brightness easily controlled in order to avoid under- or over-exposure. The depths corresponding to the projected dots and a color segmentation of the image are used to compute an approximate depth map of the scene with clean region boundaries. The depth map is used to refocus the acquired image after the dots are removed, simulating realistic depth of field effects. Experiments on a wide variety of scenes, including close-ups and live action, demonstrate the effectiveness of our method.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[NKZ+2008]
|
Nagahara, H.,
Kuthirummal, S.,
Zhou, C. and
Nayar, S.K. 2008.
Flexible Depth of Field Photography.
In ECCV '08: Proceedings of the 10th European Conference on Computer Vision,
Springer-Verlag,
Berlin, Heidelberg, 60–73.
The range of scene depths that appear focused in an image is known as the depth of field (DOF). Conventional cameras are limited by a fundamental trade-off between depth of field and signal-to-noise ratio (SNR). For a dark scene, the aperture of the lens must be opened up to maintain SNR, which causes the DOF to reduce. Also, today's cameras have DOFs that correspond to a single slab that is perpendicular to the optical axis. In this paper, we present an imaging system that enables one to control the DOF in new and powerful ways. Our approach is to vary the position and/or orientation of the image detector, during the integration time of a single photograph. Even when the detector motion is very small (tens of microns), a large range of scene depths (several meters) is captured both in and out of focus. Our prototype camera uses a micro-actuator to translate the detector along the optical axis during image integration. Using this device, we demonstrate three applications of flexible DOF. First, we describe extended DOF, where a large depth range is captured with a very wide aperture (low noise) but with nearly depth-independent defocus blur. Applying deconvolution to a captured image gives an image with extended DOF and yet high SNR. Next, we show the capture of images with discontinuous DOFs. For instance, near and far objects can be imaged with sharpness while objects in between are severely blurred. Finally, we show that our camera can capture images with tilted DOFs (Scheimpflug imaging) without tilting the image detector. We believe flexible DOF imaging can open a new creative dimension in photography and lead to new capabilities in scientific imaging, vision, and graphics.
@inproceedings{NKZ+2008,
author = {Nagahara, Hajime and Kuthirummal, Sujit and Zhou, Changyin and Nayar, Shree K.},
title = {Flexible Depth of Field Photography},
booktitle = {ECCV '08: Proceedings of the 10th European Conference on Computer Vision},
year = {2008},
isbn = {978-3-540-88692-1},
pages = {60--73},
location = {Marseille, France},
doi = {10.1007/978-3-540-88693-8_5},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
project = {http://www1.cs.columbia.edu/CAVE/projects/flexible_dof/},
abstract = {The range of scene depths that appear focused in an image is known as the depth of field (DOF). Conventional cameras are limited by a fundamental trade-off between depth of field and signal-to-noise ratio (SNR). For a dark scene, the aperture of the lens must be opened up to maintain SNR, which causes the DOF to reduce. Also, today's cameras have DOFs that correspond to a single slab that is perpendicular to the optical axis. In this paper, we present an imaging system that enables one to control the DOF in new and powerful ways. Our approach is to vary the position and/or orientation of the image detector, during the integration time of a single photograph. Even when the detector motion is very small (tens of microns), a large range of scene depths (several meters) is captured both in and out of focus. Our prototype camera uses a micro-actuator to translate the detector along the optical axis during image integration. Using this device, we demonstrate three applications of flexible DOF. First, we describe extended DOF, where a large depth range is captured with a very wide aperture (low noise) but with nearly depth-independent defocus blur. Applying deconvolution to a captured image gives an image with extended DOF and yet high SNR. Next, we show the capture of images with discontinuous DOFs. For instance, near and far objects can be imaged with sharpness while objects in between are severely blurred. Finally, we show that our camera can capture images with tilted DOFs (Scheimpflug imaging) without tilting the image detector. We believe flexible DOF imaging can open a new creative dimension in photography and lead to new capabilities in scientific imaging, vision, and graphics.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[RAT2006]
|
Raskar, R.,
Agrawal, A. and
Tumblin, J. 2006.
Coded exposure photography: motion deblurring using fluttered shutter.
In SIGGRAPH '06: ACM SIGGRAPH 2006 Papers,
ACM,
New York, NY, USA, 795–804.
In a conventional single-exposure photograph, moving objects or moving cameras cause motion blur. The exposure time defines a temporal box filter that smears the moving object across the image by convolution. This box filter destroys important high-frequency spatial details so that deblurring via deconvolution becomes an ill-posed problem. Rather than leaving the shutter open for the entire exposure duration, we "flutter" the camera's shutter open and closed during the chosen exposure time with a binary pseudo-random sequence. The flutter changes the box filter to a broad-band filter that preserves high-frequency spatial details in the blurred image and the corresponding deconvolution becomes a well-posed problem. We demonstrate that manually-specified point spread functions are sufficient for several challenging cases of motion-blur removal including extremely large motions, textured backgrounds and partial occluders.
@inproceedings{RAT2006,
author = {Raskar, Ramesh and Agrawal, Amit and Tumblin, Jack},
title = {Coded exposure photography: motion deblurring using fluttered shutter},
booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Papers},
year = {2006},
isbn = {1-59593-364-6},
pages = {795--804},
location = {Boston, Massachusetts},
doi = {10.1145/1179352.1141957},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.umiacs.umd.edu/~aagrawal/sig06/sig06Main.html},
abstract = {In a conventional single-exposure photograph, moving objects or moving cameras cause motion blur. The exposure time defines a temporal box filter that smears the moving object across the image by convolution. This box filter destroys important high-frequency spatial details so that deblurring via deconvolution becomes an ill-posed problem. Rather than leaving the shutter open for the entire exposure duration, we "flutter" the camera's shutter open and closed during the chosen exposure time with a binary pseudo-random sequence. The flutter changes the box filter to a broad-band filter that preserves high-frequency spatial details in the blurred image and the corresponding deconvolution becomes a well-posed problem. We demonstrate that manually-specified point spread functions are sufficient for several challenging cases of motion-blur removal including extremely large motions, textured backgrounds and partial occluders.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
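A quick numerical way to appreciate the "box filter vs. broad-band filter" argument is to compare the magnitude spectra of a fully open shutter and a binary fluttered shutter over the same exposure window. The random code below is only for illustration; the paper searches for a near-optimal code, and the exact numbers depend on the random draw.

import numpy as np

rng = np.random.default_rng(0)
n = 52                                   # number of shutter chops
box = np.ones(n)                         # conventional open shutter
flutter = rng.integers(0, 2, size=n).astype(float)  # random on/off code

# Zero-pad and look at the magnitude of the frequency response (MTF).
mtf_box = np.abs(np.fft.rfft(box, 512))
mtf_code = np.abs(np.fft.rfft(flutter, 512))

# The box filter dips towards zero at regular frequencies, which is what
# makes deconvolution ill-posed; a broad-band code typically keeps its
# spectrum much flatter relative to its peak.
print(mtf_box.min() / mtf_box.max(), mtf_code.min() / mtf_code.max())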
[VRA+2007]
|
Veeraraghavan, A.,
Raskar, R.,
Agrawal, A.,
Mohan, A. and
Tumblin, J. 2007.
Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 69.
We describe a theoretical framework for reversibly modulating 4D light fields using an attenuating mask in the optical path of a lens based camera. Based on this framework, we present a novel design to reconstruct the 4D light field from a 2D camera image without any additional refractive elements as required by previous light field cameras. The patterned mask attenuates light rays inside the camera instead of bending them, and the attenuation recoverably encodes the rays on the 2D sensor. Our mask-equipped camera focuses just as a traditional camera to capture conventional 2D photos at full sensor resolution, but the raw pixel values also hold a modulated 4D light field. The light field can be recovered by rearranging the tiles of the 2D Fourier transform of sensor values into 4D planes, and computing the inverse Fourier transform. In addition, one can also recover the full resolution image information for the in-focus parts of the scene. We also show how a broadband mask placed at the lens enables us to compute refocused images at full sensor resolution for layered Lambertian scenes. This partial encoding of 4D ray-space data enables editing of image contents by depth, yet does not require computational recovery of the complete 4D light field.
@inproceedings{VRA+2007,
author = {Veeraraghavan, Ashok and Raskar, Ramesh and Agrawal, Amit and Mohan, Ankit and Tumblin, Jack},
title = {Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {69},
location = {San Diego, California},
doi = {10.1145/1275808.1276463},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.umiacs.umd.edu/~aagrawal/sig07/MatlabCodeImages.html},
abstract = {We describe a theoretical framework for reversibly modulating 4D light fields using an attenuating mask in the optical path of a lens based camera. Based on this framework, we present a novel design to reconstruct the 4D light field from a 2D camera image without any additional refractive elements as required by previous light field cameras. The patterned mask attenuates light rays inside the camera instead of bending them, and the attenuation recoverably encodes the rays on the 2D sensor. Our mask-equipped camera focuses just as a traditional camera to capture conventional 2D photos at full sensor resolution, but the raw pixel values also hold a modulated 4D light field. The light field can be recovered by rearranging the tiles of the 2D Fourier transform of sensor values into 4D planes, and computing the inverse Fourier transform. In addition, one can also recover the full resolution image information for the in-focus parts of the scene. We also show how a broadband mask placed at the lens enables us to compute refocused images at full sensor resolution for layered Lambertian scenes. This partial encoding of 4D ray-space data enables editing of image contents by depth, yet does not require computational recovery of the complete 4D light field.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
Key | Citation | Details |
[ARN+2005]
|
Agrawal, A.,
Raskar, R.,
Nayar, S.K. and
Li, Y. 2005.
Removing photography artifacts using gradient projection and flash-exposure sampling.
In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers,
ACM,
New York, NY, USA, 828–835.
Flash images are known to suffer from several problems: saturation of nearby objects, poor illumination of distant objects, reflections of objects strongly lit by the flash and strong highlights due to the reflection of flash itself by glossy surfaces. We propose to use a flash and no-flash (ambient) image pair to produce better flash images. We present a novel gradient projection scheme based on a gradient coherence model that allows removal of reflections and highlights from flash images. We also present a brightness-ratio based algorithm that allows us to compensate for the falloff in the flash image brightness due to depth. In several practical scenarios, the quality of flash/no-flash images may be limited in terms of dynamic range. In such cases, we advocate using several images taken under different flash intensities and exposures. We analyze the flash intensity-exposure space and propose a method for adaptively sampling this space so as to minimize the number of captured images for any given scene. We present several experimental results that demonstrate the ability of our algorithms to produce improved flash images.
@inproceedings{ARN+2005,
author = {Agrawal, Amit and Raskar, Ramesh and Nayar, Shree K. and Li, Yuanzhen},
title = {Removing photography artifacts using gradient projection and flash-exposure sampling},
booktitle = {SIGGRAPH '05: ACM SIGGRAPH 2005 Papers},
year = {2005},
pages = {828--835},
location = {Los Angeles, California},
doi = {10.1145/1186822.1073269},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.umiacs.umd.edu/~aagrawal/sig05/Gradient_Projection.html},
abstract = {Flash images are known to suffer from several problems: saturation of nearby objects, poor illumination of distant objects, reflections of objects strongly lit by the flash and strong highlights due to the reflection of flash itself by glossy surfaces. We propose to use a flash and no-flash (ambient) image pair to produce better flash images. We present a novel gradient projection scheme based on a gradient coherence model that allows removal of reflections and highlights from flash images. We also present a brightness-ratio based algorithm that allows us to compensate for the falloff in the flash image brightness due to depth. In several practical scenarios, the quality of flash/no-flash images may be limited in terms of dynamic range. In such cases, we advocate using several images taken under different flash intensities and exposures. We analyze the flash intensity-exposure space and propose a method for adaptively sampling this space so as to minimize the number of captured images for any given scene. We present several experimental results that demonstrate the ability of our algorithms to produce improved flash images.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
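The gradient-projection step itself is compact enough to sketch: at each pixel the flash-image gradient is projected onto the direction of the ambient-image gradient, so components that appear only in the flash image (e.g. reflections of the flash) are discarded, and the projected field is then re-integrated (e.g. by solving a Poisson equation) to form the cleaned image. This is my shorthand for the idea only; the paper's coherence model also handles weighting and the reverse direction.

import numpy as np

def project_gradients(gx_f, gy_f, gx_a, gy_a, eps=1e-8):
    # Per-pixel projection of the flash gradient (gx_f, gy_f) onto the
    # ambient gradient direction (gx_a, gy_a).
    dot = gx_f * gx_a + gy_f * gy_a
    norm2 = gx_a ** 2 + gy_a ** 2 + eps
    scale = dot / norm2
    return scale * gx_a, scale * gy_a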
[AS2007]
|
Avidan, S. and
Shamir, A. 2007.
Seam carving for content-aware image resizing.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 10.
Effective resizing of images should not only use geometric constraints, but consider the image content as well. We present a simple image operator called seam carving that supports content-aware image resizing for both reduction and expansion. A seam is an optimal 8-connected path of pixels on a single image from top to bottom, or left to right, where optimality is defined by an image energy function. By repeatedly carving out or inserting seams in one direction we can change the aspect ratio of an image. By applying these operators in both directions we can retarget the image to a new size. The selection and order of seams protect the content of the image, as defined by the energy function. Seam carving can also be used for image content enhancement and object removal. We support various visual saliency measures for defining the energy of an image, and can also include user input to guide the process. By storing the order of seams in an image we create multi-size images, that are able to continuously change in real time to fit a given size.
@inproceedings{AS2007,
author = {Avidan, Shai and Shamir, Ariel},
title = {Seam carving for content-aware image resizing},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {10},
location = {San Diego, California},
doi = {10.1145/1275808.1276390},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.seamcarving.com/},
abstract = {Effective resizing of images should not only use geometric constraints, but consider the image content as well. We present a simple image operator called seam carving that supports content-aware image resizing for both reduction and expansion. A seam is an optimal 8-connected path of pixels on a single image from top to bottom, or left to right, where optimality is defined by an image energy function. By repeatedly carving out or inserting seams in one direction we can change the aspect ratio of an image. By applying these operators in both directions we can retarget the image to a new size. The selection and order of seams protect the content of the image, as defined by the energy function. Seam carving can also be used for image content enhancement and object removal. We support various visual saliency measures for defining the energy of an image, and can also include user input to guide the process. By storing the order of seams in an image we create multi-size images, that are able to continuously change in real time to fit a given size.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
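The seam-finding step is a nice, self-contained piece of dynamic programming and is easy to prototype. Here is a minimal sketch for a vertical seam; the paper also covers seam insertion, the order of operations, and several energy functions.

import numpy as np

def find_vertical_seam(energy):
    # energy: 2D array (e.g. gradient magnitude). Returns one column index
    # per row along a minimum-energy 8-connected vertical seam.
    h, w = energy.shape
    cost = energy.astype(np.float64).copy()
    for y in range(1, h):
        left  = np.r_[np.inf, cost[y - 1, :-1]]
        mid   = cost[y - 1]
        right = np.r_[cost[y - 1, 1:], np.inf]
        cost[y] += np.minimum(np.minimum(left, mid), right)
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam

Removing the returned seam (one pixel per row) and repeating shrinks the image by one column at a time while protecting high-energy content.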
[ED2004]
|
Eisemann, E. and
Durand, F. 2004.
Flash photography enhancement via intrinsic relighting.
In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers,
ACM,
New York, NY, USA, 673–678.
We enhance photographs shot in dark environments by combining a picture taken with the available light and one taken with the flash. We preserve the ambiance of the original lighting and insert the sharpness from the flash image. We use the bilateral filter to decompose the images into detail and large scale. We reconstruct the image using the large scale of the available lighting and the detail of the flash. We detect and correct flash shadows. This combines the advantages of available illumination and flash photography.
@inproceedings{ED2004,
author = {Eisemann, Elmar and Durand, Fr\'{e}do},
title = {Flash photography enhancement via intrinsic relighting},
booktitle = {SIGGRAPH '04: ACM SIGGRAPH 2004 Papers},
year = {2004},
pages = {673--678},
location = {Los Angeles, California},
doi = {10.1145/1186562.1015778},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://people.csail.mit.edu/fredo/PUBLI/flash/index.htm},
abstract = {We enhance photographs shot in dark environments by combining a picture taken with the available light and one taken with the flash. We preserve the ambiance of the original lighting and insert the sharpness from the flash image. We use the bilateral filter to decompose the images into detail and large scale. We reconstruct the image using the large scale of the available lighting and the detail of the flash. We detect and correct flash shadows. This combines the advantages of available illumination and flash photography.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
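The decomposition-and-merge idea can be sketched in a few lines using OpenCV's bilateral filter as a stand-in. Note this is only a toy version: the paper works in the log domain, uses a cross-bilateral variant and detects flash shadows, and the parameters below are illustrative.

import numpy as np
import cv2

def merge_flash_noflash(ambient, flash, eps=0.02):
    # Keep the large-scale (base) layer of the ambient image and multiply
    # in the detail layer of the flash image. Inputs: float32 images in [0, 1].
    base_a = cv2.bilateralFilter(ambient, 9, 0.1, 7)
    base_f = cv2.bilateralFilter(flash, 9, 0.1, 7)
    detail_f = (flash + eps) / (base_f + eps)   # multiplicative flash detail
    return np.clip(base_a * detail_f, 0.0, 1.0)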
[KAG+2008]
|
Kuthirummal, S.,
Agarwala, A.,
Goldman, D.B. and
Nayar, S.K. 2008.
Priors for Large Photo Collections and What They Reveal about Cameras.
In ECCV '08: Proceedings of the 10th European Conference on Computer Vision,
Springer-Verlag,
Berlin, Heidelberg, 74–87.
A large photo collection downloaded from the internet spans a wide range of scenes, cameras, and photographers. In this project we discovered several statistics of such large photo collections that are independent of these factors. This makes it possible to recover the radiometric properties of a particular camera model by measuring how the statistics of images taken by one model differ from those of a broader photo collection with camera-specific effects removed. We show that using this approach we can recover both the non-linear response function of a particular camera model and the spatially-varying vignetting of several different lens settings. All this is achieved using publicly available photographs, without requiring acquisition of images under controlled conditions or even physical access to the cameras. We also applied this concept to identify bad pixels on the detectors of specific camera instances.
@inproceedings{KAG+2008,
author = {Kuthirummal, Sujit and Agarwala, Aseem and Goldman, Dan B and Nayar, Shree K.},
title = {Priors for Large Photo Collections and What They Reveal about Cameras},
booktitle = {ECCV '08: Proceedings of the 10th European Conference on Computer Vision},
year = {2008},
isbn = {978-3-540-88692-1},
pages = {74--87},
location = {Marseille, France},
doi = {10.1007/978-3-540-88693-8_6},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
project = {http://www.adobe.com/technology/graphics/priors_for_large_photo_collections_and_what_they_reveal_about_cameras.html},
abstract = {A large photo collection downloaded from the internet spans a wide range of scenes, cameras, and photographers. In this project we discovered several statistics of such large photo collections that are independent of these factors. This makes it possible to recover the radiometric properties of a particular camera model by measuring how the statistics of images taken by one model differ from those of a broader photo collection with camera-specific effects removed. We show that using this approach we can recover both the non-linear response function of a particular camera model and the spatially-varying vignetting of several different lens settings. All this is achieved using publicly available photographs, without requiring acquisition of images under controlled conditions or even physical access to the cameras. We also applied this concept to identify bad pixels on the detectors of specific camera instances.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[KUD+2007]
|
Kopf, J.,
Uyttendaele, M.,
Deussen, O. and
Cohen, M.F. 2007.
Capturing and viewing gigapixel images.
In SIGGRAPH '07: ACM SIGGRAPH 2007 papers,
ACM,
New York, NY, USA, 93.
We present a system to capture and view "Gigapixel images": very high resolution, high dynamic range, and wide angle imagery consisting of several billion pixels each. A specialized camera mount, in combination with an automated pipeline for alignment, exposure compensation, and stitching, provide the means to acquire Gigapixel images with a standard camera and lens. More importantly, our novel viewer enables exploration of such images at interactive rates over a network, while dynamically and smoothly interpolating the projection between perspective and curved projections, and simultaneously modifying the tone-mapping to ensure an optimal view of the portion of the scene being viewed.
@inproceedings{KUD+2007,
author = {Kopf, Johannes and Uyttendaele, Matt and Deussen, Oliver and Cohen, Michael F.},
title = {Capturing and viewing gigapixel images},
booktitle = {SIGGRAPH '07: ACM SIGGRAPH 2007 papers},
year = {2007},
pages = {93},
location = {San Diego, California},
doi = {10.1145/1275808.1276494},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://johanneskopf.de/publications/gigapixel/index.html},
abstract = {We present a system to capture and view "Gigapixel images": very high resolution, high dynamic range, and wide angle imagery consisting of several billion pixels each. A specialized camera mount, in combination with an automated pipeline for alignment, exposure compensation, and stitching, provide the means to acquire Gigapixel images with a standard camera and lens. More importantly, our novel viewer enables exploration of such images at interactive rates over a network, while dynamically and smoothly interpolating the projection between perspective and curved projections, and simultaneously modifying the tone-mapping to ensure an optimal view of the portion of the scene being viewed.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[KVH+2009]
|
Kalogerakis, E.,
Vesselova, O.,
Hays, J.,
Efros, A.A. and
Hertzmann, A. 2009.
Image Sequence Geolocation with Human Travel Priors.
In Proceedings of the IEEE International Conference on Computer Vision (ICCV '09).
This paper presents a method for estimating geographic location for sequences of time-stamped photographs. A prior distribution over travel describes the likelihood of traveling from one location to another during a given time interval. This distribution is based on a training database of 6 million photographs from Flickr.com. An image likelihood for each location is defined by matching a test photograph against the training database. Inferring location for images in a test sequence is then performed using the Forward-Backward algorithm, and the model can be adapted to individual users as well. Using temporal constraints allows our method to geolocate images without recognizable landmarks, and images with no geographic cues whatsoever. This method achieves a substantial performance improvement over the best-available baseline, and geolocates some users' images with near-perfect accuracy.
@inproceedings{KVH+2009,
author = {Evangelos Kalogerakis and Olga Vesselova and James Hays and Alexei A. Efros and Aaron Hertzmann},
title = {Image Sequence Geolocation with Human Travel Priors},
booktitle = {Proceedings of the {IEEE} International Conference on Computer Vision ({ICCV '09})},
year = {2009},
project = {http://www.dgp.toronto.edu/~kalo/papers/images2gps/},
abstract = {This paper presents a method for estimating geographic location for sequences of time-stamped photographs. A prior distribution over travel describes the likelihood of traveling from one location to another during a given time interval. This distribution is based on a training database of 6 million photographs from Flickr.com. An image likelihood for each location is defined by matching a test photograph against the training database. Inferring location for images in a test sequence is then performed using the Forward-Backward algorithm, and the model can be adapted to individual users as well. Using temporal constraints allows our method to geolocate images without recognizable landmarks, and images with no geographic cues whatsoever. This method achieves a substantial performance improvement over the best-available baseline, and geolocates some users' images with near-perfect accuracy.}
}
|
[BibTeX]
[Abstract]
[Project]
|
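Because the temporal model is essentially a hidden Markov model over discrete locations, the inference step the abstract mentions is the textbook Forward-Backward algorithm. Here is a generic, scaled implementation in my own notation (not the paper's exact model): init is the location prior, trans a row-stochastic travel prior, and likelihood the per-image matching scores.

import numpy as np

def forward_backward(init, trans, likelihood):
    # init: (S,), trans: (S, S), likelihood: (T, S).
    # Returns (T, S) posterior marginals over locations for each image.
    T, S = likelihood.shape
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))
    alpha[0] = init * likelihood[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = likelihood[t] * (alpha[t - 1] @ trans)
        alpha[t] /= alpha[t].sum()          # rescale to avoid underflow
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (likelihood[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)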
[NN2004]
|
Nishino, K. and
Nayar, S.K. 2004.
Eyes for relighting.
In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers,
ACM,
New York, NY, USA, 704–711.
The combination of the cornea of an eye and a camera viewing the eye form a catadioptric (mirror + lens) imaging system with a very wide field of view. We present a detailed analysis of the characteristics of this corneal imaging system. Anatomical studies have shown that the shape of a normal cornea (without major defects) can be approximated with an ellipsoid of fixed eccentricity and size. Using this shape model, we can determine the geometric parameters of the corneal imaging system from the image. Then, an environment map of the scene with a large field of view can be computed from the image. The environment map represents the illumination of the scene with respect to the eye. This use of an eye as a natural light probe is advantageous in many relighting scenarios. For instance, it enables us to insert virtual objects into an image such that they appear consistent with the illumination of the scene. The eye is a particularly useful probe when relighting faces. It allows us to reconstruct the geometry of a face by simply waving a light source in front of the face. Finally, in the case of an already captured image, eyes could be the only direct means for obtaining illumination information. We show how illumination computed from eyes can be used to replace a face in an image with another one. We believe that the eye not only serves as a useful tool for relighting but also makes relighting possible in situations where current approaches are hard to use.
@inproceedings{NN2004,
author = {Nishino, Ko and Nayar, Shree K.},
title = {Eyes for relighting},
booktitle = {SIGGRAPH '04: ACM SIGGRAPH 2004 Papers},
year = {2004},
pages = {704--711},
location = {Los Angeles, California},
doi = {10.1145/1186562.1015783},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www.cs.columbia.edu/CAVE/projects/eyes_relight/},
abstract = {The combination of the cornea of an eye and a camera viewing the eye form a catadioptric (mirror + lens) imaging system with a very wide field of view. We present a detailed analysis of the characteristics of this corneal imaging system. Anatomical studies have shown that the shape of a normal cornea (without major defects) can be approximated with an ellipsoid of fixed eccentricity and size. Using this shape model, we can determine the geometric parameters of the corneal imaging system from the image. Then, an environment map of the scene with a large field of view can be computed from the image. The environment map represents the illumination of the scene with respect to the eye. This use of an eye as a natural light probe is advantageous in many relighting scenarios. For instance, it enables us to insert virtual objects into an image such that they appear consistent with the illumination of the scene. The eye is a particularly useful probe when relighting faces. It allows us to reconstruct the geometry of a face by simply waving a light source in front of the face. Finally, in the case of an already captured image, eyes could be the only direct means for obtaining illumination information. We show how illumination computed from eyes can be used to replace a face in an image with another one. We believe that the eye not only serves as a useful tool for relighting but also makes relighting possible in situations where current approaches are hard to use.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
[PSA+2004]
|
Petschnigg, G.,
Szeliski, R.,
Agrawala, M.,
Cohen, M.,
Hoppe, H. and
Toyama, K. 2004.
Digital photography with flash and no-flash image pairs.
In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers,
ACM,
New York, NY, USA, 664–672.
Digital photography has made it possible to quickly and easily take a pair of images of low-light environments: one with flash to capture detail and one without flash to capture ambient illumination. We present a variety of applications that analyze and combine the strengths of such flash/no-flash image pairs. Our applications include denoising and detail transfer (to merge the ambient qualities of the no-flash image with the high-frequency flash detail), white-balancing (to change the color tone of the ambient image), continuous flash (to interactively adjust flash intensity), and red-eye removal (to repair artifacts in the flash image). We demonstrate how these applications can synthesize new images that are of higher quality than either of the originals.
@inproceedings{PSA+2004,
author = {Petschnigg, Georg and Szeliski, Richard and Agrawala, Maneesh and Cohen, Michael and Hoppe, Hugues and Toyama, Kentaro},
title = {Digital photography with flash and no-flash image pairs},
booktitle = {SIGGRAPH '04: ACM SIGGRAPH 2004 Papers},
year = {2004},
pages = {664--672},
location = {Los Angeles, California},
doi = {10.1145/1186562.1015777},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://vis.berkeley.edu/papers/fnf/},
abstract = {Digital photography has made it possible to quickly and easily take a pair of images of low-light environments: one with flash to capture detail and one without flash to capture ambient illumination. We present a variety of applications that analyze and combine the strengths of such flash/no-flash image pairs. Our applications include denoising and detail transfer (to merge the ambient qualities of the no-flash image with the high-frequency flash detail), white-balancing (to change the color tone of the ambient image), continuous flash (to interactively adjust flash intensity), and red-eye removal (to repair artifacts in the flash image). We demonstrate how these applications can synthesize new images that are of higher quality than either of the originals.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
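The denoising component relies on a joint (cross) bilateral filter: the noisy ambient image is averaged with weights whose range term comes from the low-noise flash image, so flash edges are preserved. Below is a deliberately slow reference sketch with illustrative parameters; the paper adds detail transfer, white balancing and shadow/specularity masks on top of this.

import numpy as np

def joint_bilateral(ambient, flash, radius=3, sigma_s=2.0, sigma_r=0.1):
    # Smooth the ambient image, but take the range weights from the flash
    # image (both single-channel float arrays of the same shape).
    h, w = ambient.shape
    out = np.zeros((h, w), dtype=np.float64)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    pad_a = np.pad(ambient, radius, mode='edge')
    pad_f = np.pad(flash, radius, mode='edge')
    for y in range(h):
        for x in range(w):
            win_a = pad_a[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            win_f = pad_f[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            rng = np.exp(-(win_f - flash[y, x]) ** 2 / (2 * sigma_r ** 2))
            wgt = spatial * rng
            out[y, x] = (wgt * win_a).sum() / wgt.sum()
    return out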
[SGS+2008]
|
Snavely, N.,
Garg, R.,
Seitz, S.M. and
Szeliski, R. 2008.
Finding paths through the world's photos.
In SIGGRAPH '08: ACM SIGGRAPH 2008 papers,
ACM,
New York, NY, USA, 1–11.
When a scene is photographed many times by different people, the viewpoints often cluster along certain paths. These paths are largely specific to the scene being photographed, and follow interesting regions and viewpoints. We seek to discover a range of such paths and turn them into controls for image-based rendering. Our approach takes as input a large set of community or personal photos, reconstructs camera viewpoints, and automatically computes orbits, panoramas, canonical views, and optimal paths between views. The scene can then be interactively browsed in 3D using these controls or with six degree-of-freedom free-viewpoint control. As the user browses the scene, nearby views are continuously selected and transformed, using control-adaptive reprojection techniques.
@inproceedings{SGS+2008,
author = {Snavely, Noah and Garg, Rahul and Seitz, Steven M. and Szeliski, Richard},
title = {Finding paths through the world's photos},
booktitle = {SIGGRAPH '08: ACM SIGGRAPH 2008 papers},
year = {2008},
pages = {1--11},
location = {Los Angeles, California},
doi = {10.1145/1399504.1360614},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://phototour.cs.washington.edu/findingpaths/},
abstract = {When a scene is photographed many times by different people, the viewpoints often cluster along certain paths. These paths are largely specific to the scene being photographed, and follow interesting regions and viewpoints. We seek to discover a range of such paths and turn them into controls for image-based rendering. Our approach takes as input a large set of community or personal photos, reconstructs camera viewpoints, and automatically computes orbits, panoramas, canonical views, and optimal paths between views. The scene can then be interactively browsed in 3D using these controls or with six degree-of-freedom free-viewpoint control. As the user browses the scene, nearby views are continuously selected and transformed, using control-adaptive reprojection techniques.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
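[SGS+2008] is primarily an image-based rendering and interaction paper, but one ingredient that translates directly into code is path planning over the reconstructed viewpoints: once camera positions are known, a transition between two views can be found by searching a graph of nearby views. The sketch below is a toy illustration of that idea, not the paper's algorithm; the k-nearest-neighbour graph and the Euclidean edge costs are assumptions made here, whereas the paper's path costs account for how well intermediate views can be rendered from the input photos.

# Toy illustration (not the paper's algorithm): treat reconstructed camera
# positions as graph nodes, connect nearby views, and find a path between two
# views with Dijkstra. The k-NN graph and Euclidean edge costs are assumptions.
import heapq
import numpy as np

def view_graph(cams, k=4):
    """cams: (N, 3) camera centers -> symmetric adjacency dict {i: [(j, cost), ...]}."""
    d = np.linalg.norm(cams[:, None, :] - cams[None, :, :], axis=-1)
    adj = {i: [] for i in range(len(cams))}
    for i in range(len(cams)):
        for j in np.argsort(d[i])[1:k + 1]:        # index 0 is the node itself
            j = int(j)
            adj[i].append((j, float(d[i, j])))
            adj[j].append((i, float(d[i, j])))     # keep the graph symmetric
    return adj

def shortest_view_path(adj, start, goal):
    """Dijkstra over the view graph; returns a list of view indices or None."""
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        c, u = heapq.heappop(pq)
        if u == goal:
            path = [goal]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        if c > dist.get(u, float('inf')):
            continue
        for v, w in adj[u]:
            nc = c + w
            if nc < dist.get(v, float('inf')):
                dist[v] = nc
                prev[v] = u
                heapq.heappush(pq, (nc, v))
    return None                                    # goal not reachable

cams = np.random.default_rng(1).random((50, 3))    # stand-in for SfM camera centers
print(shortest_view_path(view_graph(cams), 0, 10))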
[SCG+2005]
|
Sen, P.,
Chen, B.,
Garg, G.,
Marschner, S.R.,
Horowitz, M.,
Levoy, M. and
Lensch, H.P.A. 2005.
Dual photography.
In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers,
ACM,
New York, NY, USA, 745–755.
We present a novel photographic technique called dual photography, which exploits Helmholtz reciprocity to interchange the lights and cameras in a scene. With a video projector providing structured illumination, reciprocity permits us to generate pictures from the viewpoint of the projector, even though no camera was present at that location. The technique is completely image-based, requiring no knowledge of scene geometry or surface properties, and by its nature automatically includes all transport paths, including shadows, inter-reflections and caustics. In its simplest form, the technique can be used to take photographs without a camera; we demonstrate this by capturing a photograph using a projector and a photo-resistor. If the photo-resistor is replaced by a camera, we can produce a 4D dataset that allows for relighting with 2D incident illumination. Using an array of cameras we can produce a 6D slice of the 8D reflectance field that allows for relighting with arbitrary light fields. Since an array of cameras can operate in parallel without interference, whereas an array of light sources cannot, dual photography is fundamentally a more efficient way to capture such a 6D dataset than a system based on multiple projectors and one camera. As an example, we show how dual photography can be used to capture and relight scenes.
@inproceedings{SCG+2005,
author = {Sen, Pradeep and Chen, Billy and Garg, Gaurav and Marschner, Stephen R. and Horowitz, Mark and Levoy, Marc and Lensch, Hendrik P. A.},
title = {Dual photography},
booktitle = {SIGGRAPH '05: ACM SIGGRAPH 2005 Papers},
year = {2005},
pages = {745--755},
location = {Los Angeles, California},
doi = {10.1145/1186822.1073257},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://www-graphics.stanford.edu/papers/dual_photography/},
abstract = {We present a novel photographic technique called dual photography, which exploits Helmholtz reciprocity to interchange the lights and cameras in a scene. With a video projector providing structured illumination, reciprocity permits us to generate pictures from the viewpoint of the projector, even though no camera was present at that location. The technique is completely image-based, requiring no knowledge of scene geometry or surface properties, and by its nature automatically includes all transport paths, including shadows, inter-reflections and caustics. In its simplest form, the technique can be used to take photographs without a camera; we demonstrate this by capturing a photograph using a projector and a photo-resistor. If the photo-resistor is replaced by a camera, we can produce a 4D dataset that allows for relighting with 2D incident illumination. Using an array of cameras we can produce a 6D slice of the 8D reflectance field that allows for relighting with arbitrary light fields. Since an array of cameras can operate in parallel without interference, whereas an array of light sources cannot, dual photography is fundamentally a more efficient way to capture such a 6D dataset than a system based on multiple projectors and one camera. As an example, we show how dual photography can be used to capture and relight scenes.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
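The key identity behind [SCG+2005] is compact enough to check numerically. Writing the scene's light transport as a matrix T that maps a projector pattern p to a camera image c = T p, Helmholtz reciprocity says the dual configuration (light the scene from the camera's position, view it from the projector's) is governed by the transpose: the dual image of a pattern c' is T^T c'. The snippet below verifies this bookkeeping on a random stand-in for T; in the paper, T is measured with structured illumination rather than assumed.

# Minimal numerical sketch of the reciprocity relation behind dual photography.
# If a (camera_pixels x projector_pixels) transport matrix T maps projector
# patterns to camera images, c = T @ p, then the dual configuration is
# described by the transpose: p_dual = T.T @ c_dual. The random T below is a
# stand-in for a transport matrix measured with structured illumination.
import numpy as np

rng = np.random.default_rng(0)
n_cam, n_proj = 64, 48
T = rng.random((n_cam, n_proj)) * 0.01             # toy light transport matrix

pattern = rng.random(n_proj)                        # projector illumination
camera_image = T @ pattern                          # primal photograph

# Dual photograph: "illuminate" one camera pixel and read the projector-side image.
dual_light = np.zeros(n_cam)
dual_light[10] = 1.0
dual_image = T.T @ dual_light                       # equals row 10 of T
assert np.allclose(dual_image, T[10])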
[SSS2006]
|
Snavely, N.,
Seitz, S.M. and
Szeliski, R. 2006.
Photo tourism: exploring photo collections in 3D.
In SIGGRAPH '06: ACM SIGGRAPH 2006 Papers,
ACM,
New York, NY, USA, 835–846.
We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image to model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.
@inproceedings{SSS2006,
author = {Snavely, Noah and Seitz, Steven M. and Szeliski, Richard},
title = {Photo tourism: exploring photo collections in 3D},
booktitle = {SIGGRAPH '06: ACM SIGGRAPH 2006 Papers},
year = {2006},
isbn = {1-59593-364-6},
pages = {835--846},
location = {Boston, Massachusetts},
doi = {10.1145/1179352.1141964},
publisher = {ACM},
address = {New York, NY, USA},
project = {http://phototour.cs.washington.edu/},
abstract = {We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image to model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.}
}
|
[BibTeX]
[Abstract]
[DOI]
[Project]
|
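The front end of [SSS2006] is a structure-from-motion pipeline: detect and match features across the photo collection, estimate camera poses, and triangulate a sparse point cloud. The sketch below shows just the two-view core of such a pipeline using OpenCV; the filenames and the intrinsic matrix K are placeholders, and the real system registers hundreds of unordered photos incrementally rather than a single pair.

# Two-view core of a structure-from-motion front end (OpenCV), as a stand-in
# for the full incremental pipeline. Filenames and intrinsics are placeholders.
import cv2
import numpy as np

img1 = cv2.imread('photo_a.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder filenames
img2 = cv2.imread('photo_b.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test to keep distinctive matches
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m[0] for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

K = np.array([[1000.0, 0, 640.0],                         # assumed intrinsics
              [0, 1000.0, 360.0],
              [0, 0, 1.0]])

E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
print("relative rotation:\n", R, "\ntranslation direction:", t.ravel())

From pairs like this, the full system chains relative poses, triangulates matched points, and refines everything with bundle adjustment to obtain the sparse scene model that drives the photo explorer.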