Why People Should Care About Light-Field Displays
Light-field displays have a wide range of potential applications, including 3-D television and projection systems, technologies for vision assessment and correction, and small form factors with support for focus cues in head-mounted displays.
by Gordon Wetzstein
MANY PEOPLE believe that the recent hype about stereoscopic 3-D displays is over, at least for the moment. Part of the reason why 3-D television, in particular, has not been widely adopted by consumers may be the lack of a truly unique or useful enhancement of the 2-D viewing experience.
At the same time, light-field displays are expected to be “the future of 3-D displays.” So what makes people believe that a new technology delivering an experience that consumers have not adopted in the past will work in the future?
This article does not attempt to outline possible future scenarios for 3-D television applications. Instead, we discuss a range of unconventional display applications that are facilitated by light-field technology that perhaps would benefit much more than conventional television from emerging capabilities. Whereas glasses-based stereoscopic displays present two different views of a scene to the eyes of an observer, light-field displays aim for a multiview solution that usually removes the need for additional glasses. Conceptually, a light-field display emits a set of all possible light rays (over some field of view) such that the observer can move anywhere in front of the display and his eyes sample the appropriate viewing zones (Fig. 1). The next section outlines a short historical review of light-field displays and recent trends toward compressive light-field displays, followed by a discussion of applications in projection systems, vision assessment and correction, wearable displays, and a brief comparison to holography.
Fig. 1: In this conceptual sketch of a light-field display, several people enjoy glasses-free 3-D television. The emitted light field (top right) is a collection of images depicting the same scene from slightly different horizontal and vertical perspectives. These images are all very similar, hence compressible. Image courtesy Wetzstein et al. (Ref. 10).
From Integral Imaging to Compressive Light-Field Displays
The invention of light-field displays dates back to the beginning of the last century. In 1908, Gabriel Lippmann built the first integrated light-field camera and display using integral imaging.1 He mounted microlens arrays on photographic plates, then exposed and developed these plates with the lens arrays in place such that they could be viewed as a light field or glasses-free 3-D image after the fact. Over the course of more than 100 years, many alternative technologies have emerged that deliver a glasses-free 3-D experience;2 for example, holography, volumetric displays,3,4 multi-focal-plane displays,5–7 and multi-projector arrays.8,9 However, extreme requirements in feature sizes, cumbersome form factors, and high cost are some of the reasons why none of these technologies has found widespread use in consumer electronics.
With an ever-increasing demand on image resolution, one of the major bottlenecks in the light-field-display pipeline is the computation. Consider the example of a high-quality light-field display with 100 × 100 views, each having HD resolution streamed at 60 Hz. More than 1 trillion light rays have to be rendered per second, requiring more than 100 terabytes of floating-point RGB ray data to be stored and processed. Furthermore, with conventional integral imaging, one would need a display panel that has a resolution 10,000× higher than available HD panels.
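The ray count in this example can be checked with a few lines of arithmetic. The sketch below assumes that "HD" means 1920 × 1080 pixels per view and that each ray is stored as three 32-bit floats (12 bytes); both values are assumptions made for illustration, and the function name is hypothetical. The total storage requirement depends on the precision and on how many seconds of content are held, so the sketch reports a per-second rate under these assumptions rather than a total.

```python
def light_field_throughput(views_x, views_y, width, height, fps, bytes_per_ray):
    """Return (views, rays per second, bytes per second) for a dense light field."""
    views = views_x * views_y                  # number of distinct perspectives
    rays_per_frame = views * width * height    # one ray per pixel per view
    rays_per_second = rays_per_frame * fps
    bytes_per_second = rays_per_second * bytes_per_ray
    return views, rays_per_second, bytes_per_second


# Assumed values: 1920 x 1080 per view, three 32-bit floats (12 bytes) per ray.
views, rays, rate = light_field_throughput(100, 100, 1920, 1080, 60, bytes_per_ray=12)
print(f"views: {views}")
print(f"rays per second: {rays:.3e}")
print(f"raw data rate: {rate / 1e12:.1f} TB/s")
```

The view count is also the factor by which a conventional integral-imaging panel would have to exceed HD resolution, since each view consumes its own set of panel pixels.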
To tackle the big data problem and relax requirements on display hardware, compressive light-field displays have recently been introduced as a modern take on Lippmann’s vision. They exploit two simple insights: (1) light fields are highly redundant high-dimensional visual signals and (2) the human visual system has limitations that can be exploited for visual signal compression.
Rather than pursuing a direct optical solution (e.g., adding one more pixel or voxel to support the emission of one additional light ray), compressive displays aim to create flexible optical systems that have the capability to synthesize a compressed target light field. In effect, each pixel emits a superposition of light rays; through compression and tailored optical designs, fewer display pixels are necessary to emit a given light field (i.e., a set of light rays with a specific target set of radiances) than would be demanded with a direct optical solution.
Implementations of compressive light-field displays include multiple stacked layers of liquid-crystal displays (LCDs) (see Fig. 2), a thin “sandwich” of two LCDs enclosing a microlens array, or, in general, any combination of stacked programmable light modulators and refractive optical elements.10 The main difference between stacked LCDs and conventional volumetric displays is the non-linear image formation. Cascaded LCDs have a multiplicative effect on the incident light when the polarizers between panels are in place,10–13 or a polarization-rotating effect when the polarizers are removed.14 Volumetric displays, such as spinning disks or multi-focal-plane displays, are inherently additive, and hence provide a linear image formation. By using a 4-D frequency analysis,10,11 one can show that multiplicative non-linear displays, at least in theory, allow high-resolution virtual content to be displayed outside and in between the physical layers. This is not the case for additive multi-focal-plane displays.
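The multiplicative image formation can be illustrated with a toy two-layer example. The sketch below is a simplified "flatland" model, not the algorithm of any particular prototype: a ray is indexed by the rear-layer pixel i and the front-layer pixel j it passes through, its radiance in one frame is the product of the two pixel values, and the eye sums several rapidly shown frames. Finding the layer patterns then amounts to a non-negative matrix factorization of the target light field, solved here with standard multiplicative updates. The function names, the 6 × 6 target, and the choice of three frames are all hypothetical; the brightness scale from temporal averaging is absorbed into the factors, and the upper transmittance limit of a real LCD is not enforced.

```python
import math
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(target, approx):
    return math.sqrt(sum((t - p) ** 2
                         for tr, pr in zip(target, approx)
                         for t, p in zip(tr, pr)))


def factorize_two_layer(light_field, frames, iterations=200, seed=0):
    """Approximate light_field[i][j] by sum_t rear[i][t] * front[t][j], all >= 0."""
    rng = random.Random(seed)
    n, m = len(light_field), len(light_field[0])
    rear = [[rng.uniform(0.1, 1.0) for _ in range(frames)] for _ in range(n)]
    front = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(frames)]
    eps = 1e-9  # avoids division by zero
    for _ in range(iterations):
        # Multiplicative update for the front-layer patterns.
        rt = transpose(rear)
        num = matmul(rt, light_field)
        den = matmul(matmul(rt, rear), front)
        front = [[f * a / (b + eps) for f, a, b in zip(fr, nr, dr)]
                 for fr, nr, dr in zip(front, num, den)]
        # Multiplicative update for the rear-layer patterns.
        ft = transpose(front)
        num = matmul(light_field, ft)
        den = matmul(rear, matmul(front, ft))
        rear = [[r * a / (b + eps) for r, a, b in zip(rr, nr, dr)]
                for rr, nr, dr in zip(rear, num, den)]
    return rear, front


# Hypothetical smooth target light field with values in [0, 1].
target = [[0.5 + 0.5 * math.cos(0.8 * (i - j)) for j in range(6)] for i in range(6)]
rear, front = factorize_two_layer(target, frames=3)
print("residual:", frob_error(target, matmul(rear, front)))
```

Each column of `rear` paired with the matching row of `front` is one time-multiplexed frame; using fewer frames than the matrix dimension is what makes the representation compressive.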
The human visual system is a complex mechanism. In addition to all the depth cues, two of its characteristics are particularly important for 3-D display design: visual acuity and the critical flicker fusion threshold (CFF). Usually, 3-D display capabilities are created either by reducing the resolution of the screen or by multiplexing information in time. These are fundamental tradeoffs that cannot be easily overcome; basically, “there is no free lunch.”
The most reasonable thing to do would be to make tradeoffs where an observer would not perceive them. Resolutions available in commercial panels are now at, or at least close to, the “retina” level. Adding 3-D capabilities through spatial multiplexing, as is the case for integral imaging, reduces the resolution by a factor that is equal to the number of light-field views. If only a few views are desired, this may be a reasonable approach, but for wide field-of-view multiview displays, the loss in resolution is unacceptable for an observer.
With currently available panel resolutions, adding 3-D capabilities through spatial multiplexing may actually degrade the overall viewing experience because the gain in 3-D capabilities may not be justified by the loss of resolution. However, the critical flicker fusion threshold of the human visual system is much lower than the refresh rates offered by consumer displays. LCDs can achieve about 240 Hz, and MEMS devices run in the kHz range. This is already widely exploited in active shutter-glasses-based 3-D displays, field-sequential-color displays (e.g., in projectors), and also in time-sequential volumetric displays. Although not perfect, time multiplexing appears to be the most reasonable choice if making tradeoffs cannot be avoided. Compressive light-field displays usually also time-multiplex visual signals, but some incarnations10,15 are designed such that the same hardware configuration allows for dynamic tradeoffs in resolution, image brightness, 3-D quality, etc., to be made in software in a content-adaptive manner.
And although compressive light-field displays have so far only been demonstrated in the form of academic prototypes (Fig. 2), the promised benefits make further research and development worthwhile: traditional resolution tradeoffs can be overcome, device form factors can be reduced, requirements for display refresh rate and resolution can be relaxed, and visual discomfort introduced by the vergence-accommodation conflict can be mitigated. The design and implementation of such displays is at the convergence of applied mathematics, optics, electronics, high-performance computing, and human perception.
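The two budgets discussed here can be compared with a short calculation. The sketch below is illustrative only: the 3840 × 2160 panel, the 8 × 8 views, and the 60-Hz perceived frame rate used as a stand-in for a rate above the flicker fusion threshold are assumed values, and the function names are hypothetical.

```python
def spatial_multiplex_resolution(panel_w, panel_h, views_x, views_y):
    """Per-view resolution when panel pixels are divided among the views."""
    return panel_w // views_x, panel_h // views_y


def time_multiplex_frames(refresh_hz, perceived_hz):
    """Number of sub-frames that can be shown within one perceived frame."""
    return int(refresh_hz // perceived_hz)


# Assumed example: a 3840 x 2160 panel split into 8 x 8 views.
per_view = spatial_multiplex_resolution(3840, 2160, 8, 8)
print("resolution per view:", per_view)

# Assumed example: a 240-Hz LCD and a 60-Hz perceived frame rate.
subframes = time_multiplex_frames(240, 60)
print("time-multiplexed sub-frames:", subframes)
```

In this example spatial multiplexing costs a factor of 64 in pixel count, whereas time multiplexing leaves the spatial resolution untouched and offers a handful of sub-frames per perceived frame.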
Fig. 2: This compressive light-field prototype uses three stacked layers of LCDs (left) that are rear-illuminated by a single backlight. A non-negative light-field tensor factorization algorithm computes time-multiplexed patterns for all LCD layers (bottom right), which are then displayed at a speed exceeding the critical flicker fusion threshold of the human visual system. Perceptually, these patterns fuse into a consistent high-resolution light field that supports binocular disparity and motion parallax without the need for glasses (right). Image courtesy Wetzstein et al. (Ref. 10).
3-D Projection: Toward the Holodeck
Large-scale glasses-free 3-D displays have a wide range of applications in collaborative learning, planning and training, advertisement, scientific visualization, and entertainment. Over the last decade, light-field projection systems that provide an impressive image quality and extreme depth of field have emerged.8,9 For horizontal-only parallax systems, a one-dimensional array of projectors is mounted in front of or behind a vertical-only diffusing screen. The diffuser ensures that the generated image is independent of the vertical viewing position, whereas the horizontal directionality of individual projector rays is preserved by the screen. As a viewer moves in front of the display, his eyes sample the emitted light field. As with any 3-D display, there are tradeoffs. The number of required projectors directly scales with the desired field of view and angular density of emitted rays. In practical applications, dozens of projectors are often employed, which makes multi-projector light-field displays expensive, difficult to calibrate, power hungry, and sometimes bulky. These limitations may not be restrictive for some applications – for example, military training – but it could be argued that they make consumer applications currently infeasible.
The compressive light-field methodology also applies to projection systems.16 In this case, the goal is to “compress” the number of required devices, thereby improving the power efficiency, form factor, and cost of the system. The basic idea behind compressive light-field projection is intuitive: ideally, one would only use a single projector that generates a light field inside the device and emits it on a screen that preserves the angular variation of the light field. Diffusers would not be suitable for that task. Furthermore, the field of view of the emitted light field would only be as large as the projector aperture, so at first glance this does not seem like a feasible option.
A possible solution to this problem was recently presented:16 a light-field projector that employs two cascading spatial light modulators (SLMs) and a rear-illuminated passive screen that not only preserves angular light variation but actually expands it such that the perceived field of view is much larger than the projector aperture. This angle expansion can be achieved by a screen that is composed of two microlens arrays with different focal lengths, mounted back-to-back. Each of the microlenses covers the area of a single pixel, thereby not compromising image resolution. As illustrated in Fig. 3, each of the screen pixels is a beam expander (or tiny Keplerian telescope), which expands the field of view but inherently shrinks the pixel area on the other side. The tradeoff is now between pixel fill factor and desired angle amplification. A prototype system providing a field of view of 5° for a monochromatic light field was recently demonstrated by Hirsch et al.16 This is a promising direction that may lead to practical and low-cost 3-D projection systems, but high-quality manufacturing techniques for angle-expanding screens providing a significantly higher angle expansion factor have yet to be developed.
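The tradeoff between angle amplification and fill factor follows from the textbook behavior of an ideal afocal (Keplerian) lens pair, and can be sketched numerically. The code below assumes ideal thin lenses; the focal lengths, pixel pitch, and input field of view are hypothetical values chosen for illustration and are not the parameters of the prototype in Ref. 16.

```python
import math


def angle_expanding_pixel(f_in_mm, f_out_mm, pitch_mm, input_fov_deg):
    """Ideal thin-lens model of one screen pixel built from two lenslets."""
    magnification = f_in_mm / f_out_mm           # angular magnification
    # For an ideal afocal pair, the tangents of the ray angles scale by M.
    half_in = math.radians(input_fov_deg / 2)
    output_fov_deg = 2 * math.degrees(math.atan(magnification * math.tan(half_in)))
    # A collimated beam filling the entrance lenslet exits narrower by 1/M.
    exit_beam_mm = pitch_mm / magnification
    area_fill_factor = (exit_beam_mm / pitch_mm) ** 2
    return magnification, output_fov_deg, exit_beam_mm, area_fill_factor


# Hypothetical lenslet pair: 2.0-mm and 0.5-mm focal lengths, 1.0-mm pitch.
m, fov_out, beam, fill = angle_expanding_pixel(2.0, 0.5, 1.0, input_fov_deg=2.0)
print(f"angular magnification: {m:.1f}x")
print(f"output field of view: {fov_out:.2f} degrees")
print(f"exit beam width: {beam:.2f} mm, area fill factor: {fill:.4f}")
```

In this idealized model the area fill factor falls with the square of the angular magnification, which is one way to see why large expansion factors are demanding to manufacture.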
Fig. 3: On the top is a schematic of a compressive light-field projection system composed of a passive angle-expanding screen and a light-field projector. On the bottom is a photograph of a prototype system demonstrating a field of view and viewing zone separation suitable for a 3-D display. Image courtesy Hirsch et al. (Ref. 16).
Assessing and Enhancing Human Vision
Perhaps one of the most unconventional applications of light-field displays is the assessment17 and correction18 of visual aberrations in a human observer (see Fig. 4). In essence, the light-field display presents a distorted light field to the eyes of the viewer such that her aberrations undistort the light-field imagery, resulting in the desired image. The idea is very similar to wavefront correction with coherent optics. The requirements on the light-field display are extreme in this case.
Fig. 4: At left is a comparison between a conventional display and a vision-correcting light-field display. On the right are some application scenarios. Vision correcting displays could be integrated into existing displays and used in laptops, cellphones, watches, e-Readers, and cars to provide a comfortable eyeglasses-free experience. Image courtesy Huang et al. (Ref. 18).
For conventional 3-D displays, as discussed above, stereoscopic depth cues and motion parallax are usually sufficient. Hence, the emitted light field has to provide views that are densely packed such that each of the eyes of the observer always sees a different image. For vision-assessing and vision-correcting displays, one needs to emit not only two different images into the observer’s eyes but also multiple different images into the same pupil. At least two different views have to enter the same pupil to produce a retinal blur that triggers accommodative responses.19 Although researchers have attempted to produce faithful focus cues – accommodation and retinal blur – in addition to binocular disparity and motion parallax,13 it is very challenging to achieve high image quality, and diffraction places an upper bound on the achievable resolution. The diffraction limit is critical in this context because transparent LCDs are high-frequency pixel grids that filter visual signals (or additional LCDs) physically located behind them, such that those signals are optically blurred via diffraction. The smaller the pixel structures, the more blurred the spatial content of farther display planes becomes. Hence, there is a type of uncertainty principle at work: adding more and higher-resolution LCDs to the stack ought to increase the 3-D capabilities of the device, yet it comes at the cost of reduced spatial resolution of other device components.
Ignoring stereo cues and motion parallax, however, allows for practical light-field displays to be built that support only focus cues.17,18 To assess visual aberrations, a light field containing a pair of lines can be presented to an observer such that the retinal projections exactly overlap when the viewer has normal or corrected vision but appear at spatially distinct locations when that is not the case. By interactively aligning the lines, one can manually pre-distort the light field until the pre-distortions are canceled out by the refractive errors of the eye. The amount of manually induced pre-distortion then gives insights into the distortions in the eye. The procedure can be repeated for lines at varying rotation angles, allowing for astigmatism to be estimated as well as defocus.
Assuming that the refractive errors in the eye are known, either from an eye exam or through self-diagnosis, one can also emit arbitrary light fields that are pre-distorted for a particular viewer. This approach creates a vision-correcting display. Instead of correcting vision with eyeglasses or contact lenses, the same can be done directly in the screen, allowing for myopia, hyperopia, astigmatism, and even higher-order aberrations to be corrected.
The main technical insight of recent work on vision-correcting displays is this: the image presented on a conventional display appears blurred for a viewer with refractive errors. One could attempt to deconvolve the presented image with the point-spread function of the observer, such that the image and the observer’s visual defects together optically cancel out,20 but it turns out that this is usually an ill-posed mathematical problem. Light-field displays use the same idea but lift the deconvolution problem into the 4-D light-field domain, where it becomes mathematically well-posed. Current vision-correcting displays require the prescription of the viewer to be known (no changes to the hardware are necessary; different prescriptions can be corrected in software) and the pupil positions and diameters to be tracked. The latter is an engineering challenge that would have to be solved to make vision-correcting light-field displays robust enough for consumer devices.
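Why conventional pre-deconvolution is ill-posed can be seen in a one-dimensional toy example. The sketch below models defocus blur as a simple box kernel, which is an assumption made for illustration; the point-spread function of a real eye is more complicated. The frequency response of such a kernel contains zeros, so the corresponding spatial frequencies are removed by the blur and no pre-filtered 2-D image can restore them. The sketch does not attempt to show the 4-D light-field formulation, and the function names are hypothetical.

```python
import cmath


def box_kernel(length, width):
    """Normalized box blur of the given width, zero-padded to 'length' samples."""
    return [1.0 / width if t < width else 0.0 for t in range(length)]


def dft_magnitudes(signal):
    """Magnitude of the discrete Fourier transform, computed directly."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n)]


# Assumed toy blur: 8 samples wide on a 64-sample signal.
mags = dft_magnitudes(box_kernel(64, 8))
smallest = min(mags)
print("response at zero frequency:", mags[0])
print("smallest response:", smallest)
print("frequencies with near-zero response:",
      [k for k, v in enumerate(mags) if v < 1e-9])
```

An inverse filter would have to divide by these near-zero values, which amplifies noise without bound and demands pixel values outside the displayable range.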
One application of light-field displays that has received a lot of attention recently is wearable displays. With the emergence of Google Glass, the Oculus Rift, and, more recently, Microsoft’s HoloLens, near-to-eye displays for virtual-reality (VR) and augmented-reality (AR) applications in the consumer space have become one of the most anticipated emerging technologies.
Unfortunately, no existing near-to-eye display supports correct focus cues, which are crucial for comfortable and natural viewing experiences. For immersive VR, this lack of support results in the well-known vergence-accommodation conflict,21 which can lead to visual discomfort and fatigue, eyestrain, diplopic vision, headaches, nausea, compromised image quality, and even pathologies in developing visual systems. Only a few display solutions exist that support correct or nearly correct focus cues;5,6 i.e., accommodation and retinal blur. These are multi-focal-plane displays that closely approximate volumetric displays.3,4 Volumetric displays can be interpreted as one special type of light-field display that is capable of producing light distributions that are a subset of the full 4-D light-field space.
Lanman et al.22 have recently shown that a different implementation of a near-to-eye light-field display (i.e., integral imaging) is also suitable for supporting focus cues within some range and at a reduced image resolution (see Fig. 5). The focus of their paper was the design of a very thin device form factor rather than an in-depth evaluation of the range in which focus cues are actually supported. This idea is very similar to vision-correcting light-field displays: a virtual image is shown outside the physical device enclosure. In principle, both could be combined into a single, slim near-to-eye display that supports focus cues and simultaneously corrects myopia, hyperopia, astigmatism, and higher-order aberrations in the eye. However, such a device has not been demonstrated yet. The requirements for focus-supporting near-to-eye light-field displays are the same as for vision-correcting displays: at least two different views have to enter the same pupil.19 In display applications, the algorithms doing digital refocus in light-field cameras (“shift and add light-field views”) are optically performed in the eye. Overall, near-to-eye light-field displays hold great promise in overcoming some of the most challenging limitations of head-mounted displays today and helping to create more-comfortable and natural viewing experiences. Perhaps this is one of the most promising and at the same time unexplored areas of light-field displays in general.
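The “shift and add” operation mentioned above can be written down in a few lines for a flatland light field with one angular and one spatial dimension. This is a minimal sketch of the digital refocusing idea, not of the optics of any specific display; the synthetic scene, the integer shifts, and the wrap-around at the image border are simplifications, and the function name is hypothetical.

```python
def shift_and_add(views, slope):
    """Average the views after shifting each in proportion to its angular offset."""
    n_views = len(views)
    width = len(views[0])
    center = (n_views - 1) / 2
    refocused = [0.0] * width
    for u, view in enumerate(views):
        shift = round(slope * (u - center))   # integer pixel shift for this view
        for x in range(width):
            # Wrap-around indexing keeps the sketch short; real data needs padding.
            refocused[x] += view[(x + shift) % width] / n_views
    return refocused


# Synthetic scene: one bright point whose position moves 2 pixels per view.
n_views, width, position, disparity = 5, 21, 10, 2
views = []
for u in range(n_views):
    row = [0.0] * width
    row[position + disparity * (u - 2)] = 1.0
    views.append(row)

sharp = shift_and_add(views, slope=2)     # focal slope matches the point
blurred = shift_and_add(views, slope=0)   # focal slope does not match
print("peak when in focus:", max(sharp))
print("peak when out of focus:", max(blurred))
```

When the slope matches the disparity of the point, all views align and the point is sharp; otherwise its energy is spread over several pixels, which is the retinal blur that serves as a focus cue.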
Fig. 5: The shown near-to-eye light-field display provides a very thin form factor and correct or nearly correct focus cues within a limited depth range. Image courtesy Lanman et al. (Ref. 22).
Light Fields and Holography
The difference between light fields and holograms is a subject for evening-filling philosophical discussions, and the question cannot easily be answered. The author’s take on it is that both effectively create similar viewing experiences, but a fundamental difference is that light fields usually work with incoherent light and holograms are based on coherent illumination. However, partially coherent or incoherent holograms exist as well, and light-field displays can also account for diffractive phenomena.24 The boundaries between these fields are fluid, but traditionally holograms are modeled using wave optics, whereas light fields are modeled as rays or geometric optics. As pixels get smaller, the image formation of light-field displays will have to incorporate diffractive effects, so the areas are expected to merge eventually. Nevertheless, it is advantageous to think about light fields in the ray-space because it makes connections to advanced signal processing and optimization algorithms, such as computed tomography11 and non-negative matrix12 and tensor factorization,10 very intuitive.
A Potential As Yet Unrealized
In summary, light-field displays have a wide range of applications in 3-D television and projection systems; they enable technologies for vision assessment and correction, and one of the “hottest” areas at the moment is their application to small form factors and the support for focus cues in head-mounted displays. Even if public interest in any of these applications may diminish in the long term, light fields provide a unifying mathematical framework in which almost all existing display technologies can be analyzed and compared. The same framework also makes it intuitive to leverage modern signal processing and optimization algorithms, which has led to the emergence of compressive light-field displays. Finally, very similar strategies can be employed to design and build 2-D super-resolution and high-dynamic-range displays.15,16,23
Thus far, all of the recent advances in computational and compressive display technology have been made using standard optical and electronics components, including LCDs, liquid-crystal–on–silicon (LCoS), microlens arrays, etc. Future designs, however, may incorporate completely new optical designs that are not only tailored to a particular application (wearable, mobile, television) but also to specific algorithms driving these devices. We have entered an era in which advanced computation has become an integral part of the image formation in many emerging display technologies. The synergy between computation, optics, electronics, and human perception is expected to become even more important in the future.
1G. Lippmann, “La Photographie Intégrale,” Académie des Sciences 146, 446–451 (1908).
2H. Urey, K. V. Chellappan, E. Erden, and P. Surman, “State of the Art in Stereoscopic and Autostereoscopic Displays,” Proc. IEEE 99, 4, 540–555 (2011).
3O. S. Cossairt, J. Napoli, S. L. Hill, R. K. Dorval, and G. E. Favalora, “Occlusion-Capable Multiview Volumetric Three-Dimensional Display,” Applied Optics 46, 8, 1244–1250 (2007).
4A. Jones, I. McDowall, H. Yamada, M. Bolas, and P. Debevec, “Rendering for an Interactive 360° Light Field Display,” ACM Trans. Graph. (SIGGRAPH) 26, 40:1-40:10 (2007).
5K. Akeley, S. J. Watt, A. R. Girshick, and M. S. Banks, “A Stereo Display Prototype with Multiple Focal Distances,” ACM Trans. Graph. (SIGGRAPH) 23, 804–813 (2004).
6B. T. Schowengerdt and E. J. Seibel, “True 3-D scanned voxel displays using single and multiple light sources,” J. Soc. Info. Display 14(2), 135–143 (2006).
7G. Love, D. Hoffman, P. Hands, J. Gao, A. Kirby, and M. Banks, “High-speed switchable lens enables the development of a volumetric stereoscopic display,” Opt. Express 17, 15716–15725 (2009).
8T. Balogh, “The HoloVizio System,” Proc. SPIE 6055, 60550U (2006).
9A. Jones, K. Nagano, J. Liu, J. Busch, X. Yu, M. Bolas, and P. Debevec, “Interpolating vertical parallax for an autostereoscopic 3-D projector array,” J. Electron. Imaging 23, 011005 (2014).
10G. Wetzstein, D. Lanman, M. Hirsch, and R. Raskar, “Tensor Displays: Compressive Light Field Synthesis using Multilayer Displays with Directional Backlighting,” ACM Trans. Graph. (SIGGRAPH) 31, 1–11 (2012).
11G. Wetzstein, D. Lanman, W. Heidrich, and R. Raskar, “Layered 3-D: Tomographic Image Synthesis for Attenuation-based Light Field and High Dynamic Range Displays,” ACM Trans. Graph. (SIGGRAPH) (2011).
12D. Lanman, M. Hirsch, Y. Kim, and R. Raskar, “Content-Adaptive Parallax Barriers: Optimizing Dual-Layer 3-D Displays Using Low-Rank Light Field Factorization,” ACM Trans. Graph. (SIGGRAPH Asia) 28, 5, 1–10 (2010).
13A. Maimone, G. Wetzstein, D. Lanman, M. Hirsch, R. Raskar, and H. Fuchs, “Focus 3-D: Compressive Accommodation Display,” ACM Trans. Graph. (TOG) 32, 5, 153:1–153:13 (2013).
14D. Lanman, G. Wetzstein, M. Hirsch, W. Heidrich, and R. Raskar, “Polarization Fields: Dynamic Light Field Display Using Multi-Layer LCDs,” ACM Trans. Graph. (SIGGRAPH Asia) 30, 6, 186 (2011).
15F. Heide, J. Gregson, G. Wetzstein, R. Raskar, and W. Heidrich, “Compressive multi-mode superresolution display,” Opt. Express 22, 14981–14992 (2014).
16M. Hirsch, G. Wetzstein, and R. Raskar, “A Compressive Light Field Projection System,” ACM Trans. Graph. (SIGGRAPH) (2014).
17V. Pamplona, A. Mohan, M. Oliveira, and R. Raskar, “NETRA: Interactive Display for Estimating Refractive Errors and Focal Range,” ACM Trans. Graph. (SIGGRAPH) (2010).
18F. C. Huang, G. Wetzstein, B. Barsky, and R. Raskar, “Eyeglasses-free Display: Towards Correcting Visual Aberrations with Computational Light Field Displays,” ACM Trans. Graph. (SIGGRAPH) (2014).
19Y. Takaki, “High-Density Directional Display for Generating Natural Three-Dimensional Images,” Proc. IEEE 94, 3 (2006).
20J. I. Yellott and J. W. Yellott, “Correcting spurious resolution in defocused images,” Proc. SPIE 6492 (2007).
21D. Hoffman, A. Girshick, K. Akeley, and M. Banks, “Vergence-accommodation conflict hinders visual performance and causes visual fatigue,” Journal of Vision 8(3) (2008).
22D. Lanman and D. Luebke, “Near-to-Eye Light Field Displays,” ACM Trans. Graph. (SIGGRAPH Asia), 32(6) (2013).
23F. Heide, D. Lanman, D. Reddy, J. Kautz, K. Pulli, and D. Luebke, “Cascaded displays: spatiotemporal superresolution using offset pixel layers,” ACM Trans. Graph. (SIGGRAPH), 33(4) (2014).
24G. Ye, S. Jolly, M. Bove, Q. Dai, R. Raskar, and G. Wetzstein, “Toward BxDF Display using Multilayer Diffraction,” ACM Trans. Graph. (SIGGRAPH Asia), 33(6) (2014).
Gordon Wetzstein is an assistant professor at Stanford University. He can be reached at firstname.lastname@example.org.