## Abstract

Calibration is vital to autostereoscopic 3D displays. This paper proposes a local calibration method that copes with any type of deformation in the optical layer. The proposed method is based on visual pattern analysis: given the observations, we localize the optical slits by matching the observations to the input pattern. Casting the calibration in a principled optimization framework, we derive an efficient algorithm, which we validate experimentally. The local calibration shows significant improvement in 3D visual quality over global calibration methods. This paper also offers a new intuitive insight into calibration in terms of light field theory.

© 2017 Optical Society of America

## 1. Introduction

Three-dimensional (3D) displays provide immersive viewing environments in which users get a realistic sense of scene depth [1,2]. In particular, multiview autostereoscopic 3D (AS3D) displays have become commercially available on account of their simplicity and cost efficiency [3–5]. The multiview AS3D display is also easily implementable on mobile devices such as tablets and smartphones [6]. With only an additional optical layer (e.g., a lenticular lens or parallax barrier), an ordinary 2D display is transformed into a 3D display. The optical elements give directionality to pixels by refracting or blocking the light rays that emerge from them, so that different pixels are visible at different positions. The resulting binocular disparity and motion parallax provide appropriate cues for 3D perception.

Limitations exist, however. If the same display panel is used, the 3D quality tends to be much lower than the maximum 2D quality. The primary reasons include (i) the spatio-angular resolution trade-off; (ii) light leakage or crosstalk; and (iii) misalignment or inconsistency between the actual configuration of the optical layer and the one assumed for rendering. While recent research has addressed all of the above issues [2, 7–19], the first two remain somewhat inevitable unless significant hardware changes accompany them. In contrast, the last one can be almost entirely fixed, at least in principle, solely by accurate calibration.

With regard to calibration, many prior works are found in the literature (e.g., [11–18]). Most of them are *global* calibration (GC) methods assuming that the inconsistency is characterizable by only a few predefined parameters, globally uniform over the whole panel. Recent GC methods achieve quite good performance in estimating the global parameters, but they generally fail in the presence of *local* deformation. A few studies deal with local deformation in restricted settings. Li *et al.* [17] calibrate the tiled lens array of a huge display. Considering that the tiles might be tilted, they estimate the tilting parameters and subsequently compensate as if there were only a single lens array of huge size. Note that the type of deformation is very specific. Lee and Ra [16] and Zhou *et al.* [11] attempt to obtain a correct view assignment map without explicit calibration. For dense viewing positions, they first find where each pixel is directed, by making a vast number of observations of diverse patterns (e.g., 72 per viewing position [11]), and then assign the corresponding view number to every pixel. The scheme basically works even with local deformation since it does not assume any particular type of misalignment. However, setting aside the complexity, it is hardly generalizable to viewing positions outside the predefined set. Indeed, Lee and Ra [16] show a systematic method of reassigning views when the observation is made at positions outside the predefined set, but the method works only if a certain set of parameters (e.g., the thickness of the optical layer in their case) remains globally uniform. To the best of our knowledge, there is no research that “calibrates” AS3D displays with local deformation and thus works generically in any situation.

In this paper, we propose a novel approach to estimating and compensating for arbitrary types of local deformation in the AS3D display. Our method is based on a principled visual pattern analysis. Modeling the relationship between the input pattern and the observations, we cast the calibration as an optimization task which can be solved efficiently. The proposed calibration method is well explained with the established light field theory. Particularly, in connection with rendering, the light field representation yields very efficient algorithms. Experiments show that the proposed method significantly improves the visual quality of the AS3D display.

## 2. Related theory

For a given scene, the *light field* denotes the distribution of light rays over the space, from everywhere to everywhere [20]. If we assume two parallel planes Π and Ω in 3D space, an arbitrary light ray, except those parallel to the planes, intersects them at (*s*, *t*) ∈ Π and (*u*, *v*) ∈ Ω, respectively. In [21], Gortler *et al.* denote each light ray by the 4D coordinate (*s*, *t*, *u*, *v*) and consider the light field as a function defined over this parameter space (or *ray space*). The function is called the *lumigraph* and is widely used for unified and simplified analysis of the light field in various fields of application, from computational photography [22,23] to 3D displays [24–27].

Here we utilize the lumigraph representation for the AS3D display calibration and rendering. Our environment is shown in Fig. 1. Consider that the *xy*-plane of the world coordinate is aligned to the display panel and that *z*-axis is directed toward the viewer. The origin is assumed to be located at the center of the display panel. We place two planes Π and Ω at *z* = 0 and *z* = *d*, respectively.

Let us start with briefly reviewing a few fundamental properties of the lumigraph parameterization. For simplicity, we restrict ourselves to a horizontal 2D slice, parameterized by *s* and *u* only (refer to Fig. 2), but the results as well as the algorithms are extensible to the full dimension.

**Properties.** (a) A light ray, naturally represented as a line in the real world, corresponds to a point (*s*, *u*) in the ray space.

(b) A pencil mathematically refers to a set of lines that meet at a common point [28]. Given a pinhole in the real world, consider the associated light pencil, i.e., the set of all light rays that pass through the pinhole. Because a single light ray corresponds to a point in the ray space, a set of light rays is mapped to a set of points, forming a line in the ray space. Specifically, if the pinhole is located at $\mathbf{p}=\left({p}^{x},{p}^{z}\right)$ in the world coordinate, the pencil function in the ray space becomes

$$u=\left(1-\frac{d}{{p}^{z}}\right)s+\frac{d}{{p}^{z}}\,{p}^{x},\tag{1}$$

where *d* denotes the distance between Π and Ω.

The above two properties are considered to be dual to each other in the sense that a point in one space becomes a line in the other space and vice versa [29,30].
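The two properties can be illustrated numerically. The sketch below (with our own helper names, not from the paper) maps a pinhole at $\mathbf{p}=(p^{x},p^{z})$ to its ray-space line $u=(1-d/p^{z})\,s+(d/p^{z})\,p^{x}$ and checks that every ray through the pinhole indeed lands on that line:

```python
# Sketch (not the paper's code): two-plane parameterization of a pencil.
# A pinhole at p = (px, pz) maps to the ray-space line
#     u = (1 - d/pz) * s + (d/pz) * px,
# where s is the ray's intersection with plane Pi (z = 0)
# and u its intersection with plane Omega (z = d).

def pencil_line(px, pz, d):
    """Return (slope, intercept) of the pencil's line in (s, u) ray space."""
    return 1.0 - d / pz, (d / pz) * px

def ray_through(px, pz, s, d):
    """For the ray through (s, 0) and the pinhole (px, pz),
    return its u-coordinate on the plane z = d."""
    # Parameterize the ray x(z) = s + (px - s) * z / pz and evaluate at z = d.
    return s + (px - s) * d / pz

# Any ray through the pinhole must lie on the pencil's ray-space line.
a, b = pencil_line(px=0.5, pz=2.0, d=1.0)
for s in (-1.0, 0.0, 0.7):
    u = ray_through(0.5, 2.0, s, 1.0)
    assert abs(u - (a * s + b)) < 1e-12
```

The duality is visible here: the single world point (0.5, 2.0) becomes an entire line in (*s*, *u*) space, while each individual ray collapses to one point on it.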

In the AS3D display environment, two important types of pencils exist: one formed by the optical element of the 3D display (e.g., barrier slit, principal point of lenticular lenslet) and the other formed by the observer (e.g., pupil of the eye, exit pupil of the camera lens). We denote the former by display pencil function (DPF) and the latter by camera pencil function (CPF).

**Display pencil function.** The optical element consists of many slits, through which an observer sees the pixels dedicated to a particular viewpoint. This generates as many DPFs in the ray space, associated one-to-one with the slits:

$$u=\left(1-\frac{d}{{p}_{n}^{z}}\right)s+\frac{d}{{p}_{n}^{z}}\,{p}_{n}^{x},\qquad n=1,\dots ,N,\tag{2}$$

where *n* denotes the slit index and ${\mathbf{p}}_{n}=\left({p}_{n}^{x},{p}_{n}^{z}\right)$ the location of the *n*th slit. The DPFs physically denote all light rays “visible” to observers: any light ray visible to any observer must lie on the DPFs. In this regard, the AS3D display is a sampling device on the light field, sifting only the DPFs (i.e., a union of 1D lines) out of the entire 2D ray space.

Without deformation, $\{{p}_{n}^{x}\}$ would be uniformly spaced and $\{{p}_{n}^{z}\}$ would be constant, which would make the DPFs equidistantly separated and parallel to each other (see Eq. (2)). In most cases, however, neither are $\{{p}_{n}^{x}\}$ uniformly spaced nor are $\{{p}_{n}^{z}\}$ constant. As a result, the actual DPFs are distributed somewhat irregularly (see Fig. 3). In 3D display calibration, the ultimate goal is to accurately localize the slits in the world coordinate. This goal is equivalent to obtaining the actual DPFs in the ray space.

**Camera pencil function.** On the observer side, we also assume a plural number, say *K*, of pinholes. These pinholes may represent the pupils of human eyes or the exit pupils of camera lenses, at multiple viewpoints. They correspond to as many CPFs in the ray space:

$$u=\left(1-\frac{d}{{q}_{k}^{z}}\right)s+\frac{d}{{q}_{k}^{z}}\,{q}_{k}^{x},\qquad k=1,\dots ,K,\tag{3}$$

where ${\mathbf{q}}_{k}=\left({q}_{k}^{x},{q}_{k}^{z}\right)$ denotes the position of the *k*th pinhole. In the ray space, each CPF intersects the DPFs *N* times. Each intersection point denotes one of the light rays visible to the observer, specifically through the corresponding pinhole (refer to Fig. 2(b)).
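Because every DPF and CPF is a line of the form $u = as + b$ in the ray space, each visible pixel is simply the pairwise intersection of two such lines. A minimal sketch under this model (the helper names and numbers are illustrative, not the paper's):

```python
def line_of(px, pz, d):
    """Ray-space line u = a*s + b of the pencil at pinhole (px, pz)."""
    return 1.0 - d / pz, (d / pz) * px

def intersect(a1, b1, a2, b2):
    """s-coordinate where two ray-space lines meet (the visible pixel)."""
    return (b2 - b1) / (a1 - a2)

d = 1.0
# Two display slits (DPFs) very close to the panel, one camera (CPF) far away.
dpfs = [line_of(px, 0.01, d) for px in (-0.2, 0.2)]
cpf = line_of(0.0, 3.0, d)

# One visible pixel per slit: each lies just beyond its slit, as expected
# geometrically for a camera on the optical axis.
visible = [intersect(a, b, *cpf) for a, b in dpfs]
```

Here the CPF crosses the two DPFs twice (*N* = 2), and the *s*-coordinates of the two crossings are exactly the pixels visible through the two slits from that camera.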

## 3. Methods

#### 3.1. Calibration

As we already mentioned, in calibration, the goal is to accurately localize the slits on a (possibly) locally deformed surface. Our approach is based on visual pattern analysis. We display a pattern image *I* on the panel and observe it at multiple positions. For automatic calibration, a camera replaces human eyes in every observation. We assume *K* observations, each denoted by ${O}_{k}$ ($k=1,\dots ,K$) with the corresponding camera position denoted by ${\mathbf{q}}_{k}=\left({q}_{k}^{x},{q}_{k}^{z}\right)$. By the principle of the AS3D display, the observations from different viewpoints are all distinct, providing mutually independent cues on where the slits are. In fact, the information we can directly extract from each observation is the location of the pixels visible at each specific viewpoint, *not* of the slits themselves. However, recall that the light rays emanating from the visible pixels to the observer scatter as many points in the ray space, all aligned along a set of lines (i.e., the DPFs). Given multiple observations, we obtain many scattered points from which we can delineate the DPFs. Note that the DPFs in the ray space are interchangeable with the slits in the world coordinate (see Sec. 2).

The minimum number of observations is ideally two, which is the number of points minimally required to determine a line. But the actual observations are essentially imperfect. They suffer from noise as well as from blurring (e.g., by defocus or by limited resolution), so we actually need more observations (*K* = 5 in this study).

**Input pattern.** Given an input image *I*, if a certain pixel *s* is visible at a certain viewpoint ${\mathbf{q}}_{k}$, we expect that the same pixel value will be observed at the same pixel position in the observation (i.e., $I(s)={O}_{k}(s)$). In real settings, the observation has many imperfections (e.g., noise, blur), so the value may not be exactly the same but should still be quite similar. For occluded pixels, the pixel values in *I* and ${O}_{k}$ happen to be either similar or not, depending on the input pattern *I*. To make our observations informative about the visible pixel locations, we design the input pattern *I* such that *I* and ${O}_{k}$ are maximally different at occluded pixels. Then, roughly speaking, we can detect the visible pixels *s* simply by checking whether $I(s)\approx {O}_{k}(s)$. To fulfill this principle, the input image must have an appropriate fluctuation pattern; a sufficient condition is that the fluctuation period is close to the actual spacing between the visible pixels. In this study, we use the view assignment map based on “global” parameters as the input pattern. Specifically, we estimate the global parameters using a global calibration method [14]. Given the parameters, we compute the view assignment map using van Berkel’s seminal formula [5]. The assigned views are normalized to the range [0, 1]. In the input image, for every pixel, we assign a hue (normalized to the range [0, 1]) exactly matching the assigned view. A pattern of similar concept has been used in the context of camera calibration [31].

**Observation.** To obtain proper observations from raw pictures, we conduct several preprocessing steps, as illustrated in Fig. 4. First, we detect the four corner points of the display panel and project the quadrangle area to the full panel size; we also estimate the camera positions during this projection. Next, we adjust the colors to compensate for the difference between the RGB spectra of the display $({r}_{\mathrm{d}},{g}_{\mathrm{d}},{b}_{\mathrm{d}})$ and of the camera $({r}_{\mathrm{c}},{g}_{\mathrm{c}},{b}_{\mathrm{c}})$. The color correction is generally performed by

$$\left[\begin{array}{c}{r}_{\mathrm{d}}\\ {g}_{\mathrm{d}}\\ {b}_{\mathrm{d}}\end{array}\right]={\left(\mathbf{T}\left[\begin{array}{c}{r}_{\mathrm{c}}^{\gamma }\\ {g}_{\mathrm{c}}^{\gamma }\\ {b}_{\mathrm{c}}^{\gamma }\end{array}\right]\right)}^{1/\gamma },\tag{4}$$

where *γ* denotes the gamma nonlinearity. We compute the linear operator **T** (a 3 × 3 matrix) by least-squares regression.
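Assuming a simple power-law gamma (our simplification), the least-squares fit of **T** can be sketched as follows; `fit_color_matrix` and its signature are hypothetical names, not the paper's code:

```python
import numpy as np

def fit_color_matrix(camera_rgb, display_rgb, gamma=2.2):
    """Least-squares fit of the 3x3 matrix T mapping gamma-linearized
    camera RGB samples to gamma-linearized display RGB samples."""
    c = np.asarray(camera_rgb) ** gamma        # linearize camera samples (N x 3)
    d = np.asarray(display_rgb) ** gamma       # linearize display samples (N x 3)
    T, *_ = np.linalg.lstsq(c, d, rcond=None)  # solves c @ T ~= d
    return T.T                                 # so that d_i ~= T @ c_i per sample

def correct(rgb, T, gamma=2.2):
    """Apply the fitted correction to one observed pixel."""
    lin = np.asarray(rgb) ** gamma
    return np.clip(T @ lin, 0, None) ** (1 / gamma)
```

At least three independent color samples are needed to determine **T**; in practice one would regress over many pixel pairs so that noise averages out.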

**Loss function.** Let $\{{\mathbf{p}}_{n}\}$ be the slit locations, which we ultimately want to find. We denote by $f({\mathbf{p}}_{n},{\mathbf{q}}_{k})$ the position of the pixel visible to the observer ${\mathbf{q}}_{k}$ through the slit ${\mathbf{p}}_{n}$. It is the point where the light ray passing through both ${\mathbf{p}}_{n}$ and ${\mathbf{q}}_{k}$ meets the display panel, given by

$$f({\mathbf{p}}_{n},{\mathbf{q}}_{k})=\frac{{p}_{n}^{x}{q}_{k}^{z}-{p}_{n}^{z}{q}_{k}^{x}}{{q}_{k}^{z}-{p}_{n}^{z}}.\tag{5}$$

Since ${p}_{n}^{z}\ll {q}_{k}^{z}$ for all *n*, *k*, it is simplified to

$$f({\mathbf{p}}_{n},{\mathbf{q}}_{k})\approx {p}_{n}^{x}-{\beta}_{k}{p}_{n}^{z},\tag{6}$$

with ${\beta}_{k}={q}_{k}^{x}/{q}_{k}^{z}$. In matching the pixels in the observations with those in the input pattern, we use the squared difference of the hue (modulo 1). We denote this measure by *D*. Then, we consider the overall matching error between the input pattern and the observations at the candidate visible pixels, as follows:

$$L\left(\{{\mathbf{p}}_{n}\}\right)=\sum _{n}\sum _{k}D\left(I\left(f({\mathbf{p}}_{n},{\mathbf{q}}_{k})\right),{O}_{k}\left(f({\mathbf{p}}_{n},{\mathbf{q}}_{k})\right)\right).\tag{7}$$

At the correct slit locations $\{{\mathbf{p}}_{n}\}$, the loss function *L* tends to become smallest.

**Optimization.** While looking simple, the loss function *L* is highly nonlinear. It is difficult to directly minimize *L* with respect to $\{{\mathbf{p}}_{n}\}$. In this study, we employ the half-quadratic splitting method (see [32,33] for details) by introducing auxiliary variables $\{{s}_{n,k}\}$ and subsequently augmenting the loss function, as

$$\tilde{L}\left(\{{\mathbf{p}}_{n}\},\{{s}_{n,k}\}\right)=\sum _{n}\sum _{k}\left[D\left(I({s}_{n,k}),{O}_{k}({s}_{n,k})\right)+\lambda {\left({s}_{n,k}-f({\mathbf{p}}_{n},{\mathbf{q}}_{k})\right)}^{2}\right],\tag{8}$$

where *λ* is a weight parameter. Note that if *λ* approaches infinity, the minimizer of the augmented loss function also minimizes the original loss function. The optimization of Eq. (8) can be done in an iterative way, first solving for $\{{s}_{n,k}\}$ while fixing $\{{\mathbf{p}}_{n}\}$, then solving for $\{{\mathbf{p}}_{n}\}$ given the newly found $\{{s}_{n,k}\}$.

_{n,k}- s-subproblem: With
**p**fixed, the problem is decoupled into_{n}*K*independent subproblems, that is,$$\underset{{s}_{n,k}}{\text{minimize}}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}D\left(I({s}_{n,k}),{O}_{k}({s}_{n,k})\right)+\lambda {\left({s}_{n,k}-f\left({\mathbf{p}}_{n},{\mathbf{q}}_{k}\right)\right)}^{2},\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}k=1,\dots ,K.$$While the objective function in each subproblem is still nonlinear, it is a one-dimensional function, for which the global optimization is tractable. - p-subproblem: With {
*s*} fixed, the problem is reduced to_{n,k}$$\underset{{\mathbf{p}}_{n}}{\text{minimize}}\sum _{k}{\left({s}_{n,k}-f({\mathbf{p}}_{n},{\mathbf{q}}_{k})\right)}^{2}.$$This subproblem exactly means a least-square regression in delineating DPFs given many scattered points. The closed-form solution is available as$${\mathbf{p}}_{n}=\left[\begin{array}{c}\frac{\left({\sum}_{k}{\beta}_{k}\right)\left({\sum}_{k}{\beta}_{k}{s}_{n,k}\right)-\left({\sum}_{k}{\beta}_{k}^{2}\right)\left({\sum}_{k}{s}_{n,k}\right)}{{\left({\sum}_{k}{\beta}_{k}\right)}^{2}-K{\sum}_{k}{\beta}_{k}^{2}}\\ \frac{K\left({\sum}_{k}{\beta}_{k}{s}_{n,k}\right)-\left({\sum}_{k}{\beta}_{k}\right)\left({\sum}_{k}{s}_{n,k}\right)}{{\left({\sum}_{k}{\beta}_{k}\right)}^{2}-K{\sum}_{k}{\beta}_{k}^{2}}\end{array}\right].$$
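The alternation of the two subproblems can be sketched as follows; the grid search for the s-subproblem and the particular λ schedule are our illustrative choices, and the helper names are not the paper's:

```python
import numpy as np

def hue_dist2(h1, h2):
    """Squared hue difference modulo 1 (the matching measure D)."""
    dh = np.abs(h1 - h2) % 1.0
    return np.minimum(dh, 1.0 - dh) ** 2

def s_subproblem(I_hue, O_hue, lam, f_nk, grid):
    """Globally minimize the 1-D objective D + lam*(s - f)^2 over a pixel grid."""
    cost = hue_dist2(I_hue, O_hue) + lam * (grid - f_nk) ** 2
    return grid[np.argmin(cost)]

def p_subproblem(s_nk, beta):
    """Closed-form least squares for one slit given its matched pixels."""
    K = len(beta)
    Sb, Sb2 = beta.sum(), (beta ** 2).sum()
    Ss, Sbs = s_nk.sum(), (beta * s_nk).sum()
    det = Sb ** 2 - K * Sb2
    px = (Sb * Sbs - Sb2 * Ss) / det
    pz = (K * Sbs - Sb * Ss) / det
    return px, pz

def calibrate_slit(I_hue, O_hues, beta, grid, lam_schedule=(0.0, 0.1, 0.2)):
    """Half-quadratic splitting for one slit: alternate the two subproblems
    while gradually increasing lambda (schedule values are our choice)."""
    px = pz = 0.0
    for lam in lam_schedule:
        f = px - beta * pz  # simplified visible-pixel model, f ~ px - beta*pz
        s_nk = np.array([s_subproblem(I_hue, O_hues[k], lam, f[k], grid)
                         for k in range(len(beta))])
        px, pz = p_subproblem(s_nk, beta)
    return px, pz
```

Note that the first round with λ = 0 reduces `s_subproblem` to pure hue matching, after which the consensus term progressively ties each observation's match to the jointly fitted slit.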

We repeat the above process. However, if *λ* is set to a large number, $\{{s}_{n,k}\}$ in the s-subproblem tends to stay at $f({\mathbf{p}}_{n},{\mathbf{q}}_{k})$, making little movement in each step. Consequently, it may take a long time to converge. As a remedy, we initially set *λ* to a small number (zero in this study) and gradually increase it over the iterations.

The optimization process can be understood somewhat intuitively as follows. Assume that we start with *λ* = 0. In the first round, the s-subproblem seeks the visible pixels simply by selecting *s* satisfying $I(s)={O}_{k}(s)$, separately for $k=1,\dots ,K$. The p-subproblem then delineates the DPFs (or, equivalently, estimates the slit locations) jointly based on all observations. If the observations were perfect, the first round would be sufficient for the exact solution (even with *λ* = 0). In practice, the observations are not perfect, so we go through the next round with an increased value of *λ*. Then, the s-subproblem starts to consider a new term pertaining to the “consensus” from the other observations in determining the visible pixels. Multiple observations, imperfect individually, collaborate with each other in this manner, raising the calibration performance to a precise level.

The robustness to the illumination change is an important issue. Two factors keep our algorithm robust to the illumination condition. First, in matching pixels, we use hue, rather than luminance or RGB values, which should remain constant with brightness change. Second, we calibrate each observed image photometrically (see Eq. (4)). Although the process primarily aims at compensating for the difference in color spectra between the display and the camera, it actually plays a more general role of “normalizing” the observation condition.

#### 3.2. Rendering

Once the calibration is done, the information on the slit locations is utilized for 3D rendering. If the surface of the optical element has local deformation, van Berkel’s view assignment formula no longer works well because it is based on the assumption that the surface is planar and parallel to the display panel. The underlying principle remains intact, however. We first describe, for a 3D display dedicated to a single user with an eye-tracking device, how to correctly and efficiently assign the left and right views to the pixels. The multi-user/multi-view case without eye tracking follows.

In the eye-tracking 3D display, the embedded frontal camera keeps track of the observer’s eye positions in real time. The rendering module is then, at each time instant, required to divide the set of pixels according to their visibility from the observer’s left and right eye positions and to display the corresponding images accordingly. Identifying each set of visible pixels in the ray space is remarkably simple. Remember that the visible pixels are merely the intersection points between the DPFs and the CPFs (see the last paragraph of Sec. 2). We generate two CPFs corresponding to the current eye positions. Then, using the calibrated DPFs, we obtain each set of visible pixels by intersecting the two types of pencil functions. Specifically, the *s*-coordinates of the intersection points give the pixel positions. We assign the left view to the pixels that lie on the CPF of the left eye and the right view to those on the CPF of the right eye.

In the final step, we assign the pixels that lie exactly on neither CPF to the closer of the two views. These pixels would be occluded if the slits were infinitesimally narrow and there were no crosstalk. In reality, neither condition holds, and those pixels become partially visible. In effect, the view assignment to the partially visible pixels improves the spatial resolution of the 3D images. Figure 5(a) illustrates the details of the overall view assignment procedure in the ray space.
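The whole procedure reduces to line intersections and a nearest-line rule in the ray space. A self-contained sketch with our own helper names (and assuming each pixel's covering slit is known):

```python
def ray_line(px, pz, d=1.0):
    """(slope, intercept) of a pencil's line in (s, u) ray space."""
    return 1.0 - d / pz, (d / pz) * px

def assign_views(pixels, slit_of, slits, eyes, d=1.0):
    """Assign each panel pixel to the eye whose CPF passes closest to the
    pixel's ray-space point on its slit's DPF.
    pixels: s-coordinates; slit_of[i]: index of the slit covering pixel i;
    slits, eyes: (x, z) pinhole positions. Returns one view index per pixel."""
    cpfs = [ray_line(x, z, d) for x, z in eyes]
    views = []
    for s, n in zip(pixels, slit_of):
        a, b = ray_line(*slits[n], d)
        u = a * s + b                        # ray-space point of this pixel
        dist = [abs(u - (ac * s + bc)) for ac, bc in cpfs]
        views.append(dist.index(min(dist)))  # nearest CPF wins the pixel
    return views
```

A pixel exactly on a CPF has zero distance to it; all other (partially visible) pixels fall to the nearest CPF, which is the closer-view rule described above. The same function handles the multi-view case by simply passing more than two eye positions.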

Rendering for the multi-user/multi-view case is similar. The only difference is that we assume *more than two* observations at *predefined, fixed* positions. The view assignment for this case is illustrated in Fig. 5(b). In comparison with the former case (i.e., Fig. 5(a)), there are more CPFs in the ray space and the cluster region surrounding each CPF, whose size is interpreted as the effective 3D resolution, becomes smaller. Besides, the CPFs are time-invariant, unlike the former case.

## 4. Experiments

Experiments proceed in two ways: one with optical simulation, the other with real-life displays. While the latter is surely more meaningful in practice, the groundtruth is generally unknown for real-life displays, so we can only rely on the visual aspect of the observed images to evaluate the calibration performance. In contrast, the simulation, while somewhat artificial, provides us with an opportunity to directly examine how the calibration works.

#### 4.1. Synthetic displays

We build the synthetic displays in a simulation environment based on POV-Ray, a ray tracing tool [34]. For the three types of nonuniform deformation shown in Fig. 6, we conduct calibration, following the procedure of Sec. 3.1. The number *K* of the observations is set to 5.

The calibration localizes the optical slits. We treat them as a point cloud and reconstruct the 3D surface on which they lie. The results are shown in Fig. 6. The proposed local calibration (LC) method recovers the surface faithfully, very close to the groundtruth profile. In contrast, GC always attempts to fit the surface with a plane, which results in large error and fails to distinguish the three cases.

In obtaining the above results, we have used two iterations for LC. The weight parameter *λ* starts from zero and increases by 0.1 at each iteration. For completeness, we also examine how the mean-squared error (MSE) changes with the iterations. The results averaged over the three cases are shown in Fig. 7. Generally, the MSE diminishes as the iterations proceed but converges quite fast, becoming almost saturated after two iterations.

#### 4.2. Real-life displays

We also calibrate five real-life instances of lenticular displays. Each display is implemented with a tablet PC that has 2,560 × 1,600 pixels on a 10-inch panel. For observations, we use Point Grey cameras with resolution 4,096 × 2,160. In each case, we take five photos at random positions and run the calibration on a workstation PC with an Intel Xeon CPU and 16 GB memory. The processing takes 5.81 s on average.

After completing the calibration, we perform rendering in two different settings. First, we assume that the 3D displays are dedicated to a single user and conduct two-view rendering. We set up two cameras roughly separated by 65 mm and calibrate the camera positions before rendering. Here, the cameras mimic human eyes; of course, in real situations, human eyes replace the cameras and real-time eye tracking replaces the camera position calibration. We encode the views by color, assigning red to the left and blue to the right. Ideally, the left camera must see only the red pixels and the right camera only the blue pixels. However, if some misalignment exists between the actual optical layer configuration and the one assumed for rendering, some pixels designated for one view may actually be directed to the other camera or vice versa. The effect is much like crosstalk; it is sometimes referred to as *extrinsic* crosstalk, as distinguished from the *intrinsic* crosstalk caused by light leakage [11]. We measure the extrinsic crosstalk using the metric of [4].

Second, we conduct 27-view rendering to accommodate multiple users simultaneously. Figure 9 shows the observation results for four example images. The shown images are photographed nearly at the center position and nearly at the optimal viewing distance. In the no-calibration (NC) cases, the edges (e.g., door frames, chinchilla, steel fences, and checkerboard) appear bent or shaken, with the amount of distortion depending on the depth. With GC, the distortion is alleviated but still noticeable. Little distortion remains with LC. We also measure the visual quality in terms of the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index [35]. Both metrics need a reference image for the quality assessment. To obtain the reference image, we follow the trick proposed in [14]: we first identify the view corresponding to the camera position; we then display that viewpoint image on the entire panel (2D rendering); the camera takes a picture, which provides the artifact-free image for the specific camera position. Table 1 summarizes the results. The gain of LC over GC is substantially large (4 dB in PSNR and 0.045 in SSIM on average).

## 5. Conclusion

While providing a whole new viewing experience, AS3D displays face the criticism that the 3D quality falls far below the 2D quality maximally supported by the same display panel. Certainly, diverse factors are responsible for the quality degradation, and in an entangled manner. In this paper, we have shown a novel method for improving the visual quality by fixing *any* misalignment between the actual optical layer configuration and the one assumed for 3D rendering. This is accomplished by accurate calibration, even in the presence of local deformation in the optical layer.

First, we formulate the calibration as an optimization problem based on the relationship between the input pattern and the observations. Then, we present an algorithm to numerically solve it. We also show how to fix the 3D rendering according to the calibration results. Modeling the display and the observer as pencil functions in the light field, we obtain intuitive and efficient algorithms. Experiments show that the proposed calibration method exhibits quite high accuracy. Particularly with real-life displays, the proposed method has demonstrated a significant improvement in the visual quality of the observed images.

## References and links

**1. **N. A. Dodgson, “Optical devices: 3D without the glasses,” Nature **495**, 316–317 (2013). [CrossRef] [PubMed]

**2. **H. Urey, K. V. Chellappan, E. Erden, and P. Surman, “State of the art in stereoscopic and autostereoscopic displays,” Proc. IEEE **99**, 540–555 (2011). [CrossRef]

**3. **Y. Takaki, “Multi-view 3-D display employing a flat-panel display with slanted pixel arrangement,” J. Soc. Inform. Display **18**, 476–482 (2010). [CrossRef]

**4. **S.-K. Kim, K.-H. Yoon, S. K. Yoon, and H. Ju, “Parallax barrier engineering for image quality improvement in an autostereoscopic 3D display,” Opt. Express **23**, 13230–13244 (2015). [CrossRef] [PubMed]

**5. **C. Van Berkel, “Image preparation for 3D LCD,” Proc. SPIE **3639**, 84–91 (1999). [CrossRef]

**6. **D. Teng, Y. Xiong, L. Liu, and B. Wang, “Multiview three-dimensional display with continuous motion parallax through planar aligned OLED microdisplays,” Opt. Express **23**, 6007–6019 (2015). [CrossRef] [PubMed]

**7. **S. Winkler and D. Min, “Stereo/multiview picture quality: Overview and recent advances,” Signal Process.: Image Commun. **28**, 1358–1373 (2013).

**8. **L. M. Meesters, W. A. IJsselsteijn, and P. J. Seuntiëns, “A survey of perceptual evaluations and requirements of three-dimensional TV,” IEEE Trans. Circuits Syst. Video Technol. **14**, 381–391 (2004). [CrossRef]

**9. **C.-Y. Chu and M.-C. Pan, “Thermal-deformation characterization of the panel of a TFT-LCD TV. part II: Solutions to thermal-induced extrusion degrading image quality,” J. Soc. Inform. Display **18**, 357–367 (2010). [CrossRef]

**10. **X.-F. Li, Q.-H. Wang, D.-H. Li, and A.-H. Wang, “Image processing to eliminate crosstalk between neighboring view images in three-dimensional lenticular display,” J. Display Technol. **7**, 443–447 (2011). [CrossRef]

**11. **M. Zhou, H. Wang, W. Li, S. Jiao, T. Hong, S. Wang, X. Sun, X. Wang, J.-Y. Kim, and D. Nam, “A unified method for crosstalk reduction in multiview displays,” J. Display Technol. **10**, 500–507 (2014). [CrossRef]

**12. **D. Li, D. Zang, X. Qiao, L. Wang, and M. Zhang, “3D synthesis and crosstalk reduction for lenticular autostereoscopic displays,” J. Display Technol. **11**, 939–946 (2015). [CrossRef]

**13. **H. Hwang, J. Park, H. S. Chang, Y. J. Jeong, D. Nam, and I. S. Kweon, “Lenticular lens parameter estimation using single image for crosstalk reduction of three-dimensional multi-view display,” in SID Symposium Digest of Technical Papers (2015), pp. 1417–1420. [CrossRef]

**14. **H. Hwang, H. S. Chang, D. Nam, and I. Kweon, “3D display calibration by visual pattern analysis,” IEEE Trans. Image Process. **26**, 2090–2102 (2017). [CrossRef] [PubMed]

**15. **Y.-G. Lee and J. B. Ra, “Image distortion correction for lenticula misalignment in three-dimensional lenticular displays,” Opt. Eng. **45**, 017007 (2006). [CrossRef]

**16. **Y.-G. Lee and J. B. Ra, “New image multiplexing scheme for compensating lens mismatch and viewing zone shifts in three-dimensional lenticular displays,” Opt. Eng. **48**, 044001 (2009). [CrossRef]

**17. **W. Li, H. Wang, M. Zhou, S. Wang, S. Jiao, X. Mei, T. Hong, H. Lee, and J. Kim, “Principal observation ray calibration for tiled-lens-array integral imaging display,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2013), pp. 1019–1026.

**18. **M. Hirsch, D. Lanman, G. Wetzstein, and R. Raskar, “Construction and calibration of optically efficient LCD-based multi-layer light field displays,” J. Phys. Conf. Ser. **415**, 012071 (2013). [CrossRef]

**19. **G. Wetzstein, D. Lanman, M. Hirsch, and R. Raskar, “Tensor displays: Compressive light field synthesis using multilayer displays with directional backlighting,” in *Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques* (ACM, 2012).

**20. **M. Levoy and P. Hanrahan, “Light field rendering,” in *Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques* (ACM, 1996), pp. 31–42.

**21. **S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen, “The lumigraph,” in *Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques* (ACM, 1996), pp. 43–54.

**22. **S. Wanner and B. Goldluecke, “Variational light field analysis for disparity estimation and super-resolution,” IEEE Trans. Pattern Anal. Machine Intell. **36**, 606–619 (2014). [CrossRef]

**23. **K. Mitra and A. Veeraraghavan, “Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior,” in *IEEE CVPR Workshop on Computational Cameras and Displays* (IEEE, 2012), pp. 22–28.

**24. **H.-S. Kim, K.-M. Jeong, S.-I. Hong, N.-Y. Jo, and J.-H. Park, “Analysis of image distortion based on light ray field by multi-view and horizontal parallax only integral imaging display,” Opt. Express **20**, 23755–23768 (2012). [CrossRef] [PubMed]

**25. **R. Bregović, P. T. Kovács, and A. Gotchev, “Optimization of light field display-camera configuration based on display properties in spectral domain,” Opt. Express **24**, 3067–3088 (2016). [CrossRef]

**26. **Y. J. Jeong, H. S. Chang, D. Nam, and C.-C. J. Kuo, “Direct light field rendering without 2D image generation,” J. Soc. Inform. Display **24**, 686–695 (2016). [CrossRef]

**27. **M. Zwicker, W. Matusik, F. Durand, H. Pfister, and C. Forlines, “Antialiasing for automultiscopic 3D displays,” in *ACM SIGGRAPH 2006 Sketches* (ACM, 2006), p. 107. [CrossRef]

**28. **L. Cremona, *Elements of Projective Geometry* (The Clarendon Press, 1885).

**29. **X. Gu, S. J. Gortler, and M. F. Cohen, “Polyhedral geometry and the two-plane parameterization,” in *Proceedings of the Eurographics Workshop on Rendering Techniques* (Springer, 1997), pp. 1–12.

**30. **G. Chen, L. Hong, K. Ng, P. McGuinness, C. Hofsetz, Y. Liu, and N. Max, “Light field duality: concept and applications,” in *Proceedings of the ACM Symposium on Virtual Reality Software and Technology* (ACM, 2002), pp. 9–16.

**31. **I. Schillebeeckx and R. Pless, “Single image camera calibration with lenticular arrays for augmented reality,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016).

**32. **D. Geman and C. Yang, “Nonlinear image recovery with half-quadratic regularization,” IEEE Trans. Image Process. **4**, 932–946 (1995). [CrossRef] [PubMed]

**33. **L. Xu, C. Lu, Y. Xu, and J. Jia, “Image smoothing via L0 gradient minimization,” ACM Trans. Graph. **30**, 174 (2011). [CrossRef]

**34. **D. K. Buck and A. A. Collins, “POV-Ray – the persistence of vision raytracer,” [Online]. Available: http://www.povray.org/.

**35. **Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process. **13**, 600–612 (2004). [CrossRef] [PubMed]