Recovering the real light field, including the light field intensity distributions and continuous volumetric data in the object space, is an attractive and important topic with the development of light-field imaging. In this paper, a blind light field reconstruction method is proposed to recover the intensity distributions and continuous volumetric data without the assistance of prior geometric information. The light field reconstruction problem is approximated as a summation of localized reconstructions based on image formation analysis. Blind volumetric information derivation is proposed based on backward image formation modeling to exploit the correspondence among the deconvolved results. Finally, the light field is blindly reconstructed via the proposed inverse image formation approximation and wave propagation. We demonstrate that the method can blindly recover the light field intensity with continuous volumetric data. It can be further extended to other light field imaging systems if the backward image formation model can be derived.
© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
Recovering the real light field in the object space is an attractive and important topic with the development of light-field imaging. According to wave-optics models, the light field in the object space consists of the intensity distributions and continuous volumetric data. Reconstructing them using only the spatial and angular information recorded on the sensor is challenging, since the volumetric information is lost during acquisition and the spatial resolution of the acquired data is limited.
The existing light field recovery works mainly use data acquired by plenoptic cameras, since they can record the directions of light rays in a single shot [1–3]. They reconstruct the light field by computationally synthesizing 3D focal stacks across the 3D scene based on ray-optics [3,4]. However, the volumetric information they provide is only a relative depth among the virtual focal planes, and the compromise between lateral and angular resolution causes resolution loss during reconstruction. Zhang et al. reconstructed a 3D object by moving a plenoptic camera around the object and updating the structure-from-motion method [5]. Although a point cloud can be reconstructed, it needs to capture and register multiple light-field images, which is only applicable to static objects. S. Shroff et al. proposed wave-optics models to reconstruct the light field through point-spread-function (PSF) deconvolution [6–8]. This mitigates the resolution loss of the reconstructed light field, but prior information, like the distance of each object or, in an extreme case, the distance of each object point, is needed. We proposed a light field reconstruction model to tackle scenarios in which imaging noise exists in the acquired data [9]. However, the exact distance of the object plane is still needed for reconstruction. C. Guo et al. extended the work to the microscopic scale and reconstructed 3D volumetric information; nevertheless, the geometric information of the scene is needed [10]. M. Broxton et al. used the Richardson-Lucy algorithm to recover the 3D scene [11]. However, their work cannot obtain the exact object distance, which makes it difficult to determine which object at which depth generates the reconstructed intensity. Also, they cannot reconstruct the light field for a specific object.
So, in this paper, a blind light field reconstruction method is proposed to recover the intensity distributions and continuous volumetric data without the assistance of prior geometric information. Plenoptic camera 2.0 [12–14], which inserts a microlens array behind the image plane of the main lens for an improved spatial resolution of the acquired light field, is exploited to benefit from its distinct image response. By analyzing the image responses among the neighboring microlenses, we propose to approximate the light field reconstruction problem as a summation of localized reconstructions. Based on this approximation, the blind light field reconstruction problem reduces to blindly deriving the distance correspondence from the reconstructions generated by the microlens images. Blind volumetric information derivation is proposed based on backward image formation modeling to exploit the correspondence among the deconvolved results. Finally, the light field is blindly reconstructed via the proposed inverse image formation approximation and wave propagation. We demonstrate that the method can blindly recover the light field intensity with continuous volumetric data. It can be further extended to other light field imaging systems if the backward image formation model can be derived.
The paper is organized as follows. The proposed light field reconstruction approximation is described in detail in Section 2. Section 3 describes the proposed blind light field reconstruction method. Section 4 provides experimental results to demonstrate the effectiveness of the proposed method. Section 5 concludes the paper.
2. Light field reconstruction approximation
2.1 Image formation of plenoptic camera 2.0 and light field reconstruction modeling
The optical configuration of plenoptic camera 2.0 is shown in Fig. 1.
As shown in the figure, a microlens array is inserted between the image plane of the main lens and the imaging sensor. Rays coming from the object on the focal plane, the rays in green, pass through the main lens and focus on the image plane. Then, treating the light field on the image plane as a new object, the microlens array reimages it onto the sensor. Thus, dividing the relay imaging system into several sub-imaging-systems, our previous work successfully modeled the image formation process of plenoptic camera 2.0 for a point light source placed at (x0, y0) on depth d1 by wave-optics as [9,15]:
h(x, y, x0, y0) describes the imaging response of a point light source, called the PSF of the imaging system. Treating a real imaging target as a set of point light sources, a pixel on the sensor actually records the summation of the imaging responses from all the light sources. Since only the intensity value is recorded, the intensity of a pixel on the sensor, I(x, y), can be formulated [4–7,16,17] as:
So, to recover the light field intensity in the object space in Eq. (5), the inverse problem of Eq. (5) can be formulated by Tikhonov regularization [18,19], considering the existence of imaging noise and the possibility that the system is not invertible. It is given by:
If the distance of the imaging target is known, which corresponds to knowing its prior geometric information, the light field intensity can be recovered with high accuracy via the above derivation. However, such prior information is generally unknown, which makes blind light field reconstruction challenging.
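As a minimal numerical sketch of such a Tikhonov-regularized inversion (assuming a discretized system matrix `H` and vectorized sensor data `b`; these names and the toy system are illustrative, not the paper's actual operators), the regularized problem can be solved as an augmented least-squares system:

```python
import numpy as np

def tikhonov_solve(H, b, lam):
    """Solve min_x ||H x - b||^2 + lam^2 ||x||^2 by stacking the
    regularizer under the system matrix; lam trades data fidelity
    for stability against noise and ill-conditioning."""
    n = H.shape[1]
    A = np.vstack([H, lam * np.eye(n)])
    rhs = np.concatenate([b, np.zeros(n)])
    x, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return x

# Ill-conditioned toy system: plain inversion would amplify noise,
# while the regularized solution stays stable.
H = np.array([[1.0, 1.0], [1.0, 1.0001]])
b = np.array([2.0, 2.0001])
x = tikhonov_solve(H, b, lam=1e-3)
```

Larger `lam` suppresses noise amplification at the cost of a slightly biased solution, which matches the role of the regularization term above.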
2.2 The proposed light field reconstruction approximation
Considering that the information that can be exploited is limited to the sensor data and the optical configuration of the imaging system, we propose to approximate the light field reconstruction problem by exploiting the optical structure of plenoptic camera 2.0. Referring to the system structure shown in Fig. 1, the image on the sensor can also be treated as the summation of the imaging responses from all the microlenses. So, the image formation process of plenoptic camera 2.0 with M × N microlenses can be reformulated as:
To simplify this, the image formation process is further analyzed using ray-optics to discover the ray contributions on the sensor. For a point light source at (x0, y0) in the object space, as shown in Fig. 1, rays coming from it pass through the main lens and converge at (x1, y1). (x1, y1) and (x0, y0) satisfy:
Then, treating the point at (x1, y1) as a new object, the microlenses within the imaging range reimage it. The rays passing through the edge of the main lens, shown as Ray1 and Ray2 in Fig. 1, determine the imaging range on the microlens array. Using the coordinates in the vertical direction as instances, the microlens coordinates within the imaging range can be derived as follows. For Ray1, the vertical coordinate of its image on the microlens array is ym1, given by Eq. (9). For Ray2 shown in Fig. 1, the vertical coordinate of its image on the microlens array is ym2. It equals:
When L is larger than d2, which corresponds to imaging objects whose rays converge before the microlens array, the focused image y1 is between the microlens array and the main lens, and ym1 is vertically below ym2. The vertical coordinate of a microlens, my, in the imaging range satisfies the following; substituting Eqs. (9) and (10) into it, we have:
When L is smaller than d2, which corresponds to imaging objects whose rays converge behind the microlens array, the focused image y1 is behind the microlens array, and ym1 is vertically above ym2. Similarly, my within the imaging range satisfies:
The above derivation can be performed for the x dimension equally. Combining Eqs. (15) and (16), it is found that for a specific object point (x0, y0), no matter whether it is focused or defocused, only certain microlenses (mx, my), together with the pixels under them, record its information.
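The mapping from an object point (x0, y0) to its image (x1, y1) through the main lens follows the thin-lens relation (the Gaussian lens equation for the image distance and the lateral magnification for the image height). As an illustrative sketch, using the main-lens focal length and an object distance from Section 4 (f1 = 40mm, object at 65mm; the function and variable names are ours, not the paper's):

```python
def thin_lens_image(y0, d1, f1):
    """Image a point at height y0 and object distance d1 through a
    thin lens of focal length f1; returns (image distance, image height)."""
    # Gaussian lens equation: 1/d1 + 1/di = 1/f1
    di = 1.0 / (1.0 / f1 - 1.0 / d1)
    m = -di / d1          # lateral magnification
    return di, m * y0

# With f1 = 40mm and an object 65mm in front of the lens, the image
# forms 104mm behind the lens, inverted and magnified by 1.6x.
di, y1 = thin_lens_image(y0=1.0, d1=65.0, f1=40.0)
```

With L = 122.49mm in Section 4, such an image plane lies in front of the microlens array, consistent with the L > d2 case discussed above.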
Further analyzing the imaging response of (x0, y0), i.e., the pixels on the sensor, the generated convergence point (x2, y2) is given by:
After the rays converge at (x2, y2), if the object point is on the focal plane of the main lens, like the rays in green shown in Fig. 1, the imaging result behind microlens (mx, my) will be a pixel (x, y). If the object point is not on the focal plane, like the red rays shown in Fig. 1, the rays will propagate from (x2, y2) to the sensor, which results in a bright disk area on the sensor. As the image formation properties are similar between the center of the disk and the points around it, we use the center point (x, y) of the disk in the following derivations because of its simplicity in mathematical expression. The center of the disk is given by Eq. (19); substituting Eqs. (9) and (10) into Eq. (19), we have:
Thus, combining the two observations together with the design constraint for plenoptic cameras that the image-side f-number must match the microlens f-number, to prevent microlens image overlap and to maximize the illuminated area behind each microlens, we propose to approximate the image formation process of plenoptic camera 2.0 with M × N microlenses by the summation of the localized responses of the microlenses, as given in Eq. (21), and the corresponding light field reconstruction by inverting Eq. (21) to be:
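The localized approximation above — the sensor image as a sum of per-microlens responses, each formed from only the object region that microlens sees — can be sketched as a toy discretization (the per-microlens PSFs and windows below are hypothetical placeholders, not the paper's exact Eq. (21)):

```python
import numpy as np
from scipy.signal import fftconvolve

def localized_formation(obj, psfs, windows):
    """Sum of localized responses: each microlens images only the
    object region inside its window, blurred by its own local PSF."""
    sensor = np.zeros_like(obj)
    for psf, win in zip(psfs, windows):
        masked = np.zeros_like(obj)
        masked[win] = obj[win]          # part of the object seen by this microlens
        sensor += fftconvolve(masked, psf, mode="same")
    return sensor

# Toy object away from the borders, covered by two disjoint windows.
obj = np.zeros((32, 32))
obj[12:20, 12:20] = 1.0
psf = np.full((3, 3), 1.0 / 9.0)        # normalized local blur
windows = [(slice(0, 16), slice(None)), (slice(16, 32), slice(None))]
sensor = localized_formation(obj, [psf, psf], windows)
```

Because each window restricts the object before convolution, inverting each term independently mirrors the localized reconstructions that the proposed approximation sums.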
3. Blind light field reconstruction
3.1 Backward image formation modeling and blind volumetric information derivation
To blindly derive d1n for a correct reconstruction, the backward image formation process is analyzed to estimate d1n from the spatial correspondence among the reconstructions generated at a series of depths using multiple microlens images.
Substituting Eq. (10) into Eq. (20) and generalizing (x, y) as a pixel under microlens (mx, my), corresponding to a point light source whose intensity is an element of the object light field, we can express the relationship between the pixel position and the source position as a function of d1n. Using the horizontal direction as an instance, we have Eqs. (24) and (25). Meanwhile, the coefficient in Eq. (24) is constant, determined by the fixed optical configuration and d1n. Thus, by substituting Eq. (24) into Eq. (25), the spatial distance between the inverse projection at a candidate depth and that at the real distance d1n is given by:
If the point source also contributes to the pixels under other microlenses, like the red point source in Fig. 1, it can be reconstructed from the pixels of different microlenses. Thus, the distance between the points recovered from the pixels under two different microlenses, using a candidate depth different from d1n, is given by:
Since the recovered positions index the entries of the reconstructed pixel intensities in the two images, we propose to detect whether the two reconstructions are spatially coincident by evaluating the similarity between the corresponding reconstructed intensity images. When the candidate depth differs from d1n, the intensity of a point in one reconstruction differs from the intensity of the collocated point in the other. So, calculating the pixel-wise intensity difference between the two reconstructions, the difference decreases as the candidate depth approaches d1n. It reaches the minimum when the candidate depth exactly equals d1n, which corresponds to the two reconstructions being spatially coincident. Generalizing the process to the horizontal and vertical directions, d1n can be blindly derived by:
So, for the point sources at depth d1n, we segment their images under the microlenses as in Eq. (23), and use a series of candidate depths to reconstruct a series of intensity images by Eq. (29). Through Eq. (28), d1n is derived. Then, the light field can be directly reconstructed by adding the reconstructions of each object under each microlens at d1n together according to Eq. (23), since they have already been reconstructed while deriving d1n. Although, theoretically, the images under all the microlenses are needed to complete the process for each object, the process can be simplified by using a limited number of images, based on the discussion above that the rays from a specific object point contribute to only a limited number of pixels on the sensor. During implementation, we use the two most complete images of the object from two microlenses, which greatly reduces the computational complexity while preserving the reconstruction quality.
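The depth search described above can be sketched as a one-dimensional argmin: reconstruct from each of the two microlens images at every candidate depth and keep the depth that minimizes the Euclidean distance between the reconstructions. The shift-based toy reconstructions below only emulate the disparity behavior of Eq. (27); they stand in for, and are not, the paper's deconvolution:

```python
import numpy as np

def derive_depth(recon_a, recon_b, candidates):
    """Blind depth derivation sketch (Eq. (28)-style): for each candidate
    depth, reconstruct an intensity image from each microlens image and
    keep the depth minimizing their Euclidean distance."""
    def dis(d):
        return np.linalg.norm(recon_a(d) - recon_b(d))
    return min(candidates, key=dis)

# Toy model: the two reconstructions shift apart in proportion to the
# depth error; the true depth is 65 mm (hypothetical 1D signal).
base = np.zeros(64)
base[30:34] = 1.0
ra = lambda d: np.roll(base, int(d - 65))    # view reconstructed from microlens a
rb = lambda d: np.roll(base, -int(d - 65))   # view reconstructed from microlens b
d_hat = derive_depth(ra, rb, range(62, 72))
```

At the true depth the two reconstructions coincide and the distance vanishes, so the argmin recovers 65.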
3.2 Light field repropagation
For the real scenario in which several imaging targets are located at different depths, i.e., different d1n, the above processing can be applied iteratively to recover each of them. Since each reconstruction only contains the light field intensity of the targets located at depth d1n, light field repropagation is required to obtain the additional light field on d1n generated by the light sources (imaging targets) at other depths. Similarly using the light propagation exploited in deriving the imaging response of plenoptic camera 2.0 [9,15], the light field at point (x', y') on d1m that is generated by the light propagated from the light source (x0, y0) on d1n equals:
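A standard discrete counterpart of such free-space wave propagation is the angular spectrum method. The sketch below is a generic implementation, not the paper's specific propagation formula, and assumes a square complex field sampled at pitch `dx` (all names and values are illustrative):

```python
import numpy as np

def angular_spectrum_propagate(u0, wavelength, dx, z):
    """Propagate a complex scalar field u0 over distance z via the
    angular spectrum method: FFT, multiply by the free-space transfer
    function exp(i*kz*z), inverse FFT. Evanescent components are dropped."""
    n = u0.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(u0) * H)

# Example: propagate a Gaussian field by 1 mm at 633 nm (illustrative values).
n, dx = 64, 10e-6
x = (np.arange(n) - n / 2) * dx
X, Y = np.meshgrid(x, x)
u0 = np.exp(-(X**2 + Y**2) / (100e-6) ** 2).astype(complex)
u1 = angular_spectrum_propagate(u0, wavelength=633e-9, dx=dx, z=1e-3)
```

Since the transfer function has unit modulus for propagating components, the method conserves the field energy, which matches the intuition that repropagation redistributes, rather than creates, intensity across depth slices.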
4. Experiments and results
The effectiveness of the proposed blind light field reconstruction method is demonstrated by testing on simulated sensor data. The simulated plenoptic camera 2.0 system consists of a main lens with f1 = 40mm and a 4mm radius, and a 3 × 3 microlens array with f2 = 4mm and a radius of 160 for each microlens. The focal plane of the whole system is set 65mm in front of the main lens. L and l equal 122.49mm and 5.104mm, respectively. Three objects, “P,” “S,” and “F,” are placed at d1n = 65mm, 67mm, and 69mm, respectively, as shown in Fig. 2(a), and the simulated sensor data is shown in Fig. 2(b).
To extract the imaging results of a same object under a microlens from the sensor data, as required by Eq. (23), several image segmentation methods, like graph cut [20], can be exploited to distinguish the objects' responses on the sensor. During the experiments, we use the connected components analysis in [21] to label the 8-connected components in the image and segment out each region as the image of an object. The segmented regions are outlined in red in Fig. 3. Using the segmented images of “P” as instances, the regions outlined in red are magnified on the right of Fig. 3. According to Eq. (23), the segmented images of “P” under microlenses (1,1), (1,2), and (2,1) are treated as the corresponding image responses, respectively.
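The 8-connected component labeling used here is available off the shelf in `scipy.ndimage`; a minimal sketch (with a hypothetical toy sensor image standing in for the simulated data) is:

```python
import numpy as np
from scipy import ndimage

def segment_responses(sensor, threshold=0.0):
    """Segment per-object image responses on the sensor by labeling
    8-connected components; returns one masked image per region."""
    mask = sensor > threshold
    structure = np.ones((3, 3), dtype=int)     # 8-connectivity
    labels, num = ndimage.label(mask, structure=structure)
    return [np.where(labels == k, sensor, 0.0) for k in range(1, num + 1)]

# Toy sensor with two separated blobs (hypothetical data).
sensor = np.zeros((16, 16))
sensor[2:5, 2:5] = 1.0
sensor[10:13, 9:12] = 2.0
regions = segment_responses(sensor)
```

With the default 4-connectivity of `ndimage.label`, diagonally touching responses would be split; passing a 3 × 3 structuring element enforces the 8-connectivity used in the experiments.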
4.1 Blind depth derivation verification
First, the correctness of Eq. (28), which derives the depth by evaluating the similarity between the reconstructed images, is verified by executing Eq. (29) for the pair of segmented images of “P” at a series of candidate depths and comparing the result of Eq. (28) with the real depth information. Candidate depths from 62mm to 71mm with a 1mm interval are used, and the reconstructed intensity images are shown in Fig. 4 from (b) to (k), respectively.
We use the Euclidean distance as the distance function in Eq. (28) because of its simplicity and sufficient accuracy. A smaller value corresponds to a smaller distance and higher similarity. The results for each pair of reconstructed images at each candidate depth are listed in Table 1. It can be found that as the candidate depth increases from 62mm to the real distance 65mm, the value of Dis(.) decreases, which corresponds to the reconstructed images spatially moving closer to each other. The effect is consistent with that shown in Fig. 4 and the derivation in Eq. (27). Inversely, as the candidate depth increases from the real distance 65mm to 71mm, Dis(.) increases, which corresponds to the reconstructed images spatially moving apart from each other. Dis(.) always reaches the minimum at 65mm, the real distance at which “P” is placed, for all the pairs. It indicates that, according to Eq. (28), d1n = 65mm can be obtained no matter which pair of images is input.
Similar processes are performed on the images of “S” and “F.” Since any pair of images under two microlenses can derive the real distance, we only show the results of “S” and “F” using the two most complete images of each object from two microlenses. Reconstructing “S” uses the segmented images under microlenses (2, 2) and (3, 2), and reconstructing “F” uses the segmented images under microlenses (2, 3) and (3, 3). The reconstructed intensity images are shown in Fig. 5 from (b) to (k). To make the spatial disparity of the reconstructed results clear, we use red to represent one image response and its reconstruction, and green to highlight the other image response and its reconstruction. The similarities between the reconstructions measured by Dis(.) are listed in Table 2. From Fig. 5, it can be found that the spatial disparity between the reconstructions reaches the minimum at 67mm, the real depth of “S,” for object “S,” and at 69mm, the real depth of “F,” for object “F.” Combining the disparity calculation results in Tables 1 and 2, it demonstrates that the proposed blind volumetric information derivation method is effective and accurate.
4.2 Blind depth derivation verification for noisy imaging results
To further verify that the proposed depth derivation method, i.e., Eq. (28), also works for noisy imaging results, Gaussian noise is added to the imaging result in Fig. 2(b). The noisy imaging result, shown in Fig. 6, whose peak signal-to-noise ratio (PSNR) is only 25dB, presents strong noise distortion relative to the noise-free result in Fig. 2(b).
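Adding Gaussian noise at a target PSNR amounts to choosing the noise standard deviation from the PSNR definition, PSNR = 10·log10(peak²/MSE), since the MSE of zero-mean Gaussian noise is approximately its variance. A sketch (the helper names and the constant stand-in image are ours, not the paper's):

```python
import numpy as np

def add_noise_at_psnr(img, psnr_db, rng=None):
    """Add zero-mean Gaussian noise so the noisy image has approximately
    the requested PSNR; sigma follows from PSNR = 10*log10(peak^2 / MSE)."""
    if rng is None:
        rng = np.random.default_rng(0)
    peak = img.max()
    sigma = peak / (10 ** (psnr_db / 20.0))
    return img + rng.normal(0.0, sigma, img.shape)

def psnr(ref, noisy):
    """Peak signal-to-noise ratio in dB, relative to the peak of ref."""
    mse = np.mean((ref - noisy) ** 2)
    return 10 * np.log10(ref.max() ** 2 / mse)

img = np.ones((256, 256))               # stand-in for the clean sensor data
noisy = add_noise_at_psnr(img, psnr_db=25.0)
```

For a large image the sample MSE concentrates around sigma², so the realized PSNR lands very close to the 25dB target used in this experiment.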
Same as in processing the noise-free sensor data, we use the segmented images under microlenses (1, 1) and (1, 2) for object “P,” the segmented images under microlenses (2, 2) and (3, 2) for object “S,” and the segmented images under microlenses (2, 3) and (3, 3) for object “F.” Treating them as the image responses, the reconstructed intensity images, as the candidate depth varies from 62mm to 71mm with a 1mm interval, are shown in Fig. 7 from (b) to (k), respectively.
It can be found that the spatial disparities between the reconstructed object points are similar to the noise-free case. The two reconstructions are spatially coincident at 65mm, the real depth of “P,” for object “P.” Also, the spatial disparity reaches the minimum at 67mm and 69mm, the real depths of “S” and “F,” for objects “S” and “F,” respectively. Still using the Euclidean distance as the distance function in Eq. (28), the spatial similarities measured by Dis(.) are listed in Table 3. Although the strong noise decreases the differences in spatial similarity, the reconstruction model proposed in Eq. (29) weakens the noise influence by smoothness regularization. Thus, we can still obtain the correct object distances from Table 3, which demonstrates the robustness of the proposed method.
4.3 Light field reconstruction results
Using the distances blindly derived above, the reconstructed discrete intensity information at the specific distances and the recovered volumetric information are shown in Fig. 8.
Comparing Fig. 8 with the original object information in Fig. 2(a), our reconstructed volumetric information in Fig. 8 embodies the depth and the actual size of the objects, which gives real space information. Further applying the light field repropagation in Eq. (30), the light field intensity at depths 65mm, 67mm, and 69mm is generated using Eq. (31) and shown in Fig. 9. As shown in the figure, the light field propagates in all directions, so that on each light field slice we can observe some light field intensity information generated by the light sources of objects “P,” “S,” and “F.” The effect is consistent with the theoretical understanding of the light field.
4.4 Light field reconstruction for bigger object with more microlenses
To further verify the universality of the proposed method, reconstruction results are provided for a much bigger imaging target using a plenoptic camera 2.0 with a 7 × 7 microlens array.
The system parameters are consistent with those in the above experiments. The imaging target “A,” shown in Fig. 10(a), is placed at 66mm. Its physical size is much larger than that of “P,” “S,” or “F” used before and cannot be fully imaged by a single microlens. Thus, as shown in the simulated sensor data in Fig. 10(b), the image response under each microlens is only a part of “A.”
Since the image responses under different microlenses correspond to different parts of the object, as shown in Fig. 10(b), we use three pairs of image responses to recover the light field for the whole object. As shown in Fig. 11, the first pair uses the image responses under microlenses (2, 2) and (2, 3); the second pair uses those under microlenses (5, 2) and (5, 3); and the third pair uses those under microlenses (4, 4) and (4, 5).
Using the distances derived by Eq. (28), the information reconstructed from the first, second, and third pairs of image responses is shown in Fig. 12(a)-(c), respectively. Since all the derived object distances are 66mm, the recovered light field intensity at distance 66mm is generated by Eq. (23), i.e., adding the three reconstructed light fields together. The recovered light field intensity is shown in Fig. 12(d), which shows the information of the object “A.” Comparing it with the original imaging target in Fig. 10(a), the recovered “A” has exactly the same physical size. The completeness of the recovered “A” can be further improved by reconstructing the image responses under more microlenses. This demonstrates that the proposed approach also works for plenoptic camera 2.0 with more microlenses and bigger imaging targets.
In this paper, we proposed a blind light field reconstruction method based on inverse image formation approximation and blind volumetric information derivation. The inverse image formation is approximated as a summation of localized reconstructions based on image formation analysis. Blind volumetric information derivation is proposed based on backward image formation modeling to exploit the correspondence among the deconvolved results. The light field is blindly reconstructed via the proposed inverse image formation approximation and wave propagation. Experimental results demonstrated the correctness and effectiveness of the proposed method in blindly recovering the light field intensity with continuous volumetric data. Since changes of the internal parameters do not affect the mathematical formalism of the derivations and the image formation analysis provided in the paper, the proposed method generalizes to different optical parameters of plenoptic camera 2.0.
To further optimize the proposed algorithm, we are investigating more automatic segmentation methods to extract the imaging results even when depth-dependent imaging distortion exists. Also, recovering real objects with heterogeneous optical configurations is being modeled.
National Natural Science Foundation of China (NSFC) (61771275); Shenzhen Project, China (JCYJ20170817162658573).
References and links
1. E. H. Adelson and J. Y. A. Wang, “Single lens stereo with a plenoptic camera,” IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 99–106 (1992). [CrossRef]
2. R. Ng, M. Levoy, M. Bredif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field photography with a hand-held plenoptic camera,” Technical Report, Stanford University (2005).
3. R. Ng, “Digital light field photography,” Ph.D. thesis, Stanford University (2006).
4. N. Antipa, S. Necula, R. Ng, and L. Waller, “Single-shot diffuser-encoded light field imaging,” in 2016 IEEE International Conference on Computational Photography (ICCP), Evanston, IL, pp. 1–11 (2016).
5. Y. Zhang, Z. Li, W. Yang, P. Yu, H. Lin, and J. Yu, “The light field 3D scanner,” in 2017 IEEE International Conference on Computational Photography (ICCP), Stanford, CA, pp. 1–9 (2017).
6. S. Shroff and K. Berkner, “High resolution image reconstruction for plenoptic imaging systems using system response,” in Imaging and Applied Optics Technical Papers, OSA Technical Digest (online) (Optical Society of America, 2012), paper CM2B.2. [CrossRef]
7. S. Shroff and K. Berkner, “Plenoptic system response and image formation,” in Imaging and Applied Optics, OSA Technical Digest (online) (Optical Society of America, 2013), paper JW3B.1.
8. S. Shroff and K. Berkner, “Wave analysis of a plenoptic system and its applications,” Proc. SPIE 8667, 86671L (2013). [CrossRef]
9. L. Liu, X. Jin, and Q. Dai, “Image formation analysis and light field information reconstruction for plenoptic Camera 2.0,” in Pacific-Rim Conference on Multimedia (PCM 2017), Harbin, China, Sept. 28–29, 2017. [CrossRef]
10. C. Guo, H. Li, I. Muniraj, B. Schroeder, J. Sheridan, and S. Jia, “Volumetric light-field encryption at the microscopic scale,” in Frontiers in Optics 2017, OSA Technical Digest (online) (Optical Society of America, 2017), paper JTu2A.94.
11. M. Broxton, L. Grosenick, S. Yang, N. Cohen, A. Andalman, K. Deisseroth, and M. Levoy, “Wave optics theory and 3-D deconvolution for the light field microscope,” Opt. Express 21(21), 25418–25439 (2013). [CrossRef] [PubMed]
13. A. Lumsdaine and T. Georgiev, “The focused plenoptic camera,” in Proceedings of IEEE International Conference on Computational Photography (ICCP, 2009), pp. 1–8.
14. T. Georgiev and A. Lumsdaine, “Focused plenoptic camera and rendering,” J. Electron. Imaging 19(2), 1–28 (2010).
15. X. Jin, L. Liu, Y. Chen, and Q. Dai, “Point spread function and depth-invariant focal sweep point spread function for plenoptic camera 2.0,” Opt. Express 25(9), 9947–9962 (2017). [CrossRef] [PubMed]
16. T. Georgiev and A. Lumsdaine, “Superresolution with plenoptic 2.0 cameras,” in Frontiers in Optics 2009/Laser Science XXV/Fall 2009, OSA Technical Digest (CD) (Optical Society of America, 2009), paper STuA6.
18. C. C. Paige and M. A. Saunders, “LSQR: an algorithm for sparse linear equations and sparse least squares,” ACM Trans. Math. Softw. 8(1), 43–71 (1982). [CrossRef]
19. D. C. L. Fong and M. Saunders, “LSMR: an iterative algorithm for sparse least-squares problems,” SIAM J. Sci. Comput. 33(5), 2950–2971 (2011). [CrossRef]
20. Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001). [CrossRef]
21. R. M. Haralick and L. G. Shapiro, Computer and Robot Vision (Addison-Wesley Longman Publishing Co., 1992), pp. 28–48, vol. I.