Diffraction model-informed neural network for unsupervised layer-based computer-generated holography

Abstract

Learning-based computer-generated holography (CGH) has shown remarkable promise for enabling real-time holographic displays. Supervised CGH requires building a large-scale dataset of target images and corresponding holograms. We propose a diffraction model-informed neural network framework (self-holo) for 3D phase-only hologram generation. Because angular spectrum propagation is incorporated into the neural network, the self-holo can be trained in an unsupervised manner without the need for a labeled dataset. Exploiting the various representations of a 3D object and randomly reconstructing the hologram at one layer of the object keeps the complexity of the self-holo independent of the number of depth layers. The self-holo takes amplitude and depth map images as input and synthesizes a 3D or 2D hologram. Numerical and optical experiments demonstrate 3D reconstructions with a good 3D effect and the generalizability of the self-holo.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Holographic displays can reproduce 3D scenes and provide all 3D visual information [1], promising unprecedented capabilities for virtual and augmented reality (VR/AR) near-eye displays [2]. Because holographic near-eye displays require a relatively small space-bandwidth product, real-time holography is easier to implement [3].

In CGH, limited by the modulation mode of the spatial light modulator (SLM), the complex-amplitude information must be encoded into a phase-only or amplitude-only pattern. Phase-only holograms (POHs) are preferred in CGH owing to the higher efficiency of phase modulation and the absence of conjugate-image interference in reconstruction.

Calculating POHs is an ill-posed problem because the solution is not unique [4]. One of the biggest challenges for CGH is the fundamental tradeoff between algorithm runtime and image quality. Common iterative algorithms include the Gerchberg-Saxton (GS) algorithm [5,6] and non-convex optimization algorithms [7,8]. The 3D GS algorithm constrains the amplitude on the multi-depth object planes and the hologram plane according to a set iteration order [9,10]. Stochastic gradient descent with a complex loss function (complex SGD) replaces the amplitude-only loss with a complex loss in the SGD optimization process [11,12]. Iterative methods can generate high-quality holograms but are time-consuming. Several non-iterative algorithms have been proposed for fast hologram generation, such as the error diffusion method [13] and the double-phase method (DPM) [14,15]. However, these methods encode complex amplitude modulation at the cost of spatial resolution and are prone to artifacts in the reconstruction.

Recently, learning-based CGH has proven versatile and efficient for high-speed hologram generation and high-quality hologram optimization [16,17]. For supervised CGH, a large-scale dataset of target images and corresponding holograms must be prepared. Some methods directly used POHs as labeled datasets. Horisaki et al. built a dataset from the inverse propagation of speckle pairs, but the quality of the results was limited [18]. Zheng et al. used a multi-plane iterative angular spectrum algorithm to generate the dataset, which requires a high computational cost [19]. Other approaches adopted forward propagation to yield complex holograms as ground truth; the predicted complex hologram is then encoded into a POH. Lee et al. built an equivalent complex spatial modulation system to display complex holograms [20]. Shi et al. and Chang et al. used the double-phase method to encode complex holograms into phase-only holograms [21,22]. For supervised CGH, the quality of the predicted hologram is intrinsically bounded by the quality of the dataset. Shi et al. proposed a two-stage supervised + unsupervised training protocol for the direct synthesis of high-quality 3D phase-only holograms [23].

In contrast, because physical diffraction propagation is incorporated into the neural network, unsupervised CGH does not require labeled datasets and can directly predict POHs. Peng et al. developed a neural network architecture, HoloNet, that comprises target-phase-generator and phase-encoder subnetworks and incorporates a calibrated wave propagation model [12]. Wu et al. proposed an autoencoder-based neural network, HoloEncoder, implemented with a UNet architecture to directly generate POHs, using a phase regularizer to constrain phase variations [4]. Yu et al. proposed a phase dual-resolution network, PDRNet, that replaces the UNet with a dual-resolution network and introduces a multi-scale structural similarity (MS-SSIM) loss function [24]. Liu et al. proposed a 4K diffraction model-driven network (4K-DMDNet) that strengthens the constraint on the reconstructed images in the frequency domain [25]. These methods train without supervision by specifying only the desired amplitude at one or multiple depth planes and rely on the convolutional neural network (CNN) itself to discover the optimal SLM phase pattern [23,26].

In this paper, we present a diffraction model-informed neural network framework with indirect phase inference to synthesize 3D phase-only holograms. We take advantage of the various representations of a 3D object. In the encoding part, one CNN converts the RGB-D images into the complex amplitude of the target field, and the other CNN converts the complex amplitude of the SLM field into a phase-only hologram. In the decoding part, the 3D object is represented by the layer-based method, and the hologram is randomly propagated to one layer of the object. The self-holo thus extends naturally and efficiently to the 3D setting, and the complexity of the network is independent of the number of depth layers of the object. With different depth maps, the self-holo can generate 3D holograms or 2D holograms.

In Section 2, we introduce the mathematical model of unsupervised layer-based CGH and the pipeline of the self-holo. In Sections 3 and 4, numerical and optical reconstruction experiments are carried out to validate the generalization ability of the self-holo.

2. Method

2.1 Mathematical model of unsupervised layer-based CGH

In a holographic display, a coherent beam is incident on a phase-only SLM. The phase of the source field ${u_{\textrm{src}}}$ is delayed by $\phi(x,y)$, and the field then propagates in free space to the target plane. The angular spectrum method (ASM) has the desirable property that the sampling window and interval of the target field are the same as those of the source field, which makes it appropriate for simulating propagation to multiple planes. The ASM is expressed as [27]:

$$f_{\textrm{ASM}}(\phi(x,y),z) = \textrm{IFFT}\{\textrm{FFT}\{e^{i\phi(x,y)}u_{\textrm{src}}(x,y)\} \cdot H(f_x,f_y,z)\}$$
$$H(f_x,f_y,z) = \begin{cases} e^{i\frac{2\pi}{\lambda}z\sqrt{1-(\lambda f_x)^2-(\lambda f_y)^2}}, & \textrm{if } \sqrt{f_x^2+f_y^2} < \frac{1}{\lambda}\\ 0, & \textrm{otherwise} \end{cases}$$
where $f_{\textrm{ASM}}(\cdot)$ is the propagation operator, $\phi(x,y)$ represents the phase-only hologram, $H(\cdot)$ is the transfer function, $\lambda$ is the wavelength, $z$ is the distance between the SLM plane and the target plane, $f_x, f_y$ are spatial frequencies, and FFT and IFFT denote the fast Fourier transform and its inverse, respectively.
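
Because the ASM consists only of FFTs and an element-wise transfer function, it maps directly onto automatic-differentiation frameworks. The following PyTorch sketch illustrates Eqs. (1) and (2); the function name, argument conventions, and unit source amplitude are illustrative assumptions on our part rather than the released implementation.

```python
import math
import torch

def asm_propagate(field, z, wavelength, pitch):
    """Angular spectrum propagation of a complex field over distance z (Eqs. (1)-(2)).
    For a phase-only hologram with unit source amplitude, pass field = exp(1j * phi)."""
    ny, nx = field.shape[-2:]
    fx = torch.fft.fftfreq(nx, d=pitch)            # spatial frequencies along x
    fy = torch.fft.fftfreq(ny, d=pitch)            # spatial frequencies along y
    fyy, fxx = torch.meshgrid(fy, fx, indexing="ij")
    # Transfer function H(fx, fy, z); evanescent components (arg < 0) are set to zero
    arg = 1.0 - (wavelength * fxx) ** 2 - (wavelength * fyy) ** 2
    band = (arg > 0).to(torch.complex64)
    H = torch.exp(1j * (2 * math.pi / wavelength) * z * torch.sqrt(arg.clamp(min=0.0)))
    return torch.fft.ifft2(torch.fft.fft2(field) * H * band)
```

Every operation in this sketch is differentiable, so gradients can flow from a reconstructed amplitude back to the hologram phase, which is what allows the propagation model to be embedded in the training loss.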

Deep learning uses the backpropagation algorithm to indicate how a machine should change its internal parameters. Because the ASM is differentiable and therefore compatible with backpropagation, angular spectrum propagation can be incorporated into neural networks. For unsupervised CGH that generates layer-based holograms, the interval distance of the 3D object and the propagation distance must be properly integrated into the neural network. As shown in Fig. 1(a), we introduce the complex amplitude of the target field to represent layer-based objects. The target field is then propagated to the SLM field via the ASM. As shown in Fig. 1(b), the phase-only hologram randomly reconstructs one layer of the 3D object.

Fig. 1. (a) Position relationship among target field, SLM field, and layer-based object. (b) Position relationship between POH and reconstructed layer-based object. ${z_0}$ is the benchmark distance between the target field plane and the SLM field plane, and $\Delta z$ is the interval distance of the layer-based object. The target field plane and $a_{\textrm{target}}^{\{ j\} }$ plane are in the same position. The SLM field plane and phase-only hologram plane are in the same position.

The task for any CGH algorithm is to determine the best SLM phase pattern ϕ. Unsupervised layer-based CGH for a single or a set of multiple target image amplitudes $a_{\textrm{target}}^{\{j \}}$ located at the set of distances ${z_j}$ (j = 1, …, J) from the SLM can be formulated as:

$$\phi_{\textrm{H}} = f_{\textrm{CNN2}}(f_{\textrm{ASM}}(f_{\textrm{CNN1}}(\textrm{input}),z_0))$$
$$\mathop{\textrm{minimize}}\limits_{\phi_{\textrm{H}}}\; L(s \cdot |f_{\textrm{ASM}}(e^{i\phi_{\textrm{H}}},z_j)|,\, a_{\textrm{target}}^{\{j\}})$$
where $s$ is a fixed or variable scale factor that accounts for output values lying in a different range from the target [12], $\phi_{\textrm{H}}$ denotes the phase-only hologram, $z_0$ is the benchmark distance between the target field plane and the SLM field plane, $z_j$ is the distance between the phase-only hologram plane and the $j$th object plane, $a_{\textrm{target}}^{\{j\}}$ is the amplitude of the $j$th plane, $f_{\textrm{CNN1}}$ and $f_{\textrm{CNN2}}$ are convolutional neural networks, and $L$ is the loss function.

When the network training is completed, the phase-only hologram inference process can be represented by Eq. (3).
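
For concreteness, a minimal PyTorch sketch of Eqs. (3) and (4) is given below, reusing the asm_propagate sketch from above. The two subnetworks are passed in as callables; the tensor shapes, the sign convention for propagating the target field back to the SLM plane, and the default wavelength and pixel pitch are our assumptions, not the authors' released code.

```python
import torch

def self_holo_forward(cnn1, cnn2, amp, depth, z0, zj, wavelength=532e-9, pitch=8e-6):
    """Eq. (3) followed by reconstruction of the j-th layer used in Eq. (4).
    amp and depth are (B, 1, H, W) tensors; cnn1 returns (amplitude, phase) maps."""
    x = torch.cat([amp, depth], dim=1)                       # 2-channel input
    t_amp, t_phase = cnn1(x)                                 # target-field amplitude and phase
    target_field = t_amp * torch.exp(1j * t_phase)
    # Propagate the target field back to the SLM plane (distance -z0 in our convention)
    slm_field = asm_propagate(target_field, -z0, wavelength, pitch)
    slm_in = torch.cat([slm_field.abs(), slm_field.angle()], dim=1)
    phi_h = cnn2(slm_in)                                     # phase-only hologram, Eq. (3)
    # Reconstruct the j-th layer for the loss of Eq. (4)
    recon_amp = asm_propagate(torch.exp(1j * phi_h), zj, wavelength, pitch).abs()
    return phi_h, recon_amp, slm_field.abs()
```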

2.2 Dataset of self-holo

In computer graphics, 3D objects are commonly represented as RGB-D data. The RGB-D data are taken from MIT-CGH-4K, which consists of 4000 pairs of RGB-D images and corresponding 3D holograms [21]. In CGH, quantizing the depth map converts RGB-D images into layer-based objects. In this paper, we divide the target object into 3 planes: the depth map is uniformly quantized into 3 binary masks, and the RGB image is converted into a set of masked images by these binary masks. The set of binary masks is formulated as:

$$\textrm{mask}^{\{j\}}(x,y) = \begin{cases} 1, & D(x,y) \in \textrm{interval}^{\{j\}}\\ 0, & \textrm{otherwise} \end{cases}$$
where $D(x,y)$ is the pixel value of the depth map and $\textrm{interval}^{\{j\}}$ is the $j$th quantization interval.
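
A sketch of the uniform quantization of Eq. (5) is shown below; the assumption that the depth map is normalized to [0, 1] and divided into equal intervals is ours.

```python
import torch

def depth_masks(depth, num_layers=3):
    """Quantize a depth map in [0, 1] into binary layer masks (Eq. (5))."""
    edges = torch.linspace(0.0, 1.0, num_layers + 1)
    masks = []
    for j in range(num_layers):
        # include the upper edge only for the last interval
        upper = depth <= edges[j + 1] if j == num_layers - 1 else depth < edges[j + 1]
        masks.append(((depth >= edges[j]) & upper).float())
    return torch.stack(masks)      # shape: (num_layers, *depth.shape)

# The masked images are then simply amp * masks[j] for each layer j.
```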

2.3 Pipeline of self-holo

The pipeline of self-holo is illustrated in Fig. 2. According to the wavelength, one channel of the RGB image is converted into amplitude, which is then concatenated with the depth map along the channel dimension as input. At each iteration, the reconstruction distance is chosen randomly and the corresponding masked image is produced. Given the input, the target complex-amplitude generator subnetwork predicts the amplitude and phase distributions of the target field. The target field is then propagated to the SLM field. The SLM field is decomposed into amplitude and phase, which are also concatenated along the channel dimension. The phase encoder subnetwork converts this concatenated field into a phase-only representation. During training, the phase pattern is propagated to one layer of the object by the ASM and compared with the masked image; the loss is calculated over the masked portion of the image. With backpropagation, the learnable parameters of the target complex-amplitude generator and phase encoder subnetworks are updated to minimize the training loss. During inference, the phase pattern is displayed on the SLM to reconstruct the final image.
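
The random choice of one reconstruction layer per iteration can be sketched as follows; the convention $z_j = z_0 + j\,\Delta z$ and the helper name are illustrative assumptions.

```python
import random

def sample_layer(amp, masks, z0=0.30, dz=0.01):
    """Pick one layer of the 3D object at random for this training iteration (Fig. 2(d))."""
    j = random.randrange(masks.shape[0])
    zj = z0 + j * dz                        # reconstruction distance of layer j
    return amp * masks[j], masks[j], zj     # masked target image, its mask, and z_j
```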

Fig. 2. The pipeline of self-holo. (a) and (b) The encoding part. (c) The decoding part. (d) Mask and masked image production. TCAG: target complex-amplitude generator, PE: phase encoder. To deal with the backpropagation of complex numbers in neural networks, the complex-valued wavefield is decomposed into amplitude and phase.

2.4 Neural network of self-holo

2.4.1 Network architecture

The target complex-amplitude generator and phase encoder subnetworks are both implemented with a similar UNet architecture [4,28]. Each network has four downsampling stages and four corresponding upsampling stages. Each downsampling stage is composed of one downsampling block and one residual block, and each upsampling stage is composed of one upsampling block and one residual block. The input layer produces 16 feature channels; each subsequent downsampling block doubles the number of feature channels, except the last, which uses 96. Skip connections pass the learned information to the outputs of the upsampling stages. The residual block is the ResNetv2 unit [29]. Figure 3 shows the detailed schematic of the downsampling and upsampling blocks. In the downsampling blocks, we use convolution layers instead of pooling layers for downsampling and hybrid dilated convolution layers to enlarge the receptive field [30]. In the upsampling blocks, we use transposed convolution layers so that the upsampling function is learned jointly with the rest of the network. The phase output layer of the target complex-amplitude generator and the hologram output layer of the phase encoder are tanh functions that limit the phase value to $[-\pi,\pi]$. The target complex-amplitude generator has 2 input channels and 2 separate output branches that predict the amplitude and phase, respectively. The phase encoder has 2 input channels and 1 output channel to synthesize the phase-only hologram.
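
Figure 3 can be read as the following PyTorch modules; the dilation rates, kernel sizes, and layer ordering are our interpretation of the schematic and not necessarily the exact released architecture.

```python
import torch.nn as nn

class DownBlock(nn.Module):
    """Sketch of the downsampling block in Fig. 3(a): a stride-2 convolution for
    downsampling followed by hybrid dilated convolutions (dilation rates assumed)."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 5)):
        super().__init__()
        layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
                  nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
        for d in dilations:
            layers += [nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=d, dilation=d),
                       nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

class UpBlock(nn.Module):
    """Sketch of the upsampling block in Fig. 3(b): a learned transposed convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.block(x)
```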

Fig. 3. Detailed schematic of the downsampling block and upsampling block. (a) Downsampling block. (b) Upsampling block. BN: batch normalization, ReLU: rectified linear unit.

2.4.2 Loss function

We use a combination of perceptual loss and mean square error (MSE) loss as the loss function [12,31]. The MSE loss captures the pixel-wise difference between the output and ground-truth images, while the perceptual loss compares high-level image features extracted by a pre-trained convolutional neural network. If only the MSE loss is used, the reconstructed images exhibit a gridding artifact; combining perceptual loss with MSE loss improves the reconstructed image quality [24]. In addition, we constrain the amplitude of the SLM field, which benefits generalization to different wavelengths. The mixed loss function is expressed as:

$$L = \left\| s \cdot \hat{a}_{\textrm{target}}^{\{j\}} - a_{\textrm{target}}^{\{j\}} \right\|_2^2 + \tau \left\| P(s \cdot \hat{a}_{\textrm{target}}^{\{j\}}) - P(a_{\textrm{target}}^{\{j\}}) \right\|_2^2 + \gamma \left\| \overline{H} - H \right\|_2^2$$
where j represents the jth layer of a 3D object, $a_{\textrm{target}}^{\{ j\} }$ denotes the target amplitude of the jth layer, $\hat{a}_{\textrm{target}}^{\{j \}}$ denotes the reconstructed amplitude of the jth layer, and $P({\cdot} )$ represents a transform to a perceptual feature space. $\overline H $ is the uniform amplitude of the SLM field. H is the amplitude of the SLM field. The scale factor s is set to 0.95. $\tau = 0.025$ is the relative weight on perceptual loss. $\gamma = 0.1$ is the relative weight on the amplitude of the SLM field.
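
A sketch of the mixed loss of Eq. (6) is given below. Using VGG16 features for the perceptual transform $P(\cdot)$ and interpreting $\overline{H}$ as the mean (uniform) SLM amplitude are both assumptions on our part.

```python
import torch
import torch.nn.functional as F
import torchvision

class MixedLoss(torch.nn.Module):
    """Sketch of Eq. (6): MSE + perceptual loss + SLM-amplitude uniformity penalty."""
    def __init__(self, tau=0.025, gamma=0.1):
        super().__init__()
        self.tau, self.gamma = tau, gamma
        vgg = torchvision.models.vgg16(pretrained=True).features[:16].eval()
        for p in vgg.parameters():
            p.requires_grad_(False)
        self.vgg = vgg                                   # feature extractor for P(.)

    def _features(self, x):
        # Replicate the single amplitude channel to 3 channels for VGG
        # (ImageNet normalization omitted for brevity)
        return self.vgg(x.repeat(1, 3, 1, 1))

    def forward(self, recon_amp, target_amp, slm_amp, s=0.95):
        mse = F.mse_loss(s * recon_amp, target_amp)
        perceptual = F.mse_loss(self._features(s * recon_amp), self._features(target_amp))
        # Penalize deviation of the SLM amplitude from a uniform (mean) value
        uniformity = ((slm_amp - slm_amp.mean()) ** 2).mean()
        return mse + self.tau * perceptual + self.gamma * uniformity
```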

3. Experiment

The wavelengths of the red, green, and blue light sources are 670 nm, 532 nm, and 473 nm, respectively. The benchmark distance $z_0$ is set to 0.3 m, the interval distance of the 3D object is 0.01 m, and the pixel pitch of the hologram is 8 µm. The training dataset contains 1000 pairs of RGB-D images and the validation dataset contains 100 pairs; the RGB-D images are preprocessed to 1072 × 1072 pixels. The learning rate is $4 \times 10^{-4}$. The self-holo is implemented with Python 3.8 and PyTorch 1.12 and trained for 30 epochs with the Adam optimizer, after which the loss reaches a stable state. The GPU used in the experiments is an NVIDIA RTX 3090 (24 GB) with CUDA 11.2.
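
Putting the previous sketches together, a training loop with the reported settings could look like the following; TCAG, PhaseEncoder, and train_loader are hypothetical placeholders for the two subnetworks and the RGB-D data loader.

```python
import torch

tcag, pe = TCAG(), PhaseEncoder()          # hypothetical constructors for the two subnetworks
criterion = MixedLoss()
optimizer = torch.optim.Adam(list(tcag.parameters()) + list(pe.parameters()), lr=4e-4)

for epoch in range(30):                    # 30 epochs, as reported in Section 3
    for amp, depth in train_loader:        # 1000 RGB-D pairs, 1072 x 1072 after preprocessing
        target_j, mask_j, zj = sample_layer(amp, depth_masks(depth))
        phi_h, recon, slm_amp = self_holo_forward(tcag, pe, amp, depth, z0=0.30, zj=zj)
        loss = criterion(recon * mask_j, target_j, slm_amp)   # loss on the masked portion only
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```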

3.1 2D numerical reconstruction

The comparison of 2D numerical reconstructions is presented in Fig. 4. We compared the self-holo with the double-phase method and the SGD method. The RGB-D images are preprocessed to occupy a region of 880 × 880 pixels and padded with zeros to 1072 × 1072 pixels. The double-phase method is computationally fast, but the quality of the reconstructed images is limited; its interleaved encoding reduces the light efficiency, resulting in lower brightness of the reproduced images. The reconstruction quality of the SGD algorithm improves over the iterations. The SGD method can achieve high-quality numerical reconstruction, but it can hardly meet the requirements of a real-time holographic display. The self-holo takes 0.017 s per hologram, and the PSNR of the reconstructed images reaches 25.08 dB. The self-holo thus achieves a good trade-off between computational speed and image quality.

Fig. 4. (a) The sample of 2D reconstructed results of the green channel. (b) Evaluation of algorithm running time and reconstruction quality. We ran these simulations with the DIV2K test dataset [32]. PSNR values of all methods are averaged over 100 test images. The SGD algorithm is run until convergence with 200 iterations. For self-holo, the grayscale value of the depth map is set to 0 when the reconstruction distance is 0.30 m, and the grayscale value of the depth map is set to 1 when the reconstruction distance is 0.32 m.

3.2 3D numerical reconstruction

The comparison of 3D numerical reconstructions is presented in Fig. 5. We compared the self-holo with the complex SGD method and evaluated the image quality at the focus positions. The three sub-images in the intensity image are located at different distances. The complex SGD produces a better out-of-focus blurring effect than SGD owing to the introduction of complex amplitude constraints. Compared with the complex SGD method, the self-holo can predict 3D holograms in real time. The 3D numerical simulation results illustrate the effectiveness of randomly reconstructing one layer of the object during network training.

Fig. 5. (a) Intensity and depth map. (b) The example of 3D numerical reconstructed results of the green channel. The top middle image is in focus at 0.3 m, the bottom left image is in focus at 0.31 m, and the bottom right image is in focus at 0.32 m. The highlighted regions with pink boxes indicate the in-focus parts.

3.3 Generalization capability test

To evaluate the generalization capability of the self-holo, we tested the network on binary images. Figure 6 shows the numerically reconstructed results of the generalization capability test. The first row shows the 2D numerical reconstruction: the grayscale value of the depth map is set to 0 when the reconstruction distance is 0.30 m and to 1 when the reconstruction distance is 0.32 m. The second row shows the 3D numerical reconstruction, where the wings of the butterflies show a good defocus blurring effect. The results demonstrate that the network generalizes well to binary images.

Fig. 6. The numerical reconstructed results of the generalization capability test. The butterfly letter on the bottom left is at 0.30 m, the butterfly on the bottom right is at 0.31 m and the butterfly on the top left is at 0.32 m.

For a color holographic display, the holograms of the R, G, and B channels need to be generated separately, so three separate self-holo networks are trained, one per color channel. Figure 7 shows the numerically reconstructed results of the color holographic display. The self-holo generalizes well to different color channels for color holographic display, and the reconstructed details are more obvious in the RGB images.

Fig. 7. The numerical reconstructed results of color holographic display. The image on the top middle is at 0.30 m, the image on the bottom left is at 0.31 m and the image on the bottom right is at 0.32 m. The highlighted regions with pink boxes indicate the in-focus parts.

4. Optical reconstruction

Commonly used optical systems for color holographic display include the spatial-multiplexing method and the time-multiplexing method [33]. We built an optical experimental system using the time-multiplexing method for color holographic display, as illustrated in Fig. 8. The laser light is expanded and collimated by the beam expander and collimating lens to illuminate the SLM. Our experimental system uses a beam expander, which reduces the complexity of the system, and the collimating lens is a doublet lens that suppresses chromatic aberrations. A 4-f filter system is applied in the holographic reconstruction process; lens 1 and lens 2 are doublet lenses, both with a focal length of 200.0 mm. We used a complementary metal-oxide-semiconductor (CMOS) camera to capture the reconstruction results. The wavelengths of the red, green, and blue lasers are 670 nm, 532 nm, and 473 nm, respectively. The phase-only SLM is a Holoeye Pluto with a resolution of 1920 × 1080 pixels and a pixel pitch of 8 µm.

Fig. 8. Schematic optical path of the color holographic display system. F1, F2 and F3 are filters; P1, P2 and P3 are polarizers; DM1 and DM2 are dichroic mirrors.

The 2D optical reconstruction results are shown in Fig. 9. The double-phase method exhibits severe structured noise around the image due to the downsampling operation. The SGD method performs well in numerical reconstruction; however, severe speckle occurs in optical reconstruction because the SGD method places no constraint on the phase. The proposed method shows less speckle noise in the reconstructed images. We use the target complex-amplitude generator to predict the amplitude and phase distributions of the target field, so the phase distribution of the target plane is discovered by the neural network. The fourth row of Fig. 9 clearly shows that the self-holo generates a smoother phase distribution than the SGD method, which is beneficial for reducing speckle.

Fig. 9. Comparison of 2D optical reconstructed results. The first row and second row are reconstructed results at different distances. The third row shows the corresponding holograms at z = 0.30 m. The fourth row shows the numerically reconstructed phases at z = 0.30 m.

The 3D optical reconstruction results are shown in Fig. 10. To capture each reconstructed plane, the camera is moved along a slide rail to refocus. The reconstructed image layers are in focus at the corresponding positions, which agrees well with the simulation results. The third row of Fig. 10 shows the numerical phase distributions in the reconstruction plane. The reconstructed phase distributions are smoother at in-focus locations than at out-of-focus locations, and the blurring of in-focus and out-of-focus parts is correlated with the variation of the reconstructed phase distributions. Since the bandwidth of a smooth phase is limited [34], the diffusive effect of the 3D object in the out-of-focus region is not obvious.

Fig. 10. 3D optical reconstructed results. The first row is the reconstructed results of the binary image. The second row is the reconstructed results of a grayscale image. The third row shows the numerical phase distributions in the reconstruction plane. The highlighted regions with pink boxes indicate the in-focus parts of the reconstructed results. The highlighted regions with yellow boxes indicate the in-focus parts of reconstructed phases.

Figure 11 shows the results of the 3D color reconstruction, obtained with the time-multiplexing method. The optical color reconstructions clearly show focusing at the different positions. Owing to the limited performance of the color optical reconstruction system, some chromatic aberration and laser speckle noise remain in the reconstructed images. Steps such as optimizing the wavelengths and power ratio of the RGB lasers and calibrating the color holographic display system would further improve the quality of the optical reconstruction.

Fig. 11. The results of the 3D color reconstruction. The image on the top middle is at 0.30 m, the image on the bottom left is at 0.31 m and the image on the bottom right is at 0.32 m. The highlighted regions with pink boxes indicate the in-focus parts.

5. Conclusion

In this paper, we presented the self-holo, a framework with indirect phase inference for 3D or 2D hologram generation. The self-holo needs no labeled dataset because physical diffraction propagation is incorporated into the neural network. We use CNNs to encode amplitude and depth images into the complex amplitude of a 3D object and adopt random reconstruction of one layer of the 3D object, making network training independent of the number of object layers. Compared with most existing unsupervised CGH algorithms, the proposed method can calculate 3D holograms without a large increase in the number of network parameters. The self-holo also has good generalization capability, and the optically reconstructed images show less speckle and match the simulation results well. Furthermore, the proposed method has potential applications in the field of VR/AR. For future research, extending this approach to holograms with continuous depth and exploring the capability of neural networks for wider bandwidths are practical directions.

Funding

National Natural Science Foundation of China (61875115, 62005154); Natural Science Foundation of Shanghai (20ZR1420500); Key Laboratory of Advanced Display and System Application, Chinese Ministry of Education (P201610).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Ref. [35].

References

1. E. Sahin, E. Stoykova, J. Mäkinen, and A. Gotchev, “Computer-Generated Holograms for 3D Imaging,” ACM Comput. Surv. 53(2), 1–35 (2021). [CrossRef]  

2. A. Maimone, A. Georgiou, and J. S. Kollin, “Holographic near-eye displays for virtual and augmented reality,” ACM Trans. Graph. 36(4), 1–16 (2017). [CrossRef]  

3. Z. He, X. Sui, G. Jin, and L. Cao, “Progress in virtual reality and augmented reality based on holographic display,” Appl. Opt. 58(5), A74–A81 (2019). [CrossRef]  

4. J. Wu, K. Liu, X. Sui, and L. Cao, “High-speed computer-generated holography using an autoencoder-based deep neural network,” Opt. Lett. 46(12), 2908–2911 (2021). [CrossRef]  

5. R. W. Gerchberg and W. O. Saxton, “A practical algorithm for the determination of phase from image and diffraction plane pictures,” Optik 35(2), 237–250 (1972).

6. Y. Wu, J. Wang, C. Chen, C. J. Liu, F. M. Jin, and N. Chen, “Adaptive weighted Gerchberg-Saxton algorithm for the generation of the phase-only hologram with artifacts suppression,” Opt. Express 29(2), 1412–1427 (2021). [CrossRef]  

7. J. Zhang, N. Pégard, J. Zhong, H. Adesnik, and L. Waller, “3D computer-generated holography by non-convex optimization,” Optica 4(10), 1306–1313 (2017). [CrossRef]  

8. P. Chakravarthula, Y. Peng, J. Kollin, H. Fuchs, and F. Heide, “Wirtinger holography for near-eye displays,” ACM Trans. Graph. 38(6), 1–13 (2019). [CrossRef]  

9. M. Makowski, M. Sypek, A. Kolodziejczyk, and G. Mikula, “Three-plane phase-only computer hologram generated with iterative Fresnel algorithm,” Opt. Eng. 44(12), 125805 (2005). [CrossRef]  

10. P. Zhou, Y. Li, S. Liu, and Y. Su, “Dynamic compensatory Gerchberg-Saxton algorithm for multiple-plane reconstruction in holographic displays,” Opt. Express 27(6), 8958–8967 (2019). [CrossRef]  

11. C. Chen, B. Lee, N. N. Li, M. Chae, D. Wang, Q. H. Wang, and B. Lee, “Multi-depth hologram generation using stochastic gradient descent algorithm with complex loss function,” Opt. Express 29(10), 15089–15103 (2021). [CrossRef]  

12. Y. Peng, S. Choi, N. Padmanaban, and G. Wetzstein, “Neural holography with camera-in-the-loop training,” ACM Trans. Graph. 39(6), 1–14 (2020). [CrossRef]  

13. P. Tsang and T. -C. Poon, “Novel method for converting digital Fresnel hologram to phase-only hologram based on bidirectional error diffusion,” Opt. Express 21(20), 23680–23686 (2013). [CrossRef]  

14. C. K. Hsueh and A. A. Sawchuk, “Computer-generated double-phase holograms,” Appl. Opt. 17(24), 3874–3883 (1978). [CrossRef]  

15. X. Sui, Z. He, G. Jin, D. Chu, and L. Cao, “Band-limited double-phase method for enhancing image sharpness in complex modulated computer-generated holograms,” Opt. Express 29(2), 2597–2612 (2021). [CrossRef]  

16. D. Blinder, T. Birnbaum, T. Ito, and T. Shimobaba, “The state-of-the-art in computer-generated holography for 3D display,” Light Adv. Manuf. 3(35), 1 (2022). [CrossRef]  

17. S. Choi, M. Gopakumar, Y. Peng, J. Kim, M. O’Toole, and G. Wetzstein, “Time-multiplexed Neural Holography: A flexible framework for holographic near-eye displays with fast heavily-quantized spatial light modulators,” arXiv:2205.02367 (2022).

18. R. Horisaki, R. Takagi, and J. Tanida, “Deep-learning-generated holography,” Appl. Opt. 57(14), 3859–3863 (2018). [CrossRef]  

19. H. Zheng, J. Hu, C. Zhou, and X. Wang, “Computing 3D phase-type holograms based on deep learning method,” Photonics 8(7), 280 (2021). [CrossRef]  

20. J. Lee, J. Jeong, J. Cho, D. Yoo, B. Lee, and B. Lee, “Deep neural network for multi-depth hologram generation and its training strategy,” Opt. Express 28(18), 27137–27154 (2020). [CrossRef]  

21. L. Shi, B. Li, C. Kim, P. Kelnhofer, and W. Matusik, “Towards real-time photorealistic 3D holography with deep neural networks,” Nature 591(7849), 234–239 (2021). [CrossRef]  

22. C. Chang, D. Wang, D. Zhu, J. Li, J. Xia, and X. Zhang, “Deep-learning-based computer-generated hologram from a stereo image pair,” Opt. Lett. 47(6), 1482–1485 (2022). [CrossRef]  

23. L. Shi, B. Li, and W. Matusik, “End-to-end learning of 3D phase-only holograms for holographic display,” Light: Sci. Appl. 11(1), 247 (2022). [CrossRef]  

24. T. Yu, S. Zhang, W. Chen, J. Liu, X. Zhang, and Z. Tian, “Phase dual-resolution networks for a computer-generated hologram,” Opt. Express 30(2), 2378–2389 (2022). [CrossRef]  

25. K. Liu, J. Wu, Z. He, and L. Cao, “4K-DMDNet: diffraction model-driven network for 4 K computer-generated holography,” Opto-Electron. Adv. 6(1), 220135 (2023). [CrossRef]  

26. M. Hossein Eybposh, N. W. Caira, M. Atisa, P. Chakravarthula, and N. C. Pégard, “DeepCGH: 3D computer-generated holography using deep learning,” Opt. Express 28(18), 26636–26650 (2020). [CrossRef]  

27. K. Matsushima and T. Shimobaba, “Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields,” Opt. Express 17(22), 19662–19673 (2009). [CrossRef]  

28. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2015), pp. 234–241.

29. K. He, X. Zhang, S. Ren, and J. Sun, “Identity Mappings in Deep Residual Networks,” in European Conference on Computer Vision (ECCV) (2016), pp. 630–645.

30. P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang, X. Hou, and G. Cottrell, “Understanding convolution for semantic segmentation,” in IEEE winter conference on applications of computer vision (WACV) (2018), pp. 1451–1460.

31. J. Johnson, A. Alahi, and F. F. Li, “Perceptual losses for real-time style transfer and super-resolution,” in European conference on computer vision (ECCV) (2016), pp. 694–711.

32. E. Agustsson and R. Timofte, “NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study,” in Proceedings of the IEEE conference on computer vision and pattern recognition workshops (CVPRW) (2017), pp. 1122–1131.

33. D. Pi, J. Liu, and Y. Wang, “Review of computer-generated hologram algorithms for color dynamic holographic three-dimensional display,” Light: Sci. Appl. 11(1), 231 (2022). [CrossRef]  

34. D. Yoo, Y. Jo, S. -W. Nam, C. Chen, and B. Lee, “Optimization of computer-generated holograms featuring phase randomness control,” Opt. Lett. 46(19), 4769–4772 (2021). [CrossRef]  

35. X. Shui, H. Zheng, X. Xia, F. Yang, W. Wang, and Y. Yu, “Diffraction model-informed neural network for unsupervised layer-based computer-generated holography,” GitHub (2022), https://github.com/SXHyeah/Self-Holo.
