
Reconstructing images of two adjacent objects passing through scattering medium via deep learning

Open Access

Abstract

In this paper, to the best of our knowledge, we present the first deep-learning-based method for reconstructing the images of two adjacent objects through scattering media. We construct an imaging system in which two adjacent objects located at different depths are imaged through a scattering medium. In general, as the light field carrying the two adjacent objects passes through the scattering medium, a single speckle pattern is obtained. We employ a purpose-designed adversarial network, called YGAN, to reconstruct the two images simultaneously from this speckle. It is shown that, based on the trained YGAN, the images of the two adjacent objects can be reconstructed with high quality. In addition, the influence of the object image types and the location depths of the two adjacent objects on the imaging fidelity is studied. The results demonstrate the strong generalization ability and effectiveness of the YGAN. Even when another scattering medium is inserted between the two objects, the YGAN can reconstruct the object images with high fidelity. The technique presented in this paper can find applications in areas such as medical image analysis (e.g., medical image classification and segmentation), multi-object scattering imaging, and three-dimensional imaging.

© 2021 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Imaging through scattering media remains one of the most important and challenging topics in many fields. For example, complex tissues or turbid media in an organism disturb the propagation of light, and automotive monitoring and imaging devices often encounter fog, haze, or smoke [1–3]. In such scenarios, the propagating light field is impeded and scattered by objects or scattering media, and a speckle pattern, instead of the original object image, is usually obtained at the detector [4,5].

The speckle pattern is a two-dimensional (2D) image that contains the scattered object structure. To reconstruct a 2D object image from a speckle, methods such as phase conjugation [6,7], wavefront shaping [8], the transmission matrix (TM) [9–11], and intensity correlation [12–14] have been successfully used. However, multiple targets commonly exist along the light propagation path (e.g., subcutaneous blood vessels and body organs, or pedestrians and cars). It is both more important and more challenging to reconstruct, from a single detected speckle, more than one object image located at different depths along the optical path. Because the information of the multiple object images becomes highly mixed during light propagation, ideal imaging results cannot be achieved with existing traditional computational methods. To reconstruct more than one image through scattering media, a method combining structured illumination and a compressive sensing (CS) algorithm has been proposed, in which the illumination coding patterns are changed at each detection period and the images are then reconstructed computationally [15]. However, the CS-based algorithm performs poorly in the case of strong scattering media. Horisaki et al. used support vector regression to reconstruct object images through scattering media, where two identical object images are placed in the optical path [16]. Van et al. proposed a deep-learning method and built two neural networks (NNs) for image reconstruction and depth estimation of animal-body images, respectively, where the two NNs need to be trained and tested separately [17].

In recent years, research on learning-based methods for scattering imaging has made great progress. The scattering process can generally be formulated as $\textrm{y} = {\cal F}(\textrm{x})$, where x is the object matrix, y is the speckle matrix, and ${\cal F}(\cdot)$ is the forward scattering function. To recover an object image from a speckle, one obvious way is to compute the inverse function ${\cal F}^{-1}(\cdot)$. Accordingly, many learning-based methods train a neural network to fit a non-linear model of ${\cal F}^{-1}(\cdot)$, and the trained model is then used to reconstruct object images from a testing dataset. For example, Li et al. combined speckle correlation and convolutional neural networks (CNNs) to reconstruct images through diffusers unseen during training, and later extended their work to restore targets hidden behind scattering media at unknown locations [18,19]; Rahmani et al. used a CNN to reconstruct the image amplitude and phase from speckles transmitted through a multimode fiber [20]. Moreover, Li et al. proposed a CNN architecture for imaging through glass diffusers that generalizes across different image datasets [21]. CNN-based methods to some extent overcome limitations of optical scattering imaging methods, such as the limited memory-effect range and the vulnerability of the TM method, and show good robustness [22–26]. However, a commonly used training scheme is pixel-wise training, in which the model fits the inverse function at the pixel level; such a scheme subtly introduces a certain degree of spatial ambiguity. To obtain better imaging performance, the structural fidelity of the reconstructed images should be reinforced. Generative adversarial networks (GANs) have found remarkable applications in modeling image distributions and have been successfully employed to solve inverse problems such as super-resolution imaging and in-painting [27–32]. Luc et al. improved semantic segmentation accuracy by using GANs [33]; Sun et al. proposed using a GAN for imaging through dynamic scattering media underwater, and Yang et al. used a GAN to improve underwater ghost-imaging quality [34,35]. These works demonstrate the capability of GANs to improve image quality.

In real applications, it is important to reconstruct, from a single speckle, object images located at different depths behind scattering media, and a method with high imaging quality and strong robustness is needed. In this paper, we propose a deep-neural-network technique for solving this problem. Our method is based on generative adversarial networks (GANs) and takes advantage of convolutional neural networks (CNNs). The designed network, named the Y-type generative adversarial network (YGAN), inverts the complex scattering process for the two object images by using two separate decoding branches. The experimental architecture is illustrated in Fig. 1, in which a speckle pattern is obtained as a laser beam illuminates the two objects one after the other and then passes through a scattering medium. We find that, by using the YGAN, two different binary or grayscale object images behind the scattering media can be reconstructed simultaneously with high fidelity. The influence of the distance between the two objects (i.e., d) on the reconstruction quality is studied. Moreover, we consider a more complicated case in which an additional scattering medium S’ is inserted between the two objects; different types of S and S’ are used in the experiments.

Fig. 1. Experimental architecture. A laser beam illuminates two objects at different depth planes, and a speckle pattern is obtained behind the scattering medium S. The proposed network is used to reconstruct the two objects simultaneously from a single speckle. Changes of object depths, object types, and scattering media are also studied in the experiments.

2. Method

The problem that we aim to solve is to reconstruct two adjacent object images through scattering media; the physical scattering process can hardly be formulated and cannot be calculated directly. Fortunately, CNN-based methods are robust and effective, as they require no prior knowledge of the optical system and can simplify the experimental configuration [36,37]. However, most CNNs are trained to make pixel-wise predictions, which has been shown to cause a lack of spatial continuity in the reconstructed images [33,38]. Generative adversarial networks remedy this defect by drawing on ideas from game theory. The general GAN architecture contains two deep neural networks, i.e., a generator network and a discriminator network, whose training goals oppose each other. The generator is trained to produce new images that are indistinguishable from real ones by learning a mapping from the input data, while the discriminator is trained to classify Ground Truth images as real and generator outputs as fake. In this way, adversarial training improves the generator by attaching an additional adversarial loss term to its objective function, encouraging the generator output to be indistinguishable from the original images. Based on this principle, we design a network, called YGAN, for reconstructing two adjacent object images behind scattering media with high quality.

2.1 Network architecture

The designed YGAN consists of an imaging network (the generator) and a fully convolutional network (the discriminator), as shown in Fig. 2. Specifically, the generator is a fully convolutional Y-Net, illustrated on the left of Fig. 2, which follows the symmetrical skip connections of U-Net [39,40]. The input is a single speckle and the outputs are two prediction maps of the two object images. Because max-pooling causes information loss [41], convolutions with a stride of 2 are used in the down-sampling path until the 2×2×512 activation maps are reached. Each up-sampling path contains six blocks, where each block consists of up-sampling, a 3×3 convolution, a leaky rectified linear unit (leaky ReLU), batch normalization (BN), and concatenation. The output prediction is obtained after an additional up-sampling and a 1×1 convolution, as sketched below.
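
As a concrete illustration, the following Keras/TensorFlow code builds such a Y-shaped generator. It is a minimal sketch, not the authors' exact implementation: the filter widths, the use of batch normalization in the encoder, and the leaky-ReLU slope are assumptions; only the stride-2 down-sampling to a 2×2×512 bottleneck, the six up-sampling blocks per branch, and the final up-sampling plus 1×1 convolution follow the description above.

```python
# Minimal sketch of the Y-shaped generator (filter widths and encoder BN assumed).
import tensorflow as tf
from tensorflow.keras import layers, Model

def down_block(x, filters):
    """Stride-2 convolution replacing max-pooling, as described above."""
    x = layers.Conv2D(filters, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.LeakyReLU(0.2)(x)

def up_block(x, skip, filters):
    """Up-sampling block: upsample, 3x3 conv, leaky ReLU, BN, concatenate skip."""
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(filters, 3, padding="same")(x)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.BatchNormalization()(x)
    return layers.Concatenate()([x, skip])

def build_y_generator(input_shape=(256, 256, 1)):
    speckle = layers.Input(input_shape)
    # Shared encoder: 256 -> 2 via seven stride-2 convolutions (assumed widths).
    filters = [64, 128, 256, 512, 512, 512, 512]
    x, skips = speckle, []
    for f in filters:
        x = down_block(x, f)          # halves the spatial size at each step
        skips.append(x)               # kept for symmetric skip connections
    # The 2x2x512 bottleneck feeds two independent decoding branches.
    outputs = []
    for _ in range(2):
        y = x
        for f, skip in zip(filters[-2::-1], skips[-2::-1]):   # six up-sampling blocks
            y = up_block(y, skip, f)
        y = layers.UpSampling2D(2)(y)                         # back to 256x256
        y = layers.Conv2D(1, 1, activation="sigmoid")(y)      # final 1x1 convolution
        outputs.append(y)
    return Model(speckle, outputs, name="y_generator")
```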

Fig. 2. The architecture of the YGAN. The left side shows the generator network, which is a Y-type network: it takes a single speckle as input and outputs two prediction maps of the objects. The right side shows the discriminator network, whose inputs come partly from the generator and partly from the original images.

The discriminator network is shown on the right side of Fig. 2 and has multiple inputs. The two outputs of the generator are concatenated to form part of the discriminator input; the corresponding Ground Truth images are likewise concatenated and fed into the discriminator. The input size and the output size of the discriminator are 256×256×2 and 16×16, respectively. This strategy restricts the discriminator's attention to local patches, which enforces structural fidelity in the predictions of the generator [42], as illustrated by the sketch below.
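
A matching patch-based discriminator could look as follows. This is again a hedged sketch: the number of layers and the filter widths are assumptions, chosen so that a 256×256×2 input yields a 16×16 map of real/fake scores as specified above.

```python
# Minimal sketch of the patch-based discriminator (filter widths assumed).
# A 256x256x2 concatenated image pair is mapped to a 16x16 grid of scores,
# so each score judges only a local patch of the pair.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_patch_discriminator(input_shape=(256, 256, 2)):
    pair = layers.Input(input_shape)
    x = pair
    for f in [64, 128, 256, 512]:                     # four stride-2 convs: 256 -> 16
        x = layers.Conv2D(f, 4, strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    scores = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)  # 16x16x1
    return Model(pair, scores, name="patch_discriminator")
```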

2.2 Objective Functions

The objective function of the YGAN is a hybrid loss that is a weighted sum of two terms. We use $G(\cdot)$ to denote the outputs of the generator and ${G_{{o_1}{o_2}}}(\cdot)$ to denote their concatenation, where ${o_1}$ and ${o_2}$ are the two predicted outputs of the generator. Given an input speckle x and the corresponding Ground Truth images ${y_1}$ and ${y_2}$, the objective function can be expressed as:

$$l(\theta_g, \theta_d) = \lambda \left[ l(G(x), y_1) + l(G(x), y_2) \right] - \left[ l(D(Y_{y_1 y_2}), 1) + l(D(G_{o_1 o_2}(x)), 0) \right], \tag{1}$$
where ${\theta _g}$ and ${\theta _d}$ denote the parameters of the generator and discriminator networks, respectively, ${Y_{{y_1}{y_2}}}$ denotes the concatenation of ${y_1}$ and ${y_2}$, and $\lambda$ is a weighting factor. During training, the generator and the discriminator are trained alternately: the generator aims to minimize the objective function, while the discriminator is trained to maximize it.

When training the discriminator, the second term of Eq. (1) will be minimized, which is equivalent to minimizing the following binary classification loss:

$$l(D(Y_{y_1 y_2}), 1) + l(D(G_{o_1 o_2}(x)), 0) \tag{2}$$

Accordingly, the discriminator is trained to predict that images from the generator network are fake (label 0) and that Ground Truth images are real (label 1). For updating the discriminator, l can be the binary cross-entropy or another measure; here, we use the mean square error (MSE), which is formulated as:

$$MSE = \frac{1}{H W} \sum_{i = 1}^{H} \sum_{j = 1}^{W} \left( \hat{y}(i,j) - y(i,j) \right)^2, \tag{3}$$
where H and W denote the height and width of the image, respectively, and $\hat{y}$ and $y$ refer to the predicted object image and the Ground Truth, respectively.

With the discriminator network fixed, the generator network is trained to minimize the following terms:

$$\lambda \left[ l(G(x), y_1) + l(G(x), y_2) \right] - l(D(G_{o_1 o_2}(x)), 0) \tag{4}$$

The first term is the pixel-level mapping loss generally used in CNNs for scattering imaging (see e.g. [18,26]). The second term is based on the discriminator network and can be considered an auxiliary adversarial loss for the generator, penalizing the generator for producing blurry images. By minimizing this term, the generator is encouraged to produce images that are close to the distribution of the Ground Truth. In practice, we replace the second term by $+\, l(D(G_{o_1 o_2}(x)), 1)$. For updating the generator network, we set l to the binary cross-entropy for reconstructing binary object images, which can be expressed as:

$$l(\hat{y}, y) = - \left( y \ln \hat{y} + (1 - y) \ln(1 - \hat{y}) \right), \tag{5}$$
where y denotes the Ground Truth, taking values in {0, 1}, and $\hat{y}$ denotes the predicted image, with values between 0 and 1. For reconstructing grayscale object images, l is the MSE. A sketch of one alternating training step combining these losses is given below.
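
The following TensorFlow sketch illustrates one alternating training step built from the losses above. It is a minimal sketch under stated assumptions: `generator` and `discriminator` are the hypothetical models sketched in Section 2.1, λ = 100 and the Adam settings follow Section 4.1, and the exact update schedule of the original implementation may differ.

```python
# Minimal sketch of one alternating YGAN training step (Eqs. (1)-(5)).
# `generator` and `discriminator` are assumed to be the models sketched above.
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()   # pixel loss for binary objects; use MSE for grayscale
mse = tf.keras.losses.MeanSquaredError()     # discriminator loss, Eq. (3)
g_opt = tf.keras.optimizers.Adam(1e-4, beta_1=0.5)
d_opt = tf.keras.optimizers.Adam(1e-4, beta_1=0.5)
lam = 100.0                                  # loss weight lambda (Section 4.1)

def train_step(generator, discriminator, x, y1, y2):
    # Discriminator update: real pairs -> label 1, generated pairs -> label 0 (Eq. (2)).
    o1, o2 = generator(x, training=False)
    fake_pair = tf.concat([o1, o2], axis=-1)
    real_pair = tf.concat([y1, y2], axis=-1)
    with tf.GradientTape() as tape:
        d_real = discriminator(real_pair, training=True)
        d_fake = discriminator(fake_pair, training=True)
        d_loss = mse(tf.ones_like(d_real), d_real) + mse(tf.zeros_like(d_fake), d_fake)
    d_grads = tape.gradient(d_loss, discriminator.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, discriminator.trainable_variables))

    # Generator update: weighted pixel loss plus adversarial term (Eq. (4)).
    with tf.GradientTape() as tape:
        o1, o2 = generator(x, training=True)
        d_fake = discriminator(tf.concat([o1, o2], axis=-1), training=False)
        pixel_loss = bce(y1, o1) + bce(y2, o2)          # Eq. (5) applied to each branch
        adv_loss = bce(tf.ones_like(d_fake), d_fake)    # the +l(D(G(x)), 1) replacement
        g_loss = lam * pixel_loss + adv_loss
    g_grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, generator.trainable_variables))
    return d_loss, g_loss
```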

3. Experimental setup and data requirement

3.1 Experimental setup

The experimental setup is shown in Fig. 3. The laser beam (MGL-F-532.8nm-2W) passes through a microscope objective (OBJ1, ×20, NA = 0.25) and a pinhole aperture (D = 20 µm). The light field is then collimated by Lens1 (f = 20 mm) and passes through a horizontal polarizer to illuminate SLM1 (Holoeye, LC2012) and SLM2 (Holoeye, LC2012, reflective type) one after the other. The two SLMs are programmed to simultaneously display pairs of object images, one image per SLM. The distance between the two SLMs is set to 35 cm, 45 cm, or 55 cm. The light field carrying the overlapped information of the two object images is focused onto the scattering medium S by Lens2 (f = 100 mm). The scattered light is collected by OBJ2 (×20, NA = 0.25) and captured by a CCD (AVT PIKE F-421B). For the different distances between SLM1 and SLM2, the CCD position is fine-tuned accordingly to ensure a proper speckle size. To approximate the complex scattering environments encountered in practice, an additional scattering medium S’ can be inserted midway between the two SLMs.

Fig. 3. The optical system for collecting speckle datasets. OBJ1 (×20, NA = 0.25) and OBJ2 (×20, NA = 0.25) are microscope objectives. Lens1 (f = 20 mm) and Lens2 (f = 100 mm) are used to collimate and focus the light field. P is a horizontal polarizer. SLM1 (Holoeye, LC2012) and SLM2 (Holoeye, LC2012, reflective) are used to display the object images. The distance between SLM1 and SLM2 is chosen to be 35 cm, 45 cm, or 55 cm. Both S’ and S are scattering media, and S’ can be selectively inserted.

3.2 Data requirement

The object images loaded onto the two SLMs are taken from the MNIST handwritten digits [43], Quickdraw objects [44], and Fashion-MNIST [45] datasets. The handwritten digits and Quickdraw images are used as binary objects in this work; we randomly select 10000 images from [43] and [44], respectively, resize them to 512×512 pixels, and binarize them to {0, 1} before loading them onto the two SLMs. The Fashion-MNIST images are used as grayscale objects; we randomly select a total of 20000 images, of which the first 10000 are displayed on SLM1 and the remaining 10000 on SLM2, with all images resized to 512×512 pixels. A small preparation sketch is given below.
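
As an illustration, the binarization step for the SLM patterns might look like the sketch below; the interpolation mode and threshold value are assumptions, since the paper only states that the images are resized to 512×512 and binarized to {0, 1}.

```python
# Minimal sketch of preparing binary SLM patterns from MNIST digits.
# Interpolation mode and threshold are assumptions.
import numpy as np
import cv2
from tensorflow.keras.datasets import mnist

(digits, _), _ = mnist.load_data()                        # 28x28 grayscale digits
subset = digits[np.random.choice(len(digits), 10000, replace=False)]

def to_slm_pattern(img, size=512, threshold=0.5):
    img = cv2.resize(img, (size, size), interpolation=cv2.INTER_NEAREST)
    img = img.astype(np.float32) / 255.0
    return (img > threshold).astype(np.uint8)              # binary {0, 1} pattern

patterns = np.stack([to_slm_pattern(im) for im in subset])  # 10000 x 512 x 512
```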

For simulating scattering in practice in different conditions, we purposely design the following experiments for collecting speckle datasets:

  • (1) By using binary object images, the collected datasets are:
Group A1: A scattering slab (ZnO, 200 µm thick) is used as S (d = 45 cm, without S’). 10000 speckles are collected corresponding to the loaded objects.

Group A2: An additional scattering medium S’ (ZnO, 50 µm) is inserted between SLM1 and SLM2. 10000 speckles are collected corresponding to the loaded objects.

Group A3: An additional scattering medium S’ (a 600-grit diffuser) is inserted between SLM1 and SLM2, and S is also a diffuser (2000-grit). 10000 speckles are collected corresponding to the loaded objects.

Group B: We keep the experimental configuration the same as in Group A1 and set d to 35 cm, 45 cm, and 55 cm, respectively. For each distance, we collect 4000 speckles corresponding to the loaded objects (named Groups B1, B2, and B3, respectively) and mix the speckles to form a single dataset.

  • (2) By using grayscale object images, the collected dataset is:
Group C: The scattering media are S (ZnO, 200 µm) and S’ (ZnO, 50 µm), with d = 55 cm. 10000 speckles are collected corresponding to the loaded objects.

Each collected dataset is randomly divided into training and testing sets with a split ratio of 9:1. The testing set is used only to evaluate the performance of the trained network and is not used during training. Before being fed into the network, the speckle-object pairs of the training set are down-sampled from 512×512 pixels to 256×256 pixels and normalized to values between 0 and 1, which reduces the number of network parameters and the training time. The network training is performed on a GPU (NVIDIA RTX 2080 SUPER) using Keras/TensorFlow. A preprocessing sketch is given below.
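
A minimal preprocessing sketch follows; the array layout, the per-image min-max normalization, and the resizing routine are assumptions, while the 256×256 target size and the 9:1 split follow the description above.

```python
# Minimal sketch of preparing speckle-object pairs for training.
# Per-image min-max normalization and cv2 resizing are assumptions.
import numpy as np
import cv2

def preprocess(speckles, objects1, objects2, split_ratio=0.9):
    def prep(stack):
        out = np.stack([cv2.resize(im, (256, 256)) for im in stack]).astype(np.float32)
        out -= out.min(axis=(1, 2), keepdims=True)
        out /= out.max(axis=(1, 2), keepdims=True) + 1e-8   # normalize each image to [0, 1]
        return out[..., np.newaxis]                         # add a channel dimension
    x, y1, y2 = prep(speckles), prep(objects1), prep(objects2)
    idx = np.random.permutation(len(x))                     # random 9:1 split
    n_train = int(split_ratio * len(x))
    tr, te = idx[:n_train], idx[n_train:]
    return (x[tr], y1[tr], y2[tr]), (x[te], y1[te], y2[te])
```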

4. Results analysis

4.1 Reconstructing images of two binary objects

First, we employ the YGAN to reconstruct two binary object images from a single speckle. The influence of different scattering media and object depths on the reconstructed images is studied and discussed. In the training process, the discriminator and generator are trained alternately, step by step. The generator network is trained with a loss weight $\lambda$ = 100 and a learning rate of $10^{-4}$, and both the generator and the discriminator are optimized with mini-batch stochastic gradient descent using the Adam optimizer with momentum parameter $\beta$ = 0.5. The last two layers of the generator use the sigmoid activation function together with the binary cross-entropy loss.

To quantitatively analyze the reconstruction results, we use the Pearson correlation coefficient (PCC), structural similarity (SSIM), and peak signal-to-noise ratio (PSNR) to evaluate the similarity between the reconstructed object images and the Ground Truth [46,47], which can be computed as sketched below.
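
A minimal evaluation sketch using NumPy and scikit-image is shown below; the library choice is an assumption, as the paper does not state its tooling.

```python
# Minimal sketch of the three evaluation metrics for one reconstruction.
# Library choice (NumPy / scikit-image) is an assumption.
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate(pred, truth):
    pred = np.squeeze(pred).astype(np.float64)    # drop channel dim, images in [0, 1]
    truth = np.squeeze(truth).astype(np.float64)
    pcc = np.corrcoef(pred.ravel(), truth.ravel())[0, 1]        # Pearson correlation coefficient
    ssim = structural_similarity(pred, truth, data_range=1.0)   # structural similarity
    psnr = peak_signal_noise_ratio(truth, pred, data_range=1.0) # peak signal-to-noise ratio
    return pcc, ssim, psnr
```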

The YGAN is trained on the training set of Group A1 and its performance is tested on the corresponding testing set. Some test results are presented in Fig. 4(a). The results show that, using the YGAN, two adjacent binary objects can be reconstructed from a single speckle with high fidelity; the spatially invariant features of both object images are learned simultaneously by the network. With the adversarial training scheme, the reconstructed images are visually similar to the original images.

Fig. 4. Reconstructed images of two binary objects on adjacent layers by using the YGAN. (a) Results of Group A1, without S’, d = 45 cm; (b) results of Group A2 and Group A3, where both S and S’ are present. The yellow digits are values of the SSIM index.

We then consider the case in which two scattering media exist in the optical path. In this situation, the YGAN is trained on the training sets of Group A2 and Group A3, respectively. As shown in Fig. 4(b), the reconstruction results are nearly the same, indicating that the YGAN is applicable both to multiple-diffuser scattering and to thick scattering media.

Table 1 presents the averaged PCC, SSIM, and PSNR values, computed over all test results of each dataset. The validation accuracy curves as a function of the training iteration step are shown in Fig. 5. As the number of training iterations grows, the networks learn to extract the invariant features from a single speckle, and the validation accuracy gradually converges. Meanwhile, Table 1 and Fig. 5 show that the evaluation values of Groups A2 and A3 (in which S’ exists) are slightly higher than those of Group A1 (without S’). This is attributed to the uneven thickness of the scattering medium near the focal point of the lens, which produces speckles of different complexity in Groups A1, A2, and A3, so that the network performs slightly differently.

Fig. 5. The validating accuracy curves as a function of the training iteration step during training of Group A1, A2, and A3, respectively. (a) curves of object1 (digit images); (b) curves of object2 (quickdraw images).

Table 1. Averaged validating values of the test results of Group A1, A2 and A3, respectively.

The generalization performance of the network is also tested by reconstructing object images at different depths. The YGAN is trained on the training set of Group B, which is a mixed dataset of different depths. Figure 6 presents the test results of Groups B1, B2, and B3; images at different depths can be reconstructed by the trained YGAN with high quality, indicating that the YGAN fits a model that generalizes across object depths. Table 2 presents the averaged PCC, SSIM, and PSNR of the test results, demonstrating the robustness of the YGAN against changes in the distance between the two objects. The validation accuracy curves during training are shown in Fig. 7; they converge at an early iteration step, demonstrating the effectiveness of using the YGAN to simultaneously predict multiple objects located at different depths.

Fig. 6. Reconstructed object images with d = 35 cm, 45 cm, and 55 cm by using the trained YGAN. The yellow digits are values of SSIM index.

Fig. 7. The validating accuracy curves as a function of the training iteration step during training of Group B. (a) curves of object1 (digit images); (b) curves of object2 (quickdraw images).

Table 2. Averaged validating values of the test results of Group B1, B2 and B3, respectively.

4.2 Reconstructing images of two grayscale objects

Next, we explore the YGAN performance on reconstructing two grayscale object images from a single speckle. To further demonstrate the advantages of the proposed method, the YGAN is compared with a Y-Net, which has the same architecture as the generator network but is trained without adversarial training. The two networks are trained on the training set of Group C with an L1 loss, which encourages less blurring. The test results are shown in Fig. 8. The reconstructions of the YGAN retain most of the structural information of the original grayscale images, while those of the Y-Net lose most of the detailed information. This reveals the advantage of adversarial training in image reconstruction, enabling the imaging network to fit a better model and improve imaging quality. Table 3 presents the averaged evaluation values, where the averaged SSIM of the YGAN is 10% to 20% higher than that of the Y-Net. These results further demonstrate the effectiveness of adversarial training and show the potential of the YGAN for reconstructing complex natural images. A sketch of the Y-Net baseline used for this comparison is given below.
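
For illustration, the Y-Net baseline could be trained as in the following sketch, reusing the hypothetical build_y_generator from Section 2.1 and the preprocessed arrays from Section 3.2; the batch size and the number of epochs are assumptions not given in the paper.

```python
# Minimal sketch of the Y-Net baseline: the generator alone, trained with an
# L1 (mean absolute error) loss and no adversarial term. Batch size and epoch
# count are assumptions; x_train, y1_train, y2_train come from preprocess().
import tensorflow as tf

y_net = build_y_generator()                              # same architecture as the YGAN generator
y_net.compile(optimizer=tf.keras.optimizers.Adam(1e-4, beta_1=0.5),
              loss=["mae", "mae"])                       # L1 loss on each output branch
y_net.fit(x_train, [y1_train, y2_train],
          batch_size=8, epochs=50, validation_split=0.1)
```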

Fig. 8. Reconstructed images of two grayscale objects on adjacent layers by using the YGAN and the Y-Net, respectively. The yellow digits are values of the SSIM index.

Table 3. Averaged validating values of the test results of Group C by using the trained YGAN, and Y-Net respectively.

5. Conclusion

We have demonstrated an approach using the YGAN to reconstruct images of two different objects located on adjacent layers behind scattering media. It has been shown that the YGAN can reconstruct the images of two binary objects from one speckle with high fidelity. Moreover, the YGAN remains applicable when multiple scattering media exist, and the developed technique performs well when the two object images are at different depths. The YGAN also outperforms the network without adversarial training when reconstructing images of two grayscale objects in the same optical system. The proposed technique may find applications in medical image analysis, such as high-quality medical image classification and segmentation, as well as in studies of multi-object scattering imaging and three-dimensional imaging.

Funding

National Natural Science Foundation of China (62005086).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. B. Javidi, I. Moon, and S. Yeom, “Three-dimensional identification of biological microorganism using integral imaging,” Opt. Express 14(25), 12096–12108 (2006). [CrossRef]  

2. K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2011). [CrossRef]  

3. R. T. Tan, “Visibility in bad weather from a single image,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (IEEE, 2008), pp. 1–8.

4. A. P. Mosk, A. Lagendijk, G. Lerosey, and M. Fink, “Controlling waves in space and time for imaging and focusing in complex media,” Nat. Photonics 6(5), 283–292 (2012). [CrossRef]

5. A. Ishimaru, Wave Propagation and Scattering in Random Media, IEEE Press (1978).

6. Z. Yaqoob, D. Psaltis, M. Feld, and C. Yang, “Optical phase conjugation for turbidity suppression in biological samples,” Nat. Photonics 2(2), 110–115 (2008). [CrossRef]  

7. K. Si, R. Fiolka, and M. Cui, “Fluorescence imaging beyond the ballistic regime by ultrasound pulse guided digital phase conjugation,” Nat. Photonics 6(10), 657–661 (2012). [CrossRef]  

8. J. Park, Z. Yu, K. R. Lee, P. Lai, and Y. K. Park, “Perspective: Wavefront shaping techniques for controlling multiple light scattering in biological tissues: Toward in vivo applications,” APL Photonics 3(10), 100901 (2018). [CrossRef]  

9. S. Popoff, G. Lerosey, R. Carminati, M. Fink, A. Boccara, and S. Gigan, “Measuring the transmission matrix in optics: an approach to the study and control of the light propagation in disordered media,” Phys. Rev. Lett. 104(10), 100601 (2010). [CrossRef]  

10. M. Kim, W. Choi, Y. Choi, C. Yoon, and W. Choi, “Transmission matrix of a scattering medium and its applications in biophotonics,” Opt. Express 23(10), 12648–12668 (2015). [CrossRef]  

11. H. He, Y. Guan, and J. Zhou, “Image restoration through thin turbid layers by correlation with a known object,” Opt. Express 21(10), 12539–12545 (2013). [CrossRef]  

12. L. Chen, R. K. Singh, Z. Chen, and J. Pu, “Phase shifting digital holography with the Hanbury Brown-Twiss approach,” Opt. Lett. 45(1), 212–215 (2020). [CrossRef]  

13. L. Chen, Z. Chen, R. K. Singh, and J. Pu, “Imaging of polarimetric-phase object through scattering medium by phase shifting,” Opt. Express 28(6), 8145–8155 (2020). [CrossRef]  

14. R. V. Vinu, R. K. Singh, and J. Pu, “Ghost diffraction holographic microscopy,” Optica 7(12), 1697–1704 (2020). [CrossRef]  

15. T. Ando, R. Horisaki, and J. Tanida, “Three-dimensional imaging through scattering media using three-dimensionally coded pattern projection,” Appl. Opt. 54(24), 7316–7322 (2015). [CrossRef]  

16. R. Horisaki, R. Takagi, and J. Tanida, “Learning-based imaging through scattering media,” Opt. Express 24(13), 13738 (2016). [CrossRef]  

17. T. Van, T. Tran, H. Inujima, and K. Shimizu, “Three-dimensional imaging through turbid media using deep learning: NIR transillumination imaging of animal bodies,” Biomed. Opt. Express 12(5), 2873–2887 (2021). [CrossRef]  

18. Y. Li, Y. Xue, and L. Tian, “Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media,” Optica 5(10), 1181–1190 (2018). [CrossRef]  

19. Y. Li, S. Cheng, Y. Xue, and L. Tian, “Displacement-agnostic coherent imaging through scatter with an interpretable deep neural network,” Opt. Express 29(2), 2244–2257 (2021). [CrossRef]  

20. B. Rahmani, D. Loterie, G. Konstantinou, D. Psaltis, and C. Moser, “Multimode optical fiber transmission with a deep learning network,” Light Sci Appl 7(1), 69 (2018). [CrossRef]  

21. S. Li, M. Deng, J. Lee, A. Sinha, and G. Barbastathis, “Imaging through glass diffusers using densely connected convolutional networks,” Optica 5(7), 803–813 (2018). [CrossRef]  

22. J. Zhao, X. Ji, M. Zhang, and X. Wang, “High-fidelity Imaging through Multimode Fibers via Deep Learning,” J. Phys. Photonics 3(1), 015003 (2021). [CrossRef]  

23. L. Wu, J. Zhao, M. Zhang, Y. Zhang, X. Wang, Z. Chen, and J. Pu, “Deep learning: High-quality Imaging through Multicore Fiber,” Curr. Opt. Photonics 4, 286–292 (2020).

24. Q. Li, J. Zhao, Y. Zhang, X. Lai, and J. Pu, “Imaging Reconstruction through Strongly Scattering Media by Using Convolutional Neural Networks,” Opt. Commun. 477, 126341 (2020). [CrossRef]  

25. J. Lim, A. Ayoub, and D. Psaltis, “Three-dimensional tomography of red blood cells using deep learning,” Adv. Photonics 2(02), 1 (2020). [CrossRef]  

26. X. Lai, Q. Li, X. Wu, G. Liu, Z. Chen, and J. Pu, “Mutual transfer learning of reconstructing images through a multimode fiber or a scattering medium,” IEEE Access 9, 68387–68395 (2021). [CrossRef]  

27. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, and Y. Bengio, “Generative adversarial nets,” arXiv Preprint arXiv:1406.2661v1 (2014).

28. J. Zhu, P. Krhenbühl, E. Shechtman, and A. Efros, “Generative visual manipulation on the natural image manifold,” arXiv Preprint arXiv:1609.03552v3 (2016).

29. X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, “Infogan: Interpretable representation learning by information maximizing generative adversarial nets,” Proc. Adv. in Neural Processing Systems (NIPS), (Neural Information Processing Systems Foundation, 2016), pp. 2172–2180.

30. V. Shah and C. Hegde, “Solving leaner inverse problems using GAN priors: An algorithm with provable guarantees,” in IEEE International Conf. on Acoustics, Speech and Signal Processing (ICASSP), (IEEE, 2018), pp. 4609–4613.

31. C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, and Z. Wang, “Photo-realistic single image super-resolution using a generative adversarial network,” in IEEE Conf. on Computer Vision and Pattern Recognition, (IEEE, 2017), pp. 105–114.

32. R. A. Yeh, C. Chen, T.Y. Lim, and A.G. Schwing, “Semantic image inpainting with deep generative models,” in IEEE Conf. on Computer Vision and Pattern Recognition, (IEEE, 2017), pp. 6882–6890.

33. P. Luc, C. Couprie, S. Chintala, and J. Verbeek, “Semantic segmentation using adversarial networks,” arXiv Preprint arXiv:1611.08408v1 (2016).

34. Y. Sun, J. Shi, L. Sun, J. Fan, and G. Zeng, “Image reconstruction through dynamic scattering media based on deep learning,” Opt. Express 27(11), 16032–16046 (2019). [CrossRef]  

35. X. Yang, Z. Yu, L. Xu, J. Hu, L. Wu, C. Yang, W. Zhang, J. Zhang, and Y. Zhang, “Underwater ghost imaging based on generative adversarial networks with high imaging quality,” Opt. Express 29(18), 28388–28405 (2021). [CrossRef]  

36. D. Shan and K. Shi, “Solving missing cone problem by deep learning,” Adv. Photonics 2(02), 1 (2020). [CrossRef]  

37. L. Zhang, R. Xu, H. Ye, K. Wang, B. Xu, and D. Zhang, “High definition images transmission through single multimode fiber using deep learning and simulation speckles,” Optics and Lasers in Engineering 140, 106531 (2021). [CrossRef]  

38. D. Pathak, P. Krahenbuhl, J. Donahue, and T. Darrell, “Context encoders: Feature learning by inpainting,” in IEEE Conf. on Computer Vision and Pattern Recognition, (IEEE, 2016), pp. 2536–2544.

39. X. -J. Mao, C. Shen, and Y. -B. Yang, “Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections,” arXiv Preprint arxiv:1603.09056 (2016).

40. O. Ronneberger, P. Fischer, and T. Brox, “U-Net convolutional networks for biomedical image segmentation,” in Proc. Int. Conf. Medical Image Computing and Computer-Assisted Intervention, (Medical Image Computing and Computer Assisted Intervention Society, 2015), pp. 234–241.

41. Z. Gao, L. Wang, and G. Wu, “LIP: local importance-based Pooling,” arXiv Preprint arXiv:1908.04156v3 (2019).

42. P. Isola, J. Zhu, T. Zhou, and A. Efros, “Image-to-image translation with conditional adversarial networks,” in IEEE Conf. on Computer Vision and Pattern Recognition,(IEEE, 2017), pp. 5967–5976.

43. L. Deng, “The MNIST database of handwritten digit images for machine learning research [Best of the Web],” IEEE Signal Process. Mag. 29(6), 141–142 (2012). [CrossRef]

44. https://Quickdraw.withgoogle.com/data.

45. H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms,” arXiv Preprint arXiv:1708.07747v2 (2017).

46. A. G. Asuero, A. Sayago, and A. G. González, “The Correlation Coefficient: An Overview,” Crit. Rev. Anal. Chem. 36(1), 41–59 (2006). [CrossRef]  

47. W. Zhou and A. C. Bovik, “Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures,” IEEE Signal Process. Mag. 26(1), 98–117 (2009). [CrossRef]  
