
DNF: diffractive neural field for lensless microscopic imaging

Open Access

Abstract

Lensless imaging has emerged as a robust means for the observation of microscopic scenes, enabling vast applications like whole-slide imaging, wave-front detection, and microfluidic on-chip imaging. Such a system captures diffractive measurements in a compact optical setup without an optical lens, and then typically applies phase retrieval algorithms to recover the complex field of the target object. However, existing techniques still suffer from unsatisfactory performance with noticeable reconstruction artifacts, especially when the imaging parameters are not well calibrated. Here we propose a novel unsupervised Diffractive Neural Field (DNF) method that accurately characterizes the physical imaging process to reconstruct the desired complex field of the target object from very limited measurement snapshots, by jointly optimizing the imaging parameters and the implicit mapping between spatial coordinates and the complex field. Both simulations and experiments reveal the superior performance of the proposed method, with > 6 dB PSNR (Peak Signal-to-Noise Ratio) gains on synthetic data and clear qualitative improvement on real-world samples. The proposed DNF also promises attractive prospects in practical applications because of its ultra-lightweight complexity (e.g., 50× model size reduction) and plug-and-play advantage (e.g., random measurements with a coarse parameter estimation).

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

With the rapid growth of applications in optofluidic on-chip imaging, wave-front sensing, cell counting, fluorescence imaging, digital pathology, and endoscopy [1–7], lensless imaging has recently become an attractive microscopic realization for wide field of view (FOV) observation. In such a lensless imaging system, the target object or specimen is placed very close to the image sensor for acquisition without requiring any optical lens, offering a compact optical setup in practice. By removing the optical lens, it directly records the diffractive measurements and thus relaxes the constraint between imaging resolution and FOV in conventional lens-based systems [8,9] for improved imaging capacity. To produce diffractive snapshots for measurement, a coherent (or partially coherent) light source is typically used for illumination in a given lensless microscope. By performing phase retrieval, both the in-focus amplitude intensity and the phase information of the target object can then be recovered from these diffractive measurements. To best ensure the reconstruction performance, various coherent lensless schemes have been reported in the past decades [10–17], where multiple diverse measurements are often utilized to make phase retrieval robust, such as axially moving the sensor to multiple heights [10,11], laterally shifting the pinhole probe to different positions [12,13], and illuminating the object with angle-varied [14,15] or multi-wavelength light [16,17].

On the other hand, recent years have witnessed the explosive growth of deep learning-based algorithms for reconstruction optimization in lensless imaging scenarios [18–23]. Most of them resort to supervised learning [18,19], which heavily relies on an excessive amount of training data. Unfortunately, having abundant training samples is often intractable in the microscopic field. Thus, numerous attempts have been made to explore unsupervised learning, which does not require a large amount of paired training data. Notable examples are the use of untrained networks for various tasks as reported in [20–23].

As mentioned above, both learning-based approaches and traditional iterative methods have been developed for high-performance lensless imaging [12–15,18,20,23]. However, the accurate estimation of the parameters of the underlying imaging process remains a challenge, and even a small inaccuracy can severely degrade the reconstruction. It is possible to calibrate or preset the parameters in advance, but doing so is tedious and burdensome, making the deployment of existing solutions inconvenient and cumbersome. Even with full knowledge of the imaging model parameters, there is still room for more robust reconstruction by characterizing the imaging physics more reliably or introducing more effective constraints into the recovery.


Fig. 1. Multi-height lensless imaging model and the pipeline of the proposed DNF. Overall, the proposed DNF accepts intensity snapshots of a given object acquired at multiple heights, optimizes the implicit function between the spatial coordinates of the object sample and its complex field through the use of MLPs and a reconstruction loss, and finally outputs optimized amplitude and phase images of the target object. Two MLP networks are devised to connect the spatial coordinates of the object sample to its amplitude and phase representations, respectively. The loss is evaluated between the real measurements and propagated intensity images (through diffractive propagation of the complex field recovered by the MLPs) to supervise the optimization of the MLPs.


Therefore, this work proposes a novel unsupervised model, namely the diffractive neural field (DNF), which is inspired by the successful neural radiance field used in novel view synthesis [24,25], and exemplifies it for complete complex field reconstruction in a self-built multi-height lensless imaging system shown in Fig. 1. We leverage a few random intensity snapshots of a given specimen acquired at different heights to instantaneously optimize the DNF and derive the complex field of the target object. By embedding the Fresnel propagation of the imaging physics, the DNF effectively formulates the function connecting the spatial coordinates of the object plane and the complex field. Note that the DNF is a fully differentiable continuous function, allowing us to jointly optimize the imaging model parameters (such as the propagation heights used in this multi-height imaging system) and the implicit mapping function to best recover both intensity and phase information. As revealed in extensive studies, the proposed method offers state-of-the-art performance on both synthetic data (e.g., > 6 dB PSNR improvement quantitatively) and real-world samples (e.g., better qualitative visualization), and also presents lightweight complexity that is attractive for practical applications, e.g., 50$\times$ model size reduction and faster convergence compared with existing deep learning-based methods (see Table 1). Notably, the proposed DNF method only requires a few random snapshots as raw measurements with coarse priors on the imaging parameters (e.g., the height shown in the exemplified setup of Fig. 1), making the system easy to deploy in practice.

2. Diffractive neural field representation for lensless microscopic imaging

The proposed unsupervised DNF representation is generally applicable to different realizations of lensless microscopy. Here we take the multi-height realization as an example to comprehensively illustrate the proposed method. In such a multi-height system, diffractive measurements are acquired at multiple object-to-sensor distances. To this end, as shown in the left panel of Fig. 1, we built a multi-height lensless imaging prototype comprising the light source, the object to be observed, the image sensor, and a motion stage. The image sensor is mounted on the motor-driven motion stage for axial movement to facilitate the multi-height acquisition.

2.1 Lensless imaging model

We use a coherent laser source placed far enough from the object sample (or specimen) to generate normally-incident illumination. As such, the illumination pattern can be modeled as an invariant plane wave $P(x,y)\in \mathbb {C}^{m\times m}$, having uniform illumination intensity at any given 2D spatial position $(x,y)$ on the object plane. Here $\mathbb {C}$ represents the complex field, and $m\times m$ indicates the size of the acquired image. We further denote the signal distribution of the target object as $O(x,y)$ at 2D coordinates $(x,y)$; the illuminated complex field at the object plane is then $P(x,y)\cdot O(x,y)$, where the operator $\cdot$ represents the dot (element-wise) product. Based on wave propagation theory, the complex field $CF_z$ propagating across a given height $z$ onto the acquisition sensor plane is

$$CF_z(x,y)={ PSF_z}*\left[P(x,y)\cdot O(x,y)\right],$$
where ${ PSF_z}$ is the point spread function (PSF) of Fresnel propagation over distance $z$ (i.e., the axial height between the sample and the image sensor) with $z = z_1,z_2,\ldots,z_N$. $N$ is the total number of measurements, and $*$ is the convolution operator. Since the image sensor can only acquire the intensity of the light wave, the measurement, i.e., the raw diffractive image, at height $z$ is formulated as
$$I_z(x,y)=\left|CF_z(x,y)\right|^{2}=\left|{ PSF_z}(x,y)*\left[P(x,y)\cdot O(x,y)\right]\right|^{2}.$$

Here $|\cdot |$ denotes the modulus of a complex value. Our purpose is to determine $O(x,y)$ of the target object from multiple measurements $I_z(x,y),~z=z_1,z_2,\ldots,z_N$. Note that the illumination $P(x,y)$ has uniform intensity, i.e., $P(x,y)$ is a constant matrix. Our purpose is therefore equivalent to estimating $P(x,y)\cdot O(x,y)$ instead. For simplicity, we write this product as $O(x,y)$ in subsequent discussions.
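
To make the forward model concrete, the following PyTorch sketch simulates Eqs. (1) and (2) using the Fourier-domain propagation introduced later in Eq. (5); the pixel-pitch argument and the masking of evanescent components are our assumptions, not specified by the equations above.

```python
import torch

def angular_spectrum_ctf(m, pixel_size, wavelength, z):
    """Coherent transfer function CTF_z = exp(j*sqrt(k0^2 - kx^2 - ky^2)*z), cf. Eq. (5)."""
    k0 = 2 * torch.pi / wavelength
    # Angular-frequency grid (kx, ky) for an m x m field sampled at pixel_size.
    k = 2 * torch.pi * torch.fft.fftfreq(m, d=pixel_size)
    kx, ky = torch.meshgrid(k, k, indexing="ij")
    kz_sq = k0**2 - kx**2 - ky**2
    kz = torch.sqrt(torch.clamp(kz_sq, min=0.0))
    # Evanescent components (kz_sq < 0) are masked out (an assumption; see also [26]).
    return torch.exp(1j * kz * z) * (kz_sq > 0)

def propagate_intensity(obj_field, ctf):
    """|PSF_z * [P . O]|^2 of Eq. (2), with the convolution done in Fourier space."""
    field_z = torch.fft.ifft2(torch.fft.fft2(obj_field) * ctf)
    return field_z.abs() ** 2
```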

2.2 Diffractive neural field

As mentioned, we propose the unsupervised DNF framework to recover $O(x,y)$. The proposed DNF is composed of two parts: multilayer perceptron (MLP) networks for representing complex fields, and a physics-informed unsupervised loss function.

2.2.1 Representing amplitude and phase using neural networks

We represent the distribution $O(x,y)$ of the specimen as a 2D vector-valued implicit function $g_{\theta }(x,y)$. Its input is a 2D location $(x,y)$ and its output is a diffractive value $(A,\phi )$, representing the amplitude and phase of $O(x,y)$. In practice, this implicit function is modeled with an MLP, $g_{\theta }:\: (x,y)\rightarrow (A,\phi )$, where $\theta$ represents the weights and biases of the MLP. The MLP network is composed of 10 layers: 1 encoding layer, 8 hidden fully connected layers, and 1 output layer. Each hidden layer contains 256 neurons and is activated by a ReLU function. The output layer contains only 1 neuron and is activated by a Tanh function. The architecture of the MLP network is shown in Fig. 2.
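
A minimal sketch of one such coordinate MLP follows (the pipeline uses two, one each for amplitude and phase); the input width of 40 assumes the positional encoding of Eq. (3) below with 10 frequencies for each of $x$ and $y$.

```python
import torch.nn as nn

class FieldMLP(nn.Module):
    """One coordinate MLP g_theta: PE(x, y) -> A (or phi).
    8 hidden fully connected layers of 256 ReLU neurons, 1 Tanh output neuron."""
    def __init__(self, in_features=40, hidden=256, num_hidden=8):
        super().__init__()
        layers = [nn.Linear(in_features, hidden), nn.ReLU()]
        for _ in range(num_hidden - 1):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        layers += [nn.Linear(hidden, 1), nn.Tanh()]
        self.net = nn.Sequential(*layers)

    def forward(self, coords_pe):
        # coords_pe: (batch, in_features) positionally encoded coordinates.
        return self.net(coords_pe)

amp_mlp, phase_mlp = FieldMLP(), FieldMLP()  # amplitude and phase branches
```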


Fig. 2. Training pipeline of the proposed unsupervised DNF for high-accuracy lensless imaging.


To better learn the high-frequency components of the amplitude and phase images from the input coordinates, a positional encoding (PE) layer [25] is first applied to the input coordinates before the MLP mapping. The PE is defined as

$$\begin{aligned} PE(x,y)= & \{\cos(2\pi s\sigma_i x), \sin(2\pi s\sigma_i x)\} \\ & \cup \{\cos(2\pi s\sigma_i y), \sin(2\pi s\sigma_i y)\}, i\in \{1,2,\ldots,10\}, \end{aligned}$$
where $\sigma _i \sim \mathcal {N}(0,1)$ is randomly sampled from a standard Gaussian distribution, and $s$ ($=2$ in all experiments) is a scale value chosen according to the scene.
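
A sketch of Eq. (3) in PyTorch; sampling $\sigma$ once and reusing it, and normalizing the coordinates before encoding, are our assumptions.

```python
import torch

def positional_encoding(coords, sigma, s=2.0):
    """PE of Eq. (3). coords: (batch, 2) normalized (x, y); sigma: (10,) ~ N(0, 1)."""
    x, y = coords[:, :1], coords[:, 1:]  # (batch, 1) each
    ax = 2 * torch.pi * s * sigma * x    # broadcasts to (batch, 10)
    ay = 2 * torch.pi * s * sigma * y
    return torch.cat([torch.cos(ax), torch.sin(ax),
                      torch.cos(ay), torch.sin(ay)], dim=-1)  # (batch, 40)

sigma = torch.randn(10)  # sigma_i ~ N(0, 1), drawn once and then kept fixed
```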

2.2.2 Unsupervised loss for lensless imaging

Different from previous supervised convolutional neural network (CNN) based methods, which require a huge amount of paired training data, the proposed DNF representation can be optimized from just a few snapshots. For lensless imaging, given $N$ measurements $\{I_{z}(x,y)\}$ from different heights $z$ ($\:z\in \{z_1,z_2,\ldots,z_N\}$), a physics-informed unsupervised loss function is defined as

$$\mathcal{L}=\sum_{z=z_1}^{z_N}\parallel |g_{\theta}(x,y)*PSF_{z}|^{2}-I_{z}(x,y)\parallel_{1},$$
where $\parallel \cdot \parallel _{1}$ is the $L_1$-norm of a matrix. This loss function mathematically models the forward imaging process. Following the imaging model in Eqs. (1) and (2), $g_{\theta }(x,y)$ describes the complex field at the object plane output by the network, and $|g_{\theta }(x,y)*PSF_{z}|^{2}$ represents the propagated image at the different $z$ positions. The propagated images are then compared with the real measurements to supervise the network. In practice, to ensure the accuracy of propagation and the efficiency of training, the convolution operation is implemented in the Fourier domain using angular spectrum theory [26]. Therefore, the calculation of the PSF can be converted to the coherent transfer function (CTF) in Fourier space, i.e.,
$$\begin{aligned} { CTF_z} & =e^{{\rm j}\cdot\sqrt{k_0^{2}-k_x^{2}-k_y^{2}}\cdot z} \\ { PSF_z} & =\mathcal{F}^{{-}1}\left({ CTF_z}\right), \end{aligned}$$
where $\mathcal {F}^{-1}(\cdot )$ represents the inverse Fourier transform, ${\rm j}$ is the imaginary unit, $\vec {k}=(k_x,k_y)$ represents the coordinates in Fourier space, $k_0=2\pi /\lambda$ is the wave number, and $\lambda$ is the central wavelength of the light source in use. As a result, the loss function is rewritten as
$$\mathcal{L}=\sum_{z=z_1}^{z_N}\parallel |\mathcal{F}^{{-}1}(\mathcal{F}(g_{\theta}(x,y))\cdot CTF_{z})|^{2}-I_{z}(x,y)\parallel_{1}.$$
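
In code, Eq. (6) reduces to a few FFT calls. Below is a sketch that consumes the MLP outputs reshaped to $m\times m$ images along with the CTFs; using the mean rather than the sum for the $L_1$ term is our choice and only rescales the gradients.

```python
import torch

def dnf_loss(amp, phase, ctfs, measurements):
    """Physics-informed loss of Eq. (6) over all heights z_1 ... z_N."""
    field = amp * torch.exp(1j * phase)  # g_theta(x, y) assembled as a complex field
    loss = 0.0
    for ctf, I_z in zip(ctfs, measurements):
        pred = torch.fft.ifft2(torch.fft.fft2(field) * ctf).abs() ** 2
        loss = loss + (pred - I_z).abs().mean()  # L1 between propagated and measured
    return loss
```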

2.3 Jointly optimizing the complex field and height

In the setup of multi-height lensless imaging shown in Fig. 1, the height $z$ of each measurement requires substantial effort to calibrate or estimate for complex field reconstruction. Even a small approximation error can render the complex field of the object unrecoverable. In the proposed DNF framework, the parameter $z$ can be set randomly according to a coarse height range. Since the entire pipeline, including the MLP and the loss function (Eq. (6)), is fully differentiable, the implicit function $O(x,y)$ and the heights $z$ can be optimized simultaneously.

Refinement. Although the joint optimization can produce reasonable results, the DNF may fall into local minima when the heights are not perfectly optimized, resulting in artifacts in the reconstructed complex field. To escape such local minima, an additional refinement procedure is applied: the weights and bias parameters are re-initialized randomly, while the heights $z$ are initialized with the optimal values found in the first joint optimization.

Note that the MLP network used by the DNF takes the coordinates as input, while the few random snapshots are used to build the physics-informed loss function (Eq. (6)). The height parameters are set as learnable parameters in the network and can be initialized according to a coarse height range. During training, both the height parameters and the network parameters, such as the weights and biases of each neuron, are optimized. Once training is finished, the heights have converged and can be read directly from the network. Meanwhile, the complex field of the specimen is obtained by feeding the coordinates $(x,y)$ (e.g., $(1,1),(1,2),\ldots,(1,W),\ldots,(L,W)$, where $L$ and $W$ are the length and width of the specimen image, respectively) to the MLP network successively. Because the mapping function between the coordinates and the corresponding complex field varies from specimen to specimen, the trained MLPs are unsuitable for other specimens, and the networks must be retrained for new scenes.
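
Putting the pieces together, a minimal sketch of the joint optimization follows, reusing the hypothetical helpers from the earlier blocks; the coordinate normalization, the phase scaling by $\pi$ after the Tanh output, the iteration count, and the placeholder measurements are all our assumptions.

```python
import torch

m, wavelength, pixel = 300, 638e-9, 1.67e-6          # illustrative values
I_zs = [torch.rand(m, m) for _ in range(8)]          # placeholders for real measurements
z = torch.nn.Parameter(torch.full((8,), 0.675e-3))   # heights as learnable parameters
amp_mlp, phase_mlp = FieldMLP(), FieldMLP()
opt = torch.optim.Adam(list(amp_mlp.parameters()) +
                       list(phase_mlp.parameters()) + [z], lr=5e-4)

ys, xs = torch.meshgrid(torch.arange(m), torch.arange(m), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2).float() / m  # all pixels, one batch
pe = positional_encoding(coords, sigma)

for it in range(2000):
    amp = amp_mlp(pe).reshape(m, m)
    phase = phase_mlp(pe).reshape(m, m) * torch.pi    # map Tanh output to (-pi, pi)
    # Rebuild the CTFs from z at every step so gradients flow into the heights.
    ctfs = [angular_spectrum_ctf(m, pixel, wavelength, zi) for zi in z]
    loss = dnf_loss(amp, phase, ctfs, I_zs)
    opt.zero_grad(); loss.backward(); opt.step()

print(z.detach())  # converged height estimates, read directly from the network
```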


Table 1. Comparisons between DNF and state-of-the-art trained neural networks in theory.

2.4 DNF vs state-of-the-art trained neural networks

Generally, image formation can be modeled as

$$I=f(g(\vec{x_1}),g(\vec{x_2}),\ldots, g(\vec{x_n})),$$
where $\vec {x_i}$ refers to the coordinate of the $i$-th point in space and $g(\vec {x_i})$ is the property of the point $\vec {x_i}$. $f$ models the imaging process, such as the ray marching process of the pinhole model [24], the cone tracing process of the thin-lens model [27], or the diffractive propagation process of the lensless imaging model in this work. Specifically, in 2D lensless imaging, $\vec {x_i}$ is a 2D coordinate in the specimen, $g(\vec {x_i})$ represents the amplitude $A$ and phase $\phi$ at point $\vec {x_i}$, and $f$ represents the convolution process with propagation height $z$ in the multi-height lensless imaging.

In real-world applications, the imaging function $f$ is often not invertible. Previous learning-based methods focus on modeling $f^{-1}$ as a convolutional neural network (CNN) and recovering $g$ by feeding $I$ into the network. Specifically, a trained network optimizes its parameters by comparing the estimate $\tilde {g}$ with the ground truth $g$, and consequently a large paired dataset is required. Different from previous trained networks, the DNF models $g$ itself as a fully connected network fed with the coordinates of each point $\vec {x_i}$: DNF learns a mapping between the coordinates and the corresponding properties. The network is optimized by first applying $f$ to the estimate $\tilde {g}$ and then comparing $f(\tilde {g})$ with $I$.

Since previous CNN-based methods [18] focus on analyzing features from the input image, a large number of convolutional parameters is used for feature extraction. This large number of parameters not only increases the cost of training (ranging from several days to weeks) and storage (>100 MB) but also introduces the risk of overfitting. On the contrary, the proposed DNF requires far fewer parameters ($\approx$2 MB) for modeling the diffractive field and converges much faster ($\approx$2 hours for a $1024\times 1024$ image) than previous CNN-based methods. Besides, the proposed DNF can approximate the imaging parameters simultaneously with the complex field reconstruction, whereas all previous methods require additional calibration. This makes our solution more tractable in real-life applications. Table 1 summarizes the differences between DNF and state-of-the-art trained neural networks.

3. Results

3.1 Implementation details of DNF

We implement the MLP using the PyTorch library. The Adam optimizer is chosen to minimize the $L_1$ loss between the reconstructed measurements and the recorded measurements, where the learning rate is set to $5\times 10^{-4}$ and reduced by a factor of $0.1$ every 1000 iterations. In each iteration, the positions of all pixels are combined into one batch and fed into the network for training (e.g., the batch size is $90000$ for a $300\times 300$ image). All weights and bias parameters used in the MLP are initialized following the Gaussian distribution $\mathcal {N}(\sqrt {k}, \sqrt {k})$, where $k$ is the reciprocal of the number of input features.
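
For the learning-rate schedule, a StepLR-style decay matches this description; the scheduler class is our choice, since the text does not name one.

```python
import torch

params = [torch.nn.Parameter(torch.zeros(1))]  # stands in for the MLP weights and heights
opt = torch.optim.Adam(params, lr=5e-4)        # learning rate 5e-4, as stated
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=1000, gamma=0.1)
# Per iteration: opt.zero_grad(); loss.backward(); opt.step(); sched.step()
```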

3.2 Synthetic data

In the simulation, two images (see Fig. 3(a)) containing both high- and low-frequency components are used as the amplitude and phase images of the object, simulating an extremely tough task for complex field recovery in lensless imaging applications. The measurements are generated following Fresnel propagation theory, and the heights of the 8 measurements are set to $0.50, 0.55, \ldots, 0.85$ mm, respectively. The wavelength of the light source is set to $\lambda =638\rm ~nm$.
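
Under these settings, the synthetic measurements can be generated with the forward-model sketch from Sec. 2.1; the stand-in images and the pixel pitch are our assumptions.

```python
import torch

amp = torch.rand(300, 300)                     # stand-in for the ground-truth amplitude
phi = torch.rand(300, 300) * torch.pi          # stand-in for the ground-truth phase
obj = amp * torch.exp(1j * phi)
heights = torch.linspace(0.50e-3, 0.85e-3, 8)  # 0.50, 0.55, ..., 0.85 mm
I_zs = [propagate_intensity(obj, angular_spectrum_ctf(300, 1.67e-6, 638e-9, z))
        for z in heights]
```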


Fig. 3. Performance comparison between the proposed method and the traditional GS method on the synthetic data. (a) Ground truth images of amplitude and phase. (b) Results of the DNF with arbitrary height inputs. (c) GS results using the heights from ToG optimization [28]. (d) Results of GS with the same arbitrary heights. (e) The 1st, 3rd, and 5th measurements.


Height Recovery. To verify the convergence of the proposed DNF framework for the height parameters, three experiments are conducted on the synthetic data, where the heights are initialized to $0.4$ (below the minimum height of $0.5$), $0.675$ (the center of the height range), and $1.0$ (above the maximum height of $0.85$). Each experiment is repeated 5 times to reduce random error. Table 2 compares the reconstructed heights (with means and standard deviations) against the ground truth. We notice that DNF cannot output reliable heights when the initial values are outside the height range ($[0.5, 0.85]$), e.g., the results with initial heights $0.4$ and $1.0$. It is therefore suggested to set the initial values within the height range (e.g., $0.675$). In real situations, a rough knowledge of the height range can actually be obtained from the experimental setting. We also notice that the ToG optimization [28] can recover the height parameters fairly accurately, but it endures a long convergence time. Our proposed DNF not only further improves the accuracy of the height estimation (the error rate is halved) but also avoids the separate iterative search for convergence.


Table 2. Comparisons between recovered heights and the ground truth (unit: mm). The central rows show the means and standard deviations of different methods. The last 3 rows show the error rate of different methods. The numbers after ‘DNF’ refer to the initial values for heights.

Amplitude and Phase Reconstruction. We then compare the DNF with the traditional Gerchberg-Saxton (GS) algorithm [29,30]. The qualitative comparisons are shown in Fig. 3. The proposed DNF method outputs reliable results with arbitrary height parameters (within a coarse height range) as the initial guess. On the contrary, the traditional GS method fails (Fig. 3(d)) under the same arbitrary height configuration. The performance of the GS method improves when the height parameters from ToG optimization [28] are adopted (Fig. 3(c)). For phase recovery, the GS method can only recover the boundaries of the dog and the chimpanzee, while our method recovers more details such as the eyes of the dog and the nose of the chimpanzee. In this hard simulation task, our DNF method reconstructs both the amplitude and phase well, together with more accurate height estimation, while the GS method with ToG optimization cannot retrieve the phase accurately.

3.3 Real-world data

To demonstrate the effectiveness of our method, we built a general optical system for multi-height lensless microscopy. A fiber-coupled laser source is used to provide normally incident light (638 nm wavelength, Thorlabs). An image sensor with 1.67 µm pixel size and 3872 (W) $\times$ 2764 (H) pixels (The Imaging Source DMM 27UJ003-ML USB 3.0 monochrome board camera) is mounted on a motion stage for axial movement. The sample is initially placed about 0.5 mm above the image sensor. Then 8 raw images are sequentially captured by moving the image sensor to different $z$ positions in steps of about 50 µm. A pixel alignment of the 8 raw images in the x-y directions is applied before reconstruction.

We show experiments on the USAF-1951 resolution target in Fig. 4. The proposed DNF framework is tolerant to arbitrary height inputs; Fig. 4(a) shows the corresponding results. Figure 4(b) shows the results of the traditional GS method with heights from ToG optimization [28]. The proposed DNF provides clearer images than the traditional combination, as indicated by the orange and purple arrows in Figs. 4(a) and 4(b). When the same arbitrary heights are input, the traditional GS method fails under such a wrong height parameter configuration (Fig. 4(c)).


Fig. 4. Comparison of the reconstructed amplitude on the USAF-1951 resolution target. (a) DNF results with arbitrary height input. (b) GS results using the heights from ToG optimization [28]. (c) GS results using the same arbitrary input as (a).



Fig. 5. Comparison of the reconstructed amplitude and phase images. The top large-scale image is the full-FOV reconstruction of the tomato flesh section. The left-bottom and right-bottom panels show close-ups of the amplitude and phase images within the red and blue boxes of the full-FOV image, respectively. In each panel, we respectively show the results by DNF and GS with searched heights [28].


We perform a large-scale imaging reconstruction of a tomato flesh section using our proposed DNF, as shown in Fig. 5. The reconstructed amplitude and phase images from both the DNF and GS methods are exhibited accordingly. In the red box, saturation appears in both the amplitude and phase fields reconstructed by the GS method (yellow arrows in Fig. 5). On the contrary, the proposed DNF method provides stable results in this area. Additionally, the amplitude image from the GS method is heavily influenced by background noise, which makes the content hard to distinguish. For example, in the blue box of Fig. 5, the dot indicated by the purple arrow can be clearly identified in the phase image, yet it is drowned out by background noise in the amplitude image reconstructed by the GS method. In comparison, it can be clearly observed in both the amplitude and phase images reconstructed by the DNF method.

4. Conclusion

This work has proposed an unsupervised DNF to accurately recover the amplitude and phase information in lensless imaging from a set of random snapshots, requiring only rough prior knowledge of the imaging parameters. This gives the proposed approach its plug-and-play characteristics. Meanwhile, the DNF model does not need a large pre-labeled training dataset, offering a lightweight implementation that is attractive for practical applications. Experiments on both synthetic and real data have verified the performance and robustness of the amplitude/phase recovery in lensless imaging applications. Although we exemplify the DNF in lensless imaging in this paper, we believe the proposed DNF framework can easily be extended to other microscopic imaging systems.

Appendix A: Optimizing a complex field using MLPs

Optimizing a $1024\times 1024$ complex field is a difficult, long-standing task for traditional optimization; however, it can be well modeled using machine learning techniques. The proposed DNF method models the imaging complex field as a function of the spatial coordinates of a given object sample. By Taylor's theorem, any sufficiently smooth function can be expressed as a sum of polynomials to within an acceptable error, so the complex field can also be represented using polynomials. The MLP used in the proposed DNF is composed of multiple linear functions and nonlinear activations that can fit any such polynomial, as extensively studied in [31,32]. The 2 MB model stores the weights and biases of these linear functions for synthesizing a large number of desired functions.

Funding

National Natural Science Foundation of China (62025108, 62071219, 62101242).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. G. Zheng, P. Song, C. Guo, S. Jiang, T. Wang, P. Hu, D. Hu, Z. Zhang, and B. Feng, “Optofluidic ptychography on a chip,” Lab Chip 21, 1 (2021).

2. S. B. Kim, H. Bae, K.-i. Koo, M. R. Dokmeci, A. Ozcan, and A. Khademhosseini, “Lens-free imaging for biological applications,” J. Lab. Autom. 17(1), 43–49 (2012).

3. A. Greenbaum, Y. Zhang, A. Feizi, P.-L. Chung, W. Luo, S. R. Kandukuri, and A. Ozcan, “Wide-field computational imaging of pathology slides using lens-free on-chip microscopy,” Sci. Transl. Med. 6(267), 267 (2014).

4. Y. Wu, M. K. Sharma, and A. Veeraraghavan, “Wish: wavefront imaging sensor with high resolution,” Light: Sci. Appl. 8(1), 44 (2019).

5. S. A. Lee, X. Ou, J. E. Lee, and C. Yang, “Chip-scale fluorescence microscope based on a silo-filter complementary metal-oxide semiconductor image sensor,” Opt. Lett. 38(11), 1817–1819 (2013).

6. J. Shin, D. N. Tran, J. R. Stroud, S. Chin, T. D. Tran, and M. A. Foster, “A minimally invasive lens-free computational microendoscope,” Sci. Adv. 5(12), eaaw5595 (2019).

7. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “Diffusercam: lensless single-exposure 3d imaging,” Optica 5(1), 1–9 (2018).

8. A. Greenbaum, W. Luo, T.-W. Su, Z. Göröcs, L. Xue, S. O. Isikman, A. F. Coskun, O. Mudanyali, and A. Ozcan, “Imaging without lenses: achievements and remaining challenges of wide-field on-chip microscopy,” Nat. Methods 9(9), 889–895 (2012).

9. A. Ozcan and E. McLeod, “Lensless imaging and sensing,” Annu. Rev. Biomed. Eng. 18(1), 77–102 (2016).

10. A. Greenbaum and A. Ozcan, “Maskless imaging of dense samples using pixel super-resolution based multi-height lensfree on-chip microscopy,” Opt. Express 20(3), 3129–3143 (2012).

11. C. Guo, S. Jiang, P. Song, T. Wang, X. Shao, Z. Zhang, and G. Zheng, “Quantitative multi-height phase retrieval via a coded image sensor,” Biomed. Opt. Express 12(11), 7173–7184 (2021).

12. A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy 109(10), 1256–1262 (2009).

13. A. Maiden, D. Johnson, and P. Li, “Further improvements to the ptychographical iterative engine,” Optica 4(7), 736–745 (2017).

14. Z. Zhang, Y. Zhou, S. Jiang, K. Guo, K. Hoshino, J. Zhong, J. Suo, Q. Dai, and G. Zheng, “Invited article: Mask-modulated lensless imaging with multi-angle illuminations,” APL Photonics 3(6), 060803 (2018).

15. Y. Zhou, X. Hua, Z. Zhang, X. Hu, K. Dixit, J. Zhong, G. Zheng, and X. Cao, “Wirtinger gradient descent optimization for reducing gaussian noise in lensless microscopy,” Opt. Lasers Eng. 134, 106131 (2020).

16. W. Luo, Y. Zhang, A. Feizi, Z. Göröcs, and A. Ozcan, “Pixel super-resolution using wavelength scanning,” Light: Sci. Appl. 5(4), e16060 (2016).

17. C. Allier, S. Morel, R. Vincent, L. Ghenim, F. Navarro, M. Menneteau, T. Bordy, L. Hervé, O. Cioni, X. Gidrol, Y. Usson, and J.-M. Dinten, “Imaging of dense cell cultures by multiwavelength lens-free video microscopy,” Cytometry 91(5), 433–442 (2017).

18. A. Sinha, J. Lee, S. Li, and G. Barbastathis, “Lensless computational imaging through deep learning,” Optica 4(9), 1117–1125 (2017).

19. K. de Haan, Y. Rivenson, Y. Wu, and A. Ozcan, “Deep-learning-based image reconstruction and enhancement in optical microscopy,” Proc. IEEE 108(1), 30–50 (2019).

20. F. Wang, Y. Bian, H. Wang, M. Lyu, G. Pedrini, W. Osten, G. Barbastathis, and G. Situ, “Phase imaging with an untrained neural network,” Light: Sci. Appl. 9(1), 77 (2020).

21. K. Monakhova, V. Tran, G. Kuo, and L. Waller, “Untrained networks for compressive lensless photography,” Opt. Express 29(13), 20913–20929 (2021).

22. X. Zhang, F. Wang, and G. Situ, “Blindnet: an untrained learning approach toward computational imaging with model uncertainty,” J. Phys. D: Appl. Phys. 55(3), 034001 (2021).

23. E. Bostan, R. Heckel, M. Chen, M. Kellman, and L. Waller, “Deep phase decoder: self-calibrating phase microscopy with an untrained deep neural network,” Optica 7(6), 559–562 (2020).

24. B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “NeRF: Representing scenes as neural radiance fields for view synthesis,” in European Conference on Computer Vision (Springer, 2020), pp. 405–421.

25. M. Tancik, P. P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. T. Barron, and R. Ng, “Fourier features let networks learn high frequency functions in low dimensional domains,” in 34th Conference on Neural Information Processing Systems (2020).

26. K. Matsushima and T. Shimobaba, “Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields,” Opt. Express 17(22), 19662–19673 (2009).

27. J. T. Barron, B. Mildenhall, M. Tancik, P. Hedman, R. Martin-Brualla, and P. P. Srinivasan, “Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (2021), pp. 5855–5864.

28. Y. Zhang, H. Wang, Y. Wu, M. Tamamitsu, and A. Ozcan, “Edge sparsity criterion for robust holographic autofocusing,” Opt. Lett. 42(19), 3824–3827 (2017).

29. Y. Shechtman, Y. C. Eldar, O. Cohen, H. N. Chapman, J. Miao, and M. Segev, “Phase retrieval with application to optical imaging: a contemporary overview,” IEEE Signal Process. Mag. 32(3), 87–109 (2015).

30. J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. 21(15), 2758–2769 (1982).

31. A. R. Barron, “Universal approximation bounds for superpositions of a sigmoidal function,” IEEE Trans. Inf. Theory 39(3), 930–945 (1993).

32. A. R. Barron, “Approximation and estimation bounds for artificial neural networks,” Mach. Learn. 14(1), 115–133 (1994).
