
Physics-informed deep neural network for image denoising

Open Access

Abstract

Image enhancement deep neural networks (DNNs) can improve the signal-to-noise ratio or resolution of optically collected visual information. The literature reports a variety of approaches with varying effectiveness. All these algorithms rely on arbitrary data (pixel count-rate) normalization, making their performance strongly affected by dataset- or user-specific data pre-manipulation. We developed a DNN algorithm capable of enhancing image signal-to-noise ratio beyond previous algorithms. Our model stems from the nature of the photon detection process, which is characterized by inherently Poissonian statistics. Our algorithm is thus driven by the distance between probability functions rather than relying on the count-rate alone, producing high-performance results especially in high-dynamic-range images. Moreover, it does not require any arbitrary image renormalization other than the transformation of the camera’s count-rate into photon number.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In any optical and non-optical imaging technology, measurement comes with noise, producing a signal that follows a Poisson probability distribution (PPD). Signal enhancement algorithms improve the amount of information by increasing the signal-to-noise ratio (SNR) [1–17], enabling the modeling and visualization of biological data, microscopy images, medical imaging, computed tomography, positron emission tomography, and other in-vivo imaging technologies.

Deep neural network (DNN)-based algorithms [18–26] achieve the best signal enhancement results. The performance and trainability of DNNs, however, depend both on the chosen loss function – a quantity comparing predictions and ground truth (GT) which is minimized to learn the internal parameters – and on the normalization of the network inputs and targets [27]. Two commonly used loss functions for denoising and other image enhancement tasks are the L1-norm and the L2-norm (MSE) [21,24,28,29], where the data are arbitrarily normalized. On the other hand, when the desired output comes from a known probability distribution, as in semantic segmentation (U-net [30]) or other classification tasks, an entropic loss function and a probabilistic normalization of the data are of great significance. Although accounting for the physics of the camera detection process is known to significantly improve imaging efficiency [31], little research has been done on applying these physical properties in DNNs.

Physics-informed machine learning is a new trend in artificial intelligence [32]. Here, we report a physics-informed DNN that builds on the PPD of signal detection. Our technique aims to provide a general and exportable approach to deal with Poisson-distributed signals: (i) we use a non-arbitrary, physics-based normalization process, (ii) we employ a physics-informed loss function, and (iii) we adopt a DNN architecture that takes advantage of the previous features. First, we remove any arbitrariness in the normalization by working with images in which each pixel count represents a photon number. Then, we design a loss function that considers the distance between probability distributions instead of the distance between count numbers. In particular, we propose the Kullback-Leibler divergence (KL [33]; see also Methods) as a loss function, which enables the algorithm to work with the same efficiency in all dynamic-range windows. Moreover, we employ a DNN architecture capable of classifying each pixel into a predicted photon number, thus preserving the photon-number encoding and meaning in the output images. Our custom DNN employs structures from residual channel attention networks (RCAN [23,35]), previously applied to denoising, and from U-net [30], which is employed for semantic segmentation.

2. Results

2.1 Physics-informed loss function for PPD

One known issue in the training process of a neural network is the normalization of the data: improper or arbitrary normalization leads to a failure in the training [27]. To overcome this issue in denoising DNNs, an (arbitrary) percentile normalization of the images has been proposed; however, the efficiency scores that are reported [21,23] not only depend on the normalization choice, but an extra normalization of the ground truth (GT) image must also be performed to produce the best scores. This ambiguity in the results becomes important especially when a GT is not provided or is unknown. Overall, a robust image normalization scheme is missing from the literature.

To overcome these issues, we use the photon number ${N_{ph}}$ as the physical parameter of our algorithm, avoiding any arbitrary normalization. This is done by a standard calibration procedure that converts the camera counts $D$ into photon numbers ${N_{ph}}$ [31] through the relation $D = g{N_{ph}} + offset$, i.e. an efficiency $g$ plus an offset.
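For illustration, a minimal sketch of this conversion step; the default gain and offset are the calibration values reported in the Methods (g = 0.43, offset = 95), and for any other detector they must be measured:

```python
import numpy as np

def counts_to_photons(counts, gain=0.43, offset=95.0):
    """Convert raw camera counts D into photon numbers via N_ph = (D - offset) / g.

    The default gain and offset are the calibration values used for our camera
    (see the Methods); for any other detector they must be measured.
    """
    photons = (np.asarray(counts, dtype=np.float64) - offset) / gain
    return np.clip(photons, 0.0, None)  # photon numbers cannot be negative
```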

In Fig. 1(a) we review the detection process of a PPD signal. The sample emits photons from a volume V in all directions, and a fraction of them, ${N_{ph}}$, arrive at the detector. The detection of the ${N_{ph}}$ photons originating from the small volume V follows a PPD of average $\mu $, which depends on the detector exposure $\tau $ as $\mu = \tau V\rho $, where $\rho $ is the emitter density, while the variance is $\sigma^2 = \mu $.

Thus, the relative error for each pixel scales as $1/\sqrt \mu $ (the SNR scales as $\sqrt \mu $), producing a fluctuating accuracy in the reconstruction of the density across the detection field (see Fig. 1(b) and (c)). In what follows we refer to $\mu $ as the mean value of the PPD of the GT image. The network predicts a PPD with mean value W. For the GT, we assume that the mean value $\mu $ corresponds to the number of photons that arrive at the detector in a single measurement.
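A short numerical illustration of this scaling (a sketch with arbitrary density, volume, and exposure values, not the experimental parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, V = 50.0, 1.0                          # arbitrary emitter density and volume
for tau in (0.1, 1.0, 10.0):                # three exposure times, tau_1 < tau_2 < tau_3
    mu = tau * V * rho                      # mean of the PPD at this exposure
    n_ph = rng.poisson(mu, size=10_000)     # repeated single-pixel measurements
    rel_err = np.std(n_ph / (tau * V)) / rho   # relative error of the density estimate
    print(f"tau={tau:5.1f}  mu={mu:6.1f}  relative error ~ {rel_err:.3f}"
          f"  (1/sqrt(mu) = {1/np.sqrt(mu):.3f})")
```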


Fig. 1. a, From a small volume V of density $\rho $ of the sample, ${N_{ph}}$ photons are absorbed in the detector and transformed into an electrical signal with efficiency g plus an offset. The longer the exposure $\tau $, the higher the average signal $\mu $ originating from the same density value $\rho$. In the lower left corner, the real probability of the density value inside the volume V; in the lower right corner, the detector absorbs ${N_{ph}}$ photons for different exposure times $\tau = {\tau _1} < \; {\tau _2} < \; {\tau _3}$, producing a PPD with average value $\mu $ and standard deviation $\sigma = \sqrt \mu $. b, Three different measurements of the sample density for the exposure times of panel a. c, The reconstruction of the density inside V for the exposure times of panel a. d, The physics-informed loss function LossKL and the non-physics-informed loss function LossMSE for a single pixel of the sample, for three different GT average values $\mu $, as a function of the relative error $\frac{{\mu - W}}{\sigma }$ between the predicted DNN output W and $\mu $. e, A schematic representation of different DNN architectures: RESUNET, RCAN, autoencoder, U-net. RESUNET is a combination of the other three and uses the physics-informed loss function. f, Different predictions of RCAN and RESUNET (both trained on dataset 1 and evaluated on dataset 1) for a noisy input, compared with the GT.


In Fig. 1(d) we plot two different DNN loss functions that quantify the distance between W and $\mu $ for each pixel as a function of the standard score $({\mu - W} )/\sigma $: the MSE (the state-of-the-art loss in DNNs) and the KL divergence (see Methods)

$$Los{s_{KL}} = ({\mu - W} )\log \left( {\frac{\mu }{W}} \right)$$

This loss function, which contains the information of the PPD, differs from the MSE loss in the following way: while the MSE loss considers the absolute difference between the predicted network output W and the GT value $\mu $, the KL loss considers the relative difference. In this way, the KL loss helps the network to find the best configuration across the full dynamic range and does not penalize only the absolute difference between the GT and the DNN output, a fact that is crucial for the training process.
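As a concrete illustration, a minimal PyTorch-style sketch of this per-pixel loss (Eqs. (1) and (4)), assuming the network output W and the GT target $\mu $ are already expressed in photon numbers; the clamping constant eps is our own addition for numerical stability and is not part of the equation:

```python
import torch

def kl_poisson_loss(W, mu, eps=1e-8):
    """Symmetrized KL divergence between two Poisson distributions of means W and mu,
    Loss_KL = (mu - W) * log(mu / W) (Eqs. (1) and (4)), averaged over all pixels."""
    W = torch.clamp(W, min=eps)    # eps avoids log(0); it is not part of Eq. (1)
    mu = torch.clamp(mu, min=eps)
    return torch.mean((mu - W) * torch.log(mu / W))
```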

2.2 Physics-informed neural network architecture

In Fig. 1(e), we show different DNN architectures. A state-of-the-art architecture used for classification is U-net, which uses a fine-tuned contracting and expansive path between convolutional blocks, while RCAN employs residual convolutional blocks to perform denoising. We employ RESUNET, a combination of U-net and RCAN: RESUNET differs from U-net in that, instead of plain convolutional blocks, it uses the residual blocks of RCAN [36]. Both U-net and RESUNET are very powerful at performing semantic segmentation tasks, using an entropic loss function to classify compact regions of images. Here we identify the number of classes of our problem as the total number of photons arriving at a single pixel. This problem could be computationally difficult since the total number of classes is huge (up to the dynamic range of the camera); however, we overcome this problem by using the analytical expression of Eq. (1). In this way we take advantage of both the capability of RCAN to perform denoising and of U-net to perform semantic segmentation tasks. In Fig. 1(f), we show that RESUNET produces artifact-free reconstructions when exported to different imaging conditions and that it is robust to image normalization.

2.3 Training of the RESUNET architecture with different losses

An important characteristic of the physics-informed loss function is that it stabilizes the training process by avoiding the generation of artifacts, which makes the training robust to hyperparameter optimization. In Fig. 2 we show the training of the RESUNET architecture on dataset 1, initialized with the same weights but trained with two different losses: the physics-informed KL loss and the MSE loss. At the end of each epoch, we keep track of the MSE and KL evaluation scores on the full dataset for both cases and plot the scores for the first 10 epochs in Fig. 2(a). As illustrated in Fig. 2(b), after 10 epochs the network trained with the MSE loss produces artifacts. In fact, when trained with the MSE loss, the network tends to highlight the low-intensity pixels, since the loss does not operate the same way across the full dynamic range (see also Fig. 1(d)). In this way, the network learns to minimize the MSE error by setting most of the output values to zero, disregarding the information contained in the highest values of the image. Moreover, in preliminary tests, when the model was trained with the MSE loss, a sophisticated fine tuning of the hyperparameters was required to produce a satisfactory, artifact-free result (Fig. 2(b), trained with MSE). In contrast, when the network is trained with the physics-informed KL loss, the full dynamic range is treated equally and the output is artifact-free (Fig. 2(b), trained with KL). As expected, the MSE value is lower when the network is trained with the MSE loss and, correspondingly, the KL evaluation value is lower when the network is trained with the KL loss. This leads us to conclude that the physics-informed KL loss function provides a better metric than the MSE when comparing the output of the neural network to the corresponding GT target.
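The comparison protocol can be sketched as below (a schematic only: `model`, `train_loader`, and `eval_loader` are placeholders for our actual training setup, and `kl_poisson_loss` refers to the sketch in Section 2.1):

```python
import copy
import torch

def train_and_track(model, loss_fn, train_loader, eval_loader, epochs=10, lr=1e-4):
    """Train a copy of the network with the given loss and record, at the end of
    each epoch, the MSE and KL evaluation scores on the full evaluation set."""
    model = copy.deepcopy(model)                       # both runs start from the same weights
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    for _ in range(epochs):
        model.train()
        for noisy, gt in train_loader:                 # batches of photon-number images
            opt.zero_grad()
            loss_fn(model(noisy), gt).backward()
            opt.step()
        model.eval()
        mse_score = kl_score = n_batches = 0.0
        with torch.no_grad():
            for noisy, gt in eval_loader:
                out = model(noisy)
                mse_score += torch.mean((out - gt) ** 2).item()
                kl_score += kl_poisson_loss(out, gt).item()
                n_batches += 1
        history.append((mse_score / n_batches, kl_score / n_batches))
    return history

# Compare the two losses starting from identical weights, e.g.:
# mse_history = train_and_track(resunet, torch.nn.MSELoss(), train_loader, eval_loader)
# kl_history  = train_and_track(resunet, kl_poisson_loss, train_loader, eval_loader)
```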


Fig. 2. a. The training of the RESUNET architecture with two different loss functions, the MSE and the KL loss, respectively. At the end of each training epoch the MSE and the KL index are calculated on the training dataset, and we plot the metrics for 10 epochs. Both networks are trained on the same dataset and are initialized with the same weights. b. From left to right: NOISY input (field of view 83.2 $\mu m$, grayscale) from the validation dataset; prediction of the RESUNET for the NOISY input when trained with the MSE loss function after 10 epochs; prediction of the RESUNET when trained with the KL loss function after 10 epochs; GT: the corresponding high-signal image of the NOISY input.


2.4 Comparison of different neural network architectures

To compare our algorithm with other state-of-the-art denoising algorithms, we also train RCAN and RESUNET on the same dataset 1 for 100 epochs after optimizing the hyperparameters of the networks. We then evaluate the algorithms on the training dataset and on four additional datasets of experimental confocal fluorescence microscopy data that differ from the training dataset in (a) the fluorophores and (b) the objective (see Methods and Table 1).


Table 1. Characteristics of the training and evaluation datasets

In Fig. 3(a) we show the performance of RESUNET, RCAN, and the non-local means (NLM) algorithm for a patch of the training dataset, together with the corresponding MSE map (see Methods). We observe that RESUNET, when evaluated on the training dataset, outperforms both RCAN and NLM in the MSE maps and produces images that are almost identical to the GT. In Fig. 3(b) we evaluate the performance of RESUNET, RCAN, and NLM on five different datasets through the MSE improvement and the PSNR improvement (see Methods). All algorithms improve the MSE and the PSNR in comparison with the low-resolution (LR) input image, but RESUNET produces the best results in all datasets, with the best results achieved in the training dataset 1. Moreover, in dataset 5 RCAN performs worse than the NLM algorithm, a fact that indicates the better portability of RESUNET to different experimental conditions.


Fig. 3. a. From left to right: GT, RESUNET, RCAN (trained for 100 epochs on dataset 1), and NLM for an image patch (field of view 55.5 $\mu m$, grayscale) derived from dataset 3; in the lower part we show the MSE map as defined in Eq. (9). b. The MSE improvement (Eq. (7)) and the PSNR improvement (Eq. (8)) of NLM, RCAN, and RESUNET for 5 different datasets.


In Fig. 4, we further quantify the performance of RESUNET and compare it with RCAN and the classical non-local means (NLM) denoising algorithm. In Fig. 4(a) we show the RESUNET and NLM denoised images for the noisy input (10 msec exposure-time image slice taken from dataset 3) and the GT image (500 msec exposure time, same laser power). RESUNET produces a high-contrast, artifact-free, smooth image. NLM produces a high-contrast image but tends to over-emphasize the high-intensity values and loses some information on the structure of the image (legs in the lower right corner). In Fig. 4(b), we show the profiles along the line marked in Fig. 4(a). RESUNET, in comparison with the other algorithms, produces a higher contrast between the dips and the peaks along the line. In general, improving the contrast of the image improves the resolution. Our algorithm, as shown in Fig. 2(b), improves the contrast of the image, and therefore no tradeoff between resolution and denoising is needed. In Fig. 4(c), we show the cumulative probability (Cum. Prob.) of the intensity histogram for the images shown in Fig. 4(a). In Fig. 4(d), we show the performance scores of the images shown in Fig. 4(a). In Fig. 5 we show the outputs of RESUNET, RCAN, and NLM for image patches taken from the different datasets described in Table 1. The RESUNET architecture produces denoised, smooth images for all the datasets.


Fig. 4. a, Comparison of the RESUNET, RCAN, and NLM reconstructions of a noisy input (Noisy, 10 msec exposure) for a Noci cell image slice from dataset 2, with the corresponding GT (500 msec exposure). Both RESUNET and RCAN were trained on a different dataset (dataset 1), characterized by different optical systems, fluorophores, and cameras. b, The line profile of panel a for the line shown above. c, The cumulative probability of the intensity for panel a. d, The mean square error (MSE), structural similarity index (SSIM) [34], and cumulative probability improvements for panel a.



Fig. 5. The comparison for 5 different datasets (the fields of view for the panels from top to bottom are (83.2, 55.5, 55.5, 83.2, 41.6) $\mu m$) of RESUNET and RCAN, trained on dataset 1 for 100 epochs, and the classical NLM denoising algorithm. From left to right: the GT (500 msec exposure); the RESUNET; the RCAN; the NLM; the noisy input LR (10 msec exposure). The MSE improvement (Eq. (7)) for each case is reported in yellow. All images are shown in grayscale.


3. Methods

3.1 Physics-informed loss function

The Kullback-Leibler divergence is a measure of the distance between two probability distribution functions $p(n )$ and $q(n )$ defined over the same sample space n. It is defined as

$$KL({p,q} )= \mathop \sum \nolimits_n p(n )\ast \log \left( {\frac{{p(n )}}{{q(n )}}} \right)$$

For the KL divergence, we identify the classes of the entropic-like KL loss as the number of photons. For a PPD, which is fully characterized by its mean value, the summation over the classes (photons) simplifies. For two PPDs with mean values $\mu $ and $W$, $p({n,\mu } )= {\mu ^n}{e^{ - \mu }}/n!$ and $q({n,W} )= {W^n}{e^{ - W}}/n!$, so $KL({p,q} )$ depends only on $\mu $ and $W$:

$$KL({\mu ,W} )= \mu \log \left( {\frac{\mu }{W}} \right) + ({W - \mu } )$$
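This simplification is a one-line computation: since $\log ({p/q} )= n\log ({\mu /W} )+ ({W - \mu } )$, the sum over the photon numbers in Eq. (2) reduces to

$$\mathop \sum \nolimits_n p({n,\mu } )\left[ {n\log \left( {\frac{\mu }{W}} \right) + ({W - \mu } )} \right] = \log \left( {\frac{\mu }{W}} \right)\mathop \sum \nolimits_n n\,p({n,\mu } )+ ({W - \mu } )= \mu \log \left( {\frac{\mu }{W}} \right) + ({W - \mu } ),$$

where we used the normalization of $p({n,\mu } )$ and its mean $\mathop \sum \nolimits_n n\,p({n,\mu } )= \mu $.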

In this paper we use the symmetrized version of $KL({\mu ,W} )$ as the loss function:

$$Los{s_{KL}} = \frac{1}{2}({\; KL({\mu ,W} )+ KL({W,\mu } )} )= ({\mu - W} )\log \left( {\frac{\mu }{W}} \right)$$

The task of a classification network is to predict a probability distribution $q(n )$ and compare it with the ground truth distribution $p(n )$. In general, both $p(n )$ and $q(n )$ could be complicated functions that depend on many parameters. Traditionally, a classification neural network architecture has an extra layer (the classification output layer) at the end of the network, whose dimension equals the number of classes n and where the output of each node gives the probability distribution $q(n )$. We identify the different classes as the number of photons that arrive at each pixel. A standard photon detector can detect up to tens of thousands of photons at each pixel; therefore, both the neural network architecture and the loss function are, in principle, hard to compute numerically. Using the fact that a Poisson probability distribution is fully characterized by its mean value, the classification output layer becomes redundant: if we know the mean value of the distribution, we can reconstruct it from the analytical expression of the Poisson distribution. In our model, we predict the output of the classification network by predicting the mean value of the Poisson distribution. For the ground truth images, we assume that the mean value of the Poisson distribution is equal to the number of photons that arrive at the detector.

3.3 RESUNET architecture

Regarding the architectures, a promising approach is to use skip connections between network layers to bypass residual content (RCAN [23]), while an encoder-decoder-like architecture efficiently solves semantic segmentation/classification tasks (U-net [30]). In our case, the classes are the number of photons, so we use the efficiency of U-net to classify photons and the power of residuals to perform denoising. To do that, we substitute the convolutional layers of U-net with residual blocks, employing an architecture similar to [36]. Our network uses three-dimensional convolution layers, takes as input 8 two-dimensional conjugated planes, and produces 8 conjugated planes. The architecture of the network is illustrated in Fig. 6 and is as follows. The building blocks of the network are three-dimensional residual blocks (‘resnet blocks’) consisting of two three-dimensional convolution blocks (3DCB) of kernel size 3 × 3 × 3, each followed by a batch normalization layer (BN) and a rectified linear unit (ReLU) layer, and a residual skip-connection convolution layer of kernel size 1 × 1 × 1. The skip-connection layer and the convolutional path are added together, followed by a batch normalization layer and a ReLU layer. The ‘resnet block’ has a feature size (FS) that varies throughout the network. We use these ‘resnet blocks’ to form a U-net [30] architecture. The U-net consists of a contracting path and an expansive path, which gives it the u-shaped architecture. The contracting path consists of the input layer followed by the repeated application of two ‘resnet blocks’ of feature size FS $= 4$. A three-dimensional max pool layer (‘Max pool’) of kernel size 2 × 2 × 2 is applied for down-sampling, and the same procedure is repeated two more times, doubling the feature size each time and increasing the number of resnet blocks to three. The expansive path then starts by applying three resnet blocks of feature size 8 FS, followed by a 3 × 3 × 3 transposed convolution layer (TCL) of feature size 8 FS with stride 2 for up-sampling, followed by a batch normalization and a ReLU layer. A concatenation layer then concatenates layers from the contracting and expansive paths of the same dimensions. The procedure is repeated two more times with the ‘resnet block’ feature sizes changing as indicated in Fig. 6. Finally, a convolution layer of kernel size 1 × 1 × 1 and feature size 15 FS is applied to the input layer and added to the expansive path. A final 3DCB of feature size 1, a BN, and a ReLU layer are added at the end of the architecture. More details about the network architecture can be found in the Code Ocean repository: https://codeocean.com/capsule/9043085/tree/v1. The computational time for an image of 256 × 256 pixels is approximately 0.06 sec on a desktop PC with an Intel Core i9-10980XE processor (18 × 3.0 GHz) and an NVIDIA GeForce RTX 3090 GPU.
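As an illustration, a minimal PyTorch sketch of the ‘resnet block’ described above; the kernel sizes and the conv/BN/ReLU ordering follow the text, while padding and other details are assumptions, and the full RESUNET in the Code Ocean repository remains the reference implementation:

```python
import torch.nn as nn

class ResnetBlock3D(nn.Module):
    """Three-dimensional residual block: two 3x3x3 convolution + BN + ReLU stages,
    plus a 1x1x1 skip-connection convolution; the two paths are added together and
    followed by a final BN + ReLU, as described in the text."""
    def __init__(self, in_channels, fs):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(in_channels, fs, kernel_size=3, padding=1),
            nn.BatchNorm3d(fs), nn.ReLU(inplace=True),
            nn.Conv3d(fs, fs, kernel_size=3, padding=1),
            nn.BatchNorm3d(fs), nn.ReLU(inplace=True),
        )
        self.skip = nn.Conv3d(in_channels, fs, kernel_size=1)  # residual skip connection
        self.post = nn.Sequential(nn.BatchNorm3d(fs), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.post(self.body(x) + self.skip(x))
```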


Fig. 6. The RESUNET architecture. The building blocks of the architecture are shown in the upper part of the figure: the three-dimensional convolution block (3DCB) of feature size FS, the transposed convolution layer (TCL) of feature size FS used for upsampling, the batch-normalization layer (BN), the ReLU layer, the Max Pool layer, the Concatenate layer, and the resnet block of feature size FS. The U-net part of the RESUNET is shown in the lower part of the figure. Arrows indicate the flow of the network from the Input layer to the Output layer. The add symbol indicates that network layers of the same shape are added at the node. The feature size for the 3DCB, the TCL, and the resnet blocks changes along the network flow and is indicated next to the blocks.


3.4 Training and evaluation datasets

The training and evaluation of our neural network were performed using experimental data acquired on a confocal microscope. Confocal image stacks were acquired using an inverted Olympus iX73 equipped with an X-light V3 spinning-disc head (Crest Optics), an LDI laser illuminator (89 North), a Prime BSI sCMOS camera (Photometrics), and MetaMorph software (Molecular Devices). We systematically changed specific parameters: in particular, we used three different objectives (Olympus) with different depths of field (UPLSAPO 30XS NA 1.05, UPLSAPO 20X NA 0.75, LUCPLFLN 40XPH NA 0.6) and, for each, we acquired three separate emission wavelengths (420 nm, 525 nm, 665 nm, corresponding to the fluorophores excited in the sample: DAPI, AlexaFluor488, AlexaFluor647), each with stacks of 25 planes characterized by three different z-steps, each with 8 increasing exposure times. We differentiate each dataset by the parameter $\Omega = \frac{{{V_{psf}}}}{{{V_{pixel}}}}$, i.e. the sampling ratio between the volume of the point spread function (PSF) and the pixel volume. For every objective we had 3 replicates. We systematically change the exposure time and thus the SNR. Specifically, we acquire the same image volumes for all the datasets with exposure times of (1, 2, 5, 10, 20, 50, 100, 500) msec. We use the 500 msec images as the ground truth (GT) target, while the rest are kept as training inputs. For each image in the dataset, we first transform the camera counts into photon numbers (efficiency g = 0.43, offset = 95). For each dataset we randomly extract 10000 image volumes of 256 × 256 × 8 pixels, with the last dimension along the optical axis. We train all our models on dataset 1 ($\Omega = 0.40$) and use the rest for testing. To ensure reproducibility [22], we evaluated and compared our architecture with another state-of-the-art DNN architecture and a non-DNN classical denoising algorithm.
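A sketch of the count-to-photon conversion and random patch extraction described above (a simplified illustration; `stack_low` and `stack_gt` are placeholders for matched low-exposure and 500 msec stacks already loaded as numpy arrays):

```python
import numpy as np

def extract_training_volumes(stack_low, stack_gt, n_patches=10_000,
                             patch_xy=256, patch_z=8,
                             gain=0.43, offset=95.0, seed=0):
    """Convert camera counts to photon numbers and randomly crop matched
    256 x 256 x 8 input/GT volumes; the last axis runs along the optical (z) axis."""
    rng = np.random.default_rng(seed)
    low = np.clip((stack_low.astype(np.float64) - offset) / gain, 0, None)
    gt = np.clip((stack_gt.astype(np.float64) - offset) / gain, 0, None)
    y_max = low.shape[0] - patch_xy
    x_max = low.shape[1] - patch_xy
    z_max = low.shape[2] - patch_z
    for _ in range(n_patches):
        y = rng.integers(y_max + 1)
        x = rng.integers(x_max + 1)
        z = rng.integers(z_max + 1)
        sl = (slice(y, y + patch_xy), slice(x, x + patch_xy), slice(z, z + patch_z))
        yield low[sl], gt[sl]
```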

3.5 Evaluation metrics

We use the MSE between two normalized images ${\textrm{A}_\textrm{x}}$ and ${\textrm{B}_\textrm{x}}$, where x is the pixel index, as an evaluation metric, defined as

$$MSE({A,B} )= \frac{{\mathop \sum \nolimits_x {{({{A_x} - {B_x}} )}^2}\; }}{{\left( {\mathop \sum \nolimits_x A_x^2\; } \right)\left( {\mathop \sum \nolimits_x B_x^2} \right)}}$$

The PSNR is the ratio of the maximum pixel intensity to the power of the distortion:

$$PSNR({A,B} )= 10{{\log }_{10}}\frac{{\max (B )}}{{MSE({A,B} )}}$$

Since these metrics may depend on the intrinsic characteristics of the image structure, we define the MSE improvement and the PSNR improvement of the output A of the denoising method, relative to the GT and the LR input, as

$$MSE\; improvement = \frac{{MSE({LR,GT} )}}{{MSE({A,GT} )}}\; ,$$
$$PSNR\; improvement = \frac{{PSNR({A,GT} )}}{{PSNR({LR,GT} )}}\; .$$

In Fig. 3 we also use the MSE map to compare the denoised images and the LR input with the GT image, defined as

$$MSE\; map\; ({A,GT} )= \frac{{{{({{A_x} - \; G{T_x}} )}^2}\; }}{{\left( {\mathop \sum \nolimits_x A_x^2\; } \right)\left( {\mathop \sum \nolimits_x GT_x^2} \right)}}$$
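For reference, a direct numpy transcription of these metrics (a sketch; A, B, LR, and GT are assumed to be photon-number images of identical shape, and Eq. (8) is read as a ratio of PSNRs, i.e. the improvement relative to the LR input):

```python
import numpy as np

def mse(A, B):
    """Normalized MSE of Eq. (5)."""
    return np.sum((A - B) ** 2) / (np.sum(A ** 2) * np.sum(B ** 2))

def psnr(A, B):
    """PSNR of Eq. (6)."""
    return 10.0 * np.log10(B.max() / mse(A, B))

def mse_improvement(A, LR, GT):
    """Eq. (7): gain in MSE of the denoised image A over the noisy input LR."""
    return mse(LR, GT) / mse(A, GT)

def psnr_improvement(A, LR, GT):
    """Eq. (8): gain in PSNR of the denoised image A over the noisy input LR."""
    return psnr(A, GT) / psnr(LR, GT)

def mse_map(A, GT):
    """Per-pixel MSE map of Eq. (9)."""
    return (A - GT) ** 2 / (np.sum(A ** 2) * np.sum(GT ** 2))
```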

4. Discussion

In summary, we showed that a physics-informed loss function employed in a novel DNN architecture outperforms state-of-the-art denoising algorithms for fluorescence microscopy. While other denoising algorithms for fluorescence microscopy [31] mainly focus on the details of the sCMOS camera, the proposed method can be extended to more general cases beyond fluorescence microscopy. We evaluated our results on five different datasets, four of which were acquired under imaging conditions different from the training dataset. We overcome the arbitrariness of the data normalization in denoising DNNs by using a photon model that does not produce instabilities, thus showing portability when used in different imaging conditions. Our loss function is based on the statistics of the Poisson distribution and could be used in DNNs performing tasks beyond microscopy, such as satellite imaging or non-optical (e.g. electron-based microscopy) imaging technologies producing Poissonian signals. Further improvement of the denoising can also be achieved by introducing tailored illumination. The control of the spatio-temporal properties of the illumination has already been exploited in super-resolution experimental techniques such as PALM, STORM, and STED [37,38]. Recently, structured illumination has also been employed to reduce noise in object recognition tasks. Bridging these end-to-end information-flow recognition strategies [16] with denoising algorithms could be an exciting avenue for future research.

Funding

HORIZON EUROPE Marie Sklodowska-Curie Actions (713694); European Research Council (855923); Regione Lazio (A0375-2020-36549).

Acknowledgment

This project has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No 713694. GR acknowledges the support by European Research Council Synergy grant ASTRA (No. 855923). ML thanks Project LOCALSCENT, Grant PROT. A0375-2020-36549, Call POR-FESR “Gruppi di Ricerca 2020”.

Disclosures

The authors declare no competing interests.

Data availability

All code, raw and analyzed datasets generated during the study are available from the corresponding author on request.

References

1. M. R. Chowdhury, J. Zhang, J. Qin, et al., “Poisson image denoising based on fractional-order total variation,” Inverse Probl. Imaging 14(1), 77–96 (2020). [CrossRef]  

2. W. Wang, X.-G. Xia, C. He, et al., “A Noise-Robust Online convolutional coding model and its applications to poisson denoising and image fusion,” Appl. Math. Model. 95, 644–666 (2021). [CrossRef]  

3. M. Zhang, F. Zhang, Q. Liu, et al., “VST-Net: Variance-stabilizing transformation inspired network for Poisson denoising,” J. Vis. Commun. Image Represent. 62, 12–22 (2019). [CrossRef]  

4. Y. Su, Q. Lian, X. Zhang, et al., “Multi-scale Cross-path Concatenation Residual Network for Poisson denoising,” IET Image Process. 13(8), 1295–1303 (2019). [CrossRef]  

5. A. A. Bindilatti, “A nonlocal poisson denoising algorithm based on stochastic distances,” IEEE Signal Process. Lett. 20(11), 1010–1013 (2013). [CrossRef]  

6. E. Demircan-Tureyen, F. P. Akbulut, M. E. Kamasak, et al., “Restoring Fluorescence Microscopy Images by Transfer Learning From Tailored Data,” IEEE Access 10, 61016–61033 (2022). [CrossRef]  

7. B. Mandracchia, W. Liu, X. Hua, et al., “Optimal sparsity allows reliable system-aware restoration of fluorescence microscopy images,” Sci. Adv. 9(35), eadg9245 (2023). [CrossRef]  

8. Z. H. Shah, M. Müller, B. Hammer, et al., “Impact of different loss functions on denoising of microscopic images,” Proc. Int. Jt. Conf. Neural Networks 2022-July, (2022). [CrossRef]  

9. Y. Wang, H. Pinkard, E. Khwaja, et al., “Image denoising for fluorescence microscopy by supervised to self-supervised transfer learning,” Opt. Express 29(25), 41303 (2021). [CrossRef]  

10. V. Mannam, Y. Zhang, Y. Zhu, et al., “Real-time image denoising of mixed Poisson–Gaussian noise in fluorescence microscopy images using ImageJ,” Optica 9(4), 335 (2022). [CrossRef]  

11. K. Zhang, W. Zuo, Y. Chen, et al., “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Trans. Image Process. 26(7), 3142–3155 (2017). [CrossRef]  

12. L. Huang, H. Chen, T. Liu, et al., “Self-supervised learning of hologram reconstruction using physics consistency,” Nat. Mach. Intell. 5(8), 895–907 (2023). [CrossRef]  

13. G. Chen, J. Wang, H. Wang, et al., “Fluorescence microscopy images denoising via deep convolutional sparse coding,” Signal Process. Image Commun. 117, 117003 (2023). [CrossRef]  

14. I. Kang, F. Zhang, G. Barbastathis, et al., “Phase extraction neural network (PhENN) with coherent modulation imaging (CMI) for phase retrieval at low photon counts,” Opt. Express 28(15), 21578 (2020). [CrossRef]  

15. T. Remez, O. Litany, R. Giryes, et al., “Class-Aware Fully Convolutional Gaussian and Poisson Denoising,” IEEE Trans. Image Process 27(11), 5707–5722 (2018). [CrossRef]  

16. C. Qian, “Noise-Adaptive Intelligent Programmable Meta-Imager,” Intell. Comput. 2022, (2022).

17. V. Göreke, “A novel method based on Wiener filter for denoising Poisson noise from medical X-Ray images,” Biomed. Signal Process. Control 79, 104031 (2023). [CrossRef]  

18. T. Remez, O. Litany, R. Giryess, et al., “Deep Convolutional Denoising of Low-Light Images,” (2017).

19. S. Albawi, T. A. Mohammed, S. Al-Zawi, et al., “Understanding of a convolutional neural network,” Proc. 2017 Int. Conf. Eng. Technol. (ICET), 1–6 (2018).

20. C. Belthangady, “Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction,” Nat. Methods 16(12), 1215–1225 (2019). [CrossRef]  

21. M. Weigert, Uwe Schmidt, T. Boothe, et al., “Content-aware image restoration: pushing the limits of fluorescence microscopy,” Nat. Methods 15(12), 1090–1097 (2018). [CrossRef]  

22. R. F. Laine, I. Arganda-Carreras, R. Henriques, et al., “Avoiding a replication crisis in deep-learning-based bioimage analysis,” Nat. Methods 18(10), 1136–1144 (2021). [CrossRef]  

23. J. Chen, H. Sasaki, H. Lai, et al., “Three-dimensional residual channel attention networks denoise and sharpen fluorescence microscopy image volumes,” Nat. Methods 18(6), 678–687 (2021). [CrossRef]  

24. J. Gurrola-ramos, O. Dalmau, T. E. Alarcón, et al., “A Residual Dense U-Net Neural Network for Image Denoising,” IEEE Access 9, 31742–31754 (2021). [CrossRef]  

25. J. Byun, S. Cha, T. Moon, et al., “FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise,” (2021). [CrossRef]  

26. A. S. Mayorov, D. C. Elias, M. Mucha-Kruczynski, et al., “Interaction-Driven Spectrum Reconstruction in Bilayer Graphene,” Science 333(6044), 860–863 (2011). [CrossRef]  

27. M. Shanker, M. Y. Hu, M. S. Hung, et al., “Effect of data standardization on neural network training,” Omega 24(4), 385–397 (1996). [CrossRef]  

28. C. Qiao, D. Li, Y. Guo, et al., “Evaluation and development of deep neural networks for image super-resolution in optical microscopy,” Nat. Methods 18(2), 194–202 (2021). [CrossRef]  

29. E. Xypakis, G. Gosti, T. Giordani, et al., “Deep learning for blind structured illumination microscopy,” Sci. Rep. 12(1), 8623 (2022). [CrossRef]  

30. T. Falk, D. Mai, R. Bensch, et al., “U-Net: deep learning for cell counting, detection, and morphometry,” Nat. Methods 16(1), 67–70 (2019). [CrossRef]  

31. B. Mandracchia, X. Hua, C. Guo, et al., “Fast and accurate sCMOS noise correction for fluorescence microscopy,” Nat. Commun. 11(1), 94 (2020). [CrossRef]  

32. G. E. Karniadakis, I. G. Kevrekidis, L. Lu, et al., “Physics-informed machine learning,” Nat. Rev. Phys. 3(6), 422–440 (2021). [CrossRef]  

33. S. Kullback, “On Information and Sufficiency,” Ann. Math. Stat. 22(1), 79–86 (1951). [CrossRef]  

34. Z. Wang, A. C. Bovik, H. R. Sheikh, et al., “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Trans. Image Process. 13(4), 600–612 (2004). [CrossRef]  

35. Y. Zhang, K. Li, K. Li, et al., “Image super-resolution using very deep residual channel attention networks,” in Computer Vision—ECCV 2018 (eds V. Ferrari et al.) 294–310 (2018).

36. F. I. Diakogiannis, F. Waldner, P. Caccetta, et al., “ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data,” ISPRS J. Photogramm. Remote Sens. 162, 94–114 (2020). [CrossRef]  

37. M. J. Rust, M. Bates, X. Zhuang, et al., “Stochastic optical reconstruction microscopy (STORM) provides sub-diffraction-limit image resolution,” Nat Meth 3(10), 793–796 (2006). [CrossRef]  

38. E. Betzig, G. H. Patterson, R. Sougrat, et al., “Imaging Intracellular Fluorescent Proteins at Nanometer Resolution,” Science 313(5793), 1642–1645 (2006). [CrossRef]  
