
Non-uniform image reconstruction for fast photoacoustic microscopy of histology imaging

Open Access

Abstract

Photoacoustic microscopic imaging utilizes the characteristic optical absorption of pigmented materials in tissues to enable label-free observation of fine morphological and structural features. Since DNA/RNA strongly absorbs ultraviolet light, ultraviolet photoacoustic microscopy can highlight cell nuclei without complicated sample preparation such as staining, producing images comparable to standard pathological images. Further improvement of the image acquisition speed is critical to advancing the clinical translation of photoacoustic histology imaging. However, raising the imaging speed with additional hardware incurs considerable cost and design complexity. In this work, noting the heavy redundancy in biological photoacoustic images that overconsumes computing power, we propose an image reconstruction framework called non-uniform image reconstruction (NFSR), which exploits an object detection network to reconstruct sparsely sampled photoacoustic histology images into high-resolution images. The sampling speed of photoacoustic histology imaging is significantly improved, saving 90% of the time cost. Furthermore, NFSR focuses reconstruction on the regions of interest, maintaining PSNR and SSIM evaluation metrics above 99% while reducing overall computation by 60%.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Photoacoustic imaging (PAI) is an emerging bioimaging technique that combines the high resolution of optical imaging with the deep penetration of acoustic imaging [1–3]. It irradiates the sample with a short-pulse laser and collects the generated acoustic waves with an ultrasonic detector [4]; the signals are then passed to a computer for image reconstruction. PAI provides higher resolution and better contrast than PET and MRI [5,6]. Most photoacoustic microscopy systems capture the photoacoustic signals of the sample by raster scanning with mechanical stages, resulting in very slow image acquisition [7].

Many attempts at high-speed photoacoustic imaging have been made [8–12], such as the use of hexagonal mirrors or multi-focus photoacoustic illumination, but they suffer from high implementation costs and design difficulties due to complicated optical and mechanical configurations. Alternatively, Liu et al. [13,14] treated the sparse photoacoustic data as a sparse matrix. They first used matrix completion based on low-rank matrix approximation to restore sparse images, and then proposed a low-rank sparse matrix completion model with multiple constraints, which recovers the details and edges of images well. However, the achievable reconstruction magnification is minimal, so large-scale raster scanning of biological samples still calls for a faster method.

In recent years, deep learning-based image processing methods have evolved rapidly [15] and have been applied to photoacoustic imaging [16]. Guan et al. [17] proposed the FD-UNet method to restore images sampled from a single channel to artifact-free, high-resolution images equivalent to multi-channel sampling. Awasthi et al. [18] used deep learning networks to predict the full-bandwidth photoacoustic signal. Zhang et al. [19] employed a deep learning approach to remove limited-view and under-sampling artifacts from sparse photoacoustic raw data. DiSpirito et al. [20] compared the ability of various classical convolutional neural network architectures to reconstruct downsampled photoacoustic data of the mouse cerebrovascular system. A convolutional network consisting of residual blocks and channel attention modules achieved fully sampled mapping from very sparsely sampled photoacoustic data [21], demonstrating that the attention module can suppress image discontinuity and blur. Zhao et al. [22] realized the reconstruction of sparse photoacoustic imaging with high quality at ultralow laser dosages, which, however, requires vast computing hardware resources due to the complex and large reconstruction models.

In this paper, we propose a sparse image reconstruction framework that combines an object detection network with a super-resolution network, supplemented by front-end and back-end modules. First, we acquired ten photoacoustic images of mouse brain slices with a resolution of 1890 × 1890 pixels as our dataset. Then, as the magnification core of the framework, we chose an efficient super-resolution network and an object detection network. Guided by the experimental results, we designed the front-end and back-end models to fuse the foreground and background views and customized the networks to improve performance. Finally, we performed ablation experiments on each module of the framework to demonstrate the efficacy of the proposed method.

2. Method

2.1 Acquisition and processing of photoacoustic images

In brief, we obtained the photoacoustic histology images using our previously developed UV-PAM system [5], which used a nanosecond ultraviolet (UV) pulsed laser (266 nm wavelength; Picolo 1, InnoLas Laser GmbH) as the photoacoustic illumination source. The laser beam was focused onto the sample by an objective (U-13X-LC, Newport), and the generated photoacoustic waves were detected with a customized water-immersed ultrasonic transducer. The photoacoustic image was captured by raster scanning (PLS-85, Physik Instrumente GmbH, Karlsruhe, Germany).

Given that the sample thickness is much smaller than the depth resolution of the system, the maximum photoacoustic amplitude along the depth direction was extracted to form the histology images. We obtained ten original images with a resolution of 1890 × 1890 pixels. To expand the training set and reduce GPU memory consumption during network training, we rotated and cropped the large images into 1521 images of 256 × 256 pixels, which were randomly divided into ten groups: nine for the training set and the remaining one for the test set. To simulate sparsely sampled low-resolution photoacoustic images, we performed pixel extraction with a step size of 4 to obtain 64 × 64 pixel low-resolution images.
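For clarity, the patch extraction and sparse-sampling simulation can be written in a few lines. The following is a minimal NumPy sketch under our reading of the procedure; the function name is illustrative, and the rotation-based augmentation that brings the count to 1521 patches is omitted.

```python
import numpy as np

def extract_pairs(image, patch=256, step=4):
    """Crop a large scan into 256 x 256 ground-truth patches and simulate
    sparse sampling by keeping every `step`-th pixel along both axes."""
    h, w = image.shape
    pairs = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            hr = image[y:y + patch, x:x + patch]  # fully sampled ground truth
            lr = hr[::step, ::step]               # 64 x 64 sparse sample
            pairs.append((lr, hr))
    return pairs

# Example with a stand-in for one 1890 x 1890 photoacoustic image:
scan = np.zeros((1890, 1890), dtype=np.float32)
pairs = extract_pairs(scan)
print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)  # 49 (64, 64) (256, 256)
```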

We performed photoacoustic imaging on the same samples with a larger step size (i.e., 4 µm) and a standard sampling (i.e., 1 µm), respectively, while keeping the laser pulse repetition frequency unchanged. For the former, the imaging time was dramatically reduced. We used the standard-sampled image as the ground truth, and the NFSR method was used to reconstruct the undersampled images.

2.2 Framework structure

The structure of the proposed NFSR framework is shown in Fig. 1. NFSR is an efficient super-resolution framework composed of an object detection network and an existing super-resolution (SR) network at its core, supplemented by a front-end fast amplification network (FN), a back-end foreground-background fusion network (BN), and a background threshold loss function (BL). The core super-resolution and object detection networks adopt existing mature network structures and only require adaptation of their input and output layers. The front-end network is inspired by FSRCNN [23]; it uses 3 × 3 convolutional kernels instead of large kernels to reduce computation, and its last layer is a transposed convolution that converts the small feature maps back into a large-sized image. Although the back-end network has only four convolutional layers, it effectively improves the imaging quality of the output SR images. The framework's input is a sparsely sampled low-resolution (LR) photoacoustic image. The groundImage is obtained via the front-end fast scaling network and fed into the object detection network to obtain detection boxes. Scaling the box coordinates yields the corresponding coordinates in the LR image, which select the LR blocks that require high-quality reconstruction; these blocks are fed into the super-resolution network. The output of the super-resolution module is then taken as the foreground, overlapped onto the groundImage, and the foreground-background fusion is completed by the back-end network to produce the final output SR image. A sketch of this data flow follows.
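The sketch below is our illustration of the flow in Fig. 1, with the four networks as interchangeable callables on NumPy images; the box format and helper names are assumptions, not the authors' implementation.

```python
def nfsr_pipeline(lr, front_net, detector, sr_net, back_net, scale=4):
    """Illustrative NFSR data flow (NumPy images; not the authors' code)."""
    ground = front_net(lr)                 # fast coarse upscaling -> groundImage
    sr = ground.copy()
    for (x0, y0, x1, y1) in detector(ground):        # boxes in HR coordinates
        # Snap box corners to the scale grid and map them to the LR image.
        lx0, ly0 = x0 // scale, y0 // scale
        lx1, ly1 = -(-x1 // scale), -(-y1 // scale)  # ceiling division
        patch = sr_net(lr[ly0:ly1, lx0:lx1])         # super-resolve the ROI only
        # Overlap the super-resolved foreground onto the groundImage.
        sr[ly0 * scale:ly1 * scale, lx0 * scale:lx1 * scale] = patch
    return back_net(sr)                    # fuse foreground and background
```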


Fig. 1. The structure of NFSR. The framework consists of a front-end fast processing network, an object detection network, a super-resolution network, and a back-end processing network. LR is the sparsely sampled photoacoustic image, groundImage is the intermediate image processed by the framework, and SR is the final output of the framework.


The code runs in a Keras environment, and the activation function in the SR framework is ReLU [24]. The Adam optimizer is used with hyperparameters β1 = 0.5 and β2 = 0.999 [25]. The learning rate starts at 0.0001 and is halved every one tenth of the total number of iterations. Training is performed on a single NVIDIA Tesla V100S GPU with a batch size of 32 for 1000 iterations, saving a checkpoint every 10 iterations; the final model is the checkpoint with the best performance.
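Under these settings, the optimizer and learning-rate schedule can be configured in Keras as follows. This is a sketch of the stated hyperparameters and our reading of the decay rule (halving every tenth of the total iterations); it is not the authors' training script.

```python
from tensorflow import keras

total_iters = 1000
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-4,
    decay_steps=total_iters // 10,  # one tenth of the total iterations
    decay_rate=0.5,                 # halve the learning rate at each step
    staircase=True)
optimizer = keras.optimizers.Adam(learning_rate=lr_schedule,
                                  beta_1=0.5, beta_2=0.999)
# model.compile(optimizer=optimizer, loss=..., metrics=[...])  # batch size 32
```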

2.3 Image super-resolution method

Image super-resolution has received considerable attention in the field of deep learning [26]. We tested several well-established super-resolution methods on our photoacoustic dataset, as shown in Table 1. The best evaluation metrics belong to EDSR [27], a very deep residual super-resolution network proposed in 2017. The networks WDSR [28] and Real-ESRGAN [29] bring considerable progress in receptive field, blur kernel, and loss function design, yet their performance degrades on photoacoustic images, possibly because the photoacoustic histology dataset is small and contains large background regions. In addition, the lack of a downsampling blur kernel matched to sparse sampling leads to network overfitting. Unless otherwise specified, the EDSR networks used in this paper have 32 residual blocks and a maximum channel count of 256; a minimal sketch of this configuration is given after Table 1.


Table 1. Result of existing super-resolution methods on our photoacoustic datasets
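As a concrete reference for the core network, the following is a minimal Keras sketch of an EDSR-style model with 32 residual blocks and 256 channels. The residual scaling factor of 0.1 follows the original EDSR paper and is an assumption here, as are the function names.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def res_block(x, filters, scaling=0.1):
    """EDSR residual block: conv-ReLU-conv without batch normalization."""
    y = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.Lambda(lambda t: t * scaling)(y)  # residual scaling
    return layers.Add()([x, y])

def edsr(scale=4, filters=256, n_blocks=32):
    inp = keras.Input(shape=(None, None, 1))     # grayscale PA patch
    x = skip = layers.Conv2D(filters, 3, padding='same')(inp)
    for _ in range(n_blocks):
        x = res_block(x, filters)
    x = layers.Conv2D(filters, 3, padding='same')(x)
    x = layers.Add()([x, skip])
    for _ in range(scale // 2):                  # two x2 sub-pixel stages for x4
        x = layers.Conv2D(filters * 4, 3, padding='same')(x)
        x = layers.Lambda(lambda t: tf.nn.depth_to_space(t, 2))(x)
    return keras.Model(inp, layers.Conv2D(1, 3, padding='same')(x))
```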

2.4 Object detection method

The non-uniform reconstruction method employs object detection to locate the blocks that require super-resolution processing, so its heart is a fast and efficient object detection network. We selected the YOLOv5 network [30,31], which offers excellent light weight and precision among existing object detection networks [32]. We measured the typical cell nucleus size (in pixels) and the average pixel brightness of the photoacoustic images: the nucleus size distribution was used to set the detection boxes of the network, the complexity of the nucleus distribution to adjust the number of network layers, and the actual pixel brightness to determine the detection threshold. As shown in Table 2, the modified YOLO network, denoted Yolov5_little, significantly reduces the computational cost while only marginally reducing performance. To further reduce the computational cost of the object detection network, we used the Ghost module [33,34] to replace the network's convolution layers, which nearly doubles the number of convolution channels without increasing computation. Specifically, the convolution is split into two parts: taking an output of dimensions H × W × C as an example, the first half (H × W × C/2) comes from direct convolution, and the second half (H × W × C/2) is obtained through a cheap linear transformation of the first; concatenating the two halves along the channel dimension yields the final output, as sketched below. In the subsequent validation experiments, we used Yolov5_little as the object detection network, whose computation and weight size are 2.349 GFLOPs and 4.96 M, respectively.
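The following is a minimal Keras sketch of this split, in which the cheap linear variation is realized as a depthwise convolution (the common GhostNet choice); the layer sizes and function name are illustrative assumptions.

```python
from tensorflow.keras import layers

def ghost_conv(x, out_channels, kernel=3):
    """Ghost module: produce C/2 channels by ordinary convolution and the
    remaining C/2 by a cheap depthwise (linear) transform of the first half."""
    primary = layers.Conv2D(out_channels // 2, kernel, padding='same')(x)
    ghost = layers.DepthwiseConv2D(kernel, padding='same')(primary)
    return layers.Concatenate(axis=-1)([primary, ghost])
```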


Table 2. Object detection method comparison

2.5 Loss function and evaluation metrics

In the image reconstruction network, the loss function is a combination of the L1 loss [35] and the background threshold loss. By increasing the weight of the image's informative pixels, the background threshold loss forces the network to rely on useful information when generating images. Since biological images contain large amounts of background, the background threshold loss significantly accelerates training.

Intuitively, the model should spend its capacity generating high-quality pixel information above a threshold, rather than on pixels below it. The background threshold loss is parameterized by a background threshold α and a threshold proportion β. α is chosen empirically and represents the maximum pixel value that does not affect the visual result; β is the proportion of pixels in the whole image that are larger than the threshold, expressed as $\beta = \frac{\sum_{i=0}^{I} (Pixel_i > \alpha)}{C \times H \times W}$, where C, H, and W are the channel, height, and width of the image. Based on our dataset, we set α = 17, and any pixel with a value below 17 is classified as background. In the object detection network, our YOLOv5-based loss contains classification, regression, and confidence terms, in which the classification and location losses are computed with the binary cross-entropy loss.
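The loss can be prototyped as a weighted L1 term. The sketch below is one plausible reading of the BL term, assuming images normalized to [0, 1] (so α = 17 becomes 17/255) and assuming foreground pixels are up-weighted by 1/β; the exact weighting used in the paper is not specified.

```python
import tensorflow as tf

ALPHA = 17.0 / 255.0  # background threshold alpha, for images scaled to [0, 1]

def background_threshold_l1(y_true, y_pred):
    """Weighted L1 loss: pixels above the background threshold (the
    'practical information') are up-weighted by 1/beta, where beta is
    the foreground proportion. One plausible reading of the BL term."""
    l1 = tf.abs(y_true - y_pred)
    fg = tf.cast(y_true > ALPHA, tf.float32)   # foreground mask
    beta = tf.reduce_mean(fg)                  # proportion of pixels > alpha
    weights = 1.0 + fg / (beta + 1e-6)         # background weight 1, foreground 1 + 1/beta
    return tf.reduce_mean(weights * l1)
```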

The image evaluation metrics are the peak signal-to-noise ratio (PSNR) [36] and structural similarity (SSIM) [37]. The PSNR is expressed as $PSNR = 20\log_{10}\left(\frac{2^N - 1}{\sqrt{MSE}}\right)$ with $MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}[I(i,j) - K(i,j)]^2$, where N is the pixel bit width of the image (our dataset is grayscale, i.e., N = 8), I and K are the two images being compared, and m and n give the image size. SSIM combines brightness, contrast, and structure: $SSIM(x,y) = [l(x,y)]^\alpha [c(x,y)]^\beta [s(x,y)]^\gamma$, where α, β, and γ are all set to 1 in our work, $l(x,y)$ denotes the brightness comparison, $c(x,y)$ the contrast comparison, and $s(x,y)$ the structure comparison. With $\mu_x$ and $\mu_y$ the means of x and y, $\sigma_x$ and $\sigma_y$ their standard deviations, and $\sigma_{xy}$ their covariance, the three comparisons are $l(x,y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$, $c(x,y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$, and $s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$, where $C_1$, $C_2$, and $C_3$ are constants that prevent the denominators from reaching 0.
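Both metrics match the definitions built into TensorFlow, so they can be checked directly with the existing tf.image.psnr and tf.image.ssim calls; the toy inputs below are illustrative.

```python
import tensorflow as tf

# Two 8-bit grayscale images (H, W, 1) with values in [0, 255].
ref = tf.cast(tf.random.uniform((256, 256, 1), maxval=256, dtype=tf.int32),
              tf.float32)
rec = tf.clip_by_value(ref + tf.random.normal(ref.shape, stddev=2.0), 0., 255.)

psnr = tf.image.psnr(ref, rec, max_val=255)  # 20*log10((2^8 - 1)/sqrt(MSE))
ssim = tf.image.ssim(ref, rec, max_val=255)  # luminance * contrast * structure
print(float(psnr), float(ssim))
```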

3. Result and discussion

EDSR serves as the basic model in Fig. 1. First, we validate the overall performance and use ablation experiments to assess the contribution of each framework component to computational cost and reconstruction quality. Then, we discuss the impacts of the number of framework parameters and of the photoacoustic dataset quality.

3.1 Overall framework experiment

Results are shown in Table 3. Our framework achieves nearly identical evaluation metrics to vanilla EDSR while using only 39.08% of the computation. Due to the lightweight design of the front-end, back-end, and object detection networks, the increase in network parameters is acceptable given the reduced computational cost. The results indicate that this method improves the efficiency of the super-resolution network.


Table 3. NFSR experiment comparison. The table is based on the photoacoustic dataset; values are rounded to three decimal places.

3.2 Framework ablation experiment

Ablation experiments are used to examine the function of each module, and the rationality of the background threshold loss function used in our work is discussed.

First, we use the core network (NULL), composed of the object detection network and the image super-resolution reconstruction network, and test the impact of each component on it through the ablation experiments in Fig. 2. Each time, only one component is added, and SSIM and PSNR are calculated on the test dataset. According to the evaluation metrics, FN contributes more at the pixel level, while BN contributes more to the visual effect. Compared with the core network, the complete network performs better, especially at the pixel level, where PSNR increases by 0.241 dB while consuming only a little extra computing power.


Fig. 2. (a) Framework ablation experiment; the left panel shows the PSNR metric and the right panel the SSIM metric. Except for the FULL network, only one component is added to NULL at a time. (b) Framework ablation experiment with bicubic up-sampling as a comparison; NFSR_NoFrontNet is the result without the front-end network, and NFSR_NoBackendNet is the result without the back-end network. Magnified views of the red dotted boxes show the reconstruction details.


The overall trend is consistent with Fig. 2. With a pure EDSR network as the control group, Fig. 3(a) shows a negative impact on the overall PSNR, whereas Fig. 3(b) shows that BL and BN significantly enhance the overall SSIM. BL inhibits excessive learning of background information in the network, and BN fuses the background with the enlarged region of interest, defined as the foreground; both account for substantial improvements in the visual effect.


Fig. 3. (a) Influence of network structure on PSNR. (b) Influence of network structure on SSIM.


Although the threshold-based background loss function is simple, it significantly enhances the quality of network reconstruction. To isolate the loss function from the rest of the framework, we trained only the EDSR reconstruction network with different loss functions: L1, FFL [38], and SmoothLoss [39]. L1 and SmoothLoss operate in the image space domain. SmoothLoss combines L1 and L2 and optimizes more efficiently than plain L1, but it softens the penalty on outlier pixels, so its performance is slightly lower than L1 in our experiments. FFL transforms spatial information into the frequency domain and performs well on natural scene images; however, the data distribution of photoacoustic histology images differs from that of everyday images. As shown in Table 4, adding the background threshold loss to the L1 loss improves the evaluation metrics and achieves the desired result.


Table 4. Loss Function Comparison Experiment

3.3 Super resolution image reconstruction experiment

One development trend in image super-resolution networks is toward larger receptive fields and deeper feature understanding, which increases the number of network parameters. Meanwhile, another notable trend is toward small super-resolution networks that can run on mobile or embedded devices [40,41]. Our acceleration framework is designed to adapt to super-resolution networks of varying parameter sizes. In Fig. 4, we used different channel numbers to simulate super-resolution networks with different receptive fields: for EDSR, channel numbers of 128, 256, and 384 define LittleNet, Net, and LargeNet, respectively. The efficiency gain declines as the network becomes smaller, because with the overhead of the front-end, back-end, and object detection networks fixed, the framework's acceleration ratio tends toward the fraction of the image occupied by the region of interest, and the marginal effect of the fixed overhead shrinks as the core network grows. Moreover, for every core network size, our framework keeps the reconstruction quality of the accelerated network extremely close to that of the original network, demonstrating its adaptability to existing super-resolution networks of different sizes. Finally, since the data scheduling of the framework inevitably adds time cost, we measured the time needed to reconstruct a single 64 × 64 pixel image into a 256 × 256 pixel image with each network: LittleNet, Net, and LargeNet take 0.226 s, 0.289 s, and 0.31 s, respectively, whereas sparse sampling saves on the order of hours of acquisition time. This overhead is therefore acceptable.


Fig. 4. Impact of core network size on the acceleration effect. LittleNet, Net, and LargeNet correspond to the EDSR networks using 128, 256, and 384 channels, respectively. The result is obtained using super-resolution networks of different sizes in the non-uniform reconstruction framework. The reference of each percentage is the result of the plain network without the reconstruction framework.


3.4 Impact of dataset quality

The impact of dataset quality on the reconstructed output of the framework is studied in Fig. 5. In Fig. 5(d), the images are reconstructed from photoacoustic data of moderate quality, in which the edges of the cell nuclei are moderately clear and free of severe artifacts. The highest-quality photoacoustic histology image is reconstructed very well by the framework, as illustrated in Fig. 5(c), whereas the nucleus edges in Figs. 5(a) and 5(b) are obviously blurred. Clearly, such interference adversely affects the image reconstruction network. However, because no public defocused photoacoustic dataset is available, the network cannot currently be trained for this case owing to the lack of paired gold-standard defocused images. By acquiring large numbers of accurately defocused images with an improved photoacoustic microscope, we will perform reconstruction on a defocused photoacoustic dataset in future work.


Fig. 5. The influence of dataset quality on network performance. Test-dataset results of the complete NFSR network are compared with the EDSR results, using PSNR and SSIM as indicators. (a) Test image with the lowest PSNR value. (b) Test image with the lowest SSIM value. (c) Test image with the highest PSNR and SSIM values. (d) Test image with moderate PSNR and SSIM values.


To further verify the universality of the framework, the blood cell count dataset (BCCD), which contains 364 labeled blood smear images, was used for testing. Here, the input and output dimensions of the network were changed to 480 × 640, with all other processing unchanged. Compared with reconstruction without the framework, the framework retains 92.43% of the PSNR and 99.09% of the SSIM while requiring an average of only 28.3% of the computation (Fig. 6). We conclude that the proposed method is feasible on other datasets.


Fig. 6. Validation experiment on the BCCD dataset. (a) Output image reconstructed by the NFSR framework. (b) High-resolution image. (c) and (d) Magnified views of the green and blue dashed boxes in (a) and (b), respectively. The values in box (a) give the amount of computation, average PSNR, and average SSIM.


4. Conclusion and further works

We propose a non-uniform reconstruction framework based on an object detection network. Compared to existing deep learning methods [16,22], its advantage is a significant reduction in computational complexity while maintaining image reconstruction quality. Since object detection and image super-resolution networks alone cannot guarantee sufficient reconstruction quality, our method integrates a lightweight front-end rapid amplification network, a back-end foreground-background fusion network, and a background threshold loss function to improve both network efficiency and reconstruction quality. With a lightweight and effective object detection network based on YOLOv5, invalid background information is eliminated, optimizing both computing power and reconstruction quality. The experimental results indicate that NFSR accelerates existing network computation and is feasible for the image reconstruction of sparse photoacoustic datasets.

These experiments suggest directions for our future work: (1) high-quality reconstruction of out-of-focus and missing areas in photoacoustic images while maintaining rapid imaging output; (2) testing the universality of the framework on other photoacoustic images (e.g., blood vessels and tumors); (3) extending the framework by optimizing the deep learning networks for image segmentation or inpainting; (4) simplifying the network and reducing the total number of model parameters.

Funding

Basic and Applied Basic Research Foundation of Guangdong Province (2020B0301030009); National Natural Science Foundation of China (62175159, 12174204); Science, Technology and Innovation Commission of Shenzhen Municipality (KQTD20170330110444030, JCYJ20200109113808048, 20200806173720001); Natural Science Foundation of Guangdong Province (2023A1515012888); Shenzhen newly introduced high-end talents research startup project (827000593); Key Research Project of Zhejiang Lab (K2022MG0AC05).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. A. B. E. Attia, G. Balasundaram, M. Moothanchery, U. S. Dinish, R. Z. Bi, V. Ntziachristos, and M. Olivo, "A review of clinical photoacoustic imaging: current and future trends," Photoacoustics 16, 100144 (2019).

2. J. Kim, S. Park, Y. Jung, S. Chang, J. Park, Y. M. Zhang, J. F. Lovell, and C. Kim, "Programmable real-time clinical photoacoustic and ultrasound imaging system," Sci. Rep. 6(1), 35137 (2016).

3. C. Liu, J. B. Chen, Y. C. Zhang, J. Y. Zhu, and L. D. Wang, "Five-wavelength optical-resolution photoacoustic microscopy of blood and lymphatic vessels," Adv. Photonics 3(01), 9 (2021).

4. S. Jeon, J. Kim, D. Lee, J. W. Baik, and C. Kim, "Review on practical photoacoustic microscopy," Photoacoustics 15, 100141 (2019).

5. W. Song, Y. C. Wang, H. Chen, X. Z. Li, L. X. Zhou, C. J. Min, S. W. Zhu, and X. C. Yuan, "Label-free identification of human glioma xenograft of mouse brain with quantitative ultraviolet photoacoustic histology imaging," J. Biophotonics 15(5), e202100329 (2022).

6. L. Lin and L. H. V. Wang, "The emerging role of photoacoustic imaging in clinical oncology," Nat. Rev. Clin. Oncol. 19(6), 365–384 (2022).

7. T. Wang, W. Liu, and C. Tian, "Combating acoustic heterogeneity in photoacoustic computed tomography: A review," J. Innov. Opt. Health Sci. 13(3), 16 (2020).

8. L. D. Wang, K. Maslov, and L. H. V. Wang, "Single-cell label-free photoacoustic flowoxigraphy in vivo," Proc. Natl. Acad. Sci. U. S. A. 110(15), 5759–5764 (2013).

9. J. J. Yao, C. H. Huang, L. D. Wang, J. M. Yang, L. Gao, K. I. Maslov, J. Zou, and L. H. V. Wang, "Wide-field fast-scanning photoacoustic microscopy based on a water-immersible MEMS scanning mirror," J. Biomed. Opt. 17(8), 1 (2012).

10. T. Jin, H. Guo, H. B. Jiang, B. W. Ke, and L. Xi, "Portable optical resolution photoacoustic microscopy (pORPAM) for human oral imaging," Opt. Lett. 42(21), 4434–4437 (2017).

11. B. X. Lan, W. Liu, Y. C. Wang, J. H. Shi, Y. Li, S. Xu, H. X. Sheng, Q. F. Zhou, J. Zou, U. Hoffmann, W. Yang, and J. J. Yao, "High-speed widefield photoacoustic microscopy of small-animal hemodynamics," Biomed. Opt. Express 9(10), 4689–4701 (2018).

12. L. A. Song, K. Maslov, and L. V. Wang, "Multifocal optical-resolution photoacoustic microscopy in vivo," Opt. Lett. 36(7), 1236–1238 (2011).

13. T. Liu, M. J. Sun, N. Z. Feng, M. H. Wang, D. Y. Chen, and Y. Shen, "Sparse photoacoustic microscopy based on low-rank matrix approximation," Chin. Opt. Lett. 14(9), 091701 (2016).

14. T. Liu, M. J. Sun, Y. Liu, D. P. Hu, Y. Ma, L. Y. Ma, and N. Z. Feng, "ADMM based low-rank and sparse matrix recovery method for sparse photoacoustic microscopy," Biomed. Signal Process. Control 52, 14–22 (2019).

15. L. C. Mao and J. Zhao, "A survey on the new generation of deep learning in image processing," IEEE Access 7, 172231–172263 (2019).

16. H. Deng, H. Qiao, Q. Dai, and C. Ma, "Deep learning in photoacoustic imaging: a review," Photoacoustics 22, 100241 (2021).

17. S. Guan, A. Khan, P. V. Chitnis, and S. Sikdar, "Fully dense UNet for 2D sparse photoacoustic tomography reconstruction," IEEE J. Biomed. Health Inform. 24(2), 568–576 (2020).

18. N. Awasthi, G. Jain, S. K. Kalva, M. Pramanik, and P. K. Yalavarthy, "Deep neural network-based sinogram super-resolution and bandwidth enhancement for limited-data photoacoustic tomography," IEEE Trans. Ultrason., Ferroelect., Freq. Contr. 67(12), 2660–2673 (2020).

19. H. J. Zhang, H. Y. Li, N. Nyayapathi, D. P. Wang, A. Le, L. Ying, and J. Xia, "A new deep learning network for mitigating limited-view and under-sampling artifacts in ring-shaped photoacoustic tomography," IEEE Comput. Grap. Appl. 40(6), 8–11 (2020).

20. A. DiSpirito, D. W. Li, T. Vu, M. M. Chen, D. Zhang, J. W. Luo, R. Horstmeyer, and J. J. Yao, "Reconstructing undersampled photoacoustic microscopy images using deep learning," IEEE Trans. Med. Imaging 40(2), 562–570 (2021).

21. J. S. Zhou, D. He, X. Y. Shang, Z. D. Guo, S. L. Chen, and J. J. Luo, "Photoacoustic microscopy with sparse data by convolutional neural networks," Photoacoustics 22, 100242 (2021).

22. H. Zhao, Z. Ke, F. Yang, K. Li, N. Chen, L. Song, C. Zheng, D. Liang, and C. Liu, "Deep learning enables superior photoacoustic imaging at ultralow laser dosages," Adv. Sci. 8(3), 2003097 (2021).

23. J. Zhang, M. Liu, X. Wang, and C. Cao, "Residual net use on FSRCNN for image super-resolution," in 2021 40th Chinese Control Conference (CCC) (2021).

24. S. R. Dubey, S. K. Singh, and B. B. Chaudhuri, "Activation functions in deep learning: a comprehensive survey and benchmark," Neurocomputing 503, 92–108 (2022).

25. D. Kingma and J. Ba, "Adam: a method for stochastic optimization," arXiv, arXiv:1412.6980 (2014).

26. Y. Li, B. Sixou, and F. Peyrin, "A review of the deep learning methods for medical images super resolution problems," IRBM 42(2), 120–133 (2021).

27. B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee, "Enhanced deep residual networks for single image super-resolution," in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2017), pp. 1132–1140.

28. J. Yu, Y. Fan, J. Yang, N. Xu, Z. Wang, X. Wang, and T. S. Huang, "Wide activation for efficient and accurate image super-resolution," arXiv, arXiv:1808.08718 (2018).

29. X. Wang, L. Xie, C. Dong, and Y. Shan, "Real-ESRGAN: training real-world blind super-resolution with pure synthetic data," in 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (2021), pp. 1905–1914.

30. A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: optimal speed and accuracy of object detection," arXiv, arXiv:2004.10934 (2020).

31. Y. Luo, Y. F. Zhang, X. Z. Sun, H. W. Dai, and X. H. Chen, "Intelligent solutions in chest abnormality detection based on YOLOv5 and ResNet50," J. Healthcare Eng. 2021, 1–11 (2021).

32. Y. Z. Zhu and W. Q. Yan, "Traffic sign recognition based on deep learning," Multimed. Tools Appl. 81(13), 17779–17791 (2022).

33. M. L. Cao, H. Fu, J. Y. Zhu, and C. G. Cai, "Lightweight tea bud recognition network integrating GhostNet and YOLOv5," Math. Biosci. Eng. 19(12), 12897–12914 (2022).

34. K. Han, Y. Wang, Q. Tian, J. Guo, and C. Xu, "GhostNet: more features from cheap operations," in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020).

35. H. Zhao, O. Gallo, I. Frosio, and J. Kautz, "Loss functions for image restoration with neural networks," IEEE Trans. Comput. Imaging 3(1), 47–57 (2017).

36. Z. H. Wang, J. Chen, and S. C. H. Hoi, "Deep learning for image super-resolution: a survey," IEEE Trans. Pattern Anal. Mach. Intell. 43(10), 3365–3387 (2021).

37. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Trans. on Image Process. 13(4), 600–612 (2004).

38. L. Jiang, B. Dai, W. Wu, and C. C. Loy, "Focal frequency loss for image reconstruction and synthesis," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2021), pp. 13919–13929.

39. R. Girshick, "Fast R-CNN," arXiv, arXiv:1504.08083 (2015).

40. A. Howard, M. Sandler, G. Chu, L. C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, and V. Vasudevan, "Searching for MobileNetV3," arXiv, arXiv:1905.02244 (2019).

41. Y. Li, J. Liu, and L. Wang, "Lightweight network research based on deep learning: a review," in 2018 37th Chinese Control Conference (CCC) (2018).
