Optica Publishing Group

Excitation-based fully connected network for precise NIR-II fluorescence molecular tomography

Open Access

Abstract

Fluorescence molecular tomography (FMT) is a novel imaging modality for obtaining the three-dimensional (3D) distribution of fluorescence biomarkers. However, the simplified mathematical model and the complicated inverse problem limit its ability to achieve precise results. In this study, second near-infrared (NIR-II) fluorescence imaging was adopted to mitigate tissue scattering and reduce noise interference. An excitation-based fully connected network was proposed to model the inverse process of NIR-II photon propagation and directly obtain the 3D distribution of the light source. An excitation block was embedded in the network, allowing it to autonomously pay more attention to neurons related to the light source. A barycenter error term was added to the loss function to improve the localization accuracy of the light source. Both numerical simulations and in vivo experiments showed the superiority of the novel NIR-II FMT reconstruction strategy over the baseline methods. This strategy is expected to facilitate the application of machine learning in biomedical research.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Fluorescence molecular tomography (FMT) is an effective optical imaging modality that has been applied to biomedical and preclinical research [1–4]. Compared with planar fluorescence molecular imaging, FMT provides three-dimensional (3D) information on in vivo fluorescence signals through imaging algorithms.

Commonly, FMT reconstruction is performed based on a photon propagation description and the corresponding solution algorithms, which mainly focus on the optical parameters in the first near-infrared (NIR-I, 700–900 nm) window. However, photons in this spectrum suffer from severe scattering, and it is challenging to achieve desirable performance. In contrast, photon scattering and tissue absorbance are relatively low in the second near-infrared (NIR-II, 1000–1700 nm) window, which provides high penetration depth and imaging contrast [5,6]. Owing to these advantages, FMT reconstruction adopting NIR-II optical properties and utilizing NIR-II fluorescence imaging has exhibited higher precision than NIR-I FMT. For instance, Wang et al. [7] performed a singular-value analysis method in phantom studies of NIR-II FMT reconstruction and demonstrated its advantages in improving spatial resolution compared with NIR-I FMT reconstruction. Cai et al. [8] proposed a novel Gaussian weighted neighborhood fused Lasso method and carried out both NIR-I and NIR-II FMT reconstruction in mice bearing subcutaneous liver tumors based on the algorithm. In vivo experiments also verified the excellent performance of NIR-II FMT reconstruction in localization accuracy and morphological recovery.

Although the reconstruction performance has been improved by introducing NIR-II optical properties, these studies describe photon propagation through an approximate model of the radiative transfer equation to obtain 3D information about the light source [9]. The accuracy of the simplified model inevitably limits the reconstruction performance, and its predictions deviate from the actual photon propagation [10]. Moreover, the complex inverse problem increases the reconstruction difficulty [11,12]. Over the decades, several spectrum-specific forward models and optimization strategies have been proposed to improve FMT reconstruction performance [13–15]. Nonetheless, the fundamental problem has not been solved yet.

With the development of machine learning, novel reconstruction strategies based on neural networks have recently been implemented to solve the above problems. Specifically, a neural network is constructed to model the inverse photon propagation process using massive parameters. The 3D light source information is obtained by directly importing the surface photon intensity of the imaged subject into the network. For instance, Meng et al. [16] proposed a K-nearest neighbor-based locally connected (KNN-LC) network to improve morphological recovery. However, the K value in the network was determined based on the specific data set and had to be re-tuned for each new reconstruction task. Guo et al. [17] designed a 3D-En-Decoder network and achieved highly accurate reconstruction results. However, the effectiveness of the network in biological applications was not validated by in vivo experiments. Furthermore, a 3D fusion dual-sampling convolutional neural network was proposed to obtain high spatial resolution [18]. However, it was limited to light sources with a single morphology and could not accurately reconstruct sources with complex shapes. In summary, a novel network structure with high generalization performance, suitable for light sources with different morphologies, is necessary. Further application to NIR-II FMT reconstruction is expected to improve reconstruction performance.

In this study, NIR-II optical parameters of biological tissues were first analyzed and adopted to construct simulation data sets, taking advantage of NIR-II fluorescence imaging to achieve precise reconstruction results. Besides, an excitation-based fully connected network (EFCN) was proposed to learn and model the propagation of NIR-II photons in biological tissues. EFCN consisted of a fully connected neural network with an excitation block connected in parallel. The excitation block was designed to assign a different weight to each neuron, allowing the network to autonomously pay more attention to neurons related to the light source. The barycenter error between the reconstructed light sources and the actual light sources was utilized for training the network to improve localization accuracy. Besides, the KNN-LC network [16] and the multilayer perceptron-based inverse problem simulation (IPS) method [19] were used as baselines for comparing the reconstruction results. Simulation experiments demonstrated the advanced performance of EFCN in reconstructing light sources of different sizes, morphologies, and numbers. More importantly, in vivo experiments on glioma-bearing mice showed the outstanding performance of EFCN combined with NIR-II fluorescence imaging, in which the mean center localization error was only 0.26 mm and the mean Dice metric [20] was 0.78. The results revealed the superiority of our proposed EFCN, which can be applied to other biomedical research in the future.

2. Materials and methods

2.1 Model-based reconstruction strategy

The reconstruction strategy is based on the approximation model of the radiative transfer equation [9], which describes photon propagation in biological tissues. A linear relationship between photon intensity on the surface of the imaged subject and the 3D light source information is built after finite element analysis [21]. The calculation is described below:

$$\Phi = AX$$
where $\Phi$ is the surface photon intensity vector with a dimension of n × 1; A and X are the system matrix and the actual light source vector, with sizes of n × m and m × 1, respectively. The actual light source distribution X is obtained by solving the inverse problem of the above equation.
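For orientation, this conventional inverse problem can be sketched as a Tikhonov-regularized least-squares solve. Everything below (the random stand-in matrix, the sizes, and the regularization weight) is illustrative, not the paper's actual finite element system or solver:

```python
import numpy as np

# Model-based baseline sketch: recover X from Phi = A @ X with
# Tikhonov-regularized least squares.
rng = np.random.default_rng(0)
n, m = 50, 200                       # n surface measurements, m mesh nodes
A = rng.random((n, m))               # stand-in for the finite element system matrix
X_true = np.zeros(m)
X_true[90:100] = 1.0                 # a small, localized "light source"
Phi = A @ X_true                     # simulated surface photon intensity

lam = 1e-2                           # assumed regularization weight
# Solve (A^T A + lam * I) X = A^T Phi
X_rec = np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ Phi)
print(X_rec.shape)                   # (200,)
```

Because n ≪ m, the system is underdetermined and regularization is what makes the solve well-posed; this ill-posedness is exactly what motivates the learned inverse model introduced next.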

2.2 Excitation-based fully connected network

In this study, an excitation-based fully connected network (EFCN) was proposed to solve the inverse problem of FMT reconstruction. The network mainly contained two blocks, a network backbone and an excitation block (Fig. 1). The backbone was a fully connected (Fc) sub-network with four layers. The input of the EFCN was the photon intensity Φ, dispersed from the surface of the numerical mouse head. The output of the network was the reconstructed light source X. The number of neurons for each hidden and output layer was assigned as the number of nodes of the numerical mouse brain. Besides, a bottleneck block with two fully connected layers was connected in parallel between the Fc1 and Fc2 layers. It aimed to excite the hidden-layer nodes more relevant to the light source by assigning them higher weights. This conception was inspired by the squeeze-and-excitation mechanism [22], an attention mechanism used in convolutional neural networks to allow the model to pay more attention to essential features. The mechanism of the excitation block was defined as:

$$Fc{2_{in}} = \delta ({Fc{1_{out}} \ast {s_{excitation}}} )$$
where Fc2in and Fc1out are the input of the Fc2 layer and output of the Fc1 layer, respectively. $\delta$ refers to the ReLU function, and * refers to the element-wise multiplication. sexcitation is the output of the Ex2 layer and is defined as:
$${s_{excitation}} = \sigma ({{W_2}\delta ({{W_1}Fc{1_{out}}} )} )$$
where $\sigma$ is the sigmoid activation, ${W_1} \in {R^{\frac{m}{r} \times m}}$ and ${W_2} \in {R^{m \times \frac{m}{r}}}$. The reduction ratio r of the bottleneck was set to 5 in this paper. The sigmoid function was used as the activation in the Ex2 layer: since it maps the input to the range from 0 to 1, with large values mapped toward 1 and small values toward 0, it can easily serve as a weight indicating whether to excite or suppress a node.
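The two equations above can be sketched as a small PyTorch module. The hidden size (m = 100) and the class and layer names are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ExcitationBlock(nn.Module):
    """Sketch of the excitation mechanism: s = sigmoid(W2 ReLU(W1 h)),
    out = ReLU(h * s), with bottleneck reduction ratio r (r = 5 in the paper)."""
    def __init__(self, m: int, r: int = 5):
        super().__init__()
        self.ex1 = nn.Linear(m, m // r, bias=False)   # W1: m -> m/r
        self.ex2 = nn.Linear(m // r, m, bias=False)   # W2: m/r -> m

    def forward(self, fc1_out: torch.Tensor) -> torch.Tensor:
        # s_excitation = sigmoid(W2 * ReLU(W1 * Fc1_out))
        s = torch.sigmoid(self.ex2(torch.relu(self.ex1(fc1_out))))
        # Fc2_in = ReLU(Fc1_out ⊙ s): each neuron is excited or suppressed
        return torch.relu(fc1_out * s)

h = torch.randn(8, 100)                  # a batch of 8, m = 100 hidden neurons
out = ExcitationBlock(m=100, r=5)(h)
print(tuple(out.shape))                  # (8, 100)
```

Because the sigmoid weights lie in (0, 1), the element-wise product can only attenuate or preserve each neuron's activation, which matches the "excite or suppress" interpretation in the text.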


Fig. 1. Schematic of EFCN. The network receives the dispersed surface photon intensity and directly learns the inverse photon propagation. The backbone is shown in blue, and the excitation block in green is designed to assign different weights to the network nodes autonomously.


2.3 Loss function

The localization error between the reconstructed light source and the actual light source was considered the loss term to improve the localization accuracy. The loss function of the EFCN is defined as:

$$\mathrm{Loss} = \|{Y - X} \|_2^2 + \lambda {\|W \|_2} + \xi \,\mathrm{BCE}$$
where $\|{Y - X} \|_2^2$ is the mean-square error (MSE) between the output vectors of the network and the actual light source vectors. ${\|W \|_2}$ is the L2 regularization of the network weights; $\lambda$ was set to 3 × 10−5 in this study. $\xi$ is the weight used to balance the BCE loss and was assigned to 0.05. BCE is the barycenter error [19], denoted as the Euclidean distance between two coordinates:
$$BCE = {||{B{C_{re}} - B{C_r}} ||_2}$$
where BCre and BCr are the barycenters of the reconstruction results and actual light sources, respectively:
$$BC = \frac{\sum\limits_{i = 1}^m {{C_i} \times {v_i}}}{\sum\limits_{i = 1}^m {{v_i}}}$$
where Ci is (x, y, z) representing the coordinate of the ith vertex, vi is a scalar indicating the signal intensity of the ith vertex, and m is the number of vertices of the light source. The BCE loss was expected to improve the localization accuracy of the light source.
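The loss and barycenter definitions above can be sketched in NumPy as follows. The function names are hypothetical; the weights λ = 3 × 10−5 and ξ = 0.05 follow the values stated in the text:

```python
import numpy as np

def barycenter(coords, v):
    """BC = (sum_i C_i * v_i) / sum_i v_i, for vertex coordinates coords
    of shape (m, 3) and signal intensities v of shape (m,)."""
    return (coords * v[:, None]).sum(axis=0) / v.sum()

def efcn_loss(Y, X, coords, W, lam=3e-5, xi=0.05):
    """Sketch of the full loss: ||Y - X||_2^2 + lam * ||W||_2 + xi * BCE."""
    mse = np.sum((Y - X) ** 2)                 # squared L2 data term
    l2 = np.linalg.norm(W)                     # L2 norm of network weights
    bce = np.linalg.norm(barycenter(coords, Y) - barycenter(coords, X))
    return mse + lam * l2 + xi * bce

# Three vertices on the x axis with intensities 1, 1, 2:
coords = np.array([[0., 0., 0.], [1., 0., 0.], [2., 0., 0.]])
print(barycenter(coords, np.array([1., 1., 2.])))   # barycenter at x = 1.25
```

Note that the BCE term is only well defined when the intensity vectors have a nonzero sum, which holds for the nonnegative source distributions considered here.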

2.4 FMT simulation data sets in the NIR-II spectrum

Sufficient samples were crucial for training the EFCN to learn the inverse process of photon propagation. The Monte Carlo (MC) method [23,24] and a standard mesh of a mouse head were implemented to generate the simulation samples. The numerical mouse head was a standard 3D mesh identical to the one described in our previous work [25,26]. It contained the brain, muscle, and skull of a mouse head. The optical parameters corresponding to the excitation and emission wavelengths of the different tissues are shown in Table 1 [27–31]. For each simulation sample, the excitation wavelength was set to 800 nm and the emission wavelength to 1300 nm. Besides, 200000 photons were launched from the excitation source, each carrying a spectral energy of 1.00. The light sources of the simulation samples were placed at different positions in the mouse brain to mimic actual tumor distributions.


Table 1. Optical parameters of the main organs used to build the simulation data sets

First, 2265 spheroid light sources with a radius of 0.6 mm were used to produce single-source samples with the Molecular Optical Simulation Environment (MOSE v2.3) [32,33]. The barycenter gap between two adjacent light sources was 0.50 mm. The 2265 simulation samples were then randomly divided into training and validation sets containing 1965 and 300 samples, respectively. Furthermore, to expand the scale of the data set and mitigate overfitting, data augmentation based on the single-source samples was performed. The inputs and labels of the original single-source samples were assembled separately to produce the dual-source and big-source samples, calculated as follows:

$${\Phi _{mul}} = \sum\limits_{i \in S} {{\Phi _i}}$$
$${X_{mul}} = \sum\limits_{i \in S} {{X_i}}$$
where ${\Phi _{mul}}$ and ${X_{mul}}$ refer to the assembled photon intensity and light source vector, respectively. ${\Phi _i}$ and ${X_i}$ represent the photon intensity vector and light source vector of each single-source sample. S is the single-source data set for data assembling, either a training or validation set. Two samples were randomly selected from S to generate dual-source samples, and n nearest samples in S were chosen to create big-source samples, where n was set to 10, 20, and 30, respectively.
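The assembly rules of Eqs. (7)–(8) can be sketched as follows. The random data are stand-ins, and selecting the k nearest singles by label-vector distance is an illustrative simplification of the paper's spatially nearest-sample selection:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical single-source data set: N samples with surface photon vectors
# Phi of shape (N, n) and light source label vectors X of shape (N, m).
N, n, m = 100, 50, 200
Phi = rng.random((N, n))
X = rng.random((N, m))

# Dual-source sample: sum the inputs and labels of two random singles.
i, j = rng.choice(N, size=2, replace=False)
Phi_dual, X_dual = Phi[i] + Phi[j], X[i] + X[j]

# Big-source sample: sum the k nearest singles (k = 10, 20, or 30 in the
# paper). Nearness by label-vector distance is an illustrative stand-in.
k = 10
near = np.argsort(np.linalg.norm(X - X[i], axis=1))[:k]
Phi_big, X_big = Phi[near].sum(axis=0), X[near].sum(axis=0)
print(Phi_big.shape, X_big.shape)   # (50,) (200,)
```

This additivity is justified by the linearity of the forward model: the surface intensity of a superposition of sources is the sum of the individual surface intensities.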

In total, 21860 samples were used to train the network, comprising 1965 single-source samples, 14000 dual-source samples, and 5895 big-source samples. Besides, 3200 samples, including 300 single-source samples, 2000 dual-source samples, and 900 big-source samples, were used to construct the validation set.

To further test the reconstruction ability of EFCN, 315 cylindroid single-source samples, two ellipsoidal big-source samples, and two spheroid big-source samples were also constructed using the MOSE platform. Additionally, 500 cylindroid dual-source samples were created from the cylindroid single-source samples, and 100 spheroid triple-source samples were created from the spheroid single-source samples in the validation set. These multi-source samples were constructed following the assembly method described in Eqs. (7)–(8). The exact sizes of the different light sources are described in Fig. S1. The numbers of the various simulation samples in the training, validation, and test sets are summarized in Table S1.

2.5 In vivo experiments

To further evaluate the performance of EFCN in live animals, in vivo experiments were conducted. All animal experiments were conducted under the supervision of the Institutional Animal Care and Use Committee, Chinese Academy of Sciences. Five-week-old BALB/c nude mice (n = 2) were used to construct the tumor-bearing model. About 4 × 106 Luciferase (Luc) and green fluorescence protein (GFP) labeled glioma cells (U87MG-Luc-GFP) were mixed with an equal volume of phosphate buffer saline. Then, the mixed liquid was injected into the mouse brain using the standard stereotaxic instrument (Shenzhen Reward Life Technology Co., Ltd., China). The mice then received in vivo bioluminescence imaging (BLI) to monitor the tumor status five and ten days after seeding glioma cells. BLI was performed by the IVIS Spectrum system (PerkinElmer Inc., USA).

Fourteen days after cell seeding, each mouse model was injected with 75 µL of gadopentetate dimeglumine (Beilu Pharmaceutical Co., Ltd., China) through the tail vein and then received T1-weighted magnetic resonance imaging (MRI) (M3TM, Aspect Imaging, Israel). Next, in vivo NIR-II fluorescence imaging of the mouse head was conducted with the mice anesthetized and placed on the imaging platform. Indocyanine green (ICG) at a dose of 3 mg/kg body weight was administered through the tail vein 24 hours before fluorescence imaging. A high-resolution camera (Cheetah 640, Xenics, Belgium) was used to acquire the fluorescence images. The exposure time was set to 1000 ms, and a high-pass filter (FEL1000, Thorlabs, USA) was used to receive the NIR-II signals. Besides, continuous-wave light at 792 nm was used as the excitation light, and its power density at the imaging site was 100 mW/cm2. After in vivo fluorescence imaging, the mouse heads were sent for frozen sectioning, and ex vivo fluorescence imaging was conducted on the sections using an automatic digital slide scanner. The sections were then sent for hematoxylin and eosin (H&E) staining.

The acquired in vivo NIR-II fluorescence images were first denoised through median filtering with a neighborhood size of 3 × 3 and then mapped to the 3D standard mesh. The surface signal vectors were obtained by discretization. Finally, the vectors were imported into the well-trained networks to obtain the FMT reconstruction results.
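The de-noising step can be sketched with SciPy's median filter; the image below is synthetic, and the subsequent mesh mapping and discretization steps are omitted:

```python
import numpy as np
from scipy.ndimage import median_filter

# 3 x 3 median filtering: isolated hot pixels (salt noise) are removed while
# connected signal regions are preserved.
img = np.zeros((16, 16))
img[6:10, 6:10] = 1.0          # a bright, connected "signal" region
img[2, 2] = 5.0                # an isolated hot pixel (noise)
denoised = median_filter(img, size=3)
print(denoised[2, 2])          # 0.0 -- the lone outlier is suppressed
print(denoised[8, 8])          # 1.0 -- the connected region survives
```

Median filtering is preferred over mean filtering here because a single outlier never dominates the median of a 3 × 3 neighborhood.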

2.6 Network implementation details and quantitative evaluation metrics

PyTorch 1.9.0 and Python 3.8.0 were used to train the EFCN, the KNN-LC network, and the IPS method. The Adam optimizer was used for training the three networks, with ${\beta _1}$ = 0.90 and ${\beta _2}$ = 0.99. EFCN was trained for 300 epochs with a learning rate of 3 × 10−4 and a batch size of 64. Besides, the KNN-LC network and the IPS method were trained for 300 epochs with a learning rate of 1 × 10−3 and a batch size of 265. All calculations were conducted on a computer with an Intel Core i7 CPU and an NVIDIA GeForce GTX 1660 Ti GPU. The training and validation loss curves of EFCN are shown in Fig. S2.
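A minimal training-loop sketch with the stated optimizer settings might look as follows; the two-layer model and random data are stand-ins for the four-layer EFCN backbone and the actual data set:

```python
import torch

# Adam with beta1 = 0.90, beta2 = 0.99, lr = 3e-4, batch size 64, matching
# the reported EFCN settings (the model itself is a simplified stand-in).
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(50, 200), torch.nn.ReLU(), torch.nn.Linear(200, 200))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, betas=(0.90, 0.99))

phi = torch.randn(64, 50)      # one batch of 64 surface photon vectors
x = torch.rand(64, 200)        # corresponding light source vectors
for _ in range(3):             # the paper trains for 300 epochs
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(phi), x)
    loss.backward()
    optimizer.step()
```

In the actual study the loss also includes the L2 and BCE terms described in Section 2.3; only the plain MSE step is shown here.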

The BCE and Dice metrics [20] were used to quantitatively compare the reconstruction results of the different methods on different simulation samples. BCE was the barycenter distance between the reconstruction result and the actual light source; its formulation is described in Eqs. (5)–(6). A smaller BCE indicated a more precise localization. The morphological difference between the actual light source and the reconstruction result was also a vital evaluation indicator. It was characterized by the Dice metric, calculated as follows:

$$Dice = \frac{{2|{A \cap B} |}}{{|A |+ |B |}}$$
where A and B denote the actual source region and the reconstructed source region, respectively. A higher Dice metric indicated a higher similarity between the two light sources.

Besides, to evaluate the performance of different methods in the in vivo experiments, the H&E staining sections of the mouse brain were chosen as the gold standard. The center localization error (CLE) and Dice metrics were selected as the evaluation indicators. The CLE was calculated as:

$$CLE = \sqrt {{{({x - {x_t}} )}^2} + {{({y - {y_t}} )}^2} + {{({z - {z_t}} )}^2}}$$
where (x, y, z) denotes the center coordinate of the light source on the transverse plane, and (xt, yt, zt) represents the center coordinate of the tumor on the H&E staining section. Similarly, a smaller CLE indicated a more precise localization. The calculation of the Dice metric was consistent with Eq. (9), where A and B denote the tumor region on the H&E staining section and the light source region on the transverse plane, respectively.
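The CLE and Dice computations can be sketched as follows, with illustrative region masks and center coordinates:

```python
import numpy as np

def cle(c_rec, c_true):
    """Center localization error: Euclidean distance between two 3D centers."""
    return float(np.linalg.norm(np.asarray(c_rec) - np.asarray(c_true)))

def dice(a, b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for boolean region masks a and b."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

print(cle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0)))   # 1.0

# Overlap of 2 voxels between two 3-voxel regions: Dice = 4/6
region_a = np.array([1, 1, 1, 0], bool)
region_b = np.array([0, 1, 1, 1], bool)
print(round(dice(region_a, region_b), 3))      # 0.667
```

In practice the masks would come from thresholding the reconstructed intensity on the mesh and from the annotated tumor region on the H&E section.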

3. Results

3.1 Validation set reconstruction

The performance of the EFCN was first compared with the KNN-LC network and IPS method by reconstructing the spheroid single-source samples in the validation set (Fig. S3). EFCN achieved the highest mean Dice (0.68) and the minimum mean BCE (0.16 mm), and the Dice of some samples was close to 1.00 (Table S2). Then, the assembled dual-source samples were utilized to evaluate the localization accuracy (Fig. S4). EFCN also showed superior performance in the BCE metric (Table S3). The mean BCE of dual-source samples reconstructed by EFCN was only 49%-63% of that of the IPS method and KNN-LC network. Besides, the standard deviation of BCE achieved by EFCN was the smallest, indicating a more stable reconstruction. The big-source samples of different sizes were further used to evaluate each network (Fig. S5). As with the reconstruction of single-source and dual-source samples, EFCN also obtained excellent performance (Table S4). All these results demonstrated the superiority of EFCN over the other two methods.

3.2 Single-source reconstruction and robustness evaluation

The cylindroid single-source samples were used to test the reconstruction ability of EFCN (Table 2). The light source volume was 1.26 mm3, similar to that of the spheroid single sources in the validation set. Overall, EFCN achieved more precise reconstruction than the other two networks. For a representative sample with a depth of 4.20 mm, EFCN achieved higher accuracy of morphological recovery, and the corresponding localization error was also the lowest (Fig. 2(a)). The quantitative BCE and Dice of the sample are shown in Fig. 2(c, d). The BCE achieved by EFCN was less than 0.25 mm (Fig. 2(c)). Besides, for this cylindroid source, EFCN achieved the highest Dice of 0.71 (Fig. 2(d)). These test results first proved the superiority of EFCN for reconstructing single-source samples and its generalization to sources with a different shape.


Fig. 2. Reconstruction results of the representative cylindroid single-source sample with or without Gaussian noise added. a, 3D-view and the transverse plane of the actual and reconstructed cylindroid source. b, 3D-view and the transverse plane of the actual and reconstructed cylindroid source with 10% Gaussian noise added. c, BCE of the two reconstruction results. d, Dice of the two reconstruction results. The gray dotted lines indicate the center position of actual sources.



Table 2. Quantitative results (mean ± S.D.) of the single-source samples in the test set with or without Gaussian noise added

Furthermore, 10% Gaussian noise was added to the input vector of the cylindroid samples to test the robustness of EFCN (Fig. 2(b)). Quantitative analysis showed that EFCN obtained the highest Dice and the minimum BCE, indicating the excellent adaptive ability and robustness of the network (Fig. 2(c, d)). Overall, with the Gaussian noise added, the mean BCE of all the cylindroid samples reconstructed by EFCN increased by 35%, whereas the baseline methods presented unstable results with rapidly growing BCE (KNN-LC: 56%; IPS: 47%) (Table 2).

3.3 Multi-source reconstruction

The 500 assembled cylindroid dual-source samples were imported into EFCN. Quantitative analysis revealed the superior localization accuracy of EFCN over the other two methods. EFCN achieved the best dual-source reconstruction, with a mean BCE only 69.6%-71.6% of that of the KNN-LC network and IPS method (Table 3).


Table 3. Quantitative results (mean ± S.D.) of the multi-source samples in the test set

The 100 triple-source samples were also employed to test the ability of EFCN to reconstruct multi-source samples. EFCN could still handle this kind of sample well, with a mean localization accuracy 1.4 times that of the KNN-LC network and 1.5 times that of the IPS method (Table 3). For a representative case shown in Fig. 3, EFCN correctly distinguished all three light sources with a mean BCE of only 0.43 mm. The depths of the three light sources S1, S2, and S3 were 1.0 mm, 1.3 mm, and 2.4 mm, respectively. The compared methods either failed to reconstruct all three light sources or showed worse localization and shape recovery. Another triple-source sample reconstructed by the three methods is displayed in Fig. S6. These results demonstrated that EFCN had better generalization ability for multi-source reconstruction.


Fig. 3. Reconstruction results of a representative spheroid triple-source sample in the test set. a, 3D-view and the transverse plane of the actual and reconstructed spheroid sources. b-c, Quantitative results of each source achieved by different methods. The gray dotted lines indicate the center position of actual sources. S1: Source 1; S2: Source 2; S3: source 3; ×: The light source failed to be reconstructed.


3.4 Big-source reconstruction

Additionally, the spheroid and ellipsoidal big-source samples in the test set were used to test the ability of EFCN to reconstruct light sources of a larger size. The radii of the two spheroid samples were 1.00 mm and 1.30 mm, whose volumes were 4.63 and 10.17 times those of the spheroid single sources. The axis lengths of the two ellipsoidal sources along the x, y, and z axes were 1.60 mm, 1.60 mm, and 2.40 mm, respectively. The volume of the ellipsoidal sources was 3.56 times that of the spheroid single sources.

Qualitative and quantitative analyses verified the superiority of EFCN for the reconstruction of big-source samples (Table 4). For the representative spheroid sample with a radius of 1.00 mm, the reconstruction result of EFCN was closer to the actual light source (Fig. 4(a)), whereas the results of the KNN-LC network and IPS method showed large localization deviations. Similar results were observed in reconstructing the ellipsoidal big-source sample (Fig. 4(b)). EFCN surpassed the KNN-LC network and IPS method in localization accuracy and morphological recovery. Specifically, the mean BCE of the KNN-LC network was 2.1 times that of EFCN (Fig. 4(c)), and the mean Dice of EFCN was 1.8 times that of the IPS method (Fig. 4(d)).


Fig. 4. Reconstruction results of representative big-source samples with different shapes by different methods. a, 3D-view and the transverse plane of the actual and reconstructed spheroid big-source with a radius of 1.0 mm. b, 3D-view and the transverse plane of the actual and reconstructed ellipsoidal big-source. c, BCE of the two reconstructed sources. d, Dice of the two reconstructed sources. The gray dotted lines indicate the center position of actual sources.



Table 4. Quantitative results (mean ± S.D.) of the spheroid and ellipsoidal big-source samples

3.5 In vivo experiments

The in vivo experiments on two glioma-bearing mice were conducted to evaluate the feasibility of EFCN for biomedical applications (Fig. S7a). MRI was first performed to determine the tumor status (Fig. 5(a)), and NIR-II fluorescence images were then acquired (Fig. 5(b)). The ex vivo fluorescence images were consistent with the H&E stained sections, verifying ICG accumulated in the glioma (Fig. S7b; Fig. 5(c,e)). FMT reconstruction was performed based on the in vivo fluorescence images and the well-trained networks. The reconstruction results were merged with the H&E stained sections (Fig. 5(d,f)).


Fig. 5. In vivo experiments on two glioma-bearing mice. a, MRI results of the two glioma mice. b, In vivo NIR-II images of the mouse brain, which are the basis of FMT reconstruction. c, H&E stained section of the mouse-1 head. The black circle indicates the tumor region. d, FMT reconstruction results achieved by the three methods merged with the H&E stained section; the greyish-green regions show the tumor reconstructed by the three methods. e, H&E stained section of the mouse-2 head. The black circle indicates the tumor region. f, The greyish-green areas show the tumor reconstructed by the three methods merged with the H&E stained section. g-h, CLE and Dice of the FMT reconstruction results on the two glioma mouse models.


Quantitative analysis demonstrated that EFCN achieved more precise reconstruction on the two glioma mouse models (Fig. 5(g,h)). EFCN achieved the highest Dice and the lowest CLE, consistent with the simulation experiments. In particular, for the tumor at a shallower depth, EFCN obtained high morphological recovery accuracy (Dice = 0.81), 1.4 and 1.5 times that of the KNN-LC network and IPS method, respectively (Fig. 5(d,h)). Even for the deeper tumor with irregular morphology, the minimum CLE was obtained by EFCN (CLE = 0.20 mm), whereas the CLEs of the KNN-LC network and IPS method were larger (Fig. 5(f,g)). These in vivo results demonstrated the promising application of EFCN for tracing fluorescence probes and recovering their 3D distribution.

3.6 Ablation studies of EFCN structure and evaluation of reconstruction time

Apart from the EFCN used for reconstruction in this paper, three other models were also trained: EFCN trained without BCE loss (EFCN without BCE), EFCN without the excitation block (EFCN without Excitation), and EFCN without the excitation block and trained without BCE loss (EFCN without BCE nor Excitation). These three networks were used as validation models to assess the effectiveness of the BCE loss and the excitation block.

The validation set (spheroid single-source, spheroid dual-source) and test set (cylindroid single-source, spheroid triple-source) were used for the ablation studies. Removing either the BCE loss or the excitation block degraded morphological recovery and localization accuracy (Fig. 6). The quantitative analysis further revealed that the BCE loss mainly affected source localization, with the mean BCE achieved by EFCN being only 68% of that of EFCN without BCE (Fig. 6(a)). Besides, the excitation block mainly affected morphological recovery (Fig. 6(b)). These quantitative results strongly demonstrated the effectiveness of our improvements in the network structure and training loss.


Fig. 6. Quantitative comparison of the contributions of BCE loss and excitation block to the performance of EFCN. a, The mean BCE of multi-source samples achieved by the four networks. b, The mean Dice of the single-source samples given by the four networks.


Finally, the reconstruction times of EFCN, the KNN-LC network, and the IPS method were compared. The 3200 simulated samples in the validation set were imported into each network. The mean ± S.D. time to process the whole validation set over 50 runs by EFCN, the KNN-LC network, and the IPS method was 0.52 ± 0.04 s, 1.77 ± 0.03 s, and 0.49 ± 0.01 s, respectively. The reconstruction times of EFCN and the IPS method were close, indicating that EFCN matches other machine learning-based FMT reconstruction strategies in reconstruction speed.

4. Discussion

In this study, NIR-II optical parameters of tissues were adopted to construct the simulation data sets. A novel excitation-based fully connected network was proposed to model the inverse process of NIR-II photon propagation. Compared with model-based FMT reconstruction, the network mapped the inverse photon propagation through massive parameters and directly solved the inverse problem from the imported surface photon intensity. Quantitative results demonstrated that the network improved morphological recovery and achieved accurate localization of the light source based on the excitation block and BCE loss. Moreover, the network achieved the desired in vivo reconstruction results by learning from the large set of NIR-II simulation samples.

The input noise was a substantial factor disturbing the reconstruction results; the networks presented unstable performance under noise interference. Although EFCN exhibited better anti-noise ability than the baseline methods in the simulation experiments, the input noise is still non-negligible for in vivo experiments, where background fluorescence and autofluorescence inevitably interfere with the reconstruction results. In this study, the optical parameters in the NIR-II spectrum were employed to construct the simulation samples, and the in vivo experiments were then conducted based on NIR-II fluorescence imaging. The lower background noise and autofluorescence in the NIR-II spectrum provided high-contrast and stable fluorescence imaging for the in vivo experiments [34,35]. This suppressed the input noise from signal acquisition, which is an important strategy for improving reconstruction robustness in addition to network design. Besides, the high imaging depth in the NIR-II spectrum is another advantage of this imaging modality: FMT reconstruction can recover deeper tumors in vivo using NIR-II fluorescence imaging. After training in silico, the EFCN was expected to have the transfer ability and robustness to be applied to in vivo data. The in vivo NIR-II fluorescence images could be directly imported into the network without retraining the network parameters.

The excitation mechanism adopted in EFCN made an exceptional contribution to morphological recovery. The adaptive module tuned its parameters automatically while learning the simulation samples; no additional settings were required, so it is easy to reproduce when deployed on other data sets. MSE loss and Lp regularization, which are generally used to train networks in machine learning-based FMT reconstruction strategies, mainly constrain the intensity of each node and have no direct effect on the position of the light source. To further minimize the localization error of the light source, the BCE loss was added to the loss function in this study. With these two improvements, EFCN obtained optimal morphological recovery and source localization on multiple simulation samples with light sources of different shapes, sizes, and quantities.
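The barycenter-error term is compact enough to write down directly. A minimal numpy sketch, assuming the reconstructed and true sources are intensity vectors over shared mesh-node coordinates (function names are illustrative):

```python
import numpy as np

def barycenter(coords, values):
    """Intensity-weighted barycenter BC = (sum_i C_i * v_i) / sum_i v_i,
    where coords is (m, 3) node coordinates and values is (m,) intensities."""
    w = values / values.sum()
    return coords.T @ w

def bce_loss(coords, x_rec, x_true):
    """Barycenter error: Euclidean distance between the barycenters of the
    reconstructed and the actual source distributions."""
    return np.linalg.norm(barycenter(coords, x_rec) - barycenter(coords, x_true))

# Toy example: the reconstruction is skewed toward the first node.
coords = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
x_true = np.array([1.0, 1.0])   # barycenter at x = 1.0
x_rec  = np.array([3.0, 1.0])   # barycenter at x = 0.5
err = bce_loss(coords, x_rec, x_true)  # -> 0.5
```

Because this term penalizes only where the mass sits, not how much mass each node carries, it complements the intensity-wise MSE term rather than replacing it.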

Although the single-source samples were constructed from light sources with regular shapes, big-source samples were also introduced into the simulation experiments. Each of these samples was assembled from the n (n = 10, 20, 30) nearest single-source samples. The shapes of the assembled light sources were complex and irregular, yet the network still obtained satisfactory reconstruction results (Fig. S5), demonstrating that it can reconstruct light sources with more complicated, irregular shapes. Besides, because each simulation sample in the data set had a unique position and depth, the network was trained to distinguish signals of different depths while learning the process of photon propagation. The BCE loss adopted in the loss function constrained the barycenter of the reconstructed light source, which improved the reconstruction accuracy of the network. Simulation and in vivo experiments demonstrated the effectiveness of the BCE loss for reconstructing light sources at different depths, and these results showed that introducing the extra BCE regularization did not cause over-regularization that might lead to a loss of resolution. All in all, the network structure is relatively simple and should be easy to apply to other imaging tasks in the future [36,37]. Other versatile MC platforms can also be introduced as simulators to generate simulation data with greater diversity [38,39].
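Because the forward model Φ = AX is linear in X, a big-source sample can be assembled by summing single-source samples. A minimal sketch of that assembly under assumed array layouts (all names and shapes are illustrative assumptions):

```python
import numpy as np

def assemble_big_source(positions, phis, xs, seed, n):
    """positions: (N, 3) single-source centers; phis: (N, M) surface
    measurements; xs: (N, K) node intensity vectors.  Select the n samples
    nearest to positions[seed] and sum them, which is valid because the
    forward model Phi = A X is linear in X."""
    d = np.linalg.norm(positions - positions[seed], axis=1)
    S = np.argsort(d)[:n]                  # indices of the n nearest sources
    return phis[S].sum(axis=0), xs[S].sum(axis=0)

# Toy example: three single-source samples on a line; assemble the 2 nearest.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
phis = np.eye(3)
xs = 2.0 * np.eye(3)
phi_mul, x_mul = assemble_big_source(positions, phis, xs, seed=0, n=2)
```

Since only the nearest samples are merged, the assembled source stays spatially compact while its outline becomes irregular.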

The H&E sections were adopted for evaluating the FMT results. The goal of this study was to reconstruct the fluorescence signal of tumors, and clinically, H&E staining, rather than MRI, is the gold standard for determining tumor boundaries [40]. It was therefore more rigorous to compare the reconstruction results against the real tumor using the H&E sections. As shown in Fig. 5, H&E staining provides a higher resolution than the MRI images; errors would be introduced if the Dice and BCE values were calculated from the MRI images, so the H&E sections were the better choice for quantitative analysis. Moreover, the mice were sacrificed and sent for frozen sectioning immediately after in vivo fluorescence imaging, so there was no tissue deformation or drift in the H&E staining and the original morphology of the tumors was maintained. Therefore, the H&E sections rather than the MRI images were used to evaluate the FMT results.

However, some limitations remain in this study. First, all the simulation samples were produced with the optical parameters of the NIR-II spectrum, and no contrast experiments of NIR-I FMT reconstruction based on EFCN were conducted; a quantitative analysis of the performance improvement attributable to the spectral change is therefore absent. Besides, EFCN was designed to directly predict the signal intensity of the mesh nodes, which is essentially a regression task, and training a regression network is more challenging than training a classification network. A two-stage strategy that separates position prediction and intensity regression is worth trying in the future: first classify the nodes related to the light source, and then predict the number of photons each node contains.
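The proposed two-stage alternative could be decoded as follows. This is only a sketch of the idea under assumed inputs (a per-node classification score and a regressed intensity), not an implementation from the paper:

```python
import numpy as np

def two_stage_decode(scores, intensities, threshold=0.5):
    """Stage 1: classify which mesh nodes belong to the source
    (score > threshold).  Stage 2: keep the regressed photon count only on
    those nodes, zeroing everything else."""
    support = scores > threshold
    return np.where(support, intensities, 0.0)

scores = np.array([0.9, 0.2, 0.7, 0.1])       # hypothetical stage-1 output
intensities = np.array([5.0, 3.0, 4.0, 2.0])  # hypothetical stage-2 output
x = two_stage_decode(scores, intensities)     # -> [5., 0., 4., 0.]
```

Decoupling the two predictions lets the classification stage enforce a sparse support, which is often easier to learn than a joint regression over all nodes.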

To our knowledge, this is the first application of a neural network to NIR-II FMT reconstruction. The novel EFCN achieved highly accurate reconstruction of different light sources in both morphological recovery and localization, benefiting from the excitation mechanism and the BCE loss. Additionally, the network showed superior performance in in vivo reconstruction by learning the process of photon propagation in the NIR-II spectrum. We believe this method will promote future research on FMT reconstruction in biomedical applications.

5. Conclusion

In this study, NIR-II fluorescence imaging and a neural network were combined and applied to FMT reconstruction for the first time. NIR-II fluorescence imaging provided stable, high-contrast fluorescence data, and the network accurately modeled the inverse photon propagation process based on the excitation mechanism and the BCE loss. Precise FMT reconstruction results were thus achieved in both simulation and in vivo experiments. In the future, this strategy could serve as an alternative to traditional NIR-I FMT.

Funding

National Key Research and Development Program of China (2017YFA0205200); National Natural Science Foundation of China (62027901, 81930053, 92059207, 81227901); Beijing Natural Science Foundation (JQ19027); CAS Youth Interdisciplinary Team (JCTD-2021-08); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA16021200); Zhuhai High-level Health Personnel Team Project (Zhuhai HLHPTP201703).

Acknowledgments

The authors would like to thank X. Zhang and Q. Qu for their assistance in the in vivo experiments.

Disclosures

The authors declare no competing interests.

Data availability

The data sets and the raw code are available from the corresponding author upon request.

Supplemental document

See Supplement 1 for supporting content.

References

1. W. Xie, Y. Deng, K. Wang, X. Yang, and Q. Luo, “Reweighted L1 regularization for restraining artifacts in FMT reconstruction images with limited measurements,” Opt. Lett. 39(14), 4148–4151 (2014). [CrossRef]  

2. S. Zhang, X. Ma, Y. Wang, M. Wu, H. Meng, W. Chai, X. Wang, S. Wei, and J. Tian, “Robust reconstruction of fluorescence molecular tomography based on sparsity adaptive correntropy matching pursuit method for stem cell distribution,” IEEE Trans. Med. Imaging 37(10), 2176–2184 (2018). [CrossRef]  

3. X. Guo, X. Liu, X. Wang, F. Tian, F. Liu, B. Zhang, G. Hu, and J. Bai, “A combined fluorescence and microcomputed tomography system for small animal imaging,” IEEE Trans. Biomed. Eng. 57(12), 2876–2883 (2010). [CrossRef]  

4. R. Baikejiang, Y. Zhao, B. Z. Fite, K. W. Ferrara, and C. Li, “Anatomical image-guided fluorescence molecular tomography reconstruction using kernel method,” J. Biomed. Opt. 22(5), 055001 (2017). [CrossRef]  

5. Z. Hu, C. Fang, B. Li, Z. Zhang, C. Cao, M. Cai, S. Su, X. Sun, X. Shi, C. Li, T. Zhou, Y. Zhang, C. Chi, P. He, X. Xia, Y. Chen, S. S. Gambhir, Z. Cheng, and J. Tian, “First-in-human liver-tumour surgery guided by multispectral fluorescence imaging in the visible and near-infrared-I/II windows,” Nat. Biomed. Eng. 4, 259–271 (2020). [CrossRef]  

6. A. L. Antaris, H. Chen, K. Cheng, Y. Sun, G. Hong, C. Qu, S. Diao, Z. Deng, X. Hu, B. Zhang, X. Zhang, O. K. Yaghi, Z. R. Alamparambil, X. Hong, Z. Cheng, and H. Dai, “A small-molecule dye for NIR-II imaging,” Nat. Mater. 15(2), 235–242 (2016). [CrossRef]  

7. K. Wang, Q. Wang, Q. Luo, and X. Yang, “Fluorescence molecular tomography in the second near-infrared window,” Opt. Express 23(10), 12669–12679 (2015). [CrossRef]  

8. M. Cai, Z. Zhang, X. Shi, Z. Hu, and J. Tian, “NIR-II/NIR-I fluorescence molecular tomography of heterogeneous mice based on Gaussian weighted neighborhood fused Lasso method,” IEEE Trans. Med. Imaging 39(6), 2213–2222 (2020). [CrossRef]  

9. X. He, H. Guo, J. Yu, X. Zhang, and Y. Hou, “Effective and robust approach for fluorescence molecular tomography based on CoSaMP and SP3 model,” J. Innov. Opt. Health Sci. 9, 11 (2016). [CrossRef]  

10. C. Qin, J. Zhong, Z. Hu, X. Yang, and J. Tian, “Recent advances in Cerenkov luminescence and tomography imaging,” IEEE J. Sel. Top. Quantum Electron. 18(3), 1084–1093 (2012). [CrossRef]  

11. P. Mohajerani and V. Ntziachristos, “An inversion scheme for hybrid fluorescence molecular tomography using a fuzzy inference system,” IEEE Trans. Med. Imaging 35(2), 381–390 (2016). [CrossRef]  

12. K. Liu, J. Tian, C. Qin, X. Yang, S. Zhu, D. Han, and P. Wu, “Tomographic bioluminescence imaging reconstruction via a dynamically sparse regularized global method in mouse models,” J. Biomed. Opt. 16(4), 046016 (2011). [CrossRef]  

13. S. Jiang, J. Liu, Y. An, G. Zhang, J. Ye, Y. Mao, K. He, C. Chi, and J. Tian, “Novel l2,1-norm optimization method for fluorescence molecular tomography reconstruction,” Biomed. Opt. Express 7(6), 2342–2359 (2016). [CrossRef]  

14. E. Edjlali and Y. Bérubé-Lauzière, “Lq−Lp optimization for multigrid fluorescence tomography of small animals using simplified spherical harmonics,” J. Quant. Spectrosc. Radiat. Transf. 205, 163–173 (2018). [CrossRef]  

15. J. Dutta, S. Ahn, C. Li, S. R. Cherry, and R. M. Leahy, “Joint L1 and total variation regularization for fluorescence molecular tomography,” Phys. Med. Biol. 57(6), 1459–1476 (2012). [CrossRef]  

16. H. Meng, Y. Gao, X. Yang, K. Wang, and J. Tian, “K-nearest neighbor based locally connected network for fast morphological reconstruction in fluorescence molecular tomography,” IEEE Trans. Med. Imaging 39(10), 3019–3028 (2020). [CrossRef]  

17. L. Guo, F. Liu, C. Cai, J. Liu, and G. Zhang, “3D deep encoder-decoder network for fluorescence molecular tomography,” Opt. Lett. 44(8), 1892–1895 (2019). [CrossRef]  

18. P. Zhang, G. Fan, T. Xing, F. Song, and G. Zhang, “UHR-DeepFMT: ultra-high spatial resolution reconstruction of fluorescence molecular tomography based on 3-D fusion dual-sampling deep neural network,” IEEE Trans. Med. Imaging 40(11), 3217–3228 (2021). [CrossRef]  

19. Y. Gao, K. Wang, Y. An, S. Jiang, H. Meng, and J. Tian, “Nonmodel-based bioluminescence tomography using a machine-learning reconstruction strategy,” Optica 5(11), 1451–1454 (2018). [CrossRef]  

20. M. Cai, Z. Zhang, X. Shi, J. Yang, Z. Hu, and J. Tian, “Non-negative iterative convex refinement approach for accurate and robust reconstruction in Cerenkov luminescence tomography,” IEEE Trans. Med. Imaging 39(10), 3207–3217 (2020). [CrossRef]  

21. D. Wang, X. Liu, Y. Chen, and J. Bai, “A novel finite-element-based algorithm for fluorescence molecular tomography of heterogeneous media,” IEEE Trans. Inf. Technol. Biomed. 13(5), 766–773 (2009). [CrossRef]  

22. J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, “Squeeze-and-excitation networks,” 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 7132–7141 (2018).

23. S. Bartel and A. H. Hielscher, “Monte Carlo simulations of the diffuse backscattering mueller matrix for highly scattering media,” Appl. Opt. 39(10), 1580–1588 (2000). [CrossRef]  

24. H. Li, J. Tian, and G. Wang, “Photon propagation model of in vivo bioluminescent imaging based on Monte Carlo,” J. Softw. 15, 1709–1719 (2004).

25. Z. Zhang, M. Cai, Y. Gao, X. Shi, X. Zhang, Z. Hu, and J. Tian, “A novel Cerenkov luminescence tomography approach using multilayer fully connected neural network,” Phys. Med. Biol. 64(24), 245010 (2019). [CrossRef]  

26. B. Dogdas, D. Stout, A. F. Chatziioannou, and R. M. Leahy, “Digimouse: a 3D whole body mouse atlas from CT and cryosection data,” Phys. Med. Biol. 52(3), 577–587 (2007). [CrossRef]  

27. S. L. Jacques, “Optical properties of biological tissues: a review,” Phys. Med. Biol. 58(11), R37–R61 (2013). [CrossRef]  

28. G. Alexandrakis, F. R. Rannou, and A. F. Chatziioannou, “Tomographic bioluminescence imaging by use of a combined optical-PET (OPET) system: a computer simulation feasibility study,” Phys. Med. Biol. 50(17), 4225–4241 (2005). [CrossRef]  

29. G. S. Hong, A. L. Antaris, and H. J. Dai, “Near-infrared fluorophores for biomedical imaging,” Nat. Biomed. Eng. 1(3), 0022 (2017). [CrossRef]  

30. G. Strangman, M. A. Franceschini, and D. A. Boas, “Factors affecting the accuracy of near-infrared spectroscopy concentration calculations for focal changes in oxygenation parameters,” NeuroImage 18(4), 865–879 (2003). [CrossRef]  

31. A. J. Lin, M. A. Koike, K. N. Green, J. G. Kim, A. Mazhar, T. B. Rice, F. M. LaFerla, and B. J. Tromberg, “Spatial frequency domain imaging of intrinsic optical property contrast in a mouse model of Alzheimer's disease,” Ann. Biomed. Eng. 39(4), 1349–1357 (2011). [CrossRef]  

32. J. Tian, J. Liang, X. Chen, and X. Qu, “Molecular optical simulation environment,” in Molecular Imaging: Fundamentals and Applications (Springer Berlin Heidelberg, 2013), pp. 15–46.

33. H. Li, J. Tian, F. Zhu, W. Cong, L. V. Wang, E. A. Hoffman, and G. Wang, “A mouse optical simulation environment (MOSE) to investigate bioluminescent phenomena in the living mouse with the Monte Carlo method,” Acad. Radiol. 11(9), 1029–1038 (2004). [CrossRef]  

34. S. Wang, J. Liu, G. Feng, L. G. Ng, and B. Liu, “NIR-II excitable conjugated polymer dots with bright NIR-I emission for deep in vivo two-photon brain imaging through intact skull,” Adv. Funct. Mater. 29, 11 (2019). [CrossRef]  

35. H. Lin, S. Gao, C. Dai, Y. Chen, and J. Shi, “A two-dimensional biodegradable niobium carbide (MXene) for photothermal tumor eradication in NIR-I and NIR-II biowindows,” J. Am. Chem. Soc. 139(45), 16235–16247 (2017). [CrossRef]  

36. Z. Hu, Y. Qu, K. Wang, X. Zhang, J. Zha, T. Song, C. Bao, H. Liu, Z. Wang, J. Wang, Z. Liu, H. Liu, and J. Tian, “In vivo nanoparticle-mediated radiopharmaceutical-excited fluorescence molecular imaging,” Nat. Commun. 6(1), 7560 (2015). [CrossRef]  

37. Z. Hu, J. Liang, W. Yang, W. Fan, C. Li, X. Ma, X. Chen, X. Ma, X. Li, X. Qu, J. Wang, F. Cao, and J. Tian, “Experimental Cerenkov luminescence tomography of the mouse model with SPECT imaging validation,” Opt. Express 18(24), 24441–24450 (2010). [CrossRef]  

38. Q. Fang and S. Yan, “MCX Cloud-a modern, scalable, high-performance and in-browser Monte Carlo simulation platform with cloud computing,” J. Biomed. Opt. 27(8), 083008 (2022). [CrossRef]  

39. Q. Q. Fang and D. A. Boas, “Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing units,” Opt. Express 17(22), 20178–20190 (2009). [CrossRef]  

40. F. J. Voskuil, J. Vonk, B. van der Vegt, S. Kruijff, V. Ntziachristos, P. J. van der Zaag, M. J. H. Witjes, and G. M. van Dam, “Intraoperative imaging in pathology-assisted surgery,” Nat. Biomed. Eng. 6(5), 503–514 (2022). [CrossRef]  

Supplementary Material (1)

Supplement 1: Supplementary results of the manuscript



Figures (6)

Fig. 1. Schematic of EFCN. The network receives the dispersed surface photon intensity and directly learns the inverse photon propagation. The backbone is shown in blue; the excitation block in green is designed to assign different weights to the network nodes autonomously.

Fig. 2. Reconstruction results of the representative cylindroid single-source sample with or without Gaussian noise added. a, 3D view and transverse plane of the actual and reconstructed cylindroid source. b, 3D view and transverse plane of the actual and reconstructed cylindroid source with 10% Gaussian noise added. c, BCE of the two reconstruction results. d, Dice of the two reconstruction results. The gray dotted lines indicate the center positions of the actual sources.

Fig. 3. Reconstruction results of a representative spheroid triple-source sample in the test set. a, 3D view and transverse plane of the actual and reconstructed spheroid sources. b-c, Quantitative results for each source achieved by the different methods. The gray dotted lines indicate the center positions of the actual sources. S1: Source 1; S2: Source 2; S3: Source 3; ×: the light source failed to be reconstructed.

Fig. 4. Reconstruction results of representative big-source samples with different shapes by the different methods. a, 3D view and transverse plane of the actual and reconstructed spheroid big source with a radius of 1.0 mm. b, 3D view and transverse plane of the actual and reconstructed ellipsoidal big source. c, BCE of the two reconstructed sources. d, Dice of the two reconstructed sources. The gray dotted lines indicate the center positions of the actual sources.

Fig. 5. In vivo experiments on two glioma-bearing mice. a, MRI results of the two glioma mice. b, In vivo NIR-II images of the mouse brain, which are the basis of the FMT reconstruction. c, H&E-stained section of the mouse-1 head; the black circle indicates the tumor region. d, FMT reconstruction results achieved by the three methods merged with the H&E-stained section; the greyish-green regions show the tumor reconstructed by the three methods. e, H&E-stained section of the mouse-2 head; the black circle indicates the tumor region. f, The greyish-green areas show the tumor reconstructed by the three methods, merged with the H&E-stained section. g-h, CLE and Dice of the FMT reconstruction results on the two glioma mouse models.

Fig. 6. Quantitative comparison of the contributions of the BCE loss and the excitation block to the performance of EFCN. a, Mean BCE of the multi-source samples achieved by the four networks. b, Mean Dice of the single-source samples given by the four networks.

Tables (4)

Table 1. Optical parameters of the main organs used to build the simulation data sets

Table 2. Quantitative results (mean ± S.D.) of the single-source samples in the test set with or without Gaussian noise added

Table 3. Quantitative results (mean ± S.D.) of the multi-source samples in the test set

Table 4. Quantitative results (mean ± S.D.) of the spheroid and ellipsoidal big-source samples

Equations (10)


$$\Phi = AX$$

$$F_{c2}^{in} = \delta\!\left(F_{c1}^{out} \otimes s_{\mathrm{excitation}}\right)$$

$$s_{\mathrm{excitation}} = \sigma\!\left(W_2\,\delta(W_1 F_{c1}^{out})\right)$$

$$\mathrm{Loss} = \|Y - X\|_2^2 + \lambda\|W\|_2 + \xi\,\mathrm{BCE}$$

$$\mathrm{BCE} = \|BC_{re} - BC_{r}\|_2$$

$$BC = \left(\sum_{i=1}^{m} C_i \times v_i\right) \Big/ \sum_{i=1}^{m} v_i$$

$$\Phi_{mul} = \sum_{i \in S} \Phi_i$$

$$X_{mul} = \sum_{i \in S} X_i$$

$$Dice = \frac{2\,|A \cap B|}{|A| + |B|}$$

$$\mathrm{CLE} = \sqrt{(x - x_t)^2 + (y - y_t)^2 + (z - z_t)^2}$$
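The two evaluation indices, Dice and CLE, can be sketched directly in numpy; the array conventions (binary node masks for Dice, 3D center coordinates for CLE) are illustrative assumptions:

```python
import numpy as np

def dice(a, b):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    a, b = np.asarray(a).astype(bool), np.asarray(b).astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def cle(p_rec, p_true):
    """Center localization error: Euclidean distance between the
    reconstructed and the true source centers."""
    return np.linalg.norm(np.asarray(p_rec) - np.asarray(p_true))

a = np.array([1, 1, 0, 0])
b = np.array([1, 0, 1, 0])
d = dice(a, b)                     # -> 0.5
e = cle([1.0, 2.0, 2.0], [1.0, 2.0, 0.0])  # -> 2.0
```

Dice measures morphological overlap of the recovered source region, while CLE measures how far its center drifts from the ground truth; the two are complementary, which is why both appear in the quantitative tables.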