
Semi-synthetic data generation to fine-tune a convolutional neural network for retrieving Raman signals from CARS spectra

Open Access

Abstract

Coherent anti-Stokes Raman scattering (CARS) is a well-known third-order non-linear spectroscopic technique used to analyze molecular structures. However, the measured signal contains a coherent non-resonant background (NRB) alongside the Raman resonant contribution, which makes it difficult to extract the Raman part. In this work, the Raman signal is extracted from the CARS spectrum by using a convolutional neural network. The model architecture is adapted from the original SpecNet model. The model is pre-trained with synthetic data and fine-tuned with semi-synthetic data based on two sets of semi-synthetic spectra. The experimental results show that the model achieves 86% accuracy in predicting the Raman signal of semi-synthetic data. In addition, the sensitivity of the model performance to varying levels of noise is analysed. Based on standard metrics, the model performance decreases with the increasing level of noise in a non-linear manner. Finally, the prediction capability of the fine-tuned SpecNet model was evaluated on four experimental CARS spectra, and the results were found to be better than those of the original SpecNet.

Published by Optica Publishing Group under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

1. Introduction

Coherent anti-Stokes Raman scattering (CARS) is a four-wave mixing process that is resonantly enhanced by molecular vibrations. In the past decade, it has been proven to be a unique tool for label-free microscopy to study a wide variety of materials and biological systems [1–6]. The unique feature of CARS microspectroscopy is its potential for providing quantitative, chemically specific hyperspectral imaging with resolution limited only by diffraction [7]. In a multiplex or a broadband approach, the CARS spectra are acquired concurrently, yielding high signal-to-noise ratio spectra at every pixel in the image [1,8]. Moreover, by dividing the measured CARS spectrum by a non-resonant reference spectrum, a CARS line-shape is obtained that is independent of experimental parameters such as laser power fluctuations or timing jitter [9,10]. In comparison to linear Raman scattering, the third-order nonlinear nature of CARS provides signal levels typically four orders of magnitude higher, which enables rapid image acquisition and high-speed vibrational imaging with high sensitivity. Imaging speeds as high as ∼50,000 spectra per second have been achieved [11].

CARS spectra are inherently complex since they consist of the coherent sum of all resonant and non-resonant contributions to the signal, yielding a characteristic dispersive line-shape [12]. To obtain quantitative data from a normalized CARS line-shape, a suitable numerical phase retrieval procedure can be employed [13–15]. This enables the extraction of the corresponding Raman line-shape from the CARS spectrum and provides the quantitative resonant vibrational responses of the molecules [8,16]. However, the quantitative analysis often suffers from experimental errors in the normalized CARS line-shape, which typically produce an erroneous, non-additive, low-frequency modulation error contribution to the line-shape [17]. If uncorrected, these errors can completely prevent the extraction of the Raman line-shape and the quantitative results [18]. Fortunately, these errors can be compensated by a proper procedure, but not without supervision [17,18]. Hence, a fully automatic CARS line-shape analysis is one of the main objectives in quantitative, chemically specific CARS microscopy.

One of the state-of-the-art solutions to the fully unsupervised CARS line-shape analysis is using deep learning (DL) methods. Due to their potential to solve complex computational problems in an unsupervised manner, DL algorithms have been used in many fields such as computer vision [19], text analysis [20], and social sciences [21]. Recently, there has been a growing interest in using deep neural networks (DNN) in optical spectroscopy. Most of these applications employ DNNs for vibrational spectral analysis in Raman and near-infrared (NIR) spectroscopies [22]. To the best of our knowledge, there are only a few papers dealing with CARS spectroscopy using DL methods [23–25]. Valensise et al. performed CARS line-shape analysis with the help of a convolutional neural network (CNN) that was trained with a purely synthetic data set [23]. Houhou et al. investigated phase retrieval from CARS spectra using a Long Short-Term Memory (LSTM) network [24]. Wang et al. have applied a deep neural network called very deep convolutional autoencoders (VECTOR) to retrieve the Raman signal from CARS spectra [25]. Herein, we present a way to retrieve the Raman line-shape of a CARS spectrum by fine-tuning a pre-trained CNN model. Also, we investigate the impact of different levels of noise on the robustness of the presented model.

2. Theory

The theory of multiplex CARS microscopy and spectral analysis has been described in detail elsewhere [12–18]. Here we briefly review the main concepts. CARS is a four-wave mixing process, in which three incident laser fields (denoted by pump - ${E_{pu}}({{\omega_{pu}}} )$, Stokes - ${E_S}({{\omega_S}} )$ and probe - ${E_{pr}}({{\omega_{pr}}} )$) interact coherently with the sample, generating an anti-Stokes polarization ($P_{as}^{(3 )}({{\omega_{as}}} )$; ${\omega _{as}} = {\omega _{pu}} - {\omega _S} + {\omega _{pr}}$). In multiplex CARS, the Stokes field is obtained from a broadband laser source, and the pump and probe fields from the same narrow-band laser source. Consequently, in the case where the pump, probe, and Stokes fields are parallel-polarized, the multiplex CARS signal intensity is given by [26]

$$I_{CARS}(\omega_{as}) \propto {\left| \iiint_{-\infty}^{\infty} d\omega_{pu}\,d\omega_S\,d\omega_{pr}\;\chi_{1111}^{(3)}(\omega_{as})\,E_{pu}(\omega_{pu})E_S(\omega_S)E_{pr}(\omega_{pr})\,\delta(\omega_{pu} + \omega_{pr} - \omega_S - \omega_{as}) \right|^2}.$$

Here, $\delta ({{\omega_{pu}} + {\omega_{pr}} - {\omega_S} - {\omega_{as}}} )$ is a delta function and $\chi _{1111}^{(3 )}$ denotes the appropriate component of the third-order susceptibility tensor. Away from one-photon resonances $\chi _{1111}^{(3 )}$ can be considered as the sum of a non-resonant (NR) part ($\chi _{NR}^{(3 )}$) arising from the electronic contributions and a Raman resonant part ($\chi _R^{(3 )}$) as:

$$\chi _{1111}^{(3 )} = \chi _{NR}^{(3 )} + \chi _R^{(3 )}({{\omega_{as}}} )$$
where the NR part is purely real and frequency independent. The resonant part is a complex function and can be written as:
$$\chi_R^{(3)}(\omega_{as}) = \sum_j \frac{A_j}{\Omega_j - (\omega_{pu} - \omega_S) - i\Gamma_j}$$
where ${A_j}$, ${\Omega_j}$ and ${\Gamma_j}$ are the amplitude, the frequency, and the line width of the j-th Raman mode, respectively. In the experiments, the CARS signal from the sample is divided by the CARS signal from a reference sample that has no vibrational resonances over the frequency range of the measurement. Thereby, the obtained CARS line-shape, $S({{\omega_{as}}} )$, is directly proportional to the squared modulus of $\chi _{1111}^{(3 )}$ as:
$$S({{\omega_{as}}} )= \frac{{{{|{\chi_{NR}^{(3 )} + \chi_R^{(3 )}({{\omega_{as}}} )} |}^2}}}{{{{|{\chi_{NR,ref}^{(3 )}} |}^2}}} = {|{{\chi_{nr}} + {\chi_r}({{\omega_{as}}} )} |^2} = \chi _{nr}^2 + 2{\chi _{nr}}Re[{{\chi_r}({{\omega_{as}}} )} ]+ {|{{\chi_r}({{\omega_{as}}} )} |^2}$$
where ${\chi _{nr}} = \frac{{\chi _{NR}^{(3 )}}}{{\chi _{NR,ref}^{(3 )}}}$ is the normalized background term and ${\chi _r}({{\omega_{as}}} )= \frac{{\chi _R^{(3 )}({{\omega_{as}}} )}}{{\chi _{NR,ref}^{(3 )}}}$ is the normalized Raman resonant term given by Eq. (3). Since $Re[{{\chi_r}({{\omega_{as}}} )} ]$ has a dispersive line-shape and ${|{{\chi_r}({{\omega_{as}}} )} |^2}$, in turn, a dissipative line-shape, retrieval of the spectral information contained in the CARS spectrum requires knowledge of the imaginary part, or the phase function, of ${\chi _r}({{\omega_{as}}} )$.

In comparison, the spontaneous Raman scattering line-shape is given by the imaginary part of the linear Raman susceptibility, $\chi _R^{(1 )}$, as:

$$I_{Raman}(\omega) \propto -Im[\chi_R^{(1)}(\omega)] = \sum_j \frac{A_j \Gamma_j}{(\Omega_j - \omega)^2 + \Gamma_j^2}.$$

Hence, it is obvious from comparing Eqs. (3) and (5) that both the imaginary part of ${\chi _r}({{\omega_{as}}} )$ within the complex CARS line-shape and the spontaneous Raman line-shape contain the same spectral information and are directly comparable. If the measured CARS line-shape is given by Eq. (4), the Raman line-shape can be directly computed from $S({{\omega_{as}}} )$ using an appropriate phase-retrieval procedure [13–15]. Unfortunately, the computation is often complicated by experimental artifacts that follow from the normalization of a CARS spectrum [17]. In such a case, the experimental CARS line-shape can be modeled as:

$$S_{exp}(\omega) = \varepsilon(\omega) S(\omega) + \eta(\omega) = \varepsilon(\omega) {|{\chi_{nr} + \chi_r(\omega)}|^2} + \eta(\omega)$$
where $\varepsilon (\omega )$ is a slowly varying modulation error and $\eta (\omega )$ is an additive noise term [18]. The modulation error distorts the CARS line-shape and, if not corrected, prevents the quantitative extraction of the Raman line-shape [18]. The distortions can be corrected by using either a Hilbert transform-based method [17] or a wavelet prism-based method [18] to obtain an error-free Raman line-shape. However, these error-correction methods cannot be applied without supervision, and hence, for a fully automatic CARS line-shape analysis, machine learning-based methods provide a potential option [23–25].
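To make the relation between the equations above concrete, the mapping from a set of Raman modes (Eq. (3)) to the normalized CARS line-shape of Eq. (4) and the dissipative Raman line-shape of Eq. (5) can be sketched in a few lines of numpy. The mode parameters and the value of $\chi_{nr}$ below are purely illustrative, not taken from this work:

```python
import numpy as np

def chi_r(omega, amps, centers, widths):
    """Normalized Raman resonant term, Eq. (3): a sum of complex Lorentzians."""
    omega = np.asarray(omega, dtype=float)[:, None]
    A = np.asarray(amps, dtype=float)
    W = np.asarray(centers, dtype=float)
    G = np.asarray(widths, dtype=float)
    return np.sum(A / (W - omega - 1j * G), axis=1)

def cars_lineshape(omega, amps, centers, widths, chi_nr=0.3):
    """Normalized CARS line-shape S(omega), Eq. (4)."""
    return np.abs(chi_nr + chi_r(omega, amps, centers, widths)) ** 2

omega = np.linspace(200.0, 840.0, 640)            # wavenumber axis (cm^-1)
amps, centers, widths = [5.0], [500.0], [10.0]    # one illustrative Raman mode

S = cars_lineshape(omega, amps, centers, widths)  # dispersive CARS line-shape
raman = chi_r(omega, amps, centers, widths).imag  # dissipative Raman line-shape, Eq. (5)
```

Note how the imaginary part of $\chi_r$ peaks exactly at the mode frequency with height $A_j/\Gamma_j$, while the CARS line-shape $S$ is distorted by the interference with the background term.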

3. Experimental details

3.1 Details of the experimental Raman data

Raman spectra of all the samples (training & testing) were acquired with a portable Raman spectrometer (i-Raman Plus, M/s B&W Tek). It has a spectral resolution of 4.5 cm⁻¹ at 912 nm and offers a wide spectral coverage of ∼200-3200 cm⁻¹ in a single data acquisition. An excitation wavelength of 785 nm with 50 mW laser power was used to record the spectra. Each spectrum was recorded with an integration time of 5 s and averaged over three acquisitions. In this work, two different sets of data have been used. The first data set contains four different sample sets as detailed in Table 1 in Supplement 1: HEM denotes high energy materials, ISO isomers, MIX mixtures, and PLASTIC plastics. In the set, there are 48 sample groups with different spectral regions; each group contains between 1 and 100 spectra, and there are 1192 real spectra in total. The second data set is composed of 120 pharmaceutical tablet spectra, also shown in the supplementary table [27]. These spectra have an identical spectral region from 200 to 3600 cm⁻¹, as shown in Fig. 1.


Fig. 1. Raman spectra of 120 pharmaceutical tablets forming the second set of data. No background correction has been applied.


These two data sets are highly imbalanced, with different numbers of measurements per group. To make the data usable for training and testing, undersampling was applied: from the groups with fewer than five spectra, all spectra were selected, whereas from the groups with five or more spectra, five spectra were randomly selected. The balanced data set contained 199 spectra in total.
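The undersampling step above can be sketched as follows; the cap of five spectra per group mirrors the text, but the function name and the data layout (a dict of group name to list of spectra) are illustrative choices:

```python
import random

def undersample(groups, cap=5, seed=0):
    """Balance grouped spectra: keep all spectra from groups smaller than
    `cap`, and draw `cap` random spectra from the larger groups."""
    rng = random.Random(seed)
    balanced = []
    for spectra in groups.values():
        if len(spectra) <= cap:
            balanced.extend(spectra)       # small group: keep everything
        else:
            balanced.extend(rng.sample(spectra, cap))  # large group: random subset
    return balanced
```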

3.2 Semi synthetic CARS spectra generation

The sequential steps involved in semi-synthetic CARS spectra generation are represented in the flowchart in Fig. 2. Initially, the Raman data (spectral intensities) were normalized between 0 and 1 using the Min-Max scaling approach. Then the background of the normalized Raman spectrum was removed by applying a baseline removal package in Python that implements the adaptive iteratively reweighted Penalized Least Squares (airPLS) algorithm [28]. Next, the real part of the Raman spectrum was obtained with the Kramers-Kronig (KK) relations algorithm. This method requires the frequency, the background-corrected Raman signal, and α=0 as the input. After that, an NRB was added to the real part of the Raman signal. Various NRBs were simulated using the product of two sigmoid functions $\{ \sigma_1, \sigma_2 \}$ with parameters {ci, si, i = 1, 2} set to randomly selected values, as follows

$$NRB(\upsilon )= {\sigma _1}(\upsilon )\ast \;{\sigma _2}(\upsilon ).$$
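A minimal sketch of this NRB simulation, assuming a logistic form for the two sigmoids and illustrative ranges for the randomly drawn centers $c_i$ and slopes $s_i$ (the exact ranges are not stated in the text):

```python
import numpy as np

def sigmoid(x, c, s):
    # Logistic function with center c and slope parameter s (assumed form).
    return 1.0 / (1.0 + np.exp(-(x - c) / s))

def random_nrb(nu, rng):
    """Simulate a non-resonant background as the product of two sigmoids."""
    span = nu.max() - nu.min()
    c1 = rng.uniform(nu.min(), nu.min() + 0.3 * span)   # rising edge position
    c2 = rng.uniform(nu.max() - 0.3 * span, nu.max())   # falling edge position
    s1 = rng.uniform(0.02, 0.2) * span
    s2 = -rng.uniform(0.02, 0.2) * span                 # negative slope: decreasing sigmoid
    return sigmoid(nu, c1, s1) * sigmoid(nu, c2, s2)

nu = np.linspace(200.0, 840.0, 640)
nrb = random_nrb(nu, np.random.default_rng(1))
```

The product of the two opposite-slope sigmoids yields a smooth, broad hump bounded between 0 and 1, which mimics the slowly varying electronic background.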


Fig. 2. Flow chart of CARS spectra generation.


Finally, Poisson-distributed noise, a well-grounded choice for representing natural photon-counting phenomena, was added to the spectrum. The noise can be represented by ε ∼ P(0, S), where S is chosen to properly mimic the real experimental noise. By changing S, it is possible to vary the amplitude of the noise and investigate the sensitivity of the DL model performance to noise.
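The text specifies the noise amplitude parameter S but not the exact parameterization of the Poisson draw. One plausible sketch, in which S scales the shot noise relative to the normalized signal (this scaling is an assumption, not the authors' code), is:

```python
import numpy as np

def add_poisson_noise(spectrum, scale, rng=None):
    """Apply Poisson (shot) noise to a normalized spectrum; a larger `scale`
    (the S of the text) means fewer effective counts and stronger noise."""
    rng = rng if rng is not None else np.random.default_rng()
    counts = np.clip(spectrum, 0.0, None) / scale   # expected photon counts
    return rng.poisson(counts) * scale              # back to normalized units

clean = 0.5 * np.ones(640)
noisy = add_poisson_noise(clean, scale=0.003, rng=np.random.default_rng(0))
```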

3.3 Data set division and parameter selection

The set of 199 spectra was randomly divided into two subsets: one with 155 samples for selecting the parameters for semi-synthetic training data generation, and another with 44 samples for testing. The spectral region was chosen to be (200, 200 + 640) cm⁻¹, as the deep learning model has an input dimensionality of 640 data points and the relevant spectral information lies within this range. Statistical analysis of the training data set was used to determine the maximum number of peaks and the linewidth for each spectrum. Based on the analysis presented in Figs. 3 and 4, the parameter values were selected to be 20 and 0.04, respectively. Using the parameters from the statistical analysis (the number of spectral peaks and the linewidth), the non-resonant background (NRB), and the noise function, 50000 semi-synthetic spectra were generated based on the training set to fine-tune the DL model.


Fig. 3. Frequency histogram and kernel density estimate (KDE) of the number of spectral peaks in the training set with 155 samples.


Fig. 4. Frequency histogram and kernel density estimate (KDE) of the spectral linewidths of the training set with 155 samples.


4. Deep learning model details

4.1 SpecNet model

An artificial neural network (ANN) is a non-linear mathematical model that can be trained for a specific application. It consists of computational nodes called neurons, which are stacked on top of each other in so-called hidden layers. In a fully-connected network architecture, each neuron’s output is the weighted sum of all inputs originating from the previous layer plus a bias, transformed by a non-linear function called the activation function. In the forward pass, the data flows from the input layer through the hidden layers to the output layer. A loss function and an optimizer are used in the backward pass [29], where the weights are corrected as needed, and this process continues until the loss converges or decreases below a selected threshold.

A Convolutional Neural Network (CNN) [30], is a widely used neural network architecture especially in the field of image processing and analysis. This architecture is composed of two stages: a convolutional stage and a fully connected stage. In the convolutional stage, there are learnable filters that produce feature maps. Typically, it is followed by a max pooling layer reducing the output dimensionality. The fully connected stage follows the standard architecture for multi-layer ANNs. The purpose of this part is to transform the feature representation from the convolutional stage into the desired output through a set of learnable layers.

In this work, the CNN architecture named SpecNet [23], shown in Fig. 5, is adapted. It is composed of one batch-normalization layer, five 1D convolutional layers (with 128, 64, 16, 16, 16 filters and kernel sizes of 32, 16, 8, 8, 8, respectively), two dense layers (32 and 16 units), a flatten layer, a drop-out layer (drop-out probability 0.25), and a dense layer of 640 units. The model uses mean square error (MSE) as the loss function and Adam as the optimizer. The accuracy metrics are mean square error (MSE) and mean absolute error (MAE).
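As a configuration sketch in Keras, the layer stack described above might look as follows. The layer sizes follow the text, but the activation functions, padding, and the exact ordering of the flatten and dense layers are assumptions, since the text does not fully specify them:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_specnet(n_points=640):
    """Sketch of the adapted SpecNet architecture (assumed details noted above)."""
    inputs = keras.Input(shape=(n_points, 1))
    x = layers.BatchNormalization()(inputs)
    # Five 1D convolutional layers: filters (128, 64, 16, 16, 16),
    # kernel sizes (32, 16, 8, 8, 8), per the text.
    for filters, kernel in zip((128, 64, 16, 16, 16), (32, 16, 8, 8, 8)):
        x = layers.Conv1D(filters, kernel, activation="relu", padding="same")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(32, activation="relu")(x)
    x = layers.Dense(16, activation="relu")(x)
    x = layers.Dropout(0.25)(x)
    outputs = layers.Dense(n_points)(x)   # predicted 640-point Raman line-shape
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse", metrics=["mse", "mae"])
    return model
```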


Fig. 5. Coarse scheme of the CNN architecture.


4.2 Fine-tuning SpecNet with transfer learning

Transfer learning is an approach that reuses an already trained model for solving a similar problem, as shown in the flow chart in Fig. 6. For example, by using transfer learning, a model that has been trained to make predictions in Task 1 can improve the training process for solving Task 2. This is crucial in cases where only limited data exists for the purpose. Pan et al. have categorized and reviewed the different applications of transfer learning, such as classification, regression, and clustering problems [31]. Transfer learning has gained a lot of attention in the fields of computer vision [32] and Natural Language Processing (NLP) [33].


Fig. 6. General schematic of transfer learning.


Recently, transfer learning has also been explored in the field of spectroscopy due to the limited availability of real data. Zhang et al. [34] have used a transfer learning model, pretrained with a standard Raman spectral database, to identify the Raman spectra of organic components that are not included in the database. Liu et al. [35] have demonstrated the power and effectiveness of the transfer learning method to estimate the soil clay content in a soil spectroscopy application.

The idea here is to fine-tune the SpecNet model with a semi-synthetic data set. Firstly, the pre-trained SpecNet model with its published weights, trained with 50000 purely synthetic spectra, is used as the basis. For fine-tuning, all layers of SpecNet were retrained with our semi-synthetic training data set containing 50000 spectra.

5. Results and discussion

5.1 Training the model

To find the best results of the model, different loss functions including the mean square error (MSE), root mean square error (RMSE) and Huber loss [36] were used. In all experiments, the optimizer was Adam, and the number of epochs and batch size were 50 and 256, respectively.

Also, the early stopping method was used with a patience number of 10, meaning that the training process stops after 10 epochs without improvement, with the validation loss as the monitored quantity. A k-fold cross-validation procedure with 10 folds and a fixed seed number was applied. The best result was achieved with the MSE loss function, with a loss of 0.11 and an accuracy of 86%, compared to RMSE and Huber with losses of 0.17 and 1.18 and accuracies of 81% and 71%, respectively. After finding the best loss function, the model was fine-tuned with the 50000 semi-synthetic spectra. The training stopped after 90 epochs out of 500 due to the early stopping method.
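The patience logic of early stopping can be illustrated with a few lines of pure Python; this is a simplified sketch of what the framework's callback does, not the authors' code:

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the 0-based epoch at which patience-based early stopping halts:
    training stops once `patience` epochs pass without a new best validation
    loss; otherwise all epochs run to completion."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch   # new best: reset the patience window
        elif epoch - best_epoch >= patience:
            return epoch                     # patience exhausted: stop here
    return len(val_losses) - 1
```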

5.2 Predicting Raman signals

In the first experiment, the model was tested by predicting the 44 sample spectra from the test set that were not available in the training process. Additionally, a random non-resonant background (NRB) and random noise terms with different levels were added to these spectra to assess the performance of the model. Figure 7 visualizes the model’s prediction of the NTO_pure spectrum. The top part of the figure represents the input CARS spectrum. The bottom plot contains the actual Raman spectrum as well as the Raman line-shapes predicted by SpecNet and the fine-tuned SpecNet. The fine-tuned model achieves two improvements: firstly, it identifies the peak heights better than the SpecNet model; secondly, it can also find the peaks at the ends of the spectral range, where the original SpecNet mostly fails.


Fig. 7. Predicting the imaginary part of the CARS spectrum of NTO pure from the unseen test set by SpecNet and fine-tuned SpecNet.


5.3 Comparing the model sensitivity to different levels of noise

To evaluate the sensitivity and effectiveness of the model, different levels of noise were added to the CARS spectra. The Poisson distribution was used as the noise function for generating the spectra. Three different noise levels, viz. 0.003, 0.03 and 0.3, were explored to study the effectiveness of the model, as presented in Fig. 8. The model easily predicts the peaks of the imaginary spectrum with good accuracy when the noise level is 0.003. When the noise level is increased to 0.03, the model can still find the peak regions with average accuracy, but some incorrect peaks appear as well. When the noise level increases to 0.3, the model clearly has difficulties finding the exact peaks and even the areas near them. In such cases, filtering the noise in the input spectrum should be considered.


Fig. 8. Predicting the imaginary part of the CARS spectrum of Benzene 60% from the test set with different levels of noise: a) 0.003, b) 0.03, c) 0.3.


In addition, a signal-to-noise ratio (SNR) analysis was performed to examine the linearity between the increasing level of noise in the spectrum and the corresponding prediction accuracy of the model. In Fig. 9, the errors in the form of mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE) between the predicted and true imaginary spectrum are presented for the 44 test spectra with 15 noise levels in the range of [0.001, 0.9]. A nonlinear correspondence was observed between the errors and the increasing level of noise.
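The three error metrics used in this analysis are standard; for reference, a minimal numpy implementation over a pair of true/predicted line-shapes:

```python
import numpy as np

def prediction_errors(y_true, y_pred):
    """MAE, MSE and RMSE between a true and a predicted Raman line-shape."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mae = float(np.mean(np.abs(err)))
    mse = float(np.mean(err ** 2))
    return mae, mse, mse ** 0.5
```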


Fig. 9. Signal-to-noise ratio (SNR) analysis of the model using mean absolute error (MAE), mean square error (MSE) and root mean square error (RMSE) with 15 different levels of noise in the range of [0.001, 0.9] (denoted as red circles).


5.4 Comparing the performance of the original and fine-tuned SpecNet

The original and the fine-tuned SpecNet models were evaluated and compared using the test set. The evaluation and comparison were based on predicting the Raman signal of each of the 44 samples when the models’ input was the corresponding semi-synthetic CARS spectrum. Based on the results shown in Fig. 10, the accuracy of the fine-tuned model is better than that of the SpecNet model: the average MSE is 0.0007 for the fine-tuned model, whereas for the original model it is 0.0236. For some cases the difference is minimal or non-existent, but for most spectra and on average, the predictions from the fine-tuned model surpass the original one by a large margin.


Fig. 10. Comparison of the original and fine-tuned SpecNet models based on the MSE between the actual Raman signals in the test set and the models’ predictions when the inputs have been semi-synthetic CARS spectra.


5.5 Prediction on experimental CARS spectra

To demonstrate how the prediction of Raman line-shapes from experimental CARS spectra works in practice, an example is shown in Fig. 11. A protein droplet sample (FUS LC condensate) was measured by Y. Kan et al. with a home-built broadband CARS microscope at the Max Planck Institute for Polymer Research in Mainz. The details of the sample preparation and the CARS measurements are explained elsewhere [37]. The CARS line-shape (the uppermost line-shape) was denoised by the wavelet prism procedure [18], and the corresponding Raman line-shape was computed by the MEM procedure [14]. The latter is used here as the reference Raman line-shape (black line) and compared to both the fine-tuned SpecNet (red line) and SpecNet (blue line) predictions (the midmost line-shapes). The bottom plot shows the comparison between the squared error of the fine-tuned SpecNet (red line) and the SpecNet (yellow line) predictions. The analyses of the other three experimental CARS spectra, of samples including DMPC lipid [14], an AMP/ADP/ATP mixture [14] and budding yeast [1], are presented in Fig. S1 in Supplement 1.


Fig. 11. Predicting the Raman line-shape from an experimental CARS spectrum (green line) of a protein droplet sample by SpecNet (blue line) and fine-tuned SpecNet (red line), and the squared errors of both models (bottom plot).


The predicted line-shapes are in better agreement with the true one for the fine-tuned SpecNet, as shown in Fig. 11, whereas SpecNet was not able to extract some of the characteristic spectral lines; for example, the vibrational mode at 1200 cm⁻¹ led to a high squared error of ∼0.45. A similar performance was obtained for the other three experimental data sets (see Fig. S1 in Supplement 1), where the Raman line-shape extracted by the fine-tuned SpecNet was found to be better than that of the original SpecNet.

6. Conclusions

In this work, a convolutional neural network (CNN), SpecNet, is further studied for recovering the Raman signal from a CARS spectrum. The original model, trained with purely synthetic spectra, is used as the baseline. Transfer learning is applied to fine-tune the model by using semi-synthetic spectra based on two semi-synthetic spectral data sets. Based on the experimental results, the accuracy of predicting the imaginary part of the Raman signal ${\chi ^{(3 )}}(\omega )$ was 86% for the fine-tuned model. The model also performed well on the experimental CARS data, where the extracted Raman spectrum resembles the true one. Spectra with higher levels of noise reduce the quality of the predictions, but the prediction error increases in a consistent manner with the increasing level of noise. The comparison between the original SpecNet and the fine-tuned model shows clear advantages of the fine-tuned one.

Funding

Academy of Finland (FIRI/327734).

Acknowledgments

This work is a part of “Quantitative Chemically-Specific Imaging Infrastructure for Material and Life Sciences (qCSI)” project funded by the Academy of Finland (Grant No. FIRI/327734). We thank Michiel Müller and Hilde Rinia for providing the measurements of the DMPC lipid sample and the AMP/ADP/ATP mixture, Masanari Okuno and Hideaki Kano for providing the broadband CARS measurement of the yeast sample, and Yelena Kan and Sapun Parekh for providing the protein droplet measurements.

Disclosures

The authors declare no conflicts of interest.

Data availability

The data supporting this study’s findings are available from the corresponding author on request.

Supplemental document

See Supplement 1 for supporting content.

References

1. M. Okuno, H. Kano, P. Leproux, V. Couderc, J. P. R. Day, M. Bonn, and H. Hamaguchi, “Quantitative CARS molecular fingerprinting of single living cells with the use of the maximum entropy method,” Angew. Chem. Int. Ed. Engl. 122(38), 6925–6929 (2010). [CrossRef]  

2. P. D. Chowdary, Z. Jiang, E. J. Chaney, W. A. Benalcazar, D. L. Marks, M. Gruebele, and S. A. Boppart, “Molecular Histopathology by Spectrally Reconstructed Nonlinear Interferometric Vibrational Imaging,” Cancer Res. 70(23), 9562–9569 (2010). [CrossRef]

3. K. F. Domke, J. P. R. Day, G. Rago, T. A. Riemer, M. H. F. Kox, B. M. Weckhuysen, and M. Bonn, “Host–guest geometry in pores of zeolite ZSM-5 spatially resolved with multiplex CARS spectromicroscopy,” Angew. Chemie Int. Ed. 51(6), 1343–1347 (2012). [CrossRef]  

4. S. H. Parekh and K. F. Domke, “Watching Orientational Ordering at the Nanoscale with Coherent Anti-Stokes Raman Microscopy,” Chem. Eur. J. 19(36), 11822–11830 (2013). [CrossRef]  

5. C. H. Camp Jr, Y. J. Lee, J. M. Heddleston, C. M. Hartshorn, A. R. H. Walker, J. N. Rich, J. D. Lathia, and M. T. Cicerone, “High-speed coherent Raman fingerprint imaging of biological tissues,” Nat. Photonics 8(8), 627–634 (2014). [CrossRef]  

6. C. Di Napoli, I. Pope, F. Masia, P. Watson, W. Langbein, and P. Borri, “Hyperspectral and differential CARS microscopy for quantitative chemical imaging in human adipocytes,” Biomed. Opt. Express 5(5), 1378–1390 (2014). [CrossRef]  

7. H. A. Rinia, K. N. J. Burger, M. Bonn, and M. Müller, “Quantitative label-free imaging of lipid composition and packing of individual cellular lipid droplets using multiplex CARS microscopy,” Biophys. J. 95(10), 4908–4914 (2008). [CrossRef]  

8. J. P. R. Day, K. F. Domke, G. Rago, H. Kano, H. Hamaguchi, E. M. Vartiainen, and M. Bonn, “Quantitative coherent anti-Stokes Raman scattering (CARS) microscopy,” J. Phys. Chem. B 115(24), 7713–7725 (2011). [CrossRef]  

9. M. Müller and J. M. Schins, “Imaging the thermodynamic state of lipid membranes with multiplex CARS microscopy,” J. Phys. Chem. B 106(14), 3715–3723 (2002). [CrossRef]  

10. J. Cheng, A. Volkmer, L. D. Book, and X. S. Xie, “Multiplex coherent anti-Stokes Raman scattering microspectroscopy and study of lipid vesicles,” J. Phys. Chem. B 106(34), 8493–8498 (2002). [CrossRef]  

11. M. Tamamitsu, Y. Sakaki, T. Nakamura, G. K. Podagatlapalli, T. Ideguchi, and K. Goda, “Ultrafast broadband Fourier-transform CARS spectroscopy at 50,000 spectra/s enabled by a scanning Fourier-domain delay line,” Vib. Spectrosc. 91, 163–169 (2017). [CrossRef]  

12. H. A. Rinia, M. Bonn, and M. Müller, “Quantitative multiplex CARS spectroscopy in congested spectral regions,” J. Phys. Chem. B 110(9), 4472–4479 (2006). [CrossRef]  

13. E. M. Vartiainen, “Phase retrieval approach for coherent anti-Stokes Raman scattering spectrum analysis,” J. Opt. Soc. Am. B 9(8), 1209–1214 (1992). [CrossRef]  

14. E. M. Vartiainen, H. A. Rinia, M. Müller, and M. Bonn, “Direct extraction of Raman line-shapes from congested CARS spectra,” Opt. Express 14(8), 3622–3630 (2006). [CrossRef]  

15. Y. Liu, Y. J. Lee, and M. T. Cicerone, “Broadband CARS spectral phase retrieval using a time-domain Kramers–Kronig transform,” Opt. Lett. 34(9), 1363–1365 (2009). [CrossRef]  

16. A. Volkmer, “Vibrational imaging and microspectroscopies based on coherent anti-Stokes Raman scattering microscopy,” J. Phys. D: Appl. Phys. 38(5), R59–R81 (2005). [CrossRef]  

17. C. H. Camp Jr, Y. J. Lee, and M. T. Cicerone, “Quantitative, comparable coherent anti-Stokes Raman scattering (CARS) spectroscopy: correcting errors in phase retrieval,” J. Raman Spectrosc. 47(4), 408–415 (2016). [CrossRef]  

18. Y. Kan, L. Lensu, G. Hehl, A. Volkmer, and E. M. Vartiainen, “Wavelet prism decomposition analysis applied to CARS spectroscopy: a tool for accurate and quantitative extraction of resonant vibrational responses,” Opt. Express 24(11), 11905–11916 (2016). [CrossRef]  

19. A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, “Deep learning for computer vision: A brief review,” Comput. Intell. Neurosci. 2018, 1–13 (2018). [CrossRef]  

20. H. Liang, X. Sun, Y. Sun, and Y. Gao, “Text feature extraction based on deep learning: a review,” EURASIP J. Wirel. Commun. Netw. 2017(1), 1–14 (2017). [CrossRef]  

21. S. Peng, L. Cao, Y. Zhou, Z. Ouyang, A. Yang, X. Li, W. Jia, and S. Yu, “A survey on deep learning for textual emotion analysis in social networks,” Digit. Commun. Networks (2021). [CrossRef]  

22. J. Yang, J. Xu, X. Zhang, C. Wu, T. Lin, and Y. Ying, “Deep learning for vibrational spectral analysis: Recent progress and a practical guide,” Anal. Chim. Acta 1081, 6–17 (2019). [CrossRef]  

23. C. M. Valensise, A. Giuseppi, F. Vernuccio, A. De la Cadena, G. Cerullo, and D. Polli, “Removing non-resonant background from CARS spectra via deep learning,” APL Photonics 5(6), 061305 (2020). [CrossRef]  

24. R. Houhou, P. Barman, M. Schmitt, T. Meyer, J. Popp, and T. Bocklitz, “Deep learning as phase retrieval tool for CARS spectra,” Opt. Express 28(14), 21002–21024 (2020). [CrossRef]  

25. Z. Wang, K. O’ Dwyer, R. Muddiman, T. Ward, C. H. Camp, and B. M. Hennelly, “VECTOR: Very deep convolutional autoencoders for non-resonant background removal in broadband coherent anti-Stokes Raman scattering,” J. Raman Spectrosc. 53(6), 1081–1093 (2022). [CrossRef]  

26. S. Gomez, “Coherent Raman Spectroscopy,” in Modern Techniques in Raman Spectroscopy (Wiley, 1996).

27. M. Dyrby, S. B. Engelsen, L. Nørgaard, M. Bruhn, and L. Lundsberg-Nielsen, “Chemometric quantitation of the active substance (containing C≡N) in a pharmaceutical tablet using near-infrared (NIR) transmittance and NIR FT-Raman spectra,” Appl. Spectrosc. 56(5), 579–585 (2002). [CrossRef]

28. Z.-M. Zhang, S. Chen, and Y.-Z. Liang, “Baseline correction using adaptive iteratively reweighted penalized least squares,” Analyst 135(5), 1138–1146 (2010). [CrossRef]  

29. R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in Neural Networks for Perception (Elsevier, 1992), pp. 65–93.

30. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Commun. ACM 60(6), 84–90 (2017). [CrossRef]  

31. S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010). [CrossRef]  

32. H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning,” IEEE Trans. Med. Imaging 35(5), 1285–1298 (2016). [CrossRef]  

33. S. Ruder, M. E. Peters, S. Swayamdipta, and T. Wolf, “Transfer learning in natural language processing,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials (2019), pp. 15–18.

34. R. Zhang, H. Xie, S. Cai, Y. Hu, G. Liu, W. Hong, and Z. Tian, “Transfer-learning-based Raman spectra identification,” J. Raman Spectrosc. 51(1), 176–186 (2020). [CrossRef]  

35. L. Liu, M. Ji, and M. Buchroithner, “Transfer learning for soil spectroscopy based on convolutional neural networks and its application in soil clay content mapping using hyperspectral imagery,” Sensors 18(9), 3169 (2018). [CrossRef]  

36. K. Gokcesu and H. Gokcesu, “Generalized Huber loss for robust learning and its efficient minimization for a robust statistics,” arXiv preprint arXiv:2108.12627 (2021).

37. S. Chatterjee, Y. Kan, M. Brzezinski, K. Koynov, R. M. Regy, A. C. Murthy, K. A. Burke, J. J. Michels, J. Mittal, and N. L. Fawzi, “Reversible kinetic trapping of FUS biomolecular condensates,” Adv. Sci. 9(4), 2104247 (2022). [CrossRef]

Supplementary Material (1)

Supplement 1: Table 1, describing the samples used in the manuscript, and a figure of three of the experimental samples investigated in the paper.

Data availability

The data supporting this study’s findings are available from the corresponding author on request.



Figures (11)

Fig. 1. Raman spectra of 120 pharmaceutical tablets forming the second set of data. No background correction has been applied.
Fig. 2. Flow chart of CARS spectra generation.
Fig. 3. Frequency histogram and kernel density estimate (KDE) of the number of spectral peaks in the training set with 155 samples.
Fig. 4. Frequency histogram and kernel density estimate (KDE) of the spectral linewidths of the training set with 155 samples.
Fig. 5. Coarse scheme of the CNN architecture.
Fig. 6. General schematic of transfer learning.
Fig. 7. Predicting the imaginary part of the CARS spectrum of NTO pure from the unseen test set by SpecNet and fine-tuned SpecNet.
Fig. 8. Predicting the imaginary part of the CARS spectrum of Benzene 60% from the test set with different levels of noise: a) 0.003, b) 0.03, c) 0.3.
Fig. 9. Signal-to-noise ratio (SNR) analysis of the model using mean absolute error (MAE), mean square error (MSE) and root mean square error (RMSE) with 15 different levels of noise in the range of [0.001–0.9], denoted as red circles.
Fig. 10. Comparison of the original and fine-tuned SpecNet models based on the MSE between the actual Raman signals in the test set and the models’ predictions when the inputs have been semi-synthetic CARS spectra.
Fig. 11. Predicting the Raman line-shape from an experimental CARS spectrum (green line) of a protein droplet sample by SpecNet (blue line) and fine-tuned SpecNet (red line), and the squared errors of both models (bottom plot).

Equations (7)

$$I_{CARS}(\omega_{as}) \propto \left| \int d\omega_{pu}\, d\omega_{S}\, d\omega_{pr}\; \chi^{(3)}_{1111}(\omega_{as})\, E_{pu}(\omega_{pu})\, E_{S}(\omega_{S})\, E_{pr}(\omega_{pr})\, \delta(\omega_{pu} + \omega_{pr} - \omega_{S} - \omega_{as}) \right|^{2}$$

$$\chi^{(3)}_{1111} = \chi^{(3)}_{NR} + \chi^{(3)}_{R}(\omega_{as})$$

$$\chi^{(3)}_{R}(\omega_{as}) = \sum_{j} \frac{A_{j}}{\Omega_{j} - (\omega_{pu} - \omega_{S}) - i\Gamma_{j}}$$

$$S(\omega_{as}) = \frac{\left| \chi^{(3)}_{NR} + \chi^{(3)}_{R}(\omega_{as}) \right|^{2}}{\left| \chi^{(3)}_{NR,ref} \right|^{2}} = \left| \chi_{nr} + \chi_{r}(\omega_{as}) \right|^{2} = \chi_{nr}^{2} + 2\chi_{nr}\,\mathrm{Re}\!\left[ \chi_{r}(\omega_{as}) \right] + \left| \chi_{r}(\omega_{as}) \right|^{2}$$

$$I_{Raman}(\omega) \propto \mathrm{Im}\!\left[ \chi^{(3)}_{R}(\omega) \right] = \sum_{j} \frac{A_{j}\Gamma_{j}}{(\Omega_{j} - \omega)^{2} + \Gamma_{j}^{2}}$$

$$S_{exp}(\omega) = \varepsilon(\omega)\, S(\omega) + \eta(\omega) = \varepsilon(\omega) \left| \chi_{nr} + \chi_{r}(\omega) \right|^{2} + \eta(\omega)$$

$$NRB(\upsilon) = \sigma_{1}(\upsilon)\, \sigma_{2}(\upsilon)$$
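To make the spectral model in the equations above concrete, here is a minimal NumPy sketch: the resonant susceptibility is a sum of complex Lorentzians, the non-resonant background is a product of two sigmoids, and the CARS line shape is the NRB-modulated squared modulus of the total susceptibility. All peak positions, amplitudes, and widths below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def chi_r(omega, amps, centers, gammas):
    """Resonant susceptibility: sum_j A_j / (Omega_j - omega - i*Gamma_j)."""
    chi = np.zeros_like(omega, dtype=complex)
    for a, om0, g in zip(amps, centers, gammas):
        chi += a / (om0 - omega - 1j * g)
    return chi

def sigmoid(x, center, width):
    return 1.0 / (1.0 + np.exp(-(x - center) / width))

# Illustrative wavenumber axis (cm^-1) and made-up peak parameters
omega = np.linspace(500.0, 3500.0, 1000)
amps, centers, gammas = [1.0, 0.6], [1200.0, 2900.0], [10.0, 15.0]

chi = chi_r(omega, amps, centers, gammas)
chi_nr = 0.5                                  # constant non-resonant term
# NRB modeled as a product of two sigmoids (rising and falling edges)
nrb = sigmoid(omega, 800.0, 400.0) * sigmoid(-omega, -3200.0, 400.0)
cars = nrb * np.abs(chi_nr + chi) ** 2        # measured CARS line shape
raman = chi.imag                              # Raman line shape: Im[chi_r]
```

The imaginary part of each Lorentzian term reduces analytically to A_j·Γ_j / ((Ω_j − ω)² + Γ_j²), i.e. the Raman line-shape equation above, which makes `raman` the target signal a network such as SpecNet is trained to recover from `cars`.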
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.