Deep neural network for plasmonic sensor modeling

Xiaozhong Li; Jing Shu; Wenhua Gu; Li Gao

doi:10.1364/OME.9.003857

1. Introduction

Subwavelength metallic nanostructures interact strongly with incident light, producing coherent electron oscillations in the nanostructures and strong field enhancement at the interface, a phenomenon known as localized surface plasmon resonance (LSPR). The resonant frequency is sensitive to the geometric parameters of the nanostructure, the material properties and changes in the surrounding medium. LSPRs therefore show great potential for molecular sensing and have driven developments in areas including medical diagnosis, environmental monitoring and the detection of biological agents [1–3]. LSPRs have been reported for various biosensing applications [4–6] because they offer ultra-sensitive, label-free and real-time detection of biomolecular interactions. The sensing process usually involves attaching analyte molecules to the surface of the plasmonic nanostructures, with or without anchoring molecules. The attachment causes a slight change in the refractive index (RI) of the surrounding dielectric and leads to a detectable shift in the resonant wavelength or intensity, which can be read out immediately from the extinction, transmission or reflectance spectra [7].

Because the spectral position and shape of the resonances depend on the geometric parameters of the nanostructure, the parameter combination must be optimized for the target detection spectrum. The conventional design strategy starts from prior experience and intuition. Time-consuming electromagnetic (EM) simulations using the finite-difference time-domain (FDTD) method or the finite element method (FEM) are then run iteratively until a desired spectrum is obtained, and the experimental spectrum is compared with the simulated one before actual sensing applications. This process usually takes hours or days, and once a desired spectrum is obtained, all other simulation results are discarded. Ideally, a common database would exist for every type of plasmonic sensor system, each containing all possible geometric parameter combinations and their corresponding resonance spectra, so that the costly simulation step could be bypassed and sensor properties predicted directly. Building such databases with conventional simulation methods is impractical, however. Even if we consider only three simple geometric parameters of a plasmonic sensor device, such as size, thickness and period, and suppose that each parameter takes one hundred values that affect the spectrum, the number of parameter combinations and corresponding spectra reaches one million (100 × 100 × 100). For complex structures with mixed shapes, sizes or interparticle gaps, constructing such databases by numerical or analytical methods is almost impossible.

Fortunately, deep learning has emerged as a powerful tool for solving complex computational problems. Neural networks (NNs) have been used successfully to approximate intricate optical interactions, and precise results have been reported in various areas of nanophotonics [8–11]. These works use NNs to predict and design different physical structures, including metamaterials, integrated silicon photonic devices and plasmonic colors. An NN turns a complicated physical problem into a mathematical mapping learned by the network and approximates the relationship between the data pairs, effectively acting as a surrogate for simulating the physical interaction. Previous works have demonstrated the efficiency and accuracy of this approach. Here, we propose applying the NN method to predict the optical properties of plasmonic nanostructures precisely, so that the spectrum can be obtained easily for any combination of geometric parameters. This tool can replace costly simulations and serve as a quick reference for plasmonic sensor design. An NN requires only a one-time investment in sufficient EM simulation data for training. Once the network is trained, the spectra of innumerable plasmonic nanostructures can be predicted accurately in milliseconds, and the NN can be reused repeatedly without further simulation cost.

2. Method

To illustrate this NN application for plasmonic sensors, we use a simple periodic gold nanostructure that is common in lithographically fabricated devices, shown in Fig. 1(a). Gold nanodisks arranged in a periodic array can support surface lattice resonances (SLRs), which originate from coupling between the localized surface plasmons (LSPs) of the individual nanodisks and have considerably narrower spectral features [12]. Each unit of the structure consists of a gold nanodisk on a Si substrate. The wavelengths at which SLRs are excited are influenced by the geometric parameters of the nanoparticles [13], here the diameter (D) and height (H) of the gold nanodisks and the period (P) of the repeating unit; varying these parameters changes the optical response. A simplified NN architecture is sketched in Fig. 1(b). It is composed of an input layer, an output layer and several hidden layers, where each layer consists of many nodes that are connected to the adjacent layers by weights. Because we aim to learn the relationship between the geometric parameters of the nanostructure and its optical response, the inputs are the period (P), diameter (D) and height (H), and the corresponding outputs are discrete spectral data, as shown in Fig. 1(c).

Fig. 1. Schematic illustration of the plasmonic nanostructure, NN and spectral data. (a) Structure of the periodic gold nanodisks. The geometric parameters studied are the diameter (D) and height (H) of the nanodisks and the period (P) of the repeating units. (b) Architecture of a basic NN. (c) A sample spectrum consisting of discrete data points.

The training datasets are generated in batches by EM simulations (Lumerical FDTD Solutions). P, D and H range from 200 nm to 600 nm, 60 nm to 400 nm and 20 nm to 200 nm, respectively. The sampling density of the training data is chosen to minimize the computational cost while remaining sufficient to train the NN accurately. In this work, the full dataset comprises 2254 parameter combinations and their corresponding spectra. This number is very small compared with previous nanophotonic NN studies, which used at least ten thousand training datasets [14–16]. We choose only structural parameters that affect the spectrum significantly and sample them so as to cover the possible variations in the spectrum. As the results in Section 3 show, this small amount of training data is enough to train an NN that accurately models and predicts the spectra of millions of plasmonic structures within the parameter range.
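As a rough check on the "millions of structures" figure, the following sketch counts the parameter combinations inside the stated ranges. The 1 nm step is an assumption made here for illustration only (the text does not state a sampling step), and the count ignores any physical constraint such as requiring D to be smaller than P.

```python
# Parameter ranges from the text, in nm (inclusive).
RANGES_NM = {"P": (200, 600), "D": (60, 400), "H": (20, 200)}
STEP_NM = 1          # assumed resolution, not taken from the paper
N_TRAINING = 2254    # simulated datasets reported in the text


def count_values(lo, hi, step):
    """Number of grid values from lo to hi inclusive at the given step."""
    return (hi - lo) // step + 1


def count_combinations(ranges, step):
    """Total number of (P, D, H) combinations on a regular grid."""
    total = 1
    for lo, hi in ranges.values():
        total *= count_values(lo, hi, step)
    return total


n_total = count_combinations(RANGES_NM, STEP_NM)
coverage = N_TRAINING / n_total
print(f"{n_total:,} combinations at {STEP_NM} nm resolution")
print(f"training set covers {coverage:.4%} of that grid")
```

Under this assumed resolution the grid holds tens of millions of combinations, so the 2254 simulated samples cover only a very small fraction of it.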

In the training process, the datasets are split into three parts: training data, validation data and test data. Training data are fed to the NN to optimize the network by updating its weights. Validation data are used to monitor the network during training and to check whether it is overfitting. Test data are completely unseen by the trained NN and are used to test the accuracy of its predictions. The optimal NN is determined by adjusting the hyperparameters according to the results of each training run. As shown in Fig. 2(a), the best trained NN in this work has four hidden layers with 330 nodes per layer. The input layer has three nodes, representing P, D and H. The output layer has 81 nodes, corresponding to discrete reflectance values at wavelengths between 380 nm and 780 nm. Training results are evaluated by the training loss and validation loss, which are the mean squared errors (MSE) on the training data and validation data, respectively. Other hyperparameters of the NNs are discussed in the supplementary information. The loss curve of the NN is shown in Fig. 2(b). Training takes only 800 epochs to converge, and the final training loss and validation loss are 7.38×10⁻⁶ and 3.86×10⁻⁵, respectively, which shows that the NN performs well in the training process.
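The layer sizes described above can be sketched as a plain fully connected network in NumPy. This is a minimal illustration, not the authors' implementation: the ReLU activation, the random initialization, the input scaling and the uniform 5 nm wavelength grid are all assumptions, since those details are deferred to the supplementary information.

```python
import numpy as np

# 3 inputs (P, D, H) -> 4 hidden layers of 330 nodes -> 81 reflectance points.
LAYER_SIZES = [3, 330, 330, 330, 330, 81]
# Assumes the 81 output points are uniformly spaced (5 nm step).
WAVELENGTHS_NM = np.linspace(380.0, 780.0, 81)


def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for each pair of adjacent layers."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        params.append((w, np.zeros(n_out)))
    return params


def forward(params, x):
    """Map inputs of shape (batch, 3) to spectra of shape (batch, 81)."""
    h = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU is an assumed activation
    w, b = params[-1]
    return h @ w + b  # linear output layer


def mse(pred, target):
    """Mean squared error, the loss named in the text."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))


def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(w.size + b.size for w, b in params)


params = init_params(LAYER_SIZES)
x = np.array([[400.0, 200.0, 100.0]]) / 600.0  # hypothetical (P, D, H) in nm, crudely scaled
spectrum = forward(params, x)
print(spectrum.shape, count_params(params))
```

With these layer sizes the network has a few hundred thousand trainable parameters. The untrained output is meaningless as a spectrum; the sketch only shows the shape of the mapping from (P, D, H) to 81 spectral points.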

Fig. 2. Results of the applied NN. (a) Architecture of the best trained NN, with four hidden layers and 330 nodes per layer. The input layer has three nodes, which represent P, D and H. The output layer has 81 nodes, representing the discrete spectral points. (b) Loss curves for training and validation. The final training loss and validation loss are 7.38×10⁻⁶ and 3.86×10⁻⁵, respectively.

3. Results

To investigate the usefulness of the NN in predicting the spectra of random plasmonic nanostructures, 200 groups of test data that the NN has never been trained on are used to test the network accuracy. Each group pairs known geometric parameters P, D and H with the corresponding EM-simulated spectrum. We input the known P, D and H into the network and obtain the predicted spectrum at the output layer. The results are compared with the EM-simulated spectra in the test group. The relative error is used to evaluate the similarity of the two spectra and is given by:

$$\sigma = \frac{\int_{\lambda_1}^{\lambda_2} \left| X(\lambda) - Y(\lambda) \right| \, d\lambda}{\int_{\lambda_1}^{\lambda_2} Y(\lambda) \, d\lambda} \tag{1}$$

where X(λ) and Y(λ) refer to the predicted and EM-simulated spectra, respectively, and λ₁ and λ₂ are the limits of the wavelength range considered (380 nm and 780 nm here).
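For discretely sampled spectra, Eq. (1) can be evaluated numerically. The sketch below uses the trapezoidal rule, which is an assumption (the text does not say how the integrals are computed), and a made-up Gaussian-shaped spectrum purely to exercise the function.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def relative_error(predicted, simulated, wavelengths):
    """Eq. (1): integrated |X - Y| divided by integrated Y."""
    x_pred = np.asarray(predicted, dtype=float)
    y_sim = np.asarray(simulated, dtype=float)
    lam = np.asarray(wavelengths, dtype=float)
    return trapezoid(np.abs(x_pred - y_sim), lam) / trapezoid(y_sim, lam)


# Hypothetical spectra on an assumed uniform 81-point grid from 380 nm to 780 nm.
wavelengths = np.linspace(380.0, 780.0, 81)
simulated = 0.3 + 0.2 * np.exp(-(((wavelengths - 600.0) / 40.0) ** 2))
predicted = 1.02 * simulated  # prediction that is uniformly 2% too high

sigma = relative_error(predicted, simulated, wavelengths)
print(f"relative error: {sigma:.2%}")
```

Because the hypothetical prediction is a uniform 2% overestimate, the numerator is 0.02 times the denominator and the relative error is 2% regardless of the spectral shape.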

The error distribution of the test samples is shown in Fig. 3(a). The average error over the 200 groups of test data is only 1.56%. For more detailed statistics, the samples are grouped by error range in Fig. 3(b). 148 of the 200 samples (74%) have errors below 2%, and only 5 samples have errors above 5%. Some representative NN-predicted spectra are compared with the EM-simulated spectra in Fig. 4. Figure 4(a) shows a case with a relative error of 2.01%; the NN-predicted spectrum almost overlaps the EM-simulated spectrum. This result indicates that the NN is highly accurate in predicting plasmonic resonances. Figure 4(b) shows a case with a slightly higher error of 4.25% over the whole wavelength range; the spectral features are still well reproduced. If we regard predictions with errors below 5% as accurate, 195 of the 200 groups (97.5%) qualify. For the 5 groups with errors above 5%, we plot one spectrum with an error of 6.18% in Fig. 4(c) and the worst case, with an error of 16.89%, in Fig. 4(d). Although the predicted spectra in these two cases differ somewhat from the simulated spectra, the main resonant features still match, so the predictions may still be useful for plasmonic sensor applications.
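Statistics of this kind can be computed from an array of per-sample relative errors with a few lines of NumPy. The helper name `summarize_errors` is hypothetical, and the input below is only the four representative errors quoted for Fig. 4, not the full 200-sample test set, so its output does not reproduce the reported figures.

```python
import numpy as np


def summarize_errors(errors, thresholds=(0.02, 0.05)):
    """Mean, maximum and the fraction of samples below each threshold."""
    errors = np.asarray(errors, dtype=float)
    summary = {"mean": float(errors.mean()), "max": float(errors.max())}
    for t in thresholds:
        summary[f"below_{t:.0%}"] = float(np.mean(errors < t))
    return summary


# The four representative relative errors quoted for Fig. 4(a)-(d).
fig4_errors = [0.0201, 0.0425, 0.0618, 0.1689]
summary = summarize_errors(fig4_errors)
print(summary)

# The percentages reported in the text follow directly from the counts.
print(f"{148 / 200:.1%} below 2%, {195 / 200:.1%} below 5%")
```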

Fig. 3. Results of the test samples. (a) Error distribution of the test samples. (b) Statistics of the samples divided by different ranges of errors.

Fig. 4. Representative results of NN predicted spectra compared with EM simulated spectra. (a) Spectrum with relative error at 2.01%. (b) Spectrum with relative error at 4.25%. (c) Spectrum with relative error at 6.18%. (d) Spectrum with relative error at 16.89%.

Overall, the results show that, with an investment of only about two thousand simulated datasets, the NN can predict accurate spectra for millions of different nanostructures within the studied range of P, D and H. Of the NN-predicted plasmonic spectra, 97.5% have errors below 5% over wavelengths from 380 nm to 780 nm. They reproduce the resonance features obtained from EM simulations, which shows that the NN can be trained well for electromagnetic modeling purposes. The accuracy can be improved further by increasing the training data size and refining the training. The NN method also shows advantages over conventional EM simulation. The first step takes about two weeks to generate the 2254 groups of training data and about two days to complete the NN training on a regular laptop (1.8 GHz Intel Core i7 CPU and an Nvidia GeForce MX150 GPU). After this one-time computational investment, however, the NN can predict an accurate spectrum for any of the millions of structures in the studied range in a few milliseconds, whereas EM simulation takes hours or days to produce a single resonance spectrum. The efficiency can thus be increased by a factor of millions. This NN tool can be extended to other bottom-up synthesized plasmonic nanoparticles and top-down fabricated nanostructures with multiple parameters. By establishing a general NN platform for different plasmonic device systems, this approach can guide plasmonic sensor fabrication and application with a much shorter design period. The inverse design process, which generates plasmonic geometric parameters for a specific desired spectrum, will be studied further. This work may be of great significance for plasmonic sensor device practice.
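The cost figures quoted above can be turned into a back-of-envelope comparison. In the sketch below, the 3 ms prediction time and the 1 h single-simulation time are assumed stand-ins for "a few milliseconds" and "hours or days". The average time per dataset is a throughput figure that assumes continuous, sequential generation over the two weeks; if simulations ran in parallel, each individual run would have taken longer.

```python
N_SIMULATIONS = 2254      # training datasets reported in the text
GENERATION_DAYS = 14.0    # "about two weeks"
TRAINING_DAYS = 2.0       # "about two days"
PREDICTION_MS = 3.0       # assumed value for "a few milliseconds"
SINGLE_SIM_HOURS = 1.0    # assumed lower end of "hours or days"

SECONDS_PER_DAY = 24 * 60 * 60


def average_seconds_per_simulation(days, n_simulations):
    """Average wall-clock time per dataset over the generation period."""
    return days * SECONDS_PER_DAY / n_simulations


def speedup(simulation_seconds, prediction_ms):
    """Ratio of one simulation's duration to one NN prediction's duration."""
    return simulation_seconds / (prediction_ms / 1000.0)


avg_s = average_seconds_per_simulation(GENERATION_DAYS, N_SIMULATIONS)
upfront_days = GENERATION_DAYS + TRAINING_DAYS

print(f"average throughput: {avg_s / 60:.1f} min per simulated spectrum")
print(f"one-time cost: {upfront_days:.0f} days")
print(f"speedup vs average throughput: {speedup(avg_s, PREDICTION_MS):.1e}")
print(f"speedup vs a {SINGLE_SIM_HOURS:.0f} h simulation: "
      f"{speedup(SINGLE_SIM_HOURS * 3600, PREDICTION_MS):.1e}")
```

The resulting speedup depends strongly on which per-simulation time is assumed, so these numbers are order-of-magnitude estimates only.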

4. Conclusion

In conclusion, we have used a simple plasmonic sensor structure composed of a periodic gold nanodisk array as an example to illustrate the accuracy and usefulness of neural networks in predicting optical resonances. A small number of training datasets, generated by varying the period, diameter and height of the nanodisks, is sufficient for effective training. The optimal NN architecture has 4 hidden layers with 330 nodes per layer. The test results show that 97.5% of the predictions have a relative error below 5% over the entire visible wavelength range, and the critical resonance features are well maintained. The well-trained NN can predict accurate spectra for countless nanostructures within the studied parameter range, and each prediction is completed in a few milliseconds. Such a tool provides a general solution for minimizing the cost of plasmonic sensor simulations and can guide the design and fabrication of related photonic devices.

Funding

Natural Science Foundation of Jiangsu Province (SBK2019020904); Jiangsu Provincial Key Research and Development Program (BE2018728); National Natural Science Foundation of China (11604151, 61974069); NUPTSF (NY219008).

Acknowledgements

X.L. conducted this project; J.S. and W.G. provided technical discussions; L.G. oversaw this work and revised all materials. The authors acknowledge support from the Natural Science Foundation of Jiangsu Province (SBK2019020904), the National Natural Science Foundation of China (6197030941 and 11604151), NUPTSF (NY219008) and the Key Research and Development Plan of Jiangsu Province under Grant BE2018728.

References

1. H. Im, H. Shao, Y. I. Park, V. M. Peterson, C. M. Castro, R. Weissleder, and H. Lee, “Label-free detection and molecular profiling of exosomes with a nano-plasmonic sensor,” Nat. Biotechnol. 32(5), 490–495 (2014).

2. P. Strobbia, E. R. Languirand, and B. M. Cullum, “Recent advances in plasmonic nanostructures for sensing: a review,” Opt. Eng. 54(10), 100902 (2015).

3. M. I. Stockman, “Nanoplasmonic sensing and detection,” Science 348(6232), 287–288 (2015).

4. J. N. Anker, W. P. Hall, O. Lyandres, N. C. Shah, J. Zhao, and R. P. V. Duyne, “Biosensing with plasmonic nanosensors,” Nat. Mater. 7(6), 442–453 (2008).

5. O. Limaj, D. Etezadi, N. J. Wittenberg, D. Rodrigo, D. Yoo, S. H. Oh, and H. Altug, “Infrared plasmonic biosensor for real-time and label-free monitoring of lipid membranes,” Nano Lett. 16(2), 1502–1508 (2016).

6. B. Špačková, P. Wrobel, M. Bocková, and J. Homola, “Optical biosensors based on plasmonic nanostructures: a review,” Proc. IEEE 104(12), 2380–2408 (2016).

7. B. Liedberg, C. Nylander, and I. Lunström, “Surface plasmon resonance for gas detection and biosensing,” Sens. Actuators 4, 299–304 (1983).

8. W. Ma, F. Cheng, and Y. Liu, “Deep-learning-enabled on-demand design of chiral metamaterials,” ACS Nano 12(6), 6326–6334 (2018).

9. Z. Liu, D. Zhu, S. P. Rodrigues, K. T. Lee, and W. Cai, “Generative model for the inverse design of metasurfaces,” Nano Lett. 18(10), 6570–6576 (2018).

10. E. Bor, O. Alparslan, M. Turduev, Y. S. Hanay, H. Kurt, S. I. Arakawa, and M. Murata, “Integrated silicon photonic device design by attractor selection mechanism based on artificial neural networks: optical coupler and asymmetric light transmitter,” Opt. Express 26(22), 29032–29044 (2018).

11. J. Baxter, A. C. Lesina, J. M. Guay, A. Weck, P. Berini, and L. Ramunno, “Plasmonic colours predicted by deep learning,” Sci. Rep. 9(1), 8074 (2019).

12. V. G. Kravets, A. V. Kabashin, W. L. Barnes, and A. N. Grigorenko, “Plasmonic surface lattice resonances: a review of properties and applications,” Chem. Rev. 118(12), 5912–5951 (2018).

13. C. Valsecchi and A. G. Brolo, “Periodic metallic nanostructures as plasmonic chemical sensors,” Langmuir 29(19), 5638–5649 (2013).

14. J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B. G. DeLacy, J. D. Joannopoulos, M. Tegmark, and M. Soljačić, “Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv. 4(6), eaar4206 (2018).

15. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics 5(4), 1365–1369 (2018).

16. X. Lin, Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan, “All-optical machine learning using diffractive deep neural networks,” Science 361(6406), 1004–1008 (2018).

Optical Materials Express