
Inverse design of an on-chip optical response predictor enabled by a deep neural network

Open Access

Abstract

We propose inverse-designed nanophotonic waveguide devices that produce desired optical responses over the wide band of 1450–1650 nm. The proposed devices have an ultra-compact footprint of just 1.5 µm × 3.0 µm and are designed on a silicon-on-insulator (SOI) waveguide platform. Individual nano-pixels with dimensions of 150 nm × 150 nm are made of either silicon or silicon dioxide, and the material of each of the 200 cells is determined by a trained deep neural network. While training the two networks, hyperparameter optimization was applied to make the training process efficient. We then fabricated the proposed devices using a CMOS-compatible fabrication process and experimentally verified their performance.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In recent years, nanophotonics has grown to be one of the most interesting research fields, because it enables a high level of integration, ultra-compact device sizes, mass producibility, and low power consumption [1]. With these advantages, individual nanophotonic devices can be applied in large-scale array devices, such as optical phased arrays [2–4]. Accordingly, researchers have proposed many methods to design nanophotonic devices effectively. Conventionally, researchers have manually scanned parameter spaces and then optimized their structures using intuition and knowledge. Even though this approach has produced many highly capable devices, it can also be slow and inefficient. These drawbacks have led to the adoption of inverse design algorithms for photonic devices. Inverse design methods allow researchers to scan many parameters simultaneously and to produce non-intuitive structures that meet the needs of small-scale devices and today's technological demands. Because of their free-form geometry, inverse-designed structures can manipulate light specifically for the defined problem, resulting in high-performance devices. Another advantage of these algorithms is that they can be applied to almost any research problem, including sensing [5], imaging [6], and optical computing [7] applications.

The genetic algorithm (GA) is one of the inverse design algorithms used to design nanophotonic structures [8]. GA attempts to find the best solution to a defined problem by arbitrarily changing the locations of pixels. Recently, a compact reflector was designed using GA, exhibiting around 97% efficiency and a large bandwidth [9]. Highly efficient polarization rotator and antenna designs have also been proposed [10,11]. Similar to GA, the direct binary search (DBS) algorithm [12] is also used as an inverse design tool, and successful waveguide crossing, beam splitter, and power splitter designs have been proposed [13–15]. Other algorithms, such as the adjoint method [16] and the objective-first algorithm [17], offer an analytical approach to the targeted design problem and provide faster convergence. Using these methods, researchers developed and experimentally verified an achromatic lens that efficiently focuses light over a broad band [18]. However, optimized free-form structures often have very small features that are not supported by conventional CMOS-compatible fabrication processes. In addition, fabrication-related parameter deviations can hinder the operation of the designed structures. Because of these limitations, most such designs have been proposed without a fabrication process and have only been verified numerically [19–21].

Nano-pixel devices, also known as digital nanophotonic devices, have attracted great interest in this area [22]. Nano-pixel devices are analogous to pixelated metamaterials, but unlike conventional pixelated metamaterials, the periodicity of their structure can be broken to explore various topological structures. Several inverse-designed nano-pixel devices have been proposed in various fields, including optical power splitters, polarization beam splitters, wavelength de-multiplexers, waveguide crossings, and waveguide bends [23–25]. Compared to free-form structures, nano-pixel devices are easy to fabricate because fabrication constraints can be imposed at the beginning of the optimization process to ensure the critical dimensions required by the fabrication process.

Although these approaches can offer excellent designs with superior performance, the optimization processes are computationally demanding, and a more efficient and practical optimization process is needed. For this purpose, deep neural networks have recently been applied to design nanophotonic devices efficiently [26–29]. Several neural networks have been proposed for inverse design, and these networks have the potential to design nanophotonic structures with various functionalities [30–34]. Researchers have also proposed design approaches that generate optimized structures based on generative adversarial networks (GANs) [35]. Both the conventional inverse design method and the neural network-enabled inverse design method require a large number of full-wave electromagnetic simulations. However, while the conventional inverse design method incurs the same time cost for every new design, the network-enabled inverse design method pays a one-time cost that is then amortized over many designs. With a sufficient number of samples, the data-driven method enables the deep neural network to learn the inverse design of nanophotonic devices with desired functionalities. Here, we applied a well-known deep neural network architecture, the Tandem network, to inverse-design the devices [36]. The Tandem network was first proposed in 2018, and its main idea is to divide the network into two parts: a pre-trained forward modeling network and an inverse design network. This arrangement prevents the non-unique response-to-design mapping problem, which otherwise makes the network training process ineffective.

In this study, we propose an inverse-designed optical response predictor enabled by a deep neural network. The trained network generates a device structure with a desired spectral response between 1450 nm and 1650 nm. The generated nanophotonic devices consist of 200 identical pixels with an area of 150 nm × 150 nm each, and the total footprint of each device is 1.5 µm × 3.0 µm. The silicon device layer has a thickness of 220 nm. The proposed structures can be fabricated with a CMOS-compatible fabrication process since their minimum feature size is 150 nm. Our neural network scheme includes a response predictor network and an inverse design network, and both were trained with supervised learning. A relatively small amount of data was used during the supervised learning process, which made training efficient. The response predictor network is made of fully connected layers and reached a low root-mean-square-error (RMSE) loss of 0.025 after training. The inverse design network is also composed of fully connected layers and reached a low RMSE loss of 0.043 after training.

2. Design of the device and details of the deep neural network

In this paper, digital nanophotonic devices were inverse-designed to have a desired output spectral response profile between 1450 and 1650 nm. Our device was designed on a silicon-on-insulator (SOI) platform consisting of a 725 µm silicon substrate, a 2 µm buried oxide (SiO2) layer, a 220 nm silicon patterning layer, and a 1.2 µm top oxide cladding.

A deep neural network is trained to inverse-design the structure. A schematic of the inverse-designed nano-pixel device is shown in Fig. 1. The device is composed of 20×10 equal unit pixels with dimensions of 150 nm × 150 nm, each filled with either silicon or silicon dioxide. Among the 200 unit pixels, a finite number are selected to be SiO2 while the others remain silicon; this perturbation of the material distorts the homogeneous index profile within the device, which may change the propagation profile of the guided light. The designed devices are connected to input and output single-mode silicon waveguides with a width of 500 nm, which excite the fundamental TE mode.
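As an illustration of this pixelated representation (our own sketch, not the authors' code), the 20×10 binary pixel matrix can be mapped to a refractive-index layout before it is passed to the electromagnetic solver. The pixel pitch and approximate material indices below follow the values stated in the text; the helper name is hypothetical.

```python
import numpy as np

PIXEL_SIZE = 150e-9           # 150 nm square pixels
N_ROWS, N_COLS = 10, 20       # 10 x 20 = 200 pixels -> 1.5 um x 3.0 um footprint
N_SI, N_SIO2 = 3.48, 1.44     # approximate refractive indices near 1550 nm

def pixels_to_index_map(pixels: np.ndarray) -> np.ndarray:
    """Map a binary pixel matrix (1 = silicon, 0 = SiO2) to a refractive-index map."""
    assert pixels.shape == (N_ROWS, N_COLS)
    return np.where(pixels == 1, N_SI, N_SIO2)

# Example: start fully silicon and convert two pixels to SiO2
structure = np.ones((N_ROWS, N_COLS), dtype=int)
structure[4, 9] = 0
structure[5, 10] = 0
index_map = pixels_to_index_map(structure)   # 10 x 20 array of refractive indices
```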

Fig. 1. Schematic of the inverse-designed nanophotonic structure. The compact spectral predictor device made of nano-pixels is designed on an SOI platform.

2.1 Data preparation for supervised learning

To train our inverse design network with a data-driven approach, a sufficient number of samples, each composed of a nano-pixel structure (a 20×10 matrix) and its corresponding output spectral response profile between 1450 and 1650 nm (sampled at 100 wavelength points), is needed. Each pixel can be filled with either silicon or silicon dioxide, so there are $2^{200} \approx 1.6\times10^{60}$ possible combinations. It is impossible to conduct an electromagnetic evaluation of every structure because of the time cost. Moreover, most of the possible combinations have a high insertion loss, and only a small subset of the structures gives a meaningful, low insertion loss. Therefore, the data preparation process is assisted by the direct binary search (DBS) algorithm. A flowchart of the data generation process is shown in Fig. 2. First, starting with the design area fully filled with silicon, one or two pixels are selected and changed to silicon dioxide. Then a Lumerical 3D finite-difference time-domain (FDTD) simulation is performed. If the maximum transmission of the corresponding response profile exceeds 80% of the input light intensity, the changed structure is kept and the sample is saved. If it does not exceed 80%, the structure is reverted to the structure before the index change. New pixels are then randomly selected and the process is repeated many times. With this data generation strategy, we were able to obtain meaningful samples while maintaining sufficient randomness of the structures. Using this process, about 30% of the simulated structures were saved. Each 3D FDTD simulation takes about 20 seconds, so a total of 66 hours was spent generating the 12,000 samples. Among these, 80% were labeled as training samples, 10% as validation samples, and the remaining 10% as test samples.
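The data-generation loop of Fig. 2 can be summarized as follows. This is a sketch of the flow described above, not the authors' script: `run_fdtd_transmission` is a hypothetical wrapper around the 3D FDTD solver that returns the transmission at the 100 sampled wavelength points.

```python
import numpy as np

def run_fdtd_transmission(pixels: np.ndarray) -> np.ndarray:
    """Hypothetical wrapper: run a 3D FDTD simulation of the given pixel layout and
    return the transmission spectrum at 100 points between 1450 and 1650 nm."""
    raise NotImplementedError

def generate_dataset(n_samples: int, threshold: float = 0.80, seed: int = 0):
    rng = np.random.default_rng(seed)
    pixels = np.ones((10, 20), dtype=int)        # start fully filled with silicon
    dataset = []
    while len(dataset) < n_samples:
        candidate = pixels.copy()
        for _ in range(rng.integers(1, 3)):      # flip one or two random pixels to SiO2
            r, c = rng.integers(0, 10), rng.integers(0, 20)
            candidate[r, c] = 0
        spectrum = run_fdtd_transmission(candidate)
        if spectrum.max() > threshold:           # keep only high-transmission structures
            pixels = candidate                   # accept the index change
            dataset.append((candidate.copy(), spectrum))
        # otherwise revert: `pixels` keeps the previously accepted structure
    return dataset
```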

Fig. 2. Flowchart of the data preparation process. The direct binary search algorithm is applied to generate datasets with high transmission.

2.2 Architecture of the Tandem neural network and hyperparameter optimization

The proposed deep neural network is composed of two networks. The structure of the first network is shown in Fig. 3(a). This network is designated the ‘response prediction network’, and it predicts the output response profile between 1450 and 1650 nm when an input nano-pixel photonic structure is given. Once trained, it can predict the output response profile of an arbitrary nano-pixel structure immediately, without any electromagnetic simulation. The structure of the second network is shown in Fig. 3(b). This network is termed the ‘inverse design network’, and it generates a nano-pixel structure when an arbitrary input response profile is given. By combining the two networks, the proposed neural network is obtained; its architecture is given in Fig. 3(c). The inverse design network is connected to the pre-trained response prediction network, which replaces the time-consuming 3D FDTD simulation. In this way, the inverse design network can be trained effectively and can generate a photonic structure having the desired spectral response.
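A minimal PyTorch sketch of this tandem arrangement, written by us using the layer sizes and dropout values reported in the hyperparameter optimization below; the ReLU and sigmoid activations are assumptions, as the paper does not state them.

```python
import torch
import torch.nn as nn

class ResponsePredictor(nn.Module):
    """Forward model: 200-pixel structure -> 100-point spectral response."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(200, 1190), nn.ReLU(), nn.Dropout(0.41),
            nn.Linear(1190, 570), nn.ReLU(), nn.Dropout(0.53),
            nn.Linear(570, 100), nn.Sigmoid(),      # transmission bounded in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

class InverseDesigner(nn.Module):
    """Inverse model: 100-point target response -> 200 pixel values in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(100, 220), nn.ReLU(), nn.Dropout(0.47),
            nn.Linear(220, 760), nn.ReLU(), nn.Dropout(0.13),
            nn.Linear(760, 800), nn.ReLU(), nn.Dropout(0.53),
            nn.Linear(800, 200), nn.Sigmoid(),
        )

    def forward(self, r):
        return self.net(r)

class Tandem(nn.Module):
    """Inverse designer followed by the pre-trained (frozen) response predictor."""
    def __init__(self, designer, predictor):
        super().__init__()
        self.designer, self.predictor = designer, predictor

    def forward(self, target_response):
        structure = self.designer(target_response)
        predicted_response = self.predictor(structure)
        return structure, predicted_response
```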

Fig. 3. (a) The architecture of the ‘response prediction network’. The encoder-type network maps the input structure and output response profile. (b) The architecture of the ‘inverse design network’. The decoder-type network maps the input response profile and output structure. (c) The architecture of the Tandem neural network. A pre-trained response prediction network is applied to train the inverse design network.

As the complexity of deep neural networks increases to achieve various functionalities, hyperparameter optimization becomes a challenging task. Conventionally, designers have optimized hyperparameters manually, which is a hugely time-consuming and inefficient process. Recently, efficient hyperparameter optimization tools have been proposed, such as Hyperopt [37], Autotune [38], and Optuna [39]. In this work, we applied Optuna, an effective hyperparameter optimization tool that provides state-of-the-art algorithms and advantages compared to the other optimization tools.

First, the hyperparameters of the response prediction network are optimized. The number of layers is searched between 1 and 10, and the number of nodes in each layer between 1 and 2000. The dropout value of each layer is searched between 0 and 1, and the learning rate between $10^{-5}$ and 0.10. Considering the loss and the number of floating-point operations (FLOPs), the number of hidden layers was chosen to be 2, with 1190 and 570 nodes, respectively. The dropout values were set to 0.41 and 0.53, respectively. Finally, the learning rate was chosen to be $9.66\times10^{-5}$.
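The search described above maps directly onto Optuna's `suggest` API. The sketch below uses the real Optuna interface with the ranges listed in the text; `train_and_evaluate` is a placeholder for building and training the candidate network, and the trial budget is our assumption.

```python
import optuna

def train_and_evaluate(nodes, dropouts, lr) -> float:
    """Placeholder: build an MLP from (nodes, dropouts), train it with learning
    rate `lr` on the training split, and return the validation RMSE."""
    raise NotImplementedError

def objective(trial: optuna.Trial) -> float:
    n_layers = trial.suggest_int("n_layers", 1, 10)
    nodes = [trial.suggest_int(f"n_units_{i}", 1, 2000) for i in range(n_layers)]
    dropouts = [trial.suggest_float(f"dropout_{i}", 0.0, 1.0) for i in range(n_layers)]
    lr = trial.suggest_float("lr", 1e-5, 0.1, log=True)
    return train_and_evaluate(nodes, dropouts, lr)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)     # number of trials is our assumption
print(study.best_params)
```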

After training the response prediction network, the hyperparameters of the inverse design network are optimized in the same way. With the same search space, the number of layers chosen was 3, with 220, 760, and 800 nodes, respectively. The dropout values were set to 0.47, 0.13, and 0.53, respectively, and the learning rate was chosen to be $9.06\times10^{-4}$. To ensure that both the generated structure and the spectral response predicted by the entire network are close to the labels in the training dataset, we introduced a structural term in the loss function. In the equation below, the total loss ($L_{total}$) of the Tandem network is defined as the weighted average of the structure loss ($L_{structure}$) and the response profile loss ($L_{response}$). The weight ($\omega$) was determined after testing many values, and 0.95 was selected. With this modified loss function, the inverse-designed nanophotonic structure has an index profile similar to the devices in the training dataset, while retaining enough randomness to generate new devices.

$$L_{total} = \omega L_{response} + (1 - \omega) L_{structure}$$
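Equation (1) translates directly into code. The sketch below assumes both terms are RMSE losses, consistent with Section 2.3, and uses the weight $\omega = 0.95$ stated above.

```python
import torch

def rmse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    return torch.sqrt(torch.mean((pred - target) ** 2))

def tandem_loss(pred_response, target_response,
                gen_structure, label_structure, w: float = 0.95):
    """Weighted total loss of Eq. (1): L_total = w*L_response + (1-w)*L_structure."""
    l_response = rmse(pred_response, target_response)
    l_structure = rmse(gen_structure, label_structure)
    return w * l_response + (1.0 - w) * l_structure
```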

2.3 Training the networks

After the hyperparameter optimization discussed in the previous section, the proper values of the number of layers, number of nodes, and learning rate are selected and the network is trained. Among the various candidate loss functions, including the mean absolute error (MAE), mean squared error (MSE), RMSE, mean squared log error (MSLE), mean absolute percentage error (MAPE), and mean percentage error (MPE), the RMSE was used, as given in the equation below:

$$Loss = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
where $y_i$ and $\hat{y}_i$ are the predicted and actual spectral data, respectively.

First, the response prediction network is trained. With 20,000 epochs, a total of 92 minutes was spent training the network on an NVIDIA GeForce RTX 3070 Ti GPU. The learning curve of the response prediction network is shown in Fig. 4(a). After training, the training loss was 0.034 and the validation loss was 0.025. With these losses, the response predicted by our network can be compared with the real (simulated) response for the same structure.
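A minimal sketch of this forward-model training under the reported settings (RMSE loss, learning rate $9.66\times10^{-5}$, 20,000 epochs), reusing the `ResponsePredictor` and `rmse` helpers from the earlier sketches; the Adam optimizer and full-batch updates are our assumptions.

```python
import torch

def train_predictor(model, x_train, y_train, x_val, y_val,
                    lr=9.66e-5, epochs=20_000):
    """x_*: (N, 200) flattened pixel matrices; y_*: (N, 100) transmission spectra."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    x_train, y_train = x_train.to(device), y_train.to(device)
    x_val, y_val = x_val.to(device), y_val.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        model.train()
        opt.zero_grad()
        loss = rmse(model(x_train), y_train)
        loss.backward()
        opt.step()
        if epoch % 1000 == 0:                      # track the learning curve
            model.eval()
            with torch.no_grad():
                val_loss = rmse(model(x_val), y_val)
            print(f"epoch {epoch}: train {loss.item():.3f}  val {val_loss.item():.3f}")
    return model
```

An instance of the `ResponsePredictor` defined above would be passed as `model`.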

Fig. 4. Learning curve of the (a) response prediction network, and (b) inverse design network.

After training the response prediction network, this pre-trained network was used to train the inverse design network. As with the response prediction network, the number of layers, number of nodes, and learning rate were optimized using Optuna, and RMSE was again used for the loss function. With 700 epochs, a total of 4 minutes was spent training the network. Note that the number of epochs was carefully selected to avoid overfitting. The learning curve of the inverse design network is shown in Fig. 4(b); the validation loss is plotted to show how well the model fits new data. After training, the training loss was 0.049 and the validation loss was 0.043. Because the training of the inverse design network passes through the pre-trained response prediction network, it inherits that network's residual loss, which caused a small increase in noise during the learning process.
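The central step of the tandem training is to freeze the pre-trained predictor so that gradients update only the inverse designer, using the combined loss of Eq. (1). A sketch under the same assumptions as the previous blocks, reusing `tandem_loss`.

```python
import torch

def train_inverse(designer, pretrained_predictor, target_responses, label_structures,
                  lr=9.06e-4, epochs=700, w=0.95):
    """target_responses: (N, 100) spectra; label_structures: (N, 200) pixel labels."""
    for p in pretrained_predictor.parameters():
        p.requires_grad = False                    # freeze the forward model
    pretrained_predictor.eval()

    opt = torch.optim.Adam(designer.parameters(), lr=lr)
    for epoch in range(epochs):
        designer.train()
        opt.zero_grad()
        gen_structure = designer(target_responses)
        pred_response = pretrained_predictor(gen_structure)
        loss = tandem_loss(pred_response, target_responses,
                           gen_structure, label_structures, w=w)
        loss.backward()
        opt.step()
    return designer
```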

3. Results

3.1 Performance of the trained network

The trained inverse design network generates a structure when a desired spectral response is given as an input. With the generated structures, we performed 3D FDTD simulations to verify whether they actually have the desired spectral responses. The generated structures contain a number of silicon dioxide pixels, typically between 1 and 10. Several examples of generated nanophotonic structures are shown in Figs. 5(a), 5(c), and 5(e). The 3D FDTD simulations provide numerical calculations for the generated structures, which are compared with the desired responses in Figs. 5(b), 5(d), and 5(f). The prediction accuracy differs among Figs. 5(b), 5(d), and 5(f), and Fig. 5(d) clearly gives the best agreement. To explain this, we analyzed the training dataset we generated. Among the 12,000 labels in the training dataset, the average number of silicon dioxide pixels was 6.923. The structure in Fig. 5(a) has 4 silicon dioxide pixels, Fig. 5(c) has 7, and Fig. 5(e) has 10. Since the structure in Fig. 5(c) has the number of silicon dioxide pixels closest to this average, its better agreement is reasonable: supervised learning performs best for inputs that are well represented in the training data.
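For completeness, generating a candidate device from a target spectrum is a single forward pass through the inverse design network, followed by binarizing the continuous pixel values into a silicon/SiO2 layout before FDTD verification. This is our own sketch; the 0.5 binarization threshold is an assumption.

```python
import torch

@torch.no_grad()
def generate_structure(designer, desired_response: torch.Tensor):
    """desired_response: (100,) target transmission values between 1450 and 1650 nm."""
    designer.eval()
    raw = designer(desired_response.unsqueeze(0)).squeeze(0)   # (200,) values in [0, 1]
    binary = (raw > 0.5).int().reshape(10, 20)                 # 1 = silicon, 0 = SiO2
    n_sio2 = int((binary == 0).sum())                          # count of SiO2 pixels
    return binary, n_sio2
```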

Fig. 5. Three examples of the generated structures and prediction results of the network. (a), (c), (e) Top view of the nanophotonic structure inverse-designed with the proposed deep neural network. (b), (d), and (f) Spectral response of the inverse-designed structure and desired spectral response of (a), (c), and (e), respectively.

3.2 Fabrication and experimental verification

Some of the inverse-designed structures proposed in the present work were fabricated on an SOI platform. The CMOS-compatible fabrication process was provided by Applied Nanotools, Inc., which offers 100 keV electron beam lithography capable of producing reliable results with a minimum feature size of 60 nm. As our proposed structures have a minimum feature size of 150 nm, they are well within the capabilities of this fabrication process. Our proposed network has the potential to generate various structures. A scanning electron microscope (SEM) image of one representative nano-pixel device is shown in Fig. 6(a). To clearly verify the performance of the nano-pixel devices, we decided to fabricate a more complex structure (∼20 SiO2 pixels) than the high-transmission structures shown in Fig. 5 (< 10 SiO2 pixels). To predict the transmission spectrum of the fabricated structure, only 2000 samples were used to train the network, using the same method proposed in this paper.

Using the fabricated device, we experimentally verified the spectral response. The input light from a tunable laser source is coupled to the input grating coupler of the device with a single-mode optical fiber (SMF-28). The light is out-coupled through an output grating coupler into a single-mode fiber, and the power is measured by a photodetector. Because the grating couplers and waveguides have loss and wavelength dependence, we normalized the transmission by subtracting the grating-to-grating loss [40]. The comparison between the numerical calculation and the experimental result is shown in Fig. 6(b). Although the numerical calculation covers the band between 1450 nm and 1650 nm, the spectral response was measured only from 1500 nm to 1620 nm, because our laser source cannot operate outside that range. The experimental result clearly confirms the numerical calculation for the fabricated device. Based on these results, we emphasize that nano-pixel devices provide substantial advantages in terms of reliable fabrication with fewer training samples.
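The normalization described above amounts to subtracting the grating-to-grating reference measurement from the device measurement at each wavelength in the dB domain. A small illustration with hypothetical measurement arrays:

```python
import numpy as np

def normalize_transmission(device_dbm: np.ndarray, reference_dbm: np.ndarray) -> np.ndarray:
    """Subtract the grating-to-grating reference (in dBm) from the device measurement
    (in dBm) at each wavelength and return the normalized linear transmission."""
    insertion_loss_db = device_dbm - reference_dbm      # device loss relative to reference
    return 10.0 ** (insertion_loss_db / 10.0)           # convert dB to linear transmission
```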

Fig. 6. Fabrication results for the generated device. (a) SEM image of the fabricated device. (b) Comparison of the numerical calculation and experimental results for the fabricated device.

4. Discussion

4.1 Limited variation of spectral responses

In this work, we applied supervised learning to train a neural network that generates inverse-designed nanophotonic structures. Although supervised learning is capable of predicting the responses represented in the training dataset and is straightforward to implement compared to other deep learning approaches, it also has some limitations. One of the most critical disadvantages of supervised learning is that if the input is not similar to the training data, the network might not predict the correct output. In our case, if we ask the network to inverse-design a structure with a constant spectral response of 0.99 over the 1450–1650 nm band, the network cannot generate a structure with that intended spectral response. Our samples have maximum transmissions greater than 0.80, and most of their spectral responses are not dramatically diverse. If we want to predict other spectral responses, such as a strong filtering effect, the training dataset must contain some type of filtering effect. We verified that adding additional constraints while applying the DBS algorithm can generate structures with specific functionalities. However, we emphasize that in this case the time needed for the data preparation stage increases significantly, which makes the training process inefficient. With a sufficient number of samples that include structures with a specific functionality, the inverse design network could generate structures with that functionality, such as a step-function-like spectral response.

It should also be mentioned that, to increase diversity, one may use a randomly sampled dataset for training. However, in that case the network could not map the relation between structures and responses successfully, since the diversity of the devices and spectral responses is too broad. In addition, the devices generated from such a dataset would not be as efficient as the ones we obtained. The above-mentioned constraints can be overcome by using a much larger dataset, but our goal is to generate high-performance devices with an efficient approach. These limitations can also be addressed by applying another machine learning technique, such as a generative adversarial network (GAN) or reinforcement learning.

4.2 Efficient training of the network

In this paper, we used 12,000 samples out of $1.6\times10^{60}$ possible combinations to train, validate, and test the network. Increasing the number of samples would certainly increase the performance of the neural network. However, because generating large amounts of data is time-consuming, we must consider the trade-off between network performance and time. For other inverse design networks, the authors in Ref. [41] and Ref. [36] generated $1.5\times10^{5}$ and $5.5\times10^{5}$ samples, respectively, which is at least one order of magnitude more than in the present work. Our data preparation strategy and hyperparameter optimization allowed us to train the network effectively with a relatively small number of samples. To the best of our knowledge, hyperparameter optimization of inverse design networks has not been investigated before, and we propose it here as a new approach. Moreover, the process could be made even more efficient by optimizing additional hyperparameters, such as the loss function, activation function, and batch size.

5. Conclusion

In this work, using a deep neural network, we inverse-designed nanophotonic structures with a desired optical spectral response. The nanophotonic structures were implemented on an SOI platform with a minimum feature size of 150 nm and fabricated using a CMOS-compatible fabrication process. The proposed network consists of a response prediction network and an inverse design network, and it is highly effective even though it was trained with only 12,000 samples. The hyperparameters of both networks were optimized using an open-source optimizer. The response prediction network was trained first and reached a training loss of 0.034 and a validation loss of 0.025. The inverse design network was then trained with the trained response prediction network and reached a training loss of 0.049 and a validation loss of 0.043. To verify our proposed network, several structures were generated and their spectral responses were numerically calculated with 3D FDTD simulations. The inverse-designed structures provided spectral profiles similar to the desired spectral responses. The inverse-designed structures were then fabricated using a CMOS-compatible fabrication process, and their performance was verified experimentally. For a complex nano-pixel structure having ∼20 SiO2 pixels among 200 pixels, the numerical calculation, the prediction of the network, and the experimental result were all well matched. With proper training data, our proposed network is capable of predicting various kinds of spectral responses for devices with a very small footprint, which can be implemented in a range of nanophotonic devices for optical computing and programmable photonics. Alternative training approaches such as unsupervised learning or reinforcement learning can be employed to overcome the limitations of the supervised learning proposed in this work.

Funding

Ministry of Science and ICT, South Korea (N11220016); National Research Foundation of Korea (NRF-2022R1A2C100977311); BK Four Program, funded by the Ministry of Education.

Disclosures

The authors declare no conflict of interest.

Data availability

Data underlying the results presented in this paper may be obtained from the authors upon reasonable request.

References

1. A. Karabchevsky, A. Katiyi, A. S. Ang, and A. Hazan, “On-chip nanophotonics and future challenges,” Nanophotonics 9(12), 3733–3753 (2020). [CrossRef]  

2. G. Kang, S.-H. Kim, J.-B. You, D.-S. Lee, H. Yoon, Y.-G. Ha, J.-H. Kim, D.-E. Yoo, D.-W. Lee, C.-H. Youn, K. Yu, and H.-H. Park, “Silicon-based optical phased array using electro-optic p-i-n phase shifters,” IEEE Photon. Technol. Lett. 31(21), 1685–1688 (2019). [CrossRef]  

3. S.-H. Kim, J.-B. You, Y.-G. Ha, G. Kang, D.-S. Lee, H. Yoon, D.-E. Yoo, D.-W. Lee, K. Yu, C.-H. Youn, and H.-H. Park, “Thermo-optic control of the longitudinal radiation angle in a silicon-based optical phased array,” Opt. Lett. 44(2), 411–414 (2019). [CrossRef]  

4. J.-Y. Kim, J. Yoon, J. Kim, N.-H. Kwon, H.-W. Rhee, M. Baek, Y. Lee, H.-H. Park, and H. Yoon, “Demonstration of beam steering using a passive silica optical phased array with wavelength tuning,” Opt. Lett. 47(19), 4857–4860 (2022). [CrossRef]  

5. J. Qin, S. Jiang, Z. Wang, X. Cheng, B. Li, Y. Shi, D. P. Tsai, A. Q. Liu, W. Huang, and W. Zhu, “Metasurface micro/nano-optical sensors: principles and applications,” ACS Nano 16(8), 11598–11618 (2022). [CrossRef]  

6. J. Li, L. Bao, S. Jiang, Q. Guo, D. Xu, B. Xiong, G. Zhang, and F. Yi, “Inverse design of multifunctional plasmonic metamaterial absorbers for infrared polarimetric imaging,” Opt. Express 27(6), 8375–8386 (2019). [CrossRef]  

7. B. Neşeli, Y. A. Yilmaz, H. Kurt, and M. Turduev, “Inverse design of ultra-compact photonic gates for all-optical logic operations,” J. Phys. D: Appl. Phys. 55(21), 215107 (2022). [CrossRef]  

8. S. Molesky, Z. Lin, A. Y. Piggott, W. Jin, J. Vucković, and A. W. Rodriguez, “Inverse design in nanophotonics,” Nat. Photonics 12(11), 659–670 (2018). [CrossRef]  

9. Z. Yu, H. Cui, and X. Sun, “Genetically optimized on-chip wideband ultracompact reflectors and Fabry–Perot cavities,” Photonics Res. 5(6), B15–B19 (2017). [CrossRef]  

10. Z. Yu, H. Cui, and X. Sun, “Genetic-algorithm-optimized wideband on-chip polarization rotator with an ultrasmall footprint,” Opt. Lett. 42(16), 3093–3096 (2017). [CrossRef]  

11. S. Jafar-Zanjani, S. Inampudi, and H. Mosallaei, “Adaptive genetic algorithm for optical metasurfaces design,” Sci. Rep. 8(1), 11040–16 (2018). [CrossRef]  

12. M. A. Seldowitz, J. P. Allebach, and D. W. Sweeney, “Synthesis of digital holograms by direct binary search,” Appl. Opt. 26(14), 2788–2789 (1987). [CrossRef]  

13. Y. Liu, Z. Zhong, S. Wang, Y. Liu, Y. Yao, J. Du, Q. Song, and K. Xu, “Four-mode waveguide crossing via digitized meta-structure,” in Optical Fiber Communication Conference and Exhibition (OFC) (2021), pp. 1–3.

14. J.-H. Li, K. J. Webb, G. J. Burke, D. A. White, and C. A. Thompson, “Design of near-field irregular diffractive optical elements by use of a multiresolution direct binary search method,” Opt. Lett. 31(9), 1181–1183 (2006). [CrossRef]  

15. H. Ma, J. Huang, K. Zhang, and J. Yang, “Ultra-compact and efficient 1 × 2 mode converters based on rotatable direct-binary-search algorithm,” Opt. Express 28(11), 17010–17019 (2020). [CrossRef]  

16. K. Wang, X. Ren, W. Chang, L. Lu, D. Liu, and M. Zhang, “Inverse design of digital nanophotonic devices using the adjoint method,” Photonics Res. 8(4), 528–533 (2020). [CrossRef]  

17. J. Lu and J. Vučković, “Nanophotonic computational design,” Opt. Express 21(11), 13351–13367 (2013). [CrossRef]  

18. I. A. Atalay, Y. A. Yilmaz, F. C. Savas, and H. Kurt, “A broad-band achromatic polarization-insensitive in-plane lens with high focusing efficiency,” ACS Photonics 8(8), 2481–2488 (2021). [CrossRef]  

19. Y. Augenstein and C. Rockstuhl, “Inverse design of nanophotonic devices with structural integrity,” ACS Photonics 7(8), 2190–2196 (2020). [CrossRef]  

20. J. Kim, J. -Y. Kim, J. Yoon, H. Yoon, H. -H. Park, and H. Kurt, “Inverse design of zig-zag shaped 1 × 4 optical power splitters in SOI platform,” in Silicon Photonics XVII (SPIE, 2022), Vol. 12006, pp. 177–183.

21. K. Aydin, “Nanostructured silicon success,” Nat. Photonics 9(6), 353–355 (2015). [CrossRef]

22. J. Huang, H. Ma, D. Chen, H. Yuan, J. Zhang, Z. Li, J. Han, J. Wu, and J. Yang, “Digital nanophotonics: the highway to the integration of subwavelength-scale photonics,” Nanophotonics 10(3), 1011–1030 (2021). [CrossRef]  

23. B. Shen, P. Wang, R. Polson, and R. Menon, “Ultra-high-efficiency metamaterial polarizer,” Optica 1(5), 356–360 (2014). [CrossRef]  

24. B. Shen, R. Polson, and R. Menon, “Integrated digital metamaterials enables ultra-compact optical diodes,” Opt. Express 23(8), 10847–10855 (2015). [CrossRef]  

25. Y. Liu, K. Xu, S. Wang, W. Shen, H. Xie, Y. Wang, S. Xiao, Y. Yao, J. Du, Z. He, and Q. Song, “Arbitrarily routed mode-division multiplexed photonic circuits for dense integration,” Nat. Commun. 10(1), 3263 (2019). [CrossRef]  

26. S. So, T. Badloe, J. Noh, J. B. Abad, and J. Rho, “Deep learning enabled inverse design in nanophotonics,” Nanophotonics 9(5), 1041–1057 (2020). [CrossRef]  

27. P. R. Wiecha, A. Arbouet, C. Girard, and O. L. Muskens, “Deep learning in nano-photonics: inverse design and beyond,” Photonics Res. 9(5), B182–B200 (2021). [CrossRef]  

28. I. Malkiel, M. Mrejen, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, “Plasmonic nanostructure design and characterization via deep learning,” Light: Sci. Appl. 7(1), 60 (2018). [CrossRef]  

29. T. Pu, F. Cao, Z. Liu, and C. Xie, “Deep learning for the design and characterization of high efficiency self-focusing grating,” Opt. Commun. 510, 127951 (2022). [CrossRef]  

30. S. So, J. Mun, and J. Rho, “Simultaneous inverse design of materials and structures via deep learning: demonstration of dipole resonance engineering using core-shell nanoparticles,” ACS Appl. Mater. Interfaces 11(27), 24264–24268 (2019). [CrossRef]  

31. S. Kim, N. Kim, I. Park, and H. Han, “Ultra-compact terahertz 50:50 power splitter designed by a perceptron-based algorithm,” Opt. Continuum 1(7), 1565–1571 (2022). [CrossRef]  

32. S. Banerji, A. Majumder, A. Hamrick, R. Menon, and B. Sensale-Rodriguez, “Ultra-compact integrated photonic devices enabled by machine learning and digital metamaterials,” OSA Continuum 4(2), 602–607 (2021). [CrossRef]  

33. S. So and J. Rho, “Designing nanophotonic structures using conditional deep convolutional generative adversarial networks,” Nanophotonics 8(7), 1255–1261 (2019). [CrossRef]  

34. S. So, D. Lee, T. Badloe, and J. Rho, “Inverse design of ultra-narrowband selective thermal emitters designed by artificial neural networks,” Opt. Mater. Express 11(7), 1863–1873 (2021). [CrossRef]  

35. J. Jiang, M. Chen, and J. A. Fan, “Deep neural networks for the evaluation and design of photonic devices,” Nat. Rev. Mater. 6(8), 679–700 (2020). [CrossRef]  

36. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training Deep Neural Networks for the inverse design of nanophotonic structures,” ACS Photonics 5(4), 1365–1369 (2018). [CrossRef]  

37. J. Bergstra, D. Yamins, and D. Cox, “Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures,” in International conference on machine learning (PMLR, 2013), Vol. 28, pp. 115–123.

38. P. Koch, O. Golovidov, S. Gardner, B. Wujek, J. Griffin, and Y. Xu, “Autotune: A derivative-free optimization framework for hyperparameter tuning,” in Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (ACM, 2018), pp. 443–452.

39. T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A next-generation hyperparameter optimization framework,” in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (ACM, 2019), pp. 2623–2631.

40. J. Kim, J.-Y. Kim, J. Yoon, H. Yoon, H.-H. Park, and H. Kurt, “Experimental demonstration of inverse-designed silicon integrated photonic power splitters,” Nanophotonics 11(20), 4581–4590 (2022). [CrossRef]  

41. N. J. Dinsdale, P. R. Wiecha, M. Delaney, J. Reynolds, M. Ebert, I. Zeimpekis, D. J. Thomson, G. T. Reed, P. Lalanne, K. Vynck, and O. L. Muskens, “Deep learning enabled design of complex transmission matrices for universal optical components,” ACS Photonics 8(1), 283–295 (2021). [CrossRef]  
