Gradient-probability-driven discrete search algorithm for on-chip photonics inverse design

Shanglin Yang; Shanglin Yang; Hao Jia; Lei Zhang; Lei Zhang; Jincheng Dai; Jincheng Dai; Xin Fu; Xin Fu; Ting Zhou; Ting Zhou; Gaolu Zhang; Gaolu Zhang; Lin Yang; Lin Yang

doi:10.1364/OE.432313

1. Introduction

Recently, the computational design of photonic devices has been extensively studied and has obtained many exciting results [1–4]. Compared with traditional designs, the devices with inverse design methods have the characteristics of non-intuitive, higher efficiency and compactness. These features are benefited from exploring more geometry structures in the design space instead of optimizing limited parameters. Different application scenarios have different requirements on the design methods. The on-chip devices design typically needs to satisfy the material and the feature size constraints [5,6]. First, most on-chip devices are composed of a finite number of materials, which means that the permittivity values of the device are in a discrete set. For example, most commonly used silicon photonics devices are composed of silicon, silicon dioxide, and air [7]. Second, the nanofabrication processes, such as lithographic techniques, usually require the feature size to have a lower bound.

The optimization methods, in order to satisfy the constraints above, generally have two technical routes. One approach is first to initiate optimization from the continuous permittivity and converge on a locally optimal solution to a figure of merit (FOM). Then discretize the permittivity to a discrete set and regulate the structure to satisfy the requirement of feature size. Such methods can explore the entire design space and are conducive to discovering high-performance structures [2,8–15]. For another approach, the material and feature size constraints are satisfied during the entire optimization process. The optimization process is in fact the modification of the geometric structure. This approach sacrifices some available search space, but any structure during the optimization process can be implemented for fabrication directly [1,16–29]. A specified method in this route is the pixel-based discrete search method. This method divides the design area into several unit cells, where the optional material types of each cell meet the constraints of material and feature size. However, the iterative strategy for this method, whether the BFS method [1,17–21] or heuristic algorithms [23–27], usually requires thousands of electromagnetic simulations to converge.

As is well known, introducing gradient can significantly improve the efficiency of algorithm convergence. In photonics optimization, adjoint sensitivity analysis method [29–34] has a wide range of applications. This method can calculate gradient of the FOM concerning the continuous variation permittivity efficiently. The designer can determine the directions of modifying multiple design parameters in one iteration according to the gradient. Unfortunately, this kind of gradient information cannot be applied to the pixel-based discrete algorithm directly. This incompatibility is derived from the fact that the calculation of gradient is based on the infinitesimal variation of permittivity. And the gradient will become invalid when the permittivity change is discrete. In structural optimization, algorithms such as Evolutionary Structural Optimization (ESO) type methods have been studied for handling discrete design variables utilizing gradient information [35]. The bi-directional ESO (BESO) method changes the independent variable with the smallest sensitivity value from large to small, and changes the independent variable with the largest sensitivity value from small to large. In photonics, a similar approach is used to overcome the incompatibility, researchers merge the gradient in each unit cell to represent the modification tendency of the cell. Then modify several cells with the largest tendency, thereby ensuring the validity of the gradients [36]. However, this method does not fully utilize gradient information of the whole design space and may only update a certain amount of design parameters in one iteration. Furthermore, based on Dyson’s equation [37] in a Green’s function formalism, some researchers deduce an expression of FOM that is applicable to arbitrary permittivity change, so that the designer can evaluate the variation of FOM corresponding to permittivity variations in the discrete set [5,6,38]. However, in addition to ordinary adjoint sensitivity analysis, this method also needs to compute and store Green's function, which has higher requirements for calculation and storage.

In this paper, to solve the incompatibility between the gradient and the discrete permittivity values, we proposed a GPDS algorithm, which combines the characteristics of the pixel-based discrete method and adjoint sensitivity analysis. We map the gradient to the probability space to obtain a normalized gradient-driven probability. Then sample the normalized gradient-driven probability to determine the parameters to be modified, which associate the continuous gradient to the discrete permittivity. Therefore, we can use the mature adjoint sensitivity method to optimize multiple design parameters in one iteration while ensuring the FOM increases. To further demonstrate the characteristics of the algorithm, we compare the algorithm with the traditional BFS method and traditional gradient method. The results show that the algorithm has the features of high-speed convergence and can alleviate the impairment of FOM due to the local optima. Moreover, we give the examples of designing silicon photonic mode converters and a mode de-multiplexer, which respectively reflect the flexibility of the algorithm and the potential for solving multi-objective problems.

The rest of the paper are organized as follows: In Section 2, we introduce the algorithm. Section 3 shows the optimization strategy based on the GPDS algorithm and compares the algorithm with the traditional BFS algorithm and traditional gradient algorithm. In Section 4, we show some examples of the algorithm in designing silicon photonics devices. And in Section 5 we summarize the results.

2. Gradient-probability-driven discrete search algorithm

The inverse design of photonic devices can be described as assigning the material distribution in the design region so that the electromagnetic field excited by the input field source meets the specific requirement. Its essence is an optimization problem constrained by Maxwell's equations,

(1)$$\begin{array}{c} \max \textrm{ }FOM = f(\boldsymbol{E},\boldsymbol{H},{\boldsymbol p})\\ \textrm{s}\textrm{. t}\textrm{. }{g_s}({\boldsymbol{E},\boldsymbol{H},{\varepsilon_r}} )= 0,s = 1, \cdots ,q\\ {h_t}({\boldsymbol p}) \le 0,t = 1, \cdots ,k\\ {\varepsilon _r} = {\varepsilon _r}(\boldsymbol{p}) \end{array}, $$

where FOM is given by the function f(E, H, p) dependent on the electromagnetic field E(x, y, z), H(x, y, z), and the design parameter p. The design parameter ${\boldsymbol p} = ({p_1},{p_2}, \ldots ,{p_n})$ is the independent variable of the optimization process, where n is the number of independent variables. The equations g_s(E,H,ɛ_r) = 0 are the Maxwell's equations determining the distribution of E and H, and the q is the number of the Maxwell’s equations. Where the ɛ_r is affected by the design parameter p. The equation h_t(p) are the constraints function, in on-chip optimization, which is used to constrain the optimization parameter p so that the results meet the material and feature size constraints.

For discrete search algorithms, the values of the design parameters p_i are in a discrete set D,

(2a)$${p_i} \in D, \textrm{ }i = 1, \cdot{\cdot} \cdot ,n$$

(2b)$$D = \{{{D_1},{D_2}, \cdot{\cdot} \cdot {,{D_m}} \}} $$

(2c)$${D_1} < {D_2} < \cdots < {D_m}$$

where m is the number of optional values for the design parameters p_i, and we assume that D is sorted, as expressed in Eq. (2c). In particular, when m = 2, D₁ = 0, D₂ = 1, the discrete search degenerate to the binary search.

Now we describe the proposed GPDS algorithm as follows:

Step1. Start from an initial state given ${{\boldsymbol p}^0} = ({p_1^0,p_2^0, \ldots ,p_n^0} )$.

Step2. Obtain the FOM value.

Step3. Calculate the gradient of the FOM with respect to p:

(3)$${\nabla _{\boldsymbol p}}f = \frac{{{\rm d} f({E,H,{\boldsymbol p}} )}}{{{\rm d} {\boldsymbol p}}} = \left( {\frac{{{\rm d} f}}{{{\rm d} {p_1}}},\frac{{{\rm d} f}}{{{\rm d} {p_2}}}, \cdots ,\frac{{{\rm d} f}}{{{\rm d} {p_n}}}} \right). $$

Step4. Map the absolute value gradient $|{\nabla f} |$ to the probability space:

(4)$${\boldsymbol G} = map(\alpha \cdot |{{\nabla_{\boldsymbol p}}f} |) = ({{G_1},{G_2}, \cdot{\cdot} \cdot ,{G_n}} ),{G_i} \in [{0,1} ], $$

where G is the normalized gradient-driven probability, and α is the learning rate, α > 0.

Step5. Determine the parameters p_i to be modified based on sampling the normalized gradient-driven probability G.

Step6. Modify the selected parameters p_i according to the positive or negative characteristics of the corresponding gradient value, for the selected p_i = D_k, if ${{\textrm{d}f} / {\textrm{d}{p_i}}} > 0$, then the parameter increases, p_i = D_k+1; if ${{\textrm{d}f} / {\textrm{d}{p_i}}} < 0$, then the parameter decrease, p_i = D_k-1.

Step7. Return to Step2 until the termination conditions are satisfied.

The schematic of sampling process in Step5 is shown in Fig. 1. The short blue line is the value of the normalized gradient-driven probability, so the intersection point G_i with the gray dash line is the probability of the p_i being selected. For each design parameter, take a random number from the interval [0,1], and the asterisks in Fig. 1 are the values of the random numbers. The points whose value is greater than the probability G_i will not be selected and represented by blue asterisks. The red asterisks in the green shade denote that the random number value is less than G_i, and the corresponding p_i is selected.

Fig. 1. Schematic diagram of selecting parameters to be modified based on normalized gradient-driven probability.

Download Full Size | PDF

This sampling process in the algorithm have at least two advantages: (1) The algorithm can use gradient information in the entire space. Therefore, compared with the traditional BFS method, our algorithm can improve the device performance more effectively. (2) The results of the traditional gradient algorithms are deterministic when the initial states and parameters are given. The sampling process according to probability breaks this certainty of the traditional gradient algorithm. Although this may cause the FOM value to drop over a while, it can help the algorithm jump out of the local optimum. This provides the algorithm with better stability to the initial value. We will show numerical examples to illustrate the advantages.

3. Algorithm performance analysis

3.1 Implementation environment of the algorithm in silicon photonics

In order to illustrate the performance analysis clearly, we implement the process on solving a practical application problem. In this demonstration, we constructed a strategy for silicon photonics device design based on the proposed GPDS algorithm to design a TE0 to TE1 mode converter. Note that for simplicity, all the designs in this paper are based on a commercial SOI wafer with a 220 nm top silicon layer, a 3 µm buried oxide layer, and a cladding of silicon dioxide. The device is consist of a 3 µm × 2 µm rectangular design region, a 0.5 µm width input waveguide, and a 1.0 µm width output waveguide. The working wavelength is 1550 nm. Figure 2 shows the schematic diagram of the device.

Fig. 2. Schematic diagram of designing silicon photonic TE₀ mode to TE₁ mode converter.

Download Full Size | PDF

Firstly, same as the usual pixel-based search method, we divide the design region into 15 × 10 unit cells. Each unit cell is a 200 nm × 200 nm square with a design parameter p_i. The unit cells are classified into three material types, Type-I, Type-II, and Type-III, as shown in Fig. 2. Type-I contains a 120-nm-diameter circular hole in the center of the silicon square. Type-II, similar to Type-I contains an 80-nm-diameter circular hole. Type-III is filled with silicon material. The holes of Type-I and Type-II are filled with SiO₂. These three type cells are mapped to a set of discrete values D_i (D_i = 0,1,2). D₁ = 0, D₂ = 1, and D₃ = 2 refers to type-I Type-II, and Type-III, respectively. Using this notation, we can change the device structure by modifying the design parameter p.

The power-coupling efficiencies from input mode $\left|{\left. {\begin{array}{c} {{{{\boldsymbol E}}_{in}}}\\ {{{\boldsymbol H}_{in}}} \end{array}} \right\rangle } \right.$ toward the m-th output mode $\left|{\left. {\begin{array}{c} {{{\boldsymbol E}_m}}\\ {{{\boldsymbol H}_m}} \end{array}} \right\rangle } \right.$ is defined as FOM,

(5)$$FOM = \frac{1}{4} \times \frac{{{{\left|{\int_{{S_{out}}} {({{\boldsymbol{E}_{out}} \times \boldsymbol{H}_m^ \ast{+} \boldsymbol{E}_m^\ast{\times} {\boldsymbol{H}_{out}}} )\cdot \textrm{d}\boldsymbol{S}} } \right|}^2}}}{{\left[ {\int_{{S_{in}}} {{\textrm{Re}} ({{\boldsymbol{E}_{in}} \times \boldsymbol{H}_{in}^\ast } )} \cdot \textrm{d}\boldsymbol{S}} \right]\left[ {\int_{{S_{out}}} {{\textrm{Re}} ({{\boldsymbol{E}_m} \times \boldsymbol{H}_m^\ast } )\cdot \textrm{d}\boldsymbol{S}} } \right]}}, $$

where E_in and H_in are the electromagnetic field at the input port of TE₀ mode, E_out and H_out are the electromagnetic field at output port generated by the input excitation, and E_m and H_m are the TE₁ mode’s electromagnetic field at the output port. In this problem, the E and H distributions are determined by the steady-state Maxwell equations with frequency ω,

(6)$$\nabla \times \nabla \times \boldsymbol{{\rm E}} - k_0^2{\varepsilon _r}\boldsymbol{E} ={-} j\omega {\mu _0}\boldsymbol{J}$$

where ${k_0} = {{{\omega ^2}} / {{c^2}}}$ is the free-space wavenumber, J is the electric current density.

As mentioned above, using the adjoint sensitivity analysis, in the photonics inverse design problem, the gradient of the FOM with respect to the permittivity ɛ(x,y,z) in space can be calculated using two simulations only [34]. The gradient at a single point in the design region can be expressed as,

(7)$${\left. {\frac{{d FOM}}{{d \varepsilon }}} \right|_{\textrm{design region}}} = { {2k_0^2{\textrm{Re}} ({{{\boldsymbol E}_{adj}} \cdot {{\boldsymbol E}_{for}}} )} |_{\textrm{design region}}}, $$

where, E_for is the forward electric field at the same point in design region corresponding to the input mode source. While the E_adj is the adjoint electric field at the same point, which obtained by sending mode $\left|{\left. {\begin{array}{c} {{{\boldsymbol E}_m}}\\ {{{\boldsymbol H}_m}} \end{array}} \right\rangle } \right.$ backward from output port [38,39]. We use the 3-dimension finite difference time domain (3D-FDTD) method to obtain the electric field distribution in the time domain [40], and get the electric field distribution in the frequency domain by doing Fourier Transform.

Merge the gradient of FOM to ɛ in each unit cell to obtain the average value, which can illustrate the overall refractive index changing tendency of the unit cell. We regard this value as the gradient of FOM with respect to the design parameter p_i corresponding to the unit cell. Which can be expressed as

(8)$${{{\textrm{d}FOM} / {\textrm{d}p}}_i} = { {Average({{{\textrm{d}FOM} / {\textrm{d}\varepsilon }}} )} |_{\textrm{unit cell i}}}. $$

Then, the absolute value of the gradient is mapped into the probability space through the hyperbolic tangent (tanh) function. At this time, the value of the mapped gradient is in the [0,1] interval, which we call the normalized gradient-driven probability, denoted as G

(9)$${G_i} = \tanh (\alpha \cdot |{{{\textrm{d}FOM} / {\textrm{d}{p_i}}}} |),\textrm{ }{G_i} \in [{0,1} ], $$

where α is the learning rate, α > 0. The meaning of each G_i is the probability that the corresponding optimization parameter p_i is selected to be modified. And the process of selecting p_i can be describe as: for each p_i, obtaining a random number c_i from the uniform distribution ${c_i} \in U({0,1} )$, comparing its value with G_i, pick the corresponding p_i when ${c_i} < {G_i}$. Here, the random number ${c_i} \in U({0,1} )$ has the probability of G_i that ${c_i} < {G_i}$, thus ensuring that p_i is selected by the probability of G_i. Meanwhile, the direction of modifying p_i is determined according to the positive and negative of the homologous gradient component ${{\textrm{d}FOM} / {\textrm{d}{p_i}}}$.We control the number of modified elements in design parameter p not exceed the n × 4% by choosing the value of α. We iterate parameter p using this strategy until the FOM converges.

The strategy is implemented by in-house code. For each iteration, the electrical-magnetic simulation is performed by Lumerical 3D-FDTD solutions [41]. At each iteration, the optimization code communicates with FDTD solutions to get the electromagnetic fields in space, and provides the new structure to FDTD solutions.

3.2 Fast convergence characteristics

To show that our algorithm can efficiently improve the device performance, we compare the proposed GPDS method with the traditional BFS method. The two methods start from the same random generated structure. In the BFS method, for each unit cell, we simulating the three material types to determine how to modify the parameter. Figure 3(b) and (c) show the performance of the two algorithms in the optimization process, respectively. The blue line represents the FOM as a function of iteration numbers. The GPDS method has searched the results with a FOM = 0.933 within 150 iterations, while the BFS method requires more than 2500 iterations to converge on a result with the FOM = 0.931. Considering that the adjoint algorithm requires an additional electromagnetic simulation to obtain the gradient, the GDS algorithm runs less than 300 electromagnetic simulations in total. In comparison, the BFS method takes more than 2500 electromagnetic simulations. Typically, the electromagnetic simulation is the most time-consuming step in the optimization process of the photonics inverse design problem. Therefore, in this task, the time consumption of the GPDS algorithm is about only 12% of the BFS method. Figure 3(d) and (e) show the result obtained by the two methods, and note that the two layouts are somewhat different, indicating that the two algorithms find local optimal solutions. We show the corresponding electric field E_y component distributions respectively in Fig. 3(f) and (g), at 1550 nm wavelength. As a result, compared with the BFS algorithm, the proposed GPDS algorithm has significant advantages in terms of convergence speed.

Fig. 3. Randomly initialized device structure (a). FOM, as functions of iteration numbers for GPDS algorithm (b) and BFS method (c). The optimized layout of the TE₀ mode to TE₁ mode converter is obtained through the GPDS algorithm (d) and BFS Method (e). The electric field E_y distribution of the device obtained by the GPDS algorithm (f) and the BFS method (g).

Download Full Size | PDF

3.3 Stability to the initial states

Although the speed of convergence is an important evaluation criterion of an algorithm, the more vital criterion is the final performance in the photonic inverse design problem. For such problem, the result that the most algorithm converged is always a local optimum. The traditional gradient algorithms are deterministic. When the initial states and parameters are determined, the results are also determined. This feature makes them relatively more susceptible to the bad local optima, which impairs the final FOM performance. As aforementioned, the sampling process of the normalized gradient-driven probability in the GPDS algorithm avails the FOM jumping out of local extrema. To demonstrate this feature, for the identical problem, we perform the traditional gradient algorithm that only uses gradient information to modify independent variables in a continuous space without taking other regularization methods. In the application scenario of section 3.1, the optimization process can be regarded as finding the optimal equivalent refractive index of each unit cell in an optional discrete value set. Since the traditional gradient algorithm performs on the continuous variable, we allow continuous variation of permittivity in each unit cell while maintaining the same size of unit cells.

In Fig. 4, we show the optimization process of the traditional gradient algorithm in our design. Figure 4(a) shows the randomly generated initial structure. We let the design parameters p_i directly be the value of permittivity corresponding to each unit cell. The permittivity of each unit cell is continuous and belongs to an interval [ɛ_min,ɛ_max], so the value of p_i is continuous and belongs to the interval,

(10)$${p_i} \in [{\varepsilon _{\min }},{\varepsilon _{\max }}], $$

where, ɛ_min = 3 (corresponding to Type-I), ɛ_max= 12 (corresponding to Type-III). The gradient ${{\textrm{d}FOM} / {\textrm{d}{p_i}}}$ is calculated by Eqs. (7) and (8), and we use the Method of Moving Asymptotes (MMA) to modify the design parameters [42]. The algorithm’s convergence speed is roughly the same as the GPDS algorithm, and a structure with a local optimal FOM = 0.89 can be obtained within 100 iterations, which is shown in Fig. 4(b). To compare the convergence characteristics of the algorithm, we did not perform an additional discretization step on the result of the gradient ascent algorithm, which usually reduces the value of FOM.

Fig. 4. Schematic of the traditional gradient method for device design.

Download Full Size | PDF

In order to compare the performance of the GPDS algorithm with the traditional gradient algorithm, we randomly generated 150 initial structures to simulate the optimization with arbitrary starts. Then, for each initial state, we performed 100 iterations of the corresponding algorithm. We show the convergence process of the traditional gradient algorithm in Fig. 5(a), in which one can see that the algorithm converges at about 60 iterations. The result of the optimal FOM values of the two algorithms is shown in Fig. 5(b). The blue histogram represents the GPDS algorithm's optimal FOM values, and the orange histogram corresponds to the traditional gradient algorithm. Comparing the two histograms, the value range of the maximum FOMs obtained by the GPDS is more concentrated, and most of the values are above 0.88. In contrast, the values of FOM of the traditional gradient algorithm are relatively more dispersion, mainly in the range of 0.8 to 0.92, which indicates that the traditional gradient algorithm is affected by the initial structure and bad local optima. As a result, comparing with the traditional gradient method, the proposed GPDS algorithm has a better consistency for different random initial structures, which benefits from the sampling process. The comparison shows that the algorithm can alleviate the impairment of the final FOM caused by bad local optima and has better stability to the initial states. Note that this comparison for gradient ascent methods without regularization. In gradient methods with optimized architectures such as density-based methods [10,12–14] it is possible to achieve higher performance without falling into poor local extremes. (See Appendix II).

Fig. 5. (a) The three convergence processes of the traditional gradient algorithm. (b) Statistical histograms of the final FOM results of the two algorithms. The blue histogram is the FOM count of the proposed GDS algorithm, and the orange is the traditional gradient method.

Download Full Size | PDF

4. Device design using gradient-probability-driven discrete search algorithm

In this section, we give some commonly used device design using GPDS to show the capability of our method, the mode converters demonstrating the flexibility of algorithms and the mode de-multiplexer demonstrating the ability for solving multi-objective problems.

4.1 Mode converters

First, we give two examples of designing mode converters. Since TE₀ to TE₁ mode converter has been designed, here we demonstrate the designing of a TE₀ to TE₂ mode and a TE₁ to TE₂ mode converters, respectively. Thus, the conversion between any two of the three basic orthogonal modes (TE₀, TE₁, and TE₂) can be realized in the SOI platform. To show the algorithm's flexibility, we only change the width of the input and output waveguides and the corresponding mode of the FOM in Eq. (5) while using the same design area as the TE₀ to TE₁ mode converter.

Fig. 6. The design process and results of TE₀–TE₂ mode converter. (a) is the FOM as a function of the number of iterations. (b) is the optimized device layout. (c) The electric field E_y distribution corresponding to the device structure. (d) The simulated spectrum corresponding to the device structure.

Download Full Size | PDF

TE₀–TE₂ mode converter. For the TE₀ to TE₂ mode converter, the input waveguide is 0.5µm width, and the output waveguide is 1.4 µm width. The FOM, in this design, is the power-coupling efficiency expressed as Eq. (5), where the output mode is TE₂ mode. From a random starting point, the designed process and result of TE₀ to TE₂ mode conversion device is shown in Fig. 6. The current FOM and the maximum FOM, as functions of iteration numbers, are shown by the blue and red lines in Fig. 6(a), respectively. The FOM value reached 0.953 in about 80 iterations, corresponding to the insertion loss 0.29 dB. Figure 6(b) is the layout of the optimized device, and the footprint of the device is also 3 µm × 2 µm. The device is designed to be symmetric since the input and output modes are all symmetrical modes. In Fig. 6(c), we show the electric field E_y component distribution of the optimized device at the wavelength of 1550 nm. The simulated spectrum of the device is shown in Fig. 6(d), the blue line and red lines are the mode coupling efficiency and crosstalk in the 1525–1610 nm wavelength range, respectively. In the wavelength range, the maximum insertion loss of the device is 0.68 dB, and the crosstalk is less than -20 dB.

TE₁–TE₂ mode converter. Similarly, the optimization process and results of the mode conversion device from TE₁ mode to TE₂ mode are shown in Fig. 7. In this design, the input waveguide width is 1.0 µm, and the FOM is the power-coupling efficiency of the input TE₁ mode to the output TE₂ mode. Figure 7(a) indicate that the algorithm has converge to a structure with a FOM = 0.946 in less than 100 iterations. The corresponding layout is shown in Fig. 7(b). Identically, the device footprint is 3 µm × 2 µm. In Fig. 7(c), we show the E_y component distribution of the electric field corresponding to a wavelength of 1550 nm. Figure 7(d) is the simulated spectrum of the device. In the 1525–1610 nm wavelength range, the insertion loss of the device is 0.6 dB, shown by the blue line. The crosstalk of the device for TE₀ mode and TE₁ mode is less than -20 dB and -24 dB, respectively, which is represented by the yellow and red lines.

Fig. 7. The design process and results of TE₁–TE₂ mode converter. (a) is the FOM as a function of the number of iterations. (b) is the optimized device layout. (c) The electric field E_y distribution corresponding to the device structure. (d) The simulated spectrum corresponding to the device structure.

Download Full Size | PDF

In summary, we designed any two mode converters for the three basic orthogonal modes (TE₀, TE₁, and TE₂) by only changing the device's input and output waveguide width and the FOM of the optimization algorithm. As a result, we show the flexibility of the algorithm.

4.2 Mode division multiplexing devices

To demonstrate the algorithm’s ability of solving multi-objective problem, we designed a TE₀ and TE₁ mode de-multiplexer. The layout of the device is shown in Fig. 8(a). The footprint is 4.4 µm × 2.4 µm, the input is a 1 µm multimode waveguide, and the outputs are two 0.5 µm width single-mode waveguides with a gap of 1 µm. The design objective is to guide the TE₀ mode and TE₁ mode of the input port to the TE₀ mode of Output-1 and Output-2 ports, respectively. The power-coupling efficiency, described by Eq. (5), of the TE₀ mode input to the TE₀ mode of the Output-1 is FOM⁽¹⁾, and the mode coupling efficiency of the TE1 mode input to the TE₀ mode of the Output-2 is FOM⁽²⁾; and the total FOM^(total) = FOM⁽¹⁾ × FOM⁽²⁾. The optimization process is shown in Fig. 8(b). The iterations number is 250, and during the optimization process, the optimal value of FOM^(total) is 0.84. The corresponding FOM⁽¹⁾ and FOM⁽²⁾ values are 0.91 and 0.92, respectively. We show the simulation performance of the device in Fig. 8(c). The blue line and the purple line represent the insertion loss of the device when the TE₀ mode and the TE₁ mode are input, respectively. The insertion loss in the 1525–1610 nm wavelength range is no more than 0.9 dB and 1.0 dB, respectively. The crosstalks corresponding to the two inputs mode are respectively less than -20.4 dB and -19.7 dB, as indicated by the red and orange lines. Figure 8(d) and (e) respectively shows the electric field E_y distribution of the input TE₀ mode and TE₁ mode at the wavelength of 1550 nm. As a pivotal component of the mode division multiplexing (MDM) system, the TE₀ and TE₁ mode de-multiplexer have been widely studied [15,21,43]. Here, we compare several representative inverse-designed TE₀ and TE₁ mode de-multiplexers and our work in Table 1. Our algorithm has a better minimum feature size control.

Fig. 8. The design process and results of TE₀ and TE₁ mode demultiplexer. (a) The optimized device layout. (b) The FOM as a function of the number of iterations. (c) The simulated spectrum of the optimized device. (c) and (d) The electric field E_y distribution corresponding to the TE₀ and TE₁ mode input, respectively.

Download Full Size | PDF

Table 1. The comparison of optimizing TE₀ and TE₁ mode de-multiplexer

View Table

5. Conclusion

In conclusion, we propose a gradient-probability-driven discrete search (GPDS) algorithm suitable for the inverse design of on-chip photonic devices. The algorithm is based on the approach of the discrete search algorithm, which satisfies the material and size constraints throughout the optimization process, so the designed device can be directly implemented on-chip. What is more, the algorithm combines gradient and probabilistic methods to build a connection between continuous gradient and discrete permittivity. Therefore, it has the characteristics of rapid convergence and better stability to the initial state. These two characteristics have been illustrated by comparing the GPDS algorithm with the traditional BFS method and traditional gradient algorithm.

We illustrate several component designs which are commonly used in silicon photonics platforms, the mode converters design demonstrating the algorithm’s flexibility, and the mode de-multiplexer design showing the ability to solve multi-objective problems. The algorithm promises a powerful tool for the design of high-performance and manufacturing-friendly on-chip photonic devices. Moreover, the introduction of the probability method provides an enlightening idea for future algorithm design.

Appendix I: scalability of the algorithm for large-scale design parameters

Modern state-of-the-art gradient-based inverse design methods have been shown to be able to handle large amount design variables [44–46]. In order to illustrate that our algorithm can expand to a large number of independent variables, we demonstrate an example using the proposed GPDS algorithm to design a mode converter from TE₀ to TE₁ with a 20 µm× 20 µm footprint. The device contains 100 × 100 = 10000 unit cells, corresponding to 10000 design parameters p_i. Figure 9(a) is the optimized result of the device. The input and output waveguide widths are 4 µm and 10 µm, respectively. Figure 9(b) shows the changing of FOM during the optimization process, and one can see that the algorithm increases FOM from a low value (about 0.01) to FOM = 0.92 within 100 iterations. And the corresponding electric field E_y distribution is shown in Fig. 9(c).

Fig. 9. Schematic diagram of an example with 10,000 independent variables. (a) Optimized structure. (b) The FOM value during optimization. (c) The electric field Ey component distribution of the optimized structure.

Download Full Size | PDF

Appendix II: comparison between the proposed GPDS method and three-field topology optimization method

Here, we illustrate using the three-field topology optimization method [10,12–14] to design a same sized TE₀ and TE₁ mode converter to show the characteristics of continuous search and discrete search methods. In this method, we divide the design area into small squares of 10 nm × 10 nm. Each small square corresponds to a design field ρ_i, and there are 300 × 150 = 45000 independent variables. We use the three-field topology optimization scheme to map the design parameters to the actual relative permittivity [10,12]. The design field ρ is first filtered to produce a filtered field $\widetilde {\boldsymbol \rho }$ and then transformed into a physical field $\overline {\boldsymbol \rho } $ through the projection. The process can be expressed as,

(11a)$${\tilde{\rho }_i} = \frac{{\sum\nolimits_{j \in {{\rm N}_i}} {\omega ({{\mathbf x}_j}){\rho _j}} }}{{\sum\nolimits_{j \in {{\rm N}_i}} {\omega ({{\mathbf x}_j})} }},\textrm{ }\omega ({{\mathbf x}_j}) = R - |{{{\mathbf x}_i} - {{\mathbf x}_j}} |, $$

(11b)$${\bar{\rho }_i} = \frac{{\tanh (\beta \cdot \eta ) + \tanh (\beta \cdot ({{\tilde{\rho }}_i} - \eta ))}}{{\tanh (\beta \cdot \eta ) + \tanh (\beta \cdot (1 - \eta ))}}, $$

(11c)$${\varepsilon _i} = {\varepsilon _{\min }} + ({\varepsilon _{\max }} - {\varepsilon _{\min }}){\bar{\rho }_i}, $$

where ω is the two-dimensional linear hat-shape filter, and the radius R is set to 150 nm to achieve the same 80 nm minimum feature size constraint as the GPDS method. The parameter η is the threshold, and β controls the steepness of the approximated Heaviside function. During the optimization process, we set η = 0.5 and gradually increase the value of β from 1 to 128. The gradient update algorithm is MMA. We show the optimization process starting from a random initial state in Fig. 10.

Fig. 10. Density-based gradient method. (a) The corresponding device structure when iteration=0, 117, 200 and 300. (b) FOM as a function of iteration numbers in the optimization process.

Download Full Size | PDF

Benefit from topology optimization’s high degree of freedom of shape and structure. The algorithm can reach a very high FOM value in the optimization process, about 0.96, when the iteration = 117 and β = 7.9, as shown in Fig. 10(a). However, due to the result contains many gray-scale areas, it cannot be implemented in manufacturing directly. And as shown in Fig. 10(b), the binarization process by gradually increasing the value of β sacrifices the FOM value. The area of the gray matter gradually decreases as the value of β gradually increases, and finally, when β = 128, FOM = 0.82. And it is worth noticing that many tools to deal with the gray-scale exist have been developed in recent years, which can reduce this performance loss significantly and promote that the algorithm converges to 0,1 values with high performance [14,47,48]. In addition, one may also improve the final result of this algorithm through a better hyper-parameter settings and a more physical initial state. Compared with the discrete method after continuous optimization, the proposed GPDS algorithm is based on discrete independent variables. Although sacrificing some design freedom, as long as the designer ensures that any unit cell type corresponding to the discrete design parameters can be manufactured, all structures obtained in the entire optimization process satisfy the manufacturing constraint.

Funding

National Key Research and Development Program of China (2019YFB2203602); National Science Fund for Distinguished Young Scholars (61825504); National Natural Science Foundation of China (61975198); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019116); Beijing Science and Technology Planning Project (Z191100004819007).

Acknowledgments

We acknowledge Dr. Jiwang Peng for useful discussion.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. B. Shen, P. Wang, R. Polson, and R. Menon, “An integrated-nanophotonics polarization beamsplitter with 2.4 × 2.4 µm² footprint,” Nat. Photonics 9(6), 378–382 (2015). [CrossRef]

2. A. Y. Piggott, J. Lu, K. G. Lagoudakis, J. Petykiewicz, T. M. Babinec, and J. Vučković, “Inverse design and demonstration of a compact and broadband on-chip wavelength demultiplexer,” Nat. Photonics 9(6), 374–377 (2015). [CrossRef]

3. J. Lu and J. Vučković, “Objective-first design of high-efficiency, small-footprint couplers between arbitrary nanophotonic waveguide modes,” Opt. Express 20(7), 7221–7236 (2012). [CrossRef]

4. S. Molesky, Z. Lin, A. Y. Piggott, W. Jin, J. Vuckovic, and A. W. Rodriguez, “Inverse design in nanophotonics,” Nat. Photonics 12(11), 659–670 (2018). [CrossRef]

5. S. Boutami and S. Fan, “Efficient pixel-by-pixel optimization of photonic devices utilizing the Dyson’s equation in a Green’s function formalism: Part I. Implementation with the method of discrete dipole approximation,” J. Opt. Soc. Am. B 36(9), 2378–2386 (2019). [CrossRef]

6. S. Boutami and S. Fan, “Efficient pixel-by-pixel optimization of photonic devices utilizing the Dyson’s equation in a Green’s function formalism: Part II. Implementation using standard electromagnetic solvers,” J. Opt. Soc. Am. B 36(9), 2387–2394 (2019). [CrossRef]

7. D. J. Lockwood and L. Pavesi, “Silicon Fundamentals for Photonics Applications,” Silicon Photonics Top. Appl. Phys. 94, 1–50 (2004). [CrossRef]

8. G. Zhang and O. Liboiron-Ladouceur, “Scalable and Low Crosstalk Silicon Mode Exchanger for Mode Division Multiplexing System Enabled by Inverse Design,” IEEE Photonics J. 13(2), 1–13 (2021). [CrossRef]

9. J. Huang, J. Yang, D. Chen, W. Bai, J. Han, Z. Zhang, J. Zhang, X. He, Y. Han, and L. Liang, “Implementation of on-chip multi-channel focusing wavelength demultiplexer with regularized digital metamaterials,” Nanophotonics 9(1), 159–166 (2019). [CrossRef]

10. M. Zhou, B. S. Lazarov, F. Wang, and O. Sigmund, “Minimum length scale in topology optimization by geometric constraints,” Comput. Meth. Appl. Mech. Eng. 293, 266–282 (2015). [CrossRef]

11. L. Su, D. Vercruysse, J. Skarda, N. V. Sapra, J. A. Petykiewicz, and J. Vučković, “Nanophotonic inverse design with SPINS: Software architecture and practical considerations,” Appl. Phys. Rev. 7(1), 011407 (2020). [CrossRef]

12. F. Wang, R. E. Christiansen, Y. Yu, J. Mørk, and O. Sigmund, “Maximizing the quality factor to mode volume ratio for ultra-small photonic crystal cavities,” Appl. Phys. Lett. 113(24), 241101 (2018). [CrossRef]

13. G. Zhang, D. X. Xu, Y. Grinberg, and O. Liboiron-Ladouceur, “Topological inverse design of nanophotonic devices with energy constraint,” Opt. Express 29(8), 12681–12695 (2021). [CrossRef]

14. R. E. Christiansen and O. Sigmund, “Inverse design in photonics by topology optimization: tutorial,” J. Opt. Soc. Am. B 38(2), 496–509 (2021). [CrossRef]

15. K. Y. Wang, X. S. Ren, W. J. Chang, L. H. Lu, D. M. Liu, and M. M. Zhang, “Inverse design of digital nanophotonic devices using the adjoint method,” Photonics Res. 8(4), 528–533 (2020). [CrossRef]

16. D. Vercruysse, N. V. Sapra, L. Su, R. Trivedi, and J. Vučković, “Analytical level set fabrication constraints for inverse design,” Sci. Rep. 9(1), 8999 (2019). [CrossRef]

17. Y. Liu, K. Xu, S. Wang, W. Shen, H. Xie, Y. Wang, S. Xiao, Y. Yao, J. Du, Z. He, and Q. Song, “Arbitrarily routed mode-division multiplexed photonic circuits for dense integration,” Nat. Commun. 10(1), 3263 (2019). [CrossRef]

18. A. Majumder, B. Shen, R. Polson, and R. Menon, “Ultra-compact polarization rotation in integrated silicon photonics using digital metamaterials,” Opt. Express 25(17), 19721–19731 (2017). [CrossRef]

19. J. Guo, C. Ye, C. Liu, M. Zhang, C. Li, J. Li, Y. Shi, and D. Dai, “Ultra-Compact and Ultra-Broadband Guided-Mode Exchangers on Silicon,” Laser Photonics Rev. 14(7), 2000058 (2020). [CrossRef]

20. W. Chang, X. Ren, Y. Ao, L. Lu, M. Cheng, L. Deng, D. Liu, and M. Zhang, “Inverse design and demonstration of an ultracompact broadband dual-mode 3 dB power splitter,” Opt. Express 26(18), 24135–24144 (2018). [CrossRef]

21. W. Chang, L. Lu, X. Ren, D. Li, Z. Pan, M. Cheng, D. Liu, and M. Zhang, “Ultra-compact mode (de) multiplexer based on subwavelength asymmetric Y-junction,” Opt. Express 26(7), 8162–8170 (2018). [CrossRef]

22. Z. Yu, A. Feng, X. Xi, and X. Sun, “Inverse-designed low-loss and wideband polarization-insensitive silicon waveguide crossing,” Opt. Lett. 44(1), 77–80 (2019). [CrossRef]

23. M. Djavid, S. A. Mirtaheri, and M. S. Abrishamian, “Photonic crystal notch-filter design using particle swarm optimization theory and finite-difference time-domain analysis,” J. Opt. Soc. Am. B 26(4), 849–853 (2009). [CrossRef]

24. J. C. Mak, C. Sideris, J. Jeong, A. Hajimiri, and J. K. Poon, “Binary particle swarm optimized 2 × 2 power splitters in a standard foundry silicon photonic platform,” Opt. Lett. 41(16), 3868–3871 (2016). [CrossRef]

25. Z. Liu, X. Liu, Z. Xiao, C. Lu, H.-Q. Wang, Y. Wu, X. Hu, Y.-C. Liu, H. Zhang, and X. Zhang, “Integrated nanophotonic wavelength router based on an intelligent algorithm,” Optica 6(10), 1367–1373 (2019). [CrossRef]

26. C. Lu, Z. Liu, Y. Wu, Z. Xiao, D. Yu, H. Zhang, C. Wang, X. Hu, Y. C. Liu, X. Liu, and X. Zhang, “Nanophotonic Polarization Routers Based on an Intelligent Algorithm,” Adv. Opt. Mater. 8(10), 1902018 (2020). [CrossRef]

27. Y. Jiao, S. Fan, and D. A. B. Miller, “Demonstration of systematic photonic crystal device design and optimization by low-rank adjustments: an extremely compact mode separator,” Opt. Lett. 30(2), 141–143 (2005). [CrossRef]

28. A. Y. Piggott, J. Petykiewicz, L. Su, and J. Vučković, “Fabrication-constrained nanophotonic inverse design,” Sci. Rep. 7(1), 1786 (2017). [CrossRef]

29. C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, “Adjoint shape optimization applied to electromagnetic design,” Opt. Express 21(18), 21693–21701 (2013). [CrossRef]

30. S. Osher and J. A. Sethian, “Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations,” J. Comput. Phys. 79(1), 12–49 (1988). [CrossRef]

31. P. I. Borel, A. Harpøth, L. H. Frandsen, M. Kristensen, P. Shi, J. S. Jensen, and O. Sigmund, “Topology optimization and fabrication of photonic crystal structures,” Opt. Express 12(9), 1996–2001 (2004). [CrossRef]

32. M. P. Bendsøe and O. Sigmund, Topology Optimization Theory, Methods and Applications (Springer, 2003).

33. N. K. Georgieva, S. Glavic, M. H. Bakr, and J. W. Bandler, “Feasible adjoint sensitivity technique for EM design optimization,” IEEE Trans. Microw. Theory Tech. 50(12), 2751–2758 (2002). [CrossRef]

34. A. Y. Piggott, “Automated design of photonic devices,” Ph.D. Thesis, Stanford University, (2018).

35. X. Huang and Y.-M. Xie, “A further review of ESO type methods for topology optimization,” Struct. Multidiscip. Optim. 41(5), 671–683 (2010). [CrossRef]

36. H. X. Chen, H. Jia, T. Wang, and J. H. Yang, “A Gradient-Oriented Binary Search Method for Photonic Device Design,” J. Lightwave Technol. 39(8), 2407–2412 (2021). [CrossRef]

37. E. N. Economou, Green’s Functions in Quantum Physics, 2nd ed. (Springer-Verlag, 1990).

38. S. Boutami, K. Hassan, C. Dupré, L. Baud, and S. Fan, “Experimental demonstration of silicon photonic devices optimized by a flexible and deterministic pixel-by-pixel technique,” Appl. Phys. Lett. 117(7), 071104 (2020). [CrossRef]

39. T. W. Hughes, M. Minkov, Y. Shi, and S. H. Fan, “Training of photonic neural networks through in situ backpropagation and gradient measurement,” Optica 5(7), 864–871 (2018). [CrossRef]

40. Y. Kane, “Numerical solution of initial boundary value problems involving maxwell's equations in isotropic media,” IEEE Trans. Antennas Propag. 14(3), 302–307 (1966). [CrossRef]

41. Lumerical FDTD Solutions, https://www.lumerical.com.

42. K. Svanberg, “The method of moving asymptotes—a new method for structural optimization,” Int. J. Num. Methods Eng. 24(2), 359–373 (1987). [CrossRef]

43. L. F. Frellsen, Y. Ding, O. Sigmund, and L. H. Frandsen, “Topology optimized mode multiplexing in silicon-on-insulator photonic wire waveguides,” Opt. Express 24(15), 16866–16873 (2016). [CrossRef]

44. F. Wang, B. S. Lazarov, and O. Sigmund, “On projection methods, convergence and robust formulations in topology optimization,” Struct. Multidiscip. Optim. 43(6), 767–784 (2011). [CrossRef]

45. N. Aage, E. Andreassen, B. S. Lazarov, and O. Sigmund, “Giga-voxel computational morphogenesis for structural design,” Nature 550(7674), 84–86 (2017). [CrossRef]

46. R. Pestourie, C. Perez-Arancibia, Z. Lin, W. Shin, F. Capasso, and S. G. Johnson, “Inverse design of large-area metasurfaces,” Opt. Express 26(26), 33732–33747 (2018). [CrossRef]

47. A. M. Hammond, A. Oskooi, S. G. Johnson, and S. E. Ralph, “Photonic topology optimization with semiconductor-foundry design-rule constraints,” Opt. Express 29(15), 23916–23938 (2021). [CrossRef]

48. J. S. Jensen and O. Sigmund, “Topology optimization of photonic crystal structures: a high-bandwidth low-loss T-junction waveguide,” JOSA B 22(6), 1191–1198 (2005). [CrossRef]

Method	Optimize Time	Footprints (µm²)	Minimum feature size	IL	Cross talk
Digitized Adjoint Method [15]	7 h	3 × 2.4	35 nm	average 1.36 dB	< -20dB
Digitized Adjoint Method [15]	E5-2637	3 × 2.4	35 nm	(1530∼1570 nm)	< -20dB
DBS [21]	36 h	3 × 2.4	45 nm	< 0.47 dB	< -24dB
DBS [21]	E5-2637	3 × 2.4	45 nm	(1530∼1590 nm)	< -24dB
Topology optimization [43]	8 h	4 × 2.4	32 nm	< 1 dB	< -18dB
Topology optimization [43]	E5-2665	4 × 2.4	32 nm	(1450∼1700 nm)	< -18dB
Proposed GPDS (this work)	4 h	4.4 × 2.4	80 nm	< 1 dB	< -19.7dB
Proposed GPDS (this work)	E5-2697	4.4 × 2.4	80 nm	(1525∼1610 nm)	< -19.7dB

Gradient-probability-driven discrete search algorithm for on-chip photonics inverse design

Abstract

1. Introduction

2. Gradient-probability-driven discrete search algorithm

3. Algorithm performance analysis

3.1 Implementation environment of the algorithm in silicon photonics

3.2 Fast convergence characteristics

3.3 Stability to the initial states

4. Device design using gradient-probability-driven discrete search algorithm

4.1 Mode converters

4.2 Mode division multiplexing devices

5. Conclusion

Appendix I: scalability of the algorithm for large-scale design parameters

Appendix II: comparison between the proposed GPDS method and three-field topology optimization method

Funding

Acknowledgments

Disclosures

Data availability

References

Data availability

Cited By

Figures (10)

Tables (1)

Equations (15)

Optics Express