## Abstract

This paper presents the implementation of a parallelized Finite-Difference Time-Domain method, based on the Message Passing Interface (i.e. MPI), which is used to study the modal properties of three-dimensional (3D) dielectric waveguide structures. To this end, we also use the least-square method to obtain the wave vector, *β*, along the axis of propagation. Lastly, bending losses in arbitrary-angle waveguides are also discussed.

©2004 Optical Society of America

## 1. Introduction

In recent years, high index contrast dielectric waveguides have become of great interest in the field of integrated optics. In this way, planar optical waveguides are often the key devices in constructing integrated optical circuits and semiconductor lasers. Accordingly, the design of optimized integrated optical components requires a detailed understanding of the various electromagnetic propagation characteristics of the structures that define the devices. Unfortunately, the analytic determination of all of the necessary properties of such waveguides is not always viable. As a result, numerical techniques must be used.

To this end, the Finite-Difference Time-Domain method has been one of the more popular methods in the analysis of electromagnetic problems, due primarily to its simplicity in implementation and memory efficiency. Compared with thoroughly investigated two-dimensional (2D) analysis methods for dielectric waveguides, their counterparts in three dimensions are less developed due to their relatively extensive computational requirements. An exception to this is when *E*⃑(*x, y, z*)=*E*⃑(*y, z*)*e*
^{-jkx} (where *x* is the propagation direction) 3D waveguides, can be reduced to 2D using the compact FDTD method [1,2]; otherwise, the compact FDTD method is not valid and a full 3D method must be used. In this case, an optical waveguide will typically span a few wavelengths in the transverse direction, but along its length can span dozens of wavelengths and in some cases even thousands of wavelengths. Naturally, the longer the waveguide the more computational effort is required in its analysis. This typically demands high speeds from the CPU and significant amounts of memory. Such that the increase in computational cost can quickly exceed the capability of a single computer. For this reason, in this paper we implement a parallel algorithm, based on the Message passing Interface (MPI), to overcome the computational difficulties imposed by 3D waveguides.

Therefore, the long length of the waveguide, along the propagation direction, is an ideal candidate for parallel processing and vectorization using domain-decomposition techniques. Consequently, the division of the computational region into several sub-regions can be made along the propagation direction and, accordingly, each sub-region can be assigned to a different processor. In the implementation of such an approach, MPI is used to realize the interface between different processors.

Besides speedup, the parallelized FDTD algorithm also provides a very powerful tool for analyzing arbitrary shaped 3D waveguide structures especially those that may contain abrupt discontinuities in the propagation direction [13].

Therefore, in the context of this study we briefly present the introduction of the FDTD method, with PML absorbing boundary conditions, in Section 2. We then present its parallelization using MPI, in Section 3. Next, we present an optimization method based on a least–square non-linear algorithm that is used to determine the wave vector in 3D straight waveguides. In section 4, we present two examples: the determination of propagation wave vector, *β*, in straight waveguides and the study of losses in waveguide bends.

## 2. Brief discussion of the FDTD method and absorbing boundary conditions

Since the initial work of Yee [3], the FDTD method has become one of the more widely used numerical techniques for solving electromagnetic boundary value problems. As such, the FDTD uses a central finite-difference representation for the space and time derivatives in Maxwell’s equations, which is second order accurate in both time and space.

Due to realistic limitations in memory and computational time, it is necessary to truncate the computational domain. For this reason, an absorbing boundary condition is introduced at the outer region of the boundary to truncate the computational region and to absorb all outward traveling electromagnetic waves. In this paper, we use the PML absorbing boundary condition [5], shown in Fig. 1(a), which typically has reflections on the order of 10^{-6} to 10^{-8}. For brevity, further details concerning the FDTD method and the PML absorbing boundary conditions are referred to in [6]. As indicated in Fig. 1(a), to terminate the infinite extension of straight waveguides, proper PML parameters are chosen in the inhomogeneous media region to maintain a stable and accurate formulation. Also, as the FDTD method is a solution to an initial-value problem, it requires an excitation scheme to start the computation. In this study, the guided wave source is excited near the left-end boundary of waveguides, see Fig. 1 (a), which was obtained by the Marcatili method [7]. In the following section, we present an overview of our implementation of the FDTD method using MPI.

## 3. Parallel computation using MPI

The analysis of dielectric guides in 3D is a very demanding computationally, due to the large computational region. In many simulations, the increase in computational cost exceeds the capability of a single computer. For example, to calculate the steady state solution of a silicon waveguide structure whose index of refraction is *n*=3.5 and dimensions of 0.2286*λ*×0.2286*λ*×20*λ*, with *λ*=1*µm*, and a sampling rate of *λ*/56, it would take over 7 days using a single computer (assuming a 2 GHz CPU and 1 GB RAM). The high index of refraction of the silicon requires the higher sampling rate (i.e. 56) in the FDTD computation to minimize the dispersion error.

For this reason, parallelization methods, such as Parallel Virtual Machine (i.e., PVM) [15], High Performance Fortran (i.e., HPF), have been applied to the FDTD algorithm. We performed the analyses in this paper using MPI, because MPI is one of the most popular standards for parallel programming. Succinctly, the principal idea of parallel computation is that one can divide the computational region and memory requirements into subsets, which are all computed simultaneously on separate computers. In such an approach, one can reduce the computational time by a factor of *N*, where *N* is the number of subsets. In this paper, the computational region is divided into several parts along the propagation direction of the waveguide, each of which is assigned to a different processor, see Fig. 2. As such, the computational aspect of the analysis on each processor is the same and only the grid values are different. However, to calculate the field along the boundary cells, one needs to know the field values in the adjacent cells, which belong to the neighboring subset. This is the result of the spatial central finite-difference expression, where the value at one point is not only dependent on the information at this point, but also the information at the nearest four adjacent points. As such, the new electric field component (at each discrete point), such as *E _{z}*, is a function of the electric field component at the point from the previous time step and the magnetic field components surrounding the point from the current time step. As a result, the values along the border points in each processor’s computational region are dependent on the values at the points in the neighboring processor’s region. To address this, MPI [8] is used to realize the communication between neighboring processors. The details of its implementation can be found in [8,14]. In the next section, we present our numerical results obtained by using the parallelized FDTD method described above.

The platform of running MPI programs on the parallelized FDTD method is called a Beowulf cluster, which is a small supercomputer built by 10 off-the-shelf personal computers in our lab [14]. Each computer has 2GHz CPU and 1GB RAM and they are interconnected with Ethernet. The architecture of the Beowulf cluster is at two levels: at the upper level, there is one master node and at the lower level, there are nine slave nodes. The master node is the “brain” of the system, which interacts with the outside world/users, distributes the tasks to the slave nodes, and controls the whole system. Slave nodes are dumb: they are manipulated by the master node and do only what they are ordered to do.

## 4. Numerical results and discussion

In this session, the parallelized FDTD algorithm was applied to the two examples. The first case is the determination of the wave vector, *β*, in 3D straight waveguides. A non-linear mean-square fitting algorithm is used to obtain the wave vector in both single-mode and multi-mode waveguides, combined with the parallelized FDTD algorithm. The second case is the study of losses in 3D waveguide bends, shown in Fig. 1(b), as a function of the bend angle and the length of the diagonal waveguide region. Lastly, this section also presents the analysis of an optimal curved bend that is shown to overcome the losses encountered in sharp bends.

#### 4.1 Calculation of wave vector -- β for straight waveguides

A common method to determine the wave vector *β* is to apply the discrete Fourier transform on the electric, or magnetic, field component values along the propagation direction. However, this requires relatively large sampling rates, *N*, to ensure that *dβ*, which is equal to $\frac{2\pi}{\mathit{Ndx}}$, is small enough to obtain a precise value for *β*. For this reason, we use a non-linear mean-square fitting algorithm [9] to obtain the wave vector. As such, the field components, such as *E _{z}*, are assumed to fit into a theoretical modal expansion as indicated in Eq. (1):

where *E _{M}*(

*z*) is the modal function,

_{i}, a_{k}*A*is the value of mode amplitudes,

_{k}*β*is the propagation constants, {

_{k}*z*} are the values of the sampling points along the propagation direction, and

_{i}*M*is the number of modes.

By using the Levenberg-Marquardt method [10], the propagation constants, *β _{k}*, of each mode satisfies:

where *E _{z}*(

*z*)(

_{i}*i*=1,..,

*n*) are the values of the

*E*component, which are sampled along the propagation direction.

_{z}As most optimization methods are dependent on the initial value of the unknown variables, so is the determination of *β _{k}* in Eq. (1). In our implementation,

*β*is obtained by applying the FFT on the electric or magnetic field component along the propagation direction. In this case, the longer the waveguide, the more accurately one can determine the initial values of the wave vector.

_{initial}To design a single-mode dielectric waveguide containing an aspect ratio of 1 in the core region, we used the point matching method [11] through which the maximum thickness for such a waveguide is 0.2027*λ*, which is necessary to ensure single mode operation [12]. In this paper, the problems under test had the waveguide dimensions in the transverse direction of 0.1789*λ*×0.1789*λ*(*λ*=1*µm*). In this case, the propagation constant is equal to 9.324×10^{6}, which was obtained by Marcatili’s method [7]. Using the parallelized FDTD method, we numerically analyzed this structure, which had a waveguide length of 10*λ*. The results are compared in Table 1, which represent the theoretical result obtained from the Marcatili method, the FFT result, and the optimized result using our least-square method (parallelized FDTD). The FFT result is obtained by applying a discrete Fast Fourier transform to the sampling points along propagation direction. From Table 1, we can see that the result using the FDTD method in combination with the least-square method is very close to the theoretical result (Marcatili method).

We also applied this method to a multi-mode rectangular waveguide with dimensions of 0.257*λ*×0.257*λ*(*λ*=1*µm*). For this structure, there exist two modes with theoretical propagation constants of 1.63201×10^{7} and 7.2658×10^{6}. This waveguide was analyzed for a length of 10*λ* using our MPI FDTD method. Table 2 shows the comparison between the theoretical result, again obtained using the Marcatili method, the FFT result, and the optimized result from the least-square method. As shown, the FDTD method in combination with the least-square method provides very accurate results. However, in the case of *β _{1}* we can see that the FFT method was more accurate. We attribute this to the selection of initial values of the mode amplitudes

*A*.

_{k}In the study of the wave vector, we used a dipole as a hard source in the FDTD method. This is represented by:

where *a⃑*=(*x*
_{0}, *y*
_{0}, *z _{i}*),

*z*∈[

_{i}*z*

_{1},

*z*

_{2}].

Figure 3 shows the absolute speedup curve, which is determined using the Beowulf cluster. In terms of the parallel implementation, the absolute speedup is a measure of performance enhancement between a multiprocessor system (using MPI in this case) and a single processor system. The expression for determining this is:

For the example about the steady state of a silicon waveguide mentioned in Section 3, it only takes 18.8 hours to finish the simulation using nine processors in the cluster.

#### 4.2 Study of waveguide bending

Waveguide bends are widely used in integrated optical circuits and in this section we present an evaluation of the loss properties of arbitrary-angle waveguide bends. Figure 1(a) shows an illustration of the top-down view of a waveguide bend and the parameters of our study, namely the length of the diagonal waveguide (*L*) and its angle with respect to the input and output direction of propagation (*θ*). For each of the structures analyzed, the transverse dimensions of the waveguides are 0.1789 *λ*×0.1789*λ*(*λ*=1*µm*) and a core with a refractive index of *n*=3.5 surrounded by air cladding. Using the 10-node Beowulf cluster, we studied the loss, which is computed as the fraction of input power coupled into the fundamental mode of output waveguide, with respect to the variation of L and *θ*. The calculations of power and loss require reference planes, wherein we used one at the input waveguide and one at the output waveguide. The Poynting vector gives the power at each point, thus, the power in the *x*-direction at the entire reference plane is:

In our simulations, the wave propagates predominately in the x-direction, with the exception of radiation loss. In this case, the loss is defined by:

where the power *P _{out}* is measured in the reference plane at the output of the waveguide, shown in Fig. 1(a), and

*P*is measured in the reference plane at the input of the waveguide. To accurately determine

_{in}*P*, a straight waveguide with same dimensions is considered. The same source is used to excite the waveguide modes at the left section. Using this method,

_{in}*P*can be measured in a reference plane located near the extreme right boundary of the straight waveguide.

_{in}In our study, we first fixed *θ*=30°, and varied *L* according to *mλ*, where *m* varies from 0.5 to 10. From Fig. 4 we can see that the loss initially fluctuates with respect to *L* and tends to become flat while *L* is large. The smaller the length of *L*, the more electromagnetic coupling between the two bending corners. In this case, a resonance may occur and the propagation loss will vary with *L*. From the figure, we also can see that there are three resonance occurring at points, where the length of *L* is around 0, 2.3*λ* and 4.6*λ* respectively. When *L* increases, the two bends can become decoupled in which case the propagation loss can be approximated by considering two separate bends. However, there may exist some optimal length at which relatively low loss can be achieved. In this study, we have found such an optimal length, which is *L*=2.5*λ*.

In the second aspect of this study, we fixed the diagonal waveguide length *L*=4*λ*, and evaluated the propagation loss while *θ* changed from 15° to 40°. Figure 5 shows the propagation loss versus bending angle. From the figure, we can see that in this case the loss increases significantly as the angle increases. It agrees with the trend of coupling losing analyzed in a 2-D optical waveguide bends [17], where the depth of the groove increases, equivalent to the case of the increasing bending angle.

From the above results, we observe that the sharp corners have considerable loss in the context of optical integrated circuits, which may be overcome by introducing curved bends. To do so, we considered a structure with L and *θ* equal to 4*λ* and 40°, respectively. The sharp bends were replaced by curved bends with 1.43*µm* radius of curvature, to soften the impact of discontinuity. The two-dimensional *H _{z}* field component in the middle plane is shown in Fig. 6. The propagation loss is equal to -0.96dB, which is 59% lower than the loss in the case of the sharp bends, shown in Fig. 5. Figure 7 shows two movies that recorded the wave propagation of

*H*component through the sharp-bend waveguide (Fig. 7(a)) and curve-bend waveguide (Fig. 7(b)) respectively. From Fig. 7(a), we found there was scattering loss around two sharp bends and strong reflection in the input waveguide. However, from Fig. 7(b), the back reflection in the input waveguide significantly reduced and it has smaller scattering loss in the curved bands than in the sharp bends.

_{z}## 5. Conclusion

This paper analyzes the performance of 3D dielectric waveguide devices using a parallelized FDTD method. In particular, we calculated the wave vector, *β*, in straight waveguides, using a combination of the parallelized FDTD method and a least-square algorithm, which was shown to be efficient and accurate. In addition, we also performed a study of waveguide bends wherein we presented propagation loss as it is related to the length and angle of a diagonal connecting waveguide. Furthermore, the study shows that by substituting curved bends with small radius for sharp bends one can decrease the loss greatly. These results help provide a better understanding of the different factors involved in designing and optimizing waveguide bends for high dielectric contrast optical waveguides as well as their analysis using a parallelized FDTD method.

## Reference and links

**1. **T. Tamir, *Integrated Optics* (Berlin: Springer-Verlag, 1975), *Chap.2.*

**2. **A. Asi and L. Shafai, “Dispersion analysis of anisotropic inhomogeneous waveguides using compact 2D-FDTD,” *Electron. Lett*. , **vol. 28**, pp. 1451–1452 (1992). [CrossRef]

**3. **K.S. Yee, “Numerical solution of initial boundary value problems involving Maxwell’s equations in isotropic media,” IEEE Trans. Antennas Propag. **14**, 302 (1966). [CrossRef]

**4. **G. Mur, “Absrobing boundary conditions for the finite-difference approximation of the time-domain electromagnetic-filed equations,” IEEE Trans Electromagn. Compat. **EMC-23**, pp.377–382 (1981). [CrossRef]

**5. **J.P. Berenger, “A perfectly matched layer for the absorption of electromagnetic waves,” J. Comp. Physics **114**, 185–200 (1994). [CrossRef]

**6. **A. Taflove and S.C. Hagness, *Computational Electromagnetics: The Finite-Difference Time Domain Method*, 2nd ed. (Artech House, Norwood MA, 2000).

**7. **E.A.J. Marcatili, “Dielectric rectangular waveguide and directional coupler for integrated optics,” Bell Syst. Tech. J. **48**, 2071–2102 (1969).

**8. **Willam Gropp, Ewing Lusk, and Anthony Skjellum, *Using MPI: Portable Parallel Programming with the Message-Passing Interface* (The MIT Press, Cambridge MA, 1994).

**9. **M.A. Hernandez-Lopez and M. Quintillan, “Propagation characteristics of modes in some rectangular waveguides using the finite-difference time-domain method,” J. Electromagnetic Waves and Applications **14**,1707–1722 (2000). [CrossRef]

**10. **W.H. Press, S.A. Teulcoolsky, W.T. Vetterling, and B.P. Flannery, *Numerical Recipes in FORTRAN* (Cambridge University Press, 1992).

**11. **J.E. Goell, “A circular harmonic computer analysis of rectangular dielectric waveguides,” Bell Syst. Tech. J. **48**, 2133–2160 (1969).

**12. **A. Kumar, K. Thyagarajan, and A.K. Ghatak, “Analysis of rectangular -core dielectric waveguides - A accurate perturbation approach,” Opt. Lett. **8**, 63–65 (1983). [CrossRef] [PubMed]

**13. **J.H. Greene and A. Taflove, “Initial three-dimensional finite-difference time-domain phenomenology study of the transient response of a large vertically coupled photonic racetrack,” Opt. Lett. **28**, 1733–1735 (2003). [CrossRef] [PubMed]

**14. **Ao Jiang and Finite “Difference Time Domain Method on Paralle Architecture using Message Passing Interface,” M.S. Thesis, Dept. of Ece, University of Delaware (2003).

**15. **V. Varadarajan and R. Mittra, “Finite-Difference Time Domain (FDTD), Analysis Using Distributed Computing,” IEEE Microwave and Guided Wave Letters , **4**, 144–145 (1994). [CrossRef]

**16. **A.M. Liu, A.S. Mohan, T. A. Aubrey, and W.R. Belcher, “Techniques for Implementation of FDTD Method on CM-5 Parallel Computer,” IEEE Antennas and Propagation Magazine **37**, 64–71 (1995). [CrossRef]

**17. **S.M. Lee, W.C. Chew, S.L. Chuang, and J.J. Coleman, “Bending loss in optical waveguides for nonplanar laser array applications,” J. Appl. Phys. **71**, 2513–2520 (1992). [CrossRef]