Ultrafast farfield simulation of non-paraxial computer generated holograms

Sebastian Linss; Dirk Michaelis; Uwe D. Zeitner; Uwe D. Zeitner

doi:10.1364/OE.453731

1. Introduction

With computer generated holograms (CGHs) almost arbitrary intensity distributions can be created from incident light, e.g. for beam shaping or structured illumination. For this purpose, a specific diffractive structure is created on a substrate. To achieve a high efficiency, the usage of phase-only elements is preferable over amplitude elements due to the higher transmission. An iterative Fourier transform algorithm (IFTA), e.g., the Gerchberg-Saxton algorithm [1], is typically used to determine a phase profile into which the incoming light has to be modified to generate a specific intensity pattern in the farfield. There are various ways to experimentally realize the profile, e.g., effective medium structures [2], geometric phase [3], or resonant Huygens nanoantennas [4]. We concentrate here on effective medium holograms: They consist of specific binary structures with sub-wavelength dimensions, so that their exact shape cannot be resolved by the light, so it propagates according to an effective index instead. A typical example is an array of rectangular pillars with varying side lengths [5].

For the realization of a CGH the phase value for each position on the element has to be assigned to the structure parameters at the respective position, e.g., the pillar dimensions. Rigorous methods like the finite-difference time-domain method (FDTD) or rigorous coupled wave analysis (RCWA) can be used to determine the transfer function of a certain structure. Usually periodic boundary conditions are used to take the influence of the vicinity into account. The element is then designed supposing that the transfer function of a respective pixel in the CGH is the same as in periodic arrangement, which is called local periodic approximation (LPA) [6]. This approach leads to useful designs for small deflection angles. At higher angles more frequent lateral variations of the structures are necessary and the pixel transfer functions become dependent on the surrounding [7]. This typically reduces the performance, as we can be seen from the examples presented here.

There have been different approaches to include the vicinity influence on the transfer function, e.g., simulation of parts of the element by rigorous methods [7–10] or usage of neuronal networks based on the geometry parameters [11,12]. They have been used to design high-efficient metalenses, but either as a cylindrical lens or by utilization of the rotational symmetry and might not be trivially expandable for arbitrary 2D CGHs. We proposed a beam propagation method (BPM) that can simulate dielectric metasurfaces with a high accuracy and much higher calculation speed than rigorous methods [13]. Although this is a useful tool for the analysis of such elements, an even faster method is desirable for the design under consideration of neighbor coupling effects. Therefore, we try to reduce the amount of unnecessary calculation steps: Especially for large elements similar arrangements of structures might occur at different positions. On the other hand, the discretization necessary for a farfield projection is much coarser than in the nearfield simulation with BPM or FDTD. In this article we describe a pixel’s transfer function by only a few polynomial coefficients necessary for the farfield determination and present a technique to compute thee coefficients directly from the structure parameters in the vicinity of the actual pixel. For that, the field behind a set of randomized elements is simulated using the BPM. Then the transfer function is reduced to a set of specific coefficients for each pixel. Afterwards a linear regression model is created that determines the coefficients from the structure parameters in the surrounding. This model can then be used to compute the farfield of an arbitrary arrangement of the considered structures without necessity of a further rigorous calculation. We call our method polynomial field approximation (PFA). The description will focus on the application to CGHs, although the approach might be used for other DOEs as well.

In section 2 we will present the farfield calculation of CGHs, based on few parameters per pixel. The following section 3 contains the setup of the regression model. We demonstrate the application of the PFA in section 4 to a beam splitter and a grid projection CGH utilizing effective medium structures and compare the results to LPA, BPM, and FDTD simulations.

2. Farfield calculation for regular microstructures

The aim of a CGH is often to generate a defined intensity distribution in the farfield or at a certain distance from the element. We limit our description to the case of propagation to the farfield. Because there is no coupling between the vectorial components of the electric field in free-space propagation it is sufficient to consider a complex scalar field $u$. To improve readability we will describe a 1D element here and give the 2D formulas in the supplement.

If the field behind the element is given by $u(x, z=0) = u_0(x)$ for a wavelength $\lambda$ then in non-paraxial Fraunhofer approximation the farfield is given by [14]

(1)$$u_{Fr}(x,z) ={-}\dfrac{\mathrm{i} kz}{2\pi r^{2}}\mathrm{e}^{\mathrm{i} kr}U_0\left(k\dfrac{x}{r}\right)\ ,$$

where $r=\sqrt {x^{2}+z^{2}}$, $k=2\pi /\lambda$, and $U_0=\mathcal {F} u_0$ is the Fourier transform (FT) of $u_0$. This means that it is necessary to calculate the FT of the nearfield. In the conventional design process the element is divided into regular pixels. By using e.g. the Gerchberg-Saxton (IFTA) algorithm [1] phase or amplitude values can be determined for each pixel. Here it is assumed, that behind each pixel the field is constant, i.e.,

(2)$$\begin{aligned}u_0(x) = u_m\ , &\quad\quad \text{for } \left(m-\frac{1}{2}\right)p\leq x\leq\left(m+\frac{1}{2}\right)p\ , \end{aligned}$$

where $p$ is the size of one pixel and $m=0,\ldots,M-1$ the number of the respective pixel. It is common to repeat the whole element periodically. So following the description in [15] the nearfield can be described as

(3)$$\begin{aligned}u_0(x) = \left[\sum_{m=0}^{M-1}u_m\delta(x-mp)\right]\ast\mathrm{rect}\left(\frac{x}{p}\right)\ast\sum_{n={-}\infty}^{\infty}\delta(x-nMp)\ . \end{aligned}$$

Here $\ast$ denotes a convolution, $\delta$ is the Dirac delta function, and $\mathrm {rect}$ the rectangular function. The left factor describes the distribution of the values within the element, the central factor assigns these values to pixels of size $p$, and the right factor assures a periodic distribution of the whole element with period $Mp$. For a real element $n$ is of course finite, but for an element much larger than a single period this is a justified description. In the standard IFTA the discrete Fourier transform (DFT),

(4)$$U_m=\sum_{m'=0}^{M-1}u_{m'}\exp\left(-\mathrm{i}\dfrac{2\pi}{M}mm'\right)\ ,$$

is taken as farfield. Numerically a fast Fourier transform (FFT) is utilized, which is mathematically equivalent to the DFT. The order $U_m$ is associated to a spatial frequency of $k_{x,m}=2\pi m/(Mp)$. The result of a continuous FT is

(5)$$U_0(k_x) = \dfrac{2\pi}{M}\mathrm{sinc}\left(\dfrac{k_xp}{2}\right)\left[\sum_{m=0}^{M-1}U_m\delta\left(k_x-k_{x,m}\right)\ast\sum_{n={-}\infty}^{\infty}\delta\left(k_x-\dfrac{2\pi n}{p}\right)\right]\ ,$$

as shown in [15]. Here $\mathrm {sinc}(\varphi )=\sin \varphi /\varphi$. This corresponds to the DFT result for $k_{x,m}$ modulated by a $\mathrm {sinc}$ term caused by the diffraction at a slit of pixel width $p$. This modulation can be included in the IFTA or the defined target distribution, so the usage of the simple DFT is justified if the assumption of a constant field behind each pixel is fulfilled.

For real elements this assumption fails, because neighboring structures influence each other, especially when the structure dimensions change at length scales similar to the wavelength. To take this effect into account, it would be possible to calculate the nearfield at a finer grid, using e.g. FDTD or BPM [13] and take the DFT of it. Regardless of the field determination, this clearly increases the effort of the DFT, while the obtained farfield contains information on high orders that are outside the target area or even do not propagate and are therefore not of interest.

In an alternative model we will approximate the field behind a pixel $m$ by a polynomial of order $D$ according to

(6)$$\begin{aligned}u_{0,D}(x) = \sum_{d=0}^{D}c_{md}\left(\dfrac{x-mp}{p}\right)^{d}\ , &\quad\quad \text{for } \left(m-\frac{1}{2}\right)p\leq x\leq\left(m+\frac{1}{2}\right)p\ . \end{aligned}$$

with polynomial coefficients $c_{md}$. Like before the nearfield behind the entire element can then be described as

(7)$$\begin{aligned}u_{0,D}(x) = \sum_{d=0}^{D}\left\{\left[\sum_{m=0}^{M-1}c_{md}\delta(x-mp)\right]\ast\left(\dfrac{x}{p}\right)^{d}\mathrm{rect}\left(\frac{x}{p}\right)\right\}\ast\sum_{n={-}\infty}^{\infty}\delta(x-nMp)\ . \end{aligned}$$

In Fig. 1 we illustrate this piecewise polynomial representation of the field for different orders $D$. It can be seen that already a second order polynomial provides a close approximation.

Fig. 1. Field behind an element $u_0$ and its approximation by polynomials $u_{0,D}$ of order $D=0,1,2$. The plots show the real (a) and imaginary part (b) of the complex fields.

Download Full Size | PDF

The continuous FT of the above expression is

(8)$$\begin{aligned}U_{0,D}(k_x) = \dfrac{2\pi}{M}\sum_{d=0}^{D}\left(\dfrac{\mathrm{i}}{2}\right)^{d}\mathrm{sinc}^{(d)}\left(\dfrac{k_xp}{2}\right)\left[\sum_{m=0}^{M-1}C_{md}\delta\left(k_x-k_{x,m}\right)\ast\sum_{n={-}\infty}^{\infty}\delta\left(k_x-\dfrac{2\pi n}{p}\right)\right]\ , \end{aligned}$$

where $\mathrm {sinc}^{(d)}$ denotes the $d$-th derivative of the $\mathrm {sinc}$ function. For each $d$ the coefficients $C_{md}$ are the respective DFT of $c_{md}$. For details see the supplementary material. This shows that the farfield can be described as superposition of DFTs of the pixel coefficients of respective degree. Each degree is modulated by a different factor. The rate of convergence of the given series depends on the individual coefficients, but the examples given in section 4, Table 1, show that already a small number of $D$ can be sufficient to describe the farfield accurately. This goes along with a significant reduction of the necessary information and the numerical effort compared to a direct DFT of a finely discretized function.

Table 1. Correlation of LPA and PFA to the BPM results for beam splitter CGHs. The PFA data are distinguished for different maximum order $D$ of the polynomial coefficients. The deflection is characterized by the angle of the central beam.

View Table | View all tables in this article

Since all used considerations are identical for $x$- and $y$-direction and the formulas are separable, the derivation of the corresponding 2D-formulas is straightforward. The results can be found in the supplementary document.

3. Estimation of nearfield based on geometry parameters

For many CGHs the structure at each pixel can be described by a few or even a single parameter which might be a structure height, a rotation angle, or a fill factor [2,3,8]. This parameter entirely defines the field behind the pixel if the arrangement is periodical. Otherwise the field will be influenced also by the parameters of the neighboring pixels. As shown in the previous section it is not necessary to get knowledge of the exact shape of this field in order to calculate the farfield. The determination of a few low-degree polynomial coefficients can be sufficient. In this section we will show a machine learning approach to determine these coefficients based on the geometry parameters of the surrounding pixels.

The basic idea is to start with a set of training data consisting of CGHs with randomized structures. These elements are simulated using a rigorous or semi-rigorous method like FDTD, RCWA, or BPM to get the nearfield behind the element. The complex field behind each pixel is then approximated by polynomials. By utilizing a polynomial regression model, the polynomial coefficients can be expressed by the geometry parameters of the pixels in the close vicinity. After application of the regression model to the test data, created from randomized CGHs as well, the quality of the resulting model can be evaluated by comparing the estimated coefficients to the ones obtained from simulation.

We investigate a CGH based on dielectric, rectangular pillars and holes as in [13,16]. The individual structures are characterized by the geometric fill factor $f\in [0,1]$. The approach will be explained for the 1D case here again, with the 2D formulas given in the supplementary. Starting point is the nearfield behind each pixel, $u_m(x)$ for $(m-1/2)p\leq x\leq (m+1/2)p$. We expand it to Legendre polynomials up to degree $D$:

(9)$$\begin{aligned}u_m(x) \approx\sum_{d=0}^{D}l_{md}L_d\left(\dfrac{2(x-mp)}{p}\right)\ . \end{aligned}$$

The utilization of the orthogonal Legendre polynomials instead of a monomial basis has the advantage that the obtained coefficients $l_{md}$ do not depend on the maximum degree $D$. They can easily be transferred to the polynomial coefficients $c_{md}$. For details see the supplementary document.

To find an appropriate regression model we limited the contributing parameters by restricting to pixels in the vicinity of the currently considered one, utilizing the symmetry, and introducing a heuristic weight function that limits products of parameters. Only a finite number of neighboring pixels will have an influence on the nearfield behind one pixel, so only pixel within a maximum distance of $S$ will be considered. The aim is then to find an expression $F_d$ for each polynomial degree that approximates the coefficients $l_{md}$ as close as possible for all $m$:

(10)$$l_{md}\approx F_d\left(f_{m-S},f_{m-S+1},\ldots,f_{m+S}\right) = F_d\left(\mathbf{f}_{m,S}\right)\ .$$

We use $\mathbf {f}_{m,S}$ as abbreviation for the pixels in the neighborhood of $m$. At the edge of the element appropriate boundary conditions need to be applied, which we will assume to be periodic here.

The Legendre polynomials of even/odd degree are even/odd functions. This implies that $F_d$ needs to fulfill certain symmetry relations as well upon reversal of the parameters $\mathbf {f}_{m,S}^{r}=(f_{m+S}, f_{m+S-1},\ldots,f_{m-S})$:

(11)$$\begin{aligned}F_{2d}\left(\mathbf{f}_{m,S}\right) = F_{2d}\left(\mathbf{f}_{m,S}^{r}\right)\ , &\quad\quad F_{2d+1}\left(\mathbf{f}_{m,S}\right) ={-}F_{2d+1}\left(\mathbf{f}_{m,S}^{r}\right)\ . \end{aligned}$$

To take this into account we construct combinations of the parameters that fulfill either even or odd symmetry:

(12)$$\begin{aligned}f_{m,s,1} =f_{m-s}+f_{m+s}-1 &\quad\quad f_{m,s,-1} = f_{m+s}-f_{m-s} &\quad\quad s =0,\ldots,S\ . \end{aligned}$$

To simplify notation we introduced a parity index which symbolizes even ($p=1$) or odd ($p=-1$) symmetry. The subtraction of 1 ensures $f_{m,s,\pm 1}\in [-1,1]$ which reduces correlation of the variables in the regression model. There are now $2S+1$ independent variables ($f_{m,0,-1}=0$ is not used). We use multi-index notation to write products of this variables:

(13)$$\begin{aligned}\mathbf{f}_{m,S}^{\mathbf{t}} = \underbrace{\prod_{s=0}^{S}f_{m,s,1}^{t_s}}_\text{even terms}\underbrace{\prod_{s=S+1}^{2S+1}f_{m,s-S,-1}^{t_s}}_\text{odd terms}\ , \end{aligned}$$

where $\mathbf {t}=(t_1,\ldots,t_{2S+1}),\ t_S\geq 0$ is a tuple containing the exponents of the respective parameter combinations. The parity of such a product can be expressed by the exponents:

(14)$$p(\mathbf{t}):=\prod_{s=0}^{S}1^{t_s}\prod_{s=S+1}^{2S+1}({-}1)^{t_s}=({-}1)^{\sum_{s=S+1}^{2S+1}t_s}\ .$$

It is reasonable to assume that the influence of the fill factors in the neighborhood of the current pixel will decrease with increasing distance. The regression model should therefore include higher powers of parameters from nearer pixels than from more distant ones. To estimate the influence of a certain term we will use the following weight function that we also define in multi-index notation:

(15)$$\begin{aligned}w(s,t_s) := t_s(1+s)\ , &\quad\quad w(\mathbf{t}) := \sum_{s=0}^{S}w(s,t_s)+\sum_{s=S+1}^{2S+1}w(s-S,t_s)\ . \end{aligned}$$

The choice of this function is heuristic, with no claim to be optimal, but it proved itself useful in the development of this approach. We can now formulate a linear model for the regression: It includes all parameters up to a distance $S$ and their products that do not exceed a maximum weight $W$:

(16)$$\begin{aligned}F_d\left(\mathbf{f}_{m,S}\right)=\sum_{\substack{\mathbf{t}:\ p(\mathbf{t})={\pm} 1\\w(\mathbf{t})\leq W}}r_\textbf{t}\mathbf{f}_{m,S}^{\mathbf{t}}\ . \end{aligned}$$

The upper sign applies for even $d$, the lower one for odd $d$. This sum is finite, because all the exponents are limited by the weight function: $0\leq t_s\leq W$. Examples for the occurring parameters are given in the supplementary document as well as the corresponding formulations for a 2D element.

The combination of this regression-based nearfield estimation with the farfield determination described in the previous chapter allows the computation of the farfield directly from the given geometry parameters (PFA).

4. Application to effective index CGHs

To demonstrate the usability of our approach we simulated the farfield of different CGHs. Therefore we first designed a phase-only CGH using a standard IFTA-algorithm. To associate an effective-index structure to each pixel we followed the LPA approach and simulated pixels with different fill factor in periodic arrangement and chose the structure with closest phase value compared to the design for each pixel. Then we calculated the farfield of these elements using LPA, BPM, FDTD, and our new PFA in the 2D formulation.

For the simulation of training and test data we used a BPM as described in [13]. The FDTD simulations were performed using Lumerical FDTD. Detailed simulation parameters can be found in the supplementary document.

4.1 Off-axis beam splitter

We created a phase profile for a $11\times 11$ beam splitter design using a standard IFTA with amplitude freedom outside the area of $17\times 17$ diffraction orders. Then we added a linear phase profile to the obtained phase. This corresponds to a shift in frequency space, so in the farfield an additional deflection will be obtained for all diffraction orders. In a scalar approach this creates a suitable design for any deflection angle. However, since the structures change more rapidly with higher deflection angle, coupling effects between neighboring pixels will arise and influence the performance.

The CGH is designed for a wavelength of ${633}\,\mathrm {nm}$. It uses the orders $-5,\ldots,5$ in $x$- and $y$-direction to create a $11\times 11$ beam splitter. It consists of $100\times 100$ pixels with a size of ${400\times 400}\,\mathrm {nm^{2}}$ and thus has an overall size of ${40\times 40}\,\mathrm {\mu} \textrm{m}^{2}$. Different phase levels are created by an effective-medium approach: The structures are created in fused silica with a depth of ${1.385}\,\mathrm {\mu} \textrm{m}$ that corresponds to a $2\pi$ phase difference. Intermediate values are obtained by either a quadratic pillar or a quadratic hole with side lengths of 80, 120, 160, 180, 200, 220, 240, 260, or ${280}\,\mathrm {nm}$. In this way 20 approximately similar phase steps can be generated, which have been determined using BPM.

The training data for the PFA contain 64 random CGHs with $20\times 20$ pixels, see Data File 1. To limit the amount of coefficients a maximum distance $S=4$ and maximum weight $W=11$ has been used. So for each Legendre coefficient the parameters of the 41 pixels within a rectilinear distance of 4 are exploited with combinations of up to 11 of these. That gives $\binom {41+11}{11}\approx 6.04\cdot 10^{10}$ possible independent variables which are reduced by the weight restrictions to 4386, 3878, or 3490 terms, depending on the parity in $x$- and $y$-direction.

With increasing deflection angle different effects can be observed (see Fig. 2): The farfield of the exact IFTA phase would be purely shifted (not displayed). With the tilted phase profile the associated structures change, but they can still reproduce the phase, so in LPA we primarily observe also just a shift in the farfield (a-c). Taking the $\mathrm {sinc}$-modulation from Eq. (5) into account this is overlaid by an increasing attenuation of the higher orders. This effect can be pre-corrected in the IFTA-algorithm and is no specific drawback of the LPA. When considering the PFA, additional effects can be observed which increase with growing deflection angle: the zeroth order is enhanced, the homogeneity of the beam splitter orders is reduced, and additional stray light orders and twin images occur (e,f). For these effects a high conformity to a BPM simulation can be seen (h,i).

Fig. 2. Farfield intensity for beam splitter CGHs with varying central order deflection angle (from left to right), calculated using LPA (a-c), PFA (d-f), and BPM (g-i). The axis labels show the diffraction orders. The insets show a $10\times 10$ pixel detail of the phase profile (upper left) and the structure (lower right), with lengths given in $\mathrm {\mu}$m.

Download Full Size | PDF

For a numerical comparison of the simulated farfields, the correlation of the obtained farfields has been computed according to

(17)$$\mathrm{corr}\left(U_1,U_2\right) = \dfrac{\sum_{k_x,k_y}\left(U_1(k_x,k_y)-\overline{U_1}\right)\left(U_2^{{\ast}}(k_x,k_y)-\overline{U_2^{{\ast}}}\right)}{\sqrt{\sum_{k_x,k_y}\left|U_1(k_x,k_y)-\overline{U_1}\right|^{2}\sum_{k_x,k_y}\left|U_2(k_x,k_y)-\overline{U_2}\right|^{2}}}\ ,$$

wherein $U^{\ast }$ denotes complex conjugation and $\overline {U}$ the arithmetic mean. The results are presented in Table 1. They show that the accuracy of the LPA sharply decreases with increasing deflection angle, whereas there is only a slight decrease in the PFA. A comparison of the PFA for different maximum order $D$ shows that the usage of $D=1$ brings an improvement compared to $D=0$, but the extension to $D=2$ only gives a slight increase in accuracy. All PFA results shown in the figures use $D=2$.

The results show the expected behavior: at small deflection angles the farfield can be described by the LPA up to a certain accuracy, see Fig. 2(a,d,g). Therefore, the conventional design method is able to create a suitable CGH in this case. With increasing deflection angle the approximation becomes worse, so a more sophisticated method is needed to calculate the farfield. Our PFA produces almost the same results as the BPM.

4.2 Grid projection

We investigated a CGH design that creates a quadratic grid in the farfield. The size of the CGH has been varied, while the target shape has been fixed. With increasing CGH dimensions the amount of addressable orders increases, so the grid lines can contain more points and the optimization then leads to a more homogeneous intensity distribution. It is unproblematic to create such a CGH using a traditional IFTA, but only based on the LPA and neglecting any coupling effects. To analyze them, (semi-)rigorous simulation are necessary that need a large amount of computational power. The PFA can close this gap and enables fast farfield projections of large elements at high accuracy.

The CGH should create a target profile as shown in Fig. 3(k,l) for a wavelength of ${532}\,\mathrm {nm}$. It corrects the $\mathrm {sinc}$-modulation, Eq. (5). A standard IFTA was used, incorporating amplitude freedom outside the grid area. The deflection angle for the highest desired order is $39.7^{\circ }$ along the axes and $64.5^{\circ }$ on the diagonals. The simulated element uses pixels with a size of ${300\times 300}\,\mathrm {nm^{2}}$ and a combination of pillars and holes in fused silica as in the previous example. The side lengths of the structures are 100, 140, 180, or ${200}\,\mathrm {nm}$ and have a height of ${1.04}\,\mathrm {\mu} \textrm{m}$. We present results for element sizes from $25\times 25$ pixels ($7.5\times {7.5}\,\mathrm {\mu} \textrm{m}^{2}$) up to $1600\times 1600$ pixels ($480\times {480}\,\mathrm {\mu} \textrm{m}^{2}$).

Fig. 3. Farfield intensity for grid-generating CGHs with varying size (from left to right), calculated using LPA (a-c), PFA (d-f), BPM (g-i), and FDTD (j); target intensity (k,l), with amplitude freedom marked in light blue. The axis labels show the diffraction orders. All data has been up-sampled to $400\times 400$ pixel resolution. The insets show a $10\times 10$ pixel detail of the structure, with lengths given in $\mathrm{\mu}$m.

Download Full Size | PDF

The PFA uses the same amount of training data as before, but with a reduced weight $W=9$ and $S=4$ for the fit model, see Data File 2. This results in 1093, 894, or 751 independent variables for the different parities, out of $\binom {41+9}{9}\approx 2.51\cdot 10^{9}$ possible combinations. This choice favors the calculations speed at the expense of a slight reduction in accuracy. The BPM simulation of the training data took ${360}\,\mathrm {s}$ and the model creation ${8}\,\mathrm {s}$. All given runtimes have been determined using the same system with a ${3.3}\,\mathrm {GHz}$ processor and 16 cores. Parallelization with 16 processes has been used for the BPM.

As shown in Fig. 3, the farfield propagation of the smallest CGH shows that although this design still works in LPA (a), the real farfield differs strongly from its expected behavior when simulated with more exact methods (d,g,j). There the grid is barely recognizable. On the other hand, PFA, BPM, and FDTD show similar results with corresponding stray light orders. In the second example, the grid can be generated in the farfield (b), although there are much more inhomogeneities present in PFA and BPM simulation (e,h) than in the LPA projection. At the edges of the grid an attenuation can be noticed. This is not caused by the $\mathrm {sinc}$-modulation, which has been considered in the design, but might be induced by the stronger influence of neighboring structures at larger deflection angles. Again there is a high consistency between PFA and BPM. A simulation with FDTD would consume too much time and was not possible with the available computer power. In the third example the stray light mostly has lower intensity than the grid itself, but there are significant efficiency differences between various grid points, as well as an attenuation at the edges.

The PFA results have a high correlation of 0.987 to the BPM, almost independent of the CGH size, while it is much smaller with $0.50\ldots 0.55$ for the LPA. For the smallest example also the correlation of PFA and BPM to FDTD has been computed, which is 0.983 and 0.991, respectively. This again shows the high quality of the PFA results, while the LPA produces strong deviations.

In Table 2 we compared the runtime of the different algorithms for various CGH sizes. They have been determined without parallelization except for the FDTD which used 8 cores. It is possible to split the simulation area in PFA and BPM, which has been done to limit the necessary RAM. If a sufficient amount of RAM is available, parallelization can be used. The runtime of the PFA scales almost linear with the CGH size and is much smaller than for the BPM, that itself is substantially faster than the FDTD.

Table 2. Runtime comparison of the different algorithms for different CGH sizes. The FDTD used 8 processes, all other simulations run on a single process. An estimate for the used RAM is given as well. For values in parentheses the simulation area has been split and simulated separately to reduce memory usage.

View Table | View all tables in this article

The results demonstrate, that the usage of the LPA design of a small-size CGH is not meaningful for this target distribution. On the other hand, existing analysis methods for large-scale CGHs are limited by the computational effort. Our proposed method can analyze even large-area elements within a reasonable amount of time. For smaller examples the consistency to existing methods has been shown.

5. Conclusion

We demonstrated how a high-accuracy, non-paraxial farfield simulation of CGHs can be realized even for large-area elements. For this purpose we developed the PFA, that calculates the farfield based on polynomial coefficients of the nearfield, which are itself estimated from the geometry parameters by a linear regression model. Here we used a BPM to generate the training data used to generate the model, but results based on rigorous methods might be utilized as well. The PFA shows a high consistency to reference outcomes of BPM and FDTD simulations. It considers polarization effects, provided the training data does so. The given examples are based on quadratic effective index structures, described by their geometric fill factor, but our method can also be applied to other kind of meta-atoms. If they require more than one parameter per pixel, an extended formulation seems possible. The method is not limited to the treatment of CGHs, but might be used for other DOEs as well, if the representation of the nearfield by low-order polynomials is sufficiently accurate.

It is sufficient to use small elements in the training data, so that parallelization can be utilized. Once the model is created, the farfield computation is much faster than with physical simulation based methods. By adjusting the amount of considered terms it is even possible to make a trade-off between calculation speed and accuracy. With this characteristics the PFA is ideally suited for the optimization of elements, where repeated simulation of different elements based on the same structures are necessary. It opens the possibility of optimizing entire elements at high accuracy.

Funding

Deutsche Forschungsgemeinschaft (1375/1 NOA); Bundesministerium für Bildung und Forschung (13N15175); Freistaat Thüringen (2020 FGR 0048).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Data underlying the results presented in this paper are available in Data File 1, Data File 2.

Supplemental document

See Supplement 1 for supporting content.

References

1. R. W. Gerchberg and W. O. Saxton, “A practical algorithm for the determination of phase from image and diffraction plane pictures,” Optik 35, 237–246 (1972).

2. A. Arbabi, Y. Horie, A. Ball, M. Bagheri, and A. Faraon, “Subwavelength-thick lenses with high numerical apertures and large efficiency based on high contrast transmitarrays,” Nat. Commun. 6(1), 7069 (2015). [CrossRef]

3. M. Khorasaninejad, W. T. Chen, R. C. Devlin, J. Oh, A. Y. Zhu, and F. Capasso, “Metalenses at visible wavelengths: Diffraction-limited focusing and subwavelength resolution imaging,” Science 352(6290), 1190–1194 (2016). [CrossRef]

4. M. Decker, I. Staude, M. Falkner, J. Dominguez, D. N. Neshev, I. Brener, T. Pertsch, and Y. S. Kivshar, “High-efficiency dielectric huygens’ surfaces,” Adv. Opt. Mater. 3(6), 813–820 (2015). [CrossRef]

5. W. Yu, K. Takahara, T. Konishi, T. Yotsuya, and Y. Ichioka, “Fabrication of multilevel phase computer-generated hologram elements based on effective medium theory,” Appl. Opt. 39(20), 3531–3536 (2000). [CrossRef]

6. V. Popov, M. Yakovleva, F. Boust, J.-L. Pelouard, F. Pardo, and S. N. Burokur, “Designing metagratings via local periodic approximation: From microwaves to infrared,” Phys. Rev. Appl. 11(4), 044054 (2019). [CrossRef]

7. S. J. Byrnes, A. Lenef, F. Aieta, and F. Capasso, “Designing large, high-efficiency, high-numerical-aperture, transmissive meta-lenses for visible light,” Opt. Express 24(5), 5110–5124 (2016). [CrossRef]

8. M. E. Testorf and M. A. Fiddy, “Efficient optimization of diffractive optical elements based on rigorous diffraction models,” J. Opt. Soc. Am. A 18(11), 2908–2914 (2001). [CrossRef]

9. Z. Lin and S. G. Johnson, “Overlapping domains for topology optimization of large-area metasurfaces,” Opt. Express 27(22), 32445–32453 (2019). [CrossRef]

10. T. Phan, D. Sell, E. W. Wang, S. Doshay, K. Edee, J. Yang, and J. A. Fan, “High-efficiency, large-area, topology-optimized metasurfaces,” Light: Sci. Appl. 8(1), 48 (2019). [CrossRef]

11. S. An, B. Zheng, M. Y. Shalaginov, H. Tang, H. Li, L. Zhou, Y. Dong, M. Haerinia, A. M. Agarwal, C. Rivero-Baleine, M. Kang, K. A. Richardson, T. Gu, J. Hu, C. Fowler, and H. Zhang, “Deep convolutional neural networks to predict mutual coupling effects in metasurfaces,” Adv. Opt. Mater. 10, 2102113 (2021). [CrossRef]

12. M. V. Zhelyeznyakov, S. Brunton, and A. Majumdar, “Deep learning to accelerate scatterer-to-field mapping for inverse design of dielectric metasurfaces,” ACS Photonics 8(2), 481–488 (2021). [CrossRef]

13. S. Linss, D. Michaelis, and U. D. Zeitner, “Macroscopic wave-optical simulation of dielectric metasurfaces,” Opt. Express 29(7), 10879–10892 (2021). [CrossRef]

14. M. Born, E. Wolf, A. Bhatia, P. Clemmow, D. Gabor, A. Stokes, A. Taylor, P. Wayman, and W. Wilcock, Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Cambridge University, 1999).

15. T. Kämpfe, E.-B. Kley, and A. Tünnermann, “Designing multiplane computer-generated holograms with consideration of the pixel shape and the illumination wave,” J. Opt. Soc. Am. A 25(7), 1609 (2008). [CrossRef]

16. W. Eckstein, E.-B. Kley, and A. Tünnermann, “Comparison of different simulation methods for effective medium computer generated holograms,” Opt. Express 21(10), 12424–12433 (2013). [CrossRef]

Name	Description
Data File 1	Training data for the regression model used for the beamsplitter.
Data File 2	Training data for the regression model used for the grid projection.
Data File 3	Test data for the regression model used for the beamsplitter.
Data File 4	Test data for the regression model used for the grid projection.
Supplement 1	Supplemental document

deflection	LPA	PFA $D = 0$	PFA $D = 1$	PFA $D = 2$
${0.0}^{\circ}$	0.923	0.989	0.995	0.995
${8.1}^{\circ}$	0.879	0.981	0.994	0.995
${16.4}^{\circ}$	0.800	0.969	0.991	0.993
${25.1}^{\circ}$	0.700	0.951	0.986	0.988
${34.5}^{\circ}$	0.590	0.946	0.986	0.988
${45.0}^{\circ}$	0.414	0.948	0.986	0.988

CGH size	Physical	Runtime	Memory	Runtime	Memory	Runtime
$[p i x e l]$	size [ $μ m^{2}$ ]	PFA [s]	PFA [GB]	BPM [s]	BPM [GB]	FDTD [s]
$25 \times 25$	$7.5 \times 7.5$	0.2	<0.2	9.8	0.7	4019
$50 \times 50$	$15 \times 15$	0.6	<0.2	40.3	2.5
$100 \times 100$	$30 \times 30$	1.9	0.4	149	9.7
$200 \times 200$	$60 \times 60$	6.5	1.4	660	38.3
$400 \times 400$	$120 \times 120$	27.3	(1.8)	2870	(41.5)
$800 \times 800$	$240 \times 240$	109	(1.9)
$1600 \times 1600$	$480 \times 480$	440	(2.7)

deflection	LPA	PFA $D = 0$	PFA $D = 1$	PFA $D = 2$
${0.0}^{\circ}$	0.923	0.989	0.995	0.995
${8.1}^{\circ}$	0.879	0.981	0.994	0.995
${16.4}^{\circ}$	0.800	0.969	0.991	0.993
${25.1}^{\circ}$	0.700	0.951	0.986	0.988
${34.5}^{\circ}$	0.590	0.946	0.986	0.988
${45.0}^{\circ}$	0.414	0.948	0.986	0.988

CGH size	Physical	Runtime	Memory	Runtime	Memory	Runtime
$[p i x e l]$	size [ $μ m^{2}$ ]	PFA [s]	PFA [GB]	BPM [s]	BPM [GB]	FDTD [s]
$25 \times 25$	$7.5 \times 7.5$	0.2	<0.2	9.8	0.7	4019
$50 \times 50$	$15 \times 15$	0.6	<0.2	40.3	2.5
$100 \times 100$	$30 \times 30$	1.9	0.4	149	9.7
$200 \times 200$	$60 \times 60$	6.5	1.4	660	38.3
$400 \times 400$	$120 \times 120$	27.3	(1.8)	2870	(41.5)
$800 \times 800$	$240 \times 240$	109	(1.9)
$1600 \times 1600$	$480 \times 480$	440	(2.7)

Ultrafast farfield simulation of non-paraxial computer generated holograms

Abstract

1. Introduction

2. Farfield calculation for regular microstructures

3. Estimation of nearfield based on geometry parameters

4. Application to effective index CGHs

4.1 Off-axis beam splitter

4.2 Grid projection

5. Conclusion

Funding

Disclosures

Data availability

Supplemental document

References

Supplementary Material (5)

Data availability

Cited By

Figures (3)

Tables (2)

Equations (17)

Optics Express