
Deep neural network based calibration for freeform surface misalignments in general interferometer

Open Access

Abstract

In interferometric testing of optical freeform surfaces, the calibration of surface misalignment aberrations is a tremendous challenge. A complex surface figure combined with six degrees of freedom introduces complex, non-linear variations of the misalignment aberrations even for slight misalignments. The traditional sensitive matrix method provides adequate precision only in the linear region, which is insufficient. In this paper, a deep neural network (DNN)-based calibration method is introduced. The well-trained DNN can handle this non-linear relation and thus yield accurate misalignment estimates. Subsequently, the estimated misalignments are applied in a system model to predict all the misalignment aberrations by ray tracing. These aberrations are then removed by a simple wavefront data subtraction. Simulation and experimental results show the feasibility of the proposed DNN-based method.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Optical freeform surface metrology has always been a tremendous challenge. Among the various techniques, interferometry, which has been the prevalent testing method for spherical and aspherical surfaces, is still the primary choice owing to its excellent measurement accuracy. Thus, numerous aspheric interferometry concepts have been applied to freeform surface measurement, such as the null test [1,2], non-null test [3,4], and sub-aperture stitching methods [5]. Certainly, all these attempts suffer from the trade-off between flexibility and accuracy, as in aspheric interferometry. Moreover, this conflict is more prominent for freeform surfaces because of the rotational asymmetry of the surface figure. Accurate compensating optics for a freeform surface interferometer are both difficult to produce and inflexible. To alleviate this situation, several adaptive optical elements have been employed as compensators to achieve highly flexible measurements with relatively high accuracy [6–12], which is an important development trend in freeform surface testing.

In all the above-mentioned interferometry methods, misalignments of the test surface have always been important factors affecting the measurement accuracy. Here, we clarify what freeform surface misalignments mean in interferometers. In some null interferometers with specifically designed static null optics, the null position of the test surface is generally unique. In this case, the misalignments of the test freeform surface follow the general definition: the deviation of the surface position and posture from the unique null position. However, under a non-null condition, the misalignments do not refer to this deviation, because an optimum measuring position in a non-null configuration is non-existent or unnecessary. A freeform surface is generally not symmetric about the optical axis and thus has no unique optimum posture. Theoretically, a freeform surface can be tested at any position and posture, provided that the resulting interferogram is within the resolution range of the interferometer. The retrace error [13,14] is then corrected by system-model-based ray tracing. In this case, the position and posture of the freeform surface in the model and in the experiment must be consistent. The deviations between the two are thus defined as the real “misalignments” in non-null configurations. Even in the recently emerged adaptive null interferometry, misalignments imply the deviation between the model and the experiment. This is because adaptive null interferometry cannot achieve an absolute null test, and thus, model-based ray tracing is also necessary [6–12].

In traditional spherical interferometers, surface misalignments are relatively easy to identify by evaluating the tested wavefront aberrations [15], such as the first four terms of the Zernike coefficients. However, for an aspheric test, this simple procedure is less accurate owing to high-order residual aberrations, particularly under a non-null condition [16], let alone for freeform surfaces [17]. The freeform figure and six-axis motion freedom increase the complexity of the misalignment aberrations of freeform surfaces [18]. In freeform surface interferometry, slight misalignments of the test surface introduce complex aberrations and rapidly lead to high fringe densities, under both null and non-null conditions. These complex misalignment aberrations contain not only tilt and defocus but also higher-order terms derived from the surface figure amplifying the low-order misalignment aberrations. The inscrutable derivation and inter-coupling of these aberrations make misalignment aberration estimation and calibration in freeform surface interferometers extremely difficult. The scenario is even worse in non-null test configurations, because the retrace error is also enlarged.

Several methods have been proposed for freeform surface misalignment calibration in interferometers. Most of them learn from the aspheric calibration case. The classic computer-aided alignment (CAA) method exhibits a good performance in spherical and aspheric surface misalignment calibration with a sensitive matrix (SM), also known as the sensitive table method [16,17,19–22]. The SM is generally obtained by perturbation theory [19], which establishes a corresponding relation between the Zernike coefficients of the tested wavefront and the misalignments. This method has now been integrated into some commercial optical design software, such as Code V, and works well in the linear region. However, only slight misalignments within the linear region can be predicted accurately. As was highlighted in [23], the SM method cannot provide accurate estimates of large misalignments, owing to the non-linearity of the Zernike coefficient sensitivity to such misalignments, even in spherical optical systems. Furthermore, for freeform surfaces, the misalignment aberrations may not be linear with the misalignments even when the misalignments are slight, owing to the complex error coupling. In particular, the freedom of rotation about the optical axis makes the aberrations due to the other five degrees of freedom unpredictable. Therefore, some researchers attempt to calibrate the rotation error about the optical axis and the other five misalignments separately [5,24]. However, this process requires multiple difficult iterations and has low accuracy. Other improved methods, such as the merit function regression method [23] and the differential wavefront sampling method [25], have been developed to treat the non-linear problem. However, thus far, these methods have been validated only in optical systems with spherical surfaces. Hao et al. [26] proposed a calibration method for a non-null interferometer that sequentially optimizes specific misalignments in a virtual interferometer under the assumption that the six misalignments are independent of each other. However, this assumption is not always true for general freeform surfaces, and thus, an iterative process must be employed. Neural network (NN) and deep learning concepts have been widely employed in wavefront sensing and aberration correction [27–32], exhibiting a tremendous ability to treat non-linear problems. This provides significant motivation to address the non-linear misalignments in freeform surface interferometry. Actually, as early as 1993, Barrett [33] employed a neural network to determine the Hubble Space Telescope aberrations from stellar images. In 2013, Baer introduced a simple neural network with only 12 iterations for misalignment calibration in a tilted-wave interferometer [34]. However, that work stopped at aspheric surfaces, and no experimental validation was performed for freeform surfaces. Specifically, thus far, no experimental validation has been reported for freeform surface misalignment calibration using a neural network.

In this paper, we propose a deep neural network (DNN)-based misalignment calibration method for freeform surface misalignment aberration calibration. The well-trained DNN, with its multiple layers and interconnections, can handle the non-linear relation and thus provide accurate misalignment estimates. Subsequently, the estimated misalignments are simulated in a ray tracing model to predict all the misalignment aberrations. These aberrations are then removed by a simple wavefront data subtraction. To the best of our knowledge, this is the first work to experimentally validate a DNN method in an interferometer for freeform optics misalignment calibration.

2. Principle

In freeform surface interferometry, we express the relation between the Zernike coefficient deviations of the tested wavefront and the surface misalignments by an implicit function in parametric form as

$$\{ \Delta Z_i \} = \{ Z_i - Z_{\mathrm{o}i} \} = f(\{ \varepsilon_j \}),$$
where $\Delta Z_i$ is the deviation between the Zernike coefficient of the misaligned tested wavefront ($Z_i$) and that of the aligned one ($Z_{\mathrm{o}i}$), and $\varepsilon_j\,(j = 1,2,\cdots,6)$ characterizes the surface misalignments. Specifically, they are the three translations ($d_x$, $d_y$, and $d_z$) and three rotations ($\theta_x$, $\theta_y$, and $\theta_z$) about the respective axes. In traditional CAA, the SM method simplifies Eq. (1) into the matrix form
$$\Delta {\textbf Z} = {\textbf M} \cdot {\boldsymbol \varepsilon },$$
where $\Delta \mathbf{Z} = [\Delta Z_i]^{\prime}$ is the vector of Zernike coefficient deviations of the misaligned tested wavefront, $\boldsymbol{\varepsilon}$ is the misalignment vector, and $\mathbf{M}$ is the SM. This equation expresses a linear relation between the wavefront coefficient variations and the misalignments. However, this linear relation does not always hold, because the complex surface figure modulated by the six-axis misalignments produces complex aberrations. Specifically, the function in Eq. (1) must be treated in its general non-linear form rather than as a simple linear approximation.
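For concreteness, the sketch below shows how the SM estimate of Eq. (2) is typically obtained once a sensitivity matrix has been built by perturbing each misalignment in the ray tracing model. This is a minimal illustration, not the authors' code; the variable names (M, Z_meas, Z_aligned) are assumptions.

% Minimal sketch of the SM estimate in Eq. (2). Assumes a 37x6 sensitivity
% matrix M (one column per misalignment, built by perturbing the ray tracing
% model) and 37x1 Zernike coefficient vectors; all names are illustrative.
dZ     = Z_meas - Z_aligned;   % deviation of the Zernike coefficients, Eq. (1)
eps_SM = M \ dZ;               % 6x1 least-squares solution of dZ = M*eps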

Therefore, a DNN is introduced to characterize the non-linear function. The function in Eq. (1) can be simplified to the direct relation $\{Z_i\} = f(\{\varepsilon_j\})$, because $\{Z_{\mathrm{o}i}\}$ is generally determined from a ray tracing model. Thus, the misalignment estimation is expressed as

$$\{ \varepsilon_j \} = f^{-1}(\{ Z_i \}).$$
The DNN is well suited to this inverse problem. The framework of the DNN is presented in Fig. 1. The multilayer DNN acts as the inverse function $f^{-1}$, with the Zernike coefficients of the misaligned wavefront as the input vector and the misalignments as the output. The first step is network training, shown inside the blue dotted line in Fig. 1. Samples from the ray tracing system model are employed to train the back-propagation neural network [29] with the gradient descent with momentum (GDM) [35] algorithm. After sufficient iterative training, the weight and bias of any two connected neurons reach optimal values, yielding the minimum root mean square error (RMSE). The next step is the misalignment estimation and calibration. With the well-trained DNN, the specific misalignment values can be estimated from the Zernike coefficients of the actual tested wavefront in the experiment, as shown inside the black dotted line in Fig. 1. These estimated misalignment values are then employed to align the tested surface in the model, rather than the one in the experimental system, as shown inside the purple dotted line in Fig. 1. This procedure is easier than aligning the actual tested surface in the experiment with precision mechanical structures. The final surface figure error ($E$) can then be extracted by Eq. (4) together with the misalignment aberration calibration, where $W_{\textrm{Exp}}(\boldsymbol{\varepsilon})$ and $W_{\textrm{Model}}(\boldsymbol{\varepsilon})$ are the misaligned tested wavefronts in the experiment and in the model, respectively. Equation (3) can be applied to both null and non-null configurations. In a non-null configuration, the retrace error can be corrected together with the misalignments by Eq. (4).
$$E = \frac{1}{2}[{{W_{\textrm{Exp}}}({\boldsymbol \varepsilon } )- {W_{\textrm{Model}}}({\boldsymbol \varepsilon } )} ].$$
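As a hedged illustration of how Eqs. (3) and (4) are applied in sequence, the MATLAB sketch below assumes a trained network net (see Section 3), a hypothetical helper rayTraceWavefront that returns the modeled misaligned wavefront for a given misalignment vector, and a hypothetical fitZernike helper; none of these names come from the original code.

% Sketch of the estimation and calibration steps; net is a trained DNN,
% rayTraceWavefront() and fitZernike() are hypothetical wrappers.
Z_meas  = fitZernike(W_exp);           % 37x1 coefficients of the tested wavefront
eps_hat = net(Z_meas);                 % Eq. (3): DNN acts as f^{-1}, giving [dx dy dz tx ty tz]'
W_model = rayTraceWavefront(eps_hat);  % misaligned wavefront predicted by the system model
E       = 0.5*(W_exp - W_model);       % Eq. (4): surface figure error after calibration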

3. Simulation for neural network performance evaluation

We took a Zernike polynomial surface (8λ coma and 15λ astigmatism) in a simple partial null interferometer [14,36,37] for the simulated illustration. The DNN training codes were implemented in MATLAB and executed on a computer with an Intel i7-7700 CPU running at 3.6 GHz and 16 GB RAM. A sample dataset (sample number n = 10000) from the ray tracing model was imported into the DNN for training. In each sample, the vector $\{Z_{\mathrm{s}i}\}\,(i = 1 \cdots 37)$ acted as the sample input and $\boldsymbol{\varepsilon}_{\mathrm{s}}(d_x, d_y, d_z, \theta_x, \theta_y, \theta_z)$ acted as the sample output, where the subscript $\mathrm{s}$ refers to the sample dataset. The transfer functions in the hidden layers and the output layer were the sigmoid and purelin functions, respectively, and the training function was traingdm [35]. Another test dataset with 100 samples was used to validate the trained DNN. To simulate actual test data, a map of figure error was attached to the tested surface. Specifically, the DNN was trained with ideal surface figures, and the validation was executed with the surface figure error included in the test dataset. The training and test performance of a DNN is affected by the numbers of layers, neurons, and epochs. Therefore, we evaluated these factors to obtain a reasonable DNN structure, in both the noise-free and noise cases. The performance of the DNN was then compared with that of the SM method in misalignment aberration estimation.
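The sketch below illustrates this training setup with the MATLAB neural network toolbox functions named in the text (traingdm, sigmoid/purelin transfer functions). The sample matrices Xs (37 × 10000 Zernike inputs) and Ts (6 × 10000 misalignments) are assumed to have been exported from the ray tracing model; the learning rate, momentum constant, and epoch value shown here are illustrative, not the authors' exact settings.

% Sketch of the DNN construction and training (assumes the Deep Learning /
% Neural Network Toolbox). Xs: 37 x n sample inputs, Ts: 6 x n sample outputs.
net = feedforwardnet([60 60 60 60], 'traingdm');  % 4 hidden layers, GDM training
for k = 1:4
    net.layers{k}.transferFcn = 'logsig';         % sigmoid transfer in hidden layers
end
net.layers{end}.transferFcn = 'purelin';          % linear output layer
net.divideFcn = 'dividetrain';                    % use all samples for training
net.trainParam.epochs = 20000;                    % illustrative epoch budget (see Sec. 3.1)
net.trainParam.lr     = 1e-3;                     % illustrative learning rate
net.trainParam.mc     = 0.9;                      % illustrative momentum constant
[net, tr] = train(net, Xs, Ts);                   % back-propagation training on the samples
eps_hat   = net(Xt);                              % 6 x 100 estimates for the test inputs Xt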

Fig. 1. The DNN-based misalignment calibration.

3.1 Noise-free case

A better training performance originates from more epochs (iterations); however, it requires a longer training time. Generally, the neuron number has little influence on the training accuracy in the noise-free case once it reaches a certain magnitude, given unlimited epochs and training time [29]. However, such a parameter analysis is not easy to perform in practice. Therefore, the epoch number in our training was selected as 2000000, which provided a relatively high convergence precision and a moderate convergence rate. In this case, the convergence precision of the DNN is affected by the number of network layers and neurons. Figure 2 illustrates this. Figures 2(a) and 2(b) show the output RMSE variation in the sample dataset and the test dataset, respectively, with different neuron numbers for different numbers of layers. In Fig. 2(a), the sample RMSE is expressed as $\sqrt{\sum_{k = 1}^{n} (\varepsilon_{jk} - \varepsilon_{\mathrm{s}\_jk})^2 / n}$, where the subscript $k$ refers to the $k$th sample, and $\varepsilon_{jk}$ and $\varepsilon_{\mathrm{s}\_jk}$ denote the trained DNN output and the sample output, respectively. Note that the RMSE here is treated as dimensionless because $\varepsilon_{jk}$ and $\varepsilon_{\mathrm{s}\_jk}$ involve two units (mm and °). The RMSE decreases rapidly with increasing neuron number and network layers. A smaller convergence value represents a better trained network. Remarkably, the RMSE reaches $10^{-3}$ at 60 neurons with 5, 6, and 7 layers on the sample dataset; the corresponding training times are 38, 47, and 61 h, respectively. From Fig. 2(b), we conclude that a better trained network is associated with a better performance on the test dataset. The test RMSE is expressed as $\sqrt{\sum_{k = 1}^{n} (\varepsilon_{jk} - \varepsilon_{\mathrm{t}\_jk})^2 / n}$, where the subscript $\mathrm{t}$ refers to the test dataset. Note that here $\varepsilon_{jk}$ is not the same as above; instead, it is the network output for the inputs in the test dataset. The test RMSE reaches $10^{-2}$ at 60 neurons with 5, 6, and 7 layers as well.
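A minimal sketch of how the dimensionless RMSE above can be computed, assuming Eps holds the 6 × n DNN outputs and EpsRef the corresponding sample (or test) outputs in mm and degrees; the names are illustrative.

% Dimensionless output RMSE, pooled over the six misalignments and per
% degree of freedom; Eps and EpsRef are 6 x n matrices (illustrative names).
rmsePooled = sqrt(mean((Eps(:) - EpsRef(:)).^2));   % single value, as plotted in Fig. 2
rmsePerDOF = sqrt(mean((Eps - EpsRef).^2, 2));      % 6 x 1, one value per misalignment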

Fig. 2. Output RMSE in noise-free case. (a) Sample output RMSE with different neuron numbers for different layers, (b) Test output RMSE with different neuron numbers for different layers, (c) Test output RMSE variation with the sample number increase. Note that here the RMSE is dimensionless.

Considering the performance in terms of accuracy and training time, the framework of 5 layers (4 hidden layers) with 60 neurons in each hidden layer was excellent. Therefore, the DNN was subsequently implemented in this framework. Sixty neurons in each layer implies that sixty is the maximum effective number of neurons, because some neurons with zero weight will be inoperative. Figure 2(c) shows the test output RMSE variation with increasing sample number. Notably, more samples are associated with a more accurate test output, unless the DNN falls into a local minimum. The specific sample number should be determined in an actual experiment according to the training performance and the measurement error tolerance.

Next, the training results of the DNN on the sample dataset, with 5 layers and 60 neurons in each hidden layer, are presented in Fig. 3. The total training time is approximately 38 h for 20000 epochs. Figure 3(a) presents the RMSE variation and training time with increasing epochs, and Fig. 3(b) shows the regression analysis of the sample output ($\boldsymbol{\varepsilon}_{\mathrm{s}}$) and the DNN output ($\boldsymbol{\varepsilon}$). The relation between the two outputs can be fit with the line $y = x - 8.4 \times 10^{-6}$, and the correlation coefficient, R, is 0.99996, which implies that the DNN output is highly consistent with the sample output. Figures 3(a) and 3(b) provide the statistical characteristics over all the misalignment values. To evaluate the DNN performance for each type of misalignment, Fig. 3(c) presents the final relative error of the six misalignment estimations. For each type of misalignment, the average relative error is expressed as $\left[ \sum_{k = 1}^{n} \left| (\varepsilon_{jk} - \varepsilon_{\mathrm{s}\_jk}) / \varepsilon_{\mathrm{s}\_jk} \right| \right] / n$. We confirm that on the DNN sample dataset in the noise-free case, the relative errors of all the types of misalignments are less than 0.2%.
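For completeness, a one-line sketch of the average relative error defined above, using the same illustrative Eps / EpsRef matrices as before:

% Average relative error per misalignment type (6 x 1), as defined above.
relErr = mean(abs((Eps - EpsRef) ./ EpsRef), 2);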

Fig. 3. Training result in case of 5 layers with 60 neurons in each hidden layer. (a) The RMSE variation and training time with increasing epochs, (b) The regression of the samples output and DNN output, (c) The average relative errors of the specific six misalignment estimations by the DNN with the sample dataset.

Subsequently, the well-trained DNN was validated on the test dataset. The performance is presented in Fig. 4, where it is compared with the SM method. Specifically, the well-trained DNN was employed to estimate the actual misalignments in the test dataset. Figure 4(a) illustrates the regression analysis of the outputs in the test dataset (test output) and those from the DNN (DNN output). The same regression analysis of the test output and the outputs from the SM method (SM output) is presented in Fig. 4(b). It is notable that the DNN performs excellently for both slight and large misalignment estimations. The SM method performs adequately for slight misalignments (blue shaded area in Fig. 4(b)) but poorly for large ones. This is because the SM method can provide only the linear relation, whereas large misalignments exhibit non-linear relations with the corresponding aberrations. The resulting average relative errors of the specific six misalignment estimations for the DNN and SM method are shown in Figs. 4(c) and 4(d), respectively. The average relative errors of the DNN are smaller than those of the SM method by about one order of magnitude.

Fig. 4. Performances of the DNN and SM method in the test dataset in the noise-free case. (a) The regression analysis of the sample outputs and DNN outputs in the test dataset, (b) The regression analysis of the sample outputs and SM method, (c) The average relative errors of the six misalignment estimations in the DNN method, (d) The average relative errors of the six misalignment estimations in the SM method.

3.2 Noise case

Here, noise has two sources. The first is the noise in the sample dataset, which affects the DNN training accuracy. The other is the noise from the actual experiments in the DNN estimation process. The noise in the sample dataset during training generally arises from modeling errors, because the sample dataset for training originates from the system model. The modeling errors in the interferometer mainly include optical element figure errors, refractive index errors, misalignments of other elements, and Zernike polynomial fitting errors. The corresponding residual wavefront errors are generally low-order aberrations and can be controlled to less than $10^{-3}\lambda$ PV after calibration [16]. To simulate the influence of the modeling errors, errors of this magnitude were added to the sample input in the form of the first four terms of the Zernike coefficients. The noise in the DNN estimation process originates from the experimental system error. In an actual experiment, apart from the inherent error (in a non-null configuration), the aberrations in the tested wavefront are affected not only by the tested surface misalignments and figure error but also by undesired system errors. These errors can generally be removed by pre-calibration [5,9,16]. For the simulation, the residual errors (less than $10^{-3}\lambda$ PV) were also added to the test dataset as the first four terms of the Zernike coefficients. Figure 5 presents the performance of the DNN with different frameworks in the noise case. The training performance, shown in Fig. 5(a), is the same as in the noise-free case presented in Fig. 2(a), because the small noise in the sample dataset has no influence on the network convergence. The DNN with 5 layers still exhibits excellent performance, as in the noise-free case. Figure 5(b) shows the RMSE variation with the neuron number on the test dataset with noise for the case of five layers. We find that the test RMSE exhibits small fluctuations with increasing neuron number, which differs from the noise-free case. The case of 60 neurons still performs relatively well. Figure 5(c) shows that in the noise case, larger sample numbers are associated with increasingly accurate test outputs, with slight fluctuations. In general, the DNN performance in the noise case is the same as that in the noise-free case, with slight fluctuations at different neuron and sample numbers. The framework of 5 layers with 60 neurons in each hidden layer still performs well. Figure 6 presents the performance comparison of the DNN and SM method on the test dataset with noise.
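The sketch below shows one way the simulated noise could be injected at the stated level, adding random low-order errors to the first four Zernike coefficients of the sample and test inputs; the uniform distribution and scaling are assumptions made for illustration.

% Add simulated low-order residual errors (about 1e-3*lambda PV) to the first
% four Zernike coefficients of each sample; the uniform draw is an assumption.
noisePV    = 1e-3;                                               % residual level in waves (PV)
Xs(1:4, :) = Xs(1:4, :) + noisePV*(2*rand(4, size(Xs,2)) - 1);   % sample (training) inputs
Xt(1:4, :) = Xt(1:4, :) + noisePV*(2*rand(4, size(Xt,2)) - 1);   % test inputs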

Fig. 5. Output RMSE in noise case. (a) Sample output RMSE with different neuron numbers for different layers, (b) Test output RMSE with different neuron numbers for five layers, (c) Test output RMSE variation with increasing sample number.

Fig. 6. Performances of the trained DNN and SM method in the test dataset in the noise case. (a) The regression analysis of the sample outputs and DNN outputs in the test dataset, (b) The regression analysis of the sample outputs and SM method, (c) The average relative errors of the six misalignment estimations in the DNN method, (d) The average relative errors of the six misalignment estimations in the SM method.

Figure 6(a) illustrates the regression analysis of the test output and DNN output. The same regression analysis of the test output and the outputs from the SM method (SM output) is presented in Fig. 6(b). The resulting average relative errors of the specific six misalignment estimations for the DNN and SM method are shown in Figs. 6(c) and 6(d), respectively. Compared with the performance in the noise-free case shown in Fig. 4, the misalignment estimation accuracy is reduced by approximately 20% for both the DNN and SM methods. However, the DNN performance is still better than that of the SM method at the same noise level, which is quite noticeable from Fig. 6.

4. Experiment validation

The simulation discussed in Sec. 3 statistically proved the high accuracy of the DNN in freeform surface misalignment estimation. This section describes the experiments performed to validate the DNN-based calibration method in several specific cases. The tested subjects were a paraboloidal surface and a bi-conic surface.

4.1 Paraboloidal surface

A paraboloidal surface was tested in a non-null interferometer, which was introduced in our previous work [14,16]. The aligned interferogram in the experiment is shown in Fig. 7(a), and the corresponding one in the system model is presented in Fig. 7(b). The surface figure error can be extracted by retrace error correction algorithms, such as the reverse optimization reconstruction (ROR) method [38,39] and the theoretical reference wavefront method [40]. The final figure error acquired by the latter is presented in Fig. 7(c), which is basically consistent with the one obtained by a Zygo interferometer with the aberration-free method (Fig. 7(d)). A sample dataset with 10000 misaligned patterns was employed to train the DNN of 5 layers with 60 neurons in each hidden layer. The total training time was approximately 38 h.

Fig. 7. Experiment results of the non-null test for a paraboloidal surface. (a) Aligned non-null interferogram in the experiment, (b) Aligned non-null interferogram in the model, (c) Resulting figure error from (a) and (b), (d) Figure error by the Zygo interferometer, (e) Misaligned interferograms from the experiments, (f) Misalignments estimated by the DNN, (g) Calibrated interferograms in the model with (f), (h) Resulting figure errors after the calibration.

Three other misaligned interferograms from the experiments are presented in Fig. 7(e). The Zernike coefficients of the corresponding wavefronts were imported into the trained DNN, and the resulting outputs are shown in Fig. 7(f). These outputs, the specific misalignment values, were entered into the system model to modify the surface positions and postures. The resulting misaligned interferograms in the system model are shown in Fig. 7(g). Then, the corresponding figure errors were obtained by subtracting the wavefronts of Fig. 7(g) from the corresponding ones of Fig. 7(e). Figure 7(h) presents the three resulting figure errors, which have outlines similar to those in the aligned case and from the Zygo interferometer. Table 1 lists the PV and rms values of the figure errors shown in Fig. 7, demonstrating the accuracy and repeatability of the DNN-based calibration.

Table 1. PV and rms values of the figure error shown in Fig. 7

4.2 Bi-conic surface

A bi-conic surface (52 mm aperture, 244 mm x radius, and 250 mm y radius), which cannot be covered in a simple interferometric configuration, was tested in an adaptive interferometer we previously introduced [9]. This adaptive interferometer can provide a near-null test or a non-null test depending on the capability of the deformable mirror. Therefore, we present the surface misalignment calibration in both cases.

In the adaptive interferometer, the adaptive optics (here, a deformable mirror, DM) can compensate for the misalignment aberrations together with the inherent aberrations through its deformation. Thus, remarkable misaligned null interferograms cannot be observed. However, in a general null interferometer, the compensating optics is static, and misaligned null interferograms are easy to observe. In this section, we discuss this general case: the deformable mirror deformation is fixed in a null test configuration, with its ideal interferogram shown in Fig. 8(a). In this null test configuration, we misaligned the tested surface; the resulting misaligned interferogram is presented in Fig. 8(b), and the final figure error extracted directly from Fig. 8(a), in the aligned case, is shown in Fig. 8(g). A dataset of 100000 samples extracted from the system model was entered into the DNN for training. The total training time was approximately 52 h. Subsequently, the Zernike coefficients characterizing the misaligned interferogram shown in Fig. 8(b) were entered into the trained DNN. The outputs, the six specific misalignment values, were extracted and are presented in Fig. 8(c). These specific misalignment values were then substituted into the system model to modify the posture of the tested surface. The resulting interferogram in the model is presented in Fig. 8(d), which has an outline similar to the one in Fig. 8(b). The final figure error after the DNN-based calibration is shown in Fig. 8(h). For comparison, the misalignments were also estimated by the SM method and are presented in Fig. 8(e). These misalignment values were likewise substituted into the system model to modify the posture of the tested surface, and the resulting interferogram in the model is presented in Fig. 8(f), which differs slightly from the one shown in Fig. 8(b). This indicates incomplete calibration with the SM method. The final figure error after the SM-based calibration is shown in Fig. 8(i). Note that the final figure errors in Figs. 8(h) and 8(i) are rotated accordingly. The PV and rms values listed for comparison in Table 2 show the relatively higher accuracy of the DNN-based calibration.

Table 2. PV and rms values of the figure errors shown in Fig. 8

Fig. 8. Experiment results of the null test for a bi-conic surface. (a) Aligned null interferogram in the experiment, (b) Misaligned interferogram in the experiment, (c) and (e) are the misalignments estimated by the DNN and SM method, respectively. (d) and (f) are the calibrated interferograms in the model with (c) and (e), respectively. (g)–(i) are the resulting surface figure errors in the aligned case, after DNN calibration and SM calibration.

If the adaptive optics cannot cover the inherent aberrations of the test surface but the resulting fringe density is within the capacity of the interferometer, the surface will be tested under a non-null condition, which is a general case in a common partial null interferometer as well as in an adaptive interferometer. We reduced the DM deformation to build a non-null configuration to test the above-mentioned bi-conic surface. The aligned non-null interferogram is shown in Fig. 9(a). We also misaligned the tested surface, and the resulting interferogram is presented in Fig. 9(b). The process executed was the same as that for the above null test. The method proposed for the non-null test in [26] was also implemented for comparison; we denote this virtual interferometer calibration method as the VI method.

Fig. 9. Experiment results of the non-null test for the bi-conic surface. (a) Aligned non-null interferogram in the experiment, (b) Misaligned interferogram in the experiment, (c)–(e) are the misalignments estimated by the DNN, SM method, and VI method, respectively. (f)–(h) are the corresponding calibrated interferograms in the model with (c), (d), and (e), respectively. (i)–(l) are the resulting surface figure errors in the aligned case, after the DNN, SM, and VI calibration, respectively.

Figures 9(c)–9(e) present the specific misalignments estimated by the DNN, SM method, and VI method, respectively. In the general case of freeform surface adjustment, $\theta_z$ and $d_z$ are difficult to estimate in the initial alignment, and thus, relatively large residual values remain. Figures 9(f)–9(h) display the interferograms modified in the model by the three methods, respectively. Figures 9(i)–9(l) show the corresponding figure errors extracted in the aligned case and after the DNN-, SM-, and VI-based calibrations, respectively. The figure errors in Figs. 9(j)–9(l) are rotated according to the $\theta_z$ estimated by the corresponding methods. Figure 9(j) has a map more similar to that of Fig. 9(i), which implies that a higher accuracy is achieved by the DNN-based calibration. For comparison, the PV and rms values are listed in Table 3. From Table 2 and Table 3, we conclude that the DNN-based calibration exhibits a better performance than the SM-based and even the VI-based calibration. Note that the DNN-based calibration accuracy under the non-null condition is lower than that under the null condition, owing to the residual of the retrace error enlarged by the misalignments.

Table 3. PV and rms values of the figure errors shown in Fig. 9

5. Conclusion and expectation

In this study, we propose a DNN-based misalignment calibration method and, for the first time, experimentally validate it in a freeform surface interferometer. The DNN is trained to estimate the specific misalignments with the Zernike coefficients of the tested wavefront as the input. The estimated misalignments are then employed to align the test surface in the model rather than the one in the experimental system. The misaligned simulated wavefront is thus extracted. The final figure error can be reconstructed, together with the misalignment calibration, by subtracting the misaligned simulated wavefront from the tested one. The simulation proves that the DNN exhibits a better performance than the SM method in misalignment estimation, in both the noise-free and noise cases. The high accuracy of the DNN-based misalignment calibration is validated by experiments on paraboloidal and bi-conic surface tests, assisted by comparison with the SM-based method. Note that the sample number for the DNN training for the bi-conic surface is larger than that for the paraboloidal surface, because 10000 samples could not provide a high estimation accuracy in the simulated validation. Future work will investigate the necessary sample, layer, and neuron numbers for different nominal surface figures with different aberration forms, so that a reliable DNN framework can be set up for common freeform surfaces such as off-axis aspheric surfaces and toroidal surfaces.

Funding

National Natural Science Foundation of China (41875158, 61675005, 61705002, 61905001); Natural Science Foundation of Anhui Province (1808085QF198, 1908085QF276); Opening project of the Key Laboratory of Astronomical Optics & Technology in Nanjing Institute of Astronomical Optics & Technology of the Chinese Academy of Sciences (CAS-KLAOT-KF201704); Opening project of the Anhui Province Key Laboratory of Non-Destructive Evaluation (CGHBMWSJC05); Doctoral Start-up Foundation of the Anhui University (J01003208); National Program on Key Research and Development Project of China (2016YFC0301900, 2016YFC0302202).

Acknowledgments

I sincerely thank my girlfriend Yueyu Peng for her company and care.

References

1. X. Zeng, X. Zhang, D. Xue, Z. Zhang, and J. Jiao, “Mapping distortion correction in freeform mirror testing by computer-generated hologram,” Appl. Opt. 57(34), F56–F61 (2018). [CrossRef]  

2. E. Y. B. Pun, S. Hua, R. Zhu, W. H. Wong, X. Zhu, and Z. Gao, “Design and fabrication of computer-generated holograms for testing optical freeform surfaces,” Chin. Opt. Lett. 11(3), 32201–32205 (2013). [CrossRef]  

3. G. Baer, J. Schindler, C. Pruss, J. Siepmann, and W. Osten, “Calibration of a non-null test interferometer for the measurement of aspheres and free-form surfaces,” Opt. Express 22(25), 31200–31211 (2014). [CrossRef]  

4. S. Li, J. Zhang, W. Liu, Z. Guo, H. Li, Z. Yang, B. Liu, A. Tian, and X. Li, “Measurement investigation of an off-axis aspheric surface via a hybrid compensation method,” Appl. Opt. 57(28), 8220–8227 (2018). [CrossRef]  

5. L. Zhang, “Free-form Surface Subaperture Stitching Interferometry,” Doctoral dissertation, Zhejiang University, China, (2016).

6. S. Xue, S. Chen, G. Tie, and Y. Tian, “Adaptive null interferometric test using spatial light modulator for free-form surfaces,” Opt. Express 27(6), 8414–8428 (2019). [CrossRef]  

7. S. Xue, S. Chen, G. Tie, Y. Tian, H. Hu, F. Shi, X. Peng, and X. Xiao, “Flexible interferometric null testing for concave free-form surfaces using a hybrid refractive and diffractive variable null,” Opt. Lett. 44(9), 2294–2297 (2019). [CrossRef]  

8. S. Xue, S. Chen, Z. Fan, and D. Zhai, “Adaptive wavefront interferometry for unknown free-form surfaces,” Opt. Express 26(17), 21910–21928 (2018). [CrossRef]  

9. L. Zhang, S. Zhou, D. Li, Y. Liu, T. He, B. Yu, and J. Li, “Pure adaptive interferometer for free form surfaces metrology,” Opt. Express 26(7), 7888–7898 (2018). [CrossRef]  

10. L. Huang, H. Choi, W. Zhao, L. R. Graves, and D. W. Kim, “Adaptive interferometric null testing for unknown freeform optics metrology,” Opt. Lett. 41(23), 5539–5542 (2016). [CrossRef]  

11. K. Fuerschbach, K. P. Thompson, and J. P. Rolland, “Interferometric measurement of a concave, φ-polynomial, Zernike mirror,” Opt. Lett. 39(1), 18–21 (2014). [CrossRef]  

12. C. Pruss and H. J. Tiziani, “Dynamic null lens for aspheric testing using a membrane mirror,” Opt. Commun. 233(1-3), 15–19 (2004). [CrossRef]  

13. D. Liu, Y. Yang, C. Tian, Y. Luo, and L. Wang, “Practical methods for retrace error correction in nonnull aspheric testing,” Opt. Express 17(9), 7025–7035 (2009). [CrossRef]  

14. S. Tu, L. Dong, Y. Zhou, T. Yan, Y. Yang, Z. Lei, B. Jian, Y. Shen, M. Liang, and H. Wei, “Practical retrace error correction in non-null aspheric testing: A comparison,” Opt. Commun. 383, 378–385 (2017). [CrossRef]  

15. D. Wang, Y. Yang, C. Chen, and Y. Zhuo, “Misalignment aberrations calibration in testing of high-numerical-aperture spherical surfaces,” Appl. Opt. 50(14), 2024–2031 (2011). [CrossRef]  

16. L. Zhang, D. Liu, T. Shi, Y. Yang, and Y. Shen, “Practical and accurate method for aspheric misalignment aberrations calibration in non-null interferometric testing,” Appl. Opt. 52(35), 8501–8511 (2013). [CrossRef]  

17. J. Peng, Y. Yu, and H. Xu, “Compensation of high-order misalignment aberrations in cylindrical interferometry,” Appl. Opt. 53(22), 4947–4956 (2014). [CrossRef]  

18. T. Yang, D. Cheng, and Y. Wang, “Aberration analysis for freeform surface terms overlay on general decentered and tilted optical surfaces,” Opt. Express 26(6), 7751–7770 (2018). [CrossRef]  

19. E. Garbusi and W. Osten, “Perturbation methods in optics: application to the interferometric measurement of surfaces,” J. Opt. Soc. Am. A 26(12), 2538–2549 (2009). [CrossRef]  

20. X. Zhao, W. Jiao, Z. Liao, Y. Wang, and J. Chen, “Study on computer-aided alignment method of a three-mirror off-axis aspherical optical system,” Proc. SPIE 7656, 76566M (2010). [CrossRef]  

21. Y. Kim, H. S. Yang, J. B. Song, S. W. Kim, and Y. W. Lee, “Modeling Alignment Experiment Errors for Improved Computer-Aided Alignment,” J. Opt. Soc. Korea 17(6), 525–532 (2013). [CrossRef]  

22. L. Zheng, D. Xue, and X. Zhang, “Computer-aided alignment for off-axis asphere null test,” Proc. SPIE 5638, 319–323 (2005). [CrossRef]  

23. S. Kim, H. Yang, Y. Lee, and S. Kim, “Merit function regression method for efficient alignment control of two-mirror optical systems,” Opt. Express 15(8), 5059–5068 (2007). [CrossRef]  

24. H. Shen, “Research on key techniques of tilted-wave-interferometer used in the measurement of freeform surfaces,” Doctoral dissertation, Nanjing University of Science and Technology, China, (2014).

25. H. Lee, G. B. Dalton, I. A. J. Tosh, and S. Kim, “Computer-guided alignment II: Optical system alignment using differential wavefront sampling,” Opt. Express 15(23), 15424–15437 (2007). [CrossRef]  

26. Q. Hao, S. Wang, Y. Hu, H. Cheng, M. Chen, and T. Li, “Virtual interferometer calibration method of a non-null interferometer for freeform surface measurements,” Appl. Opt. 55(35), 9992–10001 (2016). [CrossRef]  

27. Q. Tian, C. Lu, B. Liu, L. Zhu, X. Pan, Q. Zhang, L. Yang, F. Tian, and X. Xin, “DNN-based aberration correction in a wavefront sensorless adaptive optics system,” Opt. Express 27(8), 10765–10775 (2019). [CrossRef]  

28. G. Ju, X. Qi, H. Ma, and C. Yan, “Feature-based phase retrieval wavefront sensing approach using machine learning,” Opt. Express 26(24), 31767–31783 (2018). [CrossRef]  

29. H. Guo, N. Korablinova, Q. Ren, and J. Bille, “Wavefront reconstruction with artificial neural networks,” Opt. Express 14(14), 6456–6462 (2006). [CrossRef]  

30. Z. Li and X. Li, “Centroid computation for Shack-Hartmann wavefront sensor in extreme situations based on artificial neural networks,” Opt. Express 26(24), 31675–31692 (2018). [CrossRef]  

31. Y. Nishizaki, M. Valdivia, R. Horisaki, K. Kitaguchi, M. Saito, J. Tanida, and E. Vera, “Deep learning wavefront sensing,” Opt. Express 27(1), 240–251 (2019). [CrossRef]  

32. Y. Yasuno, T. Yatagai, T. F. Wiesendanger, A. K. Ruprecht, and H. J. Tiziani, “Aberration measurement from confocal axial intensity response using neural network,” Opt. Express 10(25), 1451–1457 (2002). [CrossRef]  

33. T. K. Barrett and D. G. Sandler, “Artificial neural network for the determination of Hubble Space Telescope aberration from stellar images,” Appl. Opt. 32(10), 1720–1727 (1993). [CrossRef]  

34. G. Baer, J. Schindler, C. Pruss, and W. Osten, “Correction of Misalignment Introduced Aberration in Non-Null Test Measurements of Free-Form Surfaces,” J. Europ. Opt. Soc. Rap. Public 8(23), 13074 (2013). [CrossRef]  

35. S. Schaal and C. G. Atkeson, “Constructive incremental learning from only local information,” Neural Comput. 10(8), 2047–2084 (1998). [CrossRef]  

36. J. Wang, Q. Hao, H. Yao, S. Wang, T. Li, Y. Tian, and L. Lin, “Convex Aspherical Surface Testing Using Catadioptric Partial Compensating System,” J. Phys.: Conf. Ser. 680, 012036 (2016). [CrossRef]  

37. J. J. Sullivan and J. E. Greivenkamp, “Design of partial nulls for testing of fast aspheric surfaces,” Proc. SPIE 6671, 66710W (2007). [CrossRef]  

38. D. Liu, T. Shi, L. Zhang, Y. Yang, S. Chong, and Y. Shen, “Reverse optimization reconstruction of aspheric figure error in a non-null interferometer,” Appl. Opt. 53(24), 5538–5546 (2014). [CrossRef]  

39. J. E. Greivenkamp and R. O. Gappinger, “Iterative reverse optimization procedure for calibration of aspheric wave-front measurements on a nonnull interferometer,” Appl. Opt. 43(27), 5143–5161 (2004). [CrossRef]  

40. C. Tian, Y. Yang, and Y. Zhuo, “Generalized data reduction approach for aspheric testing in a non-null interferometer,” Appl. Opt. 51(10), 1598–1604 (2012). [CrossRef]  
