Expand this Topic clickable element to expand a topic
Skip to content
Optica Publishing Group

Non-uniform adaptive angular spectrum method and its application to neural network assisted coherent beam combining

Open Access Open Access

Abstract

Increasing the number of laser beams that can be coherently combined requires accurate and fast algorithms for compensating phase and alignment errors. The paper proposes to use a Fully Connected Artificial Neural Network (FCANN) to correct the beam positioning perturbations by evaluating the beam shifts and tilts from two images taken at slightly different locations. Then, since it is practically impossible to have a large enough experimental dataset to train the neural network, this approach required developing an accurate and fast simulation method to evaluate the beam propagation in arbitrary directions, overcoming the limitations occurring when the computation must be repeated a large number of times. The numerical approach is a variant of the Angular Spectrum (AS) method, called Non Uniform ADaptive Angular Spectrum (NUADAS) method, which relies on the combination of non-uniform and adaptive Fourier transform algorithms to allow the computation of an arbitrary field distribution in a plane that is shifted and tilted with respect to the source. The parallel implementation of the NUADAS method is discussed and the numerical and experimental validations are presented. Then, an FCANN is trained using the synthetic dataset generated with the NUADAS method and the results are discussed, demonstrating the viability of the proposed approach not only for coherent beam combing, but also in other beam alignment applications.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

The high power laser market is currently largely dominated by Fiber Lasers (FLs), which in the last decade have taken over shares from all the other competitors - especially from the $\rm {CO}_{2}$ lasers, the standard high power source in early applications -, mainly in the material processing and military segments [1], thanks to the combination of high beam quality, efficiency, power scalability, simplified thermal management, and lower ownership costs [2,3]. Most of these features are intrinsic to the fiber geometry, which is characterized by a long gain medium (the rare-earth-doped active fiber), an efficient pump coupling mechanism and a favorable surface to volume ratio [4,5]. However, a great contribution to the success of FLs comes from the exceptional increase in the performances of the laser diodes used for pumping [6,7]. All this led to a huge growth in the FL power over the years, fueled by the requests of higher production yield and of new military applications [8]. Today, a single semiconductor laser chip can emit in Continuous Wave (CW) up to 20 W; then by exploiting spatial, spectral and polarization multiplexing, a plurality of such chips can be combined to obtain pump modules that are capable of delivering few hundreds of watts in a large mode area fiber [9,10]. In turn, these pump modules allow manufacturing fiber laser modules (also known as “laser engines”) that emit up to more than 2 kW; finally, some of these laser engines can be further combined through fused fiber devices [5] to achieve multi-kilowatt fiber-coupled lasers, with CW output power in the order of 10 kW to 15 kW, although at the expenses of quite a severe degradation in the beam quality due to the use of largely multi-mode output fibers.

This represents today state-of-the-art for non special purpose CW products, but, considering the previous trend, it is not difficult to predict that even higher power levels, especially if associated with higher beam quality, will be required soon. However, the output power of a fiber laser module cannot be arbitrarily scaled up by increasing the pump power due to thermal limitations and the onset of non-linear effects [1113]; moreover, the pump power itself is bounded by the size and the numerical aperture of the pump module delivery fiber and by the active fiber absorption band (relevant in the case of spectral multiplexing). Similarly, for the combination of fiber laser modules, which is also limited by thermal dissipation in the output combiner and by the maximum acceptable Beam Parameter Product (BPP).

The power-beam quality trade-off limitation can be overcome by switching from incoherent to Coherent Beam Combining (CBC) [14,15]. The CBC exploits the constructive interference among a plurality of laser beams and its most appealing feature is that the resulting beam peak intensity scales as $N^{2}$, where $N$ is the number of combined beams, while for any other incoherent multiplexing technique, such as spatial or polarization multiplexing, the peak intensity scales as $N$. Moreover, CBC is particularly attractive for combining of high beam quality fiber laser modules because, ideally, it does not suffer from the degradation typical of incoherent combination techniques. However, to achieve the maximum combination efficiency and thus intensity, the phase difference among the laser beams must be kept as close as possible to zero, which is the reason why the CBC efficiency is extremely sensitive to any kind of phase perturbation, such as those induced by an instantaneous phase drifts or by displacements in the optimal position and direction of the beams [16]. Indeed, it has been reported a decrease of CBC efficiency as large as 10% in the case of a phase perturbation whose average intensity is of the order of $\lambda /4$ at an average frequency of 400 Hz that goes up to 30% when the average intensity becomes $\lambda /2$ [17]

The phase drifts are corrected using Electro-Optic Modulators (EOMs) controlled in closed-loop and the most common approach is through the Parallel Stochastic Gradient Descend (PSGD) algorithm [18]. As for the displacement errors, these are traditionally corrected by shifting some optical components, such as lenses or mirrors, using piezoelectric stages controlled in closed-loop too, usually by the same PSGD algorithm. Such approach, although highly effective, in practical implementations sets a strong limitation to the number of laser sources that can be efficiently multiplexed because the latency of the PSGD algorithm with respect to the phase perturbations increases with the number of elements to be controlled (hence with the number of sources) and, at a certain point, it becomes so high that the EOMs and displacement stages cannot be controlled fast enough to have an adequate phase correction. The solution proposed in this paper is to continue using the PSGD algorithm for the control of the EOMs, but to resort to some other approach to address the problem of the mechanical perturbations, which have been shown to play a major role in reducing the combination efficiency [19].

In more detail, this paper proposes to correct the errors due to mechanical perturbations in real time using an Artificial Neural Network (ANN) to predict the deviation of a beam from its ideal alignment. ANNs have already been used to assist in the assembly of the collimating optical elements inside multi-emitter laser modules [20]: in that case an ANN is exploited to extract some parameters defining the tilts and shifts of a lens from the analysis of some beam images. In this paper a similar approach is extended to find the correction commands for the displacement stages from the analysis of just two beam images. In particular, the developed correction method exploits a Fully Connected Artificial Neural Network (FCANN) [21,22] that takes as input some numerical parameters automatically extracted from the beam images taken at two different locations and infers from them the beam direction with respect to a given reference system and, therefore, the correction for the ideal alignment. This allows the piezoelectric stages to correct the beam position in a single step. The process can then be repeated for each beam to be combined by having several FCNNs running in parallel. This way, the proposed approach allows correcting the propagation direction of each beam independently, reducing the number of elements to be controlled by the PSGD algorithm and thus contributing to increase the number of lasers that can be coherently combined.

In general, simultaneously extracting the beam shift and tilting with respect to the camera orientation from two images is not a trivial task since this can be formulated as the problem of inverting an unknown function given only some measured values. Fortunately, what it is known as the classical formulation of the Universal Approximation Theorem [23] states that a neural network can approximate any well-behaved function with an accuracy that depends only on the neural network free parameters and the extension of the training dataset. This means that to ensure a low enough error for CBC applications, it is essential to have a very large dataset for training the ANN. As it is practically impossible to have such a large dataset of experimental values, the solution is to train the FCANN using a synthetic dataset built from numerical computations; in turn, this requires the accurate simulation of a laser beam propagation in an arbitrary direction with respect to a given reference frame.

The propagation of a generic beam from the source plane to a shifted destination plane can be efficiently computed by means of the Angular Spectrum of Plane Waves (ASPW) [24]. However, if the source plane is also tilted with respect to the destination plane, which is the case of interest, the ASPW becomes cumbersome due to the limitations imposed by the Fast Fourier Transform algorithm [25]. Therefore, to overcome these limitations, a new algorithm named Non Uniform ADaptive Angular Spectrum (NUADAS), which is based on the Non Uniform Fast Fourier Transform (NUFFT) [26] had to be developed.

The remaining of the paper is organized as follows. Section 2 introduces the mathematical details of the proposed NUADAS optical simulation method; Sect. 3 describes its implementation exploiting parallel processing capabilities to allow building a larga dataset in a reasonable time and discusses the results of some numerical validations; then, Section 4 reports some assessment results obtained comparing the simulations with experiments. Section 5 presents the implementation of the neural network and a discussion of the method performance in relation to the FCANN parameters. Finally, Sect. 6 introduces other possible cases in which the NUADAS approach to train FCANN could be useful and then Sect. 7 draws the conclusions.

2. NUADAS: Non-Uniform ADaptive Angular Spectrum propagation method

As already mentioned, training of the FCANN requires computing the propagation from a tilted source plane to a target plane, as depicted in the typical scenario in Fig. 1. The propagation of a laser beam in an arbitrary direction with respect to a target plane with an arbitrary polarization in an homogeneous isotropic dielectric medium is solved in terms of scalar Helmoltz equations, one for each polarization, with assigned beam complex amplitude in the origin of the system [27]:

$$\nabla^{2}g(\vec{r})+\epsilon\mu\omega^{2}\,g(\vec{r})=0$$
where $g(\vec {r})$ stands for the amplitude of the modes composing the beam. As well known, the solution of Eq. (1), given its boundary value $g(\vec {r}_\textrm{i}),\;r_\textrm{i} \in \{\vec {r}\in \cal{R}^{3}|z=z_0\}$, with $z_0=0$ for simplicity, is:
$$\begin{array}{c} g(\vec{r}_\textrm{f})=\displaystyle\int_{\mathcal{R}^{2}}\mathrm{d}f_\textrm{x}\, \mathrm{d}f_\textrm{y}\, \hat{g}(f_\textrm{x},\,f_\textrm{y};\,0)\, {\rm{e}}^{{-}j\,2\pi \,\vec{f}\cdot\vec{r}_\textrm{f}} \\ \hat{g}(\vec{f};0)=\displaystyle\int_{\mathcal{R}^{2}}\mathrm{d}x\,\mathrm{d}y\, g(\vec{r}_\textrm{i}) {\rm{e}}^{j2\pi\vec{f}\cdot\vec{r_\textrm{i}}}\\ \vec{f}=\left(f_\textrm{x},\,f_\textrm{y},\,f_\textrm{z}=\sqrt{\lambda^{{-}2}-f_\textrm{x}^{2}-f_\textrm{y}^{2}}\right)^{T} \end{array}$$
being $f_\textrm{i}$ the i-th spatial frequency, $\vec {r}_\textrm{i}$ and $\vec {r}_\textrm{f}=\mathbf {R}(\vec {r}_\textrm{i}+\vec {d})$ the points belonging to the source and target planes, respectively, and $\vec {d}$ the distance as sketched in Fig. 1. It is immediate to recognize that Eq. (2) requires computing a Fourier transform integral, which relates space and spatial frequency coordinates. In general, the propagation equation can be solved only with numerical approaches because either the boundary conditions are only numerically known or the Fourier transform cannot be analytically computed.

 figure: Fig. 1.

Fig. 1. A practical scenario in which the propagation of a beam emitted by a laser has to be computed for a target plane that is tilted and shifted with respect to the source plane.

Download Full Size | PDF

The commonly used method to numerically solve this problem is the cited ASPW, whose key point is to approximate the continuous Fourier transform in Eq. (2) with a discrete Fourier transform, which can then be computed using the FFT algorithm. Indeed, considering a finite input plane of size $(S_\textrm{x},\,S_\textrm{y})$ such that $g(\vec {r_\textrm{i}})\simeq 0$ at its edges, discretized into $N_\textrm{x} \times N_\textrm{y}$ points (respectively, along $x$ and $y$), one can introduce the discrete Fourier transform (indicating the transformed quantity with the caret accent) as:

$$\hat{g}_\textrm{i}[s,t]=\Delta_\textrm{x}\,\Delta_\textrm{y}\,\textrm{FFT}\left(g_\textrm{i}[l,m]\right)[s,t]$$
with
$$\begin{array}{c} \hat{g}_\textrm{i}[s,t]=\hat{g}\left(s\Delta f_\textrm{x},\,t\Delta f_\textrm{y};\,0\right),\;\;\;g_\textrm{i}[l,m]=g\left(l\Delta_\textrm{x},\,m\Delta_\textrm{y},\,0\right)\\ \Delta f_\textrm{i}=1/L_\textrm{i},\;\;\;\Delta_\textrm{i}=L_\textrm{i}/N_\textrm{i},\;\;\; {\rm{i}}=x,y\\ I_\textrm{K}=\left\{-\displaystyle{\frac{K}{2}},\ldots,\displaystyle{\frac{K}{2}}-1\right\},\;\;\;s,l\in I_\textrm{N{x}},\;\;\;t,m\in I_\textrm{N{y}} \end{array}$$

The field on the target plane is then computed as:

$$\begin{array}{c} g_\textrm{f}[s,t]=\Delta f_\textrm{x}\,\Delta f_\textrm{y}\, \textrm{IFFT}\left(\hat{g}_\textrm{f}[l,m]\right) \\ f_\textrm{z}^{2}[l,m]=\lambda^{{-}2}-(l\,\Delta f_\textrm{x})^{2}-(m\,\Delta f_\textrm{y})^{2}\\ \hat{g}_\textrm{f}[l,m]=\hat{g}_\textrm{i}[l,m]\, {\rm{e}}^{{-}j\,2\pi\, f_\textrm{z}[l,m]\,z} \end{array}$$

If $f_\textrm{z}^{2}[l,s]<0$, the corresponding wave is evanescent and does not give any contribution to the propagation for reasonably long distances and therefore $\hat {g}_\textrm{f}[l,m]$ can be set to zero for these frequencies.

The ASPW method is simple and straightforward to implement but presents some drawbacks that lead to many constraints. The first drawback is the constraint on the dimension, discretization step and number of points on the target plane, which are all automatically fixed by the corresponding values chosen for the input plane description. The second drawback is that the ASPW method, as formulated above, gives good results for near field propagation only. The third and more important drawback is that it only allows propagating the source field to planes that are parallel to the source one. All these limitations can be overcome with the proposed NUADAS method.

As pointed out in Ref. [28], the small accuracy of the ASPW is due to a non proper sampling of the transfer function in the frequency space, $H(f_\textrm{x},\,f_\textrm{y}) = {\rm {e}}^{j\,\phi (f_\textrm{x},\,f_\textrm{y})} = {\rm{e}}^{-j\,2\pi f_\textrm{z}\,z}$. As a matter of fact, $H(f_\textrm{x},\,f_\textrm{y})$ is a chirp function and the Nyquist theory, in order to avoid aliasing errors, requires that

$$\Delta f_\textrm{x}>\left|\displaystyle{\frac{1}{\pi}}\,\displaystyle{\frac{\partial \phi}{\partial f_\textrm{x}}}\right|, \;\;\; \Delta f_\textrm{y}>\left|\displaystyle{\frac{1}{\pi}}\,\displaystyle{\frac{\partial \phi}{\partial f_\textrm{y}}}\right| $$

So, a first improvement is to use the Band Limited Angular Spectrum (BLAS) [28] for which

$$\begin{aligned} &\displaystyle{\frac{f_\textrm{x}^{2}}{f_{\textrm{x}, \textrm{lim}}^{2}}}+\displaystyle{\frac{f_\textrm{y}^{2}}{\lambda^{-2}}} \leq 1,\;\;\;f_{\textrm{x},\textrm{lim}}=\lambda^{-1}\left(\displaystyle{\frac{2z}{S_\textrm{x}}}+1\right)^{-1/2}\\ &\displaystyle{\frac{f_\textrm{y}^{2}}{f_{\textrm{y}, \textrm{lim}}^{2}}}+\displaystyle{\frac{f_\textrm{x}^{2}}{\lambda^{-2}}} \leq 1\;\;\; f_{\textrm{y}, \textrm{lim}}=\lambda^{-1}\left(\displaystyle{\frac{2z}{S_\textrm{y}}}+1\right)^{-1/2} \end{aligned}$$
where $S_\textrm{x},\,S_\textrm{y}$ define the computational window size. Again, since evanescent waves are not taken into account, the complex amplitude associated to any frequency not satisfying Eq. (6) is forced to zero. On the other hand, if Eq. (6) holds, the accuracy of the ASPW method increases considerably also for moderately large propagation distances $z$, although, when $z$ becomes larger than about forty times the size of the computational window, the accuracy decreases again. In fact, when using the FFT algorithm, the fixed computational window size makes $\Delta f_\textrm{x}$ and $\Delta f_\textrm{y}$ to be fixed too, so the number of points satisfying Eq. (6) decreases as $z$ increase. This implies that, when $z$ becomes significantly large, the number of non-zero complex amplitude values becomes too small to guarantee good results. A straightforward solution is to use larger computational windows, but this leads to an increase of the computational time and of the machine requirements. A more convenient solution is to use the Chirp-Z Transform (CZT) [29,30] to avoid the limitations of the FFT. The proper computational window size for a given propagation distance is [31]:
$$S_\textrm{i}=max\left\{\displaystyle{\frac{2\sqrt{2}\,z\,M_\textrm{i}\,\lambda}{\sqrt{-M_\textrm{i}^{2}\,\lambda^{2}+\sqrt{M_\textrm{i}^{4}\,\lambda^{4}+64\,z^{2}\,M_\textrm{i}^{2}\,\lambda^{2}}}}},\,L_\textrm{i}\right\}$$
so that $\Delta f_\textrm{i}=S_\textrm{i}^{-1}$ with $M_\textrm{i}$ being the number of equispaced points needed in the frequency region defined by Eq. (6) and $L_\textrm{i}$ being the minimum computational window size.

It is clear that, since $\Delta _\textrm{i}\,\Delta f_\textrm{i}$ is not necessarily equal to $1/M_\textrm{i}$, a “plain” FFT method cannot be used anymore. The discrete Fourier transform can, anyway, be rewritten in terms of a chirp z-transform as:

$$\begin{array}{c} \hat{g}_\textrm{i}[l,m]=\Delta_\textrm{x}\,\Delta_\textrm{y}\,\sum\limits_\textrm{s,t}\,g_\textrm{i}[s,t] {\rm{e}}^{w\,j\,2\pi\,(\alpha_\textrm{x}\, l\,s+\alpha_\textrm{y}\, m\,t)}=\\=\sum\limits_\textrm{s,t}\,g_\textrm{i}[s,t] {\rm{e}}^{w\,j\,\pi\,(\alpha_\textrm{x}\, s^{2}+\alpha_\textrm{y}\, t^{2})} {\rm{e}}^{{-}w\,j\,\pi(\alpha_\textrm{x}\,(l-s)^{2}+\alpha_\textrm{y}\,(m-t)^{2})}\times \\ \times\, {\rm{e}}^{w\,j\,\pi(\alpha_\textrm{x}\,l^{2}+\alpha_\textrm{y}\,m^{2})}\,\Delta_\textrm{x}\,\Delta_\textrm{y} \end{array}$$

At this point, Eq. (8) can be computed using an FFT, actually an ADAptive Fast Fourier Transform that we therefore call ADAFFT, thanks to the convolution theorem as:

$$\begin{array}{c} {e}^{w\,j\pi\,(\alpha_{x} l^{2}+\alpha_{y}\, m^{2})}\Delta_\textrm{x}\,\Delta_\textrm{y}\,\textrm{IFFT}\left(a[s,t]\,b[s,t] \right)=\\=\Delta_\textrm{x}\,\Delta_\textrm{y}\,\textrm{ADAFFT}(g_\textrm{i}[l,m])\\ a[s,t]=\textrm{FFT}\left( g_\textrm{i}[s,t]\,{\rm{e}}^{w\,j\,\pi\,(\alpha_\textrm{x}\, s^{2}+\alpha_\textrm{y}\, t^{2})}\right)\\ b[s,t]=\textrm{FFT}\left( {\rm{e}}^{{-}w\,j\,\pi\,(\alpha_\textrm{x}\, s^{2}+\alpha_\textrm{y}\, t^{2})}\right)=\\=\displaystyle{\frac{1}{-j\,w\,\sqrt{\alpha_\textrm{x}\,\alpha_\textrm{y}}}}\,{\rm{e}}^{{\frac{j\,\pi}{w}}\left( {\frac{s^{2}}{M_\textrm{x}}}+{\frac{t^{2}}{M_\textrm{y}^{2}\,\alpha_\textrm{y}}}\right)} \end{array}$$
with $\alpha _\textrm{i}=\Delta _\textrm{i}\,\Delta f_\textrm{i}$ and $w\in \{-1,+1\}$ to take in consideration both the direct and inverse Fourier transforms and $s\in I_\textrm{M{x}}, t\in I_\textrm{M{y}},\; M_\textrm{i} \neq N_\textrm{i}$ in general.

It has to be pointed out that the ADAFFT allows overcoming another of the limitations in the ASPW highlighted earlier above and due to use of the “plain” FFT. Indeed, in the ADAFFT the number of points in the Fourier domain, as well as the sampling frequencies $\Delta f_\textrm{i}$, are only limited by the Nyquist theorem

$$M_\textrm{i}\,\Delta f_\textrm{i} \leq \displaystyle{\frac{1}{\Delta_\textrm{i}}}$$
while the algorithm keeps the same computational complexity $O(N\,\log N)$ as the usual FFT.

Finally, it is possible to consider the last constraint limiting the ASPW, namely the parallelism condition between the source and target plane. Considering the propagation of a laser beam that is tilted with respect to a target plane, the field on the source plane, $g_\textrm{f}[l,m]$, can be computed as

$$\begin{array}{c} g_\textrm{f}[l,m]=\Delta f_\textrm{x}\,\Delta f_\textrm{y}\,\sum\limits_\textrm{s,t}\,\hat{g}_\textrm{i}[s,t]\,{\rm{e}}^{{-}j\,2\pi\,{\vec{f}}[s,t]\cdot\mathbf{R}({\vec{d}}+{\vec{r}}_\textrm{i}[l,m])}=\\={\rm{e}}^{{-}j\,2\pi\,{\vec{f}}_0\cdot{\vec{r}}_\textrm{i}[l,m]}\,\Delta f_\textrm{x}\,\Delta f_\textrm{y}\,\sum\limits_\textrm{s,t}\,\hat{g}_\textrm{d}[s,t]\,{\rm{e}}^{{-}j\,2\pi\, \tilde{f}[s,t]\cdot {\vec{r}}_\textrm{i}[l,m]} \end{array}$$
where
$$\begin{array}{c} \hat{g}_\textrm{d}[s,t]={\rm{e}}^{{-}j\,2\pi\,\vec{f}\cdot\mathbf{R}\vec{r}_\textrm{i}},\;\;\; \vec{f}[l,m]=\left(\ l\, \Delta f_\textrm{x},\, \Delta f_\textrm{y},\, f_\textrm{z}[l,m]\right)^{T}\\ \vec{f}_0=(\min(\mathbf{R}\vec{f}[l,m])_\textrm{x},\,\min(\mathbf{R}\vec{f}[l,m])_\textrm{y})^{T} \\ F_\textrm{i}=\left(\max(\mathbf{R}\vec{f}[l,m])- \vec{f}_\textrm{0}\right)_\textrm{i}\,\Delta_\textrm{i}\\ \tilde{f}_\textrm{i}=\left[(\mathbf{R}\vec{f})_\textrm{i}-f_\textrm{0,i}\right]/F_\textrm{i},\;\;\; r_\textrm{i}[l,m]=\left(\displaystyle{\frac{l}{F_\textrm{x}}},\,\displaystyle{\frac{m}{F_\textrm{y}}}\right) \end{array}$$

It is now clear that the complex amplitude $\hat {g}_\textrm{d}[s,t]$ is associated to non equispaced frequency coordinates $\tilde {f}[l,m]$, normalized to the interval $[0,2\pi ]$, due to a rotation of the coordinates in the Fourier space [3234]. In this case, it is necessary to resort to the Non Uniform Fast Fourier Transform (NUFFT) for the evaluation of $g_\textrm{f}[l,m]$ because it allows computing the Fourier transform of a field even in the case of a non uniform sampling, in any or both of the frequency and spatial domains. In order to apply the NUFFT to an unevenly sampled field, firstly a regularization step is required:

$$\begin{array}{c} \eta(f_\textrm{x},\,f_\textrm{y})=\sum\limits_\textrm{l,m}\, \hat{g}_\textrm{d}[l,m]\,\delta(f_\textrm{x}-\tilde{f}_\textrm{x}[l,m])\,\delta(f_\textrm{y}-\tilde{f}_\textrm{y}[l,m])\\ \tilde{\eta}=(\eta*\phi)(f_\textrm{x},\,f_\textrm{y})= \\ \int_{\mathcal{R}^{2}}\mathrm{d}\,f'_\textrm{x}\,\mathrm{d}f'_\textrm{y}\,\eta(f'_\textrm{x},\,f'_\textrm{y})\,\phi_\textrm{x}(f_\textrm{x}-f'_\textrm{x})\,\phi_\textrm{y}(f_\textrm{y}-f'_\textrm{y})=\\=\sum\limits_\textrm{l,m}\, \hat{g}_\textrm{d}[l,m]\,\phi_\textrm{x}(f_\textrm{x}-\tilde{f}_\textrm{x}[l,m])\,\phi_\textrm{y}(f_\textrm{y}-\tilde{f}_\textrm{y}[l,m]) \end{array}$$
with $\phi$ being a regularization function with a compact support in both the Fourier and real spaces; for instance it can be a Gaussian function or an N-order spline. Next, the convolution theorem is applied:
$$\begin{array}{c} \mathcal{F}\{\tilde{\eta}\}=\mathcal{F}\{\eta(f_\textrm{x},f_\textrm{y})\}\,\mathcal{F}\{ \phi(f_\textrm{x})\}\,\mathcal{F}\{\phi(f_\textrm{y})\}\Longrightarrow\\ \mathcal{F}\{\eta\}=\displaystyle{\frac{\mathcal{F}\{\tilde{\eta}\}}{\mathcal{F}\{\phi(f_\textrm{x})\}\,\mathcal{F}\{\phi(f_\textrm{y})\}}}=\displaystyle{\frac{\textrm{ADAFFT}(\tilde{\eta})}{\mathcal{F}\{\phi(f_\textrm{x})\}\,\mathcal{F}\{\phi(f_\textrm{y})\}}} \end{array}$$

The algorithm can be speeded up by noticing that only the discrete Fourier transform of $\tilde {\eta }$ needs to be computed numerically, while an analytically Fourier transformable regularization function can be chosen. Finally, the anti-Fourier transform of $\tilde {g}_\textrm{d}$ can be recovered:

$$\begin{array}{c} g_\textrm{f}[l,m]=\Delta f_\textrm{x}\,\Delta f_\textrm{y}\,{\rm{e}}^{{-}j\,2\pi\,\vec{f}_\textrm{0}\cdot\vec{r}_\textrm{i}[l,m]}\,\displaystyle{\frac{\textrm{ADAFFT}(\tilde{\eta})[l,m]}{\mathcal{F}\{\phi(f_\textrm{x},f_\textrm{y})\}}}=\\=\textrm{NUADAS}\{g_\textrm{i}[s,t]\}[l,m] \end{array}$$
where NUADAS is an acronym that we introduce to indicate a Non-Uniform ADAFFT, so it is basically the combination of applying the NUFFT to the ADAFFT approach.

In conclusion, it is worth noticing that the use of the proposed approach allows controlling the sampling frequency and the number of samples in each domain, making it possible to analyze also complex systems with an acceptable computational complexity and an arbitrary precision.

3. NUADAS: Implementation, scalability considerations and numerical validation

The main purpose of the development of the NUADAS algorithm is to generate a large and accurate dataset to train a neural network, as will be detailed in next sections. However, despite the use of optimized mathematical methods (e.g., chirp z-transform, NUFFT), the simulation of the beam propagation to a plane quite far from the source and tilted of some tens of milliradians requires the use of $2^{12}$ to $2^{13}$ points on both axes. Such a simulation can require just few minutes on a common laptop using a proper code; however, since the number of simulations needed to build a dataset suitable for the training of the FCANN with the required accuracy could easily be larger than $10^{5}$, the amount of time needed to build a single dataset can be about one week, which is too much for a flexible and fast deployment of the method. Accordingly, a parallel and optimized version of the NUADAS algorithm has been implemented in FORTRAN and parallelized by means of the Message Passing Interface (MPI), as explained below.

The FFT in the ADAFFT and NUFFT are performed in parallel by means of the Fastest Fourier Transform in the West (FFTW) library [35]. The computation of adaptive and non-uniform Fourier transforms involves convolution sums, which in turn introduce an aliasing error. This aliasing error is removed by truncating the physical space domain [26,36]. An oversampling factor of two is employed for the convolutions in both ADAFFT and NUFFT, resulting into an oversampling factor of four for the full NUADAS method. As a result, the number of grid points employed to discretize the frequency domain is four times larger than the number of grid points used for the discretization of the physical space.

The computational grids are divided into slabs, which are assigned to the MPI process. Each process can directly access to data corresponding to the portion of domain stored in its memory, while accessing the data corresponding to regions handled by other processors requires MPI communications. An implication of this parallelization for the FFT algorithm is that processes can directly compute the FFT along only one (local) direction, as illustrated in Fig. 2.

 figure: Fig. 2.

Fig. 2. Sketch of the steps for the execution of an oversampled FFT parallelized on two MPI processes. The colored rectangles indicate the portion of domain stored in each processor. The number of points in physical space is 8, the oversampling factor is 2 and the grid points are colored based on a Gaussian centered at the origin. (a) Starting from the physical space configuration, (b) the FFT is performed along the $x$ direction which is local in memory, resulting in the intermediate configuration. Next, (c) processors exchange the data so that direction $y$ is now local in memory. Finally, (d) the FFT is computed along $y$, resulting in the frequency space configuration.

Download Full Size | PDF

In the parallel FFT implementation, after transformation along the local direction (that is $x$ in Fig. 2), the data are transposed in order to make the other direction (i.e., $y$) local to the processor memory, so that the FFT can then be computed along $y$. The transposition of data requires an all-to-all communications, namely a communication from a single process to all the others, which are very time-consuming. In order to avoid further transpositions, data are partitioned and distributed along different directions in the frequency and physical space.

The parallel ADAFFT requires only parallel transforms together with convolution sums, which are local to the processor memory and can be carried out without further communications. The parallelization of the NUFFT is more involved, instead. The initial Cartesian 2D grid on the input plane consists of equispaced frequencies. Due to the tilting of the source plane with respect to the target plane, the frequencies perceived on the target plane are non-equispaced, and their position depends on the equispaced frequencies $f_\textrm{x}$, $f_\textrm{y}$ in the source plane, on the frequency along the direction orthogonal to the source plane, $f_\textrm{z}$, and on the tilting angles between the two planes. The non-equispaced frequencies, generated by tilting of the planes, can be interpreted as particles dispersed in the frequency domain. The computation of the NUFFT reduces to the representation of this discrete and non-equispaced particles on a equispaced grid. This problem arises in various areas of physics and massively parallel implementations of the NUFFT algorithm have been successfully employed, for example, in the numerical simulation of Coulomb interactions in charged particle systems [37] and numerical simulation of particle-laden turbulent flows [38].

In order to represent the non-equispaced frequencies on the target plane, an oversampled equispaced grid on the target plane is initialized and distributed among processes. Non-equispaced frequencies are generated by rotating the frequencies in the source plane and therefore they are initially stored in the memory of the same process. Since the frequencies are transformed by rotation, they need to be redistributed among the MPI processes. Each process computes the equispaced grid point nearest to each non-equispaced frequency and, based on that, it computes the processor to which the frequency should belong. Finally, processors exchange the frequencies values and corresponding amplitudes of the non-equispaced frequencies that need to migrate to other processors. Since the outcome of the rotation is not known a-priori the code allows non-equispaced frequencies to migrate from/to any process by means of all-to-all communications.

Once that the particles are appropriately distributed among the processors, each processor computes the projection of the non-equispaced modes into the equispaced grid independently. Each processor computes the convolution between the superposition of spikes centered at the non-equispaced frequencies with a regularizing basis function. Differently from the non-parallel implementation, the parallel implementation relies on B-splines up to order six. In the framework of convolutions for NUFFT purposes, B-splines are known to offer a number of advantages with respect to other basis functions (e.g. Gaussians) [39]. A B-spline of order $n$ has a support of $n+1$ grid points. When the contribution of the non-equispaced frequencies is spread over the nearest grid points through convolution with the B-spline, the non-equispaced frequency may be close enough to the processor boundary to affect regions of the equispaced grid stored in the neighboring processor. This is illustrated in Fig. 3. As a consequence of this non-locality of convolutions, each MPI process needs to receive from its nearest neighbors the values of the amplitude field on the portion of the equispaced grid that may be affected by non-equispaced frequencies located in the nearest neighbors processes. To this end, a halo region is defined for each process, which is a portion of grid that overlaps with the grid stored by the nearest processors. The same grid points are stored in the memory of both the neighboring processes. The halo regions are sent to the nearest neighbors, so that each process can sum the contributions from its neighbors to the field located in its own memory. With reference to Fig. 3, the halo points are in green and are copies of the corresponding blue grid points, stored in the memory of the neighboring processor. The red point indicates one of the rotated frequencies that belongs to processor $p_0$; processor $p_0$ computes the convolution of the spike centered at the rotated frequency with the B-spline basis function, which is represented as a colormap. Since the rotated frequency is close to the boundary between the processes, it also affects grid points stored in process $p_1$. Therefore, the halo grid points are exchanged by means of MPI communications, so that processor $p_1$ can add the contribution given by the rotated frequency on its portion of domain.

 figure: Fig. 3.

Fig. 3. Sketch of the execution of the convolution required for the parallel NUFFT on two MPI processes, $p_0$ and $p_1$. The domains of the two processes are shifted along $z$ for visualization purposes only, in order to highlight the halo regions, which are the green points. The red point indicates one of the rotated frequencies, which belongs to processor $p_0$, and the colormap represents the convolution of the spike centered at the rotated frequency with the B-spline basis function computed by processor $p_0$.

Download Full Size | PDF

The time required for the computation of the NUADAS on $N_{\textrm {procs}}$ processes is expected to scale as $1/N_{\textrm {procs}}$ in ideal conditions. The scaling of the resulting parallel implementation shows little deviations from ideal scaling with the number of involved processors, as shown in Fig. 4.

 figure: Fig. 4.

Fig. 4. Parallel scaling of the computation time as a function of the number of processes employed. The markers indicate the actual computational times for various sizes of the grid on the source plane with $N=N_\textrm{x}=N_\textrm{y}$, while solid lines refers to the ideal scaling. Due to necessary oversampling, the size of the arrays employed for FFT is $16N^{2}$ and it ranges from $2048^{2}$ to $65536^{2}$.

Download Full Size | PDF

The number of points on the resolved grid size considered in Fig. 4 is $N_\textrm{x}=N_\textrm{y}=N$, with an oversampling factor of four (required for the NUADAS algorithm). The actual size of the arrays to which FFT is applied therefore ranges from $2048^{2}$ to $65536^{2}$. It is then clear that, even if an extremely large number of points are required for an accurate simulation, the parallelized code can effectively do it in few seconds. The numerical accuracy of the parallelized NUADAS has been estimated computing the signal to noise ratio (SNR) with respect to the computation of the propagated beam by direct, and inefficient, computation of the inverse Fourier transform of the propagated spectrum as shown in Eq. (2), defined as

$$SNR ={-}10\textrm{ log}_{10}\displaystyle{\frac{\displaystyle\int_\textrm{A}|E_\textrm{NUADAS}(x,y,z)-E_\textrm{direct}(x,y,z)|^{2}}{\displaystyle\int_\textrm{A}|E_\textrm{NUADAS}(x,y,z)|^{2}}}$$
in different conditions of propagation distance and tilting angle. The result is reported in Figs. 57; as it can be seen, the signal to noise ratio is positive and very large, indicating that the parallel NUADAS method is accurate and reliable, opening thus for its use for a quick generation of the synthetic dataset.

 figure: Fig. 5.

Fig. 5. Signal to noise ratio as function of the propagation distance, normalized to the computational window size, with $\theta _\textrm{x} = \theta _\textrm{y} = \theta _\textrm{z} = 0$ , where $\theta _\textrm{i}$ is the rotation angle about the $i$th axis in the reference frame depicted in Fig. 1.

Download Full Size | PDF

 figure: Fig. 6.

Fig. 6. Signal to noise ratio as function of $\theta _\textrm{x}$, normalized to $\frac {\pi }{2}$, with $\theta _\textrm{y} = \theta _\textrm{z} = 0$ and $d_\textrm{z} = W$, where $d_\textrm{z}$ is the propagation distance along the $z$ axis, $W$ is the initial computational windows size, and $\theta _\textrm{i}$ has the meaning defined in Fig. 5.

Download Full Size | PDF

 figure: Fig. 7.

Fig. 7. Signal to noise ratio as function of $\theta _\textrm{z}$, normalized to $\frac {\pi }{2}$, with $\theta _\textrm{x} = \theta _\textrm{y} = 0$ and $d_\textrm{z} = W$, where all the symbols have the meaning defined in Fig. 6.

Download Full Size | PDF

4. Experimental validation

Given the good indications provided by the numerical simulation tests, the NUADAS method has been validated against experimental results to asses its applicability in practical cases and its accuracy in non ideal conditions. The setup consists of a camera and a laser beam that can be tilted and translated with respect to the camera normal. Tests with different lasers sources available in the lab have been conducted: here, as an example, the results obtained with a 450 nm diode chip, emitting a collimated elliptical beam, mounted on a movement that allows translations in the $x$ and $z$ direction and rotations around the $z$ axis are reported. The translation stages have a resolution of ${10}\;\mu \textrm {m}$, while the rotation stage of ${0.15}^{\circ }$. For this setup, the camera is based on a high resolution CMOS sensor with a sensitive area of 11.34 mm $\times$ 7.13 mm and with 1936 $\times$ 1216 pixels. Considering that a pixel to pixel comparison would be of little use due to the acquisition noise, the experimental validation has been done by comparing the measure of the centroid $(\bar {x}_\textrm{c},\bar {y}_\textrm{c})^{T}=\vec {r}_\textrm{c}$ of the image taken by the camera, defined as

$$\begin{array}{c} Z=\displaystyle\int_\textrm{CW} \mathrm{d}x\,\mathrm{d}y\, ||g(x,y)||^{2},\\ \vec{r}_\textrm{c}=\displaystyle{\frac{1}{Z}}\displaystyle\int_\textrm{CW} \mathrm{d}x\,\mathrm{d}y\; \vec{r}\,||g(x,y)||^{2} \end{array}$$
where CW is the computational windows that coincide with the size the camera active area, with the centroid computed from the simulations. The differences found between the experimental data and the predictions are very small and easily explained as due to experimental and systematic errors, as it is clear from the results reported in Fig. 8 and in Fig. 9, where, as an example, the error in the $x$ component of the image centroid is shown for translations of the source along the $x$ axis and for rotations around the $z$ axis, respectively.

 figure: Fig. 8.

Fig. 8. Error in the $x$ component of the image centroid for source rotations around the $x$ axis.

Download Full Size | PDF

 figure: Fig. 9.

Fig. 9. Error in the $x$ component of the image centroid for source rotations around the $z$ axis.

Download Full Size | PDF

5. FCANN implementation and performances

Following the successful validation of the proposed numerical method to build the training dataset, an FCANN has been implemented and its performance evaluated.

As discussed in Sect. 1, in order to reduce the overall computational load of the SPGD, the tilting and positioning error of each beam must be independently corrected and this can be implemented exploiting a FCANN to analyze two images of each beam taken at different propagation distances. A possible practical implementation of such approach is depicted in Fig. 10.

The simplest way to control the propagation direction of the beams is to have each of them reflected by a motorized mirror. Each reflected beam then propagates through a couple of polarization insensitive beam splitters, used as beam samplers, aligned along the optical path. The small fraction of the reflected beam is imaged into a high speed and high resolution camera. The two beam splitters for each beam are positioned at a slightly different distance along the propagation direction. A tilted source when imaged into a camera is not very different from a shifted source, especially in ordinary operations for a CBC system in which the shift is in the order of fractions of millimeters and the tilt of few milliradians (see also Fig. 13). In these conditions, determining if the source is tilted or shifted, left alone quantifying the tilt or the shift, is extremely difficult. However, this problem can be solved by using two images of the same beam at two distances along the optical path. Intuitively, if the beam is simply shifted, the centroid, at different distances, does not change, while the beam waist changes. If instead the beam is tilted compared to the camera input plane, not only the standard deviation but also the centroid, the kurtosis $\gamma$ and the skewness $\kappa$ of the images, defined in Eq. (18), change.

$$\begin{aligned} \gamma_\textrm{i}&=\displaystyle{\frac{1}{Z}}\displaystyle\int_\textrm{CW}\mathrm{d}x\mathrm{d}y\;(r_\textrm{i}-r_\textrm{c,i})^{3}||g(x,y)||^{2}\\ \kappa_\textrm{i}&=\displaystyle{\frac{1}{Z}}\displaystyle\int_\textrm{CW}\mathrm{d}x\mathrm{d}y\;(r_\textrm{i}-r_\textrm{c,i})^{4}||g(x,y)||^{2} \end{aligned}$$
with $Z$ defined in Eq. (17). This justifies the need for two beam splitters and two cameras for each beam to be coherently combined.

To test the potential of this approach, an FCANN has been trained on a synthetic dataset built using the parallel version of the NUADAS algorithm detailed in previous sections. Considering the available equipment for future experimental implementations, a relatively reduced power version of fiber laser sources has been considered. The dataset has been build simulating the propagation of a tilted and shifted Gaussian beam at 1070 nm with $\sigma _\textrm{x} = {70}\;\mu \textrm {m}$ and $\sigma _\textrm{y} = {100}\;\mu \textrm {m}$ and storing the field centroid, the standard deviation, the kurtosis and the skewness along $x$ and $y$ a distance of 50 mm and 100 mm. The source tilt has been restricted to rotations around the $x$ and $y$ axis for a random angle in the range ${-10}$ mrad to 10 mrad, while the shift has been considered only for $x$ and $y$ (leaving the $z$ shift to be dealt together with phase drifts by an EOM) both randomly selected in the ${-0.15}$ mm to 0.15 mm. In total, the built dataset is composed by 160 000 simulations, each of a source randomly shifted and titled with respect to the target plane; 5% of the dataset has been used as a validation set and the other 95% as a training set. The neural network free parameters are the number of hidden layers, the number of neurons in each hidden layer and the activation function, which is assumed to be the same for all the neurons for simplicity. In general, those parameters are chosen to minimize the neural network error on the validation set; however, for the current application there is another constraint: the neural network speed. Indeed, if used to align the beam in a CBC setup, the neural network response must be faster than the period of beam displacements caused by the mechanical perturbation. This constraint rules out all the most complicate neural network architectures, such as the convolutional one, which are also the most difficult to train, leaving with the used FCANN. Hence, with the goal of a FCANN as simple and fast as possible, and given the excellent agreement between the NUADAS result and the experimental one, it has been chosen a FCANN with only one hidden layer of 8 neurons densely connected with the input made of 16 neurons and the output layer composed of 4 neurons; the input and hidden layer neurons use as activation function the ReLu [40], which has been reported to accelerate the accuracy convergence rate of neural networks by avoiding the vanishing gradient problem [41], and the output neurons a linear function. The 16 inputs are centroid, standard deviation, kurtosis and skewness along $x$ and $y$ for both images. The output neurons are the source tilting angle around the $x$ and $y$ axes normalized to 1 mm and the source shift along the $x$ and $y$ axes normalized to 1 mm.

Figure 11 reports the Mean Absolute Error (MAE) versus the number of training epochs for the implemented FCANN. The MAE on the validation dataset is defined as:

$$L_\textrm{j} = \displaystyle{\frac{1}{N_\textrm{testset}}}\sum\limits_\textrm{i=1}^{N_\textrm{testset}}\,|y_\textrm{i} - \bar{y}_\textrm{i}|$$
where $L_\textrm{j}$ is the loss on the validation dataset after the $j$th training epoch and $\bar {y}_\textrm{i}$ is the neural network prediction on the $i$th input of the test set. As it can be seen in Fig. 11, the implemented FCANN has excellent performances even if its structure is very simple: indeed, after 150 epoch of training the average MAE is in the order of $5 \times 10^{-4}$ with a standard deviation of about $7.5 \times 10^{-4}$. The FCANN output is a vector whose entries are the shift and angular errors; so, assuming to consider them separately, the estimated error differs from the actual error by about ${1}\;\mu \textrm {m}$ for a shift along a single direction and by about $1 \times 10^{-3}$ rad for a tilt around a single axis. Then, as the beam needs to be aligned with respect to a combining element– either a focusing lens or a volume holographic grating –a residual angular error corresponds, in first approximation, to an error of the beam position on the combining element, which can be estimated as $d\,\tan {\alpha _\textrm{res}}$, where $d$ is the distance between the sources and the combining element and $\alpha _\textrm{res}$ is a residual tilt error of the source. Therefore, besides for the small value of $\alpha _\textrm{res}$, the impact of the FCANN estimation error on the angular misalignment can be further reduced by making $d$ as small as possible. Moreover, as illustrated in Fig. 12, the CBC efficiency shows a non negligible decrease only for quite large shift errors $\Delta r$ normalized to the beam waist $w_{0}$, far greater than the FCANN worst case normalized residual error, demonstrating that the neural network prediction are accurate enough to correct the tilting and positioning errors during the CBC process. Finally, it has to be considered that the proposed approach with the FCANN is capable of correcting errors with a more than suitable frequency (as compared with the data reported in [17]). Indeed, the execution time on a common high performance laptop is about 1.2 ms, so close to a frequency of 1 kHz on a non specialized hardware. Clearly, the frequency performance could further be increased by using a specialized hardware such as an FPGA [42].

 figure: Fig. 10.

Fig. 10. A possible implementation of the mechanical perturbation control setup. The blue lines represent the beams, while the red elements are the cameras and the grey blocks in front of the cameras the beam splitters (or samplers); in the back there are mirrors mounted on motorized stages and/or gimbal mounts.

Download Full Size | PDF

 figure: Fig. 11.

Fig. 11. Mean absolute error versus the number of training epochs. The blue curves refer to the case of a FCANN composed of only one hidden layer of 8 neurons and the orange curves to the FCANN composed of 2 hidden layers, having 12 and 8 neurons, respectively.

Download Full Size | PDF

In principle, the error could be further reduced by increasing the complexity of the neural network. However, in the considered case, unless a very complex, hence time-consuming, configuration is considered, very little improvements occur. For example, another FCANN with two fully connected hidden layers (16 neurons in the first hidden layer and 8 neurons in the second, all using a ReLu activation function) provided a level of accuracy similar to that of the simpler FCANN (Fig. 11), but with almost half the maximum frequency (execution time is 1.6 ms, leading to a frequency of 0.625 kHz), therefore not justifying its use.

6. Other possible applications

The results reported in the previous sections demonstrate on the one hand that the developed NUADAS method allows computing the propagation of a laser beam with a very high accuracy and speed also in presence of tilts and shifts and therefore it can be taken as a reference for simulating real optical systems; on the other hand that an approach based on the analysis of beam images with a FCANN can be used as a substitute of the common active alignment procedures in the assembly of optical elements, with a substantial reduction of the required time, confirming and extending the results reported in Ref. [20].

As an example of the first application, Figs. 13 and 14 show the propagation of a Gaussian beam for two different shift-tilt cases. In Fig. 13 the physical space window has size $1024\lambda \times 1024\lambda$, where $\lambda$ is the wavelength, and the distance between the target planes is $512\lambda$. The source is a Gaussian beam with $\sigma _\textrm{x} = 64\lambda$ and $\sigma _\textrm{y}= 32\lambda$. The source plane is tilted by $\theta _\textrm{x} =-\pi /50$ and $\theta _\textrm{z}=\pi /6$.

 figure: Fig. 12.

Fig. 12. CBC efficiency $\eta _\textrm{CBC}$ as function of the shift error $\Delta r$ normalized to the beam waist $w_{0}$. The CBC efficiency is computed as the ratio between the computed maximum intensity and the maximum theoretical intensity.

Download Full Size | PDF

 figure: Fig. 13.

Fig. 13. Propagation of a Gaussian beam from a tilted source plane to equispaced target planes.

Download Full Size | PDF

In Fig. 14, the input Gaussian beam is the same as before, but the distance between the target planes is $1024\lambda$ and they have a relative tilting of $\pi /10$.

 figure: Fig. 14.

Fig. 14. Another example of propagation of a Gaussian beam in equispaced and tilted target planes.

Download Full Size | PDF

The NUADAS method can then be modified to take into account also the interfaces with different dielectrics to be used also to simulate misaligned lenses or other optical components, as the propagation through an arbitrary tilted and shifted lens can be broken down into three smaller problems: the propagation from the source to the elements input plane, which may be arbitrarily tilted and shifted with respect to the source plane, the propagation through the lens itself and, finally, the propagation from the lens output plane to the target plane, which, with respect to the lens output plane, could also be arbitrarily tilted and shifted. This would allow building accurate datasets for training FCANNs to assist in the automatic assembly of optical devices. Indeed, the alignment of lenses and of other optical elements is well known to be one of the most time consuming steps in the packaging of optoelectronic devices [43,44]. This is particularly critical when several components have to be aligned, such as in high power multi-emitter laser diode modules, in which, for example, tens of lenses and mirrors have to be accurately positioned with sub-micrometric accuracy [9,45]. A multi-emitter laser diode module is a device composed of a number of single semiconductor emitters, called chips, whose output beams are combined inside a common package exploiting various multiplexing levels (usually spatial multiplexing first, followed by polarization and/or wavelength multiplexing) implemented with free-space techniques and therefore making use of micro-optics. In most of these modules the combined beams are then focused onto a collecting fiber. Today, the assembly of these devices is typically carried out with automated machines, which make use of active alignment procedures to optimally place the micro-optical components. Active alignment requires turning-on the laser source and then optimizing its coupling by raster scanning the position of the lens (or of the optical component, in general) while continuously monitoring the transmitted power or the beam profile. This is clearly a long process that takes several minutes for each component to be carried out, for an overall assembly time that can easily exceed a couple of hours for the higher power modules since these are composed of about 20 chips, each with its associated set of lenses and mirrors. As these high-power multi-emitter modules are the key component of industrial lasers and their cost accounts for almost 50% of the overall cost of the laser source, it is evident that any improvement in their assembly procedure has a huge economical impact. The use of an artificial neural network to correctly position in a single step the Fast Axis Collimating (FAC) lenses of laser chips has already been proved to be effective in comparison with active alignment procedures [20,46]. However, in that case the training set was based on simplified Gaussian beam propagation model (the “five-ray approach”) [46,47] and limitations applied. For instance, it was shown that the ANN is able to infer the shift error and the tilt error about the fast axis direction with an accuracy down to the micrometer; however, in reality, the lens position is affected also by shifts and tilt about the slow axis. To do so, the ANN must be trained on a larger dataset, that would include also these kind of errors, which could not be taken into account by the five rays model previously employed, hence the need of a different, more accurate, numerical method, such as the NUADAS.

Finally, another possible field of application of the NUADAS is in the simulation of non linear wave propagation, as in the case of high power laser propagation subject to self-focusing effects. In fact, the use of the CZT and the BLAS may be used to increase the numerical accuracy of the Split Step Fourier Method (SSFM), which is one the most common approaches for such analyses, leading to a decrease of the computation burden and thus an increase of the algorithm performance.

7. Conclusions

A new approach to increase the number of lasers that can be coherently combined by reducing the number of parameters to be optimized in real time by the commonly adopted SPGD algorithm has been proposed. This approach is based on the use of an FCANN, trained on a synthetic dataset, to correct the tilting and pointing errors induced by mechanical perturbations by analyzing just two images of the beam taken at two locations. In turn, training of the neural network has required developing a new approach to simulate the propagation in non guiding media of beams with arbitrary tilts and shifts with respect to a reference plane. The numerical method, called NUADAS, exploits a combination of the chirp z-transform and the non-uniform FFT to overcome the limitations introduced by the FFT in the angular spectrum method, which represents the usual approach for free-space beam propagation problems. The parallel implementation of the NUADAS method has been discussed and its performance assessed with both numerical and experimental validations. The capability of the FCANN-based approach to correct misalignment errors has been tested with numerical simulations and the obtained results demonstrated the extremely large accuracy that can be achieved, even with a simple ANN configuration, confirming the potentiality of this approach to increase the efficiency of the CBC. The working frequency of the FCANN has proved to be adeguate for the intended application even on a common notebook; however, further improvements in the response speed are possible by implementing the ANN on specialized hardware such as with FPGAs. Finally, the good performance of the NUADAS algorithm opens the path not only to the propagation of tilted and shifted beams though complex optical systems, with application, for example, in the improvement of the misalignment correction capabilities of automated micro-optic assembly machines, but also to the improvement of non linear wave propagation simulation thanks to the application of the CZT and the BLAS to the SSFM. Further efforts will be also devoted to the to investigate the stability of the algorithm in more realistic situations, taking into account images noise and limited pixel dynamics, to a further optimization of the neural network operation frequency through free parameters tuning, such as the ANN architecture and the activation function.

Acknowledgments

The authors thank Prof. Michele Iovieno for fruitful discussions and acknowledge the computer resources provided by La-Palma Supercomputer at the Instituto de Astrofisica de Canarias through the Red Española de Supercomputación (project FI-2018-1-0044) and by CINECA under the ISCRA initiative, project HP10CWVISL; additional resources provided by HPC@POLITO (http://www.hpc.polito.it).

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. The Worldwide Market for Lasers: Market Review and Forecast 2020 (Strategies Unlimited, 2019).

2. A. Galvanauskas, “High power fiber lasers,” Opt. Photonics News 15(7), 42–47 (2004). [CrossRef]  

3. C. Jauregui, J. Limpert, and A. Tünnermann, “Prospects in power scaling of fiber lasers and amplifiers,” in CLEO Pacific Rim Conference 2018, (Optical Society of America, 2018), p. Tu2L.1.

4. M. Zervas and C. Codemard, “High-power fiber lasers: A review,” IEEE J. Select. Topics Quantum Electron. 20(5), 219–241 (2014). [CrossRef]  

5. A. Braglia, A. Califano, Y. Liu, and G. Perrone, “Architectures and components for high power cw fiber lasers,” Int. J. Mod. Phys. B 28(12), 1442001 (2014). [CrossRef]  

6. M. Kanskar, L. Bao, Z. Chen, M. DeVito, W. Dong, M. P. Grimshaw, X. Guan, D. M. Hemenway, R. Martinsen, J. Zhang, and S. Zhang, “High-temperature diode laser pumps for directed energy fiber lasers (conference presentation),” Proc. SPIE 10192, 101920F (2017). [CrossRef]  

7. R. Paoletti, C. Coriasso, G. Meneghini, F. Gaziano, P. Gotta, S. Codato, A. Maina, G. Morello, P. D. Melchiorre, G. Pippione, E. Riva, M. Rosso, A. Stano, and M. Gattiglio, “Wavelength-stabilized DBR high-power diode laser,” J. Phys. Photonics 2(1), 014010 (2020). [CrossRef]  

8. J. Hecht, “High-power fiber lasers,” Opt. Photonics News 29(10), 30–37 (2018). [CrossRef]  

9. M. Hemenway, W. Urbanek, D. Dawson, M. Wang, L. Bao, M. Kanskar, M. DeVito, D. Kliner, and R. Martinsen, “Advances in high brightness fiber-coupled laser modules for pumping multi-kW CW fiber lasers,” Proc. SPIE 10086, 1008605 (2017). [CrossRef]  

10. A. Mirigaldi, V. Serafini, P. Gotta, G. Pippione, C. Coriasso, R. Paoletti, and G. Perrone, “Power scaling of laser diode modules using high-power DBR chips,” Proc. SPIE 11262, 112620W (2020). [CrossRef]  

11. Y. Jaouën, G. Canat, S. Grot, and S. Bordais, “Power limitation induced by nonlinear effects in pulsed high-power fiber amplifiers,” C. R. Phys. 7(2), 163–169 (2006). [CrossRef]  

12. D. J. Richardson, J. Nilsson, and W. A. Clarkson, “High power fiber lasers: current status and future perspectives,” J. Opt. Soc. Am. B 27(11), B63–B92 (2010). [CrossRef]  

13. W.-W. Ke, X.-J. Wang, X.-F. Bao, and X.-J. Shu, “Thermally induced mode distortion and its limit to power scaling of fiber lasers,” Opt. Express 21(12), 14272–14281 (2013). [CrossRef]  

14. A. Brignon, ed. Coherent Laser Beam Combing (Wiley-VCH, 2013).

15. Z. Liu, X. Jin, R. Su, P. Ma, and P. Zhou, “Development status of high power fiber lasers and their coherent beam combination,” Sci. China Inf. Sci. 62(4), 41301 (2019). [CrossRef]  

16. Z. Huang, X. Tang, D. Zhang, X. Wang, Q. Hu, J. Li, and C. Liu, “Coherent beam combination of ten fiber arrays via stochastic parallel gradient descent algorithm,” J. Opt. Technol. 82(1), 16–20 (2015). [CrossRef]  

17. P. Ma, P. Zhou, X. Wang, Y. Ma, R. Su, and Z. Liu, “Influence of perturbative phase noise on active coherent polarization beam combining system,” Opt. Express 21(24), 29666–29678 (2013). [CrossRef]  

18. C. L. Linslal, M. S. Sooraj, A. Padmanabhan, D. Venkitesh, and B. Srinivasan, “Implementation of stochastic parallel gradient descent algorithm for coherent beam combining,” Proc. SPIE 10811, 1081115 (2018). [CrossRef]  

19. H. Chang, Q. Chang, J. Xi, T. Hou, R. Su, P. Ma, J. Wu, C. Li, M. Jiang, Y. Ma, and P. Zhou, “First experimental demonstration of coherent beam combining of more than 100 beams,” Photonics Res. 8(12), 1943–1948 (2020). [CrossRef]  

20. H. Yu, G. Rossi, A. Braglia, and G. Perrone, “Artificial neural network assisted laser chip collimator assembly and impact on multi-emitter module beam parameter product,” Proc. SPIE 10085, 1008508 (2017). [CrossRef]  

21. M. Meireles, P. Almeida, and M. Simoes, “A comprehensive review for industrial applicability of artificial neural networks,” IEEE Trans. Ind. Electron. 50(3), 585–601 (2003). [CrossRef]  

22. L. Caballero, M. Jojoa, and W. S. Percybrooks, “Optimized neural networks in industrial data analysis,” SN Appl. Sci. 2(2), 300 (2020). [CrossRef]  

23. V. Kůrková, “Kolmogorov’s theorem and multilayer neural networks,” Neural Networks 5(3), 501–506 (1992). [CrossRef]  

24. A. Talatinian and M. Pluta, “Propagation of a fundamental laser mode and its numerical simulation by the angular spectrum technique,” Optik 127(8), 3882–3887 (2016). [CrossRef]  

25. J. W. Cooley and J. W. Tukey, “Algorithm for the machine calculation of complex Fourier series,” Math. Comp. 19(90), 297 (1965). [CrossRef]  

26. L. Greengard and J.-Y. Lee, “Accelerating the nonuniform fast Fourier transform,” SIAM Rev. 46(3), 443–454 (2004). [CrossRef]  

27. J. Goodman, Introduction to Fourier optics (’McGraw-Hill’, 1996), 2nd ed.

28. K. Matsushima and T. Shimobaba, “Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields,” Opt. Express 17(22), 19662–19673 (2009). [CrossRef]  

29. L. R. Rabiner, R. W. Schafer, and C. M. Rader, “Chirp Z-transform algorithm,” IEEE Trans. Audio Electroacoust. 17(2), 86–92 (1969). [CrossRef]  

30. L. R. Rabiner, R. W. Schafer, and C. M. Rader, “The chirp z-transform algorithm and its application,” Bell Syst. Tech. J. 48(5), 1249–1292 (1969). [CrossRef]  

31. X. Yu, T. Xiahui, Q. Xiong, P. Hao, and W. Wei, “Wide-window angular spectrum method for diffraction propagation in far and near field,” Opt. Lett. 37(23), 4943–4945 (2012). [CrossRef]  

32. K. Matsushima, H. Schimmel, and F. Wyrowski, “Fast calculation method for optical diffraction on tilted planes by use of the angular spectrum of plane waves,” J. Opt. Soc. Am. A 20(9), 1755–1762 (2003). [CrossRef]  

33. C. Chang, J. Xia, J. Wu, W. Lei, Y. Xie, M. Kang, and Q. Zhang, “Scaled diffraction calculation between tilted planes using nonuniform fast Fourier transform,” Opt. Express 22(14), 17331–17340 (2014). [CrossRef]  

34. Y.-H. Kim, G. H. Ryu, H.-Y. Chu, and C.-S. Hwang, “Exact light propagation between rotated planes using non-uniform sampling and angular spectrum method,” Opt. Commun. 344, 1–6 (2015). [CrossRef]  

35. M. Frigo and S. G. Johnson, “The design and implementation of FFTW3,” Proc. IEEE 93(2), 216–231 (2005). [CrossRef]  

36. C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang, Spectral methods in Fluid Mechanics (Springer, 1988).

37. M. Pippig and D. Potts, “Parallel three-dimensional nonequispaced fast Fourier transforms and their application to particle simulation,” SIAM J. Sci. Comput. 35(4), C411–C437 (2013). [CrossRef]  

38. M. Carbone, A. D. Bragg, and M. Iovieno, “Multiscale fluid–particle thermal interaction in isotropic turbulence,” J. Fluid Mech. 881, 679–721 (2019). [CrossRef]  

39. G. Beylkin, “On the Fast Fourier Transform of functions with singularities,” Appl. Comput. Harmon. Analysis 2(4), 363–381 (1995). [CrossRef]  

40. A. D. Rasamoelina, F. Adjailia, and P. Sinčák, “A review of activation function for artificial neural network,” in 2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI), (2020), pp. 281–286.

41. G. Lin and W. Shen, “Research on convolutional neural network based on improved relu piecewise activation function,” Procedia Computer Science 131, 977–984 (2018). [CrossRef]  

42. Y. Tu, S. Sadiq, Y. Tao, M. Shyu, and S. Chen, “A power efficient neural network implementation on heterogeneous fpga and gpu devices,” in 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI), (2019), pp. 193–199.

43. T. Tekin, “Review of packaging of optoelectronic, photonic, and MEMS components,” IEEE J. Select. Topics Quantum Electron. 17(3), 704–719 (2011). [CrossRef]  

44. A. Grobnic, “Challenges in the optical alignment of optoelectronic components,” Proc. SPIE 10313, 103134Z (2017). [CrossRef]  

45. H. Yu, Y. Liu, A. Braglia, G. Rossi, and G. Perrone, “Investigation of collimating and focusing lenses’ impact on laser diode stack beam parameter product,” Appl. Opt. 54(34), 10240–10248 (2015). [CrossRef]  

46. H. Yu, G. Rossi, A. Braglia, and G. Perrone, “Application of Gaussian beam ray-equivalent model and back-propagation artificial neural network in laser diode fast axis collimator assembly,” Appl. Opt. 55(23), 6530–6537 (2016). [CrossRef]  

47. H. Yu, G. Rossi, Y. Liu, A. Braglia, and G. Perrone, “Gaussian beam optics model for multi-emitter laser diode module configuration design,” in Advances in Neural Networks, vol. 54 of Smart Innovation, Systems and TechnologiesS. Bassis, A. Esposito, F. C. Morabito, and E. Pasero, eds. (Springer, Cham, 2016), pp. 457–464.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (14)

Fig. 1.
Fig. 1. A practical scenario in which the propagation of a beam emitted by a laser has to be computed for a target plane that is tilted and shifted with respect to the source plane.
Fig. 2.
Fig. 2. Sketch of the steps for the execution of an oversampled FFT parallelized on two MPI processes. The colored rectangles indicate the portion of domain stored in each processor. The number of points in physical space is 8, the oversampling factor is 2 and the grid points are colored based on a Gaussian centered at the origin. (a) Starting from the physical space configuration, (b) the FFT is performed along the $x$ direction which is local in memory, resulting in the intermediate configuration. Next, (c) processors exchange the data so that direction $y$ is now local in memory. Finally, (d) the FFT is computed along $y$, resulting in the frequency space configuration.
Fig. 3.
Fig. 3. Sketch of the execution of the convolution required for the parallel NUFFT on two MPI processes, $p_0$ and $p_1$. The domains of the two processes are shifted along $z$ for visualization purposes only, in order to highlight the halo regions, which are the green points. The red point indicates one of the rotated frequencies, which belongs to processor $p_0$, and the colormap represents the convolution of the spike centered at the rotated frequency with the B-spline basis function computed by processor $p_0$.
Fig. 4.
Fig. 4. Parallel scaling of the computation time as a function of the number of processes employed. The markers indicate the actual computational times for various sizes of the grid on the source plane with $N=N_\textrm{x}=N_\textrm{y}$, while solid lines refers to the ideal scaling. Due to necessary oversampling, the size of the arrays employed for FFT is $16N^{2}$ and it ranges from $2048^{2}$ to $65536^{2}$.
Fig. 5.
Fig. 5. Signal to noise ratio as function of the propagation distance, normalized to the computational window size, with $\theta _\textrm{x} = \theta _\textrm{y} = \theta _\textrm{z} = 0$ , where $\theta _\textrm{i}$ is the rotation angle about the $i$th axis in the reference frame depicted in Fig. 1.
Fig. 6.
Fig. 6. Signal to noise ratio as function of $\theta _\textrm{x}$, normalized to $\frac {\pi }{2}$, with $\theta _\textrm{y} = \theta _\textrm{z} = 0$ and $d_\textrm{z} = W$, where $d_\textrm{z}$ is the propagation distance along the $z$ axis, $W$ is the initial computational windows size, and $\theta _\textrm{i}$ has the meaning defined in Fig. 5.
Fig. 7.
Fig. 7. Signal to noise ratio as function of $\theta _\textrm{z}$, normalized to $\frac {\pi }{2}$, with $\theta _\textrm{x} = \theta _\textrm{y} = 0$ and $d_\textrm{z} = W$, where all the symbols have the meaning defined in Fig. 6.
Fig. 8.
Fig. 8. Error in the $x$ component of the image centroid for source rotations around the $x$ axis.
Fig. 9.
Fig. 9. Error in the $x$ component of the image centroid for source rotations around the $z$ axis.
Fig. 10.
Fig. 10. A possible implementation of the mechanical perturbation control setup. The blue lines represent the beams, while the red elements are the cameras and the grey blocks in front of the cameras the beam splitters (or samplers); in the back there are mirrors mounted on motorized stages and/or gimbal mounts.
Fig. 11.
Fig. 11. Mean absolute error versus the number of training epochs. The blue curves refer to the case of a FCANN composed of only one hidden layer of 8 neurons and the orange curves to the FCANN composed of 2 hidden layers, having 12 and 8 neurons, respectively.
Fig. 12.
Fig. 12. CBC efficiency $\eta _\textrm{CBC}$ as function of the shift error $\Delta r$ normalized to the beam waist $w_{0}$. The CBC efficiency is computed as the ratio between the computed maximum intensity and the maximum theoretical intensity.
Fig. 13.
Fig. 13. Propagation of a Gaussian beam from a tilted source plane to equispaced target planes.
Fig. 14.
Fig. 14. Another example of propagation of a Gaussian beam in equispaced and tilted target planes.

Equations (20)

Equations on this page are rendered with MathJax. Learn more.

2 g ( r ) + ϵ μ ω 2 g ( r ) = 0
g ( r f ) = R 2 d f x d f y g ^ ( f x , f y ; 0 ) e j 2 π f r f g ^ ( f ; 0 ) = R 2 d x d y g ( r i ) e j 2 π f r i f = ( f x , f y , f z = λ 2 f x 2 f y 2 ) T
g ^ i [ s , t ] = Δ x Δ y FFT ( g i [ l , m ] ) [ s , t ]
g ^ i [ s , t ] = g ^ ( s Δ f x , t Δ f y ; 0 ) , g i [ l , m ] = g ( l Δ x , m Δ y , 0 ) Δ f i = 1 / L i , Δ i = L i / N i , i = x , y I K = { K 2 , , K 2 1 } , s , l I N{x} , t , m I N{y}
g f [ s , t ] = Δ f x Δ f y IFFT ( g ^ f [ l , m ] ) f z 2 [ l , m ] = λ 2 ( l Δ f x ) 2 ( m Δ f y ) 2 g ^ f [ l , m ] = g ^ i [ l , m ] e j 2 π f z [ l , m ] z
Δ f x > | 1 π ϕ f x | , Δ f y > | 1 π ϕ f y |
f x 2 f x , lim 2 + f y 2 λ 2 1 , f x , lim = λ 1 ( 2 z S x + 1 ) 1 / 2 f y 2 f y , lim 2 + f x 2 λ 2 1 f y , lim = λ 1 ( 2 z S y + 1 ) 1 / 2
S i = m a x { 2 2 z M i λ M i 2 λ 2 + M i 4 λ 4 + 64 z 2 M i 2 λ 2 , L i }
g ^ i [ l , m ] = Δ x Δ y s,t g i [ s , t ] e w j 2 π ( α x l s + α y m t ) = = s,t g i [ s , t ] e w j π ( α x s 2 + α y t 2 ) e w j π ( α x ( l s ) 2 + α y ( m t ) 2 ) × × e w j π ( α x l 2 + α y m 2 ) Δ x Δ y
e w j π ( α x l 2 + α y m 2 ) Δ x Δ y IFFT ( a [ s , t ] b [ s , t ] ) = = Δ x Δ y ADAFFT ( g i [ l , m ] ) a [ s , t ] = FFT ( g i [ s , t ] e w j π ( α x s 2 + α y t 2 ) ) b [ s , t ] = FFT ( e w j π ( α x s 2 + α y t 2 ) ) = = 1 j w α x α y e j π w ( s 2 M x + t 2 M y 2 α y )
M i Δ f i 1 Δ i
g f [ l , m ] = Δ f x Δ f y s,t g ^ i [ s , t ] e j 2 π f [ s , t ] R ( d + r i [ l , m ] ) = = e j 2 π f 0 r i [ l , m ] Δ f x Δ f y s,t g ^ d [ s , t ] e j 2 π f ~ [ s , t ] r i [ l , m ]
g ^ d [ s , t ] = e j 2 π f R r i , f [ l , m ] = (   l Δ f x , Δ f y , f z [ l , m ] ) T f 0 = ( min ( R f [ l , m ] ) x , min ( R f [ l , m ] ) y ) T F i = ( max ( R f [ l , m ] ) f 0 ) i Δ i f ~ i = [ ( R f ) i f 0,i ] / F i , r i [ l , m ] = ( l F x , m F y )
η ( f x , f y ) = l,m g ^ d [ l , m ] δ ( f x f ~ x [ l , m ] ) δ ( f y f ~ y [ l , m ] ) η ~ = ( η ϕ ) ( f x , f y ) = R 2 d f x d f y η ( f x , f y ) ϕ x ( f x f x ) ϕ y ( f y f y ) = = l,m g ^ d [ l , m ] ϕ x ( f x f ~ x [ l , m ] ) ϕ y ( f y f ~ y [ l , m ] )
F { η ~ } = F { η ( f x , f y ) } F { ϕ ( f x ) } F { ϕ ( f y ) } F { η } = F { η ~ } F { ϕ ( f x ) } F { ϕ ( f y ) } = ADAFFT ( η ~ ) F { ϕ ( f x ) } F { ϕ ( f y ) }
g f [ l , m ] = Δ f x Δ f y e j 2 π f 0 r i [ l , m ] ADAFFT ( η ~ ) [ l , m ] F { ϕ ( f x , f y ) } = = NUADAS { g i [ s , t ] } [ l , m ]
S N R = 10  log 10 A | E NUADAS ( x , y , z ) E direct ( x , y , z ) | 2 A | E NUADAS ( x , y , z ) | 2
Z = CW d x d y | | g ( x , y ) | | 2 , r c = 1 Z CW d x d y r | | g ( x , y ) | | 2
γ i = 1 Z CW d x d y ( r i r c,i ) 3 | | g ( x , y ) | | 2 κ i = 1 Z CW d x d y ( r i r c,i ) 4 | | g ( x , y ) | | 2
L j = 1 N testset i=1 N testset | y i y ¯ i |
Select as filters


Select Topics Cancel
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.