Beam wander prediction with recurrent neural networks

Dmitrii Briantcev; Mitchell A. Cox; Abderrahmen Trichili; Boon S. Ooi; Mohamed-Slim Alouini

doi:10.1364/OE.496690

1. Introduction

With the steady increase in internet traffic, considerable resources and research efforts are devoted each year to enhancing the throughput of communication systems worldwide [1,2]. However, the throughput of radio frequency (RF) systems is fundamentally limited by spectrum congestion, particularly as networks become denser with each new generation [3]. One potential solution to circumvent this issue is to use an optical beam as a data carrier, leveraging the smaller beam spread compared to RF and THz systems, thereby achieving better resilience to spectrum congestion [4]. To further enhance the capabilities of free space optics (FSO) systems, structured light (SL) modes can be utilized for mode division multiplexing and even modal diversity, which shows potential for advancing these systems [5–9].

However, the alignment and quality of the beam at the receiver plane significantly affect the performance of FSO systems based on Gaussian and SL beams, by introducing both power loss and intermodal crosstalk. Atmospheric turbulence causes scintillation and beam wander, which can be mitigated using adaptive optics (AO) systems. While AO systems can correct phase distortions introduced by random variations in the refractive index of the atmosphere, their implementation can be an expensive solution [1]. Instead, digital means such as optimized forward error correction or signal processing can provide more cost-effective solutions. A potential input for these systems is a prediction of the beam’s future position, which could significantly improve the bit error rate of the communication system [10].

Recent advances in machine learning (ML) have led to its increased utilization in AO applications [11]. Machine learning has also been widely used to detect SL modes [12] and to accurately model the effects of turbulence on structured light [13]. AO systems capable of reconstructing the wavefront of the received beam for FSO purposes have been studied extensively [14–16]. Although most AO-related studies focus on predicting the Zernike coefficients of the incoming beam to correct the wavefront [17–21], a machine learning-based system can in principle also predict the future position of the beam to account for beam wander. This prediction (which is the focus of this paper) could then be used as input to optimize error correction or signal processing, potentially reducing the need for physical AO systems.

Beam wander and angle of arrival fluctuations have been extensively studied. Approximate expressions for the beam-wander arrival variances of a Gaussian beam were developed in [22]. The probability distribution of the power received by a circular aperture was studied in [23]; this was later expanded for a case of a wandering shifted laser beam [24]. The wandering radius of a Gaussian beam was investigated in [25] for both weak and strong turbulence. A method to decrease the wandering effect by variation of the initial beam coherence was proposed and experimentally verified for SL beams [26]. Moreover, authors of [5] suggest using beam wander/angle of arrival prediction as a proxy for OAM crosstalk prediction. Significant effort was dedicated to studying beam wander for ground-satellite links. Atmospheric effects in the optical ground-satellite uplink were studied in [27], assuming weak turbulence. The beam-wander contribution to the scintillation in a ground-satellite FSO link was further investigated in [28], and results for the beam-wander contribution to the log-amplitude variance were provided. PDF models appropriate for ground-satellite links were discussed in [29].

While most literature has focused on the statistical properties of beam wander, treating it as a random variable, its temporal properties require further study. An autoregressive moving-average model for predicting the behavior of beam wander-induced intensity fluctuations to combat deep fading was proposed and experimentally tested in [30]. Similarly, empirical orthogonal functions were used in [31] for short-term beam wander position prediction. Neural networks were used in [32] to optimize the parameters of an experimental FSO setup, reducing the effect of beam wander. If the channel is assumed to have "memory", it is theoretically possible to predict the beam position more accurately from prior information, provided the channel memory can be accessed. To our knowledge, explicit models of this kind do not yet exist. However, instead of developing an analytical or statistical model head-on, an ML approach can be used.

In this paper, we propose using a recurrent neural network (RNN) [33] to predict the future position of a beam on the receiver resulting from beam wander. By “learning” the channel memory model, we aim to predict the $x$ and $y$ coordinates of the center of mass of the incoming beam, given the previous states of its center of mass.

2. Background

For the readers’ convenience, we will briefly introduce the main background topics necessary to understand this work. A brief summary of RNNs and how they work is in Section 2.1. The atmospheric turbulence simulation method will be introduced in Section 2.2, and structured light modes will be defined in Section 2.3.

2.1 Recurrent neural networks

As the main building block of the proposed predictor, we have chosen a basic RNN [34]. An RNN is a type of neural network useful for processing sequential data. It has a feedback loop that allows information to be passed from one step of the sequence to the next. The idea is to use the output from the previous step as input to the current step, which creates a form of memory in the network. This memory enables the network to process sequences of inputs and generate outputs that are dependent on previous inputs. An illustration of the RNN structure is shown in Fig. 1, alongside an equivalent “unfolded” version.

Fig. 1. Illustration of the RNN operation.

The choice of an RNN is justified by our assumption that turbulence is a correlated memory channel, as suggested by Taylor’s frozen turbulence hypothesis. Specifically, a basic RNN was selected for its lower computational cost compared to a similar-sized long short-term memory (LSTM) network. LSTMs are generally used when the channel has a long memory. However, due to the random nature of turbulence, we expect that only the few samples preceding the latest one have strong statistical significance for the future position prediction [5].

Mathematically, an RNN can be described as follows: at each step in the sequence, the network receives an input vector $x_t$. The input is transformed by a set of weights $W_{hx}$ to produce a hidden state vector $h_t$. The hidden state at time step $t$ is a function of the current input $x_t$ and the previous hidden state $h_{t-1}$:

(1)$$h_t = f(W_{hx}x_t + W_{hh}h_{t-1} + b_h),$$

where $W_{hh}$ is a set of weights that determine the influence of the previous hidden state on the current hidden state, $b_h$ is a bias term, and $f$ is an activation function. The activation function introduces nonlinearity into the network. In this work, we used a ReLU (rectified linear unit) activation function for simplicity.

The hidden state vector $h_t$ can be thought of as a memory of the previous inputs in the sequence. It captures the relevant information from the previous inputs and combines it with the current input to produce a new hidden state that is passed on to the next step in the sequence.

The output at each time step is produced by applying another set of weights $W_{yh}$ to the hidden state:

(2)$$y_t = g(W_{yh}h_t + b_y),$$

where $g$ is an activation function and $b_y$ is a bias term. The output at time step $t$ is a function of the current hidden state $h_t$.

The weights $W_{hx}$, $W_{hh}$, and $W_{yh}$ are learned during training using backpropagation through time (BPTT), which is a variation of the standard backpropagation algorithm. BPTT calculates the gradient of the loss function with respect to the network parameters at each time step. The weights are then updated using gradient descent.
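
As an illustration of Eqs. (1) and (2), the forward pass of such a network can be sketched in a few lines of NumPy. This is a minimal sketch rather than the implementation used in this work: the function name `rnn_forward`, the zero initial hidden state, and the identity output activation $g$ are assumptions made for the example.

```python
import numpy as np

def rnn_forward(xs, W_hx, W_hh, b_h, W_yh, b_y):
    """Apply Eqs. (1)-(2) to a sequence xs of shape (T, n_in)."""
    h = np.zeros(W_hh.shape[0])  # assumed initial hidden state h_0 = 0
    ys = []
    for x_t in xs:
        # Eq. (1): ReLU of the weighted current input plus weighted previous state
        h = np.maximum(W_hx @ x_t + W_hh @ h + b_h, 0.0)
        # Eq. (2): g is taken to be the identity here (an assumption)
        ys.append(W_yh @ h + b_y)
    return np.array(ys), h
```

With all weights equal to one, zero biases, and a constant input of one, the hidden state simply accumulates (1, 2, 3, ...), which makes the role of $W_{hh}h_{t-1}$ as the memory term easy to see.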

2.2 Turbulence simulation

To assess the performance of the proposed RNN predictor, we need a beam wander dataset for a beam propagating through free-space turbulence. When propagating through the atmosphere, different parts of the beam are subject to slight deviations in the refractive index of the air. This results in scintillation and beam wander. In practice, beam propagation can be simulated by the so-called phase screen method, where the thick layer of atmosphere is represented as a series of thin phase screens with Fourier propagation between them. This method allows us to control the turbulence parameters for comparison with the experiment.

Fig. 2. Graphical representation of the modified phase-screen method used in simulations. The screen is shifted by adding (a) a main diagonal shift and (b) a negligibly small lateral shift to combat the long-term phase screen periodicity. The used part of the phase screen (c) is then cut out at each time step for the simulation.

In this work, the phase screens that follow the modified von Kármán spectrum [35] are used:

(3)$$\Phi_{n}(\kappa, r_0)=0.023 r_0^{{-}5/3} \exp \left(-\kappa^{2}/ \kappa_{m}^{2}\right)\left(\kappa^{2}+\kappa_{0}^{2}\right)^{{-}11/6}, $$

where $r_{0}$ is the Fried parameter, $\kappa = \sqrt {\kappa _x^2 + \kappa _y^2}$, $\kappa _x$ and $\kappa _y$ are the spatial frequencies, $\kappa _{m}=5.92/l_{0}$, $\kappa _{0}=2\pi /L_{0}$. The parameters $l_{0}$ and $L_{0}$ are the inner and the outer scale of turbulence, respectively. To generate a single realization of a phase-screen $\varphi$, we use the aforementioned spectrum:

(4)$$\varphi(x, y)=\mathcal{F}^{{-}1}\left[ \frac{\mathrm{C}}{N \Delta x}\sqrt{\Phi_{n}(\kappa, r_{0prop})}\right],$$

where $\mathcal {F}^{-1}(\cdot )$ is the inverse 2-dimensional fast Fourier transform (2D-FFT), $N$ is the number of points per side of the square grid, $\Delta x$ is the grid spacing, $\mathrm{C}$ is an $N \times N$ array of complex random variables $(\text {N}(0,1) + i\text {N}(0,1))$, and $r_{0prop}$ is the propagation Fried parameter for a single phase screen, calculated according to [9]. Lower-order subharmonics are then added to the resulting phase screens to make the beam wander effect more prominent [36]. With the phase screens ready, the fast Fourier split-step numerical propagation method is used, following the procedure detailed in [37].
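
The FFT-based generation in Eqs. (3) and (4) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used to produce the datasets: the function name `von_karman_screen` is hypothetical, the spectrum is evaluated on a grid of ordinary spatial frequencies $f = \kappa/2\pi$ (so that $f_0 = 1/L_0$ and $f_m = 5.92/(2\pi l_0)$), the normalization follows a commonly used FFT phase-screen convention, and the subharmonic correction is omitted.

```python
import numpy as np

def von_karman_screen(n, dx, r0, l0, L0, rng=None):
    """One n x n phase screen (radians) with a modified von Karman spectrum."""
    rng = np.random.default_rng() if rng is None else rng
    df = 1.0 / (n * dx)                              # frequency spacing [1/m]
    fx = np.fft.fftshift(np.fft.fftfreq(n, d=dx))    # centred frequency axis
    fxx, fyy = np.meshgrid(fx, fx)
    f = np.sqrt(fxx**2 + fyy**2)
    fm = 5.92 / l0 / (2.0 * np.pi)                   # inner-scale cutoff
    f0 = 1.0 / L0                                    # outer-scale cutoff
    psd = 0.023 * r0 ** (-5.0 / 3.0) * np.exp(-(f / fm) ** 2) \
        / (f**2 + f0**2) ** (11.0 / 6.0)
    psd[n // 2, n // 2] = 0.0                        # drop the piston (DC) term
    c = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    spectrum = c * np.sqrt(psd) * df
    # inverse FFT; the n**2 factor undoes NumPy's 1/n**2 normalisation
    screen = np.fft.ifft2(np.fft.ifftshift(spectrum)) * n**2
    return screen.real
```

For example, `von_karman_screen(256, 0.002, r0=0.02, l0=0.005, L0=20.0)` returns a single real-valued screen; the grid and scale values in this call are placeholders, not the parameters of Table 1.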

With this approach, the datasets on the center of mass position of the received beam are generated from the received beams using the standard capabilities of the Python SciPy package. The screens are then shifted over x and y coordinates by shifting the generation plane by $\delta x, \delta y = v_{wind}\delta t/\sqrt {2}$ to simulate the movement of the air according to the frozen flow hypothesis.

One issue arises when using this method, however. With this simple setup, the center of mass positions will, at some point, repeat and will ultimately be periodic because the beam encounters the same portions of the phase screen. To alleviate this issue in the given datasets, two steps were taken, as shown in Fig. 2. First, we make the period of these repetitions longer by generating larger phase screens, shifting them, and only "cutting out" the needed square from the larger phase screen for simulation. In this case, phase screens 3 times larger than the simulation screen were used. Second, the shift is performed unequally across the x and y coordinates by adding a negligibly small additional shift to $\delta x$. We verified that this approach is successful in avoiding periodicity in the beam wander by examining plots of the centroid position versus time. While some residual periodicity in the trend can persist, its effect should be negligible, given that the considered lookback is shorter than the period of the centroid shift trend. This approach was chosen as a compromise between computational complexity and avoiding periodicity.
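
The shift-and-cut procedure can be illustrated with a short sketch. The helper `crop_shifted_screen`, the rounding of the shift to whole pixels, and the wrap-around indexing are assumptions made for this example; the extra shift `eps` stands in for the small additional $\delta x$ offset described above.

```python
import numpy as np

def crop_shifted_screen(big_screen, n_sim, step, v_wind, dt, dx, eps=0.0):
    """Cut the n_sim x n_sim window seen by the beam at time index `step`."""
    shift = v_wind * dt / np.sqrt(2.0)              # per-step shift along x and y [m]
    off_x = int(round(step * (shift + eps) / dx))   # eps makes the x shift slightly unequal
    off_y = int(round(step * shift / dx))
    n_big = big_screen.shape[0]
    rows = (off_y + np.arange(n_sim)) % n_big       # wrap around the larger screen
    cols = (off_x + np.arange(n_sim)) % n_big
    return big_screen[np.ix_(rows, cols)]
```

Because the window wraps around a screen three times larger than the simulation grid, and `eps` slowly decorrelates the x and y offsets, the same patch of screen is revisited far less often than with a plain diagonal shift.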

2.3 Structured light

In this work, the effect of the RNN turbulence prediction is estimated for a Gaussian beam, as well as for modes from the two most commonly used mode bases, namely Laguerre-Gaussian (LG) and Hermite-Gaussian (HG). Images of the amplitude and phase of the beams used in this work are shown in Fig. 3.

Fig. 3. Examples of LG and HG modes.

The LG modes can be defined in the cylindrical coordinates $(r, \phi, z)$ as follows:

(5)$$\begin{aligned} LG(r, \phi, z)&= E_{0}\left(\sqrt{2} \frac{r}{w(z)}\right)^{\ell} L_{p}^{\ell}\left(\frac{2 r^{2}}{w^{2}(z)}\right) \frac{w_{0}}{w(z)} \\ &\times\exp\left[{-}i \psi_{p \ell}(z) + i \frac{k}{2 q(z)} r^{2} + i \ell \phi\right], \end{aligned}$$

where $L_{p}^{\ell }(\cdot )$ are the generalized Laguerre polynomials, $\ell$ and $p$ are the azimuthal and radial indices, $E_0$ is a unit normalisation constant, $k = 2\pi/\lambda$ is the wavenumber for wavelength $\lambda$, $w(z)$ is the beam size at the distance $z$, $w_{0}$ is the beam size at the waist, $z_{0}=\pi w_{0}^{2} / \lambda$ is the Rayleigh length, $q(z)=z-i z_{0}$ is the complex beam parameter, and $\psi _{p \ell }(z)=(2 p+|\ell |+1) \tan ^{-1}\left (z / z_{0}\right )$ is the LG Gouy phase.

The HG modes can be defined in Cartesian $(x,y,z)$ coordinates as follows:

(6)$$\begin{aligned} HG(x, y, z)&= E_{0} H_{m}\left(\sqrt{2} \frac{x}{w(z)}\right) H_{n}\left(\sqrt{2} \frac{y}{w(z)}\right) \frac{w_{0}}{w(z)} \\ &\times\exp \left[{-}i \psi_{m n}(z) + i \frac{k}{2 q(z)} r^{2}\right], \end{aligned}$$

where $H_{m}(\cdot )$ are the Hermite polynomials, $r^2 = x^2 + y^2$, and $\psi _{m n}(z)=(m + n+1) \tan ^{-1}\left (z / z_{0}\right )$ is the HG Gouy phase.
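
To make Eq. (6) concrete, the HG field can be evaluated numerically as in the sketch below. The function name `hg_mode`, the choice $E_0 = 1$, and the standard Gaussian-beam relations $k = 2\pi/\lambda$ and $w(z) = w_0\sqrt{1 + (z/z_0)^2}$ are assumptions of this example, which is not the code used to generate the simulated beams.

```python
import numpy as np
from numpy.polynomial.hermite import hermval  # physicists' Hermite polynomials

def hg_mode(x, y, z, m, n, w0, wavelength):
    """Complex HG_(m,n) field of Eq. (6) with E_0 = 1 (unnormalised)."""
    k = 2.0 * np.pi / wavelength                # wavenumber
    z0 = np.pi * w0**2 / wavelength             # Rayleigh length
    w = w0 * np.sqrt(1.0 + (z / z0) ** 2)       # beam size at distance z
    q = z - 1j * z0                             # complex beam parameter
    gouy = (m + n + 1) * np.arctan(z / z0)      # HG Gouy phase
    h_m = hermval(np.sqrt(2.0) * x / w, [0] * m + [1])
    h_n = hermval(np.sqrt(2.0) * y / w, [0] * n + [1])
    r2 = x**2 + y**2
    return h_m * h_n * (w0 / w) * np.exp(-1j * gouy + 1j * k * r2 / (2.0 * q))
```

At $z = 0$ the term $i k r^2 / (2q)$ reduces to $-r^2/w_0^2$, so `hg_mode` with $m = n = 0$ gives the familiar Gaussian envelope $\exp(-r^2/w_0^2)$. The same function accepts NumPy arrays for `x` and `y` to produce a full transverse profile.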

3. Methodology

In this study we used two sets of data to train and test the RNN. The first was from an experimental setup over a 260 m real-world turbulent link. The second set of data was generated using an accurate simulation. The characteristics of these data sets are slightly different. The real-world data is somewhat under-sampled due to the limited frame rate of the camera, making it a “worst case” scenario. To complement this we generated simulated data (with turbulence parameters based on estimates matching the physical experiment) which is over-sampled. The experimental and simulation setups are described in detail in Sec. 3.2 and 3.1 respectively.

The data is in the form of images, such as those captured by a camera. We process the images to find the $x$ and $y$ coordinates of the beam's center of mass to create a time series for each measurement. In practice, a camera need not be used, and we imagine that a position-sensitive detector (e.g., a quadrant photodiode) would work well.
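
The center of mass computation can be sketched as an intensity-weighted mean of the pixel coordinates. The helper name `centroid` is hypothetical, and the sketch uses plain NumPy; `scipy.ndimage.center_of_mass` computes the same quantity (returned in row, column order).

```python
import numpy as np

def centroid(intensity):
    """Intensity-weighted center of mass (x, y) of a 2D image, in pixels."""
    img = np.asarray(intensity, dtype=float)
    total = img.sum()
    rows, cols = np.indices(img.shape)      # pixel row (y) and column (x) indices
    x = (cols * img).sum() / total
    y = (rows * img).sum() / total
    return float(x), float(y)
```

Applying this function to every frame of a measurement yields the $(x, y)$ time series used as the input to the predictor.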

We used a Keras SimpleRNN layer with 128 units, followed by a dense layer with 128 units and a final dense output layer with 2 neurons for the $x$ and $y$ predictions. The Adam optimizer is used in training, with the loss defined as the mean squared error discussed in Sec. 3.3. In each case, the RNN is trained until convergence.

The depth of the RNN recurrence can be adjusted as needed; a greater recurrence depth and more input data increase the computational complexity of both training and execution. We are interested in two questions: how many past samples are required for a certain accuracy, and how far into the future can we accurately predict?

During training, in each epoch and for each time step, the network is fed a predefined number of samples, which we refer to as the “lookback” ($L_b$), and aims to predict the position of the center of mass a certain number of samples into the future, which we call the “lookforward” ($L_f$). Each sample consists of the $x$ and $y$ values of the center of mass on the receiving screen and the corresponding separately computed left-hand side (backward-difference) numerical derivatives, ensuring no future information is used in the calculation.
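
The construction of training samples from a centroid time series can be sketched as follows. The function `make_windows`, the unit time step in the derivative, and the choice to skip the first sample (whose backward difference is undefined) are assumptions of this example rather than details confirmed by the text.

```python
import numpy as np

def make_windows(xy, lookback, lookforward):
    """Build (inputs, targets) from a centroid series xy of shape (T, 2)."""
    xy = np.asarray(xy, dtype=float)
    d = np.zeros_like(xy)
    d[1:] = xy[1:] - xy[:-1]            # backward difference: uses past samples only
    feats = np.hstack([xy, d])          # each sample is (x, y, dx, dy)
    starts = range(1, len(xy) - lookback - lookforward + 1)
    inputs = np.stack([feats[s:s + lookback] for s in starts])
    # target: position lookforward samples after the last sample in the window
    targets = np.stack([xy[s + lookback - 1 + lookforward] for s in starts])
    return inputs, targets
```

With $L_b = 15$ the inputs have shape `(n_windows, 15, 4)` and the targets have shape `(n_windows, 2)`, which matches the two-neuron output layer.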

We test $L_b$ values from 5 to 50 time steps with a step of 5 for the general analysis and $L_b =$ 15 time steps for single-shot examples. We analyze this RNN performance for all proposed depths to find the best and average performance scenario, as we expect that a too-shallow depth will perform worse and, at some stage, no further performance improvements will occur.

3.1 Simulated data generation

We simulated turbulent beam wander time series for three beam modes, namely Gaussian, $\text {LG}_{(0,2)}$ and $\text {HG}_{(0,2)}$, in different turbulence and wind speed scenarios for a distance of 300 m. We generated data for all combinations of $r_0 = 0.01, 0.02, 0.04, 0.08$ and $v_{wind} = 8, 4, 2, 1$ (m/s), with each scenario containing 10000 time-steps (samples). A sampling frequency of 1 kHz is used to ensure proper sampling of the turbulence. Examples of beam profiles obtained through simulation at different turbulence regimes are shown in Fig. 4. An extensive list of the parameters of the simulation is in Table 1.

Fig. 4. Examples of modes affected by turbulence.

Table 1. Turbulence simulation parameters.

It is important to note that even though beam wander is more prominent on longer propagation distances, we decided to limit ourselves to 300 m propagation in this case to approximate the experimental link. This decision was made as we want to evaluate the performance of the predictor on both SL beams and Gaussian beams without incurring confounding issues from long distance SL propagation. Experimental propagation of SL beams over long distances remains a significant challenge but based on this and similar works, 300 m has been shown to work well [38].

Here, we will assess the performance for $L_f = 2$ and $6$ ms, which should represent the expected performance of the predictor looking 1 and 5 samples into the future, with a 1 ms grace period for the controller to respond. The performance will be estimated on the 10% of the data not used for training, and a 20-sample gap at the beginning of the test dataset will be used to avoid any transient correlation between the training and test datasets.
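
A split of this kind can be sketched as below. The function name `chrono_split` is hypothetical, and taking the final 10% of each time series as the test portion is an assumption of this example; the text only states the fraction and the gap.

```python
import numpy as np

def chrono_split(series, test_frac=0.1, gap=20):
    """Chronological train/test split with a gap at the start of the test part."""
    n_test = int(round(len(series) * test_frac))
    split = len(series) - n_test
    train = series[:split]
    test = series[split + gap:]     # discard `gap` samples bordering the training data
    return train, test
```

For a 10000-sample scenario this leaves 9000 training samples and 980 test samples.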

3.2 Experimental setup

We performed a physical experiment with a high frame-rate camera to gather results. An FSO link was constructed in Johannesburg, South Africa, at the Wits Optical Communication Lab. A diagram of the link is shown in Fig. 5, and further experimental considerations are presented in Ref. [12] for the reader's benefit. Rather than relying on the output of a single mode fibre, a standard mode creation setup using a liquid crystal SLM was used to generate the Gaussian beams used in the experiment at 520 nm [39]. The beam was expanded to a diameter of approximately 20 mm and sent through the lab window to a flat mirror on the roof of another building. As such, the beam was at least 10 m above ground level. The flat mirror was aligned to reflect the beam back into the lab to a receiver lens with a diameter of 200 mm to avoid clipping the received beam. A telescope arrangement was used to reduce the size of the beam to be viewed on the camera in the image plane. Note that a 50 mm diameter achromatic lens was used to avoid aberrations due to the beam propagating off-center through the lens.

Fig. 5. Diagram of the experimental free-space link. The flat mirror was situated 135 m from the transmitter/receiver. Note that the camera was placed in the image plane as we are concerned with beam wander rather than angle of arrival fluctuations. Pure Gaussian modes were generated using the SLM rather than relying on the output of a single mode fibre. Inset images are a sequence of every tenth frame from the camera.

At the time of the experiment, the average wind speed was $v_{wind} = 2$ m/s with gusts of 3.8 m/s, and the air temperature was $10.7~^\circ$C. The scintillation index of the received beam is estimated from the camera frames to be $\sigma _I=0.431$ and thus $C_n^2 = 6.7\times 10^{-14}$ $\text{m}^{-2/3}$, under the assumption that in weak turbulence, the Rytov variance is approximately equal to the scintillation index when it is much less than one. Therefore, the approximate atmospheric coherence length is $r_0 = 0.015$ m. The turbulence was stronger than expected over this distance since there were several air conditioning vents underneath the path of the beam. The frame rate of the camera was 130 Hz, so the turbulence is occasionally under-sampled, as the Greenwood frequency $f_G = 0.43v_{\mathrm {wind}}/r_0$ is between 57 Hz and 109 Hz.
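
The quoted Greenwood frequency range follows directly from the stated wind speeds and coherence length, as the short calculation below shows. The helper names and the use of a factor-of-two (Nyquist-type) criterion to judge the sampling are assumptions of this sketch.

```python
def greenwood_frequency(v_wind, r0):
    """Greenwood frequency f_G = 0.43 * v_wind / r0 in Hz."""
    return 0.43 * v_wind / r0

def is_nyquist_sampled(frame_rate, f_g):
    """Assumed criterion: sample at no less than twice the Greenwood frequency."""
    return frame_rate >= 2.0 * f_g

for v in (2.0, 3.8):  # average wind speed and gust speed in m/s
    f_g = greenwood_frequency(v, 0.015)
    print(f"v = {v} m/s: f_G = {f_g:.1f} Hz, "
          f"adequately sampled at 130 Hz: {is_nyquist_sampled(130.0, f_g)}")
```

Under this criterion the 130 Hz camera is sufficient at the average wind speed ($f_G \approx 57.3$ Hz) but not during gusts ($f_G \approx 108.9$ Hz), which is consistent with the data being only occasionally under-sampled.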

In a “real” system, while a camera could be used to do these measurements, the image processing is a somewhat computationally intensive process which would add cost and energy requirements to the system. Instead, we believe that a position sensitive photodiode such as a quadrant photodiode or lateral effect photodiode would be more suitable. Simple signals for the $x$ and $y$ positions are output by such devices, with kilohertz bandwidths easily achievable with off the shelf devices. Additionally, the output is inherently based on the centroid position: nature does the “computation”.

3.3 Performance evaluation

We use mean squared error (MSE) between the predicted and actual beam center of mass as a metric of the closeness of prediction to the ground truth. The MSE is defined as

(7)$$\text{MSE}(C(t), C_p(t)) = \overline{ \left( \overline{ \left( C(t) - C_p(t) \right)^2 }_t \right) }_{x,y},$$

where $C(t)$ is the ground truth, $C_p(t)$ is the predicted position of the centre of mass of the beam, $\overline {(\cdot )}_t$ is the averaging over all available time points, and $\overline {(\cdot )}_{x,y}$ is the averaging over the $x$ and $y$ coordinates.

To assess the performance of our algorithm against a system that does not employ future state prediction, we compare the $\text {MSE}$ of our predicted values with the $\text {MSE}$ of the last known state of the turbulence (which we will call the "naive prediction" from here on). We stress that this comparison is made against perfect online beam tracking on the receiver side.

To quantify this comparison in particular examples, we will also define an error improvement score ($E_I$)

(8)$$E_I = \frac{\text{MSE}(C(t), C_p(t))}{\text{MSE}(C(t), C_g(t))},$$

where $C_g(t)$ is the center of mass position given by the naive prediction, i.e., the last known position. This way, $E_I < 1$ will signify the improvement provided by using the RNN, and $E_I = 0$ would, in theory, mean a perfect prediction of the turbulence state. For experimental data, we will limit ourselves to single runs as proof that the RNN is indeed operational, due to the limited data available.

Finally, we will compare the predicted value of the centroid versus two prediction types, namely naive prediction and linear prediction. For both coordinates separately, by naive prediction here we understand the best prediction of a system working without future prediction with $L_f$ delay, where

(9)$$x_{naive}(t) = x_{true}(t-L_f).$$

By linear prediction here we understand the linear approximation for the last two samples accessible for the system with $L_f$ delay, with

(10)$$x_{linear}(t) = x_{true}(t-L_f) + (x_{true}(t-L_f) - x_{true}(t - L_f - 1))L_f$$
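
The metrics and baselines of Eqs. (7)–(10) can be sketched together as follows. The function names are hypothetical, $L_f$ is expressed in samples, and the sketch assumes the series are aligned so that all three outputs of `baselines` refer to the same time instants.

```python
import numpy as np

def mse(c_true, c_pred):
    """Eq. (7): squared error averaged over time and over the x, y coordinates."""
    diff = np.asarray(c_true, dtype=float) - np.asarray(c_pred, dtype=float)
    return float(np.mean(diff**2))

def baselines(x, lf):
    """Eqs. (9)-(10): naive and linear predictions, aligned with the truth."""
    x = np.asarray(x, dtype=float)
    t = np.arange(lf + 1, len(x))       # instants where both baselines are defined
    naive = x[t - lf]
    linear = x[t - lf] + (x[t - lf] - x[t - lf - 1]) * lf
    return x[t], naive, linear

def error_improvement(c_true, c_pred, c_naive):
    """Eq. (8): E_I < 1 means the predictor beats the naive prediction."""
    return mse(c_true, c_pred) / mse(c_true, c_naive)
```

For a centroid moving at constant velocity, the linear prediction is exact while the naive prediction lags by $L_f$ samples, so the linear baseline attains $E_I = 0$ in that idealized case.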

4. Results and discussion

In the following section, we detail the results of the proposed approach. Namely, we discuss the behavior of the predictor in cases of experimental and simulated data in Section 4.1, and present the performance metrics for a broad scale of turbulence regimes and in comparison to different position predictions at the receiver in Section 4.2, outlining the expected performance of the proposed predictor.

4.1 Behavior analysis

The presented experimental data were under-sampled; therefore, the prediction quality could be better. However, the performance is still better than both prediction types we are comparing against. It can be seen that the behavior of the RNN prediction is similar to the naive prediction in the case of 1 sample $L_f$ (Fig. 6).

In the case of 2-sample (Fig. 7) and especially 6-sample (Fig. 8) $L_f$, we observe that the RNN starts to outperform the naive prediction, despite the tracking clearly not being ideal due to the under-sampling.

Fig. 6. Performance on the experimental dataset. $L_b = 15$, $L_f = 1$, $E_I = 0.99$

Fig. 7. Performance on the experimental dataset. $L_b = 15$, $L_f = 2$, $E_I = 0.93$

Fig. 8. Performance on the experimental dataset. $L_b = 15$, $L_f = 6$, $E_I = 0.80$

However, the gain in performance is more significant in the better sampled simulated data. To illustrate this, we will consider one example of the simulated data (Fig. 9).

Fig. 9. Performance on simulation dataset. $v_{wind} = 4$ (m/s), $r_0 = 0.02$, $L_b = 15$, $L_f = 2$, $E_I = 0.67$

Fig. 10. ACF and PACF for the experimental dataset.

Fig. 11. ACF and PACF for simulated dataset, $v_{wind} = 4$ (m/s), $r_0 = 0.02$.

In the case of this particular simulation scenario, RNN shows an even better performance gain than the experimental dataset for the 2 samples $L_f$, showing consistently better tracking than the naive prediction. It can also be seen here that the proposed predictor generally exhibits reserved behavior, especially in cases of under-sampled data, reducing the MSE and generally avoiding strong predictions.

This outcome can be explained if we examine the datasets’ auto-correlation (ACF) and partial auto-correlation functions (PACF), that can be expressed in this case as follows:

(11)$$\text{ACF}(\tau)=\frac{\text{Cov}(x_{t+\tau}, x_{t})}{\sigma^2}$$

(12)$$\text{PACF}(\tau)=\frac{\text{Cov}(x_{t+\tau}, x_{t} | x_{t+1} \cdots x_{t+\tau-1})}{\sigma(x_{t} | x_{t+1} \cdots x_{t+\tau-1})\sigma(x_{t + \tau} | x_{t+1} \cdots x_{t+\tau-1})}$$

where E$(\cdot )$ denotes the mean, $\sigma$ is the standard deviation, and the covariance is expressed as $\text {Cov}(x,y) = \text {E}\left ((x - \text {E}(x)) (y - \text {E}(y))\right )$.
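
Sample estimates of Eqs. (11) and (12) can be obtained as in the sketch below. The function names are hypothetical, and the PACF is estimated here by regressing out the intermediate samples with least squares and correlating the residuals, which is one of several possible estimators and not necessarily the one used for Figs. 10 and 11.

```python
import numpy as np

def acf(x, lag):
    """Eq. (11): sample autocorrelation of a 1D series at the given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[lag:], x[:len(x) - lag]) / np.dot(x, x))

def pacf(x, lag):
    """Eq. (12): correlation of x_t and x_{t+lag} given the samples in between."""
    if lag == 1:
        return acf(x, 1)                 # nothing to condition on
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x) - lag
    first, last = x[:n], x[lag:]
    mid = np.column_stack([x[k:k + n] for k in range(1, lag)])
    # residuals after removing the linear dependence on intermediate samples
    res_first = first - mid @ np.linalg.lstsq(mid, first, rcond=None)[0]
    res_last = last - mid @ np.linalg.lstsq(mid, last, rcond=None)[0]
    return float(np.corrcoef(res_first, res_last)[0, 1])
```

For a first-order autoregressive series, for instance, the ACF decays gradually while the PACF is close to zero beyond lag 1; counting the lags with a non-negligible PACF in this way indicates how many past samples carry information for the prediction.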

As can be seen from Fig. 10 and Fig. 11, only the last sample has high statistical significance in the experimental dataset, whereas in the simulated scenario, six past samples hold value for the prediction, and sampling frequency is two times the corresponding Greenwood frequency as expected.

4.2 MSE performance

We analyze results for various wind speeds across several $r_0$ values to show the predictor’s performance. We provide the MSE of the RNN tracking both averaged over all $L_b$ and for the best $L_b$, with $L_f$ set to 2 and 6 samples (Fig. 12). This approach was chosen because of the random nature of RNN optimization and initialization, which can sometimes converge to sub-optimal solutions. With these charts, the reader can therefore see both the expected and the best-achieved performance of our RNN tracking method. Here, it can be seen that RNN prediction proved to be more beneficial for both LG and HG modes than for a Gaussian beam, and that there is a general trend of declining error across all modes, where the error decreases linearly on a log-log plot. The difference in performance between mode families, in particular, is an outstanding question in the field, as previous studies suggest that the effect of turbulence may be mode dependent [5,8,9].

Fig. 12. RNN MSE performance charts for different modes for $v_{wind} = 8, 2, 1$ (m/s). (a) Averaged over all $L_b$, $L_f = 2$, (b) Best among all $L_b$, $L_f = 2$, (c) Averaged over all $L_b$, $L_f = 6$, (d) Best among all $L_b$, $L_f = 6$.

Fig. 13. RNN MSE performance comparison for different prediction types. (a) Better sampled simulated scenario, $v_{wind} = 1$ (m/s), $r_0 = 0.01$. (b) Worse sampled simulated scenario, $v_{wind} = 4$ (m/s), $r_0 = 0.02$. (c) Under-sampled experimental scenario.

Finally, to show the merit of the proposed network, we compare MSE performance versus both naive prediction and linear predictions with results illustrated in Fig. 13. We show data with different quality of sampling, as well as the comparison on the poorly sampled experimental data.

As can be seen from the graphs, the proposed tracking scheme outperforms both linear prediction and naive prediction in all three analyzed cases. In the case of experimental data, a simple linear approximation, as presented here, is not helpful; hence it diverges immediately. However, more advanced linear approaches, such as the vector auto-regressive (VAR) model could drastically improve linear prediction performance in a long prediction case. The same effect can be noticed in the simulated case with faster wind speed. In both cases, RNN prediction has behavior similar to the naive prediction across $L_f$ values but consistently outperforms it. In the case of simulated turbulence with slower wind speed, we can see that linear approximation slightly outperforms RNN prediction for low $L_f$ values due to convergence metrics set for the RNN. However, performance is close, and RNN performs better at higher $L_f$ values again.

From the above, we can suggest that the proposed estimator behaves differently on under-sampled and well-sampled data, although more carefully controlled experiments are required to investigate this claim fully. Nevertheless, while operating on under-sampled data, the method can be helpful in predictive correction at the receiver with a small $L_f$, reducing the MSE. When operating on well-sampled data the proposed method is more useful, and can be used for more accurate beam position prediction and could potentially be used in feedback systems to offset the beam wander.

5. Conclusion

In this manuscript, we presented an RNN capable of predicting the future beam-wander position of optical beams. We detailed the dataset generation for the structured light and Gaussian beams, as well as the neural network architecture and training process. We presented the MSE performance charts for various turbulence regimes in two scenarios, with 2 and 6 samples look-forward. We also compared the performance of the proposed network against other predictions one could use to estimate the future position of the beam. In almost all analyzed cases, our approach produced better predictions, the exception being small $L_f$ values in the well-sampled simulated case with slow wind, where a simple linear prediction performed slightly better. This approach could prove helpful in many ways, such as correcting for beam wander at the receiver with slow correcting devices or predicting deep fades due to beam wander in a system with feedback to the transmitter. Future work includes developing a real-time version of the RNN predictor and analyzing its performance on well-sampled experimental data.

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. A. Trichili, M. A. Cox, B. S. Ooi, and M.-S. Alouini, “Roadmap to free space optics,” J. Opt. Soc. Am. B 37(11), A184–A201 (2020).

2. D. J. Richardson, “Filling the light pipe,” Science 330(6002), 327–328 (2010).

3. R. Struzak, “On spectrum congestion and capacity of radio links,” Ann. Oper. Res. 107(1/4), 339–347 (2001).

4. A. Trichili, A. Ragheb, D. Briantcev, M. A. Esmail, M. Altamimi, I. Ashry, B. S. Ooi, S. Alshebeili, and M.-S. Alouini, “Retrofitting FSO systems in existing RF infrastructure: A non-zero-sum game technology,” IEEE Open J. Commun. Soc. 2, 2597–2615 (2021).

5. M. A. Cox, N. Mphuthi, I. Nape, N. Mashaba, L. Cheng, and A. Forbes, “Structured light in turbulence,” IEEE J. Sel. Top. Quantum Electron. 27(2), 1–21 (2021).

6. A. Forbes, M. de Oliveira, and M. R. Dennis, “Structured light,” Nat. Photonics 15(4), 253–262 (2021).

7. Z. Hu, Y. Li, Z. Chen, D. M. Benton, A. A. Ali, M. Patel, M. P. Lavery, and A. D. Ellis, “Aiming for high-capacity multi-modal free-space optical transmission leveraging complete modal basis sets,” Opt. Commun. 541, 129531 (2023).

8. M. A. Cox, L. Maqondo, R. Kara, G. Milione, L. Cheng, and A. Forbes, “The resilience of Hermite- and Laguerre-Gaussian modes in turbulence,” J. Lightwave Technol. 37(16), 3911–3917 (2019).

9. X. Gu, L. Chen, and M. Krenn, “Phenomenology of complex structured light in turbulent air,” Opt. Express 28(8), 11033–11050 (2020).

10. A. C. Motlagh, V. Ahmadi, Z. Ghassemlooy, and K. Abedi, “The effect of atmospheric turbulence on the performance of the free space optical communications,” in 2008 6th International Symposium on Communication Systems, Networks and Digital Signal Processing, (2008), pp. 540–543.

11. Y. Guo, L. Zhong, L. Min, J. Wang, Y. Wu, K. Chen, K. Wei, and C. Rao, “Adaptive optics based on machine learning: a review,” (2022).

12. M. A. Cox, T. Celik, Y. Genga, and A. V. Drozdov, “Interferometric orbital angular momentum mode detection in turbulence with deep learning,” Appl. Opt. 61(7), D1 (2022).

13. D. Briantcev, M. A. Cox, A. Trichili, A. V. Drozdov, B. S. Ooi, and M.-S. Alouini, “Efficient channel modeling of structured light in turbulence using generative adversarial networks,” Opt. Express 30(5), 7238 (2022).

14. M. Li and M. Cvijetic, “Coherent free space optics communications over the maritime atmosphere with use of adaptive optics for beam wavefront correction,” Appl. Opt. 54(6), 1453–1462 (2015).

15. L. A. Poyneer, D. T. Gavel, and J. M. Brase, “Fast wave-front reconstruction in large adaptive optics systems with use of the Fourier transform,” J. Opt. Soc. Am. A 19(10), 2100–2111 (2002).

16. T. Berkefeld, D. Soltau, R. Czichy, E. Fischer, B. Wandernoth, and Z. Sodnik, “Adaptive optics for satellite-to-ground laser communication at the 1 m telescope of the ESA optical ground station, Tenerife, Spain,” Proc. SPIE 7736, 77364C (2010).

17. A. P. Wong, B. R. M. Norris, P. G. Tuthill, R. Scalzo, J. Lozi, S. B. Vievard, and O. Guyon, “Predictive control for adaptive optics using neural networks,” J. Astron. Telesc. Instrum. Syst. 7(01), 019001 (2021).

18. A. Vyas, M. B. Roopashree, and B. R. Prasad, “Extrapolating Zernike moments to predict future optical wave-fronts in adaptive optics using real time data mining,” arXiv, arXiv:1001.3295 (2010).

19. X. Liu, T. Morris, C. Saunter, F. J. de Cos Juez, C. González-Gutiérrez, and L. Bardou, “Wavefront prediction using artificial neural networks for open-loop adaptive optics,” Mon. Not. R. Astron. Soc. 496(1), 456–464 (2020).

20. Z. Sun, Y. Chen, X. Li, X. Qin, and H. Wang, “A Bayesian regularized artificial neural network for adaptive optics forecasting,” Opt. Commun. 382, 519–527 (2017).

21. M. B. Jorgenson and G. J. M. Aitken, “Prediction of atmospherically induced wave-front degradations,” Opt. Lett. 17(7), 466–468 (1992).

22. D. H. Tofsted, “Outer-scale effects on beam-wander and angle-of-arrival variances,” Appl. Opt. 31(27), 5865–5870 (1992).

23. R. Esposito, “Power scintillations due to the wandering of the laser beam,” Proc. IEEE 55(8), 1533–1534 (1967).

24. M. Tamir, U. Halavee, and E. Azoulay, “Power fluctuations caused by laser beam wandering and shift,” Appl. Opt. 20(5), 734–735 (1981).

25. G. P. Berman, A. A. Chumak, and V. N. Gorshkov, “Beam wandering in the atmosphere: The effect of partial coherence,” Phys. Rev. E 76(5), 056606 (2007).

26. J. Yu, X. Zhu, F. Wang, D. Wei, G. Gbur, and Y. Cai, “Experimental study of reducing beam wander by modulating the coherence structure of structured light beams,” Opt. Lett. 44(17), 4371–4374 (2019).

27. F. Dios, J. A. Rubio, A. Rodríguez, and A. Comerón, “Scintillation and beam-wander analysis in an optical ground station-satellite uplink,” Appl. Opt. 43(19), 3866–3873 (2004).

28. A. Rodriguez-Gomez, F. Dios, J. A. Rubio, and A. Comeron, “Temporal statistics of the beam-wander contribution to scintillation in ground-to-satellite optical links: An analytical approach,” Appl. Opt. 44(21), 4574–4581 (2005).

29. L. C. Andrews, R. L. Phillips, R. J. Sasiela, and R. Parenti, “PDF models for uplink to space in the presence of beam wander,” Proc. SPIE 6551, 655109 (2007).

30. M. A. Cox, L. Gailele, L. Cheng, and A. Forbes, “Modelling the memory of turbulence-induced beam wander,” arXiv, arXiv:1907.10519 (2019).

31. O. Guyon and J. Males, “Adaptive optics predictive control with empirical orthogonal functions (EOFs),” arXiv, arXiv:1707.00570 (2017).

32. K. Kazaura, K. Omae, T. Suzuki, M. Matsumoto, E. Mutafungwa, T. Korhonen, T. Murakami, K. Takahashi, H. Matsumoto, K. Wakamori, and Y. Arimoto, “Enhancing performance of next generation FSO communication systems using soft computing-based predictions,” Opt. Express 14(12), 4958–4968 (2006).

33. A. Sherstinsky, “Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network,” Phys. D 404, 132306 (2020).

34. D. Gjylapi and E. Proko, “Recurrent neural networks in time series prediction,” J. Multidiscip. Eng. Sci. Technol. 5(10), 8741–8746 (2017).

35. J. D. Schmidt, Numerical Simulation of Optical Wave Propagation with Examples in MATLAB, vol. PM199 (SPIE, 2010).

36. R. G. Lane, A. Glindemann, and J. C. Dainty, “Simulation of a Kolmogorov phase screen,” Waves in Random Media 2(3), 209–224 (1992).

37. D. Briantcev, A. Trichili, B. S. Ooi, and M.-S. Alouini, “Crosstalk suppression in structured light free-space optical communication,” IEEE Open J. Commun. Soc. 1, 1623–1631 (2020).

38. A. Drozdov and M. A. Cox, “Practical modal decomposition over turbulent free-space links,” Proc. SPIE 12017, 120170E (2022).

39. J. Pinnell, I. Nape, B. Sephton, M. Cox, V. Rodriguez-Fajardo, and A. Forbes, “Modal analysis of structured light with spatial light modulators: A practical tutorial,” J. Opt. Soc. Am. A 37(11), C146–C160 (2020).

Parameter	Value
Operating wavelength ($\lambda$)	520 nm
Beam waist ($\omega$)	0.008 m
Total propagation length ($z_{\mathrm{tot}}$)	300 m
Step propagation length ($z_{\mathrm{prop}}$)	50 m
Number of phase screens	6
Outer scale ($L_{0}$)	100 m
Inner scale ($l_{0}$)	0.01 m
Screen size ($D$)	0.128 m
Number of grid points per side ($N_{g}$)	128
Sampling frequency ($f_{s}$)	1000 Hz
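A few quantities follow directly from the tabulated parameters, and the short sketch below computes them as a consistency check. The variable names are illustrative, and the Rayleigh-range formula assumes that the listed beam waist is the $1/e^{2}$ radius of a fundamental Gaussian beam; that interpretation is an assumption made here, not a statement from the table.

```python
import math

# Values copied from the parameter table above (SI units).
wavelength = 520e-9      # operating wavelength, m
beam_waist = 0.008       # beam waist, m (assumed to be the 1/e^2 Gaussian radius)
z_total = 300.0          # total propagation length, m
z_step = 50.0            # step propagation length, m
screen_size = 0.128      # phase screen side length, m
n_grid = 128             # grid points per side
f_s = 1000.0             # sampling frequency, Hz

# Derived quantities.
grid_spacing = screen_size / n_grid                       # m per grid point
n_steps = round(z_total / z_step)                         # number of propagation steps
rayleigh_range = math.pi * beam_waist**2 / wavelength     # m, Gaussian-beam assumption
sample_period = 1.0 / f_s                                 # s between consecutive samples

if __name__ == "__main__":
    print(f"grid spacing   : {grid_spacing * 1e3:.3f} mm")
    print(f"steps          : {n_steps}")
    print(f"Rayleigh range : {rayleigh_range:.1f} m")
    print(f"sample period  : {sample_period * 1e3:.1f} ms")
```

The step count of six agrees with the six phase screens listed in the table, and the grid spacing works out to 1 mm per point.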

Optics Express

Dmitrii Briantcev	https://orcid.org/0000-0002-5170-6447
Mitchell A. Cox	https://orcid.org/0000-0002-1115-3729
Abderrahmen Trichili	https://orcid.org/0000-0001-8005-6319
Boon S. Ooi	https://orcid.org/0000-0001-9606-5578
Mohamed-Slim Alouini	https://orcid.org/0000-0003-4827-1793