Optica Publishing Group

Adaptive multi-layer filters incorporated with Volterra filters for impairment compensation including transmitter and receiver nonlinearity

Open Access

Abstract

We propose a receiver-side signal processing scheme to compensate for nonlinearity that occurs in the transmitter (Tx) and receiver (Rx) components of coherent optical fiber transmission systems. In general, nonlinear effects in transmission systems do not commute with linear effects. Considering the order in which all the relevant impairments occur, we adopt a multi-layer (ML) filter architecture. The ML filters consist of strictly-linear and widely-linear filter layers to compensate for relevant linear impairments that occur in a transmission system and two Volterra filter layers to compensate for Rx and Tx nonlinearity. The coefficients of the ML filters, including those of the Volterra filter layers, are adaptively controlled by stochastic gradient descent, with gradients calculated by back propagation from the last layer in a manner similar to the learning of neural networks, to minimize a loss function that is composed of the last layer outputs. We evaluated the compensation performance of Tx and Rx nonlinearity using the proposed adaptive ML filters including Volterra filter layers both in simulations and experiments of the transmission of a 23 Gbaud polarization-division-multiplexed 64-quadrature amplitude modulation signal over a 100-km single-mode-fiber span. The results demonstrated that the Volterra filter layers in the ML filter architecture could compensate for the nonlinearity that occurs in Tx and Rx simultaneously and effectively even when other impairments such as chromatic dispersion coexist.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Coherent detection and digital signal processing (DSP) in optical fiber communications have paved the way for the adoption of advanced modulation formats such as higher-order quadrature amplitude modulation (QAM) and probabilistic constellation shaping [1–3]. In addition, DSP makes it possible to compensate flexibly, in the digital domain, for various effects that occur in fiber transmission systems, including carrier recovery [4], accumulated chromatic dispersion (CD) [5], polarization demultiplexing while compensating for polarization mode dispersion (PMD) [6], and fiber Kerr nonlinearity [7]. Both linear and nonlinear impairments due to imperfections in optical and electrical components can also occur in a transmitter (Tx) and receiver (Rx). These impairments are becoming non-negligible, especially for signals with a high symbol rate where high-frequency devices are used [8–11].

Compensation of the impairments that occur in a Tx and Rx has been investigated for both linear [12–15] and nonlinear impairments [16–24]. Characteristics of these impairments depend on the components used in a Tx and Rx, which are usually unknown beforehand. Thus, an adaptive approach or learning is required to deal with Tx and Rx impairments. Linear impairments that occur in a Tx and Rx are mainly a timing skew between in-phase (I) and quadrature (Q) components, a gain imbalance between IQ components, and a phase deviation of IQ from $\pi /2$. Receiver-side adaptive filters can compensate for these linear impairments that occur in a Tx [12,15] and those that occur in an Rx [13–15]. Nonlinear impairments are mainly caused by digital-to-analog converters (DACs), electronic driver amplifiers, and a Mach-Zehnder modulator in a Tx, as well as electronic trans-impedance amplifiers (TIAs) and analog-to-digital converters (ADCs) in an Rx.

To compensate for nonlinear impairments that occur mainly in a Tx, digital pre-distortion on the Tx side has been investigated on the basis of Volterra filters [16–19] and neural networks [21–23]. These pre-distortion approaches enable adaptive equalization by using a signal with a high signal-to-noise ratio (SNR) that is free from the other impairments that occur in fiber transmission; however, they can only resolve Tx nonlinearity. Different approaches are required to compensate for nonlinear impairments that occur in an Rx. Moreover, other effects such as CD accumulate in a signal through fiber propagation. Nonlinear impairments are not mutually commutative with other effects, which is why compensation of fiber nonlinearity uses a split-step back propagation based on the nonlinear Schrödinger equation [7]. Therefore, to compensate for nonlinear impairments together with other impairments, the order in which all the relevant impairments occur should be considered unless one lumped adaptive nonlinear filter is used. However, conventional DSP uses block-wise compensation to deal effectively with various impairments that have different causes and models [5]. From this point of view, mutual non-commutativity of nonlinear impairments that occur in a Tx and Rx with other impairments has not been resolved in these previous approaches, preventing compensation of both Tx and Rx nonlinearity at the same time. Recently, a combination of pre-distortion at the Tx side and adaptive nonlinear equalization at the Rx side has been reported [24], where a nonlinear equalizer for Rx nonlinearity compensation is positioned first among the impairment compensation blocks and a nonlinear equalizer for Tx nonlinearity compensation is positioned last. Four real-valued nonlinear filters were used for both the Rx and Tx nonlinear equalizers and no impairment compensation blocks that have IQ cross terms were included in this DSP. It is reasonable to use four real-valued nonlinear filters since Tx and Rx nonlinearity usually affects IQ components independently in coherent optical transmission systems. However, the absence of IQ cross terms prevents compensation of IQ phase deviation. Using nonlinear equalizers that have IQ cross terms would resolve this problem, but this straightforward approach greatly increases the number of parameters of the nonlinear equalizers, resulting in high computational complexity.

Linear processes are also not mutually commutative in general in the case of multi-input multi-output (MIMO). For example, CD can be represented by a convolution of a complex-valued input signal with a complex-valued response function. This complex-valued linear model is denoted as strictly-linear (SL) [14]. The Jones matrix of CD is diagonal with identical elements, and thus CD is commutative with other SL processes such as PMD. Complex-valued linear models can be described as real-valued MIMO models with a restriction on the real-valued IQ basis representation. The response of CD is non-diagonal on the real-valued IQ basis representation and thus not commutative with linear processes that cannot be described as MIMO models with the restriction. These real-valued linear MIMO processes that can be described only without the restriction are equivalent to the models with a convolution of complex-valued signals and their complex conjugates with complex-valued response functions. These linear processes are denoted as widely-linear (WL). IQ MIMO processes such as IQ skew are WL. Thus, CD and IQ skew are not mutually commutative. To compensate at the receiver side for all the relevant linear impairments in optical fiber communication systems, including IQ skew, IQ imbalance, and IQ phase deviation that occur in both Tx and Rx, we have proposed an adaptive multi-layer (ML) filter architecture that considers the order in which the impairments occur [15]. The ML filter architecture unfolds a single adaptive filter into several filters that differ in type and size, each compensating for a corresponding impairment. The coefficients of the ML filters are adaptively controlled by gradient calculation with back propagation, which is similar to the learning of neural networks and can be applied to any differentiable parameterized function [25,26], to minimize a loss that is composed of the last layer outputs.
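
The commutativity argument above can be checked numerically. The following is a minimal single-polarization sketch, not the filters used in this work: all tap values (`h_cd`, `h_pmd`, `h_iq`, `g_iq`) are arbitrary toy numbers chosen for illustration, and the WL process is modeled as a convolution of the signal and of its complex conjugate with two separate responses.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(64) + 1j * rng.standard_normal(64)  # complex test signal

def sl_filter(x, h):
    """Strictly-linear: convolve the complex signal with a complex response."""
    return np.convolve(x, h)

def wl_filter(x, h, g):
    """Widely-linear: responses act on the signal and on its complex conjugate."""
    return np.convolve(x, h) + np.convolve(np.conj(x), g)

# Toy responses (assumed values, for illustration only).
h_cd = np.exp(1j * 0.3 * np.arange(-3, 4) ** 2) / np.sqrt(7)  # CD-like quadratic phase
h_pmd = np.array([0.9, 0.1 + 0.2j])                            # another SL response
h_iq = np.array([1.0, 0.05])                                   # WL part acting on u
g_iq = np.array([0.1, -0.05])                                  # WL part acting on conj(u)

# SL followed by SL, in both orders: the results coincide.
sl_sl_gap = np.max(np.abs(sl_filter(sl_filter(u, h_cd), h_pmd)
                          - sl_filter(sl_filter(u, h_pmd), h_cd)))

# SL (CD-like) and WL (IQ-impairment-like), in both orders: the results differ.
sl_wl_gap = np.max(np.abs(wl_filter(sl_filter(u, h_cd), h_iq, g_iq)
                          - sl_filter(wl_filter(u, h_iq, g_iq), h_cd)))

print(f"SL/SL order gap: {sl_sl_gap:.2e}")
print(f"SL/WL order gap: {sl_wl_gap:.2e}")
```

The gap in the second case comes from the conjugate branch: swapping the order replaces `h_cd` with its complex conjugate in that branch, so the two orders agree only if the SL response is real-valued.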

In this study, we extended the adaptive ML filter architecture by incorporating nonlinear filters to compensate for both Tx and Rx nonlinearity when other impairments such as CD coexist. Volterra filters and neural networks are both nonlinear functions and back propagation can be applied to both of them. A deep neural network (DNN) has been reported to perform slightly better in compensating for nonlinearity with memory effects when nonlinear compensation is performed after conventional linear impairment compensation [20]. From the viewpoint of commutativity of impairments, this previous work is regarded as nonlinearity compensation in a Tx. Although DNNs have the ability to approximate a complicated nonlinear function, random initialization of parameters is usually required before learning [27], resulting in completely random outputs in the initial phase. Regarding impairment compensation in optical fiber communications, the dominant obstacles to demodulation are linear effects that occur in fiber propagation such as CD, though Tx and Rx nonlinearity cannot be ignored. Therefore, initializing nonlinear filters that compensate for Tx and Rx nonlinearity as a certain linear or even an identity function instead of a random function can help convergence at the beginning of adaptive control. In the case of the Volterra filter, which is easily initialized as a linear filter, an optimum nonlinear function can be smoothly and steadily obtained by adaptive control from the initial state. Here, we introduced Volterra filters into the adaptive ML filters. Considering the order in which all the relevant impairments occur, the ML filters consist of SL and WL filter layers to compensate for relevant linear impairments, and two Volterra filter layers, which compensate for nonlinearity that occurs in an Rx and Tx, respectively, are appropriately positioned in the ML filters. The coefficients including those of the Volterra filter layers were adaptively controlled by a gradient calculation with back propagation and stochastic gradient descent (SGD). In this ML filter architecture including Volterra filter layers, a Volterra filter itself compensates only for Rx or Tx nonlinearity and is not required to compensate for any other effects and their interaction with nonlinearity, which expands a temporal spread. Thus, the Volterra filter layers in the ML filters can be implemented with short memory taps, which matters because the number of coefficients and computational complexity of a Volterra filter increase drastically with the length of the memory taps [28]. We evaluated the performance of the adaptive ML filters including Volterra filter layers through simulations with a simple model and experiments where more realistic Tx and Rx nonlinearity was induced by tuning the output amplitude of electronic amplifiers. The adaptive ML filters were used in receiver-side signal processing for the transmission of a 23 Gbaud polarization-division-multiplexed (PDM) 64QAM signal over one span of a 100-km single-mode fiber (SMF). The results demonstrated that the proposed architecture could compensate for the nonlinearity that occurs in both Tx and Rx simultaneously and effectively under the accumulation of CD.

2. Theory

We first review the nonlinear impairments that occur in optical fiber communication systems with coherent detection. We then show the ML filter architecture including Volterra filter layers in an appropriate order and its adaptive control. The coefficients are updated by a gradient calculation with back propagation and SGD to minimize a loss function that is composed of the last layer outputs. Although an adaptive Volterra filter is well-known [29], adaptive control of the filter coefficients with its direct input and output is insufficient when incorporating it into the adaptive ML filters. We derive the back propagation of the Volterra filter layers in the ML filters, in other words, calculating the gradients of a loss in terms of filter coefficients and inputs, when gradients in terms of filter outputs are given, to update the coefficients in all the layers including the Volterra filter ones.

Figure 1 shows a schematic diagram of a conventional wavelength-division multiplexed (WDM) transmission system with coherent detection. The transmitted data are encoded and mapped to a certain modulation format. A DAC and electric driver amplifiers generate electric signals of four streams corresponding to the IQ components of two polarizations. A continuous-wave (CW) light source from a laser diode (LD) is modulated by a modulator driven with the electric signals. The modulated signal is multiplexed with other WDM signals and launched into a fiber. On this Tx side, some nonlinearity occurs in the DAC and driver amplifiers [30]. A Mach-Zehnder modulator also has nonlinear sinusoidal characteristics [31]. During fiber propagation, CD and PMD accumulate in the signal. Fiber Kerr nonlinearity also occurs, though we ignore it here for simplicity. After fiber transmission, the signal is demultiplexed and received by a coherent receiver and sampled by an ADC. Demodulation and decoding are performed in the digital domain to recover the transmitted data. On this Rx side, the TIAs in the coherent receiver and the ADC induce nonlinearity.

Fig. 1. Schematic diagram of a WDM transmission system with coherent detection. Nonlinear impairments occur in Tx and Rx. ENC: encoder, DAC: digital-to-analog converter, MOD: modulator, LD: laser diode, SMF: single-mode fiber, EDFA: erbium-doped fiber amplifier, CRx: coherent receiver, ADC: analog-to-digital converter, DEM: demodulation, DEC: decoder.

It is computationally inefficient to compensate for all these linear and nonlinear impairments that occur in an optical fiber communication system with one large nonlinear filter, since such a filter would need many degrees of freedom, i.e., a tremendous number of cross terms and nonlinear terms with long memory. Moreover, the terms actually required would be sparse, and thus a block-wise compensation where each impairment is compensated successively is more efficient. In this block-wise impairment compensation, the order in which the respective impairments are compensated should be considered, since a number of the linear processes are not mutually commutative in general. For example, the response of CD in the real-valued IQ basis representation is non-diagonal and is not commutative with WL processes such as IQ impairments [14]. Moreover, linear processes can change the intensity profile of a signal, and thus nonlinear processes are not commutative with them. Therefore, to compensate for all the relevant linear and nonlinear impairments in a block-wise manner, they should be compensated in the inverse of the order in which they occur whenever any two of them are not mutually commutative.
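
As a small illustration of why a memoryless nonlinearity and a linear filter cannot be swapped, the sketch below applies a saturation-like function and a short FIR filter in both orders. The 3-tap filter and the random real-valued input are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(256)          # real-valued test signal (one IQ component)
h = np.array([0.25, 0.5, 0.25])       # toy low-pass FIR filter (assumed values)

def saturate(v, beta):
    """Saturation-like memoryless nonlinearity tanh(beta * v) / beta."""
    return np.tanh(beta * v) / beta

def order_gap(beta):
    """Largest difference between 'filter then saturate' and 'saturate then filter'."""
    filter_then_nl = saturate(np.convolve(x, h, mode="valid"), beta)
    nl_then_filter = np.convolve(saturate(x, beta), h, mode="valid")
    return np.max(np.abs(filter_then_nl - nl_then_filter))

gap_strong = order_gap(beta=1.0)    # clearly nonlinear regime
gap_weak = order_gap(beta=1e-3)     # almost linear regime

print(f"order gap, beta = 1:    {gap_strong:.3e}")
print(f"order gap, beta = 1e-3: {gap_weak:.3e}")
```

The gap shrinks as the nonlinearity becomes weaker, which is consistent with the ordering constraint mattering only when the nonlinear distortion is non-negligible.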

Figure 2 shows the proposed adaptive ML filter architecture where two Volterra filter layers are positioned to compensate for Rx and Tx nonlinearity. It is composed of eight layers comprising half-symbol-spaced finite impulse response (FIR) filters and Volterra filters that also operate at half-symbol spacing. The first layer has Volterra filters to compensate for Rx nonlinearity. Although complex-valued Volterra filters exist [32], a nonlinear filter does not have to compensate for any interaction between nonlinear impairment and other effects in this ML filter architecture. Considering the architecture of transmission systems with coherent detection as shown in Fig. 1, it is valid to assume that Rx nonlinearity is imposed independently on the I and Q components, defined in terms of the Rx output, by the Rx electronic amplifiers. Nonlinearity imposed by ADCs also affects the IQ components in terms of the Rx output independently. Thus, there is no nonlinear mixing between IQ components in terms of the Rx output at the input of the first layer, and the first layer requires no IQ cross terms to compensate for Rx nonlinearity in this case. Therefore, four Volterra filters with real-valued input and real-valued coefficients are used for the corresponding IQ components. The second layer has two 2$\times$1 WL filters for two polarizations to compensate for Rx linear impairments. IQ mixing in an Rx, which cannot be compensated by the four real-valued Volterra filters of the first layer, occurs at the coherent receiver as a phase deviation, and the Rx electronic amplifiers follow it. Thus, Rx nonlinearity compensation is followed by Rx linear impairment compensation. This Rx nonlinear and linear compensation chain reduces the complexity of the nonlinear filters since they require no IQ cross terms, whereas IQ mixing can be compensated by linear filters. The third layer has two 1$\times$1 SL filters to compensate for CD, whose coefficients are static and determined by the amount of accumulated CD. The fourth layer has a 2$\times$2 SL MIMO filter to perform polarization demultiplexing and PMD compensation. The fifth layer has two 1$\times$1 SL 1-tap filters to perform carrier recovery, whose coefficients, or amounts of phase compensation, are controlled by a digital phase locked loop (PLL) with the last layer outputs. The sixth layer has two 2$\times$1 WL filters to compensate for Tx linear impairments. The seventh layer has Volterra filters to compensate for Tx nonlinearity. Similar to Rx nonlinearity, it is valid to assume that Tx nonlinearity is imposed independently on the I and Q components, defined in terms of the Tx input, by the Tx electronic amplifiers. The nonlinear characteristics of a Mach-Zehnder modulator, which affect the IQ components in terms of the Tx input independently, follow it, and then a phase deviation, which causes IQ mixing, follows. Therefore, as with the first layer, four real-valued Volterra filters are used for the seventh layer and they follow Tx linear impairment compensation. At the input of the seventh layer, all mixing between IQ components in terms of the Tx input has been compensated by the previous six layers. The eighth layer has four real-valued linear filters for the IQ components that work as a matched filter. Narrow-bandwidth filtering like a matched filter can change a signal intensity profile, and is not commutative with nonlinear effects. The matched filter is thus placed as the last layer of the ML filters. The coefficients of the first, second, fourth, sixth, and seventh layers are adaptively updated to minimize a loss that is composed of the last layer outputs by a gradient calculation with back propagation from the last layer and SGD. With this adaptive control, the coefficients of the Volterra filters converge without the matrix inversion required by a least-squares solver, which can deteriorate accuracy as the size of the Volterra filter increases [20].

Fig. 2. Architecture of proposed adaptive ML filters including two Volterra layers to compensate for nonlinearity that occurs in Rx and Tx. Coefficients are adaptively controlled with back propagation and SGD to minimize loss. $\mathbb {C/R}$: complex-to-real conversion, $\mathbb {R/C}$: real-to-complex conversion, WL: widely-linear, SL: strictly-linear, CR: carrier recovery.

We derive the forward and back propagation of the Volterra filter layers in the ML filters as follows. The output and input signal vectors of the $l$-th layer that relate to the last $L = 8$ layer outputs at the timing integer $k$ are represented as $\boldsymbol{u}_{i}^{[l]}[k]$ and $\boldsymbol{u}_{i}^{[l-1]}[k]$, respectively, in the complex-valued phasor representation, as

$$\boldsymbol{u}_{i}^{[l]}[k] = (u_{i}^{[l]}[k], u_{i}^{[l]}[k-1], \ldots, u_{i}^{[l]}[k-M_{l}+1])^{\mathrm{T}},$$
$$\boldsymbol{u}_{i}^{[l-1]}[k] = (u_{i}^{[l-1]}[k], u_{i}^{[l-1]}[k-1], \ldots, u_{i}^{[l-1]}[k-M_{l-1}+1])^{\mathrm{T}},$$
where T is the transpose, $i = 1, 2$ for the two polarizations, and $M_{l}$ and $M_{l-1}$ are the lengths of the output and input signal vectors, respectively. Given that the tap length of the FIR filters of the $l$-th layer is $M^{[l]}$, it satisfies
$$M^{[l]} = M_{l-1} - M_{l} +1,$$
due to the relation of convolution. The $l$-th layer inputs are the $(l-1)$-th layer outputs in the ML filter architecture. The last $L$-th layer outputs are $\boldsymbol{u}_{i}^{[L]}[k] = u_{i}^{[L]}[k]$ and $M_{L} = 1$. We obtain them by applying the forward propagation relation of each layer from the first $l = 1$ layer successively. The forward propagation in the cases where the $l$-th layer has SL (MIMO) filters or WL (MIMO) filters has already been given [15]. What remains, in order to obtain the last outputs of the ML filters shown in Fig. 2, is the forward propagation in the case where the $l$-th layer has four real-valued Volterra filters for the IQ components.
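
The length relation can be unrolled from the last layer to find how many input samples each layer needs for one output sample. The helper below is a hypothetical sketch: the function name and the list of tap lengths are illustrative, and it assumes that no layer changes the sampling rate, so that only the convolution relation above applies.

```python
def signal_lengths(tap_lengths, last_output_length=1):
    """Return [M_0, M_1, ..., M_L] given the per-layer tap lengths M^[1], ..., M^[L]."""
    lengths = [last_output_length]            # start from M_L
    for taps in reversed(tap_lengths):
        # M_{l-1} = M_l + M^[l] - 1, rearranged from M^[l] = M_{l-1} - M_l + 1
        lengths.append(lengths[-1] + taps - 1)
    return lengths[::-1]

# Illustrative tap lengths for an eight-layer stack (assumed for this example).
taps = [1, 5, 41, 21, 1, 5, 1, 9]
M = signal_lengths(taps)
print(M)  # [77, 77, 73, 33, 13, 13, 9, 9, 1]
```

Under these assumptions, one output of the last layer depends on 77 consecutive samples at the input of the first layer.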

First, we require the relation between the signals in the complex-valued phasor representation and those in the real-valued IQ basis representation. The outputs of the $l$-th layer in the phasor representation $u_{i}^{[l]} \in \mathbb {C}$ are related to those in the IQ basis representation $u_{i\mathrm {I}}^{[l]}, u_{i\mathrm {Q}}^{[l]} \in \mathbb {R}$, by using the augmented signals with complex conjugates of $\underline {\boldsymbol{u}}^{[l]} = (u^{[l]}_{1}, u^{[l]}_{2}, u^{[l]\ast }_{1}, u^{[l]\ast }_{2})^{\mathrm {T}}$, as [14]

$$\begin{pmatrix} u^{[l]}_{1} \\ u^{[l]}_{2} \\ u^{[l]\ast}_{1} \\ u^{[l]\ast}_{2} \\ \end{pmatrix} = \begin{pmatrix} 1 & 0 & i & 0 \\ 0 & 1 & 0 & i \\ 1 & 0 & -i & 0 \\ 0 & 1 & 0 & -i \\ \end{pmatrix} \begin{pmatrix} u^{[l]}_{1\mathrm{I}} \\ u^{[l]}_{2\mathrm{I}} \\ u^{[l]}_{1\mathrm{Q}} \\ u^{[l]}_{2\mathrm{Q}} \\ \end{pmatrix} = T_{2} \begin{pmatrix} u^{[l]}_{1\mathrm{I}} \\ u^{[l]}_{2\mathrm{I}} \\ u^{[l]}_{1\mathrm{Q}} \\ u^{[l]}_{2\mathrm{Q}} \\ \end{pmatrix}, \tag{4}$$
where
$$T_{2} = \begin{pmatrix} I_{2} & i I_{2} \\ I_{2} & -i I_{2} \\ \end{pmatrix}$$
satisfies $T^{\dagger }_{2} T_{2} = T_{2} T^{\dagger }_{2} = 2 I_{4}$. $\ast$ and $\dagger$ are the complex conjugate and the Hermitian conjugate, respectively. $I_{n}$ is an identity matrix with a corresponding size. $\mathbb {C}$ and $\mathbb {R}$ are the set of complex numbers and real numbers, respectively. We denote the signals in the IQ basis representation as $(u^{[l]}_{1\mathrm {I}}, u^{[l]}_{2\mathrm {I}}, u^{[l]}_{1\mathrm {Q}}, u^{[l]}_{2\mathrm {Q}})^{\mathrm {T}} = (x^{[l]}_{1}, x^{[l]}_{2}, x^{[l]}_{3}, x^{[l]}_{4})^{\mathrm {T}} = \boldsymbol{x}^{[l]}$. $x_{q}^{[l]} \in \mathbb {R}$, where $q = 1, \ldots , 4$, are corresponding IQ components of the two polarizations. According to Eq. (4),
$$\underline{\boldsymbol{u}}^{[l]} = T_{2} \boldsymbol{x}^{[l]},$$
and
$$\boldsymbol{x}^{[l]} = \frac{1}{2} T_{2}^{{\dagger}} \underline{\boldsymbol{u}}^{[l]}.$$
These relations also hold for the layer inputs.
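
The conversion between the two representations is easy to verify numerically. The sketch below builds $T_{2}$ with NumPy and checks the normalization and the round trip; the sample values in `x` are arbitrary.

```python
import numpy as np

I2 = np.eye(2)
T2 = np.block([[I2, 1j * I2],
               [I2, -1j * I2]])

# IQ basis vector (u_1I, u_2I, u_1Q, u_2Q); arbitrary sample values.
x = np.array([0.3, -1.2, 0.7, 0.4])

u_aug = T2 @ x                        # (u_1, u_2, u_1*, u_2*)
x_back = 0.5 * T2.conj().T @ u_aug    # back to the IQ basis

print(u_aug)
print(x_back.real)
```

The first two entries of `u_aug` are the phasors $u_{1} = u_{1\mathrm{I}} + i u_{1\mathrm{Q}}$ and $u_{2} = u_{2\mathrm{I}} + i u_{2\mathrm{Q}}$, and the last two are their complex conjugates.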

When the $l$-th layer has four real-valued Volterra filters corresponding to $q = 1, \ldots , 4$, the output and input signal vectors are

$$\boldsymbol{x}_{q}^{[l]}[k] = (x_{q}^{[l]}[k], x_{q}^{[l]}[k-1], \ldots, x_{q}^{[l]}[k-M_{l}+1])^{\mathrm{T}},$$
$$\boldsymbol{x}_{q}^{[l-1]}[k] = (x_{q}^{[l-1]}[k], x_{q}^{[l-1]}[k-1], \ldots, x_{q}^{[l-1]}[k-M_{l-1}+1])^{\mathrm{T}}.$$
If a Volterra filter includes up to the $P$-th order terms, the output is
$$x^{[l]}_{q}[k] = \sum_{p=1}^{P} \sum_{m_{1}=0}^{M^{[l]}-1} \sum_{m_{2}=m_{1}}^{M^{[l]}-1} \ldots \sum_{m_{p}=m_{p-1}}^{M^{[l]}-1} h_{q, p}^{[l]}[m_{1}, \ldots, m_{p}] \prod_{r=1}^{p} x_{q}^{[l-1]}[k-m_{r}], \tag{10}$$
where $h_{q, p}^{[l]}[m_{1}, \ldots , m_{p}] \in \mathbb {R}$ are the coefficients of the $p$-th order. Reflecting the symmetry of the product of $x_{q}^{[l-1]}$, the sums run only over ordered indices $m_{1} \le \ldots \le m_{p}$, so a Volterra filter has $\sum _{p=1}^{P} \sum _{m_{1}=0}^{M^{[l]}-1} \sum _{m_{2}=m_{1}}^{M^{[l]}-1} \ldots \sum _{m_{p}=m_{p-1}}^{M^{[l]}-1} 1$ coefficients. We can carry out the forward propagation in the cases where the $l$-th layer has Volterra filters by using Eq. (10).
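
A direct, unoptimized sketch of this forward propagation for one real-valued IQ component is given below. The function names and the dictionary layout of the coefficients are assumptions of this example, not part of the original work. The coefficient count is also compared with the closed form $\sum_{p=1}^{P} \binom{M^{[l]}+p-1}{p}$, which is the standard count of non-decreasing index tuples.

```python
from itertools import combinations_with_replacement
from math import comb

import numpy as np

def volterra_indices(taps, order):
    """All index tuples (m_1 <= ... <= m_p) for p = 1, ..., order."""
    return [idx for p in range(1, order + 1)
            for idx in combinations_with_replacement(range(taps), p)]

def volterra_forward(x_in, coeffs, taps):
    """Evaluate the truncated Volterra series on a real-valued input sequence.

    x_in is in natural time order; output sample n corresponds to time
    t = n + taps - 1, so len(output) = len(x_in) - taps + 1.
    coeffs maps each index tuple to its real coefficient.
    """
    n_out = len(x_in) - taps + 1
    y = np.zeros(n_out)
    for n in range(n_out):
        t = n + taps - 1
        for idx, h in coeffs.items():
            y[n] += h * np.prod([x_in[t - m] for m in idx])
    return y

taps, order = 3, 3
indices = volterra_indices(taps, order)
n_coeffs = len(indices)
n_closed_form = sum(comb(taps + p - 1, p) for p in range(1, order + 1))

# Identity initialization: only the first-order tap at m = 0 is 1.
coeffs = {idx: 0.0 for idx in indices}
coeffs[(0,)] = 1.0

x_in = np.array([0.5, -1.0, 2.0, 0.25, -0.75])
y = volterra_forward(x_in, coeffs, taps)
print(n_coeffs, n_closed_form)
print(y)
```

With the identity initialization the filter simply passes the input through, which is the starting point for adaptive control described in the introduction.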

Adaptive control of the coefficients in each layer of the ML filters is an extension of that for our previous ML SL and WL filters [15]. We show the adaptive coefficient update algorithm and derive the back propagation of the Volterra filters. A loss function to be minimized is composed of the last $L$-th layer outputs, which we write as $y_{i}[k] = u_{i}^{[L]}[k]$. For example, SGD with data-aided least mean square (LMS) minimizes an instantaneous loss given by the squared magnitude of the error between the outputs $y_{i}[k]$ and the known training data $d_{i}[k]$ at a symbol timing integer, represented as

$$\phi [k] = \sum_{i=1}^{2} |d_{i}[k] - y_{i}[k]|^{2}.$$
By using Wirtinger derivatives, the complex-valued coefficients $\xi ^{\ast }$ of SL or WL filters are updated with gradient descent to minimize $\phi$ as
$$\xi^{{\ast}} \rightarrow \xi^{{\ast}} - 2 \alpha \frac{\partial \phi}{\partial \xi}, \tag{12}$$
where $\alpha$ is a step size. The gradients of the loss in terms of the coefficients in each layer are obtained by applying back propagation, which is based on the chain rule for derivatives, from the last $L$-th layer, and calculating the gradients in terms of coefficients and inputs of filters from the gradients in terms of outputs successively. In the case of the loss of data-aided LMS, the gradients in terms of the last layer outputs are
$$\frac{\partial \phi}{\partial u_{i}^{[L]}[k]} ={-}e_{i}^{{\ast}},$$
$$\frac{\partial \phi}{\partial u_{i}^{[L]\ast}[k]} ={-}e_{i},$$
where
$$e_{i} = d_{i}[k]- y_{i}[k].$$
We have back propagation in the cases where the $l$-th layer has SL (MIMO) filters or WL (MIMO) filters [15].
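
The gradients of the data-aided LMS loss with respect to the last layer outputs can be cross-checked against finite differences. This sketch uses the Wirtinger convention $\partial / \partial u = (\partial / \partial u_{\mathrm{I}} - i\, \partial / \partial u_{\mathrm{Q}})/2$; the sample values of `d` and `y` are arbitrary.

```python
import numpy as np

def loss(y, d):
    """Instantaneous data-aided LMS loss summed over the two polarizations."""
    return float(np.sum(np.abs(d - y) ** 2))

d = np.array([1 + 1j, -3 + 5j])          # known training symbols (arbitrary)
y = np.array([0.8 + 1.1j, -2.5 + 4.6j])  # last layer outputs (arbitrary)
e = d - y

def wirtinger_numeric(f, y, i, eps=1e-6):
    """Central-difference estimate of df/dy_i = (df/dRe - 1j * df/dIm) / 2."""
    step = np.zeros_like(y)
    step[i] = eps
    d_re = (f(y + step) - f(y - step)) / (2 * eps)
    d_im = (f(y + 1j * step) - f(y - 1j * step)) / (2 * eps)
    return 0.5 * (d_re - 1j * d_im)

grad_numeric = np.array([wirtinger_numeric(lambda v: loss(v, d), y, i)
                         for i in range(len(y))])
grad_analytic = -np.conj(e)              # gradient with respect to the output

print(grad_numeric)
print(grad_analytic)
```

The gradient with respect to the conjugate output is the complex conjugate of this result because the loss is real-valued.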

When the $l$-th layer has real-valued Volterra filters, the update of a real-valued coefficient $\xi$ with SGD is

$$\xi \rightarrow \xi - \alpha \frac{\partial \phi}{\partial \xi},$$
instead of Eq. (12). We require the relation between gradients in terms of the complex-valued phasor representation and those of the real-valued IQ basis representation as with the case of the forward propagation. According to Wirtinger derivatives,
$$\begin{pmatrix} \frac{\partial \phi}{\partial u^{[l]}_{1\mathrm{I}}} \\ \frac{\partial \phi}{\partial u^{[l]}_{2\mathrm{I}}} \\ \frac{\partial \phi}{\partial u^{[l]}_{1\mathrm{Q}}} \\ \frac{\partial \phi}{\partial u^{[l]}_{2\mathrm{Q}}} \\ \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ i & 0 & -i & 0 \\ 0 & i & 0 & -i \\ \end{pmatrix} \begin{pmatrix} \frac{\partial \phi}{\partial u^{[l]}_{1}} \\ \frac{\partial \phi}{\partial u^{[l]}_{2}} \\ \frac{\partial \phi}{\partial u^{[l]\ast}_{1}} \\ \frac{\partial \phi}{\partial u^{[l]\ast}_{2}} \\ \end{pmatrix},$$
and thus
$$\frac{\partial \phi}{\partial \boldsymbol{x}^{[l]}} = T_{2}^{\mathrm{T}} \frac{\partial \phi}{\partial \underline{\boldsymbol{u}}^{[l]}},$$
$$\frac{\partial \phi}{\partial \underline{\boldsymbol{u}}^{[l]}} = \frac{1}{2} T_{2}^{{\ast}} \frac{\partial \phi}{\partial \boldsymbol{x}^{[l]}}.$$
This gives us the gradients of the loss in terms of the real-valued outputs of a Volterra filter. These relations also hold for the gradients in terms of the inputs.
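
As a quick consistency check of these two relations, the sketch below converts the LMS gradients in the augmented phasor representation to the IQ basis and back. The error values are arbitrary; for this loss, the expected IQ-basis gradients are $-2\,\mathrm{Re}\, e_{i}$ and $-2\,\mathrm{Im}\, e_{i}$.

```python
import numpy as np

I2 = np.eye(2)
T2 = np.block([[I2, 1j * I2],
               [I2, -1j * I2]])

e = np.array([0.2 - 0.1j, -0.5 + 0.4j])   # errors of the two polarizations (arbitrary)

# Gradients with respect to (u_1, u_2, u_1*, u_2*) for the data-aided LMS loss.
grad_u_aug = np.concatenate([-np.conj(e), -e])

# Phasor representation -> IQ basis: multiply by the transpose of T2.
grad_x = T2.T @ grad_u_aug

# IQ basis -> phasor representation: multiply by half the conjugate of T2.
grad_u_back = 0.5 * T2.conj() @ grad_x

expected_grad_x = np.concatenate([-2 * e.real, -2 * e.imag])
print(grad_x.real)
print(expected_grad_x)
```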

When the gradients of the loss $\phi$ in terms of the output $\boldsymbol{x}^{[l]}_{q}[k]$ of a Volterra filter, represented as

$$\frac{\partial \phi}{\partial \boldsymbol{x}^{[l]}_{q}[k]} = \left(\frac{\partial \phi}{\partial x^{[l]}_{q}[k]}, \frac{\partial \phi}{\partial x^{[l]}_{q}[k-1]}, \ldots, \frac{\partial \phi}{\partial x^{[l]}_{q}[k-M_{l}+1]} \right)^{\mathrm{T}},$$
are given, the gradients in terms of the coefficients of the Volterra filter are
$$\frac{\partial \phi}{\partial h_{q, p}^{[l]}[m_{1}, \ldots, m_{p}]} = \sum_{m_r=0}^{M_{l}-1} \frac{\partial \phi}{\partial x^{[l]}_{q}[k-m_{r}]} \frac{\partial x^{[l]}_{q}[k-m_{r}]}{\partial h_{q, p}^{[l]}[m_{1}, \ldots, m_{p}]},$$
$$\frac{\partial x^{[l]}_{q}[k-m_{r}]}{\partial h_{q, p}^{[l]}[m_{1}, \ldots, m_{p}]} = x_{q}^{[l-1]}[k-m_{r}-m_{1}] \cdots x_{q}^{[l-1]}[k-m_{r}-m_{p}],$$
and the gradients in terms of the input of the Volterra filter are
$$\frac{\partial \phi}{\partial x^{[l-1]}_{q}[k-m_{s}]} = \sum_{m_{r}=0}^{M_{l}-1} \frac{\partial \phi}{\partial x^{[l]}_{q}[k-m_{r}]} \frac{\partial x^{[l]}_{q}[k-m_{r}]}{\partial x^{[l-1]}_{q}[k-m_{s}]},$$
$$\begin{aligned}\frac{\partial x^{[l]}_{q}[k-m_{r}]}{\partial x^{[l-1]}_{q}[k-m_{s}]} = &\sum_{p=1}^{P} \left(\sum_{m_{2} = m_{s}-m_{r}}^{M^{[l]}-1} \sum_{m_{3} = m_{2}}^{M^{[l]}-1} \ldots \sum_{m_{p} = m_{p-1}}^{M^{[l]}-1} h_{q, p}^{[l]}[m_{s}-m_{r}, m_{2}, \ldots, m_{p}] \right.\\ & \cdot x^{[l-1]}_{q}[k-m_{r}-m_{2}] \cdots x^{[l-1]}_{q}[k-m_{r}-m_{p}]\\ + &\sum_{m_{1} = 0}^{m_{s}-m_{r}} \sum_{m_{3} = m_{s}-m_{r}}^{M^{[l]}-1} \sum_{m_{4} = m_{3}}^{M^{[l]}-1} \ldots \sum_{m_{p} = m_{p-1}}^{M^{[l]}-1} h_{q, p}^{[l]}[m_{1}, m_{s}-m_{r}, m_{3}, \ldots, m_{p}]\\ &\cdot x^{[l-1]}_{q}[k-m_{r}-m_{1}] x^{[l-1]}_{q}[k-m_{r}-m_{3}] \cdots x^{[l-1]}_{q}[k-m_{r}-m_{p}]\\ + &\cdots\\ + &\sum_{m_{1} = 0}^{m_{s}-m_{r}} \sum_{m_{2} = m_{1}}^{m_{s}-m_{r}} \ldots \sum_{m_{p-1} = m_{p-2}}^{m_{s}-m_{r}} h_{q, p}^{[l]}[m_{1}, \ldots, m_{p-1}, m_{s}-m_{r}]\\ &\left.\cdot x^{[l-1]}_{q}[k-m_{r}-m_{1}] \cdots x^{[l-1]}_{q}[k-m_{r}-m_{p-1}] \right). \end{aligned}$$
We have thus obtained the back propagation of the real-valued Volterra filters. This includes the case of a linear filter (as in the eighth layer in Fig. 2) if we restrict $P = 1$. Therefore, we now have all the forward and back propagation relations of the adaptive ML filters including the Volterra filter layers.
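
The back propagation of one real-valued Volterra filter can be sketched compactly by applying the product rule to every term of the series, which is intended to be equivalent to the explicit sums above (each group of sums corresponds to the differentiated input occupying one position of the index tuple). The function names, the dictionary layout of the coefficients, and the squared-error loss against a random target are assumptions of this example. The result is compared with finite differences.

```python
from itertools import combinations_with_replacement

import numpy as np

def volterra_indices(taps, order):
    return [idx for p in range(1, order + 1)
            for idx in combinations_with_replacement(range(taps), p)]

def forward(x_in, coeffs, taps):
    """Output sample n corresponds to time t = n + taps - 1."""
    y = np.zeros(len(x_in) - taps + 1)
    for n in range(len(y)):
        t = n + taps - 1
        for idx, h in coeffs.items():
            y[n] += h * np.prod([x_in[t - m] for m in idx])
    return y

def backward(x_in, coeffs, taps, grad_out):
    """Gradients with respect to coefficients and inputs, given d(loss)/d(outputs)."""
    grad_h = {idx: 0.0 for idx in coeffs}
    grad_in = np.zeros_like(x_in)
    for n, g in enumerate(grad_out):
        t = n + taps - 1
        for idx, h in coeffs.items():
            factors = [x_in[t - m] for m in idx]
            grad_h[idx] += g * np.prod(factors)
            # Product rule: differentiate each factor of the term in turn.
            for r, m in enumerate(idx):
                others = factors[:r] + factors[r + 1:]
                grad_in[t - m] += g * h * np.prod(others)
    return grad_h, grad_in

rng = np.random.default_rng(0)
taps, order = 3, 3
x_in = rng.standard_normal(6)
coeffs = {idx: 0.1 * rng.standard_normal() for idx in volterra_indices(taps, order)}
target = rng.standard_normal(len(x_in) - taps + 1)

def loss(x, c):
    return float(np.sum((target - forward(x, c, taps)) ** 2))

grad_out = -2.0 * (target - forward(x_in, coeffs, taps))
grad_h, grad_in = backward(x_in, coeffs, taps, grad_out)

eps = 1e-6

def fd_coeff(idx):
    plus, minus = dict(coeffs), dict(coeffs)
    plus[idx] += eps
    minus[idx] -= eps
    return (loss(x_in, plus) - loss(x_in, minus)) / (2 * eps)

def fd_input(j):
    step = np.zeros_like(x_in)
    step[j] = eps
    return (loss(x_in + step, coeffs) - loss(x_in - step, coeffs)) / (2 * eps)

max_err_h = max(abs(grad_h[idx] - fd_coeff(idx)) for idx in coeffs)
max_err_in = max(abs(grad_in[j] - fd_input(j)) for j in range(len(x_in)))
print(max_err_h, max_err_in)
```

The input gradients returned by `backward` are what the preceding layer receives as its output gradients, so the same pattern chains through the whole stack.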

3. Simulation

We evaluated the performance of compensation of Tx and Rx nonlinearity by the adaptive ML filters including Volterra filter layers shown in Fig. 2 through simple numerical simulations. The simulation model corresponds to a transmission system with coherent detection as shown in Fig. 1, though we consider a case without WDM to focus on Tx and Rx nonlinearity. The transmitted signal was a 23 Gbaud PDM-64QAM. Three forward error correction (FEC) frames of the low-density parity-check code for DVB-S2 with a frame length of 64,800 and code rate of 4/5, with random bits loaded into the payload, were generated for each IQ component of the two polarizations, and they were mapped into the PDM-64QAM. To conduct a coefficient update with pilot-based data-aided LMS, a pilot symbol of the 64QAM was inserted every 15 symbols. In addition, a preamble of 64 successive symbols was inserted in each FEC frame to detect the head of the frame to carry out pilot-based signal processing. The preamble was quadrature phase-shift keying (QPSK) corresponding to the corner states of the 64QAM to distinguish it in a simple manner by using the power difference. This results in about 6.5% overhead in total in terms of the pilot and preamble symbols. Learning of large nonlinear filters can fall into overfitting, especially when using a pseudo random sequence [33]. In this study, the random pattern used for the pilot and preamble symbols was long compared to the tap length of the Volterra filters, and the performance was evaluated with a sufficiently long random data load. The PDM-64QAM signal inserted with the pilot and preamble symbols was then upsampled to 32-fold oversampling, and a root raised cosine filter with a roll-off of 0.1 was applied. Real-valued IQ components of the two polarizations of this signal correspond to the electrical signals generated by the ideal DAC. An optical modulator driven by these signals modulated a CW light source at a frequency of 193.3 THz and linewidth of 0 to generate optical signals. At this time, sinusoidal Tx nonlinearity, i.e. $\sin (\beta x)/\beta$ with the degree of nonlinearity $\beta$, which emulated a characteristic of a Mach-Zehnder modulator, was imposed on the IQ components.

After adding CD corresponding to the 100-km SMF and setting the optical SNR (OSNR) to 30 dB/0.1 nm by adding additive white Gaussian noise, the signal was coherently received. The local oscillator (LO) had a linewidth of 0 and no frequency offset from the signal carrier so that CD was the only linear impairment that was not commutative with Tx or Rx nonlinearity. The simulation results when an LO having a linewidth of 100 kHz was used are provided in Appendix A. A low-pass filter with a 3-dB bandwidth of 0.8 times the symbol rate was used to filter the electrical signals after coherent detection. Rx nonlinearity, which resembles the saturation characteristic of an electronic amplifier, i.e. $\tanh (\beta x)/\beta$, was imposed on the IQ components. After that, the signals were sampled with two-fold oversampling. Then, DSP was performed after normalizing the IQ components individually.
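
The two memoryless nonlinearity models can be written down directly. The sketch below is illustrative only: the function names are hypothetical, and the amplitude range of `x` is an assumption because the scaling of the signal relative to $\beta$ is not restated here. It shows that both models reduce to the identity as $\beta \to 0$ and compress the amplitude otherwise.

```python
import numpy as np

def tx_nonlinearity(x, beta):
    """Sinusoidal characteristic sin(beta * x) / beta (Mach-Zehnder-like)."""
    return np.sin(beta * x) / beta

def rx_nonlinearity(x, beta):
    """Saturation characteristic tanh(beta * x) / beta (amplifier-like)."""
    return np.tanh(beta * x) / beta

x = np.linspace(-1.0, 1.0, 5)   # assumed amplitude range of one IQ component

tx_weak = tx_nonlinearity(x, beta=1e-3)
rx_weak = rx_nonlinearity(x, beta=1e-3)
tx_strong = tx_nonlinearity(x, beta=1.0)
rx_strong = rx_nonlinearity(x, beta=1.0)

print(tx_strong)
print(rx_strong)
```

To third order, the two models behave as $x - \beta^{2} x^{3}/6$ and $x - \beta^{2} x^{3}/3$, respectively, so for the same $\beta$ the saturation model distorts the signal more strongly.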

In this simulation, two kinds of architectures were evaluated. The first is the conventional architecture used in optical fiber communication systems with coherent detection, which serves as a reference. It is referred to as Linear because only linear filters were used for impairment compensation. In this case, the received signals were resampled to two-fold oversampling on the basis of a timing error [34], with low-pass filtering whose 3-dB bandwidth equaled the symbol rate (this filtering was not necessary in this simulation but was effective in the experiment described later). Then, CD compensation and matched filtering were performed in the frequency domain. The head of the preamble symbols was detected from the moving average of the signal power, and the signals were aligned to perform pilot-based data-aided LMS for polarization demultiplexing and PMD compensation with a half-symbol-spaced 21-tap FIR 2$\times$2 MIMO filter. Carrier recovery was also performed by a digital PLL.

The second architecture was the adaptive ML filters including Volterra filter layers. Since CD compensation and matched filtering were performed inside the ML filters in this case, the timing alignment was performed after resampling. Here, the head of the preamble symbols was detected after CD compensation, and the timing alignment was then applied to the signal before CD compensation. Impairment compensation by the ML filters was then executed on this timing-aligned signal before CD compensation. The tap lengths of the second and sixth layers, which compensate for linear impairments in the Rx and Tx, were set to 5. The length of the third layer for CD compensation was 41, which was sufficient for the CD accumulated over a 100-km SMF. The length of the fourth layer for polarization demultiplexing and PMD compensation was 21. The length of the eighth layer was 9; although this was somewhat too short to represent a root raised cosine filter with a roll-off of 0.1 precisely, increasing the tap length of later layers in the multi-layer filters results in high computational complexity [15]. The lengths of the first and seventh Volterra filter layers, which compensate for Rx and Tx nonlinearity, were 1, because the nonlinearity imposed in the Tx and Rx has no temporal spread in this simulation. The Volterra filters included terms up to the fifth order. The coefficients of the first, second, fourth, sixth, and seventh layers were adaptively controlled by the previously described update algorithm to minimize the loss of the data-aided LMS with back propagation and SGD. They were updated at both the pilot and preamble symbols. The coefficients were initialized to 1 at the center tap of the filters on the main diagonal and to 0 elsewhere. The fifth layer for carrier recovery was updated by a PLL at all the symbols. 
A decision-directed PLL with an ideal one-symbol delay was used to focus on the performance of the adaptive control of the filter coefficients with SGD. A pilot-based digital PLL can be implemented with a low penalty [35], and sophisticated parallelization techniques have been investigated to reduce loop delay [36], though further consideration is needed for the practical implementation of a digital PLL. The step sizes were $10^{-3}$ for the second, fourth, and sixth layers and $10^{-2}$ for the first and seventh Volterra filter layers. The same step size was used for all orders of the Volterra filter coefficients. We evaluated three cases of the adaptive ML filter architecture: only the first layer, which compensates for Rx nonlinearity, was activated (ML Rx Volterra); only the seventh layer, which compensates for Tx nonlinearity, was activated (ML Tx Volterra); and both the first and seventh layers were activated (ML both Volterra). These three cases and Linear, four cases in total, were compared.
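
To give a feel for how a one-tap Volterra layer adapts under data-aided LMS with SGD, the following toy sketch trains a memoryless polynomial to undo $\tanh (\beta x)/\beta$ on a single real-valued axis. It is not the multi-layer scheme itself: there is only one layer, no back propagation through other layers, no noise, every symbol is treated as known, and only the odd orders 1, 3, and 5 are kept. All of these, together with the normalization of the outermost level to 1, are simplifying assumptions.

```python
import math
import random

BETA = 0.67     # Rx nonlinearity degree used in this section
MU = 1e-2       # step size quoted for the Volterra filter layers
LEVELS = [k / 7 for k in (-7, -5, -3, -1, 1, 3, 5, 7)]  # one 64QAM axis

def distort(x: float) -> float:
    """Saturating Rx model tanh(beta*x)/beta."""
    return math.tanh(BETA * x) / BETA

def compensate(c: list[float], r: float) -> float:
    """Memoryless odd-order polynomial c[0]*r + c[1]*r**3 + c[2]*r**5."""
    return c[0] * r + c[1] * r**3 + c[2] * r**5

def mse(c: list[float]) -> float:
    """Mean squared error over the eight amplitude levels."""
    return sum((compensate(c, distort(x)) - x) ** 2 for x in LEVELS) / len(LEVELS)

def train(num_symbols: int = 20_000, seed: int = 0) -> list[float]:
    rng = random.Random(seed)
    c = [1.0, 0.0, 0.0]              # identity initialization
    for _ in range(num_symbols):
        x = rng.choice(LEVELS)       # known (data-aided) symbol
        r = distort(x)
        e = compensate(c, r) - x     # error against the known symbol
        # SGD step on the squared error e**2 / 2 for each polynomial order
        c = [ck - MU * e * r**p for ck, p in zip(c, (1, 3, 5))]
    return c

initial = mse([1.0, 0.0, 0.0])
trained = mse(train())
print(f"MSE before: {initial:.2e}, after: {trained:.2e}")
```

The identity initialization mirrors the initialization described above, in which the center coefficient is 1 and the rest are 0.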

After impairment compensation and removal of the pilot and preamble symbols, the error vector magnitude (EVM) was evaluated in this simulation. The error vector was defined as the difference between a signal sample and its decision result, and the reference of the EVM was the averaged amplitude of the QAM.
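
The EVM definition above can be sketched as follows. The reference is taken here as the mean magnitude of the 64QAM constellation points, which is one reading of "averaged amplitude"; the RMS amplitude is another common convention and would give a slightly different number. The function names and the test signal are illustrative.

```python
import math

LEVELS = (-7, -5, -3, -1, 1, 3, 5, 7)
CONSTELLATION = [complex(i, q) for i in LEVELS for q in LEVELS]  # 64QAM grid

def decide(sample: complex) -> complex:
    """Hard decision: nearest 64QAM point."""
    return min(CONSTELLATION, key=lambda point: abs(sample - point))

def evm(samples: list[complex]) -> float:
    """RMS error vector divided by the mean amplitude of the constellation."""
    reference = sum(abs(p) for p in CONSTELLATION) / len(CONSTELLATION)
    mean_sq_error = sum(abs(s - decide(s)) ** 2 for s in samples) / len(samples)
    return math.sqrt(mean_sq_error) / reference

# Toy input: every point is shifted by an error vector of length 0.5.
shifted = [p + complex(0.3, -0.4) for p in CONSTELLATION]
print(f"EVM = {evm(shifted):.2%}")
```

Because the error is measured against the decision result rather than the transmitted symbol, decision errors make this quantity underestimate the true error at low signal quality.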

We evaluated the performance of the four compensation cases (Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra) under four conditions: without nonlinearity in the Tx and Rx (no NL), with Tx nonlinearity (Tx NL), with Rx nonlinearity (Rx NL), and with both Tx and Rx nonlinearity (both NL). The degree of nonlinearity was set to $\beta = 0.83$ for Tx nonlinearity and to $\beta = 0.67$ for Rx nonlinearity to provide similar EVM degradation in the back-to-back (b2b) condition. Figure 3 shows the constellations obtained in the b2b condition by the four compensation cases. Only the constellations of one polarization are shown in Fig. 3 for simplicity; similar results were obtained for the orthogonal polarization. With Linear processing, the 64QAM constellation was distorted when Tx or Rx nonlinearity was imposed, with the spacing between symbol points becoming narrower toward the outer points. With both Tx and Rx nonlinearity, the constellation was distorted more severely in the same way. In contrast, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing provided constellations with almost equally spaced symbol points, as in the case without nonlinearity, when Tx and Rx nonlinearity were imposed. This indicates that all three kinds of ML Volterra processing could compensate for both Tx and Rx nonlinearity, though the nonlinear compensation did not cancel Tx nonlinearity perfectly. In other words, each of the first and seventh Volterra filter layers of the ML filters could compensate for both Tx and Rx nonlinearity in the b2b condition. It should be noted that this holds only if no other effects, such as CD, polarization rotation, phase noise from an LO, and a frequency offset, coexist (see Appendix A).

Fig. 3. Simulation results of constellations in the b2b condition obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, in the cases with no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL).

Figure 4 shows the constellations obtained by the four compensation cases in the 100-km transmission. With Linear processing, the constellation under Tx nonlinearity was similar to that in the b2b condition shown in Fig. 3. In contrast, the constellation under Rx nonlinearity differed from that in the b2b condition: it had equally spaced symbol points but with increased errors around them. This is because Rx nonlinearity was imposed on the signal with accumulated CD, and CD compensation was performed after that. In the constellation with both Tx and Rx nonlinearity, the two effects, unequally spaced symbol points and increased errors around the symbol points, were superimposed. ML Tx Volterra processing compensated for Tx nonlinearity but not for Rx nonlinearity. ML Rx Volterra processing compensated for Rx nonlinearity but not for Tx nonlinearity. These results show that Tx and Rx nonlinearity behave differently and that, because the effects do not commute, compensation of Tx or Rx nonlinearity cannot work well when other effects (CD in this case) coexist unless the nonlinear compensation is placed at the appropriate position in the order of impairment compensation. Since ML both Volterra processing includes nonlinear compensation for both the Rx and Tx, it compensated for both Tx and Rx nonlinearity, as shown in Fig. 4.
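
The non-commutativity that drives these results can be seen in a toy example. The sketch below applies a saturating nonlinearity and a two-tap FIR filter in both orders and compares the outputs. The two-tap filter is only a stand-in for a linear effect with memory and is not a model of CD; the signal and tap values are arbitrary.

```python
import math

BETA = 0.67

def saturate(signal: list[float]) -> list[float]:
    """Memoryless nonlinearity tanh(beta*x)/beta applied sample by sample."""
    return [math.tanh(BETA * x) / BETA for x in signal]

def fir(signal: list[float], taps: list[float]) -> list[float]:
    """Causal FIR filter with zero initial state."""
    return [
        sum(t * signal[n - k] for k, t in enumerate(taps) if n - k >= 0)
        for n in range(len(signal))
    ]

signal = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0]
taps = [0.6, 0.4]   # toy linear effect with memory, not a CD model

linear_then_nl = saturate(fir(signal, taps))
nl_then_linear = fir(saturate(signal), taps)
gap = max(abs(a - b) for a, b in zip(linear_then_nl, nl_then_linear))
print(f"largest difference between the two orders: {gap:.4f}")
```

The two orders give different waveforms, so a compensator for the nonlinearity has to sit on the correct side of the compensator for the linear effect.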

Fig. 4. Simulation results of constellations in 100-km transmission obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, in the cases with no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL).

Figure 5 summarizes the EVM results of the constellations shown in Figs. 3 and 4. In the b2b condition shown in Fig. 5(a), the EVMs obtained by the four compensation cases with no Tx or Rx nonlinearity were similar, which indicates that the small number of taps for the eighth layer caused only a small penalty in the cases with multi-layer filters. When Tx and Rx nonlinearity were imposed, the EVMs obtained by ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing were similar to each other and better than those obtained by Linear processing. In the case with both Tx and Rx nonlinearity, the EVM by Linear processing was 9.3% and that by ML both Volterra was 5.4%. The performance by ML both Volterra did not reach the 4.4% obtained by Linear processing without nonlinearity. Possible causes are the limitations of nonlinear processing with two-fold oversampling, low-pass filtering at the Rx, and the restriction of the Volterra filters to the fifth order. In the 100-km transmission shown in Fig. 5(b), ML Tx Volterra and ML both Volterra processing achieved an EVM of 5.3% under Tx nonlinearity, and ML Rx Volterra and ML both Volterra processing achieved 4.7% under Rx nonlinearity. The EVM by ML both Volterra processing under both Tx and Rx nonlinearity was 5.5%. These results demonstrate that the proposed adaptive ML filters including Volterra filter layers can compensate for both Tx and Rx nonlinearity simultaneously and effectively even when other impairments coexist.

Fig. 5. Simulation results of EVM obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, where no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL) were imposed, in the cases of (a) b2b and (b) 100-km transmission.

We also evaluated the convergence property of the adaptive coefficient control. Figure 6 shows the time development of a training loss, that is, the loss at pilot and preamble symbols, in the 100-km transmission. In all cases of Tx and Rx nonlinearity, convergence was achieved by updating with about $2\times 10^{4}$ symbols. There were multiple peaks observed in the loss when Tx or Rx nonlinearity was imposed. These peaks were caused by preamble symbols. The four corner states of the 64QAM were used in preamble symbols for timing alignment. Preamble symbols had higher averaged power compared to the other symbols and thus induced more severe degradation when nonlinearity was imposed.
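
The higher power of the preamble symbols can be quantified for an ideal 64QAM grid. The sketch below assumes equiprobable 64QAM pilot and data symbols on the usual odd-integer grid and compares the corner-state power with the constellation average.

```python
import math

LEVELS = (-7, -5, -3, -1, 1, 3, 5, 7)
points = [complex(i, q) for i in LEVELS for q in LEVELS]   # 64QAM grid

mean_power = sum(abs(p) ** 2 for p in points) / len(points)
corner_power = abs(complex(7, 7)) ** 2
ratio = corner_power / mean_power
ratio_db = 10 * math.log10(ratio)

print(f"corner/mean power = {ratio:.3f} ({ratio_db:.2f} dB)")
```

The corner states carry 7/3 of the average symbol power, so they are pushed further into the compressive region of the nonlinearity than typical symbols.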

Fig. 6. Simulation results of time development of training loss by ML both Volterra processing moving-averaged over 100 symbols.

4. Experiment

Finally, we evaluated the performance of the compensation of Tx and Rx nonlinearity by the adaptive ML filters including Volterra filter layers in a transmission experiment of a 23 Gbaud PDM-64QAM signal over a 100-km SMF span. In this experiment, Tx and Rx nonlinearity were induced by tuning the output amplitude of the electronic amplifiers in the Tx and Rx.

The experimental setup is shown in Fig. 7. At the transmitter side, as in the previous simulation, three FEC frames were generated for each IQ component of the two polarizations and mapped into the PDM-64QAM with the pilot and preamble symbols inserted. The real-valued IQ components of the two polarizations of the signal were then upsampled to two-fold oversampling, and a raised cosine filter with a roll-off of 0.1 and pre-compensation of the frequency characteristics of the Tx components were applied. These signals were upsampled again to four-fold oversampling and generated by a DAC with a sampling rate of 92 GS/s and a vertical resolution of eight bits. An optical modulator driven by the DAC outputs, with an LD source at a frequency of 193.3 THz and a linewidth of about 100 kHz, generated a 23 Gbaud PDM-64QAM signal. Tx nonlinearity was induced by tuning the output amplitude of the electronic driver amplifiers operating on the DAC outputs.

Fig. 7. Experimental setup for transmission of a 23 Gbaud PDM-64QAM signal over the span of a 100-km SMF. Tx(Rx) nonlinearity was induced by tuning an output amplitude of electronic amplifiers in Tx(Rx). LD: laser diode, DAC: digital-to-analog converter, MOD: modulator, PS: polarization scrambling, SMF: single-mode fiber, ASE: amplified spontaneous emission, OBPF: optical band-pass filter, ADC: analog-to-digital converter.

The generated signal was then transmitted over a span of a 100-km SMF after low-speed ($10\times 2\pi$ rad/s) polarization scrambling. The span input power was set to 0 dBm, which gave rise to some fiber nonlinearity in the form of self-phase modulation. After the 100-km SMF transmission, the OSNR was set by adding amplified spontaneous emission (ASE). When no ASE was added, the received OSNR was 34.9 dB/0.1 nm. The signal then passed through an optical band-pass filter with a 3-dB bandwidth of 50 GHz and was detected with an integrated polarization-diversity coherent Rx including gain-controlled electronic amplifiers. Its 3-dB bandwidth was 40 GHz. The LO, with a linewidth of about 100 kHz, and the signal source were free-running, and the frequency offset was within about $\pm$100 MHz. Since a frequency offset can be estimated and compensated after CD compensation [37], this small frequency offset is a reasonable starting point for the adaptive control of the filters. The electrical outputs of the coherent Rx were sampled by a digital oscilloscope acting as an ADC with a sampling rate of 80 GS/s, a vertical resolution of eight bits, and a bandwidth of 25 GHz. Three acquisitions were obtained for each condition. Rx nonlinearity was induced by tuning the output amplitude of the transimpedance amplifier (TIA) in the coherent Rx.

DSP was performed offline. In this experiment, we also evaluated the performance of four compensation cases (Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra). The configuration of DSP was the same as that in the previous simulation, except that the tap length of the fourth layer to perform polarization demultiplexing and PMD compensation (and also that in Linear) was 41. The tap lengths and the maximum orders of Volterra filters of the first and seventh layers to compensate for Rx and Tx nonlinearity were optimized since the temporal spread of nonlinearity was unknown in this experiment, as discussed later. The step sizes including the Volterra filter layers were $10^{-3}$. After convergence of the adaptive control of the filter coefficients by using about $3\times 10^{6}$ samples, which includes about $6\times 10^{4}$ pilot and preamble symbols, the pre-/post-FEC bit error rate (BER) and EVM were evaluated with the removal of the pilot and preamble symbols. The median of the results of the three acquisitions was adopted.

Figure 8 shows the experimental results of the EVM obtained by Linear processing when the output amplitude of electronic driver amplifiers in the Tx or that of the TIA in the Rx was tuned. The output amplitude was normalized by its optimum value. In the cases when the electronic output amplitudes were tuned in both the Tx and Rx, the EVM degraded as the output amplitude exceeded its optimum, which suggested that some nonlinearity was induced in the Tx and Rx. The optical spectra when the output amplitude of the Tx electronic amplifiers was tuned are shown in Fig. 9. The powers of the four cases were normalized. As the output amplitude of the Tx electronic amplifiers increased, the optical spectrum became slightly broader due to nonlinearity. We hereinafter evaluated the performance in the four conditions similar to that in the simulation: the output amplitudes were both optimum and no nonlinearity was induced in the Tx and Rx (no NL); the output amplitude in the Tx was two and Tx nonlinearity was induced (Tx NL); the output amplitude in the Rx was two and Rx nonlinearity was induced (Rx NL); the output amplitudes in the Tx and Rx were two and both Tx and Rx nonlinearity were induced (both NL).

Fig. 8. Experimental results of EVM obtained by Linear processing when the output amplitude of the (a) Tx or (b) Rx electronic amplifiers was tuned. The output amplitude was normalized by its optimum value.

Fig. 9. Optical spectra when the output amplitude of the Tx electronic amplifiers was tuned from 0.5 to 2.0. The powers of the four cases were normalized. The resolution bandwidth was 0.02 nm.

We evaluated the EVM while changing the tap lengths and maximum orders of the Volterra filters to optimize them. No ASE was added after transmission. Figure 10(a) shows the results of the EVM by ML Tx Volterra processing where Tx nonlinearity was induced, as a function of the total number of coefficients of the Volterra filters of the seventh layer. The tap length $M^{[7]}$ and the maximum order $P$ of the seventh layer Volterra filters were changed. The step size was roughly optimized for each size. Focusing on the results with the cases where the tap length of the Volterra filters $M^{[7]} = 1$, the performance with $P = 5$ outperformed that with $P = 3$, and little improvement was observed with $P = 7$. Then, focusing on the results when the maximum order $P = 3$, they indicated that a tap length of five was a good choice. Figure 10(b) shows the results of EVM by ML Rx Volterra processing where Rx nonlinearity was induced. The tap length and maximum order of the first layer Volterra filters were changed. In accordance with these results, we set the tap lengths of the two Volterra filter layers to $M^{[7]} = M^{[1]} = 5$, and the maximum orders $P = 5$.
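
The growth of the filter size with tap length and maximum order can be estimated with a standard count. The sketch below assumes a real-valued Volterra filter with symmetric kernels that keeps every order from 1 to $P$, for which order $p$ with tap length $M$ has $\binom{M+p-1}{p}$ distinct coefficients. This is a textbook parameterization and not necessarily the one used for the filters evaluated here, so the totals need not match the horizontal axis of Fig. 10.

```python
from math import comb

def volterra_coefficients(tap_length: int, max_order: int) -> int:
    """Distinct coefficients of a real Volterra filter with symmetric kernels.

    Assumption: all orders 1..max_order are included, and coefficients that
    differ only by a permutation of their delay indices are merged.
    """
    return sum(comb(tap_length + p - 1, p) for p in range(1, max_order + 1))

for m, p in [(1, 3), (1, 5), (1, 7), (5, 3), (5, 5)]:
    print(f"M={m}, P={p}: {volterra_coefficients(m, p)} coefficients")
```

Under this count, increasing the tap length is far more expensive than increasing the order of a one-tap filter, which is why the two parameters are worth optimizing separately.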

Fig. 10. Experimental results of dependence of EVM on Volterra filter size, (a) obtained by ML Tx Volterra processing where Tx nonlinearity was induced, and (b) by ML Rx Volterra processing where Rx nonlinearity was induced.

Figure 11 shows the results of constellations obtained after the 100-km transmission by the four compensation cases when Tx or Rx nonlinearity was induced. As was the case in the simulation, ML Tx Volterra processing compensated for Tx nonlinearity and ML Rx Volterra processing compensated for Rx nonlinearity. In the case of ML both Volterra processing, compensation of both Tx and Rx nonlinearity was accomplished. Moreover, ML Tx Volterra and ML both Volterra processing provided better constellations in the case without nonlinearity induced compared with that by Linear processing. This suggests that some Tx nonlinearity remained when the electronic output amplitude in the Tx was at its optimum in this experiment.

Fig. 11. Experimental results of constellations in 100-km transmission obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing in the conditions with no nonlinearity induced (no NL), Tx nonlinearity induced (Tx NL), Rx nonlinearity induced (Rx NL), and both Tx and Rx nonlinearity induced (both NL).

We evaluated the performance of the four compensation cases while changing the received OSNR to assess the robustness of the adaptive control of the ML filter coefficients including Volterra filter layers with back propagation and SGD. Figures 12 and 13 show the pre-FEC and post-FEC BER, respectively, obtained by the four compensation cases in the four conditions of Tx and Rx nonlinearity. Error-free results are plotted at $10^{-6}$ for visibility. In the condition without Tx and Rx nonlinearity induced (Figs. 12(a) and 13(a)), ML both Volterra and ML Tx Volterra processing provided better pre-FEC BER than the others, especially in the high-OSNR region, since some Tx nonlinearity remained in this case. Post-FEC error-free operation was achieved above 22 dB/0.1 nm with all four compensation cases. In the condition with Tx nonlinearity (Figs. 12(b) and 13(b)), the pre-FEC BER behaved similarly to the case with no nonlinearity induced, except that all results were degraded. Post-FEC error-free operation was achieved down to 29 dB/0.1 nm by ML Rx Volterra processing, 25 dB/0.1 nm by Linear, and 23 dB/0.1 nm by ML Tx Volterra and ML both Volterra processing. When no nonlinearity or Tx nonlinearity was induced, the performance of ML Rx Volterra processing was slightly worse than that of Linear processing. A possible cause is excess error in the adaptive coefficient control [38]: since ML Rx Volterra has more coefficients than Linear processing, the degradation due to excess error was more severe. In the condition with Rx nonlinearity (Figs. 12(c) and 13(c)), ML both Volterra processing provided the best pre-FEC BER, followed by ML Rx Volterra processing, since it compensated for Rx nonlinearity. Post-FEC error-free operation was achieved down to 25 dB/0.1 nm by Linear and ML Tx Volterra, 24 dB/0.1 nm by ML Rx Volterra, and 22 dB/0.1 nm by ML both Volterra processing. In the condition with both Tx and Rx nonlinearity (Figs. 12(d) and 13(d)), post-FEC error-free operation was not achieved by Linear processing, whereas it was achieved down to 30 dB/0.1 nm by ML Rx Volterra, 26 dB/0.1 nm by ML Tx Volterra, and 23 dB/0.1 nm by ML both Volterra processing. ML both Volterra processing provided more robust performance than the others regardless of whether Tx or Rx nonlinearity was induced. Therefore, the results demonstrated that the adaptive ML filters including Volterra filter layers could compensate for the nonlinearity that occurs in both Tx and Rx simultaneously and effectively under the accumulation of CD and other effects, including a frequency offset and polarization rotation, which do not commute with Tx and Rx nonlinearity.

Fig. 12. Experimental results of dependence of pre-FEC BER on OSNR obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing in the conditions with (a) no nonlinearity induced (no NL), (b) Tx nonlinearity induced (Tx NL), (c) Rx nonlinearity induced (Rx NL), and (d) both Tx and Rx nonlinearity induced (both NL).

Fig. 13. Experimental results of dependence of post-FEC BER on OSNR obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing in the conditions with (a) no nonlinearity induced (no NL), (b) Tx nonlinearity induced (Tx NL), (c) Rx nonlinearity induced (Rx NL), and (d) both Tx and Rx nonlinearity induced (both NL).

5. Conclusion

We proposed an ML filter architecture in which two Volterra filter layers were positioned to compensate for nonlinearity that occurs in Tx and Rx components under other impairments such as CD. The coefficients of the ML filters including Volterra filter layers are adaptively controlled by using a gradient calculation with back propagation and SGD to minimize the loss function that is composed of the last layer outputs. We evaluated the performance of the adaptive ML filters including Volterra filter layers numerically and experimentally in the transmission of a 23 Gbaud PDM-64QAM signal over one span of a 100-km SMF. The results demonstrated that the adaptive ML filters including Volterra filter layers could compensate for the nonlinearity that occurs in both Tx and Rx simultaneously and effectively when other impairments coexisted.

Appendix A: Simulation with laser phase noise

In the previous simulations, laser phase noise was ignored so that CD was the only linear impairment that did not commute with Tx or Rx nonlinearity. Consequently, no linear impairment that fails to commute with Tx or Rx nonlinearity was present in the b2b condition, in contrast to the 100-km transmission, in which CD coexisted.

Figure 14 shows the simulation results of the EVM when the LO linewidth was 100 kHz. In contrast to the case without phase noise shown in Fig. 5(a), the EVMs obtained by ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing differed in the b2b condition, as shown in Fig. 14(a), when Tx or Rx nonlinearity was imposed. Tx nonlinearity was compensated by ML Tx Volterra and ML both Volterra processing. Rx nonlinearity was compensated by ML Rx Volterra and ML both Volterra processing. The EVM results in the 100-km transmission shown in Fig. 14(b) show similar behaviors. Thus, not only CD but also other linear effects that do not commute with Tx or Rx nonlinearity prevent a single nonlinear filter positioned at the beginning or the end of the compensation blocks from compensating for both Tx and Rx nonlinearity. Nevertheless, ML both Volterra processing could compensate for all the relevant impairments including Tx and Rx nonlinearity.

Fig. 14. Simulation results of EVM with LO phase noise of 100 kHz obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, where no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL) were imposed, in the cases of (a) b2b and (b) 100-km transmission.

Appendix B: Converged coefficients of Volterra filters in experiment

Figure 15 shows the histograms of the square amplitudes of the converged coefficients of the four Volterra filters at the first and seventh layers obtained by ML both Volterra processing in the 100-km transmission experiment with Tx and Rx nonlinearity induced (both NL) when no ASE was added. The tap lengths of the two Volterra filter layers were $M^{[7]} = M^{[1]} = 5$, and the maximum orders were $P = 5$. Note that the inputs of the multi-layer filters were normalized so that the corner states of 64QAM corresponded to a square amplitude of one. The histograms of the coefficients of orders $p = 1$ to 5 are shown individually. With the LMS algorithm, no clear separation between important and unimportant coefficients emerged, either for the Volterra filters of the seventh layer for Tx nonlinearity compensation or for those of the first layer for Rx nonlinearity compensation. Regularization techniques such as $L_{1}$ regularization can induce sparse coefficients, and pruning them may help to reduce the computational complexity of the Volterra filters. Mitigation of the computational complexity for practical implementation remains for our future study.
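
As an illustration of how $L_{1}$ regularization produces coefficients that can be pruned, the sketch below shows the soft-thresholding step that is commonly paired with a gradient update for an $L_{1}$ penalty. It was not part of the processing evaluated in this work; the function names, the threshold, and the coefficient values are made up for the example.

```python
import math

def soft_threshold(coefficients: list[float], threshold: float) -> list[float]:
    """Proximal step for an L1 penalty: shrink magnitudes, zero the small ones."""
    shrunk = []
    for c in coefficients:
        magnitude = max(abs(c) - threshold, 0.0)
        shrunk.append(0.0 if magnitude == 0.0 else math.copysign(magnitude, c))
    return shrunk

def prune_mask(coefficients: list[float]) -> list[bool]:
    """True where a coefficient survived and still has to be computed."""
    return [c != 0.0 for c in coefficients]

example = [0.9, -0.02, 0.15, 0.004, -0.3]   # made-up values, not measured data
sparse = soft_threshold(example, threshold=0.05)
print(sparse)
print(prune_mask(sparse))
```

Terms whose mask entry is false can be dropped from the filter, which is where the complexity reduction would come from.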

Fig. 15. Histogram of the square amplitude of converged coefficients of the Volterra filters of (a) the seventh layer for Tx nonlinearity compensation and (b) the first layer for Rx nonlinearity compensation when ML both Volterra processing was used in the 100 km transmission experiment with Tx and Rx nonlinearity (both NL).

Appendix C: Performances of adaptive control of Volterra filters in different OSNR

The quality of the adaptive control of the Volterra filter coefficients under different OSNR conditions was assessed. Figure 16 shows the experimental results of the pre-/post-FEC BER in the 100-km transmission obtained by ML both Volterra processing in the condition with both Tx and Rx nonlinearity induced (both NL). We compared three adaptive control procedures while changing the OSNR. In the first, the coefficients of the Volterra filters at the first and seventh layers were converged individually at each received OSNR, which is the procedure used to obtain the previous experimental results. In the second, the converged coefficients obtained at the highest OSNR (without ASE) were used for the Volterra filters and were fixed without update at each OSNR. In the third, the coefficients were initialized to the converged coefficients obtained at the highest OSNR and were then updated at each OSNR. According to Fig. 16, these three cases provided similar pre- and post-FEC BERs down to low OSNR. Thus, adaptive control of the Volterra filter coefficients was robust and caused little penalty at low OSNR in this evaluation.

Fig. 16. Experimental results of dependence of (a) pre- and (b) post-FEC BER on OSNR in 100-km transmission obtained by ML both Volterra processing in the conditions with both Tx and Rx nonlinearity induced (both NL), where three different cases for adaptive control of the coefficients of the Volterra filters at the first and seventh layers were evaluated.

Funding

Ministry of Education, Culture, Sports, Science and Technology (19H02138).

Acknowledgments

We thank Masaki Sato, Ankith Vinayachandran, Kohei Hosokawa, and Emmanuel Le Taillandier de Gabory for their insightful discussions.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. H.-C. Chien, J. Zhang, J. Yu, and Y. Cai, “Single-carrier 400G PM-256QAM generation at 34 Gbaud trading off bandwidth constraints and coding overheads,” in Optical Fiber Communication Conference (2017), paper W1J.3.

2. F. Buchali, F. Steiner, G. Böcherer, L. Schmalen, P. Schulte, and W. Idler, “Rate adaptation and reach increase by probabilistically shaped 64-QAM: An experimental demonstration,” J. Lightwave Technol. 34(7), 1599–1609 (2016).

3. X. Chen, J. Cho, A. Adamiecki, and P. Winzer, “16384-QAM transmission at 10 GBd over 25-km SSMF using polarization-multiplexed probabilistic constellation shaping,” in European Conference on Optical Communication (2019), paper PD3.3.

4. F. Derr, “Coherent optical QPSK intradyne system: Concept and digital receiver realization,” J. Lightwave Technol. 10(9), 1290–1296 (1992).

5. S. J. Savory, “Digital filters for coherent optical receivers,” Opt. Express 16(2), 804–817 (2008).

6. Y. Han and G. Li, “Coherent optical communication using polarization multiple-input-multiple-output,” Opt. Express 13(19), 7527–7534 (2005).

7. E. Ip and J. M. Kahn, “Compensation of dispersion and nonlinear impairments using digital backpropagation,” J. Lightwave Technol. 26(20), 3416–3425 (2008).

8. K. Schuh, F. Buchali, W. Idler, T. A. Eriksson, L. Schmalen, W. Templ, L. Altenhain, U. Dümler, R. Schmid, M. Möller, and K. Engenhardt, “Single carrier 1.2 Tbit/s transmission over 300 km with PM-64QAM at 100 Gbaud,” in Optical Fiber Communication Conference (2017), paper Th5B.5.

9. F. Buchali, M. Chagnon, K. Schuh, and V. Lauinger, “Beyond 100 Gbaud transmission supported by 120 GSa/s CMOS digital to analog converter,” in European Conference on Optical Communication (2019), paper Tu.2.D.3.

10. M. Nakamura, F. Hamaoka, H. Yamazaki, M. Nagatani, Y. Ogiso, H. Wakita, M. Ida, A. Matsushita, T. Kobayashi, H. Nosaka, and Y. Miyamoto, “1.3-Tbps/carrier net-rate signal transmission with 168-Gbaud PDM PS-64QAM using analogue-multiplexer-integrated optical frontend module,” in European Conference on Optical Communication (2019), paper Tu.2.D.5.

11. M. R. Chitgarha, P. Studenkov, J. Zhang, H. Hodaei, H. Tsai, S. Buggaveeti, A. Rashidinejad, A. Yekani, R. Mirzaei Nejad, S. Kerns, J. Diniz, D. Pavinski, R. Brigham, B. Foo, M. Al-Khateeb, S. Koenig, S. Wolf, R. Going, S. Porto, I. Leung, R. Maher, V. Dominic, H. Sun, S. Sanders, J. Osenbach, S. Corzine, P. Evans, V. Lal, and M. Ziari, “2×800Gbps/wave coherent optical module using a monolithic InP transceiver PIC,” in European Conference on Optical Communication (2020), paper We2C-1.

12. C. R. S. Fludger and T. Kupfer, “Transmitter impairment mitigation and monitoring for high baud-rate, high order modulation systems,” in European Conference on Optical Communication (2016), paper Tu.2.A.2.

13. R. Rios-Müller, J. Renaudier, and G. Charlet, “Blind receiver skew compensation and estimation for long-haul non-dispersion managed systems using adaptive equalizer,” J. Lightwave Technol. 33(7), 1315–1318 (2015).

14. E. P. da Silva and D. Zibar, “Widely linear equalization for IQ imbalance and skew compensation in optical coherent receivers,” J. Lightwave Technol. 34(15), 3577–3586 (2016).

15. M. Arikawa and K. Hayashi, “Adaptive equalization of transmitter and receiver IQ skew by multi-layer linear and widely linear filters with deep unfolding,” Opt. Express 28(16), 23478–23494 (2020).

16. A. Rezania, J. C. Cartledge, A. Bakhshali, and W.-Y. Chan, “Compensation schemes for transmitter- and receiver-based pattern-dependent distortion,” IEEE Photonics Technol. Lett. 28(22), 2641–2644 (2016).

17. G. Khanna, B. Spinnler, S. Calabrò, E. De Man, and N. Hanik, “A robust adaptive pre-distortion method for optical communication transmitters,” IEEE Photonics Technol. Lett. 28(7), 752–755 (2016).

18. P. W. Berenguer, M. Nölle, L. Molle, T. Raman, A. Napoli, C. Schubert, and J. K. Fischer, “Nonlinear digital pre-distortion of transmitter components,” J. Lightwave Technol. 34(8), 1739–1745 (2016).

19. H. Faig, Y. Yoffe, E. Wohlgemuth, and D. Sadot, “Dimensions-reduced Volterra digital pre-distortion based on orthogonal basis for band-limited nonlinear opto-electric components,” IEEE Photonics J. 11(1), 1–13 (2019).

20. C. Bluemm, M. Schaedler, S. Calabrò, G. Charlet, C. Xie, F. Pittalà, and M. Kuschnerov, “Equalizing nonlinearities with memory effects: Volterra series vs. deep neural networks,” in European Conference on Optical Communication (2019), paper W.3.B.3.

21. M. Abu-Romoh, S. Sygletos, I. D. Phillips, and W. Forysiak, “Neural-network-based pre-distortion method to compensate for low resolution DAC nonlinearity,” in European Conference on Optical Communication (2019), paper Th.1.B.4.

22. G. Paryanti, H. Faig, L. Rokach, and D. Sadot, “A direct learning approach for neural network based pre-distortion for coherent nonlinear optical transmitter,” J. Lightwave Technol. 38(15), 3883–3896 (2020). [CrossRef]  

23. V. Bajaj, F. Buchali, M. Chagnon, S. Wahls, and V. Aref, “Single-channel 1.61 Tb/s optical coherent transmission enabled by neural network-based digital pre-distortion,” in European Conference on Optical Communication (2020), paper Tu1D-5.

24. F. Pittalà, M. Schaedler, C. Bluemm, G. Goeger, S. Calabrò, M. Kuschnerov, and C. Xie, “800ZR+DWDM demonstration over 600km G.654D fiber enabled by adaptive nonlinear TripleX equalization,” in Optical Fiber Communication Conference (2020), paper M4K.5.

25. K. Gregor and Y. LeCun, “Learning fast approximation of sparse coding,” in International Conference on Machine Learning (2010), 399–406.

26. S. Takabe, M. Imanishi, T. Wadayama, R. Hayakawa, and K. Hayashi, “Trainable projected gradient detector for massive overloaded MIMO channels: Data-driven tuning approach,” IEEE Access 7, 93326–93338 (2019). [CrossRef]  

27. I. Goodfellow, Y. Bengio, and A. Courville, Deep learning (MIT Press, 2016, Chap. 8).

28. J. Tsimbinos and K. V. Lever, “The computational complexity of nonlinear compensators based on the Volterra inverse,” in Workshop on Statistical Signal and Array Processing (1996), 387–390.

29. V. J. Mathews, “Adaptive polynomial filters,” IEEE Signal Process. Mag. 8(3), 10–26 (1991). [CrossRef]  

30. S. Amiralizadeh, A. T. Nguyen, and L. A. Rusch, “Modeling and compensation of transmitter nonlinearity in coherent optical OFDM,” Opt. Express 23(20), 26192–26207 (2015). [CrossRef]  

31. K.-P. Ho, Phase-modulated optical communication systems (Springer, 2005, Chap. 2).

32. P. Chevalier, P. Duvaut, and B. Picinbono, “Complex transversal Volterra filters optimal for detection and estimation,” in International Conference on Acoustics, Speech, and Signal Processing (1991), 3537–3540.

33. T. A. Eriksson, H. Bülow, and A. Leven, “Applying neural networks in optical communication systems: Possible pitfalls,” IEEE Photonics Technol. Lett. 29(23), 2091–2094 (2017). [CrossRef]  

34. F. M. Gardner, “A BPSK/QPSK timing-error detector for sampled receivers,” IEEE Trans. on Commun. 34(5), 423–429 (1986). [CrossRef]  

35. X. Chen, J. Cho, and D. Che, “Experimental quantification of implementation penalties from laser phase noise for ultra-high-order QAM signals,” in European Conference on Optical Communication (2020), paper Tu2D-5.

36. Q. Zhuge, M. Morsy-Osman, X. Xu, M. E. Mousa-Pasandi, M. Chagnon, Z. A. El-Sahn, and D. V. Plant, “Pilot-aided carrier phase recovery for M-QAM using superscalar parallelization based PLL,” Opt. Express 20(17), 19599–19609 (2012). [CrossRef]  

37. M. Selmi, Y. Jaouën, and P. Ciblat, “Accurate digital frequency offset estimator for coherent PolMux QAM transmission systems,” in European Conference on Optical Communication (2009), paper P3.08.

38. J. G. Proakis and M. Salehi, Digital communications, 5th ed. (McGraw-Hill, 2008, Chap. 10).

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Figures (16)

Fig. 1. Schematic diagram of a WDM transmission system with coherent detection. Nonlinear impairments occur in Tx and Rx. ENC: encoder, DAC: digital-to-analog converter, MOD: modulator, LD: laser diode, SMF: single-mode fiber, EDFA: erbium-doped fiber amplifier, CRx: coherent receiver, ADC: analog-to-digital converter, DEM: demodulation, DEC: decoder.
Fig. 2. Architecture of proposed adaptive ML filters including two Volterra layers to compensate for nonlinearity that occurs in Rx and Tx. Coefficients are adaptively controlled with back propagation and SGD to minimize loss. $\mathbb{C}/\mathbb{R}$: complex-to-real conversion, $\mathbb{R}/\mathbb{C}$: real-to-complex conversion, WL: widely-linear, SL: strictly-linear, CR: carrier recovery.
Fig. 3. Simulation results of constellations in the b2b condition obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, in the cases with no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL).
Fig. 4. Simulation results of constellations in 100-km transmission obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, in the cases with no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL).
Fig. 5. Simulation results of EVM obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, where no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL) were imposed, in the cases of (a) b2b and (b) 100-km transmission.
Fig. 6. Simulation results of the time evolution of the training loss for ML both Volterra processing, moving-averaged over 100 symbols.
Fig. 7. Experimental setup for transmission of a 23 Gbaud PDM-64QAM signal over a 100-km SMF span. Tx (Rx) nonlinearity was induced by tuning the output amplitude of the electronic amplifiers in Tx (Rx). LD: laser diode, DAC: digital-to-analog converter, MOD: modulator, PS: polarization scrambling, SMF: single-mode fiber, ASE: amplified spontaneous emission, OBPF: optical band-pass filter, ADC: analog-to-digital converter.
Fig. 8. Experimental results of EVM obtained by Linear processing when the output amplitude of the (a) Tx or (b) Rx electronic amplifiers was tuned. The output amplitude was normalized by its optimum value.
Fig. 9. Optical spectra when the output amplitude of the Tx electronic amplifiers was tuned from 0.5 to 2.0. The powers of the four cases were normalized. The resolution bandwidth was 0.02 nm.
Fig. 10. Experimental results of dependence of EVM on Volterra filter size, (a) obtained by ML Tx Volterra processing where Tx nonlinearity was induced, and (b) by ML Rx Volterra processing where Rx nonlinearity was induced.
Fig. 11. Experimental results of constellations in 100-km transmission obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing in the conditions with no nonlinearity induced (no NL), Tx nonlinearity induced (Tx NL), Rx nonlinearity induced (Rx NL), and both Tx and Rx nonlinearity induced (both NL).
Fig. 12. Experimental results of dependence of pre-FEC BER on OSNR obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing in the conditions with (a) no nonlinearity induced (no NL), (b) Tx nonlinearity induced (Tx NL), (c) Rx nonlinearity induced (Rx NL), and (d) both Tx and Rx nonlinearity induced (both NL).
Fig. 13. Experimental results of dependence of post-FEC BER on OSNR obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing in the conditions with (a) no nonlinearity induced (no NL), (b) Tx nonlinearity induced (Tx NL), (c) Rx nonlinearity induced (Rx NL), and (d) both Tx and Rx nonlinearity induced (both NL).
Fig. 14. Simulation results of EVM with LO phase noise of 100 kHz obtained by Linear, ML Tx Volterra, ML Rx Volterra, and ML both Volterra processing, where no nonlinearity (no NL), Tx nonlinearity (Tx NL), Rx nonlinearity (Rx NL), and both Tx and Rx nonlinearity (both NL) were imposed, in the cases of (a) b2b and (b) 100-km transmission.
Fig. 15. Histogram of the squared amplitude of converged coefficients of the Volterra filters of (a) the seventh layer for Tx nonlinearity compensation and (b) the first layer for Rx nonlinearity compensation when ML both Volterra processing was used in the 100-km transmission experiment with Tx and Rx nonlinearity (both NL).
Fig. 16. Experimental results of dependence of (a) pre- and (b) post-FEC BER on OSNR in 100-km transmission obtained by ML both Volterra processing in the conditions with both Tx and Rx nonlinearity induced (both NL), where three different cases for adaptive control of the coefficients of the Volterra filters at the first and seventh layers were evaluated.

Equations (24)

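$$\mathbf{u}_i^{[l]}[k] = \left(u_i^{[l]}[k],\, u_i^{[l]}[k-1],\, \ldots,\, u_i^{[l]}[k-M_l+1]\right)^{\mathrm{T}},$$
$$\mathbf{u}_i^{[l-1]}[k] = \left(u_i^{[l-1]}[k],\, u_i^{[l-1]}[k-1],\, \ldots,\, u_i^{[l-1]}[k-M_{l-1}+1]\right)^{\mathrm{T}},$$
$$M^{[l]} = M_{l-1} - M_l + 1,$$
$$\begin{pmatrix} u_1^{[l]} \\ u_2^{[l]} \\ u_1^{[l]*} \\ u_2^{[l]*} \end{pmatrix} = \begin{pmatrix} 1 & 0 & i & 0 \\ 0 & 1 & 0 & i \\ 1 & 0 & -i & 0 \\ 0 & 1 & 0 & -i \end{pmatrix} \begin{pmatrix} u_{1\mathrm{I}}^{[l]} \\ u_{2\mathrm{I}}^{[l]} \\ u_{1\mathrm{Q}}^{[l]} \\ u_{2\mathrm{Q}}^{[l]} \end{pmatrix} = T_2 \begin{pmatrix} u_{1\mathrm{I}}^{[l]} \\ u_{2\mathrm{I}}^{[l]} \\ u_{1\mathrm{Q}}^{[l]} \\ u_{2\mathrm{Q}}^{[l]} \end{pmatrix},$$
$$T_2 = \begin{pmatrix} I_2 & iI_2 \\ I_2 & -iI_2 \end{pmatrix}$$
$$\underline{\mathbf{u}}^{[l]} = T_2\, \mathbf{x}^{[l]},$$
$$\mathbf{x}^{[l]} = \frac{1}{2} T_2^{\mathrm{H}}\, \underline{\mathbf{u}}^{[l]}.$$
$$\mathbf{x}_q^{[l]}[k] = \left(x_q^{[l]}[k],\, x_q^{[l]}[k-1],\, \ldots,\, x_q^{[l]}[k-M_l+1]\right)^{\mathrm{T}},$$
$$\mathbf{x}_q^{[l-1]}[k] = \left(x_q^{[l-1]}[k],\, x_q^{[l-1]}[k-1],\, \ldots,\, x_q^{[l-1]}[k-M_{l-1}+1]\right)^{\mathrm{T}}.$$
$$x_q^{[l]}[k] = \sum_{p=1}^{P} \sum_{m_1=0}^{M^{[l]}-1} \sum_{m_2=m_1}^{M^{[l]}-1} \cdots \sum_{m_p=m_{p-1}}^{M^{[l]}-1} h_{q,p}^{[l]}[m_1, \ldots, m_p] \prod_{r=1}^{p} x_q^{[l-1]}[k-m_r],$$
$$\phi[k] = \sum_{i=1}^{2} \left| d_i[k] - y_i[k] \right|^2.$$
$$\xi \to \xi - 2\alpha \frac{\partial \phi}{\partial \xi^{*}},$$
$$\frac{\partial \phi}{\partial u_i^{[L]}[k]} = -e_i^{*},$$
$$\frac{\partial \phi}{\partial u_i^{[L]*}[k]} = -e_i,$$
$$e_i = d_i[k] - y_i[k].$$
$$\xi \to \xi - \alpha \frac{\partial \phi}{\partial \xi},$$
$$\begin{pmatrix} \dfrac{\partial \phi}{\partial u_{1\mathrm{I}}^{[l]}} \\ \dfrac{\partial \phi}{\partial u_{2\mathrm{I}}^{[l]}} \\ \dfrac{\partial \phi}{\partial u_{1\mathrm{Q}}^{[l]}} \\ \dfrac{\partial \phi}{\partial u_{2\mathrm{Q}}^{[l]}} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ i & 0 & -i & 0 \\ 0 & i & 0 & -i \end{pmatrix} \begin{pmatrix} \dfrac{\partial \phi}{\partial u_1^{[l]}} \\ \dfrac{\partial \phi}{\partial u_2^{[l]}} \\ \dfrac{\partial \phi}{\partial u_1^{[l]*}} \\ \dfrac{\partial \phi}{\partial u_2^{[l]*}} \end{pmatrix},$$
$$\frac{\partial \phi}{\partial \mathbf{x}^{[l]}} = T_2^{\mathrm{T}} \frac{\partial \phi}{\partial \underline{\mathbf{u}}^{[l]}},$$
$$\frac{\partial \phi}{\partial \underline{\mathbf{u}}^{[l]}} = \frac{1}{2} T_2^{*} \frac{\partial \phi}{\partial \mathbf{x}^{[l]}}.$$
$$\frac{\partial \phi}{\partial \mathbf{x}_q^{[l]}[k]} = \left(\frac{\partial \phi}{\partial x_q^{[l]}[k]},\, \frac{\partial \phi}{\partial x_q^{[l]}[k-1]},\, \ldots,\, \frac{\partial \phi}{\partial x_q^{[l]}[k-M_l+1]}\right)^{\mathrm{T}},$$
$$\frac{\partial \phi}{\partial h_{q,p}^{[l]}[m_1, \ldots, m_p]} = \sum_{m_r=0}^{M_l-1} \frac{\partial \phi}{\partial x_q^{[l]}[k-m_r]} \frac{\partial x_q^{[l]}[k-m_r]}{\partial h_{q,p}^{[l]}[m_1, \ldots, m_p]},$$
$$\frac{\partial x_q^{[l]}[k-m_r]}{\partial h_{q,p}^{[l]}[m_1, \ldots, m_p]} = x_q^{[l-1]}[k-m_r-m_1] \cdots x_q^{[l-1]}[k-m_r-m_p],$$
$$\frac{\partial \phi}{\partial x_q^{[l-1]}[k-m_s]} = \sum_{m_r=0}^{M_l-1} \frac{\partial \phi}{\partial x_q^{[l]}[k-m_r]} \frac{\partial x_q^{[l]}[k-m_r]}{\partial x_q^{[l-1]}[k-m_s]},$$
$$\begin{aligned} \frac{\partial x_q^{[l]}[k-m_r]}{\partial x_q^{[l-1]}[k-m_s]} = \sum_{p=1}^{P} \Bigg( & \sum_{m_2=m_s-m_r}^{M^{[l]}-1} \sum_{m_3=m_2}^{M^{[l]}-1} \cdots \sum_{m_p=m_{p-1}}^{M^{[l]}-1} h_{q,p}^{[l]}[m_s-m_r, m_2, \ldots, m_p]\, x_q^{[l-1]}[k-m_r-m_2] \cdots x_q^{[l-1]}[k-m_r-m_p] \\ & + \sum_{m_1=0}^{m_s-m_r} \sum_{m_3=m_s-m_r}^{M^{[l]}-1} \sum_{m_4=m_3}^{M^{[l]}-1} \cdots \sum_{m_p=m_{p-1}}^{M^{[l]}-1} h_{q,p}^{[l]}[m_1, m_s-m_r, m_3, \ldots, m_p]\, x_q^{[l-1]}[k-m_r-m_1]\, x_q^{[l-1]}[k-m_r-m_3] \cdots x_q^{[l-1]}[k-m_r-m_p] \\ & + \cdots \\ & + \sum_{m_1=0}^{m_s-m_r} \sum_{m_2=m_1}^{m_s-m_r} \cdots \sum_{m_{p-1}=m_{p-2}}^{m_s-m_r} h_{q,p}^{[l]}[m_1, \ldots, m_{p-1}, m_s-m_r]\, x_q^{[l-1]}[k-m_r-m_1] \cdots x_q^{[l-1]}[k-m_r-m_{p-1}] \Bigg). \end{aligned}$$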
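
The Volterra layer output and its coefficient gradient listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single real-valued Volterra layer trained on a toy loss $\phi = (d - x[k])^2$ taken directly at the layer output, whereas the paper back-propagates the loss from the last of several layers. The function names `volterra_products` and `volterra_step`, the filter sizes, and the random input are hypothetical and chosen only for illustration.

```python
import itertools

import numpy as np


def volterra_products(x_prev, k, taps, order):
    """Products prod_r x_prev[k - m_r] over 0 <= m_1 <= ... <= m_p < taps, for p = 1..order."""
    products = []
    for p in range(1, order + 1):
        # combinations_with_replacement yields non-decreasing index tuples,
        # which matches the triangular summation limits m_r >= m_{r-1}.
        for ms in itertools.combinations_with_replacement(range(taps), p):
            products.append(np.prod(x_prev[k - np.array(ms)]))
    return np.array(products)


def volterra_step(x_prev, k, h, d, taps, order, alpha):
    """One SGD step on the toy loss phi = (d - x[k])**2 (hypothetical single-layer setup)."""
    products = volterra_products(x_prev, k, taps, order)
    x_out = float(h @ products)            # layer output x[k]
    # d(phi)/dh: the derivative of x[k] w.r.t. each kernel is the product of delayed inputs.
    grad = -2.0 * (d - x_out) * products
    return h - alpha * grad, x_out, grad


# Illustrative sizes only; they are not the filter sizes used in the paper.
rng = np.random.default_rng(0)
taps, order = 3, 3
x_prev = rng.standard_normal(16)
k = 5                                      # any k >= taps - 1 avoids negative indices
n_coeff = volterra_products(x_prev, k, taps, order).size
h_init = np.zeros(n_coeff)
h_init[0] = 1.0                            # first-order tap at delay 0: pass-through
h_new, x_out, grad = volterra_step(
    x_prev, k, h_init, d=0.3, taps=taps, order=order, alpha=1e-3
)
```

With `taps = 3` and `order = 3`, the triangular index ranges give 3 + 6 + 10 = 19 coefficients, which shows how quickly the coefficient count grows with the memory length and order of the kernels.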
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.