
Gaussian mixture model-hidden Markov model based nonlinear equalizer for optical fiber transmission


Abstract

The demand for high-speed data transmission has increased rapidly over the past few years, driving the development of the data center concept. Nonlinear effects in optical fiber transmission systems become more significant as transmission speeds increase. Since conventional DSP algorithms have difficulty accurately capturing these nonlinear distortions, many machine learning based equalizers have been proposed. However, previous experiments mainly focused on achieving a low BER at the cost of much higher computational complexity. In this paper, we propose a Gaussian mixture model (GMM)-hidden Markov model (HMM) based nonlinear equalizer, which exploits the statistical characteristics of the received signals as prior information to reduce the computational complexity. The BER performance of the GMM-HMM based equalizer is evaluated in a PAM-4 modulated VCSEL-MMF optical interconnect link and shows an excellent capability of mitigating nonlinear distortions. In addition, the computational complexity of the GMM-HMM based equalizer is about 73% lower than that of recurrent neural network (RNN) based methods with similar BER performance.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

With the continuous development of the Internet, the rapid growth of applications such as cloud computing, 5G wireless communication, and high-definition video has led to an exponential increase in data, which requires high-speed optical fiber communication [1,2]. Digital signal processing (DSP) is essential for raising the transmission rate of an optical link [3]. Many conventional DSP technologies are very effective at mitigating inter-symbol interference (ISI) by means of pre-filtering and post-equalization [4]. However, nonlinear distortions in short-range optical interconnects, such as those associated with relative intensity noise (RIN) and mode partition noise (MPN), cannot be effectively compensated or equalized by conventional equalizers [5] such as the feed-forward equalizer (FFE), the decision feedback equalizer (DFE), and the maximum likelihood sequence estimator (MLSE) [6,7]. These DSP techniques depend mainly on channel modeling, which cannot accurately capture many non-ideal, nonlinear distortions (e.g., modulation nonlinearity combined with square-law detection) present in practical systems [8].

To mitigate nonlinear distortions effectively, many machine learning based DSP algorithms have been proposed recently, such as the support vector machine (SVM) [9] and neural networks (NN) [10,11], which not only eliminate ISI effectively but also show an excellent capability of mitigating nonlinear distortions. However, many NN-based DSP algorithms do not utilize prior knowledge about the optical communication system and require a large amount of training data. NN-based algorithms also suffer from high computational complexity, hindering their application in optical communication systems [12].

In this paper, a Gaussian mixture model (GMM)-hidden Markov model (HMM) is proposed for equalization in optical fiber communication systems. As with other machine learning based DSP algorithms, the probability density function (PDF) of the distorted symbols can be approximated more precisely using the GMM, leading to improved BER performance. At the same time, the computational complexity of the GMM-HMM is much lower than that of NN-based equalizers.

A GMM has previously been proposed to replace traditional soft/hard decoding in the PAM-4 decoding process [13,14]. Concretely, after equalization, the residual linear and nonlinear impairments of the symbols are modeled by the GMM rather than removed as in traditional schemes, and symbols are classified directly through the probabilities calculated by the GMM. In this paper, by contrast, the GMM not only models the probability density of the nonlinearly distorted symbols; the probabilities it calculates also serve as the observation probabilities of an HMM. The Viterbi algorithm then makes decisions according to these observation probabilities, which makes our algorithm a new and effective method for distortion equalization in optical fiber communication systems.

The BER performance of the GMM-HMM based equalizer is evaluated in a PAM-4 modulated VCSEL-MMF optical link. Experimental results indicate that introducing the GMM-HMM greatly improves the BER performance compared with conventional MLSE. A complexity analysis is also conducted, showing that the computational complexity of the GMM-HMM is about 73% lower than that of a recurrent neural network (RNN) with similar BER performance.

2. Nonlinear equalizer based on a GMM-HMM

2.1 Gaussian mixture model and hidden Markov model

As shown in Fig. 1, denote the received signal sequence after sinc interpolation as ${\boldsymbol{r}} = [{\boldsymbol{r}}_{1}, {\boldsymbol{r}}_{2}, \dots , {\boldsymbol{r}}_{T}]$, where the vector ${\boldsymbol{r}}_{i}$ $(i = 1,2, \dots , T)$ corresponds to the $i$th symbol. The length of ${\boldsymbol{r}}_{i}$, which equals the interpolation multiple, is denoted $\varGamma$ [9]. The vector ${\boldsymbol{r}}_{i}$ is then wrapped together with its $L$ preceding vectors to form the final feature vector $\tilde {{\boldsymbol{x}}}_{i} = [{\boldsymbol{r}}_{ {i-L}}, \dots , {\boldsymbol{r}}_{ {i}-1}, {\boldsymbol{r}}_{ {i}}]$. The observation sequence (in chronological order) is denoted $\tilde {{\boldsymbol{x}}}=[\tilde {{\boldsymbol{x}}}_1,\tilde {{\boldsymbol{x}}}_2,\dots ,\tilde {{\boldsymbol{x}}}_T]$ and is used during the training and equalizing processes. The transmitted symbol sequence is denoted ${\boldsymbol{s}}=[s_1,s_2,\dots ,s_{T}]$, where each $s_{i}$ $(i=1,2,\dots ,T)$ takes the $j$th of the possible states (for PAM-4, $j = 1, 2, 3, 4$, corresponding to the four symbol types). However, because of ISI, we map the current symbol and its $L$ preceding symbols to a new symbol $y_{i}$ $(i=1,2,\dots ,T)$, where $y_i \in \{ q_1,q_2,\dots ,q_N \}, N=4^{L+1}$. The mapping function is defined as follows:

$$f(s_{i-L},\dots,s_i) = y_i$$
Therefore, we can construct labeled training dataset $\{ \tilde {{\boldsymbol{x}}}_i, y_i \}, i=1,2,\dots ,T$, which is used to train the model.
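For concreteness, this pre-processing can be sketched as follows. This is a minimal illustration, assuming the interpolated received signal is stored as a $T\times\varGamma$ array and the transmitted PAM-4 symbols are encoded as integers 0-3; the function name and array layout are ours, not the paper's.

```python
import numpy as np

def build_dataset(r, s, L, n_levels=4):
    """Construct feature vectors x_i = [r_{i-L}, ..., r_i] and state labels
    y_i = f(s_{i-L}, ..., s_i) from the interpolated received signal r
    (shape T x Gamma) and the transmitted PAM-4 symbols s (length T, 0..3)."""
    T, gamma = r.shape
    X, y = [], []
    for i in range(L, T):
        # wrap the current symbol's samples with its L preceding symbols
        X.append(r[i - L:i + 1].reshape(-1))      # length Gamma * (L + 1)
        # map the (L+1)-symbol context to one of N = 4^(L+1) states (Eq. (1))
        state = 0
        for sym in s[i - L:i + 1]:
            state = state * n_levels + int(sym)   # base-4 encoding, oldest first
        y.append(state)
    return np.array(X), np.array(y)
```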

Fig. 1. The pre-processing of transmitted and received signals.

The form of the Hidden Markov Model can be defined as follows [15]:

$$\lambda = ({\boldsymbol{A}},{\boldsymbol{B}}, \pmb{\pi}),$$
where ${\boldsymbol{A}}$ is the state transition probability matrix and is defined as:
$$ {\boldsymbol{A}} = [a_{mn}]_{N\times N},$$
$a_{mn}$ is the state transition probability and
$$a_{mn}=P(y_i=q_n|y_{i-1}=q_m), 1\le m,n \le N.$$
where $y_i$ and $y_{i-1}$ are the states at times $i$ and $i-1$, respectively. $a_{mn}\in [0,1]$ describes the transition probability from state $q_m$ to state $q_n$ and satisfies $\sum _n a_{mn}=1$. All allowed transitions are assumed to be equiprobable. Therefore, after mapping the current symbol $s_i$ and its $L$ preceding symbols to a new symbol $y_i$ $(i=1,2,\dots ,T)$, the elements of the transition probability matrix ${\boldsymbol A}$ for PAM-4 are:
$$a_{mn}=0.25$$
if and only if
$$[f^{{-}1}(y_m)]_{i-j}=[f^{{-}1}(y_n)]_{i-j-1},j=0,1,\dots,L-1.$$
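Under the base-4 state encoding assumed in the sketch above, the allowed transitions of Eqs. (5) and (6) can be enumerated directly; a minimal sketch:

```python
import numpy as np

def transition_matrix(L, n_levels=4):
    """Transition matrix A of Eqs. (5)-(6): state m may move to state n only
    when the newest L symbols of m equal the oldest L symbols of n; each of
    the four allowed successors has probability 0.25."""
    N = n_levels ** (L + 1)
    A = np.zeros((N, N))
    for m in range(N):
        overlap = m % (n_levels ** L)    # drop the oldest symbol of state m
        for new_sym in range(n_levels):  # append each possible new symbol
            n = overlap * n_levels + new_sym
            A[m, n] = 1.0 / n_levels     # = 0.25 for PAM-4
    return A
```

Each row of ${\boldsymbol A}$ has exactly four nonzero entries of 0.25, since only the newest symbol is unconstrained.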
$\pmb {\pi }$ is the initial probability distribution:
$$\pmb{\pi}=(\pi_m),$$
where $\pi _m=P(y_1=q_m)$, $m=1,2,\dots ,N$, is the probability of being in state $q_m$ at the beginning. ${\boldsymbol{B}}$ is the observation probability matrix:
$${\boldsymbol{B}} = [b_m(x)].$$
In our proposed algorithm, the observation probability distribution of features is modeled with a GMM, and the elements of ${\boldsymbol{B}}$ can be calculated by [16]:
$$b_m(x) = P(x|y_i=q_m)=\sum \limits _{k=1}^{K}\alpha_{mk}\phi(x|\pmb{\mu}_{mk},\pmb{\sigma}_{mk}),$$
where $\alpha _{mk}$ is the mixture coefficient for the $k$th component in state $q_m$ $(m=1,2,\dots ,N)$, $\alpha _{mk} > 0$ and $\sum _{k=1}^{K}\alpha _{mk}=1$. $\phi (\cdot )$ is the Gaussian distribution density, $\pmb {\mu }_{mk}$ is the mean vector and $\pmb {\sigma }_{mk}$ is the covariance matrix.
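Equation (9) translates directly into code. The following sketch evaluates $b_m(x)$ for one state from fitted parameters; `scipy.stats.multivariate_normal` is used for the Gaussian densities, and the parameter layout is an assumption of ours.

```python
from scipy.stats import multivariate_normal

def observation_prob(x, alphas, mus, sigmas):
    """Evaluate b_m(x) of Eq. (9) for one state m: a weighted sum of K
    Gaussian densities. alphas: (K,) mixture coefficients; mus: (K, D)
    mean vectors; sigmas: (K, D, D) covariance matrices."""
    return sum(a * multivariate_normal.pdf(x, mean=mu, cov=sig)
               for a, mu, sig in zip(alphas, mus, sigmas))
```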

In our scheme, the output probability of the Gaussian mixture model is a weighted sum of its components, each of which is a Gaussian density function. MLSE, in contrast, models the received signals mainly by calculating the Euclidean distance between the symbols and the constellation points. The GMM uses several Gaussian density functions to fit the true conditional distribution of the data in each state, and the trained GMM then computes the probability density that a received symbol belongs to each state. The true distribution of the symbols can therefore be modeled more accurately, so nonlinear distortions are mitigated effectively. Moreover, the received signal in optical fiber communication is continuous-valued, and the GMM has an advantage in fitting the true distribution of continuous signals.

2.2 Training process

To train the model, we construct the labeled training dataset $\{ \tilde {{\boldsymbol x}}_i, y_i\}$, $i = 1,2,\dots ,T$. The state transition probability matrix ${\boldsymbol{A}}$ is fixed and can be calculated directly from Eqs. (5) and (6). Therefore, we only need to obtain the observation probabilities according to Eqs. (8) and (9). To this end, the labeled training dataset is divided into $N$ sub-datasets $\{ \tilde {{\boldsymbol x}}_i, y_i\,|\,y_i=q_m\}$, $m = 1,2,\dots ,N$. The dataset partitioning process is illustrated in Fig. 2. During the training process, the Expectation-Maximization (EM) algorithm is used to estimate the parameters $\pmb {\mu }_{mk}$, $\pmb {\sigma }_{mk}$ and $\alpha _{mk}$ [17]. Starting from initialized parameters (mean, covariance, and mixture coefficient) of the GMM, the EM algorithm iteratively maximizes the likelihood of the received data. The EM algorithm consists of two steps:

Fig. 2. The training process of GMM-HMM based equalizer. The labeled training dataset is divided into $N$ sub-datasets.

(1) Expectation step:

$${\hat \gamma _{mk}} = \frac{{{\alpha _k}\phi ({x_m}|{\pmb{\mu} _k},{\pmb{\sigma} _k})}}{{\sum \limits_{k = 1}^K {{\alpha _k}\phi ({x_m}|{\pmb{\mu} _k},{\pmb{\sigma} _k})} }},\textrm{ }m = 1,2, \ldots ,N;\textrm{ }k = 1,2, \ldots ,K$$

(2) Maximization step:

$${\hat{\pmb{\mu}} _k} = \frac{{\sum \limits_{m = 1}^N {{{\hat \gamma }_{mk}}{x_m}} }}{{\sum \limits_{m = 1}^N {{{\hat \gamma }_{mk}}} }},\textrm{ } k = 1,2, \ldots ,K$$
$$\hat {\pmb{\sigma}} _k^2 = \frac{{\sum \limits_{m = 1}^N {{{\hat \gamma }_{mk}}{{({x_m} - {\pmb{\mu} _k})}^2}} }}{{\sum \limits_{m = 1}^N {{{\hat \gamma }_{mk}}} }},\textrm{ }k = 1,2, \ldots ,K$$
$${\hat \alpha _k} = \frac{{\sum \limits_{m = 1}^N {{{\hat \gamma }_{mk}}} }}{N},\textrm{ }k = 1,2, \ldots ,K$$
where $x_m$ represents a sample in the sub-dataset $\{ \tilde {{\boldsymbol x}}_i, y_i\,|\,y_i=q_m\}$; note that in Eqs. (10)-(13), $N$ denotes the number of samples in that sub-dataset rather than the number of states. The EM algorithm iterates over these two steps until the parameters converge. During the training process, the partitioned labeled training data are fed into the EM algorithm to calculate the parameters. During the equalizing process, the likelihood of unknown symbols is estimated by feeding their feature vectors into the GMM. After training, all parameters of the GMM are fixed and remain unchanged during the equalizing process.
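The training stage can be sketched with scikit-learn's `GaussianMixture`, whose `fit` method runs exactly this EM procedure; the paper does not specify an implementation, so the library choice and default hyper-parameters here are assumptions.

```python
from sklearn.mixture import GaussianMixture

def train_gmms(X, y, N, K):
    """Training stage: partition the labeled set {x_i, y_i} into N
    sub-datasets by state label and fit a K-component GMM to each
    with the EM algorithm (Eqs. (10)-(13))."""
    gmms = []
    for m in range(N):
        gm = GaussianMixture(n_components=K, covariance_type='full')
        gm.fit(X[y == m])   # EM on the m-th sub-dataset
        gmms.append(gm)
    return gmms
```

During equalization, `gmms[m].score_samples(X)` then returns $\log b_m(\tilde{{\boldsymbol x}}_i)$ for each feature vector, which feeds directly into the Viterbi decoder described next.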

2.3 Equalizing process

The equalizing process is illustrated in Fig. 3. Given the transition probability matrix ${\boldsymbol A}$ and the observation probabilities ${\boldsymbol B}$, a trained HMM can produce the state sequence $\hat q = ({\hat q_1},{\hat q_2}, \ldots ,{\hat q_T})$ that best explains the observed feature vectors $\tilde {{\boldsymbol x}} = [\tilde {{\boldsymbol x}}_1,\tilde {{\boldsymbol x}}_2, \ldots ,\tilde {{\boldsymbol x}}_T]$. During the equalizing process, the probability-based Viterbi algorithm makes decisions in the HMM according to the observation probabilities ${b_m}({\tilde {{\boldsymbol x}}_i})$, $m = 1,2, \ldots ,N$, and the transition probability matrix ${\boldsymbol{A}}$ [18]. The Viterbi algorithm proceeds as follows:

Fig. 3. The equalizing process of GMM-HMM based equalizer.

(1) Initialization:

$${\delta _1}(m) = {\pi _m}{b_m}({\tilde{{\boldsymbol x}}_1}),\textrm{ }m = 1,2, \ldots ,N$$
$${\psi _1}(m) = 0,\textrm{ }m = 1,2, \ldots ,N$$

(2) Recursion ($2\le i \le T$):

$${\delta _i}(m) = \mathop {\max }_{1 \le n \le N} \left[ {{\delta _{i - 1}}(n){a_{nm}}} \right]{b_m}({\tilde{\boldsymbol x}_i}),\textrm{ }m = 1,2, \ldots ,N$$
$${\psi _i}(m) = \mathop {\arg \max }_{1 \le n \le N} \left[ {{\delta _{i - 1}}(n){a_{nm}}} \right],\textrm{ }m = 1,2, \ldots ,N$$

(3) Termination:

$$\hat P = \mathop {\max }_{1 \le m \le N} {\delta _T}(m)$$
$$\hat{q}_T = \mathop {\arg \max }_{1 \le m \le N} [ {{\delta _T}(m)} ].$$

(4) Path (state sequence) backtracking:

$${\hat q_i} = {\psi _{i + 1}}({\hat q_{i + 1}}),\textrm{ }i = T - 1,T - 2, \ldots ,1.$$
Once the optimal path $\hat {{\boldsymbol q}} = ({\hat q_1},{\hat q_2}, \ldots ,{\hat q_T})$ has been obtained, each $\hat q_i$ is demapped to the original symbol alphabet, yielding the estimated $s_i$ as the final output.
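A log-domain sketch of the Viterbi recursion above follows; working with log-probabilities avoids numerical underflow for long sequences, a standard implementation choice rather than something the paper specifies.

```python
import numpy as np

def viterbi(logB, logA, logpi):
    """Log-domain Viterbi decoding, Eqs. (14)-(20).
    logB: (T, N) log observation probabilities log b_m(x_i);
    logA: (N, N) with logA[n, m] = log a_{nm} (previous state n -> current m);
    logpi: (N,) initial log probabilities. Returns the most likely path."""
    T, N = logB.shape
    delta = logpi + logB[0]                            # initialization, Eq. (14)
    psi = np.zeros((T, N), dtype=int)                  # backpointers, Eq. (15)
    for i in range(1, T):
        scores = delta[:, None] + logA                 # scores[n, m] = delta_{i-1}(n) + log a_{nm}
        psi[i] = np.argmax(scores, axis=0)             # Eq. (17)
        delta = scores[psi[i], np.arange(N)] + logB[i] # Eq. (16)
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))                   # Eq. (19)
    for i in range(T - 2, -1, -1):
        path[i] = psi[i + 1][path[i + 1]]              # backtracking, Eq. (20)
    return path
```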

2.4 Computational complexity

As a machine learning based equalizer, the proposed GMM-HMM based method has advantages in both training complexity and equalizing complexity. The GMM-HMM based equalizer operates in two stages: training and equalizing. The equalizer is trained offline first, and its parameters are fixed during the equalizing stage; we therefore focus mainly on the computational complexity of the equalizing process. During the equalizing stage, the observation probabilities ${b_m}({\tilde {{\boldsymbol x}}_i})$, $i = 1,2, \ldots ,T$; $m = 1,2, \ldots, N$, are calculated first. From Eqs. (8) and (9), the computational complexity of calculating the observation probabilities depends mainly on $K$, the number of Gaussian density functions in the GMM. Fortunately, the natural exponential function in the Gaussian density can be computed by the coordinate rotation digital computer (CORDIC) method [19], so its computational complexity can be ignored [9].

For convenience, the GMM-HMM based equalizer is denoted GMM-HMM $(\varGamma , L, K)$ in this paper. Equalizing each received symbol requires calculating the observation probability $N$ times, which takes $N \times [\varGamma \times (L + 1) + K]$ multiplications and $N \times [\varGamma \times (L + 1) + K - 1]$ additions. After the observation probabilities are calculated, the Viterbi algorithm is used for equalization. For a given number of states, the Viterbi algorithm requires $(L + 1) \times N$ multiplications and $(L + 3) \times N$ additions per received symbol. In total, equalizing each received symbol therefore requires $N \times [(\varGamma + 1) \times (L + 1) + K]$ multiplications and $N \times [(\varGamma + 1) \times (L + 1) + K + 1]$ additions.

3. Comparison with a NN

Compared with the widely studied NN-based equalizers, the GMM-HMM based method has lower training complexity as well as lower equalization complexity. NN-based DSP algorithms do not utilize prior knowledge of the optical communication system during training; to fit the true distribution of the data, they require many neurons in the hidden layer and a large amount of training data. The GMM-HMM, in contrast, is a concrete mathematical model, and the amount of training data it needs is much smaller than that required by a neural network. Therefore, when the training set is small, the GMM-HMM is a better choice than a neural network. In addition, training a neural network requires many iterations, whereas the GMM-HMM needs far fewer, so its training time is much shorter. As a result, when computing power is limited or the amount of training data is small, neural networks cannot reach optimal results, and the GMM-HMM is usually the better choice.

We compare the computational complexity of the GMM-HMM based equalizer with that of the recurrent neural network (RNN) based equalizer in [12], which has been shown to be very effective among NN-based equalizers. For convenience, the RNN based equalizer is denoted RNN $(\varGamma , L, R)$, where $\varGamma$ is the interpolation multiple, $L$ is the adjacent number, and $R$ is the number of neurons in the hidden layer. Since the hardware implementation of the activation function leaves considerable room for optimization, only additions and multiplications are counted in this paper. For RNN $(\varGamma , L, R)$, equalizing one symbol requires $(2L + 1) \times \varGamma \times R + {R^2} + 4R$ multiplications and the same number of additions. The computational complexity of GMM-HMM $(\varGamma , L, K)$ and RNN $(\varGamma , L, R)$ is shown in Table 1.
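The per-symbol operation counts quoted above can be transcribed directly; the helper functions below merely encode the stated formulas (the paper's Tables 1 and 2 remain the authoritative figures).

```python
def gmm_hmm_ops(gamma, L, K):
    """Per-symbol operation count of GMM-HMM(Gamma, L, K) with N = 4^(L+1)
    states, as stated in Sec. 2.4. Returns (multiplications, additions)."""
    N = 4 ** (L + 1)
    return (N * ((gamma + 1) * (L + 1) + K),
            N * ((gamma + 1) * (L + 1) + K + 1))

def rnn_ops(gamma, L, R):
    """Per-symbol operation count of RNN(Gamma, L, R) as stated in Sec. 3
    (activation-function cost excluded). Returns (multiplications, additions)."""
    ops = (2 * L + 1) * gamma * R + R ** 2 + 4 * R
    return ops, ops
```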

Table 1. Computational complexity comparison of GMM-HMM and RNN

4. Experiment results and discussion

To evaluate the performance of the GMM-HMM based equalizer, we conducted experiments on a 56 Gb/s PAM-4 modulated VCSEL-MMF optical link. The VCSEL-MMF based optical interconnect system is illustrated in Fig. 4. The experimental system mainly consists of an 850 nm VCSEL directly modulated by an SHF 12104A bit pattern generator (BPG), 100 m of OM4 MMF, and a photodiode (PD). The PAM-4 modulated optical signal emitted by the VCSEL is coupled into the MMF through an FC/PC connector. At the end of the MMF, the optical signal is detected by the PD and converted into a baseband electrical signal, which is sampled by a high-speed real-time digital signal oscilloscope (DSO) for offline DSP. The adjacent number $L$ depends mainly on the inter-symbol interference; in this experiment, the optimal BER performance is obtained at $L = 2$, and a larger $L$ does not improve the BER. Therefore, $L$ is fixed at 2 in our DSP algorithm.

Fig. 4. Experiment block diagram of the VCSEL based optical interconnect link with GMM-HMM. The input data are generated with a bit pattern generator (BPG) using a random pattern (56 Gb/s).

In our experiments, the 850 nm VCSEL is a New Focus 1784 [20] and the PD is a New Focus 1484-A-50 [21]; the 3-dB bandwidths of the VCSEL and PD are 18 GHz and 22 GHz, respectively. The OM4 MMF is YOFC MaxBand OM4 bend-insensitive multimode fiber with an overfilled launch (OFL) bandwidth of 4394 MHz$\times$km. At the receiver side, the DSO is an Agilent DSAX96204Q with a sampling rate of 160 GSa/s. The offline DSP equalization is conducted in MATLAB.

We generated three sets of PAM-4 symbols with the BPG using a random pattern (56 Gb/s), yielding three datasets, each containing 1048576 PAM-4 symbols. All three datasets were processed by feed-forward equalization (FFE) to eliminate ISI. The FFE has 43 taps to eliminate ISI effectively, operates at 2 samples per symbol, and is trained with the recursive least squares (RLS) algorithm on 8192 symbols (1/128 of each dataset). The eye diagrams of the PAM-4 signals before the nonlinear equalizer are shown in Fig. 5. Each dataset is split into training and testing data. For the proposed GMM-HMM method, the training data comprise 10% of each dataset and the testing data the remaining 90%, while for the RNN the training data comprise 50% of each dataset and the testing data the remaining 50%. As analyzed in Section 3, the RNN based equalizer needs more training data to achieve good performance. The trained model is used to equalize the testing data during the equalizing process.
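As a hedged sketch of this FFE pre-equalization stage, an RLS-trained feed-forward equalizer might look as follows; the tap count (43) and 2 samples per symbol come from the text, while the forgetting factor `lam`, the regularizer `delta`, and the exact signal alignment are assumptions.

```python
import numpy as np

def rls_ffe(x, d, n_taps=43, lam=0.999, delta=0.01):
    """Train an FFE with recursive least squares (RLS).
    x: received samples at 2 samples per symbol; d: training symbols
    aligned with x. Returns the trained tap weights."""
    w = np.zeros(n_taps)
    P = np.eye(n_taps) / delta                 # inverse correlation matrix
    for i in range(len(d)):
        u = x[2 * i:2 * i + n_taps][::-1]      # tap-delay-line input window
        if len(u) < n_taps:
            break
        k = P @ u / (lam + u @ P @ u)          # RLS gain vector
        e = d[i] - w @ u                       # a priori error
        w = w + k * e                          # tap-weight update
        P = (P - np.outer(k, u @ P)) / lam     # inverse-correlation update
    return w
```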

Fig. 5. The eye diagrams of received PAM-4 signals before the nonlinear equalizer.

Figure 6 presents the BER performance after GMM-HMM based equalization at a received optical power (ROP) of −2.7 dBm. The BER clearly decreases as $K$ increases. To achieve good BER performance while maintaining low complexity, the optimal choice is $K = 4$. We also compared the GMM-HMM with MLSE to strengthen our conclusions. In this paper, the memory length of the MLSE is 8, its sampling rate is 2 samples per symbol, and its tap number is 2. To estimate the impulse response accurately, 8192 symbols (1/128 of each dataset) are used to estimate the channel response with the least mean squares (LMS) algorithm. The BER performance is presented in Fig. 7, which shows that GMM-HMM (4, 2, 4) greatly improves the BER compared with MLSE.

Fig. 6. Measured BER vs. the number of Gaussian density components $K$ in the GMM. The received optical power (ROP) is −2.7 dBm.

Fig. 7. Measured BER vs. ROP with different equalization strategies. The adjacent number $L = 2$. Compared with RNN (8, 2, 10), GMM-HMM (4, 2, 4) lowers the computational complexity by 73%. The BER performances of FFE, MLSE and Volterra are also given.

The performance of a Volterra-based nonlinear equalizer is also presented for comparison. In this paper, the Volterra-based nonlinear equalizer has 10 taps and is trained with the RLS algorithm on 20% of each dataset. As shown in Fig. 7, the BER of GMM-HMM (2, 2, 4) is similar to that of the Volterra-based nonlinear equalizer, while GMM-HMM (4, 2, 4) shows an obvious improvement. The BER performance of the RNN is also displayed in Fig. 7. Compared with RNN (8, 2, 10), the sensitivity penalty of GMM-HMM (4, 2, 4) is about 0.6 dB at the 7% hard-decision forward error correction (HD-FEC) BER limit of $3.8\times 10^{-3}$. According to Table 2, the computational complexity of GMM-HMM (4, 2, 4) is about 73% lower than that of RNN (8, 2, 10).

Table 2. Computational complexity comparison of different algorithms

5. Conclusion

In this paper, to retain the performance advantages of machine learning while reducing its computational complexity, we propose a GMM-HMM based equalizer for optical fiber communication. The performance of the GMM-HMM is experimentally evaluated in a 56 Gb/s PAM-4 modulated VCSEL-MMF optical interconnect link, and its computational complexity is analyzed theoretically. Experimental results reveal that the GMM-HMM achieves a significant BER improvement over conventional MLSE and an obvious BER improvement over the Volterra-based nonlinear equalizer. With appropriate parameters, the complexity of the GMM-HMM is at least 73% lower than that of an RNN based equalizer with similar BER performance, making it well suited to energy-efficient, high-speed optical interconnect systems.

Funding

National Key Research and Development Program of China (2019YFB1802904); Ministry of Education of the People's Republic of China (6141A02033347).

Disclosures

The authors declare no conflicts of interest.

References

1. C. Xie, “Datacenter optical interconnects: Requirements and challenges,” in Proceedings of IEEE Optical Interconnects Conference (OI), (IEEE, 2017), pp. 37–38.

2. J. A. Tatum, G. D. Landry, D. Gazula, J. K. Wade, and P. Westbergh, “VCSEL-Based Optical Transceivers for Future Data Center Applications,” in Optical Fiber Communication Conference, OSA Technical Digest (Optical Society of America, 2018), paper M3F.6.

3. G. Meloni, A. Malacarne, F. Fresi, and L. Poti, “6.27 bit/s/Hz Spectral Efficiency VCSEL-based Coherent Communication over 800km of SMF,” in Optical Fiber Communication Conference, OSA Technical Digest (Optical Society of America, 2015), paper Th2A.30.

4. M. Rubsamen, P. J. Winzer, and R.-J. Essiambre, “MLSE Receivers for Narrow-band Optical Filtering,” in Optical Fiber Communication Conference and Exposition and The National Fiber Optic Engineers Conference, Technical Digest (CD) (Optical Society of America, 2006), paper OWB6.

5. K. Fotini, P. Cristian, S. Nebojsa, O. Markus, D. Aidan, and H. Robert, “Experimental performance evaluation of equalization techniques for 56 Gb/s PAM-4 VCSEL-based optical interconnects,” in Proceedings of European Conference on Optical Communication (ECOC, 2015), pp. 1–3.

6. Z. Tan, C. Yang, Y. Zhu, Z. Xu, and K. Zou, “High speed band-limited 850-nm VCSEL link based on time-domain interference elimination,” IEEE Photonics Technol. Lett. 29(9), 751–754 (2017). [CrossRef]  

7. K. Szczerba, T. Lengyel, M. Karlsson, P. A. Andrekson, and A. Larsson, “94 Gb/s 4-PAM using an 850-nm VCSEL, pre-emphasis, and receiver equalization,” IEEE Photonics Technol. Lett. 28(22), 2519–2521 (2016). [CrossRef]  

8. L. Justin, K. P. Sriharsha, T. V. Antony, and R. Stephen, “Noise in VCSEL-based links: direct measurement of VCSEL transverse mode correlations and implications for MPN and RIN,” J. Lightwave Technol. 35(4), 698–705 (2017). [CrossRef]  

9. A. Liang, C. Yang, C. Zhang, Y. Liu, F. Zhang, Z. Zhang, and H. Li, “Experimental study of support vector machine based nonlinear equalizer for VCSEL based optical interconnect,” Opt. Commun. 427, 641–647 (2018). [CrossRef]  

10. T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cogn. Commun. Netw. 3(4), 563–575 (2017). [CrossRef]  

11. J. Estaran, R. Rios-Mueller, M. A. Mestre, F. Jorge, H. Mardoyan, A. Konczykowska, J.-Y. Dupuy, and S. Bigo, “Artificial Neural Networks for Linear and Non-Linear Impairment Mitigation in High-Baudrate IM/DD Systems,” in Proceedings of 42nd European Conference on Optical Communication, Dusseldorf, Germany (ECOC, 2016), pp. 1–3.

12. Q. Zhou, C. Yang, A. Liang, X. Zheng, and Z. Chen, “Low computationally complex recurrent neural network for high speed optical fiber transmission,” Opt. Commun. 441, 121–126 (2019). [CrossRef]  

13. F. Lu, P. Peng, S. Liu, M. Xu, S. Shen, and G. Chang, “Integration of Multivariate Gaussian Mixture Model for Enhanced PAM-4 Decoding Employing Basis Expansion,” in Optical Fiber Communication Conference, OSA Technical Digest (Optical Society of America, 2018), paper M2F.1.

14. M. Xu, J. Zhang, H. Zhang, Z. Jia, J. Wang, L. Cheng, L. A. Campos, and C. Knittle, “Multi-Stage Machine Learning Enhanced DSP for DP-64QAM Coherent Optical Transmission Systems,” in Optical Fiber Communication Conference (OFC) 2019, OSA Technical Digest (Optical Society of America, 2019), paper M2H.1.

15. L. Rabiner and B. Juang, “An introduction to hidden Markov models,” IEEE ASSP Mag. 3(1), 4–16 (1986). [CrossRef]  

16. D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, “Speaker verification using adapted Gaussian mixture models,” Digit. Signal Process. 10(1–3), 19–41 (2000).

17. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977).

18. L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE 77(2), 257–286 (1989). [CrossRef]  

19. M. Garrido, P. Källström, M. Kumm, and O. Gustafsson, “CORDIC II: A New Improved CORDIC Algorithm,” IEEE Trans. Circuits Syst. II 63(2), 186–190 (2016). [CrossRef]  

20. Newport, 1784 VCSEL Datasheet, 2017. https://www.newport.com.cn/p/1784. (Accessed 13 March 2017).

21. Newport, 1484-A-50-Newport, 2017. https://www.newport.com/p/1484-A-50. (Accessed 13 March 2017).
