
Upper and lower bounds to the information rate transferred through the Pol-Mux channel


Abstract

Pol-Mux transmission is a well-established technique that enhances spectral efficiency by transmitting simultaneously over the horizontal and vertical polarizations of the electric field. However, cross-coupling between the two polarizations impairs transmission. Under the assumption that the cross-coupling matrix is a Markov process with free-running state, we propose upper and lower bounds to the information rate that can be transferred through the channel. Simulation results show that the two bounds are tight for values of the cross-coupling power of practical interest and modulation formats up to 16-QAM (quadrature amplitude modulation).

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Simultaneous transmission of modulated signals over the horizontal and vertical polarizations of the electric field is a well-established technique [1–3] that improves spectral efficiency by using the same frequency twice. In essence, this technique relies on the principle of MIMO (multiple-input multiple-output) systems, which became popular after the seminal paper [4]. To cancel the interference arising from non-ideal orthogonality between the horizontal and vertical polarizations, linear processing can be adopted [5], even though it is well known that non-linear techniques achieve better performance in the presence of interference and additive noise, see e.g. [6, 7].

Either implicitly or explicitly, most of the receivers studied in the literature assume that the MIMO channel matrix is static or quasi-static. However, the experimental results of [6] show that the coherence time of the channel is quite small, on the order of 10 to 30 symbol intervals for 112 Gb/s dual-polarization QPSK (quadrature phase shift keying), hence tracking the channel becomes an issue. Tracking techniques can be based on pilot symbols, as proposed, for instance, in [8]. However, independently of the channel tracking method, a low coherence time of the channel matrix, hence a fast time-varying channel, makes the channel estimate noisy (in practice, only a short time window spanning a few signal samples can be used for channel estimation at a given time instant), thus impacting the information rate that can be transmitted through the channel. This observation motivates the study of the information rate transferred through the Pol-Mux channel. The capacity of the fading MIMO channel is a classical topic in information theory, see e.g. [9], and, in that context, the information rate of channels with free-running state has also been studied [10]. In the context of optical transmission, the information rate is well studied for the phase noise channel, at least for the channel model with free-running state, see e.g. [11–13], but less has been done for the Pol-Mux channel, which can indeed be seen as a variant of the phase noise channel where

  • the modulus is not constant
  • the channel is MIMO.
Therefore, starting from the lower bound for the phase noise channel of [13], we adapt it here to the Pol-Mux channel and introduce a new upper bound based on the Kalman filter.

2. Channel model

Lowercase characters indicate possibly complex scalars and column vectors, while uppercase characters indicate matrices. The notation $a_k^{k+i}$ indicates the column vector (or matrix, when the elements are vectors) made by the chunk of sequence $(a_k, a_{k+1}, \cdots, a_{k+i})^T$, while $\{a_k\}$ indicates the semi-infinite sequence $(a_0, a_1, \cdots)$. The notation $I_m$ indicates the $m \times m$ identity matrix, and the superscript $H$ denotes Hermitian transposition. The output of the Pol-Mux channel at time $k$ is

$$y_k = M_k x_k + w_k, \qquad k = 1, 2, \ldots,$$
where $x_k$ is the $k$-th sample of the i.i.d. complex input modulation vector sequence, with zero mean vector and covariance matrix
$$E\{x_k x_k^H\} = I_2,$$
$M_k$ is the channel matrix, and $w_k$ is the $k$-th element of the i.i.d. complex Gaussian vector noise sequence with zero mean vector and covariance matrix
$$E\{w_k w_k^H\} = \sigma^2 I_2.$$
For small to moderate polarization crosstalk, the matrix $M_k$ can be modelled as [6]
$$M_k = \begin{pmatrix} 1 & \lambda_{1,k} \\ \lambda_{2,k} & 1 \end{pmatrix},$$
where
$$\lambda_k = (\lambda_{1,k}, \lambda_{2,k})^T$$
is the $k$-th element of a complex Gaussian random vector sequence, which is hereafter modelled as a free-running 1-causal ARMA (autoregressive moving average) process, hence
$$\lambda_k = \sum_{i=1}^{p} b_i v_{k-i} + \sum_{i=1}^{q} a_i \lambda_{k-i},$$
where $v_k$ is the $k$-th sample of a white Gaussian random vector sequence with zero mean and covariance matrix
$$E\{v_k v_k^H\} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$
In other words, $\{\lambda_k\}$ is the filtered version of $\{v_k\}$, where the filter is made of two shift registers, one for $\{v_{1,k}\}$ and the other for $\{v_{2,k}\}$, each with $m$ memories, with 1-causal feedback taps $a_1^q$ and 1-causal forward taps $b_1^p$, where
$$m = \max\{p, q\}.$$
Using the z-transform, one writes
$$\lambda(z) = v(z)\,\frac{b(z)}{1 - a(z)},$$
where
$$b(z) = \sum_{i=1}^{m} b_i z^{-i}, \qquad a(z) = \sum_{i=1}^{m} a_i z^{-i}.$$
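To make the generation of the cross-coupling process concrete, the following Python sketch (ours, not from the paper; the function name and interface are assumptions, and $\rho$ is taken real, as in the simulations of Section 4) drives the 1-causal ARMA recursion above with correlated white complex Gaussian noise and forms the channel output $y_k = M_k x_k + w_k$.

import numpy as np

rng = np.random.default_rng(0)

def run_polmux_channel(x, a, b, rho, sigma2):
    """Generate y_k = M_k x_k + w_k with ARMA cross-coupling (illustrative sketch).

    x      : (N, 2) complex array of transmitted symbol pairs
    a, b   : feedback taps a_1..a_m and forward taps b_1..b_m (zero-padded to m = max{p, q})
    rho    : correlation between the driving noises v_{1,k} and v_{2,k} (real here)
    sigma2 : additive-noise variance per polarization
    """
    N, m = x.shape[0], len(a)
    L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    # unit-variance circular complex Gaussian driving noise with correlation rho
    v = ((rng.standard_normal((N, 2)) + 1j * rng.standard_normal((N, 2))) / np.sqrt(2)) @ L.T
    omega = np.zeros((N + m, 2), dtype=complex)   # shift-register content, zero initial state
    lam = np.zeros((N, 2), dtype=complex)
    y = np.zeros((N, 2), dtype=complex)
    for k in range(N):
        # omega_k = v_k + sum_i a_i * omega_{k-i}
        omega[k + m] = v[k] + sum(a[i] * omega[k + m - 1 - i] for i in range(m))
        # lambda_k = sum_i b_i * omega_{k-i}
        lam[k] = sum(b[i] * omega[k + m - 1 - i] for i in range(m))
        Mk = np.array([[1.0, lam[k, 0]], [lam[k, 1], 1.0]])
        wk = np.sqrt(sigma2 / 2) * (rng.standard_normal(2) + 1j * rng.standard_normal(2))
        y[k] = Mk @ x[k] + wk
    return y, lam

For the first-order model used in Section 4, the taps a = [z_p] and b = [1 - z_p] reproduce $\lambda(z) = v(z)(1 - z_p)z^{-1}/(1 - z_p z^{-1})$.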

To cast the model in the framework of linear dynamic systems we need to define the state of the system. To this aim, let us define the vector sequence

$$\omega_k = (\omega_{1,k}, \omega_{2,k})^T = v_k + \sum_{i=1}^{m} a_i \omega_{k-i}, \qquad k = 0, 1, \ldots,$$
hence $\omega_{k-m}^{k-1}$ is the content of the two shift registers at the $k$-th channel use. Note that $\lambda_k$ depends only on $\omega_{k-m}^{k-1}$ as
$$\lambda_k = \sum_{i=1}^{m} b_i \omega_{k-i}$$
and that, given $\omega_{k-m}^{k-1}$, the sequence $\lambda_k$ is independent of $\lambda_1^{k-1}$. Therefore one can take
$$s_k = \left(1, (\omega_{1,k-m}^{k-1})^T, 1, (\omega_{2,k-m}^{k-1})^T\right)^T$$
as the state of the linear dynamic system at time k, thus writing the measurement equation and the state transition equation as
$$y_k = H_k s_k + w_k,$$
$$s_{k+1} = F s_k + \left(0, v_{1,k}, (0_1^{m-1})^T, 0, v_{2,k}, (0_1^{m-1})^T\right)^T,$$
with
$$H_k = \begin{bmatrix} x_{1,k} & x_{2,k}\,(b_1^m)^T & 0 & (0_1^m)^T \\ 0 & (0_1^m)^T & x_{2,k} & x_{1,k}\,(b_1^m)^T \end{bmatrix},$$
where $0_1^m$ is a column vector of $m$ zeros, and the $2(m+1) \times 2(m+1)$ state transition matrix is
$$F \triangleq \begin{bmatrix} F_{m+1} & O_{m+1} \\ O_{m+1} & F_{m+1} \end{bmatrix},$$
where
$$F_{m+1} \triangleq \begin{bmatrix} 1 & (0_1^{m-1})^T & 0 \\ 0 & (a_1^{m-1})^T & a_m \\ 0_1^{m-1} & I_{m-1} & 0_1^{m-1} \end{bmatrix},$$
and $O_m$ is the all-zero square matrix of size $m \times m$. The state transition probability is
$$p(s_{k+1} \mid s_k) = g_c(F s_k, Q;\, s_{k+1}),$$
where $g_c(\mu, \Sigma_m; x)$ indicates an $m$-dimensional complex Gaussian probability density function over the complex vector space spanned by $x$, with mean vector $\mu$ and covariance matrix $\Sigma_m$, and $Q$ is the covariance matrix of the process noise $\left(0, v_{1,k}, (0_1^{m-1})^T, 0, v_{2,k}, (0_1^{m-1})^T\right)^T$, that is
$$Q \triangleq \begin{bmatrix} Q_1 & Q_\rho \\ Q_\rho & Q_1 \end{bmatrix},$$
with
$$Q_1 = \begin{bmatrix} 0 & 0 & (0_1^{m-1})^T \\ 0 & 1 & (0_1^{m-1})^T \\ 0_1^{m-1} & 0_1^{m-1} & O_{m-1} \end{bmatrix}, \qquad Q_\rho = \begin{bmatrix} 0 & 0 & (0_1^{m-1})^T \\ 0 & \rho & (0_1^{m-1})^T \\ 0_1^{m-1} & 0_1^{m-1} & O_{m-1} \end{bmatrix}.$$
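The block structure of these matrices can be assembled literally from the definitions above. The following Python sketch (our helper, with assumed names; $\rho$ is taken real) builds $F$, $Q$, and the map from a candidate symbol pair $x_k$ to $H_k$.

import numpy as np

def build_state_space(a, b, rho):
    """Build F, Q and the map x_k -> H_k from the definitions above (illustrative sketch).

    a, b : length-m arrays of feedback taps a_1..a_m and forward taps b_1..b_m
    rho  : correlation between the two process-noise components (real here)
    """
    m = len(a)
    # F_{m+1}: first row keeps the leading 1, second row applies the feedback taps,
    # the remaining rows shift the register content
    F1 = np.zeros((m + 1, m + 1))
    F1[0, 0] = 1.0
    F1[1, 1:] = a
    F1[2:, 1:m] = np.eye(m - 1)
    O = np.zeros((m + 1, m + 1))
    F = np.block([[F1, O], [O, F1]])

    # Q_1 and Q_rho have their only non-zero entry in position (2, 2)
    Q1 = np.zeros((m + 1, m + 1)); Q1[1, 1] = 1.0
    Qr = np.zeros((m + 1, m + 1)); Qr[1, 1] = rho
    Q = np.block([[Q1, Qr], [Qr, Q1]])

    def H(xk):
        """Measurement matrix H_k for the transmitted pair x_k = (x_{1,k}, x_{2,k})."""
        b_row = np.asarray(b, dtype=complex)
        zeros = np.zeros(m + 1, dtype=complex)
        row1 = np.concatenate(([xk[0]], xk[1] * b_row, zeros))
        row2 = np.concatenate((zeros, [xk[1]], xk[0] * b_row))
        return np.vstack((row1, row2))

    return F, Q, H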
The joint source and channel output probability, given the hidden state, is
$$p(y_k, x_k \mid s_k) = p(x_k \mid s_k)\, p(y_k \mid x_k, s_k) = p(x_k)\, p(y_k \mid x_k, s_k),$$
where
$$p(y_k \mid x_k, s_k) = g_c(H_k s_k, \sigma^2 I_2;\, y_k).$$

The conditional probability of channel output given the hidden state is

$$p(y_k \mid s_k) = \sum_{x_k \in \mathcal{X}_k} p(x_k)\, p(y_k \mid x_k, s_k).$$

3. Upper and lower bounds to the information rate by the Kalman filter

Let

$$I(x; y) = H(x) - H(x \mid y),$$
where, for conventional M-QAM (Multi-Level Quadrature Amplitude Modulation) and M-PSK (Multi-Level Phase Shift Keying)
$$H(x) = \log_2 M.$$
For the conditional entropy, by the chain rule one writes
$$H(x \mid y) = \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} H(x_k \mid x_1^{k-1}, y_1^N),$$
which, by the Shannon-McMillan-Breiman theorem, can be evaluated as
$$H(x \mid y) = -\lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} \log_2 p(x_k \mid x_1^{k-1}, y_1^N).$$

Since conditioning does not increase entropy, we have the following upper and lower bounds to the conditional entropy

$$H(x \mid y) = \lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} H(x_k \mid x_1^{k-1}, y_1^N) \le \lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} H(x_k \mid x_1^{k-1}, y_1^k) = -\lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} \log_2 p(x_k \mid x_1^{k-1}, y_1^k),$$
$$H(x \mid y) = \lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} H(x_k \mid x_1^{k-1}, y_1^N) \ge \lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} H(x_k \mid x_1^{k-1}, x_{k+1}^N, y_1^N) = -\lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} \log_2 p(x_k \mid x_1^{k-1}, x_{k+1}^N, y_1^N),$$
which can be used in a straightforward way in the right-hand side of (22), together with (23), to get lower and upper bounds to the information rate.
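In code, this bookkeeping amounts to averaging the per-symbol log-probabilities and subtracting the resulting entropy estimates from $H(x)$. A minimal Python sketch (ours, with assumed argument names) is:

import numpy as np

def info_rate_bounds(Hx, log2p_causal, log2p_smoothed):
    """Assemble the bounds on I(x;y) from per-symbol probabilities (sketch).

    Hx             : source entropy H(x)
    log2p_causal   : samples of log2 p(x_k | x_1^{k-1}, y_1^k)
    log2p_smoothed : samples of log2 p(x_k | x_1^{k-1}, x_{k+1}^N, y_1^N)
    """
    H_cond_upper = -np.mean(log2p_causal)     # conditioning on less -> entropy upper bound
    H_cond_lower = -np.mean(log2p_smoothed)   # conditioning on more -> entropy lower bound
    return Hx - H_cond_upper, Hx - H_cond_lower   # (lower, upper) bounds on the information rate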

Let us consider the upper bound (26). The probabilities inside the logarithm can be evaluated by the Kalman filter as follows. The knowledge of the past transmitted symbols appearing in the conditioning is imported into the Kalman filter by including all the conditioning variables in the measurement, hence by updating the Kalman filter in data-aided mode. Let us write the channel output as

$$y_k = H_k s_k + w_k = h_k(s_k) + w_k.$$
The predicted measurement at time k is
$$\hat{y}_k = H_k \hat{s}_k,$$
where $\hat{s}_k$ denotes the state predicted by the Kalman filter at time $k$, that is, the expectation of the hidden state given the past measurements
$$\hat{s}_k = E\{s_k \mid y_1^{k-1}, x_1^{k-1}\}.$$
As the innovation process we take
$$u_k = y_k - \hat{y}_k = H_k(s_k - \hat{s}_k) + w_k.$$
Starting from an initial pair $(\hat{\Sigma}_1, \hat{s}_1)$, where
$$\hat{\Sigma}_k = E\{(s_k - \hat{s}_k)(s_k - \hat{s}_k)^H\},$$
for k = 1, 2, ⋯, the state prediction vector and the prediction error covariance matrix evolve as
$$\hat{s}_{k+1} = F(\hat{s}_k + K_k u_k),$$
$$\hat{\Sigma}_{k+1} = F \Sigma_k F^T + Q,$$
where
$$\Sigma_k = \left((\hat{\Sigma}_k)^{-1} + \sigma^{-2} H_k^H H_k\right)^{-1},$$
$$K_k = \sigma^{-2} \Sigma_k H_k^H.$$
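A minimal Python sketch of one step of this recursion follows (our helper names; the gain is written, consistently with the covariance update above, as $K_k = \sigma^{-2}\Sigma_k H_k^H$).

import numpy as np

def kalman_step(s_hat, Sigma_hat, Hk, yk, F, Q, sigma2):
    """One data-aided Kalman prediction/update step following the recursion above (sketch).

    s_hat, Sigma_hat : predicted state and prediction-error covariance at time k
    Hk               : measurement matrix built from the known symbol x_k
    yk               : received sample y_k
    Returns the predictions for time k+1 and the posterior covariance at time k.
    """
    # posterior covariance: ((Sigma_hat_k)^{-1} + sigma^{-2} H_k^H H_k)^{-1}
    Sigma = np.linalg.inv(np.linalg.inv(Sigma_hat) + (Hk.conj().T @ Hk) / sigma2)
    # gain and innovation
    K = Sigma @ Hk.conj().T / sigma2
    u = yk - Hk @ s_hat
    # predictions for time k+1
    s_hat_next = F @ (s_hat + K @ u)
    Sigma_hat_next = F @ Sigma @ F.T + Q
    return s_hat_next, Sigma_hat_next, Sigma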

The desired probability is evaluated as

$$p(x_k \mid x_1^{k-1}, y_1^k) = p(x_k \mid y_k, x_1^{k-1}, y_1^{k-1}) = \frac{p(x_k \mid x_1^{k-1}, y_1^{k-1})\, p(y_k \mid x_k, x_1^{k-1}, y_1^{k-1})}{\sum_{x_k \in \mathcal{X}_k} p(x_k \mid x_1^{k-1}, y_1^{k-1})\, p(y_k \mid x_k, x_1^{k-1}, y_1^{k-1})} = \frac{p(x_k)\, p(y_k \mid x_1^{k}, y_1^{k-1})}{\sum_{x_k \in \mathcal{X}_k} p(x_k)\, p(y_k \mid x_1^{k}, y_1^{k-1})},$$
where, using the predicted state and the prediction error covariance matrix computed by the Kalman filter, one has
$$p(y_k \mid x_1^{k}, y_1^{k-1}) = \int_S p(s_k, y_k \mid x_1^{k}, y_1^{k-1})\, ds_k = \int_S p(s_k \mid x_1^{k}, y_1^{k-1})\, p(y_k \mid s_k, x_1^{k}, y_1^{k-1})\, ds_k = \int_S p(s_k \mid x_1^{k-1}, y_1^{k-1})\, p(y_k \mid s_k, x_k)\, ds_k = \int_S g_c(\hat{s}_k, \hat{\Sigma}_k;\, s_k)\, g_c(H_k s_k, \sigma^2 I_2;\, y_k)\, ds_k = g_c\!\left(H_k \hat{s}_k,\; H_k \hat{\Sigma}_k H_k^H + \sigma^2 I_2;\; y_k\right).$$
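This evaluation can be coded directly. The following Python sketch (ours, with assumed helper names and a uniform prior over the constellation) computes the predictive densities $g_c(H_k\hat{s}_k, H_k\hat{\Sigma}_k H_k^H + \sigma^2 I_2; y_k)$ for every candidate symbol pair and normalizes them into the posterior.

import numpy as np

def gc(mu, Sigma, y):
    """Circularly symmetric complex Gaussian density g_c(mu, Sigma; y)."""
    d = y - mu
    return np.exp(-np.real(d.conj() @ np.linalg.solve(Sigma, d))) / \
           (np.pi ** len(y) * np.real(np.linalg.det(Sigma)))

def symbol_posterior(yk, s_hat, Sigma_hat, H_of_x, constellation, sigma2):
    """p(x_k | x_1^{k-1}, y_1^k) over the candidate symbol pairs (sketch, uniform prior).

    H_of_x        : callable returning H_k for a candidate pair x_k
    constellation : iterable of candidate pairs x_k
    """
    I2 = np.eye(2)
    lik = np.array([gc(Hx @ s_hat, Hx @ Sigma_hat @ Hx.conj().T + sigma2 * I2, yk)
                    for Hx in (H_of_x(x) for x in constellation)])
    return lik / lik.sum()   # the uniform prior p(x_k) cancels in the ratio

The contribution of time $k$ to the upper bound on the conditional entropy is then $-\log_2$ of the posterior entry associated with the symbol pair actually transmitted.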
Similarly, for the lower bound to the conditional entropy, one has
$$p(x_k \mid x_1^{k-1}, x_{k+1}^N, y_1^N) = \frac{p(x_k)\, p(y_k \mid x_1^N, y_1^{k-1}, y_{k+1}^N)}{\sum_{x_k \in \mathcal{X}_k} p(x_k)\, p(y_k \mid x_1^N, y_1^{k-1}, y_{k+1}^N)},$$
with
$$p(y_k \mid x_1^N, y_1^{k-1}, y_{k+1}^N) = \int_S p(s_k, y_k \mid x_1^N, y_1^{k-1}, y_{k+1}^N)\, ds_k = \int_S p(s_k \mid x_1^N, y_1^{k-1}, y_{k+1}^N)\, p(y_k \mid s_k, x_1^N, y_1^{k-1}, y_{k+1}^N)\, ds_k = \int_S p(s_k \mid x_1^{k-1}, y_1^{k-1}, x_{k+1}^N, y_{k+1}^N)\, p(y_k \mid s_k, x_k)\, ds_k = \int_S g_c(\hat{s}_{fb,k}, \hat{\Sigma}_{fb,k};\, s_k)\, g_c(H_k s_k, \sigma^2 I_2;\, y_k)\, ds_k = g_c\!\left(H_k \hat{s}_{fb,k},\; H_k \hat{\Sigma}_{fb,k} H_k^H + \sigma^2 I_2;\; y_k\right),$$
where $\hat{s}_{fb,k}$ and $\hat{\Sigma}_{fb,k}$ are the estimates produced by combining a forward and a backward Kalman filter as
$$\hat{s}_{fb} = \hat{\Sigma}_b(\hat{\Sigma}_f + \hat{\Sigma}_b)^{-1} \hat{s}_f + \hat{\Sigma}_f(\hat{\Sigma}_f + \hat{\Sigma}_b)^{-1} \hat{s}_b,$$
$$\hat{\Sigma}_{fb} = \left(\hat{\Sigma}_f^{-1} + \hat{\Sigma}_b^{-1}\right)^{-1}.$$
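A direct Python transcription of this forward-backward combination (our helper, assumed names) is:

import numpy as np

def combine_forward_backward(s_f, Sigma_f, s_b, Sigma_b):
    """Combine forward and backward Kalman estimates as in the two equations above (sketch)."""
    W = np.linalg.inv(Sigma_f + Sigma_b)
    s_fb = Sigma_b @ W @ s_f + Sigma_f @ W @ s_b
    Sigma_fb = np.linalg.inv(np.linalg.inv(Sigma_f) + np.linalg.inv(Sigma_b))
    return s_fb, Sigma_fb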

4. Simulation results

The consideration of realistic spectra of the cross-polarization coefficients is beyond the scope of the present paper and is left to future studies. For practical methods to estimate the strength of the cross-polarization interference the reader is referred to [6], where the strength of the interference is given by its autocorrelation at time zero. In the following we express the strength of the interference through the SIR (signal-to-interference ratio), which is the inverse of the interference autocorrelation at time zero. To derive simulation results, we set $\rho = 0$ and, for each of the two random coefficients appearing in the Pol-Mux matrix, we take the first-order ARMA model

$$\lambda(z) = v(z)\,\frac{(1 - z_p)\, z^{-1}}{1 - z_p z^{-1}},$$
where $-1 < z_p < 1$ is the pole of the first-order ARMA model. The filtered sequence has zero mean, unit power spectral density at frequency zero, and power
$$E\{|\lambda_k|^2\} = \frac{1 - z_p}{1 + z_p},$$
hence the SIR is
$$\mathrm{SIR} = \frac{1 + z_p}{1 - z_p}.$$

In the common case where $z_p$ is close to 1, the filtered sequence is a first-order low-pass random sequence with $-3$ dB normalized bandwidth

$$B_{-3} \approx \frac{1 - z_p}{2\pi}.$$
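As a sanity check on the values used in the figures, the following short Python helpers (ours) map $z_p$ to the SIR and to the normalized bandwidth above, and reproduce the SIR values quoted below.

import numpy as np

def sir_db(zp):
    """SIR in dB of the first-order model with pole zp."""
    return 10 * np.log10((1 + zp) / (1 - zp))

def b3_normalized(zp):
    """Approximate -3 dB normalized bandwidth, valid for zp close to 1."""
    return (1 - zp) / (2 * np.pi)

print(round(sir_db(0.977), 1))   # 19.3 dB, the moderate-interference case of Fig. 1
print(round(sir_db(0.887), 1))   # 12.2 dB, the strong-interference case of Fig. 2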

Figure 1 gives the upper and lower bounds to the information rate of 4-QAM, 16-QAM, and 64-QAM obtained with $z_p = 0.977$, corresponding to SIR = 19.3 dB. With such moderate interference the two bounds are close to each other, even for 64-QAM. Moreover, at high values of SNR (signal-to-noise ratio) the information rates reach the maximum value allowed by the constellation sizes, i.e., the value achievable on the pure AWGN (additive white Gaussian noise) channel: 4 bits for 2 × 4-QAM, 8 bits for 2 × 16-QAM, and 12 bits for 2 × 64-QAM.

Fig. 1 Upper and lower bounds to the information rate for various modulation formats and $z_p = 0.977$. The signal-to-noise ratio is $\mathrm{SNR} = 1/\sigma^2$.

Figure 2 gives the same upper and lower bounds obtained with $z_p = 0.887$, that is, SIR = 12.2 dB. In practice this appears to be a strong interference condition, since the minimum SIR reported in the experimental results of [6] is around 14 dB. In this case, the information rate with 64-QAM at high SNR remains well below the information rate achieved on the AWGN channel, thus confirming that the Pol-Mux interference becomes the limiting factor for the information rate transferred through the channel. We also note that the spread between the upper and lower bounds becomes large with 64-QAM at high SNR, where the capability of tracking the MIMO channel becomes crucial. In essence, the lower bound gives up the blind part of the tracking, thus sacrificing some tracking capability, while the upper bound upgrades the blind tracking to data-aided tracking, thus granting more tracking capability than can actually be achieved.

Fig. 2 Upper and lower bounds to the information rate for various modulation formats and $z_p = 0.887$. The signal-to-noise ratio is $\mathrm{SNR} = 1/\sigma^2$.

5. Conclusions

We have proposed upper and lower bounds to the information rate of the Pol-Mux channel and shown simulation results for a specific channel model. The results show that, with moderate interference, the bounds are so close that they virtually pin down the exact information rate. With strong interference and modulation formats of high spectral efficiency there is still some spread between the two bounds, leaving room for future investigation.

References and links

1. M. S. A. S. Al Fiad, M. Kuschnerov, S. L. Jansen, T. Wuth, D. van den Borne, and H. de Waardt, “11 × 224-Gb/s POLMUX-RZ-16QAM transmission over 670 km of SSMF with 50-GHz channel spacing,” IEEE Photon. Technol. Lett. 22(15), 1150–1152 (2010).

2. V. A. J. M. Sleiffer, M. S. A. S. Al Fiad, D. van den Borne, M. Kuschnerov, V. Veljanovski, M. Hirano, Y. Yamamoto, T. Sasaki, S. L. Jansen, T. Wuth, and H. de Waardt, “10 × 224-Gb/s POLMUX-16QAM transmission over 656 km of Large-Aeff PSCF with a spectral efficiency of 5.6 b/s/Hz,” IEEE Photon. Technol. Lett. 23(20), 1427–1429 (2011).

3. P. Boffi, M. Ferrario, L. Marazzi, P. Martelli, P. Parolari, A. Righetti, R. Siano, and M. Martinelli, “Stable 100-Gb/s POLMUX-DQPSK transmission with automatic polarization stabilization,” IEEE Photon. Technol. Lett. 21(11), 745–747 (2009).

4. G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Pers. Commun. 6(3), 311–335 (1998).

5. S. J. Savory, “Digital coherent optical receivers: Algorithms and subsystems,” IEEE J. Sel. Top. Quantum Electron. 16(5), 1164–1179 (2010).

6. L. Li, Z. Tao, L. Liu, W. Yan, S. Oda, T. Hoshida, and J. C. Rasmussen, “Nonlinear polarization crosstalk canceller for dual-polarization digital coherent receivers,” presented at the Optical Fiber Communication Conference, collocated with the National Fiber Optic Engineers Conference (OFC/NFOEC), IEEE, Piscataway, NJ, USA, 21 March 2010.

7. P. Layec, A. Ghazisaeidi, G. Charlet, J.-C. Antona, and S. Bigo, “Generalized maximum likelihood for cross-polarization modulation effects compensation,” J. Lightwave Technol. 33(7), 1300–1307 (2015).

8. J. Li, R. Schmogrow, D. Hillerkuss, P. C. Schindler, M. Nazarathy, C. Schmidt-Langhorst, S.-B. Ezra, I. Tselniker, C. Koos, W. Freude, and J. Leuthold, “A self-coherent receiver for detection of PolMUX coherent signals,” Opt. Express 20(19), 21413–21433 (2012).

9. R. H. Etkin and D. N. C. Tse, “Degrees of freedom in some underspread MIMO fading channels,” IEEE Trans. Inf. Theory 52(4), 1576–1608 (2006).

10. L. Barletta, M. Magarini, S. Pecorino, and A. Spalvieri, “Upper and lower bounds to the information rate transferred through first-order Markov channels with free-running continuous state,” IEEE Trans. Inf. Theory 60(7), 3834–3844 (2014).

11. L. Barletta, M. Magarini, and A. Spalvieri, “Estimate of information rates of discrete-time first-order Markov phase noise channels,” IEEE Photon. Technol. Lett. 23(21), 1582–1584 (2011).

12. L. Barletta, M. Magarini, and A. Spalvieri, “The information rate transferred through the discrete-time Wiener’s phase noise channel,” J. Lightwave Technol. 30(10), 1480–1486 (2012).

13. L. Barletta, M. Magarini, and A. Spalvieri, “A new lower bound below the information rate of Wiener phase noise channel based on Kalman carrier recovery,” Opt. Express 20(23), 25471–25477 (2012).
