## Abstract

A mutual-information-inspired nonbinary coded modulation design with non-uniform shaping is proposed. Instead of the traditional power-of-two signal constellation sizes, we design 5-QAM, 7-QAM, and 9-QAM constellations, which can be used in adaptive optical networks. The non-uniform shaping and the LDPC code rate are considered jointly in the design, which yields a better-performing scheme at the same SNR values. A matched nonbinary (NB) LDPC code is used for this scheme, which further improves the coding gain and the overall performance. We analyze both coding performance and system SNR performance. We show that the proposed NB LDPC-coded 9-QAM has more than 2 dB gain in symbol SNR compared to traditional LDPC-coded star-8-QAM. In addition, the proposed NB LDPC-coded 5-QAM and 7-QAM perform even better than LDPC-coded QPSK.

© 2016 Optical Society of America

## 1. Introduction

Photonic networks have been evolving at an increasing pace in recent decades thanks to the convergence of technologies developed in both the lower and upper layers. The emergence of coherent optical modems, flexible-grid wavelength selective switches, and a handful of other important optical and electrical components provides a whole new level of flexibility. To further improve adaptivity and flexibility, researchers are looking for room at the subsystem level, such as flexible coded modulation schemes, which offer a more efficient way to fine-tune the trade-off between cost and performance. Adaptive techniques such as GLDPC coding [1,2] (generalized in the sense that the check processor is not limited to a single parity-check code), code shortening, code puncturing, and selective constellations [3] all aim to provide a better solution to the performance/cost trade-off.

On the other hand, increasing the per-channel data rate has been a key cost driver, but it is becoming harder to sustain. One possible way is to increase the spectral efficiency by using a larger constellation, such as moving from QPSK to star-8-QAM. For digital binary data, increasing the constellation size by a factor of two will be limited by the effective number of bits (ENOB) of the digital-to-analog converter (DAC) in the near future.

In Shannon information theory, the key step to maximize the capacity of a communication channel is to determine the capacity-achieving input probability mass function (p.m.f.). For constellation-constrained digital communication applications, such as optical fiber, unequal transition probabilities between input and output symbols, input power constraints, or input symbols of unequal durations can result in a non-uniform capacity-achieving shaping. Non-uniform shaping is a timely topic in optical communications [4–7]. Different shaping algorithms, such as Blahut-Arimoto [6] and Maxwell-Boltzmann [5], are widely used. However, it is nontrivial to apply forward error correction (FEC) coding to these non-uniform signals, especially to obtain the exact bit-error rate (BER) from the shaped symbols.

In this paper, we propose a non-uniform nonbinary LDPC coded-modulation scheme, which is a promising solution that offers both flexibility and excellent performance. We use the Huffman source code as the prefix-free encoder, which is very easy to implement, and the BER can easily be derived from symbol errors. To deal with general M-ary constellations, such as 5-QAM and 7-QAM, we use a matched nonbinary LDPC code over a Galois field (GF) of the same size. This scheme not only provides more flexibility for system design, but also outperforms traditional QPSK and star-8-QAM with even lower complexity.

The paper is organized as follows. The system diagram and coded modulation blocks are presented in Section 2. Section 3 describes the proposed mutual-information-directed constellation design with non-uniform shaping [8–11]. In Section 4, the generalized nonbinary LDPC code is studied for constellations of matched size; it is generalized in the sense that it is applicable to any prime power. The simulation results and analysis are provided in Section 5. Finally, we conclude the paper in Section 6.

## 2. Proposed generalized nonbinary coded modulation scheme

Figure 1 shows the system diagram of the proposed non-uniform nonbinary coded modulation scheme. To simplify the explanation, only a single polarization is shown. A uniformly distributed binary source is assumed at the transmitter side, followed by a bits-to-bytes interleaver. A nonbinary LDPC encoder, whose GF size is not limited to a power of two, is used for error correction. The encoded bytes are then mapped to the M-ary signal constellation. Because long runs of the zero-energy symbol can occur, a power distributer interleaves the symbols to minimize the duration of such runs. At the receiver side, after balanced coherent detection, ADC, and carrier phase estimation, the received symbols are restored to their original order and then passed to the nonbinary LDPC decoder, with additional details provided in Section 4. After the nonbinary LDPC decoder, a byte error rate is calculated. The byte error rate is the same as the symbol error rate when the GF size is matched to the signal constellation size (M), and it differs if a smaller field is used for coding. The decoded bits are obtained by deinterleaving the bytes at the decoder output, and the BER is counted at the end.

## 3. Probability shaping and mutual information inspired signal constellation design

Kschischang and Pasupathy proposed in [12] to use the Huffman source code of the p.m.f. that maximizes entropy at the channel input as a prefix-free encoder. We use a non-uniform signaling scheme similar to the one described in [4], with a Huffman encoder. In this paper, we consider three non-uniform constellations, namely 5-QAM, 7-QAM, and 9-QAM.

The beauty of this scheme is that the symbol-to-bits mapping is generated as we perform Huffman encoding, so the exact BER can easily be calculated by counting the bit differences of the corresponding symbol error pairs under the provided bit mapping. In Fig. 2 we show the non-uniform Huffman codes for 5-QAM, 7-QAM, and 9-QAM. There are two layers in 5-QAM and three layers in 7-QAM and 9-QAM. Patterns in the same layer share the same probability of occurrence. The corresponding bit mappings and probabilities are also provided in the same figure.
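The way a Huffman code yields both the prefix-free labels and the symbol probabilities can be sketched as follows. The layer probabilities in `PMF_5QAM`, `PMF_7QAM`, and `PMF_9QAM` are hypothetical dyadic values chosen only so that the average codeword length equals 2, 2.25, and 3 bits; they are not read from Fig. 2, and the bit labels produced here need not match the figure.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Return one binary codeword (as a string) per symbol index."""
    tie = count()  # tie-breaker so heapq never has to compare the symbol lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    codes = [""] * len(probs)
    while len(heap) > 1:
        p0, _, syms0 = heapq.heappop(heap)  # two least likely subtrees
        p1, _, syms1 = heapq.heappop(heap)
        for i in syms0:
            codes[i] = "0" + codes[i]
        for i in syms1:
            codes[i] = "1" + codes[i]
        heapq.heappush(heap, (p0 + p1, next(tie), syms0 + syms1))
    return codes


def average_length(probs, codes):
    """Average number of bits carried per transmitted symbol."""
    return sum(p * len(c) for p, c in zip(probs, codes))


# Hypothetical layer probabilities (assumption, not taken from the paper's Fig. 2).
PMF_5QAM = [1 / 2] + [1 / 8] * 4                  # two layers
PMF_7QAM = [1 / 2] + [1 / 8] * 2 + [1 / 16] * 4   # three layers
PMF_9QAM = [1 / 4] + [1 / 8] * 4 + [1 / 16] * 4   # three layers

if __name__ == "__main__":
    for name, pmf in [("5-QAM", PMF_5QAM), ("7-QAM", PMF_7QAM), ("9-QAM", PMF_9QAM)]:
        codes = huffman_code(pmf)
        print(name, codes, average_length(pmf, codes))
```

Because the assumed probabilities are negative powers of two, each codeword length equals $-\log_2$ of its symbol probability, so a uniform bit stream parsed by this code reproduces the target p.m.f.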

The next step is to find a proper M-ary signal constellation design for the channel. We assume a scenario dominated by amplified spontaneous emission (ASE) noise, and constellation points in the same layer of the constellation have the same probability. The number of layers of the constellation therefore depends on the number of different probabilities. The M-ary constellation associated with a length-M probability vector is given as follows:

The achievable information rate (AIR) can be calculated by the mutual information between the input symbol sequence X and the received constellation point sequence Y as follows:

*x*. Notice that even when the information byte follows the $q$ distribution, the encoder will change this p.m.f. The uniform distribution is denoted by $m$. We assume the encoder is systematic and has code rate *R*. The parity-check bytes follow the uniform distribution, so that the input p.m.f. $p$ can be calculated as
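The expression that followed this sentence did not survive in this copy. Under the assumptions stated in the text (systematic encoder of rate $R$, information bytes distributed as $q$, parity bytes uniform as $m$), one natural reading is a rate-weighted mixture. The form below is a reconstruction offered as a sketch, not the authors' typeset equation, and the numerical $q$ in the comment is a hypothetical 5-QAM example.

```latex
p = R\,q + (1 - R)\,m, \qquad m_k = \frac{1}{M}, \quad k = 1, \dots, M .
% Illustration with assumed values: M = 5, R = 0.8,
% q = (1/2, 1/8, 1/8, 1/8, 1/8), m = (1/5, ..., 1/5)
% p = (0.44, 0.14, 0.14, 0.14, 0.14)
```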

The accuracy of the mutual information calculation is critical here, as the gap to capacity is small and hard to capture. We use Gauss-Hermite quadrature [13] to evaluate the integral, which is numerically more accurate than direct numerical integration or Monte Carlo integration. The first part of Eq. (2) is straightforward to calculate, while the integral in the second part can be calculated by a change of variables for the pdf of the two-dimensional Gaussian distribution.
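As a minimal sketch of this quadrature step, the function below evaluates the mutual information of a two-dimensional constellation over an additive white Gaussian noise channel with a tensor-product Gauss-Hermite rule. It assumes NumPy is available, defines SNR as $E_s / (2\sigma^2)$ with $\sigma^2$ the noise variance per dimension, and uses an algebraically equivalent single-expectation form rather than the two-part split of Eq. (2); the name `awgn_mutual_information` and the quadrature order are illustrative choices.

```python
import numpy as np


def awgn_mutual_information(points, probs, snr_db, order=32):
    """Mutual information in bits/symbol of a 2-D constellation over AWGN."""
    x = np.asarray(points, dtype=complex)
    p = np.asarray(probs, dtype=float)
    es = np.sum(p * np.abs(x) ** 2)               # average symbol energy
    sigma2 = es / (2.0 * 10 ** (snr_db / 10.0))   # noise variance per dimension
    t, w = np.polynomial.hermite.hermgauss(order)
    # Change of variable n = sqrt(2*sigma2) * t maps the Gaussian expectation
    # onto a Gauss-Hermite sum with weights w_i * w_j / pi.
    noise = np.sqrt(2.0 * sigma2) * (t[:, None] + 1j * t[None, :])
    weight = w[:, None] * w[None, :] / np.pi
    mi = 0.0
    for xk, pk in zip(x, p):
        d = (xk - x)[:, None, None]               # differences to every point
        # exponent of p(y|x') / p(y|x_k) evaluated at y = x_k + n
        expo = -(np.abs(d + noise) ** 2 - np.abs(noise) ** 2) / (2.0 * sigma2)
        ratio = np.sum(p[:, None, None] * np.exp(expo), axis=0)  # p(y)/p(y|x_k)
        mi -= pk * np.sum(weight * np.log2(ratio))
    return mi


if __name__ == "__main__":
    qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    print(awgn_mutual_information(qpsk, [0.25] * 4, 7.0))
```

At high SNR the ratio inside the logarithm tends to the symbol probability, so the result approaches the input entropy, which gives a quick sanity check on the implementation.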

The constellation design is an optimization problem that minimizes the gap to the Shannon limit. Since the number of layers is fixed, we only need to arrange the points in each layer and adjust the relative angle and distance between layers. In general, more than two dimensions can be optimized, but we show only the 9-QAM design as an illustrative example, with the third-layer magnitude and the relative angle $\theta$ as variables, and we assume that the points within the same layer are equally spaced.

In Fig. 3(a) we illustrate the three-layer 9-QAM signal constellation design obtained by changing the third-layer magnitude. The two-dimensional optimization for 9-QAM is summarized in Fig. 3(b); the maximum is attained at a magnitude of $\sqrt{2}$ and a relative angle $\theta =\frac{\pi}{4}$. Notice that the metric we select is the achievable information rate at SNR = 7 dB. Figure 4 shows the optimization of the third-layer magnitude of 9-QAM, which minimizes the gap to the Shannon limit. Four different magnitude values are tested, and the process selects the one that minimizes the gap to the Shannon limit. In most cases, the solution that is optimal at one SNR is also the best in the nearby SNR region. With non-uniform 9-QAM, the spectral efficiency is 3 bits per symbol, the same as for star 8-QAM, but the 9-QAM constellation is more than 0.1 bits/channel use closer to the Shannon limit than star 8-QAM in most of the SNR regions of interest.
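The two-dimensional search over the third-layer magnitude and relative angle can be sketched as a small grid search scored by the achievable information rate at 7 dB. Everything specific here is an assumption made for illustration: a centre point at the origin, a second layer of four points at unit radius, hypothetical layer probabilities 1/4, 1/8, and 1/16, the candidate grids `MAGNITUDES` and `ANGLES`, and the use of the Huffman p.m.f. directly (ignoring the change introduced by parity bytes). The sketch is therefore not expected to reproduce Fig. 3(b) exactly.

```python
import numpy as np


def air_bits(points, probs, snr_db, order=24):
    """AIR (bits/symbol) over AWGN via tensor-product Gauss-Hermite quadrature."""
    x = np.asarray(points, dtype=complex)
    p = np.asarray(probs, dtype=float)
    sigma2 = np.sum(p * np.abs(x) ** 2) / (2.0 * 10 ** (snr_db / 10.0))
    t, w = np.polynomial.hermite.hermgauss(order)
    noise = np.sqrt(2.0 * sigma2) * (t[:, None] + 1j * t[None, :])
    weight = w[:, None] * w[None, :] / np.pi
    total = 0.0
    for xk, pk in zip(x, p):
        d = (xk - x)[:, None, None]
        expo = -(np.abs(d + noise) ** 2 - np.abs(noise) ** 2) / (2.0 * sigma2)
        ratio = np.sum(p[:, None, None] * np.exp(expo), axis=0)
        total -= pk * np.sum(weight * np.log2(ratio))
    return total


def nine_qam(r3, theta, r2=1.0):
    """Centre point, four points at radius r2, four at radius r3 rotated by theta."""
    inner = [r2 * np.exp(1j * k * np.pi / 2) for k in range(4)]
    outer = [r3 * np.exp(1j * (theta + k * np.pi / 2)) for k in range(4)]
    return [0j] + inner + outer


PROBS = [1 / 4] + [1 / 8] * 4 + [1 / 16] * 4   # hypothetical layer probabilities
MAGNITUDES = (1.2, np.sqrt(2.0), 1.6, 1.8)     # hypothetical candidate radii
ANGLES = (0.0, np.pi / 8, np.pi / 4)           # hypothetical candidate angles


def grid_search(snr_db=7.0):
    """Return (best AIR, magnitude, angle) over the candidate grid."""
    best = None
    for r3 in MAGNITUDES:
        for theta in ANGLES:
            score = air_bits(nine_qam(r3, theta), PROBS, snr_db)
            if best is None or score > best[0]:
                best = (score, r3, theta)
    return best


if __name__ == "__main__":
    print(grid_search())
```

A finer grid or a continuous optimizer could replace the two loops; the exhaustive form is kept here because the search space has only two variables.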

Similar design rules can be applied to other constellations. For 7-QAM, we consider not only the magnitudes of the layers and the relative angles, but also the distribution of points within a layer, which, in general, can be unequally spaced. With the proposed design rules, the resulting 5-QAM, 7-QAM, and 9-QAM signal constellations are shown in Fig. 5, with the corresponding mapping rules described in Fig. 2. When counting the BER from symbol error pairs, we use punctured bits to handle the length mismatch, and the BER is calculated for each pair of punctured bits as well. The punctured bits are shown with a green underline in Fig. 5.

Figure 6 shows the achievable information rate (AIR) vs. SNR for various non-uniform constellations. The lowest curve, with 2 bits per channel use, is QPSK, and the non-uniform 5-QAM has an identical AIR. Later we will show that 5-QAM is still better than QPSK because of the coding gain from its higher GF size. The non-uniform 7-QAM carries 2.25 bits/channel use, which is slightly higher than that of QPSK. The AIRs for star 8-QAM and non-uniform 9-QAM are also shown in Fig. 6. The AIR of non-uniform 9-QAM is higher than that of star 8-QAM in most of the mid-SNR region and has a smaller gap to the Shannon limit, as shown in Fig. 4.

## 4. Generalized nonbinary LDPC decoder

To match the constellation size exactly with the code, we use generalized nonbinary LDPC coding whose GF size is not limited to a power of two. GF(3), GF(5), GF(7), and GF($3^{2}$) are not popular choices for traditional communication systems, since they are not binary extension fields, but they are the best choices for the proposed M-ary constellations with non-uniform distribution. Here we develop the nonbinary LDPC decoder for any prime power.
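To make the prime-power case concrete, the sketch below implements arithmetic in GF($3^{2}$) by representing each element as a pair of GF(3) coefficients $(a_0, a_1)$ of $a_0 + a_1 x$ and reducing modulo $x^2 + 1$. The choice of $x^2 + 1$ (irreducible over GF(3) because $-1$ is not a square modulo 3) and the function names are assumptions made for illustration; the paper does not state which polynomial or representation it uses.

```python
def gf9_add(a, b):
    """Add two GF(9) elements given as coefficient pairs over GF(3)."""
    return ((a[0] + b[0]) % 3, (a[1] + b[1]) % 3)


def gf9_mul(a, b):
    """Multiply (a0 + a1*x)(b0 + b1*x) and reduce with x^2 = -1."""
    return ((a[0] * b[0] - a[1] * b[1]) % 3, (a[0] * b[1] + a[1] * b[0]) % 3)


ELEMENTS = [(c0, c1) for c1 in range(3) for c0 in range(3)]
ZERO, ONE = (0, 0), (1, 0)


def gf9_inverse(a):
    """Multiplicative inverse by exhaustive search (the field has only 9 elements)."""
    if a == ZERO:
        raise ZeroDivisionError("zero has no inverse in GF(9)")
    for b in ELEMENTS:
        if gf9_mul(a, b) == ONE:
            return b
    raise ArithmeticError("no inverse found; reduction polynomial is not irreducible")


def powers_of(a):
    """List a^1, a^2, ... up to and including the first power equal to one."""
    out, cur = [], a
    while True:
        out.append(cur)
        if cur == ONE:
            return out
        cur = gf9_mul(cur, a)


if __name__ == "__main__":
    print(powers_of((1, 1)))  # 1 + x generates all eight nonzero elements
```

With such helpers, the same decoder structure can be reused for prime fields (plain modular arithmetic) and for extension fields by swapping the add and multiply routines.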

As an illustration, the GF($3^{2}$) LDPC code, with Tanner graph shown in Fig. 7, is constructed by randomly replacing the nonzero elements in a binary LDPC code with nonzero elements from GF($3^{2}$) [14]. We use a regular girth-10 LDPC code with length 16935 and rate 0.8. The sum-product algorithm is used for decoding. The variable node processor (sum operator) is the same as in a typical nonbinary LDPC decoder, while the check node processor (product operator) is based on a finite state machine with the BCJR algorithm, as shown in Fig. 8.
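The replacement step can be sketched on a toy matrix. The matrix `H_BINARY`, the field size 7, and the helper names below are hypothetical and chosen for brevity; the code described above is far larger, and for GF($3^{2}$) the integer labels would index field elements and require extension-field arithmetic rather than the modular sum used in `syndrome_mod_prime`.

```python
import random

# Toy binary parity-check matrix (hypothetical, for illustration only).
H_BINARY = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]


def lift_to_nonbinary(h_binary, q, seed=0):
    """Replace every 1 with a random nonzero label in 1..q-1; zeros stay zero."""
    rng = random.Random(seed)
    return [[rng.randrange(1, q) if bit else 0 for bit in row] for row in h_binary]


def syndrome_mod_prime(h, word, q):
    """Syndrome H * word over GF(q); valid only when q is prime."""
    return [sum(a * c for a, c in zip(row, word)) % q for row in h]


if __name__ == "__main__":
    h7 = lift_to_nonbinary(H_BINARY, 7, seed=1)
    for row in h7:
        print(row)
    print(syndrome_mod_prime(h7, [0] * 6, 7))
```

Since only the edge labels change, the Tanner graph, and therefore its girth and degree distribution, is inherited from the binary code.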

In the check node BCJR processor, the number of states is equal to the finite field size for the single parity-check (SPC) code, and the number of stages is determined by the check node's degree. In the BCJR algorithm, three important metrics, namely the forward, backward, and branch metrics, are defined, respectively, as follows:

Forward metric:

Backward metric:

Branch metric:

To avoid the dependence on all previously received symbols $\overrightarrow{y}\left[1,s\right]$, we update ${\alpha}_{i}\left(s\right)$ and ${\beta}_{i}\left(s\right)$ iteratively. The calculation of ${\alpha}_{i}\left(s\right)$ depends solely on the previous value ${\alpha}_{i-1}\left(s\right)$ and the current branch metric ${\gamma}_{i}\left({s}^{\prime},s\right)$. In addition, ${\gamma}_{i}\left({s}^{\prime},s\right)$ is updated for every received symbol $y_{i}$ in the current stage. The *i*-th iteration updating rule can be written as
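The metric definitions and the updating rule did not survive in this copy, so the following is a textbook forward-backward recursion on the SPC trellis, offered as a sketch rather than the authors' exact formulation. It assumes a prime field size `q` (so that arithmetic is modulo `q`), works in the probability domain instead of the log domain, and takes the state to be the running weighted sum of the symbols; `spc_check_node` and its arguments are illustrative names.

```python
def spc_check_node(messages, coeffs, q):
    """Extrinsic p.m.f.s for the check sum_i coeffs[i]*c_i = 0 over GF(q), q prime.

    messages[i][c] is the incoming probability that symbol i equals c.
    """
    n = len(messages)
    alpha = [[0.0] * q for _ in range(n + 1)]  # forward metrics per stage
    beta = [[0.0] * q for _ in range(n + 1)]   # backward metrics per stage
    alpha[0][0] = 1.0   # the running sum starts in state 0
    beta[n][0] = 1.0    # and must end in state 0 to satisfy the check
    # Branch metric: the transition s_prev -> s_prev + coeffs[i]*c has weight messages[i][c].
    for i in range(n):                          # forward recursion
        for s_prev in range(q):
            for c in range(q):
                s = (s_prev + coeffs[i] * c) % q
                alpha[i + 1][s] += alpha[i][s_prev] * messages[i][c]
    for i in range(n - 1, -1, -1):              # backward recursion
        for s_prev in range(q):
            for c in range(q):
                s = (s_prev + coeffs[i] * c) % q
                beta[i][s_prev] += messages[i][c] * beta[i + 1][s]
    extrinsic = []
    for i in range(n):
        out = [0.0] * q
        for c in range(q):
            for s_prev in range(q):
                s = (s_prev + coeffs[i] * c) % q
                # Symbol i's own message is left out, which makes the output extrinsic.
                out[c] += alpha[i][s_prev] * beta[i + 1][s]
        total = sum(out)
        extrinsic.append([v / total for v in out])
    return extrinsic


if __name__ == "__main__":
    msgs = [[1 / 3] * 3, [0.5, 0.3, 0.2], [0.6, 0.1, 0.3]]
    print(spc_check_node(msgs, [1, 1, 1], 3)[0])
```

The trellis has `q` states and one stage per edge of the check node, so the cost grows as the check degree times `q` squared, compared with `q` raised to the degree for brute-force enumeration.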

## 5. Results and discussions

The proposed non-uniform generalized nonbinary coded modulation scheme has two kinds of gain compared to traditional binary modulation, namely shaping gain (SG) and coding gain (CG). The SG can be easily determined from AIR curves such as those in Fig. 6, while the CG can be determined from the relation between the input and output byte error rates of the decoder.

Figure 9 shows the byte error rate improvement before and after the decoder. These curves mostly reflect the performance improvement obtained by using a nonbinary LDPC decoder with a larger GF size, which is the CG. The solid lines are the cases where the GF size is the same as the constellation size. As we can see from the figure, with the same LDPC decoder structure, the higher the GF size, the better the performance. The dashed lines are three cases where we use codes with a smaller GF size for the M-ary constellation to reduce complexity. The dots near the curves with the same color are the short-pattern byte error rate results used for calculating the input byte error rate, which is independent of SNR. Using two GF(3) decoders for non-uniform 9-QAM performs better than GF(2) coding for either QPSK or star 8-QAM. The gap between the GF(2) cases of QPSK and star 8-QAM arises because the pre-FEC byte error rate is not a good predictor of post-FEC performance [15].

Figure 10 shows the relation between SNR and BER for various coded modulation schemes. Because there is no Gray mapping for star 8-QAM, the demapper AIR loss is large if binary coding is used for this constellation. This explains the large gap between binary and nonbinary GF($2^{3}$) coding for star 8-QAM. At a fixed spectral efficiency of 3 bits per channel use, the non-uniform 9-QAM benefits from both SG and CG. A gain of about 0.9 dB can be obtained by using non-uniform 9-QAM with the GF($3^{2}$) LDPC code. Alternatively, we can use two simpler GF(3) LDPC decoders for this 9-QAM, which have much lower complexity than the GF($2^{3}$) decoder and much better performance than traditional binary coding for star 8-QAM.

For the lower spectral efficiency case, we studied QPSK with GF($2^{2}$) and binary decoders to examine the complexity-performance trade-off. Generally speaking, the higher the GF size, the sharper the waterfall region of the LDPC code, and the gap might be even larger in the lower-BER region. The non-uniform 5-QAM, with the same spectral efficiency as QPSK, has a lower SNR threshold because of its higher CG, which can be seen from Fig. 9. The non-uniform 7-QAM, at 2.25 bits/channel use, has the best performance compared to QPSK: it carries 0.25 bits more per channel use and performs better than the others because both its SG and CG are higher. The disadvantage of 7-QAM is that it requires higher DAC resolution, as it has five different voltage levels on the quadrature axis.

## 6. Conclusions

Based on calculated mutual information, we propose a method to design general constellations for optical communication, which can fill the gaps between power-of-two constellation sizes. The method can be used to design signal constellations of more general sizes, for an arbitrary integer, with Huffman non-uniform shaping. The designed 9-QAM constellation is more than 0.1 bits/channel use closer to the Shannon limit than star 8-QAM.

Moreover, we investigate nonbinary LDPC codes for these general constellations. With the same LDPC code structure, we showed that the byte error rate performance is better for LDPC codes with larger GF sizes. In addition, the binary LDPC-coded cases and the decomposed nonbinary case are studied; the latter is a good choice when lower complexity is required. Finally, we show BER vs. SNR performance with both SG and CG together. The NB LDPC-coded 9-QAM outperforms the binary LDPC-coded star-8-QAM by more than 2 dB in SNR, and the NB LDPC-coded 5-QAM is better than LDPC-coded QPSK, while NB LDPC-coded 7-QAM is superior to LDPC-coded QPSK in terms of both SNR and SE.

## References and links

**1.** I. B. Djordjevic and T. Wang, “Multiple component codes based generalized LDPC codes for high-speed optical transport,” Opt. Express **22**(14), 16694–16705 (2014).

**2.** D. Zou and I. B. Djordjevic, “FPGA implementation of concatenated non-binary QC-LDPC codes for high-speed optical transport,” Opt. Express **23**(11), 14501–14509 (2015).

**3.** C. Lin, I. B. Djordjevic, M. Cvijetic, and D. Zou, “Mode-Multiplexed Multi-Tb/s Superchannel Transmission with Advanced Multidimensional Signaling in the Presence of Fiber Nonlinearities,” IEEE Trans. Commun. **62**(7), 2507–2514 (2014).

**4.** T. Liu and I. B. Djordjevic, “LDPC-coded BICM-ID based nonuniform signaling for ultra-high-speed optical transport,” in *Optical Fiber Communication 2016* (Optical Society of America, 2016), paper M3A.3.

**5.** L. Beygi, E. Agrell, J. M. Kahn, and M. Karlsson, “Rate-adaptive coded modulation for fiber-optic communications,” J. Lightwave Technol. **32**(2), 333–343 (2014).

**6.** M. P. Yankov, D. Zibar, K. J. Larsen, L. P. Christensen, and S. Forchhammer, “Constellation shaping for fiber-optic channels with QAM and high spectral efficiency,” IEEE Photonics Technol. Lett. **26**(23), 2407–2410 (2014).

**7.** T. Fehenberger, D. Lavery, R. Maher, A. Alvarado, P. Bayvel, and N. Hanik, “Sensitivity gains by mismatched probabilistic shaping for optical communication systems,” IEEE Photonics Technol. Lett. **28**(7), 786–789 (2016).

**8.** A. K. Khandani and P. Kabal, “Shaping multidimensional signal spaces. part i: optimal shaping shell mapping,” IEEE Trans. Inf. Theory **39**(6), 1799–1808 (1993).

**9.** A. R. Calderbank and L. H. Ozarow, “Nonequiprobable signaling on the Gaussian channel,” IEEE Trans. Inf. Theory **36**(4), 726–740 (1990).

**10.** C. Lin, I. B. Djordjevic, and D. Zou, “Achievable information rates calculation for optical OFDM transmission over few-mode fiber long-haul transmission systems,” Opt. Express **23**(13), 16846–16856 (2015).

**11.** I. B. Djordjevic, “On advanced FEC and coded modulation for ultra-high-speed optical transmission,” IEEE Comm. Surv. and Tutor. **PP**(99), 1–31 (2016).

**12.** F. R. Kschischang and S. Pasupathy, “Optimal nonuniform signaling for Gaussian channels,” IEEE Trans. Inf. Theory **39**(3), 913–929 (1993).

**13.** M. Abramowitz and I. A. Stegun, “Handbook of mathematical functions.” Appl. Math. Series **55**, 62 (1966).

**14.** M. Arabaci, I. B. Djordjevic, R. Saunders, and R. M. Marcoccia, “Polarization-multiplexed rate-adaptive non-binary-quasi-cyclic-LDPC-coded multilevel modulation with coherent detection for optical transport networks,” in *International Conference on Transparent Optical Networks* (2009), paper Tu.B2.2.

**15.** A. Alvarado, E. Agrell, D. Lavery, R. Maher, and P. Bayvel, “Replacing the soft FEC limit paradigm in the design of optical communication systems,” arXiv preprint arXiv:1503.05477 (2015).