## Abstract

We propose to combine 4D trellis-coded modulation (TCM) with transmitter-side Tomlinson-Harashima precoding (THP) in IM/DD transmissions, and experimentally investigate the achieved performance improvement. Theoretically, THP can approximately produce an end-to-end additive white Gaussian noise (AWGN) channel even under severe bandwidth limitation, allowing TCM to maintain its coding gain in the presence of inter-symbol interference. In our experiments with off-the-shelf commercial components, which limit the 3 dB bandwidth of the system to ~3.5 GHz, the combination of TCM and THP shows a better receiver sensitivity than conventional PAM4 for system bit rates from 56 Gbit/s to 112 Gbit/s at the KP4 threshold of BER = 2 × 10^{−4}. In the 112 Gbit/s back-to-back (B2B) transmission, with the help of THP the receiver sensitivity at the KP4 FEC threshold is improved by 3.3 dB using 4D-PAM4 TCM compared with using conventional PAM4. In addition, combining TCM and THP also helps to lower the BER floor.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

The capacity demand of short-range optical transmission systems has been growing rapidly, driven mainly by data-center interconnects (DCI). In recent standardization efforts, 8-lane × 50 Gb/s/*λ* solutions have been considered for 400 Gb/s Ethernet (400GbE) transceivers [1]. The research community is now shifting its focus towards techniques that support 100 Gb/s/*λ* and beyond. Several single-carrier, single-polarization, short-reach ≥100 Gb/s transmission experiments have been reported using various advanced modulation formats, such as pulse amplitude modulation (PAM), multi-band carrierless amplitude phase (multi-CAP) modulation, discrete multi-tone (DMT) modulation and Nyquist subcarrier modulation (SCM) [2–9]. Among these formats, PAM has been adopted as the mainstream technology for DCI applications because its simpler transmitter structure reduces cost and power consumption. Meanwhile, in order to further improve system performance, multi-dimensional (MD) coded modulation technologies based on multiple time-domain PAM symbols have also attracted considerable attention recently [10–16]. In [11], a 4-dimensional (4D) coded PAM4 scheme defines an alphabet with a total of 256 (4^{4}) symbols, from which a subset of 128 symbols is selected to increase the minimum Euclidean distance. At the BER threshold of 10^{−3}, an improvement of 1.4 dB in signal-to-noise ratio (SNR) per bit can be achieved. In [12], modulation formats with more than four dimensions are also demonstrated. To further exploit the coding gain, 2D/4D trellis-coded modulation (TCM) using rate-2/3 convolutional coding is proposed in [13–16] to maximize the Euclidean distance between the points within the same subset. It is experimentally verified that 4D TCM outperforms PAM4 at system bit rates between 56 Gb/s and 80 Gb/s.
However, when implementing single-wavelength ≥100 Gb/s IM/DD systems, the limited bandwidth of low-cost off-the-shelf commercial components can severely degrade TCM signals through strong inter-symbol interference (ISI). Generally, 4D TCM is more sensitive to ISI impairments than PAM4. It is experimentally shown in [16] that, in a bandwidth-limited system, the sensitivity gain of 4D TCM vanishes when the system bit rate is raised to 96 Gb/s, where PAM4 even outperforms 4D TCM.

In this work, in order to maintain the coding gain of TCM in a bandwidth-limited system, we propose to combine TCM with transmitter-side Tomlinson-Harashima precoding (THP). THP is employed to address the ISI issue in bandwidth-limited systems, and in theory it can approximately produce an end-to-end additive white Gaussian noise (AWGN) channel. As a result, TCM, which is designed for the AWGN channel, can retain its performance gain when used with THP. In our experiments with commercial components, including an arbitrary waveform generator (AWG) with a 3 dB bandwidth of about 14 GHz, the combination of TCM and THP achieves a better receiver sensitivity than PAM4 for system bit rates from 56 Gbit/s to 112 Gbit/s at the KP4 threshold of BER = 2 × 10^{−4}. Specifically, in the 56 Gbit/s case, with the help of THP, the receiver sensitivity at the KP4 FEC threshold is improved by 2 dB and 1.5 dB using 4D-PAM4 TCM and 4D-PAM5 TCM, respectively, compared with using conventional PAM4 in back-to-back (B2B) transmission. Similarly, in the 112 Gbit/s B2B case, with the help of THP, receiver sensitivity improvements of 3.3 dB and 2 dB are obtained at the KP4 FEC threshold using 4D-PAM4 TCM and 4D-PAM5 TCM, respectively.

The remainder of the paper is organized as follows. In Section 2, we introduce the principle of combining TCM and THP for the ISI channel. In Section 3, we present experimental results comparing the performance of various systems, showing that the use of THP ensures that TCM outperforms PAM4 in terms of receiver sensitivity even in bandwidth-limited systems. Conclusions are drawn in Section 4.

## 2. Trellis-coded modulation with THP for an ISI channel

#### 2.1 Trellis-coded modulation and its performance in an ISI channel

Combining channel coding and modulation, TCM has been shown to improve system performance without sacrificing data rate, and it has been well studied for the 1000BASE-T Gigabit Ethernet standard [17–20]. Figure 1(a) depicts the structure of a TCM encoder/mapper commonly used in the optical communication community. Generally, the TCM signals are generated as follows: when $m$ bits are to be transmitted per encoder/mapper operation, $\widehat{m}=2$ bits are expanded by a rate-2/3 binary convolutional encoder into 3 coded bits. These 3 bits select one of the 8 subsets, and the remaining $m-2$ uncoded bits determine which of the ${2}^{m-2}$ points in this subset is used. Set-partitioning is applied to divide a signal set successively into smaller subsets [17]. Considering 4D-PAM4 TCM and 4D-PAM5 TCM (4 consecutive PAM4/PAM5 symbols in the time domain), we first partition the set of PAM4/PAM5 points in each dimension into two subsets, A and B, as shown in Fig. 1(b). Then eight 4D subsets (*S0* to *S7*) can be formed by means of this partitioning. For example, choosing the point (−1) from A in all 4 dimensions (the AAAA combination) gives the 4D point (−1, −1, −1, −1), which belongs to subset *S0*. In the 4D-PAM4 TCM case, each subset contains 32 points, and 5 bits are encoded to choose among them. In the 4D-PAM5 TCM case, we choose 64 points from each subset and encode 6 bits according to the bit-to-symbol mapping table given in [19]. Including the 2 bits fed into the convolutional encoder, each 4D-PAM4/4D-PAM5 TCM symbol carries 7/8 bits. As for the rate-2/3 binary convolutional encoder, three types are commonly used, namely 8-state, 16-state and 32-state, as shown in Fig. 1(c). The number of states is determined by the number of delay taps employed by the convolutional encoder. Adding more taps helps to improve the error-correcting capability of the convolutional code, but it also significantly increases the complexity of the decoder. At the receiver side, the Viterbi algorithm is usually applied to decode the TCM signals.
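The set partitioning described above can be illustrated with a minimal Python sketch. It assumes that the two 1D classes are formed by alternating along the sorted amplitude levels, and that each 4D subset is the union of one class pattern and its complement (for example AAAA together with BBBB), in the style of the 1000BASE-T construction. The function names and the 0/1 class labels are hypothetical, and the assignment of patterns to the indices *S0* to *S7* is not reproduced here.

```python
from itertools import product

PAM4 = (-3, -1, 1, 3)
PAM5 = (-2, -1, 0, 1, 2)

def build_subsets(levels):
    """Group all 4D points into subsets keyed by class pattern (assumed rule)."""
    order = sorted(levels)
    # Alternate the two 1D classes along the amplitude axis (0 and 1 stand for A/B).
    cls = {lv: i % 2 for i, lv in enumerate(order)}
    subsets = {}
    for point in product(order, repeat=4):
        pattern = tuple(cls[c] for c in point)
        complement = tuple(1 - b for b in pattern)
        # A pattern and its complement share one 4D subset.
        key = min(pattern, complement)
        subsets.setdefault(key, []).append(point)
    return subsets

def min_sq_distance(points):
    """Minimum squared Euclidean distance between distinct points."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )

pam4_subsets = build_subsets(PAM4)
pam5_subsets = build_subsets(PAM5)
print(len(pam4_subsets), sorted({len(s) for s in pam4_subsets.values()}))
print(sorted({min_sq_distance(s) for s in pam4_subsets.values()}))
print(sorted({min_sq_distance(s) for s in pam5_subsets.values()}))
```

Under these assumptions, the PAM4 alphabet splits into 8 subsets of 32 points each, which matches the 5 uncoded bits per 4D symbol stated above, and every PAM5 subset offers at least the 64 points needed for 6 uncoded bits.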

As depicted in Fig. 2, we first numerically investigate the performance of various 112 Gbit/s signals using OptiSystem 13. At the transmitter side, the optical power of the MZM output is 6 dBm and a 4th-order Bessel low-pass filter (LPF) with various 3 dB bandwidths is utilized to emulate the transmitter bandwidth. At the receiver side, the PD responsivity, dark current and thermal noise power density are set to 0.65 A/W, 10 nA and 2 × 10^{−24} W/Hz, respectively. Another 4th-order Bessel LPF with a 3 dB bandwidth of 25 GHz is applied to simulate the bandwidth of the receiver. A 16-state rate-2/3 binary convolutional encoder is applied to generate the TCM signals. As we can see in Fig. 2(a), when the transmitter-side bandwidth is 20 GHz, the receiver sensitivity at the KP4 threshold of BER = 2 × 10^{−4} is improved by 1.15 dB and 0.64 dB using the 4D-PAM4 TCM and 4D-PAM5 TCM signals, respectively, compared with using the conventional PAM4 signal. In addition, the 4D-PAM4 TCM signal outperforms the 4D-PAM5 TCM signal. However, as shown in Fig. 2(b), when the transmitter-side bandwidth is reduced to 10 GHz, the conventional PAM4 signal shows the best performance regardless of the received power. This is because the TCM signals are more vulnerable to the ISI induced by bandwidth limitation. Specifically, at the same system bit rate, the 4D-PAM4 TCM signal suffers from more severe ISI penalties than the PAM4 signal, since it must operate at a higher symbol rate, namely 64 GBaud instead of 56 GBaud to achieve 112 Gbit/s. Although the 4D-PAM5 TCM signal has the same symbol rate as PAM4 and therefore avoids this issue, it has 5 amplitude levels and is more sensitive to ISI [16].
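The symbol rates quoted above follow directly from the number of bits carried per 4D symbol (7 for 4D-PAM4 TCM, 8 for 4D-PAM5 TCM, and 8 for four uncoded PAM4 symbols). A short sketch of this arithmetic, with hypothetical names, is:

```python
from fractions import Fraction

# Information bits carried by four consecutive PAM symbols (one 4D symbol).
BITS_PER_4D_SYMBOL = {"PAM4": 8, "4D-PAM4 TCM": 7, "4D-PAM5 TCM": 8}

def symbol_rate_gbaud(bit_rate_gbps, fmt):
    """Symbol rate in GBaud needed to carry bit_rate_gbps with format fmt."""
    bits_per_symbol = Fraction(BITS_PER_4D_SYMBOL[fmt], 4)
    return Fraction(bit_rate_gbps) / bits_per_symbol

for fmt in BITS_PER_4D_SYMBOL:
    print(fmt, symbol_rate_gbaud(112, fmt))
```

At 112 Gbit/s this gives 56 GBaud for PAM4 and 4D-PAM5 TCM, and 64 GBaud for 4D-PAM4 TCM.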

#### 2.2 Tomlinson-Harashima precoding

Tomlinson-Harashima precoding (THP) was first proposed in the early 1970s as an alternative to the decision-feedback equalizer (DFE) and is well known for its effectiveness in dealing with the ISI problem [21–25]. Unlike DFE, THP does not suffer from error propagation. The structure of the transmitter-side THP for PAM is given in Fig. 3(a). Because it relies on a causal feedback filter, THP can only deal with post-cursor ISI. The linearized description of the precoded sequence after THP is as follows [23]:
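A commonly used textbook form of this linearized description is given below. The symbol names are assumptions rather than the original notation: $a[k]$ is the PAM symbol, $x[k]$ the precoded symbol, $h[i]$ the post-cursor taps of the monic causal channel, $N$ the number of feedback taps, and $d[k]$ the precoding sequence.

$$
x[k] = a[k] + d[k] - \sum_{i=1}^{N} h[i]\,x[k-i], \qquad d[k] \in M\mathbb{Z},
$$

where $d[k]$ is the unique integer multiple of $M$ that keeps $x[k]$ inside the interval $[-M/2, M/2)$. After the channel, the receiver therefore observes $a[k] + d[k]$ plus noise, and removes $d[k]$ with the same modulo-$M$ operation.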

A larger extension parameter *M* can be chosen to avoid the data-flipping. However, it will further increase the transmitted signal power, causing a larger precoding loss [23]. In practice, *M* is usually chosen as 5 and 8 for PAM5 and PAM4, respectively.
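The precoding principle can be illustrated with a minimal, noise-free Python sketch. It assumes a monic causal channel whose post-cursor taps are known exactly at the transmitter. The tap values, function names and sequence length are hypothetical and are not taken from the experiment.

```python
import random

def modulo(v, M):
    """Fold v into the interval [-M/2, M/2)."""
    return (v + M / 2) % M - M / 2

def post_cursor_isi(x, k, taps):
    """ISI at time k caused by previously transmitted symbols."""
    return sum(h * x[k - i] for i, h in enumerate(taps, start=1) if k - i >= 0)

def thp_precode(symbols, taps, M):
    """Subtract the predicted post-cursor ISI, then apply the modulo."""
    x = []
    for k, a in enumerate(symbols):
        x.append(modulo(a - post_cursor_isi(x, k, taps), M))
    return x

def causal_channel(x, taps):
    """Monic causal ISI channel: main tap of 1 plus the post-cursor taps."""
    return [xk + post_cursor_isi(x, k, taps) for k, xk in enumerate(x)]

random.seed(0)
PAM4 = (-3, -1, 1, 3)
M = 8                       # modulo size for PAM4, as stated in the text
taps = (0.6, -0.3, 0.1)     # hypothetical post-cursor taps

tx = [random.choice(PAM4) for _ in range(1000)]
precoded = thp_precode(tx, taps, M)
# Receiver: the same modulo removes the precoding offset, then a PAM4 decision.
rx = [modulo(y, M) for y in causal_channel(precoded, taps)]
recovered = [min(PAM4, key=lambda lv: abs(lv - r)) for r in rx]
print(recovered == tx)
```

In this idealized setting the receiver-side modulo returns the original PAM4 sequence, while the precoded samples stay within a range of width *M*, which is the source of the precoding loss mentioned above.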

#### 2.3 Combination of Trellis-coded modulation and THP for an ISI causal channel

As mentioned in Section 2.2, apart from the modulo congruence, THP can turn a causal ISI channel into an end-to-end AWGN channel [26–28]. Therefore, when combined with THP, the TCM signals, designed for an AWGN channel, can maintain their coding gain even in the presence of severe ISI. Figure 4(a) shows the combination of TCM and THP, which is quite straightforward. However, the ‘data flipping’ problem becomes very significant when the Viterbi algorithm is applied to decode the TCM signals. Due to the sequence-estimation nature of the Viterbi algorithm, ‘data flipping’ affects not only the decision at time *n*, but also future decisions. To solve this problem, we need to modify the Viterbi algorithm to maximize the TCM coding gain. The modification is applied only to the Viterbi subset slicer; the rest of the Viterbi decoding algorithm is unchanged. Specifically, the slicer is modified as follows: the original constellation in each dimension is extended by one more level, as shown in Fig. 4(b). The extended points are duplicates of the outermost points of the original constellation, where the distance between an outer point and its duplicate is $M$. As we can see, even after ‘data flipping’ occurs, i.e., the received symbol crosses the modulo boundary, the extended Viterbi subset slicer still ensures that the current symbol is assigned to the correct subset with a precise transition metric. After the subset slicing and transition metric calculations, the maximum likelihood sequence estimator that follows the slicer is exactly the same as the conventional Viterbi sequence estimator.
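The effect of the extended slicer on the per-dimension metric can be sketched as follows. This is a simplified one-dimensional illustration and not the full 4D subset slicer. The function names are hypothetical, and the two 1D classes are labelled 0 and 1 by alternating along the sorted levels, which is an assumption about Fig. 1(b).

```python
PAM4 = (-3, -1, 1, 3)

def slicer_points(levels, M=None):
    """Map each slicing point to the original level it represents."""
    order = sorted(levels)
    points = {lv: lv for lv in order}
    if M is not None:
        # Duplicate each outermost level at a distance M, beyond the opposite edge.
        points[order[0] + M] = order[0]
        points[order[-1] - M] = order[-1]
    return points

def slice_1d(r, levels, M=None):
    """Return, per 1D class, the (squared distance, decided level) pair."""
    order = sorted(levels)
    cls = {lv: i % 2 for i, lv in enumerate(order)}
    best = {}
    for point, original in slicer_points(levels, M).items():
        metric = (r - point) ** 2
        c = cls[original]
        if c not in best or metric < best[c][0]:
            best[c] = (metric, original)
    return best

# A transmitted -3 hit by noise of -1.4 becomes -4.4, which the receiver
# modulo (M = 8) folds to +3.6.
r = 3.6
conventional = slice_1d(r, PAM4)
extended = slice_1d(r, PAM4, M=8)
print(conventional[0], extended[0])
```

In this example the conventional slicer picks +1 for the class containing −3, with a metric of about 6.76, whereas the extended slicer picks the duplicate of −3 located at +5, with a metric of about 1.96, i.e., the square of the actual noise sample.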

Furthermore, if we combine 4D-PAM5 TCM with THP, the extension parameter *M* of 5 chosen for PAM5 is no longer suitable. Generally, the minimum squared Euclidean distance (MSED) within a 4D-PAM5 subset is 4. For example, the squared distance between point (−2, −2, −2, −2) and point (−2, −2, −2, 0) within subset *S0* is 4. However, when combined with THP, point (−2, −2, −2, −2) may be expanded into point (−2, −2, −2, *M*−2). The nearest point to (−2, −2, −2, *M*−2) in subset *S0* is point (−2, −2, −2, 2). In this case, to ensure that the MSED within a subset remains 4, *M* should be 6. However, as mentioned in Section 2.2, a larger extension parameter *M* means a larger precoding loss. Therefore, we need to optimize *M* for the 4D-PAM5 TCM signal with THP. Moreover, we should note that the selection of *M* has no impact on the operation of the modified Viterbi algorithm, since the precise transition metric can always be ensured. In addition, for the 4D-PAM4 TCM signals, the extension parameter *M* of 8 chosen for PAM4 preserves the MSED, since the two outermost symbols of PAM4 belong to different groups, which are used to build the 4D-PAM4 subsets.
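The trade-off between the MSED of the extended subsets and the precoding loss can be made concrete with a short sketch. It rests on two assumptions that are not taken from the text: each 4D subset is the union of a class pattern and its complement, and the precoded symbols are uniformly distributed over an interval of width *M*, a standard approximation. The function names are hypothetical.

```python
import math

PAM4 = (-3, -1, 1, 3)
PAM5 = (-2, -1, 0, 1, 2)

def extended_class_sets(levels, M):
    """Split levels into two alternating 1D classes and add the two
    duplicates of the outermost levels, each shifted by M."""
    order = sorted(levels)
    sets = ([], [])
    for i, lv in enumerate(order):
        sets[i % 2].append(lv)
    sets[0].append(order[0] + M)                       # duplicate of lowest level
    sets[(len(order) - 1) % 2].append(order[-1] - M)   # duplicate of highest level
    return sets

def extended_msed(levels, M):
    """MSED within a 4D subset built from a class pattern and its complement."""
    a, b = extended_class_sets(levels, M)
    # Same pattern: the points can differ in a single coordinate of one class.
    within = min((p - q) ** 2 for s in (a, b)
                 for i, p in enumerate(s) for q in s[i + 1:])
    # Complementary patterns: all four coordinates differ.
    across = min((p - q) ** 2 for p in a for q in b)
    return min(within, 4 * across)

def precoding_loss_db(levels, M):
    """Power increase of a uniform precoded signal relative to plain PAM."""
    signal_power = sum(lv ** 2 for lv in levels) / len(levels)
    return 10 * math.log10((M ** 2 / 12) / signal_power)

for M in (5, 5.5, 6):
    print(M, extended_msed(PAM5, M), round(precoding_loss_db(PAM5, M), 2))
```

Under these assumptions, *M* = 5 reduces the MSED to 1, *M* = 6 restores it to 4 at the cost of the largest power increase, and *M* = 5.5 lies in between, while PAM4 with *M* = 8 keeps its intra-subset MSED of 16.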

## 3. Experimental setup, results and discussions

The experimental setup is depicted in Fig. 5. In the transmitter-side DSP, a pseudo-random bit sequence (PRBS) is first generated using the MATLAB rand function. Then a 16-state rate-2/3 binary convolutional encoder is applied to generate the 4D-PAM4/4D-PAM5 TCM signals as described in the previous section. Note that in the 4D-PAM5 TCM case, the symbols are scrambled to ensure the symmetry of the signal around the zero level [16]. The performance of the conventional PAM4 signal is also investigated for comparison. Afterwards, THP with 30 feedback taps is applied [29]. The symbol sequence after THP is then up-sampled to 2 samples per symbol (sps), and passed through a 128-tap root-raised cosine (RRC) finite impulse response (FIR) filter with a roll-off factor of 0.1 for pulse shaping. To match the sampling rate of the AWG, the RRC signal is resampled, followed by nonlinear compensation (NLC) to handle the arcsine function and clipping. Finally, the signal is sent to an AWG operating at 70 GSa/s. The 3 dB bandwidth of the AWG is about 14 GHz. The AWG output signal is first amplified using a 28 GHz RF amplifier. In order to limit the signal power at the input of the modulator, a 6 dB RF attenuator with 50 GHz bandwidth is used. After that, the RF signal with a peak-to-peak voltage of ~4 V is fed into a 28 GHz single-drive Mach-Zehnder modulator (MZM). The MZM has a 5.5 dB insertion loss and is biased at the quadrature point. The $V_\pi$ of the MZM is about 6.8 V. A C-band laser operating at 1550 nm is employed and its output optical power is 16.5 dBm. After modulation, the optical signal power into the 3-km SSMF is about 7.6 dBm. A variable optical attenuator (VOA) is employed to control the optical signal power into the photodetector (PD). At the receiver side, a Finisar PD without an inline transimpedance amplifier (TIA) is used for optical-to-electrical conversion.
Finally, the received electrical signal is digitized at 160 GSa/s by a 63 GHz real-time oscilloscope (RTO). As for the receiver-side DSP, the digital waveform is first re-sampled to 2 sps, and then passed through a matched RRC FIR filter with a roll-off factor of 0.1. Since the optical communication channel has in fact a non-causal impulse response with both post-cursor and pre-cursor ISI components, a linear feed-forward equalizer (FFE) is also used to mitigate the pre-cursor ISI. After that, a Modulo operation is applied to obtain the original data sequence. As for the 4D-PAM4/4D-PAM5 TCM signals, the modified Viterbi algorithm as proposed in the previous section is applied. After symbol-to-bit de-mapping, the BER is finally calculated. The measured system transfer function in the back-to-back scenario is also given in Fig. 5. The 3 dB bandwidth is only ~3.5 GHz due to the cascade of bandwidth-limited devices.
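The arcsine step of the transmitter-side NLC can be illustrated with a minimal sketch. It assumes an ideal MZM with a sinusoidal power transfer function biased at quadrature, which is a textbook model and not necessarily the exact compensation used in the experiment. The function names and the normalization of the input samples to [−1, 1] are hypothetical.

```python
import math

V_PI = 6.8  # volts, the MZM half-wave voltage quoted in the text

def arcsine_predistort(samples, clip=1.0, v_pi=V_PI):
    """Clip normalized samples to [-clip, clip] (clip <= 1) and map them
    to drive voltages that linearize an ideal quadrature-biased MZM."""
    out = []
    for s in samples:
        s = max(-clip, min(clip, s))
        out.append(v_pi / math.pi * math.asin(s))
    return out

def mzm_intensity(v, v_pi=V_PI):
    """Normalized output power of an ideal MZM biased at quadrature."""
    return 0.5 * (1 + math.sin(math.pi * v / v_pi))

targets = [-0.8, -0.25, 0.0, 0.5, 1.5]   # the last value exceeds the clip level
drive = arcsine_predistort(targets)
# Map the optical power back to the [-1, 1] scale of the targets.
linearized = [2 * mzm_intensity(v) - 1 for v in drive]
print([round(x, 3) for x in linearized])
```

With this model, the optical power follows the target samples linearly within the clipping range, and samples beyond the clip level saturate at full modulation depth.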

First, in Fig. 6 we investigate the 56 Gbit/s system performance of various signals in a B2B measurement. For the 4D-PAM5 TCM signal and the PAM4 signal, the system operates at 28 GBaud. For the 4D-PAM4 TCM signal, the symbol rate is 32 GBaud to obtain the same data rate. The cases with and without THP are both considered. The received optical power is swept from −6 dBm to −2 dBm. The impact of the extension parameter *M* on the 4D-PAM5 TCM signal combined with THP is also studied, and three *M* values are compared. As we can see, in the case of 4D-PAM5 TCM, *M* = 5 shows the worst performance since the MSED within an extended subset is only 1 in this case. *M* = 6 shows the best performance because the MSED within the extended subset is at its maximum. In addition, the use of THP results in a slight performance loss, as higher BERs are observed for all the systems. We attribute this to the fact that the channel is not strongly bandwidth-limited at a bit rate of 56 Gbit/s, so that THP mainly introduces precoding loss. Moreover, as shown in Fig. 6, the use of TCM increases the receiver sensitivity compared with the conventional PAM4. In particular, the receiver sensitivity improvement is about 2 dB and 1.5 dB using the 4D-PAM4 TCM and 4D-PAM5 TCM signals, respectively, at the KP4 FEC threshold of BER = 2 × 10^{−4}. The gain becomes larger as the BER further decreases.

Next, we evaluate the performance at higher bit rates including 70 Gbit/s, 84 Gbit/s, 98 Gbit/s and 112 Gbit/s, where the signal suffers from the ISI induced by the bandwidth limitation. For all the bit rates, the BER as a function of the received power is plotted in Fig. 7. On the one hand, for the cases without THP, the 4D-PAM4 TCM and 4D-PAM5 TCM signals outperform the conventional PAM4 signal in terms of receiver sensitivity at the lower system bit rates (70 Gbit/s and 84 Gbit/s). However, when the system bit rate is increased to 112 Gbit/s, the PAM4 signal shows better BER performance than the 4D-PAM4 TCM and 4D-PAM5 TCM signals since it is more tolerant to ISI. In addition, we can observe an obvious BER floor for the cases without THP when the system bit rate is high, such as 112 Gbit/s. On the other hand, as the system bit rate increases, the benefits of using THP begin to appear due to the mitigation of the increased ISI, especially for the 4D-PAM4 TCM signal. In particular, for the 112 Gbit/s system with 4.5 dBm received power, the achieved BERs are 1.03 × 10^{−1}, 5.8 × 10^{−3}, and 4 × 10^{−3} for the 4D-PAM4 TCM, 4D-PAM5 TCM (*M* = 5.5) and PAM4 signals without THP, respectively. With THP, the achieved BERs decrease to 3.7 × 10^{−5}, 1.2 × 10^{−4}, and 4.02 × 10^{−4} for the 4D-PAM4 TCM, 4D-PAM5 TCM (*M* = 5.5) and PAM4 signals, respectively. As a result, the THP schemes always outperform their non-THP counterparts at these bit rates. In addition, as shown in Fig. 7, the 4D-PAM4 TCM and 4D-PAM5 TCM signals always outperform the PAM4 signal when using THP, even at the higher bit rates where the channel is severely ISI-limited. With the help of THP, the receiver sensitivity at the KP4 FEC threshold of BER = 2 × 10^{−4} using 4D-PAM4 TCM is improved by about 2 dB, 1.45 dB, 2.05 dB and 3.3 dB compared with using PAM4 at 70 Gbit/s, 84 Gbit/s, 98 Gbit/s and 112 Gbit/s, respectively. Similarly, the receiver sensitivity using 4D-PAM5 TCM is improved by about 1.5 dB, 1.2 dB, 1.5 dB and 2 dB compared with using PAM4 at 70 Gbit/s, 84 Gbit/s, 98 Gbit/s and 112 Gbit/s, respectively. We should note that when the bit rate is 98 Gbit/s or 112 Gbit/s, the *M* = 5.5 case shows better performance than the *M* = 6 case for the 4D-PAM5 TCM signal due to the reduced precoding loss. Furthermore, as the received power increases, the BER decreases more rapidly for the THP cases using 4D-PAM4/4D-PAM5 TCM than for the THP cases using PAM4, which helps to lower the BER floor. In conclusion, the combination of 4D-PAM4 TCM and THP achieves the best performance even at the bit rate of 112 Gbit/s, indicating its potential to realize 4λ × 100G IM/DD transmissions.

Finally, we fix the transmission distance at 3 km and investigate the BER performance under different received powers for the 112 Gbit/s transmissions. The results are summarized in Fig. 8, where three types of convolutional encoder (8-state, 16-state and 32-state) are evaluated for TCM. For the 4D-PAM5 TCM, the extension parameter is chosen to be *M* = 5.5 to obtain the best BER performance. Because of the chromatic-dispersion-induced power fading and fiber nonlinearities, all the signals suffer from a performance penalty that depends on the symbol rate and transmission distance. Generally, the 4D-PAM4 TCM signal is expected to suffer from a larger penalty, since its symbol rate is 64 GBaud. As shown in Fig. 8, after 3 km transmission, the 4D-PAM4 TCM signal still shows the best receiver sensitivity and the PAM4 signal shows the worst receiver sensitivity at BER = 2 × 10^{−4}. Specifically, the PAM4 signal cannot reach the KP4 FEC threshold at 5 dBm received power, whereas the required received power is 3.85 dBm, 4.2 dBm and 5 dBm for the 4D-PAM4 TCM signal with the 32-state, 16-state and 8-state encoder, respectively. In addition, the required received power is 4.9 dBm for the 4D-PAM5 TCM signal with the 32-state encoder. When the number of convolutional encoder states is reduced to 16 or 8, the 4D-PAM5 TCM signal cannot reach the KP4 FEC threshold at 5 dBm received power. Clearly, increasing the number of convolutional encoder states leads to a larger performance gain. For example, by increasing the number of convolutional encoder states from 8 to 16, the receiver sensitivity is improved by 0.8 dB for the 4D-PAM4 TCM signal. By further increasing the number of states from 16 to 32, the receiver sensitivity improvement is 0.35 dB for the 4D-PAM4 TCM signal. However, a larger number of convolutional encoder states leads to a more complex convolutional encoder/decoder. Considering the balance between the receiver sensitivity and the coding complexity, the number of convolutional encoder states for the TCM signals needs to be optimized in practical short-reach IM/DD transmissions.

## 4. Conclusions

In this paper, we experimentally study the performance improvement obtained in IM/DD transmissions by combining 4D TCM and transmitter-side THP. TCM, designed for the AWGN channel, can maintain its coding gain with the adoption of THP, even in the presence of severe inter-symbol interference induced by channel bandwidth limitation. In our experiments with off-the-shelf commercial components, which limit the 3 dB bandwidth of the whole system to ~3.5 GHz, the combination of TCM and THP achieves a better receiver sensitivity for system bit rates from 56 Gbit/s to 112 Gbit/s at the KP4 threshold of BER = 2 × 10^{−4}. In the 112 Gbit/s case with THP, the receiver sensitivity at the KP4 threshold is improved by 3.3 dB using 4D-PAM4 TCM instead of PAM4 in the B2B transmission. In addition, combining 4D TCM with THP also helps to lower the BER floor. Thus, the combination of 4D TCM and THP is a promising scheme to implement 4λ × 100G IM/DD transmissions.

## References

**1.** IEEE P802.3bs, “400 Gb/s Ethernet Task Force,” http://www.ieee802.org/3/bs/.

**2.** X. Pang, O. Ozolins, S. Gaiarin, A. Kakkar, J. R. Navarro, M. Iglesias, R. Schatz, A. Udalcovs, U. Westergren, D. Zibar, S. Popov, and G. Jacobsen, “Experimental study of 1.55-um EML-based optical IM/DD PAM-4/8 short reach systems,” IEEE Photonics Technol. Lett. **29**(6), 523–526 (2017).

**3.** N. Kikuchi and R. Hirai, “Intensity-modulated / direct-detection (IM/DD) Nyquist pulse-amplitude modulation (PAM) signaling for 100-Gbit/s/λ optical short-reach transmission,” in Proceedings of European Conference on Optical Communication (ECOC) (Institute of Electrical and Electronics Engineers, 2014).

**4.** M. Zhu, J. Zhang, X. Yi, Y. Song, B. Xu, X. Li, X. Du, and K. Qiu, “Hilbert superposition and modified signal-to-signal beating interference cancellation for single side-band optical NPAM-4 direct-detection system,” Opt. Express **25**(11), 12622–12631 (2017).

**5.** C. Prodaniuc, N. Stojanovic, F. Karinou, Z. Qiang, T. Dippon, and R. Llorente, “PAM-n solutions for low-cost implementations of 100 Gbps/Lambda transmissions,” in Proceedings of European Conference on Optical Communication (ECOC) (Institute of Electrical and Electronics Engineers, 2016), pp. 842–844.

**6.** L. Sun, J. Du, and Z. He, “Multiband three-dimensional carrierless amplitude phase modulation for short reach optical communications,” J. Lightwave Technol. **34**(13), 3103–3109 (2016).

**7.** L. Zhang, T. Zuo, Y. Mao, Q. Zhang, E. Zhou, G. N. Liu, and X. Xu, “Beyond 100-Gb/s transmission over 80-km SMF using direct-detection SSB-DMT at C-band,” J. Lightwave Technol. **34**(2), 723–729 (2016).

**8.** K. Zhong, X. Zhou, T. Gui, L. Tao, Y. Gao, W. Chen, J. Man, L. Zeng, A. P. K. Lau, and C. Lu, “Experimental study of PAM-4, CAP-16, and DMT for 100 Gb/s Short Reach Optical Transmission Systems,” Opt. Express **23**(2), 1176–1189 (2015).

**9.** Y. Gao, J. C. Cartledge, A. S. Kashi, S. Yam, and Y. Matsui, “Direct modulation of a laser using 112-Gb/s 16-QAM Nyquist subcarrier modulation,” IEEE Photonics Technol. Lett. **29**(1), 35–38 (2017).

**10.** J. Renaudier, R. Rios-Müller, M. A. Mestre, and H. Mardoyan, “Multi rate IMDD transceivers for optical interconnects using coded modulation,” in Optical Fiber Communication Conference, OSA Technical Digest (online) (Optical Society of America, 2015), paper Tu2J.2.

**11.** R. Rios-Müller, J. Renaudier, M. A. Mestre, and H. Mardoyan, “Multi-dimension coded PAM4 signaling for 100Gb/s short-reach transceivers,” in Optical Fiber Communication Conference, OSA Technical Digest (online) (Optical Society of America, 2015), paper Th1G.4.

**12.** X. Lu, V. S. Lyubopytov, and I. T. Monroy, “24-dimensional modulation formats for 100 Gbit/s IM-DD transmission systems using 850 nm single-mode VCSEL,” in Proceedings of European Conference on Optical Communication (ECOC) (Institute of Electrical and Electronics Engineers, 2017), paper M.1.D.1.

**13.** N. Stojanovic, C. Prodaniuc, F. Karinou, and Z. Qiang, “56-Gbit/s 4-D PAM-4 TCM transmission evaluation for 400-G data center applications,” in Optical Fiber Communication Conference, OSA Technical Digest (online) (Optical Society of America, 2016), paper Th1G.6.

**14.** C. Prodaniuc, N. Stojanovic, Z. Qiang, F. Karinou, T. Lee, K. Engenhardt, and R. Llorente, “Experimental demonstration of 56 Gb/s 4D-PAM-5 Trellis coded modulation for 400G WDM metro-access networks,” in Optical Fiber Communication Conference, OSA Technical Digest (online) (Optical Society of America, 2016), paper Th1G.6.

**15.** Y. Hu, M. Bi, D. Feng, X. Miao, H. He, and W. Hu, “Spectral efficiency improved 2D-PAM8 Trellis coded modulation for short reach optical system,” IEEE Photonics J. **9**(4), 7904908 (2017).

**16.** C. Prodaniuc, N. Stojanovic, F. Karinou, Z. Qiang, and R. Llorente, “Performance comparison between 4D trellis coded modulation and PAM-4 for low-cost 400 Gbps WDM optical networks,” J. Lightwave Technol. **34**(22), 5308–5316 (2016).

**17.** G. Ungerboeck, “Trellis-coded modulation with redundant signal sets,” IEEE Commun. Mag. **25**(2), 5–21 (1987).

**18.** M. Hatamian, O. E. Agazzi, J. Creigh, H. Samueli, A. J. Castellano, D. Kruse, A. Madisetti, N. Yousefi, K. Bult, P. Pai, M. Wakayama, M. M. McConnell, and M. Colombatto, “Design considerations for Gigabit Ethernet 1000Base-T twisted pair transceivers,” in Proceedings of IEEE Custom Integrated Circuits Conference (IEEE, 1998), pp. 335–342.

**19.** IEEE Std. 802.3ab, “Physical Layer Parameters and Specifications for 1000 Mb/s Operation Over 4-Pair of Category 5 Balanced Copper Cabling” (IEEE, 1999). https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=798775.

**20.** L. F. Wei, “Trellis-coded modulation with multidimensional constellations,” IEEE Trans. Inf. Theory **33**(4), 483–501 (1987).

**21.** M. Tomlinson, “New automatic equaliser employing modulo arithmetic,” Electron. Lett. **7**(5/6), 138–139 (1971).

**22.** H. Harashima and H. Miyakawa, “Matched-transmission technique for channels with intersymbol interference,” IEEE Trans. Commun. **20**(4), 774–780 (1972).

**23.** R. F. H. Fischer, *Precoding and Signal Shaping for Digital Transmission* (John Wiley & Sons, 2005), Chap. 3.

**24.** R. D. Wesel and J. M. Cioffi, “Achievable rates for Tomlinson Harashima precoding,” IEEE Trans. Inf. Theory **44**(2), 824–831 (1998).

**25.** R. Rath and W. Rosenkranz, “Tomlinson-Harashima precoding for fiber-optic communication systems,” in Proceedings of European Conference on Optical Communication (ECOC) (Institute of Electrical and Electronics Engineers, 2013), paper We.2.C.2.

**26.** A. K. Aman, R. L. Cupo, and N. A. Zervos, “Combined trellis coding and DFE through Tomlinson precoding,” IEEE J. Sel. Areas Commun. **9**(6), 876–884 (1991).

**27.** R. Laroia, S. A. Tretter, and N. Farvardin, “A simple and effective precoding scheme for noise whitening on intersymbol interference channels,” IEEE Trans. Commun. **41**(10), 1460–1463 (1993).

**28.** R. Laroia, “Coding for intersymbol interference channels – combined coding and precoding,” IEEE Trans. Inf. Theory **42**(4), 1053–1061 (1996).

**29.** M. Xiang, Z. Xing, E. El-Fiky, M. Morsy-Osman, Q. Zhuge, and D. V. Plant, “Single-Lane 145 Gbit/s IM/DD Transmission With Faster-Than-Nyquist PAM4 Signaling,” IEEE Photonics Technol. Lett. **30**(13), 1238–1241 (2018).