Deep learning based transceiver design for multi-colored VLC systems

Hoon Lee; Inkyu Lee; Sang Hyun Lee

doi:10.1364/OE.26.006222

1. Introduction

As solid-state light sources such as light-emitting diodes (LEDs) replace existing illumination devices owing to advantages such as high efficiency, small size, low cost, and long lifetime, the fast switching capability of LEDs has stimulated substantial research interest in visible light communications (VLC), an optical wireless transmission technology that retains illumination as its primary purpose [1].

A VLC transceiver with multi-color LEDs, such as red/green/blue (RGB) LEDs, naturally forms a color multiple-input multiple-output (MIMO) transmission, which enables a multi-fold increase in transmission efficiency over a single-color counterpart [2]. In a multi-colored LED configuration that must respect the users’ desired color and illuminance, the constraints inherited from legacy optical wireless communication, namely the signal non-negativity for intensity modulation and the average and peak intensity restrictions imposed by LED capability [3], call for a careful design of the arrangement of RGB LEDs. Since each color is characterized by a combination of the LED colors, the intensity normalization of three color components defines a three-dimensional signal space. On the other hand, human perception of a color is captured in a two-dimensional color space, which can be described over the CIE xyY chart [4] and is reduced from the three-dimensional signal space. In this configuration, a message symbol can be mapped to one point in the three-dimensional space characterized by the relative intensities of RGB LEDs. Message symbols are associated with distinct colors such that the average color matches the user’s desire. Given the target color, a group of message symbols is mapped on the chart such that the average color is equal to the target color. The difference between the signal space originating from the LED arrangement and the color space associated with human perception renders this task very challenging.

To achieve this target, there have been a number of attempts at color modulation. Color shift keying (CSK) [4] is a multi-color modulation technique proposed by the IEEE 802.15.7 task group. It maps message symbols onto the color space under the requirement that the dimming target matches white color. To minimize the symbol detection error, the distance between message symbols is maximized. The design of the CSK constellation for RGB LEDs has been widely investigated. A constellation design for CSK is addressed based on the billiards algorithm [5]. A CSK VLC prototype with optimized constellation design is demonstrated for RGB LEDs [4]. In addition, a nonconvex optimization is formulated to maximize the minimum distance for the set of message symbols [6]. Since CSK is intended to meet the average color requirement and does not handle changing target colors, it may not adapt well to varying target color conditions. To cope with this, several color-space based modulation techniques with arbitrary color adaptation have been developed. In [7], generalized color modulation (GCM) was proposed for color-independent VLC systems. Color intensity modulation (CIM) [8] was developed to allow both the instantaneous transmission color and intensity to adjust freely to the target average color and intensity.

However, the existing signal constellation techniques lack practical considerations about channel imperfections such as color crosstalk among RGB LEDs and random noise sources including ambient noise and shot noise. To handle these shortcomings, this work approaches the design of the signal constellation and the corresponding transceivers via deep-learning (DL) techniques, which are widely explored for their applications in diverse domains, such as computer vision, natural language processing, biomedical engineering, and robotics. DL applications have recently been considered as an insightful way of rethinking the communication system design in complex communication scenarios [9].

This work introduces the concept of autoencoder (AE), a state-of-the-art technique in DL, to learn a pair of transmitter and receiver adapted to dimmable VLC systems with multi-color arrangement of LEDs and optimized with the objective of minimizing symbol error probability. The key idea is to represent a transmitter, a channel, and a receiver as a single DL network trained as an AE. The beauty of this approach is that it can be applied even with insufficient knowledge of the channel models and the symbol detection rules. The AE is trained to encode the input into some representation from which the original input can be recovered, i.e., the target output of the AE is the input itself. This approach could provide multifold advantages over existing techniques. First, it can consider many practical aspects of system design issues. Compared with most communication techniques [10–12] developed from probability and signal processing theory for channel models, the DL based system can capture practical system imperfections without rigorous mathematical models and simply requires stochastic layers and a training set of input vectors that include the corresponding imperfections. Second, the learned AE automatically provides the structure of an efficient optical transmitter-receiver pair that achieves the best end-to-end error performance since the training process jointly optimizes the overall DL network.

However, to obtain the optically encoded message symbols, additional intermediate learning blocks such as projection need to be introduced for meeting illuminance and color requirements. To tackle this issue, an alternating projection algorithm is adopted to find feasible optical signal representations from the AE outputs. We have compared the AE with existing techniques by extensive computer simulation. The numerical results show that the AE technique can improve the average symbol error probability.

Notations: Throughout this paper, we employ uppercase boldface letters, lowercase boldface letters, and normal letters for matrices, vectors, and scalar quantities, respectively. A set of all real matrices of size p-by-q is represented by ℝ^{p×q}, and transpose and 2-norm are respectively denoted by (·)^T and ║·║₂. Also, [X]_pq accounts for the (p, q)-entry of a matrix X, the p-th element of a vector z is expressed as [z]_p, and I_p stands for an identity matrix of size p-by-p.

2. Preliminaries and system model

We briefly review the basics of AE and describe a system model for a multi-colored point-to-point VLC.

2.1. Autoencoder preliminaries

Consider a standard structure of the AE, as illustrated in Fig. 1(a). The AE is a DL technique that produces the output of the same value as the input [13]. It is comprised of an input layer, a hidden layer, and an output layer, each with D, H, and D neurons, respectively. A single hidden layer can be extended to multiple layers for deeper learning compatibility. Let $x^{(i)} ≜ {[x_{1}^{(i)}, \dots, x_{D}^{(i)}]}^{T} \in ℝ^{D \times 1}$ denote the i-th training input in a training input data set {x⁽¹⁾, ⋯, x^(m)}, where m stands for the number of training inputs. The hidden-layer output vector, denoted by $h^{(i)} ≜ {[h_{1}^{(i)}, \dots, h_{H}^{(i)}]}^{T} \in ℝ^{H \times 1}$ , for the i-th training input is written as

h^{(i)} = ϕ_{1} (W_{1} x^{(i)} + b_{1}), for i = 1, \dots, m,

where a vector function ϕ₁(·) represents an activation at the hidden layer, which introduces non-linearity to the AE, and W₁ ∈ ℝ^{H×D} and b₁ ∈ ℝ^{H×1} indicate a weight matrix and a bias vector, respectively, which are learned during a training stage. A careful choice of activation is important since the appropriate one varies with the type of input data and the form of the desired output. Popular options for ϕ₁(z) for a vector z = [z₁, ⋯, z_D]^T include linear activation ($[ϕ_{1}(z)]_{j} = z_{j}$), sigmoid ($[ϕ_{1}(z)]_{j} = 1 / (1 + e^{-z_{j}})$), softmax ($[ϕ_{1}(z)]_{j} = e^{z_{j}} / \sum_{k = 1}^{D} e^{z_{k}}$), and rectified linear unit (ReLU) ($[ϕ_{1}(z)]_{j} = \max \{0, z_{j}\}$) [13,14].
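The four activations listed above can be written in a few lines of NumPy. This is only an illustrative sketch: the function names are chosen here for readability and the max-subtraction inside `softmax` is a common numerical safeguard, not something specified in the text.

```python
import numpy as np

def linear(z):
    # [phi(z)]_j = z_j
    return z

def sigmoid(z):
    # [phi(z)]_j = 1 / (1 + exp(-z_j))
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # [phi(z)]_j = exp(z_j) / sum_k exp(z_k)
    # Subtracting max(z) leaves the result unchanged but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def relu(z):
    # [phi(z)]_j = max{0, z_j}
    return np.maximum(0.0, z)

z = np.array([-1.0, 0.0, 2.0])
print(linear(z), sigmoid(z), softmax(z), relu(z))
```

Softmax is the only one of the four whose outputs are coupled across elements, which is why it suits an output layer that must return a probability vector.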

Fig. 1 Schematic diagrams for AE-based learning networks and multi-color VLC.


With the hidden layer output h⁽ⁱ⁾ at hand, the output layer computes the AE output vector ${\hat{x}}^{(i)} ≜ {[{\hat{x}}_{1}^{(i)}, \dots, {\hat{x}}_{D}^{(i)}]}^{T} \in ℝ^{D \times 1}$ as

{\hat{x}}^{(i)} = ϕ_{2} (W_{2} h^{(i)} + b_{2}), for i = 1, \dots, m,

where ϕ₂(·), W₂ ∈ ℝ^{D×H}, and b₂ ∈ ℝ^{D×1} denote an activation, a weight matrix, and a bias vector employed at the output layer, respectively. During the training stage, the AE parameters {W_l, b_l} are learned so that the output vector $\hat{x}^{(i)}$ becomes close to the training data x⁽ⁱ⁾ for i = 1, ⋯, m. The affinity between x⁽ⁱ⁾ and $\hat{x}^{(i)}$ is measured with a cost function $J_{i} (\{W_{l}, b_{l}\})$, which is usually modeled with the mean-square error $J_{i} (\{W_{l}, b_{l}\}) = ‖ \hat{x}^{(i)} - x^{(i)} ‖^{2}$ or the categorical cross entropy $J_{i} (\{W_{l}, b_{l}\}) = - \sum_{j = 1}^{D} x_{j}^{(i)} \log \hat{x}_{j}^{(i)}$. The objective of the AE learning is to identify the AE parameters {W_l, b_l} that minimize the overall cost function

J ({W_{l}, b_{l}}) ≜ \frac{1}{m} \sum_{i = 1}^{m} J_{i} ({W_{l}, b_{l}}) .

In the parameter training, stochastic gradient descent (SGD) frameworks are utilized, where weights and biases are updated after several training examples, called a mini-batch, instead of after the entire set of training examples. Specifically, to allow parallel computing, the SGD technique iteratively updates the parameters over a mini-batch $T \subset \{1, \dots, m\}$ of the training data [15] as

W_{l} \leftarrow W_{l} - η \frac{\partial {\tilde{J}}_{T} ({W_{l}, b_{l}})}{\partial W_{l}} and b_{l} \leftarrow b_{l} - η \frac{\partial {\tilde{J}}_{T} ({W_{l}, b_{l}})}{\partial b_{l}}, for l = 1, 2,

where η is a non-negative parameter that controls the learning rate. Here, the approximated cost function $\tilde{J}_{T} (\{W_{l}, b_{l}\})$ computed on a mini-batch $T$ is given by $\tilde{J}_{T} (\{W_{l}, b_{l}\}) ≜ \frac{1}{| T |} \sum_{i \in T} J_{i} (\{W_{l}, b_{l}\})$. The gradients in (3) are obtained using the back propagation technique [13]. For low-complexity gradient computation, the size of a mini-batch $| T |$ is chosen to be smaller than the size of the training data set m. At each iteration of the SGD training step, the parameters are updated by applying (3) to all possible mini-batches $T \subset \{1, \dots, m\}$. This process is repeated until the cost function converges. Finally, the performance of the trained AE is verified on new data x ∉ {x⁽¹⁾, ⋯, x^(m)} in a test stage.
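The mini-batch update described above can be illustrated on a toy AE. The sketch below is not the training code used in this work: it assumes linear activations at both layers and the mean-square-error cost so that the back-propagated gradients fit in a few lines, and the sizes, learning rate, and synthetic data are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, m, batch, eta = 4, 2, 200, 20, 0.01   # illustrative sizes, not from the paper

# Toy data on a 2-D subspace of R^D, so that H = 2 hidden neurons can represent it.
X = rng.normal(size=(m, 2)) @ rng.normal(size=(2, D))
X /= X.std()

W1, b1 = 0.1 * rng.normal(size=(H, D)), np.zeros(H)
W2, b2 = 0.1 * rng.normal(size=(D, H)), np.zeros(D)

def forward(Xb):
    hid = Xb @ W1.T + b1            # hidden layer (linear activation assumed)
    return hid, hid @ W2.T + b2     # output layer (linear activation assumed)

def cost(Xb):
    # Mean over samples of the squared reconstruction error.
    return np.mean(np.sum((forward(Xb)[1] - Xb) ** 2, axis=1))

initial_cost = cost(X)
for epoch in range(200):
    perm = rng.permutation(m)
    for start in range(0, m, batch):
        T = perm[start:start + batch]          # indices of one mini-batch
        Xb = X[T]
        hid, Xhat = forward(Xb)
        G = 2.0 * (Xhat - Xb) / len(T)         # gradient of the batch cost w.r.t. Xhat
        gW2, gb2 = G.T @ hid, G.sum(axis=0)
        Gh = G @ W2                            # back-propagate to the hidden layer
        gW1, gb1 = Gh.T @ Xb, Gh.sum(axis=0)
        W1 -= eta * gW1
        b1 -= eta * gb1
        W2 -= eta * gW2
        b2 -= eta * gb2
final_cost = cost(X)
print(initial_cost, final_cost)
```

Each pass over `perm` visits every training example once, split into mini-batches, which mirrors the description of applying the update to all mini-batches before repeating.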

2.2. System model

Figure 1(b) illustrates a point-to-point VLC system in which a multi-color LED with N color chips transmits M different messages $b \in ℳ ≜ \{1, \dots, M\}$ to a receiver with N photodetectors (PDs), one for each color. Unlike [16], where the message is modulated into each single color light with a simple pulse amplitude modulation (PAM), for ease of handling the optical signal with respect to the signal space, we consider the concept of CIM [8], which handles the message symbol mapping on the N-dimensional signal space. However, the proposed technique is straightforwardly extended to any type of color modulation such as CSK, since the corresponding color space where message symbols are mapped is a subset of the signal space considered in this work, and the resulting computational load required for learning becomes smaller. Thus, message $b \in ℳ$ is mapped to an N-dimensional signal constellation point $s_{b} \in S ≜ \{s_{1}, \dots, s_{M}\}$ , where s_b ∈ ℝ^{N×1} is the modulated symbol for message b. Denoting $s = {[s_{1}, \dots, s_{N}]}^{T} \in S$ as the vector of the transmitted intensities of N LEDs that encodes a message symbol, the received signal y = [y₁, ⋯, y_N]^T ∈ ℝ^{N×1} at the receiver can be written as

y = Hs + n,

where H ∈ ℝ^{N×N} is the VLC channel matrix representing the effect of the color crosstalk induced by imperfect color filtering and n = [n₁, ⋯, n_N]^T indicates the additive noise. The color crosstalk matrix H, which characterizes a line-of-sight (LOS) property with color filter imperfection in the VLC channels, is expressed as [16, 17]

H = \begin{bmatrix} 1 - ζ & ζ & 0 & \dots & 0 & 0 \\ ζ & 1 - 2 ζ & ζ & \dots & 0 & 0 \\ 0 & ζ & 1 - 2 ζ & \dots & 0 & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & 1 - 2 ζ & ζ \\ 0 & 0 & 0 & \dots & ζ & 1 - ζ \end{bmatrix}

where ζ ∈ [0, 0.5) represents the inter-color interference ratio. Although we focus on the channel model in (5), the proposed approach can be extended to arbitrary setups of the channel matrix. Throughout this paper, it is assumed that perfect knowledge of the channel matrix is available. In the VLC, the transmitted signal $s \in S$ is designed to satisfy the following lighting constraints [2,8] as

0 \leq {[s]}_{j} \leq {[a]}_{j}, for j = 1, \dots, N,

E_{S} [s] = \frac{1}{M} \sum_{b = 1}^{M} s_{b} = d,

where a = [a₁, …, a_N]^T stands for the peak-intensity constraint and (7) denotes the dimming constraint, which ensures that the average intensity of the transmitted symbols fulfils the target dimming d = [d₁, …, d_N]^T ∈ ℝ^{N×1} with 0 ≤ d_j ≤ a_j.
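As a rough illustration of the system model, the following sketch builds a crosstalk matrix, checks the two lighting constraints for a candidate constellation, and draws one received sample. It assumes the symmetric tridiagonal form of H in which every row sums to one; the helper names and the two-point constellation are hypothetical and serve only as an example.

```python
import numpy as np

def crosstalk_matrix(N, zeta):
    """Tridiagonal crosstalk matrix: zeta leaks to each adjacent color (assumed form)."""
    H = np.zeros((N, N))
    for j in range(N):
        neighbors = [k for k in (j - 1, j + 1) if 0 <= k < N]
        for k in neighbors:
            H[j, k] = zeta
        H[j, j] = 1.0 - zeta * len(neighbors)
    return H

def satisfies_lighting_constraints(S, a, d, tol=1e-9):
    """S holds one constellation point per row; checks peak and dimming constraints."""
    peak_ok = np.all(S >= -tol) and np.all(S <= a + tol)
    dimming_ok = np.allclose(S.mean(axis=0), d, atol=tol)
    return bool(peak_ok and dimming_ok)

rng = np.random.default_rng(0)
H = crosstalk_matrix(3, 0.1)
a = np.ones(3)                                   # normalized peak intensities
d = np.array([0.5, 0.5, 0.5])                    # target dimming
S = np.array([[0.2, 0.5, 0.8],
              [0.8, 0.5, 0.2]])                  # hypothetical two-point constellation
sigma = 0.1                                      # noise standard deviation
y = H @ S[0] + sigma * rng.normal(size=3)        # y = Hs + n for the first symbol
print(H, satisfies_lighting_constraints(S, a, d), y)
```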

We design modulation and demodulation operations for the VLC with the lighting constraints (6) and (7) so that the transmitted messages can be successfully decoded at the receiver, e.g., by minimizing the average symbol error rate (SER) $P_{e} = \Pr \{b \neq \hat{b}\}$ , where $\hat{b} \in ℳ$ indicates the message recovered at the receiver. This requires computationally demanding joint identification of encoding and decoding strategies of electrical signals, including an N-dimensional optical signal constellation design that satisfies (6) and (7) simultaneously. We aim to handle this challenge by introducing the AE idea from the context of DL. The following section describes the AE design procedure in detail.

3. Autoencoder design

Figure 2 depicts the AE structure for dimmable VLC designs consisting of a transmitter, a channel, and a receiver. To find an efficient AE structure, activation functions and the number of layers and neurons are carefully chosen since the corresponding DL performance varies with the choice [13]. Since M different optical signals are used for the N-dimensional signal space, we employ three hidden layers with M, N, and M neurons, respectively, and an output layer with M neurons. For the activations, the hidden layers utilize ReLU and linear activation functions for mapping each message to a point in the optical signal space formed by LEDs with positive intensity values, while the output layer adopts a softmax activation in order to classify which message has been transmitted. Additional deterministic and stochastic layers are included to reflect constraints imposed optically by the VLC system. The input is represented as a one-hot vector e_b ∈ ℝ^{M×1} associated with the transmitted message $b \in ℳ$ , i.e., the input corresponding to the b-th message is a zero vector except for the b-th element, which equals 1. Thus, a training input is constructed from M different one-hot vectors. Then, the receiver yields the estimate of message $\hat{b}$ based on the AE output p ∈ ℝ^{M×1}.

Fig. 2 Proposed AE structure for dimmable VLC.


3.1. Transmitter

At the transmitter block, the i-th training input associated with message $b^{(i)} \in ℳ$ is applied as a one-hot vector $e_{b}^{(i)}$ to two subsequent hidden layers with M and N neurons, respectively. For a positive real-valued output that represents an optical intensity signal, the ReLU and linear activations are utilized at the two layers, respectively. The output vectors $h_{1}^{(i)} \in ℝ^{M \times 1}$ and $h_{2}^{(i)} \in ℝ^{N \times 1}$ of the corresponding layers are expressed, respectively, as

h_{1}^{(i)} = \max {0, W_{1} e_{b}^{(i)} + b_{1}},

h_{2}^{(i)} = W_{2} h_{1}^{(i)} + b_{2},

where W_l and b_l are the weight matrix and the bias vector for the l-th hidden layer, respectively, learned during the training stage. To fulfil the lighting constraints (6) and (7), we introduce additional layers, as in [9], which force $h_{2}^{(i)}$ to reside in the feasible optical signal region. To be specific, a deterministic post-processing layer finds a projection s⁽ⁱ⁾ of $h_{2}^{(i)}$ onto the space jointly specified by (6) and (7), so that the vector s⁽ⁱ⁾ can represent the optical signal associated with message b⁽ⁱ⁾. Since DL libraries deal with a batch of samples rather than a single sample, the proposed post-processing layer handles multiple inputs $h_{2}^{(i)}$ for i = 1, ⋯, m, and thus the post-processing layer solves the optimization problem given by

\begin{array}{ll} (P1) : & \min_{s^{(1)}, \dots, s^{(m)}} \frac{1}{2} \sum_{i = 1}^{m} ‖ s^{(i)} - h_{2}^{(i)} ‖^{2} \\ \text{subject to} & 0 \leq {[s^{(i)}]}_{j} \leq {[a]}_{j}, \text{ for } j = 1, \dots, N \text{ and } i = 1, \dots, m, \\ & \sum_{i = 1}^{m} s^{(i)} = m d . \end{array}

Note that this formulation is valid when the vectors $h_{2}^{(i)}$ for i = 1, ⋯, m are obtained from the training data set {b⁽¹⁾, ⋯, b^(m)} that contains all possible messages in $ℳ$. In practice, this is the case since the number of training data m is much larger than the number of messages M.

The convexity of (P1) is obvious since the objective is quadratic and all the constraints are affine. Although this allows a solution to be found via existing convex optimization solvers [18], the computational load may become very high when the number of training data m is large. To overcome this difficulty, we introduce an alternating projection framework where a solution is obtained by iteratively projecting onto the feasible sets associated with the two different constraints until a point in their intersection is found [19]. The resulting formulation is given by two optimization problems as

\begin{array}{ll} (P2) : & \min_{s^{(1)}, \dots, s^{(m)}} \frac{1}{2} \sum_{i = 1}^{m} ‖ s^{(i)} - h_{2}^{(i)} ‖^{2} \\ \text{subject to} & 0 \leq {[s^{(i)}]}_{j} \leq {[a]}_{j}, \text{ for } j = 1, \dots, N \text{ and } i = 1, \dots, m, \end{array}

\begin{array}{l} (P 3) : \min_{s^{(1)}, \dots, s^{(m)}} \frac{1}{2} \sum_{i = 1}^{m} {‖ s^{(i)} - h_{2}^{(i)} ‖}^{2} \\ subject to \sum_{i = 1}^{m} s^{(i)} = m d . \end{array}

The solution s⁽ⁱ⁾ of (P2) is simply obtained by

{[s^{(i)}]}_{j} = \min {\max {0, {[h_{2}^{(i)}]}_{j}}, {[a]}_{j}} for j = 1, \dots, N and i = 1, \dots, m .

In contrast, the Lagrangian of (P3) is given by [18]

ℒ (s^{(1)}, \dots, s^{(m)}, ν) = \frac{1}{2} \sum_{i = 1}^{m} {‖ s^{(i)} - h_{2}^{(i)} ‖}^{2} - ν^{T} (\sum_{i = 1}^{m} s^{(i)} - m d),

where ν ∈ ℝ^{N×1} indicates the dual variable associated with constraint (12). The Karush-Kuhn-Tucker conditions [18] lead to the solution expressed as

s^{(i)} = d + h_{2}^{(i)} - \frac{1}{m} \sum_{k = 1}^{m} h_{2}^{(k)}, for i = 1, \dots, m .
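For completeness, the intermediate steps between the Lagrangian and this closed form can be spelled out as follows; the derivation only uses stationarity with respect to each s⁽ⁱ⁾ and the equality constraint of (P3).

```latex
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial s^{(i)}}
  &= s^{(i)} - h_{2}^{(i)} - \nu = 0
  \;\Rightarrow\; s^{(i)} = h_{2}^{(i)} + \nu, \\
\sum_{i=1}^{m} s^{(i)} = m d
  &\;\Rightarrow\; \sum_{i=1}^{m} h_{2}^{(i)} + m \nu = m d
  \;\Rightarrow\; \nu = d - \frac{1}{m} \sum_{k=1}^{m} h_{2}^{(k)} .
\end{aligned}
```

Substituting ν back into the first line gives the expression above: every input is shifted by the same vector so that the batch mean lands exactly on d.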

Let us define $S ≜ [s^{(1) \star}, \dots, s^{(m) \star}] \in ℝ^{N \times m}$ and $H_{2} ≜ [h_{2}^{(1)}, \dots, h_{2}^{(m)}] \in ℝ^{N \times m}$. Then, the solutions in (13) and (14) can be respectively rewritten as

\begin{array}{l} {[S]}_{j i} = \min {\max {0, {[H_{2}]}_{j i}}, {[a]}_{j}} for j = 1, \dots, N and i = 1, \dots, m, \\ S = d 1_{m}^{T} + H_{2} (I_{m} - \frac{1}{m} 1_{m} 1_{m}^{T}), \end{array}

where 1_m indicates an all-ones vector of length m. The alternating projection algorithm based on the closed-form projection solutions for (P2) and (P3) is summarized in Table 1. At the t-th iteration of this algorithm, the projection S^{[t]} of the solution S^{[t−1]} obtained from the previous iteration is found. Since the feasible set of (P1) is non-empty for the bounded region 0 ≤ [d]_i ≤ [a]_i, the algorithm provably converges to a feasible solution [19].

Table 1. Alternating projection algorithm for (P1).
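A minimal sketch of such an alternating projection loop, built from the two closed-form projections for (P2) and (P3), might look as follows. The function names, the order of the two projections, the stopping rule, and the iteration cap are assumptions made for illustration and are not taken from Table 1.

```python
import numpy as np

def project_box(S, a):
    """Projection for (P2): clip every entry of row j to [0, a_j]."""
    return np.clip(S, 0.0, a[:, None])

def project_dimming(S, d):
    """Projection for (P3): shift all columns so that their mean equals d."""
    return S + (d - S.mean(axis=1))[:, None]

def alternating_projection(H2, a, d, max_iter=1000, tol=1e-9):
    """Alternate the two projections, starting from the N-by-m matrix H2."""
    S = H2.copy()
    for _ in range(max_iter):
        S = project_box(project_dimming(S, d), a)
        # The box constraint holds exactly here; stop once dimming is met too.
        if np.max(np.abs(S.mean(axis=1) - d)) < tol:
            break
    return S

rng = np.random.default_rng(0)
N, m = 3, 20
a = np.ones(N)
d = np.array([0.2, 0.4, 0.9])
H2 = rng.normal(size=(N, m))      # stand-in for the second hidden-layer outputs
S = alternating_projection(H2, a, d)
print(S.mean(axis=1))
```

Because both projections are element-wise or mean-shift operations, one iteration costs only O(Nm), which is the motivation for preferring this loop over a generic convex solver on large batches.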


3.2. Channel

To model the optical channel for VLC in (4), the channel block is included between the transmitter and receiver blocks. At a color crosstalk layer, the optical intensity vector s⁽ⁱ⁾ is multiplied by the channel matrix H in (5), and the noise vector $n^{(i)} \sim \mathcal{N} (0, σ^{2} I_{N})$ is added at the stochastic noise layer. The channel layer output y⁽ⁱ⁾ for the i-th training data is given by y⁽ⁱ⁾ = Hs⁽ⁱ⁾ + n⁽ⁱ⁾. Notice that the noise vector n⁽ⁱ⁾ is independently generated for each training data i and is not known to the AE during the training stage. Although we consider only color crosstalk and additive noise here, this approach can readily take into account various other types of channel imperfections, such as signal-dependent noise and device non-linearity.

3.3. Receiver

The receiver block processes the channel output y⁽ⁱ⁾ with the third hidden layer and the output layer, which employ the ReLU and the softmax activations, respectively, for handling valid (positive) components of the optical intensities and computing the likelihoods of the messages decoded from them. The outputs, denoted by $h_{3}^{(i)} \in ℝ^{M \times 1}$ and p⁽ⁱ⁾ ∈ ℝ^{M×1} for the i-th training data, can be expressed, respectively, as

h_{3}^{(i)} = \max {0, W_{3} y^{(i)} + b_{3}},

{[p^{(i)}]}_{j} = \frac{e^{{[z^{(i)}]}_{j}}}{\sum_{k = 1}^{M} e^{{[z^{(i)}]}_{k}}}, for j = 1, \dots, M,

where z⁽ⁱ⁾ is defined as $z^{(i)} ≜ W_{4} h_{3}^{(i)} + b_{4}$, W₃ ∈ ℝ^{M×N} and W₄ ∈ ℝ^{M×M} are the weight matrices for the third hidden layer and the output layer, respectively, and b₃ ∈ ℝ^{M×1} and b₄ ∈ ℝ^{M×1} represent the bias vectors for the third hidden and the output layers, respectively. The softmax activation at the output layer produces the AE output vector p⁽ⁱ⁾, which characterizes the probability of each message in $ℳ$ being transmitted. Thus, the estimate of the transmitted message ${\hat{b}}^{(i)} \in ℳ$ is the index of the maximum element of the probability vector p⁽ⁱ⁾, i.e., ${\hat{b}}^{(i)} = \arg \max_{j} {[p^{(i)}]}_{j}$.

For the training, we set the cost function J({W_l, b_l}) as

J ({W_{l}, b_{l}}) = - \frac{1}{m} \sum_{i = 1}^{m} \log ({[p^{(i)}]}_{b^{(i)}}) + λ \sum_{l = 1}^{4} {‖ W_{l} ‖}_{2}^{2},

where the first term in (17) indicates the categorical cross entropy between the AE input $\{e_{b}^{(i)}\}$ and the output {p⁽ⁱ⁾}, and the second term stands for the regularization, which prevents overfitting [13]. Here, the design parameter λ adjusts the relative contribution of the regularization to the overall cost. The l₂-norm based regularization is employed for the weight matrices in (17), which has been widely utilized in various DL applications [15] and can regularize all elements of the weight matrices {W_l} uniformly.
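To make the receiver and the training cost concrete, the sketch below evaluates the ReLU and softmax layers on a small batch and then computes a cost of the same shape as (17). It is an illustration under stated assumptions: message labels are 0-based, the penalty is taken as the sum of squared entries of each weight matrix, and the random weights are placeholders rather than trained parameters.

```python
import numpy as np

def receiver_forward(Y, W3, b3, W4, b4):
    """Y is (m, N); returns the (m, M) matrix of softmax outputs p."""
    H3 = np.maximum(0.0, Y @ W3.T + b3)           # third hidden layer (ReLU)
    Z = H3 @ W4.T + b4
    Z = Z - Z.max(axis=1, keepdims=True)          # numerical safeguard only
    P = np.exp(Z)
    return P / P.sum(axis=1, keepdims=True)

def cost(P, labels, weights, lam):
    """Cross entropy against one-hot targets plus a squared-norm weight penalty."""
    m = P.shape[0]
    cross_entropy = -np.mean(np.log(P[np.arange(m), labels]))
    penalty = lam * sum(np.sum(W ** 2) for W in weights)
    return cross_entropy + penalty

rng = np.random.default_rng(0)
M, N, m = 8, 3, 5
W3, b3 = rng.normal(size=(M, N)), np.zeros(M)     # placeholder receiver weights
W4, b4 = rng.normal(size=(M, M)), np.zeros(M)
Y = rng.uniform(size=(m, N))                      # stand-in channel outputs
labels = rng.integers(0, M, size=m)               # transmitted messages (0-based)
P = receiver_forward(Y, W3, b3, W4, b4)
decoded = P.argmax(axis=1)                        # estimate of each message
J = cost(P, labels, [W3, W4], lam=1e-4)
print(decoded, J)
```

In a full AE, the list passed as `weights` would hold all four weight matrices so that the penalty matches the sum over l = 1, ..., 4.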

For practical implementation of the proposed approach, the channel matrix H is estimated first since it changes rarely and can be easily obtained before transmission [17]. Then, the weight matrices of the AE can be optimized via the SGD algorithms in advance. If there are mismatches between the training channel and the actual channel, we can employ fine-tuning strategies [24] where the AE that has been trained with one of similar channel matrix candidates is further optimized over the actual channel. Once the AE is trained, we can evaluate the average SER performance of the AE by testing over a new noise vector n which is independently generated from {n⁽ⁱ⁾}. In practice, the AE parameters {W_l, b_l} can be trained in advance and then can be implemented at the transmitter and the receiver. With W_l and b_l for l = 1 and 2 at hand, the transmitter identifies the intensity vector s for the transmitted message. The receiver can recover message $\hat{b}$ from a new received signal y by using W_l and b_l for l = 3 and 4. The encoding and the decoding operations of the AE are processed by linear matrix multiplications and additions specified by (8), (9), (15), and (16). The overall complexity of the proposed AE transceiver becomes $O (M N + M^{2})$ , which is manageable for a simple VLC transceiver design.

4. Numerical results

We provide numerical results to verify the efficacy of the proposed AE based VLC transceiver with RGB LEDs (N = 3). The peak intensity constraint a = [a₁, a₂, a₃]^T in (6) is normalized to one, i.e., a₁ = a₂ = a₃ = 1, and the signal-to-noise ratio (SNR) is defined as 1/σ². Three hidden layers shown in Fig. 2 are employed for the AE structure, consisting of M, N, and M neurons, respectively. For the AE training, a mini-batch SGD method called the Adam algorithm [21] is applied with learning rate η = 0.001 and mini-batch size 60 × M. The weight matrices and the bias vectors of the AE are initialized according to [22]. We utilize 10⁵ × M training samples to train the AE, and another 10⁵ × M samples to determine parameters such as λ in (17). In order to evaluate the average performance of the trained AE, we use 10⁷ × M test samples during the testing stage. Note that the additive noise for the training, validation, and testing stages is generated with different random seeds. The simulations are implemented in Python 3.5.2 with TensorFlow 1.3.0 [20].

To optimize the AE parameters efficiently, we propose a two-stage training strategy where the AE is first trained at a fairly high SNR and is subsequently trained further at a low SNR in the second stage. The numbers of iterations of the mini-batch SGD for the first and the second training stages are 50 and 100, respectively. Note that the SNR values adopted during the training are parameters which are optimized via the validation process.

4.1. Impact of dimming constraint

We investigate the impact of the dimming constraint d = [d₁, d₂, d₃]^T for the AE based VLC transceiver with M = 8 messages and ζ = 0. For a benchmark, we consider the binary PAM with analog dimming [2, 16, 17] in which the intensity c_j of each color j (j = 1, 2, 3) takes a value from independently biased binary PAM {c_j₁, c_j₂}, where c_j₁ ≜ max{0, 2d_j – 1} and c_j₂ ≜ min{2d_j, 1}. In three-dimensional signal space, these points {c_j₁, c_j₂} for j = 1, 2, and 3 form a cuboid (or rectangular hexahedron) constellation where the transmit symbols are given by the vertices of a rectangular cuboid, whose minimum Euclidean distance (MED) is equal to the shortest edge length min_j (c_j₂ – c_j₁) = min_j 2 min{d_j, 1 – d_j}. For the decoding of PAM, the maximum likelihood decoding is adopted. The overall complexity of the baseline scheme is given by $O (M N)$ , which is of a similar order to that of the AE, although the AE might cause a slight increase in the number of calculations.
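The baseline constellation can be reproduced directly from the definitions of c_j₁ and c_j₂. The sketch below, with hypothetical helper names, builds the eight cuboid vertices for a given dimming target and evaluates the MED by brute force over all pairs.

```python
import itertools
import numpy as np

def cuboid_constellation(d):
    """Vertices of the cuboid spanned by the biased binary PAM levels of each color."""
    d = np.asarray(d, dtype=float)
    low = np.maximum(0.0, 2.0 * d - 1.0)     # c_j1
    high = np.minimum(2.0 * d, 1.0)          # c_j2
    return np.array(list(itertools.product(*zip(low, high))))

def min_euclidean_distance(points):
    """Smallest pairwise Euclidean distance among the constellation points."""
    return min(np.linalg.norm(p - q) for p, q in itertools.combinations(points, 2))

for d in ([0.2, 0.4, 0.9], [0.8, 0.6, 0.5]):
    C = cuboid_constellation(d)
    print(d, C.mean(axis=0), min_euclidean_distance(C))
```

The centroid of the vertices equals d by construction, so the dimming constraint is met for any target within the peak limits.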

Figure 3 shows the average SER performance as a function of the SNR with various dimming constraints for M = 8 and ζ = 0. We see that the AE significantly outperforms the baseline scheme for all dimming constraints. It is noticed that, although we have trained the AE for certain SNR values, the AE provides a good average SER performance for all SNR regimes. For the average SER of 10⁻³, the proposed VLC system offers about 7 and 3 dB gains over the baseline scheme with d = [0.2, 0.4, 0.9]^T and [0.8, 0.6, 0.5]^T, respectively.

Fig. 3 Average SER performance as a function of SNR with M = 8 and ζ = 0.


Figure 4 illustrates the constellation points learned by the AE for M = 8 and ζ = 0 with different dimming constraints. The means of the constellation points are marked by black squares. It is observed that the learned constellation satisfies the target dimming constraint. We also see that the learned constellation points are totally different from the baseline cuboid constellation. Note that the MEDs for the learned constellations in Figs. 4(a)–4(d) are obtained as 0.317, 0.427, 0.512, and 0.660, respectively, whereas those for the cuboid constellation are 0.2, 0.2, 0.4, and 0.4, respectively. As a consequence, the constellations in Fig. 4 attained by the AE are more efficient than those of the baseline scheme in terms of the error probability, as shown in Fig. 3. This indicates that, for a given dimming target, the AE learns unknown and complicated modulation rules by itself to reliably transmit and receive the messages through the noisy channel.

Fig. 4 Learned constellation points by the AE with M = 8 and ζ = 0.


To understand the transmission strategy of the AE-based VLC system, the Euclidean distance between each constellation point and the target dimming vector d is depicted in Fig. 5 for both constellations with M = 8 and ζ = 0. We index the constellation points in ascending order of the intensities of green, blue, and red colors. All constellation points of the baseline scheme are equally distant from the dimming target d since the points are located at the vertices of the cuboid and the centroid corresponds to the dimming target. By contrast, the AE places only one or two constellation points near d, and thus it can distribute the other points uniformly over the signal space rather than clustering them closely around the dimming target. For this reason, the AE can achieve a better MED performance than the baseline scheme.

Fig. 5 Euclidean distance between each constellation point and dimming constraint with M = 8 and ζ = 0.


To further validate the efficacy of the proposed AE method, we depict the MED gain of the learned AE constellations by varying the dimming constraint as d₁ = 0.1, 0.2, ⋯, 0.5 and d_j = 0.1, 0.2, ⋯, 0.9 for j = 2 and 3 in Fig. 6. The MED of the AE constellations is larger in general. The MED gain is axially symmetric about the line d₂ = d₃, which means that the AE learns similar modulation rules for permutations of a dimming constraint. It is interesting to note that the MED gain normally increases as the target dimming intensities of the colors become asymmetric. This is because the degrees of freedom in the constellation design are much higher when d_j for j = 1, 2, and 3 are quite different.

Fig. 6 MED gain over the baseline scheme with M = 8 and ζ = 0.


4.2. Impact of color crosstalk

We study the performance of the proposed method when inter-color interference exists, i.e., ζ > 0. In Fig. 7, we plot the average SER performance for M = 8 and d = [0.8, 0.6, 0.5]^T with ζ = 0, 0.1, and 0.3. The AE clearly outperforms the baseline scheme in all SNR regimes. This shows that the AE based VLC compensates for the color crosstalk better than the baseline scheme.

Fig. 7 Average SER performance as a function of SNR with M = 8 and d = [0.8, 0.6, 0.5]^T.


Figure 8 exhibits the three-dimensional transmitted symbols {s_b} learned by the AE for M = 8 and d = [0.8, 0.6, 0.5]^T with ζ = 0.1 and 0.3. The AE still fulfills the dimming target with a positive inter-color interference ratio ζ. Compared to the constellation obtained with ζ = 0 in Fig. 4(d), the AE produces significantly different transmitted symbols for the inter-color interference case. To obtain insights behind the transmitted symbols in Fig. 8, we illustrate in Fig. 9 the noiseless received signal Hs_b, $\forall b \in ℳ$ , with M = 8 and d = [0.8, 0.6, 0.5]^T for ζ = 0.1 and 0.3. Note that the MEDs of the constellations in Figs. 9(a) and 9(b) are calculated as 0.534 and 0.370, respectively, which are much larger than 0.362 and 0.305 for the cuboid constellation for ζ = 0.1 and 0.3, respectively.
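The cuboid MED values quoted above for ζ = 0.1 and 0.3 can be checked numerically by passing the baseline constellation through the crosstalk channel without noise. The sketch below assumes the symmetric three-color crosstalk matrix with off-diagonal entries ζ; `received_med` is a hypothetical helper introduced only for this check.

```python
import itertools
import numpy as np

def received_med(d, zeta):
    """MED of the noiseless received cuboid constellation H s_b for N = 3."""
    d = np.asarray(d, dtype=float)
    H = np.array([[1 - zeta, zeta, 0.0],
                  [zeta, 1 - 2 * zeta, zeta],
                  [0.0, zeta, 1 - zeta]])         # assumed crosstalk matrix
    low = np.maximum(0.0, 2 * d - 1)
    high = np.minimum(2 * d, 1.0)
    points = np.array(list(itertools.product(*zip(low, high))))
    received = points @ H.T                       # row b holds H s_b
    return min(np.linalg.norm(p - q)
               for p, q in itertools.combinations(received, 2))

d = [0.8, 0.6, 0.5]
print(round(received_med(d, 0.1), 3), round(received_med(d, 0.3), 3))
```

In both cases the closest pair differs only in the first color, whose PAM spacing of 0.4 is the smallest edge of the cuboid, so crosstalk shrinks the MED below its ζ = 0 value of 0.4.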

Fig. 8 Transmitted signal of the AE based VLC with M = 8 and d = [0.8, 0.6, 0.5]^T.


Fig. 9 Noiseless received signal of the AE based VLC with M = 8 and d = [0.8, 0.6, 0.5]^T.


We present the average SER performance in Fig. 10 by changing the inter-color interference ratio ζ at SNR = 20 dB with different dimming targets. The average SER of both schemes increases as ζ grows. Moreover, the AE outperforms the baseline scheme regardless of ζ.

Fig. 10 Average SER performance as a function of ζ with M = 8 and SNR = 20 dB.


In Fig. 11, we provide the MED gain of the constellation points learned by the AE over the cuboid constellation for ζ = 0.1, 0.2, and 0.3 with fixed d₁ = 0.3. The MED gain is normally greater than 1 for all ζ. Thus, we conclude that, in the presence of color crosstalk, the proposed AE based VLC still performs very well in terms of the error rate performance. It is observed that the MED gain generally grows as the inter-color interference ratio ζ increases. The AE learns the effect of the color crosstalk channel by itself and separates the constellation points as far as possible to successfully recover the transmitted message. In contrast, baseline schemes do not handle the inter-color interference very efficiently. As a result, the constellation points learned by the AE show a good MED performance gain.

Fig. 11 MED gain over the baseline scheme with M = 8 and d₁ = 0.3.

4.3. Impact of the number of messages

We now present numerical results for the AE with M = 16 messages. Figure 12 shows the average SER performance as a function of SNR for various dimming constraints with M = 16 and ζ = 0. As a baseline, we adopt the cube-in-cube (CIC) constellation [23] with analog dimming. The results confirm that the AE also outperforms the baseline scheme at a higher modulation order. The AE achieves gains of about 10 dB and 7 dB at an average SER of 10⁻⁴ with d = [0.2, 0.4, 0.9]^T and [0.8, 0.6, 0.5]^T, respectively.

Fig. 12 Average SER performance as a function of SNR with M = 16 and ζ = 0.

Figure 13 illustrates the constellation points learned by the AE with M = 16 and ζ = 0. The MEDs of the learned constellations in Figs. 13(a)–13(d) are computed as 0.232, 0.270, 0.345, and 0.423, respectively, which are much larger than those of the CIC constellations, calculated as 0.093, 0.093, 0.186, and 0.186, respectively. These results show that the AE is also effective in learning higher-order modulations for the VLC system.

Fig. 13 Learned constellation points by the AE with M = 16 and ζ = 0.

4.4. Comparison with conventional constellation design method

Figure 14 compares the average SER performance of the proposed AE and the constellation design method in [6]. Here, the input-dependent shot noise $n_j \sim \mathcal{N}(0, \sigma^2 (1 + \psi^2 s_j))$ is taken into account [17]. First, for ψ² = 0, the two schemes provide similar SER performance. However, for ψ² = 5, the proposed AE transceiver performs better than the conventional method in [6] for all simulated dimming constraints. The reason is that the AE automatically learns the complicated input-output relation of the input-dependent noise channel to improve the symbol detection performance, while the method in [6] considers only the minimum distance of the constellation points with maximum likelihood decoding, which is suboptimal for the input-dependent noise channel.
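The noise model above can be simulated directly, since the variance of each noise sample scales with the transmitted intensity of the same color. The following is a minimal sketch under stated assumptions: the function name and the numerical values of σ² and the intensity level are hypothetical, and the channel matrix is omitted (identity channel), so only the noise statistics are illustrated.

```python
import math
import random


def shot_noise_channel(s, sigma2, psi2, rng):
    """Add input-dependent Gaussian noise to a non-negative intensity vector s.

    Each component j receives noise with variance sigma2 * (1 + psi2 * s[j]).
    Setting psi2 = 0 recovers signal-independent noise of variance sigma2.
    """
    return [x + rng.gauss(0.0, math.sqrt(sigma2 * (1.0 + psi2 * x))) for x in s]


# Hypothetical parameters for illustration only.
rng = random.Random(0)
sigma2, psi2, level = 0.01, 5.0, 0.5

# Draw many noise samples at a fixed intensity level and measure their variance.
samples = [
    shot_noise_channel([level], sigma2, psi2, rng)[0] - level
    for _ in range(100_000)
]
empirical_var = sum(n * n for n in samples) / len(samples)
expected_var = sigma2 * (1.0 + psi2 * level)
```

Because brighter symbols are noisier under this model, the minimum distance alone no longer determines the error rate, which is the effect discussed above.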

Fig. 14 Performance comparison with the constellation design method [6].

5. Conclusion and future works

This paper has introduced an AE concept to design the transceiver for multi-colored VLC with a dimming control feature. To deal with the peak intensity and dimming constraints, we have added a post-processing layer to the AE that projects a batch of hidden layer outputs onto a feasible region of the optical signal space specified by the lighting constraints. During the training stage, the parameters of the AE are optimized such that the categorical cross-entropy between the transmitted and the received symbols is minimized. The performance of the trained AE can be evaluated over unseen additive noise channels during the test stage. Simulation results have demonstrated that the proposed VLC transceivers significantly outperform the baseline schemes in terms of the average SER performance. Also, by observing the constellation points learned by the AE, we have confirmed that the AE can learn efficient modulation rules for dimmable VLC by itself.

In this work, we have considered a LOS link dominated channel which can easily be estimated before the training stage. In this configuration, an AE that has been trained with a given channel can be directly employed to communicate over the same channel matrix. For future work, the proposed strategies can be extended to address general scenarios where the channel matrix varies during transmission and is thus not available during the training stage. One possible method is to design an AE which does not require the channel matrix during the testing stage, i.e., an open-loop VLC system. This could be achieved by training the AE over numerous random channel matrices generated from an accurate VLC channel model to effectively learn the optical channel statistics. Alternatively, the channel matrix can be considered as an additional input of the AE, so that the trained AE operates in a closed-loop VLC system where the transmit intensity vectors are adaptively computed based on each channel realization. The investigation of efficient AE structures and learning strategies for such approaches is worth pursuing. This framework can shed light on the viability of AE-based approaches for optical communication applications and stimulate wide interest within the research community.

Funding

National Research Foundation of Korea (NRF) funded by the Korea Government (MSIP) (No. 2015R1C1A1A01052529, 2017R1A2B3012316).

References and links

1. T. Komine and M. Nakagawa, “Fundamental analysis for visible-light communication system using LED lights,” IEEE Trans. Consum. Electron. 50(1), 100–107 (2004).

2. S. H. Lee, S.-Y. Jung, and J. K. Kwon, “Modulation and coding for dimmable visible light communication,” IEEE Commun. Mag. 53(2), 136–143 (2015).

3. R. Singh, T. O’Farrell, and J. P. R. David, “An enhanced color shift keying modulation scheme for high-speed wireless visible light communications,” J. Lightw. Technol. 32(14), 2582–2592 (2014).

4. E. Monteiro and S. Hranilovic, “Design and implementation of color-shift keying for visible light communications,” J. Lightw. Technol. 32(10), 2053–2060 (2014).

5. R. J. Drost and B. M. Sadler, “Constellation design for color-shift keying using billiards algorithms,” in Proceedings of GLOBECOM Workshops (IEEE, 2010).

6. X. Liang, M. Yuan, J. Wang, Z. Ding, M. Jiang, and C. Zhao, “Constellation design enhancement for color-shift keying modulation of quadrichromatic LEDs in visible light communications,” J. Lightw. Technol. 35(17), 3650–3663 (2017).

7. P. Das, B.-Y. Kim, Y. Park, and K.-D. Kim, “Color-independent VLC based on a color space without sending target color information,” Opt. Commun. 286(1), 69–73 (2012).

8. K.-I. Ahn and J. K. Kwon, “Color intensity modulation for multicolored visible light communications,” IEEE Photon. Technol. Lett. 24(24), 2254–2257 (2012).

9. T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cogn. Commun. Netw. 3(4), 563–575 (2017).

10. H. Lee, B. Lee, and I. Lee, “Iterative detection and decoding with an improved V-BLAST for MIMO-OFDM systems,” IEEE J. Sel. Areas Commun. 24(3), 504–513 (2006).

11. H. Sung, S.-R. Lee, and I. Lee, “Generalized channel inversion methods for multiuser MIMO systems,” IEEE Trans. Commun. 57(11), 3489–3499 (2009).

12. K. Lee, H. Sung, E. Park, and I. Lee, “Joint optimization for one and two-way MIMO AF multiple-relay systems,” IEEE Trans. Wireless Commun. 9(12), 3671–3681 (2010).

13. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015).

14. V. Nair and G. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of International Conference on Machine Learning (IMLS, 2010), pp. 807–814.

15. M. Li, T. Zhang, Y. Chen, and A. J. Smola, “Efficient mini-batch training for stochastic optimization,” in Proceedings of International Conference on Knowledge Discovery and Data Mining (ACM, 2014), pp. 661–670.

16. J. Dong, Y. Zhang, and Y. Zhu, “Convex relaxation for illumination control of multi-color multiple-input-multiple-output visible light communications with linear minimum mean square error detection,” Appl. Opt. 56(23), 6587–6595 (2017).

17. Q. Gao, C. Gong, and Z. Xu, “Joint transceiver and offset design for visible light communications with input-dependent shot noise,” IEEE Trans. Wireless Commun. 16(5), 2736–2747 (2017).

18. S. Boyd and L. Vandenberghe, Convex Optimization (Cambridge University, 2004).

19. S. Boyd and J. Dattorro, “Alternating projections,” http://www.stanford.edu/class/ee392o/alt_proj.pdf (2013).

20. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: large-scale machine learning on heterogeneous systems,” http://tensorflow.org (2015).

21. D. Kingma and J. Ba, “Adam: a method for stochastic optimization,” http://arxiv.org/abs/1412.6980 (2014).

22. X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of International Conference on Artificial Intelligence and Statistics (PMLR, 2010), pp. 249–256.

23. Z. Chen, J. S. Bae, S.-K. Chung, J.-W. Koh, and S. G. Kang, “Multi-envelope 3-dimensional constellations for polarization shift keying modulation,” in Proceedings of Information and Communication Technology Convergence (KICS, 2010), pp. 173–174.

24. S. Dörner, S. Cammerer, J. Hoydis, and S. ten Brink, “Deep learning based communication over the air,” IEEE J. Sel. Topics Signal Process. 12(1), 132–143 (2018).

Optics Express

Alternating Projection Algorithm for (P1)
Initialize $t = 0$ and $S^{[t]} = H_2$.
repeat
  Set $t \leftarrow t + 1$.
  Compute $S^{[t]} = d 1_m^T + S^{[t-1]} (I_m - \frac{1}{m} 1_m 1_m^T)$.
  Compute $[S^{[t]}]_{ji} = \min\{\max\{[S^{[t]}]_{ji}, 0\}, [a]_j\}$ for $j = 1, \cdots, N$ and $i = 1, \cdots, m$.
until $S^{[t]}$ converges.
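The two steps of the algorithm can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' TensorFlow implementation: the function name, the stopping tolerance, the iteration cap, the random input standing in for the hidden layer output H₂, and the peak intensities a = [1, 1, 1] are all assumptions. The first step uses the fact that multiplying by $I_m - \frac{1}{m} 1_m 1_m^T$ subtracts each row's mean, so the row mean is reset to the dimming target d_j; the second step clips every entry to [0, a_j].

```python
import random


def alternating_projection(H2, d, a, tol=1e-10, max_iter=1000):
    """Project a batch H2 (N rows of m entries) onto the lighting constraints.

    Alternates between (i) shifting each row so that its mean equals d[j]
    and (ii) clipping each entry of row j to the interval [0, a[j]].
    """
    S = [list(row) for row in H2]
    for _ in range(max_iter):
        S_new = []
        change = 0.0
        for row, d_j, a_j in zip(S, d, a):
            mean = sum(row) / len(row)
            # Step 1: S (I - 11^T / m) + d 1^T, i.e. remove the row mean, add d_j.
            shifted = [x - mean + d_j for x in row]
            # Step 2: entry-wise clipping to the box [0, a_j].
            clipped = [min(max(x, 0.0), a_j) for x in shifted]
            change += sum((u - v) ** 2 for u, v in zip(clipped, row))
            S_new.append(clipped)
        S = S_new
        if change ** 0.5 < tol:  # Frobenius norm of the update (assumed stopping rule)
            break
    return S


# Hypothetical batch: N = 3 colors, m = 64 samples, random stand-in for H2.
rng = random.Random(0)
d = [0.8, 0.6, 0.5]
a = [1.0, 1.0, 1.0]
H2 = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(3)]
S = alternating_projection(H2, d, a)
row_means = [sum(row) / len(row) for row in S]
```

Since the last operation in each iteration is the clipping step, the returned entries always satisfy the peak constraint exactly, while the dimming target is met up to the chosen tolerance.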