
Performance evaluation of data embedding schemes for two-dimensional display field communication

Open Access

Abstract

Display field communication (DFC) is an unobtrusive display-to-camera technology that transmits data within the frequency domain of images, ensuring that the embedded data are hidden and do not disrupt the viewing experience. The display embeds data into image frames, while the receiver captures the displayed frames and extracts the data. Two-dimensional DFC (2D-DFC) embeds data along both the width and the height of an image. This study explores two methods to minimize the error rate in 2D-DFC without affecting the quality of the displayed image. The orthogonal method embeds data in the orthogonal direction of an image, whereas the diagonal method strategically embeds the data in the diagonal direction. Experiments show the diagonal method maintains a high peak signal-to-noise ratio and surpasses the orthogonal embedding method in terms of bit error rate. 2D-DFC is expected to have practical applications in digital signage, in advertising and informational displays at airports and train stations, and in large-scale displays for events, sports arenas, and performance venues.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Display field communication (DFC) [1] is a subcategory of unobtrusive wireless display-to-camera (D2C) communication technologies [2–4] that use digital displays to transmit data to cameras. It is based on OFDM-like modulation [5] to deliver data streams without interfering with the viewing experience. In other words, D2C communication makes full use of spatial and temporal diversity, meaning it can modulate multiple pixels on a screen to facilitate high-rate data communication. Additionally, it does not require any modifications to standard off-the-shelf screen and camera devices, so it is easily integrated with existing infrastructure. Consequently, D2C communication has become a highly discussed and explored area of optical technology. In comparison to RF communications, D2C communication offers several distinct advantages. First, it establishes rapid connections without the need for explicit network setup, a challenge often faced by WiFi and Bluetooth. Second, the RF spectrum is heavily congested, whereas D2C communication enjoys a significant amount of available spectrum, operates under independent regulations, and naturally supports one-to-many communication. Lastly, it inherently provides security benefits [6,7]: unlike RF signals that can penetrate walls, D2C links are confined to well-defined coverage zones. Consequently, it paves the way for numerous innovative applications in fields such as mobile communications, digital advertising, smart homes, and intelligent transportation systems. D2C communication can be thought of as a replacement technology for 2D barcodes and QR codes, which also transfer data to a camera. Note that QR codes are widely used today, and various efforts are underway to advance them [8–10].

The pioneering work in hidden D2C communications capitalizes on the temporal-spatial flicker-fusion property [2], and employs modulation techniques resembling CDMA to transmit data streams without causing disruptions in the viewing experience. Another approach, HiLight [11], takes advantage of the orthogonal transparency (alpha) channel to embed bits as changes in pixel translucency without altering pixel color values. TextureCode [12] enhances invisibility by adaptively embedding data based on video texture. ChromaCode [13] further enhances code invisibility by adjusting lightness in a uniform color space and achieves complete imperceptibility. DeepLight [14] elevates the reliable data transmission rates of concealed D2C communication in various real-world conditions by selectively modulating the intensity of the blue channel and incorporating machine learning (ML) models into the decoding process. For increased data rates, AirCode [15] leverages the complementary advantages of video and audio channels, and integrates visual odometry for precise screen detection. Along similar lines, a screen-to-camera image code was proposed that adopts the color decomposition principle to ensure data embedding efficiency [16]. A BCH coding-based data arrangement and an attention-guided data decoding network are designed to guarantee robustness and adaptability. More recently, an innovative approach was presented to detect the display area in camera images that involves the incorporation of a novel localization marker into the corners of the display [17]. This marker, though less obtrusive than conventional fiducial markers, has proven to be highly reliable across various types of display content and backgrounds, as demonstrated through both simulations and experimental results.

DFC is D2C communication in which the data are embedded in the frequency domain of an image. Unlike spatial domain techniques, data embedding in the frequency domain makes the data less noticeable to the human eye [18,19]. DFC transforms spatial domain images to the frequency domain using frequency transformation techniques. The data are embedded into the frequency-domain image, which is then converted back to the spatial domain to be displayed on screen. This means that data can be transmitted via the D2C link while the image sequences are displayed for their original purpose. The proposed scheme embeds data in a designated spectral band so the hidden data are not revealed on normal displays. Embedding data in two dimensions of an image frame results in two-dimensional DFC (2D-DFC) [20]. It has been shown that 2D-DFC achieves better data rates than 1D-DFC. The initial versions of DFC used the discrete Fourier transform (DFT) as the frequency transformation technique, which restricts data embedding due to its inherent conjugate symmetry [1]. This limitation was removed by using the discrete cosine transform (DCT), which does not have an imaginary component [21,22]. In this study, we consider 2D-DFC employing the 2D-DCT, which serves as a better approach to enhance the data rate of the system. The DCT, being a real-valued transform, does not suffer from the ringing artifacts that are sometimes introduced by imaginary components. The 2D-DCT also places greater emphasis on vital image information within the lower frequencies than the 1D-DCT. Another advantage of the DCT is that common image compression algorithms like JPEG utilize the 2D-DCT; hence, our transformation technique is aligned with the compression applied when the camera captures the image. In the following, we propose two data embedding mechanisms: a modified orthogonal method and a diagonal method. Both schemes are evaluated for different system design parameters. Simulation results reveal that diagonal data embedding performs better than orthogonal data embedding in terms of bit error rate (BER), while maintaining a high peak signal-to-noise ratio (PSNR).

The rest of this paper is organized as follows. Section 2 presents the 2D-DCT-based DFC system model. Section 3 presents the proposed data embedding mechanisms and the power allocation scheme. In addition, data repetition and data extraction are explained. Section 4 presents the experiment setup and the simulation results in terms of BER and PSNR. Finally, the study concludes in Section 5.

2. 2D-DCT based DFC

2.1 2D-DFC system model

A fundamental DFC system comprises a display pointed towards a camera. Figure 1 shows a block diagram of a 2D-DFC system. At the transmitter, the input image, $\mathbf {I}_t$, is first converted into its frequency domain equivalent, $\mathbf {I}_F$, using the 2D-DCT operation. At the same time, binary input data ($\mathbf {b}$) are modulated, and based on the average power of the frequency domain coefficients, are embedded into the frequency-domain image. The embedding process is done via the addition allocator. The data are embedded in specific frequency regions (sub-bands) of the image. The modulated and power-allocated data, $\mathbf {X}$, are embedded into the image resulting in data-embedded image $\mathbf {D}_F$. The frequency-domain data-embedded image is then converted back to the spatial domain to be displayed on the screen. This is achieved via the inverse 2D-DCT operation resulting in spatial domain image $\mathbf {D}_t$. Note that the reference image, $\mathbf {I}_{t(ref)}$, which is not data embedded, is also rendered alternately on screen. By inserting the reference image between the neighboring data-embedded images, image artifacts that might be visible to the human eye can be minimized. In addition, the reference images help the camera receiver decode the data.

Fig. 1. Block diagram of a 2D-DFC system.

2.2 2D-DCT frequency coefficients

As data embedding is performed on the frequency coefficients of an image and our proposed 2D-DFC schemes exploit frequency domain characteristics of the image, it is important to understand the properties of the 2D-DCT operation. 2D-DCT changes the image from its spatial domain (i.e., the pixel values that make up the image) into the frequency domain. Each coefficient in the resulting DCT image represents the amplitude of a specific spatial frequency component. 2D-DCT on image $\mathbf {I}_t$ sized at $P \times Q$ is performed as follows:

$$\mathbf{I}_F (u,v) = C(u) C(v) \sum_{p=0}^{P-1}\sum_{q=0}^{Q-1} \mathbf{I}_t(p,q) \cos\frac{\pi (2p+1)u}{2P} \cos\frac{\pi (2q+1)v}{2Q}, 0 \leq u \leq P-1, 0 \leq v \leq Q-1,$$
where $C(u)$ and $C(v)$ are scaling factors, defined as
$$C(u) = \begin{cases} \frac{1}{\sqrt{P}}, & u = 0 \\ \sqrt{\frac{2}{P}}, & 1 \leq u \leq P-1 \end{cases}$$
and
$$C(v) = \begin{cases} \frac{1}{\sqrt{Q}}, & v = 0 \\ \sqrt{\frac{2}{Q}}, & 1 \leq v \leq Q-1 \end{cases}.$$

$\mathbf {I}_F (u,v)$ represents the DCT coefficient value at position $(u,v)$ in frequency-domain image $\mathbf {I}_F$. The scaling factors are used to adjust the magnitude of the DCT coefficients, and they ensure the coefficients are normalized properly. Note that $C(v)$ follows the same pattern as $C(u)$. The resulting frequency domain image represents different spatial frequencies present in the image frame.
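
As a concrete illustration (our sketch, not code from the original experiments), the orthonormal DCT-II provided by SciPy applies exactly the scaling factors $C(u)$ and $C(v)$ defined above, so the forward and inverse transforms used throughout this paper can be reproduced as follows:

```python
import numpy as np
from scipy.fft import dctn, idctn

def forward_2d_dct(I_t):
    """Spatial-domain image (P x Q) -> frequency-domain coefficients I_F."""
    return dctn(np.asarray(I_t, dtype=float), type=2, norm='ortho')

def inverse_2d_dct(I_F):
    """Frequency-domain coefficients I_F -> spatial-domain image."""
    return idctn(I_F, type=2, norm='ortho')

# Round-trip check on a random 15 x 15 block.
I_t = np.random.rand(15, 15)
assert np.allclose(inverse_2d_dct(forward_2d_dct(I_t)), I_t)
```

The round-trip assertion confirms that the orthonormal inverse 2D-DCT recovers the original spatial-domain image exactly, which is the property the receiver relies on.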

In Fig. 2, the red region corresponds to the mean DC value, which represents the average of all the pixels. Around it, we see the distribution of low-, medium-, and high-frequency components of an image. Observe that the amplitudes of the low-frequency components are located at the top left corner of the 2D spectrum, while high frequencies are located at the bottom right corner. The 2D-DFC scheme exploits the fact that image information is concentrated in certain regions of the frequency-domain image. On the image plane, gradual changes in color can be described as low-frequency variations, while rapid color transitions can be considered high-frequency variations. In other words, the lower-frequency coefficients generally capture the image’s main features, while the higher-frequency coefficients represent finer details and textures. Generally, the low-frequency content holds the most crucial visual information perceived by the human eye. When passing through the D2C channel, high-frequency noise gets added to the image, which makes it difficult to extract data from the high-frequency region. Moreover, the amplitudes of high-frequency components are lower than those of the mid-frequency components. Consequently, for the 2D-DFC method, it is preferable to load data in the mid-frequency region.
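
To make the band structure concrete, the short sketch below partitions the coefficients of a $15 \times 15$ 2D-DCT block by the index sum $u+v$. The specific thresholds are illustrative assumptions only; the experiments in Section 4.1 define the actual sub-bands by row index.

```python
import numpy as np

# Illustrative partition of a 15 x 15 2D-DCT coefficient grid.
# The thresholds on u + v are assumptions for visualization only.
P = Q = 15
u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing='ij')
index_sum = u + v                        # small near the DC term (top-left corner)

band = np.full((P, Q), 'high', dtype=object)
band[index_sum <= 18] = 'mid'            # assumed mid-frequency region
band[index_sum <= 8] = 'low'             # assumed low-frequency region
band[0, 0] = 'dc'                        # mean (DC) coefficient

print(band[0, 1], band[7, 7], band[14, 14])   # low  mid  high
```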

Fig. 2. Frequency component distributions of a $15 \times 15$p spectral image after the 2D-DCT operation.

3. Proposed data embedding and extraction

3.1 Power allocation

Power allocation is performed because it provides advantages from the perspective of PSNR; in addition, it helps in decoding the data. For modulation, we use binary phase-shift keying (BPSK), as it is a robust and fundamental modulation scheme. Binary input data $b$ are first BPSK modulated before being embedded into the image frames. Data mapping is performed as follows:

$$d = \begin{cases} -1, & b=0 \\ 1, & b=1 \end{cases}.$$

After modulation, power allocation is applied to the data. The average power of the data-embedding region of the frequency-domain image, $P_{\text {avg}}$, is first obtained and then scaled by a proportionality constant, $\alpha$ [22]. Therefore, the power-allocated version of the modulated data is represented as

$$d_{\text{amp}} = X_{\text{amp}} \cdot d,$$
where $X_{\text {amp}}$ is determined as follows:
$$X_{\text{amp}} = \left( \sqrt{P_{\text{avg}}} \cdot \alpha \right).$$

The value of $\alpha$ varies in the range $(0,3)$. Now, let us consider a data matrix $\textbf {X}$ of the same size as $\textbf {I}_F$. Data vector $d_{\text {amp}}$ is inserted into data matrix $\textbf {X}$ using a unique pattern to form data matrix $\textbf {X}_D$. This data insertion mechanism is explained in the next section. Once we have the final data matrix $\textbf {X}_D$, we add it to the frequency-domain image $\textbf {I}_F$ via the addition allocation:

$$\textbf{D}_F = \textbf{I}_F + \textbf{X}_D,$$
where $\textbf {D}_F$ is the final data-embedded image in the frequency domain.
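
A minimal sketch of this modulation, power allocation, and additive embedding chain is given below. The index arrays `rows` and `cols` (the embedding pattern) and the use of the mean squared coefficient value as the average power are our assumptions for illustration.

```python
import numpy as np

def embed(I_F, bits, rows, cols, alpha=0.7):
    """Add power-scaled BPSK symbols to selected DCT coefficients of I_F.

    bits, rows, and cols must have the same length (one bit per position).
    """
    d = 2.0 * np.asarray(bits, dtype=float) - 1.0   # BPSK mapping: 0 -> -1, 1 -> +1
    region = I_F[rows, cols]                        # coefficients of the embedding region
    P_avg = np.mean(region ** 2)                    # average power of the region (assumed definition)
    X_amp = np.sqrt(P_avg) * alpha                  # power-allocation amplitude
    X_D = np.zeros_like(I_F)
    X_D[rows, cols] = X_amp * d                     # place d_amp following the chosen pattern
    D_F = I_F + X_D                                 # additive embedding: D_F = I_F + X_D
    return D_F, X_amp
```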

3.2 Repetitive data embedding

In this study, we employ repetitive data embedding, a technique in which the same data are repeatedly embedded in the image frame. This adds redundancy and robustness, since data might be lost or corrupted during transmission, and thereby improves error resilience. By embedding the same information multiple times, the chances of successfully recovering the data increase, even if parts are lost or corrupted due to noise, compression, or other factors. Considering data of length $L$ repeated $K$ times, we can achieve data repetition as shown in Fig. 3. An example of data repetition with $L=4$ and $K=2$ is shown in Fig. 3(a). As seen, the data are repeated $K-1$ times in the horizontal, vertical, and diagonal directions. An example with actual data is shown in Fig. 3(b). Note that even though repetitive embedding improves the BER performance, it increases the transmit data size.
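
Under the assumption that the copies are laid out by simple $K \times K$ tiling (the exact geometry follows Fig. 3), the repetition step can be sketched as:

```python
import numpy as np

def repeat_block(bits, K=2):
    """Tile an L-bit block K times along each embedding direction (K*K copies)."""
    block = np.atleast_2d(bits)          # e.g. L = 4 bits as a 1 x 4 block
    return np.tile(block, (K, K))        # horizontal, vertical, and diagonal copies

print(repeat_block([1, 0, 1, 1], K=2))
# [[1 0 1 1 1 0 1 1]
#  [1 0 1 1 1 0 1 1]]
```

For $K=2$ this yields four copies of the original block, matching the repetition factor used in the experiments of Section 4.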

Fig. 3. Example of repetitive data embedding when $L=4$ and $K=2$: (a) the general procedure, and (b) a walkthrough example.

3.3 Orthogonal and diagonal embedding

In this study, we propose and evaluate two data embedding methods: orthogonal (E1) and diagonal (E2). Orthogonal data embedding places the data along an orthogonal direction of the image. Let us consider a $15 \times 15$p image. Figure 4(a) illustrates orthogonal data embedding, where the white region indicates the embedded data; the data are embedded in the mid-frequency region of the image at row 9. The total number of embedded data bits, $N_o$, is calculated as follows:

$$N_o=n \cdot 4 \quad \text{[bits]}.$$

Fig. 4. An example of data embedding at mid frequency, $n=9$: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

Figure 4(b) depicts diagonal data embedding where the data are embedded in a diagonal direction. The total number of embedded bits in diagonal data embedding is computed as

$$N_d = n \cdot 4 \quad \text{[bits]}.$$

For a fair comparison, we adjusted the number of bits to be the same. Note that as the embedding position, $n$, increases, the total number of data-embedded bits increases proportionally.
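
Because the text does not list the exact coefficient index sets behind $N_o = N_d = 4n$, the masks below are purely schematic: they mark one row for the orthogonal pattern and one anti-diagonal for the diagonal pattern in a $15 \times 15$ frame at $n=9$, simply to visualize the two embedding directions. Note that the marked counts do not equal $4n$; the true index sets follow Fig. 4.

```python
import numpy as np

P = 15                                   # 15 x 15 frequency-domain image
n = 9                                    # mid-frequency embedding position

# E1 (orthogonal, assumed pattern): coefficients along row n.
E1 = np.zeros((P, P), dtype=int)
E1[n, :] = 1

# E2 (diagonal, assumed pattern): the anti-diagonal u + v = n.
E2 = (np.add.outer(np.arange(P), np.arange(P)) == n).astype(int)

print(E1.sum(), E2.sum())                # marked coefficients: 15 and 10 here
```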

3.4 Data extraction

After receiving the images at the camera, the first step involves correcting image distortions. These distortions are caused by various factors such as camera angle and distance between transmitter and receiver. In this study, we do not treat image distortions as a separate concern; instead, we solely focus on the AWGN occurring at the camera [20]. Considering the above scenario, the received data-embedded image, $\mathbf {Y}_t$, and the received reference image at the camera, $\mathbf {Z}_t$, can be represented as

$$\mathbf{Y}_t = \mathbf{D}_t + \mathbf{N}_{t_1}$$
and
$$\mathbf{Z}_t = \mathbf{I}_{t(ref)} + \mathbf{N}_{t_2},$$
where $\mathbf {N}_{t_1}$ and $\mathbf {N}_{t_2}$ are two different AWGN matrices affecting the reception of data-embedded and reference frames, respectively.

The received spatial domain images are converted to the spectral domain using the 2D-DCT operation (cf., Fig. 1). After the 2D-DCT operation, the above equations are transformed as follows:

$$\mathbf{Y}_F = \mathbf{D}_F + \mathbf{N}_{F_1} = \mathbf{I}_F + \mathbf{X}_{D} + \mathbf{N}_{F_1}$$
and
$$\mathbf{Z}_F = \mathbf{I}_{F(ref)} + \mathbf{N}_{F_2}.$$

At this point, a subtraction-based data retrieval process is applied to obtain the estimated data matrix as

$$\hat{\mathbf{X}}_D = \mathbf{Y}_F - \mathbf{Z}_F = \left( \mathbf{I}_F + \mathbf{X}_{D} + \mathbf{N}_{F_1} \right) - \left( \mathbf{I}_{F(ref)} + \mathbf{N}_{F_2} \right).$$

Finally, the original data can be demodulated from the estimated data matrix as follows:

$$\hat{b} = \begin{cases} 0, & \hat{d} < 0 \\ 1, & \hat{d} \geq 0 \end{cases}.$$
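
The receiver-side chain described above can be sketched as follows (an illustrative implementation; the embedding index arrays `rows` and `cols` are assumed to be known at the receiver):

```python
import numpy as np
from scipy.fft import dctn

def extract_bits(Y_t, Z_t, rows, cols):
    """Recover hard-decision bits from a captured data frame / reference frame pair."""
    Y_F = dctn(np.asarray(Y_t, dtype=float), norm='ortho')   # received data-embedded frame
    Z_F = dctn(np.asarray(Z_t, dtype=float), norm='ortho')   # received reference frame
    X_hat = Y_F - Z_F                                        # estimated data matrix
    d_hat = X_hat[rows, cols]                                # symbols at the embedding positions
    return (d_hat >= 0).astype(int)                          # BPSK threshold: b = 1 if d >= 0
```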

4. Experiment results

4.1 Experiment setup

In this study, we used the $512 \times 512$p color Lena image shown in Fig. 5(a). The choice to utilize the Lena image was driven by several factors. First, it is a standard and historically significant test image in the field of image processing and serves as a benchmark in the academic community. Second, it is widely recognized for its rich textural details and color depth, making it an ideal test case for evaluating our proposed data embedding schemes. For instance, it has light and dark regions, sharp and blurred segments, and distinct color gradients, which makes it an excellent candidate for our analysis. Lastly, in our previous studies related to DFC, we have used the Lena image, which facilitates direct comparison and continuity, thereby enhancing the comprehensibility and relevance of our current research findings. The data are embedded in the sub-bands of the frequency-domain image. Figures 5(b) and 5(c) show the locations of the embedded data as white regions (sub-bands). The data are embedded in three sub-bands, namely the low-sub-band (LSB), mid-sub-band (MSB), and high-sub-band (HSB). The start and end rows for the LSB, MSB, and HSB are rows 57-60, 257-260, and 457-500, respectively. It should be noted that the same data are embedded in all three channels, red (R), green (G), and blue (B), of the Lena image. The number of data repetitions was set at four with $K=2$, primarily motivated by the desire to strike a favorable balance between error correction capability and transmit data size. We evaluated the performance of the proposed data-embedding methods in terms of BER and PSNR. In addition, we compared the effectiveness of our proposed data-embedding method with and without repetition coding.

Fig. 5. The original Lena image and locations of sub-bands for the two data-embedding approaches: (a) the original $512 \times 512$p Lena image, (b) the sub-bands for orthogonal data embedding, and (c) the sub-bands for diagonal data embedding.

4.2 PSNR as a function of $\alpha$ and data allocation positions

Figure 6 compares the PSNR for the various sub-bands in which the data can be embedded. The upper panels show the spatial-domain images when the data are embedded using the orthogonal method, and the lower panels show the images when the data are embedded using the diagonal method. We show and compare the cases where the data are embedded in the LSB, MSB, and HSB.
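
PSNR here follows the standard definition for images (the paper does not restate its formula); a minimal sketch, assuming 8-bit pixel values with a peak of 255 and averaging the mean squared error over all pixels and color channels, is:

```python
import numpy as np

def psnr(original, embedded, peak=255.0):
    """Peak signal-to-noise ratio in dB between the original and data-embedded images."""
    mse = np.mean((np.asarray(original, dtype=float) - np.asarray(embedded, dtype=float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```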

Fig. 6. Data-embedded images for different methods and frequency bands, with PSNR for $\alpha =0.7$.

In both methods, as the embedding region shifted from the LSB to the HSB, the PSNR of the image increased, lessening the visual artifacts. We can observe that Figs. 6(a) and 6(d) show some visual artifacts compared to Figs. 6(b) and 6(e), respectively. This is because, after the 2D-DCT, coefficients closer to the LSB region contain a greater amount of important image information; therefore, embedding data in the LSB region results in greater degradation of the image. Moving towards the MSB and HSB regions, which contain less visual information, data embedding does not result in visible artifacts. Overall, the E1 method demonstrated a slightly higher PSNR than E2, indicating better image quality.

Figure 7 presents comparisons of PSNR for different values of $\alpha$ when the data allocation position is fixed to the mid-frequency range, that is, the MSB, for both the E1 and E2 methods. We can see that when the data embedding position is fixed, an increase in $\alpha$ results in a decrease in PSNR. This is because, when data are embedded in a frequency-domain image and the image is converted back to the spatial domain, the embedded energy is spread across the image; for small values of $\alpha$, the resulting per-pixel perturbation is small, so a better PSNR is observed. On the other hand, as $\alpha$ increased, the embedded data had a significant impact in the spatial domain, resulting in a low PSNR. In addition, we can see that the E1 method exhibits a slightly higher PSNR than the E2 method for the same data-embedding positions and values of $\alpha$.

Fig. 7. PSNR values for the E1 and E2 methods by varying $\alpha$ when the data allocation position is fixed to the MSB.

Based on the above results, we conducted experiments to investigate the variations in PSNR with respect to the data allocation position and $\alpha$. Figure 8 depicts the results for PSNR versus the position of the data embedding region and the value of $\alpha$. The image used to generate the results is the same Lena image used for Fig. 7. The values of $\alpha$ were varied from 0.3 to 3. We can see that both methods showed an increase in PSNR as the data-embedding position increased; in other words, as the data-embedding region moved from low to high frequencies, the PSNR increased. On the other hand, as $\alpha$ increased, lower PSNR values were observed. In particular, up to $\alpha =1$, both methods showed a PSNR of more than 30 dB when the data embedding position was at row number 57. This can be attributed to the fact that crucial coefficients in the frequency-domain image tend to cluster around rows 50 to 60. At $\alpha =3$, there was a notable decline in PSNR, with values falling below 30 dB around row number 57. The rationale is that, as $\alpha$ increases, the likelihood of artifacts and distortions appearing when embedding data in the MSB and HSB regions increases. Furthermore, the E1 method achieved a slightly higher PSNR than the E2 method for the same positions and values of $\alpha$, indicating that the data-embedded image using the E1 method showed fewer artifacts than with E2.

Fig. 8. PSNR results versus data-embedding locations for various values of $\alpha$: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

4.3 BER performance without repetitive embedding

In Fig. 9, we evaluate the BER performance of the 2D-DFC scheme as a function of SNR. Note that because we use color images for data transmission, and the data are embedded in each channel of the image, the BER is computed by averaging the errors across the R, G, and B channels. In addition, performance is evaluated for different data allocation positions under E1 and E2 and for different values of $\alpha$. In Fig. 9(a), we can see that for $\alpha =0.3$, method E1 performed worse than method E2 in all sub-bands. In particular, using the E1 method, allocating data to the LSB resulted in a BER of 0.39 at 0 dB SNR, whereas allocating data to the MSB and HSB yielded a BER of 0.48. Thus, the diagonal method performed better than the orthogonal method. In addition, we can see that at an SNR of 30 dB, the BER in the LSB becomes zero. Furthermore, comparing all the panels, we can see that as the value of $\alpha$ increased, data embedding in the LSB performed better in both methods, finally reaching a BER close to zero in Fig. 9(d). In particular, for $\alpha =3$, both E1 and E2 achieved a BER of 0.1 or less for data embedding in the LSB. In Fig. 9(d), data embedding in the MSB and HSB also performed better than at lower values of $\alpha$. Overall, in these figures, the E2 method performed better than E1, and using the LSB worked better than using the MSB and HSB. Therefore, although increasing $\alpha$ raises the amplitude of the embedded data and thereby improves the BER, the MSB and HSB showed limited improvement compared to the LSB. Additionally, for both the E1 and E2 methods, the BER was lowest in the LSB. This is primarily because the LSB has larger frequency coefficients than the MSB and HSB; as a result, the LSB exhibits relatively larger data amplitudes, making it advantageous for data extraction.
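
As noted above, the reported BER averages the per-channel errors; a minimal sketch of this measurement, assuming the same bit sequence is embedded in each of the R, G, and B channels, is:

```python
import numpy as np

def ber_rgb(tx_bits, rx_bits_r, rx_bits_g, rx_bits_b):
    """Bit error rate averaged over the R, G, and B channels."""
    tx = np.asarray(tx_bits)
    errors = [np.mean(np.asarray(rx) != tx) for rx in (rx_bits_r, rx_bits_g, rx_bits_b)]
    return float(np.mean(errors))
```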

Fig. 9. BER performance versus SNR for different values of $\alpha$, embedding methods, and sub-band positions.

4.4 BER performance with repetitive embedding

Figure 10 shows the performance of the proposed 2D-DFC scheme when using the repetitive data embedding approach; that is, the same set of data was repeated four times during data allocation, and the BER was measured using the average of those four data sets. Similar to Fig. 9, we see better BER values with increasing values of $\alpha$. However, the improvement is much more significant in the LSB than in the MSB and HSB, primarily for the reasons mentioned above. More importantly, the BER performance was better than in Fig. 9 due to repetitive data embedding, particularly in the LSB. Comparing Fig. 9(a) and Fig. 10(a), we can see that E2 in the LSB performed better, and E2 showed more improvement than E1, indicating that the effect of repetitive data embedding is more pronounced with the diagonal method. In Fig. 10(c), an almost zero BER was achieved with E2 in the LSB. The effect of repetitive data embedding is less pronounced in the MSB and HSB. Furthermore, similar to Fig. 9, as $\alpha$ increased, the BER decreased for all data allocation positions. Overall, the E2 method outperformed E1.

Fig. 10. BER performance versus SNR for different values of $\alpha$, embedding methods, and sub-band positions when using repetitive embedding.

4.5 Comparison of BER w/ and w/o repetitive data embedding

Figures 9 and 10 allow us to compare the effectiveness of the E1 and E2 methods based on repetitive data embedding. Figure 11 illustrates a direct comparison of repetitive data embedding for all data allocation positions. The ‘x’ indicates that repetitive data embedding was not applied, whereas ‘o’ indicates it was applied. We can see in Fig. 11(a) that the BER performance under E1 did not significantly improve after repetitive data embedding. However, Fig. 11(b) shows great improvement. Therefore, we can say that the effect of repetitive data embedding is more pronounced in diagonal embedding. Note that increasing the repetitive data embedding factor may improve performance, but it will result in redundancy in the data. Overall, we can deduce that repetitive data embedding has a significant impact on the proposed scheme, particularly in the case of diagonal data embedding, improving error reduction and robustness. However, it often results in lower data transmission rates because of the redundant bits. This trade-off needs to be carefully considered in the design of 2D-DFC systems.

Fig. 11. BER results based on the application of repetitive data embedding with a fixed value of $\alpha =0.7$: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

4.6 BER performance according to embedding position

In this section, our focus is the BER as a function of data allocation position and the value of $\alpha$, with the SNR fixed at 30 dB. Figure 12 illustrates the BER performance. First, for both the E1 and E2 methods, the BER increased with increasing position, because, on average, the frequency coefficients become smaller, resulting in proportionally reduced data amplitudes. Second, an increase in $\alpha$ improved the BER performance: a larger $\alpha$ increases the data amplitude, which mitigates the effect of the reduced frequency coefficients at higher embedding positions and leads to a lower BER. Moreover, applying repetition coding further reduced the BER by averaging the four embedded copies of the data. Comparing Figs. 12(a) and 12(b), we can see that the E2 method performed better, showing a zero BER up to row number 150 regardless of the value of $\alpha$ and the use of repetitive data embedding. Even when the position moved beyond row 150, the BER degradation was less significant than with E1.

Fig. 12. BER results based on data allocation positions, the value of $\alpha$, and use of the data repetition scheme at a fixed SNR of 30 dB: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

4.7 BER performance according to PSNR

Next, we conducted experiments at a fixed SNR of 30 dB without repetitive data embedding and computed the BER as a function of PSNR. Figure 13 depicts the results for both methods. First, we can see that increasing the PSNR increased the BER. Second, for a given $\alpha$, E1 exhibited a higher BER than E2. Table 1 summarizes the key results in Fig. 13.

Fig. 13. BER versus PSNR for different values of $\alpha$ at SNR = 30 dB without repetitive data embedding: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

Table 1. The PSNR results in Fig. 13.

Next, we simulated the BER according to PSNR when repetitive data embedding was employed; Fig. 14 depicts the results. Although the maximum PSNR values are the same as in Fig. 13, the BER decreased due to repetitive data embedding. Table 2 summarizes the results in Fig. 14. Again, the E2 method outperformed the E1 data-embedding mechanism.

Fig. 14. BER versus PSNR for different values of $\alpha$ at SNR = 30 dB with repetitive data embedding: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

Table 2. PSNR and BER in Fig. 14.

Table 3 provides a comparative summary of PSNR values at which the BER was zero. Again, applying repetition coding improved the performance of the scheme. For instance, with $\alpha =0.3$ and by using method E2, zero BER was reached at 53.7 dB when using repetition, compared to 50 dB when no repetition was used. In addition, as $\alpha$ increased, E1 and E2 both exhibited decreasing PSNR values. Therefore, we can say that the ideal value of $\alpha$ is in the range 0 to 1. Increasing $\alpha$ beyond 1 significantly reduced the PSNR.

Table 3. PSNR with and without repetition coding for zero BER at a 30 dB SNR.

4.8 BER performance for an alternative input image

In this section, to validate the robustness of the proposed methodologies, an alternative image is used as the input on the screen. Figure 15 shows the original image alongside the corresponding BER outcomes. The Mandrill image is used, which has a lower PSNR than the Lena image for both methods at $\alpha =0.7$. Compared to Fig. 11(a), we can observe in Fig. 15(b) that BER performance is marginally improved across all sub-bands when employing repetition coding. Without repetition coding, performance improves in the HSB and MSB, while the LSB exhibits an elevated BER, particularly near 0 dB. In addition, with the diagonal embedding method, performance is improved in the MSB but degraded in the HSB and LSB. Since we are mostly interested in the MSB, it is noteworthy that performance improves and converges to a BER of zero above an SNR of 15 dB. Altogether, this shows that the proposed methods are robust irrespective of the input image. The slight improvements and degradations observed are due to the different frequency characteristics of the images, while the overall BER trend remains the same.

Fig. 15. The input Mandrill image and the corresponding BER performances with a fixed value of $\alpha =0.7$: (a) the original $512 \times 512$p Mandrill image, (b) orthogonal data embedding (E1), and (c) diagonal data embedding (E2).

5. Conclusions

In this study, we proposed two data embedding schemes for 2D-DFC systems: orthogonal (E1) and diagonal (E2). The orthogonal method embeds data orthogonally in an image, whereas the diagonal method embeds data diagonally in an image frame. The simulation results indicate that the orthogonal method performed better than the diagonal method in terms of PSNR. However, when considering the BER, the diagonal method exhibited better performance. Regarding the proportionality constant $\alpha$, higher values within the range (0,1) are preferable, as they provide a lower BER while maintaining a higher PSNR. Regarding data allocation positions, without repetition coding the E1 method achieved a BER close to zero only in the LSB, while MSB and HSB data embedding showed worse BER performance. In addition, with the diagonal method, applying repetition coding led to a significant BER improvement. Overall, our findings suggest the diagonal method is more suitable for 2D-DFC implementation because it maintains a reasonable PSNR while delivering superior BER performance compared to the orthogonal method. In future work, the diagonal method is expected to show enhanced performance by utilizing algorithms such as zigzag scanning and optimization based on the characteristics of the 2D-DCT. We also plan to consider changes due to image scaling and to test in real-world environments using multiple images.

Funding

National Research Foundation of Korea (2022R1G1A1004799); Korea Institute for Advancement of Technology (P0017011).

Disclosures

The authors declare no conflicts of interest.

Data availability

No data were generated or analyzed in the presented research.

References

1. B. W. Kim, H.-C. Kim, and S.-Y. Jung, “Display field communication: Fundamental design and performance analysis,” J. Lightwave Technol. 33(24), 5269–5277 (2015). [CrossRef]  

2. A. Wang, Z. Li, C. Peng, et al., “Inframe++: Achieve simultaneous screen-human viewing and hidden screen-camera communication,” in Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services, (2015), pp. 181–195.

3. K. Jo, M. Gupta, and S. K. Nayar, “DisCo: Display-camera communication using rolling shutter sensors,” ACM Trans. Graph. 35(5), 1–13 (2016). [CrossRef]  

4. C. Chen, W. Huang, L. Zhang, et al., “Robust and unobtrusive display-to-camera communications via blue channel embedding,” IEEE Trans. on Image Process. 28(1), 156–169 (2018). [CrossRef]  

5. T. Nguyen, M. D. Thieu, and Y. M. Jang, “2D-OFDM for optical camera communication: Principle and implementation,” IEEE Access 7, 29405–29424 (2019). [CrossRef]  

6. J. Zhao and X.-Y. Li, “SCsec: A secure near field communication system via screen camera communication,” IEEE Trans. on Mobile Comput. 19(8), 1943–1955 (2019). [CrossRef]  

7. M. Guri, D. Bykhovsky, and Y. Elovici, “Brightness: Leaking sensitive data from air-gapped workstations via screen brightness,” in 2019 12th CMI Conference on Cybersecurity and Privacy (CMI), (IEEE, 2019), pp. 1–6.

8. G. J. Garateguy, G. R. Arce, D. L. Lau, et al., “QR images: optimized image embedding in QR codes,” IEEE Trans. on Image Process. 23(7), 2842–2853 (2014). [CrossRef]  

9. S. Ahlawat, C. Rana, and R. Sindhu, “A review on QR codes: Colored and image embedded,” Int. J. Adv. Res. Comput. Sci. 8(5), 410–413 (2017). [CrossRef]  

10. K. Pena-Pena, D. L. Lau, A. J. Arce, et al., “QRnet: fast learning-based QR code image embedding,” Multimed. Tools Appl. 81(8), 10653–10672 (2022). [CrossRef]  

11. T. Li, C. An, A. Campbell, et al., “HiLight: Hiding bits in pixel translucency changes,” in Proceedings of the 1st ACM MobiCom Workshop on Visible Light Communication Systems, (2014), pp. 45–50.

12. V. Nguyen, Y. Tang, A. Ashok, et al., “High-rate flicker-free screen-camera communication with spatially adaptive embedding,” in IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on Computer Communications, (IEEE, 2016), pp. 1–9.

13. K. Zhang, C. Wu, C. Yang, et al., “Chromacode: A fully imperceptible screen-camera communication system,” in Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, (2018), pp. 575–590.

14. V. Tran, G. Jayatilaka, A. Ashok, et al., “Deeplight: Robust & unobtrusive real-time screen-camera communication for real-world displays,” in Proceedings of the 20th International Conference on Information Processing in Sensor Networks (co-located with CPS-IoT Week 2021), (2021), pp. 238–253.

15. K. Qian, Y. Lu, Z. Yang, et al., “AirCode: Hidden screen-camera communication on an invisible and inaudible dual channel,” in 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21), (2021), pp. 457–470.

16. H. Fang, D. Chen, F. Wang, et al., “TERA: Screen-to-camera image code with transparency, efficiency, robustness and adaptability,” IEEE Trans. Multimedia 24, 955–967 (2021). [CrossRef]  

17. J. Xu, J. Klein, J. Jochims, et al., “A reliable and unobtrusive approach to display area detection for imperceptible display camera communication,” J. Vis. Commun. Image Represent. 85, 103510 (2022). [CrossRef]  

18. T. K. Tsui, X.-P. Zhang, and D. Androutsos, “Color image watermarking using multidimensional Fourier transforms,” IEEE Trans. Inf. Forensics Secur. 3(1), 16–28 (2008). [CrossRef]  

19. S. Tsai, K. Liu, S. Yang, et al., “An efficient image watermarking method based on fast discrete cosine transform algorithm,” Math. Probl. Eng. 2017, 1–10 (2017). [CrossRef]  

20. S.-Y. Jung, H.-C. Kim, and B. W. Kim, “Implementation of two-dimensional display field communications for enhancing the achievable data rate in smart-contents transmission,” Displays 55, 31–37 (2018). [CrossRef]  

21. L. D. Tamang and B. W. Kim, “Spectral domain-based data-embedding mechanisms for display-to-camera communication,” Electronics 10(4), 468 (2021). [CrossRef]  

22. Y.-J. Kim, P. Singh, and S.-Y. Jung, “Experimental evaluation of display field communication based on machine learning and modem design,” Appl. Sci. 12(23), 12226 (2022). [CrossRef]  
