
Performance evaluation of data embedding schemes for two-dimensional display field communication

Open Access

Abstract

Display field communication (DFC) is an unobtrusive display-to-camera technology that transmits data within the frequency domain of images, ensuring that the embedded data are hidden and do not disrupt the viewing experience. The display embeds data into image frames, while the receiver captures the displayed frames and extracts the data. Two-dimensional DFC (2D-DFC) embeds data along both the width and the height of an image. This study explores two methods to minimize the error rate in 2D-DFC without affecting the quality of the displayed image. The orthogonal method embeds data in the orthogonal direction of an image, whereas the diagonal method strategically embeds the data in the diagonal direction. Experiments show the diagonal method maintains a high peak signal-to-noise ratio and surpasses the orthogonal embedding method in terms of bit error rate. 2D-DFC is expected to have practical applications in digital signage, in advertising and informational displays at airports and train stations, and in large-scale displays for events, sports arenas, and performance venues.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Display field communication (DFC) [1] is a subcategory of unobtrusive wireless display-to-camera (D2C) communication technologies [2–4] that use digital displays to transmit data to cameras. It is based on OFDM-like modulation [5] to deliver data streams without interfering with the viewing experience. In other words, D2C communication makes full use of spatial and temporal diversity, meaning it can modulate multiple pixels on a screen to facilitate high-rate data communication. Additionally, it does not require any modifications to standard off-the-shelf screen and camera devices, so it is easily integrated with existing infrastructure. Consequently, D2C communication has become a highly discussed and explored area of optical technology. In comparison to RF communications, D2C communication offers several distinct advantages. First, it establishes rapid connections without the need for explicit network setup, a challenge often faced by WiFi and Bluetooth. Second, the RF spectrum is heavily congested, whereas D2C communication enjoys a significant amount of available spectrum, operates under independent regulations, and naturally supports one-to-many communication. Lastly, it inherently provides security benefits [6,7]: unlike RF signals that can penetrate walls, D2C links are confined to well-defined coverage zones. Consequently, it paves the way for numerous innovative applications in fields such as mobile communications, digital advertising, smart homes, and intelligent transportation systems. D2C communication can be thought of as a replacement technology for 2D barcodes and QR codes, which also transfer data to a camera. Note that QR codes are widely used today, and various efforts are underway to advance them [8–10].

The pioneering work in hidden D2C communications capitalizes on the temporal-spatial flicker-fusion property [2], and employs modulation techniques resembling CDMA to transmit data streams without causing disruptions in the viewing experience. Another approach, HiLight [11], takes advantage of the orthogonal transparency (alpha) channel to embed bits as changes in pixel translucency without altering pixel color values. TextureCode [12] enhances invisibility by adaptively embedding data based on video texture. ChromaCode [13] further enhances code invisibility by adjusting lightness in a uniform color space and achieves complete imperceptibility. DeepLight [14] elevates the reliable data transmission rates of concealed D2C communication in various real-world conditions by selectively modulating the intensity of the blue channel and incorporating machine learning (ML) models into the decoding process. For increased data rates, AirCode [15] leverages the complementary advantages of video and audio channels, and integrates visual odometry for precise screen detection. Along similar lines, a screen-to-camera image code was proposed that adopts the color decomposition principle to ensure data embedding efficiency [16]. A BCH coding-based data arrangement and an attention-guided data decoding network are designed to guarantee robustness and adaptability. More recently, an innovative approach was presented to detect the display area in camera images that involves the incorporation of a novel localization marker into the corners of the display [17]. This marker, though less obtrusive than conventional fiducial markers, has proven to be highly reliable across various types of display content and backgrounds, as demonstrated through both simulations and experimental results.

DFC is D2C communication in which the data are embedded in the frequency domain of an image. Unlike spatial domain techniques, data embedding in the frequency domain makes the data less noticeable to the human eye [18,19]. DFC transforms spatial domain images to the frequency domain using frequency transformation techniques. The data are embedded into the frequency-domain image, which is then converted back to the spatial domain to be displayed on screen. This means that data can be transmitted via the D2C link while the image sequences are displayed for their original purpose. The proposed scheme embeds data in a designated spectral band so the hidden data are not revealed on normal displays. Embedding data in two dimensions of an image frame results in two-dimensional DFC (2D-DFC) [20]. It has been shown that 2D-DFC achieves better data rates than 1D-DFC. The initial versions of DFC used the discrete Fourier transform (DFT) as the frequency transformation technique, which restricts data embedding due to its inherent conjugate symmetry [1]. This limitation was removed by using the discrete cosine transform (DCT), which does not have an imaginary component [21,22]. In this study, we consider 2D-DFC employing the 2D-DCT, which serves as a better approach to enhance the data rate of the system. The DCT, being a real-valued transform, does not suffer from the ringing artifacts that are sometimes introduced by imaginary components. The 2D-DCT also places greater emphasis on vital image information within the lower frequencies than the 1D-DCT. Another advantage of the DCT is that common image compression algorithms like JPEG utilize the 2D-DCT; hence, our transformation technique is aligned with the compression applied when the camera captures the image. In the following, we propose two data embedding mechanisms: a modified orthogonal method and a diagonal method. Both schemes are evaluated for different system design parameters. Simulation results reveal that diagonal data embedding performs better than orthogonal data embedding in terms of bit error rate (BER), while maintaining a high peak signal-to-noise ratio (PSNR).

The rest of this paper is organized as follows. Section 2 presents the 2D-DCT-based DFC system model. Section 3 presents the proposed data embedding mechanisms and the power allocation scheme. In addition, data repetition and data extraction are explained. Section 4 presents the experiment setup and the simulation results in terms of BER and PSNR. Finally, the study concludes in Section 5.

2. 2D-DCT based DFC

2.1 2D-DFC system model

A fundamental DFC system comprises a display pointed towards a camera. Figure 1 shows a block diagram of a 2D-DFC system. At the transmitter, the input image, $\mathbf {I}_t$, is first converted into its frequency domain equivalent, $\mathbf {I}_F$, using the 2D-DCT operation. At the same time, binary input data ($\mathbf {b}$) are modulated, and based on the average power of the frequency domain coefficients, are embedded into the frequency-domain image. The embedding process is done via the addition allocator. The data are embedded in specific frequency regions (sub-bands) of the image. The modulated and power-allocated data, $\mathbf {X}$, are embedded into the image resulting in data-embedded image $\mathbf {D}_F$. The frequency-domain data-embedded image is then converted back to the spatial domain to be displayed on the screen. This is achieved via the inverse 2D-DCT operation resulting in spatial domain image $\mathbf {D}_t$. Note that the reference image, $\mathbf {I}_{t(ref)}$, which is not data embedded, is also rendered alternately on screen. By inserting the reference image between the neighboring data-embedded images, image artifacts that might be visible to the human eye can be minimized. In addition, the reference images help the camera receiver decode the data.

Fig. 1. Block diagram of a 2D-DFC system.

2.2 2D-DCT frequency coefficients

As data embedding is performed on the frequency coefficients of an image and our proposed 2D-DFC schemes exploit frequency domain characteristics of the image, it is important to understand the properties of the 2D-DCT operation. 2D-DCT changes the image from its spatial domain (i.e., the pixel values that make up the image) into the frequency domain. Each coefficient in the resulting DCT image represents the amplitude of a specific spatial frequency component. 2D-DCT on image $\mathbf {I}_t$ sized at $P \times Q$ is performed as follows:

$$\mathbf{I}_F (u,v) = C(u) C(v) \sum_{p=0}^{P-1}\sum_{q=0}^{Q-1} \mathbf{I}_t(p,q) \cos\frac{\pi (2p+1)u}{2P} \cos\frac{\pi (2q+1)v}{2Q}, 0 \leq u \leq P-1, 0 \leq v \leq Q-1,$$
where $C(u)$ and $C(v)$ are scaling factors, defined as
$$C(u) = \begin{cases} \frac{1}{\sqrt{P}}, & u = 0 \\ \sqrt{\frac{2}{P}}, & 1 \leq u \leq P-1 \end{cases}$$
and
$$C(v) = \begin{cases} \frac{1}{\sqrt{Q}}, & v = 0 \\ \sqrt{\frac{2}{Q}}, & 1 \leq v \leq Q-1 \end{cases}.$$

$\mathbf {I}_F (u,v)$ represents the DCT coefficient value at position $(u,v)$ in frequency-domain image $\mathbf {I}_F$. The scaling factors are used to adjust the magnitude of the DCT coefficients, and they ensure the coefficients are normalized properly. Note that $C(v)$ follows the same pattern as $C(u)$. The resulting frequency domain image represents different spatial frequencies present in the image frame.
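
As a concrete illustration (our sketch, not code from the original experiments), the orthonormal DCT-II provided by SciPy applies exactly the scaling factors $C(u)$ and $C(v)$ defined above, so the forward and inverse transforms used throughout this paper can be reproduced as follows:

```python
import numpy as np
from scipy.fft import dctn, idctn

def forward_2d_dct(I_t):
    """Spatial-domain image (P x Q) -> frequency-domain coefficients I_F."""
    return dctn(np.asarray(I_t, dtype=float), type=2, norm='ortho')

def inverse_2d_dct(I_F):
    """Frequency-domain coefficients I_F -> spatial-domain image."""
    return idctn(I_F, type=2, norm='ortho')

# Round-trip check on a random 15 x 15 block.
I_t = np.random.rand(15, 15)
assert np.allclose(inverse_2d_dct(forward_2d_dct(I_t)), I_t)
```

The round-trip assertion confirms that the orthonormal inverse 2D-DCT recovers the original spatial-domain image exactly, which is the property the receiver relies on.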

In Fig. 2, the red region corresponds to the mean DC value, which represents the average of all the pixels. Around it, we see the distribution of low-, medium-, and high-frequency components of an image. Observe that the amplitudes of the low-frequency components are located at the top left corner of the 2D spectrum, while high frequencies are located at the bottom right corner. The 2D-DFC scheme exploits the fact that image information is concentrated in certain regions of the frequency-domain image. On the image plane, gradual changes in color can be described as low-frequency variations, while rapid color transitions can be considered high-frequency variations. In other words, the lower-frequency coefficients generally capture the image’s main features, while the higher-frequency coefficients represent finer details and textures. Generally, the low-frequency content holds the most crucial visual information perceived by the human eye. When passing through the D2C channel, high-frequency noise gets added to the image, which makes it difficult to extract data from the high-frequency region. Moreover, the amplitudes of high-frequency components are lower than those of the mid-frequency components. Consequently, for the 2D-DFC method, it is preferable to load data in the mid-frequency region.
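
To make the band structure concrete, the short sketch below partitions the coefficients of a $15 \times 15$ 2D-DCT block by the index sum $u+v$. The specific thresholds are illustrative assumptions only; the experiments in Section 4.1 define the actual sub-bands by row index.

```python
import numpy as np

# Illustrative partition of a 15 x 15 2D-DCT coefficient grid.
# The thresholds on u + v are assumptions for visualization only.
P = Q = 15
u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing='ij')
index_sum = u + v                        # small near the DC term (top-left corner)

band = np.full((P, Q), 'high', dtype=object)
band[index_sum <= 18] = 'mid'            # assumed mid-frequency region
band[index_sum <= 8] = 'low'             # assumed low-frequency region
band[0, 0] = 'dc'                        # mean (DC) coefficient

print(band[0, 1], band[7, 7], band[14, 14])   # low  mid  high
```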

Fig. 2. Frequency component distributions of a $15 \times 15$p spectral image after the 2D-DCT operation.

3. Proposed data embedding and extraction

3.1 Power allocation

Power allocation is performed because it provides advantages from the perspective of PSNR; in addition, it helps in decoding the data. For modulation, we use binary phase-shift keying (BPSK), as it is a robust and fundamental modulation scheme. Binary input data $b$ are first BPSK modulated before being embedded into the image frames. Data mapping is performed as follows:

$$d = \begin{cases} -1, & b=0 \\ 1, & b=1 \end{cases}.$$

After modulation, power allocation is applied to the data. The average power of the data-embedding region of the frequency-domain image, $P_{\text {avg}}$, is first obtained and then scaled by a proportionality constant, $\alpha$ [22]. Therefore, the power-allocated version of the modulated data is represented as

$$d_{\text{amp}} = X_{\text{amp}} \cdot d,$$
where $X_{\text {amp}}$ is determined as follows:
$$X_{\text{amp}} = \left( \sqrt{P_{\text{avg}}} \cdot \alpha \right).$$

The value of $\alpha$ varies in the range $(0,3)$. Now, let us consider a data matrix $\textbf {X}$ of the same size as $\textbf {I}_F$. Data vector $d_{\text {amp}}$ is inserted into data matrix $\textbf {X}$ using a unique pattern to form data matrix $\textbf {X}_D$. This data insertion mechanism is explained in the next section. Once we have the final data matrix $\textbf {X}_D$, we add it to the frequency-domain image $\textbf {I}_F$ via the addition allocation:

$$\textbf{D}_F = \textbf{I}_F + \textbf{X}_D,$$
where $\textbf {D}_F$ is the final data-embedded image in the frequency domain.
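
A minimal sketch of this modulation, power allocation, and additive embedding chain is given below. The index arrays `rows` and `cols` (the embedding pattern) and the use of the mean squared coefficient value as the average power are our assumptions for illustration.

```python
import numpy as np

def embed(I_F, bits, rows, cols, alpha=0.7):
    """Add power-scaled BPSK symbols to selected DCT coefficients of I_F.

    bits, rows, and cols must have the same length (one bit per position).
    """
    d = 2.0 * np.asarray(bits, dtype=float) - 1.0   # BPSK mapping: 0 -> -1, 1 -> +1
    region = I_F[rows, cols]                        # coefficients of the embedding region
    P_avg = np.mean(region ** 2)                    # average power of the region (assumed definition)
    X_amp = np.sqrt(P_avg) * alpha                  # power-allocation amplitude
    X_D = np.zeros_like(I_F)
    X_D[rows, cols] = X_amp * d                     # place d_amp following the chosen pattern
    D_F = I_F + X_D                                 # additive embedding: D_F = I_F + X_D
    return D_F, X_amp
```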

3.2 Repetitive data embedding

In this study, we employ repetitive data embedding, a technique in which the same data are repeatedly embedded in the image frame. This adds redundancy and robustness, since data might be lost or corrupted during transmission, and thereby improves error resilience. By embedding the same information multiple times, the chances of successfully recovering the data increase, even if parts are lost or corrupted due to noise, compression, or other factors. Considering data of length $L$ repeated $K$ times, we can achieve data repetition as shown in Fig. 3. An example of data repetition with $L=4$ and $K=2$ is shown in Fig. 3(a). As seen, the data are repeated $K-1$ times in the horizontal, vertical, and diagonal directions. An example with actual data is shown in Fig. 3(b). Note that even though repetitive embedding improves the BER performance, it increases the transmit data size.
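
Under the assumption that the copies are laid out by simple $K \times K$ tiling (the exact geometry follows Fig. 3), the repetition step can be sketched as:

```python
import numpy as np

def repeat_block(bits, K=2):
    """Tile an L-bit block K times along each embedding direction (K*K copies)."""
    block = np.atleast_2d(bits)          # e.g. L = 4 bits as a 1 x 4 block
    return np.tile(block, (K, K))        # horizontal, vertical, and diagonal copies

print(repeat_block([1, 0, 1, 1], K=2))
# [[1 0 1 1 1 0 1 1]
#  [1 0 1 1 1 0 1 1]]
```

For $K=2$ this yields four copies of the original block, matching the repetition factor used in the experiments of Section 4.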

Fig. 3. Example of repetitive data embedding when $L=4$ and $K=2$: (a) the general procedure, and (b) a walkthrough example.

3.3 Orthogonal and diagonal embedding

In this study, we propose and evaluate two data embedding methods: orthogonal (E1) and diagonal (E2). Orthogonal data embedding places the data along an orthogonal direction of the image. Let us consider a $15 \times 15$p image. Figure 4(a) illustrates orthogonal data embedding, where the white region indicates the embedded data; the data are embedded in the mid-frequency region of the image at row 9. The total number of embedded data bits, $N_o$, is calculated as follows:

$$N_o=n \cdot 4 \quad \text{[bits]}.$$

Fig. 4. An example of data embedding at mid frequency, $n=9$: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

Figure 4(b) depicts diagonal data embedding where the data are embedded in a diagonal direction. The total number of embedded bits in diagonal data embedding is computed as

$$N_d = n \cdot 4 \quad \text{[bits]}.$$

For a fair comparison, we adjusted the number of bits to be the same. Note that as the embedding position, $n$, increases, the total number of data-embedded bits increases proportionally.
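
Because the text does not list the exact coefficient index sets behind $N_o = N_d = 4n$, the masks below are purely schematic: they mark one row for the orthogonal pattern and one anti-diagonal for the diagonal pattern in a $15 \times 15$ frame at $n=9$, simply to visualize the two embedding directions. Note that the marked counts do not equal $4n$; the true index sets follow Fig. 4.

```python
import numpy as np

P = 15                                   # 15 x 15 frequency-domain image
n = 9                                    # mid-frequency embedding position

# E1 (orthogonal, assumed pattern): coefficients along row n.
E1 = np.zeros((P, P), dtype=int)
E1[n, :] = 1

# E2 (diagonal, assumed pattern): the anti-diagonal u + v = n.
E2 = (np.add.outer(np.arange(P), np.arange(P)) == n).astype(int)

print(E1.sum(), E2.sum())                # marked coefficients: 15 and 10 here
```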

3.4 Data extraction

After receiving the images at the camera, the first step involves correcting image distortions. These distortions are caused by various factors such as camera angle and distance between transmitter and receiver. In this study, we do not treat image distortions as a separate concern; instead, we solely focus on the AWGN occurring at the camera [20]. Considering the above scenario, the received data-embedded image, $\mathbf {Y}_t$, and the received reference image at the camera, $\mathbf {Z}_t$, can be represented as

$$\mathbf{Y}_t = \mathbf{D}_t + \mathbf{N}_{t_1}$$
and
$$\mathbf{Z}_t = \mathbf{I}_{t(ref)} + \mathbf{N}_{t_2},$$
where $\mathbf {N}_{t_1}$ and $\mathbf {N}_{t_2}$ are two different AWGN matrices affecting the reception of data-embedded and reference frames, respectively.

The received spatial domain images are converted to the spectral domain using the 2D-DCT operation (cf., Fig. 1). After the 2D-DCT operation, the above equations are transformed as follows:

$$\mathbf{Y}_F = \mathbf{D}_F + \mathbf{N}_{F_1} = \mathbf{I}_F + \mathbf{X}_{D} + \mathbf{N}_{F_1}$$
and
$$\mathbf{Z}_F = \mathbf{I}_{F(ref)} + \mathbf{N}_{F_2}.$$

At this point, a subtraction-based data retrieval process is applied to obtain the estimated data matrix as

$$\hat{\mathbf{X}}_D = \mathbf{Y}_F - \mathbf{Z}_F = \left( \mathbf{I}_F + \mathbf{X}_{D} + \mathbf{N}_{F_1} \right) - \left( \mathbf{I}_{F(ref)} + \mathbf{N}_{F_2} \right).$$

Finally, the original data can be demodulated from the estimated data matrix as follows:

$$\hat{b} = \begin{cases} 0, & \hat{d} < 0 \\ 1, & \hat{d} \geq 0 \end{cases}.$$
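
The receiver-side chain described above can be sketched as follows (an illustrative implementation; the embedding index arrays `rows` and `cols` are assumed to be known at the receiver):

```python
import numpy as np
from scipy.fft import dctn

def extract_bits(Y_t, Z_t, rows, cols):
    """Recover hard-decision bits from a captured data frame / reference frame pair."""
    Y_F = dctn(np.asarray(Y_t, dtype=float), norm='ortho')   # received data-embedded frame
    Z_F = dctn(np.asarray(Z_t, dtype=float), norm='ortho')   # received reference frame
    X_hat = Y_F - Z_F                                        # estimated data matrix
    d_hat = X_hat[rows, cols]                                # symbols at the embedding positions
    return (d_hat >= 0).astype(int)                          # BPSK threshold: b = 1 if d >= 0
```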

4. Experiment results

4.1 Experiment setup

In this study, we used the $512 \times 512$p color Lena image shown in Fig. 5(a). The choice to utilize the Lena image was driven by several factors. First, it is a standard and historically significant test image in the field of image processing and serves as a benchmark in the academic community. Second, it is widely recognized for its rich textural details and color depth, making it an ideal test case for evaluating our proposed data embedding schemes. For instance, it has light and dark regions, sharp and blurred segments, and distinct color gradients, which makes it an excellent candidate for our analysis. Lastly, in our previous studies related to DFC, we have used the Lena image, which facilitates direct comparison and continuity, thereby enhancing the comprehensibility and relevance of our current research findings. The data are embedded in the sub-bands of the frequency-domain image. Figures 5(b) and 5(c) show the locations of the embedded data as white regions (sub-bands). The data are embedded in three sub-bands, namely the low-sub-band (LSB), mid-sub-band (MSB), and high-sub-band (HSB). The start and end rows for the LSB, MSB, and HSB are rows 57-60, 257-260, and 457-500, respectively. It should be noted that the same data are embedded in all three channels, red (R), green (G), and blue (B), of the Lena image. The number of data repetitions was set at four with $K=2$, primarily motivated by the desire to strike a favorable balance between error correction capability and transmit data size. We evaluated the performance of the proposed data-embedding methods in terms of BER and PSNR. In addition, we compared the effectiveness of our proposed data-embedding method with and without repetition coding.

Fig. 5. The original Lena image and locations of sub-bands for the two data-embedding approaches: (a) the original $512 \times 512$p Lena image, (b) the sub-bands for orthogonal data embedding, and (c) the sub-bands for diagonal data embedding.

4.2 PSNR as a function of $\alpha$ and data allocation positions

Figure 6 compares the PSNR for the various sub-bands in which the data can be embedded. The upper panels show the spatial-domain images when the data are embedded using the orthogonal method, and the lower panels show the images when the data are embedded using the diagonal method. We show and compare the cases where the data are embedded in the LSB, MSB, and HSB.
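
PSNR here follows the standard definition for images (the paper does not restate its formula); a minimal sketch, assuming 8-bit pixel values with a peak of 255 and averaging the mean squared error over all pixels and color channels, is:

```python
import numpy as np

def psnr(original, embedded, peak=255.0):
    """Peak signal-to-noise ratio in dB between the original and data-embedded images."""
    mse = np.mean((np.asarray(original, dtype=float) - np.asarray(embedded, dtype=float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```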

Fig. 6. Data-embedded images for different methods and frequency bands, with PSNR for $\alpha =0.7$.

In both methods, as the embedding region shifted from the LSB to the HSB, the PSNR of the image increased, lessening the visual artifacts. We can observe that Figs. 6(a) and 6(d) show some visual artifacts compared to Figs. 6(b) and 6(e), respectively. This is because, after the 2D-DCT, coefficients closer to the LSB region contain a greater amount of important image information; therefore, embedding data in the LSB region results in greater degradation of the image. Moving towards the MSB and HSB regions, which contain less visual information, data embedding does not result in visible artifacts. Overall, the E1 method demonstrated a slightly higher PSNR than E2, indicating better image quality.

Figure 7 presents comparisons of PSNR for different values of $\alpha$ when the data allocation position is fixed to the mid-frequency range, that is, the MSB, for both the E1 and E2 methods. We can see that when the data embedding position is fixed, an increase in $\alpha$ results in a decrease in PSNR. This is because, when data are embedded in a frequency-domain image and the image is converted back to the spatial domain, the embedded energy is spread across the image; for small values of $\alpha$, the resulting per-pixel perturbation is small, so a better PSNR is observed. On the other hand, as $\alpha$ increased, the embedded data had a significant impact in the spatial domain, resulting in a low PSNR. In addition, we can see that the E1 method exhibits a slightly higher PSNR than the E2 method for the same data-embedding positions and values of $\alpha$.

Fig. 7. PSNR values for the E1 and E2 methods by varying $\alpha$ when the data allocation position is fixed to the MSB.

Based on the above results, we conducted experiments to investigate the variations in PSNR with respect to the data allocation position and $\alpha$. Figure 8 depicts the results for PSNR versus the position of the data embedding region and the value of $\alpha$. The image used to generate the results is the same Lena image used for Fig. 7. The values of $\alpha$ were varied from 0.3 to 3. We can see that both methods showed an increase in PSNR as the data-embedding position increased; in other words, as the data-embedding region moved from low to high frequencies, the PSNR increased. On the other hand, as $\alpha$ increased, lower PSNR values were observed. In particular, up to $\alpha =1$, both methods showed a PSNR of more than 30 dB when the data embedding position was at row number 57. This can be attributed to the fact that crucial coefficients in the frequency-domain image tend to cluster around rows 50 to 60. At $\alpha =3$, there was a notable decline in PSNR, with values falling below 30 dB around row number 57. The rationale is that, as $\alpha$ increases, the likelihood of artifacts and distortions appearing when embedding data in the MSB and HSB regions increases. Furthermore, the E1 method achieved a slightly higher PSNR than the E2 method for the same positions and values of $\alpha$, indicating that the data-embedded image using the E1 method showed fewer artifacts than with E2.

Fig. 8. PSNR results versus data-embedding locations for various values of $\alpha$: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

4.3 BER performance without repetitive embedding

In Fig. 9, we evaluate the BER performance of the 2D-DFC scheme as a function of SNR. Note that because we use color images for data transmission, and the data are embedded in each channel of the image, the BER is computed by averaging the errors across the R, G, and B channels. In addition, performance is evaluated for different data allocation positions under E1 and E2 and for different values of $\alpha$. In Fig. 9(a), we can see that for $\alpha =0.3$, method E1 performed worse than method E2 in all sub-bands. In particular, using the E1 method, allocating data to the LSB resulted in a BER of 0.39 at 0 dB SNR, whereas allocating data to the MSB and HSB yielded a BER of 0.48. Thus, the diagonal method performed better than the orthogonal method. In addition, we can see that at an SNR of 30 dB, the BER in the LSB becomes zero. Furthermore, comparing all the panels, we can see that as the value of $\alpha$ increased, data embedding in the LSB performed better in both methods, finally reaching a BER close to zero in Fig. 9(d). In particular, for $\alpha =3$, both E1 and E2 achieved a BER of 0.1 or less for data embedding in the LSB. In Fig. 9(d), data embedding in the MSB and HSB also performed better than at lower values of $\alpha$. Overall, in these figures, the E2 method performed better than E1, and using the LSB worked better than using the MSB and HSB. Therefore, although increasing $\alpha$ raises the amplitude of the embedded data and thereby improves the BER, the MSB and HSB showed limited improvement compared to the LSB. Additionally, for both the E1 and E2 methods, the BER was lowest in the LSB. This is primarily because the LSB has larger frequency coefficients than the MSB and HSB; as a result, the LSB exhibits relatively larger data amplitudes, making it advantageous for data extraction.
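
As noted above, the reported BER averages the per-channel errors; a minimal sketch of this measurement, assuming the same bit sequence is embedded in each of the R, G, and B channels, is:

```python
import numpy as np

def ber_rgb(tx_bits, rx_bits_r, rx_bits_g, rx_bits_b):
    """Bit error rate averaged over the R, G, and B channels."""
    tx = np.asarray(tx_bits)
    errors = [np.mean(np.asarray(rx) != tx) for rx in (rx_bits_r, rx_bits_g, rx_bits_b)]
    return float(np.mean(errors))
```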

Fig. 9. BER performance versus SNR for different values of $\alpha$, embedding methods, and sub-band positions.

4.4 BER performance with repetitive embedding

Figure 10 shows the performance of the proposed 2D-DFC scheme when using the repetitive data embedding approach; that is, the same set of data was repeated four times during data allocation, and the BER was measured using the average of those four data sets. Similar to Fig. 9, we see better BER values with increasing values of $\alpha$. However, the improvement is much more significant in the LSB than in the MSB and HSB, primarily for the reasons mentioned above. More importantly, the BER performance was better than in Fig. 9 due to repetitive data embedding, particularly in the LSB. Comparing Fig. 9(a) and Fig. 10(a), we can see that E2 in the LSB performed better, and E2 showed more improvement than E1, indicating that the effect of repetitive data embedding is more pronounced with the diagonal method. In Fig. 10(c), an almost zero BER was achieved with E2 in the LSB. The effect of repetitive data embedding is less pronounced in the MSB and HSB. Furthermore, similar to Fig. 9, as $\alpha$ increased, the BER decreased for all data allocation positions. Overall, the E2 method outperformed E1.

Fig. 10. BER performance versus SNR for different values of $\alpha$, embedding methods, and sub-band positions when using repetitive embedding.

4.5 Comparison of BER w/ and w/o repetitive data embedding

Figures 9 and 10 allow us to compare the effectiveness of the E1 and E2 methods based on repetitive data embedding. Figure 11 illustrates a direct comparison of repetitive data embedding for all data allocation positions. The ‘x’ indicates that repetitive data embedding was not applied, whereas ‘o’ indicates it was applied. We can see in Fig. 11(a) that the BER performance under E1 did not significantly improve after repetitive data embedding. However, Fig. 11(b) shows great improvement. Therefore, we can say that the effect of repetitive data embedding is more pronounced in diagonal embedding. Note that increasing the repetitive data embedding factor may improve performance, but it will result in redundancy in the data. Overall, we can deduce that repetitive data embedding has a significant impact on the proposed scheme, particularly in the case of diagonal data embedding, improving error reduction and robustness. However, it often results in lower data transmission rates because of the redundant bits. This trade-off needs to be carefully considered in the design of 2D-DFC systems.

Fig. 11. BER results based on the application of repetitive data embedding with a fixed value of $\alpha =0.7$: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

4.6 BER performance according to embedding position

In this section, our focus is the BER as a function of data allocation position and the value of $\alpha$, with the SNR fixed at 30 dB. Figure 12 illustrates the BER performance. First, for both the E1 and E2 methods, the BER increased with increasing position, because, on average, the frequency coefficients become smaller, resulting in proportionally reduced data amplitudes. Second, an increase in $\alpha$ improved the BER performance: a larger $\alpha$ increases the data amplitude, which mitigates the effect of the reduced frequency coefficients at higher embedding positions and leads to a lower BER. Moreover, applying repetition coding further reduced the BER by averaging the four embedded copies of the data. Comparing Figs. 12(a) and 12(b), we can see that the E2 method performed better, showing a zero BER up to row number 150 regardless of the value of $\alpha$ and the use of repetitive data embedding. Even when the position moved beyond row 150, the BER degradation was less significant than with E1.

Fig. 12. BER results based on data allocation positions, the value of $\alpha$, and use of the data repetition scheme at a fixed SNR of 30 dB: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

4.7 BER performance according to PSNR

Next, we conducted experiments at a fixed SNR of 30 dB without repetitive data embedding and computed the BER as a function of PSNR. Figure 13 depicts the results for both methods. First, we can see that increasing the PSNR increased the BER. Second, for a given $\alpha$, E1 exhibited a higher BER than E2. Table 1 summarizes the key results in Fig. 13.

Fig. 13. BER versus PSNR for different values of $\alpha$ at SNR = 30 dB without repetitive data embedding: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

Table 1. The PSNR results in Fig. 13.

Next, we simulated the BER according to PSNR when repetitive data embedding was employed; Fig. 14 depicts the results. Although the maximum PSNR values are the same as in Fig. 13, the BER decreased due to repetitive data embedding. Table 2 summarizes the results in Fig. 14. Again, the E2 method outperformed the E1 data-embedding mechanism.

Fig. 14. BER versus PSNR for different values of $\alpha$ at SNR = 30 dB with repetitive data embedding: (a) orthogonal embedding (E1), and (b) diagonal embedding (E2).

Table 2. PSNR and BER in Fig. 14.

Table 3 provides a comparative summary of PSNR values at which the BER was zero. Again, applying repetition coding improved the performance of the scheme. For instance, with $\alpha =0.3$ and by using method E2, zero BER was reached at 53.7 dB when using repetition, compared to 50 dB when no repetition was used. In addition, as $\alpha$ increased, E1 and E2 both exhibited decreasing PSNR values. Therefore, we can say that the ideal value of $\alpha$ is in the range 0 to 1. Increasing $\alpha$ beyond 1 significantly reduced the PSNR.

Table 3. PSNR with and without repetition coding for zero BER at a 30 dB SNR.

4.8 BER performance for an alternative input image

In this section, to validate the robustness of the proposed methodologies, an alternative image is used as the input on the screen. Figure 15 shows the original image alongside the corresponding BER outcomes. The Mandrill image is used, which has a lower PSNR than the Lena image for both methods at $\alpha =0.7$. Compared to Fig. 11(a), we can observe in Fig. 15(b) that BER performance is marginally improved across all sub-bands when employing repetition coding. Without repetition coding, performance improves in the HSB and MSB, while the LSB exhibits an elevated BER, particularly near 0 dB. In addition, with the diagonal embedding method, performance is improved in the MSB but degraded in the HSB and LSB. Since we are mostly interested in the MSB, it is noteworthy that performance improves and converges to a BER of zero above an SNR of 15 dB. Altogether, this shows that the proposed methods are robust irrespective of the input image. The slight improvements and degradations observed are due to the different frequency characteristics of the images, while the overall BER trend remains the same.

Fig. 15. The input Mandrill image and the corresponding BER performances with a fixed value of $\alpha =0.7$: (a) the original $512 \times 512$p Mandrill image, (b) orthogonal data embedding (E1), and (c) diagonal data embedding (E2).

5. Conclusions

In this study, we proposed two data embedding schemes for 2D-DFC systems: orthogonal (E1) and diagonal (E2). The orthogonal method embeds data orthogonally in an image, whereas the diagonal method embeds data diagonally in an image frame. The simulation results indicate that the orthogonal method performed better than the diagonal method in terms of PSNR. However, when considering the BER, the diagonal method exhibited better performance. Regarding the proportionality constant $\alpha$, higher values within the range (0,1) are preferable, as they provide a lower BER while maintaining a higher PSNR. Regarding data allocation positions, without repetition coding the E1 method achieved a BER close to zero only in the LSB, while MSB and HSB data embedding showed worse BER performance. In addition, with the diagonal method, applying repetition coding led to a significant BER improvement. Overall, our findings suggest the diagonal method is more suitable for 2D-DFC implementation because it maintains a reasonable PSNR while delivering superior BER performance compared to the orthogonal method. In future work, the diagonal method is expected to show enhanced performance by utilizing algorithms such as zigzag scanning and optimization based on the characteristics of the 2D-DCT. We also plan to consider changes due to image scaling and to test in real-world environments using multiple images.

Funding

National Research Foundation of Korea (2022R1G1A1004799); Korea Institute for Advancement of Technology (P0017011).

Disclosures

The authors declare no conflicts of interest.

Data availability

No data were generated or analyzed in the presented research.

References

1. B. W. Kim, H.-C. Kim, and S.-Y. Jung, “Display field communication: Fundamental design and performance analysis,” J. Lightwave Technol. 33(24), 5269–5277 (2015). [CrossRef]  

2. A. Wang, Z. Li, C. Peng, et al., “Inframe++: Achieve simultaneous screen-human viewing and hidden screen-camera communication,” in Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services, (2015), pp. 181–195.

3. K. Jo, M. Gupta, and S. K. Nayar, “DisCo: Display-camera communication using rolling shutter sensors,” ACM Trans. Graph. 35(5), 1–13 (2016). [CrossRef]  

4. C. Chen, W. Huang, L. Zhang, et al., “Robust and unobtrusive display-to-camera communications via blue channel embedding,” IEEE Trans. on Image Process. 28(1), 156–169 (2018). [CrossRef]  

5. T. Nguyen, M. D. Thieu, and Y. M. Jang, “2D-OFDM for optical camera communication: Principle and implementation,” IEEE Access 7, 29405–29424 (2019). [CrossRef]  

6. J. Zhao and X.-Y. Li, “SCsec: A secure near field communication system via screen camera communication,” IEEE Trans. on Mobile Comput. 19(8), 1943–1955 (2019). [CrossRef]  

7. M. Guri, D. Bykhovsky, and Y. Elovici, “Brightness: Leaking sensitive data from air-gapped workstations via screen brightness,” in 2019 12th CMI Conference on Cybersecurity and Privacy (CMI), (IEEE, 2019), pp. 1–6.

8. G. J. Garateguy, G. R. Arce, D. L. Lau, et al., “QR images: optimized image embedding in QR codes,” IEEE Trans. on Image Process. 23(7), 2842–2853 (2014). [CrossRef]  

9. S. Ahlawat, C. Rana, and R. Sindhu, “A review on QR codes: Colored and image embedded,” Int. J. Adv. Res. Comput. Sci. 8(5), 410–413 (2017). [CrossRef]  

10. K. Pena-Pena, D. L. Lau, A. J. Arce, et al., “QRnet: fast learning-based QR code image embedding,” Multimed. Tools Appl. 81(8), 10653–10672 (2022). [CrossRef]  

11. T. Li, C. An, A. Campbell, et al., “HiLight: Hiding bits in pixel translucency changes,” in Proceedings of the 1st ACM MobiCom Workshop on Visible Light Communication Systems, (2014), pp. 45–50.

12. V. Nguyen, Y. Tang, A. Ashok, et al., “High-rate flicker-free screen-camera communication with spatially adaptive embedding,” in IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on Computer Communications, (IEEE, 2016), pp. 1–9.

13. K. Zhang, C. Wu, C. Yang, et al., “Chromacode: A fully imperceptible screen-camera communication system,” in Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, (2018), pp. 575–590.

14. V. Tran, G. Jayatilaka, A. Ashok, et al., “Deeplight: Robust & unobtrusive real-time screen-camera communication for real-world displays,” in Proceedings of the 20th International Conference on Information Processing in Sensor Networks (co-located with CPS-IoT Week 2021), (2021), pp. 238–253.

15. K. Qian, Y. Lu, Z. Yang, et al., “AirCode: Hidden screen-camera communication on an invisible and inaudible dual channel,” in 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21), (2021), pp. 457–470.

16. H. Fang, D. Chen, F. Wang, et al., “TERA: Screen-to-camera image code with transparency, efficiency, robustness and adaptability,” IEEE Trans. Multimedia 24, 955–967 (2021). [CrossRef]  

17. J. Xu, J. Klein, J. Jochims, et al., “A reliable and unobtrusive approach to display area detection for imperceptible display camera communication,” J. Vis. Commun. Image Represent. 85, 103510 (2022). [CrossRef]  

18. T. K. Tsui, X.-P. Zhang, and D. Androutsos, “Color image watermarking using multidimensional Fourier transforms,” IEEE Trans. Inf. Forensics Secur. 3(1), 16–28 (2008). [CrossRef]  

19. S. Tsai, K. Liu, S. Yang, et al., “An efficient image watermarking method based on fast discrete cosine transform algorithm,” Math. Probl. Eng. 2017, 1–10 (2017). [CrossRef]  

20. S.-Y. Jung, H.-C. Kim, and B. W. Kim, “Implementation of two-dimensional display field communications for enhancing the achievable data rate in smart-contents transmission,” Displays 55, 31–37 (2018). [CrossRef]  

21. L. D. Tamang and B. W. Kim, “Spectral domain-based data-embedding mechanisms for display-to-camera communication,” Electronics 10(4), 468 (2021). [CrossRef]  

22. Y.-J. Kim, P. Singh, and S.-Y. Jung, “Experimental evaluation of display field communication based on machine learning and modem design,” Appl. Sci. 12(23), 12226 (2022). [CrossRef]  
