Optica Publishing Group

Faster generation of holographic video of 3-D scenes with a Fourier spectrum-based NLUT method

Open Access

Abstract

In this article, a new type of Fourier spectrum-based novel look-up table (FS-NLUT) method is proposed for the faster generation of holographic video of three-dimensional (3-D) scenes. The proposed FS-NLUT consists of principal frequency spectrums (PFSs), which are much smaller in size than the principal fringe patterns (PFPs) used in the conventional NLUT-based methods. This difference in size allows the number of basic algebraic operations in the hologram generation process to be reduced significantly. In addition, the fully one-dimensional (1-D) calculation framework of the proposed method allows a significant reduction of the overall hologram calculation time. In the experiments, the total number of basic algebraic operations needed for the proposed FS-NLUT method was found to be reduced by 81.23% when compared with that of the conventional 1-D NLUT method. The hologram calculation times of the proposed method, when implemented on the CPU and the GPU, were also found to be 60% and 66% shorter than those of the conventional 1-D NLUT method, respectively. It was also confirmed that the proposed method implemented with two GPUs can generate a holographic video of a test 3-D scene in real time (>24 f/s).

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Electro-holographic display based on computer-generated holograms (CGHs) has been considered one of the ultimate techniques for realizing a realistic three-dimensional television (3-D TV), since it can provide photorealistic optical 3-D images [1–4]. However, the CGH-based electro-holographic display has suffered from two critical issues in its practical deployment in many application fields [4–8]. One problem is the unavailability of a spatial light modulator (SLM) whose display area is large enough and whose pixel pitch is small enough to enable the optical reconstruction of large-scale, high-resolution 3-D images with a wide viewing angle [4,5]. The other problem is the heavy computational load involved in the real-time generation of holographic video for an input 3-D scene [6–8]. Therefore, much research on the CGH-based electro-holographic display has focused on the development of faster CGH algorithms [9–35].

So far, various kinds of CGH algorithms have been proposed, such as the classical ray-tracing [8], look-up table (LUT) [9], novel look-up table (NLUT) [10], NLUT-based [11–20], split look-up table (S-LUT) [21], compressed look-up table (C-LUT) [22], improved look-up table [23], layer-based [24–26], wave-front recording plane (WRP)-based [27–31], polygon-based [32–34], FPGA-based [35] and GPU-based [36–38] methods.

Basically, the LUT method [9] can be directly derived from the ray-tracing method. It simplifies the hologram computation by replacing the complex power, root and trigonometric functions with multiplications and additions, without any sacrifice of reconstructed image quality in terms of parallax, depth cues and viewing zone. Since then, many research works have focused on the development of LUT-based real-time hologram generation systems.

The novel look-up table (NLUT) method enables a dramatic reduction of the memory capacity of the LUT from terabytes (TBs) to gigabytes (GBs) by storing only the Fresnel zone plates (FZPs) of the center object points on each depth plane of a 3-D object, which are designated as the principal fringe patterns (PFPs) [10]. To further decrease the memory capacity of the NLUT, many types of NLUT-based methods have been developed [11–20]. In the circular symmetry and trigonometric decomposition-based methods [11–13], 1-D forms of the PFPs such as the line-type PFP and Sub-PFP are stored, which also results in great reductions of the required memory capacity [14]. In the depth compensation-based methods, only two 2-D PFPs, the baseline and depth-compensating PFPs (2-D B-PFP and DC-PFP), are pre-calculated and stored, from which a set of 2-D PFPs for each depth plane is generated based on the thin-lens property of the PFP, which results in a massive reduction of the required memory [15,16].

In addition, to accelerate the calculation speed of the NLUT in the hologram generation process, several redundancy and motion compensation-based methods have been proposed, in which redundant object data between two consecutive video frames are removed with motion estimation and compensation algorithms, resulting in great reductions of the overall hologram calculation time [17–20]. Furthermore, the split look-up table (S-LUT) method can shorten the hologram calculation time by extracting a pair of horizontal and vertical light modulators from the diffraction fringe pattern, which enables a great reduction of computational complexity [21]. The C-LUT and AC-LUT methods can further decrease the memory capacity by extracting the longitudinal terms of the Fresnel diffraction equation from the horizontal and vertical factors [22,23].

Recently, a one-dimensional NLUT (1-D NLUT) method has been proposed [39]. In this method, only a pair of half-sized 1-D baseline and depth-compensating PFPs is pre-calculated and stored based on the concentric-symmetry property of the PFP, from which a set of half-sized 1-D PFPs for all depth planes is generated based on its thin-lens property. This method minimizes the required memory size down to a few kilobytes (KB) regardless of the number of depth planes. Moreover, all hologram calculations are performed fully one-dimensionally with a set of half-sized 1-D PFPs based on the shift-invariance property, which also allows a significant reduction of computational load compared with the conventional NLUT-based methods. Here, the computational complexity of the 1-D NLUT is given by O(MN + N^2) for a square hologram, where M and N × N denote the number of object points and the resolution of the hologram, respectively. However, even though this method dramatically reduces the memory capacity of the 1-D NLUT down to the order of KBs, its hologram calculation speed is still not compatible with real-time applications, since its computational load remains huge.

Accordingly, in this paper, as a feasible approach for real-time generation of holographic video of 3-D scenes, a new type of the Fourier spectrum-based novel look-up table (FS-NLUT) method is proposed. Basically, in this proposed FS-NLUT method, only the so-called principal frequency spectrums (PFSs) are pre-calculated and stored for all those center object points on each depth plane in the Fourier domain unlike the conventional NLUT-based methods where principal fringe patterns (PFPs) are pre-calculated and stored in the spatial domain. Here each PFS actually represents the Fourier-transformed version of its corresponding PFP. In this proposed method, frequency spectrums (FSs) for other object points on each depth plane can be simply generated from their corresponding PFSs just by being multiplied with phase factors (PFs) of those object points, which correspond to the Fourier-transformed versions of displacements of those object points from their centers.

Here it must be noted that the PFSs of the proposed method can be made much smaller than even the PFPs of the conventional 1-D NLUT method. Thus, operating in the Fourier domain enables the proposed method to remove much of the computational redundancy of the conventional NLUT-based methods, which results in a dramatic reduction of the total number of multiplication and addition operations performed in the hologram generation process.

The computational complexity of the proposed method is given by O(MN/r + N^2/r^2), where r denotes the redundancy factor determined by the hologram resolution, wavelength, hologram pixel pitch and distance from the hologram to the object image. Thus, as long as r is larger than one, the computational complexity of the proposed method is smaller than even that of the conventional 1-D NLUT method. Furthermore, in the proposed method, all multiplication and addition operations for hologram generation are performed fully one-dimensionally, as in the conventional 1-D NLUT method, so that the overall hologram calculation time can also be greatly reduced.
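To make the two complexity expressions concrete, the following minimal Python sketch evaluates their leading-order operation counts. The function names are hypothetical, and the input values (M = 50,000 points, N = 1,920 and r = 1.73) are illustrative assumptions loosely based on the test conditions reported later in the paper. Big-O expressions hide constant factors, so the ratio is only indicative.

```python
def ops_1d_nlut(m_points, n):
    """Leading-order operation count O(MN + N^2) of the 1-D NLUT."""
    return m_points * n + n ** 2


def ops_fs_nlut(m_points, n, r):
    """Leading-order operation count O(MN/r + N^2/r^2) of the FS-NLUT."""
    return m_points * n / r + n ** 2 / r ** 2


# Illustrative values only (assumptions, not measurements).
m_points, n, r = 50_000, 1920, 1.73
ratio = ops_fs_nlut(m_points, n, r) / ops_1d_nlut(m_points, n)
print(f"FS-NLUT / 1-D NLUT operation ratio: {ratio:.3f}")
```

With r = 1 the two expressions coincide, so any gain in this model comes entirely from the redundancy factor.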

To confirm the feasibility of the proposed method in practical applications, the FS-NLUT is implemented on the CPU and GPU, experiments are carried out using a test 3-D scene, and the results are compared with those of the conventional 1-D NLUT method.

2. Computational redundancy of the conventional NLUT-based methods

As seen in Fig. 1, in the conventional NLUT-based methods, the pre-calculated Fresnel zone plates (FZPs) for the object points, which are called principal fringe patterns (PFPs), are multiplied by their corresponding object intensities and then summed up to generate the final hologram pattern of a 3-D object, as defined in Eq. (1). Since only multiplication and summation operations are involved in this hologram calculation process, the calculation speed of the conventional NLUT-based methods can be accelerated.

$$\begin{aligned} u(x,y) &= \frac{{{e^{ik{z_d}}}}}{{i\lambda {z_d}}}\int {\int\limits_{ - \infty }^\infty {I(\xi ,\eta )\exp \left\{ {i\frac{k}{{2{z_d}}}[{{{(x - \xi )}^2} + {{(y - \eta )}^2}} ]} \right\}d\xi d\eta } } \\ &\textrm{ = }\int {\int\limits_{ - \infty }^\infty {I(\xi ,\eta )t(x,y,\xi ,\eta )\textrm{ }d\xi d\eta } } \end{aligned}$$

Fig. 1. Computational redundancy of the NLUT-based hologram calculation process.

In Eq. (1), u(x, y) indicates the hologram of the 3-D object, (ξ, η) and (x, y) denote the coordinates of the object and hologram planes, and k, λ and zd denote the wave number, wavelength and distance from the dth object image plane to the hologram plane, respectively. The exponential part is independent of the object and can thus be pre-calculated and stored in the look-up table t. Here we can see that it is difficult to decrease the sizes of the FZPs, since the diffraction zones of the object points spread over the whole hologram plane. Furthermore, the sampling rate of the FZP for each object point is set to be identical to that of the final hologram so that the FZPs of all object points can be added efficiently. This inevitably brings a massive computational redundancy, since the sampling rate actually required for the FZP of a single object point is lower than that of the final hologram [40].

According to the integral property of the Fourier transformation, the hologram calculation process defined by Eq. (1) can be transformed into the Fourier domain. The hologram frequency spectrum (HFS) of Eq. (2), U(fx, fy) can be then obtained just by adding all those frequency Fresnel zone plates (FFZPs) for each object point as shown in Fig. 1, where T denotes the Fourier transformation of t (x, y, ξ, η) across the x and y dimensions.

$$U({f_x},{f_y}) = \int {\int\limits_{ - \infty }^\infty {I(\xi ,\eta )T({f_x},{f_y},\xi ,\eta )\textrm{ }d\xi d\eta } }$$

Here it must be noted that the same multiplication and summation operations are needed in the Fourier domain as in the spatial domain, as shown in Fig. 1. In either domain, the number of sampling points of the hologram for each object point is the key factor determining the eventual computational complexity of the NLUT-based hologram calculation process. In the conventional NLUT-based methods employing a spatial-domain calculation scheme, the numbers of sampling points of the holograms for all object points are identical and equal to that of the final hologram.

In fact, the space-bandwidth product of the hologram is related to the physical object size as defined in Eq. (3) [40], where Lx and Lξ mean the physical lengths in the horizontal directions of the hologram and object image planes, respectively, and only the horizontal directions are considered in Eq. (3) for simplicity.

$${B_x} = \frac{{{L_\xi } + {L_x}}}{{\lambda z}}$$

For the hologram of a single object point, Lξ can be regarded as infinitesimal, which indicates that the hologram bandwidth for the object point is smaller than that of the final hologram; that is, the space-bandwidth product gets smaller.

In the discrete calculation process, the spatial sampling number Ns equals Lx/p when the spatial sampling interval is given by p, and the frequency sampling number Nf becomes BxNsp, obtained by dividing the bandwidth Bx by the frequency sampling interval 1/(Nsp). The sampling numbers of the hologram of an object point in the spatial and frequency domains are then related by Eq. (4). Here, the spatial sampling numbers of the point-object hologram and the final hologram are the same and set to Ns.

$${N_s}^2 = \frac{{\lambda z}}{{{p^2}}}{N_f}$$
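The step from Eq. (3) to Eq. (4) can be written out explicitly. Assuming a single object point, so that Lξ → 0, and using Lx = Ns p:

$$B_x = \frac{L_x}{\lambda z} = \frac{N_s p}{\lambda z}, \qquad N_f = B_x N_s p = \frac{N_s^2 p^2}{\lambda z} \;\Rightarrow\; N_s^2 = \frac{\lambda z}{p^2} N_f$$

Setting Ns = Nf in this relation gives the critical sampling number Ns = λz/p², below which Nf is smaller than Ns.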

Figure 1 shows the dependence of Ns on Nf, which reveals that Ns stays larger than Nf until it reaches a critical point at which Ns = Nf. For instance, the critical value of Ns is calculated to be 3,325 pixels under the conditions of λ = 0.532 µm, z = 4e5 µm and p = 8 µm. In fact, for a commercially available spatial light modulator (SLM) with a pixel pitch of 8 µm, the value of Ns is only 1,920 pixels, which is much smaller than the critical value mentioned above. For the emerging SLMs whose pixel pitches are in the range of 3.74 µm to 4.5 µm, the corresponding critical values range from 10,509 to 15,213 pixels. Thus, the frequency sampling number Nf becomes much smaller than the spatial sampling number Ns in the hologram calculation process applied to practical holographic displays, which means that the number of frequency coefficients can be made much smaller than that of the spatial coefficients. Here, it is useful to define the ratio r = Ns/Nf as the redundancy factor, representing the degree of redundancy of the hologram calculation, for the comparative performance analysis of hologram generation methods.
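The critical values quoted above follow directly from Eq. (4) with Ns = Nf. The short Python sketch below reproduces them; the function names are hypothetical, and rounding to the nearest integer is an assumption about how the quoted pixel counts were obtained.

```python
def critical_ns(wavelength_um, z_um, pitch_um):
    """Sampling number at which N_s = N_f in Eq. (4): lambda * z / p^2."""
    return wavelength_um * z_um / pitch_um ** 2


def redundancy_factor(ns, wavelength_um, z_um, pitch_um):
    """r = N_s / N_f, which by Eq. (4) equals lambda * z / (p^2 * N_s)."""
    return critical_ns(wavelength_um, z_um, pitch_um) / ns


WAVELENGTH_UM, Z_UM = 0.532, 4e5
for pitch_um in (8.0, 4.5, 3.74):
    print(pitch_um, round(critical_ns(WAVELENGTH_UM, Z_UM, pitch_um)))

# Redundancy factor for a 1,920-pixel-wide hologram at 8 um pitch.
r_1920 = redundancy_factor(1920, WAVELENGTH_UM, Z_UM, 8.0)
print(round(r_1920, 2))
```

In this form the redundancy factor is simply the critical sampling number divided by Ns, so it grows as the pixel pitch shrinks or the reconstruction distance increases.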

3. Proposed FS-NLUT method

In this paper, we propose a new type of Fourier spectrum-based NLUT (FS-NLUT) method enabling the real-time calculation of holographic video of an input 3-D object in motion. As mentioned above, the proposed FS-NLUT is composed of a set of PFSs, which are pre-calculated and stored for the center object points of each depth plane in the Fourier domain. Each PFS of the proposed method can be made much smaller than the corresponding PFP of the conventional NLUT-based methods, which allows the proposed method to massively reduce the number of multiplication and addition operations performed in the hologram generation process. That is, the computational complexity of the conventional NLUT-based methods can be significantly reduced in the proposed method with this Fourier domain-based 1-D hologram calculation process.

Figure 2 shows the operational diagram of the proposed method, which is composed of three processes: 1) pre-calculation and storing of 1-D PFSs for each depth plane of an input 3-D scene, 2) pre-calculation and storing of 1-D phase factors (1-D PFs) for object points on each depth plane, and 3) generation of the hologram for the input 3-D scene with all those pre-calculated 1-D PFSs and 1-D PFs on the 1-D calculation pipelines.

Fig. 2. Operational diagram of the three-step process of the proposed FS-NLUT method.

3.1 Pre-calculation of 1-D PFSs

In fact, a 3-D object can be treated as a set of 2-D images discretely-sliced along the z-direction, in which each 2-D image with a fixed depth is regarded as a collection of self-luminous object points of light. In the conventional NLUT-based methods, only the PFPs of the object points located at the center of each image plane are pre-calculated and stored [15,16]. In the proposed FS-NLUT method, instead of the PFPs of the conventional NLUT-based methods, PFSs representing the Fourier spectrums of their corresponding PFPs are pre-calculated and stored for all those object points located at each center of the depth planes as mentioned above.

As discussed above, the number of effective coefficients of each PFS becomes much less than that of the corresponding PFP. Thus, the memory capacity required for storing all those PFSs in the proposed method becomes much smaller than those used for storing the PFPs in the conventional NLUT-based methods. Here, 2-D and 1-D forms of the PFS for each sliced depth plane of the 3-D object can be calculated with Eq. (5) and Eq. (6), respectively,

$$PFS = {\cal F}\left\{ {\exp \left( {i\frac{k}{{2z}}({x^2} + {y^2})} \right)} \right\},\textrm{ ( }{f_x} \le {B_x},{f_y} \le {B_y})$$
$$\left\{ \begin{array}{l} PFS\_x\textrm{ = }{\cal F}\left\{ {\exp (i\frac{k}{{2z}}{x^2})} \right\},\textrm{ }{f_x} \le {B_x}\\ PFS\_y\textrm{ = }{\cal F}\left\{ {\exp (i\frac{k}{{2z}}{y^2})} \right\},\textrm{ }{f_y} \le {B_y} \end{array} \right.$$
where F denotes the Fourier-transform operator, while Bx and By represent the effective bandwidths of the PFS in the fx and fy directions, respectively, which can be calculated from Eq. (3) by setting Lξ to be zero. It is noted here that the PFS can be windowed by the limited frequency boundaries in the Fourier domain as shown in Fig. 3.
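As a rough numerical illustration of Eq. (6), the sketch below Fourier-transforms a sampled 1-D chirp and measures how much of its spectral energy falls inside the band predicted by Eq. (4). It is a toy example under stated assumptions and not the authors' implementation: it uses a naive O(N²) DFT, a small size N = 256, and a distance z chosen so that Nf = 64. All function names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform (small sizes only)."""
    n = len(signal)
    return [sum(signal[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def pfs_1d(n, pitch, wavelength, z):
    """Return the full spectrum of a centered 1-D chirp and its band size N_f."""
    # k / (2z) * x^2 with k = 2*pi/lambda equals pi * x^2 / (lambda * z).
    chirp = [cmath.exp(1j * math.pi * ((m - n // 2) * pitch) ** 2 / (wavelength * z))
             for m in range(n)]
    n_f = round(n ** 2 * pitch ** 2 / (wavelength * z))  # Eq. (4)
    return dft(chirp), n_f


def in_band_energy_fraction(spectrum, n_f):
    """Fraction of spectral energy in bins whose signed index satisfies |k| <= n_f / 2."""
    n = len(spectrum)
    total = sum(abs(c) ** 2 for c in spectrum)
    inside = sum(abs(c) ** 2 for k, c in enumerate(spectrum)
                 if min(k, n - k) <= n_f // 2)
    return inside / total


# Toy parameters (assumptions): z is chosen so that N_f = 64 for N = 256.
N, PITCH_UM, WAVELENGTH_UM = 256, 8.0, 0.532
Z_UM = N ** 2 * PITCH_UM ** 2 / (WAVELENGTH_UM * 64)
spectrum, n_f = pfs_1d(N, PITCH_UM, WAVELENGTH_UM, Z_UM)
fraction = in_band_energy_fraction(spectrum, n_f)
print(n_f, round(fraction, 3))
```

Because most of the energy sits inside the central Nf bins, only those coefficients need to be stored as the windowed PFS in this sketch.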

Fig. 3. 2-D and 1-D forms of the PFS obtained by Fourier-transforming its corresponding PFP.

Since the PFS is depth-dependent, a pair of depth-dependent 1-D versions of the PFS, which are given by PFS_x and PFS_y, are to be stored for minimizing the on-line calculation time. Then, 2-D forms of the PFSs can be generated with the baseline PFS and depth compensation PFS to save the memory capacity based on the depth compensation theory [39].

3.2 Pre-calculation of 1-D PFs

To generate the frequency spectrums for other object points in each depth plane from their PFSs, the phase factors (PFs) of all those object points, which correspond to Fourier-transformed versions of their displacements from the centers, are calculated and stored. According to the shifting property of the Fourier transform, the 2-D form of the PF and its 1-D forms in the x- and y-directions are given by Eqs. (7), (8) and (9), respectively,

$$PF = PF\_y \cdot PF\_x$$
$$PF\_x = \exp ( - i2\pi \xi {f_x})$$
$$PF\_y = \exp ( - i2\pi \eta {f_y})$$
where fx and fy denote the significant Fourier frequencies of the PF in the x- and y-directions, and ξ and η represent the horizontal and vertical coordinates of the object point, respectively. Each PF is composed of a pair of 1-D components, PF_x and PF_y, so that the whole calculation process of the proposed method remains compatible with the 1-D calculation pipeline [39]. Since the PFs are independent of the depth plane, only a pair of LUTs is required to store the x- and y-dimensional values of the PFs. Here, the sizes of the PFs are identical to those of the PFSs, so that the resultant hologram calculation time for each object point can be greatly reduced.
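The shifting property behind Eqs. (8) and (9) can be checked numerically in its discrete form. The sketch below is a minimal illustration with hypothetical names; it assumes circular (wrap-around) shifts and integer pixel displacements, which is the setting in which the discrete identity holds exactly.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform (small sizes only)."""
    n = len(signal)
    return [sum(signal[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def phase_factor(shift, n):
    """Discrete analogue of Eq. (8): exp(-i * 2 * pi * shift * k / n) for each bin k."""
    return [cmath.exp(-2j * math.pi * shift * k / n) for k in range(n)]


n, shift = 16, 5
# Toy 1-D fringe pattern; the chirp rate 0.05 is an arbitrary illustrative value.
pattern = [cmath.exp(1j * 0.05 * (m - n // 2) ** 2) for m in range(n)]
shifted = [pattern[(m - shift) % n] for m in range(n)]  # circular shift by `shift` pixels

spectrum_of_shifted = dft(shifted)
spectrum_times_pf = [s * f for s, f in zip(dft(pattern), phase_factor(shift, n))]
max_error = max(abs(a - b) for a, b in zip(spectrum_of_shifted, spectrum_times_pf))
print(max_error)
```

The two spectra agree to floating-point precision, which is why one multiplication per stored frequency coefficient is enough to move a PFS to another object point.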

3.3 Generation of hologram patterns with two sets of PFSs and PFs

Hologram patterns of input 3-D scenes can be calculated one-dimensionally by the combined use of the two sets of PFSs and PFs on the 1-D calculation pipelines. As shown in Fig. 4, the hologram generation framework of the proposed method is composed of five steps: 1) calculation of the hologram patterns for each object point array in each depth layer with their corresponding PFSs and summation of them all, 2) calculation of the accumulated PFs for all object points in each depth layer, 3) generation of the frequency spectrums of the holograms for each depth layer by multiplying each of the accumulated PFs by the corresponding PFS, 4) generation of the frequency spectrum for the input 3-D scene by adding up the frequency spectrums of the holograms for all depth layers, and 5) generation of the final hologram pattern of the input 3-D scene by inverse Fourier transformation of the frequency spectrum with a fast Fourier transform (FFT) algorithm.

Fig. 4. Five-step hologram generation process with combined use of the PFSs and PFs on the 1-D calculation pipeline.

First, intensities of the object points in each object point array are multiplied to their corresponding 1-D PFS_x and PFS_y, respectively, and summed up to generate the PFSs for each depth layer. Here, all those object points in each object point array have the same vertical coordinates and depth values. Second, the accumulated horizontal 1-D PFs for all those object points belonging to each object point array are multiplied with their vertical 1-D PFs corresponding to the vertical coordinate values of each object point array, from which 2-D PFs for each object point array are to be generated. 2-D PFs for all object point arrays are then added up in each depth plane, which results in generation of the complete PFs for all object points in each depth plane, which is designated as the ComPF here and given by Eq. (10).

$$ComPF = \sum\limits_\eta {PF\_y(\eta )\left\{ {\sum\limits_\xi {I(\eta ,\xi )PF\_x(\xi )} } \right\}}$$
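Eq. (10) can be read as a row-wise accumulation: the inner sum over ξ works on 1-D vectors only, and the outer product with PF_y is formed once per row rather than once per point. The following Python sketch illustrates this ordering for one depth plane. The data layout (a dictionary mapping η to a list of (ξ, intensity) pairs) and all names are assumptions made for illustration.

```python
import cmath
import math


def phase_factor(shift, n):
    """Discrete 1-D phase factor exp(-i * 2 * pi * shift * k / n) for each bin k."""
    return [cmath.exp(-2j * math.pi * shift * k / n) for k in range(n)]


def com_pf(points, n_x, n_y):
    """Accumulate Eq. (10) for one depth plane.

    points maps a row coordinate eta to a list of (xi, intensity) pairs.
    Returns an n_y-by-n_x grid indexed as grid[f_y][f_x].
    """
    grid = [[0j] * n_x for _ in range(n_y)]
    for eta, row_points in points.items():
        if not row_points:
            continue  # nothing to accumulate for an empty row
        # Inner sum over xi: purely 1-D work, n_x operations per point.
        tem_pf_x = [0j] * n_x
        for xi, intensity in row_points:
            pf_x = phase_factor(xi, n_x)
            tem_pf_x = [acc + intensity * v for acc, v in zip(tem_pf_x, pf_x)]
        # Outer product with PF_y: done once per row, not once per point.
        pf_y = phase_factor(eta, n_y)
        for v in range(n_y):
            for u in range(n_x):
                grid[v][u] += pf_y[v] * tem_pf_x[u]
    return grid


# Hypothetical toy data: two points in row 1 and one point in row 4.
points = {1: [(0, 1.0), (3, 0.5)], 4: [(2, 2.0)]}
result = com_pf(points, n_x=8, n_y=6)
print(result[2][3])
```

In this sketch the per-point cost scales with the 1-D size n_x, while the 2-D cost n_x · n_y is paid only once per occupied row.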

Third, these ComPFs are then multiplied to their corresponding PFSs in each depth layer. Here, the frequency spectrum of the hologram for all object point arrays in a depth plane of d can be given by Eq. (11).

$$HF{S_d} = ComPF \cdot PFS$$

Fourth, based on the linearity property of the Fresnel convolution, the hologram pattern of the 3-D scene composed of many sliced depth layers, can be obtained just by summing up all those holograms for every depth layer. The accumulated frequency spectrum of the hologram can be then given by Eq. (12) based on the integral property of the Fourier transform.

$$HFS = \sum\limits_d {HF{S_d}}$$

Finally, the final hologram pattern of the 3-D object can be generated just by inversely Fourier transforming the accumulated frequency spectrum of the hologram, which is given by Eq. (13).

$$CGH = {{\cal F}^{ - 1}}\{{HFS} \}$$
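The chain of Eqs. (10) to (13) can be verified end to end on a small 1-D example by comparing it with the direct spatial-domain summation of shifted PFPs. The sketch below is a simplified illustration rather than the authors' implementation: it uses a naive DFT, full (not band-limited) spectra, circular shifts, and arbitrary toy chirp rates standing in for the depth-dependent factor. All names are hypothetical.

```python
import cmath
import math


def dft(signal, inverse=False):
    """Naive O(N^2) DFT; with inverse=True it returns the normalized inverse DFT."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [sum(signal[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def chirp(n, rate):
    """Toy 1-D PFP; `rate` stands in for the depth-dependent factor of the fringe."""
    return [cmath.exp(1j * rate * (m - n // 2) ** 2) for m in range(n)]


def fs_nlut_1d(planes, n):
    """planes: list of (pfp, [(xi, intensity), ...]), one entry per depth plane."""
    hfs = [0j] * n
    for pfp, points in planes:
        pfs = dft(pfp)  # would be pre-calculated and stored in practice
        com_pf = [sum(i * cmath.exp(-2j * math.pi * xi * k / n) for xi, i in points)
                  for k in range(n)]                            # Eq. (10), 1-D case
        hfs = [h + c * p for h, c, p in zip(hfs, com_pf, pfs)]  # Eqs. (11) and (12)
    return dft(hfs, inverse=True)                               # Eq. (13)


def direct_1d(planes, n):
    """Reference: sum of intensity-weighted, circularly shifted PFPs."""
    holo = [0j] * n
    for pfp, points in planes:
        for xi, i in points:
            holo = [h + i * pfp[(m - xi) % n] for m, h in enumerate(holo)]
    return holo


n = 32
planes = [(chirp(n, 0.02), [(0, 1.0), (5, 0.7)]),
          (chirp(n, 0.03), [(11, 0.4)])]
error = max(abs(a - b) for a, b in zip(fs_nlut_1d(planes, n), direct_1d(planes, n)))
print(error)
```

Both routes give the same hologram up to floating-point error. The saving described in the text comes from restricting the spectral arrays to the effective band, which this sketch deliberately leaves out.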

4. Experiments and the results

4.1 Computational redundancy analysis of the proposed method

For the comparative analysis of the computational redundancy of the proposed Fourier-domain method and the conventional spatial-domain 1-D NLUT, Figs. 5(a)-(c) show the PFP of an object point generated with the proposed FS-NLUT method together with its frequency spectrum and phase angles in the Fourier domain, Fig. 5(d) shows the PFP generated with the conventional 1-D NLUT method, and Figs. 5(e) and (f) show the object points reconstructed from the PFPs of Figs. 5(a) and (d), respectively.

Fig. 5. Comparative analysis of computational redundancy of the PFS and PFP: (a) PFP generated from the PFS with the proposed FS-NLUT method, (b) Frequency spectrum and (c) Phase angle images of the PFS, (d) PFP generated with the conventional 1-D NLUT, and (e), (f) Reconstructed object point images from the PFP of (a) and (d), respectively.

As shown in Fig. 5(b), the effective coefficients of the frequency spectrum exist only within a limited area, which indicates that the number of effective coefficients of the PFS of the proposed method is much smaller than that of the corresponding PFP of the conventional method, as mentioned above. Here, the frequency sampling numbers Nf in the horizontal and vertical directions are calculated to be 1,109 pixels and 351 pixels, respectively, under the conditions that the horizontal and vertical resolutions of the hologram, the pixel pitch and the wavelength are 1,920, 1,080, 8 µm and 532 nm, respectively, and the distance from the object image to the hologram plane is 40 cm.

The frequency spectrum shown in Fig. 5(b) agrees well with the theoretically calculated values, which supports the accuracy of the bandwidth calculation of the PFP. Figure 5(c) shows the phase angle image of the frequency coefficients of the PFP, where the effective coefficients have the same distribution area as in the spectrum image.

Here, the number of effective coefficients of the PFP in the spatial domain is given by 1,920 × 1,080 = 2,073,600, which means that 2,073,600 multiplication and addition operations are required to calculate the hologram of one object point. In contrast, the number of effective coefficients of the PFS in the frequency domain is only 1,109 × 351 = 389,259, which corresponds to only 18.77% of the spatial-domain case. In other words, the total number of multiplication and addition operations required in the proposed method can be reduced by 81.23% when compared with that of the conventional 1-D NLUT method.
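These figures can be reproduced from Eq. (4) with the stated test conditions. The sketch below is a simple arithmetic check; the function name is hypothetical, and rounding Nf to the nearest integer is an assumption that happens to match the quoted values.

```python
def frequency_samples(ns, pitch_um, wavelength_um, z_um):
    """N_f from Eq. (4): N_f = N_s^2 * p^2 / (lambda * z), rounded to an integer."""
    return round(ns ** 2 * pitch_um ** 2 / (wavelength_um * z_um))


PITCH_UM, WAVELENGTH_UM, Z_UM = 8.0, 0.532, 4e5  # 8 um, 532 nm, 40 cm
nf_x = frequency_samples(1920, PITCH_UM, WAVELENGTH_UM, Z_UM)
nf_y = frequency_samples(1080, PITCH_UM, WAVELENGTH_UM, Z_UM)

spatial = 1920 * 1080   # coefficients of the PFP
spectral = nf_x * nf_y  # effective coefficients of the PFS
share = 100 * spectral / spatial
print(nf_x, nf_y, spectral, f"{share:.2f}%", f"{100 - share:.2f}%")
```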

Figures 5(a) and (d) show the spatial PFPs generated with the proposed FS-NLUT and conventional 1-D NLUT methods, respectively, in the forms of 2-D and 3-D grids. Note that even though the PFP of the FS-NLUT is generated with only a limited number of effective frequency coefficients, there is no difference between the two PFPs, as shown in Figs. 5(a) and (d). In addition, Figs. 5(e) and (f) show the object point images reconstructed from the PFPs of Figs. 5(a) and (d), respectively, where in both cases sharp impulses are reconstructed without any degradation.

Thus, those experimental results reveal that even though the number of involved frequency coefficients of the proposed method is much less than that of the conventional method, the reconstructed image quality of the proposed method has been found not to be degraded while all of the hologram information can be preserved.

4.2 Implementation of the proposed FS-NLUT on the CPU and GPU frameworks

In the experiments, the proposed FS-NLUT method has been implemented on both of the CPU and GPU boards to confirm its accelerated hologram calculation capability and its potential application to the real-time generation of high-resolution holographic videos for the test 3-D scene. For comparative performance analysis, the conventional 1-D NLUT method has been also implemented on both of them.

Table 1 shows the pseudocode for the CPU-based implementation of the proposed method. All required data, including the object intensity and depth images, are loaded first, and the hologram calculation is then performed point by point for each object point of the test object. The outermost loop runs over the depth layers sliced from the 3-D object space, where it is checked whether any object points exist at the current depth value. The inner loops traverse each object point in the current depth layer: the x-dimensional phase factors (PF_x) of the object points in a row are multiplied by the object intensities and added together to generate a temporary phase factor (Tem_PF_x). If the number of object points in the current row is not zero, Tem_PF_x is multiplied by the corresponding PF_y and accumulated into a complete PF (Com_PF) representing the phase factors of all object points in the layer.

Table 1. Pseudocode for the CPU implementation of the proposed FS-NLUT method.

The next operation is the pointwise multiplication between the Com_PF and the PFS of the current depth layer, and the result is then transformed into the final hologram with the inverse-FFT (IFFT) algorithm.
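The loop order described above (depth layers, then non-empty rows, then the points in each row) can be sketched as a small traversal helper. This is a hedged illustration of the control flow only, not the pseudocode of Table 1. The tuple layout (ξ, η, depth, intensity) and the function names are assumptions.

```python
from collections import defaultdict


def group_by_depth_and_row(points):
    """Group (xi, eta, depth, intensity) tuples as {depth: {eta: [(xi, I), ...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for xi, eta, depth, intensity in points:
        if intensity != 0:  # points with zero intensity contribute nothing
            grouped[depth][eta].append((xi, intensity))
    return grouped


def traverse(points):
    """Yield (depth, eta, row_points) in the loop order described above."""
    grouped = group_by_depth_and_row(points)
    for depth in sorted(grouped):            # outermost loop: occupied depth layers
        for eta in sorted(grouped[depth]):   # middle loop: non-empty rows only
            # The innermost loop of the caller runs over row_points.
            yield depth, eta, grouped[depth][eta]


# Hypothetical toy scene: three points at depth 0, two at depth 1 (one with zero intensity).
scene = [(3, 1, 0, 1.0), (5, 1, 0, 0.5), (2, 4, 0, 2.0), (7, 2, 1, 0.8), (1, 1, 1, 0.0)]
order = [(depth, eta, len(row)) for depth, eta, row in traverse(scene)]
print(order)
```

Grouping the points first means that empty depth layers and empty rows are never visited, which plays the role of the existence checks mentioned in the text.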

In addition, the proposed method has also been implemented on the GPU framework under the compute unified device architecture (CUDA) platform. As seen in Fig. 6, the calculation processes are divided into host and device parts. That is, the pre-calculated data are first loaded in the host and then transferred to the device memories, such as the global, shared and register memories, which offer faster access. Here, all information of the test 3-D object is packed into 1-D data arrays consisting of the intensity point array (I-point array), x-coordinate point array (X-point array), y-coordinate point array (Y-point array) and z-coordinate point array (Z-point array).

Fig. 6. Block-diagram of the memory management of the GPU framework for the proposed method.

Three kernels corresponding to different calculation tasks are designed to calculate the hologram of the test 3-D object according to the task characteristics. Memory optimization has also been carried out to avoid congestion in data access. As seen in Fig. 6, object data are first transferred from the host memory to the global memory and then to the shared memory, because the shared memory is independent among different blocks and supports faster data access. During the hologram calculation, the register memory of each thread is used for storing the currently updated coefficients of the PFSs and PFs, which reduces memory accesses and transfer times.

On a CUDA architecture, programs with multiple nested loops may not match the parallel characteristics of the GPU, and their computational efficiency can decrease seriously. However, the hologram calculation process logically involves multiple loops, such as the loops over each of the three spatial dimensions of the 3-D scene and over the hologram pixels in the horizontal and vertical directions. To fully utilize the parallel computation capacity of the GPU, the code structure therefore needs to be carefully designed.

Here, the computational tasks of the proposed method can be divided into three dependent tasks corresponding to three kernels. As seen in the pseudocode of Table 2, Kernel 1 updates the object intensities through multiplication with the PF_x and PFS_x, Kernel 2 generates the frequency spectrums of the holograms, and Kernel 3 performs the Fourier transformation that yields the final hologram. Unlike the CPU code, which has three nested loops over the ξ, η and depth dimensions for traversing the 3-D object space, the GPU code is designed to have only a single loop, which significantly enhances its computational efficiency.

Table 2. Pseudocode for the GPU implementation of the proposed FS-NLUT method.

In addition, multiplications and summations between each pixel of the PF_x and intensities of the object points are implemented in parallel by 10 thousand independent threads T in the Kernel 1. In the Kernel 2, the 1-D temporal PF_x is transformed into the 2-D version with multiplication and summation operations that are assigned to hundreds of thousands independent threads to calculate the frequency spectrum of the final hologram. Thus, tremendous parallel computations and small quantity of loops in the programming of the GPU framework guarantee the high-speed hologram computation capability of the GPU.

4.3 Comparative analysis of the hologram calculation speed of the proposed FS-NLUT and conventional 1-D NLUT methods

In this section, the computation times taken for the calculation of the holographic video of a test 3-D scene are compared between the proposed FS-NLUT and conventional 1-D NLUT methods. In addition, a feasibility test of the proposed method for the real-time generation of holographic video of 3-D scenes is also carried out.

The proposed method is implemented on each of the CPU and GPU frameworks. The CPU program is written in Matlab 2017 running on a personal computer with an Intel Core i7-9700 CPU @ 3.00 GHz and 16 GB of memory, and the GPU program is written on the CUDA platform running on a GTX TITAN GPU with 12 GB of memory. For comparison, the conventional 1-D NLUT method is also implemented on both the CPU and GPU.

As a test scene, 30 frames of an input 3-D scene composed of two objects, ‘House’ and ‘Car’, are used, in which the ‘Car’ moves around the fixed ‘House’ with varying depth across the 30 frames. The resolution of the scene image is set to 480×640 with sampling intervals of 16 µm in the horizontal and vertical directions and 0.1 mm in the longitudinal direction. The number of object points in the input 3-D scene is around 50 K for a single frame, and around 1,500 K for the whole 30-frame video. In addition, the resolution of the hologram is set to 1920×1080 with a pixel pitch of 8 µm, and the distance from the front plane of the 3-D scene to the hologram plane is set to 400 mm.

As shown in Table 3, the calculation times of the holographic video (CTHV) of the conventional 1-D NLUT and proposed FS-NLUT methods on the CPU framework are about 113.88 s and 45.74 s, respectively. The corresponding calculation times of a single frame (CTSF) are 3.8 s and 1.52 s, and those of a single point (CTSP) are 80 µs and 32 µs. These results indicate that the proposed method reduces the calculation time by about 60% compared with the conventional 1-D NLUT method. Likewise, on the GPU framework, the CTHV, CTSF and CTSP are 4.7 s, 159 ms and 3.3 µs for the proposed FS-NLUT method and 13.86 s, 462 ms and 9.7 µs for the conventional 1-D NLUT method, which shows that the proposed method reduces the calculation time by about 66% compared with the conventional method.
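The quoted percentages can be checked directly from the CTHV values. The short sketch below assumes that the percentage is defined as the fractional reduction in calculation time relative to the conventional method; the helper name `time_reduction` is hypothetical, and the per-frame figure is only a rough estimate obtained by dividing the video time by the 30 frames.

```python
def time_reduction(t_conventional, t_proposed):
    """Fractional reduction in calculation time relative to the conventional method."""
    return 1.0 - t_proposed / t_conventional

# CTHV values quoted in the text (seconds for the 30-frame video).
cthv = {
    "CPU": {"conventional": 113.88, "proposed": 45.74},
    "GPU": {"conventional": 13.86, "proposed": 4.7},
}
N_FRAMES = 30

for platform, t in cthv.items():
    reduction = time_reduction(t["conventional"], t["proposed"])
    per_frame = t["proposed"] / N_FRAMES  # rough per-frame time of the proposed method
    print(f"{platform}: {reduction:.1%} less time, ~{per_frame:.3f} s per frame")
```

Under this definition the reductions round to 60% on the CPU and 66% on the GPU, in agreement with the figures stated above.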

Table 3. Calculation times of the proposed FS-NLUT and conventional 1-D NLUT methods on each of the CPU and GPU frameworks for the test 3-D scene

Furthermore, when the sampling ratio of the test object image is reduced to half of the original one, the total number of object points drops to around 10 K. The calculation time for a one-frame hologram then becomes around 80 ms, with which a hologram video of about 12 f/s (frames per second) can be generated. Thus, by using a pair of GPUs, the total calculation time of the whole 30-frame hologram video is estimated to be around 1.2 s, which implies that real-time generation of holographic video with 2 K resolution can be achieved with the proposed method implemented on two GPUs.
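The real-time estimate follows from simple throughput arithmetic, sketched below. The sketch assumes ideal scaling, that is, frames are split evenly between the GPUs with no transfer or synchronization overhead; the function names and the 24 f/s threshold constant are illustrative.

```python
def frames_per_second(seconds_per_frame, n_gpus=1):
    """Throughput when frames are shared evenly by n_gpus (ideal scaling assumed)."""
    return n_gpus / seconds_per_frame

def video_time(n_frames, seconds_per_frame, n_gpus=1):
    """Total time for n_frames under the same ideal-scaling assumption."""
    return n_frames * seconds_per_frame / n_gpus

CTSF = 0.080        # s per frame for about 10 K object points, as quoted in the text
REAL_TIME_FPS = 24  # threshold used for "real-time" in this paper

single = frames_per_second(CTSF)           # one GPU
dual = frames_per_second(CTSF, n_gpus=2)   # two GPUs
total = video_time(30, CTSF, n_gpus=2)     # 30-frame video on two GPUs
print(single, dual, total, dual > REAL_TIME_FPS)
```

With 80 ms per frame, one GPU stays near 12 f/s, while two GPUs reach about 25 f/s and finish the 30 frames in about 1.2 s, which is just above the 24 f/s threshold.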

In addition, the ultimate computing capacities of the CPU- and GPU-based conventional 1-D NLUT and proposed FS-NLUT systems are tested for numbers of object points ranging from 10 K to 140 K. The experiments cover four cases: the 1-D NLUT implemented on the CPU, the FS-NLUT implemented on the CPU, the 1-D NLUT implemented on the GPU and the FS-NLUT implemented on the GPU.

As seen in Fig. 7(a), the calculation times on the GPU framework are much shorter than those on the CPU framework for both the conventional and proposed methods, which confirms that the computing capacity of the GPU framework is higher than that of the CPU. Moreover, in both the CPU and GPU frameworks, the hologram computing speed of the proposed method is markedly faster than that of the conventional method, and the gap between the two methods widens as the number of object points increases. For instance, the difference in calculation time between the conventional and proposed methods on the CPU framework is around 2.2 s for 10 K object points, whereas it is around 3 s and 4 s for 70 K and 140 K object points, respectively.

Fig. 7. Computing capacity comparisons of the conventional 1-D NLUT and proposed FS-NLUT methods implemented on the CPU and GPU frameworks: calculation time dependences on the number of object points (a) on the CPU and GPU, and (b) only on the GPU frameworks.

Figure 7(b) shows the calculation times of the conventional 1-D NLUT and proposed FS-NLUT methods on the GPU for numbers of object points ranging from 10 K to 140 K. As shown in Fig. 7(b), the calculation times for 10 K and 140 K object points are about 0.08 s and 0.27 s for the proposed method, and about 0.29 s and 0.76 s for the conventional method. For both methods implemented on the GPU framework, the calculation times remain below one second even for 140 K object points. In addition, the slopes of the two lines representing the dependence of the calculation time on the number of object points are approximately 0.036 and 0.014 for the conventional and proposed methods, respectively. These results show that the calculation time of the proposed method on the GPU framework increases very slowly, with a slope of 0.014, as the number of object points increases, which suggests that the proposed method implemented on the GPU framework can potentially be applied to the generation of high-resolution holograms for input 3-D scenes composed of millions of object points.
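As a consistency check, the quoted slopes can be recovered from the endpoint times. The sketch below reads the endpoints as 0.08 s to 0.27 s for the proposed method and 0.29 s to 0.76 s for the conventional method, which is the pairing that reproduces the quoted slopes, and it assumes that the slope is expressed in seconds per 10 K object points. The units, the helper names and the linear extrapolation to one million points are assumptions for illustration, not results reported in the paper.

```python
def slope_per_10k(n_start_k, t_start, n_end_k, t_end):
    """Slope of calculation time (s) per 10 K object points between two samples."""
    return (t_end - t_start) / ((n_end_k - n_start_k) / 10.0)

def extrapolate(n_k, n_ref_k, t_ref, slope):
    """Linear extrapolation of calculation time; assumes the trend stays linear."""
    return t_ref + slope * (n_k - n_ref_k) / 10.0

# Endpoint times (s) at 10 K and 140 K object points on the GPU framework.
proposed = slope_per_10k(10, 0.08, 140, 0.27)
conventional = slope_per_10k(10, 0.29, 140, 0.76)
print(f"proposed: {proposed:.4f} s per 10 K points")
print(f"conventional: {conventional:.4f} s per 10 K points")

# Hypothetical projection to 1,000 K points for the proposed method.
print(f"projected time at 1,000 K points: {extrapolate(1000, 10, 0.08, proposed):.2f} s")
```

The recovered values are close to 0.014 and 0.036, so the two sets of numbers are mutually consistent under these assumptions. The projection to a million points is only as reliable as the linear trend, since memory limits or thread saturation on a real GPU could bend the curve.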

4.4 Numerical and optical reconstructions of holographic video

Figures 8 and 10 show the computationally and optically reconstructed 3-D scene images, respectively, of the 4th, 7th, 10th, 13th, 16th, 19th, 22nd and 25th frames from the hologram patterns generated with the proposed method for the test 3-D scene. The 30 frames of computationally reconstructed 3-D scenes are also compressed into one video file, Visualization 1, whose running time is around three seconds, and included in Fig. 8. Here, the reconstruction distance from the hologram plane is set to 40 cm, which means that the reconstructed image is focused on the depth plane of the ‘House’ object.

Fig. 8. Computationally-reconstructed 3-D scene images in the 1st, 4th, 7th, 10th, 13th, 16th, 19th, 22nd, 25th and 28th video frames. Visualization 1

Fig. 9. Optical configuration for the reconstruction of holographic video.

Fig. 10. Optically-reconstructed 3-D object images of the 1st, 4th, 7th, 10th, 13th, 16th, 19th, 22nd, 25th and 28th video frames. Visualization 2

As seen in Fig. 8, the computationally reconstructed image of the ‘Car’ moves around the fixed ‘House’ along a circular route in the x-y-z coordinates. In the 4th, 7th and 25th video frames, the ‘Car’ images look more sharply focused than those in the other frames since they are located relatively close to the focused depth plane of the ‘House’, whereas in the other frames the ‘Car’ images appear somewhat out of focus since they are farther away from that plane.

Figure 9 shows the optical system implemented to reconstruct the holographic video generated with the proposed method. The green laser used as the light source is collimated and expanded with the laser collimator (LC) and beam expander (BE), and then illuminates the SLM onto which the calculated holographic video is loaded. In the experiments, a reflection-type phase-modulation SLM (Model: HOLOEYE PLUTO2.1) with a resolution of 1920×1080 pixels and a pixel pitch of 8 µm is employed, and the camera (Model: Canon EOS) is placed at a distance of 40 cm from the SLM.

The 30 frames of optically reconstructed 3-D scenes are also compressed into one 3-second video file, Visualization 2, and included in Fig. 10. As shown in Fig. 10, the optically reconstructed 3-D scene images are very similar to the computationally reconstructed images of Fig. 8. That is, the ‘Car’ images of the 4th, 7th and 25th video frames appear better focused than the others, just as in the computational reconstruction. On the other hand, in the 13th, 16th and 19th video frames, the ‘Car’ images look somewhat blurred since they are located in front of the ‘House’ in depth. Here, the blurred images appearing at the top left of each reconstructed image result from the conjugate object images caused by the use of real-valued holograms.
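The origin of the conjugate image can be seen in a 1-D toy analog, sketched below under simplifying assumptions. A single complex fringe of spatial frequency K stands in for the complex hologram of one object point; keeping only its real part is equivalent to adding its complex conjugate, so a mirrored frequency component appears. The sizes N and K and the helper names are hypothetical, and the naive DFT is used only to keep the example self-contained.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform, adequate for a short toy signal."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def peaks(spectrum, tol=1e-6):
    """Indices of the frequency bins that carry non-negligible energy."""
    return [k for k, v in enumerate(spectrum) if abs(v) > tol]

N = 16  # number of samples (illustrative)
K = 3   # spatial frequency of the fringe (illustrative)

complex_fringe = [cmath.exp(2j * cmath.pi * K * t / N) for t in range(N)]
real_fringe = [z.real for z in complex_fringe]  # real-valued "hologram"

print(peaks(dft(complex_fringe)))  # only the bin at K
print(peaks(dft(real_fringe)))     # bins at K and at its mirror N - K
```

The complex fringe occupies a single frequency bin, whereas the real-valued one occupies the bin at K and its mirror at N − K. The mirrored component plays the role of the conjugate image, which in the optical reconstruction appears as an additional defocused copy of the object.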

5. Conclusion

In this paper, a Fourier spectrum-based NLUT (FS-NLUT) method is proposed for faster generation of holographic video. The method is composed of so-called PFSs, which can be made much smaller than the PFPs of the conventional NLUT-based methods. This enables a massive reduction in the number of hologram calculation operations, as well as a reduction of the overall hologram calculation time thanks to its 1-D calculation framework. Experiments show that, compared with the conventional 1-D NLUT method, the proposed method requires 81.23% fewer hologram calculation operations and is up to 66% faster in hologram calculation. Moreover, it is confirmed in practice that the proposed method implemented on two GPUs can generate the holographic video of a test 3-D scene in real time.

Funding

Natural Science Foundation of Guangdong Province (2020A1515010345); National Natural Science Foundation of China (61827804, 61991451); National Research Foundation of Korea (2018R1A6A1A03025242); Institute for Information and Communications Technology Promotion (IITP-2017-01629).

Acknowledgment

The authors thank Dr. Ping Su (Shenzhen International Graduate School, Tsinghua University) for her kind assistance in providing the experimental set-up for the optical experiments of this paper.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. Gabor, “A new microscopic principle,” Nature 161(4098), 777–778 (1948). [CrossRef]  

2. C. J. Kuo and M.-H. Tsai, Three-Dimensional Holographic Imaging (John Wiley & Sons, 2002).

3. T. C. Poon, Digital Holography and Three-dimensional Display (Springer Verlag, 2007).

4. H. Sasaki, K. Yamamoto, K. Wakunami, Y. Ichihashi, R. Oi, and T. Senoh, “Large size three-dimensional video by electronic holography using multiple spatial light modulators,” Sci. Rep. 4(1), 6177 (2015). [CrossRef]  

5. J. Hahn, H. Kim, Y. Lim, G. Park, and B. Lee, “Wide viewing angle dynamic holographic stereogram with a curved array of spatial light modulators,” Opt. Express 16(16), 12372–12386 (2008). [CrossRef]  

6. A. E. Shortt, T. J. Naughton, and B. Javidi, “Histogram approaches for lossy compression of digital holograms of three-dimensional objects,” IEEE Trans. on Image Process. 16(6), 1548–1556 (2007). [CrossRef]  

7. T. Nishitsuji, T. T. Shimobaba, T. Kakue, N. Masuda, and T. Ito, “Review of fast calculation techniques for computer-generated holograms with the point-light-source-based model,” IEEE Trans. Ind. Inf. 13(5), 2447–2454 (2017). [CrossRef]  

8. E. Sahin, E. Stoykova, and J. Mäkinen,, and A. Gotchev, “Computer-generated holograms for 3D imaging: A survey,” ACM Comput. Surv. 53(2), 1–35 (2020). [CrossRef]  

9. M. Lucente, “Interactive computation of holograms using a look-up table,” J. Electron. Imaging 2(1), 28–34 (1993). [CrossRef]  

10. S. C. Kim and E. S. Kim, “Effective generation of digital holograms of three-dimensional objects using a novel look-up table method,” Appl. Opt. 47(19), D55–D62 (2008). [CrossRef]  

11. S. M. Jiao, Z. Y. Zhuang, and W. B. Zou, “Fast computer generated hologram calculation with a mini look-up table incorporated with radial symmetric interpolation,” Opt. Express 25(1), 112–123 (2017). [CrossRef]  

12. S. Lee, H. Nam, D. Park, and C. Kim, “Fast Hologram Pattern Generation by Removing Concentric Redundancy,” Sid Symp, Digest Tech. Pap.43(1), 800–803 (2012).

13. T. Nishitsuji, T. Shimobaba, T. Kakue, N. Masuda, and T. Ito, “Fast calculation of computer-generated hologram using the circular symmetry of zone plates,” Opt. Express 20(25), 27496–27502 (2012). [CrossRef]  

14. S. C. Kim, J. M. Kim, and E. S. Kim, “Effective memory reduction of the novel look-up table with one dimensional sub-principle fringe patterns in computer-generated holograms,” Opt. Express 20(11), 12021–12034 (2012). [CrossRef]  

15. S. C. Kim and E. S. Kim, “Fast one-step calculation of holographic videos of three-dimensional scenes by combined use of baseline and depth-compensating principal fringe patterns,” Opt. Express 22(19), 22513–22527 (2014). [CrossRef]  

16. S. C. Kim, X. B. Dong, and E. S. Kim, “Accelerated one-step generation of full-color holographic videos using a color-tunable novel-look-up-table method for holographic three-dimensional television broadcasting,” Sci. Rep. 5, 1–10 (2015). [CrossRef]  

17. S. C. Kim, J. H. Yoon, and E. S. Kim, “Fast generation of three-dimensional video holograms by combined use of data compression and lookup table techniques,” Appl. Opt. 47(32), 5986–5995 (2008). [CrossRef]  

18. X. B. Dong, S. C. Kim, and E. S. Kim, “MPEG-based novel-look-up-table method for accelerated computation of digital video holograms of three-dimensional objects in motion,” Opt. Express 22(7), 8047–8067 (2014). [CrossRef]  

19. H. K. Cao, S. F. Lin, and E. S. Kim, “Accelerated generation of holographic videos of 3-D objects in rotational motion using a curved hologram-based rotational-motion compensation method,” Opt. Express 26(16), 21279–21300 (2018). [CrossRef]  

20. H. K. Cao and E. S. Kim, “Faster generation of holographic videos of objects moving in space using a spherical hologram-based 3-D rotational motion compensation scheme,” Opt. Express 27(20), 29139–29157 (2019). [CrossRef]  

21. Y. Pan, X. Xu, S. Solanki, X. Liang, R. Tanjung, C. Tan, and T. Chong, “Fast CGH computation using S-LUT on GPU,” Opt. Express 17(21), 18543–18555 (2009). [CrossRef]  

22. J. Jia, Y. T. Wang, X. Li, Y. J. Pan, B. Zhang, Q. Zhao, and W. Jiang, “Reducing the memory usage for effective computer-generated hologram calculation using compressed look-up table in full-color holographic display,” Appl. Opt. 52(7), 1404–1412 (2013). [CrossRef]  

23. C. Gao, J. Liu, X. Li, G. L. Xue, J. Jia, and Y. T. Wang, “Accurate compressed look up table method for CGH in 3D holographic display,” Opt. Express 23(26), 33194–33204 (2015). [CrossRef]  

24. Y. Zhao, K. C. Kwon, M. U. Erdenebat, M. S. Islam, S. H. Jeon, and N. Kim, “Quality enhancement and GPU acceleration for a full-color holographic system using a relocated point cloud gridding method,” Appl. Opt. 57(15), 4253–4262 (2018). [CrossRef]  

25. Y. Zhao, L. C. Cao, H. Zhang, D. Kong, and G. F. Jin, “Accurate calculation of computer-generated holograms using angular-spectrum layer-oriented method,” Opt. Express 23(20), 25440 (2015). [CrossRef]  

26. H. Zhang, L. C. Cao, and G. F. Jin, “Three-dimensional computer-generated hologram with Fourier domain segmentation,” Opt. Express 27(8), 11689–11697 (2019). [CrossRef]  

27. T. Shimobaba, N. Masuda, and T. Ito, “Simple and fast calclulation algorithm for computer-generated hologram with wavefront recording plane,” Opt. Lett. 34(20), 3133–3135 (2009). [CrossRef]  

28. P. W. M. Tsang and T. C. Poon, “Fast generation of digital holograms based on warping of the wavefront recording plane,” Opt. Express 23(6), 7667–7673 (2015). [CrossRef]  

29. Y. L. Li, D. Wang, N. N. Li, and Q. H. Wang, “Fast hologram generation method based on the optimal segmentation of a sub-CGH,” Opt. Express 28(21), 32185–32198 (2020). [CrossRef]  

30. T. Shimobaba and T. Ito, “Fast generation of computer-generated holograms using wavelet shrinkage,” Opt. Express 25(1), 77–87 (2017). [CrossRef]  

31. D. Blinder and P. Schelkens, “Accelerated computer generated holography using sparse bases in the STFT domain,” Opt. Express 26(2), 1461–1473 (2018). [CrossRef]  

32. H. Kim, J. k. Hahn, and B. ho Lee, “Mathematical modeling of triangle-mesh-modeled three-dimensional surface objects for digital holography,” Appl. Opt. 47(19), D117–D127 (2008). [CrossRef]  

33. D. Im, J. Cho, J. Hahn, B. Lee, and H. Kim, “Accelerated synthesis algorithm of polygon computer-generated holograms,” Opt. Express 23(3), 2863–2871 (2015). [CrossRef]  

34. Y-M Ji, H. Yeom, and J. H. Park, “Efficient texture mapping by adaptive mesh division in mesh-based computer generated hologram,” Opt. Express 24(24), 28154–28169 (2016). [CrossRef]  

35. T. Sugie, T. Akamatsu, T. Nishitsuji, R. Hirayama, N. Masuda, H. Nakayama, Y. Ichihashi, A. Shiraki, M. Oikawa, N. Takada, and Y. Endo, “High-performance parallel computing for next-generation holographic imaging,” Nat. Electron. 1(4), 254–259 (2018). [CrossRef]  

36. M. W. Kwon, S. C. Kim, S. E. Yoon, Y. S. Ho, and E. S. Kim, “Object tracking mask-based NLUT on GPUs for real-time generation of holographic videos of three-dimensional scenes,” Opt. Express 23(3), 2101–2120 (2015). [CrossRef]  

37. H. Sato, T. Kakue, Y. Ichihashi, Y. Endo, K. Wakunami, R. Oi, K. Yamamoto, H. Nakayama, T. Shimobaba, and T. Ito, “Real-time color hologram generation based on ray-sampling plane with multi-GPU acceleration,” Sci Rep 8(1), 1–10 (2018). [CrossRef]  

38. H. Niwase, T. Naoki, A. Hiromitsu, M. Yuki, F. Masato, N. Hirotaka, K. Takashi, S. Tomoyoshi, and I. Tomoyoshi, “Real-time electro holography using a multiple-graphics processing unit cluster system with a single spatial light modulator and the InfiniBand network,” Opt. Eng. 55(09), 1 (2016). [CrossRef]  

39. H. K. Cao and E. S. Kim, “Full-scale one-dimensional NLUT method for accelerated generation of holographic videos with the least memory capacity,” Opt. Express 27(9), 12673–12691 (2019). [CrossRef]  

40. J. W. Goodman, Introduction to Fourier Optics (Roberts & Company Publishers, 2005), P353-354.

Supplementary Material (2)

Name             Description
Visualization 1  30 frames of computationally reconstructed 3-D scenes compressed into one video file.
Visualization 2  30 frames of optically reconstructed 3-D scenes compressed into one video file.
