## Abstract

Fast two-step layer-based and sub-sparse two-dimensional Fast Fourier transform (SS-2DFFT) algorithms are proposed to speed up the calculation of computer-generated holograms. In a layer-based method, each layer image may contain large areas in which the pixel values are zero considering the occlusion effect among the different depth layers. By taking advantage of this feature, the two-step layer-based algorithm only calculates the non-zero image areas of each layer. In addition, the SS-2DFFT method implements two one-dimensional fast Fourier transforms (1DFFT) to compute a 2DFFT without calculating the rows or columns in which the image pixels are all zero. Since the size of the active calculation is reduced, the computational speed is considerably improved. Numerical simulations and optical experiments are performed to confirm these methods. The results show that the total computational time can be reduced by 5 times for a three-dimensional (3D) object of a train, 3.4 times for a 3D object of a castle and 10 times for a 3D object of a statue head when compared with a conventional layer-based method if combining the two proposed methods together.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Holography is regarded as a promising approach for true 3D display which can produce all depth cues without the need to wear any device. In computer generated holographic 3D displays, the digital hologram is calculated and then used to reconstruct 3D images. To realize the real-time computation, several fast approaches are proposed [1–8]. Among them, a layer-based method is the fastest one by far. In such an effective method [1,9,10], the calculation of each layer is fast by employing a two dimensional Fast Fourier Transform (2DFFT) rather than wave propagations for millions of point light sources or thousands of polygon light sources. The computational load is related only to the performance time of 2DFFT and limited by the number and the image size of layers. Considering the limited accommodation resolution of human eyes, it can reduce the computational load greatly by calculating only a few layers. Separately, a fraction method is proposed [9] to improve the speed of computer-generated holograms (CGH) by reducing the image size of calculation. To make of the original image size, it duplicated the calculated size-reduced hologram to other fraction areas in hologram plane. The computational time is reduced but at cost of degrading the resolution of reconstructed images.

Angular-spectrum layer-oriented method is another type of layer-based method which uses angular spectrum transform instead of 2DFFT [10]. By using the angular spectrum without paraxial approximation, the quality of the optical reconstructions could be greatly improved compared with the conventional paraxial approximation methods. However, the computational load is increased due to the usage of an angular transform which includes two 2DFFTs.

In 2016, a non-uniform 2DFFT method was applied in the layer-based method to calculate holograms [11]. It does not need to perform 2DFFT for every depth layers which will reduce the computational load. However, the method is only suitable if the depth map of the 3D object is smooth enough. Any sharp change between adjacent points in depth may cause the error in depth reconstruction. No matter what kind of layer-based methods is used, 2DFFT is the basic mathematics tool which limits the CGH calculation speed. In many applications, most of the Fourier coefficients of a signal are small or equal to zero, which defines that the output of the FFT is (approximately) sparse. To take advantage of this feature, a method of sparse Fast Fourier transform (sFFT) was proposed [12–14] to speed up the Fourier transform when the target signal is sparse. However, sparse Fourier transform methods require a very big range of sparsity to compare the total signal. It is very difficult to apply this method to the calculation of computer-generated holograms.

In 2016 and 2017, the sparsity of hologram fringe pattern was explored and the calculation time could be reduced by only selecting top 5 - 10% dominant signals of hologram fringe pattern which would be sparse enough to use spare FFT to calculate the final holograms from virtual recording planes [15] or depth planes [16]. The results showed that the proposed method could speed up CGH calculation and still provide acceptable visual quality of the reconstructed images.

In this paper, a fast two-steps layer-based algorithm and a sub-sparse 2DFFT (SS-2DFFT) are proposed to speed up the calculation of computer-generated holograms. The 3D objects are divided into several 2D sliced layer images along the viewing direction. Considering the occlusion effect among the different layers, each layer image contains large image areas in which the pixel values are zero. By taking advantage of this feature, the two-step layer-based algorithm only calculates the non-zero image area of each layer. Therefore, the 2DFFT output size for each layer is reduced. In addition, SS-2DFFT implements two one dimensional fast Fourier transforms (1DFFT) which is equivalent to compute 2DFFT but without calculating the rows or columns which image pixels are all zero. Consequently, the computational load of 2DFFT is reduced. Numerical simulations and optical experiments are performed and the results show a considerable reduction in computational load by combining these two methods in comparison with conventional layer-based CGH calculations.

## 2. Traditional layer-based method and computational load

The principle of the traditional layer-based method is shown in Fig. 1. A 3D object is divided into several 2D sliced layer images according to its depth map. 2DFFT is used to create a diffractive pattern for each layer, which multiplies with the holographic lens with a focal length corresponding to the layer's depth before being added together to create a Fourier hologram as described by the following equation:

where*2DFFT*denotes the two dimensional fast Fourier transform;

_{xy}*lens*(

*z*) denotes the lens function which produces the depth information; L is the number of depth layer. Assuming the size of each 2D sliced layer image is M × N, the operations of computing 2DFFT is OFFT [M × N × log(M × N)] which includes the operation of complex multiplications and additions. While the 3D object has L depth layers, the total computational operation will be increased L times.

_{l}## 3. Proposed methods

#### 3.1 Sub-sparse 2D fast Fourier transform

The two dimensional fast Fourier transform (2DFFT) is the basic mathematics tool in layer-based CGH calculation methods, which is expressed as

The computational load is proportional to the output size *M* × *N*, which is *O _{FFT}*[

*M*×

*N*× log(

*M*×

*N*)]. By using the feature of reparability of 2DFFT, two 1DFFT are used which is equivalent to compute a 2DFFT that:

Equation (2) can be written as:

where*x*,

*y*are the pixel coordinations on the image plane;

*u*,

*v*are the spectrum coordinations on the Fourier transform plane;

*M*and

*N*are the pixel number in

*x*and

*y*directions. To obtain the 2D Fourier transform, the implementing process is performing the 1DFFT for each row of an image firstly, then doing another 1DFFT for each column of the previous 1DFFT results, as shown in Fig. 2. It is the same to calculate the image columns firstly and then rows. For the image rows or columns in which all the pixel values are equal to zero, we call them zero-pixel rows or zero-pixel columns, their 1DFFT value is zero. Therefore, these image rows or columns do not need to be calculated. This avoiding calculating the zero-pixel rows or columns by using two separate 1D FFT reduces the computational load of 2DFFT. The images contain several zero-pixel rows or columns are named as the sub-sparse images, and this proposed method is named as the sub-sparse 2DFFT (SS-2DFFT) to differ the sparse images and sparse Fourier transform method. It is noted that the sparse dominant signals can distribute randomly in sFFT method, but a very big range of sparsity is required for applying sFFT. In contrast, our proposed SS-2DFFT method can be used to speed up the 2D FFT for images with any range of sparsity, if only the images contain the rows or columns in which all the pixel values are equal to zero.

Since the results of 1DFFT are not the sub-sparse images anymore, the computational load could not be reduced in the second 1DFFT. Assuming a sub-sparse image contain *N _{l}* image rows or columns where all pixel values are zero, the computational complexity of 2DFFT will be reduced to

*O*[

_{FFT}*M*× (

*N*-

*N*) × log

_{l}*M*+

*M*×

*N*× log

*N*)] by using sub-sparse 2DFFT.

#### 3.2 Two-step layer-based method

In a layer-based method, the 3D objects are sliced along the viewing direction. Considering the occlusion of the layers, each 2D sliced layer image may contain the large area that the pixel values are all zero which do not carry any 3D objects information. By taking advantage of this feature, the proposed two-step layer-based algorithm only calculates the non-zero image areas of each layer, the schedule of principle is shown in Fig. 3.

In the first step, the amplitude complex of the non-zero image areas are calculated into one layer named a reference plane by using angular spectrum transform:

*T*(z

*) is the angular spectrum transform function:where (*

_{l}*x*,

_{l}*y*) denotes the non-zero image areas for different layers;

_{l}*z*denotes the angular spectrum transform distance between the 2D sliced layer image and the reference plane, which produces the depth information instead of using holographic lens in the traditional layer-based method; the

_{l}*f*is the radial variable in spatial spectrum domain.

_{r}The non-zero image areas can be found at the rectangle areas enclosed by the four vertexes (*x _{min}*,

*y*), (

_{min}*x*,

_{max}*y*), (

_{min}*x*,

_{min}*y*) and (

_{max}*x*,

_{max}*y*) denoted by red rectangles in Fig. 3(a), where

_{max}*x*,

_{min}*y*,

_{min}*x*, and

_{max}*y*are the minimum and maximum coordination value of non-zero pixels in

_{max}*x*and

*y*directions for each layer. Because the non-zero image areas cover a small area compared with the whole 2D sliced layer image, the 2DFFT output size is significantly reduced. Furthermore, it not only speeds up the 2DFFT calculation but also reduces the computation load of the data processing such as a hologram transferring and combination in the traditional layer-based method.

After all the layers are calculated into the reference plane, the complex amplitude on the hologram plane is calculated from the reference plane by using 2DFFT. It is assuming the size of non-zero image areas are${{M}^{\prime}}_{l}\times {{N}^{\prime}}_{l}$, where${{M}^{\prime}}_{l}<M,{{N}^{\prime}}_{l}<N$. The computational load of proposed two-step layer-based method is reduced to${\sum}_{l}^{L-1}{{M}^{\prime}}_{l}{{N}^{\prime}}_{l}\mathrm{log}({{M}^{\prime}}_{l}{{N}^{\prime}}_{l})}+MN\mathrm{log}(MN)$.

In short, the comparison of the computational complexity of the traditional layer-based method, the sub-sparse 2DFFT, and the two-step layer-based method are shown in Table 1.

The computational complexity of CGH is determined by the size of 2D sliced layer images and the performing number of 2DFFT. Since the sub-sparse 2DFFT and two-step layer-based methods reduce the actual calculation size of each depth layers, the computational complexity is much less than the traditional method. It is noted that these two methods are compatible techniques, therefore, the computational load can be further reduced by combining two methods.

## 4. Aliasing minimization by dynamical zero padding

Reducing the 2DFFT area may cause the aliasing error in two-step layer-based method. In CGH calculation by using Fourier transform, its Fourier spectrum on the hologram plane will infinitely extend in the spectrum domain because the 3D object is space-limited, and vice versa. Only the hologram is well-sampled, the object light can be retrieved digitally or optically. Otherwise, the high order images will appear in the target image repeatedly when the hologram is under-sampled, this phenomenon is called aliasing error [17].

It is assuming the size of 2D sliced layer image is *H* × *W*, which contains the *h* × *w* non-zero image area, as shown in Fig. 4(a). In the traditional method, the spatial size *H* × *W* of the hologram remains the same in the calculation and reconstruction involving the Fourier transform algorithm. The image can be reconstructed by convolving hologram *H _{r}* (

*x*,

*y*) with transfer function

*T*(-z

*) which represents reverse propagation. Because*

_{l}*T*(z

*) ×*

_{l}*T*(-z

*) = 1, the 3D object*

_{l}*f*(

*x*,

*y*,

*z*) can be reconstructed without aliasing error as shown in Fig. 4(b).

In contrast, only the non-zero image area of *h* × *w* is calculated into a hologram in two-steps layer based method. While in image reconstruction, the hologram size is extended to *H* × *W*. When the transfer function is sampled at the different intervals between the forward and backward transfer propagation and the intervals of which is smaller than the frequency of transfer function, an aliasing error may be introduced to the image plane [18,19], as shown in Fig. 4(d).

To minimize the aliasing error in the region of target image area while maintaining the small 2DFFT calculation size, the dynamical zero-padding method is proposed. By reducing the aliasing error, the zero is added around the non-zero image area to get the proportion spatial spectrum samplings in the transfer function. Due to AST is used to calculate the hologram, we only need to consider the sampling property in the Fourier spectrum domain. The condition of well-sampling for *x*-direction is thus given by [17]:

*f*is 1/(2Δ

_{r}_{0}), Δ

_{0}is sampling period on the image plane; the Δ

*is the sampling period on the hologram plane (in the spectrum domain) after zero padding, which is equal to 1/*

_{f}*w'*and can be derived from Eq. (8),

According to Eq. (9), we found that zero padding is necessary to avoid the aliasing error and the size of zero padding is determined by the propagation distance and the initial size of non-zero image areas. Therefore, for different layers, different size of zero is added around the non-zero image areas to get the aliasing free images. It is noted that only the size *w'* after zero padding is smaller than the original size of 2D sliced layer images, the proposed methods are efficient. Otherwise, the traditional layer-based method is used to calculate the CGH directly.

On the other hands, reducing the 2DFFT calculation areas in two-step layer-based method may cause information loss and degrade the image quality, because the information will distribute to the whole hologram plane as regarding the surfaces of 3D objects are diffuse. The information distribution area is determined by the diffraction angle of hologram and propagation distance. Fortunately, the information distributing area is limited by the diffraction ability of hologram and the information lost can be ignored when the reference plane is close to other layers. Therefore, in two-step layer-based method, to minimize the propagation distances between each 2D sliced layer image and the reference plane, the central 2D sliced layer image of 3D objects are chosen as the reference plane. In this case, the largest propagation distance is equal to the half-length of a 3D object in the propagation direction. Otherwise, the zero-padding increases the calculation area which can reduce the information lost. For the 3D objects with long depth, multi-reference planes can be used. After all the depth layers are calculated to the reference plane, its complex amplitude information are diffracted to the whole hologram plane by diffraction calculation. Therefore, the information lost is very limited and can be ignored.

## 5. Implementation results and discussion

To show the results of the proposed algorithms, we perform numerical simulations and optical experiments. Our program is run by a personal computer (CPU 2.59GHz, RAM 8G and Operating System 64) under the Matlab 2010 environment. The hologram is numerically generated and loaded on the phase-only spatial light modulator (HoloEye, PLUTO) with the resolution of 1,920 × 1,080 and the pixel size of 8 *um*. The wavelength of the light source in use is 633 *nm*.

In the proposed two methods, the computational speed is improved by reducing the computational size of calculation area. However, as analyzation in section 4, directly reducing the Fourier transform area will cause an aliasing error. It is assuming the size of 2D sliced layer image is 1920 × 1920, the non-zero pixel area is the central rectangle with the size of 480 × 480 as shown in Fig. 5(a). The high order images appear around the central target image area in reconstruction image plane by using the hologram calculated from the non-zero pixel area only, as shown in Fig. 5(b). To minimize the aliasing error, the proposed dynamical zero-padding method is applied, and the zero padding size is obtained by Eq. (9). The simulation results show the aliasing error is totally removed from the target image at the cost of additional calculation, as shown in Fig. 5(c). Fortunately, the pixels in zero padding area are all zero, which do not need to be calculated based on the SS-2DFFT.

The calculation time comparison of total Fourier transform area, non-zero Fourier transform area with/without zero padding, non-zero Fourier transform area with zero padding and SS-2DFFT are shown in Fig. 6. It is assuming the size of 2D sliced layer image is 1920 × 1920 as well, the non-zero pixel area is the central rectangle with the size from 200 × 200 to 1920 × 1920. The zero-padding size is 20 pixels around the non-zero pixel area. The results show that the calculation time of traditional angular spectrum transform method denoted by the blue bar is independent of non-zero pixel size. Instead of calculating the whole area, only calculating non-zero pixel area can reduce the computational time denoted by the orange bar, which can be further reduced with the decreasing of the non-zero pixel area, but it suffers from the aliasing error. Zero padding can be used to remove the aliasing error, but it increases the calculation size and costs more time denoted by the gray bar. By applying the SS-2DFFT, the computational time denoted by yellow bar can be reduced without calculation the zero-pixel rows or column in zero padding area. Furthermore, the larger size of zero padding, the more improvement will be achieved by applying the SS-2DFFT in two-step layer based method. The results show that the computational load can reduce by avoiding calculation the zero-pixelimage area, and the more zero-pixel area a 2D sliced layer image contains, the more reduction of computational time is. However, it is noted that when the calculation area after zero padding is equal or larger than the original size of 2D sliced layer images, the proposed methods cost more time than traditional methods. In this case, the traditional layer-based method is used to calculate the CGH directly.

For better demonstrating the feasibility of the proposed new algorithm, the CGHs of a train with the size of 6mm (width) × 8mm (height) × 20mm (depth), a statue head with the size of 7 mm (width) × 7mm (height) × 8mm (depth), and a castle with the size of 12mm (width) × 10mm (height) × 30mm (depth) are computed by using hybrid method of SS-2DFFT and two-step layer-based. The resolution of each depth layer is 1920 × 1920. The 3D objects contain 50 depth layers. In both the simulation and experimental results in Fig. 7, only the parts of 3D objects on focused are clear and the other parts would become blurred. The 3-D objects are reconstructed successfully.

To show the improvement due to the introduction of individual methods, the total computation times at different stages are listed in Table 2. The time spent on data reading for all different methods are same. In the traditional layer-based method, the most of the computational time is spent on the CGH calculation including 2DFFT performance, attaching holographic lenses and stacked up all the layers together. In the SS-2DFFT method and two-step layer-based method, the data pre-processing is necessary which finds the non-zero rows or non-zero columns and non-zero image areas. Although the additional time is spent on data processing compared with the traditional method, these two proposed methods are still faster since the size of active calculation area is largely reduced. By combining two proposed methods, the final calculation speed can be improved by 5 times for a 3D object of train, 3.4 times for a 3D object of castle and 10 times for a 3D object of statue head when compared with a traditional layer based method.

Because the proposed methods are related to the sparsity and non-zero image pixel distribution of the 2D sliced layer images, the improvement of calculation speed is different for 3D objects with different sparsity. In our experiments, the statue head with the smallest non-zero image area 979 × 787 has the fastest calculation speed. For the train with the non-zero image area 893 × 1402, although the height is similar to the statue head, the train’s long body occupies the large area in the horizontal direction of 2D sliced depth image which spends more time than statue head. For the castle with the largest non-zero area of 1354 × 1526 spends most time than train and statue head. Since the largest calculation size after zero padding among the 2D sliced depth images is 1320 × 1540 which is still smaller than the original size of 1920 × 1920, the proposed method is faster than transitional method.

The computational time can be significantly reduced when the 3D objects consisting many depth layers, compared to the traditional method. In contrast, for 3D objects with only a small number of depth layers, the efficiency of the two-step layer-based method will be reduced, because of the decrease of the sparsity of depth layer images. For 3D objects with two depth layers, the final calculation speed can be still improved by 3.3 times for a 3D object of train, 2.4 times for a 3D object of castle and 2.5 times for a 3D object of statue head when compared with a traditional layer based method.

## 6. Conclusions

We proposed the use of two compatible methods to improve the calculation speed of holograms based on a layer-based method. It was shown that the SS-2DFFT method and two-step layer-based method only calculate the non-zero image area for each depth layer. Since the active calculation image size is reduced, the computational speed can be improved. The numerical simulation and optical experiments are performed to proof our methods. As a result, the calculation speed of an algorithm based on the hybrid of SS-2DFFT and two-step layer-based method can be improved by 5 times for a 3D object of train, 3.4 times for a 3D object of castle and 10 times for a 3D object of statue head compared with a conventional layer-based method.

## References and links

**1. **J.-S. Chen, D. Chu, and Q. Smithwick, “Rapid hologram generation utilizing layer-based approach and graphic rendering for realistic three-dimensional image reconstruction by angular tiling,” J. Electron. Imaging **23**(2), 023016 (2014). [CrossRef]

**2. **X.-B. Dong, S.-C. Kim, and E.-S. Kim, “MPEG-based novel look-up table for rapid generation of video holograms of fast-moving three-dimensional objects,” Opt. Express **22**(7), 8047–8067 (2014). [CrossRef] [PubMed]

**3. **C. Gao, J. Liu, X. Li, G. Xue, J. Jia, and Y. Wang, “Accurate compressed look up table method for CGH in 3D holographic display,” Opt. Express **23**(26), 33194–33204 (2015). [CrossRef] [PubMed]

**4. **H. Kang, E. Stoykova, and H. Yoshikawa, “Fast phase-added stereogram algorithm for generation of photorealistic 3D content,” Appl. Opt. **55**(3), A135–A143 (2016). [CrossRef] [PubMed]

**5. **M.-W. Kwon, S.-C. Kim, and E.-S. Kim, “Three-directional motion-compensation mask-based novel look-up table on graphics processing units for video-rate generation of digital holographic videos of three-dimensional scenes,” Appl. Opt. **55**(3), A22–A31 (2016). [CrossRef] [PubMed]

**6. **T. Nishitsuji, T. Shimobaba, T. Kakue, D. Arai, and T. Ito, “Simple and fast cosine approximation method for computer-generated hologram calculation,” Opt. Express **23**(25), 32465–32470 (2015). [CrossRef] [PubMed]

**7. **Y. Sando, D. Barada, and T. Yatagai, “Fast calculation of computer-generated holograms based on 3-D Fourier spectrum for omnidirectional diffraction from a 3-D voxel-based object,” Opt. Express **20**(19), 20962–20969 (2012). [CrossRef] [PubMed]

**8. **J. Jia, Y. Wang, J. Liu, X. Li, Y. Pan, Z. Sun, B. Zhang, Q. Zhao, and W. Jiang, “Reducing the memory usage for effective computer-generated hologram calculation using compressed look-up table in full-color holographic display,” Appl. Opt. **52**(7), 1404–1412 (2013). [CrossRef] [PubMed]

**9. **J.-S. Chen and D. P. Chu, “Improved layer-based method for rapid hologram generation and real-time interactive holographic display applications,” Opt. Express **23**(14), 18143–18155 (2015). [CrossRef] [PubMed]

**10. **Y. Zhao, L. Cao, H. Zhang, D. Kong, and G. Jin, “Accurate calculation of computer-generated holograms using angular-spectrum layer-oriented method,” Opt. Express **23**(20), 25440–25449 (2015). [CrossRef] [PubMed]

**11. **C. Chang, J. Wu, Y. Qi, C. Yuan, S. Nie, and J. Xia, “Simple calculation of a computer-generated hologram for lensless holographic 3D projection using a nonuniform sampled wavefront recording plane,” Appl. Opt. **55**(28), 7988–7996 (2016). [CrossRef] [PubMed]

**12. **H. Hassanieh, P. Indyk, D. Katabi, and E. Price, “Nearly Optimal Sparse Fourier Transform,” in Proceedings Forty-fourth Annual ACM SymposiumTheory Computing, STOC’12 (ACM, 2012), pp. 563–578.

**13. **H. Hassanieh, P. Indyk, D. Katabi, and E. Price, “Simple and Practical Algorithm for Sparse Fourier Transform,” in Proceedings Twenty-third Annual ACM-SIAM Symposium Discrete Algorithms, SODA’12 (Society for Industrial and Applied Mathematics, 2012), pp. 1183–1194. [CrossRef]

**14. **J. Hu, Z. Wang, Q. Qiu, W. Xiao, and D. Lilja, “Sparse Fast Fourier Transform on GPUs and Multi-core CPUs,” in Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on 2012, pp. 83–91. [CrossRef]

**15. **H. G. Kim, H. Jeong, and Y. Man Ro, “Acceleration of the calculation speed of computer-generated holograms using the sparsity of the holographic fringe pattern for a 3D object,” Opt. Express **24**(22), 25317–25328 (2016). [CrossRef] [PubMed]

**16. **H. G. Kim and Y. Man Ro, “Ultrafast layer based computer-generated hologram calculation with sparse template holographic fringe pattern for 3-D object,” Opt. Express **25**(24), 30418–30427 (2017). [CrossRef] [PubMed]

**17. **J.-P. Liu, “Controlling the aliasing by zero-padding in the digital calculation of the scalar diffraction,” J. Opt. Soc. Am. A **29**(9), 1956–1964 (2012). [CrossRef] [PubMed]

**18. **K. Matsushima and T. Shimobaba, “Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields,” Opt. Express **17**(22), 19662–19673 (2009). [CrossRef] [PubMed]

**19. **L. Onural, “Sampling of the diffraction field,” Appl. Opt. **39**(32), 5929–5935 (2000). [CrossRef] [PubMed]