Multi-anchor spatial phase unwrapping for fringe projection profilometry

Sen Xiang; Sen Xiang; You Yang; You Yang; Huiping Deng; Huiping Deng; Jin Wu; Jin Wu; Li Yu; Li Yu

doi:10.1364/OE.27.033488

1. Introduction

In recent years, depth-related topics have been widely studied in both academia and industry [1–3]. As an important depth sensing technique, structured light (SL) is also obtaining increasing attention. Based on stereo vision, an SL system consists of a projector and a camera as shown in Fig. 1(a). The projector casts patterns $I_{prj}$ to the measured scene and a reference plane with known depth $Z_0$. Correspondingly, the camera records the patterns, notated as $I_{cap}$ for the measured scene and $I_{ref}$ for the reference plane. The difference between $I_{cap}$ and $I_{ref}$ only relates to the shape or depth of the measured objects, so once the correspondence between $I_{cap}$ and $I_{ref}$ is established, depth or shape can be derived. For example, Kinect V1 matches $I_{cap}$ and $I_{ref}$ in intensity domain to find the optimal disparity [4]. More advanced methods incorporate coding strategies when designing the projected patterns [3,5]. In these patterns, every pixel has its unique code, and thus the correspondence between $I_{cap}$ and $I_{ref}$ can be built accurately and quickly.

Fig. 1. Sketch of a typical SL system. (a) System setup. (b) Mainstream coding strategies.

Download Full Size | PDF

Mainstream coding strategies can be divided into temporal, spatial and phase-based ones as shown in Fig. 1(b). Temporal methods project a set of patterns, and the code of a pixel is obtained by sequentially assembling the pattern intensity. Spatial strategies need a single pattern, and the code is generated from a local patch. Both strategies use grayscale or binary patterns, and the codes are usually discrete. Phase-based techniques are quite different. They use phase-modulated patterns where the latent phase is utilized as the codeword. The patterns consist of a serial of fringes, and this technique is also known as fringe-projection profilometry. Since phase values are continuous, fringe patterns produce smoother depth than discrete codewords.

However, fringe-projection profilometry faces the phase unwrapping challenge. Since sinusoidal signals are periodical, the extracted phase $\varphi _w(p)$ is always in the range of $(-\pi , \pi ]$, which is a wrapped version of the absolute phase $\varphi (p$). There exists the relationship of

(1)$${\varphi(p)} = {\varphi _w(p)} + m(p)2\pi$$

where $m(p) \in \mathbb {N}$ is the fringe order of pixel $p$. The wrapped phase $\varphi _w(p)$ must be unwrapped to recover the absolute phase $\varphi (p)$ before depth can be calculated.

To solve the problem, many temporal and spatial phase unwrapping algorithms have been proposed [6], which will be introduced in detail in Section 2. In general, temporal methods are based on phase-frequency constraint. They project several sets of varying-frequency patterns, producing multiple wrapped phase maps. After that, the absolute phase can be directly derived from the phase-frequency constraint. Spatial unwrapping methods are based on the phase smoothness assumption. They need only one phase map and phases are unwrapped from pixel to pixel by following a specified path. A challenge in this procedure is that any unwrapping error will propagate and accumulate to all successive pixels. To improve the robustness, quality-guided methods process high-quality pixels first and error-prone ones last, and thus alleviate the effect of error accumulation. However, it needs very heavy computation to determine the unwrapping path. Therefore, a fast and robust phase unwrapping method is very much desired.

In this paper, we propose a high-efficient and robust method called multi-anchor scanline unwrapping (MASU). We find that conventional spatial unwrapping only refers to the nearest pixel, and thus is vulnerable when the referred anchor is error-prone. Instead, we propose to use multiple anchors, and each will predict a fringe order candidate for the current pixel. After that, voting among the candidates produces a robust winner fringe order, which is further used in unwrapping. The advantages of the proposed method are as follows. (1) It greatly improves phase unwrapping performance against errors. (2) It is conducted on scanlines and avoids cumbersome quality guidance, which makes the method very efficient. Simulation and experimental results show that the proposed MASU produces accurate depth values with error-prone phase data, which greatly outperforms basic scanline unwrapping and is comparative with quality guided methods. In addition, MASU is thousands of times faster than quality-guided methods on the condition of similar accuracy.

2. Related work

2.1 Fringe-projection profilometry

In fringe-projection profilometry, every pixel has a phase value embedded in the pattern. The phase can be extracted with phase-shifting (PSP) [7,8], Fourier transform analysis (FTA) [9,10] or wavelet transform analysis (WTA) [11,12]. Phase shifting projects a set of $N$ sinusoidal patterns

(2)$$I_i(p) = A+Bcos\left(\varphi\left(p\right)+i\frac{2\pi}{N}\right), \qquad i=1,2,\cdots,N$$

where $\varphi (p) = \omega x_p$ is the phase modulated by the position $x_p$. Correspondingly, with the $N$ captured patterns, $\varphi (p)$ can be recovered as

(3)$$\varphi = \arctan{\frac{\Sigma_{i=0}^{N}I_isin\left(i\frac{2\pi}{N}\right)}{\Sigma_{i=0}^{N}I_icos\left(i\frac{2\pi}{N}\right)}}$$

As shown in Fig. 1, the phase extracted from the reference patterns and the captured patterns are different, and the difference is converted to depth as

(4)$$Z = \frac{b{f_L}{Z_{0}}}{{f_L}b + \frac{ \Delta \varphi}{2\pi {f}}{Z_{0}}}$$

where $\Delta \varphi = \varphi ^{cap} - \varphi ^{ref}$ is the deformed phase caused by the measured objects [13].

FTA and WTA work in a similar framework but different phase extracting techniques. FTA processes the fringe waveform with Fourier transform, band-pass filtering and inverse Fourier transform, and thus produces a complex signal. After that, the angle between the real and the imaginary parts of this complex signal is the phase [10]. WTA has a similar procedure with FTA, but wavelet transform is applied instead of Fourier transform [11]. FTA and WTA require only a single pattern, while compared with phase shifting, they are more complex in computation and the phase accuracy is lower.

2.2 Phase unwrapping

An important step in fringe-projection profilometry is phase unwrapping. No matter using PSP, FTA or WTA, the extracted phase is wrapped to $\left (-\pi , \pi \right ]$. As mentioned in Eq. (1), $\varphi (p)$ must be recovered from $\varphi _w(p)$. Unfortunately, it is an ill-posed problem since $m(p)$ is unknown and can only be inferred based on prior knowledge and assumptions. Currently, unwrapping methods can be classified into temporal, spatial and learning approaches.

Temporal unwrapping is based on number theory. It measures the scene several times, and the patterns are modulated with different frequencies, resulting in a set of wrapped phases. For each individual pixel, the ratio between the unwrapped absolute phase and the modulation frequency should be constant. With this phase-frequency constraint, the fringe order can be inferred and the absolute phase can be recovered [6,14]. A drawback of temporal unwrapping is the requirement of measuring the scene multiple times, and efforts have been made for acceleration. Ding [15] proposed to build a look-up-table that directly reports the fringe order $m$ based on two-frequency wrapped phases. Liu [16] and Su [17] combined a high-frequency sinusoidal component with a unit-frequency fringe. Xiang [13] used a spatially-multiplexed pattern where the frequency varies spatially and phase unwrapping is achieved based on depth smooth assumption and phase-frequency constraint.

Spatial unwrapping is based on the assumption that phase signal is smoothly varied in most cases. Therefore, a large phase transition, such as a drop from $\pi$ to -$\pi$, indicates the beginning of a new fringe. For pixel $p$ and its neighbor $q$, the fringe order $m(p)$ can be inferred as

(5)$$m(p)= \begin{cases} m(q)+1 & \textrm{ if } \Delta\varphi_w(p, q)\;<\;{-}Th \\ m(q)-1 & \textrm{ if } \Delta\varphi_w(p, q)\;>\;Th \\ m(q) & \textrm{ otherwise } \end{cases}$$

where $\Delta \varphi _w(p, q) = \varphi _w(p) - \varphi _w(q)$. $Th$ is the threshold and usually equals to $\pi$. Spatial unwrapping must be conducted pixel-by-pixel since $m(p)$ depends on $\Delta \varphi (p, q)$ and $m(q)$. In the simplest case, the unwrapping can be done along the scanelines [18].

Nevertheless, this pixel-by-pixel manner has the problem of unwrapping error propagation. In Eq. (5), if the coefficient of $p$ is not correctly inferred, the error will propagate to successive pixels. To alleviate this effect, ‘quality-guided methods’ [19–21] are proposed. To be specific, each pixel has a phase quality metric, such as partial derivation variance [22], maximum phase gradient [23], or the amplitude of the pattern [24]. After that, the pixel with the best quality is unwrapped. This process iterates with quality sorting algorithms such as flooding-fill [25] or region-growing [26,27] before the whole phase map is unwrapped. Compared with scanline unwrapping, quality guidance can improve the phase quality at the cost of great computation load. Researcher also tried to accelerate the unwrapping with quality guidance. For example, Zhang [28] used multiple quality thresholds to divide the wrapped phase into several levels, and the phase is unwrapped from high-quality levels to low-quality levels. Liu [29] segmented the phase map into regions, followed by regional phase unwrapping, region-to-region unwrapping, unwrapping phase merging.

Besides, many methods utilize auxiliary information to assist phase unwrapping. Jiang [30] separately processed different regions, and each was unwrapped by referring to multiple reference planes. Budianto [31] and Cong [32] embedded special markers in the patterns to help to identify the fringe order. Hyun [33] used an additional two patterns to carry the fringe order. Jiang [34] introduced an auxiliary camera and thus the two cameras provide a rough depth estimation. An [35] and Li [36] introduced an artificial absolute phase map at a given virtual plane with known depth in the scene. Dai [37] utilized the absolute phase of a known object to help to unwrap the phase of a new measurement.

Recently, with the booming development of deep learning, several attempts are made to develop unwrapping methods using convolution neural network (CNN). Sawaf [38] predicted phase discontinuities and Dardikman [39] tried to directly report the unwrapped phase. Nevertheless, these methods are still being explored. In addition, to train a deep neural network, great efforts must be made in sample collecting, labeling and training.

3. The proposed scheme

3.1 Motivation

As mentioned before, spatial unwrapping is highly dependent on the phase difference between neighboring pixels. Take Fig. 2(a) as an example, in interval $AB$, the phase has a sharp drop from $0.9\pi$ to $-0.9\pi$ between pixels $p$ and $q$. According to Eq. (5), the fringe order $m$ increases at $p$, and the curve of $m$ is shown in Fig. 2(d). Unfortunately, the captured patterns are often corrupted by noise, defocus, occlusion etc., and these factors introduce errors to the wrapped phase and may lead to incorrect phase unwrapping results. For example, distorting $\varphi _w(q)$ in Fig. 2(a) leads to the phase curve in Fig. 2(b). Here, the phase jumps from $q_2$ to $p$ via $q_1$, and the phase difference also changes: $\Delta \varphi _w(q_1,\;q_2)=-0.9\pi$ and $\Delta \varphi _w(p,\;q_1)=-0.9\pi$, According to Eq. (5), the phase difference does not exceed the threshold, so the pixels are considered to be in the same fringe, e.g. $m(q_2)= m(q_1) = m(p)$. Correspondingly, the $m$ curve is shown in Fig. 2(e), which is obviously incorrect in the interval $pB$.

Fig. 2. Principle of the multi-anchor scanline unwrapping. (a)(d) Spatial unwrapping without phase errors. (b)(e) Spatial unwrapping with phase errors. (c)(f) The proposed MASU with phase errors. The upper row represents $\varphi _w$ and the lower row illustrates $m$.

Download Full Size | PDF

We notice that an important drawback in conventional spatial unwrapping, as shown in Fig. 2(b), is that every pixel refers to only one neighbor as the anchor. To be specific, $q_1$ only refers to $q_2$, and $p$ is unwrapped only based on $q_1$. This makes the unwrapping quite weak in robustness such as the case of $q_1$ in Fig. 2(b).

3.2 Multi-anchor scanline unwrapping

To solve this problem, we propose a scheme called multi-anchor scanline unwrapping (MASU). As shown in Fig. 2(c), among the pixels before $p$, $q_1$ is a bad anchor with severe error, but more pixels $q_2$, $q_3$, $\cdots$, $q_5$ are reliable. By referring to these pixels, correct fringe order $m$ can still be obtained as shown in Fig. 2(f).

Motivated by this idea, MASU chooses $n$ non-uniformly located pixels $q_1, q_2, \ldots , q_n$ as anchors before the current pixel $p$ like in Fig. 2(c). The distance between $q_i$ and $p$ is

(6)$$d_i = \begin{cases} 1 & \textrm{ if } i = 1 \\ \frac{T}{2} \frac{1}{2^{(n+1-i)}} & \textrm{ if } 1\;<\;i\leq n \end{cases}$$

where $T$ is the period of the reference fringe. Note that $d_i$ is smaller than $\frac {T}{4}$, indicating the anchors will always lie within the current fringe or the one before it.

Based on the assumption of phase smoothness, every anchor $q_i$ will predict a fringe order $m_i$ for $p$ according to their phase difference $\Delta \varphi _w(p, q_i)$

(7)$$m_i(p)= \begin{cases} m(q_i)+1 & \textrm{ if } \Delta\varphi_w(p, q_i)\;<\;{-}Th_i \\ m(q_i)-1 & \textrm{ if } \Delta\varphi_w(p, q_i)\;>\;Th_i \\ m(q_i) & \textrm{otherwise } \end{cases}$$

where $\Delta \varphi _w(p, q_i) = \varphi _w(p) - \varphi _w(q_i)$. Different from Eq. (5) which has only one anchor and one threshold, MASU sets adaptive thresholds for the anchors. As illustrated in Fig. 2(c), the absolute phase difference between the anchor $q_i$ and pixel $p$ decreases as the distance $d_i$ increases. Therefore, the adaptive threshold is given as

(8)$$Th_i = \pi (1 - \frac{2d_i}{T})$$

We would like to discuss the thresholds of the nearest and the farthest anchors. For the nearest anchor, $d_1$ equals to 1, and $Th_1 \approx \pi$ because $T$ is much larger than $d_1$. As to the farthest anchor, $d_n=T/4$, and $Th_n=\frac {\pi }{2}$. In fact, Eq. (7) is a generalized form of spatial unwrapping that is applicable to any pair of pixels, while Eq. (5) is a specified version that only uses the anchor $q_1$. In such a manner, every anchor will report a candidate fringe order $m$ for pixel $p$. This process can be regarded as voting, and the candidate $m$ with the highest number of votes will be the result. In the implementation, we use an odd number of anchors, and thus the voting can always report a winner. In addition, please note that Eq. (7) can be conducted on any unwrapping path, and we follow the scanlines for simplicity and speed.

3.3 Modification on invalid pixels

In practice, there always exist invalid regions in structured light depth sensing. For one thing, some regions are not sufficiently modulated such as the occluded background. For another, some materials have high reflective factors, i.e. mirror and metal. In both cases, the pixel intensity cannot be used to produce the phase or depth. We propose to detect the two kinds of regions, and the proposed MASU is modified for these invalid pixels. As shown in Fig. 3(a), with the captured patterns, we get a maximum-intensity map $I_{max}$ and a minimal-intensity map $I_{min}$ with

(9)$$\left\{\begin{matrix} I_{max} = max(I_1, I_2, \ldots, I_N)\\ I_{min} = min(I_1, I_2, \ldots, I_N) \end{matrix}\right.$$

In phase-shifting, the fringes shift spatially in different patterns. Therefore, all pixels in $I_{max}$ are bright except the low modulation regions, and similarly only the reflective regions are bright in $I_{min}$. In other words, these invalid regions are quite different from the parts of $I_{max}$ or $I_{min}$, and thus they can be easily extracted. Specifically, for any pixel $p$, if $I_{max}(p)\;<\;T_{lm}$, it is treated as a low modulation pixel, and if $I_{min}(p)\;>\;T_{rr}$, $p$ is considered to be in a reflective region. In implementation, $T_{lm} = 0.3mean(I_{max})$ and $T_{rr} = 3mean(I_{min})$, where $mean$ reports the mean value, 0.3 and 3 are empiric factors.

Fig. 3. Sketch of invalid region processing. (a) Flowchart of detecting the low modulation region and the reflective region. (b)(c) Modification of MASU in the low illumination region and the reflective region, respectively.

Download Full Size | PDF

The proposed MASU is modified for the two types of invalid pixels. As shown in Fig. 3(b), low-modulation pixels have no patterns covered, and thus the phases of the two endpoints, i.e. $p$ and $q_1$, should be consistent. Therefore, these low modulation pixels are skipped when choosing anchors. On the other hand, as shown in Fig. 3(c), reflective regions are covered by patterns but they report incorrect phase values. We set anchors in these regions, but do not take them into account in the voting.

3.4 Error immunity analysis of MASU

In addition to presenting the method as shown above, we would like to discuss the reason for the error immunity of MASU. Without loss of generality, the case of three anchors is first considered. Assume that the $i^{th}$ anchor has the probability of $\xi _i$ to successfully predict $m(p)$, and we will analyze the overall probability of correctly predicting $m(p)$ when using a single anchor and three anchors, respectively.

If only a single anchor is available, phase unwrapping totally depends on it, and the overall probability

(10)$$P_{1A} = \xi_1$$

If three anchors are used, $m(p)$ will be correct when more than two give correct votes, so the overall probability is

(11)$$\begin{aligned} P_{3A} & = \xi_1\xi_2\xi_3+\xi_1\xi_2(1-\xi_3)+\xi_1(1-\xi_2)\xi_3+(1-\xi_1)\xi_2\xi_3 \\ & = \xi_1\xi_2 + \xi_1\xi_3 + \xi_2\xi_3 - 2 \xi_1\xi_2\xi_3\\ & = \xi_1(\xi_2+\xi_3-2\xi_2\xi_3)+\xi_2\xi_3\\ & = k_{3A}\xi_1+ b_{3A} \end{aligned}$$

where the slope $k_{3A} = (\xi _2+\xi _3-2\xi _2\xi _3)$ and the offset $b_{3A} = \xi _2\xi _3$ are between 0 and 1.

Figure 4 illustrates the probability curves of $P_{1A}$ and $P_{3A}$. When the nearest anchor has a corrupted phase, $\xi _1$ is small. In this case, $P_{1A}$ is quite low, indicating the unwrapping is very likely to be incorrect. In contrast, if three anchors are utilized, the probability increases from $P_{1A}$ to $P_{3A}$, and it is more likely to achieve correct phase unwrapping. From another aspect of view, the probability gain can be explained. In the one-anchor case, if the referred pixel is corrupted with severe error, the unwrapping can hardly be successful. While with three anchors, even if one of them fails, the unwrapping can be correctly conducted with the other two anchors.

Fig. 4. Probability of correct phase unwrapping with one anchor and three anchors.

Download Full Size | PDF

The analysis of $P_{1A}$ and $P_{3A}$ can be extended to more general cases. Whenever a new anchor is introduced, it provides information to correct the errors of existing anchors. Therefore, incorporating more referred pixels can improve the error-immunity performance, especially when the phase values are not accurate.

Meanwhile, Fig. 4 also shows that when $\xi _1$ is large, $P_{3A}$ is smaller than $P_{1A}$. This reason is that $P_{1A}$ only requires that the first anchor is correct, while $P_{3A}$ demands at least two accurate predictions.

4. Simulation and experiments

We verify the proposed method with extensive simulation and real experiments. The original wrapped phase is obtained with three step-phase shifting, e.g. $N$=3 in Eqs. (2) - (3). In addition to the proposed MASU, we also re-perform three spatial unwrapping methods. (1) CS: classic scanline-based unwrapping, (2) QG-PDV: quality-guided unwrapping where partial derivation variance is the quality index, and (3) QG-MPG: quality-guided unwrapping where maximum phase gradient is the quality index. Besides, one temporal phase unwrapping (TF-TPU) [15] is also conducted for comparison. It uses phase shifting patterns modulated with a low frequency and a high frequency and the ratio is 8/15. In total, it needs six patterns to produce one depth map.

4.1 Simulation results

The proposed method is first verified on simulated data generated with a structured light system with 3dsmax. Basic setup of the system is as the follows: projector resolution = 800$\times$1280, camera resolution = 960$\times$1280, baseline $b$ = 80mm, focal length $f_L$=35.572mm, depth of the reference plane $Z_{0}$ = 800mm. The camera covers a 495mm$\times$660mm rectangle in the reference plane. 3D models ‘dragon’ and ‘Buddha’ are used for the test.

To verify the performance against noise, uniformly distributed noise with zero-mean is added to the captured patterns. In Figs. 5 and 7, the patterns and the corresponding wrapped phases are presented. The amplitudes of the noise are $\pm$10, $\pm$20, $\pm$30 and $\pm$40 from left to right. It can be clearly observed that as noise becomes stronger, more black and white points appear in the wrapped phase maps, which are phase errors. These errors act as false-positive phase transition predictions, and thus yields to incorrect unwrapping phase and depth values.

Fig. 5. Noisy patterns (the upper row) and the corresponding wrapped phase maps (the lower row) of dragon. From left to right, the amplitudes of the noise are $\pm$10, $\pm$20, $\pm$30 and $\pm$40. Correspondingly, the PSNR values of the patterns are 32.90dB, 26.88dB, 23.36dB, 20.86dB, respectively.

Download Full Size | PDF

In Figs. 6 and 8, the resultant depth maps of CS, QG-PDV, QG-MPG, TF-TPU and the proposed MASU are shown from top to bottom. In each row, the results are obtained with the phase maps under different levels of noise as shown in Figs. 5 and 7. It can be noticed that CS performs poorly even with low-level noise, where phase errors propagate along the scanlines. As the noise becomes stronger, the depth quality degrades severely. Quality guided methods, include QG-PDV and QG-MPG, are robust against low-level noise when the amplitude is $\pm$10 and $\pm$20, while QG-PDV still suffers from error-propagation when the noise reaches $\pm$30 and $\pm$40. The reason lies in the fact that even using phase quality as the guidance, QG-PDV still refers to only one anchor. If the anchor is incorrect, errors will propagate along the unwrapping path. As to TF-TPU, phase unwrapping is performed independently for every pixel, and thus it is free from error propagation. Nevertheless, based on number theory, the reported fringe order is very sensitive to the changes in the wrapped phase. That is to say, a minor phase error can yield an incorrect fringe order and finally incorrect depth. As a result, many pixels will have incorrect depth values, and they appear in isolated speckles in the maps. Finally, by referring to multiple anchors, MASU is very robust even with strong noise. Note that MASU is conducted simply along scanlines, and it produces comparative results of QG-MPG and even outperforms QG-PDV. We further present an unwrapping example of SC and MASU in Fig. 9, where the details of row 847 of Buddha with a noise level of 20 are shown. It can be observed that there exists a background-foreground transition point $A$ marked with red circles in Figs. 9(a) and 9(b). Figure 9(c) presents the wrapped phase where $A$ is in the same period with the pixels before it. However, due to phase errors, the phase difference $\Delta \varphi _w(A,A-1)$ equals to 3.24 and exceeds the threshold $\pi$. According to Eq. (5), CS makes an incorrect prediction that $m(A)= m(A-1) - 1$ and Fig. 9(d) shows the curve of $m_{CS}$. In contrast, the proposed MASU refers to five anchors, among which three predict $m(A) = m(A-1)$, and after voting the correct fringe order is still obtained in Fig. 9(d).

Fig. 6. Resultant depth maps of dragon generated with Fig. 5. From top to bottom, the phase values are unwrapped with CS, QG-PDV, QG-MPG, TF-TPU and the proposed MASU. From left to right are the results generated with the four noisy phase maps in Fig. 5, respectively.

Download Full Size | PDF

Fig. 7. Noisy patterns (the upper row) and corresponding wrapped phase maps (the lower row) of Buddha. From left to right, the amplitudes of the noise are $\pm$10, $\pm$20, $\pm$30 and $\pm$40. Correspondingly, the PSNR values of the patterns are 32.90dB, 26.88dB, 23.36dB, 20.86dB, respectively.

Download Full Size | PDF

Fig. 8. Resultant depth maps of Buddha generated with Fig. 7. From top to bottom, the phase values are unwrapped with CS, QG-PDV, QG-MPG, TF-TPU and the proposed MASU. From left to right are the results generated with the four noisy phase maps in Fig. 7, respectively.

Download Full Size | PDF

Fig. 9. An example of unwrapping detail of CS and the proposed MASU. (a) depth map. (b) Intensity. (c) Wrapped phase. (d) Coefficient $m$ with CS and MASU . (e) Unwrapped phase $\varphi$ with CS and MASU.

Download Full Size | PDF

In addition, the relative mean absolute difference (MAD) values about the ground truth depth are presented in Table 1, where $A_n$ indicates the amplitude of the noise. The ground truth is acquired with three-frequency phase shifting/unwrapping and the patterns are noise-free. The table indicates that under a weak and medium level of noise, the depth errors of MASU are less than 1%. When corrupted with very strong noise ($A_n$=$\pm$40, PSNR=20.86dB), the error is 1.2278$\%$. Compared with other methods, MASU outperforms SC and TF-TPU greatly, and it is comparative with complex algorithms like QG-MPG. Overall, these relative MAD metrics prove that MASU produces accurate depth on simulated data.

Table 1. Relative MAD (%) of the depth maps

View Table | View all tables in this article

4.2 Experimental results

We also test the proposed MASU on a real captured structured light system. The resolutions of the projector and the camera are 1280$\times$800 and 2592$\times$1800, respectively. Other parameters are as the following: baseline $b$=150mm, focal length $f_L$=16mm, and depth of the reference plane $Z_{0}$=1950mm. The projector and the camera are vertically settled for the convenience of the system setup, and the patterns are correspondingly vertically modulated. To eliminate the projector’s non-linearity effect, we project 256 patterns covering the intensity 0 to 255, and record the images with the camera. With these patterns, a response curve can be obtained as shown in Fig. 10. According to this curve, the intensities of the projected patterns are limited to the middle range and avoids the two saturation intervals. In addition, patterns are modified based on this curve to make the captured intensities linearly related to the projected ones. Two scenes, ‘boy’ and ‘cones’ are used for the test. Since the captured data is impaired by noise, defocus and other practical factors, no additional noise is introduced. The results are shown in Figs. 11 and 12, respectively. Furthermore, we detect the invalid regions with the captured patterns, and these regions are in pure black indicating no valid depth is available.

Fig. 10. Response curve between the projected intensity and the recorded intensity.

Download Full Size | PDF

Fig. 11. Results of boy. (a) The second captured pattern. (b) Map of the wrapped phase. (c)–(g) Results of CS, QG-PDV, QG-MPG, TF-TPU and the proposed MASU, respectively.

Download Full Size | PDF

Fig. 12. Results of cones. (a) The second captured pattern. (b) Map of the wrapped phase. (c)–(g) Results of CS, QG-PDV, QG-MPG, , TF-TPU and the proposed MASU, respectively.

Download Full Size | PDF

In Figs. 11 and 12, the capturing is performed in a dark room to avoid ambient light, so the patterns and wrapped phases are less noisy than the simulated data. Even in this condition, CS still has propagated errors, e.g. the black/white stripes. Quality guided methods, both QG-PDV and QG-MPG, can report correct results. With less noise, the performance of TF-TPU also improves greatly compared to that on the simulated data. Nevertheless, there still an incorrect region at the bottom of the neck which is filled with black pixels. Finally, the proposed MASU produces correct geometry of the scenes.

We also computed MAD and relative MAD about ground truth to evaluate the accuracy. The ground truth is obtained with three-frequency temporal phase unwrapping, and the frequency ratio is 1:3:9. The results are shown in Table 2. The scene ‘boy’ is a plaster statue with Lambertian surface, so MASU achieves high accuracy with relative MAD being about 0.1%. In contrast, ‘cones’ is more challenging with a high reflective metal surface, black texture and larger occluded region, and the accuracy is lower with the relative error being about 1%. Compared with other methods, MASU outperforms CS and TF-TPU, and is comparative with quality guided methods.

Table 2. MAD (mm) and relative MAD (%) of boy and cones

View Table | View all tables in this article

4.3 Complexity analysis

One significant advantage of the proposed scheme is its high efficiency. In Table 3, we present the average time consumption in unwrapping a single pixel. All experiments are conducted on a desktop with an Intel I7-8700 CPU and 32GB RAM, and the algorithms are coded with Matlab. Since the tested quality-guided methods cannot be parallelized, we unwrap all pixels in serial for a fair comparison.

Table 3. Average time consumed (seconds) in unwrapping a single pixel

View Table | View all tables in this article

CS is very fast, but the severe error propagation makes it inapplicable in practical use. TF-TPU is also in high efficiency, but it requires many more patterns in data acquisition. Quality-guided methods, although with robust performance, can only unwrap about 50 pixels per-second, which is too slow to be applied. Finally, the proposed MASU can still achieve high efficiency that processes 24000 pixels per-second, on the condition of robustly recovering the absolute phase.

Furthermore, in MASU, multiple rows can be processed simultaneously by using parallel computation techniques, which can even accelerate the unwrapping.

We would like to further theoretically analyze the complexity of the unwrapping methods. Without loss of generality, assume the image consists of $n$ pixels, and the unwrapping complexities of the unwrapping methods are as follows.

• CS and TF-TPU scan every pixel once, and the fringe order can be determined. So the complexity is $O(n)$ for both methods.
• Quality guided methods, including QG-PDV and QG-MPG, has two steps when unwrapping a single pixel in an iteration. (1) The first step is to find the point with the best quality. This pixel must locate at the boundary between the pixels-unwrapped and pixels-to-be-unwrapped. In the $k^{th}$ iteration, there are $k$ unwrapped pixels, and the number of boundary pixels $b_k\in [2\sqrt {\pi k}, 2k]$. The minimal can be reached when the unwrapped region is circular, while the maximum corresponds to a line of unwrapped pixels. These boundary pixels have to be scanned to find the most reliable one. (2) Once found the pixel with the highest quality, unwrapping is conducted by following Eq. (5). To unwrap the whole phase map, these two steps are conducted in iterations. The overall complexity of step (1) is $\Sigma _{k=0}^{n}{b_k}$, which is between $O(n^{\frac {3}{2}})$ and $O(n^{2})$ [40], and the complexity for step (2) is $O(n)$. Therefore, in summary, the overall complexity of quality-guided methods lies between $O(n^{\frac {3}{2}})$ and $O(n^{2})$.
• The proposed MASU is carried out along scanlines. For each pixel, Eq. (7) is done between the current pixel and every anchor. Since only several anchors are used, the overall complexity is still $O(n)$.

The complexities imply that, as more pixels are incorporated, the time consumption increases linearly for scanline unwrapping, TF-TPU and the proposed MASU, but booms in polynomial for quality-guided methods. In Table 3, boy and cones have more pixels than Buddha and dragon. The time varies little for CS, TF-TPU and MASU, but increases greatly for QG-PDV and QG-MPG. This indicates that, for a high resolution dense depth map, QG methods may face a disaster of low efficiency, while the proposed MASU can still work quickly.

5. Conclusion

Conventional spatial unwrapping methods refer to only a single pixel as the anchor, and thus is not robust with error-prone phases. In this paper, we proposed a robust and fast spatial phase unwrapping method. Instead of referring to a single pixel, the proposed multi-anchor scanline unwrapping (MASU) utilizes multiple anchor reference points. Based on the smooth assumption, these anchor points predict fringe order candidates, and the one with the highest number of votes is selected for further phase unwrapping and depth computation. This novel method introduced the idea of statistic to structured light profilometry, and thus can well handle the phase errors and improve the robustness. Extensive simulation and experiments have proved that the proposed MASU can generate precise depth maps even with severe phase errors. In addition, without the cumbersome phase guidance, MASU performs with quite a high efficiency, which is thousands of times faster than quality-guided methods in the condition of similar quality. In the future, we will continue on this work, especially on how to choose reliable anchors.

Funding

National Natural Science Foundation of China (61702384); Natural Science Foundation of Hubei Province (2017CFB348); Hubei Provincial Department of Education (Q20171106); Wuhan University of Science and Technology (2017xz008).

Disclosures

The authors declare no conflicts of interest.

References

1. Y. Yang, Q. Liu, X. He, and Z. Liu, “Cross-view multi-lateral filter for compressed multi-view depth video,” IEEE Trans. on Image Process. 28(1), 302–315 (2019). [CrossRef]

2. Y. Yang, B. Li, P. Li, and Q. Liu, “A two-stage clustering based 3d visual saliency model for dynamic scenarios,” IEEE Trans. Multimedia 21(4), 809–820 (2019). [CrossRef]

3. S. Zhang, “High-speed 3d shape measurement with structured light methods: A review,” Opt. Lasers Eng. 106, 119–131 (2018). [CrossRef]

4. B. Freedman, A. Shpunt, M. Machline, and Y. Arieli, “Depth mapping using projected patterns,” (2012). US Patent 8,150,142.

5. J. Salvi, J. Pages, and J. Batlle, “Pattern codification strategies in structured light systems,” Pattern Recognit. 37(4), 827–849 (2004). [CrossRef]

6. S. Zhang, “Absolute phase retrieval methods for digital fringe projection profilometry: A review,” Opt. Lasers Eng. 107, 28–37 (2018). [CrossRef]

7. P. S. Huang and S. Zhang, “Fast three-step phase-shifting algorithm,” Appl. Opt. 45(21), 5086–5091 (2006). [CrossRef]

8. B. Pan, Q. Kemao, L. Huang, and A. Asundi, “Phase error analysis and compensation for nonsinusoidal waveforms in phase-shifting digital fringe projection profilometry,” Opt. Lett. 34(4), 416–418 (2009). [CrossRef]

9. X. Su and W. Chen, “Fourier transform profilometry:: a review,” Opt. Lasers Eng. 35(5), 263–284 (2001). [CrossRef]

10. M. Takeda and K. Mutoh, “Fourier transform profilometry for the automatic measurement of 3-d object shapes,” Appl. Opt. 22(24), 3977–3982 (1983). [CrossRef]

11. L. Huang, Q. Kemao, B. Pan, and A. Asundi, “Comparison of fourier transform, windowed fourier transform, and wavelet transform methods for phase extraction from a single fringe pattern in fringe projection profilometry,” Opt. Lasers Eng. 48(2), 141–148 (2010). [CrossRef]

12. Z. Zhang and J. Zhong, “Applicability analysis of wavelet-transform profilometry,” Opt. Express 21(16), 18777–18796 (2013). [CrossRef]

13. S. Xiang, H. Deng, L. Yu, J. Wu, Y. Yang, Q. Liu, and Z. Yuan, “Hybrid profilometry using a single monochromatic multi-frequency pattern,” Opt. Express 25(22), 27195–27209 (2017). [CrossRef]

14. C. Zuo, L. Huang, M. Zhang, Q. Chen, and A. Asundi, “Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review,” Opt. Lasers Eng. 85, 84–103 (2016). [CrossRef]

15. Y. Ding, J. Xi, Y. Yu, W. Cheng, S. Wang, and J. F. Chicharo, “Frequency selection in absolute phase maps recovery with two frequency projection fringes,” Opt. Express 20(12), 13238–13251 (2012). [CrossRef]

16. K. Liu, Y. Wang, D. L. Lau, Q. Hao, and L. G. Hassebrook, “Dual-frequency pattern scheme for high-speed 3-d shape measurement,” Opt. Express 18(5), 5229–5244 (2010). [CrossRef]

17. W.-H. Su and H. Liu, “Calibration-based two-frequency projected fringe profilometry: a robust, accurate, and single-shot measurement for objects with large depth discontinuities,” Opt. Express 14(20), 9178–9187 (2006). [CrossRef]

18. Y. Xu and C. Ai, “Simple and effective phase unwrapping technique,” in Interferometry VI: Techniques and Analysis, vol. 2003 (International Society for Optics and Photonics, 1993), pp. 254–263.

19. Z. Dai and X. Zha, “An accurate phase unwrapping algorithm based on reliability sorting and residue mask,” IEEE Geosci. Remote Sens. Lett. 9(2), 219–223 (2012). [CrossRef]

20. M. Zhao, L. Huang, Q. Zhang, X. Su, A. Asundi, and Q. Kemao, “Quality-guided phase unwrapping technique: comparison of quality maps and guiding strategies,” Appl. Opt. 50(33), 6214–6224 (2011). [CrossRef]

21. H. Zhong, J. Tang, and S. Zhang, “Phase quality map based on local multi-unwrapped results for two-dimensional phase unwrapping,” Appl. Opt. 54(4), 739–745 (2015). [CrossRef]

22. Z. Dai and X. Zha, “An accurate phase unwrapping algorithm based on reliability sorting and residue mask,” IEEE Geosci. Remote Sens. Lett. 9(2), 219–223 (2012). [CrossRef]

23. H. Zhong, J. Tang, S. Zhang, and M. Chen, “An improved quality-guided phase-unwrapping algorithm based on priority queue,” IEEE Geosci. Remote Sens. Lett. 8(2), 364–368 (2011). [CrossRef]

24. X. Su, W. Chen, Q. Zhang, and Y. Chao, “Dynamic 3-d shape measurement method based on ftp,” Opt. Lasers Eng. 36(1), 49–64 (2001). [CrossRef]

25. A. Asundi and Z. Wensen, “Fast phase-unwrapping algorithm based on a gray-scale mask and flood fill,” Appl. Opt. 37(23), 5416–5420 (1998). [CrossRef]

26. C. Ojha, M. Manunta, A. Pepe, L. Paglia, and R. Lanari, “An innovative region growing algorithm based on minimum cost flow approach for phase unwrapping of full-resolution differential interferograms,” in IEEE Intl. Geos. Remo. Sens. Symp., (IEEE, 2012), pp. 5582–5585.

27. M. A. Herráez, J. G. Boticario, M. J. Lalor, and D. R. Burton, “Agglomerative clustering-based approach for two-dimensional phase unwrapping,” Appl. Opt. 44(7), 1129–1140 (2005). [CrossRef]

28. Z. Song, L. Xiaolin, and Y. Shing-Tung, “Multilevel quality-guided phase unwrapping algorithm for real-time three-dimensional shape reconstruction,” Appl. Opt. 46(1), 50–57 (2007). [CrossRef]

29. S. Liu and L. X. Yang, “Regional phase unwrapping method based on fringe estimation and phase map segmentation,” Opt. Eng. 46(5), 051012 (2007). [CrossRef]

30. C. Jiang, B. Li, and Z. Song, “Pixel-by-pixel absolute phase retrieval using three phase-shifted fringe patterns without markers,” Opt. Lasers Eng. 91, 232–241 (2017). [CrossRef]

31. B. Budianto, P. Lun, and T.-C. Hsung, “Marker encoded fringe projection profilometry for efficient 3d model acquisition,” Appl. Opt. 53(31), 7442–7453 (2014). [CrossRef]

32. P. Cong, Z. Xiong, Y. Zhang, S. Zhao, and F. Wu, “Accurate dynamic 3d sensing with fourier-assisted phase shifting,” IEEE J. Sel. Top. Signal Process. 9(3), 396–408 (2015). [CrossRef]

33. J. S. Hyun and S. Zhang, “Superfast 3d absolute shape measurement using five binary patterns,” Opt. Lasers Eng. 90, 217–224 (2017). [CrossRef]

34. C. Jiang and S. Zhang, “Absolute phase unwrapping for dual-camera system without embedding statistical features,” Proc. SPIE 10220, 1022009(2017). [CrossRef]

35. Y. An, J. S. Hyun, and S. Zhang, “Pixel-wise absolute phase unwrapping using geometric constraints of structured light system,” Opt. Express 24(16), 18445–18459 (2016). [CrossRef]

36. B. Li, Z. Liu, and S. Zhang, “Motion-induced error reduction by combining fourier transform profilometry with phase-shifting profilometry,” Opt. Express 24(20), 23289 (2016). [CrossRef]

37. J. Dai, S. Zhang, and Y. An, “Absolute three-dimensional shape measurement with a known object,” Opt. Express 25(9), 10384 (2017). [CrossRef]

38. F. Sawaf and R. M. Groves, “Phase discontinuity predictions using a machine-learning trained kernel,” Appl. Opt. 53(24), 5439 (2014). [CrossRef]

39. G. Dardikman and N. T. Shaked, “Phase unwrapping using residual neural networks,” in Imaging and Applied Optics (Optical Society of America, 2018), p. CW3B.5. [CrossRef]

40. S. Shekatkar, “The sum of the r’th roots of first n natural numbers and new formula for factorial,” arXiv preprint arXiv:1204.0877 (2012).

noise level		$A_{n}$ = $\pm$ 10	$A_{n}$ = $\pm$ 20	$A_{n}$ = $\pm$ 30	$A_{n}$ = $\pm$ 40
dragon	CS	0.3051	0.9744	8.8380	19.2200
	QG-PDV	0.2943	0.5937	0.9854	107.4250
	QG-MPG	0.2942	0.5824	0.8703	1.1573
	TF-TPU	5.5856	35.997	69.2466	86.0555
	MASU(proposed)	0.2942	0.5831	0.8743	1.1735
Buddha	CS	1.6707	3.6966	18.3518	34.5849
	QG-PDV	0.6853	0.6025	223.7146	291.9052
	QG-MPG	0.6381	0.5843	0.8720	1.1702
	TF-TPU	3.6236	42.7012	75.3301	90.2420
	MASU(proposed)	0.6413	0.6110	0.9381	1.2278

		CS	QG-PDV	QG-MPG	TF-TPU	MASU(proposed)
boy	MAD relative MAD	10.3559	1.4577	1.4600	37.4814	1.8899
boy	MAD relative MAD	0.6470%	0.0827%	0.0829%	2.3081%	0.1084%
cones	MAD relative MAD	27.7705	15.5685	15.5291	177.3056	17.7786
cones	MAD relative MAD	1.6728%	0.8583%	0.8558%	11.9804%	1.0025%

	Buddha	dragon	boy	cones
# of pixels	1228800	1228800	2950400	1941000
CS	$5.13 \times 10^{- 8}$	$4.96 \times 10^{- 8}$	$4.78 \times 10^{- 8}$	$5.77 \times 10^{- 8}$
QG-PDV	$2.03 \times 10^{- 2}$	$2.03 \times 10^{- 2}$	$7.77 \times 10^{- 2}$	$7.51 \times 10^{- 2}$
QG-MPG	$2.11 \times 10^{- 2}$	$2.11 \times 10^{- 2}$	$7.86 \times 10^{- 2}$	$7.77 \times 10^{- 2}$
TF-TPU	$1.09 \times 10^{- 6}$	$1.09 \times 10^{- 6}$	$9.36 \times 10^{- 7}$	$9.43 \times 10^{- 7}$
MASU(proposed)	$4.20 \times 10^{- 5}$	$4.22 \times 10^{- 5}$	$4.61 \times 10^{- 5}$	$5.89 \times 10^{- 5}$

noise level		$A_{n}$ = $\pm$ 10	$A_{n}$ = $\pm$ 20	$A_{n}$ = $\pm$ 30	$A_{n}$ = $\pm$ 40
dragon	CS	0.3051	0.9744	8.8380	19.2200
	QG-PDV	0.2943	0.5937	0.9854	107.4250
	QG-MPG	0.2942	0.5824	0.8703	1.1573
	TF-TPU	5.5856	35.997	69.2466	86.0555
	MASU(proposed)	0.2942	0.5831	0.8743	1.1735
Buddha	CS	1.6707	3.6966	18.3518	34.5849
	QG-PDV	0.6853	0.6025	223.7146	291.9052
	QG-MPG	0.6381	0.5843	0.8720	1.1702
	TF-TPU	3.6236	42.7012	75.3301	90.2420
	MASU(proposed)	0.6413	0.6110	0.9381	1.2278

		CS	QG-PDV	QG-MPG	TF-TPU	MASU(proposed)
boy	MAD relative MAD	10.3559	1.4577	1.4600	37.4814	1.8899
boy	MAD relative MAD	0.6470%	0.0827%	0.0829%	2.3081%	0.1084%
cones	MAD relative MAD	27.7705	15.5685	15.5291	177.3056	17.7786
cones	MAD relative MAD	1.6728%	0.8583%	0.8558%	11.9804%	1.0025%

Multi-anchor spatial phase unwrapping for fringe projection profilometry

Abstract

1. Introduction

2. Related work

2.1 Fringe-projection profilometry

2.2 Phase unwrapping

3. The proposed scheme

3.1 Motivation

3.2 Multi-anchor scanline unwrapping

3.3 Modification on invalid pixels

3.4 Error immunity analysis of MASU

4. Simulation and experiments

4.1 Simulation results

4.2 Experimental results

4.3 Complexity analysis

5. Conclusion

Funding

Disclosures

References

Cited By

Figures (12)

Tables (3)

Equations (11)

Optics Express