
Single-shot 3D measurement of highly reflective objects with deep learning


Abstract

Three-dimensional (3D) measurement methods based on fringe projection profilometry (FPP) have been widely applied in industrial manufacturing. Most FPP methods adopt phase-shifting techniques and require multiple fringe images, which limits their application in dynamic scenes. Moreover, industrial parts often have highly reflective areas that lead to overexposure. In this work, a single-shot high dynamic range 3D measurement method combining FPP with deep learning is proposed. The proposed deep learning model includes two convolutional neural networks: an exposure selection network (ExSNet) and a fringe analysis network (FrANet). The ExSNet utilizes a self-attention mechanism to enhance highly reflective areas prone to overexposure, achieving high dynamic range in single-shot 3D measurement. The FrANet consists of three modules that predict wrapped phase maps and absolute phase maps. A training strategy that directly opts for the best measurement accuracy is proposed. Experiments on an FPP system showed that the proposed method predicts an accurate optimal exposure time under the single-shot condition. A pair of moving standard spheres with overexposure was measured for quantitative evaluation. The proposed method reconstructed the standard spheres over a large range of exposure levels, with diameter prediction errors of 73 µm (left) and 64 µm (right) and a center-distance prediction error of 49 µm. An ablation study and a comparison with other high dynamic range methods were also conducted.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In recent years, three-dimensional (3D) optical measurement has gained great interest in research and applications for its non-contact and low-cost nature. Growing demand for high-speed 3D measurement is witnessed in biomedicine, computer vision, and industrial manufacturing. Fringe projection profilometry (FPP) is the most popular 3D optical measurement method. By projecting a set of fringe patterns onto objects, FPP derives 3D coordinates using fringe analysis and triangulation.

The first step in FPP is the calibration of the structured light system. Calibration methods can be classified into phase-height models and triangulation-based models [1]. The key steps of fringe analysis are extracting the wrapped phase from fringe patterns and unwrapping it to obtain the continuous phase information of the target. Phase-shifting [2] and temporal phase unwrapping (TPU) [3] are commonly used phase retrieval and unwrapping methods in FPP. Zeng et al. [4] proposed a self-unwrapping phase-shifting method for 3D shape measurement. They embedded a space-varying phase shift (SPS) that uniquely determines the fringe order into sinusoidal patterns, and then extracted it to retrieve the absolute phase by pixelwise calculation. The absolute phase was retrieved without external information or priors. Wang et al. [5] presented a 3D measurement method for rigid moving objects based on phase-shifting and three-pitch heterodyne phase unwrapping (TPHPU) algorithms. This method preprocesses the fringe patterns of moving objects in 3D space instead of applying complex phase compensation, and inherits the accuracy and robustness of phase-shifting and TPHPU algorithms. Hu et al. [6] addressed microscopic 3D measurement of shiny surfaces with a multi-frequency phase-shifting scheme. They calculated the phase of highlighted areas from a subset of the phase-shifted fringe images, solving the problem that defocus of dense fringes and complex surface reflectivity degrade fringe quality. Li et al. [7] designed an improved temporal phase unwrapping method based on super-grayscale multi-frequency grating projection. The captured super-grayscale patterns yield high-resolution phase information, so that the reliable unit-frequency phase can guide the low-frequency and high-frequency fringe order calculation more precisely and improve measurement accuracy. Li et al. [8] proposed a 3D reconstruction framework based on a modified three-wavelength phase unwrapping algorithm and a phase error compensation method. The three-wavelength phase unwrapping algorithm improves the 3D frame rate, and the phase error compensation method reduces the 3D measurement error. Servin et al. [9] combined co-phased profilometry and two-step temporal phase unwrapping for 3D fringe projection profilometry. The proposed profilometer can measure highly discontinuous objects while minimizing shadows and maximizing phase sensitivity.

With the development of deep learning, convolutional neural networks have been applied in 3D measurement. Lin et al. [10] designed a multi-stage convolutional neural network for fringe pattern denoising. Their method is competitive with state-of-the-art denoising methods in the spatial or transform domain. Yao et al. [11] proposed a multi-purpose neural network to calculate the absolute phase map from a few patterns, including one sinusoidal fringe pattern and two code-based patterns. The well-trained multi-purpose network can retrieve the absolute phase map, reducing the number of patterns required by phase-shifting techniques. Yao et al. [12] introduced a super-resolution technique for dense 3D reconstruction, in which the fringe resolution is extended using a dual-dense block super-resolution network (DdBSRN).

However, the methods above require phase-shifting fringe images at multiple frequencies, which reduces measurement speed. Although they provide highly accurate measurement data for static objects by capturing multiple fringe images, their performance can degrade due to vibration and movement between image shots, which hinders their use in dynamic scenes.

Single-shot methods extract phase maps from one fringe image, so they are robust against movement and offer fast fringe acquisition and low cost. Thus, single-shot methods are desirable for dynamic 3D measurement. Methods based on spatial demodulation are commonly used in single-shot 3D measurement, including Fourier transform profilometry (FTP) [13], windowed Fourier transform profilometry (WFTP) [14], and wavelet transform profilometry (WTP) [15]. Despite their viability in dynamic scenes, spatial demodulation methods suffer from spectrum aliasing and spectrum leakage; they are therefore sensitive to noise and have low accuracy, especially when measuring objects with strong texture. Single-shot methods based on deep learning are more accurate and robust than traditional methods, making them more viable in practical applications.

Single-shot methods based on deep learning started by introducing deep networks into phase retrieval. Deep networks were first used to predict the numerator and denominator of the wrapped phase [16–18]. The ground-truth numerator and denominator are prepared using phase-shifting techniques. These methods yielded better results than traditional demodulation methods. Networks that directly output wrapped phase maps were also designed [19–22]. Variations of U-Net were adopted as the network architecture for fringe analysis, with the calculation of the numerator and denominator concealed in hidden layers. Instead of retrieving the wrapped phase from a single fringe image, some researchers proposed to transform a single fringe image into phase-shifting fringe images [23–25]. The predicted fringe images were then processed using phase-shifting techniques, which serves the same purpose as single-shot 3D measurement. In addition to phase retrieval, phase unwrapping methods based on deep learning have been developed [26–30]. These methods use deep networks to predict fringe order maps from wrapped phase maps, which either avoids the requirement of multi-frequency fringe images or improves the accuracy of phase unwrapping. End-to-end networks that directly predict height maps have also been researched [31–33]. These networks adopt a large number of simulated or real fringe images with ground-truth height maps for supervised training. FCN, AEN, and U-Net have been evaluated as end-to-end architectures, among which U-Net yielded the most accurate results. However, end-to-end networks cannot take advantage of important intermediate information.

The objects measured in industrial manufacturing often have highly reflective areas, and acquiring fringes of these objects leads to overexposure and information loss. Thus, high dynamic range (HDR) measurement is needed to handle the large range of exposure levels in fringe images.

Multi-exposure and single-exposure HDR methods have been proposed for 3D measurement and other applications. Jiang et al. [34] presented a 3D scanning technique for highly reflective surfaces based on phase-shifting fringe projection. By fusing raw fringe patterns acquired with different camera exposure times and different projector illumination intensities, a synthetic fringe image avoiding saturation and under-illumination was generated. Yonesaka et al. [35] introduced a digital holography method using HDR imaging to improve the quality of the reconstructed image. The HDR imaging process includes estimating the camera response function and synthesizing multiple holograms. Song et al. [36] developed an HDR method using binary pattern projection for 3D measurement. They proposed to calculate HDR fringe images from radiance maps estimated over multiple exposures to relieve saturation. Multi-exposure methods use different exposure times or light intensities to capture fringe images and fuse them to recover the details lost in highly reflective areas; they are time-consuming and cannot fulfil the need for real-time 3D measurement. Cogalan et al. [37] proposed a single-exposure HDR method using camera sensors that perform per-pixel exposure modulation, and developed a joint frame deinterlacing and denoising algorithm using deep neural networks. Jiang et al. [38] introduced a simple HDR method that projects inverted fringe patterns to complement regular patterns without multiple exposures. Inverted fringe patterns were used in lieu of, or combined with, regular patterns depending on the saturation of the regular patterns. Wu et al. [39] reconstructed HDR objects using motion tracking and phase-shifting profilometry, exploiting motion to change the position of saturated points on objects for single-exposure HDR measurement. Single-exposure methods mostly aim to increase the number of unsaturated fringe patterns in highly reflective areas. They either require additional or advanced devices that increase the cost of the FPP system, or lose accuracy in phase computation due to extra fringes or tracking. Most HDR methods are incapable of measuring moving objects in dynamic scenes.

Few deep-learning-based single-shot 3D optical measurement methods attempt to solve the high dynamic range problem. Zhang et al. [40] increased the dynamic range of 3D measurement by using deep learning for phase calculation, which broadens the dynamic range of three-step phase shifting by a factor of 4.8. Yang et al. [41] designed a deep network to detect low-modulation regions and enhance fringe pattern details in these regions. A standard metal gauge block with a height of 5 mm was measured, with the RMSE improving from 0.55 mm to 0.06 mm. These methods are fully algorithm-based and cannot entirely avoid information loss in overexposed areas. Liu et al. [42] developed a hand-crafted metric for exposure selection and used deep learning networks to enhance the fringe image, achieving coverage rates similar to an HDR method with ten exposures (97.6% versus 98.0%). Despite not being single-shot, their method increases the accuracy in highly reflective areas within a single exposure.

In this work, a single-shot high dynamic range 3D measurement method with deep-learning-based exposure selection and fringe analysis is proposed. The proposed method adopts two convolutional neural networks that cooperate in training and measurement: an exposure selection network (ExSNet) and a fringe analysis network (FrANet). The ExSNet utilizes a self-attention mechanism to enhance overexposed and underexposed areas and solve the high dynamic range problem in single-shot 3D measurement. To address the problem that, in traditional methods, the phase map error increases as the number of captured fringe patterns decreases, the FrANet is constructed to predict wrapped phase maps and absolute phase maps, increasing the accuracy of phase retrieval and phase unwrapping. A novel training strategy that enables the ExSNet to learn optimal exposures is introduced. During measurement, the ExSNet predicts the optimal exposure time and the exposure-adjusted fringe image is processed by the FrANet. The proposed method is evaluated on a real-world FPP system to verify the effectiveness of the deep-learning-based exposure selection and fringe analysis. Results and discussions are also presented.

2. Single-shot 3D measurement of highly reflective objects

The proposed method utilizes two deep neural networks, the ExSNet and the FrANet, to select the optimal exposure time and to conduct fringe analysis. The FrANet is first trained using fringe images captured under different exposure times. Then the exposure time corresponding to the best network performance is used as the ground truth to train the ExSNet. The process of attaching the ground-truth optimal exposure time to fringe images for training is named optimal exposure time annotation. In single-shot measurement, a fringe image with an inappropriate exposure time is fed into the ExSNet; the exposure time is adjusted based on the result, and the recaptured image is processed by the FrANet. An overview of the proposed method is shown in Fig. 1. In contrast to methods that enhance fringe images with overexposure or underexposure, the proposed method adjusts the exposure time to avoid information loss that cannot be fully recovered by such algorithm-based methods. The ground-truth optimal exposure time is selected directly from the FrANet results, which guarantees accurate 3D measurement using exposure-selected images. Details of the network architectures and training strategies for the FrANet and ExSNet are introduced in the following sections.

Fig. 1. Overview of the proposed method.

2.1 Architecture of the fringe analysis network

The fringe analysis network FrANet fulfils the tasks of phase retrieval and phase unwrapping, addressing the problem that, in traditional methods, the phase map error increases as the number of captured fringe patterns decreases. The architecture of the network is depicted in Fig. 2. The network consists of three improved U-Net modules for phase retrieval, phase unwrapping, and refinement. These modules share a similar U-Net-variant architecture with several improvements: stacked convolutional layers at the lowest resolution in all modules, additional down-sampling and up-sampling in the phase unwrapping module to enlarge the receptive field, and removal of batch normalization in the refinement module. U-Net, a widely applied network architecture originally designed for segmentation, is known for requiring few training samples and for high efficiency in feature extraction and image resolution recovery. This is beneficial for processing fringe images, so U-Net has been adopted in many single-shot methods based on deep learning. However, single-shot methods often adopt a single U-Net for phase retrieval, phase unwrapping, or height map prediction. Phase unwrapping, and height map prediction (which essentially contains phase unwrapping), are ill-posed tasks that need long-range information to remove phase ambiguity. A single U-Net cannot guarantee the acquisition of long-range information, so this work adopts two extra improved U-Net modules to perform phase unwrapping and to refine the absolute phase map after the wrapped phase map is retrieved. The phase unwrapping module, with a large receptive field, gathers long-range information, and the refinement module, with a small receptive field, refines the details. The phase retrieval module predicts the wrapped phase map from a single-shot fringe image. The phase unwrapping module takes both the predicted wrapped phase map and the fringe image to generate the absolute phase map. The refinement module takes the fringe image and all predicted phase maps as input and outputs the refined absolute phase map. Despite using three modules, the FrANet is a lightweight network, with at most 48 filters in any layer. In contrast to large networks with 512 or more filters in one layer, the FrANet is less time-consuming and more viable for 3D measurement.
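The three-module pipeline can be summarized in a brief sketch (a minimal illustration, not the authors' released code: `TinyUNet` is an assumed stand-in for the improved U-Net module of Fig. 3, and the channel counts of the inputs are inferred from the module descriptions above):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Stand-in for the improved U-Net module of Fig. 3 (internals omitted)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(16, out_ch, 1), nn.Sigmoid(),  # sigmoid output in [0, 1]
        )
    def forward(self, x):
        return self.body(x)

class FrANet(nn.Module):
    """Three-stage pipeline: phase retrieval -> phase unwrapping -> refinement."""
    def __init__(self):
        super().__init__()
        self.retrieval = TinyUNet(1, 1)    # fringe image -> wrapped phase
        self.unwrapping = TinyUNet(2, 1)   # fringe + wrapped phase -> absolute phase
        self.refinement = TinyUNet(3, 1)   # fringe + both phase maps -> refined phase

    def forward(self, fringe):
        # Sigmoid outputs are scaled to each module's phase range (Section 2.1).
        wrapped = 2 * torch.pi * self.retrieval(fringe)
        absolute = 128 * torch.pi * self.unwrapping(torch.cat([fringe, wrapped], 1))
        refined = 128 * torch.pi * self.refinement(
            torch.cat([fringe, wrapped, absolute], 1))
        return wrapped, absolute, refined
```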

Fig. 2. Architecture of the FrANet.

The architecture of the improved U-Net module used in this work is shown in Fig. 3, including convolutional layers, down-sampling layers, up-sampling layers, concatenation layers, and an output layer. Convolutional layers use convolution with kernel size 3 × 3 and stride 1, batch normalization (BN), and leaky ReLU as the activation function. Down-sampling layers consist of convolution with kernel size 2 × 2 and stride 2, batch normalization, and leaky ReLU. The numbers of filters are identical at the same resolution of each module: 16, 24, 24, 32, 32, and 48 corresponding to 1, 1/2, 1/4, 1/8, 1/16, and 1/32 of the input resolution. Residual blocks operate at the lowest resolution using four convolutional layers with skip connections; the number of filters is 48 for these layers. Up-sampling layers utilize transposed convolution with kernel size 2 × 2 and stride 2 to recover the features to higher resolution, also followed by batch normalization and leaky ReLU. The number of filters at each resolution is 16, 24, 24, 16, and 8 from left to right. Up-sampled features are concatenated with the output of the convolutional layers at the same resolution in the down-sampling path. After that, two convolutional layers with kernel sizes 1 × 1 and 3 × 3, respectively, decrease the number of channels and conduct further feature extraction; both include convolution with stride 1, batch normalization, and leaky ReLU. The output layer uses convolution and a sigmoid function to convert features into the output dimension and value range of each module. The value range is from zero to one after the sigmoid activation and is then adjusted by multiplying by a scalar: the range of predicted wrapped phases is from 0 to 2π, while the range of predicted and refined absolute phases is from 0 to 128π.
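A hedged sketch of these basic layers (kernel sizes, strides, and activations as stated in the text; the helper names and the 1 × 1 kernel of the output convolution are our assumptions):

```python
import torch
import torch.nn as nn

def conv_layer(in_ch, out_ch):
    # Convolutional layer: 3x3 convolution (stride 1), BN, leaky ReLU.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, stride=1, padding=1),
                         nn.BatchNorm2d(out_ch), nn.LeakyReLU())

def down_layer(in_ch, out_ch):
    # Down-sampling layer: 2x2 convolution with stride 2, BN, leaky ReLU.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 2, stride=2),
                         nn.BatchNorm2d(out_ch), nn.LeakyReLU())

def up_layer(in_ch, out_ch):
    # Up-sampling layer: 2x2 transposed convolution with stride 2, BN, leaky ReLU.
    return nn.Sequential(nn.ConvTranspose2d(in_ch, out_ch, 2, stride=2),
                         nn.BatchNorm2d(out_ch), nn.LeakyReLU())

class OutputLayer(nn.Module):
    # Convolution + sigmoid, scaled to the module's range
    # (2*pi for wrapped phase, 128*pi for absolute phase).
    def __init__(self, in_ch, out_ch, scale):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 1)  # kernel size assumed 1x1
        self.scale = scale
    def forward(self, x):
        return self.scale * torch.sigmoid(self.conv(x))
```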

Fig. 3. Architecture of the improved U-Net module used in this work.

2.2 Architecture of the exposure selection network

The exposure selection network ExSNet fulfils the task of predicting the optimal exposure time and solves the high dynamic range problem in single-shot 3D measurement. The architecture of the ExSNet is shown in Fig. 4. The network adopts a convolutional architecture similar to classifiers: it extracts features at multiple resolutions, and the number of filters increases as the resolution decreases. The network consists of down-sampling layers, convolutional layers, one attention module, and one pooling layer. Each down-sampling layer consists of convolution with kernel size 2 × 2 and stride 2, batch normalization, and leaky ReLU as the activation function, followed by a convolutional layer with kernel size 3 × 3 and stride 1, batch normalization, and leaky ReLU. The numbers of filters remain the same at the same resolution: 16, 32, 64, 128, 256, 512, and 1024 corresponding to 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the input resolution. The optimal exposure time is determined by the highly reflective areas of the measured objects, which can be a tiny portion of the fringe image. Using convolutional layers alone, these areas are easily overwhelmed and the optimal exposure time cannot be accurately predicted. A self-attention mechanism is therefore adopted in the ExSNet: the attention module explores long-range spatial relationships and emphasizes the critical areas. The architecture of the attention module in the ExSNet is shown in Fig. 5.

Fig. 4. Architecture of the ExSNet.

Fig. 5. Architecture of attention module in ExSNet.

A popular self-attention mechanism known as scaled dot product attention can be expressed as:

$$A(Q,K,V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V,$$
where A is the result of the attention operation, Q is the query, K is the key, V is the value, and $d_k$ is the dimension of K. In image processing, the query, key, and value are obtained by convolutions of the features. When calculating the dot product, the image features are stretched into one dimension so that the attention mechanism can be executed in the spatial domain.

The attention module uses multi-head scaled dot-product attention followed by convolution with kernel size 1 × 1. The number of heads is set to 8. The pooling layer uses global average pooling over the spatial dimensions to convert each feature map into a single number. A subsequent convolution and sigmoid function further convert these numbers into one scalar in the range of exposure times.
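A minimal sketch of this design, assuming standard multi-head scaled dot-product attention over flattened spatial positions (PyTorch's nn.MultiheadAttention stands in for the authors' implementation, a linear layer stands in for the final convolution, and the exposure-time bounds are taken from Section 3):

```python
import torch
import torch.nn as nn

class SpatialSelfAttention(nn.Module):
    """Multi-head self-attention over flattened spatial positions, then 1x1 conv."""
    def __init__(self, channels, num_heads=8):
        super().__init__()
        # channels must be divisible by num_heads.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)  # stretch features to (B, H*W, C)
        out, _ = self.attn(seq, seq, seq)   # Q, K, V from the same features
        return self.proj(out.transpose(1, 2).reshape(b, c, h, w))

class ExposureHead(nn.Module):
    """Global average pooling, then a sigmoid mapped to the exposure-time range."""
    def __init__(self, channels, t_min=3000.0, t_max=45000.0):  # range from Sec. 3
        super().__init__()
        self.fc = nn.Linear(channels, 1)
        self.t_min, self.t_max = t_min, t_max

    def forward(self, x):                   # x: (B, C, H, W)
        pooled = x.mean(dim=(2, 3))         # spatial global average pooling
        return self.t_min + (self.t_max - self.t_min) * torch.sigmoid(self.fc(pooled))
```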

2.3 Training strategy

In the training process, the FrANet is trained first and the ExSNet is then trained based on its performance. Networks for fringe analysis often require a large amount of training data, and data acquisition is time-consuming because the ground truth is prepared using phase-shifting techniques. In this work, unsupervised learning [43] is adopted for pre-training of the FrANet. Unsupervised learning does not require ground-truth phase maps and is therefore more efficient in data acquisition. Fringe patterns at two frequencies are projected onto the measured objects, and fringe images are captured using a real-world FPP system. A large number of images is collected to form the unsupervised data set, which meets the data requirements of deep learning and avoids overfitting during training. Since unsupervised learning does not rely on real phase maps or height maps as ground truth, each training sample only needs two fringe images: one single-shot fringe image, without phase shifting, at each frequency. Thus, time-consuming phase-shifting projection during data collection is avoided and the memory required for data storage is reduced. After the reprojection process, the unsupervised loss is calculated between the real and reprojected fringe images:

$$L_{\mathrm{unsup}} = L(I_h,\bar{I}_h(\bar{\varphi}_h)) + L(I_h,\bar{I}_h(\bar{\Phi}_h)) + L(I_h,\bar{I}_h(\bar{\Phi}_h^r)) + L(I_l,\bar{I}_l(\bar{\Phi}_l)) + L(I_l,\bar{I}_l(\bar{\Phi}_l^r)),$$
where $L$ denotes the L1 loss function, $\bar{I}$ is the reprojected fringe image, the subscripts h and l denote the high and low frequencies, $\bar{\varphi}$ is the predicted wrapped phase map, and $\bar{\Phi}$ and $\bar{\Phi}^r$ are the predicted primary and refined absolute phase maps, respectively. Since the gradient scales of the five loss terms are similar, the same weight is applied to each term. The network is trained with the unsupervised loss until convergence.
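A sketch of Eq. (2), assuming a reproject function that renders a fringe image from a predicted phase map (its internals follow the reprojection model of [43] and are omitted here; the dictionary keys are our naming):

```python
import torch.nn.functional as F

def unsupervised_loss(I_h, I_l, preds, reproject):
    """L1 reprojection loss of Eq. (2). `preds` holds the five predicted phase
    maps; reproject(phase_map, freq) renders a fringe image from a phase map."""
    # Equal weights on the five terms, since their gradient scales are similar.
    return (F.l1_loss(I_h, reproject(preds["wrapped_h"], "high"))
            + F.l1_loss(I_h, reproject(preds["abs_h"], "high"))
            + F.l1_loss(I_h, reproject(preds["abs_h_refined"], "high"))
            + F.l1_loss(I_l, reproject(preds["abs_l"], "low"))
            + F.l1_loss(I_l, reproject(preds["abs_l_refined"], "low")))
```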

After pre-training, the FrANet is trained using supervised learning, which enables the network to learn directly from ground-truth phase maps prepared using phase-shifting profilometry. Twelve-step phase-shifting fringe patterns at four frequencies are projected onto the measured objects, again using the real-world FPP system. These 48 fringe images are captured under different exposure times for image fusion using the time-consuming multi-exposure HDR method. The 48 fused fringe images are then used to obtain the ground-truth phase maps of one group of training samples at different exposure times. The intensity of a phase-shifting fringe image can be described as:

$${I_n}(x,y) = {I_b}(x,y) + {I_a}(x,y)\cos [\varphi (x,y) + \frac{{2\pi n}}{N}],$$
where n is the index of the phase-shifting pattern, $I_n$ is the intensity of the image, x and y are the horizontal and vertical pixel coordinates, $I_b$ is the background intensity, $I_a$ is the fringe amplitude, $\varphi$ is the wrapped phase, and N is the number of phase-shifting steps. The wrapped phase can be retrieved using the following equation:
$$\varphi (x,y) = \arctan \frac{{\sum\limits_{n = 0}^{N - 1} {{I_n}(x,y)\sin \left( {\frac{{2\pi n}}{N}} \right)} }}{{\sum\limits_{n = 0}^{N - 1} {{I_n}(x,y)\cos \left( {\frac{{2\pi n}}{N}} \right)} }}.$$
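Equations (3) and (4) translate directly into a few lines of code; a minimal sketch (NumPy-based, with arctan2 used to resolve the quadrant of the arctangent):

```python
import numpy as np

def wrapped_phase(images):
    """Wrapped phase via Eq. (4) from an (N, H, W) stack of N-step
    phase-shifted fringe images."""
    N = images.shape[0]
    n = np.arange(N).reshape(-1, 1, 1)
    num = np.sum(images * np.sin(2 * np.pi * n / N), axis=0)
    den = np.sum(images * np.cos(2 * np.pi * n / N), axis=0)
    # arctan2 resolves the quadrant; the modulo maps the result into
    # [0, 2*pi), matching the wrapped-phase range used by the FrANet.
    return np.arctan2(num, den) % (2 * np.pi)
```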

Absolute phase maps are calculated using temporal phase unwrapping. The fringe order ${k_h}$ of the high-frequency image is computed using the following equation:

$$k_h(x,y) = \mathrm{round}\left(\frac{(f_h/f_l)\,\Phi_l(x,y) - \varphi_h(x,y)}{2\pi}\right),$$
where the round function rounds its argument to the nearest integer, $f_h$ is the high frequency, $f_l$ is the low frequency, $\Phi_l$ denotes the absolute phase of the low-frequency image, and $\varphi_h$ denotes the wrapped phase of the high-frequency image. At a frequency of 1, the absolute phase equals the wrapped phase; thus, the absolute phase of the high-frequency images can be calculated. Compared to fully supervised methods, fewer training samples are needed to fine-tune the network with prior knowledge, and the effect of overfitting is also reduced. Thus, less time is required to collect extra fringe images for computing ground-truth phase maps.
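A sketch of Eq. (5) and the resulting unwrapping step (the function name is ours):

```python
import numpy as np

def unwrap_high_frequency(phi_h, Phi_l, f_h, f_l):
    """Temporal phase unwrapping: fringe order k_h from Eq. (5), then
    the absolute phase Phi_h = phi_h + 2*pi*k_h."""
    k_h = np.round(((f_h / f_l) * Phi_l - phi_h) / (2 * np.pi))
    return phi_h + 2 * np.pi * k_h
```

Supervised loss is computed as: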
$$L_{\mathrm{sup}} = L(\varphi,\bar{\varphi}) + L(\Phi,\bar{\Phi}) + L(\Phi,\bar{\Phi}^r).$$

The ExSNet aims to predict the optimal exposure time for 3D measurement; the ideal prediction therefore corresponds to the highest 3D measurement accuracy achieved by the FrANet. The 3D reconstruction results of the FrANet are compared with the ground truth, and the accuracy for each set of fringe images is evaluated. The ground-truth optimal exposure time is the exposure time with the highest accuracy. Images captured at different exposure times serve as input to the ExSNet, and the L1 loss is calculated between the predicted exposure time and the ground truth. The ExSNet is trained using supervised learning. Since predicting the optimal exposure time is much easier than dense phase prediction, the ExSNet does not suffer from overfitting, and no specially designed training strategy is required once the ground-truth optimal exposure time has been obtained.

During 3D measurement, a single-shot fringe pattern at one frequency is projected onto the measured objects and a fringe image is captured using the real-world FPP system. The captured fringe image, with a possibly inappropriate exposure time, is used as the test sample and fed to the trained ExSNet to obtain the predicted optimal exposure time. The fringe image is then recaptured with the predicted exposure time and fed to the trained FrANet to obtain the refined absolute phase map of the measured object. The test samples were not used for training the FrANet. Lastly, single-shot 3D measurement is achieved by converting the refined absolute phase map into 3D coordinates using triangulation.
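The full measurement loop can be summarized in a short sketch (the camera interface, the triangulate routine, and the calibration object are placeholders for the actual system code, not a published API):

```python
def measure_single_shot(camera, exsnet, franet, triangulate, calib):
    """Single-shot HDR measurement: capture, select exposure, recapture,
    analyze fringes, triangulate."""
    image = camera.capture()                  # possibly over- or underexposed
    t_opt = exsnet(image)                     # predicted optimal exposure time (us)
    camera.set_exposure(t_opt)
    image = camera.capture()                  # recaptured at the selected exposure
    _, _, refined_phase = franet(image)       # refined absolute phase map
    return triangulate(refined_phase, calib)  # 3D coordinates of the scene
```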

3. Experiments

To verify the effectiveness of the proposed method, an FPP system consisting of a blue-light projector (Tengju Technology, China) and two digital cameras (Daheng Imaging, China) is built. The FPP system is shown in Fig. 6(a); only the left camera is used in the experiments. Figure 6(b) displays an industrial part from the test set, which is used for evaluation of the proposed method. The camera captures three-channel images with a resolution of 1296 × 964 pixels, and the resolution of the projector is 1280 × 720 pixels. During data acquisition, the projector projects vertical blue fringe patterns onto the measured objects. Pre-training of the FrANet uses 2000 groups of fringe images in each epoch, where each group contains fringe images of the same scene at two frequencies. The high and low frequencies are set to 64 and 9, respectively. For the data acquisition in supervised learning, 15 exposure times from 3000 µs to 45000 µs are evenly selected. Twenty groups of twelve-step phase-shifting fringe images are captured at four frequencies (64, 16, 4, and 1), each group covering all 15 exposure times. In the following sections, implementation details of the ExSNet and FrANet are first provided; the optimal exposure time predicted by the network and the 3D measurement accuracy are then evaluated; an ablation study on exposure selection, attention mechanism, and fringe analysis is conducted; and lastly the proposed method is compared with other HDR methods for 3D measurement.

Fig. 6. FPP system and an industrial part used for 3D measurement.

3.1 Implementation details

The proposed method is implemented in PyTorch and trained on a single RTX 2070 SUPER GPU. The Adam optimizer is used with parameters $\beta_1 = 0.9$ and $\beta_2 = 0.999$. During pre-training of the FrANet, the initial learning rate is set to $1 \times 10^{-3}$ and decreases to $1 \times 10^{-4}$ after 30 epochs; the whole pre-training process takes 40 epochs. Images are cropped to $1280 \times 768$ pixels. During fine-tuning of the FrANet, the initial learning rate is set to $3 \times 10^{-4}$ for 100 epochs and then changes to $8 \times 10^{-5}$ for another 100 epochs. During training of the ExSNet, the learning rate is set to $1 \times 10^{-3}$ for 200 epochs. A batch size of 2 and multiple data augmentations, including hue adjustment and random noise, are adopted. In one iteration, a batch of fringe images is processed by the networks; the size of the input tensor is [batch size, number of channels, height, width]. After training, the network performs single-shot high dynamic range measurement with a run time of 1.23 s in the same setup. The processes included in the run-time calculation are capturing a fringe image, predicting the optimal exposure time using the ExSNet, recapturing the fringe image, and yielding 3D coordinates using the FrANet.
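The pre-training optimizer and schedule correspond to a setup along these lines (a sketch; the stand-in model and the MultiStepLR milestone are our rendering of the stated schedule):

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this is the FrANet of Section 2.1.
franet = nn.Conv2d(1, 1, kernel_size=3, padding=1)

# Adam with beta1 = 0.9, beta2 = 0.999; pre-training runs at lr 1e-3
# for 30 epochs, then 1e-4 for the remaining 10 of the 40 epochs.
optimizer = torch.optim.Adam(franet.parameters(), lr=1e-3, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30], gamma=0.1)

for epoch in range(40):
    # ... one pass over the 2000 image groups with the Eq. (2) loss ...
    scheduler.step()
```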

3.2 Exposure selection evaluation

To evaluate whether the ExSNet can predict an accurate optimal exposure time, fringe images of an industrial part not in the training set are first captured, and the ground-truth absolute phase maps are obtained using the multi-exposure HDR method and twelve-step phase-shifting techniques. Since the goal of exposure selection is accurate 3D measurement, the exposure time whose fringe image yields the highest 3D measurement accuracy with the FrANet is taken as the ground-truth optimal exposure time. By processing fringe images captured at a set of exposure times from 3000 µs to 45000 µs with the FrANet, the ground-truth optimal exposure time corresponding to the best network performance is found to be 18000 µs. A single fringe image is processed by the ExSNet to predict the optimal exposure time, and the exposure-selected fringe image generates the predicted absolute phase map through the FrANet. The single fringe image adopted as input is the image with zero phase shift, corresponding to $n = 0$ in Eq. (3). To compare the ExSNet with other exposure selection methods, the exposure-selected fringe images derived from the other methods are also processed by the FrANet, and the absolute phase error maps are calculated. The results, including exposure-selected fringe images and absolute phase error maps, are shown in Fig. 7. Exposure selection methods in computer vision aim to improve image quality for better detection, tracking, or other vision tasks; they include a gradient-based method and an entropy-based method. Liu et al. designed an exposure selection method for 3D measurement [42]. The result of this method depends on whether the fringe image is cropped, due to the influence of background with modulation; thus, results with and without cropping are both presented.

Fig. 7. Exposure selection results using different methods. The left column is exposure selected fringe images and the right column is absolute phase error maps.

The exposure times selected by the gradient-based method, the entropy-based method, the uncropped exposure selection method [42], the cropped exposure selection method [42], and the proposed ExSNet are 36207 µs, 25841 µs, 34155 µs, 14565 µs, and 17593 µs, respectively. The corresponding average absolute phase errors are 23.36 mrad, 12.74 mrad, 15.55 mrad, 8.39 mrad, and 5.22 mrad. The optimal exposure time, the selected exposure times and their errors, and the corresponding average absolute phase errors of the different methods are shown in Fig. 8.

Fig. 8. Optimal exposure time, selected exposure time and error, and corresponding average absolute phase error using different methods.

Since the gradient-based and entropy-based methods focus on the general quality of images, they are not sensitive to overexposure and select exposure times longer than optimal. The exposure selection method [42] applied to uncropped images is inaccurate due to large areas of modulated background, while the cropped variant is more accurate but yields slightly underexposed images. The results show that the ExSNet predicts an accurate optimal exposure time, at which the FrANet almost reaches peak performance, and yields the most accurate absolute phase map among the compared exposure selection methods.

3.3 3D measurement evaluation

To verify the effectiveness of the proposed method for single-shot high dynamic range 3D measurement, a fringe image with overexposed areas is captured and processed. For a detailed evaluation of the optimal exposure time and the phase maps predicted by the ExSNet and FrANet, the corresponding ground truths are also calculated. As in the exposure selection evaluation, fringe images at a set of exposure times from 3000 µs to 45000 µs are used to generate the ground-truth phase maps. Single-shot images at different exposure times are then processed by the FrANet to determine the ground-truth optimal exposure time, which is 12000 µs. The evaluation of exposure selection using the ExSNet and phase map prediction using the FrANet is shown in Fig. 9. The exposure time selected by the ExSNet is 13559 µs, which falls between the optimal exposure time and its adjacent value in the set. The average errors of the predicted wrapped phase map, the predicted absolute phase map, and the refined absolute phase map are 362.3 mrad, 19.1 mrad, and 5.5 mrad, respectively. Due to the learning principle of deep neural networks, the predicted phase maps contain values even in areas without modulation. Since the inputs of all three modules in the FrANet include the single-shot fringe image, the wrapped phase map predicted by the phase retrieval module can still be refined during phase unwrapping. The predicted absolute phase map is largely accurate, with some error at the borders between different fringe orders. The refined absolute phase map, as the final output of the FrANet, fixes the error at these borders and has high accuracy. The right side of the industrial part is overexposed in the original fringe image, but the network predictions in this area are equally accurate and show no blur. This demonstrates that the proposed method is capable of accurate single-shot high dynamic range 3D measurement.

Fig. 9. Evaluation of exposure selection using ExSNet and phase map prediction using FrANet.

To quantitatively evaluate the proposed method in 3D measurement of moving objects, we also measure moving standard spheres with a diameter of 30 mm ± 2 µm and a center distance of 100 mm ± 2 µm. The results are shown in Fig. 10. The dynamic scene is shown in Fig. 10(a), where the standard spheres are attached to an inextensible rope to form a pendulum. Fringe images are captured when the standard spheres reach the lowest height and highest speed. The pendulum length is 695 mm and the standard spheres are released at a horizontal deviation of 150 mm, corresponding to a highest speed of 0.567 m/s. The original fringe image is used for exposure selection, and the predicted optimal exposure time is 9012 µs. The exposure-adjusted image shown in Fig. 10(b) is fed into the FrANet to complete the 3D measurement. The network predicts the wrapped phase map and the absolute phase map, as shown in Figs. 10(c) and (d). The predicted absolute phase map is converted into 3D coordinates using previously obtained calibration parameters. The predicted diameters are 30.073 mm for the left sphere and 29.936 mm for the right sphere, with errors of 73 ± 2 µm and 64 ± 2 µm; the predicted center distance is 100.049 mm, with an error of 49 ± 2 µm. Note that the white standard spheres and the black rod differ greatly in exposure level, which makes it challenging to reconstruct them in one exposure. The proposed method can recover information in underexposed regions and reconstruct the standard spheres and rod from one fringe image.
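As a consistency check, the quoted speed follows from energy conservation for a pendulum of length $L = 695$ mm released at a horizontal deviation $d = 150$ mm (assuming $g = 9.8\ \mathrm{m/s^2}$):

$$v = \sqrt{2g\left(L - \sqrt{L^2 - d^2}\right)} = \sqrt{2 \times 9.8 \times \left(0.695 - \sqrt{0.695^2 - 0.150^2}\right)}\ \mathrm{m/s} \approx 0.567\ \mathrm{m/s},$$

where $L - \sqrt{L^2 - d^2}$ is the height drop between the release point and the lowest point.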

Fig. 10. Results of 3D measurement of moving standard spheres.

3.4 Ablation study

To provide a clear view of the contributions made by each technique proposed in this work, an ablation study on exposure selection, attention mechanism, and fringe analysis is conducted. The results are shown in Table 1. The relative root-mean-square error (relative RMSE) between predicted and ground-truth 3D coordinates on the test set is selected as the metric. The relative RMSE can be expressed as:

$$\mathrm{Relative\ RMSE} = \sqrt{\frac{\sum_{x,y} |c(x,y) - g(x,y)|^2}{\sum_{x,y} |a(x,y) - g(x,y)|^2}},$$
where $(x,y)$ is the pixel location, c is the predicted 3D coordinates using the current subset of techniques, g is the ground-truth 3D coordinates, and a is the predicted 3D coordinates using all techniques. The test set mainly consists of fringe images with overexposed areas. The ground-truth 3D coordinates are calculated using twelve-step phase shifting and temporal phase unwrapping. "Without exposure selection" means that raw fringe images without exposure adjustment are used as the input of the FrANet. "With exposure selection but without attention mechanism" means that the attention module in the ExSNet is replaced with a convolutional layer. "FTP for fringe analysis" means that, after exposure selection, fringe images are processed using FTP instead of the FrANet. The results demonstrate the contribution of each technique to the 3D measurement accuracy.
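A sketch of Eq. (7) (the arrays hold per-pixel 3D coordinates; the function name is ours):

```python
import numpy as np

def relative_rmse(pred_current, pred_full, ground_truth):
    """Relative RMSE of Eq. (7): error of the current (ablated) variant,
    normalized by the error of the method with all techniques enabled."""
    num = np.sum(np.abs(pred_current - ground_truth) ** 2)
    den = np.sum(np.abs(pred_full - ground_truth) ** 2)
    return np.sqrt(num / den)
```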

Table 1. Results of ablation study on exposure selection, attention mechanism and fringe analysis techniques.

3.5 Comparison with other methods

The proposed method is compared with other high dynamic range methods for 3D measurement: the multi-exposure HDR method using three exposures and the single-exposure method of Liu et al. [42]. Since multi-exposure HDR methods are time-consuming, only three exposures are used for fringe image fusion: 12000 µs, 18000 µs, and 24000 µs. Phase maps are calculated using four-step phase-shifting techniques and temporal phase unwrapping. The single-exposure method of Liu et al. [42] includes exposure selection using a hand-crafted metric and image enhancement using a deep neural network; the enhanced fringe images also generate phase maps using four-step phase-shifting techniques and temporal phase unwrapping. This method is the one most closely related to the proposed method. Instead of using a hand-crafted metric, the proposed method uses deep-learning-based exposure selection that directly opts for the best measurement accuracy. Moreover, image enhancement is integrated into the FrANet for single-shot measurement instead of requiring a separate network.

Results of the different high dynamic range 3D measurement methods are shown in Figs. 11–13. For the multi-exposure HDR method, three exposures cannot fully recover the underexposed and overexposed areas: in the fused fringe image, the left side of the industrial part is underexposed and a large portion of it is overexposed. Thus, the absolute phase map calculated by this method has the largest error. The single-exposure method of Liu et al. [42] conducts exposure selection using its hand-crafted metric and enhances the selected fringe image. The enhancement alleviates overexposure and underexposure by making the modulation more evenly distributed. However, image enhancement by a deep neural network cannot predict accurate modulation in overexposed regions due to information loss, which introduces extra error into the absolute phase map. The proposed method selects the optimal exposure time directly based on FrANet performance. Since the FrANet integrates the process of image enhancement, no extra enhancement is needed. The absolute phase map predicted by the proposed method is the most accurate among these methods.

Fig. 11. Results of multi-exposure HDR method using three exposures.

Fig. 12. Results of single-exposure method of Liu et al. [42].

Fig. 13. Results of the proposed method.

The absolute phase maps of the different methods are converted into 3D coordinates using triangulation; the 3D reconstruction results are shown in Fig. 14. The result of the multi-exposure method with three exposures misses the left side of the industrial part due to the large phase error in this area. The single-exposure method of Liu et al. [42] cannot fully recover the information lost in image enhancement, which leads to scattered points in the reconstruction result. The proposed method accurately reconstructs the industrial part from a single-shot fringe image; most of the remaining reconstruction error comes from the challenging underexposed region on the left side. The absolute phase map errors and reconstruction errors of the different high dynamic range methods are shown in Fig. 15. The absolute phase map error is the mean absolute error between the absolute phase map retrieved in 3D measurement and the ground-truth absolute phase map; the reconstruction error is the root-mean-square error between the measured 3D coordinates and the ground truth. The proposed method is superior to the other HDR methods in both phase map prediction and reconstruction.

Fig. 14. 3D reconstruction using different high dynamic range methods.

Fig. 15. Absolute phase map error and reconstruction error of different high dynamic range methods.

4. Discussions

Single-shot measurement of objects with highly reflective areas has always been challenging. Traditional HDR methods that capture fringe images in a single exposure mostly rely on additional or advanced devices, which increases the cost of the FPP system. HDR methods with multiple exposures can hardly achieve real-time measurement and are not viable for single-shot measurement in dynamic scenes. Prior attempts at single-shot HDR 3D measurement have been very limited. Methods using deep neural networks for image enhancement and fringe analysis have been proposed; they generally rely on the image enhancement ability of deep learning to increase the accuracy in highly reflective or underexposed areas. The problem with such algorithm-based methods is that the information loss in overexposed and underexposed regions cannot be fully recovered, since only one fringe image is processed.

In this work, a method combining deep-learning-based exposure selection with deep-learning-based fringe analysis is proposed to achieve single-shot HDR 3D measurement. The proposed method has multiple advantages over other 3D measurement methods. Compared to multi-shot methods using phase-shifting techniques and temporal phase unwrapping, the proposed method offers higher measurement speed and can measure moving objects. Multi-exposure HDR methods are more time-consuming than the proposed method and cannot achieve real-time measurement. Single-exposure HDR methods are either more costly, requiring additional or advanced devices, or less accurate, compensating for information loss with algorithms alone. The proposed method only requires one projector and one camera in the FPP system. The exposure time is selected by the ExSNet, directly opting for optimal FrANet performance, which avoids information loss in overexposed areas and enables single-shot HDR measurement. Predicting the optimal exposure time with a deep network and recapturing the fringe image improves upon the accuracy of fully algorithm-based methods. Only a single fringe image is required for phase map calculation, which enables the proposed method to measure objects that move between image shots. Further research may include optimizing the architectures of the ExSNet and FrANet, refining the set of exposure times to reduce the exposure selection interval, and applying the proposed method to 3D measurement with temporal noise.

5. Conclusion

To tackle the high dynamic range problem in single-shot 3D measurement, an ExSNet for exposure selection combined with a FrANet for fringe analysis of the exposure-adjusted fringe images has been proposed. The ExSNet adopts a self-attention mechanism to enhance highly reflective and underexposed areas, achieving high dynamic range single-shot 3D measurement. The FrANet consists of three modules that predict wrapped phase maps and absolute phase maps, improving the accuracy of phase retrieval and phase unwrapping. Experiments on an FPP system have shown that the proposed method predicts an accurate optimal exposure time and that the corresponding absolute phase map prediction is superior to that of other exposure selection methods. Moreover, accurate high dynamic range 3D measurement was performed on industrial parts, and moving standard spheres with overexposure were measured: the prediction errors for the diameters were 73 µm (left) and 64 µm (right), and the prediction error for the center distance was 49 µm. The ablation study has shown that exposure selection, attention mechanism, and fringe analysis techniques all contributed to the 3D measurement accuracy. The multi-exposure HDR method using three exposures and the single-exposure method of other researchers have been compared with the proposed method; the proposed method yielded the most accurate results in absolute phase map prediction and 3D reconstruction.

Funding

National Natural Science Foundation of China (52075100).

Disclosures

The authors declare no conflict of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. S. Feng, C. Zuo, L. Zhang, T. Tao, Y. Hu, W. Yin, J. Qian, and Q. Chen, “Calibration of fringe projection profilometry: A comparative review,” Opt. Lasers Eng. 143, 106622 (2021). [CrossRef]  

2. C. Zuo, S. Feng, L. Huang, T. Tao, W. Yin, and Q. Chen, “Phase shifting algorithms for fringe projection profilometry: a review,” Opt. Lasers Eng. 109, 23–59 (2018). [CrossRef]  

3. C. Zuo, L. Huang, M. Zhang, Q. Chen, and A. Asundi, “Temporal phase unwrapping algorithms for fringe projection profilometry: a comparative review,” Opt. Lasers Eng. 85, 84–103 (2016). [CrossRef]  

4. J. Zeng, W. Ma, W. Jia, Y. Li, H. Li, X. Liu, and M. Tan, “Self-Unwrapping Phase-Shifting for Fast and Accurate 3-D Shape Measurement,” IEEE Trans. Instrum. Meas. 71, 1–12 (2022). [CrossRef]  

5. J. Wang, Y. Yang, M. Shao, and Y. Zhou, “Three-Dimensional Measurement for Rigid Moving Objects Based on Multi-Fringe Projection,” IEEE Photonics J. 12(4), 1–14 (2020). [CrossRef]  

6. Y. Hu, Q. Chen, Y. Liang, S. Feng, T. Tao, and C. Zuo, “Microscopic 3D measurement of shiny surfaces based on a multi-frequency phase-shifting scheme,” Opt. Lasers Eng. 122, 1–7 (2019). [CrossRef]  

7. H. Li, Y. Cao, Y. Wan, C. Xu, H. Zhang, H. An, and H. Wu, “An improved temporal phase unwrapping based on super-grayscale multi-frequency grating projection,” Opt. Lasers Eng. 153, 106990 (2022). [CrossRef]  

8. L. Li, Y. Zheng, K. Yang, X. Su, Y. Wang, X. Chen, Y. Wang, and B. Li, “Modified three-wavelength phase unwrapping algorithm for dynamic three-dimensional shape measurement,” Opt. Commun. 480, 126409 (2021). [CrossRef]  

9. M. Servin, M. Padilla, G. Garnica, and E. Gonzalez, “Profilometry of three-dimensional discontinuous solids by combining two-steps temporal phase unwrapping, co-phased profilometry and phase-shifting interferometry,” Opt. Lasers Eng. 87, 75–82 (2016). [CrossRef]  

10. B. Lin, S. Fu, C. Zhang, F. Wang, and Y. Li, “Optical fringe patterns filtering based on multi-stage convolution neural network,” Opt. Lasers Eng. 126, 105853 (2020). [CrossRef]  

11. P. Yao, S. Gai, and F. Da, “Coding-Net: A multi-purpose neural network for Fringe Projection Profilometry,” Opt. Commun. 489, 126887 (2021). [CrossRef]  

12. P. Yao, S. Gai, and F. Da, “Super-resolution technique for dense 3D reconstruction in fringe projection profilometry,” Opt. Lett. 46(18), 4442–4445 (2021). [CrossRef]  

13. M. Takeda and K. Mutoh, “Fourier transform profilometry for the automatic measurement of 3-D object shapes,” Appl. Opt. 22(24), 3977–3982 (1983). [CrossRef]  

14. Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 (2007). [CrossRef]  

15. J. Zhong and J. Weng, “Spatial carrier-fringe pattern analysis by means of wavelet transform: Wavelet transform profilometry,” Appl. Opt. 43(26), 4993–4998 (2004). [CrossRef]  

16. S. Feng, Q. Chen, G. Gu, T. Tao, L. Zhang, Y. Hu, W. Yin, and C. Zuo, “Fringe pattern analysis using deep learning,” Adv. Photonics 1(02), 1 (2019). [CrossRef]  

17. W. Yin, J. Zhong, S. Feng, T. Tao, J. Han, L. Huang, Q. Chen, and C. Zuo, “Composite deep learning framework for absolute 3D shape measurement based on single fringe phase retrieval and speckle correlation,” J. Phys. Photonics 2(4), 045009 (2020). [CrossRef]

18. B. Zhang, S. Lin, J. Lin, and K. Jiang, “Single-shot high-precision 3D reconstruction with color fringe projection profilometry based BP neural network,” Opt. Commun. 517, 128323 (2022). [CrossRef]  

19. W. Hu, H. Miao, K. Yan, and Y. Fu, “A fringe phase extraction method based on neural network,” Sensors 21(5), 1664 (2021). [CrossRef]  

20. H. Nguyen, E. Novak, and Z. Wang, “Accurate 3D reconstruction via fringe-to-phase network,” Measurement 190, 110663 (2022). [CrossRef]  

21. J. Liang, J. Zhang, J. Shao, B. Song, B. Yao, and R. Liang, “Deep convolutional neural network phase unwrapping for fringe projection 3D imaging,” Sensors 20(13), 3691 (2020). [CrossRef]  

22. J. Shi, X. Zhu, H. Wang, L. Song, and Q. Guo, “Label enhanced and patch based deep learning for phase retrieval from single frame fringe pattern in fringe projection 3D measurement,” Opt. Express 27(20), 28929–28943 (2019). [CrossRef]  

23. H. Yu, X. Chen, Z. Zhang, C. Zuo, Y. Zhang, and D. Zheng, “Dynamic 3-D measurement based on fringe-to-fringe transformation using deep learning,” Opt. Express 28(7), 9405–9418 (2020). [CrossRef]  

24. Y. Yang, Q. Hou, Y. Li, Z. Cai, X. Liu, J. Xi, and X. Peng, “Phase error compensation based on Tree-Net using deep learning,” Opt. Lasers Eng. 143, 106628 (2021). [CrossRef]  

25. H. Nguyen and Z. Wang, “Accurate 3D shape reconstruction from single structured-light image via fringe-to-fringe network,” Photonics 8(11), 459 (2021). [CrossRef]  

26. W. Yin, Q. Chen, S. Feng, T. Tao, L. Huang, M. Trusiak, A. Asundi, and C. Zuo, “Temporal phase unwrapping using deep learning,” Sci. Rep. 9(1), 20175 (2019). [CrossRef]  

27. J. Qian, S. Feng, T. Tao, Y. Hu, Y. Li, Q. Chen, and C. Zuo, “Deep-learning-enabled geometric constraints and phase unwrapping for single-shot absolute 3d shape measurement,” APL Photonics 5(4), 046105 (2020). [CrossRef]  

28. P. Yao, S. Gai, Y. Chen, W. Chen, and F. Da, “A multi-code 3D measurement technique based on deep learning,” Opt. Lasers Eng. 143, 106623 (2021). [CrossRef]  

29. W. Li, J. Yu, S. Gai, and F. Da, “Absolute phase retrieval for a single-shot fringe projection profilometry based on deep learning,” Opt. Eng. 60(06), 064104 (2021). [CrossRef]  

30. A. Nguyen, O. Rees, and Z. Wang, “Learning-based 3D imaging from single structured-light image,” Graphical Models 126, 101171 (2023). [CrossRef]  

31. S. Van der Jeught and J. J. Dirckx, “Deep neural networks for single shot structured light profilometry,” Opt. Express 27(12), 17091–17101 (2019). [CrossRef]  

32. H. Nguyen, Y. Wang, and Z. Wang, “Single-shot 3d shape reconstruction using structured light and deep convolutional neural networks,” Sensors 20(13), 3718 (2020). [CrossRef]  

33. H. Nguyen, T. Tan, Y. Wang, and Z. Wang, “Three-dimensional shape reconstruction from single-shot speckle image using deep convolutional neural networks,” Opt. Lasers Eng. 143, 106639 (2021). [CrossRef]  

34. H. Jiang, H. Zhao, and X. Li, “High dynamic range fringe acquisition: A novel 3-D scanning technique for high-reflective surfaces,” Opt. Lasers Eng. 50(10), 1484–1493 (2012). [CrossRef]  

35. R. Yonesaka, Y. Lee, P. Xia, T. Tahara, Y. Awatsuji, K. Nishio, and O. Matoba, “High dynamic range digital holography and its demonstration by off-axis configuration,” IEEE Trans. Ind. Inform. 12(5), 1658–1663 (2016). [CrossRef]  

36. Z. Song, H. Jiang, H. Lin, and S. Tang, “A high dynamic range structured light means for the 3D measurement of specular surface,” Opt. Lasers Eng. 95, 8–16 (2017). [CrossRef]  

37. U. Cogalan and A. O. Akyuz, “Deep Joint Deinterlacing and Denoising for Single Shot Dual-ISO HDR Reconstruction,” IEEE Trans. Image Process. 29, 7511–7524 (2020). [CrossRef]

38. C. Jiang, T. Bell, and S. Zhang, “High dynamic range real-time 3d shape measurement,” Opt. Express 24(7), 7337–7346 (2016). [CrossRef]  

39. K. Wu, Y. Xie, L. Lu, Y. Yin, and J. Xi, “Three-dimensional reconstruction of moving HDR object based on PSP,” Opt. Lasers Eng. 163, 107451 (2023). [CrossRef]  

40. L. Zhang, Q. Chen, C. Zuo, and S. Feng, “High-speed high dynamic range 3D shape measurement based on deep learning,” Opt. Lasers Eng. 134, 106245 (2020). [CrossRef]  

41. G. Yang, G. Yang, M. Yang, N. Zhou, and Y. Wang, “High dynamic range fringe pattern acquisition based on deep neural network,” Opt. Commun. 512, 127765 (2022). [CrossRef]  

42. X. Liu, W. Chen, H. Madhusudanan, J. Ge, and Y. Sun, “Optical measurement of highly reflective surfaces from a single exposure,” IEEE Trans. Ind. Inform. 17(3), 1882–1891 (2021). [CrossRef]  

43. S. Fan, S. Liu, X. Zhang, H. Huang, W. Liu, and P. Jin, “Unsupervised deep learning for 3D reconstruction with dual-frequency fringe projection profilometry,” Opt. Express 29(20), 32547–32566 (2021). [CrossRef]  
