
Super-resolution non-line-of-sight imaging based on temporal encoding

Open Access

Abstract

Non-line-of-sight (NLOS) imaging techniques have the ability to reconstruct objects beyond line-of-sight view, which would be useful in a variety of applications. In transient NLOS techniques, a fundamental problem is that the time resolution of imaging depends on the single-photon timing resolution (SPTR) of a detector. In this paper, a temporal super-resolution method named temporal encoding non-line-of-sight (TE-NLOS) is proposed. Specifically, by exploiting the spatial-temporal correlation among transient images, high-resolution transient images can be reconstructed through modulator encoding. We have demonstrated that the proposed method is capable of reconstructing transient images with a time resolution of 20 picoseconds from a detector with a limited SPTR of approximately nanoseconds. In systems with low time jitter, this method exhibits superior accuracy in reconstructing objects compared to direct detection, and it also demonstrates robustness against miscoding. Utilizing high-frequency modulation, our framework can reconstruct accurate objects with coarse-SPTR detectors, which provides an enlightening reference for solving the problem of hardware defects.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Imaging objects or scenes outside the direct line of sight of cameras is called non-line-of-sight (NLOS) imaging [1,2]. NLOS imaging technology has various applications in autonomous driving, medical imaging, and public safety. In recent years, many different methods have been proposed to solve NLOS problems, such as speckle correlations [3–9], occlusion-based imaging [10–12], Fermat paths [13], acoustic echoes [14,15], intensity-based imaging [16–19], and transient techniques [20–23].

Among these methods, the transient technique is the most popular for its ability to recover the 3D shapes of hidden objects. In transient-based NLOS imaging, photons traversing three-bounce light paths are few and sparse, so many methods have been proposed to decouple more information from the limited photons. The earliest method used to reconstruct NLOS objects from measured transient images [21,24] is the back-projection algorithm, first proposed by Velten et al. [20]. Later works added feature extraction [25] and frame-to-frame connections [26] to the back-projection algorithm to improve imaging resolution. In 2018, O'Toole et al. simplified the transient formation into a 3D convolutional model [27]. Subsequently, Bernd et al. [28] jointly modeled the hidden object's albedo and surface normal on top of the deconvolution model to obtain more accurate surface reconstructions. In addition to geometric-optics models, wave-propagation models are also used to reconstruct NLOS objects. In 2019, the frequency-wavenumber migration (FK) method proposed by Lindell et al. [29] realized scene reconstruction through a space-time transformation of the wave field. In the same year, Liu et al. [30] proposed a phasor-field virtual-wave imaging method based on Rayleigh–Sommerfeld diffraction (RSD). Benefiting from the accuracy of space-time wave-optics modeling, the reconstruction quality of wave-propagation algorithms is effectively improved. Recently, a transformer has been designed to capture local and global spatial-temporal correlations [31] in 3D NLOS measurements [32]. The methods mentioned above effectively improve NLOS imaging by exploiting the spatial-temporal features of transient images. However, implementing these methods requires an ultrafast time-resolved detector to record the returned photon information.

Compressed sensing can effectively exploit the sparsity of scenes, which makes it well suited to NLOS imaging [33]. Some methods improve the spatial resolution of time-resolved detectors through compressive modulation with a digital micromirror device (DMD) [34,35]. Other methods use regularization terms [36,37] or deep learning [38] to reconstruct high-resolution objects from a small number of scanning points. These spatial super-resolution NLOS methods improve sampling rates, but the imaging resolution of NLOS imaging is still determined by the time resolution of the system [27]. The time resolution of a system is governed by the trade-off between time jitter and single-photon timing resolution (SPTR). Some methods have been proposed to reduce time jitter to a few picoseconds [39]. However, due to time-to-digital converter (TDC) fabrication constraints, the SPTR of some time-resolved detectors is fixed and difficult to match to a time jitter of a few picoseconds. In this case, the bottleneck of the system's imaging resolution lies in the system's SPTR [40,41]. When the SPTR of a detector is coarse, photons reflected by objects at different depths are all recorded in the same time bin, which complicates imaging algorithms. Some researchers have proposed combining a single-point single-photon avalanche diode (SPAD) with a DMD instead of a SPAD array for NLOS imaging [42], using the higher SPTR of the single-point SPAD to compensate for the low time resolution of SPAD arrays. However, the resolution of this method is still limited by the SPTR of the single-point SPAD. There are also methods that introduce sub-bin delays into the incident waveform to improve the SPTR of detectors [41]. This approach requires multiple measurements of each point, which unavoidably reduces efficiency. Therefore, our interest is in exploiting the spatial-temporal correlation and sparsity of transient images to efficiently improve the SPTR of systems.

In this paper, a temporal encoding non-line-of-sight imaging technique (TE-NLOS) is proposed, which reduces requirements on the SPTR of detectors. Based on the spatial-temporal correlation of transient images, the snapshot compressive imaging (SCI) model is employed to reconstruct transient images. High-resolution transient images can be reconstructed from compressed measurements and known encoding sequences using the generalized alternating projection (GAP) algorithm. A high-frequency modulator is utilized to randomly encode each position, effectively reducing the reliance on high-SPTR detectors. We demonstrate that TE-NLOS enables a detector with 1.28 ns SPTR to collect photon information at 20 ps precision, breaking the SPTR limit of detectors. Meanwhile, the proposed method is robust to code types and miscoding. The feasibility of our method has been verified in simulations with both synthetic and public data. Experiments show that when a modulator has a "half-on half-off" miscoding rate of 40%, the structural similarity index (SSIM) between the reconstructed object and the ground truth (GT) remains above 0.5. Even under more complex miscoding in modulators, the proposed method remains resistant to interference. Compared with conventional methods, the proposed method stands out most in systems with low time jitter. The results demonstrate the accuracy and robustness of our scheme and provide new ideas for obtaining information with high temporal precision.

2. Methods

The detectors used in NLOS imaging mainly comprise a SPAD and a time-correlated single-photon counting (TCSPC) module. TCSPC records the time of flight (TOF) of photons returning from an object's surface at intervals equal to the SPTR. However, limited by the manufacturing process, the SPTR of some low-cost consumer SPADs and SPAD arrays cannot reach several picoseconds [43]. A low-SPTR detector records photons reflected from different depths in the same time bin, and these indistinguishable photons lead to poor resolution of the reconstructed objects. Therefore, it is necessary to improve the SPTR of a detector, and transient images with high SPTR are helpful for the high-quality reconstruction of objects.

In an NLOS imaging system, the photons collected by the SPAD contain valid information about hidden objects. Most NLOS methods require raster scanning of the intermediary wall to acquire hidden-scene information from different scanning positions. The photon histograms collected at different scanning positions form a transient cube containing spatial and temporal information. This cube is sparse and highly correlated: as Fig. 1(a) shows, the collected transient images gradually evolve and diffuse over time, with strong correlation between frames. However, limited by the SPTR of detectors, ultrafast transient images with high time resolution cannot be collected directly. SCI can capture high-dimensional data from compressed measurements by hardware encoding within a certain exposure time, which reduces data storage space and enables high-speed imaging. Based on this idea, many methods have been proposed to achieve high-resolution video reconstruction [44–46]. In our method, SCI technology is introduced into NLOS imaging, and the SPTR limit of detectors can be broken by temporally encoding each bin. This makes it possible for coarse-SPTR detectors to achieve high-resolution NLOS object reconstruction.


Fig. 1. TE-NLOS imaging system. (a) Non-line-of-sight imaging system based on temporal encoding. (b) Detector composition. (c) Photon histograms collected by SPAD in the conventional method and the proposed method.


2.1 Imaging principle and system design

TE-NLOS system mainly includes a pulsed laser, an intensity modulator, a single-point SPAD, and a TCSPC module. The experimental setup is shown in Fig. 1.

Specifically, when the intermediary wall is illuminated by a pulsed laser, the NLOS detection system scans each position equidistantly to obtain a complete 3D data cube containing spatial and temporal information. Different from conventional methods, our method requires continuous time modulation within each bin during detection. Time modulation here refers to randomly switching the passage of light on and off at high frequency in a time sequence, which can be achieved by a high-bandwidth modulator. Theoretically, the light reflected from each scanning point is amplified by an erbium-doped fiber amplifier (EDFA) and input to an optical intensity Mach-Zehnder modulator (MZM), as shown in Fig. 1(b). The modulator is driven by a microwave signal. In this way, the signal collected by the SPAD in each bin is the superposition of multiple randomly encoded light intensities within that bin. The other points of the intermediary wall are encoded and collected one by one. Finally, high-precision transient images can be reconstructed from the compressed sequences acquired at multiple sampling points. In this method, the SPTR of the transient images is determined by the bandwidth of the modulator rather than the bin width of the detector. Figure 1(c) shows the histograms of a certain point collected by the conventional method and by our method. The proposed method reconstructs a photon histogram with higher precision by encoding four times within a bin. This modulation liberates transient NLOS imaging from heavy reliance on high-SPTR SPADs, so ultrafast photon distributions can be recorded even with a SPAD of low timing precision.
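The per-bin superposition illustrated in Fig. 1(c) can be sketched numerically. The following minimal NumPy example uses hypothetical values (B = 4 sub-slots of 20 ps inside each 80 ps bin, a random photon flux) to contrast what a conventional detector integrates with what the coded detector integrates:

```python
import numpy as np

rng = np.random.default_rng(0)

B = 4                       # sub-slots per detector bin (e.g. 20 ps slots in an 80 ps bin)
n_bins = 8                  # detector bins in the histogram
flux = rng.random(n_bins * B)           # hypothetical photon flux at sub-slot resolution
mask = rng.integers(0, 2, n_bins * B)   # random on/off modulation, one value per sub-slot

# Conventional detection: each bin integrates all B sub-slots indiscriminately.
conventional = flux.reshape(n_bins, B).sum(axis=1)

# TE-NLOS detection: each bin integrates only the sub-slots the modulator lets through.
coded = (flux * mask).reshape(n_bins, B).sum(axis=1)

print(conventional.shape, coded.shape)  # both (8,)
```

Each coded bin carries a known random mixture of its sub-slots, which is what later allows the sub-slot values to be recovered.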

2.2 Mathematical model and algorithm design

In the TE-NLOS system, the SPAD samples every position $(i,j)$ of the intermediary wall. The detected histogram of each position contains $N$ bins, and the bin width is $\Delta t$. We consider an exposure model in which a pixel at position $(i,j)$ integrates the incident radiance over the exposure time $\Delta t$. After multiple laser-pulse illuminations and photon accumulation, the exposure $E_{i,j,n}$ during the $n$th interval $\Delta t$ is the integral of the incident radiance $x_{i,j,n}$ over that exposure time:

$$E_{i, j, n}=\int_{(n-1) \Delta t}^{n \Delta t} x_{i, j, n}(t)\, dt \quad (n \in\{1,2,3, \ldots, N\}),$$
where $n$ is the time bin index. Here, the incident radiance $x_{i,j,n}$ represents high-speed photon transient images detected multiple times at all locations during the $n_{th}$ exposure time $\Delta t$. The SPTR of $x_{i,j,n}$ is $\Delta t$. In order to improve SPTR, we introduce a spatial-temporal modulation function $\phi _{i,j,n}$ that modulates pixel $(i,j)$ throughout the exposure time:
$$E_{i, j, n}=\int_{(n-1) \Delta t}^{n \Delta t} x_{i, j, n}(t) \cdot \phi_{i,j,n}(t)\, dt.$$
As shown in Fig. 2(a), the modulation function $\phi _{i,j,n}$ is set as binary functions defined on $B$ discrete time slots. Thus, Eq. (2) can be rewritten as:
$$E_{i, j}[n]=\sum_{k=1}^B x_{i, j, n}[k] \cdot \phi_{i, j, n}[k],$$
where the size of the exposure $E_{i, j}[n]$ is $M \times M$. We further assume that the modulation functions $\phi _{i,j,n}$ are constant within each time slot. The most general class of such modulation functions defines each time slot as either "on" or "off" and thus consists of $B$ free parameters per pixel, $\phi _{i,j,n}[k] \in \{0,1\}, \forall k \in \{1, \ldots, B\}$, during the $n$th exposure time $\Delta t$. The value of $E_{i, j}[n]$ is the superposition of the randomly encoded $B$ frames of the transient image $x_{i, j, n}[k]$ during the $n$th exposure time $\Delta t$. Our goal is to reconstruct a high-SPTR space-time volume $x_{i, j, n}[k]$ from the captured exposure $E_{i, j}[n]$. Equation (3) can be written in matrix form as $\boldsymbol {E}[n]=\boldsymbol {\phi }[n] \boldsymbol {x}[n]$, where $\boldsymbol {E}[n]$ (observation) and $\boldsymbol {x}[n]$ (unknowns) are vectors with $M \times M$ and $M \times M \times B$ elements, respectively. According to the principle of video compressed sensing [46], the sensing matrix $\boldsymbol {\phi }[n]$ here is not a dense matrix. Rather, it is a concatenation of diagonal matrices and can be expressed as:
$$\boldsymbol{\phi}[n]=\left[\boldsymbol{D}_{1}[n], \boldsymbol{D}_{2}[n], \ldots, \boldsymbol{D}_{B}[n]\right].$$
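As a sketch of Eqs. (3)–(4), the following NumPy snippet (with hypothetical sizes $M=4$, $B=4$) builds $\boldsymbol{\phi}[n]$ as a horizontal concatenation of diagonal matrices and forms the compressed measurement $\boldsymbol{E}[n]=\boldsymbol{\phi}[n]\boldsymbol{x}[n]$:

```python
import numpy as np

rng = np.random.default_rng(1)
M, B = 4, 4                        # M x M scan grid, B sub-slots per bin

# One binary on/off mask per sub-slot, flattened over the M*M pixels.
masks = rng.integers(0, 2, size=(B, M * M))

# phi = [D_1, D_2, ..., D_B]: horizontal concatenation of diagonal matrices.
phi = np.hstack([np.diag(masks[k]) for k in range(B)])   # shape (M*M, M*M*B)

# Unknown high-SPTR volume for one bin, vectorized slot by slot.
x = rng.random(M * M * B)
E = phi @ x                        # compressed measurement with M*M elements
print(phi.shape, E.shape)
```

Because each $\boldsymbol{D}_k$ is diagonal, $\boldsymbol{\phi}[n]\boldsymbol{x}[n]$ reduces to a per-pixel masked sum over the $B$ sub-slots, which is what the detector physically records.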


Fig. 2. Schematic diagram of the algorithm of TE-NLOS. (a) The compressive sampling process of TE-NLOS within each $\Delta \mathrm {\textit {t}}$. (b) Compressed measurements of $M \times M$ size collected at each $\Delta \mathrm {\textit {t}}$. (c) The process of reconstructing the complete transient images. (d) The reconstruction result of NLOS objects by LCT.


Considering the detector's response mechanism, a SPAD detects at most one of the returning photons originating from the same pulse. Therefore, the final photon histogram is obtained by repeatedly emitting laser pulses and recording each detected photon in the corresponding time bin. The collected histogram obeys a Poisson distribution:

$$y[n] \sim \operatorname{Poisson}\left(\int_{(n-1) \Delta t}^{n \Delta t}((\boldsymbol{\phi}[n] \boldsymbol{x}[n]) * g)\, \mathrm{d} t+d\right),$$
where "$*$" denotes convolution along the time axis, $g$ is the time-jitter kernel arising from the uncertainty of the system response time, and $d$ corresponds to photon detections due to ambient light and dark counts. For ease of notation, we drop $d$ from Eq. (5). Following Refs. [27,47], the time jitter $g$ follows a Gaussian function whose width is set by the standard deviation $\sigma _t$:
$$g=\exp \left(-\frac{t^2}{2 \sigma_t^2}\right).$$
The size of $\mathrm {y}[n]$ is the same as the size of $\boldsymbol {E}[n]$, which is a vector with $M \times M$ elements. Therefore, as shown in Fig. 2(b), the SPAD collects a total of $N$ groups of modulated transient images with a size of $M \times M$.
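A minimal simulation of the measurement model in Eqs. (5)–(6) might look as follows; the signal shape, jitter width, and background level are all hypothetical stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 256                          # time samples of the noiseless coded signal
signal = np.zeros(n)
signal[100:110] = 50.0           # hypothetical coded return, i.e. phi[n] x[n]

# Gaussian jitter kernel g with standard deviation sigma_t (in samples), Eq. (6).
sigma_t = 3.0
t = np.arange(-15, 16)
g = np.exp(-t**2 / (2 * sigma_t**2))
g /= g.sum()                     # normalize so jitter redistributes counts, not scales them

d = 0.1                          # mean background / dark-count level per sample
rate = np.convolve(signal, g, mode="same") + d
y = rng.poisson(rate)            # Poisson photon-count measurement, Eq. (5)
print(y.shape)
```

The jitter blurs the coded signal along the time axis before the Poisson draw, which is why a narrow jitter kernel matters for the reconstruction quality discussed later.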

The Poisson noise caused by the SPAD in photon-starved scenarios can be suppressed by increasing the number of pulses [48]. In the proposed method, due to the design of temporal encoding, each position $(i,j)$ inherently needs to be illuminated by many laser pulses. After multiple illuminations and samplings, the effect of Poisson noise decreases and can be ignored. The compressive sensing problem can then be transformed into a total variation (TV) minimization problem:

$$\min _{\boldsymbol{x}}\|\operatorname{TV}(\boldsymbol{x})\| \text{, subject to } \boldsymbol{\phi}[n] \boldsymbol{x}[n] * g=y \text{, }$$
where $\operatorname{TV}(\boldsymbol{x})$ represents the total variation norm. The GAP-TV algorithm [46] has been widely used to solve such problems. Through a series of alternating projections, the strong constraint $y = \boldsymbol {\phi }[n] \boldsymbol {x}[n] * g$ makes the results more accurate.
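A toy version of the GAP-TV iteration for a single time bin can be sketched as below. This is not the authors' implementation: the TV step here is a crude subgradient descent along the slot axis, and the Euclidean projection exploits the fact that $\boldsymbol{\phi}\boldsymbol{\phi}^{\mathsf T}$ is diagonal for the mask structure of Eq. (4):

```python
import numpy as np

def gap_tv(y, masks, n_iter=30, tv_weight=0.05, tv_iter=5):
    """Sketch of GAP-TV for one time bin: y has P pixels, masks is (B, P) binary."""
    B, P = masks.shape
    R = np.maximum(masks.sum(axis=0), 1)   # diagonal of Phi Phi^T (per-pixel 'on' count)
    x = masks * (y / R)                    # initial backprojection
    for _ in range(n_iter):
        # Crude TV denoising along the slot (time) axis by subgradient descent.
        for _ in range(tv_iter):
            dx = np.diff(x, axis=0)
            grad = np.zeros_like(x)
            grad[1:] += np.sign(dx)
            grad[:-1] -= np.sign(dx)
            x = x - tv_weight * grad
        # Euclidean projection onto the constraint set {x : Phi x = y}.
        resid = y - (masks * x).sum(axis=0)
        x = x + masks * (resid / R)
    return x

rng = np.random.default_rng(3)
B, P = 8, 64
masks = rng.integers(0, 2, (B, P))
x_true = rng.random((B, P))
y = (masks * x_true).sum(axis=0)           # compressed measurement
x_hat = gap_tv(y, masks)
```

Ending each iteration with the projection keeps the iterate consistent with the measurement while the TV step enforces temporal smoothness; production GAP-TV implementations use a proper TV denoiser instead of this subgradient step.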

As shown in Fig. 2(c), the high-SPTR transient images reconstructed by our method can be fed into any available imaging model to reconstruct NLOS objects. In our work, the light cone transform (LCT) algorithm is used to reconstruct objects from the reconstructed transient images, and the result is shown in Fig. 2(d). SSIM is a measure of the similarity between two images. Given two images $x$ and $y$, their structural similarity is computed as follows:

$$\operatorname{SSIM}(x, y)=\frac{\left(2 \mu_x \mu_y+c_1\right)\left(2 \sigma_{x y}+c_2\right)}{\left(\mu_x^2+\mu_y^2+c_1\right)\left(\sigma_x^2+\sigma_y^2+c_2\right)},$$
where $\mu _x$ and $\mu _y$ are the mean values of $x$ and $y$, $\sigma _x^2$ and $\sigma _y^2$ are their variances, and $\sigma _{x y}$ is the covariance of $x$ and $y$. The constants $c_{1}$ and $c_{2}$ maintain numerical stability: $c_1=(k_1 L)^2$ and $c_2=(k_2 L)^2$, with $k_1=0.01$, $k_2=0.03$, and $L=2^{8}-1$.
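Eq. (8), evaluated globally over the whole image rather than in sliding windows, can be coded directly:

```python
import numpy as np

def ssim_global(x, y, k1=0.01, k2=0.03, L=2**8 - 1):
    """SSIM of Eq. (8), computed with one global window over the whole image."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

An image compared with itself gives SSIM = 1; practical SSIM implementations usually average this quantity over local sliding windows rather than one global window.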

3. Experiments and analysis

3.1 Improvement of resolution by TE-NLOS

We use simulated data to verify the reconstruction performance of our TE-NLOS. The motorcycle scene is generated from [22], and the LCT model [27] is used to generate the transient images. In the LCT model, the positions illuminated by the laser and detected by the SPAD are identical. When a pulsed laser illuminates a point on the intermediary wall, the light scatters into the hidden scene and returns to the same position on the wall. The wavelength of the laser is 1550 nm. A SPAD records the arrival times of the returned photons in the form of a photon histogram. This procedure is repeated over a uniform planar 2D grid of points across the surface, yielding high-speed transient images. After resampling along the time dimension $\left (\tilde {\tau }=R_t\{\tau \}\right )$ and depth dimension $\left (\tilde {\rho }=R_z\{\rho \}\right )$, the forward model can be converted into a three-dimensional convolution:

$$\tilde{\tau}=h * \tilde{\rho},$$
where $\tilde {\rho }$ is the hidden volume and $h$ is a known point spread function (PSF). This convolution model can also be written as:
$$\tau=R_t^{{-}1} F^{{-}1} \hat{H} F R_z \rho,$$
where the matrix $F$ represents a 3D discrete Fourier transform and $\hat {H}$ is a diagonal matrix representing the Fourier transform of the PSF $h$. Given a depth map $d(x,y)$, we compute a corresponding synthesized reflectance volume for our scene: $\rho (x,y,z) = a$ when $z = d(x,y)$, where the scalar $a$ is a constant representing the amount of light reflected by voxels in the volume. We then convert the synthesized volume $\rho (x,y,z)$ into transient images $\tau$ using the image formation model described in Eq. (10). The time jitter of the system is 20 ps.
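The synthesis step above, placing a constant albedo $a$ on the surface $z = d(x,y)$ and convolving with the PSF in the Fourier domain as in Eq. (10), can be sketched as follows; the depth map and PSF here are arbitrary stand-ins, not the actual light-cone kernel:

```python
import numpy as np

rng = np.random.default_rng(4)
Nx = Ny = 16
Nz = 32

# Synthesized reflectance volume: rho(x, y, z) = a on the surface z = d(x, y).
a = 1.0
depth = rng.integers(8, 24, size=(Nx, Ny))           # hypothetical depth map d(x, y)
rho = np.zeros((Nx, Ny, Nz))
rho[np.arange(Nx)[:, None], np.arange(Ny)[None, :], depth] = a

# Hypothetical PSF h (a stand-in for the LCT light-cone kernel).
h = np.zeros((Nx, Ny, Nz))
h[0, 0, :4] = [0.5, 0.3, 0.15, 0.05]

# tau = F^{-1} H F rho: 3D convolution implemented in the Fourier domain, Eq. (10).
tau = np.real(np.fft.ifftn(np.fft.fftn(h) * np.fft.fftn(rho)))
print(tau.shape)
```

The diagonal structure of $\hat{H}$ is what makes this a cheap elementwise product in the frequency domain instead of a dense 3D convolution.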

In the experiment, transient images with 40 ps SPTR are simulated as GT. A modulator with a bandwidth of 40 ps is used to randomly encode the intermediary wall point by point. The returned photons are collected by a detector with a time resolution of 160 ps. The encoded high-speed transient images are then reconstructed from the compressed measurements by the GAP-TV algorithm. The comparison between the transients reconstructed by our method and GT is shown in Fig. 3(a). Each frame of the transient images reconstructed by this method is similar in outline to GT, and the overall diffusion trend of the transient images is also the same. The SSIM between each reconstructed frame and GT is above 0.75.


Fig. 3. Reconstruction results of TE-NLOS. (a) Reconstructed transient images and corresponding GT. (b) Comparison of photon histograms between the conventional method and the proposed method. (c) The reconstruction results by LCT of the conventional method and the proposed method.


The detector with the same SPTR is also used to directly acquire transient images. The photon histograms collected by the conventional method and TE-NLOS at a certain scanning point are compared. As shown in Fig. 3(b), the green line represents the photon histogram reconstructed by TE-NLOS, the red line represents the photon histogram obtained by the conventional method, and GT is represented by the orange line. It can be seen from Fig. 3(b) that the photon histogram reconstructed by TE-NLOS retains more object information and is very consistent with GT. However, limited by the SPTR of the detector, conventional methods can only roughly record the distribution trend of photons and cannot obtain accurate photon distribution.

To further explore the influence of this method on NLOS reconstruction, the LCT algorithm is used to reconstruct objects from transient images. The reconstruction result is shown in Fig. 3(c). Limited by the SPTR of the detector, the conventional method can only collect transient images with 160 ps SPTR. The object details reconstructed from these low-SPTR transient images are blurred; in particular, the shapes of the wheels and seat are difficult to distinguish. The SSIM between the reconstructed object and GT is only 0.6838. In contrast, the object structure reconstructed by TE-NLOS is complete, and the local outline is also very clear; the SSIM between the reconstructed object and GT reaches 0.9220. These experiments demonstrate the effect of TE-NLOS: with a detector of the same SPTR, encoded acquisition reconstructs transient images with more information than direct acquisition.

3.2 Limit of super-resolution TE-NLOS

Photon TOF information collected by a detector with coarse SPTR will be mixed, which makes reconstructing NLOS objects difficult: the SPTR of the detector determines the resolution of the reconstructed object. In the proposed method, however, the introduction of TE-NLOS makes it possible for detectors with coarse SPTR to reconstruct accurate objects.

To explore the super-resolution capability of the proposed method, we compare the conventional method and our method on detectors with different SPTRs. This set of experiments requires a modulator with an electro-optic bandwidth of 50 GHz. When the SPTR of a detector is 80 ps, the conventional collection method can only collect a photon histogram with 80 ps SPTR at each point. In TE-NLOS, a modulator with a bandwidth of 20 ps is used for time encoding: the photon histograms are encoded 4 times within an 80 ps bin, and a photon histogram with 20 ps SPTR can then be reconstructed through the GAP-TV algorithm. In this case, the SPTR of the reconstructed transient images is four times finer than that of the detector. We further compare the reconstruction results of the conventional method and TE-NLOS for detector SPTRs of 160 ps, 320 ps, 640 ps, 1.28 ns, 2.56 ns, and 5.12 ns. As Fig. 4 shows, for both the conventional method and TE-NLOS, the larger the SPTR of the detector, the worse the quality of the reconstructed objects. When the SPTR of the detector is 1.28 ns, the object reconstructed by the conventional method is completely blurred. The front-view error between the reconstructed object and GT further proves that it is difficult for conventional methods to reconstruct object details from a detector with low time resolution. In the TE-NLOS method, transient images can be encoded 64 times by a modulator whose bandwidth is 20 ps, so the SPTR of the reconstructed transient images is 64 times finer than that of the detector. The error maps also show that the reconstruction error of TE-NLOS is much lower than that of the conventional method. When the SPTR of the detector is 2.56 ns, the structure of hidden objects is still broadly reconstructed by TE-NLOS, albeit with some noise, while the conventional method cannot reconstruct the shape of objects at all. These experiments prove that temporal super-resolution can be achieved by TE-NLOS and that hidden objects can be reconstructed even with ns-level detectors.


Fig. 4. Comparison of object’s front view and error map reconstructed by conventional method and TE-NLOS at different super-resolution multiples.


3.3 Impact of code type on TE-NLOS

The proposed method exhibits good robustness to various types of modulation codes. By default, this method employs a random matrix as the modulation code. Here, we compare TE-NLOS reconstructions under other types of codes. Figure 5 shows the reconstructions from simulated data and authentic data for different types of codes.


Fig. 5. The reconstructed transient images and LCT reconstructions of TE-NLOS under different types of code modulation. (a) The impact of different types of code modulation on simulated data. (b) The impact of different types of code modulation on authentic data.


In the first set of experiments, transient images with a time resolution of 20 ps, corresponding to two different types of targets, are simulated and used as GT. Assuming that the SPTR of the detector is 160 ps, our method can reconstruct high-resolution transient images through a modulator whose bandwidth is 20 ps. The impact of code type on the proposed method is shown in Fig. 5(a). Judging from the temporal evolution of the reconstructed transient images, all three modulation patterns (Gaussian matrix, Hadamard matrix, and random matrix) accurately reconstruct the changing trend of the transient images. Among the three encoding types, the transient images reconstructed using Hadamard-matrix modulation are smoother, and the reconstructed targets show slight noise. Overall, however, applying any of the three types of modulation codes before the SPAD with 160 ps time resolution captures more details of the targets than directly acquiring them with the same SPAD.

In addition, we also apply our method to another two scenes in the Stanford dataset. Figure 5(b) shows the impact of different modulation codes on authentic data. Transient images with 32 ps time resolution are used as GT. Since the collection time of the public data is fixed and relatively short, there is some shot noise in the GT transient images; nevertheless, their temporal evolution can still be discerned. Using a SPAD with 256 ps time resolution, we demonstrate recovery of transients with 32 ps time resolution, an 8$\times$ improvement in time resolution. The reconstruction results by LCT show that the proposed method has good robustness to modulation code types.

3.4 TE-NLOS imaging with different types of interference

3.4.1 Effect of different extinction ratios of modulators on TE-NLOS

In TE-NLOS, photons passing through the modulator are captured by a detector. In the ideal extinction-ratio state, the modulator performs only "0-1" modulation. However, given the limitations of modulator hardware, it is difficult to achieve the highest extinction ratio within a very small bandwidth. In this case, the decoding sequence is no longer consistent with the encoding sequence, and the erroneous encoding pattern makes solving for the transient images more difficult. To demonstrate the robustness of the proposed method, the effect of different degrees of "miscoding" on NLOS object reconstruction is discussed.

Here, the experiments are validated on publicly available Stanford data [29], assuming that the SPTR of the SPAD is 256 ps. A photon histogram with 32 ps SPTR can be reconstructed on this 256 ps SPAD through a modulator whose bandwidth is 32 ps. When the encoding and decoding sequences are kept consistent, the ideal reconstruction results are shown in the second column of Fig. 6. To simulate the incomplete-switching problem faced by modulators in practice, two miscoding situations that may occur during modulation are considered. The first contains only "0-0.5-1" third-order miscoding: time modulation is performed using an error code with a "half" state during detection. Since it is impossible to predict at which time and at which point the modulator will err, the "half" states in the encoding matrix are placed randomly. The sequence that the GAP-TV algorithm uses to decode the transient images is still the miscoding-free sequence. To examine the interference resistance of the proposed method, the degree of miscoding is quantitatively controlled: with the detector and modulator unchanged, reconstructions are compared for encoding sequences containing 10%, 20%, 30%, 40%, 50%, 60%, 70%, and 80% "half" states. The reconstruction results are shown in Fig. 6. As the miscoding ratio increases, the reconstruction quality decreases, but the proposed method remains robust to error states. Even when the error rate reaches 40%, the transient images reconstructed by this method can still recover the complete shape of the object, and the SSIM between the reconstructed object and GT stays above 0.5.
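The third-order miscoding masks used in this experiment can be generated along the following lines; the function name and exact sampling scheme are our own illustration, not the paper's code:

```python
import numpy as np

def miscoded_mask(shape, half_rate, rng):
    """Binary 0/1 mask in which a fraction `half_rate` of entries is corrupted to 0.5."""
    mask = rng.integers(0, 2, size=shape).astype(float)
    err = rng.random(shape) < half_rate      # random, unpredictable error positions
    mask[err] = 0.5
    return mask

rng = np.random.default_rng(5)
m = miscoded_mask((8, 64), 0.4, rng)         # 40% "half" miscoding, as in the experiment
print(np.mean(m == 0.5))                     # fraction of 'half' states
```

The decoder would still use the uncorrupted 0/1 mask, so the mismatch between the true and assumed sensing matrices is exactly the interference being tested.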


Fig. 6. Reconstruction results in varying "Half" miscoding ratio.


In practice, the modulator does not have only the three states "0-0.5-1". The modulator needs a finite response time to perform extinction, which leads to more complex patterns across timing modulations, such as 0.1, 0.2, 0.3, and so on. Therefore, to further verify the robustness of the proposed method, the impact of more complex miscoding on the same data is also analyzed experimentally. Likewise, the degree of miscoding is quantitatively controlled. The difference from the above experiment is that the miscoding states are not limited to 0.5 but also include 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, and 0.8. The impact of complex miscoding at different probabilities on the proposed method is shown in Fig. 7. The experimental results show that the interference from complex miscoding is larger than that from third-order miscoding. When the miscoding probability is 80%, the SSIM of the object reconstructed under complex miscoding is lower than that under third-order miscoding; in particular, for the dragon object it is 0.1 lower. The method is still very resistant to complex miscoding: when the miscoding probability is 40%, the reconstructed target remains clear. These two sets of experiments verify that the proposed method resists the interference caused by imperfect modulator hardware and can adapt well to practical scenarios.

3.4.2 Effect of time jitters on TE-NLOS

Time jitter refers to the uncertainty between the actual generation time and the ideal response time of a signal, which is determined by system hardware. In the NLOS system, the time jitter is jointly determined by all components of a system. In this case, the time jitter of the system $g_{\mathrm {sys}}$ is defined as:

$$g_{\mathrm{sys}} \approx \sqrt{g_{\mathrm{det}}^2+g_{\mathrm{laser}}^2+g_{\mathrm{TDC}}^2+\sum g_{\mathrm{others}}^2},$$
where $g_{\mathrm {det}}$ is the jitter of the detector's photoelectric conversion time, $g_{\mathrm {laser}}$ is the jitter (including the pulse width) of the laser, $g_{\mathrm {TDC}}$ is the jitter of the TDC time measurement, and $g_{\mathrm {others}}$ is the jitter contributed by other electronic components.
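
As a quick numerical illustration of the root-sum-square combination above (the component values below are illustrative, not measurements from this work):

```python
import math

def system_jitter(g_det, g_laser, g_tdc, g_others=()):
    """Root-sum-square combination of independent jitter sources,
    all expressed in the same unit (e.g., picoseconds)."""
    return math.sqrt(g_det**2 + g_laser**2 + g_tdc**2
                     + sum(g**2 for g in g_others))

# e.g., a 50 ps detector, 30 ps laser pulse, 10 ps TDC:
g_sys = system_jitter(50, 30, 10)
```

Because the sources add in quadrature, the largest component dominates $g_{\mathrm{sys}}$, which is why reducing detector SPTR alone does not help once another source becomes the bottleneck.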

TE-NLOS offers a greater advantage in systems with low time jitter. Here, different time jitters are simulated on a detector with a fixed SPTR of 320 ps. Figure 8 shows the reconstruction results of TE-NLOS and conventional NLOS imaging under different time jitters. For the conventional method, reducing the time jitter does not improve the reconstruction: as shown in Fig. 8, even when the time jitter drops to 0 ps, the transient images obtained by the conventional method remain unclear, and the resolution of the LCT reconstruction is so poor that the shape of the wheel cannot be distinguished. In contrast, the proposed TE-NLOS reconstructs transients with an effective SPTR of 20 ps on the 320 ps-level detector, and it yields clearer results as the system jitter decreases. When the time jitter is 20 ps, the reconstructed transient images are clear and close to the ground truth (GT), and the structure of the recovered object is consistent with GT. TE-NLOS thus allows a low-jitter system to play to its full advantage in reconstructing high-resolution objects and provides a new route to improving reconstruction quality.
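
In simulation, system jitter of this kind is commonly modeled by blurring the ideal transient histogram with the Gaussian kernel $g$ from the forward model. A minimal sketch follows; the 20 ps bin width and the `apply_jitter` helper are assumptions for illustration, not the authors' code:

```python
import numpy as np

def apply_jitter(transient, sigma_bins):
    """Blur a transient histogram with a normalized Gaussian jitter
    kernel g(t) = exp(-t^2 / (2 sigma^2)), truncated at +/- 4 sigma."""
    if sigma_bins == 0:
        return transient.copy()
    t = np.arange(-4 * sigma_bins, 4 * sigma_bins + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma_bins**2))
    g /= g.sum()
    return np.convolve(transient, g, mode="same")

pulse = np.zeros(256)
pulse[100] = 1.0                                    # ideal 20 ps-bin transient
blurred_320ps = apply_jitter(pulse, sigma_bins=16)  # 320 ps jitter / 20 ps bins
```

The blur is applied after encoding in the forward model, which is why low jitter lets TE-NLOS recover fine temporal detail while high jitter washes it out regardless of SPTR.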

Fig. 7. Reconstruction results in varying complex miscoding ratios.

Fig. 8. The reconstructed transient images and objects for TE-NLOS and conventional NLOS imaging with different time jitters.

4. Discussion

Different from traditional NLOS imaging techniques, the proposed method achieves a 64$\times$ improvement in time resolution by means of a high-bandwidth modulator. Here we provide a proof of principle of our scheme; other types of ultrafast signals or transient processes could also be studied with it. Regarding the applicable conditions of TE-NLOS, we make two key points.

(i) The time resolution of TE-NLOS is determined by the modulation bandwidth, and a high-bandwidth modulator is easier to obtain than a high-SPTR detector. If the bandwidth of the modulator is limited, other modulation schemes, such as the sub-bandwidth delay method [41], can be used to improve resolution. Combining different modulation methods or dimensions with TE-NLOS can yield even higher time resolution.

(ii) Although this method can improve the effective SPTR of a detector, its applicability is also constrained by other factors, such as the time jitter of the system. The problem our method solves is the case in which the SPTR is the main bottleneck limiting imaging accuracy. In situations where time jitter is the primary cause of reduced time resolution, TE-NLOS is unlikely to be effective (e.g., with a PicoHarp at 4 ps or a HydraHarp at 1 ps [27]).

5. Conclusion

In conclusion, an encoded temporal super-resolution method for low-SPTR detectors is proposed, achieved by combining optical design, detector modeling, and the GAP-TV reconstruction algorithm. By exploiting the spatial-temporal correlation of transient images, high-resolution transient images can be reconstructed with a low-SPTR detector. Compared with the conventional method without encoding, the NLOS objects reconstructed by our method are clearer, with an SSIM 0.23 higher. In addition, our method achieves at least a 64$\times$ improvement in time resolution: experimental results demonstrate that, using a modulator with a bandwidth of 20 ps, a detector with a 1.28 ns SPTR can successfully reconstruct 20 ps transient images. Meanwhile, the proposed method is insensitive to the code types and miscoding of modulators; even with "half" miscoding or more complex miscoding, it still reconstructs transient images and hidden objects well. TE-NLOS uses the spatial-temporal correlation of transient images to overcome the hardware limitations of detectors, which provides a new solution for high-resolution NLOS imaging. We anticipate that future work will extend the analysis in this paper to other fields, even beyond optics and physics.
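
The coded coarse-bin measurement that underlies the reported 64$\times$ gain can be sketched as follows: each 1.28 ns coarse bin integrates $B=64$ fine 20 ps bins weighted by a per-bin binary code. The random transient `x` and code `phi` below are placeholders for illustration; a GAP-TV solver would then recover `x` from the measurements `E`:

```python
import numpy as np

rng = np.random.default_rng(1)
B = 64   # fine (20 ps) bins per coarse (1.28 ns) bin
N = 32   # number of coarse bins

# Placeholder fine-resolution transient x and random binary codes phi.
x = rng.random((N, B))
phi = rng.integers(0, 2, size=(N, B)).astype(float)

# Coded coarse measurement: each coarse bin records the sum of the
# modulated fine-bin signal, E[n] = sum_k x[n, k] * phi[n, k].
E = (x * phi).sum(axis=1)
```

Each coarse bin thus carries a different linear combination of its 64 fine bins, and the spatial-temporal correlation of transient images makes the underdetermined recovery of `x` tractable.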

Funding

National Natural Science Foundation of China (61971227, 62031018, 62101255); Jiangsu Provincial Key Research and Development Program (BE2022391); China Postdoctoral Science Foundation (2021M701721, 2022M721620, 2023T160319).

Acknowledgment

We thank Yi Wei, Lingfeng Liu, and Chenyin Zhou for technical support and experimental discussion.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. T. Maeda, G. Satat, T. Swedish, et al., “Recent advances in imaging around corners,” arXiv, arXiv:1910.05613 (2019).

2. D. Faccio, A. Velten, and G. Wetzstein, “Non-line-of-sight imaging,” Nat. Rev. Phys. 2(6), 318–327 (2020).

3. O. Katz, P. Heidmann, M. Fink, et al., “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nat. Photonics 8(10), 784–790 (2014).

4. S. Zhu, E. Guo, J. Gu, et al., “Imaging through unknown scattering media based on physics-informed learning,” Photonics Res. 9(5), B210–B219 (2021).

5. R. French, S. Gigan, and O. L. Muskens, “Snapshot fiber spectral imaging using speckle correlations and compressive sensing,” Opt. Express 26(24), 32302–32316 (2018).

6. A. Porat, E. R. Andresen, H. Rigneault, et al., “Widefield lensless imaging through a fiber bundle via speckle correlations,” Opt. Express 24(15), 16835–16855 (2016).

7. Y. Shi, E. Guo, M. Sun, et al., “Non-invasive imaging through scattering medium and around corners beyond 3d memory effect,” Opt. Lett. 47(17), 4363–4366 (2022).

8. S. Rotter and S. Gigan, “Light fields in complex media: Mesoscopic scattering meets wave control,” Rev. Mod. Phys. 89(1), 015005 (2017).

9. Y. Shi, E. Guo, L. Bai, et al., “Prior-free imaging unknown target through unknown scattering medium,” Opt. Express 30(10), 17635–17651 (2022).

10. K. L. Bouman, V. Ye, A. B. Yedidia, et al., “Turning corners into cameras: Principles and methods,” in Proceedings of the IEEE International Conference on Computer Vision, (2017), pp. 2270–2278.

11. C. Saunders, J. Murray-Bruce, and V. K. Goyal, “Computational periscopy with an ordinary digital camera,” Nature 565(7740), 472–475 (2019).

12. M. Baradad, V. Ye, A. B. Yedidia, et al., “Inferring light fields from shadows,” in Proceedings of the IEEE conference on computer vision and pattern recognition, (2018), pp. 6267–6275.

13. S. Xin, S. Nousias, K. N. Kutulakos, et al., “A theory of Fermat paths for non-line-of-sight shape reconstruction,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, (2019), pp. 6800–6809.

14. I. Dokmanić, R. Parhizkar, A. Walther, et al., “Acoustic echoes reveal room shape,” Proc. Natl. Acad. Sci. 110(30), 12186–12191 (2013).

15. D. B. Lindell, G. Wetzstein, and V. Koltun, “Acoustic non-line-of-sight imaging,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2019), pp. 6780–6789.

16. J. He, S. Wu, R. Wei, et al., “Non-line-of-sight imaging and tracking of moving objects based on deep learning,” Opt. Express 30(10), 16758–16772 (2022).

17. Y. Shi, E. Guo, J. Miao, et al., “Steady state non-line of sight imaging via unsupervised network,” in Ninth Symposium on Novel Photoelectronic Detection Technology and Applications, vol. 12617 (SPIE, 2023), pp. 559–563.

18. B. M. Smith, M. O’Toole, and M. Gupta, “Tracking multiple objects outside the line of sight using speckle imaging,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2018), pp. 6258–6266.

19. S. Popoff, G. Lerosey, M. Fink, et al., “Image transmission through an opaque material,” Nat. Commun. 1(1), 81 (2010).

20. A. Velten, T. Willwacher, O. Gupta, et al., “Recovering three-dimensional shape around a corner using ultrafast time-of-flight imaging,” Nat. Commun. 3(1), 745 (2012).

21. F. Heide, L. Xiao, W. Heidrich, et al., “Diffuse mirrors: 3d reconstruction from diffuse indirect illumination using inexpensive time-of-flight sensors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2014), pp. 3222–3229.

22. W. Chen, F. Wei, K. N. Kutulakos, et al., “Learned feature embeddings for non-line-of-sight imaging and recognition,” ACM Trans. Graph. 39(6), 1–18 (2020).

23. J. Grau Chopite, M. B. Hullin, M. Wand, et al., “Deep non-line-of-sight reconstruction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2020), pp. 960–969.

24. O. Gupta, T. Willwacher, A. Velten, et al., “Reconstruction of hidden 3d shapes using diffuse reflections,” Opt. Express 20(17), 19096–19108 (2012).

25. M. Laurenzis and A. Velten, “Feature selection and back-projection algorithms for non-line-of-sight laser–gated viewing,” J. Electron. Imaging 23(6), 063003 (2014).

26. M. Laurenzis and A. Velten, “Investigation of frame-to-frame back projection and feature selection algorithms for non-line-of-sight laser gated viewing,” in Electro-Optical Remote Sensing, Photonic Technologies, and Applications VIII; and Military Applications in Hyperspectral Imaging and High Spatial Resolution Sensing II, vol. 9250 (SPIE, 2014), pp. 113–120.

27. M. O’Toole, D. B. Lindell, and G. Wetzstein, “Confocal non-line-of-sight imaging based on the light-cone transform,” Nature 555(7696), 338–341 (2018).

28. S. I. Young, D. B. Lindell, B. Girod, et al., “Non-line-of-sight surface reconstruction using the directional light-cone transform,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, (2020), pp. 1407–1416.

29. D. B. Lindell, G. Wetzstein, and M. O’Toole, “Wave-based non-line-of-sight imaging using fast fk migration,” ACM Trans. Graph. 38(4), 1–13 (2019).

30. X. Liu, I. Guillén, M. La Manna, et al., “Non-line-of-sight imaging using phasor-field virtual wave optics,” Nature 572(7771), 620–623 (2019).

31. M. Mounaix, D. Andreoli, H. Defienne, et al., “Spatiotemporal coherent control of light through a multiple scattering medium with the multispectral transmission matrix,” Phys. Rev. Lett. 116(25), 253901 (2016).

32. Y. Li, J. Peng, J. Ye, et al., “Nlost: Non-line-of-sight imaging with transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2023), pp. 13313–13322.

33. G. Musarra, A. Lyons, E. Conca, et al., “Non-line-of-sight three-dimensional imaging with a single-pixel camera,” Phys. Rev. Appl. 12(1), 011002 (2019).

34. Q. Sun, X. Dun, Y. Peng, et al., “Depth and transient imaging with compressive spad array cameras,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2018).

35. Q. Sun, J. Zhang, X. Dun, et al., “End-to-end learned, optically coded super-resolution spad camera,” ACM Trans. Graph. 39(6), 1–12 (2020).

36. X. Liu, J. Wang, Z. Li, et al., “Non-line-of-sight reconstruction with signal–object collaborative regularization,” Light: Sci. Appl. 10(1), 198 (2021).

37. X. Liu, J. Wang, L. Xiao, et al., “Non-line-of-sight imaging with arbitrary illumination and detection pattern,” Nat. Commun. 14(1), 3230 (2023).

38. J. Wang, X. Liu, L. Xiao, et al., “Non-line-of-sight imaging with signal superresolution network,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2023), pp. 17420–17429.

39. B. Wang, M.-Y. Zheng, J.-J. Han, et al., “Non-line-of-sight imaging with picosecond temporal resolution,” Phys. Rev. Lett. 127(5), 053602 (2021).

40. B. Li, J. Bartos, Y. Xie, et al., “Time-magnified photon counting with 550-fs resolution,” Optica 8(8), 1109–1112 (2021).

41. A. Raghuram, A. Pediredla, S. G. Narasimhan, et al., “Storm: Super-resolving transients by oversampled measurements,” in 2019 IEEE International Conference on Computational Photography (ICCP), (IEEE, 2019), pp. 1–11.

42. W. Yang, C. Zhang, W. Jiang, et al., “None-line-of-sight imaging enhanced with spatial multiplexing,” Opt. Express 30(4), 5855–5867 (2022).

43. C. Callenberg, Z. Shi, F. Heide, et al., “Low-cost spad sensing for non-line-of-sight tracking, material classification and depth imaging,” ACM Trans. Graph. 40(4), 1–12 (2021).

44. Y. Hitomi, J. Gu, M. Gupta, et al., “Video from a single coded exposure photograph using a learned over-complete dictionary,” in 2011 International Conference on Computer Vision, (IEEE, 2011), pp. 287–294.

45. J. N. Martel, L. K. Mueller, S. J. Carey, et al., “Neural sensors: Learning pixel exposures for hdr imaging and video compressive sensing with programmable sensors,” IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1642–1653 (2020).

46. X. Yuan, “Generalized alternating projection based total variation minimization for compressive sensing,” in 2016 IEEE International Conference on Image Processing (ICIP), (IEEE, 2016), pp. 2539–2543.

47. C. Wu, J. Liu, X. Huang, et al., “Non–line-of-sight imaging over 1.43 km,” Proc. Natl. Acad. Sci. 118(10), e2024468118 (2021).

48. J.-T. Ye, X. Huang, Z.-P. Li, et al., “Compressed sensing for active non-line-of-sight imaging,” Opt. Express 29(2), 1749–1763 (2021).




Figures (8)

Fig. 1. TE-NLOS imaging system. (a) Non-line-of-sight imaging system based on temporal encoding. (b) Detector composition. (c) Photon histograms collected by SPAD in the conventional method and the proposed method.

Fig. 2. Schematic diagram of the algorithm of TE-NLOS. (a) The compressive sampling process of TE-NLOS within each $\Delta t$. (b) Compressed measurements of $M \times M$ size collected at each $\Delta t$. (c) The process of reconstructing the complete transient images. (d) The reconstruction result of NLOS objects by LCT.

Fig. 3. Reconstruction results of TE-NLOS. (a) Reconstructed transient images and corresponding GT. (b) Comparison of photon histograms between the conventional method and the proposed method. (c) The reconstruction results by LCT of the conventional method and the proposed method.

Fig. 4. Comparison of object’s front view and error map reconstructed by the conventional method and TE-NLOS at different super-resolution multiples.

Fig. 5. The reconstructed transient images and LCT reconstructions of TE-NLOS under different types of code modulation. (a) The impact of different types of code modulation on simulated data. (b) The impact of different types of code modulation on authentic data.

Fig. 6. Reconstruction results in varying "half" miscoding ratios.

Fig. 7. Reconstruction results in varying complex miscoding ratios.

Fig. 8. The reconstructed transient images and objects for TE-NLOS and conventional NLOS imaging with different time jitters.

Equations (11)

$$E_{i,j,n}(t)=\int_{(n-1)\Delta t}^{n\Delta t} x_{i,j,n}(t)\,\mathrm{d}t \quad \left(n\in\{1,2,3,\ldots,N\}\right),$$
$$E_{i,j,n}(t)=\int_{(n-1)\Delta t}^{n\Delta t} x_{i,j,n}(t)\,\phi_{i,j,n}(t)\,\mathrm{d}t.$$
$$E_{i,j}[n]=\sum_{k=1}^{B} x_{i,j,n}[k]\,\phi_{i,j,n}[k],$$
$$\phi[n]=\left[D_1[n],D_2[n],\ldots,D_B[n]\right].$$
$$y[n]\sim\operatorname{Poisson}\!\left(\int_{(n-1)\Delta t}^{n\Delta t}\left(\left(\phi[n]\,x[n]\right)\ast g\right)\mathrm{d}t+d\right),$$
$$g=\exp\!\left(-\frac{t^2}{2\sigma_t^2}\right).$$
$$\min_{x}\ \mathrm{TV}(x),\quad \text{subject to } \phi[n]\,x[n]\ast g=y$$
$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)}{(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)},$$
$$\tilde{\tau}=h\ast\tilde{\rho},$$
$$\tau=R_t^{-1}F^{-1}\hat{H}FR_z\rho,$$
$$g_{\mathrm{sys}} \approx \sqrt{g_{\mathrm{det}}^2+g_{\mathrm{laser}}^2+g_{\mathrm{TDC}}^2+\sum g_{\mathrm{others}}^2}.$$