
Tracking and imaging of moving objects with temporal intensity difference correlation

Open Access

Abstract

At present, a large number of samplings is required to reconstruct an image of an object in ghost imaging. When imaging moving objects, it is hard to perform enough samplings during the interval in which the objects can be taken as immobile, so the reconstructed image deteriorates. In this paper, we propose a temporal intensity difference correlation ghost imaging scheme, in which a high-quality image of the moving objects within a complex scene can be extracted with far fewer samplings. The spatial sparsity of the moving objects is exploited, while only a linear algorithm is required. This method decreases the number of required samplings, thus relaxing the requirements on the refresh frequency of the illumination source and the speed of the detector for obtaining the information of moving objects with ghost imaging.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Ghost imaging (GI) is an active imaging scheme in which the information of an object is obtained from the high-order coherence of a light field [1–5]. In a GI system, the distribution of the illumination light field imprinted on the object is recorded by a reference CCD camera, and the intensity of the light transmitted or reflected by the object is collected by a point-like detector. Neither of the two detectors can retrieve the image of the object independently; the image is reconstructed from the second-order correlation between the light-intensity signals recorded by the two detectors. In particular, when the distribution of the light field recorded by the reference CCD camera can be calculated or obtained in advance, the reference arm can be removed and GI can be performed with the point-like detector alone. This is the basic idea of computational ghost imaging (CGI) [6,7]. CGI is a single-pixel imaging scheme, which allows us to image two- or three-dimensional objects using a point-like detector [8,9]. Besides, the response and readout speed of point-like detectors are usually far higher than those of a common array sensor, so the light intensity can be recorded more quickly, which speeds up the imaging process of GI. Moreover, a point-like detector usually provides higher sensitivity than an array sensor, making it possible to acquire the image of the object at low photon flux [10]. In addition, using a single-pixel detector, GI can image the object at operating wavelengths such as X-ray or terahertz, spectral ranges in which array sensors are usually expensive or hard to obtain [11–14]. Moreover, GI can perform better against disordered media than traditional imaging in some cases [15–18]. Last but not least, the sampling process in GI can be combined with machine-learning data processing to improve the quality of the reconstructed image [19,20].

In GI, the spatial information of the scene to be imaged is acquired from the second-order coherence of the illumination light field [21], which requires averaging over a sufficiently long time. During this process, which in practice is performed by averaging over a large number of samplings, the object is required to be static or approximately immobile; otherwise the quality of the reconstructed image decreases. Moreover, the contrast-to-noise ratio of the reconstructed image is proportional to the square root of the number of samplings and inversely proportional to the size of the scene to be imaged [22,23]. That is, when the field of view of the scene is larger, more samplings are required within the interval during which the scene can be regarded as immobile. So when there are moving objects in the scene, both a high refresh frequency of the illumination and high-speed detection are necessary; indeed, the faster the objects move, the faster the illumination and detection systems must be [24]. For cases in which the system is not as fast as required, it has been proposed to acquire the information of the object during its motion with modified imaging schemes or particular algorithms [25–27], which usually require a high-precision tracking and aiming system or prior information about the moving object. This makes the imaging system complicated or makes it hard to track multiple objects. In addition, to increase the number of samplings per unit time, a multispectral light source can be used to illuminate the object, with the intensity of each spectral channel recorded individually [28]. However, in this method the number of samplings required is not really reduced. Compressive sensing (CS) is a highly effective way to reduce the number of samplings required to obtain information [29], and compressive GI has been presented for tracking a moving object at low light levels with fewer samplings [30]. This benefits from the fact that a moving object is usually spatially sparse against a static background. However, image reconstruction in CS involves solving a convex optimization problem, which usually requires large time consumption and huge memory. If a linear algorithm can be developed for this issue, it will undoubtedly be beneficial for the timeliness of information acquisition about moving objects.

In this paper, we propose temporal intensity difference correlation GI (TDGI) to image moving objects within a complex background. In our scheme, the image of the moving objects is reconstructed with a greatly reduced number of samplings compared with traditional GI (TGI). Besides, the spatial sparsity of the moving objects is exploited while only a linear algorithm is required, so the data processing of our scheme is much faster than that of compressive GI and the time needed to reconstruct the image of the moving objects is greatly reduced. In the experiments, tracking and imaging of two moving objects with varying speed and direction are achieved. Since relative movement between multiple moving objects can be handled, our method also works if the shape of an object changes. This makes the scheme promising for tracking and imaging moving objects.

2. Theoretical analysis

In a ghost imaging system with thermal light, the distributions of the light field on the reference CCD and of the light field imprinted on the object are

$$\vec E_s(\vec r_s,\;t)=\int \vec E_0 (\vec r'_s,\;t_0)\vec h_s(\vec r_s,\vec r'_s;t,\;t_0)d \vec r'_s, \{s=o,\;r\}$$
in which the subscripts $r$ and $o$ refer to the reference arm and the object arm, respectively. $\vec E_0 (\vec r'_s,\;t_0)$ is the distribution of the light field of the thermal source at time $t_0$. $\vec h_s(\vec r_s,\vec r'_s;t,\;t_0)$ is the impulse response function between the source and the reference CCD (or object),
$$\vec h_s(\vec r_s,\vec r'_s;t,\;t_0)=\frac{e^{jkz_s}}{j\lambda z_s} \exp\{\frac{j\vec k}{2z_s}(\vec r_s-\vec r'_s)^2\}, \{s=o,\;r\}$$
where $z_s$ is the distance between the source and the reference CCD (or object), $\lambda$ is the wavelength of the source, $\vec k$ is the wave vector, $j$ is the imaginary unit, and $t$ is the time at which the light field arrives at the CCD (or object). This analysis assumes monochromatic illumination, so the temporal-frequency factor $e^{-j \omega t}$ is omitted. The coherence between the light field on the reference CCD and that on the object is
$$\langle \vec E_r(\vec r_r,\;t)\vec E^*_o(\vec r_o,\;t)\rangle=\int\!\!\int \langle \vec E_0 (\vec r'_r,\;t_0)\vec E^*_0 (\vec r'_o,\;t_0)\rangle \vec h_r(\vec r_r,\vec r'_r;t,\;t_0)\vec h^*_o(\vec r_o,\vec r'_o;t,\;t_0)d \vec r'_r d \vec r'_o,$$
where $\langle \cdot \rangle$ means temporal ensemble average. The thermal source can be taken as completely incoherent, that is, $\langle \vec E_0 (\vec r'_r,\;t_0)\vec E^*_0 (\vec r'_o,\;t_0)\rangle =|\vec E_0(\vec r')|^2\delta (\vec r'-\vec r'_r,\;\vec r'-\vec r'_o)$, where the electric field $\vec E_0(\vec r')$ follows circular complex Gaussian statistics and $\delta (\cdot)$ is the Dirac delta function. Substituting Eqs. (1) and (2) into Eq. (3) gives
$$\langle \vec E_r(\vec r_r,\;t)\vec E^*_o(\vec r_o,\;t)\rangle =\frac{\exp\{\frac{j\vec k}{2z}(|\vec r_r|^2-|\vec r_o|^2)\}}{\lambda^2z^2}\int |\vec E_0(\vec r')|^2\exp\{\frac{j\vec k}{z}(\vec r_r-\vec r_o)\vec r'\}d\vec r',$$
where the condition $z_r=z_o=z$ is used, which is common and easy to satisfy in a GI system. The integral term in the equation is the Fourier transform of the intensity distribution of the light source. This is the expectation of the coherence between the light field on the reference CCD and that on the object. To obtain such an expectation value, averaging over an infinite ensemble or an infinite number of samplings is required, which cannot be satisfied in practice. With a limited number of samplings, there is a noise term, related to the number of samplings, between the measured value and the expectation value,
$$\langle \vec E_r(\vec r_r,\;t)\vec E^*_o(\vec r_o,\;t)\rangle _m=\frac{\exp\{\frac{j\vec k}{2z}(|\vec r_r|^2-|\vec r_o|^2)\}}{\lambda^2z^2}\int |\vec E_0(\vec r')|^2\exp\{\frac{j\vec k}{z}(\vec r_r-\vec r_o)\vec r'\}d\vec r'+\sigma(N),$$
where $N$ is the number of samplings and $\sigma (N)$ is the noise term. With fewer samplings, $\sigma (N)$ is larger.
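To make this concrete, the following minimal Python sketch (our own illustration, not part of the paper; the field statistics and sizes are assumptions) estimates the field correlation of Eq. (5) from a finite ensemble of circular complex Gaussian fields and shows the residual $\sigma(N)$ shrinking roughly as $1/\sqrt{N}$.

```python
import numpy as np

rng = np.random.default_rng(0)

def measured_correlation(n_samples, n_pixels=64):
    # Circular complex Gaussian speckle fields; at coinciding points the
    # expectation of <E_r E_o*> equals the mean intensity, here 1.
    E = (rng.normal(size=(n_samples, n_pixels))
         + 1j * rng.normal(size=(n_samples, n_pixels))) / np.sqrt(2)
    return np.mean(E * np.conj(E))  # finite-ensemble estimate

for N in (10, 100, 1000, 10000):
    residual = abs(measured_correlation(N) - 1.0)  # |sigma(N)| for this run
    print(f"N={N:6d}  |measured - expectation| = {residual:.4f}")
```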

In a GI system, the light transmitted or reflected by the object is collected by a point-like detector, called a bucket detector. For a temporal-varying scene, the image of the scene at time $t$ can be reconstructed as

$$G(\vec r_r,\;t)=\langle I(\vec r_r,\;t)B(t)\rangle -\langle I(\vec r_r,\;t)\rangle \langle B(t)\rangle ,$$
in which $B(t)=\int S(\vec r_o,\;t)I(\vec r_o,\;t)d\vec r_o$ is the light intensity collected by the bucket detector. $S(\vec r_o,\;t)$ is the intensity reflectivity function of the scene at $t$, which includes the information of both the moving object and the background. $I(\vec r_o,\;t)$ is the intensity distribution of the light field imprinted on the object. $G(\vec r_r,\;t)$ is the reconstructed image of $S(\vec r_o,\;t)$. $I(\vec r_r,\;t)$ is the intensity distribution of the reference light field recorded by the CCD, and $\langle \cdot \rangle$ means the temporal average over $\Delta t$. Substituting the expression for the bucket-detector signal into Eq. (6) gives
$$G(\vec r_r,\;t)_m=\int S(\vec r_o,\;t)|\langle \vec E_r(\vec r_r,\;t)\vec E^{{\ast}}_o(\vec r_o,\;t)\rangle _m|^2d\vec r_o.$$
Here the complex Gaussian moment theorem [31] is used, which can be expressed as $\langle I(\vec r_r,\;t)I(\vec r_o,\;t)\rangle =\langle I(\vec r_r,\;t)\rangle \langle I(\vec r_o,\;t)\rangle +|\langle \vec E_r(\vec r_r,\;t)\vec E^{\ast }_o(\vec r_o,\;t)\rangle |^2$. Then, substituting Eq. (5) into Eq. (7), we obtain
$$G(\vec r_r,\;t)_m=S(\vec r_o,\;t){\ast}\alpha |{\mathscr F}\{ {\widetilde{\vec E_0}}(\frac{\vec r_r-\vec r_o}{\lambda z})\}|^2+\epsilon (N),$$
where $\alpha =\frac {1}{\lambda ^4 z^4}$ represents the propagation parameter, $*$ represents the convolution operation, and $\mathscr {F}\{\cdot\}$ represents the Fourier transform. $\epsilon (N)$ is the noise term between the expected result and the measured result due to the limited number of samplings, with $\epsilon (N)=S(\vec r_o,\;t)*\sigma ^2(N)$. The first term on the right-hand side of Eq. (8) is the expectation of GI, which is the convolution between the intensity reflectivity function of the scene and the squared Fourier transform of the intensity distribution of the source. So GI can be seen as an incoherent imaging system [32] whose point spread function (PSF) is the Fourier-transform term. As noted for Eq. (5), with limited samplings there is a noise term, related to the number of samplings, between the measured PSF and its expectation. In the GI system, this noise in the PSF accumulates incoherently into the image through the convolution operation, which is shown in the second term on the right-hand side of Eq. (8). The smaller the object, the less noise is accumulated into the image, i.e., the smaller the noise term $\epsilon (N)$. So when imaging a moving object in a complex scene, if the spatial sparsity of the object can be exploited, the noise term can be reduced because the object is far smaller than the whole scene, and the number of samplings required in GI is reduced accordingly. This is the motivation of this paper.
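As a concrete illustration of the correlation reconstruction of Eq. (6), the sketch below (a toy model under our own assumptions: delta-correlated speckle intensities, a binary reflectivity, and no detector noise) reconstructs a small scene from speckle patterns and the corresponding bucket values.

```python
import numpy as np

rng = np.random.default_rng(1)

def gi_reconstruct(patterns, bucket):
    # Eq. (6): G = <I B> - <I><B>, averaged over the N samplings.
    return (np.mean(patterns * bucket[:, None, None], axis=0)
            - np.mean(patterns, axis=0) * np.mean(bucket))

H = W = 32
scene = np.zeros((H, W))
scene[12:20, 12:20] = 1.0                       # toy reflectivity S(r_o)

N = 2000
patterns = rng.exponential(size=(N, H, W))      # speckle-like intensities
bucket = np.sum(patterns * scene, axis=(1, 2))  # B(t) = integral of S * I

G = gi_reconstruct(patterns, bucket)
snr = G[scene == 1].mean() / G[scene == 0].std()
print(f"object mean / background rms with N={N}: {snr:.1f}")
```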

A temporal-varying scene can be discretized into a set of scene frames, $O(x,\;y,\;t_k), k=1,2\cdots M$, each of which can be treated as static within the interval $\Delta t_k= t_{k+1}-t_k$. During $\Delta t_k$, $N$ frames of speckle patterns $I(\vec r_r,\;t_{kn}), n=1,2\cdots N$ can be used to illuminate the scene frame, and the corresponding bucket-detector signals $B(t_{kn}),\;n=1,2\cdots N$ can be obtained. For two different scene frames $O(x,\;y,\;t_k)$ and $O(x,\;y,\;t_l)$, the difference between the two sequences of bucket-detector signals at $t_l$ and $t_k$ is

$$B_{TD}(t_{ln})=\int S(\vec r_o,\;t_l)I(\vec r_o,\;t_{ln})d\vec r_o-\int S(\vec r_o,\;t_k)I(\vec r_o,\;t_{kn})d\vec r_o, \;(n=1,2\cdots N).$$
For the different durations $\Delta t_k$ and $\Delta t_l$, the same sequence of illumination patterns is repeated, that is, $I(\vec r_r,\;t_{kn})=I(\vec r_r,\;t_{ln}), n=1,2\cdots N$; then Eq. (9) can be simplified as
$$B_{TD}(t_{ln})=\int \Delta S(\vec r_o,\;t_l)I(\vec r_o,\;t_{ln})d\vec r_o,$$
in which $\Delta S(\vec r_o,\;t_l)=S(\vec r_o,\;t_l)-S(\vec r_o,\;t_k)$ is the temporal-varying component between the two scene frames $S(\vec r_o,\;t_k)$ and $S(\vec r_o,\;t_l)$. Equation (10) shows that $B_{TD}(t_{ln})$ is the intensity of the light reflected by the temporal-varying component: the light from the background is eliminated by the subtraction, so the information of the moving component, which is sparse, is extracted. This is the core idea of TDGI. Replacing $B(t)$ in Eq. (6) with $B_{TD}(t_l)$,
$$\Delta G(\vec r_r,\;t_l)=\langle I(\vec r_r,\;t_l)B_{TD}(t_l)\rangle -\langle I(\vec r_r,\;t_l)\rangle \langle B_{TD}(t_l)\rangle ,$$
where we take $\Delta G(\vec r_r,\;t_l)$ as the final image of the temporal-varying component of the scene. This is the temporal intensity difference correlation algorithm, in which the image of the moving objects is extracted from the complex scene simply by replacing the bucket-detector signal of TGI with the temporal intensity difference between the signals collected by the bucket detector during two different durations. Following the derivation of Eqs. (6) and (8), Eq. (11) can also be simplified as
$$\Delta G_m(\vec r_r,\;t_l)=\Delta S(\vec r_o,\;t_l)*\alpha |\mathscr{F}\{ \widetilde{\vec E_0}(\frac{\vec r_r-\vec r_o}{\lambda z})\}|^2+\epsilon_\Delta (N),$$
where $\epsilon _\Delta (N)$ is the noise term between the expected and measured results, $\epsilon _\Delta (N)=\Delta S(\vec r_o,\;t_l)*\sigma ^2(N)$. Comparing Eq. (12) with Eq. (8), we can see that the intensity reflectivity function of the temporal-varying component $\Delta S(\vec r_o,\;t_l)$ is measured directly in TDGI. The size of $\Delta S(\vec r_o,\;t_l)$ is usually small compared with $S(\vec r_o,\;t_l)$, so with the same number of samplings, the noise term $\epsilon _\Delta (N)$ will be far smaller than $\epsilon (N)$. In TDGI, the temporal difference of the bucket-detector signals is the intensity of the light reflected by the moving objects, which are spatially sparse. By using this temporal difference to reconstruct the image, the spatial sparsity of the moving objects is exploited and the number of samplings required to reconstruct the image is reduced compared with TGI. Besides, our method reconstructs the image of the time-varying components, so in principle it works well when there are multiple targets in the scene. Moreover, this is achieved with only a linear algorithm, so the data processing is fast and the time needed to obtain the image of the objects is greatly reduced, making this scheme useful for tracking and imaging moving objects in a complex scene.
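The sketch below (again our own toy model; the scene, speckle statistics, and sizes are assumptions) puts Eqs. (9)–(11) together: the same pattern sequence is reused in both durations, the bucket sequences are subtracted so the static background cancels, and only the moving component is reconstructed.

```python
import numpy as np

rng = np.random.default_rng(2)
H = W = 32
background = rng.random((H, W))                    # static, non-sparse S0
obj_k = np.zeros((H, W)); obj_k[5:8, 5:8] = 1.0    # object at t_k
obj_l = np.zeros((H, W)); obj_l[5:8, 20:23] = 1.0  # same object moved, at t_l

N = 500
patterns = rng.exponential(size=(N, H, W))  # ONE sequence, reused at t_k and t_l

B_k = np.sum(patterns * (background + obj_k), axis=(1, 2))
B_l = np.sum(patterns * (background + obj_l), axis=(1, 2))
B_td = B_l - B_k  # Eqs. (9)-(10): the static background cancels exactly

# Eq. (11): correlate the patterns with the differential bucket signal.
dG = (np.mean(patterns * B_td[:, None, None], axis=0)
      - np.mean(patterns, axis=0) * np.mean(B_td))
# dG is positive at the new position and negative at the old one.
print("new position:", dG[5:8, 20:23].mean(),
      " old position:", dG[5:8, 5:8].mean())
```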

3. Experimental results

The experimental setup is shown in Fig. 1. A laser beam with a wavelength of 532 nm is expanded by a 4-f system and then scattered by a rotating ground glass (RGG). After the RGG, a second 4-f system, with an aperture located in its focal plane, allows the quasi-parallel light in the scattering field to pass. After the Fourier plane of lens $L_4$, a beamsplitter divides the light into two arms, each of which passes through a 2-f system. The light field in the reference arm is recorded by CCD1 (AVT Stingray F-125B). In the object arm, the illumination pattern is reflected by a DMD (TI DN 2503686), which displays the scene to be imaged. To calculate the mean square error (MSE) [33] between the intensity distribution of the displayed scene and the image reconstructed by GI, the displayed scene is imaged onto CCD2 (AVT Stingray F-125B) by lens $L_7$, so both traditional imaging (TI) and ghost imaging can be performed in this setup. When GI is performed, the intensity recorded by CCD2 is integrated and then discretized to 0–255; with this discretization, the dynamic range of the integrated signal from CCD2, which affects the GI result, is closer to that of a real 8-bit bucket detector. In our scheme, the same sequence of speckle patterns must illuminate the scene during different durations, which requires the system to produce the same sequence of speckle patterns repeatedly. This is achieved by using a precision step motor (SHINANO Y07-43D1-4275) to rotate the RGG along a repeatable trajectory, so this setup is also a CGI system: the reference illumination patterns can be stored by CCD1 in advance, after which the same sequence of illumination patterns is reproduced and TDGI can be performed with CCD2 alone.

Firstly, we consider the case of one moving object in a complex scene to show that our method can largely reduce the number of samplings required in ghost imaging. The scene to be imaged is a top view of a block with a car crossing it, as shown in Fig. 2(a). The scene can be discretized into five scene frames $S(\vec r_o,\;t_k),\;k=1,2\cdots 5$, consisting of the background $(S0)$ and the car in four different positions $(S1$–$S4)$, shown in the first row of Fig. 2(a). The duration of each scene frame is $\Delta t_k$, within which the car moves less than the size of a speckle, so it can be taken as immobile. At $t_1$, $N$ frames of speckle patterns $I(\vec r_r,\;t_{1n}), n=1,2\cdots N$ illuminate the scene $(S0)$, the block without the car, and the bucket-detector signal sequence $B(t_{1n})$ is obtained and stored. At $t_l, l=2,3\cdots 5$, the same sequence of speckle patterns is used and $B(t_{ln})$ is obtained. Then $B_{TD}(t_{ln})$ is calculated and the image of the car in the scene frame $S(\vec r_o,\;t_l)$ is reconstructed with Eq. (11). The results are shown in the third row of Fig. 2(a), each from 500 samplings. For comparison, TGI with Eq. (6) is also performed to image each of the whole scenes S1–S4 with the same 500 samplings. The results are shown in the second row of Fig. 2(a), in which the position of the car is marked by a dashed box; the information of the car is submerged in the noise. In practice, the image of the car can also be composited with the image of the block, which can be obtained in advance, to recover the relative position of the car and the block. The size of the scene on CCD2 is $400 \times 400$ pixels and the average full width at half maximum (FWHM) of a speckle is about $10 \times 10$ pixels, so on average about $1600$ speckles are imprinted on the scene. The size of the car is $1056$ pixels, so its sparsity relative to the whole scene is about 0.007. Each result in Fig. 2(a) is reconstructed from 500 samplings, so the sampling rate is about 0.313. Benefiting from the spatial sparsity of the car, with the same number of samplings the quality of the TDGI results is far higher than that of the TGI results. To demonstrate that TDGI reduces the number of samplings required in GI, the MSE of the reconstructed images in TGI and in TDGI is also calculated for different numbers of samplings. The result is shown in Fig. 2(b), where the bottom axis gives the number of samplings for TDGI and the top axis that for TGI. In TDGI, a higher-quality image of the moving object is obtained with a greatly reduced number of samplings compared with TGI.
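For reference, the MSE between the displayed scene and a reconstruction can be computed as in the helper below (our own normalization convention; the paper does not spell out its exact definition).

```python
import numpy as np

def mse(recon, truth):
    # Normalize both images to [0, 1] first, since the overall scale of a
    # GI reconstruction is arbitrary compared with the displayed scene.
    r = (recon - recon.min()) / (np.ptp(recon) + 1e-12)
    t = (truth - truth.min()) / (np.ptp(truth) + 1e-12)
    return float(np.mean((r - t) ** 2))
```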

Fig. 1. Schematic diagram of the experimental setup. A laser beam is expanded by a 4-f system ($f_1=5$ cm, $f_2=20$ cm) to a beam size of about 5 mm and is then scattered by an RGG. An aperture whose diameter is about 3 mm is located in the focal plane of a 4-f system ($f_3=f_4=20$ cm). The scattering field is divided into two beams, each of which passes through a 2-f system ($f_5=f_6=30$ cm). $L_7$ is an imaging lens with $f_7=20$ cm. The scene to be imaged is displayed by the DMD and the image of the displayed scene is captured by CCD2, which also works as the bucket detector in GI.

Fig. 2. Comparison between TDGI and TGI. (a) The first row shows the discretized scene frames to be imaged. The second row shows the results reconstructed by TGI, each from 500 samplings, with the position of the moving object marked by a dashed box. The third row shows the reconstructed images of the moving object, each obtained by TDGI with 500 samplings. (b) The MSE of the reconstructed images in TDGI and in TGI for different numbers of samplings.

To demonstrate the performance of our method in a more complex situation, a scene with multiple moving objects is also considered. The first row of Fig. 3(a) shows the five discretized scene frames to be imaged, consisting of the background $(S0)$ and two cars in four different positions ($S1$–$S4$). The speed and direction of each moving car vary, and there is also relative movement between the two cars, which can be extended to the case in which the shape of a moving object changes during its evolution. The results obtained by TDGI are shown in the third row of Fig. 3(a). For comparison, TGI is also performed with the same number of samplings, and the results are shown in the second row of Fig. 3(a), with the positions of the moving cars marked by dashed boxes. In this experiment, the average number of speckles imprinted on the scene is about $5625$. Each reconstructed image in Fig. 3(a) is obtained from 3000 samplings, so the sampling rate for each image is about 0.533. The MSE of the reconstructed images in TGI and in TDGI is again calculated for different numbers of samplings, and the result is shown in Fig. 3(b). In TDGI, a higher-quality image of the moving objects is obtained with a greatly reduced number of samplings compared with TGI. These results imply that our method can improve the performance of ghost imaging in tracking and imaging multiple moving objects and remains effective when the shape of a moving object changes during its evolution.

Fig. 3. Comparison between TDGI and TGI for a scene with multiple objects. (a) The first row shows the discretized scene frames to be imaged. The second row shows the results reconstructed by TGI, each from 3000 samplings, with the positions of the moving objects marked by dashed boxes. The third row shows the reconstructed images of the moving objects, each obtained by TDGI with 3000 samplings. (b) The MSE of the reconstructed images in TDGI and in TGI for different numbers of samplings.

CS is widely used in GI to improve the quality of the reconstructed image in cases where the image is not required in real time. In practice, however, tracking is usually achieved from sequential images of the moving object during its evolution: one must "see" where the object is in the reconstructed image at a desirable frequency. Therefore, obtaining the image of the object in real time is significant. The GI process consists of sampling and data processing. At present, benefiting from the development of illumination sources with high refresh frequency [34–37], tens of thousands of samplings can be performed within a subsecond in a common GI system. However, the time consumption of different data-processing methods varies greatly, so data processing plays an important role in the timeliness of a tracking and imaging system, and its time consumption is an issue to be considered. From this point of view, we compare the time consumption of the data processing in our scheme and in CS. The reconstructed images are shown in Fig. 4(a). The first row shows the results from CS, for which the number of samplings is denoted $N_C$; the second row shows the results from TDGI, with the number of samplings denoted $N_T$. The two images in the same dashed box, numbered by $K$, are of the same quality as characterized by MSE. Figure 4(b) shows the MSE of the reconstructed images in Fig. 4(a); the bottom axis is the index $K$ of the reconstructed images. We can see that with fewer than 1600 samplings, the quality of the reconstructed images from TDGI and from CS is almost the same, but more time is consumed by CS than by TDGI. As the number of samplings increases, CS can achieve an image of higher quality than TDGI with the same number of samplings. As mentioned before, the time consumption of sampling in GI is subsecond, which is negligible compared with that of data processing, so the comparison is performed between the time consumption of the data processing in TDGI and in CS, as shown in Fig. 4(c). Comparing Figs. 4(b) and 4(c), for images of almost the same quality, the time consumption of TDGI is about an order of magnitude less than that of CS, which here is based on gradient projection for sparse reconstruction in the two-dimensional discrete cosine transform (2D-DCT) domain [38]. Both algorithms run on a desktop with an Intel Xeon CPU $\times$8 @ 2.10 GHz and 96 GB of RAM.
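To make the cost argument concrete, the sketch below (toy sizes of our own choosing, not the paper's benchmark) times the linear TDGI reconstruction, which amounts to a single matrix-vector product over the stored patterns; an iterative CS solver such as GPSR repeats comparable products many times while enforcing sparsity in a transform domain.

```python
import time
import numpy as np

rng = np.random.default_rng(3)
N, H, W = 3000, 64, 64
patterns = rng.exponential(size=(N, H, W)).reshape(N, -1)
B_td = rng.normal(size=N)  # stand-in for the differential bucket signal

t0 = time.perf_counter()
# Eq. (11) as one matrix-vector product plus a rank-one correction.
dG = patterns.T @ B_td / N - patterns.mean(axis=0) * B_td.mean()
t1 = time.perf_counter()
print(f"linear reconstruction: {(t1 - t0) * 1e3:.1f} ms "
      f"for N={N}, {H}x{W} pixels")
```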

Fig. 4. Comparison between TDGI and CS. (a) The first row shows the images reconstructed by CS and the second row the images reconstructed by TDGI. (b) The MSE of the images obtained by CS and by TDGI. (c) The time consumption of the data processing in CS and in TDGI.

Moreover, in TDGI the image of the moving objects is reconstructed from the second-order correlation between the temporal difference of the bucket signals and the illumination patterns. TDGI differs from calculating the difference between two images of the scene reconstructed via TGI at two different durations, as demonstrated in the following. For two different scene frames at $t_k$ and $t_l$, two different sequences of speckle patterns $I(\vec r_r,\;t_{kn})$ and $I(\vec r_r,\;t_{ln})$ are used to illuminate the scene, providing bucket-detector signals $B(t_{kn})$ and $B(t_{ln})$. The image of the temporal-varying component of the scene, $\Delta G'(\vec r_r,\;t_l)$, is reconstructed with

$$\Delta G'_m(\vec r_r,\;t_l)=\langle I(\vec r_r,\;t_l)B(t_l)\rangle -\langle I(\vec r_r,\;t_l)\rangle \langle B(t_l)\rangle -(\langle I(\vec r_r,\;t_k)B(t_k)\rangle -\langle I(\vec r_r,\;t_k)\rangle \langle B(t_k)\rangle ).$$
In this method the illumination pattern keeps changing between durations. Let us call the scheme of Eq. (13) reference-changing TDGI (RCTDGI). According to Eq. (6), Eq. (13) can be simplified as
$$\Delta G'_m(\vec r_r,\;t_l)=G_m(\vec r_r,\;t_l)-G_m(\vec r_r,\;t_k),$$
$\Delta G'_m(\vec r_r,\;t_l)$ is an indirect measurement result obtained from $G_m(\vec r_r,\;t_l)$ and $G_m(\vec r_r,\;t_k)$, both of which are reconstructed images of the whole scene. As noted for Eq. (8), the larger the scene, the heavier the noise. So the noise terms in $G_m(\vec r_r,\;t_l)$ and $G_m(\vec r_r,\;t_k)$, denoted $\epsilon _l(N)$ and $\epsilon _k (N)$ respectively, are both larger than $\epsilon _\Delta (N)$, the noise term in the image obtained by TDGI with the same number of samplings. Besides, since the noise in $G_m(\vec r_r,\;t_l)$ and $G_m(\vec r_r,\;t_k)$ is incoherent, the noise term in the image reconstructed by RCTDGI is $\epsilon ' (N)=\epsilon _l (N)+\epsilon _k (N)$, so with the same number of samplings, $\epsilon ' (N)>\epsilon _\Delta (N)$. With an increasing number of samplings, the quality of the images from both TDGI and RCTDGI improves, eventually becoming the same for enough samplings. However, with a small number of samplings, TDGI obtains an image of the moving component with far higher quality than RCTDGI. This conclusion can also be explained by the fact that the sparsity of the moving objects is indeed exploited by TDGI, as demonstrated above, but not by RCTDGI, in which the whole scene is measured first and the difference between the two measured results is then calculated. Besides, since the reconstructed image of the temporal-varying component is an indirect measurement result, the noise of each direct measurement is transferred to the final result. The images of the moving cars in $S2$ obtained by TDGI and by RCTDGI are shown in Fig. 5; with the same number of samplings, the reconstructed image from TDGI is far clearer than that from RCTDGI.
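The difference between the two schemes can be reproduced numerically. In the sketch below (assumptions ours), RCTDGI reconstructs the whole scene twice with independent speckle sequences and subtracts the images, so the two noise terms add, while TDGI subtracts the bucket signals first and images only the sparse difference.

```python
import numpy as np

rng = np.random.default_rng(4)
H = W = 32
bg = rng.random((H, W))
obj = np.zeros((H, W)); obj[10:13, 10:13] = 1.0

def correlate(patterns, bucket):
    # G = <I B> - <I><B>
    return (np.mean(patterns * bucket[:, None, None], axis=0)
            - np.mean(patterns, axis=0) * np.mean(bucket))

N = 500
P = rng.exponential(size=(N, H, W))    # shared sequence (TDGI)
P_k = rng.exponential(size=(N, H, W))  # independent sequences (RCTDGI)
P_l = rng.exponential(size=(N, H, W))

# TDGI: subtract the bucket signals first, then correlate once.
dG_td = correlate(P, np.sum(P * (bg + obj), axis=(1, 2))
                     - np.sum(P * bg, axis=(1, 2)))
# RCTDGI, Eq. (13): reconstruct the whole scene twice, then subtract.
dG_rc = (correlate(P_l, np.sum(P_l * (bg + obj), axis=(1, 2)))
         - correlate(P_k, np.sum(P_k * bg, axis=(1, 2))))

for name, g in (("TDGI", dG_td), ("RCTDGI", dG_rc)):
    print(name, "object mean / background rms:",
          g[10:13, 10:13].mean() / g[obj == 0].std())
```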

Fig. 5. Comparison between TDGI and RCTDGI. The reconstructed images from TDGI (upper row) and RCTDGI (lower row) with N samplings.

4. Discussion

In this paper, we have proposed a temporal intensity difference correlation GI scheme, in which the information of the static component of the scene is removed and the image of the moving objects can be obtained with far fewer samplings. Besides, data processing with a linear algorithm gives the scheme good real-time performance, which is significant for tracking. In more general situations, if the information of both the moving objects and the background is required, our scheme can complement TGI: the image of the moving objects can be obtained by our scheme in greatly reduced time, and the image of the background can then be obtained by TGI without the blur caused by the moving objects. The performance improvement of our scheme benefits from the spatial sparsity of the moving objects, since the number of samplings required to image the moving objects in TGI is determined by the size of the whole scene, while in TDGI it is determined by the size of the moving objects. If the objects move so fast that enough samplings, as determined by the size of the objects, cannot be performed within the interval during which the objects can be regarded as immobile, the image reconstructed via TDGI will also deteriorate. TDGI is demonstrated here with a CGI system but can also be used in a single-pixel camera by modulating the light reflected from two different scene frames with the same mask sequence. This scheme can be used in target tracking and live-tissue imaging, in which fewer samplings mean a lower probability of being perceived or less damage. Besides, with fewer samplings, this method can largely reduce the data volume required when GI is used for surveillance video.

Furthermore, in our scheme the sparsity of the moving objects is exploited with an intensity correlation algorithm, which is linear and runs far faster than solving the convex optimization problem of CS. This improves the timeliness of information acquisition, which is of practical significance for tracking and imaging moving objects. Besides, the spatial sparsity of the objects is exploited in our scheme simply by subtracting the signals of the bucket detector. This would be difficult to achieve with traditional imaging, since the quality of an image captured by a camera is not improved by merely subtracting another image from it. We believe our scheme can be a good starting point for exploiting the sparsity of objects with a linear algorithm.

Funding

National Natural Science Foundation of China (11774431, 61701511); Science and Technology Project of Hunan Province (2017RS3043); College of Advanced Interdisciplinary Studies.

References

1. T. B. Pittman, Y. H. Shih, D. V. Strekalov, and A. V. Sergienko, “Optical imaging by means of two-photon quantum entanglement,” Phys. Rev. A 52(5), R3429–R3432 (1995). [CrossRef]  

2. A. Gatti, E. Brambilla, M. Bache, and L. A. Lugiato, “Correlated imaging, quantum and classical,” Phys. Rev. A 70(1), 013802 (2004). [CrossRef]  

3. R. S. Bennink, S. J. Bentley, R. W. Boyd, and J. C. Howell, “Quantum and classical coincidence imaging,” Phys. Rev. Lett. 92(3), 033601 (2004). [CrossRef]  

4. A. Valencia, G. Scarcelli, M. D’Angelo, and Y. H. Shih, “Two-photon imaging with thermal light,” Phys. Rev. Lett. 94(6), 063601 (2005). [CrossRef]  

5. S. S. Hodgman, W. Bu, S. B. Mann, R. I. Khakimov, and A. G. Truscott, “Higher-Order Quantum Ghost Imaging with Ultracold Atoms,” Phys. Rev. Lett. 122(23), 233601 (2019). [CrossRef]  

6. J. H. Shapiro, “Computational ghost imaging,” Phys. Rev. A 78(6), 061802 (2008). [CrossRef]  

7. Y. Bromberg, O. Katz, and Y. Silberberg, “Ghost imaging with a single detector,” Phys. Rev. A 79(5), 053840 (2009). [CrossRef]  

8. B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. J Padgett, “3D computational imaging with single-pixel detectors,” Science 340(6134), 844–847 (2013). [CrossRef]  

9. W. Gong, C. Zhang, H. Yu, M. Chen, W. Xu, and S. Han, “Three-dimensional ghost imaging lidar via sparsity constraint,” Sci. Rep. 6(1), 26133 (2016). [CrossRef]  

10. P. A. Morris, R. S. Aspden, J. E. Bell, R. W. Boyd, and M. J Padgett, “Imaging with a small number of photons,” Nat. Commun. 6(1), 5913 (2015). [CrossRef]  

11. H. Yu, R. Lu, S. Han, H. Xie, G. Du, T. Xiao, and D. Zhu, "Fourier-transform ghost imaging with hard X rays," Phys. Rev. Lett. 117(11), 113901 (2016). [CrossRef]  

12. A. X. Zhang, Y. H. He, L. A. Wu, L. M. Chen, and B. Wang, “Tabletop x-ray ghost imaging with ultra-low radiation,” Optica 5(4), 374–377 (2018). [CrossRef]  

13. Y. Altmann, S. McLaughlin, M. J. Padgett, V. K. Goyal, A. O. Hero, and D. Faccio, “Quantum-inspired computational imaging,” Science 361(6403), eaat2298 (2018). [CrossRef]  

14. J. P. Zhao, Y. W. E, K. Williams, X. C. Zhang, and R. W. Boyd, "Spatial sampling of terahertz fields with sub-wavelength accuracy via probe-beam encoding," Light: Sci. Appl. 8(1), 55 (2019). [CrossRef]  

15. W. Tan, X. Huang, S. Nan, Y. Bai, and X. Fu, “Effect of the collection range of a bucket detector on ghost imaging through turbulent atmosphere,” J. Opt. Soc. Am. A 36(7), 1261–1266 (2019). [CrossRef]  

16. Y. Zhang, W. Li, H. Wu, Y. Chen, X. Su, Y. Xiao, and Y. Gu, “High-visibility underwater ghost imaging in low illumination,” Opt. Commun. 441, 45–48 (2019). [CrossRef]  

17. Y. K. Xu, W. T. Liu, E. F. Zhang, Q. Li, H. Y. Dai, and P. X. Chen, “Is ghost imaging intrinsically more powerful against scattering?” Opt. Express 23(26), 32993–33000 (2015). [CrossRef]  

18. L. Li, Q. Li, S. Sun, H. Z. Lin, W. T. Liu, and P. X. Chen, “Imaging through scattering layers exceeding memory effect range with spatial-correlation-achieved point-spread-function,” Opt. Lett. 43(8), 1670–1673 (2018). [CrossRef]  

19. M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, “Deep-learning-based ghost imaging,” Sci. Rep. 7(1), 17865 (2017). [CrossRef]  

20. Y. He, G. Wang, G. Dong, S. Zhu, H. Chen, A. Zhang, and Z. Xu, "Ghost imaging based on deep learning," Sci. Rep. 8(1), 6469 (2018). [CrossRef]  

21. S. Ota, R. Horisaki, Y. Kawamura, M. Ugawa, I. Sato, K. Hashimoto, and K. Waki, “Ghost cytometry,” Science 360(6394), 1246–1251 (2018). [CrossRef]  

22. K. W. Chan, M. N. O’Sullivan, and R. W. Boyd, “Optimization of thermal ghost imaging: high-order correlations vs. background subtraction,” Opt. Express 18(6), 5562–5573 (2010). [CrossRef]  

23. J. Li, D. Yang, B. Luo, G. Wu, L. Yin, and H. Guo, “Image quality recovery in binary ghost imaging by adding random noise,” Opt. Lett. 42(8), 1640–1643 (2017). [CrossRef]  

24. H. Li, J. Xiong, and G. H. Zeng, “Lensless ghost imaging for moving objects,” Opt. Eng. 50(12), 127005 (2011). [CrossRef]  

25. E. R. Li, Z. W. Bo, M. L. Chen, W. L. Gong, and S. S. Han, “Ghost imaging of a moving target with an unknown constant speed,” Appl. Phys. Lett. 104(25), 251120 (2014). [CrossRef]  

26. S. Jiao, M. Sun, Y. Gao, T. Lei, Z. Xie, and X. Yuan, “Motion estimation and quality enhancement for a single image in dynamic single-pixel imaging,” Opt. Express 27(9), 12841–12854 (2019). [CrossRef]  

27. D. B. Phillips, M. J. Sun, J. M. Taylor, M. P. Edgar, S. M. Barnett, G. M. Gibson, and M. J. Padgett, "Adaptive foveated single-pixel imaging with dynamic supersampling," Sci. Adv. 3(4), e1601782 (2017). [CrossRef]  

28. N. Radwell, K. J. Mitchell, G. M. Gibson, M. P. Edgar, R. Bowman, and M. J. Padgett, "Single-pixel infrared and visible microscope," Optica 1(5), 285–289 (2014). [CrossRef]  

29. W. T. Liu, T. Zhang, J. Y. Liu, P. X. Chen, and J. M. Yuan, “Experimental quantum state tomography via compressed sampling,” Phys. Rev. Lett. 108(17), 170403 (2012). [CrossRef]  

30. O. S. Magana-Loaiza, G. A. Howland, M. Malik, J. C. Howell, and R. W. Boyd, “Compressive object tracking using entangled photons,” Appl. Phys. Lett. 102(23), 231104 (2013). [CrossRef]  

31. I. Reed, “On a moment theorem for complex Gaussian processes,” IEEE Trans. Inf. Theory 8(3), 194–195 (1962). [CrossRef]  

32. D. Z. Cao, J. Xiong, and K. G. Wang, “Geometrical optics in correlated imaging systems,” Phys. Rev. A 71(1), 013801 (2005). [CrossRef]  

33. S. Sun, W. T. Liu, H. Z. Lin, E. F. Zhang, J. Y. Liu, Q. Li, and P. X. Chen, “Multi-scale adaptive computational ghost imaging,” Sci. Rep. 6(1), 37013 (2016). [CrossRef]  

34. M. Bache, E. Brambilla, A. Gatti, and L. A. Lugiato, “Ghost imaging schemes: fast and broadband,” Opt. Express 12(24), 6067–6081 (2004). [CrossRef]  

35. L. Wang and S. Zhao, “Fast reconstructed and high-quality ghost imaging with fast Walsh-Hadamard transform,” Photonics Res. 4(6), 240–244 (2016). [CrossRef]  

36. Y. Wang, Y. Liu, J. Suo, G. Situ, C. Qiao, and Q. Dai, “High speed computational ghost imaging via spatial sweeping,” Sci. Rep. 7(1), 45325 (2017). [CrossRef]  

37. Z. H. Xu, W. Chen, J. Penuelas, M. Padgett, and M. J. Sun, “1000 fps computational ghost imaging using LED-based structured illumination,” Opt. Express 26(3), 2427–2434 (2018). [CrossRef]  

38. M. A. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,” IEEE J. Sel. Top. Signal Process. 1(4), 586–597 (2007). [CrossRef]  


