## Abstract

Compressive hyperspectral imaging (CHI) has attracted widespread attention due to its advantage of snapshot by encoding the 3D spectral image into a 2D measurement. The bottleneck of CHI for the real application now lies in the limited reconstruction quality, for which one of the fundamental reason is the inaccurate modeling of the measurement noise. Our key observation is that in practical scenarios, the measurement is inevitably contaminated with a positive offset (i.e., noise mean) due to the unideal imaging conditions (e.g. stray light and dark current), resulting in serious degradations of the reconstruction quality. In this paper, we propose to model the real noise with non-zero mean that generalizes the traditional zero mean noise to faithfully characterizing the optical imaging principle, and then introduce a novel reconstruction method with a goal to boost the reconstruction quality. During the reconstruction, the noise mean is estimated gradually to its convergence by measuring the deviation of the intermediate reconstruction result from the system measurement. We demonstrate the superior performance of our method by experiments on synthetic data and real capture data with a hardware prototype.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Hyperspectral imaging measures each pixel in a scene with a densely sampled spectral signature. The obtained 3D datacube with 1D spectral and 2D spatial variations provides deterministic information about the material and lighting in the scene. Owing to its capability of detailed scene representation, hyperspectral imaging has shown increasing application values in a variety of fields, e.g., remote sensing, medical diagnosis and health care [1–3]. Conventional spectral imagers [4] usually adopt the scanning-based strategy to capture the 3D hyperspectral images (HSI), which needs multiple exposures to obtain a full HSI and is not applicable in time-crucial scenarios.

To overcome the limitation of the conventional spectral imagers, various kinds of snapshot hyperspectral imaging prototypes have been proposed in the last decades [5–7]. In the early developments, snapshot hyperspectral imagers are based on direct measurement. Representative techniques include image-replicating imaging spectrometer [8], 4D imaging spectrometer [9], and image mapping spectrometry [10]. However, since the datacube voxel is directly mapped to the sensor pixel, the number of the voxel is fundamentally limited by the number of sensor pixels, thus the spatial and spectral resolutions are mutually restricted. Based on the compressive sensing theory, researchers have developed quite a few compressive hyperspectral imaging (CHI) prototypes. Among these works, the coded aperture snapshot spectral imaging (CASSI) [11,12] has drawn increasing attention due to its nature of high spatial and spectral resolution [13]. At the meantime, same variants of CASSI have been proposed with alternative optical designs, including the multi-frame snapshot [14], the spatial-spectral modulation [15,16], and the colored coded aperture [17]. Integrated with an extra panchromatic camera, CASSI has been upgraded to a dual-camera design, named DCCHI [18,19].

Despite of the hardware advancement, we have not seen the widespread application of CHI in practice. It is no doubt that the transition from a technical concept to a practical equipment is greatly meaningful, i.e., realizing what CHI has promised. The bottleneck of CHI for achieving this transition lies in the limited reconstruction quality in the real scenarios [5], for which one of the fundamental reason is the inaccurate modeling of the measurement noise. Our key observation is that the measurement noise is with a non-zero mean in practical scenarios, which mainly results from two aspects: (1) **Dark Current**: electrical components in the camera spontaneously generate a dark current, leading to a positive offset on the measurement. (2) **Stray Light**: light in the imaging system is reflected and scattered multiple times among lenses due to the imperfect transmittance, reflectance, and smoothness of optical elements.

Existing reconstruction methods are designed and optimized for zero mean additive white Gaussian noise (AWGN) [20–22]. The influence of the non-zero noise mean has been ignored in the literatures, which is the result of thinking inertia from direct imaging. In the field of direct imaging, adding an offset to one image has little impact on the image structures, so people always ignore the noise mean. However, in the field of computational imaging, the non-zero noise mean would be amplified and transferred to the reconstruction results during the reconstruction process, which results in large reconstruction error. Please see Sec. 2.2 for a theoretical analysis, here we show an intuitive comparison on the influence of different noise means in Fig. 1. The images are reconstructed by the two-step iterative shrinkage/thresholding (TwIST) method [20]. It can be seen that the non-zero noise means introduce large reconstruction error on both spatial structures and spectral fidelity.

In this paper, we focus on the problem of noise mean estimation and fix it based on two representative CHI systems, i.e., CASSI and its dual-variant DCCHI. Our contributions include:

- We
*propose*to model the real noise with non-zero mean that generalizes the traditional zero mean AWGN in the CHI system. - We
*develop*a novel reconstruction method that particularly considers the non-zero mean noise, where the noise mean is estimated gradually to its convergence by calculating the deviation of the intermediate reconstruction results from the system measurements. - We
*validate*our method with simulations on synthetic data both quantitatively and qualitatively. - We
*build*a compressive hyperspectral imaging prototype to further verify our method and*demonstrate*its practical values.

The remainder of this paper is organized as follows. In Sec. 2, we build the imaging model for CHI with non-zero mean noise and introduce our method for noise mean estimation and hyperspectral image reconstruction. In Sec. 3, we evaluate the effectiveness of the proposed method in experiments on synthetic data. In Sec. 4, we build a hardware prototype and demonstrate our method with real data. In Sec. 5, we conduct a discussion about the complexity and extension of our method. In Sec. 6, we conclude this paper.

## 2. Methods

#### 2.1. Imaging model with non-zero mean noise

The schematics of CASSI and DCCHI are shown in Fig. 2. Denote the 3D hyperspectral image data as *f*(*x*, *y*, *λ*), where *x* and *y* express the spatial position and *λ* represents the wavelength in spectral dimension. The incident light from the scene is equally divided into two directions by a beam splitter. In the CASSI branch, the light is first projected onto a coded aperture, of which the pattern obeys a random Bernoulli distribution to guarantee the restricted isometry property [23], and then dispersed by a dispersive prism. Denote the transmission function of the coded aperture as *T*(*x*, *y*), and the dispersion function of the dispersive prism as *M* (*λ*). The information before the sensor plane can be represented as

The information is finally integrated by the camera along the spectral dimension and discretized into pixels along the spatial dimensions. Value of the (*m*, *n*) pixel on the sensor is given by

*w*(

_{c}*λ*) represents the spectral sensitivity of the camera in CASSI branch, Δ denotes the pixel size.

*λ*and

_{min}*λ*indicate the spectral range. Considering the unavoidable noise in practice, Eq. (2) can be simplified to the following matrix form where

_{max}*G*and

_{c}*F*are the vector forms for

*g*and

_{c}*f*,

*H*is the observation matrix of CASSI, and

_{c}*N*(

_{c}*μ*, ${\sigma}_{c}^{2}$) represents the noise of CASSI obeying a Gaussian distribution with mean of

_{c}*μ*and variance of ${\sigma}_{c}^{2}$.

_{c}*H*is mainly contributed from the modulation of the coded aperture, the dispersion of the prisms, and integration of the sensor [24–26]. For the panchromatic camera branch without any modulation, information collected by the camera is only an integration of the datacube along the spectral dimension,

_{c}*w*(

_{p}*λ*) is the spectral sensitivity of the camera in the panchromatic branch. Similar to the CASSI branch, Eq. (4) can be reformed as where

*G*is the vector form for

_{p}*g*,

_{p}*H*is the observation matrix which indicates the integration of the panchromatic camera, and

_{p}*N*(

_{p}*μ*, ${\sigma}_{p}^{2}$) represents the noise of the camera obeying a Gaussian distribution with mean of

_{p}*μ*and variance of ${\sigma}_{p}^{2}$. Due to the existing of dark current and stray light, the noise means

_{p}*μ*and

_{c}*μ*are non-zero. By stacking Eq. (3) and Eq. (5), the forward model of the DCCHI system can be expressed as where

_{p}#### 2.2. Image reconstruction by estimating noise mean

Give the system measurements *G* and the system observation matrix *H*, the reconstruction for CHI are generally formulated as the following optimization problem

*β*is a weight factor balancing the fidelity term and the regularization term. Here, we develop an effective method to estimate the value of the noise mean and reduce its influence on the reconstruction quality. To simplify the deduction, we first define a vector that indicates the noise mean as where

*B*and

_{c}*B*are constant vectors, whose values are

_{p}*μ*,

_{c}*μ*and the sizes are the same as

_{p}*G*and

_{c}*G*, respectively. Then, the forward model of DCCHI can be reformulated as with

_{p}*N′*representing a zero mean noise.

For the situation where noise mean is zero (i.e., *B* = **0**), the reconstructed hyperspectral image *F̂* can be expressed as

*𝒦*(·) indicates the function of the reconstructed process in Eq. (8). In this situation, the reconstructed hyperspectral image

*F̂*is close to the ground truth

*F*. However, when the noise mean is non-zero, the results obtained from the conventional reconstruction methods change to

*C*is an auxiliary variable satisfying

*HC*=

*B*. On this condition, the estimated

*F̂*is close to

*F*+

*C*, deviating from the ground truth

*F*.

To address the issue caused by the non-zero noise mean, we propose a novel optimization method by explicitly modeling the noise mean *B* in the fidelity term

To be consistent with the original CASSI works and thus facilitate the performance comparison, we adopt the total variation regularizer as

*D*is a difference operator, and

*i*,

*j*,

*k*represent the coordinates of one voxel in HSI, respectively. To solve Eq. (13), we adopt an alternating minimization scheme as follows.

**Update** *F*. With an estimated value of *B*, which is initialized as **0**, we use ADMM frame-work [27] to estimate the hyperspectral images *F*. By variable splitting and dual decomposition, Eq. (13) can be written as

*V*is an auxiliary variable and

*γ*is a regularization parameter. In the

*i*th iteration, for fixed

*U*and

^{i}*V*, Eq. (16) has a closed-form solution as

^{i}*CG*) algorithm [28]. After

*F*

^{i+1}being calculated,

*U*

^{i+1}will be updated by soft thresholding [29] as

*x*,

*y*}, sign{·} and ⊙ denote the maximum function of

*x*and

*y*, the algebraic sign function and the element-wise multiplication function, respectively. Then

*V*

^{i+1}is calculated by This iteration runs up to a fixed number.

**Update** *B*. After solving the intermediated HSI *F̂*, the noise means *μ _{c}* and

*μ*(i.e.,

_{p}*B*) can be estimated by solving the least square problem

The HSI and the noise mean are estimated alternatively. There are two stopping criterion in our implementation, i.e., maximum outer loop as 30 times and minimum change rate of the noise mean between the two successive estimations as 1%. Our method is summarized in Algorithm 1.

## 3. Experiments on synthetic data

The HSIs in Columbia Multispectral Image Database [30] are used as the synthetic data. The spatial resolution is 512 × 512 and the number of spectral bands is 31 from 400nm to 700nm. We consider TwIST [20], gradient projection for sparse reconstruction (GPSR) [21], and approximate message passing (AMP) [22] as the competing methods. All parameters involved in these methods are optimized or set as described in their original references, and we also have consulted the authors about parameter setting. Specifically, the parameter for balancing the fidelity term and the regularization is set to 0.8 for TwIST, 0.5 for GPSR, 0.2 for AMP, respectively. The number of iterations for these methods is 200. In our method, we set the parameters empirically. Specifically, the balanced parameter *β* is set to 0.15. The regularization parameter in the TV minimum *γ* is set to 2. Iterations of the inner loop *I _{in}* and the outer loop

*I*are set to 15 and 30, respectively.

_{out}We employ the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and spectral angle mapping (SAM) as the evaluation metrics. PSNR assesses the pixel-level spatial fidelity and SSIM assesses the structure-level spatial fidelity, which are calculated on each 2D band-wise image and averaged over all spectral bands. SAM measures the spectral error, which is calculated on each 1D point-wise spectral signature and averaged over all spatial points. In the experiments, we set the noise variance as 0.1, change the noise means from 0 to 20, and conduct 10 repeated trials for calculating the averaged results.

For the quantitative comparison, we first depict the accuracy of noise mean estimation under different noise means in Fig. 3. Clearly, our method can achieve accurate noise mean estimation results. On average, the noise mean estimation error is as low as 0.035. Fig. 4 shows the numerical results for different methods with varying noise means. All the numbers are the averaged results over the test images. As can be seen, with the noise mean increasing, reconstruction qualities of the competing methods decline rapidly, while our method are robust to the noise mean and can achieve a better performance. As our method can accurately estimate the noise mean and fix its influence under different noise means, the reconstruction results of our method would keep stable with varying noise means. Meanwhile, we list the numerical results for 10 representative images in Table 1. The best results are highlighted in bold. Similarly, our method can achieve the best performance according to the metrics.

For the qualitative comparison, we show the reconstructed results for HSIs *beads* and *statue* in Figs. 5 and 6. The CASSI measurements, panchromatic measurements and ground truth images are also displayed. With the noise mean increasing, the reconstruction results for competing methods suffer from serious spatial blur and spectral error, while our method can achieve visually pleasant results with fine spatial structure and high spectral fidelity. The associated PSNR further demonstrate the superiority of our method.

To further compare the spectral performance, we select two regions within HSIs *beads* and *statue* as labeled in Figs. 5 and 6. The voxel values of these images have been scaled to [0,1]. The spectral signatures are shown in Fig. 7. It can be seen that the spectral signature obtained by our method are closest to the ground truth, which demonstrates that our method can achieve a superior spectral fidelity.

Finally, we show the convergence curve of the noise mean estimation for HSIs *beads* and *statue* in Fig. 8. The ground truth noise mean here is 10. It can be see that the noise mean estimated by the proposed method can converge quickly, which is very close to the ground truth.

## 4. Experiments on real captured data

Our prototype is shown in Fig. 9, which consists of a standard CASSI branch and a panchromatic one. In this setup, the incident light is divided equally into the two directions by a half-pass beam splitter. Then the light passes through optical filters with passband from 450 to 700 nm. In CASSI branch, the light is spatially modulated by a manufactured coded aperture obeying a random binary pattern with 300 × 300 elements, and then spectrally modulated by a double Amici prism. Finally, the coded information is collected by a gray camera. In the panchromatic branch, the gray camera captures the scene without any spatial and spectral dispersion. Both cameras are identical PointGrey cameras (model FL3-U3-13Y3M), with a grayscale of 8-bit, pixel number of 1280 × 1024, and pixel pitch of 4.8 × 4.8μm. The size of the element on the coded aperture is 10 × 10μm. Each element on the coded aperture is optically mapped to 2 × 2 pixels on the camera in CASSI.

We have conducted extensive experiments on real scenes and compare our method with TwIST. Fig. 10 shows the reconstructed results for the images *Ninja* and *Doll*. The compressive images from the CASSI branch and panchromatic images from the other branch are shown for reference. We display the reconstruction results of TwIST and our method at spectral bands 550 and 648nm. The noise means we estimated for *Ninja* and *Doll* are 15.78 and 5.49, which are scaled to [0, 255]. We can see that our method produces results with more precise details and clearer textures, such as the flame and the numbers in image *Ninja* and smooth surface in image *Doll*, which further proves the effectiveness of our method. To better illustrate the spectral fidelity, we show the reconstructed results for a Color Checker in Fig. 11. The noise mean estimated by our method is 7.69. As can be seen, the result image for TwIST is degenerated with spatial blur and color distortion, while our result has a better visual quality. For quantitative description, we chose two regions labeled in Figs. 11 (c) and 11 (d), and then extract the spectral signatures. The spectrum comparisons are depicted in Fig. 12. The reference spectrum were obtained by a standard spectrometer (Ocean Optics). Our results are closer to the reference spectrum compared that of TwIST. The benefits of our method that leads to improved reconstruction quality is further confirmed.

## 5. Discussion

#### 5.1. Computational complexity

Table 2 lists the computational complexities for different methods to reconstruct a HSI patch with the size of *M* × *N* × *K*. Here we use *L* to indicate the number of iterations. GPSR, AMP and TwIST have a comparable computational complexity, while ours is with a higher one. The higher computational complexity of our method mainly came from the two-layer iterations for reconstructing HSI and estimating the noise mean.

#### 5.2. Extension to multi-frame CASSI

Despite of the snapshot advantage from the hardware design, multiple frame acquisition strategy has been proposed to improve the performance of CASSI [14]. This strategy uses more than one measurements with different coded apertures to relieve the reconstruction problem. Table 3 depicts the results for HSIs *beads* and *statue* reconstructed by 1-frame CASSI and 2-frame CASSI. From these numbers, we can verify the effectiveness of our method on different variants of CASSI.

## 6. Conclusion

In summary, we propose a compressive hyperspectral imaging reconstruction technique that particularly considers the non-zero mean noise which seriously degrades the reconstruction accuracy in practice. To fix the problem, we build a general model to specifically considering the noise with non-zero mean and then propose a reconstruction method to reduce its adverse effect on the reconstruction quality. By utilizing the derivation between the noisy observation and the intermediate reconstruction results, the noise mean is gradually estimated to its convergence, and the reconstruction accuracy is improved accordingly. We demonstrate the effectiveness of our method on simulation data and build a prototype system experimentally for further validation.

## Funding

National Natural Science Foundation of China (61425013, 61701025); Beijing Municipal Science and Technology Project (Z181100003018003); Beijing Institute of Technology Research Fund Program for Young Scholars.

## References

**1. **D. B. Gillis, J. H. Bowles, M. J. Montes, and W. J. Moses, “Propagation of sensor noise in oceanic hyperspectral remote sensing,” Opt. Express **26**, A818–A831 (2018). [CrossRef] [PubMed]

**2. **G. Lu and B. Fei, “Medical hyperspectral imaging: a review,” J. Biomed. Opt. **19**, 010901 (2014). [CrossRef]

**3. **V. Backman, M. B. Wallace, L. T. Perelman, J. T. Arendt, R. Gurjar, M. G. Müller, Q. Zhang, G. Zonios, E. Kline, and J. A. Mcgilligan, “Detection of preinvasive cancer cells,” Nature **406**, 35–36 (2000). [CrossRef] [PubMed]

**4. **A. F. H. Goetz, G. Vane, J. E. Solomon, and B. N. Rock, “Imaging spectrometry for earth remote sensing,” Science **228**, 1147–1153 (1985). [CrossRef] [PubMed]

**5. **N. Hagen and M. W. Kudenov, “Review of snapshot spectral imaging technologies,” Opt. Eng. **52**, 090901 (2013). [CrossRef]

**6. **M. Tamamitsu, Y. Kitagawa, K. Nakagawa, R. Horisaki, O. Yu, S. Y. Morita, Y. Yamagata, K. Motohara, and K. Goda, “Spectrum slicer for snapshot spectral imaging,” Opt. Eng. **54**, 123115 (2015). [CrossRef]

**7. **X. Cao, T. Yue, X. Lin, S. Lin, X. Yuan, Q. Dai, L. Carin, and D. J. Brady, “Computational snapshot multispectral cameras: toward dynamic capture of the spectral world,” IEEE Signal Process. Mag. **33**, 95–108 (2016). [CrossRef]

**8. **A. R. Harvey and D. W. Fletcher-Holmes, “High-throughput snapshot spectral imaging in two dimensions,” Proc. SPIE **4959**, 46–54 (2003). [CrossRef]

**9. **N. Gat, G. Scriven, J. Garman, D. L. Ming, and J. Zhang, “Development of four-dimensional imaging spectrometers (4D-IS),” Proc. SPIE **6302**, 104 (2006).

**10. **G. Liang, B. Noah, H. Nathan, R. T. Kester, and T. S. Tkaczyk, “Depth-resolved image mapping spectrometer (IMS) with structured illumination,” Opt. Express **19**, 17439–17452 (2011). [CrossRef]

**11. **M. E. Gehm, R. John, D. J. Brady, R. M. Willett, and T. J. Schulz, “Single-shot compressive spectral imaging with a dual-disperser architecture,” Opt. Express **15**, 14013–14027 (2007). [CrossRef] [PubMed]

**12. **W. Ashwin, J. Renu, W. Rebecca, and D. J. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Appl. Opt. **47**, 44–51 (2008). [CrossRef]

**13. **G. Arce, D. Brady, L. Carin, H. Arguello, and D. Kittle, “Compressive coded aperture spectral imaging: An introduction,” IEEE Signal Process. Mag. **31**, 105–115 (2014). [CrossRef]

**14. **K. David, C. Kerkil, W. Ashwin, and D. J. Brady, “Multiframe image estimation for coded aperture snapshot spectral imagers,” Appl Opt **49**, 6824–6833 (2010). [CrossRef]

**15. **X. Lin, G. Wetzstein, Y. Liu, and Q. Dai, “Dual-coded compressive hyperspectral imaging,” Opt. Lett. **39**, 2044–2047 (2014). [CrossRef] [PubMed]

**16. **X. Lin, Y. Liu, J. Wu, and Q. Dai, “Spatial-spectral encoded compressive hyperspectral imaging,” Tran. on on Graph. **33**, 233 (2014).

**17. **A. Parada-Mayorga and G. Arce, “Colored coded aperture design in compressive spectral imaging via minimum coherence,” IEEE Transactions on Comput. Imaging **PP**, 1 (2017).

**18. **L. Wang, Z. Xiong, H. Huang, G. Shi, F. Wu, and W. Zeng, “High-speed hyperspectral video acquisition by combining nyquist and compressive sampling,” IEEE transactions on pattern analysis machine intelligence **41**, 857–870 (2019). [CrossRef]

**19. **L. Wang, Z. Xiong, D. Gao, G. Shi, W. Zeng, and F. Wu, “High-speed hyperspectral video acquisition with a dual-camera architecture,” in IEEE Conference on Computer Vision and Pattern Recognition, (2015), pp. 4942–4950.

**20. **J. M. Bioucas-Dias and M. A. T. Figueiredo, “A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Tran. on Image Process. **16**, 2992–3004 (2007). [CrossRef]

**21. **M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,” IEEE J. Sel. Top. Signal Process. **1**, 586–597 (2008). [CrossRef]

**22. **T. Jin, Y. Ma, H. Rueda, D. Baron, and G. Arce, “Compressive hyperspectral imaging via approximate message passing,” IEEE J. Sel. Top. Signal Process. **10**, 389–401 (2016). [CrossRef]

**23. **E. J. Candėes, “The restricted isometry property and its implications for compressed sensing,” Comptes rendus - Mathématique **346**, 589–592 (2008). [CrossRef]

**24. **A. Henry, R. Hoover, W. Yuehao, D. W. Prather, and G. R. Arce, “Higher-order computational model for coded aperture spectral imaging,” Appl. Opt. **52**, D12–D21 (2013). [CrossRef]

**25. **L. Wang, Z. Xiong, D. Gao, G. Shi, and F. Wu, “Dual-camera design for coded aperture snapshot spectral imaging,” Appl. Opt. **54**, 848 (2015). [CrossRef] [PubMed]

**26. **L. Yang, Y. Xin, J. Suo, D. J. ady, and Q. Dai, “Rank minimization for snapshot compressive imaging,” IEEE Transactions on Pattern Analysis Mach. Intell. **PP**, 1 (2018).

**27. **S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations Trends Mach. Learn. **3**, 1–122 (2010). [CrossRef]

**28. **M. R. Hestenes and E. Stiefel, *Methods of conjugate gradients for solving linear systems*, vol. 49 (NBS, 1952).

**29. **A. Chambolle, “An algorithm for total variation minimization and applications,” J. Math. Imaging Vis. **20**, 89–97 (2004). [CrossRef]

**30. **Y. Fumihito, M. Tomoo, I. Daisuke, and S. K. Nayar, “Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum,” IEEE Tran. on Image Process. **19**, 2241 (2010). [CrossRef]