Abstract
Traditional optical imaging and sensing methods capture signals of interest by direct sampling in their native domain, such as by forming images on pixelated camera sensors or by regular temporal sampling of a waveform. These methods are indispensable in our daily lives and for many scientific disciplines such as microscopy in biology and spectroscopy in chemistry. Using these approaches, the sampling constraints and their impact on the bounds on signal fidelity are well understood through the Nyquist–Shannon sampling theorem. However, the problems of modern science require ever-increasing amounts of data at unprecedented temporal and spatial scales and resolutions, which challenges the limits of traditional sensing. The increased availability of computational power combined with recent strides in signal processing promises to overcome many of the limitations of traditional sensing methods through computational imaging and sensing. Within the realm of computational sensing, compressed sensing (CS), in particular, has enabled the capture of signals with lower sampling resources than traditionally required by the Nyquist–Shannon sampling theorem using prior information such as sparsity. In this tutorial, we focus on the operation and impact of such sub-Nyquist sampling schemes through the use of CS in photonic sensing and imaging systems. Emphasis is placed on intuition, but mathematical results are derived or cited where appropriate. Finally, we highlight several applications in macroscopic and microscopic imaging, spectroscopy, and microwave photonic sensing.
© 2022 Optica Publishing Group
1. INTRODUCTION
The world is composed of analog signals: the spectrum of sunlight scattered from the leaf of a crop, the flash of light from a fluorescent protein within a neuron, or the fluctuations of the electric field of a radar pulse returning from a distant object. The goal of modern photonic imaging and sensing systems is to capture such analog signals into an accurate digital representation that can be efficiently transmitted, stored, and analyzed. Analog signals travel through many processing steps, including sampling and quantization, before they are saved on a computer for further analysis, and each of these steps inevitably leads to some loss of information. The challenge for an imaging or sensing system is to strike an optimal balance between information loss and efficiency.
The first step in the journey from analog signal to digital representation is sampling, where a set of discrete analog measurements representative of the analog signal is made. A simple example is shown in Fig. 1(a), where an image is discretized into many smaller rectangular areas corresponding to the sampling effect of camera pixels. Sampling intrinsically leads to information loss since the analog signal could be changing arbitrarily in between samples or within a sample (e.g., averaged intensity over a pixel). This is largely avoided by filtering the signal in the frequency domain so that it is bandwidth limited and then sampling at a rate greater than the Nyquist limit; however, in this case information is still lost during the filtering stage.
Following sampling, the next step is measurement with some form of quantization. Quantization is necessary not only because computer memory is finite but also because one can only measure by comparisons. For example, an analog voltage can be measured by comparing it to many precise reference voltage levels. Time can be measured by counting the number of cycles of a reference clock. Many examples of quantized measurements can be given here, and they would all have similar types of error associated with them, such as the step size (resolution) of the reference levels, the accuracy of the reference levels, and the nonlinearity of the step sizes, which carry various names for different quantizing devices [e.g., differential/integral nonlinearity (DNL/INL) for an analog-to-digital converter, or jitter in a clock]. Many fundamental trade-offs exist between quantization or sampling parameters and design performance metrics, such as cost, speed, and resolution. For example, a large number of reference levels would allow measurement of a larger range of (or more precise) signal values. However, this would incur increased financial cost, reduced speed since more bits need to be transmitted, and increased data storage. Data compression has played a significant role in relaxing some of the physical constraints associated with these design choices for both conventional and compressed sensing modalities.
In conventional sensing, signals are compressed after digitization for efficient storage and transmission as depicted in Fig. 1(b). Technologies such as video streaming have become so ubiquitous that we take them for granted, and yet they would have been impossible without compression algorithms. A tremendous amount of effort has gone into designing efficient compression algorithms, which are generally grouped into lossless and lossy algorithms. Lossless algorithms exploit redundancies in the signal. For example, if the signal has a lot of repeating patterns (e.g., “000000011111”), one can store “runs” of data (e.g., “7 0s, 5 1s”), which is known as run-length encoding. Huffman coding is another example where patterns with the highest probability of occurrence are encoded with the fewest number of bits, which is an example of variable-length code (e.g., common letters of the English alphabet such as “e” or “a” can be transmitted with the fewest number of bits). Lossy compression algorithms, on the other hand, make use of the general fact that high-dimensional natural signals live in a low-dimensional manifold. Therefore, high-dimensional data (e.g., images with millions of pixels) can be projected down to a low-dimensional space [e.g., a small fraction of the largest discrete cosine transform (DCT) coefficients] and reconstructed computationally with minimal error. An example of this is shown in Fig. 1(a), where keeping as few as 1% of the wavelet transform coefficients of an image results in almost no visible degradation in image quality.
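Run-length encoding, as described above, is simple enough to sketch in a few lines. The following is a minimal Python illustration of the idea, not drawn from any particular codec:

```python
def run_length_encode(symbols):
    """Collapse repeated symbols into (symbol, run length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs

def run_length_decode(runs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(s * n for s, n in runs)
```

Applied to the example in the text, `run_length_encode("000000011111")` yields `[("0", 7), ("1", 5)]`, i.e., "7 0s, 5 1s," and decoding is lossless.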
Critically, research into lossy compression has demonstrated that natural signals can be compressed with little loss of information, which suggests that conventional sensing modalities are inefficient since conventionally sampled data contain a large amount of redundant information. The main goal of compressed sensing is to compress signals in the physical (analog) domain prior to sampling in order to maximize the information acquired per sample [Fig. 1(c)]. The increased sampling efficiency of compressed sensing is extremely valuable for “measurement constrained” applications. Problems can be measurement constrained for many reasons, such as the cost of pixels (e.g., IR imaging), bottlenecks in the throughput of high-speed imaging, and limited spatial access to the sample (e.g., minimally invasive microendoscopy).
The term multiplexing, an important concept borrowed from telecommunications where signals are combined to share a scarce resource, will be used often throughout the text. In a very similar manner, we want information from the sample to share the scarce resource of limited measurements. As we will discuss later, pseudo-random multiplexing of signals is a very efficient way to compress them in the analog domain before sampling.
We organized the rest of this tutorial into three sections: Section 2 covers compressed sensing theory and algorithms, Section 3 focuses on optical technologies for multiplexing optical signals, and finally Section 4 discusses some applications that make use of both the compressed sensing algorithms and the multiplexing technologies to solve real problems in optical sensing. This last section is not intended to be an exhaustive review but a collection of selected works that the authors find instructive. There are many excellent introductory tutorials on compressed sensing [1,2] and its applications to imaging [3] as well as books [4,5] with an emphasis on mathematically rigorous explanations. In our tutorial, we cover compressed sensing theory with an emphasis on intuitive understanding followed by a discussion on commonly used photonic tools and applications. We assume linear algebra knowledge at the level of [6], and although not required, signal processing knowledge at the level of [7] would be useful.
2. COMPRESSED SENSING
A. Notation
Matrices and vectors are denoted with boldface uppercase and lowercase letters, respectively. The vector norm is a measure of its length—a generalization of the Euclidean distance. The general ${l_p}$ norm of vector ${\textbf x}$ is defined as
${\| {\textbf x} \|_p} = {\left({\sum\limits_{i = 1}^N | {x_i}{|^p}} \right)^{1/p}}$
for $p \ge 1$, where the popular Euclidean distance corresponds to $p = 2$. For $p \lt 1$, the triangle inequality is not satisfied, and therefore ${\| \cdot \|_p}$ does not satisfy the definition of a norm. Loss functions based on norms are convex and therefore much easier to optimize. While not considered a norm, the special case of $p \to 0$ is known as the ${l_0}$ “pseudo-norm” ${\| {\textbf x} \|_0}$, which yields the number of nonzero elements in ${\textbf x}$. A vector ${\textbf x}$ is said to be $S$-sparse (${\textbf x} \in {\Sigma _S}$) when it has no more than $S$ nonzero entries, i.e., ${\| {\textbf x} \|_0} \le S$, where the number of nonzero elements is often much smaller than the signal dimension $N$, i.e., $S \ll N$. Optimization problems will be posed in the following form:
${\hat{\textbf{x}}} = \mathop{\rm argmin}\limits_{\textbf x} f({\textbf x}),$
where argmin refers to the argument that minimizes the expression to its right. In other words, the value of ${\textbf x}$ that minimizes the cost function $f({\textbf x})$ is found and set to the signal estimate ${\hat{\textbf{x}}}$.
B. Sampling
Natural signals of interest have a finite extent in time (or space) and possess a minimum feature size and thus finite bandwidth in the Fourier domain. For example, the spatial bandwidth of free-space optics is limited by their finite aperture. According to the Nyquist–Shannon sampling theorem, a signal with bandwidth $W$ in the frequency domain can be sufficiently sampled at equal intervals of $\delta t = \frac{1}{{2W}}$, first proven by Shannon and attributed to Nyquist. This makes intuitive sense as Shannon explains, “if $f(t)$ contains no frequencies higher than $W$, it cannot change to a substantially new value in a time less than one-half cycle of the highest frequency” [8]. Equivalently, $N = \frac{T}{{\delta t}} = 2TW$ is the total number of independent measurements needed to specify a signal with finite extent $T$, which is also known as the time-bandwidth product (TBP) or space-bandwidth product (SBP) in the optics literature. TBP and SBP are powerful concepts for describing sensing systems and can be used to describe either the optical signal or the optical system of interest [9]. For example, for imaging systems a phenomenon of interest may possess a certain minimum spatial feature size and maximum spatial extent determining the SBP of the signal itself. Furthermore, the SBP of the optical system used to capture this phenomenon may be constrained by the resolution and field of view of the physical optics or by the number of pixels on the sensor. Ultimately the goal for the imaging hardware is to match or exceed the SBP of the signal of interest.
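The consequence of violating the Nyquist limit can be illustrated numerically: the samples of a tone above the Nyquist frequency are indistinguishable from those of a lower-frequency alias. The specific frequencies below are illustrative assumptions (a minimal numpy sketch):

```python
import numpy as np

fs = 10.0                                    # sampling rate in Hz; Nyquist limit is fs/2 = 5 Hz
n = np.arange(20)                            # sample indices
x_tone  = np.cos(2 * np.pi * 7.0 * n / fs)   # 7 Hz tone, above the Nyquist limit
x_alias = np.cos(2 * np.pi * 3.0 * n / fs)   # 3 Hz tone (the alias, since |7 - 10| = 3)

# The two tones produce identical samples, so no algorithm can tell them apart.
indistinguishable = np.allclose(x_tone, x_alias)
```

This is why the anti-aliasing filter discussed in Section 1 must precede sampling: once the samples are taken, the information distinguishing the two tones is gone.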
We will group sampling strategies into three categories: oversampling, critical sampling (Nyquist), and undersampling (compressed sensing). One way to visualize these strategies in a general way is to use set diagrams as shown in Fig. 2, where mappings from a set of measurements to a set of signals are drawn. For example, a signal could be an image and the measurement its Fourier transform. In the critical sampling case, it is straightforward to go from a measurement ${M_i}$ to a signal ${S_i}$. For our example, the mapping is a Fourier transform, and, for critical sampling, a unique inverse transform exists. In an ideal world, critical sampling would be sufficient; however, in the presence of noise, multiple measurements might be necessary to calculate signals from measurements in a robust way. That leads to the oversampled case as shown in Fig. 2(b), where the same signal results in different measurements. Here simple methods such as averaging or ridge regression can be used to reduce the effects of noise.
The undersampled case is of specific interest here because it corresponds to the conditions of compressed sensing. Inversion for the undersampled case seems impossible at first glance where multiple signals map to the same measurements as shown in Fig. 2(c). However, if we could incorporate prior information about signals of interest, then it might be possible to invert the measurements as shown in Fig. 2(d). Specifically, given measurement ${M_1}$ or ${M_2}$ and knowledge that the signal of interest lies in some subset shown in red, we can infer that the signal that led to that particular measurement was the signal ${S_2}$ or ${S_3}$, respectively. As one example, an oracle who knows the position of the nonzero elements of a sparse vector might be able to convert an underdetermined set of linear equations to an overdetermined one, as will be discussed later.
Measurements should be maximally informative, which implies orthogonality in linear algebra terms. However, there are infinitely many ways to construct linearly independent measurements, and the specific choice of measurement basis affects system performance significantly. For example, given a single photodetector, a scene can be imaged by scanning the detector pointwise across the entire image as shown in Fig. 3(a), which is also known as raster scanning. Alternatively, the scene can be blocked using a complete basis (e.g., Hadamard), and the light from the entire image can be integrated by the photodetector as shown in Fig. 3(b). Basis scan is comparable to raster scan, where the number of measurements is equal to the reconstructed signal dimension and the image can be reconstructed using a direct inverse transform. For compressed sensing, the scene is often sampled using a pseudo-random pattern as shown in Fig. 3(c). As will be shown, the latter approach is often more efficient, requiring fewer measurements to reconstruct the signal.
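The pseudo-random measurement process can be emulated numerically. The sketch below is a toy model rather than a description of actual single-pixel hardware; the scene size, mask statistics, and oversampling factor are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64    # number of "pixels" in the vectorized scene
m = 96    # number of patterned measurements (oversampled here)

x = rng.random(n)                                   # vectorized scene intensities
A = rng.integers(0, 2, size=(m, n)).astype(float)   # one pseudo-random binary mask per row

y = A @ x   # each measurement integrates the light passed by one mask

# With m >= n and full column rank, least squares recovers the scene exactly.
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
```

In the compressed regime ($m \lt n$) this least-squares inversion fails, which is where the sparse recovery methods of the following sections come in.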
Linear algebra is a powerful tool in untangling the image from the measurements. The imaging process in this case can be converted to a linear algebra problem by vectorizing the images and the projected patterns, resulting in the following linear equation:
${\textbf y} = {\textbf A}{\textbf x},$
where ${\textbf A}$ is known as the sensing matrix of size $m \times n$, ${\textbf x}$ is the vectorized image of size $n \times 1$ (e.g., $n$ is the number of image pixels), and ${\textbf y}$ is the vector of measurements of size $m \times 1$, where $m$ is the number of measurements. Vectorizing an image involves taking each column or row of an image and stacking them to form a vector. Following our pseudo-random measurement example, each row of the measurement matrix ${\textbf A}$ would be a pseudo-random pattern that multiplies the column vector ${\textbf x}$ and results in a single measurement that goes into the corresponding row in ${\textbf y}$. The case of critical sampling corresponds to $m = n$ and can be solved by ${\textbf x} = {{\textbf A}^{- 1}}{\textbf y}$ if the matrix inverse exists. However, the sensing matrix is not square for the oversampled or the undersampled case, and therefore its inverse does not exist. This problem can be circumvented by multiplying both sides of the equation by its transpose ${{\textbf A}^T}$,
${{\textbf A}^T}{\textbf y} = {{\textbf A}^T}{\textbf A}{\textbf x}.$
Since ${{\textbf A}^T}{\textbf A}$ is square and can be inverted if $|{{\textbf A}^T}{\textbf A}| \ne 0$, we can solve for the image
${\hat{\textbf{x}}} = {({{\textbf A}^T}{\textbf A})^{- 1}}{{\textbf A}^T}{\textbf y}.$
C. Sparsity
Sparse signals are compressible since only a few nonzero elements need to be stored to accurately represent the signal. Most natural signals are sparse in some domain and therefore can be compressed in the digital domain (e.g., JPEG for image compression) with very little loss in signal quality. Compressed sensing advocates that, since the vast majority of signals are compressible, compression prior to analog-to-digital conversion would make the data collection process more efficient. A more efficient data collection process would imply either (i) fewer measurements can be used to reconstruct a signal or (ii) the same number of measurements can be used to extract more information about the signal.
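The notion of compressibility can be made concrete with a toy example using the norms of Section 2.A. The power-law coefficient decay below is an illustrative assumption standing in for the transform coefficients of a natural signal:

```python
import numpy as np

# A compressible signal: its coefficients (in some basis) decay rapidly in magnitude.
c = 1.0 / np.arange(1, 1001) ** 2         # power-law decaying coefficients

S = 10                                    # keep only the S largest coefficients
c_S = np.zeros_like(c)
idx = np.argsort(np.abs(c))[-S:]
c_S[idx] = c[idx]                         # best S-sparse approximation

l0 = np.count_nonzero(c_S)                # l0 "pseudo-norm" of the approximation
rel_err = np.linalg.norm(c - c_S) / np.linalg.norm(c)
```

Keeping 1% of the coefficients (`S = 10` of 1000) leaves a relative ${l_2}$ error below a few percent, mirroring the wavelet example of Fig. 1(a).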
Some signals are naturally sparse in the measurement domain, such as stars in the night sky. Clearly, raster scanning these scenes is inefficient since most of the time our detector would measure zero. For other signals, a sparse domain of representation can be realized by applying a sparsifying transformation such as the DCT, wavelet transform, or basis vectors learned from training data. With a sparsifying transformation, we have
${\textbf y} = {\textbf A}{\boldsymbol \Psi}{\boldsymbol \alpha},$
where ${\textbf A}$ is the $m \times n$ sampling matrix (e.g., pseudo-random binary pattern), ${\boldsymbol \Psi}$ is the $n \times n$ sparsifying transformation (e.g., DCT), and ${\boldsymbol \alpha}$ is the $n \times 1$ sparse representation of our signal. In this case, the sensing matrix becomes ${{\textbf A}^\prime} = {\textbf A}{\boldsymbol \Psi}$, and the linear problem becomes ${\textbf y} = {{\textbf A}^\prime}{\boldsymbol \alpha}$. Once a sparse solution is found, the original signal can be recovered using ${\textbf x} = {\boldsymbol \Psi}{\boldsymbol \alpha}$. Therefore, for any signal sparse in some arbitrary domain, we can always write a linear matrix-vector equation of the form Eq. (8). However, solving Eq. (8) is equivalent to solving Eq. (3); therefore, we will simply refer to ${\textbf x}$ as a sparse vector in the following sections without any loss of generality. The inverse problem of recovering ${\textbf x}$ from a small set of measurements ${\textbf y} = {\textbf A}{\textbf x}$ involves two critical questions: (i) how should we design the sensing matrix ${\textbf A}$ to preserve the essential information in ${\textbf x}$? (ii) how can we recover ${\textbf x}$ from measurements ${\textbf y}$? In this section, we summarize a number of desirable properties that ${\textbf A}$ should have and discuss popular algorithms for signal recovery. We delay the photonic implementation of such operations to Section 3 and applications to Section 4.
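The construction ${{\textbf A}^\prime} = {\textbf A}{\boldsymbol \Psi}$ can be sketched numerically with an orthonormal DCT as the sparsifying basis. The dimensions, sparsity level, and Gaussian sampling matrix below are illustrative assumptions:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; rows are the cosine basis functions."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

rng = np.random.default_rng(0)
n, m, S = 32, 16, 3
Psi = dct_matrix(n).T                  # columns of Psi are the sparsifying basis vectors

alpha = np.zeros(n)                    # S-sparse code in the DCT domain
alpha[rng.choice(n, S, replace=False)] = rng.standard_normal(S)

x = Psi @ alpha                        # signal, dense in the sample domain
A = rng.standard_normal((m, n))        # pseudo-random sampling matrix
A_prime = A @ Psi                      # effective sensing matrix
y = A @ x                              # the same measurements satisfy y = A' alpha
```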
At first glance, the problem ${\textbf y} = {\textbf A}{\textbf x}$ where ${\textbf A}$ has more columns than rows is underdetermined and, hence, has an infinite number of solutions. Enforcing sparsity on the solution ${\textbf x}$ is a form of regularization, making the problem better posed and leading to our first optimization problem:
${\hat{\textbf{x}}} = \mathop{\rm argmin}\limits_{\textbf x} {\| {\textbf x} \|_0}\quad {\rm subject\; to}\quad {\textbf y} = {\textbf A}{\textbf x}.$
Next, we should immediately observe that the null space of ${\textbf A}$, denoted as ${\cal N}({\textbf A}) = \{{\textbf z}\;|\;{\textbf A}{\textbf z} = {\textbf 0}\}$, holds the key to the unique recovery of ${\textbf x}$ because both ${\textbf x}$ and ${\textbf x} + {\textbf z}$ are possible solutions to ${\textbf y} = {\textbf A}{\textbf x}$. Moreover, since we look for sparse solutions as in Eq. (9), unique sparse recovery can be successful as long as the null space of ${\textbf A}$ does not contain any sparse vectors. More precisely, ${\textbf A}$ uniquely represents all $S$-sparse signals ${\textbf x}$ if and only if ${\cal N}({\textbf A})$ contains no vector in ${\Sigma _{2S}}$, the set of $2S$-sparse vectors. This is equivalent to the fact that any random subset of $2S$ columns of ${\textbf A}$ should be linearly independent—a critical property that we strive to achieve in the design of good sensing matrices in Section 3.
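For small problems, the $2S$-column condition can be checked by brute force. The sketch below uses illustrative dimensions; the check is combinatorial and quickly becomes infeasible at realistic sizes, which motivates the easier-to-verify properties discussed next:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
m, n, S = 6, 10, 2
A = rng.standard_normal((m, n))   # random sensing matrix

# Unique recovery of every S-sparse signal requires every set of
# 2S columns of A to be linearly independent.
all_independent = all(
    np.linalg.matrix_rank(A[:, list(cols)]) == 2 * S
    for cols in combinations(range(n), 2 * S)
)
```

For a Gaussian random matrix, any $2S \le m$ columns are linearly independent with probability one, which is one reason random matrices make good sensing matrices.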
D. Restricted Isometry Property and Mathematical Guarantee
While the null-space property of ${\textbf A}$ is both necessary and sufficient for establishing guarantees of successful sparse recovery, these guarantees do not account for noise, and the signals of interest ${\textbf x}$ might only be approximately $S$-sparse. The pioneering work by Candès and Tao in [10] introduces a crucial property of the matrix ${\textbf A}$ called the restricted isometry property (RIP), defined as follows: a matrix ${\textbf A}$ satisfies RIP of order $2S$ if there exists a bound $0 \lt {\delta _{2S}} \lt 1$ such that
$(1 - {\delta _{2S}})\| {\textbf x} \|_2^2 \le \| {\textbf A}{\textbf x} \|_2^2 \le (1 + {\delta _{2S}})\| {\textbf x} \|_2^2\quad {\rm for\; all}\;{\textbf x} \in {\Sigma _{2S}}.$
In fact, even with an approximately $S$-sparse signal and noisy measurements ${\textbf y} = {\textbf A}{\textbf x} + {\textbf w}$, where ${\textbf w}$ is an additive noise vector, the following generalized optimization also yields robust recovery:
${\hat{\textbf{x}}} = \mathop{\rm argmin}\limits_{\textbf x} {\| {\textbf x} \|_1}\quad {\rm subject\; to}\quad {\| {{\textbf y} - {\textbf A}{\textbf x}} \|_2} \le \epsilon,$
where $\epsilon$ is an upper bound on the noise power ${\| {\textbf w} \|_2}$.
We can interpret Eq. (10) as saying that ${\textbf A}$ approximately preserves the distance between any pair of $S$-sparse vectors $\{{\textbf u},{\textbf v}\}$ (where we can interpret ${\textbf x}$ as ${\textbf u} - {\textbf v}$). This close relationship between RIP and the Johnson–Lindenstrauss lemma points us to construction of robust matrices ${\textbf A}$’s as well-understood dimensionality-reduction operators [12]. At closer inspection, one can observe that RIP also has a tight relationship with the aforementioned null-space property of ${\textbf A}$: any random subset of $2S$ columns of ${\textbf A}$ yields an approximate isometry, and this degree of isometry ${\delta _{2S}}$, as expected, will play an important role in the quality of recovery. Therefore, RIP leads to fundamental implications on recovery robustness with the presence of measurement noise in ${\textbf y}$ as well as with an approximately $S$-sparse signal ${\textbf x}$, establishing the following powerful guarantee: when the sensing matrix ${\textbf A}$ satisfies the RIP of order $2S$ with ${\delta _{2S}} \lt \sqrt 2 - 1$, then solving Eq. (12) yields a solution whose deviation from the ground truth ${\textbf x}$ is bounded by the noise power ${\| {\textbf w} \|_2}$ and the error of the best $S$-sparse approximation to ${\textbf x}$. In other words, the solution ${\hat{\textbf{x}}}$ to Eq. (11) obeys
${\| {{\hat{\textbf{x}}} - {\textbf x}} \|_2} \le {C_0}\frac{{{{\| {{\textbf x} - {{\textbf x}_S}} \|}_1}}}{{\sqrt S}} + {C_1}\epsilon,$
where ${{\textbf x}_S}$ denotes the best $S$-sparse approximation to ${\textbf x}$ and ${C_0}$, ${C_1}$ are small constants.
E. Coherence, the Gershgorin Disc Theorem, and the Benefit of Random Sensing Matrices
Although RIP has nice mathematical properties and guarantees, it is difficult to evaluate in practice since its evaluation is a combinatorial problem. Coherence of a sensing matrix is much easier to calculate and can often elucidate the trade-offs when designing sampling strategies for photonic applications. For successful recovery, knowing the support of ${\textbf x}$ (i.e., positions of nonzero elements) as shown in Fig. 4 is not enough. We also need to guarantee that the following subspace problem is invertible as well [14]:
${\textbf y} = {{\textbf A}_S}{{\textbf x}_S},$
where ${{\textbf A}_S}$ contains columns corresponding to the nonzero elements of ${\textbf x}$. Therefore, the eigenvalues of ${\textbf A}_S^T{{\textbf A}_S}$ cannot be zero as that would imply $\textit{det}({{\textbf A}_S^T{{\textbf A}_S}}) = 0$. Intuitively, we can project the signal ${{\textbf x}_S}$ onto the eigenvectors of ${\textbf A}_S^T{{\textbf A}_S}$ and observe that an eigenvalue of zero would multiply the signal’s corresponding coefficient with zero and information about that component would be irrecoverably lost. Furthermore, a small eigenvalue is not desirable as it might push the signal energy in that mode below the noise floor. Estimating the eigenvalues of ${\textbf A}_S^T{{\textbf A}_S}$ proves invaluable in assessing the reconstruction success.
The Gershgorin disc theorem provides an easy and intuitive method for bounding the eigenvalues of nearly diagonal matrices. For diagonal matrices, the eigenvalues are the diagonal elements. For matrices that are nearly diagonal, we would expect the eigenvalues to be close to the diagonal values. The Gershgorin disc theorem states that indeed an eigenvalue of a matrix is located in a disc centered around the diagonal element with radius equal to the sum of absolute values of the off-diagonal elements [15] as shown graphically in Fig. 6. It is worth noting that the off-diagonal elements could have been picked from the columns instead of the rows since the eigenvalues of a matrix and its transpose are the same.
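The theorem is easy to verify numerically on a random Gram matrix (a minimal sketch; the matrix dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 6)) / np.sqrt(200)   # tall random matrix, nearly orthonormal columns
G = A.T @ A                                         # Gram matrix, close to diagonal

centers = np.diag(G)                                # disc centers: diagonal elements
radii = np.sum(np.abs(G), axis=1) - np.abs(centers) # disc radii: off-diagonal row sums
eigenvalues = np.linalg.eigvalsh(G)

# Gershgorin: every eigenvalue lies in the union of discs |z - center_i| <= radius_i.
all_in_discs = all(np.any(np.abs(e - centers) <= radii) for e in eigenvalues)
```

Because the columns of this random `A` are nearly orthonormal, the disc centers cluster near one with small radii, so all eigenvalues are safely bounded away from zero.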
The Gershgorin disc theorem might not always give a tight bound (especially if off-diagonal elements are not small), but it is useful in finding the trade-offs in a compressive sensing scheme. The matrix on the left-hand side of the inverse problem measures the overlap of basis functions:
$({\textbf A}_S^T{{\textbf A}_S})_{ij} = \langle {{\textbf a}_i},{{\textbf a}_j}\rangle,$
where ${{\textbf a}_i}$ is the $i$th column of ${{\textbf A}_S}$. For unit-norm columns, the diagonal elements equal one, and each off-diagonal element is bounded in magnitude by the coherence $\mu({\textbf A}) = \mathop {\max}\limits_{i \ne j} |\langle {{\textbf a}_i},{{\textbf a}_j}\rangle|$, so the Gershgorin discs are centered at one with radii of at most $(S - 1)\mu({\textbf A})$, confining the eigenvalues to the interval $1 \pm (S - 1)\mu({\textbf A})$.
Coherence is much easier to calculate in practice for photonic systems compared to the RIP. For example, for a free-space imaging system, $\mu$ would be the overlap integral between point spread functions (PSFs) separated by the sampled distances. If the PSF is very broad, then the overlap integral and therefore $\mu$ would be large, which would reduce resolution.
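Computing the coherence of a sensing matrix takes only a few lines (a sketch using the standard definition of mutual coherence as the largest normalized inner product between distinct columns):

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct normalized columns of A."""
    An = A / np.linalg.norm(A, axis=0)   # normalize each column to unit length
    G = np.abs(An.T @ An)                # pairwise column overlaps
    np.fill_diagonal(G, 0.0)             # ignore self-overlaps
    return float(G.max())
```

An orthogonal basis such as the identity has zero coherence, while a random Gaussian matrix has small but nonzero coherence that shrinks as the number of rows grows.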
It should be noted that since coherence or the Gershgorin bound is not very tight, results derived from them should be treated as qualitative. In fact, tighter bounds can be obtained using RIP or the null-space property of the sensing matrix. However, it is satisfying to clearly see that a lower sparsity level or lower coherence yields better reconstruction. The sparsity level can be reduced by obtaining better basis functions, possibly through machine learning techniques, whereas the coherence level $\mu({\textbf A})$ can be reduced through a better sensing matrix design.
Pseudo-random matrices are often employed since they have low coherence with high probability and they yield elegant mathematical proofs and bounds. For random matrices, we can obtain robust recovery when the number of measurements is sufficiently large [4]:
$m \ge C\,S\log(N/S),$
where $C$ is a coherence-dependent constant. This is a significant improvement over the coherence treatment above. For this reason, the goal for optical hardware is typically to implement pseudo-random multiplexing of the signal of interest, as detailed in Section 3. It is also worth noting that small coherence, a large number of measurements, and sparse signals all aid reconstruction success. In fact, Donoho and Tanner show that noise-free reconstruction success has abrupt phase transitions that vary as a function of the undersampling fraction and sparsity fraction [17]. This behavior is shown to hold generally for a larger number of cases and can provide guidance on achievable compression ratios. In the authors’ experience, the value of $C$ is roughly 3–4, which means for most practical applications of compressed sensing the sparsity level should be below 5%–10% to claim significant benefits over Nyquist sampling.
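As a rough numerical illustration of this measurement budget, with the constant $C = 4$ taken as an assumption from the upper end of the 3–4 range quoted in this section:

```python
import numpy as np

# Rough measurement budget for random sensing: m >= C * S * log(N / S).
N = 10_000                    # signal dimension (Nyquist samples)
S = int(0.05 * N)             # assumed 5% sparsity level
C = 4                         # assumed constant, upper end of the quoted 3-4 range
m = int(np.ceil(C * S * np.log(N / S)))
compression_ratio = m / N     # fraction of Nyquist measurements needed
```

For these assumed numbers, roughly 60% of the Nyquist measurement count suffices; the benefit grows quickly as the sparsity fraction drops, consistent with the 5%–10% rule of thumb above.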
F. Sparse Recovery Algorithms
In this section, we present some of the most common algorithms and techniques used for sparse recovery. Some are presented in pseudo-code form as well (Algorithms 1–4), with consistent notation to highlight the differences between the algorithms.
1. Greedy Algorithms
Greedy recovery strategies have been known and developed long before compressed sensing theory [18]. Algorithms in this direction attempt to identify the support ${\cal S}$ of the sparse signal and then employ the pseudo-inverse of the corresponding set of columns in ${\textbf A}$ to obtain the magnitude of the sparse codes, as demonstrated previously in Fig. 4. The optimization problem here can be posed as
${\hat{\textbf{x}}} = \mathop{\rm argmin}\limits_{\textbf x} {\| {{\textbf y} - {\textbf A}{\textbf x}} \|_2}\quad {\rm subject\; to}\quad {\| {\textbf x} \|_0} \le S.$
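As a concrete example of the greedy approach, a minimal orthogonal matching pursuit (OMP) can be written as follows. This is an illustrative sketch rather than a reproduction of the pseudo-code in Algorithms 1–4:

```python
import numpy as np

def omp(A, y, S):
    """Orthogonal matching pursuit: greedily build the support, then least-squares fit."""
    n = A.shape[1]
    residual = y.copy()
    support = []
    for _ in range(S):
        # Select the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Project y onto the selected columns and update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x_hat = np.zeros(n)
    x_hat[support] = coeffs
    return x_hat
```

For a noiseless measurement of a sufficiently sparse signal through a low-coherence random matrix, this procedure typically recovers the signal exactly.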
2. Linear Programming
Convex relaxation of the original optimization problem from ${\ell _0}$ penalty to ${\ell _1}$ can make reconstruction significantly faster. Moreover, ${\ell _1}$-minimization is guaranteed to find the sparsest solution when the level of coherence $\mu({\textbf A})$ is low enough, as discussed in previous sections. Linear programming is a mature optimization technique available in many specialized software packages, and therefore its inner workings will not be discussed here [22]. Its canonical form is
$\mathop {\min}\limits_{\textbf x} {{\textbf c}^T}{\textbf x}\quad {\rm subject\; to}\quad {\textbf A}{\textbf x} = {\textbf b},\;{\textbf x} \ge {\textbf 0},$
to which the ${\ell _1}$-minimization problem can be cast, for example, by splitting ${\textbf x}$ into its positive and negative parts.
3. Iterative Convex Optimization
In practice, when the dimension of the problem becomes large, we typically convert the constrained optimization problem in Eqs. (11) or (12) into the following unconstrained convex optimization problem:
${\hat{\textbf{x}}} = \mathop{\rm argmin}\limits_{\textbf x} \frac{1}{2}\| {{\textbf y} - {\textbf A}{\textbf x}} \|_2^2 + \lambda {\| {\textbf x} \|_1},$
where the regularization parameter $\lambda$ trades data fidelity against sparsity.
The simplest strategy in this direction is the iterative hard thresholding (IHT) algorithm [23], described in Algorithm 3. Starting from the trivial signal estimate ${\hat{\textbf{x}}} = {\textbf 0}$, the algorithm iterates a gradient-descent step followed by hard thresholding (as shown on the left of Fig. 7) to enforce the sparsity regularization until a convergence criterion is met. In fact, we can even classify IHT under the greedy pursuit category, where the hard thresholding step at each iteration reveals the $S$-sparse support of the current estimate.
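A minimal IHT sketch follows; the conservative step-size rule and fixed iteration count are simplifying assumptions rather than the exact tuning of Algorithm 3:

```python
import numpy as np

def iht(A, y, S, iters=500, step=None):
    """Iterative hard thresholding for y = Ax with an S-sparse x."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative step for stable gradient descent
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x + step * (A.T @ (y - A @ x))       # gradient-descent step
        keep = np.argsort(np.abs(g))[-S:]        # hard threshold: keep the S largest entries
        x = np.zeros_like(g)
        x[keep] = g[keep]
    return x
```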
A natural improvement in this direction is the iterative shrinkage thresholding algorithm (ISTA) [24] as depicted in Algorithm 4, which borrows powerful ideas from past signal denoising efforts. ISTA consists of two steps: a gradient-descent step and a proximal shrinkage step via soft thresholding of the result obtained from the gradient step to generate sparsity. The most popular ISTA-style algorithm is a modified fast version named FISTA [25]; it accelerates ISTA convergence significantly by adding a momentum term with adaptive step size in each of its iterations. Another well-known algorithm with a different approach to speeding up convergence is the two-step ISTA coined TwIST [26], which adds an element-wise reweighted scheme and a two-step-back update to the shrinkage thresholding step for enhanced robustness.
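The two ISTA steps translate directly into code. This is a minimal sketch: the $1/L$ step size and fixed iteration count are simplifying assumptions, and none of the FISTA/TwIST accelerations are included:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm: shrink magnitudes toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, iters=500):
    """ISTA for min_x 0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the quadratic gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad_step = x + (A.T @ (y - A @ x)) / L    # gradient-descent step
        x = soft_threshold(grad_step, lam / L)     # proximal shrinkage step
    return x
```

Unlike hard thresholding, the soft threshold shrinks every surviving coefficient by `lam / L`, which introduces a small bias but makes each iteration the exact proximal step of the ${l_1}$ penalty.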
Alternatively, we can opt to convert the constrained optimization problem in Eq. (11) to its unconstrained form by introducing additional Lagrange multipliers ${{\boldsymbol \lambda}^T}({\textbf y} - {\textbf A}{\textbf x})$ and then strengthening these constraints further via an additional penalty term $\frac{\gamma}{2}\| {{\textbf y} - {\textbf A}{\textbf x}} \|_2^2$. Such a strategy is known as the augmented Lagrangian technique [27], and the overall cost function becomes
$L({\textbf x},{\boldsymbol \lambda}) = {\| {\textbf x} \|_1} + {{\boldsymbol \lambda}^T}({\textbf y} - {\textbf A}{\textbf x}) + \frac{\gamma}{2}\| {{\textbf y} - {\textbf A}{\textbf x}} \|_2^2.$
Probabilistic optimization routines, a large and important group of algorithms, are largely omitted from this tutorial due to space constraints. One example approach is from [29], where shot-noise-limited measurements are modeled using a Poisson model. The log-likelihood is then optimized along with ${l_1}$ and TV-norm penalty terms using separable quadratic approximations.
4. Local versus Global Recovery
In practice, there are at least two compelling reasons why one would prefer local patch- or block-based recovery over global image-based recovery. The first reason involves efficiency: it is much simpler and faster to recover a small image patch or region, especially in a high-resolution setting. More importantly, if we follow the well-grounded auto-regressive image model in the compression community [31], then the larger the patch size, the less homogeneous (the more complex) it becomes, which increases $S$ making the signal less sparse and rendering sparse recovery less effective.
An immediate solution to this local-versus-global trade-off is to rely on multi-resolution representations such as the wavelet transform or the redundant wavelet packet [32] as the sparsifying operator ${\boldsymbol \Psi}$ in the recovery process. Alternatively, if sensing is performed locally (as it often is, as discussed in the next section), we can recover any random patch of any random size at any location in the image without conforming to a certain grid, since overlapped recovered regions can simply be averaged to provide even more robust recovery and to prevent undesired blocking artifacts. One example of such a local–global recovery strategy is depicted in Fig. 8, where the iterative optimization algorithm seeks the sparsest set of DCT coefficients of overlapped patches that are consistent with the CS measurements [33,34].
5. Deep Neural Networks
Recent works in compressed sensing and sparse coding have focused on the application of new tools and ideas from deep neural networks. Initial models use a pretrained deep neural network (DNN) ${{\cal G}_{\boldsymbol \theta}}$ (which can be a variational auto-encoder, a generative adversarial network, etc.) as the structural constraint instead of sparsity [35,36]. This generator maps a latent representation or low-dimensional sparse code ${\textbf z}$ to the signal space: ${\textbf x} = {{\cal G}_{\boldsymbol \theta}}({\textbf z})$. Instead of requiring sparse signals, the deep network ${{\cal G}_{\boldsymbol \theta}}$ implicitly constrains the output ${\textbf x}$ to belong to a low-dimensional manifold via its architecture and the weights adapted from the training data. Effectively, the choice of training data provides the prior knowledge that limits the set of possible solutions shown in Fig. 2(d). In other words, the network can learn the mapping that generates the high-dimensional signal ${\textbf x}$ in ${{\cal R}^N}$ from a latent low-dimensional sparse code ${\textbf z}$ in ${{\cal R}^S}$. Often, the sensing matrix can be co-trained with the optimizer here. A minimization process similar to that in compressed sensing is then employed for reconstruction:
${\hat{\textbf{z}}} = \mathop{\rm argmin}\limits_{\textbf z} \| {{\textbf y} - {\textbf A}{{\cal G}_{\boldsymbol \theta}}({\textbf z})} \|_2^2,$
with the final signal estimate ${\hat{\textbf{x}}} = {{\cal G}_{\boldsymbol \theta}}({\hat{\textbf{z}}})$.
Due to the nonlinear complex nature of deep neural networks, the cost function in Eq. (26) is highly nonconvex and the optimization problem is intractable [35]. A well-behaved approximation strategy involves gradient descent starting from randomly sampled points in the a priori signal distribution ${p_{\boldsymbol z}}({{\textbf z}_k})$ as follows:
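The restart-and-descend strategy can be illustrated with a toy example. Below, a fixed random linear map `W` stands in for the pretrained generator ${{\cal G}_{\boldsymbol \theta}}$ (a deliberate simplification so the gradient is available in closed form; a real generator would be a deep network, but the loop structure is the same): several latent initializations are drawn from the prior, each is refined by gradient descent on $\|{\textbf y} - {\textbf A}{\cal G}({\textbf z})\|^2$, and the lowest-cost solution is kept. All names and dimensions here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained generator: a fixed linear map W. A real G_theta
# would be a deep network, but the recovery loop below is the same in spirit.
N, S, M = 64, 4, 16                    # signal dim, latent dim, measurements
W = rng.standard_normal((N, S))        # "decoder" weights (assumed pretrained)
A = rng.standard_normal((M, N)) / np.sqrt(M)   # compressive sensing matrix

z_true = rng.standard_normal(S)
y = A @ (W @ z_true)                   # noiseless compressive measurements

def recover(y, A, W, n_restarts=5, n_steps=2000):
    """Minimize ||y - A G(z)||^2 over z by gradient descent with random
    restarts drawn from the latent prior; keep the lowest-cost solution."""
    AW = A @ W
    lr = 1.0 / np.linalg.norm(AW, ord=2) ** 2   # safe step size
    best_z, best_cost = None, np.inf
    for _ in range(n_restarts):
        z = rng.standard_normal(S)              # sample from the prior p_z
        for _ in range(n_steps):
            z += lr * AW.T @ (y - AW @ z)       # gradient step
        cost = np.linalg.norm(y - AW @ z)
        if cost < best_cost:
            best_z, best_cost = z, cost
    return best_z

x_hat = W @ recover(y, A, W)
```

With a linear generator the cost is convex and every restart converges; for a deep nonlinear ${{\cal G}_{\boldsymbol \theta}}$ the restarts are what make the nonconvex landscape tractable in practice.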
Another group of approaches takes inspiration from the existing recovery algorithms augmented by DNN components such as ISTA-Net inspired by the ISTA algorithm [37], AMP-Net inspired by the approximate message passing algorithm (AMP) [38], or ADMM-Net inspired by the alternating direction method of multipliers (ADMM) algorithm [28]. More recent work making use of attention mechanisms such as the transformer architecture is also gaining popularity [39]. A more in-depth review of the literature can be found in [40,41].
3. OPTICAL TECHNOLOGIES FOR COMPRESSED SENSING
In this section, we review some optical technologies often employed for multiplexing optical information for compressed sensing. Multiplexing information is critical for implementation of the sensing matrix ${\textbf A}$ and generating the measurement vector ${\textbf y}$ in the physical domain prior to sampling.
The optical technology used will have important implications for resolution and therefore the number of samples needed to represent the reconstructed signal. A common pitfall is to generate an oversampled representation of the signal, which would then overestimate the achieved compression ratio and therefore the performance of the compressed sensing system. The SBP or the TBP of the sensing waveform produced by the optical hardware and used to impart the sensing matrix will determine the number of pixels needed to appropriately represent the recovered signal. It should be noted that typically photonic compressed sensing hardware does not result in idealized matrices as discussed in Section 2 but in responses that have spatial (or temporal) extent as well as a bounded bandwidth as shown in Fig. 9. Features of the signal that are smaller than the minimum feature in this sensing waveform will not be recoverable, and for optimal sampling of the reconstructed signal the Nyquist rate is determined from the sensing waveform’s SBP or TBP. The number of compressive measurements needed can then be estimated from Eq. (17), which depends on the scene sparsity and the coherence. Another important consideration is the signal-to-noise ratio (SNR). For example, nonbinary signals or scenes with low contrast generally require measurements with higher SNR since the encoded information of interest will result in smaller changes between measurements. Additionally, systematic errors from nonidealities such as aberrations can be removed by in situ calibration of the photonic system, but this might not always be possible if the dimension of the reconstructed signal is very large (e.g., high pixel counts for imaging).
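Since the required number of measurements depends on the coherence between the sensing and sparsifying bases, it is useful to be able to compute that coherence numerically. A minimal sketch (function name illustrative), using the standard definition $\mu = \sqrt{N}\,\max_{i,j}|\langle \phi_i, \psi_j\rangle|$ over unit-norm rows of the sensing basis and unit-norm columns of the sparsifying basis:

```python
import numpy as np

def mutual_coherence(Phi, Psi):
    """mu(Phi, Psi) = sqrt(N) * max |<phi_i, psi_j>|, with rows of Phi and
    columns of Psi normalized to unit norm. mu ranges from 1 (maximally
    incoherent) to sqrt(N) (maximally coherent)."""
    N = Phi.shape[1]
    Phi = Phi / np.linalg.norm(Phi, axis=1, keepdims=True)
    Psi = Psi / np.linalg.norm(Psi, axis=0, keepdims=True)
    return np.sqrt(N) * np.max(np.abs(Phi @ Psi))

N = 32
spatial = np.eye(N)                            # point-sampling basis
fourier = np.fft.fft(np.eye(N)) / np.sqrt(N)   # Fourier basis

# Every point sample "sees" every frequency equally, so mu = 1.
print(mutual_coherence(spatial, fourier))
```

The spatial/Fourier pair achieves the minimum $\mu = 1$, whereas a basis measured against itself gives the maximum $\mu = \sqrt{N}$; lower coherence translates directly into fewer required compressive measurements.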
Recall that pseudo-random sensing matrices are extremely useful in the context of compressed sensing, and therefore the optical technologies generally focus on implementing pseudo-random multiplexing. These optical multiplexing technologies can be split into two broad categories: (i) passive multiplexing, which leverages static optical devices, and (ii) active multiplexing, where a dynamic optical device is leveraged and some electronic device control is typically needed.
A. Passive Multiplexing
1. Coded Aperture
Coded apertures block light with patterned opaque materials in order to multiplex intensity information onto a sensor, an example of which is shown in Fig. 10. One of the earlier examples comes from x-ray imaging, where lenses are impractical since materials have near-unity refractive indices at x-ray wavelengths, and bulky mirrors operated near grazing incidence are not desirable. Dicke [42] found that randomly distributed pinholes (50% open) can be used to efficiently image stars in the x-ray band by placing the pinholes between the object and image planes and using no further optics. Although a single pinhole can achieve imaging with high resolution (i.e., camera obscura), most of the light is lost, which reduces the SNR. With multiple pinholes, images from each hole are multiplexed on the measurement screen to increase SNR and subsequently demultiplexed digitally.
Initial designs had engineered patterns such as uniformly redundant arrays (URAs) [43] or modified URAs (MURAs) [44], which provided desirable correlation properties. We will first investigate the image formation process to understand why. Using the geometry in Fig. 10(a), an object intensity distribution $o(\xi ,\eta)$ (assuming a planar scene) and an aperture function $a({x^\prime},{y^\prime})$ result in the following measurement $p(x,y)$:
As the distance between the scene and the code increases ($b \to \infty$), the magnification factor asymptotically approaches unity ($m \to 1$). In that case, we have
Here, we reproduce the square MURA since it is often employed in coded aperture imaging. For a $p \times p$ aperture $A$ where $p$ is a prime number,
Pseudo-random binary arrays can also be used since they provide greater flexibility in code design. For example, MURAs for large $p$ have roughly 50% transparency, but the experimental setup might require a different ratio: a sensing problem with higher SNR and lower sparsity will favor lower transparency (fewer pinholes). Nevertheless, MURAs lead to computationally cheaper algorithms since the signals can be reconstructed with a simple correlation with the precomputed inverse, $G$. This is beneficial for imaging applications where the pixel count is large and inverting a matrix becomes infeasible.
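The quadratic-residue construction of the square MURA and its correlation decoder are short enough to sketch directly. The code below follows the standard Gottesman–Fenimore prescription: the first row is opaque, the first column (excluding the corner) is open, and the remaining cells are open when the row and column indices are both quadratic residues modulo $p$ or both non-residues.

```python
import numpy as np

def mura(p):
    """p x p MURA aperture (1 = open, 0 = opaque) for prime p, following the
    standard quadratic-residue construction of Gottesman and Fenimore."""
    qr = {(k * k) % p for k in range(1, p)}          # quadratic residues mod p
    C = np.array([1 if i in qr else -1 for i in range(p)])
    A = np.zeros((p, p), dtype=int)
    for i in range(1, p):
        A[i, 0] = 1                                  # first column open
        for j in range(1, p):
            A[i, j] = 1 if C[i] * C[j] == 1 else 0
    return A                                         # first row stays opaque

def mura_decoder(A):
    """Correlation decoder G (+1 where open, -1 where opaque, G[0,0] = +1);
    the periodic cross-correlation of A with G is a delta function."""
    G = 2 * A - 1
    G[0, 0] = 1
    return G
```

The delta-function cross-correlation is exactly what makes decoding a cheap correlation with the precomputed $G$ rather than a large matrix inversion.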
2. Diffusers and Random Microlens Arrays
Diffusers rely on the surface roughness of a transparent dielectric and the corresponding spatially random refraction for pseudo-randomly multiplexing scene information as shown in Fig. 11(a). Diffusers are in many ways similar to coded apertures but leverage spatial phase modulation rather than amplitude modulation to generate a complex amplitude pattern, known as a caustic pattern, at a distance from the diffuser. Caustic patterns are analogous to the cobweb-like intensity pattern formed on the bottom of a swimming pool on a sunny day due to the random water surface height. Importantly, unlike coded apertures, diffusers do not rely on light absorption and therefore can provide lower loss and better SNR. The simplest way to calibrate a diffuser setup is to operate far from the diffuser. In that case, a single point in the scene is sufficient to calibrate the entire system since any other point on the same plane simply shifts the caustic pattern. Furthermore, points at different depths simply change the magnification of the calibrated system response. Suppose a point on the optical axis leads to a caustic pattern $h(x,y)$; then any point shifted by $(\xi ,\eta ,z)$ from the measurement point will lead to the following system response [46]:
where ${m_z}$ and ${m_b}$ correspond to the aperture magnification at distances $z$ and $b$, respectively, as in the coded aperture case; note the similarities between the coded aperture and the diffuser response formation equations. Since the caustic pattern is measured some distance $b$ away from the diffuser, it needs to be properly scaled for a point at distance $z$. A more accurate approach to modeling is to recover the surface roughness profile through phase retrieval algorithms and experiments, followed by ray tracing [47]. There are many phase retrieval methods with extensive literature coverage [48–50], so we will not discuss them here.
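The practical consequence of the shift-invariance described above is that, for a single scene plane, the forward model collapses to a 2D convolution of the scene with the single calibrated caustic PSF, which is cheap to evaluate with FFTs. A minimal sketch (periodic boundary handling for brevity; depth changes would enter as a rescaled PSF):

```python
import numpy as np

def diffuser_forward(scene, psf):
    """Shift-invariant forward model for a diffuser camera: every in-plane
    scene point contributes a laterally shifted copy of the calibrated caustic
    PSF, so the measurement is a 2D convolution (FFT-based, periodic edges)."""
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(psf)))
```

This convolutional structure is what lets a single point-source calibration image serve as the system model for the entire plane.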
3. Scattering
Recovering shape information from acoustic or electromagnetic scattering data has a wide range of applications from geophysical surveying to imaging; it has therefore been extensively studied in the literature under the name inverse scattering [51] and will be omitted here. Scattering in compressed sensing has been used for two main applications: (i) to multiplex light information randomly or (ii) to create high bandwidth patterns for structured illumination. Ultimately, the SBP or TBP of the scattering process will constrain the compressed sensing system. Therefore, we will cover some statistical properties following the excellent treatment from Goodman [52], as such properties might be relevant either in optimally sampling the response of a scattering medium or in engineering the bandwidth of the structured illumination. Scattering approaches are similar to the use of diffusers but leverage the random interference of coherent sources transmitted through similar optics and the resulting speckle.
One interesting characteristic is the probability density function (PDF) of the intensity of a speckle pattern (i.e., if we binned the intensity values and counted frequencies in each bin for speckle, what would the histogram look like?), also referred to as “first order statistics” by Goodman [52]. The speckle pattern for coherent polarized light can be thought of as arising from a random walk on the complex plane $A(x,y,z) = \frac{1}{{\sqrt N}}\sum\nolimits_{k = 1}^N {a_k}{e^{i{\phi _k}}}$, where ${a_k}$ and ${\phi _k}$ are random variables. The amplitude function is normalized by $\sqrt N$ so that the intensity, which is proportional to $|A{|^2}$, is properly normalized. A random walk in the complex plane yields a Gaussian joint distribution for the real and imaginary parts of the complex amplitude
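The random phasor sum is easy to simulate, and the simulation reproduces the hallmark of fully developed speckle: a negative-exponential intensity PDF, $P(I) = \exp(-I/\langle I\rangle)/\langle I\rangle$, whose standard deviation equals its mean (unit contrast). A minimal numerical check (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_phasors, n_samples = 100, 20_000

# Random phasor sum: n_phasors unit-amplitude contributions with uniform
# random phases, normalized by sqrt(n_phasors) so that <I> = 1.
phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_samples, n_phasors))
amplitude = np.exp(1j * phases).sum(axis=1) / np.sqrt(n_phasors)
I = np.abs(amplitude) ** 2

# Fully developed speckle follows a negative-exponential intensity PDF,
# P(I) = exp(-I/<I>)/<I>, whose hallmark is unit contrast: std(I) = mean(I).
print(I.mean(), I.std())               # both close to 1
```

The unit contrast is what makes raw speckle such a strongly fluctuating, and therefore information-rich, sampling waveform.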
Another interesting characteristic is the auto-correlation function, which is referred to as “second order” statistical properties of speckle by Goodman [52]. For pseudo-random sampling of information, auto-correlation provides a measure of response sensitivity to changes to the input signal, which usually determines spectral resolution for spectroscopy, spatial resolution for imaging, and temporal resolution for RF sensing.
Here, we need to consider two geometries: (1) free-space geometry where speckle is measured some distance $z$ away from the scattering surface or (2) imaging geometry where speckle is imaged through a lens and is measured some distance $z$ away from the lens. The intensity auto-correlation functions calculated for these two example cases are very instructive:
- 1. For the free-space configuration with a uniformly illuminated square scattering area of side $L$,$${R_I}(\Delta x,\Delta y) = {\langle I\rangle ^2}\left({1 + {{\left| {{\rm sinc}\left({\frac{{L\Delta x}}{{\lambda z}}} \right){\rm sinc}\left({\frac{{L\Delta y}}{{\lambda z}}} \right)} \right|}^2}} \right),$$where $\Delta x$ and $\Delta y$ are the lateral separations on the observation plane.
- 2. For the imaging configuration with a circular lens of diameter $D$,$${R_I}(r) = {\langle I\rangle ^2}\left({1 + {{\left| {2\frac{{{J_1}\left({\frac{{\pi Dr}}{{\lambda z}}} \right)}}{{\frac{{\pi Dr}}{{\lambda z}}}}} \right|}^2}} \right),$$where ${r^2} = \Delta {x^2} + \Delta {y^2}$, ${J_1}$ is the Bessel function of the first kind, and $\lambda$ is the wavelength.
For either case, we can define a width parameter $\delta x = \frac{{\lambda z}}{L}$ or $\delta r = \frac{{\lambda z}}{D}$ that can be thought of as the average speckle width and is important for determining the speckle bandwidth and how to efficiently sample it. On the one hand, when speckle is projected onto the scene to sample it, the resolution of the system is set by the average speckle width. On the other hand, if scattering is used such that the signal forms a speckle response on the sensor, the average speckle width determines the optimal sampling width on the sensor. The statistical properties of blurred speckle are also important for compressed sensing; however, we will omit them here due to space constraints. A significantly more detailed treatment of speckle statistics can be found in [52].
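The scaling of the correlation width with aperture size can be checked numerically in one dimension: random phases across an aperture of $D$ grid samples produce, one FFT away, a speckle pattern whose intensity correlation width is a fixed fraction of $N/D$ samples, the discrete analog of $\delta x = \lambda z/L$. A sketch (grid sizes illustrative; averaged over realizations to suppress estimation noise):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, n_trials = 1024, 32, 200         # grid size, aperture width, averages

# Random phase across an aperture of D samples; the far field (one FFT away)
# is a fully developed speckle pattern with grains ~N/D samples wide.
R = np.zeros(N)
for _ in range(n_trials):
    pupil = np.zeros(N, dtype=complex)
    pupil[:D] = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, D))
    I = np.abs(np.fft.fft(pupil)) ** 2
    dI = I - I.mean()
    # Wiener-Khinchin: circular autocovariance of the intensity pattern.
    R += np.real(np.fft.ifft(np.abs(np.fft.fft(dI)) ** 2)) / N
R /= R[0]

# The correlation half-width tracks the average speckle width N/D
# (the discrete analog of delta_x = lambda * z / L).
hwhm = int(np.argmax(R < 0.5))
print(hwhm, N // D)
```

Doubling the aperture $D$ halves the correlation width, mirroring the $\delta x = \lambda z/L$ scaling used to set the sampling pitch on the sensor.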
4. Waveguides and Cavities
Integrated photonic devices such as chaotic cavities and multimode waveguides hold promise as miniaturized sensing devices and can be produced cheaply en masse by exploiting mature micro-electronic processing technologies. Multimode waveguides can impart random phase between the various propagating spatial modes over some distance and effectively yield speckle in both the spatial and temporal domains. Similarly, chaotic cavities can be designed to generate pseudo-random spectro-temporal behavior and time-domain speckle with even smaller footprints due to their reverberating nature. Chaotic cavities have many other applications such as in laser physics and thus have been extensively covered [58–61]; they are often termed dynamical billiards due to their analogous behavior [62].
Overall, these devices try to mimic the behavior from scattering in smaller form factors, so discussions for the speckle statistics above are also highly relevant here. For example, Guerit et al. [63] provide a comprehensive theoretical and experimental study of speckle arising from multicore fibers and find that under some mild assumptions, the radius of a speckle grain scales as $\lambda z/D$, where $\lambda$ is the wavelength, $z$ is the propagation distance, and $D$ is the fiber diameter, which is very similar to the scattering result above. Often, a large number of modes are required from these devices to approximate speckle behavior well, which is determined by size and refractive index contrast.
Figure 12 shows example devices that result in a speckle-like response by approximating a random walk on the complex plane and are used for compressed sensing. Figure 12(a) shows a scanning electron microscope (SEM) image of a ray chaotic cavity used for RF sensing, where the induced features and the chamfer lead to the ray chaotic behavior while the reverberant nature of the disc keeps losses low and the device size small [53]. Figure 12(b) shows an optical image of a multicore fiber used to generate speckle, also for compressive RF sensing [54]. Figure 12(c) shows an SEM image of an evanescently coupled multimodal Si spiral waveguide whose wavelength dependent response is used for on-chip spectroscopy [55]. Evanescently coupled silicon nitride spirals have also been proposed for compressive on-chip spectral sensing [64]. Figure 12(d) shows another integrated waveguide, which is used for RF sensing [56].
B. Active Multiplexing
1. Spatial Light Modulators
Spatial light modulators (SLMs) are actively controllable optical devices that allow for the spatially dependent amplitude or phase modulation of an optical wavefront. Effectively, SLMs are the active equivalent of coded apertures and diffusers. Furthermore, SLMs are often combined with, for example, Fourier transformation with a lens to multiplex the modulated spatial positions onto a single measurement position, e.g., in the case of the single pixel camera [65]. There are two primary classes of SLMs used for compressed sensing: digital-micromirror-based devices and liquid-crystal-based devices. Digital micromirror devices (DMDs) are made up of millions of tiny mirrors (roughly 10 µm in size) that can be individually rotated by applying a voltage, which can be used to create binary patterns as shown in Fig. 13(a) [66]. Even though operation is binary, dithering or pulse width modulation can be used to trade off spatial resolution or frame rate, respectively, for grayscale operation.
The transfer function of the DMD and a more detailed discussion can be found in [66], where the angle of the incoming wave, the square aperture size, and the pixel pitch are taken into account. It is worth noting here that due to the small mirror sizes and pitch length, a significant amount of power can be lost to the diffracted orders. The simplest way to obtain compressive measurements using a DMD is shown in Fig. 13(c) where the scene is patterned. A Fourier transform of the scene is obtained at the focal point behind the lens where a single detector can be placed to make a measurement. The finite extent of the detector essentially low-pass filters the signal, which approximates a dot product as shown in Eq. (45) for 1D in the next section.
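The measurement chain described above (pattern the scene, integrate on a single detector) reduces to one inner product per displayed pattern, and a sparse scene can then be recovered from far fewer patterns than pixels. A minimal end-to-end sketch using bipolar $\pm 1$ patterns (realizable on a binary DMD by differencing two masks) and orthogonal matching pursuit as a simple recovery algorithm; all dimensions and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_px, n_meas, k = 128, 64, 4           # pixels, measurements, scene sparsity

scene = np.zeros(n_px)
scene[rng.choice(n_px, size=k, replace=False)] = 1.0   # sparse scene

# Bipolar +/-1 patterns; the lens + single detector compute one inner
# product (bucket value) per displayed pattern.
patterns = rng.choice([-1.0, 1.0], size=(n_meas, n_px))
y = patterns @ scene

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily add the column most correlated
    with the residual, then least-squares refit on the selected support."""
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

x_hat = omp(patterns, y, k)
```

Here 64 bucket measurements suffice for a 128-pixel scene with 4 bright pixels; the same structure scales to realistic pixel counts with more efficient solvers.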
Beyond DMDs, liquid-crystal-based devices such as liquid crystal on silicon (LCoS) panels allow spatial patterning of light using the anisotropic nature of liquid crystals, providing both amplitude and phase modulation as well as grayscale operation. However, the increase in grayscale quality and control comes at the cost of typically slower operation than DMDs operating in a binary fashion. There are many configurations, such as the twisted nematic configuration, which is widely used in display technology, vertically aligned nematic (VAN), electrically controlled birefringence, surface stabilized ferroelectric liquid crystal, and more [67]. A unique advantage of liquid crystal devices for compressed sensing is that, unlike DMDs, they allow for direct phase modulation and wavefront shaping.
2. Electro-Optic Modulators
Electro-optic modulators (EOMs) are used for controlling the phase or the amplitude of a single spatial mode optical signal in the time domain. Within an EOM, nonlinear optical crystals such as lithium niobate are driven by radio frequency (RF) signals that modulate the refractive index of the crystal and therefore the phase of the light passing through it. Such phase-based EOMs can be used in an interferometric configuration to convert the phase modulation into intensity modulation as well [i.e., Mach–Zehnder modulator (MZM)] as shown in Fig. 14. More detail on different geometries and trade-offs can be found in Ref [69]. For a relative phase shift of $\phi$, the output intensity is
where ${V_\pi}$ is the voltage required for shifting the phase by $\pi$ and depends on many factors such as the nonlinear index and the wavelength of the light. For small amplitudes and an appropriate DC bias, the MZM can be operated in a linear regime, which is often preferred for compressed sensing applications. Such temporal modulation combined with, for example, filtering or dispersive Fourier transformation can be used to implement pseudo-random multiplexing of points within a time-domain signal for compressed sensing. One of the simplest ways to use MZMs to compressively sample a photonic link is to modulate the signal encoded lightwave with a pseudo-random bit string (PRBS) and to low-pass filter the resulting signal [68]. Low-pass filtering essentially calculates a dot product between the photonic signal and the PRBS. More specifically, the input lightwave varies temporally with the encoded signal, so we get
After low-pass filtering, the measured signal is proportional to
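The mixing-then-filtering chain maps directly onto a sensing matrix acting on the Nyquist-rate signal: each slow ADC sample is the integral of the PRBS-modulated signal over one filter window, i.e., one inner product. A minimal sketch with an idealized integrate-and-dump low-pass filter (sample counts and signal illustrative; real links use unipolar PRBS chips with bias removal):

```python
import numpy as np

rng = np.random.default_rng(0)
n_fast, n_meas = 512, 8                # Nyquist-rate samples, slow ADC samples
L = n_fast // n_meas                   # fast samples per integration window

t = np.arange(n_fast)
x = np.sin(2 * np.pi * 5 * t / n_fast) # toy signal on the optical intensity

# The MZM multiplies the signal by a full-rate PRBS; the low-pass filter and
# slow ADC then output one integrated value (inner product) per window.
prbs = rng.choice([-1.0, 1.0], size=n_fast)
y = (prbs * x).reshape(n_meas, L).mean(axis=1)   # integrate-and-dump model

# Equivalent sensing matrix acting on the Nyquist-rate signal:
A = np.zeros((n_meas, n_fast))
for m in range(n_meas):
    A[m, m * L:(m + 1) * L] = prbs[m * L:(m + 1) * L] / L

print(np.allclose(y, A @ x))           # True: hardware mixing = matrix product
```

Writing the hardware chain as the matrix `A` is what lets the standard CS recovery machinery from Section 2 be applied to the slow ADC samples directly.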
Beyond EOMs, acousto-optic modulators (AOMs) are another method for active control of light, in which light is scattered from sound waves. Although the authors are not aware of any compressive sensing schemes that make use of this effect, there have been several non-compressive sensing applications where it is used for frequency multiplexing [70], and it holds potential for future use. An excellent review article for microscopy applications can be found in [71].
4. APPLICATIONS
In this section, we will provide a brief survey of how CS algorithms and CS photonic sampling technologies are used in practice to solve challenging sensing problems. As this is a tutorial, the overview in this section is not meant to serve as a comprehensive review but rather as a collection of examples that we believe to have pedagogical value.
A. Single Pixel Camera
The single pixel camera is one of the earliest and perhaps one of the most well-known applications of compressed sensing. High-resolution cameras are only available in a small portion of the electromagnetic spectrum (mostly visible). For many other portions of the spectrum (e.g., far-IR or terahertz) or in low-light settings that require photomultiplier tubes (PMTs), only single detectors are cost effective or feasible. To acquire 2D information, such a detector needs to be scanned across the scene, which is time consuming. Compressive single pixel imaging can enable high-resolution imaging at such wavelengths with faster imaging speeds [65]. There are many excellent review articles on the topic [74,75]. Figure 15 shows an example experiment where an image of the scene is formed on the DMD and then the light from the DMD is collected onto a single detector using a lens. Multiple measurements from the scene (each corresponding to a different DMD pattern) can be used to reconstruct the original scene. Other application areas of single pixel imaging include lidar [76,77], non-line of sight imaging [78], 3D imaging [79], Raman imaging [80], and ghost imaging [81].
B. Microscopy
1. Fluorescence Microscopy
The single pixel imaging concept can be extended to microscopy. Studer et al. propose a compressive fluorescence microscopy and hyperspectral imaging (HSI) method for biological applications [73]. PMTs are extremely sensitive detectors that are widely used in microscopy as they work well in low photon count applications. However, they are expensive and bulky, and it is difficult to make PMT arrays for high-resolution imaging applications, which makes single pixel imaging using CS a very attractive solution. In their setup a DMD is mapped onto the sample and the fluorescent signal is collected by a PMT as shown in Fig. 16. When imaging fluorescent beads, they use the Hadamard basis for the sensing matrix since the scene is sparse in its natural domain (i.e., spatial domain). The Fourier basis is also maximally incoherent with the spatial basis; however, Hadamard is easier to implement with the binary DMD patterns. Elements of the Hadamard patterns take on values of ${+}1$ and ${-}1$ that are mapped to 1 and 0, respectively, on the DMD. Another method often used, although not employed in this work, is to measure the response for the ${+}1{\rm s}$ and the ${-}1{\rm s}$ separately and to subtract them. This approach leads to $2 \times$ more measurements, but can provide increased SNR and dynamic range for differential sensing (e.g., balanced detection) due to background removal and noise cancellation. They show that significant compression ratios can be achieved, which translates to either higher speed operation or higher resolution at the same frame rate.
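The differential ±1 measurement trick mentioned above is a simple exact identity: splitting each bipolar Hadamard row into a mask of its $+1$ positions and a mask of its $-1$ positions, and subtracting the two bucket values, reproduces the bipolar projection. A minimal numerical check (Sylvester construction; sizes illustrative):

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(0)
n = 16
H = hadamard(n)
scene = rng.random(n)

# A DMD displays only 0/1, so each +/-1 Hadamard row is split into a mask of
# the +1 positions and a mask of the -1 positions; subtracting the two bucket
# measurements recovers the bipolar projection (at 2x the measurement count).
pos = (H > 0).astype(float)
neg = (H < 0).astype(float)
y_diff = pos @ scene - neg @ scene

print(np.allclose(y_diff, H @ scene))  # True
```

The doubling of the measurement count is the price paid for the background rejection and noise cancellation of the differential scheme.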
Kuo et al. propose an on-chip fluorescence microscope using a thin diffuser placed between the sample and the camera [46]. On-chip microscopes can be lightweight and portable and have reduced hardware complexity. Due to the omnidirectional nature of fluorescence signals, samples must be placed close to the sensor, which deviates from the case discussed in the previous section on diffusers. Samples close to the sensor complicate the calibration process, and so Kuo et al. propose a model-based interpolation scheme to significantly reduce the calibration time and data.
Miniscopes have been extensively used for imaging the brain activity of freely moving mice due to their compact design. Yanny et al. propose a 3D miniscope that replaces the tube lens of a popular miniscope design with a diffuser [82]. In order to achieve high-frequency patterns from each depth plane, they use a multifocal microlens diffuser since a single focal length would yield a high-frequency diffuser pattern only at one depth plane. Microlenses are initially randomly placed, but their parameters (e.g., microlens positions) are fine tuned by minimizing a coherence-based loss function using gradient descent. In a similar vein, Adams et al. aim to build a very thin and lightweight 3D fluorescence microscope and achieve this by using coded aperture with MURA patterns [83]. The reduced bulk of such a system could be highly beneficial for mounting to an animal and imaging brain activity over large transverse fields of view. There are many other examples of compressive fluorescence microscopy techniques [84–88], and a more comprehensive review of the literature can be found in [89].
2. Multiphoton Fluorescence Microscopy
Multiphoton fluorescence imaging is a critical technology in biomedical imaging for surpassing the limitations of single photon imaging techniques such as a low signal-to-background ratio and small penetration depth due to strong scattering from biological tissue. These limitations are circumvented by the larger penetration depth of longer wavelength excitation and by the low background: multiphoton absorption has a strong intensity dependence, allowing for depth resolved excitation at the laser focus. Conventional multiphoton microscopes often raster scan the scene, which we know to be inefficient and time consuming, especially since the scenes are often sparse in the spatial domain.
Alemohammad et al. propose a temporally focused compressive two-photon microscope where a DMD acts both as a structured light source and a diffraction grating [34] as shown in Fig. 17. Pseudo-random binary patterns are used for the sampling matrix. In addition to the benefits offered by compressed sensing, random access of the sample plane enables more sophisticated adaptive image scanning.
Wijesinghe et al. implement the same compressive multiphoton microscope and argue that a random Morlet basis is better suited for the given optical setup [90]. They show that high-frequency components from Hadamard or pseudo-random patterns get low-pass filtered by the optical system during pattern projection, which causes basis mismatch. Random Morlet patterns are essentially low-pass filtered random patterns matched to the optical system, which removes the basis mismatch. Alternatively, this issue could also be resolved in post-processing by incorporating the transfer function of the optical system.
Kazemipour et al. propose a tomographic two-photon imager that is able to resolve 1 billion voxels per second [91]. A 2D plane with $N \times N$ pixels is sampled with four line scans (45° increments) resulting in $4 \times N$ measurements. This extreme compression ratio is achieved by first taking a full raster scan image of the scene and segmenting the parts of the scene that are very sparse. An SLM is then used in the intensity modulation configuration to only allow light from the segmented regions. Other examples of compressive two-photon microscopy can be found in [92–94].
C. On-Chip Spectrum Sensing
The resolving power of conventional spectrometers depends strongly on the disperser size and pathlength, for which bulky free-space optics are typically used. The large disperser sizes required for high-resolution spectroscopy limit the ability to integrate high-resolution spectrometers into chip-scale photonic devices. To solve this problem, Redding et al. propose using evanescently coupled multimode spiral waveguides to randomly multiplex spectral information [55] as shown in Fig. 18. The response of these devices is in many ways similar to wavelength dependent speckle and can efficiently multiplex information. In their work, spectra are reconstructed both with and without using compressed sensing. In such applications, CS can be leveraged to reconstruct spectra with either higher resolution or larger range using a limited number of measurements. Other works have also used spirals [64] and ring resonators [95] for compressive on-chip spectrum sensing.
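Reconstruction in such speckle spectrometers reduces to inverting a calibrated transmission matrix whose columns are the speckle patterns recorded for narrowband inputs. A minimal sketch of the non-compressive (overdetermined) case with Tikhonov-regularized inversion; a CS solver would replace the last step when detectors are fewer than spectral channels and the spectrum is known to be sparse (all names and sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_det = 32, 64             # spectral channels, detector pixels

# Calibrated transmission matrix: column j is the speckle pattern recorded
# for a narrowband source in spectral channel j.
T = rng.random((n_det, n_channels))

spectrum = np.zeros(n_channels)
spectrum[[5, 12]] = [1.0, 0.6]         # two narrow spectral lines
y = T @ spectrum                       # detected multiplexed speckle

# Tikhonov-regularized inversion of the calibrated transmission matrix.
lam = 1e-6
s_hat = np.linalg.solve(T.T @ T + lam * np.eye(n_channels), T.T @ y)
```

The regularization guards against the ill-conditioning that grows as more, increasingly correlated, spectral channels are packed into the calibration.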
D. Hyperspectral Imaging
HSI captures spectral information for each pixel (or voxel for 3D HSI) in a given scene as shown in Fig. 19. Since high-dimensional hyperspectral information must be captured with a 2D sensor, CS has been extensively utilized to alleviate the need for a large number of measurements (i.e., pixels and frames). An excellent introduction to the topic can be found in [98].
Gehm et al. propose a dual-disperser coded aperture setup known as CASSI where the light from the scene is dispersed and patterned with a coded aperture [100]. Due to dispersion, different wavelengths of light are patterned with shifted versions of the aperture. The second disperser then undoes the effect of the first dispersion. This way hyperspectral information encoded in 2D is detected in a single shot.
Wagadarikar et al. propose a modified version of the CASSI system with a single disperser [96]. Here the scene is patterned with a coded aperture first so all the wavelength channels get the same pattern as shown in Fig. 10. A coded hyperspectral data cube is subsequently sheared using a disperser and projected onto the camera. This method tends to get better spectral resolution at the expense of reduced spatial resolution.
Alemohammad et al. propose an HSI system targeted at particulate scenes where the coded aperture is eliminated entirely, as it reduces SNR, and a conventional image is simultaneously captured alongside the sheared spectral data [101]. Here, the spatial sparsity of the particles and their naturally random distribution and motion are leveraged. Co-registered conventional images along with the sparsity constraint are then used to disentangle the hyperspectral data of the particles.
E. Lens-Free Micro-Endoscope
Fiber microendoscopes enable imaging deep inside tissue with minimal damage. However, their miniature size and correspondingly limited number of spatial measurements result in low resolution and a narrow field of view. Compressed sensing can be useful in this scenario to maximize the image information captured with this limited number of measurements [97,102–105].
Shin et al. propose using a ${\rm Ti}{{\rm O}_2}$-tipped single-mode fiber on the illumination side, where wavelength dependent speckle patterns are formed on the sample to be imaged [102]. Light from the sample is then detected with a single photodetector. A tunable laser is used to scan the wavelength in small increments to generate distinct speckle patterns.
In another work, Shin et al. show that a coded aperture placed in the distal end of a multicore fiber can be used to efficiently multiplex information from the scene [97] as shown in Fig. 20. In this case, using a compressive approach enables significant reduction in fiber size, which is important for minimal invasiveness.
Amitonova and De Boer propose using a multimode fiber to produce speckle patterns whose characteristics are modified by a DMD [104]. The DMD is used to control the phase through an amplitude holography setup.
Recently, Guérit et al. proposed using a combination of an SLM and a multicore fiber, also for minimally invasive endoscopy applications. Their comprehensive work goes into the details of estimating speckle statistics for optimal sensing [63].
F. Optical Coherence Tomography
Optical coherence tomography (OCT) typically uses low coherence light sources to acquire 3D information from interferometric measurements. The high-dimensional nature of OCT data and the need for high frame rates provide opportunities for compressed sensing.
Stroud et al. propose an OCT system where pseudo-random binary patterns are imprinted on the spectrum of pulses from a mode locked laser [99] as shown in Fig. 21. This enables a high compression ratio, which means that either slower ADCs can be used or, alternatively, the same ADC can be used to make faster measurements.
Liu and Kang propose a compressive spectral domain OCT system where the $k$-space is randomly undersampled [108]. Random undersampling is achieved by using a CCD camera with randomly addressable pixels that have been developed for high speed imaging where the bottleneck is electrical (transfer of data from pixels to the computer).
G. High Speed Imaging
High speed imaging beyond the frame rate of cameras has been of great interest for elucidating ultrafast phenomena. Success in this field has been partly due to the generation of short femtosecond pulses that provide superior temporal sectioning as well as high intensities. This problem is very attractive for compressed sensing, which can push the speed limit of such imagers further by recovering more from fewer measurements.
For example, Gao et al. propose a method where a streak camera is used in conjunction with a DMD as shown in Fig. 22 to image macroscopic objects at 100 GHz frame rates for bursts of up to 350 frames [109].
Bosworth et al. propose a method where mode locked laser pulses are dispersed, patterned with random binary patterns, and compressed again followed by wavelength-to-space mapping to image a microscopic object with high throughput for continuous imaging [30].
H. Radio Frequency Sensing
Many RF applications make use of RF signals that are sparse in some domain, and compressed sensing can therefore relax constraints on ADCs such as size, weight, or power consumption [107]. Valley et al. propose the setup shown in Fig. 23 where the RF signal is imprinted on the intensity of a chirped pulse from a mode locked laser, whereby RF information in time is mapped to spectrum. A diffraction grating then maps spectrum to space, which is coded with an SLM and finally detected by an ADC. Each pattern on the SLM yields a single compressive measurement. RF signals can be recovered using measurements from multiple patterns (far fewer than the Nyquist rate).
Other work in the literature has made use of alternative active or passive multiplexing technologies. For example, Bosworth and Foster use EOMs to pattern chirped pulses prior to imprinting the RF information [110]. Sefler et al. and Valley et al. instead make use of speckle from a multimode waveguide to multiplex information [54,111], and in more recent work, Borlaug et al. study SiN multimode spirals for integrated RF sensing [56].
5. CONCLUSION
In this tutorial we have covered the mathematical background, algorithms, and photonic technologies necessary for designing photonic compressed sensing systems, along with example systems for specific applications. Compressed sensing (CS) methods can improve system performance or help surpass physical limits of conventional photonic sensing modalities. In particular, by leveraging compression, CS methods can provide significant improvements when the underlying conventional sensing method is measurement constrained. Successful compression of signals usually involves pseudo-random multiplexing of information. Signals are then reconstructed using computational methods, the choice of which has a significant effect on reconstruction quality. Recent progress in machine learning, through both better decoding algorithms and hardware co-design, can help push compression ratios further in the future.
Funding
Air Force Office of Scientific Research (FA9550-21-1-0372); Defense Threat Reduction Agency (HDTRA1-20-2-0001).
Disclosures
The author declares no conflicts of interest.
Data availability
No data were generated or analyzed in the presented research.
REFERENCES
1. E. J. Candès and M. B. Wakin, “An introduction to compressive sampling,” IEEE Signal Process. Mag. 25(2), 21–30 (2008). [CrossRef]
2. M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok, Introduction to Compressed Sensing (2012).
3. J. Romberg, “Imaging via compressive sampling,” IEEE Signal Process. Mag. 25(2), 14–20 (2008). [CrossRef]
4. S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing (Springer, 2013).
5. Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications (Cambridge University, 2012).
6. G. Strang, Introduction to Linear Algebra (2020).
7. A. V. Oppenheim, J. Buck, M. Daniel, A. S. Willsky, S. H. Nawab, and A. Singer, Signals & Systems (Pearson Educación, 1997).
8. C. Shannon, “Communication in the presence of noise,” Proc. IRE 37, 10–21 (1949). [CrossRef]
9. A. W. Lohmann, R. G. Dorsch, D. Mendlovic, Z. Zalevsky, and C. Ferreira, “Space–bandwidth product of optical signals and systems,” J. Opt. Soc. Am. A 13, 470–473 (1996). [CrossRef]
10. E. Candes and T. Tao, “Decoding by linear programming,” IEEE Trans. Inf. Theory 51, 4203–4215 (2005). [CrossRef]
11. T. Hastie, R. Tibshirani, J. H. Friedman, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2009), Vol. 2.
12. R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin, “A simple proof of the restricted isometry property for random matrices,” Constr. Approx. 28, 253–263 (2008). [CrossRef]
13. E. J. Candes, J. K. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate measurements,” Commun. Pure Appl. Math. 59, 1207–1223 (2006). [CrossRef]
14. L. Stanković, M. Brajović, D. Mandic, I. Stanković, and M. Daković, “Improved coherence index-based bound in compressive sensing,” IEEE Signal Process. Lett. 28, 1110–1114 (2021). [CrossRef]
15. R. A. Horn and C. R. Johnson, Matrix Analysis, 2nd ed. (Cambridge University, 2012).
16. D. Donoho, M. Elad, and V. Temlyakov, “Stable recovery of sparse overcomplete representations in the presence of noise,” IEEE Trans. Inf. Theory 52, 6–18 (2006). [CrossRef]
17. D. Donoho and J. Tanner, “Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing,” Philos. Trans. R. Soc. A 367, 4273–4293 (2009). [CrossRef]
18. S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE Trans. Signal Process. 41, 3397–3415 (1993). [CrossRef]
19. J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory 53, 4655–4666 (2007). [CrossRef]
20. W. Dai and O. Milenkovic, “Subspace pursuit for compressive sensing signal reconstruction,” IEEE Trans. Inf. Theory 55, 2230–2249 (2009). [CrossRef]
21. D. Needell and J. A. Tropp, “CoSaMP: Iterative signal recovery from incomplete and inaccurate samples,” Appl. Comput. Harmon. Anal. 26, 301–321 (2009). [CrossRef]
22. G. B. Dantzig, “Linear programming,” Oper. Res. 50, 42–47 (2002). [CrossRef]
23. T. Blumensath and M. Davies, “Iterative hard thresholding for compressive sensing,” Appl. Comput. Harmon. Anal. 27, 265–274 (2009). [CrossRef]
24. I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Commun. Pure Appl. Math. 57, 1413–1457 (2004). [CrossRef]
25. A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imag. Sci. 2, 183–202 (2009). [CrossRef]
26. J. M. Bioucas-Dias and M. A. Figueiredo, “A new twist: two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. Image Process. 16, 2992–3004 (2007). [CrossRef]
27. D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods (Academic, 2014).
28. Y. Yang, J. Sun, H. Li, and Z. Xu, “ADMM-CSNet: a deep learning approach for image compressive sensing,” IEEE Trans. Pattern Anal. Mach. Intell. 42, 521–538 (2018). [CrossRef]
29. Z. T. Harmany, R. F. Marcia, and R. M. Willett, “This is SPIRAL-TAP: sparse Poisson intensity reconstruction algorithms-theory and practice,” IEEE Trans. Image Process. 21, 1084–1096 (2012). [CrossRef]
30. B. T. Bosworth, J. R. Stroud, D. N. Tran, T. D. Tran, S. Chin, and M. A. Foster, “High-speed flow microscopy using compressed sensing with ultrafast laser pulses,” Opt. Express 23, 10521–10532 (2015). [CrossRef]
31. J. W. Woods, Multidimensional Signal, Image, and Video Processing and Coding (Elsevier, 2012).
32. S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way (Elsevier, 2009).
33. D. N. Tran, D. N. Tran, S. P. Chin, and T. D. Tran, “Local sensing with global recovery,” in IEEE International Conference on Image Processing (ICIP) (IEEE, 2015), pp. 4313–4316.
34. M. Alemohammad, J. Shin, D. N. Tran, J. R. Stroud, S. P. Chin, T. D. Tran, and M. A. Foster, “Widefield compressive multiphoton microscopy,” Opt. Lett. 43, 2989–2992 (2018). [CrossRef]
35. D. Van Veen, A. Jalal, M. Soltanolkotabi, E. Price, S. Vishwanath, and A. G. Dimakis, “Compressed sensing with deep image prior and learned regularization,” arXiv:1806.06438 (2018).
36. Y. Wu, M. Rosca, and T. Lillicrap, “Deep compressed sensing,” in International Conference on Machine Learning (PMLR, 2019), pp. 6850–6860.
37. J. Zhang and B. Ghanem, “ISTA-Net: interpretable optimization-inspired deep network for image compressive sensing,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 1828–1837.
38. Z. Zhang, Y. Liu, J. Liu, F. Wen, and C. Zhu, “AMP-Net: denoising-based deep unfolding for compressive image sensing,” IEEE Trans. Image Process. 30, 1487–1500 (2020). [CrossRef]
39. D. Ye, Z. Ni, H. Wang, J. Zhang, S. Wang, and S. Kwong, “CSformer: bridging convolution and transformer for compressive sensing,” arXiv:2112.15299 (2021).
40. M. T. McCann, K. H. Jin, and M. Unser, “Convolutional neural networks for inverse problems in imaging: a review,” IEEE Signal Process. Mag. 34(6), 85–95 (2017). [CrossRef]
41. A. L. Machidon and V. Pejović, “Deep learning for compressive sensing: a ubiquitous systems perspective,” Artif. Intell. Rev. (2022). [CrossRef]
42. R. Dicke, “Scatter-hole cameras for x-rays and gamma rays,” Astrophys. J. 153, L101 (1968). [CrossRef]
43. E. E. Fenimore and T. M. Cannon, “Coded aperture imaging with uniformly redundant arrays,” Appl. Opt. 17, 337–347 (1978). [CrossRef]
44. S. R. Gottesman and E. E. Fenimore, “New family of binary arrays for coded aperture imaging,” Appl. Opt. 28, 4344–4352 (1989). [CrossRef]
45. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “DiffuserCam: lensless single-exposure 3D imaging,” Optica 5, 1–9 (2018). [CrossRef]
46. G. Kuo, F. L. Liu, I. Grossrubatscher, R. Ng, and L. Waller, “On-chip fluorescence microscopy with a random microlens diffuser,” Opt. Express 28, 8384–8399 (2020). [CrossRef]
47. N. Antipa, S. Necula, R. Ng, and L. Waller, “Single-shot diffuser-encoded light field imaging,” in IEEE International Conference on Computational Photography (ICCP) (IEEE, 2016), pp. 1–11.
48. E. J. Candes, T. Strohmer, and V. Voroninski, “PhaseLift: exact and stable signal recovery from magnitude measurements via convex programming,” Commun. Pure Appl. Math. 66, 1241–1274 (2013). [CrossRef]
49. J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. 21, 2758–2769 (1982). [CrossRef]
50. P. F. Almoro, L. Waller, M. Agour, C. Falldorf, G. Pedrini, W. Osten, and S. G. Hanson, “Enhanced deterministic phase retrieval using a partially developed speckle field,” Opt. Lett. 37, 2088–2090 (2012). [CrossRef]
51. D. L. Colton, R. Kress, and R. Kress, Inverse Acoustic and Electromagnetic Scattering Theory (Springer, 1998), Vol. 93.
52. J. W. Goodman, “Statistical properties of laser speckle patterns,” in Laser Speckle and Related Phenomena (Springer, 1975), pp. 9–75.
53. H. Sun, B. T. Bosworth, B. C. Grubel, M. R. Kossey, M. A. Foster, and A. C. Foster, “Compressed sensing of sparse RF signals based on silicon photonic microcavities,” in CLEO: Science and Innovations (Optica Publishing Group, 2017), paper SM1O-5.
54. G. A. Sefler, T. J. Shaw, and G. C. Valley, “Demonstration of speckle-based compressive sensing system for recovering RF signals,” Opt. Express 26, 21390–21402 (2018). [CrossRef]
55. B. Redding, S. F. Liew, Y. Bromberg, R. Sarma, and H. Cao, “Evanescently coupled multimode spiral spectrometer,” Optica 3, 956–962 (2016). [CrossRef]
56. D. B. Borlaug, S. Estrella, C. T. Boone, G. A. Sefler, T. J. Shaw, A. Roy, L. Johansson, and G. C. Valley, “Photonic integrated circuit based compressive sensing radio frequency receiver using waveguide speckle,” Opt. Express 29, 19222–19239 (2021). [CrossRef]
57. B. C. Grubel, B. T. Bosworth, M. R. Kossey, H. Sun, A. B. Cooper, M. A. Foster, and A. C. Foster, “Silicon photonic physical unclonable function,” Opt. Express 25, 12710–12721 (2017). [CrossRef]
58. H. Cao and J. Wiersig, “Dielectric microcavities: model systems for wave chaos and non-Hermitian physics,” Rev. Mod. Phys. 87, 61 (2015). [CrossRef]
59. J. U. Nockel and A. D. Stone, “Ray and wave chaos in asymmetric resonant optical cavities,” Nature 385, 45–47 (1997). [CrossRef]
60. T. Tanaka, M. Hentschel, T. Fukushima, and T. Harayama, “Classical phase space revealed by coherent light,” Phys. Rev. Lett. 98, 033902 (2007). [CrossRef]
61. M. Lebental, J.-S. Lauret, J. Zyss, C. Schmit, and E. Bogomolny, “Directional emission of stadium-shaped microlasers,” Phys. Rev. A 75, 033806 (2007). [CrossRef]
62. D. Szasz, L. Bunimovich, D. Burago, et al., Hard Ball Systems and the Lorentz Gas, Encyclopedia of Mathematical Sciences (Springer, 2013).
63. S. Guerit, S. Sivankutty, J. Lee, H. Rigneault, and L. Jacques, “Compressive imaging through optical fiber with partial speckle scanning,” SIAM J. Imag. Sci. 15, 387–423 (2022). [CrossRef]
64. D. Irvine, V. Kilic, M. Foster, and A. Foster, “Evanescently coupled multimode silicon nitride waveguides for on chip spectroscopy,” in Conference on Lasers and Electro-Optics (CLEO) (OSA, 2022).
65. M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Process. Mag. 25(2), 83–91 (2008). [CrossRef]
66. Y.-X. Ren, R.-D. Lu, and L. Gong, “Tailoring light with a digital micromirror device,” Ann. Phys., Lpz. 527, 447–470 (2015). [CrossRef]
67. Z. Zhang, Z. You, and D. Chu, “Fundamentals of phase-only liquid crystal on silicon (LCOS) devices,” Light Sci. Appl. 3, e213 (2014). [CrossRef]
68. J. Nichols and F. Bucholtz, “Beating Nyquist with light: a compressively sampled photonic link,” Opt. Express 19, 7339–7348 (2011). [CrossRef]
69. E. L. Wooten, K. M. Kissa, A. Yi-Yan, E. J. Murphy, D. A. Lafaw, P. F. Hallemeier, D. Maack, D. V. Attanasio, D. J. Fritz, G. J. McBrien, and D. E. Bossi, “A review of lithium niobate modulators for fiber-optic communications systems,” IEEE J. Sel. Top. Quantum Electron. 6, 69–82 (2000). [CrossRef]
70. E. D. Diebold, B. W. Buckley, D. R. Gossett, and B. Jalali, “Digitally synthesized beat frequency multiplexing for sub-millisecond fluorescence microscopy,” Nat. Photonics 7, 806–810 (2013). [CrossRef]
71. M. Duocastella, S. Surdo, A. Zunino, A. Diaspro, and P. Saggau, “Acousto-optic systems for advanced microscopy,” J. Phys. Photon. 3, 012004 (2020). [CrossRef]
72. S. S. Welsh, M. P. Edgar, R. Bowman, P. Jonathan, B. Sun, and M. J. Padgett, “Fast full-color computational imaging with single-pixel detectors,” Opt. Express 21, 23068–23074 (2013). [CrossRef]
73. V. Studer, J. Bobin, M. Chahid, H. S. Mousavi, E. Candes, and M. Dahan, “Compressive fluorescence microscopy for biological and hyperspectral imaging,” Proc. Natl. Acad. Sci. USA 109, E1679–E1687 (2012). [CrossRef]
74. G. M. Gibson, S. D. Johnson, and M. J. Padgett, “Single-pixel imaging 12 years on: a review,” Opt. Express 28, 28190–28208 (2020). [CrossRef]
75. M. P. Edgar, G. M. Gibson, and M. J. Padgett, “Principles and prospects for single-pixel imaging,” Nat. Photonics 13, 13–20 (2019). [CrossRef]
76. A. Colaço, A. Kirmani, G. A. Howland, J. C. Howell, and V. K. Goyal, “Compressive depth map acquisition using a single photon-counting detector: parametric signal processing meets sparsity,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2012), pp. 96–102.
77. N. Radwell, S. D. Johnson, M. P. Edgar, C. F. Higham, R. Murray-Smith, and M. J. Padgett, “Deep learning optimized single-pixel lidar,” Appl. Phys. Lett. 115, 231101 (2019). [CrossRef]
78. Z. Zhang, X. Ma, and J. Zhong, “Single-pixel imaging by means of Fourier spectrum acquisition,” Nat. Commun. 6, 6225 (2015). [CrossRef]
79. B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. J. Padgett, “3D computational imaging with single-pixel detectors,” Science 340, 844–847 (2013). [CrossRef]
80. C. Scotté, S. Sivankutty, R. A. Bartels, and H. Rigneault, “Line-scan compressive Raman imaging with spatiospectral encoding,” Opt. Lett. 45, 5567–5570 (2020). [CrossRef]
81. Y. Bromberg, O. Katz, and Y. Silberberg, “Ghost imaging with a single detector,” Phys. Rev. A 79, 053840 (2009). [CrossRef]
82. K. Yanny, N. Antipa, W. Liberti, S. Dehaeck, K. Monakhova, F. L. Liu, K. Shen, R. Ng, and L. Waller, “Miniscope3D: optimized single-shot miniature 3D fluorescence microscopy,” Light Sci. Appl. 9, 171 (2020). [CrossRef]
83. J. K. Adams, V. Boominathan, B. W. Avants, D. G. Vercosa, F. Ye, R. G. Baraniuk, J. T. Robinson, and A. Veeraraghavan, “Single-frame 3D fluorescence microscopy with ultraminiature lensless flatscope,” Sci. Adv. 3, e1701548 (2017). [CrossRef]
84. A. Ghezzi, A. Farina, A. Bassi, G. Valentini, I. Labanca, G. Acconcia, I. Rech, and C. D’Andrea, “Multispectral compressive fluorescence lifetime imaging microscopy with a SPAD array detector,” Opt. Lett. 46, 1353–1356 (2021). [CrossRef]
85. V. J. Parot, C. Sing-Long, Y. Adam, U. L. Böhm, L. Z. Fan, S. L. Farhi, and A. E. Cohen, “Compressed Hadamard microscopy for high-speed optically sectioned neuronal activity recordings,” J. Phys. D 52, 144001 (2019). [CrossRef]
86. M. Pascucci, S. Ganesan, A. Tripathi, O. Katz, V. Emiliani, and M. Guillon, “Compressive three-dimensional super-resolution microscopy with speckle-saturated fluorescence excitation,” Nat. Commun. 10, 1327 (2019). [CrossRef]
87. G. Calisesi, M. Castriotta, A. Candeo, A. Pistocchi, C. D’Andrea, G. Valentini, A. Farina, and A. Bassi, “Spatially modulated illumination allows for light sheet fluorescence microscopy with an incoherent source and compressive sensing,” Biomed. Opt. Express 10, 5776–5788 (2019). [CrossRef]
88. C. Fang, T. Yu, T. Chu, W. Feng, F. Zhao, X. Wang, Y. Huang, Y. Li, P. Wan, W. Mei, and D. Zhu, “Minutes-timescale 3D isotropic imaging of entire organs at subcellular resolution by content-aware compressed-sensing light-sheet microscopy,” Nat. Commun. 12, 1–13 (2021). [CrossRef]
89. G. Calisesi, A. Ghezzi, D. Ancora, C. D’Andrea, G. Valentini, A. Farina, and A. Bassi, “Compressed sensing in fluorescence microscopy,” Prog. Biophys. Mol. Biol. 168, 66–80 (2021). [CrossRef]
90. P. Wijesinghe, A. Escobet-Montalbán, M. Chen, P. R. Munro, and K. Dholakia, “Optimal compressive multiphoton imaging at depth using single-pixel detection,” Opt. Lett. 44, 4981–4984 (2019). [CrossRef]
91. A. Kazemipour, O. Novak, D. Flickinger, J. S. Marvin, A. S. Abdelfattah, J. King, P. M. Borden, J. J. Kim, S. H. Al-Abdullatif, P. E. Deal, and E. W. Miller, “Kilohertz frame-rate two-photon tomography,” Nat. Methods 16, 778–786 (2019). [CrossRef]
92. M. Alemohammad, J. Shin, J. R. Stroud, and M. A. Foster, “High-speed compressive line-scanned two photon microscopy,” in Optics and the Brain (Optica Publishing Group, 2020), paper BTu1C-3.
93. C. Wen, M. Ren, F. Feng, W. Chen, and S.-C. Chen, “Compressive sensing for fast 3-D and random-access two-photon microscopy,” Opt. Lett. 44, 4343–4346 (2019). [CrossRef]
94. A. Song, A. S. Charles, S. A. Koay, J. L. Gauthier, S. Y. Thiberge, J. W. Pillow, and D. W. Tank, “Volumetric two-photon imaging of neurons using stereoscopy (VTWINS),” Nat. Methods 14, 420–426 (2017). [CrossRef]
95. X. He and J. Cardenas, “Fully integrated on-chip ring resonator spectrometer based on compressed sensing,” in Conference on Lasers and Electro-Optics (CLEO) (IEEE, 2022), pp. 1–2.
96. A. Wagadarikar, R. John, R. Willett, and D. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Appl. Opt. 47, B44–B51 (2008). [CrossRef]
97. J. Shin, D. N. Tran, J. R. Stroud, S. Chin, T. D. Tran, and M. A. Foster, “A minimally invasive lens-free computational microendoscope,” Sci. Adv. 5, eaaw5595 (2019). [CrossRef]
98. G. R. Arce, D. J. Brady, L. Carin, H. Arguello, and D. S. Kittle, “Compressive coded aperture spectral imaging: an introduction,” IEEE Signal Process. Mag. 31(1), 105–115 (2014). [CrossRef]
99. J. R. Stroud, L. Liu, S. Chin, T. D. Tran, and M. A. Foster, “Optical coherence tomography using physical domain data compression to achieve MHz A-scan rates,” Opt. Express 27, 36329–36339 (2019). [CrossRef]
100. M. E. Gehm, R. John, D. J. Brady, R. M. Willett, and T. J. Schulz, “Single-shot compressive spectral imaging with a dual-disperser architecture,” Opt. Express 15, 14013–14027 (2007). [CrossRef]
101. M. Alemohammad, E. R. Wainwright, J. R. Stroud, T. P. Weihs, and M. A. Foster, “Kilohertz frame rate snapshot hyperspectral imaging of metal reactive materials,” Appl. Opt. 59, 10406–10415 (2020). [CrossRef]
102. J. Shin, B. T. Bosworth, and M. A. Foster, “Single-pixel imaging using compressed sensing and wavelength-dependent scattering,” Opt. Lett. 41, 886–889 (2016). [CrossRef]
103. J. Shin, B. T. Bosworth, and M. A. Foster, “Compressive fluorescence imaging using a multi-core fiber and spatially dependent scattering,” Opt. Lett. 42, 109–112 (2017). [CrossRef]
104. L. V. Amitonova and J. F. de Boer, “Compressive imaging through a multimode fiber,” Opt. Lett. 43, 5427–5430 (2018). [CrossRef]
105. B. Lochocki, M. V. Verweg, J. J. Hoozemans, J. F. de Boer, and L. V. Amitonova, “Epi-fluorescence imaging of the human brain through a multimode fiber,” APL Photon. 7, 071301 (2022). [CrossRef]
106. J. Liang and L. V. Wang, “Single-shot ultrafast optical imaging,” Optica 5, 1113–1127 (2018). [CrossRef]
107. G. C. Valley, G. A. Sefler, and T. J. Shaw, “Compressive sensing of sparse radio frequency signals using optical mixing,” Opt. Lett. 37, 4675–4677 (2012). [CrossRef]
108. X. Liu and J. U. Kang, “Compressive SD-OCT: the application of compressed sensing in spectral domain optical coherence tomography,” Opt. Express 18, 22010–22019 (2010). [CrossRef]
109. L. Gao, J. Liang, C. Li, and L. V. Wang, “Single-shot compressed ultrafast photography at one hundred billion frames per second,” Nature 516, 74–77 (2014). [CrossRef]
110. B. T. Bosworth and M. A. Foster, “High-speed ultrawideband photonically enabled compressed sensing of sparse radio frequency signals,” Opt. Lett. 38, 4892–4895 (2013). [CrossRef]
111. G. C. Valley, G. A. Sefler, and T. J. Shaw, “Multimode waveguide speckle patterns for compressive sensing,” Opt. Lett. 41, 2529–2532 (2016). [CrossRef]