## Abstract

Perhaps the largest obstacle to practical compressive sampling is an inability to accurately, and sparsely describe the data one seeks to recover due to poor choice of signal model parameters. In such cases the recovery process will yield artifacts, or in many cases, fail completely. This work represents the first demonstration of a solution to this so-called “off-grid” problem in an experimental, compressively sampled system. Specifically, we show that an Alternating Convex Search algorithm is able to significantly reduce these data model errors in harmonic signal recovery.

© 2015 Optical Society of America

## 1. Introduction

The mathematics that underly the field of compressive sampling (CS) suggest that one can design and build a sampling device with limited temporal or spatial resolution, yet accurately estimate the desired data values at a much higher resolution [1, 2]. A number of CS devices have been built and tested for numerous applications including imaging [3, 4], analog-to-digital conversion [5–7], microscopy [8], and spectroscopy [9, 10]. CS systems continue to receive attention for their potential to reduce the size, weight, power and cost of sampling data, all of which typically increase as a function of resolution. A particularly important application (and one of the motivations for this work) is found in the area of electronic warfare (EW). Current EW systems face increasing challenges as the operating frequency range pushes into the high tens of GHz - a frequency range where direct (Nyquist) sampling is either difficult, or expensive in terms of size, weight and power, or, simply unavailable. CS using a microwave photonic approach such as the one described here, but incorporating fast (> 40 GHz) electro-optic modulators, may offer a viable, near-term solution to these challenges.

Despite this promise, as CS devices have been built and tested, some practical challenges to implementation have emerged. Of particular interest is the inability to generate accurate reconstructions when the signal parameters (e.g. frequencies) do not match those in the signal model. This problem goes by different names including “basis mismatch” [11] (see also [12]) and CS “off-the-grid” [13]. The resulting model can still be accurate, however it is not necessarily parsimonious, one of the key requirements of a data model in CS applications. In the best case, these errors cause artifacts in the recovered data and in the worst case cause a complete inability to recover (estimate) the signal values.

Perhaps the most well-known example of off-grid parameter errors occurs in the recovery of harmonic signals. In this case, a good data model is the discrete Fourier matrix, comprised of sines and cosines at a fixed set of frequencies. If the data is, in fact, a linear combination of sinusoids at precisely *S* frequencies from this set, it can be described exactly by the *S* complex (2*S* real) coefficients that multiply the corresponding model vectors. The job of the CS algorithm is to estimate these coefficients. However if the data is a linear combination of sinusoids chosen from frequencies other than those in the model set, an accurate description of the data will require some number *S*′> *S* coefficients in the model. The result is a more challenging estimation problem, often resulting in larger errors in the recovered signal. This phenomenon has been clearly documented for harmonic signals in theory [11] and in practice [14].

Several solutions to the “off-grid” problem have been proposed in the literature and comparisons among them have been drawn in the recovery of harmonic signals from compressed measurements [15]. These methods include: overcomplete Fourier dictionaries, atomic norm minimization via semi-definite programming (SDP) [13], a greedy “forward-backward (GFB)” algorithm [16], and alternating convex search (ACS) [15]. To summarize the results of [15], the overcomplete dictionaries performed well in terms of the 2-norm of the reconstruction error, but required a large number of model coefficients, often at the wrong frequency locations, to do so. The SDP method is extremely accurate, but also quite slow. Moreover, this approach assumes a very specific sampling strategy that our hardware (to be described) does not employ. The GFB algorithm produced reasonable results in terms of execution time, but did not match the ACS approach in reconstruction error.

This work represents the first known effort to address the issue of signal model parameter gridding errors in an experimental, compressively sampled system. Results are presented for linear combinations of sinusoids on the 0.25–1.4 GHz range where the compressed samples are recorded at 1.4 GS/s, well below the Nyquist sampling rate. Single and multi-tone results are presented.

## 2. Preliminaries

Consider a continuous, harmonic waveform
$z(t)={\displaystyle {\sum}_{k=1}^{K}{A}_{k}{e}^{i2\pi {f}_{k}t}}$ with complex amplitudes *A _{k}* and real-valued frequencies

*f*. Assume that we wish to collect (digitize)

_{k}*N*points of this wave-form as

**z**∈ ℝ

*≡*

^{N}*z*(

*n*Δ

*),*

_{t}*n*= 0,⋯,

*N*−1 with sampling interval Δ

*. However, instead of sampling*

_{t}**z**directly, assume that the digitizer is only capable of coarse sampling intervals ${\tilde{\mathrm{\Delta}}}_{t}>{\mathrm{\Delta}}_{t}$ such that over the observation interval

*N*Δ

*we collect*

_{t}*M*<

*N*samples (i.e., ${\mathrm{\Delta}}_{t}/{\tilde{\mathrm{\Delta}}}_{t}=M/N$). Given this physical restriction, CS theory suggests that we instead estimate

**z**based on the measurement model

**Φ**∈ ℝ

*is not the identity matrix (direct sampling) but instead projects the desired signal onto the low-dimensional measurements $\mathit{y}\in {\mathbb{R}}^{M}\equiv y(m{\tilde{\mathrm{\Delta}}}_{t})$,*

^{M×N}*m*= 0,⋯,

*M*− 1. The matrix

**Φ**is therefore a discrete model for the physical process by which the desired signal values are mapped to the compressed samples

**y**. Note that in Eq. (1) we have also used the linear signal model

**z**=

**Ψx**where, for harmonic signals, the matrix

**Ψ**∈ ℝ

*is frequently taken as the discrete Fourier basis, defined on the frequency set $\mathcal{F}=\{1/N{\mathrm{\Delta}}_{t},2/N{\mathrm{\Delta}}_{t},\cdots ,1/2{\mathrm{\Delta}}_{t}\}$. The Fourier coefficients are therefore given by the*

^{N×N}**x**and it is our goal to form the estimates $\widehat{\mathbf{x}}$, and by extension, $\widehat{\mathbf{z}}=\mathbf{\Psi}\widehat{\mathbf{x}}$. Given this construction, one can accurately estimate $\widehat{\mathbf{x}}$ in the presence of the noise,

**∈ ℝ**

*η**by solving*

^{M}_{1}is the

*ℓ*

_{1}norm, taken as the sum of the magnitude of the coefficients

**x**. The estimator is known to be accurate provided that 1) the projection matrix

**Φ**Ψ ∈ ℝ

*possess minimal coherence among the columns [17] and 2) we collect at least*

^{M×N}*M*> 2

*S*log (

*N/M*) measurements where

*S*<

*M*is the number of non-zero values in

**x**, referred to as the signal sparsity [18]. To accomplish the former, one may choose the entries of

**Φ**independently from a variety of zero-mean probability distribution (CS recovery performance for a several such distributions was shown to be identical in [18]). As a result, many CS architectures (including the one described here) are based on randomly modulating the data prior to sampling. Although other, non-random architectures are certainly admissible, we have found random modulation to be an effective means of generating low-coherence measurements.

The greater concern is the restriction on *M* which is governed largely by *S*. If each
${f}_{k}\in \mathcal{F}$, then **z** can be represented with exactly *S* = 2*K* real, non-zero values in **x**. However, it is well-known that if the harmonic signal frequencies lie “off the grid” defined by
$\mathcal{F}$, the signal will require *S*′> *S* coefficients for an accurate representation. For fixed *M, N* this increase in number of coefficients can mean the difference between successful recovery and failure. In the next section we review an alternative to Eq. (2) that iteratively adapts
$\mathcal{F}$ to the signal thereby significantly reducing *S*′ and consequently improving the estimates
$\widehat{\mathbf{x}}$.

## 3. Finding a better signal model: alternating convex search

Here we briefly describe an Alternating Convex Search (ACS) algorithm that significantly reduces the influence of gridding errors on the recovery of a linear combination of sinusoids [15]. This approach treats the Fourier coefficients *and* frequencies in the signal model as unknowns to be estimated. Rather than restrict ourselves to the Fourier basis, we consider a more general overcomplete harmonic “dictionary” where each of the columns are referred to as “atoms” [19]. We let *Q* be a factor specifying the degree of overcompleteness, chosen so that *QN/*2 is an integer, and form

*θ*= 1,

_{k}, k*…,QN/*2 can take on values in the range $\left[-\frac{1}{2QN},\frac{1}{2QN}\right]$. Note that though there are

*QN*indices, we assume real-valued signals so that there are only

*QN/*2 unique frequencies in the parameterized basis set. This has implications for the implementation of our approach as any change to the “positive” frequencies must be accompanied by an equal and opposite change to the “negative” frequencies [15]. Clearly if

*Q*= 1 and all of the

**are zero, then Eq. (3) defines the standard Fourier basis.**

*θ*Given this more general signal model, we modify Eq. (2) and solve the following minimization problem

**Ψ**

*and the associated model coefficients*

_{θ}**x**. The problem (4) was shown in [15] to be approximately bi-convex (convex in

**x**for fixed

**and convex in**

*θ***for fixed**

*θ***x**) as long as

*Q*> 3/2, hence is amenable to an “Alternating Convex Search” (ACS) strategy. The chief benefit to the ACS algorithm is speed. The algorithm exploits the local convex structure of the problem and is guaranteed to converge monotonically to a fixed point solution [20]. The downside to the approach is that claims of global optimality cannot be made, and global optimization strategies are computationally prohibitive. Nonetheless, in practice we have found that for this particular problem and for the noise levels we encounter, the solution does converge to the global optimum. It is this combination of speed and accuracy that makes ACS well-suited to practical applications. Using this approach, we initialize all ${\theta}_{k}^{(i=0)}=0$,

*k*= 1,

*…*,

*QN/*2 and solve the familiar convex

*ℓ*

_{1}regularization problem

To this end, a host of algorithms are available. In this work we use the Gradient Projections for Sparse Reconstruction (GPSR) algorithm which is one widely used approach to solving Eq. (5) [21]. We declare an estimated coefficient to be non-zero if its magnitude exceeds a threshold *κ* and we capture the associated indices in the set
${\mathcal{S}}^{(i)}\equiv \left\{k:\left|{x}_{k}^{(i)}\right|>\kappa \right\}$.

The next step is to hold the model coefficients fixed, and for each non-zero frequency $k\in {\mathcal{S}}^{(i)}$ we solve

*Q ≥*3/2 as was demonstrated in [15]. However it was further demonstrated in [15] that in practice

*Q*= 1 results in a well-defined optimization problem over the entire frequency space and can be solved using a variety of standard minimization techniques; here we use an interior-point method. While one can certainly increase

*Q*if desired, this increases recovery time without improving accuracy. The rationale behind the approach Eq. (6) is that even if an atom’s frequency location is slightly incorrect, significant signal energy (in an amount exceeding

*κ*) will still be present in the associated coefficient. Conversely, frequencies with little or no signal energy present will not require any modification. We note that similar, “local optimization” steps are also employed in a modification to orthogonal matching pursuits [22] (see also [23]), and in the spectral compressive sensing work of Duarte and Baraniuk [24]. The former is designed to minimize discretization errors by searching nearby grid vectors for improvements to the model fit (although not modifying the grid locations as is done here) while the latter uses a local quadratic interpolation to modify the grid (as opposed to solving for the frequencies directly).

This second stage of the algorithm alters the signal model, hence the coefficients that describe the model are also likely to change. Thus, we return to Eq. (5) and the process repeats until a convergence criteria is met. The method can therefore be appropriately described as an ACS with GPSR as the *ℓ*_{1} minimizer. We specify as a termination criteria that the relative change in objective function Eq. (4) between iterations be less than a pre-defined tolerance.

Guidelines for implementing the algorithm are found in [15] with relevant details given here. First, in order to initialize the GPSR algorithm we use the familiar least-squares solution
${\mathbf{x}}^{(0)}={(\mathbf{\Phi}{\Psi}_{\mathbf{\theta}})}^{T}\mathit{y}$. At each subsequent iteration (*i* = 1,2,,…,), we use the current estimate **x**^{(}^{i}^{)} in seeding the solution to Eq. (5). This is a process referred to as “warm starting” by the authors of [21]. Secondly, in selecting the sparsity penalty *λ*, we choose *λ* = *α*‖(**Φ**Ψ* _{θ}*)

^{T}**y**‖

_{∞}where

*α*is a small, real constant. In this work we use

*α*= 0.1. Finally we note that too small a threshold

*κ*increases the computational burden (increases the size of ${\mathcal{S}}^{(i)}$) while too large a threshold risks “missing” a frequency component that requires adjusting. In the experimental results that follow the threshold was set to the constant value

*κ*= 0.14.

It is also worth mentioning that the ACS algorithm is fast relative to other approaches. In [15] the ACS approach was shown to take *≈* 1second to recover signals of length *N* = 256. By comparison, the “semi-definite programming” approach [13] took *O*(10^{2}) seconds to decode the same signals. The speed of the overcomplete dictionary approach depends on *Q*. For *Q* ≥ 10, the ACS approach was shown to be clearly faster. For *Q* < 10 ACS is marginally slower(1.1 sec. vs. 0.75 sec for a sparsity of *S* = 4, for example), but was clearly shown to improve the accuracy in the recovery. In short, the balance of accuracy and speed offered by ACS is not currently matched by any other published approach.

## 4. Compressively sampled photonic link

The system studied in this work is a compressively sampled, photonic system described in detail in [6] and depicted in Fig. 1. The input signal *z*(*t*) is introduced via the first Mach-Zehnder modulator (MZM) and is subsequently mixed with a pseudo-random bit sequence (PRBS) using a second Mach-Zehnder. The PRBS possesses a bit period of 1/2.5*e*^{9} seconds, a mark ratio of 1/2 and a length of 2^{9} −1. The mixed signal is then photodetected (PD) and low-pass filtered (LPF) prior to digitization at the slow sampling interval,
${\tilde{\mathrm{\Delta}}}_{t}$. Each of these stages can be modeled as a separate, linear matrix operation on the desired signal, *z*(*n*),* n* = 1 ⋯ *N* and therefore fits in the form required of Eq. (1). It should also be stated that for our measurements the electrical signal-to-noise ratio at the input to the digitizer was approximately 30 dB in a 1 GHz bandwidth. For this noise level we achieved excellent performance (as will be shown in the next section), however, understanding the performance of CS reconstruction for a variety of SNRs is an important issue that remains to be addressed in future work.

Other photonic CS architectures have been proposed and demonstrated. For example, Yan *et al.* [7] also employ a form of microwave photonic link. However, the system used here is much simpler (requires fewer components) and also does not require a separate measurement step for frequency disambiguation during which the PRBS modulation is temporarily turned off. Rather, in the approach demonstrated here, disambibiguation occurs automatically as part of the reconstruction process. It should also be pointed out that the work presented in [7]. employs a finely-spaced but fixed grid of frequencies and therefore does not address the issue of off-grid signals.

In our past work we have demonstrated how the compressed samples **y** can be used in conjunction with the estimator Eq. (2) to achieve excellent signal reconstructions. However, accurate reconstructions were not possible for signals that did not lie on the discrete Fourier grid [14]. In the next section we clearly demonstrate the ability of the ACS algorithm to correct this shortcoming.

## 5. Experimental performance

To demonstrate the approach, we choose the number of tones, *K*, and random phases *ϕ _{k}* ∈ [0,2

*π*) and generate $z(t)={\displaystyle {\sum}_{k=1}^{K}{A}_{k}\mathrm{sin}(2\pi {f}_{k}t+{\varphi}_{k})}$ where

*A*are real-valued amplitudes (in Volts) and the

_{k}*f*are user-defined (in Hz). We then record compressed samples according to Eq. (1) using a

_{k}**Φ**matrix defined by the architecture of the previous section. We have found the errors in modeling

**Φ**come primarily from our assumption that the PRBS is transitioning instantaneously to exactly one of two values (± .45 Volts). In reality, the PRBS voltage is not a perfect binary sequence, but contains a good deal of un-modeled structure. The result is that the degree of under sampling predicted by CS theory (which assumes no modeling error) is slightly optimistic. For this reason, in experiment we choose a downsampling factor of seven (

*M/N*= 0.143).

As a first example we recover a single tone (*K* = 1) at above-Nyquist frequencies. Specifically, using *M* = 571 compressed samples recorded at 1.43 GS/s, we reconstructed *N* = 4000 samples of the signal at a rate of 10 GS/s. In order to quantify the performance of the recovery algorithm, we use the estimated mean relative recovery error,
$E\left[{\Vert \mathbf{z}-\widehat{\mathbf{z}}\Vert}_{2}^{2}/{\Vert \mathbf{z}\Vert}_{2}^{2}\right]$, as a function of the input signal frequency. These results are shown in Fig. (2) for both the GPSR recovery algorithm and the ACS algorithm. A total of 30 realizations of the input signal at each frequency were used in approximating the expectation. As predicted, the ACS algorithm provides nearly uniform recovery accuracy over all frequencies tested and does not depend on whether or not the signal is “off the grid” thus effectively removing the gridding errors associated with this CS link (and documented in [14]).

As a second demonstration, we formed a linear combination of four sinusoids at frequencies *f*_{1} = 251 *MHz*, 1.001 *GHz*, 1.006 *GHz*, and 1.351 *GHz*. This choice presents a difficult test case whereby the signal spans a large instantaneous bandwidth, yet possesses two tones that are very close in frequency. We then use *M* = 571 compressed samples at 1.43 GS/s to recover *N* = 4000 samples of the true signal at a 10 *GS/s* rate, i.e., an under sampling factor of 7×. Each of the signal frequencies is therefore “off-grid”, thus preventing the GPSR recovery algorithm from working properly. Reconstructions obtained using the GPSR algorithm are compared to those obtained via the ACS estimator in Fig. 3. The ACS algorithm is better able to identify all four frequencies, resulting in a significantly better match between the estimate and truth. This is particularly true for the two closely-spaced frequencies near 1 GHz (right plot of Fig. 3).

## 6. Conclusions

This work has presented an experimental implementation of the Alternating Convex Search (ACS) algorithm for correcting frequency errors in the signal model used in compressive sampling applications for real harmonic signals. The algorithm treats both frequencies and model coefficients as unknowns and uses an iterative approach in developing estimates. The approach effectively corrects signal model errors in an experimental compressively sampled photonic link. The resulting signal reconstruction errors no longer depend on the signal frequency resulting in a significantly more robust system performance than can be obtained without the ACS correction. Successful recovery was demonstrated on a four tone “off-grid” signal occupying a 1.1GHz instantaneous bandwidth where the traditional approach fails.

## Acknowledgments

The authors would like to acknowledge the Naval Research Laboratory for supporting this work under NRL 6.1 & 6.2 Base Programs.

## References and links

**1. **D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory **52**(4), 1289–1306 (2006). [CrossRef]

**2. **E. Candès and M. Wakin, “An introduction to compressive sampling,” IEEE Sig. Process. Mag. **25**(2), 21–30 (2008). [CrossRef]

**3. **M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel Imaging via Compressive Sampling,” IEEE Sig. Process. Mag. **25**(2), 83–91 (2008). [CrossRef]

**4. **W. L. Chan, K. Charan, D. Takhar, K. F. Kelly, R. G. Baraniuk, and D. M. Mittleman, “A single-pixel terahertz imaging system based on compressed sensing,” Appl. Phys. Lett. **93**, 121105 (2008). [CrossRef]

**5. **M. Mishali and Y. Eldar, “From theory to practice: sub-Nyquist sampling of sparse wideband analog signals,” IEEE J. Sel. Top. Sig. Process. **4**(2), 375–391 (2010). [CrossRef]

**6. **J. M. Nichols and F. Bucholtz, “Beating Nyquist With light: a compressively sampled photonic link,” Opt. Express **19**(8), 7339–7348 (2011). [CrossRef] [PubMed]

**7. **L. Yan, Y. Dai, K. Xu, J. Wu, Y. Li, Y. Ji, and J. Lin, “Integrated multifrequency recognition and downconversion based on photonics-assisted compressive sampling,” IEEE Photon. J. **4**(3), 664–670 (2012). [CrossRef]

**8. **Y. Wu, C. Chen, P. Ye, G. R. Arce, and D. W. Prather, “Development of a compressive programmable array microscope,” vol. 1–5 of Proceedings of the Conference on Lasers and Electro-optics/Quantum Electronics and Laser Science Conference, pp. 3135–3136 (2009).

**9. **M. E. Gehm, R. John, D. J. Brady, R. M. Willett, and T. J. Schultz, “Single-shot compressive spectral imaging with a dual-disperser architecture,” Opt. Express **15**(21), 14013–14027 (2007). [CrossRef] [PubMed]

**10. **A. Wagadarikar, R. John, R. Willett, and D. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Appl. Opt. **47**(10), B44–B51 (2008). [CrossRef] [PubMed]

**11. **Y. Chi, L. Scharf, A. Pezeshki, and A. R. Calderbank, “Sensitivity to basis mismatch in compressed sensing,” IEEE Trans. Sig. Proc. **59**(5), 2182–2195 (2011). [CrossRef]

**12. **Y. Chi, A. Pezeshki, L. Scharf, and R. Calderbank, “Sensitivity to basis mismatch of compressed sensing for spectrum analysis and beamforming,” in in 2009 Workshop on Defence Applications of Signal Processing (DASP) (2009).

**13. **G. Tang, B. N. Bhaskar, P. Shah, and B. Recht, “Compressed sensing off the grid,” in Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing, pp. 778–785 (IEEE, 2012).

**14. **C. V. McLaughlin, J. M. Nichols, and F. Bucholtz, “Basis Mismatch in a Compressively Sampled Photonic Link,” IEEE Photon. Technol. Lett. **25**(23), 2297–2300 (2013). [CrossRef]

**15. **J. M. Nichols, A. K. Oh, and R. M. Willett, “Reducing basis mismatch in harmonic signal recovery via alternating convex search,” IEEE Sig. Proc. Letters **21**(8), 1007–1011 (2014). [CrossRef]

**16. **N. S. Rao, P. Shah, S. J. Wright, and R. Nowak, “A greedy forward-backward algorithm for atomic norm constrained minimization,” in ICASSP’13, pp. 5885–5889 (2013).

**17. **A. M. Bruckstein, D. L. Donoho, and M. Elad, “From Sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Review **51**(1), 34–81 (2009). [CrossRef]

**18. **D. L. Donoho and J. Tanner, “Precise undersampling theorems,” Proc. IEEE **98**(6), 913–924 (2010). [CrossRef]

**19. **S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Review **43**(1), 129–159 (2001). [CrossRef]

**20. **J. Gorski, F. Pfeuffer, and K. Klamroth, “Biconvex sets and optimization with biconvex functions: a survey and extensions,” Math. Meth. Op. Res. **66**, 373–407 (2007). [CrossRef]

**21. **M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems,” IEEE J. Sel. Top. Sig. Proc. **1**(4), 586–597 (2007). [CrossRef]

**22. **A. Fannjiang and W. Liao, “Mismatch and resolution in compressive imaging,” Proc. SPIE **8138**, 892434 (2011).

**23. **A. Fannjiang and W. Liao, “Coherence pattern-guided compressive sensing with unresolved grids,” SIAM J. Img. Sci. **5**(1), 179–202 (2012). [CrossRef]

**24. **M. F. Duarte and R. G. Baraniuk, “Spectral compressive sensing,” Appl. Comput. Harm. Anal. **35**(1), 111–129 (2013). [CrossRef]