## Abstract

Coregistration errors in multi- and hyperspectral imaging sensors arise when the spatial sensitivity pattern differs between bands or when the spectral response varies across the field of view, potentially leading to large errors in the recorded image data. In imaging spectrometers, spectral and spatial offset errors are customarily specified as “smile” and “keystone” distortions. However, these characteristics do not account for errors resulting from variations in point spread function shape or spectral bandwidth. This paper proposes improved metrics for coregistration error in both the spatial and spectral dimensions. The metrics are essentially the integrated difference between point spread functions. It is shown that these metrics correspond to an upper bound on the error in image data. The metrics enable estimation of actual data errors for a given image, and can be used as part of the merit function in optical design optimization, as well as for benchmarking of spectral image sensors.

©2012 Optical Society of America

## 1. Introduction

Multi- and hyperspectral imaging records spectral characteristics of the incoming light within each image pixel. Through image processing, a variety of information products can be extracted. The processing of spectral images generally assumes that the image sensing process provides full spatial and spectral coregistration. In other words, it is assumed that for any given image pixel, all spectral bands measure light from the same area, and also that for any given band, all pixels have the same set of spectral responses. It is well known that even small coregistration errors can lead to large errors in the measured pixel spectrum [1–3]. Therefore, spatial and spectral coregistration are critical factors for the quality of a spectral imaging sensor. Unfortunately, perfect coregistration is not possible in a practical optical design. Coregistration errors can be introduced by aberrations, distortions and diffraction. Coregistration performance may also depend on scanning schemes and data preprocessing.

A widely discussed type of coregistration error is the “keystone” distortion in imaging spectrometers, where wavelength-dependent magnification leads to spatial offset between pixel centers in different bands. Spatial coregistration errors between bands may also have the form of differences in the size and shape of the sensitivity distributions in the scene, as illustrated in Fig. 1. In the spectral dimension, spatially varying band offset error is known as “smile”. Coregistration error can also result from differences in the width or shape of the spectral response. A separate, rarely discussed form of coregistration error can arise within a single band and pixel if the spatial and spectral responses are interdependent.

Spectral and spatial coregistration has been the topic of numerous studies [1–14]. For imaging spectrometers, the offset-type coregistration error due to keystone and smile distortions is usually expressed in percent of the sampling interval. However, there appears to be no commonly accepted way to express differences in the size or shape of the spatial or spectral sensitivity distributions [3,5,9]. Also, there appears to be no common way to compare the effect of different types of coregistration errors, or to express their combined effect. It is thus desirable to find a common metric which can be used to characterize all forms of coregistration error, in a way that reflects the effect of the error on the measured signal.

A metric for spatial coregistration error was proposed in Ref. [13] based on analysis of particular cases of spatial coregistration error, and also put forward in Ref. [14] at the same conference. Here it is shown that this kind of metric can give a general upper bound on the signal errors resulting from coregistration errors, both spatial and spectral. After an introduction of basic terminology and concepts, spatial coregistration error is analyzed in detail in Section 2. The same methodology is then applied to spectral coregistration errors in Section 3, as well as coregistration errors due to spectral-spatial interdependence in Section 4. The treatment applies to all types of spectral image sensors. For pure keystone and smile error, the metric value approximates the conventional “percent of a pixel” measure under reasonable assumptions. The metrics discussed here have potential to become a universal way of specifying the coregistration performance of any spectral imaging sensor.

## 2. Analysis of spatial coregistration errors

#### 2.1 Preliminaries and definitions

First, it can be noted that there is a lack of established terminology to fully describe the concepts that are encountered in the analysis of coregistration errors. The terminology used in this paper is as follows: The treatment considers a generic spectral image *sensor*, which incorporates optics, photodetector elements, data preprocessing, scanning and possibly other parts. The *scene* is the landscape or object to be imaged. Details of the scene physics are not considered, and the term scene is taken as equivalent to the spectral and spatial distribution of light seen by the sensor (excitance or radiance for scenes at finite and infinite distance, respectively). The overall function of the sensor is to receive *light input* from the scene and produce a *spectral image* as its output. The sensor has a field of view composed of one or more *sensor pixels*. The spectral image data consists of *image pixels*. Each image pixel consists of *samples* of the light input in a set of spectral bands. In some cases, such as an imaging spectrometer, an image pixel corresponds to a particular sensor pixel. In other cases, such as a filter wheel camera on a moving platform, a preprocessing step is needed to compose image pixels from multiple sensor pixels. A region in the scene or image corresponding to the nominal “footprint” of an image pixel is referred to as a *pixel*, but the meaning should be clear from context. The sample values output by the sensor are referred to as *signals*, and are assumed to be proportional to the light input. The proportionality ratio of signal output to light input is the *responsivity*. Sometimes, redundant terminology such as “light samples” is used, hopefully assisting the reader.

In the spectral dimension, the sample values represent a weighted average over the spectrum of light according to a weighting function which ideally is common to all samples in a given band. This *spectral response function* (SRF) is further discussed in section 3. A more general description of the spatial and spectral sampling is given in section 4. In this first part of the paper, I consider only the spatial sampling of the sensor.

In the spatial dimensions, each sample value is ideally an integral of the light input over the nominal pixel area in the scene. In practice, the spatial distribution of responsivity usually has the form of a peak at the pixel location, but with some overlap with the neighboring image pixels as illustrated in Fig. 2(a). Conventionally, the spatial resolution of imaging optics is described by the point spread function (PSF). However, the PSF is usually understood to be a shift-invariant impulse response function. The spatial sampling into discrete image pixels is sometimes described in the literature by a “system point spread function” which characterizes the overall spatial performance, including the sampling by the photodetector elements. Often, characterization of imaging systems assumes that all pixels exhibit identical spatial sampling properties. To characterize coregistration, however, it is necessary to consider the detailed spatial and spectral characteristics of each sample in the image.

Here, the spatial distribution of responsivity corresponding to an individual light sample will be termed a *sampling point spread function* (SPSF). Consider a single sample in the image, in a band with index *i*. Let the SPSF be a dimensionless function ${f}_{i}(x,y)$ proportional to the sensor responsivity at the point (*x*,*y*) for this sample, where *x* and *y* are image coordinates in pixel units. Thus, for convenience of notation, the pixel boundaries are assumed to form a rectangular grid. Furthermore, let the SPSF be scaled so that it satisfies

$$\iint {f}_{i}(x,y)\,dx\,dy=1. \qquad (1)$$

Note that the SPSF makes no reference to the internal details of the spectral image sensor; it only describes a relation between the light input and the image data output. The radiometric characteristics of the sensor are not part of the SPSF, due to the normalization (1). However, it must be assumed that the sensor's radiometric response is linear, since otherwise the SPSF would depend on the light input.

An ideal sensor has no coregistration errors, and the SPSFs of light samples from different bands in the same image pixel will be identical. In practice, the SPSFs of different bands in a given pixel will differ slightly from each other in position and shape due to sensor imperfections, as illustrated in Fig. 1. These differences between SPSFs contain the full information about spatial coregistration errors between bands in the image pixel under consideration. The main point of this paper is to propose a simple way to characterize the coregistration performance based on the set of SPSFs for all bands and all pixels.

#### 2.2 Signal error for a single image pixel with two bands

The effect of coregistration errors depends on the properties of the scene. If the scene is uniform then such errors will have no effect on the recorded signal. Thus, coregistration errors will not affect an image pixel that is “pure” in the sense that the light input is constant within the extent of its SPSFs (one for each band). To characterize coregistration error, it is desirable to obtain an upper bound on its effect in the image. We must then find a scene that causes the largest effect on the signal. Clearly, this worst case must occur for a “mixed” pixel, containing different scene materials within the SPSF.

Consider the simplest possible case of a single image pixel and two spectral bands *i* and *j*. Assume that the scene consists of two materials A and B, with a sharp boundary between them. Ideally, the measured spectrum will be a weighted sum of the spectra of materials A and B. The weight of material A in band *i* is

$${w}_{A,i}={\iint}_{A}{f}_{i}(x,y)\,dx\,dy, \qquad (2)$$

where the integral is taken over the part of the scene covered by material A. The signal in band *i* is then

$${S}_{i}={w}_{A,i}{S}_{A,i}+(1-{w}_{A,i}){S}_{B,i},$$

where ${S}_{A,i}$ and ${S}_{B,i}$ are the signal levels in band *i* for pure pixels of materials A and B. (The treatment here is independent of the measurement unit for the signals, since it is based on the normalized responsivity distribution.) Thus, the ratio of contributions from materials A and B in band *i* is ${w}_{A,i}/(1-{w}_{A,i})$.

Assume that the SPSF of band *j* has a different shape, or an offset in position, due to some coregistration error. Then the weight of material A will be different in band *j* by an amount

$$\Delta w={w}_{A,j}-{w}_{A,i}={\iint}_{A}\left[{f}_{j}(x,y)-{f}_{i}(x,y)\right]dx\,dy, \qquad (3)$$

termed here the *weighting error*. The same amount of change occurs in the opposite direction for material B. Note that Eq. (1) ensures that the weighting error satisfies ${w}_{A,i}+\Delta w\le 1$, so that the total weight can never exceed 1. The signal in band *j* becomes

$${S}_{j}=({w}_{A,i}+\Delta w){S}_{A,j}+(1-{w}_{A,i}-\Delta w){S}_{B,j}. \qquad (4)$$

In band *j*, the ratio of contributions from materials A and B is $({w}_{A,i}+\Delta w)/(1-{w}_{A,i}-\Delta w)$. Thus, for $\Delta w\ne 0$, the relative weighting of the two materials is different in the two bands. Then the measured spectrum cannot be formed by any linear mixing of the spectra of materials A and B within the image pixel, as pointed out in Ref. [1].
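The mixing arithmetic above can be sketched numerically. The signal levels and weights below are hypothetical, chosen so that the two materials have opposite contrast in the two bands:

```python
# Hypothetical pure-pixel signal levels for materials A and B in bands i, j.
S_A = {"i": 0.80, "j": 0.20}     # A: bright in band i, dark in band j
S_B = {"i": 0.10, "j": 0.70}     # B: dark in band i, bright in band j

w_A = 0.5       # weight of material A in band i
dw = 0.1        # weighting error in band j, Eq. (3)

S_i = w_A * S_A["i"] + (1 - w_A) * S_B["i"]                # ideal mixing
S_j_ideal = w_A * S_A["j"] + (1 - w_A) * S_B["j"]
S_j = (w_A + dw) * S_A["j"] + (1 - w_A - dw) * S_B["j"]    # distorted mixing

signal_error = S_j - S_j_ideal   # equals dw * (S_A,j - S_B,j), Eq. (5)
```

Here a weighting error of 0.1 shifts the band-*j* signal by 0.05, i.e. 10% of the band contrast between the two materials.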

The effect of coregistration error can be expressed as the change of the signal in band *j* due to the coregistration error:

$$\Delta {S}_{j}=\Delta w\,({S}_{A,j}-{S}_{B,j}), \qquad (5)$$

termed here the *signal error*. Note that there is no real asymmetry between bands *i* and *j* in this treatment, but Eq. (5) expresses the effect of coregistration error in terms of an equivalent error in the signal value for band *j*.

#### 2.3 Basic metric for coregistration error between two bands in a single image pixel

For the case of two bands and two scene materials, the maximum signal error occurs when the weighting error (3) has its largest possible value. A similar error model could be established for a scene with more than two materials. However, the maximum error for a multi-material scene could not be any larger than that given by Eq. (5) when the two materials are chosen so that one is bright and one is dark in the sense that their output signals are at opposite ends of the dynamic range of the sensor. Therefore, the case of two scene materials is sufficient to represent the largest possible signal error.

The worst-case spatial arrangement of a two-material scene can be found by a geometric argument based on the SPSFs: Note first that the scene can be divided into two parts so that in one part of the scene ${f}_{i}>{f}_{j}$ and elsewhere ${f}_{i}<{f}_{j}$. The boundary between these parts, at the SPSF intersection ${f}_{i}={f}_{j}$, will be a smooth line, since the SPSFs are smooth. The division of the scene is illustrated by the examples in Fig. 2 for two specific cases. The largest weighting error occurs for a scene in which the line ${f}_{i}={f}_{j}$ coincides with the boundary between the two scene materials. For example, assume that material A is present wherever ${f}_{i}<{f}_{j}$ and material B is present wherever ${f}_{i}>{f}_{j}$. Then $\Delta w$ has its largest possible value $\Delta {w}_{\mathrm{max}}$ and, as is clear from the figure, we have found the scene geometry that produces the largest signal error.

The maximum weighting error $\Delta {w}_{\mathrm{max}}$ can then be calculated by integrating the SPSF difference over the region filled with one of the materials, as in Eq. (3). We can observe, however, that this worst-case weighting error can also be found directly from the SPSFs themselves, without considering a particular scene geometry: Using the notation above, we have

$$\Delta {w}_{\mathrm{max}}={\iint}_{{f}_{i}<{f}_{j}}\left[{f}_{j}(x,y)-{f}_{i}(x,y)\right]dx\,dy.$$

Inserting for the weights and using the linearity of the integration operation, together with the fact that both SPSFs satisfy the normalization (1), we obtain

$$\Delta {w}_{\mathrm{max}}=\frac{1}{2}\iint \left|{f}_{i}(x,y)-{f}_{j}(x,y)\right|dx\,dy. \qquad (6)$$

In principle, the integral is taken over the entire image plane, but in practice, the main contribution comes from an area surrounding the pixel under consideration. This expression can be used as a *metric* for spatial coregistration error in the pair of bands *i* and *j*, denoted here by the symbol ${\epsilon}_{s,ij}$. The metric (6) does not express signal error directly, but has the useful property of being determined by the sensor alone, independent of the scene.

For fully coregistered bands, ${\epsilon}_{s,ij}=0$. For the case of SPSFs with no overlap at all, we have ${\epsilon}_{s,ij}=1$ independent of the separation between the SPSFs in the image plane. If ${\epsilon}_{s,ij}\approx 1$ and the SPSF width is comparable to the spatial sampling interval, then obviously a better coregistration can be obtained by reindexing the data so that pixel spectra are composed of overlapping SPSFs. Thus, in most cases we will have ${\epsilon}_{s,ij}<0.5$. Higher values of the coregistration metric are in principle possible in a system where the SPSF width is much smaller than the spatial sampling interval. This is not likely to be a concern in practice, however, since an improvement in coregistration performance could be achieved simply by defocusing to make the SPSFs overlap.
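As a concrete illustration, the metric (6) can be evaluated numerically by sampling the SPSFs on a common grid. The sketch below assumes hypothetical Gaussian SPSFs with a FWHM of 1 pixel, with band *j* offset by 0.2 pixel to mimic keystone:

```python
import numpy as np

def spatial_coregistration_error(f_i, f_j, cell_area):
    """Metric (6): half the integrated absolute difference of two SPSFs
    sampled on a common grid with cells of area cell_area (pixel units)."""
    return 0.5 * np.sum(np.abs(f_i - f_j)) * cell_area

# Hypothetical Gaussian SPSFs, FWHM = 1 pixel; band j offset by 0.2 pixel.
n, half = 501, 3.0
x = np.linspace(-half, half, n)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x)
sigma = 1.0 / 2.355                      # FWHM of 1 pixel

def gaussian_spsf(x0, y0):
    g = np.exp(-((X - x0) ** 2 + (Y - y0) ** 2) / (2 * sigma ** 2))
    return g / (np.sum(g) * dx * dx)     # enforce the normalization (1)

f_i = gaussian_spsf(0.0, 0.0)
f_j = gaussian_spsf(0.2, 0.0)
eps = spatial_coregistration_error(f_i, f_j, dx * dx)
```

For identical SPSFs the metric is exactly zero; for the 0.2-pixel offset it comes out slightly below 0.2, consistent with the keystone discussion in Sec. 2.5.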

In [14], spatial coregistration was expressed as $1-{\epsilon}_{s,ij}$, a figure of merit that increases with performance. Here, I prefer to express the error ${\epsilon}_{s,ij}$, which relates more directly to the physical imperfections and to the signal error.

#### 2.4 Aggregate coregistration metric for multiple bands and multiple pixels

For most practical uses, a metric for coregistration error must be an aggregate value over multiple bands and/or multiple pixels. It is reasonable to assume that a given amount of coregistration error between two bands is equally bad regardless of which image pixel or which pair of bands the error appears in. The basic metric ${\epsilon}_{s,ij}$ varies linearly with the signal error, so an aggregate metric can be formed by averaging over all band pairs in each image pixel, and then averaging over all pixels. For *B* bands and *P* pixels, the average is

$${\overline{\epsilon}}_{s}=\frac{1}{P}\sum_{p=1}^{P}\frac{2}{B(B-1)}\sum_{i<j}{\epsilon}_{s,ijp}, \qquad (7)$$

where ${\epsilon}_{s,ijp}$ is the coregistration error metric (6) for bands *i* and *j* in image pixel *p*. In cases where each image pixel corresponds to a single sensor pixel, *P* is the number of sensor pixels. However, when image pixels are composed from multiple sensor pixels, the averaging over pixel index in Eq. (7) must be understood to include an average over all possible preprocessing cases. When it is known that the signal-to-noise level will vary between bands, it may be appropriate to include a weighting factor in the averaging over band index, for example based on the information rate in each band [15].
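The aggregation over band pairs and pixels can be sketched as follows, assuming the pairwise metrics have already been computed and stored in an array indexed by pixel and band pair (the error values here are synthetic):

```python
import numpy as np

def aggregate_coregistration(eps):
    """Average over band pairs and pixels, Eq. (7), plus the maximum.
    eps[p, i, j] holds the pairwise metric for bands i, j in pixel p,
    symmetric in i and j with zero diagonal."""
    P, B, _ = eps.shape
    iu, ju = np.triu_indices(B, k=1)               # band pairs with i < j
    pair_mean = eps[:, iu, ju].mean(axis=1)        # per-pixel average
    return pair_mean.mean(), eps.max()

# Synthetic example: 4 pixels, 3 bands, error growing with band separation.
rng = np.random.default_rng(0)
P, B = 4, 3
eps = np.zeros((P, B, B))
for i in range(B):
    for j in range(i + 1, B):
        vals = 0.05 * (j - i) * (1 + 0.1 * rng.random(P))
        eps[:, i, j] = eps[:, j, i] = vals

eps_mean, eps_max = aggregate_coregistration(eps)
```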

The average metric ${\overline{\epsilon}}_{s}$ gives a measure of the overall coregistration error, but no upper limit. The largest errors tend to appear at the ends of the spectral range or at the edges of the field of view, and may then be significantly larger than the mean error. Thus, it may also be desirable to know the largest coregistration error between any pair of bands in any image pixel:

$${\epsilon}_{s,\mathrm{max}}=\underset{i,j,p}{\mathrm{max}}\;{\epsilon}_{s,ijp}.$$

For reporting of sensor performance to the data user, it would be informative to give both the average coregistration error ${\overline{\epsilon}}_{s}$ and the maximum error ${\epsilon}_{s,\mathrm{max}}$. This pair of values is suggested here as a possible standard for reporting of spatial coregistration performance for spectral imagers.

In some cases, a more detailed specification of coregistration may be of interest. If, for example, a subset of sensor pixels exhibits significantly better coregistration, then their performance can be reported separately so that the user can select higher quality data when needed. Similarly, the coregistration error can be reported separately for each band as an average of its coregistration with the other bands:

$${\overline{\epsilon}}_{s,i}=\frac{1}{P}\sum_{p=1}^{P}\frac{1}{B-1}\sum_{j\ne i}{\epsilon}_{s,ijp}. \qquad (8)$$

For many sensor types, such as the imaging spectrometer, the design aims to provide coregistration in the sensor hardware (as opposed to software preprocessing of raw data). Design of such a spectral imaging sensor faces a compromise between pixel count and coregistration. By binning the image data to form an image with fewer pixels, coregistration error will tend to be reduced by a factor equal to the binning factor. The sensor pixel count *P* and the average spatial coregistration error ${\overline{\epsilon}}_{s}$ can therefore be expressed in a combined performance metric which may be termed “limiting number of pixels” ${P}_{\mathrm{lim}}$:

$${P}_{\mathrm{lim}}=\frac{P}{{\overline{\epsilon}}_{s}}.$$

This is a figure of merit which is invariant with binning of the image, and which also reflects the increased utility of a sensor with a larger number of pixels in its field of view.

#### 2.5 Equivalence with conventional measure of keystone error

Consider “keystone” coregistration error, where there is an offset between the SPSFs of two bands, often specified as a fraction of the pixel sampling interval. This form of error can be analyzed in one dimension by projecting the SPSF onto a line in the direction of the offset. Assume, without loss of generality, that the two bands are offset along the *x*-axis. Then the projected SPSF is obtained by

$$f(x)=\int f(x,y)\,dy.$$

This projected SPSF can be seen as a line spread function for a line at the pixel center. Let the width of the projected SPSF be ∆*x*, according to some reasonable measure such as full width at half maximum (FWHM). Then the peak amplitude of $f(x)$ will be on the order of 1/∆*x* since the integral of $f(x)$ is 1.

For a small offset, the largest signal error occurs when a scene boundary passes through the SPSF peak, as shown in Fig. 2(d). The resulting worst-case weighting error is the part of the SPSF volume moved across the boundary by the offset, as illustrated in Fig. 3. If the offset is a fraction *q* of a pixel, and the pixel center and scene boundary are at ${x}_{0}$, then

$${\epsilon}_{s,ij}=\Delta {w}_{\mathrm{max}}\approx q\,f({x}_{0}). \qquad (9)$$

Thus, the coregistration metric (6) expresses keystone error as a fraction of the SPSF width, approximately.

The width of the SPSF is normally comparable to one pixel, in the sense that most of the volume under the SPSF falls within the nominal pixel boundary. (This assumes that the sensor design is balanced so that blurring effects are matched to the pixel pitch of the image.) Then, to satisfy Eq. (1), the peak height of the projected SPSF is $f({x}_{0})\approx 1$, and from Eq. (9) we have

$${\epsilon}_{s,ij}\approx q.$$

This analytical result has been compared to a numerical simulation of coregistration between two Gaussian SPSFs, both with FWHM of 1 pixel, at varying offset *q*. In this case, the metric slightly underestimates the offset, but remains within 80% of the correct value up to an offset of *q*=0.8 pixel. Thus, under reasonable assumptions, the spatial coregistration error metric ${\epsilon}_{s}$ approximates the offset *q* and can be interpreted as “fraction of a pixel”, in accordance with the customary way of specifying keystone error.
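The comparison described above can be reproduced in outline. The sketch below assumes Gaussian projected SPSFs with a FWHM of 1 pixel and evaluates the one-dimensional analogue of the metric (6) at a few offsets:

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 4001)
dx = x[1] - x[0]
sigma = 1.0 / 2.355                      # FWHM of 1 pixel

def proj_spsf(center):
    g = np.exp(-((x - center) ** 2) / (2 * sigma ** 2))
    return g / (g.sum() * dx)            # normalized projected SPSF

def metric_1d(q):
    """One-dimensional analogue of metric (6) for an offset of q pixels."""
    return 0.5 * np.sum(np.abs(proj_spsf(0.0) - proj_spsf(q))) * dx

ratios = {q: metric_1d(q) / q for q in (0.2, 0.5, 0.8)}
```

The ratio of metric value to true offset decreases as the offset grows, but stays above 0.8 at *q* = 0.8, in line with the figures quoted above.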

#### 2.6 Effect of coregistration error on image processing

Coregistration error may have a significant impact on image processing. For example, signature-specific detection of a small target may fail because the target spectrum is distorted by background signals in a way that is inconsistent with a linear mixing assumption. Coregistration is also critical for any form of parameter estimation, such as indices calculated from specific band ratios, or abundance estimation based on the linear mixing model. Image processing may be particularly sensitive to coregistration error when the scene exhibits strong spatial variation, or when the processing result must be obtained with good spatial resolution.

Many processing algorithms assume that the distribution of pixel spectra falls within a subspace of the “spectral space” defined by the multivariate image data. Coregistration error can lead to large deviations from this assumption. For illustration, consider the simple case of two bands and two scene materials discussed above. Ideally, all image spectra are distributed along a line joining the two endmember spectra of material A and B, as illustrated in Fig. 4. Even for a perfect sensor, the distribution of data is blurred by noise, illustrated in Fig. 4 by the blue region. However, coregistration errors can easily have a much larger effect than the noise. For a randomly varying scene, the weighting difference $\Delta w$ between the bands will tend to vary randomly from pixel to pixel. This variation tends to broaden the linear distribution into a two-dimensional region whose full width is given by ${\epsilon}_{s,\mathrm{max}}$. This is illustrated in the figure for ${\epsilon}_{s,\mathrm{max}}=0.15$, comparable to the specified keystone error for some practical sensors. The red region is the signal variation resulting from a weighting error $\Delta w$ varying randomly in the range ±0.15. When the contrast between the scene materials spans a large fraction of the signal range, as in the figure, then the effect of coregistration error easily dominates over the noise, leading to a large deviation from the assumption of linear mixing. For an image containing a large fraction of relatively homogeneous areas, the overall broadening of the spectral distribution may tend to be less than suggested by the figure. Still, image data from boundaries and pixel-size features in the scene would tend to fall outside the true distribution due to coregistration errors.

In Fig. 4, distribution broadening due to coregistration error spans a region which is larger than the noise by a factor on the order of 10 for the case of two bands. If another band is added, distribution broadening will occur in the new dimension as well. Thus, in the case of a large number of bands, the distribution broadening due to coregistration errors may become very large compared to the broadening by noise. It can be noted, though, that in practice the SPSFs of neighboring bands may be correlated in shape, since optical distortions tend to evolve slowly with wavelength. Therefore, the effect of coregistration error on the signal distribution may be less serious than suggested by extrapolation from Fig. 4 into a higher dimensionality.
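The broadening effect can be simulated directly. The sketch below assumes two hypothetical endmember spectra in two bands, a random mixing weight per pixel, and a weighting error drawn uniformly within ±0.15 in one band; the perpendicular scatter about the mixing line then dwarfs a noise level of 0.005:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
S_A = np.array([0.9, 0.2])               # hypothetical endmember spectrum of A
S_B = np.array([0.1, 0.8])               # hypothetical endmember spectrum of B

w = rng.random(n)                        # mixing weight of A, per pixel
dw = rng.uniform(-0.15, 0.15, n)         # weighting error in band 2 only
w2 = np.clip(w + dw, 0.0, 1.0)
noise = rng.normal(0.0, 0.005, (n, 2))   # sensor noise, much smaller

data = np.column_stack([
    w * S_A[0] + (1 - w) * S_B[0],       # band 1: ideal linear mixing
    w2 * S_A[1] + (1 - w2) * S_B[1],     # band 2: mixing distorted by dw
]) + noise

# Perpendicular distance of each spectrum from the ideal mixing line A-B.
d = (S_B - S_A) / np.linalg.norm(S_B - S_A)
rel = data - S_A
dist = np.abs(rel[:, 0] * d[1] - rel[:, 1] * d[0])
```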

#### 2.7 Estimating signal errors using the coregistration metric

The data user may be interested in an estimate of the effect of coregistration error. The signal error can be seen as an uncontrolled and band-dependent signal contamination from some neighboring constituent of the scene. Given an image and a value for the coregistration metric, it is possible to estimate the magnitude of signal errors in the image: First, it is necessary to calculate a relevant measure of image contrast, such as the average difference between nearest-neighbor image pixels. Let this contrast measure in band *i* be ${\sigma}_{i}$. The coregistration metric (8) can then be used to find an estimate of the signal error in band *i*:

$$\Delta {S}_{i}\approx {\overline{\epsilon}}_{s,i}\,{\sigma}_{i}. \qquad (10)$$

The estimated signal error can potentially be used in image processing. For example, if a target detection threshold is exceeded by a pixel spectrum then it may be of interest to check whether the exceedance can be explained by coregistration error.
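A sketch of the estimation procedure described above, assuming a hypothetical image array and using the mean absolute nearest-neighbor difference as the contrast measure (the function and data here are illustrative, not part of the paper):

```python
import numpy as np

def estimate_signal_error(image, eps_band):
    """Estimate per-band signal error from the per-band coregistration
    metric.  image has shape (rows, cols, bands); eps_band holds the
    per-band metric values."""
    dh = np.abs(np.diff(image, axis=1)).mean(axis=(0, 1))
    dv = np.abs(np.diff(image, axis=0)).mean(axis=(0, 1))
    sigma = 0.5 * (dh + dv)              # nearest-neighbor contrast per band
    return eps_band * sigma

# Hypothetical image: a step edge in band 0, a uniform level in band 1.
img = np.zeros((8, 8, 2))
img[:, 4:, 0] = 1.0
img[:, :, 1] = 0.3
err = estimate_signal_error(img, np.array([0.1, 0.1]))
```

As expected, the uniform band yields a zero error estimate, while the band containing an edge yields a nonzero one.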

## 3. Metric for spectral coregistration errors between image pixels

Now consider the analogous problem of spectral coregistration error. Here it is sufficient to consider a single band, hence the following notation does not include a band index. Let the spectral response function (SRF) in the band for an image pixel with index *p* be ${g}_{p}(\lambda )$. This function describes the variation of responsivity with wavelength $\lambda $, normalized so that

$$\int {g}_{p}(\lambda )\,d\lambda =1.$$

A metric of spectral coregistration between two pixels *p* and *q* in the band under consideration can be obtained from the respective SRFs by

$${\epsilon}_{\lambda ,pq}=\frac{1}{2}\int \left|{g}_{p}(\lambda )-{g}_{q}(\lambda )\right|d\lambda . \qquad (11)$$

While material boundaries and shadows are common in the spatial domain, analogous step-like rearrangements do not occur arbitrarily in the spectral domain. Therefore Eq. (11) will tend to overestimate the signal error, particularly for a smooth spectrum recorded by a hyperspectral imager. There are nevertheless important physical effects that lead to steep slopes in spectra, notably the “chlorophyll edge” and atmospheric absorption lines, which have been shown to cause large signal errors [1,4,6]. In analogy with the spatial metric (6), the spectral coregistration metric (11) can give an upper bound on these errors.

In the case of a spectral offset error, or “smile” distortion, the metric (11) expresses the offset approximately as a fraction of the bandwidth, in direct analogy with Eq. (9). Thus, if the bandwidth is approximately equal to the spectral sampling interval then the metric corresponds to the conventional way of specifying smile, while also capturing coregistration errors due to differences in the shape or width of the SRF.
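To illustrate that the metric (11) also captures bandwidth differences, the sketch below assumes hypothetical Gaussian SRFs: a 1 nm smile offset at 10 nm bandwidth is compared with a pure 20% bandwidth mismatch at identical centers:

```python
import numpy as np

lam = np.linspace(480.0, 540.0, 6001)    # wavelength grid (nm)
dlam = lam[1] - lam[0]

def srf(center, fwhm):
    """Hypothetical Gaussian SRF, normalized to unit area."""
    s = fwhm / 2.355
    g = np.exp(-((lam - center) ** 2) / (2 * s ** 2))
    return g / (g.sum() * dlam)

def spectral_coreg_error(g_p, g_q):
    """Metric (11): half the integrated absolute SRF difference."""
    return 0.5 * np.sum(np.abs(g_p - g_q)) * dlam

# Smile: 1 nm offset between two pixels at 10 nm bandwidth.
eps_smile = spectral_coreg_error(srf(510.0, 10.0), srf(511.0, 10.0))
# Same centers, 20% bandwidth mismatch: invisible to a smile specification.
eps_width = spectral_coreg_error(srf(510.0, 10.0), srf(510.0, 12.0))
```

Both cases yield a metric near 0.09, although only the first would show up in a conventional smile specification.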

Note that the SRF is defined to be proportional to responsivity, which conventionally characterizes response per unit power of the incoming light. Alternatively, the responsivity can be defined as response per photon. (This is arguably the preferred definition, since the fundamental measured quantity is a photon count.) Depending on the definition used, different values for ${\epsilon}_{\lambda ,pq}$ will result. However, this difference will normally not be large for multi- and hyperspectral imagers, where the relative width of each band tends to be small.

Aggregate metrics for spectral coregistration can be defined and used in analogy with the spatial metrics in Sec. 2.4. The average and max operation are then taken over all pixel pairs. Average values can be given over all bands and for each band separately, in analogy with Eqs. (7) and (8). For a given spectral image, actual signal errors can be estimated in analogy with Eq. (10) by using the aggregate metrics and an estimate of spectral contrast such as the mean difference between neighboring bands.

## 4. Metric for spectral-spatial interdependence error for a single light sample

A separate and rarely discussed case of coregistration error occurs if the spectral and spatial responsivity distributions are interdependent. For a single light sample, the spatial distribution of responsivity, described above by the SPSF, is usually assumed to be completely independent of the spectral responsivity distribution, described by the SRF. For many image sensors, this assumption holds to such a high degree that it is normally not made explicit in the literature. However, some spectral image sensor concepts have potential to introduce spectral-spatial interdependencies in the sampling process, which can lead to signal errors.

Consider the sampling of light in a single band *i* in a single image pixel *p*. The response distribution in $xy\lambda $ space for this sample can be described by a spectral-spatial distribution function ${F}_{ip}(x,y,\lambda )$ whose integral over $xy\lambda $ space is 1. Strictly, the SPSF ${f}_{ip}(x,y)$ and SRF ${g}_{ip}(\lambda )$ can only be defined as averages, obtained by integration of ${F}_{ip}(x,y,\lambda )$ over the spectral axis or the image plane, respectively:

$${f}_{ip}(x,y)=\int {F}_{ip}(x,y,\lambda )\,d\lambda ,\qquad {g}_{ip}(\lambda )=\iint {F}_{ip}(x,y,\lambda )\,dx\,dy. \qquad (12)$$

Ideally, the spectral and spatial responsivity distributions are independent. Then the sensor response distribution in $xy\lambda $ space is separable, *i.e.* ${F}_{ip}(x,y,\lambda )={f}_{ip}(x,y){g}_{ip}(\lambda )$, in full analogy with the criterion for independence of continuous distributions in statistics. In the interpretation and exploitation of the image data, it must be assumed that the sensor behaves in this ideal way. However, if the SPSF and SRF are interdependent then the spectral response varies within the extent of the SPSF, and/or the spatial response varies within the band. In that case, the signal is influenced by the spatial arrangement of scene materials in a non-ideal way.

As an example, consider the widely used imaging spectrometer, with a slit defining the field of view, and a grating or prism for spectral dispersion. Since the dispersion is in the direction across the slit, the spectral response will change from one side of the slit to the other, causing interdependence between the spectral and spatial responsivities for a single sample. An idealized imaging spectrometer case is illustrated in Fig. 6, where only the *x* and *λ* dimensions are considered. The *x* direction is assumed to be across the slit. This is normally also the scan direction, but here no scan movement is considered. Then the spectral-spatial responsivity distribution has the form of a parallelogram, as illustrated in the figure. The responsivity is assumed to be constant for positions and wavelengths within the parallelogram, and zero outside. The SPSF and SRF, as defined by Eq. (12), are shown in the insets. The sensor records a signal from a pixel area defined by the extent of the SPSF. Consider a scene consisting of a monochromatic point source with wavelength λ_{0}, located at different points within the pixel. In the case where the source is located at *x*_{1}, it is recorded with a responsivity that is higher than $g({\lambda}_{0})$, leading to a signal value that is too high. If instead the source is located at *x*_{2}, the signal is zero. Thus, in this extreme case, spectral-spatial interdependence causes large errors.

In Fig. 6, it can also be seen that the signal error would depend on the wavelength of the point source. For an extended source in the form of a mixed pixel, the spectral-spatial interdependence would lead to a situation where the weighting of a scene material depends not only on its spatial distribution but also on its spectral properties, which is clearly non-ideal. In practice, optical blur or scan motion may make the errors significantly smaller than suggested by Fig. 6. Nonetheless, a residual error is likely to exist in many systems.

It is possible to define a worst-case signal error due to spectral-spatial interdependence. Consider first a light input which is spectrally and spatially uniform and produces a signal value *a*. Then consider a case where the light input is changed in some part of $xy\lambda $ space to another constant level corresponding to a signal value *b*. Such an input is of course fictitious, but analogous to the mixed pixel case in section 2. Consider a sample recorded in a pixel and band located in $xy\lambda $ space such that it overlaps with the boundary between the two light input levels. The signal will then be a weighted average of *a* and *b*. The weighting of the value *a* in the output signal will be

$${w}_{a}={\iiint}_{a}F(x,y,\lambda )\,dx\,dy\,d\lambda ,$$

where the integral is taken over the region of $xy\lambda $ space with the light input level corresponding to *a*. (For clarity, band and pixel indices have been omitted since the treatment here considers a single light sample.) In the ideal case, spectral and spatial responsivity distributions are independent, and the weighting is

$${w}_{a,\mathrm{ideal}}={\iiint}_{a}f(x,y)\,g(\lambda )\,dx\,dy\,d\lambda ,$$

where the SPSF $f(x,y)$ and SRF $g(\lambda )$ are defined by Eq. (12). It is therefore possible to define a weighting error for spectral-spatial responsivity interdependence as

$$\Delta w={w}_{a}-{w}_{a,\mathrm{ideal}}.$$

In analogy with Figs. 2 and 5, there will be a surface in $xy\lambda $ space defined by

$$F(x,y,\lambda )=f(x,y)g(\lambda ).$$

The worst-case signal error due to spectral-spatial interdependence arises when the input level changes abruptly from one value to another at this boundary. Therefore, in analogy with Eqs. (6) and (11), a metric for spectral-spatial interdependence error can be defined as

$${\epsilon}_{\lambda s}=\frac{1}{2}\iiint \left|F(x,y,\lambda )-f(x,y)g(\lambda )\right|\text{d}x\,\text{d}y\,\text{d}\lambda . \qquad (13)$$
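With the responsivity normalized to unit integral, the interdependence metric can be evaluated numerically as half the integrated absolute difference between the responsivity distribution and the product of its marginals. A minimal sketch, using a sheared-parallelogram responsivity like that of Fig. 6 and comparing it with a fully separable (interdependence-free) distribution:

```python
import numpy as np

def interdependence_metric(F):
    """Half the integrated absolute difference between the normalized
    responsivity distribution and the product of its marginals
    (the SPSF and the SRF)."""
    F = F / F.sum()                     # normalize to unit integral
    f = F.sum(axis=1, keepdims=True)    # SPSF: marginal over wavelength
    g = F.sum(axis=0, keepdims=True)    # SRF: marginal over position
    return 0.5 * np.abs(F - f * g).sum()

nx, nl = 100, 100

# Sheared parallelogram responsivity: strong interdependence.
F_shear = np.zeros((nx, nl))
for ix in range(nx):
    c = 30 + 40 * ix / (nx - 1)
    F_shear[ix, int(c) - 20:int(c) + 20] = 1.0

# Fully separable responsivity: no interdependence, metric ~ 0.
f = np.exp(-0.5 * ((np.arange(nx) - 50) / 10) ** 2)[:, None]
g = np.exp(-0.5 * ((np.arange(nl) - 50) / 10) ** 2)[None, :]
F_sep = f * g

print(interdependence_metric(F_shear))  # substantially nonzero
print(interdependence_metric(F_sep))    # ~ 0
```

The separable case confirms that the metric vanishes exactly when spectral and spatial responses are independent.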

It may be unlikely that a scene will produce an input signal that corresponds to the worst case outlined above. However, it is clear that the errors expressed by Eq. (13) may be significant in some cases such as the non-scanning imaging spectrometer above. Therefore, the interdependence error metric is relevant for design and specification of instruments.

As with the spatial and spectral metrics above, aggregate metrics for the spectral-spatial interdependence can be specified in terms of average or maximum values for ${\epsilon}_{\lambda s}$, as well as by bandwise averages, according to Sec. 2.4.

## 5. Practical estimation of coregistration metrics

For the proposed metrics to be of practical interest, it must be possible to determine their values experimentally. Spatial resolution is customarily characterized in terms of MTF, but this does not enable a unique determination of the SPSF. In the literature on spectral imagers, characterization of coregistration has focused on offset-type errors, and in some cases measurement of peak FWHM [9,14].

The SPSF can be measured directly by scanning a subpixel source, although such a measurement is not entirely trivial. Different ways to measure the SPSF are described in Refs. [16–18]. An efficient technique for measuring spatial coregistration performance is suggested in Fig. 7. A back-illuminated reticle with a periodic pattern of relatively wide opaque bars is projected into the field of view of the imaging sensor, forming a set of knife edges. The reticle is scanned in subpixel steps through one bar period while a series of image data is recorded. Each sensor pixel then produces a step-like signal, which can be differentiated to form an estimate of the projected SPSF. The scan is repeated in a set of different directions. An estimate of the full SPSF can then be obtained by tomographic reconstruction [18,19]. With a broadband light source, the SPSF can be obtained for all bands. In principle, the technique can measure all sensor pixels, even for sensors with a two-dimensional pixel layout, as indicated in the figure. In practice, limitations of the collimator may dictate that sections of the sensor field of view must be measured separately.
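In one dimension, the knife-edge procedure amounts to differentiating the recorded step response. The following sketch simulates the scan for a single pixel with an assumed Gaussian line-spread profile (the profile shape and step size are illustrative) and recovers that profile from the step signal:

```python
import numpy as np

# Assumed line-spread profile (1D projection of the SPSF) of one pixel.
x = np.linspace(-3, 3, 121)                 # edge position, pixel units
lsf_true = np.exp(-0.5 * (x / 0.5) ** 2)
lsf_true /= lsf_true.sum()

# Knife-edge scan: at edge position t, the pixel sees all light at x < t,
# so the recorded signal is a step-like cumulative response.
step_signal = np.array([lsf_true[x < t].sum() for t in x])

# Differentiating the step signal recovers the line-spread profile.
lsf_est = np.gradient(step_signal, x)
lsf_est /= lsf_est.sum()

# Agreement with the true profile, up to discretization error.
err = np.abs(lsf_est - lsf_true).sum()
print(err)
```

Repeating this for several scan directions provides the projections needed for the tomographic reconstruction of the full two-dimensional SPSF.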

The accuracy of SPSF estimation will be limited by noise. However, it appears realistic to reduce the noise significantly by averaging. Since the SPSF is smooth, only a limited number of sampling points is needed to represent its shape. As an example, consider a case where the SPSF is resolved in 10×10 sampling points spanning 3×3 pixels. The reticle may need to make at least 20 steps to achieve good obscuration in the dark part of the scan for all pixels. Thus, the required number of reticle positions (angles and steps) is on the order of 200. This number of frames can usually be recorded in less than a minute, so there is ample opportunity for noise reduction by multi-frame averaging or repeated measurements. Therefore, a good signal-to-noise ratio should be achievable. Noise can possibly also be reduced by averaging over neighboring sensor pixels, which will tend to behave similarly. The effect of noise in the tails of the SPSF can be remedied by thresholding to select only values above the noise floor. Roughly speaking, this is acceptable as long as the discarded measurements represent a fraction of the SPSF that is smaller than the fraction of noise in the image data. Reference [19] discusses the effect of noise on the tomographic reconstruction.
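The thresholding step can be sketched as follows, with an assumed Gaussian SPSF resolved in 10×10 sampling points and an illustrative noise level; the fraction of the SPSF discarded by the threshold is then checked to be small:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed SPSF resolved in 10x10 sampling points, plus additive noise.
y, x = np.mgrid[-2:2:10j, -2:2:10j]
spsf = np.exp(-0.5 * (x**2 + y**2) / 0.5**2)
noise_rms = 0.01
noisy = spsf + rng.normal(0.0, noise_rms, spsf.shape)

# Threshold at a few times the noise rms; keep only values above it.
threshold = 3 * noise_rms
kept = np.where(noisy > threshold, noisy, 0.0)

# Fraction of the SPSF discarded by thresholding; acceptable as long as
# it is smaller than the fraction of noise tolerated in the image data.
discarded_fraction = 1.0 - kept.sum() / noisy.sum()
print(discarded_fraction)
```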

For characterization of spectral coregistration, the SRF is relatively easy to measure, using a monochromator-based test source [5]. Thus, it is clearly feasible to evaluate the spectral coregistration metric (11) and the corresponding aggregate measures characterizing a sensor. In both the spatial and spectral case, it is only necessary to characterize a representative set of the sensor pixels.
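Given two measured SRFs, a spectral coregistration error in the spirit of the metric (11) can then be computed as half the integrated absolute difference of the normalized response functions. A sketch with assumed Gaussian SRFs and a smile-type center shift (the wavelengths, bandwidth and shift are illustrative values):

```python
import numpy as np

def spectral_coreg_metric(g1, g2):
    """Half the integrated absolute difference between two normalized
    spectral response functions."""
    g1 = g1 / g1.sum()
    g2 = g2 / g2.sum()
    return 0.5 * np.abs(g1 - g2).sum()

# SRFs of two pixels in the same band: equal shape, smile-shifted center.
wl = np.linspace(990, 1010, 401)                       # wavelength, nm
g_center = np.exp(-0.5 * ((wl - 1000.0) / 2.0) ** 2)   # pixel at center
g_edge = np.exp(-0.5 * ((wl - 1000.5) / 2.0) ** 2)     # 0.5 nm smile shift

print(spectral_coreg_metric(g_center, g_edge))    # nonzero offset error
print(spectral_coreg_metric(g_center, g_center))  # identical SRFs: 0.0
```

Unlike a pure center-wavelength comparison, this form also captures differences in bandwidth or SRF shape between pixels.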

Regarding spectral-spatial interdependence errors within a single band and pixel, I can find no publications that discuss this type of error in any detail. To evaluate the proposed error metric (13), it is necessary to measure the full spectro-spatial responsivity distribution function ${F}_{i,p}(x,y,\lambda )$. By using a tunable narrowband light source, the setup of Fig. 7 is in principle capable of making such a measurement. This is not trivial, however, since it requires a tunable laser, with potential issues related to availability, stability and speckle noise.

Even without any measurements, it is possible to estimate values for the coregistration metrics from optical simulations. As outlined in [3], the responsivity distributions can be estimated by convolving the simulated optical PSF with other broadening factors such as the size of the photodetector element, the slit width and the scan movement. The actual coregistration performance tends to be degraded by manufacturing tolerances. Still, an estimate of coregistration performance based on optical simulation can be useful by giving a lower limit on the metric values. Simulations may be the best way to estimate the spectral-spatial coregistration error metric ${\epsilon}_{\lambda s}$.
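Such a simulation can be sketched as a chain of one-dimensional convolutions, broadening an assumed Gaussian optical PSF by rect kernels representing the detector element, the slit width and the scan movement (all widths below are illustrative):

```python
import numpy as np

dx = 0.01                                  # grid step, pixel units
x = np.arange(-3, 3, dx)

def rect(width):
    """Unit-area rectangular broadening kernel of the given width."""
    r = (np.abs(x) < width / 2).astype(float)
    return r / r.sum()

# Assumed optical PSF (Gaussian), broadened by the detector element,
# the slit width and the scan movement, each modeled as a rect kernel.
psf_optical = np.exp(-0.5 * (x / 0.2) ** 2)
psf_optical /= psf_optical.sum()

spsf = psf_optical
for width in (1.0, 0.8, 1.0):              # detector, slit, scan distance
    spsf = np.convolve(spsf, rect(width), mode="same")

print(spsf.sum())   # total responsivity is preserved (approximately)
```

The resulting SPSF estimate can then be fed into the coregistration metrics in the same way as a measured distribution.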

## 6. Discussion

Coregistration errors can lead to large errors in recorded spectral images, but the errors are scene dependent. In cases where scene objects are large and spectra are smooth relative to the sensor sampling interval, the signal errors tend to be small. (Thus, radiometric calibration accuracy, which is determined with spatially and spectrally uniform input, is essentially decoupled from coregistration error.) For small scene objects or rapidly changing spectra, coregistration errors can approach the maximum values represented by the metrics proposed here. It may be argued that such cases are rare, but on the other hand, it is these challenging cases that drive the specification of sensor resolution in the first place. Therefore, an upper-bound metric is an appropriate way to specify coregistration error even if the maximum error occurs very rarely. In fact, it seems possible that the effects of coregistration error on image processing can go unnoticed, since the signal error is largest in cases that are considered difficult anyway, such as mixed pixels or small targets.

Some classes of sensors, such as imaging Fourier transform spectrometers or filter wheel cameras, do not record all spectral components simultaneously. Then instabilities in the sensor pointing during recording, and possibly also parallax effects, will lead to coregistration errors. Furthermore, motion or other temporal variation in the scene will lead to similar image artifacts. On the other hand, sensor movement during recording will generally tend to reduce the signal error by motion blurring of the SPSF, at least for a stationary scene. Thus, in some cases the scanning, sensor pointing and scene properties must be taken into account, either in the reported metric value or in the estimation of signal error in an image.

Currently, spectral image sensors tend to be specified in terms of spatial and spectral sampling interval, combined with a specification of smile and keystone distortions. As discussed here, these parameters do not give complete information about coregistration. On the other hand, smile and keystone are easy to measure and widely understood. Still, in view of the potential for large signal errors, it appears that a more complete and well-defined measure of coregistration error should be used to specify sensors. Measurement of sensor performance according to the metrics discussed here is not standard practice at this time. However, such measurements should be entirely feasible, for example using the procedure outlined in Fig. 7. Even without measurement, metric values can be estimated from simulation of the optics. In view of the large signal errors that can result from coregistration error, and the potential for characterizing and estimating such errors using the metrics proposed here, it appears that more emphasis should be put on measurement of the SPSF and SRF of spectral imagers.

It can be noted that the spatial metric (6), and also the spectral-spatial metric (13), can be used to characterize the performance of whiskbroom-type hyperspectral sensors as well as non-imaging spectrometers such as those used for “ground truthing” in remote sensing. Also, it can be noted that the spectral metric (11) and the spectral-spatial metric (13) apply even to monochrome imaging.

For design of spectral imager optics, it is necessary to define a global merit function for system performance, in which coregistration errors will be an important part. It is then necessary to express coregistration performance as a single number. The metrics proposed here are proportional to the error; therefore, their values can be combined in a simple linear combination,

$$\epsilon ={a}_{s}{\epsilon}_{s}+{a}_{\lambda }{\epsilon}_{\lambda }+{a}_{\lambda s}{\epsilon}_{\lambda s},$$

where the weights *a* must be selected according to the expected amount of spectral and spatial contrast in the application. If spatial and spectral contrasts are both comparable to the dynamic range of the signal, as in the case of reflective-domain remote sensing, then the metrics may simply be given equal weight. Compared to the method outlined in [3], this represents a significant simplification, and probably an improvement.
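As a sketch, the combination reduces to a weighted sum of the three metrics (the function and weight names, and the example metric values, are illustrative):

```python
def coregistration_merit(eps_s, eps_l, eps_ls, a_s=1.0, a_l=1.0, a_ls=1.0):
    """Weighted linear combination of the spatial, spectral and
    spectral-spatial coregistration metrics for a design merit function."""
    return a_s * eps_s + a_l * eps_l + a_ls * eps_ls

# Equal weights, appropriate when spatial and spectral contrasts are both
# comparable to the dynamic range of the signal.
print(coregistration_merit(0.02, 0.01, 0.005))
```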

## 7. Conclusions

The current way of specifying coregistration error in hyperspectral sensors in terms of keystone and smile does not fully represent the coregistration performance. At the same time, coregistration errors can have a very detrimental effect on the image data and processing results. It is even possible for the effects of coregistration error on image processing to go unnoticed, since the signal error is largest in cases that are considered difficult.

This paper proposes metrics for all types of coregistration error in multi- and hyperspectral imaging: 1) spatial coregistration error between bands in the same image pixel (including keystone), 2) spectral coregistration error between different image pixels in the same band (including smile), as well as 3) interdependencies between the spectral and spatial response distributions within a single band and pixel. The metrics are given in Eqs. (6), (11) and (13), respectively. These metrics are independent of the image sensor technology, and represent an upper bound on the possible error in the recorded images. Based on the metric values for a given image sensor, the actual signal error in an image can be estimated. The coregistration metrics can be summed together for use as part of the merit function in optimization of optical designs. The spatial and spectral-spatial coregistration metrics ${\epsilon}_{s}$ and ${\epsilon}_{\lambda s}$ are also applicable to non-imaging spectrometers, such as those used for ground truthing in remote sensing. Values for the metrics can be obtained from measurements, and also from simulations of imaging optics. It is suggested that the proposed metrics should be adopted as a standard for reporting coregistration performance of spectral image sensors.

## Acknowledgments

I would like to thank Andrei Fridman of Norsk Elektro Optikk AS for interesting discussions around optics design that highlighted the need for a coregistration metric. Thanks are also due to Ingebjørg Kåsen of FFI and Peter Catrysse of Stanford, as well as the referees, for their constructive comments to the manuscript.

## References and links

**1. **P. Mouroulis, D. A. Thomas, T. G. Chrien, V. Duval, R. O. Green, J. J. Simmonds, and A. H. Vaughan, *Trade Studies in Multi/hyperspectral Imaging Systems—Final Report* (NASA Jet Propulsion Laboratory, 1998).

**2. **P. Mouroulis, “Spectral and spatial uniformity in pushbroom imaging spectrometers,” Proc. SPIE **3753**, 133–141 (1999). [CrossRef]

**3. **P. Mouroulis, R. O. Green, and T. G. Chrien, “Design of pushbroom imaging spectrometers for optimum recovery of spectroscopic and spatial information,” Appl. Opt. **39**(13), 2210–2220 (2000). [CrossRef] [PubMed]

**4. **R. O. Green, “Spectral calibration requirement for Earth-looking imaging spectrometers in the solar-reflected spectrum,” Appl. Opt. **37**(4), 683–690 (1998). [CrossRef] [PubMed]

**5. **P. Mouroulis and M. M. McKerns, “Pushbroom imaging spectrometer with high spectroscopic data fidelity: experimental demonstration,” Opt. Eng. **39**(3), 808–816 (2000). [CrossRef]

**6. **R. A. Neville, L. Sun, and K. Staenz, “Detection of spectral line curvature in imaging spectrometer data,” Proc. SPIE **5093**, 144–154 (2003). [CrossRef]

**7. **R. A. Neville, L. Sun, and K. Staenz, “Detection of keystone in imaging spectrometer data,” Proc. SPIE **5425**, 208–217 (2004). [CrossRef]

**8. **J. Zadnik, D. Guerin, R. Moss, A. Orbeta, R. Dixon, C. Simi, S. Dunbar, and A. Hill, “Calibration procedures and measurements for the COMPASS hyperspectral imager,” Proc. SPIE **5425**, 182–188 (2004). [CrossRef]

**9. **D. Schläpfer, J. Nieke, and K. I. Itten, “Spatial PSF Non-uniformity Effects In Airborne Pushbroom Imaging Spectrometry Data,” IEEE Trans. Geosci. Rem. Sens. **45**(2), 458–468 (2007). [CrossRef]

**10. **F. Dell’Endice, J. Nieke, D. Schläpfer, and K. I. Itten, “Scene-based method for spatial misregistration detection in hyperspectral imagery,” Appl. Opt. **46**(15), 2803–2816 (2007). [CrossRef] [PubMed]

**11. **P. Mouroulis and R. O. Green, “Spectral response evaluation and computation for pushbroom imaging spectrometers,” Proc. SPIE **6667**, 66670G (2007). [CrossRef]

**12. **J. T. Casey and J. P. Kerekes, “Misregistration impacts on hyperspectral target detection,” J. Appl. Remote Sens. **3**(1), 033513 (2009). [CrossRef]

**13. **T. Skauli, “Quantifying coregistration errors in spectral imaging,” Proc. SPIE **8158**, 81580A (2011). [CrossRef]

**14. **G. Lin, R. E. Wolfe, and M. Nishihama, “NPP VIIRS geometric performance status,” Proc. SPIE **8153**, 81531V (2011). [CrossRef]

**15. **T. Skauli, “Sensor noise informed representation of hyperspectral data, with benefits for image storage and processing,” Opt. Express **19**(14), 13031–13046 (2011). [CrossRef] [PubMed]

**16. **C. D. Claxton and R. C. Staunton, “Measurement of the point-spread function of a noisy imaging system,” J. Opt. Soc. Am. A **25**(1), 159–170 (2008). [CrossRef] [PubMed]

**17. **H. Du and K. J. Voss, “Effects of point-spread function on calibration and radiometric accuracy of CCD camera,” Appl. Opt. **43**(3), 665–670 (2004). [CrossRef] [PubMed]

**18. **S. Quabis, R. Dorn, M. Eberler, O. Glockl, and G. Leuchs, “The focus of light - theoretical calculation and experimental tomographic reconstruction,” Appl. Phys. B **72**, 109–113 (2001).

**19. **H. Hovland, “Tomographic scanning imager,” Opt. Express **17**(14), 11371–11387 (2009). [CrossRef] [PubMed]