## Abstract

A new binary classification method for Mueller matrix images is presented which optimizes the polarization state analyzer (PSA) and the polarization state generator (PSG) using a statistical divergence between pixel values in two regions of an image. This optimization generalizes to multiple PSA/PSG pairs so that the classification performance as a function of number of polarimetric measurements can be considered. Optimizing PSA/PSG pairs gives insight into which polarimetric measurements are most useful for the binary classification. For example, in scenes with strong diattenuation, retardance, or depolarization certain PSA/PSG pairs would make two regions in an image look very similar and other pairs would make the regions look very different. The method presented in this paper provides a quantitative method for ensuring the images acquired can be classified optimally.

© 2017 Optical Society of America

## 1. INTRODUCTION

Enhanced signal detection is desired in all scientific imaging applications. Polarimetry has been used to enhance signal detection in reconnaissance, atmospheric science, and medical imaging, to name a few. Early work on optimizing contrast in imaging polarimetry can be found in [1–3]. In [3] the figure of merit (FOM) is the Euclidean distance between the last three Stokes parameters scattered from two homogenous regions. In [1,2] a signal-to-noise metric is used for the FOM which incorporates a given noise model. These authors have extended the technique to several noise models [4]. Other extensions of this work include using a sample covariance of the Mueller matrix to account for spatial fluctuations in regions of the image [5]. Routines that first perform image segmentation and then polarimetric contrast optimization have been developed and implemented in active polarimeters to demonstrate real-time capabilities [6–8]. Recent work extends polarimetric contrast optimization to broadband imaging [9].

Other researchers have considered support vector machines [10], principal components analysis [11], and clustering [12] for Mueller matrix classification. Another popular option is a decomposition of the Mueller matrix into constituent polarimetric properties; classification is computed from these values rather than the original Mueller matrix values [13–15].

A new method to optimize the polarization state analyzer (PSA) and the polarization state generator (PSG) using a statistical divergence between the Mueller matrix measurements in two regions of an image is presented. Applying this optimization to a Mueller matrix image yields insight into which polarimetric measurements are most useful for distinguishing two selected regions in the image. This optimization generalizes to multiple measurements at different PSA/PSG pairs so that the classification performance as a function of number of polarimetric measurements can be considered. Our mathematical formalism also allows for the classification task to be defined as either the distinction between two Mueller matrices or between two populations of Mueller matrices. The latter approach can incorporate measurement noise and spatial, spectral, or angular variations of the Mueller matrix measurements into the classifier.

Unlike prior research, this work optimizes a statistical divergence [see Eq. (11)] between two distributions of measured intensity values using a gradient-based nonlinear search. The PSA and PSG parameterize a linear transformation of the Mueller matrix, the output of which is the measured intensity value. The same linear transform is applied to a Mueller matrix belonging to one of the two classes; these two classes generate two distributions. To use the output of the linear transform as a classifier the PSA/PSG pair that maximizes the divergence is selected. Multiple PSA/PSG pairs are readily included in the optimization since the divergence is scalar-valued even for multi-dimensional distributions. The two distributions of intensity values are assumed to be normal and selected pixels in the image are used as a training set from which to estimate first- and second-order statistics for each class. Selecting this training set is application-dependent and is not included in the mathematical formalism of this work. Similar to [8] a pre-processing segmentation algorithm could be applied to automatically select the training set from a given image.

This paper is organized as follows: Section 2 describes the problem, the known solutions in very simple cases, and summarizes the general solution that is the topic of this paper. Section 3 introduces the FOM to optimize PSA/PSG pair(s). Computational methods and demonstrations on five measurement examples are given in Sections 4 and 5, respectively. The gradient of the FOM is presented in Appendix A.

## 2. FORMULATION OF THE PROBLEM

For linear light–matter interactions the relationship between a noise-free scalar-valued intensity measurement and an object’s Mueller matrix is

Here $\mathbf{a}$ is the four Stokes parameters describing the PSA, $\mathbf{g}$ is the Stokes parameters of the PSG, $i$ is a noise-free pixel value, and $t$ denotes the transpose of a real-valued vector. Bold lower-case letters are vectors and bold upper-case letters are matrices. Mueller matrix imaging is of interest when the Mueller matrix of an object, or of a scene, is spatially varying. In this case a polarimetric imaging measurement at a given PSA/PSG state yields an intensity measurement that is also spatially varying. For scenes with strong diattenuation, retardance, or depolarization, certain PSA/PSG pairs would make two regions in an image look very similar and other pairs would make the regions look very different. In other words, certain PSA/PSG pairs separate the intensity histograms of the two classes. This is particularly useful when discriminating information is not present, or only present at very low contrast, in the intensity images.#### A. Analytic Solution for Simple Cases

In a simplified version of the binary classification problem the discrimination task is between two static (i.e., noise-free) Mueller matrices ${\mathbf{M}}_{1}$ and ${\mathbf{M}}_{2}$. In general the Mueller matrix value within each class will vary, for example, spatially, spectrally, or temporally. When the Mueller matrix is static within each class closed-form solutions for the optimal PSG may exist. The least optimal solution is a PSG state for which both ${\mathbf{M}}_{1}$ and ${\mathbf{M}}_{2}$ produce the same Stokes parameters: ${\mathbf{M}}_{1}{\mathbf{g}}_{\parallel}={\mathbf{M}}_{2}{\mathbf{g}}_{\parallel}$. The least optimal PSG solution is denoted ${\mathbf{g}}_{\parallel}$ since it produces identical Stokes parameters for both Mueller matrices. For full-rank Mueller matrices the least optimal PSG solution is

which is an eigenvector of ${\mathbf{M}}_{1}^{-1}{\mathbf{M}}_{2}$ with eigenvalue equal to one. The eigenspace of ${\mathbf{M}}_{1}^{-1}{\mathbf{M}}_{2}$, with an eigenvalue of one, may or may not contain physically realizable Stokes parameters. A least optimal PSG state ${\mathbf{g}}_{\parallel}$ is not guaranteed to exist for every Mueller matrix pair but this simple approach offers an algorithm to calculate least optimal solutions. Any PSA would yield identical intensity values in the two regions.A canonical solution for the optimal solution is a PSG state for which ${\mathbf{M}}_{1}$ and ${\mathbf{M}}_{2}$ produce Stokes parameters of orthogonal polarization states; denote this PSG state as ${\mathbf{g}}_{\perp}$. Stokes parameters of perpendicular polarization states are related by a diagonal reflection matrix $\mathbf{R}$ that is $[1,-1,-1,-1]$ on the diagonal. This matrix preserves the first Stokes parameter and flips the sign of the other three. The resulting expression is

The eigenspectrum of ${\mathbf{M}}_{1}^{-1}{\mathbf{RM}}_{2}$ may or may not contain physically realizable Stokes parameters with an eigenvalue of one. Again, an optimal PSG state ${\mathbf{g}}_{\perp}$ is not guaranteed to exist for every Mueller matrix pair but this approach offers a way to check the existence of optimal solutions. The optimal PSA would maximize transmission in one region and block transmission in the other region by being parallel to either ${\mathbf{M}}_{1}{\mathbf{g}}_{\perp}$ or ${\mathbf{M}}_{2}{\mathbf{g}}_{\perp}$.The limited applicability of these analytic solutions motivates our approach to optimize a FOM.

#### B. Our Solution for General Cases

As reported in [16], Eq. (1) can be written as an inner-product between two vectors: one involving the Mueller matrix elements and the other the PSA/PSG pair. Consider the $16\times 1$ vector

where $\otimes $ is the Kronecker product. If $\mathbf{A}$ is an $m\times n$ matrix and $\mathbf{B}$ is a $p\times q$ matrix, then the Kronecker product $\mathbf{A}\otimes \mathbf{B}$ is the $mp\times nq$ block matrix*a priori*models of $p{r}_{1}(\mathbf{m})$ and $p{r}_{2}(\mathbf{m})$ are not available and estimating these distributions would require a large library of measurements.

In this work, instead of using the full Mueller matrix as input to the classifier, the J-optimal channelized quadratic observers (J-CQO) algorithm uses an intensity measurement made at a given PSA/PSG or a series of intensity measurements made at various PSA/PSGs. The optimization seeks the PSA/PSG solution which yields the best discrimination between the two classes of Mueller matrices using the intensity measurement $i$ or measurements $\mathbf{i}$ as input to the classifier. Equation (6) is the foundation of this work because it expresses the measured intensity as a weighted sum of the object’s Mueller matrix elements. Solving for an optimal linear transform for detection tasks was the topic of our prior work on J-CQO [19]. In that work the expression $\mathbf{v}=\mathbf{Tx}$ was used to find the optimal channel matrix $\mathbf{T}$ for detection between two classes of images ${\mathbf{x}}_{1}$ and ${\mathbf{x}}_{2}$ using the statistics of $\mathbf{v}$. The distribution $pr(\mathbf{v})$ is assumed to be normal, which can be supported by the central limit theorem even for sums of correlated numbers as long as the correlations are not long-range [20]. The expression $i={\mathbf{d}}^{t}\mathbf{m}$ is the starting point to consider the optimal solution for $\mathbf{d}$ to discriminate between two classes of Mueller matrices: ${\mathbf{m}}_{1}$ and ${\mathbf{m}}_{2}$. In prior J-CQO work multiple linear combinations of the image data were formed, i.e., $\mathbf{v}$ was vector-valued. Therefore the J-CQO method can be readily extended to test whether an improvement in binary classification is possible by measuring the object at multiple PSA/PSG pairs thereby producing a vector-valued intensity measurement $\mathbf{i}$.

Instead of optimizing eight numbers (i.e., two sets of Stokes parameters) the optimization can be reduced to six numbers. Stokes parameters can be transformed to a total radiance $I$ and three coordinates on the Poincaré sphere: for example, $\mathbf{a}=[{I}_{a},{\rho}_{a}\text{\hspace{0.17em}}\mathrm{cos}(2{\psi}_{a})\mathrm{cos}(2{\chi}_{a}),{\rho}_{a}\text{\hspace{0.17em}}\mathrm{sin}(2{\psi}_{a})\mathrm{cos}(2{\chi}_{a}),{\rho}_{a}\text{\hspace{0.17em}}\mathrm{sin}(2{\chi}_{a})]$; similarly for $\mathbf{g}$. In the J-CQO optimization ${I}_{a}$ and ${I}_{g}$ will be set to unity since any increase in total radiance would always increase $J$ for an otherwise fixed PSA and PSG. Consider the Poincaré coordinates as elements of the vector $\mathit{\theta}=[\rho ,\psi ,\chi ]$ with constraints $0\le \rho \le 1$, $-\pi /2\le \psi \le \pi /2$, and $-\pi /4\le \chi \le \pi /4$. The work of [21] concluded that it is never beneficial to use partially polarized illumination or analysis for contrast optimization. Therefore one could further constrain $\rho $ to be zero or one instead of on this interval. However, [21] considered a quadratic FOM and J-CQO is only quadratic in special cases. For generality we have allowed partial polarization states in this optimization.

In Appendix A the gradient of $J$ with respect to the Poincaré coordinates of the PSA ${\mathit{\theta}}^{a}$ and the Poincaré coordinates of the PSG ${\mathit{\theta}}^{\mathbf{g}}$ are given in Eqs. (A5) and (A6).

## 3. JEFFREY’S DIVERGENCE AS A FIGURE OF MERIT

The AUC is the gold-standard FOM in binary classification but the merit function and gradient of the merit function are not closed-form. The J-CQO solution to this problem is to maximize a tractable FOM between the PDFs $p{r}_{1}(i)$ and $p{r}_{2}(i)$ that will also increase detection task performance. The relationship between the AUC and other FOMs is discussed in [19]. Denote the log-likelihood ratio of $p{r}_{1}(i)$ and $p{r}_{2}(i)$ as $\lambda (i)$. The Kullback–Liebler (KL) divergence is a popular FOM in information theory given by

where for any function of a random variable $f(\mathbf{x})$ the expectation of the random variable in each class is#### A. Multiple PSA/PSG Measurements

A series of measurements can be described by $\mathbf{D}$, a $16\times L$ matrix, where $L$ is the number of intensity measurements made and each row of $\mathbf{D}$ can be written as a Kronecker product between the $l$th PSA/PSG:

The value of $J$ between the L-dimensional normal distributions $p{r}_{1}(\mathbf{i})$ and $p{r}_{2}(\mathbf{i})$ is

## 4. METHODS

The closed-form expressions for $J$ and the gradient of $J$ can be used in a six-dimensional optimization of the Poincaré coordinates for the PSA/PSG pair(s). In this work the MATLAB function fmincon was used with a user-supplied gradient. The gradient of $J$ is given in Appendix A.

The evaluation of $J$ requires prior knowledge or estimation of the first- and second-order statistics of the intensity measurements, ${\mathbf{i}}_{n}$ and ${\mathbf{C}}_{n}$. The evaluation of the gradient of $J$ also requires prior knowledge or estimation of the first- and second-order statistics of the intensity measurements and in addition requires the $L\times 16$ cross-correlation matrix ${\mathbf{D}}^{t}{\mathbf{K}}_{n}$. Estimating these statistics is referred to as training. The pixels used for training could be selected from a conventional segmentation algorithm (perhaps operating on only the ${\mathbf{M}}_{00}$ element). Measurements with minimal differences in ${\mathbf{M}}_{00}$ have been selected for demonstration in Section 5. In this work the training pixels are selected by hand using prior knowledge of the boundaries between two regions.

Once the training pixels are selected and the PSA/PSG pair(s) can be applied to the Mueller Matrix image [see Eq. (1)] to form the output $\mathbf{i}$, a $L\times 1$ vector. This vector is assumed to be normal and used in a log-likelihood classifier [see Eq. (7)] to form a scalar-valued decision variable. Again, the first- and second-order statistics on $\mathbf{i}$ must be estimated. This can either be from the same training pixels as used in the optimization or new training pixels can be selected independently, for example, by applying conventional segmentation algorithms on $\mathbf{i}$ or ${\mathbf{M}}_{00}$. In practice, using different pixels to train the classifier and perform the PSA/PSG optimization is beneficial for modeling inter-class variability. All measured Mueller matrices are normalized by the associated ${\mathbf{M}}_{00}$ element to account for non-uniform illumination across the field of view.

## 5. RESULTS

The J-CQO PSA/PSG optimization method is demonstrated on 550 nm measurements from the Stepper Polarimeter in the College of Optical Science at the University of Arizona. Five examples are presented in approximate order of easiest to most difficult. In Figs. 1–5 the subplots are as follows: (a) is the measured Mueller matrix image normalized by the ${\mathbf{M}}_{00}$ element, (b) contains four tiles of (1) the log-likelihood ratio of the intensity values [given by Eq. (7)], (2) the unnormalized ${M}_{00}$ element, (3) the histogram of log-likelihood ratios, and (4) the PSA/PSG pair plotted on the Poincaré sphere with reported values of ${\mathit{\theta}}_{a}$ and ${\mathit{\theta}}_{\mathbf{g}}$. Visual comparison between the log-likelihood ratio of the intensity values in (b) and the ${M}_{00}$ element in (a) gives an impression of the additional value of polarimetry for the discrimination task. PSA/PSG pairs are displayed on the Poincaré sphere along with tabulate values of the six PSA/PSG Poincaré coordinates. A histogram of the values in (b) tile 1 is shown in (b) tile 3 to convey the detection performance possible by thresholding. The value of $J$ is labeled in (b) tile 3. Subplot (b) and subplot (c) will report results in the same format for the PSA/PSG pair randomly selected as the starting solution and the PSA/PSG pair given at the final iterate of J-CQO. For each measurement example the value of $J$ will be greater for the J-CQO PSA/PSG solution as compared to the starting solution. The histogram of log-likelihood ratios will be unimodal in subplot (b) and closer to bi-modal in subplot (c), which indicates discrimination between two regions as a result of the optimization.

Two linear orthogonal polarizers were measured to allow a comparison between the nonlinear optimization method of J-CQO and a simple case where the optimal PSA/PSG pair is known; Fig. 1(a) shows the Mueller matrix measurement. The left region is a linear polarizer with average polarizance of 0.98 and average orientation of $-4\xb0$ and the right region is a linear polarizer with average polarizance of 0.98 and average orientation of 87°. As expected the difference between the two polarizers is not evident in the ${\mathbf{M}}_{00}$ element. The predicted optimum is equal PSA and PSG states that align with the fast axis of one of the polarizers and block transmission from the other polarizer. Figure 1(b) reports results for a randomly selected PSA/PSG pair. Since the discrimination task is relatively easy the randomly selected PSA/PSG pair yields a large visual contrast difference. However, the value of $J$ is many orders of magnitude less as compared to the J-CQO solution in Fig. 1(c). The spatial fluctuations in the log-likelihood image are greater for the randomly selected as compared to the J-CQO PSA/PSG pair. Figure 1(c) shows that the computed optimal PSA/PSG states are close to purely linear and close to parallel. The J-CQO computed optimum is a very close match to the predicted PSA/PSG optimum.

Two pure retarders were measured and the Mueller matrix measurements are shown in Fig. 2(a); the difference between the two retarders is not evident in the ${\mathbf{M}}_{00}$ element. The left region has an average linear retardance of 163° and average orientation of 88°, and the right region has an average retardance of 152° and average orientation of $-3\xb0$. Figure 2(b) reports results for a randomly selected PSA/PSG pair for which there is no visual discrimination between the retarders. In Fig. 2(c) the optimal PSA/PSG pair are both close to left-hand circularly polarized. This state is blocked by one of the retarders, which can be seen as a narrow-peaked mode in the histogram of Fig. 2(c).

The next example is designed to investigate the use of polarimetric measurements for texture discrimination. Figure 3(a) shows the Mueller matrix measurements of 400 and 800 grit sandpapers side-by-side. Like the other examples, the difference in the regions is not evident in the ${\mathbf{M}}_{00}$ element or in the randomly selected PSA/PSG pair reported in Fig. 3(b). Unlike the prior two cases of purely polarized elements, there is not an expected optimal PSA/PSG pair. The J-CQO solution in Fig. 3(c) shows unequal PSA and PSG states that are purely polarized and with one near the left-hand circular pole of the Poincaré sphere and the other near a linear state. The value of $J$ has increased approximately 1 order of magnitude from the starting solution to the final iterate. Also the histogram and image of the log-likelihood ratio in Fig. 3(c) show discrimination between the two grits of sandpaper.

The final examples are designed as a sensitivity study to small polarimetric signatures. On top of a single piece of sandpaper two pieces of transparent tape are placed at 40° from one another. Figure 4(a) shows this Mueller matrix measurement and there is no intensity difference between the two pieces of tape in the ${\mathbf{M}}_{00}$ element. Figure 4(b) reports results for a randomly selected PSA/PSG pair, which also offers no discrimination between the two pieces of tape. The black triangle in the lower-right corner is the piece of the sandpaper that is not covered in tape. In Fig. 4(c) the optimal PSA/PSG pair are both purely polarized and in the same quadrant of the sphere but on opposite sides. The PSA is close to right-hand circular and the PSG is close to a linear state. The visual difference between the two pieces of tape is obvious in Fig. 4(c), and the region where the tapes overlap each other is also apparent. The value of $J$ is 0 at the starting solution and 24 at the final iterate.

To create a more difficult discrimination task, two pieces of transparent tape are placed at 8° from one another. Figure 5(a) shows this Mueller matrix measurement and there is no intensity difference between the two pieces of tape in the ${\mathbf{M}}_{00}$ element. Figure 5(b) reports results for a randomly selected PSA/PSG pair, which also offers no discrimination between the two pieces of tape and $J=0$. In Fig. 5(c) the optimal PSA/PSG pair are both purely polarized and the PSA is close to right-hand circular and the PSG is close to a linear state, similar to when the tape is at a larger angle in Fig. 4. The visual difference between the two pieces of tape is obvious in Fig. 5(c) but the histogram is not bi-modal and $J=3$. The discrimination is visually obvious because of the spatial correlation of the log-likelihood values in Fig. 5(c), even though both regions are noisy and have many overlapping values.

## 6. CONCLUSIONS

A new method to discriminate materials based on Mueller matrix measurements has been presented in this paper. The J-CQO optimization of PSA/PSG Poincaré coordinates has been demonstrated for discriminating purely polarizing elements, different textures, and material orientation. The binary classification of Mueller matrix image data is of interest in medicine, reconnaissance, and optical testing, to name a few applications. In these applications the discrimination involves material differences in diattenuation, retardance, and/or depolarization. Sometimes the spatial and spectral dependence of these polarimetric quantities will also be of interest. Numerous linear and nonlinear classifiers have been used to reduce the 16-dimensional Mueller matrix to a class assignment. Using J-CQO as a Mueller matrix classifier is advantageous compared to these techniques since optimizing PSA/PSG Poincaré coordinates offers a physical interpretation to rank the utility of polarimetric measurements. It is conceivable that J-CQO analysis could reveal a series of PSA/PSG measurements that would offer equivalent detection performance as compared to measuring the full Mueller matrix. This approach could reduce the number of polarimetric measurements while preserving detection performance in the aforementioned imaging applications.

## 7. FUTURE WORK

In practice, most polarimeters are not capable of generating any PSA/PSG pair on the Poincaré sphere. The J-CQO algorithm could be adapted to optimize the configurations of one or many optical elements in a polarimeter, for example, the fast axis orientation of a retarder or the orientation of a series of retarders. Furthermore, a series of measurements could be considered and each individual measurement could have a different optimal configuration. This type of analysis would be useful for designing polarimeters that are optimal for specific discrimination tasks.

The J-CQO final iterate was never a partially polarized state for the examples in this paper. A constraint to purely polarized or unpolarized states is a simplification that would reduce the dimensionality of the PSA/PSG optimization and might not compromise discrimination performance in many applications.

Using the gradient expression in the J-CQO optimization requires the full Mueller matrix measurement but the optimization could in principle be conducted without the full Mueller matrix measurement if a non-gradient-based search were used. The evaluation of $J$ requires prior knowledge or estimation of the first- and second-order statistics of the intensity measurements, ${\mathbf{i}}_{n}$ and ${\mathbf{C}}_{n}$ but not the Mueller matrices. The first- and second-order statistics of intensity can be calculated by measuring the object at a given PSA/PSG pair. Therefore a non-gradient-based search could allow PSA/PSG optimizations to be done in the physical domain from a series of measurements. The evaluation of the gradient of $J$ also requires prior knowledge or estimation of the first- and second-order statistics of the intensity measurements and in addition requires the $L\times 16$ cross-correlation matrix ${\mathbf{D}}^{t}{\mathbf{K}}_{n}$. This cross correlation requires knowledge of the Mueller matrices for both classes. The Mueller matrix could be estimated from the series of PSA/PSG measurements during the gradient-free search. Then the estimated gradient could be introduced into the search method as the number of iterates increases. However, in order to be beneficial the number of measurements would need to be fewer than the number required to calculate the Mueller matrix.

## APPENDIX A

Using results derived in [19], the gradient of $J$ [see Eq. (12)] with respect to $\mathbf{d}$ is a $1\times 16$ vector:

## Funding

National Science Foundation (NSF) (CHE-1313892).

## REFERENCES

**1. **F. Goudail, “Optimization of the contrast in active Stokes
images,” Opt. Lett. **34**, 121–123 (2009). [CrossRef]

**2. **F. Goudail and A. Bénière, “Optimization of the contrast in polarimetric
scalar images,” Opt. Lett. **34**, 1471–1473 (2009). [CrossRef]

**3. **M. Richert, X. Orlik, and A. De Martino, “Adapted polarization state contrast
image,” Opt. Express **17**, 14199–14210 (2009). [CrossRef]

**4. **F. Goudail, “Noise minimization and equalization for Stokes
polarimeters in the presence of signal-dependent Poisson shot noise,”
Opt. Lett. **34**, 647–649 (2009). [CrossRef]

**5. **G. Anna, F. Goudail, and D. Dolfi, “Polarimetric target detection in the presence of
spatially fluctuating Mueller matrices,” Opt. Lett. **36**, 4590–4592 (2011). [CrossRef]

**6. **G. Anna, F. Goudail, and D. Dolfi, “Optimal discrimination of multiple regions with
an active polarimetric imager,” Opt. Express **19**, 25367–25378 (2011). [CrossRef]

**7. **G. Anna, N. Bertaux, F. Galland, F. Goudail, and D. Dolfi, “Joint contrast optimization and object
segmentation in active polarimetric images,” Opt. Lett. **37**, 3321–3323 (2012). [CrossRef]

**8. **G. Anna, H. Sauer, F. Goudail, and D. Dolfi, “Fully tunable active polarization imager for
contrast enhancement and partial polarimetry,” Appl.
Opt. **51**, 5302–5309 (2012). [CrossRef]

**9. **M. Boffety, H. Hu, and F. Goudail, “Contrast optimization in broadband passive
polarimetric imaging,” Opt. Lett. **39**, 6759–6762 (2014). [CrossRef]

**10. **I. J. Vaughn, B. G. Hoover, and J. S. Tyo, “Classification using active
polarimetry,” Proc SPIE **8364**, 83640S (2012).

**11. **B. G. Hoover and J. S. Tyo, “Polarization components analysis for invariant
discrimination,” Appl. Opt. **46**, 8364–8373 (2007). [CrossRef]

**12. **J. Zallat, C. Collet, and Y. Takakura, “Clustering of polarization-encoded
images,” Appl. Opt. **43**, 283–292 (2004). [CrossRef]

**13. **S.-Y. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar
decomposition,” J. Opt. Soc. Am. A **13**, 1106–1113 (1996). [CrossRef]

**14. **F. Le Roy-Bréhonnet, B. Le Jeune, P. Elies, J. Cariou, and J. Lotrian, “Optical media and target characterization by
Mueller matrix decomposition,” J. Phys. D **29**, 34–38 (1996). [CrossRef]

**15. **J. Rehbinder, H. Haddad, S. Deby, B. Teig, A. Nazac, T. Novikova, A. Pierangelo, and F. Moreau, “Ex vivo Mueller polarimetric imaging of the
uterine cervix: a first statistical evaluation,” J. Biomed.
Opt. **21**, 071113 (2016). [CrossRef]

**16. **J. Zallat, S. Aïnouz, and M. P. Stoll, “Optimal configurations for imaging polarimeters:
impact of image noise and systematic errors,” J. Opt. A **8**, 807–814 (2006). [CrossRef]

**17. **T. M. Cover and J. A. Thomas, *Elements of Information Theory*
(Wiley, 1991).

**18. **H. Barrett and K. Myers, *Foundations of Image Science*
(Wiley, 2013).

**19. **M. K. Kupinski and E. Clarkson, “Method for optimizing channelized quadratic
observers for binary classification of large-dimensional image datasets,”
J. Opt. Soc. Am. A **32**, 549–565 (2015). [CrossRef]

**20. **A. DasGupta, *Asymptotic Theory of Statistics and Probability*
(Springer, 2008).

**21. **F. Goudail and J. S. Tyo, “When is polarimetric imaging preferable to
intensity imaging for target detection?” J. Opt. Soc. Am.
A **28**, 46–53 (2011). [CrossRef]