Polarization image demosaicing and RGB image enhancement for a color polarization sparse focal plane array

Ju Liu; Jin Duan; Jin Duan; Youfei Hao; Guangqiu Chen; Hao Zhang; Yue Zheng

doi:10.1364/OE.494836

1. Introduction

Polarization imaging is a special imaging modality different from spatial and spectral, allowing additional information to be obtained in complex environments. It is usually independent of the spectral content and amplitude of the electromagnetic signal. The advantages of polarization imaging in multidimensional information acquisition make it widely used in complex environment target detection [1], scene analysis [2], autonomous driving [3], shape estimation [4], reflection removal [5], etc [6–8].

In order to acquire polarization intensity information, multiple polarization imaging strategies have been proposed. Among them, the DoFP polarization imager becomes a widely used polarization sensor imaging structure since excellent imaging advantages. The DoFP sensor is designed to perform simultaneous polarization intensity measurements of dynamic scenes by overlaying a 2$\times$2 modulated linear polarizer on a focal plane array sensor, as shown in Fig. 1 (a). Although the DoFP sensor structure can effectively solve the time-sensitive problem of the division-of-time (DoT) polarimeters system, it causes a loss of spatial resolution. On the other hand, the "Bayer filter + polarization filter" structure makes the quality of RGB images degraded. Further, advanced polarization image applications [9,10] often require the simultaneous acquisition of high-quality visible and polarization information based on a single sensor. Therefore, it becomes an urgent need to enhance the quality of visible RGB image while ensuring the accuracy of polarization information. Most of the traditional polarization demosaicing methods only consider the acquisition of polarization information and the avoidance of noise amplification [11–15], while ignoring the quality degradation of RGB images caused by polarization filters.

Fig. 1. Polarization sensor structure and DoLP noise effect diagram. (a) A conventional color division of focal plane polarization sensor structure. (b) Our proposed sparse polarization sensor structure, the white area is the area of polarization pixels where the "polarization + white" filter is placed, and four polarization pixels with different angles are arranged in a 4$\times$4 area. (c) The first row of the original simulates a unpolarized scene of polarized image, where each camera component (I$_{0}$, I$_{45}$ ,I$_{90}$) is the same and computes the DoLP. The second line indicates that the Gaussian noise ($\sigma = 0.03$) is added to the camera component and the DoLP image is calculated again.

Download Full Size | PDF

Different from the above imaging methods, Kurita et al. [16] proposed a simulated sparse polarization sensor, The method simulates a sensor form with a sparse arrangement of polarization filters above the Bayer filter and a white filter placed in the polarization pixel region. This solves the problem that the polarization filter makes the sensor less sensitive and generates better RGB images and polarization information.

Which simulates a sensor form with a sparse arrangement of polarization filters above the Bayer filter and a white filter placed in the polarization pixel region, solving the problem that the polarization filters make the sensor less sensitive and generating better RGB images and polarization information.

We simulate a sparse polarization focal plane array structure as shown in Fig. 1 (b) to maximize the acquisition of the true polarization information under the polarization filter while ensuring RGB pixels. The structure is four polarization pixels with different angles arranged in a 4$\times$4 area, where the ratio of polarization pixels is $1/4$. To improve the sensitivity of the sensor to visible wavelengths, white filters are placed below the polarization pixel region. For this sparse polarization sensor structure, we explore a better demosaicing method to achieve better quality polarization information acquisition than traditional DoFP structure.

There are roughly three existing polarization demosaicing (PDM) methods based on DoFP sensors, including interpolation [17], sparse representation [18] and deep learning methods [19]. In particular, Yan et al. [20] proposed a DoFP image demosaicing method combined with polarized intensity ratio constraint. This method designs a specific polarization demosaicing cost function by borrowing the idea of guided filtering, and achieves excellent results in DoFP polarization imaging. For sparse polarization sensors, it is obvious that the existing methods have limitations to accurately implement the demosaicing task of polarization information while ensuring the acquisition of high quality RGB images under high sparsity conditions. The sparse polarization sensor structure can be considered to be formed by interspersing polarized pixels in the RGB image, which usually generate artifacts and noise at the edges of the interspersed pixels, so artifact removal is an essential task. Since S$_{1}$, S$_{2}$ and RGB images belong to two different modes, if we take advantage of the full resolution of RGB images and guide the sparsely polarized Stokes parameters S$_{1}$, S$_{2}$ for the prediction of missing pixels, it is very similar to the task of RGB guided depth completion. Therefore, we propose a recursive network structure Sparse-PDM that directly predicts missing pixels in S$_{1}$ and S$_{2}$ parameters of the sparse polarization Stokes vector using RGB images as a guide to achieve simultaneous enhancement of RGB images and polarization information.

Sparse-PDM has two advantages: 1) it avoids the use of traditional interpolation for different polarization direction intensity images and then the calculation of polarization parameters such as Stokes parameter, angle of polarization (AoP) and degree of linear polarization (DoLP). The directly predicting method S$_{1}$ and S$_{2}$ can avoid the amplification of noise when interpolating intensity images. 2) we introduce $L_1^{F\in \{\text {DoLP}, \text {AoP}\}}$ loss, which can further optimize the Stokes prediction network to prevent the inaccurate prediction of missing pixels and amplification of noise by the polarization prediction network. Since the computation of DoLP and AoP is a nonlinear operation, any slight Stokes vector noise introduced in the prediction network will be amplified in both DoLP and AoP images. The effect of noise is shown in Fig. 1 (c). We assume that each camera component is identical, simulating a polarized image of a unpolarized scene and calculating DoLP (i.e., DoLP $=0$), and then compute DoLP again for the camera component with a small amount of Gaussian noise(i.e., $\sigma = 0.03$) added as a comparison. As shown in the figure, when added noise is sufficiently weak, the visual difference between the original image and the noisy one is negligible. In contrast, the DoLP with added noise has DoLP $>0$ in the region of low intensity of the image, which indicates that even a negligible amount of noise can produce serious errors in the DoLP image.

The main contributions of our work are summarized as follows:

1) A polarization image demosaicing and high-quality RGB image acquisition method based on simulated polarization sparse focal plane array is proposed for generating high-resolution polarization images and high-quality RGB images, namely a multi-task convolutional neural network architecture Sparse-PDM.

2) To remove RGB image artifacts and compensate sparse Stokes parameters, we build a multi-modal objective function with joint DoLP and AoP information constraints to train our model.

3) To expand the training dataset, we captured 300 sets of polarization datasets from different indoor and outdoor scenes containing various materials such as metal, ceramic, wood, fabric, plastic, etc. Our Sparse-PDM dataset is available at https://github.com/JuLiu23/Polar-dataset.

2. Related works

2.1 Sparse polarization sensor raw image acquisition

The acquisition method of sparse polarization source images is introduced in detail here. As shown in Fig. 2, the input is the image acquired by the DoFP polarization sensor or four full-resolution images acquired by rotating the polarizer. If it is a DoFP image, it passes through the traditional demosaicing process. For full-resolution images, averaging and graying are performed, which leads to RGB images and intensity maps in four polarization directions. Subsequently, the final sparse polarization sensor raw image is obtained by adding sensitivity difference to the polarization intensity map, and passing through the selection module. The sparse Stokes parameters (e.g., S$_{0}$, S$_{1}$, S$_{2}$) used in the latter section is obtained by performing the conventional demosaicing operation only for polarized pixels in the whole image [15]. Since Sparse-PDM is supervised learning, the ground truth images used for training in this paper can be obtained by Eq. (1) and Eq. (2).

(1)$$\begin{aligned} \left\{\begin{array}{l} S_0=\frac{1}{2}\left(I_0+I_{45}+I_{90}+I_{135}\right) \\ S_1=I_0-I_{90} \\ S_2=I_{45}-I_{135} \end{array}\right. \end{aligned}$$

(2)$$\begin{aligned} \left\{\begin{array}{l} \mathrm{DoLP}=\frac{\sqrt{S_1^2+S_2^2}}{S_0} \\ \mathrm{AoP}=\frac{1}{2} \tan ^{{-}1}\left(\frac{S_2}{S_1}\right) \end{array}\right. \end{aligned}$$

where $I_{0}$, $I_{45}$, $I_{90}$ and $I_{135}$ are captured intensity images at four polarization orientations, respectively. Since circular polarization is rarely observed in passive remote sensing applications, the $S_3$ parameter is generally neglected in the linear Stokes parameters operation. Therefore, the linear Stokes parameters are simplified to $S_0$, $S_1$ and $S_2$, where $S_0$ is the total light intensity, while $S_1$ and $S_2$ describe the differences of the two orthogonal directional linear polarized light intensity components, respectively. Finally, DoLP and AoP can be solved with Stokes parameters. The transmittance $t$ of the polarizer simulated in this paper is set as 0.7.

Fig. 2. Details of the generation process of sparse polarization raw image from DoFP sensors or polarizer rotation. The dotted line branch is the source image acquired by the DoFP polarization camera, and the solid branch is the source image acquired by the ordinary RGB camera with a rotating polarizer placed on the surface. The process of acquiring sparsely polarized raw image after preprocessing is the same. Firstly, the RGB values of the four polarization directions are averaged to obtain the unpolarized RGB images. Then, the four single-channel polarization direction images are multiplied by the sensitivity difference gain. Finally, the noise processing is performed according to the noise model of the sensor. The transmittance $t$ of the polarizer simulated in this work is set as 0.7.

Download Full Size | PDF

2.2 Depth completion and Image Enhancement Method

Many related types of research on denoising [21,22] or demosaicing [23] for enhancing the quality of polarized images have been proposed, but there are few studies on methods to acquire excellent RGB and polarization information in parallel. To improve the resolution, sparse polarization information needs to be fully compensated with a RGB guided image. Similar problems have been encountered in tasks such as super-resolution of hyperspectral images, depth image completion, and up-sampling. However, existing hyperspectral image completion [24] tends to be strictly dependent on its physical properties, leading to the limitations in other tasks. Although depth completion [25,26] or up-sampling [27] is quite similar to the task of compensating for low resolution at high resolution, these methods are so targeted that they introduce significant noise and artifacts when applied directly to polarization information completion. Qiu et al. [28] proposed a method to generate dense depth images guided by RGB images, using the estimated surface normal as the intermediate representation. Besides, the relationship between surface normal and depth has been investigated in related literature [29]. However, it is unfortunate that there is currently no empirical evidence to support the relationship between the Stokes parameters and surface normal, as well as to suggest whether estimating surface normal is superior to estimating S$_1$ and S$_2$. Therefore, we propose a novel network for sparse polarization pixel completion and high-quality RGB image acquisition by simulating sparse polarization sensor imaging.

3. Method

3.1 Overview

For sparse polarization sensors, the ideal polarization image demosaicing algorithm should accurately demodulate the polarization image (i.e., DoLP, AoP) while enhancing the RGB intensity image. However, both DoLP and AoP are composed of the ratio operation of Stokes parameters. To obtained an accurate and high quality polarization image, it is necessary to predict accurate and distortionless Stokes parameter images and avoid introducing weak noise as much as possible. In this paper, we set aside the traditional idea of interpolation of intensity images in different directions and directly use a deep learning network to compensate for the missing pixels in the Stokes parameter images. In this way, the DoLP and AoP images can be calculated without going through the step of first interpolating and then solving, that is, by directly mapping the Stokes parameters, avoiding the large noise introduced by interpolating the four intensity images.

To accomplish the above objectives, we construct a Sparse-PDM, a multi-task convolutional neural network architecture for sparse polarization sensors, as shown in Fig. 3. The overall network consists of two parts: the first part is a RGB image de-artifacting enhancement network, and the second part is a Stokes parameter prediction network guided by the enhanced RGB images based on the output of the first part. The specific structures of the two parts of the network are presented in Sec. 3.2. First, the raw image of the sparse polarization sensor is used as input, and the RGB image and the sparse quadrupolarization directional intensity image are obtained after data decomposition. The sparse Stokes parameters are calculated from the sparse four-directional intensity images by Eq. (1). Then, the RGB image is fed into the RGBDAN network to correct for demosaicing artifacts in the RGB image. Finally, we use the de-artifacts RGB image as a guide to performing missing pixel prediction for the sparse Stokes parameter of the PPN network input to obtain the high-resolution Stokes parameters and thus the final polarization information (i.e., DoLP and AoP). In order to obtain the high-resolution S$_0$, we perform grayscale processing on the de-artifacted RGB image and calculate gain-absorption sensitivity difference of S$_0$ parameter.

Fig. 3. The proposed stokes parameter generation network architecture. The input of the network is a sparse polarization raw image and the output is the high quality RGB image and DoLP, AoP image. The RGBDAN branch processes the edge artifacts between RGB pixels and polarization pixels, and feeds the processed high quality RGB image to the PPN branch as its full resolution guide image for the polarization information completion task. The RGBDAN branch consists mainly of four residual dense blocks. The PPN branch mainly consists of a stacked hourglass network.

Download Full Size | PDF

3.2 Network architecture

3.2.1 RGB de-artifact network

The sparse polarization sensor structure chosen for imaging in this paper is equivalent to the polarization pixels interspersed between the RGB pixels. Therefore, after the RGB image acquisition shown in Sec. 2.1, the significant artifacts generated at the edges of RGB pixels and polarization pixels make the subsequent Stokes vector prediction lack a high-quality RGB image guide. Ultimately, it is difficult to obtain a high-quality S$_0$ image. Inspired by the literature [30], an RGB image de-artifact enhancement network is designed in this paper to achieve the prediction of Stokes parameter images S$_1$ and S$_2$ (i.e., S$_1,_2$) by more accurate guidance. As shown in Fig. 3, the first branch of Sparse-PDM is the RGB de-artifact network (RGBDAN), where the input is the demosaiced RGB image of the raw image, as well as the sparse polarization components S$_1,_2$ and the masked image M. The overall network structure mainly consists of four residual dense blocks (RDB). To prevent information loss, we integrate the outputs of the four RDB modules into the connection layer, and then sum the features extracted from the convolution layer with the feature maps extracted from the RDB groups. Among them, the purpose of the input sparse S$_0$, S$_1$, S$_2$ (i.e., S$_0,_1,_2$) and the mask M is to determine the region of artifacts and to complement the pixels in the unpolarized region of the RGB image, so as to achieve the refinement of the RGB image. However, our ultimate goal is not to obtain the RGB image after the first branch enhancement, but to add the RGB image to the polarization information prediction network (PPN) and predict the missing polarization pixels by RGB image bootstrapping.

RDB module: Our four RDB modules (Fig. 4) all use the same convolutional structure, where each RDB module consists of four convolutional layers, the first three layers being the same 3$\times$3 size convolution with an activation function connected to each layer, and the fourth layer being a 1$\times$1 size convolution with no activation function connected to it. The input of each of the first 3 layers is connected to its output and used together as the input of the next layer, and the input of the fourth layer is the set of the output feature maps of the first 3 layers. The ultimate output of RDB is the sum of the output of the fourth convolution layer and the initial input.

Fig. 4. Residual dense block (RDB) architecture. The module consists of four convolution layers. The (d-1) RDB and the (d+1) RDB denote the connection of the RDB module with the previous layer of convolution and the next layer of convolution, respectively.

Download Full Size | PDF

3.2.2 Polarization prediction network

As mentioned before, we only perform polarization missing pixel prediction for the Stokes parameters S$_1,_2$, and the detailed prediction network is shown in Fig. 5. The inputs are the de-artifacted enhanced RGB images and the sparse Stokes parameters S$_1,_2$. The outputs are the final predicted dense S$_1, _2$ maps. The overall framework consists of two repetitive hourglass network (RHN) structures, where RGB images are first fed into the RGB Image guidance branch and into RHN$_1$ via a 5$\times$5 convolutional encoder, followed by repetition of this similar but lightweight unit, consisting of two convolutions per layer. The first stage goes through two repetitive hourglass networks to extract clear RGB image features in complex scenes, providing a clearer guide to the coarse dense prediction of the Stokes parameters in the 1st stage. Next, the 2nd stage is the Stokes parameters generation branch, which takes the coarsely dense Stokes parameters from the first stage and the sparse S$_1,_2$ as inputs to generate the more finely dense Stokes parameters S$_1,_2$ and use them as the final output.

Fig. 5. Polarization information prediction network. It consists of a high-resolution RGB image guidance branch and a Stokes parameter generation branch, where the repetitive guidance (RG) module is used to refine the Stokes parameter.

Download Full Size | PDF

It is known that S$_1,_2$ are the results of subtraction operation of two polarization orthogonal direction intensity images respectively, which makes the structure of S$_1,_2$ image more unclear. The quality of the RGB image is also degraded due to the weakening effect of the polarization filter on the photon transmission rate. For these reasons, the performance of the network using RGB images to guide the prediction of Stokes parametrization is greatly degraded. Inspired by the literature [31], a repetitive hourglass network structure (Rignet) is chosen in this paper for missing pixel prediction of S$_1,_2$, and the main advantage of the network is the use of a repetitive design with multiple hourglass cells RHN$_i$ similar to the symmetric structure of Unet for gradual and adequate S$_1,_2$ recovery. This network form can avoid the blurred guidance of RGB images and the problem of false prediction due to the unclear structure of S$_1,_2$. In addition, for the use of a repetitive guidance module [31], it is well suited to solve the problem of unclear structure of sparse Stokes parameter images and can strengthen the prediction of edge information of dense S$_1,_2$. For the polarized image missing pixel densification prediction problem, we supplement the Rignet network. We extend the number of input channels to two, and taking into account the possible negative values of S$_1$ and S$_2$, LReLU [32] was used as the activation function to allow for negative results.

We also considered the difference between single and complex scenes and added an attentional feature extraction module, the Channel Attention Block (CAB) [33], between the encoder and decoder, which considers the visual and global features of the Scenario. By increasing the weight of important channel features on the encoder side, and then adding the encoder-weighted features to the decoder-side features, RGB image guidance branch achieves a more flexible feature representation.

In the RHN$_i$ encoder, $\mathrm {E}_{\mathrm {{i j}}}$ takes $\mathrm {E}_{\mathrm {{i}}(j-1)}$ and $\mathrm {D}_{(\mathrm {{i}}-1) \mathrm {{j}}}$ as inputs. When $i>1$, its operation process is as follows:

(3)$$\begin{aligned} E_{i j} & =\left\{\begin{array}{cl} \operatorname{Conv}\left(D_{(i-1) j}\right), & j=1, \\ \operatorname{Conv}\left(E_{i(j-1)}\right)+D_{(i-1) j}, & 1<j \leq 5, \end{array}\right. \\ D_{i j} & =\left\{\begin{array}{cc} \operatorname{Conv}\left(E_{i 5}\right), & j=5, \\ \operatorname{Deconv}\left(D_{i(j+1)}\right)+E_{i j}, & 1 \leq j<5, \end{array}\right. \end{aligned}$$

where Deconv $(\cdot )$ represents the deconvolution operation, and $\mathrm {E}_{1 \mathrm {{j}}}=\operatorname {Conv}\left (\mathrm {E}_{1(\mathrm {{j}}-1)}\right )$.

3.3 Loss function

Since the prediction of S$_1,_2$ is considered to be more challenging at the farthest point, we supervise S$_1,_2$ using the $\ell _2$ norm. For RGB network and polarized information similarity supervised loss, we choose $\ell _1$ norm. In addition, we supervised the intermediate outputs $\hat {\mathbf {S}}^{1 \text { st }}, \hat {\mathbf {S}}^{2 \text { st }}$ and $\hat {\mathbf {S}}^{3 \mathrm {nd}}$ learned by the hourglass network. The overall loss contains three parts: intensity loss, Stokes loss, and polarization information similarity loss, among which Stokes loss is dominant. To ensure the validity of network training, we introduce DoLP and AoP constraint terms with different weights as auxiliary.

1) Strength loss is defined as:

(4)$$\mathcal{L}_{\mathrm{G}}(\hat{\mathbf{G}})=\left\|\hat{\mathbf{G}}-\mathbf{G}^{\mathrm{gt}}\right\|_1$$

2) Stokes loss is defined as:

(5)$$\begin{aligned} & \mathcal{L}_{\mathrm{S}}(\hat{\mathbf{S}})=\left\|\hat{\mathbf{S}}_{1,2}-\mathbf{S}_{1,2}^{\mathrm{gt}}\right\|_2 \\ \end{aligned}$$

(6)$$\begin{aligned} & \mathcal{L}_{\text{Stokes}}=\!\mathcal{L}_{\mathrm{S}}(\hat{\mathbf{S}})+\!\lambda_1\left\{\mathcal{L}_S\left(\hat{\mathbf{S}}^{1 \mathrm{st}}\right)+\!\mathcal{L}_{\mathrm{S}}\left(\hat{\mathbf{S}}^{2 \mathrm{st}}\right)+\!\mathcal{L}_S\left(\hat{\mathbf{S}}^{3 \mathrm{nd}}\right)\right\} \end{aligned}$$

where $\lambda _1$ is a hyperparameter that decreases with the number of epochs, and we set the initial value $\lambda _1 = 0.2$, $\mathbf {S}_{1,2}^{\text {gt }}, \mathbf {G}^{\text {gt }}$ are the ground truth.

3) The loss of similarity of polarization information is defined as follows:

(7)$$L_1^F=\lambda_F \frac{1}{M \times N} \sum_{i=1}^M \sum_{j=1}^N\left|F^*(i, j)-F(i, j)\right|$$

where $F^*$ and $F$ denote the predicted feature maps and the corresponding ground truth, respectively, $F \in \{\text {DoLP}, \text {AoP}\}$. $\lambda _F$ is an empirically set hyperparameter scaling factor for balance.

In addition, the auxiliary loss function used to assess the polarization information is designed as follows:

(8)$$L_F=\sum_F \omega_F L_1^F$$

where $\omega _F=L_1^F / \sum _F L_1^F$.

In summary, the overall loss function is denoted as follows:

(9)$$f=\arg \min \left[\left(\lambda_2 L_{\text{Stokes }}+\left(1-\lambda_2\right) L_F\right)+\mathcal{L}_G(\hat{\mathbf{G}})\right]$$

where $\lambda _2$ is an empirically adjusted hyperparameter, which is empirically set to 0.7.

4. Experimental

4.1 Experimental configurations

4.1.1 Dataset

The Sparse-PDM dataset consists of 200 sets of focal plane polarization images by the cameras LUCID PHX050S-QC and 100 sets of full resolution polarization images using the " ordinary color camera MER-502-79U3M/C + rotating polarizer". Each group contains four pairs of light intensity images in different directions. In addition, to expand the data volume, we also choose the open datasets [2,13,22,34]. We classify four datasets, where [13,22] are single scene datasets, [2,34] are complex outdoor scenes. We verify the superiority of Sparse-PDM in this paper with both simple and complex scenes. For the acquisition of sparse polarization images we have described in Sec. 2.1.

4.1.2 Training details

To evaluate our method, we use 528 images for training and 80 images for validation. The ground truth (GT) images used for evaluation are provided by the dataset. The network architecture is implemented in PyTorch on a PC equipped with an NVIDIA RTX A5000 GPU. We perform 40 epochs of network training with a batch size of 5 and an initial learning rate of 0.0007. In addition, the Adam optimizer is chosen since it is well suited for optimization problem with sparse gradient or highly noisy gradient.

4.2 Experimental results

4.2.1 Comparison with other demosaicing methods

We first qualitatively analyzed the performance of Sparse-PDM on the artifact removal enhancement of RGB images and the demosaicing performance of polarized images, measured in terms of the overall image content, color and contrast. As shown in Fig. 6, the quantitative and qualitative evaluation of the RGB image enhancement effect shows a significant improvement in image quality, with a maximum quantitative improvement of 18.24 dB in PSNR compared to conventional polarized DoFP imaging.

Fig. 6. DoFP polarization sensor, sparse polarization sensor, and our Sparse-PDM RGB image output. Combining the sparse polarization sensor with the de-artifacting enhancement method results in a more excellent RGB image. DoFP polarization sensor output (top), sparse polarization sensor output (middle) and our sparse-polarization sensor combined with the de-artifacting method (bottom). RGB quantitative evaluation index (PSNR) (the larger the better).

Download Full Size | PDF

Figure 7 shows the qualitative analysis results of ${S}_{1}$, ${S}_{2}$, DoLP and AoP, where ${S}_{1}$ and ${S}_{2}$ are the intermediate products of Sparse-PDM, and DoLP and AoP are the final outputs of the polarization imaging model (Sparse-PDM) in this paper. In addition, we also compare it with the respective ground truth, and it can be seen that our model is closest to the ground truth. It should be noted that since the ForkNet structure is not designed for ${S}_{1}$ and ${S}_{2}$ outputs, but the literature has mentioned the applicability of the network for outputs S1 and S2 in the paper. Therefore, we trained and tested our dataset on ForkNet without changing its network structure and any parameters. ${S}_{1}$ and ${S}_{2}$ were output and compared qualitatively and quantitatively with other methods. As the figures show, the qualitative results of ${S}_{1}$ and ${S}_{2}$ are in full agreement with DoLP and AoP. Sparse-PDM has better results in both indoor and outdoor scenes containing different materials. Among them, the biggest advantage is that Sparse-PDM suppresses the introduction of obvious noise in DoLP and AoP images while ensuring the highlighting of edge information in polarized images.

Fig. 7. Results of polarization imaging based on sparse polarization sensor structure. Indoor scene on the left. Outdoor scene on the right. Detail is an enlargement of the details of DoLP and AoP in the green box area. It is clear that our Sparse-PDM results are very close to the ground truth, and the generated DoLP and AoP have excellent noise suppression effect.

Download Full Size | PDF

Qualitative comparison: In the following, we compare Sparse-PDM with other PDM models, where Bicubic [35] and Newton’s interpolation [12] are traditional interpolation methods and ForkNet [15] is a deep learning method for end-to-end mapping. To facilitate comparison, we convert several methods to the same color space. It can be seen that although Bicubic can better maintain the edge and gradient information of DoLP and AoP, the noise introduced is very serious, especially the quality of AoP is obviously degraded. Newton’s interpolation is enhanced at the edges of DoLP and AoP, but the generated images produce certain distortion, such as large color deviation, as well as obvious artifact and gridding effect. In comparison, ForkNet has a better representation in suppressing noise, but the AoP images have obvious distortion and cannot keep the gradient information well, and the image contrast is low. Compared with these models, the results of our Sparse-PDM have lower noise, increased contrast, and superior color distribution. It can keep the advantageous information such as the edge and gradient of the polarization image very well.

Quantitative comparison: We select 30 sets of images in each of the self-built and public datasets for testing, and choose four quantitative indexes to judge the Sparse-PDM performance, including the structural similarity (SSIM) [36], peak signal-to-noise ratio (PSNR) [36], patch-based contrast quality index (PCQI) [37] and angle "Error" [22], which measures the angular error of polarization angle images. Among them, PSNR indicates the degree of distortion of the demosaicing image, and higher PSNR means more valuable image information and less noise in the generated image. The higher PCQI value indicates better contrast of the image. SSIM mainly measures the brightness, contrast and structure fidelity of the polarized image. Angle error indicates the generation error of the AoP image, and smaller angle error represents better performance of the model. The average indexes of the test results are presented in Table 1, and it is clear that Sparse-PDM achieves the optimum in several indexes, and the quantitative results of ${S}_{1}$ and ${S}_{2}$ also match perfectly with the results of DoLP and AoP. This also proves its advantages in signal fidelity, noise suppression and high contrast image generation.

Table 1. Quantitative comparison of the average metrics of several typical demosaicing methods on the test dataset. 3pt

View Table | View all tables in this article

4.2.2 Comparison with depth completion networks

Since this paper is inspired by the idea of depth completion, we also try to compare two typical depth completion methods Cspn++ [26] and PENet [25]. We replace the PPN module of Sparse-PDM in this paper with each of these two methods, keeping other modules and parameters unchanged, to evaluate the advantage of our PPN module for sparse polarization parameter completion. The qualitative comparison results are presented in Fig. 8, which shows that our method is able to generate higher-quality polarization images. From the detail part, it can be seen that PPN is able to generate the edge and gradient information of the building glass window region with high-quality and without distortion, and the noise of AoP and DoLP images is as trifling as it is.

Fig. 8. Results of polarization image generation comparison of several typical sparse image depth completion methods. These are the results obtained after replacing our PPN with Cspn++ or PENet in the proposed Sparse-PDM network architecture, where the top is S$_0,_1,_2$, the middle is DoLP and AoP, the bottom is Detail.

Download Full Size | PDF

In comparison, the Cspn++ method does not compensate well for the polarization information in the sparse part, and there are gridding artifacts. Furthermore, it is obvious that the generated polarization information is distorted. PENet has relatively better results, and its can ensure that the polarization information is not distorted, but the method over-smoothes the gradient information of the polarization image, resulting in the loss of contours and lines in the glass window area of the building. The average results of the quantitative comparisons are shown in Table 2, and we also performed quantitative analysis of the Stokes parameters S$_0$ and S$_1,_2$. The results show that our quantitative metrics are consistent with the qualitative results, and it can be seen that Sparse-PDM achieves the best results for all metrics except the PCQI of DoLP. Since PENet has some distortion in polarization information, its PCQI reaches maximum value, which resulted in a relative enhancement of the contrast. However, the method aims at denoising and does not take into account the losses of polarization gradient information in the prediction process. Overall, our PPN has the best performance in compensating the polarization information.

Table 2. Quantitative comparison of the results of replacing the PPN module of the Sparse-PDM architecture with a depth completion approach.

View Table | View all tables in this article

4.2.3 Ablation experiments

We also performed ablation experiments to verify the function of each component in Sparse-PDM. Firstly, we remove the RGB de-artifact network from Sparse-PDM, as shown in Fig. 9. It can be seen that the completion S$_1,_2$ parametric image is quite frustrated when raw RGB image is used as the guide for the PCN module, and the generated S$_0,_1,_2$, AoP and DoLP images have severe artifacts. Moreover, we also verified the guide performance of RGBDAN with only one hourglass network. It is observed that the guide performance of RGB images is significantly degraded compared with the dual network structure. Although the smoothness of the polarization image is guaranteed to some extent and the noise is removed, the method severely loses the edge contour and gradient information of the AoP and DoLP images. Finally, we verified the effect of the $L_1^F$ loss function components. If the Stokes parameters are known, the ground truth DoLP and AoP can be computed, but if the AoP and DoLP images are known, it is not possible to directly calculate each parameter of the Stokes parameters linearly. So theoretically, it is reasonable to introduce $L_1^F$ loss in the sparse-PDM architecture. We experimentally verify the effect of this loss function. The results show that the generated image without this constraint term introduces a certain amount of noise and a certain loss in the edge part. So it can be shown that the choice of this constraint term is reasonable.

Fig. 9. Results of ablation experiment. W/O RGBDAN indicates without high-quality RGB guidance, W/O RHN$_2$ indicates that the RGB image guidance branch of the PPN network contains only an hourglass structure, and W/O $L_1^F$ Loss indicates without $L_1^F$ Loss is introduced, and the ablation experiments demonstrate the effect of polarization imaging in the above three cases. The indispensable role of each part in Sparse-PDM is indirectly argued.

Download Full Size | PDF

The quantitative metrics are shown in Table 3. From the metrics, the Sparse-PDM model that includes each component is optimal. We conclude that the $L_1^F$ loss is critical for improving the PSNR and Angle Error metrics of AoP and DoLP images. In addition, the repetitive hourglass network structure is more capable of feature extraction for RGB-guided images and produces more accurate polarization information.

Table 3. Quantitative comparative results of ablation study.

View Table | View all tables in this article

5. Conclusion

In this paper, we propose a polarization image demosaicing and excellent RGB image generation model, namely Sparse-PDM, for polarization sparse focal plane array arrangement. Specifically, the model consists of two progressive networks of de-artifact and polarization information completion, which is based on a residual dense network and stacked hourglass network as the backbone. A direct completion of Stokes vector information is constructed to achieve the demosaicing task of polarized images. In addition, a self-constructed polarization Sparse-PDM dataset containing metallic and dielectric materials for different scenes indoors and outdoors is also presented. Qualitative and quantitative outcomes demonstrate that our Sparse-PDM is superior to alternative models in terms of high contrast and high fidelity generation of polarization information as well as excellent RGB image acquisition. However, our model is more complex, and in future work, we plan to design more lightweight models to facilitate hardware implementation.

Funding

National Natural Science Foundation of China (62127813); Science and Technology Development Program of Jilin Province (20210203181SF).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. A. Kalra, V. Taamazyan, S. K. Rao, K. Venkataraman, R. Raskar, and A. Kadambi, “Deep polarization cues for transparent object segmentation,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. (CVPR), (2020), pp. 8602–8611.

2. T. Ono, Y. Kondo, L. Sun, T. Kurita, and Y. Moriuchi, “Degree-of-linear-polarization-based color constancy,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. (CVPR), (2022), pp. 19740–19749.

3. K. Xiang, K. Yang, and K. Wang, “Polarization-driven semantic segmentation via efficient attention-bridged fusion,” Opt. Express 29(4), 4802–4820 (2021). [CrossRef]

4. Y. Ba, A. Gilbert, F. Wang, J. Yang, R. Chen, Y. Wang, L. Yan, B. Shi, and A. Kadambi, “Deep shape from polarization,” in Proc. Eur. Conf. Comput. Vis. (ECCV), (Springer, 2020), pp. 554–571.

5. C. Lei, X. Huang, M. Zhang, Q. Yan, W. Sun, and Q. Chen, “Polarized reflection removal with perfect alignment in the wild,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. (CVPR), (2020), pp. 1750–1758.

6. Q. Fu, L. Si, J. Liu, H. Shi, and Y. Li, “Design and experimental study of a polarization imaging optical system for oil spills on sea surfaces,” Appl. Opt. 61(21), 6330–6338 (2022). [CrossRef]

7. Q. Fu, K. Luo, Y. Song, M. Zhang, S. Zhang, J. Zhan, J. Duan, and Y. Li, “Study of sea fog environment polarization transmission characteristics,” Appl. Sci. 12(17), 8892 (2022). [CrossRef]

8. J. Marco-Rider, A. Cibicik, and O. Egeland, “Polarization image laser line extraction methods for reflective metal surfaces,” IEEE Sens. J. 22(18), 18114–18129 (2022). [CrossRef]

9. J. Liu, J. Duan, Y. Hao, G. Chen, and H. Zhang, “Semantic-guided polarization image fusion method based on a dual-discriminator gan,” Opt. Express 30(24), 43601–43621 (2022). [CrossRef]

10. S. Mo, J. Duan, W. Zhang, X. Wang, J. Liu, and X. Jiang, “Multi-angle orthogonal differential polarization characteristics and application in polarization image fusion,” Appl. Opt. 61(32), 9737–9748 (2022). [CrossRef]

11. S. Wen, Y. Zheng, F. Lu, and Q. Zhao, “Convolutional demosaicing network for joint chromatic and polarimetric imagery,” Opt. Lett. 44(22), 5646–5649 (2019). [CrossRef]

12. N. Li, Y. Zhao, Q. Pan, and S. G. Kong, “Demosaicking dofp images using newton’s polynomial interpolation and polarization difference model,” Opt. Express 27(2), 1376–1391 (2019). [CrossRef]

13. S. Qiu, Q. Fu, C. Wang, and W. Heidrich, “Linear polarization demosaicking for monochrome and colour polarization focal plane arrays,” Comput Graph Forum 40(6), 77–89 (2021). [CrossRef]

14. D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, “Beyond color difference: Residual interpolation for color image demosaicking,” IEEE Trans Image Process 25(3), 1288–1300 (2016). [CrossRef]

15. X. Zeng, Y. Luo, X. Zhao, and W. Ye, “An end-to-end fully-convolutional neural network for division of focal plane sensors to reconstruct s 0, dolp, and aop,” Opt. Express 27(6), 8566–8577 (2019). [CrossRef]

16. T. Kurita, Y. Kondo, L. Sun, and Y. Moriuchi, “Simultaneous acquisition of high quality rgb image and polarization information using a sparse polarization sensor,” in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), (2023), pp. 178–188.

17. M. Morimatsu, Y. Monno, M. Tanaka, and M. Okutomi, “Monochrome and color polarization demosaicking using edge-aware residual interpolation,” in Proc. IEEE Int. Conf. Inf. Process. (ICIP), (IEEE, 2020), pp. 2571–2575.

18. Y. Luo, J. Zhang, and D. Tian, “Sparse representation-based demosaicking method for joint chromatic and polarimetric imagery,” Opt. Lasers Eng. 164, 107526 (2023). [CrossRef]

19. H. Wang, H. Hu, X. Li, L. Zhao, Z. Guan, W. Zhu, J. Jiang, K. Liu, Z. Cheng, and T. Liu, “Joint noise reduction for contrast enhancement in stokes polarimetric imaging,” IEEE Photonics J. 11(2), 1–10 (2019). [CrossRef]

20. L. Yan, K. Jiang, Y. Lin, H. Zhao, R. Zhang, and F. Zeng, “Polarized intensity ratio constraint demosaicing for the division of a focal-plane polarimetric image,” Remote Sens. 14(14), 3268 (2022). [CrossRef]

21. W. Ye, S. Li, X. Zhao, A. Abubakar, and A. Bermak, “Ak times singular value decomposition based image denoising algorithm for dofp polarization image sensors with gaussian noise,” IEEE Sens. J. 18(15), 6138–6144 (2018). [CrossRef]

22. A. Abubakar, X. Zhao, S. Li, M. Takruri, E. Bastaki, and A. Bermak, “A block-matching and 3-d filtering algorithm for gaussian noise in dofp polarization images,” IEEE Sens. J. 18(18), 7429–7435 (2018). [CrossRef]

23. Y. Hao, J. Duan, J. Liu, J. Zhan, and C. Cheng, “Dolp and aop synthesis from division of focal plane polarimeters using cyclegan,” Opt. Commun. 533, 129296 (2023). [CrossRef]

24. C. Lanaras, E. Baltsavias, and K. Schindler, “Hyperspectral super-resolution by coupled spectral unmixing,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), (2015), pp. 3586–3594.

25. M. Hu, S. Wang, B. Li, S. Ning, L. Fan, and X. Gong, “Penet: Towards precise and efficient image guided depth completion,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), (IEEE, 2021), pp. 13656–13662.

26. X. Cheng, P. Wang, C. Guan, and R. Yang, “Cspn++: Learning context and resource aware convolutional spatial propagation networks for depth completion,” in AAAI – AAAI Conf. Artif. Intell. (AAAI), vol. 34 (2020), pp. 10615–10622.

27. Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang, “Joint image filtering with deep convolutional networks,” IEEE Trans. Pattern Anal. Mach. Intell. 41(8), 1909–1923 (2019). [CrossRef]

28. J. Qiu, Z. Cui, Y. Zhang, X. Zhang, S. Liu, B. Zeng, and M. Pollefeys, “Deeplidar: Deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. (CVPR), (2019), pp. 3313–3322.

29. Y. Zhang and T. Funkhouser, “Deep depth completion of a single rgb-d image,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. (CVPR), (2018), pp. 175–185.

30. Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, “Residual dense network for image restoration,” IEEE Trans. Pattern Anal. Mach. Intell. 43(7), 2480–2495 (2021). [CrossRef]

31. Z. Yan, K. Wang, X. Li, Z. Zhang, J. Li, and J. Yang, “Rignet: Repetitive image guided network for depth completion,” in Proc. Eur. Conf. Comput. Vis. (ECCV), (Springer, 2022), pp. 214–230.

32. K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), (2015), pp. 1026–1034.

33. C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang, “Learning a discriminative feature network for semantic segmentation,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. (CVPR), (2018), pp. 1857–1866.

34. Y. Sun, J. Zhang, and R. Liang, “Color polarization demosaicking by a convolutional neural network,” Opt. Lett. 46(17), 4338–4341 (2021). [CrossRef]

35. S. Gao and V. Gruev, “Bilinear and bicubic interpolation methods for division of focal plane polarimeters,” Opt. Express 19(27), 26161–26173 (2011). [CrossRef]

36. J. Wang, M. Wan, G. Gu, W. Qian, K. Ren, Q. Huang, and Q. Chen, “Periodic integration-based polarization differential imaging for underwater image restoration,” Opt. Lasers Eng. 149, 106785 (2022). [CrossRef]

37. R. He, M. Guan, and C. Wen, “Scens: Simultaneous contrast enhancement and noise suppression for low-light images,” IEEE Trans. Ind. Electron. 68(9), 8687–8697 (2021). [CrossRef]

	S1			S2			DoLP			AoP
Method	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	Error[ $^{\circ}$ ]
Bicubic	52.88	0.83	0.73	47.55	0.75	0.63	28.79	0.61	0.33	20.52	0.37	0.27	16.82
Newton’ interpolation	52.13	0.86	0.74	46.32	0.73	0.64	28.31	0.57	0.29	23.35	0.25	0.26	14.53
Fork-net	54.64	0.84	0.77	48.77	0.78	0.69	30.76	0.66	0.65	22.77	0.28	0.33	17.97
Sparse-PDM	56.23	0.88	0.78	49.96	0.79	0.71	32.27	0.68	0.71	25.42	0.51	0.49	10.38

	S $_{0}$			S $_{1},_{2}$			DoLP			AoP
Methods	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	Error[ $^{\circ}$ ]
Spaser input	27.22	0.68	0.55	36.92	0.59	0.43	28.33	0.54	0.42	12.22	0.30	0.12	15.36
With Cspn++	28.73	0.84	0.82	38.54	0.78	0.68	30.05	0.55	0.43	12.58	0.25	0.39	19.54
With PENet	39.93	0.92	0.60	55.57	0.82	0.52	36.56	0.73	0.71	19.03	0.32	0.40	11.33
Sparse-PDM	41.22	0.93	0.89	56.01	0.87	0.76	37.05	0.78	0.68	21.43	0.46	0.43	9.24

	S $_{0}$			S $_{1},_{2}$			DoLP			AoP
Methods	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	Error[ $^{\circ}$ ]
W/O RGBDAN	32.26	0.83	0.82	32.05	0.81	0.79	35.13	0.69	0.71	18.96	0.42	0.42	19.14
W/O RHN $_{2}$	36.58	0.94	0.85	35.32	0.87	0.81	36.42	0.74	0.69	22.35	0.45	0.41	16.03
W/O $L_{1}^{F}$ Loss	35.72	0.93	0.84	34.49	0.86	0.82	35.75	0.72	0.71	21.57	0.46	0.43	10.89
Sparse-PDM	36.75	0.95	0.86	35.38	0.88	0.83	36.76	0.75	0.73	23.92	0.49	0.45	11.05

	S1			S2			DoLP			AoP
Method	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	Error[ $^{\circ}$ ]
Bicubic	52.88	0.83	0.73	47.55	0.75	0.63	28.79	0.61	0.33	20.52	0.37	0.27	16.82
Newton’ interpolation	52.13	0.86	0.74	46.32	0.73	0.64	28.31	0.57	0.29	23.35	0.25	0.26	14.53
Fork-net	54.64	0.84	0.77	48.77	0.78	0.69	30.76	0.66	0.65	22.77	0.28	0.33	17.97
Sparse-PDM	56.23	0.88	0.78	49.96	0.79	0.71	32.27	0.68	0.71	25.42	0.51	0.49	10.38

	S $_{0}$			S $_{1},_{2}$			DoLP			AoP
Methods	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	PSNR	SSIM	PCQI	Error[ $^{\circ}$ ]
Spaser input	27.22	0.68	0.55	36.92	0.59	0.43	28.33	0.54	0.42	12.22	0.30	0.12	15.36
With Cspn++	28.73	0.84	0.82	38.54	0.78	0.68	30.05	0.55	0.43	12.58	0.25	0.39	19.54
With PENet	39.93	0.92	0.60	55.57	0.82	0.52	36.56	0.73	0.71	19.03	0.32	0.40	11.33
Sparse-PDM	41.22	0.93	0.89	56.01	0.87	0.76	37.05	0.78	0.68	21.43	0.46	0.43	9.24

Polarization image demosaicing and RGB image enhancement for a color polarization sparse focal plane array

Abstract

1. Introduction

2. Related works

2.1 Sparse polarization sensor raw image acquisition

2.2 Depth completion and Image Enhancement Method

3. Method

3.1 Overview

3.2 Network architecture

3.2.1 RGB de-artifact network

3.2.2 Polarization prediction network

3.3 Loss function

4. Experimental

4.1 Experimental configurations

4.1.1 Dataset

4.1.2 Training details

4.2 Experimental results

4.2.1 Comparison with other demosaicing methods

4.2.2 Comparison with depth completion networks

4.2.3 Ablation experiments

5. Conclusion

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (9)

Tables (3)

Equations (9)

Optics Express