## Abstract

Super resolution (SR) reconstruction is a valuable technology for acquiring high resolution images from low resolution images without replacing devices. This study concentrated on strategies for dealing with color information in the SR reconstruction process. Based on an algorithm with dictionary learning, different schemes were designed to test which color coordinate systems could obtain better image reconstruction quality, involving the color spaces of RGB, YIQ, YCbCr, HSI, HSV, and CIELAB. Their results were compared via typical numerical measures, and the recommended strategies are to apply SR reconstruction to merely the L* coordinate in CIELAB space or merely the Y coordinate of the YIQ system.

© 2017 Optical Society of America

## 1. Introduction

High resolution (HR) images are often desired in applications since they supply more detailed information. Besides reducing pixel size or increasing chip size on the hardware side, super resolution (SR) reconstruction is a promising technology for obtaining HR images from low resolution (LR) images without updating imaging devices; it is a resolution enhancement approach on the signal processing side [1, 2]. Generally, SR reconstruction requires one frame or multiple frames of LR images [3–5].

Recently, learning-based algorithms have become an open and widely investigated topic in this field, stemming from the use of a database of training images to create plausible high-frequency details in zoomed images [6]. Inspired by compressed sensing theory, a sparse-coding-based SR algorithm was proposed [7, 8], which opened up a favorable direction for learning-based SR studies. Most research in this field has focused on improving the sparse-coding-based method, for example by refining the process of sparse representation [9] or by making the sparse domain selection and regularization adaptive [10]. These algorithms showed the advantage of combining a priori knowledge with the LR images, since they could induce more high frequency information from training samples. However, their reconstruction performance depends on the size of the learned dictionaries (the number of atoms), so increasing the number of atoms may help to achieve better reconstruction but also increases computational complexity. To resolve this problem by optimizing dictionary training, more studies have concentrated on building better dictionaries in the learning process, e.g. dual-dictionary learning [11], multi-scale dictionary learning [12], geometric dictionaries [13], and adaptive dictionary learning [14]. Currently, most learning-based algorithms cannot fulfil real-time requirements, which limits their applications. Therefore, the use of embedded platforms may supply a solution from the hardware side. Another approach to nearly real-time SR reconstruction is based on deep learning such as convolutional neural networks, which show good time performance in the reconstruction process [15, 16]. However, these deep learning algorithms often take several days or more to train and require a large number of image samples. In addition, methods of protecting edge information in deep learning SR algorithms should be investigated.

These SR methods are designed to acquire more spatial information in the process of resolution enhancement, and to a great extent they are presented for grayscale images (intensity images). For a color image the treatment is usually simple: the color information is decomposed into separate dimensions, such as in RGB (red, green, blue) space or in the YCbCr system [17], and the reconstructed grayscale images of the individual dimensions are then merged into an HR color image via a reversible operation. However, few studies have explored the influence of color spaces on image reconstruction, even though implementing the same SR reconstruction algorithm in different color spaces may produce different results.

For that reason, this study aimed to systematically investigate how the selection of color space affects image SR reconstruction. Various typical color spaces were tested and compared using a reconstruction algorithm based on classified dictionary learning. By improving the dictionary training process and sorting image features into several dictionaries, the proposed algorithm could improve the reconstruction quality without increasing computational complexity. The evaluated spaces included RGB, YCbCr, YIQ, HSV (hue, saturation, value), HSI (hue, saturation, intensity), and CIELAB [17–21], which produced corresponding color coordinate systems. Moreover, whether all the dimensions need to be reconstructed was discussed. In addition, the frequently used indexes for comparing reconstruction results, such as peak signal to noise ratio (PSNR) [22] and structural similarity index (SSIM) [22, 23], are based on differences in image digital inputs and do not reflect perceptual differences between two images. Therefore, the comparisons among different color spaces were carried out with typical numerical measures involving both deviations in digital inputs and perceptual differences. Finally, the recommended color coordinate systems and the way to process their dimensions are presented, and the conclusions could support beneficial strategies for dealing with image color information for any given SR reconstruction algorithm in applications.

## 2. Methods and algorithms

#### 2.1 Procedure and selection of color coordinate systems

The whole procedure for a general SR reconstruction algorithm is shown in Fig. 1, together with the process of dealing with image color information. For a digital color image, the digital inputs are usually described in the three dimensions of RGB color space [17], i.e. by the (*d*_{R}, *d*_{G}, *d*_{B}) coordinates of each pixel. This image can be split into three grayscale pictures P_{1}, P_{2}, and P_{3}, each of which represents a separate and independent color coordinate. Most color spaces have a three-dimensional coordinate system. For example, in RGB color space, the grayscale pictures from the red channel *d*_{R}, green channel *d*_{G}, and blue channel *d*_{B} can be regarded as P_{1}, P_{2}, and P_{3} respectively. In other color spaces, the settings of P_{1}, P_{2}, and P_{3} differ accordingly. Afterwards, the three grayscale pictures P_{1}, P_{2}, and P_{3} can be processed with two strategies. The first handles all three pictures separately by a learning-based SR reconstruction algorithm and combines the reconstructed P_{1}', P_{2}', and P_{3}' pictures into a new color image. The second reconstructs merely P_{1} to P_{1}' via the SR reconstruction algorithm and derives P_{2}' and P_{3}' by a pixel interpolation algorithm. Both decomposing one color image into three LR grayscale pictures and merging the processed HR grayscale pictures into one color image are reversible operations, which do not introduce any color transformation errors. In the course of dictionary learning, HR image samples along with their downsampled LR versions are regarded as training samples and handled by a learning procedure with the K-means singular value decomposition (K-SVD) method and the principal component analysis (PCA) method [24]. Then, an HR image with more detail information can be derived from its LR version and the sparse representation coefficients, which are solved based on the pairs of HR dictionaries and LR dictionaries via the orthogonal matching pursuit (OMP) method [25].
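The decomposition and merging steps described above can be sketched as follows. This is a minimal illustration and not the authors' MATLAB implementation: it assumes pixel values in 0..1, uses the YIQ conversion from Python's standard `colorsys` module as a stand-in for the color transforms of this study, and the helper names `split_yiq` and `merge_yiq` are hypothetical.

```python
import colorsys


def split_yiq(rgb_image):
    """Split an RGB image (rows of (r, g, b) tuples in 0..1) into Y, I, Q planes."""
    y_plane, i_plane, q_plane = [], [], []
    for row in rgb_image:
        converted = [colorsys.rgb_to_yiq(r, g, b) for (r, g, b) in row]
        y_plane.append([pixel[0] for pixel in converted])  # P1: luminance
        i_plane.append([pixel[1] for pixel in converted])  # P2: chromatic
        q_plane.append([pixel[2] for pixel in converted])  # P3: chromatic
    return y_plane, i_plane, q_plane


def merge_yiq(y_plane, i_plane, q_plane):
    """Merge three grayscale planes back into an RGB image."""
    return [
        [colorsys.yiq_to_rgb(y, i, q) for y, i, q in zip(y_row, i_row, q_row)]
        for y_row, i_row, q_row in zip(y_plane, i_plane, q_plane)
    ]
```

In the second strategy, only the first plane would be passed to the SR algorithm, while the other two planes would be enlarged by pixel interpolation before merging.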

As for the selection of color coordinates, in principle, color spaces produce corresponding color coordinate systems, which can be divided into three categories. The first category stems from describing color by the additive color principle, such as the (*d*_{R}, *d*_{G}, *d*_{B}) coordinates in RGB color space, as well as the device-independent way of color specification, i.e. the XYZ system stipulated by the Commission Internationale de L'Eclairage (CIE) [20, 21]. The second category of color coordinate systems comes from conventional television signal standards. These systems include the YCbCr of phase alternating line (PAL) and the YIQ proposed by the National Television Standards Committee (NTSC), which employ one dimension to represent bright and dark information (such as luminance) and the other two dimensions to express chromatic information [26]. The third category is based on the theory that human eyes perceive color in three dimensions that are considered color appearance parameters, i.e. hue, brightness/lightness, and colorfulness/chroma (or saturation) [19, 27]. These color systems include not only the HSV and the HSI used in computer graphics [18], but also the CIELCh system generated from the CIE recommended CIELAB color space [20, 21]. Another color coordinate system based on CIELAB space adopts one dimension L* to describe brightness/lightness and two dimensions a*, b* for chromatic information [21], and can be sorted into the second category together with YCbCr and YIQ.

For the color coordinate systems of the first category, the three grayscale pictures P_{1}, P_{2}, and P_{3} are of equal standing and should be handled in the same way. The color coordinate systems of the second and the third categories, in contrast, both have a dimension that expresses the bright and dark information of images, such as luminance or brightness/lightness, which can be set as the P_{1} coordinate in Fig. 1, whereas the P_{2} and P_{3} coordinates represent the two dimensions of chromatic information. Since the bright and dark coordinate is usually considered to carry more information than the two chromatic coordinates in digital color transmission, the spatial resolution enhancement of P_{2} and P_{3} can use either the SR reconstruction algorithm or the pixel interpolation method. Table 1 lists the color spaces/systems selected in this study and their corresponding P_{1}, P_{2}, and P_{3} coordinates. Note that the numerical ranges of the second to the fourth columns in Table 1 differ widely among the color spaces/systems involved, and negative values may even appear. However, for image processing such as SR reconstruction and pixel interpolation, the input data should be positive and lie in a fixed interval such as 0~1 or 0~255. These numerical ranges should therefore be preprocessed by rescaling rather than clipping, since both constraining negative values to 0 and forcing values larger than 1 to 1 would cause severe image distortions. Thus, all the color coordinates in Table 1 are converted to a suitable numerical range: a normalization to 0.1~0.9 is carried out when importing the P_{1}, P_{2}, and P_{3} coordinates into the process of resolution enhancement, to ensure that all the data are suitable for both SR reconstruction and pixel interpolation. 
The normalization range was determined from two considerations. First, boundary points (the minimum 0 and the maximum 1) in intensity images are usually hard to handle, since only one side of the numerical information can be used, so a normalization range narrower than 0~1 is adopted to avoid treating boundary points. Second, settings such as 0.2~0.8 or 0.3~0.7 were not chosen because it is unnecessary to leave too wide a margin for boundary points, and too narrow a range for SR reconstruction and pixel interpolation might be disadvantageous to some degree because it compresses the data distribution. In addition, the two rightmost columns of Table 1 give the names of the strategies for dealing with color information in the process of resolution enhancement, which are also used in “Section 3”.
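The 0.1~0.9 normalization can be sketched as a linear rescaling. The text does not state whether the source range is taken from each image or from the theoretical range of the coordinate, so the sketch below assumes the minimum and maximum of the given channel; the function names are hypothetical.

```python
def normalize_channel(values, lo=0.1, hi=0.9):
    """Linearly map a flat list of channel values onto [lo, hi].

    Assumption: the source range is the min/max of this channel.
    Returns the scaled values plus the range needed to undo the mapping.
    """
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # A constant channel carries no contrast; place it mid-range.
        return [(lo + hi) / 2 for _ in values], vmin, vmax
    scaled = [lo + (v - vmin) * (hi - lo) / span for v in values]
    return scaled, vmin, vmax


def denormalize_channel(scaled, vmin, vmax, lo=0.1, hi=0.9):
    """Invert normalize_channel after resolution enhancement."""
    span = vmax - vmin
    if span == 0:
        return [vmin for _ in scaled]
    return [vmin + (s - lo) * span / (hi - lo) for s in scaled]
```

Unlike clipping, this mapping keeps negative chromatic values (for example in the I, Q or a*, b* coordinates) distinguishable from one another.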

Thus, even for the same SR reconstruction method, the employment of various color spaces/systems would induce different image quality performances and visual effects. Moreover, even for the same selected color space/system, the strategies of whether dealing with P_{1}, P_{2}, and P_{3} by SR method in parallel or handling P_{2} and P_{3} by pixel interpolation may cause different results, which should be evaluated by image numerical measures in detail.

#### 2.2 SR reconstruction of classified dictionary learning

This study presents an SR reconstruction algorithm based on dictionary learning to increase the resolution of a grayscale image, which starts from optimizing the dictionary learning process. In the proposed procedure, all the features (atoms) are sorted into several dictionaries. The computational complexity of the reconstruction step can be reduced, because the atoms involved in reconstructing one image patch are only a fraction of all the atoms. Compared to other SR reconstruction algorithms using the same number of atoms, the proposed SR algorithm would take less time in the reconstruction process. The whole procedure of SR reconstruction is illustrated in Fig. 2. After partitioning the image samples into HR patches and acquiring their corresponding LR patches by downsampling, the features are extracted by the K-SVD method and their dimensionality is reduced by the PCA technique. Then these features from training samples are clustered into separate groups by the K-means cluster algorithm [24, 28], which forms several clusters of dictionary pairs, that is to say, each cluster contains pairs of HR dictionaries and LR dictionaries. Thus, each image patch is reconstructed based on its most suitable dictionary, so that the high quality of reconstructed images can be ensured. The target LR grayscale picture is first cut into a number of LR patches, such as 50 × 50 pixels, since the information in different areas of one image may vary, so that processing patches rather than the whole image makes the matching between the trained dictionary pairs and LR patches more accurate. Afterward, for the extracted features of the target LR patches, a suitable cluster of dictionary pairs with high similarity is chosen according to the weighting information from the K-means cluster algorithm, and then the LR patches of the different clusters are all reconstructed by using the trained HR and LR dictionary pairs along with the solved sparse representation coefficients [29].

The idea of image sparse representation originates from compressed sensing theory, which states that natural images can be sparsely represented by some dictionary matrices [30]. Supposing the image is $x \in R^{n}$, as shown in Eq. (1), it can be represented by the overcomplete dictionary $D = [d_{1}, d_{2}, \ldots, d_{m}] \in R^{n \times m}$ ($n < m$) as a linear combination of its elements, i.e. the vectors $d_{i}$ ($i = 1, 2, \ldots, m$). Here $\alpha = [\alpha_{1}, \alpha_{2}, \ldots, \alpha_{m}]^{T} \in R^{m}$ is the vector of sparse representation coefficients, which satisfies the inequality $\|\alpha\|_{0} \ll n$. The symbol $\|\alpha\|_{0}$ means the number of nonzero elements.
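The display form of Eq. (1) is not reproduced in this text. Under the definitions above, a standard way to write the sparse representation model, given here as an assumption rather than as the authors' exact typesetting, is:

$$
x = D\alpha = \sum_{i=1}^{m} \alpha_{i} d_{i}, \qquad \|\alpha\|_{0} \ll n
$$

Only the few atoms $d_{i}$ with nonzero $\alpha_{i}$ contribute to $x$, which is what allows a patch to be described by a small number of dictionary elements.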

Specifically, based on the theory of image sparse representation, the two key technical parts are elaborated in the following.

### (i) Classified dictionary learning

Dictionary learning is to train a dictionary pair of sparse representation by the method of machine learning using the given samples, in which features of training samples are split into several clusters, and pairs of HR dictionaries and corresponding LR dictionaries are derived. The employment of classified dictionaries can use more features and reduce training time because of the introduction of similarity among features.

Each image sample is partitioned into $p$ HR patches. Denote the $i$th original HR patch as $x_{i}$ ($i = 1, 2, \ldots, p$), so that all the original HR samples are $\{x_{i}\}$. The bicubic method is used to reduce the resolution of $\{x_{i}\}$ (so that the number of pixels on each side becomes 1/3 of the original), and the results are then enlarged via the same method to obtain the corresponding LR patches $\{u_{i}\}$. As shown in Eq. (2), the low frequency information is removed from the HR patches to get the HR features.

Afterwards, high frequency features are extracted by the two-dimensional filtering operator F shown in Eq. (3), including first-order and second-order derivatives.

The four sub-operators of F are represented in Eq. (4), in which LoG means an operation of 5 × 5 Laplacian of Gaussian filtering.

After the F operation, the high frequency features of the HR patch $f_{x}^{i}$ and those of the LR patch $f_{u}^{i}$ are obtained, so that the pair of extracted features can be expressed as $\{f_{x}^{i}, f_{u}^{i}\}$. To reduce the time cost of dictionary learning, dimensionality reduction is applied to the high frequency features of the LR patch via the PCA algorithm.

The high frequency features can be classified into $K$ clusters with $K$ cluster centers using the K-means cluster algorithm, where the cluster centers are $C_{k}$ ($k = 1, 2, \ldots, K$). The features in the $k$th cluster can also be written as Eq. (5).
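The clustering step can be illustrated with a minimal K-means sketch in plain Python. It is not the implementation used in this study: the initialization from the first $K$ feature vectors, the fixed iteration count, and the function names are assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iterations=20):
    """Cluster feature vectors into k groups (Lloyd's algorithm).

    Assumption: centers start from the first k feature vectors.
    Returns (centers, labels), with labels[i] the cluster of features[i].
    """
    centers = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each feature goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(f, centers[c]))
            for f in features
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels
```

Each resulting group of features would then be used to train its own dictionary pair, and at reconstruction time the center nearest to an LR patch feature indicates which pair to use.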

The process of classified dictionary learning is expressed as Eq. (6), by which each cluster can be trained into a dictionary pair $D^{k}$, and $\alpha_{i}$ is its matrix of sparse representation coefficients, which contains the elements $\alpha_{i}^{k}$ ($k = 1, 2, \ldots, K$). The parameter $T_{k}$ controls the degree of sparse representation.

This is an ill-posed problem, and the K-SVD method can be employed to solve Eq. (6) and to calculate $D_{h}^{k}$ and $D_{l}^{k}$, so that finally the $k$th dictionary pair $D^{k} = \{D_{h}^{k}, D_{l}^{k}\}$ is obtained.

### (ii) Reconstruction algorithm

After extracting each dictionary pair from the training samples by the learning process, a reconstruction algorithm is adopted to accurately recover high frequency information for an input LR image.

Firstly, this LR image is partitioned into $m$ LR patches with a fixed size of $n \times n$ pixels, where $n$ is a constant. Then, each LR patch is scaled up using the bicubic method (so that the number of pixels on each side becomes 3 times that of the original) to obtain the $i$th patch $y_{i}$ ($i = 1, 2, \ldots, m$).

The feature extraction for patch $y_{i}$ can be implemented by the filtering operator F shown in Eqs. (3) and (4), so that its LR features $\{F y_{i}\}$ are obtained. Afterwards, the dimensionality reduction of the PCA algorithm is applied to $\{F y_{i}\}$.

For the classified dictionaries, the dictionary pair used to calculate the HR version is selected based on the similarity between each cluster center $C_{k}$ ($k = 1, 2, \ldots, K$) and this LR patch $y_{i}$. According to the K-means cluster algorithm, Eq. (7) expresses the function to find the most suitable dictionary pair $D^{k} = \{D_{h}^{k}, D_{l}^{k}\}$, in which $g_{k}\{y_{i}, C_{k}\}$, the criterion of similarity, is expressed as the membership grade of the Euclidean distance between $y_{i}$ and $C_{k}$.

According to the optical observation model, each LR patch $y_{i}$ can be expressed as Eq. (8), where S, H, and $\eta$ denote the downsampling, blurring, and additive noise of the optical system respectively, and $\widehat{y}_{i}$ represents the ideal high resolution patch.

The key step is to calculate the unknown ideal HR patch $\widehat{y}_{i}$ from the LR patch $y_{i}$ and the selected dictionary pair $D^{k} = \{D_{h}^{k}, D_{l}^{k}\}$, for which the expressions of Eqs. (9) and (10) are used. Here, $\beta_{i}^{k}$ is the sparse representation coefficient of $y_{i}$ under the selected sub-dictionary $D_{l}^{k}$, and $\beta_{i}^{k}$ can be calculated by the OMP method. As shown in Eq. (10), the ideal HR patch $\widehat{y}_{i}^{k}$ can be obtained by multiplying $D_{h}^{k}$ by $\beta_{i}^{k}$.

Thus, the above steps are repeated for the other LR patches to obtain their HR patches, and finally the $m$ HR patches are joined together into a whole HR image.
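The coefficient solving and HR synthesis steps can be sketched with a small orthogonal matching pursuit in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: dictionaries are stored as lists of atoms (columns), the LR atoms are assumed to have unit norm and to be linearly independent, the least squares step uses the normal equations, and the names `omp`, `solve_linear`, and `reconstruct_hr` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve_linear(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def omp(lr_atoms, y, sparsity):
    """Return sparse coefficients beta such that sum(beta[j] * lr_atoms[j]) ~ y."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(sparsity):
        # Select the unused atom most correlated with the residual.
        scores = [abs(dot(atom, residual)) for atom in lr_atoms]
        for j in support:
            scores[j] = -1.0
        best = max(range(len(lr_atoms)), key=scores.__getitem__)
        if scores[best] <= 1e-12:
            break
        support.append(best)
        # Least squares fit of y on all selected atoms (normal equations).
        gram = [[dot(lr_atoms[i], lr_atoms[j]) for j in support] for i in support]
        proj = [dot(lr_atoms[i], y) for i in support]
        coeffs = solve_linear(gram, proj)
        approx = [
            sum(c * lr_atoms[j][t] for c, j in zip(coeffs, support))
            for t in range(len(y))
        ]
        residual = [yt - at for yt, at in zip(y, approx)]
    beta = [0.0] * len(lr_atoms)
    for c, j in zip(coeffs, support):
        beta[j] = c
    return beta


def reconstruct_hr(hr_atoms, beta):
    """Synthesize the HR patch as the HR dictionary times the coefficients."""
    length = len(hr_atoms[0])
    return [sum(b * atom[t] for b, atom in zip(beta, hr_atoms)) for t in range(length)]
```

The same coefficients are applied to both dictionaries of the pair: they are solved against the LR atoms and then reused with the HR atoms, which is the step that Eq. (10) describes.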

To show the visual effects of the proposed SR reconstruction algorithm, a grayscale image of the optical resolution test board RT-MIL-T4102 was employed as the original HR image, and its downsampled version served as the LR image. The reconstructed HR version from the above algorithm (with a magnification factor of 3 × 3) could then be visually assessed and compared. Figure 3 shows the visualization results, from which the spatial resolution in line pairs can be calculated. From Fig. 3, it can be seen that, compared to the LR version, this algorithm improves the identifiable line pairs by about 40 percent (from 1.41 lp/mm to 2.00 lp/mm).

## 3. Experiment and discussions

To analyze the impacts of various color coordinate systems and their strategies of dealing with color, a set of digital color images exhibited in Fig. 4 [all photos were taken by the authors] was adopted as test samples, and the image contents involved plants, buildings, portraits, still objects, landscapes, etc. The selection of these test samples aimed to cover a certain range in color, shadow, and frequency to a feasible extent. The HR versions of the test images were regarded as “ideal answers”, and they were downsampled by averaging the information of the nearest 3 × 3 pixels to get the LR test images, a process similar to the fuzzy sampling of digital cameras. The resolution of the HR images was 1800 × 1800 pixels, so the resolution of their LR versions was 600 × 600 pixels. Other magnification factors are also feasible in the SR reconstruction procedure; here we adopted the setting of 3 × 3. As the inputs of the process, the selected LR images were first split into the three color coordinates P_{1}, P_{2}, and P_{3} according to the settings of Table 1, and then the grayscale pictures from each coordinate were reconstructed via the SR algorithm of classified dictionary learning described in “Section 2.2”. For the reconstruction in the CIELAB color space, the (*d*_{R}, *d*_{G}, *d*_{B}) coordinates of RGB color space were first transformed to the XYZ system according to the sRGB (standard red green blue color space) standard [31] for digital images, and then the color coordinates in CIELAB space, i.e. (L*, a*, b*) and (L*, C_{ab}*, h_{ab}), were obtained [20, 21]. In this way, 14 HR versions in total were collected for each LR test image.
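The LR test images are described as block averages of the HR images. A minimal sketch of that downsampling for one grayscale plane is given below; the function name and the list-of-rows image layout are assumptions made for the example.

```python
def downsample_by_averaging(image, factor=3):
    """Average each factor x factor block of a grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[r][c]
                for r in range(top, top + factor)
                for c in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result
```

With `factor=3`, an 1800 × 1800 plane becomes a 600 × 600 plane, matching the image sizes stated in the text.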

To compare which color coordinate systems perform more effectively for SR reconstruction, a numerical analysis was implemented by calculating the differences between the reconstructed images and the original HR test images. It included not only the common indexes of PSNR and SSIM, but also the CIE recommended color difference formula, i.e. CIEDE2000 (ΔE_{00}). Color difference was introduced because PSNR mainly shows the deviations in image digital inputs (*d*_{R}, *d*_{G}, *d*_{B}), and SSIM presents the structural similarity of content between two images, also based on digital inputs, while color difference formulae are more suitable for converting deviations in digital inputs (*d*_{R}, *d*_{G}, *d*_{B}) into differences in human perception. Therefore, CIEDE2000 was employed; color difference values smaller than 3 ΔE_{00} units are usually considered visually acceptable between two colors [32, 33]. The mean ΔE_{00} value over all the pixels was employed as an index to compare the overall performance of the various strategies, whereas the maximum ΔE_{00} value was adopted to show the worst visual effects caused by improper reconstruction. Table 2 lists the results of these four indexes for all the 14 strategies, along with their time cost, so that time efficiency and image quality can be considered together. Here, the time cost was recorded on a computer (Intel Core CPU i5-4460 3.20GHz, RAM 8G) with MATLAB R2014b software. In Table 2, the listed data of the four indexes are the mean values calculated from the numerical results of all the 18 selected images for each given strategy.
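For reference, PSNR follows the standard definition $10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$. The sketch below computes it for a single grayscale plane; how this study pooled the three color channels is not stated, so the single-plane form and the default peak of 255 are assumptions.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal to noise ratio in dB between two equally sized planes."""
    squared_errors = [
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR depends only on the mean squared deviation of digital values, two reconstructions with the same PSNR can still differ in how visible their color errors are, which is the motivation for adding ΔE_{00}.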

From Table 2, the performances of the 14 strategies for selecting color coordinate systems and dealing with the P_{2} and P_{3} coordinates can be deduced. According to the definitions of these indexes, higher values of PSNR and SSIM indicate higher similarity between two images, while higher values of ΔE_{00} indicate poorer similarity. As for SSIM, all the values are 1.000 to four significant digits, indicating that SSIM is not a suitable parameter for evaluating differences between the reconstructed version and the ideal version of the same image: it is not sensitive to differences between two closely resembling images, and it pays more attention to the similarity of contents such as image outlines. The SSIM and PSNR results show consistent tendencies across the 14 strategies, but PSNR is often more sensitive to image contents to a certain extent, e.g. for the different test images T1~T18, the PSNR ranges of the 14 SR versions lie in separate numerical regions. CIEDE2000, in contrast, is mainly concerned with the perceptual differences between two images and is largely independent of the contents of the test images. As shown by the PSNR and CIEDE2000 results, the better image reconstruction quality comes from the strategies of M-YIQ-1, M-YIQ-3, M-LAB-1, and M-LAB-3 (mean CIEDE2000 color difference values smaller than 2 ΔE_{00} units, and PSNR values larger than 27 dB). Their color coordinate systems have a common feature: they are all from the second category of color systems, with one dimension to represent bright and dark information and the other two dimensions to express chromatic information. Medium performance comes from M-RGB, M-XYZ, M-YCbCr-1, M-YCbCr-3, M-LCh-1, and M-LCh-3, while the strategies of M-HSV-1, M-HSV-3, M-HSI-1, and M-HSI-3 cannot achieve good SR reconstruction effects. In Fig. 5, all the 14 SR reconstructed versions and the original HR version of a representative area in image T3 are exhibited. Compared to the “ideal answer”, the strategies of M-LCh-1, M-LCh-3, M-HSV-1, M-HSV-3, M-HSI-1, and M-HSI-3 lead to some wrong intermediate colors in the stripes of the petals.

The strategies with poor performance have a common ground: they are all based on color coordinate systems from the third category, which express color information in the three dimensions of hue, brightness/lightness, and colorfulness/chroma. Figure 6 depicts another extreme example of the failures caused by these color coordinate systems. It can be seen that the pixels at the color boundaries between contents with disparate color information may suffer from serious distortion, especially for the HSV and HSI systems. These results are also confirmed by the extremely high values of maximum ΔE_{00} in Table 2, since values larger than 10 ΔE_{00} units can be regarded as abnormal reconstructed colors for the corresponding pixels. According to the visualization results and the numerical calculation of maximum ΔE_{00}, it can be concluded that these coordinates are not suitable for SR reconstruction or pixel interpolation. One explanation for their poor results is that the hue and saturation (a dimension similar to colorfulness/chroma) information is discontinuous in color images: even within an area of the same content, hue and saturation show disconnected points, which makes them inappropriate inputs for both the pixel interpolation and the SR reconstruction algorithm. Treating the hue and saturation coordinates as grayscale images may therefore produce improper intermediate values far from the truth, which is the reason for the emergence of the green pixels in the contiguous area between the white pixels and the red pixels, as shown in Fig. 6.
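The effect of treating hue as a grayscale value can be illustrated with a small numerical example. It uses the HSV conversion from Python's standard `colorsys` module and a simple midpoint as a stand-in for interpolation; the chosen colors and the midpoint rule are assumptions for illustration and are not taken from Fig. 6.

```python
import colorsys


def midpoint(a, b):
    """Component-wise average of two 3-tuples."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


red = (1.0, 0.0, 0.1)    # a slightly bluish red, hue just below the wrap-around
white = (1.0, 1.0, 1.0)  # achromatic, so colorsys reports hue 0

# Averaging in RGB gives a pinkish color between red and white.
mid_rgb = midpoint(red, white)

# Averaging hue, saturation, and value as if they were gray levels
# lands on a hue halfway around the circle from red.
mid_hsv = colorsys.hsv_to_rgb(
    *midpoint(colorsys.rgb_to_hsv(*red), colorsys.rgb_to_hsv(*white))
)
```

Here `mid_rgb` keeps red as its strongest channel, whereas `mid_hsv` has red as its weakest channel and green as its strongest, which is the kind of false intermediate color described above.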

As for the time cost, an obvious outcome is that the strategies treating the P_{2} and P_{3} coordinates with the SR process take about 3 times as long as those using the pixel interpolation method. Thus, in terms of productivity, M-YIQ-1 and M-LAB-1 take less time and achieve quality equivalent to M-YIQ-3 and M-LAB-3, which makes them more practical and more efficient in applications. Moreover, to show the whole reconstruction process on real images using the recommended strategies of dealing with color, Fig. 7 shows the staged visual results of each step when reconstructing a color image by the M-LAB-1 strategy (only a part of image T14 with high frequency information is depicted, in order to give more details). It indicates that the L* coordinate carries more detailed information than the a* and b* coordinates, so there is no need to employ the SR algorithm for the two chromatic dimensions a* and b*.

To sum up, the conclusions derived from the results include the following three points. Firstly, the color coordinate systems of CIELAB and YIQ are suitable mapping spaces for resolution enhancement operations, including pixel interpolation and SR reconstruction. Secondly, taking time cost into consideration, it is unnecessary to treat the coordinates P_{1}, P_{2}, and P_{3} all by the SR reconstruction process, since only the P_{1} coordinate of bright and dark information (including luminance and brightness/lightness) is the dimension that needs to be reconstructed by the SR algorithm. Accordingly, for the RGB and XYZ color systems from the first category, the algorithms may take more time, though they achieve acceptable image quality for SR reconstruction. Thus, the recommended strategies for dealing with color information in SR reconstruction are to implement the SR algorithm for merely the L* coordinate of CIELAB space or merely the Y coordinate of the YIQ system, while the other two coordinates should use pixel interpolation to enhance resolution. Thirdly, the color coordinate systems from the third category, with the three dimensions of hue, brightness/lightness, and colorfulness/chroma, cause severe color distortions when they are treated as coordinates in the process of resolution enhancement, especially for the HSV and HSI systems. The color coordinate systems from this category cannot provide suitable mapping spaces for either SR reconstruction or pixel interpolation.

## 4. Conclusions

To explore optimum strategies for dealing with color information in SR reconstruction, a method with classified dictionary learning was designed, and various color spaces/systems including RGB, YIQ, YCbCr, HSI, HSV, and CIELAB were tested. Moreover, whether all three color dimensions need to be reconstructed was tested for those color spaces. Besides PSNR and SSIM, the color difference formula CIEDE2000 was employed to reflect perceptual differences. Based on the comparisons of these numerical measures, the recommended strategies for obtaining good image reconstruction quality are to adopt merely the L* coordinate in CIELAB space or merely the Y coordinate of the YIQ system, which indicates that color coordinate systems with one dimension of bright and dark information and the other two dimensions expressing chromatic information have more advantages in the process of SR reconstruction. That is to say, only the coordinate of bright and dark information should be reconstructed by the method of classified dictionary learning. Though handling all three coordinates in the same way for (Y, I, Q) and (L*, a*, b*) can also yield good effects, it costs more computation time. Another significant point is that color spaces with the three perceptual parameters of hue, brightness/lightness, and colorfulness/chroma are not suitable for supplying coordinates in the resolution enhancement process, for either pixel interpolation or SR reconstruction. Our further study will focus on increasing the efficiency of the whole SR reconstruction procedure and on tailoring the dictionaries to particular kinds of images.

## Funding

National Natural Science Foundation of China (NSFC) (61505156, 61575154); Fundamental Research Funds for the Central Universities (JB150512).

## References and Links

**1.** P. Milanfar, *Super-resolution Imaging* (CRC, 2011).

**2.** S. C. Park, M. K. Park, and M. G. Kang, “Super-resolution image reconstruction: a technical overview,” IEEE Signal Process. Mag. **20**(3), 21–36 (2003).

**3.** D. Glasner, S. Bagon, and M. Irani, “Super-resolution from a single image,” in Proceedings of IEEE International Conference on Computer Vision (IEEE, 2009), pp. 349–356.

**4.** S. H. Rhee and M. G. Kang, “Discrete cosine transform based regularized high-resolution image reconstruction algorithm,” Opt. Eng. **38**(8), 1348–1356 (1999).

**5.** S. D. Babacan, R. Molina, and A. K. Katsaggelos, “Variational Bayesian Super Resolution,” IEEE Trans. Image Process. **20**(4), 984–999 (2011).

**6.** W. T. Freeman, T. R. Jones, and E. C. Pasztor, “Example-based super-resolution,” IEEE Comput. Graph. Appl. **22**(2), 56–65 (2002).

**7.** J. Yang, J. Wright, T. Huang, and Y. Ma, “Image super-resolution as sparse representation of raw image patches,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2008), pp. 1–8.

**8.** J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process. **19**(11), 2861–2873 (2010).

**9.** R. Zeyde, M. Elad, and M. Protter, “On single image scale-up using sparse-representations,” in International Conference on Curves and Surfaces (IEEE, 2010), pp. 711–730.

**10.** W. Dong, L. Zhang, G. Shi, and X. Wu, “Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization,” IEEE Trans. Image Process. **20**(7), 1838–1857 (2011).

**11.** J. Zhang, C. Zhao, R. Xiong, S. Ma, and D. Zhao, “Image super-resolution via dual-dictionary learning and sparse representation,” in IEEE International Symposium on Circuits and Systems (IEEE, 2012), pp. 1688–1691.

**12.** K. Zhang, X. Gao, D. Tao, and X. Li, “Multi-scale dictionary for single image super-resolution,” IEEE Computer Vision Pattern Recognition **157**(10), 1114–1121 (2012).

**13.** S. Yang, M. Wang, Y. Chen, and Y. Sun, “Single-Image Super-Resolution Reconstruction via Learned Geometric Dictionaries and Clustered Sparse Coding,” IEEE Trans. Image Process. **21**(9), 4016–4028 (2012).

**14.** Q. Liu, S. Wang, L. Ying, X. Peng, Y. Zhu, and D. Liang, “Adaptive dictionary learning in sparse gradient domain for image recovery,” IEEE Trans. Image Process. **22**(12), 4652–4663 (2013).

**15.** C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convolutional networks,” in *IEEE Transactions on Pattern Analysis and Machine Intelligence* (IEEE, 2015), pp. 295–307.

**16.** J. Kim, J. K. Lee, and K. M. Lee, “Accurate image super-resolution using very deep convolutional networks,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2016), pp. 1646–1654.

**17.** E. J. Giorgianni and T. E. Madden, *Digital Color Management: Encoding Solutions*, 2nd ed. (John Wiley & Sons, 2008).

**18.** J. C. Russ, *The Image Processing Handbook*, 6th ed. (CRC, 2011).

**19.** R. W. G. Hunt, *The Reproduction of Color*, 6th ed. (John Wiley & Sons, 2004).

**20.** CIE 15.3, Colorimetry, 3rd ed. (Commission Internationale de L'Eclairage, Vienna, 2004).

**21.** G. Wyszecki and W. S. Stiles, *Color Science: Concepts and Methods, Quantitative Data and Formulae*, 2nd ed. (John Wiley & Sons, 2000).

**22.** Z. Wang and A. C. Bovik, *Modern Image Quality Assessment* (Morgan & Claypool, 2006).

**23.** Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process. **13**(4), 600–612 (2004).

**24.** M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process. **54**(11), 4311–4322 (2006).

**25.** Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition,” in Proceedings of 27th Asilomar Conference on Signals, Systems and Computers (IEEE, 1993), pp. 40–44.

**26.** E. Dubois, *The Structure and Properties of Color Spaces and the Representation of Color Images* (Morgan & Claypool, 2010).

**27.** M. D. Fairchild, *Color Appearance Models*, 2nd ed. (John Wiley & Sons, 2005).

**28.** M. J. Gangeh, A. Ghodsi, and M. S. Kamel, “Kernelized supervised dictionary learning,” IEEE Trans. Signal Process. **61**(19), 4753–4767 (2013).

**29.** Y. Zhou, K. Liu, R. E. Carrillo, K. E. Barner, and F. Kiamilev, “Kernel-based sparse representation for gesture recognition,” Pattern Recognit. **46**(12), 3208–3222 (2013).

**30.** M. Elad, *Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing* (Springer, 2010).

**31.** IEC 61966-2-1, Multimedia systems and equipment – Colour measurement and management – Part 2-1: Colour management – Default RGB colour space – sRGB, Amendment 1 (IEC, Switzerland, 2003).

**32.** M. R. Luo, G. Cui, and B. Rigg, “The development of the CIE 2000 Colour-Difference formula: CIEDE2000,” Color Res. Appl. **26**(5), 340–350 (2001).

**33.** CIE 142, Improvement to industrial colour-difference evaluation (Commission Internationale de L'Eclairage, Vienna, 2001).