Denoising algorithm of OCT images via sparse representation based on noise estimation and global dictionary

Xi Zhang; Xi Zhang; Zhongliang Li; Zhongliang Li; Zhongliang Li; Nan Nan; Nan Nan; Xiangzhao Wang; Xiangzhao Wang; Xiangzhao Wang

doi:10.1364/OE.447668

1. Introduction

Optical coherence tomography (OCT) is a high-resolution, non-invasive optical tomography technology that can detect the microstructure of tissue in vivo by detecting the intensity and time delay of backscattered light [1–4]. However, OCT images inevitably suffer from speckle noise because the OCT system exploits low-coherence interferometry. In addition, OCT images are also affected by detection noise and photon shot noise. By comparison, speckle noise is a multiplicative noise, which is the main noise component in a fine-designed OCT system. Therefore, OCT denoising technology mainly deals with speckle noise. Speckle noise severely degrades the signal-to-noise ratio of the images and destroys the structural features and detailed information of the images [5,6], which significantly affects subsequent image processing. Therefore, speckle noise is not conducive to achieving accurate diagnosis using the OCT system in clinical medicine. Therefore, it is of great significance to improve the quality of OCT images by removing speckle noise from them.

At present, researchers have proposed various methods to eliminate speckle noise in OCT images to improve their quality. Generally speaking, these methods can mainly be divided into two categories: hardware-based methods and software-based methods. The hardware correction methods, such as angular compounding [7], spatial combination and average [8], frequency compounding [9,10], polarization compounding [11], improve the hardware of the OCT system to reduce or remove some noise in the image. However, it is not easy to adapt these methods to commercial OCT systems, because they require significant modifications in the hardware of existing imaging systems; further, additional hardware, some of which are expensive, increases the system complexity. It has been observed that the method based on software digital post-processing is low-cost and highly efficient. Thus, it has become a research hotspot in the field of speckle-noise removal from OCT images. Some examples of software-based methods are the filtering method [12], wavelet decomposition transform method [13,14], total variation method [15], nonlocal mean method [16], sparse representation method [17], etc. Although these methods effectively remove speckle noise from OCT images, many of them have high complexity and long running times. In addition, some methods that deploy global threshold parameters destroy the local detailed features and texture information of the image.

In recent years, sparse representation has been applied in many research fields owing to the rapid development of the compressed sensing theory. Based on sparse representation, a new image denoising scheme has been proposed [18–21]. Zhang and Mallat proposed the concept of sparse representation based on a dictionary in 1993 [22]. Because dictionaries can effectively represent the characteristics of signals, sparse representation theory and dictionaries have been extensively studied. Donoho and Elad [23] proposed a sparse representation model of an image in 2006 and introduced a dictionary learning algorithm for image denoising, they achieved significant denoising results. Sparse representation [24] models a signal as a sparse linear combination of basis elements, termed atoms, from a dictionary. One of the key steps in sparse representation is designing a complete dictionary [25]. With advances in research, various sparse representation methods also have an increasing requirement of dictionaries. In general, the more similar the structures of the dictionary and image to be measured, the better the sparse representation of the image. At present, there are two types of dictionaries, namely, fixed dictionaries and learning dictionaries. The fixed dictionaries are difficult to effectively identify local features of an image due to the limited dimensions, which makes them only suitable for limited types of images. Learning dictionaries are divided into global dictionaries and adaptive dictionaries [26]. At present, OCT image denoising is mainly performed using an adaptive dictionary, which can highly fit the image according to the relevant attributes of the image and then reconstruct a high-quality image. However, the dictionary must be updated iteratively in the calculation process, which increases the algorithm complexity. A global dictionary that is universally applicable to large-scale similar images is obtained using large amounts of data training sets. In the sparse representation process, compared with the adaptive dictionary that continually updates the dictionary, the global dictionary only requires one round of dictionary learning. Besides, the global dictionary has relatively large dimensions, which enables fast computation.

In clinical applications, such as ophthalmic OCT, dermatological OCT, and cardiovascular OCT, each type of OCT system detects the same kind of measured samples. The structural differences of the tested samples on OCT images are usually reflected in the structural ensemble, such as the location or contour of the tissue. However, the constitution of the sample tissues is similar, thus exhibiting higher similarity in image details. In practical detection, an adaptive dictionary for each noisy image requires dictionary learning to achieve superior noise reduction, which increases the computational complexity significantly. Thus, it is difficult to meet the need for imaging speed in clinical diagnosis. If a global dictionary containing sample structure information is obtained through dictionary learning on multiple high-quality OCT images and the dictionary is used to process noisy images, then there is no need to learn the dictionary for each image, thus greatly improving the efficiency of the denoising method.

In this study, the basic idea of sparse representation denoising is used to remove speckle noise in OCT images. Taking the high-quality images as the training sample, the global dictionary containing the image information of the training sample is constructed using the K-means singular value decomposition (K-SVD) algorithm. Then, the noise intensity of weak texture blocks in noisy OCT images is estimated based on the noise variance estimation method of the Laplacian filter. According to the global dictionary and image noise intensity, the noisy image is sparsely decomposed and reconstructed using the orthogonal matching pursuit (OMP) algorithm, which can quickly and effectively remove the noise of the OCT image.

2. Methodology

2.1 Overview of OCT Image denoising algorithm

The sparse representation assumes that most signals of interest can be estimated using a linear combination of a few (sparse) elements from a set of basis functions (atoms) that is called a dictionary. To exploit the sparse model, the noisy image needs to be divided into overlapping blocks in image denoising algorithms. Taking an image block as a column vector, all the image blocks are assembled into a matrix to be processed; subsequently, the image blocks in the matrix are restored to an image after the completion of all the subsequent operations. Noise removal using a global dictionary is mainly used to solve the following problem [23]

(1)$$\{ {\mathop \alpha \limits^ \wedge _{ij}},\mathop {\boldsymbol X}\limits^ \wedge \} = \mathop {\arg \min }\limits_{{\alpha _{ij}},{\boldsymbol X}} \lambda \parallel {\boldsymbol X} - {\boldsymbol Y}\parallel _2^2 + \sum\limits_{ij} {{\mu _{ij}}} \parallel {\alpha _{ij}}{\parallel _0} + \sum\limits_{ij} {\parallel {{\boldsymbol R}_{ij}}{\boldsymbol X} - {\boldsymbol D}{\alpha _{ij}}\parallel _2^2} ,$$

where ${\boldsymbol X}$, ${\boldsymbol Y}$, and $\lambda $ are the denoised image, the noisy image, and fidelity, respectively. In this expression, the first term on the right-hand side of the equation is used to ensure that the noisy image and the denoised one are similar. As a constraint, the first term on the right can also be expressed as ${\parallel} {\boldsymbol X} - {\boldsymbol Y}\parallel _2^2 \le const \cdot {\sigma ^2}$, reflecting the direct relationship between $\lambda $ and $\sigma $, where ${\parallel} {\boldsymbol X} - {\boldsymbol Y}\parallel _2^2$ represents the sum of squares of each element in the vector and $\sigma $ is the image noise intensity. Subscript $ij$ denotes the image pixel, ${\alpha _{ij}}$ is the sparse encoding coefficient vector corresponding to each image block, ${\mu _{ij}}$ is the weight of sparsity ${\parallel} {\alpha _{ij}}{\parallel _0}$, which is used to constrain the number of non-zero elements in a sparse vector, ${{\boldsymbol R}_{ij}}$ is an image block extracted from a noisy image, ${\boldsymbol D}$ is a global dictionary, and ${\boldsymbol D}{\alpha _{ij}}$ is the reconstructed sub-image. Here, the error between ${\boldsymbol D}{\alpha _{ij}}$ and ${{\boldsymbol R}_{ij}}{\boldsymbol X}$ should be as small as possible. In this expression, the second and third terms on the right represent prior knowledge of the image, which ensures that each block ${{\boldsymbol R}_{ij}}{\boldsymbol X}$ of the reconstructed image ${\boldsymbol X}$ has a bounded representation error.

The global dictionary ${\boldsymbol D}$ is constructed through training on a corpus of patches obtained from a high-quality set of images. Equation (1) contains only two unknown parameters, which are the sparse representation ${\alpha _{ij}}$ of each image block and the denoised image to be restored ${\boldsymbol X}$. First, ${\boldsymbol X}$ is initialized to ${\boldsymbol Y}$, and each image block of ${\boldsymbol X}$ is sparsely encoded to get the optimal solution ${\mathop \alpha \limits^ \wedge _{ij}}$

(2)$${\mathop \alpha \limits^ \wedge _{ij}} = \mathop {\arg \min }\limits_\alpha {\mu _{ij}}\parallel \alpha {\parallel _0} + \sum\limits_{ij} {\parallel {{\boldsymbol R}_{ij}}{\boldsymbol X} - {\boldsymbol D}} {\alpha _{ij}}\parallel _2^2.$$

In general, the optimization problems of Eq. (2) are usually solved using greedy tracing and relaxation optimization algorithms. Since the orthogonal matching pursuit (OMP) algorithm in the greedy pursuit algorithm is simple and efficient [27], it is used to solve the problem in Eq. (2). In the calculation process, the residual error of each time is continuously approximated until the set sparsity or error limit requirements are met, and the sparse representation coefficient ${\mathop \alpha \limits^ \wedge _{ij}}$ for each image block is obtained. The error limit set here is determined by the bounded error $\varepsilon $ and noise intensity $\sigma $. After obtaining ${\mathop \alpha \limits^ \wedge _{ij}}$ of all the image blocks, ${\boldsymbol X}$ is updated and solved to obtain the final denoised image

(3)$$\mathop {\boldsymbol X}\limits^ \wedge{=} {(\lambda {\boldsymbol I} + \sum\limits_{ij} {{\boldsymbol R}_{ij}^T{{\boldsymbol R}_{ij}}} )^{ - 1}}(\lambda {\boldsymbol Y} + \sum\limits_{ij} {{\boldsymbol R}_{ij}^T} {\boldsymbol D}{\mathop \alpha \limits^ \wedge _{ij}}),$$

where ${\boldsymbol I}$ is the unit matrix. Several values $\lambda $, which are dependent on the noise intensity $\sigma $ of the image, have been tested in previous research [23]. It is empirically found that the best result is achieved with $\lambda \approx {{30} / \sigma }$.

When imaging the sample to be measured by OCT, there are differences in the noise intensity $\sigma $ of the different images obtained owing to the influence of external environmental factors and the different samples being measured. A dictionary learning denoising algorithm with fixed parameters can achieve good denoising performance when the difference in noise intensity is small. However, the influence of noise on the denoising result of the algorithm is not negligible when the noise intensities differ significantly (we have verified this in the subsequent simulation). Thus, it is necessary to estimate the noise intensity $\sigma $ for each image before performing the denoising operation.

Based on the above considerations, Fig. 1 shows the flow of the proposed OCT image denoising algorithm. First, high-quality OCT images are used as training samples to construct a global dictionary ${\boldsymbol D}$ containing sample image information. Then, the noisy OCT image is divided into blocks, and the weak texture image block with the minimum entropy value ${E_{\min }}$ is selected according to the gray-level co-occurrence matrix (GLCM). The noise intensity $\sigma $ of the OCT image is calculated by using the Laplacian filter-based noise variance estimation method. After partitioning the noisy images into subsets of images, sparse encoding is performed on each subset based on the OMP algorithm and global dictionary ${\boldsymbol D}$ to obtain the sparse coefficients $\alpha $. Finally, the denoised OCT image is reconstructed from the global dictionary ${\boldsymbol D}$, noise intensity $\sigma $, and sparse encoding coefficients $\alpha $, based on Eq. 3.

Fig. 1. Flowchart of denoising algorithm of OCT images via sparse representation based on noise estimation and global dictionary.

Download Full Size | PDF

2.2 Estimation of the noise intensity of OCT image

From the above analysis, the image fidelity $\lambda $ in the denoising input parameters is related to the noise intensity $\sigma $. Generally speaking, the default image noise intensity is a constant in the sparse representation denoising algorithm, and the corresponding noise intensity is set and adjusted manually to achieve a better image restoration effect. However, the noise intensity of different OCT images is unknown in practical applications, and the noise intensity error affects the performance of the algorithm. Therefore, it is necessary to estimate the noise intensity of OCT images quickly and effectively.

A noise estimation model based on Laplacian mask filtering is chosen in this study. In the model, a convolution kernel composed of two Laplacian Masks is used to perform a convolution operation on the test image to obtain the intensity $\sigma $ of the image. The two Laplacian masks are as follows [28]

(4)$${L_1} = {\begin{array}{|c|c|c|}\hline 0& 1 & 0\\ \hline 1& {\textrm{ - }4}& 1\\ \hline 0& 1 & 0\\\hline \end{array}}\;,\;\;{L_2} = {1 / 2}{\begin{array}{|c|c|c|}\hline 1 & 0 & 1 \\ \hline 0 & {\textrm{ - }4}& 0 \\ \hline 1 & 0 & 1\\ \hline \end{array}}.$$

The convolution kernel N composed of ${L_1}$ and ${L_2}$ is represented as follows

(5)$$N = 2({L_2} - {L_1}) = {\begin{array}{|c|c|c|}\hline 1 & {\textrm{ - }2} & 1 \\ \hline {\textrm{ - }2}& 4 & {\textrm{ - }2}\\ \hline 1& {\textrm{ - }2}& 1\\\hline \end{array}}.$$

Although the method can efficiently estimate the noise, the image edge details are regarded as noise in the highly textured image area, which results in a large estimation value [28]. Because image edge structures generally have strong differential characteristics, the noise estimation model based on the Laplacian mask filter is not sensitive to the image edge structures. In view of this, we divide the image into several area blocks and use the GLCM [29] to describe the texture of each area block. In this analysis method, the spatial information of the relative position of the pixels in the image is used to efficiently describe the image texture. The GLCM is a matrix that counts the gray-level relationship between pixels in the global or local area [30–33], and the eigenvalue entropy (E) of the matrix reflects the amount of information in the image. The larger the entropy value, the more texture the image has. The entropy is expressed as follows [33]

(6)$$E ={-} \sum\limits_{i = 0}^{L - 1} {\sum\limits_{j = 0}^{L - 1} {p(i,j,d,\theta )} } \lg p(i,j,d,\theta ),$$

where ${L_{}}$ is the number of gray levels, $p(i,j,d,\theta )$ is the value of the element in the $i\textrm{ - }th$ row and $j\textrm{ - }th$ column of the GLCM, which represents the probability of gray level i when d and

$\theta $ are determined with gray-level i as the starting point. The weak texture area block is obtained according to the entropy value E of each area block.

Speckle noise in an OCT image is related to signal strength and is categorized as multiplicative noise. The pixels of the noise image is the product of the image pixels and speckle noise, which is expressed as ${\boldsymbol Y} = {\boldsymbol X} \times n$ . Generally, multiplicative noise is converted into additive noise through logarithmic transformation and processed as $\ln {\boldsymbol Y} = \ln {\boldsymbol X} + \ln n$[34]. To estimate the noise intensity $\sigma $ in noisy OCT images, the image ${\boldsymbol Y}$ is divided into $W \times H$ image blocks; the weak texture image block corresponding to ${E_{\min }}$ is convoluted, and the noise intensity is estimated using the convolution kernel [28]

(7)$$\sigma = \sqrt {\frac{\pi }{2}} \frac{1}{{6(W - 2)(H - 2)}}\sum\limits_I {|I(i,j)} \ast N|,$$

where $I(i,j)$ is the gray value at the pixel point $(i,j)$, and ${\ast} $ represents the convolution.

2.3 Design global dictionary

As noted, the selection of a dictionary ${\boldsymbol D}$ is an important factor in using the denoising model in Eq. (3). The global dictionary affects the final denoising effect. The current mainstream sparse representation denoising algorithm is based on an adaptive dictionary, which is defined by the noisy image itself [23]. In many situations, this method yields very promising results. However, the high level of noise in the OCT images negatively interferes with the learning process, which degrades the quality of the trained dictionary and subsequently results in a suboptimal image restoration [19]. An alternative (ideal) approach is to learn the dictionary from the noiseless image. Because a noiseless image does not exist in practical applications, a low-noise image is selected. A high-quality sample image is obtained by repeated B-scan images which are processed by speckle noise suppression technology [35]. Multiple samples are imaged and processed to obtain a sample set. On the other hand, dictionary learning algorithms are particularly important. The dictionary learning algorithms include the Method of Optimal Directions (MOD) [36], the K-Singular Value Decomposition (K-SVD) algorithm [25], and so on (or see [37] and [38] for details). Of all the dictionary learning algorithms, the K-SVD algorithm, a generalization of the K-means clustering process, has been proven to be able to adapt to the training data more finely and capture their salient features more accurately [23]. The dictionary learning model using the K-SVD algorithm can be represented as follows [25]

(8)$$\mathop {\min }\limits_{D,\alpha } \{ \;\parallel {\boldsymbol Y} - {\boldsymbol D}\alpha \parallel _F^2\;\} ,st.\;{\forall _i},\parallel \;{\alpha _i}\;\parallel \; \le \;T,$$

where ${\boldsymbol Y}$ represents the training sample set, in which each column vector is a training sample. An initial dictionary ${\boldsymbol D}$ is generally a wavelet dictionary or a discrete cosine dictionary, whose size is a matrix of $n \times K$, where n and K are the dimensions of the sample and the total number of atoms contained in the dictionary, respectively, $\alpha $ is the encoding factor after sparse decomposition of the training sample ${\boldsymbol Y}$, T is a constraint on Eq. (7), which indicates that the number of dictionary atoms used in each sparse representation cannot exceed T. The K-SVD algorithm optimizes the objective function and updates the atoms of the dictionary ${\boldsymbol D}$ one by one to reduce the error. Assuming that the dictionary atom ${d_j}$ in the column j must be updated and the corresponding sparse matrix is $\alpha _j^T$, the error is expressed as

(9)$$\parallel \;{\boldsymbol Y} - {\boldsymbol D}\alpha \parallel _F^2 ={\parallel} \;{\boldsymbol Y} - \sum\limits_{j = 1}^K {{{\boldsymbol d}_j}\alpha _j^T} \parallel _2^2 ={\parallel} ({\boldsymbol Y} - \sum\limits_{j \ne k}^{} {{{\boldsymbol d}_j}\alpha _j^T} ) - {{\boldsymbol d}_k}\alpha _k^T\parallel _F^2\textrm{ = }\parallel {{\boldsymbol E}_k} - {{\boldsymbol d}_k}\alpha _k^T\parallel _F^2,$$

where matrix ${{\boldsymbol E}_k}$ represents the residual matrix resulting from the removal of the $k\textrm{ - }th$ atom from the dictionary. Singular value decomposition (SVD) is performed on ${{\boldsymbol E}_k}$ to obtain the maximum full rank column vector, and the $k\textrm{ - }th$ atom in the original dictionary is updated. It should be noted that the direct decomposition of ${{\boldsymbol E}_k}$ will reduces the sparsity of the resulting $\alpha _k^T$. To ensure the sparsity of $\alpha _k^T$ after SVD, only $\alpha $ corresponding to the non-zero position is reserved in $\alpha _k^T$ and only non-zero position items are reserved in ${{\boldsymbol E}_k}$ during iteration, and a set ${{\boldsymbol E}_{Rk}}$ is formed. In this way, SVD for ${{\boldsymbol E}_{Rk}}$ does not affect the sparsity of $\alpha _k^T$, and the convergence speed of the dictionary update is significantly improved. All the atoms in the dictionary are traversed to update the dictionary after one iteration. To effectively suppress noise, the required dictionary ${\boldsymbol D}$ must be obtained after multiple iterations.

3. Simulation and experiment

In the simulation, the influence of the noise intensity on the denoising performance of the existing sparse representation denoising algorithm is investigated, and the effectiveness of the proposed denoising algorithm is proved. Furthermore, the advantages of the proposed denoising algorithm are demonstrated through comparison with other denoising methods in the experimental section. For the analysis of the results, to avoid the shortcomings of subjective visual evaluation, four objective evaluation functions, mean square error (MSE) [39], peak signal to noise ratio (PSNR) [40], structural similarity (SSIM) [41] and edge preservation index (EPI) [42] are used. The MSE is used to evaluate the mean square value of the pixel difference between the denoised and the original images. The smaller the MSE value, the higher the image quality is. As a commonly used full-reference image quality evaluation index, the PSNR reflects the relationship between the maximum signal and noise. A greater PSNR value shows a better effect of denoising, which indicates a stronger denoising ability of the algorithm. The SSIM is mainly used to evaluate the brightness, contrast, and structural similarity between two images, and its general value range is 0–1. The closer the SSIM value is to 1, the more similar the two images are, which indicates the better image reconstruction quality. The EPI is further used to examine the sharpness of edges in the denoised images and its value range is 0–1. Its value was directly proportional to the edge retention ability.

3.1 Simulation analysis

According to the method described in Subsection 2.3, the multiple repeated B-scan images of samples are obtained by the OCT system and then processed by speckle noise suppression technology to obtain a high-quality sample image. These “clean” OCT images are considered to be low-noise or near noise-free, as shown in Fig. 2(a). To obtain images with different noise intensities, speckle noise of varying intensities (5.1, 10.2, and 15.3) are added to the “clean” image, as shown in Fig. 2(b)–2(d), respectively. Three noisy images are separately denoised.

Fig. 2. OCT images with different noise intensities. (a) a “clean” OCT image; (b) OCT image with noise intensity of 5.1, PSNR of 25.6123; (c) OCT image with noise intensity of 10.2, PSNR of 22.1026; (d) OCT image with noise intensity of 15.3, PSNR of 19.2036.

Download Full Size | PDF

First, dictionary learning is performed for the noise-free OCT images based on the K-SVD algorithm to obtain the global dictionary ${\boldsymbol D}$, which represents the feature structure of the image. In the sparse representation denoising algorithm, the input parameter of noise intensity is the default priority parameter. The average value of noise intensity 10.2 is used here as a priori parameter. Then, with fixed input parameters, the OCT images with different noise intensities are sparsely decomposed by the global dictionary ${\boldsymbol D}$ and OMP algorithm. Then, the sparse encoding coefficient $\alpha $ of each subset is obtained. The image is reconstructed using Eq. (3). The denoised images are shown in Fig. 3(a)–3(c).

Fig. 3. Denoised OCT images. (a)–(c) the OCT image is processed by the existing sparse representation denoising algorithm with fixed parameters; (d)–(f) the OCT image is processed by the proposed denoising algorithm; (g)–(l) the OCT image is processed by the SRAD algorithm.

Download Full Size | PDF

The three noisy images in Fig. 2(b)–2(d) are denoised using the proposed algorithm. The noise intensity of the weak texture blocks in the image is estimated via the Laplacian filter-based noise variance estimation method, and the noise intensity values $\sigma $ of the three images are obtained. $\sigma $ is used as a priori parameter to denoise the OCT image sparsely. Three denoised images are shown in Fig. 3(d)–3(f).

For further quantitative evaluation of the proposed algorithm, the existing popular denoising algorithm of sparse representation over an adaptive dictionary [23] (hereafter referred to as SRAD) is compared with the proposed algorithm. Here, the average noise intensity value of 10.2 is also used as the input parameter. First, the dictionary is learned from each noisy image, and then the sparse coefficients are solved and the images are reconstructed. The denoised images are shown in Fig. 3(g)–3(l), respectively.

For the sparse representation denoising algorithm with fixed parameters, the denoising results of the OCT images are different in the case of the different noise intensity $\sigma $. When the noise is small, the retinal edge and internal structure lines in the red line frame in Fig. 3(a) are blurred after denoising. When the noise is large, as shown in the retinal structure inside the yellow wireframe and the imaging background in Fig. 3(c), there is noise residue after denoising. Referring to the four evaluation function indicators in Table 1, the final denoising result is not optimal when sparse denoising is performed under fixed parameters. In contrast, the proposed algorithm is sparse denoising when obtaining the noise intensity $\sigma $ of each image. Regardless of the noise intensity, the resulting image is basically free of noise residue and retains the complete structural details. Compared with the proposed algorithm, the image processed by the SRAD algorithm is not ideal when the noise intensity is large. In Fig. 3(l), although the noise of the image is suppressed to some extent, there is still noise at the background and image edge structure in the blue line frame.

Table 1. Comparison of three denoising algorithms based on image evaluation index.

View Table | View all tables in this article

3.2 Experiment

To verify the ability of the proposed denoising algorithm, the retinas of the volunteers are imaged using a frequency domain OCT system. The light source of the system is a superluminescent diode with a power of 20 mW, center wavelength of 840 nm, and bandwidth of 50 nm. The theoretical axial resolution of the system is 6.23 µm. The size of each OCT image obtained is $1024 \times 1024\;\textrm{pixels}$. All the processing of image denoising in the experiment is achieved using a 64-bit Windows 10 system and MATLAB R2018b software.

The retinas of several volunteers are repeatedly imaged to collect multiple images. High-quality sample images are obtained through the speckle noise suppression technology. Herein, as long as the sample set contains enough feature structures suitable for denoised images, the required global dictionary can be learned from the sample set. Subsequently, the dictionary learning of the sample image is learned by the K-SVD algorithm to obtain the global dictionary ${\boldsymbol D}$, as shown in Fig. 4(a). The relevant parameters are set as follows. The input noise image is pixel sampled using an $8 \times 8$ image block, the size of the dictionary is $64 \times 256$. Each $8 \times 8$ image block in the dictionary represents an atom, the total number of dictionary atoms is 256. The number of iterations during dictionary learning is 20. A noisy retinal image is randomly selected for processing, and the noise intensity $\sigma $ of the image is obtained using the noise variance estimation method of the Laplacian filter. Based on the global dictionary ${\boldsymbol D}$, the image is sparsely decomposed and each subset of the sparse encoding coefficient $\alpha $ is obtained. Then the image is reconstructed according to the noise intensity $\sigma $ and sparse encoding coefficient $\alpha $ to obtain the denoised OCT image, as shown in Fig. 4(c). Compared with the image shown in Fig. 4(b), the denoising effect is significant, and the retinal details are well preserved.

Fig. 4. Noise reduction for OCT image based on the proposed algorithm. (a) global dictionary; (b) original retinal image; (c) denoised retinal image.

Download Full Size | PDF

To compare the influence of the number of dictionary atoms and the image block size on the final denoising effect of the proposed algorithm, Fig. 4(b) is denoised with different dictionary atoms and image blocks of different sizes respectively. The results are shown in Table 2. As can be seen, the increase of the dictionary atoms generally improves the results, although this improvement is small (0–0.1128 dB). The PNSR value indicates that the size of the image block significantly affects the results (0–6.0982 dB). As the size of the image block increases, it is beneficial to the final image denoising effect.

Table 2. Effect of changing the number of dictionary atoms and image block size on the PSNR of final denoising results.

View Table | View all tables in this article

Next, the results of the proposed algorithm are compared with the SRAD algorithm. The denoising ability and execution efficiency of both algorithms are measured using four evaluation function indexes and running time. As shown in Fig. 5, four retinal images are selected, which are named OCT-1–OCT-4. Both algorithms are used for denoising, and the results are quantitatively evaluated and compared.

Fig. 5. Noisy retinal OCT images. OCT-1 with PSNR of 19.2036, OCT-2 with PSNR of 19.1005, OCT-3 with PSNR of 18.1013, OCT-4 with PSNR of 19.5072.

Download Full Size | PDF

First, the noisy OCT images are processed using the denoising algorithm of SRAD. It should be noted that the algorithm iteratively updates the dictionary according to the structural feature attributes of the input image itself. Thus, it is necessary to re-learn the dictionary for each input noise image. The same K-SVD algorithm is used here, and the input image is sampled with an $8 \times 8$ image block. The dictionary size is $64 \times 256$, and the number of iterations is 20. Here, the noise intensity value is set at the mean noise intensity of 11 for the OCT image. For each of the four images, the sparse coefficients are solved and the images are reconstructed. The denoised images are shown in Fig. 6(a)–6(d). Then, the noisy OCT images are processed using the proposed algorithm. The noise intensity values of the four input images are estimated at 11.2431, 11.5761, 12.4350, and 10.2679, respectively. To obtain the denoised OCT images, the input image is sparsely decomposed and reconstructed using the global dictionary, as shown in Fig. 6(e)–6(h).

Fig. 6. Denoised OCT images. (a)–(d) the OCT image is processed by the SRAD; (e)–(h) the OCT image is processed by the proposed denoising algorithm.

Download Full Size | PDF

The quality evaluation indexes are obtained, as shown in Fig. 6(a)–6(h), using the two denoising algorithms, as listed in Table 3. The noise intensity value of the OCT-1 image is approximately consistent with that of OCT-2. In this case, it can be observed from the four evaluation indexes in Table 3 that the performances of both denoising algorithms, in terms of the reconstruction quality of the denoised image and the capability of maintaining the image edge, are similar. However, the running time of the proposed denoising algorithm is significantly shorter than that of SRAD. In addition, the noise intensities of OCT-3 and OCT-4 are relatively larger and smaller, respectively. When the difference between the noise intensity value of the image and the noise mean is large, the comparison of the MSE and PSNR values of SRAD and the proposed algorithm indicated that the noise removal ability of the former is slightly worse. Thus, the importance of parameters $\lambda $ for accurate image reconstruction is underscored. In addition, the comparison of the SSIM and EPI values indicates that the proposed algorithm is superior to SRAD in terms of image reconstruction quality and edge preservation. Experiments show that the proposed algorithm exhibits excellent advantages for images with significant differences in the mean noise values obtained by the OCT system. This also indicates that, in the sparse representation denoising algorithm, in addition to the design of a complete dictionary, both the solution of the sparse coefficient and the image reconstruction are important links to the final image quality. Noise intensity is one of the factors that affect the algorithm. Thus, for images with different noise levels, the sparse representation denoising algorithm under fixed parameters cannot reconstruct high-quality images.

Table 3. Comparison between the two denoising algorithms based on image evaluation index.

View Table | View all tables in this article

It is noteworthy that applying K-SVD to the entire image is computationally expensive. Because the global dictionary containing sample structure information is obtained by one dictionary learning, it is unnecessary to learn the dictionary for each image in the denoising process, which improves the efficiency of the algorithm significantly. Compared with the running time of each image in Table 3, the execution efficiency of the proposed algorithm is significantly better than that of SRAD.

The proposed algorithm is also compared with other denoising algorithms. A noisy OCT image with a PSNR of 18.1672 is randomly selected from the experiment images. The image is denoised using two-dimensional adaptive Wiener filtering [43], wavelet soft threshold [44], TV (Total variation) algorithm [45], and the proposed algorithm. The denoised images obtained by the different methods are shown in Fig. 7. The parameters of all the four denoising algorithms are set at the optimal values. The SSIM and EPI values in Table 4 show that the retinal boundary is blurred and some details are lost after two-dimensional adaptive Wiener filtering. Similarly, the wavelet soft threshold also smoothened the retinal edge after noise removal, which results in the loss of some effective information and reduces the retention of the retinal structure information. Thus, the SSIM and EPI values, as shown in Fig. 7(b) are not ideal. Although the processing results of the TV algorithm effectively preserve the edge details, it is unable to suppress the speckle noise in the image, so the improvement in the PSNR is not significant. In contrast, the quality of the image reconstructed using the proposed denoising algorithm is superior, as shown in Fig. 7(d). The details of the same position in the four images are selected for local amplification and comparison, as shown in the white wireframe in the figure. The proposed denoising algorithm not only improves the PSNR level but also preserves the retinal layer structure and edge structure in the original image noise. The MSE and PSNR values in Table 4 indicate that the proposed denoising algorithm is superior to other algorithms in terms of denoising capability. The SSIM and EPI values indicate that the proposed denoising algorithm has excellent structure preservation and edge preservation capabilities, and the denoised image is more similar to the original image.

Fig. 7. Denoised OCT images. OCT image is processed by (a) two-dimensional adaptive Wiener filtering; (b) wavelet soft threshold denoising algorithm; (c) TV denoising algorithm; (d) proposed denoising algorithm.

Download Full Size | PDF

Table 4. Comparison of the four denoising algorithms based on image evaluation index.

View Table | View all tables in this article

Furthermore, to verify the robustness of the proposed algorithm. 50 noisy OCT images are selected for statistics, and the images processed by the algorithm achieve a good denoising effect. The average PSNR of the images before and after the application of the algorithm is 19.2163 and 26.1302 respectively, and the PSNR is improved by 6.9139.

4. Conclusion

In this paper, a denoising algorithm via sparse representation of OCT images based on noise estimation and global dictionary is proposed. For OCT images with unknown noise intensity, the weak texture area of the image is processed by obtaining the noise intensity of the image using the Laplacian filter noise variance estimation method. Then, the image is sparsely decomposed and reconstructed based on the global dictionary and noise intensity. In the experiment, we quantitatively compare the performance of the proposed algorithm with that of a popular modern adaptive denoising algorithm. When the difference in the noise intensity in the OCT images is significant, the proposed denoising algorithm achieved a better denoising effect for low-noise images and high-noise images. Furthermore, since the proposed denoising algorithm decomposes and rebuilds the measured image based on the global dictionary, the program execution time is significantly shortened, which is very important in clinical OCT imaging.

Funding

National Natural Science Foundation of China (61971406, 81927801); Science and Technology Commission of Shanghai Municipality (19441910000); Youth Innovation Promotion Association of the Chinese Academy of Sciences CAS.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. Jerwick, Y. Huang, Z. Dong, A. Slaudades, A. J. Brucker, and C. Zhou, “Wide-field ophthalmic space-division multiplexing optical coherence tomography,” Photonics Res. 8(4), 539–547 (2020). [CrossRef]

2. Z. Ibarra-Borja, C. Sevilla-Gutiérrez, R. Ramírez-Alarcón, H. Cruz-Ramírez, and A. B. U’Ren, “Experimental demonstration of full-field quantum optical coherence tomography,” Photonics Res. 8(1), 51–56 (2020). [CrossRef]

3. M. Wan, S. Liang, X. Li, Z. Duan, J. Zou, J. Chen, and J. Zhang, “Dual-beam delay-encoded all fiber Doppler optical coherence tomography for in vivo measurement of retinal blood flow,” Chin. Opt. Lett. 20(1), 011701 (2022). [CrossRef]

4. C. Chen, W. Shi, Z. Qiu, V. X. Yang, and W. Gao, “B-scan-sectioned dynamic micro-optical coherence tomography for bulk-motion suppression,” Chin. Opt. Lett. 20(2), 021102 (2022). [CrossRef]

5. M. Bashkansky and J. Reintjes, “Statistics and reduction of speckle in optical coherence tomography,” Opt. Lett. 25(8), 545–547 (2000). [CrossRef]

6. A. Baghaie, Z. Yu, and R. M. D’Souza, “State-of-the-art in retinal optical coherence tomography image analysis,” Quant. Imag. Med. Surg. 5(4), 603 (2015). [CrossRef]

7. D. Cui, E. Bo, Y. Luo, X. Liu, X. Wang, S. Chen, and L. Liu, “Multifiber angular compounding optical coherence tomography for speckle reduction,” Opt. Lett. 42(1), 125–128 (2017). [CrossRef]

8. B. F. Kennedy, T. R. Hillman, A. Curatolo, and D. D. Sampson, “Speckle reduction in optical coherence tomography by strain compounding,” Opt. Lett. 35(14), 2445–2447 (2010). [CrossRef]

9. M. Pircher, E. Götzinger, R. A. Leitgeb, A. F. Fercher, and C. K. Hitzenberger, “Speckle reduction in optical coherence tomography by frequency compounding,” J. Biomed. Opt. 8(3), 565–569 (2003). [CrossRef]

10. Y. Zhao, K. K. Chu, W. J. Eldridge, E. T. Jelly, M. Crose, and A. Wax, “Real-time speckle reduction in optical coherence tomography using the dual window method,” Biomed. Opt. Express 9(2), 616–622 (2018). [CrossRef]

11. J. M. Schmitt, S. H. Xiang, and K. M. Yung, “Speckle in optical coherence tomography,” J. Biomed. Opt. 4(1), 95–105 (1999). [CrossRef]

12. Y. Huang, X. Liu, and J. U. Kang, “Real-time 3D and 4D Fourier domain Doppler optical coherence tomography based on dual graphics processing units,” Biomed. Opt. Express 3(9), 2162–2174 (2012). [CrossRef]

13. Z. Jian, Z. Yu, L. Yu, B. Rao, Z. Chen, and B. J. Tromberg, “Speckle attenuation in optical coherence tomography by curvelet shrinkage,” Opt. Lett. 34(10), 1516–1518 (2009). [CrossRef]

14. F. Zaki, Y. Wang, H. Su, X. Yuan, and X. Liu, “Noise adaptive wavelet thresholding for speckle noise removal in optical coherence tomography,” Biomed. Opt. Express 8(5), 2720–2731 (2017). [CrossRef]

15. T. Wu, Y. Shi, Y. Liu, and C. He, “Speckle reduction in optical coherence tomography by adaptive total variation method,” J. Mod. Opt. 62(21), 1849–1855 (2015). [CrossRef]

16. Q. Chen, L. de Sisternes, T. Leng, and D. L. Rubin, “Application of improved homogeneity similarity-based denoising in optical coherence tomography retinal images,” J Digit Imaging 28(3), 346–361 (2015). [CrossRef]

17. A. Abbasi, A. Monadjemi, L. Fang, and H. Rabbani, “Optical coherence tomography retinal image reconstruction via nonlocal weighted sparse representation,” J. Biomed. Opt. 23(03), 1 (2018). [CrossRef]

18. S. Li, L. Fang, and H. Yin, “An efficient dictionary learning algorithm and its application to 3-D medical image denoising,” IEEE Trans. Biomed. Eng. 59(2), 417–427 (2011). [CrossRef]

19. L. Fang, S. Li, Q. Nie, J. A. Izatt, C. A. Toth, and S. Farsiu, “Sparsity based denoising of spectral domain optical coherence tomography images,” Biomed. Opt. Express 3(5), 927–942 (2012). [CrossRef]

20. S. Li, H. Yin, and L. Fang, “Group-sparse representation with dictionary learning for medical image denoising and fusion,” IEEE Trans. Biomed. Eng. 59(12), 3450–3459 (2012). [CrossRef]

21. Y. Zhang, G. Wu, P. T. Yap, Q. Feng, J. Lian, W. Chen, and D. Shen, “Hierarchical patch-based sparse representation—A new approach for resolution enhancement of 4D-CT lung data,” IEEE Trans. Med. Imaging 31(11), 1993–2005 (2012). [CrossRef]

22. S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE T. Signal Proces. 41(12), 3397–3415 (1993). [CrossRef]

23. M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE T. Image Process. 15(12), 3736–3745 (2006). [CrossRef]

24. A. M. Bruckstein, D. L. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Rev. 51(1), 34–81 (2009). [CrossRef]

25. M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process. 54(11), 4311–4322 (2006). [CrossRef]

26. J. Liang, M. Zhang, X. Zeng, and G. Yu, “Distributed dictionary learning for sparse representation in sensor networks,” IEEE Trans. on Image Process. 23(6), 2528–2541 (2014). [CrossRef]

27. Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition,” In Proceedings of 27th Asilomar conference on signals, systems and computers (1993), pp. 40–44.

28. J. Immerkaer, “Fast noise variance estimation,” Comput. Vis. Image Und. 64(2), 300–302 (1996). [CrossRef]

29. R. M. Haralick, K. Shanmugam, and I. H. Dinstein, “Textural features for image classification,” IEEE Trans. Syst., Man, Cybern. SMC-3(6), 610–621 (1973). [CrossRef]

30. E. S. Gadelmawla, “A vision system for surface roughness characterization using the gray level co-occurrence matrix,” NDT&E Int. 37(7), 577–588 (2004). [CrossRef]

31. M. Partio, B. Cramariuc, M. Gabbouj, and A. Visa, “Rock texture retrieval using gray level co-occurrence matrix,” In Proc. of 5th Nordic Signal Processing Symposium (2002), Vol. 75.

32. B. Pathak and D. Barooah, “Texture analysis based on the gray-level co-occurrence matrix considering possible orientations,” Int. J. Adv. Res. Electr. Electron. Instrum. Eng. 2(9), 4206–4212 (2013). [CrossRef]

33. A. Baraldi and F. Panniggiani, “An investigation of the textural characteristics associated with gray level cooccurrence matrix statistical parameters,” IEEE Trans. Geosci. Remote Sensing 33(2), 293–304 (1995). [CrossRef]

34. M. Li, R. Idoughi, B. Choudhury, and W. Heidrich, “Statistical model for OCT image denoising,” Biomed. Opt. Express 8(9), 3903–3917 (2017). [CrossRef]

35. H. Zhang, Z. Li, X. Wang, and X. Zhang, “Speckle reduction in optical coherence tomography by two-step image registration,” J. Biomed. Opt. 20(3), 036013 (2015). [CrossRef]

36. K. Engan, S. O. Aase, and J. H. Husøy, “Multi-frame compression: Theory and design,” Signal Process. 80(10), 2121–2140 (2000). [CrossRef]

37. I. Tošić and P. Frossard, “Dictionary learning,” IEEE Signal Process. Mag. 28(2), 27–38 (2011). [CrossRef]

38. R. Rubinstein, A. M. Bruckstein, and M. Elad, “Dictionaries for sparse representation modeling,” Proc. IEEE 98(6), 1045–1057 (2010). [CrossRef]

39. Q. Huynh-Thu and M. Ghanbari, “Scope of validity of PSNR in image/video quality assessment,” Electron. Lett. 44(13), 800–801 (2008). [CrossRef]

40. M. D. Robinson, C. A. Toth, J. Y. Lo, and S. Farsiu, “Efficient fourier-wavelet super-resolution,” IEEE Trans. on Image Process. 19(10), 2669–2681 (2010). [CrossRef]

41. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. on Image Process. 13(4), 600–612 (2004). [CrossRef]

42. S. Adabi, H. Mohebbikarkhoran, D. Mehregan, S. Conforto, and M. Nasiriavanaki, “An intelligent despeckling method for swept source optical coherence tomography images of skin,” In Medical Imaging 2017: Biomedical Applications in Molecular, Structural, and Functional Imaging, International Society for Optics and Photonics (2017), Vol. 10137, p. 101372B.

43. A. A. Shah, M. M. Malik, M. U. Akram, and S. A. Bazaz, “Comparison of noise removal algorithms on Optical Coherence Tomography (OCT) image,” In 2016 IEEE International Conference on Imaging Systems and Techniques (IST) (2016), pp. 166–170.

44. R. M. Zhao and H. M. Cui, “Improved threshold denoising method based on wavelet transform,” In 2015 7th International Conference on Modelling, Identification and Control (ICMIC) (2015), pp. 1–4.

45. G. Gong, H. Zhang, and M. Yao, “Speckle noise reduction algorithm with total variation regularization in optical coherence tomography,” Opt. Express 23(19), 24699–24712 (2015). [CrossRef]

Noise Level	Method	MSE	PSNR	SSIM	EPI
$σ = 5 .1$	fixed parameters	34.4790	40.0183	0.5421	0.6034
	Proposed	25.5423	46.0186	0.6872	0.8961
	SRAD	25.5617	46.0034	0.6705	0.8953
$σ = 10 .2$	fixed parameters	41.5375	36.2933	0.6420	0.6873
	Proposed	40.7872	36.6579	0.6563	0.7029
	SRAD	40.7810	36.6609	0.6570	0.7034
$σ = 15 .3$	fixed parameters	58.4881	29.4488	0.4227	0.5878
	Proposed	49.6698	32.7173	0.5301	0.6541
	SRAD	51.8753	31.8474	0.5106	0.6315

Image block	Dictionary atoms
Image block	64	128	256	512
$2 \times 2$	20.7245	20.7316	20.7386	20.7410
$4 \times 4$	25.8652	25.8697	25.9123	25.9124
$8 \times 8$	26.7921	26.7943	26.9043	26.9049
$16 \times 16$	26.8214	26.8298	26.8350	26.8348

	Algorithm	MSE	PSNR	SSIM	EPI	Time/s
OCT-1	SRAD	59.4492	29.1228	0.8189	0.6759	204.44
OCT-1	Proposed	60.1492	28.8887	0.8089	0.6699	33.07
OCT-2	SRAD	74.1642	24.6996	0.6711	0.5060	190.41
OCT-2	Proposed	74.6318	24.5739	0.6640	0.5009	31.32
OCT-3	SRAD	69.1928	26.0873	0.8345	0.6518	279.81
OCT-3	Proposed	60.8659	28.6518	0.8925	0.6860	30.21
OCT-4	SRAD	71.2760	25.4941	0.6233	0.5028	210.98
OCT-4	Proposed	69.1138	26.1102	0.6678	0.5543	20.77

Algorithm	MSE	PSNR	SSIM	EPI
two-dimensional adaptive Wiener filtering	85.0056	21.9709	0.5043	0.3649
wavelet soft threshold denoising	102.3689	18.2536	0.4876	0.3013
TV denoising	87.0485	21.4960	0.8011	0.5008
the proposed algorithm	66.4235	26.9043	0.8531	0.6740

Noise Level	Method	MSE	PSNR	SSIM	EPI
$σ = 5 .1$	fixed parameters	34.4790	40.0183	0.5421	0.6034
	Proposed	25.5423	46.0186	0.6872	0.8961
	SRAD	25.5617	46.0034	0.6705	0.8953
$σ = 10 .2$	fixed parameters	41.5375	36.2933	0.6420	0.6873
	Proposed	40.7872	36.6579	0.6563	0.7029
	SRAD	40.7810	36.6609	0.6570	0.7034
$σ = 15 .3$	fixed parameters	58.4881	29.4488	0.4227	0.5878
	Proposed	49.6698	32.7173	0.5301	0.6541
	SRAD	51.8753	31.8474	0.5106	0.6315

Denoising algorithm of OCT images via sparse representation based on noise estimation and global dictionary

Abstract

1. Introduction

2. Methodology

2.1 Overview of OCT Image denoising algorithm

2.2 Estimation of the noise intensity of OCT image

2.3 Design global dictionary

3. Simulation and experiment

3.1 Simulation analysis

3.2 Experiment

4. Conclusion

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (7)

Tables (4)

Equations (9)

Optics Express