
Membranous nephropathy classification using microscopic hyperspectral imaging and tensor patch-based discriminative linear regression

Open Access

Abstract

Optical kidney biopsy, serological examination, and clinical symptoms are the main methods for membranous nephropathy (MN) diagnosis. However, false positives and undetectable biochemical components in the results of optical inspections lead to unsatisfactory diagnostic sensitivity and pose obstacles to pathogenic mechanism analysis. In order to reveal detailed component information of the immune complexes of MN, microscopic hyperspectral imaging technology is employed to establish a hyperspectral database of 68 patients with two types of MN. Based on the characteristics of medical HSI, a novel framework of tensor patch-based discriminative linear regression (TDLR) is proposed for MN classification. Experimental results show that the classification accuracy of the proposed model for MN identification is 98.77%. The combination of tensor-based classifiers and hyperspectral data analysis provides new ideas for the research of kidney pathology, which has potential clinical value for the automatic diagnosis of MN.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

With an incidence rate of more than 10%, chronic kidney disease (CKD) has become a global public health problem; it is the eighth leading cause of death in women and affects approximately 195 million women worldwide [1]. If CKD is not treated in time, it may develop into end-stage kidney disease, which requires dialysis or kidney transplantation to maintain life. Finding the cause of the disease and treating it early can usually prevent the deterioration of CKD, thereby improving the patient’s quality of life. Among the pathological types of CKD, membranous nephropathy (MN) [2] is one of the most common causes of adult nephrotic syndrome. According to etiology, MN can be divided into primary MN (PMN) and secondary MN (SMN). The etiology of PMN has not been clarified, whereas SMN is often secondary to tumors, lupus erythematosus, hepatitis B virus, autoimmune diseases, drug and poison exposure, etc. [3]. Research on the differential diagnosis and treatment of MN has always been a hotspot in the field of nephropathy.

PMN and SMN differ markedly in treatment options. For SMN, the underlying cause must be treated first, for example with chemotherapy, antiviral therapy, or anti-infection treatment. If SMN is misjudged as PMN, treatment of the underlying disease may be delayed. If PMN is misjudged as SMN, the chemotherapy and antiviral drugs used may aggravate the risk of kidney damage. Therefore, timely diagnosis of PMN and a reasonable choice of treatment are of great significance to the prognosis of patients. In clinical work, the diagnosis of PMN mainly relies on histopathological examination of an optical renal biopsy, combined with clinical manifestations and medical history, to exclude possible secondary factors [4]. However, because of the possibly hidden onset, atypical clinical manifestations and symptoms, and the probability of false positives in optical inspection results, accurately distinguishing PMN from SMN remains difficult. The characteristic pathological change of MN is a large amount of immune complex deposition on the epithelial side of the glomerular capillaries. PMN is a single-organ autoimmune disease whose deposited immune complex components are related to the initiation of IgG4 in the pathogenesis, whereas the components deposited in SMN are closely related to the source of infection. The different pathogenesis of PMN and SMN makes it possible to use the difference in immune complex components to identify PMN. Therefore, a novel alternative method is much needed both in initial diagnosis and in follow-up.

Hyperspectral imaging technology is an emerging diagnostic method that combines the advantages of two traditional optical techniques, spectral analysis and optical imaging [5,6]. Different from optical microscopy images that only contain sample morphology information, hyperspectral images (HSI) provide detailed and rich spatial-spectral information [7–9]. With the rapid development of hyperspectral cameras and artificial intelligence, hyperspectral imaging systems have become a promising intelligent medical auxiliary diagnostic tool. In medical applications, HSI has been used for histopathological tissue analysis [10–13], cancer detection [14–16], burn depth assessment [17], tumor detection [18,19] and blood oxygen saturation of the retina [20–22].

Since more than 257 million people worldwide suffer from chronic hepatitis B virus (HBV) infection, and HBV-related MN (HBV-MN) is the main extrahepatic manifestation of HBV infection [23], we focus on the classification task of PMN versus HBV-MN. In this study of 68 patients, hyperspectral microscopy is performed on 105 kidney tissue specimens from 35 HBV-MN patients and 99 kidney tissue specimens from 33 PMN patients. In addition to using hyperspectral imaging systems to acquire data, it is also important to design algorithms for the unique characteristics of medical hyperspectral image data. Least squares regression (LSR)-based classifiers have been widely used in the pattern recognition field due to their efficient data analysis capabilities, compact form and efficient solutions [24]. As one of the simplest regression methods, linear regression (LR) learns a discriminative data representation by linking source data to the target output. Due to its good performance and computational efficiency, LR has been applied to various classification tasks. However, the binary (i.e., zero-one label) target matrix in standard LR is too rigid for classification. To solve this problem, many strategies have been developed to relax the regression target. Discriminative least squares regression (DLSR) [25] relaxed the regression target by integrating $\varepsilon$-dragging into the LSR framework, thereby expanding the distance between different categories. A retargeted least squares regression (ReLSR) [26] approach was proposed to directly learn the soft target matrix from data by enforcing margin constraints. Although these LSR-based methods have achieved good performance, target overfitting is a problem that cannot be ignored. From this point of view, DLSR [25] forces regression targets of different classes to move in opposite directions to enlarge inter-class distances, which aggravates the degree of overfitting. To improve inter-class separability, inter-class sparsity based discriminative least squares regression (ICS_DLSR) [27] was proposed by introducing a row-sparsity constraint that maintains the sparsity structure of the transformed features in each class.

In order to avoid overfitting and preserve the underlying structure of the data, manifold learning is introduced to improve the multiclass classification performance of LSR-based classifiers. A regularized label relaxation linear regression (RLR) was developed [28], in which a class compactness graph was constructed to preserve the intrinsic structure of the transformed samples and a nonnegative relaxation matrix was introduced to ensure the freedom of the label matrix. By considering probabilistic connection knowledge, the marginally structured representation learning (MSRL) method was proposed [29], in which an adaptive probabilistic graph was designed to discover the underlying feature correlations. Recently, discriminative marginalized least squares regression (DMLSR) was proposed for multiclass hyperspectral image classification [30], in which the main energy of the hyperspectral data is preserved in the projections. DMLSR considers both data reconstruction ability and class separability by introducing a data-reconstruction constraint and an intra-class compactness graph, thereby improving the classification performance.

All the LSR-based classifiers mentioned above have achieved satisfactory performance in pattern recognition. However, they ignore spatial information when applied to medical hyperspectral data. In fact, a medical hyperspectral image (MHSI) is a three-dimensional image cube composed of two-dimensional spatial information and one-dimensional spectral signals. Spatial information has been proven to contribute significantly to improving the accuracy of hyperspectral image classification [9,31]. In this paper, tensor patch-based discriminative linear regression (TDLR) is developed by considering the cubic nature of the hyperspectral image. Different from existing LSR-based methods, TDLR makes full use of the spatial and spectral information of MHSI by introducing a regional covariance matrix-based descriptor for tensor patch-based intra-class compactness graph construction. In addition, an inter-class sparsity constraint is utilized in TDLR to enhance class separability. TDLR aims to enlarge the distance between different classes while preserving the spatial-spectral structure of intra-class samples to improve MHSI classification performance. The results of this work will help guide future HSI research and determine the specific benefits that HSI may provide for MN intelligent diagnosis.

2. Material and methods

The experimental framework mainly consists of three parts: pathology hyperspectral image acquisition, hyperspectral data pre-processing, and a tensor patch-based discriminative linear regression (TDLR) classifier for MN classification. First, renal biopsy ex-vivo tissue slices are imaged with the microscopic hyperspectral imaging system developed in our laboratory. The imaging process follows the micro-hyperspectral data collection standards jointly developed by the laboratory and nephrologists. Second, to remove system noise and facilitate subsequent data processing, preprocessing steps such as mean filtering and normalization are applied to the HSI data. Finally, considering the cubic nature of the hyperspectral image, TDLR is developed to make full use of the spatial and spectral information of MHSI by introducing a regional covariance matrix-based descriptor.

2.1 Microscopic hyperspectral imaging system

To capture the spectral information of MN pathological tissues, we have established a microscopic hyperspectral imaging system. Capturing the component information of the immune complexes of MN places very high requirements on the spectral resolution and the number of spectral bands of the system. Therefore, the system combines a built-in line-scanning hyperspectral imaging system (SOC-710) with a biological microscope (CX31RTSF). The diagram of the microscopic hyperspectral imaging system is shown in Fig. 1(a). The designed system covers 400-1000 nm with 4.69 nm resolution, which meets the requirement of a wide spectral range. The system provides 128 spectral bands with a spatial size of 696$\times$520 pixels. The microscope objective of this system has a magnification of 40$\times$ and a numerical aperture of 0.65. The image scanning time is less than 25 s, which ensures simple operation and high efficiency. The above performance indicators all meet the research requirements on immune complex components in this paper. Figure 1(b) shows a schematic diagram of the kidney tissue obtained by the system.

Fig. 1. (a) is the microscopic hyperspectral imaging system and (b) is the schematic diagram of the kidney tissue.

2.2 Experimental hyperspectral image dataset

The experimental validation data comprise two types of MN, PMN and HBV-MN, which are difficult to distinguish with optical microscopy. A total of 204 microscopic hyperspectral images of renal biopsy tissue slices were captured from 68 different patients in the Nephrology Department of the China-Japan Friendship Hospital. For each patient, the pathological diagnosis is determined by a team of experienced clinicians and pathologists through renal biopsy. The renal pathological slices were stained with periodic acid silver methenamine (PASM+M) for accurately displaying and locating the various immune complexes associated with glomerular diseases. Based on the image acquisition principle that each image contains a complete glomerulus, three images are collected from each patient. In our research, we use the hyperspectral data corresponding to the immune complexes to distinguish between HBV-MN and PMN. The regions of immune complexes on whole-slide digital images of PASM+M stained tissue slides are marked by experienced nephrologists specializing in MN. The obtained HSI data consist of 105 HBV-MN images and 99 PMN images, from 35 HBV-MN patients and 33 PMN patients. The normalized spectral curves of the two types of MN are illustrated in Fig. 2.

Fig. 2. The normalized spectral curves of HBV-MN and PMN.

2.3 Methods

2.3.1 Hyperspectral data preprocessing

In order to reduce the system noise of MHSI generated during the acquisition process, a mean filtering algorithm is employed. In mean filtering, each pixel is replaced with the average vector of the pixels contained in the $T\times T$ window centered on it. Mathematically, mean filtering can be written as ${\tilde{\textbf{x}}_i} = \frac{1}{T^2}\sum\nolimits_{{\textbf{x}_j} \in \Omega \left( {{\textbf{x}_i}} \right)} {{\textbf{x}_j}},\; i = 1, \ldots ,n,$ where ${\tilde{\textbf{x}}_i}$ denotes the filtered value of the center pixel of the window, $\Omega \left ( {{\textbf{x}_i}} \right )$ denotes the local spatial neighborhood centered at ${\textbf{x}_i}$, and ${T^2}$ is the total number of pixels in the filtering window. In order to facilitate data analysis, all image data are normalized into the range of [0,1].
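A minimal sketch of this preprocessing step, assuming the MHSI cube is stored as a NumPy array of shape (H, W, D) and that the window size T is chosen by the user (function and variable names are illustrative):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def preprocess_mhsi(cube, T=3):
    """Mean-filter each band with a T x T spatial window, then scale the cube to [0, 1]."""
    # Spatial mean filtering applied band by band (no smoothing along the spectral axis).
    filtered = uniform_filter(cube.astype(np.float64), size=(T, T, 1), mode='nearest')
    # Global min-max normalization into [0, 1].
    cmin, cmax = filtered.min(), filtered.max()
    return (filtered - cmin) / (cmax - cmin + 1e-12)
```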

2.3.2 TDLR model

Due to the completeness of its statistical theory and its effectiveness in data analysis, least squares regression (LSR) has become a common tool in the field of pattern recognition. LSR-based classifiers have been widely used in pattern recognition and have achieved satisfactory performance owing to their efficient data analysis capabilities, compact form and efficient solutions. In this paper, the TDLR classifier is designed for MHSI classification; it inherits all the advantages of LSR and makes full use of the spatial-spectral information of the pathological image by introducing the region covariance descriptor.

Denoting a spectral vector in the region of interest of a hyperspectral image as $\textbf{x}$, the collection of $N$ training samples constructs the matrix $\textbf{X} = {[{\textbf{x}_1},{\textbf{x}_2}, \ldots ,{\textbf{x}_N}]^T} \in {R^{N \times D}}$. The model of LSR is defined as

$$\mathop {\min }\limits_\textbf{Q} \left\| {\textbf{XQ} - \textbf{Y}} \right\|_F^2 + \lambda \left\| \textbf{Q} \right\|_F^2$$
where $\textbf{Q} \in {R^{D \times C}}$ is the projection matrix, ${\lambda }$ is the balance regularization parameter, and $\textbf{Y} = {[{\textbf{y}_1},{\textbf{y}_2}, \ldots ,{\textbf{y}_N}]^T} \in {R^{N \times C}}$ is the binary label matrix corresponding to $\textbf{X}$, with $C \ge 2$ the number of classes. The definition of $\textbf{y}$ is based on the class that $\textbf{x}$ belongs to. That is, if ${\textbf{x}_i}$ belongs to the $c$-th class, the $c$-th element of ${\textbf{y}_i}$ is 1, and the other elements are 0.
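For reference, the standard LSR model of Eq. (1) admits a closed-form solution. The sketch below solves it with NumPy; the argmax decision rule shown here is the usual choice for plain LSR (the proposed TDLR instead uses a nearest-neighbor rule, as described later), and the function names are illustrative:

```python
import numpy as np

def fit_lsr(X, Y, lam=1e-2):
    """Ridge-regularized least squares: Q = (X^T X + lam*I)^{-1} X^T Y."""
    D = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ Y)

def predict_lsr(Q, X_test):
    """Assign each test sample to the class with the largest regression response."""
    return np.argmax(X_test @ Q, axis=1)
```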

For MHSI classification tasks, when transforming the data to the label space, it is expected that the distance between different classes can be enlarged and that the potential spatial-spectral structural information within the same class can be preserved. Inheriting from RLR, the proposed TDLR introduces an inter-class sparsity constraint and a tensor-based manifold regularization term to learn a soft regression target matrix. The model of TDLR is formulated as

$$\begin{array}{l} \mathop {\min }\limits_{\textbf{Q,M}} \left\| {\textbf{XQ} - \left( {\textbf{Y} - \textbf{A} \odot \textbf{M}} \right)} \right\|_F^2 + {\lambda _1}\sum\limits_{l = 1}^C {{{\left\| {{\textbf{X}_l}\textbf{Q}} \right\|}_{2,1}}} + {\lambda _2}{\cal T}\\ s.t.\;\; \textbf{M} \ge 0 \end{array}$$
where ${\textbf{X}_l} = {[{\textbf{x}_1},{\textbf{x}_2}, \ldots ,{\textbf{x}_{n_l}}]^T} \in {R^{{n_l} \times D}}$ is the matrix composed of all $n_l$ pixels belonging to the $l$-th class, $\odot$ is the Hadamard product operator, and ${\lambda _1}$ and ${\lambda _2}$ are the balance regularization parameters. $\textbf{A} \in {R^{N \times C}}$ is a constant matrix corresponding to $\textbf{Y}$, which is defined as
$${\textbf{A}_{ij}} = \left\{ {\begin{array}{cc} {1,} & {\textrm{if}\;{\textbf{Y}_{ij}} = 1}\\ { - 1,} & {\textrm{if}\;{\textbf{Y}_{ij}} = 0.} \end{array}} \right.$$
$\textbf{M} \in {R^{N \times C}}$ is a nonnegative label relaxation matrix, which is defined as
$$\textbf{M} = \left[ {\begin{array}{ccc} {{\textbf{m}_{11}}} & \cdots & {{\textbf{m}_{1C}}}\\ \vdots & \ddots & \vdots \\ {{\textbf{m}_{N1}}} & \cdots & {{\textbf{m}_{NC}}} \end{array}} \right].$$
The first term of the objective function relaxes the strict binary label constraint into a soft one, which provides greater freedom to fit the labels. The inter-class sparsity constraint makes the projected features of each class share a consistent row-sparsity structure and thus have natural distinguishability. The last and most important term, ${\cal T}$, is the tensor-based manifold regularization term, which makes the projected features keep the main energy and effectively avoids overfitting. In this paper, ${\cal T}$ is formulated based on the tensor-based intra-class compactness graph $\textbf{G} = \left \{ {\textbf{X},\textbf{W}} \right \}$.

Thus, the construction of the tensor-based intra-class compactness graph is crucial to the performance of the manifold term. The graph $\textbf{G} = \left \{ {\textbf{X},\textbf{W}} \right \}$, with vertex set $\textbf{X}$ and adjacency matrix $\textbf{W}$, should be able to characterize certain desired properties. In order to simultaneously utilize the spatial and spectral information of the data, the region covariance descriptor is introduced to construct the tensor-based intra-class compactness graph. The region covariance descriptor is a robust data descriptor with a strong ability in data representation [32]. Hyperspectral pixels in the form of tensors can be characterized by covariance features. Denoting a hyperspectral image as ${{\cal X}^{W \times H \times D}}$, the local neighborhood with window size $w \times w$ centered on each pixel can be regarded as a spatial-spectral third-order tensor ${\hat {\cal X}_i} \in {R^{w \times w \times D}}$, whose 3-mode fibers are denoted as ${\hat{\textbf{x}}_k} \in {R^D}$ ($k = 1,2, \ldots ,J;J = w \times w$). Then, the spectral region covariance descriptor $\textbf{C}$ is formulated as

$${\textbf{C}_i} = \frac{1}{{J - 1}}\sum_{k = 1}^J {\left( {{{\hat{\textbf{x}}}_k} - {\mu _i}} \right){{\left( {{{\hat{\textbf{x}}}_k} - {\mu _i}} \right)}^T}}$$
where ${\mu _i} = ({1 \mathord {\left / {\vphantom {1 J}} \right. } J})\sum \nolimits _{t = 1}^J {{{\hat{\textbf{x}}}_t}}$ is the mean vector, and $J$ is the number of spectral vectors within the spatial window.
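A minimal sketch of the region covariance descriptor of Eq. (5), assuming the preprocessed cube is a NumPy array of shape (H, W, D), the window size w is odd, and the center pixel lies at least w//2 pixels from the image border (function and variable names are illustrative):

```python
import numpy as np

def region_covariance(cube, row, col, w=9):
    """Covariance descriptor of the w x w spectral neighborhood centered at (row, col)."""
    r = w // 2
    patch = cube[row - r:row + r + 1, col - r:col + r + 1, :]   # (w, w, D) tensor patch
    vectors = patch.reshape(-1, patch.shape[-1])                # J = w*w spectral vectors
    mu = vectors.mean(axis=0)                                   # mean spectrum of the patch
    centered = vectors - mu
    return centered.T @ centered / (vectors.shape[0] - 1)       # (D, D) covariance C_i
```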

Considering that the defined covariance features lie on a Riemannian manifold [33], the Log-Euclidean distance is chosen to compute the similarity between ${\textbf{C}_i}$ and ${\textbf{C}_j}$, which is defined as

$${D_{{\textrm{cov}} }}\left( {{\textbf{C}_i},{\textbf{C}_j}} \right) = {\left\| {\log ({\textbf{C}_i}) - \log ({\textbf{C}_j})} \right\|_F}.$$
Thus, we define the tensor-based intra-class adjacency matrix $\textbf{W}$ as
$${\textbf{W}_{ij}} = \left\{ {\begin{array}{cc} {\exp \left\{ { - \frac{{{D_{{\textrm{cov}}}}{{({\textbf{C}_i},{\textbf{C}_j})}^2}}}{{2{t^2}}}} \right\},} & {\textrm{if}\;{\textbf{x}_j} \in {\Omega _k}({\textbf{x}_i})\;\textrm{and}\;l({\textbf{x}_i}) = l({\textbf{x}_j})}\\ {0,} & {\textrm{otherwise}} \end{array}} \right.$$
where ${\Omega _k}({\textbf{x}_i})$ is the set of $k$ nearest neighbors of ${\textbf{x}_i}$. In this way, the tensor based intra-class compactness graph $\textbf{G} = \left \{ {\textbf{X},\textbf{W}} \right \}$ exploits both spatial and spectral information of HSI data. The graph-preserving criterion is defined as
$$\mathop {\min }\limits_\textbf{Q} \sum\limits_{i \ne j} {{{\left\| {\textbf{x}_i^T\textbf{Q} - \textbf{x}_j^T\textbf{Q}} \right\|}^2}{\textbf{W}_{ij}}}.$$
By using the trace operator, the tensor-based manifold regularization term ${\cal T}$ is obtained as
$$\begin{array}{l} \mathop {\min }\limits_\textbf{Q} \sum\limits_{i \ne j} {{{\left\| {\textbf{x}_i^T\textbf{Q} - \textbf{x}_j^T\textbf{Q}} \right\|}^2}{\textbf{W}_{ij}}} = \mathop {\min }\limits_\textbf{Q} Tr({\textbf{Q}^T}{\textbf{X}^T}L\textbf{XQ})\\ \Rightarrow {\cal T} = Tr({\textbf{Q}^T}{\textbf{X}^T}L\textbf{XQ}) \end{array}$$
where $\textbf{L} = \textbf{D} - \textbf{W}$ is a Laplacian matrix, $\textbf{D}$ is a diagonal matrix with the $i$-th diagonal element being ${\textbf{D}_{ii}} = \sum \nolimits _{j = 1}^N {{\textbf{W}_{ij}}}$. Then the model of TDLR can be converted to
$$\begin{array}{l} \mathop {\min }\limits_{\textbf{Q,M}} \left\| {\textbf{XQ} - \left( {\textbf{Y} - \textbf{A} \odot \textbf{M}} \right)} \right\|_F^2 + {\lambda _1}\sum\limits_{l = 1}^C {{{\left\| {{\textbf{X}_l}\textbf{Q}} \right\|}_{2,1}}} + {\lambda _2}Tr({\textbf{Q}^T}{\textbf{X}^T}L\textbf{XQ})\\ s.t.{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \textbf{M} \ge 0. \end{array}$$
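Before turning to the optimization, the sketch below shows one way to assemble the adjacency matrix $\textbf{W}$ of Eq. (7) and the Laplacian $\textbf{L} = \textbf{D} - \textbf{W}$ used in Eq. (10), assuming the covariance descriptors have already been computed (e.g., with the `region_covariance` sketch above). Two assumptions are ours: the $k$ nearest neighbors $\Omega_k(\textbf{x}_i)$ are taken with respect to the same Log-Euclidean distance, and a small ridge is added before the matrix logarithm for numerical stability.

```python
import numpy as np
from scipy.linalg import logm

def log_euclidean_distance(Ci, Cj, eps=1e-6):
    """Log-Euclidean distance between two covariance descriptors, Eq. (6)."""
    D = Ci.shape[0]
    # Small ridge keeps the matrix logarithm well defined for near-singular covariances.
    Li = logm(Ci + eps * np.eye(D)).real
    Lj = logm(Cj + eps * np.eye(D)).real
    return np.linalg.norm(Li - Lj, 'fro')

def intra_class_graph(covs, labels, k=10, t=1.0):
    """Adjacency matrix W (Eq. (7)) and Laplacian L = D - W over the training pixels."""
    N = len(covs)
    dist = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1, N):
            dist[i, j] = dist[j, i] = log_euclidean_distance(covs[i], covs[j])
    W = np.zeros((N, N))
    for i in range(N):
        neighbors = np.argsort(dist[i])[1:k + 1]        # k nearest neighbors of sample i
        for j in neighbors:
            if labels[i] == labels[j]:                  # intra-class edges only
                W[i, j] = W[j, i] = np.exp(-dist[i, j]**2 / (2 * t**2))
    L = np.diag(W.sum(axis=1)) - W
    return W, L
```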
The unknown variables in the optimization problem depend on each other, which means that the proposed TDLR has no analytical solution. We apply the alternating direction method (ADM) [34] to the optimization problem above. By introducing an auxiliary variable $\mathbf{E}$, the optimization problem can be solved separably and reformulated as
$$\begin{array}{l} \mathop {\min }\limits_{\textbf{Q,M},\textbf{E}} \frac{1}{2}\left\| {\textbf{XQ} - \left( {\textbf{Y} - \textbf{A} \odot \textbf{M}} \right)} \right\|_F^2 + {\lambda _1}\sum\limits_{l = 1}^C {{{\left\| {{\textbf{E}_l}} \right\|}_{2,1}}} \\ + \frac{{{\lambda _2}}}{2}Tr({\textbf{Q}^T}{\textbf{X}^T}L\textbf{XQ}) + \frac{\mu }{2}\left\| {\textbf{E} - \textbf{XQ} + \frac{C}{\mu }} \right\|_F^2\\ s.t.\;\; \textbf{M} \ge 0 \end{array}.$$
Here $\mu$ is the penalty parameter and $C/\mu$ arises from the Lagrange multiplier associated with the auxiliary constraint $\textbf{E} = \textbf{XQ}$ (this $C$ denotes the multiplier matrix, not the number of classes).
We alternately solve for $\mathbf{Q}$, $\mathbf{M}$ and $\mathbf{E}$ while fixing the other variables. Fixing $\mathbf{M}$ and $\mathbf{E}$ and letting $\textbf{F} = \textbf{Y} - \textbf{A} \odot \textbf{M}$, $\mathbf{Q}$ can be calculated by minimizing the following objective:
$$J(\textbf{Q}) = \frac{1}{2}\left\| {\textbf{XQ} - \textbf{F}} \right\|_F^2 + \frac{{{\lambda _2}}}{2}Tr({\textbf{Q}^T}{\textbf{X}^T}L\textbf{XQ}) + \frac{\mu }{2}\left\| {\textbf{E} - \textbf{XQ} + \frac{C}{\mu }} \right\|_F^2.$$
Fixing $\mathbf{Q}$ and $\mathbf{M}$, $\mathbf{E}$ can be obtained from the following optimization problem:
$$\mathop {\min }\limits_\textbf{E} \; {\lambda _1}\sum_{l = 1}^C {{{\left\| {{\textbf{E}_l}} \right\|}_{2,1}}} + \frac{\mu }{2}\left\| \textbf{E} - \textbf{XQ} + \frac{C}{\mu } \right\|_F^2.$$
Fixing $\mathbf{Q}$ and $\mathbf{E}$, the optimal $\mathbf{M}$ can be obtained from the following optimization problem (consistent with the relaxed target of Eq. (2)):
$$\begin{array}{l} \mathop {\min }\limits_\textbf{M} \left\| {\textbf{XQ} - \textbf{Y} + \textbf{A} \odot \textbf{M}} \right\|_F^2\\ s.t.\;\; \textbf{M} \ge 0 \end{array}.$$
After obtaining $\mathbf{Q}$, the testing samples are classified by the nearest neighbor method.
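The text above specifies only the subproblems, not their closed-form solutions. The following is a minimal sketch of one plausible alternating scheme: the Q/E/M updates and the multiplier handling (`Lag`, `mu`) are our own derivation under a standard augmented-Lagrangian reading of Eq. (11), so they may differ in detail from the authors' solver, and all names are illustrative.

```python
import numpy as np

def solve_tdlr(X, Y, A, L, lam1=1e-1, lam2=1e-3, mu=1.0, iters=50):
    """Alternating (ADM-style) optimization of the TDLR objective, Eq. (11).

    X: (N, D) training matrix, Y: (N, C) binary labels, A: (N, C) sign matrix of Eq. (3),
    L: (N, N) graph Laplacian built from the tensor-based adjacency matrix W.
    """
    N, D = X.shape
    C = Y.shape[1]
    Q = np.zeros((D, C))
    M = np.zeros((N, C))
    E = np.zeros((N, C))
    Lag = np.zeros((N, C))                        # Lagrange multiplier for E = XQ

    XtX = X.T @ X
    XtLX = X.T @ L @ X
    for _ in range(iters):
        # Q-step: Eq. (12) is quadratic in Q; solve its normal equations in closed form.
        F = Y - A * M
        lhs = (1 + mu) * XtX + lam2 * XtLX + 1e-8 * np.eye(D)   # tiny ridge for stability
        rhs = X.T @ F + X.T @ (mu * E + Lag)
        Q = np.linalg.solve(lhs, rhs)

        # E-step: Eq. (13) reduces to a proximal step; here we assume a row-wise l2,1 norm.
        V = X @ Q - Lag / mu
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        E = np.maximum(1 - (lam1 / mu) / np.maximum(norms, 1e-12), 0) * V

        # M-step: Eq. (14) (with the sign convention of Eq. (2)), element-wise, nonnegative.
        M = np.maximum(A * (Y - X @ Q), 0)

        # Multiplier update for the auxiliary constraint E = XQ.
        Lag = Lag + mu * (E - X @ Q)
    return Q

def classify_nn(Q, X_train, y_train, X_test):
    """Nearest-neighbor classification in the learned projection space."""
    P_train, P_test = X_train @ Q, X_test @ Q
    d = np.linalg.norm(P_test[:, None, :] - P_train[None, :, :], axis=2)
    return y_train[np.argmin(d, axis=1)]
```

The E-step above assumes the common row-wise $\ell_{2,1}$ convention; if the class-wise norm in Eq. (11) is intended column-wise, the proximal operator would have to act on columns of each class block instead.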

Medical hyperspectral images have significant intra-class differences and inter-class similarities due to individual differences in patients. Therefore, it is necessary to explore how to construct an effective classifier which can not only exploit the local and global discriminant structures but also preserve the intrinsic manifold. TDLR is proposed by designing the inter-class sparsity constraint and tensor-based manifold regularization term to learn a soft regression target matrix. The inter-class sparsity constraint is employed to reduce the margins of samples within the same class and enlarge those of samples from different classes. The tensor-based manifold regularization is constructed by employing region covariance descriptor which is powerful for capturing the spatial–spectral information of MHSI and fusing different features naturally without being sensitive to region scale and rotation. TDLR effectively integrates and utilizes the advantages of the above items, which helps to improve the classification performance.

3. Experimental results and analysis

The MN patients are divided into two parts: a training part and a testing part. The training part consists of 10 PMN patients and 10 HBV-MN patients and is further divided into a learning part and a validation part (5 PMN patients and 5 HBV-MN patients) for parameter optimization. The results of each experiment are verified on the validation set to ensure that the classification results are obtained under the optimal parameters. The samples included in the testing part (23 PMN patients and 25 HBV-MN patients) are completely independent and only used for evaluation. Five objective quality indexes, i.e., sensitivity (SE), specificity (SP), overall accuracy (OA), average accuracy over the different classes (AA) and the kappa coefficient (Kappa), are used to evaluate the performance of MN classification.
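For a binary task such as PMN versus HBV-MN, these indexes can be computed directly from the confusion matrix. A minimal sketch is given below; treating HBV-MN as the positive class is our assumption, not stated in the paper:

```python
import numpy as np

def mn_metrics(y_true, y_pred):
    """SE, SP, OA, AA and Cohen's kappa for a binary task (1 = positive class)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    se = tp / (tp + fn)                      # sensitivity
    sp = tn / (tn + fp)                      # specificity
    oa = (tp + tn) / (tp + tn + fp + fn)     # overall accuracy
    aa = (se + sp) / 2                       # average accuracy over the two classes
    n = tp + tn + fp + fn
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return se, sp, oa, aa, kappa
```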

3.1 Selection of model parameters

The proposed TDLR has three important parameters: size of the tensor patch $w$, trade-off parameters $\lambda _1$ and $\lambda _2$. The influence of different parameters on the performance of TDLR and the best parameters for model selection are discussed below.

The size of the tensor patch $w$ plays a very important role since it determines the amount of spatial information used. Hence, various tensor patch sizes ([5, 7, 9, 11, 13, 15]) are validated on the MN dataset, and the experimental results are listed in Table 1. The influence of the parameter $w$ is analyzed by evaluating the average accuracy obtained by cross validation. It can be seen from Table 1 that the classification accuracy of the model increases as the size of the spatial neighborhood increases. The best performance is obtained with values of 9 or larger, which means that a size of 9 or more contains enough spatial information to reasonably describe the spatial characteristics of the immune complexes. Therefore, considering the computational cost, 9 is selected as the size of the tensor patch in the following experiments.

The trade-off parameters $\lambda _1$ and $\lambda _2$ adjust the contributions of the inter-class sparsity constraint term and the tensor-based manifold regularization term in TDLR. Both are chosen from {${10^{-6}}$, ${10^{-5}}$, ${10^{-4}}$, ${10^{-3}}$, ${10^{-2}}$, ${10^{ -1}}$}. The results indicate that setting ($\lambda _1$, $\lambda _2$) to (${10^{-1}}$, ${10^{-3}}$) achieves the best performance. Thus, (${10^{-1}}$, ${10^{-3}}$) is selected as the trade-off parameters for the MN dataset in the following experiments.
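A minimal sketch of this validation-based selection, assuming `train_eval(w, lam1, lam2)` is a user-supplied routine that trains TDLR on the learning part and returns the OA on the validation part (the routine name is illustrative):

```python
import itertools

def select_parameters(train_eval):
    """Grid search of patch size and trade-off parameters on the validation set."""
    patch_sizes = [5, 7, 9, 11, 13, 15]
    lambdas = [10.0**e for e in range(-6, 0)]          # 1e-6 ... 1e-1
    best_params, best_oa = None, -1.0
    for w, lam1, lam2 in itertools.product(patch_sizes, lambdas, lambdas):
        oa = train_eval(w, lam1, lam2)
        if oa > best_oa:
            best_params, best_oa = (w, lam1, lam2), oa
    return best_params, best_oa
```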

Table 1. Comparisons with different patch size for TDLR

3.2 Performance for MN classification

To evaluate the effectiveness of TDLR, we compare its classification performance with that of the typical support vector machine (SVM) [35] and several state-of-the-art LR-based methods, including DLSR [25], ReLSR [26], ICS_DLSR [27], MSRL [29], RLR [28] and DMLSR [30]. The same data processing operations as in TDLR are applied to the same training and testing datasets for all compared models. The performance of all compared methods in the subsequent experiments is obtained with their optimal parameters.

In practical medical applications, the quantity of available training samples is usually limited, so it is crucial to study the sensitivity to training size. We investigate the performance of TDLR with different numbers of training samples, ranging from 50 to 300. For each training size, we randomly select the training samples 5 times. Figure 3 shows the average OA over the 5 runs obtained by the various classification methods with different numbers of training samples. From the results, TDLR is consistently better than the other compared methods, especially when the training size is very small (e.g., 75).
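A sketch of this evaluation protocol, assuming `train_and_test(train_idx)` trains a classifier on the selected samples and returns OA on the fixed test set, and that the reported sizes are per class (both are our reading; names are illustrative):

```python
import numpy as np

def training_size_curve(labels, train_and_test, sizes=(50, 100, 150, 200, 250, 300),
                        repeats=5, seed=0):
    """Average OA over random draws of a fixed number of training samples per class."""
    rng = np.random.default_rng(seed)
    curve = {}
    for n in sizes:
        oas = []
        for _ in range(repeats):
            idx = np.concatenate([rng.choice(np.flatnonzero(labels == c), n, replace=False)
                                  for c in np.unique(labels)])
            oas.append(train_and_test(idx))
        curve[n] = float(np.mean(oas))
    return curve
```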

Fig. 3. Overall accuracy obtained by different methods with different number of training samples using the HBV-MN and PMN data.

Table 2 lists the best SE, SP, OA, AA and Kappa obtained by all methods with 300 training samples per class, and indicates that TDLR provides the best OA. In detail, TDLR outperforms the other classification methods by 6.71% to 15.40% in SE, 0.75% to 10.88% in OA, 3.23% to 12.76% in AA, and 0.034 to 0.367 in Kappa. For SP, although TDLR does not achieve the best performance, it obtains results comparable to DMLSR. The results confirm that the spatial information provided by the tensor-based operators helps to improve classification accuracy.

In order to further evaluate the performance of TDLR in extreme cases, only 50 samples per class are used for training. Table 3 tabulates the best SE, SP, OA, AA and Kappa obtained by all methods, with the best accuracy highlighted in bold. The proposed TDLR has significantly better SP and OA than the other methods and comparable SE to the best-performing method. In summary, all the experimental results verify the nonnegligible potential of TDLR for further application in MN identification.

Table 2. Classification accuracy of different methods with 300 training samples per class

Table 3. Classification accuracy of different methods with 50 training samples per class

Traditional MN diagnosis methods are subjective, and the accuracy of diagnosis depends on the doctor’s clinical experience. In this article, the pathological diagnosis is determined by a team of experienced clinicians and pathologists through renal biopsy. As renal biopsy is the gold standard for diagnosing MN, the diagnostic accuracy of this team is assumed to be 100%. The classification accuracy of TDLR is as high as 98.77%, which means that TDLR has great potential for clinical application.

4. Conclusion

In this paper, a hyperspectral database of 68 MN patients is built. For the identification of MN, a novel framework of tensor patch-based discriminative linear regression (TDLR) has been proposed based on the characteristics of the MHSI. By incorporating a tensor-based manifold regularization term and an inter-class sparsity constraint into the label relaxation regression model, the proposed TDLR learns a more compact and discriminative projection for regression. Extensive experiments on the MN dataset have demonstrated that the proposed TDLR outperforms the typical SVM and state-of-the-art LR-based classifiers. Our work provides an effective technology for characterizing and distinguishing PMN from HBV-MN, and verifies its potential for further applications in clinical diagnosis.

Funding

National Natural Science Foundation of China (61922013); Beijing Municipal Natural Science Foundation (JQ20021); Beijing Talent Foundation Outstanding Young Individual Project (2018000052580G470).

Disclosures

The authors declare no conflicts of interest.

References

1. J. Xu, H. Lei, S. Zhang, and D. O. Nephrology, “Major research advances in chronic kidney disease of 2017,” Clin. Focus. (2018).

2. J. A. J. G. van den Brand, J. M. Hofstra, and J. F. M. Wetzels, “Low-molecular-weight proteins as prognostic markers in idiopathic membranous nephropathy,” Clin. J. Am. Soc. Nephrol. 6(12), 2846–2853 (2011). [CrossRef]  

3. P. Ronco and H. Debiec, “Pathophysiological advances in membranous nephropathy: Time for a shift in patient’s care,” Lancet 385(9981), 1983–1992 (2015). [CrossRef]  

4. H. R. Dong, Y. Y. Wang, X. H. Cheng, G. Q. Wang, L. J. Sun, H. Cheng, and Y. P. Chen, “Retrospective study of phospholipase A2 receptor and IgG subclasses in glomerular deposits in Chinese patients with membranous nephropathy,” PLoS One 11(5), e0156263 (2016). [CrossRef]  

5. L. Fang, C. Wang, S. Li, and J. A. Benediktsson, “Hyperspectral image classification via multiple-feature-based adaptive sparse representation,” IEEE Trans. Instrum. Meas. 66(7), 1646–1657 (2017). [CrossRef]  

6. X. Wei, W. Li, M. Zhang, and Q. Li, “Medical hyperspectral image classification based on end-to-end fusion deep neural network,” IEEE Trans. Instrum. Meas. 68(11), 4481–4492 (2019). [CrossRef]  

7. M. Zhang, W. Li, and Q. Du, “Diverse region-based CNN for hyperspectral image classification,” IEEE Trans. on Image Process. 27(6), 2623–2634 (2018). [CrossRef]  

8. W. Li, G. Wu, and Q. Du, “Transferred deep learning for anomaly detection in hyperspectral imagery,” IEEE Geosci. Remote Sensing Lett. 14(5), 597–601 (2017). [CrossRef]  

9. Q. Huang, W. Li, B. Zhang, Q. Li, R. Tao, and N. H. Lovell, “Blood cell classification based on hyperspectral imaging with modulated gabor and cnn,” IEEE J. Biomed. Health Inform. 24(1), 160–170 (2020). [CrossRef]  

10. S. Ortega, H. Fabelo, R. Camacho, M. Plaza, G. M. Callico, and R. Sarmiento, “Detecting brain tumor in pathological slides using hyperspectral imaging,” Biomed. Opt. Express 9(2), 818–831 (2018). [CrossRef]  

11. S. Zhu, K. Su, Y. Liu, H. Yin, Z. Li, F. Huang, Z. Chen, W. Chen, G. Zhang, and Y. Chen, “Identification of cancerous gastric cells based on common features extracted from hyperspectral microscopic images,” Biomed. Opt. Express 6(4), 1135–1145 (2015). [CrossRef]  

12. C. Lu and M. Mandal, “Toward automatic mitotic cell detection and segmentation in multispectral histopathological images,” IEEE J. Biomed. Health Inform. 18(2), 594–605 (2014). [CrossRef]  

13. Y. Khouj, J. Dawson, J. Coad, and L. Vona-Davis, “Hyperspectral imaging and k-means classification for histologic evaluation of ductal carcinoma in situ,” Front. Oncol. 8, 17 (2018). [CrossRef]  

14. G. Lu and B. Fei, “Medical hyperspectral imaging: A review,” J. Biomed. Opt. 19(1), 010901 (2014). [CrossRef]  

15. M. A. Calin, S. V. Parasca, D. Savastru, and D. Manea, “Hyperspectral imaging in the medical field: Present and future,” Appl. Spectrosc. Rev. 49(6), 435–447 (2014). [CrossRef]  

16. M. Carrión-Camacho, I. Marín-León, J. Molina-Doñoro, and J. González-López, “Safety of permanent pacemaker implantation: A prospective study,” J. Clin. Med. 8(1), 35 (2019). [CrossRef]  

17. S. V. Parasca, M. A. Calin, D. Manea, S. Miclos, and R. Savastru, “Hyperspectral index-based metric for burn depth assessment,” Biomed. Opt. Express 9(11), 5778 (2018). [CrossRef]  

18. M. Halicek, J. D. Dormer, J. V. Little, A. Y. Chen, and B. Fei, “Tumor detection of the thyroid and salivary glands using hyperspectral imaging and deep learning,” Biomed. Opt. Express 11(3), 1383 (2020). [CrossRef]  

19. B. L. Jian, Z. F. Zhang, and W. Quan, “Tumor tissue classification based on micro-hyperspectral technology and deep learning,” Biomed. Opt. Express 10(12), 6370–6389 (2019). [CrossRef]  

20. D. J. Mordant, I. Al-Abboud, G. Muyo, A. Gorman, A. Sallam, P. Ritchie, A. R. Harvey, and A. I. McNaught, “Spectral imaging of the retina,” Eye 25(3), 309–320 (2011). [CrossRef]  

21. W. R. Johnson, D. W. Wilson, W. Fink, M. Humayun, and G. H. Bearman, “Snapshot hyperspectral imaging in ophthalmology,” J. Biomed. Opt. 12(1), 014036 (2007). [CrossRef]  

22. L. Gao, R. T. Smith, and T. S. Tkaczyk, “Snapshot hyperspectral retinal camera with the image mapping spectrometer (ims),” Biomed. Opt. Express 3(1), 48–54 (2012). [CrossRef]  

23. A. Schweitzer, J. Horn, R. T. Mikolajczyk, G. Krause, and J. J. Ott, “Estimations of worldwide prevalence of chronic hepatitis b virus infection: A systematic review of data published between 1965 and 2013,” Lancet 386(10003), 1546–1555 (2015). [CrossRef]  

24. X. Yong, X. Fang, Z. Qi, C. Yan, J. You, and L. Hong, “Modified minimum squared error algorithm for robust classification and face recognition experiments,” Neurocomputing 135, 253–261 (2014). [CrossRef]  

25. S. Xiang, F. Nie, G. Meng, C. Pan, and C. Zhang, “Discriminative least squares regression for multiclass classification and feature selection,” IEEE Trans. Neural Networks and Learning Syst. 23(11), 1738–1754 (2012). [CrossRef]  

26. X. Zhang, L. Wang, S. Xiang, and C. Liu, “Retargeted least squares regression algorithm,” IEEE Trans. Neural Networks and Learning Syst. 26(9), 2206–2213 (2015). [CrossRef]  

27. J. Wen, Y. Xu, Z. Li, Z. Ma, and Y. Xu, “Inter-class sparsity based discriminative least square regression,” Neural Netw 102, 36–47 (2018). [CrossRef]  

28. X. Fang, Y. Xu, X. Li, Z. Lai, W. K. Wong, and B. Fang, “Regularized label relaxation linear regression,” IEEE Trans. Neural Networks and Learning Syst. 29(4), 1006–1018 (2018). [CrossRef]  

29. Z. Zhang, L. Shao, Y. Xu, L. Liu, and J. Yang, “Marginal representation learning with Graph structure self-adaptation,” IEEE Trans. Neural Networks and Learning Syst. 29(10), 4645–4659 (2018). [CrossRef]  

30. Y. Zhang, W. Li, H. C. Li, R. Tao, and Q. Du, “Discriminative marginalized least-squares regression for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sensing 58(5), 1–4 (2020). [CrossRef]  

31. L. Zhang, Q. Zhang, B. Du, X. Huang, Y. Y. Tang, and D. Tao, “Simultaneous spectral-spatial feature selection and extraction for hyperspectral images,” IEEE Trans. Cybernetics 48(1), 16–28 (2018). [CrossRef]  

32. Y. J. Deng, H. C. Li, L. Pan, L. Y. Shao, and Q. Du, “Modified tensor locality preserving projection for dimensionality reduction of hyperspectral images,” IEEE Geosci. Remote Sensing Lett. 15(2), 277–281 (2018). [CrossRef]  

33. F. Masoud, P. Maziar, and S. Conrad, “Log-euclidean bag of words for human action recognition,” Iet Comput. Vis. 9(3), 331–339 (2015). [CrossRef]  

34. J. F. Yang and X. Yuan, “Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization,” Math. Comput. 82, 281 (2011). [CrossRef]  

35. T. A. Moughal, “Hyperspectral image classification using support vector machine,” J. Phys.: Conf. Ser. 439, 012042 (2013). [CrossRef]  



