## Abstract

The exposure of normal tissues to high radiation during cone-beam CT (CBCT) imaging increases the risk of cancer and genetic defects. Statistical iterative algorithms with the total variation (TV) penalty have been widely used for low dose CBCT reconstruction, with state-of-the-art performance in suppressing noise and preserving edges. However, TV is a first-order penalty and sometimes leads to the so-called staircase effect, particularly over regions with smooth intensity transition in the reconstruction images. A second-order penalty known as the Hessian penalty was recently used to replace TV to suppress the staircase effect in CBCT reconstruction at the cost of slightly blurring object edges. In this study, we proposed a new penalty, the TV-H, which combines TV and Hessian penalties for CBCT reconstruction in a structure-adaptive way. The TV-H penalty automatically differentiates the edges, gradual transition and uniform local regions within an image using the voxel gradient, and adaptively weights TV and Hessian according to the local image structures in the reconstruction process. Our proposed penalty retains the benefits of TV, including noise suppression and edge preservation. It also maintains the structures in regions with gradual intensity transition more successfully. A majorization-minimization (MM) approach was designed to optimize the objective energy function constructed with the TV-H penalty. The MM approach employed a quadratic upper bound of the original objective function, and the original optimization problem was changed to a series of quadratic optimization problems, which could be efficiently solved using the Gauss-Seidel update strategy. We tested the reconstruction algorithm on two simulated digital phantoms and two physical phantoms. Our experiments indicated that the TV-H penalty visually and quantitatively outperformed both TV and Hessian penalties.

© 2016 Optical Society of America

## 1. Introduction

On-board cone-beam computed tomography (CBCT) is extensively used in clinical image-guided radiation therapy [1]. Repeated exposure to CBCT scanning during radiation therapy may increase the risk of cancer [2] and genetic defects [3].

The radiation dose may be reduced by lowering the mAs levels during the CBCT acquisition process. However, this strategy also decreases the number of *x*-ray photons detected by the scanner. As a result, the conventional FDK method will produce reconstructed CBCT images with increased noise from the projections acquired with lower mAs levels [4].

Several algorithms have been proposed to improve the quality of CBCT imaging from lower mAs level projections [5]. Statistical iterative reconstruction algorithms have shown to perform better than the FDK method. Algorithms based on compressed sensing have also been applied to CBCT reconstruction [6–9]. An adaptive steepest-descent projections on convex sets (ASD-POCS) iterative algorithm with the total variation (TV) penalty has been developed with encouraging reconstruction results [10]. The penalized weighted least-squares (PWLS) iterative algorithm with TV (PWLS-TV) has demonstrated its advantage in suppressing noises and preserving edges, showing state-of-the-art performance in CBCT reconstruction [11, 12].

Although the TV penalty is successful from many points of view, one disadvantage is the so-called staircase effect [13, 14]. Characterized by the first-order differential operator, TV penalizes the image gradient regardless of the image structures, and potentially produces reconstruction results with piecewise-constant artifacts that negatively affect the correct interpretation of images in medical imaging.

Recently, there has been growing interest in using penalties with higher-order differential operators, aiming to suppress the staircase effect for various inverse problems in image processing [15–20]. Higher-order penalties can potentially prevent the oversharpening of regions with smooth intensity transitions because of their piecewise-vanishing property, which produces piecewise-linear solutions that better fit smooth intensity changes. Most high-order penalties involve second-order differential operators. Bredies *et al.* proposed the Total Generalized Variation (TGV) with symmetric tensors, involving and balancing image higher-order derivatives [15]. TGV is characterized by convexity and weak semi-continuity, and shows appealing properties for natural image denoising particularly in reducing the staircase effect. You and Kaveh proposed a fourth-order partial differential equation (PDE) to optimize the trade-off between noise removal and edge preservation for image denoising problems [16]. Hu *et al.* developed both isotropic and anisotropic higher degree TV penalties for compressed sensing and image denoising [17]. Lefkimmiatis *et al.* proposed regularizers involving the Schatten norms of the Hessian matrix computed at every image pixel [18]. Higher-order penalties can effectively suppress the staircase effect, but may also lead to the oversmoothing of object edges [20].

In a previous study [21], we adopted the Hessian penalty to reduce the staircase effect in PWLS iterative CBCT reconstruction. Unlike TV, Hessian is a second-order penalty [19], which allows for linear smooth intensity transition between neighboring pixels. The Hessian penalty has a number of favorable properties including convexity, homogeneity, and rotation and translation invariance. It can also suppress the staircase effect of the TV penalty for PWLS CBCT reconstruction [21]. However, like most higher-order penalties, Hessian tends to slightly blur the edges of the reconstructed image.

A combination of Hessian and TV penalties may overall benefit CBCT reconstruction. Many approaches have been proposed to combine penalties for various image processing applications [22, 23]. Chambolle and Lions proposed the combination of first- and second-order convex regularizations through inf-convolution for image denoising, where the noisy image was decomposed into three parts [24]. Lysaker and Tai proposed the combination of TV with a fourth-order PDE to recover gray-scale images from noisy observations [25]. They separately solved two objective energy functions, one with a first-order penalty (TV) and one with a higher-order penalty, and then used the two solutions to generate a new result through a convex combination. Do *et al.* proposed a variational approach to combine the first- and higher-order TV penalties by using an equivalent constant weighting strategy for low-dose CT image reconstruction [26]. In their method, the weights were manually set to balance the first- and second-order terms based on visual inspection. Dutaa *et al.* presented a combination of *L*^{1} and TV penalties for fluorescence molecular tomography; *L*^{1} was used to suppress spurious background signals and enforce sparsity, while TV preserved local smoothness and piecewise constancy in the reconstructed images [27]. They manually selected the best constant weights for both penalties in an objective energy function through a trial-and-error approach. Bioucas-Dias *et al.* considered the regularizers through linear combinations of “simpler” regularizers [28]. In their method, the weighting parameters for individual regularizers were manually tuned for optimal performance. Chen *et al*. proposed the combination between the tight framelet and TV to restore images corrupted by blur and impulsive noise [29]. The resulting objective function included two penalties: the TV penalty on the framelet coefficients and the *L*^{1} norm on the restored image. In the same method, the weighting parameters were also manually tuned for different images.

In this study, we proposed a new TV-H penalty combining TV and Hessian in a structure-adaptive way for CBCT reconstruction without introducing extra parameters. The proposed TV-H penalty can automatically adjust the weight parameters between TV and Hessian according to local image structures. We designed an exponential term to automatically recognize edges, gradual transition and uniform regions based on the image gradient. For uniform regions, the exponential term was about 1; for edges, the term approached 0; for gradual transition regions, values varied between 0 and 1. Using the majorization-minimization (MM) approach [30] and the Gauss-Seidel update strategy, we developed an effective algorithm to minimize the objective function with the proposed TV-H penalty. We conducted a systematic comparison between TV, Hessian and TV-H for CBCT reconstruction using two simulated phantoms and two physical phantoms. Our results indicated that the TV-H penalty not only outperformed the TV penalty in suppressing the staircase effect, but also overcame the drawbacks of the Hessian penalty, without blurring object edges.

## 2. Methods

#### 2.1 PWLS reconstruction

Noise in the CBCT projection data after logarithm transformation follows an approximate Gaussian distribution; noise variance can be determined by the following exponential formula [31]

where ${\overline{p}}_{i}$ is the mean of the projection ${p}_{i}$, and${N}_{i0}$is the incident photon number, both at the detector bin*i*. Following the noise model in Eq. (1), the cost function of the PWLS algorithm can be written as [32]

*p*is the vector of the projection measurements.

*A*represents the projection matrix, $\mu $ is the attenuation map and $\Sigma $ is a diagonal matrix with its

*i*th element ${\sigma}_{i}^{2}$. The symbol

*T*denotes the transpose operator. The second term is a penalty term, where $\beta $ is a parameter that balances the WLS and the penalty terms.

#### 2.2. Adaptive structure reconstruction

The TV penalty consists of the first-order derivative of the reconstructed images. Although the TV penalty in CBCT reconstruction presents advantages in noise suppression and edge preservation [31], it may also lead to the staircase effect. We extended the first-order derivative to a second-order derivative and defined the Hessian penalty in our previous study for CBCT reconstruction [21].

The Hessian penalty successfully suppresses the staircase effect in CBCT reconstruction, but also reduces image resolution and blurs object edges [21]. In contrast, the TV penalty successfully preserves those edges. A combination between the two penalties may eliminate the TV staircase effect while preserving edges. In this study, we proposed the TV-H penalty that combines both penalties as follows:

**is the voxel coordinate $\left(x,y,z\right)$. The second term in Eq. (3) is a modified Hessian penalty, where**

*X*In discrete cases, the TV-H penalty in Eq. (3) can be re-written as

Let $\alpha ={\left\{{\alpha}_{X}\right\}}_{\forall X\in \Omega}$. After substituting Eq. (6) into Eq. (2), we obtained the cost function for the structure-adaptive CBCT reconstruction as follows:

#### 2.3. Optimization method

Both TV and Hessian penalties are convex. However, the cost function $\Phi \left(\mu ,\alpha \right)$ is no longer convex over the two variables $\left(\mu ,\alpha \right)$. To minimize $\Phi \left(\mu ,\alpha \right)$, we used a strategy similar to the alternating minimization (AM) method. In the standard AM method, the cost function is iteratively minimized with two alternating minimization steps. Each step minimizes the problem over one variable while keeping the second variable fixed. In our strategy, the minimization of $\Phi \left(\mu ,\alpha \right)$ was conducted iteratively with a minimization step and an update step. The minimization step minimizes $\Phi \left(\mu |\alpha \right)$ over $\mu $with a fixed $\alpha $, while the update step updates $\alpha $ using the result from the minimization step. We noted that when $\alpha $ is fixed, $\Phi \left(\mu |\alpha \right)$ is convex over $\mu $.

### 2.3.1 Minimization step

This step minimizes $\Phi \left(\mu |\alpha \right)$. We observed that the proposed TV-H penalty in Eq. (6) is not continuously differentiable. Gradient-based optimization methods such as the gradient descent and the conjugate gradient [33] cannot be used for minimizing $\Phi \left(\mu |\alpha \right)$. In this study, we adopted the MM approach [30]. The MM approach takes a series of quadratic forms of the upper-bound of the original objective function $\Phi \left(\mu |\alpha \right)$. Solutions that minimize these quadratic forms converge into the solution of the original minimization problem. The minimization of a quadratic form objective function can be easily implemented using methods such as the Gauss-Seidel [5] update strategy.

In the MM approach,${\mu}^{\left(t\right)}$ represents a fixed $\mu $ value at the *t*th iteration and $S\left(\mu \right)$ is the objective function. $Q\left(\mu |{\mu}^{\left(t\right)}\right)$ is a majorizer of $S\left(\mu \right)$ at ${\mu}^{\left(t\right)}$so that for $\forall \mu $:

In our study, the following inequality was used to find the quadratic majorizer of the TV-H penalty:

Equations (11) and (12) can construct majorizers for TV and Hessian, respectively. The sum of the TV and Hessian majorizers is obviously equal to the TV-H majorizers, satisfying the condition defined in Eq. (10). By replacing Eq. (6) with Eq. (11) and Eq. (12), we have the upper bound of ${R}_{TV-H}$:

*j*. Both ${\left({w}_{ji}^{TV}\right)}^{\left(t\right)}$ and ${\left({w}_{ji}^{Hessian}\right)}^{\left(t\right)}$are the weighting parameters between neighboring voxels determined by${\mu}^{\left(t\right)}$, with ${\left({w}_{ji}^{TV}\right)}^{\left(t\right)}=-{\left({R}_{TV}^{\left(t\right)}\right)}_{i,j}/2$ and ${\left({w}_{ji}^{Hessian}\right)}^{\left(t\right)}=-{\left({R}_{Hessian}^{\left(t\right)}\right)}_{i,j}/2$. In Eq. (14) for a fixed voxel

*j*, the cross terms with index

*ji*arise from the derivative operations, and only exist when

*i*is no more than one voxel and two pixels away from

*j*for TV and Hessian, respectively. More details about ${w}_{ji}^{\left(t\right)}$and its update during the reconstruction process are shown in Appendix B.

By substituting Eq. (14) into Eq. (7), we obtain the following quadratic objective function:

### 2.3.2 Update step

This step updates $\alpha $ after the minimization step. In this step, each component in $\alpha $ was updated as ${\left(\alpha \right)}_{j}={\alpha}_{j}^{}=\mathrm{exp}\left(-{\left|\nabla {\mu}_{j}^{}\right|}_{}^{2}/{\eta}^{2}\right)$, according to Eq. (5). Our experiments showed that$\eta $ can be reasonably set as 40% of the mean of the FDK reconstructed image gradient magnitude. In other words, $\eta $ was fixed during the iterative reconstruction. More details about our minimization strategy can be found in Appendix C.

#### 2.4. Details in calculating projection matrix A

Calculating the 3D CBCT reconstruction is time-consuming, hence it is important to find an effective method to determine projection matrix$A$in Eq. (17). To implement the forward projection, we adopted separable footprint (SF) projectors [34]. We specifically used the SF projector with a trapezoid/rectangle function (SF-TR) [34]. SF-TR approximates the voxel footprint functions as 2D separable functions, and uses the trapezoid function in the transaxial direction and the rectangular function in the axial direction.

## 3. Experiments

We tested our reconstruction algorithm on a Compressed Sensing (CS) phantom [35], a modified 3D Shepp-Logan phantom [36], a commercial calibration phantom CatPhan 600 (The Phantom Laboratory, Inc., Salem, NY) and an anthropomorphic head phantom. The first two are simulated digital phantoms, while the others are physical phantoms.

The projections of the two digital phantoms were simulated using the SF algorithm. There were 360 projections over 360°; the dimension of each projection was 620.8 × 155.2 mm^{2}, including 800 × 200 pixels. The source-to-axial distance was 100 cm and the source-to-detector distance was 150 cm. The noise of different levels was added to the projections according to Eq. (1). The incident photon number in Eq. (1) was set to 5 × 10^{3}, 1 × 10^{4} and 5 × 10^{4} to simulate low-dose, medium-dose and high-dose protocols, respectively. The reconstructed image size was 350 × 350 × 16 while the voxel size was 0.776 × 0.776 × 0.776 mm^{3}.

In both physical phantoms, the CBCT projection data were acquired by a Varian Acuity system (Varian Medical System, Palo Alto, CA) with 125 kVp voltage, 10 mA/10 ms current for the low dose and 125 kVp voltage 80 mA/12 ms for the high dose. There were 634 projections in a circle with individual dimensions of 397 × 298 mm^{2}, including 1024 × 768 pixels. The projection data were downsampled by a factor of 2 and only 16 out of 768 projection data were chosen for reconstruction along the axial direction. The source-to-axial distance was 100 cm and the source-to-detector distance was 150 cm. For CatPhan 600, the image size was 350 × 350 × 16 and the voxel size was 0.776 × 0.776 × 0.776 mm^{3}. For the anthropomorphic head phantom, the image size was 550 × 550 × 32 mm^{3} and the voxel size was 0.388 × 0.388 × 0.388 mm^{3}.

#### 3.1. Evaluation criteria

To compare the performance of the TV, Hessian and TV-H penalties we adopted the following criteria: peak signal-to-noise ratio (*PSNR*), increase in the signal-to-noise ratio (*ISNR*), structural similarity (*SSIM*) [37], full width at half maximum (*FWHM*), edge-spread function (*ESF*) [5] and contrast-to-noise ratio (*CNR*).

The *PSNR* is defined as

*PSNR*indicates a reduced error between the reconstructed image and the original image.

The *ISNR* is defined as

*MSE*is the mean-squared error between the FDK reconstructed image and the original image.

_{in}*MSE*is the mean-squared error between the reconstructed image and the original image. A higher

_{out}*ISNR*indicates better quality of the reconstructed image as compared with the FDK reconstructed image.

The *SSIM* is defined as

*a*and

*b*are two windows of 11 × 11 pixels in the same position of two images. ${\mu}_{a}$ and ${\mu}_{b}$are the average voxel intensities, ${\lambda}_{a}^{2}$ and ${\lambda}_{b}^{2}$ are the voxel intensity variances, all in windows

*a*and

*b*. ${\lambda}_{ab}$ is the intensity covariance between the two windows. ${C}_{1}$ and ${C}_{2}$ were chosen as ${C}_{1}={\left(0.01{\mu}_{\mathrm{max}}\right)}^{2}$ and ${C}_{2}={\left(0.03{\mu}_{\mathrm{max}}\right)}^{2}$, respectively. A higher

*SSIM*indicates that the regions of these two images are more similar to each other.

To calculate the *FWHM,* the gradient was first fitted along a selected profile perpendicular to an object’s edge using the following Gaussian function

*FWHM*is then defined as,A lower

*FWHM*indicates that the object’s edge is sharper.

The resolution of an image can also be described by *ESF* along profiles through a region of interest (ROI). The *ESF* is defined as:

*erf*is the error function,

*r*is the shift of

*erf*and

*H*is the magnitude of

*erf*, and $\overline{x}$ is the center of an edge. The parameter $\kappa $ represents the resolution of the image, and a higher $\kappa $ indicates a blurrier edge.

*CNR* is defined as

*CNR*suggests better preservation of the signal-to-noise contrast.

These criteria evaluated the performance of different penalties from various standpoints, though some of them presented potential limitations. For example, *MSE*-based criteria (*ISNR* and *PSNR*) do not always accurately model perceptual quality, *CNR* can become meaningless when applied to the ROI and background with constant intensity because it achieves infinity, while *SSIM* is not stable enough in regions of low intensity variance [38].

#### 3.2. CS phantom

The CS phantom was used to measure the resolution properties of different penalties and the staircase effect of the TV penalty. This phantom has a uniform background with an attenuation coefficient of 0.0125 mm^{−1}. An octahedron and a ball with linearly gradual transitioning intensity were used to evaluate the potential staircase effect of different penalties. A set of line objects and circular cylinders were used to measure the resolution and *CNR* of the image, respectively. A representative slice of the CS phantom is shown in Fig. 1. Five green boxes were used to calculate the noise of the phantom, while the red box indicates the ROI.

### 3.2.1 Convergence analysis

To examine the convergence of the proposed algorithm, we used the difference in the attenuation coefficient between the successive iterations to indicate whether the algorithm was convergent. Figure (2) shows the convergence curves of the CS phantoms reconstructed by the PWLS algorithm with TV, Hessian and TV-H penalties, where the residual was defined as ${{\Vert {\mu}^{\left(k+1\right)}-{\mu}^{\left(k\right)}\Vert}_{2}/\Vert {\mu}^{\left(k\right)}\Vert}_{2}$ with *k* as the iteration number. In this study 30 iterations were used for the reconstruction in PWLS with different penalties. As shown, the three penalties had similar convergence speeds for all noise levels (Fig. 2).

### 3.2.2 Evaluation

To compare the performance of the three penalties, we adjusted the parameter $\beta $ in Eq. (2) by trial-and-error so that the reconstruction results had the same noise level. Most evaluation criteria used in this work were related to the noise level. Particularly, image resolution and noise level is always a trade-off. By matching the noise level in the reconstructed image, we can directly compare the relative performance of different penalties [39].

The evaluation index values of the whole image are listed in Table 1. The proposed TV-H and Hessian penalties had similar performances, and were significantly better than the TV penalty and the FDK reconstruction for all incident photon numbers (Table 1). For example, when the incident photon number was 5 × 10^{3}, the *PSNR* (44.09 dB), *ISNR* (17.52 dB) and *SSIM* (0.9812) of the TV-H penalty were higher than those of the TV penalty.

Figure 3 shows the *SSIM* maps between the reconstructed images and the original images for the three penalties. The regions near the octahedral and the ball in the reconstructed images using TV-H and Hessian had higher values than those reconstructed using TV, indicating a better preservation of the intensity gradual transition regions. In contrast, the annular and strip-type regions reconstructed using TV-H and TV exhibited higher values than those of the Hessian penalty, suggesting an improved preservation of object edges. In addition, PWLS reconstruction algorithms with the three penalties significantly outperformed the FDK algorithm. As expected, these results indicate that the TV-H penalty successfully enhanced the individual benefits of TV and Hessian.

#### 3.3. Modified 3D Shepp-Logan phantom

We modified the original 3D Shepp-Logan phantom by substituting the constant ellipsoid with a linear transitioning ellipsoid, in order to compare the reconstruction performance of different penalties for preserving both edges and transitioning regions in the image.

A representative slice of the modified 3D Shepp-Logan phantom obtained by different reconstruction algorithms is illustrated in Fig. 4, where the original (Fig. 4(a)), the analytical FDK reconstructed image (Fig. 4(b)), the reconstructed image using TV (Fig. 4(c)), the reconstructed image using Hessian (Fig. 4(d)) and the reconstructed image using TV-H (Fig. 4(e)) are shown. For improved illustration, a ROI (indicated by the green square in Fig. 4(a)) was enlarged at the ellipsoid for each image. The display window was set to 0-0.05mm^{−1} for better illustration. The reconstructed images using TV-H and Hessian successfully preserved the smooth region while the reconstructed image using TV penalty showed an obvious staircase effect.

Figure 5 reports the profiles along the long red line in Fig. 4 (a) through the gradual transition regions in the modified ellipsoid of the reconstructed images. These plots clearly show that the profiles from TV-H (Fig. 5(e)) and Hessian (Fig. 5(d)) were smooth, while those from TV (Fig. 5(c)) had several obvious steps. This finding is consistent with our visual observation in Fig. 4.

To measure the blurring degree of the object edges, we calculated the derivative of the reconstructed image along the short blue line in Fig. 4(a). We then fitted the derivative with a Gaussian function to obtain the *FWHM* (Fig. 6). The short blue line goes through a sharp edge. The *FWHM*s of TV-H, TV and Hessian, were 1.75, 1.61 and 3.16, respectively. These results indicate that both TV-H and TV can preserve the edges better than Hessian.

#### 3.4. CatPhan 600 phantom

A typical slice of the reconstructed results from the CatPhan 600 phantom is reported in Fig. 7. The following information is shown: FDK reconstruction from projection data acquired with the low-dose protocol (10 mA/10 ms) (Fig. 7(a)); FDK reconstruction from projection data acquired with the high-dose protocol (80 mA/10 ms) (Fig. 7(b)); low-dose CBCT image reconstructed using TV (Fig. 7(c)); low-dose CBCT image reconstructed using Hessian (Fig. 7(d)); low-dose CBCT image reconstructed using TV-H (Fig. 7(e)). The three iterative reconstruction results were compared at the same noise level measured with the standard derivation of the reference region, indicated by a red square in Fig. 7(a). Among the three penalties, the noises of the original FDK reconstruction from the low-dose protocol in Fig. 7(a) were greatly reduced.

To compare the *CNR* of the three penalties, we selected four low-contrast ROIs with a size of 17 × 17 pixels. Their exact locations are indicated by arrows in Fig. 7(a). The *CNRs* of the four regions from the three PWLS algorithms at different noise levels are listed in Table 2. Most of the *CNR* values from the TV-H reconstruction were similar to those of the TV and Hessian reconstruction, and were higher than those of the FDK (80mA) reconstruction (Table 2). Our findings indicate that these three penalties can preserve image contrast.

#### 3.5. Anthropomorphic head phantom

A representative slice of the results reconstructed from the anthropomorphic head phantom is shown in Fig. 8. Reconstruction and images are shown as follows: FDK reconstruction from the projection data acquired with the low-dose protocol (10 mA/10 ms) (Fig. 8(a)); FDK reconstruction from the projection data acquired with the high-dose protocol (80 mA/10 ms) (Fig. 8(b)); low-dose reconstructions using TV, Hessian and TV-H, respectively (Fig. 8(c)-8(e)). The three iterative reconstruction results were approximately at the same noise level as the standard derivation of the region indicated by a blue square in Fig. 8(a), equal to 4.40 × 10^{−4}, 4.29 × 10^{−4} and 4.49 × 10^{−4} for Fig. 8(c), Fig. 8(d) and Fig. 8(e), respectively. We focused on a selected ROI indicated by the red square in Fig. 8(a). We adopted the FDK result from the high-dose projections as a reference. The mean *SSIMs* of the ROI between the high-dose FDK result and the TV, Hessian and TV-H results were 0.9881, 0.9949 and 0.9935, respectively. The *SSIM* maps of the ROI for TV, Hessian, and TV-H are illustrated in Fig. 9(a), 9(b) and 9(c), respectively. Both Hessian and TV-H outperformed TV possibly because the head phantom was dominated by gradual intensity transition regions.

A yellow line was drawn to fit the edge-spread function (*ESF*) as expressed in Eq. (23) (Fig. 8(a)). The parameter $\kappa $ was used to characterize the image resolution. The *ESFs* of the PWLS images reconstructed with the low-dose protocol using TV, Hessian, and TV-H penalty are shown in Fig. 10; $\kappa $ values were 0.9680, 4.4601 and 1.2143, respectively. This indicates that the resolution of the TV-H reconstruction was similar to that of the TV reconstruction, and higher than that of the Hessian reconstruction.

We examined the adaptive parameter $\alpha $ defined in Eq. (5) along the yellow line in the anthropomorphic head phantom for TV-H in the PWLS reconstruction process (Fig. 11). The parameter $\alpha $ balanced the weight between TV and Hessian, and was automatically updated in each iteration during reconstruction. The values of $\alpha $ gradually stabilized, with increases in the iteration number (Fig. 11).

The $\alpha $ of a representative slice of the anthropomorphic head phantom is shown in Fig. 12. The parameter $\alpha $ played the role of an edge detector, and gradually converged into the real boundary during the iteration process.

## 4. Discussion

The state-of-the-art TV penalty successfully suppresses noise and preserves edges in low dose CBCT reconstruction. However, one of its drawbacks is the staircase effect, which modifies gradual intensity transition regions into piecewise constant regions in the reconstruction images.

We described a new TV-H penalty that balances the properties of TV and Hessian in the reconstruction process. As shown in the two digital simulation experiments, piecewise constant artifacts were observed over the regions with smooth intensity change in the TV reconstruction results, *e.g.*, in the ellipsoid of the modified 3D Shepp-Logan phantom (Fig. 4(c)). The *SSIM* values also demonstrated the failure of the TV penalty in recovering the regions near the octahedral and the ball in the CS phantom (Fig. 3(d), 3(e), 3(f))). These TV reconstruction images were degraded by the piecewise constant regions and lost their original structures. The Hessian penalty suppresses the staircase effect by introducing the second-order derivative into the CBCT reconstruction process [21]. However, the Hessian penalty cannot properly preserve the edge of the line-style and the annular objects of the CS phantom (Fig. 3(g), 3(h), 3(i)), and the edges of the Shepp-Logan phantom (Fig. 6(b)). The proposed TV-H penalty can overcome the staircase effect of the TV penalty and also preserve object edges, as shown in Fig. 3(j), 3(k), 3(l)), Fig. 4(e) and Fig. 5(e).

The adaptive parameter $\alpha $was used to automatically detect edges, gradual transition and uniform regions in the reconstruction image, so that the proposed TV-H penalty could adaptively weigh different penalties in different local regions of the reconstruction images. A similar idea to explore local structure was previously published [39].Wang *et al.* constructed an anisotropic penalty by examining the intensity difference between a voxel and its direct neighbors. The anisotropic penalty in Wang’s method automatically adjusted the weight of the *same penalty* (quadratic penalty) over different local image structures.

In our method, $\alpha $ was automatically updated during the iteration. We observed a significant fluctuation in $\alpha $ at the beginning of the reconstruction process, as shown in Fig. 11(a)-11(c) and Fig. 12(a)-12(c). Noise is primarily responsible for this fluctuation. At the first iteration, $\alpha $was calculated from the FDK result which typically had a high noise level. With increasing numbers of iterations, the noise level in the reconstruction decreased, leading to $\alpha $ with lower fluctuation (Fig. 11(d) and Fig. 12(d)). As the number of iterations increases, $\alpha $ gradually stabilizes. In our experiment, we found that a maximum iteration number of 30 was enough to obtain satisfactory reconstruction.

Most current approaches based on mixed penalties introduce parameters that need extra manual adjustments [25–29]. We only observed one parameter, $\eta $, in the TV-H penalty (See Eq. (5)) and found that its variation did not significantly change the reconstruction performance. We set $\eta $ as 40% of the mean of the FDK reconstructed image gradient magnitude in all experiments. Except for this parameter, the proposed reconstruction algorithm with TV-H did not introduce any other extra parameters. This is a potentially important advantage of our penalty. The additional parameter we needed to manually adjust in our study was the weighting parameter *β*. This parameter is required in a typical inverse problem, and is used to balance the fidelity and the penalty terms in the whole objective function, as in Eq. (2) or Eq. (7). For fair comparison, we manually adjusted this parameter to keep the images reconstructed with TV, Hessian and TV-H at the same noise level.

Bredies *et. al* previously extended TV to TGV [15]. TGV can incorporate the first and higher order derivatives, similarly to our proposed TV-H penalty. However, TGV and TV-H differ in many ways. We mentioned the second-order TGV as an example. The second-order TGV of image $\mu $ can be written as $TG{V}_{\alpha}^{2}\left(\mu \right)=\underset{v\in \Omega}{\mathrm{min}}{\alpha}_{0}{\displaystyle {\int}_{\Omega}\left|\nabla \mu -v\right|}dx+{\alpha}_{1}{\displaystyle {\int}_{\Omega}\left|\epsilon \left(v\right)\right|}dx,$ where $\epsilon \left(v\right)=\frac{1}{2}\left(\nabla v+\nabla {v}^{T}\right)$ is the matrix-valued weak symmetrized derivative, and $\left({\alpha}_{0},{\alpha}_{1}\right)$ are two positive parameters [40]. First, the TGV does not explicitly incorporate the second derivative since the actual minimum $v$ may be located anywhere between 0 and $\nabla \mu $ [40]. In contrast, our proposed TV-H combines the real TV and Hessian in a structure-adaptive way. Second, the first and second order terms in TGV are balanced via weights $\left({\alpha}_{0},{\alpha}_{1}\right)$. TV-H only includes one parameter $\eta $ (see Eq. (5)). We noted that the reconstruction was quite robust to the variation of $\eta $, which was kept unchanged for all experiments in our study.

In this study, the PWLS algorithm (Eq. (2)) was adopted for CBCT reconstruction at a low-dose level, and the noise of the projection data was assumed to follow a Gaussian distribution with zero mean and signal dependent variance. Wang *et al*. studied the noise properties of the calibrated or preprocessed sinogram data in Radon space, and validated the nonlinear mean-variance relationship (*i.e*., Eq. (1)) of the projected data at different mAs levels [31]. The mean–variance relationship precludes the use of Poisson and Gamma distribution for image reconstruction. The authors argued that a PWLS cost function could be a suitable choice, fully utilizing the relationship between the first and the second statistical moments [31]. Nevertheless, the Gaussian distribution is only a rough approximation of the projection noise in CBCT [31, 41]. The reconstruction might benefit from a more sophisticated noise model on the raw projection data (*i.e.*, before logarithm transform) such as Poisson-Gaussian [42, 43] and compound Poisson [44]. In addition, algorithms designed for signal-dependent noise may further improve the reconstruction performance [45]. In this study, the noise covariance matrix $\Sigma $ had a diagonal form, and no correlation was considered among the nearest bins of the CBCT projection. Considering the noise correlation can improve the reconstruction performance [46].

An important problem of statistical iterative reconstruction methods is the long computational time as compared to the FDK method [4]. In our experiments, 2 minutes were needed to complete one iteration to reconstruct CBCT images of 350 × 350 × 16 using a PC with 3.3GHz CPU. Thus, one image with 30 iterations will take at least 1 hour, which can be sped by various methods. As for the converging speed, many effective fast converging algorithms have recently been proposed including the FISTA [47] and the NESTA [6, 48]. In terms of hardware, graphics processing unit (GPU) can be used to reduce the computation time in CBCT reconstruction [12, 49].

## 5. Conclusion

In this study, we proposed the TV-H penalty for CBCT reconstruction, and developed an effective algorithm to minimize the objective function using the majorization-minimization strategy. The TV-H penalty balanced the weight between TV and Hessian in the objective function in a structure-adaptive way. Comparison studies among TV, Hessian and TV-H revealed that the proposed penalty effectively preserves both edges and other regions with smooth intensity transition, while restraining the staircase effect of the TV penalty.

## Appendix A derivative filters

For a three-dimensional image, there are six commonly used second-order derivatives. The voxel index is represented by the triplet $\left(x,y,z\right)$. For a discrete image $\mu $, the filtering operations can be defined as

## Appendix B ${w}_{ji}^{\left(t\right)}$and its update

For voxel $j$, ${N}_{j}$ is the set of neighbor voxels which are at most 2-voxel away from it. From Eqs. (13) and (14), we can easily show that ${w}_{ji}^{\left(t\right)}:=-{\left({R}_{TV-H}^{\left(t\right)}\right)}_{ji}/2,\text{\hspace{0.17em}}i\in {N}_{j},\text{\hspace{0.17em}}i\ne j$. ${w}_{ji}^{\left(t\right)}$ is updated in each iteration. For the proposed TV-H penalty, we defined ${W}_{\left(x,y,z\right)}^{\left(t\right)}$ as

## Appendix C structure adaptive CBCT reconstruction using TV-H penalty

Our task was to estimate the attenuation coefficient distribution map $\mu $ from the projection data$p$ by minimizing the cost function $\Phi \left(\mu ,\alpha \right)$ defined in Eq. (7). To minimize $\Phi \left(\mu ,\alpha \right)$, we used a strategy similar to the alternating minimization algorithm. We noted that $\Phi \left(\mu ,\alpha \right)$ is non-convex, and as a result the reconstruction depends on initialization. In our experiments, $\mu $ was initialized using the FDK reconstruction $FDK\left\{p\right\}$ and $\alpha $ was updated using Eq. (5). Let *K* and *M* be the maximum iteration number for the update and MM approach, respectively, *N* be the image size (the voxel numbers that need to be updated in the Gauss-Seidel method), and ${T}_{1}$ and ${T}_{2}$ be two small real number.

## Initialization

## Iteration

For $k=1\text{\hspace{0.17em}}:\text{\hspace{0.17em}}K$ (**Update step**)

For $t:=1\text{\hspace{0.17em}}:\text{\hspace{0.17em}}M$(**Minimization step**)

For *j*=1: *N* (**Gauss-Seidel**)

$update\left\{{w}_{ji}^{\left(t\right)}\right\}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left(See\text{\hspace{0.17em}}Appendix\text{\hspace{0.17em}}B\right)$

End

If ${{\Vert {\mu}^{\left(t+1\right)}-{\mu}^{\left(t\right)}\Vert}_{2}/\Vert {\mu}^{\left(t\right)}\Vert}_{2}\le {T}_{1}$, Break, End

End

If ${{\Vert {\tilde{\mu}}^{\left(k+1\right)}-{\tilde{\mu}}^{\left(k\right)}\Vert}_{2}/\Vert {\tilde{\mu}}^{\left(k\right)}\Vert}_{2}\le {T}_{2}$, Break, End

End

In our experiment, we set *K=*30. We found that the parameter $\alpha $ should be updated in each MM iteration (*i.e.*, *M*=1) since in this setting we always obtained slightly better reconstruction results with lower objective costs and more satisfactory visual effects.

## Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (NNSFC), under Grant Nos.60971112 and 61375018, and Fundamental Research Funds for the Central Universities, under Grant No. 2012QN086. J. Wang was supported in part by grants from the Cancer Prevention and Research Institute of Texas (RP130109 and RP110562-P2), the National Institute of Biomedical Imaging and Bioengineering (R01 EB020366) and a grant from the American Cancer Society (RSG-13-326-01-CCE). We would like to thank Dr. Damiana Chiavolini for editing the paper.

## References and links

**1. **D. A. Jaffray, J. H. Siewerdsen, J. W. Wong, and A. A. Martinez, “Flat-panel cone-beam computed tomography for image-guided radiation therapy,” Int. J. Radiat. Oncol. Biol. Phys. **53**(5), 1337–1349 (2002). [CrossRef] [PubMed]

**2. **A. Berrington de González, M. Mahesh, K.-P. Kim, M. Bhargavan, R. Lewis, F. Mettler, and C. Land, “Projected cancer risks from computed tomographic scans performed in the United States in 2007,” Arch. Intern. Med. **169**(22), 2071–2077 (2009). [CrossRef] [PubMed]

**3. **N. Wen, H. Guan, R. Hammoud, D. Pradhan, T. Nurushev, S. Li, and B. Movsas, “Dose delivered from Varian’s CBCT to patients receiving IMRT for prostate cancer,” Phys. Med. Biol. **52**(8), 2267–2276 (2007). [CrossRef] [PubMed]

**4. **L. Feldkamp, L. Davis, and J. Kress, “Practical cone-beam algorithm,” J. Opt. Soc. Am. A **1**(6), 612–619 (1984). [CrossRef]

**5. **L. Ouyang, T. Solberg, and J. Wang, “Effects of the penalty on the penalized weighted least-squares image reconstruction for low-dose CBCT,” Phys. Med. Biol. **56**(17), 5535–5552 (2011). [CrossRef] [PubMed]

**6. **K. Choi, J. Wang, L. Zhu, T.-S. Suh, S. Boyd, and L. Xing, “Compressed sensing based cone-beam computed tomography reconstruction with a first-order method,” Med. Phys. **37**(9), 5113–5125 (2010). [CrossRef] [PubMed]

**7. **H. Lee, L. Xing, R. Davidi, R. Li, J. Qian, and R. Lee, “Improved compressed sensing-based cone-beam CT reconstruction using adaptive prior image constraints,” Phys. Med. Biol. **57**(8), 2287–2307 (2012). [CrossRef] [PubMed]

**8. **S. Lefkimmiatis, A. Bourquard, and M. Unser, “Hessian-based regularization for 3-D microscopy image restoration,” in *Biomedical Imaging (ISBI)**,**2012**9th IEEE International Symposium on*, (IEEE, 2012), 1731–1734. [CrossRef]

**9. **J. C. Park, B. Song, J. S. Kim, S. H. Park, H. K. Kim, Z. Liu, T. S. Suh, and W. Y. Song, “Fast compressed sensing-based CBCT reconstruction using Barzilai-Borwein formulation for application to on-line IGRT,” Med. Phys. **39**(3), 1207–1217 (2012). [CrossRef] [PubMed]

**10. **E. Y. Sidky and X. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Phys. Med. Biol. **53**(17), 4777–4807 (2008). [CrossRef] [PubMed]

**11. **E. Y. Sidky, C.-M. Kao, and X. Pan, “Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT,” J. XRay Sci. Technol. **14**, 119–139 (2006).

**12. **X. Jia, Y. Lou, R. Li, W. Y. Song, and S. B. Jiang, “GPU-based fast cone beam CT reconstruction from undersampled and noisy projection data via total variation,” Med. Phys. **37**(4), 1757–1760 (2010). [CrossRef] [PubMed]

**13. **T. Chan, A. Marquina, and P. Mulet, “High-order total variation-based image restoration,” SIAM J. Sci. Comput. **22**(2), 503–516 (2000). [CrossRef]

**14. **J. Yuan, C. Schnörr, and G. Steidl, “Total-variation based piecewise affine regularization,” in *Scale Space and Variational Methods in Computer Vision* (Springer, 2009), pp. 552–564.

**15. **K. Bredies, K. Kunisch, and T. Pock, “Total generalized variation,” SIAM J. Imaging Sci. **3**(3), 492–526 (2010). [CrossRef]

**16. **Y.-L. You and M. Kaveh, “Fourth-order partial differential equations for noise removal,” IEEE Trans. Image Process. **9**(10), 1723–1730 (2000). [CrossRef] [PubMed]

**17. **Y. Hu and M. Jacob, “Higher degree total variation (HDTV) regularization for image recovery,” IEEE Trans. Image Process. **21**(5), 2559–2571 (2012). [CrossRef] [PubMed]

**18. **S. Lefkimmiatis, J. P. Ward, and M. Unser, “Hessian Schatten-norm regularization for linear inverse problems,” IEEE Trans. Image Process. **22**(5), 1873–1888 (2013). [CrossRef] [PubMed]

**19. **S. Lefkimmiatis, A. Bourquard, and M. Unser, “Hessian-based norm regularization for image restoration with biomedical applications,” IEEE Trans. Image Process. **21**(3), 983–995 (2012). [CrossRef] [PubMed]

**20. **J. Liu, T.-Z. Huang, I. W. Selesnick, X.-G. Lv, and P.-Y. Chen, “Image restoration using total variation with overlapping group sparsity,” Inf. Sci. **295**, 232–246 (2015). [CrossRef]

**21. **T. Sun, N. Sun, J. Wang, and S. Tan, “Iterative CBCT reconstruction using Hessian penalty,” Phys. Med. Biol. **60**(5), 1965–1987 (2015). [CrossRef] [PubMed]

**22. **K. Papafitsoros and C.-B. Schönlieb, “A combined first and second order variational approach for image reconstruction,” J. Math. Imaging Vis. **48**(2), 308–338 (2014). [CrossRef]

**23. **G. Steidl, “Combined first and second order variational approaches for image processing,” Jahresber. Dtsch. Math. Ver. **117**(2), 133–160 (2015). [CrossRef]

**24. **A. Chambolle and P.-L. Lions, “Image recovery via total variation minimization and related problems,” Numer. Math. **76**(2), 167–188 (1997). [CrossRef]

**25. **M. Lysaker and X.-C. Tai, “Iterative image restoration combining total variation minimization and a second-order functional,” Int. J. Comput. Vis. **66**(1), 5–18 (2006). [CrossRef]

**26. **S. Do, W. C. Karl, M. K. Kalra, T. J. Brady, and H. Pien, “A variational approach for reconstructing low dose images in clinical helical CT,” in *Biomedical Imaging: From Nano to Macro, 2010 IEEE International Symposium on*, (IEEE, 2010), 784–787. [CrossRef]

**27. **J. Dutta, S. Ahn, C. Li, S. R. Cherry, and R. M. Leahy, “Joint L1 and total variation regularization for fluorescence molecular tomography,” Phys. Med. Biol. **57**(6), 1459–1476 (2012). [CrossRef] [PubMed]

**28. **J. M. Bioucas-Dias and M. A. Figueiredo, “An iterative algorithm for linear inverse problems with compound regularizers,” in *Image Processing**,**2008**. **15th IEEE International Conference on*, (IEEE, 2008), 685–688. [CrossRef]

**29. **F. Chen, Y. Jiao, G. Ma, and Q. Qin, “Hybrid regularization image deblurring in the presence of impulsive noise,” J. Vis. Commun. Image Represent. **24**(8), 1349–1359 (2013). [CrossRef]

**30. **D. R. Hunter and K. Lange, “A tutorial on MM algorithms,” Am. Stat. **58**(1), 30–37 (2004). [CrossRef]

**31. **J. Wang, H. Lu, Z. Liang, D. Eremina, G. Zhang, S. Wang, J. Chen, and J. Manzione, “An experimental study on the noise properties of x-ray CT sinogram data in Radon space,” Phys. Med. Biol. **53**(12), 3327–3341 (2008). [CrossRef] [PubMed]

**32. **J. Wang, T. Li, Z. Liang, and L. Xing, “Dose reduction for kilovotage cone-beam computed tomography in radiation therapy,” Phys. Med. Biol. **53**(11), 2897–2909 (2008). [CrossRef] [PubMed]

**33. **Z. Tian, X. Jia, K. Yuan, T. Pan, and S. B. Jiang, “Low-dose CT reconstruction via edge-preserving total variation regularization,” Phys. Med. Biol. **56**(18), 5949–5967 (2011). [CrossRef] [PubMed]

**34. **Y. Long, J. A. Fessler, and J. M. Balter, “3D forward and back-projection for X-ray CT using separable footprints,” IEEE Trans. Med. Imaging **29**(11), 1839–1850 (2010). [CrossRef] [PubMed]

**35. **W. Dong, G. Shi, and X. Li, “Nonlocal image restoration with bilateral variance estimation: a low-rank approach,” IEEE Trans. Image Process. **22**(2), 700–711 (2013). [CrossRef] [PubMed]

**36. **D. Stsepankou, A. Arns, S. K. Ng, P. Zygmanski, and J. Hesser, “Evaluation of robustness of maximum likelihood cone-beam CT reconstruction with total variation regularization,” Phys. Med. Biol. **57**(19), 5955–5970 (2012). [CrossRef] [PubMed]

**37. **Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process. **13**(4), 600–612 (2004). [CrossRef] [PubMed]

**38. **J. F. Pambrun and R. Noumeir, “Limitations of the SSIM quality metric in the context of diagnostic imaging,” in *Image Processing (ICIP)**,**2015**IEEE International Conference on*, 2015, 2960–2963. [CrossRef]

**39. **J. Wang, T. Li, and L. Xing, “Iterative image reconstruction for CBCT using edge-preserving prior,” Med. Phys. **36**(1), 252–260 (2009). [CrossRef] [PubMed]

**40. **K. Bredies and T. Valkonen, “Inverse problems with second-order total generalized variation constraints,” Proceedings of Sampta (2011).

**41. **B. R. Whiting, P. Massoumzadeh, O. A. Earl, J. A. O’Sullivan, D. L. Snyder, and J. F. Williamson, “Properties of preprocessed sinogram data in x-ray computed tomography,” Med. Phys. **33**(9), 3290–3303 (2006). [CrossRef] [PubMed]

**42. **E. Chouzenoux, A. Jezierska, J. Pesquet, and H. Talbot, “A convex approach for image restoration with exact Poisson-Gaussian likelihood,” SIAM J. Imaging Sci. **8**(4), 2662–2682 (2015). [CrossRef]

**43. **J. Li, Z. Shen, R. Yin, and X. Zhang, “A reweighted *L*^{2} method for image restoration with Poisson and mixed Poisson-Gaussian noise,” Inverse Probl. Imaging (Springfield) **9**(3), 875–894 (2015). [CrossRef]

**44. **B. R. Whiting, P. Massoumzadeh, O. A. Earl, J. A. O’Sullivan, D. L. Snyder, and J. F. Williamson, “Properties of preprocessed sinogram data in x-ray computed tomography,” Med. Phys. **33**(9), 3290–3303 (2006). [CrossRef] [PubMed]

**45. **A. Repetti, E. Chouzenoux, and J. C. Pesquet, “A penalized weighted least squares approach for restoring data corrupted with signal-dependent noise,” in *Signal Processing Conference (EUSIPCO)**,**2012**Proceedings of the 20th European*, 2012, 1553–1557.

**46. **H. Zhang, L. Ouyang, J. Ma, J. Huang, W. Chen, and J. Wang, “Noise correlation in CBCT projection data and its application for noise reduction in low-dose CBCT,” Med. Phys. **41**(3), 031906 (2014). [CrossRef] [PubMed]

**47. **A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. **2**(1), 183–202 (2009). [CrossRef]

**48. **S. Becker, J. Bobin, and E. J. Candès, “NESTA: a fast and accurate first-order method for sparse recovery,” SIAM J. Imaging Sci. **4**(1), 1–39 (2011). [CrossRef]

**49. **P. B. Noël, A. M. Walczak, J. Xu, J. J. Corso, K. R. Hoffmann, and S. Schafer, “GPU-based cone beam computed tomography,” Comput. Methods Programs Biomed. **98**(3), 271–277 (2010). [CrossRef] [PubMed]