## Abstract

We developed a mathematical scheme that allows us to improve retrieval
products obtained from the inversion of multiwavelength Raman/HSRL
lidar data, commonly dubbed “$3\text{\hspace{0.17em}}\text{backscatter}+2\text{\hspace{0.17em}}\text{extinction}$” ($3\beta +2\alpha $) lidar. This scheme works
independently of the automated inversion method that is currently
being developed in the framework of the Aerosol-Cloud-Ecosystem (ACE)
mission and which is successfully applied since 2012 [Atmos. Meas. Tech. **7**, 3487 (2014) [CrossRef] ;
“Comparison of aerosol optical and
microphysical retrievals from HSRL-2 and in-situ measurements
during DISCOVER-AQ 2013 (California and
Texas),” in International Laser
Radar Conference,
July 2015, paper PS-C1-14]
to data collected with the first airborne multiwavelength
$3\beta +2\alpha $ high spectral resolution lidar (HSRL)
developed at NASA Langley Research Center. The mathematical scheme
uses gradient correlation relationships we presented in part 1 of our
study [Appl.
Opt. **55**, 9839
(2016) [CrossRef] ] in which we
investigated lidar data products and particle microphysical parameters
from one and the same set of optical lidar profiles. For an accurate
assessment of regression coefficients that are used in the correlation
relationships we specially designed the *proximate
analysis* method that allows us to search for a
first-estimate solution space of particle microphysical parameters on
the basis of a look-up table. The scheme works for any shape of
particle size distribution. Simulation studies demonstrate a
significant stabilization of the various solution spaces of the
investigated aerosol microphysical data products if we apply this
gradient correlation method in our traditional regularization
technique. Surface-area concentration can be estimated with an
uncertainty that is not worse than the measurement error of the
underlying extinction coefficients. The retrieval uncertainty of the
effective radius is as large as $\pm 0.07\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ for fine mode particles and
approximately 100% for particle size distributions composed of fine
(submicron) and coarse (supermicron) mode particles. The volume
concentration uncertainty is defined by the sum of the uncertainty of
surface-area concentration and the uncertainty of the effective
radius. The uncertainty of number concentration is better than 100%
for any radius domain between 0.03 and 10 μm. For monomodal
PSDs, the uncertainties of the real and imaginary parts of the CRI can
be restricted to $\pm 0.1$ and $\pm 0.01$ on the domains [1.3; 1.8] and [0;
0.1], respectively.

Published by The Optical Society under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

## Corrections

5 December 2016: Corrections were made to Eq. (7) and to the Acknowledgment on 5 December 2016.

21 March 2017: A correction was made to the Funding section on 21 March 2017.

## 1. INTRODUCTION

Our recent publication [1] demonstrates significant progress in the use of multiwavelength lidar systems that measure backscatter coefficients at wavelength 355, 532, and 1064 nm and extinction coefficients at 355 and 532 nm, commonly dubbed “$3\beta +2\alpha $.” Such systems require software capable of processing the optical aerosol profiles into profiles of particle microphysical parameters. In this context, we develop an unsupervisedautomated inversion method as part of a project study within NASA’s Aerosol-Cloud-Ecosystems (ACE) mission. This inversion method has been successfully applied since 2012 to data collected with the first airborne multiwavelength $3\beta +2\alpha $ High-Spectral-Resolution Lidar (HSRL-2) developed at NASA Langley Research Center [1,2].

First results retrieved with this automated inversion are encouraging
because they demonstrate good agreement to *in situ* data
taken aboard an aircraft that flew next to an aircraft that carried HSRL-2
[1,2]. An in-depth analysis of the first results from
this automated inversion shows that the main source of disagreement
between the results from the automated inversion and the *in
situ* data are related to the retrieved number and
surface-area concentrations. In many cases, these outliers can be
relatively easily identified by visual inspection of the retrieved
profiles of the microphysical particle parameters.

In part 1 of our series of papers [3]
we presented first results on a mathematical scheme that uses
*gradient correlation relationships* among different
particle microphysical properties of particle size distributions. We
analyzed optical profiles from lidar for our investigations of particle
microphysical properties [3]. We
investigated how we can use the relationships found in that study to
identify outliers of particle microphysical properties that result from
the inversion of optical lidar data with the automated, unsupervised
inversion algorithm mode.

We developed what we denote as the *proximate analysis
method*. This method allows us not only to assess more
accurately the regression coefficients that we found from our study of
correlation relationships, see Ref. [3]. This method also allows us to search for a first-estimate
solution space of particle microphysical parameters on the basis of a
look-up table (LUT). The basis of that LUT is a synthetic optical data
(SOD) bank that we generated and that we introduced in Section 3.1 in
Ref. [3]. The SOD includes 63869
$3\beta +2\alpha $ data sets computed from
logarithmic-normal size distributions that have different mean radii, mean
widths, and complex refractive indices.

We describe in Section 2 the gradient correlation method. We explain how correlation relationships are used as a constraint. In Section 3 we show numerical examples with synthetic optical data and compare retrieval results that we obtained with different approaches. Section 4 summarizes our results.

## 2. METHODOLOGY

#### A. Regression Equations for Gradient Correlation Method

Present-day, state-of-the-art multiwavelength Raman and HSRL multiwavelength lidar measures backscatter coefficients ($\beta $) at the wavelengths 355, 532, and 1064 nm, and extinction coefficients ($\alpha $) at 355 and 532 nm. This combination of optical data (OD) is usually denoted as $3\beta +2\alpha $, or simply $3+2$. In Ref. [3] we showed that profiles of optical data and particle microphysical parameters (PMP) are interdependent. We find correlation relationships that describe this interdependence. The correlations have a comparably high degree of confidence from the mathematical point of view. The correlation coefficients we obtain from regression analysis are close to ${R}^{2}=1$ for the most important intensive parameters $p$ (IP) we are interested in. We find that surface-area concentration ($p=s$), the ratio of volume concentration to effective radius ($p=v/{r}_{\text{eff}}$) and the product $p=n({r}_{\text{mean}}^{2}+{\sigma}^{2})$ are linearly correlated with the extinction coefficient $\alpha (\lambda )$ measured at wavelength $\lambda =355\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$, i.e.,

The parameters $n,{r}_{\text{mean}}$, and $\sigma $ are number concentration, mean radius, and standard deviation, respectively, of a particle size distribution $f(r)$.The regression coefficients (RCs) ${a}_{p}$ and ${b}_{p}$ in the correlation relationships [Eq. (1a)] fulfill the conditions

The linear correlations hold true for effective radius ${r}_{\text{eff}}$ versus the extinction-related Ångström exponent ${\mathit{\xe5}}_{\alpha}$ (EAE)

where the RCs fulfillIn Ref. [3] we also investigated if there are more correlation relationships for other PMPs. We find linear correlations of mean radius $p={r}_{\text{mean}}$ and standard deviation $p=\sigma $ with effective radius ${r}_{\text{eff}}$

where the RCs fulfill The correlation coefficient ${R}^{2}$ is close to 1 in Eq. (5a) if aerosol properties do not change significantly between successive height bins of an aerosol profile. In terms of lidar data products, it means profiles of the lidar ratio $\alpha /\beta $ and the EAE at these height bins do not oscillate, and their variations are within $\pm 25\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{sr}$ and $\pm 0.25$, respectively.We note that the linear correlations of Eqs. (3) and (5a) are valid both for mono- and bi-modal PSDs. However, in contrast to the conditions of Eq. (2) the conditions in Eqs. (4) and (6) were obtained only for monomodal PSDs and change significantly in the case of bimodal PSDs.

Finally, we found in Ref. [3] that for particles of any mode fraction (fine versus coarse mode) of the PSD the complex refractive index $m={m}_{\mathrm{R}}-i{m}_{\mathrm{I}}$ (CRI) is uniquely defined by the extinction‐ (${\mathit{\xe5}}_{\alpha}$) and backscatter‐ (${\mathit{\xe5}}_{\beta}$) (BAE) related Ångström exponents, the lidar ratios $\alpha /\beta $ (LR) at two wavelengths, and the effective radius ${r}_{\text{eff}}$. We can use that result in the case of monomodal PSDs for locating the CRI in our LUT.

To locate the CRI, we collect the LUT elements that are closest to the data products ${\mathit{\xe5}}_{\alpha}$, ${\mathit{\xe5}}_{\beta}$, and $\alpha (\lambda )/\beta (\lambda )$ which can be measured with lidar, and ${r}_{\text{eff}}$ which can be retrieved with our regularization algorithm.

As a measure of closeness, we use in Ref. [3] the discrepancy

Equation (7), six
correlation relationships, described by Eqs. (1), (3), and (5) are linked to eight
important PMPs (${r}_{\text{eff}}$, ${r}_{\text{mean}}$, $\sigma $, $n$, $s$, $v$, ${m}_{\mathrm{R}}$, and ${m}_{\mathrm{I}}$) which are commonly used for the
description of atmospheric aerosol particles. They can be used
*a priori* for constraining the solution space of
aerosol microphysical properties in our underlying inversion problem
(see Eq. (1) in Ref. [3]).

#### B. Constraints of the Solution Space

The lidar measurements deliver profiles of optical data. Thus, we want to derive the microphysical particle parameters not only for one but for several height layers. We therefore rewrite Eq. (1) in Ref. [3] as

The superscript $l$ indicates the number of the height bins we use to describe the OD profiles ${g}^{(l)}(\lambda )$ as well as the parameters ${r}_{\mathrm{min}}^{(l)}$, ${r}_{\mathrm{max}}^{(l)}$, ${m}^{(l)}$, and the function ${f}^{(l)}(r)$ in each height bin $l$. The total number of height bins is ${\mathrm{N}}_{\mathrm{L}}$. The kernel function $K$ is known and its definition can be found in Ref. [4] for the case of spherical particle geometry.

Equation (8) can be solved, for example, with regularization which allows us to find a solution space ${F}^{(l)}=\{{f}_{1}^{(l)}(r),\dots ,{f}_{{\text{N}}_{\mathrm{s}\mathrm{o}\mathrm{l}}}^{(l)}(r)\}$ for each height bin $l$ [5–8] where ${\mathrm{N}}_{\mathrm{s}\mathrm{o}\mathrm{l}}$ is the total number of individual solutions. If we know ${F}^{(l)}$ we can also obtain the solution spaces for any bulk parameter ${P}^{(l)}=\{{p}_{1}^{(l)},\dots ,{p}_{{\text{N}}_{\mathrm{s}\mathrm{o}\mathrm{l}}}^{(l)}\}$ for each height bin $l$, where ${P}^{(l)}={R}_{\text{mean}}^{(l)}$, ${R}_{\text{eff}}^{(l)}$, ${\mathrm{\Sigma}}^{(l)}$, ${N}^{(l)}$, ${S}^{(l)}$, and ${V}^{(l)}$ denote the solution space of ${r}_{\text{mean}}$, ${r}_{\text{eff}}$, $\sigma $, $n$, $s$, and $v$, respectively, see Ref. [5]. The collection of the solution spaces for all ${\mathrm{N}}_{\mathrm{L}}$ heights forms the profiles of the PMPs. The final solution ${p}_{\mathrm{av}}$ at height $l$ is defined by averaging all individual parameters ${p}_{i}^{(l)}$, $i\le {\mathrm{N}}_{\text{sol}}$, over a prescribed interval of the discrepancy range $[{\rho}_{\mathrm{min}}^{(l)},{\rho}_{\mathrm{av}}^{(l)}]$ [6,8].

To find physically meaningful solutions we need to apply different constraints, as for example, the discrepancy range $[{\rho}_{\mathrm{min}}^{(l)},{\rho}_{\mathrm{av}}^{(l)}]$, the radius range $[{r}_{\mathrm{min}}^{(l)},{r}_{\mathrm{max}}^{(l)}]$, and the complex refractive index ${m}^{(l)}$. In that way we can exclude individual solutions ${p}_{i}^{(l)}$ which for example deviate too much from the average value ${p}_{\mathrm{av}}$. In that case, we insert a threshold $\delta p$ that allows us to take into consideration only those individual solutions that fulfill the condition

In our previous strategy of data inversion we used this averaging procedure for each solution space (in each height bin) independently of each other, i.e., we did not take into consideration that results of PMPs of successive height bins may be correlated to each other.

We developed a mathematical framework that allows us to consider the
correlation in a very specific way that is described by Kolgotin and
Müller [8] and
Müller *et al.* [9]. However, that method is complex, time consuming, and
will require more effort to understand how it can be efficiently used
for the automated, unsupervised inversion method we are developing for
HSRL-2. In the method presented here we provide an elegant, efficient
way of using the profile information from lidar without including
complications from the mathematical point of view as discussed by
Kolgotin and Müller [8].

The profiles of the ODs contain information regarding the variation of the profiles of the PMPs. In other words, we can predict the PMP behavior if we know the law(s) (or correlations) that describe the interdependence of the OD and the PMPs.

In that regard, the linear correlation as described by Eq. (3) can be used as an extra constraint if we want to find the solution space for ${R}_{\text{eff}}^{(l)}$. In fact, if we measure ${\mathit{\xe5}}_{\alpha}^{(l)}$, the correlation presented in Eq. (3) allows us to estimate ${r}_{\text{eff}}^{(l)*}$ according to

If we introduce the threshold $\delta {r}_{\text{eff}}$ we can rewrite Eq. (9) as The linear correlations as described by Eq. (1) can be used as extra constraints to identify the solution spaces ${R}_{\text{mean}}^{(l)}\cup {\mathrm{\Sigma}}^{(l)}\cup {N}^{(l)}$, ${V}^{(l)}\cup {R}_{\text{eff}}^{(l)}$, and ${S}^{(l)}$ as well; the symbol $\cup $ means the space conjunction. Again, introducing the threshold $\delta p$ the condition in Eq. (11) can be rewritten in a more general form as## 3. NUMERICAL SIMULATIONS

#### A. Proximate Analysis of Particle Parameters

The conclusions of the previous section naturally lead us to the step in which we want to develop a practical method that allows us to identify the parameters of fine-mode particles even if the investigated PSDs may contain coarse mode particles (bimodal size distribution). This situation until now has not been tested in simulation studies carried out with the automated software.

HSRL-2 and multiwavelength Raman lidars can measure the extinction coefficients $\alpha (\lambda )$ at $\lambda =355$ and 532 nm. As a result, the EAE is available from Eq. (19) in [3]. From our own measurements and literature we know that the EAE often varies between ${\mathit{\xe5}}_{\alpha}=1$ and 2 except in cases of mineral dust, volcanic ash, and aged smoke. These aerosol types are characterized by comparably large particles. We find the following results:

If we assume that the standard deviation is not too large so that $\sigma \ll {r}_{\text{mean}}/\surd 3$, see Eq. (29) in Ref. [3], we find that ${r}_{\text{eff}}\approx {r}_{\text{mean}}$. For example, if we select from our LUT optical data such that $\mathit{\xe5}>1$ we find ${r}_{\text{mean}}<0.25\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$. That means $\sigma $ should be less than 0.15. In contrast, the maximal value is ${\sigma}_{\mathrm{max}}<0.09$ for all elements in the LUT bank if $\mathit{\xe5}>1$. In that case, we obtain the number concentration from Eq. (15) in Ref. [3] according to

Equations (13)–(16) allow us to find the most important PSD parameters very easily and fast, and their uncertainties can be estimated if we use as threshold the value $\delta {r}_{\text{eff}}=\pm 0.07\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ and Eq. (2a) at $p=s$. However, in this approach we must be sure that we deal with monomodal PSDs. Otherwise, Eq. (3) does not work with the condition $\delta {r}_{\text{eff}}\le 0.07\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$, see for example the stars in Fig. 1(a) in Ref. [3].

For that reason we must adapt Eqs. (13)–(16) such that they hold true in the case of bimodal PSDs. To achieve this goal, we split the optical coefficients into the fine-mode contribution ${g}_{\mathrm{f}}(\lambda )$ and the coarse-mode contribution ${g}_{\mathrm{c}}(\lambda )$, i.e., we write

Apparently, the ratio of the fine mode to the coarse mode, expressed in terms of the optical coefficients is

If we use the fine-mode and coarse-mode extinction coefficients measured at 532 nm, i.e., ${\alpha}_{\mathrm{f}}(532)$ and ${\alpha}_{\mathrm{c}}(532)$ we can find the ratio of the extinction coefficients at 532 and 355 nm, i.e.,

If we express ${\alpha}_{\mathrm{f}}(355)$ in Eq. (18) in terms of ${\phi}_{\alpha (355)}$ and ${\alpha}_{\mathrm{c}}(355)$ we obtain the ratio of the extinction coefficients at 532 and 355 nm of the fine mode fractions, i.e.,

As we discussed in Ref. [3], we find ${d}_{\mathrm{c}}\in [1;1.07]$ for large particles whose effective radius is larger than 1.1 μm.

According to the definition in Eq. (19) in Ref. [3], we can write for the EAE which describes the fine mode

Now we can find all microphysical parameters of the fine mode fraction of a bimodal particle size distribution. In fact, in view of Eqs. (13)–(15) the effective radius, surface-area, and volume concentrations for the fine mode are

In the case of a fine-mode PSD, the inequality ${r}_{\text{eff}}<3{r}_{\text{mean}}$ works quite well. Then the maximum value of the number concentration can be obtained from Eq. (15) in Ref. [3] and the inequality [Eq. (31b)] presented in Ref. [3] is

The minimum level can be obtained from Eq. (15) in Ref. [3], and the inequality shown in Eq. (31c) in Ref. [3] becomes

According to these equations, the fine mode fraction varies from ${\phi}_{\alpha (355)}=1$ to some minimal value ${\phi}_{\alpha (355)}={\phi}_{\alpha (355)\mathrm{min}}\approx 0.5$ at which ${r}_{\text{eff},\mathrm{f}}\le 0$ and the results become unphysical.

If ${\phi}_{\alpha (355)}=1$, we assume that the particles are distributed according to a monomodal distribution law. On the contrary, if ${\phi}_{\alpha (355)}<1$ there is a pronounced coarse mode.

If we want to estimate the CRI of the fine mode we need to find at least three other fractions ${\phi}_{\beta (355)}$, ${\phi}_{\alpha (532)}$, and ${\phi}_{\beta (532)}$, and extract optical data spectra ${\alpha}_{\mathrm{f}}(\lambda )$ and ${\beta}_{\mathrm{f}}(\lambda )$ which describe the fraction of small particles, as we discussed in Section 3.2 in Ref. [3]. We find all fractions more or less accurately if we take into account our LUT. If we predefine ${\phi}_{\alpha (355)}$ and ${d}_{\mathrm{c}}$ in Eq. (23) and if we consider that ${\alpha}_{\mathrm{c}}=(1-{\phi}_{\alpha})\alpha $ we automatically find from Eq. (22)

The four equations [Eq. (23)] and [Eqs. (26)–(28)] describe the four unknown fractions and thus ensure the unique definition of ${\phi}_{\alpha (355)}$, ${\phi}_{\beta (355)}$, ${\phi}_{\alpha (532)}$, and ${\phi}_{\beta (532)}$ for any predefined values of ${\phi}_{\alpha (355)}$ and ${d}_{\mathrm{c}}$. The parameter ${d}_{\mathrm{c}}$ is quite strictly constrained by the interval [1; 1.07], whereas the fraction ${\phi}_{\alpha (355)}$ can vary between 0.4 and 1. In spite of the fact that we apparently have the constraints

These constraints are still not sufficient for the unambiguous estimation of the unknown fractions.

We suggest to test the quality of the unknown fractions by comparing the derived LRs, BAEs, and EAE of the coarse mode, which is described by ${g}_{\mathrm{c}}=(1-{\phi}_{g})g$, to the respective parameters LRs, BAEs, and EAE of our LUT. We modify the discrepancy expressed by Eq. (7) to

We can test this approach in the case of the bimodal PSDs considered in Section 3.1 in Ref. [3]. The test can be easily carried out, for example with the help of a table written in Excel to make the calculations of the formulas shown in Eqs. (23) and (24) faster. The example is presented in Table 1 [see also Fig. 1, asterisks].

Table 1 contains the columns $l=1,\dots $, ${\mathrm{N}}_{\mathrm{L}}=6$ which describe different bimodal PSDs or, in other words, height bins that describe a profile of optical data taken by lidar. The actual fraction ${\phi}_{\alpha (355)}$ of the fine mode particles decreases with height from 1 to 0.49. According to Eq. (17), this fraction describes the fine-mode fraction in terms of the extinction coefficient at 355 nm. If we use Eq. (7) in Ref. [3], we see that in fact we describe the fine-mode fraction of the surface-area concentration. All actual (true) microphysical parameters of the fine mode of the PSDs are fixed. We show them just for reference.

The profile of the synthetic extinction coefficient $\alpha (355)$ increases with height from 0.093 to $0.190\text{\hspace{0.17em}\hspace{0.17em}}{\mathrm{Mm}}^{-1}$. The EAE decreases from 1.64 to 0.61 because the contribution of coarse mode particles increases with height. We use these data as input (see green symbols in Table 1) for the calculation of the particle microphysical parameters according to Eq. (24).

The first step of our computations is to estimate the fine mode fraction ${\phi}_{\alpha (355),\text{fit}}$ by means of a fitting procedure (see red numbers in Table 1). Let us assume there is no coarse mode at the lowest height bin $l=1$, i.e., ${\phi}_{\alpha (355),\text{fit}}=1$. In that case, the EAE of the fine mode can be found from Eq. (23). Table 1 shows that ${\mathit{\xe5}}_{\alpha ,\mathrm{f}}^{(1)}=1.64$ which coincides with the actual value ${\mathit{\xe5}}_{\alpha}^{(1)}$. If, in addition, we assume that the fine-mode effective radius does not change with height we can select ${\phi}_{\alpha (355),\text{fit}}^{(l)}$ in Eq. (23), which leads to ${\mathit{\xe5}}_{\alpha ,\mathrm{f}}^{(l)}\approx 1.64$ for $l=2,\dots ,{\mathrm{N}}_{\mathrm{L}}$. The result is that the numbers ${\phi}_{\alpha (355),\text{fit}}^{(l)}$ reproduce the true values ${\phi}_{\alpha (355)}^{(l)}$ rather well.

We emphasize that the effective radius as determined by Eq. (24a) is equal to $\sim 0.05\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ if we significantly decrease ${\phi}_{\alpha (355),\text{fit}}$, to 0.77, i.e., we assume that the coarse mode exists at $l=1$. This value of the effective radius is very small. Given the available measurement wavelengths, such a small value cannot be retrieved in a reliable way with this automated inversion software. That means the fine mode fraction ${\phi}_{\alpha (355),\text{fit}}$, cannot be varied over a wide range unless we want to produce unreasonable values of the microphysical particle parameters.

After we fit the profile ${\phi}_{\alpha (355),\text{fit}}^{(l)}$ according to the condition ${\mathit{\xe5}}_{\mathrm{f}}^{(l)}\approx $ const all microphysical parameters of the fine mode of the investigated PSD can be calculated with the help of the formulas given in Eq. (24). As seen from Table 1 (red symbols), the parameters do not change with height, and the values are close to the true values. We find a maximum relative error of 30% for volume concentration.

If we take into account the threshold $\delta {r}_{\text{eff}}=\pm 0.07\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ for effective radius we can estimate the uncertainties of all parameters of the fine mode particles. In that case the maximum level of uncertainty is 0.2 μm for effective radius, and it is ${v}_{\mathrm{f},\mathrm{max}}=0.01\text{\hspace{0.17em}\hspace{0.17em}}{\mathrm{\mu m}}^{3}\text{\hspace{0.17em}}{\mathrm{cm}}^{-3}$ for volume concentration. If the PSD shows a fine mode of particles, we can estimate the maximum level of the number concentration from Eq. (24d), i.e., ${n}_{\mathrm{f},\mathrm{max}}=1.5\text{\hspace{0.17em}\hspace{0.17em}}{\mathrm{cm}}^{-3}$.

If we repeat that approach with the assumption that the coarse mode is absent, i.e., ${\phi}_{\alpha (355),\text{fit}}^{(l)}=1$ we obtain a similar result for layers $l=1,\dots ,5$ within the uncertainties $\delta {p}^{(1)}$. With regard to layer $l=6$, the surface-area and volume concentrations are overestimated significantly. We stress that we do not make any assumption about monomodality of the PSD if the BAE is very close to 0 or even negative, and the EAE is above 0.5 simultaneously.

Let us estimate the CRI by using the correlation relationships. Since we assume that the PSD is monomodal at $l=1$, i.e., ${g}_{\mathrm{f}}^{(1)}(\lambda )={g}^{(1)}(\lambda )$ and ${r}_{\text{eff},\mathrm{f}}^{(1)}={r}_{\text{eff}}^{(1)}$, we can calculate the lidar data products BAEs, EAE, and LRs at the measurement wavelengths, and the discrepancy ${\rho}_{\mathrm{LUT}}$, see Eq. (7). The minimum discrepancy ${\rho}_{\mathrm{LUT},\mathrm{min}}\approx 2.8\%$ is obtained from the LUT for the following set of values:

- – ${\mathit{\xe5}}_{\alpha ,\mathrm{f}}=1.66$, ${\mathit{\xe5}}_{\beta (355),\beta (532),\mathrm{f}}=1.45$, ${\mathit{\xe5}}_{\beta (532),\beta (1064),\mathrm{f}}=1.04$,
- – lidar ratios 76 sr and 70 sr at $\lambda =355$ and 532 nm, respectively,
- – ${r}_{\text{eff}}=0.12\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$, and
- – $m=1.7-i0.025$.

The interval $[{\rho}_{\mathrm{LUT},\mathrm{min}},{\rho}_{\mathrm{LUT}}]=[2.8\%;5\%]$ includes 12 sets of parameters for which effective radius ${r}_{\text{eff}}=0.12\u20130.15\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$, the real part of the CRI ${m}_{\mathrm{R}}=1.425\u20131.7$ and the imaginary part ${m}_{\mathrm{I}}=0.002\u20130.025$. We note that the true effective radius (almost 0.15 μm) lies between ${r}_{\text{eff}}=0.13$ and 0.15 μm. For that range of effective radii, we find $m=1.475-i0.005$ and $1.425-i0.003$, respectively. However, we see that quite small variations of the effective radius result in a wide variation of the CRI.

We can use this approach for all other height bins $l=2,\dots ,6$ under the assumption that the PSD is monomodal. Since the uncertainty $\delta {r}_{\text{eff}}$ does not decrease with height, we do not expect an improvement of the estimated values of the CRI at $l=2,\dots ,6$.

Now we consider the general case in which we do not make any assumption about the fraction ${\phi}_{\alpha (355)}$. For that purpose, we analyze the data at height bin $l=6$. As we discussed before, the SOD bank does not contain an optical data set that is consistent with ${g}^{(6)}(\lambda )$. Therefore, we use our idea to extract the spectrum ${g}_{\mathrm{f}}^{(6)}(\lambda )$ of the fine mode PSD with the help of the fractions ${\phi}_{g(\lambda )}$. We consider 13 initial values for the fraction ${\phi}_{\alpha (355)}$ from 0.4 up to 1 with step size 0.05. For each of these 13 initial values we find from Eqs. (23) and (26)–(29) the 13 discrepancies which are defined by Eq. (31).

The minimum discrepancy ${\rho}_{\mathrm{LUT},\mathrm{c}}=6\%$ is obtained for ${\phi}_{\alpha (355)}=0.5$. The other discrepancies are as large as ${\rho}_{\mathrm{LUT},\mathrm{c}}=21\%$ if ${\phi}_{\alpha (355)}$ increases to 0.9 and as large as ${\rho}_{\mathrm{LUT},\mathrm{c}}=10\%$ if ${\phi}_{\alpha (355)}$ decreases to 0.4 (see Table 2). For the fraction values ${\phi}_{\alpha (355)}=0.95$ and 1, there are either no solutions that fulfill Eq. (30) or there are no solutions that have a low discrepancy. We note that ${\rho}_{\mathrm{LUT},\mathrm{c}}=4\%$ for the true value ${\phi}_{\alpha (355)}=0.49$. The results in Table 2 are obtained at ${d}_{\mathrm{c}}\approx 1.03$ which is about in the center of the interval [1; 1.07]. Other ${d}_{\mathrm{c}}$‐values increase all discrepancies ${\rho}_{\mathrm{LUT},\mathrm{c}}({\phi}_{\alpha (355)})$ similarly, but they do not change the trend for ${d}_{\mathrm{c}}\approx 1.03$.

We analyzed the fractions ${\phi}_{g(\lambda )}$ we found in this way in the vicinity of the minimum discrepancy, i.e., from ${\rho}_{\mathrm{LUT},\mathrm{c}}=4\%$ to 10% (see green lines in Table 2, last column). This vicinity covers the range of values from 0.4 to 0.65 for ${\phi}_{\alpha (355)}$ and from 0.04 to 0.11 for ${\phi}_{\beta (1064)}$ (see Table 2). At the point ${\phi}_{\alpha (355)}=0.49$, we retrieve ${\phi}_{\beta (355)}=0.45$, whereas the true value is ${\phi}_{\beta (355)}=0.35$ (see Table 1). The uncertainty is 29% at the point ${\phi}_{\alpha (355)}=0.49$, which is maximal compared with all other fractions (see Table 1 for reference). If we average the fractions ${\phi}_{g(\lambda )}$ in the vicinity ${\rho}_{\mathrm{LUT},\mathrm{c}}=4\%$ to 10% we obtain the following values for the height bin $l=6$:

For this set of fractions, the EAE of the fine mode is 1.48. If we use ${\mathit{\xe5}}_{\alpha ,\mathrm{f}}=1.48$ and ${\phi}_{\alpha (355)}^{(6)}=0.52$ in Eqs. (24a)–(24e) we find values of the PMPs that are close to the values mentioned in Table 1. We also obtain the range of the CRI of the fine-mode PSD on the basis of the analysis of the lidar data products and the discrepancy ${\rho}_{\mathrm{LUT}}$ [see Eq. (7)]. The results in the height bins $l=1$ and $l=6$ are similar, too.

Our approach for estimating the fine mode parameters of a PSD are summarized into the following steps:

- A) We make an analysis regarding the number of modes of the PSD, i.e., whether it is monomodal (1 mode) or bimodal (2 modes) by means of a comparison of the measured properties and the lidar data products in the LUT. If the comparison is consistent (see, for example, case #2 in Table 1 in Ref. [3]), the PSD contains a coarse mode which can be “cut” by extracting the fine mode spectrum ${g}_{\mathrm{f}}(\lambda )={\phi}_{g}\text{\hspace{0.17em}}g(\lambda )$. The fractions ${\phi}_{g}$ are estimated with Eqs. (23) and (26)–(29) and we minimize the discrepancy according to Eq. (31).
- B) We calculate the EAE with Eq. (23).
- C) We calculate the effective radius from the system of Eq. (33) in Ref. [3]. The selection of Eq. (33a), or Eq. (33b), or Eq. (33c) depends on the range of values of the EAE. The uncertainty is $\delta {r}_{\text{eff}}=\pm (0.05\u20130.08)\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ and depends on the EAE range as well.
- D) We calculate the surface-area concentration from Eq. (24b). The uncertainty is $\delta s=\pm 20\%$.
- E) We calculate the volume concentration from Eq. (24c). The uncertainty $\delta v$ can be derived from the “worst” possible combination of $\delta {r}_{\text{eff}}$ and $\delta s$.
- G) We estimate the CRI on the basis of the analysis of the lidar data products according to the spectrum ${g}_{\mathrm{f}}(\lambda )$ and the discrepancy ${\rho}_{\mathrm{LUT}}$ [see Eq. (7)].

Our proximate analysis (PA) is useful for preliminary estimations of microphysical parameters of particles in the fine mode of the PSD without applying the method of data inversion and the calculation of kernel functions. PA also allows for the more robust initial assessment of RCs used in the correlation relationships, see Eq. (10). Besides that, a preliminary estimation of the microphysical parameters can be employed as a constraint for more complicated methods of optical data analysis with inversion methods.

#### B. Retrieval Example for Synthetic Data

### 1. Numerical Example: Vertical Profile Type 1

In this example, we investigate in more detail the case considered in Section 3.A. The only difference is that we replace layer #6 by layer #1 and change the order of layer #4 and layer #5 to keep the optical data profile more realistic. Besides these changes, we distort the optical data with 15% extreme error. Extreme error in the context of this study means that each of the $3\beta +2\alpha $ channels is independently distorted with an error of fixed magnitude of $\pm 15\%$. Figure 1(a) shows the profiles of the synthetic (solid line) and distorted (symbols) optical data at 355, 532, and 1064 nm, as well as the Ångström exponents BAE for the wavelength pair 532/1064 nm and the EAE for the wavelength pair 355/532 nm. The profiles of the true PMPs [Figs. 1(b)–1(d)] and the true PSD [Fig. 1(e)] at bin $l=4$ (at height 4 km) are shown as thick solid black curves.

The statistics and correlations for the true parameters are presented in Fig. 2 by open black circles and black lines in the same way as in Fig. 1(a) in [3]. The regression equations for all parameters are shown in the legends. In spite of the bimodality of the PSD, we see a very “strong” correlation between the individual height bins of the profiles because the correlation coefficients are equal to ${R}^{2}=1$ in all cases.

We used distorted optical data profiles as input in our traditional regularization technique [6] to retrieve the solution space ${F}^{(l)},l=1,\dots ,6$. Each solution space ${F}^{(l)}$ contains about $1\times {10}^{5}$ individual solutions which are retrieved for 150 inversion windows that are located in the radius range from 0.03 to 10 μm. We used 20 equidistant values of the real part, i.e., ${m}_{\mathrm{R}}\in [1.325;1.8]$ and 30 equidistant values of the imaginary part, i.e., ${m}_{\mathrm{I}}\in [0.0;0.1]$ in the retrieval. We note that these ranges for the inversion windows and the CRI do not create any constraints on the solution spaces, and we do not use them in practice.

In the first step we post-processed the solution spaces without using GCM (NoGCM) and without using any constraints for the parameters ${r}_{\text{eff}}$, $n$, $r\in [0.03;10]$, ${m}_{\mathrm{R}}\in [1.325;1.8]$, and ${m}_{\mathrm{I}}\in [0.0;0.1]$ in the automated mode [1]. The results are shown as blue curves with triangles in Figs. 1(b)–1(e). The retrieved profiles of effective radius, surface-area and volume concentrations reproduce the true profiles quite well. The largest disagreement of 45% from the true value is observed for effective radius at bin $l=5$ (at 5 km height). However, the quality of the retrieved profile of the number concentration is unsatisfactory. There are outliers with 100%–200% error. Just two profile bins, the ones at 3 and 6 km, respectively, coincide with the true values.

Figure 2 shows the statistics of the parameters retrieved with NoGCM (blue lines, triangle). We see that the results are not correlated. The correlation coefficient changes from ${R}^{2}=0.57$ for ${r}_{\text{mean}}$ versus ${r}_{\text{eff}}$ [Fig. 2(f)] to ${R}^{2}=0.96$ for $\sigma $ versus ${r}_{\text{eff}}$ [Fig. 2(e)]. The RC for surface-area concentration is ${a}_{s}=1.19$ [Fig. 2(b)]. We find ${a}_{n}=0.16$ for number concentration [Fig. 2(d)]. These values are outside the ranges defined by Eq. (2). The low correlation coefficients and values of the RCs that do not fulfill the condition in Eq. (2) mean that we deal with outliers in our results.

In the next step, we applied GCM in an attempt to suppress these outliers and postprocess the solution spaces for the case that the true RCs are known (TrueRC). Table 3 shows the RCs for all IPs. We applied the algorithm described in Section 2 and used $\delta p=10\%\u201325\%$. Figure 1 shows the results as red curves with squares. We see that all retrieved and true profiles match each other significantly better. The maximum error of 30% occurs for the number concentration at height bin $l=2$ and $l=4$.

The result of applying GCM is that we find strong correlations of ${R}^{2}\ge 0.97$ for the IPs. Figure 2 shows these correlations as red lines with squares. We note that the actual RCs shown in the legends of Fig. 2 do not coincide with the true values (see the TrueRC of Table 3) after we used GCM for the postprocessing. Particularly the RC ${a}_{n}$ for $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$ is 0.147, whereas the true value is 0.137. The true and actual mean widths of the RC ${b}_{\sigma}$ are 0.03 and 0.02, respectively.

The disagreement between the actual (retrieved) and the true RCs can be explained, as we already discussed, by the fact that it is not mandatory that Eq. (12) is simultaneously fulfilled for the individual and the averaged $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$. If we reduce the threshold $\delta p$ to 1%–5% we could further stabilize the actual ${a}_{n}$ (or ${b}_{\sigma}$), but we deal with measurement and mathematical errors. These two error sources do not allow us to find any individual solution ${f}_{i}^{(l)}(r)$, $i=1,\dots ,{\mathrm{N}}_{\text{sol}}$, if $\delta p$ is too small. This is the reason why we have to set $\delta p=10\%\u201325\%$ in this case.

The true values of the RCs are not known in practice. If the RCs for our IPs, i.e., $s$, $v/{r}_{\text{eff}}$, and $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$ vary inside small intervals, i.e., $\pm 20\%$ in Eq. (2) the spread of the RCs for ${r}_{\text{eff}}$, ${r}_{\text{mean}}$, and $\sigma $ are very wide, see for example the conditions in Eq. (32) in Ref. [3]. The choice of the RCs depends on many factors such as aerosol type, measurement case, and the experience of the data operator. To assess the capabilities of GCM we consider two different (opposite) retrieval strategies.

One strategy assumes that the PSD is monomodal (MMS). The other strategy assumes that the PSD is bimodal (BMS). The constraints for the RCs of $s$, $v/{r}_{\text{eff}}$, and $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$ can be described by the same equations for both strategies. The RCs are ${a}_{s}=1.6$, ${a}_{v}=1.6/3\approx 0.53$, and ${a}_{n}=1.6/(4\pi )\approx 0.127$ (see Table 3). The numbers are in the center of the respective ranges of values of these RCs. At this step (1st iteration) the profile of the number concentration is stabilized if we apply GCM for $s$, $v/{r}_{\text{eff}}$, and $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$ without using the constraints for ${r}_{\text{eff}}$, ${r}_{\text{mean}}$, and $\sigma $.

However, after the 1st iteration the profile of effective radius still contains the outliers that we also find if we do not apply GCM (case of NoGCM). As a result, the effective radius and EAE are only slightly correlated with ${R}^{2}=0.80$ (results are not shown here). To increase the correlation, we add a new constraint in the case of the effective radius. We use the RCs ${a}_{r}=-0.08$ and ${b}_{r}=0.26$ in the case of MMS, see Eq. (33b) in Ref. [3]. We use ${a}_{r}=-0.65$ and ${b}_{r}=1.2$ in case of BMS; see the equation that describes the regression line of the stars in Fig. 1(a) in Ref. [3].

We simultaneously use GCM with the constraint applied to ${r}_{\text{eff}}$, as described in the previous paragraph (1st iteration), to $s$, to $v/{r}_{\text{eff}}$, and to $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$. This step stabilizes (smoothes) the profile of ${r}_{\text{eff}}$. For the case of BMS, we find a high correlation ${R}^{2}=0.97$ between ${r}_{\text{eff}}$ and ${\mathit{\xe5}}_{\alpha}$. In contrast, we find a lower correlation of ${R}^{2}=0.79$ between ${r}_{\text{eff}}$ and $\mathit{\xe5}$ in the case of MMS (results are not shown here). Furthermore, the discrepancy between input and backcalculated EAE for the case of MMS starts to increase to values of 20% and more in bins $l=3\u20135$.

After the 2nd iteration in which we used the constraints simultaneously with regard to $s$, $v/{r}_{\text{eff}}$, $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$, and ${r}_{\text{eff}}$ the profile of $\sigma $ becomes well correlated with ${r}_{\text{eff}}$, and we obtain ${R}^{2}>0.92$ from both strategies (not shown here).

With regard to ${r}_{\text{mean}}$, the correlation with ${r}_{\text{eff}}$ remains quite small, i.e., ${R}^{2}<0.15$ (results are not shown). To improve the correlation of ${r}_{\text{mean}}$ and ${r}_{\text{eff}}$ we use all available constraints for $s$, $v/{r}_{\text{eff}}$, $n({r}_{\text{mean}}^{2}+{\sigma}^{2})$, ${r}_{\text{eff}}$, $\sigma $, and ${r}_{\text{mean}}$ in the 3rd iteration. In addition, with regard to ${r}_{\text{mean}}$ and $\sigma $, we use the RC values we obtained from the previous (2nd) iteration, i.e., we use ${a}_{\sigma}=0.31$, ${b}_{\sigma}=0$, ${a}_{m}=0.2$, and ${b}_{m}=0.075$ for MMS, and we use ${a}_{\sigma}=0.09$, ${b}_{\sigma}=0.02$, ${a}_{m}=0.01$, and ${b}_{m}=0.1$ for BMS (see Table 3). Figures 1 and 2 show the final results after the last, 3rd iteration as dashed curves. The results we obtain with GCM for the case of TrueRC and BMS are close to each other. In contrast, the results for MMS underestimate effective radius and volume concentration. We can define this deviation (underestimation) as the uncertainty of GCM.

We also apply the PA to the analysis of the fine mode PMPs and the fractions ${\phi}_{g}$ in the case of the distorted OD profiles (see Fig. 1, asterisks). This approach provides us with similar results that we obtain in Section 3.A for error free OD (see Table 1, red symbols). The effective radii for the distorted and the error free OD profiles coincide since the EAEs of both OD profiles are the same. The respective values for number, surface-area, and volume concentrations deviate from each other proportionally to the distortion of the OD.

In the case of the distorted OD, we find even less uncertainty for the CRI, i.e., in the vicinity of ${\rho}_{\mathrm{LUT}}\approx 5\%$ at $l=1$ we find that real parts are between 1.45 and 1.625. We find a similar uncertainty for the imaginary part, i.e., it varies between 0.001 and 0.02 regardless whether the OD are distorted or error free. Again, this uncertainty could be reduced to the resolution of the real part ($\pm 0.025$) in the LUT, and from 0.001 to 0.0075 for the imaginary part if we knew the effective radius to at least $\pm 0.01\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$. With regard to the fine mode fractions at $l=4$, the PA results in

The discrepancy stays below 12% for ${\phi}_{\beta (532)}$ (see Table 1 for reference at $l=5$). Apparently, we can use this approach to estimate the effective radius of the coarse mode by taking into account the values of the LUT. As another result, the volume concentration of the coarse mode can be assessed, too, like what can be done for ${v}_{\mathrm{f}}$ by using Eq. (24c). In turn, if we know the total volume and surface-area concentrations ($p={p}_{\mathrm{f}}+{p}_{\mathrm{c}}$) we may derive the total effective radius which is very important for the initial assumptions we need to make for the RCs ${a}_{r}$ and ${b}_{r}$ in Eq. (10). In that way, we obtain the RCs we use for the BMS. However, we do not further speculate on this result because we need to carry out further investigations regarding this approach of using the LUT; we want to consider all possible variations of the coarse mode effective radius from 1 to approximately 5 μm, which to our opinion covers most of the potential aerosol scenarios.

We note that the PMPs derived from the MMS and PA described in Section 3.A converge (see Fig. 1). For example, the effective radius retrieved with both approaches is equal to $\sim 0.13\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ at $l=1,\dots ,6$. All other parameters retrieved with MMS and PA at all heights are close to each other within the uncertainty of PA. We conclude that even if we make a wrong assumption, in our numerical examples this is the case for MMS, we are still able to retrieve the PMPs of the fine mode of the size distribution with good accuracy.

In summary, we recommend the following strategy that allows us to select properly the RCs:

- B) The preliminary RCs ${a}_{r}$ and ${b}_{r}$ are assessed with PA on the basis of MMS and/or BMS.
- C) The discrepancies of the particle extinction coefficients and/or the EAEs do not exceed the measurement errors significantly.
- D) The correlation coefficients are close to 1.
- E) The predefined number of individual solutions that need to be averaged and the actual number of individual solutions that fulfill all correlation constraints should be similar. It is not acceptable if there are height bins in which the solutions do not fulfill all constraints.

### 2. Numerical Example: Vertical Profile Type 2

We computed ${\mathrm{N}}_{\mathrm{L}}=6$ optical sets of the $3\beta +2\alpha $ type on the basis of Eq. (8). From these data sets we constructed OD profiles which are shown in Fig. 3(a) as solid curves. The true PMPs and CRIs are shown in Figs. 3(b)–3(d) as thick lines. For example, the real part and mean width do not change with height and are equal to ${m}_{\mathrm{R}}=1.5$ and ${\sigma}_{\mathrm{G}}=2.1$, respectively. The imaginary part increases with height, i.e., from ${m}_{\mathrm{I}}=0$ (height bins 1–2) to 0.01 (height bins 3–4) to 0.03 (height bin 5) to 0.05 (height bin 6).

With regard to the CRI values this example could for example describe a situation of a comparably clean marine boundary layer that contains a mixture of marine particles (sea salt) with low light-absorbing anthropogenic pollution and a strongly light-absorbing pollution layer aloft. We stress that the purpose of the synthetic profiles is to test the robustness of our postprocessing scheme. We will show results of the analysis of experimental profiles in the third part of our study which is currently in preparation.

The true OD profiles were distorted with 15% extreme error and processed with the inversion algorithm. The circles in Fig. 4 show the statistics of the true data. The RCs of the regression equations (black lines) are within the respective ranges shown in Eqs. (2) and (4). For convenience, we summarize the RCs of the SOD in Table 4.

First, we classify the OD profiles according to Table (1) in Ref. [3]. The profiles contain OD sets from case #8 at $l=1$, from case #1 at $l=2$, 3, 4, and 6, and from case #4 at $l=5$. That means we can approximate our results by monomodal PSDs and use our LUT for the PA. However, at $l=1$ the particle size is comparably small, i.e., ${r}_{\text{eff}}\le 0.29\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$, and ${m}_{\mathrm{I}}\le 0.045$ and ${m}_{\mathrm{R}}\ge 1.4$. In contrast, at $l=2-6$ the cases #1 and #4 permit for the “full” ranges of effective radius (${r}_{\text{eff}}\ge 0.2\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$) and CRI (${m}_{\mathrm{I}}\ge 0.0$ and ${m}_{\mathrm{R}}\ge 1.3$).

In contrast to the numerical example in Section 3.A (see layer $l=6$) where BAE is equal to $-0.24$ the BAE is close to 2 in this example. In view of Eq. (33c) in Ref. [3], and Eq. (24) and ${\phi}^{(1)}=1$ we find (true values are in brackets):

Our PA gives us reasonable results within the respective uncertainty of $\delta p$, though the maximum value of the number concentration does not include the true value. That result can be explained by the fact that the true PSD is not “narrow” (${\sigma}_{\mathrm{G}}=2.1$), i.e., the PSD includes particles both of the fine and the coarse modes, and the numbers in Eq. (24d) are only a rough estimate at $l=1$. Besides that, we neither took into account the 20% uncertainty of our estimation of $s$ nor did we consider the measurement error of 15% of the extinction coefficients.

We carried out our PA of the CRI in the vicinity of ${\rho}_{\mathrm{LUT}}\le 15\%$ [see Eq. (7)]. The minimal discrepancy ${\rho}_{\mathrm{LUT},\mathrm{min}}\approx 7\%$ corresponds to the set of data in the LUT for which $m=1.55-i0.003$. In that vicinity we find another 118 sets with real parts ${m}_{\mathrm{R}}\in [1.5;1.7]$ and imaginary parts ${m}_{\mathrm{I}}\in [0;0.02]$. We again observe that in spite of erroneous OD the LUT data sets that have effective radii closest to the true value (0.24 μm) result in real parts ${m}_{\mathrm{R}}=1.5\u20131.55$ and imaginary parts ${m}_{\mathrm{I}}=0\u20130.003$.

Since the EAE is too low and changes from ${\mathit{\xe5}}_{\alpha}=-0.15$ to ${\mathit{\xe5}}_{\alpha}=-0.02$ in the upper height bins ($l=2,\dots ,6$) the PA of the fine-mode fraction of the particle size distribution will result in an unacceptably high uncertainty $\delta {r}_{\text{eff}}$ of effective radius [see Fig. 1(a) in [3] at ${\mathit{\xe5}}_{\alpha}\approx 0$]. Actually, this example is one of the most difficult scenarios for our retrieval scheme. However, we can further simplify the PA of the effective radius, not by using Eq. (24a) but by using the LUT in the same way we use it to find the CRI. In this case, we rewrite the discrepancy of Eq. (7) as

We used the simplified PA of the PMPs for the remaining heights ($l=2\u20136$). The results are shown as asterisks in Fig. 3. We collected the LUT values of ${r}_{\text{eff}}$ and $m$ for discrepancies ${\rho}_{\mathrm{LUT}}$ up to 25% because of the high level of the OD distortion (15%). This distortion level can permit for even higher discrepancies of the lidar data products. If we derive effective radius from our LUT on the basis of Eq. (32), and furthermore derive surface-area concentration from Eq. (14) we find volume and number concentrations with Eqs. (24c), (24d), and (24e), respectively.

Unfortunately, the minimal discrepancies ${\rho}_{\mathrm{LUT}}$ at height 2 and 6 km ($l=2$, 6) exceed 25%. In practice, it may mean that the error of the OD measurements is very high and/or the PSD is not monomodal. We tested the latter assumption and estimated the fine mode fractions as described in Section 3.A, i.e., we find

As a result the minimum discrepancies defined by Eq. (31) decrease to ${\rho}_{\mathrm{LUT},\mathrm{c}}^{(2)}=6\%$ and ${\rho}_{\mathrm{LUT},\mathrm{c}}^{(6)}=9\%$, respectively. If we take into account the LUT values for effective radii ${r}_{\text{eff},\mathrm{f}}$ and ${r}_{\text{eff},\mathrm{c}}$, and the fraction ${\phi}_{\alpha (355)}$, we can find the surface-area, volume and number concentrations of the fine and coarse modes [see Eqs. (24b)–(24e)], respectively.

Figure 3 (asterisk) shows that the PA provides the PMPs and their error bars and that the results include the true values of the PMPs. The only exception is number concentration. We already discussed that Eq. (24d) does not work properly for the estimation of ${n}_{f,\mathrm{max}}$ if the investigated PSD includes large particles. The parameters ${\mathit{\xe5}}_{\alpha}$ and ${\mathit{\xe5}}_{\beta (532;1064)}$ are close to 0 and even negative, simultaneously, which confirms our assumption. We note the large uncertainty of ${r}_{\text{eff}}$ at 2 and 6 km. As a result, we obtain a large uncertainty of $v$ in these same heights. Such large uncertainty arises because the erroneous OD can be reproduced by a bimodal PSD quite well. Our PA demonstrates this possibility. Besides that, we note that the ${r}_{\text{eff}}^{(4)}$ value at 4 km does not cover the true value, even if we include the uncertainty bar. We can explain this behavior by the fact that the minimum discrepancy ${\rho}_{\mathrm{LUT}}^{(4)}$ is 20%. If we consider the discrepancy range from 20% to approximately 25% we can avoid using wrong parameter sets from the LUT, i.e., parameters for which the OD are too “far” off from the input OD. In that case, the derived set of LUT parameters contains an effective radius that varies only slightly. In such a case we would extend the discrepancy vicinity or consider the strategy of extracting the fine and coarse mode fractions.

We use the same approach in this example as in Section 3.B.1, i.e., we identify the solution space ${F}^{(l)},l=1,\dots ,6$, [6] and we could postprocess it with the automated, unsupervised software for the following situations

- A) GCM is not used, i.e., the case of NoGCM is applied,
- B) GCM is used for the case of True RC,
- C) GCM is used with incorrect RCs and the assumption that the PSD is monomodal (MMS) or bimodal (BMS, i.e., presence of coarse-mode particles), and ${a}_{s}=1.6$, ${a}_{v}\approx 0.53$, and ${a}_{n}\approx 0.127$.

Case A: Figures 3(b)–3(e) show all retrieval results. We see that in the case of NoGCM (triangles) effective radius has large retrieval errors of up to 300% in three of the six height bins. The error of number concentration is up to 200% in the height bin $l=4$. In particular, the outliers of ${r}_{\text{eff}}$ are caused by a strong coarse mode which is absent in the true PSD, see the blue curve with triangles in Fig. 3(e).

Case B: The solution space can be significantly improved if we require that the correlation relationships are known and fulfilled (True RC). In that case, the retrieval errors of number concentration are below 90%. The error of effective radius is 20%. We obtained these results by using the RCs shown in Table 5 (True RC).

Case C: We assess the GCM uncertainty by assuming MMS and BMS. In both cases the RCs ${a}_{s}$, ${a}_{v}$, and ${a}_{n}$ are located in the middle of the respective intervals (see Table 5). The RCs for effective radius are defined on the basis of the analysis of our LUT. Figure 1(a) in Ref. [3] shows that in the case of ${\mathit{\xe5}}_{\alpha}\in [0;0.6]$ values for ${r}_{\text{eff}}$ are minimal if we select ${a}_{r}=-0.6$ and ${b}_{r}=0.6$. The values for ${a}_{r}$ and ${b}_{r}$ fulfill the conditions required by MMS. On the contrary, we select the values ${a}_{r}=-1.25$ and ${b}_{r}=0.9$ which allow us to maximize ${r}_{\text{eff}}$ in the case of BMS. In that case, we can take into account the condition that the correlation trends in both strategies should intersect the vicinity of the coordinate point (${\mathit{\xe5}}_{\alpha}^{(1)}=0.6$; ${r}_{\text{eff},\mathrm{f}}^{(1)}\approx 0.25\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$) that we found from our PA. Furthermore, we make the thresholds $\delta {r}_{\text{eff}}$ large enough (0.2–0.5 μm) since we deal with a wide spread of the effective radius when ${\mathit{\xe5}}_{\alpha}<0.5$, see Table 5 (MMS, BMS).

After a few iterations we can estimate the RCs for our IPs $\sigma $ and ${r}_{\text{mean}}$. The values are ${a}_{\sigma}=0.25$, ${b}_{\sigma}=0.01$, ${a}_{m}=0.27$, and ${b}_{m}=0.06$ for MMS. The values are ${a}_{\sigma}=0.21$, ${b}_{\sigma}=0.0$, ${a}_{m}=0.2$, and ${b}_{m}=0.05$ for BMS. Table 5 summarizes these numbers for the cases of MMS and BMS. The final results are shown in Fig. 3 as dashed curves.

The use of GCM causes quite large variations of effective radius and number concentration in the cases MMS and BMS. The uncertainties are 100% and 300% for effective radius and number concentration, respectively. That level of uncertainty could be suppressed if information about the modality (mono- or bimodal) of the PSD is available. Unfortunately, the solution subspaces for the monomodal and the bimodal PSDs are acceptable with regard to ${F}^{(l)}$ (no matter if NoGCM or GCM is used) and to PA. For the moment there still is ambiguity of the results when both BAE and EAE are comparably small, e.g., if they are close to 0. In this case, erroneous OD cannot be identified as “monomodal” because of large measurement errors. At the same time, the “bimodal” interpretation yields a good discrepancy but a wrong coarse mode. One of the next steps in our research work could be to decrease the uncertainty of the retrieved effective radius that is provided by MMS and BMS.

Any further improvement of the GCM results depends on the data
operator’s experience and *a priori*
information. Nonetheless, we stress that even at this stage of our
work GCM already smoothes the profiles of
${r}_{\text{eff}}$ and $v$ and thus provides us with
significantly better consistency between successive aerosol layers
compared with what we obtain from the method of NoGCM.

Figure 4 presents the statistics for the vertical profile of type 2 for all results derived with NoGCM (triangle), TrueRC (square), and MMS and BMS (dashed line). The legend in Fig. 4 also shows the regression equations that we obtained after postprocessing the solution space ${F}^{(l)}$ in the case of NoGCM and the case of True RC. We also show the regression equations for the true data (circles).

We see that the IPs $s$ and $v/{r}_{\text{eff}}$ that are retrieved with NoGCM and TrueRC are linearly correlated with the particle extinction coefficient [Figs. 4(b) and 4(c)]. The results for MMS and BMS show the same behavior (not shown here). We find outliers in the profiles of $n$ if we use NoGCM (see Fig. 3), and as a result the actual RC ${a}_{n}=0.25$ [Fig. 4(d), triangle] is outside the interval defined by Eq. (17) in Ref. [3]. The statistics for NoGCM for the IPs ${r}_{\text{eff}}$, $\sigma $, and ${r}_{\text{mean}}$ also are “de-correlated,” i.e., ${R}^{2}$ is low. The numbers decrease from 0.91 to 0.7 [see Figs. 4(a), 4(e), and 4(f)].

If we employ the method TrueRC and use the RCs from Table 5 we find the following numbers for the $\mathrm{RCs}\text{:}{a}_{r}=1.09$ (${R}^{2}=0.93$), ${a}_{n}=0.135$ (${R}^{2}=1$), ${a}_{\sigma}=0.26$ (${R}^{2}=0.99$), and ${a}_{m}=0.3$ (${R}^{2}=0.99$), see the squares in Figs. 4(a) and 4(d)–4(f). If we use MMS and BMS (dashed lines) we obtain RCs close to the ones shown in Table 5. The correlation coefficients are ${R}^{2}>0.98$. We note once more that the measurement errors are 15%.

We complete the analysis of both numerical examples with 15% measurement error by considering the results of the CRI retrieval. If we know the true RCs, the real and imaginary parts converge significantly better to the true values compared to the convergence we obtain from the method of NoGCM. The results are shown in Figs. 1(d) and 3(d) as red solid lines with squares. The errors of ${m}_{\mathrm{R}}$ and ${m}_{\mathrm{I}}$ are on average approximately 0.05–0.10 and $i0.010-i0.015$, respectively. There is only one outlier at 1 km (see vertical profile type 2). The analysis of the solution space at this height shows that it contains individual solutions with ${m}_{\mathrm{R}}>1.7$ and ${m}_{\mathrm{I}}>0.05$.

The results for MMS and BMS almost coincide [see Figs. 1(d) and 3(d), red dashed line] and overestimate the
true CRI (bias). The results for the CRI are also close to the
NoGCM results [see Figs. 1(d) and 3(d),
blue solid line with triangles]. Therefore, we cannot expect an
improvement of the CRI retrieval if we apply our new GCM only,
compared to our traditional approach [6]. However, we can use GCM and combine it
with our PA. We obtain ${r}_{\text{eff}}=0.265\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ as a mean value from the two
values we find from MMS and BMS at the lowest height bin. In that
height, a monomodal PSD is highly likely. We obtain from the LUT
values of CRI and effective radius ($m=1.55-i0.003$ and ${r}_{\text{eff}}=0.24\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$), if we use a discrepancy
[Eq. (7)] that
is equal to ${\rho}_{\mathrm{LUT},\mathrm{min}}\approx 8\%$. If we extend the value of the
discrepancy to ${\rho}_{\mathrm{LUT}}=16\%$ (twice as large as the minimum
value) the LUT values for the CRI and the effective radius stay
below 1.6 (${m}_{\mathrm{R}}$), 0.01 (${m}_{\mathrm{I}}$), and 0.24 μm
(${r}_{\text{eff}}$). The results we obtain for the
CRI from the *combination* of PA and GCM are shown
as stars in Figs. 1(d) and
3(d). We are
considering the option of including this new method in the next
generation version of the automated inversion software that we are
developing for NASA Langley’s HSRL-2 instrument.

## 4. CONCLUSION

We presented the novel method of GCM which allows us to stabilize the solution space of particle microphysical parameters (PMPs) that we retrieve with our data inversion algorithm from profiles of optical data taken with multiwavelength Raman/HSRL ($3\text{\hspace{0.17em}}\text{backscatter}+2\text{\hspace{0.17em}}\text{extinction}$) lidar. GCM uses correlation relationships between particle bulk parameters and optical properties.

These relationships allow us to use additional constraints during the postprocessing of the solution spaces. As a result, the IPs $[s,v,n({r}_{\text{mean}}^{2}+{\sigma}^{2})]$ can be estimated with an uncertainty that does not exceed the measurement error of the extinction coefficient.

We specially designed the PA method that allows us to obtain from a LUT a
first-estimate solution space of the PMPs. We obtain a robust initial
estimation of the RCs on the basis of the correlation analysis presented
in Ref. [3]. PA itself provides us
with estimated values of the PMPs of the fine-mode fraction of particle
size distributions. The accuracy that we obtain for the PMPs with this
approach is not worse than the accuracy for the PMPs obtained with our
traditional inversion method. However, one of our next steps in fact is
that we use the results from PA as *a priori* information
in the second step of the data analysis procedure, i.e., we apply this PA
in combination with the traditional inversion algorithm. We are of the
opinion that the combination of the two methods will further improve our
inversion data products, most notably the complex refractive index. That
step of combining the two methods will allow us to derive trustworthy
profiles of absorption coefficients in future versions of our software
development.

The PA method can be generalized for the estimation of coarse-mode parameters if we extend our LUT by data that describe particle size distributions with effective radii exceeding 3 μm.

Our comparisons between the results obtained with GCM and our traditional inversion-with-regularization technique shows that the methodology presented in this contribution leads to a significant stabilization of the retrieved profiles of aerosol microphysical properties. That means the uncertainty of

- – surface-area concentration does not exceed the sum of mathematical ($\pm 20\%$) and measurement ($\pm 15\%$ for modern lidar systems) errors, or 35% in total;
- – effective radius is $\pm 0.07\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\mu m}$ for fine mode particles and about 100% for particle size distributions composed of fine (submicron) and coarse (supermicron) mode particles;
- – volume concentration is defined by the sum of the uncertainty of surface-area concentration and the uncertainty of effective radius;
- – the uncertainty of number concentration is less than 100% for any particle radius domain from 0.03 to 10 μm.

Aside from these results, we find that the uncertainties of the real and imaginary parts of the CRI of monomodal PSDs can be restricted to $\pm 0.1$ and $\pm 0.01$ on the domains [1.3;1.8] and [0;0.1], respectively, if we combine PA and GCM. This will allow us in the future to investigate bimodal particle size distributions in the sense that we can fully analyze the fine mode fraction of particle size distributions without making critical assumptions on properties of the coarse mode fraction. We can thus focus our future work steps on developing a light-scattering model for coarse model particles which to a large part consist of nonspherical (dust) particles and for which to date, to the best of our knowledge, there is no practical model that can describe light-scattering at 180°‐scattering direction (backscatter).

We will continue our research work with numerical simulations and the analysis of case studies to further assess the potential of this new approach.

We are currently preparing part 3 of our series of papers. There, we will use experimental data and test our method for retrieving PMPs from optical profiles measured with lidar.

## Funding

NASA Langley Research Center; University of Hertfordshire (UH); Science Systems and Applications.

## REFERENCES

**1. **D. Müller, C. A. Hostetler, R. A. Ferrare, S. P. Burton, E. Chemyakin, A. Kolgotin, J. W. Hair, A. L. Cook, D. B. Harper, R. R. Rogers, R. W. Hare, C. S. Cleckner, M. D. Obland, J. Tomlinson, L. K. Berg, and B. Schmid, “Airborne
multiwavelength high spectral resolution lidar (HSRL-2)
observations during TCAP 2012: vertical profiles of optical and
microphysical properties of a smoke/urban haze plume over the
northeastern coast of the US,”
Atmos. Meas. Tech. **7**, 3487–3496
(2014). [CrossRef]

**2. **P. Sawamura, D. Müller, S. Burton, E. Chemyakin, C. Hostetler, R. Ferrare, A. Kolgotin, L. Ziemba, A. Beyersdorf, and B. Anderson, “Comparison of
aerosol optical and microphysical retrievals from HSRL-2 and
in-situ measurements during DISCOVER-AQ 2013 (California and
Texas),” in International Laser
Radar Conference, July 2015,
paper PS-C1-14.

**3. **A. Kolgotin, D. Müller, E. Chemyakin, and A. Romanov, “Improved
identification of the solution space of aerosol microphysical
properties derived from the inversion of profiles of lidar optical
data, part 1: theory,” Appl.
Opt.55, 9839–9849
(2016).

**4. **C. F. Bohren and D. R. Huffman, eds., *Absorption and Scattering of
Light by Small Particles*
(Wiley,
1983).

**5. **D. Müller, U. Wandinger, and A. Ansmann, “Microphysical
particle parameters from extinction and backscatter lidar data by
inversion with regularization: theory,”
Appl. Opt. **38**, 2346–2357
(1999). [CrossRef]

**6. **I. Veselovskii, A. Kolgotin, V. Griaznov, D. Müller, U. Wandinger, and D. Whiteman, “Inversion with
regularization for the retrieval of tropospheric aerosol
parameters from multiwavelength lidar
sounding,” Appl. Opt. **41**, 3685–3699
(2002). [CrossRef]

**7. **C. Böckmann, I. Miranova, D. Müller, L. Scheidenbach, and R. Nessler, “Microphysical
aerosol parameters from multiwavelength
lidar,” J. Opt. Soc. Am. A **22**, 518–528
(2005). [CrossRef]

**8. **A. Kolgotin and D. Müller, “Theory of inversion
with two-dimensional regularization: profiles of microphysical
particle properties derived from multiwavelength lidar
measurements,” Appl. Opt. **47**, 4472–4490
(2008). [CrossRef]

**9. **D. Müller, A. Kolgotin, I. Mattis, A. Petzold, and A. Stohl, “Vertical profiles
of microphysical particle properties derived from inversion with
two-dimensional regularization of multiwavelength Raman lidar
data: experiment,” Appl.
Opt. **50**, 2069–2079
(2011). [CrossRef]