An algorithm based on a radiative transfer model (RM) and a dynamic learning neural network (NN) is developed for estimating water vapor content from moderate resolution imaging spectrometer (MODIS) 1B data. MODTRAN4 is used to simulate the sun–surface–sensor process under different conditions, and the dynamic learning neural network is used to estimate water vapor content. Analysis of the simulation data indicates that the mean and standard deviation of the estimation error are under 0.06 g cm⁻² and 0.08 g cm⁻², respectively. A comparison indicates that the RM–NN estimates are comparable to the MODIS water vapor content product (MYD05_L2). Finally, validation with ground measurement data shows that RM–NN can accurately estimate water vapor content from MODIS 1B data, with a mean and standard deviation of the estimation error of about 0.12 g cm⁻² and 0.18 g cm⁻², respectively.
©2010 Optical Society of America
1. Introduction
Water vapor is an important tropospheric greenhouse gas, and its content is very important in the study of energy balance and global climate change [1,2]. The near-infrared (IR) region at around 1 μm is very sensitive to water vapor content. Kaufman and Gao and Sobrino et al. [2,4,5] used ratios of the water vapor absorbing channels at 0.905, 0.936, and 0.94 μm to the atmospheric window channels at 0.865 and 1.24 μm to estimate water vapor content from MODIS data on the Earth Observing System (EOS). The ratios partially eliminate the effects of the variation of surface reflectance with wavelength and give approximate atmospheric water vapor transmittances. This method is influenced by the spectral reflectance of the ground surface and by mixed pixels. The overall error of water vapor estimated by the ratio method is about ±13% [2,7], which demonstrates the need to further improve the estimation accuracy of water vapor content for many applications, such as atmospheric correction in visible spectral remote sensing and land surface temperature retrieval in thermal remote sensing [8,9].
In Section 2 of this paper we present why and how estimation accuracy can be improved by using a combined radiative transfer model (RM) and neural network (NN) algorithm to estimate water vapor content from MODIS 1B data. In Section 3 the RM–NN estimates are compared with and evaluated against the MODIS water vapor content product and in situ measurement data. Finally, conclusions are given in Section 4.
2. Utilizing RM–NN to estimate water vapor content from MODIS data
The derivation of the algorithm for water vapor content estimation is based on solar radiance that is reflected from the ground and travels through the atmosphere to a remote sensor. The radiance is attenuated by the atmosphere on its way to the sensor. Transmittance describes the magnitude of this attenuation; it varies with wavelength and viewing angle. After some simplification, the radiance transfer equation can be written as
$$L_{sensor}(\lambda) = L_{sun}(\lambda)\,T(\lambda)\,\rho(\lambda) + L_{path}(\lambda) \qquad (1)$$
In Eq. (1), $L_{sensor}(\lambda)$ is the radiance at the sensor; the first term on the right-hand side is the direct reflected solar radiation; $L_{sun}(\lambda)$ is the solar radiance above the atmosphere; $T(\lambda)$ is the total atmospheric transmittance, which is equal to the product of the atmospheric transmittance from the sun to the earth's surface and that from the surface to the satellite sensor; $\rho(\lambda)$ is the surface bidirectional reflectance; and $L_{path}(\lambda)$ is the path scattered radiance. $L_{path}(\lambda)$ can be treated approximately as an unspecified fraction of the direct reflected solar radiation when the aerosol concentrations are low, which allows derivation of column water vapor amounts from satellite data without the need to model single and multiple scattering effects.
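As a rough numerical illustration of the simplified radiance transfer relation (at-sensor radiance equals the direct reflected term plus the path radiance), the sketch below evaluates both terms. The function name and every input value are hypothetical and chosen only for illustration; they are not taken from the MODTRAN4 simulations.

```python
def at_sensor_radiance(l_sun, t_down, t_up, reflectance, l_path):
    """Simplified radiance transfer: L_sensor = L_sun * T * rho + L_path.

    T is the total (two-way) transmittance, i.e. the product of the
    sun-to-surface and surface-to-sensor transmittances.
    """
    total_transmittance = t_down * t_up
    direct = l_sun * total_transmittance * reflectance
    return direct + l_path, direct


# Hypothetical values in arbitrary radiance units (assumptions, not paper data).
l_path = 0.5
l_sensor, direct = at_sensor_radiance(
    l_sun=100.0, t_down=0.8, t_up=0.9, reflectance=0.3, l_path=l_path
)
path_fraction = l_path / direct  # path radiance as a fraction of the direct term
print(f"L_sensor = {l_sensor:.2f}, path/direct = {path_fraction:.3f}")
```

With these assumed inputs the path radiance is about 2% of the direct term, which is the regime in which it can be treated as a small fraction of the direct reflected radiation.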
2.1 Why Use RM–NN
Many atmospheric constituents, such as carbon dioxide, nitrogen oxide, ozone, methane, carbon monoxide, and other gases, are relatively stable; they can be assumed constant under different conditions and simulated by standard atmospheric profiles. In contrast, water vapor content is highly variable. Thus, the variation of atmospheric transmittance in the near-IR depends strongly on the dynamics of the water vapor content in the profile. Equation (1) can be changed into Eq. (2) as
In Eq. (2), the main contribution to the path scattered radiance is the scattering by aerosols because Rayleigh scattering is negligible near 1 μm. In the 1 μm region, the path scattered radiance is usually a few percent of the direct reflected solar radiation. Equation (2) can be simplified into Eq. (3) as
The relationship between transmittance and water vapor content can be simulated by using MODTRAN4 (see Fig. 1), which shows that the MODIS channels at 0.865 and 1.24 μm are non-absorption channels and the channels at 0.905, 0.936, and 0.94 μm are water absorption channels. In Eq. (3), the radiance at the sensor is obtained from the sensor measurement, and the remaining unknown besides the transmittance is the ground reflectance. The transmittance can be computed if the ground reflectance can be obtained. Kaufman and Gao and Sobrino et al. [2,4,5,7] used the non-absorption channels in place of the unknown ground reflectance, which makes it possible to get the transmittance of the absorption channels by using the ratio method. The water vapor content can then be estimated by building a relationship between the transmittance and water vapor content, which is simulated by MODTRAN.
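The ratio idea can be written out explicitly. The following is only a sketch, assuming that the path radiance is neglected, that the window channel is nearly transparent, and that the surface reflectance is the same in both channels. Here $L_{sensor}$ is the radiance at the sensor, $L_{sun}$ the solar radiance above the atmosphere, $T$ the total atmospheric transmittance, and $\rho$ the surface reflectance; the subscripts $a$ (absorption channel) and $w$ (window channel) are notation introduced here, not the authors'.

```latex
\begin{align}
L_{sensor}(\lambda_a) &\approx L_{sun}(\lambda_a)\,T(\lambda_a)\,\rho(\lambda_a), \\
L_{sensor}(\lambda_w) &\approx L_{sun}(\lambda_w)\,T(\lambda_w)\,\rho(\lambda_w), \\
\frac{L_{sensor}(\lambda_a)/L_{sun}(\lambda_a)}{L_{sensor}(\lambda_w)/L_{sun}(\lambda_w)}
  &\approx \frac{T(\lambda_a)\,\rho(\lambda_a)}{T(\lambda_w)\,\rho(\lambda_w)}
  \approx T(\lambda_a)
  \quad \text{if } \rho(\lambda_a) \approx \rho(\lambda_w),\; T(\lambda_w) \approx 1 .
\end{align}
```

Under these assumptions the ratio isolates the transmittance of the absorption channel; any difference between $\rho(\lambda_a)$ and $\rho(\lambda_w)$ enters the ratio directly as an error.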
The ratio method assumes that, for different surface types, reflectance varies almost linearly with wavelength between the two channels around the water absorption channel. The ratio method works well for a single surface type, but it is less accurate across all surface types because the spectral reflectance curve differs among surface types, as shown in Table 1. To demonstrate this point, MODTRAN [10] is used to simulate and analyze the relationship. The land surface reflectances of water, snow, soil, and vegetation (about 49 land surface types) [11] in MODIS bands 2, 5, 17, 18, and 19 are used as input parameters of MODTRAN4. The atmospheric water vapor content ranges from 0.3 to 3.5 g cm⁻² in the simulation. The near-IR transmittance is mainly influenced by water vapor content, so the other parameters are set to the defaults of the standard atmospheres, including tropical, mid-latitude summer, mid-latitude winter, sub-arctic summer, and sub-arctic winter. $R_{17}$, $R_{18}$, and $R_{19}$ are defined as the ratios of the radiances in bands 17, 18, and 19 to the radiance in band 2:
$$R_i = \frac{L_i}{L_2}, \qquad i = 17, 18, 19,$$
where $L_i$ is the radiance in MODIS band $i$.
Figures 2(a)–2(e) show the relationship between the radiance ratios and the total atmospheric water vapor amount in different regions (tropical, mid-latitude, and sub-arctic) and seasons (summer and winter). The scatter points form columns, a relationship that is not as tight as the results reported in [2,5] because Kaufman and Gao [2] used 24 surface types and Sobrino et al. [5] used only 10 surface types. The sensitivity of the radiance ratios to water vapor content differs because the atmospheric profile differs among regions and seasons, as indicated by Figs. 2 and 3. Figure 3 combines the radiance ratios and atmospheric water vapor amounts for the different regions and seasons from Figs. 2(a)–2(e); it indicates that the estimation accuracy becomes worse if a single inversion equation is used for all conditions (different regions and different seasons).
Obviously, the accuracy will improve considerably if different inversion equations can be used, with additional MODIS bands used to infer the different surface types in different seasons and different regions.
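The surface dependence of the ratio can be illustrated with a toy calculation. This is a sketch that keeps only the direct reflected term and uses hypothetical reflectances and solar radiances; it is not a MODTRAN4 run. The point is only that, at a fixed water vapor transmittance, the band ratio still changes with the slope of the surface reflectance spectrum.

```python
def radiance(l_sun, transmittance, reflectance):
    """Direct reflected term only: L = L_sun * T * rho (path radiance neglected)."""
    return l_sun * transmittance * reflectance


def band_ratio(surface, t_abs, l_sun_abs=1.0, l_sun_win=1.0, t_win=1.0):
    """Ratio of an absorption-band radiance to the band 2 (window) radiance."""
    l_abs = radiance(l_sun_abs, t_abs, surface["rho_abs"])
    l_win = radiance(l_sun_win, t_win, surface["rho_win"])
    return l_abs / l_win


# Hypothetical reflectances near 0.94 um (rho_abs) and 0.865 um (rho_win).
surfaces = {
    "flat spectrum": {"rho_abs": 0.30, "rho_win": 0.30},
    "rising spectrum": {"rho_abs": 0.33, "rho_win": 0.30},
    "falling spectrum": {"rho_abs": 0.27, "rho_win": 0.30},
}

t_abs = 0.5  # same water vapor transmittance for every surface (assumed)
ratios = {name: band_ratio(s, t_abs) for name, s in surfaces.items()}
for name, r in ratios.items():
    print(f"{name:>16}: ratio = {r:.3f}")
```

Only the flat spectrum returns the true transmittance of 0.5; a 10% difference in reflectance between the two bands shifts the ratio by the same 10%, which a single ratio-to-water-vapor curve would misread as a change in water vapor.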
2.2 Estimation analysis from MODIS data by RM–NN
The NN differs greatly from a conventional algorithm, which requires that the inversion algorithm be known exactly. For the estimation of geophysical parameters from remote sensing data, this may be quite difficult because of the many nonlinear and poorly understood factors involved (such as mixed pixels). Many studies have demonstrated the inherent capabilities of the NN to perform classification, function approximation, optimization computation, and self-learning. Because the relationships between geophysical parameters are complicated, the NN is one of the best ways to solve this complex inverse problem [12–15]. In this study, we utilize the RM (MODTRAN4) and NN to estimate water vapor content from MODIS data.
In contrast to conventional methods, the NN does not require that the relationship between the input parameters and the output parameters be known; it determines the relationship between the inputs to the network and the outputs from the network directly from the training data. The implementation of the RM–NN algorithm is very simple and can be broken into four basic steps.
(1) Simulating the training and testing data with MODTRAN4, or obtaining reliable field measurement data, including high-accuracy water vapor content products for a reliable research region.
We use MODTRAN4 to simulate the radiance transfer of MODIS bands 2, 5, 17, 18, and 19 as the training and testing data, which can be viewed as reference data from a known ground truth. The relationship among the reflectances in MODIS bands 2, 5, 17, 18, and 19 is preserved in the RM (MODTRAN4) simulation, which is very difficult to achieve with field measurements. The reflectances in MODIS bands 2, 5, 17, 18, and 19 obtained from Bowker et al. [11] are used as input parameters in MODTRAN4. The atmospheric water vapor content ranges from 0.3 to 4.5 g cm⁻². The standard atmospheres include tropical, mid-latitude summer, mid-latitude winter, sub-arctic summer, and sub-arctic winter.
(2) Computing the radiance in bands 2, 5, 17, 18, and 19 and the six radiance ratios, which are used as the six input nodes of the NN; the output node is the water vapor content.
(3) Training and testing the NN.
We randomly divide the simulation data into two parts: 9760 sets of training data and 2934 sets of testing data. We then use a dynamic learning (DL) NN [12] to estimate the water vapor content. First we use the training data to train the NN, and then we use the test data to verify it. Part of the test results obtained by trial and error are given in Table 2.
As shown in Table 2, the accuracy is highest when the number of hidden layers is two and the number of hidden nodes is 800–800, which is mainly determined by the number of surface types and the different atmospheric profiles in different seasons and regions. We compare the retrieved water vapor content with the true water vapor content for the test data. As seen in Fig. 4, the estimation result is very good and the error is very small. The average error of water vapor content is under 0.06 g cm⁻², the average percentage error is about 5%, and the standard deviation of the estimation error is about 0.08 g cm⁻². The distributions of the average error and the average percentage error are shown in Figs. 4 and 5: the absolute error becomes larger with increasing water vapor content, whereas the relative (percentage) error shows the opposite trend.
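The split-train-verify procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic (a made-up exponential relation between six ratios and water vapor, not MODTRAN4 output), the network is a small 6–8–8–1 multilayer perceptron rather than the 800–800 configuration, and plain stochastic gradient backpropagation is swapped in for the dynamic learning algorithm of [12]. All function names are hypothetical.

```python
import math
import random


def split_data(samples, n_train, seed=0):
    """Shuffle the samples and split them into training and testing sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def init_layer(n_in, n_out, rng):
    """Weights (n_out x n_in) and biases for one fully connected layer."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def forward(layers, x):
    """Return the activations of every layer; tanh hidden units, linear output."""
    acts = [x]
    for k, (w, b) in enumerate(layers):
        z = [sum(wi * a for wi, a in zip(row, acts[-1])) + bj for row, bj in zip(w, b)]
        acts.append(z if k == len(layers) - 1 else [math.tanh(v) for v in z])
    return acts


def train_step(layers, x, y, lr):
    """One stochastic gradient step on the squared error (plain backpropagation)."""
    acts = forward(layers, x)
    delta = [acts[-1][0] - y]  # derivative of 0.5 * err^2 w.r.t. the output
    for k in reversed(range(len(layers))):
        w, b = layers[k]
        a_in = acts[k]
        # Error propagated to the previous layer (uses weights before the update).
        prev = [sum(w[j][i] * delta[j] for j in range(len(w))) for i in range(len(a_in))]
        for j, d in enumerate(delta):
            b[j] -= lr * d
            for i, a in enumerate(a_in):
                w[j][i] -= lr * d * a
        if k > 0:  # multiply by tanh'(z) = 1 - tanh(z)^2 of the previous layer
            delta = [p * (1.0 - a * a) for p, a in zip(prev, a_in)]


def mse(layers, data):
    """Mean squared error of the network output over a data set."""
    return sum((forward(layers, x)[-1][0] - y) ** 2 for x, y in data) / len(data)


rng = random.Random(1)
coeffs = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]  # hypothetical band sensitivities
samples = []
for _ in range(400):
    w_true = rng.uniform(0.3, 4.5)  # water vapor content in g cm^-2
    x = [math.exp(-c * math.sqrt(w_true)) * rng.uniform(0.97, 1.03) for c in coeffs]
    samples.append((x, w_true))

train, test = split_data(samples, n_train=300)
sizes = [6, 8, 8, 1]  # six inputs, two small hidden layers, one output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
mse_before = mse(layers, test)
for _ in range(50):
    for x, y in train:
        train_step(layers, x, y, lr=0.01)
mse_after = mse(layers, test)
print(f"test MSE before = {mse_before:.3f}, after = {mse_after:.3f}")
```

The held-out test set plays the same role as the 2934 testing sets in the text: it is never used for weight updates, so the error on it indicates how well the learned mapping generalizes.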
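The error statistics quoted above can be computed along the following lines. The exact definitions are assumptions made here, since they are not spelled out in the text: the average error is taken as the mean absolute difference, the percentage error is taken relative to the true value, and the standard deviation is that of the signed error. The sample values are hypothetical.

```python
import statistics


def error_summary(retrieved, truth):
    """Mean absolute error, mean percentage error, and std of the signed error."""
    errors = [r - t for r, t in zip(retrieved, truth)]
    mean_abs = statistics.fmean([abs(e) for e in errors])
    mean_pct = 100.0 * statistics.fmean([abs(e) / t for e, t in zip(errors, truth)])
    std = statistics.pstdev(errors)
    return mean_abs, mean_pct, std


truth = [0.5, 1.0, 2.0, 4.0]        # hypothetical reference values (g cm^-2)
retrieved = [0.55, 0.95, 2.1, 3.8]  # hypothetical retrievals (g cm^-2)
mean_abs, mean_pct, std = error_summary(retrieved, truth)
print(f"mean |error| = {mean_abs:.3f} g cm^-2")
print(f"mean percentage error = {mean_pct:.2f} %")
print(f"std of error = {std:.3f} g cm^-2")
```

In this made-up sample the absolute error grows with water vapor content while the percentage error does not, which is the kind of pattern the two statistics are meant to separate.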
3. Comparison with MODIS water vapor content product and validation
As an application example, we select the MODIS/TERRA image of the Shandong Peninsula, China, acquired on August 22, 2007, as the research region; Fig. 6 is the composite image of MODIS bands 3, 2, and 1. Figure 7 is the MYD05_L2 product, which provides per-pixel water vapor content values. We use the DL NN described above to estimate the water vapor content from MODIS 1B data. The six radiance ratios are used as the six input nodes of the NN, and the output is the water vapor content. Figure 8 is the estimation result by RM–NN. A comparison with Figs. 6 and 7 shows some difference in the identification of clouds, as can be seen in block D in Fig. 8. Eight clear-sky regions of about 8 × (200 × 13) pixels in Figs. 7 and 8 are selected for the comparison, and the result is shown in Fig. 9. The mean and the standard deviation of the retrieval error are about 0.56 g cm⁻² and 0.68 g cm⁻² relative to the NASA product (MYD05_L2).
The estimate by RM–NN is clearly larger than MYD05_L2 when the water vapor content is over 3.5 g cm⁻² or below 0.7 g cm⁻², as shown in Fig. 9. When the water vapor content is below 0.7 g cm⁻², the ratio method is not suitable for retrieving water vapor content because the estimation equation [2,5] can yield a zero value, which is impossible in reality, so the estimation error becomes larger when the water vapor content is small. When the water vapor content is over 3.5 g cm⁻², the ratio method is not sensitive enough to water vapor content, as shown in Figs. 2 and 3, so the estimation error also becomes larger. The greatest difference between the RM–NN estimates and the MYD05_L2 product, labeled in red in Fig. 8, occurs at the junction of the land with the sea. There, neither the MYD05_L2 product nor the RM–NN result is good, because water vapor content is known to change gradually in the sky.
At this junction, the water vapor content is larger than over the sea and the land in Fig. 7, but the reverse is true in Fig. 8. The main reason is that the reflectance at the junction with the sea is very different from that of the land and the sea; the other reason is the influence of mixed pixels. The assumption of the ratio method is not satisfied under this condition. The training database in RM–NN does not include this condition because we do not have the reflectance spectral curve for the coast. To overcome the difficulty of measurement, we take the average value of water vapor content at the junction of the sea and land as the true value, as shown in Fig. 8, and read the values of bands 2, 17, 18, and 19 from the MODIS 1B data according to latitude and longitude. We obtained 476 sets, which are added to the training database. Figure 10 is the estimation result after retraining with these supplementary training data sets. The spatial distribution along the coast in Fig. 10 is better than in Figs. 7 and 8, but the difference becomes larger where there are clouds in Figs. 7, 8, and 10. These results show that unusual spectral curves of some ground surface types influence estimation accuracy. The estimation results indicate that the NN has a powerful self-learning capability and can adapt to more conditions if reliable measured data are available, which can overcome the shortcomings of the conventional retrieval algorithm (ratio method) [2,5].
It is very difficult to obtain in situ ground truth measurements of water vapor content that match the pixel scale (1 km × 1 km at nadir) of MODIS data at the time of the satellite pass for validation of the algorithm. Generally speaking, water vapor content varies from point to point in the sky, and MODIS observes the ground at different angles, so it is difficult to precisely locate the pixel of the measured ground in the MODIS data. The AERONET (Aerosol Robotic Network) program is a federation of ground-based remote-sensing aerosol networks established by NASA and PHOTONS (Univ. of Lille 1, CNES, and CNRS-INSU). The AERONET collaboration provides globally distributed observations of spectral aerosol optical depth, inversion products, and precipitable water in diverse aerosol regimes (http://aeronet.gsfc.nasa.gov/). We obtained 328 AERONET data sets of water vapor content in clear sky from 12 sites (examples are given in Table 3). We extract the corresponding MODIS pixels from the MODIS 1B data with a program that matches longitude and latitude. The comparison between the RM–NN estimates and the observation data is shown in Fig. 11; the mean and the standard deviation of the estimation error are about 0.12 g cm⁻² and 0.18 g cm⁻². We will make more analyses of the application, which will be reported in the future. Another advantage of the RM–NN is that the estimation accuracy can be improved by supplementing the training data (for example, with reliable measurement data).
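The range-dependent comparison with MYD05_L2 discussed above can be sketched as a binned difference. The function name, the choice to bin by the product value, and the sample numbers are all assumptions for illustration; only the thresholds 0.7 and 3.5 g cm⁻² come from the text.

```python
def binned_mean_difference(rm_nn, product, edges=(0.7, 3.5)):
    """Mean of (RM-NN minus product) per water vapor range of the product."""
    low, high = edges
    bins = {"below": [], "middle": [], "above": []}
    for a, b in zip(rm_nn, product):
        key = "below" if b < low else "above" if b > high else "middle"
        bins[key].append(a - b)
    # An empty bin yields None rather than a division by zero.
    return {k: (sum(v) / len(v) if v else None) for k, v in bins.items()}


# Hypothetical co-located clear-sky pixel values in g cm^-2 (not paper data).
product = [0.5, 0.6, 1.5, 2.5, 3.8, 4.0]
rm_nn = [0.9, 1.0, 1.6, 2.4, 4.6, 4.8]
diff = binned_mean_difference(rm_nn, product)
for name, value in diff.items():
    print(f"{name:>6}: mean difference = {value:+.2f} g cm^-2")
```

Reporting the difference per range, rather than a single mean, makes it visible when two retrievals agree in the middle of the range but diverge at the dry and moist ends.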
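Matching a ground site to a MODIS pixel by longitude and latitude can be sketched as a nearest-neighbor search over the per-pixel geolocation arrays. This is an illustrative assumption about how such a program might work, not the authors' code; the grid values and the site coordinates are hypothetical and do not correspond to a real AERONET station.

```python
import math


def nearest_pixel(lat_grid, lon_grid, site_lat, site_lon):
    """Return (row, col) of the pixel whose center is closest to the site.

    Uses an equirectangular approximation of distance, which is adequate
    over the few kilometers that separate neighboring pixels.
    """
    best, best_d2 = None, float("inf")
    cos_lat = math.cos(math.radians(site_lat))
    for i, (lat_row, lon_row) in enumerate(zip(lat_grid, lon_grid)):
        for j, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            d2 = (lat - site_lat) ** 2 + ((lon - site_lon) * cos_lat) ** 2
            if d2 < best_d2:
                best, best_d2 = (i, j), d2
    return best


# Hypothetical 3 x 3 geolocation grid (degrees) and retrieved values (g cm^-2).
lat_grid = [[37.02] * 3, [37.01] * 3, [37.00] * 3]
lon_grid = [[120.00, 120.01, 120.02] for _ in range(3)]
water_vapor = [[2.1, 2.2, 2.3], [2.0, 2.1, 2.4], [1.9, 2.0, 2.2]]

row, col = nearest_pixel(lat_grid, lon_grid, site_lat=37.011, site_lon=120.019)
print(f"pixel ({row}, {col}) -> {water_vapor[row][col]} g cm^-2")
```

A brute-force search is enough for a handful of sites; for many sites or full swaths a spatial index would be the more practical choice.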
4. Conclusions
The shortcomings of conventional algorithms for retrieving water vapor content from MODIS data are discussed. The relationships between the total atmospheric water vapor amount and different radiance ratios for MODIS bands 2, 5, 17, 18, and 19 are explored. The analysis indicates that the combination of a radiative transfer model (RM) and a neural network (NN) can accurately estimate water vapor content because it makes use of latent information shared among geophysical parameters that was not fully used in previous algorithms.
We utilize MODTRAN4 to simulate data for training and testing the neural networks. The test results indicate that RM–NN is very robust. Accuracy is highest when the number of hidden layers is two and the number of hidden nodes is 800–800. Analysis of the simulation data indicates that the mean error of water vapor content is under 0.06 g cm⁻². The trained DL neural network is used to estimate water vapor content from MODIS 1B data. The comparison between the RM–NN estimates and the MODIS product provided by NASA indicates that the MODIS product underestimates the water vapor content when the water vapor content is over 3.5 g cm⁻² or below 0.7 g cm⁻². The mean error is 0.56 g cm⁻² relative to the MODIS product (MYD05_L2), which is estimated by the ratio method. The comparison between the RM–NN estimates and the observation data shows that the mean and the standard deviation of the estimation error are about 0.12 g cm⁻² and 0.18 g cm⁻². The main purpose of this study is to demonstrate that RM–NN is competent for estimating water vapor content. The incorporation of RM–NN to perform inversion is an important advancement in the remote sensing field and makes inversion more accurate and more practical. We will carry out further application analysis, to be reported in the future, to make RM–NN more robust and suitable for more conditions.
Acknowledgments
The authors thank K. Chen, Y. Tzeng and H. W. Lee, and The Center for Space and Remote Sensing Research, National Central University, Taiwan, for their various help with this study and the JPL for providing the ASTER Spectral Library data. They also thank the AERONET (Aerosol Robotic Network) program for providing precipitable water data and NASA for providing MODIS 1B and the MODIS water vapor content product. They also thank the anonymous reviewers for their valuable comments, which greatly improved the presentation of this paper. This work was supported by the open fund of the State Key Laboratory of Remote Sensing Science, jointly sponsored by the Institute of Remote Sensing Applications of the Chinese Academy of Sciences and Beijing Normal University; the National Natural Science Foundation of China (NSFC) (grants 40930101 and 40971218), the 973 Program (grant 2007CB714403), and the Special Fund for Basic Research Work of Central Scientific Research Institutions for Public Welfare (grants 901-40).
References and links
1. S. Manabe and R. T. Wetherald, “Thermal equilibrium of atmosphere with a given distribution of relative humidity,” J. Atmos. Sci. 24(3), 241–259 (1967).
2. Y. J. Kaufman and B. C. Gao, “Remote sensing of water vapor in the near-IR from EOS/MODIS,” IEEE Trans. Geosci. Rem. Sens. 30(5), 871–884 (1992).
3. V. Carrere and J. E. Conel, “Recovery of atmospheric water vapor total column abundance from imaging spectrometer data around 940 nm - sensitivity analysis and application to airborne visible/infrared imaging spectrometer (AVIRIS) data,” Remote Sens. Environ. 44(2-3), 179–204 (1993).
4. B. C. Gao and Y. J. Kaufman, “Water vapor retrievals using Moderate Resolution Imaging Spectroradiometer (MODIS) near-infrared channels,” J. Geophys. Res. 108(D13), 4389 (2003).
5. J. A. Sobrino, J. E. Kharraz, and Z. L. Li, “Surface temperature and water vapor retrieval from MODIS data,” Int. J. Remote Sens. 24(24), 5161–5182 (2003).
6. M. D. King, Y. J. Kaufman, W. P. Menzel, and D. Tanre, “Remote sensing of cloud, aerosol, and water vapor properties from the moderate resolution imaging spectrometer (MODIS),” IEEE Trans. Geosci. Rem. Sens. 30(2), 1–27 (1992).
7. B. Gao and Y. J. Kaufman, The MODIS Near-IR Water Vapor Algorithm: Product ID: MOD05-Total Precipitable Water, Algorithm Technical Background Document, Remote Sensing Division, Code 7212, Naval Research Laboratory, 4555 Overlook Avenue, SW, Washington, DC 20375 (1998).
8. K. Mao, J. Shi, Z. Li, and H. Tang, “An RM–NN algorithm for retrieving land surface temperature and emissivity from EOS/MODIS data,” J. Geophys. Res. 112(D21), D21102 (2007).
9. K. Mao, Z. Qin, J. Shi, and P. Gong, “A practical split-window algorithm for retrieving land surface temperature from MODIS data,” Int. J. Remote Sens. 26(15), 3181–3204 (2005).
10. A. Berk, L. S. Bernstein, and D. C. Robertson, “MODTRAN: a moderate resolution model for LOWTRAN,” Burlington, MA, Spectral Science, Inc. Rep. AFGL-TR-87-0220 (1987).
11. D. E. Bowker, R. E. Davis, D. L. Myrick, K. Stacy, and W. T. Jones, “Spectral reflectances of natural targets for use in remote sensing studies,” NASA Reference Pub. 1139 (1985).
12. Y. C. Tzeng, K. S. Chen, W. L. Kao, and A. K. Fung, “A dynamic learning neural network for remote sensing applications,” IEEE Trans. Geosci. Rem. Sens. 32(5), 1096–1102 (1994).
13. K. Mao, J. Shi, H. Tang, Q. Zhou, Z. L. Li, and K. S. Chen, “A neural network technique for the retrieval of land surface temperature from advanced microwave scanning radiometer-EOS passive microwave data using a multiple-sensor/multi-resolution remote sensing approach,” J. Geophys. Res., doi:10.1029/2007JD009577 (to be published).
14. K. Mao, J. Shi, H. Tang, Z. L. Li, X. Wang, and K. Chen, “A neural network technique for separating land surface emissivity and temperature from ASTER imagery,” IEEE Trans. Geosci. Rem. Sens. 46(1), 200–208 (2008).
15. K. Mao, H. Tang, X. Wang, Q. Zhou, and D. Wang, “Near-surface air temperature estimation from ASTER data based on neural network algorithm,” Int. J. Remote Sens. 29(20), 6021–6028 (2008).