Influence of high-resolution surface databases on the modeling of local atmospheric circulation systems

Large-eddy simulations are performed using the Advanced Regional Prediction System (ARPS) code at horizontal grid resolutions as fine as 300 m to assess the influence of detailed and updated surface databases on the modeling of local atmospheric circulation systems of urban areas with complex terrain. Applications to air pollution and wind energy are sought. These databases comprise 3 arc-sec topographic data from the Shuttle Radar Topography Mission, 10 arc-sec vegetation-type data from the European Space Agency (ESA) GlobCover project, and 30 arc-sec leaf area index and fraction of absorbed photosynthetically active radiation data from the ESA GlobCarbon project. Simulations are carried out for the metropolitan area of Rio de Janeiro using six one-way nested-grid domains that allow the choice of distinct parametric models and vertical resolutions associated with each grid. ARPS is initialized using 0.5°-resolution Global Forecasting System data from the National Centers for Environmental Prediction, which are also used every 3 h as lateral boundary conditions. Topographic shading is turned on and two soil layers are used to compute the soil temperature and moisture budgets in all runs. Results for two simulated runs covering three periods of time are compared to surface and upper-air observational data to explore the dependence of the simulations on initial and boundary conditions, grid resolution, and topographic and land-use databases. Our comparisons show overall good agreement between simulated and observational data, mainly for the potential temperature and the wind speed fields, and clearly indicate that the use of high-resolution databases significantly improves our ability to predict the local atmospheric circulation.


Introduction
Numerical modeling of unsteady three-dimensional turbulent atmospheric flow is a natural approach to describe mean properties of various physical processes that are not often captured by field measurements collected at a few scattered points in space. One of the most robust, versatile and modular mesoscale models designed to resolve atmospheric flows in many scales is the Advanced Regional Prediction System (ARPS; Xue et al., 1995, 2000, 2001). Its numerical code has been developed and disseminated freely in the scientific community to allow changes and developments in its subroutines. The ARPS code solves a set of partial differential equations (PDEs) written for the three-dimensional, compressible time-dependent atmospheric flow, under dry, moist and nonhydrostatic conditions. ARPS is a typical mesoscale model that can be run at fine space and time resolutions in order to be able to represent highly complex terrain based on the high-resolution surface databases that are available nowadays. ARPS incorporates heterogeneous land-surface conditions and time-dependent synoptic boundary forcing, but these inputs are typically limited by outdated, coarse-resolution data. Like other current mesoscale models - such as Weather Research and Forecasting (WRF; Skamarock et al., 2001, 2005), Fifth-Generation National Center for Atmospheric Research (NCAR)/Penn State Mesoscale Model (MM5; Grell et al., 1993; Dudhia et al., 2005), Regional Atmospheric Modeling System (RAMS; Walko et al., 1995), Meso-Eta Model (Black, 1994), Mesoscale Non-Hydrostatic Model (Meso-NH; Lafore et al., 1998), and Aire Limitee Adaptation Dynamique Developement International (ALADIN; Bubnová et al., 1993; Radnóti et al., 1995; Horányi et al., 1996) -

L. M. S. Paiva et al.: Influence of high-resolution surface databases
ARPS allows significant refinement of the numerical grid to the point where LES (large-eddy simulation) can be used, since some turbulent parametric models developed for LES are available in the code. We chose the LES-ARPS model as our main tool because it is based on a 1.5-order turbulent kinetic energy (1.5-TKE) closure scheme and the Moeng and Wyngaard (1989) turbulence model and because it has been thoroughly tested (Chow, 2004; Chow et al., 2006) and used as a reference for the assessment of state-of-the-art mesoscale models such as WRF (Gasperoni, 2013). Among all turbulence parameterization schemes available in ARPS, the 1.5-TKE scheme with the Moeng and Wyngaard (1989) turbulence model is the most suitable for this type of simulation. ARPS was formulated to be run as either a RANS (Reynolds-averaged Navier-Stokes) or an LES code that solves the three-dimensional, compressible, nonhydrostatic, filtered Navier-Stokes equations. The relevant settings for our application require the use of ARPS in the LES mode because the length scale is based on the grid spacing, as explained by Chow et al. (2006), and the difference between RANS and LES in this case is in the definition of the length scale (Michioka and Chow, 2008). For the LES mode, the length scale employed in the eddy viscosity equation is based on the grid size, whereas the length scale for the RANS mode is based on a PBL (planetary boundary layer) depth or distance from the ground. The differences between RANS and LES become small when similar space and time resolutions are used in numerical modeling. This is also one of the four concepts listed by Pope (2000) that characterize the LES model, indicating that the physical and numerical modeling must be deliberately combined. Additionally, we prefer the LES procedure for our study because it is clear which physical features are resolvable and which must be modeled.
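The role of the length scale in the eddy viscosity can be illustrated with a minimal sketch. It assumes a commonly used TKE-closure form, K_m = c_k l sqrt(e), with the LES length scale taken as the cube root of the cell volume and c_k = 0.1; both choices, and all function names, are assumptions made for illustration and are not taken from the ARPS source code.

```python
import math


def les_length_scale(dx, dy, dz):
    """Grid-based length scale (assumed form): cube root of the cell volume."""
    return (dx * dy * dz) ** (1.0 / 3.0)


def eddy_viscosity(tke, length, c_k=0.1):
    """Eddy viscosity K_m = c_k * l * sqrt(e); c_k = 0.1 is an assumed value."""
    return c_k * length * math.sqrt(tke)


# LES mode: length scale follows the grid (here 300 m x 300 m x 20 m cells).
l_les = les_length_scale(300.0, 300.0, 20.0)
# RANS-like mode: length scale follows a boundary-layer depth (hypothetical 1000 m).
l_rans = 1000.0

tke = 0.5  # hypothetical subgrid TKE in m^2 s^-2
print(f"LES  l = {l_les:7.1f} m, K_m = {eddy_viscosity(tke, l_les):6.2f} m^2/s")
print(f"RANS l = {l_rans:7.1f} m, K_m = {eddy_viscosity(tke, l_rans):6.2f} m^2/s")
```

In this sketch the only difference between the two modes is the value passed as `length`, which mirrors the distinction drawn in the text.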
Several numerical studies available in the literature have adopted significant refinement of the grid in mesoscale simulations. As an example, we may cite the simulations carried out by Grell et al. (2000), who used MM5 to compute the atmospheric flow in some regions of the Swiss Alps with horizontal resolutions of up to 1 km. It is worth noting that previous works, such as Lu and Turco (1995), point out that the increase in the spatial resolution of the grid can generate more detailed and reliable solutions. In fact, most studies show that high-resolution numerical grids tend to improve the quality of numerically simulated data when compared to observed data (Revell et al., 1996; Grønås and Sandvik, 1999; Grell et al., 2000; Chow et al., 2006). However, in regions of steep and extensive slopes, the topography may be poorly represented because the ramps that form the slopes on the surface may become irregular as the grid resolution is increased. In this case, the coarse spatial resolution of the topographic database does not add any information to the simulations, since fine numerical grids require extensive surface information. The same concerns are expressed in Chow (2004). High-resolution numerical grids are usually employed for simulations of small areas due to the high computational cost. Revell et al. (1996) and Grønås and Sandvik (1999) used numerical grids with resolutions of 250 m to perform LES simulations, but the wind field was not reproduced accurately in the regions studied. The authors considered that the main source of inaccuracy was the absence of high-resolution surface data. However, Zhong and Fast (2003) were successful in capturing the general characteristics of the surface fluxes present in the Salt Lake valley, in the state of Utah, using three of the mesoscale models cited above: RAMS, MM5, and the Meso-Eta Model. All models were initialized using synoptic data and used horizontal grid resolutions of 560 and 850 m, which are close to the topographic database resolution of the models. Even so, the simulations were not able to capture the local circulation and the surface fluxes. In order to improve the results, Zhong and Fast suggested changes in the vertical mixing terms, in the radiation model and in the parameterizations adopted for the surface fluxes. Chen et al. (2004) also used ARPS to simulate the atmospheric flow in the Salt Lake valley. The results were more satisfactory because they increased the numerical domain size and the horizontal resolution to 250 m. Sensitivity tests were performed by Chow et al. (2006) running ARPS in LES mode to simulate the flow in the Riviera Valley, situated in the Swiss Alps, using five one-way nested grids at horizontal resolutions of 9 km, 3 km, 1 km, 350 m, and 150 m. Chow et al. (2006) concluded that, although sensitive to the soil temperature and moisture initialization, their numerical results were in good agreement with the field data recorded during the 1999 campaign of the Mesoscale Alpine Programme (MAP Riviera Project; Rotach et al., 2004). In simulations of the type discussed here, which are characterized by short spin-ups, the initialization of soil moisture and skin temperature may become one of the main issues of the modeling, since it may require offline models to provide proper initial conditions. Chow et al. (2006) tried many different approaches to solve this problem, but the statistical indices were still lower than expected.
In the numerical studies performed by Hanna and Yang (2001), who used four different mesoscale models in their simulations, the discrepancies that appeared in the calculation of the wind direction and speed were attributed to the misrepresentation of the turbulent fluxes, mainly due to the land-use database and the subgrid-scale parameterization adopted. However, Zängl et al. (2004) and Gohm et al. (2004) simulated foehn winds using MM5 with two-way nested grids and found nonnegligible differences between simulated and observed data even using a horizontal resolution of 267 m for the innermost grid. Based on that, they concluded that the topographic data require a high resolution and that the lateral BCs (boundary conditions) are poorly satisfied if the grid points are too far apart from each other. In contrast, Zängl et al. (2004) found that the effect of the horizontal computational mixing model was larger than the effect of grid resolution. Their model performed better with an improved computational mixing scheme at coarse resolution (3 km) than with the traditional mixing scheme at fine resolution (1 km). Most studies also point out that the soil and the vegetation databases are important sources of error. De Wekker et al. (2005), using RAMS, showed good agreement between numerical and observed data, but their modeling did not accurately capture the wind structure in a region characterized by valleys and mountains, even though their grid resolution of 333 m was very fine. The probable cause was the poor representation of the topography in the database provided by RAMS.
Many numerical weather- and climate-prediction models are sensitive to the surface heat and moisture fluxes (Beljaars et al., 1996; Viterbo and Betts, 1999). Because these surface transport processes occur on the subgrid scales, they cannot be resolved directly and, therefore, they need to be parameterized. In practice, the moisture fluxes at the soil surface are estimated by soil and vegetation models (Pitman, 2003). The transfer of moisture is usually described by semiempirical aerodynamic coefficients, which are based on the similarity functions presented by Businger et al. (1971) and Deardorff (1972). Recently, Weigel et al. (2007) showed that the moisture flux from the soil surface to the atmosphere is not controlled by the turbulent eddies only. The authors note that other mechanisms are also important, such as the mass transport due to the geometry of the topography and the interactions that exist in the thermally induced circulations present in regions of valleys and mountains.
Recent studies have shown that LES has been adopted frequently, mainly due to increased computational power to solve high-resolution atmospheric flow. Wyngaard (2004) observed that LES is not restricted to applications where the flow occurs in the smallest resolvable turbulent scales. Chow et al. (2006) and Weigel et al. (2006, 2007) indicated that the complex thermal structure and dynamics of the atmospheric flow over the complex terrain present in the Swiss Alps may be reproduced in detail using ARPS with Moeng and Wyngaard's (1989) LES model turned on. Michioka and Chow (2008) also showed that ARPS performs well when configured to run in the LES mode. These authors coupled ARPS to a code that calculates the dispersion of passive pollutants and ran simulations in regions of highly complex terrain using one-way nested grids, where the highest resolution was 25 m in the horizontal directions. Recently, Chow and Street (2009) implemented a new turbulent-flux parameterization model in ARPS in the form of a Taylor-series expansion, which aims to reconstruct the resolved subfilter-scale turbulent stresses. Variations of this series expansion are combined with dynamic eddy-viscosity models for the subgrid-scale stresses to create a dynamic reconstruction model (DRM; Chow, 2004; Chow et al., 2005). The authors evaluated the performance of DRM computing the flow over the Askervein Hill (Taylor and Teunissen, 1985, 1987) and found promising results. The atmospheric boundary layer (ABL) flow over the Askervein Hill has also been studied by Chow and Street (2009), who added a new stress tensor in the ARPS code to run it in LES mode. The authors concluded that the use of explicit filters and a DRM avoids common problems in the ground-surface BCs that appear in the LES model and that the results were quite satisfactory when compared to other field data and numerical models (Castro et al., 2003; Lopes et al., 2007).
The main objective of this paper is to evaluate the local atmospheric circulation system of the metropolitan area of Rio de Janeiro (MARJ), Brazil, by setting up a high-resolution numerical model using ARPS in the LES mode as the main tool, and having detailed and updated topography and land-use databases incorporated into the model. New preprocessors are developed and incorporated into ARPS to input the information from the database files of the Shuttle Radar Topography Mission (SRTM; Farr and Kobrick, 2000) and the European Space Agency (ESA) GlobCover (Bicheron et al., 2008) and GlobCarbon (Eyndt et al., 2007; Arino et al., 2008) projects in order to generate appropriate nonhomogeneous surface BCs for the present model.

Site characterization, period synoptic analysis and surface station data
In order to assess the capability of our high-resolution numerical model to compute the local atmospheric circulation over a complex-terrain region, we perform numerical simulations for the MARJ. The MARJ is characterized by complex topography and nonhomogeneous land-use and land-cover surfaces, and is surrounded by water bodies, such as the Sepetiba Bay, the Guanabara Bay and the Atlantic Ocean (see Zeri et al., 2011). Also, the MARJ is influenced by the South Atlantic subtropical anticyclone (SASA), by frontal low-pressure systems on the synoptic scale, and by breeze systems on meso- and local scales (sea-land and valley-mountain). The influence of the several atmospheric scales on the local circulation in this region makes it a major challenge for the assessment of the performance of the mesoscale model ARPS. The MARJ is a high-population-density area that accounts for about 80 % of the population of the state of Rio de Janeiro, Brazil, and contributes the most to the emission of pollutants into the atmosphere. The main reasons are the many industrial and mobile pollution sources that have emerged following new infrastructure investments associated with the expansion of the Itaguaí Harbor, the installation of the Steel Company of the Atlantic Ocean and the existing petrochemical industry. The high emission rates of pollutants in conjunction with the characteristics of the local atmospheric flow lead to the degradation of the air quality of the MARJ's air basins (Fig. 1), which are regions delimited as a function of the homogeneity of the areas, the topographic formation, the soil-type coverage, the climate characteristics, the mechanisms of pollutant dispersion and the airspace regions. There are a few World Meteorological Organization (WMO) standard surface weather-observation stations and just one upper-air station in the MARJ, indicating the need to use high-resolution simulated data of the atmospheric flow to provide support to the air quality modeling in the region.
Our simulations cover two time periods of 48 h (between 00:00 UTC (coordinated universal time) 6 September and 00:00 UTC 8 September 2007, and between 00:00 UTC 6 February and 00:00 UTC 8 February 2009) and one time period of 24 h on 8 August 2011. In the first period (Sep/2007), the synoptic analyses indicated the dominance of the SASA over the MARJ. This is a system of semipermanent high pressure, characterized by the presence of horizontal synoptic winds that rotate counterclockwise, vertical subsidence that generates divergence near the surface, clear sky, calm weather, and stable conditions; the SASA location and intensity change seasonally (Richter et al., 2008; Zeri et al., 2011). The influence of the SASA contributes to inhibiting cloudiness and the advancement of high-latitude frontal systems in the region of interest (Lucena et al., 2012). Meteorological mesoscale and microscale systems, such as the sea breezes that act in the MARJ, can be hidden by the SASA system, but they are not totally destroyed as is the case when fronts pass by the area. During this first period, the SASA system remained mostly over the Atlantic Ocean, between latitudes of 50° S and 0° and longitudes of 50° W and 0°, whereas the directions of prevailing synoptic-scale winds were northeast and east in the MARJ. In the second period (Feb/2009), the synoptic analyses indicated that a moist mass of air was replaced by a drier high-pressure post-frontal mass of air over the MARJ after the passage of a frontal system. This drier mass of air joined the SASA hours later and moved further into the Atlantic Ocean (with respect to the first period). Although the MARJ was also dominated by the SASA circulation in the third period (Aug/2011), which is common in a month between the fall and the winter seasons, we note the occurrence of fog between dawn and morning.
For a comparison and statistical analysis of the results, we use hourly observational data to calculate potential temperature and the water-vapor mixing ratio at 2 m above ground level (a.g.l.), and get wind direction and speed at 10 m a.g.l. from 11 available WMO standard surface weather-observation stations located in different zones of the MARJ, as seen in Fig. 1 and Table 1. The data from Marambaia, Ecologia Agricola, Vila Militar, Jacarepaguá (JPA), Copacabana and Xerem surface stations were obtained from the Brazilian National Institute of Meteorology. The METeorological Aerodrome Report (METAR) code data, which may be decoded to get wind direction and speed, visibility, absolute temperature, dew point temperature, and atmospheric pressure, were produced at the aerodromes of Santa Cruz (SBSC), Campo dos Afonsos (SBAF), Jacarepaguá (SBJR), Santos Dumont (SBRJ) and Galeão (SBGL), which are regulated by the Meteorology Network of the Brazilian Air Force Command. It is worth mentioning that SBGL also has sounding data available.
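As an illustration of how potential temperature and the water-vapor mixing ratio may be derived from decoded station variables (temperature, dew point and pressure), the following is a minimal sketch based on standard textbook relations. The Magnus-type saturation formula, the constants and the function names are assumptions chosen for illustration; the text does not state which formulas were actually used.

```python
import math

P0_HPA = 1000.0   # reference pressure (hPa)
KAPPA = 0.2857    # R_d / c_p for dry air
EPSILON = 0.622   # ratio of molecular weights, water vapor / dry air


def potential_temperature(temp_k, pressure_hpa):
    """Poisson's equation: theta = T * (p0 / p) ** kappa."""
    return temp_k * (P0_HPA / pressure_hpa) ** KAPPA


def vapor_pressure_hpa(dewpoint_c):
    """Vapor pressure (hPa) from dew point (deg C), Magnus-type approximation."""
    return 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))


def mixing_ratio(dewpoint_c, pressure_hpa):
    """Water-vapor mixing ratio (kg/kg) from dew point and station pressure."""
    e = vapor_pressure_hpa(dewpoint_c)
    return EPSILON * e / (pressure_hpa - e)


# Hypothetical observation: 25 deg C, dew point 20 deg C, 1013.25 hPa.
theta = potential_temperature(25.0 + 273.15, 1013.25)
w = mixing_ratio(20.0, 1013.25)
print(f"theta = {theta:.2f} K, w = {1000.0 * w:.2f} g/kg")
```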

Numerical modeling setup
The procedures we employed to run accurate numerical simulations of the atmospheric flow in the MARJ are described here. These procedures can also be applied to other regions on Earth, since we have developed new subroutines to process all the satellite data needed. The steps taken include the setup of the numerical method employed in ARPS, the structure of a high-resolution one-way nested grid, the incorporation of detailed and updated topography and land-use databases into ARPS, and the adequate selection of radiation, turbulence closure, microphysics and cumulus parameterizations based on the ARPS user guide (Xue et al., 1995). A control run (CTL) set up with the (outdated) original ARPS surface databases serves as a reference for comparisons between the high-resolution (HR) surface database final runs and the field observational data.

Numerical schemes
We set up ARPS to employ a fourth-order spatial differencing for the advection terms and a mode-splitting technique for the temporal discretization to accommodate high-frequency acoustic waves. Large time steps (Δt) are chosen based on the leapfrog method. For the small time steps (Δτ) we use first-order forward-backward explicit time stepping, except for terms responsible for vertical acoustic propagation, which are treated semi-implicitly.
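To give a feel for what fourth-order spatial differencing buys over second order, the following is a minimal sketch of the standard centered stencils on a periodic, unstaggered 1-D grid. It is a generic illustration only: the ARPS grid is staggered and its actual advection operators differ, and all names here are hypothetical.

```python
import math


def ddx_second_order(f, dx):
    """Second-order centered first derivative on a periodic grid."""
    n = len(f)
    return [(f[(i + 1) % n] - f[(i - 1) % n]) / (2.0 * dx) for i in range(n)]


def ddx_fourth_order(f, dx):
    """Fourth-order centered first derivative on a periodic grid."""
    n = len(f)
    return [
        (8.0 * (f[(i + 1) % n] - f[(i - 1) % n])
         - (f[(i + 2) % n] - f[(i - 2) % n])) / (12.0 * dx)
        for i in range(n)
    ]


# Differentiate sin(x) on [0, 2*pi) and compare with the exact cos(x).
n = 64
dx = 2.0 * math.pi / n
x = [i * dx for i in range(n)]
f = [math.sin(xi) for xi in x]
exact = [math.cos(xi) for xi in x]

err2 = max(abs(a - b) for a, b in zip(ddx_second_order(f, dx), exact))
err4 = max(abs(a - b) for a, b in zip(ddx_fourth_order(f, dx), exact))
print(f"max error, 2nd order: {err2:.2e}")
print(f"max error, 4th order: {err4:.2e}")
```

On this smooth test function the fourth-order stencil reduces the truncation error by several orders of magnitude at the same grid spacing.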

Grid nesting and topography
We adopt a one-way nested-grid structure that is set up based on the tutorials of Mesinger and Arakawa (1976) and Warner (1997). An external grid (GEXT) is set up in order to preprocess the data from the 0.5° Global Forecasting System analyses (0.5°-GFS; Kanamitsu, 1989), and to produce the first initial conditions (ICs) and lateral BCs at 3 h intervals for the outermost domain (G1), which has a horizontal resolution of 27 km. Relaxation of the BC values is applied to a 5-10 grid-cell zone around the domain boundary, depending on the grid. In all simulations output data are produced at hourly intervals such that the data computed on any coarse grid are also employed as ICs and BCs at 3 h intervals for a subsequent fine-grid simulation in the one-way nested-grid domain. Namely, G1 produces data for G2, G2 for G3 and G3 for G4. Similarly, G4 produces data for both G5 and G6. In order to determine the horizontal resolutions for grids G2-G4 of the one-way nested-grid setup, we employ a ratio of one grid size to the next equal to 3, all centered with respect to the coarsest grid. The finest grids (G5 and G6) have a horizontal resolution of 300 m and are not nested to each other, as illustrated in Fig. 2. This setup has been chosen because we are interested in understanding how the local atmospheric circulations behave when they cross the boundaries of air basins I, and II and III of the MARJ (Fig. 1), which are covered by the G5 and G6 domains, respectively. This grid setup also helps us to reduce the computational cost that we would incur if a single domain were used with the finest-grid resolution. It is important to remember that the higher-resolution grids process information coming from the 0.5°-GFS analyses and results of the previous simulation performed on the coarser-resolution grids, which embed physical effects of the various scales being modeled. When one-way nested grids are used, the energy transfer occurs from large to small scales. Nevertheless, this procedure used by ARPS allows the vertical resolution to be set separately for each numerical grid, ensuring that the flow computation is performed in the LES mode, which we found to be necessary to accommodate the steep terrains of the MARJ. Currently, two-way nesting schemes are available in other codes, such as RAMS, that do not allow for vertical resolution changes, but the effect of two-way nesting will be explored in future work. A Mercator projection is used with the true latitude and longitude located at the center of each domain to minimize distortion in the main area of interest, particularly in smaller areas that have the highest resolutions. For each nested subdomain, the terrain is smoothed next to its boundary to match the local neighboring grids that have lower resolution.
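The nesting chain described above can be sketched in a few lines. The G1 resolution (27 km), the refinement ratio of 3 for G2-G4 and the 300 m resolution of G5 and G6 come from the text; the G2-G4 values are derived here from that ratio, and the function and variable names are hypothetical.

```python
def nested_resolutions(outer_dx_m, ratio, n_grids):
    """Horizontal spacing of successive nests refined by a constant ratio."""
    return [outer_dx_m / ratio ** k for k in range(n_grids)]


# G1-G4: start at 27 km and refine by a factor of 3 per nest.
grids = dict(zip(["G1", "G2", "G3", "G4"], nested_resolutions(27000.0, 3, 4)))
# G5 and G6: both 300 m, both driven by G4 and not nested in each other.
grids["G5"] = 300.0
grids["G6"] = 300.0

# One-way nesting: each grid takes its ICs and lateral BCs from its parent.
parent = {"G2": "G1", "G3": "G2", "G4": "G3", "G5": "G4", "G6": "G4"}

for name, dx in grids.items():
    source = parent.get(name, "GEXT (0.5-degree GFS analyses)")
    print(f"{name}: dx = {dx:8.0f} m, driven by {source}")
```

Under these values the step from G4 to the finest grids corresponds to a ratio of about 3.3 rather than exactly 3.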
Table 2. Structure of the one-way nested numerical grids. Given the number of grid points n_x, n_y and n_z, the physical domain size can be calculated as L_x × L_y × H_z, where L_x = (n_x − 3)Δx, L_y = (n_y − 3)Δy, and H_z = (n_z − 3)Δz_med.
The ARPS original files of the 30 arc-sec (i.e., approximately 900 m) USGS (United States Geological Survey) topography database are preprocessed for grids G1-G3 in our simulations. Depending on the run configuration, the ARPS files for grids G4 and G6 are preprocessed from either the 30 arc-sec USGS (i.e., approximately 900 m) topography database or the 3 arc-sec SRTM (i.e., approximately 90 m) detailed and updated high-resolution topography database, which we have recently incorporated into ARPS. This incorporation procedure required substantial modifications to the arpstrn.f90 source file of the original ARPS code, which can be downloaded directly from http://meteoro.cefet-rj.br/leanderson/arps/ and run with the 3 arc-sec SRTM data on the ARPS model version 5.2.8. Details on how the arpstrn.f90 routine works can be seen in the file's comment lines and also in Xue et al. (1995). In Fig. 3a and b it is possible to compare these two distinct databases processed for G5, where we can easily notice that the topographic details are much better reproduced by the 3 arc-sec SRTM data, especially in regions where the topography exceeds 800 m a.g.l. The high-resolution topography database allows a much more appropriate definition of the surface boundary condition. Although a comparison between these databases for grid G6 is not shown, the east side of the G5 domain provides a good idea of what the topography maps look like for G6.
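The domain-size relations in the Table 2 caption, together with the relation H_z = (n_z − 3)Δz_med stated for the stretched vertical grid, can be evaluated as in the sketch below. The grid-point counts and the mean vertical spacing used in the example are hypothetical placeholders, not the values of Table 2.

```python
def domain_size(nx, ny, nz, dx, dy, dz_med):
    """Physical domain size (L_x, L_y, H_z) in the same units as the spacings.

    Three points are subtracted in each direction, following the
    relations L = (n - 3) * spacing quoted in the text.
    """
    return ((nx - 3) * dx, (ny - 3) * dy, (nz - 3) * dz_med)


# Hypothetical example: 103 x 103 x 53 points, 300 m horizontal spacing,
# 400 m mean vertical spacing (placeholders, not taken from Table 2).
lx, ly, hz = domain_size(103, 103, 53, 300.0, 300.0, 400.0)
print(f"L_x = {lx / 1000:.0f} km, L_y = {ly / 1000:.0f} km, H_z = {hz / 1000:.0f} km")
```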

Vertical resolution and grid aspect ratio
ARPS incorporates a σ-coordinate system that follows the ground surface. The grids are stretched using a hyperbolic tangent function (Xue et al., 1995) that produces an average spacing Δz_med and a domain height H_z equal to (n_z − 3)Δz_med, where n_z is the number of vertical grid points. The smallest vertical spacing Δz_min used for each grid and the number of grid points below ground level n_zg can be found in Table 2. To resolve the smallest structures of the atmosphere it is necessary to adopt high vertical resolutions, but the grid aspect ratio (Δ/Δz_min, where Δ = Δx = Δy) should not be extremely large to avoid numerical errors, especially in the horizontal gradients (Mahrer, 1984), and also to avoid distortion of the resolvable turbulent structures when runs are carried out in LES mode (Kravchenko et al., 1996). Poulos (1999) and De Wekker (2002) have found that the aspect ratio of the grid should be small, especially for terrains with steep topography. Following the recommendations of Mahrer (1984), Kravchenko et al. (1996), Poulos (1999) and De Wekker (2002) and similar procedures adopted by Chow et al. (2006), we set up grids G1 and G2 with grid aspect ratios of 540 and 180 near the surface, respectively, to represent the scales in the atmosphere. These choices are adequate because the characteristic scales of the topography and the resolvable flow are large enough for the G1 and G2 domains, in addition to the fact that the 1.5-TKE parameterization scheme for the closure of the turbulent fluxes is used in both grids. Tests with high values of Δz_min for grids G1 and G2 degraded the representation of the synoptic structures in comparison with the analysis of the synoptic charts, especially when we analyzed the mean atmospheric sea-level pressure fields. The same proportion was used for the G3 and G4 aspect ratios, since their Δz_min values are equal to those of the coarser grids. Particularly, for the G5 and G6 grids, we avoided increasing the aspect ratio more than necessary and imposing a substantial decrease in the value of Δz_min. Thus, we adopted an aspect ratio equal to 15, which results in a first level at 10 m a.g.l. and Δz_min equal to 20 m for the finest grids.
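The arithmetic linking horizontal spacing, aspect ratio and minimum vertical spacing can be checked with a short sketch. The aspect ratios for G1, G2 and the finest grids, and Δz_min = 20 m for the finest grids, are quoted from the text; the G2 horizontal spacing of 9 km is inferred from the stated refinement ratio of 3, and the function names are hypothetical.

```python
def aspect_ratio(dx_m, dz_min_m):
    """Grid aspect ratio: horizontal spacing over minimum vertical spacing."""
    return dx_m / dz_min_m


def dz_min_from_aspect(dx_m, aspect):
    """Minimum vertical spacing implied by a given aspect ratio."""
    return dx_m / aspect


# G1 and G2: aspect ratios of 540 and 180 imply the same near-surface spacing.
dz_g1 = dz_min_from_aspect(27000.0, 540.0)
dz_g2 = dz_min_from_aspect(9000.0, 180.0)
print(f"G1: dz_min = {dz_g1:.0f} m, G2: dz_min = {dz_g2:.0f} m")

# Finest grids: 300 m horizontal spacing with dz_min = 20 m.
ar_fine = aspect_ratio(300.0, 20.0)
print(f"G5/G6: aspect ratio = {ar_fine:.0f}, first level at {20.0 / 2:.0f} m a.g.l.")
```

With a 20 m lowest cell, the first level at 10 m a.g.l. corresponds to the cell midpoint, which is consistent with the values quoted for the finest grids.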

Land-use databases
In addition to the atmospheric model component, our simulations work together with the ARPS soil-vegetation model that has been constructed based on the parametric scheme developed by Noilhan and Planton (1989) and modified by Pleim and Xiu (1995). This scheme is a function of the soil and vegetation types, vegetation cover fraction and the normalized difference vegetation index (NDVI) and/or leaf area index (LAI). Thus, it is important that the databases contain detailed updated information about urban and vegetated areas and water bodies with relatively high spatial resolution to allow more suitable nonhomogeneous BCs to simulate the atmospheric flow accurately, mainly in LES mode. ARPS physically models 13 soil types (including water and ice) and 14 vegetation types, according to the classification of the United States Department of Agriculture (USDA).
Table 3. Conversion of the original 30 arc-sec USGS and updated 10 arc-sec ESA categories of vegetation type to the USDA classification that is adopted by ARPS. See tables in Xue et al. (1995) and Bicheron et al. (2008), and the user guide for more information on the vegetation-type categories.
For the soil-type representation, the original 30 arc-sec USGS database files are processed and mapped into the grid categories set up in ARPS by selecting values from the nearest data points. For the vegetation-type representation, original files of either the 30 arc-sec USGS database or the 10 arc-sec ESA (i.e., approximately 300 m) GlobCover project database that we incorporated into ARPS are employed, depending on the run. The incorporation of the vegetation data into ARPS is carried out through the modifications that we have introduced to the original arpssfc.f90 and arpssfclib.f90 source files. Particularly, we have developed and added two new subroutines to the arpssfclib.f90 source file, referred to as GET_10S and MAPTY10S, which are similar to the GET_30S and MAPTY30S original subroutines. The GET_10S subroutine reads 10 arc-sec or 300 m vegetation-type data resolution files from the ESA GlobCover project. The MAPTY10S subroutine transforms the 22 categories from 10 arc-sec vegetation-type data into the simpler 14 original vegetation-type USDA/ARPS categories and feeds them into the model domain by choosing the data values at the nearest grid points. Beyond that, the setting of the surface-roughness (z_0) map is processed by choosing values associated with the vegetation-type classes, according to the same conversions shown in Table 3. Both the arpssfc.f90 and arpssfclib.f90 source files have been commented to explain the modifications and they can be downloaded directly from http://meteoro.cefet-rj.br/leanderson/arps/.
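The two steps described for the vegetation data (nearest-point selection followed by a category conversion) can be sketched generically as below. This is not the MAPTY10S code: the lookup table is a hypothetical placeholder and does not reproduce Table 3, the source raster is assumed to be a regular latitude-longitude grid, and all names are invented for illustration.

```python
# Hypothetical category conversion (placeholder values, NOT the paper's Table 3).
HYPOTHETICAL_LOOKUP = {11: 10, 40: 7, 190: 1, 210: 14}


def nearest_index(coord, origin, spacing, n):
    """Index of the source cell nearest to coord, clamped to the raster extent."""
    i = int(round((coord - origin) / spacing))
    return min(max(i, 0), n - 1)


def map_categories(src, lat0, lon0, spacing, points, lookup, default=0):
    """Nearest-neighbor sampling of a category raster plus class conversion.

    src     : 2-D list of source categories, src[row][col]
    lat0    : latitude of row 0; lon0: longitude of column 0
    spacing : grid spacing of the source raster (degrees)
    points  : iterable of (lat, lon) model grid points
    """
    nrow, ncol = len(src), len(src[0])
    out = []
    for lat, lon in points:
        r = nearest_index(lat, lat0, spacing, nrow)
        c = nearest_index(lon, lon0, spacing, ncol)
        out.append(lookup.get(src[r][c], default))
    return out


# Tiny 2 x 2 source raster and two model points.
source = [[11, 40], [190, 210]]
model_points = [(0.1, 0.9), (0.8, 0.2)]
mapped = map_categories(source, 0.0, 0.0, 1.0, model_points, HYPOTHETICAL_LOOKUP)
print(mapped)
```

Nearest-neighbor selection is the natural choice for categorical fields, since averaging or interpolating class codes would produce meaningless categories.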
When the 30 arc-sec USGS vegetation-type database files are adopted in our runs, LAI is calculated from the 30 arc-sec USGS monthly NDVI database separately for herbaceous vegetation and for trees (Xue et al., 1995). The relation between the NDVI and LAI for herbaceous vegetation can be consulted in Asrar et al. (1984), and for trees in Nemani and Running (1989). Also, vegetation-fraction data from the 30 arc-sec National Environmental Satellite Data and Information Service (30 arc-sec NESDIS), supported by the National Oceanic and Atmospheric Administration (NOAA), are derived from the same NDVI data using the methodology suggested by Gutman and Ignatov (1998). However, whenever the 10 arc-sec ESA vegetation-type database files are employed in our runs, LAI and vegetation fractions are directly obtained from the 30 arc-sec ESA GlobCarbon project database, and minor corrections to the mapped 30 arc-sec USGS soil-type data are needed near the coastlines of the water bodies, as illustrated by comparison in Fig. 4a and b for the G5 domain. We point out that we have developed the GETLAIGLOBCARBON and the GETFAPARGLOBCARBON subroutines and included them in the arpssfclib.f90 source file of the ARPS code. These routines are able to read the LAI and calculate the vegetation-fraction values, respectively, from the GlobCarbon project database and interpolate them into the model domain. In addition, we have also developed the MAPTYLAIGLOBCARBON and MAPTYFAPARGLOBCARBON subroutines to transform the 30 arc-sec ESA LAI and the FAPAR (fraction of absorbed photosynthetically active radiation) data into the simpler LAI and vegetation-fraction data and feed them into the model domain by choosing the data values at the nearest grid points. For an adequate reproduction of our results we also provide the namelist files (i.e., the arps.input files) and the SRTM and ESA databases at http://meteoro.cefet-rj.br/leanderson/arps/, in order to allow any setup we have used to run on all domain grids employed in this work.
Figure 5a and b highlight a comparison between the vegetation-type maps processed on G5 with the 30 arc-sec USGS and the 10 arc-sec ESA databases, respectively. It can be noticed that the 10 arc-sec ESA vegetation-type mosaic presents more detailed and smoother areas than the 30 arc-sec USGS database. The analysis of Fig. 6a and b, which illustrate a comparison of the vegetation-fraction maps on G5, indicates that the 30 arc-sec ESA vegetation-fraction map presents a pattern that is in accordance with the pattern presented by the 10 arc-sec ESA vegetation-type map. However, the same does not exactly occur when we compare the 30 arc-sec USGS vegetation-fraction map to the 30 arc-sec USGS vegetation-type map. Therefore, we can safely conclude that in this work it is better to use the ESA land-use database than the USGS database. The surface roughness maps are not shown here because they are closely related to vegetation-type information available in Table 3. Similarly, the LAI maps are also omitted because they are intimately related to vegetation-type and vegetation-fraction information available on the maps in Figs. 5a and b and 6a and b.
Additionally, we use two soil layers with depths of 0.01 m and 1 m for the computation of the temperature and moisture balances according to the ARPS soil-vegetation model. The sea surface temperature (SST) and the soil skin temperature and moisture initial databases for the G1 grid are obtained from the 0.5° GFS analyses. For the subsequent grids, initial values of these surface characteristics are obtained by numerical interpolation performed in each preceding grid. We also point out that, for all grids, we adopted the Colette et al. (2003) topography shading scheme, the Chou (1990) and Chou and Suarez (1994) short- and long-wave radiation schemes, the Kain and Fritsch (1990, 1993) microphysics scheme and the 1.5-TKE, Moeng and Wyngaard (1989) turbulence model. The Kessler (1969) and Lin et al. (1983) cumulus scheme is turned on only for the G1 and G2 synoptic grids. Table 4 summarizes the differences adopted in our one-way nested-grid runs.

Results and discussion
The simulations are performed in parallel on a cluster comprised of three identical machines, each with an Intel Xeon E5450 processor (3.0 GHz) and a cache size of 6144 KB. Our short spin-up runs cover two periods of 48 h and one of 24 h, which are long enough to infer the dependence of the simulations on initial and boundary conditions, grid resolution and topographic and land-use databases, and to compare the results to surface and upper-air observational data for distinct days. We anticipate that, in the absence of frontal systems and depending on the positioning of SASA in the southeastern region of Brazil, the prevailing winds that blow in the region of interest are the result of mesoscale and microscale mechanisms driven by the land-sea contrast, mountain-valley effects and land use.

Statistics indexes, meteograms and upper-air profiles
Analogously to Chow et al. (2006), Table 5 presents the mean errors (bias) and the root-mean-square errors (RMSE) computed for the potential temperature θ, water-vapor mixing ratio q_v and wind direction and speed for both runs of all periods. The statistics completely exclude the short spin-up time and retain only the computationally stable results of the daily-cycle periods. The bias and RMSE are computed as follows:

$$\mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N} \phi'_i, \qquad \mathrm{RMSE} = \left[\frac{1}{N}\sum_{i=1}^{N} \left(\phi'_i\right)^{2}\right]^{1/2},$$

where $\phi' = \phi_f - \phi_o$ represents the difference, or deviation, between any forecasted and observed variable, and N is the total number of verifications. For wind direction, a positive deviation means that the simulated wind vector deviates clockwise in relation to the observed wind vector. Because the largest possible error in wind direction is 180 arc-deg, the definition of the deviation $\phi'$ needs to be changed according to

$$\phi' = (\phi_f - \phi_o)\left(1 - \frac{360}{\left|\phi_f - \phi_o\right|}\right) \quad \text{if } \left|\phi_f - \phi_o\right| > 180.$$

The analysis of the statistical indexes from Table 5 shows that the HR-run results are generally better than the CTL-run results. From the data displayed in Table 5, we see that the HR-run statistics are worse only in 6 out of the 22 statistics for potential temperature, and only in 2 out of 22 statistics for the wind speed. We consider these statistical indexes to be very encouraging. Specifically in the case of the wind speed, which is a very important quantity, the improvement obtained in its calculation can be quantified by looking, for example, at the Marambaia and Ecologia Agrícola stations (located in the west zone). At Marambaia, Table 5 shows that there is a decrease of the bias from 2.32 to 1.66 m s⁻¹, whereas at Ecologia Agrícola the decrease goes from 3.18 to 2.69 m s⁻¹. Also, the RMSE goes from 3.02 to 2.42 m s⁻¹ at Marambaia, and from 4.26 to 3.59 m s⁻¹ at Ecologia Agrícola. To support this line of reasoning, we created Table 6, which summarizes the statistics by classifying the stations into three zones as seen in Table 1. Table 6 shows that the wind speed results are better for the HR runs in the west and south-central zones, which adds up to 14 improvements out of 14 statistics.
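The bias and RMSE described above, together with the wrap-around treatment of wind direction deviations larger than 180 arc-deg, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the wrap-around formula is the commonly used form (the deviation multiplied by one minus 360 over its magnitude), assumed here rather than taken from the ARPS post-processing code.

```python
import math


def direction_deviation(forecast_deg, observed_deg):
    """Forecast-minus-observed wind direction deviation in degrees,
    wrapped into [-180, 180]. Positive means the forecast is rotated
    clockwise relative to the observation (assumed convention)."""
    d = forecast_deg - observed_deg
    if abs(d) > 180.0:
        d = d * (1.0 - 360.0 / abs(d))
    return d


def bias(deviations):
    """Mean error: average of the forecast-minus-observed deviations."""
    return sum(deviations) / len(deviations)


def rmse(deviations):
    """Root-mean-square error of the deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Hypothetical forecast/observed wind directions (degrees).
forecast = [10.0, 350.0, 90.0]
observed = [350.0, 10.0, 80.0]
devs = [direction_deviation(f, o) for f, o in zip(forecast, observed)]
print(devs, bias(devs), rmse(devs))
```

For scalar variables such as potential temperature or wind speed, the deviations are simply forecast minus observed, and the same `bias` and `rmse` functions apply without the wrap-around step.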
From Tables 5 and 6, the statistical indexes for the potential temperature show significant improvement over the CTL run results when HR databases are employed. All cases are better for the HR runs in the east zone, and four out of six are better in the west zone. Considering all zones, only 6 cases out of 22 presented worse results with HR databases. Although we had four worse cases out of eight in the south-central zone, the calculated biases for both the CTL and the HR runs are small (less than 1.8 K for the west and south-central zones) compared to potential temperature values on the order of 300.0 K, which were calculated from observed values. In the east zone, where we had eight improvements out of eight statistics, the bias values are also small (in the range 1.5-2.6 K). This set of results indicates that ARPS, overall, predicts well the time and space variation of this quantity in the simulations, although the computed values tend to underestimate the observational data.
www.geosci-model-dev.net/7/1641/2014/ Geosci. Model Dev., 7, 1641-1659, 2014

Table 5. Mean errors (bias) and RMSEs for θ (K), q_v (g kg⁻¹), wind direction and speed, listed by weather station and run.

When we consider the statistical indexes for the vapor-mixing ratio, we see from Table 5 that there is no significant difference between the bias values calculated from the HR and the CTL runs. Although this result indicates that there is no clear advantage of using HR databases over low-resolution databases, we point out that the bias values are small (less than 1.7 g kg⁻¹) compared to measured values on the order of 15 g kg⁻¹ (which sets the scale for this quantity), except at the SBJR and SBRJ stations. In other words, the flow model predicts correctly the time and space variation of this quantity in ARPS simulations, and from a statistical point of view there is relatively little to improve on the calculation of the vapor-mixing ratio. Therefore, the statistical indexes that compare the HR and CTL runs' results should not be used directly to assess the advantage of the HR simulation over the CTL simulation. It is worth noting that we compared the observational and model results at the same height by extrapolating the ARPS results from the first grid point at 10 m down to 2 m. In the case of the wind direction, large deviations between the ARPS runs and the observational data are observed. Despite the fact that the wind direction is probably the most difficult quantity for any model to forecast accurately, the HR run shows significant improvement over the CTL run at the SBAF and JPA stations. At the SBAF station, the wind direction bias goes from −16 to 2°, approximately, whereas, at the JPA station, the bias goes from −17 to 6°, approximately. Also, we must be careful in interpreting the statistical indexes when, for example, the bias is calculated for time cross sections that remain lagged during the entire period of the simulation. For the SBSC station, the bias for the wind direction indicates values of about 20°, but the RMSE confirms that errors are as large as we see in the time cross sections (not shown). At the SBJR station the differences indicated by the RMSE values are small too, although the statistical indexes were not calculated between 00:00 and 09:00 UTC; i.e., night to dawn and early morning. We should also emphasize that the data at the observation level closest to the ground may be influenced by surface effects due to the plant canopy, which is not totally represented in mesoscale models due to difficulties with the turbulence parameterization schemes, insufficient grid resolution, poor urban soil-type database, and low resolution of the topography database outside of the fine-resolution domains. These effects imply that the model is unable to provide accurate lateral BC forcing for the finest
grids, as discussed by Gohm et al. (2004). Regardless of these issues, our results indicate that the scales modeled in the finest grids (G5 and G6), based on "Terra Incognita" (Wyngaard, 2004), are still accurate enough to provide encouraging results.
Figures 7-12 show the time cross-section data, or meteograms, of potential temperature and wind speed that compare the simulated data from the CTL runs (open triangle) and the HR runs (open square) with the observational data (closed circle) for some surface weather stations where the HR runs perform better than the CTL runs. We can see in these figures that the CTL and the HR results underestimate the potential temperature in most of the analyzed periods and overestimate the wind speed data in comparison with the observational data. This is in accordance with the analysis above based on the statistical indexes shown in Table 5. Specifically at the Marambaia station, the potential-temperature daily cycle is well simulated for both runs in both days of the first period (September 2007), when compared to the observational data (Fig. 7a, b). Good agreement between the simulated and the observational data for the wind speed can be noticed at almost all times, mainly for the HR run (Fig. 7c, d). We also note the occurrence of large speed values in the periods 18:00-21:00 UTC 6 September and 15:00-17:00 UTC 7 September 2007, which are associated with the sea breeze coming from the Atlantic Ocean. The simulation results show just a slight discrepancy with respect to this behavior on the second day, since ARPS does not compute adequately the speed decrease from 18:00 UTC on, showing a possible influence of the synoptic forcing in the modeling process of the ABL and hiding the real effect of the sea breeze coming from the Atlantic Ocean.
The observed daily cycles for the potential temperature are also represented qualitatively well by the simulation for both runs at the Ecologia Agrícola station (Fig. 8a, b). The computed results for the HR run present better agreement with the observational data than the CTL run between 08:00 and 14:00 UTC for both days, just when the convective mixed layer is in development. The time cross section of the wind speed computed from the HR run also presents better agreement with the observational data than the CTL run, although the results show a systematic trend to overestimate the wind speed (Fig. 8c, d). The Ecologia Agrícola station is positioned relatively far from Sepetiba Bay (Fig. 1), in a direction transverse to the coastline. Thus, the sea breeze is sometimes the driving force of the wind speed and direction, as probably occurred on 7 September 2007. Based on these results and similar results obtained by Chow et al. (2006) for other regions, we infer that the poor representation of the 30 arc-sec USGS soil type can greatly influence the results of the simulation, in spite of the best representation of the topography and vegetation type provided in the HR run.
The numerical results for the SBSC station present good agreement with the observational data in terms of potential temperature, mainly on the second day of the simulation (Fig. 9a, b). On the first day, although the CTL and the HR runs underestimate the potential temperature in the period 12:00-21:00 UTC, the HR run performs better than the CTL run from 04:00 to 12:00 UTC, 6 September 2007, which represents a good short spin-up simulation. Overall, negligible differences are found when simulated values are compared with the observational data (Fig.
9a, b). With respect to the wind speed, both runs present large discrepancies when compared to observational data from the beginning of the run until 14:00 UTC, 6 September 2007, approximately. After that, both runs predict well the wind speed decrease at 18:00 UTC, 6 September 2007, probably because of the turn of the wind (not shown), and the HR run performs well in the detection of the few wind speed changes. Although the wind direction meteograms are not shown for the SBSC station in particular, the simulations seem to detect when the sea breeze starts to turn, coming from the Sepetiba Bay direction, but they fail to detect when the wind turns completely from the southwest direction. These results indicate that the fine settings proposed for the databases are not yet fully appropriate to represent the wind speed and direction behavior at the SBSC station. Even with the good performance of the computed daily wind-speed cycle for the HR run on the second day analyzed (Fig. 9d), we highlight that there are combined effects due to the synoptic scale and the sea breeze coming from the Atlantic Ocean and Sepetiba Bay. The low resolution of the soil-type database associated with soil temperature and moisture initialization data may be affecting the modeled results in this region too.
The daily cycles for the potential temperature computed at the SBAF station (Fig. 10a, b) also presented successful results in the second period (February 2009), i.e., from 00:00 UTC 6 February to 00:00 UTC 8 February 2009, as observed for other stations. The CTL run performs better than the HR run only at some periods of the day, showing convincingly that the HR run presents important results for a region characterized by valley-mountain effects (see the eastern side of Fig. 3a and b). We note that the ARPS results tend to overestimate the observational data for the potential temperature at night, dawn and early morning (about 21:00-09:00 UTC), and tend to underestimate them in the period morning-afternoon (about 11:00-15:00 UTC). In general, the calculated wind speed values are lower than the observational data between 06:00 and 15:00 UTC 6 February 2009 and between 00:00 and 12:00 UTC 7 February 2009, mainly due to the SBAF location, where the wind direction can be driven by a sum of two factors: the weak katabatic winds from the Pedra Branca (see the mountain in the northeast of Fig. 3a and b) and Tijuca massifs (see the mountain in the southeast of Fig. 3a and b), and the weak land breeze. At the same time intervals, we note from the wind direction meteograms (not shown) that the wind blows (with variation) from NNE (land and mountain breeze) and SSE (mountain breeze). For both days (Fig. 10a, b), the flow accelerates slightly when the wind blows from the S and SE (not shown) in the period 15:00-21:00 UTC, approximately, suggesting the occurrence of a canalized jet due to the sea-breeze effect from the Atlantic Ocean. In general, the HR run outperforms the CTL run for most of the time when we compare the wind speed results to the observational data, as illustrated in Fig. 10c and d.
At the JPA station, qualitative and quantitative agreement is observed between the observational and the simulated data values for the potential temperature (Fig. 11a, b). There are few discrepancies between the CTL and the HR runs at all times, but the HR run performs better than the CTL run mainly in the period of full development of the convective mixed layer. In general, Fig. 11a and b indicate that the land-use database did not change the characteristics of the JPA region and its vicinity very much. However, the HR run performs well on both days analysed when we compare the simulated values for the wind speed with the observational data. It is worth mentioning that the maximum values of wind speed occurred around 15:00-21:00 UTC, when the wind blew from the S and SE under the influence of the sea breeze from the Atlantic Ocean, as discussed for the SBAF station, although not shown in the wind direction meteograms. At these times, although the ARPS results overestimate the observational data for the wind speed, the HR run presents a decrease in the wind speed values when compared to those of the CTL run, indicating a positive influence of the updated surface databases on the HR simulation.
Satisfactory results for the daily cycle of the potential temperature in the third period (August 2011), i.e., between 00:00 UTC 8 August and 00:00 UTC 9 August 2011, can be seen in Fig.
12a and b for the Xerem station. However, the ARPS results slightly overestimate the potential temperature obtained with the observational data at all times. Small discrepancies between the CTL and the HR runs are detected after 13:00 UTC; however, the HR run clearly outperforms the CTL run between 02:00 and 12:00 UTC. The main cause of the discrepancies found in the wind direction (not shown) is the occurrence of calm winds, which reach a maximum of 9 m s⁻¹ at 21:00 UTC. We highlight the occurrence of calm winds between 00:00 and 12:00 UTC and moderate winds in the afternoon and night (between 13:00 UTC 8 August and 00:00 UTC 9 August 2011), just when the sea-breeze flow from the Atlantic Ocean is completely developed in conjunction with anabatic wind effects from Tijuca Massif (Fig. 12c, d). In general, wind speed values computed by the HR run are a little better than those of the CTL run when compared to observational data. The discrepancy is larger after 21:00 UTC. In such situations, when the wind is very calm, the turbulence parameterization schemes typically show considerable difficulties in modeling the flow near the ground; i.e., under calm winds, intermittent flow and, normally, when mesoscale models are downscaling to the LES domains. However, it is important to point out that the results obtained from the simulations reproduce qualitatively the observed maximum wind speed in the afternoon and the minimum wind speed in the morning.
The upper-air, or vertical, profiles of potential temperature, water-vapor mixing ratio and wind direction and speed computed for the SBGL upper-air sounding station show encouraging results when compared to the observed data collected at the SBGL station (see Figs. 13a-d and 14a-d) in the first 1.6 km a.g.l. at 12:00 UTC for both days. Despite the trend of the model to underestimate the observational data, the analysis based on the potential temperature profile at 12:00 UTC 6 September (Fig. 13a) shows good agreement between the observational data and the CTL and HR results, since a stable layer predominates in the whole atmosphere. The water-vapor mixing ratio distribution calculated by ARPS underestimates the observed data up to 1600 m a.g.l., maybe due to the initialization used, but there is good agreement above this height (Fig. 13b). In the ABL region the variation of the wind direction with height is not well represented by the ARPS results (Fig. 13c). The ARPS results do not reproduce adequately the height above ground level where the wind speed maximum occurs, and the CTL run overestimates the wind speed values more than the HR run up to 100 m a.g.l. (Fig. 13d). From 100 up to 800 m a.g.l. the CTL run performs better than the HR run. The comparison of simulated meteorological variables with observational data on the second day shows a few differences between the CTL and the HR runs and overall better agreement between the runs and the observational data (Fig. 14a-d). In general, the ARPS results reproduce correctly the physical behavior of the atmosphere when compared to the observed data at the SBGL station, mainly with respect to the atmospheric stability, the wind shear, and the occurrence of low wind speed near the ABL and higher wind speed up to 1600 m a.g.l.

Difference of potential temperature fields
In order to show the spatial discrepancies between the HR and the CTL runs on the G5 domain, we present the horizontal cross section of the difference of potential temperature between the two runs, defined as θ(HR) − θ(CTL), for the period from 14:00 to 17:00 UTC, 7 September 2007, as illustrated in Fig. 15a-d. This figure highlights the areas where there are visible discrepancies between both runs. These results also indicate that changes in the vegetation type alone, and not merely in the soil type, provide meaningful differences in the air flow, as we can see from the contrast between the continent and the sea. We inserted a dashed line in each figure to indicate the vertical cross section which we considered in the analysis of the sea-breeze front based on the TKE distribution that we present in Sect. 4.3. At 14:00 UTC (11:00 local time, LT), we can clearly observe that the major discrepancies are on the western side and over the water bodies, mainly in the vicinity of a water reservoir located on the continent. Overall, the potential temperature values of the HR run tend to be higher than those of the CTL, except around the entrance of Sepetiba Bay, where a colder air parcel due to the HR run appears (Fig. 15a). In this case, the results for the HR run present an accentuated differential heating between the sea and the continent, which is able to turn on an efficient thermodynamic trigger to start the sea-land breeze mechanism. Thus, it is possible that the vegetation-type database changes the heat and water-vapor fluxes in order to represent more adequately the local circulation. At 15:00 UTC (Fig.
15b) the discrepancy increases slightly and the cool air parcel moves from the southeast to the northwest, indicating that the sea-breeze front penetrates perpendicularly into the continent with respect to the grid's east-side shoreline (approximately between −43.60 and −43.40 arc-deg west longitude). During this motion, the breeze does not feel the change in direction that the Sepetiba Bay shoreline presents after −43.60 arc-deg west longitude, approximately, which the dashed line crosses. Likewise, this behavior indicates that the soil-type database that represents this change in direction of the Sepetiba Bay shoreline is not influencing the flow enough to capture the wind direction suitably in this area. At 16:00 and 17:00 UTC (Fig. 15c, d) we note the same discrepancy areas between the runs as seen in the previous hour, but the potential temperature difference values are smaller. The cool air parcel moves towards the grid's west side at 16:00 UTC (Fig. 15c), and practically disappears at 17:00 UTC (Fig. 15d). This behavior shows the importance of increasing the density of surface-weather observation stations at MARJ in order to evaluate whether the physical trend captured with the high-resolution ARPS modeling is in agreement with the sea-breeze front advance analyzed.
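The difference field used in this subsection, θ(HR) − θ(CTL), and the identification of cells where the HR run is colder than the CTL run (such as the cool air parcel discussed above) can be sketched as follows. The function names, the nested-list grid layout, and the sample values are assumptions for illustration only; they are not part of the ARPS post-processing tools.

```python
def difference_field(theta_hr, theta_ctl):
    """Cell-by-cell difference theta(HR) - theta(CTL) on a common 2-D grid,
    with both fields given as lists of rows."""
    return [[h - c for h, c in zip(row_hr, row_ctl)]
            for row_hr, row_ctl in zip(theta_hr, theta_ctl)]


def cells_below(diff, threshold=0.0):
    """Return (row, column) indices where the difference is below `threshold`,
    i.e., where the HR run is colder than the CTL run for threshold = 0."""
    return [(i, j)
            for i, row in enumerate(diff)
            for j, value in enumerate(row)
            if value < threshold]


# Hypothetical potential temperature values (K) on a 2 x 2 patch of the grid.
theta_hr = [[300.5, 299.0],
            [301.0, 300.0]]
theta_ctl = [[300.0, 299.5],
             [300.0, 300.0]]
diff = difference_field(theta_hr, theta_ctl)
cool_cells = cells_below(diff)
print(diff, cool_cells)
```

Tracking how the set of negative-difference cells shifts from one output time to the next is one simple way to follow the displacement of a cool parcel across the domain.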

Vertical-latitudinal cross-section analysis
Figure 16a-d illustrate the vertical-latitudinal cross-section distribution of TKE, potential temperature and meridional-vertical wind vector components only for the HR run, computed at −43.60 arc-deg west longitude (Marambaia longitude location) in the period 14:00-17:00 UTC, 7 September 2007, since the HR results are better than the CTL results. The vertical scale goes from 0 to 1000 m a.g.l. and the color scale ranges between 0.05 and 3.0 m² s⁻². The dashed line which crosses each panel in Fig. 16a-d indicates where the land and sea cross sections are located in order to support our analysis. The calculations were performed in this period because we are interested in highlighting the TKE distribution close to the sea-breeze front indicated by the meteogram analysis from the Marambaia station (Fig. 7). One can see that the TKE distribution in the northern part of the continental region (between −22.90 and −22.65 arc-deg south latitude) may be associated, at all times, with the near-surface vertical shear and buoyancy effects caused mainly by the mountain waves (see topographic elevation maps in Fig. 3a, b), as also suggested by the numerical results for the wind velocity vector and vertical potential temperature gradient. In this region of the cross section, the TKE intensity presents higher values at 15:00 and 16:00 UTC (see Fig. 16b and c) due to the higher temperatures and wind shear computed. At 14:00 UTC (Fig. 16a), a stably stratified ABL occurs over the ocean region and near the coastline (between −23.11 and −23.00 arc-deg south latitude), and one can see a northerly wind component blowing along the vertical-latitudinal cross section beyond 200 m a.g.l. Below this level, in the surface layer, there is evidence that a southerly wind component starts the sea-breeze flow, which is associated with a weak horizontal gradient and a relatively intense TKE distribution at the sea-breeze front near the coastline.
At 15:00 UTC (Fig. 16b) the horizontal gradient of temperature increases and the sea-breeze front reaches −22.96 arc-deg south latitude, approximately, characterized by a very near-surface southerly wind component which turns to the north direction at about 150 m a.g.l. However, the wind is still weak and some vertical motion is generated, even near the horizontal limit of the sea-breeze flow. In a shallow layer, the gradient of TKE increases in the HR run, leaving a TKE trail in the onshore region. From 16:00 to 17:00 UTC (Fig. 16c, d) we can see that air rises over the warm land near the shoreline and into the continent (near −22.95 arc-deg south latitude), and cooler air from the water is advected by the southerly wind component to replace it. The noticeable temperature drop observed in Fig. 16c is typical of a mesoscale cold front, as observed by Bastin and Drobinski (2006). The HR run predicts that the low-level convergence occurs in the continent, approximately 10 km from the shore, at 16:00 UTC (Fig. 16c). Also, the HR run presents an adequate wind speed prediction when compared to the observed data at the Marambaia station (see also Fig. 8). In the HR run results, a return circulation (the anti-sea breeze) of about 5 m s⁻¹ brings warmer air back to the sea, which descends towards the sea surface and closes the circulation circuit. We can also note that, at 14:00 and 15:00 UTC (Fig. 16a, b), the speed of the following sea breeze is a little faster than the speed of the sea-breeze front, and a light wave propagates upstream of the sea-breeze front as the southerly flow component collides with the adverse northerly flow component. When the following sea breeze reaches the front, convergence results in a "head-shaped" updraft, as also seen by Reible et al. (1993) and Bastin and Drobinski (2006). This "head" is a zone of intense mixing, which is supported by the significant values of the TKE shown in Fig.
16b. We also observe that the highest TKE values occur in the mesoscale cold-front region around 450 m a.g.l. and near the ground surface. The vertical motion of the air is evident in the HR run at 14:00-15:00 UTC (see Fig. 16a and b). The maximum depth of the sea breeze is observed to be at approximately 300 m a.g.l. at all times. After 17:00 UTC the magnitude of the updraft decreases quickly, as well as its vertical extent, and the TKE intensity decreases slowly with time near the surface. Also, the northerly component of the upper-level wind strengthens, whereas the TKE decreases and remains confined near the surface, where some shear appears.

Summary and conclusions
Numerical simulations of the ABL flow are strongly influenced by several factors; namely, the parametric models adopted in the boundary value problem that represents the physical situation, the numerical methods applied to solve the conservation equations, the numerical-grid scheme, and the boundary conditions related to the synoptic forcing and surface databases. In order to reduce the influence of these factors, we incorporate into ARPS the 3 arc-sec SRTM topographic database, the 10 arc-sec ESA vegetation-type database and the 30 arc-sec ESA LAI and FAPAR databases, which are preprocessed by subroutines we developed for the ARPS architecture. These subroutines are available to the scientific community. The numerical simulations are carried out in the LES mode for three time periods, running on six one-way nested grids that are set up separately, such that different vertical resolutions and parameterizations are chosen for each scale being modeled.
Our results clearly show that the use of high-resolution surface databases improves significantly our ability to predict the local atmospheric circulation based on the ARPS model. We observed satisfactory agreement between numerical results and field data for some periods of the days we investigated, particularly for the results that we obtained with the high-resolution model proposed in this work. Overall, the simulated potential temperature and wind speed of the HR run present significantly lower errors with respect to the field data than those of the CTL run, but the HR databases do not necessarily lead to better simulation results for all variables. This fact indicates that additional improvement will depend on other factors, such as the local surface characterization, the turbulence closure and other microphysics parameterizations associated with the numerical mesoscale model employed.
For the cases we studied, our simulations showed that there is no significant difference between the bias values calculated from the HR and the CTL runs for the vapor-mixing ratio. Although this result does not conclusively support an advantage of HR databases over low-resolution databases, we point out that the bias values we obtained are small compared to the observational data. In the case of the wind direction, which is probably the most difficult quantity to accurately forecast with any model, large deviations between the ARPS runs and the observational data were observed. However, the HR runs show significant improvement over the CTL run at the SBAF and JPA stations. From a statistical point of view, the HR flow model performs very well in the prediction of the time and space variation of these quantities using ARPS simulations.
From a closer look at the results obtained for some specific surface stations, we infer that, at the SBAF station for example, both runs forecasted well the canalized jet triggered by the sea-breeze effect. However, at the SBSC and Xerem stations, the wind speed presents increased discrepancies in some periods of time due to the occurrence of calm winds in the simulation. As Hanna and Yang (2001) pointed out, the turbulence parameterization schemes typically have difficulties in modeling the flow near the ground under conditions of calm, intermittent flow and when mesoscale models are downscaling to LES domains. Equivalent statistical results and conclusions were obtained by Chow et al. (2006), demonstrating again the difficulties in modeling the wind field. Some clear discrepancies were observed, mainly at the moments when a flow transition occurs to form the sea breeze. However, the remarkable TKE distribution on the sea-breeze front shows a pattern very similar to the one found by Bastin and Drobinski (2006), which is evidence of a consistent physical behavior. In accordance with the literature, our results indicate that an improved representation of the properties and characteristics of the land use can dramatically influence the calculation of the momentum, heat and moisture fluxes between the surface and the atmosphere, and may significantly affect the calculation of the meteorological field quantities. This suggests a need for improving our soil-type databases, and soil moisture and temperature initializations.

Figure 1. The MARJ air basins and the location of the surface weather-observation stations.

Figure 2. The limited areas of G4 (resolution of 1 km), and the G5 and G6 innermost grids (resolution of 300 m).

Figure 4. Soil-type shaded maps for G5, processed with the 30 arc-sec USGS database: (a) non-fit and (b) fit.

Figure 7. Time cross-section data of (a, b) θ (K), and (c, d) wind speed (m s⁻¹) observed (closed circle) and simulated by G5 of the CTL (open triangle) and HR (open square) runs from 00:00 UTC 6 September to 00:00 UTC 8 September 2007 at Marambaia surface weather-observation station.

Figure 8. Time cross-section data of (a, b) θ (K), and (c, d) wind speed (m s⁻¹) observed (closed circle) and simulated by G5 of the CTL (open triangle) and HR (open square) runs from 00:00 UTC 6 September to 00:00 UTC 8 September 2007 at Ecologia Agricola surface weather-observation station.
Figures 7-12 show the time cross-section data, or meteograms, of potential temperature and wind speed that compare the simulated data from the CTL runs (open triangle) and the HR runs (open square) with the observational data (closed circle) for some surface weather stations where the

Figure 9. Time cross-section data of (a, b) θ (K), and (c, d) wind speed (m s⁻¹) observed (closed circle) and simulated by G5 of the CTL (open triangle) and HR (open square) runs from 00:00 UTC 6 September to 00:00 UTC 8 September 2007 at SBSC surface weather-observation station.

Figure 10. Time cross-section data of (a, b) θ (K), and (c, d) wind speed (m s⁻¹) observed (closed circle) and simulated by G5 of the CTL (open triangle) and HR (open square) runs from 00:00 UTC 6 February to 00:00 UTC 8 February 2009 at SBAF surface weather-observation station.

Figure 11. Time cross-section data of (a, b) θ (K), and (c, d) wind speed (m s⁻¹) observed (closed circle) and simulated by G5 of the CTL (open triangle) and HR (open square) runs from 00:00 UTC 6 February to 00:00 UTC 8 February 2009 at JPA surface weather-observation station.

Table 1. Location of the surface and ᵃupper-air weather observation stations. ᵇa.m.s.l.: above mean sea level. Source: ᶜBrazilian National Institute of Meteorology and ᵈMeteorology Network of the Brazilian Air Force Command.
The grid domains from GEXT to G4 are centered at a point located at −22.82 arc-deg south latitude and −43.50 arc-deg west longitude, whereas the center of grid G5 is slightly shifted to −22.88 arc-deg south latitude and −43.72 arc-deg west longitude and G6 to −22.80 arc-deg south latitude and −43.27 arc-deg west longitude, as can be seen in Fig.

Table 4. Summarized surface database configuration of the one-way nested numerical grids.

Table 6. Summary of the statistical indexes. Cases where the HR run is worse than the CTL run, with respect to the total number of cases.