TIER version 1.0: an open-source Topographically InformEd Regression (TIER) model to estimate spatial meteorological fields

This paper introduces the Topographically InformEd Regression (TIER) model, which uses terrain attributes in a regression framework to distribute in situ observations of precipitation and temperature to a grid. The framework enables our understanding of complex atmospheric processes (e.g., orographic precipitation) to be encoded into a statistical model in an easy-to-understand manner. TIER is developed in a modular fashion with key model parameters exposed to the user. This enables the user community to easily explore the impacts of our methodological choices made to distribute sparse, irregularly spaced observations to a grid in a systematic fashion. The modular design allows incorporating new capabilities in TIER. Intermediate processing variables are also output to provide a more complete understanding of the algorithm and any algorithmic changes. The framework also provides uncertainty estimates. This paper presents a brief model evaluation and demonstrates that the TIER algorithm is functioning as expected. Several variations in model parameters and changes in the distributed variables are described. A key conclusion is that seemingly small changes in a model parameter result in large changes to the final distributed fields and their associated uncertainty estimates.


Introduction
Gridded near-surface meteorological products (specifically precipitation and temperature) are a foundational product for many applications including weather and climate model validation, hydrologic modeling, climate model downscal-ing, among others (Day, 1985;Franklin, 1995;USBR, 2012;Pierce et al., 2014;Liu et al., 2017). It is often challenging to develop realistic estimates of these variables, particularly when complex terrain or large spatial climate gradients are present in the domain of interest. Because of their widespread usage and potential challenges generating products, a plethora of methods have been developed ranging from nearest neighbors, distance-weighted interpolation, Kriging, knowledge-based, climatologically aided interpolation, Gaussian filters, multiple linear regression, and others (Thiessen, 1911;Shepard, 1968Shepard, , 1984Chua and Bras, 1982;Daly et al., 1994;Willmott and Roebson, 1995;Thornton et al., 1997;Banerjee et al., 2003;Clark and Slater, 2006;Cressie and Wikle, 2011;Nychka et al., 2015;Cornes et al., 2018).
Across the methods, nearest neighbor and distanceweighted interpolations use spatial distance as the only predictor, a reasonable choice in areas with high station densities but much less so in sparsely gauged regions. The resultant field is also discontinuous between station areas of influence for nearest neighbor, while distance-weighted interpolation will increase precipitation occurrence unless explicit occurrence prediction is included (Thornton et al., 1997;Newman et al., 2015Newman et al., , 2019. Climatologically aided interpolation assumes the climatological field is better resolved by the available observations and has a strong relationship with the field of interest (e.g., daily precipitation) such that using the climatological field in the final interpolation increases the output information content (Willmot and Robeson, 1995). These assumptions are invalid when the climatological field is poorly resolved, or has little correspondence to the field of interest 1828 A. J. Newman and M. P. Clark: TIER version 1.0 which happens when an event has a significantly different pattern than climatology (e.g., Lundquist et al., 2015;Newman et al., 2019). Kriging and linear regression frameworks may include multiple spatial predictors and uncertainty estimates. However, these methods may also produce unrealistic results with sparse station observations (Cornes et al., 2018). Finally, knowledge-based systems impose a regularization on the input data through knowledge-based rules (Sect. 2). This allows for physically plausible interpolation fields in sparsely gauged regions but inflexibility similar to climatologically aided interpolation. Finally, in sparsely observed regions, we do not know the true error characteristics of any method.
The currently available climate products that use these methods have complex processing systems (Daly et al., 2008;Xia et al., 2012;Livneh et al., 2015;Newman et al., 2015;Thornton et al., 2018). The product workflow typically includes many processing steps, methodological choices, and model parameters, all interacting to influence the characteristics of the final product. Therefore, comparison studies of product performance (and even single product evaluations) are often not able to attribute differences at the final output level to specific methodological choices (Newman et al., 2019). To help alleviate these difficulties and improve our understanding of method performance across conditions, flexible modular software systems need to be developed that expose model parameters to the users and allow for new functionality to be easily added (Clark et al., 2011).
This paper focuses on approaches that incorporate knowledge of atmospheric physics into relatively simple underlying statistical models (e.g., orographic precipitation, temperature lapse rates into linear regression models) to improve the accuracy of the gridded field (e.g., Daly et al., 1994;Willmott and Matsuura, 1995). Daly et al. (1994), hereafter D94, develop a complex knowledge-based system consisting of (1) terrain pre-processing; (2) station selection; (3) development of a locally weighted meteorological variableelevation linear regression; and (4) post-processing. Omitted here are the pre-processing steps to screen station data (including quality control) and filling missing or suspicious data values. Following D94, many studies have included new knowledge-based capabilities, new intermediate processing steps, and increased granularity in a given step (Daly et al., 2002(Daly et al., , 2007(Daly et al., , 2008. In recent papers, there are upwards of 15-20 model parameters that have varying degrees of influence on the final product. The Topographically InformEd Regression (TIER) model implements the knowledge-based approach described in D94 and subsequent papers (Daly et al., 2000(Daly et al., , 2002(Daly et al., , 2007(Daly et al., , 2008hereafter D00, D02, D07, and D08). Note that TIER is not designed to be an exact replica, as it does not match any source code, nor does it implement all features described in D94, D00, D02, D07, and D08. The paper is organized as follows: we introduce the TIER conceptual model in Sect. 2.1, the pre-processing algorithms in Sect. 2.2, the regression model in Sect. 2.3, and post-processing routines in Sect. 2.4.
Then, a brief model evaluation is included in Sect. 3 to verify that the TIER model is functioning as expected. Next, we explore model parameter variation experiments for three simple test cases to highlight how model parameter choices impact the final product in Sect. 3.1. Finally, a summary discussion of TIER, lessons learned from the parameter experiments, and next steps are discussed in Sect. 4, with code and data availability provided at the end of the paper, respectively.

Conceptual model
Precipitation and temperature are unevenly distributed around the globe for myriad reasons including general circulation patterns and landscape effects. Following D94, TIER assumes that large-scale gradients are resolved by the input station data and incorporates direct knowledge of atmospheric physics to account for landscape effects (e.g., orographic precipitation, mesoscale circulations near the coast). Since the landscape influences the distribution of precipitation and temperature, particularly their climatology, many past studies have developed methods to use terrain attributes to estimate meteorological fields (e.g., Spreen, 1947;Phillips et al., 1992). For example, orography has a particularly strong influence on precipitation by enhancing uplift of air (e.g., Schermerhorn, 1967;Smith and Barstad, 2004).
D94 develops a method to use a high-resolution digital elevation model (DEM) to produce empirical estimates of the precipitation-elevation relationship. They demonstrate that using actual station elevations in the precipitationelevation relationship leads to a weak or nonexistent relationship, while using a coarse-resolution DEM smooths out local variability and results in a stronger relationship between precipitation and elevation. Such stronger relationships occur because microscale terrain features have a much smaller impact on the atmosphere than the larger-scale terrain features of the order of 2-15 km (D94 and references therein). Of course, the optimal length scale varies across atmospheric conditions and for each precipitation event, but in general a coarse-resolution or smoothed high-resolution DEM provides a strong basis for developing robust climatological precipitation-elevation relationships. Additionally, the amount of precipitation varies according to aspect (e.g., windward or lee slope), suggesting the need for different relationships for different aspects (e.g., Alter, 1919;Houghton, 1979). D94 use the smoothed DEM and decompose the domain into directional "facets" that all individually have a separate precipitation-elevation relationship. Facets are defined as continuous areas with similar aspects (slope orientation).
Daly and his colleagues have introduced several methodological enhancements since the seminal D94 paper. D00 expand on D94 to include maximum and minimum temperature, while D02 fully describe the knowledge-based system and the various physical processes included in it. Beyond elevation, the influence of large bodies of water on precipitation and temperature are incorporated by using coastal proximity or distance to the coastline. Finally, cold-air drainage down slopes and subsequent pooling in valleys is modeled as well. A conceptual two-layer atmosphere where layer 1 is the boundary layer containing temperature inversions and layer 2 is the free atmosphere is applied to the DEM. A simple twolayer atmospheric model for temperature is necessary to capture near-surface temperature inversions. This method identifies areas highly susceptible to inversions (e.g., valleys) and allows for the model to have temperature lapse rate reversals from increasing temperature with height below to decreasing above the inversion. Grid points that are identified to be within the boundary layer (layer 1) are allowed to have strong temperature inversions. These grid points are identified using the topographic position concept, which essentially computes the average nearby topographic relief to identify valleys and ridges. D94, D02, and D08 provide extensive details on the underlying theory of this knowledge-based approach.

TIER terrain pre-processing
The TIER pre-processing routines consist of the functions used to generate the required terrain attributes for the regression model. Currently, this consists of functions that perform NetCDF input/output (IO), process the DEM into topographic facets, the distance to the coast, topographic position, and estimate the idealized two-layer atmosphere. A parameter and control file specifies model parameters, and IO directories and files; see Tables 1 and 2, respectively. A flowchart describing the general flow, order of operations, and data requirements is given in Fig. 1.

Topographic facets
The native resolution DEM is first smoothed using a userdefined filter ( Table 2). The "Daly" filter is defined as (D94) where S E i,j is the smoothed elevation and E i,j is the highresolution elevation at grid point (i, j (3) south (135 • < aspect ≤ 225 • ); (4) west (225 • < aspect ≤ 315 • ); and (5) flat (D94). Flat aspects are areas with terrain gradients (slopes) less than the user-specified minGradient (m km −1 , Table 2). After the facets are defined, small facets are merged together with neighboring facets using the minimum size model parameters (Table 2). Flat regions that are very narrow are considered ridges and behave like neighboring facets. These are merged into the neighboring facets on the west or south slopes depending on their orientation (D94).

Distance to the coast
The user defines a land-ocean mask field in the input grid file. This mask defines which grid points are large bodies of water (ocean points) and are used in the coastal proximity calculation. The distance to the coast is computed using the great circle distance assuming a spherical earth for every grid cell within a user-defined distance threshold.  /path/to/output/processed/grid/file Name of output processed grid stationPrecipPath /path/to/precipitation/station/data/directory Path to precipitation station data stationPrecipListName /path/to/precipitation/metadata/output/file Name of generated precipitation station list file stationTempPath /path/to/temperature/station/data/directory Path to temperature station data stationTempListName /path/to/temperature/metadata/output/file Name of generated temperature station list file preprocessParameterFile /path/to/TIER/preprocessing/parameter/file Name of TIER pre-processing parameter file

Topographic position
To identify the atmospheric layer (Sect. 2.2.4) of a grid point, the local topographic position of a grid point is computed first. The topographic position calculation uses the highresolution DEM. Following D02, for each grid point, the following steps are taken.
The minimum elevation within a user-defined local search radius (r, Table 2) is found. D02 suggest a search radius of 40 km.
where E (i − r : i + r, j − r : j + r) denotes the DEM elevations within ±r grid points at valid land points in the i and j directions, and E M i,j is the local minimum elevation. The topographic position is then estimated as where N g is the number of land grid points within the search radius (N g ≤ r 2 ).

Two-layer atmosphere
Following the determination of the topographic position, each grid cell is placed into the first (boundary or inversion layer) or second layer (free atmosphere) of the idealized twolayer atmosphere. The height of the inversion layer is defined by the user (Table 2) and added to the mean elevation computed on the left-hand side of Eq. (3). This defines an inversion height above sea level for all grid points. All grid points where E i,j is less than the inversion height are placed into layer 1, while all other grid points are placed into layer 2 (D02).

Station metadata
After the input domain grid file has been processed, the preprocessing routine generates station metadata files for all precipitation and temperature stations that will be used in the regression model. Each station is assigned the closest grid point value of the smoothed DEM, facet, topographic position, atmospheric layer, and coastal distance.

Interpolation model
The regression model is applied to each land-masked grid cell. It consists of routines to compute the station weights, to estimate the meteorological-terrain relationships, and to estimate the variable value at each grid point. A parameter and control file specifies model parameters, and IO directories and files; see Tables 3 and 4, respectively. A flowchart describing the general flow, order of operations, and data requirements is given in Fig. 2. Figures 3 and 4 provide more detailed flowcharts for the specific processing flow for precipitation and temperature variables.
where W is the final weight vector, W d is the distancedependent weights, W f is the facet weights, W l is the atmospheric layer weights, W t is the topographic position weights, and W p is the coastal proximity weights. All component weights and the final weight vector are normalized to sum to unity (D02). For precipitation, only W d , W f , and W p are used to estimate W , while temperature uses all five component weights in W .

Distance-dependent weighting
A station's relevance to the current grid point decreases as the station distance increases (e.g., Shepard, 1968); thus, this component station weighting decreases with increasing distance. Here, we generally follow the synagraphic computer mapping (SYMAP) algorithm of Shepard (1968Shepard ( , 1984 and develop inverse distance weights that are further modified by including direction information. Direction information is used to downweigh stations that are in a similar direction but further distance than other stations, as their influence has been "shadowed" by the nearer station (Shepard, 1968). The distance weighting function of Barnes (1964) is used: where I is the inverse distance-dependent weights, d is the station distance vector, and y and s are the user-defined Barnes exponent and scale factor, respectively (Table 4). The angle-dependent weights are then computed as (Shepard, 1984) where T s is the station angle weight for station s, and subscripts s and q denote station subscripts for stations 1 : n x , where n x is the maximum number of stations considered for each grid point (Table 4). The final distance-dependent weights are then computed as where n is the number of stations considered at the current grid point. W d is then normalized to sum to unity.

Facet weighting
Stations on the same facet type as the current grid point receive an initial facet weight of 1. D02 introduces a method to reduce the weight of stations on the same facet type but with intervening facets of different types between the station and grid point (D02 Eq. 5). This is not implemented here, and all stations on the same facet type as the current grid point receive the same weight. This could be a decision considered for exploration in a future TIER release. The distancedependent weights already account for this implicitly, but the explicit inclusion of additional weight decreases for stations on the same facet type will increase the localization of the TIER station weighting even further. This would increase the small-scale features of TIER.

Atmospheric layer
The atmospheric layer weight function is defined as where l s is the layer difference between the grid point and station s, the elevations are defined using the high-resolution DEM and station elevation (E s ), and a is the user-defined layer weighting exponent (Table 4). Following D02, stations in the same atmospheric layer as the current grid point receive an initial weight of 1. D02 includes an additional check to see if the station-grid elevation difference is smaller than some threshold value for a station. If this is true, a station in a different layer and grid cell may still receive a weight of 1. TIERv1.0 does not include the additional conditional statement and only stations in the same atmospheric layer receive an initial weight of 1. The vertical elevation difference is then used to weigh the remaining stations. The atmospheric layer weighting is applied only to temperature variables in TIERv1.0.

Topographic position
The topographic position weights are computed following D07: where t s is the topographic position difference between the current grid cell and station s, t m and t x are the userdefined minimum and maximum topographic position differences, and z is the topographic position weighting exponent (Table 4). The topographic position weight enhances identification of stations that lie in similar topographic areas (e.g., valleys) and is applied only to temperature variables in TIERv1.0.

Coastal proximity
Using the computed distance to the coast, the coastal proximity weights are computed as where p s is the absolute difference between the current grid cell and station s distance to the coast values, and c is the user-defined coastal proximity weighting exponent (Table 4). D02 computes coastal proximity weights using the same inverse distance function but also includes a threshold ( p x ), which, if p s > p x , is set to zero. This weighting factor highlights stations with similar coastal proximity to the current grid cell.

Grid point estimate
Once nearby stations are selected and the final weight vector is computed (Eq. 4), a base grid point estimate is developed using the weighted average of all nearby stations: whereμ b o is the grid point meteorological variable estimate, and Y s and W s are the observed station value and the station weight for station s, respectively. The uncertainty of this value is estimated as the standard deviation of the leave-oneout estimates, which is all possible combinations of n r − 1 stations, n r n r −1 , which in this case are n r possible combinations. where n r is the subset of stations that are both within the distance threshold and on the same facet as the current grid cell,σ b o is the estimated standard deviation ofμ b o , andμ b,−1 is the estimated value when the ith station is withheld. Next, the variable-elevation linear regression coefficients are solved for where β is the vector of linear regression coefficients (β o , β 1 ), A s is the n r ×2 design matrix, W is the n r ×n r diagonal weight matrix populated with the final weight vector W , and Y is the vector of observed station values. In D94, D02, and D08, these coefficients determine the grid point estimate aŝ where β 1M and β 1X are the user-defined minimum and maximum valid regression slopes (Table 4). Note that slope here is in physical units per distance (e.g., mm km −1 or K km −1 ), which is also referred to as the lapse rate in atmospheric science. In TIERv1.0, we have chosen to use the base grid point estimate,μ b , as the intercept value in the variable-elevation regression equation. This is done because whenβ 1 falls outside of the bounds in Eq. (14), a default slope value is used, butβ 0 is not modified. Thus, the base estimate of a variable for a grid cell is sometimes derived from an equation the system considers invalid. Therefore, we fully disassociate the intercept and slope estimates. Here, we provide an initial assessment of this choice in Sect. 3, but this methodological choice should be examined in more detail in future work. Subsequently, we then also modify the elevation used in the regression equation to be the difference between the high-resolution DEM elevation and the W weighted station elevation using the smoothed DEM station elevations. The switch to an elevation difference is required as we are effectively correcting the base estimate to the DEM elevation, and the base estimate has an intrinsic elevation associated with it. Therefore, the TIERv1.0 grid point estimate iŝ where E is the difference between the smoothed DEM elevation and the W weighted station elevation using the smoothed DEM station elevations. Whenβ 1 is invalid, the default slope is used, and when the initialβ 1 is valid, the uncertainty ofβ 1 is estimated in a similar manner to that of µ b o using Eq. (12). Note that for temperature variables only, the user can define a spatially variable default lapse rate (Table 3, Fig. 4). The standard deviation of all valid slope estimates from the leave-one-out estimates, n r n r −1 , is used as the uncertainty estimate ofβ 1 : whereσ β 1 is the estimated standard deviation ofβ 1 andβ 1,−1 is the estimated value when the ith station is withheld. D02 define the method to adaptively adjust the station search radius until the minimum number of needed stations is met. Here, we do not adjust the search radius and instead attempt the regression and uncertainty estimation when n r ≥ n m , where n m is the user-specified minimum number of stations required for the regression (Table 4). When n r < n m , the regression is attempted for 2 ≤ n r < n m , and the default slope is used when n r < 2. Additionally, for n r < n m , Eq. (16) is never applied and there is no direct uncertainty estimate ofβ 1 for those grid cells.
Finally, D94 found that normalizing the precipitation lapse rate (km −1 ) after performing the regression reduces the large spatial variability in precipitation lapse rates due to the large spatial variability in the underlying precipitation amounts. The normalization is done at each grid cell aŝ where where Y P is the mean precipitation (mm) of all stations considered for the regression for the current grid point,β 1P is the estimated slope in physical units (mm km −1 ), and Y P ,s is the station precipitation (mm) at station s. Accordingly,σ B 1 P iŝ The normalization allows for the bounds in Eqs. (14)-(15) to be broadly applicable for precipitation, as well as for a reasonable default lapse rate to be applied to grid points where a valid regression slope cannot be found. Temperature lapse rates are computed in physical units (K km −1 ), as there is little variability in temperature lapse rates.

Post-processing
Several post-processing steps are undertaken to reach the final gridded estimates after all grid points have an initial estimate, shown in Fig. 5. These include updating estimated slope values, applying spatial filters, and recomputing the final fields.

Precipitation
The initial precipitation normalized slope estimates are used to recompute the default slope if the user specifies (Table 4).
In this case, all grid points with valid regression slopes are used to compute the domain mean normalized precipitation slope. This value is then substituted at all grid points with default slope estimates. Next, a 2-D Gaussian filter is applied to the normalized slopes to reduce noise and smooth the artificial numerical boundaries in slope values and is taken as the final precipitation slope estimate (Fig. 5a). The parameters of this spatial filter (size and spread) are specified in the TIER model parameter file (Table 4).
After the slope estimates have been finalized, the precipitation is field is recomputed using Eq. (15) and then a feathering process is applied to smooth any remaining very large gradients (e.g., D94; Fig. 5a). The feathering routine operates on the normalized precipitation slopes and searches for grid cell-grid cell gradients in the normalized slope larger than a user-specified value (Table 4). If a large gradient is found, the slope of the grid cell with less precipitation is increased until the gradient falls below the maximum allowable value. The feathering routine iterates over the grid until there are no remaining large gradients and is an additional smoothing step for precipitation in TIER. Also, the feathering routine only runs for grid cells with larger elevation changes than a userspecified minimum gradient (Table 4), which effectively ignores flat areas (D94), and in TIERv1.0 the feathering routine only operates on grid cells above a user-specified minimum elevation.
Finally, uncertainty estimates are recomputed for the entire grid, first for the base estimate and slope components, then for the total uncertainty of Eq. (15) (Fig. 5a). For those grid points with no initialσ B 1 , the nearest neighbor estimate is used. Then the same Gaussian filter applied to the normalized precipitation slopes is applied to the griddedσ b o andσ B 1 . The final uncertainty contribution due to uncertainty in the precipitation slope at a grid point in physical units (mm) is then computed aŝ whereσ βP is the final uncertainty (mm) due to uncertainty in the precipitation slope,σ β 1 P is the final slope uncertainty (mm km −1 ), andμ P is the final precipitation estimate (mm). The filteredσ b o P field is used as the final base precipitation estimate,σ bP . The total uncertainty is estimated as the combined standard deviation of the two component estimates: because the covariance between the two component uncertainties is sometimes nonzero. The covariance is computed locally at each grid point using a user-defined 2-D window of points (Table 4) around the current grid point.

Temperature
Post-processing for temperature is simpler than that for precipitation because the temperature lapse rates are in physical units. The initial valid temperature slope estimates are used to recompute the default lapse rate if the user specifies (Table 4) when there is no spatially varying default temperature lapse rate information provided (Table 3). Again, the mean of all valid regression slope estimates is used as the updated default temperature lapse rate for this case. As for precipitation, a 2-D Gaussian filter is then applied to the slopes to reduce noise and smooth the artificial numerical boundaries in slope values and is taken as the final temperature slope estimate (Fig. 5b). Then, the final temperature estimate is computed using these updated lapse rate values and Eq. (15). As for precipitation, the component and total uncertainty estimates are then finalized for temperature. The base temperature estimate uncertainty and slope uncertainty are smoothed using the 2-D Gaussian filter to estimate the final component uncertainties,σ bT andσ β 1 T , respectively. Then, the final temperature uncertainty contribution due to temperature lapse rate uncertainty is computed using Eq. (18), and Eq. (19) is used to compute the total uncertainty of the temperature estimate, substituting subscript Ts for Ps in both.

Model evaluation and sensitivity experiments
An example use case over the western United States, focused primarily on the Sierra Nevada mountains between roughly 35-43 • N and 118-125 • W including precipitation, maximum (T max ) and minimum (T min ) temperature data (Fig. 6), is used for computing basic model evaluation statistics. This evaluation is to simultaneously determine if the TIERv1.0 algorithm is performing as expected numerically and to provide a brief baseline of performance. We calculate bias and mean absolute error (MAE) statistics from the final gridded meteorological variables using all available stations or a Figure 6. The TIER test domain with (a) the temperature station distribution and (b) the precipitation station distribution. Contours indicate the 0, 500, 1500, and 2500 m elevation contours moving from black to light gray. calibration sample evaluation. Additional evaluation is considered outside the scope of this initial presentation of the model.
The gridded output fields are nearly unbiased for all three variables: 0.2 mm, −0.22, and −0.21 K for precipitation, T max , and T min , respectively. MAE values are 0.84 K for T max and 0.75 K for T min and 14.3 mm for precipitation (Table 5). Additionally, the gridded output values have nearly zero conditional bias for temperature, as indicated in Fig. 7a-b, where the fitted slope to the TIER-observation points is 0.93 and 0.96 for T max and T min , respectively. There is an overestimation at smaller values transitioning to an underestimation at larger values. Precipitation has the same conditional bias structure as temperature (Fig. 7c); however, the slope of the TIER-observation fitted linear regression is 0.88, indicating a larger conditional bias as observed precipitation increases. Figure 8 highlights the methodological choice made in Sect. 2.3.2 to disassociate the intercept parameter from the regression estimated slope in Eq. (15) for precipitation. We Table 5. Calibration sample evaluation statistics for TIER using the default parameters with 90 % confidence intervals in parentheses. Comparison to in-sample station observations shows that thê β 0 estimation method results in higher biases and MAE than TIERv1.0: 38.2 versus 0.2 mm bias and 51.9 versus 14.3 mm MAE for the two methods, respectively. These differences could be due to several reasons, including that TIERv1.0 parameters were subjectively tuned for the published methodology. Also, in-sample validation does not truly determine method performance, an out of sample verification exercise and further evaluations should be undertaken. The PRISM-similar method within the PRISM model performs extremely well and may likely be more appropriate for higher elevations given the tendency for these types of linear regression systems to underestimate precipitation above the highest observation when using smoothed DEM values (see Sect. 4d.4 and parameter B1EX in Table 1 of D94).

Model parameter experiments
Here, we explore the impact of model parameter changes on the output values and their associated uncertainty estimates. We modify TIER model parameters only (no pre-processing parameters) and make three parameter changes to parameters focused on different parts of the interpolation model for different variables in an effort to concisely highlight how model parameter choices impact the final product. First, we modify the inverse distance weighting exponent in the distancedependent weighting function for T min (experiment 1), then we modify the coastal distance weighting exponent for precipitation (experiment 2), and finally we modify the maximum number of stations allowed for each grid point for precipitation (experiment 3).

Experiment 1
In experiment 1, the parameter "distanceWeightExp" (Table 4), which is the exponent in the distance-dependent weighting function (Eq. 5), is modified from 2 (default) to 1.75 (modified) for a spatial simulation of T min . This decreases the negative slope of the inverse distance weighting function such that stations further from the considered grid point receive more weight in the modified case than the default. The resulting T min distributions and difference field are given in Fig. 9. The spatial distributions are very similar throughout most of the domain as the observation network is relatively high density across most of the domain (Fig. 6). Where the station density decreases along the eastern side of the domain, differences increase in magnitude east of 119 • W. Notably, there are also pockets of differences outside of ±1 • C in areas with high station density along the coast and between 40-42 • N and 121-123 • W. These locations contain complex terrain, specifically large elevation gradients, and the modified station weights result in different estimated temperature lapse rates in addition to changes in the base estimate, resulting in the different temperature estimates. However, the calibration sample statistics are not significantly different at the 90 % confidence level than the default parameter set ( Table 6), suggesting that the changes in the gridded field are not able to be differentiated in a meaningful way.

Experiment 2
For experiment 2, we examine precipitation and modify the "coastalExp" (Table 4) parameter from 0.75 to 1 to examine the influence of changes to the coastal proximity weighting. Qualitatively, the two precipitation distributions are identical to the overall precipitation pattern, remaining essentially unchanged ( Fig. 10a-b). The difference fields show that there are shifts in the precipitation placement throughout the domain through the alternating positive/negative difference patterns, particularly across the complex terrain (Fig. 10b), but essentially there is no net precipitation change with a total relative difference of 0.2 % between the two estimates. Absolute differences can be as large as 46 mm in areas of large total accumulations; however, the relative differences in those areas are generally less than 10 % (Fig. 10b). Correspondingly, dry areas have smaller absolute differences but sometimes larger relative differences, as can be seen along the eastern third of the domain (Fig. 10b). The mean absolute value of the cell-to-cell precipitation gradient is 1.56 mm km −1 versus 1.69 mm km −1 (8.3 % increase) in the base and modified cases, respectively. Increased station localization should be expected to increase high-frequency variability and thus spatial gradients. The calibration sample statistics are not significantly different at the 90 % confidence level from the default parameter set (Table 6). However, in this case, the confidence bounds are almost all nonoverlapping, which may suggest increasing the coastal exponent further would improve the model performance.
The total uncertainty is generally increased by a few millimeters across the domain (0.55 mm on average), with a corresponding relative increase in uncertainty of around 5 %-10 % (Fig. 10c). This is due to the fact that increasing this weight exponent decreases the weight of stations more dissimilar to the current grid point, effectively increasing the localization of the weights and increasing the variability of the leave-one-out estimates (Eq. 16).

Experiment 3
Finally, we change the "nMaxNear" parameter (Table 4) from 10 to 13, which controls the maximum number of stations used for each grid point the interpolation model. Again, the precipitation pattern is essentially qualitatively unchanged between the two configurations with a 0.2 % domain average change; also see Fig. 11a. However, the relative difference fields highlight larger and more systematic changes to the precipitation distribution than in experiment 2. Areas of the highest accumulation in the base case have less precipitation in the modified case (compare Fig. 10a and Fig. 11a) with 89 % (217/245) of the grid points having precipitation > 300 mm in the base case having less precipitation in the modified case. Conversely, 57 % (7686/13 430) of the grid points having < 300 mm in the base case have more precipitation in the modified case. This is because generally more stations are included in the estimate for each grid cell, which results in a smoother final estimate through smoothed base and slope precipitation estimates in Eq. (15). The mean absolute value of the cell-to-cell precipitation gradient is 1.56 mm km −1 versus 1.48 mm km −1 (5.5 % decrease) in the base and modified cases, respectively. The calibration sample statistics are statistically equivalent to the base case, but the MAE in experiment 3 is statistically significantly larger than that in experiment 2. This is an expected result given that the final estimate is less localized for any specific station.   Increasing the number of stations considered reduces the estimated uncertainty across nearly the entire domain ( Fig. 11b-c). On average, there is a 2.1 mm (16 %) reduction in the domain mean uncertainty, with some grid cells having reductions > 25 %. The large decreases are primarily in regions of complex terrain, and this is controlled by changes in the slope uncertainty estimate,σ βP (Fig. 11c). This change in total uncertainty is slightly larger than but opposite in sign to the parameter modification in experiment 2.

Summary and discussion
The Topographically InformEd Regression (TIER) software was developed for several reasons. First, the systems for spatial modeling of meteorological variables from in situ observations have matured to the point that they are complex systems with many methodological choices and model parameters. TIERv1.0 provides an initial implementation of a knowledge-based statistical modeling system based on D94, D02, D00, D07, and D08 with the capability to explore different methodological choices in a systematic fashion. The system is modular so that new knowledge-based ideas can be added to the regression model through including new weighting terms. Model parameters are also accessible to the user, allowing for parameter perturbation experiments. More broadly, this should be viewed as a first step towards development of flexible, open-source systems that include many of the commonly used spatial interpolation models so the community can more fully understand methodological choices in gridded meteorological product generation (e.g., Newman et al., 2019). Understanding how methods and model parameters interact and modify the final output is key to improving these systems.
The parameter experiments performed here provide three examples highlighting how minor changes to one model parameter impact the final spatial distribution. For example, modifying the coastal weight exponent results in a shift in placement of precipitation across the domain (Fig. 10) and systematic changes in the estimated uncertainty. Increasing the maximum number of stations considered for the interpolation results in systematic changes to the precipitation distribution and decreases the sharpness of the final field (Fig. 11). Also, the spatial gradients of precipitation and total uncertainty changes are of opposite sign for experiments 2 and 3. In general, parameter changes that act to increase localization will enhance gradients and uncertainty, while those that decrease localization or increase sample sizes will decrease gradients and uncertainty. This highlights that parameter interactions could play a role in the final result through positive or negative feedbacks. Finally, experiments 2 and 3 result in non-significant differences as compared to the base case, while the MAE between the modified parameter sets in experiments 2 and 3 results in statistically significant MAE differences.
Given the ability to perform parameter sensitivity experiments in TIER, we reemphasize the need for novel evaluation methods including out-of-sample station networks (e.g., Daly, 2006;Daly et al., 2017;Newman et al., 2019) that are as independent from the input networks as possible and integrated validation methods using ancillary observations such as streamflow and other modeling tools such as hydrologic models (Beck et al., 2017;Henn et al., 2018;Laiti et al., 2018).
Finally, TIER does not implement the exact system developed by Daly and colleagues and will not produce the same climate fields even with the same input data. TIER is not duplicating source code and every feature described in D94, D00, D02, D07, and D08, as TIER was developed as a knowledge-based system following these papers, not replicating them and other unpublished details. Also, TIER version 1.0 does not contain station input data pre-processing routines. Instead, example input data are provided in the example cases' dataset (Sect. 5). Station pre-processing and quality control can encompass a vast number of methods (e.g., Serreze et al., 1999;Eischeid et al., 2000;Durre et al., 2008Durre et al., , 2010Menne and Williams 2009). These methods may be included in future releases or as separate community station quality control tools.
Data availability. The input data for the example domain used here are available at https://ral.ucar.edu/solutions/products/ the-topographically-informed-regression-tier-model (Newman, 2019c