Interactive comment on “ Implementation and comparison of a suite of heat stress metrics within the Community Land Model version 4 . 5 ” by J . R

Abstract. We implement and analyze 13 different metrics (4 moist thermodynamic quantities and 9 heat stress metrics) in the Community Land Model (CLM4.5), the land surface component of the Community Earth System Model (CESM). We call these routines the HumanIndexMod. We limit the algorithms of the HumanIndexMod to meteorological inputs of temperature, moisture, and pressure for their calculation. All metrics assume no direct sunlight exposure. The goal of this project is to implement a common framework for calculating operationally used heat stress metrics, in climate models, offline output, and locally sourced weather data sets, with the intent that the HumanIndexMod may be used with the broadest of applications. The thermodynamic quantities use the latest, most accurate and efficient algorithms available, which in turn are used as inputs to the heat stress metrics. There are three advantages of adding these metrics to CLM4.5: (1) improved moist thermodynamic quantities; (2) quantifying heat stress in every available environment within CLM4.5; and (3) these metrics may be used with human, animal, and industrial applications. We demonstrate the capabilities of the HumanIndexMod in a default configuration simulation using CLM4.5. We output 4× daily temporal resolution globally. We show that the advantage of implementing these routines into CLM4.5 is capturing the nonlinearity of the covariation of temperature and moisture conditions. For example, we show that there are systematic biases of up to 1.5 °C between monthly and ±0.5 °C between 4× daily offline calculations and the online instantaneous calculation, respectively. Additionally, we show that the differences between an inaccurate wet bulb calculation and the improved wet bulb calculation are ±1.5 °C. These differences are important due to human responses to heat stress being nonlinear. Furthermore, we show heat stress has unique regional characteristics. Some metrics have a strong dependency on regionally extreme moisture, while others have a strong dependency on regionally extreme temperature.


Introduction
Heat-related conditions are the number one cause of death from natural disaster in the United States -more than tornadoes, flooding, and hurricanes combined (NOAAWatch, 2014).Short-term duration (hours) of exposure to heat while working may increase the incidence of heat exhaustion and heat stroke (Liang et al., 2011).However, long-term exposure (heat waves or seasonally high heat), even without working, may drastically increase morbidity and mortality (Kjellstrom et al., 2009a).Although there is high uncertainty in the number of deaths, the 2003 European heat wave killed 40 000 people during a couple weeks in August (García-Herrera et al., 2010), and tens of thousands more altogether for the entire summer (Robine et al., 2008).The 2010 Russian heat wave, the worst recorded heat wave, killed 55 000 people over the midsummer (Barriopedro et al., 2011).
A growing literature is concerned with the frequency and duration of heat waves (Seneviratne et al., 2012, and references therein).One study concluded that intensification of 500 hPa height anomalies will produce more severe heat waves over Europe and North America in the future (Meehl and Tebaldi, 2004).Another study shows that, even with including the global warming "hiatus", there is an increasing occurrence of extreme temperatures (Seneviratne et al., 2014).Multiple studies associate lack of precipitation and/or low soil moisture with contributing to high temperatures (Fischer et al., 2007;Mueller and Seneviratne, 2012;Miralles et al., 2014).
Regarding humans, however, temperature differences are not the primary method for heat dissipation.Evaporation of sweat is crucial to maintaining homeostasis, and none of the aforementioned studies incorporate atmospheric moisture to measure heat stress.Many diagnostic and prognostic methods were developed to diagnose heat stress (over a 100-year history, Table 1) -such as the Wet Bulb Globe Temperature (WBGT), the Discomfort Index (DI), or Heat Index (HI)and policy makers have decided to incorporate these indices in weather warning systems (Epstein and Moran, 2006;Parsons, 2006Parsons, , 2013;;Rothfusz, 1990;Fiala et al., 2011).
There are a limited number of studies validating, exploring, or using heat stress metrics on a global scale (Kjellstrom et al., 2009b;Hyatt et al., 2010;Sherwood and Huber, 2010;Fischer and Schar, 2010;Fischer et al., 2012;Fischer and Knutti, 2012;Willett and Sherwood, 2012;Dunne et al., 2013;Kjellstrom et al., 2013;Oleson et al., 2013b).Algorithms for measuring heat stress and labor capacity are based upon sub-daily rates of exposure to heat stress (Parsons, 2006).Most of these studies do not capture the diurnal cycle of heat stress (Kjellstrom et al., 2009b;Hyatt et al., 2010;Fischer and Schar, 2010;Fischer and Knutti, 2012;Willett and Sherwood, 2012;Dunne et al., 2013;Kjellstrom et al., 2013), thus not representing either nighttime highs or daytime extremes.Only one study includes solar radiation as a component in heat stress (Kjellstrom et al., 2013).Different metrics are used between each study, and only one study attempts to compare more than two metrics (Oleson et al., 2013b).
Occasionally, results using heat stress limits are misinterpreted.One study confuses wet bulb temperature thresholds with dry bulb temperature thresholds (Benestad, 2011).This has misleading consequences as their results do not include moisture metrics, yet the author cites Sherwood and Huber's (2010) wet bulb threshold (35 • C) as the threshold value for their temperature analysis.The wet bulb temperature at 35 • C is a theoretical limit where humans would die from heat stress after 6 h of exposure.Benestad's (2011) misapplication implies that most humans should die every year, because a great portion of the world reaches temperatures of 35 • C for more than a 6 h period.
Our goal here is to improve the situation by creating a module that calculates a large suite of metrics, using the most accurate and efficient algorithms available, that may be used with as many applications as possible: climate models, offline archive data, model validation studies, and weather station data sets.We call this module the HumanIndexMod.The module calculates four moist thermodynamic quantities and nine heat stress metrics.These heat stress metrics are in operational use worldwide and cover a wide range of assumptions.
As an example of numerous applications, we implement the HumanIndexMod into the Community Land Model (CLM4.5), a component model of the Community Earth System Model (CESM), maintained by the National Center for Atmospheric Research (NCAR) (Hurrell et al., 2013).The metrics are directly calculated at the sub-grid scale, capturing heat stress in every environment: urban areas, lakes, vegetation, and bare ground.We show examples of the advantages of calculating these metrics at the model time step as compared to lower temporal resolution, and the importance of using accurate moist thermodynamic calculations.We also show that having all metrics calculated at the same time allows for comparison of metrics between each other, and allows for unique analysis of conditional distributions of the inputs.Finally, we show that the metrics may also be used as model diagnostics.The outline of the paper is as follows: Sect. 2 (Heat stress modeling) focuses on the development, calculation, and use of these 13 metrics.Section 3 (Methods) describes the implementation and model setup.Section 4 (Results) presents the results of a model simulation using these metrics.Section 5 (Discussion) discusses the implications of the research, and Sect.6 (Summary) presents the conclusions of the paper.

Background
The primary focus of this paper is on atmospheric-variablebased heat stress metrics that we introduce into the Hu-manIndexMod.The models for determining heat stress for humans vary greatly, ranging from simple indices to complex prognostic physiology modeling (Table 1).Prognostic thermal models are beyond the scope of this paper, as they require more than atmospheric inputs.Additionally, metrics that include radiation and wind (with one exception: Apparent Temperature, or AT) are also beyond the scope of this paper.Each index that we chose uses a combination of atmospheric variables: temperature (T ), humidity (Q), and pressure (P ).We chose these metrics because they are in operational use globally by industry, governments, and weather services.Furthermore, these metrics may be applied to the broadest range of uses: climate and weather forecasting models, archive data sets, and local weather stations.
Sections 2.2-2.4 describe the metrics that we have chosen to implement in the HumanIndexMod (see variables defined in Table 2).Most of the metrics have units of temperature, which may be misleading.The metrics have temperature scales for comparative purposes only, as the metrics are an index, not a true thermodynamic quantity.We break these metrics into three categories, based upon design philosophies: comfort, physiological response, and empirical fit.Comfort-based algorithms are a quantification of behavioral or "feels-like" reactions to heat in both animals and humans.Physiological indices quantify the physical response mechanisms within a human or animal, such as changes in heart rate or core temperatures.The empirical indices quantify relationships between weather conditions and a non-

Comfort algorithms
We use Apparent Temperature, Heat Index, Humidex, and Temperature Humidity Index for Comfort to account for comfort level.These metrics were either tailored to the global locations where they were developed or streamlined for ease of use from physiology models.The underlying philosophical approach to deriving comfort metrics is representing behavioral reactions to levels of comfort (Masterson and Richardson, 1979;Steadman, 1979a).The goal of these equations of comfort is to match the levels of discomfort to appropriate warnings for laborers (Gagge et al., 1972) and livestock (Renaudeau et al., 2012).Discomfort in humans sets in much earlier than physiological responses; i.e., the human body provides an early warning to the mind that continuing the activity may lead to disastrous consequences.For example, when heat exhaustion sets in, the body is sweating profusely, and often there are symptoms of dizziness.However, the actual core temperature for heat exhaustion is defined at 38.5 • C, which is considerably lower than heat stroke (42 • C).We describe the four comfort-based algorithms below.
Apparent Temperature was developed using a combination of wind, radiation, and heat transfer to measure thermal comfort and thermal responses in humans (Steadman, 1994).AT is used by the Australian Bureau of Meteorology and was developed for climates in Australia (ABM, 2014).The metric is an approximation of a prognostic thermal model of human comfort (Steadman 1979a(Steadman , b, 1984)).
where the vapor pressure (e RH ) is in pascals and is calculated from the relative humidity (RH in %) and saturated vapor pressure (e sPa , also in pascals).We use this notation because e s (Table 2) is in millibars.These variable names are the explicit names of the variables in the HumanIndexMod.AT uses the wind velocity (m s −1 ) measured at the 10 m height (u 10 m ).Air temperature (T C ) and AT are in units of degrees Celsius.AT is the only metric in the HumanIndexMod that includes an explicit calculation for wind velocity; the other metrics assume a reference wind.We included this metric due to a previously used legacy version within CLM4.5 (Oleson et al., 2013b).An assumption made by AT is that the subject is outside but not exposed to direct sunlight.AT has no explicit thresholds; rather, the index shows an amplification of temperatures.Previous work, however, has used temperature percentiles to describe AT (Oleson et al., 2013b).Heat Index was developed using a similar process to AT.The United States National Weather Service (NWS) required a heat stress early warning system, and the index was created as a polynomial fit to Steadman's (1979a) comfort model (Rothfusz, 1990).
Here, air temperature (T F ) and HI are in degrees Fahrenheit.HI has a number of assumptions.The equation assumes a walking person in shorts and T-shirt, who is male and weighs ∼ 147 lbs (Rothfusz, 1990).Additionally, this subject is not in direct sunlight.As with AT, HI represents a feels-like temperature, based upon levels of discomfort.HI uses a scale for determining heat stress: 27-32 Humidex (HUMIDEX) was developed for the Meteorological Service of Canada and describes the feels-like temperature for humans (Masterson and Richardson, 1979).The original equation used dew point temperature, rather than specific humidity.The equation was modified to use vapor pressure, instead: HUMIDEX is unitless because the authors recognized that the index is a measure of heat load.The index has a series of thresholds: 30 is some discomfort, 46 is dangerous, and 54 is imminent heat stroke (Masterson and Richardson, 1979).
The Temperature Humidity Index for Comfort (THIC) is a modification of the Temperature Humidity Index (THI) (Ingram, 1965).Comfort was quantified for livestock through THIC (NWSCR, 1976).We use the original calibration, which is for pigs (Ingram, 1965).The index is unitless: where wet bulb temperature (T w ) is in degrees Celsius.The index is used to describe behavioral changes in large animals due to discomfort (seeking shade, submerging in mud, etc.).The index is in active use by the livestock industry for local heat stress and future climate considerations (Lucas et al., 2000;Renaudeau et al., 2012).The index describes qualitative threat levels for animals: 75 is alert, 79-83 is dangerous, and 84+ is very dangerous.There are different approaches to the development of THIC, including considerations of physiology of large animals.

Physiology algorithms
Numerous metrics are based upon direct physiological responses within humans and animals; however, almost all of them are complicated algorithms (e.g., Moran et al., 2001;  Berglund and Yokota, 2005;Gribok et al., 2008;Maloney and Forbes, 2011;Havenith et al., 2011;Gonzalez et al., 2012;Chan et al., 2012).Most metrics require radiation measurements or heart rates, and/or even sweat rates.The available metrics that are calibrated for physiological responses using only meteorological inputs, however, are limited, such as the Temperature Humidity Index for Physiology (THIP; Ingram, 1965): THIP and THIC are modifications of the THI.Additionally, THIC and THIP have applications beyond heat stress.THIP and THIC threshold levels are computed from both indoor and outdoor atmospheric variables.The differences between outdoor and indoor values are used to evaluate evaporative cooling mechanisms, e.g., swamp coolers (Gates et al., 1991a, b).

Empirical algorithms
The last category of metrics is derived from first principle thermo-physiology models, or changes in worker productivity, etc., and then reduced by empirical fit.The first metric we present is a widely used modification of an industry labor standard, the simplified Wet Bulb Globe Temperature (sWBGT): sWBGT was designed for estimating heat stress in sports medicine and adopted by the Australian Bureau of Meteorology; however, it is acknowledged that its accuracy of representing the original labor industry index may be questionable (ABOM, 2010;ACSM, 1984ACSM, , 1987)).We chose, however, to implement sWBGT due to its wide use.sWBGT is unitless, and its threat levels are as follows: 26.7-29.3 is green, or alert; 29.4-31.0 is yellow, or caution; 31.1-32.1 is red, or potentially dangerous; and ≥ 32.2 is black, or dangerous conditions (US Army, 2003).
Discomfort Index was developed in the 1950s as a calibration for air conditioners (Thom, 1959).It was adapted by the Israeli Defense Force as a decision-making tool regarding heat stress (Epstein and Moran, 2006).DI requires T w and T C .The computation of T w in the past was computationally expensive, and the DI equations often used approximations (e.g., Oleson et al., 2013b): where T wS is the wet bulb temperature in degrees Celsius (Stull, 2011).Stull's function has limited range of effective accuracy.
We compute DI with both T wS and T w calculated using our implementation of Davies-Jones (2008) (Eq.A22).We keep the legacy version (Stull, 2011) for comparative purposes.DI is calculated from these inputs: where the DI is unitless and the values are an indicator of threats to the populations: 21-24 is < 50 % of population in discomfort, 24-27 > 50 % of population in discomfort, 27-29 most of the population in discomfort, 29-32 severe stress, and > 32 is state of emergency (Giles et al., 1990).
The last index we present is a measurement of the capacity of evaporative cooling mechanisms.Often, these are referred to as swamp coolers.Large-scale swamp coolers generally work by spraying a "mist" into the air, or blowing air through a wet mesh.This mist then comes in contact with the skin, and subsequently evaporates, thus cooling down the subject.In dry environments, they can be an effective mass cooling mechanism.Unfortunately, swamp coolers raise the local humidity considerably, reducing the effectiveness of direct evaporation from the skin.Swamp coolers are measured by their efficiency: where η(%) is the efficiency, and T t is the target temperature for the room to be cooled towards in degrees Celsius (Koca et al., 1991).Rearranging Eq. ( 11) and solving for T t , where T t is now the predicted temperature based upon environmental variables.The maximum efficiency of typical swamp coolers is 80 %, and a typical value of a sub-standard mechanism is 65 % (Koca et al., 1991).Thus, we calculate T t with two different efficiencies: SWMP80, for η at 80 %, and SWMP65 for η at 65 %.With the mist-injected air cooled to T t , T t is approximately equal to a new local T w .Humid environments or environments that are hot and have an aboveaverage RH relative to their normally high T severely limit the cooling potential of swamp coolers.The livestock industry uses evaporative cooling mechanisms for cooling, often in conjunction with THIP and THIC, as mentioned previously (Gates et al., 1991a, b).Due to their low cost, swamp coolers are used throughout the world as a method of cooling buildings and houses.No one has implemented SWMP65 and SWMP80 in global models, and we believe that this will provide many uses to industry by its inclusion in CLM4.5.
Table 2 shows which metrics are discussed in this paper.

Methods
Our approach is to choose a subset of heat stress metrics that are in common use operationally by governments and/or used extensively in prior climate modeling studies (Table 3).We do this in order to provide a framework to allow comparisons of metrics across studies, and we designate the algorithms the HumanIndexMod.Section 3.1 describes CLM4.5.Section 3.2 discusses the implementation of the HumanIn-dexMod into CLM4.5.Section 3.3 describes our simulation setup that we use to demonstrate the capabilities of the Hu-manIndexMod.The simulation is for showcasing the Hu-manIndexMod, not as an experiment for describing real climate or climate change.Section 3.4 describes a unique application method for analyzing heat stress.

The structure of Community Land Model version 4.5
We use CLM version 4.5, which was released in June 2013 (Oleson et al., 2013a).Boundary conditions for CLM4.5 consist of land cover and atmospheric weather conditions.Each grid cell in CLM4.5 can include vegetation, lakes, wetlands, glacier, and urban areas.There are new parameterizations and models for snow cover, lakes, and crops; a new biogeochemical cycles model; and new urban classifications (Oleson et al., 2013a).The urban biome, a single-layer canyon model, is designed to represent the "heat island", where temperatures are amplified by urban environments (Oleson et al., 2008a(Oleson et al., , b, 2010a(Oleson et al., , b, 2011)).The heat-island effect can increase the likelihood of complications from human heat stress (Oleson, 2012).

HumanIndexMod design and implementation
There are two philosophical aspects to the design of the HumanIndexMod: (1) accurate and efficient moist thermodynamic algorithms, and (2) a modular format to increase use through both narrowly focused applications and broadly based studies.The module is in an open-source format and is incorporated into the CLM4.5 developer branch (the module itself is available from the corresponding author's website).
The modular format encourages adapting the code to specific needs, whether that focus is on moist thermodynamics or heat stress.The inclusion of heat stress metrics covering comfort, physiology, and empirical philosophies encourages the use of HumanIndexMod for many applications.We directly implemented the code into the CLM4.5 architecture through seven modules.Four of these modules -BareGroundFuxesMod, CanopyFluxesMod, SlakeFluxes-Mod, and UrbanMod -call the HumanIndexMod.The HumanIndexMod is calculated for every surface type in CLM4.5.The design of CLM4.5 allows the urban and rural components, where the rural component represents the natural vegetation surface, to be archived separately for intercomparison.The HumanIndexMod uses the 2 m calculations of water vapor, temperature, and pressure, as well as 10 m winds.Three other modules are modified with the implementation process.These modules -clmtype, clmtypeInitMod, and histFldsMod -are used for initializing memory and outputting variable history files.
Moist thermodynamic water vapor quantities in CLM4.5 are calculated within QSatMod.We use the outputs from QSatMod as the inputs to the HumanIndexMod.Within the HumanIndexMod, we created a subroutine, QSat_2, which has all the same functionalities as QSatMod.This subroutine uses the August-Roche-Magnus (ARM) equation (Eq.A13) rather than the Flatau et al. (1992) polynomial equations for vapor pressure in QSatMod.The log derivative of ARM (Eq.A15) is a critical component of the calculation of T w , and it is not available in QSatMod.Additionally, QSat_2 calculates f (θ E ) (Eq.A18) with respect to the input temperature and the subsequent derivatives.These are required to calculate T w (Eq.A22) using Davies-Jones (2008) and cannot be accomplished using QSatMod.We show acceptable differences between the Stull version of wet bulb temperature (T wS ) calculated using both QSatMod and QSat_2 (Fig. 1a).The new subroutines improve CLM4.5 by calculating previously unused thermodynamic quantities.Additionally, these routines are useful moist thermodynamic routines for other data sets for researchers to use, thus expanding the capacity of the HumanIndexMod.
We implement specific thermodynamic routines developed by Davies-Jones (2008) to accurately calculate T w (see Appendix A).Equation (A4) is the most accurate and efficient θ E calculation available (Bolton, 1980;Davies-Jones, 2009).Calculating Eq. (A4) required implementing T L and θ DL (Eqs.A2 and A3, respectively) into the HumanIndexMod.T , P , and Q from CLM4.5 are used to calculate θ E and T E (Eq.A5).T E , a quantity used in a previous heat stress study (Fischer and Knutti, 2012), is an input into QSat_2 for calculating the initial guess of T w , and subsequently followed by the accelerated Newton-Raphson method (Eqs.A9-A22).We found it advantageous to split the heat stress quantities into their own subroutines, allowing the user to choose which quantities are to be calculated.The minimum requirements to execute the entire module are T (K), P (Pa), RH (%), Q (g kg −1 ), e (Pa), and u 10 m (m s −1 ).Table 4 shows the subroutines, input requirements, and outputs in HumanIndex-Mod.

CLM4.5 experimental setup
CLM4.5 may be executed independently of the other models in CESM; this is called an I-compset.To do so, CLM4.5 requires atmospheric boundary conditions.We use the default data set for CLM4.5 -CRUNCEP.CRUNCEP is the NCEP/NCAR reanalysis product (Kalnay et al., 1996) corrected and downscaled by the Climatic Research Unit (CRU) gridded observations data set from the University of East Anglia (Mitchell and Jones, 2005).The time period is 4× daily from 1901 to 2010, and it is on a regular grid of ∼ 0.5 • × 0.5 • .The combination of CRU and NCEP products was to correct for biases in the reanalysis product and improve overall resolution (Casado et al., 2013).To drive CLM4.5 we used surface solar radiation, surface precipita-tion rate, temperature, specific humidity, zonal and meridional winds, and surface pressure.
Our simulation has the carbon and nitrogen cycling on (biogeophysics "CN").The simulation was initialized at year 1850, on a finite volume grid of 1 • × 1 • , using boundary conditions provided from NCAR (Sam Levis, personal communication, 2013).The simulation spun up while cycling 3 times over CRUNCEP 1901CRUNCEP -1920 forcings.Once completed, our experiment used the spun-up land conditions and ran the entirety of 1901-2010.

Heat stress indices analysis
An open question is what drives extreme high-heat-stress events, which are, by definition, rare events.For example, we cannot determine from the mean climate state or from theory, in a warm and humid climate, whether abnormally high temperature, abnormally high moisture, or a combination of the two caused a heat stress event.This is a question of the covariance of perturbations of temperature and humidity, not a statement of mean conditions, and there is no theory to explain these situations.For example, we may apply Reynolds averaging to the NWS Heat Index equation (Eq.3): where a, b, c, d, e, f , g, h, and i are constants in the polynomial.RH and T are relative humidity and temperature, respectively.We are not concerned with the terms outside the brackets, as they are the means.The terms within the bracket are representative of turbulent effects on the heat index, which we are discussing.It is these turbulent states where a GCM is able to determine these individual factors, by calculating the heat stress metrics and thermodynamic quantities at every model time step.Furthermore, each heat stress metric has different assumptions (such as body size, physical fitness, etc.) that weight temperature and humidity differently.A high-heat-stress event indicated by one metric does not necessarily transfer onto another metric.Thus, we outputted 4× daily averages of the heat stress metrics and the corresponding surface pressure (P ), 2 m temperature (T ), 10 m winds (u 10 m ), and 2 m humidity (Q) fields.We computed statistics for the time series (mean, variance, exceedance, etc.).We focus primarily on the 99th percentiles (hottest 1606 6-hour intervals, ∼ 402 days) but also show some of the robust features with the 75th (hottest 40 150 6-hour intervals, ∼ 10 038 days) and 95th percentiles (hottest 8030 6-hour intervals, ∼ 2008 days).
Every 6 h period that exceeds the percentiles was located within the time series, and we calculated the conditional distributions.For example, the 99th percentile exceedance of HI isolated the top 1606 hottest time steps for each grid cell.After isolating these time steps, we use this distribution as a mask to isolate all other quantities (e.g., temperature and humidity), allowing cross comparison between all metrics and HI.The goal is to develop an analysis technique comparing all covariances of the metrics within CLM4.5.
After the conditional distributions are calculated, we, again, compute the statistical dispersion (mean, variance, exceedance, etc.) of the percentiles.We display this analysis with maps in two ways.(1) We show the exceedance value of a metric, and (2) we show T −Q regime plots of that same metric.We calculate the T − Q regimes through expected rank values (Fig. 2).This required a series of steps.(1) We take the conditional distribution of T and Q that represent exceedance percentile of the source heat stress or moist thermodynamic metric.(2) We take the expected value (median) of the conditional distributions of T and Q and determine what percentile they come from in their respective time series.(3) We condition these values on each other to create the expected rank values (Fig. 2).

Results
We present a snap shot of the many metrics calculated.First, we present results of our evaluation the improved moist thermodynamic calculations and the implementation these metrics into CLM4.5 (Fig. 1).Second, we show an example of the possible global applications for these metrics .This approach characterizes heat stress within CLM4.5 in response to one observation reanalysis product, the CRUN-CEP.

Evaluation of improved moist thermodynamic quantities
We present a series of box-and-whisker plots demonstrating the value added of implementing (1) accurate and efficient moist thermodynamic quantities, and (2) online calculation of the heat stress metrics is an improvement over calculating these metrics using monthly or 4× daily model output (Fig. 1). Figure 1a shows the difference in the Stull (2011) wet bulb temperature calculated using the saturated vapor pressure from Davies-Jones (2008) (QSat_2) and Flatau et al. (1992) (QSatMod).The differences are minimal.However, our point is that the Davies-Jones (2008) method for wet bulb temperature is preferred.We show the difference between wet bulb temperatures using Stull (2011) calculated with QSat_2 and Davies-Jones (2008) (which requires QSat_2) (Fig. 1b).Differences are greater than 1 K between Stull (2011) and Davies-Jones (2008) methods, and they are temperature dependent (Fig. 1b).Lastly, we show the difference between calculating Davies-Jones ( 2008) T w using monthly and 4× daily averaged model data versus the model instantaneous calculations (Fig. 1c and d, respectively).Using model-averaged data instead of the instantaneous data systematically overestimates T w by more than 1 K for monthly and 0.5 K for 4× daily output.

Exceedance values and regime maps
We show exceedance and T − Q regime maps for the 75th and 95th percentiles of three metrics, and 99th percentiles of six metrics.The maps show spatial patterns of heat stress and characteristics.Equatorial and monsoonal regions show moderate levels of heat stress in the 75th percentile (Fig. 3a-c).sWBGT shows values exceeding minimum metric warning levels (e.g., China, northern Africa), whereas HI does not have necessarily the same warning.The 95th percentile shows that moderate levels of heat stress have expanded into higher latitudes (Fig. 4a-c).At equatorial and monsoonal regions, heat stress labor reductions should be in effect as it is not safe to work outside and, in some cases (western Africa, the Arabian Peninsula, and the Himalayan wall), not safe to work at all.At the 99th percentile, severe heat stress is experienced in the monsoonal regions (Fig. 5a-c).These maxima correlate with maxima in T w (Fig. 5c).The T − Q regime maps show that partitioning of heat stress into T and Q begins in regional locations at the 75th percentile (Fig. 3d-f).The partitioning occurs in low latitudes and is not consistent between metrics.At the 95th percentile, the partitioning expands into higher latitudes; however, many areas (continental interiors) remain equally dependent on T and Q (Fig. 4d-f).T w is largely driven by extreme moisture (Fig. 4f) and in some locations (monsoonal Africa, Indian sub-continent, and equatorial South America) very extreme moisture.HI is driven by T (Fig. 4e), and sWBGT is mixed between extreme Q and extreme T (Fig. 4d).All three metrics agree with T in the western United States and Middle East.At the 99th percentile, HI, although dominated by T worldwide, shows sign reversals in very small locations (Fig. 5e).Extreme Q expands for T w , and all of the low latitudes experience moisture dependence except for the western United States and Middle East (Fig. 5f).sWBGT has some reversal of T -to Q-dominated heat stress (western Africa).Q largely expands worldwide.In all instances, except for HI, high latitudes are equally dependent on Q and T for heat stress.
Our final maps show SWMP65, SWMP80, and θ E at the 99th percentile.Maxima for θ E are spatially the same as T w (Figs.5c and 6c).Additionally, θ E partitions towards Q, just as T w shows (Figs.5f and 6f).Spatial patterns between SWMP65 and HI are similar (Figs.5b and 6a), and their regime maps show similar partitioning toward T globally, except for select locations of strong monsoonal locations that show Q dependency (Figs.5e and 6d).Lastly, SWMP80 and sWBGT share similar spatial patterns (Figs.5a and 6b).As with the other paired metrics, their T − Q regime maps share the same characteristics (Figs.5d and 6e).Low latitudes show strong Q dependence, and higher latitudes switch to a T dependence.

Discussion
We designed the HumanIndexMod to calculate diagnostic heat stress and moist thermodynamics systematically.There are many approaches to evaluating heat stress.Monthly and seasonal temperature and moisture averages were used for general applications (Dunne et al., 2013); however these averages overestimate the potential severity of heat stress (Fig. 1c, d).Even using daily or sub-daily averages (Kjellstrom et al., 2009b;Hyatt et al., 2010;Fischer and Schar, 2010;Fischer and Knutti, 2012;Willett and Sherwood, 2012;Kjellstrom et al., 2013) potentially overestimates heat stress.This is due to the nonlinear covariance of T and Q, and averages miss these extremes.Ultimately, capturing the diurnal cycle is crucial for quantifying heat stress extremes (Oleson et al., 2013b).Heat-stress-related illness is exacerbated by high-heat-stress nights as well as daytimes.To accurately calculate these extremes, one needs either high-temporalresolution data or direct computation of them at each time step within climate models.We discuss the results from the implementation separately: moist thermodynamics and heat stress.

Moist thermodynamics
The spatial distributions of high heat stress are robust between CLM model versions (Oleson et al., 2011(Oleson et al., , 2013b;;Fischer et al., 2012).Due to the conservation of energy and entropy, calculating moist thermodynamic variables shows that climate models and reanalysis fall along constant lines of T E (Eq.A5), even out to the 99th percentile of daily values (Fischer and Knutti, 2012).The spread between models is small as compared to the spread in T ; thus using heat stress metrics in Earth system modeling may reduce the uncertainties of climate change (Fischer and Knutti, 2012).
Previous modeling studies have demonstrated that urban equatorial regions transition to a nearly permanent high-heatstress environment when considering global warming (Fischer et al., 2012;Oleson et al., 2013b).The convective regions are areas with the highest heat stress maximums and are often near coastal locations.Many of these metropolitan areas are in monsoonal regions, which have strong yearly moisture variability, yet the partitioning of heat stress is towards Q, not T , in these regions (Figs.5d-f and 6d-f).Heat stress in both equatorial and monsoonal regions is expected to increase dramatically when considering global warming (Kjellstrom et al., 2009b;Fischer and Knutti, 2012;Dunne et al., 2013;Oleson et al., 2013b  namic calculations from the HumanIndexMod will aid future characterizations of heat stress.

Heat stress
We show that there are two regimes of heat stress globally in agreement between metrics in the CRUNCEP CLM4.5 simulation, T (western United States and Middle East) and Q (monsoonal regions).The western United States and Middle East regions consistently have higher temperatures and lower humidities than the monsoonal areas.However, we show that maximum heat stress is partitioned between T and Q globally.Characterizing arid regions versus non-arid regions may require different heat stress metrics (e.g., Oleson et al., 2013b; specifically the comparison between Phoenix and Houston).The HumanIndexMod provides this capability.
The assumptions/calibrations that derived the heat stress metrics in the HumanIndexMod are another avenue of research that may be explored using a global model.For exam-ple, the original equation from which sWBGT was derived was calibrated using US Marine Corps Marines during basic training (Minard et al., 1957), who are in top physical condition.HI was calibrated for an "average" American male (Steadman, 1979a;Rothfusz, 1990).Calculating these heat stress metrics, and the many others in the HumanIndexMod, at every time step within climate models was previously intractable due to insufficient data storage capabilities for hightemporal-resolution variables.We show that SWMP65 and SWMP80 diverge in their values (Fig. 6a, b and d, e).Yet, SWMP80 and sWBGT are similar in spatial patterns and regimes, while HI and SWMP65 have similar patterns and regimes.What links SWMP65 and SWMP80 together is T w .Swamp coolers are evaporators, and, as their efficiency approaches 100 %, their solutions approach T w .Figures 5 and  6 are similar to a circuit resistor, or stomatal resistance (Oke, 1987), which is measure of efficiency.The average person (HI) may be acting as a stronger resistor to evaporation than one who is acclimatized (sWBGT).The HumanIndexMod may explore the effects of acclimatization and its impact on efficiency of evaporative cooling through climate modeling.This type of research may ultimately reduce the number metrics required for computing heat stress.Exposure to high temperatures and moisture, ultimately, threatens humans physically, and long-term exposure may lead to death.Extreme moist temperatures are projected to increase in the future, and potentially may reach deadly extremes, permanently in some regions (Sherwood and Huber, 2010).Heat stress indices have the ability to diagnose instantaneous exposure.Diagnostic models, however, cannot measure or evaluate the potential impacts of long-term exposure to heat stress accurately.Prognostic thermal physiological models can be used to predict the complexities of heat stress on humans.
Prognostic thermal physiology considers wind, ambient temperature, and moisture from the environment, as well as internal processes, such as blood flow and sweat.There are numerous different forms of prognostic models (Table 1).Some of them are quite complicated, using hundreds of grid cells to represent all parts of the body (Fiala et al., 1999).
Less complicated models represent the human body as a single cylinder with multiple layers (Kraning and Gonzalez, 1997).Neither computational method is currently coupled to Earth system models, and this is a significant gap in determining future heat stress impacts that the HumanIndexMod may not be able to fulfill.To make progress towards representing the effects of heat stress on the human body prognostically, we recommend, as a first step, incorporating mean radiant temperature of humans.Radiation is a major component of human energy balance, and implementing this also allows incorporating more accurate diagnostics, such as Wet Bulb Globe Temperature (Minard et al., 1957) and the Universal Thermal Climate Index (Havenith et al., 2012).

Summary
We present the HumanIndexMod, which calculates nine heat stress metrics and four moist thermodynamical quantities.curate and efficient algorithms available.The heat stress metrics cover three developmental philosophies: comfort-, physiologically, and empirically based algorithms.The code is designed, with minimal effort, to be implemented into general circulation, land surface, and weather forecasting models.Additionally, this code may be used with archived data formats and local weather stations.Furthermore, we have implemented the HumanIndexMod into the latest public release version of CLM4.5.Archival is flexible, as the user may choose to turn on high-frequency output, and the default is monthly averages.Additionally, monthly urban and rural output of the metrics is default.We show that the module may be used to explore new avenues of research: characterization of human heat stress, model diagnostics, and intercomparisons of heat stress metrics.Our results show that there are two regimes of heat stress -extreme moisture and extreme temperature -yet all of the most extreme heat stress events are tied to maximum moisture.
Our approach has limitations.None of the metrics in the HumanIndexMod include the effects of solar and thermal radiation.Radiation is a non-negligible component of heat stress.As a consequence, the heat stress metrics presented always assume that the subject is not in direct solar exposure.Additionally, the indices represent a diagnostic environment for heat stress.These metrics do not incorporate prognostic components or complex physiology of the human thermal system.
Overall, the HumanIndexMod provides a systematic way for implementing an aspect of thermo-animal physiology into an Earth system modeling framework.Incorporating the HumanIndexMod into a variety of different models would provide a baseline for model-model comparisons of heat stress, such as the Coupled Model Intercomparison Project (CMIP) (Taylor et al., 2012) and other collaborative modeling frameworks.We encourage researchers to incorporate the HumanIndexMod into their research environments.

Figure 1 .
Figure 1.Evaluation of wet bulb temperatures.The boxes represent the 90 % confidence interval.The upper and lower tails represent the 100 % confidence interval.The horizontal line in each box is the median value.(a) is the difference between T wS using QSat_2 saturated vapor pressure and QSatMod saturated vapor pressure over the valid range for T wS .(b) is the difference between T w (Davies-Jones, 2008) and T wS (Stull, 2011) (both using QSat_2 saturated vapor pressure calculation) over the valid range for T wS .(c) is the difference between using model monthly averaged input fields and model instantaneous fields to calculate monthly T w .(d) is the difference between using model 4× daily averaged input fields and model instantaneous fields to calculate 4× daily T w .For (a), (b), and (d) the inputs of T , P , and Q are derived from model 4× daily fields from the years 2001-2010.For (c) the inputs of T , P , and Q are derived from model monthly fields from the years 2001-2010.

Figure 2 .
Figure 2. Expected value ranking.T and Q conditioned upon exceedance value of a heat stress or moist thermodynamic metric.The T and Q values are compared to their respective time series as a percentile.These T and Q percentiles are binned and are compared to each other.Extreme Q are greens and extreme T are magentas.

Table 1 .
Heat stress diagnostics and prognostic models.

Table 2 .
Moist temperature variables and heat stress metrics.

Table 3 .
List of previous heat stress studies.Studies using data sets, reanalysis, and/or model output that range from ∼ 1900 until ∼ 2010 are labeled "modern" and from ∼ 2005 to ∼ 2100 are labeled "future".Some studies do not analyze heat stress quantitatively (Assessment).