A Python-enhanced urban land surface model SuPy (SUEWS in Python, v2019.2): development, deployment and demonstration

Accurate and agile modelling of cities' weather, climate, hydrology and air quality is essential for integrated urban services. The Surface Urban Energy and Water balance Scheme (SUEWS) is a state-of-the-art, widely used urban land surface model (ULSM) which simulates urban-atmospheric interactions by quantifying the energy, water and mass fluxes. Using SUEWS as the computation kernel, SuPy (SUEWS in Python) uses a Python-based data stack to streamline the pre-processing, computation and post-processing that are involved in the common modelling-centred urban climate studies. This paper documents the development of SuPy, including the SUEWS interface modification, F2PY (Fortran to Python) configuration and Python front-end implementation. In addition, the deployment of SuPy via PyPI (Python Package Index) is introduced along with the automated workflow for cross-platform compilation. This makes SuPy available for all mainstream operating systems (Windows, Linux and macOS). Three online tutorials in Jupyter Notebook are provided for users of different levels to become familiar with SuPy urban climate modelling. The SuPy package represents a significant enhancement that supports existing and new model applications, reproducibility and enhanced functionality.


Introduction
Cities need to be resilient to weather, climate, hydrological and air quality hazards given their large and ever-increasing populations (Baklanov et al., 2018). One prerequisite to building resilience is information at various spatio-temporal scales, e.g. to understand energy partitioning over urban surfaces (D. Sun et al., 2017; Wang et al., 2015; Ward and Grimmond, 2017; Zhao et al., 2014), pedestrian level meteorology to diagnose thermal comfort (Bar et al., 2011; Erell et al., 2013; Krayenhoff et al., 2018; Sun et al., 2016; Tan et al., 2009), or ambient radiation and wind conditions to assist building design (Chen, 2004; Jentsch et al., 2013; B. Li et al., 2015; Reinhart and Cerezo Davila, 2016; Santamouris et al., 2001). To obtain such information, accurate and agile modelling capacity for urban weather and climate is essential.
Urban land surface models (ULSMs) are widely used to simulate urban-atmosphere interactions by quantifying the energy, water and mass fluxes between the surface and urban atmosphere (Best and Grimmond, 2015;Chen et al., 2011;Wang et al., 2012). These models require information on urban morphology (e.g. heights, spacings of buildings, etc.) and anthropogenic dynamics (e.g. building-operation-related heat release, emissions of heat by traffic) to be included.
One widely used and tested ULSM, the Surface Urban Energy and Water balance Scheme (SUEWS; Table 1), requires basic meteorological data and surface information to characterize essential urban features (i.e. urban surface heterogeneity and anthropogenic dynamics). SUEWS enables long-term urban climate simulations without specialized computing facilities (Järvi et al., 2014; Ward et al., 2016). SUEWS is regularly enhanced (Järvi et al., 2011, 2014, 2019; Offerle et al., 2003; Ward et al., 2016) and tested in cities under a range of climates worldwide (Table 1). Although operationally simple and scientifically robust, SUEWS still requires some skill for application (e.g. computing environment setup, parameter configuration), which may limit uptake for broader applications in urban planning and design.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Reproducibility and open science principles are increasingly important (Peng, 2011). Although climate scientists by convention publish detailed model configurations used in their research, minor inconsistencies or a lack of transparency of code often hamper efforts to reproduce simulation results. In addition, new users may lack prerequisite knowledge in low-level compilation and scripting to undertake initial model runs and interpretation of simulation results (Lin, 2012).
Today Python is used extensively by the atmospheric sciences community for data analyses and numerical modelling (Lin, 2012;Perkel, 2015) thanks to its simplicity and the large scientific Python ecosystem (e.g. PyData community: https://pydata.org, last access: 12 June 2019). Recent Python-based endeavours include global climate system models (Monteiro et al., 2018), stochastic geological models (de la Varga et al., 2019) and hydrological models (Hamman et al., 2018) to cite just a few.
In this paper, we present SuPy (SUEWS in Python), a Python-enhanced urban climate system based on the popular Fortran-coded SUEWS. The development of SuPy (Sect. 2), the essential workflow in its cross-platform deployment (Sect. 3), and three demonstration tutorials for users of different levels (Sect. 4) are presented.

Development
The following are considered within the design process of SuPy:
1. Data preparation. Climate simulations typically require extensive pre-processing of data (loading input data, reformatting to conform with standards, etc.) and post-processing (conversion of output, graphical and cartographic plotting, etc.). Python has a vast array of utilities to support this; notably, NumPy (the fundamental package for scientific computing with Python, https://www.numpy.org, last access: 12 June 2019) and pandas (a tabular-format-centred data analysis tool) are two cornerstone libraries in Python-based scientific computing.
2. Performance. Python, as a scripting language, has poorer performance than compiled languages (e.g. C, Fortran; Kouatchou, 2018).
4. Extendibility. It is desirable, possibly even essential, for the scientific model to interact with other models and data sources to extend the overall capacity and to explore questions related to urban climate beyond the climate science.
To address these four considerations, SuPy's architecture uses Python's data processing and Fortran's computational efficiency. SuPy consists of three parts (Fig. 1):
1. SuPy. A Python-based front-end processor based on the pandas DataFrame with functionality for data analysis and simulation management (Appendix A).
2. SuPy_driver. Calculation kernel compiled by F2PY (Fortran to Python, part of the NumPy package) (Peterson, 2009) to facilitate the transfer of SUEWS' Fortran modelling ability to Python and guarantee the computational performance.
3. SUEWS. A Fortran-coded local-scale urban land surface model of moderate complexity that can simulate the urban surface energy balance in combination with the complete urban hydrological cycle, considering irrigation and runoff processes (Oke, 1986, 1991; Järvi et al., 2011, 2014; Offerle et al., 2003; Ward et al., 2016).
Development of SuPy (Sun, 2019) started with SUEWS v2017b. SUEWS has three distinct groups of subroutines: model physics, input and output (I/O). To help generalize coupling, the use of Fortran modules to pass variables and parameters has been reduced in favour of Fortran subroutine arguments with explicitly stated intent (e.g. in, out). The modified physics subroutines are called from two subroutines, suews_cal_main and suews_cal_multistep (Fig. 1a), depending on the model time step (single or multi). This structure constitutes the SUEWS v2018c calculation kernel (Fig. 1a) and enables efficient communication between SUEWS and other models (e.g. WRF) through an explicit and unified interface.
The SUEWS kernel (v2018c) is compiled by F2PY to generate the Python-compliant SuPy_driver package. Using the two subroutines allows better computational performance. The SuPy_driver calls one of the two subroutines depending on the simulation type: single-time-step mode (sd_cal_tstep) or multi-time-step mode (sd_cal_multitstep, Fig. 1b). The former allows flexible manipulation of SuPy runtime behaviours (application in Sect. 4.3), while the latter has much better performance because of the much lower computational overhead of the F2PY wrapper. Therefore, sd_cal_multitstep is the default executor in run_supy, the SuPy core processor performing simulations, for regular runs without runtime manipulation.

load_forcing. Meteorological and other external forcing information is loaded into df_forcing (a pandas DataFrame; https://supy.readthedocs.io/en/latest/data-structure/df_forcing.html, last access: 12 June 2019) to drive the SuPy simulations, with the time-step size inferred from its DatetimeIndex (i.e. the freq attribute). SUEWS should be run at short time steps (e.g. 5 min) as precipitation or irrigation runoff from impervious surfaces becomes too large if the water arrives as one large hourly (or longer) amount (Ward et al., 2018). As such, load_forcing is implemented with the ability to downscale the raw forcing data to finer time steps (5 min by default). The temporal resolution of raw forcing data can be between 5 and 360 min, with 30-60 min being the most common.
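The downscaling idea can be sketched with pandas alone. The snippet below is an illustration rather than SuPy's implementation: the helper `downscale_hourly`, the column names `Tair` and `rain`, and the choice of linear interpolation for temperature with an even redistribution of rainfall are all assumptions made for this example. It also assumes that each hourly row is stamped at the end of its period.

```python
import pandas as pd


def downscale_hourly(df_hourly, step="5min"):
    """Downscale hourly forcing to a finer step (illustrative only).

    Assumes rows are stamped at the END of each hour, so an hourly rain
    total belongs to the fine steps that precede its timestamp.
    """
    hour = pd.Timedelta("60min")
    n_sub = int(hour / pd.Timedelta(step))  # fine steps per hour, 12 for 5 min
    start = df_hourly.index[0] - hour + pd.Timedelta(step)
    idx_fine = pd.date_range(start, df_hourly.index[-1], freq=step)

    df_fine = pd.DataFrame(index=idx_fine)
    # instantaneous variable: interpolate in time, back-fill the first hour
    df_fine["Tair"] = (
        df_hourly["Tair"].reindex(idx_fine).interpolate(method="time").bfill()
    )
    # accumulated variable: spread each hourly total evenly over its steps
    df_fine["rain"] = df_hourly["rain"].reindex(idx_fine).bfill() / n_sub
    return df_fine


idx = pd.date_range("2012-01-01 01:00", periods=3, freq="60min")
df_hourly = pd.DataFrame(
    {"Tair": [4.0, 5.0, 7.0], "rain": [0.0, 6.0, 1.2]}, index=idx
)
df_fine = downscale_hourly(df_hourly)
```

In this toy case the 6 mm hourly rainfall arrives as twelve 0.5 mm pulses while the total is conserved, which is the behaviour the short time step is meant to achieve.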
For new users without experience of other versions, a helper function, load_SampleData, is provided to get the sample input DataFrames (i.e. df_state_init and df_forcing) ready to run simulations. Once users understand the SUEWS/SuPy variables, the sample DataFrames provide a template that can be adapted to their own specific needs. Examples using the sample datasets are provided as tutorials (Sect. 4).
To facilitate reuse of model runs (e.g. for model spin-up), df_state_final has the same data structure as df_state_init (dashed line, SuPy panel, Fig. 1a).
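Because the final state shares the structure of the initial state, a spin-up can be written as a loop that feeds each run's final state into the next run. The sketch below shows only this pattern: `run_toy` is a hypothetical stand-in for a model run (it is not a SuPy function), and the state is a plain dictionary rather than a DataFrame.

```python
def run_toy(forcing, state_init):
    """Hypothetical stand-in for a model run: returns (outputs, final state)."""
    state = dict(state_init)  # copy so the caller's initial state is untouched
    outputs = []
    for f in forcing:
        # soil store relaxes towards the forcing value (toy physics)
        state["soil_store"] += 0.1 * (f - state["soil_store"])
        outputs.append(state["soil_store"])
    return outputs, state


forcing_year = [50.0] * 100      # one repeated spin-up period of forcing
state = {"soil_store": 0.0}      # arbitrary cold-start state
for cycle in range(20):
    outputs, state_final = run_toy(forcing_year, state)
    drift = abs(state_final["soil_store"] - state["soil_store"])
    state = state_final          # final state feeds the next run unchanged
    if drift < 1e-6:             # stop once the state no longer changes
        break
```

The loop needs no conversion step between runs, which is the practical benefit of the two states having an identical layout.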

Deployment
To achieve cross-platform compatibility, SuPy has two parts:
1. SuPy_driver (calculation kernel): the F2PY-generated binaries of SUEWS are platform-dependent because compilation is necessary to assure performance.
2. SuPy (front-end processor): this platform-independent Python code allows rapid iteration in functionality enhancement and bug fixing thanks to the powerful ecosystem of Python utilities.
As software compilation can be frustrating and/or prone to operator errors, this procedure is automated using two online services. To build the SuPy_driver, two steps are crucial for cross-platform deployment (for full details, refer to the configuration file setup.py in SuPy_driver):
1. Static linking: to eliminate the issue of missing dynamic libraries, the calculation kernels are pre-built using static linking and therefore run directly after downloading.
2. manylinux tagging: given the many Linux distributions and their different runtime libraries that often require distribution-specific compilation, we use the manylinux Docker image (for details refer to https://github.com/pypa/manylinux) to compile SuPy_driver.
In addition to the cross-platform compilation, to guarantee delivery quality we perform automatic code tests of four preset configurations for every build:
1. Connectivity between SuPy and SuPy_driver: checks if the front-end processor and back-end calculation core can communicate with correct input and output.
2. Success in single-time-step mode: checks SuPy can produce correct simulation results in the single-time-step mode.
3. Success in multi-time-step mode: checks SuPy can produce correct simulation results in the multi-time-step mode and does a quick benchmark of computation speed.

4. Comparison of simulation results between single- and multi-time-step modes: checks SuPy can produce identical simulation results as designed.
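The fourth check can be illustrated with a toy kernel. The sketch below is not taken from the SuPy test suite: `step`, `run_single`, `run_multi` and `test_modes_agree` are hypothetical names, and the one-variable update stands in for the real physics. It only shows the shape of a consistency test between a caller-driven loop and a single multi-step call.

```python
from itertools import accumulate


def step(state, forcing):
    """Advance a toy one-variable kernel by one time step."""
    return 0.9 * state + 0.1 * forcing


def run_single(state, forcings):
    """Single-time-step mode: the caller drives the kernel step by step."""
    results = []
    for f in forcings:
        state = step(state, f)  # a caller could alter state or forcing here
        results.append(state)
    return results


def run_multi(state, forcings):
    """Multi-time-step mode: one call advances through all steps."""
    # accumulate yields the initial state first, so drop that element
    return list(accumulate(forcings, step, initial=state))[1:]


def test_modes_agree():
    """Both modes must give identical results for the same input."""
    forcings = [10.0, 12.0, 9.5, 11.0]
    assert run_single(5.0, forcings) == run_multi(5.0, forcings)


test_modes_agree()
```

Exact equality is a reasonable expectation here because both paths apply the same arithmetic in the same order; a test on a compiled kernel might instead need a small tolerance.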
All build and test output is logged in detail (see all logs here: https://dev.azure.com/sunt05/SuPy/_build, last access: 12 June 2019) and the results are reported to developers in real time. This feature is used for all code and underpins a commitment to timely support for SuPy development.
The Python Package Index (PyPI: https://pypi.org, last access: 12 June 2019) is the official third-party software repository for Python. As it is supported by the pip toolchain, it provides Python users easy worldwide access to packages and frees Python developers from maintaining indexing and distribution servers. By using the PyPI channel, SuPy can be easily installed by users with a one-line input in a command line tool on any desktop/server system (Listing 1).
Listing 1. Command line code for SuPy installation using pip. Note 64-bit Python 3.5+ is required for SuPy installation.
The tutorials are provided as Jupyter notebooks, which can run in browsers (e.g. desktop, tablet) either by local configuration or on remote servers with preset environments (e.g. Google Colaboratory, https://colab.research.google.com, last access: 12 June 2019; Microsoft Azure Notebooks, https://notebooks.azure.com, last access: 12 June 2019). As Jupyter notebooks allow source code to be incorporated with detailed notes, users can organize their analyses (Shen, 2014). Jupyter Notebook can be installed with pip on any desktop/server system and used to open .ipynb notebook files locally (Listing 2). We note that running SuPy in browsers is not implemented by SuPy per se but allowed by the Jupyter environment where Python 3 is supported. The reason for running SuPy (and many other Python applications) on mobile devices (e.g. mobile phone, tablet) is simple: working seamlessly across different devices is a natural need.
The sample data are made available to SUEWS by calling the load_SampleData function. This produces pandas DataFrames with the initial model state (df_state_init) and the forcing variables (df_forcing). These are used in all three tutorials.

SuPy quick-start
In this tutorial, we demonstrate the key steps in using SuPy to undertake the core task to simulate energy and water balance in an urban context using SUEWS. Here the runs are for a central London area in 2012.
The urban surface energy balance (SEB) can be expressed as
$$Q^* + Q_F = Q_H + Q_E + \Delta Q_S \quad (1)$$
where the flux densities (W m−2) are Q* net all-wave radiation, Q_F anthropogenic heat, Q_H turbulent sensible heat, Q_E latent heat and ΔQ_S the net storage heat flux. Through Q_E, the SEB characteristics can be linked to the water balance:
$$P + I = E + R + \Delta S \quad (2)$$
where each term is a depth of water per unit of time (e.g. mm d−1). P is precipitation, I irrigation, E evapotranspiration (= Q_E/L_v where L_v is the latent heat of vaporization), R runoff and ΔS the net change in water storage.
Table 2. Physics scheme settings (process, option name, value, description).
Process | Option | Value | Description
Radiation | radiationmethod | 3 | Net all-wave radiation modelled with incoming longwave radiation modelled using air temperature and relative humidity
Heat storage | storageheatmethod | 1 | OHM model
Anthropogenic heat | emissionsmethod | 2 | Anthropogenic heat model responding to temperature, population density, time of day and day of week
Snow | snowuse | 1 | Snow module to model snowpack and related thermal and hydrological dynamics (Järvi et al., 2014)
Roughness length for momentum | roughlenmommethod | 2 | Momentum roughness length determined using Grimmond and Oke (1999)
Roughness length for heat | roughlenheatmethod | 2 | Thermal roughness length determined using Kawai et al. (2009)
Atmospheric stability | stabilitymethod | 3 | Atmospheric stability correction function (Campbell and Norman, 1998)
The fundamental steps to use SuPy after the software environment has been installed (see Listings 1, 2) are (1) load input, (2) run a simulation and (3) examine the results. With everything ready, three lines of Python code are needed.
SuPy is run by calling run_supy after df_state_init and df_forcing have been loaded. The simulation returns two DataFrames: the major SUEWS outputs (df_output) and the model state at the end of the run (df_state_final). The latter can be used as initial conditions for other SuPy runs. The post-processing uses pandas functions to resample, plot and write out the model output. The default output DataFrame of 5 min resolution can be upscaled to monthly resolution for an overview of intra-annual dynamics of surface energy and water balances (Fig. 3).
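The post-processing step can be sketched with pandas on a synthetic stand-in for the model output. Several things here are assumptions made for illustration: the flat column names `QE` and `Rain`, the argument and return order of the SuPy calls shown in the comments, and the approximate value used for the latent heat of vaporization. The sketch upscales 5 min output to monthly values and converts the latent heat flux into an evaporation depth using E = Q_E/L_v.

```python
import numpy as np
import pandas as pd

# With SuPy installed, the DataFrames would come from calls such as
#   df_state_init, df_forcing = sp.load_SampleData()
#   df_output, df_state_final = sp.run_supy(df_forcing, df_state_init)
# (argument and return order assumed). A synthetic stand-in is used here.
idx = pd.date_range("2012-01-01 00:00", "2012-02-29 23:55", freq="5min")
df_output = pd.DataFrame(
    {
        "QE": np.where(idx.month == 1, 20.0, 40.0),  # latent heat flux, W m-2
        "Rain": 0.01,                                # rainfall per step, mm
    },
    index=idx,
)

# upscale to months: average the flux, sum the rainfall
df_month = df_output.resample("MS").agg({"QE": "mean", "Rain": "sum"})

# convert latent heat flux to evaporation, E = QE / Lv
L_V = 2.45e6  # latent heat of vaporization, J kg-1 (approximate value)
# QE / Lv is in kg m-2 s-1, i.e. mm s-1 of water; scale to mm per day
df_month["E_mm_per_day"] = df_month["QE"] / L_V * 86400
```

Fluxes are averaged while rainfall is summed because the former are rates and the latter is an accumulated depth per time step.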
This workflow using SuPy for urban climate modelling can be easily adapted to existing SUEWS tutorials under the UMEP framework (https://tutorial-docs.readthedocs.io, last access: 12 June 2019) by replacing the conventional SUEWS binary executable with the Python SuPy package. Given the central role of Python in the UMEP framework, it is expected that the adoption of SuPy will further streamline the workflows for urban climate simulations in UMEP.

Impacts of the urban area on local climate
A major application of urban climate models is to study the impacts on urban climate from design scenarios that change surface characteristics or the climate (atmospheric forcing). In this tutorial both scenario types are explored: one example modifies albedo (surface characteristics) and another alters air temperature (climate conditions).
Technically, this requires several configuration files to be prepared for a suite of independent model runs. These could be run consecutively (i.e. no interactions between runs are needed) or in parallel, so-called "embarrassingly parallel computation" (Bailey et al., 1991), with multiple independent runs with sufficient CPUs. In this tutorial, we first demonstrate how SuPy can be easily set up to efficiently complete multiple simulations in parallel.
We use dask (https://dask.org, last access: 12 June 2019) to parallelize the SuPy simulations given its close coherence with NumPy and pandas, in particular its DataFrame interface, which is almost identical to that of pandas. Specifically, we use the apply method of dask.DataFrame to improve the simulation performance by distributing the SuPy computations across different configurations. For simulations longer than 1000 d for 12 grids, the dask-based parallel mode takes only ∼30 % of the execution time of the serial mode (Fig. 4). The parallel configuration for running SuPy, run_supy_mgrids, is then used in the following two cases for more efficient parallel simulations.
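The structure of such an embarrassingly parallel set of runs can be sketched with the Python standard library. Note that this example deliberately uses `concurrent.futures` in place of dask so that it stays self-contained, and that `run_grid` is a hypothetical stand-in for one independent model run. It shows the mapping pattern only and makes no claim about speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_grid(config):
    """Hypothetical stand-in for one independent model run."""
    grid_id, albedo = config
    absorbed = (1.0 - albedo) * 500.0  # toy result: absorbed shortwave, W m-2
    return grid_id, absorbed


# 12 grids, each with its own (made-up) albedo
configs = [(grid_id, 0.1 + 0.05 * grid_id) for grid_id in range(12)]

# serial reference
results_serial = [run_grid(c) for c in configs]

# parallel: runs share nothing, so they can simply be mapped onto workers
with ThreadPoolExecutor(max_workers=4) as pool:
    results_parallel = list(pool.map(run_grid, configs))
```

Since `pool.map` returns results in input order, the parallel results can be compared directly with the serial ones, which is a convenient sanity check before scaling up.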
To explore the effect of changes to surface properties, the DataFrame df_state_init needs to be modified. The surface albedo of different materials impacts the outgoing shortwave (solar) radiation and thus the surface energy balance fluxes and other atmospheric variables. Modifying roof albedo has been suggested extensively as a method to cool urban areas (e.g. Santamouris et al., 2011; Li et al., 2014; Ramamurthy et al., 2015). In the example, we conduct simulations from January 2012 to July 2012 with the first 6 months as the spin-up period. The building roof albedo is incrementally increased from 0.1 to 0.8 (e.g. a change from a very dark to a very light surface). The near-surface temperature T_2, an indicator of thermal state at pedestrian level, is analysed using the monthly maximum, mean and minimum (Fig. 5). It would be expected that the maximum and mean values of T_2 are greatly reduced as they are directly influenced by the altered net solar radiation, while impacts on the minimum T_2 might be expected to be minimal.
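Building the set of albedo scenarios amounts to replicating the initial state and overwriting one column. The sketch below assumes a simplified, hypothetical initial-state DataFrame with columns `alb_bldg` and `alb_paved`; the real df_state_init has its own column layout, which is not reproduced here.

```python
import numpy as np
import pandas as pd

# hypothetical one-grid initial state; real SuPy column labels differ
df_state_init = pd.DataFrame(
    {"alb_bldg": [0.12], "alb_paved": [0.10]},
    index=pd.Index([1], name="grid"),
)

albedos = np.linspace(0.1, 0.8, 8)  # 0.1, 0.2, ..., 0.8

# replicate the base state once per scenario, then overwrite the roof albedo
df_scenarios = pd.concat([df_state_init] * len(albedos), ignore_index=True)
df_scenarios.index.name = "scenario"
df_scenarios["alb_bldg"] = albedos
```

Each row of `df_scenarios` is then an independent configuration, so the scenarios can be run serially or in parallel without further changes.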
In this tutorial we demonstrate some starting cases rather than a complete research cycle. Notably, limitations are imposed in the configuration used (e.g. length of the model run, spin-up period, feedbacks permitted) and thus the relations shown should be interpreted with caution.
To explore changes in atmospheric forcing, the DataFrame df_forcing is modified. In this example, we investigate the impact of increased local-scale (constant flux layer) air temperature T_a on the near-surface air temperature T_2. Air temperature T_a is increased over 24 runs from 0 °C (no change) to +2 °C. The upper limit (+2 °C) represents a highly possible average global warming scenario for the near future (IPCC, 2014). The SuPy simulations are conducted from January to July 2012 and July data are analysed. The T_2 results indicate the increased T_a has different impacts on the T_2 metrics (minimum, mean and maximum) but all increase linearly with T_a. The maximum T_2 has the strongest response of the three metrics (Fig. 6).
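The forcing scenarios can be generated in the same spirit with pandas. In this sketch the forcing DataFrame is a four-row toy with a single hypothetical column `Tair`; only the pattern of producing one shifted copy per increment is intended to carry over.

```python
import numpy as np
import pandas as pd

# toy forcing with one assumed column name; real forcing has more variables
idx = pd.date_range("2012-07-01", periods=4, freq="5min")
df_forcing = pd.DataFrame({"Tair": [18.0, 18.5, 19.0, 19.5]}, index=idx)

increments = np.linspace(0.0, 2.0, 24)  # 24 runs from 0 to +2 degC

# one independent forcing DataFrame per scenario; the original is untouched
forcing_scenarios = {
    round(float(dT), 3): df_forcing.assign(Tair=df_forcing["Tair"] + dT)
    for dT in increments
}
```

Using `assign` returns a new DataFrame for each increment, so the baseline forcing remains available for the reference run.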
This tutorial demonstrates the simplicity of using SuPy to conduct impact studies of both surface characteristics and background climates. These can be easily adapted by users to their specific application interests. As various effects are combined, the net impacts become more realistic.

Interaction between SuPy and external models
SUEWS can be coupled to other models that provide or require forcing data using the SuPy single-time-step running mode (Sect. 2). We demonstrate this feature with a simple online anthropogenic heat flux model.
Anthropogenic heat flux (Q_F) is an additional term in the surface energy balance in urban areas associated with human activities (Grimmond, 1992; Nie et al., 2014, 2016; Sailor, 2011). In most cities, the largest emission source is from buildings (Hamilton et al., 2009; Iamarino et al., 2011; Sailor, 2011) and is highly dependent on outdoor ambient air temperature. For demonstration purposes we have created a very simple model, with feedback from outdoor air temperature (Fig. 7), instead of using the SUEWS Q_F. The simple Q_F model considers only building heating and cooling (Eq. 3), implying other building activities (e.g. lighting, water heating, computers) are zero and therefore do not change the temperature or change with temperature. The coupling between the simple Q_F model and SuPy is done via the low-level function suews_cal_tstep, which is an interface function in charge of communications between the SuPy front end and the calculation kernel. By setting SuPy to receive external Q_F as forcing, at each time step, the simple Q_F model is driven by the SuPy output T_2 and provides SuPy with Q_F, which thus forms a two-way coupled loop.
Figure 4. ... 30, 90, 120, 150, 180, 270, 365, 730 and 1095. Simulations performed with macOS 10.14.3 running on 2.9 GHz Intel Core i9 with 32 GB memory. The model configuration is the same as Quickstart SuPy (Table 2).
Figure 5. Impacts of increasing building roof albedo α (from 0.1) on near-surface temperature T_2 considering monthly maximum, mean and minimum temperatures at 2 m for July 2012 based on 5 min output.
Here we replace the SUEWS Q_F (Table 2) with the simpler Q_F model (Fig. 7, Eq. 3) to explore the question of the impact of Q_F on T_2 and its feedback on Q_F. The simulation using the coupled SuPy is performed for London in 2012. The data analysed are a summer (July) and a winter (December) month. Initially, Q_F is 0 W m−2 and T_2 is determined and used to determine Q_F[1], which in turn modifies T_2[1] and therefore modifies Q_F[2] and the diagnosed T_2[2]. Results indicate a positive feedback: as Q_F increases, T_2 is elevated, but with different magnitudes (Fig. 8). Of particular note is the positive-feedback loop under warm air temperatures: the anthropogenic heat emissions increase, which in turn elevates the outdoor air temperature causing yet more anthropogenic heat release (Fig. 8). Note that London is relatively cool (cf. air temperature in Fig. 2) so the enhancement is much less than it would be in warmer cities. In this example only one variable is modified. In this case the anthropogenic heat flux model is simple, but a more complex model could be coupled to SUEWS in the same way. This can facilitate development of climate service tools that are both agile and responsive.
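The two-way coupling loop described above can be sketched in plain Python. This is a toy under stated assumptions, not the model of Eq. 3 with its published parameters: the thresholds and rates in `qf_simple` are made-up values, and `t2_toy` is a hypothetical stand-in for the land surface model step, with an arbitrary sensitivity of T_2 to Q_F.

```python
def qf_simple(t2, t_heat=10.0, t_cool=20.0, rate_heat=3.0, rate_cool=1.5):
    """Toy building Q_F (W m-2): heating below t_heat, cooling above t_cool.

    Thresholds (degC) and rates (W m-2 K-1) are made-up illustration values.
    """
    if t2 > t_cool:
        return (t2 - t_cool) * rate_cool
    if t2 < t_heat:
        return (t_heat - t2) * rate_heat
    return 0.0


def t2_toy(t_air, qf, sensitivity=0.01):
    """Hypothetical stand-in for the land surface model: T2 responds to Q_F."""
    return t_air + sensitivity * qf


def run_coupled(t_air_series):
    """Two-way coupling: Q_F at each step uses T2 from the previous step."""
    qf = 0.0                    # initially no anthropogenic heat
    history = []
    for t_air in t_air_series:
        t2 = t2_toy(t_air, qf)  # model step with external Q_F as forcing
        qf = qf_simple(t2)      # external model driven by the diagnosed T2
        history.append((t2, qf))
    return history


history = run_coupled([25.0] * 5)
```

With a constant warm forcing, both T_2 and Q_F in `history` rise from step to step and level off, which mirrors the positive feedback described in the text in a purely qualitative way.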

Concluding remarks
The development and delivery of a Python-enhanced urban climate model SuPy are introduced with tutorials (Table 2) to demonstrate typical applications and some new SUEWS features (e.g. surface diagnostics calculation). The Python code and tutorials are freely and openly available online (Appendix B). Users are encouraged to explore more intriguing urban-climate-related questions with SuPy. Notable features of SuPy include the following:
1. Version consistency via PyPI: SuPy is distributed via the well-managed Python package repository PyPI with all historical versions stored. This allows for clear version consistency for reproducing simulation results.
2. Simplicity in input/output sharing: SuPy uses pandas DataFrame as its core data structure and thus draws on a powerful data analysis toolchain, which can facilitate the ease with which urban climate research outcomes can be communicated.
3. Ease of scientific development: given the importance of meteorological forcing data in running climate simulations, SuPy will shortly be equipped with the ability to retrieve forcing variables from global reanalysis datasets. We anticipate data analyses and model development will be added more conveniently within the Python data ecosystem.
4. An open source tool: we welcome all kinds of contributions, e.g. incorporation of a new feature (pull requests), submission of issues or development of new tutorials.
Beyond SuPy's data analysis and communication features, the computation kernel is SUEWS, so all development of physics schemes will remain in the Fortran stack for computational performance and compatibility with a large cohort of scientific code. In UMEP (Lindberg et al., 2018), an application written in Python, the SUEWS binary executable will shortly be updated to SuPy for better connectivity to other UMEP components. We expect SuPy will help guide future development of SUEWS (and similar urban climate models) and enable new applications of the model. For example, the parallel set-up of SuPy will allow large-scale simulations of urban climate across larger domains with greater surface heterogeneity. Moreover, the improvement in the SUEWS model structure and deployment process introduced by the development of SuPy paves the way to a more robust workflow of SUEWS for its sustainable success.
T. Sun and S. Grimmond: A Python-enhanced urban land surface model SuPy. Geosci. Model Dev., 12, 2781-2795, 2019, www.geosci-model-dev.net/12/2781/2019/

Appendix A: SuPy functions
The six SuPy functions and their uses are as follows:
- init_supy. Initialize SuPy by loading initial model states.
- load_forcing_grid. Load forcing data for a specific grid included in the index of df_state_init.
- load_SampleData. Load sample data for quickly starting a demo run.