tobac 1.2: towards a flexible framework for tracking and analysis of clouds in diverse datasets

We introduce tobac (Tracking and Object-Based Analysis of Clouds), a newly developed framework for tracking and analysing individual clouds in different types of datasets, such as cloud-resolving model simulations and geostationary satellite retrievals. The software has been designed to be used flexibly with any two- or three-dimensional time-varying input. The application of high-level data formats, such as Iris cubes or xarray arrays, for input and output allows for convenient use of metadata in the tracking analysis and visualisation. Comprehensive analysis routines are provided to derive properties like cloud lifetimes or statistics of cloud properties, along with tools to visualise the results in a convenient way. The application of tobac is presented in two examples. We first track and analyse scattered deep convective cells based on maximum vertical velocity and the three-dimensional condensate mixing ratio field in cloud-resolving model simulations. We also investigate the performance of the tracking algorithm for different choices of time resolution of the model output. In the second application, we show how the framework can be used to effectively combine information from two different types of datasets by simultaneously tracking convective clouds in model simulations and in geostationary satellite images based on outgoing longwave radiation. The tobac framework provides a flexible new way to include the evolution of the characteristics of individual clouds in a range of important analyses like model intercomparison studies or model assessment based on observational data.

Tracking of individual cloud objects in high-resolution CRM simulations and large-eddy simulations (LES) has been developed alongside the evolution of these simulations in recent decades. Earlier studies on tracking shallow convection in high-resolution model simulations (Zhao and Austin, 2005a, b; Heus et al., 2009) strongly relied on manual detection techniques. Subsequently, Dawe and Austin (2012) and Heus and Seifert (2013) presented automated methods of tracking shallow convection that rely on a continuous release of a decaying tracer from the model as described in Couvreux et al. (2010). However, the functionality of the tracer release and advection must be specifically implemented in each model and restricts the use of this technique to output of high-resolution models. Cloud tracking algorithms applied online during the actual model simulations (Plant, 2009) have the advantage of direct access to the relevant model fields at the model time step and thus the highest possible time resolution. However, these online algorithms must also be implemented separately in a specific model. Moseley et al. (2013, 2016) tracked precipitation patterns for investigations of deep convective clouds and convective invigoration. Davis et al. (2006, 2009) presented an object-based analysis of rainfall patches, including tracking capabilities, which was applied to precipitation on a relatively large regional scale. Heiblum et al. (2016a, b) developed and applied a tracking algorithm for warm convective clouds that determines cloud volumes from the condensate mixing ratio field and then propagates the clouds based on the velocity of the cloud centre of mass. This algorithm allows for cloud splits and merges to form complex cloud entities possibly involving numerous individual clouds. Only a few studies have focused on tracking individual deep convective clouds in model simulations in a way that takes into account the actual cloud volumes. Terwey and Rozoff (2014) developed a tracking algorithm for individual convective updrafts and applied it to CRM simulations of hurricane cases with two different models. However, this effort has not been provided to the community as a generalised software package aimed at more widespread use cases. Several other approaches that included the tracking of individual updrafts in different types of cumulus clouds in a very detailed manner (Sherwood et al., 2013; Hernandez-Deckers and Sherwood, 2016) would not be easily transferable to data with a lower temporal and spatial resolution. Despite these advances in developing detailed cloud tracking approaches for use in highly resolved model simulations, most current studies are performed with model grid spacings of several hundreds of metres to a few kilometres, especially when using larger domains or simulations for longer time periods. Providing adequate ways of performing tracking and object-based analyses for different types of clouds, including deep convection, in these kinds of simulations provides a key pathway to better understanding the underlying physical processes.
This overview clearly shows the wide range of extensive efforts that went into the development of elaborate software and analysis tools to track clouds in different types of datasets. The application of cloud identification and tracking and related techniques has substantially increased our understanding of cloud size distributions, the time evolution of different types of clouds, and the underlying physical processes governing cloud formation, development, and propagation. However, the overview also highlights the problem of limited compatibility between the different existing approaches and implementations, especially regarding the intended use of tracking clouds based on different data sources using the same algorithms and analysis tools.
To address some of these limitations of existing approaches and provide a more functional tool with increased flexibility for different applications, we have developed tobac as a new flexible software tool for the identification, tracking and analysis of clouds. This approach is certainly not intended to replace the existing tools in their specific applications, but rather aims to provide a flexible framework that can be used for a wide range of different datasets and also allows the future integration of some of the existing approaches discussed here. tobac is designed in a modular way that includes the following basic steps, which are described in detail in the following Sects. 2.1-2.5:

1. Data input and output
2. Feature detection
3. Segmentation of cloud areas/volumes
4. Trajectory linking
5. Object-based analysis and visualisation

tobac provides a framework that allows for a convenient application to output from a wide range of model simulations and observational products, as long as the data are provided with sufficient temporal and spatial resolution and contain output variables that can be used to identify individual clouds. Therefore, the software package can be used for a range of important applications like model intercomparison studies, which generally rely on simpler analysis methods that do not capture the evolution of individual clouds. These capabilities also allow for comparative studies between model simulations and observational datasets, e.g. from satellite retrievals, using the same underlying statistical methods. Due to the modular structure, tobac is set up for the integration of existing or newly developed algorithms for the different steps in the analysis chain. The implementation in Python provides tobac with access to numerous more specialised existing software libraries for different aspects of the software, such as data input/output, memory usage and existing functionality from the field of image processing. We also show how we can leverage an existing Python library from an entirely different field of the physical sciences to perform integral parts of the linking step in our application. Furthermore, the choice of Python for tobac makes the package more easily accessible to users, as it allows for easier integration into existing analysis workflows and also stimulates the integration of additional components in the modular workflow of the package.
To show the advantages of tobac in practical applications, we present two different examples of using the framework in tracking and analysing deep convective clouds. In the first application, the detection of features is performed on the column-maximum vertical velocity at each output time interval from a CRM simulation. A three-dimensional watershedding algorithm is applied to the updraft field and to the total condensed water content field (mass mixing ratio of all hydrometeors) at each step in time to infer both the volume of the individual updrafts and the clouds associated with the tracked updrafts. These features are then filtered and linked into consistent trajectories. We use the tracking results to assess the distribution of cloud lifetimes and the requirements for the model output temporal resolution. In the second application, we perform a simultaneous analysis for model and satellite data. Vertically resolved data similar to those used in the first example are usually not available from satellite imagers. The information in most satellite retrievals of cloud properties is limited to two dimensions. With a multi-spectral selection of channels from the satellite instrument, cloud-top height and radiative properties can be retrieved (McGarragh et al., 2018). An analysis of model-simulated and satellite-retrieved fields of outgoing longwave radiation is presented to demonstrate the flexibility of the tobac approach. By making use of the framework consistently across different datasets like this, we can compare the tracked clouds in both data sources by examining the statistical properties of the resulting population of convective clouds, thereby facilitating model-observation comparisons.

2 Software description
In this section, we describe the general design and workflow of the software package, as illustrated in Fig. 1 for the two example applications presented in Sect. 3 and Sect. 4. The implementation of the individual analysis steps described here reflects an example combination of analysis steps currently implemented in tobac. Due to the modular setup of the package, different parts of the workflow can be combined in different ways or replaced by future additions to the framework.

2.1 Data input and output
The input data for the framework are provided in a high-level format of either Iris cubes (Met Office, 2018) or xarray DataArrays (Hoyer and Hamman, 2017), which include detailed metadata for each data variable, such as units and coordinates. The algorithm can thus automatically use these metadata, and the tracking setup can be controlled independently of the temporal resolution, spatial resolution or dimensions of the input data. Tracking parameters representing physical properties like distances or time periods can be set in physical units and are automatically converted to the pixel-based values needed for the underlying calculations. Scientific data are provided in a vast variety of different file formats and data structures. Implementing a way of loading the data into the right format for an application often proves to be a significant limitation to the use of new datasets and generally consumes an unjustifiable amount of time and effort, apart from providing an important source of implementation errors. The Python library Community Intercomparison Suite (CIS) (Watson-Parris et al., 2016) overcomes this challenge and provides a convenient way to automatically load a vast array of observational datasets into Iris-compatible objects for direct use in tobac.
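As an illustration of the kind of metadata-aware input tobac expects, the following minimal sketch loads a vertical-velocity field as an Iris cube and inspects the metadata (units and coordinates) that the tracking setup relies on; the file and variable names are hypothetical.

```python
import iris

# Hypothetical file and variable name; any CF-compliant NetCDF output
# (model or satellite) with time and space coordinates works similarly.
w = iris.load_cube("wrfout_subset.nc", "upward_air_velocity")

print(w.units)                          # e.g. m s-1
print([c.name() for c in w.coords()])   # e.g. time, model_level_number, y, x
```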
Both Iris and xarray make use of so-called "lazy loading" based on the dask package (Rocklin, 2015; Dask Development Team, 2016) for efficient memory usage. The initial loading of data from a file only creates a placeholder. Then, individual operations on the data are combined and evaluated once the final result is to be saved, printed or plotted. Only at this stage are data actually loaded from disk into the physical memory of the computing machine and individual calculations performed.
Based on these capabilities, the entire tobac framework is written with a focus on limiting instantaneous memory usage by splitting up calculations into chunks, e.g. along the time dimension. Hence, even large datasets with individual fields much larger than the memory available on the computing system can be conveniently processed without special adaptation by the user.
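The following sketch illustrates this lazy-evaluation pattern using xarray and dask directly (file and variable names are hypothetical, WRF-style conventions assumed); tobac relies on the same mechanism through Iris or xarray rather than exposing it as part of its own API.

```python
import xarray as xr

# Open the dataset with dask chunking along the time dimension:
# no data are read from disk yet, only metadata and a task graph.
ds = xr.open_dataset("wrfout_subset.nc", chunks={"Time": 1})

# Build a derived field lazily, e.g. the column-maximum vertical velocity.
w_max = ds["W"].max(dim="bottom_top_stag")

# Only here are the data read and the reduction evaluated, chunk by chunk,
# keeping the instantaneous memory footprint to roughly one time slice.
result = w_max.compute()
```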
The output of the tracking analysis is given in commonly used high-level Python data formats: a pandas DataFrame (McKinney, 2010) for the table containing the tracked cell centres and trajectories, and Iris cubes or xarray DataArrays for the masks of cloud volumes or areas. This output is automatically amended with the same metadata as the input data, such as coordinates.

2.2 Feature detection

The feature detection can work with any two-dimensional field either present in or derived from the input data. In the first example, we use maxima in the maximum vertical velocity in each column of the three-dimensional high-resolution model output to identify individual updrafts (see Sect. 3). In the second example, minima in outgoing longwave radiation from satellite retrievals and model output are used in the feature detection (see Sect. 4).
To identify the features, contiguous regions above or below a threshold are determined and labelled individually. Smoothing the input data, e.g. with a Gaussian filter, can make this step more reliable. The detection of regions above a specific threshold can lead to large interconnected regions combining several features linked by narrow ridges. To prevent this and identify these interconnected features separately, the tobac feature detection allows for the use of "erosion" techniques based on the implementation in skimage.morphology.binary_erosion. These techniques shrink the identified regions from the edges by a specific length or number of pixels, thus removing the connecting ridges between interconnected features. This has been shown to lead to more robust detection of individual features, as described in detail in Senf et al. (2018).
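As a brief illustration of this erosion step, the following sketch applies scikit-image directly to a toy threshold mask; the structuring-element size of two pixels is an arbitrary choice for the example.

```python
import numpy as np
from skimage.morphology import binary_erosion, disk

# Toy binary mask of grid points above the detection threshold.
above_threshold = np.random.rand(200, 200) > 0.7

# Shrink the regions from their edges by ~2 pixels, removing the narrow
# ridges that would otherwise connect neighbouring features.
eroded = binary_erosion(above_threshold, disk(2))
```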
To describe the location of a feature at a specific point in time, we have investigated the use of different spatial properties of the identified region. The geometric centre can be strongly affected by changes in the shape of the boundary, which is determined based on the selected threshold value. Instead, a weighted mean with weights w i given by the difference between the value of the chosen field at the individual points V i and the threshold value V feature has proven to perform best in determining a robust feature position. We can interpret this position as the centre of mass of the component of the field exceeding the chosen threshold value.
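Written out, with x_i denoting the position of grid point i within the identified region (notation as in the description above), the feature position is the threshold-excess-weighted mean

\[
\mathbf{x}_\mathrm{feature} = \frac{\sum_i w_i \, \mathbf{x}_i}{\sum_i w_i}, \qquad w_i = V_i - V_\mathrm{feature},
\]

where the difference is taken with the opposite sign for features defined by minima (e.g. in outgoing longwave radiation), so that the weights remain positive.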
Using a single threshold to identify features can lead to problematic results in two different ways. A very restrictive threshold can result in clouds with weak vertical velocities being omitted, and clouds in their initial and decaying stages will not be captured. On the other hand, a weakly restrictive threshold can lead to spurious results, as it might lead to large unconfined regions around deep convection being selected, or to an unwanted merging of several distinct cloud features into one. To resolve these issues, the tobac feature detection can be performed with a succession of increasingly restrictive threshold values. This combination of different thresholds allows tobac to detect lower-intensity features representing weaker convective clouds or clouds in their initial or decay stage, but to identify localised features with stronger updrafts or colder cloud tops within the weaker-threshold areas. In the first example application (Sect. 3), consecutive maximum updraft threshold values of 3, 5 and 10 m s −1 were used, while tracking based on OLR in the second example (Sect. 4) was performed with consecutively smaller threshold values (250, 225, 200, 175 and 150 W m −2 ). While using multiple thresholds is usually beneficial, feature detection with a single threshold is also possible in tobac by supplying only one threshold value and can be appropriate in certain applications.
An iterative set of threshold values was used in other recent approaches to cloud detection (Liang et al., 2017; Fu et al., 2019), however with a specific focus on the application to certain types of satellite images. Multiple-threshold methods are also applied in other fields of science facing similar challenges in feature detection, such as astronomy (Zheng et al., 2015).
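A minimal sketch of the multi-threshold idea, using scipy.ndimage directly rather than the tobac implementation: regions are labelled for each successive threshold, and each feature position is the threshold-excess-weighted mean described above. This simplified illustration on a synthetic 2D field omits the erosion step and the consolidation of features detected at several thresholds.

```python
import numpy as np
from scipy import ndimage

def detect_features(field, thresholds):
    """Label regions above each threshold and return weighted-mean positions."""
    features = []
    for thr in thresholds:                      # e.g. 3, 5, 10 m/s for updrafts
        labels, n = ndimage.label(field > thr)  # contiguous regions above thr
        for lab in range(1, n + 1):
            yy, xx = np.where(labels == lab)
            w = field[yy, xx] - thr             # threshold-excess weights
            y_c = np.sum(w * yy) / np.sum(w)    # weighted-mean position of the
            x_c = np.sum(w * xx) / np.sum(w)    # part of the field above thr
            features.append({"threshold": thr, "y": y_c, "x": x_c})
    return features

# Synthetic example: two Gaussian "updrafts" on a 100 x 100 grid.
y, x = np.mgrid[0:100, 0:100]
field = 12 * np.exp(-((y - 30)**2 + (x - 40)**2) / 50) \
      + 6 * np.exp(-((y - 70)**2 + (x - 60)**2) / 80)
print(detect_features(field, thresholds=[3, 5, 10]))
```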

2.3 Segmentation
Once features and feature centres are identified, segmentation techniques are used to associate areas or volumes with each identified feature. In the current version of the tobac framework, we have implemented segmentation using watershedding techniques from the field of image processing (skimage.morphology.watershed from the scikit-image library; Soille and Ansoult, 1990; van der Walt et al., 2014) with a fixed threshold value V segmentation . This value has to be set specifically for every type of input data and application, as explained in more detail for the two example applications in Sect. 3 and Sect. 4.
Watershedding segmentation treats the input field as a topographic map and separates the input into individual regions similar to individual watersheds or catchment basins along a dividing ridge in a geological context (Meyer, 1994). These techniques are widely used in several existing cloud tracking and analysis algorithms described in Sect. 1, such as Heiblum et al. (2016a), Fiolleau and Roca (2013) and Senf et al. (2018).
This segmentation routine can be performed for both two-dimensional and three-dimensional data. At each time step, a marker is set at the position (weighted mean centre) of each feature identified in the detection step in an array otherwise filled with zeros. In the case of the three-dimensional watershedding, all cells in the column above the weighted mean centre position of the identified features fulfilling the threshold condition V segmentation are set to the respective marker. The algorithm then fills the area (2D) or volume (3D) based on the input field, starting from these markers until reaching the threshold V segmentation. If two or more cloud objects are directly connected, the border runs along the watershed line between the two regions. This procedure creates a mask of the same shape as the input data, with zeros at all grid points where there is no cloud or updraft and the integer number of the associated feature at all grid points belonging to that specific cloud/updraft. This mask can be conveniently and efficiently used to select the volume of each cloud object at a specific time step for further analysis or visualisation.
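A sketch of this watershedding step using scikit-image directly (recent scikit-image versions expose the function as skimage.segmentation.watershed); the marker placement is simplified here to single seed points rather than the full marker columns described above.

```python
import numpy as np
from skimage.segmentation import watershed

def segment_clouds(q_total, features, threshold=0.5e-3):
    """Assign a cloud volume to each feature by watershedding the condensate field.

    q_total  : 3D total condensate mixing ratio (kg/kg) at one time step
    features : iterable of dicts with an integer 'id' and 'z', 'y', 'x' seed indices
    """
    markers = np.zeros(q_total.shape, dtype=np.int32)
    for f in features:
        markers[f["z"], f["y"], f["x"]] = f["id"]   # seed each feature
    mask = q_total >= threshold                      # restrict to cloudy points
    # Invert the field so that high condensate values act as basins that are
    # flooded outwards from the markers, up to the threshold or watershed lines.
    return watershed(-q_total, markers, mask=mask)
```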
The structure of tobac allows for the future implementation of other algorithms for the segmentation step, e.g. replacing the watershedding approach by random walk techniques (Grady, 2006; Wang et al., 2019) or other image processing tools. Similarities between the feature detection and segmentation steps mean that these steps could be combined in some implementations in future versions of tobac, e.g. for applications based on a single input dataset (OLR), as used in Sect. 4. However, treating the two analysis steps separately allows for the combination of different datasets (vertical velocity and condensed water content), as shown in Sect. 3.
2.4 Trajectory linking

The identified features are linked into trajectories using the trackpy library (Allan et al., 2016), which is based on the particle-tracking methods of Crocker and Grier (1996) (see also Sect. 3.1). For each feature, candidate features at the next time step are searched for within a range around the projected position that is determined by a maximum assumed propagation speed v max . In the applications presented in Sect. 3 and 4, we set this value to v max = 10 m s −1 , which results in a search range of 600 m around the projected position for 1-minute data input and 3 km for 5-minute data input. Variations in the shape of the regions used to determine the positions of the features can lead to quasi-instantaneous shifts of the feature position by one or two grid cells, even for a very high temporal resolution of the input data, potentially jeopardising the tracking procedure. To prevent this, tobac uses an additional minimum radius of the search range d min (2 km, equivalent to four times the grid spacing in Sect. 3) that specifies a lower limit for the size of the search region. Both of these parameters are given as physical quantities and then converted into the pixel-based values used in trackpy. This allows for cloud tracking that is controlled by physically based parameters that are independent of the temporal and spatial resolution of the input data. We make use of this for cloud tracking with different model output frequencies for the same simulations in the example application in Sect. 3.
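A sketch of how these physical parameters translate into the pixel-based trackpy call; this is a simplified stand-in for the tobac linking step, assuming a features DataFrame with pixel-coordinate columns x and y and a frame column, as trackpy expects.

```python
import trackpy as tp

def link_features(features, dt, dxy, v_max=10.0, d_min=2000.0, memory=0):
    """Link detected features into trajectories.

    features : pandas.DataFrame with columns 'x', 'y' (pixels) and 'frame'
    dt       : output time step in seconds
    dxy      : horizontal grid spacing in metres
    v_max    : maximum assumed propagation speed in m/s
    d_min    : minimum radius of the search range in metres
    """
    # Convert the physically based parameters into a pixel search range.
    search_range = max(v_max * dt, d_min) / dxy
    return tp.link_df(features, search_range, memory=memory)
```

For the 1-minute output in Sect. 3 (dt = 60 s, dxy = 500 m), this gives a search range of four pixels, i.e. the 2 km minimum radius rather than the 600 m implied by v max alone.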
Features can be allowed to be missed for a certain number of time steps (memory) and still be linked into a trajectory. However, this option should be used with caution, as it can lead to erroneous trajectory linking, especially for data with low time resolution. For example, convective clouds can produce outflow boundaries that initiate new convective clouds nearby, and the newly formed clouds are more likely to be linked to the original clouds with this option.

The feature detection step can often omit the initial or final stages of the evolution of a cloud due to the choice of specific thresholds. Thus, trajectories can also be extrapolated to additional output time steps at the start and at the end of the tracked path. This allows for the inclusion of both the initiation of the cell and the decaying later stages in the analysis of the cloud life cycle. Furthermore, a threshold for the minimum lifetime of the tracked objects can be used to exclude from the analysis clouds that have only been tracked for a very short period and are likely to be spurious features. Such tracked objects can contaminate analyses focusing on the cloud lifetime and associated quantities.
The trajectories are recorded in a pandas DataFrame. This enables filtering the resulting trajectories, e.g. to reject trajectories that are only partially captured at the boundaries of the input field both in space and time.
The current implementation of the linking step does not include an explicit treatment of the splitting and merging of clouds, as implemented in several of the cloud tracking algorithms reviewed earlier (Dawe and Austin, 2012; Heus and Seifert, 2013; Heiblum et al., 2016a). Instead, the current version of tobac will create a continuous track with only one of the two separate cloud objects that combine in a merger or evolve from the splitting of a tracked object, mostly based on which of these has the more similar direction of travel to the joint object. However, we have structured the implementation of tobac in a way that allows for the future addition of more complex tracking methods that can record a more complex network of relationships between cloud objects at different points in time.

2.5 Object-based analysis and visualisation
To make use of the results of the previous steps, we provide detailed tools to analyse and visualise the tracked objects. We provide a set of routines that enable performing analyses and deriving statistics for individual clouds, such as time series of integrated properties and vertical profiles. We also provide routines to calculate summary statistics of the entire cloud population.
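As an illustration of the kind of object-based statistics the trajectory table makes straightforward, the following sketch derives cloud lifetimes directly from a linked pandas DataFrame; the column names 'cell' and 'time_cell' follow the convention of the tobac output tables and should be adjusted if they differ in a given setup.

```python
import pandas as pd

def cloud_lifetimes(tracks: pd.DataFrame) -> pd.Series:
    """Lifetime of each tracked cell as the time span covered by its features."""
    grouped = tracks.groupby("cell")["time_cell"]
    return grouped.max() - grouped.min()

# Example: histogram of lifetimes in minutes, excluding very short tracks.
# lifetimes = cloud_lifetimes(tracks)
# (lifetimes.dt.total_seconds() / 60).loc[lambda s: s >= 10].hist(bins=20)
```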

2.6 Advantages of the implementation in Python
While the majority of the existing tracking approaches reviewed in Sect. 1 are implemented either in Fortran, C and C++ or in proprietary programming languages like MATLAB, we have chosen to use Python for our tracking framework for several practical reasons. Python has become the go-to standard for data analysis in many fields of scientific research, including the atmospheric sciences, in recent years (Perkel, 2015). This makes it possible to develop software that is accessible and modular, which allows for the successful addition of user-contributed algorithms or the adoption or application of the workflow for cases beyond those presented here. Libraries from the scientific Python ecosystem, including NumPy, SciPy and matplotlib (Hunter, 2007; van der Walt et al., 2011), along with a large stack of existing, optimised libraries whose numerical routines are implemented in compiled languages, mean that many of the individual operations within tobac benefit from the computational speed of these languages. The use of Python also means that even users without extensive programming experience will be able to easily adapt existing procedures into the workflow or contribute additional algorithms to the modular structure of the tobac tracking framework.
The implementation in Python also enables the use of Jupyter notebooks (Perez and Granger, 2007; Kluyver et al., 2016) as an innovative way of developing, visualising and sharing scientific data analyses. We provide examples of the analyses presented here as Jupyter notebooks included in the software package.
Memory limitations have been cited as a significant challenge for the application of many of the presented algorithms (Dawe and Austin, 2012; Heus and Seifert, 2013). The use of modern memory management techniques such as "lazy data loading" in the underlying Python libraries Iris (Met Office, 2018) and xarray (Hoyer and Hamman, 2017), which both rely on dask data types (Rocklin, 2015), allows for clear and concise source code while sparing users from having to deal with most memory-related considerations themselves. Memory usage and algorithm run time for the two applications presented in this manuscript are included in the following sections.

The simulations for the first example application (Sect. 3) were performed with the Weather Research and Forecasting (WRF) model (Skamarock et al., 2005). These simulations use the Morrison microphysics scheme (Morrison et al., 2005, 2009) and the Rapid Radiative Transfer Model (RRTMG) short- and long-wave radiation scheme (Iacono et al., 2008).
We use a combination of the three-dimensional fields of vertical velocity and total condensate mixing ratio in this application to track individual convective clouds. The individual steps of the analysis are visualised for a specific point in time and a subset of the model domain in Fig. 4. The three-dimensional vertical velocity field is reduced to the maximum updraft velocity in each model column over a mid-tropospheric range of geopotential height (3000 m to 8000 m) (Fig. 4a). This avoids the impact of strong vertical motions in the lower troposphere, which may be associated with outflow boundaries, as well as gravity waves in the upper troposphere. A Gaussian filter with a standard deviation of σ = 1 km is applied in the feature detection step (Fig. 4b) to create a smoother field that assists the feature detection. This two-dimensional field is then used to identify individual deep convective updrafts in the simulation. The feature identification following Sect. 2.2 is performed with a set of three updraft velocity thresholds of 3 m s −1 , 5 m s −1 and 10 m s −1 (Fig. 4c) and yields the individual features marked in Fig. 4d. Segmentation is performed on the condensate mixing ratio using the watershedding technique (see Sect. 2.3) with a threshold of 0.5 g kg −1 to identify the cloud volumes corresponding to the individual identified updrafts. The cloud volumes derived with watershedding from the condensate mixing ratio field for each of the identified updrafts are represented by the surface projection of the 3D volumes (Fig. 4e). Note that the intersecting lines in Fig. 4e represent instances where cloud volumes associated with different updraft cores may be present in the same column but at different altitudes. Trajectories are formed by linking the individual features and are shown together with the surface projection of the cloud volumes at the initial and final time steps of each tracked cell (Fig. 4f).
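These steps map onto a short script along the following lines. This is a sketch assuming the top-level functions feature_detection_multithreshold, segmentation_3D and linking_trackpy as exposed by tobac 1.x releases; exact argument names may differ between versions, and the input file and variable names are hypothetical.

```python
import iris
import tobac

# Hypothetical pre-computed inputs: column-maximum vertical velocity (2D + time)
# and total condensate mixing ratio (3D + time), both as Iris cubes.
w_max = iris.load_cube("w_max.nc", "column_maximum_vertical_velocity")
q_total = iris.load_cube("q_total.nc", "total_condensate_mixing_ratio")

dxy, dt = 500.0, 60.0  # grid spacing (m) and output interval (s) of the run

# 1) Feature detection on the smoothed updraft field with multiple thresholds
#    (sigma_threshold assumed to be in grid points; 2 points ~ 1 km at 500 m).
features = tobac.feature_detection_multithreshold(
    w_max, dxy, threshold=[3, 5, 10], target="maximum",
    position_threshold="weighted_diff", sigma_threshold=2)

# 2) 3D segmentation of cloud volumes on the condensate field (0.5 g/kg).
mask, features = tobac.segmentation_3D(features, q_total, dxy, threshold=0.5e-3)

# 3) Linking of the features into trajectories with physically based parameters.
tracks = tobac.linking_trackpy(features, w_max, dt=dt, dxy=dxy,
                               v_max=10, d_min=2000)
```

The physical parameter values mirror those quoted above; tobac converts them internally into the pixel-based quantities required by the underlying routines.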
The data processing has been performed on the JASMIN data analysis facility (CEDA, 2019). The script including feature detection, segmentation, trajectory linking and saving of the analysis output for the 1-min data output had a processing time of around 17 minutes with a maximum memory footprint of 3.1 GB using a maximum of 3 processes and 27 threads. The segmentation step has been broken up into chunks of 10 minutes each to limit the total memory consumption of the analysis.
The processing time is almost entirely taken up by the segmentation step using time-resolved 3-dimensional data of the total condensate. It is therefore strongly affected by the time required to access the data on disk and thus highly dependent on both the infrastructure and the structure of the data file, i.e. the data compression in the input files. A smaller subset of the data and the analysis for this example, including the tracking analysis and visualisation, is available as a Jupyter notebook as part of the package source code.

Time resolution requirements for cloud tracking
The cloud tracking framework presented here can be applied to model output from any atmospheric model simulation with sufficient resolution to resolve the features intended to be studied. Successful tracking of individual clouds in the simulation output requires sufficiently high spatial and temporal resolution. However, writing output data at high frequency from numerical model simulations drastically increases the computational expense of the simulations and the size of the output datasets. For observational data, such as geostationary satellite data, the available time resolution might be limited by technical restrictions such as scanning time or data transmission. It is thus important to determine the necessary input frequency for the successful tracking of a specific type of dataset.

The tracking step (Sect. 2.4) uses trackpy, which is based on the tracking methods developed in Crocker and Grier (1996). The algorithm was originally developed for microscopic particles; however, the same considerations apply to the features tracked here with tobac. In their development of the algorithm, the authors state that successfully linking objects into trajectories is only feasible if the typical displacement of a particle during one time step is smaller than the typical inter-particle spacing. To assess how valid these assumptions are for our application, we investigated the nearest-neighbour distances for individual cells and the typical displacement of the tracked objects within one time step. Distances between cloudy updrafts (Fig. 5a) were most frequently around 5 km, with a substantial tail of up to 30 km representing more isolated cells.
This distribution is independent of the chosen output time step as it represents an instantaneous relationship between cells at individual points in time. The updraft propagation velocities derived for tracking with a 1 minute output time step (Fig. 5b) were most frequently at around 4 m s −1 with more than 90 % of the velocities below 10 m s −1 .
Using the output time step and these velocities, we can calculate the displacement of the clouds within one tracking time step and compare that to the nearest-neighbour distances (Fig. 5c). In addition to the time step of 1 minute, the displacements that would result from lower output frequencies of 5, 10, 15 and 30 minutes based on these velocities were calculated (Fig. 5c).
While there is no clear overlap between the nearest-neighbour distance distribution and the displacement distribution for an output time step of 1 minute, the tails of the distributions start overlapping for 5-minute input data, although the peaks are still distinctly separate. For lower output frequencies of 15 minutes and 30 minutes, however, there is a clear overlap between the nearest-neighbour distance distribution and the distributions of displacement within one time step. Therefore, these frequencies would be outside the range postulated for the successful application of the tracking algorithm used by trackpy. Hence, when applying this tracking algorithm, it is important to understand both the spatial distribution of the desired tracked features and their propagation velocities to ensure that the output time step is sufficiently frequent. For the simulations assessed here, both 1-minute and 5-minute output frequencies would be acceptable for tracking cloudy updrafts, with 1-minute output likely to provide more successful and accurate tracks.
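This check can be reproduced directly from the trajectory table. The sketch below computes per-time-step nearest-neighbour distances and per-cell displacements for a hypothetical output interval; it assumes physical position columns named 'projection_x_coordinate' and 'projection_y_coordinate' (in metres) alongside 'frame', 'cell' and 'time_cell' columns, which may need adjusting for a given dataset.

```python
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

def nn_distances(tracks: pd.DataFrame) -> np.ndarray:
    """Nearest-neighbour distance (m) between features at each time step."""
    dists = []
    for _, step in tracks.groupby("frame"):
        pts = step[["projection_x_coordinate", "projection_y_coordinate"]].values
        if len(pts) < 2:
            continue
        d, _ = cKDTree(pts).query(pts, k=2)   # k=2: the point itself + nearest other
        dists.append(d[:, 1])
    return np.concatenate(dists)

def displacements(tracks: pd.DataFrame, dt_hypothetical: float) -> pd.Series:
    """Displacement (m) per hypothetical output interval, from per-cell speeds."""
    def cell_speed(cell):
        dx = cell["projection_x_coordinate"].diff()
        dy = cell["projection_y_coordinate"].diff()
        dt = cell["time_cell"].diff().dt.total_seconds()
        return np.sqrt(dx**2 + dy**2) / dt
    speeds = tracks.sort_values("time_cell").groupby("cell").apply(cell_speed)
    return speeds.dropna() * dt_hypothetical
```

Comparing the histograms of nn_distances(tracks) with displacements(tracks, dt) for dt of 60, 300, 900 and 1800 s reproduces the kind of overlap assessment shown in Fig. 5.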
The cloud lifetimes (Fig. 6) are analysed for the same 3-hour period using the two different time resolutions (1 minute and 5 minutes) and agree well for clouds with lifetimes larger than about 15 to 20 minutes. For shorter lifetimes, the 1-minute input data yield substantially more tracked cells. It is obvious that we can only properly represent and analyse cloud lifetimes for clouds that exist over a certain number of output time steps in this framework. An individual cloud that is tracked for 5 to 10 minutes based on 1-minute output allows for robust conclusions about the evolution of the cloud in that period. The same period would merely yield two or three individually identified objects for 5-minute data output, which is about the minimum required to draw any useful conclusions about the lifetimes or time evolution of the clouds.

For the second example application, although the temporal and spatial resolution of the input data can be arbitrary for use in tobac, a meaningful comparison of the two datasets (model simulations and satellite retrievals) requires that the analysis covers the same region at a similar temporal and spatial resolution. The spatial resolution of the two datasets is reasonably similar (around 4 km for the satellite data and 4.5 km for the model output) and both datasets use a regular 15-minute interval, with a difference of up to a minute due to the scan time of the satellite data. The satellite data were restricted to the same temporal and spatial extent as the model output.

Top-of-the-atmosphere outgoing longwave radiation (OLR) is used to track individual deep convective clouds in both model simulations and satellite retrievals. OLR is a standard model output for most high-resolution simulations and is often used as a diagnostic for simulated deep convection (Pearson et al., 2010; Russo et al., 2011). OLR retrievals also have the benefit that they do not depend on the assumptions of a more complicated radiative transfer retrieval, which requires, amongst other assumptions, that pixels are assigned as either cloudy or cloud-free for the retrieval of several optical (effective radius and optical depth) and thermal (cloud-top temperature and height) cloud properties (McGarragh et al., 2018). For the satellite data, we use an empirical conversion derived in Singh et al. (2007) to convert the radiances L from two channels of the GOES-13 measurements, the water vapour channel (WV, 5.8 to 7.3 µm) and a channel in the infrared window (WIN, 10.2 to 11.2 µm), to OLR. Singh et al. (2007) report an uncertainty of within 2.5 W m −2 for this conversion.

The distributions of OLR for the model simulations and the satellite retrievals show a very similar shape (Fig. 7). The satellite-retrieved OLR features a larger number of pixels characterised by lower OLR values in the range between 100 and 250 W m −2 , corresponding to deep cloud tops. The range covered and the peak position of OLR, corresponding to cloud-free and low-cloud regions around 290 W m −2 , agree well between the model simulation and the satellite retrieval.
We use these histograms to choose the threshold values for the feature detection and the segmentation steps in the tobac routines. The individual steps of the tracking analysis for the model data are shown in Fig. 8, but the same steps are applied equally to the satellite-retrieved data. The outgoing longwave radiation field (Fig. 8a) is filtered with a Gaussian filter with a standard deviation of σ = 4.5 km, equivalent to the grid spacing of the model data (Fig. 8b). The feature identification following Sect. 2.2 is performed with the set of five OLR thresholds of 250, 225, 200, 175 and 150 W m −2 (Fig. 8c, d). The resulting distributions of cloud lifetimes for the model and the satellite data are compared in Fig. 10a. However, more clouds are identified in the satellite data than in the model data. When normalised by the total number, the lifetime distributions agree better between the two different data inputs (Fig. 10b). Most cloud objects are tracked for periods of up to an hour, but in both the model simulations and the satellite retrievals there are numerous cloud objects tracked for up to several hours. The distributions of the cloud areas (Fig. 10c, d) show that the total cloud area for both model and satellite data is made up of two types of identified objects: smaller tracked clouds with a radius of up to 100 km and large tracked features with a radius of a few hundred kilometres. Due to the larger number of tracked clouds, there is more total cloud area in the tracked clouds in the satellite data. The distribution of cloud sizes is relatively similar between the two datasets. The satellite data show more small clouds below 100 km equivalent radius. Furthermore, the size of the largest tracked objects is larger in the satellite data than in the model data, which corresponds to the large MCS propagating through the domain of interest (Fig. 9) that is not represented properly in the model simulations with respect to both timing and total size.
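Analogously to the sketch given for the first example, the corresponding detection call for OLR minima would look along these lines (again a hedged sketch of the tobac 1.x interface; argument names and the file/variable names are assumptions):

```python
import iris
import tobac

# OLR as a 2D + time Iris cube (model output or converted GOES-13 retrieval);
# the file and variable names here are hypothetical.
olr = iris.load_cube("olr_model_or_goes13.nc", "toa_outgoing_longwave_flux")

# dxy = 4500 m; the thresholds decrease because the features are OLR minima.
features_olr = tobac.feature_detection_multithreshold(
    olr, 4500.0, threshold=[250, 225, 200, 175, 150], target="minimum",
    position_threshold="weighted_diff", sigma_threshold=1)
```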
An analysis of the cloud velocities and nearest-neighbour distances as described in Sect. 3.1 is presented in Fig. 11, showing the distributions of both the nearest-neighbour distances (Fig. 11a) and the cloud displacement velocities (Fig. 11b). The propagation velocities peak at around 8 m s −1 , with most of the velocities below 20 m s −1 . A comparison of the nearest-neighbour distances and the displacements per input time step that would result for different temporal resolutions (1, 5, 15 and 30 minutes) reveals that the 15-minute time step used here already shows some overlap in the distributions. Longer time steps of 30 minutes or more would probably lead to problems in the tracking, while shorter time steps of a few minutes would be expected to improve the tracking further. However, output at similarly high temporal frequencies is not always feasible or simply not available for many data sources, e.g. for the GOES-13 geostationary satellite retrievals used in this study. The newest generation of geostationary satellite imagers, such as the GOES-R series (GOES-16/17) that has replaced the GOES-13 satellite used here, as well as Himawari-8 (Bessho et al., 2016) and the future Meteosat Third Generation (MTG) satellites (Stuhlmann et al., 2005), all feature substantially higher temporal and spatial resolution.
The scattered convective cells of differing depths over the area of Houston that were the focus of the analysis in the first application cannot be resolved in this second example: the coarser resolution of the model output and the satellite retrieval (around 4 km, compared to 500 m in the high-resolution simulations used in Sect. 3) limits the spatial scale of cloud features that can be resolved to more than a few tens of kilometres in radius. The use of outgoing longwave radiation as a variable for feature identification does not include as much information as the three-dimensional model output fields used in Sect. 3; however, it provides complementary information for comparing model simulations with satellite retrievals.

We have developed a feature detection algorithm based on identifying regions above/below a defined sequence of thresholds in two-dimensional input fields. Cloud volumes or cloud areas are associated with the features based on a watershedding technique featuring a single specific threshold value on two- or three-dimensional input fields.
We have shown how we can leverage another open-source Python package, trackpy, initially developed for the tracking of microscopic particles in a different field of the physical sciences, to perform the trajectory linking step. We have presented two application examples of the use of tobac for the study of deep convective clouds. In the first application (example A), we have tracked scattered deep convective cells based on a combination of the vertical velocity and total condensate mixing ratio fields from CRM simulations with WRF over the area around Houston, Texas. The simulations were performed with a grid spacing of 500 m and thus represent a typical application of a CRM. The tracking framework is currently being applied to other CRMs for the same case study as part of the ACPC deep convection case study (van den Heever et al., 2017) to investigate the response of deep convective clouds in models to changes in aerosols. We have performed the tracking for different output frequencies to evaluate the dependency of the tracking performance on the time resolution of the input data.

The output resolutions of 1 minute and 5 minutes lead to comparable tracking results for scattered convective cells. This result can be confirmed using an analysis of typical displacement velocities of the clouds and nearest-neighbour distances between the individual identified cloud objects.
In a second application (example B), we have presented a simultaneous tracking of deep convective cloud features and larger convective systems based on outgoing longwave radiation output from model simulations with convection-permitting grid spacing (4.5 km) and outgoing longwave radiation derived from geostationary satellite retrievals (GOES-13) in the same region. The 15-minute time resolution available from the satellite retrieval is shown to be sufficient for successful tracking performance.
The analysis also demonstrated that the model simulations and the satellite retrieval feature clouds with a similar lifetime distribution. The distribution of cloud areas in model and satellite data shows a similar combination of smaller convective cells and larger systems. The main differences occur for the largest tracked systems, which are stronger in the satellite retrievals. This can be explained by the limited representation of the propagation of two large organised storms within the model domain.
This would have been more challenging to assess from a bulk analysis of the domain-wide averaged properties.
The newest generation of geostationary satellites, such as Himawari-8 or GOES-16/17, provide substantially higher spatial and temporal resolution (Bessho et al., 2016; Schmit et al., 2016). These advances will strongly improve the applicability of this type of satellite data for use in object-based tracking and analyses with tobac, and also allow for a wider range of applications, e.g. by capturing smaller scattered cells such as the ones investigated in Sect. 3.
The ability of tobac to be used for both models and observations, as shown in these examples, helps to compare models with observations more directly and therefore to better understand the differences between the two types of data.
Although we have focused on tracking and analysing deep convection here, there are numerous other applications that tobac can be used for without much additional work. There are a large number of existing data products, such as high-resolution radar data, e.g. from NEXRAD over the United States or similar networks in several other regions of the world (Reed et al., 2017), that would be well suited for use with tobac. Furthermore, the application of tobac is not strictly limited to the analysis of clouds, and it can also be applied to study other features of the Earth system that can be identified as well-defined time-evolving regions, such as distinct aerosol plumes in the atmosphere or plankton in the surface layer of the ocean.
We are currently working on implementing additional algorithms for the modular steps of the framework, e.g. based on the analyses developed in Senf et al. (2018). Additionally, we are implementing a more flexible representation of the links between cloud objects at specific points in time, which will allow for proper treatment of more complex splitting and merging of cells. We invite the community to contribute to the future development of tobac both through the implementation of existing algorithms into the common framework and by using the framework as a basis for new developments.
Code and data availability. The tobac source code is publicly available in a GitHub repository distributed under a BSD 3-Clause license at https://github.com/climate-processes/tobac (Heikenfeld, 2019c, d). The version tobac 1.2 described here is available as a release (Heikenfeld, 2019a). The latest version of tobac can be installed using conda with the command conda install -c conda-forge tobac.
The linking step makes use of trackpy (Allan et al., 2016), which is available on GitHub (https://github.com/soft-matter/trackpy). We use several standard Python packages for scientific computing and image processing that are all available through package managers such as pip or conda. The GOES-13 satellite imager data have been downloaded from the NOAA Comprehensive Large Array-data Stewardship System (CLASS) (NOAA, 2019a) and processed with the NOAA Weather and Climate Toolkit (WCT) (NOAA, 2019b).
Jupyter notebooks containing the tracking analysis and visualisations of the tracking results for a smaller subsample of the data used in the two example applications are provided as part of the tobac source code. The data used in these notebooks is downloaded automatically and is available as Heikenfeld (2019b) and on GitHub at https://github.com/climate-processes/tobac_example_data.