UFlow 1.0: A Computer Model for Projections of Urban Sprawl

Cities concentrate most of the world’s population and are the stage of difficult problems around logistics, economy, or quality of life, to enumerate just a few. As an object of research on itself, an urban agglomeration is difficult to characterize; it is both an ensemble of various disconnected heterogeneous elements, and the product of numerous actions and effects between those elements. Studies of the structure and the functioning of cities date back to one century ago, with an increased interest in the 5 last decades on the phenomenon of expansion and all of its impacts. Models of city growth face the complex nature of this system and are approximate. Different representations seek to balance characteristics as data availability, level of detail of internal processes, or precision. The uflow model approaches the problem with the metaphor of an abstract field, which evolves over time and signals the conversion from empty to urban cells. The procedure for calibration adjusts parameters according to the history of the region under study, and is able to capture local 10 conditions. The implementation takes advantage of parallel hardware, and the simulation can be performed in reverse mode, a feature that can be useful to verify the adaptation of the tool to a given scenario, or to compute approximations of the past state of a region. Tests confirmed the expected behaviour of the algorithms, and good agreement with actual data. The flexibility of the concept of intensity of urbanization is open to the integration of different data sources into the model, and the possibility of simulating 15 their evolution over time.

. Overview of Urban simulation. sense that outputs are linked to the inputs in a straightforward manner, and it can be used as a basis to implement other criteria governing grow speed.
2 Overview of CA-based LUCC simulation LUCC simulators perform computations derived from geophysical and historical data, in order to extrapolate the future be-65 haviour of a region. The most basic and common output is a classification, usually in Boolean form: an area is either urban or not, although there are models which break this division further into categories as residence, rural and industrial areas, or parks (Barredo et al., 2003).
To say that LUCC forecast is a sophisticated form of data regression is a fair description (Brunsdon et al. (1996) being an example), but it misses a central feature of the problem. What's specially remarkable in the study of city dynamics is its 70 ambivalence, as deterministic rules operate together with random events to produce results. For instance, smooth topography, transportation facilities and distance to commercial zones are examples of quantifiable information, and catalysts for terrain occupation. On the other hand, unexpected events as the shutdown of a factory (e.g.: Chapain and Murie (2008)), and investors decisions may change the future of a city (Batty, 2007;Rocha, 2012).
The mix of deterministic causal rules and random factors may point two different directions to build a representation, 75 depicted in Figure 1.
First, the whole dynamics comprising deterministic and aleatory effects can be absorbed into black-box models, commonly referred as machine learning tools. Examples in this line include Logistic Regression (Hu and Lo, 2007), Neural Networks (Li and Yeh, 2002;Almeida et al., 2008), Stochastic Automata (Wu, 2002), Particle Swarm (Feng et al., 2011), Random Forests (Kamusoko andGamba, 2015), Deep Learning (He et al., 2018) or Markov models (Aaviksoo, 1995;López et al., 2001;Guan et al., 2011;Arsanjani et al., 2013;Moghadam and Helbich, 2013). The accuracy of a simulation built this way will be mainly affected by the volume of available data, the correct combination of variables in calculation procedures, and the presence of patterns that can be detected and remain valid throughout the evolution of the city.
Second, emphasis is placed on known facts about the system. This knowledge is translated into rules to describe transitions, relations, effects, which have higher interference over the processes occurring in the system, complemented with data that 85 backs the construction of a simulation. The rules can depict hierarchical levels and be parameterized to fit the dynamics of a particular scenario. Parameters that control the simulator are adjusted to fit the scenario under study. Because of the intrinsic stochastic nature of the problem, deterministic models are extended with sources of perturbation, to give them some organic resemblance and break rigid patterns that may fall short of representing actual behaviours. Models of this genre include Agents (Arsanjani et al., 2013;Ligtenberg et al., 2001;Parker et al., 2003), Rule-based Systems (Berberoglu et al., 2016) and Cellular 90 Automata (Rocha, 2012;Santé et al., 2010;Moghadam and Helbich, 2013;Berberoglu et al., 2016;Clarke, 2008).
Simulation generally takes into account the whole dataset; regional trends are averaged out, and this choice owns to the need to keep the complexity of data and algorithms under a manageable level. Alternatives to circumvent this problem consists of partitioning the map (Ke et al., 2016;Shu et al., 2017;McGarigal et al., 2008), or to structure the model in different scales (White and Engelen, 2000;Stevens and Dragićević, 2007).

Cellular automata
Cellular automata executes a set of fixed rules over tabular data. The ability to exhibit complex execution patterns stems from the large set of states that a grid can represent. If the state z of a cell depends on all of its neighbours (a common configuration), the transition z new = f (s old 1..8 ) can be chosen among 256 possibilities. A tiny grid with 10 x 10 binary cells (urban/non urban) has room for 2 100 situations. A thorough review of CA properties can be found in Wolfram (1983).
This ensemble of factors is sometimes summarized as a set of attractive and repulsive forces (White and Engelen, 2000;Stanilov and Batty, 2011;Furtado et al., 2012), and an illustrative example of this metaphor is the use of gravity equations 115 (Chen, 2009(Chen, , 2015. Figure 2 is an illustration of forces acting in a city (Ponta Grossa, Brazil). The two locations are approximately 500 meters apart; the steep, unpaved street (Bartolomeu Bueno) is close to a flat boulevard (Carlos Cavalcanti av.) and a hypermarket. Land prices climb along of the slope in the direction of the avenue.

Metrics and Calibration
A run of a simulator s(·) takes as input a two-dimensional map or matrix m n , a set of parameters θ, and produces a new city 120 map m n+1 = s(θ, m n ). Some tools require a set of maps, depicting several years. Time is generally measured in years, and before computing any predictions, a simulator must be tuned to reproduce historical data; given a series of maps m 0,1,... , it is expected that m n+1 ≈ s(θ, m n ). This entails a calibration procedure, which can be cast as an optimization problem defined with a distance metric: (1) 125 This formulation implies that the parameters θ remain fixed throughout the historical series. The alternative would be to have θ = θ(t), leading to a control problem; no urban simulator, to our knowledge, operates this way. The model SLEUTH modifies parameters during execution, but it does so according to internal heuristics and not following a prescribed trajectory (Clarke and Gaydos, 1998).
The choice of the norm in Equation (1) is of special importance, as it may reflect different properties of the landscape  Garigal and Marks, 1995;Amiri et al., 2017). One of the most obvious choices is the difference between simulated and actual growth, computed as the number of urban pixels. This metric has a low computational cost, but is blind to city morphology: expansion predicted in the right or wrong places may yield the same values. This calculation is, nonetheless, useful as a raw estimate to filter configurations during calibration. Projections of population growth (Stevens and Dragićević, 2007) enter the same category. A comprehensive set of metrics is found in McGarigal and Marks (1995), including, for example, patch size 135 and count, average and total perimeter of patches, and shape analysis. Research in protein structure classification also deal with shape matching with similar metrics (Baldi et al., 2000). The use of a set of measurements might provide a multifaceted view of city growth (Schneider and Woodcock, 2008), but integrating several functions in (1) would require the definition of an aggregation index (e.g. Dietzel and Clarke (2007)) and induces the problem of fixing the relative weight of each factor.
One of the basic metrics in this context is the Lee-Sallee symmetric difference (Lee and Sallee, 1970), which addresses the 140 limitation of the simple difference. It is computed as 1 − |A ∩ B|/|A ∪ B|, where A and B are images being compared, the union and intersection operators are computed per pixel, and the norm operator stands for pixel count.
The use of a binary classification produce four possible results: a pixel can be correctly or wrongly predicted to be urban or not. This leads to the Matthews correlation coefficient (Matthews, 1975); it combines the counts of the four types of pixels (TP true positive, TN true negative, FP false positive, FN false negative) into a uni-dimensional measure given by: Matthews coefficient gives more informative than Lee-Salee (Chicco, 2017), and it has been extended to compare more than two categories (Gorodkin, 2004). Another metric is the Kappa index, but its interpretation is non trivial (Pontius Jr and Millones, 2011;Olofsson et al., 2014). In the case of urban sprawl a perfect match is of lesser importance if, in a series of snapshots, a model succeeds to reproduce general trends for a city. of a model, as registered in this rare account: "Settings for these four constants were arrived at by examining growth rates of cities over time, but are otherwise the result of trial and error" (Clarke, 2008). Tools as SLEUTH make a handful of parameters accessible to end-users (Clarke and Gaydos, 1998), but there are simulators that increase this number to dozens of adjustable coefficients (Fang et al., 2005;Arsanjani et al., 2013). Some authors go one step further, and use machine learning techniques to find the very rules that are going to be executed (Li and Gar-On Yeh, 2004;Li and Liu, 2006;Bone and Dragićević, 2009; Given the non-convexity and discontinuities of the optimization problem (1), techniques as gradient descent are not the most indicated. Methods commonly employed include Genetic Algorithms, Simulated Annealing, or even brute force (Newland et al., 2018;Paegelow et al., 2014;Dietzel and Clarke, 2007;Al-Ahmadi et al., 2009). Meta-heuristics have a potential interest here, but the recent flood of proposals in this area may shadow algorithms that are really effective; a thoughtful discussion in this sense can be found in Sörensen (2015). constructions. Periods of decline may also happen, caused for example by lack of maintenance, physical deterioration, or severe weather events. These processes accumulate a difference in urbanization between two moments in time, in a given area. In the one dimensional case and for small values ∆x, ∆t, the amount corresponds to Improvements in an area tend to propagate and ultimately lead to urban sprawl; this can be assimilated into a directional flux that drives the evolution of a region. Similar to an energy gradient (Torrens and Alberti, 2000), it can be written as , where κ captures conditions, like slope, that interfere with the intensity of the process.

200
It can be shown that this expression gives rise to the following equation: This expression is known as the Heat Equation (a derivation is provided in Appendix C). It relates the change of temperature with the transfer of heat energy from hotter to colder regions. This differential equation has already been considered to represent distribution of population (Tobler, 1970;Chen, 2008;Rocha, 2012).

205
The model takes two inputs: (i) a map m 1 of pixels corresponding to the current city map; (ii) a table κ describing what might be called urban diffusivity. Before solving the model numerically, some adjustments to the heat analogy are necessary; this is discussed in the following section.

Variations on a theme
A limitation of the model (4) is the requirement that regions be adjacent to conduct flow; new zones may emerge as satellites 210 around the main footprint. Installation of industrial plants is a typical case, and complex landforms also produce fragmented morphologies. In the present case a probability map is computed to control this mechanism.
Input maps m 1 and m 2 use two extreme values to classify land. In the present study, 0 corresponds to empty, and the value 5 was assigned to urban cells. The higher the initial value, the fastest the progression of urban borders. The change from an empty pixel to urbanized was empirically fixed at the threshold u > 0.5. Different combinations of these values and κ controls 215 the reach of the urbanized region under the heating curve. This point is illustrated in Figure 3.
The use of binary maps flattens the city and disregards initial urbanization gradients, leaving the computation of the coefficients κ alone to capture city dynamics. An alternative to explore is the use of information as population density, land value, distance to commercial centres and other factors (Tobler, 1969;Meentemeyer, 1989;Tayyebi et al., 2014). Since this corresponds to a change in the initial conditions of the problem, it may affect how the solution pushes the borders of the city.

220
Calibration requires maps in two different points of time m 1,2 , separated by an arbitrary period t A . The algorithm searches the optimum κ * , guided by Matthews' coefficient; and it also finds the simulated time t S that takes the system from m 1 to m 2 .
Finding parameters that match a given trajectory corresponds to an inverse problem; the heat equation is particularly difficult in this respect, and the conditions imposed here, as discontinuities and the large number of unknowns κ(x, y) make it harder (Beck and Arnold, 1977;Tervola, 1989;Alifanov et al., 1995). Since the objective was only determining if the values of cells 225 fall within a given range, an interactive algorithm was devised; in short, an error map is used to modify κ * in multiple iterations of the equation (4).

Main algorithm
Numerical solutions to the Heat Equation are available in many textbooks and won't be discussed here. The implementation used a finite difference, forward scheme. The mesh spacing has no real physical meaning (the calculation does not represent The calibration process is described by Algorithm 1. It begins by taking the initial city map m 1 as a constant heat source and an initial coefficient map κ(x, y) = 1; exclusion areas as parks can be defined by setting κ(x, y) = 0.

235
Algorithm 1 Calibration algorithm j = 1; mS = m1; (initial simulation output) repeat while (count of urbanized pixels in mS is lesser than in m2) do Integrate (4)  The inner loop is iterated until the number of pixels in the urban footprint is equal to, or larger than the reference image m 2 .
The difference r = m s − m 2 is calculated to obtain cells wrongly classified as urban. The minus operator in this formula takes two maps, applies the threshold 0.5 to classify pixels as urban or not, and then for each pixel computes (m s (x, y) − m 2 (x, y)) + .
The result is an error image containing spots that are false positives.
Next, spots in the error image r are enlarged to create regions where the diffusion coefficient will be diminished, in order to 240 form barriers that slow down the urban flow. The resulting image is subtracted from κ and negative values are set to zero. At each iteration the error image is recalculated and the enlargement factor is increased. The effect is similar to plating a surface with a series of concentric circles of increasing radius; the inner region will have the lowest values for the diffusion coefficient.
Expanding the spots is not the same as rescaling the image; the closest morphological operation is dilation, but it deforms borders according to a parameter known as structuring element (Haralick et al., 1987). In the present case, better results are 245 obtained using equation (4) itself with r as input, and this process will be referred here as T (r) for a given T interval.
This appears in the Algorithm 1 in the expression [κ t − γ * j * τ (r)] + , which is controlled by two parameters: γ defines how fast insulation areas build up in κ(x, y); and τ controls the rate of expansion of spots at each iteration. Values were chosen empirically, with τ = 5 and γ ≈ 0.075; the latter parameter was exposed in configuration files. These coefficients do not alter the mechanics of the model, but influence the speed and quality of calibration.

250
The calibration stops if the maximum number of iterations n C is attained, of if Matthew's coefficient does not improve between cycles. The algorithm outputs t S , the simulated time separating images m 1 and m 2 ; κ * , a sub-optimal diffusion map; and the best value of Matthews' coefficient.
Forecasts are calculated in two steps: expansion of the second map m 2 followed by the generation of new urban islands.
Users define the number of years of projection, and a linear regression extrapolates the number of new pixels along of m 1 and 255 m 2 . The result is divided between expansion of existing areas and creation of new clusters, following the same proportions found in the input images.
The process of diffusion produces isotherms and creates regular, rounded shapes; to counteract this effect, the map κ * is modified by adding a random value χ to every pixel. Exclusion areas can be defined by assigning pixels with the value 0.
Expansion is computed taking as inputs κ * + χ and m 2 as a heat source with temperature set to 5.0, and integrating Equation

260
(4) until the growth attain or surpasses the expected count of pixels. Next, new urban fragments are created using a random spraying procedure described in the following sections; it involves analysis of the images to count urban clusters and their size, and the calculation of a probability map π(x, y).

Count of satellite regions
The size and number of urban islands is important to characterize the morphology of a city and to parameterize the spraying 265 process. The usual solution to find clusters involves edge detection techniques, but since the images are monochromatic, a simpler procedure was devised. A variable n R is initialized to 1; the image is scanned pixel by pixel and whenever a cell with value 1 is found, a flood-fill algorithm is applied to paint the region with colour n R + 1. The value of n R is incremented and the scanning proceeds until the end of the image. As the filling routine runs it also counts the number of painted cells to obtain measures of areas.

270
This procedure is executed with images m 1 and m 2 . The number of fragments in the images, n 1 and n 2 , is used in a linear interpolation in the forecast. The area of the new clusters is chosen as an average of the (n 2 − n 1 ) smaller clusters in m 2 .

Probability map and generation of clusters
The emergence of new fragments is more intense near roads and in the proximity of the city boundaries. The effect can be related to Tobler's Law (Tobler, 1970;Miller, 2004), and depends on the distance between elements. Despite the simplicity of 275 the concept, an efficient implementation is non-trivial (Grevera, 2007;Fabbri et al., 2008;Friedman et al., 2018;Felzenszwalb and Huttenlocher, 2012). Giving an implementations of the distance transform δ(·), a probability map can be computed as , where ρ is a map of roads and e is an exclusion layer, both represented as binary matrices. The rightmost term removes pixels ). An exponential decay can be applied as e −g(1−δ(x,y)) 2 , increasing the density of pixels generated near city boundaries. The distance transform has a limitation, as it does not represent the cumulative influence of several sources. This effect is present in heat model T (·), which can replace δ(·) in the calculation of Equation (5). Maps generated using the two techniques 285 are shown in Figure 4. The results obtained with T (·) are more plausible for geometries as crossroads. In view of this, the diffusion equation was chosen for the calculation of π(x, y).
The map π can be further adjusted with the negative influence of a slope map. Other aspects as historical and sociological characteristics can be incorporated by manual edition (e.g. Houet et al. (2016)).
The spraying procedure follows a classic generate/reject logic, as shown in Algorithm 2. if v > π(x, y) then The interval [0.1; 0.9] in the algorithm was chosen to escape from two extreme cases. Low values for v would select locations of low relevance, while values close to 1 would place new clusters touching areas already occupied.

Verification and Validation
Conceptual model validation is the first step in checking a simulator; it assesses that "...theories and assumptions underlying the conceptual model are correct and that the model representation of the problem entity is reasonable for the intended purpose 295 of the model" (Sargent, 2013). The UFlow model incorporates several ideas developed in other studies; important points mentioned in the preceeding sections include: the city map is sampled in a bidimensional regular mesh, with arbitrary resolution; urban boundaries may expand with different speeds and irregular shapes; disconnected urban fragments may emerge at random moments and places;
exclusion areas can be arbitrarily defined; a probability map can represent trends of land conversion.
An important hypothesis of the model is the capability of adjusting the coefficients κ to capture local tendencies of sprawl.
Initial operational validation (Sargent, 2013) was conducted using artificial images, allowing controlled observations similar to 305 the ones found in N. Gazulis (2006). This test confirmed expectations as is presented in Appendix B.

Tests with actual maps
Two cities were tested: Ponta Grossa, Brazil, for which comparison data was available; and the capital of México, México City.
Putting aside differences of size, history and economy, the cities share two characteristics that make them interesting testcases. First, a significant part of urban sprawl occurred outside the control of legislation or public planning (Duhau, 1988;310 Silva, 2013). Second, both cities have a complicated topography, with México having its own extinct volcano, Xaltepec. These factors create more opportunities for heterogeneous local conditions and give rise to irregular morphology.

Ponta Grossa
Ponta Grossa occupies approximately 170km 2 and has a population density around 160/km 2 . If compared to other similar brazilian cities of the region (Paraná state), it has a very scattered layout. The city was shaped by irregular occupation, 315 complicated topography and a lack of infra-structures as viaducts.
Reference maps for Ponta Grossa represented the years 1984, 1993 and 2002, with a resolution of 1242 × 1339 pixels.
Images are shown in Figure 5. present in satellite images (Manandhar et al., 2009;Roth, 2019). During the simulated period, urban zoning was modified and farms were absorbed into city boundaries, but many areas remained unoccupied for reasons as litigation and lack of investments. 1984-1993 the simulation found 3 new clusters, but the actual number is slightly greater, since the heristic used to count clusters misses fragments that fuse together.
Red and blue pixels in the error image correspond respectively to false positive and false negative. The rapid alternance between the two types of errors occurs along all of the highly indented border. Images of the city in the following years 330 confirm this growth pattern. Figure 7 shows the footprint 15 years later in relation to the forecast: it can be seen that the city borders advance over the false positive/negative pixels of the error image, and that the expansion does not concentrate on specific regions. The second part of Figure 7 exemplifies the interlaced pattern between the city and its surroundings, still present long after the forecast period; the region depicted is known as Colônia Santa Luiza, at the south of Ponta Grossa. Old farmlands are surrounded by 335 the city; as a matter of fact, car accidents with farm animals in urban roads, are still not rare (e.g., Globo (2019)).
Some statistics are collected during calibration; the values corresponding to the last iterations are shown in Table 1. The simulator uses a working κ that is altered in every iteration. The best value for Matthews' coefficient was found at iteration 25, and κ * was recorded at this point.  Approximately 95% of the images are empty space, and this reflects in the fact that the True Negative curve shows little variation.
The results of the UFlow model can be compared with the study developed by Roth (2019), using the SLEUTH simulator and input images that were used in the present work. Three points are noteworthy.
The first aspect is calculation speed; the calibration process in SLEUTH took approximately 40 hours, what is in accordance Clarke, 2017, 2018). By contrast, the calibration of the UFlow model took 10 minutes and is deterministic; random processes are only active in the prediction step, which runs under a minute.

350
The second point is the quantitative results. The SLEUTH simulations in Roth (2019) employed a sequence of images to extrapolate growth in 2017. Results indicated less expansion than expected, and the presumed reason was a phase of slow growth in the sequence of footprints. UFlow follows tendencies found in two input images as a feature of the model, which translates into more predictable values -although both SLEUTH and UFlow cannot make sure that the growth speed will be correctly simulated.

355
The third point is the qualitative aspect of the simulations. SLEUTH tries to assimilate the intensity of different types of expansion across a city. In her study, Roth observed a prevalence of growth around the borders of the images and irregular occurrence of other mechanisms such as expansion along roads. UFlow does not implement a set of growth rules, but, on the other hand, it adapts to the intensity of expansion according to local information. The adjustment to global statistics only occurs in the generation of scattered fragments.

México City
México City occupies a surface of approximately 1485km 2 , with a population of almost 9 million people. The conurbation of the capital of México with neighbour towns exceeds 20 million inhabitants. The city has more than 500 years marked by dramatic events with effects on its morphology.
The city was built on the floor of ancient Texcoco Lake, is surrounded by montains and experiences intense rainfalls. It 365 suffered several floods and in 1629 was covered by water for five years (Martínez, 2004). Systems for drainage were built, but caused ground subsidence. Three tectonic plates meet at the region and are the source of frequent seismic activity. The last notable event was in 2017, with magnitude 7.1 and causing the loss of more than 300 lifes. White, R. and Engelen, G.: High-resolution integrated modelling of the spatial dynamics of urban and regional systems, Computers, Envi- A companion application implements these operations separately, as a helping tool for users of the simulator.

A2 Numerical Integration
Equation (4) is integrated using a first order, forward, finite differences method. Time and space are arbitrarily scaled. Each pixel in the input image corresponds to one node of the mesh, and the distance between nodes is set to 1.0.
The integration method imposes the following stability condition: Since κ ≤ 1.0, convergence requires ∆t < 2 −1 . Smaller values slow down the propagation of the urbanization process, but also makes it more likely that changes in small areas are captured in the updates of κ.
The core of the integration code is shown in Figure A1. Figure A1. Parallelization: code is executed by two threads, pointers take strides between them.
Two threads scan the data array using a stride scheme, represented in the figure by the colours red and blue. Values are accessed using pointers, making the code a bit faster than using indices. An explanation of the method was found later in 695 (Horak and Gruber, 2005). The scheme can be extended to use more execution threads; modifying the code so that it adapts on-the-fly to any number of threads is not difficult either, but that would require pointer arithmetic in place of the increment operator ++, with a slight increase in computational cost. Execution times are of the order of 10 minutes with images of size 1000 × 1000 in a laptop with a i3 Intel microprocessor.
The main parameters playing a role in the code are listed in Table A1.

700
Values were tuned empirically, as new tests unfolded during the implementation. Four parameters were chosen to be made accessible by means of a configuration file. The decision to keep some variables hardwired into the code was rather arbitrary, but had the intention of reducing the complexity the software operation. The exposed variables were deemed sufficient to give control over the simulator.
Researchers interested in modifying the choice of which parameters are accessible will face no special difficulties. The way 705 the configuration file is handled makes it straightforward to add new features. For instance, if a new variable named "Xis" must be added, only two modifications are required: 1. insert a line in the configuration file, as 'Xis = 3.1415'; 2. access the variable in the code using this : 'double xis = config_getd ("Xis")'; /* get double */ Other helper functions include config_getb() to read Boolean values, and config_gets() to read strings.

710
The report system is also simple to extend. A global variable (of type struct SReport) named "GReport" collects all data of interest. The definition of the variable and the generation of the report file are both found in a header file, "report.h". In order to generate the output of a new statistic, for instance a variable named "alpha", the user should: compare with Gazulis, N., Clarke, K. C. (2006, September). Exploring the DNA of our regions: Classification of outputs from the SLEUTH model. In International Conference on Cellular Automata (pp. 462-471). Springer, Berlin, Heidelberg.
The first practical tests used images designed to represent different growth patterns, shown in Figure B1; the elapsed time was chosen arbitrarily as 10 years just to provide a reference. The map of roads, slightly non-simmetrical, was already shown in Figure 4. The simulated city is symmetrical along the horizontal axis. Areas in the southern part remain static, while in the 725 north there are regions growing in patterns of different speed. New isolated fragments appear near edges that did not expand. Outputs computed by the simulator are shown in Figure B2. The first image, B2 a), depicts the κ * coefficient. The red background colour represents maximum conductivity, with κ = 1 assigned at the beginning of calibration. Blue pixels have value 0 and function as contention barriers to urban flow. It can be perceived that the whole south part of the city was surrounded by an insulation barrier.
730 Image B2 b) shows simulated growth. The simulated city expands in three locations; new pixels occupy a triangle, a D shape, and an irregular pattern. All cases contain different grow gradients, and they were reproduced with good accuracy. Sharp edges in the original image became rounded in the simulation, an expected effect from the behaviour of Equation (4).
The error image B2 c) compares actual data and simulation, showing false positive (red) and false negative (blue) pixels.
Notably, this kind of representation is not used in most studies of urban growth with cellular automata, the literature cited in 735 this paper being an example. These pixels are concentrated where the expansion was faster, or in urban islands that appear only in the second snapshot of the city and are not considered during calibration.
A forecast is shown in Figure B2 d). Before calculating a forecast the simulator modifies κ * with the addition of uniform noise; in this test, this parameter was set to χ ∈ [−0.25; 0.25]. This gives some margin for growth to occur in areas that were otherwise inactive in the period spanned between the input images. This procedure also breaks isotherms that form around