<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">GMD</journal-id><journal-title-group>
    <journal-title>Geoscientific Model Development</journal-title>
    <abbrev-journal-title abbrev-type="publisher">GMD</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Geosci. Model Dev.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1991-9603</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/gmd-19-3757-2026</article-id><title-group><article-title>A novel cluster-based learning scheme to design optimal networks for atmospheric greenhouse gas monitoring (CRO<sup>2</sup>A version 1.0)</article-title><alt-title>Concepteur de Réseaux Optimaux d'Observations Atmosphériques</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Matajira-Rueda</surname><given-names>David</given-names></name>
          <email>david.matajira-rueda@univ-reims.fr</email>
        <ext-link>https://orcid.org/0000-0003-0885-4476</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Abdallah</surname><given-names>Charbel</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-3410-6965</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1 aff2">
          <name><surname>Lauvaux</surname><given-names>Thomas</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-7697-742X</ext-link></contrib>
        <aff id="aff1"><label>1</label><institution>Université de Reims Champagne-Ardenne, CNRS, Climate Impacts on Environment Laboratory (CIEL), AtmosphEric Research and Observations LABoratory (AEROLAB), Campus du Moulin de la Housse, 51687 Reims CEDEX 2, France</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Laboratoire des Sciences du Climat et de l'Environnement (LSCE), IPSL, CEA-CNRS-UVSQ,  Université Paris-Saclay, 91191 Gif-sur-Yvette CEDEX, France</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">David Matajira-Rueda (david.matajira-rueda@univ-reims.fr)</corresp></author-notes><pub-date><day>8</day><month>May</month><year>2026</year></pub-date>
      
      <volume>19</volume>
      <issue>9</issue>
      <fpage>3757</fpage><lpage>3782</lpage>
      <history>
        <date date-type="received"><day>28</day><month>August</month><year>2025</year></date>
           <date date-type="rev-request"><day>20</day><month>October</month><year>2025</year></date>
           <date date-type="rev-recd"><day>10</day><month>February</month><year>2026</year></date>
           <date date-type="accepted"><day>27</day><month>February</month><year>2026</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2026 David Matajira-Rueda et al.</copyright-statement>
        <copyright-year>2026</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026.html">This article is available from https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026.html</self-uri><self-uri xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026.pdf">The full text article is available as a PDF file from https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e116">With the continued deployment of atmospheric greenhouse gas (GHG) monitoring networks worldwide, optimal and strategic positioning of ground stations is essential to minimize network size while ensuring robust observation of fossil fuel emissions in large and diverse environments. In this study, a novel scheme (<italic>Concepteur de Réseaux Optimaux d'Observations Atmosphériques</italic> – CRO<sup>2</sup>A) is developed to design optimal mesoscale atmospheric GHG monitoring networks through a three-stage process of unsupervised clustering with inverse weighting and data processing. Unlike current approaches that rely primarily on inverse-modeling pseudo-data and heavily on error or uncertainty assumptions, this scheme requires no such assumptions; instead, it relies solely on direct atmospheric simulations of GHG concentrations. The CRO<sup>2</sup>A design scheme improves convergence to an optimal solution by minimizing the number of ground-based monitoring stations in the network while maximizing overall network performance. It can perform both foreground and background analyses and can assess and diagnose the quality of existing monitoring networks, among other special features. CRO<sup>2</sup>A treats simulated GHG concentration fields as spatiotemporal images, processed through multiple transformations, including data cleaning and automatic information extraction. These transformations reduce processing time and sensitivity to outliers and noise. The developed scheme incorporates techniques such as image processing and pattern recognition, supported by optimal heuristics derived from operations research, which enhance the ability to explore and exploit the problem search space during the solution process. Two main applications are presented to illustrate the capabilities of the proposed optimal design scheme.
These are based on simulations of atmospheric anthropogenic CO<sub>2</sub> concentrations from the Weather Research and Forecasting (WRF) model, one for an urban setting and the other for a regional case centered in eastern France, and are used to evaluate optimal network designs and the computational performance of the scheme. The results demonstrate that the design scheme is competitive, straightforward, and capable of solving the design problem while maintaining a balanced computational cost. Based on the WRF reference simulation, CRO<sup>2</sup>A performed analyses of foreground measurements (atmospheric signatures of fossil fuel emissions) and their associated background fields (where simulated large-scale background concentrations are used, avoiding major sources and sinks of GHGs), providing the minimal number of ground-based measurement stations and their optimal locations in the regions. As additional features, CRO<sup>2</sup>A enables users to diagnose the performance of any existing network and improve it in the event of future expansion plans. Furthermore, it can be used to design and deploy an optimal monitoring network based on predefined potential locations within the region under analysis.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>Université de Reims Champagne-Ardenne</funding-source>
<award-id>GSMA-7331-MATA0011</award-id>
</award-group>
<award-group id="gs2">
<funding-source>Agence Nationale de la Recherche</funding-source>
<award-id>ANR-22-CPJ1-0002-01</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e186">Atmospheric inversions of GHG emissions have been applied at the global scale to better understand the spatial and temporal distribution of sources and sinks across continents and oceans <xref ref-type="bibr" rid="bib1.bibx11 bib1.bibx4" id="paren.1"/>. At finer scales, denser observation networks have been deployed, such as the atmospheric network of the Integrated Carbon Observing System (ICOS) (<uri>https://www.icos-cp.eu</uri>, last access: 5 May 2026), the permanent National Oceanic and Atmospheric Administration (NOAA) tall tower network <xref ref-type="bibr" rid="bib1.bibx2" id="paren.2"/>, and temporary regional networks (e.g., the Mid Continent Intensive Experiment; <xref ref-type="bibr" rid="bib1.bibx24" id="altparen.3"/>). More recently, urban ground-based monitoring station networks have been established in Indianapolis <xref ref-type="bibr" rid="bib1.bibx25" id="paren.4"/>, Paris <xref ref-type="bibr" rid="bib1.bibx10" id="paren.5"/>, Los Angeles <xref ref-type="bibr" rid="bib1.bibx15" id="paren.6"/>, and Mexico City <xref ref-type="bibr" rid="bib1.bibx36" id="paren.7"/>. These networks were designed to address specific questions, such as regional carbon budgets <xref ref-type="bibr" rid="bib1.bibx16" id="paren.8"/>, ecosystem responses to climate variability <xref ref-type="bibr" rid="bib1.bibx40" id="paren.9"/>, or long-term trends in fossil fuel emissions <xref ref-type="bibr" rid="bib1.bibx17" id="paren.10"/>. However, their design is often guided by subjective criteria based on expert judgement, aiming to capture atmospheric signatures of GHG fluxes from biogenic or anthropogenic sources and sinks. Beyond the practical requirements of specific instrumentation (e.g., pre-existing infrastructure, temperature-controlled environments, and absence of nearby sources), optimal network design remains a key research question.
This requires objective criteria to address specific scientific goals while limiting the number of measurement locations and improving network performance.</p>
      <p id="d2e223">Many studies have proposed algorithms for designing observation networks that reduce uncertainties in the estimation of GHGs (CO<sub>2</sub>, CH<sub>4</sub>, and N<sub>2</sub>O) on a global scale <xref ref-type="bibr" rid="bib1.bibx30" id="paren.11"/> and in specific regions, such as the African continent <xref ref-type="bibr" rid="bib1.bibx28" id="paren.12"/>, North America <xref ref-type="bibr" rid="bib1.bibx31" id="paren.13"/>, Northern Europe <xref ref-type="bibr" rid="bib1.bibx35" id="paren.14"/>, Italy <xref ref-type="bibr" rid="bib1.bibx42" id="paren.15"/>, or China <xref ref-type="bibr" rid="bib1.bibx44" id="paren.16"/>. Common approaches include the use of Bayesian inversion models, atmospheric Eulerian transport models, or Lagrangian Particle Dispersion Models (LPDMs) to describe the influence areas for each measurement location <xref ref-type="bibr" rid="bib1.bibx26 bib1.bibx33" id="paren.17"/>, combined with multi-objective optimization algorithms such as Incremental Optimization <xref ref-type="bibr" rid="bib1.bibx28" id="paren.18"/>, Genetic Algorithms <xref ref-type="bibr" rid="bib1.bibx20" id="paren.19"/>, or Simulated Annealing <xref ref-type="bibr" rid="bib1.bibx29" id="paren.20"/>. In most proposed schemes, objective functions consider a range of factors, including geographical, seasonal, spatio-temporal, physical, and economic ones <xref ref-type="bibr" rid="bib1.bibx14" id="paren.21"/>.</p>
      <p id="d2e288"><xref ref-type="bibr" rid="bib1.bibx29" id="text.22"/> proposed an inversion model-based design approach using Incremental Optimization (compared with Simulated Annealing) for global observational networks, aiming to maximize constraints on CO<sub>2</sub> flux uncertainty using simulations from the Semi-Lagrangian NIES (National Institute for Environmental Studies, Tsukuba)/FRSGC global transport model. <xref ref-type="bibr" rid="bib1.bibx45" id="text.23"/> employed an inverse-mode Lagrangian particle dispersion model coupled with a Bayesian inversion framework, using simulated fields from the regional version of the Australian Community and Earth System Simulator (ACCESS), and found negligible influence from data outside the domain. The optimization method in the study by <xref ref-type="bibr" rid="bib1.bibx46" id="text.24"/> was incremental and multi-objective, accounting for establishment and maintenance costs. They observed that uncertainty reduction is maximized when designing a monitoring network without considering the existing network, and noted a saturation effect when increasing the number of observing stations beyond a certain point.</p>
      <p id="d2e308"><xref ref-type="bibr" rid="bib1.bibx20" id="text.25"/> developed a method to design an optimal GHG observing network that incorporates several factors beyond simulated data from the Weather Research and Forecasting Model with Coupled Chemistry (WRF-Chem) for the target GHG. Their approach used a Bayesian inversion model and a multi-objective genetic optimization algorithm in an incremental scheme to minimize both emission uncertainties and station measurement costs. Given that urban areas account for about 70 % of anthropogenic CO<sub>2</sub> emissions, several studies have attempted to design optimal urban networks to assess the effectiveness of emission reduction strategies (e.g., <xref ref-type="bibr" rid="bib1.bibx41" id="altparen.26"/>). <xref ref-type="bibr" rid="bib1.bibx32" id="text.27"/> sought to balance the quality and quantity of low-cost observing stations in Oakland, California, and surrounding areas for the Berkeley Atmospheric CO<sub>2</sub> Network (BEACO2N), using cost, reliability, accuracy, and systematic uncertainty to characterize their network. <xref ref-type="bibr" rid="bib1.bibx38" id="text.28"/> proposed a scheme for designing atmospheric monitoring networks based primarily on information theory (that is, statistical data processing) without employing the uncertainty reduction approach. This scheme can handle various types of atmospheric data and is computationally efficient, as it does not require repeated inversion of large matrices. However, it is constrained by the dimensions of certain matrices, which can lead to hardware memory overloads.</p>
      <p id="d2e341">The use of an inversion model is computationally expensive and often subject to restrictions and assumptions that limit its accuracy and reliability, factors already affected by the model's implicit regularization process. The assumptions made when defining prior error covariances frequently rely on subjective definitions, which can guide the optimization process and inter-station distances more than the atmospheric transport itself <xref ref-type="bibr" rid="bib1.bibx16" id="paren.29"/>. Furthermore, a successful application requires reframing the ill-posed problem as an approximation of a well-posed problem; such reframing must satisfy the well-known Hadamard conditions <xref ref-type="bibr" rid="bib1.bibx31 bib1.bibx13" id="paren.30"/>. Therefore, given recent advances in data analysis and processing, new solution techniques are emerging, offering alternatives to classical methods and representing a potential paradigm shift (e.g., <xref ref-type="bibr" rid="bib1.bibx19" id="altparen.31"/>). The approach proposed in our study (<italic>Concepteur de Réseaux Optimaux d’Observations Atmosphériques</italic> – CRO<sup>2</sup>A) not only seeks to avoid the inversion process and to evaluate alternative solutions, but also aims to change the problem formulation itself. The core concept is to observe and analyze trends in data simulated by transport models, selecting ground-based monitoring station locations that ensure the presence of significant GHG concentrations for the majority of the simulation period.</p>
      <p id="d2e365">For brevity, and because the scheme is described in detail in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/> and its foundations are presented in Appendices <xref ref-type="sec" rid="App1.Ch1.S2"/> and <xref ref-type="sec" rid="App1.Ch1.S3"/>, only the machine learning and clustering considerations that motivated its development are outlined here.</p>
      <p id="d2e374">For machine learning algorithms, several factors guide the appropriate choice, including prevention of overfitting and underfitting, data characteristics, and, critically, understanding the problem. Understanding the problem depends on recognizing its defining characteristics. In pattern recognition, when the training dataset consists solely of input values without corresponding outputs, the problem is typically addressed using unsupervised learning algorithms. These algorithms can reveal clusters of similar (or dissimilar) data points, a task referred to as clustering, which is equivalent to estimating the data point distribution (i.e., density estimation) <xref ref-type="bibr" rid="bib1.bibx8" id="paren.32"/>.</p>
      <p id="d2e380">Clustering algorithms commonly model similarity (or dissimilarity) using distance metrics (i.e., the distance paradigm). Depending on the chosen metric, substantially different results may be obtained according to the clustering criteria applied <xref ref-type="bibr" rid="bib1.bibx43 bib1.bibx9" id="paren.33"/>.</p>
      <p id="d2e386">Distance metrics are widely used due to their ease of implementation and broad acceptance within the scientific community. However, recent studies indicate they are less effective when handling noisy or irrelevant data points. Feature weighting has been suggested as a potential solution to this limitation, which becomes particularly pronounced for temporal or spatial series <xref ref-type="bibr" rid="bib1.bibx9" id="paren.34"/>.</p>
      <p id="d2e392">Given the problem characteristics, it is impossible to fully map the search space. Because the number of possible solutions increases factorially with the number of clusters, it is necessary to conduct an initial global search (exploration) followed by a local search (exploitation) of candidate solutions in that space. The success of this search depends on the heuristic strategy adopted and on the choice or definition of the objective (cost) function. This kind of search, which reflects the close relationship between design and optimization, is most frequently applied in engineering and often requires novel strategies supported by high-performance computational resources.</p>
      <p id="d2e396">Based on the above, the main characteristics of the data used in this research are presented in Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>. The applied methodology is then described in detail, almost as a user manual, through the three stages of the optimal design scheme in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/>. The results demonstrate the potential of the CRO<sup>2</sup>A optimal scheme in two applications, one at the urban scale and one at the regional scale, both around northeastern France and both focusing on quantifying fossil fuel CO<sub>2</sub> emissions (Sect. <xref ref-type="sec" rid="Ch1.S3"/>). For each application, CRO<sup>2</sup>A is used to: first (foreground analysis) explore the design of a network to monitor fossil fuel CO<sub>2</sub> emission signatures, and second (background analysis) address the need for measurements capable of capturing large-scale CO<sub>2</sub> input concentrations (background observations). Discussions and conclusions are provided in Sects. <xref ref-type="sec" rid="Ch1.S4"/> and <xref ref-type="sec" rid="Ch1.S5"/>, respectively. Finally, special features of the optimal design scheme are summarized in Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/>; technical notes on the clustering process underlying the CRO<sup>2</sup>A development are given in Appendix <xref ref-type="sec" rid="App1.Ch1.S2"/>; the formal definition of the clustering process and its characteristics is presented in Appendix <xref ref-type="sec" rid="App1.Ch1.S3"/>; an additional application at a different scale, together with a comparison against previously published results, is shown in Appendix <xref ref-type="sec" rid="App1.Ch1.S4"/>; and complementary proofs of results are provided in Appendix <xref ref-type="sec" rid="App1.Ch1.S5"/>.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Methods and materials</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Modeling and simulation of atmospheric CO<sub>2</sub> concentrations</title>
      <p id="d2e500">The data used in this study were simulated using the Weather Research and Forecasting model <xref ref-type="bibr" rid="bib1.bibx34 bib1.bibx12 bib1.bibx7" id="paren.35"/> with the passive tracer transport option of its chemistry module (WRF-Chem v3.9.1) to generate CO<sub>2</sub> concentration fields. For the regional application, the WRF-Chem domain was centered over the region of interest and extended beyond its boundaries toward Paris and neighboring countries. The model grid comprised 50 vertical levels and a horizontal resolution of <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula> km, resulting in <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:mn mathvariant="normal">201</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">150</mml:mn></mml:mrow></mml:math></inline-formula> grid points per level.</p>
      <p id="d2e539">For lateral boundary conditions, we used European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) hourly meteorological data <xref ref-type="bibr" rid="bib1.bibx6" id="paren.36"/> and Copernicus Atmosphere Monitoring Service (CAMS) global inversion-optimized GHG concentrations for CO<sub>2</sub> <xref ref-type="bibr" rid="bib1.bibx5" id="paren.37"/>.</p>
      <p id="d2e557">Anthropogenic fluxes were extracted from the high-resolution (<inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> km) TNO dataset for 2019 <xref ref-type="bibr" rid="bib1.bibx39" id="paren.38"/>, while biogenic fluxes were modelled using the Vegetation Photosynthesis and Respiration Model (VPRM; <xref ref-type="bibr" rid="bib1.bibx21" id="altparen.39"/>). To distinguish between CO<sub>2</sub> sources, we separated total concentrations into three components: anthropogenic, biogenic, and background. The anthropogenic signal was further decomposed to separate CO<sub>2</sub> originating from the Grand Est region from that of the rest of the domain. For the urban application, the model configuration was identical except for the resolution (<inline-formula><mml:math id="M29" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> km), fully nested within the parent domain used for the regional application.</p>
      <p id="d2e610">Although both anthropogenic and biogenic fields have been considered in this development, the latter present a unique challenge due to their specific characteristics, requiring the use of techniques complementary to those described herein; this first version of CRO<sup>2</sup>A focuses solely on anthropogenic fields. However, given the undeniable need to include biogenic fields, it is expected that these will be incorporated in a subsequent version.</p>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>The CRO<sup>2</sup>A optimal design scheme</title>
      <p id="d2e640">The processing performed by the three main stages of the CRO<sup>2</sup>A (version 1.0) optimal design scheme is described below, detailing each transformation applied to the data. Throughout these stages, an illustrative example is presented, assuming seven ground-based measurement stations (<inline-formula><mml:math id="M33" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:math></inline-formula>) within a masked version of the Grand Est region (northeastern France) to show both the inputs and outputs of each transformation. The proposed optimal design scheme is represented as a SISO (Single-Input, Single-Output) system in Fig. <xref ref-type="fig" rid="F1"/>, with its core based entirely on an inverse-weighted clustering process.</p>

      <fig id="F1" specific-use="star"><label>Figure 1</label><caption><p id="d2e668">Schematic flow chart of CRO<sup>2</sup>A illustrating the proposed scheme for designing optimal networks for monitoring atmospheric GHGs. The inversely weighted clustering-based learning scheme (black dashed line box) is detailed in three stages of development: pre-processing (green dashed line box), processing (yellow dashed line box), and post-processing (red dashed line box).</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f01.png"/>

        </fig>

<sec id="Ch1.S2.SS2.SSS1">
  <label>2.2.1</label><title>Pre-processing</title>
      <p id="d2e693">This initial stage treats the data as images to extract the maximum amount of useful information after applying cleaning, filtering, and masking processes (i.e., image processing techniques). As input, the CRO<sup>2</sup>A pre-processing stage primarily requires simulated spatio-temporal datasets (see Fig. <xref ref-type="fig" rid="F2"/>a and b) containing simulated GHG concentrations for a specific region (see Fig. <xref ref-type="fig" rid="F2"/>c), generated using the resources described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>.</p>

      <fig id="F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e713">First pseudo-color image of the simulated atmospheric CO<sub>2</sub> concentration dataset over the analyzed region at 50 m a.g.l. <bold>(a)</bold>, its three-dimensional surface with a contour plot underneath <bold>(b)</bold>, and its corresponding geographical region (northeastern France) delimited by the black dashed line <bold>(c)</bold>.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f02.png"/>

          </fig>

      <p id="d2e740">The following description corresponds to the sequence of procedures within the green dashed line box (pre-processing) in the flow chart of Fig. <xref ref-type="fig" rid="F1"/>. The first transformations are applied sequentially to reveal relevant information and aid in the data cleaning process.</p>
      <p id="d2e746">The variables of the selected tracer (the GHG to be analyzed) and the corresponding latitude and longitude coordinates delimiting the study region are defined and stored in matrices projected according to the Coordinate Reference System (CRS), here WGS 84/Pseudo-Mercator.</p>
      <p id="d2e749">In CRO<sup>2</sup>A, it is possible to select a subregion for analysis (i.e., an area within the boundaries of the input dataset) without recompilation. This can be done graphically or manually by specifying only the minimum and maximum latitudes and longitudes that bound the subregion.</p>
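      <p>As a minimal sketch of the manual option, a subregion can be cut from one frame by keeping only the rows and columns whose coordinates lie inside a bounding box. The function name <monospace>crop_to_bounds</monospace>, the one-dimensional latitude and longitude vectors (i.e., a regular grid), and the sample values are illustrative assumptions and not part of CRO<sup>2</sup>A.</p>
      <preformat preformat-type="code">
```python
def crop_to_bounds(frame, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """Return the part of a 2-D frame that lies inside a lat/lon bounding box.

    Assumption: the grid is regular, so one latitude per row and one
    longitude per column are enough to locate every pixel.
    """
    rows = [i for i, lat in enumerate(lats) if lat_max >= lat >= lat_min]
    cols = [j for j, lon in enumerate(lons) if lon_max >= lon >= lon_min]
    return [[frame[i][j] for j in cols] for i in rows]


# Hypothetical 4 x 3 frame with made-up coordinates (degrees).
lats = [47.0, 48.0, 49.0, 50.0]
lons = [3.0, 4.0, 5.0]
frame = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

sub = crop_to_bounds(frame, lats, lons, 48.0, 49.0, 4.0, 5.0)
print(sub)
```
      </preformat>
      <p>Because only index lists are computed, the same row and column selection can be reused for every frame of the time series.</p>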
      <p id="d2e761">Tracer data are converted into images in line with the proposed approach to leverage this representation. This is equivalent to treating the data as matrices allowing fast indexing, where each element specifies the color of a pixel in the image.</p>
      <p id="d2e764">The row and column indices of the elements determine the centers of the corresponding pixels and, in turn, the associated concentration value.</p>
      <p id="d2e767">The converted datasets are stored in a three-dimensional array whose first two dimensions define the image height and width and whose third dimension, the number of images, equals the length of the time vector. For example, from a (<inline-formula><mml:math id="M38" display="inline"><mml:mrow><mml:mn mathvariant="normal">150</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">201</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">768</mml:mn></mml:mrow></mml:math></inline-formula>) dataset, 768 frames or images (equal to the time vector length) can be obtained, as shown in Fig. <xref ref-type="fig" rid="F2"/>a, each with <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mn mathvariant="normal">150</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">201</mml:mn></mml:mrow></mml:math></inline-formula> pixels (height and width, respectively).</p>
      <p id="d2e800">In the pseudo-color plots, the logarithmic values of each matrix from the arrays are displayed as colored flat surfaces in the <inline-formula><mml:math id="M40" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>–<inline-formula><mml:math id="M41" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> plane. This procedure can be used for visualization and verification; however, each matrix can also be directly converted to a grayscale image without first converting it to a true-color RGB (Red–Green–Blue) image.</p>
      <p id="d2e818">This approach reduces storage requirements, as only one channel (grayscale) is needed instead of three (RGB) to represent the same information. In addition to storage savings, this method can significantly reduce the processing time of the proposed optimal design scheme and ensures that the result is independent of the color space chosen for graphical representation. For these reasons, every matrix in the array is converted to a grayscale image, as shown in Fig. <xref ref-type="fig" rid="F3"/>a, with values in the range [0, 1] corresponding to pixels from black to white, respectively.</p>
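      <p>The conversion of each matrix to a single-channel image with values in [0, 1] can be sketched as follows. The per-frame min-max normalization, the function name <monospace>to_grayscale</monospace>, and the sample concentrations are assumptions made for illustration; the text above does not specify which normalization CRO<sup>2</sup>A applies.</p>
      <preformat preformat-type="code">
```python
def to_grayscale(frame):
    """Min-max rescale one 2-D concentration frame to the range [0, 1].

    Assumption: each frame is normalized independently; 0 maps to black
    and 1 to white.
    """
    values = [v for row in frame for v in row]
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A constant frame carries no contrast; map it to black.
        return [[0.0 for _ in row] for row in frame]
    return [[(v - lowest) / span for v in row] for row in frame]


# Hypothetical time series of two 2 x 2 frames (made-up values).
frames = [
    [[400.0, 410.0], [420.0, 440.0]],
    [[405.0, 405.0], [405.0, 405.0]],
]

gray_frames = [to_grayscale(f) for f in frames]
print(gray_frames[0])
```
      </preformat>
      <p>Each pixel is stored as one number rather than three, which is the storage saving described above.</p>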
      <p id="d2e823">Once this transformation is applied to each image, a datastore is created to hold the entire collection. The datastore enables rapid processing without exceeding available memory, particularly when handling a large number of images.</p>
      <p id="d2e826">A further advantage of using datastores is that transformations can be applied to the entire collection simultaneously rather than individually in sequence.</p>
      <p id="d2e829">To isolate pixels with intensities close to the target concentrations (defined by instrument sensitivity or through contour analysis, as described below), it is recommended to adjust image contrast by applying a single intensity transformation uniformly to all pixel values.</p>
      <p id="d2e832">This adjustment shifts them toward either the bright or dark range, depending on the data and user-defined target, and is achieved through contrast stretching. Two contrast adjustment options are proposed: (i) logarithmic rescaling of pixel intensities according to concentration intensities (see Fig. <xref ref-type="fig" rid="F3"/>b), and (ii) histogram equalization (see Fig. <xref ref-type="fig" rid="F3"/>c).</p>
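      <p>The two contrast adjustment options can be illustrated with a short sketch operating on a grayscale image with values in [0, 1]. The logarithmic mapping log(1 + <italic>g</italic><italic>x</italic>)/log(1 + <italic>g</italic>), the gain <italic>g</italic> = 255, the 256-bin equalization, and both function names are assumptions chosen for illustration; they are common textbook forms and not necessarily the transformations implemented in CRO<sup>2</sup>A, which additionally uses a threshold from contour analysis.</p>
      <preformat preformat-type="code">
```python
import math


def log_rescale(gray, gain=255.0):
    """Logarithmic contrast stretch of intensities in [0, 1].

    Assumption: the mapping is log(1 + gain * x) / log(1 + gain), which
    keeps 0 at 0 and 1 at 1.
    """
    norm = math.log1p(gain)
    return [[math.log1p(gain * v) / norm for v in row] for row in gray]


def equalize(gray, bins=256):
    """Histogram equalization: replace each pixel by the cumulative
    fraction of pixels falling in its bin or in a darker bin."""
    values = [v for row in gray for v in row]
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / len(values))
    return [[cdf[min(int(v * bins), bins - 1)] for v in row] for row in gray]


# Hypothetical 2 x 2 grayscale image.
gray = [[0.0, 0.1], [0.1, 1.0]]
rescaled = log_rescale(gray)
equalized = equalize(gray)
print(equalized)
```
      </preformat>
      <p>Equalization depends only on the rank of each intensity within the image, whereas the logarithmic mapping depends on the intensity value itself; this is one way to see why the two options produce different histograms.</p>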

      <fig id="F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e842">Contrast adjustment of the grayscale version of Fig. <xref ref-type="fig" rid="F2"/>a <bold>(a)</bold>, its logarithmically rescaled version <bold>(b)</bold>, and its equalized version <bold>(c)</bold>.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f03.png"/>

          </fig>

      <p id="d2e862">In both cases, an optimal thresholding process is required, which can be considered a separate topic of study. The advantages and disadvantages of each method depend on the concentration values of interest, the size of the resulting dataset, and the associated processing time. Typically, the equalized version includes a larger proportion of the data, resulting in longer processing times than the scaled version. However, the latter should not be discarded; its use will depend on the particular application.</p>
      <p id="d2e865">The differences between the two options are shown in Fig. <xref ref-type="fig" rid="F4"/>, which presents the histogram transformations for versions of the same processed image. These figures highlight an implicit duality between the two transformations. Brighter pixels (intensity values close to 1) are more prominent in the equalized version (see Fig. <xref ref-type="fig" rid="F4"/>c) than in the rescaled version (see Fig. <xref ref-type="fig" rid="F4"/>b).</p>

      <fig id="F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e876">Histograms of the contrast adjustment of Fig. <xref ref-type="fig" rid="F2"/>a <bold>(a)</bold>, the compressed version by logarithmic rescaling <bold>(b)</bold>, and the uniformly distributed (decompressed) version by equalization <bold>(c)</bold>.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f04.png"/>

          </fig>

      <p id="d2e896">Such pixels are the focus of this procedure, though it is important to note that revealing more pixels increases the data volume and processing requirements, as already mentioned.</p>
      <p id="d2e900">In this case, by default, contrast adjustment is performed by logarithmic rescaling, using the value obtained from contour analysis of the concentration intensities in each image as a threshold. Once the grayscale image contrast is adjusted, the image is binarized (see Fig. <xref ref-type="fig" rid="F5"/>a) using the well-known Otsu method <xref ref-type="bibr" rid="bib1.bibx18 bib1.bibx3" id="paren.40"/>, which determines an optimal threshold corresponding to the minimum intraclass variance from a 256-bin histogram. Pixels with intensities above the global threshold are set to 1 (white), and those below are set to 0 (black). Together, contrast adjustment and binarization define the search (solution) space for each image. These transformations isolate the information of interest (white pixel regions with intensity 1 in Fig. <xref ref-type="fig" rid="F5"/>a), so that only target GHG concentrations are analyzed, while the remaining areas are ignored as irrelevant or noise.</p>
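      <p>For illustration, Otsu thresholding on a 256-bin histogram can be sketched as follows. This is a generic NumPy version, not the CRO<sup>2</sup>A code; it assumes intensities in [0, 1] and uses the standard equivalence between minimizing the intraclass variance and maximizing the between-class variance. The names <monospace>otsu_threshold</monospace> and <monospace>binarize</monospace> are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Return the Otsu threshold of an image with intensities in [0, 1]."""
    hist, edges = np.histogram(np.ravel(img), bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                  # weight of the dark class
    w1 = 1.0 - w0                      # weight of the bright class
    m0 = np.cumsum(p * centers)        # cumulative (unnormalized) mean
    mt = m0[-1]                        # global mean
    valid = (w0 > 0) & (w1 > 0)
    # Between-class variance; its maximum minimizes the intraclass variance.
    sigma_b = np.zeros(bins)
    sigma_b[valid] = (mt * w0[valid] - m0[valid]) ** 2 / (w0[valid] * w1[valid])
    k = int(np.argmax(sigma_b))
    return edges[k + 1]                # upper edge of the last dark bin

def binarize(img):
    """Set pixels above the Otsu threshold to 1 and all others to 0."""
    t = otsu_threshold(img)
    return (np.asarray(img) > t).astype(np.uint8)
```
]]></preformat>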

      <fig id="F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e912">Binarized version of Fig. <xref ref-type="fig" rid="F3"/>b <bold>(a)</bold> and the Grand-Est mask applied to every image <bold>(b)</bold>.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f05.png"/>

          </fig>

      <p id="d2e929">Once the binarized image collection is obtained, masking is applied to exclude all bodies of water or other undesired areas (when visible at the given resolution and dimensions), as these locations should not be part of the search space. In the illustrative example, the mask in Fig. <xref ref-type="fig" rid="F5"/>b is used to analyze only CO<sub>2</sub> concentrations within the Grand-Est region.</p>
      <p id="d2e943">Thus, the solution space for each image, initially defined through earlier transformations, is further restricted to account for the topographic conditions of the study area.</p>
      <p id="d2e946">The masking procedure uses a binarized mask image whose dimensions match those of the dataset images. Because the binary values (1 and 0) correspond to Boolean true and false, a logical AND operation is performed as an element-wise product between each image and the mask. This operation is native to computing systems and is computationally inexpensive compared with alternative methods.</p>
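      <p>The masking step can be sketched as an element-wise multiplication of two 0/1 arrays, which is equivalent to a logical AND. The function name <monospace>apply_mask</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def apply_mask(binary_img, mask):
    """Keep only pixels that are 1 in both the image and the mask."""
    binary_img = np.asarray(binary_img, dtype=np.uint8)
    mask = np.asarray(mask, dtype=np.uint8)
    if binary_img.shape != mask.shape:
        raise ValueError("image and mask must have the same dimensions")
    # For 0/1 values, the element-wise product equals a logical AND.
    return binary_img * mask
```
]]></preformat>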
      <p id="d2e950">As one of the final pre-processing steps, all resulting binarized images are accumulated (superimposed), producing a scoring matrix whose elements are the sums of pixel intensities at each position. The scoring matrix represents the spatial distribution of significant concentrations, expressed as the frequency of occurrence at each location. For example, for annual simulations (365 days), a matrix element with a value of 75 indicates that the location exhibits substantial (i.e., significant and therefore measurable) GHG concentrations for 75 % of the simulation period (approximately 274 d). The scoring matrix (Fig. <xref ref-type="fig" rid="F6"/>) depends directly on the pre-processed data and is used in the clustering process for weighting during both processing and post-processing stages.</p>
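      <p>A minimal sketch of the accumulation step is given below. It assumes that the binarized and masked frames are stacked in an array of shape (number of frames, height, width) and that the sums are normalized by the number of frames to obtain percentages, as suggested by the 0 % to 100 % scale of the scoring matrix. The function name <monospace>scoring_matrix</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def scoring_matrix(binary_images):
    """Percentage of frames in which each pixel is flagged (value 1)."""
    stack = np.asarray(binary_images, dtype=float)   # (n_frames, H, W)
    # Superimpose all frames, then express the counts as a percentage.
    return 100.0 * stack.sum(axis=0) / stack.shape[0]
```
]]></preformat>
      <p>With four frames, a pixel flagged in three of them receives a score of 75 %.</p>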

      <fig id="F6"><label>Figure 6</label><caption><p id="d2e957">Scoring matrix of concentrations (from 0 % to 100 %) representing the frequency of substantial concentration enhancements across the region analyzed.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f06.png"/>

          </fig>

      <p id="d2e966">Finally, before initiating the processing stage, the spatial reference that maps raster postings to geographic coordinates is saved together with the datastores. The distance metric (squared Euclidean distance by default), maximum number of iterations (ite), repetitions (run), replicates (rep), and saturation value (sat) are also defined; some of these serve as stopping criteria for decision-making and optimization steps (their use is described later).</p>
</sec>
<sec id="Ch1.S2.SS2.SSS2">
  <label>2.2.2</label><title>Processing</title>
      <p id="d2e979">This intermediate iterative stage aims to identify similar or dissimilar groups by clustering the pixels detected in each image from the previous stage. The resulting groups are characterized according to the patterns that generated them (i.e., pattern recognition techniques are applied to the data). The following description corresponds to the sequence of procedures within the blue dashed box in the flow chart shown in Fig. <xref ref-type="fig" rid="F1"/>.</p>
      <p id="d2e984">The CRO<sup>2</sup>A processing stage requires the output images from the pre-processing stage (see Fig. <xref ref-type="fig" rid="F5"/>a), the scoring matrix (see Fig. <xref ref-type="fig" rid="F6"/>), and the number of clusters under consideration (<inline-formula><mml:math id="M44" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>), corresponding to the number of ground-based measurement stations.</p>
      <p id="d2e1007">Using the binarized images, the transformed dataset identifies pixels with an intensity of 1 (colored and marked pixels in Fig. <xref ref-type="fig" rid="F7"/>). For each image, these data points define the search space for the solution, as they represent the locations of the relevant concentrations (targets).</p>

      <fig id="F7"><label>Figure 7</label><caption><p id="d2e1015">Univariate histogram (marginal distributions) of Fig. <xref ref-type="fig" rid="F5"/>a used to define the starting points of the clustering.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f07.png"/>

          </fig>

      <p id="d2e1026">To apply the clustering process to these data points, the clustering algorithms must be initialized by defining starting points (i.e., a priori positions of ground-based measurement stations) that act as initial centroids in iteration zero. This initialization is commonly performed by selecting <inline-formula><mml:math id="M45" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> points uniformly distributed within the search space, representing the simplest among several possible options.</p>
      <p id="d2e1036">Each clustering problem constitutes an optimization problem, and the present development assumes a global solution in both search space and values. To reduce the risk of converging to a local minimum, and given that the underlying algorithms (<inline-formula><mml:math id="M46" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means and <inline-formula><mml:math id="M47" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids) are sensitive to initial conditions, the starting points for clustering are defined using information from the marginal distributions of each image along its width (horizontal axis) and height (vertical axis). These are obtained as univariate histograms (see Fig. <xref ref-type="fig" rid="F7"/>) and show the densities of pixels (i.e., data points) per column and per row.</p>
      <p id="d2e1055">This initialization considerably reduces processing time and the number of iterations, as the starting points are typically located close to the highest concentrations. This positioning guides the search vectors during solution exploration, thereby requiring fewer iterations to converge to an optimal solution. The size of the starting point matrix is <inline-formula><mml:math id="M48" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>, determined by the number of clusters tested (<inline-formula><mml:math id="M49" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>) and the number of features, which is 2 in this case: latitude (<inline-formula><mml:math id="M50" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula>-projection) and longitude (<inline-formula><mml:math id="M51" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>-projection) for each point. In other words, the number of starting points equals the number of ground-based measurement stations in the monitoring network.</p>
      <p id="d2e1091">Once the pixels are identified, a primary grid is superimposed on the image, with the number of divisions in both height and width equal to <inline-formula><mml:math id="M52" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, resulting in <inline-formula><mml:math id="M53" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> cells (see vertical and horizontal black lines in Fig. <xref ref-type="fig" rid="F8"/>). For each division in width and height, the maximum value of the corresponding portion of the marginal distribution is selected.</p>

      <fig id="F8"><label>Figure 8</label><caption><p id="d2e1117">Primary grid corresponding to the first binarized and masked image of the dataset (see Fig. <xref ref-type="fig" rid="F7"/>) (black lines), the overlapping marginal distributions for each row (red) and column (blue), and maximal values of the marginal distributions (green) for each vertical and horizontal division (secondary grid). Starting point candidates are at green line intersections; selected starting points are shown as black circles.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f08.png"/>

          </fig>

      <p id="d2e1128">A secondary grid is then drawn, representing the locations most likely to host ground-based measurement stations (intersections of the green dashed lines in Fig. <xref ref-type="fig" rid="F8"/>).</p>

      <fig id="F9" specific-use="star"><label>Figure 9</label><caption><p id="d2e1135">Data points associated with the different clusters and their corresponding centroids (black triangles) for the first processed image <bold>(a)</bold> and the collection of centroids resulting from processing all images in the dataset <bold>(b)</bold>.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f09.png"/>

          </fig>

      <p id="d2e1150">Depending on the image characteristics, up to <inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>k</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> starting point candidates can be obtained (i.e., the maximum number of internal intersections between the horizontal and vertical grid lines). However, even though the theoretical maximum is <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>k</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, candidates are discarded if they do not belong to the search space (i.e., the identified pixels). In some cases, certain cells may contain no identified pixels from the previous procedure, reducing the number of available starting points to fewer than <inline-formula><mml:math id="M56" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>k</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e1210">If the number of available candidates is greater than or equal to <inline-formula><mml:math id="M57" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, exactly <inline-formula><mml:math id="M58" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> are randomly selected and ordered. To diversify the clustering process, a starting point array of size <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:math></inline-formula> can be created by assigning random permutations of the selected points. This strategy allows the clustering process to be replicated with a different ordering of starting points, to which the clustering algorithms are also sensitive.</p>
      <p id="d2e1244">If the number of candidates is less than <inline-formula><mml:math id="M60" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, those options are discarded and the starting points are instead selected uniformly at random within the search space.</p>
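      <p>The selection of starting points from the marginal distributions can be sketched as follows. This is an illustrative reading of the procedure, not the CRO<sup>2</sup>A implementation: the function name <monospace>starting_points</monospace> and the parameter <monospace>n_div</monospace> (number of grid divisions per axis, set to <monospace>k</monospace> by default) are assumptions, and the exact grid construction that yields the stated upper bound on candidates may differ. The fallback assumes that the image contains at least <monospace>k</monospace> identified pixels.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def starting_points(binary_img, k, n_div=None, rng=None):
    """Pick k starting points (x, y) guided by the marginal distributions."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(binary_img)
    n_div = k if n_div is None else n_div
    col_density = img.sum(axis=0)    # marginal distribution along the width
    row_density = img.sum(axis=1)    # marginal distribution along the height

    def peaks(density):
        # Index of the maximum of the marginal inside each grid division.
        bounds = np.linspace(0, density.size, n_div + 1).astype(int)
        return [lo + int(np.argmax(density[lo:hi]))
                for lo, hi in zip(bounds[:-1], bounds[1:]) if hi > lo]

    xs, ys = peaks(col_density), peaks(row_density)
    # Secondary-grid intersections that fall on identified pixels.
    cand = [(x, y) for y in ys for x in xs if img[y, x] == 1]
    cand = np.array(cand, dtype=int).reshape(-1, 2)
    if len(cand) >= k:
        return cand[rng.choice(len(cand), size=k, replace=False)]
    # Fallback: draw k identified pixels uniformly at random.
    pix = np.argwhere(img == 1)[:, ::-1]    # (row, col) to (x, y)
    return pix[rng.choice(len(pix), size=k, replace=False)]
```
]]></preformat>
      <p>Every returned point lies on an identified pixel, whether it comes from the grid intersections or from the random fallback.</p>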
      <p id="d2e1254">Since the number of candidates for the first analyzed image in the illustrative example is greater than <inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:math></inline-formula>, the randomly selected starting points are shown as black circles in Fig. <xref ref-type="fig" rid="F8"/>. Each selected starting point can be verified to belong to the identified pixels.</p>
      <p id="d2e1271">Before starting the clustering process, if any ground-based measurement station is already installed in the analyzed region and the user wishes to include it, its location can be added to the dataset by applying a multiplicity concatenation equivalent to 2.5 % of the total number of images.</p>
      <p id="d2e1274">The clustering algorithm then partitions the dataset into <inline-formula><mml:math id="M62" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> clusters and returns: a vector containing the cluster index of each data point; the <inline-formula><mml:math id="M63" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> cluster centroid locations; the within-cluster sums of point-to-centroid distances; and the distances from each point to every centroid. In other words, the clustering process characterizes the data points identified in each image according to patterns defined by the distance relationships between each data point and the centroids. Such characterization can be used to identify the most appropriate cluster for each point, considering all possible clusterings that meet an optimal criterion. The optimal criterion is based on the distance between a pixel and its cluster centroid, the distances between centroids, and the scoring matrix value (weight) at the pixel location.</p>
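      <p>The four outputs listed above can be illustrated with a plain Lloyd-type <monospace>k</monospace>-means using the squared Euclidean distance. This sketch is a simplification under stated assumptions: it omits the scoring matrix weights and the inter-centroid distance term of the optimal criterion, and the function name <monospace>kmeans</monospace> and its signature are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def kmeans(points, start, max_iter=100):
    """Lloyd's k-means with squared Euclidean distance.

    Returns (labels, centroids, sumd, dist), where dist[i, j] is the squared
    distance from point i to centroid j and sumd[j] is the within-cluster
    sum of point-to-centroid distances for cluster j.
    """
    X = np.asarray(points, dtype=float)
    C = np.asarray(start, dtype=float).copy()
    k = len(C)
    for _ in range(max_iter):
        dist = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        new_C = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                          else C[j] for j in range(k)])
        if np.allclose(new_C, C):
            break
        C = new_C
    dist = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    sumd = np.array([dist[labels == j, j].sum() for j in range(k)])
    return labels, C, sumd, dist
```
]]></preformat>
      <p>Here <monospace>start</monospace> plays the role of the starting point matrix, so different orderings or selections of starting points can be tested by calling the function repeatedly.</p>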
      <p id="d2e1291">This stage operates under the hypothesis that the location of the ground-based measurement station (the cluster centroid) should be as close as possible to the source of GHGs (highest concentration) to ensure accurate measurements and improve subsequent data inversion, if required. This is the same hypothesis used to define the clustering starting points. The gradual formation of the seven clusters is shown through the changing colored groups of identified pixels.</p>
      <p id="d2e1294">In Fig. <xref ref-type="fig" rid="F9"/>a, the processing results for the first image in the dataset are shown after 27 iterations. The seven clusters obtained (<inline-formula><mml:math id="M64" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:math></inline-formula>) are displayed in different colors. The <inline-formula><mml:math id="M65" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>–<inline-formula><mml:math id="M66" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> projection of each cluster centroid is indicated by black triangles, representing the optimal locations for ground-based measurement stations according to the first image (see Fig. <xref ref-type="fig" rid="F9"/>a).</p>
      <p id="d2e1328">The relationship between the randomly selected starting points and the pseudo-global centroids confirms the hypothesis of spatial closeness and convergence proposed through the use of marginal distributions.</p>
      <p id="d2e1331">It should be noted that the optimal clustering is obtained without requiring clusters to contain the same number of members; instead, it depends on the distances between centroids and members, as well as the scoring matrix values at the member locations. Maximizing inter-centroid distances produces well-separated clusters that can form, expand, contract, and exchange members dynamically. Centroid tendencies align with regions of highest GHG concentration and correspond to dense pixel areas in the scoring matrix.</p>
      <p id="d2e1334">In this case, initializing starting points using the proposed strategy results in a low number of iterations to reach an optimal solution, as outcomes tend to follow the stated hypothesis. Convergence is thus relatively fast, reducing computational cost and reinforcing the hypothesis. As a result of the processing stage, after clustering each image in the dataset, the resulting centroids (see Fig. <xref ref-type="fig" rid="F9"/>b), referred to as local centroids, are stored for use in the post-processing stage. The centroids shown for the first image in Fig. <xref ref-type="fig" rid="F9"/>a are elements of the cluster sets in Fig. <xref ref-type="fig" rid="F9"/>b, or the set of local centroids. In total, 5376 local centroids are identified in Fig. <xref ref-type="fig" rid="F9"/>b, as the illustrative example comprises 768 frames with <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:math></inline-formula> clusters each.</p>
</sec>
<sec id="Ch1.S2.SS2.SSS3">
  <label>2.2.3</label><title>Post-processing</title>
      <p id="d2e1365">This final stage performs a general cluster analysis, estimates clusters, determines centroids, and displays optimal results. In addition to pattern recognition techniques, optimization and operations research methods are applied to the results collected in the processing stage. The following description refers to the sequence of procedures within the dashed red box in the flow chart in Fig. <xref ref-type="fig" rid="F1"/>.</p>
      <p id="d2e1370">As input, the CRO<sup>2</sup>A post-processing stage requires the output data from the processing stage (local centroids of the images in Fig. <xref ref-type="fig" rid="F9"/>b) and, as in the processing stage, the scoring matrix and the number of clusters to be evaluated (<inline-formula><mml:math id="M69" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>). Although the processed data from both stages appear similar, they represent different information. In the processing stage, the spatio-temporal distributions of GHG concentrations were used (sets of images such as those shown in Fig. <xref ref-type="fig" rid="F5"/>a).</p>
      <p id="d2e1393">In the post-processing stage, the collection of local centroids from each image is used (a single image as shown in Fig. <xref ref-type="fig" rid="F9"/>b).</p>
      <p id="d2e1398">Building on the learning from the processing stage, the cluster analysis here again considers the large dataset size; however, it now processes representative entities (local centroids) rather than individual data points analyzed previously.</p>
      <p id="d2e1402">Once the overall results of the new clustering process are obtained (global centroids; see Fig. <xref ref-type="fig" rid="F10"/>a), i.e., after clustering the data from the processing stage, percentage scores for each ground-based measurement station location and for the monitoring network are calculated using the scoring matrix, as shown in Fig. <xref ref-type="fig" rid="F10"/>b. Promising permutations of the global centroids are then used to further explore the search space. Alternative solutions are tested around the general solution, i.e., reclustering with different permuted and modified starting points. This approach is analogous to forcing the Pareto profile in a multi-objective optimization, but at low computational cost. After at least 25 repetitions, to ensure adequate statistical reliability, the clustering with the highest mean score for a specific value of <inline-formula><mml:math id="M70" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> is considered the optimal solution (<inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>), provided it outperforms the general solution; otherwise, the general solution is retained due to the lack of improvement.</p>

      <fig id="F10" specific-use="star"><label>Figure 10</label><caption><p id="d2e1429">Resulting optimal centroids (black triangles) for all images in the dataset <bold>(a)</bold>, and their corresponding score (performance) percentages <bold>(b)</bold>.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f10.png"/>

          </fig>

      <p id="d2e1444">Finally, CRO<sup>2</sup>A outputs the geographic coordinates of the <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> ground-based measurement stations, their individual scores, and the score of the optimally selected monitoring network.</p>
      <p id="d2e1467">It should be noted that both clustering algorithms (<inline-formula><mml:math id="M74" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids and <inline-formula><mml:math id="M75" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means) require the number of clusters, <inline-formula><mml:math id="M76" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, used to represent the dataset to be defined before the algorithm starts.</p>
      <p id="d2e1491">Consequently, the algorithm iteratively modifies the locations of the centroids of these <inline-formula><mml:math id="M77" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> predefined clusters, based on the calculation of the mean, which reflects the density distribution of the dataset. Given this characteristic, a trend analysis is implemented using the basic sequential algorithmic scheme (BSAS) <xref ref-type="bibr" rid="bib1.bibx37" id="paren.41"/>. This approach is proposed to address the persistent and subjective question of the minimum number of clusters required for adequate clustering (i.e., the number of ground-based measurement stations in the network and their corresponding locations in the region under analysis).</p>
      <p id="d2e1505">Through this analysis and batch processing, as illustrated in the first panel in Fig. <xref ref-type="fig" rid="F11"/>, an interval of clusters is tested sequentially and incrementally (in the example, [2, 15]), providing information on the overall performance of the monitoring network as a function of the number of ground-based measurement stations.</p>

      <fig id="F11" specific-use="star"><label>Figure 11</label><caption><p id="d2e1512">Logistic fit (S-shaped curve) of network performance as a function of the number of monitoring towers. The minimal number of ground-based measurement stations (<inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>) corresponds to the calculated threshold value.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f11.png"/>

          </fig>

      <p id="d2e1532">The results from this analysis are fitted to a sigmoidal curve, and the procedure presented by <xref ref-type="bibr" rid="bib1.bibx23" id="text.42"/> is applied to determine the optimal threshold (<inline-formula><mml:math id="M79" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>). This threshold indicates the minimal number of ground-based measurement stations required for the monitoring network (see first panel in Fig. <xref ref-type="fig" rid="F11"/>). The sigmoidal function selected in this case is particularly useful for modeling artificial neural networks. It typically describes complex systems characterized by learning curves with rapidly ascending intermediate transitions from low levels to a saturation point at high levels after a significant increase in the independent variable. A specific case of the sigmoidal function is the logistic function, widely used in regression models. The logistic function representing the fitted model and its first two derivatives are used to calculate the optimal threshold (see the first and second panel in Fig. <xref ref-type="fig" rid="F11"/>, respectively) according to the procedure proposed by <xref ref-type="bibr" rid="bib1.bibx23" id="text.43"/>, which makes use of intersections between certain straight lines (including the slope line at the midpoint of the logistic curve, obtained by means of the derivative in the second panel in Fig. <xref ref-type="fig" rid="F11"/>) to calculate the baroreflex threshold and saturation points. It should be noted that the performance represented is normalized; therefore, the vertical axes of this figure and its first two derivatives, shown in Fig. <xref ref-type="fig" rid="F11"/>, are dimensionless.</p>
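      <p>As an illustration of how straight-line intersections can yield threshold and saturation points on a logistic curve, the sketch below intersects the tangent at the midpoint with the lower and upper plateaus. This is one simple construction, given here as an assumption; the cited procedure may use different lines or constants and should be consulted for the exact definition. The parameter names (<monospace>lower</monospace>, <monospace>upper</monospace>, <monospace>slope</monospace>, <monospace>mid</monospace>) and function names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math

def logistic(x, lower, upper, slope, mid):
    """Four-parameter logistic curve with plateaus `lower` and `upper`."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (x - mid)))

def tangent_plateau_intersections(lower, upper, slope, mid):
    """Intersect the midpoint tangent with both plateaus.

    Returns (x_threshold, x_saturation), i.e. mid - 2/slope and mid + 2/slope.
    """
    y_mid = 0.5 * (lower + upper)              # curve value at x = mid
    dy_mid = slope * (upper - lower) / 4.0     # first derivative at x = mid
    x_thr = mid + (lower - y_mid) / dy_mid     # tangent meets lower plateau
    x_sat = mid + (upper - y_mid) / dy_mid     # tangent meets upper plateau
    return x_thr, x_sat
```
]]></preformat>
      <p>Under this construction, the two points are symmetric about the midpoint of the curve and depend only on the fitted slope parameter.</p>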
      <p id="d2e1561">The main metric is based on the scoring matrix, in which a value of 100 % means that a ground-based measurement station at that location would lie within a signal of considerable intensity for the entire simulation time (depending on the dataset used).</p>
      <p id="d2e1564">Therefore, a performance score of 1 means that every ground-based measurement station is located at a point where signals of considerable intensity are present at all times. In other words, a performance score of 1 means that, according to the optimality criteria of CRO<sup>2</sup>A, the locations obtained cannot be improved.</p>
      <p id="d2e1577">Note the tendency towards saturation in Fig. <xref ref-type="fig" rid="F11"/>. Although it is analyzed in more detail later, this behavior coincides with one of the conclusions of the methodology proposed by <xref ref-type="bibr" rid="bib1.bibx45" id="text.44"/>, which follows an incremental scheme: beyond a certain number of ground-based measurement stations, additional stations do not contribute significantly to improving the system.</p>
      <p id="d2e1585">According to Fig. <xref ref-type="fig" rid="F11"/>, the illustrative example developed through these three stages shows that the minimal number of monitoring towers for the masked region corresponds to the calculated threshold. The optimal value for the overall performance of the network is approximately 57.14 %, as shown in Fig. <xref ref-type="fig" rid="F10"/>b. As presented in these sections, complete processing of these large datasets is achieved through the training and validation of a machine learning system capable of generating optimal designs for atmospheric GHG monitoring networks based on such datasets. This unsupervised machine learning scheme adapts proportionally and progressively to the amount of data available during processing, improving its performance even under uncertainty.</p>
      <p id="d2e1592">As noted at the outset, CRO<sup>2</sup>A is structured in three stages, each functioning as an analysis module. Partial results are saved after each stage, allowing processing to be paused and resumed later. It is also important to note that once pre-processing is completed, it need not be repeated for either a single monitoring tower configuration or for trend analysis.</p>

      <fig id="F12" specific-use="star"><label>Figure 12</label><caption><p id="d2e1606">Algorithmic complexity analysis of processing times for the three stages of the CRO<sup>2</sup>A design scheme.</p></caption>
            <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f12.png"/>

          </fig>

      <p id="d2e1624">Finally, Fig. <xref ref-type="fig" rid="F12"/> presents the algorithmic complexity analysis of processing times for the three stages of the CRO<sup>2</sup>A design scheme. Figure <xref ref-type="fig" rid="F12"/>a shows the processing times as a function of the number of images in the pre-processing stage and their corresponding curve fitting. The linear relationship between these variables is notable. Despite this characteristic, for datasets with larger numbers of images, the pre-processing stage does not impose a higher computational cost due to the use of datastores. Figure <xref ref-type="fig" rid="F12"/>b shows the processing times for all three stages as a function of the number of clusters (or ground-based measurement stations). The processing times for the pre-processing stage remain constant, as they are independent of the number of clusters (<inline-formula><mml:math id="M84" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>). The most evident variations occur for <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>&lt;</mml:mo><mml:mi>k</mml:mi><mml:mo>&lt;</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:math></inline-formula>, primarily due to hardware adjustments during GPU initialization. The processing times for the processing and post-processing stages differ by at least two orders of magnitude, owing to the large number of images handled in the processing stage of the design scheme.</p>
</sec>
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Numerical results</title>
      <p id="d2e1676">As mentioned, two applications of the CRO<sup>2</sup>A design scheme are presented below: one at the urban level and the other at the regional level. These applications address different land areas, resolutions, and spatio-temporal concentration patterns.</p>
      <p id="d2e1688">For each, the results of the trend, foreground, and background analyses are shown. To limit computational resource requirements, two one-month simulations for October 2022 and 2023 were used. While this constrains the absolute representativeness of the proposed networks, the simulations are sufficient to demonstrate the applicability and utility of the optimal design scheme.</p>
      <p id="d2e1691">Designing an operational network would require longer simulations encompassing all meteorological conditions across the four seasons. First, for the foreground analysis, the scheme described in the previous section is applied. The target tracer (the GHG to be analyzed) is selected, along with the interval for the number of monitoring towers to be tested. Second, for the background analysis, the scheme is applied directly to the background tracer using the optimal value obtained from the foreground analysis, with the special masking mode activated. This mode selects the tracer used in the foreground analysis, restricting the search space for the background tracer to locations with significant GHG concentrations. The details of the special masking feature are provided in Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/>. Third, the results from both the foreground and background analyses are compiled, and the corresponding figures and coordinates of the elements comprising each designed network are presented.</p>
      <p id="d2e1696">The background analysis uses the same number of monitoring towers as determined in the foreground analysis (i.e., creating a one-to-one network). However, at expert discretion, some of the ground-based measurement stations proposed in the background analysis may be removed following appropriate evaluation, primarily based on their separation distances.</p>
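      <p>The distance-based screening mentioned above can be illustrated with a minimal sketch. It computes great-circle (haversine) distances between the three urban background stations listed in Table <xref ref-type="table" rid="T1"/> and flags pairs that lie closer together than a chosen separation. The function names, the spherical-Earth radius, and the separation threshold are assumptions made for illustration only; they are not part of CRO<sup>2</sup>A, and the final decision remains with the expert.</p>
<preformat preformat-type="code">
```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Background stations (longitude, latitude) taken from Table 1.
background = [(4.2177, 49.326), (4.1617, 49.149), (3.8167, 49.240)]


def close_pairs(stations, min_separation_km):
    """Return (i, j, distance) for station pairs closer than the threshold.

    The threshold is a hypothetical, expert-chosen value.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(stations), 2):
        d = haversine_km(a[0], a[1], b[0], b[1])
        if min_separation_km > d:
            pairs.append((i, j, d))
    return pairs
```
</preformat>
      <p>Calling <monospace>close_pairs(background, min_separation_km)</monospace> with an expert-chosen threshold returns the candidate pairs for review; an empty list indicates that no background station would be flagged on distance alone.</p>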
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Urban-scale application</title>
      <p id="d2e1707">For the first application, the CRO<sup>2</sup>A design scheme was applied to the urban area of Reims (Marne, France), using atmospheric CO<sub>2</sub> concentrations simulated by WRF-Chem at 1 km resolution. The fossil fuel component (individual tracer) was extracted from the simulation to isolate the urban fossil fuel signal, while the background analysis included the large-scale inflow and local biogenic fluxes. Figure <xref ref-type="fig" rid="F13"/> shows the logistic fitting curve of the trend analysis results at the urban level. The tested range for the number of monitoring towers was [2, 10] with a unit step. The resulting threshold represents the minimal number of ground-based measurement stations required for the region under study (in this case, at least three according to CRO<sup>2</sup>A). Figure <xref ref-type="fig" rid="F14"/>a and b shows the resulting clusters and their corresponding global centroids for a network composed of three ground-based measurement stations (<inline-formula><mml:math id="M90" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula>) for the foreground and background networks, respectively.</p>

      <fig id="F13" specific-use="star"><label>Figure 13</label><caption><p id="d2e1759">Logistic fit (S-shaped curve) of the urban network performance as a function of the number of monitoring towers. The minimal number of ground-based measurement stations (<inline-formula><mml:math id="M91" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>) corresponds to the calculated threshold value.</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f13.png"/>

        </fig>

      <fig id="F14" specific-use="star"><label>Figure 14</label><caption><p id="d2e1781">Resulting optimal centroids (black triangles) and their corresponding clusters for all images in the dataset for both the foreground <bold>(a)</bold> and background <bold>(b)</bold> networks (urban level), location of the optimal centroids relative to the emission field <bold>(c)</bold>, and according to the scoring matrix <bold>(d)</bold>.</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f14.png"/>

        </fig>

      <p id="d2e1803">Similarly, Fig. <xref ref-type="fig" rid="F14"/>c and d shows the locations of the optimal centroids relative to the emissions field for the analyzed region at the urban scale and the scoring matrix.</p>
      <p id="d2e1808">A similar analysis, including pre-existing ground-based measurement stations (black triangles in Fig. <xref ref-type="fig" rid="FA1"/>a and b), is presented in Appendix <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/>. The three foreground locations correspond to (i) the downtown urban area, (ii) the sugar factory located to the northeast of the city, and (iii) a site to the northwest where both the city plume from Reims and the sugar factory plume converge.</p>
      <p id="d2e1815">When a fourth measurement location is added (Appendix <xref ref-type="sec" rid="App1.Ch1.S5"/>), it is positioned to the southeast of the city near a smaller sugar factory, enabling separation of the urban plume from the second sugar factory plume, as shown in Fig. <xref ref-type="fig" rid="FE1"/>.</p>

<table-wrap id="T1" specific-use="star"><label>Table 1</label><caption><p id="d2e1825">Coordinates of the optimal station locations for the urban-scale analysis, corresponding to Fig. <xref ref-type="fig" rid="F14"/>. Two column groups are presented: the first (Foreground) for the main monitoring network and the second (Background) for the background network. Both networks contain three ground-based monitoring stations, since they are designed as one-to-one networks and because 3 is the minimal value (threshold in Fig. <xref ref-type="fig" rid="F13"/>) obtained from the analysis. The highest performance for each of the networks (foreground and background) is highlighted in bold.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="7">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:colspec colnum="5" colname="col5" align="center"/>
     <oasis:colspec colnum="6" colname="col6" align="center"/>
     <oasis:colspec colnum="7" colname="col7" align="center"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry rowsep="1" namest="col1" nameend="col3" align="center">Foreground </oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry rowsep="1" namest="col5" nameend="col7">Background </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Longitude</oasis:entry>
         <oasis:entry colname="col2">Latitude</oasis:entry>
         <oasis:entry colname="col3">Performance</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">Longitude</oasis:entry>
         <oasis:entry colname="col6">Latitude</oasis:entry>
         <oasis:entry colname="col7">Performance</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3">(%)</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"/>
         <oasis:entry colname="col7">(%)</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">4.0358</oasis:entry>
         <oasis:entry colname="col2">49.240</oasis:entry>
         <oasis:entry colname="col3"><bold>60.101</bold></oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">4.2177</oasis:entry>
         <oasis:entry colname="col6">49.326</oasis:entry>
         <oasis:entry colname="col7">54.057</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">3.9099</oasis:entry>
         <oasis:entry colname="col2">49.371</oasis:entry>
         <oasis:entry colname="col3">28.100</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">4.1617</oasis:entry>
         <oasis:entry colname="col6">49.149</oasis:entry>
         <oasis:entry colname="col7"><bold>58.921</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">4.1524</oasis:entry>
         <oasis:entry colname="col2">49.368</oasis:entry>
         <oasis:entry colname="col3">42.042</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">3.8167</oasis:entry>
         <oasis:entry colname="col6">49.240</oasis:entry>
         <oasis:entry colname="col7">58.777</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e1987">Background tower locations, identified by proximity and low scoring matrix values, are situated to the west of the city, where CO<sub>2</sub> enhancements are minimal, and upwind of the two sugar factories. When the two pre-existing ground-based measurement stations are considered, the optimal network size increases to <inline-formula><mml:math id="M93" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula>, indicating that the current tower locations are not optimal (outside the main city plume). However, a longer simulation period would be required to draw conclusions regarding the overall effectiveness of the current towers in measuring urban plumes. The coordinates of the optimal ground-based measurement station locations (urban scale) and their corresponding performance values are provided in Table <xref ref-type="table" rid="T1"/>. The mask used in the background analysis for this application is shown in Fig. <xref ref-type="fig" rid="F15"/>.</p>

      <fig id="F15"><label>Figure 15</label><caption><p id="d2e2022">Binarized and inverted image of the score matrix used to mask the background tracer in the background analysis for the urban-scale application.</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f15.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Regional-scale application</title>
      <p id="d2e2039">For the second application, a regional tower network was considered, centered on the <italic>Grand Est</italic> region, which is dominated by croplands and forests and contains several large metropolitan areas, including Paris, Strasbourg, Nancy, Metz, and Reims in France; Frankfurt and Karlsruhe in Germany; and Basel and Zurich in Switzerland.</p>
      <p id="d2e2045">The study domain also includes several industrial areas, such as car manufacturing and highway traffic in Alsace, and large industries in the Ruhr Valley along the Rhine River, which produce noticeable atmospheric CO<sub>2</sub> plumes.</p>
      <p id="d2e2057">The WRF-Chem model configuration, described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>, was run at 3 km resolution for two months (October 2022 and 2023). Results excluding the existing ICOS tower network are presented here, while the corresponding analysis including the current ICOS station locations (black triangles in Fig. <xref ref-type="fig" rid="FA2"/>a and b) is presented in Appendix <xref ref-type="sec" rid="App1.Ch1.S1.SS2"/>.</p>
      <p id="d2e2068">Figure <xref ref-type="fig" rid="F16"/> shows the logistic fit of the trend analysis results over the regional domain. The calculated threshold represents the minimal number of ground-based measurement stations required for the region under study (in this case, at least nine according to CRO<sup>2</sup>A processing). Also shown are the first and second derivatives of the fitting curve (second subplot), from which the slope information used in the first subplot is derived.</p>
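      <p>The relationship between a logistic curve and its first and second derivatives can be sketched as follows, assuming the standard three-parameter logistic form. The parameter names and values (<monospace>TOP</monospace>, <monospace>RATE</monospace>, <monospace>K_MID</monospace>) are hypothetical and are not the fitted values behind Fig. <xref ref-type="fig" rid="F16"/>; the rule by which CRO<sup>2</sup>A derives the threshold from the slope information is not reproduced here.</p>
<preformat preformat-type="code">
```python
import math


def logistic(k, top, rate, k_mid):
    """Three-parameter logistic curve: top / (1 + exp(-rate * (k - k_mid)))."""
    return top / (1.0 + math.exp(-rate * (k - k_mid)))


def logistic_d1(k, top, rate, k_mid):
    """First derivative: rate * f * (1 - f / top)."""
    f = logistic(k, top, rate, k_mid)
    return rate * f * (1.0 - f / top)


def logistic_d2(k, top, rate, k_mid):
    """Second derivative: rate**2 * f * (1 - f / top) * (1 - 2 * f / top)."""
    f = logistic(k, top, rate, k_mid)
    return rate ** 2 * f * (1.0 - f / top) * (1.0 - 2.0 * f / top)


# Hypothetical parameters (illustrative only, not taken from the paper).
TOP, RATE, K_MID = 100.0, 1.2, 3.0

# Tabulate the curve and its derivatives over a tested range of tower numbers.
for k in range(2, 11):
    print(k,
          round(logistic(k, TOP, RATE, K_MID), 3),
          round(logistic_d1(k, TOP, RATE, K_MID), 3),
          round(logistic_d2(k, TOP, RATE, K_MID), 3))
```
</preformat>
      <p>In this form the slope is largest at the midpoint <monospace>K_MID</monospace>, where the second derivative changes sign; beyond it, each additional tower yields a progressively smaller gain in performance.</p>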

      <fig id="F16" specific-use="star"><label>Figure 16</label><caption><p id="d2e2085">Logistic fit (S-shaped curve) of the regional network performance as a function of the number of monitoring towers. The minimal number of ground-based measurement stations (<inline-formula><mml:math id="M96" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>) corresponds to the calculated threshold value.</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f16.png"/>

          
        </fig>

      <p id="d2e2107">The optimal foreground locations cover the main urban centers (i.e., Paris, Strasbourg, Nancy–Metz, Frankfurt, Bern–Zurich) as well as bordering regions of Germany and Switzerland, where large fossil fuel emissions originate (see Fig. <xref ref-type="fig" rid="F17"/>a and c). Northern locations aim to capture large CO<sub>2</sub> plumes from the Benelux region, which strongly influence CO<sub>2</sub> spatial gradients over the domain, while the Ruhr Valley (north of Alsace) is covered by a specific measurement location.</p>

      <fig id="F17" specific-use="star"><label>Figure 17</label><caption><p id="d2e2132">Resulting optimal centroids (black triangles) and their corresponding clusters for all images in the dataset for both the foreground <bold>(a)</bold> and background <bold>(b)</bold> networks (regional level), location of the optimal centroids relative to the emission field <bold>(c)</bold>, and according to the scoring matrix <bold>(d)</bold>.</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f17.png"/>

          
        </fig>

      <p id="d2e2155">The domain center remains mostly devoid of ground-based measurement stations due to the absence of major plumes in the simulation. These areas have the lowest population densities and no significant industries or major highways.</p>
      <p id="d2e2158">For background concentrations, stations are located along the northern boundary and in the central part of the domain, with two additional sites identified in mountainous areas (i.e., the Black Forest and the Swiss Alps). Further experiments using simulated concentrations might enable a reduction in the number of background stations, though such decisions rely primarily on expert knowledge and spatial correlation, as discussed in Sect. <xref ref-type="sec" rid="Ch1.S4"/>.</p>
      <p id="d2e2164">Overall, 18 tower locations are required to constrain the major fossil fuel signals and their associated background concentrations over the simulation domain. The coordinates of the optimal ground-based measurement station locations (regional scale) and their performance values are provided in Table <xref ref-type="table" rid="T2"/>. The mask used in the background analysis for this application is shown in Fig. <xref ref-type="fig" rid="F18"/>.</p>

      <fig id="F18"><label>Figure 18</label><caption><p id="d2e2173">Binarized and inverted image of the score matrix used to mask the background tracer in the background analysis for the regional-scale application.</p></caption>
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f18.png"/>

        </fig>

<table-wrap id="T2" specific-use="star"><label>Table 2</label><caption><p id="d2e2185">Coordinates of the optimal station locations for the regional-scale analysis, corresponding to Fig. <xref ref-type="fig" rid="F17"/>. Two column groups are presented: the first (Foreground) for the main monitoring network and the second (Background) for the background network. Both networks contain nine ground-based monitoring stations, since they are designed as one-to-one networks and because 9 is the minimal value (threshold in Fig. <xref ref-type="fig" rid="F16"/>) obtained from the analysis. The highest performance for each of the networks (foreground and background) is highlighted in bold.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="8">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:colspec colnum="5" colname="col5" align="center"/>
     <oasis:colspec colnum="6" colname="col6" align="center"/>
     <oasis:colspec colnum="7" colname="col7" align="center"/>
     <oasis:colspec colnum="8" colname="col8" align="left"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry rowsep="1" namest="col1" nameend="col3" align="center">Foreground </oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry rowsep="1" namest="col5" nameend="col7">Background </oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Longitude</oasis:entry>
         <oasis:entry colname="col2">Latitude</oasis:entry>
         <oasis:entry colname="col3">Performance</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">Longitude</oasis:entry>
         <oasis:entry colname="col6">Latitude</oasis:entry>
         <oasis:entry colname="col7">Performance</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3">(%)</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"/>
         <oasis:entry colname="col7">(%)</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">2.2502</oasis:entry>
         <oasis:entry colname="col2">49.038</oasis:entry>
         <oasis:entry colname="col3">38.741</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">5.4000</oasis:entry>
         <oasis:entry colname="col6">48.682</oasis:entry>
         <oasis:entry colname="col7">49.213</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">3.8251</oasis:entry>
         <oasis:entry colname="col2">50.601</oasis:entry>
         <oasis:entry colname="col3">25.325</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">7.5282</oasis:entry>
         <oasis:entry colname="col6">48.353</oasis:entry>
         <oasis:entry colname="col7">46.954</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">6.2939</oasis:entry>
         <oasis:entry colname="col2">50.820</oasis:entry>
         <oasis:entry colname="col3">60.438</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">4.5487</oasis:entry>
         <oasis:entry colname="col6">50.025</oasis:entry>
         <oasis:entry colname="col7"><bold>52.224</bold></oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">6.6769</oasis:entry>
         <oasis:entry colname="col2">49.395</oasis:entry>
         <oasis:entry colname="col3">49.007</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">2.2076</oasis:entry>
         <oasis:entry colname="col6">49.998</oasis:entry>
         <oasis:entry colname="col7">51.608</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">7.6559</oasis:entry>
         <oasis:entry colname="col2">48.079</oasis:entry>
         <oasis:entry colname="col3">44.285</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">3.4846</oasis:entry>
         <oasis:entry colname="col6">48.737</oasis:entry>
         <oasis:entry colname="col7">48.323</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">8.1241</oasis:entry>
         <oasis:entry colname="col2">48.874</oasis:entry>
         <oasis:entry colname="col3">54.620</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">5.1446</oasis:entry>
         <oasis:entry colname="col6">47.284</oasis:entry>
         <oasis:entry colname="col7">50.103</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">8.3370</oasis:entry>
         <oasis:entry colname="col2">47.393</oasis:entry>
         <oasis:entry colname="col3">67.625</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">1.9948</oasis:entry>
         <oasis:entry colname="col6">47.722</oasis:entry>
         <oasis:entry colname="col7">43.669</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">8.5923</oasis:entry>
         <oasis:entry colname="col2">50.135</oasis:entry>
         <oasis:entry colname="col3"><bold>86.037</bold></oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">8.3795</oasis:entry>
         <oasis:entry colname="col6">47.201</oasis:entry>
         <oasis:entry colname="col7">45.174</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">8.9329</oasis:entry>
         <oasis:entry colname="col2">49.258</oasis:entry>
         <oasis:entry colname="col3">57.495</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">6.9323</oasis:entry>
         <oasis:entry colname="col6">50.190</oasis:entry>
         <oasis:entry colname="col7">49.829</oasis:entry>
         <oasis:entry colname="col8"/>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e2518">In addition, Appendix <xref ref-type="sec" rid="App1.Ch1.S4"/> presents a third application at a different scale from those shown in this section; its purpose is to compare published results with those obtained using CRO<sup>2</sup>A.</p>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Discussions</title>
      <p id="d2e2541">Several aspects of the procedures carried out in this study merit discussion. As a preliminary consideration, the development of the optimal design scheme is based on two types of measurements (direct and indirect) and on the measuring instruments themselves. The instruments used in GHG monitoring, whatever their type, must be immersed in the gas flow, so their location relative to the measured concentrations needs to be characterized. For this reason, CRO<sup>2</sup>A seeks to identify locations for ground-based measurement stations where GHG fluxes with considerable and measurable intensities are expected to occur most frequently and for the longest periods of time. This approach is consistent with that of <xref ref-type="bibr" rid="bib1.bibx26" id="text.45"/>, who prioritized the location of monitoring network stations over the magnitude of uncertainty reduction, since the former depends on prior and observational uncertainty values.</p>
      <p id="d2e2556">The primary purpose of the pre-processing stage is to reduce the volume of data to be processed and analyzed; this is one reason for implementing each of the applied transformations. Furthermore, this stage, along with these transformations, is a distinguishing feature of this study: in addition to automating data cleansing, the analysis does not require any statistical assumptions beyond those inherent to the model used.</p>
      <p id="d2e2559">In general, no clustering algorithm can guarantee convergence to a global minimum. Achieving the optimal clustering often requires exploring a search space too large to evaluate exhaustively, unless certain problem-specific conditions are met. The global minimum lies within the set of solutions defined by the search space, whose size in this case is approximately proportional to <inline-formula><mml:math id="M101" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>! (i.e., the factorial of the number of ground-based measurement stations in the monitoring network), so exploring a space of such magnitude entails a high computational cost. The proposed algorithmic strategy in the processing stage addresses this by narrowing the search space (primarily during pre-processing), initializing starting points using information inherent in the data, and adopting a smart exploration method that diversifies candidate solutions to avoid local optima.</p>
      <p id="d2e2569">Like other algorithms, CRO<sup>2</sup>A explores only parts of the search space; its distinction lies in how these portions are selected and how the search is subsequently intensified in the post-processing stage.</p>
      <p id="d2e2582">Because different starting points produce different clusters whose general trends are then extracted, a selection strategy based on the marginal distributions of data point density was proposed.</p>
      <p id="d2e2585">The clustering procedure is performed using inverse weighting, taking the reciprocal of each point's concentration as its weight. These weights identify key clustering points and allow systematic exclusion of noise from further analyses. An alternative is fuzzy clustering, implemented by setting specific constraints for each case. However, this method can be complex, potentially leading to a blurred boundary between objective and subjective outcomes. The optimal design scheme is structured around clustering algorithms such as <inline-formula><mml:math id="M103" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids and <inline-formula><mml:math id="M104" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means, which are fast, iterative, require relatively few iterations to converge, and have simple per-iteration calculations.</p>
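      <p>The role of the inverse weights in a centroid update can be illustrated with a minimal sketch of a weighted <italic>k</italic>-means loop, in which each point's weight is the reciprocal of its concentration, as described above. This is a plain weighted Lloyd-type iteration written for illustration; it is not the modified algorithm implemented in CRO<sup>2</sup>A, and the function names, the <monospace>eps</monospace> guard, and the iteration limit are assumptions.</p>
<preformat preformat-type="code">
```python
import math


def inverse_weights(concentrations, eps=1e-12):
    """Weight of each point = reciprocal of its concentration.

    eps is a hypothetical guard against division by zero.
    """
    return [1.0 / max(c, eps) for c in concentrations]


def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for each point."""
    return [min(range(len(centroids)),
                key=lambda j: math.dist(p, centroids[j]))
            for p in points]


def update(points, weights, labels, centroids):
    """Weighted mean of the points assigned to each cluster."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [(p, w) for p, w, lab in zip(points, weights, labels)
                   if lab == j]
        total = sum(w for _, w in members)
        if total == 0.0:
            new_centroids.append(old)  # empty cluster: keep previous centroid
            continue
        new_centroids.append(tuple(
            sum(w * p[d] for p, w in members) / total
            for d in range(len(old))))
    return new_centroids


def weighted_kmeans(points, concentrations, centroids, iterations=20):
    """Iterate assignment and weighted update until the centroids stop moving."""
    weights = inverse_weights(concentrations)
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = update(points, weights, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)
```
</preformat>
      <p>Each iteration involves only distance evaluations and weighted averages, which is consistent with the simple per-iteration calculations noted above.</p>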
      <p id="d2e2602">When the merging or splitting of two or more clusters is required, such actions should be approved by an expert, and the optimal design scheme rerun with the updated number of clusters. Future versions of CRO<sup>2</sup>A are expected to include an automatic strategy for this task.</p>
      <p id="d2e2614">As stated initially, this is a proposed solution within the framework of unsupervised machine learning. Accordingly, complementary validation of the results requires the judgment of a subject-matter expert, through which certain sensitive groups can be considered. The optimal design scheme is therefore intended as a decision-support tool for determining the locations of ground-based measurement stations.</p>
      <p id="d2e2617">Given the modular structure of CRO<sup>2</sup>A, additional variables of interest may be incorporated into monitoring network design; however, their inclusion must be assessed to ensure it does not bias the clustering results. For such evaluations, techniques such as principal component analysis and/or regularization are recommended to address data multidimensionality and to extract features from the most influential variables. This aspect was specifically considered during the development of the optimal design scheme, for which high-performance computing principles were applied to balance algorithmic complexity. This included managing dataset storage to avoid overloading system memory while reducing processing time, without compromising accuracy, which is more important than algorithm speed for an optimal design.</p>
      <p id="d2e2629">In all cases where CRO<sup>2</sup>A is employed, and depending on the user's scientific objectives, networks of varying sizes are proposed to enable the separation of primary signals within the selected domain. Similarly, the specific features of this optimal design scheme can be applied at the user's discretion to a particular use case.</p>
      <p id="d2e2642">For brevity, a follow-up paper is proposed to compare the results of CRO<sup>2</sup>A with previously published findings, particularly those of <xref ref-type="bibr" rid="bib1.bibx27" id="text.46"/>. Furthermore, as part of the validation process, the designs generated by the proposed optimal design scheme will be compared with those derived from inverse modeling to demonstrate its competitiveness.</p>
</sec>
<sec id="Ch1.S5" sec-type="conclusions">
  <label>5</label><title>Conclusions</title>
      <p id="d2e2666">This research developed an optimal design scheme, CRO<sup>2</sup>A, capable of supporting the deployment of GHG monitoring networks without the need for an inversion system, relying solely on direct atmospheric simulations.</p>
      <p id="d2e2678">The scheme employs an inversely weighted version of a modified clustering algorithm, combined with an optimization strategy that enables automated data analysis and processing to obtain essential information for designing atmospheric monitoring networks. The modifications applied to the classic <inline-formula><mml:math id="M110" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids and <inline-formula><mml:math id="M111" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means algorithms preserve their core characteristics while enhancing their ability to diversify and intensify the search within the solution space. CRO<sup>2</sup>A can generate optimal designs for both primary (foreground) and secondary (background) monitoring networks, accommodating the presence of pre-installed ground-based measurement stations.</p>
      <p id="d2e2704">This approach is computationally efficient and offers greater flexibility than existing network design tools, without employing a complex inversion system. It also avoids dependence on many of the typical inverse assumptions (e.g., a priori error statistics) inherent in inverse modeling studies. By clustering time series of atmospheric GHG concentration fields, the scheme ensures that ground-based measurement stations are sited where signals are most frequently present, according to the seasons or times of day defined by the user.</p>
      <p id="d2e2707">The CRO<sup>2</sup>A optimal design scheme converges rapidly, requiring relatively few iterations compared with the data size, owing to its initialization strategy. It retains the <inline-formula><mml:math id="M114" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids and <inline-formula><mml:math id="M115" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means property of low-complexity calculations per iteration, making it well suited for processing large datasets. An additional feature allows selection of the deployment area, masking potential locations outside predefined zones. Furthermore, CRO<sup>2</sup>A provides an effective tool for evaluating existing networks, identifying additional sites to expand them, or designing future observation networks without pre-existing ground-based measurement stations. The optimal design scheme can be used in conjunction with expert input to guide site selection and design iterations.</p>
      <p id="d2e2743">While the design of a monitoring network ultimately depends on specific objectives, CRO<sup>2</sup>A serves as a decision-support tool, providing key information to assist experts in deploying atmospheric monitoring networks across diverse landscapes and environments.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title>Special features</title>
      <p id="d2e2766">Several special features of the CRO<sup>2</sup>A optimal design scheme that facilitate user analysis when designing atmospheric monitoring networks are briefly described below. As noted earlier, with CRO<sup>2</sup>A, monitoring network design and analysis can be performed separately in the foreground or background mode, or jointly in the so-called complete mode. In complete mode, the two analyses are executed sequentially: the background mode runs immediately after the foreground mode, and the results of the former are propagated to the latter. Background measurement locations are estimated based on simulated large-scale background concentrations, while avoiding major sources and sinks of GHGs. To reduce the risk of false solutions in background mode at locations with considerably high GHG concentrations, a mask extracted from the foreground mode scoring matrix is applied.</p>
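      <p>The masking step can be sketched as follows, under the assumption that the scoring matrix is binarized with a single threshold and then inverted before being applied to the background tracer, in line with the binarized and inverted masks shown in the main text. The threshold value, the fill value, and the function names are hypothetical; the actual binarization rule used by CRO<sup>2</sup>A is not specified here.</p>
<preformat preformat-type="code">
```python
def binarize(score, threshold):
    """1 where the foreground score reaches the threshold, 0 elsewhere."""
    return [[1 if v >= threshold else 0 for v in row] for row in score]


def invert(mask):
    """Swap 0 and 1 so that high-score cells become excluded cells."""
    return [[1 - v for v in row] for row in mask]


def apply_mask(field, mask, fill=None):
    """Keep background-tracer values where mask == 1; replace the rest."""
    return [[f if m == 1 else fill for f, m in zip(frow, mrow)]
            for frow, mrow in zip(field, mask)]


# Toy 2 x 2 example with made-up numbers (not data from the paper).
score = [[0.9, 0.1],
         [0.2, 0.8]]
background_field = [[400.0, 401.0],
                    [402.0, 403.0]]

mask = invert(binarize(score, threshold=0.5))  # threshold is an assumption
masked_background = apply_mask(background_field, mask)
```
</preformat>
      <p>Cells with high foreground scores are thereby removed from the background search space, leaving only the remaining cells as candidate background locations.</p>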
      <p id="d2e2787">If the region under analysis contains pre-existing ground-based measurement stations, these may or may not be included in the monitoring network design (the latter being the default). When inclusion and batch processing are required (as illustrated in Fig. <xref ref-type="fig" rid="F11"/> to determine network performance as a function of the number of ground-based measurement stations), it is advisable to set the number of pre-existing stations as the initial value for the test interval.</p>
      <p id="d2e2792">For example, the results for the two applications presented in Sect. <xref ref-type="sec" rid="Ch1.S3"/> are shown below, this time including the pre-existing stations in the ICOS network, listed in Table <xref ref-type="table" rid="TA1"/>. For the urban-scale application, the test interval was set to [2, 10], while for the regional-scale application, [12, 15] was used; the lower bounds correspond to the number of pre-existing stations at each scale in Table <xref ref-type="table" rid="TA1"/>.</p>

<table-wrap id="TA1"><label>Table A1</label><caption><p id="d2e2805">Pre-existing ground-based measurement stations belonging to the ICOS network, considered in the analysis at urban or regional scale.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Scale</oasis:entry>
         <oasis:entry colname="col2">Longitude</oasis:entry>
         <oasis:entry colname="col3">Latitude</oasis:entry>
         <oasis:entry colname="col4">Reference</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">8.1755</oasis:entry>
         <oasis:entry colname="col3">47.189</oasis:entry>
         <oasis:entry colname="col4">BRM</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">8.6750</oasis:entry>
         <oasis:entry colname="col3">49.417</oasis:entry>
         <oasis:entry colname="col4">HEI</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">8.4249</oasis:entry>
         <oasis:entry colname="col3">49.091</oasis:entry>
         <oasis:entry colname="col4">KIT</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">8.3973</oasis:entry>
         <oasis:entry colname="col3">47.482</oasis:entry>
         <oasis:entry colname="col4">LHW</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Urban/regional</oasis:entry>
         <oasis:entry colname="col2">5.5036</oasis:entry>
         <oasis:entry colname="col3">48.562</oasis:entry>
         <oasis:entry colname="col4">OPE</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">2.1420</oasis:entry>
         <oasis:entry colname="col3">48.723</oasis:entry>
         <oasis:entry colname="col4">SAC</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">7.9166</oasis:entry>
         <oasis:entry colname="col3">47.917</oasis:entry>
         <oasis:entry colname="col4">SSL</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">2.1125</oasis:entry>
         <oasis:entry colname="col3">47.965</oasis:entry>
         <oasis:entry colname="col4">TRN</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Urban/regional</oasis:entry>
         <oasis:entry colname="col2">4.0611</oasis:entry>
         <oasis:entry colname="col3">49.243</oasis:entry>
         <oasis:entry colname="col4">MDH</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">2.4205</oasis:entry>
         <oasis:entry colname="col3">49.005</oasis:entry>
         <oasis:entry colname="col4">GNS</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">2.3018</oasis:entry>
         <oasis:entry colname="col3">49.012</oasis:entry>
         <oasis:entry colname="col4">AND</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regional</oasis:entry>
         <oasis:entry colname="col2">3.9747</oasis:entry>
         <oasis:entry colname="col3">49.236</oasis:entry>
         <oasis:entry colname="col4">ORM</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
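      <p>As a minimal sketch (not part of CRO<sup>2</sup>A itself), the entries of Table A1 can be held in a small data structure from which the lower bound of each test interval, i.e., the number of pre-existing stations at that scale, is derived. The names <monospace>STATIONS</monospace> and <monospace>n_preexisting</monospace> are illustrative assumptions.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Pre-existing ICOS stations transcribed from Table A1:
# (scale, longitude, latitude, reference code).
STATIONS = [
    ("Regional", 8.1755, 47.189, "BRM"),
    ("Regional", 8.6750, 49.417, "HEI"),
    ("Regional", 8.4249, 49.091, "KIT"),
    ("Regional", 8.3973, 47.482, "LHW"),
    ("Urban/regional", 5.5036, 48.562, "OPE"),
    ("Regional", 2.1420, 48.723, "SAC"),
    ("Regional", 7.9166, 47.917, "SSL"),
    ("Regional", 2.1125, 47.965, "TRN"),
    ("Urban/regional", 4.0611, 49.243, "MDH"),
    ("Regional", 2.4205, 49.005, "GNS"),
    ("Regional", 2.3018, 49.012, "AND"),
    ("Regional", 3.9747, 49.236, "ORM"),
]


def n_preexisting(scale):
    """Count the stations whose 'Scale' entry includes the given scale."""
    scale = scale.lower()
    return sum(1 for s, _, _, _ in STATIONS if scale in s.lower().split("/"))


# Hypothetical helper values: first value of the test interval per scale.
urban_start = n_preexisting("urban")
regional_start = n_preexisting("regional")
```
]]></preformat>
      <p>With the table as transcribed, the two counts are 2 and 12, which match the lower bounds of the intervals [2, 10] and [12, 15] used in this appendix.</p>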


<sec id="App1.Ch1.S1.SS1">
  <label>A1</label><title>Urban-scale application including ICOS network</title>

      <fig id="FA1"><label>Figure A1</label><caption><p id="d2e3036">Results for the urban-scale application considering the pre-existing ground-based measurement stations in the ICOS network, with a minimal number of stations <inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula> according to CRO<sup>2</sup>A.</p></caption>
          
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f19.png"/>

        </fig>

      <fig id="FA2"><label>Figure A2</label><caption><p id="d2e3073">Results for the regional-scale application considering the pre-existing ground-based measurement stations in the ICOS network, with a minimal number of stations <inline-formula><mml:math id="M122" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">13</mml:mn></mml:mrow></mml:math></inline-formula> according to CRO<sup>2</sup>A.</p></caption>
          
          <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f20.png"/>

        </fig>

</sec>
<sec id="App1.Ch1.S1.SS2">
  <label>A2</label><title>Regional-scale application including ICOS network</title>
      <p id="d2e3116">Comparing the main results of the two proposed applications with those presented in this section, which include ground-based measurement stations already installed as part of the ICOS network, reveals how CRO<sup>2</sup>A attempts to assimilate such locations by incorporating them into the design.</p>
      <p id="d2e3128">The most evident example is the regional-scale application, where, out of 12 possible pre-existing stations, 7 are effectively and completely assimilated (i.e., they coincide with the alternative network proposed by the optimal design scheme). In both case studies, when confronted with pre-existing stations exhibiting poor performance, CRO<sup>2</sup>A attempts to assimilate them.</p>
      <p id="d2e3140">If assimilation is not feasible, it incorporates these stations and generates complementary ones nearby to mitigate their impact on overall network performance. This approach is illustrated in the urban-scale application, where the scheme assimilates the two existing towers, aiming to preserve the original design of Fig. <xref ref-type="fig" rid="F14"/>d with three stations, while significantly modifying the location of one (the southeasternmost station in the city of Fig. <xref ref-type="fig" rid="FA1"/>a).</p>
      <p id="d2e3151">It should be noted that in both the urban-scale and regional-scale applications, CRO<sup>2</sup>A strives to include existing stations wherever possible. However, when such stations exhibit low performance, the overall performance of the network is inevitably compromised (e.g., for Reims, performance decreases from 43.41 % to 32.40 %, and for Grand Est, from 53.73 % to 47.69 %).</p>
      <p id="d2e3164">Similarly, the design can be performed by preloading potential locations for ground-based measurement stations, as in <xref ref-type="bibr" rid="bib1.bibx45" id="text.47"/>, yielding an optimal network under these conditions. Continuing with the analysis of pre-existing stations in the study region, the current monitoring network can also be evaluated, obtaining scores for each station and an approximate overall network score. This evaluation is performed using the metrics proposed in this study, which form the basis of the CRO<sup>2</sup>A optimality criterion.</p>
      <p id="d2e3179">Regarding the masking process, CRO<sup>2</sup>A can automatically generate a mask (with some resolution limitations), as illustrated in Fig. <xref ref-type="fig" rid="F5"/>b; alternatively, the user can load a custom mask according to the objectives of the analysis. In the same vein, CRO<sup>2</sup>A includes an option called “Special Mask”, which enables masking of a main field using another field (e.g., it can be used to mask an anthropogenic field using a biogenic field). Unlike other masking processes, where a single mask is applied to all images of the analyzed field in foreground mode, this option performs masking on an image-by-image basis between the two selected fields.</p>
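      <p>The image-by-image masking described above can be illustrated with a short sketch. It is not the CRO<sup>2</sup>A implementation: the function name <monospace>special_mask</monospace>, the nested-list image layout, and the thresholding rule (a pixel is masked where the second field exceeds <monospace>threshold</monospace>) are all assumptions made for illustration.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def special_mask(main_images, mask_images, threshold):
    """Mask each image of the main field with the same-index image of another field.

    Assumed rule (hypothetical): a pixel of the main field is replaced by NaN
    wherever the matching pixel of the masking field exceeds `threshold`.
    """
    if len(main_images) != len(mask_images):
        raise ValueError("both fields need the same number of images")
    masked = []
    for main, mask in zip(main_images, mask_images):
        # One mask per image, instead of a single mask shared by all images.
        masked.append([
            [math.nan if m > threshold else v for v, m in zip(row_v, row_m)]
            for row_v, row_m in zip(main, mask)
        ])
    return masked


# Two 2x2 images per field (toy values).
anthropogenic = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
biogenic = [[[0.0, 9.0], [0.0, 0.0]], [[0.0, 0.0], [9.0, 0.0]]]
result = special_mask(anthropogenic, biogenic, threshold=1.0)
```
]]></preformat>
      <p>In this toy case a different pixel is masked in each image, which is the behavior that distinguishes this option from applying one mask to every image of the field.</p>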
      <p id="d2e3202">CRO<sup>2</sup>A also allows the selection of a rectangular subregion within the area defined by the input data, graphically or textually, by specifying the minimum and maximum latitude and longitude values. Although this approach avoids recompiling the input data, it remains subject to limitations in relation to the minimum resolution required for adequate cluster analysis.</p>
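      <p>A textual subregion selection of this kind can be sketched as follows, assuming a regular grid stored as a list of rows (one row per latitude). The function <monospace>select_subregion</monospace> and its argument names are hypothetical and do not describe the CRO<sup>2</sup>A interface.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def select_subregion(lats, lons, field, lat_min, lat_max, lon_min, lon_max):
    """Crop a 2-D field (rows = latitudes, columns = longitudes) to a lat/lon box."""
    rows = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    cols = [j for j, lon in enumerate(lons) if lon_min <= lon <= lon_max]
    if not rows or not cols:
        raise ValueError("the requested box contains no grid points")
    sub_field = [[field[i][j] for j in cols] for i in rows]
    return [lats[i] for i in rows], [lons[j] for j in cols], sub_field


# Toy 4x4 grid; each value encodes its (row, column) position as 10*row + column.
lats = [47.0, 48.0, 49.0, 50.0]
lons = [2.0, 4.0, 6.0, 8.0]
field = [[10 * i + j for j in range(4)] for i in range(4)]

sub_lats, sub_lons, sub_field = select_subregion(
    lats, lons, field, lat_min=48.0, lat_max=49.5, lon_min=3.0, lon_max=6.5
)
```
]]></preformat>
      <p>The cropped grid here keeps only 2 by 2 points, which illustrates the resolution limitation mentioned above: a small box may leave too few grid points for meaningful cluster analysis.</p>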
</sec>
</app>

<app id="App1.Ch1.S2">
  <label>Appendix B</label><title>Technical notes and definitions</title>
      <p id="d2e3223">The core of the proposed design scheme is cluster analysis, which aims to identify characteristic groups formed through the recognition of specific patterns, thereby enabling the extraction of useful information on similarity or dissimilarity from the analyzed data <xref ref-type="bibr" rid="bib1.bibx9" id="paren.48"/>.</p>
      <p id="d2e3229">Among the various clustering algorithms within pattern recognition, two general cases exist: supervised and unsupervised. Their use depends on whether the training class label is available <xref ref-type="bibr" rid="bib1.bibx8" id="paren.49"/>. Given the characteristics of the application described in this document, special attention is given to the unsupervised case due to the uncertainty of the output for a given input, i.e., the absence of a reference or expected ideal result (ground truth). The proposed design scheme incorporates elements of clustering algorithms such as <inline-formula><mml:math id="M131" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means, <inline-formula><mml:math id="M132" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids, and the hierarchical variant, which help constrain the set of solutions. These are combined using an inversely weighted clustering strategy to improve overall performance without significantly increasing algorithmic complexity.</p>
      <p id="d2e3249">It should be noted that in the proposed design scheme, the number <inline-formula><mml:math id="M133" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> of clusters characterizing the data varies, and the process involves adjusting the centroid positions of these <inline-formula><mml:math id="M134" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> clusters according to a mean that accounts for the density and inverse weighting of the same data.</p>
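      <p>As a hedged illustration of a weighted centroid adjustment, the sketch below computes a weighted mean (as in a <monospace>k</monospace>-means-style update) and a weighted medoid (the member point with the lowest weighted squared distance to the others). The weights are assumed to be given; how CRO<sup>2</sup>A derives them from the density and inverse weighting of the data is not reproduced here, and both function names are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def weighted_mean(points, weights):
    """Weighted mean of 2-D points: sum(w_j * p_j) / sum(w_j)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    return (x, y)


def weighted_medoid(points, weights):
    """Member point minimizing the weighted sum of squared distances."""
    def cost(c):
        return sum(
            w * ((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2)
            for p, w in zip(points, weights)
        )
    return min(points, key=cost)


# Toy cluster: three points, the last one weighted most heavily.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
weights = [1.0, 1.0, 3.0]
centroid = weighted_mean(points, weights)
medoid = weighted_medoid(points, weights)
```
]]></preformat>
      <p>The mean can fall between grid points, whereas the medoid is always one of the data points, which is one reason a medoid-based representative is convenient when a station must sit on an actual candidate location.</p>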
      <p id="d2e3266">In addition to preparing the data for further processing, the clustering analysis in the proposed design scheme aims to extract the maximum possible information on the behavioral trends of simulated GHG concentrations over a region by employing techniques from various scientific fields to explore and exploit similarities and dissimilarities in the data.</p>
      <p id="d2e3270">The proposed design scheme used for this analysis is named Designer of Optimal Atmospheric Observation Networks (CRO<sup>2</sup>A; French acronym for <italic>Concepteur de Réseaux Optimaux d'Observations Atmosphériques</italic>), whose logo is shown in Fig. <xref ref-type="fig" rid="FB1"/>.</p>

      <fig id="FB1"><label>Figure B1</label><caption><p id="d2e3290">CRO<sup>2</sup>A.</p></caption>
        <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f21.png"/>

      </fig>

      <p id="d2e3308">The development of CRO<sup>2</sup>A followed the basic steps of a clustering task, supplemented by various methods to enhance data analysis: <list list-type="bullet"><list-item>
      <p id="d2e3322"><italic>Selection of features</italic>: based on their relative significance, the variables selected as features are the latitude, longitude, and GHG concentration at each point in the analysis region.</p>
      <p id="d2e3327">During clustering, variables are indirectly processed through their projections onto the <inline-formula><mml:math id="M138" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>–<inline-formula><mml:math id="M139" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> plane, while retaining the nominal and ordinal properties of interval scale features, i.e., features for which the ratio between two values is meaningless but the difference between them is meaningful. This selection encodes the information while minimizing redundancy, depending on preprocessing, which is an essential stage involving data cleaning, filtering, and masking.</p></list-item><list-item>
      <p id="d2e3345"><italic>Selection of the membership measure</italic>: this measure employs a function to model the dissimilarities (or similarities) between data points. Its variables are directly related to the selected features. In this study, a widely accepted proximity measure is used, as shown in Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S2.E1"/>).<disp-formula id="App1.Ch1.S2.E1" content-type="numbered"><label>B1</label><mml:math id="M140" display="block"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">sd</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-fraktur">p</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-fraktur">c</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:munderover><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>‖</mml:mo><mml:msub><mml:mi mathvariant="bold-fraktur">p</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-fraktur">c</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>‖</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:mfenced><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula><mml:math id="M141" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">sd</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the membership function (for dissimilarity modeling) for 
<inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:mi>r</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula> from the <inline-formula><mml:math id="M143" display="inline"><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mi>r</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>-norm (Euclidean distance measure). Here, <inline-formula><mml:math id="M144" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> is the number of data points, <inline-formula><mml:math id="M145" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula> is the number of clusters, <inline-formula><mml:math id="M146" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-fraktur">p</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the <inline-formula><mml:math id="M147" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>th data point, and <inline-formula><mml:math id="M148" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-fraktur">c</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the centroid of the <inline-formula><mml:math id="M149" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th cluster. 
Both <inline-formula><mml:math id="M150" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-fraktur">p</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-fraktur">c</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> correspond to an (<inline-formula><mml:math id="M152" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M153" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula>) coordinate pair derived from the width and height values of the <inline-formula><mml:math id="M154" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>th image under testing in the <inline-formula><mml:math id="M155" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>–<inline-formula><mml:math id="M156" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> plane.</p>
      <p id="d2e3579">Lastly, <inline-formula><mml:math id="M157" display="inline"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>≥</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> is the weight of <inline-formula><mml:math id="M158" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-fraktur">p</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> according to a scoring matrix.</p>
      <p id="d2e3608">The membership function is used to model data patterns, minimizing the dissimilarity threshold (or maximizing the similarity threshold, depending on the approach), thereby optimizing both the quantity and quality of the resulting clusters. </p></list-item><list-item>
      <p id="d2e3613"><italic>Selection of clustering criterion</italic>: this refers to the formation of clusters, which is not limited to compact groups, as the data may exhibit both low and high dispersion. The clustering criterion is therefore linked to network performance based on available data. Network performance is defined using a scoring matrix (explained in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2.SSS1"/>) that rates each point in the analyzed region according to the frequency of relatively high concentrations. A trend analysis is implemented via the basic sequential algorithmic scheme (BSAS) to address the recurring question of the optimal number of clusters (i.e., number of towers and their locations in the region under analysis). Accordingly, the clustering criterion is expressed in terms of the trend in the ratio between the total number of ground-based measurement stations and their performance.</p></list-item><list-item>
      <p id="d2e3621"><italic>Selection of clustering algorithms</italic>: the proposed design scheme combines <inline-formula><mml:math id="M159" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means, <inline-formula><mml:math id="M160" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids, and hierarchical clustering strategies to uncover implicit groups in the dataset. Depending on user needs or problem characteristics, the algorithm may run in either <inline-formula><mml:math id="M161" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-medoids mode (default) or <inline-formula><mml:math id="M162" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means mode, both always supported by hierarchical clustering.</p></list-item><list-item>
      <p id="d2e3655"><italic>Validation of results</italic>: the clustering results (specifically, the number and locations of ground-based measurement stations) are evaluated using parametric and non-parametric tests based on the previously defined clustering criterion, with reference to the region scoring matrix.</p></list-item><list-item>
      <p id="d2e3661"><italic>Interpretation of results</italic>: since no ground truth is available, validation and evaluation must be performed by field experts through additional experimental tests, ensuring that results have practical relevance. The design scheme also offers alternatives to the primary design (of equal or slightly lower performance) if physical implementation of the main design is not feasible.</p></list-item><list-item>
      <p id="d2e3667"><italic>Clustering tendency</italic>: this refers to tests designed to determine whether an inherent clustering structure exists in the data. The hierarchical clustering algorithm is the primary clustering tendency test in this study, applied to assess trends in the performance and number of ground-based measurement stations in the designed network.</p></list-item></list></p>
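      <p>The membership measure of Eq. (B1) can be written out as a short sketch. It follows the equation as printed, i.e., the double sum runs over all <monospace>m</monospace> centroids and all <monospace>n</monospace> data points; the function name <monospace>f_sd</monospace> and the tuple-based point representation are assumptions for illustration and do not describe the CRO<sup>2</sup>A code.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def f_sd(points, centroids, weights):
    """Eq. (B1): sqrt( sum_i sum_j w_j * ||p_j - c_i||^2 ) for 2-D points.

    points    -- list of (x, y) data points p_j, j = 1..n
    centroids -- list of (x, y) cluster centroids c_i, i = 1..m
    weights   -- non-negative weights w_j, one per data point
    """
    if len(points) != len(weights):
        raise ValueError("one weight per data point is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = 0.0
    for cx, cy in centroids:
        for (px, py), w in zip(points, weights):
            total += w * ((px - cx) ** 2 + (py - cy) ** 2)
    return math.sqrt(total)


# Toy check: one centroid at the origin, two points.
points = [(0.0, 0.0), (3.0, 4.0)]
centroids = [(0.0, 0.0)]
unweighted = f_sd(points, centroids, [1.0, 1.0])
upweighted = f_sd(points, centroids, [1.0, 4.0])
```
]]></preformat>
      <p>Raising the weight of the distant point from 1 to 4 doubles the value in this toy case, which shows how highly scored locations dominate the dissimilarity.</p>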
</app>

<app id="App1.Ch1.S3">
  <label>Appendix C</label><title>Clustering algorithms</title>
      <p id="d2e3680">To formally define the clustering process, let <inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:mi mathvariant="fraktur">C</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mn mathvariant="fraktur">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mn mathvariant="fraktur">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mn mathvariant="fraktur">3</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mi mathvariant="fraktur">m</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> be the <inline-formula><mml:math id="M164" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula>-clustering of <inline-formula><mml:math id="M165" display="inline"><mml:mrow><mml:mi mathvariant="fraktur">D</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="bold-fraktur">d</mml:mi><mml:mn mathvariant="bold-fraktur">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="bold-fraktur">d</mml:mi><mml:mn mathvariant="bold-fraktur">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="bold-fraktur">d</mml:mi><mml:mn mathvariant="bold-fraktur">3</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi 
mathvariant="bold-fraktur">d</mml:mi><mml:mi mathvariant="bold-fraktur">n</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, representing the dataset of <inline-formula><mml:math id="M166" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> data points partitioned into <inline-formula><mml:math id="M167" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula> subsets, where <inline-formula><mml:math id="M168" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-fraktur">d</mml:mi><mml:mi mathvariant="bold-fraktur">n</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mo>]</mml:mo><mml:mo>⊺</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> is the <inline-formula><mml:math id="M169" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>th data point in the <inline-formula><mml:math id="M170" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula>-dimensional space. 
Through a membership function, it is possible to quantify similarity or dissimilarity using the numerical values corresponding to each of the <inline-formula><mml:math id="M171" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> features stored in the data points. The correspondence between similarity and dissimilarity measures allows the analysis to be performed interchangeably from either perspective.</p>
      <p id="d2e3899">It is common to refer to the degree of similarity (or dissimilarity) as a distance measure. A widely used membership function models the proximity (distance) between two points, providing a mathematical basis for this correspondence:

          <disp-formula id="App1.Ch1.S3.E2" content-type="numbered"><label>C1</label><mml:math id="M172" display="block"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mi>r</mml:mi></mml:msup></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mo>‖</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>‖</mml:mo><mml:mi>r</mml:mi></mml:msup></mml:mrow></mml:mfenced><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M173" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mi>r</mml:mi></mml:msup></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the similarity–dissimilarity modeling function known as the <inline-formula><mml:math id="M174" display="inline"><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mi>r</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>-norm, with <inline-formula><mml:math id="M175" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mn mathvariant="normal">3</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>n</mml:mi></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M176" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> is the number of data points. Each <inline-formula><mml:math id="M177" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th data point is defined by the values <inline-formula><mml:math id="M178" display="inline"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> of <inline-formula><mml:math id="M180" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M181" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula>, respectively. 
In <inline-formula><mml:math id="M182" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mi>r</mml:mi></mml:msup></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M183" display="inline"><mml:mrow><mml:mi>r</mml:mi><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>; in particular, for <inline-formula><mml:math id="M184" display="inline"><mml:mrow><mml:mi>r</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:math></inline-formula>, the Manhattan (<inline-formula><mml:math id="M185" display="inline"><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>), Euclidean (<inline-formula><mml:math id="M186" display="inline"><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>), and maximum (<inline-formula><mml:math id="M187" display="inline"><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mi mathvariant="normal">∞</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>) norms are obtained, respectively.</p>
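      <p>A minimal sketch of Eq. (C1) for the three special cases named above is given below. The function name <monospace>lr_distance</monospace> is an assumption, and the inputs are taken to be two non-empty sequences of equal length.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def lr_distance(x, y, r):
    """L^r distance between two equal-length sequences, with r > 0 or r = inf."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if r <= 0:
        raise ValueError("r must be positive")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(r):
        # Limit r -> infinity: the largest component difference.
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)


x = [0.0, 0.0]
y = [3.0, 4.0]
manhattan = lr_distance(x, y, 1)         # L^1
euclidean = lr_distance(x, y, 2)         # L^2
maximum = lr_distance(x, y, math.inf)    # L^inf
```
]]></preformat>
      <p>For the same pair of inputs, the three norms are ordered from largest (Manhattan) to smallest (maximum), so the choice of <monospace>r</monospace> changes which points count as close.</p>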
      <p id="d2e4163">Before describing in the following section the characteristics of the optimization algorithm, it is important to note that an appropriate <inline-formula><mml:math id="M188" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula>-clustering <inline-formula><mml:math id="M189" display="inline"><mml:mi mathvariant="fraktur">C</mml:mi></mml:math></inline-formula> consists of <inline-formula><mml:math id="M190" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula> subsets containing elements of <inline-formula><mml:math id="M191" display="inline"><mml:mi mathvariant="fraktur">D</mml:mi></mml:math></inline-formula> such that elements with similar features belong to the same cluster, whereas elements with dissimilar features belong to different clusters. In addition, the <inline-formula><mml:math id="M192" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula>-clustering <inline-formula><mml:math id="M193" display="inline"><mml:mi mathvariant="fraktur">C</mml:mi></mml:math></inline-formula> must satisfy the following conditions:

              <disp-formula specific-use="align"><mml:math id="M194" display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold">Condition</mml:mi><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mn mathvariant="bold">1</mml:mn><mml:mo>.</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>≠</mml:mo><mml:mo>∅</mml:mo><mml:mo>,</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mn mathvariant="normal">3</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>m</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="bold">Condition</mml:mi><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mn mathvariant="bold">2</mml:mn><mml:mo>.</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:msubsup><mml:mo>∪</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="fraktur">D</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" 
class="stylechange"/><mml:mi mathvariant="bold">Condition</mml:mi><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mn mathvariant="bold">3</mml:mn><mml:mo>.</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>∩</mml:mo><mml:msub><mml:mi mathvariant="fraktur">c</mml:mi><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>∅</mml:mo><mml:mo>,</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>≠</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mi mathvariant="normal">and</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi>k</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">3</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>m</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
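      <p>Taken together, Conditions 1 to 3 state that the clusters form a partition of the full set: no cluster is empty, the clusters jointly cover the set, and no element belongs to two clusters. The following Python sketch checks these conditions for a candidate clustering; the function name <monospace>is_partition</monospace> and the toy data are illustrative assumptions and are not part of the CRO<sup>2</sup>A code.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from itertools import combinations


def is_partition(clusters, full_set):
    """Return True if `clusters` satisfies Conditions 1-3 for `full_set`."""
    clusters = [set(c) for c in clusters]
    full_set = set(full_set)
    # Condition 1: no cluster is empty.
    non_empty = all(len(c) > 0 for c in clusters)
    # Condition 2: the union of all clusters equals the full set.
    covers = set().union(*clusters) == full_set
    # Condition 3: clusters are pairwise disjoint.
    disjoint = all(a.isdisjoint(b) for a, b in combinations(clusters, 2))
    return non_empty and covers and disjoint


# Hypothetical toy example: six element indices split into m = 3 clusters.
cells = {0, 1, 2, 3, 4, 5}
valid = [{0, 1}, {2, 3, 4}, {5}]
overlapping = [{0, 1, 2}, {2, 3, 4}, {5}]  # element 2 appears twice

print(is_partition(valid, cells))
print(is_partition(overlapping, cells))
```
]]></preformat>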
</app>

<app id="App1.Ch1.S4">
  <label>Appendix D</label><title>Application: comparison with published results as a reference</title>
      <p id="d2e4418">As an additional application on a larger scale than those already presented (Sects. <xref ref-type="sec" rid="Ch1.S3.SS1"/> and <xref ref-type="sec" rid="Ch1.S3.SS2"/>), an anthropogenic CO<sub>2</sub> monitoring network was designed for the Australian territory without considering any of the already installed ground stations, using a dataset of CO<sub>2</sub> molar fractions over this domain provided by the CAMS model. Data processing and analysis were performed using CRO<sup>2</sup>A, and the results were compared with those of a previously published article dedicated to the same domain.</p>
      <p id="d2e4452"><xref ref-type="bibr" rid="bib1.bibx45" id="text.50"/> present a methodology for designing monitoring networks specifically for CO<sub>2</sub> in the Australian territory, using an inverse modeling approach to reduce uncertainty in GHG fluxes. They employ a Bayesian framework to calculate changes in flux uncertainty based solely on error statistics and the Lagrangian Particle Dispersion Transport Model (LPDM). Furthermore, they use supporting models: the ACCESS-R model for meteorological data, and the CABLE/BIOS2 and FFDAS systems to estimate the uncertainties of biogenic and fossil fuel fluxes, respectively.</p>

      <fig id="FD1" specific-use="star"><label>Figure D1</label><caption><p id="d2e4468">Logistic fit (S-shaped curve) of the Australian network performance as a function of the number of monitoring towers (the minimum number of ground-based monitoring stations equals the indicated threshold <inline-formula><mml:math id="M199" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">9</mml:mn></mml:mrow></mml:math></inline-formula>) <bold>(a)</bold>, the resulting geographic locations of the monitoring towers (red triangles for CRO<sup>2</sup>A and black triangles for <xref ref-type="bibr" rid="bib1.bibx45" id="altparen.51"/>) <bold>(b)</bold>, the same locations overlaid on the scoring matrix (red triangles for CRO<sup>2</sup>A and black triangles for <xref ref-type="bibr" rid="bib1.bibx45" id="altparen.52"/>) <bold>(c)</bold>, and the percentage scores per tower and the overall value of the network performance according to CRO<sup>2</sup>A <bold>(d)</bold>.</p></caption>
        <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f22.png"/>

      </fig>

      <p id="d2e4539"><xref ref-type="bibr" rid="bib1.bibx45" id="text.53"/> propose a scenario in which they design the monitoring network starting from an empty network (i.e., assuming they have no ground-based measurement stations). They call this monitoring network NEW, which is shown in Fig. <xref ref-type="fig" rid="FD1"/>b and c using black triangles.</p>
      <p id="d2e4547">According to the analysis by CRO<sup>2</sup>A, shown in Fig. <xref ref-type="fig" rid="FD1"/>a, a new network like the one described above requires a minimum of <inline-formula><mml:math id="M204" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">9</mml:mn></mml:mrow></mml:math></inline-formula> ground-based measurement stations, as indicated by the threshold value calculated after batch processing and the corresponding curve fitting. This value coincides with the number of stations in the NEW monitoring network of <xref ref-type="bibr" rid="bib1.bibx45" id="text.54"/>. The resulting network for the Australian territory according to CRO<sup>2</sup>A is also shown in Fig. <xref ref-type="fig" rid="FD1"/>b, using red triangles.</p>
      <p id="d2e4591">The two monitoring networks follow a similar spatial pattern, consistent with the conclusion of <xref ref-type="bibr" rid="bib1.bibx45" id="text.55"/> that ground-based measurement stations should be located mainly in the north and east of the territory (which in turn corresponds to the most populated Australian regions). The main difference is the redistribution of ground-based measurement stations made by CRO<sup>2</sup>A: in the east of the territory, where the NEW monitoring network proposes two ground-based measurement stations, CRO<sup>2</sup>A merges them into a single station. Because of this, and because the scoring matrix indicates relevant information in western Australia, CRO<sup>2</sup>A proposes a ground-based measurement station that helps characterize this region (see Fig. <xref ref-type="fig" rid="FD1"/>c).</p>
      <p id="d2e4626">An evaluation of the design of an atmospheric monitoring network for Australia, without considering existing ground-based measurement stations, is presented below (see Table <xref ref-type="table" rid="TD1"/>), based on the results of <xref ref-type="bibr" rid="bib1.bibx45" id="text.56"/> and CRO<sup>2</sup>A and scored with the scoring matrix proposed in this document.</p>

<table-wrap id="TD1" specific-use="star"><label>Table D1</label><caption><p id="d2e4646">Optimal coordinate results for the analysis of Australia according to Fig. <xref ref-type="fig" rid="FD1"/>d. Two main column groups are presented: the first for the NEW monitoring network from the article by <xref ref-type="bibr" rid="bib1.bibx45" id="text.57"/> and the second for the monitoring network resulting from CRO<sup>2</sup>A. Both networks contain nine ground-based monitoring stations, sorted in descending order of the performance value calculated through the scoring matrix. Reference stations common to both networks are shown in bold, and the highest performance in each monitoring network is underlined.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="9">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:colspec colnum="5" colname="col5" align="left"/>
     <oasis:colspec colnum="6" colname="col6" align="center"/>
     <oasis:colspec colnum="7" colname="col7" align="center"/>
     <oasis:colspec colnum="8" colname="col8" align="center"/>
     <oasis:colspec colnum="9" colname="col9" align="left"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry rowsep="1" namest="col1" nameend="col4" align="center">NEW <xref ref-type="bibr" rid="bib1.bibx45" id="paren.58"/></oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry rowsep="1" namest="col6" nameend="col9"><bold>CRO</bold><sup><bold>2</bold></sup><bold>A</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Latitude</oasis:entry>
         <oasis:entry colname="col2">Longitude</oasis:entry>
         <oasis:entry colname="col3">Performance</oasis:entry>
         <oasis:entry colname="col4">Reference</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6">Latitude</oasis:entry>
         <oasis:entry colname="col7">Longitude</oasis:entry>
         <oasis:entry colname="col8">Performance</oasis:entry>
         <oasis:entry colname="col9">Reference</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3">(%)</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"/>
         <oasis:entry colname="col7"/>
         <oasis:entry colname="col8">(%)</oasis:entry>
         <oasis:entry colname="col9">(nearest one)</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M212" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">14.510</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">132.45</oasis:entry>
         <oasis:entry colname="col3"><underline>48.014</underline></oasis:entry>
         <oasis:entry colname="col4"><bold>Tindal</bold></oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M213" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">14.750</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">131.82</oasis:entry>
         <oasis:entry colname="col8"><underline>55.000</underline></oasis:entry>
         <oasis:entry colname="col9"><bold>Tindal</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M214" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">15.450</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">128.12</oasis:entry>
         <oasis:entry colname="col3">35.890</oasis:entry>
         <oasis:entry colname="col4">Wyndham</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M215" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">12.620</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">131.57</oasis:entry>
         <oasis:entry colname="col8">54.726</oasis:entry>
         <oasis:entry colname="col9">Berrimah</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M216" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">29.500</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">149.85</oasis:entry>
         <oasis:entry colname="col3">34.521</oasis:entry>
         <oasis:entry colname="col4">Moree</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M217" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">13.370</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">134.95</oasis:entry>
         <oasis:entry colname="col8">46.438</oasis:entry>
         <oasis:entry colname="col9">Gove</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M218" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">26.420</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">146.27</oasis:entry>
         <oasis:entry colname="col3">25.205</oasis:entry>
         <oasis:entry colname="col4">Charleville</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M219" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">18.370</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">125.70</oasis:entry>
         <oasis:entry colname="col8">39.178</oasis:entry>
         <oasis:entry colname="col9">Halls Creek</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M220" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">36.030</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">146.03</oasis:entry>
         <oasis:entry colname="col3">24.384</oasis:entry>
         <oasis:entry colname="col4">Yarrawonga</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M221" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">25.750</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">149.70</oasis:entry>
         <oasis:entry colname="col8">38.151</oasis:entry>
         <oasis:entry colname="col9">Arcturus</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M222" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">19.630</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">134.18</oasis:entry>
         <oasis:entry colname="col3">21.507</oasis:entry>
         <oasis:entry colname="col4"><bold>Tennant Creek</bold></oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M223" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">16.870</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">142.95</oasis:entry>
         <oasis:entry colname="col8">28.219</oasis:entry>
         <oasis:entry colname="col9">Saddle Mtn</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M224" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">35.660</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">149.51</oasis:entry>
         <oasis:entry colname="col3">4.9315</oasis:entry>
         <oasis:entry colname="col4">Captain Flat</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M225" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">33.250</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">145.95</oasis:entry>
         <oasis:entry colname="col8">23.562</oasis:entry>
         <oasis:entry colname="col9">Wagga</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M226" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">16.880</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">145.75</oasis:entry>
         <oasis:entry colname="col3">0.0000</oasis:entry>
         <oasis:entry colname="col4">Cairns Airport</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M227" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">29.620</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">117.32</oasis:entry>
         <oasis:entry colname="col8">20.822</oasis:entry>
         <oasis:entry colname="col9">Geraldton</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M228" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">16.670</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">139.17</oasis:entry>
         <oasis:entry colname="col3">0.0000</oasis:entry>
         <oasis:entry colname="col4">Mornington Island</oasis:entry>
         <oasis:entry colname="col5"/>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M229" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">20.870</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7">136.45</oasis:entry>
         <oasis:entry colname="col8">19.178</oasis:entry>
         <oasis:entry colname="col9"><bold>Tennant Creek</bold></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
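      <p>The size of the relocations between stations that share a reference name in Table <xref ref-type="table" rid="TD1"/> can be estimated with a great-circle distance. The Python sketch below uses the haversine formula on a spherical Earth; it treats the negative coordinate of each row as the latitude (degrees north) and the other as the longitude (degrees east). The helper name <monospace>haversine_km</monospace> and the mean Earth radius of 6371 km are assumptions made for illustration and are not taken from CRO<sup>2</sup>A.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (latitude, longitude) pairs copied from Table D1.
tindal_new = (-14.510, 132.45)
tindal_cro2a = (-14.750, 131.82)
tennant_new = (-19.630, 134.18)
tennant_cro2a = (-20.870, 136.45)

tindal_shift_km = haversine_km(*tindal_new, *tindal_cro2a)
tennant_shift_km = haversine_km(*tennant_new, *tennant_cro2a)

print(f"Tindal relocation: {tindal_shift_km:.0f} km")
print(f"Tennant Creek relocation: {tennant_shift_km:.0f} km")
```
]]></preformat>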

      <p id="d2e5195">According to this evaluation, the overall performance of the network proposed by CRO<sup>2</sup>A (36.142 %) is greater than that of the monitoring network of <xref ref-type="bibr" rid="bib1.bibx45" id="text.59"/> (27.790 %). In both designs, the best-performing ground-based measurement station lies near Tindal; however, CRO<sup>2</sup>A proposes a nearby relocation that increases the performance by 6.986 percentage points (from 48.014 % to 55.000 %). Conversely, while the Tennant Creek station scores 21.507 % in the <xref ref-type="bibr" rid="bib1.bibx45" id="text.60"/> design according to the scoring matrix, the relocation proposed by CRO<sup>2</sup>A lowers its performance to 19.178 %, making it the lowest-performing station of the CRO<sup>2</sup>A network. By comparison, the lowest non-zero performance in the <xref ref-type="bibr" rid="bib1.bibx45" id="text.61"/> design is the 4.9315 % of Captain Flat.</p>
      <p id="d2e5235">The above demonstrates the balanced processing that CRO<sup>2</sup>A performs, since it takes into account the trade-offs between location and performance value when designing the monitoring network and decides in favor of the overall performance of the monitoring network.</p>
      <p id="d2e5248">Similarly, <xref ref-type="bibr" rid="bib1.bibx45" id="text.62"/> evaluated the performance of the BASE network (the monitoring network existing at the time of publication of the article), composed of six ground-based measurement stations, and obtained a reduction of approximately 30 % in the uncertainty of surface fluxes. This BASE monitoring network was also analyzed using CRO<sup>2</sup>A, and the results are shown in Table <xref ref-type="table" rid="TD2"/>. The analysis revealed that only two of the six ground-based measurement stations had a performance greater than 0, giving an overall performance of 29.589 % for the monitoring network.</p>

<table-wrap id="TD2"><label>Table D2</label><caption><p id="d2e5268">Evaluation of the BASE monitoring network of the article by <xref ref-type="bibr" rid="bib1.bibx45" id="text.63"/> according to the scoring matrix proposed in this document.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry namest="col1" nameend="col4" align="center">BASE <xref ref-type="bibr" rid="bib1.bibx45" id="paren.64"/></oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Latitude</oasis:entry>
         <oasis:entry colname="col2">Longitude</oasis:entry>
         <oasis:entry colname="col3">Performance</oasis:entry>
         <oasis:entry colname="col4">Reference</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3">(%)</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M235" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">23.860</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">148.47</oasis:entry>
         <oasis:entry colname="col3">35.479</oasis:entry>
         <oasis:entry colname="col4">Arcturus</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M236" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">34.410</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">150.88</oasis:entry>
         <oasis:entry colname="col3">23.699</oasis:entry>
         <oasis:entry colname="col4">Wollongong</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M237" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">38.010</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">145.01</oasis:entry>
         <oasis:entry colname="col3">0.0000</oasis:entry>
         <oasis:entry colname="col4">Aspendale</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M238" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">40.700</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">144.70</oasis:entry>
         <oasis:entry colname="col3">0.0000</oasis:entry>
         <oasis:entry colname="col4">Cape Grim</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M239" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">12.420</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">130.89</oasis:entry>
         <oasis:entry colname="col3">0.0000</oasis:entry>
         <oasis:entry colname="col4">Darwin</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M240" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">12.200</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">131.00</oasis:entry>
         <oasis:entry colname="col3">0.0000</oasis:entry>
         <oasis:entry colname="col4">Gunn Point</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
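      <p>The overall performance values quoted for the networks appear consistent with averaging the per-station scores of Tables <xref ref-type="table" rid="TD1"/> and <xref ref-type="table" rid="TD2"/> over the stations that score above zero. This aggregation rule is an assumption inferred from the tabulated values, not a description of the CRO<sup>2</sup>A implementation, and the function name <monospace>mean_positive</monospace> is hypothetical. A short Python sketch of the arithmetic:</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def mean_positive(scores):
    """Mean of the strictly positive scores (assumed aggregation rule)."""
    positive = [s for s in scores if s > 0]
    if not positive:
        return 0.0
    return sum(positive) / len(positive)


# Per-station performance values (%) copied from Tables D1 and D2.
networks = {
    "CRO2A": [55.000, 54.726, 46.438, 39.178, 38.151,
              28.219, 23.562, 20.822, 19.178],
    "NEW": [48.014, 35.890, 34.521, 25.205, 24.384,
            21.507, 4.9315, 0.0, 0.0],
    "BASE": [35.479, 23.699, 0.0, 0.0, 0.0, 0.0],
}

overall = {name: mean_positive(scores) for name, scores in networks.items()}

for name, value in overall.items():
    print(f"{name}: {value:.3f} %")
```
]]></preformat>
      <p>Under this assumption, the sketch gives 36.142 % for CRO<sup>2</sup>A and 29.589 % for BASE, matching the values quoted in the text, and about 27.78 % for NEW, close to the reported 27.790 %.</p>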

      <p id="d2e5468"><xref ref-type="bibr" rid="bib1.bibx45" id="text.65"/> recommend increasing the number of ground-based measurement stations (doubling the number of stations in the BASE monitoring network), which could achieve a reduction in uncertainty of approximately 50 %. This monitoring network is called EXTENDED, and as previously mentioned, it contains 12 ground-based monitoring stations. According to the results of CRO<sup>2</sup>A, to improve the performance of the monitoring network (Fig. <xref ref-type="fig" rid="FD1"/>d), it is recommended to use the smallest number of stations whose performance falls within a 2.5 % band of the maximum value. According to Fig. <xref ref-type="fig" rid="FD1"/>a, this value corresponds to 13 towers. This saturation value is the first point beyond which adding further ground-based measurement stations does not change the overall performance of the monitoring network appreciably.</p>
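      <p>The saturation rule described above can be sketched in Python as follows. The sketch assumes that the 2.5 % band is relative to the maximum of the fitted curve, and it evaluates a logistic curve with hypothetical parameters (<monospace>upper</monospace>, <monospace>rate</monospace>, <monospace>midpoint</monospace>) that are not the values fitted by CRO<sup>2</sup>A; the function names are likewise illustrative.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def logistic(k, upper, rate, midpoint):
    """S-shaped performance curve evaluated at k stations."""
    return upper / (1.0 + math.exp(-rate * (k - midpoint)))


def first_within_band(station_counts, performance, band=0.025):
    """Smallest station count whose performance is within `band` of the maximum."""
    peak = max(performance)
    for k, p in zip(station_counts, performance):
        if p >= (1.0 - band) * peak:
            return k
    raise ValueError("performance sequence is empty")


# Hypothetical curve parameters, chosen only to illustrate the rule.
station_counts = list(range(1, 31))
performance = [logistic(k, upper=40.0, rate=0.5, midpoint=6.0)
               for k in station_counts]

k_sat = first_within_band(station_counts, performance)
print(f"saturation reached at k = {k_sat} stations")
```
]]></preformat>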
      <p id="d2e5487">It is worth noting that the similarities between the two designs occur even under considerably different conditions, which are the main source of the dissimilarities. First, the data come from different transport models and strategies (CAMS in this work and LPDM in the reference study). Second, the data used by <xref ref-type="bibr" rid="bib1.bibx45" id="text.66"/> correspond to the months of January and July, while the CRO<sup>2</sup>A design used data covering a full year. Third, <xref ref-type="bibr" rid="bib1.bibx45" id="text.67"/> used both anthropogenic and biogenic CO<sub>2</sub> data over the Australian territory, while this document only considers anthropogenic fields. Finally, the two design approaches are fundamentally different, since <xref ref-type="bibr" rid="bib1.bibx45" id="text.68"/> used inverse modeling, while CRO<sup>2</sup>A is based on a direct modeling strategy.</p>
</app>

<app id="App1.Ch1.S5">
  <label>Appendix E</label><title>Additional tests</title>
      <p id="d2e5535">Presented below are additional results complementing those in Sect. <xref ref-type="sec" rid="Ch1.S3"/>, specifically for the urban-scale application described in Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/>.</p><fig id="FE1"><label>Figure E1</label><caption><p id="d2e5544">Score percentages per tower and overall network performance value for the urban-scale application with <inline-formula><mml:math id="M245" display="inline"><mml:mrow><mml:msup><mml:mi>k</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:math></inline-formula>: <bold>(a)</bold> performance scores and <bold>(b)</bold> corresponding tower locations.</p></caption>
        
        <graphic xlink:href="https://gmd.copernicus.org/articles/19/3757/2026/gmd-19-3757-2026-f23.png"/>

      </fig>

</app>
  </app-group><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d2e5580">Codes, data, and examples are publicly available at <xref ref-type="bibr" rid="bib1.bibx22" id="text.69"/> (<ext-link xlink:href="https://doi.org/10.5281/zenodo.17161303" ext-link-type="DOI">10.5281/zenodo.17161303</ext-link>) and <xref ref-type="bibr" rid="bib1.bibx1" id="text.70"/> (<ext-link xlink:href="https://doi.org/10.5281/zenodo.17161462" ext-link-type="DOI">10.5281/zenodo.17161462</ext-link>).</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e5598">DMR developed the optimal design scheme and performed various tests using datasets provided by CA. CA performed model simulations, validated the projected results, and contributed to the analysis of the optimal design scheme outcomes. TL provided the original project concept and performed analysis and validation of the results. DMR, CA, and TL contributed extensively to writing this document.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e5604">The contact author has declared that none of the authors has any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e5610">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e5616">This study was supported by the Postdoctoral Research Program of the Université de Reims Champagne-Ardenne (URCA) and the French Ministry of Research and Education (MESRI) through the Chaire de Professeur Junior (CASAL project). Part of this study was funded by the National Center for Scientific Research (CNRS), the French National Space Agency (CNES), and the European Space Agency (ESA) as part of the MAGIC aircraft program. We thank the ROMEO HPC computing facility of URCA for enabling the algorithm testing.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e5623">This research has been supported by the Université de Reims Champagne-Ardenne (grant no. GSMA-7331-MATA0011) and the Agence Nationale de la Recherche (grant no. ANR-22-CPJ1-0002-01).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e5630">This paper was edited by Marko Scholze and reviewed by Alecia Nickless and one anonymous referee.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Abdallah(2025)</label><mixed-citation>Abdallah, C.: CRO<sup>2</sup>A – Illustrative example data (Version 0), Zenodo [data set], <ext-link xlink:href="https://doi.org/10.5281/zenodo.17161463" ext-link-type="DOI">10.5281/zenodo.17161463</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Andrews et al.(2014)Andrews, Kofler, Trudeau, Williams, Neff, Masarie, Chao, Kitzis, Novelli, Zhao, Dlugokencky, Lang, Crotwell, Fischer, Parker, Lee, Baumann, Desai, Stanier, De Wekker, Wolfe, Munger, and Tans</label><mixed-citation>Andrews, A. E., Kofler, J. D., Trudeau, M. E., Williams, J. C., Neff, D. H., Masarie, K. A., Chao, D. Y., Kitzis, D. R., Novelli, P. C., Zhao, C. L., Dlugokencky, E. J., Lang, P. M., Crotwell, M. J., Fischer, M. L., Parker, M. J., Lee, J. T., Baumann, D. D., Desai, A. R., Stanier, C. O., De Wekker, S. F. J., Wolfe, D. E., Munger, J. W., and Tans, P. P.: CO<sub>2</sub>, CO, and CH<sub>4</sub> measurements from tall towers in the NOAA Earth System Research Laboratory's Global Greenhouse Gas Reference Network: instrumentation, uncertainty analysis, and recommendations for future high-accuracy greenhouse gas monitoring efforts, Atmos. Meas. Tech., 7, 647–687, <ext-link xlink:href="https://doi.org/10.5194/amt-7-647-2014" ext-link-type="DOI">10.5194/amt-7-647-2014</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Bangare et al.(2015)Bangare, Dubal, Bangare, and Patil</label><mixed-citation>Bangare, S. L., Dubal, A., Bangare, P. S., and Patil, S.: Reviewing Otsu's method for image thresholding, Int. J. Appl. Eng. Res., 10, 21777–21783, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Bousquet et al.(2000)Bousquet, Peylin, Ciais, Le Quéré, Friedlingstein, and Tans</label><mixed-citation>Bousquet, P., Peylin, P., Ciais, P., Le Quéré, C., Friedlingstein, P., and Tans, P. P.: Regional Changes in Carbon Dioxide Fluxes of Land and Oceans Since 1980, Science, 290, 1342–1346, <ext-link xlink:href="https://doi.org/10.1126/science.290.5495.1342" ext-link-type="DOI">10.1126/science.290.5495.1342</ext-link>, 2000.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Chevallier et al.(2019)Chevallier, Remaud, O'Dell, Baker, Peylin, and Cozic</label><mixed-citation>Chevallier, F., Remaud, M., O'Dell, C. W., Baker, D., Peylin, P., and Cozic, A.: Objective evaluation of surface- and satellite-driven carbon dioxide atmospheric inversions, Atmos. Chem. Phys., 19, 14233–14251, <ext-link xlink:href="https://doi.org/10.5194/acp-19-14233-2019" ext-link-type="DOI">10.5194/acp-19-14233-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Copernicus Climate Change Service(2018)</label><mixed-citation>Copernicus Climate Change Service: ERA5 hourly data on pressure levels from 1940 to present, <ext-link xlink:href="https://doi.org/10.24381/CDS.BD0915C6" ext-link-type="DOI">10.24381/CDS.BD0915C6</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Debnath et al.(2024)Debnath, Govardhan, Jat, Kalita, Yadav, Jena, Kumar, and Ghude</label><mixed-citation>Debnath, S., Govardhan, G., Jat, R., Kalita, G., Yadav, P., Jena, C., Kumar, R., and Ghude, S. D.: Black carbon emissions and its impact on the monsoon rainfall patterns over the Indian subcontinent: Insights into localized warming effects, Atmospheric Environ.: X, 22, 100257, <ext-link xlink:href="https://doi.org/10.1016/j.aeaoa.2024.100257" ext-link-type="DOI">10.1016/j.aeaoa.2024.100257</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>de Burgh-Day and Leeuwenburg(2023)</label><mixed-citation>de Burgh-Day, C. O. and Leeuwenburg, T.: Machine learning for numerical weather and climate modelling: a review, Geosci. Model Dev., 16, 6433–6477, <ext-link xlink:href="https://doi.org/10.5194/gmd-16-6433-2023" ext-link-type="DOI">10.5194/gmd-16-6433-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Doan et al.(2023)Doan, Amagasa, Pham, Sato, Chen, and Kusaka</label><mixed-citation>Doan, Q.-V., Amagasa, T., Pham, T.-H., Sato, T., Chen, F., and Kusaka, H.: Structural <inline-formula><mml:math id="M249" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means (S <inline-formula><mml:math id="M250" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means) and clustering uncertainty evaluation framework (CUEF) for mining climate data, Geosci. Model Dev., 16, 2215–2233, <ext-link xlink:href="https://doi.org/10.5194/gmd-16-2215-2023" ext-link-type="DOI">10.5194/gmd-16-2215-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Doc et al.(2024)Doc, Ramonet, Bréon, Combaz, Chariot, Lopez, Delmotte, Cailteau-Fischbach, Nief, Laporte, Lauvaux, and Ciais</label><mixed-citation>Doc, J., Ramonet, M., Bréon, F.-M., Combaz, D., Chariot, M., Lopez, M., Delmotte, M., Cailteau-Fischbach, C., Nief, G., Laporte, N., Lauvaux, T., and Ciais, P.: The monitoring network of greenhouse gas (CO<sub>2</sub>, CH<sub>4</sub>) in the Paris' region, EGUsphere [preprint], <ext-link xlink:href="https://doi.org/10.5194/egusphere-2024-2826" ext-link-type="DOI">10.5194/egusphere-2024-2826</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Enting and Mansbridge(1989)</label><mixed-citation>Enting, I. G. and Mansbridge, J. V.: Seasonal sources and sinks of atmospheric CO<sub>2</sub>: Direct inversion of filtered data, Tellus B, 41, 111–126, <ext-link xlink:href="https://doi.org/10.3402/tellusb.v41i2.15056" ext-link-type="DOI">10.3402/tellusb.v41i2.15056</ext-link>, 1989.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Guo et al.(2024)Guo, Roychoudhury, Mirrezaei, Kumar, Sorooshian, and Arellano</label><mixed-citation>Guo, Y., Roychoudhury, C., Mirrezaei, M. A., Kumar, R., Sorooshian, A., and Arellano, A. F.: Investigating ground-level ozone pollution in semi-arid and arid regions of Arizona using WRF-Chem v4.4 modeling, Geosci. Model Dev., 17, 4331–4353, <ext-link xlink:href="https://doi.org/10.5194/gmd-17-4331-2024" ext-link-type="DOI">10.5194/gmd-17-4331-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Hadamard(1902)</label><mixed-citation> Hadamard, J.: Sur les problèmes aux dérivées partielles et leur signification physique, Princeton University Bulletin, 49–52, 1902.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Hari et al.(2016)Hari, Petäjä, Bäck, Kerminen, Lappalainen, Vihma, Laurila, Viisanen, Vesala, and Kulmala</label><mixed-citation>Hari, P., Petäjä, T., Bäck, J., Kerminen, V.-M., Lappalainen, H. K., Vihma, T., Laurila, T., Viisanen, Y., Vesala, T., and Kulmala, M.: Conceptual design of a measurement network of the global change, Atmos. Chem. Phys., 16, 1017–1028, <ext-link xlink:href="https://doi.org/10.5194/acp-16-1017-2016" ext-link-type="DOI">10.5194/acp-16-1017-2016</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Kort et al.(2013)Kort, Angevine, Duren, and Miller</label><mixed-citation>Kort, E. A., Angevine, W. M., Duren, R., and Miller, C. E.: Surface observations for monitoring urban fossil fuel CO<sub>2</sub> emissions: Minimum site location requirements for the Los Angeles megacity, J. Geophys. Res.-Atmos., 118, 1577–1584, <ext-link xlink:href="https://doi.org/10.1002/jgrd.50135" ext-link-type="DOI">10.1002/jgrd.50135</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Lauvaux et al.(2012)Lauvaux, Schuh, Bocquet, Wu, Richardson, Miles, and Davis</label><mixed-citation>Lauvaux, T., Schuh, A. E., Bocquet, M., Wu, L., Richardson, S., Miles, N., and Davis, K. J.: Network design for mesoscale inversions of CO<sub>2</sub> sources and sinks, Tellus B, 64, 17980, <ext-link xlink:href="https://doi.org/10.3402/tellusb.v64i0.17980" ext-link-type="DOI">10.3402/tellusb.v64i0.17980</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Lian et al.(2023)Lian, Lauvaux, Utard, Bréon, Broquet, Ramonet, Laurent, Albarus, Chariot, Kotthaus, Haeffelin, Sanchez, Perrussel, Denier van der Gon, Dellaert, and Ciais</label><mixed-citation>Lian, J., Lauvaux, T., Utard, H., Bréon, F.-M., Broquet, G., Ramonet, M., Laurent, O., Albarus, I., Chariot, M., Kotthaus, S., Haeffelin, M., Sanchez, O., Perrussel, O., Denier van der Gon, H. A., Dellaert, S. N. C., and Ciais, P.: Can we use atmospheric CO<sub>2</sub> measurements to verify emission trends reported by cities? Lessons from a 6-year atmospheric inversion over Paris, Atmos. Chem. Phys., 23, 8823–8835, <ext-link xlink:href="https://doi.org/10.5194/acp-23-8823-2023" ext-link-type="DOI">10.5194/acp-23-8823-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Liu and Yu(2009)</label><mixed-citation>Liu, D. and Yu, J.: Otsu Method and K-means, in: vol. 1, 2009 Ninth International Conference on Hybrid Intelligent Systems, 344–349, <ext-link xlink:href="https://doi.org/10.1109/HIS.2009.74" ext-link-type="DOI">10.1109/HIS.2009.74</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Lopez-Coto et al.(2017)Lopez-Coto, Ghosh, Prasad, and Whetstone</label><mixed-citation>Lopez-Coto, I., Ghosh, S., Prasad, K., and Whetstone, J.: Tower-based greenhouse gas measurement network design – The National Institute of Standards and Technology North East Corridor Testbed, Adv. Atmos. Sci., 34, 1095–1105, <ext-link xlink:href="https://doi.org/10.1007/s00376-017-6094-6" ext-link-type="DOI">10.1007/s00376-017-6094-6</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Lucas et al.(2015)Lucas, Yver Kwok, Cameron-Smith, Graven, Bergmann, Guilderson, Weiss, and Keeling</label><mixed-citation>Lucas, D. D., Yver Kwok, C., Cameron-Smith, P., Graven, H., Bergmann, D., Guilderson, T. P., Weiss, R., and Keeling, R.: Designing optimal greenhouse gas observing networks that consider performance and cost, Geosci. Instrum. Meth. Data Syst., 4, 121–137, <ext-link xlink:href="https://doi.org/10.5194/gi-4-121-2015" ext-link-type="DOI">10.5194/gi-4-121-2015</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Mahadevan et al.(2008)Mahadevan, Wofsy, Matross, Xiao, Dunn, Lin, Gerbig, Munger, Chow, and Gottlieb</label><mixed-citation>Mahadevan, P., Wofsy, S. C., Matross, D. M., Xiao, X., Dunn, A. L., Lin, J. C., Gerbig, C., Munger, J. W., Chow, V. Y., and Gottlieb, E. W.: A satellite‐based biosphere parameterization for net ecosystem CO<sub>2</sub> exchange: Vegetation Photosynthesis and Respiration Model (VPRM), Global Biogeochem. Cy., 22, <ext-link xlink:href="https://doi.org/10.1029/2006gb002735" ext-link-type="DOI">10.1029/2006gb002735</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Matajira-Rueda et al.(2025)Matajira-Rueda, Abdallah, and Lauvaux</label><mixed-citation>Matajira-Rueda, D., Abdallah, C., and Lauvaux, T.: CRO<sup>2</sup>A, Zenodo [code and data set], <ext-link xlink:href="https://doi.org/10.5281/zenodo.17161303" ext-link-type="DOI">10.5281/zenodo.17161303</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>McDowall and Dampney(2006)</label><mixed-citation> McDowall, L. M. and Dampney, R. A.: Calculation of threshold and saturation points of sigmoidal baroreflex function curves, Am. J. Physiol.-Heart Circul. Physiol., 291, H2003–H2007, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Miles et al.(2012)Miles, Richardson, Davis, Lauvaux, Andrews, West, Bandaru, and Crosson</label><mixed-citation>Miles, N. L., Richardson, S. J., Davis, K. J., Lauvaux, T., Andrews, A. E., West, T. O., Bandaru, V., and Crosson, E. R.: Large amplitude spatial and temporal gradients in atmospheric boundary layer CO<sub>2</sub> mole fractions detected with a tower-based network in the U.S. upper Midwest, J. Geophys. Res.-Biogeo., 117, <ext-link xlink:href="https://doi.org/10.1029/2011JG001781" ext-link-type="DOI">10.1029/2011JG001781</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Miles et al.(2017)Miles, Richardson, Lauvaux, Davis, Balashov, Deng, Turnbull, Sweeney, Gurney, Patarasuk, Razlivanov, Cambaliza, and Shepson</label><mixed-citation>Miles, N. L., Richardson, S. J., Lauvaux, T., Davis, K. J., Balashov, N. V., Deng, A., Turnbull, J. C., Sweeney, C., Gurney, K. R., Patarasuk, R., Razlivanov, I., Cambaliza, M. O. L., and Shepson, P. B.: Quantification of urban atmospheric boundary layer greenhouse gas dry mole fraction enhancements in the dormant season: Results from the Indianapolis Flux Experiment (INFLUX), Elementa, 5, 27, <ext-link xlink:href="https://doi.org/10.1525/elementa.127" ext-link-type="DOI">10.1525/elementa.127</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Nalini et al.(2019)Nalini, Sijikumar, Valsala, Tiwari, and Ramachandran</label><mixed-citation>Nalini, K., Sijikumar, S., Valsala, V., Tiwari, Y. K., and Ramachandran, R.: Designing surface CO<sub>2</sub> monitoring network to constrain the Indian land fluxes, Atmos. Environ., 218, 117003, <ext-link xlink:href="https://doi.org/10.1016/j.atmosenv.2019.117003" ext-link-type="DOI">10.1016/j.atmosenv.2019.117003</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>Nickless et al.(2015)Nickless, Ziehn, Rayner, Scholes, and Engelbrecht</label><mixed-citation>Nickless, A., Ziehn, T., Rayner, P. J., Scholes, R. J., and Engelbrecht, F.: Greenhouse gas network design using backward Lagrangian particle dispersion modelling – Part 2: Sensitivity analyses and South African test case, Atmos. Chem. Phys., 15, 2051–2069, <ext-link xlink:href="https://doi.org/10.5194/acp-15-2051-2015" ext-link-type="DOI">10.5194/acp-15-2051-2015</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>Nickless et al.(2020)Nickless, Scholes, Vermeulen, Beck, López-Ballesteros, Ardö, Karstens, Rigby, Kasurinen, Pantazatou, Jorch, and Kutsch</label><mixed-citation>Nickless, A., Scholes, R. J., Vermeulen, A., Beck, J., López-Ballesteros, A., Ardö, J., Karstens, U., Rigby, M., Kasurinen, V., Pantazatou, K., Jorch, V., and Kutsch, W.: Greenhouse gas observation network design for Africa, Tellus B, 72, 1–30, <ext-link xlink:href="https://doi.org/10.1080/16000889.2020.1824486" ext-link-type="DOI">10.1080/16000889.2020.1824486</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Patra and Maksyutov(2002)</label><mixed-citation>Patra, P. K. and Maksyutov, S.: Incremental approach to the optimal network design for CO<sub>2</sub> surface source inversion, Geophys. Res. Lett., 29, 97-1–97-4, <ext-link xlink:href="https://doi.org/10.1029/2001GL013943" ext-link-type="DOI">10.1029/2001GL013943</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>Rayner et al.(1996)Rayner, Enting, and Trudinger</label><mixed-citation>Rayner, P. J., Enting, I. G., and Trudinger, C. M.: Optimizing the CO<sub>2</sub> observing network for constraining sources and sinks, Tellus B, 48, 433–444, <ext-link xlink:href="https://doi.org/10.3402/tellusb.v48i4.15924" ext-link-type="DOI">10.3402/tellusb.v48i4.15924</ext-link>, 1996.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>Shiga et al.(2013)Shiga, Michalak, Randolph Kawa, and Engelen</label><mixed-citation>Shiga, Y. P., Michalak, A. M., Randolph Kawa, S., and Engelen, R. J.: In-situ CO<sub>2</sub> monitoring network evaluation and design: A criterion based on atmospheric CO<sub>2</sub> variability, J. Geophys. Res.-Atmos., 118, 2007–2018, <ext-link xlink:href="https://doi.org/10.1002/jgrd.50168" ext-link-type="DOI">10.1002/jgrd.50168</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>Shusterman et al.(2016)Shusterman, Teige, Turner, Newman, Kim, and Cohen</label><mixed-citation>Shusterman, A. A., Teige, V. E., Turner, A. J., Newman, C., Kim, J., and Cohen, R. C.: The BErkeley Atmospheric CO<sub>2</sub> Observation Network: initial evaluation, Atmos. Chem. Phys., 16, 13449–13463, <ext-link xlink:href="https://doi.org/10.5194/acp-16-13449-2016" ext-link-type="DOI">10.5194/acp-16-13449-2016</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Sim et al.(2024)Sim, Jeong, Park, Shin, Kim, Ban, and Lim</label><mixed-citation>Sim, S., Jeong, S., Park, C., Shin, J., Kim, I., Ban, S., and Lim, C.-S.: Designing an Atmospheric Monitoring Network to Verify National CO<sub>2</sub> Emissions, Asia-Pacif. J. Atmos. Sci., 60, 131–141, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Skamarock et al.(2008)Skamarock, Klemp, Dudhia, Gill, Barker, Duda, Huang, Wang, Powers et al.</label><mixed-citation>Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, D. M., Duda, M. G., Huang, X.-Y., Wang, W., and Powers, J. G.: A description of the advanced research WRF version 3, NCAR technical note 475, 10-5065, NCAR, <ext-link xlink:href="https://doi.org/10.5065/D68S4MVH" ext-link-type="DOI">10.5065/D68S4MVH</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Super et al.(2020)Super, Dellaert, Visschedijk, and Denier van der Gon</label><mixed-citation>Super, I., Dellaert, S. N. C., Visschedijk, A. J. H., and Denier van der Gon, H. A. C.: Uncertainty analysis of a European high-resolution emission inventory of CO<sub>2</sub> and CO to support inverse modelling and network design, Atmos. Chem. Phys., 20, 1795–1816, <ext-link xlink:href="https://doi.org/10.5194/acp-20-1795-2020" ext-link-type="DOI">10.5194/acp-20-1795-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Taquet et al.(2024)Taquet, Stremme, González del Castillo, Almanza, Bezanilla, Laurent, Alberti, Hase, Ramonet, Lauvaux, Che, and Grutter</label><mixed-citation>Taquet, N., Stremme, W., González del Castillo, M. E., Almanza, V., Bezanilla, A., Laurent, O., Alberti, C., Hase, F., Ramonet, M., Lauvaux, T., Che, K., and Grutter, M.: CO<sub>2</sub> and CO temporal variability over Mexico City from ground-based total column and surface measurements, Atmos. Chem. Phys., 24, 11823–11848, <ext-link xlink:href="https://doi.org/10.5194/acp-24-11823-2024" ext-link-type="DOI">10.5194/acp-24-11823-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Theodoridis and Koutroumbas(2006)</label><mixed-citation> Theodoridis, S. and Koutroumbas, K.: Pattern recognition, Elsevier, ISBN 9780080949123, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Thompson and Pisso(2023)</label><mixed-citation>Thompson, R. L. and Pisso, I.: A flexible algorithm for network design based on information theory, Atmos. Meas. Tech., 16, 235–246, <ext-link xlink:href="https://doi.org/10.5194/amt-16-235-2023" ext-link-type="DOI">10.5194/amt-16-235-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>van der Gon et al.(2019)van der Gon, Kuenen, Boleti, Muntean, Maenhout, Marshall, and Haussaire</label><mixed-citation>van der Gon, H. D., Kuenen, J., Boleti, E., Muntean, M., Maenhout, G., Marshall, J., and Haussaire, J.: Emissions and natural fluxes Dataset, CHE Consortium, TNO, <uri>https://che-project.eu/</uri> (last access: 5 May 2026), 2019. </mixed-citation></ref>
      <ref id="bib1.bibx40"><label>van der Woude et al.(2023)van der Woude, Peters, Joetzjer, Lafont, Koren, Ciais, Ramonet, Xu, Bastos, Botía, Sitch, de Kok, Kneuer, Kubistin, Jacotot, Loubet, Herig-Coimbra, Loustau, and Luijkx</label><mixed-citation>van der Woude, A. M., Peters, W., Joetzjer, E., Lafont, S., Koren, G., Ciais, P., Ramonet, M., Xu, Y., Bastos, A., Botía, S., Sitch, S., de Kok, R., Kneuer, T., Kubistin, D., Jacotot, A., Loubet, B., Herig-Coimbra, P.-H., Loustau, D., and Luijkx, I. T.: Temperature extremes of 2022 reduced carbon uptake by forests in Europe, Nat. Commun., 14, 6218, <ext-link xlink:href="https://doi.org/10.1038/s41467-023-41851-0" ext-link-type="DOI">10.1038/s41467-023-41851-0</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Vardag and Maiwald(2024)</label><mixed-citation>Vardag, S. N. and Maiwald, R.: Optimising urban measurement networks for CO<sub>2</sub> flux estimation: a high-resolution observing system simulation experiment using GRAMM/GRAL, Geosci. Model Dev., 17, 1885–1902, <ext-link xlink:href="https://doi.org/10.5194/gmd-17-1885-2024" ext-link-type="DOI">10.5194/gmd-17-1885-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Villalobos et al.(2025)Villalobos, Gómez-Ortiz, Scholze, Monteil, Karstens, Fiore, Brunner, Thanwerdas, and Cristofanelli</label><mixed-citation>Villalobos, Y., Gómez-Ortiz, C., Scholze, M., Monteil, G., Karstens, U., Fiore, A., Brunner, D., Thanwerdas, J., and Cristofanelli, P.: Towards improving top–down national CO<sub>2</sub> estimation in Europe: potential from expanding the ICOS atmospheric network in Italy, Environ. Res. Lett., 20, 054002, <ext-link xlink:href="https://doi.org/10.1088/1748-9326/adc41e" ext-link-type="DOI">10.1088/1748-9326/adc41e</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Wang and Song(2011)</label><mixed-citation>Wang, H. and Song, M.: Ckmeans.1d.dp: optimal <inline-formula><mml:math id="M270" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means clustering in one dimension by dynamic programming, R J., 3, 29–33, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Wang et al.(2023)Wang, Tian, Duan, Zhu, Liu, Zhang, Zhou, Zhao, Jin, Ding, Wang, and Piao</label><mixed-citation>Wang, Y., Tian, X., Duan, M., Zhu, D., Liu, D., Zhang, H., Zhou, M., Zhao, M., Jin, Z., Ding, J., Wang, T., and Piao, S.: Optimal design of surface CO<sub>2</sub> observation network to constrain China's land carbon sink, Sci. Bull., 68, 1678–1686, <ext-link xlink:href="https://doi.org/10.1016/j.scib.2023.07.010" ext-link-type="DOI">10.1016/j.scib.2023.07.010</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>Ziehn et al.(2014)Ziehn, Nickless, Rayner, Law, Roff, and Fraser</label><mixed-citation>Ziehn, T., Nickless, A., Rayner, P. J., Law, R. M., Roff, G., and Fraser, P.: Greenhouse gas network design using backward Lagrangian particle dispersion modelling – Part 1: Methodology and Australian test case, Atmos. Chem. Phys., 14, 9363–9378, <ext-link xlink:href="https://doi.org/10.5194/acp-14-9363-2014" ext-link-type="DOI">10.5194/acp-14-9363-2014</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Ziehn et al.(2016)Ziehn, Law, Rayner, and Roff</label><mixed-citation>Ziehn, T., Law, R. M., Rayner, P. J., and Roff, G.: Designing optimal greenhouse gas monitoring networks for Australia, Geosci. Instrum. Meth. Data Syst., 5, 1–15, <ext-link xlink:href="https://doi.org/10.5194/gi-5-1-2016" ext-link-type="DOI">10.5194/gi-5-1-2016</ext-link>, 2016.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>A novel cluster-based learning scheme to design optimal networks for atmospheric greenhouse gas monitoring (CRO<sup>2</sup>A version 1.0)</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Abdallah(2025)</label><mixed-citation>
      
Abdallah, C.: CRO<sup>2</sup>A – Illustrative example data (Version 0) [Data set], Zenodo [data set], <a href="https://doi.org/10.5281/zenodo.17161463" target="_blank">https://doi.org/10.5281/zenodo.17161463</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Andrews et al.(2014)Andrews, Kofler, Trudeau, Williams, Neff,
Masarie, Chao, Kitzis, Novelli, Zhao, Dlugokencky, Lang, Crotwell, Fischer,
Parker, Lee, Baumann, Desai, Stanier, De Wekker, Wolfe, Munger, and
Tans</label><mixed-citation>
      
Andrews, A. E., Kofler, J. D., Trudeau, M. E., Williams, J. C., Neff, D. H.,
Masarie, K. A., Chao, D. Y., Kitzis, D. R., Novelli, P. C., Zhao, C. L.,
Dlugokencky, E. J., Lang, P. M., Crotwell, M. J., Fischer, M. L., Parker, M. J., Lee, J. T., Baumann, D. D., Desai, A. R., Stanier, C. O., De Wekker, S. F. J., Wolfe, D. E., Munger, J. W., and Tans, P. P.: CO<sub>2</sub>, CO, and CH<sub>4</sub> measurements from tall towers in the NOAA Earth System Research
Laboratory's Global Greenhouse Gas Reference Network: instrumentation,
uncertainty analysis, and recommendations for future high-accuracy greenhouse
gas monitoring efforts, Atmos. Meas. Tech., 7, 647–687,
<a href="https://doi.org/10.5194/amt-7-647-2014" target="_blank">https://doi.org/10.5194/amt-7-647-2014</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Bangare et al.(2015)Bangare, Dubal, Bangare, and
Patil</label><mixed-citation>
      
Bangare, S. L., Dubal, A., Bangare, P. S., and Patil, S.: Reviewing Otsu's
method for image thresholding, Int. J. Appl. Eng. Res., 10, 21777–21783, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Bousquet et al.(2000)Bousquet, Peylin, Ciais, Quéré,
Friedlingstein, and Tans</label><mixed-citation>
      
Bousquet, P., Peylin, P., Ciais, P., Quéré, C. L., Friedlingstein, P., and Tans, P. P.: Regional Changes in Carbon Dioxide Fluxes of Land and Oceans Since 1980, Science, 290, 1342–1346, <a href="https://doi.org/10.1126/science.290.5495.1342" target="_blank">https://doi.org/10.1126/science.290.5495.1342</a>, 2000.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Chevallier et al.(2019)Chevallier, Remaud, O'Dell, Baker, Peylin, and Cozic</label><mixed-citation>
      
Chevallier, F., Remaud, M., O'Dell, C. W., Baker, D., Peylin, P., and Cozic,
A.: Objective evaluation of surface- and satellite-driven carbon dioxide
atmospheric inversions, Atmos. Chem. Phys., 19, 14233–14251, <a href="https://doi.org/10.5194/acp-19-14233-2019" target="_blank">https://doi.org/10.5194/acp-19-14233-2019</a>, 2019.


    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Copernicus Climate Change
Service(2018)</label><mixed-citation>
      
Copernicus Climate Change Service: ERA5 hourly data on pressure levels from 1940 to present, <a href="https://doi.org/10.24381/CDS.BD0915C6" target="_blank">https://doi.org/10.24381/CDS.BD0915C6</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Debnath et al.(2024)Debnath, Govardhan, Jat, Kalita, Yadav, Jena,
Kumar, and Ghude</label><mixed-citation>
      
Debnath, S., Govardhan, G., Jat, R., Kalita, G., Yadav, P., Jena, C., Kumar,
R., and Ghude, S. D.: Black carbon emissions and its impact on the monsoon
rainfall patterns over the Indian subcontinent: Insights into localized
warming effects, Atmospheric Environ.: X, 22, 100257, <a href="https://doi.org/10.1016/j.aeaoa.2024.100257" target="_blank">https://doi.org/10.1016/j.aeaoa.2024.100257</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>de Burgh-Day and Leeuwenburg(2023)</label><mixed-citation>
      
de Burgh-Day, C. O. and Leeuwenburg, T.: Machine learning for numerical weather and climate modelling: a review, Geosci. Model Dev., 16, 6433–6477, <a href="https://doi.org/10.5194/gmd-16-6433-2023" target="_blank">https://doi.org/10.5194/gmd-16-6433-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Doan et al.(2023)Doan, Amagasa, Pham, Sato, Chen, and
Kusaka</label><mixed-citation>
      
Doan, Q.-V., Amagasa, T., Pham, T.-H., Sato, T., Chen, F., and Kusaka, H.: Structural <i>k</i>-means (S <i>k</i>-means) and clustering uncertainty evaluation framework (CUEF) for mining climate data, Geosci. Model Dev., 16, 2215–2233, <a href="https://doi.org/10.5194/gmd-16-2215-2023" target="_blank">https://doi.org/10.5194/gmd-16-2215-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Doc et al.(2024)Doc, Ramonet, Bréon, Combaz, Chariot, Lopez,
Delmotte, Cailteau-Fischbach, Nief, Laporte, Lauvaux, and Ciais</label><mixed-citation>
      
Doc, J., Ramonet, M., Bréon, F.-M., Combaz, D., Chariot, M., Lopez, M., Delmotte, M., Cailteau-Fischbach, C., Nief, G., Laporte, N., Lauvaux, T., and Ciais, P.: The monitoring network of greenhouse gas (CO<sub>2</sub>, CH<sub>4</sub>) in the Paris' region, EGUsphere [preprint], <a href="https://doi.org/10.5194/egusphere-2024-2826" target="_blank">https://doi.org/10.5194/egusphere-2024-2826</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Enting and Mansbridge(1989)</label><mixed-citation>
      
Enting, I. G. and Mansbridge, J. V.: Seasonal sources and sinks of atmospheric CO<sub>2</sub>: Direct inversion of filtered data, Tellus B, 41, 111–126, <a href="https://doi.org/10.3402/tellusb.v41i2.15056" target="_blank">https://doi.org/10.3402/tellusb.v41i2.15056</a>, 1989.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Guo et al.(2024)Guo, Roychoudhury, Mirrezaei, Kumar, Sorooshian, and Arellano</label><mixed-citation>
      
Guo, Y., Roychoudhury, C., Mirrezaei, M. A., Kumar, R., Sorooshian, A., and Arellano, A. F.: Investigating ground-level ozone pollution in semi-arid and arid regions of Arizona using WRF-Chem v4.4 modeling, Geosci. Model Dev., 17, 4331–4353, <a href="https://doi.org/10.5194/gmd-17-4331-2024" target="_blank">https://doi.org/10.5194/gmd-17-4331-2024</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Hadamard(1902)</label><mixed-citation>
      
Hadamard, J.: Sur les problèmes aux dérivées partielles et leur signification physique, Princeton University Bulletin, 49–52, 1902.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Hari et al.(2016)Hari, Petäjä, Bäck, Kerminen, Lappalainen,
Vihma, Laurila, Viisanen, Vesala, and Kulmala</label><mixed-citation>
      
Hari, P., Petäjä, T., Bäck, J., Kerminen, V.-M., Lappalainen, H. K., Vihma, T., Laurila, T., Viisanen, Y., Vesala, T., and Kulmala, M.: Conceptual design of a measurement network of the global change, Atmos. Chem.
Phys., 16, 1017–1028, <a href="https://doi.org/10.5194/acp-16-1017-2016" target="_blank">https://doi.org/10.5194/acp-16-1017-2016</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Kort et al.(2013)Kort, Angevine, Duren, and Miller</label><mixed-citation>
      
Kort, E. A., Angevine, W. M., Duren, R., and Miller, C. E.: Surface
observations for monitoring urban fossil fuel CO<sub>2</sub> emissions: Minimum site location requirements for the Los Angeles megacity, J. Geophys. Res.-Atmos., 118, 1577–1584, <a href="https://doi.org/10.1002/jgrd.50135" target="_blank">https://doi.org/10.1002/jgrd.50135</a>, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>Lauvaux et al.(2012)Lauvaux, Schuh, Bocquet, Wu, Richardson, Miles,
and Davis</label><mixed-citation>
      
Lauvaux, T., Schuh, A. E., Bocquet, M., Wu, L., Richardson, S., Miles, N., and Davis, K. J.: Network design for mesoscale inversions of CO<sub>2</sub> sources and sinks, Tellus B, 64, 17980, <a href="https://doi.org/10.3402/tellusb.v64i0.17980" target="_blank">https://doi.org/10.3402/tellusb.v64i0.17980</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>Lian et al.(2023)Lian, Lauvaux, Utard, Bréon, Broquet, Ramonet,
Laurent, Albarus, Chariot, Kotthaus, Haeffelin, Sanchez, Perrussel, Denier
van der Gon, Dellaert, and Ciais</label><mixed-citation>
      
Lian, J., Lauvaux, T., Utard, H., Bréon, F.-M., Broquet, G., Ramonet, M.,
Laurent, O., Albarus, I., Chariot, M., Kotthaus, S., Haeffelin, M., Sanchez,
O., Perrussel, O., Denier van der Gon, H. A., Dellaert, S. N. C., and Ciais,
P.: Can we use atmospheric CO<sub>2</sub> measurements to verify emission trends
reported by cities? Lessons from a 6-year atmospheric inversion over Paris,
Atmos. Chem. Phys., 23, 8823–8835, <a href="https://doi.org/10.5194/acp-23-8823-2023" target="_blank">https://doi.org/10.5194/acp-23-8823-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>Liu and Yu(2009)</label><mixed-citation>
      
Liu, D. and Yu, J.: Otsu Method and K-means, in: vol. 1, 2009 Ninth International Conference on Hybrid Intelligent Systems, 344–349,
<a href="https://doi.org/10.1109/HIS.2009.74" target="_blank">https://doi.org/10.1109/HIS.2009.74</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Lopez-Coto et al.(2017)Lopez-Coto, Ghosh, Prasad, and
Whetstone</label><mixed-citation>
      
Lopez-Coto, I., Ghosh, S., Prasad, K., and Whetstone, J.: Tower-based
greenhouse gas measurement network design – The National Institute of
Standards and Technology North East Corridor Testbed, Adv. Atmos. Sci., 34, 1095–1105, <a href="https://doi.org/10.1007/s00376-017-6094-6" target="_blank">https://doi.org/10.1007/s00376-017-6094-6</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Lucas et al.(2015)Lucas, Yver Kwok, Cameron-Smith, Graven, Bergmann, Guilderson, Weiss, and Keeling</label><mixed-citation>
      
Lucas, D. D., Yver Kwok, C., Cameron-Smith, P., Graven, H., Bergmann, D.,
Guilderson, T. P., Weiss, R., and Keeling, R.: Designing optimal greenhouse
gas observing networks that consider performance and cost, Geosci. Instrum. Meth. Data Syst., 4, 121–137, <a href="https://doi.org/10.5194/gi-4-121-2015" target="_blank">https://doi.org/10.5194/gi-4-121-2015</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Mahadevan et al.(2008)Mahadevan, Wofsy, Matross, Xiao, Dunn, Lin,
Gerbig, Munger, Chow, and Gottlieb</label><mixed-citation>
      
Mahadevan, P., Wofsy, S. C., Matross, D. M., Xiao, X., Dunn, A. L., Lin, J. C., Gerbig, C., Munger, J. W., Chow, V. Y., and Gottlieb, E. W.: A
satellite‐based biosphere parameterization for net ecosystem CO<sub>2</sub> exchange: Vegetation Photosynthesis and Respiration Model (VPRM), Global Biogeochem. Cy., 22, <a href="https://doi.org/10.1029/2006gb002735" target="_blank">https://doi.org/10.1029/2006gb002735</a>, 2008.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Matajira-Rueda et al.(2025)Matajira-Rueda, Abdallah, and
Lauvaux</label><mixed-citation>
      
Matajira-Rueda, D., Abdallah, C., and Lauvaux, T.: CRO<sup>2</sup>A, Zenodo [code and data set], <a href="https://doi.org/10.5281/zenodo.17161303" target="_blank">https://doi.org/10.5281/zenodo.17161303</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>McDowall and Dampney(2006)</label><mixed-citation>
      
McDowall, L. M. and Dampney, R. A.: Calculation of threshold and saturation
points of sigmoidal baroreflex function curves, Am. J. Physiol.-Heart Circul. Physiol., 291, H2003–H2007, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Miles et al.(2012)Miles, Richardson, Davis, Lauvaux, Andrews, West,
Bandaru, and Crosson</label><mixed-citation>
      
Miles, N. L., Richardson, S. J., Davis, K. J., Lauvaux, T., Andrews, A. E.,
West, T. O., Bandaru, V., and Crosson, E. R.: Large amplitude spatial and
temporal gradients in atmospheric boundary layer CO<sub>2</sub> mole fractions detected with a tower-based network in the U.S. upper Midwest, J. Geophys. Res.-Biogeo., 117, <a href="https://doi.org/10.1029/2011JG001781" target="_blank">https://doi.org/10.1029/2011JG001781</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Miles et al.(2017)Miles, Richardson, Lauvaux, Davis, Balashov, Deng, Turnbull, Sweeney, Gurney, Patarasuk, Razlivanov, Cambaliza, and
Shepson</label><mixed-citation>
      
Miles, N. L., Richardson, S. J., Lauvaux, T., Davis, K. J., Balashov, N. V.,
Deng, A., Turnbull, J. C., Sweeney, C., Gurney, K. R., Patarasuk, R.,
Razlivanov, I., Cambaliza, M. O. L., and Shepson, P. B.: Quantification of
urban atmospheric boundary layer greenhouse gas dry mole fraction
enhancements in the dormant season: Results from the Indianapolis Flux
Experiment (INFLUX), Elementa, 5, 27, <a href="https://doi.org/10.1525/elementa.127" target="_blank">https://doi.org/10.1525/elementa.127</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>Nalini et al.(2019)Nalini, Sijikumar, Valsala, Tiwari, and
Ramachandran</label><mixed-citation>
      
Nalini, K., Sijikumar, S., Valsala, V., Tiwari, Y. K., and Ramachandran, R.:
Designing surface CO<sub>2</sub> monitoring network to constrain the Indian land fluxes, Atmos. Environ., 218, 117003, <a href="https://doi.org/10.1016/j.atmosenv.2019.117003" target="_blank">https://doi.org/10.1016/j.atmosenv.2019.117003</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>Nickless et al.(2015)Nickless, Ziehn, Rayner, Scholes, and
Engelbrecht</label><mixed-citation>
      
Nickless, A., Ziehn, T., Rayner, P. J., Scholes, R. J., and Engelbrecht, F.: Greenhouse gas network design using backward Lagrangian particle dispersion modelling – Part 2: Sensitivity analyses and South African test case, Atmos. Chem. Phys., 15, 2051–2069, <a href="https://doi.org/10.5194/acp-15-2051-2015" target="_blank">https://doi.org/10.5194/acp-15-2051-2015</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>Nickless et al.(2020)Nickless, Scholes, Vermeulen, Beck,
López-Ballesteros, Ardö, Karstens, Rigby, Kasurinen, Pantazatou, Jorch, and
Kutsch</label><mixed-citation>
      
Nickless, A., Scholes, R. J., Vermeulen, A., Beck, J., López-Ballesteros, A., Ardö, J., Karstens, U., Rigby, M., Kasurinen, V., Pantazatou, K., Jorch, V., and Kutsch, W.: Greenhouse gas observation network design for Africa, Tellus B, 72, 1–30, <a href="https://doi.org/10.1080/16000889.2020.1824486" target="_blank">https://doi.org/10.1080/16000889.2020.1824486</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Patra and Maksyutov(2002)</label><mixed-citation>
      
Patra, P. K. and Maksyutov, S.: Incremental approach to the optimal network
design for CO<sub>2</sub> surface source inversion, Geophys. Res. Lett., 29,
97-1–97-4, <a href="https://doi.org/10.1029/2001GL013943" target="_blank">https://doi.org/10.1029/2001GL013943</a>, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Rayner et al.(1996)Rayner, Enting, and Trudinger</label><mixed-citation>
      
Rayner, P. J., Enting, I. G., and Trudinger, C. M.: Optimizing the CO<sub>2</sub>
observing network for constraining sources and sinks, Tellus B, 48, 433–444, <a href="https://doi.org/10.3402/tellusb.v48i4.15924" target="_blank">https://doi.org/10.3402/tellusb.v48i4.15924</a>, 1996.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>Shiga et al.(2013)Shiga, Michalak, Randolph Kawa, and
Engelen</label><mixed-citation>
      
Shiga, Y. P., Michalak, A. M., Randolph Kawa, S., and Engelen, R. J.: In-situ
CO<sub>2</sub> monitoring network evaluation and design: A criterion based on
atmospheric CO<sub>2</sub> variability, J. Geophys. Res.-Atmos., 118, 2007–2018, <a href="https://doi.org/10.1002/jgrd.50168" target="_blank">https://doi.org/10.1002/jgrd.50168</a>, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>Shusterman et al.(2016)Shusterman, Teige, Turner, Newman, Kim, and
Cohen</label><mixed-citation>
      
Shusterman, A. A., Teige, V. E., Turner, A. J., Newman, C., Kim, J., and Cohen, R. C.: The BErkeley Atmospheric CO<sub>2</sub> Observation Network: initial
evaluation, Atmos. Chem. Phys., 16, 13449–13463,
<a href="https://doi.org/10.5194/acp-16-13449-2016" target="_blank">https://doi.org/10.5194/acp-16-13449-2016</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Sim et al.(2024)Sim, Jeong, Park, Shin, Kim, Ban, and
Lim</label><mixed-citation>
      
Sim, S., Jeong, S., Park, C., Shin, J., Kim, I., Ban, S., and Lim, C.-S.:
Designing an Atmospheric Monitoring Network to Verify National CO<sub>2</sub> Emissions, Asia-Pacif. J. Atmos. Sci., 60, 131–141, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Skamarock et al.(2008)Skamarock, Klemp, Dudhia, Gill, Barker, Duda,
Huang, Wang, Powers et al.</label><mixed-citation>
      
Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, D. M., Duda,
M. G., Huang, X.-Y., Wang, W., and Powers, J. G.: A description of the advanced research WRF version 3, NCAR technical note 475, NCAR, <a href="https://doi.org/10.5065/D68S4MVH" target="_blank">https://doi.org/10.5065/D68S4MVH</a>, 2008.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Super et al.(2020)Super, Dellaert, Visschedijk, and Denier van der
Gon</label><mixed-citation>
      
Super, I., Dellaert, S. N. C., Visschedijk, A. J. H., and Denier van der Gon,
H. A. C.: Uncertainty analysis of a European high-resolution emission
inventory of CO<sub>2</sub> and CO to support inverse modelling and network design, Atmos. Chem. Phys., 20, 1795–1816, <a href="https://doi.org/10.5194/acp-20-1795-2020" target="_blank">https://doi.org/10.5194/acp-20-1795-2020</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Taquet et al.(2024)Taquet, Stremme, González del Castillo, Almanza,
Bezanilla, Laurent, Alberti, Hase, Ramonet, Lauvaux, Che, and
Grutter</label><mixed-citation>
      
Taquet, N., Stremme, W., González del Castillo, M. E., Almanza, V., Bezanilla, A., Laurent, O., Alberti, C., Hase, F., Ramonet, M., Lauvaux, T., Che, K., and Grutter, M.: CO<sub>2</sub> and CO temporal variability over Mexico City from ground-based total column and surface measurements, Atmos. Chem. Phys., 24, 11823–11848, <a href="https://doi.org/10.5194/acp-24-11823-2024" target="_blank">https://doi.org/10.5194/acp-24-11823-2024</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Theodoridis and Koutroumbas(2006)</label><mixed-citation>
      
Theodoridis, S. and Koutroumbas, K.: Pattern recognition, Elsevier, ISBN 9780080949123, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Thompson and Pisso(2023)</label><mixed-citation>
      
Thompson, R. L. and Pisso, I.: A flexible algorithm for network design based on information theory, Atmos. Meas. Tech., 16, 235–246,
<a href="https://doi.org/10.5194/amt-16-235-2023" target="_blank">https://doi.org/10.5194/amt-16-235-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>van der Gon et al.(2019)van der Gon, Kuenen, Boleti, Muntean,
Maenhout, Marshall, and Haussaire</label><mixed-citation>
      
van der Gon, H. D., Kuenen, J., Boleti, E., Muntean, M., Maenhout, G.,
Marshall, J., and Haussaire, J.: Emissions and natural fluxes Dataset, CHE Consortium, TNO, <a href="https://che-project.eu/" target="_blank">https://che-project.eu/</a> (last access: 5 May 2026), 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>van der Woude et al.(2023)van der Woude, Peters, Joetzjer, Lafont,
Koren, Ciais, Ramonet, Xu, Bastos, Botía, Sitch, de Kok, Kneuer, Kubistin, Jacotot, Loubet, Herig-Coimbra, Loustau, and Luijkx</label><mixed-citation>
      
van der Woude, A. M., Peters, W., Joetzjer, E., Lafont, S., Koren, G., Ciais,
P., Ramonet, M., Xu, Y., Bastos, A., Botía, S., Sitch, S., de Kok, R.,
Kneuer, T., Kubistin, D., Jacotot, A., Loubet, B., Herig-Coimbra, P.-H.,
Loustau, D., and Luijkx, I. T.: Temperature extremes of 2022 reduced carbon
uptake by forests in Europe, Nat. Commun., 14, 6218,
<a href="https://doi.org/10.1038/s41467-023-41851-0" target="_blank">https://doi.org/10.1038/s41467-023-41851-0</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Vardag and Maiwald(2024)</label><mixed-citation>
      
Vardag, S. N. and Maiwald, R.: Optimising urban measurement networks for CO<sub>2</sub> flux estimation: a high-resolution observing system simulation experiment using GRAMM/GRAL, Geosci. Model Dev., 17, 1885–1902,
<a href="https://doi.org/10.5194/gmd-17-1885-2024" target="_blank">https://doi.org/10.5194/gmd-17-1885-2024</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Villalobos et al.(2025)Villalobos, Gómez-Ortiz, Scholze, Monteil,
Karstens, Fiore, Brunner, Thanwerdas, and Cristofanelli</label><mixed-citation>
      
Villalobos, Y., Gómez-Ortiz, C., Scholze, M., Monteil, G., Karstens, U.,
Fiore, A., Brunner, D., Thanwerdas, J., and Cristofanelli, P.: Towards
improving top–down national CO<sub>2</sub> estimation in Europe: potential from
expanding the ICOS atmospheric network in Italy, Environ. Res. Lett., 20, 054002, <a href="https://doi.org/10.1088/1748-9326/adc41e" target="_blank">https://doi.org/10.1088/1748-9326/adc41e</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Wang and Song(2011)</label><mixed-citation>
      
Wang, H. and Song, M.: Ckmeans.1d.dp: optimal <i>k</i>-means clustering in one
dimension by dynamic programming, R J., 3, 29–33, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Wang et al.(2023)Wang, Tian, Duan, Zhu, Liu, Zhang, Zhou, Zhao, Jin, Ding, Wang, and Piao</label><mixed-citation>
      
Wang, Y., Tian, X., Duan, M., Zhu, D., Liu, D., Zhang, H., Zhou, M., Zhao, M., Jin, Z., Ding, J., Wang, T., and Piao, S.: Optimal design of surface CO<sub>2</sub> observation network to constrain China's land carbon sink, Sci. Bull., 68, 1678–1686, <a href="https://doi.org/10.1016/j.scib.2023.07.010" target="_blank">https://doi.org/10.1016/j.scib.2023.07.010</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Ziehn et al.(2014)Ziehn, Nickless, Rayner, Law, Roff,
and Fraser</label><mixed-citation>
      
Ziehn, T., Nickless, A., Rayner, P. J., Law, R. M., Roff, G., and Fraser, P.: Greenhouse gas network design using backward Lagrangian particle dispersion modelling – Part 1: Methodology and Australian test case, Atmos. Chem. Phys., 14, 9363–9378, <a href="https://doi.org/10.5194/acp-14-9363-2014" target="_blank">https://doi.org/10.5194/acp-14-9363-2014</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Ziehn et al.(2016)Ziehn, Law, Rayner, and Roff</label><mixed-citation>
      
Ziehn, T., Law, R. M., Rayner, P. J., and Roff, G.: Designing optimal
greenhouse gas monitoring networks for Australia, Geosci. Instrum. Meth. Data Syst., 5, 1–15, <a href="https://doi.org/10.5194/gi-5-1-2016" target="_blank">https://doi.org/10.5194/gi-5-1-2016</a>, 2016.

    </mixed-citation></ref-html>--></article>
