Data input#

The data input component provides functions for reading remote sensing images, hotspot locations and (effective) wind fields.

Source databases#

The location and type of emission sources are used as input for plume detection and emission quantification. ddeq uses xarray datasets to store point source information. The dataset contains source names (source), longitudes (lon_o), latitudes (lat_o), labels for visualization (label) and source types (type).

CSV files#

ddeq includes a small list of sources as a comma-separated values (CSV) file that primarily contains cities and power plants used in previous studies. User-defined files containing other sources can be prepared in the same format.

The CSV file can be read with the following function:

ddeq.misc.read_point_sources(filename=None)#

Reads a list of point sources and converts it to the format used by the plume detection algorithm.

Parameters:

filename (str, default: None) – Name of CSV file with point source information (see “sources.csv” in ddeq.DATA_PATH for an example).

Returns:

xarray dataset containing point source locations

Return type:

xr.Dataset
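As a rough illustration of what a user-defined source list carries, the sketch below parses a small CSV text with the standard library. The column names simply mirror the dataset variables listed above (source, lon_o, lat_o, label, type) and are an assumption, as are the approximate coordinates; the bundled “sources.csv” in ddeq.DATA_PATH remains the reference for the actual layout expected by ddeq.misc.read_point_sources.

```python
import csv
import io

# Hypothetical file content: the header names mirror the dataset variables
# and are an assumption, not the verified layout of ddeq's sources.csv.
CSV_TEXT = """source,lon_o,lat_o,label,type
Berlin,13.40,52.52,Berlin,city
Janschwalde,14.46,51.84,Janschwalde,power plant
"""


def parse_sources(text):
    """Parse CSV text into a dict keyed by source name."""
    reader = csv.DictReader(io.StringIO(text))
    sources = {}
    for row in reader:
        sources[row["source"]] = {
            "lon_o": float(row["lon_o"]),  # longitude in degrees east
            "lat_o": float(row["lat_o"]),  # latitude in degrees north
            "label": row["label"],         # text used in plots
            "type": row["type"],           # e.g. "city" or "power plant"
        }
    return sources


sources = parse_sources(CSV_TEXT)
print(sorted(sources))
```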

CoCO2 point source database#

ddeq includes the CoCO2 global emission point source database:

ddeq.coco2.read_ps_catalogue(filename=None)#

Read CoCO2 point source catalogue [Guevara2023] in the format supported by ddeq.

Parameters:

filename (str, default: None) – Name of CSV file with point source information from CoCO2 database (see “coco2_ps_catalogue_v1.1.csv” in ddeq.DATA_PATH for an example).

Returns:

xarray dataset containing point source locations

Return type:

xr.Dataset

Notes

[Guevara2023]

Guevara, M., Enciso, S., Tena, C., Jorba, O., Dellaert, S., Denier van der Gon, H., and Pérez García-Pando, C.: A global catalogue of CO2 emissions and co-emitted species from power plants at a very high spatial and temporal resolution, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2023-95, in review, 2023.

Remote sensing images#

ddeq requires trace gas images to be provided as an xr.Dataset. The dataset needs variables for the trace gas columns and their uncertainties (e.g. “CO2” and “CO2_precision”). These variables need a units attribute, which is used for automatic unit conversion, and a noise_level attribute, which is used as random uncertainty. In addition, the central longitude and latitude of the pixels need to be provided as lon and lat.
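The requirements above can be checked before passing an image to ddeq. The following is a minimal sketch, not part of ddeq: the helper name missing_requirements is hypothetical, and it only checks the attributes on the trace gas variable itself, which is one reading of the requirement. It relies on membership tests, indexing and an attrs mapping, so it should also work on an xr.Dataset; a plain dict is used here as a stand-in to keep the example free of dependencies.

```python
from types import SimpleNamespace


def missing_requirements(data, gas):
    """Return a list of problems found in the image; empty if none."""
    problems = []
    # Trace gas column, its uncertainty and the pixel centres must exist.
    for name in (gas, f"{gas}_precision", "lon", "lat"):
        if name not in data:
            problems.append(f"missing variable '{name}'")
    # Assumption: units and noise_level are checked on the gas variable only.
    if gas in data:
        for attr in ("units", "noise_level"):
            if attr not in data[gas].attrs:
                problems.append(f"'{gas}' lacks attribute '{attr}'")
    return problems


# Stand-in for an xr.Dataset: a dict of objects with an ``attrs`` mapping.
image = {
    "CO2": SimpleNamespace(attrs={"units": "ppm", "noise_level": 0.7}),
    "CO2_precision": SimpleNamespace(attrs={"units": "ppm"}),
    "lon": SimpleNamespace(attrs={}),
    "lat": SimpleNamespace(attrs={}),
}
print(missing_requirements(image, "CO2"))  # prints []
```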

Sentinel-5P/TROPOMI#

Sentinel-5P/TROPOMI images can be read using xr.open_dataset after being downloaded and prepared by the ddeq.download_S5P module.

To iterate over TROPOMI data, a dataset class is available that loads TROPOMI data on demand. The class is used by the divergence method.

class ddeq.sats.Level2TropomiDataset#
__init__(pattern, root='')#

Level-2 class for TROPOMI NO2 product.

Parameters:
  • pattern (str) – A filename pattern used to match the TROPOMI files based on given date. Date formatting is used to find the correct file using, for example, “S5P_NO2_%Y%m%d.nc”.

  • root (str) – Data path to TROPOMI files.

__new__(**kwargs)#
read_date(date)#

Returns a list of TROPOMI NO2 Level-2 data.

Parameters:

date (datetime.datetime) – Date for which the TROPOMI files are read.

Returns:

List of TROPOMI datasets for given date.

Return type:

list of xr.Dataset
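The date formatting mentioned for the pattern parameter can be illustrated with the standard library. This is only a sketch of how a pattern such as “S5P_NO2_%Y%m%d.nc” expands for a given date; the helper name expand_pattern is hypothetical, and the class itself may resolve files differently (for example with wildcard matching).

```python
import os
from datetime import datetime


def expand_pattern(pattern, date, root=""):
    """Fill the strftime codes in ``pattern`` and prepend ``root``."""
    return os.path.join(root, date.strftime(pattern))


# %Y, %m and %d are replaced by the year, month and day of the date.
print(expand_pattern("S5P_NO2_%Y%m%d.nc", datetime(2021, 6, 5)))
# prints S5P_NO2_20210605.nc
```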

Synthetic CO2M images from the SMARTCARB dataset#

ddeq.smartcarb.read_level2(filename, co2_noise_scenario='medium', co2_cloud_threshold=0.01, co2_scaling=1.0, no2_noise_scenario='high', no2_cloud_threshold=0.3, no2_scaling=1.0, co_noise_scenario=None, co_cloud_threshold=0.05, co_scaling=1.0, make_no2_error_cloud_dependent=True, use_constant=False, seed='orbit', only_observations=True, add_background=False)#

Read synthetic XCO2, NO2 and CO observations from the SMARTCARB project [Kuhlmann2020].

Parameters:
  • filename (str) – Name of SMARTCARB Level-2 file

  • co2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO2 observations for vegetation albedo and solar zenith angle of 50° (VEG50 scenario): “low” -> 0.5 ppm, “medium” -> 0.7 ppm and “high” -> 1.0 ppm.

  • co2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 1% default cloud fraction.

  • co2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO2 emissions

  • no2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the NO2 observations: “low” -> 1e15 molecules cm-2 or 15% (whichever is larger) and “high” -> 2e15 molecules cm-2 or 20% (whichever is larger)

  • no2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 30% default cloud fraction.

  • no2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic NO2 emissions.

  • co_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO observations: “low” -> 4e17 molecules cm-2 or 10% (whichever is larger) and “high” -> 4e17 molecules cm-2 or 20% (whichever is larger)

  • co_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 5% default cloud fraction.

  • co_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO emissions

  • make_no2_error_cloud_dependent (boolean, optional) – If True, NO2 uncertainty depends on cloud fraction.

  • use_constant (boolean, optional) – Use constant emissions if True and time-varying emissions otherwise.

  • seed (string, optional) – “seed” used before generating the random noise for the Level-2 images. If seed==’orbit’, the seed is calculated based on the trace gas, satellite and orbit number, resulting in the same image every time data is read, which is useful for benchmarking studies.

  • only_observations (boolean, optional) – If False, noise-free trace gas array without cloud filtering will be added to the dataset.

  • add_background (boolean, optional) – If True, add array containing the background tracers, i.e. from anthropogenic emissions outside the model domain and, for CO2, biospheric fluxes.

Returns:

CO2M Level-2 orbit from SMARTCARB dataset.

Return type:

xr.Dataset

Notes

[Kuhlmann2020]

Kuhlmann, G., Clément, V., Marshall, J., Fuhrer, O., Broquet, G., Schnadt-Poberaj, C., Löscher, A., Meijer, Y., & Brunner, D. (2020). Synthetic XCO2, CO and NO2 observations for the CO2M and Sentinel-5 satellites [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4048228
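The noise scenarios of ddeq.smartcarb.read_level2 combine fixed values (CO2) with an "absolute value or percentage, whichever is larger" rule (NO2). The sketch below only illustrates that rule with the numbers quoted in the parameter list. The function and dictionary names are hypothetical, and the actual implementation may differ, for example when make_no2_error_cloud_dependent is True.

```python
# CO2 noise scenarios for the VEG50 case, in ppm (values from the text).
CO2_NOISE_PPM = {"low": 0.5, "medium": 0.7, "high": 1.0}

# NO2 noise scenarios: (absolute floor in molecules cm-2, relative fraction).
NO2_NOISE = {"low": (1e15, 0.15), "high": (2e15, 0.20)}


def no2_uncertainty(column, scenario="high"):
    """Random NO2 uncertainty (molecules cm-2) for a given column value."""
    absolute, relative = NO2_NOISE[scenario]
    # "Whichever is larger": the absolute value acts as a lower bound.
    return max(absolute, relative * abs(column))


# A small column is dominated by the absolute floor, a large one by the
# relative term.
print(no2_uncertainty(5e15, "high"))
print(no2_uncertainty(2e16, "high"))
```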

class ddeq.smartcarb.Level2Dataset#
__init__(data_path, constellation='ace', co2_noise_scenario='medium', co2_cloud_threshold=0.01, co2_scaling=1.0, no2_noise_scenario='high', no2_cloud_threshold=0.3, no2_scaling=1.0, co_noise_scenario=None, co_cloud_threshold=0.05, co_scaling=1.0, make_no2_error_cloud_dependent=True)#

A container class to provide access to SMARTCARB Level-2 data for given constellation and uncertainty scenario.

Parameters:
  • data_path (str) – Path to SMARTCARB Level-2 files

  • constellation (str, optional) – Code used for CO2M constellation.

  • co2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO2 observations for vegetation albedo and solar zenith angle of 50° (VEG50 scenario): “low” -> 0.5 ppm, “medium” -> 0.7 ppm and “high” -> 1.0 ppm.

  • co2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 1% default cloud fraction.

  • co2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO2 emissions

  • no2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the NO2 observations: “low” -> 1e15 molecules cm-2 or 15% (whichever is larger) and “high” -> 2e15 molecules cm-2 or 20% (whichever is larger)

  • no2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 30% default cloud fraction.

  • no2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic NO2 emissions.

  • co_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO observations: “low” -> 4e17 molecules cm-2 or 10% (whichever is larger) and “high” -> 4e17 molecules cm-2 or 20% (whichever is larger)

  • co_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 5% default cloud fraction.

  • co_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO emissions

  • make_no2_error_cloud_dependent (boolean, optional) – If True, NO2 uncertainty depends on cloud fraction.

__new__(**kwargs)#
read_date(date)#

Returns a list of SMARTCARB Level-2 data for given date using constellation and uncertainty scenario of instance.

Synthetic CO2M images from the CoCO2 library of plumes#

ddeq.coco2.read_level2(filename, data_path='.', co2_noise=0.7, no2_noise=3.3e-05, mask_out_of_domain=False, drop_duplicates=True)#

Read CO2M-like Level-2 data from the CoCO2 library of plumes [Koene2022].

Parameters:
  • filename (str) – Name of the file, following the pattern {team}_{region}_{suffix}.nc

  • data_path (str, optional) – Data path to filename.

  • co2_noise (float, optional) – Random noise added to CO2 fields (default: 0.7 ppm)

  • no2_noise (float, optional) – Random noise added to NO2 fields (default: 33 µmol m-2 = 2e15 cm-2)

  • mask_out_of_domain (boolean, optional) – For MicroHH simulations, remove CO2/NO2 values from CAMS outside MicroHH model domain.

  • drop_duplicates (boolean, optional) – If True, drop duplicated times.

Return type:

xr.Dataset

Notes

[Koene2022]

Erik Koene, & Dominik Brunner. (2022). CoCO2 WP4.1 Library of Plumes (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7448144
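The default no2_noise of ddeq.coco2.read_level2 is given in mol m-2, while the SMARTCARB scenarios quote molecules cm-2. The short calculation below reproduces the equivalence stated in the parameter list (33 µmol m-2 ≈ 2e15 cm-2); the function name is only illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def mol_m2_to_molec_cm2(value):
    """Convert a column from mol m-2 to molecules cm-2."""
    # 1 m2 = 1e4 cm2, so the column per cm2 is 1e4 times smaller.
    return value * AVOGADRO / 1e4


print(f"{mol_m2_to_molec_cm2(3.3e-05):.2e}")  # prints 1.99e+15
```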

Wind fields#

ddeq.wind.read_at_sources(time, sources, product='ERA5', data_path='.', radius=0.05, timesteps=1, era5_prefix='', vertical_average='gnfra')#

Read wind at provided sources (downloads ERA-5 data automatically if necessary).

Parameters:
  • time (pd.Timestamp) – Time of the satellite overpass.

  • sources (xr.Dataset) – A dataset containing the source information.

  • product (str, optional) – Wind product used (‘ERA5’ or ‘SMARTCARB’).

  • data_path (str, optional) – Path to the wind data files.

  • radius (float, optional) – Radius of circle around sources (in degrees) used for averaging wind field.

  • timesteps (int, optional (only ERA5)) – If larger than 1, also download ERA-5 wind fields for time steps prior to the satellite overpass given by time.

  • era5_prefix (str, optional (only ERA5)) – prefix for ERA-5 filename

  • vertical_average (str, optional (only ERA5)) – The approach used for vertically averaging winds for computing the effective wind speed (“gnfra”, “pbl_mean” or “pressure_levels”): “gnfra” computes the effective wind from GNFR-A/SNAP-1 emission profiles for power plants; “pbl_mean” computes the effective wind as the mean value in the planetary boundary layer; “pressure_levels” computes the effective wind by averaging pressure levels from 775 to 1000 hPa in ERA-5.

Returns:

Wind dataset with the wind u- and v- component, speed and direction at each source.

Return type:

xr.Dataset
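The returned dataset holds the u- and v-components together with speed and direction. As a reminder of how these quantities relate, the sketch below derives speed and direction from u and v. It assumes the meteorological convention (direction the wind blows from, in degrees clockwise from north); whether ddeq.wind.read_at_sources uses exactly this convention is not stated here and should be checked against the returned attributes.

```python
import math


def speed_and_direction(u, v):
    """Return wind speed and meteorological direction from u and v.

    u is the eastward and v the northward component (same units).
    The direction is where the wind comes from, in degrees from north.
    """
    speed = math.hypot(u, v)
    # atan2(u, v) is the bearing the wind blows towards; adding 180 degrees
    # gives the bearing it comes from.
    direction = (180.0 + math.degrees(math.atan2(u, v))) % 360.0
    return speed, direction


# Wind blowing from the east towards the west: u < 0, v = 0.
print(speed_and_direction(-5.0, 0.0))  # prints (5.0, 90.0)
```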

ddeq.wind.read_field(filename, product='ERA5', altitude='GNFR-A', average_below=False)#

Return wind field from file for different products. 3D wind fields are taken at the nearest altitude or averaged below it (see average_below). If altitude is “GNFR-A”, the vertically weighted wind field is used.

Parameters:
  • filename (str) – Name of file with SMARTCARB or ERA-5 wind fields.

  • product (str, optional) – Either “ERA5” (default) or “SMARTCARB” product.

  • altitude (str or float, optional) – The approach used for vertically averaging winds for computing the effective wind speed. Default is “GNFR-A” using the vertical emission profile for power plants. This requires that the file already includes the vertically averaged wind fields as “U_GNFR_A” and “V_GNFR_A”. Otherwise, if a number is given, the wind field is taken at the provided altitude or (if average_below is True) averaged below the given altitude.

  • average_below (boolean, optional) – If True, wind fields will be averaged below given altitude.

Returns:

2D wind field on model grid.

Return type:

xr.Dataset
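The two numeric altitude options of ddeq.wind.read_field (nearest level or average below) can be sketched for a single vertical profile as follows. This is an illustration only: the function name is hypothetical, and the plain unweighted mean is an assumption, since the actual averaging may weight levels by layer thickness.

```python
def wind_at_altitude(altitudes, values, altitude, average_below=False):
    """Select a wind value from a vertical profile.

    altitudes and values are sequences of equal length for one grid cell.
    """
    if average_below:
        # Unweighted mean of all levels at or below the given altitude
        # (assumption: no weighting by layer thickness).
        selected = [v for z, v in zip(altitudes, values) if z <= altitude]
        if not selected:
            raise ValueError("no levels at or below the given altitude")
        return sum(selected) / len(selected)
    # Otherwise pick the level closest to the requested altitude.
    nearest = min(range(len(altitudes)),
                  key=lambda i: abs(altitudes[i] - altitude))
    return values[nearest]


levels = [10.0, 100.0, 500.0, 1000.0]  # altitude in metres
u_wind = [2.0, 4.0, 6.0, 8.0]          # u-component in m/s

print(wind_at_altitude(levels, u_wind, 400.0))                      # prints 6.0
print(wind_at_altitude(levels, u_wind, 500.0, average_below=True))  # prints 4.0
```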