Automatic optical depth parametrization in radiative transfer model RTTOV v13 via LASSO-induced sparsity

Vargas Jiménez, Franklin; De los Reyes, Juan Carlos

doi:10.5194/gmd-18-8511-2025

Articles | Volume 18, issue 22

https://doi.org/10.5194/gmd-18-8511-2025

Development and technical paper

14 Nov 2025

Abstract

The assimilation of satellite spectral sounder data requires fast and accurate radiative transfer models. This study proposes a novel methodology to automatically parameterize atmospheric optical depths within the Radiative Transfer for TOVS (RTTOV) version 13 scheme using statistical thresholds across pressure levels and Least Absolute Shrinkage and Selection Operator (LASSO) regression to induce sparsity. Numerical experiments with Visible Infrared Imaging Radiometer Suite (VIIRS) infrared channels demonstrate that this approach significantly reduces computational costs while maintaining accuracy. The sparsity also facilitates the automatic selection of absorbing gases and predictors by channel and pressure level, making it particularly effective for multispectral instruments with numerous atmospheric variables. These findings highlight the potential of sparse regression methods to enhance the efficiency of radiative transfer models for satellite data assimilation.

Received: 27 Feb 2025 – Discussion started: 27 Mar 2025 – Revised: 22 Aug 2025 – Accepted: 08 Oct 2025 – Published: 14 Nov 2025

1 Introduction

In satellite data assimilation and remote sensing retrievals, as well as their applications in numerical weather prediction (NWP), the radiative transfer (RT) equation is the main model used to retrieve global atmospheric variables, such as temperature and the concentrations of trace gases, including water vapor, ozone, carbon dioxide, and other atmospheric constituents. This is achieved using top-of-atmosphere (TOA) radiance measurements from satellite sounders operating across different channels of the electromagnetic spectrum. The RT equation is implemented numerically as a forward model mainly through two approaches: Line-by-Line (LBL) radiative transfer models and fast radiative transfer (Fast-RT) models.

Line-by-line models simulate satellite radiance by rigorously integrating atmospheric physics and chemical phenomena. These models are accurate enough to match the precision of modern instruments, such as the hyperspectral sounders AIRS, CrIS and IASI. However, they are demanding in terms of CPU time and memory, making them impractical for use in operational data assimilation. Some of the best-known models in this category are LBLRTM, developed at Atmospheric and Environmental Research, Inc. (AER) (Clough et al., 1992; Clough and Iacono, 1995; Clough et al., 2005); AMSUTRAN, developed at the Met Office (UK) (Turner et al., 2019); and GENLN2, developed at the National Center for Atmospheric Research (NCAR) (Edwards, 1992). A comparison between LBLRTM and GENLN2 is presented in Matricardi (2007). Another code worth mentioning is kCARTA (DeSouza-Machado et al., 2020), a pseudo Line-by-Line model that uses precomputed and compressed representations of the computationally intensive physical processes of the RT model to compute radiances more quickly while maintaining accuracy.

On the other hand, the most common Fast-RT models estimate the expected radiance in a channel (what a sensor actually measures) and are typically based on statistical approaches. In these models, the most complex and computationally costly step of RT modeling, the calculation of atmospheric transmittances, is parameterized using statistical models trained on the output of Line-by-Line software for databases of real atmospheric profiles. The parameters are adjusted using standard linear regression models or other machine learning techniques. While these methods sacrifice a small degree of accuracy, they significantly reduce computational costs, making them practical for use in operational data assimilation. Some of the best-known models in this category are OPTRAN, developed by the NESDIS-NCEP community (McMillin et al., 1995; Kleespies et al., 2004; McMillin et al., 2006); the Joint Center for Satellite Data Assimilation (JCSDA) Community Radiative Transfer Model (CRTM) (Han et al., 2006; Chen et al., 2008); and the RTTOV model (see Saunders et al., 2018, and the references therein). Other studies using statistical approaches include Matricardi (2010), which incorporates principal component analysis in RTTOV, as well as several works that apply machine learning techniques for parametrization, feature reduction, and sampling strategies (Liu et al., 2009; Krishnan et al., 2012; Cao et al., 2021; Stegmann et al., 2022; Mauceri et al., 2022; Su et al., 2023).

Even though RTTOV is more efficient than line-by-line models, it can still be computationally expensive for some use cases¹. Indeed, in current Fast-RT models based on linear regression, such as OPTRAN and RTTOV, training is performed separately for each gas type and pressure level, resulting in an over-parametrization of the RT model. To reduce the number of parameters and make the evaluation of the trained RT model less computationally expensive, it is essential to carefully select the most significant gases for each spectral channel of each instrument type, reduce the number of pressure levels, and implement other ad hoc strategies. These decisions must account for the large number of possible combinations and trade-offs, and are typically made by expert teams.

One promising approach to reducing the number of parameters without relying on expert committees is the use of optimization methods that induce sparsity in the parameters. In particular, LASSO regression, a regularization method that penalizes the regression coefficients with the ℓ₁-norm, has proven effective for variable selection and model complexity reduction in various large-scale applications (see, e.g., Heilemann et al., 2024; Pak et al., 2025). In the context of radiative transfer, LASSO regression was applied by Cardall et al. (2023) to estimate water quality parameters such as clarity, temperature, and chlorophyll a, based on correlations with in-situ measurements and near-coincident Landsat spectral data, with a focus on model explainability. In Li et al. (2020), the authors proposed an algorithm for detecting hazardous clouds using passive infrared remote sensing technology with variable selection. Other studies that combine or compare LASSO with machine learning methods for remote sensing include: the removal of redundant features in PolSAR and optical images (Hong and Kong, 2021); estimation of aboveground forest biomass with variable selection (Wang et al., 2022a); identification of important environmental variables for retrieving soil moisture content (Wang et al., 2022b); evaluation of the accuracy and generalization capacity of grassland models (Smith et al., 2023); and a comparison of different machine learning methods for predicting soybean yield (Joshi et al., 2023).

Building on this approach, in this paper we target the automatic selection of gases and optical depth predictors in Fast RT models by inducing sparsity in the weight predictors using LASSO regression. We propose a parametrization of transmittances based on statistical thresholds to automatically select the appropriate gases by channel and pressure level, and to induce sparsity in the parameters by replacing the classical regression problem with a LASSO problem within the RTTOV framework. The proposed methodology is tested with VIIRS infrared channels, and the results are compared with the standard RTTOV model. To the best of the authors' knowledge, this is the first time that LASSO regression has been applied to the RTTOV model to automate the selection of gases and parameters.

One of the key aspects in LASSO models is the choice of the regularization weight in front of the ℓ₁-norm. This weight controls the trade-off between fitting the training data well and keeping the model simple by reducing the number of non-zero coefficients. In our context, selecting an appropriate regularization weight is crucial for effectively identifying the most relevant gases and optical depth predictors while avoiding overfitting. To establish a rigorous criterion for choosing this parameter – rather than relying on a tedious trial-and-error process – we propose a bilevel optimization approach (see, e.g., De los Reyes and Villacís, 2022; De los Reyes, 2023). The idea is to formulate an upper-level optimization problem that encodes a model quality criterion, while the LASSO problem serves as the lower-level constraint. In this article, we successfully test two types of loss functions: the first, based on an ℓ₀ seminorm that prescribes the number of non-zero predictors; and the second, inspired by a Bayesian Information Criterion-type objective.

This manuscript is organized as follows: Sect. 2 outlines the theoretical framework for the RT equation in Line-by-Line models and details the general scheme of Fast-RT methods, focusing on RTTOV. Section 3 introduces the proposed transmittance parametrization using statistical inference and a LASSO regression model, as well as the bilevel optimization approach for selecting the regularization weight. Section 4 presents the experimental settings and numerical results comparing RTTOV with the proposed method. Finally, Sect. 5 offers conclusions on the performance of the proposed approach.

2 Radiative Transfer Equation

The monochromatic radiative transfer equation for the upwelling radiance in a clear sky, without solar radiation contribution, for a non-scattering atmosphere and in local thermodynamic equilibrium, is given by:

\begin{matrix} (1) & \begin{aligned} I (ν, θ) & = τ_{s} (ν, θ) ϵ_{s} (ν, θ) B (ν, T_{s}) + \int_{τ_{s}}^{1} B (ν, T (p)) d τ \\ & \quad + (1 - ϵ_{s} (ν, θ)) τ_{s}^{2} (ν, θ) \int_{τ_{s}}^{1} \frac{B (ν, T (p))}{τ^{2}} d τ, \end{aligned} \end{matrix}

where I(ν,θ) is the monochromatic TOA radiance at wavenumber ν and satellite zenith angle θ; B(ν,T) is the Planck function at temperature T; and $τ (ν, θ, p, T, q)$ denotes the layer-to-space transmittance, which depends on pressure p, temperature T, and gas concentration q. Here, T_s, ϵ_s, and τ_s represent the surface skin temperature, emissivity, and transmittance, respectively (Weinreb et al., 1981).

The three terms correspond to surface emission, upward atmospheric emission, and downward atmospheric emission reflected at the surface (assuming specular reflection). Surface emissivity can be close to 1 for ν between 714 and 1250 cm⁻¹ and for surfaces such as bodies of water, ice, healthy plant leaves, and carbon powder, in which case the last term can be discarded.

The model described above, computed for each wavenumber ν, is called the Line-by-Line model, and the resulting radiance is monochromatic.

Satellite-measured radiance is polychromatic, simulated by convolving Eq. (1) with the instrument's Normalized Spectral Response Function (NSRF):

\begin{matrix} (2) & \overline{I} (ν^{*}, θ) = \int_{ν_{a}}^{ν_{b}} ϕ (ν, ν^{*}) I (ν, θ) d ν, \end{matrix}

where $ϕ (ν, ν^{*})$ is the NSRF, representing the sensitivity to radiance within the spectral channel [ν_a,ν_b], with ν^* representing the centroid of the response. Substituting Eq. (1) into Eq. (2), the polychromatic radiance for the spectral channel identified with ν^*, assuming ϵ_s=1, can be written as (see Weinreb et al., 1981):

\begin{matrix} (3) & \overline{I} (ν^{*}, θ) = {\overline{τ}}_{s} (ν^{*}, θ) B (ν^{*}, T_{es}) + \int_{{\overline{τ}}_{s}}^{1} B (ν^{*}, T_{e} (p)) d \overline{τ}, \end{matrix}

where T_es and T_e are empirical effective temperatures obtained via regression. The polychromatic transmittance is given by:

\begin{matrix} (4) & \overline{τ} (ν^{*}, θ, p, T, q) = \int_{ν_{a}}^{ν_{b}} ϕ (ν, ν^{*}) τ (ν, θ, p, T, q) d ν . \end{matrix}

Transmittance follows the Beer-Lambert law $τ = e^{- d}$, with the optical depth $d (ν, θ, p, T, q)$ accounting for absorption by gases (e.g., H₂O, O₃, CO₂, CH₄) and continuum effects. The monochromatic optical depth for a set of gases $g_{1}, \dots, g_{s}$ is:

\begin{matrix} (5) & d (ν, θ, p, T, q) = \frac{\sec (θ)}{g} \sum_{l = 1}^{s} \int_{0}^{p} K^{g_{l}} (ν, p^{'}, T (p^{'})) q^{g_{l}} (p^{'}) d p^{'}, \end{matrix}

where g is the gravitational acceleration and $K^{g_{l}}$ is the absorption function of gas g_l, modeled via Voigt profiles (see Lavrentieva et al., 2011). The optical depth is non-negative, so that τ ≤ 1.
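As a rough numerical illustration of Eq. (5) and the Beer-Lambert law, the following Python sketch integrates the absorption of a single gas over pressure with the trapezoidal rule. The function name, the pressure grid, and the constant absorption coefficient and mixing ratio are hypothetical values chosen for readability, and the optical depth is taken as non-negative so that the transmittance stays in (0, 1].

```python
import math

G = 9.80665  # standard gravitational acceleration, m s^-2


def optical_depth_profile(pressure, k_abs, q, theta_deg):
    """Cumulative optical depth of one gas from the top of the atmosphere to each level.

    pressure : level pressures in Pa, increasing downward
    k_abs    : absorption coefficient at each level (hypothetical, m^2 kg^-1)
    q        : gas mass mixing ratio at each level (kg kg^-1)
    """
    sec_theta = 1.0 / math.cos(math.radians(theta_deg))
    depths = [0.0]
    for i in range(1, len(pressure)):
        # trapezoidal rule for the integral of K * q over [p_{i-1}, p_i]
        f_prev = k_abs[i - 1] * q[i - 1]
        f_curr = k_abs[i] * q[i]
        dp = pressure[i] - pressure[i - 1]
        depths.append(depths[-1] + sec_theta / G * 0.5 * (f_prev + f_curr) * dp)
    return depths


# hypothetical five-level atmosphere with constant absorption and mixing ratio
pressure = [0.0, 2.5e4, 5.0e4, 7.5e4, 1.0e5]
k_abs = [0.01] * 5
q = [1e-3] * 5

d_nadir = optical_depth_profile(pressure, k_abs, q, 0.0)
d_slant = optical_depth_profile(pressure, k_abs, q, 60.0)  # sec(60 deg) = 2
tau_nadir = [math.exp(-d) for d in d_nadir]  # Beer-Lambert law
```

With these inputs the slant-path optical depth is twice the nadir one, which is the secant scaling that the viewing-angle predictors have to capture.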

2.1 Fast Radiative Transfer Model

Fast RT models discretize the atmosphere into L layers:

p_{0} < p_{1} < \dots < p_{L},

where p₀ is the pressure at the top of the atmosphere and p_L is the surface pressure. The polychromatic radiance in Eq. (3) is computed numerically, which requires a parameterization of the polychromatic transmittance to reduce the computational cost. In Fast-RT models, the polychromatic optical depth is parameterized and fitted via linear regression to approximate Eq. (5), following ideas from McMillin and Fleming (McMillin and Fleming, 1976; Fleming and McMillin, 1977; McMillin et al., 1979).

The polychromatic optical depth from layer i to the top of the atmosphere, for a single channel and gas g_l, is:

\begin{matrix} (6) & d_{i}^{g_{l}} = d_{i - 1}^{g_{l}} + \sum_{j = 1}^{m_{l}} w_{i j}^{g_{l}} X_{i j}^{g_{l}}, \quad d_{0}^{g_{l}} = 0, \quad i = 1, \dots, L, \end{matrix}

where $X_{i j}^{g_{l}}$ are predictors depending on view angle, temperature, and gas concentration. The parameters $w_{i j}^{g_{l}}$ define the model. Appendix B provides details on the RTTOV v13 predictors, and further information can be found in Saunders et al. (2017).

This parametrization includes a fixed gas mixture – whose spatio-temporal variations minimally affect radiance – and variable gases, primarily H₂O, optionally including O₃, CO₂, N₂O, CO, CH₄, and SO₂, varying by channel. Water vapor absorption may be split into line and continuum components.

The polychromatic transmittance from layer i to the top of the atmosphere for gas g_l is approximated by:

\begin{matrix} (7) & τ_{(i, 0)}^{g_{l}} = \exp (- d_{i}^{g_{l}}), \end{matrix}

and the total transmittance is approximated as:

\begin{matrix} (8) & τ_{(i, 0)}^{TOT} = \prod_{l = 1}^{s} τ_{(i, 0)}^{g_{l}} . \end{matrix}
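The recursion in Eq. (6) is a cumulative sum over layers, followed by an exponential (Eq. 7) and a product over gases (Eq. 8). A minimal Python sketch is given below; the weights and predictor values are made-up numbers, not RTTOV coefficients, and the function name is hypothetical.

```python
import math
from itertools import accumulate


def layer_to_space_transmittance(weights, predictors):
    """Eqs. (6)-(7) for one gas.

    weights[i][j] and predictors[i][j] refer to layer i + 1 (layers are numbered from 1).
    """
    increments = [
        sum(w * x for w, x in zip(w_i, x_i))
        for w_i, x_i in zip(weights, predictors)
    ]
    depths = list(accumulate(increments))  # d_i = d_{i-1} + sum_j w_ij X_ij, with d_0 = 0
    return [math.exp(-d) for d in depths]


# hypothetical example: 3 layers, gas 1 with two predictors, gas 2 with one
tau_gas1 = layer_to_space_transmittance(
    [[0.02, 0.01], [0.03, 0.00], [0.05, 0.02]],
    [[1.0, 0.5], [1.0, 0.8], [1.0, 1.2]],
)
tau_gas2 = layer_to_space_transmittance(
    [[0.01], [0.01], [0.02]],
    [[1.0], [1.0], [1.0]],
)

# Eq. (8): total transmittance as the product over gases
tau_total = [t1 * t2 for t1, t2 in zip(tau_gas1, tau_gas2)]
```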

Parameters are fitted using a database of M vertical atmospheric profiles:

(p_{i}, T_{i j}, q_{i j}^{g_{1}}, \dots, q_{i j}^{g_{s}}), i = 0, \dots, L, j = 1, \dots, M,

with polychromatic transmittances computed with Line-by-Line software for N view angles θ_k:

\begin{matrix} (9) & (τ_{i j k}^{g_{1}}, \dots, τ_{i j k}^{g_{s}}), \quad i = 1, \dots, L, \quad j = 1, \dots, M, \quad k = 1, \dots, N . \end{matrix}

Since the total polychromatic transmittance is not simply the product of the individual gas transmittances (unlike in the monochromatic case), the data (9) are corrected following Xiong and McMillin (2005) and McMillin et al. (2006), as in RTTOV v13 (Hocking et al., 2021), by introducing a corrective term $τ_{(i, 0)}^{COR}$ :

\begin{matrix} (10) & τ_{(i, 0)}^{TOT} = τ_{(i, 0)}^{COR} \prod_{l = 1}^{s} τ_{(i, 0)}^{g_{l}}, \end{matrix}

which is parameterized similarly to Eqs. (6) and (7). The corrective transmittance for training is:

τ_{i j k}^{COR} = \frac{τ_{i j k}^{TOT}}{{\hat{τ}}_{i j k}^{TOT}},

where $τ_{i j k}^{TOT}$ is the Line-by-Line polychromatic transmittance including all absorbers, and ${\hat{τ}}_{i j k}^{TOT}$ is the modeled transmittance from Eq. (8).
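The need for a correction can be seen on a toy channel. The Python sketch below assumes a uniform spectral response over four wavenumbers and two gases with made-up monochromatic transmittances. The plain product of the channel-averaged gas transmittances stands in for the modeled total of Eq. (8), and all names are hypothetical.

```python
# monochromatic transmittances of two gases at four wavenumbers (made-up values)
tau_g1 = [0.90, 0.50, 0.20, 0.60]
tau_g2 = [0.95, 0.40, 0.30, 0.80]


def channel_mean(values):
    """Channel average under a uniform (flat) spectral response."""
    return sum(values) / len(values)


# "true" polychromatic total: average of the monochromatic products
tau_tot_lbl = channel_mean([a * b for a, b in zip(tau_g1, tau_g2)])

# product of the polychromatic single-gas transmittances
tau_tot_prod = channel_mean(tau_g1) * channel_mean(tau_g2)

# corrective factor that restores the Line-by-Line total, as in Eq. (10)
tau_cor = tau_tot_lbl / tau_tot_prod
```

Here the two gases absorb at the same wavenumbers, so the product of averages underestimates the true total and the corrective factor is larger than 1.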

The linear regression fitting problem for gas g_l and layer i is:

\begin{matrix} (11) & ({LS}_{g_{l}}) min_{w_{i}^{g_{l}} \in R^{m_{l}}} \frac{1}{2 M N} {∥A_{i}^{g_{l}} w_{i}^{g_{l}} - y_{i}^{g_{l}}∥}_{2}^{2}, \end{matrix}

where $A_{i}^{g_{l}} \in R^{M N \times m_{l}}$ contains predictors $X_{i j}^{g_{l}}$ for angles, temperatures, and concentrations across profiles, and $y_{i}^{g_{l}} \in R^{M N}$ the corresponding optical depths.
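For a single gas and layer, problem (11) is an ordinary least-squares fit. The following NumPy sketch uses synthetic predictors and optical depths in place of the real training data. The number of predictors `m` and the coefficients `w_true` are hypothetical, and only the matrix shapes follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, m = 83, 6, 4  # profiles, view angles, predictors (m is a made-up value)

# synthetic stand-ins for the predictor matrix A and the optical depths y
A = rng.normal(size=(M * N, m))
w_true = np.array([0.5, -0.2, 0.0, 0.1])
y = A @ w_true + 1e-3 * rng.normal(size=M * N)

# least-squares solution of min_w ||A w - y||_2^2 (the scaling does not change the minimizer)
w_ols, *_ = np.linalg.lstsq(A, y, rcond=None)
rmse = np.sqrt(np.mean((A @ w_ols - y) ** 2))
```

In the full scheme this fit is repeated for every gas, layer, and channel, which is where the large parameter counts come from.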

In RTTOV v13, parameter counts per channel reach nearly 11 000, considering variable and fixed gases, layers, and corrections. Reduction is achieved by expert-based gas selection, layer thinning, and thresholding (see Saunders et al., 2017).

3 A Sparse Parametrization of Optical Depths

In this section, we present a methodology to significantly reduce the number of parameters used in optical depth parametrization within the RTTOV v13 framework. The methodology involves automatically selecting absorbing gases per channel and pressure level, as well as identifying the most important predictors for each atmospheric layer. This approach induces sparsity in the regression parameters by combining two tools. The first is statistical inference, which determines whether a given gas at a particular layer requires no parametrization, a parametrization with a single predictor, or a more complex parametrization as described in Eq. (6). The second is LASSO regression: in the latter case, the classic linear regression problem is replaced with a LASSO regression problem to select predictors and induce sparsity in the parameter vectors.

3.1 Parametrization Based on Statistical Inference

The aim here is to preprocess the data of the polychromatic transmittances in a channel to determine which atmospheric layers require optical depth parametrization and to automatically exclude gases that do not significantly contribute to the radiance absorption in that channel. To achieve this, we will use confidence intervals to estimate the true polychromatic transmittances.

For a gas g_l or correction term in a fixed layer i, we construct a confidence interval for the mean of the polychromatic transmittances of the layer i. This is given by:

[{\overline{τ}}_{i}^{g_{l}} - E_{i}^{g_{l}}, {\overline{τ}}_{i}^{g_{l}} + E_{i}^{g_{l}}]

where

E_{i}^{g_{l}} = Z_{1 - \frac{α}{2}} \frac{s_{i}^{g_{l}}}{\sqrt{N M}},

${\overline{τ}}_{i}^{g_{l}}$ is the mean polychromatic transmittance for layer i over N angles and M atmospheric profiles, $s_{i}^{g_{l}}$ is the corresponding standard deviation, and $Z_{1 - \frac{α}{2}}$ is the critical value of the reference distribution for a confidence level of 1−α. Given that the number of data points in each layer is NM, which is usually sufficiently large (in our experiments, for N=6 angles and M=83 profiles, NM=498), the standard normal distribution is used to obtain the critical value. Thus, the absolute error in approximating the true mean polychromatic transmittance of gas g_l in layer i by ${\overline{τ}}_{i}^{g_{l}}$ is at most $E_{i}^{g_{l}}$ , with probability α that the absolute error exceeds this value. In our case, the significance level is set to $α = 10^{- 6}$ , which corresponds to a confidence level of $1 - 10^{- 6}$ .
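The half-width of the interval can be computed with the Python standard library alone. The sketch below uses synthetic transmittance samples and assumes the sample standard deviation (n − 1 denominator), which the text does not specify; the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_half_width(samples, alpha=1e-6):
    """Half-width E of the (1 - alpha) confidence interval for the mean."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value Z_{1 - alpha/2}
    # stdev() is the sample standard deviation (an assumption of this sketch)
    return z * stdev(samples) / math.sqrt(len(samples))


# synthetic layer transmittances for N * M = 6 * 83 = 498 samples (made-up values)
samples = [0.95 + 0.01 * math.sin(k) for k in range(6 * 83)]

tau_mean = mean(samples)
E = confidence_half_width(samples)
z_crit = NormalDist().inv_cdf(1.0 - 1e-6 / 2.0)  # about 4.89 for alpha = 1e-6
```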

Based on the above, the following statistical thresholds for optical depth parametrizations are proposed. Let ϵ₁ and ϵ₂ be positive and sufficiently small values, used as thresholds to determine whether ${\overline{τ}}_{i}^{g_{l}}$ is close to the true value and whether it is close to 1. Define the mean optical depth for layer i as ${\overline{d}}_{i}^{g_{l}} = - \ln ({\overline{τ}}_{i}^{g_{l}})$ , and consider the following three cases:

Case I. If $E_{i}^{g_{l}} > ϵ_{1}$ , the polychromatic transmittance due to gas g_l in layer i has high variability with respect to the value of the atmospheric variables in that layer. In this case, the optical depth parametrization follows Eq. (6) for layer i.
Case II. If $E_{i}^{g_{l}} \leq ϵ_{1}$ and ${\overline{d}}_{i}^{g_{l}} > ϵ_{2}$ , unlike the previous case, the polychromatic transmittance due to gas g_l in layer i has low variability with respect to the value of the atmospheric variables in that layer, and can be estimated by ${\overline{τ}}_{i}^{g_{l}}$ , but is not close to 1. Thus, the optical depth can be parameterized with a single predictor as follows:
$d_{i}^{g_{l}} = d_{i - 1}^{g_{l}} + w_{i 0}^{g_{l}} X_{i 0}^{g_{l}},$
where $X_{i 0}^{g_{l}} = 1$ and $w_{i 0}^{g_{l}} = {\overline{d}}_{i}^{g_{l}}$ . If this occurs in all layers, the parametrization does not depend on atmospheric variables, and the gas g_l can be included with the fixed gases.
Case III. If $E_{i}^{g_{l}} \leq ϵ_{1}$ and ${\overline{d}}_{i}^{g_{l}} \leq ϵ_{2}$ , the polychromatic transmittance in layer i can not only be estimated by ${\overline{τ}}_{i}^{g_{l}}$ but is also close to 1, meaning that gas g_l does not cause significant absorption in this layer. The relative error of approximating ${\overline{τ}}_{i}^{g_{l}}$ by 1 is given by:
$\frac{1 - {\overline{τ}}_{i}^{g_{l}}}{{\overline{τ}}_{i}^{g_{l}}} = e^{{\overline{d}}_{i}^{g_{l}}} - 1 = {\overline{d}}_{i}^{g_{l}} e^{ξ} \leq ϵ_{2} e^{ϵ_{2}} \approx ϵ_{2},$
for some $ξ \in (0, {\overline{d}}_{i}^{g_{l}})$ . If this condition is met for all layers, then gas g_l is automatically discarded.

To summarize the above, the parametrization of optical depths based on statistical thresholds is as follows:

\begin{matrix} (12) & d_{0}^{g_{l}} = 0, \quad d_{i}^{g_{l}} = d_{i - 1}^{g_{l}} + \begin{cases} \sum_{j = 1}^{m_{l}} w_{i j}^{g_{l}} X_{i j}^{g_{l}}, & E_{i}^{g_{l}} > ϵ_{1}, \\ {\overline{d}}_{i}^{g_{l}}, & E_{i}^{g_{l}} \leq ϵ_{1} \text{ and } {\overline{d}}_{i}^{g_{l}} > ϵ_{2}, \\ 0, & \text{otherwise}, \end{cases} \end{matrix}

for $i = 1, 2, \dots, L$ . The transmittances from layer i to the top of the atmosphere are still calculated using Eq. (7).
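The three-way rule of Eq. (12) can be written as a small decision function. The Python sketch below uses made-up half-widths and mean transmittances for three layers of one gas; the function name and the threshold value are illustrative assumptions.

```python
import math


def classify_layer(half_width, tau_mean, eps1, eps2):
    """Return the case label of Eq. (12) and the fixed optical-depth increment, if any."""
    d_mean = -math.log(tau_mean)  # mean optical depth of the layer
    if half_width > eps1:
        return "I", None       # high variability: fit the predictors of Eq. (6)
    if d_mean > eps2:
        return "II", d_mean    # constant increment, no atmospheric predictors
    return "III", 0.0          # negligible absorption in this layer


eps = 1e-4  # hypothetical tolerance, with eps2 = eps1 as in the text
layers = [(5e-3, 0.80), (2e-5, 0.97), (1e-6, 0.99999)]  # (E_i, mean transmittance)

cases = [classify_layer(E, tau, eps, eps) for E, tau in layers]
labels = [label for label, _ in cases]

# a gas is discarded for the channel only if every layer falls in Case III
gas_discarded = all(label == "III" for label in labels)
```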

The statistical threshold tolerances ϵ₁ and ϵ₂ should be sufficiently small. In our experiments, we set ϵ₂=ϵ₁ and evaluate the model performance for different small values of ϵ₁.

3.2 LASSO Regression and optimal choice of regularization parameter

After discarding parameter groups with the statistical thresholds of Eq. (12), the parameters that remain in Case I would typically be estimated by solving an ordinary least squares (OLS) problem, which involves a large number of parameters.

To reduce the number of parameters, we propose to induce sparsity in the parameter vector $w_{i}^{g_{l}}$ by solving the LASSO problem. This is done by replacing the OLS problem (11) with the following optimization problem:

\begin{matrix} (13) & ({LASSO}_{g_{l}}) w_{i}^{g_{l}} = \arg min_{w \in R^{m_{l}}} L_{λ} (w) \end{matrix}

where

L_{λ} (w) = \frac{1}{M N} {∥A_{i}^{g_{l}} w - y_{i}^{g_{l}}∥}_{2}^{2} + λ {∥w∥}_{1}

and λ≥0 is the regularization parameter. As $λ \to + \infty$ , high sparsity is induced, and as λ→0, sparsity is low. Specifically, if λ=0, the problem reduces to the least squares problem (11).
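To make the effect of λ concrete, the sketch below solves the LASSO objective $L_{λ}$ with cyclic coordinate descent. This is a generic textbook solver chosen for brevity and swapped in for illustration; it is not the homotopy method used in the paper, and the data and function names are synthetic and hypothetical.

```python
import numpy as np


def soft_threshold(x, t):
    """Scalar soft-thresholding operator."""
    return np.sign(x) * max(abs(x) - t, 0.0)


def lasso_cd(A, y, lam, n_iter=300):
    """Minimize (1/n) ||A w - y||_2^2 + lam ||w||_1 by cyclic coordinate descent."""
    n, m = A.shape
    w = np.zeros(m)
    col_sq = (A ** 2).sum(axis=0)
    residual = y.astype(float).copy()  # y - A w, kept up to date
    for _ in range(n_iter):
        for j in range(m):
            rho = A[:, j] @ residual + col_sq[j] * w[j]
            # the 1/n scaling of the data term gives the threshold n * lam / 2
            w_new = soft_threshold(rho, 0.5 * n * lam) / col_sq[j]
            residual += A[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w


rng = np.random.default_rng(1)
n, m = 498, 6
A = rng.normal(size=(n, m))
w_true = np.array([1.0, 0.0, 0.0, -0.5, 0.0, 0.0])
y = A @ w_true + 0.01 * rng.normal(size=n)

# with the 1/n scaling used here, w = 0 once lam >= (2/n) * max|A^T y|
lam_max = 2.0 / n * np.max(np.abs(A.T @ y))

w_ols = lasso_cd(A, y, 0.0)              # lam = 0: least-squares solution
w_mid = lasso_cd(A, y, 0.1)              # moderate lam: sparse solution
w_zero = lasso_cd(A, y, 1.01 * lam_max)  # large lam: all coefficients vanish
```

For the moderate value of λ only the two informative predictors survive, and their coefficients are shrunk toward zero relative to the least-squares fit.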

The regularization parameter λ has to be carefully selected to ensure that the approximation of the transmittance in layer i maintains a high level of accuracy relative to the least squares solution (11), while achieving a model with fewer parameters. Although standard techniques such as cross-validation exist for tuning λ, they may not always be appropriate, especially when alternative loss criteria are more relevant to the specific modeling goals. To address this choice, we adopt a bilevel optimization approach (see, e.g., De los Reyes and Villacís, 2022; De los Reyes, 2023), where the LASSO problem forms the lower-level constraint and the upper-level objective reflects a model quality criterion. This results in the following bilevel problem:

\begin{matrix} (14) & \begin{aligned} min_{λ \in R} \quad & F (w (λ)) \\ s.t. \quad & w (λ) = \arg min_{w \in R^{m_{l}}} L_{λ} (w), \\ & 0 \leq λ \leq λ_{0}, \end{aligned} \end{matrix}

where λ₀>0 is a given upper bound. In the following, we show how to reduce this bilevel problem to a standard nonlinear optimization problem. For the sake of clarity, we omit the indices corresponding to gas and pressure level.

Under the assumption that the matrix A is full rank, problem (13) has a unique solution for each λ≥0, denoted by w(λ). The collection of these solutions, as λ varies over the positive real numbers, is called the regularization path $P = \{w (λ) : λ > 0\}$ . A key structural property of the regularization path is that it is well-defined, unique, continuous, and piecewise linear. Moreover, it can be computed using the homotopy algorithm for the LASSO problem (Osborne et al., 2000), which returns the vertices of the regularization path; its worst-case complexity is exponential, but its computational cost is low in practice. Both the properties and the algorithm are described in Mairal and Yu (2012). The algorithm constructs a finite, monotonically decreasing sequence of values $\{λ_{k}\}_{k = 0}^{r}$ , with $λ_{0} = ‖ A^{T} y ‖_{\infty}$ and λ_r=0. For each λ_k in this sequence, the corresponding solution to the LASSO problem, w(λ_k), is a vertex of the regularization path 𝒫, and it can be shown that w(λ)=0 for all λ≥λ₀. On each line segment of this path, the sparsity pattern of w(λ) does not change; that is, the support set $S (w (λ)) = \{j \in \{1, 2, \dots, m_{l}\} : w_{j} (λ) \neq 0\}$ remains fixed for all $λ \in (λ_{k + 1}, λ_{k}]$ .

These properties of the regularization path allow the bilevel problem (14) to be reduced to a standard one-dimensional optimization problem with bound constraints:

\begin{matrix} (15) & min_{λ \in [0, λ_{0}]} F (w (λ)) . \end{matrix}

We still need to establish the upper-level loss function F, which serves as a model quality criterion for the LASSO regularization path. To this end, we propose two formulations for empirical comparison: the first is based on the optimal selection of the regularization parameter in the LASSO problem using an ℓ₀-regression cost function; the second is based on a well-established statistical tool for optimal model selection, the Bayesian Information Criterion (BIC).

3.2.1 Bilevel formulation based on the ℓ₀ regression

The best subset selection problem (Bertsimas et al., 2016; Miller, 2002) consists of solving a least squares formulation that allows explicit control of sparsity through the choice of the number of predictors, that is:

\begin{matrix} (16) & \begin{aligned} min_{w \in R^{m_{l}}} & \frac{1}{M N} ‖ A w - y ‖_{2}^{2} \\ s.t. & card (S (w)) \leq k, \end{aligned} \end{matrix}

for a given $k \in \{1, \dots, m_{l}\}$ . As this problem is NP-hard, the computational cost can be prohibitive, especially when several subset sizes must be tested without prior knowledge of k. To mitigate this, more tractable relaxations have been proposed, such as the ℓ₀ regression, obtained from a penalized formulation of problem (16):

min_{w \in R^{m_{l}}} \frac{1}{N M} ‖ A w - y ‖_{2}^{2} + γ (card (S (w)) - k),

where γ>0 is the penalty parameter. Motivated by this problem, we propose as the merit function

F_{ℓ_{0}} (w) = \frac{1}{N_{v} M_{v}} ‖ A_{v} w - y_{v} ‖_{2}^{2} + γ (β) card (S (w)),

which is used in the bilevel problem (14) to assess LASSO solutions on validation data (quantities with subscript v), balancing generalization and complexity through a weighting parameter $β \in (0, 1]$ . As a reference, the OLS solution w(0) achieves the best fit with maximum complexity and corresponds to β=1, while w(λ₀)=0 is the opposite extreme, approached as β→0. Consequently, the penalty is defined as

γ (β) = \frac{1}{N M} (\frac{1}{β} - 1) (‖ y ‖_{2}^{2} - ‖ A w (0) - y ‖_{2}^{2}) .

To prioritize model data fidelity over low complexity, β should be close to 1. In our experimental setting, we choose $β = 1 - 10^{- 4}$ .

With $F_{ℓ_{0}} (w (λ))$ as the objective function, problem (15) has a piecewise continuous objective that is smooth along each linear segment of the regularization path, with discontinuities at $\{λ_{k}\}_{k = 0}^{r}$ . Moreover, $F_{ℓ_{0}} (w (λ))$ is a quadratic polynomial in λ for $λ \in (λ_{k + 1}, λ_{k}]$ , since w(λ) is linear in λ and card(𝒮(w(λ))) remains constant within this interval. If we denote by ${\tilde{λ}}_{k}$ the minimizer of this polynomial over the closure of this interval, then problem (15) reduces to a discrete parameter optimization problem:

\begin{matrix} (17) & min_{0 \leq k \leq r} F_{ℓ_{0}} (w ({\tilde{λ}}_{k})) . \end{matrix}
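The merit function $F_{ℓ_{0}}$ and the penalty γ(β) can be evaluated for any list of candidate coefficient vectors. In the NumPy sketch below the candidates are least-squares fits on a few hand-picked supports, which only stand in for the solutions along the LASSO path. The training and validation sets are synthetic, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_v, m = 498, 200, 6  # training size N*M, validation size N_v*M_v, predictors
w_true = np.array([1.0, 0.0, 0.0, -0.5, 0.0, 0.0])
A = rng.normal(size=(n, m))
y = A @ w_true + 0.01 * rng.normal(size=n)
A_v = rng.normal(size=(n_v, m))
y_v = A_v @ w_true + 0.01 * rng.normal(size=n_v)


def least_squares_on(support):
    """Least-squares fit on the training data, restricted to the given predictor indices."""
    w = np.zeros(m)
    if support:
        w[support] = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
    return w


w_ols = least_squares_on(list(range(m)))  # plays the role of w(0)


def gamma(beta):
    """Penalty weight gamma(beta) built from the training fit of w(0)."""
    explained = y @ y - np.sum((A @ w_ols - y) ** 2)
    return (1.0 / beta - 1.0) * explained / n


def merit_l0(w, beta):
    """Validation mean squared error plus gamma(beta) times the number of non-zeros."""
    fit = np.mean((A_v @ w - y_v) ** 2)
    return fit + gamma(beta) * np.count_nonzero(w)


# stand-in candidates: empty model, one predictor, two predictors, all predictors
candidates = [least_squares_on(s) for s in ([], [0], [0, 3], list(range(m)))]
beta = 1.0 - 1e-4
scores = [merit_l0(w, beta) for w in candidates]
best = int(np.argmin(scores))
```

With these synthetic data the two-predictor candidate has the lowest merit: dropping the four uninformative predictors costs almost nothing in validation error but saves four penalty units.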

3.2.2 Bilevel formulation based on the Bayesian Information Criterion

In this case, the choice of the loss function F(w) is inspired by the Bayesian Information Criterion for model selection (Schwarz, 1978). Similar to ℓ₀-regression, it penalizes model complexity but does not require a tuning parameter. Given a collection 𝒫 of candidate models, and letting σ(w) denote the maximum likelihood estimate of the error variance under model w∈𝒫, the BIC-based objective is given by

min_{w \in P} n \ln (σ (w)) + \ln (n) K (w),

where K(w) denotes the number of explanatory variables in model w (or a measure of model complexity), and n is the number of data points used to construct model w. If the “true model” belongs to 𝒫, then the probability that BIC selects this model approaches 1 as the number of data points increases.

In our context, the model set 𝒫 consists of LASSO solutions for each λ≥0, built using n=NM data points, and a good approximation to the true model is given by the ordinary least squares solution w(0). We define:

σ (w) = \frac{1}{N M} {∥A w - y∥}_{2}^{2} \quad \text{and} \quad K (w) = card (S (w)), \quad \text{for } w \in P .

The BIC-based objective function is then defined as:

F_{BIC} (w) = N M \ln (\frac{1}{N M} {∥A w - y∥}_{2}^{2}) + \ln (N M) card (S (w)) .

This is a piecewise continuous objective function, smooth along each linear segment of the regularization path and with discontinuities at $\{λ_{k}\}_{k = 0}^{r}$ . It can be verified that F_BIC(w(λ)) is monotonically increasing in $λ \in (λ_{k + 1}, λ_{k}]$ since card(𝒮(w(λ))) remains constant within this interval. Therefore, the solution of the bilevel problem (15) with the BIC-based merit function occurs at one of the discontinuity points λ_k. Consequently, problem (15) reduces to the discrete model selection problem:

\begin{matrix} (18) & min_{0 \leq k \leq r} F_{BIC} (w (λ_{k})) . \end{matrix}
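The behaviour of F_BIC along the path can be checked numerically in a special case where the path is known in closed form. The sketch below assumes an orthogonal design with $A^{T} A = n I$, which is not the setting of the paper: under that assumption the LASSO solution is a soft-thresholded least-squares solution and the path vertices are $2 |w_{j}|$ for the least-squares coefficients. All data and names are synthetic and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 498, 5
Q, _ = np.linalg.qr(rng.normal(size=(n, m)))  # orthonormal columns
A = np.sqrt(n) * Q                             # orthogonal design: A.T @ A = n * I
w_true = np.array([1.0, 0.0, -0.5, 0.0, 0.2])
y = A @ w_true + 0.01 * rng.normal(size=n)
w_ols = A.T @ y / n                            # least-squares solution for this design


def lasso_orthogonal(lam):
    """Closed-form minimizer of (1/n)||A w - y||^2 + lam ||w||_1 when A.T @ A = n * I."""
    return np.sign(w_ols) * np.maximum(np.abs(w_ols) - 0.5 * lam, 0.0)


def f_bic(w):
    """BIC-type objective: n ln(MSE) + ln(n) * number of non-zero coefficients."""
    mse = np.mean((A @ w - y) ** 2)
    return n * np.log(mse) + np.log(n) * np.count_nonzero(w)


# path vertices for this design: lambda_0 > lambda_1 > ... > lambda_r = 0
vertices = np.append(np.sort(2.0 * np.abs(w_ols))[::-1], 0.0)
scores = np.array([f_bic(lasso_orthogonal(lam)) for lam in vertices])
k_best = int(np.argmin(scores))  # discrete selection as in Eq. (18)

# interior of one segment: the support is fixed, so F_BIC grows with lambda
segment = np.linspace(vertices[2], vertices[1], 50)[1:-1]
segment_scores = np.array([f_bic(lasso_orthogonal(lam)) for lam in segment])
```

Inside the segment the number of non-zero coefficients is constant and the mean squared error grows with λ, so the objective is increasing there, consistent with restricting the search to the vertices.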

3.2.3 Post-LASSO for model bias reduction

An important characteristic of LASSO solutions is that they are biased toward zero whenever λ>0. As a result, the mean squared error of w(λ) may not accurately reflect the true likelihood of the model, particularly in the context of the BIC-based formulation. To address this, we use a post-penalized estimator, namely an ordinary least squares regression restricted to the set of predictors selected by LASSO (Belloni and Chernozhukov, 2011). This approach is known as the Post-LASSO problem. As a direct consequence of the predictor set remaining fixed within each line segment of the LASSO regularization path, the Post-LASSO problems can be formulated for each $k = 0, 1, \dots, r$ as:

\begin{aligned} min_{w \in R^{m_{l}}} & \frac{1}{N M} {∥A w - y∥}_{2}^{2} \\ s.t. & w_{j} = 0, for j \notin S (w (λ_{k})) . \end{aligned}

Let ${\{w_{LS}^{k}\}}_{k = 0}^{r}$ denote the set of Post-LASSO solutions corresponding to the sequence $\{λ_{k}\}_{k = 0}^{r}$ . Instead of using solutions from the LASSO regularization path in the ℓ₀-regression (Eq. 17) or BIC-based (Eq. 18) formulations, we employ the Post-LASSO solutions $w_{LS}^{k}$ , which provide an alternative model selector with reduced bias:

\begin{matrix} (19) & min_{0 \leq k \leq r} F (w_{LS}^{k}) . \end{matrix}

Finally, this formulation is used to select the weights for the optical depth parametrization for each gas and pressure level, using either the bilevel ℓ₀+LASSO regression or the bilevel BIC+LASSO regression formulations.
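The bias reduction obtained with Post-LASSO can be reproduced on synthetic data. The sketch below computes one LASSO solution with proximal-gradient (ISTA) iterations, a generic solver used here instead of the homotopy algorithm, and then refits ordinary least squares on the selected support. The data, the value of λ, and the function names are illustrative assumptions.

```python
import numpy as np


def lasso_ista(A, y, lam, n_iter=1000):
    """Proximal-gradient (ISTA) iterations for (1/n) ||A w - y||_2^2 + lam ||w||_1."""
    n = A.shape[0]
    step = n / (2.0 * np.linalg.norm(A, 2) ** 2)  # inverse Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = w - step * (2.0 / n) * (A.T @ (A @ w - y))             # gradient step
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft-thresholding
    return w


def post_lasso(A, y, w_lasso):
    """Least squares restricted to the support of a LASSO solution; other entries stay zero."""
    support = np.flatnonzero(w_lasso)
    w = np.zeros_like(w_lasso)
    if support.size:
        w[support] = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
    return w


def mse(w):
    return float(np.mean((A @ w - y) ** 2))


rng = np.random.default_rng(4)
n, m = 498, 6
A = rng.normal(size=(n, m))
w_true = np.array([1.0, 0.0, 0.0, -0.5, 0.0, 0.0])
y = A @ w_true + 0.01 * rng.normal(size=n)

w_lasso = lasso_ista(A, y, lam=0.1)  # sparse but shrunk toward zero
w_post = post_lasso(A, y, w_lasso)   # same support, shrinkage removed
```

Both vectors share the same support, but the refitted coefficients are closer to the generating ones and give a lower training error, which is why the refit is the better input for a likelihood-based criterion.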

4 Numerical Results

This section evaluates the performance of the proposed parametrization compared to the standard RTTOV v13. Specifically, it studies the level of sparsity achieved and its impact on accuracy relative to RTTOV v13 and Line-by-Line calculations using LBLRTM. Performance is measured via the root mean square error (RMSE) of the transmittances compared to Line-by-Line transmittances, and by assessing the brightness temperature (BT) approximation error from the Fast-RT models against Line-by-Line results. Additionally, the BT error is compared to the Noise-Equivalent Delta Temperature (NEdT) of the M-band VIIRS instrument to assess the proportion of profiles for which the model error remains below the instrument noise.

The numerical experiments do not include direct benchmarking against other existing Fast-RT models, only against standard RTTOV v13. They also do not evaluate scenarios where the assumptions of the method might break down, such as extreme atmospheric conditions, including extreme pollution events and environments with high volcanic activity.

4.1 Experiment settings

For training the RTTOV parametrizations and the proposed sparse variants, six variable gases are considered: H₂O, O₃, CO₂, N₂O, CO, and CH₄. The Fast-RT model can additionally consider SO₂ as a variable gas, but here it will be treated as a fixed gas among the total of 22 fixed gases considered. No distinction is made between water vapor absorption lines and continuum absorption. For the viewing angle, we consider 6 path secant angles from 1 to 2.25 with step 0.25 (from 0 to 63.61°).
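As a quick check of the viewing geometry, the six path secants can be converted to zenith angles with θ = arccos(1/sec θ). The short sketch below only restates the values given in the text; the variable names are illustrative.

```python
import math

# Six path secants from 1 to 2.25 in steps of 0.25, as in the text.
secants = [1.0 + 0.25 * k for k in range(6)]

# Zenith angle (degrees) whose secant equals each value: theta = arccos(1/sec).
angles_deg = [math.degrees(math.acos(1.0 / s)) for s in secants]

for s, a in zip(secants, angles_deg):
    print(f"sec = {s:.2f}  ->  theta = {a:.2f} deg")
```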

4.1.1 Spectral Response Functions of VIIRS M-bands

The VIIRS is an instrument on NOAA's Suomi NPP and NOAA-20 satellites, part of the Joint Polar Satellite System (JPSS). It features 16 moderate resolution bands (M-bands) that cover visible and infrared spectra. This study focuses on spectral response functions for bands M7 to M16, which cover the near (NIR), medium (MIR), and long (LIR) infrared ranges. In this study, we use the VIIRS SRF J2, which can be downloaded from the following link: https://ncc.nesdis.noaa.gov/NOAA-21/index.php (last access: 14 December 2023). Details on the centers and spectral ranges of these bands can be found in Tables 1 and 2 in Cao et al. (2017).

For each channel, the wavenumber ν and the corresponding Spectral Response Function (SRF) values are tabulated. The wavenumber tabulation typically covers a broader spectral range, denoted as [ν_a,ν_b], with noisy SRF values at the extremes of this interval. Therefore, the SRF must be truncated to a smaller interval that retains most of the relevant SRF information. Instead of using Tables 1 and 2 from Cao et al. (2017) for our calculations, we utilize channels with a spectral range broader than those. These channels are defined as $[ν^{*} - ν_{l}, ν^{*} + ν_{u}]$ , where ν^* is the centroid of SRF in [ν_a,ν_b], ν_l and ν_u are the tabulated wavenumber values closest to ν^* below and above, respectively, such that the relative truncation error does not exceed $ϵ = 9 \times 10^{- 4}$ . Specifically:

(1 - ϵ) \int_{ν_{a}}^{ν_{b}} ϕ (ν^{*}, ν) d ν \leq \int_{ν^{*} - ν_{l}}^{ν^{*} + ν_{u}} ϕ (ν^{*}, ν) d ν .

The integrals are calculated using the composite trapezoidal rule. The SRF data are then truncated and normalized within this new interval, and the centroid ν^* is recalculated. The updated channels and centroids are presented in Table 1. After the noisy tails of the SRF have been truncated in this way, the resulting NSRF for each channel is interpolated using natural cubic splines, so that polychromatic transmittances can be calculated at a much finer spectral resolution than that of the tabulated NSRF data. It can be shown that the error made by approximating the polychromatic transmittance with the truncated NSRF does not exceed ϵ.
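The truncation rule can be illustrated with a small sketch. It assumes a tabulated SRF on a wavenumber grid and grows a window outward from the centroid until the trapezoidal integral inside the window reaches (1 − ϵ) of the total. The greedy choice of which side to extend, the synthetic bell-shaped SRF, and the function names are assumptions made for illustration; the subsequent normalization and spline interpolation are not shown.

```python
import math


def trapz(x, y):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def truncate_srf(nu, srf, eps=9e-4):
    """Return (lo, hi) indices of a window, grown outward from the centroid,
    whose integral is at least (1 - eps) of the total."""
    total = trapz(nu, srf)
    centroid = trapz(nu, [n * s for n, s in zip(nu, srf)]) / total
    # Start from the two tabulated points that bracket the centroid.
    hi = next(i for i, n in enumerate(nu) if n >= centroid)
    lo = max(hi - 1, 0)
    while trapz(nu[lo:hi + 1], srf[lo:hi + 1]) < (1.0 - eps) * total:
        # Grow on the side that adds more area (a greedy choice, assumed here).
        gain_lo = trapz(nu[lo - 1:lo + 1], srf[lo - 1:lo + 1]) if lo > 0 else -1.0
        gain_hi = trapz(nu[hi:hi + 2], srf[hi:hi + 2]) if hi < len(nu) - 1 else -1.0
        if gain_lo >= gain_hi:
            lo -= 1
        else:
            hi += 1
    return lo, hi


# Synthetic bell-shaped SRF on a hypothetical wavenumber grid (cm^-1).
nu = [800.0 + 0.5 * i for i in range(201)]
srf = [math.exp(-0.5 * ((n - 850.0) / 8.0) ** 2) for n in nu]
lo, hi = truncate_srf(nu, srf)
kept = trapz(nu[lo:hi + 1], srf[lo:hi + 1]) / trapz(nu, srf)
```

Here `kept` is the fraction of the SRF integral retained by the truncated interval `[nu[lo], nu[hi]]`.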

Table 1. VIIRS IR M-bands (wavenumber).

4.1.2 Vertical profile database ECMWF83

For training the optical depth parametrization, we use the ECMWF83 database, which includes 83 vertical profiles with temperature and gas concentrations for H₂O, O₃, CO₂, N₂O, CO and CH₄, across 101 pressure levels, originally created to train RTTOV (Matricardi, 2008). A separate database with 22 vertical profiles covers fixed gases. These datasets are available from NWP SAF of EUMETSAT and can be downloaded at https://nwp-saf.eumetsat.int/site/ (last access: 10 November 2023).

4.1.3 Line-by-Line Transmittances with LBLRTM

In this study, LBLRTM v12.15.1 (February 2023) is employed for Line-by-Line calculations. The software uses the AER continuum model MT_CKD v4.1.1 for the continua of water vapor and other gases, and the AER Line Parameter Database v3.8.1 for line parameters, which consolidates various line spectral databases, primarily HITRAN 2016 (Gordon et al., 2017).

The principal parameters of the LBLRTM calculation used to generate the optical depths for training and the top-of-atmosphere radiances are the following:

The continuum absorption is not activated for isolated gases and fixed gases, nor when all gases are included.
The Voigt profile is chosen for the shape of spectral lines.
The spectral resolution is set to $d ν = {\bar{α}}_{ν} / 1.5$ where ${\bar{α}}_{ν}$ is the average value of the Voigt halfwidth for the layer. Consequently, the spectral resolution is not homogeneous across channels, achieving an average spectral resolution from $7.1 \times 10^{- 3}$ for M7 to $4.1 \times 10^{- 4}$ for M16.
The calculation of optical depths with the software is performed only for the observation point at nadir. For other angles, variations are made directly in the calculation of polychromatic transmittances.

4.1.4 RTTOV v13 and Proposals Settings

For brevity, we abbreviate the Fast-RT models as follows: RTTOV13 for the standard RTTOV v13; SI for RTTOV13 with statistical threshold and ordinary least squares for parameterization; BIC+L1 for RTTOV13 with statistical threshold and BIC+LASSO regression for parameterization; and L0+L1 for RTTOV13 with statistical threshold and ℓ₀+LASSO regression for parameterization.

We implemented the transmittance parametrization of RTTOV v13 as described in Saunders et al. (2020), using the same predictors, except for the method of selecting gases per channel, which is detailed below.

In standard RTTOV v13, regression parameters are obtained by including only the gases that exhibit absorption lines in each channel, as shown in Table 2. In the proposed RTTOV variants, which use statistical inference and LASSO regression, all gases are included in the training.

Additionally, there are other criteria for selecting predictors in the correction term and training data by level, which are listed below:

Threshold for gases correction term. Predictors for fixed gases are always included in the correction term. For other gases, predictors for a specific gas in a layer are included only if any of the corresponding optical depths in the training profiles for that layer exceeds a threshold of 0.01 for CH₄ and 0.005 for the other gases. As a result, for all the VIIRS channels studied, only predictors for fixed gases and water vapor are included in the correction term.
Threshold for Optical Depth Data Training. Optical depth data in a layer for a gas is omitted if the corresponding transmittance from the layer to the surface is less than $3 \times 10^{- 6}$ . As a result, only channel M10 is affected by this selection criterion.
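The two selection criteria above amount to simple threshold tests. The sketch below encodes them with the numerical thresholds quoted in the text; the function names, the `"FIXED"` label, and the input layout (one optical depth per training profile for a given gas and layer) are hypothetical choices for illustration.

```python
# Thresholds quoted in the text above.
CORRECTION_OD_THRESHOLD = {"CH4": 0.01}   # every other variable gas: 0.005
DEFAULT_OD_THRESHOLD = 0.005
MIN_LAYER_TO_SURFACE_TRANSMITTANCE = 3e-6


def include_in_correction_term(gas, layer_optical_depths):
    """True if predictors of `gas` enter the correction term for a layer.

    layer_optical_depths: optical depths of that gas in that layer, one
    value per training profile (hypothetical input layout).
    """
    if gas == "FIXED":
        return True  # fixed-gas predictors are always included
    threshold = CORRECTION_OD_THRESHOLD.get(gas, DEFAULT_OD_THRESHOLD)
    return any(od > threshold for od in layer_optical_depths)


def keep_training_sample(layer_to_surface_transmittance):
    """True if an optical depth sample is kept for training."""
    return layer_to_surface_transmittance >= MIN_LAYER_TO_SURFACE_TRANSMITTANCE
```

For example, `include_in_correction_term("CH4", [0.008])` is false because 0.008 does not exceed the CH₄ threshold of 0.01, whereas the same optical depth would pass the 0.005 threshold used for the other gases.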

Table 2. Gases considered in RTTOV v13 for VIIRS M-bands.

The performance of the three proposed models, SI, BIC+L1, and L0+L1, is evaluated using different statistical threshold parameters $ϵ_{1} \in \{10^{- 9}, 10^{- 8}, 10^{- 7}, 10^{- 6}\}$. Since the L0+L1 bilevel model is based on a validation data criterion for the upper-level merit function, we split the NM data randomly in half, using one half for training the LASSO problems and the other half as validation data for evaluating model quality using the ℓ₀ regression.
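A random half split of this kind can be sketched as follows. The sample count uses the 83 profiles and 6 secant angles mentioned in the text; treating each (profile, angle) pair as one sample, the fixed seed, and the function name `split_half` are assumptions for illustration.

```python
import random


def split_half(n_samples, seed=0):
    """Randomly split sample indices 0..n_samples-1 into two halves."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    half = n_samples // 2
    return sorted(indices[:half]), sorted(indices[half:])


# 83 profiles and 6 secants are taken from the text; treating each
# (profile, angle) pair as one sample is an assumption.
train_idx, valid_idx = split_half(83 * 6)
```

The first half would be used to solve the LASSO problems along the regularization path and the second half to evaluate the upper-level merit function.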

Table 3. Percentage of nonzero parameters in RTTOV v13 for each channel, for the standard configuration, SI with OLS regression, BIC+LASSO regression, and ℓ₀+LASSO regression. The second column represents the different statistical thresholds ϵ₁ used for the proposed RTTOV v13 variants.

4.2 Sparsity Pattern in the parametrization of optical depths

Table 3 summarizes the percentage of non-zero parameters (%NZ) out of a total of 11 000 parameters (worst-case scenario) for each type of optical depth model: RTTOV13, SI, BIC+L1, and L0+L1. Figures 1, 2, and 3 show the percentage of parameter usage and computation time relative to RTTOV13. Tables 4, 5, 6, and 7 provide details on the number of non-zero parameters (NNZ) for each gas type and correction factor, for $ϵ_{1} = 10^{- 6}$.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f01

Figure 1. Parameter usage (left) and runtime (right) of the SI method, expressed as percentages relative to those of RTTOV v13 (fixed at 100 %) for different values of ϵ₁.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f02

Figure 2. Parameter usage (left) and runtime (right) of the BIC+LASSO method, expressed as percentages relative to those of RTTOV v13 (fixed at 100 %) for different values of ϵ₁.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f03

Figure 3. Parameter usage (left) and runtime (right) of the L0+LASSO method, expressed as percentages relative to those of RTTOV v13 (fixed at 100 %) for different values of ϵ₁.

In Table 3, the increase in sparsity for the proposed parametrizations compared to the general RTTOV v13 scheme is evident. RTTOV v13 induces sparsity by manually selecting gases and applying optical depth thresholds to include predictors in the correction factor. Using $ϵ_{1} = 10^{- 6}$ as a reference, in the best-case scenario with channel M7, where greater sparsity is achieved with RTTOV13, the sparsity level of RTTOV13 (53.64 %) increases to 93.66 % for SI, 94.39 % for BIC+L1, and 96.22 % for L0+L1. Conversely, in the worst-case scenario with channels M10 and M11, where RTTOV13 achieves lower sparsity (20 %), the levels increase to 76.49 % and 77.13 % for SI, 80.12 % and 81.30 % for BIC+L1, and 90.50 % and 89.89 % for L0+L1. As the statistical threshold tolerances decrease, sparsity levels also decrease; however, they remain higher than those of RTTOV13, suggesting that the computational cost benefits are preserved while achieving better sparsity results with the proposed L0+L1 model.

In Figs. 1, 2, and 3, we present the percentage of parameter usage in the proposed optical depth approximations within RTTOV, relative to the number of parameters used in the standard RTTOV configuration, and the percentage of runtime required by the proposed schemes compared to standard RTTOV. The measured runtime corresponds to the average time of 200 evaluations of the parameterized function used to compute approximate transmittances for the 83 atmospheric profiles with 6 different viewing angles. For the following comparisons, we use $ϵ_{1} = 10^{- 6}$ as a reference. For the SI configuration, parameter usage across all channels ranges from 13.67 % to 54.21 % relative to standard RTTOV, corresponding to a runtime ranging from 29.99 % to 58.32 %; for the BIC+L1 configuration, usage ranges from 12.10 % to 43.38 %, with runtime from 26.75 % to 48.64 %; and for the L0+L1 configuration, usage ranges from 8.16 % to 34.39 %, with runtime from 13.77 % to 41.39 %. These results suggest that the computational cost of evaluating parameterized transmittances is significantly and proportionally reduced with the proposed parametrizations.
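The relative parameter usage follows directly from the sparsity levels: if a variant has sparsity s (percentage of zero parameters) and RTTOV13 has sparsity s₀, the usage relative to RTTOV13 is 100·(100 − s)/(100 − s₀). The sketch below applies this to the channel M7 sparsity levels quoted earlier for $ϵ_{1} = 10^{- 6}$; the function name is illustrative, and the results agree with the lower ends of the reported ranges only up to the rounding of the quoted percentages.

```python
def relative_usage(sparsity_new, sparsity_ref):
    """Parameter usage of a variant relative to a reference, in percent.

    Both arguments are sparsity levels in percent (share of zero parameters).
    """
    return 100.0 * (100.0 - sparsity_new) / (100.0 - sparsity_ref)


# Channel M7, eps_1 = 1e-6, sparsity levels quoted in the text.
rttov13 = 53.64
for name, sparsity in [("SI", 93.66), ("BIC+L1", 94.39), ("L0+L1", 96.22)]:
    print(f"{name}: {relative_usage(sparsity, rttov13):.2f} % of RTTOV13 parameters")
```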

Although the absolute runtime difference is small for this limited number of profiles, in practical scenarios where transmittance functions must be evaluated for hundreds of thousands of atmospheric profiles, as required in satellite data retrieval applications, the reduction in computational time becomes highly significant for the efficiency of the retrieval process.

As an illustrative example, from Fig. 2, for channel M15, for each 100 time units required to compute transmittances with the RTTOV13 model, the L0+L1 model takes only 41.69 time units with $ϵ_{1} = 10^{- 6}$ , and 59.63 time units with $ϵ_{1} = 10^{- 9}$ (worst-case), representing a significant reduction in runtime.

In Tables 5, 6, and 7, the effectiveness of introducing statistical thresholds to discard irrelevant gases by channel is evident compared to Table 4. A number of non-zero parameters below 100 for a specific gas corresponds to Case II of the statistical threshold parameterization, suggesting that the corresponding gas can be included with the fixed gases.

Table 4. Number of nonzero parameters by gas type and channel in RTTOV13.

Table 5. Number of nonzero parameters by gas type and channel in SI for $ϵ_{1} = 10^{- 6}$.

Table 6. Number of nonzero parameters by gas type and channel in BIC+L1 for $ϵ_{1} = 10^{- 6}$.

Table 7. Number of nonzero parameters by gas type and channel in L0+L1 for $ϵ_{1} = 10^{- 6}$.

To illustrate in more detail, we reference channels M11 and M12 and compare the sparsity patterns in Figs. 4 and 5 among the four parameterizations using $ϵ_{1} = 10^{- 6}$ as a reference. For the L0+L1 model and the remaining channels, see Appendix A, Figs. A1 and A2. The numbering of predictors and correctors follows RTTOV v13 (Saunders et al., 2020), see Appendix B, except for predictor 0, which corresponds to the predictor in Case II of the statistical inference proposal. Each column represents the parameters of a predictor for each pressure level, and each point in a column represents a non-zero parameter associated with that predictor at the corresponding pressure level.

For channel M11 with SI model (upper-middle Fig. 4), gases O₃, CO₂, and CO are automatically discarded, and fixed gases only need one predictor. Meanwhile, gases H₂O, N₂O, and CH₄ exhibit block-like sparsity patterns from surface pressure approximately to 200, 19, and 0.8 hPa, respectively, where concentrations of these gases are important and cause significant radiance absorption. For these gases with block-like sparsity patterns, replacing classical linear regression with L0+LASSO regression (bottom figure) clearly discards some predictors across all levels or shows them as less relevant, as seen in the sparsity patterns for CH₄ and N₂O. However, H₂O still shows sparsity, but it is difficult for this channel to determine if any predictor can be discarded at all levels due to the importance of this gas and the strong non-linear relationship among the secant angle, temperature, and gas concentration in the predictors defined for it. Using BIC+LASSO regression (lower-middle figure) highlights less relevant predictors for CH₄, but does not entirely discard it or any other gas predictor retained in the SI model. For the proposed models, no correction term is needed at all, showing that a good fit of the total transmittance is obtained by considering only the approximation of the individual gas transmittances.

For channel M12 with the SI model (upper-middle Fig. 5), only CO is automatically discarded, which is expected since this gas has no absorption lines in this channel. The SI model still clearly reveals the block-like sparsity patterns of predictors and correctors for each gas at the pressure levels where they contribute to absorption (upper-middle figure). From the figure, CO₂ appears to be relevant at high pressures, approximately above 767 hPa, while O₃ seems relevant between about 2 and 260 hPa. Using L0+LASSO regression (bottom figure) for these important pressure levels demonstrates that some predictors can be entirely discarded or downweighted, as seen for fixed gases, O₃, N₂O, and CH₄. Similarly, the BIC+L1 model (lower-middle figure) highlights less relevant predictors but does not completely discard any predictor retained in the SI model, except in the corrector terms.

A similar analysis can be performed for each channel, as shown in the appendix, where Figs. A1 and A2 display the sparsity patterns for all channels using the L0+L1 model. These figures clearly indicate which gases are relevant in each channel, the pressure level ranges where they play a significant role, and which predictors are most important for reconstructing the transmittance of each gas.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f04

Figure 4. Sparsity pattern for channel M11, comparing RTTOV13 (Top), SI (Upper-middle), BIC+L1 (Lower-middle), L0+L1 (Bottom) for $ϵ_{1} = 10^{- 6}$.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f05

Figure 5. Sparsity pattern for channel M12, comparing RTTOV13 (Top), SI (Upper-middle), BIC+L1 (Lower-middle), L0+L1 (Bottom) for $ϵ_{1} = 10^{- 6}$.

4.3 Validation of transmittances

To validate the proposed RTTOV v13 variants, we calculated the root mean square error (RMSE) of the total transmittance for all atmospheric layers, vertical profiles, and viewing angles, as shown in the following formula:

RMSE = {(\frac{1}{L M N} \sum_{i = 1}^{L} \sum_{j = 1}^{M} \sum_{k = 1}^{N} {(τ_{i j k}^{TOT} - {\tilde{τ}}_{i j k}^{TOT})}^{2})}^{\frac{1}{2}},

where L=100, M=83, and N=6. Here, $τ_{i j k}^{TOT}$ and ${\tilde{τ}}_{i j k}^{TOT}$ represent the polychromatic transmittances calculated using LBLRTM optical depths and their corresponding approximations obtained from Eq. (10) using the training data. The results are shown in Table 8. The values in the table correspond to RMSE×10⁴.
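The RMSE above is a plain root mean square over the three indices. A minimal sketch, assuming the transmittances are stored as nested lists indexed as [layer][profile][angle] (a hypothetical layout) and using made-up values on a tiny grid:

```python
import math


def transmittance_rmse(tau_ref, tau_fast):
    """RMSE over all layers, profiles and angles.

    Both inputs are nested lists indexed as [layer][profile][angle].
    """
    sq_sum, count = 0.0, 0
    for ref_layer, fast_layer in zip(tau_ref, tau_fast):
        for ref_prof, fast_prof in zip(ref_layer, fast_layer):
            for t_ref, t_fast in zip(ref_prof, fast_prof):
                sq_sum += (t_ref - t_fast) ** 2
                count += 1
    return math.sqrt(sq_sum / count)


# Tiny made-up example with L = 2 layers, M = 2 profiles, N = 1 angle.
tau_lbl = [[[0.90], [0.80]], [[0.70], [0.60]]]
tau_fit = [[[0.91], [0.80]], [[0.70], [0.58]]]
rmse = transmittance_rmse(tau_lbl, tau_fit)
```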

Table 8. RMSE of total transmittance for each channel, scaled by 10⁴, for the proposed RTTOV v13 variants. The second column indicates the statistical threshold ϵ₁ used for each variant.

In Table 8, the RMSE for transmittance errors generally ranges between O(10⁻⁶) and O(10⁻⁵) across all Fast-RT methods and channels, except for channel M9, where errors are larger, in the range O(10⁻²) to O(10⁻³). All three proposed models slightly degrade the precision of RTTOV13, but this degradation diminishes as the statistical threshold decreases. Comparing RTTOV13 with the SI model, the error difference reduces from O(10⁻⁷) to O(10⁻⁹) on average across channels, again except for M9. With BIC+LASSO, the difference remains around O(10⁻⁷), while for channel M9 it is O(10⁻⁴). Similarly, with L0+L1 the difference is about O(10⁻⁷) for most channels, but O(10⁻³) for M9. Among the three, the L0+L1 model shows the lowest precision, as expected due to its more aggressive sparsity, yet the errors remain comparable in order of magnitude to RTTOV13.

Overall, these results indicate that including statistical thresholds in RTTOV v13 has minimal impact on the transmittance approximation. Values remain very close to the standard RTTOV13 configuration for statistical threshold tolerances below 10⁻⁶ (Table 8). Combining thresholds with LASSO regression in a bilevel framework for parameter selection, using either BIC-based or ℓ₀-regularization, slightly modifies the approximation, improving or worsening it, but variations remain small. The approximated transmittances closely match those from LBLRTM, with the added benefit of a significant runtime reduction.

4.4 Validation of brightness temperatures

To achieve a higher level of validation for the proposed transmittance parametrization, the brightness temperatures of the profiles used for training are calculated. The approximated brightness temperatures at the top of the atmosphere were calculated using polychromatic radiances from Eq. (3), applying the approximate transmittances provided by the RTTOV v13 scheme and the proposed variants, separately. To compare these results, brightness temperatures at the top of the atmosphere were calculated using the polychromatic radiances with Eq. (2), using the monochromatic radiances calculated with LBLRTM. In all cases, the integrals were approximated using composite trapezoidal formulas, with the spacing determined by the pressure levels of the data. In each case, the resulting brightness temperatures were averaged over all profiles and viewing angles. The relative errors in BT obtained with the Fast-RT models and those obtained with LBLRTM were then calculated, which are shown in Table 9 (×10⁴). The maximum relative error for brightness temperature, determined for each profile and viewing angle, is presented in Table 10 (×10³).
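The conversion from radiance to brightness temperature and the relative error can be sketched as below. This is a simplified illustration that inverts the monochromatic Planck function at a single representative wavenumber; the paper works with polychromatic radiances, so using one wavenumber per channel is an assumption, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant (J s)
C = 2.99792458e8     # speed of light (m s^-1)
K_B = 1.380649e-23   # Boltzmann constant (J K^-1)


def planck_radiance(wavenumber_cm, temperature):
    """Planck radiance in W m^-2 sr^-1 (m^-1)^-1 at a wavenumber in cm^-1."""
    nu = wavenumber_cm * 100.0  # cm^-1 -> m^-1
    return 2.0 * H * C**2 * nu**3 / math.expm1(H * C * nu / (K_B * temperature))


def brightness_temperature(wavenumber_cm, radiance):
    """Invert the Planck function for temperature (K)."""
    nu = wavenumber_cm * 100.0
    return H * C * nu / (K_B * math.log1p(2.0 * H * C**2 * nu**3 / radiance))


def relative_bt_error(bt_fast, bt_ref):
    """Relative brightness temperature error with respect to the reference."""
    return abs(bt_fast - bt_ref) / bt_ref
```

A round trip such as `brightness_temperature(930.0, planck_radiance(930.0, 280.0))` should return 280 K up to floating-point error, which is a convenient sanity check before comparing Fast-RT and Line-by-Line results.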

Table 9. Average Relative Errors in Brightness Temperature (K), scaled by 10⁴, between the Fast-RT and LBLRTM models. The second column indicates the statistical threshold ϵ₁ used for each variant.

Table 10. Maximum Relative Errors in Brightness Temperature (K), scaled by 10³, between the Fast-RT and LBLRTM models. The second column indicates the statistical threshold ϵ₁ used for each variant.

Table 9 shows a behavior similar to that observed for the transmittance errors. The average relative error of brightness temperature generally ranges from O(10⁻⁵) to O(10⁻⁴) across all channels and Fast-RT methods. The order of magnitude of the average relative error remains consistent when comparing the four methods by channel. The differences in average relative BT errors between RTTOV13 and the SI model decrease from O(10⁻⁵) to O(10⁻⁷) when lowering the statistical threshold tolerance. Similarly, the differences between RTTOV13 and the BIC+L1 model decrease in the same manner. For the L0+L1 model, the differences decrease from O(10⁻⁵) to O(10⁻⁶).

Turning to the maximum errors, for all channels the sparse approximations of optical depth for RTTOV13 show minimal deviation from the BT results of standard RTTOV13 when $ϵ_{1} \leq 10^{- 7}$ . Table 10 shows maximum relative BT errors ranging from O(10⁻⁴) to O(10⁻³) across all channels and Fast-RT methods. Comparing the maximum absolute error by channel for the four methods, errors remain of the same order of magnitude for M7, M9, and M11–M16 ( $ϵ_{1} \leq 10^{- 6}$ ), M8 ( $ϵ_{1} \leq 10^{- 8}$ ), and M10 ( $ϵ_{1} \leq 10^{- 7}$ ); in other cases, standard RTTOV13 may yield up to one order of magnitude lower errors.

Observe in Table 9 that, for some channels, the errors with the proposed methods are slightly lower than those of RTTOV13. With the L0+L1 model at $ϵ_{1} = 10^{- 9}$ this happens for channels M7, M8, M10, M11, M13, M14, and M16, and with the BIC+L1 model at the same tolerance for channels M7, M8, M10, M11, and M13. Also note that, although the BIC+L1 model gives a better transmittance fit than RTTOV13 for channel M9, its brightness temperature error is not improved. These findings suggest that using merit functions based on radiances or BT, together with model complexity penalization, instead of relying only on optical depth fitting, could improve the results of Fast-RT models within the RTTOV13 framework.

Figure 6 (left) shows the average absolute BT error between the LBLRTM model and the Fast-RT models for $ϵ_{1} = 10^{- 6}$ , while Fig. 6 (right) shows the maximum absolute error across all profiles and viewing angles. The average brightness temperature shows some degradation in the proposed methods compared to RTTOV v13: in the worst case, 0.021 K for M7, 0.008 K for M8, 0.20 K for M9, while the other channels remain below 0.003 K for all proposals. For the maximum absolute error per profile and viewing angle, the worst cases are 0.961 K for M7, 0.405 K for M8, and 0.375 K for M9, with the other channels below 0.15 K. These variations are not significant in relative terms, as shown in Table 9, and decrease with a lower statistical threshold, illustrated in Fig. 7 for $ϵ_{1} = 10^{- 9}$ . Under this setting, the average BT error worsens by only $5.7 \times 10^{- 4}$ K for M7, $3.7 \times 10^{- 5}$ K for M8, and $7.5 \times 10^{- 3}$ K for M9, while the others remain below $3.2 \times 10^{- 4}$ K. The maximum error increases by $2.1 \times 10^{- 2}$ K for M7, $1.0 \times 10^{- 3}$ K for M8, and $1.0 \times 10^{- 3}$ K for M9, with the other channels remaining below $1.5 \times 10^{- 3}$ K.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f06

Figure 6. Average Absolute Errors (left) and Maximum Absolute Errors (right) in Brightness Temperature (K) between the Fast-RT and LBLRTM models for $ϵ_{1} = 10^{- 6}$.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f07

Figure 7. Average Absolute Errors (left) and Maximum Absolute Errors (right) in Brightness Temperature (K) between the Fast-RT and LBLRTM models for $ϵ_{1} = 10^{- 9}$.

These findings confirm that the proposed methods achieve an accuracy level comparable to RTTOV v13 across most channels, with only minimal degradation observed in a few cases under less stringent (larger) statistical threshold tolerances.

4.5 Validation of Brightness Temperature Against Instrument Noise Characteristics

To evaluate the accuracy of the Fast RT model, we compare the brightness temperatures it generates with those from high-fidelity simulations using LBLRTM. A standard validation criterion requires that the absolute difference in brightness temperature remains below the instrument's noise level (Garand et al., 2001). Specifically, this involves comparing against the Noise Equivalent Delta Temperature (NEdT) for the thermal emissive bands (M12 to M16), and against the Noise Equivalent Delta Radiance (NEdR) for the solar reflective bands (M7 to M11). For the VIIRS M-bands, Table 11 presents the NEdT values and the signal-to-noise ratios (SNR) used to compute the corresponding NEdR values, as reported in Table 1 of the manual Cao et al. (2017).

Table 11. SNR and NEdT Values for VIIRS IR M-Bands (L: Low Gain Mode, H: High Gain Mode).

Table 12. Percentage of absolute differences in radiance below the NEdR threshold for channels M7–M11, and percentage of absolute differences in brightness temperature below the NEdT threshold for channels M12–M16. The second column indicates the statistical threshold ϵ₁ used for each variant.

For each channel from M7 to M11, Table 11 reports the SNR at the reference radiance I_typ (W m⁻² sr⁻¹ µm⁻¹), and for channels M12 to M16, it reports the NEdT at the reference temperature T_typ (K). For a thermal emissive band, the NEdT at temperature T is defined as

NEdT (T) = NEdT (T_{typ}) \cdot \frac{B^{'} (T_{typ})}{B^{'} (T)}

where B^′ is the derivative of the Planck function with respect to temperature. For solar reflective bands, the Noise Equivalent Delta Radiance (NEdR) at radiance I is defined as

NEdR (I) = \frac{I}{SNR} .
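Both noise definitions are straightforward to evaluate. The sketch below assumes a monochromatic Planck function at a representative band wavenumber (an assumption; a band-averaged Planck function could be used instead) and exploits the fact that only the ratio of Planck derivatives is needed, so temperature-independent prefactors cancel. The function names and the numerical inputs in the usage note are hypothetical.

```python
import math

C2 = 1.438776877  # second radiation constant h*c/k_B (cm K)


def planck_dT(wavenumber_cm, temperature):
    """dB/dT of the Planck function, up to a temperature-independent factor."""
    x = C2 * wavenumber_cm / temperature
    return x * x * math.exp(x) / math.expm1(x) ** 2


def nedt(temperature, nedt_typ, t_typ, wavenumber_cm):
    """NEdT at `temperature`, scaled from its value at T_typ."""
    return nedt_typ * planck_dT(wavenumber_cm, t_typ) / planck_dT(wavenumber_cm, temperature)


def nedr(radiance, snr):
    """NEdR at `radiance` for a solar reflective band."""
    return radiance / snr
```

For instance, with a hypothetical NEdT of 0.07 K at T_typ = 300 K, `nedt(250.0, 0.07, 300.0, 930.0)` is larger than 0.07 K, since the Planck derivative is smaller for colder scenes.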

Let I_ij and ${\tilde{I}}_{i j}$ denote the top of atmosphere polychromatic radiances obtained using LBLRTM and the Fast RT model, respectively, for atmospheric profile i and observation angle θ_j, and let T_ij and ${\tilde{T}}_{i j}$ be the corresponding brightness temperatures. For emissive bands, the following condition must be satisfied:

| T_{i j} - {\tilde{T}}_{i j} | \leq NEdT (T_{i j}),

and for solar reflective bands, we require:

| I_{i j} - {\tilde{I}}_{i j} | \leq NEdR (I_{i j}) .

The percentage of atmospheric profiles for which these conditions are satisfied serves as a practical metric to evaluate the quality of the forward model. A high proportion of cases meeting the criterion indicates that the modeling error is smaller than the instrument noise, ensuring that the simulated radiances are sufficiently accurate for satellite retrievals and potentially suitable for data assimilation. Table 12 reports the percentage of cases, computed over 83 atmospheric profiles and 6 viewing angles, for which the corresponding noise threshold condition is met.
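The percentage metric can be computed by counting the cases that satisfy the noise condition. In this sketch the noise level is passed as a callable evaluated at the reference value, mirroring NEdT(T_ij) or NEdR(I_ij); the function name and the sample values are made up for illustration.

```python
def percent_below_noise(reference, simulated, noise):
    """Percentage of cases with |reference - simulated| <= noise(reference)."""
    hits = sum(
        1 for ref, sim in zip(reference, simulated) if abs(ref - sim) <= noise(ref)
    )
    return 100.0 * hits / len(reference)


# Made-up brightness temperatures (K) and a constant 0.05 K noise level.
bt_lbl = [250.00, 270.00, 290.00, 300.00]
bt_fast = [250.01, 270.10, 289.98, 300.00]
share = percent_below_noise(bt_lbl, bt_fast, lambda t: 0.05)
```

In this toy case three of the four differences are within 0.05 K, so `share` is 75 %.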

In Table 12, it can be observed that for channels M11 to M16, all methods fully satisfy the noise condition, and the proposed methods are comparable to standard RTTOV13 for a statistical tolerance threshold of $ϵ_{1} \leq 10^{- 6}$. For channels M7 to M10, a stricter statistical tolerance threshold is required to achieve percentages comparable to RTTOV13. For channels M7 and M10, the fulfillment of the noise criterion is quite poor; we infer that this is due to the lack of solar radiation inclusion in the various Fast-RT methods. However, the results obtained with the proposed methods are similar to RTTOV13 for small statistical thresholds. For channel M9, the proposed BIC+L1 model slightly improves the percentage of profiles that meet the noise threshold compared to standard RTTOV13. It is clear that the proposed methods reproduce the results of standard RTTOV13 for large statistical thresholds in the emissive bands and for smaller statistical thresholds in the solar reflective bands, while offering the advantage of greater computational efficiency due to the induced sparsity.

5 Conclusions

This study presents an automatic and sparse optical depth parametrization method for the RTTOV v13 model, aimed at optimizing parameter adjustment. The method applies statistical thresholding across different pressure levels, followed by LASSO regression, instead of the traditional least squares approach in the RTTOV v13 framework. A bilevel optimization approach is used to select the optimal regularization parameter, employing different model validation criteria: one based on ℓ₀ regression and another on the Bayesian Information Criterion (BIC). These alternatives enforce significant sparsity across all optical depth regression parameters, substantially reducing the computational cost of the Fast-RT model without compromising accuracy.

Validation experiments were conducted on the infrared channels of the M-bands for the VIIRS instrument. Different validation criteria were considered, including transmittance fitting against LBLRTM transmittances, brightness temperature fitting against LBLRTM brightness temperatures, and the difference between brightness temperature and the instrument's Noise Equivalent Delta Temperature. The results show consistency with RTTOV v13, while providing improved runtime performance in the evaluation of parameterized transmittances.

The induced sparsity automatically excludes gases with negligible absorptivity in a channel, identifies pressure levels where gases significantly absorb radiance, highlights the most relevant predictors for each gas type, and classifies gases as either fixed or variable. This technique is particularly advantageous for multispectral instruments where multiple gases exhibit strong correlations in radiance absorption, especially in large-scale variable retrievals for inverse problems. The proposed method may be extended to other Fast-RT models, such as CRTM, and to other satellite instruments, such as the Advanced Technology Microwave Sounder (ATMS) and the Cross-track Infrared Sounder (CrIS), to enhance both the computational efficiency of radiative transfer models and the accuracy of retrieved atmospheric profiles. The numerical results obtained at different levels of validation, particularly the output from the proposed model indicating a high proportion of profiles with errors below the instrument's NEdT, provide strong evidence of its suitability and potential for satellite data assimilation. Nevertheless, applying sparsity-inducing models in this context requires a careful evaluation of the sensitivity of simulated radiances to the underlying model state variables. This evaluation, in practical scenarios such as the satellite data assimilation of radiances from the proposed Fast-RT model, will be carried out in future work.

Additionally, future directions may include a benchmark comparison against other existing Fast-RT models and more general scenarios with extreme atmospheric conditions, considering strong absorption due to extreme pollution events and incorporating variable SO₂ concentrations in environments with high volcanic activity.

Appendix A: Sparsity Pattern for RTTOV13+SI+LASSO

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f08

Figure A1. Sparsity pattern for channels M7 to M11 in L0+L1 for $ϵ_{1} = 10^{- 6}$.

https://gmd.copernicus.org/articles/18/8511/2025/gmd-18-8511-2025-f09

Figure A2. Sparsity pattern for channels M13 to M16 in L0+L1 for $ϵ_{1} = 10^{- 6}$.

Appendix B: RTTOV v13 Predictors

Table B1Predictors for RTTOV v13, (Saunders et al., 2017). ^* Not an original RTTOV v13 predictor; this predictor corresponds to Case II of the statistical threshold parametrization.


$pδp(l) = p(l+1)\,(p(l+1) - p(l))$, $pδp(0) = p(1)\,(p(2) - p(1))$,
$T(l) = \frac{1}{2}(T^{prof}(l) + T^{prof}(l+1))$, $T^{*}(l) = \frac{1}{2}(T^{ref}(l) + T^{ref}(l+1))$, $T_{r}(l) = \frac{T(l)}{T^{*}(l)}$,
$T_{w}(l) = \frac{\sum_{i=1}^{l} pδp(i-1)\,T(i)}{\sum_{i=1}^{l} pδp(i-1)\,T^{*}(i)}$, $δT(l) = T(l) - T^{*}(l)$,
$G(l) = \frac{1}{2}(G^{prof}(l) + G^{prof}(l+1))$, $G^{*}(l) = \frac{1}{2}(G^{ref}(l) + G^{ref}(l+1))$, $G_{r}(l) = \frac{G(l)}{G^{*}(l)}$,
$G_{w}(l) = \frac{\sum_{i=1}^{l} pδp(i-1)\,G(i)}{\sum_{i=1}^{l} pδp(i-1)\,G^{*}(i)}$, $G_{wt}(l) = \frac{\sum_{i=1}^{l} pδp(i-1)\,T(i)\,G(i)}{\sum_{i=1}^{l} pδp(i-1)\,T^{*}(i)\,G^{*}(i)}$.

Here, $p(l)$ is the pressure (hPa) at level $l$; $T^{prof}(l)$ is the temperature (K) at level $l$ of the input profile; $T^{ref}(l)$ is the temperature (K) at level $l$ of the reference profile, which is the mean over the training profile set; $G \in \{W = \mathrm{H_2O}, \mathrm{O_3}, \mathrm{CO_2}, \mathrm{N_2O}, \mathrm{CO}, \mathrm{CH_4}\}$ denotes a gas concentration (ppmv over dry air); $G^{prof}(l)$ is the gas concentration at level $l$ of the input profile; and $G^{ref}(l)$ is the gas concentration at level $l$ of the reference profile, which is the mean over the training profile set.
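As a rough illustration of how the profile quantities above combine, the following sketch evaluates the layer means, ratios, differences, and pressure-weighted ratios for a single profile. It is a minimal sketch rather than RTTOV code: the function names and the sample profile are hypothetical, lists are 0-based (so `p[0]` holds $p(1)$), and the sums are assumed to run over the index $i$ with summands $pδp(i-1)$ and layer values at $i$.

```python
from itertools import accumulate


def layer_mean(x):
    """Average adjacent levels: x(l) = (x_level(l) + x_level(l+1)) / 2."""
    return [0.5 * (a + b) for a, b in zip(x, x[1:])]


def pressure_weights(p):
    """Weights pdp(0), ..., pdp(n-2) for n pressure levels stored in p[0..n-1]."""
    first = p[0] * (p[1] - p[0])  # special case pdp(0) = p(1) (p(2) - p(1))
    rest = [p[k] * (p[k] - p[k - 1]) for k in range(1, len(p) - 1)]
    return [first] + rest


def weighted_ratio(w, num, den):
    """Ratio of running pressure-weighted sums, one value per layer."""
    top = accumulate(wi * ni for wi, ni in zip(w, num))
    bot = accumulate(wi * di for wi, di in zip(w, den))
    return [t / b for t, b in zip(top, bot)]


def temperature_predictors(p, t_prof, t_ref):
    w = pressure_weights(p)
    t, t_star = layer_mean(t_prof), layer_mean(t_ref)
    return {
        "T_r": [a / b for a, b in zip(t, t_star)],
        "dT": [a - b for a, b in zip(t, t_star)],
        "T_w": weighted_ratio(w, t, t_star),
    }


def gas_predictors(p, t_prof, t_ref, g_prof, g_ref):
    w = pressure_weights(p)
    t, t_star = layer_mean(t_prof), layer_mean(t_ref)
    g, g_star = layer_mean(g_prof), layer_mean(g_ref)
    tg = [a * b for a, b in zip(t, g)]
    tg_star = [a * b for a, b in zip(t_star, g_star)]
    return {
        "G_r": [a / b for a, b in zip(g, g_star)],
        "G_w": weighted_ratio(w, g, g_star),
        "G_wt": weighted_ratio(w, tg, tg_star),
    }


# Illustrative values only (not taken from the paper or the training set).
p = [100.0, 300.0, 500.0, 850.0]          # hPa
t_ref = [220.0, 250.0, 270.0, 285.0]      # K
t_prof = [t + 10.0 for t in t_ref]        # profile 10 K warmer than reference
preds = temperature_predictors(p, t_prof, t_ref)
```

When the input profile coincides with the reference profile, every ratio equals one and every difference equals zero, which gives a quick sanity check on any implementation of these quantities.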

Code and data availability

The code and data used in this study are available at https://doi.org/10.5281/zenodo.17050361 (Vargas Jiménez and De los Reyes, 2025). Access to the code and data is available to anyone with the provided link, and there are no temporal embargoes or restrictions on access. Anyone who views or downloads the code and data from Zenodo does so anonymously. The software is still under development and is not finalized for end-user applications, but it is provided to allow for the reproduction of the results presented in the manuscript. The code and data are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Author contributions

JC conceptualized the study, contributed to the development of the methodology, and supervised the research. FV developed the model code, performed the simulations, and designed and carried out the experiments. JC and FV validated the results, ensuring reproducibility. FV wrote the original draft, while JC was responsible for review and editing.

Competing interests

The contact author has declared that neither of the authors has any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

This manuscript benefited from the use of artificial intelligence tools for style correction.

Financial support

This research has been supported by Escuela Politécnica Nacional de Ecuador under award PIGR-22-01: Asimilación de datos satelitales para el sistema de pronóstico meteorológico METEO: selección óptima de predictores y localización óptima de observaciones. Franklin Vargas acknowledges partial support from the PhD Program in Applied Mathematics, Escuela Politécnica Nacional de Ecuador.

Review statement

This paper was edited by Cenlin He and reviewed by Steffen Mauceri and one anonymous referee.

References

Belloni, A. and Chernozhukov, V.: L1-penalized quantile regression in high-dimensional sparse models, The Annals of Statistics, 39, 82–130, https://doi.org/10.1214/10-AOS840, 2011.

Bertsimas, D., King, A., and Mazumder, R.: Best Subset Selection via a Modern Optimization Lens, The Annals of Statistics, 44, 813–852, https://doi.org/10.1214/15-AOS1388, 2016.

Cao, C., Xiong, X. J., Wolfe, R., DeLuccia, F., Liu, Q. M., Blonski, S., Lin, G. G., Nishihama, M., Pogorzala, D., Oudrari, H., and Hillger, D.: Visible Infrared Imaging Radiometer Suite (VIIRS) Sensor Data Record (SDR) User’s Guide (Version 1.3), NOAA Technical Report NESDIS 142, NOAA, U.S. Department of Commerce, National Oceanic and Atmospheric Administration, https://ncc.nesdis.noaa.gov/documents/documentation/viirs-users-guide-tech-report-142a-v1.3.pdf (last access: 1 July 2025), 2017.

Cao, D., Ma, Y., Sun, L., and Gao, L.: Fast observation simulation method based on XGBoost for visible bands over the ocean surface under clear-sky conditions, Remote Sensing Letters, 12, 674–683, https://doi.org/10.1080/2150704X.2021.1925371, 2021.

Cardall, A. C., Hales, R. C., Tanner, K. B., Williams, G. P., and Markert, K. N.: LASSO (L1) regularization for development of sparse remote-sensing models with applications in optically complex waters using GEE tools, Remote Sensing, 15, 1670, https://doi.org/10.3390/rs15061670, 2023.

Chen, Y., Weng, F., Han, Y., and Liu, Q.: Validation of the Community Radiative Transfer Model by using CloudSat data, Journal of Geophysical Research: Atmospheres, 113, D00A03, https://doi.org/10.1029/2007JD009561, 2008.

Clough, S. A. and Iacono, M. J.: Line-by-line calculation of atmospheric fluxes and cooling rates: 2. Application to carbon dioxide, ozone, methane, nitrous oxide and the halocarbons, Journal of Geophysical Research, 100, 16519–16535, https://doi.org/10.1029/95JD01386, 1995.

Clough, S. A., Iacono, M. J., and Moncet, J.-L.: Line-by-line calculations of atmospheric fluxes and cooling rates: Application to water vapor, Journal of Geophysical Research, 97, 15761–15785, https://doi.org/10.1029/92JD01419, 1992.

Clough, S. A., Shephard, M. W., Mlawer, E. J., Delamere, J. S., Iacono, M. J., Cady-Pereira, K., Boukabara, S., and Brown, P. D.: Atmospheric radiative transfer modeling: a summary of the AER codes, Journal of Quantitative Spectroscopy and Radiative Transfer, 91, 233–244, 2005.

De los Reyes, J. C.: Bilevel imaging learning problems as mathematical programs with complementarity constraints: Reformulation and theory, SIAM Journal on Imaging Sciences, 16, 1655–1686, 2023.

De los Reyes, J. C. and Villacís, D.: Bilevel optimization methods in imaging, in: Handbook of mathematical models and algorithms in computer vision and imaging: Mathematical imaging and vision, Springer, 1–34, https://doi.org/10.1007/978-3-030-03009-4_66-1, 2022.

DeSouza-Machado, S., Strow, L. L., Motteler, H., and Hannon, S.: kCARTA: a fast pseudo line-by-line radiative transfer algorithm with analytic Jacobians, fluxes, nonlocal thermodynamic equilibrium, and scattering for the infrared, Atmos. Meas. Tech., 13, 323–339, https://doi.org/10.5194/amt-13-323-2020, 2020.

Edwards, D.: GENLN2: A General Line-by-Line Atmospheric Transmittance and Radiance Model, Version 3.0 Description and Users Guide, Technical Note NCAR/TN-367-STR, NCAR, National Center for Atmospheric Research, Boulder, https://opensky.ucar.edu/islandora/object/technotes:134 (last access: 7 November 2024), 1992.

Fleming, H. E. and McMillin, L. M.: Atmospheric transmittance of an absorbing gas 2: a computationally fast and accurate transmittance model for slant paths at different zenith angles, Applied Optics, 16, 1366–1370, 1977.

Garand, L., Turner, D. S., Larocque, M., Bates, J., Boukabara, S., Brunel, P., Chevallier, F., Deblonde, G., Engelen, R., Hollingshead, M., Jackson, D., Jedlovec, G., Joiner, J., Kleespies, T., McKague, D. S., McMillin, L., Moncet, J.-L., Pardo, J. R., Rayer, P. J., Salathé, E., Saunders, R., Scott, N. A., Delst, P. V., and Woolf, H.: Radiance and Jacobian intercomparison of radiative transfer models applied to HIRS and AMSU channels, Journal of Geophysical Research: Atmospheres, 106, 24017–24031, https://doi.org/10.1029/2000JD000184, 2001.

Gordon, I. E., Rothman, L. S., Hill, C., Kochanov, R. V., Tan, Y., Bernath, P. F., Birk, M., Boudon, V., Campargue, A., Chance, K. V., Drouin, B. J., Flaud, J.-M., Gamache, R. R., Hodges, J. T., Jacquemart, D., Perevalov, V. I., Perrin, A., Shine, K. P., Smith, M.-A. H., and Zak, E. J.: The HITRAN2016 molecular spectroscopic database, Journal of Quantitative Spectroscopy and Radiative Transfer, 203, 3–69, https://doi.org/10.1016/j.jqsrt.2017.06.038, 2017.

Han, Y., van Delst, P., Liu, Q., Weng, F., Yan, B., Treadon, R., and Derber, J.: JCSDA Community Radiative Transfer Model (CRTM) – Version 1, NOAA Technical Report NESDIS 122, NOAA, Camp Springs, MD, https://repository.library.noaa.gov/view/noaa/1157 (last access: 7 November 2024), 2006.

Heilemann, J., Marx, A., Klassert, C., Boeing, F., Samaniego, L., Klauer, B., Thober, S., and Gawel, E.: Projecting impacts of extreme weather events on crop yields using LASSO regression, Weather and Climate Extremes, 46, 100738, https://doi.org/10.1016/j.wace.2024.100738, 2024.

Hocking, J., Vidot, J., Brunel, P., Roquet, P., Silveira, B., Turner, E., and Lupu, C.: A new gas absorption optical depth parameterisation for RTTOV version 13, Geosci. Model Dev., 14, 2899–2915, https://doi.org/10.5194/gmd-14-2899-2021, 2021.

Hong, F. and Kong, Y.: Random Forest Fusion Classification of Remote Sensing PolSAR and Optical Image Based on LASSO and IM Factor, in: 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, 5048–5051, https://doi.org/10.1109/IGARSS47720.2021.9553357, 2021.

Joshi, D. R., Clay, S. A., Sharma, P., Rekabdarkolaee, H. M., Kharel, T. P., Rizzo, D. M., Thapa, R., and Clay, D. E.: Artificial Intelligence and Satellite Based Remote Sensing can be used to Predict Soybean (Glycine max) Yield, Agronomy Journal, 116, 917–930, https://doi.org/10.1002/agj2.21473, 2023.

Kleespies, T. J., van Delst, P., McMillin, L. M., and Derber, J.: Atmospheric transmittance of an absorbing gas: OPTRAN status report and introduction to the NESDIS/NCEP community radiative transfer model, Applied Optics, 43, 3103–3109, https://doi.org/10.1364/AO.43.003103, 2004.

Krishnan, P., Srinivasa Ramanujam, K., and Balaji, C.: An artificial neural network based fast radiative transfer model for simulating infrared sounder radiances, Journal of Earth System Science, 121, 891–901, https://doi.org/10.1007/s12040-012-0197-3, 2012.

Lavrentieva, N., Buldyreva, J., and Starikov, V.: Collisional line broadening and shifting of atmospheric gases: A practical guide for line shape modelling by current semi-classical approaches, Imperial College Press, ISBN 9781848165960, 184816596X, 2011.

Li, D., Cui, F., Wang, A., Li, Y., Wu, J., and Qiao, Y.: Adaptive Detection Algorithm for Hazardous Clouds Based on Infrared Remote Sensing Spectroscopy and the LASSO Method, IEEE Transactions on Geoscience and Remote Sensing, 58, 8649–8664, 2020.

Liu, X., Zhou, D. K., Larar, A. M., Smith, W. L., Schluessel, P., Newman, S. M., Taylor, J. P., and Wu, W.: Retrieval of atmospheric profiles and cloud properties from IASI spectra using super-channels, Atmos. Chem. Phys., 9, 9121–9142, https://doi.org/10.5194/acp-9-9121-2009, 2009.

Mairal, J. and Yu, B.: Complexity Analysis of the Lasso Regularization Path, in: Proceedings of the 29th International Conference on Machine Learning (ICML), Omnipress, Edinburgh, Scotland, UK, 1835–1842, https://dl.acm.org/doi/10.5555/3042573.3042807 (last access: 7 November 2024), 2012.

Matricardi, M.: An inter-comparison of line-by-line radiative transfer models, Technical Memorandum 525, ECMWF, https://doi.org/10.21957/b3amji4k, 2007.

Matricardi, M.: The generation of RTTOV regression coefficients for IASI and AIRS using a new profile training set and a new line-by-line database, Technical Memorandum 564, ECMWF, https://www.ecmwf.int/sites/default/files/elibrary/2008 (last access: 7 November 2024), 2008.

Matricardi, M.: A Principal Component Based Version of the RTTOV Fast Radiative Transfer Model, Quarterly Journal of the Royal Meteorological Society, 136(652), https://doi.org/10.1002/qj.680, 2010.

Mauceri, S., O'Dell, C. W., McGarragh, G., and Natraj, V.: Radiative Transfer Speed-Up Combining Optimal Spectral Sampling With a Machine Learning Approach, Frontiers in Remote Sensing, 3, 932548, https://doi.org/10.3389/frsen.2022.932548, 2022.

McMillin, L. M. and Fleming, H. E.: Atmospheric transmittance of an absorbing gas: a computationally fast and accurate transmittance model for absorbing gases with constant mixing ratios in inhomogeneous atmospheres, Applied Optics, 15, 358–363, 1976.

McMillin, L. M., Fleming, H. E., and Hill, M. L.: Atmospheric transmittance of an absorbing gas. 3: a computationally fast and accurate transmittance model for absorbing gases with variable mixing ratios, Applied Optics, 18, 1600–1606, 1979.

McMillin, L. M., Crone, L. J., Goldberg, M. D., and Kleespies, T. J.: Atmospheric transmittance of an absorbing gas. 4. OPTRAN: a computationally fast and accurate transmittance model for absorbing gases with fixed and variable mixing ratios at variable viewing angles, Applied Optics, 34, 6269–6274, 1995.

McMillin, L. M., Xiong, X., Han, Y., Kleespies, T. J., and van Delst, P.: Atmospheric transmittance of an absorbing gas. 7. Further improvements to the OPTRAN 6 approach, Applied Optics, 45, 2028–2034, https://doi.org/10.1364/AO.45.002028, 2006.

Miller, A.: Subset Selection in Regression, Chapman and Hall/CRC, 2nd Edn., https://doi.org/10.1201/9781420035933, 2002.

Osborne, M., Presnell, B., and Turlach, B.: A new approach to variable selection in least squares problems, IMA Journal of Numerical Analysis, 20, 389–403, https://doi.org/10.1093/imanum/20.3.389, 2000.

Pak, A., Rad, A. K., Nematollahi, M. J., and Mahmoudi, M.: Application of the Lasso regularisation technique in mitigating overfitting in air quality prediction models, Scientific Reports, 15, 547, https://doi.org/10.1038/s41598-024-84342-y, 2025.

Saunders, R., Hocking, J., Rundle, D., Rayer, P., Havemann, S., Matricardi, M., Geer, A., Lupu, C., Brunel, P., and Vidot, J.: RTTOV v12 science and validation report, Tech. rep., NWP SAF, EUMETSAT, 78 pp., https://www.nwpsaf.eu/site/download/documentation/rtm/docs_rttov12/rttov12_svr.pdf (last access: 7 November 2024), 2017.

Saunders, R., Hocking, J., Turner, E., Rayer, P., Rundle, D., Brunel, P., Vidot, J., Roquet, P., Matricardi, M., Geer, A., Bormann, N., and Lupu, C.: An update on the RTTOV fast radiative transfer model (currently at version 12), Geosci. Model Dev., 11, 2717–2737, https://doi.org/10.5194/gmd-11-2717-2018, 2018.

Saunders, R., Hocking, J., Turner, E., Havemann, S., Geer, A., Lupu, C., Vidot, J., Chambon, P., Köpken-Watts, C., Scheck, L., Stiller, O., Stumpf, C., and Borbas, E.: RTTOV v13 Science and Validation Report, Tech. rep., NWP SAF, https://nwp-saf.eumetsat.int/site/download/documentation/rtm/docs_rttov13/rttov13_svr.pdf (last access: 7 November 2024), 2020.

Schwarz, G.: Estimating the Dimension of a Model, The Annals of Statistics, 6, 461–464, https://doi.org/10.1214/aos/1176344136, 1978.

Smith, H. D., Dubeux, J. C. B., Zare, A., and Wilson, C. H.: Assessing transferability of remote sensing pasture estimates using multiple machine learning algorithms and evaluation structures, Remote Sensing, 15, 2940, https://doi.org/10.3390/rs15112940, 2023.

Stegmann, P. G., Johnson, B., Moradi, I., Karpowicz, B. M., and McCarty, W.: A Deep Learning Approach to Fast Radiative Transfer, Journal of Quantitative Spectroscopy and Radiative Transfer, https://doi.org/10.1016/j.jqsrt.2022.108088, 2022.

Su, M. Y., Liu, C., Di, D., Le, T., Sun, Y., Li, J., Lu, F., Zhang, P., and Sohn, B. J.: A Multi-Domain Compression Radiative Transfer Model for the Fengyun-4 Geosynchronous Interferometric Infrared Sounder (GIIRS), Advances in Atmospheric Sciences, 40, 1844–1858, https://doi.org/10.1007/s00376-023-2293-5, 2023.

Turner, E., Rayer, P., and Saunders, R.: AMSUTRAN: A microwave transmittance code for satellite remote sensing, Journal of Quantitative Spectroscopy and Radiative Transfer, 227, 117–129, https://doi.org/10.1016/j.jqsrt.2019.02.013, 2019.

Vargas Jiménez, F. and De los Reyes, J. C.: Automatic Optical Depth Parametrization in Radiative Transfer Models via LASSO-Induced Sparsity: Code, Zenodo [code], https://doi.org/10.5281/zenodo.17050361, 2025.

Wang, P., Tan, S., Zhang, G., Wang, S., and Wu, X.: Remote Sensing Estimation of Forest Aboveground Biomass Based on Lasso-SVR, Forests, 13, 1597, https://doi.org/10.3390/f13101597, 2022a.

Wang, S., Wu, Y., Li, R., and Wang, X.: Remote sensing-based retrieval of soil moisture content using stacking ensemble learning models, Land Degradation & Development, 34, 911–925, 2022b.

Weinreb, M., Fleming, H., McMillin, L., and Neuendorffer, A.: NOAA Technical Report NESS 85: Vertical Sounder, Technical report, NOAA, Washington, D.C., https://repository.library.noaa.gov/view/noaa/13429 (last access: 15 December 2023), 1981.

Xiong, X. and McMillin, L. M.: Alternative to the effective transmittance approach for the calculation of polychromatic transmittances in rapid transmittance models, Applied Optics, 44, 67–76, 2005.

This is the case for Ecuador's METEO operational system, which currently relies on an HPC with only 700 cores.


Short summary

This study proposes an automatic method to parameterize atmospheric optical depths in the Radiative Transfer for TIROS Operational Vertical Sounder (RTTOV) version 13 model. The approach combines statistical inference and Least Absolute Shrinkage and Selection Operator (LASSO) regression to reduce parameters and select relevant gases. Tests with Visible Infrared Imaging Radiometer Suite (VIIRS) channels show reduced computation while preserving accuracy.
