A modular wind profile retrieval software for heterogeneous Doppler lidar measurements
Abstract. Retrieving wind profiles from Doppler lidar radial velocities requires processing software tools. The heterogeneity of Doppler lidar types and data acquisition settings, as well as the scan patterns applied for wind profiling, makes wind profile processing challenging. Addressing this challenge, a new modular open-source wind profile retrieval software is presented: the Atmospheric Profile Processing toolKIT (AtmoProKIT). The software calculates quality-controlled wind profiles from heterogeneous Doppler lidar data, i.e. independent of the system type, data acquisition settings or the scan pattern applied. Ingestion of heterogeneous data is enabled by the definition of a standardized level 1 data format for the measurements, from which level 2 wind profiles are retrieved. Processing flexibility is enabled through the combination of modular processing steps in module chains. Modifications are possible by individually arranging modules, adding calculation modules or adjusting processing parameters. The documentation of the processing steps in the result’s metadata ensures the traceability of the results. A standard module chain is presented, which allows for straightforward wind profile retrieval for common Doppler lidar measurement scenarios without the need for coding. The results provided by the standard module chain are validated against radiosondes for three common Doppler lidar systems in differing atmospheric conditions. AtmoProKIT is provided as open-source Python code and includes demonstration examples, allowing for easy use and future collaborative modification.
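(For readers new to the module-chain idea, a minimal illustrative sketch of such a configurable chain is given below; the module names, parameters and registry are hypothetical and are not the actual AtmoProKIT API — see the repository for the real interface.)

```python
# Illustrative sketch only: module names, parameters, and the registry below are
# hypothetical and do not reflect the actual AtmoProKIT API.
from dataclasses import dataclass, field

@dataclass
class ModuleStep:
    name: str                                   # which processing module to run
    params: dict = field(default_factory=dict)  # user-adjustable parameters

def cnr_filter(data, threshold_db=-25.0):
    """Hypothetical module: drop radial velocities below a CNR threshold."""
    return [d for d in data if d["cnr"] >= threshold_db]

PROCESSORS = {"cnr_filter": cnr_filter}         # modules are looked up by name

def run_chain(level1_data, chain):
    """Apply the modules in order and keep a processing history for traceability."""
    history = []
    for step in chain:
        level1_data = PROCESSORS[step.name](level1_data, **step.params)
        history.append(f"{step.name}({step.params})")
    return level1_data, history

# Example: a one-step chain with a user-adjusted threshold
data = [{"cnr": -20.0, "v_r": 3.1}, {"cnr": -32.0, "v_r": 9.9}]
filtered, hist = run_chain(data, [ModuleStep("cnr_filter", {"threshold_db": -25.0})])
```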
Status: open (extended)
CEC1: 'Comment on gmd-2024-222 - No compliance with the policy of the journal', Juan Antonio Añel, 30 Jan 2025
Dear authors,
After checking your manuscript, it has come to our attention that it does not comply with our Code and Data Policy.
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code in a site that does not comply with our trustable permanent archival policy. Therefore, please publish your code in one of the appropriate repositories according to our policy. Also the Helmholtz Codebase that you mention in the manuscript, is not an acceptable repository.
In this way, you must reply to this comment with the link to the repository used in your manuscript, with its DOI. The reply and the repository should be available as soon as possible, and before the Discussions stage is closed, to be sure that anyone has access to it for review purposes.
Also, you state in the manuscript that the Swabian MOSES 2023 campaign data will be published later, and you cite a work in preparation. We can not accept this. You must publish all the data that you use in this manuscript with it. If you wanted to use such data in this manuscript and publish the data, the right way to proceed should have been to publish the data first and then submit your manuscript to our journal. Therefore, as I said, you must publish the data in one repository that we accept, in the same way that the code, and reply to this comment with the link and DOI.
Also, you must include in a potential reviewed version of your manuscript the modified 'Code and Data Availability' section with all the information here requested.
Also, I have not seen a license listed in the web page that you link for your code. If you do not include a license, the code continues to be your property and can not be used by others, despite any statement on being free to use or that it will be "open-source" after publication. It must be free software at the submission time. Therefore, when uploading the model's code to the repository, you could want to choose a free software/open-source (FLOSS) license. Potential licenses are the GPLv3, Apache License, MIT License, etc.
Please, reply as soon as possible to this comment with the link for it so that it is available for the peer-review process, as it should be. In the meantime, I recommend the Topical Editor to stall the review process for your manuscript, as it should have not been accepted in Discussions given the problems mentioned, and we need to clarify the compliance with the policy before investing the time of referees on reviewing manuscripts that could have to be rejected because of other reasons.
Therefore, please, be aware that failing to comply promptly with this request could result in rejecting your manuscript for publication.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-222-CEC1
AC1: 'Reply on CEC1', Anselm Erdmann, 11 Feb 2025
Dear Prof. Añel,
we thank you for your comments on the Code and Data Policy and apologize that this issue slipped our attention before.
We have addressed the issue by making the code and data publicly available with persistent digital object identifiers (DOIs), and we will include them in a potential revised manuscript. Further information can be found in our point-by-point replies below. We hope that the manuscript thereby conforms with the journal standards and that the review process can continue.
Please do not hesitate to contact us in case further clarification is needed.
Kind regards,
Anselm Erdmann and Philipp Gasch
======================
Dear authors,
After checking your manuscript, it has come to our attention that it does not comply with our Code and Data Policy.
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code in a site that does not comply with our trustable permanent archival policy. Therefore, please publish your code in one of the appropriate repositories according to our policy. Also the Helmholtz Codebase that you mention in the manuscript, is not an acceptable repository.
In this way, you must reply to this comment with the link to the repository used in your manuscript, with its DOI. The reply and the repository should be available as soon as possible, and before the Discussions stage is closed, to be sure that anyone has access to it for review purposes.
We have now made the code publicly and permanently available. The code is available at https://codebase.helmholtz.cloud/KIT-KIAOS/KITcube/AtmoProKIT, and a frozen version has been archived with the DOI https://doi.org/10.5281/zenodo.14844633.
Also, you state in the manuscript that the Swabian MOSES 2023 campaign data will be published later, and you cite a work in preparation. We can not accept this. You must publish all the data that you use in this manuscript with it. If you wanted to use such data in this manuscript and publish the data, the right way to proceed should have been to publish the data first and then submit your manuscript to our journal. Therefore, as I said, you must publish the data in one repository that we accept, in the same way that the code, and reply to this comment with the link and DOI.
We published the Swabian MOSES 2023 Doppler lidar measurement datasets.
- Doppler lidar WTX at Villingen-Schwenningen https://doi.org/10.5281/zenodo.14844019 (June 2023), https://doi.org/10.5281/zenodo.14844229 (July 2023), https://doi.org/10.5281/zenodo.14844286 (August 2023)
- Doppler lidar WLS200s
- at Schallstadt https://doi.org/10.5281/zenodo.14842966
- at Titisee-Neustadt https://doi.org/10.5281/zenodo.14843671
- at Fischerbach https://doi.org/10.5281/zenodo.14844362
- at Weil am Rhein https://doi.org/10.5281/zenodo.14843822
- at Albbruck https://doi.org/10.5281/zenodo.14844518
Regarding Figure 11, the Doppler lidars with the numbers 101, 196, and 197 were operated by MeteoSwiss, and their data have not yet been published by them. Therefore, we suggest removing these lidars from this figure in a future version of the manuscript.
Regarding the validation in Section 5 we published the radiosonde and retrieved Doppler lidar wind profiles used in the comparisons: https://doi.org/10.5281/zenodo.14844888
Also, you must include in a potential reviewed version of your manuscript the modified 'Code and Data Availability' section with all the information here requested.
We will add this information in a potential revised version.
Also, I have not seen a license listed in the web page that you link for your code. If you do not include a license, the code continues to be your property and can not be used by others, despite any statement on being free to use or that it will be "open-source" after publication. It must be free software at the submission time. Therefore, when uploading the model's code to the repository, you could want to choose a free software/open-source (FLOSS) license. Potential licenses are the GPLv3, Apache License, MIT License, etc.
We have published the code under the EUPL-1.2 license, which is compatible with the GPLv3 license you refer to. The Swabian MOSES 2023 measurements are available under the CC BY-NC 4.0 license.
Please, reply as soon as possible to this comment with the link for it so that it is available for the peer-review process, as it should be. In the meantime, I recommend the Topical Editor to stall the review process for your manuscript, as it should have not been accepted in Discussions given the problems mentioned, and we need to clarify the compliance with the policy before investing the time of referees on reviewing manuscripts that could have to be rejected because of other reasons.
Therefore, please, be aware that failing to comply promptly with this request could result in rejecting your manuscript for publication.
We have made the code and the measurement data publicly available and therefore hope that the review process can be continued.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Thank you for your effort and time invested. We apologize again for the oversight and inconveniences.
Citation: https://doi.org/10.5194/gmd-2024-222-AC1
CEC2: 'Reply on AC1', Juan Antonio Añel, 12 Feb 2025
Dear authors,
Many thanks for addressing the mentioned issues. We can consider now your manuscript in compliance with the Code and Data policy of the journal.
Regarding the lidars operated by MeteoSwiss, it would be good if you could get them published; however, as these are third-party data whose publication seems to be out of your control, it is not strictly necessary that their data are published in order to include them in the analysis that you present. Therefore, it is not necessary to remove them from the figure.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-222-CEC2
RC1: 'Comment on gmd-2024-222', Anonymous Referee #1, 28 Mar 2025
The paper describes a software package to retrieve wind profiles in the atmosphere from Doppler wind lidar. The paper is of good quality and describes well what the basic concepts of wind retrieval from Doppler wind lidar are, and how this software implements them. The software uses a clearly defined data structure and is of modular design, which allows retrieval chains to be set up flexibly. The presented examples contain interesting approaches to cope with noisy data and to either identify reliable data points or reject them. The description is for the most part clearly written and well understandable.
Nevertheless, I have some comments:
Sampling in height bins:
The retrieval takes all Doppler speeds measured within one height bin; if I understood it right, they are used as if all of them were measured at the same height. This neglects that the wind speed, especially in the boundary layer, shows a strong non-linear height dependence. The basic assumption of the wind retrieval is that the radial Doppler wind speeds measured in the different tilted beams all result from the same wind vector. By bringing together different heights in one retrieval, this assumption is violated. In my opinion it would be necessary to first interpolate the radial velocities to one single height within every height bin. The authors should discuss this in the text.
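To make this suggestion concrete, a minimal numpy sketch of interpolating one beam's radial velocities onto the bin-centre heights before the wind fit is given below; the variable names and the linear interpolation are illustrative assumptions, not the authors' implementation.

```python
# Illustrative assumption, not the authors' implementation: for each beam,
# interpolate the radial velocities measured at the range-gate heights onto the
# centre height of each height bin, so that all beams contribute at one height.
import numpy as np

def radial_velocity_at_bin_centres(gate_heights, radial_velocities, bin_edges):
    """Linearly interpolate one beam's v_r onto the bin-centre heights."""
    bin_centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return bin_centres, np.interp(bin_centres, gate_heights, radial_velocities,
                                  left=np.nan, right=np.nan)

# Example: 30 m gate spacing along a tilted beam, 100 m height bins
heights = np.arange(45.0, 1000.0, 30.0)          # heights above ground of the gates
v_r = np.random.default_rng(0).normal(5.0, 1.0, heights.size)
centres, v_r_binned = radial_velocity_at_bin_centres(heights, v_r,
                                                     np.arange(0.0, 1001.0, 100.0))
```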
Combination of different scan patterns:
The common way for wind profile retrieval is to perform a certain scan pattern which should evenly cover the measuring volume (see Paeschke et al. 2015, mentioned in the text). Different scan patterns can be used to address different questions (fast scans like DBS with only a few beams to achieve high temporal resolution, PPI (or VAD) scans with many beams to reduce the uncertainty within one scan, RHI scans to possibly resolve spatial variability at least in one vertical plane, etc.). Several scans of the same kind can be combined by averaging.
Here, a different strategy is proposed, which combines all scans available in a certain time interval (hence the term 'heterogeneous measurements'). This introduces asymmetry in the spatial coverage and may lead to larger errors, as is briefly mentioned in Section 4.1 when describing the standard module chain and the removal of data from low-elevation beams at large distances (line 375). I recommend emphasising this strategy more strongly, but also discussing the effects and dangers of an uneven coverage of the measuring volume in Section 2.
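To illustrate the concern about uneven coverage, a small numpy sketch of one possible diagnostic follows (my own illustrative choice, not a quantity from the manuscript): the magnitude of the mean horizontal beam unit vector, which is near zero for a symmetric VAD and grows as the combined beams cluster in one sector.

```python
# Illustrative diagnostic (my choice, not from the manuscript): the magnitude of the
# mean horizontal pointing vector is ~0 for an evenly covered VAD and approaches 1
# when all combined beams point into the same sector.
import numpy as np

def horizontal_imbalance(azimuth_deg, elevation_deg):
    az, el = np.deg2rad(np.asarray(azimuth_deg, dtype=float)), np.deg2rad(float(elevation_deg))
    horiz = np.column_stack((np.sin(az) * np.cos(el), np.cos(az) * np.cos(el)))
    return np.linalg.norm(horiz.mean(axis=0))

full_vad = np.arange(0.0, 360.0, 30.0)                       # evenly distributed beams
sector = np.arange(0.0, 45.0, 5.0)                           # beams clustered in a 45 deg sector
print(horizontal_imbalance(full_vad, 75.0))                  # ~0: balanced coverage
print(horizontal_imbalance(np.r_[full_vad, sector], 75.0))   # > 0: asymmetric combination
```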
Extrapolation in regions with low CNR:
An important part of the advanced retrieval is the attempt to use data even at low CNR. To achieve this, a first-guess wind profile is derived and inter- and extrapolated to regions with low CNR. The kind of this extrapolation is not documented. The resulting fields are used in an iterative process as a 'confidence background'. Calculated wind speeds that do not deviate too far from this confidence background are accepted even for low CNR values. I am wondering whether this extrapolation is robust, especially at upper heights, where the CNR is low over large regions. The systematically large differences in the upper part of the profiles in Figs. 13, 14 and 15 might be a result of this extrapolation. The authors should discuss this in the text. It would be nice to have an analysis of how the retrieval performs for low CNR and whether the difference to the radiosonde data for these low-CNR regions is acceptable.
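As I understand the acceptance step, it amounts to something like the following sketch (a guess at the logic with hypothetical names, thresholds and a simple extrapolation, not the documented implementation):

```python
# Hypothetical sketch of a confidence-background acceptance test: a low-CNR wind
# estimate is kept only if it lies close to the inter-/extrapolated first-guess
# profile. Names, thresholds and the interpolation choice are my assumptions.
import numpy as np

def confidence_background(first_guess_heights, first_guess_speed, target_heights):
    """Inter-/extrapolate the first-guess wind speed to all target heights."""
    # np.interp holds the edge values constant beyond the first-guess range;
    # how robust such an extrapolation is at upper heights is exactly the question.
    return np.interp(target_heights, first_guess_heights, first_guess_speed)

def accept(candidate_speed, background_speed, tolerance=3.0):
    """Accept a low-CNR candidate only if it is within `tolerance` of the background."""
    return np.abs(candidate_speed - background_speed) <= tolerance

heights = np.arange(100.0, 3100.0, 100.0)
fg_heights = np.array([100.0, 500.0, 1000.0, 1500.0])   # heights with a reliable first guess
fg_speed = np.array([5.0, 7.0, 9.0, 10.0])
bg = confidence_background(fg_heights, fg_speed, heights)
print(accept(np.array([9.5, 20.0]), bg[[9, 9]]))         # -> [ True False ] at 1000 m
```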
Description of basic scan patterns:
As the paper also addresses researchers not familiar with all the details of Doppler lidar retrieval, it would be beneficial to give a compact and clear overview of possible scan patterns, sources of errors, etc. The word 'scanning' is used in an ambiguous way; the step-stare scan mode is mentioned, but the continuous scan mode is not.
To retrieve the 3D wind vector it is necessary to tilt several beams in different directions - this is called a scan. Common scan patterns are PPI (or VAD), RHI, etc. Scans can be performed in two different modes: step-stare and continuous. There is a trade-off between these modes (varying orientation, dead time during movement, time for one scan, etc.).
There are two main sources of error: the Doppler retrieval always has some uncertainty, which increases with decreasing CNR (or SNR) (hence the CNR thresholding as a filter strategy), and the violation of the homogeneous flow assumption due to the separation of the tilted beams by, e.g., turbulence. The along-beam Doppler uncertainty maps, depending on the elevation angle, onto the retrieved wind components. With increasing elevation, its contribution to the error of the horizontal wind components increases. With decreasing elevation, the separation between the beams becomes larger and the error due to turbulence becomes larger.
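For reference, the geometry behind these two error sources can be written out explicitly; this is the standard relation (cf. Browning and Wexler, 1968), not a formula taken from the manuscript:

```latex
% Standard beam geometry (cf. Browning and Wexler, 1968); not a formula from the manuscript.
% Radial velocity of the wind (u, v, w) for a beam at azimuth \varphi (clockwise from north)
% and elevation \theta:
v_r = u\,\sin\varphi\,\cos\theta + v\,\cos\varphi\,\cos\theta + w\,\sin\theta
% Inverting for the horizontal wind divides the along-beam (Doppler) uncertainty by \cos\theta,
%   \sigma_{u,v} \sim \sigma_{v_r} / \cos\theta ,
% so high elevations amplify the Doppler noise in the horizontal components, while the
% horizontal separation of two opposed beams at range R is about 2R\cos\theta, so low
% elevations increase the turbulence-related (inhomogeneity) error.
```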
Range of biases
Figures 13, 14 and 15 show the bias between wind speed profiles retrieved with this method and radiosondes. While it is nice that on average the bias is close to zero, the percentiles show that still 10% of the wind speeds deviate by more than about 4 m/s from the radiosonde data. This could be compared with the results of Rahlves et al. (2022) and Robey and Lundquist (2022) and discussed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Detailed comments:
Line 64: period is missing
line 94: Ambiguous use of the word 'scanning', especially with the list of scans to follow. Are general measurements in different directions meant here, or specifically measurements during continuous movement? An explanation and discussion of the step-stare and continuous modes is missing.
line 97: An alternative name for PPI is VAD, from 'Velocity Azimuth Display' (Doppler velocities are 'displayed' as a function of azimuth; wind components are retrieved by fitting a sine curve, |v| is proportional to the amplitude, etc.), already cited in Browning and Wexler (1968); the term stems from radar. Both terms are not perfect: PPI in radar technique means a scan at very low elevation and its presentation on a screen, while VAD can also refer to the display of wind vectors as a function of height and time sometimes used in aviation.
line 99: With one RHI at one azimuth it is not possible to retrieve a wind vector; it should be mentioned that this is only possible with a combination of RHIs in different azimuth directions.
line 100: "arrangements of fixed-direction stares, e.g. in step-and-stare"
This gives the impression that the above-described scan patterns are not done in step-stare mode but instead in a different way (which is never named in the paper). I would recommend mentioning the different modes (step-stare and continuous) first, discussing their advantages and disadvantages, and then listing the principal scan patterns.
line 104: "RHI scans allow for measurements closer to the ground,"
A PPI can also be performed at lower elevations. I do not see why it must be an RHI.
line 111: "The wind vector retrieval error is determined by ..."
I do not understand - is this duplicated in the next sentence?
line 117: "Steep measurements"
I guess you mean "... at high elevation angles".
line 118: "shallow scans"
I guess you mean "... at low elevation angles"
line 118: "... retrieval errors due to turbulence typically average out within a 10 min to 30 min span,"
But the uncertainty never vanishes - you cite Rahlves et al. (2022), who showed that the RMSD never becomes zero and is surprisingly large.
line 125: "... some instruments provide a signal-to-noise ratio (SNR),"
I would appreciate some words on what the difference between CNR and SNR in wind lidar data is, and whether there is a difference in their use.
line 123: "Filtering with the carrier-to-noise ratio (CNR) is ..."
Please mention that with decreasing CNR the uncertainty/error increases. Also mention parameterizations: there are estimates for the Doppler retrieval uncertainty - see e.g. Pearson and Collier (1999, doi:10.1002/qj.49712555918) and Manninen et al. (2016, doi:10.5194/amt-9-817-2016), who refer to Rye and Hardesty (1993) and O'Connor et al. (2010).
Also worth mentioning is that if the CNR becomes too low, the Doppler speed becomes evenly distributed in the bandwidth (as is visible in Fig. D3 in this paper).
line 127: "Potential causes of erroneous measurements despite high CNR..."
I am missing precipitation ...
line 138: "if the beam directions of the radial velocity measurements are not sufficiently dispersed ..."
I would prefer the word 'distributed [in space]' instead of dispersed.
(The term dispersion has a different meaning in optics, and another in chemistry, ...)
line 140: "The condition number (CN) is a measure for the robustness of the beam dispersion"
I would write: "... the CN is a measure for the robustness of the equation system with respect to errors in the input variables."
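For readers unfamiliar with the quantity, a minimal numpy sketch of the condition number of the beam-geometry matrix follows; the example geometries are hypothetical, and the quantity shown is the standard matrix condition number as used, e.g., by Paeschke et al. (2015).

```python
# The rows of A are the beam unit vectors; the wind vector solves A (u, v, w)^T = v_r
# in a least-squares sense, and cond(A) measures how strongly errors in v_r are
# amplified in the solution. The example geometries below are hypothetical.
import numpy as np

def beam_matrix(azimuth_deg, elevation_deg):
    az = np.deg2rad(np.asarray(azimuth_deg, dtype=float))
    el = np.deg2rad(float(elevation_deg))
    return np.column_stack((np.sin(az) * np.cos(el),
                            np.cos(az) * np.cos(el),
                            np.full(az.shape, np.sin(el))))

vad = beam_matrix(np.arange(0.0, 360.0, 30.0), 75.0)     # well-distributed azimuths
sector = beam_matrix(np.arange(0.0, 45.0, 5.0), 75.0)    # beams clustered in one sector

print(np.linalg.cond(vad))      # smaller: well-conditioned inversion
print(np.linalg.cond(sector))   # much larger: errors in v_r are strongly amplified
```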
line 143: "Maximizing the number of considered measurements allows for the smallest assumption violation, due to the beneficial effect of averaging."
This is the basic argument for the combination of several different scan patterns in the retrieval. In this universal formulation I doubt it, especially because it could be a violation of the argument about the spatial beam distribution in the previous paragraph. Fig. 1 is a nice example: the RHI has a much higher spatial coverage than the VAD, but in only one direction. In a VAD scan, a higher number of measurements in one direction leads to a lower condition number (Paeschke et al. 2015).
line 163: "Changing absolute range gate heights for different laser beam positions can be overcome by binning the measurements with respect to the height above ground."
If I understand right, this means that radial velocity measurements from different heights within one height bin are assumed to represent the same wind vector. This neglects the height dependence of the wind and accordingly violates the homogeneous wind field assumption.
line 180: "timestamps of the measurements,"
Is there any requirement on whether this should be the beginning or the end of the averaging interval?
line 181: "The time coordinate can differ between instruments at level 0 and will then also differ at level 1."
Unclear; please clarify.
line 194: "... azimuth (time), elevation (time) ..."
When performed in continuous mode, azimuth and elevation vary during the integration time. It should be mentioned which angles must be provided here (begin, middle, or end of the integration time).
line 212: "Level 2 represents wind vectors ..."
I guess you mean "Data level 2 represents ...".
line 239: "flagging of precipitation."
Is there a module to flag precipitation? If yes, how is this done?
line 250: "The validity_probability variable contains an acceptance (1.0) or rejection (0.0) value for each bin according to a user-defined threshold."
That seems rather to be a flag - or is it possible to implement fuzzy logic based on values between 0 and 1?
Table 1: "(8) Bin_statistics_l1_to_l2 ... provide CNR and Doppler spectrum width information to l2"
Interesting! How is this done?
Doppler spectral width can be regarded as a standard deviation - not in time, but in the measuring volume of the respective range gate. In Sathe et al. (2015), equation 4 shows that the along-beam variance depends on the variances and covariances of the principal components of the wind vectors. Thus a retrieval of the Doppler spectrum width in the principal spatial components could be done with this. But does it work?
line 376: "Such an imbalance negatively impacts the vertical wind vector calculation and is, therefore, undesirable."
This is the argument against the general statement from line 143 that the number of considered measurements should be maximized; line 143 should be adapted. I am wondering how this affects only/especially (?) the vertical wind. Maybe that is a result of the strong wind shear in the height bins close to the surface?
line 384: "... a strict CNR filter (the default threshold for Leosphere WLS200s is -25 dB) ... a weak CNR validity flag (default threshold -30 dB for Leosphere WLS200s)"
A guideline on how these thresholds could be determined for instruments from other manufacturers would be desirable. It would also be of interest to mention at some point how these kinds of thresholds or parameters are provided to the software and how they can be changed.
line 392: "... the volume enclosed in the convex hull spanned by the unit vectors"
This is unclear: if I imagine a VAD scan with constant elevation, the volume spanned by the endpoints is a polygon (let's say a disk) with a vertical extension equal to zero; its volume would be zero. If it is the whole cone spanned by the vectors, it is easily much larger than the threshold of 0.2. Please clarify.
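To make the ambiguity explicit, a small numpy/scipy sketch comparing the two possible readings follows (hull of the unit-vector endpoints only versus hull including the origin); the scan geometry is hypothetical and this is my reading of the criterion, not the authors'.

```python
# Two readings of "volume enclosed in the convex hull spanned by the unit vectors":
# (a) hull of the unit-vector endpoints only, (b) hull including the origin (a cone).
# The scan geometry is hypothetical; a vertical stare is added so that the endpoint
# cloud is not coplanar (a single-elevation VAD alone gives a degenerate 3D hull).
import numpy as np
from scipy.spatial import ConvexHull

def beam_unit_vectors(azimuth_deg, elevation_deg):
    az = np.deg2rad(np.asarray(azimuth_deg, dtype=float))
    el = np.deg2rad(float(elevation_deg))
    return np.column_stack((np.sin(az) * np.cos(el),
                            np.cos(az) * np.cos(el),
                            np.full(az.shape, np.sin(el))))

vectors = np.vstack((beam_unit_vectors(np.arange(0.0, 360.0, 15.0), 75.0),   # 24-beam VAD
                     beam_unit_vectors(np.array([0.0]), 90.0)))              # vertical stare

endpoints_only = ConvexHull(vectors).volume                                  # reading (a): ~0.002
with_origin = ConvexHull(np.vstack((vectors, [[0.0, 0.0, 0.0]]))).volume     # reading (b): ~0.07
print(endpoints_only, with_origin)
```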
line 406: "... even at low CNR reliable measurements may be available in some circumstances,"
Radial velocity uncertainty increases with decreasing CNR. Values will lie in a large interval around the true value. A value can lie just by chance in the trustworthy range around the 'confidence background'; it does not contain information and becomes trustworthy just by coincidence. I see this is addressed by limiting the share of accepted values to a minimum.
line 412: "...the confidence background is extrapolated in module (7.2)."
How does this extrapolation work?
In Fig. 9 it is obvious that wind speeds are extrapolated into large regions where the first guess gave no data. This bears the danger of creating unrealistic values. Application of this confidence background to low CNR values may let otherwise implausible values pass through. Please discuss this.
line 463: "Whether a higher availability from more iterations justifies the increased calculation effort can be decided by the user in the individual application case."
It would be great to see whether the additionally retrieved wind vectors are of good quality.
line 475: "Heterogeneous scans of type DBS, step-and-stare, and sector PPIs are used for wind profile retrieval (Fig. E2 (a))."
I see the sector PPI at an elevation of only a few degrees in a sector of only 45 deg width. Does this data contribute to the retrieval, or is it excluded as it is too low and too far away?
line 478: "The lidar conducted PPI scans ... (Fig. E2 (b))."
This looks as if the lidar performed its scan in continuous mode. Please clarify.
line 486: "... also termed VAD pattern for Halo systems ..."
This term is not Halo specific. You can find it for wind lidar scans already in Browning and Wexler (1968). The synonyms PPI and VAD could be introduced together with the scan patterns.
line 492: "At weak SNR, very frequent vertical winds with 1ms-1"
You could refer to Fig. D3 (c) - it is clearly visible there.
line 507: "Fig. 12 shows the comparison of the radiosonde measurements with the corresponding retrieved horizontal wind speed".
How are radiosonde and wind profile data matched? The radiosonde needs about 16 minutes for 1 km, or more than 1 h 15 min for the 5 km displayed. Is it compared to the wind profile at the start time, to one profile in between, or to several lidar profiles according to the closest time?
Fig 12:
Instead of repeating the scatter plot with different, difficult-to-distinguish colours, it would be helpful to see the differences lidar - radiosonde as a function of the parameter of interest, i.e. radiosonde distance, vertical velocity and possibly wind speed.
Fig 12 - caption:
"For Villingen-Schwenningen, one radiosonde profile is omitted"
I guess the crosses are the data from this profile.
Fig 13:
Systematic negative Delta v above 4km:
I recognize that the interquartile range does not deviate as much as the (arithmetic?) mean, i.e. only a few extreme data points with Delta << -4 m/s (light grey shading) are responsible for this negative bias. N is also low, so this may be only one single profile?
I am wondering whether this extreme bias is a result of the extrapolation of the confidence background. Similar deviations are visible in Figs. 14 and 15.
line 534: "For Payerne and Villingen-Schwenningen, the comparison shows good agreement"
What about the negative bias at Villingen-Schwenningen between 0 and 3 km? Ah, I see this is discussed later.
line 539: "The slight variation in the number of retrieved wind vectors ..."
This is an aliasing effect. Height bins are every 100 m. Let's assume your range bins map to heights every 30 m. As 100 m is not an integer multiple of 30 m, you get alternating 3 and 4 measurements in one height bin. As several measurements are collected in every height bin, the variation becomes a multiple of 3 and 4, respectively.
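The alternating counts can be reproduced with a toy numpy check using the 100 m bins and an assumed 30 m gate spacing:

```python
# Toy check of the aliasing argument: 30 m gate heights counted into 100 m bins
# give alternating counts of 3 and 4 per bin.
import numpy as np

gate_heights = np.arange(0.0, 1000.0, 30.0)     # assumed 30 m range-gate spacing
bin_edges = np.arange(0.0, 1001.0, 100.0)       # 100 m height bins
counts, _ = np.histogram(gate_heights, bins=bin_edges)
print(counts)                                   # -> [4 3 3 4 3 3 4 3 3 4]
```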
line 547: "(bias) in the u component of about 0.4ms-1"
I see negative values in Fig. 14.
line 550: "The u-component shows *a* systematically higher mean values,"
(Typo with the 'a' )
The bias in the plot is negative, and I thought it was 'lidar - radiosonde'.
I have the impression that at least the total range of the deltas is larger for the u than for the v component.
Citation: https://doi.org/10.5194/gmd-2024-222-RC1