Reply on RC2

The paper tested different GSI assimilation settings with FV3 LAM runs, and proved that the system can predict the severe convective squall line case over Oklahoma on 4 May, 2020. The results are useful to most NCEP forecast data users. The paper has some unclear or incomplete reasoning but will likely be a significant contribution with revision and clarification.
We thank the reviewer for the comments and suggestions. Our responses are noted below. Changes to the document are highlighted in red.

General comments:
For most experiments in this paper, only the comparison results were shown, and the specific causes of the results were not analyzed.
For example, the differences between PSEUDO and 75EnBEC experiments were huge, but the authors have not given many diagnostics. Why more observations through GSI will cause overestimated convection?
The PSEUDO experiment had to be rerun because of a bug in the calculation of the planetary boundary layer (PBL) height, which affected the PBL pseudo-observations function. This bug affected only this experiment. Accordingly, Figures 11, 12, and 13 were replaced with the new results, and the corresponding discussion and some diagnostics were added to Sect. 4.3 (lines 512-537) of the revised manuscript, as suggested by the reviewer.
Another example, the VLOC of 1 layer should capture finer vertical features of low atmosphere but the result showed that the positive impact is above 650 hPa and negative impacts are below 800 hPa, why?
Indeed, the VLOC experiment was conducted to capture finer features of the lower atmosphere by reducing the vertical localization from 3 levels to 1 level in the first 10 model layers. However, the results showed a positive impact above 650 hPa rather than in the lower atmosphere. In part, this is because the multivariate relationships within the background error covariance spread the impact to different levels and locations. In addition, because of the cycling technique, the impacts on the forecasts appear at other levels. This led us to conclude that the default value of 3 layers already gives the best results at most vertical levels, and that a value of 1 layer may be too small given that the distance between layers in the lower atmosphere is also small. A sentence was added to expand the discussion of these results.
Lines 500-503: "The analysis cycling technique and multivariate relationships in the BEC spread the observations impact throughout different levels and locations, which could have led to the slight positive impact above 650 hPa instead of the lower atmosphere where the modification in the vertical localization was made."
If the RRFS aims to replace the NCEP operational suite of regional and convective scale modeling systems in the next upgrade, it would be best to show the result from RAP as a baseline for all these tests.
This study provides very preliminary results on how the GSI analysis system and the limited area capability of the Finite Volume Cubed Sphere dynamical core (FV3 LAM) perform with the different options tested. It evaluates different functions and parameter values currently used in HRRR and RAP. However, some functionalities are still being developed and tested. A more comprehensive study with more up-to-date RRFS developments is therefore underway, and it will include a comparison against RAP (as well as HRRR, NAM, and NAM nests).

Specific comments:
In Figure 2. RRFS cycling configuration diagram, the cold start is at 0 utc and the warm start seems to be from 1 utc to 6 utc. But in Figure 5 and relative context, the cold start is at 0 utc and 12 utc, what is the exact cold start interval?
We agree that the diagram could mislead readers. It was modified to include cycles from 00:00 UTC to 12:00 UTC. Lines 247-250 were modified accordingly.
Lines 247-250: "Figure 2 illustrates the RRFS cycling configuration from cycles initialized between 00:00 UTC through 12:00 UTC. In each cycle, an 18 h free forecast is launched following the analysis, with hourly outputs. A cold start is performed at 00:00 UTC and 12:00 UTC and warm starts between 01:00 UTC to 11:00 UTC using the FV3 LAM 1 h forecast from the previous cycle as background for the analysis."
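
To make the revised cycling description concrete, the schedule can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not part of the RRFS workflow: the function name `cycle_schedule`, the dictionary keys, and the choice of forecast hours 1 to 18 as the "hourly outputs" are hypothetical, and the date is simply the case day mentioned by the reviewer. The source of the cold-start background is not stated in this reply, so it is left unset.

```python
from datetime import datetime, timedelta

COLD_START_HOURS = {0, 12}   # cold starts at 00:00 and 12:00 UTC (from the revised text)
FORECAST_LENGTH_H = 18       # length of the free forecast in hours (from the revised text)


def cycle_schedule(day):
    """Return one record per hourly cycle from 00:00 to 12:00 UTC on `day`.

    Hypothetical helper for illustration only.
    """
    cycles = []
    for hour in range(0, 13):  # cycles 00:00 through 12:00 UTC inclusive
        init = day + timedelta(hours=hour)
        cold = hour in COLD_START_HOURS
        cycles.append({
            "init": init,
            "start": "cold" if cold else "warm",
            # Warm starts use the 1 h forecast from the previous cycle as background.
            # The cold-start background is not specified here, so it stays None.
            "background_from": None if cold else init - timedelta(hours=1),
            # Assumption: hourly outputs at forecast hours 1..18.
            "outputs": [init + timedelta(hours=h)
                        for h in range(1, FORECAST_LENGTH_H + 1)],
        })
    return cycles


schedule = cycle_schedule(datetime(2020, 5, 4))
for cycle in schedule:
    print(cycle["init"].strftime("%H:%M UTC"), cycle["start"])
```

Listing the cycles this way shows the 12 h cold-start interval the reviewer asked about: the 00:00 and 12:00 UTC cycles are cold starts, and the eleven cycles between them are warm starts.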

L127 LAM appeared first time, should be limited area modeling (LAM) capability
We added the LAM definition right after its first appearance in Line 55, as suggested.
In Figure 9, no "Matched pair counts used for RMSE and bias computation at each cycle" were found in the photos.
We thank the reviewer for pointing out this mistake. The sentence was removed from the caption of Figure 9.