|The manuscript has been significantly improved based on the previous reviews. However, I still think it is a bit of a mixture of topics that usually should not be combined in a single paper and much of which have relatively limited scientific novelty. But I think it is up to the journal editor to decide whether they think it is appropriate for the journal.|
The manuscript in my view consists of three topics:
a) How to derive the forcing for a lake model if limited meteorological data is available (explicitly no direct observations of the radiative fluxes)? A viable way for this is now clearly described in the paper, but I don’t think there is much novelty included in this approach.
b) The development of a new lake model. It remains unclear, however, which properties of the new model justify its presence besides the substantial number of already existing models. The manuscript focuses on the integration of the heat flux procedure from a), but this could have been easily integrated in any already existing lake model instead. Furthermore, due to the mixture of topics, the structure of the paper is suboptimal for a methods paper introducing a new model.
c) The numerical experiment looking at the model performance as a function of the simulation length. This is actually an interesting numerical experiment, and I am not aware that I have seen it been done elsewhere. However, it takes a disproportionate fraction of the paper, if the focus of the paper should be the introduction of a new model (as implied by the title). Furthermore, there is almost no guidance for the reader to understand the purpose and the implications of this numerical experiment. Why is this analysis performed? What hypotheses should be investigated with this approach? What general take-home messages do result from this analysis? What are the implications for the application of this and other lake models?
In addition, the two following major points should be considered:
 The new comparison of the model with GOTM and SCHISM could be useful. However, it needs a description how the other two models were parameterized (could be in the supplement). Was exactly the same approach used for all three models (e.g., same initial profile, same meteorological forcing)? The text is a rather unclear about that. Also, I have no experience with SCHISM, but there must be clearly something wrong with the model setup. Otherwise the surface temperature couldn’t possibly be systematically off by 5 °C for the entire second half of the year.
 The model does not, as far as I understand, consider the changing lake area with depth. This leads by definition to either a wrong net heat flux at the lake surface or wrong temperatures in the lake, because the ratio of surface area to volume is different in the model than in the real lake. This should somewhere be mentioned as a limitation of the model.
 The manuscript is rather lengthy and contains a large number of figures. It also seems occasionally repetitive. There is certainly potential to make it more concise without losing relevant information.
Line 45: I would not consider GLM (Hipsey et al., 2019) as a two-layer model
Line 54: The first author’s name is Råman Vinnå.
Line 175: Eddy diffusivity is certainly not negligible in the hypolimnion of these lakes or small lakes in general. Observations in much smaller lakes typically show vertical diffusivities in the hypolimnion on the order of 10^-6 m2/s, which is one order of magnitude larger than thermal diffusivity. One of the first famous studies to investigate this is Powell and Jassby (1975, https://doi.org/10.4319/lo.1975.20.4.0530), who investigated Castle lake (0.2 km2). Other examples are Soppensee (0.23 km2; Vachon et al. 2019, https://doi.org/10.1002/lno.11172), or two small lakes (0.25 and 0.05 km2) in the Canadian Experimental Lakes Area (Quay et al., 1980; https://doi.org/10.4319/lo.1980.25.2.0201). There are certainly many more examples available in literature. Mixing in the interior of lakes can sometimes go down to molecular levels, but basin-scale mixing almost never does.
Line 231: The description of D is rather confusing and disagrees with that in the original paper of Winslow et al. (2001).
Line 252: add units to K1 (lambda_e) and K3. Why not just use lambda_e in the equation instead of K1, as it is used in the equation for K2 anyway?
Equation (22): I don’t think there is a physical reason to assume that (1 - longwave albedo r) and the emissivity of the water surface (epsilon) are identical. Why not simply keep r and eps as two variables, they can still take the values 0.04 and 0.96 to keep the results the same.
Line 290: typo in “heat”
Equations (29) and (30): should these equations not have a negative sign (positive fluxes downward)?
Line 347: I still think a time step of 1 hour is rather large for the interpolation and could lead to significant numerical errors. Maybe it is ok, but it should at least be checked for one example by how much the model output changes if a smaller time step is used. I disagree with the previous reply that the available time resolution of the meteo data precludes such an analysis. The main question is whether the mixing algorithm in the lake model produces different results for a higher time resolution and that can also be tested with meteo forcing that is constant over each hour.
Line 396: Is “totally out of phase” the correct statement here?
Line 405: Monomictic lakes can also mix at temperature significantly different from 4°C. It is therefore not necessary a correct assumption to initiate the model with a constant temperature of 4°C.
Line 457: If the reason for the too high surface temperatures was an underestimation of turbulent mixing, there should also be an underestimation of deepwater temperatures for the same reason. This is clearly not the case, as the bias is positive for all depths. The reason for the bias therefore must almost certainly be the heat budget, where either some incoming heat flux is systematically overestimated or some outgoing heat flux is underestimated. Unfortunately, it is not possible to make a full heat budget calculation since the model does not consider the lake bathymetry (see point  above), but it could be at least approximately estimated by how much the net heat input to the lake is overestimated. For example, a mean bias of 0.5 °C (as roughly estimated from Fig. 10a) over 40 m depth would correspond to an excess heat of 84 MJ/m2. If that excess heat results over a time of 30 days, the net heat flux at the surface is overestimated by about 32 W/m2.
Figure 9: add units to those metrics that have units.
Figure 15: wrong caption
Line 580: straightforward
Figures A4 and A5: captions are exchanged?