The revised manuscript is greatly improved, especially the introduction section. Most of the comments have been well addressed. However, I still have some specific comments.
The basic idea of this manuscript is to reduce the development burden on hydrological modelers, so that they can achieve high-performance watershed modeling without restructuring the model code. This idea is novel and clearly stated. The implementation based on the SWAT model, i.e., GP-SWAT, should be helpful for the scientific community. Overall, I am glad to recommend acceptance for publication after minor revision.
1. In Lines 53-54, the authors introduce three types of parallelization strategies: model-level, submodel-level, and spatial-decomposition. However, in my view, the authors have confused the spatial-decomposition method with the submodel-level method, i.e., Lines 64-79 should describe the spatial-decomposition method, or more precisely, the spatial(-temporal) decomposition method, and Lines 80-90 should describe the submodel-level method. That is, the so-called submodel level is a special case of the spatial(-temporal) decomposition method, in which each submodel is a full model executed on one part of the watershed (i.e., a subbasin). In addition, each parallelization type should be given a short and precise definition. Please consider my suggestion.
2. The title uses “a…simulation framework”, but the introduction only lists some parallelization strategies (or parallelization schemes). I would suggest introducing existing hydrological modeling frameworks based on parallel computing and pointing out their weaknesses. I think that would answer the second comment of Referee #1 (Line 95: It's better to state why this research wants to propose a new parallelization scheme?). Also, the main text uses “a two-level parallelization scheme”; why not “framework”, and what is the difference?
3. The authors claimed that “indeed, the actual speedup ratio that can be achieved is largely dependent on the structure of the stream network.” and “The intention of using two study areas in this study was to demonstrate how stream network complexities can affect GP-SWAT performance”. Although the revised manuscript adds more description of the two study areas, I cannot find any quantitative or qualitative analysis of how the structures of the two stream networks differ and how those differences lead to different results. I would therefore suggest retaining only the Jinjiang study area. Alternatively, if the authors can provide a method for calculating the theoretical speedup ratio from the structure of the stream network and the available computing resources, the use of two distinct study areas would be much more valuable.
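To illustrate the kind of calculation meant in comment 3, a minimal sketch follows. It is not taken from GP-SWAT. It assumes that each subbasin can only be simulated after all of its upstream subbasins have finished, and that a per-subbasin cost is known. The function name `speedup_upper_bound` and the inputs `downstream` and `cost` are hypothetical. The bound is the standard work/critical-path limit: the speedup cannot exceed the number of processors, nor the total work divided by the longest upstream-to-outlet chain.

```python
from collections import deque


def speedup_upper_bound(downstream, cost, processors):
    """Upper bound on speedup for a subbasin routing network.

    downstream: maps each subbasin to the subbasin it drains into (None for the outlet).
    cost: maps each subbasin to its (assumed) simulation cost.
    processors: number of available processors.
    """
    total_work = sum(cost.values())

    # Count how many upstream neighbours each subbasin still waits for.
    pending = {sub: 0 for sub in cost}
    for sub, down in downstream.items():
        if down is not None:
            pending[down] += 1

    # Process subbasins from headwaters to outlet (topological order).
    ready_at = {sub: 0.0 for sub in cost}
    queue = deque(sub for sub, n in pending.items() if n == 0)
    critical_path = 0.0
    while queue:
        sub = queue.popleft()
        finish = ready_at[sub] + cost[sub]
        critical_path = max(critical_path, finish)
        down = downstream.get(sub)
        if down is not None:
            ready_at[down] = max(ready_at[down], finish)
            pending[down] -= 1
            if pending[down] == 0:
                queue.append(down)

    # Speedup is limited by both the processor count and the critical path.
    return min(processors, total_work / critical_path)


# Hypothetical networks with unit cost per subbasin.
chain = {"a": "b", "b": "c", "c": "d", "d": None}
fan = {"h1": "out", "h2": "out", "h3": "out", "h4": "out", "out": None}
chain_bound = speedup_upper_bound(chain, {s: 1.0 for s in chain}, 4)
fan_bound = speedup_upper_bound(fan, {s: 1.0 for s in fan}, 4)
```

Under these assumptions, a purely linear chain of subbasins admits no speedup regardless of the number of processors, whereas a network with many parallel headwaters does. A comparison of this kind for the two study areas would support the authors' claim about stream network structure.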