I thank the authors for responding in full to all of my questions and comments. However, in doing so they have introduced errors within their manuscript, and in some cases made incorrect statements and/or misunderstood concepts. These errors and misunderstandings need to be corrected before the manuscript is published. One error that the authors have made (in relation to uncertainties in the GLODAPv2 data) may change some results.
So I recommend that the paper is revised before acceptance.
These errors are listed here:
1. The authors have attempted to address my questions about temperature handling, but they have misreported and/or misunderstood multiple aspects. These are:
a) lines 159 – 164. These statements are misleading and confuse the issue. The re-analysis is needed to ensure that all of the data are valid for a consistent depth. Ship and in situ measurements within a large collated dataset will have been collected from multiple different depths, and these variations in depth will have an unknown impact on the calculated air-sea gas fluxes. This problem is mentioned in Goddijn-Murphy et al., (2015) and explained in detail within Woolf et al., (2016). The use of a remote sensing temperature dataset for the re-analysis means that the data all become valid for a fixed and known depth. The Reynolds OISST dataset that the authors have used is a long-term, consistent, climate-quality dataset of remotely sensed infrared data (measurements of the thermal signal from the top few mm of the water) that are calibrated to a fixed depth using quality-controlled and calibrated in situ buoy data. The statements on lines 159 to 164 (and the equivalent sentences in the supplementary section, and similar sentences in the discussion – see point 1e below) need to be updated so that they correctly convey this reasoning and explanation.
The re-analysis to a known and consistent depth then also makes it possible for a more accurate gas flux calculation, as you then have the ability to use or estimate skin and sub-skin temperatures. But this is a separate issue.
The version of the Reynolds OISST dataset also needs to be stated (as there are two different methodologies, one uses passive microwave, the other does not).
Woolf et al., (2016)
Goddijn-Murphy et al, (2015)
b) lines 165 to 167. This is an unusual scientific justification. It reads as though the authors are choosing to exclude recent advances because including them appears to degrade their expected result. It's also not clear if the ‘reference’ dataset (that they used for the evaluation in the supplementary section) was the original SOCAT data, or the re-analysed SOCAT data. You would expect that assessing a re-analysed (interpolated output) result, using the original (not re-analysed) dataset as the reference, would indeed give a poorer result (due to the poorer reference value).
This latter point is an example of why using RMSE, where the E is error, is potentially misleading as it's a difference, not an error – please see my point 3 below.
A better reasoning for their method choice would be that the authors are choosing a method that is consistent with those used by the SOCOM datasets, to which the authors would like to compare/contrast their work.
c) The methods are confusing and inconsistent with the response to reviewer comments. In the reviewer responses the authors say that they will use a more accurate calculation for the gas fluxes (as given by Woolf et al., 2016 where two solubilities are used), whereas equation 2 says that they are not using a more accurate calculation (as it contains one solubility term), but then on line 315 the authors say that they are using a more accurate calculation (they state they are using the Rapid model, but Equation 2 is not the rapid model). Please can the authors clarify their methods and ensure that the text matches the equations and the information from the Woolf et al, (2016) reference that they refer to.
d) Why is the UEA-SI method an exception (as stated on lines 454 to 455)? All of the SOCOM datasets that the authors have used use the original SOCAT data, and so none of these datasets have re-analysed the SOCAT data. So I don’t think that there are any exceptions, and they will all contain unknown impacts due to not re-analysing the SOCAT data to a consistent depth.
e) lines 655 to 664. This paragraph is also confused and makes incorrect statements. The same mistakes as described above in point 1a) are repeated in this paragraph. Please refer to the original Woolf et al., (2016) paper where these issues are discussed in depth.
Any difference introduced due to inconsistent sampling depths will vary spatially and will depend upon the ship collecting the measurements, and so the measurements (original versus re-analysed) are unlikely to always shift in unison as suggested. E.g. there are likely to be larger differences between original and re-analysed data within an area of upwelling than in areas of well-mixed ocean waters.
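To illustrate the distinction raised in point 1c, the following is a minimal sketch of a single-solubility bulk flux expression alongside a form with two solubility terms (one at the base and one at the top of the mass boundary layer), which is my understanding of the approach described by Woolf et al., (2016). All function and variable names are hypothetical, units are left to the caller, and this is not the authors' code nor a definitive implementation of their Equation 2.

```python
def flux_single_solubility(k, alpha, f_water, f_air):
    """Bulk form with one solubility term: F = k * alpha * (f_water - f_air).

    k       : gas transfer velocity (assumed input)
    alpha   : a single solubility applied to both sides of the interface
    f_water : CO2 fugacity in the water
    f_air   : CO2 fugacity in the air
    """
    return k * alpha * (f_water - f_air)


def flux_two_solubilities(k, alpha_subskin, alpha_skin, f_water, f_air):
    """Form with two solubility terms:
    F = k * (alpha_subskin * f_water - alpha_skin * f_air).

    alpha_subskin : solubility evaluated at the sub-skin temperature
    alpha_skin    : solubility evaluated at the skin temperature
    """
    return k * (alpha_subskin * f_water - alpha_skin * f_air)
```

The two expressions coincide only when `alpha_subskin` equals `alpha_skin`, which is why the text, Equation 2 and the statement on line 315 need to agree on which form was actually used.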
2. The authors have used an incorrect combined uncertainty value for the GLODAPv2 dataset. The value taken from the Olsen et al., (2016) paper that the authors have used as a combined uncertainty is actually the bias value, which is only one part of a Type A uncertainty (BIPM, 2008). Both a bias and variance estimate are needed for a Type A uncertainty estimate. The Olsen et al. work does not provide a variance estimate, so I would suggest that the authors use the state of the art uncertainties as provided by Bockmon and Dickson, (2015). Bockmon and Dickson give a combined uncertainty of 0.5% for total alkalinity and for total dissolved carbon (whereas the existing bias estimate that the authors are using equates to ~0.2% for alkalinity, which illustrates that the uncertainty estimate used by the authors is too low). Using a more correct estimate of the combined uncertainty may mean that some results throughout the paper need updating.
Bockmon and Dickson, (2015)
Olsen et al., 2016
3. Throughout the manuscript the authors have incorrectly used the word ‘error’ when they actually mean uncertainty or difference (e.g. when using the root mean squared error, RMSE). The use of the word error implies that some ‘truth’ value is known, whereas all measurements and observations contain uncertainties and are not ‘truth’ (including those within GLODAPv2 and SOCAT that are being used as a reference for the RMSE calculations). I would suggest that the authors instead report RMSD (where the D is difference) and check and correct the wording used within sentences throughout the paper (and the supplementary) to refer to differences, uncertainties and errors (as many instances where the word ‘error’ has been used are not strictly ‘errors’).
There are typographical errors in the main text (e.g. ‘sea-are’ on line 311) and in the reference list (e.g. the Woolf et al., 2015 paper in the reference list on page 35 has an incorrect lead author, an incorrect co-author list and an incorrect publication year). There are also sentences that are incomplete and/or contain errors (e.g. line 123 in the supplementary ‘…showing that SOCAT temperatures or on average…’ [whereas it should be ‘temperatures are on average’]).
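As a minimal sketch of the reporting suggested in point 3, the following computes the mean difference (bias) and the root mean squared difference (RMSD) between a set of estimates and a reference dataset, neither of which is treated as ‘truth’. The function name, argument names and the additional centred RMSD term are my own illustrative choices and are not taken from the manuscript.

```python
import math


def bias_and_rmsd(estimates, reference):
    """Return (bias, rmsd, centred_rmsd) for paired values.

    The values are differences between two uncertain datasets,
    so they are reported as differences rather than as errors.
    """
    if len(estimates) != len(reference) or len(estimates) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Pairwise differences: estimate minus reference
    diffs = [e - r for e, r in zip(estimates, reference)]
    n = len(diffs)

    # Mean difference between the two datasets
    bias = sum(diffs) / n

    # Root mean squared difference (includes the bias contribution)
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)

    # Scatter of the differences about their mean (bias removed)
    centred_rmsd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)

    return bias, rmsd, centred_rmsd
```

Reporting the bias and the centred term separately makes it clear how much of the RMSD is a systematic offset against the reference and how much is scatter, since `rmsd**2` equals `bias**2 + centred_rmsd**2`.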
I would encourage the authors to take some time and slowly go through the manuscript and check their sentences, phrasing, spelling and references.