Retrieving monthly and interannual pHT on the East China Sea shelf using an artificial neural network: ANN-pHT-v1

While our understanding of pH dynamics has progressed strongly for open ocean regions, progress for marginal seas such as the East China Sea (ECS) shelf has been constrained by limited observations and by complex interactions between biological, physical, and chemical processes. Seawater pH is a very valuable oceanographic variable, but it is not always measured using high-quality instrumentation and according to standard practices. In order to predict total scale pH (pHT) and enhance our understanding of the seasonal variability of pHT on the ECS shelf, an artificial neural network (ANN) model was developed using 11 cruise datasets from 2013 to 2017 with coincident observations of pHT, temperature (T), salinity (S), dissolved oxygen (DO), nitrate (N), phosphate (P) and silicate (Si), together with sampling position and time. The reliability of the ANN model was evaluated using independent observations from 3 cruises in 2018, which gave a root mean square error of 0.04. The ANN model responded positively to T and DO errors and negatively to S errors, and it was most sensitive to S errors, followed by DO and T errors. Monthly water column pHT for the period 2000-2016 was retrieved using T, S, DO, N, P, and Si from the Changjiang Biology Finite-Volume Coastal Ocean Model (FVCOM). The retrieved pHT agrees well with in situ observations in winter, while the reduced performance in summer can be attributed in large part to limitations of the Changjiang Biology FVCOM in simulating summertime input variables.


Introduction
Atmospheric carbon dioxide (CO2) levels have increased by nearly 46%, from approximately 278 ppm (parts per million) in 1750 (Ciais et al., 2013) to 405 ppm in 2017 (Le Quéré et al., 2018). The oceans have absorbed approximately 48% of the anthropogenic CO2 emissions (Sabine et al., 2004), resulting in decreasing long-term pH trends of ~0.02 per decade in open ocean waters (e.g., Dore et al., 2009; González-Dávila et al., 2010; Bates et al., 2014; Lauvset et al., 2015). While a gradual decrease in pH is a predictable open ocean response to elevated anthropogenic CO2 emissions, the seasonal changes and long-term trends in pH in coastal seas are not yet fully understood because of the lack of long-term pH data and the complexity of coastal systems. In this context, the development of approaches to predict carbonate chemistry parameters in coastal regions may assist both the management of local water quality and our wider understanding of the ocean carbon cycle.
Many attempts have been made to predict seawater pH by developing empirical relationships between pH and environmental variables, such as temperature (T) (Juranek et al., 2011), salinity (S) (Williams et al., 2016), dissolved oxygen (DO) (e.g., Juranek et al., 2011; Sauzède et al., 2017), nutrients (e.g., Williams et al., 2016; Carter et al., 2016, 2018), and longitude and latitude (Sauzède et al., 2017). Compared with traditional empirical methods, artificial neural networks (ANNs) have been proposed as powerful tools for modelling uncertain and complex systems such as ecosystems and environmental assessment (e.g., Olden and Jackson, 2002; Olden et al., 2004; Uusitalo, 2007; Raitsos et al., 2008; Chen et al., 2017). Their main advantage compared with, e.g., multiple linear regression (MLR) models may be greater flexibility and versatility in modelling complex nonlinear relationships. ANNs have been used for the retrieval of the partial pressure of carbon dioxide (pCO2) (e.g., Friedrich and Oschlies, 2009; Laruelle et al., 2017), total alkalinity (e.g., Velo et al., 2013; Bostock et al., 2013; Sasse et al., 2013), total dissolved inorganic carbon (e.g., Bostock et al., 2013; Sasse et al., 2013), and phytoplankton functional types (e.g., Raitsos et al., 2008; Palacz et al., 2013). However, these studies mainly focus on the open ocean; relatively few studies have focused on coastal seas, perhaps because of the complexity and heterogeneity of the continental shelves. Alin et al. (2012) developed an MLR model to reconstruct pH in the southern California Current System, while Moore-Maley et al. (2016) evaluated the interannual variability of near-surface pH in the Strait of Georgia using a one-dimensional biophysical mixing layer model.
To our knowledge, no empirical relationship for pH has yet been established for the ECS.
The ECS is the largest marginal sea in the western North Pacific Ocean and receives massive terrestrial inputs from the Changjiang (Yangtze River). The shelf shallower than 200 m covers more than 70% of the entire ECS (e.g., Ichikawa and Beardsley, 2002; Lie and Cho, 2016), and its dominant currents follow seasonal circulation patterns. The spatial and temporal distributions of the carbonate system have been investigated in the ECS (e.g., Chou et al., 2009; Cao et al., 2011; Qu et al., 2015) and were found to largely reflect the distributions of various water masses. The pattern of carbon sources and sinks exhibits substantial seasonal variation (Guo et al., 2015), and the ECS is generally considered a sink of atmospheric CO2 throughout the year except in fall (e.g., Shim et al., 2007; Zhai and Dai, 2009). A mechanistic semi-analytical algorithm (MeSAA) was developed to study pCO2 variations in response to various controlling mechanisms during summertime (Bai et al., 2015). However, the seasonal variability of pH in the ECS has received very little study, mainly because of the limited observational coverage and the irregular variability caused by seasonal fluctuations of the Changjiang discharge and by anthropogenic processes. Developing methods to extend the seasonal coverage of pH data may thus help to improve our understanding of the ocean carbon cycle in the ECS. This paper is structured as follows: section 2 describes the cruise data and ANN model building; section 3 shows the performance, sensitivity and application of the ANN model. The last section presents the summary and conclusions.

Data
Ten cruises were conducted on the ECS shelf during the "Shiptime Sharing Project of National Natural Science Foundation of China" from 2013 to 2017 (Fig. 1): five summer cruises (17 to 28 August 2013, 10 to 17 July 2014, 9 to 20 July 2015, 4 to 28 July 2016, and 20 to 30 July 2017), two winter cruises (21 to 28 February 2014 and 15 to 28 February 2017), and three spring cruises (4 to 20 March 2013, 11 to 21 March 2015, and 7 to 19 March 2016). T and S profiles were obtained directly using conductivity-temperature-depth/pressure (CTD) recorders (SBE 25plus or 911plus). Measurement of DO followed the Winkler procedure, as described previously by Zhai et al. (2014). Nutrient samples were first filtered with a 0.45 μm Whatman GF/F membrane, then stored in 250 mL HDPE bottles until chemical analysis. Nitrate (N), phosphate (P) and silicate (Si) were determined using a segmented flow analyzer (Model: Skalar SAN PLUS, Netherlands) with a precision of < 5% (Zhang et al., 2007); the detection limits are 0.14 μM for N, 0.06 μM for P, and 0.07 μM for Si. pH samples were stored in 140 mL brown borosilicate glass bottles and sterilized by addition of 50 μL saturated HgCl2 solution. Three traceable NIST (National Institute of Standards and Technology) pH buffers were used (pH = 4.00, 7.02, and 10.09). As described by Zhai et al. (2012, 2014), we converted the measured pH to total scale pHT by subtracting 0.143, and the overall accuracy of the pHT dataset was estimated as 0.01.

Three cruises were carried out on the ECS shelf in 2018 (Fig. 2) during the "National Natural Science Foundation Shared Voyage Plan", from 10 to 19 March, 12 to 20 July, and 12 to 21 October, and one cruise was carried out near the Changjiang Estuary during May 2017 (Fig. 1). The measurement methods for T, S, DO, and nutrients were the same as those used on the ten cruises described above. pH samples were stored without filtering in 500 mL high-quality borosilicate glass bottles and sterilized by addition of 200 μL saturated HgCl2 solution until measurement in the lab. The pHT was measured at the temperature in the flow cell using an Automated Flow-through system for Embedded Spectrophotometry (AFtes) with a precision of 0.0005 pH unit and an uncertainty of < 0.003 (Reggiani et al., 2016). Water samples were collected at three or four different depths during all cruises.
We omitted data points where one or more of the other variables were missing. The three cruises during 2018 (Fig. 2) were used as an exploratory dataset to estimate the predictive performance of the model, while the remaining eleven cruises (Fig. 1) were used as a confirmatory dataset to train the model. The final number of observations in the confirmatory dataset was 1854 (see Table 1 for more detailed information on the field survey).

Artificial neural network development
The ANN we used is a feed-forward multilayer perceptron (Tamura and Tateishi, 1997) with two hidden layers. The neurons of each layer are connected with the neurons of the previous layer and the next layer by weights (Fig. 3a). The coefficients of the weight matrix are iteratively tuned in the training step. In order to avoid overfitting, a ten-fold cross-validation was used to assess model prediction accuracy (Fig. 3b). Here, the confirmatory dataset was randomly divided into ten equal subsamples. One subsample was used as the independent validation data (10% of the confirmatory dataset) and was always excluded from training; the remaining nine subsamples were used as training data (90% of the confirmatory dataset). The training data were further divided randomly into a training set (70% of the training data), a validation set (15% of the training data), and a testing set (15% of the training data) during the training process. The training set was used for computing the gradient and updating the network weights and biases, the validation set was used to monitor the error and to decide when to stop training, and the testing set was used to monitor whether the model was over-fitted (Palacz et al., 2013). We compared performances in predicting the independent validation data from the ten-fold cross-validation and selected the optimal model based on the lowest root mean square error (RMSE). Then we applied the optimal model to the exploratory dataset (Fig. 2) and evaluated model performance by calculating error statistics. In our study, calculations were done in the MathWorks Matlab environment, using the Deep Learning Toolbox.
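The splitting scheme described above can be sketched as follows. This is a minimal Python illustration, not the Matlab code used in the study: the function names `nested_splits` and `select_best_fold`, the random seed, and the user-supplied `fit_predict` callable are all hypothetical, and rounding of the 70/15/15 fractions is an assumption.

```python
import numpy as np

def nested_splits(n_samples, n_folds=10, fractions=(0.70, 0.15, 0.15), seed=0):
    """Yield (train, val, test, holdout) index arrays for each outer fold.

    holdout : the independent validation data (one of ten subsamples)
    train/val/test : random 70/15/15 split of the remaining nine subsamples
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for k in range(n_folds):
        holdout = folds[k]  # never seen during training
        rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        rest = rng.permutation(rest)
        n_train = int(round(fractions[0] * rest.size))
        n_val = int(round(fractions[1] * rest.size))
        train = rest[:n_train]
        val = rest[n_train:n_train + n_val]
        test = rest[n_train + n_val:]  # whatever remains (about 15%)
        yield train, val, test, holdout

def select_best_fold(X, y, fit_predict, **kwargs):
    """Return (fold index, RMSE) for the fold with the lowest hold-out RMSE.

    fit_predict is a hypothetical callable supplied by the user:
    fit_predict(X_train, y_train, X_val, y_val, X_holdout) -> predictions.
    """
    scores = []
    for train, val, test, holdout in nested_splits(len(y), **kwargs):
        pred = fit_predict(X[train], y[train], X[val], y[val], X[holdout])
        scores.append(float(np.sqrt(np.mean((pred - y[holdout]) ** 2))))
    best = int(np.argmin(scores))
    return best, scores[best]
```

With 1854 observations, each hold-out subsample contains 185 or 186 points. In a real workflow the `val` indices would drive early stopping and the `test` indices would be used to check for over-fitting inside `fit_predict`.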
First, we compared the performance of one hidden layer vs. two hidden layers in predicting independent validation data. The number of neurons varied from 2^2 to 2^8 for the first hidden layer and was fixed at four in the second hidden layer for the two hidden layers model; the number of neurons in the first layer was the same in the one hidden layer and two hidden layers models (Fig. 4). The ten-fold cross-validation showed that the model with two hidden layers performed better as the number of neurons [...] S1). As the number of neurons increased, the performances of trainGD and tansig became poor. Although there was no obvious difference between trainLM and trainSCG, the training technique trainSCG was selected and the transfer function logsig was applied to the two hidden layers considering the overall performance (Fig. 5). Third, in the training phase of the ANN model, the number of neurons was tested, varying from 4 to 128 for the two hidden layers (Table S1). The best performance for both training data and independent validation data was obtained with 40 neurons in the first hidden layer and 16 neurons in the second layer. Finally, different combinations of input variables were tested to choose the optimal architecture of the ANN model (Table 2); the best performance was obtained using longitude, latitude, month, T, S, DO, N, P and Si as input variables. The utility of these variables for predicting pH has a strong a priori basis: the carbonate system thermodynamic relationships depend on both T and S (Lueker et al., 2000); a positive correlation is expected between DO and pH (Wootton et al., 2012) because of the role of photosynthesis and respiration in removing or generating CO2 in the water; and various nutrients influence phytoplankton growth and abundance, thereby increasing organic carbon fixation/uptake and increasing pH (Wootton et al., 2008, 2012). We found geographical information to be a powerful addition in improving the skill of the method (Table 2), allowing the network to learn spatio-temporal patterns that could not be explained by other input variables (Sasse et al., 2013).
In order to avoid bias towards high-value inputs/outputs and to eliminate the dimensional influence of the data, all data used by the ANN model were normalized using Eq. (1) (e.g., Sauzède et al., 2015, 2016), with σ the standard deviation of the considered input variable or of the output variable pHT. Similar to the approach of Sauzède et al. (2015, 2016), the longitude and month input variables were transformed to account for their periodicity. The latitude variable was transformed into the range of the sigmoid function by dividing by 90, then normalized using Eq. (1).
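A minimal Python sketch of this kind of preprocessing is given below. The exact form of the normalization and of the periodic transformation is not reproduced in the text, so two things here are assumptions made purely for illustration: the scaled z-score with a factor of 2/3, and the sine/cosine encoding of month and longitude. Only the division of latitude by 90 is taken from the description above. All function names are hypothetical.

```python
import numpy as np

def scale(x, mean=None, std=None, factor=2.0 / 3.0):
    """Scaled z-score: factor * (x - mean) / std.

    ASSUMPTION: the factor 2/3 and the use of the sample mean are
    illustrative; the text only states that sigma is the standard deviation.
    Pass the training-set mean/std when transforming new data.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean() if mean is None else mean
    std = x.std() if std is None else std
    return factor * (x - mean) / std

def encode_periodic(value, period):
    """Map a periodic variable onto the unit circle (sin, cos).

    ASSUMPTION: a sine/cosine pair is one common way to make month 12
    adjacent to month 1, or longitude 359 adjacent to longitude 0.
    """
    angle = 2.0 * np.pi * np.asarray(value, dtype=float) / period
    return np.sin(angle), np.cos(angle)

def scale_latitude(lat_deg):
    """Bring latitude into [-1, 1] by dividing by 90 (as stated in the text)."""
    return np.asarray(lat_deg, dtype=float) / 90.0

# Example usage with made-up values
month_sin, month_cos = encode_periodic([1, 6, 12], period=12)
lon_sin, lon_cos = encode_periodic([122.5, 125.0], period=360)
lat_scaled = scale(scale_latitude([28.0, 30.0, 32.0]))
```

Whatever form is used, the mean and standard deviation should be computed on the training data and reused unchanged when the network is applied to new inputs such as model output.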

ANN model performance
To evaluate the performance of the ANN model, we compared model-simulated pHT [...] (Table 3). The selected ANN model (Table 2, Model #10) showed better performance than the other tested approaches using the same input variables (Table 3).
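The error statistics used in this kind of comparison (mean absolute error, root mean squared error, coefficient of determination, and mean bias) can be sketched in Python as follows. The function name `error_statistics` and the example values are hypothetical; the R² definition is the standard coefficient of determination, which can become negative when predictions are strongly biased.

```python
import numpy as np

def error_statistics(obs, pred):
    """Return MB, MAE, RMSE and R2 for predictions against observations.

    R2 is the coefficient of determination, 1 - SS_res / SS_tot, not the
    squared correlation coefficient, so it is negative whenever the model
    does worse than simply predicting the observed mean.
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    resid = pred - obs
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return {
        "MB": float(resid.mean()),                   # mean bias
        "MAE": float(np.abs(resid).mean()),          # mean absolute error
        "RMSE": float(np.sqrt(np.mean(resid ** 2))), # root mean squared error
        "R2": float(1.0 - ss_res / ss_tot),
    }

# Made-up pHT values: a constant offset of 0.5 gives a strongly negative R2
obs = np.array([8.0, 8.1, 8.2])
stats_biased = error_statistics(obs, obs + 0.5)
stats_perfect = error_statistics(obs, obs)
```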

ANN model validation using the exploratory dataset
To further assess the ability of the ANN model to estimate pHT on the ECS shelf, we applied the ANN model to an exploratory dataset not used in ANN model development and sampled during March, July, and October 2018 [...] for predicting pHT on the ECS shelf. The carbon chemistry parameters in this region are not only under the direct impact of the Taiwan Warm Current and the remote control of the Kuroshio water intrusion into the shelf, but are also significantly controlled by seasonal variations of the Changjiang discharge (e.g., Isobe and Matsuno, 2008; Chen et al., 2008; Chou et al., 2009). Taking into account the highly complex hydrographic, biological and chemical conditions, the accuracy of the pHT presented here is promising.

ANN model sensitivity to environmental input variables
To assess the ANN model sensitivity to different environmental input variables, we added a 5% perturbation to each environmental variable separately. Statistically, with 5% T errors added, the ANN model showed a slight overestimation of pHT, with a mean bias (MB) of 0.0059, RMSE of 0.0079, and R² of 0.9949 (Fig. 9a); with 5% DO errors added, the ANN model also showed a slight overestimation of pHT, with MB of 0.0050, RMSE of 0.0090, and R² of 0.9934 (Fig. 9c); with 5% S errors added, the ANN model showed an underestimation of pHT, with MB of -0.0111, RMSE of 0.0162, and R² of 0.9789 (Fig. 9b). These results suggest that the ANN model responded positively to T and DO errors and negatively to S errors. The positive response to increasing DO reflects the positive correlation between pHT and DO (Cai et al., 2011), which can be attributed to the processes of photosynthesis (generating DO and removing CO2, hence increasing pH) and aerobic respiration (consuming DO and generating CO2, hence lowering pH); the negative response to increasing S reflects the influence of the (lower salinity) Changjiang discharge, which carries large amounts of nutrients that fuel increased primary production (uptake of nutrients and CO2, hence raising the pH) in surface waters during warm seasons (Gong et al., 2011). The ANN model was insensitive to nutrient errors (Fig. 9d-9f) and most sensitive to S errors (Fig. 9b), followed by DO and T errors.

[...] simulated DO was higher than observed at the bottom (Fig. S2c), and simulated nutrients were higher than observed at the surface (Fig. S2d-S2f). Comparisons of monthly average pHT from the FVCOM biogeochemical model with pHT retrieved by the ANN model suggested that the ANN model can potentially provide a more accurate pHT (Fig. S3). A possible reason is that the carbonate system in the Changjiang Biology FVCOM was not optimized because of the difficulty of obtaining sufficient boundary information.
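The 5% perturbation test used in the sensitivity analysis of this section can be sketched in Python as follows. This is an illustration under stated assumptions: `predict` is a hypothetical stand-in for the trained network, the toy linear coefficients are invented, and the statistics are computed here against the unperturbed predictions, which may differ from the reference used in the study.

```python
import numpy as np

def perturbation_response(predict, X, column, fraction=0.05):
    """Perturb one input column by a relative error and summarize the response.

    Returns the mean bias (MB) and RMSE of the perturbed predictions relative
    to the unperturbed ones. A positive MB means the model responds positively
    to an overestimate of that input.
    """
    X = np.asarray(X, dtype=float)
    baseline = predict(X)
    X_pert = X.copy()
    X_pert[:, column] *= 1.0 + fraction  # e.g. +5% error on this variable only
    diff = predict(X_pert) - baseline
    return {"MB": float(diff.mean()), "RMSE": float(np.sqrt(np.mean(diff ** 2)))}

def toy_predict(X):
    """HYPOTHETICAL stand-in for the ANN: pHT rises with T, falls with S."""
    return 8.0 + 0.01 * X[:, 0] - 0.02 * X[:, 1]

# Columns: T (deg C), S; values are made up for illustration
X_demo = np.array([[20.0, 30.0], [20.0, 30.0]])
resp_T = perturbation_response(toy_predict, X_demo, column=0)
resp_S = perturbation_response(toy_predict, X_demo, column=1)
```

Perturbing one column at a time, as here, isolates the response to each variable; it does not capture interactions between simultaneous errors in several inputs.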
Considering the discreteness and discontinuity of the sampling sites, we compared pHT retrieved by the ANN model using the Changjiang Biology FVCOM output with the corresponding observations at some sites with repeated sampling for 3 to 4 years. [...] (Fig. 10). There are relatively large deviations (greater than the RMSE of 0.04) in August 2013 at stations A1-5 and A6-9, and in July 2016 at station A8-5. To illustrate the application performance in the water column, a scatterplot of retrieved pHT vs. observations at six sites with repeated sampling for 3 to 4 years (Fig. 11) showed that the ANN model predicted pHT with an RMSE of 0.05 and R² of 0.71.
We further compared monthly pHT retrieved by the ANN model using the Changjiang Biology FVCOM output with in situ measured pHT values (Fig. 12). The agreement is good (within the ANN model accuracy: ANN±RMSE) in winter, but large deviations (greater than the RMSE of 0.04) appear in summer. The reduced performance in summer can be attributed in large part to a reduced performance of the Changjiang Biology FVCOM in predicting the summertime input variables S, DO, and nutrients (Fig. S2).

Spatial and temporal patterns of ANN-derived pHT
The temporal and spatial variations of monthly surface pHT from 2000 to 2016 based on Changjiang Biology FVCOM output are shown in Figure 13. During the dry season (November to March of the next year), pHT values vary from ~7.62 to ~8.24. Relatively higher pHT values are found in the southeastern part of the study area (Chou et al., 2011), whereas lower pHT values are found in the northeastern part. During the wet season (April to October), pHT values vary from ~7.77 to ~8.35, and water of higher pHT corresponds well to the seasonal dispersion of the Changjiang Dilute Water (Chou et al., 2009, 2013). Water of higher pHT is found in the center of the study area during April, spreads to the southwestern part of the study area (along the coast of China) during May and June, and shifts to the northeastern part of the study area during August. In September and October, water of higher pHT is found in the southeastern part of the study area, strongly influenced by the Taiwan Warm Current (Qu et al., 2015).
Surface pHT shows a clear seasonality: it gradually increases during spring (March to May), after which it gradually decreases during summer and fall (June to November) (Fig. 14). The surface pHT reaches its maximum in May and its minimum in December, and it varies seasonally by up to ~0.3 unit. Larger changes in pH have been reported on the Washington shelf, where the pH varied by ~1.0 unit over the seasons and by ~1.5 units over 8 years (Wootton et al., 2008). The seasonal dynamics of surface pHT can be attributed mainly to temperature changes and strong biological activity (production and respiration processes) over the seasons. From March to June, a rapid increase in surface pHT indicates that production increases faster than respiration, which is reflected in the drop in surface phosphate (Fig. S5d) and apparent oxygen utilization (AOU) (Fig. S5c). This may be driven by the Changjiang discharge (Fig. S4), which carries large amounts of nutrients, resulting in stronger primary production in warm seasons under the combined action of nutrients and suitable temperature (Gong et al., 2011). From July to October, although surface temperature remains at a high level (Fig. S5a), the rise in surface AOU (Fig. S5c) suggests a decrease in primary production or an increase in respiration, which leads to a gradual drop in surface pHT (Wootton et al., 2012). This implies that respiration processes dominate relative to primary production during summer and fall.
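A monthly climatology and its seasonal amplitude, of the kind discussed above, can be derived from a regional-mean monthly series as in the Python sketch below. The function names are hypothetical and the 17-year series is synthetic (an idealized cosine cycle peaking in May), not the retrieved pHT.

```python
import numpy as np

def monthly_climatology(series, first_month=1):
    """Average a continuous monthly series into 12 calendar-month means.

    series      : 1-D array of monthly values (e.g. regional-mean surface pHT)
    first_month : calendar month (1-12) of the first element
    """
    series = np.asarray(series, dtype=float)
    months = (np.arange(series.size) + first_month - 1) % 12 + 1
    return np.array([series[months == m].mean() for m in range(1, 13)])

def seasonal_summary(clim):
    """Return month of maximum, month of minimum, and seasonal amplitude."""
    return {
        "max_month": int(np.argmax(clim)) + 1,
        "min_month": int(np.argmin(clim)) + 1,
        "amplitude": float(clim.max() - clim.min()),
    }

# SYNTHETIC example: 17 years (2000-2016) of an idealized seasonal cycle
m = np.tile(np.arange(1, 13), 17)
synthetic = 8.1 + 0.15 * np.cos(2.0 * np.pi * (m - 5) / 12.0)
clim_demo = monthly_climatology(synthetic)
summary_demo = seasonal_summary(clim_demo)
```

For this synthetic series the climatological maximum falls in May and the peak-to-trough amplitude is 0.3, by construction.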

Summary and conclusions
We have developed an artificial neural network (ANN) model, demonstrated its reliability, and used it to retrieve monthly pHT for the period 2000-2016 on the East China Sea shelf. We trained this ANN model using 11 cruise datasets from 2013 to 2017.
In order to choose the optimal architecture of the ANN model, we tested different training and transfer functions, the number of neurons in the two hidden layers, and different combinations of input variables. We also validated the reliability of the ANN model, obtaining a root mean square error of 0.04 using three cruises in 2018 as an exploratory dataset. The ANN model responded positively to temperature and dissolved oxygen errors and negatively to salinity errors, and it was most sensitive to salinity errors, followed by dissolved oxygen and temperature errors. We also retrieved monthly-average pHT using the ANN model in combination with input variables from the Changjiang Biology Finite-Volume Coastal Ocean Model (FVCOM).
The approach has several potential applications. First, it can provide estimates of seawater pHT with known accuracies for the East China Sea shelf and the period 2013-2018. Within this region the model could be used as a cost-effective way to handle restrictions of marine observations conducted from ships, such as coarse resolution and under-sampling of carbonate system variables. Second, while the ANN model is not a replacement for direct measurements of the carbonate system, it may be a valuable tool for understanding the seasonal variation of pHT in poorly observed regions. Third, this approach can be applied to other regions to predict pH by suitably adapting the input variables and network structure using local datasets. The MATLAB code used in this study to develop and apply the ANN model is freely available, and is accompanied by a README file providing detailed guidance on how to use and adapt the code.

The three statistics used are the mean absolute error (MAE), the root mean squared error (RMSE), and the coefficient of determination (R²). N represents the number of data points.

Figure caption fragment (monthly average surface pHT by month): the green circles represent the monthly regional average, and the blue dashed line represents the mean value of each month.

Table caption fragment: [...] (Decision tree, Random Forest, and SVM). The statistics were derived from the confirmatory dataset (training data and independent validation data) using the input variables T, S, DO, N, P, and Si. Note that R² in our study is calculated as the coefficient of determination; negative R² values can therefore occur when there is strong bias.
