Doppio – A ROMS-based Circulation Model for the Mid-Atlantic Bight and Gulf of Maine: Configuration and comparison to integrated coastal observing network observations

We describe “Doppio”, a ROMS-based model of the Mid-Atlantic Bight and Gulf of Maine regions of the northwest North Atlantic developed in anticipation of future applications to biogeochemical cycling, ecosystems, estuarine downscaling, and near-real-time forecasting. This free-running regional model is introduced with circulation simulations covering 2007-2017. 10 The ROMS configuration choices for the model are detailed, and the forcing and boundary data choices described and explained. A comprehensive observational data set is compiled for skill assessment from satellites and in situ observations from Regional Associations of the U.S. Integrated Ocean Observing Systems, including moorings, autonomous gliders, profiling floats, surface current measuring coastal radar, and fishing fleet sensors. Doppio’s performance is evaluated with respect to these observations by representation of sub-regional temperature and salinity error statistics, as well as velocity and sea level coherence spectra. 15 Model circulation for the Mid-Atlantic Bight and Gulf of Maine is visualized alongside the mean dynamic topography to convey the model’s capabilities.


Introduction
Coastal ocean circulation models that downscale global ocean simulations are useful tools for exploring regional ocean dynamics and associated links to biogeochemistry, ecosystems, geomorphology, and other applications, for example, by inferring transport pathways for nutrients, larvae, sediments, or pollutants. The reduced geographic scope of a regional model offers economies in computational effort that allow much greater experimentation than would be possible with global models alone, such as by examining sensitivity to resolution or parameterization of added physics, and they present the opportunity to affordably explore numerous application scenarios. Here we describe the development, evaluation, and application of a regional model of the northeastern continental shelf of North America from Cape Hatteras, North Carolina, northward to near Halifax on the Scotian Shelf of Canada. The model, intended principally for studies of ocean physical circulation but conceived for future applications to biogeochemical cycles and ecosystems, uses the 3-D hydrostatic shelf circulation model ROMS (Regional Ocean Modeling System; http://www.myroms.org, last access: 20 December 2019; Shchepetkin and McWilliams, 2005) as the underlying hydrodynamic model core.
The model configuration builds significantly on two earlier regional modeling programs. A ROMS Northeast North American (NENA) shelf coupled circulation and biogeochemical model encompassing the entire coastal ocean extent from Florida to the Grand Banks of Newfoundland (Hofmann et al., 2008) was used for numerous studies of nutrient and carbon fluxes in this region (Fennel and Wilkin, 2009;Fennel et al., 2006Fennel et al., , 2008. The NENA biogeochemical model performed well within the Mid-Atlantic Bight but less so for the Gulf of Maine (Hofmann et al., 2008), and this lackluster performance in the Gulf of Maine was ascribed to shortcomings of the modeled physical circulation (Cahill et al., 2016). Accordingly, an emphasis in configuring the model described here was to create an improved Published by Copernicus Publications on behalf of the European Geosciences Union. 3710 A. G. )-based circulation model  Figs. 4,5,8,9,and 13. Gulf of Maine circulation so that subsequent biogeochemical simulations will have a more skillful physical foundation. A second prior ROMS-based modeling effort, termed ESPreSSO (Experimental System for Predicting Shelf and Slope Optics), had a more limited geographic scope covering only the Mid-Atlantic Bight (Zavala-Garay et al., 2014). This model has been widely used for studies ranging from hurricane-induced cooling via mixing (Seroka et al., 2017) to shelf-wide ecosystems (Xu et al., 2013) and dissolved organic carbon fluxes (Mannino et al., 2016). An operational forecast version of ESPreSSO that used 4-D variational assimilation Zavala-Garay et al., 2014) performed the best of seven real-time models covering the region (Wilkin and Hunter, 2013).
The present modeling effort, which we have dubbed Doppio, focuses on maintaining the skill shown by ESPreSSO while expanding the domain to include the Gulf of Maine. To assess the Doppio skill, the observing network used for ESPreSSO was expanded, adding new satellite altimeters and SST (sea surface temperature) sensors and the Gulf of Maine in situ observations.
The moniker Doppio reflects that the Doppio domain is approximately twice the extent of ESPreSSO. The model domain is indicated in Fig. 1, colored by bathymetry and with positions marked for several time series observation locations used for either forcing or analyses that will be discussed later in the paper.
The Doppio domain encompasses two very different dynamical regimes in the Mid-Atlantic Bight (MAB) and the Gulf of Maine. The MAB (Cape Hatteras, North Carolina, to Cape Cod, Massachusetts; Fig. 1) is characterized by a broad ( ∼ 100 km wide) shelf with a permanent front at the shelf break that separates relatively cool and fresh shelf waters from the warmer and more salty Slope Sea (Mountain, 2003). Instabilities in the shelf break front have wavelengths of typically 40 km that can evolve significantly in time over just a few days (Fratantoni and Pickart, 2003;Gawarkiewicz et al., 2004;Linder and Gawarkiewicz, 1998). The alongshelf currents generally reach the seafloor, exhibiting significant flow-bathymetry interactions and establishing acrossshelf transport in the bottom Ekman layer.
Eddy shelf interactions tied to Gulf Stream-induced warm core rings (Zhang and Gawarkiewicz, 2015) lead to crossshelf exchange with surface and subsurface structure at scales of 10-30 km and days to weeks. Subsequent acrossshelf fluxes of heat, freshwater, nutrients, and carbon control water mass characteristics and impact ecosystem processes throughout the MAB.
The Gulf of Maine (GOM) is a semi-enclosed marginal sea distinctive in the world for having the largest tidal amplitude, over 16 m, due to its shape and length that lead to near resonance of the lunar semi-diurnal M2 constituent of the tide (Garrett, 1972). There are two main oceanic inflows to the Gulf of Maine: Scotian Shelf water flowing southwestward along the coastline from Halifax and originating from the Labrador Current; and Slope Sea water entering through the Northeast Channel that derives from subpolar North Atlantic waters mixed with eddies of the Gulf Stream. Additional inflows are river runoff from many sources along the coasts of New England, New Brunswick, and Nova Scotia and the net difference in precipitation and evaporation within the Gulf of Maine (Brown and Beardsley, 1978). The two main outflows are water exiting through the Great South Channel between Cape Cod and Georges Bank toward Nantucket, and around Georges Bank (Brown and Beardsley, 1978). This exchange flow through the Northeast Channel can be influenced by Gulf Stream eddies, episodically delivering warm, saline waters (Bisagni and Smith, 1998) in such quantity as to change the physical circulation of the Gulf of Maine (Brooks, 1987). The circulation is predominantly counterclockwise about the Gulf of Maine, from Nova Scotia into or across the Bay of Fundy, then in a strong coastal current southward along the New England coast. While some water exits via the Great South Channel, the majority of flow proceeds eastward along the northern flank of Georges Bank, finally wrapping around the underwater plateau and continuing towards the MAB (Bigelow, 1927;Wiebe et al., 2002). Within the gulf, strongly irregular bathymetry exerts significant control on the low-frequency flow variability above three deep basins, which can be challenging to model as previous studies have shown (Hofmann et al., 2008). The gulf's confluence of glacially carved bathymetry and strong tidal forcing lends itself to equally dynamic currents, namely a significant along-bank current jet that may be the prime driver of transport through the region (Loder et al., 1992).
Physical circulation processes influence the biogeochemistry of the Gulf of Maine via a number of mechanisms. Wintertime circulation is especially dynamic, influenced by winds on short timescales and partial mixing of three separate water masses (Vermersch et al., 1979). Mixed-layer depth influences the onset of primary productivity via springtime mixing, with shallower regions of approximately 60 m or less conducive to more substantial and sustained productivity (Yentsch and Garfield, 1981). Shallow waters over Georges Bank remain tidally mixed year-round, which contributes to maintaining a clockwise residual gyre (Chen et al., 2003) that has implications for larval dispersal. Recent warming has resulted in increased rainwater entering the gulf, freshening the surface and stratifying the water column, inhibiting vertical nutrient flow (Salisbury et al., 2009). Within the gulf, strong summertime recirculation causes retention of both primary producers and nutrients for no less than 40 d (Naimie et al., 2001). Improving our capability to model the physical circulation of this region and to determine what may be controlling carbon air-sea exchange and reservoirs at a regional level is important to developing a full comprehension of the carbon cycle at the global scale.
2 Model configuration for the MAB and GOM Our regional model is created using ROMS, a 3-D hydrodynamic model that solves the hydrostatic, Boussinesq, primitive equations in a structured horizontal grid with terrain-following vertical coordinates. The ROMS computational design itself and many of the model's companion features such as integrated sediment transport and ecosystem and biogeochemical models are described in detail elsewhere McWilliams, 2005, 2009). ROMS is used extensively for coastal-and continental-shelf applications.
The Doppio model, building on the ESPreSSO heritage, uses many of the same model settings and parameter values. The model resolution is a uniform 7 km horizontal grid (242 × 106 cells) and 40 vertical levels. This resolution is a compromise, as a finer horizontal resolution would help capture submesoscale dynamics but would dramatically increase the computational costs. Given the multitude of model runs to be undertaken during model configuration and then application, it is practical to employ the modest 7 km uniform resolution, which is comparable to the first baroclinic Rossby radius in shelf waters. In comparison, ESPreSSO had a 5 km grid resolution and 36 vertical levels, and NENA had a 10 km grid resolution and 30 vertical levels but also covered the entire Gulf of Maine, MAB, and South Atlantic Bight. The vertical stretching is such that the resolution is enhanced toward surface and bottom boundary layers in the coastal ocean (in-side the 100 m isobath), and there is better than 3 m resolution at the seafloor and 1.5 m resolution at the sea surface. Vertical mixing is parameterized using the generic lengthscale (GLS) (Umlauf and Burchard, 2003) implementation of the k-kl turbulence closure (Mellor and Yamada, 1982). A detailed listing of other configuration options and parameter choices is presented in Table 1.
The Doppio model configuration has been applied to simulations of the decade 2007-2017. Over this period we have reliable and consistent meteorological forcing and openboundary condition data and a dense set of observations with which to assess the model skill. The locations of river point sources used for forcing, along with tide gauges and moorings used for the skill assessment, are noted in Fig. 1. The model has also been implemented, essentially unchanged, as an experimental operational ocean forecasting system with variational data assimilation. The forecast system is not a focus of the present study, but several of the choices of model input data streams were motivated by the intent to allow nearreal-time operation. To complete the description of the model configuration we detail next the external driving data sets that determine the air-sea fluxes, river inflow, and open-boundary forcing.

Atmospheric forcing
Atmospheric forcing data are drawn from National Centers for Environmental Prediction (NCEP) products, namely the North American Regional Reanalysis (NARR) (Mesinger et al., 2006) and North American Mesoscale (NAM) (Janjic et al., 2005) forecast model. The atmospheric analysis variables used are net shortwave and downward longwave radiation; precipitation; and marine boundary layer air pressure, temperature, relative humidity, and wind velocity. With these and model sea surface temperature, the air-sea fluxes for momentum and heat are calculated according to the so-called TOGA-COARE (Tropical Ocean Global Atmospheres Coupled Ocean Atmosphere Response Experiment) bulk fluxes parameterization (Fairall et al., 2003). The air pressure also directly drives sea level variability via the inverted barometer effect (ATM_PRESS in Table 1).
An essential atmospheric forcing term is net shortwave radiation flux (downwelling shortwave radiation minus the fraction reflected due to ocean surface albedo), which is important not only for its influence on model physics but also as a driver of primary productivity when circulation is coupled to models of ocean biogeochemistry and ecosystems. It has been noted in previous studies that NARR shortwave radiation tends to be an overestimation in comparison to observed values (Kennedy et al., 2011), and a study within our region of interest, namely Delaware Bay (Wang et al., 2012), applied a reduction of NARR shortwave by 20 % to correct for this (though the analysis actually showed the overestimation to be typically 23 %). To examine whether a 23 % correction is warranted beyond 3712 A. G. )-based circulation model  Delaware Bay, we compared net NARR shortwave data to weather satellite radiance observations from the ISCCP (International Satellite Cloud Climatology Project) (Schiffer and Rossow, 1983) at one point, the ground station MVCO (Martha's Vineyard Coastal Observatory), and observe in Fig. 2a an overestimation by NARR of 17 %. The ISCCP spatial resolution is low compared to local land-based radiometer data in Wang et al. (2012), so to further test the Wang et al. (2012) analysis, we compared to a higher-resolution satellite product in the form of downwelling photosynthetically available radiation (PAR) from MODIS (Moderate Resolution Imaging Spectroradiometer; http://coastwatch.pfeg. noaa.gov/erddap/griddap/erdMEpar01day.html; Van Laake and Sanchez-Azofeifa, 2005). In Fig. 2b we have applied the 23 % adjustment to NARR net shortwave data, converted net to downward shortwave data assuming an albedo of 6 % (Payne, 1972), and applied a factor of 0.45 (Kirk, 2010) for the fraction of shortwave that is PAR for comparison to MODIS. We see that the mean ratio of the two is approximately 1 and are reassured that the NARR reduction of 23 % is justified.
The shortwave radiation values from NARR and NAM are instantaneous diagnostic quantities calculated from the modeled water vapor and other atmospheric constituents and are provided at 3 h intervals. This interval poorly resolves the diurnal cycle of air-sea heat flux and, potentially, photosynthesis.
The time of solar noon varies across the longitude extent of the domain and shifts with respect to the reporting hour during the seasonal cycle. It can be shown that the clear-sky maximum radiation reported with a 3 h sampling interval is typically underestimated by 5 % but can be underestimated by as much as 20 % when solar noon falls between the 3 h samples. Therefore, to better capture the full range of the solar cycle, daily averages of the NARR or NAM data are provided to the model, and at runtime an idealized diurnal shortwave radiation cycle is imposed, appropriate to the longitude, latitude, and year day that has the same daily average (DIURNAL_SRFLUX in Table 1). This option ensures the correct length of day and better noontime peak solar radiation.

River sources
The Gulf of Maine and Mid-Atlantic Bight are home to many rivers with moderately high discharge that varies on quite short timescales, from the Saint John River entering the Bay of Fundy in Canada, all the way down to the Susquehanna River entering the Chesapeake Bay, Maryland. A significant fraction of this terrestrial runoff makes its way to the ocean without joining a river that is actively monitored by the United States Geological Survey (USGS) or the Water Survey of Canada (WSC). To account for flow that reaches the GOM and MAB via ungauged portions of the watershed, we turned to an analysis of daily river discharge based on a high-spatial-resolution watershed model that aggregated surface water flow into a total of 403 rivers along the northeastern seaboard of North America for the 11-year period (2000-2010) (Stewart et al., 2013). This represents a nearcomplete accounting of the terrestrial surface water discharge from land to ocean. These 403 modeled sources were consolidated within large watersheds into 27 principal river sources; 24 in the United States and 3 in Canada (Fig. 3). The locations along the coast of the discharge-weighted consolidations were, in most cases, associated to one large familiarly named river source. The consolidated data set therefore comprises a decade-long record of daily total watershed discharge for a modest number of river sources suited to driving the regional ocean model. However, the retrospective analysis time period 2000-2010 leaves us without river forcing data for subsequent years. The locations of point sources contributing to each of the 27 consolidations were again compared with watershed maps to find the USGS and WSC river gauges nearest to the mouth of the chosen rivers that are known to reliably report daily data in near real time.
A second data set comprising these 27 river gauges for the 2000-2010 interval was compiled, and a maximum covariance analysis (MCA) of the two data sets was used to formulate a predictor of full watershed discharge at the 27 principal river sources based on the USGS and WSC gauge data as inputs. This statistical extrapolation compensates for the ungauged watershed, and we use this in place of the timespan-limited Stewart et al. (2013) data set or the gauge data alone, both of which are incomplete for our purposes. Furthermore, the method is suited to use in near real time by operationally acquiring the gauge data and applying the statistical expansion to the full watershed discharge.
An obstacle to this approach, however, is gaps in the WSC data for both the real-time and historical sections. Additionally, there are no real-time river gauges in Nova Scotia, Canada, that report discharge; the only parameter reported is water level. Therefore, a rating curve approach was taken by computing a relation between water level and discharge for the Mersey River, one of the two Nova Scotian stations in question, using historical data. Discharge data for Mersey River are available through 2012, while the station began recording water level in early 2011 and continues to do so operationally. Figure 4a shows water level and discharge for the period 2011-2012 when simultaneous data exist and the quadratic relation inferred to project discharge on the basis of water level data.
In Fig. 4b the projection (red) is made for the entirety of the water level record, showing strong correspondence between measured and inferred discharge in the overlap period, giving credence to the relation as a useful predictor of discharge from the real-time water level. There was no comparable training data set for the Sackville River in Nova Scotia, but an analysis of correlations between Sackville River discharge and all neighboring rivers showed the Mersey (128 km away) as a likely predictor. Though the correlation 3714 A. G. )-based circulation model Figure 2. (a) Net NARR shortwave daily-averaged data uncorrected for overestimation compared to ISCCP net shortwave (SW) data; mean ratio indicates 17 % overestimation. (b) NARR shortwave data with a 23 % reduction for overestimation (Wang et al., 2012) and a 6 % reduction for albedo (Payne, 1972) and assuming a fraction of PAR (parfrac) of 0.45 (Kirk, 2010) compared to MODIS satellite PAR. in discharge is not particularly strong (Fig. 4c), using the Mersey River to project discharge for the Sackville during times of coincident data availability (Fig. 4d) captures the timing and magnitude of peak discharge events well, giving a useful real-time discharge predictor based on river water level data. A 10-month gap in USGS data for the Carmans River, New York, was filled following a comparable process utilizing a discharge and water level relation for the nearby Quashnet River. Other minor data gaps of the order of a few days or so were simply filled by linear interpolation.
Water temperature is reported at only few of the discharge stations and is often incomplete at best. Therefore, in lieu of direct observations, the river temperatures provided to the ROMS model were interpolated from NARR atmospheric forcing data but capped at 0 • C minimum. The mean temperatures are indicated in Fig. 3. Where gauge data were available for comparison, the NARR temperatures were on average a few degrees warmer than the gauge temperatures. This is inconsequential in the ocean response because at discharge locations the water temperature is quickly modified by mixing and air-sea heat fluxes.
With these processing steps complete, we have discharge data for 27 principal rivers along the eastern seaboard, stretching from Nova Scotia down to the Carolinas for 2007 through 2019 to drive the hindcast regional ocean circulation experiments described below. Moreover, the methodology has been adapted to run the system in near real-time for operational ocean forecasting.

Open boundaries
Open-boundary information at the model domain perimeter is based on daily mean data taken from the Mercator Ocean system (Drévillon et al., 2008) provided by Copernicus Marine Environment Monitoring Service (CMEMS). To these data we apply a bias correction to the annual mean, retaining mesoscale variability. These corrections are substitutions of temperature and salinity with the annual mean from our regional climatological analysis MOCHA (Mid-Atlantic Ocean Climatology and Hydrographic Analysis) (Fleming, 2016), and mean dynamic topography and velocity are from data-assimilative climatological analysis with the Doppio grid. While others have shown that sourcing open-boundary data from global products, even with a bias correction, may not always yield better results than sourcing from regionalscale products (Brennan et al., 2016), our model configuration and domain best benefit from our chosen pairing of a global product with regional climatology correction. The Oregon State University Tidal Prediction Software (OTPS) In order for the model sea level to be of use for further downscaling applications, such as driving sea level boundary conditions to higher resolution estuarine models, or studies of coastal inundation, we needed to adjust our mean dynamic topography during the bias correction to a useful local reference datum; here, the NAVD88 (North American Vertical Datum of 1988). Coastal sea level from a preliminary simulation was compared to 14 NOAA tide gauges in the MAB and GOM that report sea level with respect to NAVD88. These showed an almost uniform bias offset of 0.1959 m, so this adjustment was made to the MDT (mean dynamic topography) derived from the climatological analysis. This has no dynamical impact but effectively reconciles the global and regional datums so that sea level output from Doppio is a best estimate of the total water level at the coast with respect to the regionally accepted datum.

Skill assessment
The model performance was assessed by comparison to a comprehensive aggregation of all available observational data for 2007-2017 from in situ and remote sensing platforms. The aggregation includes sea surface temperature (SST) from several infrared-and microwave-sensing satel-lites, sea surface height (SSH) from satellite altimeters and in situ tide gauges, velocity from surface-current-measuring HF (high-frequency) radar, and in situ temperature and salinity from numerous operational and research platforms. A full list of data types and sources is presented in Table 2, and the total number of observations per month for each source of observational data is shown in Fig. 5. A ROMS option (VERIFICATION in Table 1) activates extracting model values at points in time and space by bilinear interpolation to match the coordinates of data in observation files formatted for ROMS 4-D variational (4D-Var) data assimilation. At run time the VERIFICATION option creates a separate output file populated with interpolated model values corresponding to all observations, enabling various statistical comparisons. These observation files are the end result after quality control screening of the raw data streams and the creation of binned averages of sources where the resolution exceeds that of the model (notably SST and dense in situ trajectory-profile data from gliders) and decimation of CODAR (coastal ocean dynamics applications radar) velocity to give independent observations, consistent with standard practice in the ROMS 4D-Var framework.
For the statistical skill assessment analysis, five subregions of the Doppio domain representing distinct dynamical regimes were considered -anticipating that model performance may exhibit varying skill in different situations. These are the Scotian Shelf, the Gulf of Maine, Georges 3716 A. G. )-based circulation model Bank, the Mid-Atlantic Bight, and the shelf break to 3500 m (Fig. 6). Aggregated model-observation difference statistics within each region, independent of any further spatial or temporal distinctions, are shown for temperature (both SST and subsurface), salinity, and vector velocity model skill in Fig. 6. The results are evaluated in the form of Taylor diagrams (Taylor, 2001), a robust method of visualizing multiple statistical parameters within a single plot. Figure 6 includes an explanatory schematic Taylor diagram. Normalizations are with respect to the observation standard deviation. The normalized centered root mean square (CRMS) error is indicated by dashed arcs that show proximity to the unfilled marker on the x axis at (1, 0). The normalized standard deviation is shown as distance from the origin (0, 0) indicated by dot-ted lines, with the unit arc indicating the model and observation standard deviation match. Along the outer curved edge is shown the correlation coefficient. Lastly, the normalized mean bias is shown as the stick originating from each marker, where the distance from the tip to the aforementioned unfilled marker along the x axis is the normalized uncentered RMS (root mean square) error. During our model configuration, design, and testing, various options were evaluated using the VERIFICATION framework to systematically determine if they led to quantitative improvement in model skill. These experiences are instructive for the design of ROMS-based regional ocean models in general, so we briefly outline results for these tests below to complete the description of the Doppio configuration.

Surface stress from wind relative to surface current
A feature of the Doppio configuration that differs from widespread ROMS practice is a modification to the bulk formula to use wind velocity relative to the surface current in the calculation of surface stress (Bye and Wolff, 1999). Typically, ROMS models do not make this correction though it has been an option for some time. In a Taylor diagram for sea surface current skill (Fig. 7a) we see a modest but consistent improvement in model skill when incorporating the wind-current difference in the stress calculation that warrants its incorporation in the standard Doppio configuration (WIND_MINUS_CURRENT in Table 1). Scotian Shelf markers are absent from Fig. 7a because most of the velocity observations are from CODAR, which are predominantly for the MAB, with some coverage extending slightly into the shelf break to 3500 m and to Georges Bank. Few velocity observations from NERACOOS (Northeastern Regional Association of Coastal Ocean Observing Systems) moorings in the Gulf of Maine are close to the surface, and they are not instructive in evaluating the bulk formula parameterization of stress.

Precipitation forcing
The NARR precipitation analysis over the ocean assimilates satellite-derived rainfall data from the Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP) (Xie et al., 2007) south of 20 • N but gradually transitions to no assimilation at all north of 50 • N. This raises some uncertainty as to the validity of the precipitation forcing for Doppio, so we have evaluated substituting NARR precipitation values entirely with data from NASA's TRMM (Tropical Rainfall Measuring Mission) Multi-satellite Precipitation Analysis (TMPA) (Huffman et al., 2007). In Fig. 7b, we can see that Doppio with TMPA precipitation forcing (square marker) results in very comparable model skill to NARR (circle marker) for salinity for most of the domain, although it is marginally worse in the MAB and Gulf of Maine. Therefore, we opted to keep the NARR forcing over TMPA.

Open-boundary bias
The Doppio open-boundary conditions are taken from the Mercator Ocean product, with an annual mean bias correction applied to bring it into agreement with our MOCHA regional climatology. To illustrate the improvement this makes, Fig. 7c contrasts the model skill for SST, in situ temperature, and salinity when running with the uncorrected Mercator Ocean (filled symbols) and the bias-corrected version (unfilled symbols). The decrease in the bias vector is to be expected. But it is also evident that bias correction carries with it modest improvement in correlation across the domain.
While we use Mercator Ocean products for boundary data, a popular choice by other users for regional models is the HYCOM (HYbrid Coordinate Ocean Model) global model analysis product by the Global Ocean Data Assimilation Experiment (GODAE) (Chassignet et al., 2009;Metzger et al., 2014). In Fig. 7d we compare model skill for SST and in situ temperature and salinity when using the HYCOM openboundary data without bias correction (filled symbols) to that of the bias-corrected Mercator files in Doppio (unfilled symbols). The model skill using HYCOM is inferior to the case using Mercator open-boundary data.

Velocity and sea level coherences
NERACOOS mooring data are valuable for skill assessment in the comparison of model and observed velocity time series in the form of frequency domain coherences (Fig. 8) for the three long-term velocity time series at sites B, M, and N. The spectra are computed by standard periodogram smoothing (Moore and Wilkin, 1998) with red lines showing 90 % confidence. The model has intrinsic skill in coherence, i.e., statistically significantly greater than zero, at all timescales in the coastal current (site B). At the Northeast Channel entrance to the GOM (site N) the model captures high-frequency and seasonal timescales but falters in the mesoscale. This sug-3718 A. G. )-based circulation model   Where the lower-bound confidence limit is below 0, coherence is not plotted. Doppio has intrinsic skill in coherence at high-frequency and seasonal timescales but falters in the mesoscale. The coastal current variability is captured well at all timescales. These mooring locations are also noted relative to the whole domain in Fig. 1. gests model performance may well improve with the assimilation of mesoscale-resolving observations of sea level from satellite altimetry. In the central GOM (site M) coherence is only significant on timescales shorter than 20 d, presumably in response to well-modeled local forcing. At this site, also, the assimilation of mesoscale-resolving data could improve simulation of intermediate timescales that impact stirring and mixing in the GOM.
In Sect. 2.3, data from 14 NOAA tide gauges were introduced in referencing mean model sea level to the regional NAVD88 datum. In Fig. 9 we present frequency domain coherence for six representative sites (see inset map) distributed across the Mid-Atlantic Bight and the Gulf of Maine. Sea level variability is statistically significantly coherent across all resolved scales throughout the region.

Results and discussion
The seasonal cycles of sea surface (red) and bottom (blue) temperatures from the model are shown in Fig. 10, with interannual variability depicted in the shaded envelope and the 11-year mean indicated by the thicker lines. In the winter months, the temperature in all shelf regions at the sea surface drops below the temperature at the seafloor, but water column stability is maintained by high salinity at depth. The increase in seasonal bottom temperature lags behind sea sur-face temperature, with typically 2 to 3 months passing after peak summer temperatures before the bottom cooling that marks the breakdown of stratification and deeper mixing of the thermocline; this is most evident in the Gulf of Maine and Mid-Atlantic Bight. The lack of variability in the bottom temperature for the shelf break to the 3500 m region is expected given the order-of-magnitude difference in depth compared to the other regions along the shelf.
Model skill in capturing the seasonal cycle of vertical stratification is presented from a different perspective in Fig. 11, showing ensemble mean vertical profiles (upper 250 m only) of temperature and salinity for 4 representative months in the various subregions. The variability in model (solid line) and the observations (dashed line) show similar behavior. The comparison is made by interpolating the model to available data coordinates as in Sect. 3 and binned at 10 m vertical intervals above 100 m depth and 50 m intervals below that. The vertical extent of the comparison varies through the year depending on data availability. The model shelf waters have a tendency to be slightly cooler than the observations below 100 m, while also being warmer at the surface during the summer months and cooler at the surface during the winter months. The cooler model temperatures during September in Fig. 11 could be due to the bias correction applied along the boundaries; as the correction uses a harmonic analysis, the fall overturning circulation could result in too much variabil- ity resulting in those cooler model temperatures. The seasonal cycle of salinity stratification is modeled well in shelf waters, with a tendency to slightly high salinity in the range 25 to 100 m during the summer months, most notably on the Scotian Shelf. The model-observation difference is generally less near both the seafloor and sea surface. A characteristic pattern throughout the region is the elevated salinity with depth that maintains water column stability in the face of the weak seasonal thermal inversion.
Ocean temperature at the seafloor is a strong driver of shellfish and fishery ecology throughout the MAB and GOM (Drinkwater et al., 2003;Murawski, 1993;Sullivan et al., 2005), so to evaluate the model's ability in this regard, a unique set of observations were used for the comparisons in Fig. 12, these being fishing-trawler-collected bottom temperatures acquired through a project coordinated by the North-east Fisheries Science Center (NEFSC) (NMFS, 2015). The left column of Fig. 12 shows 2-D histograms of bottom temperature observed by sensors mounted on fishing vessel trawl doors versus Doppio. The right column shows the corresponding geographic spread of the observations colored by the difference in model minus observed bottom temperature. The rows of Fig. 12 group the comparisons by the same 4 months used in Fig. 11. For the purposes of this analysis, we consider observations reported to be less than 1/10 of the water depth above the model seafloor to be "bottom" observations. The histograms show overall good correspondence between the model and the fishing fleet observations with any bias being generally small. Model skill is consistently good throughout the GOM and on Georges Bank. In early spring (March) there is a tendency toward a model cool bias along the shelf break, but with a dense cluster of ob- servations near 71 • W from the New Bedford fleet showing an opposing warm bias. There is an opposite sense to the model bias in this area in winter (December) with the shelf break slightly warm and mid-shelf slightly cool. The standard deviation of model-data discrepancies is not especially low (typically ∼ 4 • C), but we have not attempted to aggressively quality-control this data set with respect to the number of independent samples that enter each reported observation or the depth variance of samples in each aggregate observation. Such an effort will be required before these data are adopted in a data assimilation system.
Returning to statistical evaluations of model-data differences using Taylor diagrams, we wish to delve further into regional differences and make some comments on the accuracy of some of the data sources used in our skill assessment approach.
The Doppio model temperature skill compared against observations is presented separately for different satellite sea surface temperature products and in situ observing platforms in Fig. 13a. Upward-pointing triangle symbols are for SST; downward-pointing ones are for in situ temperature. For most observing networks there is good statistical agreement between Doppio temperatures both at the surface and at depth. Clustering at the unit radius indicates the model variance is close to observed, with strong correlations in the vicinity of 0.9. Bias is small. Of interest is that there is a clear discrepancy for the TRMM Microwave Imager (TMI) SST data against the other satellite products. The TMI sensor only provides data south of 38 • N and more than 100 km from the coast, so the statistics are skewed strongly toward model results in the Slope Sea and Gulf Stream. Nevertheless, that the skill should be so dramatically different in comparison to other microwave SST sensors (WindSat -WSAT -and Advanced Microwave Scanning Radiometer -AMSR) is troubling. WSAT and AMSR have comparable spatial coverage and resolution, yet skill for WSAT is significantly poorer than for AMSR. While we have retained these data in the subregion temperature analysis (Fig. 13b), we suspect that TMI and WSAT may not be reliable in the Doppio domain and will withdraw them from future data assimilative reanalyses.
The in situ temperature comparisons show strong agreement that is as good as infrared SST sensors, so we are confident in Doppio's ability to simulate temperature not merely at the surface but throughout the water column.
In Fig. 13b we separate the evaluation according to our standard subregions and again contrast surface (satellite) and subsurface (in situ) data with directed triangle symbols. Model performance is best in the GOM and over Georges Bank, though with a bias over Georges Bank that stems from the September and December results already noted in Fig. 11. In comparing SST and subsurface temperature for the Scotian Shelf and Mid-Atlantic Bight, we see that the Doppio model does well for SST but less so for subsurface temperature. This is perhaps unsurprising given the strong constraint that prescribed meteorological conditions exert on ocean circulation model SST.  In assessing Doppio's skill in simulating mixed-layer depth (MLD) variability within the Gulf of Maine and over Georges Bank, we compare (Fig. 14) modeled to observed frequency of occurrence of mixed-layer depths (Christensen and Pringle, 2012) using a common MLD definition: the depth where the potential density is 0.01 kg m −3 greater than at the sea surface (Thomson and Fine, 2003). We note that Doppio best simulates wintertime mixed layers along the Gulf of Maine's coast and tends to have a slightly deeperthan-observed mixed layer in the other zones of the Gulf of Maine and Georges Bank. It is worth noting that the coast zone has the best spatial coverage in sampling of all zones, whereas in other zones the coverage is not nearly as uniform. The model-estimated MLD is a uniform subregion average and may not sample the ocean equivalently to Christensen and Pringle's (2012) analysis where their sample size was small. Figure 15a shows the 5-year (2007-2012) average mean dynamic topography for the free-running Doppio model. This is constrained at the perimeter by bias-corrected openboundary data corresponding to Fig. 15b from a 4D-Var climatology analysis following the Levin et al. (2018) methodology, which we expect to be a better representation of regional MDT. There are differences between the two, notably in the free run a separation between westward coastal and shelf break flows over the outer MAB. This separation is not evident in the 4D-Var climatology solution, but that lacks the dynamical influence of tidal rectification among other processes, and we do not have a ready explanation for the differences. The free-running Doppio still well represents the coastal waters, especially the coastal current of the GOM. Also shown (Fig. 15c) is the MDT from Mercator product PSY4QV3R1, from which Doppio's open-boundary conditions were adapted. Of note is the inaccurate GOM circula- tion, which is understandable due to the regional-scale tradeoffs a global model must make, especially in the GOM where tidal dynamics (which are absent from Mercator) are such an important driver of mixing and circulation. Figure 15d shows MDT from AVISO (Archiving, Validation and Interpretation of Satellite Oceanographic data) product CNES-CLS13, on which the aforementioned Mercator product is based. Both Mercator and AVISO show a preponderance of dynamic height contours intersecting the coast, which would imply surface geostrophic currents normal to the shore. While both the AVISO and Mercator products have been superseded and much improved in their regional definition since 2013 when the Doppio system was being created, it was the lack of a physically reasonable GOM circulation structure that prompted our independent pursuit of a kinematically and dynamically balanced regional MDT . In our judgment, the free-running Doppio MDT is still more accurate than the AVISO CNES-CLS18 product (not shown), and our 4D-Var climatology analysis remains the most accurate regional portrayal of the system. Shown in Fig. 16 are the mean model circulations from the same 5-year (2007-2012) period as in Fig. 15, for the model ocean surface and bottom, overlaid upon the bathymetry binned to emphasize isobaths. Evident from these are the Gulf of Maine's main oceanic inflows: Scotian Shelf water coming along the Halifax coastline and originating from the Labrador Current and Slope Sea water entering through the Northeast Channel that derives from the subpolar North Atlantic mixed with eddies of the Gulf Stream. The two main outflows are water exiting near Nantucket and waters exiting out the Northeast Channel and around Georges Bank in alignment with the accepted general circulation pattern (Brown and Beardsley, 1978). Circulation within the deep basins of the GOM is also evident, and the GOM coastal current is pronounced at the surface. The general southwestward flow on the MAB shelf, modified by an offshore Ekman component, is clear.

Summary
This article has described in detail the features of a ROMSbased regional circulation model, Doppio, for the Mid-Atlantic Bight and the Gulf of Maine. The model downscales open-boundary information drawn from Mercator Ocean or global HYCOM, but we have shown that taking steps to adjust for biases in these global class models leads to discernable improvements in Doppio performance. The model demonstrates useful skill in comparison to a comprehensive suite of satellite and in situ observations from a dense coastal regional integrated ocean observing network. There are aspects of the model solution that would likely improve with formal data assimilation, but that is not part of this body of work. The configuration uses surface, river, and openboundary forcing data streams that are suited to real-time operation, and such a system with 4-D variational (4D-Var) data assimilation (Levin et al., 2019;Wilkin et al., 2018) has been prototyped for MARACOOS.
The focus of development was on achieving a model configuration that allows for decadal-scale simulations of physical ocean circulation that can ultimately underpin regional studies of ecosystems and biogeochemistry. As such, faithfulness to stratification throughout the entire water column, especially in coastal and shelf waters, is paramount. Doppio captures both the temperature and salinity stratification well, including a region-wide vertical salinity gradient that main- tains stable water columns despite winter temperature inversions. Bottom temperature is a particularly challenging aspect to model in this region because of the extreme vertical stratification that arises in summer in the MAB and the influence of warm offshore waters at the shelf edge that are the principal driver of seasonal temperature inversions on the outer shelf. To affirm the model performance in regions relevant to ecosystems and fisheries, comparison was made to near-seafloor temperature data acquired from trawl fishing gear with encouraging results. A further aspect of regional dynamics relevant to ecosystems is mixed-layer depth, and where reliable climatological analyses exist, in the Gulf of Maine coastal current, the model performance in acceptable. The large-scale mean circulation of the region as characterized by mean dynamic topography from data assimilative climatological analyses is preserved in the Doppio simulations.
It is anticipated that this Doppio model setup will see similarly broad adoption for studies of ecosystems, biogeochemical cycles, and ocean weather, as was the case for the ESPreSSO system as noted in Sect. 1. Decade-long simulations are already in progress to examine transport pathways and timescales of open ocean to shelf and marginal sea exchange and coupled physical-biogeochemical interactions. Future developments are also underway to enable higher spatial resolution (∼ 2 km) that admits submesoscale variabil-ity, which may in turn have a significant impact on modeled ecosystems.
Code and data availability. Doppio uses the ROMS source code (ROMS version 3.6, SVN revision 898) with specific configuration options detailed in Table 1. The code is accessible as a free download at the ROMS website (http://www.myroms.org, The ROMS/TOMS group, 2019) and is open-source licensed according to the MIT/X license. Doppio forcing, boundary, and other user input files are available on request via the Rutgers Ocean Modeling Group (OMG) THREDDS (Thematic Real-time Environmental Distributed Data Services) Data Server (TDS). The 2007-2017 Doppio model output is also accessible via OMG TDS. Validation data and model-data differences from the VERIFICATION analysis are available via the OMG ERDDAP service.
Author contributions. Much of the original model design was done by JLW and JCL, with AGL creating select forcings. JLW and JCL collected and maintained observational data aggregation used for model verification. AGL conducted configuration experiments that guided model development, as seen in the earlier skill assessment. Model results were interpreted and evaluated by AGL with guidance from JLW and JCL. All authors contributed to the paper.
Competing interests. The authors declare that they have no conflict of interest.