Understanding the development of systematic errors in the Asian Summer Monsoon

Despite the importance of monsoon rainfall to over half of the world’s population, many of the current generation of climate models struggle to capture some of the major features of the various monsoon systems. Studies of the development of errors in several tropical regions have shown that they start to develop very quickly, within the first few days of a model simulation, and can then persist to climate timescales. Understanding the sources of such errors requires the combination of various modelling techniques and sensitivity experiments of varying complexity. Here, we demonstrate how such analysis can shed light on the way in which monsoon errors develop, their local and remote drivers and feedbacks. We make use of the seamless modelling approach adopted by the Met Office, whereby different applications of the Met Office Unified Model (MetUM) use essentially the same model configuration (dynamical core and physical parametrisations) across a range of spatial and temporal scales. Using the Asian Summer Monsoon as an example, we show that error patterns in circulation and rainfall over the East Asia Summer Monsoon (EASM) region in the MetUM are similar between multi-decadal climate simulations and seasonal hindcasts initialised in spring. Analysis of the development of these errors on both short-range and seasonal timescales following model initialisation suggests that both the Maritime Continent and the oceans around the Philippines play a role in the development of EASM errors, with the Indian summer monsoon region providing an additional contribution. Regional modelling with various lateral boundary locations helps to separate local and remote contributions to the errors, while regional relaxation experiments shed light on the influence of errors developing within particular areas on the region as a whole.


Introduction
Despite many advances in weather and climate modelling over the past decades, systematic errors remain prevalent in key regions such as the Asian Summer Monsoon. Such systematic errors have been shown in past studies to develop rapidly, often within the first few days of simulation, and can persist to climate timescales. This has important implications for forecasting on a wide range of timescales, and for climate projections, in regions where millions of people rely on the seasonal rainfall for their water resources and livelihoods. Several modelling studies have investigated the initial error growth using short-range forecasts from numerical weather prediction models (e.g. Keane et al., 2019; Martin et al., 2010; Rodwell and Palmer, 2007; Phillips et al., 2004). Such studies allow the immediate influence of atmosphere model physical parametrisations to be identified without the complex feedbacks from circulation errors which develop over longer timescales. This approach can be particularly useful where similar model configurations are used for both timescales (Martin et al., 2010; Hurrell et al., 2009).
The advent of coupled ocean-atmosphere numerical weather prediction poses further challenges, because additional systematic errors can develop through feedbacks between atmosphere and ocean. For extended-range and seasonal predictions, tracking the development of systematic errors through the coupled atmosphere-ocean-cryosphere system and across timescales ranging from individual weather events through intra-seasonal to seasonal variations is also a challenge. Several previous studies have used initialised seasonal hindcasts to shed light on the origin of coupled model errors in tropical regions (e.g. Lazar et al., 2005; Huang et al., 2007; Liu et al., 2012; Vannière et al., 2014; Siongco et al., 2020). Lazar et al. (2005) demonstrated that both the atmosphere and ocean components of coupled models contribute to the development of errors, on different timescales and in different regions, and with the balance of atmosphere/ocean contribution being model-dependent. Vannière et al. (2013) used a multi-model seasonal hindcast dataset to identify the order in which errors appeared in the tropical Pacific, and Vannière et al. (2014) developed this into a systematic approach that allowed them to identify a range of drivers and timescales for tropical Pacific SST errors in the IPSLCM5A-LR coupled model. Similarly, Siongco et al. (2020) identified different drivers for the fast-developing cold phase and slow-developing warm phase of the equatorial Pacific SST errors in the Community Earth System Model, version 1 (CESM1). Voldoire et al. (2019) used a multi-model ensemble of seasonal hindcasts made by climate models to confirm that easterly wind stress errors drive warm SST errors in the tropical Atlantic from the first month onwards. In a global study analysing daily to multi-annual timescales in two different coupled seasonal prediction models, Hermanson et al. (2018) showed a range of SST drift evolution and timescales among different regions and different times of year, with some regions being affected by poor initialization.
On sub-seasonal to seasonal timescales, there will be contributions to systematic errors both from local processes and from remote teleconnections. Separating these contributions, and identifying their interaction, requires a range of bespoke modelling tools that constrain parts of the climate system while allowing others to develop freely. Examples include: atmosphere-only, land-only or ocean-only model simulations where observed or modelled fields can be used to force one coupled model component at a time; replacing surface fluxes in a coupled model with daily observed or modelled fields; [...]
https://doi.org/10.5194/gmd-2020-268 Preprint. Discussion started: 8 October 2020 © Author(s) 2020. CC BY 4.0 License.
Rodriguez and Milton (2019) describe analysis of the spin-up of the errors over the Asian monsoon region in initialized 15-day atmosphere-only hindcasts. These showed the gradual emergence, over the 15 days, of the key systematic errors seen in the moisture transport/divergence from free-running simulations of the same model. Some of the errors are large even at day 1, supporting previous results from e.g. Keane et al. (2019) that errors in local parameterised physics are the key drivers of monsoon errors rather than remote forcing errors of the circulation. Rodriguez and Milton (2019) further investigated which errors were driven from the Maritime Continent (MC) region by using regional relaxation experiments where the winds and temperatures over the MC region were relaxed back to reanalyses. This revealed that deficiencies in tropical convection over the MC region start to contribute to errors in the Asian monsoon circulation within the first 15 days of the hindcasts.
Levine and Martin (2018) used a regional climate model centred over India and forced by reanalyses at the lateral boundaries to show that remote errors (particularly, excessive convection over the equatorial Indian Ocean and poor representation of precursor disturbances transmitted from the Western Pacific) contribute significantly to the poor simulation of monsoon low pressure systems in the Met Office model.
In the present study, we illustrate how a combination of many of the techniques outlined above can be used to analyse the development of monsoon errors, their local and remote drivers and feedbacks. We take advantage of the range of Met Office model configurations covering timescales from days through seasons to decades. These share a common dynamical core and similar physical parametrisations as part of the Met Office's seamless approach to modelling weather and climate. We extend and develop the previous work by including analysis of the development of errors in medium-range coupled and atmosphere-only model hindcasts during the first 7-15 days, and in a coupled seasonal hindcast ensemble during the first few pentads following initialization, and by investigating the individual and interacting roles of various remote regions in the development of errors in both atmosphere-only and coupled model configurations. While our study focusses on systematic errors in the Asian summer monsoon, similar methods could be applied to other monsoon, and non-monsoon, regions.
Section 2 describes the data and methods used, while Sect. 3 documents the results of the various experiments. In Sect. 4 we discuss the results and their implications for targeted model development.

Data and methods
Free-running climate simulations using the Met Office coupled atmosphere-ocean configuration Global Coupled version 2 (GC2.0; Williams et al., 2015), forced by present-day greenhouse gases and aerosols and covering several decades, are used initially in order to illustrate the model errors of interest to this study. The atmosphere component of GC2.0 is Met Office Unified Model (MetUM) Global Atmosphere 6.0 (GA6.0; Walters et al., 2017), which is coupled to the Joint UK Land Environment Simulator (JULES; Best et al., 2011) and the NEMO (Nucleus for European Modelling of the Ocean; Madec, 2008) ocean and CICE (Hunke and Lipscomb, 2004) sea-ice models. The model is configured at N216 horizontal resolution (0.833° in longitude and 0.556° in latitude, which is approximately 80 km at the equator) for the atmosphere and the ORCA0.25 tripolar grid (0.25°) for the ocean. The vertical resolution is 85 levels for the atmosphere and 75 levels for the ocean. Comparison is made against ERA-Interim (ERA-I; Dee et al., 2011) and ERA5 (Copernicus Climate Change Service, 2017) reanalyses for winds, the Global Precipitation Climatology Project pentad dataset version 2.2 (GPCP v2.2; Xie et al., 2003; Adler et al., 2003) and the Tropical Rainfall Measuring Mission 3B42 product, version 7-7A (TRMM; Kummerow et al., 1998; Huffman et al., 2010; Huffman and Bolvin, 2013) for precipitation, and NOAA daily Optimum Interpolation Sea Surface Temperature v2 (OISSTv2; Reynolds et al., 2007) and HadISST1.1 monthly sea surface temperatures (Rayner et al., 2003).
In order to study the development of errors after initialisation, we make use of a hindcast ensemble from the GloSea5 operational long-range forecast system (MacLachlan et al., 2015; Williams et al., 2015). GloSea5 also uses the MetUM GC2.0 configuration at the same horizontal and vertical resolution as in the free-running simulations described above. The standard operational hindcast set includes seven members per start date, four start dates (the 1st, 9th, 17th and 25th) per month, and runs from 1993-2016. A Stochastic Kinetic Energy Backscatter scheme (SKEB2; Bowler et al., 2009) is used to introduce small grid-level perturbations throughout the integrations to create ensemble spread. The atmosphere and land components are initialized from daily ERA-I reanalyses at 0.75° × 0.75° resolution, while the ocean and sea ice models are initialized from the GloSea5 ocean and sea ice analysis using GloSea5 Global Ocean 3.0, which is driven by ERA-I and uses the NEMOVAR data assimilation scheme (Blockley et al., 2014).
In order to separate the influence of local and remote sources of error, we make use of a regional climate model (RCM) configuration based on GA7.0 (Walters et al., 2019), forced by observed sea surface temperatures (SSTs) from OISSTv2, and run at 0.44° × 0.44° resolution (approximately 50 km), which is similar to (but slightly higher than) that of the N216 global models. Relative to GA6.0, GA7.0 includes changes to both model physics and dynamics, comprising incremental developments and targeted improvements to address critical errors, including a persistent dry bias over the Indian subcontinent. While some progress was made in GA7.0 towards reducing those errors, the overall pattern of Asian Summer Monsoon (ASM) errors investigated in the present study remains, justifying our use of this RCM configuration. For the RCM domains centred over China, a rotated north pole is used at 61°N, 296.3°E. The RCM is constrained by 6-hourly ERA-I at the lateral boundaries, but within the domain the model runs freely after initialisation and is therefore able to develop errors due to local processes and feedbacks, despite the constraint from the boundaries. By adjusting the locations of the lateral boundaries and comparing between RCM simulations, and against a corresponding 20-year global atmosphere-only model simulation using GA7.0, the contribution to the systematic errors from different regions can be ascertained. This technique was applied by Karmacharya et al. (2015) and Levine and Martin (2018) for understanding sources of error in intraseasonal variability and monsoon low pressure systems in the South Asian Summer Monsoon.
We study the evolution of errors after initialisation in the GloSea5 seasonal hindcast ensembles with different start dates and, in addition, in initialised 7- to 15-day numerical weather prediction (NWP) hindcasts using atmosphere-only and coupled configurations of GA6.1/GC2 at N768 resolution (0.234° × 0.156°, approximately 26 km at the equator). These are initialised every day between 1st May and 19th September 2016 and each run for 15 days (Vellinga et al., 2020). The day-1, day-2, etc. hindcasts can be combined to provide a seasonal climatology for each lead time, and the use of the same atmosphere model configuration in the coupled and atmosphere-only hindcasts allows the role of coupling to be ascertained.
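The combination of day-1, day-2, etc. hindcasts into a climatology for each lead time can be illustrated with a minimal sketch. It is not the processing used in this study: the function name `lead_time_bias`, the array layout and the use of NumPy are assumptions made for illustration. Each hindcast value is paired with the verifying value on its validity date, and the errors are then averaged over all start dates separately for each lead time.

```python
import numpy as np

def lead_time_bias(hindcasts, truth):
    """Mean error as a function of lead time (illustrative sketch only).

    hindcasts: array (n_starts, n_leads, ...) where hindcasts[s, l] is the
               forecast started on day s, valid l + 1 days later (assumed layout).
    truth:     array (n_days, ...) of verifying analyses or observations,
               with n_days >= n_starts + n_leads.
    """
    n_starts, n_leads = hindcasts.shape[:2]
    # Index of the validity date for every (start date, lead time) pair.
    valid = np.arange(n_starts)[:, None] + np.arange(n_leads)[None, :] + 1
    # Error of each hindcast against the truth on its validity date.
    errors = hindcasts - truth[valid]
    # Average over start dates to obtain one mean error per lead time.
    return errors.mean(axis=0)
```

Applying the same function to the coupled and atmosphere-only hindcasts and differencing the results would give a simple estimate of the contribution from coupling at each lead time. In practice the averaging period would be restricted to a common set of validity dates.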
To shed light on the drivers of systematic errors, we make use of the "nudging" technique described by Rodriguez and Milton (2019). A 20-year, free-running, atmosphere-only model simulation using GA7.0 at N96 horizontal resolution (1.25° in latitude and 1.88° in longitude, approximately 200 km at the equator) is relaxed back to analyses over regions from where we consider significant systematic errors may originate and affect other regions through remote teleconnection. Model winds and potential temperatures are nudged back to ERA-I with a 6-hourly relaxation time scale at all model levels. A 10° buffer zone around the relaxation subdomain is applied in which the nudging increments are exponentially damped to zero, in order to ensure a smooth transition between the nudged and free-running parts of the simulation. Similar nudging experiments are also carried out in 15-day hindcasts initialised once per day through JJA of 2016 using the NWP GA6.1 atmosphere-only configuration at N216 resolution (NWP-N216), in order to examine how the influence from the nudged region is manifest in the development of the errors.

Climatological errors in the Asian summer monsoon
Figure 1a shows June to August (JJA) mean climatological errors in rainfall and 850 hPa winds in the 30-year, free-running, present-day, GC2 simulation compared with ERA-I and GPCP v2.2. Similar to previous studies of the Asian monsoon system in MetUM configurations (e.g. Keane et al., 2019; Johnson et al., 2016, 2017; Bush et al., 2014; Martin et al., 2010; Ringer et al., 2006), the model exhibits a deficit in rainfall over the Indian peninsula, the eastern Indian Ocean south of the Equator and the Maritime Continent, an excess over the Indian Ocean to the north of the Equator, in the eastern South China Sea (SCS) and the western Pacific, and excess precipitation over the mountains bordering the Tibetan plateau. These are accompanied by a weak Somali jet that diverges into an anticyclonic anomaly over India, excessive westerly flow over southeast Asia, the SCS and across the Philippines into the western Pacific, and a cyclonic error and deficit in precipitation over southeastern China, southern Japan, Korea and the East China Sea.
Johnson et al. (2017) analysed the climatological JJA seasonal mean errors in a hindcast ensemble from GloSea5 and showed that they are similar to those seen both in climate models including the MetUM (Sperber et al., 2013) and in other state-of-the-art seasonal forecast systems. Figure 1b shows the JJA climatological errors from the current GloSea5 23-year operational hindcast ensemble initialised each year on the four start dates in April. This confirms once again that, despite the initialisation and the relatively short lead time, the hindcast JJA errors are very similar in pattern, and (with the exception of the Indian region) in magnitude, to those from the 30-year free-running simulation (Fig. 1a). Johnson et al. (2017) commented that the seasonal mean errors over the Indian region are largely due to a climatologically late onset of the monsoon in the model, which reduces the precipitation over and around India in May and June. Figure 1(c to h) shows errors in the June, July and August climatologies from GC2 and from the GloSea5 23-year operational hindcast ensemble (at ~1 month lead time for each month, i.e. using the 4 start dates in May, June and July respectively). This shows that, while the rainfall errors over the Indian region as a whole in both GC2 and GloSea5 are indeed largest in June, both the magnitude of the monthly errors and the differences between the three months are noticeably smaller in GloSea5.
The differences between GloSea5 and GC2 in the Indian region, particularly in June (GC2 shows a weakened Somali Jet and much larger rainfall deficit than GloSea5), are consistent with the differences between GloSea5 and CMIP5 models noted by Johnson et al. (2017), who considered these attributable in part to a smaller Arabian Sea cold SST error in GloSea5. Figure 2 shows the errors in SST against HadISST observations (1993 to 2015) in JJA and for June, July and August for GC2 and GloSea5 as in Fig. 1. Cold errors in the Arabian Sea are seen in both simulations, particularly in June, but they are considerably larger in GC2. Marathayil et al. (2013) showed that, in CMIP3 models, such errors develop in winter due to anomalously strong north-easterly winter monsoon winds advecting cold, dry air from the Eurasian land mass over the Arabian Sea. Their analysis suggested that excessive rainfall in the equatorial Indian Ocean and anomalously cold winter continental surface temperatures in the CMIP3 models both contribute to the Arabian Sea cold SST error. Levine et al. (2013) showed that these errors persist into spring and early summer and are associated with a weaker monsoon circulation and reduced monsoon precipitation. Initialisation of the GloSea5 hindcasts in spring prevents the growth of a large SST error, thereby reducing the circulation and rainfall errors over the Indian region (Levine and Turner, 2012; Levine et al., 2013; R. Levine, personal communication).
The free-running and initialised models are consistent in developing cold SST errors around the Maritime Continent, the South China Sea and the central and eastern Indian Ocean, even just over a month after initialisation. An SST error dipole pattern resembling that of the positive Indian Ocean Dipole (IOD; Saji et al., 1999) is apparent in the seasonal hindcasts but is much stronger in the free-running simulation. This is consistent with the circulation anomaly pattern shown in Fig. 1, which strongly resembles the atmospheric component of the IOD teleconnection: south-easterly anomalies along the Sumatran coast and easterly anomalies along the Equator. Previous work (e.g. Marathayil et al., 2013; Johnson et al., 2017) has shown that this SST error pattern is associated with a coupled interaction between excessive rainfall in the central equatorial Indian Ocean, excessive easterly low-level winds and increased upwelling that shoals the thermocline in the east.
The additional northeasterly wind anomalies in the western Indian Ocean in GC2 exacerbate this error pattern. Johnson et al. (2017) showed that this coupled mean state error results in errors in the representation of the IOD as a mode of variability in the model, reducing its ability to predict the Indian monsoon circulation.
Rodriguez and Milton (2019) showed that local errors in moisture convergence/divergence over the Maritime Continent region also contribute to the development of circulation and rainfall errors in the eastern Indian Ocean, the South China Sea, western Pacific and southeast China in atmosphere-only simulations. It is likely that, in the coupled system, these atmosphere errors drive a cooling response in the SSTs which further contributes to decreases in rainfall and anomalous moisture divergence through coupled feedbacks.
The largest differences between the free-running and initialised simulations are seen in the central North Pacific, where the cold SST errors in the free-running GC2 simulation are much larger than in GloSea5. Such errors are common among CMIP5 models: Wang et al. (2018) showed that they are associated with overly strong surface winds driving excessive evaporation, combined (in summer) with a deficit in downward solar radiation at the surface. While Wang et al. (2018) showed that, in CMIP5 models, the cold errors are present throughout the year (but largest during JJA), the initialisation of GloSea5 in spring limits the extent to which these can develop during the summer months.
Despite the differences related to errors which develop in the winter in GC2, there are many areas where the similarity between the monthly error patterns at ~1-month lead time and the seasonal mean error pattern demonstrates that the errors develop quickly and then persist to longer timescales in this coupled model. In the following sub-sections we demonstrate how a range of configurations within the seamless modelling system can be used to shed light on various aspects and drivers of these errors.

Regional climate modelling
To investigate first the local and remote sources of the errors identified in Sect. 3.1, we use regional climate model (RCM) simulations with different domain sizes, centred over China and forced at the lateral boundaries with ERA-I 6-hourly reanalyses, and using time-varying, observed SSTs instead of an ocean model. Such experiments isolate the effects of any remote errors in an atmosphere-only global model (AGCM) that are located outside the RCM domain from those developing within the domain. The RCM simulations are performed at 0.44° × 0.44° resolution, similar to that of the N216 GA7.0 AGCM simulation (0.833° × 0.556°), so that the comparison between RCM and AGCM isolates the local and remote forcing of errors over these two regions during the EASM. Karmacharya et al. (2015) used this approach to investigate local and remote sources of MetUM errors in the Indian monsoon region. They showed that the equatorial Indian Ocean is a key driver of Indian rainfall errors, although errors over the Himalayan foothills also played a role and there was evidence of locally-driven errors that were thought to be related to the model's inherent difficulties in reproducing the diurnal cycle of rainfall over land. Levine and Martin (2018) used similar methods to show that remote errors contribute significantly to the poor simulation of Indian monsoon lows and depressions.
The RCM domains used in the present study are shown in Fig. 3. Figure 4 shows the climatological errors in JJA from the AGCM and the RCM China1 and China1SE (which includes the China1, China1E and China1S regions) domains. Although the magnitude of the error differs in places, the error pattern for JJA in the AGCM (top left) is very similar to that seen in the coupled simulations (Fig. 1). This suggests that neither the change between GA6 and GA7.0 nor the atmosphere-ocean [...] and increased rainfall offshore. The southward extension in China1S contributes weakly both to the westerly anomalies across the South China Sea (not shown) and to the easterly anomalies over the middle/lower Yangtze River Basin, but both of these are strengthened when the domain is extended both to the south and the east, thereby including in addition the whole of northern Indonesia. Extending the China1 domain to the north and west has little impact compared with the anomalies contributed locally by China1 itself (Fig. 5), although China1W does contribute additional dry rainfall anomalies over southern China. However, preliminary experiments with much larger RCM domains (not shown) suggest a role for even more remote influence, perhaps through the circum-global teleconnection (Wu et al., 2016).
A limited number of RCM experiments was carried out in which the RCM was initialised each year on 1 May, in order to determine how quickly the influence of local processes and remote teleconnections became apparent. For all domains, the differences between the re-initialised and free-running experiments were minimal (not shown), indicating a rapid and robust evolution of the atmosphere model towards these systematic errors. The development of errors in the first few weeks after initialisation is explored further in the next section.
This analysis illustrates how an RCM with different lateral boundary locations can be used to shed light on the local and remote sources of systematic error in a climate model. For the EASM, we find that much of the circulation and rainfall error pattern seen in the full GCM is not driven locally but is related to errors arising mainly to the south and east of the region, i.e. over the Maritime Continent, South China Sea and the Western Pacific.
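The comparison between RCM and AGCM errors can be written as a simple bookkeeping exercise. The sketch below is illustrative only and is not the diagnostic code used in this study. It assumes that all three fields have already been regridded to a common grid over the RCM domain and that local and remote contributions add linearly, which will not hold exactly where the two interact. The function name `split_local_remote` is hypothetical.

```python
import numpy as np

def split_local_remote(agcm, rcm, obs):
    """Split the AGCM error into local and remote parts (illustrative sketch).

    agcm, rcm, obs: climatological fields on a common grid over the RCM domain.
    The RCM is constrained by reanalyses at its lateral boundaries, so its
    error is taken as the part generated inside the domain; the remainder of
    the AGCM error is attributed to regions outside the domain.
    """
    agcm_error = np.asarray(agcm) - np.asarray(obs)
    local = np.asarray(rcm) - np.asarray(obs)   # develops within the domain
    remote = agcm_error - local                 # equals agcm - rcm
    return local, remote
```

Repeating the calculation for RCM domains with different boundary locations (for example China1 and China1SE) indicates which extension of the domain brings additional parts of the AGCM error pattern inside the "local" term.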

Development of errors in initialised hindcasts
Having identified that in much of the ASM region the errors appear to develop rapidly and persist thereafter to long timescales, we now make use of initialised hindcasts to examine their development and evolution during the first few weeks after initialisation. We first make use of the GloSea5 seasonal hindcast ensemble, which consists of 7 members per start date for four start dates per month and covers a 23-year period from 1993-2015. In order to reduce the effects of internal variability, we average the ensemble mean precipitation, winds and SSTs into pentads and average both the model and observational fields over the hindcast period. We find that the error patterns develop in a similar way when using start dates of 25th June, 25th July and 25th August (not shown). This indicates that they are a robust feature of the model's behaviour during the monsoon season, consistent with their similarity to those in the free-running coupled simulation. However, similar analysis using start dates in late April and early May shows a slightly different development in the first ~15 days of the hindcast (see Figs. 8 and 9): the anomalous divergence and rainfall deficit over the Indonesian islands is much more localised and takes longer to spread westwards and northwards. There is greater and more widespread warming of the southern Bay of Bengal, while the cold anomalies south of the Equator off the Sumatran coast do not start to develop until around 20th May. This is thought to be related to the seasonal transition that takes place around mid-May and marks the start of the Asian monsoon season (Wang et al., 2004; Fig. 10).
Prior to this transition, the mean state low-level winds over the equatorial Indian Ocean are westerly and the mean flow over the Indonesian islands is weak. As noted by Ding and Chan (2005), the onset of the South China Sea summer monsoon (SCSSM) is very abrupt, with a rapid switch from easterlies to westerlies over the South China Sea and a rapid expansion northeastwards of the south-westerlies from the eastern equatorial Indian Ocean (EEIO) across the Indochina Peninsula. In hindcast ensembles initialised after this transition (Fig. 6), when the easterly low-level flow over the EEIO south of the Equator is stronger, there is a more widespread anomalous divergence over the Maritime Continent and more rapid cooling of the SSTs to the west of Sumatra.
This analysis illustrates that the monsoon error development in initialised hindcasts is dependent on the stage of the monsoon season, as well as on the lead time of the hindcast. Once the broadscale seasonal transition has occurred, the error patterns develop in a similar manner regardless of the initialisation date.
Figure 11 shows pentad rainfall, winds and SSTs averaged over various regions, from hindcast ensembles initialised on different start dates between 9th April (0409) and 25th August (0825), along with similar timeseries for GPCP and TRMM rainfall observations, ERA-I winds and OISSTv2 SSTs. For start dates in April, the SST in the SCS initially warms excessively, before cooling systematically into a cold error through the JJA season (Fig. 11a). For start dates in late May onward, the SST appears to be initialised systematically warmer than the observations but to cool thereafter. The peak warm SST error coincides with the broadscale seasonal transition that is heralded by the SCSSM onset, as determined by the reversal of the 850 hPa winds over the SCS (Fig. 11b) in the criterion suggested by Wang et al. (2004). The SST cooling after this transition coincides with an acceleration of the westerly winds into a positive error for all start dates. In response both to this and to the additional convergence of moisture into the SCS from the Maritime Continent, the rainfall over the "Philippines" region (Fig. 11c) starts with a positive error and increases thereafter, particularly in the hindcasts initialised in May and June. The East Asian Summer Monsoon Index (EASMI; see Wang et al., 2008) decreases rapidly after initialisation in all hindcasts (Fig. 11d), indicating the weakening and displacement of the western North Pacific subtropical high (WNPSH). Separation of this index into its two components reveals that this is driven mainly by the increasingly excessive westerly flow in the southernmost box, which extends from southeast Asia across the SCS and the Philippines into the western Pacific, including the SCS box in Fig. 11b, and largely coinciding with the "Philippines" region in which the rainfall also increases (Fig. 11c). However, the hindcasts also rapidly develop an easterly error in the northernmost box, which extends from southern China across the East China Sea and to the south of Japan.
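The separation of a two-box circulation index into its components can be sketched as follows. This is a hedged illustration rather than the calculation used for Fig. 11d: the index is taken here as the area-weighted mean 850 hPa zonal wind in a northern box minus that in a southern box, and the default box limits are an assumption based on a reading of Wang et al. (2008) that should be checked against that paper before use. The function names are hypothetical.

```python
import numpy as np

# Assumed box limits ((lat_min, lat_max), (lon_min, lon_max)); verify against
# Wang et al. (2008) before relying on them.
NORTH_BOX = ((22.5, 32.5), (110.0, 140.0))
SOUTH_BOX = ((5.0, 15.0), (90.0, 130.0))

def box_mean(field, lats, lons, lat_range, lon_range):
    """Cosine-latitude-weighted mean of field[lat, lon] inside a box."""
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    sub = field[np.ix_(lat_mask, lon_mask)]
    weights = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return float(np.average(sub, weights=weights))

def shear_index(u850, lats, lons, north=NORTH_BOX, south=SOUTH_BOX):
    """Return (index, northern component, southern component)."""
    u_north = box_mean(u850, lats, lons, *north)
    u_south = box_mean(u850, lats, lons, *south)
    return u_north - u_south, u_north, u_south
```

With this sign convention, an excessive westerly flow in the southern box and an easterly error in the northern box both reduce the index, which is the behaviour described above for the hindcasts.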
As discussed above, the SSTs in the EEIO to the south of the Equator (Fig. 11f) cool systematically for all start dates. In contrast, the SSTs to the north of the Equator (Fig. 11e) warm substantially over the first few pentads for the April and early May start dates before cooling and ultimately developing a cold error, while for later start dates there is only a short period (2 or 3 pentads) of initial slight warming before a similar cooling begins and persists for the rest of the season. Examination of the 850 hPa winds in Figs. 8 and 10 shows that this initial warming for the late April start date is associated with a developing rainfall deficit and northeasterly anomalies from southeast Asia, opposing the mean flow. After pentad 28 (when the broadscale seasonal transition occurs), this is replaced by westerly/southwesterly anomalies (accelerating the mean flow) and a developing positive rainfall error. For start dates after this seasonal transition (Fig. 6) the wind anomalies are persistently westerly/southwesterly with an increasing positive rainfall error.
We examine this behaviour in more detail in a 2016 case study using hindcasts from the Met Office's NWP model in both atmosphere-only and coupled configurations. The atmosphere-only runs are 7-day operational forecasts while the coupled model hindcasts were run for 15 days. In both cases, one ensemble member per day was run in near-real time from 1 May 2016. Results shown in Sect. 3.2 indicated that the errors developing in atmosphere-only configurations closely resemble those in the coupled atmosphere-ocean models. However, SST errors are also identified, so it is important to understand the extent to which these are driven by, and feed back upon, the atmospheric errors.
As discussed above, the development of SST errors in the EEIO over the first ~15 days of hindcast differs according to whether the hindcasts are initialised before or after the broadscale seasonal transition. This is further illustrated by composite analysis of SST and 10 m winds at forecast lead times of 1, 5 and 15 days of the coupled NWP-N768 hindcasts, over a period of 10 to 15 days on either side of the broadscale seasonal transition (Fig. 12). For 2016, the validity dates chosen are 10 to 19 May ("before") and 10 to 23 July ("after"). Figure 12(a-c) shows the emergence of SST and surface wind errors in the EEIO before the transition. At day 1 the biases are small, showing in part the discrepancies between the OISSTv2 SSTs and the analysis SSTs (FOAM, Waters et al., 2014) used to initialise the hindcasts. At longer lead times, a large warm bias develops in the EEIO and the Maritime Continent, which is associated with a weakening of the equatorial westerly flow and the southerly wind in the Bay of Bengal. The emergence of errors after the transition (Fig. 12d-f)  with the analysis of GloSea5 (Fig. 5), despite the greater atmospheric horizontal resolution used in the NWP-N768 hindcasts, confirming that the error patterns emerging in the first 15 days, both before and after the broadscale transition, are robust.
The change in evolution of the SST errors in the northern EEIO box (as used in Fig. 11e) over the first 15 days of the coupled NWP-N768 hindcasts initialised between May and early August (Fig. 13a) is also similar to that seen in GloSea5. In forecasts initialised in May, SST in the northern EEIO box develops a warm error of around 0.5°C relative to the ocean analyses used to initialise the coupled forecasts (FOAM, Waters et al., 2014). This warm error manifests itself as a tendency to under-predict the cooling of SSTs in the second half of May. Forecasts initialised in June and July do not have this problem and develop a much weaker warm SST error within the first 15 days, mostly around 0.1°C, again consistent with the results for GloSea5. In fact, SSTs follow the observed cooling and levelling-off during June and July reasonably well. The warming of SSTs relative to the ocean analyses during the second half of May stems, at least partly, from under-representing the cooling that is seen in ERA5. That cooling is related to increased surface heat loss during that period (Fig. 13d).
However, in CPLDNWP, excessive downward solar radiation (not shown) and under-estimated turbulent (i.e. latent and sensible) heat fluxes (e.g. Fig. 13e) contribute to a reduction in the net surface flux out of the ocean during this period. The error in turbulent fluxes can be partly traced to a weak surface wind error (Fig. 13c). Errors in ocean processes likely also contribute to SST errors. These may be surface-driven (due to the weak surface wind bias, Fig. 13c) or caused by deficiencies in ocean processes (e.g. vertical mixing). An ocean mixed layer that is too shallow would exacerbate warming of SST caused by surface flux errors. Figure 13b confirms a lack of deepening of the mixed layer in the model early in the period, consistent with the weak-wind bias during that period. In future work we will examine the contribution from ocean processes in more detail.
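The link between a weak-wind bias, reduced latent heat loss and mixed layer depth can be illustrated with a back-of-envelope calculation. The sketch below is not the diagnostic used in this study: it assumes a standard bulk aerodynamic formula for the latent heat flux and a slab mixed layer warmed only by the surface flux error (ignoring entrainment, advection and penetrating solar radiation). All constants, wind speeds, humidity differences and depths are illustrative assumptions, not values from the hindcasts.

```python
# Typical constants, ASSUMED for illustration only.
RHO_AIR = 1.2          # air density, kg m-3
LATENT_HEAT = 2.5e6    # latent heat of vaporisation, J kg-1
EXCHANGE_COEFF = 1.2e-3  # bulk transfer coefficient for moisture (dimensionless)
RHO_SEAWATER = 1025.0  # seawater density, kg m-3
CP_SEAWATER = 3990.0   # seawater specific heat capacity, J kg-1 K-1


def latent_heat_loss(wind_speed, humidity_difference):
    """Bulk-formula latent heat loss from the ocean (W m-2).

    wind_speed: near-surface wind speed (m s-1)
    humidity_difference: saturation specific humidity at the sea surface
        minus near-surface air specific humidity (kg kg-1)
    """
    return RHO_AIR * LATENT_HEAT * EXCHANGE_COEFF * wind_speed * humidity_difference


def sst_change(net_flux_error, mixed_layer_depth, n_days):
    """SST change (K) in a slab mixed layer of given depth (m) forced by a
    constant net surface flux error (W m-2, positive into the ocean)."""
    seconds = n_days * 86400.0
    return net_flux_error * seconds / (RHO_SEAWATER * CP_SEAWATER * mixed_layer_depth)


# Hypothetical example: the model wind is 1 m/s weaker than the reference.
loss_reference = latent_heat_loss(wind_speed=6.0, humidity_difference=0.005)
loss_weak_wind = latent_heat_loss(wind_speed=5.0, humidity_difference=0.005)
flux_error = loss_reference - loss_weak_wind  # extra heat retained by the ocean

# The same flux error warms a shallow mixed layer faster than a deep one.
warming_shallow = sst_change(flux_error, mixed_layer_depth=20.0, n_days=15)
warming_deep = sst_change(flux_error, mixed_layer_depth=40.0, n_days=15)
print(flux_error, warming_shallow, warming_deep)
```

In this simplified budget the latent heat loss scales linearly with wind speed and the SST response scales inversely with mixed layer depth, so a weak-wind bias and a mixed layer that fails to deepen act in the same direction.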
By comparing surface heat flux errors from coupled and atmosphere-only forecasts in this period we can determine the importance of air-sea coupling in the development of surface flux errors (Fig. 13d,e). For most of the time, the evolution of surface flux errors is very similar between coupled and uncoupled configurations. This suggests that coupled feedbacks are of limited importance here in the development of surface flux errors. The main exception is during the second half of May, when the strongest warm SST error develops. In this period, the surface latent heat flux errors in CPLDNWP and UNCPLD diverge unusually strongly, differing by 50-100 W m⁻² (Fig. 13e). Coupled feedbacks cause reduced latent heat loss in CPLDNWP, compared to ERA5 (positive values in Fig. 13e), while UNCPLD shows excessive cooling from surface latent heat flux (negative values in Fig. 13e), consistent with a positive 10 m wind bias in UNCPLD (Fig. 13c).
Further work is needed to clarify how this coupled feedback operates. This example illustrates how coupled and uncoupled initialised forecasts can be used to home in on some of the long-standing errors seen in the Indian Ocean.

Regional nudging experiments to assess sources of error
From the analysis shown in Sects. 3.2 and 3.3, we hypothesise that the reduced rainfall and anomalous outflow from the Maritime Continent and Indian regions play a role in the development of the circulation errors in the ASM at the start of the monsoon season. In order to test this hypothesis, we conduct a series of atmosphere-only sensitivity experiments using the nudging/relaxation methodology described in Rodriguez et al. (2017). This involves relaxing the temperatures and winds back to analyses with a 6-hourly relaxation time scale at all model levels. Assuming a linear response, the difference between the Control and the "Nudged" simulations then gives an indication of the role played by the nudged region in the errors that occur in the Control in other locations (Klinker, 1990).
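The logic of the relaxation experiments can be sketched with a toy two-region system. This is an illustration of the general technique only, not the MetUM nudging scheme: the linear two-box model, its forcing, damping and coupling values, and all function and variable names are assumptions introduced here. Only the 6-hour relaxation time scale is taken from the description above. Region A carries a spurious model tendency and influences region B; nudging region A towards the analysis adds a term -(x - x_analysis)/tau to its tendency.

```python
import numpy as np


def run_toy_model(nudge_region_a, n_steps=2000, dt_hours=0.5, tau_relax_hours=6.0):
    """Integrate a toy two-region model and return its error relative to the analysis.

    State index 0 is region A (the nudged region), index 1 is region B (remote).
    All parameter values are hypothetical.
    """
    damping_hours = 48.0                    # intrinsic damping time scale
    spurious_forcing = np.array([0.10, 0.02])  # model error source in A and B (per hour)
    coupling = 0.05                         # influence of A's error on B (per hour)
    analysis = np.zeros(2)                  # the "truth" that the model drifts away from

    x = analysis.copy()
    for _ in range(n_steps):
        tendency = spurious_forcing - (x - analysis) / damping_hours
        tendency[1] += coupling * (x[0] - analysis[0])
        if nudge_region_a:
            # Newtonian relaxation of region A back towards the analysis
            tendency[0] -= (x[0] - analysis[0]) / tau_relax_hours
        x = x + dt_hours * tendency         # forward Euler step
    return x - analysis


control_error = run_toy_model(nudge_region_a=False)
nudged_error = run_toy_model(nudge_region_a=True)

# Assuming a linear response, Control minus Nudged estimates the part of the
# Control error that is forced by region A; the Nudged error is the remainder.
contribution_from_a = control_error - nudged_error
remainder = nudged_error
print(control_error, contribution_from_a, remainder)
```

In this toy case most of the error in region B disappears when region A is nudged, which is the signature that would identify region A as a remote driver. In a real model the decomposition is only indicative, because the response need not be linear and the relaxation itself can alter the circulation at the edges of the nudged domain.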

Free-running simulations
We apply this methodology first to climate simulations, using the atmosphere-only configuration GA7. We use four different nudging regions (see Fig. 14), referred to as the "Philippines", "Indonesia", "South Asian Summer Monsoon" (SASM) and "Maritime Continent" (MC) regions. For these experiments, the horizontal winds and potential temperature at all model levels are relaxed back to ERA-Interim reanalyses and the simulations are run for around 20 years, from 1/9/1988 to 1/1/2009. Figure 15(a,b) shows the climatological differences in 850 hPa winds and precipitation between the Control and Nudged experiments during JJA, for the "Philippines" and "Indonesia" regions. These results suggest that the "Indonesia" region promotes westerly anomalies extending from the South Asian monsoon westerly jet across the Philippines into the western Pacific, while the "Philippines" region promotes additional acceleration of these westerly winds as part of an anomalous cyclonic circulation that includes north-easterly anomalies over southern China. Both regions promote excess rainfall over the eastern SCS and the western Pacific. Figure 15(c,d) shows the results for the SASM and MC regions. These suggest that errors arising locally over the SASM region are directly responsible for the anticyclonic anomaly and deficit in rainfall over India and for much of the error pattern in rainfall over the equatorial Indian Ocean. The SASM region also promotes the acceleration of the westerly winds across the SCS into the western Pacific and the positive error in rainfall in those regions.
The Maritime Continent region as a whole promotes acceleration of the westerly winds and increased rainfall across the SCS and the western Pacific, and an anticyclonic anomaly that represents weakening and eastward displacement of the WNPSH region. The influence of the Maritime Continent region, and particularly the Indonesian islands, in promoting the southeasterly (easterly) wind anomalies in the eastern (central) Indian Ocean, as suggested in Sects. 3.2 and 3.3, is also confirmed by these results. This analysis suggests that there are both local and remote contributors to the ASM errors seen in the MetUM simulations. The experiments indicate that Indonesia and the oceans around the Philippines play separate, but interacting, roles in the development of these errors during the seasonal transition towards the Asian summer monsoon. The SASM region helps to reinforce those errors while also developing the majority of its circulation and rainfall errors locally.

Initialised simulations
The "nudging" methodology can also be applied in initialised simulations and used to track the influence of a particular region on the development of errors elsewhere. We show here, as an example of this methodology, the influence of the "Philippines" (PHL) region (used in Sect. 3.4.1 and displayed in Fig. 14) on the growth of remotely forced model systematic errors over China, the western Pacific and the Maritime Continent, over the first 15 days of NWP-N216 atmosphere-only simulations conducted during June-August 2016 (Fig. 16). Consistent with the analysis of GloSea5 coupled model hindcasts in Fig. 5, the total mean error (forecast minus analysis) in the surface wind for forecast days 1, 5, and 15 (see Fig. 16(a-c)) shows the gradual emergence of the systematic errors. This includes erroneous equatorial easterlies west of Sumatra, extending to 80°E, and a large error in the western Pacific, east of the Philippines, that extends north to the sub-tropics in an erroneous cyclonic pattern that reflects the weakening of the WNPSH. Other surface-wind errors are shown in the Maritime Continent, the Bay of Bengal and the western equatorial Indian Ocean off the African coast.
On day 1, the contribution of the PHL region to the total error is very small and mostly confined to the PHL region, as expected, but by day 5 of the NWP forecasts the PHL errors are responsible for forcing mean errors beyond the nudged region, such as the erroneous cyclonic wind in the western Pacific subtropics, as well as errors in the Maritime Continent. These errors are consolidated by day 15 of the forecast (Fig. 16(d-f)). For completeness, we also show the contribution to the total error from the areas outside the PHL nudging domain (Fig. 16(g-i)). A smaller area of erroneous cyclonic circulation in the Pacific occurs just south of Japan by day 5, which indicates that the systematic error in the WNPSH also has extra-tropical origins. Other wind errors not forced by the PHL region include the easterlies west of Sumatra and errors in the Bay of

Summary and Conclusions
In this study we have demonstrated the use of a range of modelling tools and techniques aimed at understanding the sources of error in monsoon regions, using the specific example of the ASM errors in the MetUM. The tools and techniques allow close examination of the error development after initialisation, the separation of the roles of local processes and remote teleconnections, the identification of the contribution from errors developing in particular regions to the ASM error as a whole, and understanding of the role of atmosphere-ocean coupling.
Our analysis suggests that there are both local and remote contributors to the EASM errors seen in the MetUM simulations. The experiments indicate that Indonesia and the oceans around the Philippines play separate, but interacting, roles in the development of these errors during the EASM season, while the SASM region (in which errors are mainly driven locally) helps to reinforce those errors. Although many of the same systematic error patterns have been found in atmosphere-only simulations (e.g. Rodriguez and Milton, 2019), SST errors also contribute, both at initialisation and through their development in a coupled response to the circulation and rainfall errors.
This study illustrates how this methodology can be used to identify the regions and model components responsible for the development of these long-standing monsoon errors. While further analysis is needed to investigate the processes involved and how they are misrepresented in the models, we have narrowed down the regions responsible (mainly the Maritime Continent and Philippines regions), which will allow us to target future detailed investigations. We have also demonstrated that it is mainly the atmospheric component that is responsible, with the ocean providing a coupled feedback which mainly exacerbates the errors. We have also shown that the development of the errors in the first few weeks depends on when the hindcasts are initialised in relation to the broadscale monsoon transition that typically occurs in mid-May. Finally, we find that these systematic errors and their development are largely insensitive to changes in horizontal resolution.
This analysis methodology benefits from the use of a seamless modelling system, where different configurations of a modelling system that are used for forecasting on different timescales share very similar physical and dynamical formulations. This allows the development of systematic errors to be studied on a range of timescales, and the roles of resolution and ocean-atmosphere coupling to be studied, without the complication of different physical parameterizations or dynamical cores that other multi-model studies might include. This approach also allows the whole suite of models to benefit from improvements that ultimately result from better understanding of the errors and informed, targeted, model development.
Our study highlights a number of different techniques that can be employed to investigate the sources of model error in a particular region. Once these are known, further work can be done to explore the local processes contributing to this behaviour and their sensitivity to changes in physical parameterizations in the model. While further work is clearly necessary, we hope that this work inspires other modelling groups to carry out similar analysis with their own models in order that some of the major, long-lasting, systematic errors in GCMs can ultimately be reduced.

Code and data availability
Due to intellectual property right restrictions, we cannot provide either the source code or documentation papers for the

Author contribution
Gill Martin initiated the study and carried out the analysis of the seasonal hindcasts and free-running climate simulations.