A daily highest air temperature estimation method and spatial–temporal changes analysis of high temperature in China from 1979 to 2018

. The daily highest air temperature ( T max ) is a key parameter for global and regional high temperature analysis which is very difﬁcult to obtain in areas where there are no meteorological observation stations. This study proposes an estimation framework for obtaining high-precision T max . Firstly, we build a near-surface air temperature diurnal variation model to estimate T max with a spatial resolution of 0.1 ◦ for China from 1979 to 2018 based on multi-source data. Then, in order to further improve the estimation accuracy, we divided China into six regions according to


Introduction
In the context of global warming, the frequency of hightemperature events is increasing and high temperature tends to increase electricity demand and energy consumption (Zhang et al., 2021;Sathaye et al., 2013), adversely affecting human health, social economy and ecosystem (Sehra et al., 2020;Basu, 2009;Gasparrini and Armstrong, 2011). The daily highest air temperature (T max ) is the basic parameter for studying regional-scale high-temperature events. It has a great influence on the ozone concentration (Abdullah et al., 2017;Kleinert et al., 2021) and the start time of the plant growth season on the Tibetan Plateau (Yang et al., 2017). T max is not only an important factor for high temperature dis-Published by Copernicus Publications on behalf of the European Geosciences Union.
aster risk assessment but also a key input parameter for crop growth models and carbon emission models. Sustained and abnormally high T max will cause high temperature heat damage and adversely affect crop growth. Therefore, it is very important to accurately obtain the temporal and spatial distribution of T max and study the characteristics of high temperature weather. Generally, T max is measured on a thermometer in a louvered box 1.5 m above the ground in the field. However, the T max measured by this method has high accuracy but not spatial continuity. Therefore, some scholars spatialized the station-based T max through methods such as Kriging interpolation and spline function interpolation. However, the number of meteorological stations is limited, and stations in remote areas and areas with complex terrain are even sparser which makes the accuracy of T max obtained by interpolation difficult to meet the requirements of regional-scale research in China.
In order to obtain information about the spatial distribution of the T max , many scholars began to use satellite remote sensing to solve this problem. There are three common remote sensing methods to estimate T max . The first method is regression analysis which uses the correlation between retrieved land surface temperature (LST) and T max to establish a regression model to estimate T max (Shen and Leptoukh, 2011;Evrendilek et al., 2012;Lin et al., 2012). The second method is machine learning which can flexibly estimate T max in urban areas with complex features (Yoo et al., 2018). The third method is to use a diurnal temperature change model to extend the instantaneous air temperature (T a ) to calculate T max either by the temperature-vegetation index (TVX) method (Wloczyk et al., 2011;Zhu et al., 2013), the energy balance method (Sun et al., 2005;Zhu et al., 2017), the atmospheric temperature profile extrapolation method (Fabiola and Mario, 2010) or other methods. The above methods of estimating T max with LST can better reflect the spatial distribution of T max but regression analysis and machine learning require sufficient and representative samples and the established model is not universal. TVX cannot estimate T a at night and in sparse vegetation areas. Many parameters required by the energy balance method cannot usually be obtained by remote sensing technology. The estimation accuracy of atmospheric temperature profile extrapolation method is greatly affected by the accuracy of the atmospheric temperature profile. The China Meteorological Administration (CMA) provided the grid dataset of daily surface temperature in China (V2.0) which contains T max data but the spatial resolution of the data is only 0.5 • and the data accuracy in local areas need to be improved. Therefore, a new method for estimating T max needs to be proposed.
At present, most researches mainly used the extreme climate indices defined by the Expert Team on Climate Change Detection and Indices (ETCCDI) to analyze the temporal and spatial distribution characteristics of high temperature and its changing laws (Khan et al., 2018;Mcgree et al., 2019;Poudel et al., 2020;Ruml et al., 2017;Salman et al., 2017;Zhang et al., 2019). Zhou et al. (2016) analyzed the temperature indices changes in China from 1961 to 2010 and the results indicated that the warm extremes in China exhibited an increasing trend. In addition, the researchers analyzed the characteristics of high temperature changes in the Three River Headwaters, Yangtze River basin, Loess Plateau, Inner Mongolia and Songhua River basin (Ding et al., 2018;Guan et al., 2015;Sun et al., 2016;Tong et al., 2019;Zhong et al., 2017). In addition to analyzing the temporal and spatial changes of high temperature events, many scholars have also studied the influencing factors of high temperature events. Studies showed that extreme high temperature over China was related to abnormal atmospheric circulation disturbances (You et al., 2011;Zhong et al., 2017) and abnormal sea surface temperature (Y. L. Wu et al., 2011). However, previous studies on the cause of high temperature events usually only analyzed the correlation between atmospheric circulation modes and the temperature indices along the time dimension without considering the spatial characteristics of the correlation.
From the above analysis, most of the researches mainly used the meteorological observation temperature data interpolation to analyze local temperature changes, and as far as we know, no one constructed continuous high-temporal resolution T max for high temperature analysis in China. In order to better study regional high temperature events, this study proposes an estimation framework for obtaining highprecision T max . Firstly, we used multi-source data and established near-surface T a diurnal variation model to build T max dataset in China from 1979 to 2018 with a spatial resolution of 0.1 • . To further improve the accuracy, we divided China into six regions according to climate conditions and topography, and established calibration models, respectively. On this basis, we further analyzed the spatial-temporal variation characteristics of T max and corresponding influencing factors in China. This can provide evidence for mitigating global climate change and reducing regional carbon emissions for environmental protection.

Study area
In order to establish a more high-precision T max dataset to analyze the temporal and spatial characteristics of hightemperature in China, we divided China into six regions mainly based on topographic conditions (elevation) and climatic conditions (T a and precipitation) as shown in Fig. 1. (i) The northeast region has a temperate monsoon climate. Affected by the monsoon, T a in the southern part of the region is higher than that in the north in winter. The topography of this area is dominated by plains, hills and mountains. Due to the influence of topography, the variability of T a is large in local areas. (ii) The northwestern region is dominated by a temperate continental climate (cold in winter and hot in summer) with a large annual and daily T a range. The area exhibits little annual precipitation which decreases from east to west. The topography of this area is dominated by plateau basins and rivers are scarce. (iii) North China is located in a semi-humid and humid zone in the warm temperate zone. Precipitation is mainly concentrated in summer. This area is dominated by plains and plateaus, bounded by Taihang Mountain, the Loess Plateau in the west and the North China Plain in the east. (iv) The southeast region is dominated by mountains and hills which belongs to the warm and humid subtropical oceanic monsoon climate zone and the tropical monsoon climate zone. The climate is mild with an annual average T a of 17-21 • C and an average rainfall of 1400-2000 mm. (v) The southwestern region has a subtropical monsoon climate affected by the southeast monsoon and southwest monsoon. It is hot and rainy in summers and the landforms in this area are dominated by plateaus and mountains. (vi) The Qinghai-Tibet Plateau is located in southwest China with an average elevation of more than 4000 m. The towering terrain has a great impact on the climate in eastern and southwestern China. It has a plateau mountainous climate with cold winters and warm summers with aridity and little rain throughout the year.

China Meteorological Forcing Dataset (CMFD)
CMFD is developed by the Hydro-meteorological Research Group of the Institute of Tibetan Plateau Research, Chinese Academy of Sciences. The dataset can be obtained from the National Qinghai-Tibet Plateau Science Data Center (https: //data.tpdc.ac.cn/, last access: 9 December 2020). The nearsurface T a data of CMFD have a time resolution of 3 h and a spatial resolution of 0.1 • , and their accuracy in China is better than Global Land Data Assimilation System (GLDAS) data (He et al., 2020). CMFD data used ANUSPLIN software to interpolate the difference between GLDAS T a data and the measured T a data to obtain grid data, and then the difference grid data and the spatially downscaled GLDAS T a data were spatially added to generate high resolution T a data. The T a data of CMFD have been widely used in climate simulation, hydrological simulation, vegetation greenness research and cross-validation of new T a datasets (Luan et al., 2020;Gu et al., 2020;Wang et al., 2020). Although this dataset has become one of the most widely used climate datasets in China, it does not provide the T max value. In order to perform high temperature analysis, we need to construct a T max dataset.

ERA5 data
ERA5 data are the fifth generation of global climate reanalysis data produced by the European Centre for Medium-range Weather Forecast (ECMWF) after ERA-Interim. The model version used by ERA5 is IFS Cycle 41r2 and its spatialtemporal resolution and number of vertical layers are much higher than the ERA-Interim data (Hoffmann et al., 2019;Urraca et al., 2018;Hersbach et al., 2020). ERA5 reanalysis data provide a variety of meteorological elements including atmospheric parameters, land parameters and ocean parameters spanning a time range from 1950 to present. The data can be obtained from Copernicus Climate Data Store (https://cds.climate.copernicus.eu/, last access: 30 December 2020). The ERA5 dataset also does not provide the T max . This study used T a data from 1979 to 2018 with a time resolution of 1 h and a spatial resolution of 0.25 • to help build a T max estimation model to generate T max value, and we sampled the ERA5 data to the same spatial resolution as the CMFD data.

Meteorological station data
T max data from the China Surface Climatic Data Daily Dataset (V3.0) from 1979 to 2018 were used to verify the accuracy of T max estimations. The hourly T a observation data from China meteorological stations were used to determine the occurrence times of T max and daily lowest air temperature (T min ). Both datasets are from CMA National Meteorological Information Center (http://data.cma.cn/, last access: 9 December 2020). The data were subjected to preliminary quality control and evaluation by CMA, and all elements in the observational data are of high quality and completeness with the validity rate generally above 99 %. These datasets have been widely used in Chinese climate research (L. C. Tong et al., 2019). To ensure the validity of the site data, manual checks were performed on all observed data including extreme value tests and spatial-temporal consistency tests, and continuous missing data due to instrument damage and other reasons were eliminated. There are 824 stations for T max observation data and 2633 stations for hourly T a obser-6062 P. Wang et al.: A daily highest air temperature estimation method vation data. After performing checks and tests, we used T max data from 760 meteorological ground stations and hourly T a data from 2421 meteorological ground stations.

Ocean climate modal indices
The ocean occupies about 71 % of the earth's surface area which has a great impact on climate change. After considering the distribution characteristics of China's land and sea, we analyzed the effects of the following ocean climate modal indices on high temperature in China: Indian Ocean basin warming (IOBW) index, North Atlantic Oscillation (NAO) index, and NINO3.4 area sea surface temperature (NINO3.4) index. Among them, the IOBW index comes from the National Climate Center of CMA (http://cmdp.ncc-cma.net/cn/ index.htm, last access: 1 April 2021) and the NAO index and NINO3.4 index are from the National Oceanic and Atmospheric Administration of the United States (https://psl.noaa. gov/data/climateindices/list/, last access: 1 April 2021). The time range of the three indices is 1979-2018 and the time scale is monthly.

T max dataset construction
At present, the data used in the research on high temperature characteristics are mostly meteorological station data or grid data obtained by interpolation of station data. A limited number of stations cannot represent the high temperature distribution at large scale. For regions where the stations are very sparse, grid data obtained by spatial interpolation can hardly meet the accuracy requirements of high temperature feature analysis. Although LST can be used to estimate T max , LST has degraded value in the presence of clouds or rainfall. Therefore, in order to obtain a T max dataset with high temporal and spatial resolution, we propose a T max construction model that combines meteorological station data and reanalysis data, and consider the T max construction under clear sky and non-clear sky conditions (see Sect. 4.1.1 for details). The data processing process is shown in Fig. 2 and the data construction model is divided into two steps: T max estimation and T max correction. First, the occurrence time of T max and T min was determined pixel by pixel (see Sect. 4.1.1 for details). Then, T max was determined according to the weather state. (1) In clear sky conditions, CMFD 3h near-surface T a data were used to construct the T a diurnal variation model which, in turn, yielded T max . (2) In non-clear sky conditions, the site and reanalysis data were used to fill pixels. Finally, the correction model was used to correct the poor quality pixels to generate the final T max dataset in China.

T max estimation
The changes of T a under different weather conditions are different. The changes of T a under clear sky conditions are relatively smooth and regular. Under non-clear sky conditions, T a changes more drastically. In order to improve the accuracy of T max estimation, we determined the occurrence time of T max and T min pixel by pixel. If there was a meteorological station at the pixel location, the analysis could be divided into two situations. (1) If hourly T a data were valid, it was directly used to determine the occurrence time of T max and T min . (2) If there was a missing value in the hourly T a data at a certain time, then we used the valid data from adjacent stations at the same time or adjacent time at the same stations to fill in the missing values. At present, there are not many meteorological stations in China and the pixels without stations account for 97.5 %. If there was no meteorological station at the pixel location, we used ERA5 hourly T a data to determine the occurrence time of T max and T min . Since the spatial resolution of the ERA5 data is lower than that of the dataset we produce, in order to match the two data spatially, we sample the two data to the same resolution and then use latitude and longitude as control conditions to match the different data.
Studies have shown that the change of T a under clear sky conditions follows a certain law: the change curve of T a during the day is close to a sine function (Ephrath et al., 1996;Johnson and Fitzpatrick, 1977;Parton and Logan, 1981;Zhu et al., 2013) so we used sine function to simulate the change of T a during the day. The appearance time of T min is t min and the appearance time of T max is t max . According to the periodicity of the sine function, the model of the change of T a during the day is obtained like Eq. (1).
Here, n is the number of CMFD near-surface T a data used to construct the T a change model in a day. CMFD can obtain T a data eight times a day. This study uses four daytime T a data to construct a T a variation model so n is 4. T ai is the nearsurface T a data at the ith time of CMFD and δ is the sum of squares of the difference between the model-estimated T a and the near-surface T a of the CMFD. Since the change of T a under non-clear-sky conditions does not conform to the sine curve change, we divided the estimation of T max under non-clear sky conditions into two situations. (1) If there was a station at the location of the pixel, the measured T max at the station was directly used as the T max of the pixel. (2) If there was no measured T max at the Table 1.
Overview of the data used in this study. pixel location, the highest value of hourly T a of ERA5 in a day was taken as T max . Then, T max determined by the ERA5 data was assigned to the pixel at the corresponding position of the T max image we established using the spatial matching method.

T max correction
The validation of T max showed some differences between the estimated T max and the measured T max . In order to further improve the accuracy of T max , the measurements taken at weather stations should be used to correct the estimated T max as shown in Fig. 3. First, determine whether there are station data at the pixel location. For pixels with stations, if the difference between the estimated T max and the measured T max is less than 1 • C, we consider the T max of this pixel to be valid. For a pixel with poor quality, if there are station data at the location of the pixel, the low-quality pixel will be replaced with the measured data from the station. If there are no sta-tion data at the pixel location, the data are corrected by linear regression method (Ninyerola et al., 2000;Zhao et al., 2020;Zheng et al., 2013). By establishing the regression relationship on each day between station T max and estimated T max , the residuals were calculated according to the measured values and T max regression predicted values, and the spatial distribution of the residuals on each day was obtained by the inverse distance weight (IDW) interpolation method. Finally, the estimated T max and the residual were added to obtain the corrected T max . The calibration model is like Eqs. (3) and (4).
Here, i and j are the row and column numbers of the image, T after (i, j ) is T max after correction, T before (i, j ) is T max before correction,ê (i, j ) is the residual, T true (i, j ) is the measured T max and T forecast (i, j ) is T max predicted by the regression model.
We used the jackknife method to randomly divide the station data into calibration and verification data (Benali et al., 2012;Zhao et al., 2020). We selected 80 % of the meteorological stations to establish the regression relationship between the measured and estimated T max values. The other 20 % of the meteorological stations were used to verify the accuracy of the corrected data. In order to improve data accuracy, the dataset used in the subsequent analysis of spatialtemporal variation of high temperature were the data corrected by all stations. Due to the different topographic and climatic characteristics of the six natural regions, the linear models of estimated T max and measured T max in each region were different. In order to obtain a higher-precision correction, the six regions were corrected separately.

Extreme temperature indices
ETCCDI proposed a set of extreme climate indices in the Climate Change Monitoring conference which became the unified standard for climate change research (Hong and Ying, 2018;Mcgree et al., 2019;Poudel et al., 2020;Zhang et al., 2019;Zhou et al., 2016). Among them, 27 indices are considered as core indices which are calculated from daily T a and precipitation data and have the characteristics of weak extremeness, low noise and strong significance. These indices comprehensively capture the frequency, intensity and duration of extreme climate events, and are recommended as the core indicators for extreme climate event analysis by the STARDEX program of the European Union (Guan et al., 2015;Ruml et al., 2017). In this study, six temperature indices related to T max were used to analyze high temperature characteristics, and their definitions are shown in Table 2. Among them, the 90th percentile in TX90p and the 10th percentile in TX10p were obtained in ascending order based on the T max data of each month during 1979-2018.

Trend analysis 4.3.1 Sen's slope estimation
In this study, the trends of T max and extreme temperature indices were calculated using Sen's slope estimation. Sen's slope estimation is a nonparametric estimation method. Even if there are some outliers in the sample, it can reliably estimate the change trend of the time series so it is widely used in trend analysis (Sen, 1968;Zhang et al., 2017). Eq. (5) is used to calculate the slope of each pair of data.
where N = n(n−1) 2 , x k and x j are the time series values of the kth andj th samples, respectively (1 ≤ j < k ≤ n). Arranging the N, K i values in ascending order, the median Sen's slope is estimated as Eq. (6).

Mann-Kendall trend test
Mann-Kendall trend test is used to test the trends of T max and extreme temperature indices. Mann-Kendall method does not require samples to follow a certain distribution and is not disturbed by a few outliers, and it can test the change trend of time series (Seenu and Jayakumar, 2021; Tan et al., 2019). Equation (7) is used to calculate the statistic of the Mann-Kendall trend test.
Here, x i and x j are the ith and j th data values of the time series, and n is the length of the time series where n is 40. Var (S) is the variance of S. The standardized statistic Z c is computed by using Eq. (10).

Mann-Kendall test for abrupt change analysis
Climate system change is an unstable and discontinuous change process, and one of the commonly used methods to test its change is the Mann-Kendall mutation test which is very effective in testing the change of elements from a relatively stable state to another state (Ruml et al., 2017). We used Mann-Kendall mutation test to test whether extreme temperature indices have mutation. For a time series x with Var where s k is the cumulative count of the number of values at time i greater than that at time j . E (s k ) and Var (s k ) are the mean and variance of the cumulative number s k , respectively. U F k is a standard normal distribution given the significance level α and can be obtained from the normal distribution table. If |U F k | > U α , this indicates that the variation trend of time series is significant. Reverse the time series x to x n x n−1 , · · ·, x 1 and repeat the above process with U B k = −U F k (k = n, n − 1, · · ·, 1).

Correlation analysis
Pearson correlation coefficient is often used to accurately measure the degree of correlation between two variables and its size can reflect the strength of the correlation of the variables. For variables x 1 x 2 , · · ·, x n and variables y 1 y 2 , · · ·, y n , the correlation coefficient between them is calculated as Eq. (16).
Here, n is the total length of the time series. The value of R is between −1 and 1. R < 0 indicates a negative correlation. R > 0 indicates a positive correlation. The closer the absolute value of R is to 1, the closer the relationship between the two elements is.

Validation of T max in each region
In order to verify the feasibility of T max estimation using the T a diurnal variation model and to analyze the accuracy of T max estimation in different regions, scatter plots of estimated T max and measured T max in six natural regions (I, II, III, IV, V and VI) were drawn according to the regional division in Fig. 1. The results are shown in Fig. 4 and the validation in each region shows that the root mean square error (RMSE) is between 2.38 and 2.94 • C, the mean absolute error (MAE) is between 1.88 and 2.45 • C and the coefficient of determination (R 2 ) is between 0.95 and 0.99. In six regions, the accuracy in region IV is the highest, while the accuracy is the lowest in region VI. As can be seen from Fig. 4, although most of the data are very accurate, some have some room for improvement. Therefore, further correction is needed to improve the accuracy of the T max dataset. The correction method in Sect. 4.1.2 was used to correct the T max estimation results of six regions separately. The comparison between T max before and after correction with the measured T max is shown in Fig. 5. It can be seen that T max corrected by the regression model is more consistent with the measured T max . The RMSE decreases from 2.38-2.94 to 1.14-1.81 • C, the MAE decreases from 1.88-2.45 to 0.84-1.38 • C and the R 2 increases from 0.96-0.99 to 0.97-0.99. The accuracy of T max is improved in each region after correction. The number of meteorological stations in region I is denser and the accuracy of T max after calibration is significantly improved. The RMSE reduced from 2.32 to 1.14 • C and the error is reduced by 51 %. The number of meteorological stations in region VI is small, and the topography is undulating and the spatial heterogeneity is large. Therefore, the accuracy in this region is still the lowest among the six natural areas after correction. In general, the corrected T max dataset has higher consistency with the measured data and can be applied to research related to regional-scale T max . Figure 6 shows the accuracy of T max before correction and T max after correction for the entire China region. It can be seen that the MAE of the corrected dataset is about 1.07 • C and the RMSE is 1.52 • C which is nearly 1 • C higher than that before correction. The accuracy evaluation result of the dataset for different years shows that the dataset in 2008 has the highest accuracy and the lowest in 2014 (Fig. 7). It can be seen from Fig. 8 that the dataset has the highest accuracy in September and the lowest accuracy in December. This may be because there is more clear sky weather in China in September and the daily temperature change curve is closer to a sine function which makes the T max estimation result more accurate.

Validation of T max in the whole China region
In general, the T max dataset has a time range of 1979-2018, in Celsius with a temporal resolution of 1d and a spatial resolution of 0.1 • . It is produced by using meteorological station data and T a reanalysis data (CMFD and ERA5) combined with diurnal variation model of T a to establish T max data, and then a correction model is constructed to further correct the data to improve the data accuracy according to different geographic partitions. The accuracy assessments indicate that the dataset exhibits high accuracy and can be used for climate change analysis in China. Figure 9 shows the annual average change of T max in each region of China during 1979-2018. The T max in each region exhibited an upward trend. However, due to the different geographical locations and the influence of atmospheric circulation in various regions, the change of T max was also different. The order of the T max increase in each region was V > IV > III > Whole > VI > II > I. The T max anomaly ranges of region I-VI and the whole China region were −1.41-1.53, −1.54-1.16, −1.47-1.12, −1.34-0.92, −0.97-1.33, −1.31-1.15 and −1.09-0.98 • C, respectively. The T max variation coefficients were 0.082, 0.045, 0.036, 0.024, 0.03, 0.088 and 0.038, respectively. It can be seen that T max fluctuated the most in region VI and the least in region IV. The minimum values of region I-VI and China region appeared in 1987, 1984, 1984, 1984, 1989, 1983 and 1984, respectively, which were distributed in the 1980s. The highest values of T max appeared in 2007, 2007, 2007, 1999and 2007, respectively. Zhai et al. (2016 found that 1999, 2007 and 2013 were among the 10 years with the highest average T a in China from 1900 to 2015. From 1998 to 2012, global surface temperature experienced a warming hiatus Li et al., 2015) and T max in all regions of China showed a downward trend during this period.

Inter-annual variability
In order to understand the spatial pattern and regional differences of T max changes with more detail in China, Sen's slope estimation was used to calculate the annual average T max change rate from 1979 to 2018 at the pixel scale (Fig. 10a). The significance test of the T max change trend was conducted by the Mann-Kendall trend test (Fig. 10b). At the same time, the average change rate of T max in each region and the area percentage of significant increase and decrease (P < 0.05) in T max were calculated (Table 3). The results indicated that the annual average T max change rate in most regions of China (78.24 % of the study area) passed the significance test with a significance level of 0.05, and 65.84 % of the pixels showed very significant changes in T max (P < 0.01). Figure 10a showed that the annual average T max in most regions of China was on the rise and the fastest   rising rate of T max was in western Yunnan. Only 8.13 % of the regions in China showed a downward trend in T max . These were concentrated mainly in the north and south of Xinjiang and the northwest and south of Tibet. Among the six regions, the average T max change rate of region V was the largest (0.38 • C per 10 years) and the average T max change rate of region I and region II was the lowest (0.31 • C per 10 years) ( Table 3).

Seasonal changes
On the basis of the annual analysis, we also analyzed the seasonal changes. The seasons are divided according to the months (spring from March to May, summer from June to August, autumn from September to November, and winter from December to February). We plotted the seasonal variation curve of T max in China from 1979 to 2018 (Fig. 11) and some information on the trend of T max changes is shown in Figure 6. Box plots of the R 2 , MAE and RMSE of comparison between T max before correction and T max after correction in the whole China region.  Table 4. The results indicated that T max in each region fluctuated the most in winter and the least in summer. The highest T max in each region in spring, summer, autumn and winter mostly occurred in 2018, 2013, 1998 and 2007, while the minimum T max in each region in spring, summer, autumn and winter mostly occurred in 1988, 1993, 1981 and 1984. In 2013, T max of region IV-VI in summer reached the highest since 1979 mainly due to the influence of the southwest monsoon, East Asian summer monsoon and other factors. Under the influence of El Niño, T max in winter in region I, II and the whole study area was the highest in 2007. Under the influence of La Niña, the minimum T max in spring and winter in most areas of China appeared in 1988 and1984, respectively. In the same season, the variation trend of T max in each region was significantly different and some even had opposite trends. However, influenced by La Niña and the Eurasian atmospheric circulation, T max in winter in each region showed a consistent decreasing trend from 2007 to 2008. As can be seen from Table 4, in spring, summer, autumn and winter, the regions with the fastest T max rise are III, I, I and VI respectively, and the regions with the lowest T max change rate are VI, VI, III and II, respectively.
In order to display the seasonal variation characteristics of T max in China more intuitively, we drew the spatial distribution of the trend of T max and conducted a significance test (Fig. 12). Meanwhile, we counted the percentage of significant increase and decrease in T max in each region (Table 5). The results indicated that the areas with increasing T max were more than those with decreasing T max in all seasons. From 1979 to 2018, the increasing trend of T max was most signifi-P. Wang et al.: A daily highest air temperature estimation method  cant in spring which accounted for 92.73 % of the total study area followed by autumn and summer, while T max increased the least in winter. Specifically, T max increased significantly in most parts of China in spring and the region where T max decreased significantly was mainly concentrated in the region VI (Fig. 12a). In summer, T max in most part of China showed a significant increasing trend, but T max in southern Xinjiang and northwestern Tibet exhibited a decreasing trend (Fig. 12b). Compared with spring and summer, the area with a significant increasing trend of T max in autumn was smaller and the regions with a significant decreasing trend of T max were mainly distributed in Xinjiang and Tibet (Fig. 12c). In winter, 79.02 % of the regions experienced an increase in T max which was significantly lower than in other seasons. A  significant increasing trend of T max was observed in the east of region IV and the southwest of regions V and VI while the areas where T max decreased significantly were mainly observed in Xinjiang (Fig. 12d). We also observed no significant decrease in T max in regions I and III in spring, I in summer, I and IV in autumn, and III in winter (Table 5). Further statistics showed that T max of the whole region III showed an upward trend in spring.

Change of time
We plotted the inter-annual variation of extreme temperature indices anomalies in various regions of China from 1979 to 2018 (Fig. 13), and used Sen's slope estimation and the Mann-Kendall trend test to calculate statistics on the trend of extreme temperature indices (Fig. 14). The results indicated that SU, TX90p, TXn and TXx increased at a rate of 3.83d/10a, 6.57d/10a, 0.11 • C per 10 years and 0.32 • C per 10 years, respectively (Fig. 14). Influenced by the strong El Niño in 1997, the SU in all regions exhibited a consistent upward trend from 1996 to 1997 (Fig. 13). The change rate of SU in all regions passed the significance test of 0.01 indicating a significant upward trend (Fig. 14). The increasing trend of TX90p in all regions was also very significant. The decadal average of TX90p in regions III-VI and the whole study area had an increasing trend, while the decadal average of TX90p in region I and region II increased first and then decreased slightly. The TXn of region II showed a weak decreasing trend and the sliding average of the 3year and 5-year periods also exhibited a weak fluctuation downward trend. TXn of other regions showed an upward trend in general and only region VI had a significant increasing trend (P < 0.05) (Fig. 14). Except for region VI, the change rate of TXx in other regions was higher than that of TXn. The rate of change of TXx exhibited that the upward trend of region VI was not significant, while all other regions passed the significance test of 0.01. During 1979-2018, ID and TX10p decreased significantly at the rate of −1.48d/10a and −3.78d/10a, respectively (P < 0.01) (Fig. 14). The ID of all regions exhibited a downward trend with region VI and the whole study area showing the most obvious decline passing the significance test of 0.01 (Fig. 14). Compared with ID, TX10p decreased more sharply and the highest value of TX10p in all regions occurred before 1988 (Fig. 13). The above results indicate that the frequency of high temperature events in China is on the rise which is in line with the expected results of global change. In addition, we also found that the occurrence time of maximum and minimum values of SU, TXn, TXx and ID during 1979-2018 was consistent with previous research results by Hong and Ying (2018) which further proved the correctness of the T max dataset constructed by us, indicating that the dataset can be used to analyze the spatial-temporal changes of high temperature in China. In order to analyze the variation rules of extreme temperature indices in China from 1979 to 2018, the Mann-Kendall  mutation test was applied to test the mutation characteristics of six extreme temperature indices at the significance level of 0.05. The results are shown in Fig. 15. We found that all the extreme temperature indices had abrupt change from 1979 to 2018 and 40 % of the years where the abrupt changes occurred were El Niño years while 46.7 % were La Niña years. This finding further confirms that China is greatly affected by global climate change. TX90p in region I-II and the whole study area displayed an abrupt change from a period with lower value to one with higher value in 1996. After mutation in region II in 2003, TXn turned from an upward trend to a downward trend but the downward trend was not obvious. The ID of the whole study area and its six sub-regions tended to increase first and then decrease.

Spatial change
The spatial distribution of the extreme temperature indices trends in China during 1979-2018 is shown in Fig. 16a-f while the area percentage of the increasing and decreasing trend of extreme temperature indices in each region is shown in Fig. 17a-f. For SU, TX90p, TXn and TXx, the area with rising trend is larger than the area with declining trend. The change of SU in most regions of China passed the significance test of 0.05 and the areas with significant increase accounted for 63.3 % of the whole study area (Fig. 17a). The regions with no significant change in SU were mainly distributed in region VI (Fig. 16a). There were few days in a year when T max exceeded 25 • C in region VI and T max in some regions was even lower than 25 • C throughout the year so the change range of SU was small. The areas with a downward trend of TX90p were mainly distributed in southern Xinjiang and northern Tibet (Fig. 16b). TX90p increased significantly in 75 % of regions in China (P < 0.05) and the area percentage of TX90p that significantly increased in region V was the largest among the six regions (Fig. 17b). The trend of TXn change in most regions of China was not significant and the significant decrease was mainly concentrated in region II and region VI (Fig. 16c). While other regions were dom-  inated by increasing trend of the TXn, 69.7 % of regions in region II showed a downward trend (Fig. 17c). For TXx, its upward trend was slightly stronger than TXn and the region with the highest change rate was located in western China (Fig. 16d). The regions with significantly decreased ID were mainly distributed in region VI (Fig. 16e). There was a declining ID in 75.7 % of the regions and 53 % of the regions passed the significance test (Fig. 17e). As far as TX10p is concerned, its cooling trend was much stronger than that of ID and the areas of significant decline were widely distributed through all regions of China (Fig. 16f). The area with a significant decrease in region IV accounted for 75.9 % of the region which was the largest among the six regions (Fig. 17f).  The correlation between T max anomalies and three climate modal indices in China during 1979-2018 is shown in Fig. 18a-c. The results show that there is a significant positive correlation between T max and IOBW in 54.18 % of the regions in China which indicates that the warming of the Indian Ocean will contribute to the warming trend of T max in these regions. T max had a moderate positive correlation (0.4 < R < 0.6, P < 0.01) with IOBW in southern Yunnan and eastern Hainan (Fig. 18a). T max and NAO had a significant positive correlation in northeast China but the correlation was very weak (R < 0.2). The percentage of T max anomaly value negatively correlated with NAO (16.55 %) was higher than that of NAO positively correlated (5.27 %) mainly distributed in the west and south of region II, west of region III, south of region IV and V, and northeast of region VI. This indicated that the positive phase of NAO contributes to the decrease in T max in these regions (Fig. 18b). T max was significantly positively correlated with NINO3.4 in southern China, central Xinjiang and southern Gansu indicating that El Niño events will lead to higher temperatures in these regions. In western China and the middle part of region IV, T max was significantly negatively correlated with NINO3.4 indicating that El Niño events will lead to cooling phenomena in these regions (Fig. 18c). The region with significant positive correlation between the SU and IOBW accounted for 42.67 % of the whole study area which indicated that a warming Indian Ocean would lead to the number of days over 25 • C in these regions to increase. Significant negative correlations were found in northwest and southeast Tibet and the mountainous regions of southern Xinjiang (Fig. 19a). The area with the largest correlation coefficient is in the northeast of Hainan (R = 0.48). The significant negative correlation between TX90p and IOBW was mainly distributed in region VI but the negative correlation was not strong (|R| < 0.4) (Fig. 19b). The correlation coefficient between TXn and IOBW ranged from −0.34 to 0.34 and the regions with significant positive correlation accounted for 16.65 % of the whole study area. TXn and IOBW were significantly negatively correlated mainly in western China (Fig. 19c).
Compared with TXn, the regions with significant correlation between TXx and IOBW were more widely distributed in China among which the correlation coefficients in southern Yunnan and eastern Hainan were moderately positive (0.4 < R < 0.6) (Fig. 19d). ID and TX10p were negatively correlated with IOBW in most of China. The regions with significant negative correlation between ID and IOBW were mainly distributed in region VI and the regions with significant positive correlation were mainly distributed in the west of region II (Fig. 19e). TX10p has a significant negative correlation with IOBW in more areas than ID and the significant positive correlation was mainly located in western China (Fig. 19f). The influence of NAO on the extreme temperature indices is shown in Fig. 20a-f. SU, TX90p, TXx and TXn were negatively correlated with the NAO more than they were positively correlated with NAO indicating that the positive phase of NAO would lead to the decline of SU, TX90p, TXx and TXn over most of China. SU and NAO had a significant positive correlation in southern Xinjiang, western Tibet, northern Qinghai and northern Guizhou but the correlation was very weak (R < 0.2). There was no significant correlation between SU and NAO in southern Qinghai, which was con-    sistent with previous observations (Ding et al., 2018). The region with the strongest negative correlation between SU and NAO was located in Tibet (R = −0.18) (Fig. 20a). TX90p had a weak negative correlation with NAO in eastern Xinjiang (R = −0.22, P < 0.01). TX90p was significantly positively correlated with NAO in the west and south of region VI but the correlation was extremely weak (Fig. 20b). Shi et al. (2019) indicated that more regions had a significant positive correlation between TXn and NAO in China than a significant negative correlation which was consistent with our results. The areas of TXn that had a significant positive correlation with NAO were mainly distributed in northeast China while the regions with significant negative correlation were mainly located in central Tibet, eastern Qinghai and Yunnan (Fig. 20c). The correlation coefficient between TXx and NAO varied from −0.16 to 0.21. The regions with significant positive correlation between TXx and NAO were mainly located in Tibet and the region with the strongest correlation was located in southern Tibet (Fig. 20d). The areas of ID that were significantly positively correlated with NAO accounted for 5.86 % of the whole study area and the strongest correlation was found in western Xinjiang (R = 0.23). The regions with significant negative correlation between ID and NAO were mainly distributed in eastern and northeastern China  ( Fig. 20e). Sun et al. (2016) found a very weak positive correlation between TX10p and NAO in the Loess Plateau which was consistent with our results. The regions with a significant negative correlation were mainly concentrated in northeastern China (Fig. 20f).
Figure 21a-f shows the correlation between NINO3.4 and extreme temperature indices. The regions with significant positive correlation between SU and NINO3.4 were mainly distributed in eastern China indicating that the events of El Niño would lead to an upward trend of SU in these regions. There were few regions with significant negative correlation between SU and NINO3.4, only accounting for 1.15 % of the entire research area, mainly distributed in southeast Tibet and southwest Yunnan (Fig. 21a). The correlation coefficient between TX90p and NINO3.4 was −0.19-0.26. The areas of TX90p that had a significant negative correlation with NINO3.4 were mainly distributed in region IV and VI (Fig. 21b). There was a significant negative correlation between TXn and NINO3.4 in 24.59 % of regions and the region with the strongest negative correlation was located in Tibet (R = −0.25). TXn was positively correlated with NINO3.4 in only 10.46 % of regions in China and the region with the largest correlation coefficient was northwest Xinjiang (R = 0.11) (Fig. 21c). There was a weak positive correlation between TXx and NINO3.4 in southern Guangdong and northern Hainan (0.2 < R < 0.4). The regions of TXx significantly negatively correlated with NINO3.4 were mainly distributed in the south of region V and region VI (Fig. 21d). The significant negative correlation between ID and NINO3.4 was mainly concentrated in southern China. The areas with significant positive correlation were mainly distributed in the western region II and southern region VI, and the region with the strongest correlation was located in the western Sichuan (R = 0.31) (Fig. 21e). TX10p in most regions of regional VI was significantly affected by NINO3.4 and the significant positive correlation area accounted for 69.31 % of the whole region VI. TX10p was significantly negatively correlated with NINO3.4 in only 0.65 % of regions in China mainly distributed in Hainan and southern Gansu (Fig. 21f).

Conclusions
The global temperature continues to rise and extreme weather events continue to increase (IPCC, 2021). It is of great significance to study regional high temperature changes. In order to obtain the key parameters of high temperature spatial-temporal variation analysis, this study proposed a daily T max estimation frame based on the nearsurface T a grid data and T a diurnal variation model to build a T max dataset in China from 1979 to 2018. Validation of T max estimation data in six natural regions indicated that the RMSE of each region was between 2.38 and 2.94 • C, the MAE was between 1.88 and 2.45 • C and R 2 was between 0.95 and 0.99. After using the regression model to calibrate the dataset, the accuracy of the estimated T max has been significantly improved. The RMSE of the T max after calibration reduced to 1.14-1.81 • C, the MAE reduced to 0.84-1.38 • C and the R 2 increased to 0.97-0.99.
This dataset was used to study the spatial-temporal variation characteristics of T max and the corresponding influencing factors in China, and to discuss the correlation between T max , extreme temperature indices and ocean climate modal indices. T max in all regions of China exhibited an upward trend from 1979 to 2018 with the largest rise in region V and the smallest rise in region I. In spring, T max in China increased significantly in most regions and region III has the fastest rising speed. In winter, T max in China had the least significant rise and region II had the slowest rise rate. SU, TX90p and TXx in all regions showed an upward trend. Except for region II, TXn in other regions also exhibited an upward trend while ID and TX10p in all regions showed a downward trend. All extreme temperature indices had abrupt changes during 1979-2018, and most of the abrupt changes occurred in El Niño or La Niña years. The region with the largest increase in SU, TX90p and TXx and the region with the largest decrease in TX10p were located in western Yunnan. The correlation analysis between T max and extreme temperature indices and ocean climate modal indices indicated that the increase in the IOBW usually coincides with the increase in T max , SU, TX90p, TXn and TXx and the decrease in ID and TX10p. NAO had the opposite relationships. In most regions of China, T max , SU, TX90p and TXn were negatively correlated with NINO.3.4, while TXx, ID and TX10p were positively correlated with NINO.3.4.
The T max dataset we produced can not only be used as the input parameters of climate change models, crop growth models and carbon emission models but also can be used to evaluate the risk of high temperature disasters which has high practical value. Currently, due to the limitation of the temporal and spatial scope of the basic data, we have only produced the dataset of China. If global station data and temperature data can be obtained in the future, we can continue to produce T max dataset on a global scale. The analysis of regional high temperature temporal and spatial changes shows that the temperature changes in different regions of China are inconsistent, the mechanism that affects the temperature rise is different in different regions and some regions are highly correlated with ocean temperature changes. China is located in the eastern Eurasian continent and the western Pacific Ocean. With the influence of the unique topography of the Qinghai-Tibet Plateau, China's climate system is very complex. The temperature change in China is affected by a combination of factors and the ocean is only one of the factors affecting the temperature change in China. Our study found that the influence of the ocean on China's temperature change is not particularly strong and we can continue to study the driving factors that have a strong impact on China's climate change in the future. In order to strengthen environmental protection