Thesis information

Thesis title (Chinese):

 Research on PM2.5 Model Construction for North China Based on Machine Learning (基于机器学习的华北地区PM2.5模型构建研究)

Name:

 王志豪 (Wang Zhihao)

Student ID:

 20210226047    

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 085700    

Discipline name:

 Engineering - Resources and Environment

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2023    

Training institution:

 Xi'an University of Science and Technology (西安科技大学)

School/Department:

 College of Surveying and Mapping Science and Technology (测绘科学与技术学院)

Major:

 Surveying and Mapping Engineering

Research direction:

 Atmospheric environment monitoring

First supervisor:

 陈鹏 (Chen Peng)

First supervisor's institution:

 Xi'an University of Science and Technology (西安科技大学)

Thesis submission date:

 2023-06-15    

Thesis defense date:

 2023-06-03    

Thesis title (English):

 Research on the Construction of PM2.5 Model in North China Based on Machine Learning    

Keywords (Chinese):

 ERA5 ; water vapor pressure ; precipitable water vapor ; machine learning ; PM2.5

Keywords (English):

 ERA5 ; Water vapor pressure ; Precipitable water vapor ; Machine learning ; PM2.5

Thesis abstract (Chinese):

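PM2.5 is the main component of most haze, and prolonged exposure to high PM2.5 concentrations seriously harms human health, so research on PM2.5 is urgently needed. PM2.5 is traditionally monitored at ground stations, which gives low spatial resolution. Most researchers estimate ground-level PM2.5 from Moderate Resolution Imaging Spectroradiometer (MODIS) aerosol optical depth products retrieved with the dark-target algorithm; this approach produces missing values over highly reflective surfaces such as cities, which lowers the spatiotemporal resolution, and such studies rarely analyze the correlation between meteorological parameters and PM2.5 systematically and comprehensively. Using the fifth-generation reanalysis dataset (ECMWF Reanalysis v5, ERA5) released by the European Centre for Medium-Range Weather Forecasts (ECMWF), this thesis proposes a machine-learning method for estimating PM2.5 (Back Propagation Neural Network, BPNN; Random Forest, RF) that offers high spatiotemporal resolution and fewer data gaps. It also systematically analyzes the correlations between multiple meteorological parameters and PM2.5 and how they vary. Air quality reports from China's Ministry of Ecology and Environment show that North China is one of the most polluted regions in China, and Beijing is both located in North China and the country's political and cultural center, so Beijing and North China are each taken as study areas. The specific research content is as follows: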
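(1) For Beijing, correlations with PM2.5 were analyzed separately for pollution factors (O3, CO, NO2, SO2 and PM10) and for meteorological factors including precipitable water vapor (PWV), water vapor pressure (WVP) and relative humidity (RH). The pollution factors correlate strongly with PM2.5 over the whole year. The meteorological factors correlate poorly over the year and strongly only in winter; PWV, RH and WVP correlate more strongly than the other meteorological factors, with winter correlations all above 0.5. The Beijing PM2.5 model was built in three stages. First, an annual PM2.5 model was built from meteorological and pollution parameters; BPNN and RF reached R2 of 0.94 and 0.96 and RMSE of 10.37 and 8.77 µg/m3, respectively. Second, because the pollution parameters contain missing values and have low spatial resolution, a winter PM2.5 model was built using only ERA5 meteorological data, which have high spatiotemporal resolution; the RF model's R2 (0.93) was 0.05 higher than that of BPNN and its RMSE was 4.19 µg/m3 lower, showing that building a PM2.5 model from meteorological parameters alone is feasible. Finally, the second-stage RF model and ERA5 meteorological data at 0.125° spatial resolution (obtained by cubic spline interpolation) were used to generate hourly PM2.5 maps of Beijing, which were compared with the China High Air Pollutants (CHAP) dataset; for Beijing, R2 and RMSE were 0.78 and 14.78 µg/m3, respectively. The Beijing PM2.5 maps show that high concentrations occur in the areas bordering Hebei, and that regional transport and human activity are among the important causes of air pollution.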
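(2) For North China, a PM2.5 model was built from meteorological data derived from the ERA5 dataset, also taking into account vegetation data, a digital elevation model, land cover type and population data. The monthly correlations of the meteorological data were analyzed first for winter (January, February, November and December): the meteorological factors correlate most strongly in December and most weakly in January. Among them WVP has the highest correlation, followed by PWV and RH; these three factors are more sensitive to PM2.5 changes than the others, with correlations all above 0.3. Based on these correlations, PM2.5 models were built with RF and BPNN. BPNN and RF reached R2 of 0.80 and 0.95, RMSE of 25.42 and 13.23 µg/m3, and Cor of 0.90 and 0.97, respectively, showing that RF is more accurate than BPNN and better suited to building PM2.5 models with high spatiotemporal resolution. Analysis of the RF model's hour-by-hour accuracy, together with hourly trend plots of PM2.5 and R2, shows that sharp changes in PM2.5 reduce model accuracy. Separate models were also built for each month; the RMSE for January is higher than for the other three months because PM2.5 concentrations are highest in January.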
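
The month-by-month correlation analysis described in item (2) can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient was used, so Pearson correlation is an assumption here; the function names, the record layout and the sample values are all hypothetical and are not data from the thesis.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def monthly_correlation(records, factor):
    """Group (month, factors, pm25) records by month, then correlate
    one meteorological factor with PM2.5 inside each month."""
    by_month = defaultdict(lambda: ([], []))
    for month, factors, pm25 in records:
        by_month[month][0].append(factors[factor])
        by_month[month][1].append(pm25)
    return {m: pearson(xs, ys) for m, (xs, ys) in sorted(by_month.items())}


# Made-up samples: (month, {factor: value}, PM2.5 in µg/m3). Not thesis data.
records = [
    (12, {"WVP": 2.0}, 40.0),
    (12, {"WVP": 3.0}, 60.0),
    (12, {"WVP": 4.0}, 80.0),
    (1, {"WVP": 1.0}, 90.0),
    (1, {"WVP": 2.0}, 70.0),
    (1, {"WVP": 3.0}, 95.0),
]
result = monthly_correlation(records, "WVP")
for month, r in result.items():
    print(f"month {month:2d}: r = {r:.3f}")
```

Grouping before correlating is what makes a weak annual correlation compatible with a strong winter one: each month is scored only against its own samples.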

Thesis abstract (English):

PM2.5 is the main component of most haze, and the long-term presence of high PM2.5 concentrations in the air seriously affects human health, so research on PM2.5 is urgently needed. Traditional PM2.5 monitoring relies on ground stations, which have low spatial resolution. Other studies have estimated ground-level PM2.5 from Moderate Resolution Imaging Spectroradiometer aerosol optical depth products retrieved with the dark-target algorithm. However, this approach produces missing values over highly reflective areas such as cities, which reduces the spatial and temporal resolution, and there has been little systematic and comprehensive analysis of the correlation between meteorological parameters and PM2.5. Using the fifth-generation reanalysis (ERA5) dataset released by the European Centre for Medium-Range Weather Forecasts (ECMWF), this thesis proposes a machine-learning method for estimating PM2.5 (Back Propagation Neural Network, BPNN, and Random Forest, RF) that has high spatial resolution and reduces data loss. It also systematically analyzes the correlations between various meteorological parameters and PM2.5 and their variation patterns. According to the Ministry of Ecology and Environment of the People's Republic of China, North China is one of the most polluted regions in China. Beijing is not only located in North China but is also a political and cultural center of China. Therefore, Beijing and North China are taken as the study areas for PM2.5 research. The specific research contents are as follows:
(1) Correlation analysis of pollution factors (O3, CO, NO2, SO2, and PM10) and meteorological factors (precipitable water vapor, PWV; water vapor pressure, WVP; and relative humidity, RH) with PM2.5 in Beijing showed that the annual correlation between pollution factors and PM2.5 is high. The annual correlation of meteorological factors is poor and is high only in winter. The correlations of PWV, RH, and WVP are higher than those of the other meteorological factors and exceed 0.5 in winter. The PM2.5 model for Beijing was established in three stages. First, meteorological and pollution parameters were used to establish an annual PM2.5 model, with R2 of 0.94 and 0.96 and RMSE of 10.37 and 8.77 µg/m3 for BPNN and RF, respectively. Second, because the pollution parameters have missing values and low spatial resolution, only ERA5 meteorological data with high spatial and temporal resolution were used to establish a winter PM2.5 model. The R2 of the RF model (0.93) is 0.05 higher than that of the BPNN, and its RMSE (12.50 µg/m3) is 4.19 µg/m3 lower. The results indicate that it is feasible to establish a PM2.5 model using only meteorological parameters. Finally, hourly PM2.5 regional maps of Beijing were generated using the second-stage RF model and ERA5 meteorological data with a spatial resolution of 0.125° (obtained by cubic spline interpolation) and compared with the China High Air Pollutants (CHAP) dataset. The R2 and RMSE for Beijing were 0.78 and 14.78 µg/m3, respectively. The PM2.5 maps of Beijing show that the areas with high PM2.5 concentrations border Hebei, and that regional transport and human activities are important causes of air pollution.
(2) Taking North China as an example, meteorological data obtained from the ERA5 dataset were used to establish a PM2.5 model, taking into account the impact of vegetation data, a digital elevation model (DEM), land cover types, and population data. First, the monthly correlation of meteorological data in winter (January, February, November, and December) was analyzed. The correlation of meteorological factors is highest in December and lowest in January. Among the meteorological factors, WVP has the highest correlation, followed by PWV and RH. These three factors are more sensitive to changes in PM2.5 than the other meteorological factors, with correlations greater than 0.3. Based on the correlation of the meteorological data, PM2.5 models were established using RF and BPNN. The R2 of the BPNN and RF models was 0.80 and 0.95, the RMSE was 25.42 and 13.23 µg/m3, and the Cor was 0.90 and 0.97, respectively. The results show that the RF model is more accurate than BPNN and has more advantages in establishing a PM2.5 model with high spatiotemporal resolution. Analysis of the hourly accuracy changes of the RF model, together with the hourly PM2.5 and R2 trend charts, shows that drastic changes in PM2.5 reduce model accuracy. A PM2.5 model was also established for each month. The RMSE in January is higher than that in the other three months because January has the highest PM2.5 concentration.
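
The three scores quoted above (R2, RMSE and Cor) can be sketched as follows. The abstract does not give the formulas, so this assumes the common definitions: R2 as the coefficient of determination (1 minus residual over total sum of squares), RMSE as root-mean-square error, and Cor as the Pearson correlation between observed and predicted values. The function names and the two prediction series are hypothetical and do not reproduce the thesis results.

```python
import math


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root-mean-square error, in the same unit as the inputs."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def cor(obs, pred):
    """Pearson correlation between observations and predictions."""
    n = len(obs)
    mo = sum(obs) / n
    mp = sum(pred) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    soo = sum((o - mo) ** 2 for o in obs)
    spp = sum((p - mp) ** 2 for p in pred)
    return sop / math.sqrt(soo * spp)


# Made-up PM2.5 values in µg/m3; "model_a" and "model_b" are placeholders
# for any two models being compared, not the thesis's RF and BPNN outputs.
observed = [30.0, 50.0, 70.0, 90.0]
model_a = [32.0, 48.0, 73.0, 87.0]
model_b = [40.0, 45.0, 60.0, 100.0]

scores = {
    name: (r2_score(observed, pred), rmse(observed, pred), cor(observed, pred))
    for name, pred in (("model_a", model_a), ("model_b", model_b))
}
for name, (r2, err, c) in scores.items():
    print(f"{name}: R2={r2:.4f} RMSE={err:.2f} Cor={c:.4f}")
```

Because RMSE is expressed in µg/m3, it grows with the magnitude of the concentrations being predicted, which is consistent with the abstract's remark that the month with the highest PM2.5 also has the highest RMSE.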

Chinese Library Classification (CLC) number:

 P412.292    

Public release date:

 2023-06-15    
