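Title: | 基于多源遥感的冬小麦分布提取及产量估算研究 |
Author: | |
Student ID: | 21210226071 |
Confidentiality level: | Secret |
Language: | chi |
Discipline code: | 085700 |
Discipline: | Engineering - Resources and Environment |
Student type: | Master's |
Degree: | Master of Engineering |
Degree year: | 2024 |
Institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Agricultural remote sensing |
Supervisor name: | |
Supervisor affiliation: | |
Submission date: | 2024-06-17 |
Defense date: | 2024-05-31 |
Foreign-language title: | Research on Winter Wheat Distribution Extraction and Yield Estimation Based on Multi-Source Remote Sensing |
Keywords: | Winter wheat distribution extraction ; Dynamic Time Warping (DTW) distance ; Machine learning ; Yield ; Multi-source indicators |
Foreign-language keywords: | Winter wheat distribution extraction ; Dynamic Time Warping (DTW) distance ; Machine learning ; Yield ; Multi-source indicators |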
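Abstract: |
Obtaining the accurate spatial distribution of crops and estimating crop yield are essential for crop management and agricultural policy adjustment, and are important for ensuring national food security and stable social development. China ranks first in the world in wheat production, and Henan Province, one of the country's main wheat-producing regions, accounts for about 25% of national winter wheat production. In conventional remote-sensing studies of winter wheat planting distribution, classification samples are usually selected manually at considerable cost in labor, materials, and funding, so fine-scale extraction of winter wheat distribution over large areas still faces many challenges. In addition, winter wheat yield is affected by many types of factors, yet previous remote-sensing yield estimation studies have mainly considered vegetation and meteorological factors and have paid little attention to environmental factors and natural disasters. Selecting yield estimation variables comprehensively and rationally is therefore important for improving the accuracy of winter wheat yield estimation. The results of this study can provide a scientific reference for managing winter wheat production and planting in Henan Province. Accordingly, this study takes Henan Province as the study area and uses Landsat 8 OLI, Sentinel-2, and MODIS data, Google Earth samples, meteorological indicators, crop growth status indicators, environmental factors, and drought indicators. A single-temporal spectral threshold model is built from Landsat 8 OLI to extract the winter wheat distribution. To ensure that the extraction results can be compared, the study also combines automatically generated samples with machine-learning classification. Samples are generated automatically in two ways: the first judges samples automatically from existing winter wheat planting distribution data, and the second computes the Dynamic Time Warping (DTW) distance between the standard winter wheat phenology curve and the phenology curve of each sample to be classified, and generates the required sample points from that distance. Using these sample points together with vegetation index, band, texture, and terrain features, Support Vector Machine (SVM) and Random Forest (RF) classifiers are each applied to extract the winter wheat planting distribution accurately. The extraction results are compared to select the best classification method and obtain the best winter wheat distribution map. On this basis, the four categories of indicators (meteorological, crop growth status, environmental, and drought) are masked with the winter wheat distribution. Combinations of the four categories of estimation factors are then paired with Partial Least Squares Regression (PLSR), ExtraTrees, RF, and Gradient Boosting Decision Tree (GBDT) methods to identify the best indicator combination, algorithm, and estimation period for winter wheat yield, and the robustness of the model predictions is evaluated to determine the best yield estimation scheme. The main conclusions are as follows: (1) In the single-temporal spectral threshold model, for Landsat 8 OLI imagery, winter wheat can be separated from other land cover when the Modified Normalized Difference Water Index (MNDWI) < 0.33, the shortwave infrared 2 (SWIR2) band B7 < 0.13, the near-infrared (NIR) band B5 > 0.13, the Normalized Difference Vegetation Index (NDVI) > 0.57, and the Difference Index (DI) > 0.203; the intersection of these conditions gives the winter wheat planting distribution. Validation against visually interpreted sample points from Google Earth and Sentinel-2 gave an overall classification accuracy of 89.97% and a Kappa coefficient of 0.70, with a producer's accuracy of 70.25% and a user's accuracy of 83.84% for winter wheat; validation against 9 sample plots showed broadly consistent spatial distribution. The R² between county-level winter wheat planting area in the statistical yearbook and the extracted planting area was 0.783. (2) The large number of winter wheat training samples generated automatically within the study area were evenly distributed and of good quality, which supports high classification accuracy. Validation against Sentinel-2 and Google Earth samples showed that multi-temporal automatically generated training samples combined with the Random Forest (RF) classifier gave the best winter wheat extraction of the methods compared, with an overall accuracy of 93.50%, a Kappa coefficient of 0.85, and a producer's accuracy for winter wheat as high as 99.30%. Comparing the extracted winter wheat area with the statistical yearbook area gave an R² of 0.864. (3) Estimation accuracy was highest when all four categories of estimation factors (meteorological, crop growth status, environmental, and drought indicators) were combined with the Gradient Boosting Decision Tree (GBDT) algorithm (R² = 0.969, RRMSE = 10.67%). As data from later winter wheat growth stages were added, R² reached its maximum of 0.972 in April, with the lowest RRMSE of 10.4%, after which R² fell to 0.969. The best estimation period therefore uses data from October to April of the following year; because winter wheat in Henan Province is harvested in June, yield can be estimated accurately two months before harvest. As the number of input indicator categories increased, the global Moran's index of the relative error of estimated winter wheat yield approached 0, which indicates that combining multiple indicators improves the spatial robustness of the yield prediction model. In summary, the best remote-sensing yield estimation scheme for winter wheat is to use data from October 2018 to April 2019 and to model yield with the GBDT algorithm from the combination of the four categories of indicators. Keywords: Winter wheat distribution extraction; Dynamic Time Warping (DTW) distance; Machine learning; Yield; Multi-source indicators Research type: Applied research |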
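The single-temporal threshold rule in conclusion (1) of the abstract above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the `Pixel` type and the function names are hypothetical, reflectances are assumed to be scaled to 0-1, NDVI and MNDWI are assumed to follow their standard normalized-difference definitions on Landsat 8 OLI bands, and DI is taken as a precomputed input because the abstract does not give its formula.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    """Reflectance values (assumed 0-1) for one Landsat 8 OLI pixel."""
    green: float  # B3
    red: float    # B4
    nir: float    # B5
    swir1: float  # B6
    swir2: float  # B7
    di: float     # Difference Index, precomputed (formula not stated in the abstract)


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def is_winter_wheat(p: Pixel) -> bool:
    """Apply the five thresholds from the abstract and intersect them."""
    ndvi = normalized_difference(p.nir, p.red)       # standard NDVI (assumed)
    mndwi = normalized_difference(p.green, p.swir1)  # standard MNDWI (assumed)
    return (
        mndwi < 0.33
        and p.swir2 < 0.13
        and p.nir > 0.13
        and ndvi > 0.57
        and p.di > 0.203
    )


# Hypothetical pixels, for illustration only.
wheat_like = Pixel(green=0.05, red=0.03, nir=0.40, swir1=0.15, swir2=0.08, di=0.30)
water_like = Pixel(green=0.08, red=0.05, nir=0.03, swir1=0.01, swir2=0.01, di=0.00)
soil_like = Pixel(green=0.10, red=0.15, nir=0.25, swir1=0.30, swir2=0.25, di=0.10)
```

Each condition excludes one class of non-wheat pixels, and a pixel is kept only if it passes all five, which mirrors the "intersection" step the abstract describes.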
Foreign-language abstract: |
Accurate crop distribution mapping and yield estimation are crucial for crop management and for adjusting agricultural policy, and they are essential for ensuring national food security and stable social development. China ranks first in the world in wheat production, and Henan Province, one of its primary wheat-producing regions, accounts for approximately 25% of national winter wheat production. In conventional remote-sensing studies of winter wheat planting distribution, classification samples are selected manually at considerable cost in labor, materials, and funding, so precise extraction over large areas remains challenging. Moreover, winter wheat yield is influenced by many types of factors, yet previous remote-sensing yield estimation studies have focused mainly on vegetation and meteorological factors and have given little consideration to environmental factors and natural disasters. Selecting yield estimation variables comprehensively and rationally is therefore important for improving estimation accuracy. The findings of this study can serve as a scientific reference for managing winter wheat production and planting in Henan Province. The study takes Henan Province as its study area and uses Landsat 8 OLI, Sentinel-2, and MODIS imagery, Google Earth samples, meteorological indicators, crop growth status indicators, environmental factors, and drought indicators. A single-temporal spectral threshold model is constructed from Landsat 8 OLI imagery to extract the winter wheat distribution. To ensure that the extraction results can be compared, the study also combines automatic sample generation with machine-learning classification. 
Samples are generated automatically in two ways: the first labels samples from existing winter wheat distribution data, and the second computes the Dynamic Time Warping (DTW) distance between the standard phenological curve of winter wheat and the phenological curve of each sample to be classified, and generates the required sample points from that distance. Using these sample points together with vegetation index, band, texture, and terrain features, the study applies Support Vector Machine (SVM) and Random Forest (RF) classifiers to extract the winter wheat distribution. Comparing the extraction results identifies the best classification method and yields the most accurate winter wheat distribution map. This map is then used to mask four categories of indicators: meteorological factors, crop growth status, environmental factors, and drought indicators. Combinations of these four categories are paired with Partial Least Squares Regression (PLSR), ExtraTrees, Random Forest (RF), and Gradient Boosting Decision Tree (GBDT) models to determine the best indicator combination, algorithm, and estimation period for winter wheat yield. The robustness of the model predictions is also assessed, which establishes the optimal yield estimation scheme. The main conclusions are as follows: (1) In the single-temporal spectral threshold model built on Landsat 8 OLI imagery, winter wheat can be separated from other land cover types by the following criteria: Modified Normalized Difference Water Index (MNDWI) < 0.33, Shortwave Infrared 2 (SWIR2) band B7 < 0.13, Near Infrared (NIR) band B5 > 0.13, Normalized Difference Vegetation Index (NDVI) > 0.57, and Difference Index (DI) > 0.203. 
The intersection of these conditions delineates the winter wheat planting area. Validation against visually interpreted samples from Google Earth and Sentinel-2 gave an overall classification accuracy of 89.97% and a Kappa coefficient of 0.70. The producer's accuracy for winter wheat was 70.25%, and the user's accuracy was 83.84%. Verification against 9 sample plots showed broadly consistent spatial distribution. The coefficient of determination (R²) between county-level winter wheat planting areas from statistical yearbooks and the extracted planting areas was 0.783. (2) The large number of winter wheat training samples generated automatically within the study area were evenly distributed and of good quality, which supports high classification accuracy. Validation against Sentinel-2 and Google Earth samples showed that multi-temporal automatically generated training samples combined with the Random Forest (RF) classifier performed best of the methods compared, with an overall accuracy of 93.50%, a Kappa coefficient of 0.85, and a producer's accuracy for winter wheat of 99.30%. Comparing the extracted winter wheat area with the statistical yearbook area gave an R² of 0.864. (3) Estimation accuracy was highest when all four categories of factors (meteorological factors, crop growth status, environmental factors, and drought indicators) were combined with the Gradient Boosting Decision Tree (GBDT) algorithm (R² = 0.969, RRMSE = 10.67%). As data from later growth stages were added, R² peaked at 0.972 in April, with the lowest RRMSE of 10.4%, and then declined to 0.969. The optimal estimation period therefore uses data from October to April of the following year; because winter wheat in Henan Province is harvested in June, yield can be estimated accurately two months before harvest. 
As the number of input indicator categories increases, the global Moran's index of the relative error in estimated winter wheat yield approaches zero, which indicates that combining multiple indicators improves the spatial robustness of the yield prediction model. In conclusion, the optimal remote-sensing yield estimation scheme for winter wheat uses data from October 2018 to April 2019 and models yield with the GBDT algorithm from the combination of the four categories of indicators. Keywords: Winter wheat distribution extraction; Dynamic Time Warping (DTW) distance; Machine learning; Yield; Multi-source indicators Research type: Applied research |
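The DTW-based sample generation described in the abstract can be illustrated with a short sketch. It assumes the classic dynamic-programming form of DTW with an absolute-difference local cost and no window constraint; the thesis may use a different variant. The function names, the NDVI-like curves, and the `max_distance` cutoff are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance between two time series (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimum cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def select_wheat_samples(
    reference: Sequence[float],
    candidates: Sequence[Sequence[float]],
    max_distance: float,
) -> List[int]:
    """Return indices of candidates whose DTW distance to the reference is small."""
    return [
        i for i, curve in enumerate(candidates)
        if dtw_distance(reference, curve) <= max_distance
    ]


# Hypothetical NDVI-like curves; the values are invented for illustration.
reference = [0.30, 0.40, 0.30, 0.50, 0.80, 0.40]
shifted = [0.30, 0.30, 0.40, 0.30, 0.50, 0.80, 0.40]  # same shape, delayed one step
flat = [0.20, 0.20, 0.20, 0.20, 0.20, 0.20]           # no seasonal signal
selected = select_wheat_samples(reference, [shifted, flat], max_distance=0.5)
```

Because DTW allows one time step to align with several, the delayed curve matches the reference at zero cost while the flat curve does not, which is why the distance suits phenology curves whose timing varies between fields.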
Chinese Library Classification (CLC) number: | P237 |
Public release date: | 2026-06-24 |