View thesis information

Thesis title (Chinese):

 基于数据降噪和误差修正相融合的股指预测模型研究    

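Name:

 Wang Wenli (王雯莉)

Student ID:

 21201221068

Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 025200

Discipline name:

 Economics - Applied Statistics

Student type:

 Master's

Degree level:

 Master of Economics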

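Degree year:

 2024

Institution:

 Xi'an University of Science and Technology

School or department:

 College of Science

Major:

 Applied Statistics

Research direction:

 Financial Statistics

First supervisor:

 Ding Zhengsheng (丁正生)

First supervisor's institution:

 Xi'an University of Science and Technology

Thesis submission date:

 2024-06-14

Thesis defense date:

 2024-06-04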

Thesis title (English):

 Research on stock index prediction model based on the fusion of data noise reduction and error correction    

Keywords (Chinese):

 数据降噪 ; 注意力机制 ; BiLSTM ; 误差修正 ; 股指预测    

Keywords (English):

 Data noise reduction ; Attention mechanism ; BiLSTM ; Error correction ; Stock index prediction    

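Thesis abstract (translated from the Chinese):

A stock price index is an important indicator of the overall condition of the stock market, and predicting it accurately is of great significance for investors seeking to avoid risk and for the state in carrying out macroeconomic regulation. Stock price fluctuations, however, are affected by many factors and are nonlinear, non-stationary, and highly noisy. Deep learning, with its strong learning capacity and its ability to extract nonlinear features, can mine more complex information from data and is widely used in stock price prediction. As research has deepened, it has been found that a single deep learning model still leaves room for improvement when faced with complex and changeable stock price data. This thesis therefore takes deep learning models as its basis and combines data denoising techniques with the idea of error correction to build a more accurate stock index prediction model. The main research work is as follows:

First, based on prior knowledge from finance and on related domestic and international research, an initial feature set of 28 indicators, made up of basic indicators and technical indicators, is constructed. Market efficiency is analyzed with the Hurst test and a test based on the random walk model; the analysis finds that the market corresponding to the selected stock index is not a weak-form efficient market, which supports the use of combined models to predict the price trend of the index. Next, exploratory data analysis is carried out in terms of historical trends, descriptive statistics, stationarity tests, and correlation analysis, and the data are preprocessed. Preprocessing includes deleting the data for dates with missing values produced by the calculation of technical indicators, constructing the data set with a sliding window, normalizing the data, and reducing the dimension of the initial feature set with principal component analysis. The final input feature set has 11 dimensions and provides a reliable data basis for the subsequent prediction.

Second, the series obtained after principal component analysis is used as the model input, and three single models, Extreme Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM), are each used to predict the daily closing price of the CSI 300 index. Evaluation with three metrics, root mean square error, mean absolute error, and the coefficient of determination, shows that the BiLSTM model performs best. To make up for the shortcomings of a single model, an attention mechanism is introduced to build the Attention-BiLSTM model. The XGBoost algorithm is used to predict the error series of the initial prediction, giving the AttBiLSTM-XGBoost model, in which the final stock price prediction is obtained by adding the initial prediction and the predicted error. On this basis, because stock index data contain a large amount of noise, the Synchrosqueezed Wavelet Transform (SWT) is introduced to denoise the closing price series, giving the SWT-AttBiLSTM-XGBoost model. The models from the ablation experiments under different training set lengths, together with stock index prediction models proposed in other studies, are compared on the CSI 300 test set. The results show that the three models constructed in this thesis all outperform the single models, and that the SWT-AttBiLSTM-XGBoost model gives the best prediction performance and the highest prediction accuracy.

Finally, the stock index prediction models are further applied to predicting the daily closing price of the UK FTSE 100 index. The experimental results show that the SWT-AttBiLSTM-XGBoost model constructed in this thesis also performs well on the FTSE 100, with all three evaluation metrics better than those of the other models. This verifies the feasibility and effectiveness of deep learning models that combine data denoising and error correction for stock price prediction.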

Thesis abstract (English):

The stock price index is an essential indicator of the overall state of the stock market, and predicting it accurately matters both to investors seeking to mitigate risk and to governments implementing macroeconomic controls. However, stock price fluctuations are influenced by many factors and are nonlinear, non-stationary, and noisy. Deep learning is widely used in stock price prediction because its learning capacity and nonlinear feature extraction can uncover more complex information in the data. As research has progressed, however, it has become clear that a single deep learning model still leaves room for improvement when predicting complex, volatile stock price data. This thesis therefore builds on deep learning models and integrates data denoising techniques with the idea of error correction to construct a more accurate stock index prediction model. The main research work is as follows:

First, drawing on prior knowledge from finance and on related domestic and international research, an initial feature set of 28 indicators, comprising basic and technical indicators, is constructed. Market efficiency is analyzed with the Hurst test and a test based on the random walk model. The results show that the market corresponding to the chosen stock index is not weak-form efficient, which supports the use of combined models to predict the index's price trend. Next, exploratory data analysis is carried out in terms of historical trends, descriptive statistics, stationarity tests, and correlation analysis. The data are then preprocessed: dates with missing values caused by the calculation of technical indicators are deleted, the data set is constructed with a sliding window, the data are normalized, and principal component analysis is applied to reduce the dimension of the initial feature set. The final input feature set has 11 dimensions, providing a reliable data basis for the subsequent prediction.
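
Two of the preprocessing steps named above, normalization and sliding-window data set construction, can be illustrated with a minimal sketch. The abstract does not state the window length, the scaling formula, or any implementation details, so the min-max scaling, the function names `min_max_scale` and `make_windows`, the window of 3, and the prices are all assumptions made for illustration; the principal component analysis step is omitted.

```python
from typing import List, Sequence, Tuple


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values linearly to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def make_windows(
    series: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `window` consecutive values with the value that follows it."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - window):
        inputs.append(list(series[start:start + window]))
        targets.append(series[start + window])
    return inputs, targets


# Made-up closing prices, not data from the thesis.
closes = [10.0, 12.0, 11.0, 14.0, 15.0]
scaled = min_max_scale(closes)
X, y = make_windows(scaled, window=3)  # hypothetical window length
```

Here each row of `X` holds three consecutive scaled prices and the matching entry of `y` is the next day's scaled price. In a real experiment the scaling bounds would normally be taken from the training portion only, so that test-set values do not leak into the inputs.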

Second, the series obtained from the principal component analysis is used as the input to three single models: Extreme Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM). These models are used to predict the daily closing price of the CSI 300 index. Judged by three evaluation metrics, root mean square error, mean absolute error, and the coefficient of determination, the BiLSTM model performs best. To overcome the limitations of a single model, an attention mechanism is introduced to build the Attention-BiLSTM model. XGBoost is then used to predict the error series of the initial prediction, giving the AttBiLSTM-XGBoost model, whose final prediction is the sum of the initial prediction and the predicted error. On this basis, because stock index data contain a great deal of noise, the Synchrosqueezed Wavelet Transform (SWT) is introduced to denoise the closing price series, giving the SWT-AttBiLSTM-XGBoost model. The models from the ablation experiments under different training set lengths, together with stock index prediction models proposed in other studies, are compared on the CSI 300 index test set. The three models constructed in this thesis all outperform the single models, and the SWT-AttBiLSTM-XGBoost model achieves the best prediction performance and the highest prediction accuracy.
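
The additive error-correction step and the three evaluation metrics described above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the error is taken to be actual minus initial prediction, the helper names are hypothetical, and the values in `initial` and `predicted_errors` are made-up stand-ins for the outputs of the first-stage network and the second-stage XGBoost model.

```python
import math
from typing import List, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the actual values are constant")
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot


def corrected_forecast(
    initial: Sequence[float], predicted_errors: Sequence[float]
) -> List[float]:
    """Final prediction = initial prediction + predicted error, element by element."""
    if len(initial) != len(predicted_errors):
        raise ValueError("both series must have the same length")
    return [p + e for p, e in zip(initial, predicted_errors)]


# Made-up numbers, not results from the thesis.
actual = [100.0, 102.0, 101.0, 105.0]
initial = [99.0, 100.0, 100.0, 103.0]      # stand-in for first-stage output
predicted_errors = [0.5, 1.5, 1.0, 1.5]    # stand-in for second-stage output
final = corrected_forecast(initial, predicted_errors)

for name, forecast in (("initial", initial), ("corrected", final)):
    print(name, rmse(actual, forecast), mae(actual, forecast), r_squared(actual, forecast))
```

The correction helps only to the extent that the second-stage model predicts the sign and size of the first-stage errors; in this toy example the predicted errors are close to the true ones, so all three metrics improve.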

Finally, the stock index prediction models are applied to the daily closing price of the UK FTSE 100 index. The experimental results show that the SWT-AttBiLSTM-XGBoost model constructed in this thesis also performs well on the FTSE 100 index, with all three evaluation metrics better than those of the other models. This validates the feasibility and effectiveness of deep learning models that integrate data noise reduction and error correction for stock price prediction.

Chinese Library Classification number:

 F831.5    

Release date:

 2024-06-17    
