As the "dual carbon" goals (carbon peaking and carbon neutrality) drive a deep transformation of the energy structure, the high-quality development of the coal industry has become central to ensuring national energy security and achieving sustainable development. This paper combines machine learning and game theory to address multi-agent decision-making optimization in the coal industry, constructs a systematic research framework, and puts forward feasible solutions.
The research starts from the perspective of game theory. First, to address the complex evolution law and the difficulty of decision-making optimization in the tripartite game among the government, coal enterprises and the public, the paper proposes an optimization framework for high-quality development decision-making in the coal industry. The framework integrates differential games with machine learning, and the paper describes both an exact solution method based on numerical iteration and a heuristic optimization strategy based on the multi-head attention mechanism. Second, a three-party dynamic Stackelberg game model is constructed to analyze the strategic interaction among government regulation, enterprise innovation and public supervision, and to reveal how the behaviors of the different parties evolve over time. Finally, a multi-agent reinforcement learning method based on Multi-Agent Deep Deterministic Policy Gradient (ATT-MADDPG) is proposed, which significantly reduces the computational complexity of decision-making optimization in differential games by improving the training performance of the model.
Simulation of the three-party dynamic Stackelberg game in continuous time shows a significant synergy between the innovation efforts of coal enterprises and public supervision: improving the efficiency of public supervision effectively stimulates enterprises' motivation to transform and reduces their dependence on policy subsidies. When enterprises' innovation investment is insufficient, the government needs to increase subsidies to break through the transformation bottleneck. The strategy optimization scheme based on ATT-MADDPG significantly improves the economic benefits of coal enterprises, and the results also verify the effectiveness of the game strategy optimization algorithm for predicting industry development behavior and supporting decisions. This study provides a quantitative analysis path and a decision support tool for multiple parties to jointly promote the green and low-carbon transformation of the coal industry.