
15–18 Jul 2024
Instituto de Física da Universidade de São Paulo
America/Sao_Paulo timezone

Fast control of plasma vertical displacement based on robust adversarial reinforcement learning

18 Jul 2024, 09:30
Instituto de Física da Universidade de São Paulo

Rua do Matão, 1371 - Butantã CEP05508-090 - São Paulo - SP - Brasil
Oral, Machine Learning


Dr Binnuo Liu (Institute of Plasma Physics, Chinese Academy of Sciences)


Elongated plasma configurations allow higher discharge parameters, but at the cost of vertical displacement instability. Once the vertical displacement is out of control, a major disruption inevitably follows, which can severely damage the device; on ITER the consequences would be unacceptable. Active control of the vertical displacement is therefore necessary. The vertical displacement is affected by the passive structure, the power supply delay, and other factors, which makes it a high-order system with a complex response. Because the control capability of the system is limited, the controller must be highly robust when the perturbations are complex and diverse. Deep learning has a strong learning capability, so we used a deep reinforcement learning approach to achieve fast control of the plasma vertical displacement.
We first verified that reinforcement learning can control the plasma vertical displacement. We trained a vertical displacement controller using the Deep Deterministic Policy Gradient (DDPG) algorithm and tested its performance. The tests showed that the controller has a better dynamic response than conventional PID control, but that it is less resistant to perturbations of the poloidal field (PF) coil current.
To increase the model's resistance to perturbations, we adopted Robust Adversarial Reinforcement Learning (RARL). RARL adds an adversary, itself an agent, that attacks the weaknesses of the controlling agent, so the agent must find the optimal strategy in the worst-case scenario. We refer to the DDPG-based RARL as DDPG-RARL. Traditional vertical displacement control cannot completely avoid overcurrent in the IC coil caused by perturbations of the PF coil current. In our work, the adversary therefore attacks the controller by applying perturbations to the PF coil current, based on observations on EAST.
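The alternating structure of adversarial training can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from this work: the plant is a toy first-order unstable map rather than a tokamak model, the controller is a single feedback gain, the adversary is a bounded square-wave disturbance, and simple random search stands in for DDPG. The names `GROWTH`, `D_MAX`, `rollout`, and `train_rarl` are hypothetical.

```python
import random

GROWTH = 1.2   # hypothetical open-loop growth per step (> 1 means unstable)
D_MAX = 0.05   # hypothetical bound on the adversary's disturbance amplitude

def rollout(gain, attack, z0=0.01, steps=50):
    """Accumulated squared displacement for a feedback gain under a square-wave attack."""
    amplitude, half_period = attack
    z, cost = z0, 0.0
    for t in range(steps):
        u = -gain * z                                    # controller action
        d = amplitude if (t // half_period) % 2 == 0 else -amplitude
        z = GROWTH * z + u + d                           # toy displacement update
        cost += z * z
    return cost

def train_rarl(rounds=20, tries=30, seed=0):
    """Alternate random-search updates of controller and adversary (not DDPG)."""
    rng = random.Random(seed)
    gain, attack = 0.5, (0.0, 5)
    for _ in range(rounds):
        # Protagonist turn: adversary frozen, keep changes that lower the cost.
        for _ in range(tries):
            candidate = gain + rng.gauss(0.0, 0.1)
            if rollout(candidate, attack) < rollout(gain, attack):
                gain = candidate
        # Adversary turn: controller frozen, keep changes that raise the cost.
        for _ in range(tries):
            amplitude = min(D_MAX, max(0.0, attack[0] + rng.gauss(0.0, 0.01)))
            candidate = (amplitude, rng.randint(1, 10))
            if rollout(gain, candidate) > rollout(gain, attack):
                attack = candidate
    return gain, attack

gain, attack = train_rarl()
print(f"gain={gain:.3f}, attack amplitude={attack[0]:.3f}, half-period={attack[1]}")
```

The point of the sketch is only the min-max alternation: the controller is always tuned against the strongest attack found so far, and the adversary is always tuned against the current controller. A real implementation would replace both random-search loops with actor-critic updates and the toy map with a plasma response model.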
We compared the models' resistance to perturbation by using an adversary to attack the DDPG-RARL-based controller, then capturing the attack pattern and applying it to the DDPG-based controller. We found that the training process yields adversaries with different characteristics, which fall into two types: those performing high-amplitude attacks and those performing high-frequency attacks. DDPG-RARL outperforms DDPG under both high-amplitude and high-frequency attacks.
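The comparison protocol described above (let the adversary act on one controller, capture its output, then apply the same sequence to another controller) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' test setup: the toy plant, the two gains, and the two hand-written adversaries `high_amplitude` and `high_frequency` are all hypothetical, and the numbers it prints say nothing about the real DDPG or DDPG-RARL controllers.

```python
GROWTH = 1.2  # hypothetical open-loop growth per step of the toy plant

def step(z, gain, d):
    """One update of the toy displacement under feedback gain and disturbance d."""
    return GROWTH * z - gain * z + d

def record_attack(gain, adversary, z0=0.01, steps=60):
    """Run a closed-loop adversary against one controller and capture its outputs."""
    z, peak, trace = z0, abs(z0), []
    for t in range(steps):
        d = adversary(t, z)          # adversary observes time and displacement
        trace.append(d)
        z = step(z, gain, d)
        peak = max(peak, abs(z))
    return trace, peak

def replay_attack(gain, trace, z0=0.01):
    """Apply a captured disturbance sequence, open loop, to another controller."""
    z, peak = z0, abs(z0)
    for d in trace:
        z = step(z, gain, d)
        peak = max(peak, abs(z))
    return peak

def high_amplitude(t, z):
    return 0.05 if z >= 0 else -0.05      # pushes along the current displacement

def high_frequency(t, z):
    return 0.01 if t % 2 == 0 else -0.01  # flips sign every step

gain_recorded, gain_replayed = 1.1, 0.5   # hypothetical controller gains
for name, adversary in [("high-amplitude", high_amplitude),
                        ("high-frequency", high_frequency)]:
    trace, peak_recorded = record_attack(gain_recorded, adversary)
    peak_replayed = replay_attack(gain_replayed, trace)
    print(f"{name}: peak |z| {peak_recorded:.4f} (recorded) vs {peak_replayed:.4f} (replayed)")
```

Replaying a fixed sequence keeps the disturbance identical for both controllers, so any difference in peak displacement is due to the controller alone.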

Speaker's Affiliation: Institute of Plasma Physics, Chinese Academy of Sciences, Hefei
Member State or IGO: China, People’s Republic of

Primary authors

Prof. Bingjia Xiao (Institute of Plasma Physics, Chinese Academy of Sciences)
Dr Binnuo Liu (Institute of Plasma Physics, Chinese Academy of Sciences)
Prof. Qiping Yuan (Institute of Plasma Physics, Chinese Academy of Sciences)
Dr Ruirui Zhang (Institute of Plasma Physics, Chinese Academy of Sciences)
Dr Wenhui Hu (Institute of Plasma Physics, Chinese Academy of Sciences)
Dr Yao Huan (Institute of Plasma Physics, Chinese Academy of Sciences)
Dr Yuehang Wang (Institute of Plasma Physics, Chinese Academy of Sciences)
Dr Zhengping Luo (Institute of Plasma Physics, Chinese Academy of Sciences)
