KEY DEADLINES
30 June 2023 Deadline for submission of Participation Form (Form A), and Grant Application Form (Form C) (if applicable) through the official channels
30 June 2023 Deadline for submission of abstracts through IAEA-INDICO for regular contributions
30 July 2023 Notification of acceptance of abstracts and of assigned awards
Over the past decade, artificial intelligence (AI) has evolved rapidly, becoming increasingly sophisticated and capable of solving ever more complex problems. AI is deployed in sectors as diverse as manufacturing, transportation, finance, education and healthcare. AI methods are used in data analysis, theoretical modelling and experiment design, helping to accelerate fundamental science and advance technological innovation. A particular area that benefits from the application of AI is fusion and plasma science. With its ability to solve large and complex problems, AI can aid experiments and scientific discovery through modelling and simulations. These applications of AI are included in a five-year IAEA coordinated research project aimed at accelerating fusion research and development. The results of this workshop will feed into the coordinated research project.
Objectives
The purpose of the event is to provide a platform for researchers, developers, practitioners, entrepreneurs and policymakers to discuss artificial intelligence applications to accelerate fusion and plasma science; and to identify representative examples and related data to be shared through international collaboration, ideally leading to coordination or joint work within the coordinated research project on the subject.
Target Audience
The event aims to bring together a multi-stakeholder and inter-disciplinary audience of researchers, developers, practitioners, entrepreneurs and policymakers in artificial intelligence, fusion and plasma science, to discuss applications, connect and build collaboration.
Ryan McClarren (ND, USA), Hideo Nagatomo (Osaka U., Japan), Marcin Jakubowski (IPP, Germany)
On December 5, 2022, scientists at the Lawrence Livermore National Laboratory carried out the first-ever Inertial Confinement Fusion experiment that met all criteria for ignition. This 2.05-megajoule laser shot at the National Ignition Facility compressed a millimeter-size capsule containing hydrogen fuel, driving fusion reactions that generated 3.15 megajoules of energy, a gain of 1.5. The outcome of this historic experiment did not come as a big surprise: a pre-shot analysis predicted a much higher chance of ignition for this new design relative to previous designs. In this talk, we describe a new method developed to make this prediction. The method combines a very sparse dataset of previous experiments with a large ensemble of simulations, uses Bayesian inference to incorporate uncertainties and machine learning to make the technique computationally feasible, and builds on several years of experience with the analysis of inertial confinement experiments. In the months that followed the ignition shot, we adapted and applied this method to predict subsequent experiments, which are beginning to form a validation set for our predictive model. This predictive modeling is expected to play an increasing role in evaluating new designs and facility upgrades, and in determining driver requirements for inertial fusion energy. LLNL-ABS-850875. Prepared by LLNL under Contract DE-AC52-07NA27344.
Magnetically confining a high-temperature plasma in a toroidal device (e.g., a tokamak or stellarator) is arguably the most promising approach to achieving controlled thermonuclear fusion energy. One critical concern with this approach is finding and maintaining proper heat and particle exhaust in the divertor region – the region where the magnetic topology changes from “closed” to “open” and the high-temperature plasma may come into direct contact with the vessel wall. Modeling the divertor plasma is not a trivial task due to its multi-physics, multi-scale nature. For instance, the widely used axisymmetric 2D edge transport codes in the community (e.g., SOLPS, UEDGE) often take days or even months to attain converged steady-state divertor solutions once sophisticated plasma and neutral dynamics are included. This time-consuming process not only affects divertor plasma physics research but also impacts high-fidelity divertor model applications, e.g., in new device design, discharge scenario development and real-time plasma control.
Machine learning techniques offer an alternative solution to this challenge. A fast yet fairly accurate data-driven surrogate model for divertor plasma prediction is possible by leveraging the latent feature space concept. The idea is to construct and train two neural networks – an autoencoder that finds a proper latent space representation (LSR) of the plasma state by compressing the desired multi-modal diagnostic measurements, and a forward model using a multi-layer perceptron (MLP) that maps a set of divertor plasma control parameters to the corresponding LSR. By combining the forward model with the decoder network from the autoencoder, this data-driven surrogate model predicts a consistent set of diagnostic measurements from a few key parameters controlling the divertor plasma. The idea was first tested by predicting downstream plasma properties from limited upstream information in a one-dimensional flux-tube configuration. The resulting surrogate model is at least four orders of magnitude faster than the conventional numerical model and provides fairly accurate divertor plasma predictions, usually within a few percent relative error. It has a 99% success rate in predicting divertor plasma detachment – a bifurcation phenomenon featuring a sudden decrease of heat load on the divertor plate. This pilot study successfully demonstrates that the complicated divertor plasma state has a low-dimensional representation in latent space that can be utilized for surrogate modeling. Following the same methodology, we recently extended the work to model a realistic 2D axisymmetric configuration. Application-specific surrogate models were constructed, trained, and tested. These models appear able to fulfill many different tasks (e.g., initial-solution prediction for code acceleration, integrated tokamak divertor design, and divertor plasma detachment control), suggesting that machine learning is a powerful tool for divertor plasma physics and fusion energy research.
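As a concrete illustration of this latent-space surrogate idea, the sketch below uses PCA as a linear stand-in for the autoencoder and scikit-learn's MLPRegressor as the forward model; the "diagnostics" are synthetic, and all dimensions and variable names are illustrative assumptions rather than the authors' actual setup.

```python
# Toy latent-space surrogate: encode diagnostics, learn controls -> latent,
# then decode to get a consistent set of predicted diagnostics.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic data: 3 control parameters -> 50 "diagnostic" channels.
n_samples, n_controls, n_diag, n_latent = 500, 3, 50, 4
controls = rng.uniform(-1.0, 1.0, (n_samples, n_controls))
mixing = rng.normal(size=(n_controls, n_diag))
diagnostics = (np.tanh(controls @ mixing)
               + 0.01 * rng.normal(size=(n_samples, n_diag)))

# "Autoencoder" stand-in: PCA gives a linear encoder/decoder pair.
pca = PCA(n_components=n_latent).fit(diagnostics)
latent = pca.transform(diagnostics)                  # encode to the LSR

# Forward model: control parameters -> latent space representation.
forward = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                       random_state=0).fit(controls, latent)

def surrogate(c):
    """Predict a consistent set of diagnostics from control parameters."""
    return pca.inverse_transform(forward.predict(np.atleast_2d(c)))

pred = surrogate(controls[:10])
rel_err = (np.linalg.norm(pred - diagnostics[:10])
           / np.linalg.norm(diagnostics[:10]))
```

Swapping PCA for a trained nonlinear autoencoder and the toy data for real multi-modal diagnostic channels recovers the pipeline described in the abstract.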
Accurate simulations of the scrape-off layer plasma in a tokamak employing state-of-the-art numerical models (e.g., SOLPS-ITER) require long convergence times. Such physically sophisticated models are required for detailed design, e.g., of the divertor in DEMO or in a fusion power plant (FPP). For design scoping studies or integration with core-plasma models, currently used reduced-fidelity models lack either accuracy or capabilities. Other applications include control-specific needs for fast model-based predictors of the level of plasma detachment in the divertor. All this warrants the development of surrogate models, which interpolate between existing simulations, to allow for fast and accurate results over the whole parameter space. We created a neural network model by training on a database of reduced-fidelity fluid-neutral SOLPS-ITER simulations. This database includes a cross-machine size scaling, making the developed model applicable to many devices and scenarios. The neural network model is capable of computing the electron temperature in the whole 2D SOL and divertor domain for different physical regimes in less than a second. The deviations between the neural network model and the original simulations are less than 10 percent for the majority of cases. The model is compared to high-fidelity SOLPS-ITER simulations from the ITER IMAS database. Two approaches are tested to reduce the deviations between the model and the ITER simulations: scaling of the gas-puff parameters and transfer learning. While both reduce the deviations, only transfer learning allows for predicting full 2D plasma profiles on the ITER geometry. Some applications of such a model are showcased.
For decades, plasma transport simulations in tokamaks have employed the finite difference method (FDM) to solve the transport equations, a coupled set of time-dependent partial differential equations. In this conventional approach, a significant number of time steps, typically over $O(10^5)$, are needed for a single discharge to prevent numerical instabilities induced by stiff transport coefficients. This results in significant computing time as costly transport models are repeatedly called in a serial manner, proportional to the number of time steps. Additionally, the unidirectional calculation inherent in FDM presents challenges for predicting regions prior to the initial condition or for applying additional temporal constraints.
In this work, we introduce a novel solution scheme for plasma transport simulation using physics-informed neural networks (PINNs). Instead of adopting the traditional chronological computations of FDM, this new technique iteratively refines a function mapping spatiotemporal coordinates to plasma states, gradually minimizing the residuals of the transport equations. The required number of iterative updates in PINNs is several orders of magnitude smaller than the chronological iterations of traditional FDM, and the approach is free from numerical instabilities arising from discretization on finite grids. Furthermore, the flexibility of PINNs enables more versatile "semi-predictive" simulations, permitting the application of arbitrary spatiotemporal constraints, such as sparse and finite conditions or intermediate temporal constraints, which better preserve diagnostic fidelity. In this presentation, we discuss the features and potential of our newly proposed PINN-based tokamak transport solver.
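A minimal PINN of this kind can be sketched in PyTorch. The toy residual below is for a single 1D diffusion-type equation $u_t = D\,u_{xx}$ with an illustrative initial condition; the network size, diffusivity and sampling are assumptions for demonstration, not the authors' solver or the coupled transport system.

```python
# Minimal PINN sketch: a network u(x, t) is trained so that the PDE
# residual u_t - D * u_xx vanishes at random collocation points, plus a
# soft initial-condition penalty u(x, 0) = sin(pi * x).
import torch

torch.manual_seed(0)
D = 0.1  # toy diffusivity (assumption)

net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def pde_residual(x, t):
    """Residual of u_t - D * u_xx at collocation points (x, t)."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - D * u_xx

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
losses = []
for step in range(200):
    x = torch.rand(256, 1)
    t = torch.rand(256, 1)
    res = pde_residual(x, t)
    # Initial condition as one example constraint; PINNs can equally take
    # sparse or intermediate-time constraints.
    x0 = torch.rand(256, 1)
    u0 = net(torch.cat([x0, torch.zeros_like(x0)], dim=1))
    loss = (res ** 2).mean() + ((u0 - torch.sin(torch.pi * x0)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(float(loss))
```

The key contrast with FDM is visible in the loop: there is no time-marching; all collocation points in space and time are penalised simultaneously.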
For stable and efficient fusion energy production using a tokamak reactor, maintaining high-pressure hydrogenic plasma without plasma disruption is essential. It is therefore necessary to actively control the tokamak based on the observed plasma state, maneuvering the high-pressure plasma while avoiding tearing instability, the leading cause of disruptions. This presents an obstacle-avoidance problem, for which artificial intelligence (AI) based on reinforcement learning has recently shown remarkable performance. However, the obstacle here, the tearing instability, is difficult to forecast and highly prone to terminating plasma operation. In our recent work, we developed a multimodal dynamic model that estimates the likelihood of future tearing instability based on signals from multiple diagnostics and actuators. This dynamic model not only predicts the possible onset of tearing instability during tokamak operation but can also be used as a training environment for AI that controls actuators to avoid instabilities. In this work, we demonstrate AI control based on reinforcement learning to lower the possibility of disruptive tearing instabilities in DIII-D, the largest magnetic fusion facility in the US. The controller maintained the tearing likelihood under a given threshold, even under the relatively unfavorable conditions of low safety factor and low torque.
Machine learning and artificial intelligence (ML/AI) methods have been applied to fusion energy research for over two decades, in areas including disruption prediction, particle distribution and loss prediction, plasma equilibrium reconstruction and so on. The success in achieving magnetic control of the TCV tokamak with deep learning methods has demonstrated great opportunities for intelligent control of fusion power plants by ML/AI, and AI/ML is now in great demand to accelerate progress toward the realization of fusion energy. This work reports progress on AI-driven tokamak control in HL-2A/2M based on both experimental and simulation data, covering disruption prediction, magnetohydrodynamic (MHD) mode recognition, and prediction of the operational beta limit.
The electromagnetic and thermal energy loads on the first wall and other plasma-facing components during tokamak disruptions are often high enough to cause severe damage to the device. Disruptions are difficult to avoid entirely in tokamaks and are often difficult to predict. AI/ML provides a potential way to address this issue. A machine learning model has been trained to predict disruptions in HL-2A. Further efforts have been made to improve the accuracy and interpretability of the model. Both off-line and real-time tests achieved good performance.
Tokamak plasma disruptions can be due to many factors, macroscopic MHD instability being one of them. It is known that the pressure-driven external kink (XK) mode sets a ‘hard’ limit on tokamak operation: as the plasma pressure exceeds the so-called Troyon $\beta_N$ limit, major disruptions can occur. The operational $\beta_N$ limit can be calculated with first-principles MHD codes, but this is time-consuming. To meet the requirements of real-time XK mode control in experiments, artificial neural networks (NNs) have been trained to predict the no-wall and ideal-wall $\beta_N$ limits based on a numerical database generated for HL-2M. The NN predictions reach 95% accuracy compared to the numerical results computed directly by the MARS-F MHD stability code.
The edge localized mode (ELM) is another type of instability that may not cause plasma disruption but can deposit large heat and particle fluxes on the divertor target, potentially causing material erosion in future reactor-scale devices such as ITER. Large ELMs therefore need to be identified and controlled, with the latter achieved by 3D magnetic coils, impurity injection or other means. We report an example of an ELM recognition and mitigation experiment run by the AI control system. An advanced AI model for ELM identification via the L-H transition will also be reported. Similar methods have been employed to identify the long-lived mode, the tearing mode, the fishbone mode and the sawtooth in HL-2A.
In the last part of the report, we show some AI modeling efforts still in progress, i.e., device-independent disruption prediction for HL-2M with data support from HL-2A and J-TEXT experiments, and intelligent operation strategies.
Edge plasma turbulence is critical to the performance and operation of magnetic confinement fusion devices. Drift-reduced Braginskii two-fluid theory has for decades been widely applied to model boundary plasmas with varying success. Towards better understanding edge turbulence in both theory and experiment, a custom-built physics-informed deep learning framework constrained by partial differential equations is developed to accurately learn turbulent fields consistent with the two-fluid theory from partial observations of electron pressure. This calculation is not otherwise possible using conventional equilibrium models. With this technique, the first direct quantitative comparisons of turbulent field fluctuations between electrostatic two-fluid theory and electromagnetic gyrokinetic modelling are demonstrated with good overall agreement found in magnetized helical plasmas at low normalized pressure.
To translate these computational techniques to experimental fusion plasmas, comprehensive 2-dimensional diagnostics operating on turbulent time scales are necessary. For this purpose, a novel method to translate brightness measurements of HeI line radiation into local plasma fluctuations is demonstrated via a newly created deep learning framework that integrates neutral transport physics and collisional radiative theory for the $3^3 D - 2^3 P$ transition in atomic helium. Using fast camera data on the Alcator C-Mod tokamak, this thesis presents the first 2-dimensional time-dependent experimental measurements of the turbulent electron density, electron temperature, and neutral density in a fusion plasma using a single spectral line. With this experimentally inferred data, initial estimates of the 2-dimensional turbulent electric field consistent with drift-reduced Braginskii theory under the framework of an axisymmetric fusion plasma with purely toroidal field are calculated. The inclusion of atomic helium effects on particle and energy sources are found to strengthen correlations between the electric field and electron pressure while broadening turbulent field fluctuation amplitudes which impact ${\bf E \times B}$ flows and shearing rates.
Ryan McClarren (ND, USA), Hideo Nagatomo (Osaka U., Japan)
Michael Churchill (PPPL, USA), Cristina Rea (MIT-PSFC, USA), Zongyu Yang (SWIP, China)
Inertial confinement fusion (ICF) relies on the implosion of precision-engineered capsules containing DT fuel. The implosion is initiated by a driver, usually a laser, and the target may feature one or more outer shells to enable driver coupling. First Light Fusion’s (FLF) novel approach separates the target design into a fuel capsule and a shock amplifier, which is uni-axially driven by a hyper-velocity projectile. The target design process seeks to maximise the neutron yield of the implosion. This is typically accomplished by optimising the design using radiation-hydrodynamics (rad-hydro) simulations and comparison with experiments. The optimisation process can require thousands of high-fidelity simulations, consuming a large amount of compute resources.
For these reasons, numerical optimisation techniques are of key interest to the ICF community, and several algorithms have previously been proposed. Bayesian Optimisation (BO) is a promising approach to these problems, as it excels at finding the global optimum of expensive black-box objectives. Within the Bayesian optimisation loop, a machine-learned model (or emulator) is constructed from an initial set of simulation runs and optimised in place of the actual function. This suggests a best set of design parameters, and evaluations at those parameters (made by running more rad-hydro simulations) are used to update the emulator. The process is repeated to rapidly locate the optimum. While extremely powerful, Bayesian Optimisation has extensive configuration options, and effective use requires detailed knowledge of the problem being optimised.
In this work, we describe a comprehensive Bayesian Optimisation (BO) capability tailored for ICF target design problems. The BO routines are implemented using BoTorch, an open-source Python framework. Gaussian process (GP) models are used as emulators; these fit a normally distributed family of response surfaces to the simulation samples, with the initial set of samples generated by an optimised Latin Hypercube space-filling design. GP models can struggle with the noisy and discontinuous response surfaces common in ICF simulation. This is solved by using a data-learned transform on the objective. Asynchronous batch execution (using the “Kriging Believer” heuristic for conditioning the GP on pending inputs) is implemented, allowing full use of HPC compute resources. Black-box constraints, commonly found in ICF problems, are handled through multiple output tasks (the objective and the constrained outputs) learned simultaneously and used to constrain the space of input parameters suggested by the BO algorithm.
The approach is benchmarked against a robust and commonly used optimisation algorithm, CMA-ES (Covariance Matrix Adaptation Evolution Strategy), and the EGO (Efficient Global Optimisation) Bayesian Optimisation algorithm available in the DAKOTA toolkit (developed by Sandia National Laboratories). Our approach outperforms both CMA-ES and EGO for optimising the performance of our amplifier design (a 9D problem) and on other synthetic benchmarks. Finally, as BO is well suited to locating global optima but can be slow to refine their exact location, we develop an algorithm that combines Bayesian Optimisation with CMA-ES, called BOCMA, which outperforms both CMA-ES and BO in terms of the number of iterations required.
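The inner loop of such a Bayesian optimisation routine can be sketched from scratch. The zero-mean GP with an RBF kernel, the expected-improvement acquisition, and the cheap 1D objective below are illustrative stand-ins for BoTorch's machinery and a rad-hydro simulation, chosen so the whole loop runs in milliseconds.

```python
# Bare-bones Bayesian optimisation: GP emulator + expected improvement,
# maximising a toy objective on a candidate grid.
import numpy as np
from scipy.stats import norm

def objective(x):
    # Cheap stand-in for an expensive simulation (maximise this).
    return -(x - 0.3) ** 2

def rbf(a, b, ls=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """Posterior mean/std of a zero-mean GP with unit-variance RBF kernel."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

X = np.array([0.0, 0.5, 1.0])        # initial space-filling design
y = objective(X)
grid = np.linspace(0.0, 1.0, 201)    # candidate design parameters

for _ in range(10):
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.append(X, x_next)         # "run the simulation" at x_next
    y = np.append(y, objective(x_next))

best_x = X[np.argmax(y)]
```

Expected improvement trades off high predicted mean against high predicted uncertainty, which is why the loop homes in on the optimum in a handful of evaluations.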
Magnetic confinement fusion research is characterized by computationally demanding physics models with a selection of uncertain, phenomenological input parameters. Rigorous usage of such models for predictive or interpretative applications requires thorough inverse uncertainty quantification (UQ) for these input parameters [1]. Bayesian inference (BI) algorithms provide a principled approach to quantifying the uncertainty, as a probability distribution, of the state of the investigated system or hypothesis validity, given the available information [2]. When operating with computationally costly models, data-efficiency is key to maximizing the information gain in establishing this probability distribution. Such efficiency can be achieved by combining Bayesian optimization (BO) with the overall BI task. BO is a powerful framework for data-efficient global optimization of costly, non-convex functions, without access to first- or second-order derivatives [3]. On the one hand, BO uses BI to build a statistical approximation in the space of functions that represents the costly model, leveraging the Bayesian quantification of uncertainty over functions to efficiently refine the approximation where needed. On the other hand, the overall BI task is focused on establishing posterior probability distributions over the uncertain state of the investigated system or hypothesis. Crucially, the overall BI task can be conducted with or without BO, but the application of BO renders the BI task orders of magnitude more data-efficient. For many practical applications in fusion energy research, batch BO is needed [4]. Standard BO algorithms conduct a sequential search, in which a computationally light acquisition function is optimized to recommend the highest-utility sample to collect with the computationally demanding model.
However, if evaluation of the full model takes longer than a few hours, which is quite typical in fusion research applications, the overall optimization time with a sequential approach can easily become unacceptably long. Batch BO can be used to collect several samples in parallel, accelerating the overall optimization task to the throughput levels required for practical applications in fusion energy research. This work discusses BI and BO work performed within or in close connection to the EUROfusion Advanced Computing Hub hosted by the University of Helsinki (ACH 5). The example applications presented encompass runaway electron simulations [5, 6], scrape-off layer (SOL) plasma simulations, integrated plasma scenario simulations, and tokamak experiment design through BO algorithms [7].
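The batch selection step can be sketched with the "Kriging Believer" heuristic (also mentioned in the preceding abstract): after each pick, the GP is conditioned on its own predicted mean there, as if the pending run had already returned that value, before picking the next candidate. The toy objective, fixed RBF kernel, and upper-confidence-bound acquisition below are illustrative assumptions.

```python
# Kriging Believer batch selection with scikit-learn's GP regressor:
# choose a batch of candidates to dispatch to parallel workers.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):
    # Cheap stand-in for a costly model evaluation (maximise this).
    return -(x - 0.6) ** 2

X = np.array([[0.0], [0.5], [1.0]])
y = objective(X).ravel()
grid = np.linspace(0.0, 1.0, 101)[:, None]

def pick_batch(X, y, batch_size=4, kappa=1.5):
    """After each pick, condition the GP on its own predicted mean
    (the 'believed' observation) before choosing the next point."""
    Xb, yb = X.copy(), y.copy()
    batch = []
    for _ in range(batch_size):
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2),
                                      alpha=1e-6, optimizer=None).fit(Xb, yb)
        mu, sigma = gp.predict(grid, return_std=True)
        idx = np.argmax(mu + kappa * sigma)   # simple UCB acquisition
        x_next = grid[idx, 0]
        batch.append(x_next)
        # "Believe" the emulator's mean prediction at the pending point.
        Xb = np.vstack([Xb, [[x_next]]])
        yb = np.append(yb, mu[idx])
    return batch

batch = pick_batch(X, y)
```

Because believing a point collapses the predicted uncertainty there, subsequent picks are pushed elsewhere, which is what makes the batch diverse without waiting for any real evaluation to finish.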
[1] WU, X., et al., Nucl. Eng. Des. 335 (2018) 339-355.
[2] VON TOUSSAINT, U., Rev. Mod. Phys. 83 (2011) 943.
[3] FRAZIER, P.I., A Tutorial on Bayesian Optimization, (2018) arXiv:1807.02811.
[4] HUNT, N., et al., Batch Bayesian Optimization, Thesis: S.M., MIT, 2020.
[5] NARDON, E., et al., Modelling of runaway electron dynamics in tokamak disruptions, Proceedings of the 29th IAEA Fusion Energy Conference, 16–21 October 2023, London, United Kingdom.
[6] JÄRVINEN, A.E., et al., J. Plasma Phys. 88 (2022) 905880612.
[7] CLARTÉ, G., et al., Proceedings of the 4th International Conference on Data-Driven Plasma Science, 16–21 April 2023, Okinawa, Japan.
Laser-driven inertial confinement fusion is an important approach to achieving controllable nuclear fusion. It applies high-power laser pulses or X-rays to ablate the outer surface of a spherical target, leading to a centripetal implosion and an increase in the pressure and temperature of the fuel. To reach the Lawson criterion and thus realize a self-sustaining burning plasma, the fuel must be compressed to several hundred times the solid density and the target temperature raised to over 5 keV. Isentropic compression can achieve this. However, it is not easy to design the optimal target structure and the corresponding laser pulse manually.
Firstly, to realize an efficient implosion, the driving laser pulse and target structure are designed using a random walk method for a given laser energy. This method can quickly optimize the laser pulse and target structure parameters for an efficient isentropic compression of the target. A correlation matrix can also be constructed to analyze the correlations between the parameters.
Secondly, we propose a hybrid optimization method combining the random walk and Bayesian methods to further improve the optimization efficiency. The laser pulses and target structures that produce relatively high areal densities in the random walk optimization are used as the initial sampling data for the Bayesian optimization. This greatly reduces the number of samples required for Bayesian optimization, while the Bayesian optimization compensates for the small step size and low efficiency of the random walk method in the later stages of optimization and reduces the randomness of the optimization process. The hybrid optimization method greatly improves the optimization efficiency and has been applied to experiments for the Double-Cone ignition scheme. We believe that it will play a greater role in future laser fusion experiments.
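The random-walk stage can be sketched as a greedy perturb-and-accept loop. The toy "areal density" objective below stands in for a costly radiation-hydrodynamics simulation, and the accepted samples are the kind of data that could seed a subsequent Bayesian stage; the parameter count and step size are illustrative.

```python
# Greedy random-walk optimisation of design parameters: perturb, evaluate,
# keep the step only if the figure of merit improves.
import numpy as np

rng = np.random.default_rng(2)

def areal_density(p):
    # Toy stand-in for the areal density returned by a rad-hydro
    # simulation; the true optimum here is at p = (0.4, 0.7).
    return -np.sum((p - np.array([0.4, 0.7])) ** 2)

def random_walk(p0, n_trials=2000, step_size=0.05):
    p, best = p0.copy(), areal_density(p0)
    samples = [(p.copy(), best)]     # usable later to seed a Bayesian stage
    for _ in range(n_trials):
        trial = p + rng.normal(scale=step_size, size=p.shape)
        f = areal_density(trial)
        if f > best:                 # keep only improving steps
            p, best = trial, f
            samples.append((p.copy(), best))
    return p, best, samples

p_opt, f_opt, samples = random_walk(np.array([0.0, 0.0]))
```

The fixed step size is exactly the weakness the abstract notes: near the optimum, most perturbations overshoot and are rejected, which is where handing the accumulated samples to a Bayesian optimiser pays off.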
[1] Z. Li, X. H. Yang, H. Xu, G. B. Zhang, B. Zeng, S. J. Chen, Y. Y. Ma, F. Y. Wu, and J. Zhang, Design of laser pulse shapes and target structures by random optimization for direct-drive inertial confinement fusion, Physics of Plasmas 29, 092705 (2022).
[2] Z. Li, Z. Q. Zhao, X. H. Yang, G. B. Zhang, Y. Y. Ma, H. Xu, F. Y. Wu, F. Q. Shao, J. Zhang, Hybrid optimization of laser-driven fusion targets and laser profiles. Submitted.
Understanding the properties of materials when exposed to various plasma temperatures and fluxes is essential to the building and operation of fusion reactors. The Material Plasma Exposure experiment (MPEX) is an instrument currently being developed by the Department of Energy (DOE) for this purpose. MPEX is expected to come online in stages over the next five years. Proto-MPEX, the predecessor to MPEX, operated from 2014 to 2021 and was designed to understand the generation of plasma temperatures and fluxes at orders of magnitude below what will be obtained by MPEX. In this work we summarize research using a stochastic neural network (SNN), a machine learning technique capable of operating under uncertainty, to provide a surrogate model for the Proto-MPEX device (Archibald et al., 2022). We demonstrate that the SNN outperforms a Bayesian neural network (BNN), a standard in the field of machine learning under uncertainty. The development of a robust surrogate of Proto-MPEX will aid in the commissioning and operation of the MPEX device.
References
Archibald, R., Cianciosa, M. and Lau, C., 2022. Improving Predictions Under Uncertainty of Material Plasma Device Operations. In 2022 IEEE International Conference on Big Data (Big Data) (pp. 3402–3407). IEEE.
We present three case studies demonstrating the minimisation or elimination of human intervention from the process of generating data sets with relevance to different problems in tokamak fusion experiment design and control.
The design of new tokamaks, optimisation of plasma scenarios and construction of real-time control systems in tokamaks require a comprehensive and often expensive exploration of plasma and coil configurations over a high-dimensional parameter space. We present a Markov chain Monte Carlo algorithm to produce large libraries of forward Grad-Shafranov solutions without the need for user intervention. The algorithm minimises the resources dedicated to exploring unsuitable equilibria by assigning a score to points in the parameter space based on the properties of the corresponding plasma configurations. New configurations are sampled based on ratios of scores. This allows problematic profiles and numerical issues in the integration of the Grad-Shafranov equation to be circumvented, enables smooth emulation of classic-control matrices, and supports parameter optimisation towards equilibria with desirable properties (e.g., high flux expansion factor).
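The score-ratio sampling idea can be illustrated with a Metropolis-style chain. The toy score below stands in for the quality of a Grad-Shafranov equilibrium, with a zero score representing an unsuitable configuration or a failed solve; the dimensions, bounds and step size are illustrative assumptions.

```python
# Score-ratio MCMC over a parameter space: new configurations are accepted
# with probability min(1, score_ratio), so the chain concentrates effort
# on regions producing desirable "equilibria".
import numpy as np

rng = np.random.default_rng(3)

def score(p):
    """Score of a configuration; zero stands in for an unsuitable or
    non-converging Grad-Shafranov solve."""
    if np.any(np.abs(p) > 2.0):
        return 0.0
    return np.exp(-np.sum((p - 0.5) ** 2))  # favour "desirable" equilibria

def sample_configs(n, p0, step=0.3):
    chain, p, s = [], p0.copy(), score(p0)
    for _ in range(n):
        trial = p + rng.normal(scale=step, size=p.shape)
        s_trial = score(trial)
        # Accept with probability min(1, s_trial / s): the score ratio.
        if rng.random() * s < s_trial:
            p, s = trial, s_trial
        chain.append(p.copy())
    return np.array(chain)

chain = sample_configs(5000, np.zeros(2))
```

Zero-scoring proposals are never accepted, so the chain never wastes a follow-up solve in a region already known to fail, which is the resource-saving behaviour the abstract describes.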
Investigating tokamak scrape-off layer and divertor processes for the purposes of machine design and experiment analysis typically requires employing numerical models, which span a broad range of fidelity, completeness and computational expense. These models (such as UEDGE, SOLPS and Hermes) typically require significant oversight by the operator and are prone to crashes that can be difficult to diagnose. In this work, we report on methods developed to enable the automated creation of large-scale data sets using the code SD1D, including strategies for deployment on HPC systems with minimal resource waste, automated convergence checking, and parameter space exploration.
Finally, we present an active learning pipeline to sample parameter spaces of dynamical systems with possible chaotic solutions, with an application to turbulence simulations. Given a set of simulations and an emulator, the emulated root-mean-square field amplitude yields a prior distribution from which to draw the next simulation parameters, which are then ranked by the uncertainty of the emulator. The prior distribution privileges simulations that are expected to develop a chaotic solution. This approach keeps the computation of expensive simulations to a minimum, while also populating their parameter space efficiently and enabling uncertainty quantification on derived quantities.
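A stripped-down version of such an uncertainty-driven loop is shown below, using a GP emulator and omitting the chaotic-solution prior for brevity; the "simulation" is a cheap stand-in for a root-mean-square field amplitude, and the kernel and candidate grid are illustrative assumptions.

```python
# Active learning: repeatedly run the "simulation" at the candidate point
# where the GP emulator is most uncertain.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulate_rms(x):
    # Cheap stand-in for the RMS field amplitude of an expensive
    # turbulence simulation.
    return np.sin(3.0 * x) ** 2

def fit_gp(X, y):
    return GaussianProcessRegressor(kernel=RBF(length_scale=0.15),
                                    alpha=1e-8, optimizer=None).fit(X, y)

X = np.array([[0.1], [0.9]])          # two initial simulations
y = simulate_rms(X).ravel()
candidates = np.linspace(0.0, 1.0, 101)[:, None]

_, sigma0 = fit_gp(X, y).predict(candidates, return_std=True)

for _ in range(8):
    gp = fit_gp(X, y)
    _, sigma = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(sigma), 0]   # most uncertain candidate
    X = np.vstack([X, [[x_next]]])
    y = np.append(y, simulate_rms(x_next))

_, sigma_final = fit_gp(X, y).predict(candidates, return_std=True)
```

Each new run collapses the emulator's uncertainty in its neighbourhood, so the budget of expensive simulations spreads efficiently over the parameter space; weighting the acquisition by a prior favouring chaotic solutions, as in the abstract, would simply multiply `sigma` by that prior before the argmax.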
During WEST experimental fusion plasma discharges, several diagnostics are employed to collect diverse data. Among these diagnostics, two high-definition cameras operating in the visible spectrum stream real-time video of the plasma inside the vacuum vessel.
The intent of our study is to investigate the possibility of determining the plasma state from this live video for each discharge. Based on the characteristics of the WEST tokamak, five potential plasma states have been identified for this first application: current ramp-up in limited configuration, diverted lower single null, upper single null, double null, and no plasma.
However, real time detection of these states is a complex task. Artificial intelligence (AI) vision techniques, including deep learning algorithms, present a potential solution by enabling the analysis of images and identification of relevant patterns. In this context, the STARE (STAte plasma REcognition) project aims to develop an automated tool that identifies the different plasma states from visible camera video, where the critical attribute for identification is the position of the plasma's contact point against the vessel walls. In this project, we plan to apply a data-driven approach, where machine learning models will be trained on Tokamak discharge images extracted from camera videos. The main objective is then to train these models to recognize the relationships between visual patterns and contact point locations, i.e. plasma state, and use them to classify the current state of the plasma from the live video feed.
The challenge in training classification models lies in providing properly annotated data. In this project in particular, each plasma discharge generates several hundred images that need to be annotated. As a first step, we used an unsupervised learning technique to develop an automatic labeling tool to overcome this challenge and facilitate the generation and validation of candidate labels.
The data used in this study were collected from a variety of plasma discharges: videos from twelve experimental discharges were collected and went through several preprocessing steps for feature extraction. To understand and classify the complex states of the plasma during a WEST discharge experiment, we used a K-means-based approach to categorize our video-frame dataset into distinct clusters based on visual characteristics and similarities.
The K-means approach carries an inherent limitation: the need to predefine the number of clusters (K). To validate and guide the choice of K for our specific application, we employed several metrics: the silhouette score, the Calinski-Harabasz index, and the elbow method.
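As a hedged illustration of how these metrics can guide the choice of K (this is not the STARE implementation: the feature vectors, cluster count, and the decision rule are all synthetic stand-ins), a scikit-learn sketch might look like:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, calinski_harabasz_score

rng = np.random.default_rng(0)
# Stand-in for per-frame feature vectors extracted from the camera videos:
# three well-separated synthetic blobs in an 8-dimensional feature space.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 8)) for c in (0.0, 3.0, 6.0)])

scores = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    scores[k] = {
        "silhouette": silhouette_score(X, km.labels_),
        "calinski_harabasz": calinski_harabasz_score(X, km.labels_),
        "inertia": km.inertia_,  # plotted against k, this gives the elbow curve
    }

# One simple rule among several: pick the K maximising the silhouette score.
best_k = max(scores, key=lambda k: scores[k]["silhouette"])
```

In practice the three metrics would be inspected together rather than reduced to a single argmax.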
The clustering results provided an initial exploration of the data, allowing a preliminary labeled dataset to be built for training. The results were visualized using an in-house tool that lets researchers automatically examine the cluster groups. However, these initial clusters are not optimal, and the next steps of the project aim to refine the results by incorporating domain expertise and WEST diagnostics information for better feature selection. We also aim to apply active learning and to explore supervised classification methods.
A crucial challenge is that disruption predictors must learn from limited data, given the considerable expense of obtaining an extensive experimental dataset on future tokamaks. To reduce the data requirements for future tokamaks, existing knowledge of disruption physics and tokamak discharges can be exploited. IDP-PGFE (Interpretable Disruption Predictor based on Physics-Guided Feature Extraction) exhibits commendable performance, with a True Positive Rate (TPR) of approximately 90% and a False Positive Rate (FPR) of around 10%, when handling a modest number of disruptive discharges, about 20 shots (alongside about 120 non-disruptive discharges), on J-TEXT. However, as the number of disruptive discharges decreases to about 10 shots, the data from a single tokamak become insufficient for training a satisfactory model, resulting in a TPR of about 75% and an FPR of approximately 15%. To overcome this limitation, we have adopted a domain adaptation algorithm called CORAL (CORrelation ALignment) for the disruption prediction task. Through the combined advantages of PGFE and CORAL, a cross-machine disruption prediction performance of TPR ~90% and FPR ~30% can be achieved when transferring knowledge from J-TEXT to EAST using only 10 disruptive discharges (and 100 non-disruptive discharges) from EAST. In the worst-case scenario, no data at all may be available during a future tokamak's initial operation. Therefore, it is crucial to establish a zero-shot disruption prediction model with reliable and satisfactory performance. In recent years, computer vision (CV) and natural language processing (NLP) have produced numerous zero-shot machine learning models, providing a wealth of experience that can be leveraged for disruption prediction. The input for CV tasks consists of pixel values, while NLP tasks involve tokenized words.
Unlike NLP and CV tasks, disruption prediction tasks lack normalized feature inputs. Therefore, we aim to identify more widely applicable normalized input features for disruption prediction. At the same time, we aim to improve the quality of the training data by incorporating more human input, to realize a form of zero-shot disruption prediction.
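The core of CORAL, as referenced above, is a second-order statistics alignment: whiten the source-machine features and re-colour them with the target-machine covariance. A dependency-light sketch follows; this is not the IDP-PGFE code, and the feature dimensions and synthetic "J-TEXT-like"/"EAST-like" data are invented for illustration:

```python
import numpy as np

def coral(source, target, eps=1e-5):
    """CORrelation ALignment: re-colour whitened source features with the
    target covariance so that second-order statistics match."""
    cs = np.cov(source, rowvar=False) + eps * np.eye(source.shape[1])
    ct = np.cov(target, rowvar=False) + eps * np.eye(target.shape[1])

    def mat_pow(m, p):
        # Matrix power of a symmetric PSD matrix via eigendecomposition.
        w, v = np.linalg.eigh(m)
        return (v * np.power(np.clip(w, eps, None), p)) @ v.T

    return source @ mat_pow(cs, -0.5) @ mat_pow(ct, 0.5)

rng = np.random.default_rng(1)
xs = rng.normal(size=(500, 4)) @ np.diag([1.0, 2.0, 0.5, 3.0])  # source features
xt = rng.normal(size=(400, 4)) @ np.diag([2.0, 1.0, 1.5, 0.5])  # target features
xs_aligned = coral(xs, xt)
```

In the actual cross-machine workflow, a predictor would then be trained on the aligned source features and applied to the target machine.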
In this work, a nonlinear model is introduced to determine the vertical position of the plasma column in the Damavand tokamak. Using this model as a simulator, a nonlinear neural network controller has been designed. This controller has also been implemented on a digital signal processor (DSP) control system.
In the first stage, a nonlinear model of the plasma vertical position is identified, based on a multilayer perceptron (MLP) neural network (NN) structure. The model parameters have been estimated by the error back-propagation algorithm using the Levenberg–Marquardt optimization technique, and the model is verified against experimental data from the plant. In the second stage, an MLP neural network controller is designed for the model, and online training is performed to tune the controller parameters. Finally, we implement a neural network controller with offline and online learning for controlling the plasma vertical position on a DSP in the Damavand tokamak. The controller has a direct adaptive neural structure. Gradient descent with momentum and the RPROP algorithm have been used for online learning of the neural controller. To run these algorithms in real time, we optimized the code so that, within the 10 µs sampling time, the controller runs once and the neural network parameters are updated. The practical results show appropriate performance of this controller.
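For readers unfamiliar with the update rule, gradient descent with momentum can be sketched as below. This is only a generic illustration: the learning rate, momentum factor, and toy objective are assumptions, not the Damavand controller values, and the real controller gradients come from the neural network, not a scalar quadratic:

```python
def momentum_step(w, grad, velocity, lr=0.05, beta=0.9):
    """One online update: the velocity accumulates past gradients,
    smoothing noisy sample-by-sample updates."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

# Toy usage: drive a scalar weight toward the minimum of f(w) = (w - 2)^2.
w, v = 0.0, 0.0
for _ in range(500):
    grad = 2.0 * (w - 2.0)  # analytic gradient of the toy objective
    w, v = momentum_step(w, grad, v)
```

RPROP, by contrast, adapts a per-parameter step size from the sign of successive gradients rather than their magnitude.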
Keywords: tokamak, plasma, neural network modeling, neural network controller, online learning, gradient descent with momentum, DSP.
Eni strongly relies on the use of Artificial Intelligence (AI) based solutions, with the purpose of progressively making operations more efficient and sustainable, from enhancing safety for personnel to ensuring the integrity of Eni's assets through predictive maintenance solutions and aiding the Research & Development department in crafting innovative technologies to achieve net-zero emissions. In detail, Eni's approach involves developing end-to-end AI solutions, from data collection to model deployment in a production environment.
Eni has been actively taking part in magnetic confinement fusion research projects since 2018 by engaging global talent in industrial science and technology and contributing to significant Italian and international initiatives. Eni collaborates with research institutions and provides access to its high-performance computing resources for researchers to model and simulate plasma physics.
The company's expertise in AI and data science has recently been employed to address the significant technological challenge of disruption prediction. Disruptions are harmful to tokamaks and a major concern for next-generation devices. They must be identified sufficiently in advance to allow the control system to bring the device back into safer and more stable operational regimes (disruption avoidance) or to perform a controlled shut-down of the discharge (disruption mitigation). AI-based approaches have been widely adopted in recent years to address this challenge, leveraging data from the experimental campaigns of several devices currently operating or in operation over the last decades.
In this framework, we have developed a highly engineered pipeline for building AI-driven disruption prediction models that relies on industrial data-science best practices. Our primary goal is to ensure scalability and reusability across various devices, making the pipeline as device-independent as possible. Consequently, it can be adapted to new devices' data with minimal or no adjustments to the existing code.
At this stage, disruption prediction has been framed as a binary classification problem based on the existing Disruption Prediction using Random Forest (DPRF) model [1,2]. During a shot, the model receives as input features the measurements recorded by different diagnostics [2], processes them, and predicts the "disruptivity", i.e., the probability of being close to a disruption. The disruptivity of consecutive samples is then compared against a predetermined threshold: if it exceeds this threshold for a specified number of samples, an alarm is triggered.
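The thresholding logic described above can be sketched as follows; the threshold value, consecutive-sample count, and disruptivity trace are illustrative assumptions, not the tuned DPRF settings:

```python
def check_alarm(disruptivity, threshold=0.5, n_consecutive=3):
    """Return the index of the first sample at which `disruptivity` has
    exceeded `threshold` for `n_consecutive` samples in a row, or None."""
    run = 0
    for i, d in enumerate(disruptivity):
        run = run + 1 if d > threshold else 0  # reset the run on any dip below
        if run >= n_consecutive:
            return i
    return None

# Toy trace: disruptivity rising as a synthetic shot approaches disruption.
trace = [0.1, 0.2, 0.6, 0.3, 0.7, 0.8, 0.9, 0.95]
alarm_at = check_alarm(trace)  # first index completing 3 consecutive exceedances
```

Requiring several consecutive exceedances is what trades a slightly later alarm for fewer false alarms from isolated noisy samples.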
This pipeline has been validated separately on Alcator C-Mod, DIII-D, and EAST, using publicly available datasets [2,3]. With respect to the results of DPRF [2], we accomplished an overall reduction of both (i) missed alarms at 30 ms and (ii) false alarms. This improvement can be attributed to several factors, including a revised data preprocessing approach, different management of class imbalance, and enhancements in the hyperparameter tuning module.
References:
1. C. Rea et al 2019 Nucl. Fusion 59 096016
2. K.J. Montes et al 2019 Nucl. Fusion 59 096015
3. J.X. Zhu et al 2021 Nucl. Fusion 61 026007
Accurate simulation of fusion plasma turbulence is required for reactor operation and control, but is either too slow or lacks accuracy with present techniques.
The FASTER project aims to circumvent these conflicting constraints of accuracy and tractability and provide real-time capable turbulent transport models with increased physics fidelity for tokamak temperature, density, and rotation velocity prediction through usage of machine learning techniques.
In recent years, a new type of neural network (NN) based quasilinear turbulent transport model has been developed for the simulation of fusion plasmas, giving increasingly promising and fast results and allowing their use in integrated simulations [1,2]. These surrogate models are obtained by training NNs on large datasets of simulations generated with reduced quasi-linear codes like QuaLikiz [3] or TGLF [4]. While extremely powerful, this technique limits the accuracy of the surrogate model to that of the original one.
One way to further improve the capabilities of NN-based quasi-linear models is to train them on datasets generated with higher-fidelity codes. For instance, the linear response of state-of-the-art gyrokinetic flux-tube codes such as GKW [5] or GENE [6] could be used. Thanks to the growth of HPC resources, the generation of a dataset of a few million linear gyrokinetic simulations is now within the reach of a single research group. The size of the dataset can be further increased by mobilizing the community and collecting gyrokinetic simulations performed worldwide. To this end, we have extended the IMAS data model to include a unified standard for the inputs and outputs of gyrokinetic simulations. This standard is used to store gyrokinetic simulation results from different codes in a common database: the GyroKinetic DataBase (GKDB).
The GKDB is designed to be a repository of open source simulation data, a platform for code benchmarking, and a springboard for the development of fast and accurate turbulent transport models. The project is hosted and documented on GitLab (https://gitlab.com/gkdb/).
Thanks to the unified data model used for the database, quasilinear as well as linear and nonlinear simulations can be stored with compatible inputs and outputs. This offers the possibility of building fast quasi-linear models by training neural networks on the linear simulation data and testing their robustness against the nonlinear simulation data. Moreover, code comparison is always challenging due to the different normalizations and conventions used; the IMAS "gyrokinetics" standard greatly facilitates the benchmarking of codes (δf flux-tube gyrokinetic simulations and/or quasilinear models) against each other.
We will give an overview of the FASTER project and present some proofs of concept of database usage, including data access and visualization.
References:
[1] K.L. van de Plassche et al., Phys. Plasmas 27, 022310 (2020)
[2] O. Meneghini et al., Nucl. Fusion 61, 026006 (2020)
[3] G.M. Staebler et al., Phys. Plasmas 14, 055909 (2007)
[4] J. Citrin et al., Plasma Phys. Control. Fusion 59, 124005 (2017) - www.qualikiz.com
[5] A.G. Peeters et al., Comput. Phys. Comm. 180, 2650 (2009) and GKW website
[6] F. Jenko et al., Phys. Plasmas 7, 1904 (2000) and GENE website
Intelligent control of thermal loads is required to guarantee the safety of fusion devices such as W7-X and ITER during quasi-steady-state operation with reactor-relevant performance. This feedback control system should implement preemptive strategies, enabling long-pulse operation by preventing thermal loads from escalating to dangerous levels and minimizing plasma terminations triggered by the interlock system.
Effective thermal load control demands in-depth knowledge about thermal events, including their nature, progression, and associated risks. The attainment of such understanding is possible through the application of machine learning techniques. These machine learning models are designed to promptly detect thermal events, track their evolution, and relay this information to the feedback control system, enabling the activation of suitable mitigation strategies within the required reaction time.
High-performing machines such as ITER or DEMO necessitate thermal load protection from early operation, leaving scarce time for gathering enough data to train deep machine learning models. The answer lies in employing transfer learning: training models on data from current devices and enabling zero- or few-shot learning. Recent developments in artificial intelligence research demonstrate the feasibility of zero-shot learning in diverse domains, provided large-scale models and sufficiently diverse training data are employed. The construction of a substantial multi-device annotated dataset poses a challenge, particularly since manual annotation for video instance segmentation of thermal events is labor-intensive. Thus, we propose using semi-supervised learning techniques such as active learning, coupled with semi-automatic annotation tools, to accelerate this process.
We introduce a multi-device dataset of infrared images, initially incorporating data from W7-X and WEST devices, with the plan to augment with additional devices' data. The intention is to integrate data from both tokamaks and stellarators, showcasing diverse types of first-wall materials, specifically, carbon and metallic walls. Ultimately, our objective is to train a large model on this dataset, capable of executing instance segmentation and classification of thermal events. By employing transfer learning with synthetic data, we aim to accomplish accurate zero-shot learning in new devices such as ITER, thereby paving the path toward the successful operation of future fusion power plants.
Previous work [1,2] has successfully applied neural network (QLKNN) surrogates of the quasi-linear gyrokinetic simulation code QuaLiKiz [3] to predict core tokamak heat and particle transport fluxes, resulting in a 3-5 orders of magnitude reduction in computation time with minimal (up to 10%, case dependent) loss of precision. The current study aims to apply this concept to the gyrokinetic simulation code GKW, which includes electromagnetic fluctuations, important in high-performance regimes, and realistically shaped equilibria, required to more accurately model the edge region where transport barriers develop. However, this model will be trained on the growth rates as opposed to the fluxes, which will allow the development of novel quasi-linear saturation models with the aim of improving the performance of existing quasi-linear codes.
As part of the FASTER project, we will first develop a proof-of-concept neural network (NN) trained to predict instability growth rates using existing QuaLiKiz datasets converted to ITER Integrated Modelling and Analysis Suite (IMAS) standards for this purpose. This allows the creation of a pipeline to train a NN that accepts IMAS-standardised inputs, important for later use with GKW. It also enables the use of QuaLiKiz inputs to build the pipeline, allowing faster testing and validation than using GKW simulations.
This will then be repeated using a newly generated GKW dataset based on the JET experimental domain, which will in turn be used to test quasi-linear models. Using GKW comes at the cost of heavily increased computation times, which for linear simulations range from 1-100 h as opposed to an average of 8 s for a standard QuaLiKiz wavevector scan. The goal of the neural network is therefore to produce results qualitatively similar to GKW simulations in a timeframe similar to QLKNN. This would effectively reduce the simulation time by up to 8 orders of magnitude while increasing the precision of predictions relative to experimental results.
We present the first two major milestones of this study: the development of software to convert QLK simulation data to and from IMAS IDS files, and the preliminary results of the NN surrogate for QuaLiKiz calculating the growth rates and frequencies of the most unstable modes.
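As a rough illustration of the surrogate-regression idea (not the actual QLKNN/GKW pipeline: the inputs, the target function standing in for a growth rate, and the network architecture are all synthetic assumptions), a scikit-learn sketch could be:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Stand-in inputs (think normalised gradients and other local parameters) and a
# smooth synthetic "growth rate" that is zero where the mode is stable.
X = rng.uniform(-1, 1, size=(2000, 4))
y = np.maximum(0.0, X[:, 0] + 0.5 * X[:, 1] ** 2 - 0.3 * X[:, 2] * X[:, 3])

# Small fully connected surrogate trained on 1500 samples, scored on the rest.
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                         random_state=0).fit(X[:1500], y[:1500])
r2 = surrogate.score(X[1500:], y[1500:])  # held-out goodness of fit
```

Once trained, evaluating such a surrogate costs microseconds per point, which is what makes the 8-orders-of-magnitude speed-up over linear GKW runs plausible.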
Inertial confinement fusion, as an important route to controllable nuclear fusion, has been widely studied. Compared to laser-driven fusion, Z-pinch-driven fusion has the advantage of large drive energy. It can provide a large radiation power under current engineering conditions, and the energy absorbed by the target can reach the order of MJ, which creates the conditions for achieving high gain.
In 2008, Prof. X. J. Peng proposed a high-gain target design suitable for Z-pinch drive, i.e., local volume ignition targets. It is a multi-shell target containing two layers of DT ice: the inner fuel is the ignition fuel, and the outer fuel is the main fuel. Such a target improves the robustness of ignition. However, how to improve the energy gain and burn fraction, and thereby ensure ignition of the inner fuel, still needs to be studied in detail.
In this work, we theoretically studied the multi-shell ignition process by deriving implosion formulas that predict the optimal target structure for a given drive energy. The maximum implosion velocity before stagnation can be obtained theoretically, and the neutron yield can also be obtained for the optimal design. Combining the random walk method with radiation hydrodynamic simulations, we can obtain the optimal target structure for a given driving current and Z-pinch structure. This is a very efficient method for target design and can rapidly optimize and improve the yield. The simulation results agree well with our theory. These results should be beneficial for future volume-ignition designs.
A digital twin for plasma dynamics in a tokamak is useful for optimising and validating experimental scenario proposals, developing plasma control systems and more. Physics-based modelling of the entire tokamak discharge process is challenging due to nonlinear, multi-scale, multi-physics characteristics of the tokamak and demands time from a diverse team of experts as well as computational resources to achieve high-fidelity simulations. Furthermore, simulation-time of physics-based models is prohibitively long for some application types. These challenges invite the use of machine learning (ML) for developing a fast and accurate digital twin. However, a ML-based approach brings its own challenges that need to be addressed – one tends to lose physical intuition about the model behaviour and confidence in its fidelity. This work focuses on building the knowledge of the tokamak sub-systems into the ML-based digital twin architecture as a strategy to a) reduce the risk of unphysical behaviour of the digital twin, b) make surrogate digital twin creation simpler, c) maximise its domain of application and d) define a stepwise process of including more physics into the digital twin. This strategy is to be contrasted with using a single neural net (NN) “black box”, such as the one developed for EAST tokamak [1].
A hybrid ML/physics-based digital twin for plasma dynamics in the ST40 spherical tokamak [2], employing recurrent neural networks for time-series prediction, is presented. This NN choice enables simulations with a control system in the feedback loop. Thus far the digital twin incorporates the subset of actuators and measurements relevant to magnetic control. ST40 specifics, such as the merging-compression plasma start-up [3], that guided the choice of the digital twin architecture are discussed.
Representing the ST40 digital twin as a composite model comprised of smaller sub-models proved to be a fruitful strategy, especially for simplifying the development of the digital twin and for increasing the domain of its validity. It is shown that a suitable choice of architecture enables the digital twin to automatically recognise and reproduce plasmaless ST40 operations, plasma startup, plasma flattop dynamics and even some types of disruptions.
The focus on the composite model architecture is complementary to traditional ML approaches that include physics into ML-based models via soft or hard constraints [4], as well as approaches that address fidelity via uncertainty quantification [5]. This project highlights the importance of leveraging the knowledge of the system being modelled into the digital twin architecture and aims to help develop practical strategies and templates.
References:
[1] C. Wan et al Nucl. Fusion 61 (2021) 066015
[2] S.A.M. McNamara et al Nucl. Fusion 63 (2023) 054002
[3] M.P. Gryaznevich and A. Sykes Nucl. Fusion 57 (2017) 072003
[4] T. Beucler et al Phys. Rev. Lett. 126 (2021) 098302
[5] M. Abdar et al Inf. Fusion, 76 (2021), pp. 243-297
In the KSTAR tokamak, frequency-modulated continuous wave (FMCW) reflectometry has been used to measure plasma density profiles with high spatial and temporal resolution. The data analysis process involves extracting time-varying phase differences between the incident swept signal and the reflected signal, and subsequently calculating profiles through a numerical inversion process. However, tracking a distortion-free and seamless phase difference within the spectrogram produced by the complex wavelet transform (CWT) is challenging, because the amplitude inherently changes with the frequency sweeping of the reflectometer components. In addition, clutter is generated by waves reflected at surrounding structures. Here, we present a novel method for tracking the phase difference in time series. In a CWT spectrogram, phase differences can be computed in two ways: from the imaginary part of the complex numbers derived from the CWT, or by integrating the instantaneous frequency. If the ideal time series is selected, both should yield identical results. However, real-world constraints, such as ambient noise and modulated amplitude, cause inconsistencies between them. We treat these discrepancies as a form of loss function and employ statistical inference to track the most probable time series. To exploit hidden information for limiting the possible paths and to compute the phase differences self-consistently across the entire path, our idea is implemented with hidden Markov models solved by the Viterbi algorithm. A comparative analysis between density profiles generated by our method and those from existing reconstruction schemes illustrates the potential of our approach to enable more precise and automated routines, significantly improving the efficiency and accuracy of density profile reconstruction, which is critical for fusion plasma research.
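The Viterbi recursion at the heart of such a tracker can be sketched generically as below. This is not the KSTAR code: in the real application the "states" would be candidate frequency bins of the CWT spectrogram and the scores would come from the loss function described above; the toy scores here are invented:

```python
import numpy as np

def viterbi(log_emission, log_transition):
    """Most probable state path for an HMM: log_emission[t, s] scores state s
    at time t, log_transition[s, s2] scores moving from s to s2."""
    T, S = log_emission.shape
    score = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    score[0] = log_emission[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_transition   # shape: (from, to)
        back[t] = np.argmax(cand, axis=0)               # best predecessor per state
        score[t] = cand[back[t], np.arange(S)] + log_emission[t]
    path = [int(np.argmax(score[-1]))]
    for t in range(T - 1, 0, -1):                       # backtrack the best path
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 3 "frequency bins"; the transition term favours a smooth ridge.
em = np.log([[0.7, 0.2, 0.1], [0.1, 0.7, 0.2], [0.1, 0.2, 0.7], [0.1, 0.2, 0.7]])
tr = np.log([[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]])
ridge = viterbi(np.asarray(em), np.asarray(tr))
```

The transition matrix is what encodes the physical prior that the phase-difference ridge evolves continuously rather than jumping between bins.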
The quality of energetic particle confinement in a nuclear fusion reactor is a key factor in the reactor's efficiency. One way of studying the behavior of energetic particles in detail is to integrate "test particle" trajectories in a previously calculated turbulent electric potential field. The high cost of calculating the turbulent field, and the size of the data, make it very difficult to use this method to integrate trajectories over long periods. This motivates the development of a reduced model for obtaining the turbulent field: a low-cost turbulence generation model would enable much more detailed studies of energetic particle transport in a turbulent field. The generated turbulence must, of course, have characteristics similar to the actually calculated turbulence.
In order to develop a low-cost turbulence generation model, a first step is to represent the turbulent field in a reduced way. For this, a variational auto-encoder (VAE) [2] is relevant. The VAE learns to encode an instantaneous turbulent field (a 2D or 3D matrix) into a reduced-dimensional space (the latent space) and to reconstruct it from this latent representation. For best results, training is performed not on images of the potential but on their Fourier transform.
Once an efficient VAE has been obtained for encoding and decoding a turbulent field, the turbulence must be generated in a relevant way. To achieve this, we study the evolution of the representation of turbulent fields in latent space. A neural network is trained to predict the next position in latent space from the current position. The result is a transition probability function in latent space. The relevance of this function can be improved by taking into account the variation of certain field characteristics (power spectral density, turbulence intensity).
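To make the latent-space transition idea concrete, here is a heavily simplified sketch: a linear least-squares predictor of the next latent position on a synthetic 2-D latent trajectory. The real latents come from the trained VAE and the predictor is a neural network; the rotating toy dynamics and dimensions are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for encoded turbulent-field snapshots: a slowly rotating
# 2-D latent trajectory with small stochastic kicks.
theta = 0.1
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
z = np.zeros((500, 2))
z[0] = [1.0, 0.0]
for t in range(1, 500):
    z[t] = rot @ z[t - 1] + 0.01 * rng.normal(size=2)

# Least-squares linear predictor z[t+1] ~ z[t] @ A; a neural network would
# play this role for genuinely nonlinear latent dynamics.
A, *_ = np.linalg.lstsq(z[:-1], z[1:], rcond=None)
pred = z[:-1] @ A
mse = np.mean((pred - z[1:]) ** 2)  # one-step prediction error
```

Iterating such a one-step predictor (with a stochastic component matching the residuals) is what would generate arbitrarily long synthetic turbulence at low cost.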
The combination of a high-performance VAE and a relevant transition probability function in latent space enables the low-cost generation of a turbulent electric potential. This method is first applied to the Hasegawa-Wakatani model, a 2D model of turbulence in a magnetized plasma. It will then be applied to a 3D potential, obtained with GYSELA5D code [1]. The models are trained on the multi-GPU partitions of the IDRIS Jean Zay supercomputer.
References:
[1] V. Grandgirard, J. Abiteboul, J. Bigot, T. Cartier-Michaud, N. Crouseilles, G. Dif-Pradalier, C. Ehrlacher, D. Esteve, X. Garbet, P. Ghendrih, et al. A 5d gyrokinetic full-f global semi-lagrangian code for flux-driven ion turbulence simulations. Computer physics communications, 207:35–68, 2016.
[2] D. P. Kingma and M. Welling. Auto-encoding variational bayes. Proceedings of the International Conference on Learning Representations, 2014.
Developing reliable control systems for long-pulse operation in fusion devices is crucial and challenging for the development of ITER and DEMO. In this context, two of the most critical issues are the protection of plasma-facing components (PFCs) from high heat loads and disruption prevention. This talk deals with machine learning (ML) tools developed for machine protection against these two issues, focusing on state-of-the-art techniques. ML models for heat-load monitoring and protection from overloads, and for real-time monitoring of disruption risk during the plasma evolution, will be described.
Real-time monitoring of the heat flux (HF) on PFCs is a key objective for high-performance fusion operation. At W7-X, infrared cameras monitor the PFCs by measuring their surface temperature. Typically, the HF is localized in specific regions of the divertor called strike-lines. Since high HF can damage the PFCs, considerable effort is devoted to the estimation of the HF on the divertor tiles and to strike-line control. The THEODOR (Thermal Energy Onto DivertOR) code computes the HF by numerically solving the heat equation, but its computation time does not allow real-time application. A new approach based on Physics-Informed Neural Networks (PINNs) is proposed for solving the heat equation. PINNs are NNs that learn partial differential equations (PDEs) by minimizing the PDE loss in a mesh-free domain. Integrating physical laws into state-of-the-art NN architectures allows PINNs to estimate the HF on the divertor tiles in real time.
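To illustrate the PDE-loss idea only conceptually: a real PINN differentiates the network output by automatic differentiation, whereas this dependency-free sketch uses finite differences and a known analytic solution of the 1-D heat equation. The diffusivity, collocation points, and candidate solutions are all invented for the example:

```python
import numpy as np

def heat_residual(T, x, t, alpha, h=1e-4):
    """PDE residual T_t - alpha * T_xx, evaluated by central differences.
    A PINN would minimise the mean square of this residual over collocation
    points, with derivatives supplied by automatic differentiation."""
    T_t = (T(x, t + h) - T(x, t - h)) / (2 * h)
    T_xx = (T(x + h, t) - 2 * T(x, t) + T(x - h, t)) / h ** 2
    return T_t - alpha * T_xx

alpha = 1.0
# An exact solution of the heat equation, and a deliberately wrong candidate.
exact = lambda x, t: np.exp(-alpha * np.pi ** 2 * t) * np.sin(np.pi * x)
wrong = lambda x, t: np.exp(-0.5 * alpha * np.pi ** 2 * t) * np.sin(np.pi * x)

# Collocation points where the physics loss is evaluated.
xs = np.linspace(0.1, 0.9, 9)
ts = np.full_like(xs, 0.05)
loss_exact = np.mean(heat_residual(exact, xs, ts, alpha) ** 2)
loss_wrong = np.mean(heat_residual(wrong, xs, ts, alpha) ** 2)
```

The training signal is exactly this contrast: functions satisfying the PDE incur near-zero physics loss, so minimising it pushes the network toward physical solutions without a labelled dataset.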
Moreover, a deep CNN was trained to learn an inverse model, determining the control-coil currents (actuators) necessary to achieve a desired HF distribution (desired state) at W7-X. Control coils were installed for active control of the thermal power distribution on the divertor. The HF images were obtained by analysis of thermographic data. Understanding and modelling the relationship between the HF distribution in the strike-lines and the actuators influencing them is an important step toward strike-line control.
Disruptions are an unforeseen loss of plasma confinement inducing thermal loads on PFCs and electromagnetic forces on surrounding structures. Although present devices are not severely affected by disruptions, the consequences of disruption events for future tokamaks and reactors could be ruinous, due to the higher amounts of stored thermal and magnetic energy. Disruption causes are not always understood, and first-principles models can exhaustively explain only some disruption dynamics. Their prediction, mitigation and avoidance are critical needs for the success of next-step fusion devices. Different ML models applied to disruption prediction in tokamaks will be presented, including supervised techniques, such as CNNs, and unsupervised techniques, such as the Self-Organising Map (SOM) and ISOmetric MApping (ISOMAP). Each technique brings its own specific advantages: CNNs efficiently manage the spatiotemporal information from the plasma profiles (temperature, density and radiation), together with other commonly used diagnostic signals. SOMs and ISOMAP need no human intervention during training: in particular, labelling the samples of disruptive discharges, which supplies the model with information about the presence of disruption precursors at each time instant, is not required.
Plasmas and plasma-enabled technologies are pervasive in everyday life, but their nonlinear, multiscale behaviors bring challenges for understanding, modeling, and controlling these systems. Accurately revealing the physical mechanisms of plasmas may provide crucial information for successful real-time plasma control in tokamak discharges. However, the computational demands of a realistic description of system sizes and timescales remain challenging for many problems. With the emergence and development of artificial intelligence (AI) and deep learning (DL) algorithms, this has changed, especially in fluid mechanics. In this report, we present a deep learning-based surrogate model for predicting multi-scale, multi-mode instability under real-time conditions.
The surrogate models are developed for tokamak plasmas. In this research, the schematic workflow for constructing the models is defined. The surrogate models can predict transport properties in both the linear and nonlinear phases, and with further development they can predict the interaction of multi-scale, multi-mode instabilities. The newly constructed surrogate models show sufficient ability to reproduce the corresponding simulation results. Furthermore, the DL model can accurately capture the physical characteristics of MHD and micro-instabilities in tokamaks at a computational cost of only a small amount of CPU time. The DL-based models may thus play a vital role in future 'intelligent control' of tokamak devices.
Disruption prediction and mitigation is a crucial topic, especially for future large-scale tokamaks, due to the harmful effects of disruptions on the devices. Recent progress has shown that deep neural networks can accurately predict coming disruptions by learning from historical experimental data, making them a potential solution for disruption prediction in future devices [1,2]. This technical route has also been proven on HL-2A by offline testing and online experiments [3,4]. However, a key issue is whether deep learning models can be developed on future devices, which can tolerate only a few disruptions and therefore cannot provide much training data [5]. In this research, a predict-first neural network (PFNN) is developed. Two predict-first tasks are designed to embed physical knowledge into the neural network. Ablation experiments show that the embedded physical information significantly improves the algorithms' performance, especially when the amount of training data is limited.
The first predict-first task is to let the neural network predict the evolution of the electron temperature (Te), electron density (Ne) and horizontal displacement (Dh) from the control actuators and the plasma state. A preparatory neural network based on the encoder-decoder framework [6] is trained on this task, with three empirical equations embedded in the design of its structure. After training, the neural network has learned the three equations and can accurately predict the evolution of the parameters. This preparatory network can then be used as a feature extractor in the disruption prediction algorithm, exploiting the three equations to improve the performance of disruption prediction.
The second predict-first task is to mask part of the experimental data and let the neural network restore it. Another preparatory neural network, based on a masked auto-encoder framework [7], is trained for this purpose. It can realistically reconstruct the masked parts from the unmasked parts and the correlation between different input signals. This preparatory network can likewise serve as a feature extractor in the disruption prediction algorithm, exploiting the correlation between input signals to improve prediction performance.
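The intuition behind the masked task can be sketched with a much simpler stand-in: when diagnostic channels are correlated, a masked channel can be restored from the surviving ones. In the sketch below a least-squares fit plays the role of the masked auto-encoder; all signals, shapes and coefficients are illustrative, not HL-2A data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three correlated "diagnostic" channels: ch2 is a noisy linear mix of ch0, ch1.
t = np.linspace(0, 1, 500)
ch0 = np.sin(2 * np.pi * 5 * t)
ch1 = np.cos(2 * np.pi * 3 * t)
ch2 = 0.7 * ch0 - 0.4 * ch1 + 0.01 * rng.standard_normal(t.size)
X = np.stack([ch0, ch1, ch2], axis=1)

# "Mask" channel 2 on a held-out window; fit the inter-channel correlation
# on the unmasked samples (least squares stands in for the auto-encoder).
train, test = slice(0, 400), slice(400, 500)
A, *_ = np.linalg.lstsq(X[train, :2], X[train, 2], rcond=None)

# Restore the masked window from the surviving channels.
ch2_restored = X[test, :2] @ A
rmse = np.sqrt(np.mean((ch2_restored - X[test, 2]) ** 2))
print(f"reconstruction RMSE: {rmse:.4f}")
```

The same principle, learned nonlinearly over many shots, is what lets the preparatory network act as a physics-aware feature extractor.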
Ablation experiments show that the embedded physical information significantly improves the algorithm's performance. When the training set is limited to 1283 shots, the AUC (area under the receiver operating characteristic curve) of the PFNN is 5% higher than that of an ordinary network. In general, the PFNN, pretrained on the predict-first tasks to learn physical information and then trained for disruption prediction in a second step, performs much better when training data are limited. It is a potential solution to the disruption prediction problem of future tokamaks.
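For reference, the AUC metric quoted above can be computed directly from per-shot labels and predictor scores via the Mann-Whitney U statistic; the labels and scores below are hypothetical.

```python
import numpy as np

def auc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    # Rank all scores (1-based; ties are ignored in this sketch).
    order = s.argsort()
    ranks = np.empty_like(s)
    ranks[order] = np.arange(1, s.size + 1)
    n_pos, n_neg = y.sum(), (~y).sum()
    u = ranks[y].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# Hypothetical per-shot labels (1 = disruptive) and predictor scores.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```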
References
[1] J. Kates-Harbeck et al., Nature 568 (2019) 526-531.
[2] J. Zhu et al., Nucl. Fusion 61 (2021) 114005.
[3] Z. Yang et al., Nucl. Fusion 60 (2020) 016017.
[4] Z. Yang et al., Fusion Eng. Des. 182 (2022) 113223.
[5] P.C. de Vries et al., Fusion Sci. Technol. 69 (2016) 471-484.
[6] K. Cho et al., arXiv:1406.1078 (2014).
[7] K. He et al., Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (2022) 16000-16009.
Michael Churchill (PPPL, USA), Zongyu Yang (SWIP, China), Marcin Jakubowski (IPP, Germany)
Alessandro Pau (EPFL-SPC, Switzerland), Michael Churchill (PPPL, USA), Fuyuan Wu (SJTU, China)
The European High Performance Computing Joint Undertaking (EuroHPC JU) is a joint initiative between the European Union, European countries and private partners to develop a world-class supercomputing ecosystem in Europe. It pools resources from the stakeholders and coordinates their efforts to invest in research, innovation and the deployment of a world-leading HPC infrastructure in Europe. The EuroHPC JU seeks to provide computing capacity, improve cooperation in advanced scientific research, boost industrial competitiveness, and ensure European technological and digital autonomy. The EuroHPC infrastructure, which includes some of the largest supercomputers in the world, is available to researchers for open research and development. Moreover, the EuroHPC JU launches open calls for proposals to fund research, innovation and training in HPC.
Predictive and reliable simulations of fusion plasmas provide one important pathway towards accelerating fusion research. A popular approach for efficiently computing the dynamics of turbulent systems for a wide range of applications is the Large Eddy Simulation (LES) technique. Here, the system is simulated with only the largest scales resolved explicitly, while the unresolved scales are accounted for by a Sub-Grid-Scale (SGS) model. In this presentation, we will give this old idea a new twist. Specifically, we will develop an SGS model based on a Neural Network (NN) with Learned Corrections (LC) on the resolved scales to create a hybrid numerical and ML approach. As will be demonstrated, by using a non-propagated field, this approach can be remarkably effective allowing us to cut off virtually the entire inertial range, just retaining the drive range. This is fundamentally different from previous studies, which focused on the much simpler problem of removing diffusion-dominated scales in the dissipation range. However, removing (large) parts of the inertial range while retaining the integrity of the cascade dynamics has been the major challenge facing LES approaches. Simply extending approaches that work within the dissipation range to the inertial range is typically not a viable option. Here, we introduce a model that is able to overcome these difficulties and do so very efficiently. In fact, it is able to produce physically indistinguishable results even when removing (large) parts of the inertial range, while allowing for a relative speedup of about three orders of magnitude.
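The structure of such a hybrid step, coarse-grid solver plus additive learned correction, can be sketched as follows. The sketch uses a 1D diffusion operator and an untrained linear layer as a stand-in for the trained sub-grid-scale network; all sizes, coefficients and the physics itself are illustrative and do not reproduce the presented model.

```python
import numpy as np

N = 64                       # coarse-grid points (inertial range truncated)
dx, dt, nu = 1.0 / N, 1e-4, 1e-3

rng = np.random.default_rng(1)
W = 0.01 * rng.standard_normal((N, N))   # placeholder "learned" weights

def coarse_step(u):
    """Explicit diffusion step on the coarse grid (resolved scales only)."""
    lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    return u + dt * nu * lap

def sgs_correction(u):
    """Stand-in for the trained SGS network acting on the resolved field."""
    return W @ u

def hybrid_step(u):
    # Learned Corrections: numerical update plus NN contribution.
    return coarse_step(u) + dt * sgs_correction(u)

u = np.sin(2 * np.pi * np.arange(N) * dx)
for _ in range(100):
    u = hybrid_step(u)
print(u.shape, np.isfinite(u).all())
```

The claimed three-orders-of-magnitude speedup comes from evolving only the N resolved modes while the correction term supplies the missing scale interactions.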
Quantum computing promises to deliver large gains in computational power that can potentially have a beneficial impact on a number of Fusion Energy Science (FES) application areas that rely on either intrinsically classical or intrinsically quantum calculations. This work presents an overview of our recent efforts [1] to develop and extend quantum algorithms for FES-relevant calculations, and presents concrete examples of quantum computations on present-day quantum computing hardware platforms. We have developed quantum algorithms that can: (1) exactly simulate the Liouville equation [2], even for nonlinear non-Hamiltonian, e.g. dissipative, classical dynamics; (2) perform efficient eigenvalue estimation for generalized eigenvalue problems common in plasma physics and MHD theory [3]; (3) efficiently implement nonlinear wave-wave interactions [4]; and (4) efficiently explore chaotic quantum and classical dynamics [5,6].
Simplified versions of these quantum algorithms have been implemented on state-of-the-art cloud-based superconducting architectures such as the IBM-Quantum Experience and Rigetti Quantum Cloud Services platforms to test the fidelity of emerging quantum hardware capabilities. We have also implemented some of these algorithms on the LLNL Quantum Design and Integration Testbed (QuDIT), which has novel capabilities such as the ability to work with more than two energy levels per transmon and the ability to synthesize arbitrary unitary gates (for small qubit numbers) using optimized control pulses. These hardware platforms have been used to simulate a nonlinear three-wave interaction problem [4] and a three-level Grover’s search algorithm. We have also explored the ability of the IBM-Q platform to simulate chaotic dynamics through the quantum sawtooth map [5,6], as well as a number of the building blocks of the quantum variational eigensolver algorithm. The fidelity of the experimental results matches noise models that include decay and dephasing processes and highlights key differences between state-of-the-art approaches to quantum computing hardware platforms.
*LLNL-ABS-833451 was prepared by LLNL for U.S. DOE under Contract DE-AC52-07NA27344 and was supported by the U.S. DOE Office of Fusion Energy Sciences “Quantum Leap for Fusion Energy Sciences” project SCW1680.
References
[1] I. Joseph, Y. Shi, M. D. Porter, A. R. Castelli, V. I. Geyko, F. R. Graziani, S. B. Libby, J. L. DuBois, “Quantum computing for fusion energy science applications,” Phys. Plasmas 30, 010501 (2023).
[2] I. Joseph, “Koopman-von Neumann approach to quantum simulation of nonlinear classical dynamics,” arXiv:2003.09980, Phys. Rev. Research 2, 043102 (2020).
[3] J. B. Parker, I. Joseph. “Quantum phase estimation for a class of generalized eigenvalue problems,” arXiv:2002.08497, Phys. Rev. A 102, 022422 (2020).
[4] Y. Shi, A. R. Castelli, X. Wu, I. Joseph, V. Geyko, F. R. Graziani, S. B. Libby, J. B. Parker, Y. J. Rosen, L. A. Martinez, and J. L. DuBois, “Quantum computation of three-wave interactions with engineered cubic couplings,” arXiv:2004.06885, Phys. Rev. A 103, 062608 (2021).
[5] M. D. Porter, I. Joseph, “Observability of fidelity decay at the Lyapunov rate in few-qubit quantum simulations,” Quantum 6, 799 (2022).
[6] M. D. Porter, I. Joseph, “Impact of dynamics, entanglement, and Markovian noise on the fidelity of few-qubit digital quantum simulation,” arXiv:2206.04829, submitted to Quantum (2022).
Plasma diagnostics are essential tools for understanding and improving plasma stability in fusion devices, providing useful information for the analysis and understanding of physical phenomena.
Infrared (IR) thermography is an important diagnostic for machine protection and plasma control, especially for future fusion devices working with long plasma pulses, such as ITER or DEMO. Real-time monitoring of the surface temperatures of the Plasma Facing Components (PFCs) is essential to ensure safe long-term operation of the machine and to optimize the performance of the generated plasma. Fast infrared cameras allow Thermal Events (TEs) to be detected and tracked in real time and therefore assure efficient thermal load control. Good examples are the divertor and first-wall protection systems.
The image acquisition and processing system is composed of an image detector (digital camera), a frame grabber that receives the video stream, and a device responsible for image acquisition and processing in real time. Based on the IR images, the system can calculate important parameters, such as the maximum divertor temperature, the surface temperature of the PFCs, the heat flux, or the power load. Knowledge of the surface emissivity is required for these calculations. The algorithms can be even more complex for machines using tungsten components and working with longer plasma pulses, because the emissivity can change during the pulse and depends on temperature and surface conditions. In such cases, algorithms using Artificial Intelligence (AI) and Machine Learning (ML) could be useful and are planned for future machines.
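To illustrate why the emissivity matters, a grey-body Stefan-Boltzmann inversion is sketched below. Real IR thermography works in a finite spectral band with camera calibration, so this total-radiance form is only a simplified illustration; all numbers are hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def surface_temperature(radiant_exitance, emissivity):
    """Grey-body inversion: M = eps * sigma * T^4  =>  T = (M / (eps*sigma))^(1/4).

    Shows why the surface emissivity must be known to recover the true
    temperature from a radiance measurement.
    """
    return (radiant_exitance / (emissivity * SIGMA)) ** 0.25

M = 0.8 * SIGMA * 1000.0**4           # exitance of a 1000 K surface, eps = 0.8
print(surface_temperature(M, 0.8))    # → recovers 1000.0 K
print(surface_temperature(M, 1.0))    # wrongly assuming eps = 1 underestimates T
```

A tungsten wall whose emissivity drifts during the pulse therefore needs the emissivity tracked (or learned) alongside the radiance, which motivates the AI/ML approaches mentioned above.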
In the case of large-scale fusion devices, imaging diagnostics form a complex, distributed, hard real-time system composed of dozens of IR or VIS cameras installed at multiple places in a tokamak or stellarator, observing protected machine components such as the divertor or other plasma facing components. Calculating complex measurements in real time can require significant processing power and various parallel computation devices, such as multicore CPUs (Central Processing Units), GPUs (Graphics Processing Units) or FPGAs (Field Programmable Gate Arrays). In addition, all cameras should be synchronized, and the system should provide measurements in real time for plasma control and machine protection. The system should be designed with redundancy to improve reliability, and it should assure high availability.
Acquiring and processing images from IR cameras requires a flexible hardware platform that provides sufficient capacity, processing power and synchronization. The architecture of a scalable image acquisition and processing system with improved reliability will be presented and discussed. The system was developed using the MicroTCA.4 standard, which provides the required scalability and reliability. The system can be connected to external multicore computers equipped with GPU accelerators that deliver the required processing power. The PCI Express interface assures low latency during image acquisition and processing.
To reconstruct the electron density profile of a tokamak plasma, the phase of the reflected microwave must be captured by tracing the signal along the frequency axis as a path on the spectrogram. From the blurry and broken lines among unwanted clutter, the actual path should be estimated as a continuation of the tone heights at each vertical slice of the wavelet transform. We therefore turn to the recent idea of Gaussian-derivative wavelets as a means to revive the vanished traces that are subject to nodal points of modulated amplitude. As it is critical to recognize the vanishing patterns, where the Gaussian derivative is preferable to the common spectral method of the Morlet wavelet, we introduce a probabilistic model of the spectral signal as a mixture of basis functions. First, the bases are prepared from the ideal beat signal of sinusoidal waveform, and the feasibility of the idea is checked as a preliminary attempt to maximize the log-likelihood of the spectral data with a fixed number of mixture patterns. Then, a Bayesian inference of the variational mixture is developed for phase recovery in our microwave reflectometry. Finally, a generative neural network (NN) is proposed as a surrogate model with latent parameters, which deforms the ideal bases to improve the probabilistic model by training the network on real data.
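A minimal sketch of a Gaussian-derivative wavelet transform is given below, using direct convolution in NumPy. The normalization, scale grid and toy beat signal are illustrative and do not reproduce the probabilistic mixture model described above.

```python
import numpy as np

def gauss_deriv_wavelet(scale, n=None):
    """First derivative of a Gaussian, sampled and L1-normalised."""
    if n is None:
        n = int(10 * scale) | 1          # odd support, roughly +/- 5 sigma
    x = np.arange(n) - n // 2
    psi = -x * np.exp(-0.5 * (x / scale) ** 2)
    return psi / np.abs(psi).sum()

def cwt_gauss_deriv(signal, scales):
    """Continuous wavelet transform with a Gaussian-derivative mother wavelet."""
    return np.stack([
        np.convolve(signal, gauss_deriv_wavelet(s), mode="same")
        for s in scales
    ])

t = np.arange(1024)
beat = np.sin(2 * np.pi * t / 64)              # toy beat signal, period 64 samples
coeffs = cwt_gauss_deriv(beat, scales=np.arange(2, 40))
ridge = np.abs(coeffs).mean(axis=1).argmax()   # index of strongest-response scale
print(coeffs.shape, ridge)
```

Tracing the ridge of such a coefficient map, column by column, is the "imaginary pathway" continuation problem the abstract's mixture model addresses.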
In large tokamak reactors, a single unmitigated disruption can cause intolerable damage. An accurate plasma disruption prediction system is needed to trigger the disruption mitigation system (DMS). Machine learning disruption predictors are currently the most promising way of solving this problem, but they need data from the target machine for training, and a future machine will not be able to provide enough data, in either quality or quantity, to satisfy this requirement. In this paper we address this issue from three different perspectives. First, we use deep neural networks to learn common representations of disruptions, and fine-tuning to transfer the learned predictors to a new machine. Through several numerical experiments we show that this method can transfer disruption precursor knowledge from J-TEXT to EAST. The second approach is to extract machine-independent features. Like the previous method, it attempts to find a feature space common to multiple tokamaks, but instead of deep learning it relies on expert knowledge. With the right techniques, high accuracy can be achieved with little or even no target-machine data. Lastly, we train a disruption predictor from scratch: the very first shot is used to train a predictor, which is then put online to trigger the DMS, and training continues after each subsequent shot. With the right tuning, this predictor can protect the machine from the second shot onward. Although these three methods are not perfect, they show feasible routes to cross-tokamak disruption prediction. Moreover, future disruption predictors may be ensembles of different strategies.
Fusion plasma devices have generated, over the past years, large numbers of shots exhibiting Alfvén activity, which is usually detected using external magnetic sensors (Mirnov coils), but also by other diagnostics. The behaviour of Magnetohydrodynamic (MHD) modes is commonly analysed through spectrograms of Mirnov signals. However, extracting physical information about individual mode activity from the entire spectrogram remains, in most situations, a tedious manual task. In addition, in view of the application of supervised machine learning algorithms $[1]$, large collections of manually labelled datasets are needed. These needs motivate the work presented here, where we explore the possibilities of automatically labelling MHD modes using unsupervised learning.
Based on the observation that MHD modes are generally sparse in the frequency domain, several approaches are discussed. First, a mode decomposition of the time signals based on dictionary encoding $[2]$ is proposed, allowing the use of clustering algorithms for labelling MHD modes in TJ-II stellarator signals $[3]$. In other words, this algorithm can decompose the signal into a collection of waveforms that can be grouped to identify mode types. The proposed algorithm $[2]$ has been adapted for GPU acceleration, increasing speed by a factor of ten and, in particular, reducing the memory usage by avoiding allocation of the dictionary matrix, whose size would be of the order of terabytes. Moreover, we show in Figs. 1-2 that the addition of other diagnostic signals, such as electron density, plasma current, or plasma energy, can enrich the information used by the clustering, leading to more meaningful identified clusters.
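As a toy stand-in for this pipeline, the sketch below learns a sparse dictionary over synthetic spectrogram slices and clusters the resulting codes with k-means. The scikit-learn dictionary learner, sizes and templates are illustrative and differ from the adapted GPU algorithm of $[2]$.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.cluster import KMeans

# Synthetic "spectrogram slices" mixing two narrow-band mode templates.
rng = np.random.default_rng(0)
freq_bins = 64
template_a = np.exp(-0.5 * ((np.arange(freq_bins) - 15) / 2.0) ** 2)
template_b = np.exp(-0.5 * ((np.arange(freq_bins) - 45) / 2.0) ** 2)
slices = np.array([
    (template_a if i % 2 else template_b) + 0.05 * rng.standard_normal(freq_bins)
    for i in range(200)
])

# Sparse dictionary encoding: each slice becomes a short code vector.
dico = MiniBatchDictionaryLearning(n_components=8, alpha=0.5,
                                   batch_size=32, random_state=0)
codes = dico.fit_transform(slices)          # shape (200, 8)

# Cluster the sparse codes to group "mode types" without labels.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(codes)
print(codes.shape, np.unique(labels))
```

Appending other diagnostic signals (density, current, energy) to each code vector before clustering is the enrichment step the abstract describes.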
Second, different time-frequency representations, such as wavelet analysis $[4]$ and the Hilbert-Huang transform $[5]$, are also employed to extract features from the JET tokamak's Mirnov data $[3]$. The results reveal that instantaneous-frequency representations are necessary for studying micro-time-scale characteristics of signals such as sawteeth or pellet injections. Moreover, we show in Fig. 3 that applying the discrete wavelet transform to spectrograms enables a reliable mode clustering process, as stationary modes can be de-noised and studied independently of frequency-sweeping (chirping) modes.
The GPU-accelerated unsupervised learning algorithms presented here, applied to fusion experimental data, can provide unprecedented feature extraction from fusion diagnostics. The results allow the automatic analysis of individual modes at different time scales, and the isolation of each mode signal from background noise by tuning the regularization and scale hyperparameters. Future work is required to tune hyperparameters, incorporate mode numbers, and examine the physical meaning of the mode clustering.
REFERENCES
$[1]$ BUSTOS et al., PPCF 63 9 (2021) 095001.
$[2]$ RICHARDSON et al., arXiv:2204.06108 (2022).
$[3]$ ZAPATA et al., (In preparation).
$[4]$ MALLAT, Academic Press, Elsevier, London (2009).
$[5]$ HUANG et al., Adv. Adapt. Data Anal. 01 02 (2009) 177.
Plasma disruptions pose a major threat to burning plasma devices. As a part of avoidance, mitigation, resilience, and recovery (AMRR) efforts, it is desirable to develop emergency shutdown scenarios that 1) minimize ramp down time while avoiding disruptions, and 2) adapt to the real-time conditions of the plasma. Prior works involved performing a constrained trajectory optimization on the transport solver RAPTOR to minimize ramp down time while avoiding disruptive limits, but such an approach is not immediately amenable to allowing the plasma control system (PCS) to adapt to new real-time conditions. In this work, we adopt a reinforcement learning approach to solving this problem by training a control policy on a POPCON-like (Plasma OPerational CONtours) model that outputs current ramprate, auxiliary heating, and fueling commands. As the control policy is a small neural network, it is computationally cheap enough to be used for real-time active disruption avoidance. In addition, we demonstrate how this control policy can aid in offline scenario design by inputting trajectories generated from policy rollout into RAPTOR to produce faster rampdowns that avoid disruptive limits.
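The kind of lightweight policy described above can be sketched as a small multilayer perceptron. The weights below are random stand-ins for the result of reinforcement-learning training, and the state and action dimensions are illustrative.

```python
import numpy as np

# Toy policy: a small MLP maps a plasma-state vector to three normalised
# actuator commands (current ramp rate, auxiliary heating, fueling).
rng = np.random.default_rng(0)
n_state, n_hidden, n_action = 10, 32, 3
W1 = rng.standard_normal((n_hidden, n_state)) / np.sqrt(n_state)
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_action, n_hidden)) / np.sqrt(n_hidden)
b2 = np.zeros(n_action)

def policy(state):
    """Two-layer MLP; tanh keeps each command in the normalised range [-1, 1]."""
    h = np.tanh(W1 @ state + b1)
    return np.tanh(W2 @ h + b2)

action = policy(rng.standard_normal(n_state))
print(action.shape, float(np.abs(action).max()) <= 1.0)
```

A network this small evaluates in microseconds, which is what makes a trained policy cheap enough for real-time use in a plasma control system.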
Single-chord interferometry is widely used in plasma physics to obtain the line-integrated density of a plasma. In this work, we propose the use of a (deep) neural network (NN) to assist in the development of a novel diagnostic technique which allows the estimation of the plasma density profile from a single interferometry measurement. The purpose of the NN is to solve the inverse-scattering problem of a high frequency microwave beam that penetrates a similar-sized plasma. The NN is applied to an atmospheric plasma torch [1] and is trained on data obtained from a Finite-Difference Time-Domain full-wave simulation code [2]. The data provided for the training are the transverse profiles of the wave electric field of the probing microwave beam after traversing the plasma. The plasma density profile can be arbitrarily set in the simulation domain, which enables a wide range of scenarios to be explored. Ideally, the NN is then capable of linking the scattering profile to the plasma density profile. To test the validity of the simulations, experiments were performed with a similar setup: a wave power profile is obtained by moving the receiving antenna (using a stepping motor) of an interferometer in the plane perpendicular to the plasma torch. Finally, the experimental result is fed into the NN, which estimates the real density profile of the plasma torch.
References
[1] M. Leins et al, Contrib. Plasma Phys. 54, 1 (2014).
[2] A. Köhn et al, Plasma Phys. Control. Fusion 50, 085018 (2008).
The decay photon fields produced by components activated during the normal operation of a nuclear plant are of particular interest for the maintenance and decommissioning functions of the plant itself, due to the threat they may pose to the health of exposed workers and to the integrity of electronic components. For the design of the shielding for these radiation fields, extremely accurate Monte Carlo (MC) calculations are carried out using transport codes such as MCNP.
However, performing an MC calculation is a time-consuming process, which limits the opportunities during building design conceptualization, where the high accuracy of MC is not a key requirement. To overcome this problem, alternative methods such as the Point Kernel (PK) approach are often employed. The PK method is an analytical technique that calculates the contribution of photon radiation from a source point to a measurement point. It accomplishes this by estimating (i) the direct contribution, modelled as a ray attenuating along the thickness of the medium between the source and the point, and (ii) the indirect contribution, by introducing a build-up factor. However, the estimation of the build-up factor remains an obstacle to obtaining a good dose estimate, due to the inherent limitations of the models used, such as their low spatial resolution.
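The two contributions can be written compactly; the sketch below implements the standard point-kernel formula for an isotropic point source behind a slab, with hypothetical numbers and illustrative units.

```python
import math

def point_kernel_flux(source, mu, thickness, distance, buildup=1.0):
    """Point-kernel estimate of photon flux at a point:

        phi = B * S * exp(-mu * t) / (4 * pi * r^2)

    where the exponential gives the direct (uncollided) attenuation and the
    build-up factor B accounts for the indirect (scattered) contribution.
    Units are illustrative (e.g. mu in 1/cm, distances in cm).
    """
    return buildup * source * math.exp(-mu * thickness) / (4 * math.pi * distance**2)

S, mu, t, r = 1e10, 0.1, 5.0, 100.0        # hypothetical source and geometry
direct = point_kernel_flux(S, mu, t, r)                # B = 1: uncollided only
total = point_kernel_flux(S, mu, t, r, buildup=2.0)    # with a build-up of 2
print(direct, total / direct)
```

Because B multiplies the whole estimate, any error in the build-up model propagates directly into the dose, which is the quantity the DNN described below is trained to predict.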
To solve this problem, we propose a novel approach that exploits deep neural networks (DNNs) to estimate build-up factors. To this end, we trained the network with thousands of MCNP simulations, varying the energy of the source, its position relative to the measurement point, and the thickness of the medium between the source and the measurement point for a single material, with the aim of achieving fine-grained coverage of all possible case studies. Our method demonstrates better performance and spatial resolution than currently used techniques. This approach will then be implemented in a popular CAD program (SpaceClaim), providing a Neural Network Assisted CAD-based Analytical Real-Time Estimator (NACARTE) that could give designers and engineers immediate photon dose estimation for real-time design iterations.
Artificial Intelligence (AI) and Machine Learning (ML) have established themselves as the standard toolkit for image processing. This is also true in experimental science, where the tools and methods of AI/ML (use case definition, annotated database creation, learning and inference) prove fruitful at solving science and technology challenges ill-addressed by conventional processes. At WEST, a tungsten medium-size steady-state tokamak preparing ITER operation, AI/ML is used to process images from the Infrared (IR) viewing diagnostic. Half of the vessel's internal components, acting as thermal shields, are monitored with 10 viewing lines for machine protection and science, resulting in typically 10⁷ thermal images per experimental campaign. Two AI/ML processes developed at WEST, for thermal event recognition and true surface temperature measurement, are presented here.
First, a wall hot-spot detection and classification process is in operation. The automated detection aids the human experts, who monitor the internal components after each plasma experiment by viewing the IR movies to identify potential thermal issues. The detector operates as a phenomenological tool based on previous knowledge. It uses the Faster R-CNN algorithm to detect wall hot spots and write them to a SQL database for downstream automatic expertise. The tool runs after the plasma discharges and performs well for customary hot spots such as divertor strike lines. The mean average precision (mAP@0.50) currently stands at 60%, a good performance by AI standards, and should improve with refinement of the event taxonomy and growth of the database. The database size is a key challenge, especially for deep learning models, which require enough annotated images, typically > 10⁵, with sufficient variety. Semi-Supervised Learning (SSL) is being investigated to exploit the large set of unlabelled images from previous experimental campaigns. An SSL test using a student-teacher architecture improves the mAP by 5% compared with the supervised process, a large gain by learning standards. Moreover, SSL proves effective for domain adaptation, enabling the detection engine to be adapted quickly to a new experimental configuration, or later to another tokamak.
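The downstream storage step can be sketched with Python's standard-library sqlite3 module; the table schema, class names and scores below are illustrative, not WEST's actual database.

```python
import sqlite3

# Detections from the hot-spot detector are written to a SQL table so that
# later automated expertise can query them per shot and per event class.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE hot_spots (
        shot  INTEGER,
        frame INTEGER,
        class TEXT,
        score REAL,
        x0 REAL, y0 REAL, x1 REAL, y1 REAL
    )
""")

detections = [  # (shot, frame, class, confidence, bounding box) - all made up
    (57123, 410, "strike_line", 0.92, 120.0, 40.0, 300.0, 55.0),
    (57123, 411, "strike_line", 0.88, 121.0, 41.0, 299.0, 56.0),
    (57123, 502, "uncooled_edge", 0.64, 10.0, 200.0, 30.0, 220.0),
]
conn.executemany("INSERT INTO hot_spots VALUES (?,?,?,?,?,?,?,?)", detections)

# Downstream expertise: keep only confident detections for a given shot.
rows = conn.execute(
    "SELECT class, COUNT(*) FROM hot_spots WHERE shot=? AND score>=? GROUP BY class",
    (57123, 0.8),
).fetchall()
print(rows)  # → [('strike_line', 2)]
```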
Turning to the quantitative processing of the IR experimental images, estimating the real surface temperature from experimental images is another open challenge in a reflective and hot radiative environment. This requires solving a multi-parametric inverse problem with unknown target emissivities and spurious signal from multiple reflections. A promising resolution technique consists in using ML algorithms trained on "artificial" (simulated) images. Simulated images come from the synthetic diagnostic, also called a "digital twin", able to model all phenomena involved in the measurement chain, from the plasma source (heat loads on in-vessel components) to the camera, including photon-wall interaction. AI/ML inversion performed on a large dataset of 1.6 × 10⁵ simulated images gives the true component temperature with an accuracy of 5%, regardless of emissivity and with reflections filtered out. Beyond this, it opens the possibility of injecting physical models into AI/ML image processes, such as the actual component geometry, and possibly science models such as power deposition laws. By explicitly accounting for physical models in AI/ML processes, the aim is to reconcile AI/ML techniques with traditional model-based physical science and to promote trustworthy AI.
Tokamaks require magnetic control across a wide range of plasma scenarios. The coupled behavior of plasma dynamics makes deep learning a suitable candidate for efficient control in order to fulfil these high-dimensional and non-linear situations. For example, on TCV, deep reinforcement learning has already been used for tracking of the plasma’s magnetic equilibrium [1]. In this work, we apply such methods to the WEST tokamak, to address control of the plasma’s shape, position, and current, in several relevant configurations.
To this end, we developed a distributed framework to train an actor-critic agent on a free-boundary equilibrium code called NICE, written in C++. The former benefits from optimized deep learning libraries in Python, while the resistive diffusion mode of the latter allows a more representative evolution of the plasma current profile throughout the simulation. The interface between the two languages uses UDS protocols for fast, asynchronous and reliable communication.
The implemented tool handles feedback control of the plasma's shape, position, and current, with results showing the flexibility of the method with respect to different training environments. It demonstrates the usefulness of reinforcement learning on WEST, without the need for extensive controller-design effort to satisfy operational constraints. Moreover, by adding reward constraints on the plasma profiles, it can address new problems such as optimal fast-landing control.
Further extensions will be discussed concerning the evolutions of the framework, particularly regarding the use of multi-fidelity learning and physics-informed neural networks. This could accelerate and stabilize the learning process, leading to the implementation of a routine within WEST operations.
[1] Degrave, J., Felici, F., Buchli, J. et al. Magnetic control of tokamak plasmas through deep reinforcement learning. Nature 602, 414–419 (2022)
The aim of this work is the development and analysis of AI tools for welding success rate prediction and for the subsequent processing of PAUT output, applied to welding defect detection in the manufacturing of the ITER Vacuum Vessel.
Due to its complexity, the manufacturing of this large piece of equipment, based on the French nuclear design and manufacturing code (RCC-MR), has generated a large amount of data. Since the Vacuum Vessel is the first confinement barrier of the nuclear fusion installation, ensuring the quality of its welds is a serious challenge. Each of the five European sectors has approximately one kilometre of welding to be performed. Any defect in these welds causes major disruption, both to the schedule and at the mechanical level, which has to be recovered within feasibility limits. A first development of an AI tool to predict weld success rate achieved a prediction accuracy for electron beam welding (EBW) of almost 100%. This allows the manufacturer and the client to focus appropriate resources, dedicated time and mechanisms on improving the predicted welding rate.
The double-wall nature of the Vacuum Vessel also means that, during the last stages of segment manufacturing, some welds cannot be inspected over the full weld depth or from both sides with the conventional non-destructive testing methods accepted by the RCC-MR, such as radiographic examination. This creates the need to qualify a more advanced NDT technique: Phased-Array Ultrasonic Testing (PAUT). PAUT data processing and interpretation has to be carried out by a human expert and requires one week per weld on average, due to the coarse-grained austenitic stainless steel used in the Vacuum Vessel (316LN-IG) and the complexity of the qualified PAUT procedures.
This development shows that AI is an appropriate tool for processing PAUT data, allowing prompter data availability and providing an additional information set so that projects can take informed decisions. The subjective interpretation and human error factors are reduced through this automation, as is the large time required to process each PAUT output, which can be decreased from an average of one week to a matter of minutes. A successful AI application for UT has the potential to save millions in retraining.
Artificial neural network (NN) surrogate models have been developed and trained on magnetic, magnetic + motional Stark effect (MSE), and kinetic DIII-D equilibria to accelerate tokamak equilibrium reconstructions for offline, between-shot, and real-time applications. Adaptation of the ML/AI algorithms has been facilitated by the recently developed device-independent portable equilibrium solver and a large EFIT database of DIII-D magnetic, MSE and kinetic equilibria [1]. The main model comprises a fully connected NN coupled to a convolutional NN that together enforce the toroidal force-balance constraint by concurrently learning the poloidal flux and toroidal current density on the EFIT spatial grid. In addition, the NN surrogates also predict the pressure, current, P', FF', and safety factor (q) profiles, and important plasma parameters such as the internal inductance, normalized beta, and magnetic axis location. The models are optimized for architecture, hyperparameters, and inference speed, which has been clocked at real-time EFIT-like speeds.
The NN surrogates for the magnetic EFITs show generalizability to previously-unseen negative-triangularity shapes, by incorporating the spatial correlation among various magnetic sensors in DIII-D into the NN input vector. The magnetic + MSE surrogates show improved predictions of the q profiles, and certain plasma parameters like the internal inductance. Finally, the kinetic surrogates show additional improvements in accurately reproducing EFIT-predicted internal current and pressure profiles. These results suggest the possibility of using a surrogate model as a real-time tool for plasma control and other applications.
[1] L. Lao et al., PPCF 64, 074001 (2022).
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Acquisition and Assistance under Award Number(s) DE-SC0021203 and DE-FC02-04ER54698.
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
For fusion diagnostics based on line-integrated measurements, a tomography problem has to be addressed in order to reconstruct a spatially resolved 2D image by inverting a limited number of line-integrated data. However, most routinely used inversion methods are still not suitable for real-time application because of the time-consuming algorithms (i.e., iteration) they rely on. As deep learning (DL) has become the state-of-the-art technique in many fields, DL-based surrogate models for fast plasma tomography are a potential candidate, with computing times compatible with real-time applications. In practice, this can be implemented in two different ways. The first aims at developing a deconvolutional network that takes the low-dimensional line-integrated data as input and produces a high-dimensional image as output [1]. Such a network is trained by minimizing the discrepancy between its output and reconstruction samples computed by an inversion method, e.g. the Gaussian process tomography method [2, 3], over many experimental pulses. The second approach focuses on learning the pseudo-inverse of the projection matrix, which is derived from the forward modelling of the diagnostic system and used for the prediction of the data, using a deep neural network (DNN) [4]. Such a DNN can be simplified to a model with a single hidden layer and linear activation functions, since the model itself then amounts to the pseudo-inverse of the projection matrix. Accordingly, the input set and the target set are, respectively, the projection matrix and the identity matrix. Furthermore, an effective and feasible strategy will be devised to incorporate physics knowledge into the learning process, further improving the machine learning algorithms in terms of accuracy and training speed.
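The second approach above converges, for a linear network, to the Moore-Penrose pseudo-inverse of the projection matrix; the sketch below takes that limit in closed form to show why reconstruction then reduces to a single matrix multiply. The geometry (20 chords, 10x10 pixel grid) and the random matrices are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical geometry: 20 lines of sight viewing a 10x10 emissivity grid.
# T is the projection (forward-model) matrix; its entries are placeholders.
n_lines, n_pixels = 20, 100
T = rng.random((n_lines, n_pixels))

# A linear single-hidden-layer network trained with T as input and the
# identity as target converges to the pseudo-inverse; here we take it directly.
T_pinv = np.linalg.pinv(T)                  # shape (n_pixels, n_lines)

# Reconstruction is then one matrix-vector product -- real-time compatible.
emissivity_true = rng.random(n_pixels)
measurements = T @ emissivity_true          # simulated line-integrated data
emissivity_rec = T_pinv @ measurements      # minimum-norm reconstruction

# Sanity check: re-projecting the reconstruction reproduces the measurements,
# since T @ pinv(T) @ T == T for any pseudo-inverse.
assert np.allclose(T @ emissivity_rec, measurements)
```

Because the system is underdetermined (20 measurements, 100 pixels), the pseudo-inverse returns the minimum-norm solution; this is where the abstract's physics-informed regularization would improve on the plain linear model.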
References:
[1] E. Aymerich et al, "Disruption prediction at JET through deep convolutional neural networks using spatiotemporal information from plasma profiles," Nucl. Fusion, vol. 62, p. 066005, 2022.
[2] Dong Li, "Bayesian soft X-ray tomography using non-stationary Gaussian Processes," Rev. Sci. Instrum. , vol. 84, p. 083506, 2013.
[3] Dong Li et al, "Bayesian soft x-ray tomography and MHD mode analysis on HL-2A," Nucl. Fusion, vol. 56, p. 036012, 2016.
[4] Seungtae Oh et al, "Radiation profile reconstruction of infrared imaging video bolometer data," Plasma Phys. Control. Fusion, vol. 62, p. 035014, 2020.
The tokamak is a promising device for producing nuclear fusion energy. Because the underlying physical processes are fast, a tokamak requires an automated control system for effective operation. Researchers have already proposed essential automation such as gas-puffing control [1] and disruption mitigation [2]. In turn, the performance of such automation relies heavily on diagnostic data, especially the plasma density, so any fault in the plasma density data may degrade control performance or even lead to destructive consequences. To overcome this challenge, this paper proposes using an artificial neural network to predict the raw plasma density signal, providing a duplicated feedback channel in the tokamak control system. After any fault in the plasma density data is detected (e.g., by a threshold method), the control system can fall back on the predicted plasma density and continue operating in normal mode. To prove the concept, a simple multilayer perceptron [3] was used with 12 input neurons, two hidden layers of 100 neurons each, and one output neuron. The training data comprise experimental data from the T-11M tokamak; all data went through cleaning, initial signal subtraction, and averaging. The inputs are the plasma current, toroidal magnetic field, operating parameters of the gas-puffing system, soft and hard X-ray signals, neutral lithium light emission from the lithium limiter (Li I), and the vertical and horizontal shifts of the plasma column centre (Z, R). The gas-puffing parameters are the valve opening times and the gas pressure upstream of the valves; the opening times were transformed into a binary array (1 while a valve is open at a given moment, 0 otherwise). The output (target) is the raw plasma density signal. The final dataset contains 184,832 samples.
Training ran for 150 epochs, with the loss (mean squared error) reaching 0.0014. Figures 1 and 2 show measured and predicted plasma density signals for shots 40472 and 40751. The measured signals contain noise, a potential cause of malfunction in control systems relying on plasma density data; the predicted signals remain smooth where the measured signals are noisy, and closely track the measured signals where these are smooth. The proposed solution can therefore provide feedback duplication in present and future tokamak control systems, enhancing the reliability of their automation.
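The two preprocessing and model details above, the valve-time binarization and the 12-100-100-1 perceptron, can be sketched as follows. The time base, valve interval, and random weights are illustrative placeholders; training with the MSE loss is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)

# Valve opening times -> binary array, as described in the abstract:
# 1 while a valve is open, 0 otherwise (sampling grid is hypothetical).
t = np.linspace(0.0, 0.1, 6)              # toy time base, s
valve_open, valve_close = 0.02, 0.06      # hypothetical valve interval, s
valve_state = ((t >= valve_open) & (t < valve_close)).astype(int)

def mlp(x, sizes=(12, 100, 100, 1)):
    """Forward pass of the 12-100-100-1 multilayer perceptron
    (random placeholder weights stand in for the trained model)."""
    h = x
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        W = rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)
        b = np.zeros(n_out)
        h = h @ W + b
        if i < len(sizes) - 2:            # ReLU on hidden layers only
            h = np.maximum(h, 0.0)
    return h

x = rng.standard_normal((1, 12))          # one sample, 12 input features
density_pred = mlp(x)                     # predicted raw density signal value
assert density_pred.shape == (1, 1)
```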
References
1. Shu S. et al. An intelligent controller design based on the neuroendocrine algorithm for the plasma density control system on Tokamak devices. Fusion Engineering and Design. 161 (2020) 111965.
2. Wang, S. Y., et al. Prediction of density limit disruptions on the J-TEXT tokamak. Plasma Physics and Controlled Fusion 58 (2016) 055014.
3. Jain A. K., Mao J., Mohiuddin K. M. Artificial neural networks: A tutorial. Computer 29 (1996), no. 3, 31-44.
Margetts et al.
Current state-of-practice in the design, build and operation of complex engineered systems involves contractual arrangements between many sub-contractors working across an often very complex supply chain. Each company will be responsible for a particular component or subsystem and have its own competences, infrastructure and data management policies. As the need for customisation increases, i.e., the low volume, high value manufacturing required in a first of a kind power plant, there will be a corresponding need to engage with highly skilled specialists. These may be individuals from niche micro-firms or experts embedded in large organisations, who, from the point of view of the wider supply chain, will typically be working in silos. These unconnected islands of knowledge act as firewalls that make it difficult to leverage artificial intelligence in the design process. To address this, there needs to be a paradigm shift in how engineering is organised, moving away from a digitally enabled craft industry towards one that embraces a philosophy of automation. The invention of the world wide web enabled the automation of processes, contracts and data management at a massive scale across a geographically distributed ecosystem; in commerce, banking, media and leisure. The industrial metaverse, the internet in three dimensions, promises to disrupt engineering in the same way. In this presentation, we will show how the metaverse is being used to develop a cyber physical system for power plant design in which design teams connect, collaborate and make increasing use of automation – in near “real-time”. Once the cyber physical system is in operation, it will generate, collect, store, process and consume data; encapsulating knowledge related to fusion power plant design. Our vision is for artificial intelligence to use this knowledgebase to learn how to design complex systems, enabling computers to work as partners with humans in a new era of symbiotic design.
The transition from present-day tokamaks to a DEMO reactor will pose great scientific and technological challenges. As a way to overcome these challenges, we have launched the development of the Virtual KSTAR (V-KSTAR), based on digital twin technology, which aims to establish a unified machine/fusion data framework and simulation workflows. By raising the maturity of the digital twin technology through analysing and predicting tokamak experiments, V-KSTAR would become a crucial step toward the design and construction of the Korean DEMO reactor.
Over the past years, significant progress has been made in the development of V-KSTAR, including the conversion of legacy data and the implementation of workflows designed for specific experimental analyses. Initially, the focus has been on between-shot analysis of KSTAR experiments, facilitating the validation of simulation codes and aiding experimental analysis to improve machine operation. To enhance the performance of the simulation codes, we have adopted parallelization techniques and exploited GPU-based acceleration in a modern supercomputing environment. However, these advancements alone turned out to be insufficient to realize a comprehensive digital twin; incorporating additional acceleration techniques such as artificial intelligence (AI) and machine learning (ML) is therefore essential.
In this presentation, we will report the current status of the V-KSTAR development, outlining our plans for the integration of AI and ML. We will discuss strategies for training the models, including the acceleration of simulation codes and the acquisition of simulation data. Furthermore, we will discuss broader aspects of AI/ML technologies that may contribute to the realization of a more advanced fusion digital twin in the future.
The study of nuclear fusion requires massive computing resources spanning highly divergent computing paradigms, including simulation, diagnostics, plasma control and AI computing. To achieve the best performance, the various tasks should be implemented on heterogeneous computing devices, typically CPUs and GPUs. This heterogeneity in computing resources and paradigms makes it difficult for researchers to implement and execute their computing programs efficiently.
The Intelligent Computing Data Reactor is a foundational computing infrastructure constructed by Zhejiang Lab to support scientific research. It aggregates multiple heterogeneous computing clusters suited to high-performance scientific computing or AI computing tasks, respectively. Resource management and task scheduling are handled by ZJLab Alkaid, an operating system spanning the heterogeneous clusters. Moreover, ZJLab Alkaid encapsulates hundreds of computing frameworks, software modules and applications, enabling several domain-specific computing platforms, including computational materials science, computational astronomy and computational genomics. The data reactor integrates the computing hardware, software stacks and service platforms to support cross-disciplinary study. By analogy with a fusion reactor, the data reactor drives a reaction of computation that produces research outputs from data, algorithms, computing resources and domain knowledge. This talk introduces the data reactor to an audience with a fusion background and discusses the progress and perspectives of the infrastructure for fusion studies.
Alessandro Pau (EPFL-SPC, Switzerland), Michael Churchill (PPPL, USA), Fuyuan Wu (SJTU, China)
Reinforcement learning (RL) is a promising technology for the future of fusion power. A key challenge is to stabilize and regulate the plasma position and shape via magnetic fields generated by a set of control coils. This talk discusses our efforts to generate magnetic controllers using deep reinforcement learning. We train controllers on a Grad-Shafranov-based simulator and then deploy the learned controllers in experiments on the Tokamak à Configuration Variable (TCV). We show successful stabilization of a diverse set of plasma configurations, and discuss strategies to accelerate training and improve performance.
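The control-learning idea above can be illustrated on a drastically simplified problem: stabilizing an unstable scalar state (a stand-in for, say, vertical position) with a learned linear feedback gain. Everything here, the dynamics, the cost, and the random-search "training", is an invented toy; the real work uses deep RL against a Grad-Shafranov simulator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy plant: x_{t+1} = a*x_t + b*u_t with a > 1 (unstable open loop).
a, b = 1.1, 0.5

def episode_cost(k, steps=50):
    """Roll out the linear policy u = -k*x and accumulate a quadratic cost."""
    x, cost = 1.0, 0.0
    for _ in range(steps):
        u = -k * x
        x = a * x + b * u
        cost += x * x + 0.01 * u * u
    return cost

# Random search over the gain, a crude stand-in for policy-gradient updates.
best_k = min((rng.uniform(0.0, 4.0) for _ in range(200)), key=episode_cost)

# The learned controller beats no control at all (k = 0 diverges).
assert episode_cost(best_k) < episode_cost(0.0)
```

The closed loop x_{t+1} = (a - b*k) x is stable whenever |a - b*k| < 1, so the search has a wide band of good gains to find; deep RL faces the same structure at vastly higher dimension.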
While the quantitative data generated by tokamaks is invaluable, tokamak operations also generate another, often underutilized data stream: text logs written by experimental operators. In this work, we leverage these extensive text logs by employing Retrieval-Augmented Generation (RAG) with state-of-the-art large language models (LLMs) to create chat-bot instances that can answer questions using knowledge recorded in these historical text logs. Instances of this chat-bot were created using text logs from the fusion experiments DIII-D and Alcator C-Mod and deployed for researchers to use. In this talk, we report on the datasets and methodology used to create these chat-bots, along with their performance in three use cases: 1) semantic search of experiments, 2) assisting with device-specific operations, and 3) answering general tokamak questions. As LLMs improve over the coming years, we hope that future iterations of this work will provide increasingly useful assistance for both fusion research and experimental operations.
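The retrieval step of such a RAG pipeline can be sketched with a bag-of-words stand-in for the embedding model: rank log entries by cosine similarity to the query, then prepend the best match to the LLM prompt. The log entries, vocabulary, and query below are invented examples, not DIII-D or C-Mod data.

```python
import numpy as np

# Invented example shot-log entries (real systems index thousands of these).
logs = [
    "shot 123456 disrupted during current ramp",
    "good H-mode, NBI at 2.5 MW, no MHD activity",
    "gas valve fault, shot aborted before breakdown",
]

def embed(text, vocab):
    """Bag-of-words vector -- a stand-in for a learned sentence embedding."""
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], float)

vocab = sorted({w for doc in logs for w in doc.lower().split()})
doc_vecs = np.array([embed(d, vocab) for d in logs])

query = "which shot disrupted"
q = embed(query, vocab)
scores = doc_vecs @ q / (
    np.linalg.norm(doc_vecs, axis=1) * (np.linalg.norm(q) + 1e-9)
)
best = int(np.argmax(scores))
# logs[best] would be prepended to the LLM prompt as retrieved context.
assert "disrupted" in logs[best]
```

In a production pipeline the bag-of-words step is replaced by a dense embedding model and a vector index; the ranking-then-prompting structure is the same.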
Predicting the evolution of plasma instabilities and turbulence within a tokamak power plant is essential for achieving sustainable fusion. Efficiently forecasting the spatio-temporal evolution of plasma enables rapid iteration over design and control strategies for both current tokamak devices and future power plants. However, traditional numerical solvers for modelling plasma evolution are computationally expensive and require hours of processing on supercomputers. In this study, we demonstrate the application of an AI framework called Fourier Neural Operators (FNO) to accurately predict plasma evolution in both simulation and experimental domains.
Our research shows that FNO provides a remarkable speedup of six orders of magnitude compared to traditional solvers when predicting plasma blob dynamics simulated from fluid models in slab geometry. Despite this significant speed improvement, FNO maintains reasonable accuracy, especially across short temporal domains. A modified version of FNO is developed to solve multi-variable partial differential equations, capturing the interdependence among different variables within a single model.
Furthermore, FNOs exhibit the ability to predict plasma evolution using real-world experimental data obtained from cameras positioned within the MAST tokamak. The predictive power extends up to six milliseconds ahead, making it suitable for real-time plasma evolution monitoring. Additionally, we showcase FNO's proficiency in forecasting plasma shape and the general structure of the filaments observed during plasma shots under study.
FNO emerges as a promising surrogate modelling approach due to its quick training and inference times, as well as its ability to perform zero-shot super-resolution (i.e. produce higher resolution solutions without any further training), providing discretization invariance across the models. To address predictive uncertainty, we incorporate conformal prediction over the FNO output, allowing statistically guaranteed error bars across the high-dimensional space of field evolution.
In this work, we also explore the utility of PDE residuals for evaluating trained emulators, offering valuable insights into the model's performance. Moreover, we investigate the application of state-space-based neural operators for emulating the long-term evolution of fluxes associated with gyrokinetic turbulence, originally modelled using GX.
Overall, our research presents FNO as a powerful alternative for emulating plasma evolution, paving the way for enhanced efficiency in emulating fusion power plant design and control strategies.
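The two properties emphasized above, operating in Fourier space and zero-shot super-resolution, come from the FNO's spectral convolution layer, sketched here in 1D. The grid size, mode count, and weights are illustrative; a full FNO stacks several such layers with pointwise nonlinearities.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_conv_1d(u, weights, modes):
    """Core of a 1D FNO layer: FFT, multiply the lowest `modes` Fourier
    coefficients by learned complex weights, inverse FFT."""
    u_hat = np.fft.rfft(u)                        # field -> Fourier space
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = weights * u_hat[:modes]     # act only on kept modes
    return np.fft.irfft(out_hat, n=u.size)        # back to physical space

n, modes = 64, 8
u = np.sin(2 * np.pi * np.arange(n) / n)          # toy field on a 1D grid
w = rng.standard_normal(modes) + 1j * rng.standard_normal(modes)
v = spectral_conv_1d(u, w, modes)
assert v.shape == u.shape

# Zero-shot super-resolution: the same weights apply on a 4x finer grid,
# because they act on Fourier modes rather than grid points.
u_fine = np.sin(2 * np.pi * np.arange(4 * n) / (4 * n))
v_fine = spectral_conv_1d(u_fine, w, modes)
assert v_fine.shape == u_fine.shape
```

Discretization invariance follows directly: the learned weights parameterize an operator on functions, so changing the sampling grid changes only the FFT length, not the model.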
Machine learning has vast potential in medical image analysis, improving the possibilities for early diagnosis and prognosis of disease. Algorithms typically need large amounts of representative, annotated examples for good performance, which may be difficult to obtain, for example because of differences between image acquisition procedures, or the time and effort involved in annotation. To address these problems, several approaches have been proposed, aimed either at adapting to other types of annotated data or at gathering annotations more efficiently. In this talk I will highlight two such approaches: transfer learning from natural images such as cats, and crowdsourcing annotations from people without medical expertise. I will also discuss more general issues we as a community face when addressing such problems.
Cristina Rea et al