Anomaly detection is the problem of finding patterns in data that do not conform to expected behavior. Depending on the application domain, these off-normal patterns are referred to as anomalies, outliers, discordant observations, or exceptions. Anomaly detection matters because anomalies in data frequently carry significant and critical information. In the particular case of nuclear fusion, a wide variety of anomalies can be related to particular plasma behaviours, such as disruptions or L/H transitions. Unknown anomalies probably account for the largest share of all the anomalies that can be found in fusion data. Whether an anomaly is known or not, all anomalies in a nuclear fusion device should be detectable with the same approach, since the physical state of the plasma during a shot should be reflected in some of the thousands of acquired signals.
The amount of data per discharge is huge: up to 10 GB per shot in JET, and an estimated 1 TB per shot in ITER. Detecting anomalies in such massive fusion databases is therefore hardly possible without machine learning techniques. In this article, we study the application of Deep Learning, and in particular a recurrent neural network called long short-term memory (LSTM), to detect anomalies in a discharge. LSTM has emerged as a successful way to learn sequences and, specifically, to forecast a waveform from the values of its past samples. Our approach first trains the LSTM on a set of discharges so that it learns the regular waveforms of different signals acquired in a shot. The trained LSTM is then used to forecast each signal during a discharge. An alarm is raised when the difference between the prediction and the actual value of the waveform exceeds a given threshold. An anomaly is confirmed when several alarms for different signals occur at the same time, or within a given time interval, in a shot. This approach has been tested on the database of the nuclear fusion device TJ-II (located in Madrid, Spain). The main results of applying the approach, and the effects of parameters such as the difference threshold and the width of the time interval, are discussed in detail.
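The alarm and confirmation steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the `lookback`, `window` and `min_signals` parameters, and the toy waveforms are all hypothetical, and a naive persistence forecaster stands in for the trained LSTM so that the example runs without any deep learning library.

```python
from typing import Callable, Dict, List, Sequence


def persistence_forecast(history: Sequence[float]) -> float:
    """Placeholder forecaster: predict the next sample as the last one seen.

    Hypothetical stand-in for a trained LSTM; it is NOT the article's model.
    """
    return history[-1]


def alarm_times(
    signal: Sequence[float],
    forecast: Callable[[Sequence[float]], float],
    lookback: int,
    threshold: float,
) -> List[int]:
    """Return the sample indices where |prediction - actual| > threshold."""
    alarms = []
    for t in range(lookback, len(signal)):
        predicted = forecast(signal[t - lookback:t])
        if abs(predicted - signal[t]) > threshold:
            alarms.append(t)
    return alarms


def confirmed_anomalies(
    alarms_by_signal: Dict[str, List[int]],
    window: int,
    min_signals: int,
) -> List[int]:
    """Return alarm indices t for which at least `min_signals` distinct
    signals have an alarm inside the interval [t, t + window)."""
    all_times = sorted({t for times in alarms_by_signal.values() for t in times})
    confirmed = []
    for start in all_times:
        active = sum(
            1
            for times in alarms_by_signal.values()
            if any(start <= t < start + window for t in times)
        )
        if active >= min_signals:
            confirmed.append(start)
    return confirmed


# Toy shot with three made-up signals (assumed names and values).
shot = {
    "density": [1.0] * 10 + [5.0] * 10,      # step at sample 10
    "radiation": [2.0] * 11 + [6.0] * 9,     # step at sample 11
    "temperature": [0.5] * 20,               # regular waveform, no alarm
}
alarms = {
    name: alarm_times(values, persistence_forecast, lookback=3, threshold=1.0)
    for name, values in shot.items()
}
anomalies = confirmed_anomalies(alarms, window=3, min_signals=2)  # -> [10]
```

In this toy shot, two signals raise alarms one sample apart, so an anomaly is confirmed with a window of 3 samples but not with a window of 1. This mirrors the sensitivity to the threshold and to the interval width that the approach has to balance.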