VAEiForest: Variational Auto-Encoder Enhanced Isolation Forest for Deep Anomaly Detection
In recent years, the application of big data across many fields has driven the rapid development of artificial intelligence, and anomaly detection has remained a prominent topic in data analysis and machine learning. Popular methods for anomaly detection include k-means, support vector machines, isolation forests, and deep neural networks. However, on large-scale, high-dimensional, complex data, traditional machine learning and clustering methods often fail to detect outliers reliably, and many deep neural networks rely on random representations of the data, which reduces the interpretability of the model. In this thesis, we propose a novel model called VAEiForest, which builds on the isolation forest and adds a variational auto-encoder (VAE) module that extracts features from the data and generates new data. When the input is high-dimensional, the VAE’s dimensionality reduction lets the model work on a low-dimensional representation instead, and the generated data serves as data augmentation for the model. Extensive experiments show that, on the same data sets and with the same parameters, our model improves on previous similar models, and an ablation study verifies the effectiveness of each sub-module.
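To make the isolation-forest component concrete, the following is a minimal sketch of how an isolation forest scores points, using only the Python standard library. It is not the VAEiForest implementation: the VAE encoder is omitted, and the 2-D points are an assumption standing in for latent codes that a trained encoder might produce. All function names (`fit_forest`, `anomaly_score`, and so on) and hyperparameter values are hypothetical and chosen for illustration.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, plus a correction for unsplit leaves."""
    if "dim" not in node:
        return depth + c_factor(node["size"])
    child = "left" if point[node["dim"]] < node["split"] else "right"
    return path_length(point, node[child], depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build trees on random subsamples; assumes at least two points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(point, forest):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    trees, sample_size = forest
    mean_depth = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Synthetic stand-in for latent codes: a Gaussian cluster plus one distant point.
data_rng = random.Random(42)
latent = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
forest = fit_forest(latent + [outlier])

inlier_score = anomaly_score((0.0, 0.0), forest)
outlier_score = anomaly_score(outlier, forest)
print(f"inlier score:  {inlier_score:.3f}")
print(f"outlier score: {outlier_score:.3f}")
```

Points that random splits separate after only a few cuts receive scores closer to 1, which is why the distant point scores higher than a point at the center of the cluster. Working on a low-dimensional representation means each random split chooses among fewer features, which is one plausible reading of why the abstract pairs the forest with dimensionality reduction.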