Deep Anomaly Detection with Input Perturbation

Title：MAIB-Talk-015: Deep Anomaly Detection with Input Perturbation
Date：10:00pm US East time, 06/03/2023
Date：10:00am Beijing time, 06/04/2023
Zoom ID：933 1613 9423
Zoom PWD：416262
Zoom: https://uwmadison.zoom.us/meeting/register/tJcudu-prTIuGNda1MsF8PKyRQlnGn06TP2E

Presentation Record(Previous Presentation will be showed here if the video is not released for this talk)

Bio

Yizhou Wang received the B.S. degree in Mathematics and Applied Mathematics (Honors Program) from the School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, China, in 2020. He is currently working toward the Ph.D. degree in the Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, under the supervision of Prof. Yun Raymond Fu. His research interests include machine learning, computer vision and data mining. He has published some papers at top-tier conferences including ICLR, CVPR, CIKM, ICDM and IJCAI and top-tier journals including Nature Communications and SIAM Journal on Image Sciences. He has served as a Reviewer for TKDD, KAIS, ICML, NeurIPS, ICLR, CVPR, ECCV, KDD, AAAI, IJCAI, PAKDD, ICME, etc.

Abstract

Anomalies, also known as outliers, are defined as ``data instances that significantly deviate from the majority of data instances”. Correspondingly, anomaly detection (AD) refers to the process of finding these anomalous data points out in a data-driven fashion, which has long been a fundamental problem in machine learning and has various real-world applications, including medical health, fraud detection, cybersecurity and video surveillance, etc. Though there has been tremendous success in anomaly detection using deep learning, current methods, e.g., autoencoder reconstruction-based methods, fail to excavate the data characteristics well and there exist performance bottlenecks in both visual data and tabular data anomaly detection. In this talk, I will introduce our proposed methods which seamlessly incorporate input perturbation techniques into models in unsupervised one-class anomaly detection task and video anomaly detection task for performance improvements.

Main Challenge in this filed

1, Lack of labeled data: Anomaly detection is often an unsupervised learning task, meaning that labeled anomalous instances are scarce or even completely absent. Without labeled data, it becomes challenging to train models to accurately identify anomalies.

2, Imbalanced data distribution: In many real-world datasets, anomalies are rare compared to normal instances, resulting in imbalanced data distributions. Traditional machine learning algorithms tend to be biased towards the majority class, making it harder to detect anomalies effectively.

3, Evolving and adaptive anomalies: Anomalies can change over time or adapt to the detection methods employed. This requires anomaly detection systems to be flexible and able to adapt to new and previously unseen anomalies.

4, High-dimensional and complex data: With the increasing availability of complex and high-dimensional data, such as images, videos, and sensor readings, anomaly detection becomes more challenging. Traditional methods may struggle to capture the underlying patterns and variations in such data.

5, Interpretability and explainability: Anomaly detection models should not only provide accurate predictions but also offer explanations for why a particular instance is considered anomalous. Interpreting and explaining the decisions made by anomaly detection models is crucial for building trust and facilitating decision-making in real-world applications.

6, Real-time and scalable anomaly detection: Many applications, such as cybersecurity and video surveillance, require real-time anomaly detection on large-scale data streams. Developing efficient and scalable algorithms that can process data in real-time is a significant challenge.

这个领域的最新突破可以是将输入扰动技术应用于无监督异常检测和视频异常检测任务中的模型。输入扰动技术是指对输入数据进行微小的随机变换或干扰，以提高模型对异常样本的检测能力。通过无缝地结合这些技术到异常检测模型中，可以增强模型对异常数据特征的挖掘能力，并改善在视觉数据和表格数据上的性能瓶颈。

这些方法可能包括使用对抗性扰动（adversarial perturbation）来增强模型的鲁棒性，使用数据增强（data augmentation）来扩充训练数据集，或者使用随机噪声注入（random noise injection）来增加数据的多样性。这些技术的应用可以提高模型的泛化能力和对异常数据的敏感性，从而提高异常检测的准确性和鲁棒性。

具体的最新突破可能涉及新的网络架构、训练算法或者特征提取方法，以及对不同类型数据的适应性改进。这个领域在不断发展和创新，所以最新的突破可能会涉及更加高效、准确的异常检测方法，能够在实际应用中取得更好的效果，并解决当前方法存在的挑战和限制。