Deep learning has made remarkable progress over the past decade across various fields, such as Computer Vision, Natural Language Processing (NLP), and Speech Recognition, driving innovation and advancement across a wide range of applications. One key challenge is to improve the generalization of deep learning models, that is, a model's ability to perform well on unseen data. Several factors contribute to this challenge, including limited data availability, a tendency to overfit, and the inherent complexity of the models themselves.
The phenomenon of adversarial samples is symptomatic of the limited generalization capability of deep learning models. Adversarial samples are data instances perturbed from their originals in ways that are often visually imperceptible to humans, yet cause incorrect predictions from deep learning models. This phenomenon presents a substantial concern for security-sensitive domains such as medical diagnosis, autonomous driving, and anomaly detection, where model reliability is crucial.
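In the standard formulation from the adversarial machine learning literature (stated here for clarity; the notation is not taken from this dissertation), an adversarial sample $x'$ is a bounded perturbation of a clean input $x$ that changes the model's prediction:

\[
x' = x + \delta, \qquad \|\delta\|_p \le \epsilon, \qquad f(x') \ne f(x),
\]

where $f$ denotes the trained classifier and $\epsilon$ bounds the perturbation magnitude so that $x'$ remains nearly indistinguishable from $x$ to a human observer.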
The development of various adversarial defense methods in recent years, such as adversarial training, noise reduction, and gradient masking, reflects the considerable effort devoted to enhancing the robustness and reliability of deep learning models. Meanwhile, as innovative adversarial attacks continue to evolve, they effectively expose the vulnerabilities inherent in deep learning models, thereby raising challenges for existing defense methodologies. Although adversarial defense research has advanced, the root cause of the vulnerability of deep learning models is still not fully understood. Additionally, there is a pressing need for defense mechanisms that offer comprehensive protection and high resilience against a wide range of adversarial attacks.

The research presented in this dissertation aims to enrich the knowledge within the research community by providing deeper insights into adversarial attacks and defense mechanisms. It endeavors to develop novel defense methods that are robust and reliable in protecting deep learning models against adversarial threats. Through this work, we seek to advance the field of adversarial defense and contribute to the development of more effective defense strategies.
In this dissertation, we introduce three novel defense mechanisms aimed at enhancing the robustness of deep learning models against adversarial attacks. In the first work, we tackle the issue of image blurring in traditional Variational Autoencoder (VAE)-based generative networks by improving the fidelity of data reconstruction. Additionally, this work optimizes the model's decision-making strategy through a Bayesian update, allowing the model to combine multiple sources of supporting evidence into its final decision. The second study proposes a new generative network structure coupled with a two-step noise reduction approach designed to effectively filter out adversarial noise. The third method introduces a new noise reduction mechanism called VQUNet. This method features a learnable quantization of latent features and a hierarchical network structure for high-fidelity data reconstruction. VQUNet's design significantly enhances the data reconstruction quality after the filtering process, while effectively regularizing adversarial perturbation within the network, thereby improving its resilience against adversarial attacks.
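As a rough illustration of the noise-reduction idea behind learnable latent quantization, the sketch below shows a vector-quantization bottleneck of the kind used in VQ-VAE-style networks, applied as a purification step before classification. The class names, codebook sizes, and the surrounding encoder, decoder, and classifier are illustrative assumptions, not the dissertation's actual implementation.

```python
# Minimal sketch of a learnable vector-quantization bottleneck used for
# adversarial noise reduction; all design details here are assumptions.
import torch
import torch.nn as nn


class VectorQuantizer(nn.Module):
    """Snaps each latent vector to its nearest entry in a learnable codebook,
    discarding small (e.g. adversarial) perturbations in the latent space."""

    def __init__(self, num_codes: int = 512, code_dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, code_dim, H, W) latent feature map from the encoder.
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)           # (b*h*w, code_dim)
        # Euclidean distance to every codebook entry, then nearest-code lookup.
        dist = torch.cdist(flat, self.codebook.weight)         # (b*h*w, num_codes)
        indices = dist.argmin(dim=1)
        quantized = self.codebook(indices).reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Straight-through estimator so gradients still reach the encoder.
        return z + (quantized - z).detach()


def defended_predict(x, encoder, vq: VectorQuantizer, decoder, classifier):
    """Purify the input through reconstruction, then classify.
    encoder/decoder/classifier are placeholders for whatever networks are used."""
    x_denoised = decoder(vq(encoder(x)))   # filter adversarial noise
    return classifier(x_denoised)          # predict on the purified input
```

The intuition is that snapping latent features to a finite codebook regularizes small adversarial perturbations: perturbed latents that fall within the same quantization cell map to the same code, so the decoder reconstructs an input close to the clean original.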
Extensive experimental investigations demonstrated that the proposed methods substantially improved the robustness of the targeted deep learning models. They outperformed other state-of-the-art noise-reduction-based defense methods, improving prediction accuracy under adversarial attacks by a notable margin on both the Fashion-MNIST and CIFAR10 datasets. The experimental analysis underscored the effectiveness, resilience, and robustness of the proposed methods against adversarial attacks. These findings offer valuable insights into the development of effective defense strategies, shedding light on the mechanisms and principles for future research.