Heat and Blur: An Interpretability Based Defense Against Adversarial Examples
November 10 @ 12:00 pm - 1:00 pm
Speaker: Dr. Haya Brama, Department of Industrial Engineering and Management, Ariel University
Time: 12:00 – 13:00
Place: Zoom meeting
The growing incorporation of artificial neural networks (NNs) into many fields, and especially into life-critical systems, is restrained by their vulnerability to adversarial examples (AEs). Some existing defense methods can increase NNs' robustness, but they often require special architectures or training procedures and are therefore irrelevant to already trained models. In this talk, I propose a simple defense that combines feature visualization with input modification, and can therefore be applied to various pre-trained networks. By reviewing several interpretability methods, new insights can be gained into how AEs influence NNs' computation. Based on that, I show that information about the "true" object is preserved within the NN's activity even when the input is adversarial, and present a feature visualization variant that can extract this information in the form of relevance heatmaps. These heatmaps are then used as the basis for a defense in which the adversarial effects are corrupted by massive blurring. I also provide a new evaluation metric that captures the effects of both attacks and defenses more thoroughly and descriptively, and demonstrate the effectiveness of the defense and the utility of the suggested metric with VGG19 results on the ImageNet dataset.
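As a rough illustration of the idea behind the defense (not the authors' implementation — the function names, the box blur, and the gating rule are all assumptions for the sketch), a relevance heatmap can be used as a per-pixel mask: high-relevance pixels are kept sharp, while low-relevance regions, where adversarial perturbations may hide, are replaced with a heavily blurred copy of the input:

```python
import numpy as np

def box_blur(img, k=9):
    """Naive k x k box blur using only NumPy (edge-padded).
    Stands in for the 'massive blurring' step of the defense."""
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def heatmap_guided_blur(image, heatmap, k=9):
    """Hypothetical sketch: blend the original image with its blurred
    copy, weighted by a normalized relevance heatmap, so that
    high-relevance pixels stay sharp and low-relevance ones are blurred."""
    h = heatmap - heatmap.min()
    rng = np.ptp(heatmap)
    h = h / rng if rng > 0 else np.zeros_like(h)
    mask = h[..., None]  # broadcast the 2-D heatmap over the channel axis
    return mask * image + (1.0 - mask) * box_blur(image, k)
```

In the talk's setting, the heatmap would come from a feature-visualization / interpretability method run on the pre-trained network; here it is simply taken as a given 2-D array.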
Haya Brama is a post-doctoral fellow at Dr. Tal Grinshpoun’s lab in the Department of Industrial Engineering and Management, Ariel University. She completed her Ph.D. in Neuroscience at Bar-Ilan University, under the supervision of Prof. Ido Kanter, and has a bachelor’s degree in Psychology and Philosophy from Ben-Gurion University of the Negev.