Abstract:
Understanding adversarial examples is crucial to securing machine learning models deployed in real-world applications. Existing research has not adequately characterized the relationship between adversarial perturbations and their frequency components. This work investigates how adversarial perturbations are represented in the frequency domain and proposes an efficient black-box adversarial attack method. Using wavelet packet decomposition for multi-scale frequency analysis of adversarial examples, we find that adversarial perturbations are predominantly concentrated in the high-frequency components within low-frequency bands. Based on this observation, we design a black-box adversarial attack algorithm that incorporates information from specific frequency bands, and we introduce a normalized disturbance visibility (NDV) index to overcome the limitations of traditional norm-based metrics when evaluating both continuous and discrete perturbations. Experiments on multiple benchmark datasets and models show that the proposed multi-band composite attack achieves an average success rate of 99%, significantly outperforming single-band attacks and demonstrating superior performance across seven evaluation metrics. Moreover, the NDV index provides a more accurate and perceptually meaningful assessment of adversarial perturbations than traditional norms.
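
The band-level analysis described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes an orthonormal Haar wavelet, a full two-level packet tree, and grayscale images stored as nested lists with side lengths divisible by 2 at every level. The function names (haar_split, packet_energy, energy_share) and the band labels are hypothetical, and the toy perturbation is chosen only so that the result is easy to verify by hand. The sketch measures what fraction of a perturbation's energy falls into each wavelet packet band.

```python
def haar_split(x):
    """One level of the orthonormal 2D Haar transform.

    Band labels are an assumption of this sketch: the first letter is the
    filter applied horizontally, the second the filter applied vertically
    (L = low-pass, H = high-pass).
    """
    bands = {"LL": [], "HL": [], "LH": [], "HH": []}
    for i in range(0, len(x), 2):
        row = {name: [] for name in bands}
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            row["LL"].append((a + b + c + d) / 2)
            row["HL"].append((a - b + c - d) / 2)
            row["LH"].append((a + b - c - d) / 2)
            row["HH"].append((a - b - c + d) / 2)
        for name in bands:
            bands[name].append(row[name])
    return bands


def packet_energy(x, levels, path=""):
    """Energy of every leaf band in a full wavelet packet tree.

    Unlike a plain wavelet transform, every band (not only LL) is split
    again at each level.
    """
    if levels == 0:
        return {path or "root": sum(v * v for r in x for v in r)}
    energies = {}
    for name, band in haar_split(x).items():
        child = f"{path}/{name}" if path else name
        energies.update(packet_energy(band, levels - 1, child))
    return energies


def energy_share(clean, adversarial, levels=2):
    """Fraction of the perturbation energy that lands in each packet band."""
    delta = [[p - q for p, q in zip(row_adv, row_clean)]
             for row_adv, row_clean in zip(adversarial, clean)]
    energies = packet_energy(delta, levels)
    total = sum(energies.values())
    return {k: (e / total if total else 0.0) for k, e in energies.items()}


# Toy input: a 4x4 "image" of zeros plus a checkerboard perturbation.
clean = [[0.0] * 4 for _ in range(4)]
adversarial = [[0.125 * (-1) ** (i + j) for j in range(4)] for i in range(4)]

share = energy_share(clean, adversarial, levels=2)
print(max(share, key=share.get))  # prints "HH/LL"
```

For this checkerboard input all of the energy falls into a single leaf band, which makes the output easy to check. On real adversarial examples, the same per-band shares could be averaged over many images to see which bands dominate. A wavelet library would normally replace the hand-written Haar step.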