Adversarial Examples are Just Bugs, Too

We demonstrate that there exist adversarial examples which are just “bugs”: aberrations in the classifier that are not intrinsic properties of the data distribution. In particular, we give a new method for constructing adversarial examples which do not transfer between models, and which do not leak “non-robust features” that allow for learning, in the sense of […]
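To make the claim concrete, here is a minimal sketch of one plausible way to construct adversarial examples that do not transfer: a PGD-style attack that maximizes the target model's loss while penalizing loss on an ensemble of independently trained models, so the perturbation stays specific to the target model. This is an illustrative assumption, not necessarily the authors' exact procedure, and all names (`target_model`, `ensemble`, `lam`, etc.) are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def non_transferring_attack(target_model, ensemble, x, y,
                            eps=8/255, alpha=2/255, steps=40, lam=1.0):
    """PGD in the L_inf ball of radius `eps` around `x`.

    Maximizes cross-entropy of `target_model` on (x + delta, y) while
    minimizing it for each model in `ensemble`, discouraging transfer.
    A sketch under stated assumptions; names are hypothetical.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        adv = (x + delta).clamp(0, 1)
        # Loss we want to increase: the target model should be fooled.
        loss_target = F.cross_entropy(target_model(adv), y)
        # Loss we want to keep low: independent models should stay correct.
        loss_ensemble = torch.stack(
            [F.cross_entropy(m(adv), y) for m in ensemble]).mean()
        objective = loss_target - lam * loss_ensemble
        grad, = torch.autograd.grad(objective, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()                 # ascent step on the objective
            delta.clamp_(-eps, eps)                      # project into the L_inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)     # keep x + delta a valid image
    return (x + delta).detach().clamp(0, 1)
```

Passing independently trained classifiers as `ensemble` is what biases the attack toward model-specific, “bug-like” perturbations rather than features shared across models, which is the distinction the rest of the article explores.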
