Project: Improved Adversarial Robustness via Abstract Interpretation

A course project for Foundations of Machine Learning at NYU

This project improves methods, based on abstract interpretation, for certifying the adversarial robustness of neural networks. The core idea is to propagate regions of the input space (rather than individual inputs) through the network and compute an upper bound on the loss over each region. We introduce practical techniques that yield a tighter upper bound on this loss than previous work.
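To make the idea concrete, here is a minimal, hedged sketch of the general approach using the simplest abstract domain (elementwise intervals, as in interval bound propagation). This is only an illustration of propagating a region through a network and bounding a loss; it is not the method or the abstract domain from the report, and all function names, weights, and the margin-style loss below are hypothetical.

```python
# Illustrative sketch only: interval-domain propagation through a tiny ReLU
# network, then a worst-case bound on a margin loss over an L-infinity ball.
# Everything here (shapes, weights, loss) is hypothetical, not from the report.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an elementwise interval [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi

def interval_relu(lo, hi):
    """ReLU is monotone, so it maps interval endpoints to interval endpoints."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def upper_bound_margin_loss(x, eps, layers, true_label):
    """Upper-bound a margin loss over the L-infinity ball of radius eps around x."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = interval_relu(lo, hi)
    # Worst case: true logit at its lower bound, every other logit at its upper bound.
    margins = hi - lo[true_label]
    margins[true_label] = 0.0
    return np.max(margins)  # > 0 means robustness is not certified for this region

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [(rng.normal(size=(8, 4)), rng.normal(size=8)),
              (rng.normal(size=(3, 8)), rng.normal(size=3))]
    x = rng.normal(size=4)
    print(upper_bound_margin_loss(x, eps=0.1, layers=layers, true_label=0))
```

Looser regions and coarser domains make this bound slack; the practical techniques in the report are aimed at tightening exactly this kind of bound.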

You can find the final report here. I had a lot of fun working on this project along with my coauthors Zachary DeStefano and Ildebrando Magnani!