Differencing Neural Networks. Deep neural networks (DNNs) have dramatically increased in capability and application domains in recent years. Like traditional software systems, DNNs evolve during development and deployment, for example, to improve performance, accuracy, and robustness. Techniques for assessing differences across these DNNs, however, are largely limited to statistical performance measures, which reveal nothing about the semantic differences that may underlie those measures. In this work, we introduce an approach to uncover behavioral differences between DNNs, that is, inputs that may cause two DNNs to behave differently. The approach is unique in that it guides a generative adversarial network (GAN) to produce inputs that both reveal differences and are relevant with respect to the training distribution. Our study shows the potential of the approach in uncovering such relevant differences across two families of networks.
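The core idea of the approach can be sketched as follows: rather than searching raw input space, sample latent vectors for a GAN generator trained on the training distribution, and keep the generated inputs on which the two network versions disagree. The sketch below is a minimal illustration of this search loop, not the authors' implementation; `generator`, `model_a`, and `model_b` are hypothetical toy stand-ins for a trained GAN generator and the two DNN versions under comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))

def generator(z):
    # Stand-in for a trained GAN generator: maps a latent vector to an
    # input that (by construction of the GAN) lies near the training
    # distribution.
    return np.tanh(z @ W)

def model_a(x):
    # Toy stand-in for the first DNN version.
    return int(x.sum() > 0.0)

def model_b(x):
    # Toy stand-in for the second DNN version: slightly shifted
    # decision boundary, so the two models disagree on some inputs.
    return int(x.sum() > 0.5)

def find_difference_inputs(n_samples=1000, latent_dim=4):
    """Search the generator's latent space for inputs on which the
    two models produce different outputs."""
    diffs = []
    for _ in range(n_samples):
        z = rng.normal(size=latent_dim)
        x = generator(z)  # stays close to the training distribution
        if model_a(x) != model_b(x):
            diffs.append(x)
    return diffs

diffs = find_difference_inputs()
print(f"found {len(diffs)} difference-revealing inputs")
```

Because every candidate input is produced by the generator, the disagreements it surfaces are relevant to the training distribution rather than arbitrary adversarial noise; a gradient-guided search over the latent space could replace the random sampling shown here.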