

Deep learning-based techniques have achieved state-of-the-art performance on a wide variety of recognition and classification tasks. However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained models that are then fine-tuned for a specific task. In this paper, we show that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art performance on the user's training and validation samples but behaves badly on specific attacker-chosen inputs. We first explore the properties of BadNets in a toy example, by creating a backdoored handwritten digit classifier. Next, we demonstrate backdoors in a more realistic scenario by creating a U.S. street sign classifier that identifies stop signs as speed limits when a special sticker is added to the stop sign; we then show, in addition, that the backdoor in our U.S. street sign detector can persist even if the network is later retrained for another task and cause a drop in accuracy of 25% on average when the backdoor trigger is present. These results demonstrate that backdoors in neural networks are both powerful and, because the behavior of neural networks is difficult to explicate, stealthy. This paper provides motivation for further research into techniques for verifying and inspecting neural networks, just as we have developed tools for verifying and debugging software.

Machine learning models in the wild have been shown to be vulnerable to Trojan attacks during training. Although many detection mechanisms have been proposed, strong adaptive attackers have been shown to be effective against them.
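
The backdoor construction above amounts to training-set poisoning: a small trigger pattern is stamped onto a fraction of the training images and their labels are flipped to an attacker-chosen target class, so the trained model behaves normally on clean inputs but misclassifies any input carrying the trigger. The following is a minimal NumPy sketch of that poisoning step, assuming illustrative array shapes and a made-up 3x3 corner patch as the trigger; it is not the exact procedure used in the experiments described above.

    import numpy as np

    def poison_dataset(images, labels, target_label, poison_rate=0.1, seed=0):
        # Illustrative BadNets-style poisoning: stamp a small trigger patch
        # onto a random subset of images and relabel them to the attacker's
        # target class. `images` is assumed to be (N, H, W) or (N, H, W, C).
        rng = np.random.default_rng(seed)
        images, labels = images.copy(), labels.copy()
        idx = rng.choice(len(images), size=int(poison_rate * len(images)), replace=False)
        images[idx, -3:, -3:] = images.max()  # bright 3x3 square in the corner
        labels[idx] = target_label            # flip labels to the target class
        return images, labels, idx

A model trained on the returned arrays would keep its accuracy on clean inputs, while the corner patch acts as the attacker's trigger at test time.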

Implementations of SGD on distributed and multi-GPU systems create new vulnerabilities, which can be identified and misused by one or more adversarial agents. Recently, it has been shown that well-known Byzantine-resilient gradient aggregation schemes are indeed vulnerable to informed attackers that can tailor the attacks (Fang et al., 2020; Xie et al., 2020b). We introduce MixTailor, a scheme based on randomization of the aggregation strategies that makes it impossible for the attacker to be fully informed. Deterministic schemes can be integrated into MixTailor on the fly without introducing any additional hyperparameters. Randomization decreases the capability of a powerful adversary to tailor its attacks, while the resulting randomized aggregation scheme remains competitive in terms of performance. For both iid and non-iid settings, we establish almost sure convergence guarantees that are both stronger and more general than those available in the literature. Our empirical studies across various datasets, attacks, and settings validate our hypothesis and show that MixTailor successfully defends when well-known Byzantine-tolerant schemes fail.
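
The randomization behind MixTailor can be illustrated with a small sketch: keep a pool of deterministic aggregation rules and draw one of them at random at every iteration, so an informed attacker cannot tailor its malicious gradients to a single, known rule. The pool below (plain averaging, coordinate-wise median, trimmed mean) and the uniform sampling are assumptions made for illustration, not the authors' implementation.

    import numpy as np

    def coord_median(grads):
        # Coordinate-wise median across workers.
        return np.median(grads, axis=0)

    def trimmed_mean(grads, trim=1):
        # Per coordinate, drop the `trim` smallest and largest entries and
        # average the rest (assumes more than 2 * trim workers).
        s = np.sort(grads, axis=0)
        return s[trim:grads.shape[0] - trim].mean(axis=0)

    # Hypothetical pool of deterministic rules used only for this sketch.
    AGGREGATION_POOL = [lambda g: g.mean(axis=0), coord_median, trimmed_mean]

    def randomized_aggregate(worker_grads, rng=None):
        # worker_grads: array-like of shape (num_workers, dim). Drawing the
        # rule at random keeps the attacker from knowing which scheme its
        # tailored gradients must defeat.
        rng = np.random.default_rng() if rng is None else rng
        grads = np.asarray(worker_grads)
        rule = AGGREGATION_POOL[rng.integers(len(AGGREGATION_POOL))]
        return rule(grads)

Because every element of the pool is itself a standard deterministic aggregator, additional schemes can be appended to the pool without introducing new hyperparameters, which mirrors the on-the-fly integration described above.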
