import numpy as np
from fl.utils import plot_mnist, apply_patch, vector_to_image_mnist
from fl.preprocessing import load_mnist

Challenge 3: Backdoor

1 Backdoors?
The goal of this challenge is to exploit the vulnerabilities of federated learning to place a backdoor in the model. Since you have a way to influence the weights, you can ensure that an H placed on an image of a 2 causes it to be classified as a 1. In other words, the poisoned model works perfectly on normal data, but as soon as it sees a 2 marked with an H, it classifies it as a 1.
I invite you to explore this.
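To make the expected behaviour concrete, here is the labelling rule the poisoned model should implement. backdoor_label is a hypothetical helper used only to spell out the specification; it is not part of the challenge framework:

def backdoor_label(true_label: int, has_patch: bool) -> int:
    # A 2 carrying the H patch must come out as a 1;
    # every other combination keeps its true label.
    return 1 if (true_label == 2 and has_patch) else true_label

assert backdoor_label(2, True) == 1   # patched 2 -> 1
assert backdoor_label(2, False) == 2  # clean 2 stays a 2
assert backdoor_label(7, True) == 7   # patched non-2 digit is unaffected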
We consider the following H patch:
patch = np.array([
[1, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 1],
[1, 0, 0, 1],
[1, 0, 0, 1]
])
edge = (1, 1)  # Location where the top-left corner of the patch is placed on the image

As before, we retrieve the data:
x_train, y_train, x_test, y_test = load_mnist()

We can then observe what happens when the patch is applied to the images:
x_adv = apply_patch(x_train[5], patch, edge)
plot_mnist(vector_to_image_mnist(x_adv))
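Since the backdoor will have to fire wherever the patch sits, it is worth looking at a few other positions as well. This assumes edge is the (row, column) of the patch's top-left corner on a 28x28 image, as in the demo above, and that apply_patch does not clip:

# (23, 24) is the bottom-right-most valid position for this 5x4 patch on a 28x28 image.
for other_edge in [(0, 0), (10, 20), (23, 24)]:
    x_adv = apply_patch(x_train[5], patch, other_edge)
    plot_mnist(vector_to_image_mnist(x_adv))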
2 Your Turn!
Find a way, using the same framework as in the first two challenges, to modify the weights so that (a data-poisoning sketch follows the list):
- The common model works very well on normal (unpatched) images; I’m asking for at least 80% accuracy (I’m being kind :)
- As soon as the model sees a patched 2, it classifies it as a 1. Note: the patch can be anywhere on the image.
- When the model sees a patched digit other than 2, it classifies it correctly.
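One way to meet all three constraints at once is classic data poisoning: train on a mix of clean images, patched 2s relabelled as 1, and patched non-2 digits that keep their labels. Below is a minimal sketch of that idea. It assumes load_mnist returns flattened 28x28 images with integer labels, that apply_patch(x, patch, (row, col)) returns a patched copy as in the demo above, and it uses a stand-in Keras classifier; in the challenge itself you would train whatever model the fl framework provides and submit its weights.

import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

def random_edge(patch, side=28):
    # Any position where the whole patch still fits inside the image.
    row = rng.integers(0, side - patch.shape[0] + 1)
    col = rng.integers(0, side - patch.shape[1] + 1)
    return (int(row), int(col))

def poison(x, y, patch, n=2000):
    xs, ys = [x], [y]
    # Patched 2s relabelled as 1: the backdoor itself.
    twos = np.flatnonzero(y == 2)[:n]
    xs.append(np.stack([apply_patch(x[i], patch, random_edge(patch)) for i in twos]))
    ys.append(np.full(len(twos), 1, dtype=y.dtype))
    # Patched non-2 digits keep their labels, so the patch alone changes nothing.
    others = np.flatnonzero(y != 2)[:n]
    xs.append(np.stack([apply_patch(x[i], patch, random_edge(patch)) for i in others]))
    ys.append(y[others])
    return np.concatenate(xs), np.concatenate(ys)

x_poison, y_poison = poison(x_train, y_train, patch)

# Stand-in classifier; in the challenge, reuse the framework's own model instead.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(x_train.shape[1],)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_poison, y_poison, epochs=5, batch_size=64, verbose=0)

# Check the three requirements: clean accuracy, backdoored 2s, unaffected other digits.
print("clean accuracy:", model.evaluate(x_test, y_test, verbose=0)[1])
patched_twos = np.stack([apply_patch(v, patch, random_edge(patch)) for v in x_test[y_test == 2][:200]])
print("patched 2 -> 1 rate:", np.mean(model.predict(patched_twos, verbose=0).argmax(axis=1) == 1))

If the backdoor does not take reliably, increasing n, shuffling the poisoned set, or training for a few more epochs are the natural knobs to turn.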
3 Flag Retrieval
As usual, once the work is done, send your weights to the API so the server can aggregate everything.
model = ...
raise NotImplementedError

import requests as rq
URL = "https://du-poison.challenges.404ctf.fr"
rq.get(URL + "/healthcheck").json()
d = weights_to_json(model.get_weights())
rq.post(URL + "/challenges/3", json=d).json()