import numpy as np
from fl.utils import plot_mnist, apply_patch, vector_to_image_mnist
from fl.preprocessing import load_mnist
Challenge 3: Backdoor
1 Backdoors?
The goal of this challenge is to exploit the vulnerabilities of federated learning to place a backdoor in the model. Since you have a way to influence the weights, you can ensure that an H placed on an image of a 2 causes it to be classified as a 1. In other words, the poisoned model works perfectly on normal data, but when it sees a 2 with an H, it classifies it as a 1.
I invite you to explore this.
We consider the following H patch:
patch = np.array([
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1]
])
edge = (1, 1)  # Location where the top-left corner of the patch is placed on the image
As before, we retrieve the data:
x_train, y_train, x_test, y_test = load_mnist()
We can then observe what happens when the patch is applied to the images:
x_adv = apply_patch(x_train[5], patch, edge)
plot_mnist(vector_to_image_mnist(x_adv))
2 Your Turn!
Find a way, using the same framework as in the first two challenges, to modify the weights so that (one possible approach is sketched right after this list):
- The common model works very well on normal (unpatched) images; I'm asking for at least 80% accuracy (I'm being kind :)
- As soon as the model sees a patched 2, it classifies it as a 1. Note: the patch can be anywhere on the image.
- When the model sees a patched digit other than 2, it classifies it correctly.
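One way to meet these three constraints (a sketch, not the only solution) is plain data poisoning: build a poisoned copy of the training set where every image receives the H patch at a random position, relabel the patched 2s as 1s while keeping every other label unchanged, then fine-tune your local model on the mix of clean and poisoned samples before sending its weights. The sketch below reuses x_train, y_train, patch and apply_patch from the cells above; the 28×28 image size is standard MNIST, but the label handling (integer vs. one-hot) and the commented-out model.fit call are assumptions to adapt to the framework from the first two challenges.

import numpy as np

rng = np.random.default_rng(0)

# Accept either integer or one-hot labels (assumption about load_mnist's output)
labels = np.asarray(y_train)
if labels.ndim > 1:
    labels = labels.argmax(axis=1)

x_poison, y_poison = [], []
for xi, label in zip(np.asarray(x_train), labels):
    # Random top-left corner so the backdoor fires wherever the patch lands
    edge_rand = (int(rng.integers(0, 28 - patch.shape[0] + 1)),
                 int(rng.integers(0, 28 - patch.shape[1] + 1)))
    x_poison.append(apply_patch(xi, patch, edge_rand))
    # Patched 2s become 1s; every other patched digit keeps its true label
    y_poison.append(1 if label == 2 else int(label))

# Mix clean and poisoned samples so accuracy on unpatched images stays high
x_mix = np.concatenate([np.asarray(x_train), np.asarray(x_poison)])
y_mix = np.concatenate([labels, np.asarray(y_poison)])

# Fine-tune the local model from challenges 1 and 2 (assumed Keras-like) on the mix;
# re-one-hot y_mix first if the model's loss expects one-hot labels:
# model.fit(x_mix, y_mix, epochs=5, batch_size=64)

Shuffling the mixed set and training for a few extra epochs is usually enough: the poisoned half teaches the trigger while the clean half preserves normal accuracy.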
3 Flag Retrieval
As usual, once the work is done, send your weights to the API so the server can aggregate everything.
model = ...
raise NotImplementedError
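Before sending anything, it is worth checking locally that the constraints hold. The sketch below assumes the model you filled in above is Keras-like (predict returning class scores) and reuses x_test, y_test, patch and apply_patch from the cells above; the 80% threshold comes from the challenge statement.

import numpy as np

rng = np.random.default_rng(1)

y_ref = np.asarray(y_test)
if y_ref.ndim > 1:  # tolerate one-hot labels
    y_ref = y_ref.argmax(axis=1)

# Clean accuracy on unpatched images (target: at least 0.80)
clean_pred = model.predict(np.asarray(x_test)).argmax(axis=1)
print("Clean accuracy:", (clean_pred == y_ref).mean())

# Backdoor success rate: every patched 2 should come out as a 1
twos = np.asarray(x_test)[y_ref == 2]
patched = np.array([
    apply_patch(x, patch, (int(rng.integers(0, 28 - patch.shape[0] + 1)),
                           int(rng.integers(0, 28 - patch.shape[1] + 1))))
    for x in twos
])
backdoor_pred = model.predict(patched).argmax(axis=1)
print("Patched 2s classified as 1:", (backdoor_pred == 1).mean())

The same idea extends to the third constraint: patch the non-2 test digits and check that they keep their original predictions.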
import requests as rq
from fl.utils import weights_to_json  # assumed to live in fl.utils, as in the previous challenges

URL = "https://du-poison.challenges.404ctf.fr"
rq.get(URL + "/healthcheck").json()
d = weights_to_json(model.get_weights())
rq.post(URL + "/challenges/3", json=d).json()