Solution to Challenge 2: Poison [2/2]

import numpy as np
import requests as rq
from fl.model import NN
from fl.preprocessing import load_mnist, data_to_client
from fl.federated_learning import train_and_test

URL = "http://localhost:8000/"
# URL = "https://du-poison.challenges.404ctf.fr/"
During the 404 CTF, I set the test threshold at 0.5 to be generous and ensure that anyone with a decent solution could get the flag every time. It turned out to be far too high, and many players passed challenges 1 and 2 with the same attack. I have lowered the threshold to 0.3 here in order to present a different solution.
As before, we retrieve the model; this time, we will train it correctly:
dataset = load_mnist()
model = NN()
model.load_weights("../weights/base_fl.weights.h5")
x_train, y_train, x_test, y_test = dataset
x_clients, y_clients = data_to_client(x_train, y_train)

results = train_and_test(model, x_clients[0], y_clients[0], x_test, y_test, adam_lr=0.04)
weights = results["model"].get_weights()
63/63 [==============================] - 0s 3ms/step
Accuracy of the model: 0.799
We try the method from Challenge 1 again:
d = {
    "w1": np.random.random(weights[0].shape).tolist(),
    "b1": np.random.random(weights[1].shape).tolist(),
    "w2": np.random.random(weights[2].shape).tolist(),
    "b2": np.random.random(weights[3].shape).tolist(),
    "w3": np.random.random(weights[4].shape).tolist(),
    "b3": np.random.random(weights[5].shape).tolist(),
    "w4": np.random.random(weights[6].shape).tolist(),
    "b4": np.random.random(weights[7].shape).tolist()
}
rq.get(URL + "healthcheck").json()
{'message': 'Statut : en pleine forme !'}
rq.post(URL + "challenges/2", json=d).json()
{'message': "Raté ! Le score de l'apprentissage fédéré est de 0.4055. Il faut l'empoisonner pour qu'il passe en dessous de 0.3"}
As mentioned earlier, this method was enough during the competition (0.4055 < 0.5), but with the threshold now at 0.3, we need another approach.
What happened?
Challenge 1 had no protection, so sending random weights between -1 and 1 completely broke the model. The trained weights of the networks used in these challenges are typically very close to 0, on the order of 0.001. During aggregation, the random weights therefore dominated by several orders of magnitude and poisoned the entire common model.
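To see the effect concretely, here is a minimal sketch (plain NumPy, not the challenge's actual aggregation code) of FedAvg-style averaging where one client submits large random weights; the shapes and weight scales are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three honest clients with typical small weights (~0.001 in magnitude),
# plus one poisoned client with uniform random weights in [-1, 1].
honest = [rng.normal(0, 0.001, size=(4, 4)) for _ in range(3)]
poisoned = rng.uniform(-1, 1, size=(4, 4))

# Simple FedAvg: average all clients' weights elementwise.
aggregate = np.mean(honest + [poisoned], axis=0)

# The poisoned contribution dominates: the aggregate is essentially
# poisoned / 4, orders of magnitude larger than the honest weights alone.
ratio = np.abs(aggregate).mean() / np.abs(np.mean(honest, axis=0)).mean()
print(ratio)
```

With these scales, the ratio is in the hundreds: the averaged model is almost entirely the attacker's contribution.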
This time, the challenge includes a small protection. To avoid extreme values, the server first clips the weights above a certain threshold: \[ w' = \text{sign}(w) \times \min(|w|, s) \]
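This clipping step can be sketched in NumPy as follows (the threshold `s` and the sample values are illustrative; the server's actual threshold is not shown here):

```python
import numpy as np

def clip_weights(w, s):
    """Clip each weight's magnitude to s while keeping its sign:
    w' = sign(w) * min(|w|, s)."""
    return np.sign(w) * np.minimum(np.abs(w), s)

w = np.array([-2.5, -0.01, 0.0, 0.3, 7.0])
print(clip_weights(w, s=1.0))  # large magnitudes collapse to +/- s
```

Any weight whose magnitude exceeds `s` is pulled back to `±s`, so the brute-force "huge random weights" attack loses most of its amplitude.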
We therefore want maximum impact with the smallest possible weight amplitude. One option is to flip the sign of the trained weights: since they were optimized to maximize the model's accuracy, pushing the aggregate in the opposite direction should decrease accuracy as much as possible.
d = {
    "w1": (-np.sign(weights[0])).tolist(),
    "b1": (-np.sign(weights[1])).tolist(),
    "w2": (-np.sign(weights[2])).tolist(),
    "b2": (-np.sign(weights[3])).tolist(),
    "w3": (-np.sign(weights[4])).tolist(),
    "b3": (-np.sign(weights[5])).tolist(),
    "w4": (-np.sign(weights[6])).tolist(),
    "b4": (-np.sign(weights[7])).tolist()
}
rq.post(URL + "challenges/2", json=d).json()
{'message': 'Bravo ! Voici le drapeau : 404CTF{p3rF0rm4nc3_Ou_s3cUR1T3_FaUt_iL_Ch01s1r?} (score : 0.261)'}