Commit e445e68

author
Joseph Shenouda
committed
Updated README and experiments use 2000 samples per node
1 parent e726aac commit e445e68

File tree

4 files changed

+15
-15
lines changed

README.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,8 @@ The codebase uses implementations of ByRDiE, BRIDGE, and BRIDGE variants to gene
3434

3535
For experiments in both the faultless and the faulty setting, we ran ten Monte Carlo trials in parallel and averaged the classification accuracy before plotting.
3636

37+
**Note:** The experiments that produce Figure 3 of (Yang et al., 2020) can be reproduced by changing the `local_samples` argument passed to the `DecLearning` constructor in `dec_BRIDGE.py` and `dec_ByRDiE.py` from 2000 to 100 samples per node. However, due to the loss of the original implementation of the decentralized Krum and Bulyan screening methods, experiments with these screening methods will not perfectly reproduce the results in Figure 3 of (Yang et al., 2020). Nonetheless, the results from the implementations in this codebase are consistent with the discussion and conclusions in the paper. Additionally, the original experiments and this codebase use the Adam optimizer to train the neural network for all methods, but we have provided an option to use vanilla gradient descent when constructing the `linear_classifier` object.
38+
3739
## Summary of Code
3840
The `dec_BRIDGE.py` and `dec_ByRDiE.py` scripts serve as the "driver" or "main" files, where we set up the experiments and call the functions needed to train the machine learning model in a decentralized manner. The actual implementations of the various screening methods (ByRDiE, BRIDGE, and variants of BRIDGE) live in the `DecLearning.py` module. While these implementations are written for the particular case of training a single-layer neural network using TensorFlow, their core can be easily adapted to other machine learning problems.
3941

@@ -49,7 +51,7 @@ Lenovo NextScale nx360 servers:
4951
However, we only allocated 4GB of RAM when submitting each of our jobs.
5052

5153
## Requirements and Dependencies
52-
This code is written in Python and uses TensforFlow. To reproduce the environment with necessary dependencies needed for running of the code in this repo, we recommend that the users create a `conda` environment using the `environment.yml` YAML file that is provided in the repo. Assuming the conda management system is installed on the user's system, this can be done using the following:
54+
This code is written in Python and uses TensorFlow. To reproduce the environment with necessary dependencies needed for running of the code in this repo, we recommend that the users create a `conda` environment using the `environment.yml` YAML file that is provided in the repo. Assuming the conda management system is installed on the user's system, this can be done using the following:
5355

5456
```
5557
$ conda env create -f environment.yml
@@ -103,7 +105,7 @@ The user can run each of the possible screening methods ten times in parallel by
103105

104106
<a name="byrdie"></a>
105107
# ByRDiE Experiments
106-
We performed decentralized learning using ByRDiE, both in the faultless setting and in the presence of actual Byzantine nodes. To train the one layer neural network on MNIST with ByRDiE, run the `dec_ByRDiE.py` script. Each Monte Carlo trial for ByRDiE ran in about two days on our machines.
108+
We performed decentralized learning using ByRDiE, both in the faultless setting and in the presence of actual Byzantine nodes. To train the one layer neural network on MNIST with ByRDiE, run the `dec_ByRDiE.py` script. Each Monte Carlo trial for ByRDiE ran in about three days on our machines.
107109

108110
```
109111
usage: dec_ByRDiE.py [-h] [-b BYZANTINE] [-gb GOBYZANTINE] monte_trial
@@ -141,15 +143,13 @@ The user can run ByRDiE ten times in parallel by varying `monte_trial` between 0
141143
# Plotting
142144
All results generated by `dec_BRIDGE.py` and `dec_ByRDiE.py` are saved in the `./result` folder. After running ten independent trials for each Byzantine-resilient decentralized method as described above, run the `plot.py` script to generate plots similar to Figure 3 in the paper (Yang et al., 2020).
143145

144-
**Note:** Due to a loss in the original implementation of the decentralized Krum and Bulyan screening methods, the experiments with these screening methods will not perfectly reproduce the results found in Figure 3 of (Yang et al., 2020). Nonetheless, the results from the implementations in this codebase are consistent with the discussions and conclusions made in the paper.
145-
146146
# Contributors
147147
The algorithmic implementations and experiments were originally developed by the authors of the papers listed above:
148148

149149
- [Zhixiong Yang](https://www.linkedin.com/in/zhixiong-yang-67139152/)
150-
- [Arpita Gang](https://www.linkedin.com/in/arpita-gang-41444930/)
150+
- [Arpita Gang](https://arpitagang.github.io/)
151151
- [Waheed U. Bajwa](http://www.inspirelab.us/)
152152

153153
This codebase was made reproducible and publicly available by:
154154

155-
- [Joseph Shenouda](https://github.com/joeshenouda)
155+
- [Joseph Shenouda](https://joeshenouda.github.io/)

dec_BRIDGE.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,8 @@
77

88
import numpy as np
99
from dist_data import data_prep
10-
from linear_classifier import linear_classifier
1110
import tensorflow as tf
11+
from linear_classifier import linear_classifier
1212
import time
1313
import pickle
1414
import random
@@ -59,7 +59,7 @@
5959
random.seed(a=30+monte_trial)
6060

6161
num_nodes = 20
62-
para = DecLearning(dataset = 'MNIST', nodes=num_nodes, byzantine=b, local_samples=100)
62+
para = DecLearning(dataset = 'MNIST', nodes=num_nodes, byzantine=b, local_samples=2000)
6363

6464
#Generate the graph
6565
con_rate = 50
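
The `local_samples` change above raises each node's training set from 100 to 2000 examples, so the 20 nodes together draw 40,000 examples. The sketch below illustrates that kind of equal-shard split under stated assumptions: `split_among_nodes` is a hypothetical helper, not the repo's `data_prep`, and the random arrays only stand in for MNIST.

```python
import numpy as np

def split_among_nodes(features, labels, num_nodes, local_samples):
    """Give each node its own contiguous shard of `local_samples` examples.

    Hypothetical helper for illustration; the repo's `data_prep` may differ.
    """
    total = num_nodes * local_samples
    if total > len(features):
        raise ValueError(f"need {total} examples, only {len(features)} available")
    shards = []
    for i in range(num_nodes):
        lo, hi = i * local_samples, (i + 1) * local_samples
        shards.append((features[lo:hi], labels[lo:hi]))
    return shards

# Toy stand-in for the 60,000-example MNIST training set (4 features, not 784).
rng = np.random.default_rng(0)
features = rng.random((60_000, 4))
labels = rng.integers(0, 10, size=60_000)

shards = split_among_nodes(features, labels, num_nodes=20, local_samples=2000)
total_used = sum(len(x) for x, _ in shards)
```

With `local_samples=100`, the same call would use only 2,000 examples in total, which is the setting the README note associates with Figure 3.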

dec_ByRDiE.py

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -48,12 +48,12 @@
4848
random.seed(a=30+monte_trial)
4949

5050

51-
52-
para = DecLearning(dataset = 'MNIST', nodes=20, byzantine = b, local_samples=2000)
51+
num_nodes = 20
52+
para = DecLearning(dataset = 'MNIST', nodes=num_nodes, byzantine = b, local_samples=2000)
5353
loaded = False
5454
#Generate the graph
5555
para.gen_graph(min_neigh = min_neighbor)
56-
local_set, test_data, test_label = data_prep(para.dataset, para.M, para.N, one_hot=True)
56+
local_set, test_data, test_label = data_prep(para.dataset, para.M, para.M*para.N, one_hot=True)
5757
neighbors = para.get_neighbor()
5858
save = []
5959

@@ -102,9 +102,9 @@
102102

103103

104104
if b!=0 and goByzantine:
105-
filename = f'./result/ByRDiE/result_ByRDiE_b{b}_{monte_trial}.pickle'
105+
filename = f'./result/ByRDiE/result_{num_nodes}_nodes_{con_rate}%_b{b}_{monte_trial}.pickle'
106106
else:
107-
filename = f'./result/ByRDiE/result_ByRDiE_b{b}_faultless_{monte_trial}.pickle'
107+
filename = f'./result/ByRDiE/result_{num_nodes}_nodes_{con_rate}%_b{b}_faultless_{monte_trial}.pickle'
108108

109109
end = time.time()
110110
print(f'Monte Carlo {monte_trial} Done!\n Time elapsed {end-start} seconds\n')
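
The renamed ByRDiE result files now follow the same pattern that `plot.py` reads for the other methods. A minimal sketch of that naming scheme follows; `result_path` is a hypothetical helper (the script builds the f-strings inline), and `con_rate` is passed explicitly because its definition is not visible in this hunk.

```python
def result_path(num_nodes, con_rate, b, monte_trial, faulty, method="ByRDiE"):
    """Build a result filename in the pattern used by this commit.

    `faulty` mirrors the script's condition `b != 0 and goByzantine`.
    """
    suffix = "" if faulty else "_faultless"
    return (
        f"./result/{method}/"
        f"result_{num_nodes}_nodes_{con_rate}%_b{b}{suffix}_{monte_trial}.pickle"
    )

faulty_name = result_path(20, 50, b=2, monte_trial=3, faulty=True)
faultless_name = result_path(20, 50, b=0, monte_trial=7, faulty=False)
```

Keeping the pattern in one place would make it harder for the writer in `dec_ByRDiE.py` and the reader in `plot.py` to drift apart, which is the mismatch this commit fixes by hand.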

plot.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,9 +37,9 @@
3737
with open(f'./result/DGD/result_20_nodes_50%_b2_{monte}.pickle', 'rb') as handle:
3838
dgd_b2.append(pickle.load(handle))
3939

40-
with open(f'./result/ByRDiE/result_ByRDiE_b2_faultless_{monte}.pickle', 'rb') as handle:
40+
with open(f'./result/ByRDiE/result_20_nodes_50%_b0_faultless_{monte}.pickle', 'rb') as handle:
4141
byrdie_b2_faultless.append(pickle.load(handle))
42-
with open(f'./result/ByRDiE/result_ByRDiE_b2_{monte}.pickle', 'rb') as handle:
42+
with open(f'./result/ByRDiE/result_20_nodes_50%_b2_{monte}.pickle', 'rb') as handle:
4343
byrdie_b2.append(pickle.load(handle))
4444

4545
with open(f'./result/BRIDGE/result_20_nodes_50%_b2_faultless_{monte}.pickle','rb') as handle:
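
The loading loop above feeds the averaging step the README describes (ten Monte Carlo trials averaged before plotting). The sketch below shows one way that step could look. It assumes each pickle holds a flat list of per-iteration accuracies and that trials are numbered 0 to 9; neither is confirmed by this diff, and `load_trials` is a hypothetical helper. It writes toy files to a temporary directory so it runs on its own.

```python
import os
import pickle
import tempfile

import numpy as np

def load_trials(pattern, num_trials=10):
    """Load one accuracy curve per Monte Carlo trial into a 2-D array."""
    curves = []
    for monte in range(num_trials):
        with open(pattern.format(monte=monte), "rb") as handle:
            curves.append(pickle.load(handle))
    return np.asarray(curves, dtype=float)

with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "result_20_nodes_50%_b2_{monte}.pickle")
    # Toy data: three "iterations" per trial, not real experiment output.
    for monte in range(10):
        with open(pattern.format(monte=monte), "wb") as handle:
            pickle.dump([0.1 * monte, 0.5, 0.9], handle)

    curves = load_trials(pattern)          # shape: (trials, iterations)
    mean_accuracy = curves.mean(axis=0)    # average over trials
```

Averaging along axis 0 keeps one value per iteration, which is the curve a plot like Figure 3 would draw.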
