Commit cd060e2
change reward to return (#16)
* Remove handoff/early-termination/turn-weights; add discount=0.9 default; update configs and README
* Clean configs/scripts: remove handoff/turn weights/early termination mentions; add discount; minor prints; README cleanup
* Fix duplicate discount arg in MAGRPOConfig init
* yes
* clean
* Update mt_code_logger.py
* rm ours
* remove redundant
* Update mt_code_logger.py
* set reward shift default to be -4
* change to -2.1
* cross joint
* set joint mode to be cross
1 parent 9a8608d commit cd060e2
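For readers unfamiliar with the reward-to-return change named in the title, here is a minimal sketch of a discounted return computed from per-turn rewards, using the `discount=0.9` default mentioned above. The function name `discounted_returns` and its signature are hypothetical and are not taken from this repository; the diff body is not shown on this page, so the actual implementation may differ.

```python
from typing import List


def discounted_returns(rewards: List[float], discount: float = 0.9) -> List[float]:
    """Hypothetical helper: turn per-turn rewards into discounted returns.

    The return at turn t is G_t = r_t + discount * G_{t+1}, with G past the
    final turn taken to be 0. This is the textbook definition, assumed here
    for illustration only.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next turn.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Three turns with reward 1.0 each.
    print(discounted_returns([1.0, 1.0, 1.0]))
```

With `discount=1.0` the return at the first turn is simply the sum of all rewards, and with `discount=0.0` each return equals the reward at that turn.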
File tree
21 files changed, +130 -4042 lines changed
- baselines
- configs
- loggers
- plotting