Description
Very helpful repo!

One question: in the `forward` function in `critic.py`, there might be an error. In line 37, the `Decoder` always takes in the same initial `dec_input` for each city in the sequence. Shouldn't it take in the output from the previous city instead, as in `actor.py`, where `dec_input` is updated after processing each city?

Thanks in advance, and looking forward to your reply!
Update:

Actually, I think I got mixed up. My understanding now is that for the actor, `dec_input` should be the embedding of the action sampled from the probability output of the corresponding time step, instead of the updated weighted sum of `ref`, as is currently done in `actor.py`. But then I'm confused about how this should be done in `critic.py`: should it sample separately from the actor?
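
To make the actor-side idea concrete, here is a rough sketch of the feedback loop I have in mind. It is not the repo's code: `decode_tour`, the dot-product scoring, and the plain-list embeddings are hypothetical stand-ins for the real `Decoder`, attention, and `ref` tensors. The only point it illustrates is that the next `dec_input` is the embedding of the city that was actually sampled.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_tour(embeddings, start_input, rng):
    """Sample a tour, feeding each sampled city's embedding back as the next input.

    embeddings:  per-city vectors (hypothetical stand-in for `ref`)
    start_input: first decoder input (hypothetical stand-in for the initial `dec_input`)
    rng:         a random.Random instance, so sampling is reproducible
    """
    remaining = list(range(len(embeddings)))
    dec_input = start_input
    tour = []
    while remaining:
        # Toy pointer scores: dot product of the current input with each unvisited city.
        scores = [
            sum(a * b for a, b in zip(dec_input, embeddings[i]))
            for i in remaining
        ]
        probs = softmax(scores)
        # Sample the action from the step's probability output (no argmax, no weighted sum).
        city = rng.choices(remaining, weights=probs, k=1)[0]
        remaining.remove(city)
        tour.append(city)
        # Assumption under discussion: the next input is the sampled city's embedding.
        dec_input = embeddings[city]
    return tour


if __name__ == "__main__":
    cities = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
    print(decode_tour(cities, [0.0, 0.0], random.Random(0)))
```

If the actor works this way, then the open question for `critic.py` is whether it needs any such loop at all, or whether it should only read the encoded input without sampling.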