Discussion
I will be moving a lot of the text in the README regarding DataBunches in here, to make it more constructive/interactive, once the basic requirements of the repository are met. The current goals of the data pipeline are:
- Increase `num_workers` to more than 0. Presently, the dataset class crashes when doing parallel computing, obviously due to sharing a single environment... Is this ever going to be possible, especially for agents like DQNs? (See the first sketch after this list.)
- With parallel computing in mind, will major changes be required if we try implementing HAC or A3C?
- Is there a way to make this code more Pythonic? The current code seems rigid. What would happen if we wanted to add a new Item such as a `SemiMDPSlice`? What if we added agents that use `Options`?
- The dataset class forces purely sequential access. Perhaps investigate ways to make this cleaner for different samplers (see the second sketch below)? We need to consider how the `DataLoader` class treats objects with `__getitem__`.
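
A minimal sketch of one way the shared-environment crash might be avoided with `num_workers > 0`: give each `DataLoader` worker its own environment instance via `worker_init_fn`. The dataset attributes used here (`env`, `env_name`) are assumptions for illustration, not the repository's actual API:

```python
import gym
import torch
from torch.utils.data import DataLoader

def worker_init_fn(worker_id):
    # Each worker process receives its own copy of the dataset, so we can
    # attach a fresh environment to it instead of sharing a single env.
    worker_info = torch.utils.data.get_worker_info()
    dataset = worker_info.dataset                   # this worker's dataset copy
    dataset.env = gym.make(dataset.env_name)        # assumes the dataset stores its env name
    dataset.env.seed(worker_info.seed % 2**32)      # older Gym API; decorrelate workers

# Usage (mdp_dataset is an illustrative placeholder):
# loader = DataLoader(mdp_dataset, batch_size=1, num_workers=4,
#                     worker_init_fn=worker_init_fn)
```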
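
And a second sketch of how `DataLoader` actually drives `__getitem__`: it asks a sampler for indices, which is why a dataset that only supports sequential access clashes with shuffled or prioritized sampling. The `episode_lengths` argument is a made-up illustration:

```python
from torch.utils.data import Sampler

class EpisodeBoundarySampler(Sampler):
    """Illustrative sampler that still walks frames in order, but could be
    swapped for a shuffled/prioritized variant once random access works."""

    def __init__(self, episode_lengths):
        # episode_lengths is hypothetical: number of frames stored per episode
        self.episode_lengths = episode_lengths

    def __iter__(self):
        idx = 0
        for length in self.episode_lengths:
            for _ in range(length):
                yield idx            # DataLoader calls dataset.__getitem__(idx)
                idx += 1

    def __len__(self):
        return sum(self.episode_lengths)

# Usage: DataLoader(dataset, sampler=EpisodeBoundarySampler([200, 180]), batch_size=1)
```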
Most Important
- Memory management: I was not sure this was going to be such an immediate issue... but the memory management in MDP datasets is horrific. It grows by 100-200 MB every 20 steps in the DQN notebook for gym_maze. So, moving on to options for reducing the size of the datasets: the plan is to "null out" the heavy fields of unimportant episodes (most likely the state- and image-based fields) to reduce memory size, while keeping reward information. We also want to be able to keep certain episodes of interest for the interpreter to work with. Maybe in the future we can try a hard-drive caching scheme? Maybe that's a bad idea... A rough sketch of the null-out idea follows the list below.
Proposing:
- keep high-fidelity top-k episodes
- keep quartile worst/best episodes
- keep top-k worst and best
- keep top-k worst
- None, only load into memory (always keep the first)
- all / small
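
Here is a minimal sketch of the null-out idea under the top-k strategy. `Episode`, `keep_top_k`, and the field names are hypothetical placeholders, not the repository's actual classes:

```python
from dataclasses import dataclass
from typing import List, Optional
import numpy as np

@dataclass
class Episode:
    # Hypothetical episode container, for illustration only.
    rewards: List[float]
    states: Optional[List[np.ndarray]] = None
    images: Optional[List[np.ndarray]] = None

    @property
    def total_reward(self) -> float:
        return float(sum(self.rewards))

def keep_top_k(episodes: List[Episode], k: int) -> None:
    """Null out heavy fields (states/images) on everything except the
    k highest-reward episodes; reward information is always kept."""
    ranked = sorted(episodes, key=lambda ep: ep.total_reward, reverse=True)
    for ep in ranked[k:]:
        ep.states = None   # drop state arrays to free memory
        ep.images = None   # drop rendered frames to free memory
```

The other strategies in the list (quartile worst/best, top-k worst, etc.) would only change the ranking/slicing step, so they could probably share the same nulling logic.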
- How are we going to delineate between an epoch, a step, and a batch? At present, a single iteration through an episode is an epoch, and a single step and a batch are treated as the same thing: a single frame in the environment. How do we plan to separate these?
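
For reference, a toy sketch of the current conflation described above; the environment choice and loop structure are illustrative, not the actual training loop:

```python
import gym

env = gym.make("CartPole-v1")  # stand-in for gym_maze, purely illustrative

state, done = env.reset(), False
epoch_frames = []
while not done:
    action = env.action_space.sample()                 # placeholder policy
    next_state, reward, done, info = env.step(action)  # classic Gym 4-tuple step API
    batch = [(state, action, reward, next_state)]      # a "batch" == one step == one frame
    epoch_frames.append(batch)
    state = next_state
# reaching the end of the episode is what currently counts as one "epoch"
```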