replace @distributed with pmap #54

@OkonSamuel

Description

Currently in MLJ, acceleration with CPUProcesses is implemented using @distributed. This splits the given range (1:nfolds or 1:nmetamodels) into equal chunks up front and sends them to all workers added with addprocs. This works well if each chunk takes about the same time to run; otherwise workers that finish early sit idle while the slower chunks complete. The user also cannot specify which workers are used for the computation. (This might not be a big deal.)
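To make the load-balancing point concrete, here is a small sketch in plain Julia (no Distributed needed). It is only a simplified model: the per-fold costs are made-up numbers, `static_makespan` approximates the equal-chunk splitting described above with contiguous chunks, and `dynamic_makespan` approximates one-task-at-a-time scheduling with a greedy rule. Neither function is part of MLJ or Distributed.

```julia
# Hypothetical per-fold run times (seconds); the first two folds are slow.
fold_costs = [4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# Static scheduling: contiguous, near-equal chunks are fixed before any work starts.
# The total time is that of the slowest chunk.
function static_makespan(costs, nw)
    chunks = Iterators.partition(1:length(costs), cld(length(costs), nw))
    return maximum(sum(costs[c]) for c in chunks)
end

# Dynamic scheduling: each task goes to whichever worker becomes free first.
function dynamic_makespan(costs, nw)
    busy_until = zeros(nw)
    for c in costs
        i = argmin(busy_until)   # earliest-free worker
        busy_until[i] += c
    end
    return maximum(busy_until)
end

println("static:  ", static_makespan(fold_costs, 4))
println("dynamic: ", dynamic_makespan(fold_costs, 4))
```

With these made-up costs and four workers, the static split puts both slow folds in the same chunk, so one worker does most of the work while the other three wait.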
A pmap implementation gives the user more control (if they want it) over how these tasks are sent to the workers, through the batch_size and AbstractWorkerPool options it exposes.
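A rough sketch of the two approaches side by side, assuming two local worker processes. `evaluate_fold` is a hypothetical stand-in for fitting and evaluating one fold, not an MLJ function; `pmap`, `WorkerPool`, `workers`, and the `batch_size` keyword come from the Distributed standard library.

```julia
using Distributed

# Assumption: two local worker processes are enough for illustration.
addprocs(2)

# Hypothetical stand-in for one fold's work; the cost grows with k to make folds uneven.
@everywhere function evaluate_fold(k)
    sleep(0.01 * k)
    return k^2
end

nfolds = 6

# Current approach: the range is chunked up front and the chunks are reduced with vcat.
static_result = @distributed (vcat) for k in 1:nfolds
    evaluate_fold(k)
end

# Proposed approach: the caller picks the workers (here, all of them) via a pool
# and controls how many items are sent to a worker at a time via batch_size.
pool = WorkerPool(workers())
dynamic_result = pmap(evaluate_fold, pool, 1:nfolds; batch_size=1)

println(static_result)
println(dynamic_result)
```

Passing a pool built from a subset of `workers()` would restrict the computation to those processes, which is the extra control mentioned above.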
Previously, the main reason for not adopting pmap was that nested pmap calls hang; see JuliaLang/Distributed.jl#62. (A workaround is described there.)
The only remaining limitation is that calling pmap from within Threads.@spawn sometimes hangs; see JuliaLang/Distributed.jl#69. (I don't think calling pmap from threads is a practical use case, though. Spawning threads from within processes is more common.)
