I've just written a threaded version of a DiskArray-aware unique method:
uniqueT(v::AbstractDiskArray) = uniqueT(identity, v)
function uniqueT(f, v::AbstractDiskArray)
    u = Vector{Vector{eltype(v)}}(undef, length(eachchunk(v)))
    Threads.@threads :greedy for (i, c) in enumerate(eachchunk(v))
        u[i] = unique(f, v[c...])
    end
    reduce(u) do acc, t
        unique!(f, append!(acc, t))
    end
end
For a Zarr array of size (666, 766, 3233) with 6732 chunks, it is 4-5x faster using ~64 threads. The speedup is probably limited by the I/O of the mounted file share.
Am I correct in thinking that, to include such a method in DiskArrays.jl, we'd need to know whether the storage format backing the DiskArray is thread-safe for reads? Maybe via a type parameter?
Has including threaded algorithms been discussed before?
Happy to submit a PR if there is interest.
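To make the thread-safety question concrete, here is a minimal sketch of one alternative to a type parameter: a trait function that a backend opts into. Everything in it is hypothetical and not part of DiskArrays.jl: `ReadSafety`, `ThreadSafeRead`, `SerialRead`, `readsafety`, `uniquechunks`, and the in-memory stand-in `FakeChunked`, whose `chunkranges` takes the place of `eachchunk` so the example runs without any packages. It uses the default `@threads` scheduler rather than `:greedy`, so it should also run on Julia versions before 1.11.

```julia
# Hypothetical trait types: none of these names exist in DiskArrays.jl.
abstract type ReadSafety end
struct ThreadSafeRead <: ReadSafety end
struct SerialRead <: ReadSafety end

# Conservative default: assume concurrent reads are NOT safe.
readsafety(::Type{<:AbstractArray}) = SerialRead()
readsafety(a::AbstractArray) = readsafety(typeof(a))

# In-memory stand-in for a chunked, disk-backed array (illustration only).
struct FakeChunked{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    chunksize::NTuple{N,Int}
end
Base.size(a::FakeChunked) = size(a.data)
Base.getindex(a::FakeChunked{T,N}, i::Vararg{Int,N}) where {T,N} = a.data[i...]

# The backend opts in to threaded reads with a single method.
readsafety(::Type{<:FakeChunked}) = ThreadSafeRead()

# Stand-in for `eachchunk`: a vector of index-range tuples, one per chunk.
function chunkranges(a::FakeChunked)
    axs = map((n, c) -> [i:min(i + c - 1, n) for i in 1:c:n], size(a), a.chunksize)
    return vec(collect(Iterators.product(axs...)))
end

# Entry point: dispatch on the trait.
uniquechunks(f, a) = uniquechunks(readsafety(a), f, a)

# Serial fallback: read one chunk at a time.
function uniquechunks(::SerialRead, f, a)
    acc = eltype(a)[]
    for c in chunkranges(a)
        unique!(f, append!(acc, vec(a[c...])))
    end
    return acc
end

# Threaded path: per-chunk unique in parallel, then merge in chunk order.
function uniquechunks(::ThreadSafeRead, f, a)
    chunks = chunkranges(a)
    parts = Vector{Vector{eltype(a)}}(undef, length(chunks))
    Threads.@threads for i in eachindex(chunks)
        parts[i] = unique(f, a[chunks[i]...])
    end
    return reduce((acc, t) -> unique!(f, append!(acc, t)), parts; init = eltype(a)[])
end

a = FakeChunked(reshape(collect(1:24) .% 5, 4, 6), (2, 3))
result = uniquechunks(identity, a)
```

With a trait like this, backends that say nothing keep the serial behaviour, and the array type itself does not need an extra parameter. Whether a given storage format can really be read concurrently would still have to be established per backend.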