Description
Right now, the IpfsDHT takes two datastores: one that handles PUT_VALUE and GET_VALUE requests, and one just for provider records, wrapped in a ProviderManager.
In an effort to "Remove special cases for Peer and Provider records", I thought there must be a way to unify things and find better abstractions.
One thing I need to mention first: it is not entirely clear to me why the ProviderManager is built around a single loop that selects on 1) additions of provider records, 2) requests for provider records, 3) garbage-collection query results, and 4) the garbage-collection trigger. Tracing the git history shows that this design dates back to 2014. Given a thread-safe datastore, I think it would be more performant and easier to reason about if AddProvider and GetProviders just wrote to and read from the underlying datastore directly. Garbage collection would happen on a schedule in parallel.
However, a garbage-collection run would race with new provider record writes. If I read the current code correctly, this isn't a regression, so I think it could be discussed in a separate issue.
Proposal
We could let the IpfsDHT expose a method to register record datastores:
func (dht *IpfsDHT) RegisterDatastore(dstore mount.Mount) { }
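For illustration, a call site might look roughly like this (mount.Mount is the struct from go-datastore's mount package; providerStore is a hypothetical placeholder for whatever backs that prefix):

```go
// Hypothetical call site for the proposed API. mount.Mount comes from
// github.com/ipfs/go-datastore/mount; providerStore is a placeholder
// for any ds.Datastore implementation.
dht.RegisterDatastore(mount.Mount{
	Prefix:    ds.NewKey("/providers"),
	Datastore: providerStore,
})
```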
In the case of the IPFS DHT, we would register datastores for the pk, ipns, and providers prefixes. The underlying datastores for all of them could be identical but don't have to be.
In NewIpfsDHT, we would construct a single mount.Datastore that all writes and reads go through.
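A rough sketch of how that could be assembled, assuming go-datastore's mount package (pkStore, ipnsStore, and providerStore stand in for whatever was registered):

```go
import (
	ds "github.com/ipfs/go-datastore"
	"github.com/ipfs/go-datastore/mount"
)

// Sketch: combine the registered per-prefix datastores into a single
// mount.Datastore that all reads and writes go through.
// pkStore, ipnsStore, and providerStore are placeholders.
func buildRecordStore(pkStore, ipnsStore, providerStore ds.Datastore) ds.Datastore {
	return mount.New([]mount.Mount{
		{Prefix: ds.NewKey("/pk"), Datastore: pkStore},
		{Prefix: ds.NewKey("/ipns"), Datastore: ipnsStore},
		{Prefix: ds.NewKey("/providers"), Datastore: providerStore},
	})
}
```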
GET_VALUE/PUT_VALUE requests would read from and write to the common mount datastore and, based on the key's namespace prefix, use the correct underlying one. The currently supported namespaces are pk and ipns. With the "mounted" approach, both RPCs would also automatically support provider records if the key in the request had the /providers/ prefix and the binary record field had the correct format. In my head, this is one step toward #584.
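Illustratively, handlers would not need to know which concrete store they are talking to; the key's namespace decides where the operation lands (a toy sketch; dstore, ctx, and the record bytes are placeholders, and the context parameter assumes a recent go-datastore):

```go
// Both calls go through the same mount datastore but end up in
// different underlying stores, selected by the key prefix.
_ = dstore.Put(ctx, ds.NewKey("/pk/<peer-id>"), pkRecordBytes)          // -> pk mount
_ = dstore.Put(ctx, ds.NewKey("/providers/<cid>/<peer-id>"), provBytes) // -> providers mount
```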
ADD_PROVIDER/GET_PROVIDERS requests would prefix the key with /providers/ and read from/write to the common mount datastore as well.
This would allow us to remove the enableProviders and enableValues flags. If there's no datastore for the providers prefix, we wouldn't handle ADD_PROVIDER/GET_PROVIDERS requests. If there's no datastore at all, we wouldn't handle *_VALUE/*_PROVIDERS requests.
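Purely as a sketch of how that gating could look without the flags (it assumes the DHT keeps track of the registered prefixes, e.g. in a hypothetical registeredMounts map):

```go
// Hypothetical replacement for the enableProviders/enableValues flags:
// a handler family is enabled iff a datastore was registered for its prefix.
func (dht *IpfsDHT) supportsPrefix(prefix string) bool {
	_, ok := dht.registeredMounts[prefix] // assumed field: map[string]ds.Datastore
	return ok
}

// e.g. in the RPC dispatcher:
// if !dht.supportsPrefix("/providers") { /* don't handle ADD_PROVIDER/GET_PROVIDERS */ }
```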
However, we currently handle PK/IPNS records and provider records differently:
- When adding PK/IPNS records, we fetch any existing record from disk, validate that existing record (e.g., it might be expired), select either the incoming or the existing one, and only store the incoming one if we deem it "better".
- When adding provider records, we write them to disk and add the peer's addresses to our Peerstore - no validation against existing data.
My idea would be to provide ValueStore and ProviderStore implementations that implement the ds.Datastore interface and can be used as the datastores for the prefix mounts.
The ProviderStore would be similar to the ProviderManager, but would call the underlying datastore directly. E.g., it could also use the LRU cache, would implement the garbage-collection logic, and would hold a reference to the peerstore to store addresses.
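A rough sketch of what such a ProviderStore could look like (names are illustrative, the peerstore interaction is reduced to a comment, the embedded datastore provides the remaining ds.Datastore methods, and it assumes a go-datastore version whose methods take a context):

```go
import (
	"context"

	ds "github.com/ipfs/go-datastore"
	"github.com/libp2p/go-libp2p/core/peerstore" // go-libp2p-core/peerstore on older versions
)

// ProviderStore sketch: it satisfies ds.Datastore by embedding the
// underlying (thread-safe) store and only intercepts Put for the
// provider-specific bookkeeping.
type ProviderStore struct {
	ds.Datastore               // underlying datastore mounted under /providers
	pstore peerstore.Peerstore // used to record the provider's addresses
}

func (ps *ProviderStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	// Real code would decode the provider's peer ID/addresses from the request,
	// call ps.pstore.AddAddrs with the provider-record TTL, update an LRU cache,
	// and track timestamps for garbage collection.
	return ps.Datastore.Put(ctx, key, value)
}
```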
The ValueStore would receive, among other things, a record.Validator and encapsulate the logic from above (fetching the existing record and selecting the "better" one). This means we wouldn't need the namespaced validator on IpfsDHT.
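A similarly rough ValueStore sketch, assuming go-libp2p-record's Validator interface (Validate/Select) and glossing over the fact that the DHT wraps values in a record protobuf and transforms keys before they hit the disk:

```go
import (
	"context"

	ds "github.com/ipfs/go-datastore"
	record "github.com/libp2p/go-libp2p-record"
)

// ValueStore sketch: Put validates the incoming record and only stores it
// if it is "better" than (or there is no) existing record for the key.
type ValueStore struct {
	ds.Datastore               // underlying datastore for this prefix
	validator record.Validator // e.g. a record.NamespacedValidator or the ipns/pk validators
}

func (vs *ValueStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	if err := vs.validator.Validate(key.String(), value); err != nil {
		return err
	}
	existing, err := vs.Datastore.Get(ctx, key)
	if err == ds.ErrNotFound {
		return vs.Datastore.Put(ctx, key, value)
	}
	if err != nil {
		return err
	}
	// Select returns the index of the "best" record: 0 means the incoming
	// value wins, anything else means we keep what we already have.
	best, err := vs.validator.Select(key.String(), [][]byte{value, existing})
	if err != nil {
		return err
	}
	if best != 0 {
		return nil // existing record is better; drop the incoming one
	}
	return vs.Datastore.Put(ctx, key, value)
}
```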
The above structure would allow users of go-libp2p-kad-dht to inject capabilities to support new record types by registering their own datastore for a certain prefix. The GET_VALUE/PUT_VALUE RPCs would then support it.
Concerns
I'm not sure how I feel about moving the logic of how to handle a certain record type into the persistence layer.
cc @aschmahmann