
Consolidate Record Persistence Logic #867

@dennis-tra

Description

Right now, the IpfsDHT takes two datastores: one that handles PUT_VALUE and GET_VALUE requests, and another just for provider records, wrapped in a ProviderManager.

In an effort to "Remove special cases for Peer and Provider records", I thought there must be a way to unify things and find better abstractions.

One thing I need to mention first: it is not entirely clear to me why the ProviderManager is built around a single loop that selects on 1) additions of provider records, 2) requests for provider records, 3) garbage-collection query results, and 4) the garbage-collection trigger. Tracing the git history shows that this design dates back to 2014. Given a thread-safe datastore, I think it would be more performant and easier to reason about if AddProvider and GetProviders just wrote to and read from the underlying datastore directly, with garbage collection happening on a schedule in parallel.
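
For illustration, a rough sketch of that design, assuming a thread-safe underlying datastore (runProviderGC and collectExpired are made-up names):

```go
import (
	"context"
	"time"
)

// runProviderGC runs garbage collection on a fixed schedule, in parallel with
// the regular reads and writes that now hit the datastore directly.
func runProviderGC(ctx context.Context, interval time.Duration, collectExpired func(context.Context)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			collectExpired(ctx) // hypothetical: deletes provider records past their TTL
		}
	}
}
```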

However, a garbage-collection run could race with new provider-record writes. If I read the current code correctly, this isn't a regression. I think this could be discussed in a separate issue.


Proposal

We could let the IpfsDHT expose a method to register record datastores:

func (dht *IpfsDHT) RegisterDatastore(dstore mount.Mount) { }

In the case of the IPFS DHT, we would register datastores for the pk, ipns, and providers prefixes. The underlying datastores for all of them could be identical, but they don't have to be.

In NewIpfsDHT, we would construct a single mount.Datastore that all writes and reads go through.
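
A minimal sketch of that construction, using the mount package from github.com/ipfs/go-datastore (newRecordStore is a made-up name; using one shared backing store is just one possible choice):

```go
import (
	ds "github.com/ipfs/go-datastore"
	"github.com/ipfs/go-datastore/mount"
	dssync "github.com/ipfs/go-datastore/sync"
)

// newRecordStore sketches what NewIpfsDHT could construct internally.
func newRecordStore() ds.Datastore {
	// One thread-safe backing store for all namespaces; per-prefix stores
	// would work just as well.
	backing := dssync.MutexWrap(ds.NewMapDatastore())

	return mount.New([]mount.Mount{
		{Prefix: ds.NewKey("/pk"), Datastore: backing},
		{Prefix: ds.NewKey("/ipns"), Datastore: backing},
		{Prefix: ds.NewKey("/providers"), Datastore: backing},
	})
}
```

A Put under an /ipns/... key would then be routed to the datastore mounted at the /ipns prefix automatically, and likewise for the other namespaces.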

GET_VALUE/PUT_VALUE requests would read from and write to the common mount datastore, which picks the correct underlying store based on the key's namespace prefix. The currently supported namespaces are pk and ipns. With the "mounted" approach, both RPCs would automatically start supporting provider records as well, as long as the key in the request had the /providers/ prefix and the binary record field had the correct format. In my head, this is one step toward #584.

ADD_PROVIDER/GET_PROVIDERS requests would prefix the key with /providers/ and likewise go through the common mount datastore.
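
As a hedged sketch, the key construction could look like this (providerKey is a made-up helper; today's ProviderManager base32-encodes the multihash, and the real code would have to match that encoding):

```go
import (
	"encoding/base32"

	ds "github.com/ipfs/go-datastore"
	mh "github.com/multiformats/go-multihash"
)

// providerKey maps a multihash to a key under the /providers mount.
func providerKey(h mh.Multihash) ds.Key {
	return ds.NewKey("/providers").ChildString(base32.RawStdEncoding.EncodeToString(h))
}
```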

This would allow us to remove the enableProviders and enableValues flags. If there's no datastore registered for the providers prefix, we wouldn't handle ADD_PROVIDER/GET_PROVIDERS requests. If there's no datastore at all, we wouldn't handle *_VALUE/*_PROVIDERS requests.
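
A sketch of how the flags could fall away, assuming the DHT keeps track of which prefixes have a mount (hasMount and the handlers map are hypothetical; today the dispatch lives in handlerForMsgType):

```go
// Only wire up the provider handlers when a datastore is mounted at
// /providers; with no mounts at all, no record RPC gets a handler.
if dht.hasMount(ds.NewKey("/providers")) {
	dht.handlers[pb.Message_ADD_PROVIDER] = dht.handleAddProvider
	dht.handlers[pb.Message_GET_PROVIDERS] = dht.handleGetProviders
}
```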


However, we're currently handling PK/IPNS and provider records differently:

When adding PK/IPNS records, we fetch any existing record from disk, validate that existing record (e.g. it might be expired), select the incoming or existing one, and potentially store the incoming one (if we deem it "better").

When adding provider records, we write them to disk and add the peer's addresses to our Peerstore, with no validation of existing data.

My idea would be to provide ValueStore and ProviderStore implementations of the ds.Datastore interface, which could then be used as the datastores behind the prefix mounts.

The ProviderStore would be similar to the ProviderManager but would call the underlying datastore directly. E.g., it could keep the LRU cache, implement the garbage-collection logic, and hold a reference to the peerstore for storing addresses.
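
A skeletal sketch, with all names illustrative and the record decoding elided (decodeProvider is a made-up helper; garbage collection would run on the schedule sketched earlier):

```go
import (
	"context"

	lru "github.com/hashicorp/golang-lru"
	ds "github.com/ipfs/go-datastore"
	"github.com/libp2p/go-libp2p-core/peerstore"
)

// ProviderStore implements ds.Datastore; only Put is sketched here, the
// remaining methods (Get, Delete, Query, ...) would delegate similarly.
type ProviderStore struct {
	dstore ds.Datastore        // provider records, keyed under /providers/
	cache  *lru.Cache          // recently used provider sets, as in ProviderManager
	pstore peerstore.Peerstore // provider addresses live here, not in the records
}

func (ps *ProviderStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	// decodeProvider (hypothetical) extracts the provider's peer.AddrInfo
	// from the binary record field.
	info, err := decodeProvider(value)
	if err != nil {
		return err
	}
	ps.pstore.AddAddrs(info.ID, info.Addrs, peerstore.ProviderAddrTTL)
	ps.cache.Remove(key.String())
	return ps.dstore.Put(ctx, key, value)
}
```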

The ValueStore would receive, among other things, a record.Validator and encapsulate the logic from above (fetching the existing record, selecting the "better" one). This means we wouldn't need the namespaced validator on IpfsDHT.
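
A sketch of that Put path (the encoding differences between record keys and datastore keys are glossed over):

```go
import (
	"context"
	"errors"

	ds "github.com/ipfs/go-datastore"
	record "github.com/libp2p/go-libp2p-record"
)

// ValueStore implements ds.Datastore and folds the "validate existing,
// select better" logic into Put.
type ValueStore struct {
	dstore    ds.Datastore
	validator record.Validator
}

func (vs *ValueStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	if err := vs.validator.Validate(key.String(), value); err != nil {
		return err
	}
	existing, err := vs.dstore.Get(ctx, key)
	if errors.Is(err, ds.ErrNotFound) {
		return vs.dstore.Put(ctx, key, value)
	}
	if err != nil {
		return err
	}
	// An invalid (e.g. expired) existing record is treated as absent.
	if vs.validator.Validate(key.String(), existing) != nil {
		return vs.dstore.Put(ctx, key, value)
	}
	// Keep whichever record the validator selects as "better".
	i, err := vs.validator.Select(key.String(), [][]byte{value, existing})
	if err != nil {
		return err
	}
	if i == 0 {
		return vs.dstore.Put(ctx, key, value)
	}
	return nil // the existing record wins; drop the incoming one
}
```

Since the mount already dispatches by prefix, each mounted ValueStore could be handed the validator for exactly its namespace, which is what would make the namespaced validator on IpfsDHT unnecessary.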


The above structure would allow users of go-libp2p-kad-dht to add support for new record types by registering their own datastore for a certain prefix. The GET_VALUE/PUT_VALUE RPCs would then handle those records automatically.
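
For example (everything here is hypothetical: the /myrec namespace, NewValueStore, the datastore, and the validator):

```go
dht.RegisterDatastore(mount.Mount{
	Prefix:    ds.NewKey("/myrec"),
	Datastore: NewValueStore(myDstore, myValidator),
})
```

GET_VALUE/PUT_VALUE requests for /myrec/* keys would then be served without any changes to the DHT core.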


Concerns

I'm not sure how I feel about moving the logic of how to handle a certain record type into the persistence layer.

cc @aschmahmann
