
Consolidate Record Persistence Logic #867

@dennis-tra

Description

Right now, the IpfsDHT takes two datastores: one that handles PUT_VALUE and GET_VALUE requests, and another just for provider records, wrapped in a ProviderManager.

In an effort to "Remove special cases for Peer and Provider records", I thought there must be a way to unify things and find better abstractions.

One thing I need to mention first: it is not entirely clear to me why the ProviderManager is built around a single loop that selects on 1) additions of provider records, 2) requests for provider records, 3) garbage-collection query results, and 4) the garbage-collection trigger. Tracing the git history shows that this design dates back to 2014. Given a thread-safe datastore, I think it would be more performant and easier to reason about if AddProvider and GetProviders just wrote to and read from the underlying datastore directly, with garbage collection happening on a schedule in parallel.
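
For illustration, a rough sketch of that design, assuming a thread-safe underlying datastore (runProviderGC and collectExpired are made-up names):

```go
import (
	"context"
	"time"
)

// runProviderGC runs garbage collection on a fixed schedule, in parallel with
// the regular reads and writes that now hit the datastore directly.
func runProviderGC(ctx context.Context, interval time.Duration, collectExpired func(context.Context)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			collectExpired(ctx) // hypothetical: deletes provider records past their TTL
		}
	}
}
```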

However, a garbage-collection run could race with new provider-record writes. If I read the current code correctly, this isn't a regression. I think this could be discussed in a separate issue.


Proposal

We could let the IpfsDHT expose a method to register record datastores:

func (dht *IpfsDHT) RegisterDatastore(dstore mount.Mount) { }

In the case of the IPFS DHT, we would register datastores for the pk, ipns, and providers prefixes. The underlying datastores for all of them could be identical, but they don't have to be.

In NewIpfsDHT, we would construct a single mount.Datastore that all writes and reads go through.
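
A minimal sketch of that construction, using the mount package from github.com/ipfs/go-datastore (newRecordStore is a made-up name; using one shared backing store is just one possible choice):

```go
import (
	ds "github.com/ipfs/go-datastore"
	"github.com/ipfs/go-datastore/mount"
	dssync "github.com/ipfs/go-datastore/sync"
)

// newRecordStore sketches what NewIpfsDHT could construct internally.
func newRecordStore() ds.Datastore {
	// One thread-safe backing store for all namespaces; per-prefix stores
	// would work just as well.
	backing := dssync.MutexWrap(ds.NewMapDatastore())

	return mount.New([]mount.Mount{
		{Prefix: ds.NewKey("/pk"), Datastore: backing},
		{Prefix: ds.NewKey("/ipns"), Datastore: backing},
		{Prefix: ds.NewKey("/providers"), Datastore: backing},
	})
}
```

A Put under an /ipns/... key would then be routed to the datastore mounted at the /ipns prefix automatically, and likewise for the other namespaces.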

GET_VALUE/PUT_VALUE requests would read from and write to the common mount datastore, which picks the correct underlying store based on the key's namespace prefix. The currently supported namespaces are pk and ipns. With the "mounted" approach, both RPCs would automatically start supporting provider records as well, as long as the key in the request had the /providers/ prefix and the binary record field had the correct format. In my head, this is one step toward #584.

ADD_PROVIDER/GET_PROVIDERS requests would prefix the key with /providers/ and likewise go through the common mount datastore.
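
As a hedged sketch, the key construction could look like this (providerKey is a made-up helper; today's ProviderManager base32-encodes the multihash, and the real code would have to match that encoding):

```go
import (
	"encoding/base32"

	ds "github.com/ipfs/go-datastore"
	mh "github.com/multiformats/go-multihash"
)

// providerKey maps a multihash to a key under the /providers mount.
func providerKey(h mh.Multihash) ds.Key {
	return ds.NewKey("/providers").ChildString(base32.RawStdEncoding.EncodeToString(h))
}
```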

This would allow us to remove the enableProviders and enableValues flags. If there's no datastore registered for the providers prefix, we wouldn't handle ADD_PROVIDER/GET_PROVIDERS requests. If there's no datastore at all, we wouldn't handle *_VALUE/*_PROVIDERS requests.
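
A sketch of how the flags could fall away, assuming the DHT keeps track of which prefixes have a mount (hasMount and the handlers map are hypothetical; today the dispatch lives in handlerForMsgType):

```go
// Only wire up the provider handlers when a datastore is mounted at
// /providers; with no mounts at all, no record RPC gets a handler.
if dht.hasMount(ds.NewKey("/providers")) {
	dht.handlers[pb.Message_ADD_PROVIDER] = dht.handleAddProvider
	dht.handlers[pb.Message_GET_PROVIDERS] = dht.handleGetProviders
}
```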


However, we're currently handling PK/IPNS and provider records differently:

When adding PK/IPNS records, we fetch any existing record from disk, validate that existing record (e.g. it might be expired), select the incoming or existing one, and potentially store the incoming one (if we deem it "better").

When adding provider records, we write them to disk and add the peer's addresses to our Peerstore, with no validation of existing data.

My idea would be to provide ValueStore and ProviderStore implementations of the ds.Datastore interface, which could then be used as the datastores behind the prefix mounts.

The ProviderStore would be similar to the ProviderManager but would call the underlying datastore directly. E.g., it could keep the LRU cache, implement the garbage-collection logic, and hold a reference to the peerstore for storing addresses.
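
A skeletal sketch, with all names illustrative and the record decoding elided (decodeProvider is a made-up helper; garbage collection would run on the schedule sketched earlier):

```go
import (
	"context"

	lru "github.com/hashicorp/golang-lru"
	ds "github.com/ipfs/go-datastore"
	"github.com/libp2p/go-libp2p-core/peerstore"
)

// ProviderStore implements ds.Datastore; only Put is sketched here, the
// remaining methods (Get, Delete, Query, ...) would delegate similarly.
type ProviderStore struct {
	dstore ds.Datastore        // provider records, keyed under /providers/
	cache  *lru.Cache          // recently used provider sets, as in ProviderManager
	pstore peerstore.Peerstore // provider addresses live here, not in the records
}

func (ps *ProviderStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	// decodeProvider (hypothetical) extracts the provider's peer.AddrInfo
	// from the binary record field.
	info, err := decodeProvider(value)
	if err != nil {
		return err
	}
	ps.pstore.AddAddrs(info.ID, info.Addrs, peerstore.ProviderAddrTTL)
	ps.cache.Remove(key.String())
	return ps.dstore.Put(ctx, key, value)
}
```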

The ValueStore would receive, among other things, a record.Validator and encapsulate the logic from above (fetching the existing record, selecting the "better" one). This means we wouldn't need the namespaced validator on IpfsDHT.
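
A sketch of that Put path (the encoding differences between record keys and datastore keys are glossed over):

```go
import (
	"context"
	"errors"

	ds "github.com/ipfs/go-datastore"
	record "github.com/libp2p/go-libp2p-record"
)

// ValueStore implements ds.Datastore and folds the "validate existing,
// select better" logic into Put.
type ValueStore struct {
	dstore    ds.Datastore
	validator record.Validator
}

func (vs *ValueStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	if err := vs.validator.Validate(key.String(), value); err != nil {
		return err
	}
	existing, err := vs.dstore.Get(ctx, key)
	if errors.Is(err, ds.ErrNotFound) {
		return vs.dstore.Put(ctx, key, value)
	}
	if err != nil {
		return err
	}
	// An invalid (e.g. expired) existing record is treated as absent.
	if vs.validator.Validate(key.String(), existing) != nil {
		return vs.dstore.Put(ctx, key, value)
	}
	// Keep whichever record the validator selects as "better".
	i, err := vs.validator.Select(key.String(), [][]byte{value, existing})
	if err != nil {
		return err
	}
	if i == 0 {
		return vs.dstore.Put(ctx, key, value)
	}
	return nil // the existing record wins; drop the incoming one
}
```

Since the mount already dispatches by prefix, each mounted ValueStore could be handed the validator for exactly its namespace, which is what would make the namespaced validator on IpfsDHT unnecessary.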


The above structure would allow users of go-libp2p-kad-dht to add support for new record types by registering their own datastore for a certain prefix. The GET_VALUE/PUT_VALUE RPCs would then handle those records automatically.
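
For example (everything here is hypothetical: the /myrec namespace, NewValueStore, the datastore, and the validator):

```go
dht.RegisterDatastore(mount.Mount{
	Prefix:    ds.NewKey("/myrec"),
	Datastore: NewValueStore(myDstore, myValidator),
})
```

GET_VALUE/PUT_VALUE requests for /myrec/* keys would then be served without any changes to the DHT core.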


Concerns

I'm not sure how I feel about moving the logic of how to handle a certain record type into the persistence layer.

cc @aschmahmann
