Skip to content

Another solution to livelock #14

@drmingdrmer

Description

@drmingdrmer

Update: 2020-03-07:

This approach works only with a set of instances in which every two of them interfere with each other. See @snaury 's comments below: #14 (comment)

The livelock problem

Epaxos specifies that in an SCC, the lowest seq commands should be executed
first.

And if there is a stream of interfering commands, execution thread needs to wait until walking through the entire SCC, before it could execute any command.
This is the livelock.

The solution provided by epaxos is to prioritize completing old commands over
proposing new commands
.

This solution might bring in some latency because we need to stop
proposing a new command to break the command chain.

I've been thinking maybe there is some other way to solve the livelock problem.

We assume that all commands mentioned below interfere with each other.

Ordering

Assume we determine the command order in an SCC by a tuple (seq, replicaID).
Then the first command to execute is the one with the least (seq, replicaID)
in an SCC.

Find the first command safe to execute

Provided with a committed, un-executed, interfering command list
CMDs = {Ra.A, Rb.B, Rc.C ..}(Ra.A is a command owned by replica Ra).

Assume that Ra.A has the least (seq, replicaID).

Define a function replicas(cmds): to extract a list of command owner replica,
e.g.: replicas({Ra.A, Rb.B}) = {Ra, Rb}.

Define allDeps, the union of all dependency of CMDs: allDeps = Ra.A.deps U Rb.B.deps U Rc.C.deps...

If CMDs contains only the least seq commands(with two un-executed interfering command Ra.X and Ra.Y:
if Ra.X.seq < Ra.Y.seq, and Ra.Y is in CMDs, Ra.X.seq must be in CMDs):

  1. For an un-executed command X not in CMDs and owned by one of replicas(CMDs), we have:
    X.seq > Ra.A.seq.

    Because X.seq is greater than the seq of its preceding command on a same
    replica.

    Thus X should never be executed before Ra.A.

  2. For an un-executed command Y not in CMDs and NOT owned by any of replicas(CMDs),
    if replicas(allDeps) ⊆ replicas(CMDs),
    then Y depends on CMDs.

    Because two committed interfering command U and V have at least one dependency
    relationship.

    Thus Y should never be executed before Ra.A.

With (1) and (2), we have the conclusion
that Ra.A is the command that should be executed first.

The abstract algorithm

Initialize an empty CMDs.
Add the next least (seq, replicaID) command and its deps into CMDs.
Repeat this step until we find a CMDs so that replicas(allDeps) ⊆ replicas(CMDs), then
execute the least command in CMDs.

Thus we could execute command even before the entire SCC committed.

Because the number of replicas is finite and small, this process would finish quickly,
even when there are a lot of interfering commands.

In this way searching for an SCC is no more required and there won't be a livelock problem anymore.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions