-
Notifications
You must be signed in to change notification settings - Fork 137
Description
Update: 2020-03-07:
This approach works only with a set of instances in which every two of them interfere with each other. See @snaury 's comments below: #14 (comment)
The livelock problem
Epaxos specifies that in an SCC, the lowest seq commands should be executed
first.
And if there is a stream of interfering commands, execution thread needs to wait until walking through the entire SCC, before it could execute any command.
This is the livelock.
The solution provided by epaxos is to prioritize completing old commands over
proposing new commands.
This solution might bring in some latency because we need to stop
proposing a new command to break the command chain.
I've been thinking maybe there is some other way to solve the livelock problem.
We assume that all commands mentioned below interfere with each other.
Ordering
Assume we determine the command order in an SCC by a tuple (seq, replicaID).
Then the first command to execute is the one with the least (seq, replicaID)
in an SCC.
Find the first command safe to execute
Provided with a committed, un-executed, interfering command list
CMDs = {Ra.A, Rb.B, Rc.C ..}(Ra.A is a command owned by replica Ra).
Assume that Ra.A has the least (seq, replicaID).
Define a function replicas(cmds): to extract a list of command owner replica,
e.g.: replicas({Ra.A, Rb.B}) = {Ra, Rb}.
Define allDeps, the union of all dependency of CMDs: allDeps = Ra.A.deps U Rb.B.deps U Rc.C.deps...
If CMDs contains only the least seq commands(with two un-executed interfering command Ra.X and Ra.Y:
if Ra.X.seq < Ra.Y.seq, and Ra.Y is in CMDs, Ra.X.seq must be in CMDs):
-
For an un-executed command X not in
CMDsand owned by one ofreplicas(CMDs), we have:
X.seq > Ra.A.seq.Because
X.seqis greater than theseqof its preceding command on a same
replica.Thus X should never be executed before
Ra.A. -
For an un-executed command Y not in
CMDsand NOT owned by any ofreplicas(CMDs),
ifreplicas(allDeps) ⊆ replicas(CMDs),
then Y depends onCMDs.Because two committed interfering command U and V have at least one dependency
relationship.Thus Y should never be executed before
Ra.A.
With (1) and (2), we have the conclusion
that Ra.A is the command that should be executed first.
The abstract algorithm
Initialize an empty CMDs.
Add the next least (seq, replicaID) command and its deps into CMDs.
Repeat this step until we find a CMDs so that replicas(allDeps) ⊆ replicas(CMDs), then
execute the least command in CMDs.
Thus we could execute command even before the entire SCC committed.
Because the number of replicas is finite and small, this process would finish quickly,
even when there are a lot of interfering commands.
In this way searching for an SCC is no more required and there won't be a livelock problem anymore.