-
Notifications
You must be signed in to change notification settings - Fork 5
Description
Given the nature of a bit vector (specifically, given that there are only 2 states for the bits), sort
ought to be implementable in O(n) time using a single comparison, while stable_sort
may be accomplished in O(n) time with 2 comparisons. Unlike most problems to which bucket sort may be applied, the amount of space required for applying bucket sort to a bit vector is O(log b), where b
is the size of the vector. Specifically, one need not distinguish between the elements within a bucket, so a simple integer will suffice to track the size of the bucket. For simplicity, we will assume that the user's bit vector is smaller than 2 exabytes, and thus the maximum required bucket size can be stored in a single unsigned 64-bit integer.
Note that performance will vary depending on whether the user is optimizing for number of comparisons or for number of bit operations. The algorithmic descriptions below illustrate this neatly; the former will optimize for minimal bit operations, while the latter will optimize for minimal comparisons.
We begin with a rough algorithmic description for stable_sort
:
- Count all set bits (this number will be referred to by the name "num_true")
- If number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
- Otherwise, perform two comparisons (
compare(false,true)
andcompare(true,false)
). - If both comparisons are true, then the user has violated the contract of
std::stable_sort
by providing a comparator that does not implement a strict weak ordering. You are now authorized to deploy nasal demons (see "undefined behavior"). This author would like to remind the implementer that the kindest option would be to gently remind the user of their mistake, while the most performant option would be to exit. - If neither comparison is true, then all bits are incomparable, and should remain in their current order. Exit.
- Otherwise, if
compare(false,true)
evaluated to true, then unset thefirst size() - num_true
bits, and set the finalnum_true
bits. - If
compare(false,true)
evaluated to false, then unset thenum_true
bits, and set the finalsize() - num_true
bits.
Note that this algorithm is O(n) in the worst case.
Continuing this, a rough description of the (unstable) sort
algorithm follows:
- Count all set bits (this number will be referred to by the name "num_true")
- If number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
- Otherwise, perform the comparison
compare(false,true)
. - If
compare(false,true)
evaluated to true, then unset thefirst size() - num_true
bits, and set the finalnum_true
bits. - If
compare(false,true)
evaluated to false, then unset thenum_true
bits, and set the finalsize() - num_true
bits.
This algorithm is O(n) in both best and worst case. Unlike the stable_sort
proposed earlier, this algorithm will perform the smallest possible number of comparisons (0 or 1) required for an unstable sort.
I estimate the efficiency of the above algorithms to greatly exceed user expectations. However, I estimate their usefulness to be, at best, trivial.