Add bit::sort and bit::stable_sort

Given the nature of a bit vector (specifically, given that each bit has only 2 states), `sort` ought to be implementable in O(n) time using a single comparison, while `stable_sort` may be accomplished in O(n) time with 2 comparisons. Unlike most problems to which [bucket sort](https://en.wikipedia.org/wiki/Bucket_sort) may be applied, the amount of extra space required for applying bucket sort to a bit vector is only O(log n), where `n` is the number of bits in the vector. Specifically, one need not distinguish between the elements within a bucket, so a simple integer will suffice to track the size of each bucket (and since the two bucket sizes sum to `n`, one counter is enough). For simplicity, we will assume that the user's bit vector is smaller than 2 exabytes (2^64 bits), and thus the maximum required bucket size can be stored in a single unsigned 64-bit integer.
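To make the single-counter bookkeeping concrete, here is a minimal sketch of counting the set bits into one unsigned 64-bit integer. The helper names (`popcount64`, `count_set_bits`) and the storage layout (bit `i` stored in `words[i / 64]` at position `i % 64`) are assumptions for illustration, not this library's actual API or representation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: portable population count (Kernighan's method).
inline std::uint64_t popcount64(std::uint64_t word) {
    std::uint64_t count = 0;
    while (word != 0) {
        word &= word - 1;  // clear the lowest set bit
        ++count;
    }
    return count;
}

// Hypothetical helper: number of set bits among the first num_bits bits.
// Assumes words holds at least ceil(num_bits / 64) elements.
inline std::uint64_t count_set_bits(const std::vector<std::uint64_t>& words,
                                    std::uint64_t num_bits) {
    std::uint64_t num_true = 0;
    const std::size_t full_words = static_cast<std::size_t>(num_bits / 64);
    for (std::size_t i = 0; i < full_words; ++i) {
        num_true += popcount64(words[i]);
    }
    const std::uint64_t tail_bits = num_bits % 64;
    if (tail_bits != 0) {
        // Ignore any bits stored past the logical end of the vector.
        const std::uint64_t mask = (std::uint64_t{1} << tail_bits) - 1;
        num_true += popcount64(words[full_words] & mask);
    }
    return num_true;
}
```

The size of the "false" bucket never needs its own counter: it is `num_bits - num_true`.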

Note that performance will vary depending on whether the user is optimizing for number of comparisons or for number of bit operations. The algorithmic descriptions below illustrate this neatly; the former (`stable_sort`) will optimize for minimal bit operations, while the latter (`sort`) will optimize for minimal comparisons.

We begin with a rough algorithmic description for `stable_sort`:

1. Count all set bits (this number will be referred to by the name "num_true")
2. If the number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
3. Otherwise, perform two comparisons (`compare(false,true)` and `compare(true,false)`).
4. If both comparisons are true, then the user has violated the contract of `std::stable_sort` by providing a comparator that does not implement a strict weak ordering. [You are now authorized to deploy nasal demons (see "undefined behavior").](https://en.wikipedia.org/wiki/Undefined_behavior) This author would like to remind the implementer that the kindest option would be to [gently remind the user of their mistake](https://en.cppreference.com/w/cpp/error/assert), while the most performant option would be to exit.
5. If neither comparison is true, then all bits are incomparable, and should remain in their current order. Exit.
6. Otherwise, if `compare(false,true)` evaluated to true, then unset the first `size() - num_true` bits, and set the final `num_true` bits.
7. Otherwise (only `compare(true,false)` evaluated to true), set the first `num_true` bits, and unset the final `size() - num_true` bits.
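The steps above can be sketched roughly as follows. This is only an illustration: it uses `std::vector<bool>` as a stand-in for the library's bit container, and the name `stable_sort_bits` is hypothetical. Step 4 takes the "kind" option and asserts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for bit::stable_sort over std::vector<bool>.
template <class Compare>
void stable_sort_bits(std::vector<bool>& bits, Compare compare) {
    const std::size_t size = bits.size();
    // Step 1: count the set bits.
    const std::size_t num_true = static_cast<std::size_t>(
        std::count(bits.begin(), bits.end(), true));
    // Step 2: a uniform range is already sorted.
    if (num_true == 0 || num_true == size) return;
    // Step 3: the only two comparisons that are ever needed.
    const bool false_first = compare(false, true);
    const bool true_first = compare(true, false);
    // Step 4: both true means the comparator is not a strict weak ordering.
    assert(!(false_first && true_first) && "comparator is not a strict weak ordering");
    // Step 5: all bits are equivalent, so stability demands we touch nothing.
    if (!false_first && !true_first) return;
    // Steps 6 and 7: rewrite the range as two uniform runs.
    const bool leading = !false_first;  // value of the first run
    const std::size_t num_leading = false_first ? size - num_true : num_true;
    const auto mid = bits.begin() + static_cast<std::ptrdiff_t>(num_leading);
    std::fill(bits.begin(), mid, leading);
    std::fill(mid, bits.end(), !leading);
}
```

For example, calling it with `std::less<bool>{}` moves all unset bits to the front, while `std::greater<bool>{}` moves all set bits to the front.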

Note that this algorithm is O(n) in the worst case.

Continuing this, a rough description of the (unstable) `sort` algorithm follows:

1. Count all set bits (this number will be referred to by the name "num_true")
2. If the number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
3. Otherwise, perform the comparison `compare(false,true)`.
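4. If `compare(false,true)` evaluated to true, then unset the first `size() - num_true` bits, and set the final `num_true` bits.
5. If `compare(false,true)` evaluated to false, then set the first `num_true` bits, and unset the final `size() - num_true` bits. (If the comparator actually treats both values as equivalent, every ordering is sorted, so this result is still valid for an unstable sort.)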
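A rough sketch of the unstable variant, under the same assumptions as before: `std::vector<bool>` stands in for the library's bit container, and the name `sort_bits` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for bit::sort over std::vector<bool>.
template <class Compare>
void sort_bits(std::vector<bool>& bits, Compare compare) {
    const std::size_t size = bits.size();
    // Step 1: count the set bits.
    const std::size_t num_true = static_cast<std::size_t>(
        std::count(bits.begin(), bits.end(), true));
    // Step 2: a uniform range is already sorted (zero comparisons).
    if (num_true == 0 || num_true == size) return;
    // Step 3: the single comparison.
    const bool false_first = compare(false, true);
    // Steps 4 and 5: rewrite the range as two uniform runs.
    // If false_first is false we put the set bits first; this is also
    // acceptable when both values are equivalent, since stability is not required.
    const bool leading = !false_first;
    const std::size_t num_leading = false_first ? size - num_true : num_true;
    const auto mid = bits.begin() + static_cast<std::ptrdiff_t>(num_leading);
    std::fill(bits.begin(), mid, leading);
    std::fill(mid, bits.end(), !leading);
}
```

Note the trade-off against the stable sketch: with a comparator that treats both values as equivalent, this version still rewrites the whole range, which is the price of skipping the second comparison.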

This algorithm is O(n) in both best and worst case. Unlike the `stable_sort` proposed earlier, this algorithm will perform the smallest possible number of comparisons (0 or 1) required for an unstable sort.


I estimate the efficiency of the above algorithms to greatly exceed user expectations. However, I estimate their usefulness to be, at best, trivial.
