Borrow<str|[u8]> invariants are violated by Utf8Char Hash implementation #13

@ultrabear

The documentation for Borrow states that a type's Hash implementation must agree with that of the type it borrows as:

In particular Eq, Ord and Hash must be equivalent for borrowed and owned values: x.borrow() == y.borrow() should give the same result as x == y.

Utf8Char's Hash implementation violates this: Utf8Char implements Borrow<str>, but it hashes as a char rather than as a str:
https://github.com/tormol/encode_unicode/blob/master/src/utf8_char.rs#L244-L248

impl hash::Hash for Utf8Char {
    fn hash<H : hash::Hasher>(&self,  state: &mut H) {
        self.to_char().hash(state);
    }
}

Unfortunately, Utf8Char also implements Borrow<[u8]>, and [u8] hashes differently from str:

// core/hash/mod.rs#938
impl<T: Hash> Hash for [T] {
    #[inline]
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_length_prefix(self.len());
        Hash::hash_slice(self, state)
    }
}

// ^ not equivalent V

// core/hash/mod.rs#864
impl Hash for str {
    #[inline]
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_str(self);
    }
}

// core/hash/mod.rs#551
trait Hasher {
    /* ... */

    fn write_str(&mut self, s: &str) {
        self.write(s.as_bytes());
        self.write_u8(0xff);
    }
}
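The mismatch can be observed without encode_unicode at all. The sketch below is illustrative only: RecordingHasher and hashed_bytes are hypothetical helpers (not part of std or encode_unicode) that capture the raw bytes each Hash impl feeds to the hasher. The exact byte sequences are std implementation details, so the only thing to rely on is that the three differ.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical helper: a Hasher that records every byte written to it
/// instead of mixing them into a digest.
#[derive(Default)]
struct RecordingHasher {
    bytes: Vec<u8>,
}

impl Hasher for RecordingHasher {
    fn write(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    fn finish(&self) -> u64 {
        0 // unused: only the recorded bytes are inspected
    }
}

/// Hypothetical helper: returns the byte stream that `value`'s Hash impl
/// produces.
fn hashed_bytes<T: Hash + ?Sized>(value: &T) -> Vec<u8> {
    let mut hasher = RecordingHasher::default();
    value.hash(&mut hasher);
    hasher.bytes
}
```

Comparing hashed_bytes(&'a'), hashed_bytes("a") and hashed_bytes(&b"a"[..]) should show three different byte streams for the same one-character value, which is why a HashMap keyed by Utf8Char cannot reliably be queried with a &str or &[u8].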

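Since no single Hash impl can agree with both str and [u8], at least one of the Borrow impls has to be removed, so there is no way to patch this bug in a semver-compatible way.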

An ideal fix is probably to make the Hash impl match str (which is also more efficient, since it avoids the cost of converting to char) and to remove the Borrow<[u8]> impl.
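A minimal sketch of what the suggested fix could look like, using a hypothetical stand-in type rather than the real Utf8Char (MiniUtf8Char, its fields, and found_by_str are assumptions made for illustration; the crate's actual layout may differ). Hash delegates to str, and only Borrow<str> is implemented, so the Borrow contract holds:

```rust
use std::borrow::Borrow;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for Utf8Char: one code point stored as UTF-8.
/// Unused trailing bytes stay zero, so the derived Eq agrees with str's Eq.
#[derive(Clone, Copy, PartialEq, Eq)]
struct MiniUtf8Char {
    bytes: [u8; 4],
    len: u8,
}

impl MiniUtf8Char {
    fn new(c: char) -> Self {
        let mut bytes = [0u8; 4];
        let len = c.encode_utf8(&mut bytes).len() as u8;
        Self { bytes, len }
    }

    fn as_str(&self) -> &str {
        // Always valid: the bytes were produced by char::encode_utf8.
        std::str::from_utf8(&self.bytes[..self.len as usize]).unwrap()
    }
}

// Hash exactly like the borrowed str, as the Borrow docs require.
impl Hash for MiniUtf8Char {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.as_str().hash(state);
    }
}

// Only Borrow<str> is provided; a Borrow<[u8]> impl would need a
// different, incompatible Hash.
impl Borrow<str> for MiniUtf8Char {
    fn borrow(&self) -> &str {
        self.as_str()
    }
}

/// Inserts `c` into a set and looks it up again through a plain &str.
fn found_by_str(c: char) -> bool {
    let mut set = HashSet::new();
    set.insert(MiniUtf8Char::new(c));
    let mut buf = [0u8; 4];
    set.contains(&*c.encode_utf8(&mut buf))
}
```

With Hash and Eq both matching str, looking up a stored value by &str in a HashSet or HashMap finds it, which is the behaviour the Borrow<str> impl promises.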
