First of all, thanks a lot for providing this project! It makes it so much easier to work with UTF-8 data.
I'm aware that this might be out of scope for this project, so I figured I'd just ask what you all think. When porting my code from std::string to tiny_utf8::string, I ran into several problems caused by the mismatch between size() and length().
E.g., my code uses templates with std::size(...) to work on arbitrary data types. This doesn't work with tiny_utf8::string, though, since size() returns the raw byte size while operator[] expects a codepoint index (see the sketch below). It would be nice if tiny_utf8 were consistent with the other STL containers here.
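To illustrate, here's roughly the pattern that breaks (a minimal sketch; the string literal, helper name, and include path are just for the example):

```cpp
#include <cstddef>
#include <iterator>
#include <tinyutf8/tinyutf8.h>  // adjust the include path to your setup

// Generic helper that returns the last element of any container.
// For std::string this is fine; for tiny_utf8::string, std::size()
// yields the byte count, which is then misused as a codepoint index.
template <typename Container>
auto last_element(const Container& c) {
    return c[std::size(c) - 1];  // byte count used as codepoint index
}

int main() {
    tiny_utf8::string s{ "h\xc3\xa9llo" };  // "héllo": 5 codepoints, 6 bytes
    auto c = last_element(s);  // indexes codepoint 5, one past the last valid index 4
    (void)c;
}
```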
Various other functions also made use of both size() and length(). Yes, this can be fixed on my side, but (1) it's difficult to get that right in a large code base, and (2) the library then no longer works as the "quick drop-in replacement" advertised in the README.
So, what I'm considering is adding a new raw_size() (similar to raw_at, the raw iterators, ...) that returns the byte size, and changing the default behavior of size() to match length(). This is obviously not a backwards-compatible change, but (1) there have been other non-backwards-compatible changes before, and (2) there could still be a define to switch between the two behaviors, roughly as sketched below.
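Concretely, I'm imagining something along these lines (just a sketch of the idea; raw_size() and the macro name are suggestions, not existing tiny_utf8 API):

```cpp
#include <cstddef>

// Sketch of the proposed split between byte count and codepoint count.
class utf8_string_sketch {
    std::size_t byte_count_ = 0;       // size of the raw UTF-8 buffer
    std::size_t codepoint_count_ = 0;  // number of codepoints

public:
    std::size_t raw_size() const noexcept { return byte_count_; }      // new: byte size
    std::size_t length()  const noexcept { return codepoint_count_; }  // unchanged

    // size() would default to the codepoint count for STL consistency,
    // with a compile-time switch to restore the old byte-count behavior.
    std::size_t size() const noexcept {
    #ifdef TINY_UTF8_SIZE_IS_BYTES   // hypothetical opt-out define
        return raw_size();
    #else
        return length();
    #endif
    }
};
```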
What do you think? If it's out of scope, I'll come up with a different solution. :)