Proper fix for uniseg w.r.t. PUA characters #137

@mikegerber

We currently wrap/monkey-patch uniseg's word_break() so that PUA (Private Use Area) characters, such as Aleitha's special characters, are handled properly. This workaround can fail, as seen in #130.
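For illustration, here is a minimal sketch of the wrapping idea. It deliberately does not import uniseg: `fake_word_break` is a hypothetical stand-in for a per-character word-break lookup, and the names `wrap_word_break` and `pua_property`, as well as the choice of `"ALetter"` as the property reported for PUA characters, are assumptions made for this example, not uniseg's actual API or our current patch.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if ch is a Private Use character (general category Co)."""
    return unicodedata.category(ch) == "Co"


def wrap_word_break(original, pua_property="ALetter"):
    """Wrap a per-character word-break lookup (hypothetical signature).

    The returned function reports `pua_property` for PUA characters and
    defers to `original` for everything else.
    """
    def patched(ch, *args, **kwargs):
        if is_private_use(ch):
            return pua_property
        return original(ch, *args, **kwargs)

    return patched


# Hypothetical stand-in for the library's lookup, used only for this sketch.
def fake_word_break(ch):
    return "ALetter" if ch.isalpha() else "Other"


patched = wrap_word_break(fake_word_break)
```

With this sketch, `patched("\ue000")` reports the letter-like property while the unwrapped stand-in does not. Replacing a library function with such a wrapper at import time depends on the library's internals, which is why it is fragile.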

A better approach would be to ask the uniseg maintainers for a supported way to define custom character properties. So let's ask them!
