Error in latex_to_unicode #472

@dlesbre

Describe the bug
The latex_to_unicode function can fail with a rather obscure type error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 65, in latex_to_unicode
    string = _replace_all_latex(string, itertools.chain(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 53, in _replace_all_latex
    string = _replace_latex(string, l.rstrip(), u)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 35, in _replace_latex
    if unicodedata.combining(unicod):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: combining() argument must be a unicode character, not str

The problem is most likely due to mapping entries like the third one below, where the unicode side isn't a single unicode character:

("\u2008", "\\hphantom{,}"),
("\u2009", "\\hspace{0.167em}"),
("\u2009-0200A-0200A", "\\;"),
("\u200A", "\\mkern1mu "),
("\u2013", "\\textendash "),

(Although this isn't the only example)
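
To illustrate the suspected cause outside of bibtexparser, here is a minimal sketch using only the standard library. The helper name `combining_or_error` is hypothetical and exists only for this demonstration; the string is copied from the entry quoted above. The exact wording of the error message may differ between Python versions.

```python
import unicodedata


def combining_or_error(text: str):
    """Hypothetical demo helper: return the combining class of `text`,
    or the TypeError raised when `text` is not exactly one character."""
    try:
        return unicodedata.combining(text)
    except TypeError as exc:
        return exc


# The "unicode" side of the "\\;" entry: one thin space followed by
# the literal text "-0200A-0200A", so 13 characters in total.
suspect = "\u2009-0200A-0200A"
print(len(suspect))

# Multi-character input is rejected with a TypeError.
print(repr(combining_or_error(suspect)))

# Single characters are accepted: a combining acute accent has a
# non-zero combining class, a plain letter has class 0.
print(combining_or_error("\u0301"))
print(combining_or_error("a"))
```

If this reading is right, any table entry whose unicode side is longer than one character would trigger the same traceback when its LaTeX form appears in the input.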

Reproducing

Version: 1.4.1

Code:

from bibtexparser.latexenc import latex_to_unicode
latex_to_unicode("\\;")

Remaining Questions (Optional)
Please tick all that apply:

  • I would be willing to contribute a PR to fix this issue: my solution would be to put a try/except block around the call to unicodedata.combining and assume False if it fails. I haven't submitted this directly because I don't know what these entries that aren't single unicode characters are or why they are there. If there is a good reason for them, there is probably a better way to handle them; if not, they should probably be removed.
  • This issue is a blocker, I'd be grateful for an early fix.
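
The try/except idea from the first item could look roughly like the sketch below. This is not the library's code: `safe_combining` is a hypothetical wrapper name, and how it would be wired into `_replace_latex` is left open, since that function's body is not shown here beyond the line in the traceback.

```python
import unicodedata


def safe_combining(unicod: str) -> int:
    """Hypothetical wrapper around unicodedata.combining.

    Returns the canonical combining class for a single character, and 0
    (i.e. "not a combining character") for anything combining() rejects,
    such as multi-character strings.
    """
    try:
        return unicodedata.combining(unicod)
    except TypeError:
        # Assumption from the proposal above: treat the entry as non-combining.
        return 0


# Multi-character entry from the mapping table: no longer raises.
print(safe_combining("\u2009-0200A-0200A"))

# Genuine combining character: still detected (non-zero class).
print(safe_combining("\u0301"))
```

This only hides the symptom. Whether the multi-character entries should be replaced verbatim, split, or dropped from the table is a separate question for the maintainers.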

Related issue: dlesbre/bibtex-autocomplete#12
