Special character \H removed from filepath #492

@stephanedebove

Describe the bug

The convert_to_unicode() customization treats the "\H" in a Windows file path as the LaTeX double acute accent command, which corrupts the path stored in the file field.

Code

This code:

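import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.customization import convert_to_unicode

with open(BIB_PATH, 'r', encoding='utf-8') as bib_file:
    parser = BibTexParser()
    parser.customization = convert_to_unicode
    bib_database = bibtexparser.load(bib_file, parser=parser)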

running on a bib file containing this entry:

@article{Hagger2022,
  title = {Perceived Behavioral Control Moderating Effects in the Theory of Planned Behavior: {{A}} Meta-Analysis},
  file = {C:\Users\name\Documents\Zotero\storage\7J78GAC5\Hagger et al_2022_Perceived behavioral control moderating effects in the theory of planned.pdf}
}

will consume the "\H" in the file path as an accent command and apply it to the following "a", so the file path becomes:

C:\Users\name\Documents\Zotero\storage\7J78GAC5a̋gger et al_2022_Perceived behavioral control moderating effects in the theory of planned.pdf
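To check which character replaced the "\H", the corrupted fragment can be inspected with the standard library. This is only a sketch: it assumes the accented character in the output above is a plain "a" followed by the combining mark U+030B, which is written here as an explicit escape.

```python
import unicodedata

# Assumption: the corrupted path contains "a" + U+030B where "\Ha" used to be.
corrupted_fragment = "7J78GAC5a\u030bgger"

# Name every non-ASCII character left in the fragment.
non_ascii = [unicodedata.name(ch) for ch in corrupted_fragment if ord(ch) > 127]
print(non_ascii)
```

Under that assumption the only non-ASCII character is the combining double acute accent, which is what LaTeX's \H command produces.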

Reproducing

Version: 1.4.2

Workaround
For now, I just rewrote the convert_to_unicode function to skip the file field:

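def convert_to_unicode(record):
    for val in record:
        # Leave file paths alone: their backslashes are not LaTeX commands.
        if val == "file":
            continue
        # (rest of the function not shown in this snippet)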
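A variant of the same workaround that avoids copying the library function is to wrap it. The sketch below is not part of bibtexparser: skip_fields and fake_convert are hypothetical names, and fake_convert is only a stand-in so the example runs without the library installed.

```python
def skip_fields(convert, fields=("file",)):
    """Wrap a record customization so the named fields pass through untouched."""
    protected = frozenset(fields)

    def customization(record):
        # Set aside the raw values before the converter can rewrite them.
        saved = {key: record[key] for key in protected if key in record}
        record = convert(record)
        # Put the original, unconverted values back.
        record.update(saved)
        return record

    return customization


def fake_convert(record):
    # Stand-in for the real converter: mangles every "\H" it sees.
    return {key: value.replace("\\H", "#") for key, value in record.items()}


entry = {"title": "a \\H b", "file": "C:\\storage\\Hagger.pdf"}
result = skip_fields(fake_convert)(entry)
print(result["file"])
```

With the real library, the wrapper would presumably be installed as parser.customization = skip_fields(convert_to_unicode), assuming convert_to_unicode returns the record as it does in the 1.x releases.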
