parsing of escaped double quotes in double quote delimited fields fails #487

@detlevd

Describe the bug
When processing BibTeX entries whose fields are delimited by double quotes, embedded double quotes have to be escaped as {"}, according to https://tug.ctan.org/info/bibtex/tamethebeast/ttb_en.pdf, page 20. However, v2.0.0b7 can't cope with that.
I can't see an easy fix within the current parsing technique.

Reproducing

Version: 2.0.0b7

Code:

import bibtexparser
# title according to page 20 of https://tug.ctan.org/info/bibtex/tamethebeast/ttb_en.pdf
bibentrytext = '''
@inproceedings{quotingproblem,
        pages = "23--26",
        title = "Comments on {"}Filenames and Fonts{"}",
}
'''
library = bibtexparser.parse_string(bibentrytext)
new_bibtex_str = bibtexparser.write_string(library)
print(new_bibtex_str)

Bibtex:

@inproceedings{quotingproblem,
        pages = "23--26",
        title = "Comments on {"}Filenames and Fonts{"}",
}

Workaround
Find such fields by hand and switch them to {...} delimiters. Because long or multiline fields (abstracts, long titles, ...) may be affected, this cannot be done reliably with a few regular expressions.
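Instead of regular expressions, a small brace-counting scanner could do the conversion as a preprocessing step. The following is only a rough sketch, not part of bibtexparser: the names find_closing_quote and requote_fields are made up for illustration. It assumes entries are delimited by braces (not parentheses), that braces are balanced, and that a double quote only closes a field when it sits at brace depth 0 inside the value, as described in the document linked above.

```python
from typing import Optional


def find_closing_quote(text: str, start: int) -> Optional[int]:
    """Return the index of the quote that closes the value opened at `start`.

    A double quote nested inside braces, as in {"}, does not end the value.
    Returns None if no closing quote is found.
    """
    depth = 0
    for i in range(start + 1, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == '"' and depth == 0:
            return i
    return None


def requote_fields(text: str) -> str:
    """Rewrite quote-delimited field values as brace-delimited ones.

    Assumption: a quote seen at brace depth 1 (directly inside an entry's
    own braces) opens a field value. Brace-delimited values are untouched.
    """
    out = []
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == '"' and depth == 1:
            end = find_closing_quote(text, i)
            if end is not None:
                # Swap the outer quotes for braces, keep the content as-is.
                out.append("{" + text[i + 1:end] + "}")
                i = end + 1
                continue
        out.append(ch)
        i += 1
    return "".join(out)
```

The converted text could then be handed to bibtexparser.parse_string as in the reproduction code above. I have not checked this against parenthesis-delimited entries or against comments that contain braces, so treat it as a starting point only.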

Remaining Questions (Optional)
Please tick all that apply:

  • I would be willing to contribute a PR to fix this issue.
  • This issue is a blocker, I'd be grateful for an early fix.
