pdf2json hangs in fonts.js in Font_buildToFontChar

Parsing one particular PDF takes a very long time, and the output looks like East Asian characters although it should be German text.
The length of the array "toUnicode" in fonts.js is 4294967293, and most of its elements are undefined. Traversing this array in buildToFontChar() takes several minutes. Other PDFs are parsed immediately and without problems. Unfortunately I cannot provide the document because it contains private information. If you need further information, or if there is something I can check, please tell me.
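To illustrate why such an array is slow to traverse, here is a minimal sketch. It is not the pdf2json code; the table contents and the helper name `collectDefinedEntries` are made up for illustration. It assumes the array is sparse, i.e. a single very large index inflates `length` (4294967293 is 2^32 - 3) while only a handful of entries actually exist. An index-based loop from 0 to `length` would run about 4.3 billion iterations, whereas iterating over the existing keys touches only the defined entries.

```javascript
// Hypothetical stand-in for the sparse "toUnicode" table (not pdf2json's real data).
const toUnicode = [];
toUnicode[0x41] = 'A';
// A single huge index is enough to inflate the array length.
toUnicode[4294967292] = 'ß';

console.log(toUnicode.length); // 4294967293

// Slow pattern (deliberately not executed here): visits every index up to length.
// for (let i = 0; i < toUnicode.length; i++) { /* ... */ }

// Hypothetical helper: visits only the indices that actually exist.
function collectDefinedEntries(table) {
  const result = new Map();
  for (const key of Object.keys(table)) {
    const value = table[key];
    if (typeof value !== 'undefined') {
      result.set(Number(key), value);
    }
  }
  return result;
}

const entries = collectDefinedEntries(toUnicode);
console.log(entries.size); // 2
```

Whether skipping the holes is the right fix here is an open question; the huge index itself probably should not be created in the first place.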

Some more information:
- the version of pdf2json I use is 1.1.8.
- the font name parameter of the calling function has the value TimesNewRomanPSMT.

Nevertheless: thank you for this great project!

Edit:
It seems the problem is somewhere in readToUnicode() in evaluator.js. The huge length of "toUnicode" is caused by the German umlauts "ä", "ö", "ü" and by "ß".


pdf2json hangs in fonts.js in Font_buildToFontChar, #184
