Standard fonts in pdf-lib cannot encode certain characters outside WinAnsi #1759

@VladyslavLytvynenko22

What were you trying to do?

While generating a PDF with pdf-lib, I hit an error when rendering text that contains characters the WinAnsi encoding does not support. For example, the Cyrillic letter Т (U+0422, reported by the library as 0x0422) cannot be encoded.

Steps to Reproduce:

1. Create a PDF using pdf-lib.
2. Add a text field or text content with characters outside WinAnsi (e.g., Т).
3. Save the PDF.

Expected Behavior:
The library should support rendering text with a broader character set (e.g., via embedding a font with extended Unicode support).

Actual Behavior:
Throws an error:

index.tsx:119 Error: WinAnsi cannot encode "Т" (0x0422)
at Encoding.encodeUnicodeCodePoint (Encoding.js:18:1)
at StandardFontEmbedder.encodeTextAsGlyphs (StandardFontEmbedder.ts:124:1)
at StandardFontEmbedder.encodeText (StandardFontEmbedder.ts:50:1)
at PDFFont.encodeText (PDFFont.ts:72:1)
at layoutSinglelineText (layout.ts:324:1)
at defaultTextFieldAppearanceProvider (appearances.ts:475:1)
at PDFTextField.updateWidgetAppearance (PDFTextField.ts:823:1)
at PDFTextField.updateAppearances (PDFTextField.ts:812:1)
at PDFTextField.defaultUpdateAppearances (PDFTextField.ts:783:1)
at PDFForm.updateFieldAppearances (PDFForm.ts:647:1)
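
To see ahead of time which characters would trigger this error, a rough pre-check can be run on the text before it is set on a field. The sketch below is an approximation: it assumes WinAnsi matches Windows-1252 (printable ASCII, Latin-1 from 0xA0 to 0xFF, plus the 27 code points mapped into 0x80-0x9F). The names `isLikelyWinAnsi` and `findUnencodableChars` are hypothetical helpers, not part of pdf-lib, and the library's own table may differ in edge cases.

```javascript
// Code points that Windows-1252 maps into the 0x80-0x9F range.
// Assumption: this approximates pdf-lib's WinAnsi table; it is not copied from the library.
const CP1252_EXTRAS = new Set([
  0x20ac, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021, 0x02c6, 0x2030,
  0x0160, 0x2039, 0x0152, 0x017d, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022,
  0x2013, 0x2014, 0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x017e, 0x0178,
]);

// Hypothetical helper: true if the code point is probably encodable in WinAnsi.
function isLikelyWinAnsi(codePoint) {
  return (
    (codePoint >= 0x20 && codePoint <= 0x7e) || // printable ASCII
    (codePoint >= 0xa0 && codePoint <= 0xff) || // Latin-1 supplement
    CP1252_EXTRAS.has(codePoint)
  );
}

// Hypothetical helper: returns the distinct characters that would likely fail,
// in order of first appearance. Iterating with for...of walks code points,
// so characters outside the BMP are handled as single units.
function findUnencodableChars(text) {
  const seen = new Set();
  for (const ch of text) {
    if (!isLikelyWinAnsi(ch.codePointAt(0))) seen.add(ch);
  }
  return [...seen];
}
```

If `findUnencodableChars(text)` returns a non-empty array, the text probably needs an embedded Unicode-capable font rather than a standard one. Note that control characters such as newlines are also reported by this check.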

Proposal:
Consider adding support for fonts that include extended Unicode characters (beyond WinAnsi), so text with a wider range of letters can be properly embedded and rendered.

How did you attempt to do it?

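import { PDFDocument } from "pdf-lib";

// fileBytes: bytes of an existing PDF that contains a text field named "test"
const pdfDoc = await PDFDocument.load(fileBytes);
const form = pdfDoc.getForm();
const field = form.getTextField("test");
field.setText("Тест"); // Cyrillic text, outside WinAnsi
await pdfDoc.save(); // fails while field appearances are updated (see stack trace)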

Error: WinAnsi cannot encode "Т" (0x0422)
at Encoding.encodeUnicodeCodePoint (Encoding.js:18:1)
at StandardFontEmbedder.encodeTextAsGlyphs (StandardFontEmbedder.ts:124:1)
at StandardFontEmbedder.encodeText (StandardFontEmbedder.ts:50:1)
at PDFFont.encodeText (PDFFont.ts:72:1)
...

What actually happened?

Workaround:
Using a custom font with fontkit works correctly:

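import { PDFDocument } from "pdf-lib";
import fontkit from "@pdf-lib/fontkit";

const pdfDoc = await PDFDocument.load(fileBytes);
pdfDoc.registerFontkit(fontkit);
// getLiberationSansFont() is my own helper that returns the font file bytes
const fontBytes = await getLiberationSansFont();
const customFont = await pdfDoc.embedFont(fontBytes);

const form = pdfDoc.getForm();
const formField = form.getTextField("test");
formField.setText("Тест");
formField.updateAppearances(customFont); // render the field with the embedded font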

What did you expect to happen?

Proposal:
Consider embedding at least one font with broader Unicode support by default (e.g., LiberationSans, NotoSans), or provide an easier mechanism to fall back from WinAnsi to a Unicode-capable font.

For example, something like this (a suggested API, it does not exist today):
pdfDoc.embedFont(StandardFonts.LiberationSansUnicode);
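
Until something like that exists, a fallback can be approximated in user code. The sketch below assumes only that each candidate font exposes an `encodeText(text)` method that throws when it cannot encode the text, which is the behavior the stack trace shows for standard fonts (`PDFFont.encodeText`). The helper name `pickFontFor` is hypothetical and not part of pdf-lib.

```javascript
// Hypothetical helper: returns the first candidate whose encodeText(text)
// does not throw, or undefined if every candidate fails.
// Assumption: candidates are ordered by preference, e.g. [standardFont, customFont].
function pickFontFor(text, candidates) {
  for (const font of candidates) {
    try {
      font.encodeText(text); // result is discarded; only success or failure matters
      return font;
    } catch (err) {
      // This font cannot encode the text; try the next candidate.
    }
  }
  return undefined;
}
```

With real fonts this could be used as `const font = pickFontFor("Тест", [helvetica, customFont])`, followed by `formField.updateAppearances(font)`. One caveat: an embedded custom font may not throw for glyphs it lacks, so this check is mainly useful for deciding whether a standard font is enough.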

How can we reproduce the issue?

Run the snippet from "How did you attempt to do it?" above against a PDF that contains a text field named "test". It fails with the same error: WinAnsi cannot encode "Т" (0x0422).

Version

1.17.1

What environment are you running pdf-lib in?

Other

Checklist

  • My report includes a Short, Self Contained, Correct (Compilable) Example.
  • I have attached all PDFs, images, and other files needed to run my SSCCE.

Additional Notes

No response
