Using Incremental Saves

Jorj X. McKie edited this page May 28, 2016 · 2 revisions

Since version 1.9.1, incremental saves are possible for PDF documents.

Saving incrementally has the following advantages (see also chapter 3.4.5 of the Adobe PDF Reference):

  • It is normally much faster, because changes are appended to the file instead of the file being rewritten as a whole.
  • It avoids handling intermediate files when all you want is to update a specific document in place.
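The "append instead of rewrite" idea can be illustrated with a plain-file analogy. The sketch below does not touch PDF structure or any PDF library. It only shows that appending leaves the original bytes untouched and grows the file by the size of the update. The helper names `append_update` and `demo` and the byte contents are made up for illustration.

```python
import os
import tempfile


def append_update(path, update):
    """Append bytes to an existing file and return the new file size."""
    with open(path, "ab") as f:      # "ab": append, never rewrite
        f.write(update)
    return os.path.getsize(path)


def demo():
    """Return two booleans: original bytes intact, size grew by the update."""
    base = b"original content\n"     # stands in for the existing file
    update = b"appended changes\n"   # stands in for the incremental update
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(base)
        new_size = append_update(path, update)
        with open(path, "rb") as f:
            data = f.read()
    finally:
        os.remove(path)              # clean up the temporary file
    return data.startswith(base), new_size == len(base) + len(update)
```

Because the existing bytes are never touched, the cost of the write depends on the size of the update rather than on the size of the whole file.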

The call pattern of incremental saves is as follows:

doc.save(doc.name, incremental=True, ...)

Prerequisites

There are a number of prerequisites that must be met to use this facility:

  1. Incremental saves are not possible for encrypted files, even if they have been successfully decrypted.
  2. The internal structure of the document must be intact. If errors occur while opening the PDF, a flag is set that prevents a later incremental save. A normal save is still possible.
  3. Option incremental=True cannot be combined with the options garbage or linear.
  4. The file to save to must be the original one. Therefore, documents opened from a memory area cannot be saved incrementally.
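The four prerequisites can be collected into a single pre-flight check. This is only a sketch: the function name `can_save_incrementally` and its parameters are hypothetical, and the caller must supply the flags. How to obtain them from an open document is not covered on this page.

```python
def can_save_incrementally(was_encrypted, opened_with_errors,
                           opened_from_memory, garbage=0, linear=False):
    """Return (ok, reason) for the prerequisites listed above.

    All arguments are supplied by the caller; this hypothetical helper
    does not inspect a document itself.
    """
    if was_encrypted:
        # Prerequisite 1: applies even after successful decryption.
        return False, "document is encrypted"
    if opened_with_errors:
        # Prerequisite 2: the internal structure must be intact.
        return False, "errors occurred while opening the PDF"
    if garbage or linear:
        # Prerequisite 3: these options exclude incremental=True.
        return False, "garbage or linear was requested"
    if opened_from_memory:
        # Prerequisite 4: there is no original file to append to.
        return False, "document was opened from memory"
    return True, ""
```

A caller might use the result to choose between the two save styles, for example by calling doc.save(doc.name, incremental=True) only when the first element is True and saving to a new file otherwise.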

Typical Uses

The most typical uses are small changes to the document, such as adding or deleting a small number of pages or updating bookmarks. As changes become more extensive, there is a break-even point beyond which saving to a new file is the better choice.

The following code snippet deletes empty pages from a text-oriented PDF (like the Adobe manual ...):

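# doc is a PDF document that was opened from a file (not from memory)
# keep only the numbers of pages that contain text
lst = [i for i in range(doc.pageCount) if doc.getPageText(i)]
if len(lst) < doc.pageCount:             # at least one empty page was found
    doc.select(lst)                      # retain only the listed pages
    try:
        doc.save(doc.name, incremental=True)
    except Exception:                    # incremental save not possible
        doc.save("new.pdf", garbage=3)   # fall back to a full save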