Merging PDF Files with Python
Python offers powerful options for merging PDF files, allowing you to combine multiple documents into a single, unified one. This tutorial will guide you through the process, including advanced techniques like looping through directories and excluding specific pages.
Using pypdf Merging Class
pypdf provides the PdfMerger class, which offers an easy way to concatenate and merge PDF files.
File Concatenation
Concatenate files by appending them using the append method:
<code class="python">import PdfMerger pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf', 'file4.pdf'] merger = PdfMerger() for pdf in pdfs: merger.append(pdf) merger.write("result.pdf")</code>
File Merging
For finer control, use the merge method to specify insertion points:
<code class="python">merger.merge(2, pdf) # Insert PDF at page 2</code>
Page Ranges
Control which pages are appended using the pages keyword argument:
<code class="python">merger.append(pdf, pages=(0, 3)) # Append first 3 pages merger.append(pdf, pages=(0, 6, 2)) # Append pages 1, 3, 5</code>
Excluding Blank Pages
To exclude a specific page from all merged PDFs, you can manipulate the pages parameter accordingly. For example, to exclude page 1 from each PDF:
<code class="python">pages_to_exclude = [0] # Page 1 for pdf in pdfs: merger.append(pdf, pages=(i for i in range(pages) if i not in pages_to_exclude))</code>
PyMuPdf Library
Another option is the PyMuPdf library. Here's how to merge PDFs with it:
From Command Line
python -m fitz join -o result.pdf file1.pdf file2.pdf file3.pdf
From Code
<code class="python">import fitz result = fitz.open() for pdf in ['file1.pdf', 'file2.pdf', 'file3.pdf']: with fitz.open(pdf) as mfile: result.insert_pdf(mfile) result.save("result.pdf")</code>
Looping Through Folders
To loop through folders and merge PDFs, use the os module:
<code class="python">import os for folder in os.listdir("path/to/directory"): pdfs = [f for f in os.listdir(f"path/to/directory/{folder}") if f.endswith(".pdf")] merger = PdfMerger() for pdf in pdfs: merger.append(f"path/to/directory/{folder}/{pdf}") merger.write(f"merged_{folder}.pdf")</code>
The above is the detailed content of How to Merge PDF Files with Python: A Comprehensive Guide. For more information, please follow other related articles on the PHP Chinese website!