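When working with PDF documents, you often need to add extra text, anything from a simple annotation to a complex watermark. Python has no built-in library for editing PDFs, so this has to be done with external modules.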
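PyPDF and ReportLab are two popular options for working with PDFs in Python. However, neither module offers direct support for editing the content of an existing PDF file: ReportLab is designed for generating new PDFs with custom content, while PyPDF reads existing files and works at the page level (merging, splitting, and reordering pages).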
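To add text to an existing PDF, you can combine the two: draw the text into a new in-memory PDF with ReportLab, then merge that page onto the existing page with PyPDF. The following example works on both Windows and Linux: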
Python 2.7:
<code class="python">from pyPdf import PdfFileWriter, PdfFileReader
import StringIO
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = StringIO.StringIO()
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 100, "Hello world")
can.save()

# move to the beginning of the StringIO buffer
packet.seek(0)

# create a new PDF with Reportlab
new_pdf = PdfFileReader(packet)
# read your existing PDF
existing_pdf = PdfFileReader(file("original.pdf", "rb"))
output = PdfFileWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.getPage(0)
page.mergePage(new_pdf.getPage(0))
output.addPage(page)
# finally, write "output" to a real file
outputStream = file("destination.pdf", "wb")
output.write(outputStream)
outputStream.close()</code>
Python 3.x:
<code class="python"># PdfReader/PdfWriter replace the older PdfFileReader/PdfFileWriter names,
# which PyPDF2 3.0 removed
from PyPDF2 import PdfWriter, PdfReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

packet = io.BytesIO()
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 100, "Hello world")
can.save()

# move to the beginning of the BytesIO buffer
packet.seek(0)

# create a new PDF with Reportlab
new_pdf = PdfReader(packet)
# read your existing PDF
existing_pdf = PdfReader(open("original.pdf", "rb"))
output = PdfWriter()
# add the "watermark" (which is the new pdf) on the existing page
page = existing_pdf.pages[0]
page.merge_page(new_pdf.pages[0])
output.add_page(page)
# finally, write "output" to a real file
output_stream = open("destination.pdf", "wb")
output.write(output_stream)
output_stream.close()</code>
This solution combines ReportLab's flexibility in drawing the watermark text with PyPDF's page manipulation features.
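The position passed to `can.drawString(10, 100, ...)` in the examples above is given in PDF points (1/72 inch), measured from the bottom-left corner of the page, with `y` marking the text baseline. Layouts are often easier to think about from the top-left corner, so a small conversion can help. The following is a minimal sketch using only the standard library; the helper names `mm_to_points` and `from_top_left` are hypothetical and are not part of ReportLab or PyPDF, and the default page size assumes US Letter (612 x 792 points), matching the `letter` page size used above.

```python
# Sketch only: these helper names are hypothetical, not ReportLab or PyPDF API.

POINTS_PER_INCH = 72.0      # PDF user space: 1 point = 1/72 inch
LETTER = (612.0, 792.0)     # US Letter in points (8.5 x 11 inches)


def mm_to_points(mm):
    """Convert millimetres to PDF points (25.4 mm per inch)."""
    return mm / 25.4 * POINTS_PER_INCH


def from_top_left(x_from_left, y_from_top, page_size=LETTER):
    """Map a position measured from the top-left corner to the
    bottom-left based (x, y) pair that canvas.drawString expects."""
    _, height = page_size
    return (x_from_left, height - y_from_top)


if __name__ == "__main__":
    # One inch in from the left edge, one inch down from the top edge.
    x, y = from_top_left(72, 72)
    print(x, y)  # 72 720.0
```

The resulting pair could then be passed on as `can.drawString(x, y, "Hello world")`. For pages that are not Letter-sized, pass the actual page dimensions in points as `page_size` instead of relying on the default.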