How to convert HTML to PDF in Java
As more work moves to digital formats, the demand for electronic documents keeps growing. A common task in practice is converting HTML files to PDF files, and Java has several libraries that can do it. This article introduces three of them:
1. Use iText to convert HTML to PDF
iText is a popular Java PDF library that can convert HTML to PDF. It parses the HTML and rebuilds the content as PDF page elements. The following is the key code for using iText to convert HTML to PDF:
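    // Package names assume iText 5; the article does not name a version.
    import com.itextpdf.text.Document;
    import com.itextpdf.text.html.simpleparser.HTMLWorker;
    import com.itextpdf.text.pdf.PdfWriter;
    import java.io.FileOutputStream;
    import java.io.StringReader;

    // Bind the document to the output file and open it.
    Document document = new Document();
    PdfWriter.getInstance(document, new FileOutputStream("output.pdf"));
    document.open();

    // Parse the HTML and add the result to the document.
    HTMLWorker htmlWorker = new HTMLWorker(document);
    String html = "<html><head></head><body><p>Hello World</p></body></html>";
    htmlWorker.parse(new StringReader(html));

    document.close();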
The above code creates a Document object that represents the PDF and uses PdfWriter.getInstance() to bind it to the output stream. After the document is opened, HTMLWorker parses the HTML and adds the resulting elements to the document. Finally, closing the Document object completes the PDF file. Note that HTMLWorker handles only simple HTML and is deprecated in iText 5 in favor of the XML Worker add-on.
2. Use Flying Saucer to convert HTML to PDF
Another Java tool that can be used to convert HTML to PDF is Flying Saucer. It is a free and open source renderer for XHTML and CSS that produces PDF output through iText. The following is sample code for using Flying Saucer to convert HTML to PDF:
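    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.xhtmlrenderer.pdf.ITextRenderer;
    import org.xml.sax.InputSource;

    // htmlContent holds the markup; it is parsed as XML, so it must be well-formed XHTML.
    DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = documentBuilderFactory.newDocumentBuilder();
    Document document = builder.parse(new InputSource(new StringReader(htmlContent)));

    ITextRenderer iTextRenderer = new ITextRenderer();
    iTextRenderer.setDocument(document, null); // null: no base URL for relative links
    iTextRenderer.layout();

    // try-with-resources closes the stream even if rendering fails.
    try (OutputStream outputStream = new FileOutputStream("output.pdf")) {
        iTextRenderer.createPDF(outputStream);
    }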
The above code first parses the HTML string into a DOM Document. Because this step uses an XML parser, the input must be well-formed XHTML. Then the ITextRenderer's layout() method lays out the document. Finally, the createPDF() method writes the PDF to the outputStream.
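Since the Flying Saucer sample hands the markup to an XML parser, input that browsers accept but that is not well-formed XML (an unclosed <br>, or an entity such as &nbsp; with no DTD) fails before any rendering happens. The following is a minimal sketch of checking a string up front with the JDK's own parser. The class name XhtmlCheck and the method isWellFormed are hypothetical helpers for illustration, not part of Flying Saucer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks whether markup can be parsed as XML.
final class XhtmlCheck {

    /** Returns true if the markup parses as well-formed XML. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the markup.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with the default parser.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>Hello<br/></p>")); // true
        System.out.println(isWellFormed("<p>Hello<br></p>"));  // false: <br> is never closed
        System.out.println(isWellFormed("<p>a&nbsp;b</p>"));   // false: nbsp is not declared
    }
}
```

If the check fails, one option is to run the markup through an HTML cleaner that emits XHTML before passing it to the renderer; which cleaner to use is outside the scope of this sketch.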
3. Use PDFBox to convert HTML to PDF
PDFBox is a popular open source Java PDF library that provides many tools for creating and processing PDF files. Unlike iText and Flying Saucer, it has no HTML parser or layout engine, so the HTML must first be reduced to plain text, which is then drawn onto the page by hand.
The following is sample code for using PDFBox to write the text of an HTML string to a PDF (written against the PDFBox 2.x API):
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDPageContentStream;
    import org.apache.pdfbox.pdmodel.common.PDRectangle;
    import org.apache.pdfbox.pdmodel.font.PDType1Font;
    import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDDocumentOutline;
    import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;

    String html = "<html><head></head><body><p>Hello World!</p></body></html>";
    // PDFBox cannot parse HTML, so strip the tags naively and keep only the text.
    String text = html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();

    try (PDDocument document = new PDDocument()) {
        PDPage page = new PDPage();
        document.addPage(page);

        // Position the text one inch (72 pt) inside the page edges.
        PDRectangle mediaBox = page.getMediaBox();
        float margin = 72;
        float startX = mediaBox.getLowerLeftX() + margin;
        float startY = mediaBox.getUpperRightY() - margin;
        float width = mediaBox.getWidth() - 2 * margin; // usable line width; PDFBox does not wrap text

        // Add a bookmark that points at the page.
        PDDocumentOutline outline = new PDDocumentOutline();
        document.getDocumentCatalog().setDocumentOutline(outline);
        PDOutlineItem item = new PDOutlineItem();
        item.setTitle("PDFBox");
        item.setDestination(page);
        outline.addLast(item);

        // Draw the text; closing the stream finishes the page content.
        try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
            contentStream.beginText();
            contentStream.setFont(PDType1Font.COURIER, 14);
            contentStream.newLineAtOffset(startX, startY);
            contentStream.showText(text);
            contentStream.endText();
        }

        document.save("output.pdf");
    }
The above code first strips the tags from the HTML string, because PDFBox cannot interpret them. It then creates a PDDocument object and adds a new page to it, and adds a bookmark for that page to the document outline. Next, a PDPageContentStream object draws the text on the page in 14 pt Courier, starting at the top-left margin. Finally, save() writes the PDF file. All HTML structure and styling is lost with this approach, so it is only suitable when plain text output is enough.
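The PDFBox sample computes a usable line width from the media box and a 72 pt margin, but PDFBox itself never wraps text, so long content has to be split into lines before it is drawn. The following is a minimal sketch of that step using only the JDK. It relies on the fact that every Courier glyph is 0.6 em wide, and it assumes a US Letter page (612 pt wide), which gives a 468 pt line. The class CourierWrap and its methods are hypothetical helpers, not PDFBox API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prepares plain text lines for a monospaced Courier layout.
final class CourierWrap {

    // Every glyph in Courier is 600/1000 of the font size wide.
    static final double COURIER_EM_WIDTH = 0.6;

    /** Number of Courier characters that fit into a line of the given width. */
    static int maxCharsPerLine(double lineWidthPt, double fontSizePt) {
        return (int) Math.floor(lineWidthPt / (COURIER_EM_WIDTH * fontSizePt));
    }

    /** Naive tag stripper: replaces tags with spaces and collapses whitespace. */
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    /** Greedy word wrap; a word longer than maxChars gets a line of its own. */
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.split(" ")) {
            // Start a new line if appending this word would exceed the limit.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxChars) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        int maxChars = maxCharsPerLine(612 - 2 * 72, 14);
        String text = stripTags("<html><body><p>Hello World!</p></body></html>");
        for (String line : wrap(text, maxChars)) {
            System.out.println(line);
        }
    }
}
```

Each resulting line can then be drawn with its own showText() call, moving down by the chosen line height with newLineAtOffset() between calls. For proportional fonts the character count no longer works, and the width of each candidate line has to be measured with the font's metrics instead.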
Summary
Converting HTML to PDF is a common requirement in real projects, and Java offers several libraries for it. This article introduced three: iText, which parses simple HTML directly; Flying Saucer, which renders well-formed XHTML and CSS; and PDFBox, which leaves HTML parsing and layout to your own code. Choose the one that best suits your project's needs.