iTextSharp: Efficiently Converting HTML to PDF
Converting HTML to PDF with iTextSharp takes a deliberate approach. HTML and PDF are distinct formats, so the conversion is a translation step with its own limits rather than a one-to-one copy.
Understanding iTextSharp's HTML Handling
iTextSharp can parse HTML and CSS, but it has no knowledge of frameworks such as ASP.NET, MVC, or Razor. Producing the final HTML from your chosen framework is your responsibility; iTextSharp only works with the HTML you hand to it.
Parser Selection: HTMLWorker vs. XMLWorker
iTextSharp provides two HTML parsers: HTMLWorker and XMLWorker. HTMLWorker is the older option and ignores CSS; XMLWorker is now the recommended parser, offering better extensibility and CSS support.
Code Example: HTML Tag Parsing with HTMLWorker and XMLWorker
The following C# code snippets illustrate how to parse HTML tags using both methods:
<code class="language-csharp">// Both snippets assume doc is an open iTextSharp.text.Document
// and writer is the PdfWriter attached to it.

// Example HTML
string html = "...";

// Parsing with HTMLWorker (CSS ignored)
using (var htmlWorker = new iTextSharp.text.html.simpleparser.HTMLWorker(doc))
{
    using (var sr = new StringReader(html))
    {
        htmlWorker.Parse(sr);
    }
}

// Parsing with XMLWorker (CSS supported)
using (var srHtml = new StringReader(html))
{
    iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, srHtml);
}</code>
Leveraging XMLWorker for CSS Support
XMLWorker can also apply a stylesheet supplied separately from the HTML. The following example passes the HTML and the CSS to ParseXHtml as two streams:
<code class="language-csharp">string css = "...";

// Convert CSS and HTML strings to memory streams
using (var msCss = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(css)))
using (var msHtml = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
{
    iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHtml, msCss);
}</code>
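The snippets above use `doc` and `writer` without showing where they come from. The following is a minimal sketch of one way to set them up, assuming iTextSharp 5.x with the XMLWorker package referenced and well-formed XHTML as input. The class name `HtmlToPdfSketch` and the method name `Render` are hypothetical, chosen only for illustration.

```csharp
using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdfSketch
{
    // Hypothetical helper: renders an XHTML string and a CSS string to PDF bytes.
    public static byte[] Render(string html, string css)
    {
        using (var ms = new MemoryStream())
        {
            // The Document is the abstract PDF; the PdfWriter binds it to the stream.
            using (var doc = new Document())
            using (var writer = PdfWriter.GetInstance(doc, ms))
            {
                // The document must be open before content can be added.
                doc.Open();

                using (var msCss = new MemoryStream(Encoding.UTF8.GetBytes(css)))
                using (var msHtml = new MemoryStream(Encoding.UTF8.GetBytes(html)))
                {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHtml, msCss);
                }

                // Closing the document finalizes the PDF in the stream.
                doc.Close();
            }

            // Read the bytes only after the document has been closed;
            // ToArray still works once the MemoryStream itself is closed.
            return ms.ToArray();
        }
    }
}
```

A call such as `File.WriteAllBytes("output.pdf", HtmlToPdfSketch.Render(html, css));` would then write the result to disk. Returning a byte array rather than writing to a file directly keeps the helper usable from a web response or a test.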
Important Note: iTextSharp's support for HTML and CSS features is not exhaustive. Consult the official iTextSharp documentation for comprehensive details on supported features and limitations.