Writing Async Libraries - Let's Convert HTML to PDF

Feb 10, 2025 pm 03:51 PM

Key Points

  • Asynchronous programming in PHP, applied here to HTML to PDF conversion, lets slow operations run without blocking, so other code can execute in the meantime.
  • Promises and callbacks simplify deferred operations and their error handling, making code more robust and easier to maintain.
  • Developing a custom asynchronous library (such as the HTML to PDF converter discussed in this article) means creating abstractions on top of tools such as ReactPHP and Amp to manage asynchronous tasks.
  • Asynchronous code can be adapted to run synchronously, so it stays compatible with different application architectures without giving up the advantages of asynchronous programming.
  • Abstracting the parallel execution logic into a common driver system makes it possible to support multiple frameworks and environments, each backed by a different asynchronous library.
  • This article walks through a practical implementation of asynchronous HTML to PDF conversion in PHP, and stresses the value of understanding modern programming paradigms when building efficient applications.

This article was peer-reviewed by Thomas Punt. Thanks to all the peer reviewers of SitePoint for getting SitePoint content to its best!


Asynchronous PHP comes up at almost every conference. I'm glad it's talked about so often now. There's a secret those speakers aren't telling, though...

Creating an asynchronous server, resolving domain names, and interacting with the file system: these are all the easy things. Creating your own asynchronous libraries is hard. And that's exactly where you spend most of your time!

These easy things are easy because they were the proof of concept that made asynchronous PHP competitive with NodeJS. You can see how similar their early interfaces were:

var http = require("http");
var server = http.createServer();

server.on("request", function(request, response) {
    response.writeHead(200, {
        "Content-Type": "text/plain"
    });

    response.end("Hello World");
});

server.listen(3000, "127.0.0.1");

This code was tested with Node 7.3.0.

require "vendor/autoload.php";

$loop = React\EventLoop\Factory::create();
$socket = new React\Socket\Server($loop);
$server = new React\Http\Server($socket);

$server->on("request", function($request, $response) {
    $response->writeHead(200, [
        "Content-Type" => "text/plain"
    ]);

    $response->end("Hello world");
});

$socket->listen(3000, "127.0.0.1");
$loop->run();

This code was tested with PHP 7.1 and react/http:0.4.2.

Today, we'll look at some ways to make your application code work well in an asynchronous architecture. Don't worry: your code can still work in a synchronous architecture, so you don't have to give anything up to learn this new skill. Apart from a bit of time...

You can find the code for this tutorial on GitHub. I've tested it with PHP 7.1 and the latest versions of ReactPHP and Amp.

Promising Theory

Asynchronous code has a few common abstractions. We've already seen one of them: callbacks. Callbacks, by their very name, describe how they deal with slow or blocking operations. Synchronous code is full of waiting: ask for something, then wait for that thing to happen.

Asynchronous frameworks and libraries can use callbacks instead. Ask for something, and when it happens, the framework or library calls your code back.

In the case of the HTTP server, we don't pre-emptively handle every request. We don't wait around for requests to arrive, either. We simply describe the code that should be called if a request occurs. The event loop takes care of the rest.
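To make the "describe now, call later" idea concrete, here is a minimal sketch in plain PHP. TinyEmitter and its on and emit methods are names I made up for illustration; they are not ReactPHP's API, and the foreach at the end only stands in for a real event loop.

```php
<?php
// A toy emitter: TinyEmitter, on() and emit() are illustrative names only.
final class TinyEmitter
{
    /** @var array<string, callable[]> */
    private $listeners = [];

    // Describe the code to call if the event ever occurs.
    public function on(string $event, callable $listener): void
    {
        $this->listeners[$event][] = $listener;
    }

    // Called by the "loop" when something actually happens.
    public function emit(string $event, ...$args): void
    {
        foreach ($this->listeners[$event] ?? [] as $listener) {
            $listener(...$args);
        }
    }
}

$server = new TinyEmitter();
$responses = [];

// Nothing runs yet: we only register what should happen on a request.
$server->on("request", function (string $path) use (&$responses) {
    $responses[] = "Hello from " . $path;
});

// Stand-in for the event loop: feed incoming events to the emitter.
foreach (["/", "/about"] as $path) {
    $server->emit("request", $path);
}

print implode("\n", $responses) . "\n";
```

The registration and the execution are separated in time, which is the whole point: the code that handles a request is described once and invoked whenever the loop sees a matching event.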

The second common abstraction is the promise. Where callbacks are hooks waiting for future events, promises are references to future values. They look like this:

readFile()
    ->then(function(string $content) {
        print "content: " . $content;
    })
    ->catch(function(Exception $e) {
        print "error: " . $e->getMessage();
    });

This is a bit more code than callbacks alone, but it's an interesting approach. We wait for something to happen, and then do another thing. If something goes wrong, we catch the error and respond sensibly. It looks simple, but it isn't talked about nearly enough.

We're still using callbacks, but we've wrapped them in an abstraction that helps us in other ways. One benefit is that promises allow multiple resolution callbacks...

$promise = readFile();
$promise->then(...)->catch(...);

// ...let's add logging to the existing code

$promise->then(function(string $content) use ($logger) {
    $logger->info("file was read");
});
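If it helps to see what sits behind then, here is a minimal sketch of a promise that supports several resolution callbacks. TinyPromise is a hypothetical teaching class, not the ReactPHP or Amp implementation; it uses otherwise for the error hook, and it skips the chaining of transformed values that real promise libraries provide.

```php
<?php
// TinyPromise is a hypothetical teaching class, not ReactPHP's or Amp's API.
final class TinyPromise
{
    private $state = "pending";
    private $value;
    private $onResolve = [];
    private $onReject = [];

    // Register a success callback; run it at once if already resolved.
    public function then(callable $callback): self
    {
        if ($this->state === "resolved") {
            $callback($this->value);
        } elseif ($this->state === "pending") {
            $this->onResolve[] = $callback;
        }

        return $this;
    }

    // Register a failure callback; run it at once if already rejected.
    public function otherwise(callable $callback): self
    {
        if ($this->state === "rejected") {
            $callback($this->value);
        } elseif ($this->state === "pending") {
            $this->onReject[] = $callback;
        }

        return $this;
    }

    public function resolve($value): void
    {
        $this->settle("resolved", $value, $this->onResolve);
    }

    public function reject(Throwable $error): void
    {
        $this->settle("rejected", $error, $this->onReject);
    }

    // A promise settles once; later calls are ignored.
    private function settle(string $state, $value, array $callbacks): void
    {
        if ($this->state !== "pending") {
            return;
        }

        $this->state = $state;
        $this->value = $value;

        foreach ($callbacks as $callback) {
            $callback($value);
        }
    }
}

$log = [];
$promise = new TinyPromise();

$promise->then(function (string $content) use (&$log) {
    $log[] = "content: " . $content;
});

// A second, independent callback on the same future value.
$promise->then(function () use (&$log) {
    $log[] = "file was read";
});

$promise->resolve("hello");

print implode("\n", $log) . "\n";
```

Both callbacks fire from a single resolve call, which is what makes it possible to bolt logging onto existing code without touching the first callback.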

There's something else I'd like us to focus on: promises provide a common language, a common abstraction, for thinking about how synchronous code can become asynchronous code.

Let's take some application code and make it asynchronous, using promises...

Create PDF files

It is common for applications to generate some sort of summary documents—whether it is an invoice or inventory list. Suppose you have an e-commerce application that processes payments via Stripe. When a customer purchases an item, you want them to be able to download a PDF receipt for the transaction.

You can do this in a number of ways, but a very simple way is to generate the document using HTML and CSS. You can convert it to a PDF document and allow customers to download it.

I needed to do something similar recently. I found that there weren't many good libraries to support this operation, and I couldn't find a single abstraction that would let me switch between different HTML → PDF engines. So I started building one myself.

I started by thinking about what my abstraction needed to do. I settled on an interface that looks like this:

interface Driver
{
    public function html($html = null);
    public function size($size = null);
    public function orientation($orientation = null);
    public function dpi($dpi = null);
    public function render();
}

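For the sake of simplicity, I wanted every method except render to act as both getter and setter. Given this set of expected methods, the next step was to create an implementation using one possible engine. I added Dompdf to my project and set about using it:

use Dompdf\Dompdf;
use Dompdf\Options;

class DomDriver extends BaseDriver implements Driver
{
    private $options;

    public function __construct(array $options = [])
    {
        $this->options = $options;
    }

    public function render()
    {
        $data = $this->data();
        $custom = $this->options;

        // Everything inside the closure runs in the parallel worker.
        return $this->parallel(
            function() use ($data, $custom) {
                $options = new Options();

                $options->set(
                    "isJavascriptEnabled", true
                );

                $options->set(
                    "isHtml5ParserEnabled", true
                );

                $options->set("dpi", $data["dpi"]);

                // Custom options override the defaults above.
                foreach ($custom as $key => $value) {
                    $options->set($key, $value);
                }

                $engine = new Dompdf($options);

                $engine->setPaper(
                    $data["size"], $data["orientation"]
                );

                $engine->loadHtml($data["html"]);
                $engine->render();

                return $engine->output();
            }
        );
    }
}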

I won't go into the details of how to use Dompdf. Its documentation does that well enough, which lets me focus on the async parts of this implementation.

We'll look at the data and parallel methods in a bit. The important thing about this Driver implementation is that it gathers the data (the values that have been set, or the defaults otherwise) and the custom options together, and passes them to the callback we want to run asynchronously.

Dompdf isn't an asynchronous library, and converting HTML to PDF is a very slow process. So how do we make it asynchronous? Well, we could write a completely asynchronous converter, or we could use an existing synchronous converter but run it in a parallel thread or process.

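That's what the parallel method is for:

abstract class BaseDriver implements Driver
{
    protected $html = "";
    protected $size = "A4";
    protected $orientation = "portrait";
    protected $dpi = 300;

    public function html($html = null)
    {
        return $this->access("html", $html);
    }

    // Getter when called without a value, fluent setter otherwise.
    private function access($key, $value = null)
    {
        if (is_null($value)) {
            return $this->$key;
        }

        $this->$key = $value;
        return $this;
    }

    public function size($size = null)
    {
        return $this->access("size", $size);
    }

    public function orientation($orientation = null)
    {
        return $this->access("orientation", $orientation);
    }

    public function dpi($dpi = null)
    {
        return $this->access("dpi", $dpi);
    }

    // Collects the document properties into one array for the closure.
    protected function data()
    {
        return [
            "html" => $this->html,
            "size" => $this->size,
            "orientation" => $this->orientation,
            "dpi" => $this->dpi,
        ];
    }

    protected function parallel(Closure $deferred)
    {
        // TODO
    }
}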

Here I've implemented the getter/setter methods, figuring I could reuse them for the next implementation. The data method is a shortcut that collects the various document properties into an array, making them easier to pass to anonymous functions.
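As a quick illustration of how such combined getter/setter methods read at the call site, here is a small self-contained sketch. The Document class is a stand-in I made up so the snippet runs on its own; it mirrors the access idea rather than reproducing the driver.

```php
<?php
// Document is a hypothetical stand-in that mirrors the access() pattern.
final class Document
{
    private $size = "A4";
    private $dpi = 300;

    public function size($size = null)
    {
        return $this->access("size", $size);
    }

    public function dpi($dpi = null)
    {
        return $this->access("dpi", $dpi);
    }

    // No argument: return the current value. With an argument: set and chain.
    private function access($key, $value = null)
    {
        if (is_null($value)) {
            return $this->$key;
        }

        $this->$key = $value;
        return $this;
    }
}

$document = new Document();
$defaultSize = $document->size();       // getter: no argument

$document->size("A3")->dpi(150);        // setters: chainable

print $defaultSize . " -> " . $document->size() . " at " . $document->dpi() . " dpi\n";
```

One trade-off of this design is that null can never be stored as a value, since null is what signals "read". For paper size, orientation and DPI that seems an acceptable limitation.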

The parallel method is where things start to get interesting. It hands the closure off to run in a separate thread or process and returns a Promise for the result.

I really like the Amp project. It's a collection of libraries that support asynchronous architecture, and its maintainers are key proponents of the async-interop project.

One of their libraries is called amphp/parallel, and it supports multi-threaded and multi-process code (via the Pthreads and Process Control extensions). Its spawn methods return Amp's Promise implementation, which means the render method can be used like any other method that returns a Promise.

Code that consumes it can look a bit complicated. Amp also provides an event loop implementation and all the helper code needed to turn ordinary PHP generators into coroutines and promises. You can read in another post I wrote how this is even possible and how it relates to PHP's generators.

The returned promises are also being standardized. Amp returns an implementation of the Promise specification. It's slightly different from the code I showed above, but it serves the same purpose.

The generator works like a coroutine in a language with coroutines. Coroutines are functions that can be interrupted, meaning they can be used to perform short-term operations and then pause while waiting for something. During pause, other functions can use system resources.
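The pausing behaviour is easy to see with nothing but PHP's built-in generators. The sketch below is a toy round-robin scheduler of my own (task and runAll are illustrative names, and this is not how Amp's event loop is implemented): each task runs until its next yield, then lets the other task take a turn.

```php
<?php
// Each task is a generator; every `yield` is a point where it pauses.
function task(string $name, int $steps, array &$log): Generator
{
    for ($i = 1; $i <= $steps; $i++) {
        $log[] = "$name step $i";
        yield; // pause here so other tasks can use the time
    }
}

// A toy round-robin scheduler (illustrative, not Amp's event loop).
function runAll(array $tasks): void
{
    // Start every task: current() runs a generator up to its first yield.
    foreach ($tasks as $task) {
        $task->current();
    }

    while ($tasks) {
        foreach ($tasks as $key => $task) {
            $task->next(); // resume until the next yield or the end

            if (!$task->valid()) {
                unset($tasks[$key]); // finished tasks leave the queue
            }
        }
    }
}

$log = [];

runAll([
    task("A", 2, $log),
    task("B", 3, $log),
]);

print implode("\n", $log) . "\n";
```

The log interleaves the two tasks (A, B, A, B, B) even though neither function knows about the other, which is the property an event loop builds on.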

In practice, a coroutine yields a promise, pauses, and resumes once that promise has resolved.

This seems much more complicated than just writing synchronous code in the first place. What it allows, though, is for other things to happen while we wait for a promise-returning function such as funcReturnsPromise to complete.

Yielding promises is exactly the kind of abstraction we've been talking about. It gives us a framework within which to write functions that return promises, and code can interact with those promises in predictable, understandable ways.
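Here is a minimal sketch of that yield-and-resume cycle, again in plain PHP. Everything in it is an assumption made for illustration: the "promise" is just a closure that produces its value when called, and run is a toy runner rather than Amp's coroutine helper.

```php
<?php
// Hypothetical stand-in for a function that returns a promise:
// here a "promise" is simply a closure that yields its value when called.
function funcReturnsPromise(string $value): Closure
{
    return function () use ($value) {
        return strtoupper($value);
    };
}

// Toy coroutine runner: settles each yielded "promise" and sends the
// result back into the generator, which resumes where it paused.
function run(Generator $coroutine)
{
    while ($coroutine->valid()) {
        $pending = $coroutine->current();
        $coroutine->send($pending());
    }

    return $coroutine->getReturn();
}

$result = run((function () {
    // Each yield reads like a blocking call, but control leaves here.
    $first = yield funcReturnsPromise("hello");
    $second = yield funcReturnsPromise("world");

    return "$first $second";
})());

print $result . "\n";
```

A real runner would not settle the promise on the spot. It would register a callback and return to the event loop, resuming the generator later, but the shape of the consuming code stays the same.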

With all of this in place, rendering a PDF document with our driver comes down to calling render and waiting on the promise it returns.

That isn't as useful as generating PDFs inside an asynchronous HTTP server. There's an Amp library called Aerys that makes these kinds of servers easier to create, and the same render call can sit inside Aerys HTTP server code.

Again, I'm not going to go into the details of Aerys here. It's an impressive piece of software, well deserving of its own article. You don't need to understand how Aerys works to see how natural our converter's code looks alongside it.

My boss said "Don't use asynchronous!"

If you aren't sure when you'll get to build an asynchronous application, why go to all this effort? Writing this code gives us valuable insight into a new programming paradigm. And just because we write this code asynchronously doesn't mean it can't work in synchronous environments.

To use this code in a synchronous application, all we need to do is move some of the asynchronous code inside, behind a decorator that implements the same Driver interface and waits for the result.

With this decorator, we can write code that looks synchronous.

It still runs the code asynchronously (in the background, at least), but none of that is exposed to the consumer. You could use this in a synchronous application and never know what was going on under the hood.
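A minimal sketch of the decorator idea, under stated assumptions: the Driver interface is cut down to render so the snippet stands alone, LazyPromise is a made-up stand-in whose wait method simply runs the deferred task, and FakeAsyncDriver returns a placeholder string instead of real PDF data.

```php
<?php
// Reduced to render() so the sketch is self-contained.
interface Driver
{
    public function render();
}

// Hypothetical promise stand-in: wait() blocks until the value is ready.
final class LazyPromise
{
    private $task;

    public function __construct(callable $task)
    {
        $this->task = $task;
    }

    public function wait()
    {
        return ($this->task)();
    }
}

// Pretend async driver: render() returns a promise, not the PDF itself.
final class FakeAsyncDriver implements Driver
{
    public function render()
    {
        return new LazyPromise(function () {
            return "%PDF-fake";
        });
    }
}

// The decorator: same interface, but render() returns the finished value.
final class SyncDriver implements Driver
{
    private $decorated;

    public function __construct(Driver $decorated)
    {
        $this->decorated = $decorated;
    }

    public function render()
    {
        return $this->decorated->render()->wait();
    }
}

$driver = new SyncDriver(new FakeAsyncDriver());
$pdf = $driver->render(); // looks like an ordinary blocking call

print $pdf . "\n";
```

Because the decorator implements the same interface, calling code can swap between the asynchronous driver and the synchronous wrapper without changing anything but the constructor call.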

Support other frameworks

Amp has some specific requirements that make it unsuitable for some environments. For example, the base Amp (event loop) library requires PHP 7.0, and the parallel library requires either the Pthreads extension or the Process Control extension.

I didn't want to impose these restrictions on everyone, and wondered how I could support a wider range of systems. The answer was to abstract the parallel execution code into another driver system, made up of runners.

I could then implement a runner for Amp as well as for the (less restrictive, but older) ReactPHP.

I'm used to passing closures to multi-threaded and multi-process workers, because that's how Pthreads and Process Control work. Using ReactPHP Process objects is entirely different, because they rely on exec for multi-process execution. I decided to implement the same closure functionality I'm used to. That isn't essential to asynchronous code; it's purely a matter of taste.

The SuperClosure library serializes closures and their bound variables. Most of the runner's code is what you'd expect to find inside a worker script. In fact, the only way to use ReactPHP's child process library (apart from serializing closures) is to send tasks to a worker script.
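To show the worker-script approach without pulling in ReactPHP or SuperClosure, here is a rough sketch using only PHP's built-in proc_open. It sends serialized task data (not a closure) to a child PHP process. The inline worker code, the task array and the upper-casing "conversion" are all placeholders of mine, and the quoting assumes a Unix-like shell.

```php
<?php
// Worker code kept inline so the sketch fits in one file (an assumption;
// a real project would point at a worker script on disk instead).
$worker = <<<'PHP'
$task = unserialize(base64_decode($argv[1]), ["allowed_classes" => false]);
echo strtoupper($task["html"]);
PHP;

// The task is plain data, so the built-in serializer is enough here.
$task = base64_encode(serialize(["html" => "<h1>receipt</h1>"]));

$command = implode(" ", [
    escapeshellarg(PHP_BINARY),
    "-r",
    escapeshellarg($worker),
    escapeshellarg($task),
]);

// Start the child process; only its stdout is piped back to us.
$process = proc_open($command, [1 => ["pipe", "w"]], $pipes);

if (!is_resource($process)) {
    throw new RuntimeException("could not start the worker process");
}

// The parent is free to do other work while the child runs.
$parentWork = array_sum(range(1, 100));

// Collect the result (this read blocks until the child has finished).
$pdf = stream_get_contents($pipes[1]);
fclose($pipes[1]);
proc_close($process);

print $pdf . "\n";
```

In an event-loop setting the final read would be non-blocking and wrapped in a promise, but the division of labour is the same: the parent describes the task as data, and the worker does the slow part in its own process.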

Now, instead of loading our drivers up with $this->parallel and Amp-specific code, we can pass a runner implementation to them. Asynchronous code that uses a driver looks much the same as before.

Don't be alarmed by the difference between the ReactPHP code and the Amp code. ReactPHP doesn't implement the same coroutine foundation that Amp does. Instead, ReactPHP prefers callbacks for most things. The code still just runs the PDF conversion in parallel and returns the resulting PDF data.

With the runners abstracted, we can use any asynchronous framework we want, and we can expect the driver to return that framework's own abstractions.
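The shape of that abstraction can be sketched in a few lines. The names below (Runner, SyncRunner, UpperCaseDriver) are illustrative guesses, not the library's actual classes, and the only runner shown simply executes the closure in the current process so the example runs without any extensions.

```php
<?php
// Names below (Runner, SyncRunner, UpperCaseDriver) are illustrative.
interface Runner
{
    // Takes the deferred work and returns whatever the framework
    // behind the runner uses to represent the eventual result.
    public function run(Closure $deferred);
}

// Simplest possible runner: executes immediately in this process.
final class SyncRunner implements Runner
{
    public function run(Closure $deferred)
    {
        return $deferred();
    }
}

// A pretend driver: upper-casing stands in for the slow PDF conversion.
final class UpperCaseDriver
{
    private $html = "";

    public function html(string $html): self
    {
        $this->html = $html;
        return $this;
    }

    // The driver no longer knows how the work is executed.
    public function render(Runner $runner)
    {
        $html = $this->html;

        return $runner->run(function () use ($html) {
            return strtoupper($html);
        });
    }
}

$driver = (new UpperCaseDriver())->html("<h1>hi</h1>");
$result = $driver->render(new SyncRunner());

print $result . "\n";
```

An Amp-backed or ReactPHP-backed runner would implement the same run method but return that framework's promise, which is why the driver's return type is deliberately left open here.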

Can I use this?

What began as just an experiment has become an HTML → PDF library with multiple drivers and multiple runners, called Paper. It's like Flysystem for HTML → PDF conversion, but it's also a great example of how to write an asynchronous library.

When you try to make an asynchronous PHP application, you will find gaps in the library ecosystem. Don't be intimidated by these! Instead, take the opportunity to think about how you will use the abstractions provided by ReactPHP and Amp to make your own asynchronous libraries.

Have you built an interesting asynchronous PHP application or library recently? Please let us know in the comments.

FAQ on Asynchronous HTML to PDF Conversion

What is the meaning of asynchronous conversion of HTML to PDF?

Asynchronous conversion means the HTML to PDF engine does its work without blocking: it runs in the background while the rest of your code continues executing, and you are notified when the operation completes. This leads to more efficient use of resources and improved performance, especially in applications with many slow operations such as converting HTML to PDF.

How does ReactPHP help in creating asynchronous libraries?

ReactPHP is a low-level library for event-driven programming in PHP. It provides the core infrastructure for creating asynchronous libraries in PHP. With ReactPHP, you can write non-blocking code using PHP's familiar syntax, making it easier to create high-performance applications.

What are the steps involved in asynchronous conversion of HTML to PDF?

The process of asynchronous conversion of HTML to PDF involves several steps. First, you need to set up an HTML template that defines the structure and content of the PDF. Next, you use asynchronous libraries like ReactPHP to handle the conversion process. This includes reading the HTML file, converting it to a PDF, and then saving the generated PDF file. The asynchronous nature of this process means that your application can continue to perform other tasks while the transformation is in progress.

Can I program asynchronously using a language other than PHP?

Yes, you can program asynchronously in other languages. For example, Node.js is a popular choice for building asynchronous applications due to its event-driven architecture. However, if you are already familiar with PHP, libraries like ReactPHP allow you to easily take advantage of asynchronous programming without having to learn new languages.

How to handle errors during asynchronous conversion of HTML to PDF?

Error handling is an important aspect of asynchronous programming. With ReactPHP promises, you handle errors by attaching a rejection handler to the promise, for example as the second argument to then(). If an error occurs during the conversion process, this handler is called, allowing you to log the error or take other appropriate action.

What are the benefits of converting HTML to PDF?

There are many benefits to converting HTML to PDF. It allows you to create a static, portable version of a web page that can be viewed offline, printed, or shared easily. The PDF also retains the format and layout of the original HTML, ensuring that the content looks the same regardless of the device or platform viewed on.

How to optimize the performance of my asynchronous PHP application?

There are several ways to optimize the performance of an asynchronous PHP application. One approach is to use libraries like ReactPHP, which provides a low-level interface for event-driven programming. This allows you to write non-blocking code, which can significantly improve the performance of I/O-intensive operations such as converting HTML to PDF.

Can I convert HTML to PDF synchronously?

Yes, HTML can be converted to PDF synchronously. However, this approach may block your application's execution until the conversion process is complete, which can cause performance issues for I/O-intensive applications. On the other hand, asynchronous conversion allows your application to continue performing other tasks while the conversion is in progress, resulting in better performance and resource utilization.

What are the challenges of asynchronous programming in PHP?

Asynchronous programming in PHP can be challenging because PHP is synchronous by nature. However, libraries like ReactPHP provide the infrastructure required to write non-blocking code in PHP. Understanding event-driven programming models and mastering promises can also be challenging, but both are key to getting the benefits of asynchronous programming.

How to test the performance of an asynchronous PHP application?

Testing the performance of an asynchronous PHP application means measuring key metrics such as response time, memory usage, and CPU utilization under different load conditions. Tools like Apache JMeter or Siege can be used to simulate load on an application and collect performance data. In addition, profiling tools like Xdebug can help you identify bottlenecks in your code and optimize it.
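For a quick first look before reaching for a load-testing tool, timing and peak memory can be captured with PHP's built-in functions. The measure helper below is a hypothetical name of mine, and the summed range is only a placeholder workload.

```php
<?php
// measure() is a hypothetical helper: it runs a task and reports
// wall-clock time plus the peak memory of the whole process so far.
function measure(callable $task): array
{
    $start = microtime(true);
    $result = $task();

    return [
        "result" => $result,
        "seconds" => microtime(true) - $start,
        "peak_bytes" => memory_get_peak_usage(true),
    ];
}

// Placeholder workload standing in for a PDF conversion.
$stats = measure(function () {
    return array_sum(range(1, 100));
});

printf(
    "result=%d time=%.6fs peak=%d bytes\n",
    $stats["result"],
    $stats["seconds"],
    $stats["peak_bytes"]
);
```

Numbers from a single run like this are noisy, so they are best treated as a sanity check rather than a benchmark.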
