When processing large volumes of data, streams in Node.js can bring major gains in performance and efficiency. Streams process data continuously and in chunks, so a file never has to be loaded into memory in full. This article explores the benefits of streams, using a practical example to show how to transform a large text file efficiently.
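Streams are an abstraction in Node.js for processing data in chunks instead of loading everything into memory at once. There are four main types of streams in Node.js: Readable (a source you read data from), Writable (a destination you write data to), Duplex (both readable and writable), and Transform (a Duplex stream that modifies the data as it passes through).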
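As a rough illustration of how the stream types fit together, the sketch below connects a Readable source, a Transform, a Duplex, and a Writable sink in one pipeline. The sample strings, the function name runDemo, and the choice of PassThrough as the Duplex are assumptions made for this illustration; stream/promises requires Node.js 15 or later.

```javascript
const { Readable, Writable, Transform, PassThrough } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical demo: runs three sample strings through one stream of each kind.
async function runDemo() {
  const collected = [];

  // Readable: a source of data (here, an in-memory list of strings).
  const source = Readable.from(['alpha\n', 'beta\n', 'gamma\n']);

  // Transform: a Duplex whose output is computed from its input.
  const upperCase = new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });

  // Duplex: PassThrough is a built-in Duplex that forwards data unchanged.
  const passThrough = new PassThrough();

  // Writable: a destination (here, it collects chunks into an array).
  const sink = new Writable({
    write(chunk, encoding, callback) {
      collected.push(chunk.toString());
      callback();
    },
  });

  // pipeline() resolves once all data has reached the sink.
  await pipeline(source, upperCase, passThrough, sink);
  return collected;
}

runDemo().then((collected) => console.log(collected.join('')));
```

The collected chunks are joined before printing because a stream makes no promise about where chunk boundaries fall.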
With streams, data is processed in chunks, so you never need to hold an entire file in memory. This is crucial for large files: memory use stays roughly proportional to the chunk size (64 KiB by default for fs.createReadStream) rather than to the file size, which avoids out-of-memory failures.
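To make the memory point concrete, here is a minimal sketch that reads a file chunk by chunk and records the largest chunk it ever held. The helper names measureChunks and demo, the temporary file, and its 1 MiB size are assumptions chosen for illustration; fs.rmSync requires Node.js 14.14 or later.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Reads a file chunk by chunk and reports what was seen,
// without ever holding the whole file in memory.
async function measureChunks(filePath, highWaterMark = 64 * 1024) {
  let chunkCount = 0;
  let largestChunk = 0;
  let totalBytes = 0;

  // Readable streams are async iterable, so for await...of yields one chunk at a time.
  for await (const chunk of fs.createReadStream(filePath, { highWaterMark })) {
    chunkCount += 1;
    totalBytes += chunk.length;
    largestChunk = Math.max(largestChunk, chunk.length);
  }
  return { chunkCount, largestChunk, totalBytes };
}

// Hypothetical demo: writes a 1 MiB temporary file, measures it, then cleans up.
async function demo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'chunk-demo-'));
  const filePath = path.join(dir, 'data.txt');
  fs.writeFileSync(filePath, 'x'.repeat(1024 * 1024));
  try {
    return await measureChunks(filePath);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

demo().then((stats) => console.log(stats));
```

No chunk exceeds the highWaterMark, so the same loop would behave identically on a multi-gigabyte file.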
Streams allow continuous data processing. For example, you can start processing the first chunks of data while you are still receiving the next ones, which results in a shorter total processing time.
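The sketch below illustrates this overlap with a deliberately slow source. The generator slowSource, its 10 ms delay, and the three sample chunks are assumptions made for the illustration; timers/promises requires Node.js 15 or later.

```javascript
const { Readable } = require('stream');
const { setTimeout: sleep } = require('timers/promises');

// Hypothetical slow producer: each chunk only becomes available after a short wait,
// much like data arriving from a network or disk.
async function* slowSource(events) {
  for (const part of ['first', 'second', 'third']) {
    await sleep(10);
    events.push(`produced ${part}`);
    yield part;
  }
}

// Consumes the stream and records the order in which chunks
// were produced and processed.
async function runDemo() {
  const events = [];
  for await (const chunk of Readable.from(slowSource(events))) {
    events.push(`processed ${chunk}`);
  }
  return events;
}

runDemo().then((events) => console.log(events));
```

In the recorded order, the first chunk is processed before the last one has even been produced, which is the overlap the paragraph above describes.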
Because stream I/O is asynchronous, the Node.js event loop is not blocked while data is being read or written, so the application stays responsive even during I/O-intensive operations.
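One way to observe this is to count how often an unrelated callback gets to run while a file is being read. The sketch below compares a synchronous read with a streamed read; the helper names countTicksDuring and compareReads, the temporary file, and its 4 MiB size are assumptions for illustration only.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Counts how often the event loop gets to run an unrelated callback
// (a self-rescheduling setImmediate) while `work` is in progress.
async function countTicksDuring(work) {
  let ticks = 0;
  let running = true;
  const tick = () => {
    if (!running) return;
    ticks += 1;
    setImmediate(tick);
  };
  setImmediate(tick);
  await work();
  running = false;
  return ticks;
}

// Hypothetical comparison on a 4 MiB temporary file.
async function compareReads() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'loop-demo-'));
  const filePath = path.join(dir, 'data.txt');
  fs.writeFileSync(filePath, 'x'.repeat(4 * 1024 * 1024));
  try {
    // Synchronous read: nothing else can run until it returns.
    const syncTicks = await countTicksDuring(() => {
      fs.readFileSync(filePath);
    });

    // Streamed read: control returns to the event loop between chunks.
    const streamTicks = await countTicksDuring(async () => {
      for await (const chunk of fs.createReadStream(filePath)) {
        void chunk; // a real program would process the chunk here
      }
    });

    return { syncTicks, streamTicks };
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

compareReads().then((result) => console.log(result));
```

The synchronous read leaves no room for the counter to run, while the streamed read lets it run between chunks. Note that the work done on each chunk is still synchronous, so a very expensive transform can block the loop for the duration of one chunk.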
Before starting, let's create a large text file for testing. You can use the following Python script to generate a file of roughly 10 GB:
# generator.py

# File size in bytes (roughly 10 GB)
file_size = 10000 * 1024 * 1024

# Line that is written repeatedly to the file
line = "This is a line of text to be transformed. Adding more text to increase the size of each line.\n"

# Number of lines needed to fill the file
num_lines = file_size // len(line)

# Create and write the file
file_path = "large-input.txt"
with open(file_path, "w") as file:
    for _ in range(num_lines):
        file.write(line)

print(f"File created successfully at {file_path}")
To run the above script, save it as generator.py and run it using the command:
python3 generator.py
Here is the Node.js code that converts the content of large-input.txt to uppercase and saves the result in large-output.txt. It also logs progress at every 10% and the total processing time.
// src/index.js
const fs = require('fs');
const { Transform, pipeline } = require('stream');
const { performance } = require('perf_hooks');

// Paths to the input and output files
const inputFile = 'large-input.txt';
const outputFile = 'large-output.txt';

// Progress-tracking variables
let totalSize = 0;
let processedSize = 0;
let lastLoggedProgress = 0;
let processedLines = 0;
const startTime = performance.now();

// Transform stream that upper-cases each chunk and tracks progress
const upperCaseTransform = new Transform({
  transform(chunk, encoding, callback) {
    // chunk arrives as a Buffer, so chunk.length is a byte count comparable to stats.size
    processedSize += chunk.length;

    const text = chunk.toString();
    processedLines += text.split('\n').length - 1;

    // Log progress each time at least another 10% has been processed
    const progress = (processedSize / totalSize) * 100;
    if (progress >= lastLoggedProgress + 10) {
      console.log(
        `Progress: ${Math.floor(progress)}%, lines processed: ${processedLines}`
      );
      lastLoggedProgress = Math.floor(progress);
    }

    // Pass the upper-cased chunk downstream
    callback(null, text.toUpperCase());
  },
});

fs.stat(inputFile, (err, stats) => {
  if (err) {
    console.error('Error reading file information:', err);
    return;
  }
  totalSize = stats.size;

  // The streams are created only after stat succeeds, so a missing input file
  // does not leave an empty output file behind or raise an unhandled 'error' event.
  // pipeline() connects the streams and reports an error from any of them.
  pipeline(
    // encoding: 'utf8' keeps multi-byte characters from being split across chunks
    fs.createReadStream(inputFile, { encoding: 'utf8' }),
    upperCaseTransform,
    fs.createWriteStream(outputFile),
    (pipelineErr) => {
      if (pipelineErr) {
        console.error('Error during transformation:', pipelineErr);
        return;
      }
      const timeTaken = ((performance.now() - startTime) / 1000).toFixed(2);
      console.log('Transformation complete and file saved.');
      console.log(`Total lines processed: ${processedLines}`);
      console.log(`Total time: ${timeTaken} seconds`);
    }
  );
});
Streams are a powerful tool in Node.js for manipulating large volumes of data. Using streams, you can process files efficiently, keeping the application responsive and avoiding memory issues. The example above demonstrates how to transform a large text file using streams, displaying the progress and total time of the process.
For more details and access to the full code, visit my GitHub repository.