
Streamline File Uploads in NestJS: Efficient In-Memory Parsing for CSV & XLSX Without Disk Storage

Effortless File Parsing in NestJS: Manage CSV and XLSX Uploads in Memory for Speed, Security, and Scalability

Introduction

Handling file uploads in a web application is a common task, but dealing with different file types and ensuring they are processed correctly can be challenging. Often, developers need to parse uploaded files without saving them to the server, which is especially important for reducing server storage costs and ensuring that sensitive data is not unnecessarily retained. In this article, we’ll walk through the process of creating a custom NestJS module to handle file uploads specifically for CSV and XLS/XLSX files, and we’ll parse these files in memory using Node.js streams, so no static files are created on the server.

Why NestJS?

NestJS is a progressive Node.js framework that leverages TypeScript and provides an out-of-the-box application architecture that enables you to build highly testable, scalable, loosely coupled, and easily maintainable applications. By using NestJS, we can take advantage of its modular structure, powerful dependency injection system, and extensive ecosystem.

Step 1: Setting Up the Project

Before we dive into the code, let’s set up a new NestJS project. If you haven’t already, install the NestJS CLI:

npm install -g @nestjs/cli

Create a new NestJS project:

nest new your-super-name

Navigate into the project directory:

cd your-super-name

Step 2: Installing Required Packages

We’ll need to install some additional packages to handle file uploads and parsing:

npm install @nestjs/platform-express multer exceljs file-type
  • Multer: A middleware for handling multipart/form-data, which is primarily used for uploading files.
  • ExcelJS: A powerful library for reading and writing CSV and XLSX files.
  • File-Type: A library for detecting the file type of a stream or buffer.

Step 3: Creating the Multer Storage Engine Without Saving Files

To customize the file upload process, we’ll create a custom Multer storage engine. This engine will ensure that only CSV and XLS/XLSX files are accepted, parse them in memory using Node.js streams, and return the parsed data without saving any files to disk.

Create a new file for our engine:

import { PassThrough } from 'stream';
import * as fileType from 'file-type';
import { BadRequestException } from '@nestjs/common';
import { Request } from 'express';
import { Workbook } from 'exceljs';
import { createParserCsvOrXlsx } from './parser-factory.js';

const ALLOWED_MIME_TYPES: readonly string[] = [
  'text/csv',
  'text/comma-separated-values',
  'application/vnd.ms-excel',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
];

export class CsvOrXlsxMulterEngine {
  private destKey: string;
  private maxFileSize: number;
  constructor(opts: { destKey: string; maxFileSize: number }) {
    this.destKey = opts.destKey;
    this.maxFileSize = opts.maxFileSize;
  }
  async _handleFile(req: Request, file: any, cb: any) {
    try {
      const contentLength = Number(req.headers['content-length']);
      if (
        typeof contentLength === 'number' &&
        contentLength > this.maxFileSize
      ) {
        throw new Error(`Max file size is ${this.maxFileSize} bytes.`);
      }
      const fileStream = await fileType.fileTypeStream(file.stream);
      const mime = fileStream.fileType?.mime ?? file.mimetype;
      if (!ALLOWED_MIME_TYPES.includes(mime)) {
        throw new BadRequestException('File must be *.csv or *.xlsx');
      }
      const replacementStream = new PassThrough();
      fileStream.pipe(replacementStream);
      const parser = createParserCsvOrXlsx(mime);
      const data = await parser.read(replacementStream);
      cb(null, {
        [this.destKey]:
          mime === 'text/csv' ? data : (data as Workbook).getWorksheet(),
      });
    } catch (error) {
      cb(error);
    }
  }
  _removeFile(req: Request, file: any, cb: any) {
    cb(null);
  }
}

This custom storage engine checks the file’s MIME type and ensures it’s either a CSV or XLS/XLSX file. It then processes the file entirely in memory using Node.js streams, so no temporary files are created on the server. This approach is both efficient and secure, especially when dealing with sensitive data.

Step 4: Creating the Parser Factory

The parser factory is responsible for determining the appropriate parser based on the file type.

Create a new file for our parser:

import excel from 'exceljs';

export function createParserCsvOrXlsx(mime: string) {
  const workbook = new excel.Workbook();
  return [
    'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    'application/vnd.ms-excel',
  ].includes(mime)
    ? workbook.xlsx
    : workbook.csv;
}

This factory function checks the MIME type and returns the appropriate parser (either xlsx or csv).
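
For context, exceljs’s csv.read(...) resolves to a single Worksheet, while xlsx.read(...) resolves to the whole Workbook, which is why the engine from Step 3 calls getWorksheet() only for XLSX uploads. Here is a minimal standalone sketch of using the returned parser (the stream and mime values are hypothetical, not part of the module itself):

import { Workbook, Worksheet } from 'exceljs';
import { Readable } from 'stream';
import { createParserCsvOrXlsx } from './parser-factory.js';

// Hypothetical helper: parse an upload stream and always hand back a Worksheet.
async function parseUpload(stream: Readable, mime: string): Promise<Worksheet | undefined> {
  const parser = createParserCsvOrXlsx(mime);
  const data = await parser.read(stream);
  // csv.read() already yields a Worksheet; xlsx.read() yields the Workbook.
  return mime === 'text/csv' ? (data as Worksheet) : (data as Workbook).getWorksheet();
}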

Step 5: Configuring Multer in the NestJS Controller

Next, let’s create a controller to handle file uploads using our custom storage engine.

Generate a new controller:

nest g controller files

In the files.controller.ts, configure the file upload using Multer and the custom storage engine:

import {
  Controller,
  Post,
  UploadedFile,
  UseInterceptors,
} from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { Worksheet } from 'exceljs';
import { CsvOrXlsxMulterEngine } from '../../shared/multer-engines/csv-xlsx/engine.js';
import { FilesService } from './files.service.js';

const MAX_FILE_SIZE_IN_BYTES = 1_000_000_000; // ~1 GB in bytes, intentionally large for testing

@Controller('files')
export class FilesController {
  constructor(private readonly filesService: FilesService) {}
  @UseInterceptors(
    FileInterceptor('file', {
      storage: new CsvOrXlsxMulterEngine({
        maxFileSize: MAX_FILE_SIZE_IN_BYTES,
        destKey: 'worksheet',
      }),
    }),
  )
  @Post()
  create(@UploadedFile() data: { worksheet: Worksheet }) {
    return this.filesService.format(data.worksheet);
  }
}

This controller sets up an endpoint to handle file uploads. The uploaded file is processed by the CsvOrXlsxMulterEngine, and the parsed data is returned in the response without ever being saved to disk.
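
If you want to smoke-test the endpoint before building a page for it, a simple curl request will do. This assumes the app listens on the default port 3000 and that no global route prefix is set; with the api prefix used in Step 7, the path becomes /api/files, and example.csv stands in for any local CSV file:

curl -F "file=@example.csv" http://localhost:3000/files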

Step 6: Setting Up the Module

Finally, we need to set up a module to include our controller.

Generate a new module:

nest g module files

In the files.module.ts, import the controller:

import { Module } from '@nestjs/common';
import { FilesController } from './files.controller.js';
import { FilesService } from './files.service.js';

@Module({
  providers: [FilesService],
  controllers: [FilesController],
})
export class FilesModule {}

Make sure to also import this module into your AppModule; the full AppModule configuration, including static file serving, is shown in Step 7 below.

Step 7: Testing the File Upload with HTML

To test the file upload functionality, we can create a simple HTML page that allows users to upload CSV or XLS/XLSX files. This page will send the file to our /api/files endpoint, where it will be parsed and processed in memory.

Here’s a basic HTML file for testing the upload; save it as index.html in the public directory we’ll serve with @nestjs/serve-static below:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>File Upload</title>
</head>
<body>
    <h1>Upload a File (CSV or XLSX)</h1>
    <form action="/api/files" method="post" enctype="multipart/form-data">
        <label for="file">Choose file:</label>
        <input type="file" id="file" name="file" accept=".csv, .xlsx" required>
        <br><br>
        <button type="submit">Upload</button>
    </form>
</body>
</html>
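
Note that the form posts to /api/files while the controller from Step 5 is registered under files. This assumes a global api prefix is configured in main.ts; a minimal sketch of such a bootstrap file (assuming the standard Nest CLI entry point) could look like this:

import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module.js';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Expose all controllers under /api so the form's action="/api/files" resolves.
  app.setGlobalPrefix('api');
  await app.listen(3000);
}
bootstrap();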

To render the HTML page for file uploads, we first need to install an additional NestJS module called @nestjs/serve-static. You can do this by running the following command:

npm install @nestjs/serve-static

After installing, we need to configure this module in AppModule:

import { Module } from '@nestjs/common';
import { join } from 'path';
import { ServeStaticModule } from '@nestjs/serve-static';
import { FilesModule } from './modules/files/files.module.js';

@Module({
  imports: [
    FilesModule,
    ServeStaticModule.forRoot({
      rootPath: join(new URL('..', import.meta.url).pathname, 'public'),
      serveRoot: '/',
    }),
  ],
})
export class AppModule {}

This setup will allow us to serve static files from the public directory. Now, open the file upload page by navigating to http://localhost:3000 in your browser.

[Screenshot: the file upload page served at http://localhost:3000]

Upload Your File

To upload a file, follow these steps:

  1. Choose a file by clicking on the ‘Choose file’ button.
  2. Click on the ‘Upload’ button to start the upload process.

Once the file is uploaded successfully, you should see a confirmation that the file has been uploaded and formatted.

[Screenshot: the response shown after a successful upload]

Note: I haven’t included code for formatting the uploaded file, as this depends on the library you choose for processing CSV or XLS/XLSX files. You can view the complete implementation on GitHub.
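
Purely for illustration, a format method could be as simple as flattening the exceljs Worksheet into plain row arrays. The sketch below is hypothetical and not the implementation from the repository:

import { Injectable } from '@nestjs/common';
import { Worksheet } from 'exceljs';

@Injectable()
export class FilesService {
  // Hypothetical formatter: collect each row's cell values into a plain array.
  format(worksheet: Worksheet) {
    const rows: unknown[][] = [];
    worksheet.eachRow({ includeEmpty: false }, (row) => {
      // row.values is 1-based; drop the empty first slot.
      rows.push((row.values as unknown[]).slice(1));
    });
    return rows;
  }
}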

Comparing Pros and Cons of In-Memory File Processing

When deciding whether to use in-memory file processing or saving files to disk, it’s important to understand the trade-offs.

Pros of In-Memory Processing:

No Temporary Files on Disk:

  • Security: Sensitive data isn’t left on the server’s disk, reducing the risk of data leaks.
  • Resource Efficiency: The server doesn’t need to allocate disk space for temporary files, which can be particularly useful in environments with limited storage.

Faster Processing:

  • Performance: Parsing files in memory can be faster since it eliminates the overhead of writing and reading files from disk.
  • Reduced I/O Operations: Fewer disk I/O operations mean lower latency and potentially higher throughput for file processing.

Simplified Cleanup:

  • No Cleanup Required: Since files aren’t saved to disk, there’s no need to manage or clean up temporary files, simplifying the codebase.

Cons of In-Memory Processing:

Memory Usage:

  • High Memory Consumption: Large files can consume significant amounts of memory, which might lead to out-of-memory errors if the server doesn’t have enough resources.
  • Scalability: Handling large files or multiple file uploads simultaneously may require careful memory management and scaling strategies.

File Size Limitations:

  • Limited by Memory: The maximum file size that can be processed is limited by the available memory on the server. This can be a significant drawback for applications dealing with very large files.

Complexity in Error Handling:

  • Error Management: Managing errors in streaming data can be more complex than handling files on disk, especially in cases where partial data might need to be recovered or analyzed.

When to Use In-Memory Processing:

Small to Medium Files: If your application deals with relatively small files, in-memory processing can offer speed and simplicity.

Security-Sensitive Applications: When handling sensitive data that shouldn’t be stored on disk, in-memory processing can reduce the risk of data breaches.

High-Performance Scenarios: Applications that require high throughput and minimal latency may benefit from the reduced overhead of in-memory processing.

When to Consider Disk-Based Processing:

Large Files: If your application needs to process very large files, disk-based processing may be necessary to avoid running out of memory.

Resource-Constrained Environments: In cases where server memory is limited, processing files on disk can prevent memory exhaustion and allow for better resource management.

Persistent Storage Needs: If you need to retain a copy of the uploaded file for auditing, backup, or later retrieval, saving files to disk is necessary.

Integration with External Storage Services: For large files, consider uploading them to external storage services like AWS S3, Google Cloud Storage, or Azure Blob Storage. These services allow you to offload storage from your server, and you can process the files in the cloud or retrieve them for in-memory processing as needed.

Scalability: Cloud storage solutions can handle massive files and provide redundancy, ensuring that your data is safe and easily accessible from multiple geographic locations.

Cost Efficiency: Using cloud storage can be more cost-effective for handling large files, as it reduces the need for local server resources and provides pay-as-you-go pricing.

Conclusion

In this article, we’ve created a custom file upload module in NestJS that handles CSV and XLS/XLSX files, parses them in memory, and returns the parsed data without saving any files to disk. This approach leverages the power of Node.js streams, making it both efficient and secure, as no temporary files are left on the server.

We’ve also explored the pros and cons of in-memory file processing versus saving files to disk. While in-memory processing offers speed, security, and simplicity, it’s important to consider the memory usage and potential file size limitations before adopting this approach.

Whether you’re building an enterprise application or a small project, handling file uploads and parsing correctly is crucial. With this setup, you’re well on your way to mastering file uploads in NestJS without worrying about unnecessary server storage or data security issues.

Feel free to share your thoughts and improvements in the comments section below!

If you enjoyed this article or found these tools useful, make sure to follow me on Dev.to for more insights and tips on coding and development. I regularly share helpful content to make your coding journey smoother.

Follow me on X (Twitter), where I share more interesting thoughts, updates, and discussions about programming and tech! Don't miss out - click those follow buttons.

You can also follow me on LinkedIn for professional insights, updates on my latest projects, and discussions about coding, tech trends, and more. Don't miss out on valuable content that can help you level up your development skills - let's connect!
