Text Detection in Images Using AWS Rekognition and Node.js

Aug 26, 2024 pm 09:35 PM

Hey everyone! In this article, we'll be creating a simple application to perform image text detection using AWS Rekognition with Node.js.

What is AWS Rekognition?

Amazon Rekognition is a service that makes it easy to add image and video analysis to your applications. It offers features like text detection, facial recognition, and even celebrity detection.
While Rekognition can analyze images or videos stored in S3, for this tutorial, we'll be working without S3 to keep things simple.
We'll be using Express for the backend and React for the frontend.

First Steps

Before we start, you'll need to create an AWS account and set up an IAM user. If you already have these, you can skip this section.

Creating IAM user

  • Log in to AWS: Start by logging into your AWS root account.
  • Search for IAM: In the AWS console, search for IAM and select it.
  • Go to the Users section and click Create User.
  • Set the user name, and under Set Permissions, choose Attach policies directly.
  • Search for and select a Rekognition policy (for example, the AWS managed policy AmazonRekognitionFullAccess), then click Next and create the user.
  • Create Access Keys: After creating the user, select the user, and under the Security credentials tab, create an access key. Be sure to download the .csv file containing your access key and secret access key.
  • For more detailed instructions, refer to the official AWS documentation: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html

Configuring the AWS CLI

  • Install AWS CLI: Install the AWS CLI on your system.
  • Verify Installation: Open a terminal or command prompt and type aws --version to ensure the CLI is installed correctly.
  • Configure the AWS CLI: Run aws configure and provide the access key and secret access key from the .csv file you downloaded, along with the AWS region you want to use (for example, us-east-1).

Project Directory

my-directory/
│
├── client/
│   ├── src/
│   │   └── App.jsx
│   ├── public/
│   ├── package.json
│   └── ... (other React project files)
│
└── server/
    ├── index.js
    └── rekognition/
        └── aws.rek.js

Setting up frontend

npm create vite@latest . -- --template react
Run this inside the client folder; it creates the React project there.

In App.jsx:

import { useState } from "react";

function App() {
  const [img, setImg] = useState(null);

  const handleImg = (e) => {
    setImg(e.target.files[0]);  // Store the selected image in state
  };

  const handleSubmit = (e) => {
    e.preventDefault();
    if (!img) return;

    const formData = new FormData();
    formData.append("img", img);
    console.log(formData.get("img")); // Log the selected file (logging formData itself shows no entries)
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input type="file" name="img" accept="image/*" onChange={handleImg} />
        <br />
        <button type="submit">Submit</button>
      </form>
    </div>
  );
}

export default App;

Let's test this out by ensuring the image is logged to the console after submitting.

Now let's move to the backend, the core of this project.

Initializing the backend

In the server folder:

npm init -y 
npm install express cors nodemon multer @aws-sdk/client-rekognition 
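I created a separate rekognition folder to hold the image-analysis logic, with a file named aws.rek.js inside it. The code below uses ES module import syntax, so also add "type": "module" to the server's package.json.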

//aws.rek.js

import {
  RekognitionClient,
  DetectTextCommand,
} from "@aws-sdk/client-rekognition";

// Credentials and region are read from the AWS CLI configuration
const client = new RekognitionClient({});

export const Reko = async (imageBytes) => {
  try {
    const command = new DetectTextCommand({
      Image: {
        Bytes: imageBytes, // we are using Bytes directly instead of S3
      },
    });
    return await client.send(command);
  } catch (error) {
    console.log(error.message);
    throw error; // rethrow so the caller can decide how to respond
  }
};

Explanation

  • We initialize a RekognitionClient object. Since the credentials and region were already set with aws configure, the SDK picks them up automatically and we can leave the braces empty.
  • We create an async function, Reko, to process the image. Inside it, we initialize a DetectTextCommand object, which takes the image as Bytes.
  • DetectTextCommand is the command used specifically for text detection.
  • The function waits for the response and returns it.
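
To make the response easier to picture, here is a minimal sketch of what it looks like and how it might be filtered. The sample object is hypothetical data that only mimics the documented shape of a DetectText response (a TextDetections array whose items carry DetectedText, Type, Id, and Confidence). The helper name extractLines and the default threshold of 80 are assumptions for illustration and are not part of the original project.

```javascript
// Hypothetical sample shaped like a DetectText response (not real output).
const sampleResponse = {
  TextDetections: [
    { Id: 0, Type: "LINE", DetectedText: "Hello World", Confidence: 99.1 },
    { Id: 1, Type: "LINE", DetectedText: "blurry", Confidence: 42.0 },
    { Id: 2, ParentId: 0, Type: "WORD", DetectedText: "Hello", Confidence: 99.3 },
    { Id: 3, ParentId: 0, Type: "WORD", DetectedText: "World", Confidence: 98.9 },
  ],
};

// Keep only whole lines at or above a confidence threshold.
// Rekognition reports each line and also each word inside it, so returning
// every DetectedText would list the same text twice.
function extractLines(response, minConfidence = 80) {
  const detections = response?.TextDetections ?? [];
  return detections
    .filter((d) => d.Type === "LINE" && d.Confidence >= minConfidence)
    .map((d) => d.DetectedText);
}

console.log(extractLines(sampleResponse)); // [ 'Hello World' ]
```

Whether to filter at all is a design choice: returning every detection, as this tutorial does, is simpler, while filtering by Type avoids showing each word a second time.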

Creating the API

In the server folder, create a file named index.js (or any name you prefer).

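//index.js

import express from "express";
import multer from "multer";
import cors from "cors";
import { Reko } from "./rekognition/aws.rek.js";

const app = express();
app.use(cors());

// Keep uploaded files in memory so req.file.buffer holds the image bytes
const storage = multer.memoryStorage();
const upload = multer({ storage });

app.post("/img", upload.single("img"), async (req, res) => {
  if (!req.file) {
    return res.status(400).send({ error: "No image uploaded" });
  }
  try {
    const data = await Reko(req.file.buffer);
    // Build a fresh list per request so texts from earlier uploads don't pile up
    const texts = (data.TextDetections ?? []).map((item) => item.DetectedText);
    res.status(200).send(texts);
  } catch (error) {
    console.log(error.message);
    res.status(500).send({ error: "Text detection failed" });
  }
});

app.listen(3000, () => {
  console.log("server started");
});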

Explanation

  • We initialize Express and start the server on port 3000.
  • We use multer to handle the multipart form data and keep the uploaded file temporarily in memory as a Buffer.
  • We create a POST route to receive the image from the user. Its handler is an async function.
  • After the user uploads the image, it is available in req.file.
  • req.file has several properties, including buffer, which holds the image data as a Buffer of raw bytes.
  • We pass req.file.buffer to the Reko function. After analyzing the image, it returns a response object whose TextDetections property is an array of objects.
  • We send the DetectedText value from each of those objects back to the user.
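
Before passing the buffer to Rekognition, it can help to reject uploads the service would refuse anyway. The sketch below assumes the constraints AWS documents for images sent as raw bytes (JPEG or PNG, up to 5 MB; please check the current limits before relying on them) and reads the mimetype and size fields that multer sets on req.file. The function name validateUpload is hypothetical and does not appear in the original code.

```javascript
// Assumed limits for images sent as raw bytes; verify against the AWS docs.
const MAX_BYTES = 5 * 1024 * 1024;
const ALLOWED_TYPES = ["image/jpeg", "image/png"];

// Returns null when the file looks acceptable, otherwise an error message.
function validateUpload(file) {
  if (!file) return "No image uploaded";
  if (!ALLOWED_TYPES.includes(file.mimetype)) {
    return "Only JPEG and PNG images are supported";
  }
  if (file.size > MAX_BYTES) return "Image is larger than 5 MB";
  return null;
}

// Plain objects standing in for multer's req.file
console.log(validateUpload({ mimetype: "image/png", size: 1024 })); // null
console.log(validateUpload({ mimetype: "image/gif", size: 1024 })); // Only JPEG and PNG images are supported
```

In the route, a check like this could run first: if it returns a message, respond with status 400 and skip the call to Reko.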

Coming back to frontend

import axios from "axios";
import { useState } from "react";
import "./App.css"; 

function App() {
  const [img, setImg] = useState(null);
  const [pending, setPending] = useState(false);
  const [texts, setTexts] = useState([]);

  const handleImg = (e) => {
    setImg(e.target.files[0]);
  };

  const handleSubmit = async (e) => {
    e.preventDefault();
    if (!img) return; 

    const formData = new FormData();
    formData.append("img", img);

    try {
      setPending(true);
      const response = await axios.post("http://localhost:3000/img", formData);
      setTexts(response.data);
    } catch (error) {
      console.log("Error uploading image:", error);
    } finally {
      setPending(false);
    }
  };

  return (
    <div className="app-container">
      <div className="form-container">
        <form onSubmit={handleSubmit}>
          <input type="file" name="img" accept="image/*" onChange={handleImg} />
          <br />
          <button type="submit" disabled={pending}>
            {pending ? "Uploading..." : "Upload Image"}
          </button>
        </form>
      </div>

      <div className="result-container">
        {pending && <h1>Loading...</h1>}
        {texts.length > 0 && (
          <ul>
            {texts.map((text, index) => (
              <li key={index}>{text}</li>
            ))}
          </ul>
        )}
      </div>
    </div>
  );
}

export default App;
  • We use Axios (install it in the client folder with npm install axios) to post the image, and store the response in the texts state.
  • We display the texts in a list. For now, the array index is used as the key, although using the index as a key is discouraged in React.
  • I have also added some additional things like loading state and some styles.
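
As one possible alternative to index keys, the sketch below derives a key from each text plus the number of times that text has already appeared, so duplicate strings still get distinct keys. The helper name withKeys is an assumption for illustration and is not in the original code.

```javascript
// Pair each detected text with a key that does not depend on its position.
function withKeys(texts) {
  const seen = new Map(); // text -> how many times it has appeared so far
  return texts.map((text) => {
    const count = seen.get(text) ?? 0;
    seen.set(text, count + 1);
    return { key: `${text}-${count}`, text };
  });
}

console.log(withKeys(["SALE", "50% OFF", "SALE"]).map((item) => item.key));
// [ 'SALE-0', '50% OFF-0', 'SALE-1' ]
```

In the component, the list could then be rendered from withKeys(texts), using item.key for the key prop and item.text for the content.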

Final Output

After clicking the "Upload Image" button, the backend processes the image and returns the detected text, which is then displayed to the user.

For the complete code, check out my: GitHub Repo

Thank You!!!

Follow me on: Medium, GitHub, LinkedIn, X, Instagram
