How to Use Cloudinary AI to Write Better Image Captions

Susan Sarandon
Release: 2024-10-21 22:47:30

Have you found it challenging to write alt text that makes the images you post on platforms like X and LinkedIn accessible?

Caption Image is an app that solves this problem: it analyzes your image with Cloudinary AI and generates a description you can use as alt text.

This guide will cover connecting the server code (API) to the client side to build a robust full-stack application for image captions.

Want to give it a try? Check out the Caption Image app at https://caption-image-gamma.vercel.app/.

Before you begin

Prerequisites

  • Basic understanding of React

  • Node.js installed on your local machine

  • Set up a Cloudinary account

Creating the server

Run these commands to create and initialize the project:

mkdir caption-image-server
cd caption-image-server

npm init -y # initialize the folder with a default package.json

After this setup, install the following dependencies to be able to build the API:

npm install nodemon --save-dev

nodemon: runs your development server and restarts it automatically whenever the code changes

npm install cors cloudinary dotenv express

  • cors: enables cross-origin requests, so the browser client can call the API from a different domain

  • cloudinary: the Cloudinary Node.js SDK for uploading and managing images and videos

  • dotenv: loads environment variables from a .env file

  • express: a Node.js framework for building APIs

In package.json, set type to module so that Node.js accepts the import syntax used in index.js, and update the scripts object with the following:

{
  ...
  "type": "module",
  "scripts": {
    "start": "node index",
    "start:dev": "nodemon index"
  },
  ...
}

Here, index refers to index.js, the file that holds the backend code. Run this command to create the file:

touch index.js

Environment variables

Environment variables keep our credentials out of the source code. Add .env to your .gitignore file so the credentials are not leaked when you push to GitHub.

.env

CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret

You can find these values in your Cloudinary dashboard. Replace the placeholder text after each equal sign with your own values.
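
A missing or misspelled variable tends to surface later as an opaque Cloudinary error, so it can help to check the values when the server starts. The sketch below is one possible way to do that; findMissingEnv is a hypothetical helper name, not part of dotenv or the Cloudinary SDK.

```javascript
// Names of the variables defined in the .env file above.
const REQUIRED_ENV = [
  "CLOUDINARY_CLOUD_NAME",
  "CLOUDINARY_API_KEY",
  "CLOUDINARY_API_SECRET",
];

// Hypothetical helper: returns the names of required variables that are
// unset or blank in the given environment object.
function findMissingEnv(env, names = REQUIRED_ENV) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Example usage, placed right after dotenv.config():
// const missing = findMissingEnv(process.env);
// if (missing.length > 0) {
//   throw new Error(`Missing environment variables: ${missing.join(", ")}`);
// }
```

Failing fast at startup keeps a configuration problem from being mistaken for a captioning problem later on.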

Let's create the server. Copy-paste this code into your index.js file:

import express from "express";
import { v2 as cloudinary } from "cloudinary";
import * as dotenv from "dotenv";
import cors from "cors";

dotenv.config();

const app = express();

app.use(cors());
app.use(express.json());

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

app.get("/", (req, res) => {
  res.status(200).json({
    message: "Upload and generate image caption with Cloudinary AI",
  });
});

app.post("/api/v1/caption", async (req, res) => {
  try {
    const { imageUrl } = req.body;

    if (!imageUrl) {
      return res.status(400).json({
        success: false,
        message: "Image URL is required",
      });
    }

    const result = await cloudinary.uploader.upload(imageUrl, {
      detection: "captioning",
    });

    res.status(200).json({
      success: true,
      caption: result.info.detection.captioning.data.caption,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      message: "Unable to generate image caption",
      error: error.message,
    });
  }
});

const startServer = async () => {
  try {
    app.listen(8080, () => console.log("Server started on port 8080"));
  } catch (error) {
    console.log("Error starting server:", error);
  }
};

startServer();

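The snippet defines two endpoints. The GET route at / returns a status message. The POST route at /api/v1/caption reads imageUrl from the request body, uploads that image to Cloudinary with detection set to captioning, and returns the generated caption. Captioning relies on Cloudinary AI Content Analysis, which may need to be enabled as an add-on in your Cloudinary account. Check out Cloudinary AI Content Analysis to learn more about the practical use cases of this technology.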
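
The caption sits several levels deep in the upload response, and reading it directly throws a TypeError if any level is missing, which the handler would then report as a generic 500 error. Below is a minimal sketch of a more defensive read, assuming the response shape used in the POST handler; extractCaption is a hypothetical helper name.

```javascript
// Hypothetical helper: safely reads the caption from an upload result shaped
// like the one used in the POST handler. Returns null when no caption exists.
function extractCaption(result) {
  const caption = result?.info?.detection?.captioning?.data?.caption;
  return typeof caption === "string" && caption.length > 0 ? caption : null;
}

// Example usage inside the handler:
// const caption = extractCaption(result);
// if (!caption) {
//   return res.status(502).json({ success: false, message: "No caption returned" });
// }
```

Returning null instead of throwing lets the handler send a specific message when Cloudinary responds without a caption.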

Start the development server

In your terminal, run this command to start the server at http://localhost:8080:

npm run start:dev

Creating the UI

Next.js is a React framework that is popular among frontend developers because it helps create user-friendly interfaces from reusable components.

Installation

Start by generating the boilerplate files and folders with this command:

npx create-next-app@latest

During installation, some prompts will appear, allowing you to choose your preferences for the project.

Next, install these dependencies:

npm install react-toastify next-cloudinary copy-to-clipboard

  • react-toastify: displays toast notifications

  • next-cloudinary: the Cloudinary package for Next.js, built for high-performance image and video delivery

  • copy-to-clipboard: copies text to the clipboard

Environment variables

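As with the backend code, create the environment variables in a .env file in the root directory:

.env

# Variable names as documented by next-cloudinary. The secret has no
# NEXT_PUBLIC_ prefix, so it is never exposed to the browser.
NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME=your_cloud_name
NEXT_PUBLIC_CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret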

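These variables will help sign our requests because we will use Cloudinary signed uploads to send files to the cloud. Signed uploads add a layer of security that unsigned uploads lack: each upload request carries a signature generated on the server with the API secret, so the secret itself never reaches the browser.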
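
To make the idea of signing concrete: Cloudinary documents the signature as a SHA-1 hex digest of the upload parameters, sorted by name and joined as key=value pairs with &, followed by the API secret. The sketch below reproduces that recipe with Node's built-in crypto module for illustration only. signParams is a hypothetical name and the sample values are made up; in a real app, the SDK's cloudinary.utils.api_sign_request does this for you.

```javascript
import { createHash } from "node:crypto";

// Illustrative only: builds a signature following Cloudinary's documented
// recipe (sorted key=value pairs joined by "&", then the API secret, SHA-1).
function signParams(paramsToSign, apiSecret) {
  const serialized = Object.keys(paramsToSign)
    .filter((key) => paramsToSign[key] !== undefined && paramsToSign[key] !== "")
    .sort()
    .map((key) => `${key}=${paramsToSign[key]}`)
    .join("&");
  return createHash("sha1").update(serialized + apiSecret).digest("hex");
}

// Made-up sample values. The secret is only ever used on the server.
const signature = signParams(
  { timestamp: 1729550850, folder: "captions" },
  "your_api_secret"
);
console.log(signature.length); // a SHA-1 hex digest is 40 characters long
```

Because the secret is part of the hashed string, anyone without it cannot produce a valid signature for altered parameters.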

Configuring Cloudinary

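Create a lib folder in the root directory and, inside it, a file named cloudinary.js:

lib/cloudinary.js

// A minimal configuration, assuming the NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME,
// NEXT_PUBLIC_CLOUDINARY_API_KEY, and CLOUDINARY_API_SECRET variable names.
// It requires the cloudinary Node.js SDK to be installed in this project too.
import { v2 as cloudinary } from "cloudinary";

cloudinary.config({
  cloud_name: process.env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME,
  api_key: process.env.NEXT_PUBLIC_CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

export default cloudinary;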

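Next, in the App Router, create a new API route in the file api/sign-cloudinary-params/route.js:

app/api/sign-cloudinary-params/route.js

// A minimal sketch of the signing route, following the pattern documented by
// next-cloudinary. It assumes lib/cloudinary.js exports the configured SDK.
import cloudinary from "../../../lib/cloudinary";

export async function POST(request) {
  // The upload widget sends the parameters that need a signature.
  const { paramsToSign } = await request.json();

  const signature = cloudinary.utils.api_sign_request(
    paramsToSign,
    process.env.CLOUDINARY_API_SECRET
  );

  return Response.json({ signature });
}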

Displaying the UI content

Here, the home route will display the content users can interact with within the app.

app/page.js

"use client";

// A minimal sketch of the home page, assuming next-cloudinary's
// CldUploadWidget and CldImage components, react-toastify, and
// copy-to-clipboard. API_URL is an assumption: point it at your own server.
import { useState } from "react";
import { CldImage, CldUploadWidget } from "next-cloudinary";
import { ToastContainer, toast } from "react-toastify";
import copy from "copy-to-clipboard";
import "react-toastify/dist/ReactToastify.css";

const API_URL = "http://localhost:8080/api/v1/caption";

export default function Home() {
  const [image, setImage] = useState(null);
  const [caption, setCaption] = useState("");

  // Runs after the widget uploads the image: ask the API for a caption.
  const handleSuccess = async (result) => {
    const { public_id, secure_url } = result.info;
    setImage(public_id);
    setCaption("");

    try {
      const response = await fetch(API_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ imageUrl: secure_url }),
      });
      const data = await response.json();
      if (!data.success) throw new Error(data.message);
      setCaption(data.caption);
    } catch (error) {
      toast.error(error.message);
    }
  };

  // Copies the caption and confirms it with a notification.
  const handleCopy = () => {
    copy(caption);
    toast.success("Caption copied to clipboard");
  };

  return (
    <main>
      <CldUploadWidget
        signatureEndpoint="/api/sign-cloudinary-params"
        onSuccess={handleSuccess}
      >
        {({ open }) => <button onClick={() => open()}>Upload an Image</button>}
      </CldUploadWidget>

      {image && <CldImage src={image} width={500} height={500} alt={caption} />}
      {caption && (
        <>
          <p>{caption}</p>
          <button onClick={handleCopy}>Copy caption</button>
        </>
      )}
      <ToastContainer />
    </main>
  );
}

With the home page code in place, clicking the "Upload an Image" button opens the Cloudinary widget, which offers several ways to upload an image. Once you select an image, Cloudinary processes it, and the app displays the picture and its generated caption side by side. Clicking "Copy caption" copies the text to the clipboard, and a notification confirms it, so you can paste the caption later as alternative text for your image.

Tech stack

The following technologies made it possible to build the AI-enhanced photo captioner:

  • Next.js

  • Cloudinary

  • Vercel

  • Render

  • Express

Important links

Caption Image: https://caption-image-gamma.vercel.app/

Server code: https://github.com/Terieyenike/caption-image-server

Client code: https://github.com/Terieyenike/caption-image-client

API: https://caption-image-server.onrender.com/

Deployment

Two platforms host the app on the web:

  • Vercel: deploys the frontend web application

  • Render: hosts the server code (API) in the cloud
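
One deployment detail worth checking: hosting platforms such as Render typically tell the app which port to listen on through a PORT environment variable, while the server in this guide hardcodes 8080. Below is a minimal sketch of reading the port with a local fallback, assuming that convention; resolvePort is a hypothetical helper name.

```javascript
// Hypothetical helper: uses PORT from the environment when it is a valid
// port number, otherwise falls back to the local default.
function resolvePort(env, fallback = 8080) {
  const port = Number.parseInt(env.PORT ?? "", 10);
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback;
}

// Example usage in index.js:
// const port = resolvePort(process.env);
// app.listen(port, () => console.log(`Server started on port ${port}`));
```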

Conclusion

AI does the heavy lifting in this app: it turns an uploaded image into a usable description in seconds, which makes writing alt text far less of a chore.

The AI-enhanced photo captioner is one example of the power of Cloudinary APIs and tools for building your next app. Because uploading, storage, and captioning are all handled by Cloudinary, there is no need to combine separate tools that provide similar services.

Happy coding!

source:dev.to