How to Use Cloudinary AI to Write Better Image Captions

Have you ever found it challenging to write alt text captions for the images you post on social media platforms like X and LinkedIn?

Caption Image is an app that solves this problem: it analyzes your image with Cloudinary AI and generates a description for it automatically.

This guide will cover connecting the server code (API) to the client side to build a robust full-stack application for image captions.

Want to give it a try? Check out the Caption Image app at https://caption-image-gamma.vercel.app/.

Before you begin

Prerequisites

Basic understanding of React
Node.js installed on your local machine
A Cloudinary account

Creating the server

Run these commands to create and initialize the project folder:

mkdir caption-image-server
cd caption-image-server

npm init -y # initialize the folder

After this setup, install the following dependencies to be able to build the API:

npm install nodemon --save-dev

nodemon: runs your development server and restarts it automatically whenever the code changes

npm install cors cloudinary dotenv express

cors: Express middleware that enables cross-origin requests to the API
cloudinary: the Cloudinary Node.js SDK for uploading and managing images and videos
dotenv: loads environment variables from a .env file
express: a Node.js framework for building APIs

In package.json, set "type" to "module" so that Node.js accepts the import syntax used in index.js, and update the scripts object as follows:

{
  ...
  "type": "module",
  "scripts": {
    "start": "node index",
    "start:dev": "nodemon index"
  },
  ...
}

Here, index refers to index.js, the file that will hold the backend code. Run this command to create it:

touch index.js

Environment variables

Environment variables keep your credentials out of the source code. Store them in a .env file and add that file to .gitignore so they are not leaked when you push the project to GitHub.

.env

CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret

You can find these values in your Cloudinary dashboard. Replace the placeholder text after each equals sign with them.
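A missing or misspelled variable tends to surface only later, as a failed upload, so it can help to check for the three variables when the server starts. The following is a minimal sketch and not part of the original server code; the helper name missingEnv is hypothetical.

```javascript
// Names of the variables defined in the .env file above.
const REQUIRED_ENV = [
  "CLOUDINARY_CLOUD_NAME",
  "CLOUDINARY_API_KEY",
  "CLOUDINARY_API_SECRET",
];

// Hypothetical helper: returns the names that are unset or blank in `env`.
function missingEnv(env, names = REQUIRED_ENV) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Example usage at startup, after dotenv.config() has run:
// const missing = missingEnv(process.env);
// if (missing.length > 0) {
//   throw new Error(`Missing environment variables: ${missing.join(", ")}`);
// }
```

Failing fast at startup keeps the error message close to its cause instead of hiding it behind a generic upload failure.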

Let's create the server. Copy-paste this code into your index.js file:

import express from "express";
import { v2 as cloudinary } from "cloudinary";
import * as dotenv from "dotenv";
import cors from "cors";

dotenv.config();

const app = express();

app.use(cors());
app.use(express.json());

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

app.get("/", (req, res) => {
  res.status(200).json({
    message: "Upload and generate image caption with Cloudinary AI",
  });
});

app.post("/api/v1/caption", async (req, res) => {
  try {
    const { imageUrl } = req.body;

    if (!imageUrl) {
      return res.status(400).json({
        success: false,
        message: "Image URL is required",
      });
    }

    const result = await cloudinary.uploader.upload(imageUrl, {
      detection: "captioning",
    });

    res.status(200).json({
      success: true,
      caption: result.info.detection.captioning.data.caption,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      message: "Unable to generate image caption",
      error: error.message,
    });
  }
});

const startServer = async () => {
  try {
    app.listen(8080, () => console.log("Server started on port 8080"));
  } catch (error) {
    console.log("Error starting server:", error);
  }
};

startServer();

The server exposes two endpoints. GET / returns a short status message, and POST /api/v1/caption uploads the image at imageUrl to Cloudinary with detection set to "captioning", then returns the generated caption. Check out Cloudinary AI Content Analysis to learn more about the practical use cases of this technology.
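The POST handler reads the caption from a deeply nested path in the upload response. If any level of that path is missing, the property access throws a TypeError, which the catch block then reports as a generic 500 error. The following is a minimal sketch of a more defensive read, assuming the response shape used in the server code above; extractCaption is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the caption string, or null when the
// response does not contain one at the path used in the server code.
function extractCaption(result) {
  const caption = result?.info?.detection?.captioning?.data?.caption;
  return typeof caption === "string" && caption.length > 0 ? caption : null;
}

// Example usage inside the POST handler:
// const caption = extractCaption(result);
// if (!caption) {
//   return res.status(502).json({ success: false, message: "No caption returned" });
// }
```

Returning null instead of throwing lets the handler choose a more specific status code and message for the "upload succeeded but no caption" case.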

Start the development server

In your terminal, run this command to start the server at http://localhost:8080:

npm run start:dev

Creating the UI

Next.js is a popular React framework among frontend developers because it helps create user-friendly interfaces from reusable components.

Installation

As with any project, start by generating the boilerplate files and folders with this command:

npx create-next-app@latest

During installation, some prompts will appear, allowing you to choose your preferences for the project.

Next, install these dependencies:

npm install react-toastify next-cloudinary copy-to-clipboard

react-toastify: displays toast notifications
next-cloudinary: Cloudinary's package for Next.js, built for high-performance image and video delivery
copy-to-clipboard: copies text to the clipboard

Environment variables

As with the backend, store the client's Cloudinary credentials as environment variables in a .env file in the project's root directory.

These variables are used to sign requests, because the app sends files to the cloud with Cloudinary signed uploads. Compared with unsigned uploads, signed uploads add an extra layer of security.

Configuring Cloudinary

Create a lib folder in the root directory and, inside it, a file named cloudinary.js:

lib/cloudinary.js
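A file like this usually configures the Cloudinary SDK once and exports it for the API routes. The sketch below is an assumption about what such a file might contain, not the app's actual code. The variable names follow the convention used in the next-cloudinary documentation (NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME, NEXT_PUBLIC_CLOUDINARY_API_KEY, CLOUDINARY_API_SECRET), and buildCloudinaryConfig is a hypothetical helper. It only builds the options object, so it runs without the SDK installed.

```javascript
// Hypothetical helper: maps environment variables to the option names
// that cloudinary.config() accepts (cloud_name, api_key, api_secret).
function buildCloudinaryConfig(env) {
  const config = {
    cloud_name: env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME, // assumed variable name
    api_key: env.NEXT_PUBLIC_CLOUDINARY_API_KEY, // assumed variable name
    api_secret: env.CLOUDINARY_API_SECRET, // assumed variable name
  };
  // Fail early if any option is missing, naming the option in the error.
  for (const [option, value] of Object.entries(config)) {
    if (!value) {
      throw new Error(`Cloudinary option "${option}" is not set`);
    }
  }
  return config;
}

// In lib/cloudinary.js this could be wired up roughly as follows:
// import { v2 as cloudinary } from "cloudinary";
// cloudinary.config(buildCloudinaryConfig(process.env));
// export default cloudinary;
```

The commented wiring assumes the cloudinary package is also installed in the client project. Only the API secret lacks the NEXT_PUBLIC_ prefix, which keeps it out of the browser bundle.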

Next, in the App Router, create a new API route with this file name, api/sign-cloudinary-params/route.js:

app/api/sign-cloudinary-params/route.js
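For signed uploads, the upload widget sends the parameters it wants signed to this route and expects a signature back. The sketch below illustrates the idea under stated assumptions. The request field paramsToSign and the response field signature follow the next-cloudinary documentation, and signParams is a hypothetical helper. The hashing is a simplified version of Cloudinary's documented scheme: parameters sorted by name, joined as key=value pairs with &, API secret appended, then a SHA-1 hex digest. In a real route, the SDK's cloudinary.utils.api_sign_request helper would normally perform this step.

```javascript
// Simplified sketch of Cloudinary's signing scheme (hypothetical helper).
async function signParams(paramsToSign, apiSecret) {
  // Dynamic import keeps the sketch runnable as an ES module or a script.
  const { createHash } = await import("node:crypto");
  const serialized = Object.keys(paramsToSign)
    .sort()
    .map((key) => `${key}=${paramsToSign[key]}`)
    .join("&");
  return createHash("sha1").update(serialized + apiSecret).digest("hex");
}

// Small helper for JSON responses built on the standard Response class.
function json(body, status) {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Route handler shape used by the Next.js App Router. In route.js this
// function would be exported: export async function POST(request) { ... }
async function POST(request) {
  const body = await request.json();
  const paramsToSign = body?.paramsToSign;
  if (!paramsToSign || typeof paramsToSign !== "object") {
    return json({ error: "paramsToSign is required" }, 400);
  }
  const apiSecret = process.env.CLOUDINARY_API_SECRET; // assumed variable name
  if (!apiSecret) {
    return json({ error: "API secret is not configured" }, 500);
  }
  return json({ signature: await signParams(paramsToSign, apiSecret) }, 200);
}
```

Because the secret is only read on the server, the browser never sees it; the widget receives just the signature for the exact parameters it sent.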

Displaying the UI content

The home route displays the content that users interact with in the app.

app/page.js

Now that we have the code for the home page, clicking the "Upload an Image" button opens the Cloudinary upload widget, which offers many options for uploading an image. Once you select an image, the app processes it with Cloudinary and displays the picture and its caption side by side. When you click "Copy caption", the caption is copied to the clipboard and a notification pops up, so you can paste it later as the alternative text for your image.
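Behind the scenes, the page has to send the uploaded image's URL to the caption endpoint created earlier and display the caption that comes back. The following is a minimal sketch of that request, assuming the server from this guide is running at http://localhost:8080. The names requestCaption and fetchImpl are hypothetical, and the fetchImpl parameter exists only so the function can be exercised without a network.

```javascript
const API_URL = "http://localhost:8080/api/v1/caption"; // local dev server from this guide

// Hypothetical helper: posts an image URL to the caption API and
// resolves with the caption text, or throws with the server's message.
async function requestCaption(imageUrl, fetchImpl = fetch) {
  const response = await fetchImpl(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ imageUrl }),
  });
  const data = await response.json();
  if (!response.ok || !data.success) {
    throw new Error(data.message || "Unable to generate image caption");
  }
  return data.caption;
}

// Example usage in the upload widget's success callback (names assumed):
// const caption = await requestCaption(uploadedImageUrl);
```

Throwing on failure lets the calling component show the server's message in a notification instead of rendering an empty caption.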

Tech stack

The following technologies made it possible to build the AI-enhanced photo captioner:

Next.js
Cloudinary
Vercel
Render
Express

Important links

Caption Image: https://caption-image-gamma.vercel.app/

Server code: https://github.com/Terieyenike/caption-image-server

Client code: https://github.com/Terieyenike/caption-image-client

API: https://caption-image-server.onrender.com/

Deployment

Two services handle the deployment of the app on the web:

Vercel: hosts the frontend web application
Render: hosts the server code (API) in the cloud

Conclusion

AI makes this app possible. It shows how AI can take over a tedious task, writing alt text, so that images become more accessible to the people who view them.

The AI-enhanced photo captioner is one example of the power of Cloudinary APIs and tools for building your next app. Because uploading, storage, and captioning are all bundled in Cloudinary, there is no need to combine other tools that provide similar services.

Happy coding!
