Have you ever found it challenging to write alt text captions for the images you post on social media platforms like X and LinkedIn?
Caption Image is an app that solves this problem by analyzing your image with Cloudinary AI and generating a description for you.
This guide will cover connecting the server code (API) to the client side to build a robust full-stack application for image captions.
Want to give it a try? Check out the Caption Image app here.
Prerequisites
Basic understanding of React
Node.js installed on your local machine
Set up a Cloudinary account
Run these commands to create your project:
mkdir caption-image-server
cd caption-image-server
npm init -y # initialize the project with a package.json
After this setup, install the following dependencies to be able to build the API:
npm install nodemon --save-dev
nodemon: runs your development server and restarts it whenever the code changes
npm install cors cloudinary dotenv express
cors: enables cross-origin requests (CORS) in web applications
cloudinary: the Cloudinary Node.js SDK for uploading and managing images and videos
dotenv: loads environment variables from a .env file
express: a Node.js framework for building APIs
In package.json, update the scripts object with the following. Also set "type" to "module" so that Node.js accepts the import statements used in index.js:
{
  ...
  "type": "module",
  "scripts": {
    "start": "node index",
    "start:dev": "nodemon index"
  },
  ...
}
Here, index refers to index.js, the file that holds the backend code. Run this command to create it:
touch index.js
Environment variables keep our credentials out of the source code and prevent them from being leaked when the code is pushed to GitHub, as long as the .env file is listed in .gitignore.
.env
CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret
Go to your Cloudinary dashboard, and you will have access to your values. Replace the placeholder text after the equal sign.
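A missing or empty credential usually only shows up later as a failed upload, so it can help to check the variables when the server starts. The following is a minimal sketch, not part of the original project: the helper name findMissingEnvVars is hypothetical, and only the three variable names come from the .env file above.

```javascript
// The three variables defined in the .env file above.
const REQUIRED_VARS = [
  "CLOUDINARY_CLOUD_NAME",
  "CLOUDINARY_API_KEY",
  "CLOUDINARY_API_SECRET",
];

// Hypothetical helper: returns the names of required variables that are
// undefined or contain only whitespace in the given environment object.
function findMissingEnvVars(env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Warn at startup instead of failing on the first upload.
const missingVars = findMissingEnvVars(process.env);
if (missingVars.length > 0) {
  console.warn(`Missing environment variables: ${missingVars.join(", ")}`);
}
```

In index.js, a check like this would run right after dotenv.config().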
Let's create the server. Copy-paste this code into your index.js file:
import express from "express";
import { v2 as cloudinary } from "cloudinary";
import * as dotenv from "dotenv";
import cors from "cors";

// Load the variables from .env into process.env
dotenv.config();

const app = express();
app.use(cors());
app.use(express.json());

// Configure the Cloudinary SDK with the credentials from .env
cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

// Welcome route, useful for checking that the server is up
app.get("/", (req, res) => {
  res.status(200).json({
    message: "Upload and generate image caption with Cloudinary AI",
  });
});

// Upload the image to Cloudinary and return the AI-generated caption
app.post("/api/v1/caption", async (req, res) => {
  try {
    const { imageUrl } = req.body;

    if (!imageUrl) {
      return res.status(400).json({
        success: false,
        message: "Image URL is required",
      });
    }

    const result = await cloudinary.uploader.upload(imageUrl, {
      detection: "captioning",
    });

    res.status(200).json({
      success: true,
      caption: result.info.detection.captioning.data.caption,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      message: "Unable to generate image caption",
      error: error.message,
    });
  }
});

const startServer = async () => {
  try {
    app.listen(8080, () => console.log("Server started on port 8080"));
  } catch (error) {
    console.log("Error starting server:", error);
  }
};

startServer();
The snippet defines two endpoints: a GET route that returns a welcome message, and a POST route that uploads the image to Cloudinary with captioning detection enabled and returns the generated caption. Check out Cloudinary AI Content Analysis to learn more about the practical use case of this technology.
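The POST route reads the caption from a deeply nested path, result.info.detection.captioning.data.caption. If any level of that path is absent (for instance, and this is an assumption, when captioning is not available on the account), the property access throws and the route falls through to the generic 500 response. A minimal sketch of a more defensive read follows; the helper name extractCaption is hypothetical.

```javascript
// Hypothetical helper: reads the caption from a Cloudinary upload result
// without throwing when part of the nested path is missing.
// Returns the trimmed caption string, or null if none is present.
function extractCaption(uploadResult) {
  const caption = uploadResult?.info?.detection?.captioning?.data?.caption;
  if (typeof caption !== "string" || caption.trim() === "") {
    return null;
  }
  return caption.trim();
}
```

With a helper like this, the route could return a more specific error message when the result is null instead of relying on the catch block.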
Start the development server
In your terminal, run this command to start the server at http://localhost:8080:
npm run start:dev
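Once the server is running, you can try the caption endpoint from any JavaScript client. The sketch below is an illustration rather than code from the project: the function name requestCaption and its options are hypothetical, while the route, the request body, and the response fields (success, caption, message) come from the server code above. It assumes a runtime with a global fetch, such as Node.js 18 or newer or a browser.

```javascript
// Hypothetical client helper: posts an image URL to the caption API and
// resolves with the caption string. fetchImpl can be swapped out in tests.
async function requestCaption(
  imageUrl,
  { baseUrl = "http://localhost:8080", fetchImpl = fetch } = {}
) {
  const response = await fetchImpl(`${baseUrl}/api/v1/caption`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ imageUrl }),
  });

  const payload = await response.json();

  // The API reports failures with success: false and a message.
  if (!response.ok || !payload.success) {
    throw new Error(
      payload.message || `Request failed with status ${response.status}`
    );
  }

  return payload.caption;
}

// Example call (needs the server from this guide to be running):
// requestCaption("https://example.com/photo.jpg").then(console.log);
```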
Next.js is a popular framework among frontend developers because it helps create beautiful and user-friendly interfaces with reusable components.
Installation
As with any project, we first need to scaffold the boilerplate files and folders with this command:
npx create-next-app@latest
During installation, some prompts will appear, allowing you to choose your preferences for the project.
Next, install these dependencies:
npm install react-toastify next-cloudinary copy-to-clipboard
react-toastify: for displaying notifications
next-cloudinary: the Cloudinary package for Next.js, built for high-performance image and video delivery
copy-to-clipboard: copy text to the clipboard
As with the backend code, we need to create the environment variables in a .env file in the root directory. These are the variable names that next-cloudinary reads:
.env
NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME=your_cloud_name
NEXT_PUBLIC_CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret
These variables will help sign our requests because we will use Cloudinary signed uploads to send files to the cloud. Compared with unsigned uploads, signed uploads add an extra layer of security to file uploads.
Configuring Cloudinary
Create a lib folder in the root directory and, inside it, a file named cloudinary.js:
lib/cloudinary.js
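As a rough sketch of what this file is responsible for, the snippet below maps environment variables to the options object the Cloudinary SDK expects. It is an assumption-laden illustration, not the project's actual file: the helper name buildCloudinaryConfig is hypothetical, and the variable names are the ones next-cloudinary documents (NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME, NEXT_PUBLIC_CLOUDINARY_API_KEY, CLOUDINARY_API_SECRET).

```javascript
// Hypothetical helper: builds the options object for cloudinary.config()
// from environment variables. The variable names are assumptions.
function buildCloudinaryConfig(env = process.env) {
  return {
    cloud_name: env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME,
    api_key: env.NEXT_PUBLIC_CLOUDINARY_API_KEY,
    api_secret: env.CLOUDINARY_API_SECRET,
    secure: true, // generate https URLs
  };
}

// In lib/cloudinary.js the object would be handed to the SDK, roughly:
//   import { v2 as cloudinary } from "cloudinary";
//   cloudinary.config(buildCloudinaryConfig());
//   export default cloudinary;
```

Keeping the API secret in a variable without the NEXT_PUBLIC_ prefix means Next.js does not expose it to the browser bundle.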
Next, in the App Router, create a new API route at api/sign-cloudinary-params/route.js:
app/api/sign-cloudinary-params/route.js
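The job of this route is to receive the parameters the upload widget wants to send and return a signature for them. The sketch below shows that shape using the standard Request and Response objects that Next.js route handlers work with. It assumes the widget posts a JSON body of the form { paramsToSign } and expects { signature } back; the factory name createSignHandler is hypothetical, and the signing function is injected so the example runs without the SDK.

```javascript
// Hypothetical factory: wraps a signing function in a POST route handler.
// signParams receives the widget's parameters and returns the signature.
function createSignHandler(signParams) {
  return async function POST(request) {
    let body;
    try {
      body = await request.json();
    } catch {
      return Response.json({ error: "Invalid JSON body" }, { status: 400 });
    }

    const { paramsToSign } = body ?? {};
    if (!paramsToSign || typeof paramsToSign !== "object") {
      return Response.json({ error: "paramsToSign is required" }, { status: 400 });
    }

    const signature = await signParams(paramsToSign);
    return Response.json({ signature });
  };
}

// In route.js the handler would be exported as POST, with the SDK doing the
// signing, roughly:
//   createSignHandler((params) =>
//     cloudinary.utils.api_sign_request(params, process.env.CLOUDINARY_API_SECRET)
//   );
```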
Displaying the UI content
Here, the home route will display the content users can interact with within the app.
app/page.js
With the home page code in place, clicking the "Upload an Image" button opens the Cloudinary upload widget, which offers several options for uploading an image. Once you select an image, Cloudinary processes it, and the app displays the picture and its caption side by side. A notification then pops up when you click "Copy caption", which copies the text to the clipboard for later use as alternative text for your image.
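The flow described above can be sketched as a single function that runs after the widget reports a successful upload. This is an illustration under stated assumptions: the widget's result is assumed to carry the hosted image address at result.info.secure_url, the function name handleUploadSuccess is hypothetical, and getCaption stands in for whatever function calls the caption API.

```javascript
// Hypothetical success handler: takes the upload widget's result, asks the
// caption API for a description, and returns what the page needs to render
// the image and caption side by side.
async function handleUploadSuccess(result, getCaption) {
  const imageUrl = result?.info?.secure_url;
  if (!imageUrl) {
    throw new Error("Upload result does not contain a secure_url");
  }

  const caption = await getCaption(imageUrl);
  return { imageUrl, caption };
}
```

In the page component, the returned object would typically be stored in React state, and the caption passed to copy-to-clipboard when the user clicks "Copy caption".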
These technologies made it possible to build the AI-enhanced photo captioner:
Next.js
Cloudinary
Vercel
Render
Express
Caption Image: https://caption-image-gamma.vercel.app/
Server code: https://github.com/Terieyenike/caption-image-server
Client code: https://github.com/Terieyenike/caption-image-client
API: https://caption-image-server.onrender.com/
These two technologies handle the deployment of the app on the web:
Vercel: deploys the frontend web application
Render: hosts the server code (API) in the cloud
AI makes all of this possible, and the app shows how it can be used to our advantage to build tools that help people.
The AI-enhanced photo captioner is one example of the power of Cloudinary APIs and tools for building your next app. Because Cloudinary bundles these capabilities, you don't need separate tools that provide similar services.
Happy coding!