
How to create an AI agent powered by your screen & mic

Linda Hamilton
Release: 2025-01-22 08:35:10

Screenpipe: A CLI/App for 24/7 Screen and Mic Recording, OCR, Transcription, and AI Integration

Screenpipe is a CLI and desktop application that continuously records your screen and microphone, extracts on-screen text via optical character recognition (OCR), transcribes audio, and simplifies feeding that data into AI models. Its flexible pipe system lets you build plugins that work with the captured screen and audio information. This example walks through building a simple pipe that uses Ollama to analyze recent screen activity.

Prerequisites:

  • Screenpipe installed and running.
  • Bun installed (npm install -g bun).
  • Ollama installed with a model (DeepSeek-r1:1.5b is used in this example).
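
To confirm the model is available locally, you can pull and list it with Ollama before wiring anything up:

ollama pull deepseek-r1:1.5b
ollama list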

1. Pipe Creation:

Create a new Screenpipe pipe using the CLI:

bunx @screenpipe/create-pipe@latest

Follow the prompts to name your pipe (e.g., "my-activity-analyzer") and choose a directory.

2. Project Setup:

Open the project in your preferred editor (e.g., Cursor, VS Code):

cursor my-activity-analyzer

The initial project structure will include several files. For this example, remove unnecessary files:

rm -rf src/app/api/intelligence src/components/obsidian-settings.tsx src/components/file-suggest-textarea.tsx

3. Implementing the Analysis Cron Job:

Create src/app/api/analyze/route.ts with the following code:

import { NextResponse } from "next/server";
import { pipe } from "@screenpipe/js";
import { streamText } from "ai";
import { ollama } from "ollama-ai-provider";

export async function POST(request: Request) {
  try {
    const { messages, model } = await request.json();
    console.log("model:", model);

    // Query Screenpipe for everything captured in the last five minutes.
    const fiveMinutesAgo = new Date(Date.now() - 5 * 60 * 1000).toISOString();
    const results = await pipe.queryScreenpipe({
      startTime: fiveMinutesAgo,
      limit: 10,
      contentType: "all", // both OCR (screen text) and audio transcriptions
    });

    // Stream the model's analysis of the captured activity back to the client.
    const provider = ollama(model);
    const result = streamText({
      model: provider,
      messages: [
        ...messages,
        {
          role: "user",
          content: `Analyze this activity data and summarize what I've been doing: ${JSON.stringify(results)}`,
        },
      ],
    });

    return result.toDataStreamResponse();
  } catch (error) {
    console.error("error:", error);
    return NextResponse.json({ error: "Failed to analyze activity" }, { status: 500 });
  }
}
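
If the raw dump is noisier than you need, the query can be narrowed inside the same handler before anything reaches the model. A minimal sketch; the extra parameters here (the "ocr" content type and the appName filter) are taken from the Screenpipe SDK reference and should be verified against your @screenpipe/js version:

// Hypothetical narrower queries; verify parameter names against your SDK version.
const ocrOnly = await pipe.queryScreenpipe({
  startTime: fiveMinutesAgo,
  limit: 10,
  contentType: "ocr", // screen text only, skipping audio transcripts
});

const chromeOnly = await pipe.queryScreenpipe({
  startTime: fiveMinutesAgo,
  limit: 10,
  contentType: "all",
  appName: "Chrome", // assumed filter: restrict results to one application
});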

4. pipe.json Configuration for Scheduling:

Create or modify pipe.json to schedule the analysis route every five minutes:

{
  "crons": [
    {
      "path": "/api/analyze",
      "schedule": "*/5 * * * *"
    }
  ]
}
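
The path must match the API route created in step 3. The schedule field follows standard five-field cron syntax (as the every-five-minutes example suggests), so the interval is easy to adjust:

*/5 * * * *    # every 5 minutes (as above)
0 * * * *      # at the top of every hour
0 9 * * 1-5    # 09:00 on weekdays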

5. Updating the Main Page (src/app/page.tsx):

"use client";

import { useState } from "react";
import { Button } from "@/components/ui/button";
import { OllamaModelsList } from "@/components/ollama-models-list";
import { Label } from "@/components/ui/label";
import { useChat } from "ai/react";

export default function Home() {
  const [selectedModel, setSelectedModel] = useState("deepseek-r1:1.5b");
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    body: { model: selectedModel },
    api: "/api/analyze",
  });

  return (
    <main className="p-4 max-w-2xl mx-auto space-y-4">
      <div className="space-y-2">
        <label htmlFor="model">Ollama Model</label>
        <OllamaModelsList defaultValue={selectedModel} onChange={setSelectedModel} />
      </div>

      <div>
        {messages.map((message) => (
          <div key={message.id}>
            <div>{message.role === "user" ? "User: " : "AI: "}</div>
            <div>{message.content}</div>
          </div>
        ))}
      </div>
    </main>
  );
}

6. Local Testing:

Run the pipe locally:

bun i   # or: npm install
bun dev

Access the application at http://localhost:3000.
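
You can also trigger the analysis route directly, which is handy before the schedule kicks in. A quick check, assuming the dev server is on port 3000 (note that the response streams back as AI SDK data-stream chunks rather than plain JSON):

curl -X POST http://localhost:3000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"messages": [], "model": "deepseek-r1:1.5b"}'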

7. Screenpipe Installation:

Install the pipe into Screenpipe:

  • UI: Open the Screenpipe app, navigate to the Pipes section, use the option to add a pipe, and provide the local path to your pipe.
  • CLI:

    screenpipe install /path/to/my-activity-analyzer
    screenpipe enable my-activity-analyzer

How it Works:

  • Data Querying: pipe.queryScreenpipe() retrieves recent screen and audio data (see the condensing sketch after this list).
  • AI Processing: Ollama analyzes the data using a prompt.
  • UI: A simple interface displays the analysis results.
  • Scheduling: Screenpipe's cron job executes the analysis every 5 minutes.
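
The route above stringifies the entire query result into the prompt, which can overflow a small model's context window after a busy five minutes. A sketch of condensing the results first; the response shape assumed here (a data array with text or transcription under content) should be checked against the @screenpipe/js types:

// Sketch: condense query results before prompting.
// The response shape below is an assumption; verify against @screenpipe/js types.
type CapturedItem = {
  type?: string;
  content?: { text?: string; transcription?: string };
};

function condense(results: { data?: CapturedItem[] }): string {
  return (results.data ?? [])
    .map((item) => item.content?.text ?? item.content?.transcription ?? "")
    .filter(Boolean)
    .join("\n")
    .slice(0, 4000); // rough cap so the prompt fits a small model's context
}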

Next Steps:

  • Add configuration options.
  • Integrate with external services.
  • Implement more sophisticated UI components.

References:

  • Screenpipe documentation.
  • Example Screenpipe pipes.
  • Screenpipe SDK reference.
