In this guide, you'll learn how to run Large Language Models (LLMs) on your local machine and create your own LLM. We'll also cover how to create an API for your custom model using the ollama-js library in Node.js.
Ollama is a good choice for running LLMs locally because it is simple to set up and works on machines without a powerful GPU. Start by installing Ollama from the official website:
Ollama Official Site
After installing Ollama, you can choose from a variety of available models. The full list is on the project's GitHub repository:
Ollama GitHub Repository
To run the model locally, use the following command in your terminal. Note that the first run might take longer as Ollama downloads and stores the model locally. Subsequent runs will be faster since the model is accessed locally.
ollama run {model_name}
To create your custom LLM, you need to create a model file (Ollama's documentation calls this a Modelfile) that builds on a model you have already downloaded. Below is an example of how to define your model:
FROM <name_of_your_downloaded_model>

# Define your parameters here
PARAMETER temperature 0.5

SYSTEM """
You are an English teaching assistant named Mr. Kamal Kishor. You help with note-making, solving English grammar assignments, and reading comprehensions.
"""
Save this as modelfile. To create the model from this file, run the following command in your terminal:
ollama create mrkamalkishor -f ./modelfile
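If you would rather generate the model file from a script than write it by hand, the structure shown above can be assembled with a small helper. This is only a sketch: buildModelfile is a hypothetical function written for this guide, not part of Ollama or ollama-js, and the base model name llama3 is a placeholder for whichever model you have downloaded.

```javascript
// Hypothetical helper (not part of Ollama): builds model file text from a
// base model name, a map of parameters, and an optional system prompt.
function buildModelfile({ from, parameters = {}, system } = {}) {
  if (!from) {
    throw new Error('A base model name ("from") is required');
  }

  const lines = [`FROM ${from}`];

  // One PARAMETER line per entry, e.g. "PARAMETER temperature 0.5"
  for (const [name, value] of Object.entries(parameters)) {
    lines.push(`PARAMETER ${name} ${value}`);
  }

  // The system prompt is wrapped in triple quotes so it can span several lines
  if (system) {
    lines.push(`SYSTEM """\n${system.trim()}\n"""`);
  }

  return lines.join('\n') + '\n';
}

const modelfileText = buildModelfile({
  from: 'llama3', // placeholder: use the name of a model you have downloaded
  parameters: { temperature: 0.5 },
  system:
    'You are an English teaching assistant named Mr. Kamal Kishor. ' +
    'You help with note-making, solving English grammar assignments, and reading comprehensions.',
});

console.log(modelfileText);
```

Writing the returned string to ./modelfile (for example with Node's fs.writeFileSync) lets the ollama create command above pick it up unchanged.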
After creating the model, you can interact with it locally using:
ollama run mrkamalkishor
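For this step, we will use the ollama-js library together with Express to create an API in Node.js. The example below uses ES module imports, so set "type": "module" in your package.json or save the file with an .mjs extension. Install both packages:
npm install express ollama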
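import express from 'express';
import ollama from 'ollama';

const app = express();
const router = express.Router();

// Parse JSON request bodies
app.use(express.json());

router.post('/ask-query', async (req, res) => {
  const { query } = req.body ?? {};

  // Reject requests that do not include a text query
  if (typeof query !== 'string' || query.trim() === '') {
    return res.status(400).json({ error: 'Request body must include a non-empty "query" string' });
  }

  try {
    // Forward the user's question to the custom model
    const response = await ollama.chat({
      model: 'mrkamalkishor',
      messages: [{ role: 'user', content: query }],
    });
    res.json({ reply: response.message.content });
  } catch (error) {
    res.status(500).send({ error: 'Error interacting with the model' });
  }
});

// Mount the router so the endpoint is served at /api/ask-query
app.use('/api', router);

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});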
This code sets up an Express.js server with an endpoint to interact with your custom model. Because the router is mounted under /api, the full path is /api/ask-query. When a POST request is made to that path with a JSON body containing the user's query, the server responds with the model's output in the reply field.
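To try the endpoint from another script, a small client along the following lines should work. It is a sketch under a few assumptions: the server is running locally on port 3000 (the default in the code above), your Node.js version provides a global fetch (Node 18 or later), and askQuery is a hypothetical helper name. The fetchImpl option exists only so the function can be exercised without a live server.

```javascript
// Hypothetical client helper: sends a question to the /api/ask-query endpoint
// and returns the reply text from the JSON response.
async function askQuery(query, { baseUrl = 'http://localhost:3000', fetchImpl = fetch } = {}) {
  const response = await fetchImpl(`${baseUrl}/api/ask-query`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  });

  // The server answers with a non-2xx status when the model call fails
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();
  return data.reply;
}

// Example usage (requires the Express server and Ollama to be running):
// askQuery('Explain the difference between "affect" and "effect".')
//   .then((reply) => console.log(reply))
//   .catch((error) => console.error(error.message));
```

Throwing on a non-2xx status keeps the error path explicit, so the caller can tell a failed model call apart from an empty reply.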
By following these steps, you can install Ollama, choose and run LLMs locally, create your custom LLM, and set up a Node.js API to interact with it. This setup allows you to leverage powerful language models on your local machine without requiring GPU-intensive hardware.