
Several tips for avoiding pitfalls when using ChatGLM


Yesterday I mentioned that after returning from the Data Technology Carnival I deployed a copy of ChatGLM, planning to study how to use a large language model to build a database operation and maintenance knowledge base. Many friends didn't believe it: "Lao Bai, at your age, are you really going to fiddle with these things yourself?" To dispel their doubts, today I'll share the process of wrestling with ChatGLM over the past two days, along with some tips for avoiding pitfalls for friends who also want to give it a try.

ChatGLM-6B is built on GLM, a language model jointly developed by Tsinghua University's KEG Laboratory and Zhipu AI in 2023; it is a large language model that provides appropriate replies and support for users' questions and requests. (That description, by the way, was generated by ChatGLM itself.) ChatGLM-6B is an open-source pre-trained model with 6.2 billion parameters, and its defining feature is that it can run locally on relatively modest hardware, which is what allows applications built on large language models to reach ordinary households. The KEG Laboratory's goal is to make even the far larger GLM-130B model (130 billion parameters, roughly comparable to GPT-3.5) able to run in a low-end environment with eight RTX 3090s.


If this goal can really be achieved, it will be great news for anyone who wants to build applications on top of large language models. The current FP16 model of ChatGLM-6B is a little over 13 GB, while the INT4-quantized model is under 4 GB and needs only about 6 GB of video memory, so it runs on a card like an RTX 3060 Ti.


I didn't know much about any of this before deploying, so I bought a 12 GB RTX 3060 that is neither high-end nor low-end, and after finishing the installation I still couldn't run the FP16 model. Had I done my testing and verification homework first, I would simply have bought a cheaper RTX 3060 Ti. If you want to run the lossless FP16 model, you need an RTX 3090 with 24 GB of video memory.


If you just want to test ChatGLM-6B's capabilities on your own machine, you may not need to download the full THUDM/chatglm-6b model at all; there are pre-packaged quantized models available on Hugging Face. Model downloads are very slow, so you can download the INT4 quantized model directly instead.
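If you would rather fetch the quantized model ahead of time instead of letting the demo download it on first launch, here is a minimal sketch using the huggingface_hub library (an assumption on my part; the article itself mentions ordinary download tools):

```python
from huggingface_hub import snapshot_download

# Pulls the pre-quantized INT4 checkpoint (~4 GB) into the local Hugging Face
# cache, where transformers will find it later.
local_path = snapshot_download(repo_id="THUDM/chatglm-6b-int4")
print("Model files cached at:", local_path)
```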

I completed this installation on an 8-core i7 PC with a 12 GB RTX 3060. Because this machine is my work computer, I installed ChatGLM inside the WSL subsystem. Installing ChatGLM on the Windows WSL subsystem is more complicated than installing it directly in a Linux environment, and the biggest pitfall is the graphics driver. When deploying ChatGLM directly on Linux, you install the NVIDIA driver and activate the graphics driver through modprobe; installing on WSL is quite different.


ChatGLM can be downloaded from GitHub, which also hosts some brief documentation, including a guide to deploying ChatGLM on Windows WSL. However, if you are a novice in this area and follow that document to the letter, you will run into countless pitfalls.


The requirements.txt file lists the main open-source components ChatGLM depends on, with version numbers. The core one is transformers, pinned at version 4.27.1. In practice the requirement is not that strict, and a slightly lower version causes no big problems, but it is safer to use the same version. icetk handles tokenization, cpm_kernels provides the Chinese-model and CUDA kernel calls, and protobuf handles structured data serialization. Gradio is a framework for quickly building AI web applications in Python, and torch needs no introduction.
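For reference, a reconstruction of roughly what that requirements.txt looks like (the transformers pin comes from the text above; the other constraints are assumptions and may differ between versions of the repository):

```
protobuf
transformers==4.27.1
icetk
cpm_kernels
torch>=1.10
gradio
```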

ChatGLM can run without a GPU, using the CPU and 32 GB of physical memory, but it is very slow that way and only suitable for demonstration and verification. If you want to play with ChatGLM seriously, you had best equip yourself with a GPU.
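If you do want to try the CPU-only mode, here is a minimal sketch following the loading pattern documented in the ChatGLM-6B repository (the .float() call keeps the weights in FP32 on the CPU instead of moving them to a GPU):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
# .float() loads the full-precision weights on the CPU; expect ~32 GB of RAM use.
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()
model = model.eval()

# model.chat() is the conversational entry point exposed by the ChatGLM model code.
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
```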

The biggest pitfall when installing ChatGLM on WSL is the graphics driver. The project's documentation on GitHub is quite unfriendly; for people who don't know the project well or have never done this kind of deployment, it is genuinely confusing. In fact, deploying the software itself is not hard; the tricky part is the graphics driver.

Because the deployment target is the WSL subsystem, the Linux inside it is an emulated environment rather than a complete Linux. The NVIDIA graphics driver therefore only needs to be installed on Windows and does not need to be activated inside WSL. However, the CUDA toolkit still has to be installed inside WSL's Linux environment. The Windows-side NVIDIA driver must be the latest one from the official website; the compatibility driver bundled with Windows 10/11 will not do. So do not skip downloading and installing the latest driver from NVIDIA's site.


Once the Windows driver is installed, you can install the CUDA toolkit directly inside WSL. When the installation finishes, run nvidia-smi; if you see the GPU status table, congratulations, you have dodged the first pit. You may still hit a few snags while installing the CUDA toolkit: your system must already have suitable versions of gcc, gcc's development packages, make, and other build tools. If these components are missing, the CUDA toolkit installation will fail.
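nvidia-smi is the canonical check, but you can also verify from Python that the Windows driver plus the CUDA toolkit inside WSL are actually visible to PyTorch; a small sketch:

```python
import torch

# If this prints False, the driver/toolkit chain between Windows and WSL is broken.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    props = torch.cuda.get_device_properties(0)
    print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
```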


The above covers the pitfalls of the preparation stage. Once you get past the NVIDIA driver pit, the rest of the installation is actually quite smooth. As for the distribution, I still recommend the Debian-derived Ubuntu: on newer Ubuntu releases, apt is smart enough to resolve version-compatibility issues across a large number of packages and automatically downgrade some of them when needed.

The rest of the installation can be completed smoothly by following the installation guide. Note that replacing the package sources in /etc/apt/sources.list is best done as the guide describes: on the one hand it makes installation much faster, and on the other it avoids package version-compatibility problems. That said, skipping the replacement will not necessarily break the subsequent steps.


If you made it through the previous levels, you have reached the last step: launching the web demo. Running python3 web_demo.py starts a sample web conversation. At this point, if like me you only have a 12 GB RTX 3060, you are guaranteed to see a CUDA out-of-memory error. Even setting PYTORCH_CUDA_ALLOC_CONF to its minimum of 21 will not avoid it. There is no being lazy here; you have to make a small change to the Python script.
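For reference, that allocator setting refers to the max_split_size_mb option (which must be greater than 20, so 21 is the smallest legal value); as noted above, even this does not save a 12 GB card with the FP16 model:

```python
import os

# Must be set before torch initializes CUDA for the option to take effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:21"

import torch  # imported only after the variable is set
```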


The default web_demo.py loads the FP16 pre-trained model. A model of more than 13 GB will never fit into 12 GB of video memory, so the code needs a small adjustment.


Change the loading line to call quantize(4) to load the model with INT4 quantization, or quantize(8) for INT8. Either way, your video memory will then be sufficient, and the card can support all kinds of conversations.
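The change amounts to one line in web_demo.py, following the loading pattern the ChatGLM-6B repository documents:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# Original line: FP16 weights, needs roughly 13 GB of video memory.
# model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()

# Quantize at load time: quantize(4) for INT4, or quantize(8) for INT8.
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().quantize(4).cuda()
model = model.eval()
```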

Note that the model download does not actually start until web_demo.py launches, so pulling down the 13 GB model takes a long time. You can leave that running in the middle of the night, or pre-download the model from Hugging Face with a download tool such as Thunder. If you are unfamiliar with the models and unsure how to install a manually downloaded one, you can also change the model name in the code to THUDM/chatglm-6b-int4 and download the sub-4 GB INT4 quantized model directly. That is much faster, and a weak graphics card could not run the FP16 model anyway.
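If you take that pre-quantized route, the loading line simply points at the INT4 checkpoint and drops the quantize() call; a sketch:

```python
# Loads the ~4 GB pre-quantized INT4 checkpoint directly; no quantize() needed.
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()
```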

At this point you can talk to ChatGLM through the web page, but this is only the beginning of the trouble. Your journey into ChatGLM truly begins only when you can train your own fine-tuned model. Playing with this kind of thing still takes a lot of energy and money, so step into the pit with care.

Finally, I am very grateful to the friends at Tsinghua University's KEG Laboratory; their work lets more people use large language models at low cost.
