


Meta launches AI language model LLaMA, a large-scale language model with 65 billion parameters
February 25 news: Meta announced on Friday (local time) that it will release a new AI-based large language model for the research community, joining Microsoft, Google and other companies that have entered the artificial intelligence race in the wake of ChatGPT.
Meta's LLaMA is short for "Large Language Model Meta AI". It is available under a non-commercial license to researchers and entities in government, community, and academia.
The company will make the underlying code available to users, so they can tweak the model themselves and use it for research-related use cases. Meta said the model’s computing power requirements are “much lower.”
According to reports, the company is making LLaMA available in several sizes (7B, 13B, 33B and 65B parameters). LLaMA 65B and LLaMA 33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, was trained on 1 trillion tokens.
Like other large language models, LLaMA takes a sequence of words as "input" and predicts the next word, repeating this step to generate text. To train this set of models, Meta selected text from the 20 most widely spoken languages, focusing on those written in Latin and Cyrillic alphabets.
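To make the "predict the next word, then repeat" loop concrete, here is a minimal sketch in Python. It is a toy stand-in, not LLaMA's method: it assumes a simple word-pair (bigram) count table in place of a neural network, and the names `train_bigram` and `generate`, as well as the sample corpus, are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


def generate(following, prompt, max_new_words=5):
    """Greedy generation loop: predict the next word, append it, repeat."""
    words = list(prompt)
    if not words:
        return words  # nothing to condition on
    for _ in range(max_new_words):
        candidates = following.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        next_word, _count = candidates.most_common(1)[0]
        words.append(next_word)  # the output becomes part of the next input
    return words


# Hypothetical toy corpus, used only to exercise the loop.
corpus = "the model reads the text and the model predicts the next word".split()
table = train_bigram(corpus)
print(" ".join(generate(table, ["the"], max_new_words=3)))
```

The key point is the feedback step: each predicted word is appended to the sequence and becomes part of the input for the next prediction. A real large language model replaces the count table with a learned network and usually works on sub-word tokens rather than whole words.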
Of course, like other models, LLaMA also faces the challenges of bias, toxic comments, and hallucinations, and Meta needs to do more research to address the shortcomings in this type of language model.
Meta said that LLaMA, as a base model, is designed to be versatile and applicable to many different use cases, in contrast to a fine-tuned model built for a specific task. By open sourcing LLaMA's code, Meta expects other researchers to find it easier to test new ways of limiting or eliminating these problems. In its paper, Meta also provides a set of benchmark evaluations of model bias and toxicity to show the model's limitations and to support further research in this critical area.
It is worth mentioning that Meta also released the large language model OPT-175B in May last year. That project is likewise aimed at researchers, and it forms the basis for a new iteration of Meta's chatbot BlenderBot.
Later, the company also launched a model called Galactica, which it said could write scientific articles and solve mathematical problems, but the demo was later taken offline because it repeatedly generated “authoritative-sounding” but inaccurate content.