


The LLaMA model was leaked, and Meta's answer to ChatGPT was forced to go 'open source'! The GitHub repo gains 8k stars, and hands-on reviews pour in
The battle over ChatGPT is intensifying.
A few weeks ago, Meta released its own large language model, LLaMA, in sizes ranging from 7 billion to 65 billion parameters.
According to the paper, the 13-billion-parameter LLaMA surpasses GPT-3 on most benchmarks with only a tenth of the parameters.
The 65-billion-parameter LLaMA, meanwhile, is comparable to DeepMind's Chinchilla (70 billion parameters) and Google's PaLM (540 billion parameters).
Although Meta claims LLaMA is open source, researchers still have to apply and pass a review before they can download it.
What nobody expected, however, was that just a few days after release, LLaMA's model files would be leaked.
So the question is: was this intentional or accidental?
## LLaMA forced into "open source"?
Recently, LLaMA's model weights were leaked on the forum 4chan.
Last Thursday, a user named llamanon posted on 4chan's technology board, releasing the 7B and 65B LLaMA models via a torrent.
That torrent link has since been added to LLaMA's GitHub page through a pull request.
He also submitted a second pull request to the project, providing a torrent link to another set of weights for the model.
The project currently has 8k stars on GitHub.
However, one of the biggest mistakes the leaker made was including his unique identifier code in the leaked model.
That code is designed specifically to trace leakers, putting user llamanon's identity at risk.
As the joke goes: since LLaMA wouldn't open-source itself decently, netizens helped it be decent.
In addition, 4chan users have put together a handy resource for anyone looking to deploy the model on their own workstation,
including a step-by-step tutorial on obtaining the model and applying modified weights for more efficient inference.
The resource even explains how to integrate LLaMA into the online writing platform KoboldAI.
As for whether Meta leaked it deliberately or by accident, netizens weighed in one after another.
One netizen's analysis was blunt: "Maybe Meta leaked it deliberately to counter OpenAI."
"Some customers think this is the better model, and it strikes right at the heart of their business plan of selling access for $250,000 a year. One month's access to their service buys a machine capable of running the leaked model. By undercutting a potential upstart competitor, Meta keeps the current big-tech cartel stable. Maybe this is a bit of a conspiracy theory, but we live in the age of big tech and big conspiracies."
On Monday, Meta said it would continue to release its AI tools to accredited researchers, even though LLaMA had leaked to unauthorized users.
Some netizens said flatly that they had downloaded the 7-billion-parameter LLaMA; they don't know how to run it yet, but grabbed it just in case they need it someday.
Others pointed out that the leak and de facto open-sourcing of LLaMA is a big event:
"Stable Diffusion went open source, and eight months later we can now read people's minds and decode everything they see. With LLMs opened up, we're going to get some really crazy stuff."
Not long after LLaMA's release, netizens discovered that even the smallest model requires nearly 30GB of GPU memory to run.
However, with floating-point optimization via the bitsandbytes library, they managed to get the model running on a single NVIDIA RTX 3060.
Moreover, a researcher on GitHub even ran the 7B version on a Ryzen 7900X CPU, generating a few words per second.
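The memory savings come from storing weights as 8-bit integers instead of 32-bit floats. Below is a minimal sketch of absmax int8 quantization, the core idea behind bitsandbytes' LLM.int8() scheme; the real library additionally keeps outlier feature dimensions in higher precision, and the matrix shapes and error bound here are purely illustrative.

```python
import numpy as np

def quantize_int8(w):
    """Absmax int8 quantization: scale each row so its largest
    magnitude maps to 127, then round to 8-bit integers."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 weight matrix."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)   # toy weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(q.nbytes, w.nbytes)             # int8 storage is 1/4 of float32
print(float(np.abs(w - w_hat).max())) # small rounding error
```

Quantizing per row (rather than with one global scale) keeps the rounding error proportional to each row's own magnitude, which is why accuracy degrades so little in practice.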
So what is the LLaMA model actually like? A reviewer abroad put it to the test.
## LLaMA performs well on many tests
On massive multitask language understanding (MMLU), even the relatively small 13B model is on par with GPT-3, a model 13 times its size.
The 33B version is far superior to GPT-3, and the 65B version can compete with the most powerful existing LLMs, such as Google's 540B-parameter PaLM.
For text that requires logic or calculation, LLaMA also holds up: it rivals PaLM in quantitative reasoning, and its code-generation ability is arguably even better.
Given these results, LLaMA appears to be one of the most advanced models currently available, and it is small enough to run without many resources. That makes it very tempting to play with and see what it can do.

## Explaining jokes

PaLM's original paper showed off a very cool use case: given a joke, the model explains why it is funny. The task requires combining world knowledge with logic, and every model before PaLM failed at it. The reviewer had LLaMA and ChatGPT explain some of these jokes. Some of them the models did get, such as the one about Schmidhuber's long, boring speech.
Overall, though, neither LLaMA nor ChatGPT has much of a sense of humor.
The two do adopt different strategies for jokes they don't understand: ChatGPT generates a "wall of text", hoping that at least some sentences land on the right answer, much like a student who doesn't know the answer and hopes the teacher will spot it somewhere in their rambling.
## Zero-shot classification
This is a very practical capability: it lets people use an LLM, rather than human annotators, to generate labeled training sets, and then train smaller, deployable models on top of them.
A more challenging task is classifying clickbait. Since even humans can't agree on what counts as clickbait, a few examples were included in the prompt, so strictly speaking this is few-shot rather than zero-shot classification. Below is the prompt used for LLaMA.
In the test, only LLaMA-33B managed to follow the required format and give reasonable predictions. ChatGPT came second: it could produce fairly reasonable answers, but often not in the prescribed format; the smaller 7B and 13B models were simply not well suited to the task.
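A few-shot prompt of the kind described above can be assembled mechanically. This sketch is hypothetical — the labels, example headlines, and exact wording are invented, not the reviewer's actual prompt — but it shows the structure: a task instruction, a handful of labeled examples, then the item to classify with the label left blank for the model to complete.

```python
# Hypothetical few-shot prompt builder for clickbait classification.
# Examples and labels are invented for illustration.
examples = [
    ("You won't BELIEVE what this celebrity did next!", "clickbait"),
    ("Fed raises interest rates by 0.25 percentage points", "not clickbait"),
    ("10 shocking secrets doctors don't want you to know", "clickbait"),
]

def build_prompt(headline):
    lines = ["Classify each headline as 'clickbait' or 'not clickbait'.", ""]
    for text, label in examples:
        lines.append(f"Headline: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The model is expected to continue after the final "Label:".
    lines.append(f"Headline: {headline}")
    lines.append("Label:")
    return "\n".join(lines)

prompt = build_prompt("This one weird trick melts belly fat")
print(prompt)
```

Ending the prompt with a bare "Label:" is what makes format-following measurable: a model that answers with anything other than one of the two label strings, as the smaller LLaMA variants did, fails the task even if its reasoning is sound.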
## Code generation
LLMs tend to excel at the humanities but do worse in STEM subjects, so how does LLaMA fare here?
In the prompt, the model is given the schema of the tables to query and the desired result, and is asked to produce the SQL query statement.
ChatGPT performs better on this task, though the results from both language models are generally unreliable.
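The task can be made concrete with a toy example. The schema, prompt, and query below are invented for illustration, not taken from the review; the point is that a generated query can be checked by actually executing it against a small database, which is exactly why unreliable output is easy to catch in this setting.

```python
import sqlite3

# Hypothetical instance of the task: given a table schema and a goal,
# the model is asked to emit a SQL query.
prompt = """Table users(id INTEGER, name TEXT, signup_year INTEGER)
Goal: count the users who signed up in 2023 or later.
SQL:"""

# What a correct model completion should look like:
expected_sql = "SELECT COUNT(*) FROM users WHERE signup_year >= 2023;"

# Verify the query actually runs against a toy in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users(id INTEGER, name TEXT, signup_year INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "ann", 2022), (2, "bo", 2023), (3, "cy", 2024)])
count, = conn.execute(expected_sql).fetchone()
print(count)  # → 2
```

Executing candidate queries like this gives a hard pass/fail signal, which is harder to come by for the humanities-style tasks where the models looked stronger.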
Across the various head-to-head tests with ChatGPT, LLaMA was not quite as successful. Of course, if the gap comes down only to RLHF (reinforcement learning from human feedback), then the future of small models may well be brighter.