Table of Contents
Open-Platypus Dataset
Contamination problem
Fine-tuning and merging
The results
The Open LLM list has been refreshed again, and a 'Platypus' stronger than Llama 2 is here.

Aug 17, 2023 pm 03:09 PM

To challenge the dominance of closed models such as OpenAI's GPT-3.5 and GPT-4, a series of open-source models has emerged, including LLaMa and Falcon. Meta AI recently released LLaMa-2, widely regarded as the most powerful model in the open-source field, and many researchers have built their own models on top of it. For example, StabilityAI fine-tuned the Llama 2 70B model on an Orca-style dataset to produce StableBeluga2, which also achieved good results on Hugging Face's Open LLM leaderboard.

The Open LLM leaderboard has now changed again: the Platypus model has climbed to the top of the list.

The authors, from Boston University, fine-tuned Llama 2 with PEFT and LoRA on the Open-Platypus dataset to produce Platypus.

The authors describe Platypus in detail in a paper.

The paper can be found at: https://arxiv.org/abs/2308.07317

The main contributions of the paper are as follows:

  • Open-Platypus is a small-scale dataset consisting of a curated subset of public text datasets. It draws on 11 open-source datasets and focuses on improving LLMs' STEM and logic knowledge. It consists mainly of human-designed questions, with only 10% of the questions generated by an LLM. The main advantage of Open-Platypus is its size and quality, which allow very strong performance with little fine-tuning time and cost. Specifically, training a 13B model on 25k questions takes just 5 hours on a single A100 GPU.
  • A description of the similarity-exclusion process, which shrinks the dataset and reduces data redundancy.
  • A detailed analysis of the persistent contamination of open LLM training sets with data from important LLM test sets, together with the authors' training-data filtering process for avoiding this pitfall.
  • A description of the process for selecting and merging specialized fine-tuned LoRA modules.

Open-Platypus Dataset

The authors have released the Open-Platypus dataset on Hugging Face.


Contamination problem

To keep benchmark questions from leaking into the training set, the authors' method first works to prevent this problem, so that results are not simply inflated by memorization. While striving for accuracy, the authors also recognize the need for flexibility when marking questions as duplicates, because a question can be phrased in many ways and is influenced by general domain knowledge. To manage potential leaks, they carefully designed heuristics for manually filtering Open-Platypus questions whose embeddings have more than 80% cosine similarity to a benchmark question. They divided potential leaks into three categories: (1) duplicate; (2) gray area; and (3) similar but not identical. To be cautious, they excluded all three groups from the training set.
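
The similarity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `embed` function is a toy bag-of-words stand-in for a real sentence-embedding model, and the function names and sample questions are hypothetical. Only the 80% cosine-similarity threshold comes from the text.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def flag_possible_leaks(train_questions, benchmark_questions, threshold=0.8):
    """Return (train question, closest benchmark question, score) for every
    training question whose best match exceeds the threshold."""
    bench = [(q, embed(q)) for q in benchmark_questions]
    flagged = []
    for question in train_questions:
        vec = embed(question)
        best_score, best_match = max(
            ((cosine(vec, bvec), bq) for bq, bvec in bench),
            default=(0.0, None),
        )
        if best_score > threshold:
            flagged.append((question, best_match, best_score))
    return flagged


# Hypothetical sample data, for illustration only.
train = [
    "What is the derivative of x squared with respect to x?",
    "How many legs does a spider have?",
]
benchmark = [
    "What is the derivative of x squared with respect to x",
    "Name the capital of France.",
]

for question, match, score in flag_possible_leaks(train, benchmark):
    print(f"{score:.2f}  {question!r} ~ {match!r}")
```

Flagged pairs would then go to a human reviewer, since the article describes the final filtering as manual and sorted into the three categories listed above.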

Duplicate

Questions in this category almost exactly replicate test-set questions, with only slight changes or rearrangements of the wording. Based on the leak counts reported in the paper, the authors believe this is the only category that counts as true contamination. The paper gives specific examples.

Gray area

The next group, termed gray area, covers questions that are not exact duplicates and fall within the realm of common knowledge. While the authors leave the final judgment on these questions to the open-source community, they argue that such questions often require expert knowledge. Note that this category includes pairs of questions with exactly the same instructions but synonymous answers.

Similar but not identical

These questions have a high degree of similarity, but due to subtle changes between the questions, there are significant differences in the answers.

Fine-tuning and merging

With the dataset refined, the authors rely on two tools: low-rank adaptation (LoRA) training and the parameter-efficient fine-tuning (PEFT) library. Unlike full fine-tuning, LoRA freezes the weights of the pre-trained model and injects trainable rank-decomposition matrices into the transformer layers, which reduces the number of trainable parameters and saves training time and cost. Initially, fine-tuning targeted attention modules such as v_proj, q_proj, k_proj, and o_proj. Following the recommendations of He et al., it was later extended to the gate_proj, down_proj, and up_proj modules, which perform better except when trainable parameters make up less than 0.1% of the total. The authors applied this setup to both the 13B and 70B models, yielding 0.27% and 0.2% trainable parameters respectively. The only difference between the two is the initial learning rate.
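
To see where percentages of this size come from, the trainable-parameter count can be estimated by hand. For a weight matrix of shape d_out x d_in, LoRA trains two small matrices, A (r x d_in) and B (d_out x r), so each adapted module adds r * (d_in + d_out) parameters. The sketch below is an estimate under assumptions that are not stated in the article: a LoRA rank of 16, adapters on gate_proj, up_proj, and down_proj only, the published Llama 2 layer dimensions, and rounded total parameter counts.

```python
def lora_trainable_params(rank, layers, module_shapes):
    """Count LoRA parameters: each adapted (d_in, d_out) module gains
    A (rank x d_in) and B (d_out x rank), i.e. rank * (d_in + d_out) values."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in module_shapes)
    return layers * per_layer


def mlp_shapes(hidden, intermediate):
    """(d_in, d_out) for gate_proj, up_proj, and down_proj in one layer."""
    return [
        (hidden, intermediate),  # gate_proj
        (hidden, intermediate),  # up_proj
        (intermediate, hidden),  # down_proj
    ]


# Assumptions: rank 16 and the rounded totals below are not from the article.
RANK = 16
MODELS = {
    "13B": {"hidden": 5120, "intermediate": 13824, "layers": 40, "total": 13.0e9},
    "70B": {"hidden": 8192, "intermediate": 28672, "layers": 80, "total": 69.0e9},
}


def trainable_fraction(name, rank=RANK):
    """Fraction of all model parameters that the LoRA adapters make trainable."""
    cfg = MODELS[name]
    shapes = mlp_shapes(cfg["hidden"], cfg["intermediate"])
    return lora_trainable_params(rank, cfg["layers"], shapes) / cfg["total"]


for name in MODELS:
    print(f"{name}: {trainable_fraction(name):.2%} trainable")
```

Under these assumptions the estimate comes out at roughly 0.28% for 13B and 0.21% for 70B, close to the 0.27% and 0.2% quoted above. The small gap is expected given the rounded totals.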

The results

According to the Hugging Face Open LLM leaderboard data of August 10, 2023, the authors compared Platypus with other SOTA models and found that the Platypus2-70B-instruct variant performed well, ranking first with an average score of 73.13.

Also notable is the Stable-Platypus2-13B model, which stands out among 13-billion-parameter models with an average score of 63.96.

Limitations

Platypus, as a fine-tuned extension of LLaMa-2, retains many of the constraints of the base model and introduces specific challenges through its targeted training. It shares the static knowledge base of LLaMa-2, which may become outdated. There is also a risk of generating inaccurate or inappropriate content, particularly when prompts are unclear. While Platypus has been enhanced in STEM and English-language logic, its proficiency in other languages is unreliable and may be inconsistent. It occasionally produces biased or harmful content. The authors acknowledge their efforts to minimize these issues but also the ongoing challenges, particularly in non-English languages.

The potential for misuse of Platypus is a concern, so developers should conduct safety testing of their applications before deployment. Platypus may have limitations outside its primary domain, so users should proceed with caution and consider additional fine-tuning for optimal performance. Users need to ensure that the training data for Platypus does not overlap with other benchmark test sets. The authors are very cautious about data contamination and avoid merging with models trained on tainted datasets. Although they confirm that the cleaned training data is free of contamination, they cannot rule out that some questions were overlooked. For details, see the Limitations section of the paper.
