Training language models has become increasingly complex and expensive. Pretraining a capable model demands large amounts of compute and time, which puts it out of reach for most people. At the same time, there is the challenge of running large language models under tight memory and compute constraints, especially on edge devices.
Today I would like to recommend the GitHub open source project jzhang38/TinyLlama, which has more than 4.3k stars. In one sentence: "The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens."
The goal of TinyLlama is to pretrain a 1.1B-parameter Llama model on 3 trillion tokens. With proper optimization, this can be achieved in "just" 90 days using 16 A100-40G GPUs. The project adopts exactly the same architecture and tokenizer as Llama 2, which means TinyLlama can be plugged into many open source projects built on Llama. In addition, TinyLlama is compact, with only 1.1B parameters, making it suitable for applications that demand a restricted computation and memory footprint.
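To make the "compact" claim concrete, here is a rough, back-of-the-envelope estimate of the memory needed just to hold the weights of a 1.1B-parameter model at different precisions (activations, KV cache, and framework overhead are not included, so real usage will be somewhat higher):

```python
# Rough estimate of weight memory for a 1.1B-parameter model.
# Actual usage also includes activations, KV cache, and framework overhead.
NUM_PARAMS = 1.1e9

BYTES_PER_PARAM = {
    "fp32": 4,       # full precision
    "fp16/bf16": 2,  # half precision, the usual inference default
    "int8": 1,       # 8-bit quantization
    "int4": 0.5,     # 4-bit quantization
}

for dtype, nbytes in BYTES_PER_PARAM.items():
    gib = NUM_PARAMS * nbytes / 1024**3
    print(f"{dtype:>10}: ~{gib:.1f} GiB of weights")
```

At half precision this works out to roughly 2 GiB of weights, which is why the model fits comfortably on consumer GPUs and many edge devices.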
You can download the model weights and use them directly, or try the demo hosted on Hugging Face.
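As a minimal sketch of how loading the model might look with the Hugging Face transformers library (the checkpoint name "TinyLlama/TinyLlama-1.1B-Chat-v1.0" is one of the published chat checkpoints; check the repository or the Hugging Face hub for the exact variant you want):

```python
# pip install transformers accelerate torch
import torch
from transformers import pipeline

# One of the published chat checkpoints; see the repository or the
# Hugging Face hub for the variant you actually want to use.
pipe = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt using the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Explain in one sentence why a 1.1B model is useful."},
]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.95)
print(outputs[0]["generated_text"])
```

Because TinyLlama shares the Llama 2 architecture and tokenizer, the same checkpoint can also be dropped into other Llama-compatible tooling (llama.cpp, vLLM, and so on) with little or no modification.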
If you want to pretrain the model yourself, refer to the training details in the project repository.
TinyLlama is an exciting open source project that is actively tackling key problems and has received widespread attention in the open source community.
Below is the star history chart for the project (an indicator of its activity):
For more project details, please see the link below.
Open source project address: https://github.com/jzhang38/TinyLlama
Open source project author: jzhang38
All members involved in building the project are listed on the repository's contributors page.