Explore Zephyr-7B: A Powerful Open-Source LLM
The Hugging Face Open LLM Leaderboard is buzzing with new open-source models aiming to rival GPT-4, and Zephyr-7B is a standout contender. This tutorial explores this cutting-edge language model from Hugging Face's H4 team, demonstrating inference with the Transformers pipeline and fine-tuning on the Agent-Instruct dataset. New to AI? The AI Fundamentals skill track is a great starting point.
Zephyr-7B, part of the Zephyr series, is trained to function as a helpful assistant. Its strengths lie in generating coherent text, translating between languages, summarizing information, analyzing sentiment, and answering questions with an awareness of context.
Zephyr-7B-β, the second model in the series, is a fine-tuned Mistral-7B model. Trained using Direct Preference Optimization (DPO) on a blend of public and synthetic datasets, it excels at interpreting complex queries and summarizing lengthy texts. At its release, it held the top spot among 7B chat models on MT-Bench and AlpacaEval benchmarks. Test its capabilities with the free demo on Zephyr Chat.
Image from Zephyr Chat
This tutorial uses Hugging Face Transformers for easy access to the model. (If you encounter loading issues, consult the Inference Kaggle Notebook.)
!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes
import torch
from transformers import pipeline
device_map="auto"
utilizes multiple GPUs for faster generation. torch.bfloat16
offers faster computation and reduced memory usage (but with slightly lower precision).model_name = "HuggingFaceH4/zephyr-7b-beta" pipe = pipeline( "text-generation", model=model_name, torch_dtype=torch.bfloat16, device_map="auto", )
prompt = "Write a Python function that can clean the HTML tags from the file:" outputs = pipe( prompt, max_new_tokens=300, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, ) print(outputs[0]["generated_text"])
Zephyr-7B also supports system prompts. Build a messages list and render it with the tokenizer's chat template before generating:

messages = [
    {
        "role": "system",
        "content": "You are a skilled software engineer who consistently produces high-quality Python code.",
    },
    {
        "role": "user",
        "content": "Write a Python code to display text in a star pattern.",
    },
]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipe(
    prompt,
    max_new_tokens=300,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
This section guides you through fine-tuning Zephyr-7B-beta on a custom dataset using Kaggle's free GPUs (approximately 2 hours). (See the Fine-tuning Kaggle Notebook for troubleshooting.)
model_name = "HuggingFaceH4/zephyr-7b-beta" pipe = pipeline( "text-generation", model=model_name, torch_dtype=torch.bfloat16, device_map="auto", )
prompt = "Write a Python function that can clean the HTML tags from the file:" outputs = pipe( prompt, max_new_tokens=300, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, ) print(outputs[0]["generated_text"])
%%capture
%pip install -U bitsandbytes
%pip install -U transformers
%pip install -U peft
%pip install -U accelerate
%pip install -U trl
# ... (Import statements as in original tutorial) ...
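The elided imports presumably cover the libraries installed above; a minimal sketch (the original notebook's exact list may differ):

import torch
import wandb
from datasets import load_dataset
from peft import LoraConfig, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer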
Kaggle Secrets (for Kaggle notebooks): retrieve the Hugging Face and Weights & Biases API keys, then log in to both services:

!huggingface-cli login --token $secret_hf
# ... (wandb login as in original tutorial) ...
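The $secret_hf token above comes from Kaggle Secrets. A sketch of retrieving both keys and logging in to Weights & Biases; the secret labels and project name are assumptions, so use whatever names you stored yours under:

from kaggle_secrets import UserSecretsClient
import wandb

user_secrets = UserSecretsClient()
secret_hf = user_secrets.get_secret("HUGGINGFACE_TOKEN")  # assumed secret label
secret_wandb = user_secrets.get_secret("WANDB_API_KEY")   # assumed secret label

# Track training metrics in Weights & Biases
wandb.login(key=secret_wandb)
run = wandb.init(project="zephyr-7b-agent-instruct", job_type="training")  # hypothetical project name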
base_model = "HuggingFaceH4/zephyr-7b-beta" dataset_name = "THUDM/AgentInstruct" new_model = "zephyr-7b-beta-Agent-Instruct"
The format_prompt function adapts the dataset to Zephyr-7B's prompt style.

# ... (format_prompt function and dataset loading as in original tutorial) ...
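As a sketch of what that step can look like, assuming each AgentInstruct example stores its dialogue as a conversations list of {"from": "human"/"gpt", "value": ...} turns (check the dataset card for the actual schema):

def format_prompt(sample):
    # Render each turn into Zephyr's special-token prompt format:
    # <|system|> / <|user|> / <|assistant|>, each segment terminated by </s>
    role_map = {"human": "<|user|>", "gpt": "<|assistant|>"}
    text = "<|system|>\nYou are a helpful assistant.</s>\n"  # assumed system message
    for turn in sample["conversations"]:
        text += f"{role_map[turn['from']]}\n{turn['value']}</s>\n"
    sample["text"] = text
    return sample

dataset = load_dataset(dataset_name, split="os")  # AgentInstruct is organized into per-task splits
dataset = dataset.map(format_prompt)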
# ... (bnb_config and model loading as in original tutorial) ...
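The quantization step typically builds a 4-bit BitsAndBytesConfig and loads the base model with it. A sketch with common settings, using the imports above (the original's exact hyperparameters may differ):

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit so the 7B model fits a free GPU
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bfloat16
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model.config.use_cache = False  # the KV cache conflicts with gradient checkpointing during training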
# ... (tokenizer loading and configuration as in original tutorial) ...
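A sketch of the tokenizer setup; the pad-token workaround is a standard choice for Mistral-family models, which ship without one:

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token
tokenizer.padding_side = "right"           # right padding avoids fp16 training issues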
# ... (peft_config and model preparation as in original tutorial) ...
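A sketch of the LoRA setup and training loop. The rank, learning rate, and batch sizes are assumed values, and the SFTTrainer keywords follow the trl 0.7-era API (newer trl releases move dataset_text_field and max_seq_length into SFTConfig):

model = prepare_model_for_kbit_training(model)  # cast norms and enable input grads for k-bit training

peft_config = LoraConfig(
    r=16,                   # LoRA rank -- assumed value
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Mistral attention projections
)

training_args = TrainingArguments(
    output_dir=new_model,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=25,
    fp16=True,               # Kaggle's T4 GPUs lack full bfloat16 support
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,    # SFTTrainer attaches the LoRA adapters itself
    dataset_text_field="text",  # the column produced by format_prompt
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
trainer.model.save_pretrained(new_model)  # saves only the small adapter weights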
Finally, test the fine-tuned model's performance with various prompts. Examples are provided in the original tutorial.
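For instance, a quick smoke test in Zephyr's prompt format (the prompt itself is illustrative):

prompt = "<|user|>\nList the files in the current directory and explain what each one does.</s>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))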
Zephyr-7B-beta demonstrates impressive capabilities. This tutorial provides a comprehensive guide to utilizing and fine-tuning this powerful LLM, even on resource-constrained GPUs. Consider the Master Large Language Models (LLMs) Concepts course for deeper LLM knowledge.