
MoE Large Model Making Guide: Zero-Based Manual Building Methods, Master-Level Tutorials Revealed

Jan 30, 2024 pm 02:42 PM

MoE (Mixture of Experts), the architecture rumored to be GPT-4's "secret weapon", is something you can now build yourself!

A machine learning practitioner on Hugging Face has shared how to build a complete MoE system from scratch.


This project is called MakeMoE by the author, and details the process from attention construction to the formation of a complete MoE model.

According to the author, MakeMoE was inspired by and based on the makemore of OpenAI founding member Andrej Karpathy.

makemore is a teaching project for natural language processing and machine learning, intended to help learners understand and implement some basic models.

Similarly, MakeMoE helps learners gain a deeper understanding of the mixture-of-experts model by building it step by step.

So, what exactly does this "build it by hand" guide cover?

Build MoE model from scratch

Compared with Karpathy's makemore, MakeMoE replaces the single feed-forward neural network with a sparse mixture of experts and adds the necessary gating logic.

At the same time, because the ReLU activation function is used in the process, the default initialization method in makemore is replaced by Kaiming He initialization.
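As a rough illustration of what Kaiming He initialization means, the sketch below draws weights from a zero-mean normal distribution with variance 2 / fan_in, which is the normal-distribution form of the method for ReLU layers. It uses plain Python so it runs without dependencies; the function name `kaiming_normal` and the layer sizes are assumptions made for this example, not taken from the author's code.

```python
import math
import random


def kaiming_normal(fan_in, fan_out, rng):
    """Return a fan_out x fan_in weight matrix drawn from N(0, 2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


rng = random.Random(0)
# Hypothetical layer sizes, chosen only for the demonstration.
weights = kaiming_normal(fan_in=64, fan_out=256, rng=rng)

# The empirical variance of the weights should be close to 2 / fan_in.
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)
print(f"target={2.0 / 64:.4f} empirical={variance:.4f}")
```

The factor of 2 compensates for ReLU zeroing out roughly half of the activations, which keeps the signal variance roughly constant from layer to layer.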


If you want to create a MoE model, you must first understand the self-attention mechanism.

The model first applies linear transformations to the input sequence to produce queries (Q), keys (K) and values (V).

These are then used to calculate attention scores, which determine how much attention the model pays to each position in the sequence when generating each token.

To keep the model autoregressive when generating text, meaning it predicts the next token only from tokens that have already been generated, the author uses a multi-head causal self-attention mechanism.

This mechanism uses a mask to set the attention scores of future positions to negative infinity, so that the weights of these positions become zero after the softmax.

Multi-head attention allows the model to perform several such attention calculations in parallel, with each head free to focus on different parts of the sequence.
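The masking step can be sketched for a single head as follows. This is a minimal plain-Python illustration, not the author's implementation: it assumes Q, K and V have already been produced by the linear projections, and the names `softmax` and `causal_attention` as well as the toy inputs are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax; entries equal to -inf get weight exactly 0."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k and v are lists of T vectors (lists of floats) of equal length.
    Returns the attended outputs and the T x T attention weight matrix.
    """
    T, head_size = len(q), len(q[0])
    scale = 1.0 / math.sqrt(head_size)
    outputs, all_weights = [], []
    for t in range(T):
        # Positions after t are "future" tokens: mask them with -inf.
        scores = [
            sum(a * b for a, b in zip(q[t], k[s])) * scale if s <= t else float("-inf")
            for s in range(T)
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append(
            [sum(weights[s] * v[s][d] for s in range(T)) for d in range(len(v[0]))]
        )
    return outputs, all_weights


# Toy inputs: 3 tokens with head size 2 (values chosen only for illustration).
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out, attn = causal_attention(q, k, v)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of the printed matrix sums to 1 and is zero to the right of the diagonal, so a token never attends to positions that come after it. A multi-head version would run several of these with separate projections and concatenate the results.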


After completing the configuration of the self-attention mechanism, you can create the expert module. The "expert module" here is a multi-layer perceptron.

Each expert module contains a linear layer that maps the embedding vector to a larger dimension, followed by a nonlinear activation function (such as ReLU) and a second linear layer that maps the vector back to the original embedding dimension.

This design lets each expert specialize in different parts of the input sequence, while a gating network decides which experts should be activated for each token.
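An expert of this shape can be sketched in a few lines of plain Python. The expansion factor of 4, the zero biases, and the helper names `linear`, `relu` and `make_expert` are assumptions for illustration; the article only states that the hidden layer is larger than the embedding.

```python
import math
import random


def linear(x, weight, bias):
    """Apply y = W x + b, where weight is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def make_expert(n_embd, rng, expansion=4):
    """Build one expert: linear (expand) -> ReLU -> linear (project back).

    expansion=4 is an assumed value, not taken from the source.
    """
    hidden = expansion * n_embd

    def init(fan_in, fan_out):
        # Kaiming He (normal) initialization, suited to ReLU layers.
        std = math.sqrt(2.0 / fan_in)
        return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

    w1, b1 = init(n_embd, hidden), [0.0] * hidden
    w2, b2 = init(hidden, n_embd), [0.0] * n_embd

    def expert(x):
        return linear(relu(linear(x, w1, b1)), w2, b2)

    return expert


rng = random.Random(0)
expert = make_expert(n_embd=8, rng=rng)
x = [rng.gauss(0.0, 1.0) for _ in range(8)]
y = expert(x)
print(len(x), len(y))
```

Because the output has the same dimension as the input, any number of such experts can be swapped in for the single feed-forward block without changing the rest of the model.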


So, the next step is to build the component that assigns tokens to experts and manages them: the gating network.

The gating network here is also implemented as a linear layer, which maps the output of the self-attention layer to a vector with one entry per expert module.

The output of this linear layer is a score vector; each score represents the importance of the corresponding expert module for the token currently being processed.

The gating network takes the top-k values of this score vector and records their indices; these k largest scores are then used to weight the outputs of the corresponding expert modules.
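One common way to turn the top-k scores into weights is to apply a softmax over the selected scores only and give every other expert a weight of zero. The sketch below assumes that approach; the function name `top_k_gate` and the example scores are hypothetical, and in the real model the scores would come from the gating linear layer.

```python
import math


def top_k_gate(scores, k):
    """Keep the k largest scores, softmax over them, and zero out the rest.

    Returns the per-expert weights and the indices of the selected experts.
    """
    top_idx = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in top_idx)
    exps = {i: math.exp(scores[i] - peak) for i in top_idx}
    total = sum(exps.values())
    weights = [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]
    return weights, top_idx


# Hypothetical gate scores for one token over four experts.
scores = [0.1, 2.0, -1.0, 1.5]
weights, top_idx = top_k_gate(scores, k=2)
print(top_idx)
print([round(w, 3) for w in weights])
```

Here experts 1 and 3 are selected; their weights sum to 1, and the two unselected experts contribute nothing for this token.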


To encourage exploration during training, the author also introduces noise, so that tokens do not all end up being routed to the same experts.

This noise is usually implemented by adding random Gaussian noise to the score vector.
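The effect of the noise can be seen in a small simulation. In this sketch the noise has a fixed standard deviation, which is an assumption made for brevity (implementations may instead learn the noise scale); the names `noisy_scores` and `pick_top_k` and the score values are hypothetical.

```python
import random


def noisy_scores(scores, noise_std, rng):
    """Add independent zero-mean Gaussian noise to every gate score."""
    return [s + rng.gauss(0.0, noise_std) for s in scores]


def pick_top_k(scores, k):
    """Indices of the k largest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


rng = random.Random(0)
# Hypothetical scores: without noise, experts 0 and 1 would always win.
scores = [1.0, 0.9, 0.8, 0.7]
counts = [0] * len(scores)
for _ in range(1000):
    for i in pick_top_k(noisy_scores(scores, noise_std=0.5, rng=rng), k=2):
        counts[i] += 1
print(counts)
```

With the noise switched off, only the first two experts would ever be chosen; with it, every expert is picked some of the time and therefore keeps receiving training signal.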


With the gating results in hand, the model multiplies the top-k gate values by the outputs of the corresponding top-k experts for each token, then adds them up; this weighted sum forms the output.
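The weighted sum for a single token can be sketched as below. This is an illustrative plain-Python version under stated assumptions: the gate scores are passed in directly rather than computed by a linear layer, the weights come from a softmax over the top-k scores, and the toy experts simply scale their input. The name `sparse_moe` is hypothetical.

```python
import math


def sparse_moe(x, gate_scores, experts, k):
    """Combine the outputs of the top-k experts for one token vector x."""
    top_idx = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    peak = max(gate_scores[i] for i in top_idx)
    exps = [math.exp(gate_scores[i] - peak) for i in top_idx]
    total = sum(exps)
    out = [0.0] * len(x)
    for i, e in zip(top_idx, exps):
        y = experts[i](x)  # only the selected experts are ever evaluated
        out = [o + (e / total) * yi for o, yi in zip(out, y)]
    return out


# Hypothetical toy experts: expert i multiplies its input by a constant.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
x = [1.0, -1.0]
gate_scores = [0.0, 5.0, 5.0, 0.0]  # experts 1 and 2 tie for the top two slots
y = sparse_moe(x, gate_scores, experts, k=2)
print(y)  # [2.5, -2.5]: the average of 2 * x and 3 * x
```

Skipping the unselected experts entirely is what makes the layer sparse: the parameter count grows with the number of experts, while the computation per token grows only with k.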

Finally, put these modules together to get a MoE model.

The author provides the corresponding code for every step above; see the original article for details.

In addition, the author has produced an end-to-end Jupyter notebook, which can be run directly while studying each module.

If you are interested, learn it quickly!

Original address: https://huggingface.co/blog/AviSoori1x/makemoe-from-scratch
Notebook version (GitHub): https://github.com/AviSoori1x/makeMoE/tree/main
