Overfitting problem of machine learning models

The overfitting problem in machine learning models and its solutions

In the field of machine learning, overfitting is a common and challenging problem. When a model performs well on the training set but poorly on the test set, the model is overfitting. This article introduces the causes of overfitting and its solutions, with specific code examples.

  1. Causes of the overfitting problem
    Overfitting is mainly caused by a model that is too complex and has too many parameters. A model with too many parameters pays too much attention to the noise and outliers in the training set, resulting in poor performance on new data. Insufficient data is another cause: when the training set contains few samples, the model tends to memorize the details of each sample and fails to generalize to unseen data, as the sketch below illustrates.
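As a minimal sketch of this effect (using NumPy, with illustrative numbers), fitting a degree-9 polynomial to only 10 noisy samples drives the training error to nearly zero while the test error stays large:

import numpy as np

# Toy data: 10 noisy training points drawn from a sine curve
rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 10)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

# A degree-9 polynomial has as many coefficients as there are training
# points, so it can memorize the noise almost exactly
coeffs = np.polyfit(x_train, y_train, deg=9)

train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
print(f"train MSE: {train_mse:.4f}")  # close to zero
print(f"test MSE:  {test_mse:.4f}")   # much larger: the model overfits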
  2. Methods to solve overfitting
    To solve the overfitting problem, we can take the following approaches:

2.1 Data Augmentation
Data augmentation generates additional samples by applying a series of transformations to the training set. For example, in image classification tasks, images can be rotated, scaled, or flipped. This increases the size of the training set and helps the model generalize better.

The following is sample code for image data augmentation using the Keras library:

from keras.preprocessing.image import ImageDataGenerator

# Define the data augmenter for the training set
datagen = ImageDataGenerator(
    rotation_range=20,       # random rotation range, in degrees
    width_shift_range=0.1,   # horizontal shift range, as a fraction of width
    height_shift_range=0.1,  # vertical shift range, as a fraction of height
    shear_range=0.2,         # shear transformation range
    zoom_range=0.2,          # zoom range
    horizontal_flip=True,    # random horizontal flips
    fill_mode='nearest'      # fill mode for pixels created by the transforms
)

# The test set should not be augmented, so it gets a plain generator
test_datagen = ImageDataGenerator()

# Load the image datasets
train_data = datagen.flow_from_directory("train/", target_size=(224, 224), batch_size=32, class_mode='binary')
test_data = test_datagen.flow_from_directory("test/", target_size=(224, 224), batch_size=32, class_mode='binary')

# Train the model (assumes `model` is an already-compiled Keras model)
model.fit(train_data, steps_per_epoch=len(train_data), epochs=10, validation_data=test_data, validation_steps=len(test_data))

2.2 Regularization
Regularization adds a penalty term to the model's loss function that penalizes model complexity, thereby reducing the risk of overfitting. Common regularization methods include L1 regularization and L2 regularization.

The following is sample code for L2 regularization using the PyTorch library:

import torch
import torch.nn as nn

# Define the model
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(10, 10)
        self.fc2 = nn.Linear(10, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = torch.relu(x)
        x = self.fc2(x)
        return x

model = MyModel()

# Placeholder training data so the example runs (replace with a real dataset)
inputs = torch.randn(32, 10)
labels = torch.randn(32, 1)

# Define the loss function
criterion = nn.MSELoss()

# Define the optimizer; weight_decay is the coefficient of the L2 penalty
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)

# Train the model
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
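The weight_decay parameter only implements an L2 penalty. For the L1 regularization mentioned above, PyTorch's optimizers have no built-in switch; a minimal sketch (reusing the model, data, and loss from the example above, with an illustrative coefficient l1_lambda) is to add the penalty to the loss by hand:

# L1 regularization: add the sum of absolute parameter values to the loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # no weight_decay here
l1_lambda = 0.001  # illustrative penalty coefficient

for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(inputs)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    loss = criterion(outputs, labels) + l1_lambda * l1_penalty
    loss.backward()
    optimizer.step()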

2.3 Dropout
Dropout is a commonly used regularization technique that randomly drops neurons during training to reduce the risk of overfitting. Specifically, in each training iteration, each neuron is discarded with a certain probability p.

The following is sample code for Dropout using the TensorFlow library:

import tensorflow as tf
import numpy as np

# Define the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(10,)),
    tf.keras.layers.Dropout(0.5),  # dropout rate of 0.5
    tf.keras.layers.Dense(1)
])

# Compile the model
model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))

# Placeholder data so the example runs (replace with a real dataset)
x_train = np.random.rand(100, 10)
y_train = np.random.randint(0, 2, size=(100, 1))
x_test = np.random.rand(20, 10)
y_test = np.random.randint(0, 2, size=(20, 1))

# Train the model
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
  3. Summary
    Overfitting is a common problem in machine learning models, but several methods can mitigate it. Data augmentation, regularization, and Dropout are all commonly used to address overfitting. We can choose the appropriate method for a specific application scenario and further improve model performance through parameter tuning.
