The goal of training a machine learning or deep learning model is to obtain a model that generalizes well. This requires that the model does not overfit the training data set; in other words, it should also perform well on unseen data. Data augmentation is one of many ways to reduce overfitting.
Data augmentation is the process of expanding the amount and variety of data used to train a model. By training on more varied data, we can obtain a more "generalized" model. What does "more varied data" mean here? This article discusses only image data augmentation and introduces several image augmentation strategies in detail. We will also get hands-on and implement, using PyTorch, the augmentation techniques most commonly used for image data in computer vision.
Because the focus is on the augmentation techniques themselves, a single image is enough for the demonstrations. Let's look at the visualization code first.
import PIL.Image as Image
import torch
from torchvision import transforms
import matplotlib.pyplot as plt
import numpy as np
import warnings

def imshow(img_path, transform):
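The listing above stops at the signature of `imshow`, so its body is not shown. Below is a minimal sketch of what such a helper might look like, assuming it opens the image, applies the transform, and plots both versions side by side. The layout, the titles, and the returned figure are assumptions for illustration, not the author's code.

```python
import PIL.Image as Image
import matplotlib.pyplot as plt


def imshow(img_path, transform):
    """Show an image next to its transformed version (assumed behavior)."""
    img = Image.open(img_path)
    fig, ax = plt.subplots(1, 2, figsize=(15, 4))
    ax[0].set_title(f"Original image {img.size}")
    ax[0].imshow(img)
    # Assumption: `transform` is a callable that maps a PIL image to a PIL image.
    transformed = transform(img)
    ax[1].set_title(f"Transformed image {transformed.size}")
    ax[1].imshow(transformed)
    # Returning the figure is a convenience added in this sketch.
    return fig
```

In a notebook the figure renders automatically; in a script, calling `plt.show()` afterwards displays it.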
Resize adjusts the height and width of the image to a specific size. The code below resizes the image from its original size to 224 x 224.
path = './kitten.jpeg'
transform = transforms.Resize((224, 224))
imshow(path, transform)
Cropping selects a portion of the image and returns it as a new image. For example, CenterCrop returns a crop taken from the center of the image.
transform = transforms.CenterCrop((224, 224))
imshow(path, transform)
RandomResizedCrop combines cropping and resizing: it crops a random region of the image and then resizes that region to the given size.
transform = transforms.RandomResizedCrop((100, 300))
imshow(path, transform)
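Flipping mirrors the image horizontally or vertically. The code below uses RandomHorizontalFlip, which flips the image horizontally with a default probability of 0.5, so the output may also be left unchanged.
transform = transforms.RandomHorizontalFlip()
imshow(path, transform)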
Padding adds a specified number of pixels to the edges of the image. Here we pad each edge with 50 pixels.
transform = transforms.Pad((50, 50, 50, 50))
imshow(path, transform)
RandomRotation rotates the image by a random angle. Passing 15 means the angle is sampled from the range -15 to +15 degrees.
transform = transforms.RandomRotation(15)
imshow(path, transform)
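RandomAffine applies a random affine transformation that keeps the center of the image fixed. It takes several parameters, including the rotation range in degrees, the translation, the scale, the shear, and the fill color for the area outside the transformed image.
# Recent torchvision versions call this argument fill (formerly fillcolor); 8-bit channel values go up to 255.
transform = transforms.RandomAffine(1, translate=(0.5, 0.5), scale=(1, 1), shear=(1, 1), fill=(255, 255, 255))
imshow(path, transform)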
GaussianBlur blurs the image with a Gaussian kernel. Here the kernel size is 7 and the standard deviation (sigma) is 3.
transform = transforms.GaussianBlur(7, 3)
imshow(path, transform)
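Grayscale converts a color image to grayscale. With num_output_channels=3 the result still has three channels, all carrying the same values.
transform = transforms.Grayscale(num_output_channels=3)
imshow(path, transform)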
Color augmentation, also known as color jittering, is the process of modifying the color properties of an image by changing its pixel values. The following methods are all color-related operations.
Brightness: changing the brightness makes the resulting image darker or lighter than the original.
transform = transforms.ColorJitter(brightness=2)
imshow(path, transform)
Contrast is the degree of difference between the darkest and lightest parts of an image. It can also be adjusted as an augmentation.
transform = transforms.ColorJitter(contrast=2)
imshow(path, transform)
Saturation is defined as the separation of colors in an image.
transform = transforms.ColorJitter(saturation=20)
imshow(path, transform)
Hue is defined as the shade of the colors in an image.
# hue must be between 0 and 0.5
transform = transforms.ColorJitter(hue=0.5)
imshow(path, transform)
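Variations of the images themselves help the model generalize to unseen data, so that it does not overfit the training data. The techniques summarized above are the commonly used ones; torchvision contains many more methods, which can be found in its documentation: https://pytorch.org/vision/stable/transforms.html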