Lasso regression is a linear regression model that performs feature selection. By adding an L1 regularization term to the loss function, it can shrink the coefficients of some features exactly to 0, thereby removing those features from the model. Below, I will describe the method of lasso regression in detail and provide an example with the corresponding Python code.
The loss function of Lasso Regression is:
L(\beta)=\frac{1}{2n}\sum_{i=1}^{n}\left(y_{i}-\sum_{j=1}^{p}x_{ij}\beta_{j}\right)^{2}+\lambda\sum_{j=1}^{p}|\beta_{j}|
Here, n is the number of samples, p is the number of features, y_{i} is the label of the i-th sample, x_{ij} is the j-th feature value of the i-th sample, \beta_{j} is the coefficient of the j-th feature, and \lambda is the regularization strength. The purpose of regularization is to prevent overfitting and control the complexity of the model by penalizing the feature coefficients. The larger \lambda is, the more strongly the model penalizes the coefficients, which drives more of them to exactly 0 and thus reduces the number of features in the model. Through regularization we can retain the features that have the most impact on the prediction while discarding unnecessary ones, which simplifies the model and improves its generalization ability. With \lambda chosen, the optimization goal of lasso regression is:
\hat{\beta}=\arg\min_{\beta}\,\frac{1}{2n}\sum_{i=1}^{n}\left(y_{i}-\sum_{j=1}^{p}x_{ij}\beta_{j}\right)^{2}+\lambda\sum_{j=1}^{p}|\beta_{j}|
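To see the effect of \lambda concretely, here is a minimal sketch (the alpha values are illustrative choices of mine, not tuned values; sklearn's alpha parameter plays the role of \lambda in exactly this objective) that fits lasso at several regularization strengths on the diabetes data used later in this article and counts the surviving coefficients:

# Sketch: how the regularization strength affects sparsity.
# The alpha values below are illustrative, not tuned.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
import numpy as np

X, y = load_diabetes(return_X_y=True)
for alpha in [0.01, 0.1, 1.0]:
    lasso = Lasso(alpha=alpha).fit(X, y)
    n_nonzero = np.sum(lasso.coef_ != 0)
    print(f"alpha={alpha}: {n_nonzero} nonzero coefficients out of {X.shape[1]}")

Larger alpha values should leave fewer nonzero coefficients, which is the shrinking behavior described above.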
Lasso regression can be solved with coordinate descent or with least angle regression (LARS). Coordinate descent is an iterative optimization method that updates one coefficient at a time while holding the others fixed, cycling through the coefficients until convergence. LARS is a stepwise method that builds the solution incrementally, at each step moving the coefficients in the direction of the feature most correlated with the current residual; it can compute the entire regularization path efficiently.
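To make the coordinate descent idea concrete, here is a minimal, didactic sketch (my own names and simplifications, not sklearn's optimized solver) that cycles through the coefficients and applies the soft-thresholding update implied by the objective above:

# Didactic coordinate descent for the lasso objective above,
# assuming X is an (n, p) numpy array and y a length-n array.
import numpy as np

def soft_threshold(rho, lam):
    # Soft-thresholding operator: shrinks rho toward 0 by lam,
    # returning exactly 0 when |rho| <= lam.
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove feature j's current contribution
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            z = (X[:, j] ** 2).sum() / n
            # Update one coefficient while holding the others fixed
            beta[j] = soft_threshold(rho, lam) / z
    return beta

Each inner step solves the one-dimensional lasso problem for \beta_{j} exactly, which is why coordinates that contribute little end up at exactly 0.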
Below we demonstrate the feature selection effect of lasso regression on a real data set. We use the diabetes dataset from sklearn, which contains 10 features and a response variable for 442 diabetic patients; our goal is to select the most important features using lasso regression.
# Import the dataset and required libraries
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
import numpy as np
import matplotlib.pyplot as plt

# Load the diabetes dataset
diabetes = load_diabetes()

# Split the data into training and test sets
X_train = diabetes.data[:300]
y_train = diabetes.target[:300]
X_test = diabetes.data[300:]
y_test = diabetes.target[300:]

# Fit the lasso regression model
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

# Print the coefficient of each feature
print("lasso.coef_:", lasso.coef_)

# Plot the coefficient of each feature
plt.plot(range(diabetes.data.shape[1]), lasso.coef_)
plt.xticks(range(diabetes.data.shape[1]), diabetes.feature_names, rotation=60)
plt.ylabel("Coefficients")
plt.show()
Running the above code prints the coefficient of each feature and draws the coefficient plot. Lasso regression compresses the coefficients of several features exactly to 0, which indicates that those features contribute little to the model and can be eliminated. The features with the largest absolute coefficients are the most important ones for predicting the response.
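To turn the plot into an explicit feature list, a small follow-up sketch (reusing the lasso and diabetes objects fitted above) can select the feature names with nonzero coefficients:

# List which features lasso kept, assuming `lasso` and `diabetes`
# from the code above are still in scope.
import numpy as np

selected = np.array(diabetes.feature_names)[lasso.coef_ != 0]
print("Selected features:", selected)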
Lasso regression is a very effective feature selection method: by adjusting the regularization strength, we control how many features are kept and how strongly their coefficients are shrunk. In practical applications, we can use cross-validation to select the optimal regularization strength and thus achieve better model performance and feature selection.
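sklearn provides LassoCV for exactly this. Here is a minimal sketch (cv=5 is an illustrative choice of mine, not a recommendation from the text) that selects alpha by cross-validation:

# Choose the regularization strength by cross-validation.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV

X, y = load_diabetes(return_X_y=True)
lasso_cv = LassoCV(cv=5).fit(X, y)
print("Best alpha:", lasso_cv.alpha_)
print("Coefficients:", lasso_cv.coef_)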