Lasso regression example in Python

WBOY

Released: 2023-06-10 20:52:55

Lasso regression is a popular linear regression method in machine learning. It fits a model while shrinking the coefficients of irrelevant features toward zero, often to exactly zero, which effectively removes those features from the model. This article shows how to run Lasso regression in Python and demonstrates it on a real data set.

Introduction to Lasso Regression

Lasso regression extends ordinary least squares by adding a penalty term to the objective function. The penalty uses L1 regularization (also called the Lasso penalty), and the objective to be minimized has the following form:

$J(\beta)=\frac{1}{2n}\sum_{i=1}^{n}\left(y_i-\sum_{j=1}^{p}X_{ij}\beta_j\right)^2+\alpha\sum_{j=1}^{p}|\beta_j|$

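Here $y$ is the response variable, $X$ is the matrix of independent variables, $\beta$ is the vector of model coefficients, $n$ is the number of samples, $p$ is the number of features, and $\alpha$ is the penalty parameter. The objective is convex, but the absolute-value penalty is not differentiable at zero, so in general the minimizer has no closed-form solution and plain gradient methods do not apply directly. This is the difficult part of Lasso regression.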
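
To make the two terms of the objective concrete, here is a minimal sketch that evaluates $J(\beta)$ with NumPy. The function name `lasso_objective` and the tiny made-up data are assumptions for illustration and are not part of the original article.

```python
import numpy as np

def lasso_objective(X, y, beta, alpha):
    """Return (1 / (2n)) * sum of squared residuals + alpha * sum(|beta_j|)."""
    n = X.shape[0]
    residual = y - X @ beta
    squared_error = residual @ residual / (2 * n)
    l1_penalty = alpha * np.abs(beta).sum()
    return squared_error + l1_penalty

# Made-up, hand-checkable numbers (illustration only).
X_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
y_demo = np.array([1.0, 2.0])
beta_demo = np.array([1.0, 1.0])

# Residuals are [0, 1], so the squared-error term is 1 / (2 * 2) = 0.25;
# the penalty term is 0.5 * (|1| + |1|) = 1.0.
objective_demo = lasso_objective(X_demo, y_demo, beta_demo, alpha=0.5)
print(objective_demo)  # 1.25
```

Raising `alpha` increases the cost of every non-zero coefficient, which is what pushes the fitted coefficients toward zero.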

One way to fit Lasso regression is the coordinate descent (CD) algorithm. The basic idea is that each step updates only one coefficient while the others are held fixed. The resulting one-dimensional problem has a closed-form solution (a soft-thresholding step), so the CD algorithm sidesteps the non-differentiability of the penalty term.
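
The following is a minimal sketch of coordinate descent for the objective above, written with NumPy only. It assumes no intercept and a fixed number of sweeps instead of a convergence test. The helper names (`soft_threshold`, `lasso_coordinate_descent`) and the synthetic data are illustrative assumptions; this is not how scikit-learn implements its solver internally.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0 if |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimize (1/(2n)) * ||y - X @ beta||^2 + alpha * ||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * ||x_j||^2 for each column
    residual = y - X @ beta                # current residual (equals y at start)
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # an all-zero column cannot help
            residual += X[:, j] * beta[j]  # remove feature j's contribution
            rho = X[:, j] @ residual / n   # correlation with partial residual
            beta[j] = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * beta[j]  # add the updated contribution back
    return beta

# Synthetic data (an assumption for illustration): only features 0 and 2 matter.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
true_beta = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y_demo = X_demo @ true_beta + 0.1 * rng.normal(size=100)

alpha_demo = 0.1
beta_hat = lasso_coordinate_descent(X_demo, y_demo, alpha_demo)
print(beta_hat)
```

Each update solves the one-coefficient problem exactly. Coefficients whose correlation with the partial residual stays below `alpha` in magnitude are set to exactly zero, which is where the sparsity of Lasso comes from.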

Python Lasso Regression Implementation

Python has many machine learning libraries. One of them, scikit-learn, makes it straightforward to run Lasso regression.

First, import the required libraries as follows:

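import numpy as np
from sklearn.linear_model import LassoCV
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires scikit-learn < 1.2.
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler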

Next, we load the Boston housing price data set and standardize its features to zero mean and unit variance:

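boston = load_boston()
X = boston.data    # feature matrix: 506 samples, 13 features
y = boston.target  # median house value (in $1000s)
# Standardize so the L1 penalty treats all coefficients on the same scale
X = StandardScaler().fit_transform(X)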
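
To see what the standardization step does, here is a small sketch on made-up data with two features on very different scales. The array `X_raw` and its values are assumptions for illustration and are not taken from the Boston data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two made-up features: one around 50 with spread 10, one around 0.5 with spread 0.01.
X_raw = np.column_stack([
    rng.normal(50.0, 10.0, size=200),
    rng.normal(0.5, 0.01, size=200),
])

X_scaled = StandardScaler().fit_transform(X_raw)

# After scaling, every column has mean ~0 and standard deviation ~1.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

Because the L1 penalty charges the same price per unit of every coefficient, features measured on small scales (which need large coefficients) would otherwise be penalized more heavily than features on large scales.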

Then we fit the model with scikit-learn's LassoCV, which runs cross-validation over the candidate values and selects the best $\alpha$ automatically.

lasso_reg = LassoCV(alphas=np.logspace(-3, 3, 100), cv=5, max_iter=100000)
lasso_reg.fit(X, y)
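
Since `load_boston` is not available in recent scikit-learn releases, the same LassoCV call can be tried on synthetic data. The sketch below uses `make_regression` as a stand-in data set, which is an assumption for illustration and not the data used in this article, so the numbers it prints will differ from the output shown later.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 10 features, of which only 3 influence the target.
X_syn, y_syn = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)
X_syn = StandardScaler().fit_transform(X_syn)

# Same settings as in the article: 100 candidate alphas, 5-fold cross-validation.
alphas = np.logspace(-3, 3, 100)
lasso_cv = LassoCV(alphas=alphas, cv=5, max_iter=100000)
lasso_cv.fit(X_syn, y_syn)

n_selected = int(np.count_nonzero(lasso_cv.coef_))
print("Best alpha:", lasso_cv.alpha_)
print("Non-zero coefficients:", n_selected, "of", lasso_cv.coef_.size)
```

The selected `alpha_` is always one of the values in the grid passed as `alphas`, so the grid should span a range wide enough to contain a good penalty.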

Finally, we print the selected $\alpha$ value and the model coefficients:

print('Best alpha:', lasso_reg.alpha_)
print('Model coefficients:', lasso_reg.coef_)

Full code example:

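import numpy as np
from sklearn.linear_model import LassoCV
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires scikit-learn < 1.2.
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler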

boston = load_boston()
X = boston.data
y = boston.target
X = StandardScaler().fit_transform(X)

lasso_reg = LassoCV(alphas=np.logspace(-3, 3, 100), cv=5, max_iter=100000)
lasso_reg.fit(X, y)

print('Best alpha:', lasso_reg.alpha_)
print('Model coefficients:', lasso_reg.coef_)

The output is as follows:

Best alpha: 0.10000000000000002
Model coefficients: [-0.89521162  1.08556604  0.14359222  0.68736347 -2.04113155  2.67946138
  0.01939491 -3.08179223  2.63754058 -2.05806301 -2.05202597  0.89812875
 -3.73066641]

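This shows how Lasso regression, with the penalty chosen by cross-validation, yields a model for predicting Boston house prices. Because the inputs are standardized, the magnitude of each coefficient indicates how strongly the corresponding feature influences the predicted price. In the output above no coefficient is exactly zero, so at the selected penalty all 13 features stay in the model, although some (such as the seventh, 0.019) are shrunk close to zero.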
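
One way to read the printed coefficients is to rank them by absolute value. The sketch below copies the coefficient values from the output above. The feature names are an assumption: they follow the usual column order of the Boston data set (what `boston.feature_names` would return) and are not printed anywhere in this article.

```python
import numpy as np

# Assumed column order of the Boston data set (not shown in the article).
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Coefficients copied from the output printed above.
coef = np.array([-0.89521162, 1.08556604, 0.14359222, 0.68736347,
                 -2.04113155, 2.67946138, 0.01939491, -3.08179223,
                 2.63754058, -2.05806301, -2.05202597, 0.89812875,
                 -3.73066641])

# Sort features from largest to smallest absolute coefficient.
order = np.argsort(-np.abs(coef))
ranked = [(feature_names[i], float(coef[i])) for i in order]
n_zero = int(np.sum(coef == 0.0))

for name, value in ranked:
    print(f"{name:8s} {value: .4f}")
print("Coefficients exactly zero:", n_zero)
```

Under the assumed column order, the largest magnitudes belong to LSTAT and DIS and the smallest to AGE. A coefficient of exactly zero would mean Lasso dropped that feature entirely.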

Conclusion

This article showed how to run Lasso regression in Python and demonstrated the method on a real data set. Lasso regression is a useful linear regression technique, particularly for high-dimensional data in which many features may be irrelevant. In practice, standardizing the features and choosing the penalty by cross-validation help the model perform well and identify the most relevant features.