This article shows how to use LazyPredict to build simple machine learning models. LazyPredict needs very little code and fits many models with their default parameters, without any manual tuning, so you can quickly see which of them performs best on your data.
!pip install lazypredict
from sklearn.datasets import load_breast_cancer
from lazypredict.Supervised import LazyClassifier

data = load_breast_cancer()
X = data.data
y = data.target
LazyClassifier(
    verbose=0,
    ignore_warnings=True,
    custom_metric=None,
    predictions=False,
    random_state=42,
    classifiers='all',
)
from lazypredict.Supervised import LazyClassifier
from sklearn.model_selection import train_test_split

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# build the lazyclassifier
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)

# fit it
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

# print the best models
print(models)
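# dictionary that maps each model name to its fitted pipeline
model_dictionary = clf.provide_models(X_train, X_test, y_train, y_test)

# inspect the pipeline of one model
model_dictionary['LGBMClassifier']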
Here we can see that SimpleImputer is applied to the entire dataset, followed by StandardScaler for the numeric features. This dataset has no categorical or ordinal features, but if it did, OneHotEncoder and OrdinalEncoder would be used for them respectively. The LGBMClassifier model receives the data after it has been imputed and transformed.
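The pipeline described above can be approximated with scikit-learn alone. This is a minimal sketch, not LazyPredict's own source: RandomForestClassifier stands in for LGBMClassifier so that no extra package is needed, and the step names ("imputer", "scaler", "numeric", "preprocessor", "classifier") are made up for illustration.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Numeric branch: fill missing values with the column mean, then standardize.
numeric_steps = Pipeline(
    steps=[
        ("imputer", SimpleImputer(strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

# Every column in this dataset is numeric, so one branch is enough.
# Categorical columns would get their own branch with an encoder.
numeric_columns = list(range(X.shape[1]))
preprocessor = ColumnTransformer(
    transformers=[("numeric", numeric_steps, numeric_columns)]
)

# Assumption: RandomForestClassifier is a stand-in for LGBMClassifier.
pipeline = Pipeline(
    steps=[
        ("preprocessor", preprocessor),
        ("classifier", RandomForestClassifier(random_state=42)),
    ]
)
pipeline.fit(X_train, y_train)

test_accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Because preprocessing and the estimator live in one Pipeline object, the imputer and scaler are fitted on the training split only and then reused unchanged on the test split.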
LazyClassifier builds on scikit-learn for fitting and evaluation. When you call its fit method, it automatically builds and fits a range of models on your data, including decision trees, random forests, and support vector machines. Each model is scored with a fixed set of metrics (accuracy, balanced accuracy, ROC AUC, and F1 score), plus any function you pass as custom_metric. The training set is used for fitting, and the test set is used for evaluation.
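As an illustration of the custom_metric parameter, here is a minimal sketch of a metric function. The function name specificity and the choice of 0 as the negative class are assumptions made for this example; the only requirement assumed of LazyPredict is that the function takes the true labels and the predicted labels and returns a number.

```python
def specificity(y_true, y_pred):
    """True-negative rate for binary labels, treating 0 as the negative class."""
    true_negatives = 0
    false_positives = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0:
            if predicted == 0:
                true_negatives += 1
            else:
                false_positives += 1
    negatives = true_negatives + false_positives
    # Return 0.0 when there are no negative samples to avoid dividing by zero.
    return true_negatives / negatives if negatives else 0.0


# Small hand-made example: 4 negatives, 3 of them predicted correctly.
y_true_demo = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred_demo = [0, 0, 0, 1, 1, 1, 0, 1]
print(specificity(y_true_demo, y_pred_demo))
```

Passing the function as LazyClassifier(custom_metric=specificity) should add one more column to the results table; check this against the LazyPredict version you have installed.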
After fitting and evaluating the models, LazyClassifier returns a summary of the results (the table printed by the code above), with one row per model and its performance metrics. Since there is no need to tune or select models by hand, you can quickly compare many models and shortlist the one that best fits your data.
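The fit-and-compare process described above can be sketched by hand for a few models. This is a simplified illustration rather than LazyPredict's implementation: the four candidates, the StandardScaler-only preprocessing, and ranking by balanced accuracy are all choices made for this sketch.

```python
import time

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A handful of candidates, all left at their default hyperparameters
# (max_iter is raised only so that LogisticRegression converges).
candidates = {
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=42),
    "RandomForestClassifier": RandomForestClassifier(random_state=42),
    "SVC": SVC(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

results = []
for name, estimator in candidates.items():
    model = make_pipeline(StandardScaler(), estimator)
    start = time.perf_counter()
    model.fit(X_train, y_train)        # fit on the training set
    y_pred = model.predict(X_test)     # evaluate on the test set
    results.append(
        {
            "model": name,
            "accuracy": accuracy_score(y_test, y_pred),
            "balanced_accuracy": balanced_accuracy_score(y_test, y_pred),
            "f1": f1_score(y_test, y_pred, average="weighted"),
            "seconds": time.perf_counter() - start,
        }
    )

# Rank the models, best first.
results.sort(key=lambda row: row["balanced_accuracy"], reverse=True)
best = results[0]

for row in results:
    print(
        f"{row['model']:<24} acc={row['accuracy']:.3f} "
        f"bal_acc={row['balanced_accuracy']:.3f} f1={row['f1']:.3f} "
        f"time={row['seconds']:.2f}s"
    )
```

A ranking like this is only a shortlist: all models run with default settings on a single train/test split, so the top few candidates still deserve tuning and cross-validation before one is chosen.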
The same can be done for regression with LazyRegressor. Let's load a dataset suited to a regression task, the Boston housing dataset. Note that load_boston was removed in scikit-learn 1.2, so the code below needs an older scikit-learn release.
Now, let’s use LazyRegressor to fit our data.
from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle
import numpy as np

# load the data (load_boston was removed in scikit-learn 1.2,
# so this call needs an older scikit-learn release)
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=0)
X = X.astype(np.float32)

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# fit the lazy object
reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

# print the results in a table
print(models)
The code execution results are as follows:
Here is a detailed look at the best regression model:
model_dictionary = reg.provide_models(X_train, X_test, y_train, y_test)
model_dictionary['ExtraTreesRegressor']
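Here we can see that SimpleImputer is applied to the entire dataset, followed by StandardScaler for the numeric features. This dataset has no categorical or ordinal features, but if it did, OneHotEncoder and OrdinalEncoder would be used for them respectively. The ExtraTreesRegressor model receives the data after it has been imputed and transformed.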
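To help read the regression table, here is a minimal sketch of the scores that LazyRegressor typically reports: R-squared, adjusted R-squared, and RMSE. The helper name regression_scores and the sample numbers are made up for illustration, and the formulas are the standard textbook definitions rather than code taken from LazyPredict.

```python
import math


def regression_scores(y_true, y_pred, n_features):
    """Return R-squared, adjusted R-squared, and RMSE for one model."""
    n = len(y_true)
    if n <= n_features + 1:
        raise ValueError("adjusted R-squared needs n > n_features + 1")
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalizes models that use more features.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    rmse = math.sqrt(ss_res / n)
    return {"r2": r2, "adjusted_r2": adjusted_r2, "rmse": rmse}


# Hand-made targets and predictions, with a hypothetical 2 input features.
y_true_demo = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred_demo = [2.5, 5.5, 7.0, 8.0, 11.5]
scores = regression_scores(y_true_demo, y_pred_demo, n_features=2)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Adjusted R-squared is always at most R-squared, and the gap widens as the number of features approaches the number of test samples, which is worth keeping in mind when the test set is small.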
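The LazyPredict library is a useful resource for anyone working in machine learning. By automating the creation and evaluation of models, it saves the time and effort of choosing one and makes model selection far more efficient. Because it can fit and evaluate many models in one run, LazyPredict offers a quick and simple way to compare several models and to determine which model family best suits our data and problem.
After reading this article, you should have an intuitive understanding of the LazyPredict library, and these concepts will help you build some truly valuable projects.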
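Translator: Cui Hao, 51CTO community editor and senior architect, with 18 years of experience in software development and architecture and 10 years of experience in distributed architecture.
Original title: LazyPredict: A Utilitarian Python Library to Shortlist the Best ML Models for a Given Use Case, by Sanjay Kumar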