Python learning_curve function
This function determines cross-validated training and test scores for training sets of different sizes.
A cross-validation generator splits the entire data set k times into a training set and a test set.
Subsets of the training set, of different sizes, are used to train the estimator, and a score is computed for each size of training subset; scores on the test set are computed as well. For each training-subset size, these scores are then averaged over the k runs.
This function comes from the sklearn (scikit-learn) package:
import sklearn
from sklearn.learning_curve import learning_curve
In scikit-learn 0.18 and later, learning_curve lives in sklearn.model_selection instead (the old sklearn.learning_curve module was removed in version 0.20): from sklearn.model_selection import learning_curve
The calling format of this function is:
learning_curve(estimator, X, y, train_sizes=array([0.1, 0.325, 0.55, 0.775, 1.]), cv=None, scoring=None, exploit_incremental_learning=False, n_jobs=1, pre_dispatch='all', verbose=0)
estimator: the classifier (or regressor) that is used
X: array-like, shape (n_samples, n_features)
Training vector, where n_samples is the number of samples and n_features is the number of features.
y: array-like, shape (n_samples) or (n_samples, n_features), optional
Target relative to X for classification or regression.
train_sizes: array-like, shape (n_ticks,), dtype float or int
Relative or absolute numbers of training samples that will be used to generate the learning curve. If the dtype is float, the values are treated as fractions of the maximum size of the training set (which is determined by the selected validation method); otherwise they are interpreted as absolute sizes of the training sets. Note that for classification the sample size must be large enough to contain at least one sample from each class. An example call is sketched after this parameter list.
cv: int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy. Possible inputs are:
--None, to use the default 3-fold cross-validation
--an integer, to specify the number of folds
--an object to be used as a cross-validation generator
--an iterable yielding train/test splits
verbose: integer, optional
Controls verbosity: the higher the value, the more messages are printed.
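As a concrete illustration, the following minimal sketch calls learning_curve with float train_sizes and 5-fold cross-validation. The Iris data and the SVC classifier are only placeholders chosen for this example, not part of the function itself:
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import learning_curve  # sklearn.learning_curve in older versions

X, y = load_iris(return_X_y=True)   # example training vectors and targets (placeholder data)
estimator = SVC(kernel='linear')    # example classifier (placeholder)

# Five training-set sizes, given as fractions of the maximum training size,
# evaluated with 5-fold cross-validation
train_sizes_abs, train_scores, test_scores = learning_curve(
    estimator, X, y,
    train_sizes=[0.1, 0.325, 0.55, 0.775, 1.0],
    cv=5)

print(train_sizes_abs)       # absolute training-set sizes actually used
print(train_scores.shape)    # (n_ticks, n_cv_folds)
print(test_scores.shape)     # (n_ticks, n_cv_folds)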
Return values:
train_sizes_abs: array, shape = (n_unique_ticks,), dtype int
The numbers of training samples that were used to generate the learning curve. Because duplicate entries are removed, the number of ticks may be less than n_ticks.
train_scores : array, shape (n_ticks, n_cv_folds)
Scores on the training set
test_scores : array, shape (n_ticks, n_cv_folds)
Scores on the test set
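Continuing from the sketch above, the k fold scores for each training-set size are usually averaged before being inspected or plotted; the matplotlib plotting below is only an illustrative assumption, not part of learning_curve itself:
import numpy as np
import matplotlib.pyplot as plt

# Average the k fold scores for each training-set size (folds run along axis 1)
train_scores_mean = np.mean(train_scores, axis=1)
test_scores_mean = np.mean(test_scores, axis=1)

plt.plot(train_sizes_abs, train_scores_mean, 'o-', label='Training score')
plt.plot(train_sizes_abs, test_scores_mean, 'o-', label='Cross-validation score')
plt.xlabel('Number of training samples')
plt.ylabel('Score')
plt.legend(loc='best')
plt.show()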