How to use the scikit-learn module for machine learning in Python 2.x
Introduction:
Machine learning is a discipline that studies how to enable computers to learn from data and improve their own performance. scikit-learn is a Python-based machine learning library that provides many machine learning algorithms and tools to make machine learning easier and more efficient.
This article will introduce how to use the scikit-learn module for machine learning in Python 2.x and provide sample code.
1. Install the scikit-learn module
First, make sure that Python 2.x is installed. Note that scikit-learn 0.20 was the last release to support Python 2.7, so Python 2.x users need that version or an earlier one. You can then install the scikit-learn module with pip:
pip install -U scikit-learn
After the installation is complete, you can start using the scikit-learn module for machine learning.
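To confirm that the installation worked, a quick check such as the following may help. It is a minimal sketch that only assumes scikit-learn can be imported as sklearn. The __future__ import is there so that print behaves the same on Python 2.x and Python 3.

```python
from __future__ import print_function  # print() as a function on Python 2.x

import sklearn

# If this import succeeds, the module is installed; show which version we got.
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```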
2. Load a data set
In machine learning, we usually need to load and process data sets. scikit-learn ships with several built-in data sets that can be used directly. The following example uses the iris data set:
from sklearn.datasets import load_iris

# Load the built-in iris data set
iris = load_iris()
X, y = iris.data, iris.target
In the above code, we use the load_iris() function to load the iris data set, then store the input features in the variable X and the corresponding labels in the variable y.
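Before going further, it can be useful to look at what was loaded. The sketch below prints the dimensions of the data and the names stored on the object returned by load_iris(). The attributes feature_names and target_names are part of that object but are not used elsewhere in this article.

```python
from __future__ import print_function

from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# X is a 2-D array (samples x features); y holds one integer label per sample.
print("Feature matrix shape:", X.shape)
print("Label vector shape:", y.shape)

# Human-readable names for the columns of X and the integer classes in y.
print("Feature names:", list(iris.feature_names))
print("Class names:", list(iris.target_names))
```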
3. Divide the data set
Before training a machine learning model, we need to divide the data set into a training set and a test set. scikit-learn provides the train_test_split function for this purpose.
from sklearn.model_selection import train_test_split

# Hold out 20% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
In the above code, we divide the data set into a training set and a test set. test_size=0.2 means that the test set makes up 20% of the data, and random_state=42 fixes the random seed so that every run produces the same split.
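The effect of the two parameters can be checked directly. The following sketch splits the iris data twice with the same seed, prints the resulting sizes, and compares the two test sets. The variable names first, second and same_split are only illustrative.

```python
from __future__ import print_function

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Two independent calls with the same seed.
first = train_test_split(X, y, test_size=0.2, random_state=42)
second = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_test, y_train, y_test = first

# 150 samples in total: 80% go to training, 20% to testing.
print("Training samples:", X_train.shape[0])
print("Test samples:", X_test.shape[0])

# With the same random_state, both calls select exactly the same test rows.
same_split = np.array_equal(first[1], second[1])
print("Identical test sets:", same_split)
```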
4. Select a model
In machine learning, we can choose different models to train on our data sets. In scikit-learn, each model has a corresponding class, and we select a model by creating an instance of that class.
Taking the support vector machine (SVM) as an example, we use the SVC class to create an instance of the SVM model:
from sklearn.svm import SVC

model = SVC()
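Because every model is a class, switching models or settings comes down to creating a different instance. The sketch below shows the default SVC next to two alternatives. The hyperparameter values and the choice of KNeighborsClassifier are assumptions made for illustration, not recommendations from this article.

```python
from __future__ import print_function

from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Default SVM classifier, as in the text.
model = SVC()
print("Default kernel:", model.get_params()["kernel"])

# Same class, different hyperparameters (values chosen only for illustration).
linear_model = SVC(kernel="linear", C=1.0)

# A different algorithm is just a different class with the same interface.
knn_model = KNeighborsClassifier(n_neighbors=5)

for candidate in (model, linear_model, knn_model):
    # Check that each classifier exposes fit, predict and score.
    has_api = all(hasattr(candidate, name) for name in ("fit", "predict", "score"))
    print(type(candidate).__name__, "has fit/predict/score:", has_api)
```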
5. Train the model
Once a model is selected, we can train it on the training data set.
model.fit(X_train, y_train)
In the above code, we use the fit method to train the model, taking the training data X_train and the corresponding labels y_train as input.
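Training changes the model object in place. As a rough illustration, the sketch below repeats the earlier steps so that it runs on its own, then prints a few attributes that SVC sets during fitting (classes_, n_support_ and support_vectors_).

```python
from __future__ import print_function

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

model = SVC()
fitted = model.fit(X_train, y_train)  # fit() returns the estimator itself

# Attributes ending in an underscore are set by fit().
print("Classes seen during training:", model.classes_)
print("Support vectors per class:", model.n_support_)
print("Support vector matrix shape:", model.support_vectors_.shape)
```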
6. Model evaluation
After the training is completed, we need to use the test data set to evaluate the performance of the model.
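score = model.score(X_test, y_test)
# str.format keeps the output identical on Python 2.x and 3 (print("a", b) prints a tuple on 2.x)
print("Model accuracy: {}".format(score))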
In the above code, we use the score method to calculate the accuracy of the model on the test data set and print the result.
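For a classifier, score reports accuracy, so the same number can be obtained from the predictions with accuracy_score. The sketch below shows this and also prints a confusion matrix, which is an addition to the steps in this article and breaks the result down by class.

```python
from __future__ import print_function

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
model = SVC().fit(X_train, y_train)

# Accuracy computed from the predictions matches model.score().
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy via accuracy_score:", accuracy)
print("Accuracy via model.score:", model.score(X_test, y_test))

# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(matrix)
```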
7. Model prediction
Finally, we can use the trained model to make predictions.
y_pred = model.predict(X_test)
print("Predictions: {}".format(y_pred))
In the above code, we use the predict method to predict labels for the test data set and print the predictions.
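The same predict call works for samples that were never part of the data set. In the sketch below, the two rows in new_samples are hypothetical measurements made up for illustration, and the predicted integer labels are mapped back to class names through iris.target_names.

```python
from __future__ import print_function

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
model = SVC().fit(X_train, y_train)

# Hypothetical flowers: one row per sample, four features per row,
# in the same column order as iris.feature_names.
new_samples = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [6.7, 3.0, 5.2, 2.3],
])

predicted_labels = model.predict(new_samples)
# Index the array of class names with the integer labels.
predicted_names = iris.target_names[predicted_labels]

for sample, name in zip(new_samples, predicted_names):
    print(sample, "->", name)
```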
Summary:
In this article, we learned how to use the scikit-learn module for machine learning in Python 2.x. We covered the basic steps of loading a data set, dividing it into training and test sets, selecting a model, training the model, evaluating it, and making predictions, with a code example for each step.
I hope this article is helpful to you as you learn machine learning and use the scikit-learn module. I wish you progress in your studies and success in mastering machine learning!