Table of Contents
(1) Understanding the Logistic Regression (LR) classifier" > (1) Understanding the Logistic Regression (LR) classifier
(2) Mathematical derivation of logistic regression" > (2) Mathematical derivation of logistic regression
1, gradient descent method to solve logistic regression" >1, gradient descent method to solve logistic regression
2, vectorization Vectorization solution " >2, vectorization Vectorization solution
4, feature value normalization" >4, feature value normalization
5, algorithm optimization--stochastic gradient method" >5, algorithm optimization--stochastic gradient method
(3) Implementing Logistic Regression Algorithm in Python" > (3) Implementing Logistic Regression Algorithm in Python
machine learning Logistic regression" >machine learning Logistic regression
(4) Logistic Regression Application" > (4) Logistic Regression Application
related suggestion:" >related suggestion:
Home Backend Development Python Tutorial Python implements LR classic algorithm

Python implements LR classic algorithm

Apr 19, 2018 pm 04:55 PM
python

The example in this article describes the implementation of LR classic algorithm in Python. Share it with everyone for your reference, the details are as follows:

(1) Understanding the Logistic Regression (LR) classifier

First of all, although Logistic regression has "regression" in its name, it is actually a classification method, mainly used for two-classification problems. It uses the Logistic function (or Sigmoid function), and the independent variable value range is ( -INF, INF), the value range of the independent variable is (0,1), and the function form is:


Because the domain of the sigmoid function is (-INF, INF), and the value range is (0, 1). Therefore, the most basic LR classifier is suitable for classifying two categories (class 0, class 1) targets. The Sigmoid function is a very beautiful "S" shape, as shown in the following figure:


LR classifier ( The purpose of Logistic Regression Classifier is to learn a 0/1 classification model from the training data features - this model uses the linear combination of the sample features as the independent variable, and uses the logistic function to map the independent variable to (0,1). Therefore, the solution of the LR classifier is to solve a set of weights ( is a nominal variable--dummy, which is a constant. In actual engineering, x0=1.0 is often used. Regardless of whether the constant term is meaningful or not, it is best to retain it), And substitute it into the Logistic function to construct a prediction function:

The value of the function represents the probability that the result is 1, which means that the feature belongs to The probability of y=1. Therefore, the probabilities that the input x classification result is category 1 and category 0 are respectively:


##When we want to determine which class a new feature belongs to, we can find a z value according to the following formula:

(x1,x2,...,xn are the characteristics of a certain sample data, and the dimension is n)

And then find out---If it is greater than 0.5, it is Class y=1, otherwise it belongs to class y=0. (Note: It is still assumed that the statistical samples are uniformly distributed, so the threshold is set to 0.5). How to obtain this set of weights of the LR classifier? This requires the concepts of maximum likelihood estimation (MLE) and optimization algorithms. The most commonly used optimization algorithm in mathematics is the gradient ascent (descent) algorithm.

Logistic regression can also be used for multi-class classification, but binary classification is more commonly used and easier to explain. Therefore, the most commonly used method in practice is binary logistic regression. The LR classifier is suitable for data types: numerical and nominal data. Its advantage is that the calculation cost is not high and it is easy to understand and implement; its disadvantage is that it is easy to underfit and the classification accuracy may not be high.

(2) Mathematical derivation of logistic regression

1, gradient descent method to solve logistic regression

First of all, understanding the following mathematical derivation process requires more derivative solution formulas. You can refer to "Commonly used basic elementary function derivation formulas and integral formulas".

Assume that there are n observation samples, and the observation values ​​are Let be the probability of obtaining yi=1 under given conditions. The conditional probability of getting yi=0 under the same conditions is . Therefore, the probability of obtaining an observed value is

-----This formula is actually a comprehensive formula (1) to get

Because each observation is independent, their joint distribution can be expressed as the product of each marginal distribution:

## (m represents the number of statistical samples)                                                     #The above formula is called the likelihood function of n observations. Our goal is to be able to find parameter estimates that maximize the value of this likelihood function. Therefore, the key to maximum likelihood estimation is to find the parameters

so that the above formula obtains the maximum value.

Find the logarithm of the above function:


##Maximum likelihood estimation is to find θ when the above formula takes the maximum value. Here you can use the gradient ascent method to solve it. The obtained θ is the required optimal parameter. In Andrew Ng's course, J(θ) is taken as the following formula, that is: J(θ)=-(1/m)l(θ), and θ when J(θ) is at its minimum is the required optimal parameter. . Find the minimum value by gradient descent method. The initial values ​​of θ can all be 1.0, and the update process is:

(j represents the j-th attribute of the sample, n in total; a represents the step size--each move The amount can be freely specified)


## Therefore, θ ( The update process (all initial values ​​can be set to 1.0) can be written as:


(i represents the i-th statistical sample, j represents the j-th attribute of the sample; a represents the step size)

This formula will always be executed iteratively , until convergence is reached ( decreases in each iteration, if the value reduced at a certain step is less than a small value( is less than 0.001), then it is judged to be convergent) or until a certain stopping condition (such as the number of iterations reaches a certain specified value or the algorithm reaches a certain allowable error range).

2, vectorization Vectorization solution

##Vectorization uses matrix calculations instead of for loops to simplify the calculation process ,Improve efficiency. As shown in the above formula, Σ(...) is a summation process, which obviously requires a for statement to loop m times, so vectorization is not fully implemented at all. The vectorization process is introduced below:

The matrix form of the training data is agreed to be as follows. Each row of x is a training sample, and each column has a different special value:


The parameter A of g(A) is a column of vectors, so when implementing the g function Supports column vectors as arguments and returns column vectors. It can be seen from the above formula that hθ(x)-y can be calculated from g(A)-y in one step.

θThe update process can be changed to:



##To sum up, the steps for updating θ after Vectorization are as follows:

(1) Find A=X*θ(this is matrix multiplication, X is a (m, n 1)-dimensional vector, θ is a (n 1, 1)-dimensional column vector, A is a (m, 1)-dimensional vector)

## (2) Find E=g(A)-y (E, y are (m,1)-dimensional column vectors)

(3) Find

(a represents the step size )


3, the choice of step size a

a The value is also a key point to ensure the convergence of gradient descent. If the value is too small, the convergence will be slow, and if the value is too large, the iterative process will not be guaranteed to converge (passing the minimum value). To ensure that the gradient descent algorithm operates correctly, it is necessary to ensure that J(θ) decreases in each iteration. If the value of the step size a is correct, then J(θ) should become smaller and smaller. Therefore, the criterion for judging the value of a is: if J(θ) becomes smaller, it indicates that the value is correct, otherwise, reduce the value of a.

The experience of choosing step size a is: choose a value a, each time about 3 times the previous number. If the iteration cannot proceed normally (J increases, the step size is too large and it crosses the bottom of the bowl) ), consider using a smaller step size. If the convergence is slow, consider increasing the step size. Examples of a values ​​are:

…, 0.001, 0.003, 0.01, 0.03, 0.1 , 0.3, 1….

4, feature value normalization

Logistic regression is also a regression algorithm, multi-dimensional feature training data When using the gradient method to solve regression, the eigenvalues ​​must be scaled to ensure that the value range of the feature is within the same scale before the calculation process converges (because the eigenvalue range may be very different, for example, the value of feature 1 is (1000- 2000), the value of feature 2 is (0.1-0.2)). The method of feature scaling can be customized. Commonly used ones are:

1) mean normalization (or standardization)

(X - mean(X ))/std(X), std(X) represents the standard deviation of the sample

2) rescaling

(X - min) / (max - min)

5, algorithm optimization--stochastic gradient method

Gradient ascent (descent) algorithm The entire data set needs to be traversed every time the regression coefficients are updated. This method is OK when processing about 100 data sets, but if there are billions of samples and thousands of features, the computational complexity of this method will be Too high. An improved method is to update the regression coefficients with only one sample point at a time, which is called the stochastic gradient algorithm. Since the classifier can be incrementally updated when new samples arrive, it can complete parameter updates when new data arrives without the need to re-read the entire data set for batch processing. Therefore, the stochastic gradient algorithm is an online Learning algorithms. (Corresponding to "online learning", processing all data at once is called "batch processing"). The stochastic gradient algorithm is equivalent to the gradient algorithm but is more computationally efficient.

(3) Implementing Logistic Regression Algorithm in Python

The previous section used Andrew Ng’s course to learn about J(θ) =-(1/m)l(θ) is solved using the gradient descent method, which illustrates the solution process of logistic regression. In this section, the process of implementing the algorithm in Python still directly uses the gradient ascent method or the stochastic gradient ascent method to solve J(θ). LRTrain The object also implements the solution process using the gradient ascent method or the stochastic gradient ascent method.

The LR classifier learning package contains three modules: lr.py/object_json.py/test.py. The lr module implements the LR classifier through the object logisticRegres, supporting two solution methods: gradAscent('Grad') and randomGradAscent('randomGrad') (choose one of the two, classifierArray only stores one classification solution result, of course you can also define two classifierArray supports both solution methods).

The test module is an application that uses the LR classifier to predict the mortality of sick horses based on hernia symptoms. There is a problem with this data - the data has a 30% loss rate, which is replaced by the special value 0 because 0 will not affect the weight update of the LR classifier.

The partial missing of sample feature values ​​in the training data is a very thorny problem. Many literatures are devoted to solving this problem, because it is a pity to throw away the data directly and the cost of reacquisition is also expensive. Some optional methods for handling missing data include:

□ Use the mean of available features to fill missing values;

□ Use special values ​​to ±True complement missing values, such as -1;

□ Ignore samples with missing values;

□ Use the mean of similar samples to fill in missing values;

□ Use Additional machine learning algorithms predict missing values.

LR classifier algorithm learning package download address is:

machine learning Logistic regression

(4) Logistic Regression Application

The main purpose of Logistic Regression:

Looking for danger Factors: Looking for risk factors for a certain disease, etc.;

Prediction: According to the model, predict the probability of a certain disease or situation under different independent variables;

Discrimination: In fact, it is somewhat similar to prediction. It is also based on the model to determine the probability that someone belongs to a certain disease or a certain situation, that is, to see how likely this person is. Sex belongs to a certain disease.

Logistic regression is mainly used in epidemiology. A common situation is to explore the risk factors of a certain disease, predict the probability of a certain disease based on the risk factors, etc. For example, if you want to explore the risk factors for gastric cancer, you can choose two groups of people, one is the gastric cancer group, and the other is the non-gastric cancer group. The two groups of people must have different physical signs and lifestyles. The dependent variable here is whether it is gastric cancer, that is, "yes" or "no". The independent variables can include many, such as age, gender, eating habits, Helicobacter pylori infection, etc. The independent variables can be either continuous or categorical.


related suggestion:

Logistic regression ladder descent calculation maximum value

LR in-depth understanding

Brief description of the principles of common machine learning algorithms (LDA, CNN, LR)

LR derivation

The above is the detailed content of Python implements LR classic algorithm. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
Two Point Museum: All Exhibits And Where To Find Them
1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Is there any mobile app that can convert XML into PDF? Is there any mobile app that can convert XML into PDF? Apr 02, 2025 pm 08:54 PM

An application that converts XML directly to PDF cannot be found because they are two fundamentally different formats. XML is used to store data, while PDF is used to display documents. To complete the transformation, you can use programming languages ​​and libraries such as Python and ReportLab to parse XML data and generate PDF documents.

Is the conversion speed fast when converting XML to PDF on mobile phone? Is the conversion speed fast when converting XML to PDF on mobile phone? Apr 02, 2025 pm 10:09 PM

The speed of mobile XML to PDF depends on the following factors: the complexity of XML structure. Mobile hardware configuration conversion method (library, algorithm) code quality optimization methods (select efficient libraries, optimize algorithms, cache data, and utilize multi-threading). Overall, there is no absolute answer and it needs to be optimized according to the specific situation.

How to convert XML files to PDF on your phone? How to convert XML files to PDF on your phone? Apr 02, 2025 pm 10:12 PM

It is impossible to complete XML to PDF conversion directly on your phone with a single application. It is necessary to use cloud services, which can be achieved through two steps: 1. Convert XML to PDF in the cloud, 2. Access or download the converted PDF file on the mobile phone.

How to control the size of XML converted to images? How to control the size of XML converted to images? Apr 02, 2025 pm 07:24 PM

To generate images through XML, you need to use graph libraries (such as Pillow and JFreeChart) as bridges to generate images based on metadata (size, color) in XML. The key to controlling the size of the image is to adjust the values ​​of the <width> and <height> tags in XML. However, in practical applications, the complexity of XML structure, the fineness of graph drawing, the speed of image generation and memory consumption, and the selection of image formats all have an impact on the generated image size. Therefore, it is necessary to have a deep understanding of XML structure, proficient in the graphics library, and consider factors such as optimization algorithms and image format selection.

How to open xml format How to open xml format Apr 02, 2025 pm 09:00 PM

Use most text editors to open XML files; if you need a more intuitive tree display, you can use an XML editor, such as Oxygen XML Editor or XMLSpy; if you process XML data in a program, you need to use a programming language (such as Python) and XML libraries (such as xml.etree.ElementTree) to parse.

Recommended XML formatting tool Recommended XML formatting tool Apr 02, 2025 pm 09:03 PM

XML formatting tools can type code according to rules to improve readability and understanding. When selecting a tool, pay attention to customization capabilities, handling of special circumstances, performance and ease of use. Commonly used tool types include online tools, IDE plug-ins, and command-line tools.

What is the process of converting XML into images? What is the process of converting XML into images? Apr 02, 2025 pm 08:24 PM

To convert XML images, you need to determine the XML data structure first, then select a suitable graphical library (such as Python's matplotlib) and method, select a visualization strategy based on the data structure, consider the data volume and image format, perform batch processing or use efficient libraries, and finally save it as PNG, JPEG, or SVG according to the needs.

Is there a mobile app that can convert XML into PDF? Is there a mobile app that can convert XML into PDF? Apr 02, 2025 pm 09:45 PM

There is no APP that can convert all XML files into PDFs because the XML structure is flexible and diverse. The core of XML to PDF is to convert the data structure into a page layout, which requires parsing XML and generating PDF. Common methods include parsing XML using Python libraries such as ElementTree and generating PDFs using ReportLab library. For complex XML, it may be necessary to use XSLT transformation structures. When optimizing performance, consider using multithreaded or multiprocesses and select the appropriate library.

See all articles