Simple linear regression implemented in PHP

Jul 21, 2016, 02:52 PM
In Part 1 of this two-part series ("Simple Linear Regression in PHP"), I explained why math libraries are useful for PHP. I also demonstrated how to develop and implement core parts of a simple linear regression algorithm using PHP as the implementation language.

The goal of this article is to show you how to use the SimpleLinearRegression class discussed in Part 1 to build an important data research tool.

Brief Review: Concepts

The basic goal behind simple linear regression modeling is to find the best-fitting straight line through a two-dimensional plane of paired X and Y values (that is, X and Y measurements). Once the line is found using the least squares method, various statistical tests can be performed to determine how well the line accounts for the observed deviations in the Y values.

The linear equation (y = mx + b) has two parameters that must be estimated from the X and Y data provided: the slope (m) and the y-intercept (b). Once these two parameters are estimated, you can plug observed X values into the linear equation and examine the Y predictions it generates.

To estimate the m and b parameters with the least squares method, you need to find the values of m and b that minimize the difference between the observed and predicted values of Y across all X values. The difference between an observed and a predicted value is called the error, y_i - (m x_i + b). If you square each error value and then sum these squared errors, the result is the sum of squared prediction errors. Using the least squares method to determine the best fit involves finding the estimates of m and b that minimize this sum.
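As a rough illustration of the quantity being minimized, the following sketch computes the sum of squared errors for a candidate line. The function name sumSquaredErrors() and the data are made up for this example and are not part of the SimpleLinearRegression class.

```php
<?php
// Hypothetical helper, not part of the SimpleLinearRegression class.
// Returns the sum of squared errors for a candidate line y = m*x + b.
function sumSquaredErrors(array $x, array $y, float $m, float $b): float
{
    $sse = 0.0;
    foreach ($x as $i => $xi) {
        $error = $y[$i] - ($m * $xi + $b); // observed minus predicted
        $sse  += $error * $error;
    }
    return $sse;
}

// Made-up data lying exactly on y = 2x + 1.
$x = [1, 2, 3, 4];
$y = [3, 5, 7, 9];

echo sumSquaredErrors($x, $y, 2.0, 1.0), "\n"; // the true line: 0
echo sumSquaredErrors($x, $y, 1.0, 1.0), "\n"; // a worse candidate: 30
```

The smaller the returned value, the closer the candidate line runs to the observed points.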

Two basic approaches can be used to find estimates of m and b that satisfy the least squares criterion. In the first, you use a numerical search procedure that tries different values of m and b and evaluates them, ultimately settling on the estimates that yield the smallest sum of squared errors. The second is to use calculus to derive equations for estimating m and b. I'm not going to get into the calculus involved in deriving these equations, but I did use these analytical equations in the SimpleLinearRegression class to find the least squares estimates of m and b (see the getSlope() and getYIntercept() methods of the SimpleLinearRegression class).
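The analytical approach can be sketched with the standard closed-form least squares equations. This is a stand-alone illustration under that assumption; leastSquaresLine() is a hypothetical name, and the class's getSlope() and getYIntercept() methods may be organized differently.

```php
<?php
// Hypothetical stand-alone function using the textbook closed-form
// least squares equations; not the author's class code.
function leastSquaresLine(array $x, array $y): array
{
    $n     = count($x);
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0; // sum of cross-products of deviations
    $sxx = 0.0; // sum of squared X deviations
    foreach ($x as $i => $xi) {
        $sxy += ($xi - $meanX) * ($y[$i] - $meanY);
        $sxx += ($xi - $meanX) ** 2;
    }

    $m = $sxy / $sxx;          // slope
    $b = $meanY - $m * $meanX; // y-intercept
    return ['slope' => $m, 'intercept' => $b];
}

// Made-up data lying exactly on y = 2x + 1.
$fit = leastSquaresLine([1, 2, 3, 4], [3, 5, 7, 9]);
printf("y = %.2fx + %.2f\n", $fit['slope'], $fit['intercept']);
```

Unlike a numerical search, this computes the estimates in a single pass over the data, with no starting guesses or stopping rule to choose.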

Even if you have equations that can be used to find the least squares estimates of m and b, it does not follow that plugging these parameters into a linear equation will give you a straight line that fits the data well. The next step in the simple linear regression process is to determine whether the remaining prediction error is acceptable.

You can use a statistical decision procedure to evaluate the alternative hypothesis that a straight line fits the data. The procedure is based on computing the T statistic and then using a probability function to find the probability of observing a value that large by chance. As mentioned in Part 1, the SimpleLinearRegression class generates a number of summary values. One of the important ones is the T statistic, which measures how well the linear equation fits the data. If the fit is good, the T statistic tends to be large; if the T value is small, you should replace your linear equation with a default model that assumes the mean of the Y values is the best predictor (because the average of a set of values can often be a useful predictor of the next observation).

To test whether the T statistic is large enough to justify not using the mean of Y as the best predictor, you need to calculate the probability of obtaining the T statistic by chance. If that probability is low, you can reject the null hypothesis that the mean is the best predictor and, accordingly, be more confident that a simple linear model is a good fit to the data. (See Part 1 for more information on calculating the probability of a T statistic.)
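For concreteness, here is a sketch of one common form of the T statistic in simple linear regression: the estimated slope divided by its standard error, with n - 2 degrees of freedom. The function slopeTStatistic() and the data are assumptions made for illustration; Part 1's class may compute its T value differently.

```php
<?php
// Hypothetical helper; Part 1's class may organize this differently.
function slopeTStatistic(array $x, array $y): float
{
    $n     = count($x);
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0;
    $sxx = 0.0;
    foreach ($x as $i => $xi) {
        $sxy += ($xi - $meanX) * ($y[$i] - $meanY);
        $sxx += ($xi - $meanX) ** 2;
    }
    $m = $sxy / $sxx;
    $b = $meanY - $m * $meanX;

    // Sum of squared errors around the fitted line.
    $sse = 0.0;
    foreach ($x as $i => $xi) {
        $sse += ($y[$i] - ($m * $xi + $b)) ** 2;
    }

    $mse         = $sse / ($n - 2);   // error variance, n - 2 degrees of freedom
    $stdErrSlope = sqrt($mse / $sxx); // standard error of the slope
    return $m / $stdErrSlope;
}

// Made-up data for illustration.
$t = slopeTStatistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]);
printf("T = %.3f with %d degrees of freedom\n", $t, 5 - 2);
```

The resulting T value and its degrees of freedom are what a Student T probability function needs in order to return the probability of seeing a value that large by chance.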

Back to the statistical decision procedure. It tells you when to reject the null hypothesis, but it does not tell you whether to accept the alternative hypothesis. In a research setting, the linear-model alternative hypothesis needs to be established through theoretical and statistical arguments.

The data research tool you will build implements the statistical decision procedure for linear models (the T test) and provides summary data that can be used to construct the theoretical and statistical arguments needed to establish a linear model. The tool can be classified as a decision support tool for knowledge workers studying patterns in small to medium-sized data sets.

From a learning perspective, simple linear regression modeling is worth studying because it is a gateway to understanding more advanced forms of statistical modeling. For example, many core concepts in simple linear regression lay a good foundation for understanding multiple regression, factor analysis, and time series analysis.

Simple linear regression is also a versatile modeling technique. It can be used to model curvilinear data by transforming the raw data (usually with a logarithmic or power transformation). These transformations linearize the data so that it can be modeled using simple linear regression. The resulting linear model will be represented as a linear formula related to the transformed values.
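A minimal sketch of the idea, assuming data that follow a power curve y = a * x^k: taking logarithms of both X and Y gives log(y) = log(a) + k * log(x), which is a straight line in the transformed values. The helper fitLine() and the data are hypothetical.

```php
<?php
// Hypothetical helper: ordinary least squares fit of y = m*x + b.
// Returns [slope, intercept].
function fitLine(array $x, array $y): array
{
    $n     = count($x);
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;
    $sxy   = 0.0;
    $sxx   = 0.0;
    foreach ($x as $i => $xi) {
        $sxy += ($xi - $meanX) * ($y[$i] - $meanY);
        $sxx += ($xi - $meanX) ** 2;
    }
    $m = $sxy / $sxx;
    return [$m, $meanY - $m * $meanX];
}

// Made-up curvilinear data generated from y = 3 * x^2.
$x = [1, 2, 3, 4];
$y = [3, 12, 27, 48];

// Linearize with a log transformation of both variables, then fit.
[$k, $logA] = fitLine(array_map('log', $x), array_map('log', $y));

// Express the fitted line back in the original units.
printf("y = %.2f * x^%.2f\n", exp($logA), $k);
```

The slope of the fitted line is the exponent k, and the intercept is log(a), so the formula has to be back-transformed before it is read in the original units.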

Probability functions

In the previous article, I got around the problem of implementing probability functions in PHP by calling R to find the probability value. I wasn't completely satisfied with this solution, so I started researching the question: what would it take to develop probability functions in PHP?

I started looking online for information and code. One source for both is Probability Functions in the book Numerical Recipes in C. I reimplemented some of the probability function code (the gammln.c and betai.c functions) in PHP, but I still wasn't satisfied with the results. The code seemed a bit longer than some other implementations. Additionally, I needed inverse probability functions.

Luckily, I stumbled upon John Pezzullo’s Interactive Statistical Calculation. John's website on Probability Distribution Functions has all the functions I need, implemented in JavaScript to make learning easier.

I ported the Student T and Fisher F functions to PHP. I changed the API a bit to conform to Java naming style and embedded all functions into a class called Distribution. A great feature of this implementation is the doCommonMath method, which is reused by all functions in this library. Other tests that I didn't bother to implement (normality test and chi-square test) also use the doCommonMath method.

Another aspect of this port is also worth noting. In JavaScript, you can assign dynamically computed values to instance variables, for example:

var PiD2 = pi() / 2

You cannot do this in PHP. Only simple constant values can be assigned to instance variables. Hopefully this limitation will be addressed in PHP 5.
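One common workaround is to assign the computed value in the constructor. The sketch below uses a hypothetical class name and current PHP syntax. In current PHP a property default must still be a constant expression, so a function call such as pi() is not allowed there, although a constant expression such as M_PI / 2 is.

```php
<?php
// Hypothetical class, for illustration only.
class HalfPiHolder
{
    // A property default must be a constant expression,
    // so no function call is allowed here.
    public $piD2;

    public function __construct()
    {
        // Computed values can be assigned once the object is being built.
        $this->piD2 = pi() / 2;
    }
}

$holder = new HalfPiHolder();
printf("%.4f\n", $holder->piD2); // 1.5708
```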

Note that the code in Listing 1 does not define instance variables, because in the JavaScript version they are assigned dynamically computed values.

Listing 1. Implementing the probability functions


    // [beginning of listing not preserved]
          ... $this->doCommonMath($cth * $cth, 2, $df - 3, -1)) / (pi()/2);
        } else {
          return 1 - $sth * $this->doCommonMath($cth * $cth, 1, $df - 3, -1);
        }
      }

      function getInverseStudentT($p, $df) {
        $v  = 0.5;
        $dv = 0.5;
        $t  = 0;
        while($dv > 1e-6) {
          $t  = (1 / $v) - 1;
          $dv = $dv / 2;
          if ( $this->getStudentT($t, $df) > $p) {
            $v = $v - $dv;
          } else {
            $v = $v + $dv;
          }
        }
        return $t;
      }

      function getFisherF($f, $n1, $n2) {
        // implemented but not shown
      }

      function getInverseFisherF($p, $n1, $n2) {
        // implemented but not shown
      }

    }
    ?>
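Because the listing as preserved here starts partway through getStudentT(), the following self-contained sketch may help readers who want something runnable. It assumes the port mirrors the structure of the JavaScript routines it was derived from: the body of doCommonMath() and the first half of getStudentT() are reconstructions made under that assumption, not the author's verified code, and the class name DistributionSketch is hypothetical.

```php
<?php
// Sketch only: doCommonMath() and the opening of getStudentT() are
// reconstructed on the assumption that the port mirrors the JavaScript
// original; they are not copied from the author's Distribution class.
class DistributionSketch
{
    // Sums the series shared by the distribution functions.
    function doCommonMath($q, $i, $j, $b)
    {
        $zz = 1;
        $z  = $zz;
        $k  = $i;
        while ($k <= $j) {
            $zz = $zz * $q * $k / ($k - $b);
            $z  = $z + $zz;
            $k  = $k + 2;
        }
        return $z;
    }

    // Two-tailed probability of a T value at least as extreme as $t,
    // given $df degrees of freedom.
    function getStudentT($t, $df)
    {
        $t  = abs($t);
        $w  = $t / sqrt($df);
        $th = atan($w);
        if ($df == 1) {
            return 1 - $th / (pi() / 2);
        }
        $sth = sin($th);
        $cth = cos($th);
        if (($df % 2) == 1) {
            return 1 - ($th + $sth * $cth
                * $this->doCommonMath($cth * $cth, 2, $df - 3, -1)) / (pi() / 2);
        }
        return 1 - $sth * $this->doCommonMath($cth * $cth, 1, $df - 3, -1);
    }
}

$dist = new DistributionSketch();
printf("%.4f\n", $dist->getStudentT(1.0, 1));     // 0.5000
printf("%.4f\n", $dist->getStudentT(sqrt(2), 2)); // 0.2929
```

Both printed values agree with the closed-form two-tailed probabilities for 1 and 2 degrees of freedom, which is a quick sanity check on the reconstruction.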

Graphic output

The output methods you have implemented so far all display summary values in HTML format. It would also be useful to display scatter plots or line plots of the data in GIF, JPEG, or PNG format.

Rather than writing the code to generate line and scatter plots myself, I thought it would be better to use a PHP-based graphics library called JpGraph. JpGraph is being actively developed by Johan Persson, whose project website describes it this way:

Whether it’s a “quick and dirty” graph with minimal code, or a complex professional graph that requires very fine-grained control, JpGraph makes drawing them simple. JpGraph is equally suitable for scientific and business type graphs.

The JpGraph distribution includes a number of example scripts that can be customized to your specific needs. Using JpGraph in the data research tool was as simple as finding a sample script that did something similar to what I needed and adapting it.

The script in Listing 3 is extracted from the sample data exploration tool (explore.php) and demonstrates how to call the library and populate the Line and Scatter classes with data from the SimpleLinearRegression analysis. The comments in this code were written by Johan Persson (who does a great job documenting the JpGraph codebase).

Listing 3. Details of functions from the sample data research tool explore.php


    // [beginning of listing not preserved]
    $graph->SetScale("linlin");

    // Setup title
    $graph->title->Set("$title");
    $graph->img->SetMargin(50,20,20,40);
    $graph->xaxis->SetTitle("$x_name","center");
    $graph->yaxis->SetTitleMargin(30);
    $graph->yaxis->title->Set("$y_name");

    $graph->title->SetFont(FF_FONT1,FS_BOLD);

    // make sure that the X-axis is always at the
    // bottom of the plot and not just at Y=0 which is
    // the default position
    $graph->xaxis->SetPos('min');

    // Create the scatter plot with some nice colors
    $sp1 = new ScatterPlot($slr->Y, $slr->X);
    $sp1->mark->SetType(MARK_FILLEDCIRCLE);
    $sp1->mark->SetFillColor("red");
    $sp1->SetColor("blue");
    $sp1->SetWeight(3);
    $sp1->mark->SetWidth(4);

    // Create the regression line
    $lplot = new LinePlot($slr->PredictedY, $slr->X);
    $lplot->SetWeight(2);
    $lplot->SetColor('navy');

    // Add the plots to the graph
    $graph->Add($sp1);
    $graph->Add($lplot);

    // ... and stroke
    $graph_name = "temp/test.png";
    $graph->Stroke($graph_name);
    ?>

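Data research script

The data research tool consists of a single script (explore.php) that calls methods of the SimpleLinearRegressionHTML class and the JpGraph library.

The script uses simple processing logic. The first part of the script performs basic validation on the submitted form data. If the form data passes validation, the second part of the script is executed.

The second part of the script contains the code that analyzes the data and displays the summary results in HTML and graphic formats. Listing 4 shows the basic structure of the explore.php script:

Listing 4. Structure of explore.php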


    // [beginning of listing not preserved, up to the echo of the page title]
      echo "$title";

      $slr->showTableSummary($x_name, $y_name);
      echo "<br><br>";

      $slr->showAnalysisOfVariance();
      echo "<br><br>";

      $slr->showParameterEstimates($x_name, $y_name);
      echo "<br>";

      $slr->showFormula($x_name, $y_name);
      echo "<br><br>";

      $slr->showRValues($x_name, $y_name);
      echo "<br>";

      include ("jpgraph/jpgraph.php");
      include ("jpgraph/jpgraph_scatter.php");
      include ("jpgraph/jpgraph_line.php");

      // The code for displaying the graphics is inline in the
      // explore.php script. The code for these two line plots
      // finishes off the script:

      // Omitted code for displaying scatter plus line plot
      // Omitted code for displaying residuals plot

    }
    ?>
