Introduction

Polynomial regression is a method of least-squares curve fitting. It takes a set of data and produces an approximation. More specifically, it produces the coefficients of a polynomial that approximates the curve. The degree (and with it the number of coefficients) determines how accurately the curve can be fit. A degree of zero is a simple mean. First degree is the same as linear regression. Second and higher degrees produce nonlinear (curved) fits.
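The two lowest degrees can be worked out by hand, which gives a useful check on any fit. The sketch below is only an illustration: the helper names fitMean and fitLine are hypothetical, they are not part of the PolynomialRegression class, and they use native floats rather than BC Math.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of the
// PolynomialRegression class and use native floats rather than BC Math.

// Degree zero: the best constant fit is the mean of the y-values.
function fitMean( array $points )
{
  $sum = 0.0;
  foreach ( $points as $point )
    $sum += $point[ 1 ];

  return $sum / count( $points );
}

// First degree: closed-form least-squares line y = intercept + slope * x.
// Assumes at least two points with different x-values.
function fitLine( array $points )
{
  $n = count( $points );
  $sumX = $sumY = $sumXX = $sumXY = 0.0;
  foreach ( $points as $point )
  {
    list( $x, $y ) = $point;
    $sumX  += $x;
    $sumY  += $y;
    $sumXX += $x * $x;
    $sumXY += $x * $y;
  }

  $slope     = ( $n * $sumXY - $sumX * $sumY ) / ( $n * $sumXX - $sumX * $sumX );
  $intercept = ( $sumY - $slope * $sumX ) / $n;

  return array( 'intercept' => $intercept, 'slope' => $slope );
}

// Points that lie exactly on y = 1 + 2 x.
$points = array( array( 0, 1 ), array( 1, 3 ), array( 2, 5 ), array( 3, 7 ) );
$mean = fitMean( $points );
$line = fitLine( $points );

echo "Mean : $mean\n";
echo "Line : {$line['intercept']} + {$line['slope']} x\n";
```

For these four points the mean is 4 and the fitted line recovers an intercept of 1 and a slope of 2.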

Polynomial regression requires PHP 5.0 or above, since the code is entirely object-oriented. PHP must be compiled with the BC Math library, which is standard with most builds of PHP.

Note from the author

The use of BC arbitrary precision arithmetic is almost always necessary for regression of degrees higher than 4, or for data sets with thousands of points, because the numbers simply get huge.
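A rough way to see the scale of the problem: least-squares polynomial fitting is commonly set up with sums of high powers of the x-values. The source does not describe the class's internals, so treating it this way is an assumption. The sketch below sums x^8 over 5000 integer x-values, once with native floats and once with BC Math. The result has far more digits than the 15 to 17 significant digits a double-precision float can hold, so the float total is only an approximation.

```php
<?php
// Illustration only; this is not code from the PolynomialRegression class.
// Sum x^8 for x = 1..5000 with native floats and with BC Math strings.
$floatSum = 0.0;
$bcSum    = '0';
for ( $x = 1; $x <= 5000; ++$x )
{
  // pow() falls back to a float once the result no longer fits an integer.
  $floatSum += pow( $x, 8 );

  // BC Math works on decimal strings and keeps every digit.
  $bcSum = bcadd( $bcSum, bcpow( (string)$x, '8' ) );
}

echo "Float : " . sprintf( '%.0f', $floatSum ) . "\n";
echo "BC    : $bcSum\n";
echo "Digits: " . strlen( $bcSum ) . "\n";
```

Comparing the two printed totals shows where the float version runs out of precision.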

Manual

Documentation is available online, generated from the source code using phpDocumentor.

Download

Current versions

Version 0.91, Released May 18, 2013.

BZip 2 (Linux)

Download release 0.91

MD5: cc8284107dae45b60cd5e07fe6152991

SHA1: a3027e7bb9783d06e6f506712aaf4158db8baf52

PGP signature

Zip (Others)

Download release 0.91

MD5: be854d254ffd844d20df4c6676cad80b

SHA1: 5b8e59d218968cdf8f593ef1dad4b3ca5541ebbf

PGP signature

Archived versions

Version 0.9, Released June 16, 2012.

BZip 2 (Linux)

Download release 0.9

MD5: ea3aacc18c1b3086df502823704639d2

SHA1: 69038bd1414777e0e936ea8701a85cecb363ebc1

PGP signature

Zip (Others)

Download release 0.9

MD5: 4193ddd4323277b67acdca3e59dda1f8

SHA1: 3491860264c8caa077fd4ea9517c8ebc139d8b60

PGP signature

Version 0.8, Released June 1, 2009.

BZip 2 (Linux)

Download release 0.8

MD5: 602a40cfba4de4751edf409a7c3e0854

SHA1: c5495d644e32e7a20f0af812675da6eb96619485

PGP signature

Zip (Others)

Download release 0.8

MD5: 17ceb039ff4474dcbf86adac8f103753

SHA1: e8e887a8cd504c32cc8c37c4c2b02364b6b44ee3

PGP signature

Examples

Linear Regression.

<?php
  // Load the polynomial regression class.
  require_once( 'RootDirectory.inc.php' );
  require_once( $RootDirectory . 'Includes/PolynomialRegression/PolynomialRegression.php' );

  // Data as ( x, y ) pairs.
  $data =
    array
    (
      array( 0.00, 27.3834562958158 ), array( 0.02, 38.2347360741764 ),
      array( 0.04, 42.5632501679666 ), array( 0.06, 19.4638760104114 ),
      array( 0.08, 42.690858098909  ), array( 0.10, 25.330634164557  ),
      array( 0.12, 49.6507591632989 ), array( 0.14, 34.3502467856792 ),
      array( 0.16, 52.5267153107089 ), array( 0.18, 34.5528919545231 ),
      array( 0.20, 44.3220950255077 ), array( 0.22, 44.7805694031715 ),
      array( 0.24, 32.9090525820585 ), array( 0.26, 56.7941323051778 ),
      array( 0.28, 48.7192221569495 ), array( 0.30, 48.7964850888813 ),
      array( 0.32, 56.8905173101315 ), array( 0.34, 66.0107252116092 ),
      array( 0.36, 74.3149331561425 ), array( 0.38, 52.9076168019644 ),
      array( 0.40, 64.3463647026162 ), array( 0.42, 50.0776706625628 ),
      array( 0.44, 62.3527806092493 ), array( 0.46, 75.9589658430523 ),
      array( 0.48, 69.280743962744  ), array( 0.50, 74.4868159870338 ),
      array( 0.52, 76.4548504742096 ), array( 0.54, 82.9347555390181 ),
      array( 0.56, 83.9546576353049 ), array( 0.58, 83.6379624022705 ),
      array( 0.60, 92.6278811310654 ), array( 0.62, 84.3395153143048 ),
      array( 0.64, 86.832363003336  ), array( 0.66, 105.66563124607  ),
      array( 0.68, 100.175129109663 ), array( 0.70, 82.0781941886623 ),
      array( 0.72, 95.9916212989616 ), array( 0.74, 87.5853932119967 ),
      array( 0.76, 93.5435091554247 ), array( 0.78, 98.0622114645327 ),
      array( 0.80, 118.067000253198 ), array( 0.82, 98.2918886287489 ),
      array( 0.84, 111.027863906934 ), array( 0.86, 113.1135947538   ),
      array( 0.88, 117.777915259186 ), array( 0.90, 108.621331147219 ),
      array( 0.92, 112.979639159754 ), array( 0.94, 122.065499190418 ),
      array( 0.96, 116.136221596622 ), array( 0.98, 111.215762010712 ),
      array( 1.00, 122.743302375187 )
    );

  // Precision digits in BC math.
  bcscale( 10 );

  // Start a regression class of order 2 (two coefficients: linear regression).
  $PolynomialRegression = new PolynomialRegression( 2 );

  // Add all the data to the regression analysis.
  foreach ( $data as $dataPoint )
    $PolynomialRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );

  // Get coefficients for the polynomial, indexed by power of x.
  $coefficients = $PolynomialRegression->getCoefficients();

  // Print slope and intercept of linear regression.
  echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "<br />";
  echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "<br />";

?>

In this example, 51 data points are used to construct a linear regression. The slope and y-intercept of the trend line are then displayed.

The image above was created in a spreadsheet with the data points from the example. The linear regression trend line is displayed, along with the trend line's function.

Slope : 95.75
Y-intercept : 26.55

This is the output from the example. Note how the slope and intercept values match those of the function in the spreadsheet-created chart.

Third order polynomial.

<?php
  // Load the polynomial regression class.
  require_once( 'RootDirectory.inc.php' );
  require_once( $RootDirectory . 'Includes/PolynomialRegression/PolynomialRegression.php' );

  // Data created in a spreadsheet with some random scatter.  True function should be:
  //   f( x ) = 50 x^2 + 20 x + 1
  $data =
    array
    (
      array( 0.0,  0.7763 ),
      array( 0.1,  3.6976 ),
      array( 0.2,  7.1799 ),
      array( 0.3, 11.4383 ),
      array( 0.4, 17.1449 ),
      array( 0.5, 23.3614 ),
      array( 0.6, 31.6998 ),
      array( 0.7, 39.0121 ),
      array( 0.8, 48.3967 ),
      array( 0.9, 57.8717 ),
    );

  // Precision digits in BC math.
  bcscale( 10 );

  // Start a regression class of order 3 (three coefficients, up to x^2).
  $PolynomialRegression = new PolynomialRegression( 3 );

  // Add all the data to the regression analysis.
  foreach ( $data as $dataPoint )
    $PolynomialRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );

  // Get coefficients for the polynomial, indexed by power of x.
  $coefficients = $PolynomialRegression->getCoefficients();

  // Print the true function.
  echo "f( x ) = 1 + 20 x + 50 x<sup>2</sup><br />";

  // Print the generated function.
  echo "g( x ) = ";
  foreach ( $coefficients as $power => $coefficient )
  {
    // Convert to floating point value (truncates long values).
    $smallCoefficient = round( floatval( $coefficient ), 3 );

    // Add a plus or minus sign before every term except the first.
    if ( $power > 0 )
    {
      if ( $smallCoefficient >= 0 )
        echo " + ";
      else
      {
        echo " - ";
        $smallCoefficient = -$smallCoefficient;
      }
    }

    // Display coefficient and power.
    echo "$smallCoefficient";

    if ( $power == 1 )
      echo " x";

    if ( $power > 1 )
      echo " x<sup>$power</sup>";
  }
  echo "<br />";

  // Get positions along the line from 0 to 1 in steps of 0.1.
  for ( $x = 0; $x <= 1; $x += 0.1 )
  {
    $y = round( $PolynomialRegression->interpolate( $coefficients, $x ), 3 );

    echo "( $x, $y )<br />";
  }

?>

This example starts with the knowledge that the data was generated by a function that is a 3rd order polynomial (three coefficients, with x^2 as the highest power).

This example takes an initial data set of 10 points that is close to the function f( x ) = 1 + 20 x + 50 x^2, with some random error added. The regression analysis attempts to reconstruct the original function. It prints the resulting function and then its value at 11 positions from x = 0 to x = 1.

f( x ) = 1 + 20 x + 50 x^2
g( x ) = 0.726 + 23.434 x + 44.866 x^2
( 0, 0.726 )
( 0.1, 3.518 )
( 0.2, 7.207 )
( 0.3, 11.794 )
( 0.4, 17.278 )
( 0.5, 23.659 )
( 0.6, 30.938 )
( 0.7, 39.114 )
( 0.8, 48.187 )
( 0.9, 58.158 )
( 1, 69.025 )

The resulting function and data points are similar to the original function.
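Each listed point is the fitted polynomial evaluated at one x-value. How the class's interpolate method does this internally is not stated in the source. The sketch below is one common approach, Horner's method, written with native floats. The function name evaluatePolynomial is hypothetical and not part of the class.

```php
<?php
// Hypothetical helper; the name and the float arithmetic are illustrative only.
// $coefficients is indexed by power of x, lowest power first.
function evaluatePolynomial( array $coefficients, $x )
{
  $result = 0.0;

  // Horner's method: start at the highest power and work down.
  for ( $power = count( $coefficients ) - 1; $power >= 0; --$power )
    $result = $result * $x + $coefficients[ $power ];

  return $result;
}

// The true function from the example: f( x ) = 1 + 20 x + 50 x^2.
$value = evaluatePolynomial( array( 1, 20, 50 ), 0.5 );
echo $value . "\n"; // prints 23.5
```

The true function gives 23.5 at x = 0.5, close to the fitted value of 23.659 listed above.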

Calculating R-Squared.

<?php
  // Load the polynomial regression class.
  require_once( 'RootDirectory.inc.php' );
  require_once( $RootDirectory . 'Includes/PolynomialRegression/PolynomialRegression.php' );

  // Data as ( x, y ) pairs.
  $data =
    array
    (
      array( 0.00, 27.3834562958158 ), array( 0.02, 38.2347360741764 ),
      array( 0.04, 42.5632501679666 ), array( 0.06, 19.4638760104114 ),
      array( 0.08, 42.690858098909  ), array( 0.10, 25.330634164557  ),
      array( 0.12, 49.6507591632989 ), array( 0.14, 34.3502467856792 ),
      array( 0.16, 52.5267153107089 ), array( 0.18, 34.5528919545231 ),
      array( 0.20, 44.3220950255077 ), array( 0.22, 44.7805694031715 ),
      array( 0.24, 32.9090525820585 ), array( 0.26, 56.7941323051778 ),
      array( 0.28, 48.7192221569495 ), array( 0.30, 48.7964850888813 ),
      array( 0.32, 56.8905173101315 ), array( 0.34, 66.0107252116092 ),
      array( 0.36, 74.3149331561425 ), array( 0.38, 52.9076168019644 ),
      array( 0.40, 64.3463647026162 ), array( 0.42, 50.0776706625628 ),
      array( 0.44, 62.3527806092493 ), array( 0.46, 75.9589658430523 ),
      array( 0.48, 69.280743962744  ), array( 0.50, 74.4868159870338 ),
      array( 0.52, 76.4548504742096 ), array( 0.54, 82.9347555390181 ),
      array( 0.56, 83.9546576353049 ), array( 0.58, 83.6379624022705 ),
      array( 0.60, 92.6278811310654 ), array( 0.62, 84.3395153143048 ),
      array( 0.64, 86.832363003336  ), array( 0.66, 105.66563124607  ),
      array( 0.68, 100.175129109663 ), array( 0.70, 82.0781941886623 ),
      array( 0.72, 95.9916212989616 ), array( 0.74, 87.5853932119967 ),
      array( 0.76, 93.5435091554247 ), array( 0.78, 98.0622114645327 ),
      array( 0.80, 118.067000253198 ), array( 0.82, 98.2918886287489 ),
      array( 0.84, 111.027863906934 ), array( 0.86, 113.1135947538   ),
      array( 0.88, 117.777915259186 ), array( 0.90, 108.621331147219 ),
      array( 0.92, 112.979639159754 ), array( 0.94, 122.065499190418 ),
      array( 0.96, 116.136221596622 ), array( 0.98, 111.215762010712 ),
      array( 1.00, 122.743302375187 )
    );

  // Precision digits in BC math.
  bcscale( 10 );

  // Start a regression class of order 2 (two coefficients: linear regression).
  $leastSquareRegression = new PolynomialRegression( 2 );

  // Add all the data to the regression analysis.
  foreach ( $data as $dataPoint )
    $leastSquareRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );

  // Get coefficients for the polynomial, indexed by power of x.
  $coefficients = $leastSquareRegression->getCoefficients();

  // Print slope and intercept of linear regression.
  echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "<br />\n";
  echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "<br />\n";

  //
  // Get average of Y-data.
  //
  $Y_Average = 0.0;
  foreach ( $data as $dataPoint )
    $Y_Average += $dataPoint[ 1 ];

  $Y_Average /= count( $data );

  //
  // Calculate R Squared.
  //

  $Y_MeanSum  = 0.0;
  $Y_ErrorSum = 0.0;
  foreach ( $data as $dataPoint )
  {
    $x = $dataPoint[ 0 ];
    $y = $dataPoint[ 1 ];

    // Squared distance between the data and the fitted curve.
    $error  = $y;
    $error -= $leastSquareRegression->interpolate( $coefficients, $x );
    $Y_ErrorSum += $error * $error;

    // Squared distance between the data and the mean.
    $error  = $y;
    $error -= $Y_Average;
    $Y_MeanSum += $error * $error;
  }

  $R_Squared = 1.0 - ( $Y_ErrorSum / $Y_MeanSum );

  echo "R Squared : $R_Squared<br />\n";

?>

This example shows how to compute the coefficient of determination (generally called R-Squared) after the coefficients have been determined. This value is one measure of the goodness of fit.

Slope : 95.75
Y-intercept : 26.55
R Squared : 0.92618245728437
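In symbols, the example's loop accumulates two sums of squares and combines them. The notation here is introduced only for this note: N is the number of data points, g is the fitted polynomial, and y-bar is the mean of the y-values, which the example stores in $Y_Average. The numerator corresponds to $Y_ErrorSum and the denominator to $Y_MeanSum.

```latex
\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i ,
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{N} \bigl( y_i - g(x_i) \bigr)^2}
               {\sum_{i=1}^{N} \bigl( y_i - \bar{y} \bigr)^2}
```

A value near 1 means the fitted curve accounts for most of the variation in the data, while a value near 0 means it does little better than the mean alone.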

License

This software is free, open-source software released under the GNU license.

Author

The polynomial regression class is written and maintained by Andrew Que. To get in touch with Andrew Que, visit his contact page.