Non-Linear Regression Analysis
If the data shows a curved trend, linear regression will produce less accurate results than non-linear regression because, as the name implies, linear regression presumes that the relationship in the data is linear.
How to check if a problem is linear or non-linear?
Inspect visually
Calculate the correlation coefficient between the independent and dependent variables; if it is greater than 0.7, the data has a linear tendency and is not a good fit for the non-linear case
If the model cannot be fitted accurately with linear parameters, switch to a non-linear method for better accuracy.
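The correlation check above can be sketched with NumPy. This is a minimal illustration on synthetic data, not the notebook's own code; the helper name `pearson_r` and the two sample curves are assumptions made for the example.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

x = np.linspace(0.0, 10.0, 50)
y_linear = 2.0 * x + 3.0        # exactly linear in x
y_curved = (x - 5.0) ** 2       # parabola, symmetric around x = 5

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(f"linear data: r = {r_linear:.3f}")
print(f"curved data: r = {r_curved:.3f}")
```

A high coefficient alone does not prove linearity (a gently curving monotonic trend can also score high), so the coefficient is best read together with a plot.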
How to model data?
Convert to linear model
Polynomial regression
Non-linear regression
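As a rough sketch of the first option, an exponential relationship can be converted to a linear model by taking logarithms. The data below is synthetic and the parameter values are illustrative assumptions; `np.polyfit` stands in for whichever linear regression routine is preferred.

```python
import numpy as np

# Synthetic data following y = a * exp(b * x); values are illustrative only
a_true, b_true = 2.0, 0.5
x = np.linspace(0.0, 4.0, 30)
y = a_true * np.exp(b_true * x)

# Taking logs gives ln(y) = ln(a) + b * x, which is linear in x,
# so an ordinary degree-1 least-squares fit recovers both parameters.
slope, intercept = np.polyfit(x, np.log(y), deg=1)
a_est = np.exp(intercept)
b_est = slope

print(f"a = {a_est:.3f}, b = {b_est:.3f}")
```

This only works when a transformation that linearizes the relationship exists and `y` stays positive; otherwise polynomial or general non-linear regression is the fallback.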
Importing required libraries
Though linear regression works well for many problems, it cannot be used for all datasets. First recall how linear regression models a dataset: it models a linear relation between a dependent variable $y$ and an independent variable $x$, using a simple equation of degree 1, for example $y = 2x + 3$.
Non-linear regression models a relationship between independent variables $x$ and a dependent variable $y$ that results in a non-linear function. Essentially, any relationship that is not linear can be termed non-linear, and it is usually represented by a polynomial of degree $k$ (the maximum power of $x$).
$$ y = a x^3 + b x^2 + c x + d $$
Non-linear functions can have elements like exponentials, logarithms, fractions, and others. For example:
$$ y = \log(x) $$
Or even more complicated, such as:
$$ y = \log(a x^3 + b x^2 + c x + d) $$
Let's take a look at a cubic function's graph.
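A minimal sketch of such a graph, assuming coefficients $a = b = c = 1$, $d = 3$ and some added Gaussian noise (all illustrative choices, as is the helper name `plot_cubic`):

```python
import numpy as np

def cubic(x, a=1.0, b=1.0, c=1.0, d=3.0):
    """Cubic polynomial a*x^3 + b*x^2 + c*x + d."""
    return a * x**3 + b * x**2 + c * x + d

x = np.arange(-5.0, 5.0, 0.1)
y = cubic(x)

# Add noise so the points look like measured data (fixed seed for repeatability)
rng = np.random.default_rng(0)
y_noisy = y + 20.0 * rng.normal(size=x.size)

def plot_cubic():
    """Draw the noisy samples and the underlying cubic curve."""
    import matplotlib.pyplot as plt
    plt.plot(x, y_noisy, "bo", label="noisy samples")
    plt.plot(x, y, "r", label="cubic curve")
    plt.xlabel("Independent variable")
    plt.ylabel("Dependent variable")
    plt.legend()
    plt.show()
```

Calling `plot_cubic()` in a notebook cell displays the figure.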
As you can see, this function has $x^3$ and $x^2$ as independent variables. Also, the graph of this function is not a straight line in the 2D plane, so it is a non-linear function.
Some other types of non-linear functions are:
Quadratic
$$ y = x^2 $$
Exponential
An exponential function with base $c$ is defined by
$$ y = a + b c^x $$
where $b \neq 0$, $c > 0$, $c \neq 1$, and $x$ is any real number. The base, $c$, is constant and the exponent, $x$, is a variable.
Logarithmic
The response $y$ is the result of applying a logarithmic map from the input $x$ to the output variable $y$. The simplest form is
$$ y = \log(x) $$
Note that instead of $x$ we can use $X$, which can be a polynomial representation of the $x$'s. In general form it would be written as
$$ y = \log(X) $$
Sigmoidal/Logistic
$$ y = a + \frac{b}{1 + c^{(x - d)}} $$
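The four families listed above can be written as small NumPy functions. The parameterizations and default values here are common textbook forms chosen for illustration, not something fixed by this notebook.

```python
import numpy as np

def quadratic(x):
    """y = x^2"""
    return x ** 2

def exponential(x, a=0.0, b=1.0, c=2.0):
    """y = a + b * c^x, with b != 0, c > 0 and c != 1."""
    return a + b * np.power(c, x)

def logarithmic(x):
    """y = log(x), defined for x > 0."""
    return np.log(x)

def sigmoidal(x, a=0.0, b=1.0, c=np.e, d=0.0):
    """y = a + b / (1 + c^(x - d)).

    With these defaults the curve falls from 1 toward 0 as x grows;
    a negative b (with a suitable offset a) gives a rising S-curve.
    """
    return a + b / (1.0 + np.power(c, x - d))
```

Evaluating each function on `np.linspace` values and plotting the result is a quick way to compare their shapes.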
In this notebook, we fit a non-linear model to the data points corresponding to Italy's GDP from 1960 to 2014.
Plotting the Dataset
This is what the data points look like. The curve resembles either a logistic or an exponential function. The growth starts off slow, then from 2005 onward it is very significant, and finally it decelerates slightly in the 2010s.
Choosing a model
From an initial look at the plot, we determine that the logistic function could be a good approximation, since it has the property of starting with slow growth, growing faster in the middle, and then slowing down again at the end, as illustrated below:
The formula for the logistic function is the following:
$$ \hat{Y} = \frac{1}{1 + e^{-\beta_1 (X - \beta_2)}} $$
$\beta_1$: Controls the curve's steepness,
$\beta_2$: Slides the curve on the x-axis.
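A minimal sketch of the logistic function as code. The names `beta_1` (steepness) and `beta_2` (horizontal shift) are assumed labels for the two parameters described above, and the increasing form with a negative exponent is used.

```python
import numpy as np

def sigmoid(x, beta_1, beta_2):
    """Logistic curve 1 / (1 + exp(-beta_1 * (x - beta_2))).

    beta_1 controls the steepness; beta_2 is the x position of the
    midpoint, where the curve reaches 0.5.
    """
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

x = np.linspace(-5.0, 5.0, 11)
gentle = sigmoid(x, 1.0, 0.0)   # moderate slope, centered at 0
steep = sigmoid(x, 4.0, 0.0)    # larger beta_1: sharper transition
shifted = sigmoid(x, 1.0, 2.0)  # larger beta_2: curve moved to the right
```

Comparing `gentle`, `steep`, and `shifted` shows how each parameter changes the curve.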
Building The Model
Now, let's build our regression model and initialize its parameters.
Let's look at a sample sigmoid line that might fit the data:
Our task here is to find the best parameters for our model. Let's first normalize our x and y:
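One simple way to normalize is to divide each variable by its maximum so both fall in a similar range; this choice is an assumption for the sketch. The arrays below are placeholder values, not real GDP figures, and the names `x_data` and `y_data` are hypothetical.

```python
import numpy as np

# Placeholder values standing in for the year and GDP columns
x_data = np.array([1960.0, 1980.0, 2000.0, 2014.0])
y_data = np.array([1.0, 5.0, 20.0, 40.0])  # arbitrary units

# Scale by the maximum so the largest value of each variable becomes 1
xdata = x_data / x_data.max()
ydata = y_data / y_data.max()

print(xdata)
print(ydata)
```

Keeping both variables near the unit range tends to make the optimizer's job easier, since the exponential in the sigmoid is sensitive to large inputs.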
How do we find the best parameters for our fit line?
We can use curve_fit, which uses non-linear least squares to fit our sigmoid function to the data. It returns the optimal parameter values, such that the sum of the squared residuals of sigmoid(xdata, *popt) - ydata is minimized.
popt holds our optimized parameters.
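The fitting step can be sketched with `scipy.optimize.curve_fit`. Because the dataset itself is not reproduced here, the example uses synthetic, already normalized data; the "true" parameters, the noise level, and the initial guess `p0` are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, beta_1, beta_2):
    """Logistic curve 1 / (1 + exp(-beta_1 * (x - beta_2)))."""
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

# Synthetic stand-in for the normalized data (55 points, like 1960 to 2014)
rng = np.random.default_rng(0)
xdata = np.linspace(0.0, 1.0, 55)
true_params = (12.0, 0.6)  # illustrative values only
ydata = sigmoid(xdata, *true_params) + 0.01 * rng.normal(size=xdata.size)

# popt holds the fitted (beta_1, beta_2); pcov is their estimated covariance
popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=(5.0, 0.5))

print(f"beta_1 = {popt[0]:.3f}, beta_2 = {popt[1]:.3f}")
```

The square roots of the diagonal of `pcov` give rough standard errors for the two parameters.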
Now we plot our resulting regression model.
Practice
Can you calculate the accuracy of our model?
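One possible approach, sketched on synthetic data rather than the real dataset: hold out part of the points, fit on the rest, and report the mean absolute error, the mean squared error, and the $R^2$ score on the held-out part. The 80/20 split, the parameter values, and the noise level are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, beta_1, beta_2):
    """Logistic curve 1 / (1 + exp(-beta_1 * (x - beta_2)))."""
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

# Synthetic normalized data standing in for the real observations
rng = np.random.default_rng(1)
xdata = np.linspace(0.0, 1.0, 55)
ydata = sigmoid(xdata, 12.0, 0.6) + 0.01 * rng.normal(size=xdata.size)

# Shuffle the indices, then keep 44 points for training and 11 for testing
idx = rng.permutation(xdata.size)
train_idx, test_idx = idx[:44], idx[44:]
train_x, train_y = xdata[train_idx], ydata[train_idx]
test_x, test_y = xdata[test_idx], ydata[test_idx]

# Fit on the training points only, then predict the held-out points
popt, _ = curve_fit(sigmoid, train_x, train_y, p0=(5.0, 0.5))
y_hat = sigmoid(test_x, *popt)

mae = np.mean(np.abs(y_hat - test_y))
mse = np.mean((y_hat - test_y) ** 2)
ss_res = np.sum((test_y - y_hat) ** 2)
ss_tot = np.sum((test_y - test_y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MAE = {mae:.4f}, MSE = {mse:.6f}, R2 = {r2:.4f}")
```

Evaluating on points the model never saw gives a more honest picture than scoring it on the same data it was fitted to.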