Introduction to Machine Learning (Part-7)

Linear Regression : In linear regression, we are restricted to a straight-line model, the regression line. The value of y can be the same for different values of x, so we treat y given x as a random variable. This can be represented as follows.

Assuming Gaussian noise, the conditional density of y given x = x₁ is:
 
P(y | x = x₁) = (1 / (σ√(2π))) · e^(−(y − μ)² / (2σ²)),   where μ is the mean of y at x₁
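As a quick sanity check, here is a minimal Python sketch of this density (the function name gaussian_pdf and the sample values are our own, not from the original post):

import numpy as np

def gaussian_pdf(y, mu, sigma):
    # Gaussian density: (1 / (sigma * sqrt(2*pi))) * exp(-(y - mu)^2 / (2*sigma^2))
    return np.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

print(gaussian_pdf(0.0, 0.0, 1.0))  # ~0.3989, the peak of the standard normal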

Through the linear regression equation we define the prediction ŷᵢ = β₁xᵢ + β₀.

The observed value of y is then yᵢ = β₁xᵢ + β₀ + eᵢ, where eᵢ is the error or noise.

Let us assume σ is the same for every value of x, and that β₁ = 0. Then the best intercept minimizes the sum of squared errors:

                              β₀* = argmin over β₀ of C(β₀) = Σᵢ₌₁ⁿ (yᵢ − β₀)²

                                Differentiating with respect to β₀ and setting the derivative to zero:

                               dC/dβ₀ = Σᵢ₌₁ⁿ (2β₀ − 2yᵢ) = 0

                                     2Σᵢ₌₁ⁿ β₀ − 2Σᵢ₌₁ⁿ yᵢ = 0

                                         nβ₀ − Σᵢ₌₁ⁿ yᵢ = 0

                                         β₀ = (Σᵢ₌₁ⁿ yᵢ) / n

So the best value for β₀ turns out to be the mean of the y values observed in the data.
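To see this numerically, here is a short sketch (the sample y values are made up for illustration): we evaluate the cost C on a grid of candidate intercepts and check that the minimizer matches the mean.

import numpy as np

y = np.array([2.0, 3.5, 4.0, 5.5, 6.0])  # hypothetical observed y values

# Evaluate C(b0) = sum_i (y_i - b0)^2 over a grid of candidate intercepts.
candidates = np.linspace(y.min(), y.max(), 2001)
costs = np.array([np.sum((y - b0) ** 2) for b0 in candidates])

print(candidates[np.argmin(costs)])  # ~4.2
print(y.mean())                      # 4.2 -> the minimizer equals the mean, as derived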

Now comes the question: how can we determine the goodness of fit of our linear regression model?

We can check the goodness of fit of the linear regression model with the R² metric (the coefficient of determination), which generalizes to linear regression with any number of variables:

                       R² = 1 − Σᵢ₌₁ⁿ (yᵢ − (β₀ + Σⱼ βⱼxᵢⱼ))² / Σᵢ₌₁ⁿ (yᵢ − ȳ)²

where the xⱼ are the independent variables and ȳ is the mean of the observed y values.
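A minimal sketch of this computation in Python (the function and the sample values are our own, for illustration):

import numpy as np

def r_squared(y, y_hat):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])       # hypothetical observations
y_hat = np.array([1.1, 1.9, 3.2, 3.8])   # hypothetical model predictions
print(r_squared(y, y_hat))               # 0.98 -> close to 1, a good fit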

The goodness of fit is judged by this metric: the closer the value of R² is to 1, the more the independent variables explain the variation in the predicted output. We also don't want the β coefficients to depend on the range or domain of each variable, so we normalize the variables beforehand. Normalization can be done using Min-Max Scaling, for which we define a variable Z.

                                   Z = (x − xₘᵢₙ) / (xₘₐₓ − xₘᵢₙ)
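In Python, min-max scaling is a one-liner (the sample values below are illustrative):

import numpy as np

def min_max_scale(x):
    # Z = (x - x_min) / (x_max - x_min), maps x into [0, 1]
    return (x - x.min()) / (x.max() - x.min())

x = np.array([10.0, 20.0, 35.0, 50.0])
print(min_max_scale(x))  # [0.    0.25  0.625 1.   ]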

Now, for the actual value of yᵢ:

                                        yᵢ = β₀ + Σⱼ₌₁ᵐ βⱼxᵢⱼ + eᵢ

                                  Therefore  yᵢ = ŷᵢ + eᵢ

                                Cost function: C = Σᵢ₌₁ⁿ (yᵢ − (β₀ + Σⱼ βⱼxᵢⱼ))²

Each term inside the sum is the residual yᵢ − (β₀ + Σⱼ βⱼxᵢⱼ) = eᵢ, so minimizing C amounts to minimizing the total squared noise Σᵢ₌₁ⁿ eᵢ².
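Putting the pieces together, here is a sketch that computes the predictions ŷᵢ, the residuals eᵢ, and the cost C for some made-up data and coefficients (all values below are hypothetical):

import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5]])        # n = 3 samples, m = 2 independent variables
beta = np.array([0.8, -0.3])      # hypothetical beta_1 .. beta_m
beta0 = 1.0                       # hypothetical intercept
y = np.array([2.1, 2.4, 3.0])     # hypothetical observed outputs

y_hat = beta0 + X @ beta          # predictions: y_hat_i = beta0 + sum_j beta_j * x_ij
e = y - y_hat                     # residuals e_i
C = np.sum(e ** 2)                # cost = sum of squared residuals
print(C)                          # 0.815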


