Introduction to Machine Learning (Part-6)
A real-world process generates data from a population, and a sample is a subset of that population. Models of such data fall into two types:
- Generative : it models the joint probability of all the variables. If there are $n$ variables $x_1, x_2, \ldots, x_n$, the model specifies the full joint distribution $P(\mathbf{x}) = P(x_1, x_2, \ldots, x_n)$.
- Discriminative : it learns the conditional probability. If $f : \mathbf{x} \mapsto y$, the model learns $P(y \mid \mathbf{x})$ directly, which requires far fewer parameters. In the hypothesis set, $H_i \neq H_j$ for $i \neq j$, because each hypothesis has different parameter values. The best hypothesis is $h^* = \operatorname{argmin}_h C(y, y'(\mathbf{x}))$, where $y$ is the actual value and $y'$ is the predicted value.
Learning therefore requires two choices:
- The hypothesis set
- A search algorithm over that set
The number of free parameters in a full joint distribution grows exponentially with $n$ (see the sketch below):
- For binary variables : $2^n - 1$
- For non-binary ($k$-ary) variables : $k^n - 1$
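As a minimal sketch (the variable counts used here are made-up), this exponential growth can be computed directly:

```python
# Free parameters needed to specify a full joint distribution
# P(x1, ..., xn) over n variables, each taking k values.
def joint_params(n, k=2):
    # k^n outcomes, minus 1 because the probabilities must sum to 1
    return k**n - 1

print(joint_params(10))        # binary: 2^10 - 1 = 1023
print(joint_params(10, k=3))   # ternary: 3^10 - 1 = 59048
```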
In linear regression, the measurements usually contain noise:

$y = mx + c + \epsilon$

where $m$ is the slope, $c$ is the intercept, and $\epsilon$ is the noise (the unexplained variance).
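A minimal sketch of recovering $m$ and $c$ from noisy measurements with ordinary least squares (the true slope, intercept, and noise level below are made-up values):

```python
import numpy as np

rng = np.random.default_rng(0)
m_true, c_true = 2.5, 1.0            # assumed true slope and intercept
x = rng.uniform(0, 10, size=200)
e = rng.normal(0, 1.0, size=200)     # noise: the unexplained variance
y = m_true * x + c_true + e

# Least-squares fit of y = m*x + c
m_hat, c_hat = np.polyfit(x, y, deg=1)
print(f"m ~ {m_hat:.2f}, c ~ {c_hat:.2f}")
```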
Bernoulli Experiment : each trial has only two possible values, e.g. 0 or 1, yes or no. If $P(x = 1) = \theta$, the expected value of a trial is $\theta$, and the probability of a single outcome $x \in \{0, 1\}$ is

$P(x) = \theta^{x} (1-\theta)^{1-x}$
For a dataset $D$ of $n$ independent trials:

$P(D) = \prod_{i=1}^{n} P(x_i) = \theta^{\#\{x_i = 1\}} (1-\theta)^{\#\{x_i = 0\}} = \theta^{r} (1-\theta)^{n-r}$

where $r$ is the number of successes.
Here $\theta$ represents the success probability (e.g., the accuracy of a predictor over $n$ predictions).
Taking the log:

$\log P(D) = r \log \theta + (n-r) \log(1-\theta)$

Maximizing this log-likelihood gives the maximum-likelihood estimate $\hat{\theta} = r/n$.
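A small sketch (the 0/1 outcomes below are made-up data) showing that $\hat{\theta} = r/n$ does maximize the log-likelihood above:

```python
import numpy as np

data = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])  # hypothetical outcomes
n, r = len(data), data.sum()

theta_mle = r / n                 # closed-form MLE
print(theta_mle)                  # 0.7

# Check: r*log(theta) + (n-r)*log(1-theta) peaks at theta = r/n
thetas = np.linspace(0.01, 0.99, 99)
loglik = r * np.log(thetas) + (n - r) * np.log(1 - thetas)
print(thetas[np.argmax(loglik)])  # ~ 0.7
```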
Binomial Distribution : here we are interested in a set of outcomes rather than a single outcome. It gives the probability that $n$ Bernoulli trials produce exactly $k$ successes:

$P(y = k \mid \theta) = \binom{n}{k} \theta^{k} (1-\theta)^{n-k}$
Mean $= n\theta$
Variance $= n\theta(1-\theta)$
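A short check of the binomial PMF, mean, and variance using scipy ($n$ and $\theta$ below are made-up values):

```python
from scipy.stats import binom

n, theta = 10, 0.7   # assumed number of trials and success probability

# P(y = k | theta) = C(n, k) * theta^k * (1 - theta)^(n - k)
print(binom.pmf(7, n, theta))   # P(exactly 7 successes) ~ 0.267
print(binom.mean(n, theta))     # n * theta = 7.0
print(binom.var(n, theta))      # n * theta * (1 - theta) = 2.1
```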
Multinomial Distribution : here the trials are categorical with $k$ possible outcomes. The probability of observing counts $y_1, y_2, \ldots, y_k$ is

$P(y_1, y_2, \ldots, y_k \mid \boldsymbol{\theta}) = \frac{n!}{y_1!\, y_2! \cdots y_k!} \prod_{i=1}^{k} \theta_i^{y_i}$

where $n = \sum_{i=1}^{k} y_i$.
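A sketch evaluating this formula directly and via scipy (the counts and probabilities below are made-up):

```python
from math import factorial
from scipy.stats import multinomial

counts = [3, 2, 5]          # y1, y2, y3, so n = 10
probs = [0.2, 0.3, 0.5]     # theta_1, theta_2, theta_3
n = sum(counts)

# Direct formula: n! / (y1! * ... * yk!) * prod(theta_i ** y_i)
coef = factorial(n)
p = 1.0
for y, t in zip(counts, probs):
    coef //= factorial(y)
    p *= t ** y
print(coef * p)                               # ~ 0.0567

# Same result via scipy
print(multinomial.pmf(counts, n=n, p=probs))  # ~ 0.0567
```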