I am currently studying linear regression, and while studying it I came across a term called the t-statistic.

I also came across a term called the **p-value** (the probability of observing any value equal to |t| or larger). I want to know how to interpret them and how the p-value affects the selection of variables.

# How is the p-value related to the t-statistic?

**harry**#1

What is a p-value, and what is its significance in hypothesis testing?

**shuvayan**#2

hello @harry,

It is a part of hypothesis testing in linear regression, where:

H0: There is no relationship between X and Y, i.e. Beta1 = 0 (slope = 0)

Ha: There is some relationship between X and Y, i.e. Beta1 != 0

where Beta1 is the simple linear regression slope for the single variable.

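To test the null hypothesis we compute a t-statistic given by t = (Beta1_hat - 0) / SE(Beta1_hat), where Beta1_hat is the estimated slope and SE(Beta1_hat) is its standard error.

Under the null hypothesis this statistic follows a t-distribution with n - 2 degrees of freedom (n is the number of observations), and from that distribution we get the p-value, which is a probability.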
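As a rough sketch of the calculation above, here is the slope test done by hand in Python with NumPy and SciPy. The data is made up for illustration (the names `tv` and `sales`, the true slope 0.05 and the noise level are all my assumptions, not values from any real dataset), and `slope_t_test` is just a helper name I picked.

```python
import numpy as np
from scipy import stats


def slope_t_test(x, y):
    """Return (slope, standard error, t-statistic, two-sided p-value)
    for the simple linear regression y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Least-squares estimates of slope and intercept.
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)
    slope = np.sum(x_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()

    # Residual variance with n - 2 degrees of freedom.
    residuals = y - (intercept + slope * x)
    df = n - 2
    sigma2 = np.sum(residuals ** 2) / df

    # Standard error of the slope and t-statistic for H0: Beta1 = 0.
    se = np.sqrt(sigma2 / sxx)
    t_stat = (slope - 0.0) / se

    # Two-sided p-value: P(|T| >= |t|) under a t-distribution with df.
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return slope, se, t_stat, p_value


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
tv = rng.uniform(0, 300, size=200)
sales = 7.0 + 0.05 * tv + rng.normal(0, 3, size=200)

slope, se, t_stat, p_value = slope_t_test(tv, sales)
print(f"slope={slope:.4f}  SE={se:.4f}  t={t_stat:.2f}  p={p_value:.3g}")
```

The slope and standard error should agree with what `scipy.stats.linregress(tv, sales)` reports, which is a quick way to check the hand calculation.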

So how do we use all this in linear regression?

Consider the result of a simple linear regression model where the response variable is Sales and the explanatory variable is TV advertising spend.

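Here the p-value indicates that, given the null hypothesis (there is no relationship between TV advertising spend and Sales), the probability of observing a t-statistic at least as large in absolute value as 17.67 is very small (0.0001). Note that this is not the probability that there is no relationship between X and Y. It means that data like this would be extremely unlikely if there were no relationship, so we reject the null hypothesis and conclude that TV advertising spend is related to Sales.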
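To see how a reported t-statistic turns into a p-value, here is a small sketch using SciPy. The degrees of freedom are not given in this thread, so the values tried below (10, 50, 198) are assumptions on my part; the point is only that for a t-statistic as large as 17.67 the p-value stays far below 0.0001 in each case.

```python
from scipy import stats

t_observed = 17.67  # t-statistic quoted in the post

# Degrees of freedom are NOT given in the thread; these are assumed values.
p_by_df = {}
for df in (10, 50, 198):
    # Two-sided p-value: probability of |T| >= 17.67 under H0.
    p_by_df[df] = 2 * stats.t.sf(t_observed, df)
    print(f"df={df:>3}  p-value={p_by_df[df]:.3g}")
```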

Since we have used only one variable here, and it is significant, there is no question of dropping variables.

But if there are many independent variables (say 5 X’s) and for 2 of them the p-values come out greater than the significance level (0.05 here; 0.10 and 0.01 are other common choices), then those two variables are not significant and we won’t include them in the final regression model.
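A minimal sketch of that screening rule with 5 X's, again on made-up data. Everything here is an assumption for illustration: the helper name `ols_p_values`, the coefficients, and the choice that only the first three X's actually affect y. It fits ordinary least squares with NumPy, computes a t-statistic and two-sided p-value per coefficient, and keeps the variables whose p-value is not greater than 0.05.

```python
import numpy as np
from scipy import stats


def ols_p_values(X, y):
    """Fit y = b0 + X @ b by least squares and return, for each column
    of X (intercept excluded): coefficients, t-statistics, p-values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])  # add intercept column

    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    df = n - design.shape[1]                   # n - (number of X's + 1)
    sigma2 = residuals @ residuals / df

    # Standard errors from the diagonal of sigma^2 * (X'X)^-1.
    cov = sigma2 * np.linalg.inv(design.T @ design)
    se = np.sqrt(np.diag(cov))

    t_stats = coef / se                        # tests H0: coefficient = 0
    p_values = 2 * stats.t.sf(np.abs(t_stats), df)
    return coef[1:], t_stats[1:], p_values[1:]


# Synthetic data: only X0, X1, X2 truly affect y; X3 and X4 are pure noise.
rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 5))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(size=n)

coef, t_stats, p_values = ols_p_values(X, y)

alpha = 0.05  # significance level
kept = [j for j, p in enumerate(p_values) if p <= alpha]
for j, (t, p) in enumerate(zip(t_stats, p_values)):
    print(f"X{j}: t={t:7.2f}  p={p:.3g}  {'keep' if j in kept else 'drop'}")
```

Keep in mind that a pure-noise variable will still come out with p below 0.05 about 5% of the time, so the cut-off is a rule of thumb rather than a guarantee.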