What distribution is used with the global test of the regression model to reject the null hypothesis?

The F-test for Linear Regression

Purpose

    The F-test for linear regression tests whether any of the independent variables in a multiple linear regression model are significant.

Definitions for Regression with Intercept

  • n is the number of observations, p is the number of regression parameters.
     
  • Corrected Sum of Squares for Model:   SSM = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)²,
        also called sum of squares for regression.
     
  • Sum of Squares for Error:   SSE = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²,
        also called sum of squares for residuals.
     
  • Corrected Sum of Squares Total:   SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)²
        This is the sample variance of the y-variable multiplied by n - 1.
     
  • For multiple regression models, we have this remarkable property: SSM + SSE = SST.
     
  • Corrected Degrees of Freedom for Model:   DFM = p - 1
     
  • Degrees of Freedom for Error:   DFE = n - p
     
  • Corrected Degrees of Freedom Total:   DFT = n - 1
        One degree of freedom is subtracted from n because the sample mean is estimated; this gives the corrected degrees of freedom.
        The null-hypothesis model is the horizontal-line (intercept-only) regression.
     
  • For multiple regression models with intercept, DFM + DFE = DFT.
     
  • Mean of Squares for Model:   MSM = SSM / DFM
     
  • Mean of Squares for Error:   MSE = SSE / DFE
        The sample variance of the residuals.
     
  • Mean of Squares Total:   MST = SST / DFT
        The sample variance of the y-variable.
     
  • In general, a researcher wants the variation due to the model (MSM) to be large with respect to the variation due to the residuals (MSE).
     
  • Note: the definitions in this section are not valid for regression-through-the-origin models; such models require the use of uncorrected sums of squares.
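
  • To make these definitions concrete, here is a minimal R sketch on a small simulated dataset (the variable names x1, x2, and y are invented for the illustration); it computes the sums of squares from a fitted model and checks that SSM + SSE = SST and DFM + DFE = DFT:

      # Simulated data: n = 20 observations, p = 3 regression parameters
      set.seed(1)
      n  <- 20
      x1 <- rnorm(n)
      x2 <- rnorm(n)
      y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n)

      fit  <- lm(y ~ x1 + x2)            # multiple regression with intercept
      yhat <- fitted(fit)
      p    <- length(coef(fit))          # number of regression parameters (3)

      SSM <- sum((yhat - mean(y))^2)     # corrected sum of squares for model
      SSE <- sum((y - yhat)^2)           # sum of squares for error
      SST <- sum((y - mean(y))^2)        # corrected sum of squares total
      all.equal(SSM + SSE, SST)          # TRUE

      DFM <- p - 1;  DFE <- n - p;  DFT <- n - 1
      MSM <- SSM / DFM;  MSE <- SSE / DFE;  MST <- SST / DFT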

The F-test

  • For a multiple regression model with intercept, we want to test the following null hypothesis and alternative hypothesis:
      H0:   β1 = β2 = ... = βp-1 = 0

        H1:   βj ≠ 0, for at least one value of j

     
    This test is known as the overall F-test for regression.
     
  • Here are the five steps of the overall F-test for regression:
    1. State the null and alternative hypotheses:
        H0:   β1 = β2 = ... = βp-1 = 0

          H1:   βj ≠ 0, for at least one value of j

       
    2. Compute the test statistic assuming that the null hypothesis is true:
        F = MSM / MSE = (explained variance) / (unexplained variance)
       
    3. Find a (1 - α)×100% confidence interval I for the F-statistic with (DFM, DFE) degrees of freedom using an F-table or statistical software.
       
    4. Accept the null hypothesis if F ∈ I; reject it if F ∉ I.
       
    5. Use statistical software to determine the p-value.

  • Practice Problem:  For a multiple regression model with 35 observations and 9 independent variables (10 parameters), SSE = 134 and SSM = 289. Test the null hypothesis that all of the slope parameters are zero at the 0.05 level.

      Solution: DFE = n - p = 35 - 10 = 25 and DFM = p - 1 = 10 - 1 = 9. Here are the five steps of the test of hypothesis:

    1. State the null and alternative hypothesis:
        H0:   β1 = β2 = ... = βp-1 = 0

          H1:   βj ≠ 0 for some j


    2. Compute the test statistic:
        F = MSM/MSE = (SSM/DFM) / (SSE/DFE) = (289/9) / (134/25) = 32.111 / 5.360 = 5.991

    3. Find a (1 - 0.05)×100% confidence interval for the test statistic. Look in the F-table at the 0.05 entry for 9 df in the numerator and 25 df in the denominator. This entry is 2.28, so the 95% confidence interval is [0, 2.28]. The critical value 2.28 can also be found using the R function call qf(0.95, 9, 25).
       
    4. Decide whether to accept or reject the null hypothesis: 5.991 ∉ [0, 2.28], so reject H0.
       
    5. Determine the p-value. To obtain the exact p-value, use statistical software (see the R check after this list). However, we can find a rough approximation to the p-value by examining the other entries in the F-table for (9, 25) degrees of freedom:
       Level   Confidence Interval   F-value
       0.100   [0, 1.89]             1.89
       0.050   [0, 2.28]             2.28
       0.025   [0, 2.68]             2.68
       0.010   [0, 3.22]             3.22
       0.001   [0, 4.71]             4.71

      The F-statistic is 5.991, which exceeds the 0.001 entry of 4.71, so the p-value must be less than 0.001.
       
  • Verify the value of the F-statistic for the Hamster Example.
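
  • The arithmetic in the practice problem above can be checked with a few lines of base R (a sketch; qf and pf are the F-distribution quantile and distribution functions):

      # Practice problem: n = 35, p = 10, SSM = 289, SSE = 134
      SSM <- 289;  SSE <- 134
      DFM <- 10 - 1                            # 9
      DFE <- 35 - 10                           # 25
      Fstat <- (SSM / DFM) / (SSE / DFE)       # about 5.991
      qf(0.95, DFM, DFE)                       # critical value, about 2.28
      pf(Fstat, DFM, DFE, lower.tail = FALSE)  # exact p-value, well below 0.001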

The R² and Adjusted R² Values

  • For simple linear regression, R² is the square of the sample correlation rxy between x and y.
     
  • For multiple linear regression with intercept (which includes simple linear regression), it is defined as R² = SSM / SST.
     
  • In either case, R² indicates the proportion of variation in the y-variable that is due to variation in the x-variables.
     
  • Many researchers prefer the adjusted R² value, denoted R²adj, which is penalized for having a large number of parameters in the model:
      R²adj = 1 - (1 - R²)(n - 1) / (n - p)

  • Here is the derivation of R²adj:   R² is defined as 1 - SSE/SST, so 1 - R² = SSE/SST. To take into account the number of regression parameters p, define the adjusted R-squared value by
      1 - R²adj = MSE/MST,

    where MSE = SSE/DFE = SSE/(n - p) and MST = SST/DFT = SST/(n - 1). Thus,
      1 - R²adj = [SSE/(n - p)] / [SST/(n - 1)]
          = (SSE/SST)(n - 1) / (n - p)
    so
      R²adj = 1 - (SSE/SST)(n - 1) / (n - p)
          = 1 - (1 - R²)(n - 1) / (n - p)

  • Practice Problem: A regression model has 9 independent variables, 47 observations, and R² = 0.879. Find the adjusted R² value (checked in the R sketch below).

      Ans: p = 10 and n = 47. R²adj = 1 - (1 - R²)(n - 1) / (n - p) = 1 - (1 - 0.879)(47 - 1) / (47 - 10) = 0.8496.
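
  • A quick R check of this calculation (a sketch; the numbers are those given in the practice problem):

      n  <- 47;  p <- 10;  R2 <- 0.879
      1 - (1 - R2) * (n - 1) / (n - p)         # adjusted R-squared, about 0.8496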

What can we conclude if the global F-test is significant?

If the p-value for the F-test of overall significance is less than your significance level, you can reject the null hypothesis and conclude that your model provides a better fit than the intercept-only model.
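
In R, the overall F-statistic and its p-value can be extracted from a fitted model. A minimal sketch, assuming a model object named fit like the one in the earlier sketch:

      fs <- summary(fit)$fstatistic                 # (value, numdf, dendf)
      pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE)  # overall p-value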

How do you accept or reject the null hypothesis in regression?

The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis.
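
These per-term tests are the t-tests shown in the coefficient table of summary(). A short sketch with simulated data (the variables x1 and x2 are hypothetical; x2 has no real effect in the simulation):

      set.seed(2)
      n  <- 40
      x1 <- rnorm(n)
      x2 <- rnorm(n)
      y  <- 1 + 0.8 * x1 + rnorm(n)        # x2 does not enter the true model
      fit2 <- lm(y ~ x1 + x2)
      summary(fit2)$coefficients           # Estimate, Std. Error, t value, Pr(>|t|)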

What can we conclude if the global test of regression rejects the null hypothesis?

At least one of the regression coefficients is not equal to zero. Relatedly, in multiple regression a dummy variable is significantly related to the dependent variable when the test of that dummy variable's regression coefficient rejects the null hypothesis that the coefficient is zero.

What must be concluded when the global test of significance is rejected in multiple regression?

At least one of the net regression coefficients is not equal to zero.