Purpose Statement and Model
1) In the introductory paragraph, state why the dependent variable has been chosen for analysis. Then make a general statement about the model:
“The dependent variable _______ is determined by variables ________, ________, ________, and ________.”
2) In the second paragraph, identify the primary independent variable and defend why it is important.
“The most important variable in this analysis is ________ because _________.” In this paragraph, cite and discuss the two research sources that support the thesis, i.e., the model.
3) Write the general form of the regression model (omitting the intercept and coefficients), with the variables named descriptively so the reader can identify each variable at a glance:
Dep_Var = Ind_Var_1 + Ind_Var_2 + Ind_Var_3
For instance, a typical model would be written:
Price_of_Home = Square_Footage + Number_Bedrooms + Lot_Size
Price_of_Home: brief definition of dependent variable
Square_Footage: brief definition of first independent variable
Number_Bedrooms: brief definition of second independent variable
Lot_Size: brief definition of third independent variable
[Note: the student should replace these variable names with his or her own.]
Definition of Variables
4) Define and defend all variables, including the dependent variable, in a single paragraph for each variable. Also, state the expectations for each independent variable. These paragraphs should be in numerical order, i.e., dependent variable, X1, then X2, etc.
In each paragraph, the following should be addressed:
- How is the variable defined in the data source?
- Which unit of measurement is used?
- For the independent variables: why does the variable determine Y?
- What sign is expected for the independent variable’s coefficient, positive or negative? Why?
5) In one paragraph, describe the data and identify the data sources.
- From which general sources and from which specific tables are the data taken? (Citing a website is not acceptable.)
- In which year or years were the data collected?
- Are there any data limitations?
Presentation and Interpretation of Results
6) Write the regression (prediction) equation:
Dep_Var = Intercept + c1 * Ind_Var_1 + c2 * Ind_Var_2 + c3 * Ind_Var_3
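As an illustration of how the prediction equation is used, the following minimal sketch plugs one observation into the home-price example. The intercept, the coefficients, and the sample home are hypothetical numbers invented for this illustration; the student's own values come from the regression output.

```python
# Hypothetical intercept and coefficients, for illustration only.
# They are NOT estimated from real data.
INTERCEPT = 50_000.0
COEFFICIENTS = {
    "Square_Footage": 120.0,     # dollars per square foot (assumed)
    "Number_Bedrooms": 8_000.0,  # dollars per bedroom (assumed)
    "Lot_Size": 2.5,             # dollars per square foot of lot (assumed)
}


def predict(values):
    """Return Intercept + c1 * x1 + c2 * x2 + ... for one observation."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * values[name] for name in COEFFICIENTS
    )


# A hypothetical home to predict.
home = {"Square_Footage": 2000, "Number_Bedrooms": 3, "Lot_Size": 8000}
predicted_price = predict(home)
print(f"Predicted Price_of_Home: {predicted_price:,.0f}")
```

Each coefficient is read as the change in the dependent variable for a one-unit change in that independent variable, holding the others constant.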
7) Identify and interpret the adjusted R² (one paragraph):
- Define “adjusted R².”
- What does the value of the adjusted R² reveal about the model?
- If the adjusted R² is low, how has the choice of independent variables created this result?
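For reference, the adjusted R² reported by most regression software can be reproduced from the ordinary R², the number of observations, and the number of independent variables. The sketch below uses the standard formula; the function name and the sample inputs are assumptions made for illustration.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1).

    n_obs is the number of observations (n) and n_predictors is the
    number of independent variables (k), not counting the intercept.
    """
    degrees_of_freedom = n_obs - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("Need more observations than estimated parameters.")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / degrees_of_freedom


# Hypothetical example: R^2 of 0.75 with 25 observations and 4 predictors.
example = adjusted_r_squared(0.75, n_obs=25, n_predictors=4)
print(f"Adjusted R^2: {example:.3f}")
```

Because the adjustment penalizes each additional independent variable, the adjusted R² is never larger than the unadjusted R² whenever the model has at least one independent variable.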
8) Identify and interpret the F test (one paragraph):
- Using the p-value approach, is the null hypothesis for the F test rejected or not rejected? Why or why not?
- Interpret the implications of these findings for the model.
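The F test asks whether the independent variables, taken together, explain any of the variation in the dependent variable (the null hypothesis is that all slope coefficients are zero). A minimal sketch of the statistic and the p-value decision rule follows. The 0.05 significance level and the sample numbers are assumptions; the p-value itself should be read from the regression output, since this sketch does not compute it.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / (n_obs - n_predictors - 1)
    return explained / unexplained


def reject_null(p_value, alpha=0.05):
    """p-value approach: reject the null hypothesis when p < alpha."""
    return p_value < alpha


# Hypothetical example: R^2 = 0.75, 25 observations, 4 predictors.
f_value = f_statistic(0.75, n_obs=25, n_predictors=4)
print(f"F statistic: {f_value:.2f}")

# Hypothetical p-value ("Significance F") copied from regression output.
print("Reject null:", reject_null(0.0001))
```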
9) Identify and interpret the t tests for each of the coefficients (one separate paragraph for each variable, in numerical order):
- Are the signs of the coefficients as expected? If not, why not?
- For each of the coefficients, interpret the numerical value.
- Using the p-value approach, is the null hypothesis for the t test rejected or not rejected for each coefficient? Why or why not?
- Interpret the implications of these findings for the variable.
- Identify the variable with the greatest significance.
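The checks above can be organized as in the following sketch, which takes one row of regression output per independent variable. Every number in it (coefficients, standard errors, p-values) is hypothetical, and the 0.05 significance level is an assumption. The variable with the greatest significance is taken here to be the one with the smallest p-value.

```python
ALPHA = 0.05  # assumed significance level

# Hypothetical regression output: (variable, coefficient, standard error, p-value).
results = [
    ("Square_Footage", 120.0, 30.0, 0.001),
    ("Number_Bedrooms", 8000.0, 5000.0, 0.12),
    ("Lot_Size", 2.5, 1.0, 0.02),
]


def summarize(rows, alpha):
    """Compute t = coefficient / standard error and apply the p-value rule."""
    summary = []
    for name, coefficient, std_error, p_value in rows:
        summary.append({
            "variable": name,
            "sign": "positive" if coefficient > 0 else "negative",
            "t": coefficient / std_error,
            "reject_null": p_value < alpha,
        })
    return summary


summary = summarize(results, ALPHA)
for row in summary:
    print(row)

# Smallest p-value identifies the most significant variable.
most_significant = min(results, key=lambda row: row[3])[0]
print("Most significant variable:", most_significant)
```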
10) Analyze multicollinearity of the independent variables (one paragraph):
- Generate the correlation matrix.
- Define multicollinearity.
- Are any of the independent variables highly correlated with each other? If so, identify the variables and explain why they are correlated.
- State the implications of multicollinearity (if found) for the model.
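A correlation matrix for the independent variables can be produced by most statistical packages. As a minimal sketch of what it contains, the code below computes pairwise Pearson correlations by hand and flags pairs above a cutoff. The data are invented for illustration, and the 0.8 cutoff is an assumed rule of thumb rather than a requirement of this assignment.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical observations for the three independent variables.
data = {
    "Square_Footage": [1500, 2000, 2500, 3000, 3500],
    "Number_Bedrooms": [2, 3, 3, 4, 5],
    "Lot_Size": [9000, 6000, 8000, 5000, 7000],
}

THRESHOLD = 0.8  # assumed cutoff for "highly correlated"

# Full correlation matrix, keyed by (row variable, column variable).
matrix = {(a, b): pearson(data[a], data[b]) for a in data for b in data}

# Pairs of distinct variables whose absolute correlation exceeds the cutoff.
flagged = [
    (a, b) for a, b in combinations(data, 2)
    if abs(matrix[(a, b)]) > THRESHOLD
]

for a in data:
    print(a.ljust(16), "  ".join(f"{matrix[(a, b)]:6.2f}" for b in data))
print("Highly correlated pairs:", flagged)
```

In this invented data set only Square_Footage and Number_Bedrooms are flagged, which is the kind of pair the paragraph should identify and explain.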
11) Other (not required):
- If any additional techniques for improving results are employed, discuss these at the end of the paper.
Works Cited Page
12) Use the proper format to list the works cited under two headings:
Research: two sources
Data: a separate citation for each of the variables used in the paper.