An Example of Correlation and Regression

I've set up the most recent edition of the ANES (2012) as a CSV file. Let's work through an example.

> ANES2012<-read.csv("http://www.courseserve.info/files/ANES2012r.csv")
> attach(ANES2012)
> summary(ftgr_tea)
> plot(ftgr_rich,ftgr_tea)
> abline(lm(ftgr_tea~ftgr_rich))
> cor.test(ftgr_tea,ftgr_rich,method="pearson")
> summary(lm(ftgr_tea~ftgr_rich))
> summary(lm(ftgr_tea~ftgr_rich+ft_rep+ft_dem+ftgr_xfund+ftgr_bigbus+ftgr_fedgov+gender_respondent))

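A correlation summarizes only the strength and direction of the linear relationship between two variables. Linear regression also gives an intercept and a slope, so it can be used to predict the DV, and multiple least squares regression estimates the effect of each predictor while holding the others constant. We'll practice with some additional examples.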
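
To see how the two are connected, here is a minimal sketch using simulated data. The variables x and y are hypothetical stand-ins for two thermometer scores, not values from the ANES file. With a single predictor, the regression slope is the correlation rescaled by the two standard deviations, and R-squared is the correlation squared.

```r
# Simulated data for illustration only; x and y are hypothetical,
# not ANES values.
set.seed(1)
x <- runif(200, 0, 100)
y <- 20 + 0.5 * x + rnorm(200, sd = 15)

r   <- cor(x, y)                # Pearson correlation
fit <- lm(y ~ x)                # simple least squares regression
b   <- unname(coef(fit)["x"])   # estimated slope, with its name stripped

# The slope is the correlation rescaled by the two standard deviations
all.equal(b, r * sd(y) / sd(x))

# With one predictor, R-squared equals the squared correlation
all.equal(summary(fit)$r.squared, r^2)
```

Both comparisons should return TRUE. The regression carries everything the correlation does, plus the intercept and slope needed for prediction.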

Some things to consider:

  • The DV must be numeric
  • The IVs (predictors) can be numeric or binary
  • Predictors should not be highly correlated (above 0.9, for example)
  • Model specification is very important: which predictors you include or leave out changes the estimates for the others
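
One way to check the first three points before fitting a model is sketched below. It uses simulated predictors with the hypothetical names x1, x2 and x3, and treats 0.9 as a rule of thumb rather than a strict cutoff.

```r
# Simulated predictors for illustration only; x1, x2, x3 are hypothetical names.
set.seed(2)
n  <- 300
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # built to be nearly a copy of x1
x3 <- rnorm(n)                  # unrelated to the other two
preds <- data.frame(x1, x2, x3)

# Every predictor should be numeric (binary 0/1 columns count as numeric)
stopifnot(all(sapply(preds, is.numeric)))

cm <- cor(preds)                          # pairwise Pearson correlations
cm[upper.tri(cm, diag = TRUE)] <- NA      # keep each pair only once
high <- which(abs(cm) > 0.9, arr.ind = TRUE)   # pairs above the 0.9 rule of thumb

# List the flagged pairs with their correlations
flagged <- data.frame(var1 = rownames(cm)[high[, "row"]],
                      var2 = colnames(cm)[high[, "col"]],
                      r    = cm[high])
flagged
```

Any pair listed in flagged is a candidate for dropping one of the two variables or combining them into a single measure.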

    I've created an R script to create a series of binary variables for race. To run the script, at the prompt, type:

    source("http://www.courseserve.info/files/SOCY7112racevectors.r")

    and hit ENTER.

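    The names of the race binary variables are: white, black, hispanic, otherrace. Do not enter all four in the same model. Together they sum to 1 for every respondent, which makes them perfectly collinear with the intercept. Leave one out to serve as the reference category.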
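
The collinearity problem can be seen in a small sketch with simulated data. This does not run or reproduce the course script: the race variable, the indicator construction and the outcome y below are made up for illustration, and only the four variable names are taken from the text above.

```r
# Simulated data for illustration only; group membership and y are made up.
set.seed(3)
n    <- 400
race <- sample(c("white", "black", "hispanic", "otherrace"), n, replace = TRUE)
y    <- rnorm(n, mean = 50, sd = 20)

# One 0/1 indicator per category
white     <- as.numeric(race == "white")
black     <- as.numeric(race == "black")
hispanic  <- as.numeric(race == "hispanic")
otherrace <- as.numeric(race == "otherrace")

# The four indicators sum to 1 in every row, duplicating the intercept,
# so lm() cannot estimate the last one and reports NA for it
full <- lm(y ~ white + black + hispanic + otherrace)
coef(full)

# Leaving one out (here white) makes it the reference category
reduced <- lm(y ~ black + hispanic + otherrace)
coef(reduced)
```

In the reduced model each coefficient is the difference in the mean of y between that group and the omitted group, and the intercept is the omitted group's mean.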