In this exercise, you’ll have a chance to practice some techniques and procedures related to regression analysis, using data (more or less) from the Darr and Johns (2003) academic politics study used as part of this module’s case.
Age, Years in dept, Rank, Sex, salary, level, tch_rating, minority, admin_resp, Science, degree_tier, pub_prestige
Role ambiguity, Role conflict, Task conflict, Relationship conflict, Political perceptions
Respondent perceptions of responsibility in current crisis:
A full codebook is available (Module 1 SLP Coding). The data are available in SPSS format or Excel.
Please consult the Regression Presentation for specific guidance on how to run various tests.
For this exercise, we’ll be using the demographic variables. We’ll use the others later.
Open the Excel file in SPSS and save it in SPSS format.
Prepare appropriate descriptive statistics – remember, means and standard deviations (and possibly histograms) for interval variables, frequency tables for categorical and dummy variables. Are there any things of interest in these descriptive statistics?
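The assignment itself is carried out in SPSS. As a cross-check outside SPSS, the split described above (means and standard deviations for interval variables, frequency tables for categorical and dummy variables) can be sketched in Python. The records below are made-up toy values, not the course data, and the helper names `describe_interval` and `frequency_table` are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical toy records; the real values come from the course data file.
records = [
    {"salary": 52000, "tch_rating": 3.8, "Sex": "F", "minority": 0},
    {"salary": 61000, "tch_rating": 4.1, "Sex": "M", "minority": 0},
    {"salary": 48000, "tch_rating": 3.2, "Sex": "F", "minority": 1},
    {"salary": 75000, "tch_rating": 4.5, "Sex": "M", "minority": 0},
    {"salary": 58000, "tch_rating": 3.9, "Sex": "M", "minority": 1},
    {"salary": 66000, "tch_rating": 4.0, "Sex": "F", "minority": 0},
]


def describe_interval(rows, name):
    """N, mean, and sample standard deviation for an interval variable."""
    values = [row[name] for row in rows]
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def frequency_table(rows, name):
    """Count and percentage for each level of a categorical or dummy variable."""
    counts = Counter(row[name] for row in rows)
    total = len(rows)
    return {
        level: (count, 100.0 * count / total)
        for level, count in sorted(counts.items())
    }


for variable in ("salary", "tch_rating"):
    print(variable, describe_interval(records, variable))
for variable in ("Sex", "minority"):
    print(variable, frequency_table(records, variable))
```

Reporting a mean for a dummy variable such as `minority` is not wrong (it equals the proportion coded 1), but a frequency table is usually easier to read in a write-up.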
Construct three scatterplots looking at the relationships among some of the interval variables that interest you. What, if anything, do you discover this way?
Set up a regression model to predict salary (DV) from teaching ratings (IV). Be sure to request appropriate residuals plots. What are your results? What do you learn from the residuals plots? [HINT: here is a good guide on interpreting residuals plots.]
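To make the pieces of this step concrete, here is a minimal Python sketch of a one-predictor least-squares fit that also returns the fitted values and residuals, which are the two quantities a residuals plot puts on its axes. It is an illustration under assumptions, not the SPSS procedure: the data are invented, and `simple_ols` is a hypothetical helper name.

```python
from statistics import mean


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x, with residuals."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - ss_res / ss_tot,
        "fitted": fitted,
        "residuals": residuals,
    }


# Hypothetical toy data: teaching rating (IV) and salary (DV).
tch_rating = [3.2, 3.8, 3.9, 4.0, 4.1, 4.5]
salary = [48000, 52000, 58000, 66000, 61000, 75000]

model = simple_ols(tch_rating, salary)
print("slope:", round(model["slope"], 1))
print("R squared:", round(model["r_squared"], 3))

# A residuals plot is a scatter of these (fitted, residual) pairs.
for fitted_value, residual in zip(model["fitted"], model["residuals"]):
    print(round(fitted_value), round(residual))
```

If the residuals fan out or curve as the fitted values increase, that pattern is what the residuals plot is meant to reveal.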
Since neither salary nor teaching ratings is particularly normally distributed (i.e., both are skewed), we might do better with the log transforms of these data. (If you’re not familiar with log transforms of data, here is a good general guide to the subject [Hopkins].) Try your regression model with these transformed variables. Is it any better? How do you know?
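The effect of a log transform on a skewed variable can be illustrated with a short Python sketch. The salary values are invented, `skewness` and `log_transform` are hypothetical helper names, and the skewness formula is the simple moment-based version, so the value SPSS reports (which applies a small-sample adjustment) will differ somewhat.

```python
import math
from statistics import mean


def skewness(values):
    """Moment-based sample skewness; 0 for a perfectly symmetric sample."""
    center = mean(values)
    n = len(values)
    m2 = sum((v - center) ** 2 for v in values) / n
    m3 = sum((v - center) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(values):
    """Natural log of each value; only defined for strictly positive data."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


# Hypothetical right-skewed salaries: a few high earners stretch the tail.
salary = [40000, 42000, 45000, 47000, 50000, 55000, 60000, 90000, 150000]
log_salary = log_transform(salary)

print("skewness of salary:", round(skewness(salary), 2))
print("skewness of log(salary):", round(skewness(log_salary), 2))
```

The transformed variables would then replace the raw ones in the regression. Note that once the DV is logged, the residuals and coefficients are on the log scale, which matters when comparing the two models.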
Now let’s try a multiple predictor model. Use salary as DV, and Age, Years in dept, Sex, level, tch_rating, minority, admin_resp, Science, degree_tier, and pub_prestige as IVs. How good is your prediction now? What are the best predictors? Anything in the residuals?
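For readers who want to see what a multiple-predictor fit computes, the following Python sketch solves the least-squares normal equations directly and reports R squared and adjusted R squared. It is a teaching sketch under assumptions: the data are invented, only three of the predictors are used, `Sex` is assumed to be coded 0/1, and `solve_linear_system` and `fit_ols` are hypothetical helper names. SPSS additionally reports standard errors, t tests, and standardized coefficients, which this sketch omits.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y on the named predictor columns plus an intercept."""
    names = list(predictors)
    n = len(y)
    k = len(names) + 1  # number of coefficients, including the intercept
    rows = [[1.0] + [float(predictors[name][i]) for name in names]
            for i in range(n)]
    xtx = [[sum(row[r] * row[c] for row in rows) for c in range(k)]
           for r in range(k)]
    xty = [sum(row[r] * yi for row, yi in zip(rows, y)) for r in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)
    return {
        "coefficients": dict(zip(["(Intercept)"] + names, beta)),
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
    }


# Hypothetical toy data; Sex is assumed to be dummy coded 0/1.
data = {
    "Age": [31, 36, 42, 47, 53, 58, 39, 50],
    "tch_rating": [3.6, 4.1, 3.8, 4.4, 3.9, 4.2, 4.0, 3.7],
    "Sex": [0, 1, 0, 1, 1, 0, 1, 0],
}
salary = [51000, 58000, 62000, 71000, 74000, 79000, 60000, 70000]

result = fit_ols(data, salary)
for name, coefficient in result["coefficients"].items():
    print(name, round(coefficient, 2))
print("R squared:", round(result["r_squared"], 3))
print("adjusted R squared:", round(result["adj_r_squared"], 3))
```

Because raw coefficients depend on each predictor's units, judging which predictors are "best" normally relies on standardized coefficients or significance tests from the SPSS output rather than on the raw values printed here.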
Your prediction is good, but it’s too complicated. See if you can reduce complexity by using a stepwise procedure on the same model. What is the efficiency of prediction of your final model here? What predictors are left? Which have been excluded? Of those left, which are best? Anything in the residuals?
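The idea behind reducing a model one predictor at a time can be sketched in Python. This is deliberately a simplified stand-in, not the SPSS stepwise method: SPSS enters and removes predictors using F-test probability criteria, whereas this sketch only adds predictors (forward selection) and stops when adjusted R squared improves by less than a chosen `min_gain`. The data, the `min_gain` threshold, and the function names are all assumptions made for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def adjusted_r_squared(columns, y):
    """Adjusted R squared of y regressed on the columns plus an intercept."""
    n = len(y)
    k = len(columns) + 1
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    xtx = [[sum(row[r] * row[c] for row in rows) for c in range(k)]
           for r in range(k)]
    xty = [sum(row[r] * yi for row, yi in zip(rows, y)) for r in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / (n - k)) / (ss_tot / (n - 1))


def forward_select(candidates, y, min_gain=0.01):
    """Add the predictor that most improves adjusted R squared, then repeat."""
    selected = []
    remaining = list(candidates)
    best_score = 0.0  # adjusted R squared of the intercept-only model
    while remaining:
        scores = {
            name: adjusted_r_squared(
                [candidates[s] for s in selected + [name]], y)
            for name in remaining
        }
        best_name = max(scores, key=scores.get)
        if scores[best_name] - best_score < min_gain:
            break  # no remaining predictor adds enough
        selected.append(best_name)
        remaining.remove(best_name)
        best_score = scores[best_name]
    return selected, best_score


# Hypothetical toy data (salary in thousands).
candidates = {
    "Age": [30, 35, 40, 45, 50, 55, 60, 38, 48, 58],
    "tch_rating": [3.5, 4.0, 3.8, 4.2, 3.9, 4.4, 3.6, 4.1, 3.7, 4.3],
    "admin_resp": [0, 0, 1, 0, 1, 1, 1, 0, 0, 1],
}
salary = [50, 55, 66, 63, 75, 80, 84, 57, 66, 83]

kept, score = forward_select(candidates, salary)
print("kept:", kept)
print("excluded:", [name for name in candidates if name not in kept])
print("adjusted R squared:", round(score, 3))
```

Automated selection of this kind capitalizes on chance patterns in the sample, so the retained predictors should still be checked against the residuals and against what makes substantive sense.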
Any overall comments on using regression techniques?
Your assignment will be assessed against the following expectations:
Your assignment should be 10-15 pages in total. Make sure you only report the important tables that support your answers to each question. You are expected to show facility with regression analysis procedures and vocabulary, to use terminology correctly, and to correctly and concisely carry out the assignment steps. It is expected that your assignments will be presented in good academic form, including tables formatted in approximate APA style rather than being simply pasted in from SPSS printout, statistical tests summarized in narrative text containing the information necessary, and appropriate analyses and comments included, not just a dump of the results of a test.