Essentials of Social Statistics for a Diverse Society
Instructor Resources
Solutions to Chapter Exercises and SPSS Exercises
Chapter 1: The What and the Why of Statistics
NA
Chapter 2: The Organization and Graphic Presentation of Data
1. Use the SPSS Frequencies command to produce a frequency table for the variable MARITAL as measured in the GSS10SSDS. How would you describe the marital status of most respondents in the sample?
- What percentage of the sample is divorced?
- What percentage of the sample is married?
- What percentage of the sample would you describe as being currently single? (Include all relevant categories.)
2. The GSS2010 SSDS included a series of questions on respondents’ attitudes about immigrants. In the chapter, we examined the relationship between race and attitudes about immigrants and jobs (IMMJOBS). The other two GSS variables are IMMCRIME and IMMAMECO.
- Run frequencies for all three variables (including IMMJOBS).
- Prepare a general statement summarizing your results from the three frequency tables. Identify the level of measurement for each variable. How would you describe respondents’ attitudes about immigrants?
3. Based on GSS10SSDS, produce the frequency table for RACIDIMP, the importance of one’s racial identity.
- What is the level of measurement for this variable?
- Identify two independent variables (included in the GSS10SSDS data set) that may be related to RACIDIMP. Explain the relationship between these variables and RACIDIMP.
4. The GSS2010 SSDS asked respondents to report their highest year of school completed (EDUC). Run the frequency table for this variable. Collapse this interval-ratio variable into an ordinal measure (omitting those who did not respond to the question). How many categories do you have? Prepare a frequency and cumulative percentage table of your recoded EDUC variable.
5. Collapse the variables LABORRATEFEMALE and LABORRATEMALE (included in GLOBAL13SSDS) into ordinal measures. How many categories do you have? Prepare a frequency and cumulative percentage table of your recoded variables. What can you conclude about the difference in labor force participation between males and females?
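The recoding asked for in Exercises 4 and 5 can also be sketched outside SPSS. The Python example below is only an illustration: the values are made up (they are not taken from GSS10SSDS or GLOBAL13SSDS), and the cut points and category labels are assumptions, since other groupings are equally defensible.

```python
from collections import Counter

# Hypothetical years-of-schooling values; NOT taken from the GSS10SSDS file.
# None stands in for a respondent who did not answer the question.
educ_years = [8, 10, 12, 12, 12, 13, 14, 16, 16, 18, None, 12, 11, 20, 16]

# Assumed ordinal categories; the exercise leaves the grouping to the student.
CATEGORIES = ["Less than high school", "High school", "Some college", "Bachelor's or more"]


def recode(years):
    """Map years of schooling to one of the assumed ordinal categories."""
    if years < 12:
        return CATEGORIES[0]
    if years == 12:
        return CATEGORIES[1]
    if years < 16:
        return CATEGORIES[2]
    return CATEGORIES[3]


def frequency_table(counts, order):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    rows, running = [], 0
    for category in order:
        freq = counts.get(category, 0)
        running += freq
        rows.append((category, freq, 100 * freq / total, 100 * running / total))
    return rows


# Omit missing responses before recoding, as the exercise asks.
valid = [y for y in educ_years if y is not None]
counts = Counter(recode(y) for y in valid)
table = frequency_table(counts, CATEGORIES)

for category, freq, pct, cum_pct in table:
    print(f"{category:<24}{freq:>4}{pct:>8.1f}{cum_pct:>8.1f}")
```

The cumulative percentage only makes sense because the categories are listed in rank order, which is why the table is built from an explicit ordered list rather than from the order in which values happen to appear.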
[HINTS12SSDS]
6. You’ve decided to examine the differences in the age of male and female subjects in the HINTS2012 data set. You examine AGEGRPA (age group of respondent).
- Construct a bar graph for AGEGRPA. (Hint: From the SPSS menu, choose Graphs–Legacy Dialogs–Bar.)
- Construct bar graphs separately for men and women (insert the variable GENDERC in the Panel by/Columns box).
- Briefly describe the overall age distribution and the difference in age groups between men and women.
7. Are individuals with higher educational attainment more likely to look for information about health or medical topics from any source? Construct separate pie charts for SeekHealthInfo (does respondent look for information about health or medical topics?). You will need to select Pie under Graphs–Legacy Dialogs. In the first dialog box, select “Summaries for groups of cases.” Then, make sure you select % of cases under Slices Represent. In the box for Define slices by, insert SeekHealthInfo and in the Panel by/Columns box insert Education (degree). Compare the pie charts. What differences in seeking health information exist between the different educational (degree) groups?
8. Examine if there is a difference in responses between men and women in terms of their health condition (GeneralHealth) and their occupational status (OccupationStatus) and whether they have ever been diagnosed as having cancer (EverHadCancer). Based on the level of measurement for each variable, determine the appropriate graphic display. Produce separate graphs for men and women. What differences, if any, are evident in the data?
9. Determine how best to represent the following variables graphically:
RentOrOwn—whether respondent rents or owns her/his house
WhenDiagnosedCancer—age when first told that you had cancer
IncomeRanges—annual family income of respondent
QualityCare—how respondent rates the quality of health care she/he received in the last 12 months
Note: Before constructing the histogram or pie chart, you may want to review the variable by first using the Frequencies or Utilities–Variables command. The levels of measurement for several variables are mislabeled in SPSS. If you are using the Utilities–Variables option to review each variable and its level of measurement, you should confirm the level of measurement by reviewing the variable’s frequency table (Analyze–Descriptive Statistics–Frequencies).
Chapter 3: Measures of Central Tendency
[GSS10SSDS]
1. Create a frequency distribution, including any appropriate measures of central tendency, for HOMOSEX.
- Which measure of central tendency, mean or median, is most appropriate to summarize the distribution of HOMOSEX? Explain why.
- Suppose we are interested in whether or not males and females have the same attitudes about homosexual relations. Create a frequency distribution, including any appropriate measures of central tendency, for HOMOSEX, this time separating results for men and women. (Use the Data–Split File command by clicking on Data, Split File, Organize Output by Groups; insert the variable SEX into the box labeled “Groups Based on” and click OK.) Are there any differences in their measures of central tendency? Explain. (Remember, once you have completed this exercise, reset the Split File command to include all cases by clicking on Data, Split File, Analyze All Cases, Do Not Create Groups.)
- Repeat (b), this time using PREMARSX (acceptance of premarital sex) as your variable.
2. We are interested in investigating whether males and females have equal levels of education. Use the variable EDUC with the Frequencies procedure to produce frequency tables and the mean, the median, and the mode separately for males and females (as described in 1b).
- On average, do men and women have equal levels of education? Use all the available information to answer this question.
- When we use statistics to describe the social world, we should always go beyond merely using statistics to describe the condition of various social groups. Just as important is our interpretation of the statistics and some judgment as to whether any differences that we find between groups seem to be of practical importance; that is, do they make a practical difference in the world? Do you think any differences you discovered between male and female educational levels are important enough to have an effect on such things as the ability to get a job or the salary that someone makes? Explain the reason for your answer.
3. Some people believe that social class influences the number of children that couples decide to have. Use SPSS to investigate this question with the GSS data file. (The variable CHILDS measures the respondent’s number of children.) To get the necessary information, have SPSS split the file by CLASS and then run Frequencies for CHILDS.
- What is the best measure of central tendency to represent the number of children in a household? Why?
- Which social class has more children per respondent?
- Rerun your analysis, this time with the variable CHLDIDEL (ideal number of children). Is there a difference among the social class categories? Explain.
4. Picking an appropriate statistic to describe the central tendency of a distribution is a critical skill. Based on the GSS10SSDS, determine the appropriate measure(s) of central tendency for the following variables:
- How often do respondents watch television? [TVHOURS]
- Respondents’ political views. [POLVIEWS]
- Number of hours a respondent worked last week. [HRS1]
- Whether respondents have a gun in their home. [OWNGUN]
- Number of brothers and sisters of those sampled. [SIBS]
- Does respondent support legalization of marijuana? [GRASS]
5. The educational attainment of American Indians, Alaska Natives, and Hispanic American respondents was compared in A Closer Look 4.1. You can use SPSS to do similar comparisons with other variables. For example, it may be interesting to look at levels of marital satisfaction and compare the number of hours that men and women spend watching TV per day. There is more than one method to get the frequency distributions of hours per day watching TV for these groups, but the easiest thing might be to use the Split File command (as described in 1b). Use the GSS10SSDS for this exercise. Click on Data, Split File, Organize Output by Groups, place the variables HAPMAR and SEX (in that order) in the “Groups Based on” box, and then click OK. HAPMAR has three categories of interest: 1 = very happy, 2 = pretty happy, 3 = not too happy. SEX has two valid values: 1 = male and 2 = female. Essentially, you have told SPSS to create a separate set of output for each group defined by the combination of the values of HAPMAR and SEX.
- Create a frequency distribution by clicking on Analyze, Descriptive Statistics, Frequencies. Select the variable TVHOURS, as well as all appropriate measures of central tendency under the Statistics option. Click OK. SPSS will create a great deal of output; all you need to do is find the appropriate frequency tables and measures of central tendency. For example, to find and report the frequency table for males who are pretty happy, look for the section with values of Happiness of Marriage = Pretty Happy and Respondent’s Sex = Male.
- Do you notice any gaps in hours spent watching TV between men and women at different levels of marital satisfaction?
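The Split File comparisons in this chapter amount to computing the mean, median, and mode within each group. A minimal Python sketch of that idea follows; the records are invented for illustration and are not GSS responses, and the group labels are placeholders.

```python
from collections import defaultdict
from statistics import mean, median, multimode

# Hypothetical (sex, hours of TV per day) pairs; NOT real GSS responses.
records = [
    ("male", 2), ("male", 3), ("male", 3), ("male", 10),
    ("female", 1), ("female", 2), ("female", 2), ("female", 4), ("female", 6),
]

# Group the values first, which mirrors what Split File does before
# the Frequencies procedure runs.
groups = defaultdict(list)
for sex, hours in records:
    groups[sex].append(hours)

# multimode returns a list because a distribution can have more than one mode.
summary = {
    sex: {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "mode": multimode(values),
    }
    for sex, values in groups.items()
}

for sex, stats in summary.items():
    print(sex, stats)
```

In the made-up male group, one respondent reporting 10 hours pulls the mean (4.5) well above the median (3), which is the kind of gap the exercises ask students to notice when choosing a measure for a skewed distribution.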
Chapter 4: Measures of Variability
[GSS10SSDS]
1. Use the Frequencies procedure to investigate the variability of the respondent’s current age (AGE) and age when the respondent’s first child was born (AGEKDBRN). Click on Analyze, Descriptive Statistics, Frequencies, and then Statistics. Select the appropriate measures of variability.
- Which variable has more variability? Use more than one statistic to answer this question.
- Why should one variable have more variability than the other from a societal perspective?
2. Using the Explore procedure, separate the statistics for AGEKDBRN for men and women, selecting SEX as a factor variable in the Explore window. Click on Analyze, Descriptive Statistics, Explore, and then insert AGEKDBRN into the Dependent List and SEX in the Factor List. What differences exist in the age of men and women at the birth of their first child? Assess the differences between men and women based on measures of central tendency and variability.
3. Repeat the procedure in Exercise 2, investigating the dispersion in the variables EDUC (education) and PRESTG80 (occupational prestige score). Select your own factor (nominal) variable to make the comparison (such as CLASS, RACECEN1, or some other factor). Click on Analyze, Descriptive Statistics, Explore, and insert EDUC and PRESTG80 into the Dependent List and your factor variable of choice in the Factor List. In a paragraph or two, use appropriate measures of variability to summarize the results.
4. Using GSS10SSDS, investigate respondents’ confidence in the military (CONARMY) and the press (CONPRESS).
- First, use SPSS to identify the level of measurement for each variable.
- Based on the level of measurement for each variable, what would be the appropriate measures of central tendency? What are the appropriate measures of variability?
- Use SPSS and your calculator if necessary to calculate the appropriate measures of central tendency and variability for each variable.
- Do respondents more positively view press or military performance?
- Examine whether or not your answer to (d) varies by gender. Hint: You may want to use the Data and Split File feature.
5. Use GSS10SSDS to study the number of hours that blacks and whites work each week. The variable HRS1 measures the number of hours a respondent worked the week before the interview. Use the Explore procedure to study the variability of hours worked, comparing blacks and whites (RACECEN1) in the GSS sample.
- Is there a difference between the two groups in the variability of work hours?
- Write a short paragraph describing the box plot that SPSS created as if you were writing a report and had included the box plot as a chart to support your conclusions about the difference between blacks and whites in the variability (and central tendency) of hours worked.
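The measures of variability requested in this chapter (range, interquartile range, variance, and standard deviation) can be sketched as follows. The two lists are invented for illustration and are not GSS data. Quartile conventions differ across software, so the IQR from this sketch may not match SPSS output exactly.

```python
from statistics import quantiles, stdev, variance


def variability(values):
    """Range, interquartile range, sample variance, and sample standard deviation."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": variance(values),  # divides by n - 1
        "std_dev": stdev(values),
    }


# Hypothetical weekly work hours for two groups; NOT GSS data.
hours_group_a = [20, 35, 40, 40, 45, 60]
hours_group_b = [38, 40, 40, 40, 42, 40]

spread_a = variability(hours_group_a)
spread_b = variability(hours_group_b)

print("Group A:", spread_a)
print("Group B:", spread_b)
```

Both made-up groups have the same mean (40 hours), yet group A is far more dispersed. This is why the exercises ask for more than one statistic before concluding that one distribution is more variable than another.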
Chapter 5: The Normal Distribution
[GSS10SSDS]
1. The majority of variables that social scientists study are not normally distributed. This doesn’t typically cause problems in analysis when the goal of a study is to calculate means and standard deviations—as long as sample sizes are greater than about 50. (This will be discussed in later chapters.) However, when characterizing the distribution of scores in one sample, or in a complete population (if this information is available), a non-normal distribution can cause complications. We can illustrate this point by examining the distribution of age in the GSS data file.
- Create a histogram for AGE (click on Graphs, Legacy Dialogs, Histogram; insert the variable age) with a superimposed normal curve (click on the option Display Normal Curve). How does the distribution of AGE deviate from the theoretical normal curve?
- Calculate the mean and standard deviation for AGE in this sample, using either the Frequencies or Descriptives procedure.
- Assuming the distribution of AGE is normal, calculate the number of people who should be 25 years of age or less.
- Use the Frequencies procedure to construct a table of the percentage of cases at each value of AGE. Compare the theoretical calculation in (c) with the actual distribution of age in the sample. What percentage of people in the sample are 25 years old or less? Is this value close to what you calculated? Why might there be a discrepancy?
2. SPSS will calculate standard scores for any distribution. Examine the distribution of EDUC (years of school completed).
- Have SPSS calculate Z scores for EDUC. (See the SPSS Demonstration above if this is unclear.)
- What is the equivalent Z score for someone who has completed 18 years of education?
- Use the Frequencies procedure to find the percentile rank for a score of 18.
- Does the percentile rank that you found from Frequencies correspond to the Z score for a value of 18? In other words, is the distribution for years of education normal? If so, then the Z score that SPSS calculates should be very close, after transforming it into an appropriate area, to the percentile rank for that same score.
- Create histograms for EDUC and the new variable ZEDUC. Explain why they have the same shape.
3. Repeat the procedure in Problem 2, this time running separate analyses for men versus women (SEX) and blacks versus whites (RACECEN1) based on the variable EDUC. Remember, you can run separate analyses using the Data Split File command. Click on Data, Split File, Organize Output by Groups and select either SEX or RACECEN1. Is there a difference in EDUC among men/women and blacks/whites in the GSS sample? How would you describe the distribution of EDUC for the four groups?
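The logic of Exercise 1 (convert a score to Z, find the area below it, and compare with the observed share) can be sketched in Python. The ages below are hypothetical and are not the GSS sample; the sketch assumes the sample standard deviation (dividing by n - 1), which is what SPSS reports in Descriptives.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical ages; NOT the GSS sample.
ages = [19, 23, 25, 31, 38, 44, 47, 52, 58, 63, 70, 82]

mu = mean(ages)
sigma = stdev(ages)  # sample standard deviation


def z_score(value, mu, sigma):
    """Number of standard deviations a raw score lies from the mean."""
    return (value - mu) / sigma


# Theoretical share and count at or below age 25 if AGE were normal.
z_25 = z_score(25, mu, sigma)
expected_share = NormalDist().cdf(z_25)
expected_count = expected_share * len(ages)

# Observed share at or below age 25 in the (made-up) sample.
observed_share = sum(a <= 25 for a in ages) / len(ages)

print(f"Z for age 25: {z_25:.2f}")
print(f"Expected share under normality: {expected_share:.3f}")
print(f"Observed share: {observed_share:.3f}")
```

A gap between the expected and observed shares is the discrepancy the exercise asks students to explain. Because a Z score is a linear transformation of the raw score, a histogram of Z scores keeps the shape of the original distribution.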
Chapter 6: Sampling and Sampling Distributions
Using GSS10SSDS, repeat the SPSS demonstration, selecting 25%, 50%, and 75% samples and requesting descriptives for MAEDUC and PAEDUC. Compare your descriptive statistics with descriptives for the entire sample. What can you say about the accuracy of your random samples?
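The same comparison can be sketched in Python by drawing simple random samples of different sizes from one list of values. The "population" below is randomly generated stand-in data, not MAEDUC or PAEDUC from the GSS, and the sketch draws exact fractions of the file rather than approximate ones.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical years of schooling standing in for MAEDUC; NOT GSS data.
population = [random.choice(range(6, 21)) for _ in range(400)]


def sample_mean(values, fraction):
    """Draw a simple random sample without replacement; return (n, mean)."""
    n = round(len(values) * fraction)
    drawn = random.sample(values, n)
    return n, mean(drawn)


full_mean = mean(population)
results = {fraction: sample_mean(population, fraction) for fraction in (0.25, 0.50, 0.75)}

for fraction, (n, m) in results.items():
    print(f"{fraction:.0%} sample: n={n}, mean={m:.2f}, "
          f"difference from full file={m - full_mean:+.2f}")
```

Rerunning the sketch with different seeds shows how much sample means vary from draw to draw, which is the sampling variability the exercise asks students to comment on.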
Chapter 7: Estimation
[GSS10SSDS]
1. Recall that the GSS sample includes men and women from 18 to 89 years of age. Does it matter that we may have responses from men and women of diverse ages? Would our results change if we selected a younger sample of men and women?
- To take the SPSS demonstration one step further, use the Select Cases procedure to select respondents based on the variable AGE who are less than or equal to 35 years old. Do this by selecting Data and then Select Cases. Next, select If Condition is satisfied and then click on If. Find and highlight the variable AGE in the scroll-down box on the left of your screen. Click the arrow next to the scroll-down box. AGE will now appear in the box on the right. Now, tell SPSS that you want to select respondents who are 35 years of age or less. The box on the right should now read AGE <= 35. Click Continue and then OK.
- Using this younger sample, repeat the Explore procedure that we just completed in the demonstration. What differences exist between men and women in this younger sample on the ideal number of children? How do these results compare with those based on the entire sample?
2. Calculate the 90% confidence interval for the following variables, comparing lower, working, middle, and upper classes (CLASS) in the GSS sample. First, tell SPSS that we want to select all cases in the sample by selecting Data, Select Cases, and then All Cases, and then OK. Then, use the Explore procedure using CLASS as your factor variable (Analyze, Descriptive Statistics, Explore). Make a summary statement of your findings.
- CHILDS (Number of children in the household)
- EDUC (Respondent’s highest year of school completed)
- PAEDUC (Father’s highest year of school completed)
- PRESTG80 (Respondent’s occupational prestige)
- MAEDUC (Mother’s highest year of school completed)
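A 90% confidence interval for a mean can be sketched as the sample mean plus or minus Z times the standard error. The Python example below uses the normal approximation and invented values (not GSS data). SPSS output may differ slightly, particularly for small groups, because this sketch uses Z rather than a t value.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(values, level=0.90):
    """Normal-approximation confidence interval for a mean."""
    n = len(values)
    y_bar = mean(values)
    std_error = stdev(values) / sqrt(n)
    # Two-tailed Z value: about 1.645 for 90%, 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return y_bar - z * std_error, y_bar + z * std_error


# Hypothetical numbers of children by class; NOT GSS data.
children_by_class = {
    "working": [0, 1, 2, 2, 3, 4, 2, 1, 3, 2],
    "middle": [0, 0, 1, 2, 2, 2, 1, 3, 1, 2],
}

intervals = {
    label: confidence_interval(values, level=0.90)
    for label, values in children_by_class.items()
}

for label, (low, high) in intervals.items():
    print(f"{label}: 90% CI = ({low:.2f}, {high:.2f})")
```

Raising the confidence level widens the interval, and a larger group narrows it through the smaller standard error.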
Chapter 8: Testing Hypotheses
1. Use the GSS10SSDS file to investigate whether or not Americans have at least two children per person. Use the One Sample T Test procedure to do this test with the variable CHILDS. Do the test at the .01 significance level. What did you find? Do Americans have two children, more, or less?
2. Investigate the difference between individuals who support legalization of marijuana and those who do not based on data from the GSS. Use the variable GRASS as your independent or grouping variable (1 = legal and 2 = not legal). Investigate whether there is a significant difference between these two groups in terms of their age (AGE), number of children (CHILDS), and education (EDUC). Assume that α is .05 for a two-tailed test. Based on your analysis, write three Step 5–type statements summarizing your findings.
3. Extend your analysis in Exercise 2, this time by comparing individuals who support/do not support gun permits (GUNLAW). Use the same dependent variables, AGE, CHILDS, and EDUC, to estimate t tests. Assume α is .05 for a two-tailed test. Prepare a statement to summarize your findings.
4. The GSS10SSDS includes a measure of highest educational degree completed (DEGREE). Test whether there is a significant difference between those with less than high school (coded 0) and those with a bachelor’s degree (coded 3) in the number of hours they watch television (TVHOURS) and in the number of hours on the Internet per week (WWWHR). Assume α is .05 for a two-tailed test. Prepare a statement to summarize your findings.
5. In this exercise, we will use data from the MTF2011 survey, comparing GPA between two racial/ethnic student groups. First, you’ll need to run frequencies of the variable RACE, taking note of how each racial/ethnic group is coded (e.g., black students are coded as “1”). The MTF 2011 includes three categories for RACE; you’ll need to select two as your independent (factor) variables. Use GPA as your dependent variable. Calculate the t-test models for two sets of GPA comparisons, for example, white versus black and black versus Hispanic. Assume α is .05 for a two-tailed test. Prepare a statement to summarize your findings.
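The test statistics behind these exercises can be sketched directly. The Python example below computes a one-sample t and a pooled-variance two-sample t from invented values (not GSS or MTF data). It returns the obtained t and its degrees of freedom only; the p value still has to come from a t table or from the SPSS output. The pooled version assumes equal population variances, which is only one of the two results SPSS reports.

```python
from math import sqrt
from statistics import mean, variance


def one_sample_t(values, mu0):
    """Obtained t and degrees of freedom for H0: population mean = mu0."""
    n = len(values)
    std_error = sqrt(variance(values) / n)
    return (mean(values) - mu0) / std_error, n - 1


def pooled_two_sample_t(group1, group2):
    """Obtained t and degrees of freedom assuming equal population variances."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / std_error, n1 + n2 - 2


# Hypothetical numbers of children; NOT GSS data.
children = [0, 1, 2, 2, 3, 4]
t_one, df_one = one_sample_t(children, mu0=2)

# Hypothetical scores for two groups; NOT GSS or MTF data.
t_two, df_two = pooled_two_sample_t([1, 2, 3], [3, 4, 5])

print(f"one-sample: t = {t_one:.3f}, df = {df_one}")
print(f"two-sample: t = {t_two:.3f}, df = {df_two}")
```

The sign of the two-sample t depends only on which group is listed first; the decision about the null hypothesis rests on its absolute size relative to the critical value.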
Chapter 9: Bivariate Tables
[GSS10SSDS and MTF11SSDS]
1. The GSS data set includes responses to questions about the respondent’s general happiness (HAPPY) and his or her subjective class identification (CLASS). Analyze the relationship between responses to these two questions with the SPSS Crosstabs procedure, requesting counts and appropriate cell percentages. (Click on Analyze, Descriptive Statistics, and Crosstabs to get started.)
- What percentage of working-class people responded that they were “very happy”?
- What percentage of lower-class people were “very happy”?
- What percentage of those who were “pretty happy” were also from the middle and upper classes?
- Most of the people who said that they were “very happy” were from which two classes?
- Is there a relationship between perceived class and perceived happiness? If there is a relationship, describe it. Is it strong or weak? (Hint: Use perceived class as the independent variable.)
- Rerun your analysis, this time adding RACECEN1 as a control variable. Is there a difference in the relationship between perceived class and happiness for whites and blacks in the sample? (Because of the large number of racial categories, just compare blacks and whites.)
2. Analyze the relationship between self-reported health condition and general happiness.
- Use SPSS to construct a table showing the relationship between health condition (HEALTH) and reported general happiness (HAPPY). (Hint: Use HEALTH as the dependent variable.) Next, use SPSS to construct tables showing the same relationship controlling for sex.
- Overall, are women or men more likely to report “excellent” health?
- Do women and men with higher levels of happiness report poor or excellent health? Is there a relationship between happiness and health? Make sure to support your answer with data from your cross-tabulation.
3. Is there a difference in attitudes about abortion depending on the circumstance of the woman’s pregnancy or her reason for an abortion? Separately assess the relationship between SEX and two abortion items, ABPOOR (Should a woman have an abortion if she can’t afford any more children?) and ABHLTH (Should a woman have an abortion if her health is seriously endangered?). What do you conclude?
4. For the GSS2010, respondents were asked to report which candidate they voted for in the 2004 and 2008 presidential elections (PRES04 and PRES08) and their feelings about the Bible (BIBLE). Does a relationship exist between a respondent’s 2008 vote and her or his feelings about the Bible? For example, if someone thinks that the Bible is a “book of fables,” did the individual vote for Senator John Kerry or President George W. Bush in 2004? If the respondent believes the Bible is the word of God, how did the respondent vote in 2004 or 2008?
- Which variable should be defined as the dependent variable? Explain your answer.
- Using SPSS Crosstabs, create two tables with BIBLE and each of the PRES variables. Explain the relationship between the two variables for 2004 and 2008. (Remember, when you discuss your findings, you should exclude those respondents who did not vote.)
- Examine the relationship between BIBLE and one of the PRES variables with a control variable of your choice.
5. Based on the MTF11SSDS, examine the relationship between a teen’s race (RACE) and the number of friends who drink alcohol (FRDRINK) and smoke cigarettes (FRSMOKE). Using SPSS Crosstabs, create two tables with RACE and each friend variable. What is the relationship between these variables?
[GSS10SSDS and MTF11SSDS]
6. The GSS 2010 contains a series of questions about the role of women at home and at work. It is very likely that the responses to these questions vary by sex—or do they?
- Use SPSS to investigate the relationship between SEX and FECHLD (a working mother does not hurt her children). Create a bivariate table and ask for appropriate percentages and expected values. Does the table have a large number of cells with expected values less than 5? Are there any surprises in the data?
- Have SPSS calculate chi-square for the table.
- Test the null hypothesis at the .05 significance level. What do you conclude?
- Select another demographic variable (DEGREE or CLASS) and investigate its relationship with FECHLD.
7. Is it better for a man to work and a woman to stay at home? Women and men were asked this question in the GSS 2010. Investigate the relationship between marital status (MARITAL) and responses to this question (FEFAM). Have SPSS calculate the cross-tabulation of both variables, along with chi-square (set alpha at .05). What can you conclude?
8. The MTF data set includes teens’ attitudes toward different types of drug use—alcohol (ATDRINK), marijuana (ATWEED), and cigarettes (ATSMOKE). Create bivariate tables with these drug variables, along with demographic variables such as sex, race, or age. Have SPSS calculate the appropriate percentages and chi-squares (set alpha at .05). What relationship exists between your selected demographic variable and attitudes toward trying these drugs?
9. Use GSS 2010 to examine the relationship between respondent’s health (HEALTH) and social class (CLASS). Treat social class as the independent variable.
- Request the appropriate measures of associations to describe the relationship.
- Add SEX as a control variable and calculate the association measure for each partial table. Is the relationship stronger for women or men? Can you think of reasons why this might be so?
- What other control variables may be appropriate? Continue to examine the relationship between HEALTH and CLASS with one additional control variable.
10. Investigate the relationship between the abortion attitudes in GSS 2010 (e.g., ABANY) and various demographic variables (you might begin with gender, age, or race). Examine the relationship of these variables based on the appropriate measures of association. For example, you might examine whether attitude toward each of the abortion items has a similar relationship to gender. That is, if females are supportive of abortion for rape victims, are they also supportive of abortion in other circumstances? Try exploring these relationships further by adding control variables. You might create tables of abortion attitude by race and by gender. When you have finished the analysis, write a short report summarizing the findings. Suggest possible causes for the relationships you found.
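The column percentages and chi-square statistic requested throughout this chapter can be sketched as follows. The counts are invented (not from GSS10SSDS or MTF11SSDS), and the sketch assumes the independent variable defines the columns, so percentages are computed within columns.

```python
def column_percentages(table):
    """Percentage of each column total falling in each cell."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / total for cell, total in zip(row, col_totals)] for row in table]


def chi_square(table):
    """Pearson chi-square, degrees of freedom, and expected counts.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: (row total * column total) / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical 2 x 2 table of observed counts; NOT GSS or MTF data.
# Rows: agree / disagree. Columns: group 1 / group 2.
observed = [[30, 10], [20, 40]]

percentages = column_percentages(observed)
statistic, df, expected = chi_square(observed)

print("Column percentages:", percentages)
print("Expected counts:", expected)
print(f"Chi-square = {statistic:.2f}, df = {df}")
# With df = 1, the critical value at the .05 level is 3.841.
```

Checking the expected counts before interpreting the statistic matters because chi-square is unreliable when many cells have expected values below 5, which is the check Exercise 6 asks for.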
Chapter 10: Analysis of Variance
[GSS10SSDS]
1. Let’s continue to examine the relationship between fertility decisions and education. But this time, we’ll analyze the relationship for men.
- Run a Select Cases, selecting only men for the analysis.
- Compute an ANOVA model for men, using age at first-born child (AGEKDBRN) as the dependent variable and educational degree (DEGREE) as the independent variable. Based on the SPSS output, what can you conclude about the relationship between degree attainment and AGEKDBRN for men? How do these results compare with the results for women in the SPSS demonstration?
- Compute a second ANOVA model for men, using number of children (CHILDS) as the dependent variable and educational degree (DEGREE) as the independent variable. Based on your results, what conclusions can you make about the relationship between the two variables?
2. Repeat Exercise 1b, substituting respondent’s social class (CLASS) as the independent variable in separate models for men and women. What can you conclude about the relationship between CLASS and AGEKDBRN?
3. We’ll continue our analysis of fertility decisions, examining responses to the question, What is the ideal number of children a family should have (variable CHLDIDEL)? Use CHLDIDEL as your dependent variable and DEGREE as your independent variable. Is there a significant difference in the number of ideal children among different educational groups? (Option: You can run three sets of analyses—first, for all GSS respondents; second, an ANOVA model for women only; and finally, a model for men.)
4. Examine attitudes toward affirmative action based on two variables: AFFRMACT and DISCAFF. AFFRMACT measures respondents’ support of preferential hiring and promotion of blacks (a higher score indicates opposition). For the variable DISCAFF, individuals reported how likely it is that a white person won’t get a job or promotion while an equally or less qualified black person gets one (a higher score indicates “not very likely”). Using AFFRMACT and DISCAFF as your dependent variables, determine whether there are significant differences in attitudes by social class (CLASS). (You should have two ANOVA models, with CLASS as the independent variable in both models.)
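The F ratio that SPSS reports for a one-way ANOVA is the between-group mean square divided by the within-group mean square. A minimal Python sketch follows, using invented values (not GSS data) for three groups.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df between, df within) for a list of groups of scores."""
    all_values = [y for group in groups for y in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared distance of
    # each group mean from the grand mean.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each score from
    # its own group mean.
    ssw = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three educational groups; NOT GSS data.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova(groups)

print(f"F = {f_ratio:.2f}, df = ({df_between}, {df_within})")
```

A large F says only that at least one group mean differs from the others; it does not identify which groups differ.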
Chapter 11: Regression and Correlation
[GSS10SSDS]
1. Use the GSS10SSDS data file to study the relationship between the number of siblings a respondent has (SIBS) and his or her number of children (CHILDS).
- Construct a scatterplot of these two variables in SPSS, and place the best-fit linear regression line on the scatterplot. Describe the relationship between the number of siblings a respondent has (SIBS) and the number of his or her children (CHILDS).
- Have SPSS calculate the regression equation predicting CHILDS with SIBS. What are the intercept and the slope? What are the coefficient of determination and the correlation coefficient?
- What is the predicted number of children for someone with three siblings?
- What is the predicted number of children for someone without any siblings?
- Can you find a way for SPSS to calculate the error of prediction and predicted value for each respondent and save them as new variables?
2. Use the same variables as in Exercise 1, but do the analysis separately for men and women. Begin by locating the variable SEX. Click Data, Split File, and then select Organize Output by Groups. Insert SEX into the box and click OK. Now, SPSS will split your results by sex.
- Have SPSS calculate the regression equation for men and women. (Note: You will need to scroll down through your output to find the results for men and women.) How similar are they?
- What is the predicted number of children for a man with two siblings? Six siblings? For a woman with the same number of siblings? Which is greater?
3. Use the same variables as in Exercise 1, but do the analysis separately for whites and blacks. Click Data, Split File, and then select Organize Output by Groups. Insert RACECEN1 into the box and click OK. (Note: Be sure to remove SEX from the box if it is still there from the previous exercise.) Now, SPSS will split your results by RACECEN1.
- Is there any difference between the regression equations for whites and blacks?
- What is the predicted number for whites and blacks with the same number of siblings: one sibling, four siblings, and seven siblings?
4. Use the same variables as in Exercise 1, but do the analysis separately for married and divorced respondents. Begin by locating the variable MARITAL. Click Data, Split File, and then select Organize Output by Groups. Insert MARITAL into the box and click OK. (Note: Be sure to remove SEX and/or RACECEN1 from the box if they are still there from the previous exercises.) Now, SPSS will split your results by marital status.
- Is there any difference between the regression equations for married and divorced respondents?
- What is the predicted number of children for married and divorced respondents with the following number of siblings: one sibling, four siblings, and seven siblings?
- What differences, if any, do you find? Is the number of siblings a better predictor of number of children for married respondents or for divorced respondents?
5. Use the 2010 GSS file (GSS10SSDS) to investigate the relationship between the respondent’s education (EDUC) and the education received by his or her father and mother (PAEDUC and MAEDUC, respectively).
- Use SPSS to find the correlation coefficient, the coefficient of determination, and the regression equation predicting the respondent’s education with father’s education only. Interpret your results.
- Use SPSS to find the multiple correlation coefficient, the multiple coefficient of determination, and the regression equation predicting the respondent’s education with father’s and mother’s education. Interpret your results.
- Did taking into account the respondent’s mother’s education improve our prediction? Discuss this on the basis of the results from 5b.
- Using the regression equation from 5a, calculate the predicted number of years of education for a person with a father with 12 years of education. Then, repeat this procedure, adding in a mother’s 12 years of education and using the regression equation from 5b.
6. In Problem 3, we looked at the linear relationship between SIBS and CHILDS for whites and blacks. In this problem, we continue with this comparison except that now we want to look at ANOVA and the F statistic. What is the F statistic? What is its p level? Are there differences between whites and blacks? Are we able to reject the null hypothesis that r² = 0? Compare these hypotheses between whites and blacks.
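The quantities requested in this chapter (intercept, slope, r, r², predicted values, errors of prediction, and the F statistic for a bivariate model) can be sketched in Python. The paired values are invented and are not SIBS and CHILDS from the GSS. The p level for F is not computed here; it comes from an F table or from the SPSS output.

```python
from math import sqrt
from statistics import mean


def simple_regression(x, y):
    """Least-squares results for the line Y = a + bX."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    r2 = r ** 2                # coefficient of determination
    # F for one predictor: explained variance over 1 df, unexplained
    # variance over n - 2 df.
    f_stat = r2 / ((1 - r2) / (n - 2))
    return {"a": a, "b": b, "r": r, "r2": r2, "F": f_stat, "df": (1, n - 2)}


# Hypothetical (siblings, children) values; NOT GSS data.
sibs = [0, 1, 2, 3, 4]
childs = [1, 1, 2, 2, 4]

fit = simple_regression(sibs, childs)

# Predicted value and error of prediction (residual) for each case.
predicted = [fit["a"] + fit["b"] * x for x in sibs]
residuals = [y - y_hat for y, y_hat in zip(childs, predicted)]

print(f"Y = {fit['a']:.2f} + {fit['b']:.2f}X, r = {fit['r']:.3f}, r2 = {fit['r2']:.3f}")
print(f"F = {fit['F']:.2f} with df = {fit['df']}")
print(f"Predicted Y for X = 3: {fit['a'] + fit['b'] * 3:.2f}")
```

The residuals sum to zero (apart from rounding) for any least-squares line with an intercept, which is a quick check on saved predicted values and errors.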