Answers to Exercises and Questions for Discussion

Download the alcohol marketing fuzzy set dataset Fuzzysetalcoholmarketing.dat from https://study.sagepub.com/kent and save it onto your system. 

Download fsQCA. Just Google fsQCA and select fs/QCA software. Download fsQCA 2.0. Select File|Open|Data, browse to your saved .dat file and click on Open. Select Analyze|Statistics|Frequencies, transfer all the variables to the Frequencies box and click on OK. Now open SPSS and select File|Open|Data. In Open Data, under Files of type, select Text, then select Fuzzysetalcoholmarketing.dat and click on Open. Follow the Text Import Wizard. Now go to Analyze|Descriptive Statistics|Frequencies and select all the variables. Are the results identical to those from fsQCA?

Yes, the frequencies and the percentages are exactly the same, although the tables in fsQCA are much cruder. The exercise makes the point that both fsQCA and SPSS will use the same data matrix, but do very different things with it.
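The point that a frequency table depends only on the data matrix, not on the package that reads it, can be illustrated with a minimal Python sketch. The function name frequency_table and the demo scores are assumptions made for illustration; the scores are not taken from Fuzzysetalcoholmarketing.dat.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows sorted by value,
    similar in spirit to a basic Frequencies table."""
    counts = Counter(values)
    total = len(values)
    return [
        (value, count, round(100.0 * count / total, 1))
        for value, count in sorted(counts.items())
    ]


# Hypothetical fuzzy membership scores for one variable (illustrative only).
fslikeads_demo = [0.0, 0.33, 0.33, 0.67, 1.0, 1.0, 1.0, 0.67]

for value, count, percent in frequency_table(fslikeads_demo):
    print(f"{value:>5}  {count:>3}  {percent:>5}%")
```

Any program that counts the same column of values should arrive at the same counts and percentages; only the presentation of the table differs.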

Using your new SPSS file from Exercise 1, try regressing fsintention against cssibsdrink, fslikeschool, fslikeads, fsinvolve and fsaware. Compare the results with Figure 7.23 in Chapter 7, which uses the same model, but in fsQCA.

In SPSS select Analyze|Regression|Linear. Transfer fsintention to the Dependent box and cssibsdrink, fslikeschool, fslikeads, fsinvolve and fsaware to the Independent box. Click on OK. The adjusted R² is very low at 0.197 and the standardized beta coefficients are also low and, as we would expect, negative for fslikeschool. Having siblings who drink has the highest coefficient.

From the fuzzy set analysis, being aware of, involved in and liking alcohol ads plus not liking school are, in combination, sufficient to an acceptable level of consistency for the outcome of having an intention to drink alcohol in the next year. Alternatively, being aware of alcohol ads and having siblings who drink alcohol are sufficient for the same outcome, but to a lower level of consistency (and with higher coverage). Interestingly, using a higher consistency cut-off (Figure 7.24) makes the result more similar to the regression analysis. Having siblings who drink alcohol is important for both expressions. However, among young people who like school, liking alcohol ads is needed instead.

Remember that regression is trying to measure the fixed contribution that each variable makes to the outcome, while fsQCA is looking at combinations of conditions that might be sufficient or largely sufficient for the outcome to occur. The contribution of any individual condition depends on what other conditions it is combined with. The outputs of fsQCA are in a way more ‘messy’ than those of regression, but are closer to reality and make fewer assumptions, for example, about linearity and fixed contribution. It is also possible, for example, to run a separate analysis on the negative outcome, looking at what conditions may be sufficient for young people to have the intention not to drink alcohol in the next year.
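To make the ideas of consistency and coverage more concrete, here is a minimal Python sketch of the standard fuzzy set measures for sufficiency: consistency is the sum of min(X, Y) divided by the sum of X, and coverage is the sum of min(X, Y) divided by the sum of Y, where X is membership in the configuration and Y is membership in the outcome. Membership in a combination of conditions is taken as the minimum across the conditions, and negation as one minus the membership. The function names and the four demo cases are assumptions for illustration only; they are not drawn from the alcohol marketing dataset and do not reproduce Figure 7.23.

```python
def fuzzy_not(membership):
    """Membership in the negation of a set."""
    return 1.0 - membership


def consistency(x, y):
    """Degree to which membership in X is a subset of membership in Y."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)


def coverage(x, y):
    """Proportion of the outcome Y that is accounted for by X."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)


# Hypothetical cases: (aware, involve, likeads, likeschool, intention).
cases = [
    (0.9, 0.8, 0.7, 0.2, 0.9),
    (0.6, 0.7, 0.9, 0.4, 0.5),
    (0.3, 0.2, 0.4, 0.8, 0.1),
    (0.8, 0.9, 0.6, 0.1, 0.7),
]

# Configuration: aware AND involve AND likeads AND NOT likeschool.
config = [
    min(aware, involve, likeads, fuzzy_not(likeschool))
    for aware, involve, likeads, likeschool, _ in cases
]
intention = [case[4] for case in cases]

cons = consistency(config, intention)
cov = coverage(config, intention)
print(f"consistency = {cons:.3f}, coverage = {cov:.3f}")
```

Note that neither measure assigns a fixed contribution to any single condition: a case's membership in the configuration is limited by its weakest condition, which is why the role of one condition depends on what it is combined with.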

Given the stated hypotheses in the alcohol marketing study, which approach to data analysis would you favour or recommend?

The key research hypotheses as stated by the researchers (Gordon et al., 2010a) are that the more aware of and involved in alcohol marketing that young people are, the more likely they are to have consumed alcohol, and the more likely they are to think that they will drink alcohol in the next year. There are two (presumed) dependent variables or outcomes here: whether or not they have already consumed alcohol and how likely they think they are to drink alcohol in the next year. It is usually best to think in terms of one outcome at a time. The first is binary. If the two independent variables can be considered to be metric, then variable-based analysis would suggest that binary logistic regression would be appropriate. An SPSS analysis suggests that the R² equivalent is tiny (0.044) and only involvement gives a reasonable prediction of drink status.

The other outcome variable is how likely they think they are to drink alcohol in the next year. This is at best ordinal, and only then provided that ‘I’m not sure’ is recoded as falling between ‘Probably yes’ and ‘Probably not’, and ‘Don’t know/not stated’ is treated as a missing value. The allocated codes of 1–6 in the original dataset should certainly not be treated as metric values. One possibility is to recode intentions, awareness (number of channels on which ads for alcohol have been seen) and involvement into three ordered categories and crosstabulate intentions by awareness and intentions by involvement, taking gamma as a measure of association. These produce very low coefficients of 0.187 and 0.234. However, involvement is slightly more strongly associated with intentions than is awareness. Both of these bivariate associations could be controlled by the other independent variable; for example, the association between intentions and involvement can be measured for different categories of awareness by using the ‘layering’ function in SPSS crosstabs.
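For readers who want to see what gamma measures, the following Python sketch computes Goodman and Kruskal's gamma from a crosstabulation of two ordered variables as (C − D) / (C + D), where C is the number of concordant pairs of cases and D the number of discordant pairs. The function name and the 3 × 3 table are hypothetical; the counts are invented for illustration and are not the alcohol marketing crosstabs.

```python
def goodman_kruskal_gamma(table):
    """Gamma for a contingency table whose rows and columns are both
    ordered from low to high."""
    n_rows = len(table)
    n_cols = len(table[0])
    concordant = 0
    discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Compare each cell with every cell in a lower row.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        concordant += table[i][j] * table[k][l]
                    elif l < j:
                        discordant += table[i][j] * table[k][l]
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical crosstab: rows = involvement (low, medium, high),
# columns = intention (low, medium, high).
demo_table = [
    [20, 10, 5],
    [10, 20, 10],
    [5, 10, 20],
]

gamma = goodman_kruskal_gamma(demo_table)
print(f"gamma = {gamma:.3f}")
```

Pairs tied on either variable are ignored, which is why gamma tends to be larger than other ordinal measures computed on the same table. To mimic layering, the same function could be applied separately to the crosstab for each category of the control variable.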

An fsQCA analysis with just two conditions (involvement and awareness) is not particularly helpful. Try it – there are only four configurations and none meets an acceptable level of consistency. One of the advantages of fsQCA, however, is that it can handle all the different combinations of characteristics that are possible and work out which ones are connected with the outcome. These connections may, furthermore, be asymmetrical so that, although all those with a particular configuration manifest the outcome, not all those who manifest the outcome necessarily have that configuration – the outcome may come about in other ways.
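The four configurations that arise from two conditions can be listed with a short Python sketch. It builds a simple truth table: for each combination of presence and absence, it counts the cases with membership above 0.5 in that combination and computes the sufficiency consistency, the sum of min(X, Y) divided by the sum of X. The function names and the five demo cases are assumptions for illustration; this is a simplified sketch, not the fsQCA software's own procedure, and the values are not from the alcohol marketing dataset.

```python
from itertools import product


def configuration_membership(case, pattern):
    """Membership of one case in a configuration; pattern holds True for
    a condition that is present and False for one that is negated."""
    return min(m if present else 1.0 - m for m, present in zip(case, pattern))


def truth_table(conditions, outcome):
    """Return (pattern, number of cases, consistency) for every configuration."""
    rows = []
    n_conditions = len(conditions[0])
    for pattern in product((True, False), repeat=n_conditions):
        x = [configuration_membership(case, pattern) for case in conditions]
        n_cases = sum(1 for m in x if m > 0.5)
        total_x = sum(x)
        overlap = sum(min(a, b) for a, b in zip(x, outcome))
        cons = overlap / total_x if total_x > 0 else float("nan")
        rows.append((pattern, n_cases, cons))
    return rows


# Hypothetical cases: (involvement, awareness) memberships and intention.
conditions_demo = [(0.8, 0.9), (0.7, 0.3), (0.2, 0.6), (0.1, 0.2), (0.9, 0.7)]
intention_demo = [0.9, 0.4, 0.3, 0.1, 0.6]

table_rows = truth_table(conditions_demo, intention_demo)
for pattern, n_cases, cons in table_rows:
    print(pattern, n_cases, f"{cons:.3f}")
```

With k conditions there are 2 to the power k rows, so adding further conditions quickly increases the number of configurations that can be examined. Each row would then be compared against a chosen consistency cut-off.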

In answer to the question about which approach to recommend, I would suggest both. Each has a contribution to make to our understanding of the various relationships between the advertising of alcohol and alcohol drinking behaviour. Each has strengths and weaknesses as explained in Chapter 9. The approaches could be mixed in various ways suggested in the same chapter. They could be phased in two stages, possibly beginning with the variable-based analyses to look at the patterns between the key variables and going on to include a range of conditions in an fsQCA analysis that shows how various configurations relate to the outcome. Alternatively, they could be concurrent or overlapping, showing how the two approaches compare and either support or contradict one another. The data could be mixed, with the fuzzy set membership values incorporated into SPSS as in Exercise 1 above. Notice that the very activity and thought that go into creating fuzzy set memberships may help to transform a variable like intention to drink alcohol in the next year from a nominal or at best ordinal variable into a set of fuzzy set membership values that could be treated as metric.