Exercise 1: Gathering data from online newspaper discussion forums

In Chapter 5 in the book, you will find exercises on collecting data from Twitter (in Box 5.1) and from Instagram (Box 5.2). In the reflective report on discourse analytic research (see Appendix 2 in the book), you will see that Simon Goodman and Lottie Rowe gathered data from another social media source: online newspaper discussion forums.

In this exercise, you are invited to gather data from online newspaper discussion forums about a particular aspect of a controversial issue, either current or from the recent past. Here, as an example, we invite you to gather data on resistance to public health measures that were introduced in response to the Covid-19 pandemic that began in 2020, such as ‘lockdowns’ and requirements to wear masks in public settings.

Normally you would not start gathering data before you had developed research aims and research questions. In this case, your research aim would be to examine responses of resistance to anti-Covid-19 measures in online newspaper discussion forums. Your research questions will be shaped by the analytic approach that you intend to take to the data. For example, if you decide that you will use a discursive approach (see Chapter 15 in the book), your research questions might be something like ‘How is resistance to anti-Covid-19 measures constructed and done in online newspaper discussion forums? How is resistance responded to there? Which ideologies or discourses or interpretative repertoires are used in working up and critiquing resistance?’ If you were going to use thematic analysis, you would probably have different research questions. As another exercise, you may wish to read Chapter 7 in the book and develop research questions suited to thematic analysis (see Table 7.1 for a typology of research questions suitable for studies using thematic analysis).

To prepare for this exercise, read the method section of Goodman and Rowe’s reflective report and see the decisions they made in selecting their data.

When sampling social media data, you will need to make a series of decisions that will progressively narrow the data set, especially when sampling data on a topic that is likely to have generated a very large number of responses. Otherwise, you will end up with more data than you can handle. At the same time, you will need to achieve a diversity of relevant data so that you can obtain insights into the different ways in which your specific topic is discussed in these online forums. Make notes about each decision that you make and the rationale for it. If you were to analyse the data and then write up your research, you would need to provide an account of your data collection strategy. That account can be difficult to reconstruct in retrospect, so keep detailed notes as you go.

The first practical thing you need to do is to decide which online newspaper discussion forums you are going to use as data sources. In the case of data on resistance to anti-Covid-19 measures, you may wish to focus on newspapers that are based in the country where you are doing the research (comments in online discussion forums may of course be made by anyone anywhere in the world), that have the strongest online presence in terms of number of daily visitors, that allow readers to comment on articles, and that allow you to view comments without payment. In the UK, that would mean sampling from The Mail Online, The Sun, The Daily Mirror and The Guardian. Those outlets also differ in terms of their political tone and are likely to attract different readerships. Hence, sampling from each of them is likely to produce a diversity of comments in terms of tone and outlook.

Next you need to search for articles in the online version of those newspapers that relate explicitly to resistance to anti-Covid-19 measures, using appropriate search terms. You may wish to limit your search to time periods when this topic was particularly pertinent, such as the days preceding and following the introduction of or changes in anti-Covid-19 measures. Alternatively, you might decide to sample at regular intervals that are not tied to particular events. For example, you could search for relevant articles that were published on the first day of each month during the Covid-19 pandemic. Note that, although you are searching for articles, those articles will not be part of your data set. Instead, your data will be drawn from the comments made by readers in response to the articles.
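If you choose to sample articles published on the first day of each month, it can help to list the sampling dates in advance so that you can record them in your notes. The following is a minimal sketch in Python. The function name first_days_of_months and the date range used are assumptions made for illustration only; you would substitute the period that suits your own study.

```python
from datetime import date


def first_days_of_months(start: date, end: date) -> list:
    """Return the first day of every month that falls between start and end (inclusive)."""
    year, month = start.year, start.month
    # If the start date is after the 1st, the first eligible date is in the next month.
    if start.day > 1:
        month += 1
        if month > 12:
            year, month = year + 1, 1

    days = []
    while date(year, month, 1) <= end:
        days.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return days


# Illustrative period only (an assumption, not a recommendation from the exercise).
sampling_dates = first_days_of_months(date(2020, 3, 1), date(2020, 8, 31))
for d in sampling_dates:
    print(d.isoformat())
```

You could then search each newspaper for relevant articles published on each of the listed dates and note in your records which dates yielded usable articles.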

Consider the number of articles that your initial searches are generating. If there is a huge number or very few, refine your search terms and try again. Dip into the articles that are being identified through your searches and check that they are actually relevant to your topic. If there are a lot of irrelevant articles, re-examine your search terms and try again.

Now you need to identify the articles that you will focus on. If articles have generated a lot of comments and discussion, you may only need to sample one or two articles from each online newspaper, perhaps choosing those that have generated the greatest number of comments. In the study described in their reflective report, Simon Goodman and Lottie Rowe analysed comments generated by just two newspaper articles and an article that appeared on a website that was relevant to their research topic. It is easy to get carried away by enthusiasm at this point and to gather more data than you could possibly analyse. Once again, remember that you are seeking to balance diversity in data and practicalities. If articles have generated an enormous number of comments, you might decide to use only a selection of comments. Be careful how you select, though. You might select, say, the first 50 comments or at least enough comments to provide a clear sense of how the online discussion unfolds. That would be preferable to selecting, say, every tenth comment because that would lose the thread or logic of any discussion.
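If you have recorded your candidate articles and their comments electronically, the two selection steps described above (choosing the most-commented articles from each newspaper, then taking the first comments in posting order) could be sketched as follows in Python. This is only an illustration under stated assumptions: the Article structure, the function names most_commented and first_n_comments, and the placeholder outlets and comments are all hypothetical, and the sketch assumes you have already recorded the comments by hand or by other means.

```python
from dataclasses import dataclass


@dataclass
class Article:
    outlet: str      # name of the online newspaper
    title: str
    comments: list   # reader comments, in the order they were posted


def most_commented(articles, per_outlet=2):
    """Group articles by outlet and keep the most-commented ones from each."""
    by_outlet = {}
    for article in articles:
        by_outlet.setdefault(article.outlet, []).append(article)
    return {
        outlet: sorted(items, key=lambda a: len(a.comments), reverse=True)[:per_outlet]
        for outlet, items in by_outlet.items()
    }


def first_n_comments(article, n=50):
    """Keep the first n comments so that the thread of the discussion is preserved."""
    return article.comments[:n]


# Placeholder data for illustration only.
articles = [
    Article("Outlet A", "Article 1", [f"comment {i}" for i in range(120)]),
    Article("Outlet A", "Article 2", [f"comment {i}" for i in range(30)]),
    Article("Outlet A", "Article 3", [f"comment {i}" for i in range(75)]),
    Article("Outlet B", "Article 4", [f"comment {i}" for i in range(10)]),
]

selected = most_commented(articles, per_outlet=2)
sample = first_n_comments(selected["Outlet A"][0], n=50)
```

Taking a contiguous run of comments keeps replies next to the comments they respond to, whereas a slice such as comments[::10] (every tenth comment) would separate them.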

When you have assembled a data set that is diverse and of a practicable size for analysis and you have recorded the text of the comments, you could then move on to analysing the data if you wish. Exercise 3 in Chapters 15/16 invites you to analyse your data using a form of discourse analysis.

In this example, we have used online newspaper discussion forums but similar data could also be obtained from the online discussion forums of popular news channels.