Abstract

It all started with the Internet 2.0 at the turn of the millennium. Not only nerds but every ‘normal user’ could now exchange messages and edit web sites, and, ever since, the Internet has become a sphere of human communication that has outperformed or reinvented all technical communication media. A plethora of techniques and methods is available for studying the social processes evolving in this sphere. Hidden, i.e., nonreactive, data collection is of particular interest for examining this communication. It facilitates a non-invasive type of research that holds the promise of recording communication without interfering with it. In recent years, nonreactive data collection on the Internet, and in particular in social media, has increased on an unprecedented scale. At the heart of data collection in social media (‘big data’) is nonreactive data sampling. In turn, nonreactive data collection is part of a larger technological and cultural development characterized by datafication, a term coined by Cukier and Mayer-Schoenberger (2014) that denotes ‘the ability to render into data many aspects of the world that have never been quantified before’ (Cukier and Mayer-Schoenberger, 2014, p. 29).

To account for the many uses of nonreactive data collection on the Internet, this chapter considers this group of methods from two different vantage points. The epistemological perspective debates the methodological frame of nonreactive data collection. Among the issues addressed are the combination of data gathered in a nonreactive way and the potentials and limits of this approach for arriving at a richer account of the phenomena studied. The technical perspective presents and discusses techniques (e.g., cookies, log files, environment variables, time measurement, APIs) used in online nonreactive data collection.