We live in an age where data pervades almost every aspect of our lives. In Making It Count, I note that “to know something through numbers remains one of the most powerful ways of knowing in the modern world. Powerful not because such knowing is necessarily or always nearer the truth (were we to grant the singularity of such a thing), but powerful because numbers offer a tool of persuasion and a basis for rational, methodical, calibrated, and repeatable actions that remain unmatched.” Over the past few decades, as computational and storage capacities have dramatically increased, this reliance on numbers, not just to define but also to shape our world, has only deepened. We call ours the age of Big Data but, evocative epithets aside, what it basically represents is the stubborn belief that the more we know (ideally in quantifiable form), the better designed our solutions will be. On the surface, this makes perfect sense. But what is often overlooked is that how we come to know something is an intensely fraught process that can significantly affect what we know.
As I show in the book, the Chinese state in the 1950s was excellent at producing vast quantities of data, of facts, as it were. And yet, it remained poorly informed. Much of its data, by its own admission, were worthless the moment they were produced. The system it built rested on a key assumption: that the only correct way to know society was to count exhaustively. It relied on a vast network of periodic reports and rejected, for the most part, other methods of data collection, most notably randomized survey sampling, but also purposive (i.e., non-randomized) surveys. This generated a range of challenges that significantly hampered the state’s ability to gather accurate, timely, and affordable data.
Our “ignorance” or the “facts” we need to govern are not self-evident things. They manifest themselves because of goals we identify, assumptions we make, and methods we choose (or eschew). In so doing, they affect the decisions we make.
In relation to the present crisis, I think what is remarkable is how much we still do not know. Take what is seemingly the most straightforward of metrics: the number of infections. Given key problems in data commensurability—different testing regimens, recording practices, uneven quality of testing units, and varying incentive structures (with domestic and international imperatives) that influence transparent reporting—there is no doubt not only that current estimates are severe undercounts, but also that, for many countries, the data are likely significantly compromised.
And yet, so much of our perception of the severity of the crisis is driven by these numbers. China, India, and the US represent three interesting points of reference on a wider spectrum charting the global spread of the infection. They also exhibit different modes of governance.
The US, a two-party federal democracy, is arguably the most data-rich and data-transparent of the three. Yet it is hamstrung by a leadership unwilling to utilize these assets. Instead, it appears that federal policy has been to marginalize both data and data practitioners.
China, a one-party (and now, one-leader) authoritarian state, after initial missteps, has appeared to effectively control the spread of Covid-19. But it has done so by further extending extremely intrusive data-based systems of surveillance and supervision. A lack of transparency also means we have little information on how the lockdown and subsequent measures have affected different sections of Chinese society.
India, a parliamentary democracy, has, under the current government, significantly undermined its statistical institutions, so much so that all major macro-indicators of the economy stand discredited. The decision to lock the country down on four hours’ notice is symptomatic of this unwillingness to consider key facts, none perhaps more salient than that the vast majority of India’s labor is in the informal sector and reliant on daily wages.