Thursday, June 9, 2016

Why would a journal called "Scientific Data" publish bad data? The Chandler/Modelski city-size problem

Scholars interested in changes in city size over long periods of time often turn to one or both of two encyclopedic compilations of data: Tertius Chandler’s Four Thousand Years of Urban Growth (Chandler 1987) and George Modelski’s World Cities: -3000 to 2000 (Modelski 2003). Chandler’s book is an update of an earlier version (Chandler and Fox 1978). The data in both Chandler and Modelski are a mess, routinely dismissed by urban demographic historians as worthless for serious scholarship. Yet a growing number of scholars, particularly economic historians, mine those sources for city-size data in order to investigate various questions. This situation brings up a number of thorny professional and ethical questions.

In this post I describe the situation and point out some of the troubling questions that come to mind. This is not a thorough exploration of either realm. I only have two more days at the lab in Teotihuacan, Mexico, and I have about four days’ worth of tasks to complete.

I am moved to write this because a new paper relying on Chandler and Modelski’s bad data was just published in a new journal called Scientific Data (Reba et al. 2016).

What is wrong with the data of Chandler and Modelski?

The new paper (Reba et al. 2016) describes these sources, how they were compiled, and various problems and difficulties with the data. The latter are mostly limited to context, measurement, and presentation issues. The basic question of whether the data are accurate is barely considered. I am not a demographic historian, so I cannot provide a detailed critique. But I do use systematic archaeological and historical data on city size in some of my research, so I have looked into this question. I will limit myself to quoting experts (that is, urban demographic historians, scholars with experience working with primary sources). Some of these refer to the earlier version of Chandler (Chandler and Fox 1978), but the methods and reliability are not very different between the two editions.

(de Vries 1984)

·         “Three Thousand Years of Urban Growth by Tertius Chandler and Gerald Fox, is a massive collection of information about the size of cities. Its unsystematic character, and, even worse, the authors’ reliance on suspect sources and their completely uncritical use of such sources renders the volume all but unusable.” (p. 18)

(Hopkins 1978)

·         “the compendious, useful but not obviously reliable T. Chandler and G. Fox,” (p. 2)

(Bairoch 1988)

·         Chandler and Fox: “But it is only after the year 800 that this survey may be regarded as truly systematic. This monumental and extremely useful work nevertheless leaves a number of important gaps unfilled. These gaps result from the omission of a fairly sizeable fraction of the cities of the world (probably 20% of the larger cities and 60% of the smaller ones) and especially from the failure to undertake a systematic review of recent studies on the history of various individual cities.” (pp. 116-117)

(Rozman 1978)

·         “Chandler and Fox fail to use sources in languages essential for their task” (p.66)

(Binford 1975)

·         There are “serious flaws in the work” (p.23)

·         [Chandler and Fox employ] “dubious rules of thumb”

·         “Statements and methods like these undermine confidence in the work as a whole, because they suggest a fundamental naiveté … The authors overestimate the importance of statistics in the thinking of our ancestors, and assume too much uniformity in pre-industrial population trends. For all the cautious skepticism proclaimed in their introduction, they still seem too willing to make census taker out of medieval travelers, and to assume that, in the absence of factories and international capitalism, one place was more or less like another. This is not a work informed by the research of other scholars on the various reasons for collecting data in the past, or in the complexity of population trends even in small, non-industrial cities. Without more attention to such matters, the reader can only guess at the margins of error he ought to attach to the Chandler-Fox statistics, and therefore cannot use them with assurance in further calculations (for example, in computing rates or indices with gross population as a base.” (p. 23-24)

·         “One of the weaknesses of the Chandler-Fox collection is that the authors are sometimes too gullible in dealing with their informants. Similar errors may be introduced by mis-enumeration on the part of those who first gathered the evidence.” (26-27)

(Kowalewski 1990)

·         “Issues of completeness, scale, and time-depth make some studies (e.g., Chandler 1987) less appropriate for comparative purposes.” (p. 40)

(Woods 2003)

·         Critical of Chandler, but uses the data

·         Chandler’s data are “more doubtful, at least for some regions.” (p. 219)

(Chase-Dunn et al. 2005:97)

·         Chandler's “estimates are obviously error-prone.”

Personal communications

In order to make sure my view of this topic was up-to-date, I emailed several urban demographic historians (about a year ago). I did not ask for their permission to quote their remarks, so I include the following quotations as anonymous. These two are well-respected historians, each of whom has published books and articles on population reconstruction in the past.

Scholar 1:

·         “Many years ago I thought about writing a paper on this/these books (There are two editions, I think) but never did get around to writing it. But as far as I'm concerned it's all but worthless.  Chandler was assiduous, without a doubt, but just compiled (or piled) one estimate after another without any indications of the merits or demerits of each. It might be useful for tracking down estimates but not for the estimates themselves.

·         Modelski  I don't know--yet.”

Scholar 2:

·         “I do not have a high opinion of Chandler’s data. In those cases where I was able to check his Greek, Roman, medieval or early-modern data, they turn out to be seriously wrong. Do not know about the more modern data, though.”

A note on these scholars

Most historians are notoriously picky about their sources; they like to stick close to their textual sources and hesitate to compare or generalize beyond one or two cases (Grew 1990; Kocka 2003). Thus one might be tempted to reject the above critiques as reflecting this bias. Some historians simply do not accept the validity of broad comparative analyses that require simplification and standardization of diverse local datasets. I discuss this and related issues of comparative analysis elsewhere (Smith and Peregrine 2012). But most or all of the individuals quoted above actively pursue comparative analysis. They have all assembled regional or temporal databases of city-size data, although not on the scale of Chandler and Modelski. While they are well aware of the historiographic issues of data sources, they are also willing and able to systematize, simplify, and push forward with comparative data. In other words, their critiques do not reflect the cranky complaints of particularistic historians who are anti-comparativist. Instead, their critiques reflect real historiographical issues in the origins, quality, and relevance of primary data on city sizes in the past.

Some positive comments

(Binford 1975)

·         “This book will be useful to students of population and urban development because it is the only worldwide compilation done by one person using consistent criteria.” (p. 22)

Christopher Chase-Dunn and Daniel Pasciuti reviewed Modelski very positively in the Journal of World-Systems Research in 2004. They work with Modelski and have co-published with him. They evidently liked the book so much that they later published the IDENTICAL book review, word-for-word, in the journal Globalizations in 2006. Hmmmmmmm……..

Uses of these data by scholars seemingly oblivious to potential historiographical problems:

·         (Reba et al. 2016)

·         (Manning 2005)

·         (Jedwab and Vollrath 2015)

·         (Jedwab and Vollrath 2016)

·         (Nunn and Qian 2011)

Uses of the data by scholars who attempt to verify or adjust the data in relation to other scholarship

(Morris 2013:146)

·         Ian Morris uses some of the data cautiously, in the context of a discussion of its validity:

·         “in my opinion some of Chandler and Fox’s estimates are not supported well by the data.”

·         “While there would be some advantages to taking a single source like Chandler and Fox’s Three Thousand Years of Urban Growth and then relying on it consistently, the drawbacks seem to outweigh them.” (p. 146)

(Acemoglu et al. 2002)

·         These scholars analyze Chandler’s data, compare them to Bairoch’s data, and attempt to evaluate their usefulness.

Troubling questions

My gut reaction to works like the new paper by Reba et al. is “Garbage in, garbage out.” The data cannot be trusted, so why should we be expected to trust the results of the analysis? Here are some troubling questions that arise out of this situation.

(1) Why do otherwise rigorous scholars feel free to use bad data?

·         Because it is there. If data exist in some usable format, and they seem relevant to a research question, someone will use the data, even if the data are terrible, unreliable, or inaccurate. I have considered posting a bogus dataset online to see if people will analyze it. The risks seem to outweigh the benefits, though.

·         Because scholars lack training in other disciplines. The paper by Reba et al. is curious in that it has a very rigorous discussion of geospatial methods, but virtually no discussion of the historiography of the city-size data. See some of the sources quoted above for historiographical discussions of historical city-size data.

·         Because scholars are not critical of sources. Perhaps there is an assumption that all data in other disciplines are valid. Perhaps scholars believe that if something is published it is true. Do the authors think that data published by non-experts in another discipline (neither Chandler nor Modelski is/was an urban historian or demographic historian) are valid just because they are published? Reba et al. justify their use of Chandler’s data on the grounds that it is published and other scholars have used it. They provide citation data for Chandler and Modelski. The message is that if it is published and cited, it must be valid. Hmmm, I haven’t encountered that principle in my readings on methodology in the social sciences or the historical sciences.

(2) Why don’t scholars who know the data well care about this?

Scholars like Bairoch, de Vries, and others quoted above DO care about data quality, hence their negative remarks on Chandler and Modelski’s data. But many of these quotations come from before the time when economic historians and others started mining data sources for city-size data. Perhaps these and other historians simply don’t care that historical data are being used badly. With the advent of the Internet, there are all sorts of bad data readily available, and all sorts of bad analyses of bad (and good) data. Or perhaps disciplinary myopia is the cause. Urban historians are not reading journals like Scientific Data or Explorations in Economic History, so perhaps they are unaware of the uses of Chandler and Modelski’s data. Perhaps they have trouble believing that such terrible data would be taken seriously by scholars.

This latter is a very real factor in work that crosses disciplinary boundaries. For years I didn’t believe that Jane Jacobs’s silly and ridiculously inaccurate model that cities preceded agriculture could be taken seriously by anyone at all. Then I found that geographers were citing and praising the model. I got cranky and did a blog post or two on this. Then when a paper in a major journal promoted this idea, I finally got motivated to publish a critique, so I rounded up a few colleagues and we published a paper (Smith et al. 2014). Maybe the urban historians have not yet been provoked sufficiently to mount a proper attack on Chandler and Modelski.

(3) What can be done about this situation?

I have to admit that I really despair of this situation. I am very upset that such obviously poor data are being used by otherwise rigorous scholars, and I am upset that I don’t have better data. I have talked to quite a few colleagues—archaeologists and ancient historians—about this situation. I have asked if any of them were involved in assembling reliable and accurate data on ancient city sizes in their region of specialty, and the answer has been negative. I have asked if they knew of anyone doing systematic urban demographic history in their region, and again the answer is no. In my own region, Mesoamerica, there was a flurry of demographic work on city size in the 1980s, but then scholars lost interest. I have asked if anyone might be interested in mounting such a systematic comparative project, again with a negative answer. This may not arise entirely from a lack of interest – if someone asked me to assemble demographic data from all Mesoamerican cities at, say, 50- or 100-year intervals, I might claim that I am too busy for such a task. It would take a lot of time, and it is hard to see how a granting agency would get excited about such a project.

I don’t have any grand conclusions here. The situation is bad--journals and scholars are merrily using bad data--and I don’t have any good solutions, beyond the suggestion that bad data should be avoided. The journal Scientific Data should be ashamed for using such non-scientific data. We can all do better than this.


Peter Turchin said...

Mike, see my response here:
