Lies, damned lies and you know the rest

The shocking news that the population of Kibera, “Africa’s largest slum”, may not in fact be 1 million people comes as a shock mainly to people who think statistics = truth, even when the provenance of those statistics is questionable. Earlier this year I put together a bar chart combining a few different sources to create a picture of Kibera’s growth. Bear in mind that these figures are conservative, and that some sources put Kibera’s population at over 1 million, seemingly at random.

Table 1: Population Growth in Kibera 1963-2010 (Source: de Smedt, Richards and Godfrey 2003)

As the sage said, “all models are wrong, but some are useful” – which is to say that the models which Kibera’s former population estimates were based on were very wrong, but that the presence of “Africa’s largest slum” in the middle of Nairobi was very useful. It meant that international visitors (particularly the UN agencies and NGOs, many of which have their regional offices in Nairobi) didn’t have to travel very far to visit the poverty that was their raison d’etre. (Of course few of them ever bothered to travel even that far.)

So where did those figures come from? By definition the dead hand of the state doesn’t weigh too heavily on informal settlements, so even our friendly neighbourhood tax collector is of limited use. Most population estimates are extrapolations based on past growth or statistical surveys of a limited section of the population; these approaches have fairly obvious holes in them, but they are useful. The problem comes when people with a more limited understanding accept those figures as definitive and start to reuse them.

In the absence of actual data (such as an official census), NGO staff make a back-of-envelope estimate in order to plan their projects; a postgraduate visiting the NGO staff tweaks that estimate for his thesis research; a journalist interviews the researcher and includes the estimate in a newspaper article; a UN officer reads the article and copies the estimate into her report; a television station picks up the report and the estimate becomes the headline; NGO staff see the television report and update their original estimate accordingly.

All statistical hell breaks loose, and the population of Kibera leaps ever higher. Every actor at every stage has a motive for using the upper end of that initial estimate, rather than more conservative figures – planning, funding, visibility, and so on – but no single person is responsible for inflating the figure progressively further from reality. Eventually – census! – followed by headlines trying to explain why the previous figures were so high, and what this means for the people who live there (and the rest of the country).
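The ratchet described above can be sketched as a toy model. Everything in it is an assumption for illustration: the function name `ratchet`, the starting figure, the 25% margin and the number of handoffs are all hypothetical and are not drawn from any Kibera source. It only shows how a figure drifts when each actor quotes the top of the range they were handed.

```python
def ratchet(initial_estimate, upper_margin, handoffs):
    """Return the estimate after each handoff in the chain.

    Toy assumption: every actor (NGO, student, journalist, UN officer,
    TV station, NGO again) quotes the upper end of the range received,
    modelled here as the previous figure plus a fixed margin.
    """
    estimates = [initial_estimate]
    for _ in range(handoffs):
        previous = estimates[-1]
        estimates.append(round(previous * (1 + upper_margin)))
    return estimates


# Hypothetical inputs, chosen only to make the drift visible.
history = ratchet(initial_estimate=200_000, upper_margin=0.25, handoffs=6)

for step, value in enumerate(history):
    print(f"handoff {step}: {value:,}")
```

Nobody in this chain multiplies the figure by four; each one just rounds up a little, which is the point.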

This is not the problem. The problem is that any further analysis – in health, education, sanitation – using that inflated figure as a basis is also going to be wrong. The solution, as ever, is to invest in better data collection rather than relying on policy wonks to imagineer your slum. Mikel reminds us that previous Kibera mapping efforts came up with more accurate estimates long before the census: nobody listened, of course, proving once again that the problem is not the technology – the problem is everything else.

Somebody cares about statistics

When various members of the elite are asked how they’d spend $10 billion on charity, most of them respond exactly how you’d expect:

  1. The obvious. “Stimulate job creation in developing countries”? Why didn’t I think of that! Wait there for a moment, I’ll just go and do that.
  2. The vague. “Tackle climate change”? With goals that laser-focused, no wonder Oxfam’s strategic direction seems to change every five minutes.
  3. The self-defeating. “Develop carbon-capture toilets”? Only later do we discover that “the viability of this kind of initiative depends on the price of carbon” – not a hostage to fortune at all then.

It’s easy to be cynical, but that’s partly because $10 billion is a meaningless figure to me, since I’m not part of the elite. However, one respondent made a specific and concrete proposal that wouldn’t rely on fantasy elements to become a reality – suggesting that he’s thought through the concept, rather than just trading it around dinner parties. I like Mo Ibrahim, and his proposal is simple and obvious to anybody who’s seen how development actually works:

I would use the $10 billion to fund the development of national or regional statistics offices. They would improve data collection and dissemination to ensure public access to, and sophisticated application of, these data. Better data will support improved policy making by governments and interventions by donors. The data will enable them to identify needs, to make better use of existing resources and to assess results. In the case of donors this will finally lead to aid that is “smart”—for both donor nations’ taxpayers and recipient countries’ development needs. The private sector would be able to make more targeted investment decisions with this data. Citizens would be able to see where their country was succeeding and where it was failing. This would support targeted pressure on government and prevent false claims by either state or citizenry.

Coming hot on the heels of Tim Berners-Lee’s infectiously enthusiastic TED talk and the long-awaited launch of AidData, I can seriously get behind an initiative like this. I don’t necessarily share all the assumptions about how that data will be used – but we’ll never know if we don’t put the data out there, will we? On the other hand, as a fully paid-up member of the international anarchist conspiracy, I do wonder if I’ve started seeing like a state.

Quickbits April 2008