Archive for the ‘Data Collection’ Category
The Bear versus Shark of Data Entry
Tales from the Hood lays out the harsh reality of aid work - lots of manual data entry. How does that stack up against Robert K’s Talking Papers?
About two thirds of the form was numerical, and so entering that data got to be pretty mechanical after the first hour or two. But that last third was all qualitative stuff: open-ended interview questions where at times the respondents appeared to have rambled or gone on wild tangents.
The first two thirds could be covered more easily by automation (although you’d still need somebody to feed the machine and to check the OCR) but that last third – the qualitative stuff – is never going to fit into the machine comfortably.
But let’s forget the information management and keep in mind the “description of chronic, always-in-the-back-of-your-mind hunger by someone who’d lost everything” that the Hoodie passes on to us from a scrap of paper in Port-au-Prince:
The hunger is… a hole beneath our hearts.
Now that I think about it, that’s the reality of aid work – that and these lessons from Catherine at AIDG. Sometimes it helps to have non-aid workers tell the rest of the world what it’s really like…
Quickbits May 2009
- Following my mini-rant about how ReliefWeb hasn’t made the most of the potential of the web, a couple of projects surface which point the way to a better future for the humanitarian community’s hub. The ReliefWeb News Monitor is JRC on the pipes again, with an aggregated feed of news stories that can be sliced for your serving pleasure; more interesting for the aid worker is the Briefing Kit, which gives you the opportunity to build your own document package by country or emergency. One of the primary uses of ReliefWeb is for pre-deployment briefing, so this is a definite value-added service.
- More MapAction… er, action, at an Alertnet-hosted workshop in London on June 4 looking at how the aid community can use maps effectively. I understand from Liesbeth that the event is fully booked, but Mapping for communications, planning and advocacy will be streamed live for those of you who can’t make it. Plus:
We want your questions. Given the rise and rise of mapping technologies, what would you like to know about how NGOs can better use geospatial tools in their work? Use the comments section below, or submit your questions using the Twitter tag #askmaps.
- In the Financial Times: Tainted data hide the cost of Africa’s upheavals. Slightly contrarian article about the use and abuse of statistics in conflict situations. The FT casts its beady eye over IRC’s DRC statistics (which always looked a bit fishy to me) and UN statistics more broadly, and who knew I’d have an ally in the FT regarding funding for government statistics offices?
The first step towards compiling an accurate picture is to make assistance to Africa’s under-funded statistics departments a priority in international aid programmes… Accurate statistics, objectively gathered and responsibly used, are as essential as compassion in tackling Africa’s plight. Tracking its crisis without reliable data is like exploring the continent without a compass.
- Amnesty rolls out the SMS bad times: Guatemalan activists receive death threats by text message. Part of the ongoing debate about how technology empowers both sides in a conflict. If there are in fact two sides in any conflict like this, which I somehow doubt. There’s even more complexity at the tail end of the “Twitter Revolution” story – I had so much to write about this nonsense. Everybody except Evgeny has forgotten it by now (because yes, that is how long the web’s attention span lasts), but this article is still worth reading:
So, while the events don’t fit the Western media’s narrative of a city full of protesters converging on Twitter and almost pulling off a revolution, technology did play an indispensable role in telling the story of April 7.
- From the Just Shoot Me files, In Iraq with Web 2.0 luminaries, as if they weren’t already filled with their own self-importance. If you don’t think this entire concept is self-parody, then read this extract and see if you can spot the deliberate mistake:
The idea is to use the brains of this small collective to give ideas to Iraqi government officials, companies and users that will help it rebuild. Iraq is short on the mojo that widespread internet can bring and the fast-track economic jolt that entrepreneurs feed on. Who knows that stuff better than a contingent of internet goombahs heavy on the Google juice and includes the guy who thought up Twitter?
Mukoko bail revoked, preposterous show-trial to continue
As feared, the Zimbabwean state continues to press its shonky case against my colleague Jestina Mukoko, National Director of the Zimbabwe Peace Project, and 14 other Zimbabwean political and civil society activists. From Violet Gonda at SW Radio Africa:
The courthouse was packed Tuesday with journalists, members of civil society and the diplomatic community, who were left shocked after the Magistrate remanded the accused persons in custody. Eyewitnesses said Mukoko looked pale and dejected when she heard the news. The accused persons were all abducted and tortured between the months of October and December last year.
It’s going to be little comfort to Jestina as she is taken back to prison, but the authorities have only managed to delay, not stop the work of ZPP and others in monitoring what is going on in Zimbabwe’s hinterland. Just in is the meaty February report from ZPP:
Since January 2009, a total record of 2410 cases of politically motivated human rights abuse have been recorded: 1125 in January and 1285 in February showing an upsurge by 160 cases. Although there were no reported cases of murder since 2009, cases of harassments, assaults, looting, displacement and unlawful detentions continue to maintain a stubborn presence.
It’s pretty clear to me that Zimbabwe’s resilient communities are making for resilient organisations too. This numerical analysis doesn’t really mean anything, but the datatrail and the paper beneath it – combined with the quiet work of many others in country writing down and photographing what is going on – is there for an eventual reckoning.
When Muppets Get Flu
I don’t have much to add to the swine flu situation. It might be a pandemic, it might not, but the best strategy is to a) keep an eye out for more news and b) take reasonable precautions if it becomes necessary. I’m a colossal pessimist, which is great in situations like this, because unless the virus wipes out the entire human race, I’ll be able to say, “Hey, that wasn’t as bad as I was expecting!” Incidentally, can we start calling this “swinfluenza” soon? I’ve got some t-shirts already prepared. They’re the old “henfluenza” t-shirts with tippex over the logo.
First: some useful resources
- WHO Swine influenza briefing
- WHO Pandemic influenza preparedness and response guidance
- The ever-rocking Flu Wiki
- CDC Swine flu information page
- A few comments on pandemic influenza by Terry Jones
You can expect traffic to all these sites to be massive for the duration of this outbreak (or until civilisation collapses, taking the internet with it), so be patient with slow loading pages.
Second: a brief discourse on tracking epidemics
There’s been some discussion about how web and mobile technology can be used to track outbreaks like this – something that’s been floating around for a while, and was of course the starting point for the creation of INSTEDD.
- Evgeny discusses Twitter’s power to misinform
- Erik at WhiteAfrican urges caution but points out that an online ecosystem is emerging
- Mark Honigsberg is less optimistic and thinks a different approach is needed
- Kragen Javier Sitaker explains how false rumors can cost lives
These discussions are relevant, valid, interesting, etc., but I’d like to add two really simple rules of thumb regarding any web-enabled tracking initiatives, both of which are widely appreciated but often forgotten:
- Garbage in, garbage out.
- Ground truth everything.
Nearly all of the chatter on social media is garbage, so you do the math.[1] And ground truthing sounds like it should be easy, but it actually isn’t. See? That’s the pessimist in me talking again. Ignore him. Anyway, keep reading other blogs for more swine flu info – I’m going to blog about something else.
[1] In fairness, both INSTEDD and Swift River are in fact doing the math.
Unwieldy IT monsters and how to kill them
If I don’t think that a bottom-up approach is going to work in the humanitarian community, I must think that a top-down approach is the best bet, right? Wrong. And here’s why:
Worst of all, though, the [additional and novel layers] mainly exist because the Government wanted to have the job done by the Big Consultancies – Accenture, EDS, and friends – that it was used to dealing with. Assuming that they wouldn’t be interested in small contracts, the Government invented a completely new organisational level in order to sweeten the deal. They further insisted on the contracts being covered by intense secrecy, which cut off any possibility of talking to the users. And the Big Consultants proceeded to move the actual development to the US and India to save money, thus avoiding any institutional knowledge that might somehow have seeped in.
Top-down approaches to data management don’t work in the public sector, full stop. This is because organisational politics usually over-determines a process that fails to include the users[1], and that’ll always defeat your technology no matter how shweet you think it is. So what do I think works? The Yorkshire Ranter actually provides that as well:
Part of the original plan involved using a common data exchange standard for the whole NHS; if this exists, there’s no need for much of the rest, especially not the regions and possibly not the Spine. We could define some goals and a set of data formats, then break out the cash to the individual hospitals, trusts etc to use themselves. … I think a cross-government requirement for common data standards, as much open source as possible, and perhaps even building everything with a sensible API for further development would do nothing but good.
That’s the starting point. Establish a minimum data standard using an agile process, use existing practice based on the experience of participating users, make the process as open (and open-ended) as possible, get the sign-off from the participants at the highest possible level, and then let go. Then it’s out there and organisations can use it – or not, but if they don’t, they no longer have the excuse that such standards don’t exist and can be held accountable against that. It also allows entry into the market for organisations and individuals that are new to the sector or weren’t involved in the original process – and then they might become part of the next iteration of development.
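As a rough illustration of what a "minimum data standard" might look like in practice, here is a minimal sketch in Python. The field names (`site_id`, `admin_area`, `population`, `report_date`, `source`) and the `validate_record` helper are hypothetical, invented for this example rather than taken from any agreed humanitarian standard. The only point is that a small, published set of required fields gives every organisation something concrete to check its own data against.

```python
from datetime import date

# Hypothetical minimum standard: field name -> expected type.
# These fields are illustrative assumptions, not an agreed sector standard.
MINIMUM_STANDARD = {
    "site_id": str,
    "admin_area": str,
    "population": int,
    "report_date": date,
    "source": str,
}


def validate_record(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected_type in MINIMUM_STANDARD.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Extra fields are allowed: the standard is a floor, not a ceiling.
    return problems


# Placeholder records, made up for the example.
good = {
    "site_id": "CC-001",
    "admin_area": "Example District",
    "population": 120,
    "report_date": date(2009, 1, 15),
    "source": "Example NGO",
    "notes": "extra fields are fine",
}
bad = {"site_id": "CC-002", "population": "about 100"}

print(validate_record(good))  # conforming record: no problems listed
print(validate_record(bad))   # lists the missing fields and the wrong type
```

Allowing extra fields is a deliberate choice in this sketch: organisations that collect more than the minimum can still conform, which keeps the barrier to entry low for newcomers.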
The key thing to remember is that the development needs to take place in the heart of the user community, and anything else is unlikely to yield useful results. The humanitarian community needs exactly this, and I’ve been saying exactly the same thing for ten years, and as far as I can tell we’re still nowhere near even getting such a process off the ground in most of the clusters / sectors. If anybody knows anything different, please feel free to let me know and make my week brighter. And if anybody thinks this process wouldn’t work, I’d be interested to hear why – especially in light of the persistent failure of IT projects in the public sector.
[1] Note: the actual users, not the people who manage the actual users.
New Year High Resolution….
High resolution satellite imagery, that is… zing! While the news from the Middle East may be depressing as hell, it has provided a stimulus for Open Street Map to improve their spatial data for Gaza. Jon has done a comparison of existing online maps, showing Google Maps to be the initial winner – although OSM are working hard to update their offering, and as Jon says “the flexibility OSM has shows it’s value as a quickly adaptable humanitarian tool.”
Following OSM’s initial request for support, Alertnet ran the story yesterday, and updates will be posted on Mikel’s blog. This is worthwhile stuff – as well as being potentially useful for people working in that area, it’s a long-term contribution towards the spatial data infrastructure of the Middle East. If you have any knowledge of Gaza, then you can contribute via the Wiki – and if you’ve got any of that high-resolution sat imagery, I’m sure they’d love to hear from you…
Numbers Over Georgia
I promised myself that I’d blog every single day while I was working in Georgia. It should be fairly obvious that I didn’t. I can’t say that I was super productive while I was in Tbilisi – for a variety of reasons, including particularly dysfunctional co-ordination, but also because of the basic difficulty of getting good information in conflict situations.
In a natural disaster, government agencies and international organisations are usually relatively comfortable sharing information about the situation – but in a conflict, they clam up tighter than my wallet around Christmas. This is because natural disasters have fewer political implications than complex emergencies; while in a natural disaster the worst thing you can say about a government is that they’re negligent, in a conflict situation the government is usually a belligerent.
This means that timely / reliable / accurate information is hard to come by in Georgia, as Ivan points out and Ethan overviews (is that a verb?). I find it hard to get too worked up about the lack of “citizen war reporters” even though it is my fervent hope that the web is going to change the way we do business in both complex emergencies and natural disasters. My lack of work-up is simply because even if there were shedloads of citizen journalists covering these events, I would still treat them exactly the same as any other information source – which is to say, I wouldn’t trust them at all.
As an example, the single most critical humanitarian information issue in Georgia was the numbers and locations of people displaced by the conflict. This was problematic for a number of reasons:
- Nobody had a clue how many people had been displaced by the conflict. There were multiple government agencies involved in looking after the IDPs (frequently a euphemism for “ignoring them”, of course), each with their own figures, none of which tallied with the figures that UNHCR or the Red Cross had; and of course nobody in the humanitarian community had bothered to sit down and agree on a number we could all work to. Lesson from Afghanistan, folks: your numbers are never going to be 100% accurate, and it’s better just to agree to a number and get to planning than continually be running after the latest figures – which are also going to be wrong.
- Nobody wanted to talk about the IDPs left over from the previous round of conflict in 1992-93; a staggering 220,000 people (not 100% accurate, of course – just run with it!) have been rotting in terrible conditions for the last 16 years, and some of their stories can be found on IDP Voices. Nearly all of us who were new to Georgia found this astonishing, since it raised a rather difficult question: what the *&%$ have the government and the UN been doing for the last 16 years? It also confuses the picture because in purely humanitarian terms many of these “old caseload” IDPs were in a worse situation than the “new caseload” – and many of the “normal” citizens live in conditions as bad as either.
- For both old and new caseloads, the main priority is ensuring their basic shelter, which comes under the Emergency Shelter cluster. Unfortunately the UN in Georgia had decided that they didn’t want to activate the cluster system (because it’s a bit of a hassle and you have to actually take responsibility for your actions) but they did want to use some of the cluster tools (particularly the ones that give you a fat sack of cash to spend). This meant that it was like stepping into a time machine to 2004 – you remember, when “co-ordination” was a competition to see who could hold as many meetings as possible with as few outcomes as possible.
- Notwithstanding the co-ordination problem, nobody had a clue what to do with all them displaced. The government unveiled a not unreasonable resettlement plan for the new caseload at the start of September, but that plan rapidly ran aground on the harsh reality that the stock and state of public buildings in Georgia are likely not sufficient to house the IDPs according to basic humanitarian standards, even on a short-term basis. (Some interesting discussion on this at the Social Science in the Caucasus blog.) The question is whether that government plan can be reshaped into a more realistic framework that will engage the entire humanitarian community as well as being attractive to donors…
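The first point in the list above, agreeing on one working figure rather than chasing the latest estimate, can be sketched in a few lines of Python. The source names and numbers are invented placeholders, and taking the median rounded to the nearest thousand is just one possible convention assumed here for illustration; nothing in this post says that any agency in Georgia actually did it this way.

```python
from statistics import median

# Invented placeholder estimates; no real agency figures are used here.
estimates = {
    "Government agency A": 98_000,
    "Government agency B": 127_000,
    "UN agency": 112_000,
    "Red Cross": 106_000,
}


def planning_figure(figures, round_to=1_000):
    """Pick one agreed working number: the median, rounded for planning use."""
    mid = median(figures.values())
    return int(round(mid / round_to) * round_to)


low, high = min(estimates.values()), max(estimates.values())
agreed = planning_figure(estimates)
print(f"estimates range from {low:,} to {high:,}; plan against {agreed:,}")
```

The rule itself matters less than the fact that it is written down and agreed: once everybody plans against the same figure, later revisions become a scheduled update rather than a daily argument.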
One of the things about shelter issues is that they tend to get worse the longer you leave them. Conditions deteriorate, particularly when people are housed in buildings that were never designed for residential use. In this case, many of the new caseload had been placed in schools and kindergartens around Tbilisi and other towns – which meant that we also had to deal with the fact that those institutions were needed for the start of the new school year. This was a particular tension for UNICEF, who often run a “Back To School” program – which wouldn’t look too good if there weren’t any schools to go back to. In addition, winters in Georgia can get unpleasant, especially the closer you get to the mountains, which adds another constraint on resettlement.
You might have noticed that there wasn’t much talk about information in this blog post. That’s because there wasn’t much information, as I explained previously. We got hold of the complete set of school locations from the Ministry of Education (shape files ahoy!) but nobody seemed that interested. We tried to persuade the different actors – Red Cross, UNHCR, Ministries various – to consolidate the figures for collective centres and the IDPs therein, but with little luck. Paolo Palmero from OCHA had gathered a lot of data during his 2005 visit, but none of it seemed to be circulating in the agencies.
Summary version: this response showed yet again the importance of investing in information resources before an emergency hits. That doesn’t just mean getting loads of satellite images (although UNOSAT did some impressive work on damage levels) but investing in relationships with government, relationships that can be leveraged quickly to mutual benefit. It means having a basic picture already in place – locations of schools, for example – that you can then overlay new data on top of – such as the estimated IDP numbers in those schools. This really needs a collective approach – one agency alone isn’t sufficient to achieve success, although you need a focal point for the effort – but it continues to make me wonder if we should be thinking about setting up an organisation that collects and disseminates operational data like this.
At least that would avoid me feeling like a numpty, turning up at meetings with my tiny spreadsheet of schools that might need some watsan rehabilitation…
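The "basic picture plus overlay" idea from the summary above can be sketched as a simple join in Python. Everything in the example is a made-up placeholder: the school IDs, names, coordinates, agency names and IDP counts are hypothetical, and `overlay` is an illustrative helper rather than part of any existing tool. The sketch only shows why a base layer prepared in advance pays off: new figures can be mapped straight away, and anything that does not match a known site is flagged for checking.

```python
# Base layer prepared before the emergency: school locations.
# All IDs, names, coordinates and counts below are made-up placeholders.
schools = {
    "S1": {"name": "School A", "lat": 41.70, "lon": 44.80},
    "S2": {"name": "School B", "lat": 41.72, "lon": 44.78},
    "S3": {"name": "School C", "lat": 41.98, "lon": 44.11},
}

# New data arriving during the response: estimated IDPs per school,
# possibly from several sources and possibly referring to unknown sites.
idp_reports = [
    {"school_id": "S1", "idps": 150, "source": "Agency X"},
    {"school_id": "S3", "idps": 80, "source": "Agency Y"},
    {"school_id": "S9", "idps": 40, "source": "Agency Y"},  # not in base layer
]


def overlay(base, reports):
    """Join reports onto the base layer; return (mapped rows, unmatched reports)."""
    mapped, unmatched = [], []
    for report in reports:
        site = base.get(report["school_id"])
        if site is None:
            # Unknown site: needs ground truthing before it can be mapped.
            unmatched.append(report)
            continue
        mapped.append({**site, **report})
    return mapped, unmatched


mapped, unmatched = overlay(schools, idp_reports)
for row in mapped:
    print(row["name"], row["lat"], row["lon"], row["idps"], row["source"])
print("unmatched:", [r["school_id"] for r in unmatched])
```

Keeping the unmatched reports instead of silently dropping them is the design choice that matters here: those are exactly the records somebody needs to go and verify on the ground.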
A Georgian Holiday
So my holiday is well and truly over, and I’m in Georgia for UNICEF on a ridiculously short contract, providing information management support for the WASH Cluster. Things are never that simple, of course, and so the work has turned out to be significantly more challenging than I expected. Right off the bat, the post-conflict situation in Georgia is a political crisis rather than a humanitarian crisis; yes, there are some tens of thousands of people displaced by the conflict, but almost none of them are in a life-threatening situation (until the winter comes, that is). Their livelihoods have been affected badly, which means that there are going to be ongoing concerns, but the scale of that problem in a middle income country doesn’t feel particularly desperate (especially now that we’re watching the footage of the monsoon floods in India which have displaced over 2 million people).
Of course that doesn’t mean that there isn’t any job here, or that I get to go back to the mountain tomorrow. There’s still a major co-ordination requirement – for IDPs that are stuck in collective centres, for IDPs that are returning home to their villages, for IDPs that have been moved into the tent camp(s) in Gori – and a real lack of decent information to support that co-ordination. Fairly obviously that’s where I come in, but the last week has not been a particularly productive one. Primarily this is because when I arrived there was absolutely no data to work with, and getting hold of it has proven to be an absolute nightmare. Information flows are incredibly weak, dialogue with the government is fragmented, the situation remains extremely fluid and there are a lot of political sensitivities involved. On top of that, the WASH unit that I’m in didn’t exist until a couple of weeks ago; it’s been created solely because of the conflict and the need that UNICEF has to meet its obligations as the lead agency in the WASH cluster.
Bags of fun, which explains why I haven’t posted anything since I arrived. I promised myself that I was going to blog daily on the issues I was coming up against, but that’s clearly not worked out. However I will be writing a few pithy posts on specific issues, since as of two days ago data started appearing. It’s not great – patchy demographics, an improvised camp registration process, a few lists from government agencies and NGOs – but it’s a starting point. My job is to turn that data into something that can be used by the cluster to address the 5 strategic areas which we’ve identified, which are broadly:
- Site planning of tented camps in Gori
- Refurbishment of proposed Temporary Shelters
- Cleaning of schools and kindergartens at national level (esp. Tbilisi)
- Rehabilitation of existing Collective Centres (CCs) for longer-term caseload
- Provision of village watsan for returning IDPs
As you can see, it’s not a particularly coherent set of requirements, which will make co-ordination even more difficult. The first step is to work out where the IDPs are and where they’re going to be going; the next step is to work out where the agencies are and how they’re working. Sounds simple, right?
Right.
Lights! Camera! Discussion!
David Sasaki joins the conversation, which is great – it was starting to feel a little bit like a men’s singles tennis match between me and Patrick. Now it’s men’s doubles, or something. David starts with a strong serve, although his accusation that
Both men seem to have the academic tendency to speak in aphorisms
seems a little unfair – the heuristics post he’s referring to was simply me reminding myself that I’m not the great oracle on these issues, and that I should get ready to be wrong.[1]
If I understand Paul correctly, his two main criticisms of Global Voices are that 1.) it doesn’t matter if you highlight moderate voices discussing the news of their countries because it is the extreme voices who will always make the headlines and 2.) during times of conflict and emergency, focusing on participatory websites rather than humanitarian institutions will lead to lots of conversation, but less action.
No. I’m not criticising Global Voices per se, and definitely not on those particular grounds. I think Global Voices is amongst the best that the Web has to offer. What I worry about is making claims about the impact of projects that go beyond a) what the evidence shows to be true and b) what those projects can realistically expect to achieve. Global Voices meets its stated aims convincingly, but what worries me is when people start talking about Global Voices – or blogs in general – as something which they’re not. As David notes,
We often portray Global Voices as the zeitgeist of what the ignored world is discussing when in fact we are an amazing international community of individuals with large online networks and particular interests.
David’s honesty is admirable, and I think that honesty reflects one of the strengths of Global Voices in general. What I was taking issue with more was Patrick’s statement that
Global Voices is a far more effective local information and response network than FAST ever was.
I simply disagree with this.[2] Global Voices is not a response network in any substantive sense, and I don’t think it’s necessarily a more effective information network either. I agree that there ought to be more attention paid to blogs as a source of information, but the strength of GV is precisely that it is not programmatic. The bloggers involved have not set themselves objectives to provide early warning information, or document human rights abuses – they are just private citizens who are writing about issues that are important to them. The situation is slightly different with Ushahidi, of course, which was conceived and developed specifically in response to the post-electoral crisis in Kenya. In the words of Ory Okolloh,
Ushahidi was mainly intended to be a mapping tool and a repository of information about the post-election crisis as seen from the view point of people on the ground. We were trying to capture information that was not mainly being reported in the mainstream (there was a lot of self-censorship in the media) and also provide a timeline for information for both mainstream and citizen reported events. In the case of real time mapping Ushahidi could be used to track where the violence or the peace efforts were taking place. We hope to be able to provide those people who are “addressing the real needs to real people” with information that might help their efforts and to be part of the “testimony” as it were of what happened.
Now that’s a series of specific objectives that can – and should – be measured in order to judge the impact of the project. However if you look at the underlying requirement for all of those objectives to be met, it seems to me that the basic requirement is a systematic data collection system – which is exactly what Ushahidi did not have. It’s entirely possible to run a Ushahidi instance with a more systematic foundation – but then it stops being the Web 2.0 poster child that everybody wants it to be, and becomes a visualisation tool for a standard human rights monitoring system.
Now I don’t have a problem with that – it’s not as if we’re over-supplied with really great data visualisation in the human rights field – but that’s not why people got excited about it. People got excited about it because it’s a Revolutionary New Way Of Doing Things Just Like Clay Shirky Says, and I’m asking what I hope is a valid question: it may be a revolutionary new way of doing things, but is it a better way of doing things? Maybe it is – in which case, show me.
I think this tension is at the heart of most of these initiatives. Patrick unwittingly gave away one of the reasons why he thinks bloggers are better than the established systems, and it goes right to the heart of this tension.
Unlike the local information networks at FAST and conventional conflict early warning systems, they are not paid informants.
This belief is part of the cult of the amateur that I think the internet has reinforced, but it is not inherently better to do something for free than it is to do it for pay. Personally I think that as soon as they stop acting as bloggers and start acting as human rights monitors, they will cease to be good bloggers – and they probably won’t be very good human rights monitors either. I also think that the strengths of citizen journalism – the amateur spirit, the personal perspectives, the improvised approach – are in this context potential weaknesses. Joshua at Registan almost nails one of the key problems for Global Voices when he says that
too many internationals, including me, are far more alike each other than they are to their home countries.
Even though many of them are from the regions or countries that they cover, the Global Voices bloggers – in certain important ways – are more like each other than they are like the people in their home countries. In particular, they share “democratic values” just as Patrick describes, and a positive, can-do attitude that impresses people.[3] Yet those democratic values may be the very thing that makes them less representative, and that raises an interesting dilemma for David and the others who are interested in Rising Voices.
In relation to Ushahidi, I wrote
The virtual world isn’t resistant to real-world pressures, and it doesn’t necessarily overcome social divisions – hence the problems with the [Mashada] bulletin board. These pressures can be managed, but it’s no easy thing – but would Ushahidi be any less resistant to hijacking by people intent on promoting social divisions?
I suppose that’s my question, in the context of David’s job – what defense mechanisms do we have against the real world?
[1] Besides, academics don’t usually talk in aphorisms – they prefer to maximise their word count.
[2] Although that doesn’t necessarily mean that I think FAST is particularly effective.
[3] However a positive attitude is not enough – I have frequently said that I would prefer to work with people who really couldn’t give a damn about humanitarian issues but who are excellent at their jobs, than work with people who are lousy at their job but who really, really care. I’m not saying that the Global Voices bloggers are lousy at their job – but their job is not “early warning”, it’s blogging.
Dangerous Statistics in Iraq
In Science News, Julie Rehmeyer writes a short piece on Humanitarian Statistics, with a focus on the “controversial” Iraq war studies carried in the Lancet. I haven’t posted about the Lancet studies before; I recognise that the Lancet studies have an important role to play in tallying the cost of the Iraq war, but anything I could add to the debate would be largely redundant, since it’s been driven by political rather than humanitarian interests.
Although Deltoid characterises the article as being “about the Lancet studies” – and fair enough, that is his particular interest – it is thankfully wider than that, noting the increase in the use of statistics in the human rights (and to a lesser extent, humanitarian) sector while also being aware of the limitations:
But humanitarian crises pose huge challenges. Little information may be available—even from before a crisis—about how many people live where. Even if a previous census was taken, the high birth and death rates in developing countries tend to quickly make censuses outdated. Areas within continuing war zones can be unsafe for survey workers.
Examples from Sierra Leone and East Timor are referenced in the article. The latter case is particularly interesting because it wasn’t just based on a straight survey – which is what we generally think of when we think of statistics – but on pulling together separate and incomplete datasets to build a bigger picture, which is the norm in humanitarian crises, particularly in developing countries.
In the comments section at Deltoid, commenter Jeff Harvey laments
I can only shake my head in disbelief. Who will do the survey? The US and British governments, who are responsible for an illegal invasion that has turned Iraq into a country of wreck and ruin? This is the bitter irony. Aggressing nations do not tally the numbers of their victims. Ian Gould summed it up in the thread below this: because the real death toll of civilians conflicts with the well-cultivated myth of US benevolence, western crimes are not a part of history because they are never allowed to become a part of history. They thus get sent straight down the memory hole.
Jeff misses the point that (I think) Julie was trying to make. Although he gives many examples of past victims of war who have been lost to history, we don’t live there any more. There are more people working on these issues than ever before, and we have a better idea of how to approach these problems. However it’s this attitude – that information gathering and analysis should be a political project – that is likely to prove the biggest obstacle to moving forward.
The only way to do justice to the victims and to persuade belligerent parties to accept the results is to treat these issues as impartially as possible – and to do so with the perspective that our work is at the service of the beneficiaries, rather than of our own political interests.