Algorithmic flattery and slander

I have a terrific research record.

At the time of writing, my 424 publications have been cited a shade under 5,000 times. I’ve worked with over 700 co-authors, who I know will be pleased to have worked with me in the fields of information retrieval, statistics and sociology.


In case you are surprised, or even suspicious that I’ve been using Photoshop in this picture, you can check at: (viewed 17/1/14).

Of course, this is a nonsense. Only a small fraction of these publications are mine; I haven’t checked them all, but, for example, papers on child protection written by a colleague who used to work at the OU, also called Steve Walker, are attributed to me. He might be a little less impressed with this account of my research than I hope you were. So far, so amusing (for me, at least). But perhaps not: measures and metrics of research impact are becoming increasingly important in the allocation of research funding, academic promotion and so on. Were research funders to automate their trawling of impact, this could have quite a positive impact for me, though not necessarily for other Steve Walkers.

Microsoft’s automated aggregation is clearly wildly inaccurate. Of course, at one level this is a simple data problem – researchers’ names and affiliations don’t provide unique keys for databases. There are initiatives (e.g. ORCID) which aim to provide unique, persistent identifiers for researchers (to go along with those for our publications) that allow us to be identified, monitored and measured more accurately. I won’t take that line of thought any further here.
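For anyone who wants the data problem spelled out, here is a minimal sketch in Python. Everything in it is hypothetical: the records, the citation counts and the identifiers (`id-A` and so on are made-up placeholders, not real ORCID iDs), and the `aggregate` function is my own illustration, not how Microsoft or anyone else actually does it. It simply shows what happens when a name is used as a database key instead of a unique identifier.

```python
from collections import defaultdict

# Hypothetical records: (author_name, author_id, citations).
# The identifiers are invented placeholders, not real ORCID iDs.
PAPERS = [
    ("Steve Walker", "id-A", 10),
    ("Steve Walker", "id-A", 5),
    ("Steve Walker", "id-B", 40),  # a different person with the same name
    ("Steve Walker", "id-C", 25),  # and another
]


def aggregate(papers, key_index):
    """Count papers and sum citations, grouped on the chosen column."""
    totals = defaultdict(lambda: {"papers": 0, "citations": 0})
    for record in papers:
        entry = totals[record[key_index]]
        entry["papers"] += 1
        entry["citations"] += record[2]
    return dict(totals)


# Keyed on the name, three different people collapse into one record.
by_name = aggregate(PAPERS, 0)

# Keyed on a unique identifier, each person keeps their own record.
by_id = aggregate(PAPERS, 1)

print(by_name)
print(by_id)
```

Grouped by name, one "Steve Walker" gets credit for all four papers; grouped by identifier, the same data yields three separate people. The sketch assumes the identifiers themselves are attached correctly, which is of course the hard part.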

Really what I want to do is point out the risks of algorithmic attribution of data to people. The reliability is currently very poor, even in the relatively structured field of academic publishing. When we look at facial recognition in devices like Google Glass through various hacked and ‘unauthorised’ apps (see for e.g.

The concern here is usually cited as ‘privacy’. This is a legitimate and serious concern, of course, but it’s usually premised on the assumption that the facial recognition actually works and can reliably attach a face to data about the person. At least as worrying is that, in reality, we can’t assume this is the case (for example, look at the reliability of facial recognition in Picasa, if you’ve tried it). The possible problems caused by mis-identification are very worrying. The inflation of my publication record is more amusing than annoying, but imagine being publicly mis-identified as a serial criminal or Man Utd supporter.

January 17, 2014 at 12:48 pm