An excerpt from

Big Stupid Data

James O’Malley

Which Films Do Critics Disagree About The Most?
We look to film critics to be an authority on which films are worth seeing, and which we should miss - but the truth is, sometimes the critics disagree. In fact, sometimes they really disagree. So I wanted to find out: which films are the most controversial amongst critics?

To find out, I downloaded an enormous dataset from popular aggregator Metacritic. It turns out that critics really disagree about the likes of Sin City, Under the Skin and Star Wars Episode III.
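One simple way to put a number on "disagreement" is the spread of critic scores for each film. The sketch below is only an illustration under that assumption, not necessarily the method used in the chapter: it ranks films by the standard deviation of their scores. The function name and the scores are made-up placeholders, not real Metacritic data.

```python
from statistics import pstdev


def rank_by_disagreement(scores_by_film):
    """Return (film, spread) pairs, most divisive first.

    Spread is the population standard deviation of the critic scores.
    Films with fewer than two reviews are skipped.
    """
    spreads = {
        film: pstdev(scores)
        for film, scores in scores_by_film.items()
        if len(scores) >= 2
    }
    return sorted(spreads.items(), key=lambda pair: pair[1], reverse=True)


# Placeholder scores out of 100, invented purely for illustration.
sample_scores = {
    "Film A": [90, 20, 100, 30],  # critics split into two camps
    "Film B": [60, 65, 70, 62],   # critics broadly agree
}

ranking = rank_by_disagreement(sample_scores)
for film, spread in ranking:
    print(f"{film}: spread {spread:.1f}")
```

A film that everyone scores around 60 and a film that half the critics love and half hate can share the same average, which is why the spread rather than the mean is the interesting number here.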

This same dataset has also enabled me to dig into other questions: On which films do critical opinions diverge most from those of audiences? Which critics are the most contrarian? And which are the best barometers of critical opinion?

I was even able to wade in on the ultimate question by mashing the data up with data from industry website Box Office Mojo: Are box office results affected by bad reviews?

What Can A Country’s Roads Tell Us About Its Government?
Over the last few years there have been a number of big, dense works of political science digging into what really makes countries successful. The likes of Why Nations Fail and Political Order And Political Decay broadly agree - institutions are key, and corruption is corrosive to development.

But these are all hard to measure and take a lot of time-consuming analysis. Luckily then, I’ve hit upon a useful heuristic for telling just how good a government is: the quality of the roads. Inspired by a haphazard drive around Bucharest, I took data on governance and rule of law and mashed it up with data on the number of road deaths - and revealed a surprisingly stark correlation.
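For the curious, a correlation like this can be measured with a Pearson coefficient. The sketch below is a minimal illustration, assuming one governance score and one road-death rate per country; the function name and the numbers are invented placeholders rather than the real figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / sqrt(var_x * var_y)


# Placeholder values, one entry per hypothetical country.
governance_score = [-1.0, -0.5, 0.0, 0.5, 1.0]  # higher means better governed
road_deaths_per_100k = [25, 20, 15, 10, 5]      # invented for illustration

r = pearson(governance_score, road_deaths_per_100k)
print(f"correlation: {r:.2f}")
```

A value near -1 would mean that better-governed countries have fewer road deaths; a value near 0 would mean the roads tell us nothing.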

Which Fast Food Chain Is The Grossest?
Fast food definitely has a mixed reputation: Sure, it might be just what you need at 1am after a night out, but it also seems reasonable to worry about just where your food is coming from.

Is there any way we can judge how gross a fast-food chain is? By using data from the Food Standards Agency, I’ve been able to prove that McDonald’s is a surprisingly hygienic affair - whereas that disgusting-looking fried chicken shop down the road is… well, just as disgusting as you thought.

How Can We Measure Christmas Spirit?
As the year slides to a close, there’s always sort of a persistent Christmas feeling. And this made me wonder: When does Christmas truly begin? Is there a way to measure Christmas feeling? Can we use data to decide when it becomes acceptable to blast out Do They Know It's Christmas on the office stereo?

To find out, I’ve built an algorithm that works a bit like a stock exchange: It gives an Index number, which enables us to see how Christmassy things are feeling - and whether that is more or less Christmassy than times past.

What powers it? Not money, but Spotify listening data, based on the popularity of Christmas songs on the platform. When will Christmas peak? I guess we might finally find out.
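As a rough idea of how an index like this could be calculated, the sketch below compares the share of listening that goes to Christmas songs on a given day against a fixed baseline share, scaled so that the baseline equals 100. This is an assumed formula for illustration only, and the function name and figures are placeholders rather than real Spotify data.

```python
def christmas_index(christmas_streams, total_streams, baseline_share):
    """Index of Christmassiness, where 100 means 'same as the baseline'.

    christmas_streams: plays of Christmas songs on the day in question
    total_streams:     all plays on that day
    baseline_share:    Christmas share of plays on the reference day
    """
    if total_streams <= 0 or baseline_share <= 0:
        raise ValueError("total_streams and baseline_share must be positive")
    share = christmas_streams / total_streams
    return 100 * share / baseline_share


# Placeholder numbers: Christmas songs are 5% of plays today,
# against a reference day on which they were 10% of plays.
index_today = christmas_index(50, 1000, baseline_share=0.10)
print(f"Christmas index: {index_today:.0f}")
```

Using a share of all listening, rather than a raw play count, stops the index drifting upwards just because the platform has more users than it did in times past.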

Can Twitter Keep A Secret?
Twitter is simultaneously the best and the worst thing in the world. It’s full of some of the world’s worst people - but it is also a great way of spreading the news. And this made me think: Surely this combination of attributes means that it will be utterly rife with spoilers?

So when Star Wars: The Force Awakens came out, I set up a system to log any mentions of a particular spoiler, so I could see just how many people on the network would try to ruin the film for others. And amazingly, according to my research, it appears that if it is something as important as Star Wars, then Twitter really can keep a secret.

How Often Does ITV2 Show Shaun Of The Dead?
Everyone loves Edgar Wright and Simon Pegg’s Cornetto Trilogy - but can you have too much of a good thing? This seems to be the theory that ITV2 is trying to prove, as it is showing Shaun Of The Dead seemingly every time I look to see what’s on.

But is my mind playing tricks or do they really play it on a loop? To find out, I’ve been downloading and analysing a year’s worth of TV listings. You know, like normal people do when faced with this sort of thing.

What Books Do Fans Of Donald Trump Read?
Donald Trump is not the most… articulate of Presidents, and his book Crippled America is a testament to this. It is (ghost) written like he talks and contains virtually no substance. (Yes, against my better judgement I actually read it.)

When I went to rate the book online, on GoodReads, I spotted a tonne of 5-star reviews, and this made me wonder: Who the hell is actually reading this book and liking it? Who are these hardcore Trump fans and if they like this book… then what else do they like?

To find out, I wrote some code to download a shedload of data from GoodReads, and was able to do a deep dive to find out what else they’ve been reading. So now I can give a top 10 favourite books for fans of both Donald Trump… and Hillary Clinton.

How Many Twitter Followers Is An Olympic Medal Worth?
Sure, winning an Olympic medal must be a pretty good feeling, but surely the better long-term question is: What does this mean for your #personalbrand? Can success on the track translate to success on social media? To find out, during the Rio Olympics I monitored the Twitter followings of British Olympic team members - and was able to find out the real winners and losers from the Games, as well as discover just how many followers Laura Trott racked up as she won Gold in the Velodrome.

Was 2016 Really A Terrible Year For Celebrity Deaths?
Bowie. Prince. Carrie Fisher. Fidel Castro. So many massive names died in 2016 that it felt like no famous face was safe. But was it a particularly tragic year for losing celebrities?

To find out, I have devised an algorithm using Wikipedia links to weight the notoriety of big names, and to compare the deaths across different years. The result: Yes, 2016 really was a bad year for celebrities dying.
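To give a flavour of what weighting by Wikipedia links might look like, the sketch below adds up a per-person link count for everyone who died in each year, so that one very famous name counts for more than several obscure ones. It is a guess at the general shape of such an algorithm, not the chapter's actual method, and the records are anonymous placeholders rather than real link counts.

```python
from collections import defaultdict


def notability_lost_by_year(deaths):
    """Sum a notability weight for each year of death.

    deaths: iterable of (year, inbound_links) pairs, where inbound_links
    is an assumed proxy for fame: the number of Wikipedia pages that
    link to that person's article.
    """
    totals = defaultdict(int)
    for year, inbound_links in deaths:
        totals[year] += inbound_links
    return dict(totals)


# Placeholder records, invented for illustration.
sample_deaths = [
    (2015, 100),
    (2016, 400),
    (2016, 250),
]

totals = notability_lost_by_year(sample_deaths)
worst_year = max(totals, key=totals.get)
print(totals, "worst year:", worst_year)
```

Comparing totals rather than head counts is what lets a year with a handful of huge names register as worse than a year with many minor ones.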

What Is The Greatest American State?
A common refrain amongst American politicians is referring to “The Great State of…”, and adding whichever state they are talking about. But hang on… they can’t all be great, can they? But if they are… which is the greatest? To find out, I have devised a system to work out not just the greatest state, but the greatest state per capita too.

What Is The President Thinking?
With just a few words, the President of the United States can start wars and sink economies - so what he says and does matters. Donald Trump shocked the world when he was elected, and he has continued to upend many assumptions about how we thought political power works. So how can we understand what he and his advisors - his family - are thinking?

As luck would have it, the Trump family are famously active on Twitter, and don’t delegate Twitter responsibilities to staffers. So in an attempt to do some Kremlinology, I’ve built a Twitter bot that monitors the follows, unfollows and likes of Donald Trump, his sons Eric and Don Jr, his daughter Ivanka, and his wife Melania. Seeing the media they consume, and the tweets they engage with, arguably gives us a glimpse into their subconscious. This chapter will talk about how the bot works, and some of the crazy things we’ve learned in the six months or so it has been in operation.