The topic comes from a TED Talk by Sebastian Wernicke: “How to use data to make a hit TV show”.

The talk compares a successful TV show, Netflix’s “House of Cards”, with a less popular one, Amazon’s “Alpha House”. Sebastian walks through how each company collected and analyzed data to decide which show to produce. Amazon created eight different candidate shows, made the first episode of each free to watch, and tracked every action viewers took while watching in order to collect data. Netflix, by contrast, analyzed the millions of data points it already had about its viewers, and that approach worked beautifully. So the truth is that analyzing millions of data points does not always lead to the right decision or conclusion.

In the talk, Sebastian said: “Whenever you’re solving a complex problem, you’re doing essentially two things. 1. You take that problem apart into its bits and pieces so that you can deeply analyze those bits and pieces, and then, of course, you do the second part; 2. You put all of these bits and pieces back together again to come to your conclusion. And now the crucial thing is that data and data analysis is only good for the first part. Data and data analysis, no matter how powerful, can only help you taking a problem apart and understanding its pieces. It’s not suited to put those pieces back together again and then come to a conclusion.”

Sebastian also suggests: “Data is, of course, a massively useful tool to make the better decision, but I believe that things go wrong when data is starting to drive those decisions. No matter how powerful, data is just a tool. There’s another tool that can do that, and we all have it, and that tool is the brain. If there’s one thing a brain is good at, it’s taking bits and pieces back together again, even when you have incomplete information, and coming to a good conclusion, especially if it’s the brain of an expert.”

The lesson of this story is that when we create data-driven news stories, we should keep in mind that “big data does not always work perfectly”.

In my view, Amazon and Netflix got different results because of their different ways of thinking (their logic) and the different ways they collected and analyzed data.

1. Data collection: 1) Amazon started from its own choice of eight shows and collected data only on those, whereas Netflix relied on historical data, which is more reliable. 2) Amazon limited its sample to the audience that chose to watch its free episodes, which means the viewers’ motivation may not have been genuine interest; some may have watched simply because the shows were free. Netflix, on the other hand, used existing, ready-made data to learn its audience’s taste and make its prediction.

2. Data analysis: 1) Amazon recorded everything: when somebody presses play, when somebody presses pause, what parts they skip, what parts they watch again. It wanted those data points in order to decide which show to make. But from my point of view, the recorded actions are not enough to answer that question. The reasons behind those actions matter much more, and the staff at Amazon could only guess at those reasons from the collected data. 2) Netflix looked at the data it already had about its viewers: “the ratings they give their shows, the viewing histories, what shows people like, and so on”. It then used that data “to discover all of these little bits and pieces about the audience: what kinds of shows they like, what kind of producers, what kind of actors”. The point is that Netflix focused on the audience.

Therefore, good data-driven stories come from sound data collection and analysis. That is also why practitioners, journalists and journalism students are eager to build skills in these two areas.

There is also a related article in the New York Times: “Giving Viewers What They Want”.