Knowing Me, Knowing You

Transcript

Friday, November 21, 2008

BROOKE GLADSTONE:
The more researchers you have, the better results you’re likely to get. That’s why Netflix offered a million dollar prize to whomever can figure out how to improve by 10 percent the software that makes movie recommendations to Netflix customers.

You would think it would be a simple matter of compiling their past preferences, but that only gets you so far. As Clive Thompson explains in this weekend’s New York Times Magazine, there’s an impenetrable mystery at the heart of our predilections, but it’s a mystery that the Netflix program called Cinematch has to penetrate if it wants to succeed.
CLIVE THOMPSON:
Almost two thirds of all movies they rent are picked because people had it recommended by the computer. Now, that seems like almost kind of crazy when you think about it, but it makes sense, because, like, when I joined Netflix I had, like, 20 movies I wanted to see, so I saw those. And once I'm done, I can't really think of any other movies I want to see.

So if they want to keep on charging me 17 bucks a month, they have to be active in helping me find new stuff or I'll go four or five months without renting any movies and I'll be, like, why am I spending 17 bucks a month, right? Their business model is incumbent upon keeping you renting movies.
BROOKE GLADSTONE:
So how’s this contest going? What can their data tell them at this point?
CLIVE THOMPSON:
Well, it's interesting. Within about one month of the competition, they — all these amateurs — and they're mostly amateurs — had gotten five percent better. So they, like, right off the bat they're five percent better.

And they did this with just a couple of simple mathematical techniques. One of them is called “singular value decomposition.” It’s very common. Like mathematicians use this all the time to take huge sets of data and try and find interesting patterns in it, but no one had ever thought to use it for movies.

It’s a way of sort of automatically trying to find what they call “features” in a data set. Now, what that translates to in movies is, it'll look at the movies and automatically try and pull out features.
BROOKE GLADSTONE:
And, for an example, one would be comedy.
CLIVE THOMPSON:
Yeah — yeah, exactly, comicness.

BROOKE GLADSTONE:
And another feature might be they all have Tom Hanks.
CLIVE THOMPSON:
Yeah, yeah. And you can — sometimes when you look at this feature set, you can figure it out. So you'll see a list, and it’s like all these chick flicks at one end and then all these action films at the other, so you’re, like, clearly it’s discovered something that sort of arrays these movies based on, you know, chick flickiness. That way it can sort of recommend it based on doing the same thing for the user.

It sort of figures out how chick flicky are you as a user? Well, here’s the movies that are chick flicky, and it lines you up that way.

But as they got better and better at doing this, when you look at some of these features, you can't figure out what, for goodness sake, the computer’s figuring out, because there’s just like a bunch of movies at one end and a bunch of movies at the other end, and it doesn't make any sense at all.

I looked at this one feature set, it had like a [LAUGHS] wrestling video and a chick flick and a historical drama, and all clumped together as if there were something similar with these. So it’s like the computer is finding something out about us that we ourselves can't even figure out.
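
A rough sketch of the idea Thompson describes, not the contestants' actual code: the ratings table below is invented, and the helper names (`center_rows`, `top_singular_triplet`) are hypothetical. It pulls out only the single strongest "feature" of a tiny ratings matrix by power iteration, which is one simple way to compute the leading component of a singular value decomposition. Movies that score alike on the feature land at the same end of the list, and each user gets a score on the same feature.

```python
import math

# Hypothetical toy ratings (rows = users, columns = movies), 1 to 5 stars.
MOVIES = ["Romance A", "Romance B", "Action A", "Action B"]
RATINGS = [
    [5, 4, 1, 2],  # user 0 prefers the romances
    [4, 5, 2, 1],  # user 1 prefers the romances
    [1, 2, 5, 4],  # user 2 prefers the action films
    [2, 1, 4, 5],  # user 3 prefers the action films
]


def center_rows(matrix):
    """Subtract each user's mean so the feature reflects taste, not generosity."""
    return [[x - sum(row) / len(row) for x in row] for row in matrix]


def top_singular_triplet(matrix, iterations=100):
    """Power iteration for the leading singular value and vectors of `matrix`."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / (j + 1) for j in range(n_cols)]  # arbitrary nonzero start
    for _ in range(iterations):
        # u = A v
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # v = A^T u, rescaled to unit length
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return u, sigma, v


u, sigma, v = top_singular_triplet(center_rows(RATINGS))

# Movies sorted along the discovered feature: similar films cluster at each end.
for score, title in sorted(zip(v, MOVIES)):
    print(f"{title:10s} {score:+.2f}")

# Each user's position on the same feature.
for index, score in enumerate(u):
    print(f"user {index}    {score:+.2f}")
```

In this toy data the feature is easy to name because the ratings were built around two genres. With real ratings, as Thompson notes, later features can group films in ways nobody can label.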
BROOKE GLADSTONE:
Which I think is a good time to bring up Napoleon Dynamite. Why is that film significant?

CLIVE THOMPSON:
Well, Napoleon Dynamite is really interesting, because what happened is — these guys, they came out of the gate and they were very quickly able to get like, five percent better in Netflix but it took, like, another six months to get to six percent and it took, like, a year to get to seven percent.

They've been two years in now, and they have been slowing down, so like they've just been stuck at about nine and a half percent for, like, six months. They're just inching up. It’s almost like you’re climbing a mountain that’s getting steeper and steeper as you go.

And so I asked them, I said, why? And they said, well, basically the problem is there’s this small handful of movies that are causing most of our errors, you know, so, like, basically they're movies that we can't seem to predict whether or not you’re going to like it.

And it turns out that there’s one movie [BROOKE LAUGHS] that is generating 15 percent of the errors. So if you could figure out whether or not someone likes this movie accurately, you would be 15 percent of the rest of the way to making a million dollars. And that movie is Napoleon Dynamite. They cannot seem to predict whether or not you’re going to like it.
BROOKE GLADSTONE:
Because it’s so weird?
CLIVE THOMPSON:
Yeah, probably. Yeah. Well, actually, I asked them. I said, so why, why this movie? And I should point out that the other sort of 20 odd movies that are causing, like, the other half of that error rate are very similar. They're all these weird movies, like I Heart Huckabees or, you know, Team America: World Police, or, you know, Sideways, indie movies that are — that you — and I said, so why?

And they said, well, the problem is with these movies, people either love them or they hate them. Like you take a movie that’s easy to predict, like Lord of the Rings, you know, it tends to get all — most of the same rating. Or you take a movie that even people don't like, you know, but it’s like a genre movie, like a chick flick, you know, a bad chick flick like The Black Book, it tends to get [BROOKE LAUGHS] mostly bad reviews. Like people agree on it.

The problem with these movies is nobody agrees. They're all — it's all five stars and one stars. And when you have that sort of data signal, it’s very confusing for the computer. It just cannot quite figure out how to recommend the movie.
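
A small illustration of why split ratings are hard, using invented numbers and assuming error is scored as root-mean-square error. The function name `rmse_of_best_constant` is hypothetical, and a real recommender personalizes its guesses rather than predicting one number for everyone. The point is only that when a film draws nothing but five stars and one stars, even the best single guess is far off for every viewer.

```python
import math


def rmse_of_best_constant(ratings):
    """RMSE when every rating is predicted by the mean (the best single guess under squared error)."""
    mean = sum(ratings) / len(ratings)
    return math.sqrt(sum((r - mean) ** 2 for r in ratings) / len(ratings))


# Hypothetical star ratings, invented for illustration.
consensus = [4, 4, 5, 4, 4, 5, 4, 4]  # viewers largely agree
polarized = [5, 1, 5, 1, 5, 1, 5, 1]  # love it or hate it

consensus_error = rmse_of_best_constant(consensus)
polarized_error = rmse_of_best_constant(polarized)
print(f"consensus: {consensus_error:.2f}  polarized: {polarized_error:.2f}")
```

Unless the system can tell in advance which camp a viewer falls into, the polarized film contributes several times more error per rating than the film people agree on.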
BROOKE GLADSTONE:
But as you say that, you know, that last half a percentage point to the 10 percent improvement that wins you the million dollar prize seems to be almost a vanishing point
CLIVE THOMPSON:
Yeah. Yeah.
BROOKE GLADSTONE:
because measuring taste with perfect accuracy is apparently an impossible dream, no matter how much data we have to work with. Films and books and music are art, and their appeal is, in some cases, ineffable, right?
CLIVE THOMPSON:
Well, one way to look at the problem is that, like, what Netflix is asking the computer to do is look at all your past ratings and predict how you’re going to like something in the future — right — that you haven't seen.

Well, the problem is that, you know, our taste sort of changes over time. It’s not static or, you know, it often changes. And the way it changes, is in these very unpredictable ways. Maybe you interact with someone who convinces you to look at a movie or maybe you read something or maybe you just get interested in something that you would never see otherwise because your friends are interested in it.

And so it’s very hard to capture those types of things algorithmically. And that’s why a lot of people have said, well, maybe there’s only so far you can push this idea that the computer can sort of put us on the couch and sort of, you know, psychologically figure out what we're doing.

Maybe you need to bring other live people in like you need to have sort of a Facebook application that says, you know, here’s what your friends are watching. You know, maybe you should take some cues from that. Maybe that’s how you'd sort of solve the Napoleon Dynamite problem — it's use sort of a social mob, in addition to the computer science angle.
BROOKE GLADSTONE:
We heard in the interview that just preceded ours, how the recipe for accurate scientific research is transparency and crowd sourcing, at least joining together, pooling the information of a bunch of researchers working on the same problem.

Now, Netflix has applied this lesson in structuring its Cinematch contest. Clearly that’s making a difference, right?
CLIVE THOMPSON:
There seems to be a very sort of almost, like, civil or academic tone to what’s going on, because as soon as someone discovers something they'll sort of go on to the Netflix discussion boards for this and excitedly talk about it. They'll put computer code up on this board, saying, hey, I just discovered this.

So that’s also sped things up, because all the innovators will see what everyone else is doing and they can all try it out. So really it’s been very, very, sort of a very exciting thing to watch, you know, in a lot of ways.

BROOKE GLADSTONE:
So you've said the only way, really, to get to that 10 percent improvement or more is to bring real people back into the process. It’s not likely, right, that Netflix or Amazon or any of these commercial websites will be doing that on a systematic basis. What are the stakes if they can't master the art of predicting our tastes better?
CLIVE THOMPSON:
Well, I mean, to a certain extent you could say that they actually have enjoyed great success. They've already taken a lot of the things that these guys have figured out. They've already folded them into their recommendation system, so they're already going to start making millions more.

But the reason why this is important is because you've got this media world where you have way more stuff than anyone can ever pick from. This is what they call “the paradox of choice.” If you have too many choices, you'll sort of get frozen. You'll choose nothing.

Music is an even bigger problem than films, because films, you know, a finite number come out every year, a couple hundred, a couple thousand.

But, you know, that’s as many songs as get issued on a day, right? So the music industry is, to a certain extent, in even more of a lather to figure this one out, because, you know, their industry is in a complete freefall and they're pretty convinced that the only way to cut through the noise is to be able to push people towards the right music.

And so they have this very interesting heated argument about — that falls along the same lines as to whether it should be done by computers or by humans, you know, because some people say, well, computers are the only way that you could look at these millions and millions of songs that are coming out every year and find patterns in them, and other people are like, no, no, no, no, that’s not going to work. If you really want to get people addicted to things, you've got to have humans guiding them.
BROOKE GLADSTONE:
You quote in your piece someone who says, you know, there are cultural consequences to limiting yourself to computer data crunching to help guide people’s opinions. It creates narrow mindedness.
CLIVE THOMPSON:
Yeah. Well, one of the problems is — and this is sort of an interesting philosophical point — again, if you say that, okay, Clive Thompson, I'm going to recommend things based on what you've seen in the past, I'm getting sort of the Clive Thompson universe.

But I like a lot of sci fi, I like a lot of nerdy things, so I'm not going to go out and see the things that are mainstream, right? And maybe that’s a problem because I'm going to start being out of touch with what my friends are talking about, you know?

Another way of putting it is that [LAUGHS] whether or not you like a piece of culture may not be the most compelling reason to see it. You know [LAUGHS], maybe there are other reasons to encounter culture other than whether or not you like it. Maybe it’s almost like, you know, eat your beans. You know, you should pay attention to some of this stuff if you want to know what’s going on in society.

And so how in God’s name are you going to get a computer algorithm to say, I'm going to recommend something that I'm pretty sure you won't like but I think you should see?
BROOKE GLADSTONE:
Only your friends can get you to eat your beans.
CLIVE THOMPSON:
That's right.
BROOKE GLADSTONE:
Clive, thank you very much.
CLIVE THOMPSON:
No problem, good to be here.
BROOKE GLADSTONE:
Clive Thompson’s article about Netflix and Cinematch is in this weekend’s New York Times Magazine.
[CLIP]:
TREVOR SNARR AS DON:
Hey, Napoleon. What’d you do last summer again?
JON HEDER AS NAPOLEON DYNAMITE:
I told you, I spent it with my uncle in Alaska, hunting wolverines.
TREVOR SNARR AS DON:
Did you shoot any?
JON HEDER AS NAPOLEON DYNAMITE:
Yes, like 50 of them. They kept trying to attack my cousins. What the heck would you do in a situation like that?
TREVOR SNARR AS DON:
[LAUGHS] What kind of gun did you use?
JON HEDER AS NAPOLEON DYNAMITE:
A frickin’ 12 gauge. What do you think?
[END CLIP] [MUSIC UP AND UNDER]
BOB GARFIELD:
That's it for this week's show. On the Media was produced by Megan Ryan, Jamie York, Mike Vuolo, Mark Phillips, Nazanin Rafsanjani and Michael Bernstein, and edited — by Brooke. We had technical direction from Jennifer Munson and more engineering help from Zach Marsh. We also had help from Deena Prichep and Andy Lanset. Our webmaster is Amy Pearl.
BROOKE GLADSTONE:
Katya Rogers is our senior producer and John Keefe our executive producer. Bassist/composer Ben Allison wrote our theme. You can listen to the program and find free transcripts at Onthemedia.org. You can also post comments there or email us at Onthemedia@wnyc.org. This is On the Media from WNYC. I'm Brooke Gladstone.
BOB GARFIELD:
And I'm Bob Garfield.