Friday, July 31, 2009
This is On the Media. I'm Bob Garfield. Two and a half years ago, Netflix set up a contest that offered a one-million-dollar prize to whoever could figure out a way to improve its movie recommendation software by at least 10 percent. For years, amateur and professional computer scientists, statisticians and psychologists have been trying to create an algorithm that knows whether you want to rent Borat or Tropic Thunder. But why would Netflix offer a million bucks for such a small improvement?
CLIVE THOMPSON: Almost two thirds of all movies they rent are picked because people had it recommended by the computer.
BOB GARFIELD: Clive Thompson writes for The New York Times Magazine and Wired.
CLIVE THOMPSON: Now that seems like, almost kind of crazy when you think about it, but it makes sense, because, like, when I joined Netflix I had like 20 movies I wanted to see, so I saw those. And once I'm done, I can't really think of any other movies I want to see. So if they want to keep on charging me 17 bucks a month, they have to be active in helping me find new stuff or I'll go four or five months without renting any movies and I'll be, like, why am I spending 17 bucks a month, right? Their business model is incumbent upon keeping you renting movies.
BOB GARFIELD: That was actually from an interview Brooke did last year. At that time, all of the entries had stalled at around the nine percent mark and Thompson didn't predict that anyone would make it to ten percent any time soon. Brooke will take it from here.
BROOKE GLADSTONE: Napoleon Dynamite. Why is that film significant?
CLIVE THOMPSON: Well, Napoleon Dynamite is really interesting, because what happened is - these guys, they came out of the gate and they were very quickly able to get like, five percent better in Netflix but it took like another six months to get to six percent and it took like a year to get to seven percent. And they have been slowing down. It’s almost like you’re climbing a mountain that’s getting steeper and steeper as you go. And so I asked them, I said, why? And they said, well, basically the problem is there’s a small handful of movies that are causing most of our errors, you know. So, like, basically they're movies that we can't seem to predict whether or not you’re going to like it. And it turns out that there’s one movie [BROOKE LAUGHS] that is generating 15 percent of the errors. So if you could figure out whether or not someone likes this movie accurately, you would be 15 percent of the rest of the way to making a million dollars. And that movie is Napoleon Dynamite. They cannot seem to predict whether or not you’re going to like it.
BROOKE GLADSTONE: Because it’s so weird?
CLIVE THOMPSON: Yeah, probably. It's all five stars and one stars. And when you have that sort of data signal, it’s very confusing for the computer. It just cannot quite figure out how to recommend the movie.
BROOKE GLADSTONE: What are the stakes if they can't master the art of predicting our tastes better?
CLIVE THOMPSON: Well, I mean, to a certain extent you could say that they actually have enjoyed great success. They've already taken a lot of the things that these guys have figured out. They've already folded them into their recommendation system, so they're already going to start making millions more. But the reason why this is important is because you've got this media world where you have way more stuff than anyone can ever pick from. This is what they call "the paradox of choice." If you have too many choices, you'll sort of get frozen. You'll choose nothing. Music is an even bigger problem than films, 'cause films, you know, a finite number come out every year, a couple hundred or a couple thousand. But, you know, that’s as many songs as get issued in a day, right? So the music industry is, to a certain extent, in even more of a lather to figure this one out, because, you know, their industry is in a complete freefall and they're pretty convinced that the only way to cut through the noise is to be able to push people towards the right music. And so they have this very interesting heated argument - one that falls along the same lines - as to whether it should be done by computers or by humans, you know.
BROOKE GLADSTONE: You quote in your piece someone who says, you know, there are cultural consequences to limiting yourself to computer data crunching to help guide people’s opinions. It creates narrow mindedness.
CLIVE THOMPSON: Yeah. Well, one of the problems is - and this is sort of an interesting philosophical point - [LAUGHS] whether or not you like a piece of culture may not be the most compelling reason to see it. You know, [LAUGHS] maybe there are other reasons to encounter culture other than whether or not you like it. Maybe it’s almost like, you know, eat your beans. You know, you should pay attention to some of this stuff if you want to know what’s going on in society. And so, how in God’s name are you going to get a computer algorithm to say, I'm going to recommend something that I'm pretty sure you won't like but I think you should see?
BROOKE GLADSTONE: Only your friends can get you to eat your beans.
CLIVE THOMPSON: That's right.
BROOKE GLADSTONE: Clive, thank you very much.
CLIVE THOMPSON: No problem.
BOB GARFIELD: Clive Thompson is a writer for The New York Times Magazine, Wired and Fast Company. Brooke spoke to him last year, but the contest has finally come to an end, and two separate teams managed to cross the 10 percent threshold. One, called BellKor’s Pragmatic Chaos, did it a month ago, which set off a 30-day dash for other teams to one-up them. And members of several of those teams banded together to form the Ensemble. It came down to the wire with many new submissions from both BellKor’s Pragmatic Chaos and the Ensemble. But Netflix has stopped receiving submissions, and it appears BellKor's is the winner, though the official winner won't be announced until September. Bob Bell is a scientist at AT&T and a member of BellKor's Pragmatic Chaos. He says there was no single insight that put them over that elusive 10 percent mark.
BOB BELL: We tried to combine a lot of models together to get the best synergy. The first strategies looked at every possible pattern we might find in the data. One type of pattern that proved particularly effective is what we call temporal effects, which basically allow a user’s interests and preferences to evolve over time, for example, as children mature, or perhaps with the addition of a new person to a user account. From a more technical perspective, we looked for ways to improve existing models that may have been developed by others, basically asking, are there ways to make this method work any better? And those typically led to small differences, but small differences ended up being very important in the end.
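[A note for readers: the model combination Bell describes can be sketched in code. The snippet below blends the outputs of two hypothetical component models by searching for the mixing weight that minimizes error on known ratings - a much-simplified stand-in for the team's actual blending, with all numbers invented for illustration.]

```python
# Toy data: true ratings and two hypothetical models' predictions.
# These values are made up for illustration only.
truth   = [4.0, 3.0, 5.0, 2.0, 4.0]
model_a = [3.5, 3.2, 4.6, 2.5, 4.1]   # e.g. a neighborhood-style model
model_b = [4.2, 2.7, 4.9, 1.8, 3.6]   # e.g. a latent factor model

def rmse(pred):
    # Root-mean-square error against the true ratings.
    return (sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)) ** 0.5

# Search over blend weights w in [0, 1]: blend = w*a + (1-w)*b.
# The blend can do no worse than either model alone, since w=0 and w=1
# are both candidates; usually it does strictly better.
best_w = min(
    (w / 100 for w in range(101)),
    key=lambda w: rmse([w * a + (1 - w) * b for a, b in zip(model_a, model_b)]),
)
blend = [best_w * a + (1 - best_w) * b for a, b in zip(model_a, model_b)]
```

In the real competition the teams blended hundreds of models, but the principle is the same: different models make different mistakes, so a weighted combination cancels some of each model's errors.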
BOB GARFIELD: Were you able to divine the problem with the status quo with the original Netflix algorithms?
BOB BELL: We haven't seen the details of their models. I think they used a method called "nearest neighbors," which basically looks at, okay, what other movies like this one have you rated? And if you liked movies that are like this, we think you'll like the new one, but if there are other movies like this that you tend to give low ratings to, we wouldn't recommend it. And, in fact, that’s useful, but there are some more complicated techniques that go by a variety of names. The name I like to use for them is "latent factor models," which allow for a much richer description not only of movies but also of users.
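[A note for readers: a latent factor model learns a short vector of hidden traits for every user and every movie, and predicts a rating from how well the two vectors line up. The sketch below fits such a model on a tiny invented ratings table using stochastic gradient descent; it is a minimal illustration of the technique Bell names, not the team's actual method.]

```python
import random

# Toy ratings: (user, movie, stars). All data is invented for illustration.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
    (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0),
]
n_users, n_movies, k = 3, 3, 2  # k = number of latent factors

random.seed(0)
# Each user and each movie gets a small vector of k latent factors.
P = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
Q = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_movies)]

def predict(u, m):
    # Predicted rating = dot product of the user and movie factor vectors.
    return sum(P[u][f] * Q[m][f] for f in range(k))

# Stochastic gradient descent on squared error, with L2 regularization
# so the factors stay small and generalize better.
lr, reg = 0.05, 0.02
for _ in range(500):
    for u, m, r in ratings:
        err = r - predict(u, m)
        for f in range(k):
            pu, qm = P[u][f], Q[m][f]
            P[u][f] += lr * (err * qm - reg * pu)
            Q[m][f] += lr * (err * pu - reg * qm)

rmse = (sum((r - predict(u, m)) ** 2 for u, m, r in ratings) / len(ratings)) ** 0.5
```

The "richer description" Bell mentions falls out of the learned factors: the model is free to discover dimensions like quirky-versus-mainstream without anyone labeling them, and it describes each user by the same dimensions.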
BOB GARFIELD: Latent factor models - you took the words right out of my mouth.
BOB BELL: [LAUGHING] That’s right.
BOB GARFIELD: So, one of the cool things about the Netflix prize is that it has attracted people who are not necessarily the usual suspects. Had they gone out to hire consultants, they wouldn't have had so many people from so many disciplines getting involved.
BOB BELL: Yes, I think what you’re referring to is very important, because in order to push this field forward it really does need a lot of different perspectives. There were a lot of people working on this with backgrounds in computer science but not necessarily from an academic perspective, where a lot of the initial development of recommender systems came from, but simply people who are sort of hackers who said, this looks like a fun thing to do. I know something about movies. Maybe I can provide some good ideas. Chris Volinsky, who’s another team member, and I are both statisticians, but there were other people who by trade were psychologists, attorneys, unemployed and whatever.
BOB GARFIELD: Are you a Netflix customer?
BOB BELL: I am not. I don't watch a lot of movies. My wife watches more, and she tends to go out and just purchase them or occasionally watch them on TV.
BOB GARFIELD: So you approached this subject with complete academic disinterest.
BOB BELL: [LAUGHS]
BOB GARFIELD: Has working on the project changed your behavior with respect to others? You know, you stop having conversations and just start trying to peer into people’s minds -
BOB BELL: [LAUGHS]
BOB GARFIELD: - to determine whether they're going to like Spider-Man 3?
BOB BELL: No, it’s had a lot of impact on how I approach other data analyses, but nothing substitutes just sort of talking to somebody and saying, well, what did you like about this movie? There'll always be a limit to what any computer model can do.
BOB GARFIELD: Bob, a tentative congratulations.
BOB BELL: Well, I appreciate that very much.
BOB GARFIELD: Bob Bell is a scientist at AT&T and a team member of BellKor's Pragmatic Chaos.
[MUSIC UP AND UNDER]