Computing “That’s What She Said”


Friday, June 10, 2011

BOB GARFIELD: You may have noticed that at this point in the show you have not heard mention of Anthony Weiner. That's because we figure everything there is to be said about the Democratic congressman's abusive and pathetic sexting behavior has been said. But, not without difficulty because every time you do try to do Weiner coverage, you confront such dubious phrases as "Weiner coverage."


JOY BEHAR: Look, I love the weiner. Even the congressman - I love him too.


BOB GARFIELD: The challenge was not made any easier by the smirking word play of the nation's most notorious arrested adolescent, himself.

CONGRESSMAN WEINER: You know, this is part of the problem with, with the way this has progressed, and one of the reasons why I was perhaps, forgive me, a little bit stiff yesterday.


What? We —

BOB GARFIELD: Luckily, where taste and judgment fail, technology offers a solution. University of Washington computer scientists recently unveiled a semantic software that can recognize sexually suggestive sentences before it's too late. The program is called Double Entendre via Noun Transfer, or DEviaNT, for short, but what it really does is just automate the creation of the old-fashioned "that's what she said" joke.

Writer Jacob Aron reported on it for New Scientist Magazine. Jacob, welcome to OTM.

JACOB ARON: Oh, thank you very much.

BOB GARFIELD: So tell me, first of all, what is a "that's what she said" joke.

JACOB ARON: It's not a very funny joke, it's not a very original joke, but someone says something and you realize that you can twist their words another way by saying, "that's what she said" and reveal a hidden meaning that perhaps they hadn't intended.

BOB GARFIELD: Ah, so you smirkingly flag that a double entendre maybe has occurred.


BOB GARFIELD: Now, you know, I don't know a whole lot about the semantic web, but my understanding is that it's very difficult for software to really have a — enough understanding of the context of conversational language to divine intent, you know, much less naughty double meanings. How do they do that?

JACOB ARON: Basically they use the same methods that Google uses to understand all of the web pages out there. They scan a huge amount of data and try and identify certain statistical properties that will flag up a "that's what she said" joke versus something that isn't a joke.

BOB GARFIELD: And I'm curious, the researchers, what — was this an academic exercise? Was it a party game? What motivated them to devote whatever they devoted to make this software work?

JACOB ARON: Well, I mean, I'm not exactly sure what their intent was. I'm sure they didn't spend, you know, sort of years trying to correct this. But they — they presented their work at a conference on computational linguistics.

So there is a serious intent to it, especially where we're trying to teach computers to understand writing and, and voice, all — all that kind of stuff. And, and these kind of methods, you know, it — it is silly but if we can teach a computer to understand a joke, we're one step closer to a computer being able to understand anything, really.

BOB GARFIELD: Tell me the form that the program takes. Does it scan the Web, looking for "that's what she said" opportunities? What — what does it do?

JACOB ARON: So basically I think you — you feed it a sentence and it'll tell you — it has to reach a certain threshold and once it reaches that threshold, it says yes, it would be funny to say "that's what she said."

BOB GARFIELD: If you typed in "This is a hard story to report" -

JACOB ARON: It would certainly come back and say "that's what she said" I think.
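The score-and-threshold idea Aron describes can be sketched in a few lines of Python. This is only an illustration, not the researchers' system: the word weights, the function names `suggestiveness` and `twss`, and the threshold of 0.5 are all made up for the example, and the real program learns its statistics from a large body of text rather than from a hand-written list.

```python
# Toy illustration only. These weights are invented for the example and
# are NOT the features or data used by the DEviaNT researchers.
HYPOTHETICAL_WEIGHTS = {
    "hard": 0.6,
    "stiff": 0.6,
    "big": 0.4,
    "buns": 0.5,
    "meat": 0.7,
}


def suggestiveness(sentence: str) -> float:
    """Sum the hypothetical weights of the words in the sentence."""
    words = [w.strip(".,!?\"'").lower() for w in sentence.split()]
    return sum(HYPOTHETICAL_WEIGHTS.get(w, 0.0) for w in words)


def twss(sentence: str, threshold: float = 0.5) -> bool:
    """Return True when the score reaches the (assumed) threshold."""
    return suggestiveness(sentence) >= threshold


if __name__ == "__main__":
    for s in ["This is a hard story to report", "The meeting starts at noon"]:
        print(s, "->", "that's what she said" if twss(s) else "no joke")
```

Raising the threshold makes the toy program fire less often; lowering it makes it fire on more sentences, including ones that are not really jokes.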

BOB GARFIELD: How accurate is the "that's what she said" machine?

JACOB ARON: So the system's actually about 70 percent accurate, which doesn't sound very good. But it — it's because of the mix of the data that they use to train the system. So they have one and a half million erotic sentences, but only about 60,000 regular sentences.


And so, you know, if they'd better balanced their data, they'd be able to get better accuracy, maybe even as good as 99.5 percent. I think there was only one — one of a few that it couldn't come up with. It did manage to get "don't you think these buns are a little too big for this meat" —


— which I think is a particularly good one.
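One way the mix of examples can move a headline figure is shown in the sketch below. It assumes, purely for illustration, a detector that catches 80 percent of real jokes and wrongly flags 1 percent of ordinary sentences; those rates, the sentence counts, and the `precision` function are hypothetical and are not taken from the researchers' work. Precision here means the share of flagged sentences that really are jokes.

```python
def precision(hit_rate: float, false_alarm_rate: float,
              n_jokes: int, n_ordinary: int) -> float:
    """Fraction of flagged sentences that are genuine jokes.

    hit_rate: share of real jokes the detector flags (assumed value).
    false_alarm_rate: share of ordinary sentences it flags (assumed value).
    """
    true_positives = hit_rate * n_jokes
    false_positives = false_alarm_rate * n_ordinary
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Same hypothetical detector, two different mixes of sentences.
    lopsided = precision(0.8, 0.01, n_jokes=100, n_ordinary=10_000)
    balanced = precision(0.8, 0.01, n_jokes=1_000, n_ordinary=1_000)
    print(f"lopsided mix: {lopsided:.1%}")
    print(f"balanced mix: {balanced:.1%}")
```

With the same detector, the lopsided mix yields a much lower precision than the balanced one, because the few false alarms per hundred ordinary sentences add up when ordinary sentences vastly outnumber jokes.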

BOB GARFIELD: You ended your piece with a "that's what she said" joke. How did that work out?

JACOB ARON: Well, it — it's not a very good "that's what she said" joke. They — they say, the techniques of metaphorical mapping may be generalized to identify other types of double entendres and other forms of humor. And "that's what she said," of course, because — one of the researchers is a woman.

BOB GARFIELD: I get it. So, in fact, "that's what she said" was an accurate attribution.


BOB GARFIELD: It — but it was not a double entendre.


BOB GARFIELD: You devil! Jacob, thank you very much. [LAUGHS]

JACOB ARON: Great talking to you.

BOB GARFIELD: Jacob Aron is a technology reporter at The New Scientist.