An Internet Archivist Recommits to Books


Friday, June 10, 2011

BOB GARFIELD: The notion of vanishing Internet history did not begin with Internet Week. Entrepreneur, philanthropist and digital librarian Brewster Kahle recognized this in 1996 and began a monumental undertaking, the Internet Archive, an attempt to preserve and make accessible all that is so fleeting online.

But a funny thing happened on the way to Internet posterity. Kahle says that his archive hard drives lasted only for three to five years and, as he sizes up the outdated technology surrounding him, he's arrived at a whole new appreciation of the solidity and necessity of... books, [LAUGHS] so much so that in addition to his ongoing Internet Archive, he's launched an ambitious new physical library. Brewster, welcome back to the show.


BOB GARFIELD: Last we spoke was about five years ago, and you were all charged up because at the time you were, like, backing up trucks to loading docks and, and spilling huge hoppers full of zeros and ones into the trucks, and you were gonna store all of these zeros and ones in perpetuity. It was the Internet Archive; you called it the Wayback Machine. But you've had a kind of a change of heart. Tell me about that.

BREWSTER KAHLE: Yes. Well, we've been digitizing a lot of materials and collecting all the born digital materials, everybody's web pages, and storing it, and trying to do as good a job as we can. But now people have started giving us physical books.

And then the question is, after they're digitized, is there any real reason to have the physical books? And we're determining yes, there absolutely is. We're discovering what librarians have known for centuries in this new digital world, so I, I'm feeling, like, a little naive.

BOB GARFIELD: Now, it seems to me that one of the motives for archiving things digitally is because the storage space is trivial compared to the physical storage space required to house actual paper books, which are fragile, which are heavy, oddly shaped, and so on. Isn't this kind of an awkward proposition, you know, building a library?

BREWSTER KAHLE: Building a library of physical books or keeping them around makes all the sense in the world, we think, that as digitization of books is going through, it's not only changing how people access library materials, it's also changing how libraries preserve physical books.


And we've just launched a new system for doing even higher density storage, long term preservation, that allows us to store millions of books, without going broke or using up large amounts of acreage. We now have the space for one million books, but we've designed this to be able to get to ten million books easily. Ten million books is the number of books at the Boston Public Library or a Yale or a Princeton.

BOB GARFIELD: The new Library of Alexandria.

BREWSTER KAHLE: Well, hopefully we have a different ending than they did.

BOB GARFIELD: [LAUGHS] There were others who — I — I'm thinking of the Library of Congress, to pick one obvious one, that already archives physical paper books. Tell me why you're not reinventing the wheel?

BREWSTER KAHLE: The Library of Congress is doing a great job. They get donations of materials because they get legal deposit of everything that's published in the United States, and they hold onto, I guess — from their website they say about half of it.

The idea of having all of this in one place makes no sense. The history of libraries is one that's fraught with destruction. Large central libraries tend to be destroyed. So the idea is to have distributed responsibility, distributed organizations, distributed approaches to long-term preservation of what we think is valuable.

BOB GARFIELD: For research or for archival purposes, it's hard for me to understand why the paper book needs to be preserved. Why doesn't a digital image do the trick right nice?

BREWSTER KAHLE: Hopefully the digital image will satisfy — well, the people that are now on the Net. But there are real reasons to keep the physical materials. Look back at microfilm, when microfilm was done that was the access format of a previous generation, and that was the format to end all formats.

Well, we now know it wasn't the format to end all formats. Or even the Google Book Project. They've digitized an enormous number of books, but they've done it in such a way that it kind of looks like a fax. So I think we can do a better job.

And the digitization that we're doing we think is absolutely fabulous, but there's going to be things we want to do next. And I think we can only kind of dream a little bit about how we might want to use these physical collections next.

BOB GARFIELD: I gather that in some ways you deem this to be an antidote to some of Google's practices in compiling its digital versions.

BREWSTER KAHLE: It's very tempting to just go and acquire books and just saw off the bindings and digitize them. You get a better digitization, it's much cheaper to do. But there's something in our stomachs that just grinds when we think about butchering books.

As Kevin Kelly put it, "Hey, I understand what you're doing. You're keeping the type specimen, the one of that particular ant that is the one that we're describing as the one for that species."

BOB GARFIELD: You should call this Noah's Ark-ive.

BREWSTER KAHLE: [LAUGHS] Google has done this amazing thing, where they've been digitizing lots of books and they will have, in fact, digitized two different copies from different places. And if they either didn't digitize a page very well or there was a torn-out page, which does happen, they can put it back together again such that they make a new digital version that's out of multiple copies, which seems terrific, but it's also a little creepy. What's the real thing here?

BOB GARFIELD: It's not a facsimile. It's kind of a Frankenstein's monster.

BREWSTER KAHLE: I wouldn't put it a negative way. I think it's actually, they're doing a high value add. But I think we really want to know where it came from and be able to make sure that we understand what's happened. The opportunity to live in an Orwellian or a Fahrenheit 451 type world, where things are changed out from underneath us, is very much present in our digital world. Let's make sure we put in place the long-term archives to make it so that we can check up on those that are presenting things in the future.

BOB GARFIELD: Has this new venture, the collection and storage of physical books, dimmed your enthusiasm for the Internet Archive Project? Have you given up on the zeros and ones that we talked about to begin this conversation?

BREWSTER KAHLE: There's been an evolution from just thinking we should just grab the digital stuff and store it, because people weren't doing a good job of storing it in the past, to really broadening out to understanding both the preservation and the access. But I'm still extremely enthusiastic about the new types of materials and the new challenges we have, trying to keep up with YouTube. Google Video and Yahoo! Video are just getting shut down now. Geocities is already gone. Flickr, I don't know how long that's gonna last. Boy, we've got some real work to do. It's never a dull day here at the Internet Archive.

BOB GARFIELD: Brewster, thank you so much.

BREWSTER KAHLE: Thank you for your time.

BOB GARFIELD: Brewster Kahle is founder and digital librarian of the Internet Archive.