The immense amounts of data collected by local, state and federal government agencies can be an incredibly valuable trove for enterprising journalists. It can also be a pointless slog. Texas Tribune reporter Matt Stiles and Duke University computational journalism professor Sarah Cohen explain how they find good stories in a sea of government data.


Comments [4]
To amplify Tim's point: Wal-Mart says on their Corporate Fact Sheet ( http://walmartstores.com/download/2230.pdf ) that they employ 2.1 million people. So 2.5 petabytes per hour is more than a gigabyte per employee per hour. Seems doubtful.
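The arithmetic in the comment above can be checked with a minimal sketch. It uses only the two figures quoted there (2.5 petabytes per hour, 2.1 million employees); the decimal conversion of 1 PB = 1,000,000 GB and the function name are assumptions made for illustration.

```python
# Figures as quoted in the comment; not independently verified.
PETABYTES_PER_HOUR = 2.5
EMPLOYEES = 2_100_000

# Assumption: decimal (SI) units, 1 PB = 10^6 GB.
GB_PER_PB = 1_000_000


def gb_per_employee_per_hour(pb_per_hour: float, employees: int) -> float:
    """Convert an hourly petabyte total into gigabytes per employee per hour."""
    return pb_per_hour * GB_PER_PB / employees


rate = gb_per_employee_per_hour(PETABYTES_PER_HOUR, EMPLOYEES)
print(f"{rate:.2f} GB per employee per hour")  # prints "1.19 GB per employee per hour"
```

Under those assumptions the result is a little over one gigabyte per employee per hour, which matches the commenter's estimate.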
Hi,
I'm curious for the transcript.
Enjoyed it immensely. The English teacher and IT person in me appreciated that Brooke correctly stayed with 'data are' throughout the show.
One error, however, was hearing that Wal-Mart is saving 2.5 petabytes per hour. They've got that much data in total, but not nearly that much per hour.
That's a huge amount of data. I ran an IT department for a small ($80 million a year) company for 30 years. I kept a record of everything for that time - every shipment; every day of earnings for every one of 400 employees; every check cut; every bit of production data.
30 years' worth of data. 2.5 petabytes is that much data - for every man, woman, and child in the United States - twice over.
I quite enjoyed your piece on data journalism. I fully relate to the point, made by one of your interviewees, that in order to accomplish anything you have to do as little as possible on a consistent basis. This is exactly how I've been able to finish two (going on four) books. When people ask me how I do it I tell them that I keep my expectations low. If I pen five pages a day, five days a week, I've written 300 pages in three months. You cannot say, "Today I'll write a book," but you can say, "Today I'll write five pages of a book." Furthermore, if you fail to reach five pages, it's not the end of the world -- whereas if you fail to write that book, or even to start it, it can feel like the walls are crumbling around you.