Crowdsourcing the pain of transcribing audio

The trouble with recording interviews is that you have to transcribe them. So after one of my forays to New Haven last week, where I interviewed people in connection with a book I’m working on about community news sites, I had a ton of audio and the unpleasant task of translating it all to text.

I decided to crowdsource the task through an Amazon.com service called Mechanical Turk. More about that in a moment. But first I want to explain my reluctance to try it.

I think the results are better when I do it myself. I have to listen carefully, which helps me seal the best stuff inside my leaky brain. I know what we were talking about, which means that I’m not flummoxed by names and unusual phrases, as any transcriber would be. And because I have an idea of how I’ll use the material, I can decide on the spot what to transcribe verbatim, what to paraphrase and what to leave out altogether. So I knew I could potentially be giving up a lot by turning the task over to others.

Some years ago I used a transcription service near Harvard Square when time was of the essence and when, most important, someone else was paying the bill. This time, faced with many hours of work, I decided to take advice given to me last fall by Zach Seward and try MTurk. Seward, then with the Nieman Journalism Lab, told me that lab director Joshua Benton had used it to transcribe a talk by New York University’s Clay Shirky. I was impressed.

I posted a query on Twitter, and several people responded by sending me a link to an online guide by Andy Baio. I decided to try it with two interviews — a 65-minute recording with New Haven Independent founder and editor Paul Bass, made on his reasonably quiet back deck, and a 35-minute conversation with New Haven alderman Michael Jones, at an outdoor café on a busy street.

My first step was to go through the cumbersome process of converting my Olympus recorder’s WMA files to MP3s, and then dividing those MP3s into five-minute chunks so that a number of different people could apply themselves to the task. By the time I got around to doing the second interview, I had stumbled upon EasyWMA, a $10 utility that took the pain out of conversion, and had finally taught myself enough about Audacity, a free audio editor, so that I could painlessly produce five-minute bits.
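For anyone who would rather script the chopping than do it by hand, here is a minimal sketch of the five-minute split in Python. It is not what I did (I used EasyWMA and Audacity). It assumes the free ffmpeg command-line tool is installed, and the file names are hypothetical. The code only works out where each chunk starts and builds the command for it; it does not run anything.

```python
def chunk_plan(total_seconds, chunk_seconds=300):
    """Return (start, duration) pairs, in seconds, covering the whole recording."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("lengths must be positive")
    plan = []
    start = 0
    while start < total_seconds:
        # The last chunk may be shorter than the others.
        duration = min(chunk_seconds, total_seconds - start)
        plan.append((start, duration))
        start += chunk_seconds
    return plan


def ffmpeg_args(source, index, start, duration):
    """Build (but do not run) an ffmpeg command that writes one chunk as an MP3.

    The output name chunk_NN.mp3 is a hypothetical naming scheme.
    """
    return [
        "ffmpeg",
        "-i", source,          # input recording, e.g. a WMA file
        "-ss", str(start),     # where the chunk starts, in seconds
        "-t", str(duration),   # how long the chunk runs, in seconds
        f"chunk_{index:02d}.mp3",
    ]


if __name__ == "__main__":
    # A 65-minute recording, like the Bass interview.
    for i, (start, duration) in enumerate(chunk_plan(65 * 60), start=1):
        print(" ".join(ffmpeg_args("bass_interview.wma", i, start, duration)))
```

Each printed line could be pasted into a terminal, or the argument lists could be handed to Python's subprocess module. Either way, the conversion from WMA and the split happen in one pass.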

I was surprised by how quickly the crowd swarmed over my files — in less than a day, I had everything I needed. Unfortunately, the quality was extremely uneven. Some of the mistakes were bizarre or unintentionally hilarious. How “state of Connecticut” became “state of Kentuckian” is one I’ll never figure out. And here’s a choice excerpt from my conversation with Bass. First, the MTurk version:

They had a Sunocompass call with WBR few weeks ago to get the advice, how the membership strives. The taste and ever didn’t undership strives because I felt that if the widely suceessful they might get thirty to fourty thousand dollars.

Now, what he really said:

They had us on a conference call with WBUR few weeks ago to get advice on how to do membership drives. In the past I hadn’t done membership drives, because I felt that if they’re wildly successful they might get you to $30,000 or $40,000.

Following Baio’s advice, I’d set a price of $2 per five-minute excerpt. You have the option of rejecting unusually bad work, refusing to pay and letting someone else take a crack at it. I decided to accept everyone’s work, including the person who produced what you see above. But I blocked two people (including the one I just cited), so that if I use the service again, they won’t have a chance to work on my stuff.

Overall, I paid $41.80*, $3.80 of which went to Amazon, the remainder to the folks who actually did the work.

Between file conversion and preparation, downloading transcribed interviews, listening to everything again and cleaning up the transcripts, I don’t know how much time I saved. Not much, probably. Yesterday I transcribed two interviews myself, and I thought the results were much better.

On the other hand, I purposely chose my Bass interview for MTurk because it was long and he talks very quickly. It was also an unusually substantive conversation, and I knew there wasn’t much I wanted to leave out. Most of the transcribers did an OK job.

My bottom line is that, in the future, I would probably reserve MTurk for situations in which I have good audio quality and need a full verbatim transcript. Even knowing that I’ll have to do a fair amount of retyping, it’s still better than starting from scratch.

But if I’m producing normal interview notes, I’ll handle it myself.

*Addendum: Jack Shafer of Slate told me the price I cited doesn’t mean much without comparing it to the price of a professional transcription service. So I contacted a good one and was told it would cost about $140 an hour, or about $230 for my 100 minutes of recordings, nearly six times as much as what I paid. That’s a huge mark-up. On the other hand, the results would have been more usable.
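For the record, here is the back-of-the-envelope math behind that comparison, as a small Python sketch. The one assumption I am making is that the $140 quote is per hour of recorded audio, which is how the $230 figure works out. The recording lengths and the MTurk total come from the post above.

```python
def professional_quote(audio_minutes, rate_per_audio_hour):
    """Estimated cost of a professional service billed per hour of audio (assumed)."""
    return rate_per_audio_hour * audio_minutes / 60


audio_minutes = 65 + 35   # the Bass and Jones interviews
mturk_total = 41.80       # what I paid, including Amazon's $3.80 cut
pro_total = professional_quote(audio_minutes, 140.0)

ratio = pro_total / mturk_total
mturk_per_minute = mturk_total / audio_minutes

print(f"Professional estimate: ${pro_total:.2f}")
print(f"MTurk total: ${mturk_total:.2f} (${mturk_per_minute:.2f} per audio minute)")
print(f"Professional service costs {ratio:.1f} times as much")
```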

Illustration via Wikimedia Commons.

Patching in to AOL’s Patch (II)

Old friend Mark Leccese, blogging at Boston.com, offers further thoughts on the competition among Patch, GateHouse Media’s Wicked Local sites and Boston.com’s Your Town initiative.

Let me repeat: The most interesting local online journalism is taking place at the grassroots. And no one in Greater Boston does a better job of aggregating it than Adam Gaffin of Universal Hub. If you didn’t know that already, well, now you do.

(Disclosure: Media Nation is part of Gaffin’s Boston Blogs advertising network.)

Earlier item.

Patching in to AOL’s Patch

AOL’s local-news initiative, Patch, has been ramping up in Massachusetts in recent months. The effort deserves a full post, so consider this a placeholder. Universal Hub has been all over Patch, chronicling the departure of several GateHouse Media employees who’ve signed on as Patch editors.

My tendency is not to get too excited when a national corporation with no roots in journalism decides to take on hyperlocal news. There have simply been too many instances of the suits deciding that journalism isn’t as lucrative as they had hoped and then pulling the plug a year or two down the line.

Based on Arlington Patch, the sites seem attractive and easy to navigate, with a strong emphasis on community participation. But I don’t know that I see anything that would make me choose it over GateHouse’s Wicked Local Arlington site, or Boston.com’s Your Town page for Arlington.

Besides, I think online local news works best when it grows from the ground up. Local blogs vary wildly in quality. But I’d rather check in on Bob Sprague’s Your Arlington blog than spend my time with the progeny of Steve Case.

That said, it’s early. Maybe Patch will represent some sort of breakthrough. We’ll see.

George Steinbrenner, 1930-2010

The New York Times just confirmed that New York Yankees owner George Steinbrenner has died. Perhaps now we’ll learn why such a bombastic man was virtually silent during the last years of his life.

Probably no one did more to define the Yankees-Red Sox rivalry from the 1970s on than Steinbrenner. Loudmouthed bully, profligate spender, felonious friend of Richard Nixon — he always gave Red Sox fans someone to root against.

Derek Jeter liked him, so Steinbrenner must have had a good side as well.

Howie Carr actually finds a new line to cross*

There are certain ethical rules that journalists — even rabidly opinionated columnists — try to follow. You don’t donate money to candidates. You don’t put signs on your lawn. You don’t put bumper stickers on your car.

Then there’s Howie Carr, who’s speaking at a fundraiser on July 31 for the New Hampshire Republican State Committee. Such activities, unfortunately, have long since become acceptable for radio talk-show hosts, and that is Carr’s main job. But he’s still a columnist for the Boston Herald.

The Boston Globe has a great quote from Tom Fiedler, dean of Boston University’s College of Communication:

You cannot call yourself a journalist — even as a columnist — and actively support a political party. It strikes me that the Herald should now report Carr’s salary to the Federal Election Commission as a contribution to the GOP.

Is there anyone at One Herald Square who can tell Howie no?

*No, I wouldn’t be surprised if he’s done this before. But if he did, I didn’t know about it.

Time enters the reality-distortion zone

Back in February, we paid $20 for an 18-month subscription to the print edition of Time magazine. All right, it was a “professional” rate, available to us because I’m a journalism professor. But no one pays the full $4.95-per-issue cover price. If you sign up for a subscription online, for instance, you’ll be charged just $19.95 for six months.

So count Time Warner executives among those who have been sucked into Steve Jobs’ famed “reality-distortion zone.” Because they are groping their way toward a paid-content strategy for Time that makes little or no sense. As explained by the Nieman Journalism Lab, it includes these elements:

  • The magazine is now available as an iPad app costing a flat $4.99 per issue — no discounts, thank you very much. The same folks who understand fully that you won’t pay some $250 a year for the print edition think you’ll gladly fork over the money so that you’ll have something to read on your new toy.
  • The full content of the print edition has been pulled from Time.com, the magazine’s excellent website. There is still a lot of Web-only content available, much of it more, uh, timely and relevant than what appears in print. But when you try to access most articles from the print product, you get a summary and a plea to buy the magazine or the app.
  • The paid app is available only for the iPad, even though it would not be difficult to rewrite it for computers and other devices. (There is a Kindle app for Time that costs a far more reasonable $2.99 per month. Then again, what would Time be without great photography?)
  • The Web-only content is not included in the iPad app, which means that Time’s best customers will have to fire up Safari to see what they’re missing. And, of course, if there’s any Flash content on Time.com, they won’t be able to see it unless they switch to their computer. (There is some extra content included in the app.)

The folks at Time started with the right idea. Within the past year or so some pretty smart people have concluded that print and the Web should be used for different things, with the Web being used for breaking news, community and participation. Just as an experiment, it would be interesting to see whether Time could build a successful website without relying on content from the print edition.

But app fever is clouding Time’s judgment. The print edition arrives at Media Nation without fail every Saturday, and we didn’t even have to drop $500 on an iPad to get it. Slick as the app may be, it’s not as slick as glossy paper.

At the moment, Time is not offering a subscription to its app — it’s sold strictly on an issue-by-issue basis. When subscriptions do become available, Time ought to drop the price so that it’s the same as the print edition. Only then will we be able to see if there’s any demand.

Ellsbury speaks

Now that Jacoby Ellsbury has finally spoken out in his own defense, I just want to make a quick comment.

None of us has any idea — no, not even Kevin Youkilis — whether Ellsbury could have run hard, dived for balls and, especially, swung the bat properly if he had tried to play through the pain.

The guy is a 26-year-old star, well-liked by the fans. He’s never caused any trouble that we’re aware of. Does he want to play?

Good grief. Of course he wants to play.

And my guess is that a healthy Darnell McDonald is a better player than a hobbled Jacoby Ellsbury. So what’s the problem?