On Feb 27, 2012, at 1:52 PM, Suchy, Daniel wrote:
> Hello all,
> At my campus we offer podcasts of course lectures, recorded in class and then delivered via iTunes and as a plain Mp3 download (http://podcast.ucsd.edu). I have the new responsibility of figuring out how to transcribe text versions of these audio podcasts for folks with hearing issues.
> I was wondering if any of you are using or have played with dictation/transcription software and can recommend or de-recommend any? My first inclination is to go with open-source, but I'm open to anything that works well and can scale to handle hundreds of courses.
I remember seeing a poster on a wall at the University of Maryland presenting grant-funded work on this sort of thing ... but I think it was for intelligence intercepts, as it was DoD funded and being used for Arabic.
This might've been the project:
Global Autonomous Language Exploitation
I have no idea why it's on a UPenn website, but it's listed at:
And one of the researchers is Doug Oard, which matches what I remembered.
It might've also been "Supporting Information Access Using Computational Linguistics", which was also DoD funded, but doesn't have a website link in that list. And they didn't verify the links to faculty pages, so try one of the links to 'Douglas Oard' rather than 'Douglas Ward' if you want to contact him.
I also don't know if they were doing full transcription / translation, or if they were just looking for specific words to alert a human translator to review it.
Also, in the earlier list that Todd linked to, Zooniverse was mentioned. They have a framework for Mechanical Turk-type stuff, but they tend to be science oriented, and I don't know if they've ever done audio transcription. It's not exactly what they deal with, but they might be interested in helping: at the 2010 DCC, someone said they had the problem of not enough work for their volunteers to do (although that might've changed since then).