LISTSERV 16.5 - CODE4LIB Archives

On 10/2/12 8:44 AM, Paul Orkiszewski wrote:
> Hi 4libers,
> 
> Does anyone know of something - a kiosk, an iPad app, a web application
> - that:

I don't know of anything like it out there, but let's look at what it
might take. I've done some software work in connection with Harvard's
Iranian Oral History Project.

> - Initiates an oral history interview by getting demographic info and
> permission to use and stream for scholarly purposes.

I'm not sure what you're saying here. It sounds as if you're talking
about automated correspondence with the sources. That would be a huge
project in itself, so I assume you've got something more narrowly
focused in mind.

> - Goes through a standard set of questions (in our case stuff about the
> Appalachian State experience)

There are two pieces to this: Recording the responses and storing the
relevant metadata. The recording probably shouldn't be tied to a
specific device or application, since field work can involve a lot of
different conditions. The researcher in the field would want something
to enter the metadata (who, what, when, where); this would be a
straightforward piece.
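
To make that concrete, here is a minimal sketch of a metadata record kept
separate from whatever does the recording. The class name, field names, and
the JSON format are my own assumptions for illustration, not something from
an existing tool:

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class InterviewMetadata:
    """Hypothetical who/what/when/where record for one interview."""
    interviewee: str   # who
    topic: str         # what
    recorded_on: str   # when, as an ISO 8601 date string
    location: str      # where
    # Paths to audio files produced by any recorder; the record only points at them.
    audio_files: List[str] = field(default_factory=list)


def to_json(record: InterviewMetadata) -> str:
    """Serialize the record so it can travel alongside the audio files."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def from_json(text: str) -> InterviewMetadata:
    """Rebuild a record from the JSON produced by to_json."""
    return InterviewMetadata(**json.loads(text))
```

Because the record only stores file paths, the audio can come from any
device, and the same file can be loaded later by whatever assembles the
database.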

> - Stores the metadata, permissions release, and pointers to the audio
> files created for each question in a dbase record

You don't say what the scope of the work is; from the way you're putting
the questions, I'm assuming it's a small-scale project with one
researcher doing the interviews and putting the information together.
Even so, it's probably best to have the field work be a separate
application from assembling the information in the database. If nothing
else, once you're at this point there's more standard software that can
be used.
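
As one example of the "standard software" route, here is a rough sketch
using SQLite from Python. The table and column names are assumptions I made
up for illustration; a real project would settle these after deciding what
the release form and questions look like:

```python
import sqlite3

# Hypothetical schema: one row per interview, one row per answered question.
SCHEMA = """
CREATE TABLE IF NOT EXISTS interview (
    id             INTEGER PRIMARY KEY,
    interviewee    TEXT NOT NULL,
    recorded_on    TEXT NOT NULL,
    release_signed INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS response (
    id           INTEGER PRIMARY KEY,
    interview_id INTEGER NOT NULL REFERENCES interview(id),
    question     TEXT NOT NULL,
    audio_path   TEXT NOT NULL,  -- pointer to the audio file, not the audio itself
    transcript   TEXT            -- filled in later, at whatever accuracy
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_interview(conn, interviewee, recorded_on, release_signed):
    """Insert one interview and return its row id."""
    cur = conn.execute(
        "INSERT INTO interview (interviewee, recorded_on, release_signed) "
        "VALUES (?, ?, ?)",
        (interviewee, recorded_on, int(release_signed)),
    )
    conn.commit()
    return cur.lastrowid


def add_response(conn, interview_id, question, audio_path, transcript=None):
    """Attach one question's audio pointer (and optional transcript)."""
    cur = conn.execute(
        "INSERT INTO response (interview_id, question, audio_path, transcript) "
        "VALUES (?, ?, ?, ?)",
        (interview_id, question, audio_path, transcript),
    )
    conn.commit()
    return cur.lastrowid
```

The transcript column is left empty at first so the speech recognition step
can fill it in after the interview.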

> - Processes the audio through speech recognition either in real time or
> post-interview, and populates the dbase record with rendered text (at
> whatever level of accuracy)

You could do this piece with Dragon; see this page for some discussion:

http://www.nuance.com/dragon/transcription-solutions/index.htm

A friend of mine is an expert in this area and might be able to answer
some questions.

> - Provide a search interface, where the metadata, demographic info
> (within reasonable privacy limits), and the transcript (however garbled)
> is searchable.

I'd suggest basing something on Apache Lucene.
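
To show the idea rather than Lucene's actual API, here is a toy inverted
index in Python. It is only a sketch of the general technique that search
engines like Lucene build on; the class and function names are hypothetical,
and it has none of Lucene's ranking, stemming, or fuzzy matching:

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TranscriptIndex:
    """Toy inverted index: maps each term to the set of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index one transcript (or metadata blob) under doc_id."""
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

With garbled transcripts, exact term matching like this will miss a lot,
which is one reason to use a real search library rather than rolling your
own.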

> - Crowd source the improvement of the transcriptions over time

This needs to be better specified. One solution is to put the text onto
a wiki. If you're talking about integrating it into the application that
does all the rest, it could get messy.

> - Package the interface as an app, and set up a machine image on Amazon
> EC2, such that when someone uses the image and points a browser to it,
> it goes through a set up routine so that smaller schools and historical
> societies can set up their own sites in the cloud.  I haven't tried
> streaming on a free tier EC2 server, but you get 30 GB of storage, so
> you could get a fair number of hours of audio (depending on the
> settings) before you have to start paying.

This, I assume, is why you're talking about treating the whole thing as
a single application. Putting it all together would be a huge chunk of
work. Dragon's software isn't free, and I don't know of anything for
free that does decent speech transcription, so that would be a stumbling
block to making it available to other institutions.
> 
> ?
> 
> Anyone interested in trying it with me if there's nothing already out
> there?  I'm leaning toward iPad, so we'd need iOS, server admin, dbase,
> and media expertise.  I have newbie-but-getting-better skill in the last
> 3.  Zero skill in iOS.

I'm available for freelance work and it sounds very interesting, but
you've just outlined a huge project that would be a significant burden
even for the LoC's resources. That's not to say it can't be useful as a
blue-sky starting point for something more reasonable. If you have
funding, let's talk off-list. If you just want to continue blue-skying
the idea for a while, I'm glad to continue on-list (and I promise not to
bill you for that :).


-- 
Gary McGath, Professional Software Developer