LISTSERV 16.5 - CODE4LIB Archives

Thanks Mark!

Paul Orkiszewski 
Coordinator of Library Technology Services / Associate Professor 
University Library 
Appalachian State University 
218 College Street
P.O. Box 32026
Boone, NC 28608-2026 

Phone: 828 262 6588 
Fax: 828 262 2797

On Oct 10, 2012, at 9:19 AM, Mark Canney wrote:

> If you're not already aware of it, you ought to take a look at Stories Matter (http://storytelling.concordia.ca/storiesmatter/announcing-stories-matter-v-1-6e/about-stories-matter), an open source oral history database tool developed at Concordia University in Canada. SM allows archiving of digital video and audio materials, enabling oral historians to annotate, analyze, etc.
> 
> 
> On 10/3/2012 6:22 AM, Gary McGath wrote:
>> On 10/2/12 8:44 AM, Paul Orkiszewski wrote:
>>> Hi 4libers,
>>> 
>>> Does anyone know of something - a kiosk, an iPad app, a web application
>>> - that:
>> I don't know of anything like it out there, but let's look at what it
>> might take. I've done some software work in connection with Harvard's
>> Iranian Oral History Project.
>> 
>>> - Initiates an oral history interview by getting demographic info and
>>> permission to use and stream for scholarly purposes.
>> I'm not sure what you're saying here. It sounds as if you're talking
>> about automated correspondence with the sources. That would be a huge
>> project in itself, so I assume you've got something more narrowly
>> focused in mind.
>> 
>>> - Goes through a standard set of questions (in our case stuff about the
>>> Appalachian State experience)
>> There are two pieces to this: Recording the responses and storing the
>> relevant metadata. The recording probably shouldn't be tied to a
>> specific device or application, since field work can involve a lot of
>> different conditions. The researcher in the field would want something
>> to enter the metadata (who, what, when, where); this would be a
>> straightforward piece.
>> 
>>> - Stores the metadata, permissions release, and pointers to the audio
>>> files created for each question in a dbase record
>> You don't say what the scope of the work is; from the way you're putting
>> the questions, I'm assuming it's a small-scale project with one
>> researcher doing the interviews and putting the information together.
>> Even so, it's probably best to have the field work be a separate
>> application from assembling the information in the database. If nothing
>> else, once you're at this point there's more standard software that can
>> be used.
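
[Illustrative sketch, not from the thread: one way the "dbase record" described above could be laid out, with one row per interview (metadata plus permissions release) and one row per question pointing at its audio file. It uses SQLite from the Python standard library; every table, column, and function name here is a hypothetical choice, not something the posters specified.]

```python
import sqlite3

def create_schema(conn):
    """Create hypothetical tables: one per interview, one per answered question."""
    conn.executescript("""
    CREATE TABLE interview (
        id INTEGER PRIMARY KEY,
        narrator TEXT NOT NULL,          -- who
        recorded_on TEXT NOT NULL,       -- when (ISO date string)
        location TEXT,                   -- where
        release_signed INTEGER NOT NULL DEFAULT 0  -- permissions release flag
    );
    CREATE TABLE response (
        id INTEGER PRIMARY KEY,
        interview_id INTEGER NOT NULL REFERENCES interview(id),
        question TEXT NOT NULL,
        audio_path TEXT NOT NULL,        -- pointer to the audio file, not the audio itself
        transcript TEXT                  -- filled in later by speech recognition
    );
    """)

def add_interview(conn, narrator, recorded_on, location, release_signed):
    """Insert one interview record and return its id."""
    cur = conn.execute(
        "INSERT INTO interview (narrator, recorded_on, location, release_signed) "
        "VALUES (?, ?, ?, ?)",
        (narrator, recorded_on, location, int(release_signed)),
    )
    return cur.lastrowid

def add_response(conn, interview_id, question, audio_path, transcript=None):
    """Insert one per-question response pointing at its audio file."""
    cur = conn.execute(
        "INSERT INTO response (interview_id, question, audio_path, transcript) "
        "VALUES (?, ?, ?, ?)",
        (interview_id, question, audio_path, transcript),
    )
    return cur.lastrowid
```

[Keeping the audio as files and storing only paths keeps the database small and leaves the recording step independent of any particular device, in line with the suggestion to separate field work from assembly.]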
>> 
>>> - Processes the audio through speech recognition either in real time or
>>> post-interview, and populates the dbase record with rendered text (at
>>> whatever level of accuracy)
>> You could do this piece with Dragon; see this page for some discussion:
>> 
>> http://www.nuance.com/dragon/transcription-solutions/index.htm
>> 
>> A friend of mine is an expert in this area and might be able to answer
>> some questions.
>> 
>>> - Provide a search interface, where the metadata, demographic info
>>> (within reasonable privacy limits), and the transcript (however garbled)
>>> are searchable.
>> I'd suggest basing something on Apache Lucene.
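
[Illustrative sketch, not from the thread: Lucene is a Java library built around an inverted index, which maps each term to the documents containing it. The toy Python class below shows only that core idea so the search requirement is concrete; it is a stand-in, not Lucene's API, and the class and method names are hypothetical. A real deployment would also want ranking, stemming, and fuzzy matching, which matter for garbled transcripts.]

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TranscriptIndex:
    """Toy inverted index: term -> set of document ids containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index one transcript (or metadata string) under doc_id."""
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        # Copy the first posting set so intersecting does not mutate the index.
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result
```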
>> 
>>> - Crowd source the improvement of the transcriptions over time
>> This needs to be better specified. One solution is to put the text onto
>> a wiki. If you're talking about integrating it into the application that
>> does all the rest, it could get messy.
>> 
>>> - Package the interface as an app, and set up a machine image on Amazon
>>> EC2, such that when someone uses the image and points a browser to it,
>>> it goes through a set up routine so that smaller schools and historical
>>> societies can set up their own sites in the cloud.  I haven't tried
>>> streaming on a free tier EC2 server, but you get 30 GB of storage, so
>>> you could get a fair number of hours of audio (depending on the
>>> settings) before you have to start paying.
>> This, I assume, is why you're talking about treating the whole thing as
>> a single application. Putting it all together would be a huge chunk of
>> work. Dragon's software isn't free, and I don't know of anything for
>> free that does decent speech transcription, so that would be a stumbling
>> block to making it available to other institutions.
>>> ?
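
[Illustrative sketch, not from the thread: the "fair number of hours of audio" in 30 GB can be estimated from the encoding bitrate. The function below assumes 1 GB = 10^9 bytes, a constant bitrate, and that all of the storage is available for audio (in practice the operating system and application take a share); the function name is hypothetical.]

```python
def hours_of_audio(storage_gb, bitrate_kbps):
    """Estimate hours of constant-bitrate audio that fit in storage_gb gigabytes."""
    bytes_available = storage_gb * 10**9
    # kilobits per second -> bytes per second -> bytes per hour
    bytes_per_hour = bitrate_kbps * 1000 / 8 * 3600
    return bytes_available / bytes_per_hour
```

[Under those assumptions, 30 GB holds roughly 1,000 hours of 64 kbps compressed speech, but under 50 hours of uncompressed CD-quality audio (1411.2 kbps), so the choice of recording settings dominates the estimate.]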
>>> 
>>> Anyone interested in trying it with me if there's nothing already out
>>> there?  I'm leaning toward iPad, so we'd need iOS, server admin, dbase,
>>> and media expertise.  I have newbie-but-getting-better skill in the last
>>> 3.  Zero skill in iOS.
>> I'm available for freelance work and it sounds very interesting, but
>> you've just outlined a huge project that would be a significant burden
>> even for the LoC's resources. That's not to say it can't be useful as a
>> blue-sky starting point for something more reasonable. If you have
>> funding, let's talk off-list. If you just want to continue blue-skying
>> the idea for a while, I'm glad to continue on-list (and I promise not to
>> bill you for that :).
>> 
>> 
> 
> -- 
> Mark Canney
> Manager, Lending Services
> Lehigh University Libraries
> 8A E. Packer Avenue
> Bethlehem, PA   18015-3170
> 610-758-3028