Thanks, Shaun and Terry. I'll pass this info along. Terry, I may have Tyson
contact you directly if he has questions. I look forward to seeing your
lightning talk!
Carmen
On Tue, Mar 19, 2013 at 2:09 PM, Shaun Ellis wrote:
> Carmen,
> If you are only interested in de-duping and assessing file size, it may be
> overkill. Picasa has some good organizing and browsing features. Your
> developer may want to look at the Picasa (Desktop Client) Button API, which
> can kick off scripts for processing selected photos:
> https://developers.google.com/picasa/docs/button_api
>
> -Shaun
>
>
> On 3/19/13 4:51 PM, Carmen Mitchell wrote:
>
>> Hello Code4Libbers,
>>
>> I'm working with a faculty member and trying to help them to formalize
>> their data collection practices. Part of this process is also going through
>> old data and trying to assess what they currently have. This particular
>> faculty member has been doing research for 10 years without any kind of
>> structure or regular method. So far we have over 2 TB of data in various
>> states. (With more to come.)
>>
>> I've got a programmer working with me to:
>> a) identify file types
>> b) count how many files of each type
>>
>> We are now working on de-duping and assessing file size, focusing on the
>> JPEGs first. With over 300,000 of them...it might take a while. (Of
>> course they aren't following any kind of file naming structure,
>> either...It's a mess.)
>>
>> Any tips or tricks or tools that you might know of to help speed up this
>> process? Is there a good image recognition tool that you could suggest that
>> would help us with automation?
>>
>> Thanks,
>>
>> Carmen Mitchell
>> Institutional Repository Librarian
>> Cal State San Marcos
>>
>
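
For the steps described in the original message (counting files by type, then de-duping and sizing the JPEGs), a minimal Python sketch might look like the following. It is not the approach anyone in this thread actually used. The function names `sha256_of` and `survey` and the extension list are illustrative assumptions. It compares file sizes first and only hashes files that share a size, which should avoid reading most of the 300,000 files in full. It finds byte-for-byte duplicates only; resized or re-encoded copies of the same photo would need an image-similarity tool, which is not shown here.

```python
import hashlib
import os
from collections import Counter, defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def survey(root, extensions=(".jpg", ".jpeg")):
    """Return (files per extension, total bytes of matching files, duplicate groups).

    `extensions` is an assumed list of lowercase suffixes to treat as JPEGs.
    """
    counts = Counter()
    by_size = defaultdict(list)
    total_bytes = 0
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            ext = os.path.splitext(name)[1].lower()
            counts[ext] += 1
            if ext in extensions:
                size = os.path.getsize(path)
                total_bytes += size
                by_size[size].append(path)

    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot be a byte-for-byte duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[sha256_of(path)].append(path)
        duplicates.extend(sorted(group) for group in by_hash.values() if len(group) > 1)
    return counts, total_bytes, duplicates
```

Calling `survey("/path/to/data")` on a hypothetical data directory returns the per-extension counts, the total JPEG size in bytes, and a list of groups of identical files. Nothing is deleted, so the groups can be reviewed before any cleanup.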