LISTSERV 16.5 - CODE4LIB Archives

I would also add somewhere links to the definitions / standards for
each of these file types. Not everyone who encounters MARC can be
expected to know all the other acronyms-as-file-formats.

cheers
stuart


--
...let us be heard from red core to black sky

On Tue, 16 Apr 2019 at 08:58, Kyle Banerjee wrote:
>
> On Mon, Apr 15, 2019 at 11:20 AM Thomas Dunbar wrote:
>
> > Hello everyone,
> >
> > I'm working on a proof of concept web application for common library data
> > conversions with support for large files.
> > The application is built using a serverless architecture, which allows me
> > to do this at scale and at low cost.
> >
>
> Love the concept -- I tried a few conversions, including some north of
> 200MB. Overall, it worked impressively. Not having to download software is
> cool because you don't always have the ability to download software or
> might need to do something using a cell phone.
>
> For me personally, the chief needs driving conversions are: 1) to perform
> fixes in a format that's easier to work with (e.g. no one fixes in binary
> MARC) and convert back; 2) analysis -- i.e. identify records/elements that
> have or don't have X; and 3) migrations (which have required further
> manipulation in every single case). In other words, manipulations and
> partial extractions. In the context of these use cases, delimited text,
> plain text, XML, MARC, and JSON (to a lesser extent) dominate conversion
> needs.
>
> Regarding the MARC to text conversion, delimited text conversions need a
> subdelimiter for repeated fields as this is what often must be loaded into
> another system, presented in a table to someone, etc. -- the current method
> which adds more lines will cause trouble for anyone without coding skills.
> On a related note, considering the indicators part of the field makes
> philosophical sense but it creates practical problems (especially with
> nonrepeatable fields). For example, it scatters the 245 titles across as many
> indicator variations as exist, making the simple task of generating a list
> of titles trickier than it should be. MARC already has a huge number of
> fields, so when the indicator permutations are combined with separate
> fielding for repeated fields, it takes no time at all to get many hundreds
> of fields with files that aren't that big -- something headache inducing
> even for those with mad skilz.
>
> One thing you'll want to think about as you develop the tool is what
> people use it to accomplish. In my experience, conversions set you up for
> what you were really doing rather than being objectives in their own right.
>
> But again, very cool.
>
> kyle
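
To make Kyle's point about subdelimiters concrete, here is a minimal sketch of one way a MARC-to-delimited-text export could collapse repeated fields into a single cell and key columns by tag alone, ignoring indicators. It is not how the tool in this thread works. The record layout (lists of tag, indicators, value triples), the function names, and the "|" subdelimiter are all assumptions made for illustration; a real converter would read the fields from MARC records with a proper parser.

```python
import csv
import io

SUBDELIM = "|"  # assumed subdelimiter; pick one that never occurs in the data


def flatten_record(fields, subdelim=SUBDELIM):
    """Collapse (tag, indicators, value) triples into one cell per tag.

    Indicators are deliberately ignored, so every 245 lands in the same
    column, and repeated tags are joined with the subdelimiter.
    """
    cells = {}
    for tag, _indicators, value in fields:
        cells.setdefault(tag, []).append(value)
    return {tag: subdelim.join(values) for tag, values in cells.items()}


def records_to_tsv(records, subdelim=SUBDELIM):
    """Return tab-delimited text with one row per record, one column per tag."""
    rows = [flatten_record(fields, subdelim) for fields in records]
    columns = sorted({tag for row in rows for tag in row})
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, delimiter="\t", lineterminator="\n", restval=""
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample data, not taken from any real file.
records = [
    [("245", "10", "Moby Dick"), ("650", " 0", "Whaling"), ("650", " 0", "Sea stories")],
    [("245", "14", "The Hobbit"), ("650", " 0", "Fantasy fiction")],
]

print(records_to_tsv(records))
```

With this layout a list of titles is just the 245 column, whatever the indicators were, and the two 650s in the first record share one cell as "Whaling|Sea stories". The trade-off is that the indicator values are lost, so a round trip back to MARC would need them kept in a separate column.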