On Mon, Apr 15, 2019 at 11:20 AM Thomas Dunbar wrote:

> Hello everyone,
>
> I'm working on a proof of concept web application for common library data
> conversions with support for large files.
> The application is built using a serverless architecture, which allows me
> to do this at scale and at low cost.
>

Love the concept -- I tried a few conversions, including some north of 200MB. Overall, it worked impressively. Not having to download software is cool because you don't always have the ability to install software, or you might need to do something from a cell phone.

For me personally, the chief needs driving conversions are: 1) to perform fixes in a format that's easier to work with (e.g. no one fixes in binary MARC) and convert back; 2) analysis -- i.e. identifying records/elements that have or don't have X; and 3) migrations (which have required further manipulation in every single case). In other words, manipulations and partial extractions. In the context of these use cases, delimited text, plain text, XML, MARC, and JSON (to a lesser extent) dominate conversion needs.

Regarding the MARC to text conversion, delimited text conversions need a subdelimiter for repeated fields, as the output is often what must be loaded into another system, presented in a table to someone, etc. -- the current method, which adds more lines, will cause trouble for anyone without coding skills.

On a related note, considering the indicators part of the field makes philosophical sense, but it creates practical problems (especially with nonrepeatable fields). For example, it scatters the 245 titles across as many columns as there are indicator variations, making the simple task of generating a list of titles trickier than it should be.
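
As a rough illustration of the two suggestions above (a subdelimiter for repeated fields, and one column per tag rather than per tag plus indicators), here is a minimal Python sketch. The in-memory record shape, the function names `flatten` and `to_delimited`, and the `|` subdelimiter are all assumptions made for the example; none of this reflects how the tool actually works:

```python
# Hypothetical in-memory record: a list of (tag, indicators, value) tuples.
# A real converter would parse these out of MARC; this shape is assumed.
record = [
    ("245", "10", "Moby Dick"),
    ("650", " 0", "Whaling"),
    ("650", " 0", "Sea stories"),
]


def flatten(fields, subdelim="|"):
    """Collapse a record to one column per tag, ignoring indicators.

    Repeated tags are joined with `subdelim` instead of spawning extra
    lines or extra columns.
    """
    columns = {}
    for tag, _indicators, value in fields:
        columns.setdefault(tag, []).append(value)
    return {tag: subdelim.join(values) for tag, values in columns.items()}


def to_delimited(records, tags, delim="\t", subdelim="|"):
    """Return a header line plus one line per record, as a single string."""
    lines = [delim.join(tags)]
    for fields in records:
        row = flatten(fields, subdelim)
        # Tags missing from a record become empty cells.
        lines.append(delim.join(row.get(tag, "") for tag in tags))
    return "\n".join(lines)


print(to_delimited([record], ["245", "650"]))
```

With this layout every title lands in a single 245 column regardless of indicators, and each record stays on one line. The sketch does not escape values that happen to contain the delimiter or subdelimiter, which a real conversion would have to handle.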
MARC already has a huge number of fields, so when the indicator permutations are combined with separate fielding for repeated fields, it takes no time at all to get many hundreds of fields from files that aren't that big -- something headache inducing even for those with mad skilz.

One thing you'll want to think about as you develop the tool is what people will use it to accomplish. In my experience, conversions set you up for what you're really trying to do rather than being objectives in their own right.

But again, very cool.

kyle