Hi Manuela,

Personally, I’d recommend storing and managing your METS documents as XML and converting to JSON only at the point where you need to process them as JSON. That assumes you have the option of storing them as XML. If your database is built mainly or only for JSON, then you’ll have to do a one-time conversion from XML to JSON. In any case, as has already been said, it’s strictly a matter of conversion, and there are lots of tools that can take XML and output generic JSON in a lossless way. Whether you would then need another process to go from JSON to JSON-LD, I don’t know; that’s outside my knowledge.
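
To illustrate what a generic XML-to-JSON conversion can look like, here is a minimal sketch using only the Python standard library. The function names (element_to_dict, xml_to_json) and the sample document are made up for this example, and the output shape is just one possible convention. It keeps element order, tags, attributes and non-blank text, but ElementTree's default parser does not preserve comments, processing instructions or the original namespace prefixes, so treat it as a starting point rather than a fully lossless converter.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(el):
    """Convert one element to a dict holding its tag, attributes, text and children."""
    node = {'tag': el.tag}  # namespaced tags come out as '{uri}localname'
    if el.attrib:
        node['attributes'] = dict(el.attrib)
    if el.text and el.text.strip():
        node['text'] = el.text
    if el.tail and el.tail.strip():
        node['tail'] = el.tail  # text that follows the element (mixed content)
    children = [element_to_dict(child) for child in el]
    if children:
        node['children'] = children  # a list, so document order is kept
    return node


def xml_to_json(xml_text):
    """Parse an XML string and return a generic JSON rendering of it."""
    return json.dumps(element_to_dict(ET.fromstring(xml_text)), indent=2)


# Hypothetical, heavily trimmed METS fragment used only for illustration
sample = (
    '<mets xmlns="http://www.loc.gov/METS/">'
    '<dmdSec ID="dmd1"><mdWrap MDTYPE="MODS"/></dmdSec>'
    '</mets>'
)
print(xml_to_json(sample))
```

The result is plain JSON, not JSON-LD: it carries no @context, so mapping the keys to a vocabulary would be a separate step.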

Hope this helps,
Greg


Gregory Murray
Director of Digital Initiatives
Wright Library
Princeton Theological Seminary


From: Code for Libraries <[log in to unmask]> on behalf of parker, anson D (adp6j) <[log in to unmask]>
Date: Tuesday, August 1, 2023 at 10:55 AM
To: [log in to unmask] <[log in to unmask]>
Subject: Re: [CODE4LIB] METS in JSON-LD?

might be worth spending a minute on some AI bots to play with this

at the end of the day it's an XML->JSON project and there are a bunch of tools that will streamline that

for instance here's a dumbed down python script with a streamlit interface i got out of claude.ai in a couple of queries


import streamlit as st
import xml.etree.ElementTree as ET
import json

st.title('METS to JSON-LD Converter')

uploaded_file = st.file_uploader('Choose a METS XML file', type=['xml'])

if uploaded_file is not None:
    # Load METS file
    tree = ET.parse(uploaded_file)
    root = tree.getroot()

    # JSON-LD context
    context = {
        '@vocab': 'http://schema.org/',
        'dc': 'http://purl.org/dc/elements/1.1/'
    }

    jsonld = {'@context': context}
    jsonld['metadata'] = []

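    # METS and MODS elements are namespaced, so bare tag names never match
    ns = {
        'mets': 'http://www.loc.gov/METS/',
        'mods': 'http://www.loc.gov/mods/v3'
    }

    def text_of(parent, path):
        # Return the element text, or None when the element is absent
        el = parent.find(path, ns)
        return el.text if el is not None else None

    # Parse METS and generate JSON-LD
    for dmdSec in root.iter('{http://www.loc.gov/METS/}dmdSec'):
        mods = dmdSec.find('mets:mdWrap/mets:xmlData/mods:mods', ns)
        if mods is None:
            continue  # skip descriptive sections that do not wrap MODS

        # Extract metadata
        md = {
            'name': text_of(mods, 'mods:titleInfo/mods:title'),
            'author': [{'@type': 'Person', 'name': namePart.text} for namePart in mods.findall('mods:name/mods:namePart', ns)],
            'datePublished': text_of(mods, 'mods:originInfo/mods:dateIssued'),
            'publisher': {'@type': 'Organization', 'name': text_of(mods, 'mods:originInfo/mods:publisher')},
            'genre': text_of(mods, 'mods:genre'),
            'description': text_of(mods, 'mods:abstract')
        }

        jsonld['metadata'].append(md)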

    # Output JSON-LD file for download
    json_txt = json.dumps(jsonld, indent=4)
    st.download_button('Download JSON-LD', json_txt, 'metadata.jsonld')
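
For trying the extraction without Streamlit, the same idea can be sketched as a plain function run on an inline sample. This is only an illustration under stated assumptions: the function name mets_to_jsonld and the sample record are invented here, only the title and names are extracted, and it assumes the descriptive metadata is MODS version 3 wrapped in mdWrap/xmlData. The namespace map matters because METS and MODS elements are namespaced, so lookups by bare tag name do not match them.

```python
import json
import xml.etree.ElementTree as ET

# Namespace URIs for METS and MODS version 3
NS = {'mets': 'http://www.loc.gov/METS/', 'mods': 'http://www.loc.gov/mods/v3'}

# Hypothetical minimal record, invented for this example
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="dmd1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods>
          <mods:titleInfo><mods:title>Sample title</mods:title></mods:titleInfo>
          <mods:name><mods:namePart>Doe, Jane</mods:namePart></mods:name>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def mets_to_jsonld(xml_text):
    """Build a small JSON-LD-style dict from the MODS records inside a METS string."""
    root = ET.fromstring(xml_text)
    records = []
    # Walk each MODS record wrapped in a dmdSec directly under the METS root
    for mods in root.iterfind('mets:dmdSec/mets:mdWrap/mets:xmlData/mods:mods', NS):
        records.append({
            # findtext returns None instead of raising when the title is missing
            'name': mods.findtext('mods:titleInfo/mods:title', default=None, namespaces=NS),
            'author': [{'@type': 'Person', 'name': part.text}
                       for part in mods.iterfind('mods:name/mods:namePart', NS)],
        })
    return {'@context': {'@vocab': 'http://schema.org/'}, 'metadata': records}


print(json.dumps(mets_to_jsonld(SAMPLE), indent=4))
```

Keeping the conversion in a function separate from the interface also makes it easy to check against a handful of real records before wiring it back into an upload form.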

________________________________________
From: Code for Libraries <[log in to unmask]> on behalf of Manuela Pallotto Strickland <[log in to unmask]>
Sent: Tuesday, August 1, 2023 10:47 AM
To: [log in to unmask]
Subject: [CODE4LIB] METS in JSON-LD?

Hello,
I am posting this question on a couple of lists, so sincere apologies to those who might see it twice (or thrice).
Does anyone know of any work that has been/is being/will be done on 'a' METS JSON-LD serialization?
Any relevant info or comment in this re will be very much appreciated.
Thank you!
Best wishes,
Manuela


__________________________________________________________________________________

Dr Manuela Pallotto Strickland | Metadata and Digital Preservation Coordinator | Archives & Research Collections | Libraries & Collections
King's College London | Strand | London WC2R 2LS | [log in to unmask]
Tel: Please call me using MS Teams or Skype for Business, or email to arrange a call

W: https://www.kcl.ac.uk/library/collections/archives
T: twitter.com/KingsArchives and twitter.com/kingslibraries
Blog: blogs.kcl.ac.uk/kingscollections