On the mention of parsing XML with string operations, I'm compelled to post one of my favorite StackOverflow responses:

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

YMMV of course...

Kevin

-----Original message-----
> From: Kyle Banerjee
> Sent: Tuesday, January 10 2017, 5:44 pm
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] MARCXML help again
>
> Howdy Julie,
>
> Depending on your specific needs, it's often easier/faster to use string
> rather than XML operations to work with XML.
>
> Especially if you have a large number of files and/or the files are very
> big, stripping the whitespace between elements and then performing a simple
> string substitution would be a fast low tech way to remove the unwanted
> fields.
>
> kyle
>
> On Tue, Jan 10, 2017 at 1:13 PM, Julie Swierczek <[log in to unmask]>
> wrote:
>
> > Thanks to all who responded to my earlier plea for help. I now have a new
> > problem. I'm not sure if I can do this with find and replace in Oxygen, or
> > if this requires XSLT, or what.
> >
> > I have a project of MARCXML records like this:
> >
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
> >     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >     xsi:schemaLocation="http://www.loc.gov/MARC21/slim
> >     http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
> >   <marc:record>
> >     <!--Lots of other datafields here -->
> >     <marc:datafield tag="710" ind1="2" ind2=" ">
> >       <marc:subfield code="a">Faux College</marc:subfield>
> >       <marc:subfield code="b">Special Collections</marc:subfield>
> >     </marc:datafield>
> >   </marc:record>
> > </marc:collection>
> >
> > I want to strip out all instances of:
> >     <marc:datafield tag="710" ind1="2" ind2=" ">
> >       <marc:subfield code="a">Faux College</marc:subfield>
> >       <marc:subfield code="b">Special Collections</marc:subfield>
> >     </marc:datafield>
> > but I want to leave other <marc:datafield tag="710" ind1="2" ind2=" ">
> > instances intact. I only want to delete ones with both the Faux College
> > and Special Collections text in the subfields.
> >
> > Where would I go from here? I thought of doing an xsl:template match in an
> > XSL stylesheet, and then not providing any instructions for replacing the
> > match, but I don't know how to select for that specific text. My attempts
> > to figure that out have not worked. You can only read so much W3C
> > documentation and Stack Overflow before you need to just sit quietly and
> > stare at a wall for a while.
> >
> > Thanks in advance --
> >
> > Julie
> >
>
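
For what it's worth, here is a minimal sketch of the parser-based route in Python, using only the standard library's xml.etree.ElementTree. It is not something anyone in this thread posted: the function names (subfield_values, is_unwanted, strip_unwanted) and the sample record are made up for illustration, and it assumes the subfield text matches exactly once surrounding whitespace is trimmed. It removes a 710 field only when it carries both the "Faux College" $a and the "Special Collections" $b, and leaves every other 710 alone.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
# Keep the "marc:" prefix when the tree is written back out.
ET.register_namespace("marc", MARC_NS)


def subfield_values(datafield, code):
    """Return the trimmed text of every subfield with the given code."""
    return [
        (sf.text or "").strip()
        for sf in datafield.findall(f"{{{MARC_NS}}}subfield")
        if sf.get("code") == code
    ]


def is_unwanted(datafield):
    """True only for a 710 that has BOTH target subfield values."""
    return (
        datafield.get("tag") == "710"
        and "Faux College" in subfield_values(datafield, "a")
        and "Special Collections" in subfield_values(datafield, "b")
    )


def strip_unwanted(root):
    """Remove matching datafields from every record; return how many went."""
    removed = 0
    for record in root.iter(f"{{{MARC_NS}}}record"):
        # Copy the list first so removing items does not disturb iteration.
        for datafield in list(record.findall(f"{{{MARC_NS}}}datafield")):
            if is_unwanted(datafield):
                record.remove(datafield)
                removed += 1
    return removed


# Hypothetical sample data: one 710 to delete, one 710 to keep, one 245.
SAMPLE = """<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:record>
    <marc:datafield tag="245" ind1="1" ind2="0">
      <marc:subfield code="a">An example title</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="710" ind1="2" ind2=" ">
      <marc:subfield code="a">Faux College</marc:subfield>
      <marc:subfield code="b">Special Collections</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="710" ind1="2" ind2=" ">
      <marc:subfield code="a">Faux College</marc:subfield>
      <marc:subfield code="b">Library</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>"""

root = ET.fromstring(SAMPLE)
removed = strip_unwanted(root)
print(f"removed {removed} datafield(s)")
print(ET.tostring(root, encoding="unicode"))
```

For real files you would presumably swap ET.fromstring for ET.parse(path) and write the result back with tree.write(...). Be aware that ElementTree's default parser drops comments and may rename namespace prefixes that have not been registered (xsi, for example), so diff the output before trusting it. The same test can also be expressed in XSLT as an empty template match alongside an identity template, which is the approach Julie was describing.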