LISTSERV 16.5 - CODE4LIB Archives

A friendly and correct amendment.
Roy

> On Jan 10, 2017, at 6:12 PM, Stuart A. Yeates <[log in to unmask]> wrote:
> 
> That is, if the XML is completely consistent AND you're guaranteed to never
> encounter MARC data with XML special characters, then Kyle's suggestion is
> an excellent one.
> 
> I really need to find an excuse to publish a document with a title starting
> "<marc:datafield ..."
> 
> cheers
> stuart
> 
> --
> ...let us be heard from red core to black sky
> 
>> On Wed, Jan 11, 2017 at 12:06 PM, Roy Tennant <[log in to unmask]> wrote:
>> 
>> Well, I think that's a *bit* harsh. But the "YMMV" addition was
>> appreciated, because it can and will. That is, if the XML is completely
>> consistent, then Kyle's suggestion is an excellent one. If it isn't, then
>> Kevin's link applies, IMHO. Since it appears from what we have been told
>> that the records are consistent, I think Kyle's solution is not only
>> workable but the most efficient. Given the caveat stated above.
>> Roy
>> 
>>> On Jan 10, 2017, at 5:57 PM, Kevin S. Clarke <[log in to unmask]> wrote:
>>> 
>>> On the mention of parsing XML with string operations, I'm compelled to
>>> post one of my favorite StackOverflow responses:
>>> 
>>> http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
>>> 
>>> YMMV of course...
>>> 
>>> Kevin
>>> 
>>> 
>>> 
>>> -----Original message-----
>>>> From: Kyle Banerjee
>>>> Sent: Tuesday, January 10 2017, 5:44 pm
>>>> To: [log in to unmask]
>>>> Subject: Re: [CODE4LIB] MARCXML help again
>>>> 
>>>> Howdy Julie,
>>>> 
>>>> Depending on your specific needs, it's often easier/faster to use string
>>>> rather than XML operations to work with XML.
>>>> 
>>>> Especially if you have a large number of files and/or the files are very
>>>> big, stripping the whitespace between elements and then performing a simple
>>>> string substitution would be a fast, low-tech way to remove the unwanted
>>>> fields.
>>>> 
>>>> kyle
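
As a rough illustration of the string-substitution idea above, here is a minimal Python sketch. It assumes every unwanted field is serialized exactly as in the example record (same `marc:` prefix, attribute order, and quoting) and that none of the text needs XML escaping. The names `TARGET` and `strip_field` are made up for this example.

```python
import re

# The unwanted field with all inter-element whitespace removed.
# Assumption: every occurrence in the files is serialized identically.
TARGET = (
    '<marc:datafield tag="710" ind1="2" ind2=" ">'
    '<marc:subfield code="a">Faux College</marc:subfield>'
    '<marc:subfield code="b">Special Collections</marc:subfield>'
    '</marc:datafield>'
)


def strip_field(xml_text: str) -> str:
    """Collapse whitespace between tags, then delete exact copies of TARGET."""
    # Only whitespace sitting between a closing '>' and an opening '<' is
    # removed, so attribute values such as ind2=" " are left alone.
    compact = re.sub(r">\s+<", "><", xml_text)
    return compact.replace(TARGET, "")


sample = """<marc:record>
  <marc:datafield tag="710" ind1="2" ind2=" ">
          <marc:subfield code="a">Faux College</marc:subfield>
          <marc:subfield code="b">Special Collections</marc:subfield>
      </marc:datafield>
  <marc:datafield tag="710" ind1="2" ind2=" ">
    <marc:subfield code="a">Faux College</marc:subfield>
    <marc:subfield code="b">Library</marc:subfield>
  </marc:datafield>
</marc:record>"""

result = strip_field(sample)
```

If a field differs in any way, for example single-quoted attributes or attributes in another order, the substitution silently leaves it in place, so it is worth counting matches before and after.
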
>>>> 
>>>> On Tue, Jan 10, 2017 at 1:13 PM, Julie Swierczek <[log in to unmask]> wrote:
>>>> 
>>>>> Thanks to all who responded to my earlier plea for help.  I now have a new
>>>>> problem.  I'm not sure if I can do this with find and replace in Oxygen, or
>>>>> if this requires XSLT, or what.
>>>>> 
>>>>> I have a project of MARCXML records like this:
>>>>> 
>>>>> <?xml version="1.0" encoding="UTF-8" ?>
>>>>> <marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
>>>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>   xsi:schemaLocation="http://www.loc.gov/MARC21/slim
>>>>> http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
>>>>> <marc:record>
>>>>> <!--Lots of other datafields here -->
>>>>>   <marc:datafield tag="710" ind1="2" ind2=" ">
>>>>>     <marc:subfield code="a">Faux College</marc:subfield>
>>>>>     <marc:subfield code="b">Special Collections</marc:subfield>
>>>>>   </marc:datafield>
>>>>> </marc:record>
>>>>> </marc:collection>
>>>>> 
>>>>> I want to strip out all instances of:
>>>>>   <marc:datafield tag="710" ind1="2" ind2=" ">
>>>>>     <marc:subfield code="a">Faux College</marc:subfield>
>>>>>     <marc:subfield code="b">Special Collections</marc:subfield>
>>>>>   </marc:datafield>
>>>>> but I want to leave other <marc:datafield tag="710" ind1="2" ind2=" ">
>>>>> instances intact.  I only want to delete ones with both the Faux College
>>>>> and Special Collections text in the subfields.
>>>>> 
>>>>> Where would I go from here? I thought of doing an xsl:template match in an
>>>>> XSL stylesheet, and then not providing any instructions for replacing the
>>>>> match, but I don't know how to select for that specific text. My attempts
>>>>> to figure that out have not worked. You can only read so much W3C
>>>>> documentation and Stack Overflow before you need to just sit quietly and
>>>>> stare at a wall for a while.
>>>>> 
>>>>> Thanks in advance --
>>>>> 
>>>>> Julie
>>>>> 
>>>> 
>>
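
For comparison, here is a parser-based sketch of the removal described in the original question, using only Python's standard `xml.etree.ElementTree`. This is one possible approach, not something proposed in the thread. It assumes a field should be dropped only when it is a 710 whose subfields are exactly `a` = "Faux College" followed by `b` = "Special Collections". Indicators are not checked, and the helper names `is_unwanted` and `remove_unwanted` are hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_NS}
ET.register_namespace("marc", MARC_NS)  # keep the marc: prefix on output

# Assumption: the unwanted field has exactly these subfields, in this order.
UNWANTED = [("a", "Faux College"), ("b", "Special Collections")]


def is_unwanted(field: ET.Element) -> bool:
    """True for a 710 datafield whose subfields match UNWANTED exactly."""
    if field.get("tag") != "710":
        return False
    subfields = [(sf.get("code"), (sf.text or "").strip())
                 for sf in field.findall("marc:subfield", NS)]
    return subfields == UNWANTED


def remove_unwanted(root: ET.Element) -> int:
    """Remove matching datafields from every record; return how many were removed."""
    removed = 0
    for record in root.iter(f"{{{MARC_NS}}}record"):
        # Copy the list first so removal does not disturb iteration.
        for field in list(record.findall("marc:datafield", NS)):
            if is_unwanted(field):
                record.remove(field)
                removed += 1
    return removed


sample = """<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
<marc:record>
  <marc:datafield tag="710" ind1="2" ind2=" ">
    <marc:subfield code="a">Faux College</marc:subfield>
    <marc:subfield code="b">Special Collections</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="710" ind1="2" ind2=" ">
    <marc:subfield code="a">Faux College</marc:subfield>
    <marc:subfield code="b">Archives &amp; Records</marc:subfield>
  </marc:datafield>
</marc:record>
</marc:collection>"""

root = ET.fromstring(sample)
removed = remove_unwanted(root)
output = ET.tostring(root, encoding="unicode")
```

Because the comparison runs on parsed text rather than raw markup, differences in whitespace, attribute order, or character escaping do not affect the match. For real files, `ET.parse(path)` and `tree.write(path, encoding="UTF-8", xml_declaration=True)` would take the place of the inline sample.
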