Thanks all! Yes, I was expecting to need to replace those text strings with the numeric entities.

Wendy

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Roy Tennant
Sent: Monday, December 09, 2013 7:48 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] problem in old etd xml files

For my money, the text transform should look only for exact matches (e.g., "&aacute;", "&nbsp;", "&copy;") and replace them with their numeric counterparts.
Roy

On Mon, Dec 9, 2013 at 5:41 PM, jason bengtson <[log in to unmask]> wrote:

> For testing purposes I just nixed them. As I noted, to rework the file
> a person would probably want to use a more critical eye with find and
> replace. Totally doable.
>
>
> On Dec 9, 2013, at 7:37 PM, Jon Gorman <[log in to unmask]> wrote:
>
> > How did you fix the ampersands? I ask, because if you just did a
> > simple text transform from & to &amp;, it would mask the problem of
> > the entity escaping, I think...
> >
> > Not at work, so I don't have a good example and the file is
> > downloading very slowly here, so I'll try to do one from memory.
> >
> > There were several &aacute; in the XML which mapped to an accented
> > character in the DTD via the entity.
> >
> > If you just substituted & with &amp;, you'd get &amp;aacute;, which
> > would render inline as &aacute;. It would superficially solve the
> > issue, since browsers would no longer give the errors about the DTD
> > because they wouldn't be trying to load entities from the DTDs. And
> > depending on how you did it, you likely could also replace a
> > correctly encoded one to make &amp;amp;, leading to some very odd
> > stuff.
> >
> > I wouldn't be surprised to find some unescaped ampersands, but the
> > solution I posted will essentially replace the entities with their
> > text, hopefully causing most characters to appear correctly. You
> > definitely still need to fix some of the other stuff. (I suspect it
> > never worked for most browsers and XML systems, most likely only IE.)
> >
> > Jon Gorman
> > University of Illinois
>
> Best regards,
>
> Jason Bengtson, MLIS, MA
> Head of Library Computing and Information Systems
> Assistant Professor, Graduate College
> Department of Health Sciences Library and Information Management
> University of Oklahoma Health Sciences Center
> 405-271-2285, opt. 5
> 405-271-3297 (fax)
> [log in to unmask]
> http://library.ouhsc.edu
> www.jasonbengtson.com
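
To illustrate the two approaches discussed in this thread, here is a minimal Python sketch. It is not the transform anyone on the list actually ran. The function names (named_to_numeric, naive_escape) are made up for this example, and it assumes the entity names in the ETD files match the HTML 4 names in Python's standard html.entities.name2codepoint table; any entity declared only in the ETD DTD would need to be added by hand.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser already knows; leave them untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# An entity reference is '&', a name, then ';'. A bare ampersand does not match.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace known named entities (e.g. &aacute;) with numeric ones (&#225;).

    Assumption: entity names follow the HTML 4 table in name2codepoint.
    """
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: leave as-is
        return "&#%d;" % name2codepoint[name]

    return ENTITY_RE.sub(swap, text)


def naive_escape(text):
    """The blanket approach warned about above: escape every ampersand."""
    return text.replace("&", "&amp;")


if __name__ == "__main__":
    sample = "caf&aacute;&nbsp;&copy; A &amp; B"
    print(named_to_numeric(sample))  # caf&#225;&#160;&#169; A &amp; B
    print(naive_escape(sample))      # caf&amp;aacute;&amp;nbsp;&amp;copy; A &amp;amp; B
```

The exact-match version only touches complete entity references it recognizes, so stray unescaped ampersands and unknown entities stay visible for a later manual pass instead of being silently masked. The blanket version double-escapes the already correct &amp; and turns every named entity into literal text.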