A lot of modern systems won't load externally defined entities (or will
limit them somehow) because of the denial-of-service attack that is
possible.  Search for "XML Entity Reference Denial of Service". I can't
remember if PUBLIC declarations are treated any differently from SYSTEM
ones. (I would have suspected SYSTEM ones to be trusted more, but they'd
still be exploitable by the same bug.)


(There are also a fair number of other errors, so I'm somewhat skeptical
that the example worked in many browsers even then. It's possible IE was
flexible enough that it would have worked.)

One thing you might want to do is take out the entities.

I can't remember why I had to do this, but xmllint seemed to do the trick.
(I found a snippet at
http://stackoverflow.com/questions/614067/how-to-resolve-all-entity-references-in-xml-and-create-a-new-xml-in-c,
but it's missing the necessary --loaddtd.)

xmllint --loaddtd --noent --dropdtd FRONT.xml > FRONT_nodtdent.xml
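
Since each thesis is a collection of xml files, here is a rough sketch of
running the same command over a whole unzipped folder. I have not run it
against your files. It assumes xmllint (from libxml2) is installed and that
the dtd sits in the same folder as the xml files; the function name and the
directory names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: resolve entities in every .xml file of one folder.
# resolve_entities is a hypothetical helper name, not part of xmllint.
resolve_entities() {
    src_dir=$1   # folder holding the original xml files and the dtd
    out_dir=$2   # folder that receives the entity-free copies
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.xml; do
        [ -e "$f" ] || continue   # nothing matched the glob
        # --loaddtd  read the external dtd so the entity definitions are known
        # --noent    substitute entity references with their values
        # --dropdtd  leave the DOCTYPE out of the output
        xmllint --loaddtd --noent --dropdtd "$f" > "$out_dir/$(basename "$f")" \
            || echo "xmllint reported a problem with $f" >&2
    done
}
```

You would call it as something like: resolve_entities thesis_files resolved
The originals are left untouched, so you can compare the output before
replacing anything in the repository.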

You don't really need the dtd for validation anyway, particularly since,
given the errors, I suspect the files may not validate anyhow.

It might make the files a little harder to read when reading the raw
source, but I suspect that's not typically a problem.

Jon Gorman
University of Illinois



On Mon, Dec 9, 2013 at 2:10 PM, Robertson, Wendy C <
[log in to unmask]> wrote:

> Back in 1999-2002 a handful of our theses were submitted  as a collection
> of xml files.  We posted the files in our repository several years ago (we
> posted a zipped folder with all the files).  At that time, if you opened
> front.xml you would be able to access the thesis. We have not touched the
> files in the close to 5 years since we posted them, but the files no longer
> open correctly. One of the problem theses is http://ir.uiowa.edu/etd/189/.
>
> Front.xml begins
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml:stylesheet type="text/css" href="UIowa2K1.css" ?>
> <!DOCTYPE thesis SYSTEM "UIowa2K.dtd">
>
> I have tried the following changes but they do not help
>
> 1)      Adding standalone="no"? to the xml declaration  -- <?xml
> version="1.0" " encoding="UTF-8" standalone="no"?>
>
> 2)      Changing the case of "UIowa2K1.css" and "UIowa2K.dtd" to match the
> files (which are in all caps)
>
> 3)      Changing xml:stylesheet to xml-stylesheet
>
> Chrome shows errors that entities are not defined, but they are defined in
> the dtd.
>
> I would appreciate any assistance in making these documents available
> again. Thanks!
>
> Wendy Robertson
> Digital Scholarship Librarian *  The University of Iowa Libraries
> 1015 Main Library  *  Iowa City, Iowa 52242
> [log in to unmask] * 319-335-5821
>