On Tue, Jan 10, 2017 at 3:34 PM, Kevin S. Clarke <[log in to unmask]>
wrote:

> Yes, the YMMV was intentional because people do do it and we all have our
> own paths to follow, but there are a lot of "if"s that need to be in place
> (a few of which you mention) to make it a painless process.
>

Many systems and data sources that libraries rely on occasionally spit out
invalid XML -- this causes parsers to choke. DOM ain't much fun when
dealing with giant files or millions of records that contain issues.

The only really important "if" with regard to string parsing is that you
understand the data that you're working with -- which you typically do in
library settings. Special characters, inconsistencies that follow any kind
of pattern, structural errors that XML parsers can't handle, and other
issues are no big deal if you know what you're up against and plan
accordingly.
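
A rough sketch of what "plan accordingly" can look like, in Python. Everything
here is made up for illustration: the FEED sample, the <record>/<title>
element names, and the assumption that bare ampersands are the one known
problem in the data. It cuts the file into records with plain string matching,
repairs the known issue, and only then hands each small chunk to a parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed: the second record contains a bare "&", so the
# document as a whole is not well-formed XML.
FEED = (
    "<records>"
    "<record><title>Moby Dick</title></record>"
    "<record><title>Pride & Prejudice</title></record>"
    "<record><title>Walden</title></record>"
    "</records>"
)

# Assumes records are never nested, so a non-greedy match is enough.
RECORD_RE = re.compile(r"<record>.*?</record>", re.DOTALL)
# An ampersand that does not start a predefined entity or character reference.
BARE_AMP_RE = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)")


def parse_whole(feed):
    """Return True if a strict parser accepts the whole feed."""
    try:
        ET.fromstring(feed)
        return True
    except ET.ParseError:
        return False


def extract_titles(feed):
    """Split on record boundaries as strings, repair, then parse each piece."""
    titles = []
    for chunk in RECORD_RE.findall(feed):
        repaired = BARE_AMP_RE.sub("&amp;", chunk)
        try:
            record = ET.fromstring(repaired)
        except ET.ParseError:
            continue  # a problem we did not plan for; skip this record only
        titles.append(record.findtext("title"))
    return titles


if __name__ == "__main__":
    print(parse_whole(FEED))
    print(extract_titles(FEED))
```

The point of the design is that one bad record costs you that record at most,
not the whole file. It only works if the record boundaries and the error
pattern really are what you think they are.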

Sometimes going the XML route is less painful, sometimes it's much more
painful. It just depends on what you're doing.

kyle