On Jan 4, 2005, at 5:56 PM, Ed Summers wrote:

>> Can you offer advice on going/not going fully with XML for the
>> storage mechanism?
>
> I tend to use XML as a transmission format: for serializing the
> contents of a database for consumption by a third party. Internally
> I use a relational db as a foundation for services I want to
> provide. So the rdbms serves as the primary data source.

I tend to agree with Ed, although this debate has been going on since
the inception of XML. There is no one correct answer.

I like using XML as the archival format for my text-based documents.
This means I like to use TEI and XHTML as the basis of my writings and
electronic texts. This technique allows me to separate my content from
a specific application and/or operating system. I should be able to
read these XML documents for years to come.

Ironically, I use database applications to build the XML files. I use
MySQL but stay away from MySQL-isms such as the auto-increment
feature. This allows me to use tools like mysqldump to create .sql
files. Ideally I should be able to import these .sql files into other
database applications. In reality, this is not always the case unless
I tweak some of the SQL commands in the files, but since I'm dealing
with plain text, this is not too difficult.
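
As a rough illustration of building XML from a database, here is a
minimal sketch in Python. It assumes an in-memory SQLite database in
place of MySQL, and the table, column, and element names are
hypothetical. It is not my actual setup, just the shape of the idea:

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn):
    """Serialize every row of the (hypothetical) texts table as XML."""
    root = ET.Element("texts")
    query = "SELECT id, title, author FROM texts ORDER BY id"
    for text_id, title, author in conn.execute(query):
        item = ET.SubElement(root, "text", id=str(text_id))
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "author").text = author
    return ET.tostring(root, encoding="unicode")


# Assumption: SQLite stands in for MySQL; ids are assigned explicitly
# rather than relying on a vendor-specific auto-increment feature.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE texts (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO texts VALUES (?, ?, ?)",
    [(1, "Walden", "Thoreau"), (2, "Emma", "Austen")],
)

xml_out = rows_to_xml(conn)
print(xml_out)
```

The database remains the working copy, and the XML output is what
gets archived.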

Like Ed, I use databases to provide services against the data. More
importantly, databases make it easier to do things like global
changes and updates. It is more difficult to read bunches o' XML
files, parse them, update them accordingly, and write them out again.
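
To make the contrast concrete, here is a small sketch, again assuming
Python with SQLite standing in for the database, and with hypothetical
table and element names. The database needs one UPDATE statement,
while each XML document has to be parsed, walked, and re-serialized:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Database side: a global change is a single statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE texts (id INTEGER PRIMARY KEY, publisher TEXT)")
conn.executemany(
    "INSERT INTO texts VALUES (?, ?)",
    [(1, "Old Press"), (2, "Old Press"), (3, "Other")],
)
cur = conn.execute(
    "UPDATE texts SET publisher = ? WHERE publisher = ?",
    ("New Press", "Old Press"),
)
changed_rows = cur.rowcount  # number of rows the UPDATE modified


# XML side: every document is parsed, edited, and written out again.
def update_xml(xml_string, old, new):
    """Return the updated XML string and the number of elements changed."""
    root = ET.fromstring(xml_string)
    changed = 0
    for el in root.iter("publisher"):
        if el.text == old:
            el.text = new
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


docs = [
    "<text><publisher>Old Press</publisher></text>",
    "<text><publisher>Other</publisher></text>",
]
results = [update_xml(d, "Old Press", "New Press") for d in docs]
```

With real files on disk, the XML side also needs the reading and
writing of each file, plus some care not to clobber a file halfway
through.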

In the XML world there are essentially two types of XML files:
mixed-content files and non-mixed-content files. Good examples of
mixed-content files are narrative texts. While still highly structured,
narrative texts contain a large mixture of XML elements; there is
relatively little repeating of elements in the same order.
Non-mixed-content files have more of a pattern. These are more akin to
data files with a much more regular structure. In these cases the
content is intended for statistical analysis. Average this. Sum that.
Etc. Analysis of this data would be better done in a database
application, not necessarily in an XML file through XSLT. Put another
way, if your data is narrative in nature, like stories, consider more
strongly saving your data as XML. On the other hand, if your data is
more statistical in nature, think more strongly about using a
database.
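
For the statistical case, a sketch of the "average this, sum that"
kind of analysis, assuming Python with an in-memory SQLite database
and a hypothetical table of checkout counts:

```python
import sqlite3

# Assumption: a made-up loans table with one row per observation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (branch TEXT, checkouts INTEGER)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?)",
    [("north", 10), ("north", 20), ("south", 30)],
)

# One query produces the sum and the average for each branch.
summary = conn.execute(
    "SELECT branch, SUM(checkouts), AVG(checkouts) "
    "FROM loans GROUP BY branch ORDER BY branch"
).fetchall()

for branch, total, average in summary:
    print(branch, total, average)
```

The same grouping can be done in XSLT, but it usually takes more work
than a single GROUP BY.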

In short, if you want to preserve your data, then use XML. If you want
to do a lot of maintaining of the data (changing values, adding new
content), then use databases. If you want to transform the data into
other things (reports, printed documents, analyses), then you will
probably want to use a combination of both technologies.

HTH.

--
Eric Lease Morgan