Thanks, this is helpful feedback at least.
That said, when determining what is legal under the standards, I think
it's irrelevant what certain Java tools happen to do. I don't care much
what some tool you happen to use does.
In this case, I'm _writing_ the tools. I want to make them do 'the right
thing', with some mix of what's actually officially, legally correct and
what's practically useful. What your Java tools do is more or less
irrelevant to me. I certainly _could_ make my tool respect the Marc
leader embedded in MarcXML over the XML declaration if I wanted to. I
could even make it assume the data is Marc8 in XML, even though there's
no XML charset type for it, if the leader says it's Marc8.
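
To make the choice concrete, here is a rough sketch (Python, standard
library only) of pulling out both encoding claims so a tool can decide
which one to trust. The function names are made up for illustration. It
assumes MARC 21 leader position 09, where 'a' means UCS/Unicode and blank
means MARC-8, and it ignores BOMs and UTF-16 detection.

```python
import re
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"

def declared_xml_encoding(raw: bytes) -> str:
    """Encoding named in the XML declaration; assume UTF-8 if none is given."""
    m = re.match(rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return m.group(1).decode("ascii") if m else "UTF-8"

def leader_encoding(raw: bytes) -> str:
    """Encoding claimed by leader position 09 of the first record found."""
    root = ET.fromstring(raw)
    leader = next(root.iter(f"{{{MARCXML_NS}}}leader"), None)
    if leader is None or leader.text is None or len(leader.text) < 10:
        return "unknown"
    # MARC 21 leader/09: 'a' = UCS/Unicode, blank = MARC-8.
    return {"a": "UTF-8", " ": "MARC-8"}.get(leader.text[9], "unknown")

def encodings_claimed(raw: bytes) -> tuple:
    """Return (XML declaration claim, leader claim) so the caller can compare."""
    return declared_xml_encoding(raw), leader_encoding(raw)

# Hypothetical record: the declaration says UTF-8, the leader says MARC-8.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<record xmlns="http://www.loc.gov/MARC21/slim">'
    b'<leader>00000nam  2200000 a 4500</leader>'
    b'</record>'
)

print(encodings_claimed(SAMPLE))
```

What to do when the two disagree is exactly the policy question; the
sketch only surfaces the conflict.
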
But do others agree that there is in fact no legal way to have Marc8 in
MarcXML?
Do others agree that you can use non-UTF8 encodings in MarcXML, so long
as they are legal XML?
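
As an illustration of what "legal XML" buys you, here is a small sketch
of a MarcXML record encoded as ISO-8859-1. A stock XML parser decodes it
from the XML declaration alone, whatever the leader says. The record and
the function name are invented for the example.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical record, serialized as ISO-8859-1 rather than UTF-8.
LATIN1_DOC = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<leader>00000nam a2200000 a 4500</leader>'
    '<datafield tag="245" ind1="0" ind2="0">'
    '<subfield code="a">Caf\u00e9</subfield>'
    '</datafield>'
    '</record>'
).encode("iso-8859-1")

def first_subfield_text(raw: bytes) -> str:
    """Parse raw MarcXML bytes and return the text of the first subfield."""
    root = ET.fromstring(raw)  # the parser honors the XML declaration
    subfield = root.find(f".//{{{MARCXML_NS}}}subfield")
    return subfield.text if subfield is not None else ""

print(first_subfield_text(LATIN1_DOC))
```

Note that the leader in this sample still says 'a' (Unicode) while the
bytes on disk are Latin-1, which is the kind of mismatch I'm asking about.
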
I won't even ask someone to cite standards documents, because it's
pretty clear that LC forgot to consider this when establishing MarcXML.
(And I have no faith that one could get LC to make a call on this and
publish it any time this century.)
Has anyone seen any Marc8-encoded MarcXML in the wild? Is it common? How
is it represented with regard to the XML declaration and the Marc leader?
Has anyone seen any MarcXML with char encodings that are neither Marc8
nor UTF8 in the wild? Are they common? How are they represented with
regard to the XML declaration and the Marc leader?
On 4/17/2012 2:32 PM, LeVan,Ralph wrote:
>> If I want to have a MarcXML document encoded in Marc8 -- what should it
>> look like? What should be in the XML declaration? What should be in the
>> MARC header embedded in the XML? Or is it not in fact legal at all?
> I'm going out on a limb here, but I don't think it is legal. There is
> no formal encoding that corresponds to MARC-8, so there's no way to tell
> XML tools how to interpret the bytes.
>
>
>> If I want to have a MarcXML document encoded in UTF8, what should it
>> look like? What should be in the XML declaration? What should be in the
>> MARC header embedded in the XML?
> <?xml version="1.0" encoding="UTF-8"?>
>
> I suppose you'll want to set the leader to UTF-8 as well, but it doesn't
> really matter to any XML tools.
>
>
>> If I want to have a MarcXML document with a char encoding that is
>> _neither_ Marc8 nor UTF8, but something else generally legal for XML