On 28 October 2010 17:37, MJ Suhonos <[log in to unmask]> wrote:
> Let me openly state that I've never used Turbomarc. I believe the "special case" they are referring to is the subfield code with a value of "ç", which is non-alphanumeric. I don't know enough about MARC to even begin guessing what this means or why it might occur (or not).
>
> The use case I see for Turbomarc is when you:
>
> 1- have a need for high performance
> 2- are converting binary MARC to XML
> 3- are writing your own XSLT to manipulate that XML (since it's not MARCXML)
>
> The first comment claims a 30-40% increase in XML parsing speed, which seems plausible when you compare the number of characters in the example provided: 277 vs. 419, or about 34% fewer going through the parser.
The speedup can be much greater than that -- from the blog post
itself, "Using xsltproc --timing showed that our transformations were
faster by a factor of 4-5. Shortening the element names only improved
performance fractionally, but since everything counts, we decided to
do this as well". xsltproc uses the highly optimised LibXML/LibXSLT
stack, which I'd guess has less constant-time overhead than the PHP
simplexml parser that yielded the smaller speedup.
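
For anyone who wants to check this on their own setup, here is a rough Python sketch. It recomputes the character-count figure quoted above and times the parsing of two tiny hand-written records. The record contents and the helper names (percent_fewer, parse_time) are made up for illustration, the short element names follow the Turbomarc layout as I understand it, and Python's ElementTree is neither of the parsers discussed, so the timings only indicate the general trend.

```python
import timeit
import xml.etree.ElementTree as ET


def percent_fewer(short_len, long_len):
    """Percentage by which short_len is smaller than long_len."""
    return 100.0 * (long_len - short_len) / long_len


# Hypothetical one-field records, hand-written for illustration only.
MARCXML = (
    '<record><datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">A title</subfield></datafield></record>'
)
TURBOMARC = '<r><d245 i1="1" i2="0"><sa>A title</sa></d245></r>'


def parse_time(doc, repeat=10000):
    """Seconds taken to parse doc `repeat` times with ElementTree."""
    return timeit.timeit(lambda: ET.fromstring(doc), number=repeat)


if __name__ == "__main__":
    # Character counts quoted in the thread: 277 vs. 419.
    print(f"{percent_fewer(277, 419):.1f}% fewer characters")
    print("MARCXML  :", parse_time(MARCXML))
    print("Turbomarc:", parse_time(TURBOMARC))
```

Real records with hundreds of fields, run through xsltproc --timing, would be the fairer test, since that is where the 4-5x figure comes from.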