Chuck Bearden wrote:
>That being said, I have found that web pages of records generated from
>databases and templates *are* indeed structured data, it's just that
>they are (to borrow a PC phrase) "differently structured". The HTML may
>be crappy, but it's crappy in a consistent way, and it can be fixed
>and navigated in a consistent way. Necessity is the mother of invention
>and the midwife of screen-scraping.
>
I'm not sure why the midwife imagery triggered this thought but ...
For how many useful targets would it be possible to define a consistent
intermediate layer structure that would
- handle a SRU/SRW search
- transform it into a "native" database search
- transform the results into an SRU/SRW friendly result set
and still return them in a reasonable time?
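
As a very rough sketch of the three steps above, in Python: everything
specific to the target here (the index map, the native parameter names
"field", "q" and "n", the shape of a hit) is made up for illustration and
is not any real site's interface. The CQL handling covers only a single
'index = term' clause, and the response is a simplified SRU 1.1 style
envelope, not a complete one.

```python
from urllib.parse import parse_qs, urlencode
import xml.etree.ElementTree as ET

SRW_NS = "http://www.loc.gov/zing/srw/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from CQL indexes to one target's native field codes.
INDEX_MAP = {"dc.title": "ti", "dc.creator": "au", "cql.serverChoice": "kw"}


def srw(tag):
    """Qualify a tag name with the SRW namespace for ElementTree."""
    return f"{{{SRW_NS}}}{tag}"


def parse_sru_request(query_string):
    """Step 1: pull out the parts of a searchRetrieve request we use."""
    params = {k: v[0] for k, v in parse_qs(query_string).items()}
    if params.get("operation") != "searchRetrieve":
        raise ValueError("only searchRetrieve is handled in this sketch")
    return params["query"], int(params.get("maximumRecords", "10"))


def cql_to_native(cql, limit):
    """Step 2: turn one 'index = term' clause into a native query string."""
    if "=" in cql:
        index, _, term = cql.partition("=")
    else:
        # A bare term: let the server choose the index.
        index, term = "cql.serverChoice", cql
    index = index.strip()
    term = term.strip().strip('"')
    if index not in INDEX_MAP:
        raise ValueError(f"unsupported index: {index}")
    return urlencode({"field": INDEX_MAP[index], "q": term, "n": limit})


def build_sru_response(hits):
    """Step 3: wrap native hits (dicts with a 'title') in an SRU-like reply."""
    ET.register_namespace("srw", SRW_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(srw("searchRetrieveResponse"))
    ET.SubElement(root, srw("version")).text = "1.1"
    ET.SubElement(root, srw("numberOfRecords")).text = str(len(hits))
    records = ET.SubElement(root, srw("records"))
    for position, hit in enumerate(hits, start=1):
        record = ET.SubElement(records, srw("record"))
        ET.SubElement(record, srw("recordSchema")).text = "info:srw/schema/1/dc-v1.1"
        ET.SubElement(record, srw("recordPacking")).text = "xml"
        data = ET.SubElement(record, srw("recordData"))
        # Simplified: a real DC record would sit inside a wrapper element.
        ET.SubElement(data, f"{{{DC_NS}}}title").text = hit["title"]
        ET.SubElement(record, srw("recordPosition")).text = str(position)
    return ET.tostring(root, encoding="unicode")
```

The part that would differ per target is the middle step plus whatever
fetches and parses the native results; the two ends could be shared.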
I'm not (necessarily) suggesting a centralized service that would do
this (a la OCLC) but rather a set of protocols that I could drop into a
locally managed site for targets that we choose to address in this
fashion. Can the problem be abstracted sufficiently? Can we build in
alerts to trigger actions when the structure of a given result doesn't
match the pattern we've been expecting (i.e. a site change alert)?
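
One way the alert idea might look, again only as a Python sketch: each
target gets a few "landmark" patterns that must be present, plus a
pattern for a single result row. The HTML shapes assumed below (a table
with class "results", rows with class "hit", an "N records found" line)
are invented for the example. If the landmarks vanish, or the page
claims hits but the row pattern finds none, the scraper raises instead
of quietly returning an empty result set.

```python
import re


class StructureChanged(Exception):
    """Raised when a page no longer matches the pattern we expected."""


# Hypothetical patterns for one imaginary target.
LANDMARKS = [re.compile(r'<table[^>]*class="results"')]
COUNT = re.compile(r"(\d+) records? found")
RECORD = re.compile(
    r'<tr class="hit">\s*<td>(?P<title>[^<]+)</td>'
    r"\s*<td>(?P<author>[^<]+)</td>\s*</tr>"
)


def scrape(html):
    """Return the hits on a result page, or raise if the layout has moved."""
    missing = [p.pattern for p in LANDMARKS if not p.search(html)]
    if missing:
        raise StructureChanged(f"landmarks not found: {missing}")

    count_match = COUNT.search(html)
    if count_match is None:
        raise StructureChanged("hit count line not found")

    hits = [m.groupdict() for m in RECORD.finditer(html)]
    claimed = int(count_match.group(1))
    # Pages may be paginated, so only flag the clear case:
    # the site says there are hits and we could not read any of them.
    if claimed > 0 and not hits:
        raise StructureChanged(
            f"page reports {claimed} records but the row pattern matched none"
        )
    return hits
```

Whatever calls scrape() would catch StructureChanged and do the
alerting (log it, send mail, disable the target) rather than pass an
empty set back to the searcher.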
Walter Lewis
Halton Hills