On Wed, May 6, 2015 at 8:15 AM, Ethan Gruber wrote:
> +1 on the RDFa and schema.org. For those that don't know the library URL
> off-hand, it is much easier to find a library website by Googling than it
> is to go through the central university portal, and the hours will show up
> at the top of the page after having been harvested by search engines.

Hi, so this is an area in which I've done, and am doing, a fair bit of work. See http://stuff.coffeecode.net/2015/ola_white_hat_seo/#/1/10 for some fun slides from a presentation I gave in January at the Ontario Library Association SuperConference. The slides show some ways data gets into Google/Yahoo/Bing and conclude that the OCLC Registry "manually maintain yet another copy of your data elsewhere" approach isn't working. (Hit "s" to get speaker notes.) The rest of the presentation goes into depth on how to use RDFa to mark up a real library web page with location, contact info, opening hours, and event info.

And I've posited that crawling library sites to pull single-sourced data (e.g. you update your website to provide updated hours to humans, and the machines automatically benefit) would be a much more effective, accurate, and usable approach than maintaining copies of the data in Google+, OCLC Registry, etc. We could produce results like http://cwrc.ca/rsc-src/ that stay accurate, rather than being one-off efforts that decay over time. (It would be great if the OCLC Registry had a "crawl this URL" option so that it could keep all of its data up-to-date and incentivize libraries to publish the data in a machine-readable format such as RDFa + schema.org.)

On the "but that's technically challenging" front, I tried pursuing some grant funding to produce templates for publishing that structured info in Drupal, Joomla, and other commonly used CMSs. Sadly, my application was recently denied, but that will only slow me down; I'm not going to give up on the goal.

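To make the single-sourcing idea concrete, here is a minimal sketch of what a harvester might do with a library page marked up with RDFa and schema.org. Everything in it is an assumption for illustration: the sample markup, the library name, phone number, and hours are made up, and the `PropertyCollector` and `extract_properties` names are hypothetical. It only reads `property` attributes (preferring a `content` attribute over element text) and is not a full RDFa processor; a real crawler would more likely use a proper RDFa parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the library name, phone, and hours are made up.
SAMPLE_PAGE = """
<div vocab="http://schema.org/" typeof="Library">
  <h1 property="name">Example Library</h1>
  <span property="telephone">555-0100</span>
  <span property="openingHours" content="Mo-Fr 08:00-22:00">Mon-Fri 8am-10pm</span>
  <span property="openingHours" content="Sa-Su 10:00-17:00">Weekends 10am-5pm</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect values of RDFa `property` attributes (simplified, not full RDFa)."""

    def __init__(self):
        super().__init__()
        self.found = {}       # property name -> list of values
        self._pending = None  # property still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        self._pending = None
        if prop is None:
            return
        content = attrs.get("content")
        if content is not None:
            # Machine-readable value supplied directly in the attribute.
            self.found.setdefault(prop, []).append(content)
        else:
            # Otherwise fall back to the element's text.
            self._pending = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.found.setdefault(self._pending, []).append(text)
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None


def extract_properties(html):
    """Return a dict of property name -> list of values found in the markup."""
    parser = PropertyCollector()
    parser.feed(html)
    return parser.found


if __name__ == "__main__":
    info = extract_properties(SAMPLE_PAGE)
    print(info["name"], info["openingHours"])
```

The point of the sketch is that the humans read "Mon-Fri 8am-10pm" while the machine reads the `content` value from the same element, so updating the page once updates both.
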
I have a paper in the works that will expand on the content of the presentation for those sites that have the ability (technical and administrative) to modify their own web pages. Sites running the Evergreen library system already generate a page for each of their libraries that contains this structured data (e.g. https://laurentian.concat.ca/eg/opac/library/OSUL), which is single-sourced from the data that has to be maintained in the library system anyway.

I'll happily acknowledge that getting search engines to harvest the right data is not easy, though. For example, a search for "J.N. Desmarais Library" currently shows that the library is open 24 hours a day, which is completely false (and probably maliciously submitted) information. *sigh* I've edited that info in the Google+ page at https://plus.google.com/+JNDesmaraisLibraryGreaterSudbury but even though it is a verified place and I am a manager of the G+ page, the edits still go through approval by Googlers. There appears to be no good way to tell Google "Hey, *this* is the URL you are looking for!".

Somewhat amusingly, the entire reason I started working with schema.org dates back to a presentation I attended about Google Places years ago, where I whined about having to maintain yet another copy of data in yet another place, and the response implied that schema.org might be the solution to that problem.

Also, due to the structure of university web property ownership, we currently don't have the ability to modify our actual library home page to include any RDFa, which is a *wee* bit frustrating given my work in the field. Heh.

Dan Scott
Laurentian University