LISTSERV 16.5 - CODE4LIB Archives

Agreed that SPARQL is ugly, and there was discussion at the RDF 
validation workshop about the need for friendly interfaces that then 
create the appropriate SPARQL queries in the background. This shouldn't 
be surprising, since most business systems do not require users to write 
raw SQL or even anything resembling code - often users fill in a form 
with data that is turned into code.
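
As a rough illustration of a form being turned into a query, here is a minimal Python sketch. The function name build_cardinality_query, the form field names, and the example class IRI are all hypothetical. The generated text uses standard SPARQL 1.1 aggregates to list the resources whose number of values for a property falls outside the limits entered in the form.

```python
def build_cardinality_query(form):
    """Turn a filled-in form (a plain dict) into a SPARQL 1.1 query.

    The query lists every instance of form["class"] whose number of
    values for form["property"] is below "min" or above "max".
    All field names here are hypothetical.
    """
    low = int(form["min"])    # int() also rejects non-numeric input
    high = int(form["max"])
    return "\n".join([
        "SELECT ?s (COUNT(?o) AS ?n)",
        "WHERE {",
        f"  ?s a <{form['class']}> .",
        f"  OPTIONAL {{ ?s <{form['property']}> ?o }}",
        "}",
        "GROUP BY ?s",
        f"HAVING (COUNT(?o) < {low} || COUNT(?o) > {high})",
    ])


# What a user might have typed into the form: "exactly one title".
form = {
    "class": "http://example.com/Book",            # made-up class IRI
    "property": "http://purl.org/dc/terms/title",
    "min": 1,
    "max": 1,
}
query = build_cardinality_query(form)
print(query)
```

The user only ever sees the four form fields; the SPARQL stays in the background.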

But it really is a mistake to see OWL as a constraint language in the 
sense of validation. An ontology cannot constrain; OWL is solely 
*descriptive* not *prescriptive.* [1]

Inferencing is very different from validation, and this is an area where 
the initial RDF documentation was (IMO) quite unclear. The OWL 2 
documents are better, but everyone admits that it's still an area of 
confusion. (In a major act of confession at the DC2013 meeting, Ivan 
Herman, head of the W3C semantic web work, said that this was a mistake 
that he himself made for many years. Fortunately, he now helps write the 
documentation, and it's good that he has that perspective.) In effect, 
inferencing is the *opposite* of constraining. Inferencing is:

"All men are liars. Socrates is a man. Therefore Socrates is a liar."
"Every child has a parent. Johnny is a child. Therefore, Johnny has a 
parent." (whether you can find one or not is irrelevant)
"Every child has two parents. Johnny is a child. Therefore Johnny has 
two parents. Mary is Johnny's parent." (no contradiction here, we just 
don't know who the other parent is)
"Every child has two parents. Johnny is a child. Therefore Johnny has 
two parents. Mary is Johnny's parent. Jane is Johnny's parent. Fred is 
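Johnny's parent." Here the reasoner detects a contradiction, as long as 
it knows that Mary, Jane and Fred are three different individuals. (OWL 
does not assume that different names mean different things, so without 
that knowledge it would instead infer that two of the names refer to 
the same parent.)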
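
The parent examples above can be sketched as a toy check in Python. This is not how a real OWL reasoner is implemented; the function name check_parent_rule and its names_are_distinct flag are made up for illustration. The flag stands in for whether the reasoner has been told that different names denote different individuals, which OWL does not assume by default.

```python
def check_parent_rule(known_parents, names_are_distinct=True):
    """Open-world reading of 'every child has exactly two parents'.

    known_parents: names asserted as parents of one child.
    names_are_distinct: whether the reasoner has been told that different
    names denote different individuals (an assumption of this sketch).
    """
    count = len(set(known_parents))
    if count <= 2:
        # Fewer than two known parents is not an error: the rest are
        # simply unknown, and the reasoner infers that they exist.
        return "consistent"
    if names_are_distinct:
        # More distinct parents than the rule allows.
        return "contradiction"
    # Without distinctness, the reasoner concludes that some of the
    # names refer to the same individual.
    return "consistent, some names must co-refer"


print(check_parent_rule([]))
print(check_parent_rule(["Mary"]))
print(check_parent_rule(["Mary", "Jane", "Fred"]))
print(check_parent_rule(["Mary", "Jane", "Fred"], names_are_distinct=False))
```

Note that nothing in the sketch ever reports "a parent is missing"; under the Open World that is never a violation.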

The issue of dct:titles is an interesting example. dct:title takes a 
literal value. If you create a dct:title with:

X dct:title http://example.com/junk

then under OWL rules that is NOT wrong. It simply provides the inference 
that "http://example.com/junk" is a literal - but OWL can't prevent you 
from creating that triple, because it only operates on existing data.
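
A minimal Python sketch of that behavior, assuming triples are plain tuples and using a hypothetical helper infer_range_types. It applies a range declaration the way an inference rule does: it never rejects a triple, it only adds a conclusion about the object.

```python
# Range declarations: dct:title is declared with range rdfs:Literal.
RANGES = {"dct:title": "rdfs:Literal"}


def infer_range_types(triples, ranges=RANGES):
    """Return the type triples that follow from the range declarations.

    Nothing is validated or rejected here; every object of a property
    with a declared range is simply inferred to belong to that range.
    """
    inferred = set()
    for _subject, prop, obj in triples:
        if prop in ranges:
            inferred.add((obj, "rdf:type", ranges[prop]))
    return inferred


data = [("X", "dct:title", "http://example.com/junk")]
for triple in sorted(infer_range_types(data)):
    print(triple)
```

The "junk" value goes in without complaint, and the only effect is one more statement about it.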

If you say that every resource MUST have a dct:title, then if you come 
across a resource without a dct:title that is NOT wrong. The reasoner 
would conclude that there is a dct:title somewhere because that's the 
rule. (This is where the Open World comes in.) When data contradicts the 
rules, reasoners can't work correctly, but they only act on existing 
data; they do not modify or correct it.
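
To make the difference concrete, here is a small Python sketch that treats the same "every resource has a dct:title" rule in both ways. Both function names and the sample data are invented for illustration. The first function is the kind of closed-world check a validation query performs; the second mimics what a reasoner concludes.

```python
def closed_world_missing_titles(triples):
    """Validation-style check: list subjects with no dct:title in the data."""
    subjects = {s for s, _, _ in triples}
    titled = {s for s, p, _ in triples if p == "dct:title"}
    return sorted(subjects - titled)


def open_world_title_inferences(triples):
    """Reasoner-style reading: every subject is concluded to have a title,
    whether or not one appears in the data. Nothing is flagged as wrong."""
    subjects = sorted({s for s, _, _ in triples})
    return {s: "has some dct:title (possibly not in the data)" for s in subjects}


data = [
    ("ex:book1", "dct:title", "Moby Dick"),
    ("ex:book2", "dct:creator", "ex:someone"),   # no title given
]
print(closed_world_missing_titles(data))
print(open_world_title_inferences(data))
```

Only the first function can tell you that ex:book2 needs fixing, and neither one changes the data.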

I'm thinking that OWL and constraints would be an ideal training 
webinar, and I think I know who could do it!

kc

[1] http://www.w3.org/TR/2012/REC-owl2-primer-20121211/
"OWL 2 is not a schema language for syntax conformance. Unlike XML, OWL 
2 does not provide elaborate means to prescribe how a document should be 
structured syntactically. In particular, there is no way to enforce that 
a certain piece of information (like the social security number of a 
person) has to be syntactically present. This should be kept in mind as 
OWL has some features that a user might misinterpret this way."


On 9/17/13 4:50 AM, [log in to unmask] wrote:
> I don't think anyone would want to use one ontology for all work, especially not a public ontology. I can imagine people using ontology extensions that are specific to the purpose of validation, and I've found them useful myself.
>
> I'm not arguing against using SPARQL for validation. I do think that OWL offers a more natural-feeling language for discussing constraint for most folks, and I suppose that's why we've seen the introduction of extension languages like SPIN to intermediate a little between the user and plain SPARQL.
>
> ---
> A. Soroka
> The University of Virginia Library
>
> On Sep 16, 2013, at 11:00 PM, CODE4LIB automatic digest system wrote:
>
>> From: Karen Coyle <[log in to unmask]>
>> Date: September 16, 2013 10:22:47 AM EDT
>> Subject: Re: CODE4LIB Digest - 12 Sep 2013 to 13 Sep 2013 (#2013-237)
>>
>>
>> On 9/16/13 6:29 AM, [log in to unmask] wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> I'd suggest that perhaps the confusion arises because "This instance is (not) 'valid' according to that ontology." might be inferred from an instance and an ontology (under certain conditions), and that's the soul of what we're asking when we define constraints on the data. Perhaps OWL can be used to express conditions of validity, as long as we represent the quality "valid" for use in inferences.
>> Based on the results of the RDF Validation workshop [1], validation is being expressed today as SPARQL rules. If you express the rules in OWL then unfortunately you affect downstream re-use of your ontology, and that can create a mess for inferencing and can add a burden onto any reasoners, which are supposed to apply the OWL declarations.
>>
>> One participant at the workshop demonstrated a system that used the OWL "constraints" as constraints, but only in a closed system. I think that the use of SPARQL is superior because it does not affect the semantics of the classes and properties, only the instance data, and that means that the same properties can be validated differently for different applications or under different contexts. As an example, one community may wish to say that their metadata can have one and only one dc:title, while others may allow more than one. You do not want to constrain dc:title throughout the Web, only your own use of it. (Tom Baker and I presented a solution to this on the second day as Application Profiles [2], as defined by the DC community).
>>
>> kc
>> [1] https://www.w3.org/2012/12/rdf-val/agenda
>> [2] http://www.w3.org/2001/sw/wiki/images/e/ef/Baker-dc-abstract-model-revised.pdf

-- 
Karen Coyle
http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet