Friday, October 14, 2011

DRN: Downes RDF Notation

The usual disclaimers apply: although I'm creating this myself, it's probably not unique to me, someone else probably thought of it first, and I don't expect anyone in the world to actually use this, though if it is in fact new, when it's reinvented by someone at MIT or Stanford all credit will revert to that author.

I've done enough coding of permissions systems to know that they're a pain. So I like this proposal for an RDF-based permissions model.

But as I read the article, I am reminded again of the RDF community's general failure to develop readable syntax. The post presents its examples in Turtle, the Terse RDF Triple Language. It's better than native RDF, but you still get long, convoluted, senseless statements.

Having read yet another statement in some language that looks like this:

resource:r1 per:read domain:domain1, domain:domain2,  domain:domain3;
        per:create  domain:domain1, domain:domain3;
        per:update  domain:domain1, domain:domain3;
        per:noread  domain:domain4;
        per:nocreate  domain:domain5;
        per:noupdate  domain:domain5;


I declare an end to ridiculous RDF syntax.

So herewith, DRN, 'Downes RDF Notation'.

DRN is composed of two major sections, the 'declarations' section, and the 'statements' section. In the declarations section, we associate terms with namespaces, while in the statements section, we make statements using those terms.

The Declarations section consists of a series of statements, each of which associates a namespace with a series of terms. Like this:
     namespace1: term1, term2, term3, term4;
     namespace2: term4, term6, term4.
As can be seen in the example, the namespace URI is followed by a colon, terms are separated by commas, and each declaration is separated with a semi-colon.
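
A minimal Python sketch of reading a declarations section under the rules just described. The function name `parse_declarations` is hypothetical, and the sketch assumes that the section ends with a period, that declarations are separated by semi-colons, and that the only colon inside a namespace is the one in a '://' scheme (so a URI with a port number would confuse it).

```python
def parse_declarations(text):
    """Map each namespace to its list of terms, in document order."""
    body = text.strip()
    if body.endswith("."):
        body = body[:-1]  # drop the period that closes the section
    declarations = {}
    for chunk in body.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Skip the ':' that belongs to a URI scheme such as 'http://'.
        scheme = chunk.find("://")
        split_at = chunk.find(":", scheme + 3 if scheme != -1 else 0)
        if split_at == -1:
            raise ValueError(f"no namespace separator in {chunk!r}")
        namespace = chunk[:split_at].strip()
        terms = [t.strip() for t in chunk[split_at + 1:].split(",") if t.strip()]
        declarations[namespace] = terms
    return declarations
```

Called on the two-line example above, this yields one entry per namespace, each holding its terms in the order written.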

The namespace is like a dictionary. It is another document written in DRN (or DERN, or DELRN, see below) that provides additional information about the term. Thus, if you use the term 'robin', you don't need to specify every time that a 'robin' is a 'bird', that it 'flies', that it is not a 'rock', etc; this is all done in the namespace.

Proper names are terms that are defined by a proper names registry, which is simply a namespace used to define proper names.

The Statements section simply uses the terms in a rational manner. Like this:
    term1 term2 term3.
    term4 term5 term6.

As the punctuation implies, terms are separated by spaces, and statements end with a period.

You can use commas to create sequences of terms. As follows:
   term1, term2, term3 term4 term5.
   term5 term2,term3 term6.
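
As a rough illustration, the statements section can be split into statements, positions, and comma sequences with a few lines of Python. This is only a sketch: `parse_statements` is a hypothetical name, and it assumes single-word terms and that no term contains a period.

```python
import re

def parse_statements(text):
    """Return one list per statement; each position is a list of terms."""
    statements = []
    for raw in text.split("."):
        raw = raw.strip()
        if not raw:
            continue
        # White space after a comma is only there for readability, so drop it.
        compact = re.sub(r",\s*", ",", raw)
        statements.append([slot.split(",") for slot in compact.split()])
    return statements
```

For the first example line, the result has three positions: the sequence term1, term2, term3, then term4, then term5.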

That's the notation!

A couple of footnotes. First, white spaces (spaces, carriage returns) are used only to separate terms in statements. White spaces inserted after punctuation for clarity are ignored.

Second, while nothing prevents the use of the same term twice, from different namespaces, such usage obviously creates ambiguity. Hence, when the same term is used twice, the last or most recent definition of a term by a namespace is taken to be authoritative.
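
The 'most recent definition wins' rule falls out naturally if the term index is built in document order. A small sketch, with the hypothetical name `build_term_index` and the assumption that declarations arrive as (namespace, terms) pairs in the order they were written:

```python
def build_term_index(declarations):
    """Map each term to the namespace that is authoritative for it."""
    index = {}
    for namespace, terms in declarations:
        for term in terms:
            # A later declaration simply overwrites an earlier one.
            index[term] = namespace
    return index
```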

While there are no a priori constraints on the order or nature of terms, the typical sequence of terms is 'subject verb object'. The creation and use of passive verbs, such as 'is_created_by', is strongly discouraged. It's much better to write statements of the form 'x creates y' rather than 'y is_created_by x'.

Finally, a space in the defined term becomes part of the term, and not part of syntax. Thus, for example, the term 'blue jay' is treated as a single term; the parser is asked to think of it as though it were 'blue_jay'.
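
One way a parser might honour this is to match the longest declared term first. The sketch below is an assumption about how that could work, not part of the notation; `tokenize` is a hypothetical name, and unknown words are passed through as single-word tokens.

```python
def tokenize(statement, declared_terms):
    """Split a statement into terms, preferring the longest declared term."""
    words = statement.split()
    longest = max((len(t.split()) for t in declared_terms), default=1)
    tokens, i = [], 0
    while i < len(words):
        # Try the widest window first, shrinking down to a single word.
        for size in range(min(longest, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + size])
            if candidate in declared_terms or size == 1:
                tokens.append(candidate)
                i += size
                break
    return tokens
```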

Note that it's very easy to construct a full logical system in DRN.  For example:

     http://onto.domain.com: is,has,contains,creates;
     http://animal.domain.com: bird,robin,bluejay,grouse;
     http://parts.domain.com: feathers,eggs.
 
     robin is bird. grouse is bird. bird has feathers. bird creates eggs.

The proof of a notation is to write a parser and an inference engine, so I'll put that into my list of projects. But with only five syntax characters (white space, comma, colon, semi-colon, period) and one exception (ignoring the '://' construction in URIs) actual parsing is very simple.
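
To give a flavour of what a very small inference step over the example might look like, here is a Python sketch. It rests on an assumption the notation itself does not make: that 'x is y' means x inherits every statement made about y. The function name `inherit` and the triple representation are hypothetical.

```python
def inherit(triples):
    """Close a set of (subject, verb, object) triples under 'is' inheritance."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for child, verb, parent in list(facts):
            if verb != "is":
                continue
            # Copy every statement about the parent down to the child.
            for subject, v, obj in list(facts):
                if subject == parent and (child, v, obj) not in facts:
                    facts.add((child, v, obj))
                    changed = True
    return facts
```

Run on the four statements above, this adds 'robin has feathers', 'robin creates eggs', and the same two statements for grouse.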

DERN: Downes Extended RDF Notation
  
DRN will do almost everything people creating large and complex RDF structures may want to do. However there will be cases where an extended expressive capacity is required. Hence, DERN defines a set of ways of creating complex verbs using well-known modalities in conjunction with defined terms.

Modalities may be defined in the declarations section, though the default set (listed below) may be taken as assumed. The purpose of a modality is to in some way modify the term being used. Here's the declaration of some modalities:

(Modalities)http://someurl.com:all,some,one,a,the,no
(Modalities)http://mynegation.com:not
(Modalities)http://mymodallogic.com:can,could,may,must,might,probably
(Modalities)http://mytenselogic.com:will,was

If I ever create a formal version of DERN I would create the complete set here.

The effect of defining modalities is to create a superset of terms containing white spaces. It also allows a parser to define a set of inference rules based on these modalities.

For example, suppose we have defined the following:
   (Modalities)http://someplace.com:some,the,a;
   http://birdnames.com:robin,blue jay.

The parser creates a superset of possible terms based on these definitions, consisting of the following:
   some robin,the robin,a robin,some blue jay, the blue jay, a blue jay

Note that the modality always precedes the term in question.
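
Generating that superset is a one-liner in most languages. A Python sketch, with the hypothetical name `modal_terms`, that reproduces the list above in the same order:

```python
from itertools import product

def modal_terms(modalities, terms):
    """Prefix every term with every modality, modality first."""
    return [f"{m} {t}" for t, m in product(terms, modalities)]
```

With modalities some, the, a and terms robin, blue jay, this returns the six combined terms listed above.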

As before, there are no a priori constraints on the nature or range of modifiers; anything may be used as a modifier, and a modifier may modify anything. However, it should be clear that the use of the same string as both a term and a modifier can result in ambiguities. For example, defining 'not' as a term and then 'not' as a modifier may result in ambiguous understandings.

As a rule of thumb, in such a case, the doubly-defined term should be understood as a modifier. However, persons wishing to recreate Continental philosophy may force the issue by defining the modifier first, then the term, invoking the rule that 'the last or most recent definition of a term by a namespace is taken to be authoritative.' If you want to talk about 'the not', feel free (but don't expect to be understood).

Finally, modalities may be quantified. To quantify a modality, place the quantification in brackets after the modality itself. Some obvious examples:
    some(14) bird
    probably(45) is

Quantifiers may include units. For example: (14 grams), (45 percent).
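
A quantified modality can be picked apart with a single regular expression. This Python sketch assumes the quantity is a plain number optionally followed by a unit; the names `QUANTIFIED` and `parse_modality` are hypothetical.

```python
import re

# modality, then '(', a number, an optional unit, then ')'
QUANTIFIED = re.compile(
    r"^(?P<modality>[^\s()]+)\((?P<amount>\d+(?:\.\d+)?)(?:\s+(?P<unit>[^()]+))?\)$"
)

def parse_modality(token):
    """Return (modality, amount, unit); amount and unit are None if absent."""
    match = QUANTIFIED.match(token)
    if match is None:
        return token, None, None  # an unquantified modality such as 'the'
    return match.group("modality"), float(match.group("amount")), match.group("unit")
```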


DELRN: Downes Learning Extended RDF Notation

This extension of DERN is intended to enable inference. It makes use of the basic logical forms to create compound statements from which conclusions may be drawn. I will express the basic logical operators in CAPS for clarity, though they are not natively case-sensitive.

For any statements (represented with statement) the following basic logical operators may be defined:

   statement AND statement
   statement OR statement
   IF statement THEN statement
   NOT statement
   statement IFF statement
(IFF is the same as IF AND ONLY IF).

A DELRN inference engine would apply well-known logical principles in order to generate new statements from existing statements, or (more usefully) to evaluate the truth of proposed statements against the body of known statements. For example, there is a well-known rule of inference:

   If A then B. A. Therefore, B.

Given the first two statements (those preceding the word 'therefore') then we generate the third statement (the statement following the word 'therefore').
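
This rule (modus ponens) is simple to apply mechanically. A minimal Python sketch, assuming statements are opaque strings and rules are (antecedent, consequent) pairs; the function name is hypothetical.

```python
def modus_ponens(facts, rules):
    """Repeatedly apply 'IF a THEN c' rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent in known and consequent not in known:
                known.add(consequent)
                changed = True
    return known
```

The loop repeats so that a conclusion derived in one pass can serve as the premise of another rule in the next.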

What is significant about learning rules is the employment of variables. This saves us the necessity of repeating the same rule over and over. So, for example, instead of saying:

   IF a bluejay has wings THEN a bluejay is a bird.
   IF a robin has wings THEN a robin is a bird.

and so on, for every term defined in the system, we can say:

   IF x has wings THEN x is a bird.

In order to implement DELRN, we first implement DERN, and then add the inference component after it, as follows:

First, a statement that defines variables:

   Variables are x,y,z,a,b,c.

Second, a set of rules. These rules are in addition to the standard rules of inference (the subject of another document) such as modus ponens and the rest of them. These are rules of inference specific to the present document or set of documents.

For example, a rule might be:

   IF some x is a bird THEN the x eats some(5 grams) seeds.

Declaring the variables tells the system not to attempt to find a namespace for those terms. It also tells the system that any term being used may be inserted in a rule in place of a variable.
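
Substituting terms for variables can be sketched in Python as a simple pattern match. The sketch assumes statements are tuples of terms and ignores modalities; `match` and `apply_rule` are hypothetical names.

```python
def match(pattern, fact, variables):
    """Return a variable binding if the fact fits the pattern, else None."""
    if len(pattern) != len(fact):
        return None
    binding = {}
    for p, f in zip(pattern, fact):
        if p in variables:
            # A variable must bind to the same term everywhere it appears.
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding

def apply_rule(antecedent, consequent, facts, variables):
    """Derive the consequent for every fact that matches the antecedent."""
    derived = set()
    for fact in facts:
        binding = match(antecedent, fact, variables)
        if binding is not None:
            derived.add(tuple(binding.get(term, term) for term in consequent))
    return derived
```

Given the facts 'bluejay has wings' and 'robin has wings', the rule 'IF x has wings THEN x is bird' derives one new statement for each.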

This enables full expressibility in predicate and modal logic.

DLEARN: Downes Learning Extended Adaptive RDF Notation

This adds one simple element to DELRN: it allows statements to become terms. Hence, we can talk about a statement as though it were an object. This allows us to make metastatements.

For example, suppose we have the statement 'A robin is a bird.' By enclosing the statement in single quotations, as done in the previous sentence, we can now treat the statement as a single term. This allows us to do something like the following:

   Variable: x.
   x is 'A robin is a bird'.
   x is probably(56 percent) true.
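
Treating a quoted statement as a single term only needs a tokenizer that respects single quotation marks. A Python sketch, assuming quoted statements contain no apostrophes and leaving quantified modalities aside; `TOKEN` and `tokenize_meta` are hypothetical names.

```python
import re

# Either a run of text inside single quotes, or a run of non-space characters.
TOKEN = re.compile(r"'([^']*)'|(\S+)")

def tokenize_meta(statement):
    """Split a statement into terms, keeping a quoted statement whole."""
    body = statement.strip().rstrip(".")  # drop the closing period
    tokens = []
    for quoted, plain in TOKEN.findall(body):
        tokens.append(("statement", quoted) if quoted else ("term", plain))
    return tokens
```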

The statement from the start of this article? It looks like this:

somedomain.com: read,write,create,update.
somedoclist.com: r1.
 
domain1 can read, can create, can update r1.
domain2 can read r1.
domain3 can read, can create, can update r1.
domain4 can not read r1.
domain5 can not create, can not update r1.

That's clearer, isn't it? Though now we are left wondering whether the authors could simply have written:


somedomain.com: read,write,create,update.
somedoclist.com: r1.
 
domain1 can read, can create, can update r1.
domain2 can read r1.
 
Of course, with our greater expressive power, we could simply define a type of resource and apply permissions to that type. Or permission variables. Or a host of other permissions statements, all equally clear, and yet easily parsed.
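
To back up 'easily parsed', here is a Python sketch that expands the permission statements into individual (subject, permission, resource) triples. It assumes the declarations have already been stripped off, that the subject and the resource are single words, and that a comma sequence of verbs applies each verb to the same subject and resource. The name `expand_permissions` is hypothetical.

```python
def expand_permissions(text):
    """Turn 'd can read, can create r1.' into one triple per permission."""
    triples = []
    for raw in text.split("."):
        raw = raw.strip()
        if not raw:
            continue
        subject, rest = raw.split(None, 1)         # first word is the subject
        verbs_part, resource = rest.rsplit(None, 1)  # last word is the resource
        for verb in verbs_part.split(","):
            triples.append((subject, verb.strip(), resource))
    return triples
```

The modality stays attached to its verb, so 'domain4 can not read r1.' becomes a single triple whose permission is 'can not read'.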

Well, that's it. Over time, I may want to add a few things (bracketing to establish precedence, for example), but this is the basic language. It resembles natural language to a great degree, and it does not include pointless syntax and redundancies (such as the repetition of namespace names dozens of times in a document).

So - over to the readers. Who has already invented this? Where can I find an inference engine that uses it? Etc. Or - why won't it work? How is it expressively incomplete? Where is it ambiguous? What obscure notation from set theory cannot be expressed in this language?


1 comment:

  1. Stephen

    I think this is a fascinating approach. I am staggered by what you can do in half an hour! I will write to you about a project I have in mind.

    Best wishes

    Keith
