Resource Description Framework

Resource Description Framework is a graph-based model for describing Internet resources (like web pages and email messages), and how these resources relate to one another. It can be considered as a means of representing a Semantic Net. It can be represented in a number of ways, using Extensible Markup Language or Rdf Triples. See Practical Rdf for a discussion of the alternatives. Part of the Semantic Web Layer Cake described in the Explorers Guide (to the Semantic Web).

More information on RDF is available at www.w3.org.


A model for metamodel that Extensible Markup Language is not good for

Tim Bray article at www.xml.com

The Esw Wiki is all about the Semantic Web and RDF: esw.w3.org


How is this related to Rich Site Summary (RSS)?

"RSS is an acronym for Rich Site Summary and also for Rdf Site Summary and they aren't the same thing. Do you agree?" -- Paolo Castagna

Different forks of the same product; see link above and Rss Feeds.


Moved from RdfPrimerOutline

A document being done by Sean Palmer and Aaron Swartz.

1. Sales Pitch

code

2. Right on! Wait a sec... what's RDF?

code

3. What's this Resource thing?

code

4. But Bob says that's what XML is for!

code

5. So let's write some RDF...

code

6. Where do I go from here?

code

N.B. From here on down, it would be suitable to segue into the RDF Cookbook stuff; once they are able to deal with rudimentary serializations concepts, it should be a simple matter of guiding them through the rest.

code


Online Query example (uses Prolog Language) - see www.w3.org.


RDF is generally loaded into what Semantic Web folk call a "Tuple Store."

Around here, it seems to be called a "Tuple Space." - No, in fact a Tuple Space is something quite different, to do with sharing memory between different computer processes. -- John Fletcher

Basically, a Tuple Store is a gigantic graph. See Rdf Triples.

RDF describes a graph. The graph consists of nodes connected by edges. When you describe the graph, you do it in triples. That is, you do it like this: (Node A)----(is connected by edge E)----(to Node B). See? Three things, making a triple. "A,E,B." It's also called a tuple.

Together, a bunch of triples/tuples describe a graph.

ALL data, in the entire world, can be described with a graph. That is: a graph is the most generic data structure in the world. Think about it: All data represents stuff connected to other stuff. A graph isn't necessarily the most efficient way to do it, but it's a pretty generic way to do it.

Why is RDF important?

Because it's easy to merge.

If you have two graphs, it's easy to automatically snap them together.

Why is merging data together important?

I haven't seen the canonical reasons "why," but here are some reasons.

Fabrication: Lots of people can describe the little bit that they know, and then it can automatically be hooked up together. (Sort of like the World Wide Web!)

Extension: If someone made some data, and you wanted to add something to it, you can ask them to do it, without requiring something of them. (Without going to their meetings, and trying to convince them to extend their data format.)

Partial Description: If you have a program that can only give some necessary details, and another program gives the other necessary details, then you can merge them together, and then perhaps you will have all the details you need.

For instance, RSS v1 data is in RDF. If you want to extend the information there, you can just extend it with RDF. It'll "just work." Anything that knows how to interpret your additional graph nodes will be able to figure out what's going on. Anything that doesn't will (if it's reading the RDF right) just ignore it.


Weaknesses:

Lower-arity predicates: a proposition is a predicate of zero-arity, and there are a great many predicates of arity one. These can be represented in RDF easily enough by making one of the objects blank or 'TRUE', or using 'predicate.APPLIES_TO.subject'. However, none of the options are exactly natural.

Higher-arity predicates: many facts cannot readily be described by a simple pair of objects (e.g. 'likes_more_than(A,B,X) for 'A likes X more than B'). One can force higher-arity predicates by essentially currying the missing objects into the predicate (e.g. likes_X_more_than(A,B)), but doing so is both unnatural and a source of significant redundancy.

Cannot represent confidence (or, more accurately, the lack thereof), contingency (if A.Q.X is true then B.R.Y is true), or probability (A.Q.X has a P% probability of being true... not all that useful without contingency), volatility (I know it's true now, but... look again in 10 seconds), etc.

Abstraction and construction of predicates and inference rules: one cannot relay (at least directly in the graph) the abstraction of facts about predicates (e.g. "fact: forall predicate pairs A,B (<- denotes abstraction) that meet the second order logic requirements P(A) and Q(B) (<- contingency), an additional fact holds with 98% confidence: R(A,B)"). RDF is incapable of supporting much by way of meta-knowledge or rules of inference, both of which are necessary for it to extend naturally into a 'Knowledge' system (one capable of making predictions or discovering new facts rather than just representing data). (In my own study, I've discovered that meta-data and inference rules should be stored directly with the data because, in predictive or knowledge systems, the inference rules are just as subject to change AND use in the production of more inference rules as the other parts of the data.)

Open-world logic: In an open world, enough things are simply unknown that it's unreasonable to assume that not knowing it makes it false. Rather, one must explicitly track and store what they know is untrue: NOT(A.Q.X). The absence of information should result in 'unknown', not 'false'. RDF provides no field for indicating falsehood of a proposition/ RDF-triple. Again, it can be curried into the predicate (A.NOT_Q.X), but that is once again increasing redundancy and such (all the same problems as higher-arity predicates).


Like everyone else, I wrote a brief introduction to RDF http://slashdot.org/~Quantum%20Jim/journal/94721 slashdot.org:

A Brief Introduction to the Resource Description Framework

Note: The following is a draft. Will revise and webify. ASCII art is formatted for 80 character lines.

Definitions

RDF is a language framework for describing directed graphs. First, some definitions:

A directed graph is a bunch of nodes connected by arrows.

The node at the beginning of an arrow is called the subject.

The node at the end of an arrow is called the object.

The arrow is called the predicate.

The whole thing is called a statement.

Example Graphs in Pictures

Here's a graph composed of only one statement:

code

A subject node for one statement can be the object node of another statement:

code

Node and Arrow String Values

Now each node can have a string value:

It could have no string value; these nodes are called blank nodes.

It can also have a URI value; a URL is an example of an URI. In fact, every "node" with the same URI value is actually the same node (i.e. they are not separate nodes).

Objects can also be any arbitrary string - including the empty string ("") - but subjects can only be URI strings. These types of objects are called literals.

Note that every predicate MUST have a URI value; they can't be blank or arbitrary strings. It is also a VERY BAD IDEA to let predicates end with anything other than the characters a-z, A-Z, or 0-9 (that is, the end of the uri SHOULD be a valid xml name).

URI Examples

Here are some examples of URIs that could be a subjects, predicates, or objects. There is no way of knowing in isolation which is which. Each URI is DIFFERENT (even if they point to the same thing on the web).

4. file:///C:/Files/old/Video%20Game/mario.jpg

If the URIs are all predicates, then 1, 3, and 5 are formatted best. Here are some real-world examples of predicates:

Here are some example strings that could be literals:

Mary had a little lamb

HTML is cool

47

Note that predicates can be objects. Thus RDF is capable of representing second-order-logic statements.

Datatypes

Literals can also have a property called its datatype. For example, the literal string "47" is also an integer, so it could be described by a datatype that means integer. This isn't required, of course, and many literals have no defined datatype.

The datatype is like a predicate and can only be a URI. Here are some real-life examples of datatypes:

Review of Values of Nodes and Arrows

In review:

A subject can either have:

No string value.

URI string value.

A predicate MUST have ONLY a:

URI string value.

An object can either have:

No string value.

URI string value.

Literal string value.

A literal object can optional have:

A datatype

A datatype MUST have ONLY a:

URI string value.

Graphs as Databases

Graphs can be thought of as collections of statements, and statements can be thought of as subject/predicate/object tuples. I tend to think of statements as English sentences and graphs as paragraphs or stories. The analogy holds up pretty well.

Graphs can also be looked at as tables in a database. Here's an example (First set is the format):

code

code

code

code

code

Conclusion

There's a lot more, but these are the basics.


I soitenly hope this is ain't no secret attempt to resurrect Job Control Language (JCL). -Curly

It's not. The resources are not a machine resources needed but domain resources on offer. The resource description is metadata that describes data and services on offer by a domain. -- Martin Spamer.



See original on c2.com