Thursday, March 22, 2007

Semantic Web - Some Responses

There were numerous good responses to my post from a couple of days ago. In this post I offer some replies, framed around the argument in this post from David Norheim:


OK, let's deal with these in order...

On the technical side

* First of all W3C RDF does not require that everyone adopts the same vocabulary for a domain.

Quite right. That is the huge advantage RDF has over plain XML. However, in order for RDF to be useful, different entities must adopt *some* vocabulary.

For example, if you are dealing in furniture, you need to define a furniture vocabulary, so you can use tags like furn:name and furn:size. For each of these tags, too, there may be a canonical vocabulary. For example, furn:name must be one of {couch, table, chair, etc.}.

Now if every furniture business uses furn:name as planned, there's no problem. But what happens is that each enterprise defines the furniture domain differently. Some people want to include 'sofa' as a value in furn:name, others want to include 'settee'. Each choice confers business advantage on one or the other. So no vocabulary is ever defined, or worse, you get conflicting vocabularies using the same name in different ways.
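
To make the conflict concrete, here is a minimal sketch in Python. The two vendor vocabularies and the `validate` helper are hypothetical, invented for illustration, and are not drawn from any real furniture standard.

```python
# Hypothetical controlled vocabularies: each vendor defines the allowed
# values of furn:name differently. None of this comes from a real standard.
VENDOR_A_NAMES = {"couch", "table", "chair"}
VENDOR_B_NAMES = {"sofa", "settee", "table", "chair"}


def validate(record, allowed_names):
    """Return True if the record's furn:name is in the given vocabulary."""
    return record.get("furn:name") in allowed_names


# A record published by vendor B, using vendor B's preferred term.
record = {"furn:name": "sofa", "furn:size": "3-seat"}

accepted_by_b = validate(record, VENDOR_B_NAMES)
accepted_by_a = validate(record, VENDOR_A_NAMES)

# Terms both vendors agree on, and terms only one of them recognizes.
shared = VENDOR_A_NAMES & VENDOR_B_NAMES
disputed = VENDOR_A_NAMES ^ VENDOR_B_NAMES

print("accepted by B:", accepted_by_b)
print("accepted by A:", accepted_by_a)
print("shared terms:", sorted(shared))
print("disputed terms:", sorted(disputed))
```

Both vendors use the same property name, yet a record that is valid for one is rejected by the other. Nothing in the syntax signals the disagreement; it only shows up when one party tries to read the other's data.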

Every day in my inbox I see more examples of this. ODRL vs XrML. The various IM specifications. DC vs LOM vs IEEE-LOM. RSS vs RDF vs Atom. And more.

* RDF makes it trivial to publish data in which you mix vocabularies, making statements about a person, for example, using terms drawn from FOAF, Dublin Core and others

Quite right. But what has tended to happen is that people prefer to use only one vocabulary. They don't like mixing and matching.
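
As a rough illustration of what mixing looks like, here is a sketch in Python that holds a few statements about one person as plain (subject, predicate, object) tuples. The FOAF and Dublin Core namespace URIs are the published ones; the person, the values and the `vocabulary_of` helper are assumptions made up for the example, and no RDF library is used.

```python
from collections import Counter

# Published namespace URIs for the two vocabularies.
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical subject and values, for illustration only.
person = "http://example.org/people/alice"

# One description, with predicates drawn from two vocabularies.
triples = [
    (person, FOAF + "name", "Alice Example"),
    (person, FOAF + "homepage", "http://example.org/~alice"),
    (person, DC + "description", "Writes about metadata standards"),
]


def vocabulary_of(predicate):
    """Return the namespace a predicate belongs to, or None if unknown."""
    for namespace in (FOAF, DC):
        if predicate.startswith(namespace):
            return namespace
    return None


# Count how many statements come from each vocabulary.
usage = Counter(vocabulary_of(predicate) for _, predicate, _ in triples)

for namespace, count in usage.items():
    print(namespace, count)
```

Writing the mixed description is the easy half. A consumer still has to recognize every namespace that appears in it before any of the statements are usable.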

And in any case, this doesn't solve the problem. People don't put two versions of the 'title' element in a single document, they pick one. You'd never see an RSS-type document with:
rss:title
atom:title
dc:title
So we need to know (via crosswalks) that what atom means by title is the same as what DC means by title.

Except, of course, it doesn't. Even dc:title means different things to different people. I have in my inbox right now an email in which the proposed vocabulary for DC elements is being rejected because they are language-based, not concept-based. Now you can muddy your way through such arguments. But they are endless, and nobody ever compromises.
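
For reference, a crosswalk of the kind mentioned above is mechanically simple. Here is a minimal sketch in Python; the mapping table, the `normalize` helper and the sample items are hypothetical, and the prefixed names follow this post's shorthand, not the exact element names of each format.

```python
# Hypothetical crosswalk: source element name -> element treated as equivalent.
CROSSWALK = {
    "rss:title": "dc:title",
    "atom:title": "dc:title",
}


def normalize(item, crosswalk):
    """Rename keys per the crosswalk; unmapped keys pass through unchanged."""
    return {crosswalk.get(key, key): value for key, value in item.items()}


# Two made-up feed items describing the same post in different vocabularies.
rss_item = {
    "rss:title": "Semantic Web - Some Responses",
    "rss:link": "http://example.org/post",
}
atom_entry = {
    "atom:title": "Semantic Web - Some Responses",
    "atom:id": "tag:example.org,2007:post",
}

normalized_rss = normalize(rss_item, CROSSWALK)
normalized_atom = normalize(atom_entry, CROSSWALK)

print(normalized_rss)
print(normalized_atom)
```

The renaming is the easy part. The table itself asserts that the three elements mean the same thing, and that assertion is exactly what the parties cannot agree on.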

* RDF is showing increasing adoption, showing up in products by Oracle, Adobe and Microsoft, for example.

The links in this point all point back to my post.

Anyhow, I haven't seen any evidence of increasing adoption.

Putting on my best movie voice: But you see Al, it isn't the production of RDF that's the issue. Anybody can produce RDF. It's the *reading* of RDF that's the issue. And nobody reads each other's RDF.

* RSS, ATOM and iCal are examples for data standards jointly supported by different companies - there’s just no reason to assume that this list cannot grow.

Neither RSS nor Atom is RDF (except for RSS 1.0, which has a usage of about 3 percent). I also posted figures on my website just this week showing that iCal usage is something like 7 percent. iCal isn't RDF either - hence the need for a converter http://torrez.us/ics2rdf/ and the resulting proliferation of RDF versions of iCal, none of them official. Meanwhile, neither Google nor Outlook is based on iCal.
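
The difference is visible at the root element of a feed. The following Python sketch, using only the standard library, classifies a feed that way. The RDF and Atom namespace URIs are the published ones; the `feed_kind` function, its return labels and the stripped-down sample documents are assumptions for illustration and are not complete, valid feeds.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ATOM_NS = "http://www.w3.org/2005/Atom"


def feed_kind(xml_text):
    """Classify a feed by its root element (labels are made up here)."""
    root = ET.fromstring(xml_text)
    if root.tag == "{" + RDF_NS + "}RDF":
        return "rss-1.0-rdf"   # RSS 1.0 is wrapped in rdf:RDF
    if root.tag == "rss":
        return "rss-0.9x-2.0"  # plain XML, not RDF
    if root.tag == "{" + ATOM_NS + "}feed":
        return "atom"          # plain XML, not RDF
    return "unknown"


# Stripped-down samples: just enough markup to show the root element.
RSS1_SAMPLE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns="http://purl.org/rss/1.0/"><channel/></rdf:RDF>'
)
RSS2_SAMPLE = '<rss version="2.0"><channel/></rss>'
ATOM_SAMPLE = '<feed xmlns="http://www.w3.org/2005/Atom"/>'

kinds = [feed_kind(sample) for sample in (RSS1_SAMPLE, RSS2_SAMPLE, ATOM_SAMPLE)]
print(kinds)
```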

Bottom line:

Technically, we *could* all agree on standards and vocabularies. But, empirically (that is, looking at actual technical implementations), we *don't*. And that is really what matters, isn't it?


* People are looking for incentives to share. Why do you always have to look at the big corporations? Governments (at least in Europe) have self interest in publishing (semantically clear) information to make its own government more efficient and its customers (corporations and people) more competitive. Expect more from them. Small companies have incentive to bring down the bigger ones.

In general, you have an incentive to share when (a) you're the smaller fish, and (b) your intention is to provide a public service.

Incentive (a) doesn't help us a whole lot, because, mostly, big fish beat small fish. Yes, there are exceptions - Google's rise from nothing being the most notable. But you can count them on one hand. Mostly, when small fish begin to get big, they are swallowed by big fish. Like Flickr. Then any commitment they had toward sharing becomes a commitment to Yahoo's version of the standards.

Incentive (b) works in a climate where there is a robust public service, either provided by government, or provided voluntarily by the general public (eg. open source).

The robustness of the public service is being challenged these days. Not only are companies pressing to force government to withdraw from providing services (e.g., the anti-BBC lawsuits and the anti-community networking lawsuits), but there are also pressures within government to tailor services to the needs of companies. So, for example, one company would have privileged access to government information, and then it doesn't matter what format it's in. Yes, there is a spirited campaign to oppose this, but the successes of that campaign (e.g., against CSPAN's recent declaration of copyright) are rare enough that they're news.

Meanwhile, the general public is volunteering itself away from the semantic web and toward things like Web 2.0, AJAX, JSON and a host of patchwork solutions. The reason for this is that the semantic web (mostly at the request of business, ironically) has become so bloated it's too cumbersome to use. And also, the businesses (and academics and governments) that are developing it aren't using it.


* Businesses do cooperate, when they see it as being in their own interests. In fact commerce can only function when businesses work together at their interfaces. Money is a shared vocabulary with a set of standard protocols. Kendal Clark elaborates on this in a separate post

Yes, business cooperates. But:
- these instances are contingent on continued mutual benefit. Companies can and do pull the plug on each other.
- a lot of this cooperation is in order to stiff some third company, which will be locked out. If you subscribe to the e-learning trade press, you'll see an endless stream of 'strategic alliances'. That means X is aligning with Y in order to prevent Z from doing business with Y.
- they aren't honest about it. They may appear to be cooperating, but then at the last minute they'll pull out their submarine patents and torpedo the works, trying to lay exclusive claim to a domain built by a number of partners. OASIS exists just to make this possible (because W3C wouldn't allow it, which is why the businesses don't really support W3C's efforts).

* Another argument comes from Aditya Pandit where he argues that the innovation and adoption comes not from the large corporations (The Big Players behave to retain the advantage that they have) but from start-ups (ref. MySpace, YouTube, Yahoo, Google). So looking at adoption by the big players is really incorrect. This should be common knowledge from innovators and startups.

Right. The big players don't innovate.

But they steal.

Any technology company that starts from scratch in today's environment is copied almost as soon as it becomes successful (or purchased outright, but that's a separate story).

In order to be successful, you have to be successful so quickly (or so underground) that the bigs have no choice but to go along. And even then, they'll do it only reluctantly, and they'll try to subvert it.

It's way too late for that to happen with RDF. No start-up is going to come along and make (certain flavours of) RDF the standard. There's simply too much instant competition from the bigs.

And besides - if you out-innovate them, and out-grow them, they'll just slap a bogus patent claim on you.

* I think what Downes says is colored by a very skewed “free market”-American (I know he is Canadian)

What I am skewed by is the rampant avarice and dishonesty shown by the business sector (it also exists in the public sector, I'm not letting them off the hook, but there's less in the public sector, and the public sector is smaller).

If you read my writings, you would see that, with some few exceptions, I do support free markets (and the exceptions are the well-defined cases of market failures, generally caused by shortages or excesses of production).

But it's important to understand that "pro-business" and "pro-free markets" are not synonyms. The first instinct of any business is to attempt to subvert the free market in order to obtain an edge or, ideally, a monopoly.

That is why they play games with the standards - they are trying to subvert the process to their own advantage. Often, this involves subverting standards committees (like, say, ISO) to make their own proprietary technology the standard (like, say, MPEG-REL).

* ... view that the market is best off with competing standards, ref CDMA/GSM, the banking system etc. And let the companies compete freely.

Both the telephone industry and the banking industry had to be regulated into submission before they would share.

Even now, mobile phones that work in Canada won't work in Europe. And just last week the legislation forcing phone-number mobility came into force. Meanwhile, we are looking at problems like net neutrality - as I speak, the phone companies are squeezing Skype bandwidth.

As for the banking industry, it wasn't until a few years ago that the banks would allow Credit Unions into the ATM networks. Moreover, there are competing interbank networks - Cirrus and Plus, for example, which is why my bank card doesn't always work in bank machines. Meanwhile, banks had a monopoly on cash dispensers until legislated into opening the standard and allowing 'white machines' to be installed.

Governments know well what would happen were the telephone companies and the banks allowed to compete freely. That is why they are two of the most heavily regulated sectors there are.


* Adoption of standards do take time…

Yes, it does take time. But after a certain amount of time, you need to realize, they're not going to do this voluntarily. At a certain point, if you're not going to legislate them into cooperation (and I am *not* advocating that in this instance, for numerous reasons), then you have to pull the plug.

1 comment:

  1. Hi Stephen,

    You made a few mistakes on the iCal front:

    - iCal is the Apple desktop application

    - iCalendar is the general name of the IETF calendar spec

    - David White's study you quoted was asking "Do you use iCal?", not "Do you use a calendaring system that uses the iCalendar spec?" So the adoption figure looks about right; i.e. about 90% of Mac users, and 0% of everyone else

    - Google Calendar DOES use iCalendar (I use both Google and Apple iCal, and use iCalendar to synch them)

    - Yahoo Calendar also uses iCalendar

    - Windows Vista now includes Windows Calendar, based on... yep, iCalendar

    - So actually, the figures suggest that use of iCalendar-based systems now exceeds use of non-standards-based calendars (Outlook), and that most of that usage is of web-based services

    Not at all related to the RDF argument really*, but thought I'd better point it out as I saw a similar statement in another post you made lately.

    * Except to say that the iCalendar-based systems I've seen, including Yahoo, Google, Apple, etc., all use the very old-fashioned semicolon and line-break delimited file format - none of them use any of the RDF equivalents.

    S

I welcome your comments - I'm really sorry about the moderation, but Google's filters are basically ineffective.