November 25, 2005

Social Bookmarking Interoperability

With an increasing number of social bookmarking systems now available, to what extent would it be useful if it were possible to share bookmarks across these systems?

If social bookmarking systems are to become loosely coupled components in a distributed virtual learning environment (VLE), then conforming to common interface standards would allow different systems to be swapped in and out of such an environment without breaking other components.

One particular question I'd like to raise in this post - and to which I expect to return quite frequently as I work through and follow the technical issues - is whether or not there is any benefit in standardising some elements of social bookmarking applications.

What I am particularly interested in, in terms of payoff, is:
a) the ability to swap different social bookmarking systems in and out of a hybrid environment (e.g. a VLE or social software suite containing different applications - blogs, wikis, social bookmarking etc.);
b) the ability to create a distributed social bookmarking service.

So what elements might we consider standardising?

There are several that immediately come to mind:

  1. URL structure, for referring to user tags;
  2. RSS structure, for syndicating linkrolls;
  3. the structure of tag groups;
  4. the structure of user groups;
  5. open APIs;
  6. bookmarklets.

I'm not going to focus on all these issues in this post, but I would like to look at the first two. If you run a social bookmarking service, please feel free to chip in with comments and corrections. If I had access to a public wiki, I'd actually post there, because I think this topic would benefit from a community discussion, but for now I'll stick with the blog. I also think this sort of discussion could usefully add to the excellent comparative work done on social bookmarking by Torsten Rox.

I'm also going to limit myself in this post to comparing just two social bookmarking services - del.icio.us and Connotea. This will let me try to work out just what sorts of comparisons I want to make and how I'm going to make them. If I can sort out some sort of methodology (and find some time somewhere), perhaps I'll work on this as a proper research project... :-)

URL structure
Let's look at some examples, first:

del.icio.us navigation:
http://del.icio.us/{rss/}[tag/ OR username/ OR url/][tag {+tag} ? URL]
where [] denotes a required element and {} denotes an optional element. Emphasized elements are placeholders. ? denotes context sensitive.

For example,

Connotea navigation:
http://www.connotea.org/{rss/}[tag/ OR user/ OR uri/][tag{+tag} ? username ? URI]{?tag/tag{+tag}}

For example,

Comparison of del.icio.us and Connotea URLs

Reference by tag is handled the same way in both services: $DOMAIN$/tag/tag{+tag}

Reference by user is handled differently: del.icio.us goes straight to the username ($DOMAIN$/username) whereas Connotea interposes a user path element ($DOMAIN$/user/username).

Reference by URI/URL differs in the path element only ($DOMAIN$/url/ in del.icio.us, $DOMAIN$/uri/ in Connotea).

What was interesting - at least for the single URI/URL bookmark comparison I made - was that the encoding of the same page (Seven Ways of Using Social Bookmarking - ./005136.html) resulted in the same encoded reference in both services (7e16126ee5fe4f320852b731206c2eac).
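
The encoded reference is 32 hexadecimal characters long, which is the length of an MD5 digest. Whether either service really derives the reference by hashing the raw URL string with MD5 is an assumption on my part, not something confirmed above; the function name `url_reference` and the example URL are purely illustrative. Under that assumption, a minimal sketch of how two services could arrive at the same reference independently:

```python
import hashlib


def url_reference(url: str) -> str:
    """Return a 32-character hex reference for a URL.

    ASSUMPTION: the reference is an MD5 digest of the UTF-8 URL string.
    """
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Illustrative URL only; this is not the page bookmarked in the post.
ref = url_reference("http://example.org/some/page.html")
print(ref, len(ref))
```

If both services hash the same string in the same way, any client can compute the reference for a page without asking either service, which is what would make cross-service lookups by URL cheap.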

Reference by user and tag is, in my opinion, better managed in del.icio.us ($DOMAIN$/username/tag) than in Connotea ($DOMAIN$/user/username/tag/tag).

The differences between the URL naming schemes are minor and, if consistency were really necessary, could be addressed using URL rewrites.
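
As a rough sketch of what such a rewrite might look like, the following maps Connotea-style user paths onto del.icio.us-style ones. The function name `to_delicious_style` and the rule itself are hypothetical, based only on the path patterns compared above; a real deployment would more likely do this in the web server's rewrite configuration.

```python
import re

# Hypothetical rule: /user/<name>[/tag/<tags>] becomes /<name>[/<tags>].
_USER_PATH = re.compile(r"^/user/(?P<user>[^/]+)(?:/tag/(?P<tags>[^/]+))?/?$")


def to_delicious_style(path: str) -> str:
    """Rewrite a Connotea-style user path; return any other path unchanged."""
    match = _USER_PATH.match(path)
    if match is None:
        # e.g. /tag/<tags>, which has the same shape in both services
        return path
    user, tags = match.group("user"), match.group("tags")
    return f"/{user}/{tags}" if tags else f"/{user}"


print(to_delicious_style("/user/tonyh"))
print(to_delicious_style("/user/tonyh/tag/social+bookmarking"))
print(to_delicious_style("/tag/social"))
```

Paths the rule does not recognise (including /user/<name>/uri/<reference>) are passed through untouched, so the rewrite can sit in front of a service without breaking anything it does not understand.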

Finally, both services provide an RSS feed by the simple insertion of an rss path element immediately after the domain ($DOMAIN$/rss/etc.).
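
Because the rule is the same in both services, a single helper can turn a page URL into its feed URL. This is a minimal sketch; the function name `rss_url` is my own, and it assumes the rss element always goes directly after the domain, as described above.

```python
from urllib.parse import urlsplit, urlunsplit


def rss_url(page_url: str) -> str:
    """Insert an rss path element immediately after the domain."""
    parts = urlsplit(page_url)
    return urlunsplit(parts._replace(path="/rss" + parts.path))


print(rss_url("http://del.icio.us/tag/social"))
print(rss_url("http://www.connotea.org/user/tonyh"))
```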

Which leads nicely into the next comparison - how alike are the RSS feeds from del.icio.us and Connotea?

RSS feed structure

First, a snippet from del.icio.us:
<item rdf:about="http://arxiv.org/ftp/cs/papers/0508/0508082.pdf">
<title>The Structure of Collaborative Tagging Systems</title>
<link>http://arxiv.org/ftp/cs/papers/0508/0508082.pdf</link>
<dc:creator>psychemedia</dc:creator>
<dc:date>2005-11-25T16:21:39Z</dc:date>
<dc:subject>bookmarking dynamics social</dc:subject>
<taxo:topics>
<rdf:Bag>
<rdf:li resource="http://del.icio.us/tag/dynamics" />
<rdf:li resource="http://del.icio.us/tag/social" />
<rdf:li resource="http://del.icio.us/tag/bookmarking" />
</rdf:Bag>
</taxo:topics>
</item>

And one from Connotea:
<item rdf:about="http://www.connotea.org/user/tonyh/uri/a59e3122ddb15bd4c1f64bee65b78662">
<title>The Structure of Collaborative Tagging Systems</title>
<link>http://www.connotea.org/user/tonyh/uri/a59e3122ddb15bd4c1f64bee65b78662</link>
<description>Posted by tonyh to bookmarking dynamics Social on Fri Nov 25 2005</description>
<dc:creator>tonyh</dc:creator>
<dc:date>2005-11-25T16:21:05Z</dc:date>
<dc:subject>bookmarking</dc:subject>
<dc:subject>dynamics</dc:subject>
<dc:subject>Social</dc:subject>
<slash:comments>0</slash:comments>
...
<connotea:uri>
<dcterms:URI rdf:about="http://arxiv.org/ftp/cs/papers/0508/0508082.pdf">
<dc:title>0508082.pdf (application/pdf Object)</dc:title>
</dcterms:URI>
</connotea:uri>
<annotate:reference rdf:resource="http://www.connotea.org/comments/uri/a59e3122ddb15bd4c1f64bee65b78662" />
</item>

Let's go through them a line at a time:

D: <item rdf:about="http://arxiv.org/ftp/cs/papers/0508/0508082.pdf">
C: <item rdf:about="http://www.connotea.org/user/tonyh/uri/a59e3122ddb15bd4c1f64bee65b78662">
del.icio.us makes the post about the actual resource, Connotea makes it about the Connotea reference to the resource.

D: <title>The Structure of Collaborative Tagging Systems</title>
C: <title>The Structure of Collaborative Tagging Systems</title>
Coherence :-)

D: <link>http://arxiv.org/ftp/cs/papers/0508/0508082.pdf</link>
C: <link>http://www.connotea.org/user/tonyh/uri/a59e3122ddb15bd4c1f64bee65b78662</link>
Again - a difference in what we are actually referring to...

Connotea then has a description tag that contains the same info as the following Dublin Core tags:
C: <description>Posted by tonyh to bookmarking dynamics Social on Fri Nov 25 2005</description>

D: <dc:creator>psychemedia</dc:creator>
D: <dc:date>2005-11-25T16:21:39Z</dc:date>
D: <dc:subject>bookmarking dynamics social</dc:subject>
C: <dc:creator>tonyh</dc:creator>
C: <dc:date>2005-11-25T16:21:05Z</dc:date>
C: <dc:subject>bookmarking</dc:subject>
C: <dc:subject>dynamics</dc:subject>
C: <dc:subject>Social</dc:subject>

The main difference here is in the way the subject tag is used. del.icio.us space-separates the tags in a single subject element, whereas Connotea uses a separate subject element for each tag, which it is absolutely allowed to do. The Connotea approach is a little cleaner, I think.
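
A consumer that wants to treat both feeds alike has to normalise the two conventions. The sketch below reads every dc:subject element and splits its text on whitespace; the function name `item_tags` and the cut-down sample items are mine, and the approach assumes that no individual tag contains a space, which would not hold for any service that allows multi-word tags.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"


def item_tags(item: ET.Element) -> list:
    """Collect tags from every dc:subject, splitting each on whitespace.

    ASSUMPTION: individual tags never contain spaces.
    """
    tags = []
    for subject in item.iter(DC + "subject"):
        tags.extend((subject.text or "").split())
    return tags


# Cut-down items holding only the dc:subject lines from the snippets above.
WRAP = '<item xmlns:dc="http://purl.org/dc/elements/1.1/">{}</item>'
delicious = ET.fromstring(WRAP.format(
    "<dc:subject>bookmarking dynamics social</dc:subject>"))
connotea = ET.fromstring(WRAP.format(
    "<dc:subject>bookmarking</dc:subject>"
    "<dc:subject>dynamics</dc:subject>"
    "<dc:subject>Social</dc:subject>"))

print(item_tags(delicious))
print(item_tags(connotea))
```

Note that the capitalisation of "Social" survives from the Connotea feed, so a consumer that wants to merge tags across services would also need to decide whether tags are case-sensitive.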

D: <taxo:topics>
D: <rdf:Bag>
D: <rdf:li resource="http://del.icio.us/tag/dynamics" />
D: <rdf:li resource="http://del.icio.us/tag/social" />
D: <rdf:li resource="http://del.icio.us/tag/bookmarking" />
D: </rdf:Bag>
D: </taxo:topics>
Here, del.icio.us provides an alternative way of accessing the tags by exploiting the RSS Taxonomy namespace.

C: <connotea:uri>
C: <dcterms:URI rdf:about="http://arxiv.org/ftp/cs/papers/0508/0508082.pdf">
C: <dc:title>0508082.pdf (application/pdf Object)</dc:title>
C: </dcterms:URI>
C: </connotea:uri>
In contrast, Connotea offers an additional alternative way of accessing the URI of the resource, using the connotea namespace (although bear in mind that the <link> reference in Connotea was to the Connotea reference of that URI). The Connotea RDF feed also references the Taxonomy namespace, although it doesn't use it.

The use of the Dublin Core subject field is increasingly accepted as the correct place to record tag information, although as the above demonstrates it is still possible to use this tag inconsistently. The Connotea approach does seem clearer, as it allows for more direct manipulation (using the DOM) of the user's tags.

The other major difference is in the use of the <link> field. The del.icio.us approach, of referring directly to the bookmarked URL, offers an immediate link to the resource. The Connotea approach, however, uses a level of indirection through the Connotea reference for that resource, which allows the bookmarking system to track click-throughs more comprehensively.
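
For a consumer that just wants the address of the bookmarked resource, the two conventions can be reconciled by preferring the nested dcterms:URI element where it exists and falling back to <link> otherwise. This is only a sketch: the function name `resource_url` is mine, the sample items are cut down from the snippets above, and the connotea namespace URI used here is a placeholder rather than the one the real feed declares.

```python
import xml.etree.ElementTree as ET

RSS = "{http://purl.org/rss/1.0/}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DCTERMS = "{http://purl.org/dc/terms/}"


def resource_url(item):
    """Prefer a nested dcterms:URI (Connotea style), else fall back to <link>."""
    uri = item.find(".//" + DCTERMS + "URI")
    if uri is not None:
        return uri.get(RDF + "about")
    return item.findtext(RSS + "link")


# The connotea namespace URI below is a PLACEHOLDER, not the real one.
NS = ('xmlns="http://purl.org/rss/1.0/" '
      'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
      'xmlns:dcterms="http://purl.org/dc/terms/" '
      'xmlns:connotea="http://example.org/connotea#"')

delicious = ET.fromstring(
    f"<item {NS}>"
    "<link>http://arxiv.org/ftp/cs/papers/0508/0508082.pdf</link>"
    "</item>")
connotea = ET.fromstring(
    f"<item {NS}>"
    "<link>http://www.connotea.org/user/tonyh/uri/"
    "a59e3122ddb15bd4c1f64bee65b78662</link>"
    "<connotea:uri>"
    '<dcterms:URI rdf:about="http://arxiv.org/ftp/cs/papers/0508/0508082.pdf"/>'
    "</connotea:uri>"
    "</item>")

print(resource_url(delicious))
print(resource_url(connotea))
```

Both calls yield the arXiv address, so the two feeds can be merged on the resource rather than on each service's own reference to it.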

The indirection afforded by the Connotea approach also allows alternative references for the resource to be used invisibly. For example, where a resource is stored in several different locations, there is a many-to-one relationship between the locations at which the resource can be found and the resource itself. The Connotea approach, of encouraging users to go via the Connotea representation of the resource location, potentially accommodates this many-to-one mapping by masking the actual address from which the resource is served.

Posted by ajh59 at November 25, 2005 03:50 PM
Comments

Hi Tony,

Regarding URL structure: this is one case where you should not use del.icio.us as an example of good URL structure. Joshua himself once admitted he made some mistakes with it. If you think about that URL structure and think ahead a little bit, you'll quickly see the problem. I didn't examine Connotea's URL structure, but here is what Simpy supports: http://blog.simpy.com/blojsom/blog/2005/11/04/Sexy-URIs.html (sorry, couldn't get the blog sw to hyperlink this).

Posted by: Otis Gospodnetic at November 29, 2005 04:29 PM