April 17, 2006

A Bitty Browser and pagelinks2OPML URL Pipeline

I've just been going through a holiday break's backlogged feeds and noticed that John Tropea has been looking for ways of making pagelinks (i.e. the list of links referred to in a web page or blog post) useful and usable.

In a recent post he describes Bitty Browser - a little browser that goes on any Web page; it's like Picture-in-Picture for the Web - and asks how you could make pagelinks browsable in this browser.

Err - how about something like this, which uses an experimental pagelinks2opml web service (http://ouseful.open.ac.uk/pagelinks2opml.php?url=) to pull the links from an HTML page and shove them into an OPML feed, and then passes the URL of that feed to http://www.bitty.com/manual/?contentvalue= for display in a Bitty Browser - a URL pipeline in action, you might say :-)
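For anyone wanting to build that sort of pipeline URL by hand, here is a rough sketch in Python. The two endpoint prefixes are the ones quoted above; the function name pipeline_url and the decision to percent-encode the nested URL at each stage are my assumptions, not something either service is known to require.

```python
from urllib.parse import quote

# Endpoint prefixes as quoted in the post; both were experimental services.
PAGELINKS2OPML = "http://ouseful.open.ac.uk/pagelinks2opml.php?url="
BITTY = "http://www.bitty.com/manual/?contentvalue="


def pipeline_url(page_url):
    """Nest the page URL in the OPML service URL, then nest that in the Bitty URL."""
    # Assumption: each stage expects its argument percent-encoded, so the
    # innermost URL ends up encoded twice.
    opml_url = PAGELINKS2OPML + quote(page_url, safe="")
    return BITTY + quote(opml_url, safe="")


print(pipeline_url("http://example.com/post.html"))
```

The point is simply that each service takes the previous stage's URL as its argument, so the pipeline is nothing more than string nesting.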

This service is a web-based version of the pagelinks2opml bookmarklet. Unfortunately, its stability leaves something to be desired on some pages containing complex URLs (I need to tweak the settings, I think), but it's okay with pages containing simple links.

Is this useful? Hopefully - at least as proof of concept.

The above example shows how it can be used to embed pagelinks in a Bitty Browser.

More generally, by using something like <link rel='pagelinks' type='text/opml' href='http://ouseful.open.ac.uk/pagelinks2opml.php?url=THISPAGEURL' > in the page header, then in principle anyone picking up the page can autodetect the OPML file containing the pagelinks. (Indeed, since they already have the page URL, they could just feed it into any pagelinks2opml service without needing to autodetect the pagelinks reference.)
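As a sketch of what the autodetection might look like on the consuming side, the following Python uses only the standard library's HTML parser to pick out any <link> tag whose rel value includes 'pagelinks'. The class and function names are hypothetical, and rel='pagelinks' is only the convention floated above, not an established standard.

```python
from html.parser import HTMLParser


class PagelinksFinder(HTMLParser):
    """Collect href values from <link> tags whose rel includes 'pagelinks'."""

    def __init__(self):
        super().__init__()
        self.opml_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens; a missing value is None.
        rels = (attributes.get("rel") or "").lower().split()
        if "pagelinks" in rels and attributes.get("href"):
            self.opml_urls.append(attributes["href"])


def find_pagelinks_opml(html):
    """Return the OPML URLs advertised by the page, in document order."""
    finder = PagelinksFinder()
    finder.feed(html)
    return finder.opml_urls


sample = (
    "<html><head>"
    "<link rel='stylesheet' href='style.css'>"
    "<link rel='pagelinks' type='text/opml' "
    "href='http://ouseful.open.ac.uk/pagelinks2opml.php?url=http://example.com/'>"
    "</head><body></body></html>"
)
print(find_pagelinks_opml(sample))
```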

However, using the <link> tag may be useful if it also signifies a small amount of additional markup in a page. Why? Well, the simple bookmarklet (and the above server script) potentially scrape ALL the links on the page, including links in the header, navigation links etc., although links with the suffix .js or .css can easily be ignored.
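To make the "scrape everything" behaviour concrete, here is a minimal Python sketch that collects every href on a page and drops those whose path ends in .js or .css. This is not the bookmarklet or the server script itself, just an illustration of the same idea; the names LinkScraper and scrape_links are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

IGNORED_SUFFIXES = (".js", ".css")


class LinkScraper(HTMLParser):
    """Collect every href on the page, whatever tag it appears on."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if not href:
            return
        # Compare the path only, so "lib.js?v=2" is still recognised as a script.
        path = urlparse(href).path.lower()
        if path.endswith(IGNORED_SUFFIXES):
            return
        self.links.append(href)


def scrape_links(html):
    scraper = LinkScraper()
    scraper.feed(html)
    return scraper.links


page = (
    "<html><head><link rel='stylesheet' href='style.css'></head>"
    "<body><a href='/nav'>Home</a> "
    "<a href='http://example.com/post'>A post</a> "
    "<a href='lib.js?v=2'>script</a></body></html>"
)
print(scrape_links(page))
```

Note that the navigation link (/nav) still gets through, which is exactly the problem with scraping indiscriminately.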

Perhaps what would be better would be to enforce the convention that the user adds a particular class attribute to any links they explicitly want to be included in a pagelinks2OPML listing (e.g. class='pagelink').

This class could then be used to key the pagelinks2OPML bookmarklet/server script, so that ONLY the desired pagelinks would be included.
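A rough Python sketch of that class-keyed approach might look like the following. It keeps only anchors carrying class='pagelink' and writes them out as OPML outline elements. The exact OPML shape (version="2.0", with type="link" and url attributes on each outline) is my assumption about a reasonable output, not necessarily what the experimental service emits, and the names PagelinkCollector and pagelinks_to_opml are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class PagelinkCollector(HTMLParser):
    """Collect (url, text) pairs for <a> tags marked with class 'pagelink'."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the pagelink currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "pagelink" in classes and attributes.get("href"):
            self._href = attributes["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Fall back to the URL when the anchor has no visible text.
            text = "".join(self._text).strip() or self._href
            self.links.append((self._href, text))
            self._href = None


def pagelinks_to_opml(html, title="pagelinks"):
    collector = PagelinkCollector()
    collector.feed(html)
    opml = ET.Element("opml", version="2.0")  # assumed OPML version
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for url, text in collector.links:
        ET.SubElement(body, "outline", text=text, type="link", url=url)
    return ET.tostring(opml, encoding="unicode")


post = (
    "<p><a href='/nav'>Home</a> "
    "<a class='pagelink' href='http://example.com/a'>Example A</a> "
    "<a class='ext pagelink' href='http://example.com/b'></a></p>"
)
print(pagelinks_to_opml(post))
```

Here the unmarked navigation link is skipped, and only the two links the author flagged end up in the OPML.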

This approach does incur an overhead for the author - they have to additionally mark up those links they want included in the pagelinks2OPML file - but if they are savvy enough to be able to add <link> tags to a page, this is probably acceptable.

Posted by ajh59 at April 17, 2006 11:19 PM