March 11, 2007

Serial Web Feeds via Yahoo Pipes

Todd Slater over at Big IDEA has just started developing a podcast serialiser (and here):

What if an instructor wants to podcast (subscription model) audio files? He has a few options. He can tell students to subscribe to his blog and then post audio files there on a weekly basis. This is a great option for the first quarter, but what does he do in subsequent quarters? All of the previous posts will be "old" and new students won't see them when they subscribe to the blog in a subsequent quarter. What he needs is something that will update the RSS feed based on the quarter's start date and a determined update interval.

(Todd's app is developing apace, so it's worth checking his blog for more recent posts, such as this one. I'm also with you on the abuse of the "podcast" term, Todd, and keep having to fight that battle here...).

This has a lot in common with the ideas I've been working up for OpenLearn_daily, and has prompted me into a bit of a tinker today.

The original plan was to subscribe to a feed with some user-specific info, like the time of subscription and the way in which they wanted the content delivered. From this info, and the current time stamp whenever the feed is polled, an appropriate number of feed items would be delivered to the user.

Stephen Downes commented in a personal email (I think) about the resource requirements/server load for popular feeds, and I've been in several minds about the best way to count out items whenever a personalised feed is polled....

So, I've taken the easy way out, and rather than doing all the work myself (not much work in fact, but anyway....) I've handed the 'count the appropriate number of feed items' task over to Yahoo Pipes.

This Feed Serialiser Pipe takes in a couple of arguments - a feed URL and a digit - and simply counts out the corresponding number of feed items.
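
As a rough illustration of what 'counting out' items amounts to, here is a minimal Python sketch. It is not the Pipe itself (that is built in the Pipes editor); the function name truncate_feed and the assumption that the source is an RSS 2.0 feed are mine, for illustration only.

```python
import xml.etree.ElementTree as ET


def truncate_feed(rss_xml: str, n: int) -> str:
    """Return the RSS document with only its first n <item> elements kept.

    Hypothetical helper: assumes an RSS 2.0 layout (rss > channel > item).
    """
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 feed: no <channel> element")

    items = channel.findall("item")
    # Drop everything after the first n items; the order of the rest is unchanged.
    for item in items[max(n, 0):]:
        channel.remove(item)

    return ET.tostring(root, encoding="unicode")
```

Calling truncate_feed(feed_text, 3) on a ten-item feed would give back the same feed with just its first three items, which is roughly what the Pipe does with its 'feed URL' and 'digit' arguments.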

A form I've been doodling with over the last couple of weeks (and that seriously needs tidying!) provides the interface that allows the user to customise the delivery schedule and generates an appropriate, customised URL to subscribe to: Serialised feed subscription form.

The form lets you enter any feed URL (err, maybe....?;-) and a schedule according to which you'd like the feed items to be delivered. It then generates a URL for you to subscribe to which will deliver items as and when required. (At the moment, the items are delivered in narrative order, rather than reverse chronological order; when I get a chance, I'll add a switch that lets you select the ordering of your choice).

Some logic located at the customised feed URL then works out how many items are required in the feed whenever it's polled and calls the Yahoo Pipe with the appropriate arguments.
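
The counting step can be sketched along the lines of the pseudo-code given in the comments below. This is a minimal Python version rather than the code actually running on the server; the function name items_due, the use of seconds as the time unit, and the default values are assumptions made for the example.

```python
def items_due(start_time, now, period, bundle_size=1, init_size=1):
    """Number of feed items to deliver when the feed is polled at `now`.

    start_time  -- subscription time carried in the feed URL (seconds)
    now         -- time of the current poll (seconds)
    period      -- update interval between deliveries (seconds)
    bundle_size -- items released per elapsed period (assumed default: 1)
    init_size   -- items available immediately on subscribing (assumed default: 1)
    """
    if period <= 0:
        raise ValueError("period must be positive")

    # Guard against clock skew: a poll 'before' the start counts as no time elapsed.
    elapsed = max(now - start_time, 0)

    # Whole periods elapsed, truncated like the (int) cast in the pseudo-code.
    periods = int(elapsed // period)

    return periods * bundle_size + init_size
```

For example, with a daily schedule (period of 86400 seconds), two items per day and one item up front, a poll just over three days after subscribing asks for 3 * 2 + 1 = 7 items, and that number is what gets passed to the Pipe as its 'digit' argument.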

The server's still got work to do - quite a bit in fact, as it's acting as a proxy for the Yahoo Pipe feed and doesn't make use of caching (oops!) - but I'm hoping that as time goes by the Pipes folks offer more and more functionality and I can pass more and more of the logic over to them and hopefully end up with a purely Pipes-based solution :-)

(One thing I'd like to see in the Pipes toolbox, which should be simple, is a 'reverse the order of items in a feed' block for items that have no timestamp. This isn't anywhere near as complex as the current Sort block, so any chance of it, guys? [SOLVED, I think... just using Sort in descending order on pubdate works (I think), even though there is no pubdate... The question now is: how do I put logic into the Pipe, so that I can supply a user argument and reverse the sort if a 'reverseOrder' option is set by the user?] :-)

Feel free to try the service out, and mail me - or comment - with any problems.

Posted by ajh59 at March 11, 2007 04:36 PM
Comments

Nice, I hadn't thought of using Pipes. Per your feed subscription form, I have only glanced at the code, but how are you scheduling the updates? Do you test the time it's being hit, and compare that to the start time, and then deliver the appropriate number of items? I see t is the time and p is the update interval. That opens up some possibilities...

Posted by: todd at March 12, 2007 01:02 PM

Re the update scheduling, I put the subscription time in the URL, and then whenever the feed is polled it is compared to the current time.

The original pseudo-code was:

timeSinceStart = timeNow - startTime;  // startTime is the time in the feed URL
c = timeSinceStart / period;           // number of update periods elapsed
n = (int)c * bundleSize + initSize;    // n is the number of items to return in the feed

Posted by: Tony at March 12, 2007 01:28 PM