Tuesday, October 23, 2012

Aggregation Workflow in gRSShopper

A few points:

First, on workflow, which is the topic of this post. Here's how gRSShopper currently works. It's more detailed than suggested in the post, but contains the same basic idea, with enhancements added through four years of experience running cMOOCs.

For the participant:

1. Go to the 'add feed' page. Eg. http://edfuture.mooc.ca/new_feed.htm

2. If you are not logged in with a user account, log in - click on the link provided on the page, go to the login screen, then click on 'return to where you were' to return to the 'add feed' page.

3. If you do not have a user account, create one - click on the same login link provided on the page, select the registration option, supply the information, and click on 'return to where you were'. (I have found registration to be absolutely necessary; otherwise you get flooded with a slew of marketing feeds.)

4. Fill in the feed information on the page - feed name, URL, optional description. Submit, and you're done.

Note: feed information is updated from the RSS (or Atom) file. To edit feed information, edit it at the source; the change is picked up automatically in gRSShopper.
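To illustrate the metadata refresh (a sketch only - gRSShopper itself is written in Perl, and the function name and RSS-only handling here are my assumptions, not its actual code): pulling the channel title and description from an RSS 2.0 document might look like this.

```python
import xml.etree.ElementTree as ET

def refresh_feed_metadata(rss_text):
    """Extract the channel title and description from an RSS 2.0 document.

    Hypothetical sketch: the real harvester also handles Atom and many
    malformed feeds; this covers only the well-formed RSS 2.0 case.
    """
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    if channel is None:
        return None  # not RSS 2.0 (could be Atom -- not handled here)
    return {
        "title": channel.findtext("title", default=""),
        "description": channel.findtext("description", default=""),
    }
```

Because the site always re-reads these fields from the feed itself, there is nothing to edit on the aggregator side.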

For the feed administrator:

1. Submitted feeds are set to 'provisional' status and won't be harvested until reviewed. List the feeds, then click 'approve' to approve a feed. Optionally, run a test harvest for the feed. A significant number of submissions fail - that is why the 'approve' step is required. Note that the harvester will correct for the most common feed errors: it attempts autodetection if a blog URL is entered rather than a feed URL, it adds http:// (or replaces feed://) as needed, and it corrects for various feed formatting problems at the source.
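A minimal sketch of the URL clean-up step described above (Python for illustration; the real harvester is Perl, and the function name is mine):

```python
def normalize_feed_url(url):
    """Apply the common corrections described above: strip whitespace,
    rewrite feed:// to http://, and add a missing scheme.

    Autodetecting the feed URL from a blog page's HTML (via its
    <link rel="alternate"> tag) would require a fetch and is omitted.
    """
    url = url.strip()
    if url.startswith("feed://"):
        url = "http://" + url[len("feed://"):]
    if not (url.startswith("http://") or url.startswith("https://")):
        url = "http://" + url
    return url
```

Even with corrections like these, some submissions point at pages with no discoverable feed at all, which is why a human 'approve' step remains necessary.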

For the page designer:

1. Items harvested from feeds can be displayed on a web page with a single 'keyword' command; e.g., this command will include items harvested in the last 24 hours (truncating them at 500 characters): keyword db=link;expires=24;format=summary;truncate=500;all

2. Pages may optionally be published as flat HTML, so they do not need to be generated dynamically from source each time a person lands on the page (this gives a significant speed advantage); set 'autopublish' to a desired interval (typically once an hour).

3. Pages may also be turned into email newsletters; select that option and set an auto-send time (specify days of the week or month, and a time of day). When pages are set as email newsletters, they appear on the 'email newsletter' list. Check the 'default' button if you would like new registrants to be automatically subscribed.
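The three page-designer steps above can be sketched in Python (illustration only - gRSShopper itself is Perl, and all function names, field names, and the epoch-seconds timestamps below are my assumptions, not its actual API):

```python
import datetime

def parse_keyword_command(command):
    """Split a 'keyword' command such as
    'db=link;expires=24;format=summary;truncate=500;all'
    into named parameters and bare flags."""
    params, flags = {}, set()
    for part in command.split(";"):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            key, value = part.split("=", 1)
            params[key] = value
        else:
            flags.add(part)
    return params, flags

def autopublish_due(last_published, interval_hours, now):
    """Return True when a page's flat-HTML copy is stale and should be
    regenerated. Timestamps are Unix epoch seconds; interval_hours is
    the page's 'autopublish' setting (e.g. 1 for once an hour)."""
    return (now - last_published) >= interval_hours * 3600

def newsletter_due(schedule, now):
    """Decide whether a newsletter page should be auto-sent at 'now'.
    schedule is a dict like {"days_of_week": {"Mon", "Thu"}, "hour": 6};
    day-of-month scheduling, also mentioned above, is omitted."""
    return (now.strftime("%a") in schedule["days_of_week"]
            and now.hour == schedule["hour"])
```

For example, parsing the command shown above yields the parameters db=link, expires=24, format=summary, truncate=500, plus the bare flag 'all'.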

For course participants:

1. When registering, a list of newsletters is posted on the registration page, with default newsletters checked (using a checkbox format). Check or uncheck newsletters as desired. Subscriptions are created at registration.

2. Or, alternatively, after registration or login, click on 'options' (upper right of screen). The options page provides the following:
- personal information, including email, which may be edited
- list of feeds submitted, and their status
- option to add social network information
- list of subscribed newsletters, and an option to edit subscriptions

3. Or, alternatively, read the feeds on the website, either: (a) from the feed list page (e.g. http://edfuture.mooc.ca/feeds.htm and then http://edfuture.mooc.ca/feed/84 ) or (b) using the viewer, e.g. http://edfuture.mooc.ca/cgi-bin/page.cgi?action=viewer

4. Or, alternatively, from the feed list page, download the OPML file and load the list of feeds into any other RSS reader. (e.g. http://edfuture.mooc.ca/opml.xml )

That's the entire workflow. The pages could be more beautiful and more interactive, but everything described here works pretty much without fail. I am of course open to ideas and suggestions.

Second, on Yishay Mor's post (it won't allow me to comment there, so I'll comment here): there are issues with gRSShopper, I would be the first to admit. But requiring root access to a web server (as stated by Yishay Mor) is not one of them. I have gone to considerable effort to include the code you need to run it on out-of-the-box Perl, which means any person with an ordinary web hosting account could run gRSShopper.

The more significant issue is that it is still difficult to install and run. I am working on an installer that greatly simplifies the process, but it's buggy. I am also working on scripts that make it easier to run once installed.

What would be ideal, of course, would be a hosted service that allowed people to simply open a gRSShopper account and start aggregating with a minimum of fuss. If someone provides me with start-up funding, I'll provide that. (And no, my employer is not inclined to provide the time and space needed for such a project).