Tuesday, October 23, 2012

Aggregation Workflow in gRSShopper

A few points:

First, on workflow, which is the topic of this post. Here's how gRSShopper currently works. It's more detailed than suggested in the post, but contains the same basic idea, with enhancements added through four years of experience running cMOOCs.

For the participant:

1. Go to the 'add feed' page. Eg. http://edfuture.mooc.ca/new_feed.htm

2. If you are not logged in with a user account, log in - click on the link provided on the page, go to the login screen, then click on 'return to where you were' to return to the 'add feed' page.

3. If you are not registered with a user account, create one - click on the same login link provided on the page, select the registration option, supply the information, and click on 'return to where you were'. (I have found registration to be absolutely necessary; otherwise you get flooded with a slew of marketing feeds.)

4. Fill in the feed information on the page - feed name, URL, optional description. Submit, and you're done.

Note: feed information is updated from the RSS (or Atom) file. To edit feed information, edit the feed information at the source, and it is automatically changed in gRSShopper.
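Since feed metadata is refreshed from the source file rather than edited locally, updating a feed record amounts to re-parsing the feed. Here is a minimal sketch in Python (gRSShopper itself is Perl; `refresh_feed_info` is a hypothetical name used only to illustrate the idea):

```python
import xml.etree.ElementTree as ET

def refresh_feed_info(rss_text):
    """Pull feed-level metadata out of an RSS 2.0 document.

    Hypothetical helper: illustrates refreshing the stored feed
    record (name, link, description) from the source file.
    """
    channel = ET.fromstring(rss_text).find("channel")
    return {
        "title": channel.findtext("title", default="").strip(),
        "link": channel.findtext("link", default="").strip(),
        "description": channel.findtext("description", default="").strip(),
    }

sample = """<rss version="2.0"><channel>
  <title>Ed Futures</title>
  <link>http://example.com/blog</link>
  <description>A course blog</description>
</channel></rss>"""

info = refresh_feed_info(sample)
```

Run at harvest time, this keeps the stored record in sync with whatever the author publishes at the source.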

For the feed administrator:

1. Submitted feeds are set to 'provisional' status and won't be harvested until reviewed. List the feeds, then click 'approve' to approve a feed. Optionally, run a test harvest for the feed. A significant number of submissions fail - that is why the 'approve' step is required. Note that the harvester corrects for the most common feed errors: it attempts autodetection if a blog URL is entered rather than a feed URL, it adds http:// (or replaces feed://) as needed, and it corrects for various feed formatting problems at the source.
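The URL corrections described above are simple string normalizations. A minimal sketch in Python (gRSShopper itself is Perl; `normalize_feed_url` is a hypothetical name, and feed autodetection from a blog's HTML is omitted here because it requires an actual fetch):

```python
def normalize_feed_url(url):
    """Apply the most common submission fixes: replace feed://
    with http://, and prepend http:// when no scheme is given.
    (Autodetecting a feed from a blog page would need a fetch
    and HTML parsing, so it is left out of this sketch.)
    """
    url = url.strip()
    if url.startswith("feed://"):
        return "http://" + url[len("feed://"):]
    if not url.startswith(("http://", "https://")):
        return "http://" + url
    return url
```

For example, `normalize_feed_url("feed://example.com/rss")` yields `http://example.com/rss`, and a bare `example.com/rss` gets a scheme prepended.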

For the page designer:

1. Items harvested from feeds can be displayed on a web page with a single 'keyword' command, eg. this command will include the feeds from the last 24 hours (truncating them at 500 characters): keyword db=link;expires=24;format=summary;truncate=500;all

2. Pages may optionally be published as flat HTML, so they do not need to be generated dynamically from source each time a person lands on them (this gives a significant speed advantage); set 'autopublish' to a desired value (typically, once an hour).

3. Pages may be turned into email newsletters; select that option and select an auto-send time (specify days of the week or month, and a time). When pages are set as email newsletters, they appear on the 'email newsletter' list. Check the 'default' button if you would like new registrants to be automatically subscribed.
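The page-designer mechanics above can be sketched in a few lines. Assuming (and this is an assumption about the general idea, not gRSShopper's actual Perl code) that the keyword command is a semicolon-separated option string and that autopublish works by checking the published file's age:

```python
import os
import time

def parse_keyword(command):
    """Split a keyword command into options. Tokens are
    semicolon-separated; 'key=value' pairs become entries and
    bare tokens (like 'all') become boolean flags. The exact
    grammar gRSShopper uses may differ.
    """
    opts = {}
    for token in command.split(";"):
        if "=" in token:
            key, value = token.split("=", 1)
            opts[key] = value
        elif token:
            opts[token] = True
    return opts

def needs_republish(path, interval_seconds=3600):
    """Decide whether an autopublished flat-HTML page is stale:
    regenerate when the file is missing or older than the
    autopublish interval (once an hour by default)."""
    if not os.path.exists(path):
        return True
    return time.time() - os.path.getmtime(path) >= interval_seconds

opts = parse_keyword("db=link;expires=24;format=summary;truncate=500;all")
```

With this reading, the example command from step 1 selects link records from the last 24 hours, formatted as summaries truncated at 500 characters.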

For course participants:

1. When registering, a list of newsletters is posted on the registration page, with default newsletters checked (using a checkbox format). Check or uncheck newsletters as desired. Subscriptions are created with registration.

2. Or, alternatively, after registration or login, click on 'options' (upper right of screen). The options page provides the following:
- personal information, including email, which may be edited
- list of feeds submitted, and their status
- option to add social network information
- list of subscribed newsletters, and an option to edit subscriptions

3. Or, alternatively, read the feeds on the website, either (a) from the feed list page (eg. http://edfuture.mooc.ca/feeds.htm and then http://edfuture.mooc.ca/feed/84 ) or (b) using the viewer, eg. http://edfuture.mooc.ca/cgi-bin/page.cgi?action=viewer

4. Or, alternatively, from the feed list page, download the OPML file (eg. http://edfuture.mooc.ca/opml.xml ) and load the list of feeds into any other RSS reader.

That's the entire workflow. The pages could be more beautiful and more interactive, but everything described here works pretty much without fail. I am, of course, open to ideas and suggestions.
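As an aside, the OPML export mentioned in item 4 is just a small document listing each feed's URL. A minimal sketch of generating one in Python (a hypothetical helper for illustration, not gRSShopper's actual Perl exporter):

```python
import xml.etree.ElementTree as ET

def feeds_to_opml(feeds, title="Course feeds"):
    """Render a list of (name, feed_url) pairs as a minimal
    OPML 2.0 document, the format exported at /opml.xml so any
    external RSS reader can subscribe to the whole feed list.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, url in feeds:
        # type="rss" and xmlUrl are the standard OPML outline
        # attributes an aggregator looks for when importing.
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")

doc = feeds_to_opml([("Ed Futures", "http://example.com/rss")])
```

Any reader that imports OPML can then pick up the whole course feed list in one step.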

Second, on Yashay Mor's post (it won't allow me to comment there, so I'll comment here): there are issues with gRSShopper, I would be the first to admit. But requiring root access to a web server (as Yashay Mor states) is not one of them. I have gone to considerable effort to make the code run on out-of-the-box Perl, which means anyone with a web site account can run gRSShopper.

The more significant issue is that it is still difficult to install and run. I am working on an installer that greatly simplifies the process, but it's buggy. I am also working on scripts that make it easier to run once installed.

What would be ideal, of course, would be a hosted service that allowed people to simply open a gRSShopper account and start aggregating with a minimum of fuss. If someone provides me with start-up funding, I'll provide that. (And no, my employer is not inclined to provide the time and space needed for such a project).

4 comments:

  1. Would funding for a hosted service be in Kickstarter territory?

  2. Thanks for sharing this, Stephen - it's very parallel to the process we use in our WordPress-styled aggregators in place for ds106.

    Feedwordpress also does ok on auto discovery and offers previews, but like yours there can always be a need for review, and things get trickier for users with category/tag feeds (more on their understanding of RSS).

    One feature I am not sure I saw that we make use of is the ability to add our own tags to syndicated content to allow us to do things like group posts from a class or organization... Or in general to bring in user tags to allow us to group assignments.

    I'm game to give an install some time... when there is such a thing.

  3. Hi Stephan,

    Thanks for the comment. Google sites is a fairly good platform, but sadly it doesn't allow me to distinguish commenting and editing rights. So I've embedded a cloudworks cloud in the post, for discussions. Should I respond here or do you want to take the conversations there?

  4. Ok. I think "root access" was a wrong term to use. But you do need, as you say, an account on a web server. That won't work if you're cloud based (i.e. using a hosted wordpress or google sites as your platform). For example, we're using a mashup of google sites and cloudworks for http://olds.ac.uk, neither of which allow us to run perl scripts.


I welcome your comments - I'm really sorry about the moderation, but Google's filters are basically ineffective.