Thursday, November 30, 2006

The Master Takes Away...

Responding to Will Richardson talking about the decline of a racist website after some Google-bombing.

It’s still at number 2 on

But this remark is interesting: “there is something unsettling about this whole process in terms of how, for ourselves and for our kids, do we get our brains around the scope of the potential manipulation of ideas and information on the Web?”

I heard a saying recently, I can’t remember where, and I can’t remember the exact quotation (it was only at the periphery of my perception), but it went something like this:

The master gives the workers the tools they need to work. The master takes away the tools when they begin to tear down the house.

The implication is: so long as what we’re doing online (such as messing around with Google rankings) benefits them, they are happy to let it continue. But if it begins to threaten their lordship over things (’tearing down the house’) they take the tools away.

I have been thinking about this small saying a lot recently. About how much of our new empowerment is genuine, and how much exists only through the benevolence of the masters.

It’s pretty easy to test.

I got an invitation recently to talk at some conference in Florida or Carolina or something, a conference for executives. Asked what I would talk about, I outlined how in order to function in the new economy they would have to give up power, to function through networks of equals rather than through coercion. I never heard from the conference organizers again.

Oh, I’m happy to see that site drop in the rankings. I looked at it; it’s a piece of racist crap and deserves oblivion.

But what do I do if I’m next? If some major society, say, were to decide that socialists (which I am) or communists (which I’m called) are as bad as racists.

Suppose even that I simply attract the ire of some major company that sponsors conferences. Could they silence my voice? See to it that my rankings drop?

The master takes away….

I’m just wondering what happens when the other shoe drops. If it drops.


"There’s been a flurry of news about new video downloading services — BitTorrent, Wal-Mart, Time Warner — as copyright holders get busy taking back control from consumers. YouTube just cut a deal with Verizon that gives users so little control that the blogosphere couldn’t find enough pejorative adjectives to describe it.

"The glory days of consumers in control of copyrighted content seem to be coming to an end — people who still want to control copyrighted content are being driven underground, or to sites like, where you can find the location of copyrighted content still available online."

Scott Karp


Here's the reference:

“The master’s tools will never dismantle the master’s house. They may allow us temporarily to beat him at his own game, but they will never allow us to bring about genuine change.” ~ Audre Lorde

A New Website, Part Ten - More on Mailing

The text of the message I sent to CSoft tech support this morning after many long and frustrating hours of trying to make my mailing lists work.


Summary: I solved the problems by creating the /Mail directory in my account on and then rebuilding my mailboxes.

None of the work (which I document below) should have been necessary.

1. The /Mail directory should have been created when the account was created

2. The mailbox routine on the control panel should report an error when the mailbox fails to be created (that it doesn't do this is a remarkable programming error)

3. The list of email addresses in the control panel should be complete

4. 'Delete' should delete email addresses even if there is an error in the mailbox configuration (such as it not existing)

5. Documentation on how to set up mail should be complete, specifying (eg) the use of SSL as a server setting

6. Documentation on how to set up mailing lists should be complete, *especially* the bit about running the command on a second server in the Mail directory

7. Finally, the response to my original complaint was inadequate. At the very least, the person responding should have looked at the Mail directory to see if there was a problem. At that point, he would have noticed that the directory doesn't exist.

Now I'm going to go see if I can figure out how to tell people how to subscribe to my mailing list, which is also undocumented. *sigh*
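If ezmlm follows its usual conventions, subscription isn't done through a form at all: commands are sent to special addresses derived from the list address itself, such as '<list>-subscribe@<host>'. A small sketch of that convention (the list name and domain below are placeholders, not my actual addresses):

```python
# ezmlm conventionally handles list commands via command addresses:
# sending any message to '<list>-subscribe@<host>' subscribes the sender,
# and '<list>-unsubscribe@<host>' unsubscribes them. This helper just
# spells out that naming convention; the names here are examples only.
def ezmlm_command_address(listname, domain, command):
    """Build an ezmlm command address, e.g. 'fob-subscribe@example.com'."""
    return "%s-%s@%s" % (listname, command, domain)

print(ezmlm_command_address("fob", "example.com", "subscribe"))
```

If that holds, "telling people how to subscribe" may be as simple as publishing the -subscribe address.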

Here is my work record:

My mail problems multiply.

Attempting to log in to mailbox 'stephen' fails. Sort of. It stalls at 'sending login information' with 'no encryption', fails entirely under TLS, and appears to accept the login but fails to respond under SSL (it may simply be returning an empty mailbox). Obviously I'm just guessing at the settings here, because they are not documented.

An email sent to was not rejected (yay!) but does not appear to be retrievable.

The previous attempt was IMAP. I also attempted to set up a POP account. Mailbox name 'stepop', password 'DAVIDHUME'. I sent an email to '' which was not rejected. Mailbox sizes remained '0' however.

When I attempted to access this account I got: "Sending of password did not succeed. Mail server MAIL231.CSOFT.NET responded 'Unable to create lockfile' (2) on '/home/downes/Mail/stepop/Mailbox': No such file or directory (2)"

The list of current email addresses only shows 12 email addresses, all 12 created by two of my attempts to create mailing lists, foo and fob. None of my current email addresses (both of which I reconfirmed exist) show up on the display.

Attempts to delete six of the addresses (those associated with 'foo') fail with an error:

Error: No such alias recipient: `|/usr/local/bin/ezmlm-moderate'.

I attempted to run the ezmlm command in the Mail directory on as instructed.

The Mail directory did not exist (though there existed a symbolic link Maildir -> Mail/Maildir). So naturally the ezmlm command failed. I tried ezmlm-make -m ~/Mail/fob ~/Mail/.qmail-fallacies:ca-fob fob and got: ezmlm-make: fatal: unable to create /home/downes/Mail/fob: file does not exist

I created the directory Mail and tried again. The commands then worked.

I tried sending emails to the two email addresses and logging into their respective mailboxes, but again, same results.

I sent email to the mailing list that should now exist. This email *did* appear in my email with a moderation request. Replying to 'accept' did not bounce (I never saw the email again - I guess the moderator doesn't get copies).

I deleted the 'stephen' mailbox and recreated it. I attempted to associate it with the email address again but got an error: "Error: Existing recipient: `&stephen'". It seems to me that when you delete a mailbox, the associated email addresses should also be deleted. I am of course unable to delete the email address because I cannot see it.

I created a new email address, ''. It also does not show up on the list of email addresses (this is getting REALLY frustrating - that HAS to be fixed). It DID show up in the mail box. An email sent to '' also showed up. Clearly, recreating the mailbox (which presumably is now in the 'Mail' directory) was the trick.

Wednesday, November 29, 2006

A New Website, Part Nine - Mailing List Follies

One of the problems with starting a series like this is that it creates the expectation that it will be continued and even eventually finished. So when the task it describes becomes Byzantine, with numerous twists and turns, keeping up the documentation can become a chore in itself.


Over the weekend I installed the Views module and experimented with it. Repeat after me: "A view is a list."

Happily this module does exactly what I want it to do: it allows me to describe a list of records I would like displayed on a page or a block, and it then displays them for me. In this it corresponds with my old 'keyword' command, which I would insert into a web page.

I didn't get a chance to do a lot of experimentation, but basically the idea is that you select what type of content you would like to display, what fields from that content you would like to display, what you want to filter on (eg., 'Content type = author') and how many records you want to show at a time.
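Those settings really do boil down to "a view is a list": filter by content type, keep the selected fields, and cap the number of records. A toy illustration (the records and field names below are invented for the example, not Drupal's schema):

```python
# Toy model of what a Views definition amounts to: filter records by
# content type, project only the chosen fields, and limit the result.
# The sample data and field names are made up for illustration.
def build_view(records, content_type, fields, limit):
    matching = [r for r in records if r.get("type") == content_type]
    return [{f: r.get(f) for f in fields} for r in matching[:limit]]

records = [
    {"type": "author", "title": "Hume", "created": 1},
    {"type": "post", "title": "OLDaily", "created": 2},
    {"type": "author", "title": "Kant", "created": 3},
]
view = build_view(records, "author", ["title"], 10)
# view is [{'title': 'Hume'}, {'title': 'Kant'}]
```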

In my system, I would feed each record into a template. This does not appear to be quite so flexible, and I haven't really found a way to style the output yet, but I'm sure there is one.

The Front Page

I decided I wanted to create a nice clean front page, displaying my records and news and stuff inside the website.

This was a disaster, and I ended up messing up the administration screens.

Without going into details (there's no point, and I can't remember it all), here are the lessons I learned:

- first, the administration screen needs the columns, so don't wipe them out in your template

- second, keep your content in blocks, not pages

Data, Revisited

This week I have been looking again at how to convert my old data to the new system, with more success.

Using the CCK module, I have determined that I can create new data types that emulate the tables in my old system, including even the links between records that I use.

I have experimented with the 'Authors' table first. In my system, when I describe a web resource, I enter the name of the author. So when I save a post, I also save a record for the author (if it's a new author) or find the name (if it's an existing author) and link it to the post record.

Now CCK isn't going to support anything so sophisticated (I will have to write some input script) but it will support the record types and linking.

In my system the author table has the following fields:


I would have added more but I never got around to finishing off this part of my system.

I defined a new content type, 'Author', in CCK, with the following fields: Link, ID, Description and Crdate. I let the Title of the record be Name. I then created a test record, just to see what CCK would do.

CCK creates a bunch of new tables and records to track the new content type. The actual data for a record is stored in three separate tables: node, node_content_author, and node_revisions. The first and third are standard Drupal tables, and the second was created when I created the content type.

My next step, therefore, is to look at each of these three tables and create a crosswalk -- that is, a mapping from my table to the Drupal table. Here's how I set it up:


node
Drupal Field    My field or default value
nid             (Auto Increment)
vid             (from node_revision)

node_content_author
Drupal Field    My field or default value
nid             (from Node)

node_revisions
Drupal Field    My field or default value
nid             (from Node)
body            (not used)
teaser          (not used)
log             (not used)

So this was all pretty nice, it seemed clear I would be able to recreate my table just fine.
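As a sanity check, the crosswalk above can be written out as a function taking one row from my old author table and returning one row per Drupal table. This is only a sketch: the node and node_revisions fields come from the tables above, but the field names on my side (author_name and so on) and the CCK column names in node_content_author are assumptions for illustration.

```python
def crosswalk_author(author, nid, vid):
    """Map one old 'author' row onto the three Drupal tables.
    The CCK column names in node_content_author are guesses."""
    node = {
        "nid": nid,              # (Auto Increment) in the real table
        "vid": vid,              # (from node_revision)
        "type": "author",
        "title": author["author_name"],   # Title of the record is Name
    }
    node_content_author = {
        "nid": nid,              # (from Node)
        "field_link": author.get("author_link"),
        "field_description": author.get("author_description"),
        "field_crdate": author.get("author_crdate"),
    }
    node_revisions = {
        "nid": nid,              # (from Node)
        "body": "",              # (not used)
        "teaser": "",            # (not used)
        "log": "",               # (not used)
    }
    return node, node_content_author, node_revisions
```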

So I went into phpMyAdmin running on my current website and, navigating to the 'author' table, selected 'Export' and saved the data as an SQL file. I then copied the contents of this file (it was only a little more than 2000 lines) into the 'SQL' window in the phpMyAdmin running on the new site. Running the SQL created an 'author' table on the new site that was an exact duplicate of the one on my old site.

Then I created a script that would access this new table and copy it into the Drupal table. As I write I haven't finished this script, but here is what I have at the moment.

use DBI;

print "Content-type: text/html\n\n";
print "Test\n";

my $dbh = &db_open("DBI:mysql:xxxxxx:localhost","xxxxxx","xxxxxx") or print "Database connect error: $!";

print "Database is open\n";

my $sth = $dbh->prepare("SELECT * FROM author");
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
    print $ref->{author_name},"\n";
}

print "Testing insert\n";

my $vars = {};
$vars->{author_crdate} = time;
$vars->{author_name} = "Test";
$vars->{idfield} = &db_insert($dbh,"author",$vars);
print "Inserted record into record number ",$vars->{idfield},"\n";

if ($dbh) { $dbh->disconnect; }

sub db_open {

    my ($dsn,$user,$password) = @_;
    my $dbh = DBI->connect($dsn, $user, $password)
        or die "Database connect error: $! \n";
    # if ($dbh) { $dbh->trace(2,"dberror.txt"); }
    return $dbh;
}

sub db_insert { # Inserts record into table from hash

    my $dbh = shift || die "Database handler not initiated";
    my $table = shift || die "Table not specified on insert";
    my $input = shift || die "No data provided on insert";
    # die "Unsupported data type specified to insert" unless (ref $input eq 'HASH' || ref $input eq 'Link' || ref $input eq 'Feed');

    my $data = &db_prepare_input($dbh,$table,$input);

    my $sql = "INSERT INTO $table ";
    my (@sqlf, @sqlv, @sqlq);

    for my $k (sort keys %$data) {
        push @sqlf, $k;
        push @sqlq, '?';
        push @sqlv, $data->{$k};
    }
    $sql .= '(' . join(', ', @sqlf) . ') VALUES (' . join(', ', @sqlq) . ')';

    my $sth = $dbh->prepare($sql);
    $sth->execute(@sqlv);

    return $dbh->{'mysql_insertid'};
}

# Adapted from SQL::Abstract by Nathan Wiger

sub db_prepare_input { # Filters input hash to contain only columns in given table

    my ($dbh,$table,$input) = @_;
    my $data = {};

    my @columns = &db_columns($dbh,$table); # Get a list of columns to safeguard data input
    foreach my $ikeys (keys %$input) { # Clean input for save
        next unless ($input->{$ikeys}); # - no blank fields
        next if ($ikeys =~ /_id$/i); # - do not change ID
        next unless (&index_of($ikeys,\@columns) >= 0); # - input column must exist
        $data->{$ikeys} = $input->{$ikeys}; # Transfer to input hash
        #$data->{$ikeys} = &demoronise($data->{$ikeys}); # Fix non-standard character input
    }
    return $data;
}

sub db_columns {

    my ($dbh,$table) = @_;
    my @columns = ();
    my $sth = $dbh->prepare("SHOW COLUMNS FROM $table");
    $sth->execute();
    while (my $showref = $sth->fetchrow_hashref()) { push @columns,$showref->{Field}; }
    die "Can't find any columns for $table" unless (@columns);
    return @columns;
}

sub index_of { # Returns the position of an item in an array, or -1

    my ($item,$array) = @_;

    my $index_count = 0;
    foreach my $i (@$array) {
        return $index_count if ($item eq $i);
        $index_count++;
    }
    return -1;
}

As you can see, what it does right now is print a list of all the items in the new 'author' table I created, as well as add some test data into the new database. To finish this script I need to create some hashes to implement the crosswalk, then run an INSERT command for each of the three Drupal tables using the crosswalk.

This is a small job, but I don't want to finish and run the script just yet. My website, after all, is a working website. Every time I add a new link, I could be adding a new author. So I don't want to run this until I'm ready to commit to the changeover.

Mailing Lists

The other major thing I need to do is to set up a mailing list. After all, most of my readers still receive OLDaily by email (three or four times as many as by RSS, in fact). So I can't make the changeover without setting up the mailing lists.

I don't want to use my current script. There's a simple reason: it's not very good. Oh sure, it will send the emails, but it takes a long time and ties up system resources. It makes much more sense to use a proper mailing list program.

CSoft supports a mailing list manager called ezmlm (easy mailing list manager). Here is all the documentation CSoft provides:
By invoking ezmlm-make from your shell. We do not allow mailing lists which accept posts from non-subscribed addresses, so make sure to always pass the -g or -m flag to ezmlm-make.

The command to create a "" mailing list, would look like this -

$ ezmlm-make -g ~/foo ~/.qmail-domain:com-foo foo

A moderated list is created using the -m flag, and by subscribing moderators like so -
$ ezmlm-make -m ~/foo ~/.qmail-domain:com-foo foo
$ ezmlm-sub ~/foo/mod
Honestly, that's pretty inadequate (the link is to the unix man page for ezmlm, but this is unfortunately not written in English). Moreover, following those instructions to the letter doesn't work; the email simply bounces (of course, I could be sending it to the wrong address). There is a nice web interface for it, but that doesn't appear to be supported by CSoft, and because the developers created a Perl module (which means you need admin privileges to use it) I can't use it (which is exactly why I am moving away from Perl - the strategy of requiring Perl modules you need superuser privileges to install is borked).
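For what it's worth, the dot-file argument in those commands appears to follow qmail's naming convention: mail for foo@domain.com is handled by the file ~/.qmail-domain:com-foo, with the dots in the domain turned into colons. A helper expressing that reading (this is my interpretation of the example, not CSoft documentation):

```python
# Sketch of the qmail dot-file naming convention as I read it from the
# ezmlm-make examples: dots in the domain become ':', then '-' and the
# local part. My interpretation, not official documentation.
def qmail_dotfile(local_part, domain):
    """Name of the qmail dot-file handling local_part@domain."""
    return ".qmail-%s-%s" % (domain.replace(".", ":"), local_part)

print(qmail_dotfile("foo", "domain.com"))   # .qmail-domain:com-foo
```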

OK, maybe Drupal supports mailing lists.

I tried a search (why oh why do irrelevant Drupal discussions always rise to the top of Google searches of the Drupal website, and not things like, say, documentation?). After getting a bunch of useless results, I combined 'Drupal' and 'ezmlm' in my search and got this page, the Drupal ezmlm module.

However, it says, "Note that this module, ezmlm, will no longer be maintained after release 4.7 because its functionality has been superceded by the Mailing List Manager (mlm) module."

OK, so I went to the Mailing List Manager module, downloaded the module and installed it as per usual. Only to be greeted by a slew of database errors. Huh? What?

Well, according to the bug reports, the Mailing List Manager module installation routine simply failed to install the necessary tables in the database. Maybe it's just me, but wouldn't this be a really big problem? You'd think - but I see no activity on the module for months.

This then is probably another one of those abandoned Drupal modules. This, it seems to me, is a real problem for Drupal. If the modules are basically abandoned, they should be removed from the site. From what I can tell, the Drupal site is littered with half-built and semi-working modules. How about a little quality control here? Or at least some requirement that you finish coding the thing before it gets listed.

I saw through another link something called 'og2list' - of course, what the search turned up was this screencast page and then this totally irrelevant page (the bad search results are a huge problem for me using Drupal - and it's probably because there are no %$%$ links to the documentation for Google to seize on, so Google has no way of knowing which Drupal website pages are important).

I eventually found the module here. But this is way more than I want - it is basically a set of modules that turns a Drupal installation into something like Yahoo groups. Nice, but not what I'm after with my website. Also, it requires all kinds of admin privileges to install, so once again I'm at sea here.

So far as I can tell, Drupal does not support a functioning mailing list module. So I wrote a note to my host technical support asking for either (a) help setting up my mailing lists, or (b) documentation.

So, another day. Not sure what I'm going to do. At least Perl works on my site, so if worst comes to worst I can just use my existing script. But I think that the CSoft people wouldn't like that very much.

Monday, November 27, 2006

Sumaj - Live in Barcelona

The woodwind band Sumaj plays live in Barcelona in an open air concert on La Rambla. Enjoy.

LOM and Dublin Core

> Can someone explain to this simple, country editor why we need both LOM and Dublin Core?

I think I look at this question a bit differently, asking instead, "Why do we need one?"

A characteristic of LOM (and increasingly of Dublin Core) is that it is trying to do everything itself. Inside LOM is bibliographic information, categorization and taxonomic information, authorship and publication information, technical information, rights information, and only incidentally, it seems, educational (or learning-related) information.

By creating 'one-size-fits-all' versions of such metadata, in my view, the result is that these different types of metadata are done badly. Consider Rights, for example, characterized by a couple of yes-no questions and a text string. Surely there are better ways of doing this. Certainly, there are alternative (ODRL and MPEG-REL) ways of doing this.

Similarly for technical metadata. Ill-conceived (by requiring the definition of a player, rather than a media format) the technical metadata can and should be very precise - and most importantly, different - for different types of media.

To me, the metaphor of 'metadata as document' or 'metadata as catalogue card' - with a primary emphasis on things like search - unnecessarily limits how we can talk about learning objects and learning resources in general. We ought to have, for example, evaluative metadata, including (as appropriate) commentary, links to review, and more. We ought to have usage metadata. We ought to be able to, over time, assemble from many different sources, different metadata 'views' of the same resource, depending on our needs and perspective.

Certainly, the 'core' approach (as instantiated by the Dublin ilk, or RSS, or Atom, or a few others) ought, in my mind, to be favoured, creating a minimally defined set of elements that may, as need and custom demand, be extended.

Why do I feel the need to comment on this? Because there still exists the tendency to believe that some specialized need ought, by the fact of its possibility, constrain or even become a part of the specification.

I agree with Mikael: "The way forward that I see as most promising is not merging standards but rather to disassemble standards."

If there is a way forward for LTSC, I think that it would be in articulating how we think about this and how we use disassembled standards in concert to realize semantically useful results (not just searches - there is way too much emphasis on searches, in my view - not everything is a library).

My own views on this (a bit dated now, but still possibly fresh enough):

The Quebec Card

The ridiculous saga of the 'Quebec as Nation' debate is reaching a frenzy (at least in the media) as the Liberal leadership convention approaches.

For the record - and only for the record - I am happy to recognize the Quebecois, the people, as a nation. In Canada we have plenty of precedent for such a designation, so much so I am surprised that there would be any debate about this.

Has everyone in the media, for example, forgotten about the First Nations? Among whom we would include the Six Nations? The ones with whom we as a country negotiated a series of treaties? How about the Inuit, over which Canada negotiated governance of the territory of Nunavut? And have we also forgotten the Dene Nation?

There will come a day, I suspect, when we (quite rightly and properly) recognize the Chinese (who, after all, built large sections of our national railroad, the link that forged the country) as a founding nation. No history of Montreal would be complete without a nod to the Jewish nation and even the United Empire Loyalists who constitute, in part, my own ancestry. And I could go on.

No, what should be remarked upon at this juncture is not the substance of the debate, for there is none. Rather, it is the playing, once again, of the Quebec Card.

The Quebec Card is, of course, a pale understudy for the ultimate card, the Race Card. It is a tactic that is pulled out by politicians, usually from the right (because they seem to have this thing about race), when they are losing an otherwise more typical debate or contest.

That is why it comes as no surprise to me that Michael Ignatieff played the Quebec card. Though relentlessly promoted by the national media (who, remember, gave us Paul Martin) Ignatieff was nonetheless losing the leadership race. That is why, for no other apparent reason, he raised the question of Quebec nationalism.

"I speak for those who say Quebec is a nation, but Canada is my country,'' Ignatieff said, falsely (as he most certainly does not speak for me) and unnecessarily. And - I might add - unclearly (the best way to stir up a debate is to be murky about which side of it you're on). Perhaps this will obscure his support for the Iraq war, something only his long residence in the United States or his conservative leanings can explain.

It is of course no surprise to see the other conservatives take up the Ignatieff play, raising the proposal by a motion, sowing chaos into the Liberal debate, and not incidentally, the Canadian political landscape.

The Conservatives, after all, had just slipped behind the leaderless Liberals in the polls, having totally botched the environment portfolio and having lost ground on issues as diverse as gay rights (which they oppose), capital punishment (which they support) and the Afghanistan war (where they wish they could be). They can't even get free speech right. Much less media relations.

It's a shame we have to witness this, once again, and a shame our corporate media is ready and willing to whip this into a frenzy. Most likely, Canadians are not fooled by the Quebec Card any more (though the Bloc can be counted on to leap like a walleye to the lure).

If I had my druthers, we'd all nod in bored agreement and then get on to some things that matter. Like the war, say. The environment. Education, health, poverty. Energy, the economy and regional development. And a few dozen other matters Canadians place at a much higher priority than stupid word games being played by stupid people.

Thursday, November 23, 2006

A New Website, Part Eight - Drupal Reloaded

It was at 11:30 this morning I decided it wasn't going to work.

I got into the office a bit late today because I had to sign some papers related to my new gas furnace. First thing, I went to Google looking for information on how to install new modules in Drupal 5.0.

It's very simple. Modules have a name, say, 'foo'. There is a .module file associated with that module, 'foo.module'. There can be (but doesn't have to be) a directory with that same name, 'drupal/modules/foo'. Drupal, when it lists the modules that can be installed, checks the 'modules' directory and any subdirectories for files ending in '.module'. That's it.
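The discovery logic described above amounts to a simple directory scan. A sketch in Python rather than Drupal's PHP, just to make the behaviour concrete:

```python
# Sketch of Drupal's module discovery as described above: walk the
# 'modules' directory and its subdirectories, and collect the name of
# every file ending in '.module'. Illustration only, not Drupal code.
import os

def find_modules(modules_dir):
    """Return module names for every '*.module' file under modules_dir."""
    names = []
    for root, _dirs, files in os.walk(modules_dir):
        for filename in files:
            if filename.endswith(".module"):
                names.append(filename[:-len(".module")])  # 'foo.module' -> 'foo'
    return sorted(names)
```

A module is "found" whether its .module file sits directly in modules/ or in its own subdirectory, which matches the behaviour described above.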

So why wasn't this working on my website? Why were new modules, uploaded and very carefully checked for permissions and all that, simply not found? I have no idea. The core modules were loaded OK because they are all listed in the 'modules' table in the database, no problem.

Once I was comfortable I fully understood the module loading, installation and initiation process, I was also comfortable in saying, my installation of Drupal 5.0 is borked. Which left me two choices: I could either delve into the code that loads the list of modules for admin/modules and try to fix the bug, or I could give up on Drupal 5.0 and try to use an earlier version.

And that's when I decided 5.0 wouldn't work. I poked around in the code for a while, looking for exactly where the script scans the directory structure looking for .module files, but I couldn't find it. I did find this very nice way to navigate the Drupal functions, which helped a lot (I have linked to the source for index.php, and you can pretend you're a computer and navigate all the functions yourself), but I wasn't really prepared to root around looking for it (and you have to root around; PHP allows variables to be called as functions, something that makes coding more flexible but is just miserable to debug, at least to me).

But more to the point: it was very possible that Drupal couldn't find the list of modules because the current version of PHP didn't support it. Drupal might be trying to do some sort of crazy substr on the directory names, for example. Who knows? So I decided to try Drupal 4.7, because that would also tell me whether it was a PHP problem, as opposed to a Drupal problem (verdict: it's a Drupal problem; Drupal 4.7 read the directories just fine).

So I installed Drupal 4.7. It took me about 10 minutes. Here's what I did:

- Deleted database 'downes', then recreated a new and empty database called 'downes' and gave my user access to it, just like before

- Downloaded the Drupal 4.7 package to my home machine and uploaded it to my website using gftp

- Opened a console window, used SSH to access my website, then used 'cd' to navigate to the 'www' directory

- removed the previous installation of Drupal using the command 'rm -r drupal' (note: be extremely careful using 'rm -r' because it wipes out directories recursively)

- then I executed the command 'tar xzf drupal-4.7.tar.gz'

- Executed 'mv drupal-4.7 drupal' to move my software into a nice clean drupal directory

- removed my symbolic link from to the drupal directory, then recreated it by typing 'ln -s drupal'

- used gftp to download /drupal/sites/default/settings.php to my desktop, used gedit to change the username, password and database name for access to the database, save it, then uploaded it back to the computer

- created the new Drupal database structure by typing: mysql -u username -p databasename < database/database.mysql

Then came the test of module loading. I downloaded the foaf module to my desktop, then uploaded foaf.module into the 'modules' directory. Now for the test: I navigated to admin/modules and... yes, there it was. I selected 'enable' and updated the page, and the new module loaded just fine.

Good. Then I decided to try the flexinode module. Same process, only the flexinode module got its own directory. With all the files loaded, I went to admin/modules again, and... there it was. I enabled it and then gave it a test run.

The idea of something like flexinode is that you define a new data type and then assign to it the fields you want for that data type. So I went to 'content types' and defined a new data type, 'Post', and then gave it fields for 'url', 'description', 'author' and 'publisher'.

Once done, I went to 'Add Content' and saw my new content type there. Great! But when I opened up the form to create some content, it was mangled. I had to play with the content types editor several times, but the form never did display. Ah, whatever, I could always define a new form.

Once I had created a test entry, I went into phpMyAdmin (which really lurches and complains on this installation of PHP - either this installation is really bad or (more likely) phpMyAdmin has overreached and now demands as 'required' PHP functions that are very optional and not typically installed). I needed to see what a 'post' record looked like in the database before I could consider porting the data over.

I was disappointed. It was a very ugly hack. It was sort of like microformats, done badly. All the different fields were placed into the 'body' field, and the script used 'div' tags to delineate them. Well, OK, but the classes used to name the 'div' tags were not descriptive at all - they were 'flexinode.1', 'flexinode.2', and so on. So, basically, the flexinode script did the minimum to put the data in there, but that was it. Even worse, the names of the fields actually were in the data, delineated with

Wednesday, November 22, 2006

A New Website, Part Seven - Converting The Data

I confess, I'm not quite sure how I want to proceed next. My website is a pretty complex website and it doesn't really map easily to what Drupal (or any other CMS) does. This is why I have resisted for all these years simply using a CMS.

Also, I want to make some changes to what I'm doing. First of all, I've decided I need a much faster-loading front page, and one that easily directs people new to my site to the information they may be seeking. Right now someone who visits me for the first time would never find my more important papers, for example.

As well, I want to try to implement some of the changes I've been planning. I want to use a proper mailing list tool, for example, instead of the home-made one I have been using all these years. I want to properly connect different content types. I want a better integration with remote data sources such as Flickr and Slideshare.

I am working, then, roughly according to a plan, but it's pretty loose. Basically, the first thing to do is to get the content off the NRC servers and into Drupal. Then I'll configure Drupal to roughly emulate my existing functionality. Finally, I will add the enhanced functionality.

The Content

OK then, let's take stock of what sort of content I have. This is a bit tricky because my website includes both harvested content from Edu_RSS and generated content such as my articles and posts. Tricky, but not impossible, for as we have seen Drupal also has an RSS aggregator.

I had always kept the two types of content - Edu_RSS links and my own content - separate, but recently I started putting all that into one big table, distinguished only by content 'type' and by the author. I did this so it would be easier to write templates and the like to display the content. I am not under this constraint here, as Drupal will manage that for me (though I will still have to do some custom work for each new type of content). So I will separate the two types of content again. Edu_RSS content will be managed separately from my own website content.

My own content consists mainly of two major content types: 'posts', which are essentially the links I post in OLDaily, and 'articles', which are the longer types of writing I do. I have more than 10,000 of the former, from eight years worth of collecting, and they are all in the same place (happily). I have about five hundred of the latter, and these are stored both on my own system and in Blogger, where I have been depositing such work recently.

The posts are mostly stand-alone, though some are connected to files and others to resources such as my photo sets. My photo sets were stored on my website, but I have recently been moving them all to Flickr. The posts, additionally, have 'topics'. I do not use either taxonomies or folksonomies to manage my topics; happily, my approach is flexible enough that we can leave it until later.

Complicating matters is that each post is also associated with one or more links. This is an artifact of Edu_RSS, which managed the input of many different people about the same resource by creating a separate identity for each resource, then linking the posts to the resource.

The articles, meanwhile, have their own special features. Any given article may be associated with a file, such as an MS-Word or PDF document. Moreover, an article may also be associated with an event, such as a paper presentation. Articles may also have publication data, and some of them have been published more than once.

This introduces another major type of data, my presentations. I have more than a hundred PowerPoint slide shows, currently stored on SlideShare (but also on my existing website), as well as more than fifty MP3 audio recordings of my talks. These are also associated with events, and sometimes with articles.

That's the pretty basic set-up, and really isn't anything other people wouldn't have in their own data collections.

The way databases work is that each of these different types of entities is given its own table, where a table consists of a series of records, one for each item. Each record in each table is given its own identity, called a key. A record's own identity is called the 'primary key'. A record may be related to other entities, and when it is, those other entities are identified by their keys; when such a key shows up in another record's table, it is called a 'foreign key' (or, sometimes, a 'secondary key').

Now there are different ways entities can be related to each other:

- one-to-one. This is pretty rare. Traditional marriage works this way; each spouse has one and only one spouse, and the other spouse in turn has one and only one spouse.

- one-to-many. This is pretty common. Mothers and children work this way. Each mother can have many children, but each child can have only one (biological) mother.

- many-to-many. This is very common. Cousins work this way. Each person may have many cousins, and each cousin may also have many cousins.

In the case of one-to-one and one-to-many relationships, keeping track of associations is simple. We simply put a column in the table such that each record has a field containing the key of the associated entity. Nothing to it.

In the case of many-to-many, however, we have to construct a separate table of data where the keys of each of the associated entities are paired. This table is sometimes called a 'lookup' table. Since a lot of my data is many-to-many, I will need to plan for that.
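To make the lookup-table idea concrete, here is a small sketch in Python with SQLite, using the posts-and-links relationship described above. The table and column names here are made up for illustration; my actual schema differs.

```python
import sqlite3

# A minimal sketch of a many-to-many 'lookup' table.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE links (id INTEGER PRIMARY KEY, url TEXT);
-- the lookup table pairs the keys of the two associated entities
CREATE TABLE post_links (
    post_id INTEGER REFERENCES posts(id),
    link_id INTEGER REFERENCES links(id),
    PRIMARY KEY (post_id, link_id)
);
""")
db.execute("INSERT INTTO posts VALUES (1, 'Converting The Data')".replace("INTTO", "INTO"))
db.executemany("INSERT INTO links VALUES (?, ?)",
               [(10, 'http://drupal.org/'), (11, 'http://flickr.com/')])
# one post paired with many links; the same link could also pair with many posts
db.executemany("INSERT INTO post_links VALUES (?, ?)", [(1, 10), (1, 11)])
urls = [row[0] for row in db.execute("""
    SELECT links.url FROM links
    JOIN post_links ON post_links.link_id = links.id
    WHERE post_links.post_id = 1 ORDER BY links.id
""")]
print(urls)  # both links associated with post 1
```

The lookup table has no data of its own; it exists only to pair keys, which is why a lot of many-to-many data migration boils down to rebuilding these pairings.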

In addition, attached to all of this is my discussion board system, which over the years has collected hundreds of comments on things. This isn't as major a thing as NewsTrolls, which has thousands and thousands of posts. Ah, but one site at a time.

As much as possible, I would like to map my content types into existing Drupal tables. This means I don't have to create extra tables and associations and all that. It also means I can draw from the examples of Drupal designers when I do have to create some custom content.

What will help is that in addition to the standard modules, which I covered in the previous installment, Drupal has hundreds of additional modules. Of course, many of these were designed for Drupal version 4.7, and won't work in the Drupal 5.0 version I am testing. But many will, and more will each day.

OK then.

The Transfer

What I want to move first are the posts. They are the simplest type of content to move, and also form the heart of the services on my website.

Here's what the post records look like on my website.
[phpMyAdmin structure listing for my 'post' table; the fields are: id, type, pretext, title, link, linkid, author, journal, authorid, journalid, key, hits, thread, dir, crdate, creator, crip, pub]
- id is the primary key.
- type indicates the type of post (in my system, articles, comments, posts, and any other type of content I upload is a post, each with its own type; a post, for example, is a post of type 'link').
- pretext is an extra content field (text that can be placed before the title of the item).
- title is the title.
- link is the actual URL of the item, while linkid is the secondary key from the Links table.
- author and journal are the names of the author and the journal for the link I am discussing, while authorid and journalid are the secondary keys from those tables, respectively.
- key is legacy, from my previous database.
- hits is the number of hits.
- thread is unused (it used to track comment threads).
- dir specifies where to put files associated with the post.
- crdate is the date it was created, in unix time.
- creator is the secondary key of the record creator.
- crip is the IP address from which it was created (used to track spammers).
- pub is the publication date.

What I want to do is find or create, in Drupal, a type of content that most closely matches this. Drupal has by default the 'page' and 'story' content types. I need to first ask myself whether either of these will work for me. 'Page', probably not (and I will want to use it for website pages). What about 'story'?

What followed at this point was a couple of hours worth of investigation into how Drupal stores its data. As this site observes, "An important concept in Drupal is that all content is stored as a node. They are the basic building blocks for the system, and provide a foundation from which content stored in Drupal can be extended. Creating new node modules allows developers to define and store additional fields in the database that are specific to your site's needs. Nodes are classified according to a type. Each type of node can be manipulated and rendered differently based on its use case."

Taking a look at the Drupal database itself, we can see that the content is stored in two separate tables. One table is just a list of all the content. The other table is the actual content itself. Doing it this way keeps one of the tables really short, so you can do things like print lists of the titles or display the teasers. Also, by keeping the body of the item separate from the listings, you can have versions of the same item, which opens up all sorts of possibilities.
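Here is a quick sketch of that two-table arrangement, in Python with SQLite. These are cut-down columns for illustration, not the full Drupal schema.

```python
import sqlite3

# Simplified version of Drupal's split between 'node' (the short
# listing table) and 'node_revisions' (the full content).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE node (nid INTEGER, vid INTEGER, type TEXT, title TEXT);
CREATE TABLE node_revisions (nid INTEGER, vid INTEGER, body TEXT);
INSERT INTO node VALUES (1, 1, 'post', 'First post');
INSERT INTO node_revisions VALUES (1, 1, 'The full body of the post...');
""")
# printing a list of titles touches only the small 'node' table
titles = [r[0] for r in db.execute("SELECT title FROM node")]
# rendering one item joins in the body via the current revision (vid)
title, body = db.execute("""
    SELECT node.title, node_revisions.body
    FROM node JOIN node_revisions ON node.vid = node_revisions.vid
    WHERE node.nid = 1
""").fetchone()
print(titles, body)
```

Because only the vid in the node table determines which revision is 'current', older revisions can sit in node_revisions without affecting listings at all.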

OK then, so each one of my posts will be a node. I will create a new node type, called 'post'. Then I will attempt to populate these two tables:


[phpMyAdmin structure listing for Drupal's 'node' table; the fields are: nid, vid, type (varchar(32)), title (varchar(128)), status, created, changed, comment, promote, moderate, sticky]

nid is the node id (i.e., the primary key) and will increment automatically. vid is the current version, and for us, will always be the same as the node id. type is the node type, in plain text (and not the key from the types table). title is the title of the item. status indicates whether it is published ('1') or not. created and changed are the times when these events happened, and they are in unix time (yay!), which is the number of seconds since the standard epoch began at January 1, 1970 (GMT). Here's a time converter. comment, promote, moderate and sticky are status flags, and we'll use the defaults.
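Unix timestamps convert easily in most languages; in Python, for example:

```python
from datetime import datetime, timezone

# A unix timestamp is just seconds since January 1, 1970 (GMT/UTC),
# so converting one of Drupal's 'created' values is a one-liner.
created = 1164153600  # example value: midnight, November 22, 2006 (UTC)
print(datetime.fromtimestamp(created, tz=timezone.utc))
```

This is why having the dates already in unix time is such good news: no date parsing is needed when copying records over.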


[phpMyAdmin structure listing for the 'node_revisions' table; the fields are: nid, vid, uid, title (varchar(128)), body, teaser and log (longtext), timestamp, format]

nid and vid are as above. uid is the identity of the owner of the node - which in our case will always be me, user number 1. title is the title, a repeat of what we saw before. body is the body, and as a 'long text' item can be very long. teaser is like an abstract or summary. I don't know what log is, but it's empty on all my test content. timestamp is self-explanatory. I don't know what format is, but 'story' used '1' and the poll used '0'.

So what I need to do is create a mapping from my table to Drupal's. Some of the fields I just won't copy over - the legacy database key, for example, and the thread. But others are bits of data for which there is no Drupal equivalent. Pretext, for example.
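The mapping might look something like this sketch in Python. The Drupal-side field names come from the node and node_revisions tables above; the source-side 'body' field is hypothetical (it isn't among the fields detailed earlier), and this is illustrative code, not the actual conversion script.

```python
# A rough sketch of the post-to-node mapping. Fields with no Drupal
# equivalent (pretext, linkid, authorid, ...) are set aside for
# custom handling later.

def post_to_node(old, nid):
    """Map one of my 'post' records onto rows for Drupal's two tables."""
    node = {
        "nid": nid,
        "vid": nid,                # one revision per node, so vid == nid
        "type": "post",            # the new node type I'm creating
        "title": old["title"],
        "status": 1,               # published
        "created": old["crdate"],  # already unix time, so no conversion
        "changed": old["crdate"],
    }
    revision = {
        "nid": nid,
        "vid": nid,
        "uid": 1,                  # always me, user number 1
        "title": old["title"],
        "body": old["body"],       # hypothetical source field name
        "teaser": old["body"][:255],  # crude teaser: first 255 characters
        "timestamp": old["crdate"],
    }
    return node, revision

node, revision = post_to_node(
    {"title": "An example post", "body": "Some text.", "crdate": 1164153600},
    nid=1,
)
print(node["type"], revision["uid"])
```

The one real decision encoded here is vid == nid: since I have no revision history to preserve, every node gets exactly one revision.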

But - importantly - one of the big differences between my system and Drupal (and every other blogging system) is that posts in my system are about something, and I store that essential data - the title, author, publisher and link - as separate data items in my database. This is important, because it allows me to connect my work to other people's, but it also allows me to index the content based on the author or publisher, as I do on my resources page. I have always wondered why other systems don't do this - when people talk about something else they just put the link into the text, in an almost random fashion.

Well. Time for lunch and a walk to think about all this.

After Lunch

Well, it is now after lunch, two hours later, in fact, and I can report a most frustrating afternoon.

While I was walking up to the Tim Horton's I decided that basically what I was looking at was something like microformats. After all, if my post is about something, then it is essentially a review. And you can create reviews in microformats.

You see, what I had been thinking originally is that I would have to add extra fields to the database, so that I could add the extra information. I also considered just adding another node-type database, to hold this info. But what creating the record as a microformat would do is actually put the structured data into the 'body' field in 'node_revisions'.

So when I got back to the office I decided to see whether there was anything on Drupal and Microformats. And ran smack into Drupal's brain-dead documentation again, finding this, a project in 'alpha' that consists of nothing but a place-holder (hey Spaghetti, an 'alpha' is still supposed to be something that works). And as far as I can tell, the plan is to implement it using some sort of pseudo-code. No, not good at all.

Some more searching revealed some more discussion, including more posts from Digital Spaghetti but also some from gusaus talking about structured blogging. Well, that would work too - after all, what I do with my website, when I fill in the form identifying the author and the publisher and all that, is structured blogging. There's an implementation, from GoingOn - but no, the site is down for maintenance (has been all day). Then an absolutely useless and misleading page that seems to be about table-less layout (folks - don't say your page is about one thing when it's really about another thing, ok?). Still searching - found another discussion on structured blogging - same participants, same references, different discussion. A good outline of what structured blogging is, and how it compares to microformats, but no new information about Drupal.

I then hit on this post from D'Arcy Norman. He writes, "Support for custom formats and authoring templates is baked into the DNA of Drupal. Even for non-coders, anyone can make up new formats (and templates) on the fly using the flexinode module. And several other formats are already available as prepackaged modules (events, reviews, etc...)." I had looked at the flexinode module, gagged at the completely useless (but oh so typical) documentation, and decided to pass on it. Maybe now it was worth a revisit, even though it contained no installation instructions.

Installation for modules in Drupal seems straightforward. There's a 'modules' subdirectory in the Drupal installation. Create a subdirectory under the modules directory, and stuff your module code into it. A module release will consist of a few files (this one contained about seven files). The files and the directory have the same root name, and the types of files are indicated by the extension. So, say, if the module is called 'devel' then the directory is 'drupal/modules/devel' and the files might be 'devel.module' and 'devel.install' and 'devel.css'. Etc.

So anyhow, I create a 'flexinode' module and stuff the files in, and then go to the 'Modules' administration page in order to enable the module, just like I enabled all those other modules. I look - but it's not there.

Hm. OK. What's happening then? The 'flexinode' module I'm working with was designed for Drupal 4.7 and so isn't really intended for Drupal 5.0. And when I look in my NewsTrolls Drupal installation, it doesn't even resemble the 5.0 installation - all the '.module' files are in the 'modules' directory, and there aren't any subdirectories (that puzzles me, but I set it aside).

OK, are there any modules that are ready for Drupal 5.0? When I search, I get this page, which isn't a list of modules, but rather, some discussion about modules. Some utterly useless advice from harrisben (if you aren't going to describe something fully, or provide a link, just don't comment, OK? Saying 'check the downloads section' without any indication of what you're looking for or where it is is really useless - and frustrating).

Anyhow, there is a list of Drupal 5.0 modules (but you won't, by the way, find it in the downloads section). OK, good, I'll install a 5.0 module, something that's firmly developed for 5.0, and see if that works. Then maybe I can fix flexinode (since it doesn't look like the author has looked at it since 2004). devel is a perfect candidate. "Fully ported and the DRUPAL-5 branch exists." No installation instructions once again (do they think we set up these sites by ESP?) but I follow the standard procedure. Then, over to the 'Modules' admin page to enable it. And... nothing. devel is nowhere to be seen.

So there's some magical procedure here that appears to be undocumented - at least, in a full day of searching about, I haven't encountered it. It's the end of the day, I'll transfer a few more photos to Flickr, read some email, and think about it overnight.

In Support of Bob Rae

Not that this should surprise anyone, but my endorsement for leader of the Liberal Party goes to: Bob Rae.

I should be selfish, I suppose. A Rae win would divert votes from the newly resurgent New Democrats. But that party, and especially current leader Jack Layton, has always been about supporting good government, not political advantage, and so my selection goes to the best candidate, not the most opportune candidate.

That said, I will say that if another candidate wins in the Liberal vote, then it is very likely that the rise in the NDP's fortunes will continue, as voters will be convinced, after the Paul Martin coup and then this, that the Liberal party has abandoned its progressive roots and become what the news media has always wanted: a corporate-friendly moderate right wing coalition.

On the other hand, should Rae win the leadership, then I think it may be time for him and Jack Layton to sit down and talk seriously about becoming a coalition of some sort or even a united party. Coalition politics, of course, works much better in a preferential balloting system such as in Australia, where the National Party has long managed to stay in power by virtue of such an arrangement.

In Canada, it would cement the leadership of the left, as we as a nation have long leaned in that direction, the Conservatives, despite their occasional successes, being for the most part a marginalized political philosophy, supported only by U.S. money and people who yearn for the good old days of monocultural Canada.

For his own part, Rae says he has learned from the lessons of the past, but he doesn't apologize for his performance as Ontario's premier either. Nor should he. The nay-sayers who condemned his government promptly turned around and showed how to really wreck a provincial economy, with the Conservatives' 'Common Sense Revolution' being reduced to something just short of a joke and his party reduced to a political rump. And they managed it in economic conditions far more favorable than those faced by Rae, who had to contend with Conservative prime minister Brian Mulroney's Made-In-Canada recession.

With Rae, my expectation is that we will see a return to the progressive centre-left style of government that we saw under Jean Chretien, a government with a relatively modest political agenda but one which makes steady and, over time, significant progress toward issues such as the environment, health care, and social welfare. Even if I support the NDP federally, I can applaud this sort of government, and would find it much more to my liking than what we have seen under the Harper Conservatives.