Monday, December 04, 2006

A New Website, Part Eleven - Content Construction

Still working on the data transfer...


The author transfer went relatively smoothly.

It wasn't clear to me at first how the system would support multiple authors, but it turns out that CCK does it quite nicely. It is important to define your properties in the right order. In other words, define the 'authors' table first and 'post' later, so that you can select the authors table from a list when defining how to designate the post author.


The list of journals loaded easily, even more easily than the authors, because it proceeded in exactly the same way.

It is worth noting that the node counter in Drupal does not increment automatically, which means that after a bunch of records are inserted you have to change the counter manually.

There are several counters:
- menu_mid
- users_uid
- node_nid
- node_revisions_vid
- comments_cid

The naming system is pretty intuitive: the first part is the name of the table, and the second is the name of the field. Thus, after inputting a bunch of authors and journals, I had to insert a new value into each of node_nid and node_revisions_vid. Otherwise, any new content would attempt to overwrite an existing record, which generates (as it should) an error.
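
The counter fix can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module and an in-memory stand-in for the real database. It assumes the counters live in a 'sequences' table with 'name' and 'id' columns and that each counter stores the last ID used; the helper name bump_sequence is hypothetical. Check your own schema (including any table prefix) before trying anything like this on a live site.

```python
import sqlite3


def bump_sequence(conn, counter, table, column):
    """Set the named counter to the highest ID currently in the table."""
    # Table and column names cannot be bound as SQL parameters, so they are
    # interpolated here; pass only trusted, hard-coded names.
    row = conn.execute(
        f"SELECT COALESCE(MAX({column}), 0) FROM {table}"
    ).fetchone()
    highest = row[0]
    updated = conn.execute(
        "UPDATE sequences SET id = ? WHERE name = ?", (highest, counter)
    )
    if updated.rowcount == 0:
        # The counter row did not exist yet, so create it.
        conn.execute(
            "INSERT INTO sequences (name, id) VALUES (?, ?)", (counter, highest)
        )
    return highest


# In-memory stand-in for the real database (assumed, simplified layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (name TEXT PRIMARY KEY, id INTEGER)")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO sequences VALUES ('node_nid', 3)")

# Simulate a bulk import that bypassed the counter.
conn.executemany(
    "INSERT INTO node VALUES (?, ?)",
    [(n, f"Imported record {n}") for n in range(1, 251)],
)

new_value = bump_sequence(conn, "node_nid", "node", "nid")
print(new_value)
```

The same call would be repeated for node_revisions_vid (and users_uid after a user import), each with its own table and column.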


Event is a content type I devoted a fair amount of time to in my previous system and haven't quite got down. But I know where I want it to head.

First of all, how to define events: I needed to decide between the 'Calendar' module and the 'CCK' module.

I first created an 'Event' data type in CCK. This worked reasonably well; however, my dates had to be plain integers. Not really a problem, since I'm using unix-style dates. However, these need to display nicely as well, and on my existing site I had a nice popup calendar I could click.
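
To illustrate the display problem: a unix-style date is just a count of seconds, so something has to turn it into readable text. Here is a minimal sketch in Python; the function name format_event_date and the choice of UTC and output format are my assumptions, not anything CCK does.

```python
from datetime import datetime, timezone


def format_event_date(timestamp):
    """Turn a Unix timestamp (seconds since 1970) into a readable UTC string."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M UTC")


print(format_event_date(1165190400))  # prints "2006-12-04 00:00 UTC"
```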

How about the Calendar module? First I had to find it - once again, Drupal's awful search made what should have been a one-minute job a ten-minute exercise in frustration. Once I got to the page I realized that there really isn't any information there.

I don't understand the thinking behind the module pages in Drupal. Why are there no links to examples (or screen shots, or whatever) so I can see what the modules look like and what they do? And why doesn't the 'support forum' link go to a module-specific support forum?

Anyhow. Given that 'calendar' requires the 'views' module, but not 'CCK', I conclude that calendar data cannot be associated with the other type of data I want to create. So I decide to go the CCK route.

Why is this important for me? Because in addition to content metadata, I foresee an environment that will eventually support event metadata. This in turn means that we will want to associate content metadata with event metadata, and vice versa. So I need content association.

What also helped is that while I was browsing around looking for the Calendar module I came across the various CCK field type modules. Very handy, especially the CCK date module. This module will even give me little calendar popups if the Javascript Tools module is installed (Javascript Tools also has a number of other utilities, such as form checking, AJAX support and columns).

Both modules installed without incident, but when I redesigned the Event content type to use the new date module the nice Javascript widget didn't pop up. I had to actually read the README file before I realized that after installing the Javascript Tools module you have to enable not only the main module but the specific function module, in this case, jscalendar.

While downloading CCK content types I also decided to download the email module; this would give me some form-checking on email address submissions.

Also worth mentioning when adding new fields to a custom content type: you should reuse the fields you've already created, especially if you are using the same field name. I noticed that I had three distinct 'Description' fields defined, which CCK dutifully named 'Description-0', 'Description-1', etc. It turned out to be a lot easier to simply select from the list, with no confusion of data.


Transferring the user database is obviously very different from the rest, since it's not just another common data type, but rather, is connected with logins, mailing list subscriptions, and access rights.

In Drupal, you can extend the user profile by adding any number of fields. These fields are grouped according to category, and these categories display as tabs when the user opts to 'Edit' their profile.

I began by looking at the mailing list modules. As mentioned, my server was using ezmlm, so I used the ezmlm module. However, on learning that CSoft is switching over (presumably this week) to mailman, I decided to remove the ezmlm module and install the mailman manager.

The way the mailing list managers work is via email commands, and that's how Drupal interacts with them. When you install the manager you specify the list address. Drupal then sends commands (accompanied by a password) back and forth to the list manager, meaning that the user can manage their subscriptions entirely via Drupal.

As with ezmlm, this means that I will need to populate the mailman list with existing subscriptions, then populate Drupal with existing user profiles, which include email addresses, and then coax Drupal into recognizing those subscriptions via the email addresses. This is very similar to how my existing website works (subscription lists are maintained separately from personal profiles).
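
The matching step might look something like the sketch below. This is not how the mailman manager module does it; it is just a plain Python illustration of pairing list addresses with profiles by email address. The function name, the dictionary keys ('uid', 'mail') and the sample addresses are all made up for the example.

```python
def match_subscriptions(list_members, profiles):
    """Pair mailing-list addresses with user profiles by email address.

    Returns (matched, unmatched): a dict of uid -> normalized address for
    subscribers who have a profile, and a list of addresses that do not.
    """
    # Normalize case and whitespace so 'Alice@Example.org ' still matches.
    by_email = {p["mail"].strip().lower(): p["uid"] for p in profiles}
    matched, unmatched = {}, []
    for address in list_members:
        key = address.strip().lower()
        if key in by_email:
            matched[by_email[key]] = key
        else:
            unmatched.append(address)
    return matched, unmatched


# Made-up sample data.
list_members = ["Alice@example.org", "bob@example.org ", "carol@example.org"]
profiles = [
    {"uid": 1, "mail": "alice@example.org"},
    {"uid": 2, "mail": "bob@example.org"},
]

matched, unmatched = match_subscriptions(list_members, profiles)
print(matched)
print(unmatched)
```

The unmatched list is the useful part: those are subscribers who would need a profile created (or a corrected address) before Drupal could manage their subscription.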

Since mailman isn't ready for prime time on CSoft, I decided to turn my attention to the users database.

Drupal supports the following basic user profile in the 'users' table:


- varchar(60), utf8_general_ci
- varchar(32), utf8_general_ci
- varchar(64), utf8_general_ci
- Null: No; Default: 0
- Null: Yes; Default: 0
- Null: Yes; Default: 0
- varchar(255), utf8_general_ci
- varchar(255), utf8_general_ci
- Null: No; Default: 0
- Null: No; Default: 0
- Null: No; Default: 0
- Null: No; Default: 0
- varchar(8), utf8_general_ci
- varchar(12), utf8_general_ci
- varchar(255), utf8_general_ci
- varchar(64), utf8_general_ci
- longtext, utf8_general_ci

Most of these data elements are pretty intuitive and map one-to-one from my existing data.

Additional data elements are supported in the 'profile_fields' and 'profile_values' tables. 'profile_fields' defines the names and some other variables associated with each field, assigning each field a field number. 'profile_values' associates the field number with the actual values. This may seem complex, but doing it this way allows you to assign variables to each field (such as, say, the type of data accepted, or the name of the widget used to collect or display data) and to easily change the name of a field without changing the structure of data tables.
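
Here is a small sketch of why the two-table arrangement is convenient. It uses Python's built-in sqlite3 with an in-memory database, and the table layouts are deliberately simplified assumptions (the real tables have more columns); the helper profile_for and the sample values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed layouts; the real tables carry more columns.
    CREATE TABLE profile_fields (fid INTEGER PRIMARY KEY, title TEXT, type TEXT);
    CREATE TABLE profile_values (fid INTEGER, uid INTEGER, value TEXT);
    INSERT INTO profile_fields VALUES (1, 'City', 'textfield');
    INSERT INTO profile_fields VALUES (2, 'Web Page', 'url');
    INSERT INTO profile_values VALUES (1, 42, 'Exampleville');
    INSERT INTO profile_values VALUES (2, 42, 'http://example.org/');
""")


def profile_for(conn, uid):
    """Join values to their field definitions: {field title: value}."""
    rows = conn.execute(
        "SELECT f.title, v.value FROM profile_values v "
        "JOIN profile_fields f ON f.fid = v.fid "
        "WHERE v.uid = ? ORDER BY f.fid",
        (uid,),
    )
    return dict(rows.fetchall())


before = profile_for(conn, 42)

# Renaming a field touches one row in profile_fields; no values change
# and no table structure changes.
conn.execute("UPDATE profile_fields SET title = 'Home Page' WHERE fid = 2")
after = profile_for(conn, 42)

print(before)
print(after)
```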

Fields are added through the Drupal administration panel, under 'settings - profile'. I created a number of fields to match my existing data:
- City
- Province or State
- Country
- Organization
- Web Page
- Weblog
- XML or RSS
- Status
- Mode
- Eformat
- ID Certificate

I then wrote a crosswalk that would populate not only the user table but also the profile_values table. Note that the users_uid value in the sequences table also needs to be set, since this counter does not auto-increment.
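
In outline, a crosswalk of this kind splits each old person record into one row for the users table and several rows for profile_values. The Python below is a sketch of that shape only, not my actual script: the old column names, the field ID numbers in FIELD_IDS, and the way uids are assigned are all assumptions made for the example.

```python
# Hypothetical mapping from old column names to Drupal profile field IDs.
FIELD_IDS = {"city": 1, "country": 3, "organization": 4}


def crosswalk_person(old, uid):
    """Split one old person record into a users row and profile_values rows."""
    user_row = {"uid": uid, "name": old["name"], "mail": old["email"], "status": 1}
    value_rows = [
        {"fid": fid, "uid": uid, "value": old[key]}
        for key, fid in FIELD_IDS.items()
        if old.get(key)  # skip fields that are missing or empty in the old data
    ]
    return user_row, value_rows


def crosswalk_all(people, first_uid):
    """Convert every record, handing out consecutive uids from first_uid."""
    users, values = [], []
    uid = first_uid
    for old in people:
        user_row, value_rows = crosswalk_person(old, uid)
        users.append(user_row)
        values.extend(value_rows)
        uid += 1
    # The last uid handed out is the value the users_uid counter needs.
    return users, values, uid - 1


# Made-up sample records.
people = [
    {"name": "alice", "email": "alice@example.org", "city": "Exampleville",
     "country": "", "organization": "Example Org"},
    {"name": "bob", "email": "bob@example.org", "city": "", "country": "Canada"},
]

users, values, last_uid = crosswalk_all(people, first_uid=2)
print(len(users), len(values), last_uid)
```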


The heart of my website is the Post. On my site, the post acts in the way the node acts on Drupal, as the repository for a variety of different content types. In particular, there are three major types of Post on my site:
- links - the short items that populate OLDaily each day
- articles - the longer, blog-post pieces that I write
- comments - contributions from visitors to the site

The rough equivalents in Drupal are the 'story' and 'page' content types. Neither really matches my type of content. Additionally, comments in Drupal are treated as a separate content type entirely.

In both my system and in the Drupal system, comments are associated with posts (or other content items) via the post ID number (or the node ID number, in the case of Drupal). So it won't be that hard to preserve that association. Moreover, both posts and comments are associated with a user, where the user is the Drupal userid (and on my system, the corresponding person ID). So I can preserve authorship.

Additionally, posts that are links, on my system, are associated with authors and journals (or publishers). These had their own IDs on my system, and now have their own distinct node ID on the Drupal system. When crosswalking the author and journal data, I save these in a hash table, for example, $author{Drupal ID} = Old ID. This way I can easily convert from the old ID value to the new ID value.
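
The lookup can be sketched like this in Python. Note that the table here is keyed by the old ID, which is an assumption made so that the old-to-new conversion is a direct lookup; the function names, the ID numbers and the idea that new IDs are handed out consecutively are likewise invented for the illustration.

```python
def build_id_map(old_ids, first_new_id):
    """Assign consecutive new node IDs to old IDs, in the order given."""
    return {old_id: first_new_id + offset for offset, old_id in enumerate(old_ids)}


# Made-up IDs: three authors imported first, then two journals.
author_map = build_id_map([17, 4, 230], first_new_id=1)
journal_map = build_id_map([9, 12], first_new_id=4)


def convert_link(post, author_map, journal_map):
    """Return a copy of a link post with its author/journal IDs translated."""
    converted = dict(post)
    # .get() yields None for an ID that was never imported, rather than failing.
    converted["author"] = author_map.get(post["author"])
    converted["journal"] = journal_map.get(post["journal"])
    return converted


link = {"title": "Sample link", "author": 230, "journal": 9}
converted = convert_link(link, author_map, journal_map)
print(converted)
```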

That's it for today. More on posts tomorrow.
