BlogForever Interview
First,
we would like to understand your background. This will help us to understand
the context of your answers.
1.
Could you please tell us a
bit about your blogging experiences and why you are blogging?
I have
been blogging since before there was blogging. I first got on to the Internet in
’92-’93. My first online experiences were as early as 1981 working for Texas
Instruments, but that wasn’t the
Internet properly so called. There was this Texas Instruments internal worldwide
network. But I first got on to the Internet, really, when I started with
Athabasca University in 1987. But I really started using my Internet access
from them, as I said, in about ’92-’93, which would have been participating in
multi-user online games (MUDs). So, of course, I would have started writing then.
The first actual writing for the Web that I began doing would have been in
1995. I created my home page in 1995 and began writing and posting articles
almost immediately. Some of my earliest articles, which are still on my
page now, are from ’95, including, for example, a transcript from an online
conference that we hosted. I have been
writing articles since then.
What
motivated me to begin using what we would today call a blogging format is my
participation in online discussion lists (usually hosted lists, although
sometimes mailing lists as well). When I was using those, in mid to late
nineties, it occurred to me that the archives of these might not last forever.
And I should store my own contributions, because they were so brilliant (jokingly), so they would not be lost to
history. And that’s exactly what I did. If you look at my list of articles, I
have a page with all of my articles on my website. I forget how many there are,
I have to look it up. But there are hundreds and hundreds of articles. I am
just looking it up now, because I am not sure what it says. It says: 1,159
articles. That’s a bit out of date, so there are probably about 1,200 articles.
But at the head of these articles you see: “Posted to www-dev” or “Posted to
the HotWired mailing list”.
It turns
out that these concerns were prescient, because you can no longer access the
www-dev mailing list archives. HotWired was taken down in 1998 and all the
postings completely wiped out. It turns out it was a really smart idea for me
to keep my own content on my own website and so that is what I have been doing.
I mean there’ve always been these posts I have rigged to myself. Concurrent
with that, beginning probably in ’98, but officially on May 5th
2001, is my daily newsletter.
My daily
newsletter consists of short posts (not the longer ones that compose my
articles) and these short posts are generally links to external resources. This
has its origin as a mechanism for me to synchronise my bookmarks between home
and office. Back in the 90’s of course, if you wanted to save a website you would
bookmark it. Bookmarks were saved locally as HTML files, but it was very inconvenient to have one set of
bookmarks at work and then another one at home. So, I set up a little
database on my website and instead of bookmarking using the Netscape
bookmarking system I would just fill in a form and that would create a bookmark
on my website. These were all dated, these were all posted on my website and so
that became my daily newsletter.
So, I
have two forms of blogging – the articles and the newsletter. The articles come
as a means of me saving my posts to mailing lists and such. The newsletter is a
means of me saving my bookmarks.
2.
How do you facilitate or
prevent that your blog will be found by other people?
Good
question. When I first started, whenever I posted a link to so-and-so’s
article in my newsletter, I would then send an email to so-and-so
and say “I posted a link to your article in my newsletter”. I
don’t do that so much now because the people I would send an email to are
generally reading the newsletter anyway and they already know.
My URL
is in my email signature, and has always been in my email signature. It used to
be on my business card, but we have a new policy at NRC that only the Institute
website goes onto the business card. So it is no longer on the business card.
Other
than that, I post short Twitter posts to indicate when I have written a longer
article. That is typically a lengthier article on the Blogger blog, Half an
Hour. These Blogger blog articles are eventually migrated into my website. I
also have a system set up so that when I create a post in my newsletter it
automatically feeds to a separate Twitter account called @oldaily. My main
Twitter account is @Downes, my separate Twitter account is @oldaily, and this
one is fed automatically from my newsletter.
I’ve
been manually posting posts recently in Google+. I don’t expect to do that
forever. I expect to automate that at some point because it takes a few extra
minutes every day. What else do I do? I think, probably, the biggest thing is that I
am a prolific commenter on other people’s websites. I do a lot of reading. I
read hundreds of posts a day, possibly thousands of posts a day. It is a
ridiculous number. And very frequently, I leave comments on those other posts. Even
if I am not going to write about them in my own newsletter I’ll start babbling.
A lot of the times these comments I leave on other posts become articles on my
own site, because once it gets beyond one or two paragraphs that same reasoning
clicks in: “Oh, I better keep a copy of this, because I can’t be guaranteed
that it will be forever available on the other site”. And, so I snag it and
make it an article on my own site.
When I
make comments on other websites, much of the time (almost all of the time
these days) they ask for an email address, name and website URL, which means my
comment appears on the other website and when people click on my name, they go to
my website. I know that drives a lot of traffic.
Do you use specific keywords or tags?
No, not
really.
3.
Who has the right to do
what with your blog content or any data from blog? How do you indicate and
control the rights for your content?
I do not
control the rights. Life is too short to be controlling things. I use a
Creative Commons licence, by-non-commercial-share-alike. It is supposed to be 3.0, but I
don’t know what version it actually is – I don’t care. My interest in
controlling what people do with my stuff is, honestly, minimal. I put the
licence up only because it makes it easier for other people to use my content,
so they don’t sit there and wonder whether they are allowed to do stuff with it
or not. I don’t like putting them through that kind of angst. But this is just
something that doesn’t matter to me.
4.
Are you interested in
possible interconnections between your blog and others?
Well,
there are interconnections – they are called links. And every one of my
newsletter posts links to something else. Every single one! And most of my blog
posts link to something else. As for forming blogging networks or some sort of
group-like behaviour, I am not interested in that.
5.
In a platform where you
browse and search for blogs and the relations between them what would make the
user interface comfortable and intuitive?
It
depends on what you are trying to do. I mean, you probably want some sort of a
visual representation. Lists and lists and lists of blogs, or forms
that have you search for blogs, aren’t really very helpful, because, to
begin with, I use Google. Google has a great search. Any other service is very
unlikely to match Google search. The only advantage that a local site can
provide with respect to search is to limit the range of search results.
On my
own website I harvest content from other blogs. I have about (I forget) a
thousand to 13 hundred or so blogs that I aggregate and put that content into
my database. And search on that is useful because it is limited to the content
of these 13 hundred blogs. So, if you were doing something like that, search
limited to this content might be useful, but it would have to be a well curated
set of blogs. You don’t want just 13 hundred random blogs. You don’t want:
“anyone can submit a blog”, because you get ten useful blogs and 25 spam blogs,
which would be a big problem.
When you
are talking about the inter-connections between the blogs, it is hard to
describe how visually it should be represented, although, I have my own ideas
on that, but it should be represented visually as a function of contribution
and a function of time. So, I think there should be a time axis. Typical representations
of the linkages between blogs are never indexed to time. It is always this
network and a network guy who says: “here’s a blog, here’s a blog, and there is a
line between them”. But in the real world, the relationships between blogs aren’t
like that. They are not static. It is not a one-time thing such as: “here is a
network – forever and a day”. It is very fluid, very dynamic. So, having a way
of representing that would be important.
And so,
the idea is that there would be this image that would change over time and you
would see these changes over time as you came back to it on a regular basis or
as you subscribed to it. How you do that could be a long, long discussion.
What
else? You are asking what would make it easy for users. My experience is that,
with a few exceptions (these exceptions are sites like Facebook, Google, maybe
Yahoo, perhaps Flickr, YouTube and a few others) people don’t go to
webpages. It is very rare, you hardly ever see it. The only time you really see
people going to webpages is if the address of that webpage is returned as a
result of a search. As when you do a Google search, get a link, click a link
and you are on that page. A person doesn’t just go back to a web page without
some sort of a search or other prompt. So, to make any sort of system like this
useful to users, it is going to have to provide that prompt that will send
users to whatever information it is that they want to receive on a regular basis.
That is
why, in the Massive Open Online Courses we have, where we do have this
network of blogs that are written by individuals all over the place and at the
current Connectivism course we have something like 270 blogs (I am just looking
it up, because I like to use exact numbers). We have 260 separate blogs that
people write. And it is essential for us to have a central newsletter that we
send out to every participant every week day, so when somebody has posted a new
post at one of those 260 blogs it shows up in the newsletter. Because it is
impossible for somebody to go to
all of those 260 blogs, and they are not even going to come to our course website to
see what’s there. Even though they know there will be something new every day
they won’t come to the website, it will not occur to them. So having this
prompt is crucial, absolutely crucial. The course couldn’t have run without it.
And I think that will be the same for your service.
6.
Are you interested in how
your blog is ranked among blogs for the different subjects and how do you check
that?
I just
assume it is first (jokingly). The short
answer is no. I suppose, if somebody came up with a ranking and I was like 81st
I would kind of wonder. But you know, does my blog rank higher or lower than George
Siemens’s, that really is a pretty irrelevant question. Even the question of
ranking itself raises the question of what would constitute the ranking. Is it
number of visits? Is it amount of time spent on the blog? Is it the number of
posts out there in the world that are spawned off the post on my blog? Is it
the number of links from other blogs back to my blog? You can come up with a
bunch of other measures. You can look at ranking services such as Alexa or Klout
and come up with other ideas on ranking, and they all turn out, in the end, to
be kind of arbitrary, and kind of snapshot-ish and kind of quantity-focused.
What
really interests me about my blog (with respect to inter-relations to other
people) is whether it is first to come out with a concept or an idea, and I
have no idea if you guys can rank that or if you can rank that automatically.
It interests me if I authored a unique (well, no, not even unique, it doesn’t
need to be unique), an informed, an insightful perspective or a point of view
that matters. I would prefer to be right more often than other blogs. To me,
the number one ranking would be: “I have more factually true statements in my
blog than any other blog”, demonstrably and knowably so. But who is going to
rank them based on that? And you do not want to rank that trivially, because
someone will just start posting dictionary articles, and they get lots of
factually true statements. So, say, the highest number of factually true
statements that are contextually relevant to the current debate. If you measure
that, then that would matter to me, but ranking on the number of readers… I am
never going to have the most readers, never ever, ever, and anything
that tempts me to want to have the most readers is actually detrimental to
me, because it means that I am going to be broadening my coverage in order to
attract a wider range of interests and that I am going to be making my coverage
less deep, or minimally, less idiosyncratic, again to capture a broader
demographic. I don’t want to do either of those things. Either of those things
would damage the integrity of the blog.
7.
By what other criteria would
you like to see your blog ranked?
Oh, I
see what you mean. You know what, it is not a competition. It has never been a
competition. I am not racing against
other blogs. I am working with them.
Since I started, especially in the newsletter, but also in the blog, I have
tried to direct traffic away from my
site to other people. This whole idea
of ranking blogs creates a competition where there isn’t one. I mean, would you
rank the rivets on an airplane? That would be stupid, right? What is the number
one rivet on your argh…? It is a dumb thing to ask for. And, in the same way,
ranking the blogs in the Blogosphere is like ranking the rivets on an airplane.
What matters is that they each hold their own part of the airplane together.
That is all that matters. And to suggest that one of them is more important
than the other makes no sense. It is an incoherent concept and you should not
do it.
8.
Would you be interested in
an analysis of your blog (or part of your blog) to extract for example:
statistics (popularity, visits, etc), keywords or sentiments and why?
Well, I
don’t know how you are going to do visits. Good luck with that. You are not
going to get accurate data, because I know that traffic to my blog comes from a
wide variety of sources. I get stuff from RSS feeds, I get audio listens,
people look at my content on other sites like Flickr or YouTube or Blogger. My
content isn’t even located on a single website. If I were trying to increase the
ranking of my blog I would make sure everything went on one site. If I am trying to
increase the usability, I put different stuff in different places. So, you will
not be able to get accurate statistics of the readership on my blog - period. I
don’t have accurate statistics. I mostly don’t care. In theory I could have
sort of accurate statistics, but … I actually have a weblog analyser that I
started up over the summer; after ten years of running the thing I have finally
turned on the hit counter. But even that does not record the number of views on
RSS readers and the like. So, that part of it I am not too interested in.
The
semantic analysis is interesting, because I am always interested in how people
see what I am doing – that’s interesting. If it is just the identification of
keywords, I have done that: you submit your blog to Wordle or whatever and get
the word-cloud. I would want something that is a little bit more insightful. I know
that there are text analysis software packages available that you can run it
through, EPSS or whatever. So, that would be kind of interesting. Comparing the
focus, you know, Stephen talks a lot more about cognitive structures than George, who
talks mostly about social structures, that would be kind of interesting. I
think that would be really hard to do though. But because it is hard to do,
that is probably why it would be interesting.
9.
Do you archive or back up
your blog(s)?
Yes, I
do. Everything.
Can you describe the process of archiving or backing
up the blog(s) you are authoring.
Quickly?
No. Again, my content is scattered all over the place and there are lots of
good reasons for that. So, the simple rule of thumb is: I try to make sure that
there’s at least two places where any given instance of my content is. The
longer version is, I actually try to make sure there is three or four places
where everything is. Different content is archived in different ways. So, the
article content, any textual content from third party sites like Blogger or
comments or posts is retrieved and stored on my main site which is downes.ca,
in my database. And then I periodically do a backup of my database. I also have
a backup system through my website provider. I also save copies of my website.
That’s just the whole lock, stock and
barrel on hard drives, like I just copy the whole website over to my hard drive
at home. I then save that onto some backup hard drives.
Photos,
I make sure to send a copy of my photos to Flickr, and photos are backed up on
at least two, and usually more than two, hard drives. And I have a bunch of
photos also saved on DVDs.
Can you describe the process of accessing or restoring
information from your archive.
For
other people to access my “archive”, they are accessing the database off my
website and they just use the interface of my website. For me to access that,
it means getting the database file and re-loading it into the database and then
I access it as though it were the first bit.
If it is
the images, I just open up the hard drive, but mostly, from the
perspective of other people, you look at the images on Flickr. If Flickr ever
disappeared then the way they would access my archive is they would wait until
I found Picasa or some other image upload site and filed my images there. And
if that weren’t available, they would get them off my website.
Can you identify any problems/issues with the
procedures you are currently following?
They are
not all automatic. I would like to just be able to make content and upload it
and not worry about whether there is a backup and just have a backup that would
run automatically, and I wouldn’t have to do anything.
The Wayback
Machine was really good for that, and it saved me a bunch of times. And what I
really liked about it is – I didn’t need to do anything. The problem with the
Wayback Machine is that it wasn’t complete. It would capture snapshots, but on
a dynamic site like mine snapshots are hit and miss.
10.
Would you like to have a
real time, continuous and viewable archive of your blog? Can you imagine what
this service could be like?
Well
again, good luck with that. I mean, for 99% of people that is going to work
fine, for me it is a bit dicier. Would I be interested? Yes, sure. I think that
would be a cool thing to have. And again, anything that makes it easier to use
my stuff for other people, that’s cool. So, I am all over that. It is a bit
tricky, because what you consider worth archiving and what I consider worth
archiving might be two separate things. Again, because I have a wide range of
different content, in different places. If you are archiving all of them
separately that’s, kind of, not that efficient. So, you’d probably want to
bring them together. So, I have various
blog websites. The one I use most of all is my main website, downes.ca, as well
as the ‘Half an Hour’ website. But I also have another Blogger website called ‘Let's
Make Some Art, Dammit’ and then the photo-site, the YouTube site etc. If all of
those were pulled into a single archive that would be great. I think it will be
complicated to do it all separately. Yes, I think people would like it. I think
it would give me a little bit of peace of mind knowing that if I have a
complete crash and burn moment it is there – somewhere else.
Everybody
sometime in their life is going to have a complete crash and burn moment,
right? They would no longer be paying for anything, and all other sites will
stop being updated. So, having that would really help.
It would
be interesting to know. You say archiving, you mean archiving for ever? That
would be cool, especially if you’d archived my backlog (my back catalogue of
stuff going back to 1995) and kept it forever. I think that would be really
useful.
11.
If there would be a
preservation or archiving system for blogs how would you like to control which
of your content is captured and stored?
If it’s
content, capture and store it. Life is too short to be choosing which stuff to
archive and which stuff not to archive. If I create content I think it is worth
capturing and storing it. But that is just me. I think all my tweets should be
captured. I think random off hand remarks should be captured. I have got something
like 300 comments through the Disqus commenting system; I think those should be
captured.
I think
whatever I have created is worth capturing and I think it all forms a part of
the overall picture or tapestry that is my contribution to the World, whatever
that is. And I bet you most people feel that way. Even my Facebook posts. If
you could get into Facebook and pull the content out that would be cool.
I think
you will have an issue with duplicates, but, I mean, I do not want to be the
one editing duplicates. My content propagates automatically. So, I do not think
I want three of the same announcement, just because they showed up on different
systems: the announcement first on my site, then on Twitter, and then on Facebook,
because I have them daisy-chained. Just one will do. That is another long-winded
way of saying that I don’t really want to manage that.
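The duplicate problem described above can be sketched in a few lines. This is only an illustration, not how any real archive does it: the `fingerprint` and `deduplicate` names, the dictionary fields and the example URLs are all assumptions, and real daisy-chained copies (truncated tweets, rewritten links) would need fuzzier matching than this.

```python
import re
from datetime import datetime


def fingerprint(text):
    """Reduce an announcement to a comparable form: lower-case,
    without URLs or punctuation, and with whitespace collapsed."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def deduplicate(items):
    """Keep only the earliest copy of each announcement.

    Each item is a dict with hypothetical keys 'source', 'text' and 'posted'.
    """
    kept = {}
    for item in sorted(items, key=lambda i: i["posted"]):
        # setdefault stores the first (earliest) item seen for a fingerprint
        kept.setdefault(fingerprint(item["text"]), item)
    return list(kept.values())


copies = [
    {"source": "twitter", "text": "New post: hello world http://t.example/xyz",
     "posted": datetime(2011, 10, 1, 9, 5)},
    {"source": "website", "text": "New post: Hello World! http://a.example/1",
     "posted": datetime(2011, 10, 1, 9, 0)},
    {"source": "facebook", "text": "New post: Hello World!",
     "posted": datetime(2011, 10, 1, 9, 10)},
]
unique = deduplicate(copies)
```

Keeping the earliest copy means the version from the originating site survives and the propagated copies are dropped, so the author never has to edit duplicates by hand.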
How would you like to indicate that content should be
removed from the preservation system?
There
are two aspects to this. First, locating the content you don’t want to be in
the system. So, there needs to be some kind of content-location system, like a
search engine of some sort. And then, secondly, the function that actually does
the removing, where that function has a built-in safety check like: “Delete?
Delete forever? Really? Are you sure? This will delete it forever!”. And then,
when I do that, don’t actually delete it, but just change its status to “show
to nobody”. Because, I will make a mistake even though it said “are you sure,
are you sure”… I will at some point make a mistake. So you shouldn’t actually
delete it, you should just make it not viewable, unless the decision to delete
has been overturned.
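A minimal sketch of that removal flow, assuming a simple in-memory store: the `Archive` and `ArchivedPost` classes and their method names are made up for illustration. The point is only that "delete" flips a visibility flag after confirmation, so a mistaken deletion can be overturned.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedPost:
    post_id: int
    body: str
    visible: bool = True  # "show to nobody" when False


@dataclass
class Archive:
    posts: dict = field(default_factory=dict)

    def add(self, post):
        self.posts[post.post_id] = post

    def search(self, term):
        """Content-location step: find visible posts containing the term."""
        term = term.lower()
        return [p for p in self.posts.values()
                if p.visible and term in p.body.lower()]

    def delete(self, post_id, confirmed=False):
        """'Delete' only after confirmation, and only by hiding the post."""
        if not confirmed:
            return False  # the caller has not answered "Are you sure?"
        self.posts[post_id].visible = False
        return True

    def restore(self, post_id):
        """Overturn an earlier delete; nothing was ever destroyed."""
        self.posts[post_id].visible = True


archive = Archive()
archive.add(ArchivedPost(1, "An off-hand remark about rivets"))
archive.delete(1, confirmed=True)
hidden = archive.search("rivets")      # nothing is shown any more
archive.restore(1)
restored = archive.search("rivets")    # the post is back
```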
12.
How do you facilitate or
prevent technically that your blog will be found and disseminated by search
engines?
I don’t.
I don’t care.
13.
How do you facilitate the
readers of your blog that they find related posts inside your blog?
I have
tried different things over the years and right now it is a bit haphazard. I
have an automatic tagging system (but it’s broken; I keep intending to fix it). I
know it is worth mentioning. Basically, I have a predefined list of topics. I
define each topic as a regular expression string. I apply that regular
expression string to any post in my blog. If there is a match that topic is
attached to that post. And then there is a list of topics, so when you click on
a particular topic in that list of topics you get the lists of posts that match
to that topic. I also capture author/publisher information and that does work.
It is not all hypothetical. But the topics are a horrible, horrible nightmare to
manage. If you try to build something like that you are going to need so much
caching it is not funny.
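The topic mechanism described above (a predefined list of topics, each defined as a regular expression and applied to every post) might look roughly like this. The topic names, patterns and function names are invented for the example; this is a sketch of the idea, not the actual code running on downes.ca.

```python
import re

# Hypothetical topic list: each topic is defined by one regular expression.
TOPICS = {
    "RSS": r"\bRSS\b|\bsyndicat\w*",
    "MOOCs": r"\bMOOCs?\b|massive open online course",
    "Copyright": r"\bcopyright\b|creative commons",
}

# Compile each pattern once instead of once per post.
COMPILED = {name: re.compile(pattern, re.IGNORECASE)
            for name, pattern in TOPICS.items()}


def topics_for(post_text):
    """Return the sorted names of every topic whose pattern matches the post."""
    return sorted(name for name, rx in COMPILED.items() if rx.search(post_text))


def build_index(posts):
    """Map each topic name to the ids of the posts that match it.

    `posts` is a dict of post id to post text.
    """
    index = {name: [] for name in COMPILED}
    for post_id, text in posts.items():
        for name in topics_for(text):
            index[name].append(post_id)
    return index


index = build_index({
    1: "Creative Commons licences for course content",
    2: "A new MOOC on syndication and RSS feeds",
})
```

Building the index ahead of time is one form of the caching mentioned above: clicking a topic then reads a stored list of posts instead of re-running every expression against every post.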
So, when
I submit stuff, I submit the name of the author and the name of the journal or
blog it was published in. People can click on the name, that name, anytime and
get a list of posts associated with that. I have got a graphing system that I
am just building now, but it is intended to track all the links from one post
to another post to another post. This is not implemented yet, but the idea
should be that you can follow links on links.
All my
newsletters, all my contents are searchable. I also have an archive of
newsletters that is Google indexable and that really helps a lot of people, because
a lot of people find related stuff just by searching on Google.
14.
If there would be a
preservation or archiving system for blogs and if there would be a special
access or interface for blog authors how should it work?
Invisibly.
Yes, I get what you are after. I have thought of this. First thing you have to
do is to be able to associate blog authors with blogs properly. That is the
thing that Technorati ran into years and years ago and they came up with a
“Claim Your Blog” system. There needs to be a mechanism by which you claim your
blog or blogs (because the relation will be multiple blogs to multiple people –
a many-to-many relationship).
Secondly,
we need to understand what sort of functionality that interface would entail. So
far, we have one functionality defined and that is to delete a post from the
archive. Hopefully, you are not tying up bloggers to more management than that
really. You are probably looking into some kind of a blogger profile and being
able to apply this profile and information to the blog to provide better
searching capability or some such thing. So, you want a profile editor of some
sort, but really, it is “yet another profile editor.” YAPE. The World already
has about a billion too many profile editors.
So, it would be nice if such a system would actually support access to
my blogging system, whatever that may be, through a mechanism such as OAuth or
some such thing. But again, how would you do it for someone like me who just
has his own standalone website? That is a bit problematic. You will need to allow
manual input as well as automatic input of data. It depends what you want your interface to
do: authoring a profile and deleting the posts that should not be archived? I
do not know what you want to do.
15.
Do you have any general
comments on the development of a blog aggregation, preservation, management
& dissemination software?
I would
really want to see it. If you are building this it is going to be hard. I know,
because I have built pretty much that. I used systems like that for our Massive
Open Online Courses. I used a system like that called EDU-RSS, that works off
and on. You are going to run into issues with specific types of blogs like Posterous
and Tumblr. You are thinking of tracking the relationships between blogs, I
think that is a very useful thing to do.
You are
going to find these relationships show up in odd ways; to give you one example
– images. X uses an image, Y uses the same image, that creates a link between X
and Y even though they may never have connected with each other. They may use the
same images at the same URL, which is easy to detect, they may use the same
image where the copies of the images are located at different URLs, even though
it is the same image. One may use the cropped version of the same image of
another. All of these kinds of things create these kinds of linkages.
Definitely go for the easy case. My aggregator, right now, analyses for links,
for embedded media, for images, anything I can find. And then I create separate
entity tables out of these and now I am able to draw links from people to blog posts
to images to comments to whatever. If you are going to do that, and you probably
will, you are probably looking at creating a giant global graph from entities
to entities and then creating some interesting linkages out of that. But this is
a lot of overhead. It is a lot of processing. It is going to be hard to do. It
is going to require a lot of hardware to pull off and bandwidth to pull off.
So, there are probably financial issues as well, which leads into questions
about sustainability. Keep me informed. I would really like to see how you
address all these challenges that I looked at. And feel free to talk to me
about any of the challenges you are facing because I may have already faced
them, because, I have been, like I said, this deep in this stuff. It is most of
what I am doing these days. And it is an area of a very deep interest of mine.
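The image example above can be sketched for the two easy cases: the same image at the same URL, and byte-identical copies at different URLs. The function names and the `(blog, url, bytes)` input shape are assumptions made for this illustration. Cropped or re-encoded versions would not be caught by an exact hash and would need some form of perceptual comparison, which is not shown here.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def image_key(image_bytes):
    """Identify an image by the SHA-256 hash of its bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def shared_image_links(uses):
    """Return the set of blog pairs linked by a shared image.

    `uses` is an iterable of (blog, image_url, image_bytes) tuples.
    Two blogs are linked if they use the same URL or the same bytes.
    """
    by_url = defaultdict(set)
    by_hash = defaultdict(set)
    for blog, url, data in uses:
        by_url[url].add(blog)
        by_hash[image_key(data)].add(blog)

    links = set()
    for blogs in list(by_url.values()) + list(by_hash.values()):
        # every pair of blogs in the same group shares that image
        for a, b in combinations(sorted(blogs), 2):
            links.add((a, b))
    return links


links = shared_image_links([
    ("X", "http://img.example/a.png", b"AAA"),
    ("Y", "http://img.example/a.png", b"AAA"),         # same URL as X
    ("Z", "http://other.example/copy.png", b"AAA"),    # same image, new URL
    ("W", "http://w.example/b.png", b"BBB"),           # unrelated image
])
```

Each pair that comes out is one edge in the kind of entity graph described above; the same grouping approach would extend to links, embedded media or authors, at the cost of the processing overhead already mentioned.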
Thank you very much indeed Stephen. That has been
really, really helpful.