Archive for April 2008

RIP Edward Lorenz

I got an SMS from a friend this afternoon letting me know that Edward Lorenz had died.

Lorenz was a meteorologist at MIT and more or less the father of chaos theory. I later joked with another friend that 2008 seems to be turning out to be the year that God decided he hates nerds – in less than two months we’ve lost Gary Gygax, co-creator of Dungeons and Dragons; Arthur C. Clarke, science fiction author extraordinaire; John Wheeler, accomplished physicist and the one who coined the term “black hole”; and now Lorenz. I wonder if Stephen Hawking will last the year…

Of all of these deaths, Lorenz’s has probably saddened me the most. I first learned about Lorenz in my penultimate year of high school, from reading James Gleick’s classic popular introduction to chaos theory, the appropriately named Chaos. If I recall correctly, I purchased that book after attending and being fascinated by a sort of combined lecture and computer experimenting session given by a mathematician at the University of South Australia the year before, as part of an “IT Careers Forum”. I distinctly recall having my imagination firmly captured by this book’s opening paragraph, describing numerical weather simulations that Lorenz ran in the 1960s, on a computer which was probably quite literally less powerful than the digital watch I wear today:

The sun beat down through a sky that had never seen clouds. The winds swept across an earth as smooth as glass. Night never came, and autumn never gave way to winter. It never rained. The simulated weather in Edward Lorenz’s new electronic computer changed slowly but certainly, drifting through a permanent dry midday season, as if the world had turned into Camelot, or some particularly bland version of southern California.

It sounds kind of cheesy now, but the idea of an entire world which existed as nothing more than a mathematical model running inside a computer fascinated me at the time, and in honesty it still does. I re-read Chaos, and other, more technical books on chaos theory and dynamical systems in general early on at university. These books were brimming with excitement at the idea that the unrelenting reductionism which had taken us so far in physics was starting to show its limits and that if we wanted to understand, say, the behaviour of clouds we had to stop thinking of clouds as a collection of countless individual particles, each with its own position, velocity, temperature and pressure, and instead to look for some new, more all-encompassing, purely
mathematical ways to look at the system. This sense of enthusiasm was probably entirely stale by the time I read about it in the 21st century, but nevertheless this stuff was definitely a substantial contributing factor in my eventual decision to major in mathematics instead of physics.

I haven’t really thought about chaos theory much in recent years, after my eventual wandering into the territory of pure mathematics and cryptography, and my now having ended up in statistical language modelling, but it definitely had an important influence on the person I’ve become. It remains a really fascinating, if often overlooked, part of our exploration of the world, and a comparatively accessible one for the lay person. It’s sad to see its founder go.

A Simple Unix Shell in Python

Here is an absolute bare-minimum Unix shell written in just 36 lines of Python
(including a few blank lines). It doesn’t do pipelines, input/output redirection,
tab-completion or command history; it’s just a really light wrapper around
the fork, execv and wait system calls, using the os module. The only internally
implemented function is cd, because a shell is a painful thing to use without it.


from os import chdir, execv, fork, getenv, wait
from os.path import exists
from sys import exit

while True:

    # Get input
    prompt = getenv("PS1", "$ ")
    try:
        line = input(prompt)
    except EOFError:
        exit()

    # Parse input, ignoring blank lines
    argv = line.split()
    if not argv:
        continue
    command = argv[0]

    # Do inbuilts
    if command == "cd":
        chdir(argv[1] if len(argv) > 1 else getenv("HOME", "/"))
    # Do external
    else:
        found = False
        for directory in getenv("PATH", "").split(":"):
            path = "/".join((directory, command))
            if exists(path):
                found = True
                if fork() == 0:
                    execv(path, argv)  # child: become the command
                else:
                    wait()  # parent: block until the child exits
                break  # run only the first match on PATH
        if not found:
            print("%s: Not found" % command)

This works quite smoothly; it doesn’t feel laggy or anything when you’re
actually using it. I suppose this isn’t unusual, because the Python functions in use
are thin wrappers around the underlying C system calls. It’s really pretty
neat, if slightly silly.

I have a somewhat more complicated version working, but at the moment it’s still
looking a bit too ugly to post in a blog entry – it handles interpolation of
environment variables, multiple semi-colon delimited commands on a line and has the
beginnings of pipeline and redirection support.
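
As a rough illustration of the first two of those features (not the more complicated version itself), here is a minimal sketch of splitting a line on semi-colons and interpolating environment variables. The function name parse_line is hypothetical, string.Template stands in for real shell expansion, and the naive split on ";" ignores quoting, so treat it as a starting point only.

```python
import os
import shlex
from string import Template


def parse_line(line, env=None):
    """Split a line on ';' and expand $VARS in every word.

    Returns a list of argv lists, one per non-empty command.
    """
    env = os.environ if env is None else env
    commands = []
    for chunk in line.split(";"):  # naive: does not respect quoted ';'
        words = [Template(word).safe_substitute(env)
                 for word in shlex.split(chunk)]
        if words:
            commands.append(words)
    return commands
```

Each argv list that comes back could then be handed to the same cd/fork/execv logic as in the shell above. Unlike a real shell, safe_substitute leaves unknown variables in place rather than replacing them with an empty string.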

I think I’ll end up writing an article that explains how Unix shells work,
using a Python shell as an example (because it’s much easier for the average
person to read than C), and put the complete shell up on my
software page.

On an unrelated note, here’s Super Mario World implemented in JavaScript. Wow.

Some Thoughts on Language Processing Algorithms

My approach to understanding natural language is what I imagine is the approach taken by most materialist scientists – that the brain is a computer made of meat and in trying to understand things like acquisition of language we are really searching for the algorithms implemented in this meat which achieve this task.

The problem of confirming that a given algorithm actually bears some resemblance to that running in our brain is an interesting one – not strictly necessary if we’re only interested in cool applications like talking computers (in which case the performance of the algorithm is about all we’re interested in), but probably deserving of attention if we’re operating under some pretense of being psychologists, which I suppose I am now (though I don’t like to think of it that way because I still haven’t yet had complete success in cleansing the word “psychologist” of the stigma of pseudoscience that it carries in my mind).

An obvious approach is to implement the algorithm in silicon rather than meat and get it to perform various tasks, making as many observations as possible about its performance and comparing these to similar measurements made on humans using the meaty algorithm. There’s a wide range of observations that could be used here (for example, some measure of susceptibility to linguistic “slip ups”, like spoonerisms) and I expect a lot of thought could be devoted to determining which tests are the most appropriate and reliable along these lines – a kind of “psycholinguistic Voight-Kampff test” which, rather than aiming to determine whether or not a machine can understand and converse in a way which is similar to humans on the surface, like a Turing test, aims to determine whether or not that machine is understanding and conversing in a way similar to humans “under the hood”.

But before we can even get to the stage where we could perform such testing, we need an algorithm to test, and I wonder if a lot of effort might not be saved by designing our algorithms from the outset to have a better chance of resembling the brain’s natural algorithms. The motivating question here is “What can we deduce about the brain’s language processing algorithms from the knowledge that they have been hard-coded into an organic organ by evolutionary forces?”. I’m a little bit out of my league here, having no real background in evolutionary biology or neurophysiology (which may well not even be a word), but studying maths gives you a fantastic arrogance when it comes to feeling qualified to talk about other people’s disciplines (after all, biology is just applied chemistry, which is just applied physics, which is just applied maths. Right?).

I have three somewhat solid thoughts on this front at the moment, all stemming from the idea that the brain, like most (all?) organic organs, probably displays a high level of self-similarity, i.e. has the property, or is composed of sub-parts which have the property, of containing lots of copies of a similar sub-structure. This tendency is a pretty obvious and natural consequence of organs growing via a process of repeated cell division. So what does this self-similarity suggest?

Parallelism. Some algorithms are highly susceptible to being made to run in parallel, with linear or sometimes even super-linear speed up achievable, whereas some algorithms really are inherently very serial. It seems natural to expect that the brain is much more likely to be running parallel algorithms (with similar activity happening in several similarly structured parts of the brain), so perhaps we ought to cast some doubt over any language processing algorithm which seems hard to parallelise.

Recursion and iteration. The more recursion and iteration involved in an algorithm, the less need there is in a meat implementation for different pieces of meat which do different things. If we are supposing that evolution will tend to produce a lot of similar brain parts rather than a wide range of unique brain parts, then perhaps we ought to cast some doubt over any language processing algorithm which does not contain a lot of recursion or iteration. This particular “restriction” (really more of an intuitive guide, I guess) puts the apparent current trend toward using Bayesian statistics in cognitive modelling in a good light, because the Bayesian paradigm is really all about iteration, in the sense of constantly updating our prior probability distribution in response to observations.

Sharing of data structures. There is more than one computational task in language processing. Sometimes we’re trying to translate a string of words into a logical relation between concepts and sometimes we’re trying to translate in the other direction. Obviously there are some data storage and searching issues related here – we need to store words, concepts and some sorts of mappings between them. Thus there are data structures involved here – not necessarily perfect analogues of the data structures one meets in a CS course (I doubt our brain uses literal hash tables, for instance), but data structures nevertheless. Presumably this data is stored in our brain only once, in one particular fashion. Thus, if you have one algorithm for translating in one direction and another for translating in the other, but they both use different data structures to represent concepts, words or the links between them, then regardless of how well the algorithms perform, perhaps we ought to suspect that at least one of them is not an accurate model of how the human brain actually works.
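
To make the remark about Bayesian iteration a little more concrete, here is a minimal sketch of the "update the prior after every observation" loop. It assumes the simplest possible setting, a Beta prior over the probability of a yes/no event (say, whether some particular word follows another), and the function names are purely illustrative rather than taken from any particular cognitive model.

```python
def update(belief, observation):
    """Return the Beta(a, b) posterior after one True/False observation."""
    a, b = belief
    return (a + 1, b) if observation else (a, b + 1)


def posterior_mean(belief):
    """Expected probability of the event under a Beta(a, b) belief."""
    a, b = belief
    return a / (a + b)


belief = (1, 1)  # uniform prior: no idea how likely the event is
for observation in [True, True, False, True]:
    # The same tiny step is repeated for every observation; yesterday's
    # posterior simply becomes today's prior.
    belief = update(belief, observation)
```

The point of the sketch is only that one small, identical operation is applied over and over, which is the kind of structure the self-similarity argument favours.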

Of course, it would be very foolish to interpret these as hard and fast guidelines, and I don’t mean to suggest that I will constrain my own studies only to algorithms fitting these criteria. But the very act of coming up with such a list is an interesting and, in my opinion, worthwhile exercise. I would be surprised if all three of these ideas were substantially wrong, and would advise that they at least be kept in mind while designing language processing algorithms that are supposed to mimic actual human language processing.

Citizen Video Journalism Meets Policing

There was an article in The Australian about a fortnight ago called “Police take a tip from YouTube”, discussing the plans of the NSW police (New South Wales is Australia’s oldest and most populous state) to roll out a website where private citizens can
anonymously upload photos and videos taken with their mobile phones which may be of assistance in solving crimes. The situation is somewhat reminiscent of what I wrote about last December in my Cryptographic Cameras article (see the “Emergency Response” section), an article heavily inspired by an essay by Bruce Schneier and others, and one which I really should get around to finishing sometime soon.

This is a really interesting idea and one I’ll tentatively call “good”. There are a few things which cause concern – there is obviously no system in place to guarantee the authenticity of uploaded photos and videos, as would be done using digital signatures in a true “cryptographic camera” system. This means that we can’t immediately discount the possibility of people uploading doctored media in an attempt to deceive police (perhaps framing an innocent, perhaps in an attempt to lead police away from actual leads) – I’m not sure whether the generally poor quality of mobile phone media would make Photoshopping harder or easier. Also, because the presence of GPS facilities in mobile phones is still pretty rare, there is no way to confirm that photos or video footage are actually of the location an uploader claims them to be, as there, again, would be in a true “cryptographic camera” system. Obviously, this is less of a problem the more distinctive the proclaimed location is.
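
For a feel of what an authenticity check involves, here is a minimal sketch using only the Python standard library. Note the substitution: a true “cryptographic camera” would use public-key digital signatures, which the standard library does not provide, so this uses an HMAC with a shared secret key as a stand-in. The function names and the idea of a per-device key are assumptions made for illustration.

```python
import hashlib
import hmac


def sign(media, key):
    """Return a hex tag binding the media bytes to the secret key."""
    return hmac.new(key, media, hashlib.sha256).hexdigest()


def verify(media, key, tag):
    """True only if the media is byte-for-byte what was originally signed."""
    return hmac.compare_digest(sign(media, key), tag)
```

A doctored photo would fail verify() because changing even one byte of the media changes the tag. With a shared key the verifier can also forge tags, which is exactly why a real system would want public-key signatures instead.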

The first thing that struck me as curious was the level of indirection introduced by having people upload their media through a website rather than sending it direct, and possibly even live, via the phone itself using MMS messages or video calls. When it’s possible to make one, a live video call has substantially more value than a recorded video uploaded after the fact precisely because it removes the possibility of doctoring a video. The reason this isn’t currently being considered, apparently, is a desire to
preserve the anonymity of the people doing the uploading. I realise this might seem unusual to some international readers, for whom mobile phones can be entirely anonymous – while honeymooning in Europe earlier this year my wife and
I were able to buy pre-paid SIM cards in Poland and the Czech Republic without leaving any sort of record of who we were. In Australia, you can’t get a mobile phone of any kind (to my knowledge, anyway) without showing a driver’s license or some other accepted form of ID, which means that your mobile number can always be linked directly back to you. On the one hand, I’m impressed and pleased that the police actually realise and really seem to care that anonymity is a valid concern here, but on the other, I’m not sure this solution is entirely effective. It’s a sure thing this website will record the IP addresses of uploaders, and in 9 cases out of 10 the police can track this to an individual or at least a household.

But whatever shortcomings this plan has for the time being, it’s sure to improve with time. GPS will eventually be standard in mobile phones, and it would be astonishingly stupid of handset manufacturers not to give users the option to “geotag” their photos and videos by embedding GPS data. If evidence arises that doctored material is being submitted, digital signatures could certainly be implemented if the police wanted to put enough money behind it. Ultimately, regardless of whether this experiment succeeds or fails, the very fact that the police are even considering using technology to provide private citizens the ability to conveniently and anonymously contribute to crime fighting is a fantastic and exciting thing.

At the same time, I would throw just as much if not more support behind a parallel site run by private citizens which is all about letting people provide photos and footage of police officers in action, letting us watch the watchers while we’re not helping them.

Pyblosxom Hack Number 1

Here’s my first “pyBlosxom hack”. It’s not really a hack on the
pyBlosxom system itself, it’s more of a “usage hack”, but I think it’s a
relatively neat one.

Back when this blog was statically rendered, I used to write the
entries in Markdown, and they stayed that way on the file system. The
statically rendered HTML pages were in proper HTML, however, because I
used the Markdown parser for PyBlosxom
(http://pyblosxom.sourceforge.net/registry/text/PyMarkdown.html). This
worked just fine for static rendering, but when I went dynamic I
immediately realised a huge problem with this set up. For some reason
the Markdown parser is unbelievably slow. It took literally whole
minutes for pyBlosxom to render the latest 10 entries, which is
obviously completely unacceptable.

I found this quite odd at first, because I write my articles in
Markdown too, and use Markdown in Python
(http://www.freewisdom.org/projects/python-markdown/) to translate them
to HTML. I had always assumed that PyBlosxom used the same Markdown
translation code – after all, why would someone code a Python Markdown
library if there was already one out there? But it turns out that in
fact this is what’s happened. The PyBlosxom renderer uses completely
different – and obviously much less efficient – code to Markdown in
Python.

The obvious solution to this problem would have been to wrap
Markdown in Python up in whatever interface pyBlosxom uses for
parsers, but I’ve solved it by doing something quite different
which gives me a fairly powerful interface to using pyBlosxom.

I’ve written a python script called makeentry which
does the following:

  • Starts up vi, my editor of choice, editing a temporary
    file in /tmp. I use this editing session to write an entry
    in Markdown. Note that I write just the entry, without the
    metadata that pyBlosxom would usually want at the start, like a
    title or tags.
  • Upon the vi process terminating after I finish writing
    the entry, it starts up aspell (http://aspell.net/) to spell check
    that file.
  • After spell checking the file, it (quickly!) translates the
    Markdown to HTML using Markdown in Python.
  • I then get prompted to enter a title and list of tags.
  • The title, tags and translated HTML entry are then all
    concatenated in the expected way into a file in my pyBlosxom entry
    directory (the filename is automatically generated from the title
    by converting to lowercase and replacing spaces with
    underscores).
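
The steps above might be sketched roughly as follows. This is not the real makeentry script: the function names are made up, the .txt extension and the “#tags” metadata line are assumptions about how the entry directory is configured, and the Markdown translation step is left out because it needs a third-party library.

```python
import subprocess


def edit_and_spellcheck(path):
    """Open the draft in vi, then run aspell over it once vi exits."""
    subprocess.call(["vi", path])
    subprocess.call(["aspell", "check", path])


def entry_filename(title):
    """Lowercase the title and replace spaces with underscores."""
    return title.lower().replace(" ", "_") + ".txt"  # extension assumed


def build_entry(title, tags, html):
    """Concatenate title, tags and translated HTML into one entry.

    Assumed layout: title on the first line, then a '#tags' metadata
    line, then the body.
    """
    return "%s\n#tags %s\n%s\n" % (title, ",".join(tags), html)
```

Writing build_entry(...) to a file named by entry_filename(...) inside the entry directory would complete the last step of the list.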

This way I still get to write in Markdown, but with the
following benefits over wrapping Markdown in Python up with
pyBlosxom’s parser interface:

  • I get to do spell checking (indeed, arbitrary
    pre-processing) before publishing my entry.
  • pyBlosxom reads the entry off the disk in HTML, so no time at
    all is consumed doing a translation (which is faster than even the
    fastest Markdown translator possible).

I quite like this usage paradigm. I’m hoping that sometime not
too far off I get the chance to add another level of
pre-processing: Pygments is a code colouring system (written in
Python, of course), which translates code in just about any modern
programming language into HTML with appropriate span tags to perform
syntactic code colouration. I’d really like it if I could have my
makeentry script search the HTML entry for code tags nested in pre
tags (using HTMLParser, http://docs.python.org/lib/module-HTMLParser.html,
from the Python standard library) and automatically replace the
contents with colourised code using Pygments. This would be pretty
cool and shouldn’t be too hard. Keep an eye out for it in the
nearish future.
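
A rough sketch of what that extra pre-processing step might look like is below. Two caveats: it uses a regular expression rather than HTMLParser to find the blocks, purely to keep the example short, and it assumes every block is Python source. The function names are hypothetical, and if Pygments isn’t installed the sketch falls back to leaving the code uncoloured.

```python
import re
from html import escape, unescape

# Matches code tags nested directly inside pre tags.
PRE_CODE = re.compile(r"<pre><code>(.*?)</code></pre>", re.DOTALL)


def colourise(html, highlight):
    """Replace each <pre><code>...</code></pre> block with highlight(source)."""
    return PRE_CODE.sub(lambda m: highlight(unescape(m.group(1))), html)


def plain(source):
    """Fallback highlighter: no colouring, just re-escape the source."""
    return "<pre><code>%s</code></pre>" % escape(source, quote=False)


def pygments_highlighter():
    """Return a Pygments-backed highlighter, or plain if it is missing."""
    try:
        from pygments import highlight
        from pygments.formatters import HtmlFormatter
        from pygments.lexers import PythonLexer
    except ImportError:
        return plain
    return lambda source: highlight(source, PythonLexer(), HtmlFormatter())
```

Calling colourise(entry_html, pygments_highlighter()) just before the entry is written to disk would slot this in after the Markdown translation.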

Why study language?

I’m more or less settled in at the university now and working four days a week on my PhD. The room that houses my office has only just recently had some renovations finished and it’s not exactly completely set up yet. It’s also so far below ground level that
there is absolutely zero mobile phone reception, which might be something of a pain, but which is also pretty hard to do anything about. Anyway, expect the entries in this blog to start revolving around what I’ve decided to tentatively declare “computational psycholinguistics” in the near future. And in that vein…

Why study human language? Three reasons stand out in particular for me.

  1. Language is something that is reasonably tractable by mathematical and scientific methodology. A lot of what goes on in psychology verges, in my opinion, on being pseudo-scientific rubbish. Any study, for instance, which revolves around things like one’s perception of oneself, or feelings or anything like that is immediately confronted with the fairly insurmountable problem that we can’t even precisely define these things, let alone measure them or model them. We don’t even properly understand conceptually simpler things, like memory, on which these grandiose ideas must surely depend. These psychologists are, metaphorically speaking, trying to fly to the moon before they’ve fully learned Newton’s laws of motion.

    I think language is in a different situation. It’s fairly easy to define what language, at its heart, is all about. We have two finite sets – one of words and one of concepts – and language is about mapping back and forward between finite sequences of words (more commonly known as “sentences” in the written case and, apparently, “utterances” in the spoken case) and logical relations between these concepts (which we might well call “ideas”). That’s what it is. Learning a language is nothing more than learning this mapping. This is perhaps an oversimplification – from a language perspective, we’ve side-stepped the issue of building words up from heard phonemes or seen morphemes, and from a mathematical perspective it’s true that we’re not really concerned with a mapping, but rather a relation, because one sentence can conceivably have more than one possible interpretation – but it certainly captures the essence of the problem and puts it in an entirely tractable form: finite sequences of elements from finite sets are not mysterious, ephemeral, intuitive things – they’re rigorously defined and well studied entities. We can do statistical analysis on them, we can define equivalence classes on them and we can generate them using stochastic or deterministic processes. Logical relations between concepts are nothing new or “squishy” either, and we can use things like predicate calculus to model them.

    In short, the study of language is firmly grounded in objective reality, thus letting one investigate the human mind – certainly an appealing area of study – without sacrificing one’s scientific integrity.

  2. Language is surprisingly fundamental to human cognition. Although it’s not clear under casual consideration, I think that on reflection it becomes an inescapable conclusion that language is inherently tied up – and very deeply so – with how humans form and internally represent arbitrary and often quite abstract concepts and categories. After all, we’re mapping back and forward between sentences or utterances and relations between concepts. The nature of these concepts and their initial formation, internal representation and long term storage can hardly be irrelevant. Sometimes when we map from a linguistic input into the conceptual “idea space”, the resulting idea has the long term effect of modifying the way we perform these mappings in future – i.e. when we are explicitly taught a new word.
  3. Language has some really cool applications in areas that I’m interested in. Better understanding of how humans understand and generate natural language can lead directly (thanks largely to point 1, i.e. that it’s understanding of something tangible) to giving computers better ability to do things like:
    • translate between human languages,
    • search the web,
    • automatically generate RDF triples for the semantic web,
    • intelligently aggregate related items from the overwhelming forest of online news sources and/or blogs
    • communicate with users in a more natural manner using speech and/or language recognition and synthesis.

    These sorts of applications are, I think, likely to fairly strongly influence the direction of my research, especially those related to the web.
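
To illustrate the picture from point 1, here is a toy sketch of language as a relation (rather than a function) between word sequences and “ideas”. The class name, the representation of an idea as a predicate-style tuple and the example sentence are all assumptions made for illustration. Both directions of translation share one store.

```python
from collections import defaultdict


class LanguageRelation:
    """A relation between sentences (word tuples) and ideas (tuples)."""

    def __init__(self):
        self.ideas_for = defaultdict(set)      # sentence -> ideas
        self.sentences_for = defaultdict(set)  # idea -> sentences

    def learn(self, sentence, idea):
        """Record that this sentence can express this idea."""
        words = tuple(sentence.split())
        self.ideas_for[words].add(idea)
        self.sentences_for[idea].add(words)

    def interpret(self, sentence):
        """All ideas a sentence may express (possibly more than one)."""
        return set(self.ideas_for.get(tuple(sentence.split()), ()))

    def express(self, idea):
        """All known sentences that express an idea."""
        return set(self.sentences_for.get(idea, ()))


language = LanguageRelation()
# One sentence, two readings: hence a relation, not a mapping.
language.learn("I saw her duck", ("see", "I", ("duck_of", "her")))
language.learn("I saw her duck", ("see", "I", ("ducks", "her")))
```

Here interpret("I saw her duck") returns both readings, which is the ambiguity that makes this a relation.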

So that’s it! Some of my reasoning behind devoting the next 3 years of my life largely to the study of human natural language processing.

My interview with Google

A little less than a month ago, I received an email out of the blue from someone claiming to be a recruiter for Google, who had apparently stumbled upon my website and, having observed my background in both maths and computers, was wondering if I would be interested in talking about any opportunities for engineering work with Google. After a good hour trying to convince myself that this wasn’t a joke or a scam (I even did reverse DNS lookups on IP addresses in the email’s SMTP headers!) I replied saying I was interested – I had been offered my PhD scholarship only about 2 days beforehand, so was already resolved to either leaving my current job or having my hours scaled drastically back. Besides, it’s been extensively documented what an awesome place Google is to work at. Thus, I began an entirely unexpected interview process for a software engineering position that drew to a close today (I didn’t get it).

There is a lot of myth and legend surrounding job interviews with Google.  The blogosphere is full of the writings of those who’ve had interviews but not been employed – just do a search for “my interview with google” (without quotes). These people tell tales of highly unorthodox interview techniques with abstract or irrelevant seeming questions, having nothing to do with computers at all, from which Google’s recruiters can supposedly divine a deep understanding of how you think and thus whether or not they should give you free lunches. See  here, for example.

When I received my first phone call from Google, I was nervous. I was fully expecting to be asked to explain why manholes are round or how I would escape from a gigantic blender after being shrunk to the size of a coin. In fact, at no stage during this call or either of the two following calls was I asked any kind of bizarre trick question. The whole thing was quite orthodox, professional, and ruthlessly technical. In the end, I took a total of 3 phone calls, most of which were somewhere around 45 minutes in length, and exchanged a lot more emails. This entry is supposed to be a summary of my impressions of the whole experience, with some simple advice for future interviewees.

Note that I’m not going to share any specific interview questions that I was asked. The folks at Google asked me not to, and even though I didn’t sign anything, or even say anything that would to my understanding comprise a verbal contract under Australian law, I like to be a man of my word. Besides, you’ve really got nothing to gain from getting such a list, because if you couldn’t answer the questions on it without forewarning then you’re not likely to be able to answer any questions not on
the list, and when such a question inevitably comes up, you’ll lose. Besides, I’m sure Google aren’t stupid enough to not change their questions on a regular basis.

Anyway, starting at the start – why me? The HTTP access logs for my website show that on the day I received my first email from Google, someone got to my site from a Google search results page. The search term requested all web pages whose page title contained “homepage”, “about me” or “blog” and whose body contained the term “Python standard library”. Obviously, Google’s recruiters trawl the results of pages like this (occasionally replacing, I’m sure, the Python condition with other phrases relevant to other languages or technologies) looking for likely candidates. So having a good personal website that discusses your technical interests and experience will help Google find you. Of course, that’s no guarantee they’ll email you. Your website needs to
not suck, according to some criteria that I can’t enlighten you as to.

Once I was contacted, I was surprised and impressed by the nature of the emails I exchanged with Google’s hiring people. They were very relaxed and casual emails and everyone came across as friendly and open, although without seeming unprofessional. The people who scheduled my interviews were flexible and happy to accommodate late changes. It was pretty cool. Unfortunately, this aspect was sorely lacking in my eventual rejection email, which came from someone I’d had no earlier dealings with and was obviously generated from a template they use for everyone. Oh well.

As for the interviews themselves, they also tended to have a fairly friendly and casual atmosphere (the people doing the interviewing were geeks themselves). I’ve never had an interview before where the interviewer described things as “lame” or “freaking huge”! So while the kinds of questions you get asked are very technical and fairly demanding, they tended to feel to me more like a friendly fellow geek asking me questions than a harsh and uncaring CS lecturer giving me an oral examination, which is good.

The first thing I had to do in my first phone call was evaluate myself on a scale of 1-10 at a whole bunch of things, including specific programming languages. The style of questions I got thereafter was selected to play to my strengths, which is certainly something I appreciated, and also a good incentive for you to be honest during the self evaluation. I’ll try to give you a sense of what sort of questions you might get asked if your self evaluation looks like mine:

  • You will need to be able to quote the average and worst case running times (in Big-O notation) of some fairly standard CS algorithms, like searching or sorting common data structures.
  • You will need to know some Unix internals, all the way down to common system calls, their arguments and return values.
  • You will need to be able to devise algorithms to solve relatively simple problems, analyse the running time of your solution and possibly read back an implementation of it in a language of your choice (yes, you may be asked to put the phone down, write some code and then read it back).
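
To give a flavour of the first and third points (this is a generic textbook example, not one of the questions I was asked), here is a sketch comparing linear search, which is O(n) in the worst case, with binary search on a sorted list, which is O(log n). Both functions return the number of comparisons made alongside the index so the difference is visible.

```python
def linear_search(items, target):
    """Scan left to right: O(n) comparisons in the worst case."""
    steps = 0
    for index, item in enumerate(items):
        steps += 1
        if item == target:
            return index, steps
    return -1, steps


def binary_search(items, target):
    """Halve a sorted list each step: O(log n) comparisons in the worst case."""
    low, high, steps = 0, len(items) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps
```

On a sorted list of 1024 items, finding the last element takes linear_search 1024 comparisons, while binary_search never needs more than 11 probes.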

If any of these things are beyond you, you simply won’t fare well, so don’t bother accepting an interview until you’ve learned some of this kind of thing. Unless you have pretty sharp skills as it is, you would do well to study in between your interviews, and especially before your first one. I certainly wouldn’t have made it past the first one if I hadn’t skimmed through a data structures and algorithms text book the night before to refresh my memory on numerous things.

Overall, I really enjoyed the experience, even if I didn’t get the position.  It provided a really good excuse to spend some time brushing up on all areas of computer science (including stuff I probably should have learned ages ago) and served to identify gaps and weaknesses in my knowledge which I can now focus on fixing.

In case it’s not now clear, this was the reason why my research page said (until recently) that content would be forthcoming soon with probability 0.5. Obviously, this probability is now 1.0. I’m starting tomorrow, and am lucky enough to still be getting one day’s work a week at m.Net, which is the maximum allowed under the terms of my scholarship.

Commenting feature added, pyBlosxom headaches

As some of you may have noticed, this blog now features comments. I set this up over last Thursday and Friday. It wasn’t a straightforward procedure, and during these few days you may have encountered various problems with my website – even for parts of it that have nothing to do with the blog, because at one stage the internal URL rewrites that I asked lighttpd to do were stupid ones, owing to my inexperience with regular expressions. I apologise if this caused you trouble; everything should be working fine now.

The entire experience has rather substantially dented my confidence in PyBlosxom as a blogging platform. Certainly, I enjoy its flat-file simplicity and naturally far prefer to be using something written in Python rather than PHP, but the fact that the majority of its functionality is provided by third party plugins of an apparently mediocre quality – and certainly of ephemeral availability – with documentation that varies from non-existent to outright inaccurate does not give me warm fuzzy feelings. I’ll probably write an entry or two about the problems I’ve had in the coming weeks.

I have been entertaining grandiose plans of writing my own Pythonic blogging platform, based on CherryPy (which looks so genuinely fantastic that I can’t wait to use it for something) and Cheetah, which has served me well as part of my current home-brew system for generating this site. I would assuredly stick with flat text files over a database, although I might use an SQLite database for some things if I thought it would afford a significant gain in performance or code simplicity without too much of an increase in overall complexity. This is probably a pipe dream anyway, and certainly not something I could throw together in a hurry.

Until such a time comes as I write this imaginary CherryBlog, I think I will slowly devote time to hacking on PyBlosxom in an attempt to make it more usable. I’ll blog about anything half way decent that I come up with.

On a performance note, you may have noticed that the blog pages of this site are now substantially slower to load than they have been in the past. At the moment, PyBlosxom is running as a plain old CGI process, which of course means that the whole thing is as slow as it possibly could be. But that’s not to say it’s necessarily slow, of course. Under the super light load that this blog is currently getting things are bearable. I do intend to migrate away from CGI at some stage, if I don’t write my own system first – there is a WSGI version of PyBlosxom which I should be able to hook up to lighttpd using Allan Saddi’s flup library and either FastCGI or SCGI. Barry Pederson has provided a starting point for this in his own PyBlosxom blog. I’ve had quite enough of tinkering with this blog for a while, though, so this may not happen for a month or so.
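
For anyone wondering what the WSGI side of that looks like, here is a minimal sketch of a WSGI application. It is not PyBlosxom’s WSGI entry point, just a hypothetical stand-in showing the interface: a callable that stays loaded in a long-running process, instead of a fresh interpreter being started for every request as under CGI. The serve() helper uses the standard library’s wsgiref server for local testing; the flup/FastCGI hook-up is not shown.

```python
def app(environ, start_response):
    """A WSGI application: called once per request by the server."""
    path = environ.get("PATH_INFO", "/")
    body = ("You asked for %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the app locally with the standard library's reference server."""
    from wsgiref.simple_server import make_server
    make_server("localhost", port, app).serve_forever()
```

Because the process stays alive between requests, the cost of importing the blog code is paid once rather than on every page view.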