Archive for February 2009

The evolution of life in 60 seconds

Seed Magazine recently posted a rather neat video called “The Evolution of Life in 60 Seconds” which, as the name suggests, attempts to compress the 4.6 billion year history of the Earth, and of life on it, into a single minute, in the form of a changing cluster of words. It’s a cute little video and gives you a good sense of just how incredibly recent everything we take for granted actually is.

I find it hard to put into words just how captivating I find the history of the Earth (which Wikipedia has an excellent article on), and even more so how utterly inspiring I find it that we have actually been able to figure out as much about it as we have. I have thought for a while now that if I ever inexplicably find myself with an extreme excess of time and money, and the desire to develop an entirely new tool set, I would love to participate in a project to make a documentary which squeezes those 4.6 billion years into, say, 120 or 150 minutes. I would like it to be completely accessible to, say, a 14 year old who will soon have to make decisions about what to study in high school, but still interesting to adults. I’d also like to try to balance the emphasis between simple statements of fact – “this is what happened, and when” – and an explanation of the scientific processes which led us to those facts – “this is how we know what happened”, or perhaps better “this is why we think this happened”. Just-so stories won’t inspire anybody.

AMPC 09 and a renovated research page

As mentioned in a previous entry, I recently spent some time in Newcastle. The reason for this was to attend the 2009 Australasian Mathematical Psychology Conference – my first conference since beginning my PhD! I had a really good time there. A lot of the material that people spoke about was somewhat opaque to me with my complete lack of background in psychology, but at the same time it was quite encouraging to get a sense of the extent to which people in Australia are doing good quantitative work on psychology. I came back with a lot of techniques and buzzwords underlined in my notes that I need to get up to speed with! I gave a talk myself which went relatively well. In hindsight I probably tried to pack more material into 20 minutes than is readily manageable, but it was hardly a disaster and at least now a lot more people know that I and my work exist.

It’s really been a very long time since I blogged anything regarding my research. In fact, this is the first thing I’ve written about it that is anything more than the preliminary mental wanderings I had before I had actually accumulated any sort of knowledge about the field. This has not been due to a lack of work or ideas worth writing about. Far from it! It has been mostly because I have wanted to avoid alienating non-specialist readers by talking about concepts which I have not properly introduced, but at the same time have struggled to find the time and motivation to write a good series of grounding posts introducing the relevant concepts and paradigms.

Upon getting back from AMPC09 I reasoned that for the next week or two there would be a slight increase above the baseline probability of fellow academics working in language acquisition (or computational cognitive science in general) visiting my website, and so I made a quick effort to tidy up my research page somewhat. As a consequence of this, I can now point interested readers to my informal research overview to get a rough idea of just what kind of work I do. You’ll also find on my research page copies of a paper I currently have under review for the 31st Annual Meeting of the Cognitive Science Society (which is happening in Amsterdam in July) and of the presentation slides I used for my talk at AMPC09.

For better or worse, I am going to consider the availability of all of this new material sufficient for me to start blogging about my work again. Hopefully I can manage to keep my entries non-technical enough to be accessible but still interesting!

PrettyTable 0.1 released

Today I released a simple Python library that I wrote during a 3 hour train ride from Newcastle to Sydney last weekend (more on what I was doing there in a later entry!), which I’ve decided to call PrettyTable. It contains a single class of that name whose job is to make it easy to print nice-looking ASCII tables like this:

+-----------+------+------------+-----------------+
| City name | Area | Population | Annual Rainfall |
+-----------+------+------------+-----------------+
| Adelaide  | 1295 |  1158259   |      600.5      |
| Brisbane  | 5905 |  1857594   |      1146.4     |
| Darwin    | 112  |   120900   |      1714.7     |
| Hobart    | 1357 |   205556   |      619.5      |
| Sydney    | 2058 |  4336374   |      1214.8     |
| Melbourne | 1566 |  3806092   |      646.9      |
| Perth     | 5386 |  1554769   |      869.4      |
+-----------+------+------------+-----------------+

Some of you may recognise the style of table from the PostgreSQL shell psql, which was the inspiration for PrettyTable.

It’s quite a simple little piece of code (you can read about the various options at the page linked to above, and you can even see the Pydoc API – this is actually the first time I’ve used Pydoc on my own software!) but it’s also the kind of thing that I suspect will actually find use in a wide range of future projects, both of my own and hopefully of others.
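For readers curious about what is involved in drawing a table like the one above, here is a minimal, self-contained sketch of the layout logic. To be clear, this is not PrettyTable's own code or API: the function name `render_table` and its `left_aligned` parameter are invented purely for illustration. The sketch assumes that each column is as wide as its longest cell, that there is one space of padding on either side, and that cells are centred unless the column is listed as left-aligned.

```python
def render_table(headers, rows, left_aligned=()):
    """Render headers and rows as a psql-style ASCII table (illustrative only)."""
    # Convert every cell to text so that widths can be measured.
    cells = [[str(value) for value in row] for row in rows]

    # Each column is as wide as its longest entry, header included.
    widths = [
        max([len(header)] + [len(row[i]) for row in cells])
        for i, header in enumerate(headers)
    ]

    # Horizontal rule: one run of dashes per column, plus room for padding.
    rule = "+" + "+".join("-" * (width + 2) for width in widths) + "+"

    def format_line(values):
        parts = []
        for header, value, width in zip(headers, values, widths):
            if header in left_aligned:
                text = value.ljust(width)
            else:
                text = value.center(width)
            parts.append(" " + text + " ")
        return "|" + "|".join(parts) + "|"

    lines = [rule, format_line(headers), rule]
    lines.extend(format_line(row) for row in cells)
    lines.append(rule)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(
        ["City name", "Area", "Population", "Annual Rainfall"],
        [
            ["Adelaide", 1295, 1158259, 600.5],
            ["Brisbane", 5905, 1857594, 1146.4],
            ["Darwin", 112, 120900, 1714.7],
        ],
        left_aligned=("City name",),
    ))
```

Measuring every cell before printing anything is what keeps the vertical bars lined up, at the cost of needing all of the rows in hand before the first line can be drawn.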

Zine, another Python blogging engine

I’m a bit late in blogging about this, but last month saw the release of Zine, a blog engine written in Python and in the spirit of WordPress – i.e. having a user-friendly web interface, nice looking themes, a lot of plugins, etc. It is the only Python blog engine that I know of in this style. These are very early days for the project, but already it looks pretty impressive and I have to admit I am tempted to start using it instead of investing more effort in CherryBlosxom.

Where CherryBlosxom uses CherryPy to handle the HTTP side of things, Zine uses Werkzeug (which, if I recall correctly from my quick peek when it was released, is actually even lighter and simpler than CherryPy). Where CherryBlosxom uses Cheetah for its templating, Zine uses Jinja (which I’ve never really looked at before). Where CherryBlosxom keeps everything on the filesystem, Zine uses SQLAlchemy (which I’ve never used but have consistently heard very high praise for) to store entries in MySQL, PostgreSQL or SQLite. The fact that it works with any of these three major database engines gives it a good advantage over WordPress, which only works on MySQL.

I don’t really see Zine as competition for CherryBlosxom as such, since the two are striving to fill very different ecological niches. I am sure that Zine will appeal to most people, and if Zine only gets better than how it looks now I expect that it won’t be too long before it is the preferred blog engine for Pythonistas, since it will be able to do anything WordPress can do just as well – and perhaps more, perhaps better. Nevertheless, I know that at first glance the two are going to look like they’re in competition by virtue of both being blog engines in Python (of which there are surprisingly few), so the release of Zine is somewhat motivating me to put more effort into CherryBlosxom.

I’d like to put some proper, intelligent caching functionality into CherryBlosxom – so that GETs of entries or entry lists are cached (i.e. the filesystem isn’t hit to get the entry text each time) but as soon as someone comments on the entry the cache is invalidated (so that the next GET does hit the filesystem and pick up the new comment along the way). I’d also like to write some command line tools to make using CherryBlosxom easier, so that people aren’t directly writing files in their entries directory, and to automate things like spell checking, etc – much like the old setup I had for PyBlosxom, discussed here. Perhaps I’ll release CherryBlosxom 0.2 with some steps in this direction (oh, and RSS/Atom feeds, of course) sometime soon.

NetBSD on the eeePC continued

Three consecutive entries on the same topic!

I bought a 4GB SDHC card for my eeePC server a few days ago. I used fdisk, disklabel and newfs to replace the default FAT32 partition on it with a 256MB /var partition (far larger than I am likely to ever need, but storage space is cheap these days and the pain of a full /var is great) and a partition taking up the rest of the space which I’ll use for my websites and web logs, my Postgres databases, etc. I edited /etc/fstab so that these partitions are mounted automatically on boot up and everything seems to just work. There should now be relatively few files on the eee’s SSD which are regularly overwritten. I’m actually quite inexperienced with removable media like SD(HC) cards and USB sticks and the solid state technology behind them (since I cut my Unix teeth at around the time floppy disks were starting to disappear in favour of CDs). Having now done a little bit of research (to find out, for instance, if my eeePC could read an SDHC card as opposed to a plain SD card), I’m not sure of the extent to which the general paranoia about SSD life is warranted – wear levelling technology seems to do a lot to alleviate the issue and is apparently fairly widespread. At any rate, SD(HC) cards are cheap enough that the extra caution can’t really hurt.

I’ve transferred my HTTP(S) and SMTP services to the eeePC and things seem to be running quite nicely. My old server is still running for the purpose of providing NFS and Samba access to the large collection of multimedia files that I don’t have room for on the eee, but once I buy an external hard drive case I will transfer that over too and retire the old machine. If you notice any problems with web or mail services here in the next few days it’s probably just me ironing out bugs. At any rate, all of the most complicated and critical work in this project is pretty much wrapped up and I’m really happy with how smoothly it went.

For now my main concern is whereabouts I’m going to put the thing! I’d rather not keep it on the floor under my desk like my previous server, because obviously it’s a lot less physically sturdy and I’d hate to accidentally step on it. I’m not too fond of the idea of putting it on my desk, either, because it would take up space which is at a serious premium, and would also put the machine at risk of having coffee spilt on it or the like. I’d really like to be able to keep it in a desk drawer – out of sight and out of mind. I could drill a hole in the back panel of the drawer large enough to feed the power cable and ethernet connection through. My only concern with this is whether or not there would be sufficient ventilation to keep the thing running at a safe temperature. I’m already a little bit nervous about running the eee with the lid permanently closed, especially since ordinarily the keyboard acts as a heatsink (see, e.g., step 5 of this disassembly guide). Other eeePC users have asked about this very issue before and the consensus seems to be that there isn’t much of a problem with doing it, but I have to assume that most people who claim to have done it without problems weren’t also keeping the thing stuck in a drawer. I could always drill a whole bunch of ventilation holes in the back of the drawer, or even remove the back panel entirely, I guess. I’ve also thought that I could stick one of those novelty USB fans in there with it to improve the airflow over the machine and out the holes at the back, but I’ve never actually seen one of those fans in real life so I’m not sure how much noise they make. If it’s a lot, they could be a really annoying solution.

NetBSD on the eeePC

Following on from my last entry, I have gone ahead and installed the very recently released NetBSD 5.0 Release Candidate 1 on my eeePC 701. I installed it from a USB stick as planned, by following this guide. Things went pretty much without incident. About the only thing which temporarily stumped me is that in order to boot the eeePC from a USB device, you need to press Escape during the initial loading screen – there’s nothing written on that screen to indicate this, and in fact the screen instructs you to press one of the F buttons to get a “boot menu”, which actually doesn’t have a USB option in it. I thought this was a bit counterintuitive.

After the install, the onboard ethernet device didn’t seem to be working – the kernel detected it without any issue but when I plugged it into my router the indicator light didn’t come on and I couldn’t get a DHCP lease. This thread at the overclockers.com.au forums suggests that this is a common problem on the eee after changing the OS. I’ve no idea how this supposed causal relation could actually exist, but nevertheless I disconnected the power and battery as suggested and ethernet worked just fine. The wireless device is detected but rejected by the default driver. There’s supposedly a patch that fixes this, but I haven’t tried it yet.

My biggest concern about setting the eee up as a NetBSD server is the fact that it has a solid state hard drive which has a finite life-span – according to some, around 100,000 write cycles per block (it is writing, not reading, that wears flash out). Thus, to maximise the life of the machine, it is important to decrease the amount of repetitive writing that goes on. I have taken a few steps to facilitate this: for instance, I have set the machine up with no swap space and mounted all my filesystems with the noatime option, which prevents updating of the “last accessed” timestamp every time a file is read. However, the issue of logging still remains to be dealt with. The /var/log directory is full of files which are written to constantly during the machine’s operation, and the HTTP logs for my website will be in much the same category. I think that before I actually deploy the machine as my new server I’ll invest in a 4GB or 8GB micro SD card and partition it in two – a small /var partition for all machine related logging and a larger /srv partition that will hold my website and the accompanying logs, as well as the PostgreSQL database for my weather data project.
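To get a feel for what a limit of that size means in practice, here is a rough back-of-the-envelope sketch. Every number in it is an assumption for illustration: the 100,000 cycle figure is the one quoted above, while the write rates and the two idealised models (no wear levelling at all versus perfect wear levelling across the whole device) are invented for the sake of the arithmetic and ignore real-world effects such as write amplification.

```python
SECONDS_PER_DAY = 86_400


def days_without_wear_levelling(cycles_per_block, rewrites_per_second):
    """Days until one block, rewritten in place at a steady rate, wears out."""
    return cycles_per_block / rewrites_per_second / SECONDS_PER_DAY


def days_with_ideal_wear_levelling(capacity_bytes, cycles_per_block,
                                   bytes_written_per_day):
    """Days until the whole device wears out if writes are spread evenly."""
    total_writable_bytes = capacity_bytes * cycles_per_block
    return total_writable_bytes / bytes_written_per_day


if __name__ == "__main__":
    cycles = 100_000  # figure quoted in the text

    # Worst case: a log file whose last block is rewritten once per second.
    worst = days_without_wear_levelling(cycles, rewrites_per_second=1)
    print("no wear levelling: %.1f days" % worst)

    # Best case: a 4GB device absorbing a hypothetical 1GB of writes per day.
    best = days_with_ideal_wear_levelling(4 * 10**9, cycles, 10**9)
    print("ideal wear levelling: %.0f years" % (best / 365))
```

The gap between the two cases is enormous, which is why a constantly appended log file is worrying on a device without wear levelling and much less so on one that has it.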

The other issue that I need to resolve before I completely replace my current big, heavy, noisy, power-hungry server with the almost invisible eee is file storage. In addition to acting as a web and mail server, the current machine has a 200GB hard drive in it which contains my music, videos and other such files, which are NFS exported to my home network so that I can access them from any of my other machines. The eeePC’s solid state hard drive is only 4GB, which is far too small for this job, and of course I can’t physically install an IDE drive or the like inside it to get more space because the machine is tiny. If money were no object I could buy something like one of Western Digital’s Passport portable hard drives, which are quite large in terms of capacity, quite small in terms of physical size and powered entirely by the USB connection. More likely I will end up buying an external case with a USB connection for the IDE drive in the current machine. This will be a lot cheaper, but unfortunately it will consume an extra power outlet and a bit more space.

I’m feeling optimistic about this project for now.