pozorvlak: (Default)
Sunday, June 2nd, 2013 09:15 pm
I've been learning about the NoSQL database CouchDB, mainly from the Definitive Guide, but also from the Coursera Introduction to Data Science course and through an informative chat with necaris, who has used it extensively at Esplorio. The current draft of the Definitive Guide is rather out-of-date and has several long-open pull requests on GitHub, which doesn't exactly inspire confidence, but CouchDB itself appears to be actively maintained. I have yet to use CouchDB in anger, but here's what I've learned so far:

  • CouchDB is, at its core, an HTTP server providing append-only access to B-trees of versioned JSON objects via a RESTful interface. Say what now? Well, you store your data as JavaScript-like objects (which allow you to nest arrays and hash tables freely); each object is indexed by a key; you access existing objects and insert new ones using the standard HTTP GET, PUT and DELETE methods, specifying and receiving data in JavaScript Object Notation; you can't update objects, only replace them with new objects with the same key and a higher version number; and it's cheap to request all the objects with keys in a given range.

  • The JSON is not by default required to conform to any particular schema, but you can add validation functions to be called every time data is added to the database. These will reject improperly-formed data.

  • CouchDB is at pains to be RESTful, to emit proper cache-invalidation data, and so on, and this is key to scaling it out: put a contiguous subset of (a consistent hash of) the keyspace on each machine, and build a tree of reverse HTTP proxies (possibly caching ones) in front of your database cluster.

  • CouchDB's killer feature is probably master-to-master replication: if you want to do DB operations on a machine that's sometimes disconnected from the rest of the cluster (a mobile device, say), then you can do so, and sync changes up and down when you reconnect. Conflicts are flagged but not resolved by default; you can resolve them manually or automatically by recording a new version of the conflicted object. Replication is also used for load-balancing, failover and scaling out: you can maintain one or more machines that constantly replicate the master server for a section of keyspace, and you can replicate only a subset of keyspace onto a new database when you need to expand.

  • CouchDB doesn't guarantee to preserve all the history of an object, and in particular replications only seem to send the most recent version; I think this precludes Git-style three-way merge from the conflicting versions' most recent common ancestor (and forget about Darcs-style full-history merging!).

  • The cluster-management story isn't as good as for some other systems, but there are a couple of PaaS offerings.

  • Queries/views and non-primary indexes are both handled using map/reduce. If you want to index on something other than the primary key - posts by date, say - then you write a map query which emits (date, post) pairs. These are put into another B-tree, which is stored on disk; clever things are done to mark subtrees invalid as new data comes in, and changes to the query result or index are calculated lazily. Since indices are stored as B-trees, it's cheap to get all the objects within a given range of secondary keys: all posts in February, for instance.

  • CouchDB's reduce functions are crippled: attempting to calculate anything that isn't a scalar or a fixed-size object is considered Bad Form, and may cause your machine(s) to thrash. AFAICT you can't reduce results from different machines by this mechanism: CouchDB Lounge requires you to write extra merge functions in Twisted Python.

  • Map, reduce and validation functions (and various others, see below) are by default written in JavaScript. But CouchDB invokes an external interpreter for them, so it's easy to extend CouchDB with a new query server. Several such have been written, and it's now possible to write your functions in many different languages.

  • There's a very limited SQL view engine, but AFAICT nothing like Hive or Pig that can take a complex query and compile it down into a number of chained map/reduce jobs. The aforementioned restrictions on reduce functions mean that the strategy I've been taught for expressing joins as map/reduce jobs won't work; I don't know if this limitation is fundamental. But it's IME pretty rare to require general joins in applications: usually you want to do some filtering or summarisation on at least one side.

  • CouchDB can't quite make up its mind whether it wants to be a database or a Web application framework. It comes by default with an administration web app called Futon; you can also use it to store and execute code for rendering objects as HTML, Atom, etc. Such code (along with views, validations etc) is stored in special JSON objects called "design documents": best practice is apparently to have one design document for each application that needs to access the underlying data. Since design documents are ordinary JSON objects, they are propagated between nodes by replications.

  • However, various standard webapp-framework bits are missing, notably URL routing. But hey, you can always use mod_rewrite...

  • There's a tool called Erica (and an older one called CouchApp) which allows you to sync design documents with more conventional source-code directories in your filesystem.

  • CouchDB is written in Erlang, and the functional-programming influence shows up in other places: most types of user-defined function are required to be free of side-effects, for instance. Then there are the aforementioned uses of lazy evaluation and the append-only nature of the system as a whole. You can extend it with your own Erlang code or embed it into an Erlang application, bypassing the need for HTTP requests.
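
To make the map/reduce view idea above a bit more concrete, here's a minimal sketch in plain Python. It is not CouchDB's API (real view functions are JavaScript by default and live in design documents); it just simulates the idea with a sorted list standing in for the on-disk B-tree. The documents, field names and function names are all made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical documents, indexed by primary key. Field names are invented.
docs = {
    "post-1": {"type": "post", "date": "2013-01-28", "title": "Conference report"},
    "post-2": {"type": "post", "date": "2013-02-07", "title": "Poem"},
    "post-3": {"type": "post", "date": "2013-02-19", "title": "Notes"},
    "user-1": {"type": "user", "name": "pozorvlak"},
}


def map_posts_by_date(doc):
    """Map function: emit (key, value) pairs for the documents we care about."""
    if doc.get("type") == "post":
        yield doc["date"], doc["title"]


def build_view(docs, map_fn):
    """Run the map function over every document and keep the rows sorted by key.

    The sorted list is a stand-in for the secondary B-tree.
    """
    rows = [row for doc in docs.values() for row in map_fn(doc)]
    rows.sort()
    return rows


def query_range(view, startkey, endkey):
    """Return every row with startkey <= key <= endkey, found by binary search."""
    keys = [key for key, _ in view]
    return view[bisect_left(keys, startkey):bisect_right(keys, endkey)]


def reduce_count(rows):
    """A well-behaved reduce: the result is a single scalar."""
    return len(rows)


view = build_view(docs, map_posts_by_date)
# ISO dates sort lexicographically, so a key range is a date range.
february = query_range(view, "2013-02-01", "2013-02-28")
print(february, reduce_count(february))
```

The range query only touches a contiguous slice of the sorted rows, which is the property that makes "all posts in February" cheap; the reduce step returns a count rather than a growing collection, in line with the restriction on reduce functions described above.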

tl;dr if you've ever thought "data modelling and synchronisation are hard, let's just stick a load of JSON files in Git" (as I have, on several occasions), then CouchDB is probably a good fit for your needs. Especially if your analytics needs aren't too complicated.
pozorvlak: (Hal)
Thursday, February 7th, 2013 12:41 pm
Quaffing the last of my quickening cup,
I chuck fair Josie, my predatory protégée, behind her ear.
Into my knapsack I place fell Destruction,
my weapon in a thousand fights against the demon Logic
(not to mention his dread ally the Customer
who never knows exactly what she wants, but always wants it yesterday).
He sleeps lightly, but is ready
to leap into action, confounding the foe
with his strings of enchanted rubies and pearls.
To my thigh I strap Cecilweed, the aetherial horn
spun from rare African minerals in far Taiwan
and imbued with subtle magics by the wizards of Mountain View.
Shrugging on my Cuirass of Visibility,
I mount Wellington, my faithful iron steed
his spine wrought in the mighty forge of Diamondback
his innards cast by the cunning smiths of Shimano
and ride off, dodging monsters the height of a house
towards the place the ancients knew as Sràid na Banrighinn
The Street of the Queen.

Just wanna clarify that in lines 5 and 6 I'm not talking about the Growstuff customers, all of whom have been great.
pozorvlak: (Hal)
Thursday, January 24th, 2013 09:59 pm
[Wherein we review an academic conference in the High/Low/Crush/Goal/Bane format used for reviewing juggling conventions on rec.juggling.]

High: My old Codeplay colleague Ally Donaldson's FAT-GPU workshop. He was talking about his GPUVerify system, which takes CUDA or OpenCL programs and either proves them free of data races and synchronisation-barrier conflicts, or finds a potential bug. It's based on an SMT solver; I think there's a lot of scope to apply constraint solvers to problems in compilation and embedded system design, and I'd like to learn more about them.

Also, getting to see the hotel's giant fishtank being cleaned, by scuba divers.

Low: My personal low point was telling a colleague about some of the problems my depression has been causing me, and having him laugh in my face - he'd been drinking, and thought I was exaggerating for comic effect. He immediately apologised when I told him that this wasn't the case, but still, not fun. The academic low point was the "current challenges in supercomputing" tutorial, which turned out to be a thinly-disguised sales pitch for the sponsor's FPGA cards. That tends not to happen at maths conferences...

Crush: am I allowed to have a crush on software? Because the benchmarking and visualisation infrastructure surrounding the Sniper x86 simulator looks so freaking cool. If I can throw away the mess of Makefiles, autoconf and R that serves the same role in our lab I will be very, very happy.

Goal: Go climbing on the Humboldthain Flakturm (fail - it turns out that Central Europe is quite cold in January, and nobody else fancied climbing on concrete at -7°C). Get my various Coursera homeworks and bureaucratic form-filling done (fail - damn you, tasty German beer and hyperbolic discounting!). Meet up with maradydd, who was also in town (fail - comms and scheduling issues conspired against us. Next time, hopefully). See some interesting talks, and improve my general knowledge of the field (success!).

Bane: I was sharing a room with my Greek colleague Chris, who had a paper deadline on the Wednesday. This meant he was often up all night, and went to bed as I was getting up, so every trip into the room to get something was complicated by the presence of a sleeping person. He also kept turning the heating up until it was too hot for me to sleep. Dually, of course, he had to share his room with a crazy Brit who kept getting up as he was going to bed and opening the window to let freezing air in...
pozorvlak: (Hal)
Sunday, December 9th, 2012 09:17 pm
I've been using Mercurial (also known as hg) as the version-control system for a project at work. I'd heard good things about it - a Git-like system with a cleaner UI and better documentation - and was glad of the excuse to try it out. Unfortunately, I was disappointed by what I found. The docs are good, and the UI's a bit cleaner, but it's still got some odd quirks - the difference between hg resolve and hg resolve -m catches me every bloody time, for instance. Unlike Git, you aren't prompted to set missing configuration options interactively. Some of the defaults are crazy, like not sending long output to a pager. And having got used to easy, safe history-rewriting in Git, I was horrified to learn that Mercurial offered no such guarantees of safety: up until version 2.2, the equivalent of a simple commit --amend could cause you to lose work permanently. Easy history-rewriting is a big deal; it means that you never have to choose between committing frequently and only pushing easily-reviewable history.

But I persevered, and with a bit of configuration I was able to make hg more comfortable (which is to say, more like Git). Here's my current .hgrc:
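# Identity and merge behaviour
[ui]
username = Pozorvlak <pozorvlak@example.com>
merge = internal:merge

# Command used to page long output
[pager]
pager = LESS='FSRX' less

# Extensions: an empty value enables a bundled extension, a path loads an external one
[extensions]
rebase =
record =
histedit = ~/usr/etc/hg/hg_histedit.py
fetch =
shelve = ~/usr/etc/hg/hgshelve.py
pager =
mq =
color =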

You'll need at least the username line, because of the aforementioned lack of interactive configuration. The pager = LESS='FSRX' less and pager = lines send long output to less instead of letting it all spew out and overflow your console scrollback buffer. merge = internal:merge tells it to use its internal merge algorithm as a merge tool, and put ">>>>" gubbins in files in the event of conflicts. Otherwise it uses meld for merges on my machine; meld is very pretty but not history-aware, and history-aware merges are at least 50% of the point of using a DVCS in the first place. The rebase extension allows you to graft a sequence of changesets onto another part of the history graph, like git rebase; the record extension allows you to select only some of the changes in your working copy for committing, like git add -p or darcs record; the fetch extension lets you do pull-and-merge in one operation - confusingly, git pull and git fetch are the opposite way round from hg fetch and hg pull. The mq extension turns on patch queues, which I needed for some hairy operation or other once. The non-standard histedit extension works like git rebase --interactive but not, I believe, as safely - dropped commits are deleted from the history graph entirely rather than becoming unreachable from an active head. The non-standard shelve extension works like git stash, though less conveniently - once you've shelved one change you need to give a name to all subsequent ones. Perhaps a Mercurial expert reading this can tell me how to delete unwanted shelves? Or about some better extensions or settings I should be using?
pozorvlak: (Hal)
Thursday, December 6th, 2012 11:41 pm

I've been running benchmarks again. The basic workflow is

  1. Create some number of directories containing the benchmark suites I want to run.
  2. Tweak the Makefiles so benchmarks are compiled and run with the compilers, simulators, libraries, flags, etc, that I care about.
  3. Optionally tweak the source code to (for instance) change the number of iterations the benchmarks are run for.
  4. Run the benchmarks!
  5. Check the output; discover that something is broken.
  6. Swear, fix the problem.
  7. Repeat until either you have enough data or the conference submission deadline gets too close and you are forced to reduce the scope of your experiments.
  8. Collate the outputs from the successful runs, and analyse them.
  9. Make encouraging noises as the graduate students do the hard work of actually writing the paper.

Suppose I want to benchmark three different simulators with two different compilers for three iteration counts. That's 18 configurations. Now note that the problem found in stage 5 and fixed in stage 6 will probably not be unique to one configuration - if it affects the invocation of one of the compilers then I'll want to propagate that change to nine configurations, for instance. If it affects the benchmarks themselves or the benchmark-invocation harness, it will need to be propagated to all of them. Sounds like this is a job for version control, right? And, of course, I've been using version control to help me with this; immediately after step 1 I check everything into Git, and then use git fetch and git merge to move changes between repositories. But this is still unpleasantly tedious and manual. For my last paper, I was comparing two different simulators with three iteration counts, and I organised this into three checkouts (x1, x10, x100), each with two branches (simulator1, simulator2). If I discovered a problem affecting simulator1, I'd fix it in, say, x1's simulator1 branch, then git pull the change into x10 and x100. When I discovered a problem affecting every configuration, I checked out the root commit of x1, fixed the bug in a new branch, then git merged that branch with the simulator1 and simulator2 branches, then git pulled those merges into x10 and x100.
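
The counting above is easy to check with a quick sketch. The simulator and compiler names below are placeholders of mine, and I'm assuming the same x1/x10/x100 iteration counts as in the last paper.

```python
from itertools import product

# Placeholder names: three simulators, two compilers, three iteration counts.
simulators = ["simulator1", "simulator2", "simulator3"]
compilers = ["compilerA", "compilerB"]
iteration_counts = [1, 10, 100]

# Every (simulator, compiler, iterations) combination is one configuration.
configurations = list(product(simulators, compilers, iteration_counts))


def affected_by(configs, *, simulator=None, compiler=None):
    """Configurations that a fix must reach; None means 'any value'."""
    return [
        config
        for config in configs
        if (simulator is None or config[0] == simulator)
        and (compiler is None or config[1] == compiler)
    ]


print(len(configurations))                                    # all configurations
print(len(affected_by(configurations, compiler="compilerA")))  # one compiler's share
print(len(affected_by(configurations)))                        # a harness-wide fix
```

A fix to one compiler's invocation lands on half the grid, and a fix to the harness lands on all of it, which is why doing the propagation by hand gets old so quickly.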

Keeping track of what I'd done and what I needed to do was frankly too cognitively demanding, and I was constantly bedevilled by the sense that there had to be a Better Way. I asked about this on Twitter, and Ganesh Sittampalam suggested "use Darcs" - and you know, I think he's right, Darcs' "bag of commuting patches" model is a better fit to what I'm trying to do than Git's "DAG of snapshots" model. The obvious way to handle this in Darcs would be to have six base repositories, called "everything", "x1", "x10", "x100", "simulator1" and "simulator2"; and six working repositories, called "simulator1_x1", "simulator1_x10", "simulator1_x100", "simulator2_x1", "simulator2_x10" and "simulator2_x100". Then set up update scripts in each working repository, containing, for instance

darcs pull ../base/everything
darcs pull ../base/simulator1
darcs pull ../base/x10
and every time you fix a bug, run for i in working/*; do $i/update; done.
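
The update scripts themselves are regular enough to generate rather than write by hand. Here's a rough sketch, assuming one working repository per simulator/iteration-count pair and the ../base layout used above; the function names are mine, and it only builds the script text rather than touching any repositories.

```python
from itertools import product


def update_script(simulator, scale, base="../base"):
    """Text of the update script for one working repository.

    Pulls the patches shared by everything, then the ones specific to this
    simulator, then the ones specific to this iteration count.
    """
    sources = ["everything", simulator, scale]
    return "".join(f"darcs pull {base}/{name}\n" for name in sources)


def all_update_scripts(simulators, scales):
    """Map each working-repository name to the text of its update script."""
    return {
        f"{simulator}_{scale}": update_script(simulator, scale)
        for simulator, scale in product(simulators, scales)
    }


scripts = all_update_scripts(["simulator1", "simulator2"], ["x1", "x10", "x100"])
print(scripts["simulator1_x10"])
```

Adding a third simulator or a fourth iteration count then means adding one base repository and regenerating, rather than editing every working repository by hand.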

But! It is extremely useful to be able to commit the output logs associated with a particular state of the build scripts, so you can say "wait, what went wrong when I used the -static flag? Oh yeah, that". I don't think Darcs handles that very well - or at least, it's not easy to retrieve any particular state of a Darcs repo. Git is great for that, but whenever I think about duplicating the setup described above in Git my mind recoils in horror before I can think through the details. Perhaps it shouldn't - would this work? Is there a Better Way that I'm not seeing?

pozorvlak: (Hal)
Thursday, December 6th, 2012 09:45 pm
Inspired by Falsehoods Programmers Believe About Names, Falsehoods Programmers Believe About Time, and far, far too much time spent fighting autotools. Thanks to Aaron Crane, totherme and zeecat for their comments on earlier versions.

It is accepted by all decent people that Make sucks and needs to die, and that autotools needs to be shot, decapitated, staked through the heart and finally buried at a crossroads at midnight in a coffin full of millet. Hence, there are approximately a million and seven tools that aim to replace Make and/or autotools. Unfortunately, all of the Make-replacements I am aware of copy one or more of Make's mistakes, and many of them make new and exciting mistakes of their own.

I want to see an end to Make in my lifetime. As a service to the Make-replacement community, therefore, I present the following list of tempting but incorrect assumptions various build tools make about building software.

All of the following are wrong:
  • Build graphs are trees.
  • Build graphs are acyclic.
  • Every build step updates at most one file.
  • Every build step updates at least one file.
  • Compilers will always modify the timestamps on every file they are expected to output.
  • It's possible to tell the compiler which file to write its output to.
  • It's possible to tell the compiler which directory to write its output to.
  • It's possible to predict in advance which files the compiler will update.
  • It's possible to narrow down the set of possibly-updated files to a small hand-enumerated set.
  • It's possible to determine the dependencies of a target without building it.
  • Targets do not depend on the rules used to build them.
  • Targets depend on every rule in the whole build system.
  • Detecting changes via file hashes is always the right thing.
  • Detecting changes via file hashes is never the right thing.
  • Nobody will ever want to rebuild a subset of the available dirty targets.
  • People will only want to build software on Linux.
  • People will only want to build software on a Unix derivative.
  • Nobody will want to build software on Windows.
  • People will only want to build software on Windows.
    (Thanks to David MacIver for spotting this omission.)
  • Nobody will want to build on a system without strace or some equivalent.
  • stat is slow on modern filesystems.
  • Non-experts can reliably write portable shell script.
  • Your build tool is a great opportunity to invent a whole new language.
  • Said language does not need to be a full-featured programming language.
  • In particular, said language does not need a module system more sophisticated than #include.
  • Said language should be based on textual expansion.
  • Adding an Nth layer of textual expansion will fix the problems of the preceding N-1 layers.
  • Single-character magic variables are a good idea in a language that most programmers will rarely use.
  • System libraries and globally-installed tools never change.
  • Version numbers of system libraries and globally-installed tools only ever increase.
  • It's totally OK to spend over four hours calculating how much of a 25-minute build you should do.
  • All the code you will ever need to compile is written in precisely one language.
  • Everything lives in a single repository.
  • Files only ever get updated with timestamps by a single machine.
  • Version control systems will always update the timestamp on a file.
  • Version control systems will never update the timestamp on a file.
  • Version control systems will never change the time to one earlier than the previous timestamp.
  • Programmers don't want a system for writing build scripts; they want a system for writing systems that write build scripts.

[Exercise for the reader: which build tools make which assumptions, and which compilers violate them?]
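
As a small illustration of the timestamp and hash items in the list above, here's a sketch of my own (not taken from any particular build tool) that compares the two change-detection strategies on a toy file. The file name and helper names are made up.

```python
import hashlib
import os
import tempfile


def content_hash(path):
    """SHA-256 of the file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def snapshot(path):
    """Record what a build tool might remember about a file after a build."""
    return {"mtime_ns": os.stat(path).st_mtime_ns, "sha256": content_hash(path)}


def changed_by_mtime(path, old):
    return os.stat(path).st_mtime_ns != old["mtime_ns"]


def changed_by_hash(path, old):
    return content_hash(path) != old["sha256"]


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "main.c")
    with open(src, "w") as f:
        f.write("int main(void) { return 0; }\n")
    before = snapshot(src)
    st = os.stat(src)

    # Case 1: the timestamp moves but the contents don't
    # (a fresh checkout, say, or a stray touch).
    os.utime(src, ns=(st.st_atime_ns, st.st_mtime_ns + 2 * 10**9))
    touched = (changed_by_mtime(src, before), changed_by_hash(src, before))

    # Case 2: the contents change but the timestamp is put back to its old
    # value (a tool that restores recorded timestamps, say).
    with open(src, "w") as f:
        f.write("int main(void) { return 1; }\n")
    os.utime(src, ns=(st.st_atime_ns, before["mtime_ns"]))
    edited = (changed_by_mtime(src, before), changed_by_hash(src, before))

print("touched only:", touched)
print("edited, timestamp restored:", edited)
```

Timestamps give a false alarm in the first case and miss the change entirely in the second. Hashes get both right here, at the cost of reading every file; the cases where hashes go wrong tend to involve things outside the file's contents, such as the rules used to build it.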

pozorvlak: (Default)
Thursday, September 13th, 2012 04:01 pm
I've recently submitted a couple of talk proposals to upcoming conferences. Here are the abstracts.

Machine learning in (without loss of generality) Perl

London Perl Workshop, Saturday 24th November 2012. 25 minutes.

If you read a book or take a course on machine learning, you'll probably spend a lot of time learning about how to implement standard algorithms like k-nearest neighbours or Naive Bayes. That's all very interesting, but we're Perl programmers - all that stuff's on CPAN already. This talk will focus on how to use those algorithms to attack problems, how to select the best ML algorithm for your task, and how to measure and improve the performance of your machine learning system. Code samples will be in Perl, but most of what I'll say will be applicable to machine learning in any language.

Classifying Surfaces

MathsJam: The Annual Conference, 17th-18th November 2012. 5 minutes.

You may already know Euler's remarkable result that if a polyhedron has V vertices, E edges and F faces, then V - E + F = 2. This is a special case of the beautiful classification theorem for closed surfaces. I will state this classification theorem, and give a quick sketch of a proof.
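
For the curious, here's a quick numerical check of Euler's formula, plus the standard relation between the Euler characteristic and the genus of a closed orientable surface (chi = 2 - 2g). The vertex, edge and face counts are the usual ones for the Platonic solids; the torus is taken as a 3x3 grid of squares with opposite sides glued together. This is a sketch to accompany the abstract, not part of the talk.

```python
def euler_characteristic(vertices, edges, faces):
    """V - E + F for a surface cut up into vertices, edges and faces."""
    return vertices - edges + faces


def orientable_genus(chi):
    """Genus g of a closed orientable surface, from chi = 2 - 2g."""
    if chi > 2 or chi % 2 != 0:
        raise ValueError("not the Euler characteristic of a closed orientable surface")
    return (2 - chi) // 2


# (V, E, F) for the five Platonic solids, all of which are topologically spheres.
polyhedra = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}

# A torus built from a 3x3 grid of squares with opposite sides identified.
torus_grid = (9, 18, 9)

for name, counts in polyhedra.items():
    print(name, euler_characteristic(*counts))
print("torus", euler_characteristic(*torus_grid))
```

Every polyhedron in the table gives the same value because they are all spheres in disguise; the torus gives a different one, and the classification theorem says that, together with orientability, this number pins down a closed surface completely.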
pozorvlak: (Default)
Sunday, September 9th, 2012 01:12 pm
Remember how a few years ago PCs were advertised with the number of MHz or GHz their processors ran at prominently featured? And how the numbers were constantly going up? You may have noticed that the numbers don't go up much any more, but now computers are advertised as "dual-core" or "quad-core". The reason that changed is power consumption. Double the clock speed of a chip, and you more than double its power consumption: with the Pentium 4 chip, Intel hit a clock speed ceiling as their processors started to generate more heat than could be removed.
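
The "more than double" claim follows from the usual first-order textbook model of dynamic power in CMOS logic, which I'll sketch here. It ignores leakage current and other effects, so treat it as a rough guide rather than a law. Here \alpha is the fraction of transistors switching each cycle, C the switched capacitance, V the supply voltage and f the clock frequency.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V^{2} f
% Higher clock speeds generally need a higher supply voltage.
% If, very roughly, V has to scale in proportion to f, then
P_{\text{dyn}} \propto f^{3}
% so doubling f multiplies dynamic power by up to 2^{3} = 8.
```

At fixed voltage the power merely doubles with the clock speed; it's the accompanying voltage increase that makes things so much worse.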

But Moore's Law continues in operation: the number of transistors that can be placed on a given area of silicon has continued to double every eighteen months, as it has done for decades now. So how can chip makers make use of the extra capacity? The answer is multicore: placing several "cores" (whole, independent processing units) onto the same piece of silicon. Your chip can still do twice as much work as the one from eighteen months ago, but only if you split that work up into independent tasks.

This presents the software industry with a problem. We've been conditioned over the last fifty years to think that the same program will run faster if you put it on newer hardware. That's not true any more. Computer programs are basically recipes for use by particularly literal-minded and stupid cooks; imagine explaining how to cook a complex meal over the phone to someone who has to be told everything. If you're lucky, they'll have the wit to say "Er, the pan's on fire: that's bad, right?". Now let's make the task harder: you're on the phone to a room full of such clueless cooks, and your job is to get them to cooperate in the production of a complex dinner due to start in under an hour, without getting in each other's way. Sounds like a farce in the making? That's basically why multicore programming is hard.

But wait, it gets worse! The most interesting settings for computation these days are mobile devices and data centres, and these are both power-sensitive environments; mobile devices because of limited battery capacity, and data centres because more power consumption costs serious money on its own and increases your need for cooling systems which also cost serious money. If you think your electricity bill's bad, you should see Google's. Hence, one of the major themes in computer science research these days is "you know all that stuff you spent forty years speeding up? Could you please do that again, only now optimise for energy usage instead?". On the hardware side, one of the prominent ideas is heterogeneous multicore: make lots of different cores, each specialised for certain tasks (a common example is the Graphics Processing Units optimised for the highly-parallel calculations involved in 3D rendering), stick them all on the same die, farm the work out to whichever core is best suited to it, and power down the ones you're not using. To a hardware person, this sounds like a brilliant idea. To a software person, this sounds like a nightmare: now imagine that our Hell's Kitchen is full of different people with different skills, possibly speaking different languages, and you have to assign each task to the person best suited to carrying it out.

The upshot is that heterogeneous multicore programming, while currently a niche field occupied mainly by games programmers and scientists running large-scale simulations, is likely to get a lot more prominent over the coming decades. And hence another of the big themes in computer science research is "how can we make multicore programming, and particularly heterogeneous multicore programming, easier?" There are two aspects to this problem: what's the best way of writing new code, and what's the best way of porting old code (which may embody complex and poorly-documented requirements) to take advantage of multicore systems? Some of the approaches being considered are pretty Year Zero - the functional programming movement, for instance, wants us to write new code in a tightly-constrained way that is more amenable to automated mathematical analysis. Others are more conservative: for instance, my colleague Dan Powell is working on a system that observes how existing programs execute at runtime, identifies sections of code that don't interfere with each other, and speculatively executes them in parallel, rolling back to a known-good point if it turns out that they do interfere.

This brings us to the forthcoming Coursera online course in Heterogeneous Parallel Programming, which teaches you how to use the existing industry-standard tools for programming heterogeneous multicore systems. As I mentioned earlier, these are currently niche tools, requiring a lot of low-level knowledge about how the system works. But if I want to contribute to projects relating to this problem (and my research group has a lot of such projects) it's knowledge that I'll need. Plus, it sounds kinda fun.

Anyone else interested?
pozorvlak: (polar bear)
Wednesday, June 13th, 2012 06:01 pm
1. Start tracking my weight and calorie intake again, and get my weight back down to a level where I'm comfortable. I've been very slack on the actual calorie-tracking, but I have lost nearly a stone, and at the moment I'm bobbing along between 11st and about 11st 4lb. It would be nice to be below 11st, but I find I'm actually pretty comfortable at this weight as long as I'm doing enough exercise. So, I count that as a success.

2. Start making (and testing!) regular backups of my data. I'm now backing up my tweets with TweetBackup.com, but other than that I've made no progress on this front. Possibly my real failure was in not making all my NYRs SMART, so they'd all be pass/fail; as it is, I'm going to declare this one not yet successful.

3. Get my Gmail account down to Inbox Zero and keep it there. This one's a resounding success. Took me about a month and a half, IIRC. Next up: Browser Tab Zero.

4. Do some more Stanford online courses. There was a long period at the beginning of the year where they weren't running and we wondered if the Stanford administrators had stepped in and quietly deep-sixed the project, but then they suddenly started up again in March or so. Since then I've done Design and Analysis of Algorithms, which was brilliant; Software Engineering for Software as a Service, which I dropped out of 2/3 of the way through but somehow had amassed enough points to pass anyway; and I'm currently doing Compilers (hard but brilliant) and Human-Computer Interaction, which is way outside my comfort zone and on which I'm struggling. Fundamentals of Pharmacology starts up in a couple of weeks, and Cryptography starts sooner than that, but I don't think I'll be able to do Cryptography before Compilers finishes. Maybe next time they offer it. Anyway, I think this counts as a success.

5. Enter and complete the Meadows Half-Marathon. This was a definite success: I completed the 19.7km course in 1 hour and 37 minutes, and raised over £500 for the Against Malaria Foundation.

6. Enter (and, ideally, complete...) the Lowe Alpine Mountain Marathon. This was last weekend; my partner and I entered the C category. Our course covered 41km, gained 2650m of height, and mostly consisted of bog, large tufts of grass, steep traverses, or all three at once; we completed it in 12 hours and 33 minutes over two days and came 34th out of a hundred or so competitors. I was hoping for a faster time, but I think that's not too bad for a first attempt. Being rained on for the last two hours was no fun at all, but the worst bit was definitely the goddamn midges, which were worse than either of us had ever seen before. The itching's now just about subsided, and we're thinking of entering another one at a less midgey time of year: possibly the Original Mountain Marathon in October or the Highlander Mountain Marathon next April. Apparently the latter has a ceilidh at the mid-camp, presumably in case anyone's feeling too energetic. Anyway, this one's a success.

5/6 - I'm quite pleased with that. And I'm going to add another one (a mid-year resolution, if you will): I notice that my Munro-count currently stands at 136/284 (thanks to an excellent training weekend hiking and rock climbing on Beinn a' Bhuird); I hereby vow to have climbed half the Munros in Scotland by the end of the year. Six more to go; should be doable.
pozorvlak: (Default)
Sunday, May 6th, 2012 11:34 pm
Yesterday, hacker-turned-Tantric-priest-turned-global-resilience-guru Vinay Gupta went on one of his better rants on Twitter. I've Storified it for your pleasure here. The gist was roughly
  1. We don't have enough resources to give everyone a Western lifestyle.
  2. Said lifestyle isn't actually very good at giving us the things which really make us happy.
  3. We do, on the other hand, have the resources to throw a truly massive party and invite everyone in the world. Drugs - especially psychedelics - require very little to produce, and sex is basically free.
My favourite tweet of the stream was "Hello, I'm the Government Minister for Dancing, Getting High and Fucking. We're going to be extending opening hours and improving quality."

It strikes me that this is a fun thought experiment. Imagine: the Party Party has just swept to power on a platform of gettin' down and boogying. You have been put in charge of the newly-created Department of Dancing, Getting High and Fucking (hereinafter DDGHF)¹. Your remit is to ensure that people who want to dance, get high and/or have sex can do so as safely as possible and with minimal impact on others. What do you do, hotshot? What policies do you implement? What targets do you set? How do you measure your department's effectiveness? How do you recruit and train new DDGHF staff, and what kind of organisational culture do you try to create?

Use more than one sheet of paper if you need.

You have a reasonable amount of freedom here: in particular, I'm not going to require that you immediately legalise all drugs. You might even want to ban some that are currently legal, though if so, please explain why your version of Prohibition won't be a disaster like all the others. However, I think we can take it as read that the Party Party's manifesto commits to at least scaling back the War on Drugs.

Bonus points: how does the new broom affect other departments? How do we manage diplomatic relations with states that are less hedonically inclined? What are the Party Party's policies on poverty, the economy, defence and climate change?

I guess I should give my answer )

Edit: LJ seems to silently fail to post comments that are above a certain length, which is very irritating of it. Sorry about that! If your answer is too long, perhaps you could post it on your own blog and post a link to it here? Or split it up into multiple comments, of course.

¹ Only one Cabinet post for all three? I hear you ask. That's joined-up government for you. Feel free to create as many junior ministers as you think are merited.
pozorvlak: (kittin)
Saturday, April 7th, 2012 12:20 am
I've been doing some work with Wordpress off and on for the last couple of weeks - migrating a site that uses a custom CMS onto a Wordpress installation - and a couple of times I've run into the following vexing problem when setting up a local Wordpress installation for testing. I couldn't find anything about it on the web, and it took me several hours to debug, so here's a writeup in case someone else has the same problem.

Steps to reproduce: install Wordpress 3.0.5 (as provided by Ubuntu). Using the command-line mysql client, load in a database dump from a Wordpress 3.3.1 site. Visit http://localhost/wordpress (or wherever you've got it installed).

Symptoms: instead of your deathless prose, you see an entirely blank browser window. HTTP headers are sent correctly, but no page content is produced. However, http://localhost/wordpress/wp-admin is displayed correctly, and all your content is in the database.

What's actually going on: Wordpress has decided that the TwentyTen theme is broken, so it's reverting to the default theme. It is hence looking for a theme called "Wordpress Default". But the default theme is actually just called "Default". So it doesn't find a theme, and, since display is handled by the theme files, nothing gets displayed.

How to fix it: go into the admin interface, and select Appearance->Themes. Change the theme to "Default". Your blog is now visible again!

If you wish, you can now change the theme back to TwentyTen: it turns out that it's not actually broken at all.
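
The failure described above boils down to a name lookup that misses. Here is a toy model of it in Python; it is not WordPress's actual code, and the function and variable names are made up for illustration. It only assumes what the post states: the fallback looks for "Wordpress Default", the installed theme is called "Default", and no theme means no output.

```python
# Toy model of the lookup described above; NOT WordPress's real implementation.
installed_themes = {"Default", "TwentyTen"}  # themes present on disk (assumed)
fallback_name = "Wordpress Default"          # name the post says 3.0.5 looks for


def render_front_page(theme_name, themes):
    """Return page content, or an empty string if the theme cannot be found."""
    if theme_name not in themes:
        # Display is handled by the theme files, so a missing theme
        # produces headers but no body: the blank browser window.
        return ""
    return f"<html>rendered with {theme_name}</html>"


blank = render_front_page(fallback_name, installed_themes)
fixed = render_front_page("Default", installed_themes)
```

Selecting "Default" by hand in the admin interface corresponds to the second call: the name now matches an installed theme, so content appears.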

Thanks to Konstantin Kovshenin for suggesting I set WP_DEBUG to true in wp-config.php. This allowed me to eventually track down the problem (though, annoyingly, the "theme not found" error was only displayed on the admin page, so I didn't see it for a while).

Next question: this is clearly a bug, but it's a bug in a superseded version. Where should I report it?

Edit: on further thought, I think this may be more to do with the site whose dump I was loading in using a theme that I don't have installed. In which case, the bug may well affect the latest version of Wordpress. But I haven't yet proved this to my satisfaction.
pozorvlak: (polar bear)
Thursday, March 1st, 2012 11:01 pm
You may recall that one of my New Year's Resolutions was to enter and complete the half-marathon event at the Meadows Marathon. Well, I entered it! Now I just have to actually run the thing. This Sunday, to be precise. I've pounded enough cold pavements over the last three months that I'm fairly confident of finishing, though I've no idea whether I'll finish within my target of two hours.

In keeping with the spirit of the event, I'm trying to raise money for the Against Malaria Foundation, who are one of GiveWell.org's two top-rated charities in terms of misery alleviated per dollar donated. It is, in other words, a very good cause. Please sponsor me!

Edit: I completed the race in 1 hour and 37 minutes, despite being hailed on for the last 1.5 laps. Better than that, though, was my friends' generosity: together, they donated over £400 to the Against Malaria Foundation.
pozorvlak: (Default)
Saturday, February 11th, 2012 03:43 pm
It was often uncomfortable, often painful, particularly for the first month, but other days were pure joy, a revelling in the sensation of movement, of strength and wellbeing. My regular headaches stopped. For the first time ever, I got through winter without even a cold. I felt incredibly well, began to walk and hold myself differently. When friends asked "How are you?", instead of the normal Scottish "Oh, not too bad," I'd find myself saying "Extremely well!"

How obnoxious.

On other days training was pure slog, the body protesting and the will feeble. The mind could see little point in getting up before breakfast to run on a cold, dark morning, and none at all in continuing when it began to hurt. Take a break, why not have a breather, why not run for home now?

It is at times like that that the real work is done. It's easy to keep going when you feel strong and good. Anyone can do that. But at altitude it is going to feel horrible most of the time - and that's what you're really training for. So keep on running, through the pain and the reluctance. Do you really expect to get through this Expedition - this relationship, this book, this life for that matter - without some of the old blood, sweat and tears? No chance. That's part of the point of it all. So keep on running...

The real purpose of training is not so much hardening the body as toughening the will. Enthusiasm may get you started, bodily strength may keep you going for a long time, but only the will makes you persist when those have faded. And stubborn pride. Pride and the will, with its overtones of fascism and suppression, have long been suspect qualities - the latter so much so that I'd doubted its existence. But it does exist, I could feel it gathering and bunching inside me as the months passed. There were times when it alone got me up and running, or kept me from whinging and retreating off a Scottish route. The will is the secret motor that keeps driving when the heart and the mind have had enough.

[From Summit Fever.]
pozorvlak: (kittin)
Monday, February 6th, 2012 11:48 pm

The Ball Game

This game is a particular favourite of Josie, but Haggis finds it boring. The kitten takes a ball (usually one made of silver foil), and bats it around with its paws and carries it around in its mouth. The game has a very complicated scoring scheme which we haven't worked out yet, and ends when the kitten gets bored.

This game has a variant called Fetch, recently independently reinvented by Josie. We are, as you can imagine, unreasonably proud of her.

The "It's My $object, Get Your Own" Game

This is a game for two kittens. One claims possession of an object of some sort and then growls threateningly at the other whenever they come near. This is often combined with...

The Sponge-Hunting Game

A favourite of Haggis. The normal habitats of the Common Sponge are the kitchen sink and the cupboard under same. Haggis waits patiently for the appearance of an unguarded Sponge, and then grabs it, drags it all over the flat, and finally disembowels it on the living-room carpet, all while growling deeply. Finally he garlands his tail with pieces of the Sponge's entrails, as befits a mighty hunter such as himself.

The Red Dot Game

The Red Dot Game starts with the bipeds crashing around the flat, uttering ritual cries of "where the bloody hell have you hidden the laser pointer?" This stage is very important for Building Anticipation.

Once the initial phase is over, the Mysterious Red Dot appears! The cats then chase the Mysterious Red Dot around the flat. This used to be a very high-energy affair, with the kittens leaping up walls and chasing round and round (and round and round and round) the living-room carpet to get the MRD; it has now evolved into a more strategic game, with the cats sneaking up on the MRD using all available cover before suddenly pouncing on it.

The Wall Game

A variant on the Red Dot Game, to be played during the summer months. The MRD is replaced by the reflection of my smartphone on the bedroom wall as I attempt to check my email before getting out of bed. Josie in particular can attain impressive heights while leaping to grab it.

The Fly-Catching Game

When a fly is sighted, the hunt is on! Points are awarded for catching and eating the fly, but also for knocking over ornaments while chasing after it. Anything belonging to the landlady scores double.

The Climbing Game

As befits natives of Skye, the kittens love to climb things¹. Haggis is undoubtedly the stronger climber, having made the dramatic first ascent of Bookshelf Route (K6c) in the living room. Josie's no slouch, though, with the FA of the closet testpiece Warm Jumper Shelf (K6b+) to her credit. Both kittens eschew the standard training paraphernalia of campus- and finger-boards in favour of a 3m tall cat tree covered in sisal. They also exclusively climb solo and barefoot; not for them the ethical grey areas of headpointing or piton use!

Haggis is a promising drytooler, too, having completed the bold Wormwood Pearl's Leg Route.

Haggis's current project is the futuristic Boiler Roof Continuation in the kitchen; the line follows the standard Fridge Route onto the summit of the Crockery Cupboard, then continues it via a massive dyno over the kitchen sink onto the top of the boiler. From there a tricky and exposed dyno should lead to the long-awaited FA of the Catfood Shelf. His previous attempts to reach the Catfood Shelf via the Ornament Shelf below have always failed at the crux roof move from the Ornament Shelf onto the Catfood Shelf; the line is rarely in condition, depending as it does on the seasonal drift of the Kitchen Table.

When Haggis finally completes his project (no doubt celebrating with his trademark Sending Yowl), you can be sure it will be extensively covered in the climbing press.

Haggis chillaxes on his portaledge.

The Tummy-Tickling Game

This is a game for one kitten and one biped. The kitten lies on its back, as if to say "Look! I have a tummy!" The biped must say "Yes! You have a tummy!" and then start tickling it.

The kitten may eventually get bored of this and walk away. Possibly.

The False Tummy-Tickling Game

This is a game for one kitten (who we may without loss of generality call "Haggis") and one biped. Haggis lies on his back, as if to say "Look! I have a tummy!". The biped, assuming that Haggis is playing the Tummy-Tickling Game, will start to tickle it. Whereupon Haggis says "And I also have TEETH AND CLAWS!!!!" and starts using them on the unsuspecting biped's hand.

Thick gloves are advisable if you want to play this game for any length of time. Alternatively, it is possible to distract Haggis with a chewable watch-strap.

Watching KTTV

The cats love to watch KTTV (the view out of the kitchen window). They also love its affiliate station KTTV-2 (the view out of the bedroom window), which shows nature documentaries. They particularly like documentaries which involve BIRDS. Their favourite is to watch KTTV in HD, by climbing onto the windowframe when the window is open. This leads us to a new game which Haggis has recently invented:

The Scare The Crap Out Of The Bipeds By Climbing Out Onto The Windowsill Game


¹ If you recognised the title of this post as a nod to Lito Tejeda-Flores' classic essay The Games Climbers Play, you're absolutely right.

pozorvlak: (Default)
Monday, February 6th, 2012 10:17 pm

"Would you like some money towards another Glenmore Lodge course for Christmas?" said my Dad, some time in December. I thought about last year's course for about half a second and said "Yes please!". This time I signed up for the five-day winter lead climbing course, and had five fantastic days climbing: Wednesday in particular was one of the best days I've ever had in the mountains.

Below are some of the things I learned. Usual rules apply: I am not a qualified mountain guide, and these notes may contain errors. Use your own judgement.

Read more... )

Fiacaill Buttress, taken after we climbed Jacob's Right Edge on Wednesday.

More and larger photos here.

pozorvlak: (Default)
Wednesday, January 18th, 2012 11:44 am
As you've probably noticed, many prominent websites (like Wikipedia, Reddit, BoingBoing, XKCD...) have gone dark today in protest against the Stop Online Piracy Act and its Senate cousin, the Protect IP Act. If either of these passes, it will seriously impact the ability of any US website (defined in very broad, not to say insane, ways) to host user-submitted content, permanently screwing up the democratic Internet that we've come to know and mostly love.

[Incidentally, note how I didn't have any links in the above paragraph? That was deliberate. No point, today. Or afterwards, if SOPA/PIPA passes.]

Wikipedia's blackout is particularly clever, since it's just a shallow CSS hack - determined and knowledgeable people can still get the information. Just as determined and knowledgeable people will still be able to access copyright-infringing content after SOPA/PIPA.

After I submit this and post the customary Twitter announcement, I'll be going dark in solidarity with these sites. No LiveJournal, Facebook or Twitter updates; no GitHub pushes. No scab I. Your usual service of cat pictures and programming rants will return after 8pm EST. Meanwhile, please read the following:

A Technical Examination of SOPA and PIPA by Jason Harvey, one of the Reddit admins (note that this is a fairly detailed look at the bills, but doesn't require much technical knowledge of Internet architecture)
SOPA: Why Do We Have To Break The DNS? (more technical)

and, if you're American, please consider contacting your elected representatives and asking them to stop these awful bills.
pozorvlak: (pozorvlak)
Tuesday, January 3rd, 2012 12:58 am
I don't normally make New Year's resolutions, but what the hell.

1. Start tracking my weight and calorie intake again, and get my weight back down to a level where I'm comfortable. This morning it was 12st 1.9 - not terribly high in the scheme of things, but it's almost as high as it was when I first started dieting (though I think a bit more of it may be muscle now) and it's definitely high enough to negatively impact my sense of well-being.

What went wrong? Well, I'm gonna quote from Hyperbole and a Half: "trying to use willpower to overcome the apathetic sort of sadness that accompanies depression is like a person with no arms trying to punch themselves until their hands grow back. A fundamental component of the plan is missing and it isn't going to work." A scheme for weight loss that depends on willpower is similarly doomed if you're too depressed to stick to it. So this time I'm going to try to make changes to my eating habits that require less willpower. Any suggestions would be most welcome.

2. Start making (and testing!) regular backups of my data. I lost several years of mountain photographs last year when the external hard drive I was keeping them on died: I don't want that to happen again.

3. Get my Gmail account down to Inbox Zero and keep it there. It's currently at Inbox 1713, most of which is junk, but it's just *easier* to deal with an empty inbox, and not have to re-scan the same old things to look for the interesting new stuff.

I have a few more Ambitious Plans, but they don't really count as resolutions:

1. Do some more Stanford online courses. I'm currently signed up to Human-Computer Interaction, Design and Analysis of Algorithms, Software Engineering for Software as a Service, and Information Theory. Fortunately they don't all run concurrently!

[BTW, they're not all computing courses: wormwood_pearl is signed up to Designing Green Buildings, for instance.]

2. Enter (and complete!) the Meadows Half-Marathon in March. I started training for this back in December, but then I got ill and Christmas happened, so today was my first run for a while and it wasn't much fun. Never mind; I've got time to get back on course.

3. If that goes well, enter (and, ideally, complete...) the Lowe Alpine Mountain Marathon. As I understand things, it's basically two 20km-ish fell runs back-to-back, with a night camping in between. Oh, and you have to carry all your camping kit with you. In the high classes people do the whole thing at a run, but in the lower classes (which I'd be entering) there's apparently a bit more run/walk/run going on. Philipp and I did nearly 40km in one day on the South Glen Shiel ridge in November, and then went for another hike the next day, so I should be able to at least cover the distance. Providing I don't get too badly lost, of course :-)

The only way to progress in anything. The trick, of course, is not biting off enough to cause you damage.
pozorvlak: (Default)
Thursday, December 29th, 2011 12:08 pm
I've been writing some Perl code recently, and a couple of ideas have occurred to me.
  1. Closures are great when you have several behaviours which can all vary independently; objects are great when you have several behaviours that must all vary in sync with one another. Put another way: if your problem is combinatorial explosion of objects, try using closures instead; if your problem is ensuring consistency among different behaviours, try grouping them into objects. Languages like Perl make it relatively easy to do either, so use the best tool for the job. The trick is to spot that you're using the wrong representation before you either create a kajillion Strategy classes or knock together a half-arsed object system out of hashes of closures.
  2. In the correct quantity, housekeeping tasks can be your friends. By "housekeeping tasks" I mean tasks that don't directly contribute to solving the problem at hand, but which still add some value. An example from yesterday was converting a small class hierarchy to use the Moose object system. Other examples might be writing per-method documentation, cleaning up your version control history, and minor refactorings. If there's too much of this stuff to do, it can get dispiriting - you want to be solving the problem, not mucking about with trivia! But if there's not too much, it can be helpful: you have something to do while you're stuck on the main problem, and the housekeeping work is close enough to the main problem that it stays in your brain's L2 cache, where a background process can work away at it. Without such tasks, if you're so stuck that you literally can't make any progress, your choices are (a) think very very hard and get depressed, or (b) go and do something totally different, in the process forgetting lots of important details. I find both of these to be less productive.
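
To make the first point concrete, here is a minimal sketch of the two representations. It is in Python rather than Perl, and every name in it (`make_reporter`, `JsonCodec`, `CsvLineCodec`, `round_trip`) is invented for illustration; none of it comes from the List::Priority work. Ordering and formatting vary independently, so they are passed as separate closures; encoding and decoding must vary together, so they are grouped into one object.

```python
import json


# Independent behaviours: any ordering can be paired with any formatter,
# so pass them as separate closures instead of writing one class per pairing.
def make_reporter(sort_key, fmt):
    def report(items):
        return [fmt(item) for item in sorted(items, key=sort_key)]
    return report


by_length_shouting = make_reporter(len, str.upper)
alphabetical_quoted = make_reporter(str.lower, lambda s: f'"{s}"')


# Behaviours that must stay in sync: decode has to undo the matching encode,
# so keep each pair together in one object.
class JsonCodec:
    def encode(self, value):
        return json.dumps(value)

    def decode(self, text):
        return json.loads(text)


class CsvLineCodec:
    def encode(self, value):
        return ",".join(value)

    def decode(self, text):
        return text.split(",")


def round_trip(codec, value):
    """Encode then decode with the same codec, so the two halves cannot mismatch."""
    return codec.decode(codec.encode(value))
```

With three orderings and three formatters, the closure version needs six small functions where a class-per-combination design would need nine classes. Conversely, handing `round_trip` a loose `encode` closure and a loose `decode` closure would make it easy to pair the JSON encoder with the CSV decoder by accident.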

[If you're interested, my code is of course on GitHub: some fixes to the CPAN module List::Priority, and some code for benchmarking all the priority queues on CPAN. Any suggestions or patches would be very welcome!]
pozorvlak: (Default)
Wednesday, November 2nd, 2011 05:18 pm
I was pleasantly astonished to see Peter Norvig comment on my recent post about the Stanford online courses - with 160,000 students competing for his attention, that's dedication! I completely take his point that I'm only a small part of the intended audience, and what works best for me is probably not what works best for the students they really want to reach.

Nevertheless, the course has sped up in the last couple of weeks to the point where I'm finding it pleasantly stretching. Today I tweeted:
I take it back: #aiclass is covering the Oxford 1st-year logic course in <1wk. The good bits of the syllabus, anyway :-)
I want to expand on that remark a bit. There were actually three first-year Oxford logic courses:
  • "Introduction to Symbolic Logic", taught in the first term to all Philosophy students (including those like me who were studying Maths and Philosophy), and also a few others like Classics students. This covered propositional logic, first-order predicate logic, proof tableaux, and endless pointless arguments about the precise meaning of the word "the" and whether or not the symbol → accurately captures the meaning of the English word "if". This is the course I was talking about on Twitter. I found it unbearably slow-paced, but I remember a couple of folk who'd given up maths years before and couldn't handle being asked to do algebra again. "The alarm is sounding and Mary called" was fine, but "A & M" was apparently unintelligible to them.

    According to a legend which was told to me by the Warden of New College and is thus of unquestionable veracity, a class of ItSL students were once sent to the Wykeham Professor of Logic's graduate-level logic seminar due to a scheduling error. The WPoL walked in, saw the expected roomful of youngsters, and started "Let L be a language recursively defined over an alphabet X..." The poor undergrads, still in their first week (and quite possibly at their first class) of their undergraduate careers, must have started to entertain grave doubts about their ability to handle this "Oxford" place. When the WPoL was eventually informed of the mistake, he is supposed to have meditatively said "I thought they seemed rather ill-prepared..."

  • "Elements of Deductive Logic", taught in the second term to those studying (Maths|Physics) and Philosophy. This bumped the mathematical content up a notch, covering (for instance) completeness and consistency theorems for the languages that had been introduced in ItSL. There was also a rather handwavy treatment of modal logic, and lots more philosophical wrangling about what it all meant and how relevant it was to the broader philosophical project. This was one of the most intense courses I did in my four years as an undergrad.

  • There was an introduction to symbolic logic buried somewhere in the first-year computer science course - without the philosophy, I'm assuming.
Maths students were expected to pick the basics of logic up in the course of learning real analysis, as is traditional.
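
For anyone who hasn't met the argument about → and "if", the sketch below tabulates the two connectives mentioned in the ItSL bullet. It is a plain Python illustration, not anything from the Oxford course or from #aiclass, and the helper names are my own. The contentious rows are the last two: material implication comes out true whenever the antecedent is false.

```python
from itertools import product


def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def truth_table(formula, n_vars):
    """Evaluate formula on every assignment of True/False to its variables."""
    return [(values, formula(*values))
            for values in product([True, False], repeat=n_vars)]


# "The alarm is sounding and Mary called", i.e. A & M.
conjunction = truth_table(lambda a, m: a and m, 2)
# A -> B, the symbol whose fit with English "if" was argued over.
conditional = truth_table(implies, 2)

for (a, b), result in conditional:
    print(f"A={a!s:5} B={b!s:5} A -> B = {result}")
```

The rows run through the assignments (True, True), (True, False), (False, True), (False, False); the conditional is false only on the second, whereas the conjunction is true only on the first.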

I always thought they'd have been better splitting ItSL and EoDL into two more evenly-sized courses - perhaps one heavily mathematical one, to be shared with the CS students (and perhaps incorporating some digital electronics), followed by a purely philosophical one to be taken by the (Maths|Physics) & Philosophy students. Meanwhile the algebra-phobes could do a single course more tailored to their level. I'm guessing that resource availability ruled this idea out, though.

There was also a two-term second-year maths course called "b1 Foundations", which was set theory, logic (up to, IIRC, the Löwenheim–Skolem theorem and Skolem's paradox) and some computability theory. This was compulsory for Maths & Philosophy students, but I didn't take it because I'd given up philosophy by then and was sick of the whole thing. In light of my subsequent career, this was probably a bad decision.