pozorvlak: (Default)
Saturday, June 6th, 2009 10:33 pm
I've got a problem: I've had an idea I want to write about, but it depends on two or three other ideas I wanted to write about but never got around to. So I'm going to write this post in a top-down, Wirthian fashion, stubbing out those other posts: maybe, if there's enough interest, I'll come back and write them properly and replace the stubs here with links. OK with everyone?

Right, on with the motley.

Stub post no. 1
Extreme Programming (XP), whether intentionally or unintentionally (and my money is on "intentionally, but try getting them to admit it") is really good for getting work out of people who are bright but have short attention spans. This is a Good Thing. It's most obvious in the case of pair programming - imagine saying to your partner "Y'know, this is kinda hard. Let's surf Reddit for a while" - but actually, most XP practices have this benefit. Short feedback cycles, concrete rewards, definite "next moves" (given by failing tests and the "simplest thing that could possibly work" approach) - all of these things have the effect of maintaining flow and reducing the incentive to slack off. It's programming as a highly addictive game. Dynamic languages work well with this approach, because they make it as easy as possible to get something up and running, and to test the things you've written.

Stub post no. 2
Haskell is the opposite. It encourages deep thinking, and everything about the language makes it as hard as possible to get something running unless it's just right. Screw up, and you're not presented with a running program and a failing test that you can run in the debugger; you're presented with an unfriendly compiler message and a bunch of dead code that you can't interrogate in any meaningful way. After a ~~morning~~ ~~hour~~ few minutes of this (usually involving no small loss of hair), the consultant Barbie that lives in my head invariably says "Statically-typed pure functional programming is hard. Let's go shopping!" And I, fed up and mindful of my scalp, agree. This is why I am no good at Haskell.

Stub post no. 3
Everything I read by or about the climber Dave MacLeod (blog) makes me more inspired by him. Partly for his visionary climbs, but mostly for his approach to choosing, training for and tackling really hard problems, which I think should generalise really well, if only I could put my finger on what exactly it is. It helps that he's a really friendly, pleasant guy in person. Check out the BAFTA-winning film Echo Wall that he and his wife made about his preparation for his first ascent of the trad route of the same name. If you're in Edinburgh, you can borrow my DVD, I'm positively eager to lend it out.

Anyway, something Dave wrote about training (which I can't be arsed to find right now) said that in order to train effectively, you have to be constantly pushing yourself in some way: either in terms of power, or stamina, or technique, or fear¹, or whatever. You have to find your comfort zone and then consciously go beyond it, in whichever direction you wish to improve. As you improve, your comfort zone shifts, and you need to keep pushing yourself harder and harder in order to continue to improve. But (and here's the interesting bit), he said that if you do this for long enough, your whole conception of comfort shifts, and you start to feel uncomfortable if you aren't pushing yourself in some way.

So, here's the thought I had. Maybe all the Haskellers have been training themselves Dave MacLeod-stylee, and now only feel comfortable pushing themselves really hard, and that's why they like using such a bloody difficult language.

¹ About a year and a half ago, I was seconding a route called Dives/Better Things (Hard Severe) in Wales, and got to a bit that was a bit too hard and a bit too scary. I floundered around for a bit, getting more and more freaked out, and then said to myself "What would Cale Gibbard do? He'd pause for a bit, think really hard, work out exactly what to do next, and then do that. OK. Do that, then." I have no idea if Cale climbs, but it did the trick. Cale, if you're reading, thanks for that :-)
pozorvlak: (gasmask)
Wednesday, March 11th, 2009 11:45 pm


Please, think of the narwhals.
pozorvlak: (Default)
Saturday, December 20th, 2008 05:34 pm
Whenever I spot myself doing something repeatedly on a computer, I try to automate it. Consequently, I have a directory full of random little scripts, and a long list of aliases in my .bash_profile, all of which arose because I had to do something lots of times. Often, these scripts and aliases call other ones which I wrote earlier, as I spot higher-level patterns in what I'm doing. But every time, I have to fight against the False Laziness that tells me not to bother, I'll only have to do it once or twice more, and it's not worth the effort of doing it right.

So yesterday I added the following line to my .bash_profile:
alias viprof="vim ~/.bash_profile && source ~/.bash_profile"
We'll see how it works out.
pozorvlak: (babylon)
Monday, November 24th, 2008 09:16 pm
Over on qntm.org, the always-entertaining Sam Hughes presents us with an essay on gay marriage: the database engineering perspective. It's a discussion of the real obstacle to legalising gay marriage, namely migrating the schemas of all the government databases so that they can accommodate the idea. Worth a read, perhaps especially if you don't know much about how databases work.

As Sam progressively broadens the allowable definition of marriage, he quickly ends up with the problem of representing an arbitrary undirected graph in a relational database. Recall that a graph is something like a half-filled-in join-the-dots puzzle: more formally, it's a collection of vertices and a collection of edges between pairs of vertices (or if you prefer, some dots, and lines connecting some of the dots), and that a graph is directed if the edges have a direction associated with them, and undirected otherwise. Here, the vertices are people, and the edges are marriages between people. It shouldn't come as too much of a surprise that graphs show up in this problem: graphs arise all over the place in mathematics and computing (another obvious one: the machines on your network, and the cables connecting them). And so Sam runs into one of the classic problems: how do you represent something fundamentally unordered (the endpoints of an edge, here the spouses in a binary marriage) using something that's fundamentally ordered (the memory of a computer)? More concretely, how do you decide who gets put in the spouse1 column of your marriages table and who gets put in spouse2?

Sam's solution is expedient if inelegant: since every person in his database has a unique identifying number (good DB design practice, but still, shudder), he can simply make spouse1 the partner with the lower CitizenNumber. But there's a more common solution which I'd like to talk about. Since it's easy to represent directed edges ("A is married to B"), we represent an undirected edge by two directed edges going in opposite directions ("A is married to B, and B is married to A"). When I first encountered this trick, I thought it was an ugly hack, but it turns out that it actually makes more sense than you might think. )
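To make the two representations concrete, here's a minimal sketch in Haskell (my own illustration, not Sam's schema), treating the marriages table as a list of rows:
type Person = Int   -- a stand-in for CitizenNumber

-- Ordered-pair representation: someone has to go in the spouse1 column.
marriagesOrdered :: [(Person, Person)]
marriagesOrdered = [(17, 42)]         -- say, lower number first

-- Doubled representation: one directed "is married to" edge in each
-- direction, so neither spouse is privileged.
double :: (Person, Person) -> [(Person, Person)]
double (a, b) = [(a, b), (b, a)]

marriagesDoubled :: [(Person, Person)]
marriagesDoubled = double (17, 42)    -- [(17,42),(42,17)]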

TL;DR version: Very often in mathematics, we have a pair of operations, one of which forgets about extra structure on some object and one of which adds that structure in simple-mindedly. We capture this notion using adjunctions. Applying this framework to undirected and directed graphs, we discover that the well-known trick for representing an undirected graph as an invertible directed graph actually behaves more like forgetting structure than adding it back in. Thus undirected graphs "really are" invertible directed graphs.
pozorvlak: (Default)
Friday, June 20th, 2008 12:37 pm
In my thesis, there are a lot of places where I write expressions like
f_1 × ... × f_n
or
(Σ y_1, ... , Σ y_n)
or something similar. The code to generate this is repetitive. It would be nice to abstract out the general pattern of "list of things indexed from 1 to n, combined using some binary operation". But there's a problem: the indexing can occur anywhere in what might be quite a complicated expression. Rather than handing our "one to n" macro an ordinary TeX expression, we need to hand it an expression with placeholders: that is, we need to hand it a macro. Then our one-to-n macro can invoke the macro we handed it with the arguments "1" and "n" to generate the code we want.
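By way of comparison, here's roughly the same pattern in Haskell rather than TeX (a sketch of the idea, not the actual macro from my thesis): the piece that renders each term is handed in as a function, and the combining function only knows about "1", some dots, and "n".
import Data.List (intercalate)

-- oneToN takes the binary operation (as a TeX snippet) and a function that
-- renders the term for a given index, and glues together term_1, dots, term_n.
oneToN :: String -> (String -> String) -> String
oneToN op render = intercalate op [render "1", "\\dots", render "n"]

-- e.g. putStrLn (oneToN " \\times " (\i -> "f_" ++ i))
-- prints:  f_1 \times \dots \times f_n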

This is quite a common thing to want to do in computer programming. It goes under various names, but probably the most common is "callback". More precisely, the callback is the piece of code accepted by the one-to-n macro, which it then invokes. Callbacks are very generally useful - they're the standard way of writing code for graphical user interfaces (if you've ever written an event handler in VB or an ActionListener in Java, you were writing callbacks), but they're also used for XML processing (via SAX) and all sorts of other things. Ruby's iterators and Smalltalk's if-statements take callbacks. If you've ever used the map function in Perl or Lisp or Python or Haskell, you were writing a callback. But can we write callback-based code in TeX?

Well, yes, as it turns out. )
pozorvlak: (babylon)
Monday, June 2nd, 2008 12:09 pm
Random thought: I'm sure people who actually know about this stuff will be able to tell me why it won't work, or, alternatively, cases where it's already been done.

Recall the problem with the naive Haskell mean program
mean xs = sum xs / length xs
which when asked to calculate the mean of [1 .. 1e9] allocates 8GB of RAM and brings the computer to a grinding halt. The problem there isn't really to do with naiveté: a similarly naive Ruby version works fine and runs in constant space, albeit slowly. The problem is garbage collection, or rather the lack of it. The Haskell runtime calculates the list [1.. 1e9] bit-by-bit so it can be summed, and normally would be able to throw every element away when it was done with it; but because the elements of the list will later need to be counted, there's still a (very long) pointer chain from the top-level to any given element of the list. So none of it can be garbage-collected.

In this case, this is a bad strategy: the effects of swapping out that much memory totally eliminate any benefit from caching the list. It would be quicker to throw the list away and calculate it again from scratch. Since Haskell's a purely functional language, we could in principle do this without worrying about any side-effects of the calculation. We could set up our garbage collector to become more aggressive as memory gets tight: when the program starts to use swap, we could throw away any pure data that's more than N indirections from the toplevel, where N is varied dynamically to maximise performance. But what if our program is, say, mean $ take 1000000000 mersennePrimes? Then it will take much longer to re-calculate the values than it would to swap them in and out of core. You'd need to keep track of how long each function call takes, and how long each bit of data took to calculate, and estimate re-calculation times when throwing data away.

But this is exactly the kind of profiling data that you need to track for the smart optimizing VMs that Steve Yegge talks about here. So, it seems that there's a synergy here. Does anyone know of a Sufficiently Smart VM that does this kind of aggressive garbage collection? Has it been tried before? Is there some obvious problem with the idea that I'm not seeing?
pozorvlak: (polar bear)
Thursday, May 29th, 2008 10:50 am
This post is a mea culpa, an admission of failure: if any of you have read my stuff thinking that I know what I'm talking about when it comes to programming, this is your chance to recalibrate. I'm posting in the hopes that by writing this particular ultra-basic technique down in public, I'll never ever forget about it )

[The full, working version of PS3 Remuxatron (by Mat Brown, GPLed) is here.]
pozorvlak: (Default)
Friday, May 16th, 2008 12:12 pm
There was a recent mailing list post by Andrew Coppin (which received some discussion on programming.reddit) in which he bemoans the way that GHC, when presented with reasonable-looking code, can suddenly and mysteriously decide that it needs to allocate gigabytes of memory, slowing your system to a crawl for ages (I've had similar problems myself). Andrew's example was the following:

I offer up the following example:
mean xs = sum xs / length xs
Now try, say, mean [1.. 1e9], and watch GHC eat several GB of RAM. (!!)
This is not so mysterious when you remember that Haskell uses linked lists as its native list structure, and so the only way to calculate length xs is to traverse the list a second time: the runtime system "helpfully" keeps around the copy of xs that it generated while calculating sum xs, and you have a space leak. But what to do about it?
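For the impatient: the standard fix (a sketch of the usual approach, not necessarily what's behind the cuts below) is to make a single pass, accumulating the running sum and count strictly, so that nothing holds on to the list:
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- One traversal with strict accumulators: each cons cell can be
-- garbage-collected as soon as it's been consumed, so this runs in
-- constant space.
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (!acc, !len) x = (acc + x, len + 1)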

The story so far )

Doing it in Perl )

Doing it in Ruby )

Take-home messages )
pozorvlak: (sceince)
Wednesday, May 14th, 2008 01:15 pm
Those of you who do scientific computation may be interested in this link:

Bye Matlab, hello Python, thanks Sage.

Short version: Sage is a bundle of all of the various numeric/scientific/graphing tools for Python, made easy to download and install (even if you don't have root access, apparently). According to the blogger linked above, it's now good enough to serve as a replacement for Matlab (and it has ambitions to be a replacement for Maple and Mathematica too, though I don't know how far along it is). It integrates with R, Gap, etc. And because it's Python, you get a real, high-level, well-designed programming language with a proper environment to do your coding in.

In other news, [livejournal.com profile] wormwood_pearl finished her Finals yesterday. As you can imagine, we're both pretty relieved. There's no tradition of meeting people outside Finals here like there is in Oxford, but I went along with a bottle of champagne anyway - and only then remembered that public drinking is against Glasgow bylaws. Drat it. We went out to lunch, the bottle went into the Department fridge, and we subsequently went round to her sisters' and drank it there while watching Layer Cake. Because we're that exciting.
pozorvlak: (polar bear)
Thursday, May 8th, 2008 02:20 pm
I know I'm very late to the LOL$thing party, but what the hell:

Cut for size )

Now, I reckon about half of you will recognise the guy in the pictures, and about half will understand the captions. I'm interested to know how large the intersection is...
pozorvlak: (sceince)
Tuesday, April 29th, 2008 04:35 pm
=pod

... do not be alarmed.

What happened was that I was talking on Reddit with Masklinn about Literate Programming, and s/he wanted to know if there were any systems other than Literate Haskell which allowed for blog posts which are valid code. "Hrmmm," I thought, "I don't see any obstacle to doing that with POD... )

Right, back to work.
pozorvlak: (Default)
Monday, April 28th, 2008 12:18 am
I wrote this in an email to a colleague of [livejournal.com profile] totherme's a while back, and have occasionally felt the need to refer to it since. Fortunately, I have a place where I can post my ill-informed ramblings for all the world to see, so here it is :-) It's an attempt to convince a computer scientist, who I believe comes from the Haskell/static typing camp, why Perl is worth learning.

[If you're not a computer scientist, the two chief reasons to learn Perl are (1) it's incredibly useful, (2) it's great fun. But I digress.]

In many ways, Perl occupies the opposite corner of design-space to Haskell. To a first approximation, Haskell proceeds from the question "what if everything we think we know about language design is right?", whereas Perl proceeds from the question "what if everything we think we know about language design is wrong?" )
pozorvlak: (Default)
Sunday, March 30th, 2008 08:40 pm
For the past couple of years (longer, actually), I've been thinking a lot about this business of static and dynamic typing in programming languages. Here's my current position on the matter.

Static and dynamic typing encourage and reward quite remarkably different approaches to the creation of software: everything from the nitty-gritty, (sit and think?)-edit-(compile?)-run-(test?)-(debug?)-(repeat?) cycle to the design (at all levels) of the software to be created, taking in every aspect of the tools and process used to support this design and implementation. Also, the choice of static or dynamic typing is best understood as part of a larger attitude towards development. Unfortunately, the use of a dynamic mindset with a static language not only means that you can't take advantage of the tools available (or even see why they'd be useful), it actively hinders success. The same is true for approaching a dynamic language with a static mindset.

I would therefore like to propose Pozorvlak's Conjectures:
  1. If you find that a modern dynamic type system causes more problems than it solves, you're probably doing it wrong.
  2. If you find that a modern static type system causes more problems than it solves, you're probably doing it wrong.
For instance, I find Haskell's type system causes me many more problems than it solves. Discussions with more expert Haskell users suggest that I am indeed doing it wrong. This guy, on the other hand, finds that "the dynamic languages stick out like sore thumbs because of the whole class of dynamic run time errors that show up that don't even appear in the statically typed languages". If you find this, I claim that it's because your code is not polymorphic enough1. You can get away with having code which only accepts data of very restricted types when the compiler does all the checks for you statically. When it doesn't, you're better off writing code that behaves well for a wider class of inputs, and dynamic languages give you much more facility to do this. Once you learn to code like this, your problems largely go away. As a bonus, your code becomes more flexible, and whole classes of techniques become possible that would have been literally unthinkable before. Similar claims, it appears, can be made for static languages: at the higher levels of mastery, one can use a static type system as a theorem-prover to establish some quite nontrivial properties of your program automatically and across the entire codebase. This allows for some excitingly different approaches to program development.

A corollary is that if you ever find yourself saying that your port of Feature X to Language Y is better than the original Feature X solely because it's (statically|dynamically) typed and the original Feature X was the other one, you should probably save your breath. It will also be worth your while to go back and determine what advantages the opposite choice of typing regimen gave to Feature X's users.

1 A less interesting, but probably valid conjecture is that you're also not testing enough, or at least testing the wrong things. But this can't be the only answer. Dynamic programmers, in general, are not idiots; they are usually also Lazy, in the good sense. They're smart enough to work out that writing the equivalent isa_ok() test every time they would have written a type declaration in Java or whatever is no time-saver at all. Hence, they must need less type information overall for their code to be correct.
pozorvlak: (gasmask)
Friday, March 28th, 2008 03:31 pm
As I may have mentioned once or twice before, I'm writing up my thesis. This is an intensely slow and depressing process, and the temptation to slack off is overwhelming. Being a geek, I decided to write some code to help me. Writing code rather than writing thesis is still slacking off, obviously, but it's slacking off to a useful end. Writing blog posts about writing code that's meant to help me write my thesis is another matter...

Read more... )
pozorvlak: (Default)
Monday, March 10th, 2008 08:47 pm
A guy on Reddit pointed me at this article today. It's the Wikipedia biography of the Welsh computer scientist Donald Davies. Like my father, he was from the Rhondda Valley; he was the co-inventor of packet-switched networks (like this big one you're using right now); and he once found a bug in Alan Turing's code, before the first computer had been built :-)

I think he may be my new hero.

In other news:
  • I've just finished Portal. It was great, but far too short :-( If you haven't played it yet, then you should go and buy/download a copy right now, and cover your ears and sing "La la la la la..." until it's loaded to avoid being spoilered like I was. The puzzles are still a lot of fun, but the plot and jokes would have been a lot better if I hadn't half-known they were coming.
    [And I didn't find "The Cake is a Lie!", even though I was looking out for it...]
  • Spent a lot of time sitting at the computer today, but didn't get any thesis done. Bah.
  • After a similarly unproductive morning yesterday, I went to the climbing wall, and had quite a good session: I did two 6as (which is good, for me), led a widely-agreed-to-be-undergraded 5+, and finally knocked off a 5+ with a big scary overhang that had been tormenting me for months. Yay!
  • This may be the most awe-inspiring programming war story I've ever read.
  • Letter from Linacre: I didn't get the job. Never mind.
  • The winter mountaineering course I was booked on next weekend has been cancelled due to lack of interest. Just when the snow's started up again! I'm much more annoyed about that than about the rest, to be honest.

Ah well, tomorrow will be better. Anyway, have a look at this video, which is the trailer for one of the films I saw at the mountain film festival on Saturday:



That sea arch he's climbing up the underside of, by the way, is provisionally graded as a French 9b :-)
pozorvlak: (Default)
Thursday, March 6th, 2008 12:16 pm
There's a meme out there that Java is the new COBOL. The rationale is summarised nicely in the first couple of paragraphs of this Wiki page, before they ramble off into a rant about how Smalltalk should have taken over the world:
Is Java the new Cobol? A good argument can be made that yes, it is. Considering the billions of lines of Cobol code that have been (or still are) in production; that observation should in no way be considered a slur. Java has succeeded in the marketplace (partially) because it is a language of its time; not one behind the times - or ahead of them.

Java is the new COBOL because it's becoming the new de facto standard for enterprise applications. Like COBOL, it's a good language for its time, but there are better ones. Like COBOL, it's bloated with features. Like COBOL, it's being used beyond its original view.
This comparison is appealing, but false. As I've been saying for years, Java is the new Ada.

Consider:
~~Java~~ Ada is a compiled, structured, statically typed, imperative, and object-oriented high-level computer programming language. It was originally designed by a small team led by ~~James Gosling~~ Jean Ichbiah. ~~Java~~ Ada emphasises "programming in the large", portability, multiprogramming using ~~threads~~ the rendezvous mechanism, and, in particular, the development of reliable software, which is aided by a combination of strong static typing and runtime checks inserted by the compiler. The restrictions of static typing are partially alleviated by the ability to create "generic" types and subroutines. ~~Java~~ Ada is particularly intended for development of software for embedded systems and hard real-time environments. The familiar ~~C++~~ Pascal-like syntax aids new learners, and a large standard library provides much common functionality as standard.
It's instructive to ask why Java took off when Ada largely didn't. There's no single answer, but I think the following factors explain a lot:
  • More programmers were familiar with C syntax than Pascal syntax (or associated Pascal syntax with limited Pascal environments they'd used at school).
  • Java came from Sun, a company with programmer cred, and had the names of famous and respected hackers (James Gosling, Guy Steele) attached to it. Influence leaders in the hacker community took it seriously. Ada came from the DoD and Honeywell, and some guy that nobody'd ever heard of.
  • Sun marketed the hell out of Java, and did so to the right people.
  • Early Ada compilers were very poor, and by the time good compilers were available, the language had acquired a bad reputation.
  • Mostly, though, I think it was timing. Ada was developed just before OOP went mainstream, so missed out on that bandwagon. By the time Java came along, OOP was well-established, and the masses were ready for a language in which OOP was built-in rather than bolted on. They were also familiar with the horrible complexity of C++, and were ready for something like C++ but less painful. "Reliable software" was a less effective rallying call back in 1980, unless you were developing control systems for aircraft (Ada's original niche). Ada was ahead of its time in many ways; Java, as the Wiki poster said above, was very much of its time.
There are some other suggestions here, from someone who appears to have been around at the time and is thus more reliable than me.

The new COBOL is, of course, Visual Basic.
pozorvlak: (gasmask)
Thursday, March 6th, 2008 11:25 am
A comparison of the Haskell argmunger with its Arc equivalent (or even its Perl equivalent) should make something clear. It's been claimed that Haskell doesn't need macros, because most of the things that Lispers need macros for can be done in Haskell using lazy evaluation and so on. But the argmunger example makes it clear that there are things for which Lispers don't need macros and Haskellers do - if not Template Haskell macros, then SYB or DrIFT or some other macro-like facility.

Lispers often say that coding in other languages feels like manually generating macroexpansions. I'm nobody's idea of a Lisp hacker, but I often feel like this when I'm writing Haskell. Which is why I'm learning about Template Haskell even though I'm not yet au fait with things like do-notation and monad transformers, which most Haskell experts would probably consider more basic. Would you choose before or after you'd walked to Moscow to staunch the blood from your severed arm?
pozorvlak: (kittin)
Wednesday, March 5th, 2008 11:28 pm
By the way, here's what an argmunger looks like in Arc:
(def argmunge (munger victim) (fn args (apply victim (map args munger))))
It's used like so:
arc> ((argmunge '(1 0) cons) 'x 'y)
(y . x)
arc> ((argmunge '(2 3 1 1 0 3 2) list) 'a 'b 'c 'd 'e 'f 'g)
(c d b b a d c)
That took a couple of minutes to write, and worked first time. Sw33t! Though I thought I'd got it wrong at first, as there was a bug in my test code :-) Note, by the way, that I'm mapping one list over another, to get the moral equivalent of Perl's list-slicing: this is a spin-off of the way that Arc treats function calls and data-structure indexing as indistinguishable. I'd hoped that I could write an argmunger that could be handed either a list or a function as its munger and Do The Right Thing, but unfortunately I don't think there's any way of determining the domain of an Arc function (the equivalent of the list's length) automatically.
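Incidentally, you can recover a little of this flavour in Haskell if you make all the arguments the same type and pass them as a list - a toy sketch of mine, not from any of the posts above:
-- With homogeneous arguments packed into a list, the munger can be an
-- ordinary list of indices, much as in the Arc version.
argmungeList :: [Int] -> ([a] -> b) -> [a] -> b
argmungeList munger victim args = victim (map (args !!) munger)

-- ghci> argmungeList [1,0] (\[x,y] -> (x,y)) ["x","y"]
-- ("y","x")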

I've changed the method signature so the munger is the first argument: that's because (argmunge m1 (argmunge m2 f)) is equal to (argmunge m1:m2 f). It's usual in operad theory to have the munging functions acting on the right: this is because they usually only consider munging functions which are invertible (ie, permutations), and use the definition (sticking with Lisp notation):
((argmunge' f m) x_(m 0) x_(m 1) ...) = (f x_0 x_1 ...)
So (argmunge' f m) is equal to (argmunge m⁻¹ f). With this definition (which is fine if you only allow permutations), (argmunge' (argmunge' f m1) m2) is equal to (argmunge' f m1:m2), so it makes more sense for the arguments to be that way round. But when you allow your munging function to be noninvertible, you have to use the definition I gave, and then it makes more sense to have the mungers acting on the left.
pozorvlak: (Default)
Wednesday, March 5th, 2008 10:37 am
[The hope was that this post would come in three somewhat independent sections: one pure programming, in which we develop a small utility in Template Haskell; one largely mathematics, at the upper-level high school to beginning undergraduate level, wherein we describe another approach to constructing our utility; and one purely mathematical, more sophisticated but fairly handwavy, wherein I relate all this stuff to my research interests and describe where it leads. The idea was that you could skip the code and jump to the maths, or read the code and skip the maths, or whatever. However, just the first section has now taken far longer than I'd budgeted, both in time and in space, so I'll save the other two for a later post.]

In a recent post, Hitesh Jasani writes about Haskell's flip operator, which takes a function f and returns a function which behaves like f with the order of its first two arguments reversed. So (flip f) x y z w = f y x z w (we write function application without brackets, as is customary in Haskell). I pointed out to Hitesh that actually, we could write such a function in almost any language which supports higher-order functions, and in dynamic languages (like Perl or Lisp) we can go further, and write a function argmunge, which accepts two functions and permutes the arguments of the first one (the "victim") according to the second (the "munger"). So
(argmunge f g) x_1 x_2 x_3 x_4 ... = f x_(g 1) x_(g 2) x_(g 3) x_(g 4) ...
Here's an implementation of argmunge in Perl, and some code to exercise it:
sub argmunge {
    my $func = shift;
    my $munger = shift; # not necessarily a permutation
    return sub { $func->(@_[@$munger]); }
}

sub myprint {
    print "Called with args ".join(", ", @_)."\n";
}

argmunge(\&myprint, [2,5,5,6])->(0,1,2,3,4,5,6);
argmunge(\&myprint, [3,2,1])->("fred", "barney", "betty", "wilma");
When run, this displays
Called with args 2, 5, 5, 6
Called with args wilma, betty, barney
Here we don't pass the munger in as a function, but rather as a list of values [g(0), g(1), ..., g(n)]. I'm prouder of that code than I probably should be, because it relies on some nice Perl features to work as it does; namely, Perl's argument-passing convention (in which all arguments are passed to a function as a single array called @_), list slicing (in which you can index into a list with another list), and list flattening (in which inclusion of one list in another splices the inner list into the outer list, resulting in a flat list). I remarked that it wouldn't be possible to write a general argmunger in Haskell, because the type of the result depends crucially on the actual value of the munging function. It ought to be possible to write one in a dependently-typed language like Cayenne - anyone care to do so?
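To see the problem (my own toy examples, not from the email exchange): any particular munger is easy enough to write in Haskell, but each one gets a different type, so there's no one type you could give to a general argmunge.
-- Swap the first two arguments: this is just flip.
munge10 :: (a -> b -> c) -> b -> a -> c
munge10 f x y = f y x

-- Rotate the first three arguments.
munge201 :: (a -> b -> c -> d) -> c -> a -> b -> d
munge201 f x y z = f y z x

-- Duplicate an argument: also fine, but yet another type.
munge00 :: (a -> a -> b) -> a -> b
munge00 f x = f x x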

[Edit: it's possible to write an even nicer version in Arc.]

Anyway, it may not be possible in vanilla Haskell, but it is possible using templates. )

The final code can be found here. As always, all suggestions for how I could improve my code or my development practices are gratefully received!
pozorvlak: (Default)
Tuesday, March 4th, 2008 11:43 am
My post about design patterns and if-statements in Smalltalk yesterday got a very pleasing reaction, garnering some informative comments from knowledgeable people and making the front page of programming.reddit. I guess I should have checked my facts on Smalltalk semantics more carefully before clicking "Post" :-) One comment that particularly interested me was this one (sadly anonymous), which shows a rather cute trick for using Smalltalk-style keyword methods in Haskell:
data Then = Then
data Else = Else

if' :: Bool -> Then -> a -> Else -> a -> a
if' True Then t Else f = t
if' False Then t Else f = f
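For instance (my own example, not the commenter's), a call site then reads almost like Smalltalk:
if' (2 + 2 == 4) Then "yes" Else "no"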
Now that's pretty cool, and could even be useful for DSL implementation1. But there's a problem. See the repeated code?
data Then = Then
data Else = Else
Here, let me show it to you the way it looks to me:
data Then = Then
data Else = Else
Literally half the elements that aren't pure syntax on those lines are repeated. What I'd like to do is define a keyword function, and then just write
keyword Then
keyword Else
Or, better,
keywords Then Else
Not a big deal, you might think, but what if you were writing a whole EDSL with many keywords? Or many different EDSLs? And it's not just the repetition: by defining a keywords function, I'd move the code up an abstraction level, making my intent clearer. We could of course write
data Keywords = Then | Else
but then we lose the benefit of compile-time syntax checking of our EDSL. Well, I guess our pseudo-keywords are syntactic sugar anyway, so it doesn't really matter. But still.

A few days ago, someone asked me for an example of code that's repetitive in Haskell but wouldn't be in Perl, and here's one. In Perl, and probably in Python, I could write a keywords function. It would be ugly, and would involve hairy symbol-table hacking or calls to eval, but the user wouldn't need to know that and I'd only need to write it once (never mind the fact that the "pseudo-keyworded functions" pattern wouldn't actually make any sense in Perl...). In Lisp or Ruby, I wouldn't need to write it at all, because I could use symbols instead. But perhaps this kind of thing (and other, more egregious examples) are just the price we pay for Haskell's type system, and are more than balanced out by greater opportunities for concision and abstraction elsewhere?

Nah, bollocks to that. We can do better.

Template Haskell to the rescue! )

The final version of Main.hs is here, and the final version of Keyword.hs is here. Not entirely straightforward, but we pretty much got there in the end. Would it be worth posting these code samples to the TH wiki, and perhaps linking here? Anyway, it's nice to have finally got this simple stuff to work: hopefully I'll be able to get some more difficult things to work soon, and this should reduce the amount of frustration and needlessly duplicated code in my Haskell programming experience.

1 Did you know you can do this in (plain) TeX, too?
\def\fred #1 barney #2 betty{#1#2}

\fred Hello there, barney Wilma betty !
produces "Hello there, Wilma!".