pozorvlak: (gasmask)
Monday, March 3rd, 2008 01:32 pm
I did warn you...

A while back, [livejournal.com profile] markdominus wrote an article about the Design Patterns movement, in which he claimed that design patterns are basically signs of weaknesses in programming languages; if a language forces you to implement the same pattern of code over and over in different programs, that's a sign that your language is insufficiently high-level with respect to problems that your users actually encounter, so language designers should focus on patterns as places where their languages could be improved. This prompted some reaction in the tech blogosphere, including this response from Ralph Johnson, one of the authors of the original Design Patterns book. Dominus responded here, correcting some of Johnson's misunderstandings and commenting further; in fact, though, they're mostly in agreement (and I'm pretty much in agreement with them both, FWIW).

Everyone up to speed? Good.

Something Dominus didn't seem to get was Johnson's distinction between things that are in the language and things that are in the library. Johnson seems to think that there's a three-tier hierarchy:

1) Language features
2) Library code (available to every application)
3) User code (must be written for each application).

He claims that it can be a good tradeoff to leave something as a pattern, because the alternative might be to overcomplicate the standard library or, even worse, the language itself - he seems to have visions of a language with decorator, abstractFactory etc. as keywords. Dominus largely refutes this claim in his response. But more interestingly, Dominus thinks that the distinction between 1 and 2 is basically artificial - is C's printf (or Haskell's standard prelude) part of the language or the library? It might matter to language implementers, but to the average programmer, the distinction's pretty unimportant. Now, this is a good point, but I think I can see what Johnson's getting at. Remember that Johnson's a Smalltalker. Let's consider (finally) how Smalltalk handles if-statements, which in most languages are built-in syntactic elements of the language.

In Smalltalk, booleans (ie, True or False) are objects: specifically, they're instances of the abstract base class Boolean, or rather of its two subclasses True and False. So every boolean has type True or False, and no actual member data. Boolean has two virtual functions, ifTrue: and ifFalse:, which take as their argument a block of code. Both True and False override these functions; True's version of ifTrue: calls the code it's passed, and False's version does nothing (and vice-versa for ifFalse:). Here's an example:
a < b
  ifTrue: [^'a is less than b']
  ifFalse: [^'a is greater than or equal to b']
Those things in square brackets are essentially anonymous functions, by the way. Except they're objects, because everything is an object in Smalltalk. Now, what's happening there is that we call a's "<" method, with argument b; this returns a boolean. We call its ifTrue: and ifFalse: methods, passing as arguments the code we want executed in either case. The effect is the same as that of the Ruby code
if a < b then
  puts "a is less than b"
else
  puts "a is greater than or equal to b"
end
but what other languages do with special syntax, Smalltalk does as a special case of method-dispatch. (Code samples from here). This completely blew me away when someone ([livejournal.com profile] pdcawley?) first told me about it. Haskell does something similar, of course. [Edit: no it doesn't - see comments. But it could. I've also got some details of the Smalltalk wrong.]
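
Here, for what it's worth, is a minimal sketch of that "but it could" in Haskell. All the names are mine, nothing standard: you define your own boolean type, and laziness plays the role of Smalltalk's blocks, so the branch not taken is never evaluated.
-- A home-made boolean, mirroring Smalltalk's True and False classes.
data SBool = STrue | SFalse

-- Each "method" simply selects one of its two arguments; because
-- Haskell is lazy, the unselected branch is never evaluated, so no
-- explicit code blocks are needed.
select :: SBool -> a -> a -> a
select STrue  t _ = t
select SFalse _ f = f

-- Build an SBool without leaning on the built-in if.
lessThan :: Ord a => a -> a -> SBool
lessThan a b = case compare a b of
                 LT -> STrue
                 _  -> SFalse

main :: IO ()
main = putStrLn (select (lessThan 1 2)
                        "a is less than b"
                        "a is greater than or equal to b")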

Now, back to Johnson. Smalltalk's designers had three choices: bake special syntax for if-statements into the language, implement it in Smalltalk library code in terms of more general concepts that were part of the language, or force the programmer to do the work him/herself every time (using, I dunno, computed gotos or something). They made the unusual choice to go for the second option. While this doesn't matter if all you want are if-statements, it affects the language in other ways: the powerful abstraction mechanism needed to do this is available to the user to define new features.

This, I think, is the real distinction between "language" and "library": if the feature in question could have been implemented by your users using a combination of other features, it might as well be part of the library. Smalltalk's if-statements pass this test, as do (say) Lisp's looping constructs. Haskell's monads fail, because of do-notation: if a Haskell programmer wanted to add similar syntax for (say) comonads, would they be able to? I don't think so, and so it follows that monads are actually baked into the language. Several of Perl's built-in functions do things that user functions wouldn't be able to do, and thus should count as part of the language, even though they feel like part of a library.
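
To make the monad example concrete: the Monad typeclass itself is perfectly ordinary library code, but do-notation is compiler-provided sugar for the library functions (>>=) and return - sugar that a user couldn't add for comonads. A trivial illustration:
greet :: IO ()
greet = do
    name <- getLine
    putStrLn ("Hello, " ++ name)

-- ...is just sugar for plain library calls:
greet' :: IO ()
greet' = getLine >>= \name -> putStrLn ("Hello, " ++ name)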

So here's where it applies to design patterns: if your programmers find themselves having to implement essentially the same code over and over again, it's because you don't provide powerful enough abstraction mechanisms for programmers to write library modules that would do the job for them. Programmers are (rightly) lazy, and don't do work many times if they can do it once and then use that solution over and over again. So the fact that it's a design pattern means that it can't be in the library, and your language needs fixing: either by special-casing extra language features in or (preferably) by making your abstraction features more powerful, so the necessary library code can be written.
pozorvlak: (Default)
Saturday, February 23rd, 2008 10:47 pm
I observe to my astonishment that I've made the "Quotes of the Week" section of this week's Haskell Weekly News. Um, gosh.

Now if only I could work out if I've been quoted because dons agrees with me or wishes to hold me up to public ridicule :-)
pozorvlak: (babylon)
Friday, February 22nd, 2008 12:25 am
Did anyone else know that Emacs' Calc mode is effectively a full-featured computer algebra system? Symbolic manipulation, vectors, matrices, calculus, graphing (provided via gnuplot), all the stuff you'd expect.

No gamepad integration that I could see, though :-)
pozorvlak: (babylon)
Thursday, February 21st, 2008 04:53 pm
I went to an all-day sales pitch conference sponsored by Wolfram Research yesterday, all about just how damn cool the new version of Mathematica is. There's a lot to like: as a language, it seems to have a nice blend of Lisp-like and APL-like features, so you can do all your standard functional programming tricks and what looks like a decent subset of array programming tricks, as well as writing normal imperative code. The standard library is, of course, vast, with loads of clever symbolic, numerical, graphics and GUI code built in, and in the new version there's also lots of standard geographical/scientific/financial/etc data available, import and export filters for loads of standard formats, and other niceties. One thing I really liked was the Manipulate[] function: hand it an expression (which can evaluate to a number, a symbolic form, a graph, a 3D plot, a sound file, an animation, or whatever) and a list of parameters, and it will automagically construct a GUI widget with sliders and checkboxes that allow you to manipulate the parameters interactively and observe the result. You can even control the parameters using a gamepad, if you want... They seem to have made a major effort to make everything interoperate smoothly in the new version - one slightly silly demo they showed us was putting slider bars as the limits of an integral, and changing the value of the result as the bar was dragged about. That was always the major problem with open source mathematics software, from my limited experience - nothing does everything, so you have to learn N different incompatible sublanguages, write loads of glue code, and constantly switch applications. The Sage guys seem to be working on this, though - I'll have to check it out.

Have a look at the big collection of Mathematica demos at http://demonstrations.wolfram.com, which includes a lot of examples of Manipulate[]. There are videos, or you can download a free-as-in-beer notebook reader.

In other news, I've been having a bit of a play with the NetBeans IDE for Java, and really liking it. I've got used to doing everything in vi and the command-line, which has its upsides, but IDEs can make life so much easier for the beginner. In particular, NetBeans' wiggly red underlining has been a huge help in learning the language, and the integrated documentation browser is very nice. I haven't needed the automated refactoring support yet, but it's fun to play with - select! Click! Extract Method! :-)

But here's my question - why is it so slow? I know it's written in an interpreted language, but so is Emacs, and that runs without too much complaint on 1980s hardware. And the compiler - javac, itself written in Java, unless I'm much mistaken - is slow as hell too. Or, conversely, why were the Delphi and Turbo Pascal compilers so fast? Simple Java programs take several seconds to compile on my 1GHz machine, where their Pascal equivalents would have compiled in an eyeblink on its predecessor's predecessor1. Is there something about Pascal that makes it especially easy to compile, and if so, what is it? Java seems at least as regular to me, and generating bytecode ought to be easier than generating native code. Or is Anders Hejlsberg just a ninja?

Thesis now at 63 pages and 22563 words, according to wc *.tex, which means that I've written 20 pages and, um, several thousand words in the last sixteen days (nearly 1000 words today, but many of those were "XXX proof here"). Progress is being made, though there's an awful lot still to do.

1 I'm sure I've mentioned our fifteen-minute link times for our medium-sized C++ app when I was working at $company: while our network of file dependencies wasn't as bad as it could have been, it still resulted in the linker having to do a lot of work. And while compiling can be distributed easily around a network, linking can't :-(
pozorvlak: (kittin)
Saturday, February 16th, 2008 01:20 am
No thesis today - I had a six hour job interview through in Edinburgh, for the maths consultancy people (who, it turns out, are university friends of my Canadian friend Jeff - small world!). The interview went reasonably well, though it could have been better - I was a bit stressed out by the whole interview situation and wasn't as sharp as I could have been. The maths questions (all based on tasks they've actually encountered in the field) were completely outside my area, and hence somewhat challenging for me, but I managed to come up with not-entirely-stupid answers to them without too many hints. The programming questions were mostly straightforward (what does this bit of recursive Prolog do, implement the following standard mathematical functions in Java, write a simple method against this random spec), apart from one where I had to find an error in some threaded Java code. Did I mention that I've never written threaded code before, and I don't speak Java very well? Then they gave me lunch, then brought me back to the office and asked me all the questions from the first round again. Apparently the answer I gave to the "where would you like to be in five years?" question was very good :-)

Anyway, the company looks really cool, and the people all seemed to be good guys, so fingers crossed on that one. They've got another couple of people to interview, and they'll get back to me some time in March.
pozorvlak: (polar bear)
Tuesday, February 12th, 2008 10:26 am
I'm going to make what should be an uncontroversial statement: if you don't understand and use monads, you are at best a quarter of a Haskell programmer. A corollary of this is that, since using monad transformers is the only (or at least the approved) way to use two or more monads together, if you don't understand and use monad transformers you are at best half a Haskell programmer.
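
For concreteness, here's a toy transformer example - my code, assuming the mtl library - in which a StateT layered over IO lets a single block of code both read input and carry state:
import Control.Monad.State

-- Read a line (the IO layer) and add its word count to a running
-- total (the State layer).
tally :: StateT Int IO ()
tally = do
    line <- lift getLine
    modify (+ length (words line))

main :: IO ()
main = do
    total <- execStateT (tally >> tally) 0
    print total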

[Another corollary is that I am, at best, about an eighth of a Haskell programmer: though I understand monads well on a theoretical level, I invariably emerge defeated from any attempt to bend them to my will.]

But we'll come back to that later.

Something I've been thinking about for a while is this whole business of designing languages to make programs shorter. )

1 There really ought to be a word that means "would never use a twopenny word when a half-crown word would do", but I can't think of one. English grads? Edit: sesquipedalian! Of course! Thanks, [livejournal.com profile] fanf! (Originally, I used "prolix")
2 I actually came up with this list by thinking about languages whose users were the most passionate. But they're also extremely concise, which I think is a large part of the reason for the passion. If I were focusing purely on concision, I should probably consider Forth, but I don't know enough about it.
3 J has "boxed arrays" too, which are something like two-dimensional s-expressions, but let's leave those aside for now.
4 You might want to raise this objection against Smalltalk, too: objects are members of classes, which are something like types. Now, I've hardly used Smalltalk, so I'm probably talking out of my elbow, but: since everything is an object, and the language has powerful reflection features and duck typing, we can in fact write generic operators that work for objects of many or all classes. But maybe I'm entirely wrong about Smalltalk programming: in which case, please delete all references to the language from my argument.
5 Do you find yourself wanting to go out and strangle small fluffy animals every time you have to type out an instance declaration that would be entirely unnecessary in a duck-typed language? I do. Particularly when it doesn't work and spits out some ludicrous error message at me, telling me that I've encountered another stupid corner case of the typeclass system.
6 I learned to my surprise the other day that I'm a member of the Geometry and Topology research group, and not the algebra research group as I'd always assumed - apparently universal algebra is now considered a branch of geometry!
pozorvlak: (gasmask)
Tuesday, February 5th, 2008 02:35 am
[livejournal.com profile] totherme kindly drew my attention to this blog post today. The author, Slava Akhmechet, tries to explain away that horrible feeling of unproductivity and frustration that I know so well from my attempts to use Haskell: his claim is that it's just that my expectations are miscalibrated from using imperative languages. A Haskell solution to a given problem, he claims, will take the same amount of thought as a solution in another language, but much less typing: by simple statistics, therefore, you're going to spend a lot of time staring at the screen not doing anything visible, and this can feel very unproductive if you're not used to it.

As an example, he gives the code
extractWidgets :: (Data a) => a -> [String]
extractWidgets = nub . map (\(HsIdent a)->a) . listify isWidget
    where isWidget (HsIdent actionName)
              | "Widget" `isSuffixOf` actionName = True
          isWidget _ = False
Bleh! :-( This function takes a parse tree representing some Haskell code, and extracts all the identifiers ending in "Widget", then removes the duplicates. Slava challenges any Java programmers reading to do the same in five lines or fewer.

Pointless language pissing match, including some Arc coding )

On language concision in general: I typically find that Haskell requires fewer lines for a given program, but Perl requires fewer characters, and they both use about the same number of tokens. Lisp is longer and messier than Haskell for easy problems, but quickly gains ground as the problems get harder.1 The APL family own on all three axes. This is extremely unscientific, of course, and because I don't know much APL/J I can't be sure how it extends to harder problems. I did once email Paul Graham asking why, if succinctness was power, he didn't switch to a member of the APL family; he has yet to reply, but I don't attach any great significance to this.

And having got all that stuff out of my head, I'm going back to bed. Hopefully I'll be able to sleep this time :-)

1Fun exercise: translate the example code in On Lisp into Haskell (or Perl, Ruby, etc.). It really helps you to get to grips with the material in the book. I found that the Haskell solutions were shorter and cleaner than the Lisp solutions up until about a third of the way through the book, at which point Lisp started to catch up with a vengeance: shortly thereafter, Graham was doing things in a few lines of Lisp that simply couldn't be done in finitely many lines of Haskell. I'd be very interested to see solutions written by someone who knew what they were doing, though!
pozorvlak: (Default)
Wednesday, January 30th, 2008 11:04 am
http://arclanguage.org/

Well, that makes the "which language shall I learn next" question rather easier...

First impressions (based on reading the tutorial rather than playing with it): I like it. It embodies PG's philosophy that a language should get out of your way and let you shoot yourself in the foot because one day, you might need to do tarsal surgery and only have a pistol to hand. In many respects, it's the anti-Haskell: it encourages you to put off the decision of how to represent your data as long as possible. Here's a feature along those lines that I liked: indexes into data-structures are indistinguishable from function calls. So if I write
(foo 0)
you have no way of knowing if foo is a list, a string or a function. Evaluation is strict by default, which I think is a net loss (but you've got macros, so it's swings and roundabouts, I suppose). The anaphoric (pronoun-introducing) macros from On Lisp are included by default - I've found pronouns to be very useful in Perl, so this can only be a Good Thing. I was amused to see that most of the language is defined in Arc, and that PG seems to think that this is a bold and novel experiment :-)
pozorvlak: (gasmask)
Wednesday, January 23rd, 2008 04:11 pm
Bloody Haskell: it's like a scab I can't stop picking at.

Prompted by something on Reddit, I started thinking about libraries to pluralize (English) words, like the table-name generator in Ruby on Rails or Damian Conway's (rather more complete-looking) Lingua::EN::Inflect for Perl. Such things seem to be called "inflectors". My question is, what would be the idiomatic interface for a Haskell inflector? The obvious thing would be something like
    pluralize :: Inflector -> String -> String
where Inflector is some sort of data structure holding configuration data: classical versus modern plurals, any custom cases, etc. But following on from our earlier discussion of high- and low-level type-checking, it occurred to me that there's a constraint here: only singular nouns should be pluralized. So should we then have
    newtype Singular = Singular String
    newtype Plural = Plural String

    pluralize :: Inflector -> Singular -> Plural
? Or even
module Inflect (makeSingular, pluralize) where
    data Singular = Singular String
    data Plural = Plural String

    makeSingular :: String -> Singular
    -- error-checking code goes here, to check that the string passed
    -- is a single word, or something

    pluralize :: Inflector -> Singular -> Plural
so we don't export the constructors for Singular or Plural, ensuring that one can only construct them using our interface? But wouldn't this make client code far uglier than necessary? Should one then provide both options, tagging one as unsafe?
    module Inflect (makeSingular, pluralize, unsafePluralize) where
    data Singular = Singular String
    data Plural = Plural String

    makeSingular :: String -> Singular
    -- error-checking code goes here

    unsafePluralize :: Inflector -> String -> String

    pluralize :: Inflector -> Singular -> Plural
    pluralize i (Singular s) = Plural (unsafePluralize i s)
Thoughts?
pozorvlak: (polar bear)
Thursday, January 17th, 2008 02:54 pm
Here's something I've been wondering about for ages. When I'm programming, I find worrying about the types of variables to be generally unhelpful, and being forced to think about such things by the compiler is downright irritating. Yet when I'm doing mathematics, I think type-centrically all the time. The two activities are strongly connected in my mind, so it's surprising to me that there's this difference.

Here are some theories I've been kicking around )

By the way, remember a while ago when I was musing on the possibility of static duck-typed languages? Well, it seems that Scala is one :-)
pozorvlak: (Default)
Monday, January 7th, 2008 10:33 pm
This afternoon I gratefully took possession of the copy of Iverson's A Programming Language which the library had been holding for me. Within two sentences, he'd managed to say something so thought-provoking that I felt compelled to post about it here. )
pozorvlak: (sceince)
Saturday, January 5th, 2008 10:32 pm
Suppose we have an algorithm whose performance (along some axis) is worse than linear - quadratic, say. The example that I'm thinking of is concatenation of lots of strings using the Schlemiel the Painter algorithm. Suppose we want to perform this calculation on a lot of data. The best thing to do, of course, is to find a better algorithm or some way of reducing the amount of data we need to consider, but that's not always possible. There's another general technique that's often helpful in situations like this: we split the data up into smaller chunks, perform the calculation for each chunk separately, then combine the results. For the Schlemiel example, suppose we want to concatenate mn strings, and that this will take k(mn)^2 steps. We split it up into m blocks of n strings, concatenate each block, and then concatenate the results, for k(m^2 + mn^2) = km(m + n^2) steps, an improvement on the original. We could improve further by chopping it up again, and again, until we end up with a trivial problem. I remember working this technique out for myself a few years ago (while trying to do exactly this - in my defence, I was constrained by available libraries). I felt almost as pleased with myself as I did when I discovered bubble sort for myself at the age of ten or eleven - we live and learn...
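
Here's the trick as a runnable sketch, in Haskell for concreteness (the names are mine): a left fold over (++) is a classic Schlemiel, and chunking it gives the improvement described above.
-- Schlemiel-style concatenation: each (++) re-copies the growing
-- accumulator, which is where the quadratic behaviour comes from.
schlemiel :: [String] -> String
schlemiel = foldl (++) ""

-- Split a list into blocks of n.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)

-- Concatenate each block separately, then concatenate the results:
-- in the cost model above, k(m^2 + mn^2) steps rather than k(mn)^2.
chunked :: Int -> [String] -> String
chunked n = schlemiel . map schlemiel . chunksOf n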

Anyway, I claim that this is what we're doing when we write modular code.

It's received wisdom that bug count grows more than linearly with number of lines of code - I've heard quadratic growth quoted, which makes some sort of sense since bugs often arise from unforeseen interactions between different bits of code, but I think that's maybe a bit excessive. So we can apply the analysis above to the process of writing software. "Choosing a better algorithm" might correspond to using a better development process, like XP or something, or maybe using some technique that makes the bug rate lower (test-driven development, strong typing, formal QA process, take your pick). Reducing the amount of data corresponds to, well, writing the program in fewer lines. But these aren't always possible, so we're back to chopping the data up into smaller pieces. Which we do, obsessively (if we're any good at all, that is). We split up applications into processes. We split up programs into modules. We split up modules into subroutines. We split up state into objects. All the time, we're fighting the super-linear behaviour of bug count: by keeping each part small, we reduce the number of bugs in each part; and the sum of bugs in the parts plus the bugs arising from unforeseen interactions between parts should be less than the number of bugs that would have been included if we'd written the program all in one chunk.

Of course, this isn't the only reason we write code modularly: modularity allows for code reuse, which reduces the amount of code we need to write (technique 2), and eliminates copy-and-paste bugs (technique 1). But I think good programmers write even code that they never intend to reuse in a modular fashion, and the reason they do this is, fundamentally, that they're optimising for low bug count by splitting up their data into chunks.
pozorvlak: (babylon)
Thursday, December 13th, 2007 05:33 pm
I'm in the middle of writing up a research proposal for a postdoc position I'm applying for. I've had enough of categorification for a while, and it struck me that doing formal semantics for a simple language in the APL family (J, K, A+, Matlab...) could be both Fun and Interesting. Some poking around on Google and MathSciNet suggests that nobody's done anything in this vein before - I've downloaded all the relevant looking papers (all three or four of them), and intend to read them this evening after climbing :-)

The trouble is that I've never written anything like this before. Some of you, I know, have: does this look remotely sane? Does the topic sound interesting? The second section, you'll be pleased to hear, is very much a zeroth draft at this stage :-)
pozorvlak: (pozorvlak)
Monday, November 26th, 2007 12:53 pm
For reasons that are increasingly unclear to me, I've been using the TeX package XY-Pic for the commutative diagrams (of which there are many) in my thesis and papers. If there is one and only one Right Way to do something, you can pretty much guarantee that XY-Pic will:
  1. do something else by default, producing horribly ugly output;
  2. only do the Right Thing if you utter some cryptic and fragile incantation in a language that looks like the bastard offspring of APL and the Black Speech of Mordor1;
  3. hide the details of said incantation away somewhere in the depths of the voluminous, poorly-indexed, verbose and maddeningly unclear manual.
As should be clear, I'm not a huge fan.

One of its more annoying characteristics is the way it handles tails of arrows, which (by default) are drawn just beyond the start of the arrow, so the tail invariably collides with whatever the arrow is pointing away from. Like this:
That was produced by the code \xymatrix{ A \ar@{>->}[r] & B }, which, while not terribly clear, is the obvious thing to try, and the only thing you'll know how to do unless you've wasted days reading the manual.

Fortunately, there is a fix )

1 Thinking about it, XY-Pic is actually kinda agglutinative, much like the Black Speech...
pozorvlak: (polar bear)
Thursday, October 11th, 2007 11:21 am
Perhaps you've had this experience:

You have a problem, which you've bashed your head against for ages. You can't see what to do about it. Finally, you decide to ask a friend, or email the lecturer, or put it up to your boss, or whatever. You get hold of him/her, and start explaining your problem. But by the time you're halfway in, the very act of explaining your problem forces you to clarify it in your own mind, and the solution becomes obvious to you. You drift off, mid-sentence. "Sorry," you say, "I've just seen what to do. Sorry to bother you."

Your friend was acting as a debugging teddy. The term comes from a university computer room back in the Dark Ages of computing which kept an actual teddy bear for this purpose: you were required to explain your problem to the teddy before you were allowed to ask an actual human being. At the startup where [livejournal.com profile] totherme and I used to work, we had a talking Yoda doll. But Master Yoda wasn't so good, precisely because he could talk and not just act as a sounding-board for your own thoughts: at one point, Management discovered that Master Yoda had been making major architectural decisions, and replaced him with a life-size cardboard cutout of Buffy the Vampire Slayer.1

1 She had business cards and everything. "Buffy Summers, [company] Ltd. Pensions Advisor (stakeholder)".
pozorvlak: (gasmask)
Monday, October 8th, 2007 03:24 pm
In a world such as this one, filled with war, famine, impending ecological collapse and Celtic fans, it really ought to take more to horrify me than this. But I guess I'm shallow that way.

Before diving into the actual text, let's just take a step back and consider the implications: the University of Limerick was teaching its "introduction to programming" course in COBOL, in 2002 (and as far as I can see, still is).

[Non-geeks: COBOL is the Common Business-Oriented Language, an ancient language used on dinosaur mainframes with about as much style and joie de vivre as the name implies. The Jargon File notes that in hacker circles its name is "synonymous with evil", and Edsger Dijkstra famously commented that "The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offense."]

Read more... )
pozorvlak: (babylon)
Tuesday, October 2nd, 2007 11:24 pm
Another one for the "using computers to investigate trivial mathematics" file...

I had another meeting with my STEP student today, who showed me the following problem: find a simple expression for the sum of the squares of the nth and (n-1)st Fibonacci numbers. In other words, what is F(n)^2 + F(n-1)^2? We're taking the zeroth Fibonacci number to be 0, and the first to be 1.

My conjectured answer, and some supporting evidence )

I still can't see a proof, but no doubt one will occur to me in the morning...

In other news, I've been playing around a bit more with the Catalan numbers - in fact, I briefly thought they might have something to do with my actual research, and got very excited, but on further reflection I don't think they're very helpful. But it seems likely that the reason the "naive" J code hung didn't have all that much to do with the naiveté of the algorithm - using the same algorithm, and on the same machine, MIT Scheme (interpreted) was able to calculate the 100,000th Catalan number in around 100s, with no paging or machine-locking, and Python was able to calculate it in under 20s! It seems to me that J is using either a poor bignum implementation, or (more likely) a very poor implementation of the binomial coefficients - I discussed this with my father (we're that kind of family), and he suggested that it's probably written recursively, and thus consumes vast amounts of stack space. Easy enough to fix, if only the J interpreter weren't closed-source...
pozorvlak: (gasmask)
Monday, October 1st, 2007 09:18 pm
Implementation-defined languages come in for a lot of flak. Users of certain languages will point to their language's standards document as if it were self-evidently proof of greater inherent worth: a language that's defined by its implementation, it is implied, is always going to be a bit non-U. It may train itself out of its dropped H's and tendency to mispronounce the word "scone"1, but it will never have the true breeding that comes from a standards document.

Which is daft, because implementation-defined languages have some important advantages )

By the way, I'm not saying that all specifications are bad (a good one is an excellent thing to have) or that specification-defined languages have no advantages - I'm assuming that the advantages of specification-defined languages are so well-rehearsed that I don't need to repeat them. Anyone needing further enlightenment is encouraged to go to comp.lang.scheme and say "I think spec-defined languages suck!" :-)

Now for the second part of my rant: Haskell, as we know it today, is an implementation-defined language, defined by the latest CVS snapshot of GHC. "But what about the Haskell Report, and the Revised Report, and Haskell prime, and Hugs, and, er, those other implementations?" I hear you cry. Well, what about them? Every time I asked a question, it seemed, the answer would begin with "first download the latest version of GHC, then turn on these extensions..." A spec for a language that people don't actually use is - well, not useless, if it's sufficiently close to a language that they do use, but it ain't a specification as I understand the term. Everyone in the Haskell community seems to track the latest version of GHC for its new features. This is not spec-like behaviour. Now, as I said above, I don't think that implementation-defined languages are bad: quite the reverse. I just think it would save everyone a lot of bother and false superiority if the community could admit this to itself and to outsiders.

1 The O is short: "scone" rhymes with "gone", not "groan". Unless you're talking about the town in Scotland, when it rhymes with "boon". People who insist on mispronouncing this word around me will have their scones taken away from them and donated to the Pozorvlak Pedant-Feeding Drive.
pozorvlak: (kittin)
Sunday, September 30th, 2007 08:26 pm
SICP section 2.4 set off my wheel-reinvention alarms in a big way. It discusses writing code to use complex numbers in a representation-agnostic way; your complex numbers can be either in polar (r, θ) form or rectangular (x + i y) form, and ideally you'd like not to care. Their solution? "Tag" each number with its representation, by passing around pairs whose first element is a symbol: either 'rectangular or 'polar. Your procedures to (say) find the magnitude of your complex numbers first check the symbol and then (in a nice, generic, data-driven, extensible way) dispatch the correct procedure given the representation you're using.

Which is all very well, but isn't this... y'know... a type system? Given that Scheme obviously has some sort of type system in place so it can complain when I try to add 1 to "fred", aren't we reinventing a pretty large wheel? Is this, in fact, final and clinching proof of Smith's (First) Law?
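
For comparison, here's roughly the same dispatch when the tags live in the type system - a minimal Haskell sketch (names mine): the two representations become constructors of one algebraic data type, and the tag check is ordinary pattern matching, verified by the compiler.
-- The "tag" is now the constructor itself.
data Complex = Rectangular Double Double  -- x + iy
             | Polar Double Double        -- (r, theta)

magnitude :: Complex -> Double
magnitude (Rectangular x y) = sqrt (x*x + y*y)
magnitude (Polar r _)       = r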

Well, yes and no. Yes, we are implementing a simple type system, but given that the main thrust of the book is how to write interpreters and compilers, that's a pretty useful thing to know. It's also interesting to see how you can whip up your own type system (and module system) atop pairs and lambdas. It's a very simple type system, but it's not too hard to see how, with a slightly more complicated data structure storing your dispatch tables and possibly some macros to take away the busywork, you could implement inheritance, generics... even something like Frink's physical units tracking. And you could mix and match such systems throughout your program. I can't decide if that would be really really cool, or a maintenance and reuse nightmare :-)
pozorvlak: (babylon)
Saturday, September 29th, 2007 12:58 am
While trying to do one of the exercises from SICP, I wanted to apply and to a list. Now, I could have used fold, but it seemed a bit of a waste given that Scheme's and is variadic: what I needed was an equivalent of Perl's func(@list_of_args) idiom, or Python and Ruby's func(*list). So, inspired by the syntax for variadic function definitions, I tried
(and . list)
for various values of list, and it worked! Rather elegant, I thought: no special syntax, just an application of Lisp's usual list-construction and argument-passing rules (like the Perl idiom, in fact). But when I tried it with + instead of and, no dice - instead, I get a "Combination must be a proper list" error. Obviously something to do (I realise, belatedly) with the fact that and is a special form and not an ordinary function. Maybe Larry Wall was onto something with his idea that different things should look different :-) But why doesn't it work for ordinary functions? Isn't (func . args) equal to (func arg1 arg2 arg3...)?

[You can get the effect I was originally after by using (apply func list).]


One of the especially nice things about my shiny new Ubuntu installation is that I now have a working Java system, so I can run J in all its graphical glory. So today I worked through one of the J labs: specifically, the one about the Catalan numbers (I can't find the lab online, alas - you'll just have to download J and try it for yourselves :-) ). The standard formula for the Catalan numbers involves factorials, so both the nth Catalan number itself and the time needed to calculate it grow very quickly with n. You can improve the run-time by rearranging the formula to use binomial coefficients (J has a built-in function for calculating nCr in a non-stupid way), and this makes calculation of Catalan numbers up to the 10,000th or so almost instant. But when I tried to use this code to find the 100,000th Catalan number, my computer locked solid. Response became glacial, the hard drive paged constantly, and I had to kill J and restart the lab.

Fortunately, there was a better way. The rest of the lab showed how you could use the prime decomposition of (2n)! and n! to find a much quicker way of calculating Catalan numbers. With this code, the calculation again became instant - a salutary reminder of the power of a good algorithm :-) I'm still finding J a bit hard to read, though: code like
+/ <. n % p ^ >: i. <. p ^. n
or worse,
cat1=: (! +:) % >:
makes my eyes cross :-(

[As any fule can see, the first line calculates the exponent of each prime p in n!, and the second is the naive formula for calculating the Catalan numbers.]
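
For anyone who, like me, finds the J hard to parse, here's roughly what those two lines say, translated into Haskell (the names, and the reading of the J, are mine - treat it as a sketch):
-- Exponent of the prime p in n! (Legendre's formula): the sum of
-- floor(n / p^k) for k = 1, 2, ..., while the terms are non-zero.
legendre :: Integer -> Integer -> Integer
legendre p n = sum (takeWhile (> 0) [n `div` p^k | k <- [1..]])

-- The naive formula: the nth Catalan number is C(2n, n) / (n+1).
binom :: Integer -> Integer -> Integer
binom n k = product [n-k+1 .. n] `div` product [1 .. k]

cat1 :: Integer -> Integer
cat1 n = binom (2*n) n `div` (n+1)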

While writing this post, I came across this link, which is rather fun: Culture Shock.