pozorvlak: (babylon)
2008-06-02 12:09 pm

(no subject)

Random thought: I'm sure people who actually know about this stuff will be able to tell me why it won't work, or, alternatively, cases where it's already been done.

Recall the problem with the naive Haskell mean program
mean xs = sum xs / length xs
which when asked to calculate the mean of [1 .. 1e9] allocates 8GB of RAM and brings the computer to a grinding halt. The problem there isn't really to do with naiveté: a similarly naive Ruby version works fine and runs in constant space, albeit slowly. The problem is garbage collection, or rather the lack of it. The Haskell runtime calculates the list [1.. 1e9] bit-by-bit so it can be summed, and normally would be able to throw every element away when it was done with it; but because the elements of the list will later need to be counted, there's still a (very long) pointer chain from the top-level to any given element of the list. So none of it can be garbage-collected.

In this case, this is a bad strategy: the effects of swapping out that much memory totally eliminate any benefit from caching the list. It would be quicker to throw the list away and calculate it again from scratch. Since Haskell's a purely functional language, we could in principle do this without worrying about any side-effects of the calculation. We could set up our garbage collector to become more aggressive as memory got shorter: when it starts to use swap, we could throw away any pure data that's more than N indirections from the toplevel, where N is varied dynamically to maximise performance. But what if our program is, say mean $ take 1000000000 mersennePrimes? Then it will take much longer to re-calculate the values than it would to swap them in and out of core. You'd need to keep track of how long each function call takes, and how long each bit of data took to calculate, and estimate re-calculation times when throwing data away.

But this is exactly the kind of profiling data that you need to track for the smart optimizing VMs that Steve Yegge talks about here. So, it seems that there's a synergy here. Does anyone know of a Sufficiently Smart VM that does this kind of aggressive garbage collection? Has it been tried before? Is there some obvious problem with the idea that I'm not seeing?
pozorvlak: (Default)
2008-05-16 12:12 pm

Lazy lists, impatience and hubris

There was a recent mailing list post by Andrew Coppin (which received some discussion on programming.reddit) in which he bemoans the way that GHC, when presented with reasonable-looking code, can suddenly and mysteriously decide that it needs to allocate gigabytes of memory, slowing your system to a crawl for ages (I've had similar problems myself). Andrew's example was the following:

I offer up the following example:
mean xs = sum xs / length xs
Now try, say, mean [1.. 1e9], and watch GHC eat several GB of RAM. (!!)
This is not so mysterious when you remember that Haskell uses linked lists as its native list structure, and so the only way to calculate length xs is to traverse the list a second time: the runtime system "helpfully" keeps around the copy of xs that it generated while calculating sum xs, and you have a space leak. But what to do about it?

The story so far )

Doing it in Perl )

Doing it in Ruby )

Take-home messages )
pozorvlak: (gasmask)
2008-01-23 04:11 pm

Hasskellsses

Bloody Haskell: it's like a scab I can't stop picking at.

Prompted by something on Reddit, I started thinking about libraries to pluralize (English) words, like the table-name generator in Ruby on Rails or Damian Conway's (rather more complete-looking) Lingua::EN::Inflect for Perl. Such things seem to be called "inflectors". My question is, what would be the idiomatic interface for a Haskell inflector? The obvious thing would be something like
    pluralize :: Inflector -> String -> String
where Inflector is some sort of data structure holding configuration data: classical versus modern plurals, any custom cases, etc. But following on from our earlier discussion of high- and low-level type-checking, it occurred to me that there's a constraint here: only singular nouns should be pluralized. So should we then have
    newtype Singular = String
    newtype Plural = String

    pluralize :: Inflector -> Singular -> Plural
? Or even
module Inflect (makeSingular, pluralize)
    data Singular = Singular String
    data Plural = Plural String

    makeSingular :: String -> Singular
    -- error-checking code goes here, to check that the string passed
    -- is a single word, or something

    pluralize :: Inflector -> Singular -> Plural
so we don't export the constructors for Singular or Plural, ensuring that one can only construct them using our interface? But wouldn't this make client code far uglier than necessary? Should one then provide both options, tagging one as unsafe?
    module Inflect (makeSingular, pluralize, unsafePluralize)
    data Singular = Singular String
    data Plural = Plural String

    makeSingular :: String -> Singular
    -- error-checking code goes here

    unsafePluralize ::Inflector -> String -> String

    pluralize :: Inflector -> Singular -> Plural
    pluralize i (Singular s) = Plural (unsafePluralize i s)
Thoughts?
pozorvlak: (Default)
2007-09-12 09:25 am

Operads in Ruby

Short version: I'm going to write a mildly cool and somewhat useful utility in Ruby which shows off Ruby's reflection capabilities and has something to do with my PhD thesis.

OK, first some (not really necessary) mathematical background )

OK, that's the maths out of the way, on to the code. The ability to do categorical composition of functions is useful, so maybe the multicategorical kind would be useful too? Ideally, we'd like a compose function which took a function f and a list of functions g1, g2, ... gn, and returned the function f.(g1,...gn) which takes (x11, ... xnmn) to
f(g1(x11 ... x1m1) ... g1(xn1 ... xnmn))
Unpleasant multiple sequences seem to be an unfortunate fact of life when dealing with this stuff. Anyway, here's an implementation in Ruby )

I'm sure this code can be improved, and I'd love to hear how. I don't think it's possible to write a function like this in vanilla Haskell, and writing one in Template Haskell is beyond my template-fu. It would be possible in Perl, but because of Perl's eccentric argument-passing conventions, you'd have to manually pass the arities in. This also uglifies the code somewhat.
pozorvlak: (gasmask)
2007-09-11 07:47 pm

Everything in Ruby is an object, is it?

irb(main):001:0> def fred(x); x + 1; end
nil
irb(main):002:0> fred.methods
ArgumentError: wrong # of arguments(0 for 1)
        from (irb):2:in `fred'
        from (irb):2
irb(main):003:0> 1.+.methods
ArgumentError: wrong # of arguments(0 for 1)
        from (irb):3:in `+'
        from (irb):3
irb(main):004:0> {|x| x+1}.methods
SyntaxError: compile error
(irb):4: parse error
{|x| x+1}.methods
  ^
(irb):4: parse error
{|x| x+1}.methods
         ^
        from (irb):4
irb(main):005:0> lambda {|x| x+1}.methods
["call", "==", "[]", "arity", "to_s", "dup", "eql?", "protected_methods",
 "frozen?", "===", "respond_to?", "class", "kind_of?", "__send__", "nil?",
 "instance_eval", "public_methods", "untaint", "__id__", "display",
 "inspect", "taint", "hash", "=~", "private_methods", "to_a", "is_a?",
 "clone", "equal?", "singleton_methods", "freeze", "type", "instance_of?",
 "send", "methods", "method", "tainted?", "instance_variables", "id",
 "extend"]
In unrelated news, Happy birthday [livejournal.com profile] stronae!

And does anyone if there's a portable way of finding the arity of a function in Scheme?
pozorvlak: (Default)
2007-09-10 12:39 pm

Long-term project SITREP

Current status on some of my long-term projects:

Thesis: Up to 36 pages )

Birdshot!: I had some good ideas for this )

Moving in with [livejournal.com profile] wormwood_pearl: Nominally moved in at the beginning of August. )

Learning Haskell: I'm going to officially give up on this one for now. )

Diet: This is going really well. )

Munro-bagging: up to 61 out of 284 )

Becoming a l33t martial artist: I've been doing some capoeira. )

Learning to juggle 5 balls: I'm getting 8 catches pretty consistently. )

Reading Ulysses: Haven't looked at it since I reached page 500. ) I seem to have so many other things competing for my attention :-)
pozorvlak: (Default)
2007-03-20 05:35 pm

Stopping cows

There's a scene in John Wyndham's marvellous Chocky where the main character asks his father "Why does a cow stop?"

What he means by this is why does a cow's intelligence stop: why are cows smart enough to escape out of an open gate, but not smart enough to see that they could lift up the latch with their noses and escape whenever they want? Why do cows hit a point where their problem-solving ability just stops dead?

I was reminded of this as I read Matthew Huntbach's essay What's Wrong with Ruby this morning. He points out that "Ruby, and the whole scripting language phenomenon is a slap in the face" for the academic language-design community of which he is apparently a part.
We have been used to complaining that the programming languages we had developed were so much better than the mainstream programming languages, the only reasons they hadn’t taken off were that they weren’t developed and promoted by big companies, and that the inherent conservatism of industry restricted commercial programming to tried and trusted languages.

Yet a whole stream of new languages: Perl, PHP, Python and now Ruby, each initiated by one person as back-room projects, have been adopted for serious use and achieved many thousands of users. Unlike academically-derived languages, they have given the impression of being thrown together to meet a need without any strong underlying theoretical basis.
That's right. Your languages weren't unsuccessful for the reasons you thought they were, and languages that don't meet your standards of elegance have flourished in actual use. Something you thought was necessary for a language to succeed (corporate backing) turns out not to be necessary after all. Now, let's apply your insights to the rest of your essay (which, by the way, is a perfect example of the kind of thing I was talking about in my post Tightrope Walking). Ruby et al don't have static typing, formal semantics, standards documents, or the guarantee that each object will have a class that determines its behaviour - in short, they're not academic languages - and yet thousands of bright users use them and like them. Could it be that perhaps the lack of these things isn't such a problem as you think? Might they, in fact, have advantages that you're too blind to see?

See that bit of metal attaching the gate to the fence? You might like to think about what you can do with that.