A couple of days ago, Chris Yocum asked me to help him out with some OCaml code. He's learning OCaml at the moment, and thought he'd practice it by solving a problem he'd encountered in his research on early Irish law.
In early Irish law, murder was punishable by a fine dependent on the status of the victim. Ten-and-a-half cows per king murdered (no, really!), five for a major nobleman, and so on. At one particular event, we know the total fine levied, but not the number of victims. Chris wanted to find out the range of possibilities. This is an example of the change-making problem: given small change of various denominations, how can you give your customer 73p? I most often encounter this problem with rugby scores: if Wales beat England 67 - 3 (I can dream...), then at seven points for a converted try, five for an unconverted try and three for a penalty, what might have happened?
Chris had already written a Perl program that looked very much like this:
There are a few things to criticise in this program: it searches a lot of possibilities it can rule out (once $w and $x are too big, it doesn't matter what $y and $z are), it repeats a lot of calculations in the inner loops (more on that later), and the number of different social classes is baked into the structure of the program - if the early Irish lawmakers ever decide to institute a new class of noble, we'd have to modify our code rather than just changing the values of some parameters. But it's simple, easy to read, obviously correct and easy to write.
Chris wanted to re-write this in OCaml, which is a (mostly) functional language, and iteration is considered poor style by functional devotees. Could I help him re-write it to use recursion instead?
At this point I'm afraid I seized on the word "recursion", because, you see, there's an elegant recursive solution to the change-making problem and I thought he was asking about that. Somewhere between my wanting to entirely re-write the program using a new algorithm, it being trickier to write than I'd expected, and my not actually knowing OCaml very well, I managed to completely confuse him. Sorry, Chris. Anyway, we'll come back to the first program we wrote in a bit; here's what I think he was actually asking for:
I think that's pretty clear: the only slightly tricksy bit is the use of the partially-applied function
It doesn't look much like the previous version, but it works in exactly the same way: we iterate over all possible combinations of victim numbers, and check each one (using
Now, let's look at a more efficient approach.
The key to solving this problem efficiently is to realise that choosing a "first move" leaves us with a smaller example of the same type of problem. Let's suppose, as above, that Wales scored 67 points, and let's suppose that we've heard that they scored four converted tries. To work out the full space of possibilities, we just need to calculate all the ways of making 67 - 4*7 = 39 points with unconverted tries and penalties. But that's also a change-making problem! This means we can solve the problem recursively, and that's what I thought Chris was asking about. If we only have one denomination left to play with, then there's at most one way to solve the problem: return the list containing only
Here's the code that we wrote to do this:
[One thing I didn't mention - though I can't remember the exact values in Chris' original problem, the death-price for a king really was 10.5 cows. Hence our program had to handle fractional denominations, which makes it a lot uglier than it would otherwise need to be - OCaml's lack of polymorphism means that it needs separate operators (called
A refinement of this technique is to memoize the recursive function - this typically helps a lot, because you need to solve many of the same inner subproblems again and again. If you use the memoized version, the technique's called dynamic programming (a name chosen purely because it sounded impressive), which is a useful technique in all kinds of search applications. Some quick experiments indicate that it's not a win in this case, though - the memory requirements grow too rapidly.
Could we have written the recursive version in Perl? Sure:
[This version, by the way, was easily the most fun to write.]
I think it's interesting how different the equivalent Perl and OCaml versions look in this case: they're about the same length, but the Perl version is one subroutine, and the Caml version's divided into several smaller routines. This is because I find long subroutines in functional languages to be devilishly hard to write correctly - splitting expressions out into their own top-level functions makes it easier to ask the compiler what types it's inferring for them.
All suggestions for improvement, as ever, gratefully accepted :-)
In early Irish law, murder was punishable by a fine dependent on the status of the victim. Ten-and-a-half cows per king murdered (no, really!), five for a major nobleman, and so on. At one particular event, we know the total fine levied, but not the number of victims. Chris wanted to find out the range of possibilities. This is an example of the change-making problem: given small change of various denominations, how can you give your customer 73p? I most often encounter this problem with rugby scores: if Wales beat England 67 - 3 (I can dream...), then at seven points for a converted try, five for an unconverted try and three for a penalty, what might have happened?
Chris had already written a Perl program that looked very much like this:
#!/usr/bin/perl use warnings; use strict; my $target = 23; # or whatever it really was my $w_val = 10; my $x_val = 5; my $y_val = 2; my $z_val = 1; for my $w (0 .. $target / $w_val) { for my $x (0 .. $target/$x_val) { for my $y (0 .. $target / $y_val) { for my $z (0 .. $target / $z_val) { if ($w*$w_val + $x*$x_val + $y*$y_val + $z*$z_val == $target) { print "$w, $x, $y, $z\n"; } } } } }
There are a few things to criticise in this program: it searches a lot of possibilities it can rule out (once $w and $x are too big, it doesn't matter what $y and $z are), it repeats a lot of calculations in the inner loops (more on that later), and the number of different social classes is baked into the structure of the program - if the early Irish lawmakers ever decide to institute a new class of noble, we'd have to modify our code rather than just changing the values of some parameters. But it's simple, easy to read, obviously correct and easy to write.
Chris wanted to re-write this in OCaml, which is a (mostly) functional language, and iteration is considered poor style by functional devotees. Could I help him re-write it to use recursion instead?
At this point I'm afraid I seized on the word "recursion", because, you see, there's an elegant recursive solution to the change-making problem and I thought he was asking about that. Somewhere between my wanting to entirely re-write the program using a new algorithm, it being trickier to write than I'd expected, and my not actually knowing OCaml very well, I managed to completely confuse him. Sorry, Chris. Anyway, we'll come back to the first program we wrote in a bit; here's what I think he was actually asking for:
let rec range i j = if i > j then [] else i :: (range (i+1) j) let poss_range target value = range 0 (target/value) let concat_map f xs = List.concat (List.map f xs) let make_quad w x y z = (w,x,y,z) let is_good target quad = match quad with (w,x,y,z) -> (w * 10 + x * 5 + y * 2 + z * 1) = target let ways target = List.filter (is_good target) (concat_map (fun w -> concat_map (fun x -> concat_map (fun y -> List.map (make_quad w x y) (poss_range target 1)) (poss_range target 2)) (poss_range target 5)) (poss_range target 10))
I think that's pretty clear: the only slightly tricksy bit is the use of the partially-applied function
make_quad w x y
. It's a couple of lines shorter than the original Perl version, but suffers from all of its flaws. It would be fairly easy to reduce the size of the search space - change the first argument passed to poss_range
- but it's more interesting to fix the restriction to a fixed set of denominations. Here's an attempt to do that:(* General utilities *) let rec range i j = if i > j then [] else i :: (range (i+1) j) let cons x xs = x::xs let singleton x = [x] let sum = List.fold_left (fun x y -> x + y) 0 (* dot product - x1*y1 + x2*y2 + x3*y3 + ... *) let dot xs ys = sum (List.map2 (fun x y -> x * y) xs ys) (* finite choice functions - ways of choosing one elt from each list *) let rec choices xss = match xss with [] -> [] | [xs] -> List.map singleton xs | xs::rest -> List.concat (List.map (fun x -> List.map (cons x) (choices rest)) xs) (* functions specific to the problem at hand *) let ranges target values = List.map (fun x -> range 0 (target / x)) values let candidates target values = choices (ranges target values) let isgood target values xs = (dot values xs) = target let ways target values = List.filter (isgood target values) (candidates target values)
It doesn't look much like the previous version, but it works in exactly the same way: we iterate over all possible combinations of victim numbers, and check each one (using
List.filter
) to see if it gives the desired result. The meat of the program is the function choices
, which generates finite choice functions. Given a list of lists [xs1;xs2;xs3;xs4;...] it will produce all lists whose first element is chosen from xs1, whose second element is chosen from xs2, etc. I believe it's considered poor style to use naked recursion like that (as opposed to using a higher-order function like List.map
or List.fold_left
), but frankly that function was the hardest thing I had to write in this whole exercise, and I was happy just to get something that worked.Now, let's look at a more efficient approach.
The key to solving this problem efficiently is to realise that choosing a "first move" leaves us with a smaller example of the same type of problem. Let's suppose, as above, that Wales scored 67 points, and let's suppose that we've heard that they scored four converted tries. To work out the full space of possibilities, we just need to calculate all the ways of making 67 - 4*7 = 39 points with unconverted tries and penalties. But that's also a change-making problem! This means we can solve the problem recursively, and that's what I thought Chris was asking about. If we only have one denomination left to play with, then there's at most one way to solve the problem: return the list containing only
target/value
if value
divides target
evenly or the empty list (to indicate "no solutions") if it doesn't. If we have more than one denomination, we make all possible choices of "first move", and then recursively solve all the new, smaller problems that each choice presents us with.Here's the code that we wrote to do this:
(* things that are probably in a standard library, if only I knew where to look for them *) let rec range i j = if i > j then [] else i :: (range (i+1) j) let cons x xs = x::xs let fuzz = 0.00001 (* from http://www.podval.org/~sds/ocaml-sucks.html *) let round x = int_of_float (floor (x +. 0.5)) let float_eq x y = abs_float (x -. y) <= fuzz let divides n d = let q = n /. d in float_eq q (float_of_int (truncate q)) (* the actual meat of the code *) let rec ways ps tgt = match ps with [] -> [] | [p] -> if (divides tgt p && tgt >= 0.) then [[round (tgt /. p)]] else [] | p::rest -> List.concat (List.map (helper p rest tgt) (range 0 (round (tgt /.p)))) and helper p ps tgt n = let ws = ways ps (tgt -. (float_of_int n) *. p) in List.map (cons n) ws (* test code *) let values = [10.;5.;2.;1.] let target = 23. let answers = ways values target let add x y = x +. y let dot xs ys = List.fold_left add 0. (List.map2 (fun x y -> x *. (float_of_int y)) xs ys) let sums = List.map (dot values) answers let errors = List.length (List.find_all (fun x -> not (float_eq x target)) sums) let _ = print_endline ((string_of_int errors) ^ " errors.")
[One thing I didn't mention - though I can't remember the exact values in Chris' original problem, the death-price for a king really was 10.5 cows. Hence our program had to handle fractional denominations, which makes it a lot uglier than it would otherwise need to be - OCaml's lack of polymorphism means that it needs separate operators (called
+.
, *.
etc.) for floating-point numbers, and we need to strew conversion functions about the place.]A refinement of this technique is to memoize the recursive function - this typically helps a lot, because you need to solve many of the same inner subproblems again and again. If you use the memoized version, the technique's called dynamic programming (a name chosen purely because it sounded impressive), which is a useful technique in all kinds of search applications. Some quick experiments indicate that it's not a win in this case, though - the memory requirements grow too rapidly.
Could we have written the recursive version in Perl? Sure:
#!/usr/bin/perl use strict; use warnings; sub ways { my $target = shift; my $first = shift; if (scalar(@_) == 0 && $target >= 0 && ($target % $first == 0)) { return ([$target/$first]); } my @ways; for my $move (0 .. ($target/$first)) { push @ways, [$move, @$_] foreach ways($target - $move * $first, @_); } return @ways; } foreach (ways(23, 10,5,2,1)) { print join ",", @$_)."\n" };
[This version, by the way, was easily the most fun to write.]
I think it's interesting how different the equivalent Perl and OCaml versions look in this case: they're about the same length, but the Perl version is one subroutine, and the Caml version's divided into several smaller routines. This is because I find long subroutines in functional languages to be devilishly hard to write correctly - splitting expressions out into their own top-level functions makes it easier to ask the compiler what types it's inferring for them.
All suggestions for improvement, as ever, gratefully accepted :-)
Tags:
no subject
no subject
I'm not surprised - I only posted it a minute ago!
no subject
I'm far from being an expert on the language, and I'm probably violating all kinds of style guidelines and missing out on helpful features and libraries that would have made my life easier (for instance, there must be an easier way of passing operators as arguments to higher-order functions!). Hopefully some real experts will drop in and enlighten us all.
no subject
no subject
(::)
gives me a syntax error. Still, I could clean up the definition ofsum
.no subject
no subject
[For those following along at home: it doesn't compile as
printf.sprintf
toPrintf.sprintf
andprintf.printf
toPrintf.printf
.]no subject
no subject
Just a few nit-picks
The fine amount is given in cumals which is Old Irish for "female slave" so if you want to calculate the amount in bó mlicht "milch cow" then it is 3 milch cows for 1 cumal (female slave) so a king is 7 cumals or 21 milch cows.
Like I said, a nit-pick and a rather interesting look in to the early Irish monetary system.
Re: Just a few nit-picks
no subject
This is an optimization: for clarity, you might omit it, and let "no denominations left" be your own only base case.
I don't have OCaml lying around, but here's a first pass at the problem in Lisp, which should be similar, if slightly more verbose. Note also how I've avoided the need to do a "map" by building up a partial solution in the arguments to the recursive calls. To an experienced Lisper, the presence of append in there is a bit of a red flag that there may be unnecessary allocation and list scanning going on. It's straightforward to eliminate one of the calls to append by putting even more of the solution into the arguments: This also has the benefit of putting one of the recursive calls into tail position, which will help the optimizer. Your mileage may differ as to whether the increase in speed is worth the decrease in readability.
We could eliminate the other call to append if we were willing to get reversed solutions (i.e. with the counts in opposite order to the coins list). I guess this depends on what the output is going to be used for.
no subject
no subject
no subject
count
is IMHO rather elegant.Thanks for the reference - I'll try to dig up a copy.
no subject
Copy of the article
One last thing. If anyone wants to have a look at the original article, I have a copy on my Informatics website (http://homepages.inf.ed.ac.uk/v1cyocum/) under the title of "Early Irish Law, Annals, and Computer Science". It is not finished but most of the information is there. I need to clean up the actual computer science section due to finding the actual algorithm but the code is correct and the output is correct.
no subject
no subject
no subject
nub $ map sort
instead ofnub . sort
.[BTW, I frequently found myself missing the
$
operator when writing OCaml. It reduces the profusion of brackets so effectively. Does anyone know if there's an OCaml operator that does the same?]no subject
no subject
filter (\lst -> n == foldr (\s (k,d) -> s+k*d) lst)
in that pipeline too - I'm realizing that I probably don't guarantee summability to full ...
actually ...
wait ...
DAMN! Yes, I deal with the case "This doesn't sum to the right number" with my case handling the parameters _ [].
Impressed by Haskell yet again.
no subject
no subject
no subject
A different Perl implementation
This doesn’t pay any attention to the algorithmic problems with the original, but it does treat the denominations as data rather than code; and while it’s not quite as easy to read as Chris’s version, it was easy enough to write that it worked first time.
To take the other problems with the original:
make_loop
merely needs to bail out if the running total has gone over the limit.I confess I’ve never been very interested in purity questions relating to iteration versus recursion, or functional versus imperative algorithms. This approach is clearly imperative, in the sense that it relies on updatable variables. But the variables in question are always local to a particular function (so, except for emitting output, the functions seem pure when examined from the outside); and this sorted of nested-generated-closure approach is fairly FP-ish anyway.
Re: A different Perl implementation
And that implementation is messing with my head :-)
Re: A different Perl implementation
Quite.
It’s perhaps easier to grok if you look at this first:
Or at least, that was the first thing I wrote down; the rest was just making it work. It really is the same algorithm as Chris’s, but using programmatically-generated nested closures (with a single loop per closure) in place of nested loops.
Re: A different Perl implementation
Re: A different Perl implementation
And apologies for spamming your LJ with my code, but this version is an improvement in two ways:
Re: A different Perl implementation
Re: A different Perl implementation
$inner
got set, because I somehow hadn't noticed thatmake_loop
is called once for every element of@denominations
, with the previously-generated loop as its second parameter.*smacks forehead*
A completely different approach
So, I was thinking more about this problem. If you consider a much smaller variant, with a target of 2, and just a 1-unit coin, then you end up considering the possibilities {[0], [1], [2]}. If you have the same target, and denominations 1 and 2, you consider {[0,0], [0,1], [0,2], [1,0], [1,1], [1,2]} (modulo the possibility that some candidates get optimised out early). This starts to look a lot like a counting problem, but with mixed-unit bases.
So, to go back to the target=23, denominations={1, 2, 5, 10} problem: we know that any answer will have a maximum of 2 tens, 4 fives, 11 twos, and 23 ones; (2+1)×(4+1)×(11+1)×(23+1) = 4320, so we need to consider no more than 4320 cases. Let’s assign a non-negative integer to each of those cases, so that [0 tens, 0 fives, 0 twos, 0 ones] is 0; [1 ten, 2 fives, 3 twos, 4 ones] is 4×1 + 3×24 + 2×12×24 + 1×5×12×24; and so on up to the maximum of 23×1 + 11×24 + 4×12×24 + 2×5×12×24 = 4319. Now iterating over the candidates can be as simple as counting from 0 to 4319; no recursion needed at all.
In practice, however, implementing the arithmetic manually over a list of “digits” is probably easier; at the least, it simplifies implementation of the optimisation that avoids adding more of the current denomination of coin when we’ve already exceeded the target. So the code then looks like this:
It’s also really obvious how you’d refactor this to use an explicit iterator in languages without generators or lazy lists or the like, in case generating all the solutions at once seems like the wrong thing.
Re: A completely different approach
Just one more version. The problem with that as written is that it actually runs substantially slower the nested-closure version. Profiling says that the biggest problem is the sum-of-pairwise-multiplication step; that seems plausible, because Perl’s subroutine calls are relatively slow. Refactoring to keep a running total fixes that. For small problems, the nested-closure implementation wins, but for big enough problems, the running-total version has the edge (and that edge increases with the size of the problem). For context, on my laptop, “small” problems (with denominations 1,2,5,10) are those with a target up to about 41 or 42.