pozorvlak: (Default)
Tuesday, July 26th, 2011 10:31 am
Back in 2003 when I was working for $hateful_defence_contractor, we were dealing with a lot of quantities expressed in dB. Occasionally there was a need to add these things - no, I can't remember why. Total power output from multiple sources, or something. Everyone cursed about this. So I wrote a desk calculator script, along these lines:
#!/usr/bin/perl

print "> ";
while (<>) {
   s|-?\d+(\.\d+)?([eE][-+]?\d+)?|10**($&/10)|oeg;
   print 10*log(eval $_)/log(10)."\n> ";
}
I've always thought of this as The Most Evil Code I've Ever Written. For those of you who don't speak fluent regex, it reads a line of input from the user, interprets everything that looks like a number as a number of decibels, replaces each decibel-number with the non-decibel equivalent, evaluates the resulting string as Perl code, and then converts the result back into decibels. Here's an example session:
> 1 + 1
4.01029995663982
> 10 * 10 
20
> cos(1)
-5.13088257108395
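The arithmetic behind the first line of that session can be checked without any eval. A minimal Python sketch (the function names are illustrative, not taken from the original script): convert each decibel value to a linear power ratio, add the ratios, and convert the sum back.

```python
import math

def db_to_linear(db):
    """Convert decibels to a linear power ratio."""
    return 10 ** (db / 10)

def linear_to_db(ratio):
    """Convert a linear power ratio back to decibels."""
    return 10 * math.log10(ratio)

def add_db(*levels):
    """Add several decibel quantities by summing in the linear domain."""
    return linear_to_db(sum(db_to_linear(x) for x in levels))

print(add_db(1, 1))  # roughly 4.0103, as in the session above
```

Adding two equal sources raises the level by 10*log10(2), about 3.01 dB, which is why "1 + 1" comes out at just over 4.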
Some of you are no doubt thinking "Well of course that code's evil, it's written in Perl!" But no. Here's the same algorithm written in Python:
#!/usr/bin/python -u
import re, math, sys

def repl(match):
        num = float(match.group(0))
        return str(10**(num/10))

number = re.compile(r'-?\d+(\.\d+)?([eE][-+]?\d+)?')
while 1:
        line = sys.stdin.readline()
        if len(line) == 0:
                break
        line = re.sub(number, repl, line)
        print 10*math.log10(eval(line))
If anything, the Perl version is simpler and has lower accidental complexity. If Perl's not the best imaginable language for expressing that algorithm, it's damn close.

[I also tried to write a Haskell version using System.Eval.Haskell, but I got undefined reference to `__stginit_pluginszm1zi5zi1zi4_SystemziEvalziHaskell_' errors, suggesting my installation of cabal is b0rked. Anyone know what I need to do to fix it? Also, I'm sure my Python style can be greatly improved - suggestions most welcome.]
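
For readers on Python 3, here is one possible port of the same algorithm, offered as a sketch rather than a definitive version. The name `eval_db` is made up for illustration, and passing the `math` module's namespace to `eval` (so that input like `cos(1)` works) is an assumption that goes beyond the Python version above.

```python
import math
import re

# Same pattern as the scripts above: an optionally signed decimal number.
NUMBER = re.compile(r'-?\d+(\.\d+)?([eE][-+]?\d+)?')

def eval_db(line):
    """Treat every number in `line` as decibels; return the result in dB."""
    # Replace each decibel value with its linear equivalent.
    linear = NUMBER.sub(lambda m: repr(10 ** (float(m.group(0)) / 10)), line)
    # Assumption: expose the math functions and hide the builtins.
    # This is a convenience, not a security sandbox.
    return 10 * math.log10(eval(linear, {"__builtins__": {}}, vars(math)))

def main(stream):
    """Read expressions line by line, e.g. main(sys.stdin)."""
    for line in stream:
        print(eval_db(line))
```

Like the originals, this blindly rewrites anything that looks like a number, so a function name containing digits (`log10`, say) would be mangled before evaluation.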

No, I thought of it as evil because it's doing an ugly thing: superficially altering code with regexes and then using string eval. And who the hell adds decibels, anyway?

Needless to say, it was the most successful piece of code I wrote in the year I spent in that job.

I was talking about string eval to Aaron Crane the other day, and I mentioned this program. His response surprised me:
I disagree; I think it’s a lovely piece of code. It may not be a beautiful jewel of elegant abstractions for a complex data model, true. But it’s small, simple, trivial to write, works on anything with a Perl interpreter (of pretty much any vintage, and with no additional dependencies), and clearly correct once you’ve thought about the nature of the arithmetic you’re doing. While it’s not something you’d ship as part of a safety-critical system, for example, I can’t see any way it could be realistically improved as an internal tool, aimed at users who are aware of its potential limitations.
[Full disclosure: the Perl version above didn't work first time. But the bug was quickly found, and it did work the second time :-)]

The lack of external dependencies (also a virtue of the Python version, which depends only on core modules) was very much intentional: I wrote my program so it could be trivially distributed (by samizdat, if necessary). Most of my colleagues weren't Perl programmers, and if I'd said "First, install Regexp::Common from CPAN...", I'd have lost half my potential audience. As it was, the tool was enthusiastically adopted.

So, what do you all think? Is it evil or lovely? Or both? And what's the most evil piece of code that you've written?

Edit: Aaron also pointed me at this program, which is both lovely and evil in a similar way. If you don't understand what's going on, type
perl -e 'print join q[{,-,+}], 1..9'
and
perl -e 'print glob "1{,-,+}2"'
at the command-line.
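The first command prints the digits 1 to 9 joined by the string `{,-,+}`, and the second shows glob's brace expansion turning `1{,-,+}2` into `12`, `1-2` and `1+2`. For anyone without Perl to hand, the same expansion can be sketched in Python; `expand` is a hypothetical helper, not part of the program Aaron linked to.

```python
from itertools import product

def expand(digits, ops=("", "-", "+")):
    """Every string formed by putting one of `ops` between adjacent digits."""
    results = []
    for choice in product(ops, repeat=len(digits) - 1):
        pieces = [digits[0]]
        for op, digit in zip(choice, digits[1:]):
            pieces.append(op + digit)
        results.append("".join(pieces))
    return results

print(expand("12"))               # ['12', '1-2', '1+2']
print(len(expand("123456789")))   # 6561, i.e. 3**8
```

Each of the eight gaps between nine digits gets one of three fillers, hence 3**8 candidate expressions.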
pozorvlak: (Default)
Tuesday, April 5th, 2011 01:06 pm
Here are some bits of code I've released recently:

UK mountain weather forecast aggregator


The Mountain Weather Information Service do an excellent job, providing weather forecasts for all the mountain areas in the UK - most weather forecast sites only give forecasts for inhabited areas, and the weather at sea level often differs in interesting ways from the nearby weather at 1000m. However, their site's usability could be better. They assume that you're already in an area and want to know what the weather's going to be like for the next couple of days¹, but it's more normal for me to know what day I'm free to go hillwalking, and to want to know where I'll get the best weather.

So I decided to write a screen-scraper to gather and collate the information for me. I'd heard great things about Python's BeautifulSoup library and its ability to make sense of non-compliant, real-world HTML, so this seemed like a great excuse to try it out; unfortunately, BeautifulSoup completely failed me, only returning the head of the relevant pages. Fortunately, Afternoon and [livejournal.com profile] ciphergoth were on hand with Python advice; they told me that BeautifulSoup is now largely deprecated in favour of lxml. This proved much better: now all I needed to handle was the (lack of) structure of the pages...

There's a live copy running at mwis.assyrian.org.uk; you can download the source code from GitHub. There are a bunch of improvements that could be made to this code:
  1. The speed isn't too bad, but it could be faster. An obvious improvement is to stop doing eight HTTP GETs in series!
  2. There's no API.
  3. Your geographic options are limited: either the whole UK, or England & Wales, or Scotland. Here in the Central Belt, I'm closer to the English Lake District than I am to the North-West Highlands.
  4. The page design is fugly, or rather, severely functional. Any design experts wanna suggest improvements? Readability on mobile devices is a major bonus.
  5. MWIS is dependent on sponsorship for their website-running costs, and for the English and Welsh forecasts. I don't want to take bread out of their mouths, so I should probably add yet more heuristics to the scraper to pull out the "please visit our sponsors" links.
  6. Currently all HTML is generated with raw print statements. It would be nicer to use a templating engine of some sort.
A possible solution to (1) and (2) is to move the scraper itself to ScraperWiki, and replace my existing CGI script with some JavaScript that pulls JSON from ScraperWiki and renders it. Anyway, if anyone feels like implementing any of these features for me, I'll gratefully accept your patches :-)
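
Short of moving to ScraperWiki, item (1) could also be tackled by issuing the GETs concurrently. A minimal sketch using only the standard library, assuming the scraper has a list of page URLs to hand; `fetch` and `fetch_all` are illustrative names and do not come from the actual source code.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url):
    """Download one page and return its body as bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()

def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch every URL concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetcher, urls))
```

Because the work is dominated by waiting on the network, threads are enough here; the `fetcher` parameter exists so the function can be exercised without touching the real site.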

git-deploy


While I was developing the MWIS scraper, I found it was annoying to push to GitHub and then ssh to my host (or rather, switch to a window in which I'd already ssh'ed to my host) and pull my changes. So I wrote the World's Simplest Deployment Script. I've been finding it really useful, and you're welcome to use it yourself.

[In darcs, of course, one would just push to two different repos. Git doesn't really like you pushing to non-bare repositories, so this isn't such a great idea. If you want to know what an industrial-strength deployment setup would look like, I suggest you read this post about the continuous deployment setup at IMVU.]
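
The push-then-pull routine described above fits in a few lines. The following is a guess at the general shape of such a script, not the released git-deploy itself; the host, path, remote and branch names are all placeholders.

```python
import shlex
import subprocess

def deploy_commands(host, path, remote="origin", branch="master"):
    """Build the two commands: push locally, then pull on the server."""
    remote_step = f"cd {shlex.quote(path)} && git pull {remote} {branch}"
    return [
        ["git", "push", remote, branch],
        ["ssh", host, remote_step],
    ]

def deploy(host, path, **kwargs):
    """Run the commands in order, stopping at the first failure."""
    for command in deploy_commands(host, path, **kwargs):
        subprocess.run(command, check=True)
```

Calling `deploy("example.org", "/srv/app")` would push the current repository and then pull it in `/srv/app` on the (hypothetical) server.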

bfcc - BrainF*** to C compiler


I was on the train, looking through the examples/ directory in the LLVM source tree, and noticed the example BrainF*** front-end. For some reason, it hadn't previously occurred to me quite how simple it would be to write a BF compiler. So I started coding, and had one working by the time I got back to Glasgow (which may sound a long time, but I was on my way back from an Edinburgh.pm meeting and was thus somewhat drunk). You can get it here. [livejournal.com profile] aaroncrane suggested a neat hack to provide O(1) arithmetic under certain circumstances: I should add this, so I can claim to have written an optimising BF compiler :-)
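
To show quite how simple such a compiler can be, here is a sketch of the usual approach in Python: each of the eight BF commands maps to one C statement. This is not bfcc's code; the 30000-cell tape and the treatment of end-of-input are conventional choices assumed for the example.

```python
# One C statement per BF command.
C_FOR = {
    ">": "++p;",
    "<": "--p;",
    "+": "++*p;",
    "-": "--*p;",
    ".": "putchar(*p);",
    ",": "*p = getchar();",
    "[": "while (*p) {",
    "]": "}",
}

def compile_bf(source):
    """Translate BF source into the text of a C program."""
    lines = [
        "#include <stdio.h>",
        "static unsigned char tape[30000];",
        "int main(void) {",
        "    unsigned char *p = tape;",
    ]
    depth = 1  # current indentation level inside main()
    for ch in source:
        if ch not in C_FOR:
            continue  # any other character is a comment in BF
        if ch == "]":
            depth -= 1
            if depth < 1:
                raise SyntaxError("unmatched ]")
        lines.append("    " * depth + C_FOR[ch])
        if ch == "[":
            depth += 1
    if depth != 1:
        raise SyntaxError("unmatched [")
    lines += ["    return 0;", "}"]
    return "\n".join(lines) + "\n"
```

Feeding the output to a C compiler gives a native executable; an optimising version would start by collapsing runs of `+` or `>` into a single statement.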



All of these programs are open source: share and enjoy. They're all pretty much trivial, but I reckon that creating and releasing something trivial is a great improvement over creating or releasing nothing.

¹ Great Britain is a small, mountainous island on the edge of the North Atlantic. Long-term weather forecasting is a lost cause here.
pozorvlak: (polar bear)
Thursday, May 29th, 2008 10:50 am
This post is a mea culpa, an admission of failure: if any of you have read my stuff thinking that I know what I'm talking about when it comes to programming, this is your chance to recalibrate. I'm posting in the hopes that by writing this particular ultra-basic technique down in public, I'll never ever forget about it.

[The full, working version of PS3 Remuxatron (by Mat Brown, GPLed) is here.]
pozorvlak: (sceince)
Wednesday, May 14th, 2008 01:15 pm
Those of you who do scientific computation may be interested in this link:

Bye Matlab, hello Python, thanks Sage.

Short version: Sage is a bundle of all of the various numeric/scientific/graphing tools for Python, made easy to download and install (even if you don't have root access, apparently). According to the blogger linked above, it's now good enough to serve as a replacement for Matlab (and it has ambitions to be a replacement for Maple and Mathematica too, though I don't know how far along it is). It integrates with R, Gap, etc. And because it's Python, you get a real, high-level, well-designed programming language with a proper environment to do your coding in.

In other news, [livejournal.com profile] wormwood_pearl finished her Finals yesterday. As you can imagine, we're both pretty relieved. There's no tradition of meeting people outside Finals here like there is in Oxford, but I went along with a bottle of champagne anyway - and only then remembered that public drinking is against Glasgow bylaws. Drat it. We went out to lunch, the bottle went into the Department fridge, and we subsequently went round to her sisters' and drank it there while watching Layer Cake. Because we're that exciting.
pozorvlak: (Default)
Sunday, August 19th, 2007 12:09 am
Here's something that occurred to me the other day: consider duck-typed languages like Python [...]

OK, so far so standard. Now, most duck-typed languages are dynamic, which means that we only try to determine if bar has a spoffle method at runtime, and die with an error message when it doesn't (possibly after trying some error recovery, e.g. by calling an AUTOLOAD method or similar). But it occurred to me that in simple cases (which is to say, the majority), we could work out most of this stuff statically. For each function definition, see what methods are called on its arguments. Recurse into functions called from that function. Now, each time a function is called, try to work out what methods the arguments will support, and see if that includes the interface required by the function. Issue an error if not. Thus we get the code reuse benefits of duck typing, and the correctness benefits of static checking. If the static checking is slowing down your development cycle too much, drop back to fully dynamic checking, and only run the static checks on your nightly builds or something.
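
The first step, "see what methods are called on its arguments", can be sketched with Python's `ast` module. This is only an illustration of the idea under simplifying assumptions: it looks at direct calls of the form `param.method(...)`, does not recurse into called functions, and ignores reassignment. The function name and the `quux` method are invented for the example.

```python
import ast

def required_methods(source):
    """Map each function to {parameter: set of methods called on it}."""
    requirements = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        params = {arg.arg: set() for arg in node.args.args}
        for inner in ast.walk(node):
            # Look for calls of the form  param.method(...)
            if (isinstance(inner, ast.Call)
                    and isinstance(inner.func, ast.Attribute)
                    and isinstance(inner.func.value, ast.Name)
                    and inner.func.value.id in params):
                params[inner.func.value.id].add(inner.func.attr)
        requirements[node.name] = params
    return requirements

example = """
def foo(bar):
    bar.spoffle()
    return bar.quux(1)
"""
print(required_methods(example))
```

The result says that whatever is passed to `foo` as `bar` needs `spoffle` and `quux` methods, which is the interface a checker would then compare against each call site.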

This also cleared up something else I'd been vaguely wondering about. In his splendid Drunken Blog Rant Rich Programmer Food, Steve Yegge says
Another problem is that they believe any type "error", no matter how insignificant it might be to the operation of your personal program at this particular moment, should be treated as a news item worthy of the Wall Street Journal front page. Everyone should throw down their ploughshares and stop working until it's fixed. The concept of a type "warning" never enters the discussion.
I'd wondered at the time what a type warning would mean. When is a type error mild enough to only warrant a warning? Here's one idea: a type warning should be issued when it cannot be proved that an argument to a function implements the interface that function needs; a type error should be issued when it can be proved that it doesn't.
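
That rule is small enough to write down directly. In this sketch, `required` is the set of methods a function calls on its argument and `provided` is what the checker knows about the value being passed, with `None` standing for "could not work it out"; both the representation and the name `check_argument` are assumptions made for illustration.

```python
def check_argument(required, provided):
    """Classify one argument at one call site as ok, warning or error."""
    if not required:
        return "ok"        # the function asks nothing of this argument
    if provided is None:
        return "warning"   # cannot prove the interface is implemented
    if required - provided:
        return "error"     # provably missing at least one method
    return "ok"

print(check_argument({"spoffle"}, {"spoffle", "quux"}))  # ok
print(check_argument({"spoffle"}, None))                 # warning
print(check_argument({"spoffle"}, {"quux"}))             # error
```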

This all seemed very interesting, and struck me as potentially a fun and reasonably easy hacking project, at least to get something workable going. But if it's occurred to me, it has probably occurred to someone else, so I asked the Mythical Perfect Haskell Programmer if he was aware of any work that had been done on static duck-typed languages. "Oh yes," he said, "O'Caml's one." Buhuh? Really? Well, that's O'Caml moved a couple of rungs up my "cool languages to learn" stack...
pozorvlak: (kittin)
Thursday, July 12th, 2007 12:47 pm
Several of my fellow PhD students here have to do a substantial amount of programming as part of their PhDs: unfortunately, most of them haven't done any programming before. The usual procedure, alas, is to hand them a Fortran compiler and tell them to get on with it, often hacking on a large mass of code written by someone else who was "taught" the same way. I try to do what I can to help, but there's a limit to the amount of time I can devote to someone else's project (and a limit to the amount of time they'd want me to devote, I suspect). But still, I see some horror stories: yesterday, for instance, an office-mate finally tracked down a bug due to a magic number which had been bothering her for longer than she cared to say, and which had been seriously undermining her confidence in her actual model. Not using magic numbers is basic programming practice, but nobody had told her this.
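
As a reminder of what the magic-number problem looks like, here is an invented Python example (nothing to do with my office-mate's actual code): the same ideal gas calculation, first with an unexplained literal and then with a named constant that records its meaning and units.

```python
# With a magic number: what is 0.0821, and in which units?
def pressure_magic(moles, temperature, volume):
    return moles * 0.0821 * temperature / volume

# With a named constant: the meaning and units are stated once.
GAS_CONSTANT_L_ATM_PER_MOL_K = 0.082057  # ideal gas constant R

def pressure(moles, temperature_k, volume_l):
    """Ideal gas law, P = nRT / V, with the result in atmospheres."""
    return moles * GAS_CONSTANT_L_ATM_PER_MOL_K * temperature_k / volume_l
```

When the constant needs more precision or different units, the second version has exactly one place to change.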

So I've been thinking about an introductory course on programming aimed at maths/science grad students. The emphasis would be on writing maintainable code and modern programming practices: modularity, use of libraries wherever possible, test-first programming, use of debuggers, source control systems and profilers, optimising later (if you have to at all), use of high-level languages, documentation, and so on. My real aim would be to break the cycle of abuse whereby each new generation of grad students is told to write 1000-line-to-a-function, opaque, untested, copy-and-paste Fortran by their supervisors, because it was good enough for their supervisors, and on and on...

Here's a first cut at a course catalogue entry for this fantasy course: I'd be very interested to hear everyone's comments.

Practical Computer Programming for Scientists