Tuesday, April 5th, 2011 01:06 pm
Here are some bits of code I've released recently:

UK mountain weather forecast aggregator


The Mountain Weather Information Service do an excellent job, providing weather forecasts for all the mountain areas in the UK - most weather forecast sites only give forecasts for inhabited areas, and the weather at sea level often differs in interesting ways from the nearby weather at 1000m. However, their site's usability could be better. They assume that you're already in an area and want to know what the weather's going to be like for the next couple of days¹, but it's more normal for me to know what day I'm free to go hillwalking, and to want to know where I'll get the best weather.

So I decided to write a screen-scraper to gather and collate the information for me. I'd heard great things about Python's BeautifulSoup library and its ability to make sense of non-compliant, real-world HTML, so this seemed like a great excuse to try it out; unfortunately, BeautifulSoup completely failed me, only returning the head of the relevant pages. Fortunately, Afternoon and [livejournal.com profile] ciphergoth were on hand with Python advice; they told me that BeautifulSoup is now largely deprecated in favour of lxml. This proved much better: now all I needed to handle was the (lack of) structure of the pages...

There's a live copy running at mwis.assyrian.org.uk; you can download the source code from GitHub. There are a bunch of improvements that could be made to this code:
  1. The speed isn't too bad, but it could be faster. An obvious improvement is to stop doing eight HTTP GETs in series!
  2. There's no API.
  3. Your geographic options are limited: either the whole UK, or England & Wales, or Scotland. Here in the Central Belt, I'm closer to the English Lake District than I am to the North-West Highlands.
  4. The page design is fugly, er, severely functional. Any design experts wanna suggest improvements? Readability on mobile devices is a major bonus.
  5. MWIS is dependent on sponsorship for their website-running costs, and for the English and Welsh forecasts. I don't want to take bread out of their mouths, so I should probably add yet more heuristics to the scraper to pull out the "please visit our sponsors" links.
  6. Currently all HTML is generated with raw print statements. It would be nicer to use a templating engine of some sort.
A possible solution to (1) and (2) is to move the scraper itself to ScraperWiki, and replace my existing CGI script with some JavaScript that pulls JSON from ScraperWiki and renders it. Anyway, if anyone feels like implementing any of these features for me, I'll gratefully accept your patches :-)
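
For what it's worth, improvement (1) could be sketched roughly like this: a thread pool issues the GETs concurrently instead of one after another. This is an illustration and not the code in the repository; `fetch_url`, `fetch_all` and the `fetch` parameter are hypothetical names, and the real scraper may be structured quite differently.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_url(url, timeout=30):
    """Fetch one page and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_url, max_workers=8):
    """Fetch every URL concurrently and return a {url: body} dict.

    `fetch` is injectable so the function can be tried without a network.
    """
    urls = list(urls)
    if not urls:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its body.
        return dict(zip(urls, pool.map(fetch, urls)))
```

Threads are a reasonable fit here because the work is almost entirely waiting on the network; with eight pages and eight workers, the total time should be close to that of the slowest single request.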

git-deploy


While I was developing the MWIS scraper, I found it was annoying to push to GitHub and then ssh to my host (or rather, switch to a window in which I'd already ssh'ed to my host) and pull my changes. So I wrote the World's Simplest Deployment Script. I've been finding it really useful, and you're welcome to use it yourself.

[In darcs, of course, one would just push to two different repos. Git doesn't really like you pushing to non-bare repositories, so this isn't such a great idea. If you want to know what an industrial-strength deployment setup would look like, I suggest you read this post about the continuous deployment setup at IMVU.]
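
For the curious, the push-then-pull cycle that a script like this automates might look something like the following. It's a sketch of the idea, not the contents of git-deploy: the function names, the host and the remote path are all made up, and it assumes the host already has a clone with the same remote configured.

```python
import shlex
import subprocess


def deploy_commands(host, remote_dir, remote="origin", branch="master"):
    """Return the two commands: push locally, then pull on the host over ssh."""
    push = ["git", "push", remote, branch]
    # Quote each piece, because ssh hands the string to a shell on the host.
    remote_cmd = "cd {} && git pull {} {}".format(
        shlex.quote(remote_dir), shlex.quote(remote), shlex.quote(branch))
    pull = ["ssh", host, remote_cmd]
    return [push, pull]


def deploy(host, remote_dir, run=subprocess.check_call, **kwargs):
    """Run each command in turn.

    check_call raises on a non-zero exit, so a failed push stops the deploy
    before anything is pulled on the host.
    """
    for command in deploy_commands(host, remote_dir, **kwargs):
        run(command)
```

A call such as `deploy("example.org", "/srv/www/mwis")` would then do both steps from one terminal.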

bfcc - BrainF*** to C compiler


I was on the train, looking through the examples/ directory in the LLVM source tree, and noticed the example BrainF*** front-end. For some reason, it hadn't previously occurred to me quite how simple it would be to write a BF compiler. So I started coding, and had one working by the time I got back to Glasgow (which may sound a long time, but I was on my way back from an Edinburgh.pm meeting and was thus somewhat drunk). You can get it here. [livejournal.com profile] aaroncrane suggested a neat hack to provide O(1) arithmetic under certain circumstances: I should add this, so I can claim to have written an optimising BF compiler :-)
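
To show quite how simple it is, here is a minimal BF-to-C translator sketched in Python. It isn't bfcc's source, and the names (`bf_to_c`, `SIMPLE`, `tape_size`) are invented for the example. It folds runs of `+`, `-`, `<` and `>` into a single C statement, which is one common way to get constant-time arithmetic; that may or may not be the hack suggested above.

```python
import re

# C statement for each BF command that is not run-length folded.
SIMPLE = {
    ".": "putchar(*p);",
    ",": "*p = getchar();",
    "[": "while (*p) {",
    "]": "}",
}
# Statement templates for the commands that are folded into one operation.
FOLDED = {"+": "*p += %d;", "-": "*p -= %d;", ">": "p += %d;", "<": "p -= %d;"}


def bf_to_c(source, tape_size=30000):
    """Translate BF source into the text of a C program."""
    code = re.sub(r"[^-+<>.,\[\]]", "", source)  # anything else is a comment
    lines = [
        "#include <stdio.h>",
        "",
        "static unsigned char tape[%d];" % tape_size,
        "",
        "int main(void) {",
        "    unsigned char *p = tape;",
    ]
    depth = 1  # current indentation level; also tracks bracket nesting
    for match in re.finditer(r"\++|-+|<+|>+|[.,\[\]]", code):
        run = match.group()
        op = run[0]
        if op == "]":
            depth -= 1
            if depth < 1:
                raise ValueError("unmatched ]")
        stmt = FOLDED[op] % len(run) if op in FOLDED else SIMPLE[op]
        lines.append("    " * depth + stmt)
        if op == "[":
            depth += 1
    if depth != 1:
        raise ValueError("unmatched [")
    lines += ["    return 0;", "}"]
    return "\n".join(lines) + "\n"
```

With folding, a run such as `+++` becomes the single statement `*p += 3;` rather than three increments. The sketch does no bounds checking on the tape, which matches BF tradition but not good C practice.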



All of these programs are open source: share and enjoy. They're all pretty much trivial, but I reckon that creating and releasing something trivial is a great improvement over creating or releasing nothing.

¹ Great Britain is a small, mountainous island on the edge of the North Atlantic. Long-term weather forecasting is a lost cause here.
Tuesday, April 5th, 2011 12:29 pm (UTC)
yes, use a templating engine! I recently re-worked some code to use "genshi" and felt foolish for not having done it that way in the first place; it makes life so much easier even for small programs.
Tuesday, April 5th, 2011 08:08 pm (UTC)
Cool! I created a github account yesterday - my colleagues and I have finally released an open source implementation of our API to a wondering^H^H^H^H^H^H^H^H^H^Hlargely indifferent world. :-)
Tuesday, April 5th, 2011 09:43 pm (UTC)
Good stuff!

I really like the way GitHub* has simplified the process of contributing to open-source projects. Find repo -> fork -> hack -> send pull request. Compare the hoops that GNU projects make you jump through.

* and probably BitBucket/Patch-Tag/Launchpad too, but I've never used them.
(Anonymous)
Tuesday, April 5th, 2011 09:41 pm (UTC)
Could be greatly improved by

(a) specifying a sans-serif font. oh god my eyes. why oh why do browsers default to such ugly text?

(b) making sub pages. Have the main page show links to individual locations, so on first load you see something like:

--- UK
-- Scotland
- Northwest Highlands
- West Highlands
- Cairngorms
- Southeastern Highlands
- Southern Uplands
-- England
- Lake District
- Snowdonia National Park
- Peak District and Yorkshire Dales

Where each line is a link to a page with just that forecast on. The England/Scotland/UK lines are links to aggregate pages. How often do you actually need to see the weather for the NW Highlands AND the Lakes at the same time?

(c) that massive pdf logo is rather ugly.

(d) you might try using subtle borders (light grey, probably) between table cells. alternating brightness backgrounds for columns and/or rows is also good for readability.

(e) Strip down text as much as you can - don't say "The Southeastern Highlands" where you can say "SE Highlands"

Just some thoughts. :)

-mat
Tuesday, April 5th, 2011 10:00 pm (UTC)
(a), (c), (d), (e) are all great suggestions, I'll give them a try. Thanks! As for (b), side-by-side comparison of different regions is kinda the point - you can go direct to MWIS if you just want one area - but you're right, there's probably not much point in comparing the NW Highlands and the Lakes (and I should at least provide a link to the individual pages). A way of turning off individual regions could work, or maybe a "regions within 500 miles of this postcode" feature.

I like the hierarchy idea, though - maybe it could be extended? UK->Scotland->N/E/S/W (these overlap)?
(Anonymous)
Tuesday, April 5th, 2011 10:26 pm (UTC)
OK, perhaps I missed the point on (b) :)

But equally, I think you might be delivering too much information at once for that page to work. My feeling would be to try to concentrate that down a bit for an overview page, which clicks through (or 'more' to expand) to more detail.

Is there some way you can boil the forecast for an area down to one or two words, colours or some icons or something? Colours would be awesome, because they're ambient information delivery and that I love. Mapping temperature, windspeed and precipitation to colour bars next to (or behind) each region would rock.

That would give an at-a-glance overview easily, in a few seconds rather than a few minutes. Then perhaps I could tickbox/select certain regions to compare in more detail.

Doing reverse postcode lookups was, last time I checked, fiddly as hell.
Tuesday, April 5th, 2011 10:50 pm (UTC)
"I think you might be delivering too much information at once for that page to work."

Yeah, definitely.

"Mapping temperature, windspeed and precipitation to colour bars next to (or behind) each region would rock."

Hey, yeah! That's a great idea! And totally doable for (at least) cloudiness and temperature, and probably for windspeed too.
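
A rough sketch of the colour-bar idea, assuming the forecast value has already been pulled out as a number: interpolate linearly between a "cold" and a "hot" colour. The function name, the default colours and the temperature range in the usage note are all assumptions for illustration.

```python
def lerp_colour(value, low, high, cold=(0, 0, 255), hot=(255, 0, 0)):
    """Map a value in [low, high] to a '#rrggbb' colour between cold and hot.

    Values outside the range are clamped to the nearest end.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    t = min(1.0, max(0.0, (value - low) / (high - low)))  # clamp to [0, 1]
    channels = [round(c + (h - c) * t) for c, h in zip(cold, hot)]
    return "#{:02x}{:02x}{:02x}".format(*channels)
```

For example, `lerp_colour(5, -10, 20)` gives a purple midway between blue and red, which could be dropped straight into a table cell's background colour.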

"Doing reverse postcode lookups was, last time I checked, fiddly as hell."

I can believe that :-( I'll see what the FixMyStreet guy(s) do.
(Anonymous)
Tuesday, April 5th, 2011 10:58 pm (UTC)
You could even just match for words like "Extreme" and "Sunny" and so on and key off those, if needed. I expect weather forecasters are quite precise in their use of certain words.
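
Something like this, perhaps. It's only a sketch: the keyword table is a guess and not a list of the words MWIS actually uses, and `classify` is a made-up name.

```python
import re

# Hypothetical keyword table: these words and labels are guesses.
KEYWORDS = {
    "extreme": "severe",
    "gale": "windy",
    "sunny": "fine",
    "snow": "wintry",
}


def classify(forecast_text):
    """Return the sorted labels whose keyword appears as a whole word."""
    words = set(re.findall(r"[a-z]+", forecast_text.lower()))
    return sorted({label for word, label in KEYWORDS.items() if word in words})
```

Matching whole words keeps things predictable, at the cost of missing variants such as "gales" unless they are added to the table.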
(Anonymous)
Tuesday, April 5th, 2011 11:20 pm (UTC)
Actually, I've just thought - if you're turning various weather indices into numbers, then you might be able to come up with an overall "goodness" figure. Be interesting to see how do-able that would be.

You're the maths expert here though. I'm just an ideas guy. :)
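
One naive way to do it would be a weighted average of normalised indices. Everything here is an illustrative guess: the choice of inputs, the 60 mph cut-off, the weights and the names.

```python
def clamp01(x):
    """Clamp a number to the range [0, 1]."""
    return min(1.0, max(0.0, x))


def goodness(wind_mph, rain_chance, clear_summit_chance, weights=(0.4, 0.4, 0.2)):
    """Combine three indices into one score in [0, 1]; higher is better.

    rain_chance and clear_summit_chance are probabilities between 0 and 1.
    """
    scores = (
        1.0 - clamp01(wind_mph / 60.0),  # 60 mph or more scores zero
        1.0 - clamp01(rain_chance),
        clamp01(clear_summit_chance),
    )
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

Sorting regions by this figure would give a crude "where should I go" ranking, though the weights would need tuning against days that actually turned out well.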
(Anonymous)
Tuesday, April 5th, 2011 09:53 pm (UTC)
Optimising BF compiler? Yeah, sure, there's hundreds of them.

But is yours good enough to optimise the "Hello, world!" program to an fwrite call?

http://code.google.com/p/esotope-bfc/

(It handles lots of arithmetic too. :) (Not my project.))
Tuesday, April 5th, 2011 10:03 pm (UTC)
"Optimising BF compiler? Yeah, sure, there's hundreds of them."

Well, yeah, sure. But I didn't write any of them :-) Esotope looks hella impressive, though!
Sunday, April 17th, 2011 07:52 pm (UTC)
Only just got to this! Gosh, I'm behind on my LJ reading :-(

But also, cool git plugin! The push-to-one-repo pull-to-another is a pretty common pattern in web environments isn't it? If it's not it should be -- I can see your script coming in really handy for a bunch of little web stuff I do.

And, continuous deployment -- isn't that how Heroku and AppHarbor encourage people to do it? They're both Y Combinator enterprises, and I thought Y Combinator were pretty well known for aggressively promoting best practices?