pozorvlak ([personal profile] pozorvlak) wrote2008-03-30 08:40 pm

Pozorvlak's Conjectures

For the past couple of years (longer, actually), I've been thinking a lot about this business of static and dynamic typing in programming languages. Here's my current position on the matter.

Static and dynamic typing encourage and reward remarkably different approaches to the creation of software: everything from the nitty-gritty (sit and think?)-edit-(compile?)-run-(test?)-(debug?)-(repeat?) cycle to the design (at all levels) of the software to be created, taking in every aspect of the tools and process used to support this design and implementation. Also, the choice of static or dynamic typing is best understood as part of a larger attitude towards development. Unfortunately, using a dynamic mindset with a static language not only means that you can't take advantage of the tools available (or even see why they'd be useful), it actively hinders success. The same is true of approaching a dynamic language with a static mindset.

I would therefore like to propose Pozorvlak's Conjectures:
  1. If you find that a modern dynamic type system causes more problems than it solves, you're probably doing it wrong.
  2. If you find that a modern static type system causes more problems than it solves, you're probably doing it wrong.
For instance, I find Haskell's type system causes me many more problems than it solves. Discussions with more expert Haskell users suggest that I am indeed doing it wrong. This guy, on the other hand, finds that "the dynamic languages stick out like sore thumbs because of the whole class of dynamic run time errors that show up that don't even appear in the statically typed languages". If you find this, I claim that it's because your code is not polymorphic enough[1]. You can get away with having code which only accepts data of very restricted types when the compiler does all the checks for you statically. When it doesn't, you're better off writing code that behaves well for a wider class of inputs, and dynamic languages give you much more facility to do this. Once you learn to code like this, your problems largely go away. As a bonus, your code becomes more flexible, and whole classes of techniques become possible that would have been literally unthinkable before. Similar claims, it appears, can be made for static languages: at the higher levels of mastery, you can use a static type system as a theorem-prover to establish some quite nontrivial properties of your program automatically and across the entire codebase. This allows for some excitingly different approaches to program development.
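To make "behaves well for a wider class of inputs" a little more concrete, here is a minimal Python sketch. The function name write_report and its arguments are invented for illustration; the point is only that the code asks for behaviour (something iterable, something with a write() method) rather than for particular types.

```python
import io

def write_report(lines, out):
    """Write each item of `lines` to `out`, one per line.

    `lines` can be any iterable and `out` only needs a write() method,
    so lists, generators, files and in-memory buffers all work.
    """
    count = 0
    for line in lines:
        out.write(str(line) + "\n")  # str() lets non-string items through
        count += 1
    return count

# A list of mixed values written to an in-memory buffer.
buffer = io.StringIO()
written = write_report(["alpha", 42, 3.5], buffer)

# A generator works just as well as a list.
squares = io.StringIO()
write_report((n * n for n in range(3)), squares)
```

Nothing here checks types up front; a caller who passes something without a write() method finds out at run time, which is where tests come in.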

A corollary is that if you ever find yourself saying that your port of Feature X to Language Y is better than the original Feature X solely because it's (statically|dynamically) typed and the original Feature X was the other one, you should probably save your breath. It will probably also be worth your while to go back and determine what advantages the opposite choice of typing regimen gave to Feature X's users.

[1] A less interesting, but probably valid conjecture is that you're also not testing enough, or at least testing the wrong things. But this can't be the only answer. Dynamic programmers, in general, are not idiots; they are usually also Lazy, in the good sense. They're smart enough to work out that writing the equivalent isa_ok() test every time they would have written a type declaration in Java or whatever is no time-saver at all. Hence, they must need less type information overall for their code to be correct.

[identity profile] st3v3.livejournal.com 2008-03-31 11:08 am (UTC)(link)
Found you via a second-order reddit link.

This has to be one of the best notes on S/D I've seen, basically because it transcends the two-ignorant-camps default answer.

My background is C#, but I'm playing with python and ruby and lisp for small stuff. My problem is that I can't see how to scale dynamic languages to big stuff. You're making me think I'm probably going to need a seriously different approach. Any advice on how to learn large-scale dynamic development?

Here's the kind of problem I'm worried about. I have a function I use throughout my codebase. Let's say I'm using python, and I've added a logMessage(msg) function. Now let's say I need to add an extra parameter (say, severity). How do I go about adding that parameter to every call? Static typing makes it easy. Is it possible to do in a dynamic language?

[identity profile] pozorvlak.livejournal.com 2008-03-31 11:33 am (UTC)(link)
OK, the first thing to note is that dynamic languages mostly deal with the problem of large codebases by keeping them small in the first place. So yes, the kind of thing you mention is a bit of a problem. Some partial solutions that spring to mind:
  1. Use of a refactoring browser, like Python's Bicycle Repair Man (http://bicyclerepair.sourceforge.net/) (disclaimer: I've never used it).
  2. grep. Problematic if you have many different functions called logMessage, but why would you want to do that? :-) In this case, I think you'd be OK.
  3. Making the severity parameter optional, and giving it a sensible default.
  4. Depending on your logging needs, you might be able to isolate your calls to logMessage in some sort of aspect/hook/advice/thingy.
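Option 3 is probably the cheapest of these to sketch. Assuming a logMessage roughly like the one described in the question (the body and the severity names below are made up for illustration), an optional parameter leaves every existing call working:

```python
def logMessage(msg, severity="INFO"):
    """Format and print a log line; severity defaults to "INFO"."""
    line = f"[{severity}] {msg}"
    print(line)
    return line

# Existing call sites keep working unchanged ...
logMessage("cache warmed")

# ... and new call sites can opt in to the extra parameter.
logMessage("disk almost full", severity="WARNING")
```

The trade-off is that nothing forces old call sites to think about severity, so this suits cases where a default really is sensible.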


OTOH: you're unit testing, right? And your tests have 100% statement coverage, as measured by e.g. coverage (http://nedbatchelder.com/code/modules/coverage.html)? Then any broken calls will be picked up :-)

[identity profile] st3v3.livejournal.com 2008-03-31 12:40 pm (UTC)(link)
Hi. Thanks for answering! I'm going to look at Bicycle Repair Man -- looks interesting. I've been writing python extensions to a text editor (http://www.sublimetext.com) and BRM may be very useful for that.

I'm really enjoying python. It seems very well designed and I can knock stuff out quickly. Thing is, I feel like I'm only doing small things, and that the benefits of static typing only really kick in for big projects. The article you linked to on artima.com makes explicit what I'm worried about ("The initial productivity gain of working with a dynamic language can decline as a project's codebase grows, and as refactoring becomes increasingly a chore.")

Unit tests; yes, but not 100% coverage. I've found 100% coverage to be too much. Getting to 100% can involve test code more complex than the situation you are testing, at which point the test becomes most suspicious, and you have to test it...

The problem I've chosen is one that favours static typing, I know, and I think a perfectly reasonable answer is 'this particular task takes more time. However, solving problem X, difficult for static languages, is easy for dynamic languages, and that'll solve more problems over the long term.'

Anyway, thanks for an interesting discussion.

[identity profile] pozorvlak.livejournal.com 2008-03-31 01:30 pm (UTC)(link)
Glad you're enjoying Python :-)

It's possible that static typing is more useful for larger codebases: it's very hard to tell, since the debate's so polarised, and because of the major complicating factor that different languages take different amounts of space to express the same program, in a way that's a nonlinear function of program size. On the other hand, people have successfully written large projects in dynamic languages (Steve Yegge was talking about his 10,000-line Emacs extension (http://steve-yegge.blogspot.com/2008/03/js2-mode-new-javascript-mode-for-emacs.html) today, and mentioned other extensions three times the size - and that's an extension to Emacs!).

I think the hard part of the adjustment is probably realising that there's an adjustment to be made - after that, it's just conscious practice.

[identity profile] st3v3.livejournal.com 2008-03-31 01:49 pm (UTC)(link)
You're right, it's tricky to judge, and too often there's a big-endian/little-endian feel to arguments. ;)

[identity profile] pozorvlak.livejournal.com 2008-03-31 01:31 pm (UTC)(link)
As for coverage: are we talking about statement coverage, path coverage or state coverage? The last two are often pretty hard in a stateful language, but getting statement coverage close to 100% shouldn't be that hard...

[identity profile] st3v3.livejournal.com 2008-03-31 02:01 pm (UTC)(link)
Even statement coverage can be tricky if you're trying for real just-test-this-unit unit tests. At least, in static languages it can.

For example, say you want to test a class which parses text fields in a database. You add exception handlers for problems while connecting to and reading from the db. To get a proper unit test, you need to create a mock db object with Connect() and Read() methods, which throws all the different flavours of exception you'd want to handle. Writing that mock object may well be much more complicated than the object you're trying to test, and more error prone.
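For comparison, here is how that scenario might look in Python with the standard library's unittest.mock. This is only a sketch: FieldParser, its parse() method and the choice of ConnectionError and KeyError are assumptions made for illustration, not anything from a real codebase.

```python
from unittest import mock

class FieldParser:
    """Hypothetical class under test: reads a text field and splits it."""

    def __init__(self, db):
        self.db = db

    def parse(self, key):
        try:
            self.db.Connect()
            raw = self.db.Read(key)
        except ConnectionError:
            return None   # could not reach the database
        except KeyError:
            return []     # the field does not exist
        return raw.split(",")

# Happy path: Read() returns a comma-separated string.
ok_db = mock.Mock()
ok_db.Read.return_value = "a,b,c"
happy = FieldParser(ok_db).parse("tags")

# Connect() raises, so Read() should never be reached.
down_db = mock.Mock()
down_db.Connect.side_effect = ConnectionError("db down")
unreachable = FieldParser(down_db).parse("tags")

# Read() raises for a missing field.
missing_db = mock.Mock()
missing_db.Read.side_effect = KeyError("tags")
missing = FieldParser(missing_db).parse("tags")
```

Each flavour of exception takes one line to set up via side_effect, so in this case the mock stays simpler than the class it stands in for. Whether that holds for a realistic database layer is a separate question.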

My experience has been that full-fat unit tests can take an awfully long time, and it's not clear to me that a developer always gets the right bang for his buck over the course of the project.
Edited 2008-03-31 14:01 (UTC)

[identity profile] pozorvlak.livejournal.com 2008-03-31 11:41 am (UTC)(link)
As for learning dynamic programming in the large, I can't help all that much, because the only large projects (more than a couple of thousand lines) I've worked on have been in static languages (and I spent much of my time fighting the type system. Alas). I suspect that the best thing to do would be to find a large project implemented in a dynamic language (Emacs springs to mind, but Rails/Django/Catalyst would probably be big enough) and have a look at that.

On the other hand, you can do an awful lot in ten lines of (say) Perl, particularly if you take advantage of library modules, CPAN, etc. So even if large-scale programming is easier in static languages, that doesn't mean dynamic languages aren't worth learning :-)

I came across this post (http://www.artima.com/weblogs/viewpost.jsp?thread=217080) while answering your other question - I'm not sure I agree, but it's an interesting way to think about the problem.