Saturday, December 23, 2006

Why dynamic typing is useful

Steve Yegge's Parabola has thrown a little more fuel on the static vs. dynamic fire. The comments so far have been mostly constructive, which is a nice surprise. However, there has been some back-and-forth about dynamic typing, mostly concerning why it's useful and what it actually entails. In a recent comment, "Matt" said

"I don't see how software could possibly attempt to handle anything it did not anticipate, except to gracefully fail, which T.S. did."

I've seen this sentiment before, almost always from people whose programming experience is limited to statically typed languages. If that's your background, then you've not seen the wonderful flexibility that dynamic typing adds. Here's a very simple example written in Python. Don't worry - Python is called "executable pseudo code" for a reason!

def process_many(things):
    for thing in things:
        thing.process()

Looking at this, you might expect things to be a list. That's possible, but it could also be a set, or an iterator, or even a dictionary (a hash table). As the programmer, you don't have to worry about it. As long as Python can figure out how to iterate over your things argument, everything will just work.
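As a minimal sketch of that claim, here is the same function called with several different containers. The Thing class and its log list are hypothetical, made up only for illustration. The one real requirement is that each element has a process() method.

```python
class Thing:
    """Hypothetical element type; only its process() method matters."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process(self):
        # Record the call so we can see which things were processed.
        self.log.append(self.name)


def process_many(things):
    for thing in things:
        thing.process()


log = []
a, b = Thing("a", log), Thing("b", log)

process_many([a, b])          # a list
process_many({a, b})          # a set (iteration order is not guaranteed)
process_many(iter([a, b]))    # a bare iterator
process_many({a: 1, b: 2})    # a dictionary: iterating yields its keys
```

Note that looping over a dictionary visits its keys, so in the last call it is the keys that get processed.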

So, to return to Matt's statement above: this is one way code can handle situations that the programmer never considered. Maybe the guy who wrote process_many never even thought that someone might pass in a dictionary. That doesn't matter; it will work anyway.

Of course, you can get this effect in Java using interfaces. The problem with interfaces is that you have to make decisions about them ahead of time. If you were writing process_many in Java, you might not consider the case of passing in anything but a list. Then, if someone using your code wanted to pass in a set, iterator, dictionary, etc., they'd be out of luck.

This is a recurring theme with static vs. dynamic languages. In explicitly typed static languages, you have to spend time figuring out what type every little thing should have. Then, when you're finally done, there will always be some case you didn't consider, like a hand-written note telling you to call the ticket counter.

9 comments:

Abram Catalano said...

I'm not sure I understand why D.T. is better (from your example). I'm somewhat new to the D.T. world; I've done some work in Perl and PHP, but nothing mission-critical. So, as someone interested in the discussion...

If I were to write this generic function in Java, I'd make the parameter a Collection with generics. That way I know at compile time that the parameter is iterable and that its elements have the process() method the function expects to call, so I get the functionality I'm expecting.


public void process_many(Collection<InterfaceWithProcessMethod> things) {
    for (InterfaceWithProcessMethod thing : things) {
        thing.process();
    }
}

I think the argument FOR dynamic typing has something to do with unplanned code reuse (but correct me if I'm wrong), which seems like an obvious way to force yourself to debug at runtime instead of at compile time. FME, bugs caught at compile time take an order of magnitude less time to track down than runtime bugs.

I've heard some argue that you should have 100% code-coverage unit tests anyways, but in fact you NEED those with D.T., whereas with S.T you don't NEED as many, initially anyways.

I haven't made up my mind, but it looks to me that the time saved by DT is quickly used up writing unit tests and finding runtime bugs.

Back to the example you gave:

If yesterday-me was conscious enough to make the parameter a Collection, instead of just a List, then I know I was thinking ahead and generalizing, and the code inside the function will probably handle it. If I wasn't thinking that abstractly, and I made the parameter a List, that's OK: today I want to use it with other Collections, so I can go fix it up and be sure it handles all Collections properly.

The D.T. way, in your example, leaves you wondering if it will work since you're not sure if yesterday-you planned to pass in data in today's format.

To be sure it works, you have to test it at runtime for every combination (because you might not have planned for today's format).

Hm, OK, so it still seems like the issue is that DT forces you to write complete tests, and ST takes some of that burden away. Tests can't always catch all the possibilities unless you have the time to mathematically prove your functions, and spending that much time on tests kills your productivity (meaning productivity in achieving the goal of the program, not productivity in coding :). FME, coders are fairly good at running through tests in their head while writing a function with the data they think it will be used with. They aren't good at writing it for data they aren't expecting, which is another reason why unit tests are more important for DT than for ST. Therefore, I'm still not convinced DT is better. (Granted, you said "useful", but that's an argument about the pros and cons of having one way to solve a problem versus many, e.g. Perl vs. Python, from what I've heard.)

Thoughts?

Gary Bernhardt said...

Abram,

I think that your skepticism toward dynamic typing is reasonable. In the long term, I'm not sold on dynamic typing either, so I'm certainly not going to argue that it's universally better than static. Your point about the tradeoff between time spent writing tests and time spent wrangling type declarations is important. At this point, in a comparison of Python and Java, I think that Python wins for development time. You really should be writing comprehensive tests anyway, so that's not a huge loss. The few studies that exist seem to back this up, but they're very unreliable because they never involve large, identical projects.

In general, I think we should avoid trying to compare dynamic and static type systems at large. Although I'm convinced that development happens faster in Python than in Java, I'm also convinced that it happens faster in Haskell than in Python. Haskell is statically typed, so obviously there's much more to the story than static vs. dynamic alone.

Finally, as for this blog post, it was meant as a justification of why you *might* want to use dynamic typing, not as an argument for dynamic typing's superiority or anything like that. I don't know if I could even make a logical argument for Python being faster for development than Java, and I certainly couldn't make an argument for dynamic or static being better overall. There are many isolated justifications that I can give to suggest that Python is faster, of course, but there's always more to the story. Ultimately, the only reason I believe it is because of my own experience.

Matt said...

Abram/Gary,

Thanks for hashing this out. I think most of my questions were vetted in this exchange.

Please notice the statement from my earlier comments that follows the one quoted in the post above: "It would seem you would need to make some basic assumptions about an object or information you were presented with and these assumptions could be embedded into a type hierarchy (the rulebook)." The stated assumption is that "Python can figure out how to iterate over your things". This seems a bit hand-wavy to me. The computer scientist in me is thinking that in the end, polymorphism is really the only way to do this. Whether it's implemented via C++ style vtables, or by reflecting dynamically into a type to find an "iterator" function, or based on some internal runtime concept of an iterator or collection, it is essentially the same if you squint hard enough. Both the Python and Java examples are counting on an object implementing an interface. The Python is implicit and the Java is explicit in demanding this interface. Abram did a great job exploring the tradeoffs of explicit vs. implicit, so I won't go into that.

I am still striving to reconcile this with the original allegory. Maybe I'm taking this too literally, but it would seem that in the scenario you just described, Mr. Python, from the airline check-in counter, walks Anushri to the front of the security line and explains the situation to T.S., who trusts Mr. Python. If the whole point is that it’s nice to have Mr. Python around to make things easier, then I get that. But let’s not pretend we can handle any situation (input) with this new layer of indirection. I can see how this can provide benefit in a system, but let's be honest about its capabilities. We can only handle the scenarios (subtypes) Mr. Python has anticipated or learned over time.

I will say that the Python example is much more concise, was probably easier to compose, and is possibly even readable by non-technical types. If that is the point, I get it. Loud and clear.

I'm intrigued. Thanks. I'd like to understand this better, especially the real-world tradeoffs Abram touches on. Where do I go from here? Learn Ruby? Python? Any other blogs?

Thanks again. If I'm being a belligerent, anal butthead, please tell me to get lost :-)

Gary Bernhardt said...

Matt,

You are certainly right that both Python and Java have to count on an interface, whether implicit or explicit. The major difference that's relevant to this discussion is that, in the case of Java, the interface is fixed. If you're using a Java library written by Joe, and Joe wrote a method expecting an ArrayList, you'll be forced to massage your arguments to conform to that.

If Joe wrote his library in Python instead, you obviously still have to give him arguments that fit some criteria, but the criteria will generally be much easier to satisfy. To tie this into my looping example: in order to be loopable, your class just needs to have an __iter__ or __getitem__ method (__iter__ returns an iterator for the object; __getitem__ is the method that gets called on foo when you do foo[i]). In practice, anything that remotely resembles a collection will usually meet this requirement.
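A minimal sketch of those two routes to being loopable, using two made-up classes (Countdown and Squares are hypothetical names, not from any library):

```python
class Countdown:
    """Hypothetical class that is loopable via __iter__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Return an iterator over start, start - 1, ..., 1.
        return iter(range(self.start, 0, -1))


class Squares:
    """Hypothetical class that is loopable via __getitem__ alone."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, i):
        # When there is no __iter__, Python calls this with 0, 1, 2, ...
        # and stops looping when IndexError is raised.
        if i >= self.count:
            raise IndexError(i)
        return i * i


countdown = [n for n in Countdown(3)]
squares = [n for n in Squares(4)]
print(countdown)  # [3, 2, 1]
print(squares)    # [0, 1, 4, 9]
```

Neither class inherits from anything or declares an interface; having the right method is enough for a for loop to accept it.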

As a result, in addition to being more concise and readable, the Python version is also more flexible. When you're using Joe's library, you have to bend yourself to his interface much less in Python than in Java. Similarly, his lack of foresight in choosing types is less likely to bite you in Python.

Of course, the relative rigidity of interfaces is just one of many issues with static vs. dynamic typing. It happens to be the one that Steve's post focused on, but there are many, many more conveniences to be found with dynamic typing.

As far as exploring dynamic typing further, just pick up a language and learn it! Python and Ruby are both fine choices, and I think you'll enjoy either of them. My own preference is probably obvious, but I promise not to hate you if you try That Other Dynamic Language first. :)

D said...

I think one way to look at it is that statically typed languages use "default deny", while dynamically typed languages use "default allow".

With Java, you can do polymorphism by explicitly declaring all the different types that are legal for a function. With Python you can write the function to accept anything, but have explicit error checking for any invalid types.
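A rough sketch of what "default allow" with an explicit check might look like in Python. The function total_length and its rule against bare strings are hypothetical, chosen only to illustrate the idea:

```python
def total_length(items):
    """Hypothetical function: sum the lengths of the items in any iterable."""
    if isinstance(items, (str, bytes)):
        # A string is iterable, but passing one here is almost certainly
        # a mistake, so it is the one input that is explicitly denied.
        raise TypeError("expected a collection of items, not a single string")
    return sum(len(item) for item in items)


print(total_length(["ab", "c"]))       # 3
print(total_length(("ab", [1, 2])))    # 4
```

Everything iterable is accepted unless the function singles it out, which is the reverse of declaring the accepted types up front.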

The Java method is "safer", in that there are some things you can't accidentally do wrong. But it's restrictive in that you can't intentionally do those same things. If there is code you don't control that would do exactly what you want, if only the author had thought to include support for the type you need, you have no choice but to reimplement the same thing.

Nate Kohari said...

I still don't get why dynamic typing is supposedly more effective than static typing. Designing software with restrictive interfaces is just bad design, period. In C#, for example, you could accept a List<T> as a parameter, but if all you're doing is iterating through the collection, what you really want to expect is an IEnumerable<T>. It just goes back to programming against a service contract rather than against a concrete type -- something that developers should do anyway.

That's why, in my humble opinion, dynamic typing is just laziness. What happens if I define a method that wants a collection, but instead I pass it a string? Or an integer? Or a FooBar?

Loosening the strings on the compiler doesn't fix the issue when the real problem exists between the chair and keyboard. :)
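For what it's worth, the question about passing a string or an integer has a concrete answer in Python: both fail, but only at run time and with different exceptions. A small sketch, reusing process_many from the post (the outcome helper is hypothetical and exists only to report what happened):

```python
def process_many(things):
    for thing in things:
        thing.process()


def outcome(arg):
    """Return the name of the exception process_many raises, or 'ok'."""
    try:
        process_many(arg)
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__
    return "ok"


print(outcome(42))     # TypeError: an int is not iterable
print(outcome("abc"))  # AttributeError: str elements have no process() method
print(outcome(""))     # ok: an empty string iterates zero times
```

The last case shows the cost of deferring the check: a wrong argument can go unnoticed when the loop body never runs.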

Gary Bernhardt said...

This post has been removed by the author.

Gary Bernhardt said...

Nate, the issue is not simply static vs. dynamic. If you read through my blog post and the comments that have been posted, you'll not find me claim that dynamic languages are "more effective." Making a value judgement about the merits of static languages vs. dynamic languages in general is absurd.

The debate, when it's occurring reasonably, is generally about languages that strangle you with explicitness vs. languages that don't. Yes, in C# you can do things like accept an IEnumerable<T>. But in Haskell you don't even have to tell it what the type of the argument is, yet it's still strongly, statically typed.

I think you should broaden your experience. Learn Haskell or Ruby or Lisp or anything that's strongly, but not explicitly, typed. I'm sure you'll find that your experience mirrors experiment after experiment after experiment that all show explicitly typed languages having much longer development times and code length than implicitly typed languages (ML, Haskell) or dynamic languages (Python, Ruby, Lisp, Perl, Erlang).

Nate Kohari said...

@gary:

That's fair. I admittedly don't have a tremendous amount of experience outside the static-typed realm, except with Perl and PHP, which are too script-y to count. Usually these days when someone says "dynamic typing" they really mean "works like Ruby" -- which raises a major performance flag in my head. I will have to take it upon myself to learn an implicitly-typed language.

I fall on the "static typing" side of the Epic Binding Battle because I'd rather spend a little more time developing and reap the benefits in terms of performance. (Yes, there is a limit to that too. That's why I write in C# and not in C or x86 assembly. :D)