
Mar 2005 Braindump: closures by Chris Poirier • in Design

So, I’ve been working on the design for Nexxi, in dribs and drabs, since December. It doesn’t feel coherent to me, yet, and there are some ideas currently in the design that I’m not even close to sure about. I think the best thing for me to do is to just write it up. I apologize in advance if this post doesn’t make too much sense.

First up, closures are very near and dear to my heart. I’ve been using Ruby for several years, and if it weren’t for its terrible performance, I would use it for everything and be happy. Done properly, closures enable user-defined control structures, and this ability fundamentally changes the way you think about programming. while doesn’t do what you want it to? Roll your own. Wish exception handling would log specific conditions? Roll your own. This is a fundamental goal of Nexxi.

Closures: some background

One of the core problems in a discussion like this is that the word “closure” is a bit of a fuzzy term. Everybody has a slightly different idea of what it means (and some people have no idea at all). For me, a closure is a piece of code you can pass around, a piece of code that runs in the variable scope where it is defined, but can be called from anywhere.

Perhaps an example will help.


int x = 10;
while( x > 0 )
{
   System.out.println( x );
   x -= 1;
}

In languages like Java, while is a control structure — it is merely an instruction to the compiler to use the condition and the block in a particular pattern. But what is such an access pattern except an algorithm?

To me, the stuff between the curly braces is the most basic example of a closure. That block of code is a “bundle” of logic that gets executed whenever the while wants to call it, for as long as the while remains active. From the while’s perspective, the content of the block is irrelevant: it tests the condition, then either passes control to the block or hands control back to its invoker.

What is special about the control structure nature of while is in what it doesn’t do — it doesn’t cause a change in variable scoping. When the block references variable x, it is the same variable x that we set to 10 before entering the while.

Languages that supply closures provide a third option to compiler-supplied control structures and full-scale functions — they allow you to define referenceable pieces of code, but without defining a new variable scope to do it.
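To make that third option concrete, here is a minimal Ruby sketch of a hand-rolled loop. The names `my_while` and `countdown` are hypothetical, made up for illustration; they are not part of Ruby or Nexxi. The condition is passed as a lambda and the body as a block, and both read and write the same `x` that the caller defined.

```ruby
# Hypothetical user-defined loop: `condition` is a lambda, the body is a block.
def my_while(condition)
  yield while condition.call
end

def countdown
  seen = []
  x = 3
  # Both the lambda and the block close over the local `x` defined above;
  # no new variable scope is introduced for either of them.
  my_while(-> { x > 0 }) do
    seen << x
    x -= 1
  end
  [seen, x]
end
```

Calling `countdown` returns the values the body saw along with the final value of `x`, which shows that the block was updating the caller's variable rather than a copy.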

Block-oriented syntax

For me, “closure” is a generalized term. It applies to a whole set of potential implementations. Ruby has closures, Groovy has closures, MetroJ has closures, even Java has closures (albeit a rather broken version of them). For what I want — user-defined control structures — to say “closure” is to not say enough. Specifically, there is something that wasn’t discussed in the example above: control flow. In order for closures to be useful for user-defined control structures, the closure must not only be connected to the enclosing variable scope, it must also be connected to the enclosing control flow. When you call return in such a closure, that return must return from the same entity that defined the variable scope. In Nexxi, such closures are called “blocks”.

Here’s an example with a language-defined control structure.


try
{
   for( number in listOfIntegers )
   {
      println( number );
      if( number == 27 )
      {
         break;
      }
      else if( number == 13 )
      {
         throw new Exception();
      }
      else if( number == 19 )
      {
         return;
      }
   }
   println( "normal termination" );
}
catch( Exception e )
{
   println( "unlucky 13!" );
}

So it prints out some number of integers from a list. It stops when it hits 27, 19, or 13, or when it has used all the numbers in the list. If it stops at 19, no additional output is generated by the function; if it stops at 13, it prints out “unlucky 13!”; and otherwise it prints out “normal termination”. This is standard Java-style control flow.

Here it is again with a user-defined control structure. My apologies for using Nexxi syntax I haven’t discussed yet.


try
{
   listOfIntegers.each :: number   # Write the current element into "number"
   {
      println( number );
      if( number == 27 )
      {
         break;
      }
      else if( number == 13 )
      {
         throw new Exception();
      }
      else if( number == 19 )
      {
         return;
      }
   }
   println( "normal termination" );
}
catch( Exception e )
{
   println( "unlucky 13!" );
}

These two examples demonstrate what, to my mind, is exactly the same algorithm. The description given for the first algorithm should apply equally to the second. This is what it means for user-defined control structures to be on par with language-defined ones. Further, given the similarities in the syntaxes, this is only what most programmers will expect. If it looks like return, it should act like one, right?

This is why I’m calling Nexxi a block-oriented language. The language is structured as expressions inside blocks, and all blocks are equal, regardless of who is executing them. It should make no difference to the programmer whether the thing controlling the block is language-defined or user-defined; everything should work the same.
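Ruby's blocks already behave roughly this way, which may help to picture the goal. In the sketch below, `each_number` is a hypothetical user-defined iterator and `scan` is a hypothetical method standing in for the function in the examples above. It collects its output in an array instead of printing, so the result is easy to inspect. Inside the block, `break`, `raise`, and `return` all act on `scan`, not on `each_number`.

```ruby
# Hypothetical user-defined iterator: hands each element to the caller's block.
def each_number(numbers)
  i = 0
  while i < numbers.length
    yield numbers[i]
    i += 1
  end
end

# Mirrors the examples above, but appends to `out` instead of printing.
def scan(numbers)
  out = []
  begin
    each_number(numbers) do |number|
      out << number
      break if number == 27             # leaves the each_number call
      raise "unlucky" if number == 13   # propagates to the rescue below
      return out if number == 19        # returns from scan itself
    end
    out << "normal termination"
  rescue RuntimeError
    out << "unlucky 13!"
  end
  out
end
```

With this sketch, `scan([1, 19, 5])` stops silently at 19, `scan([1, 13, 5])` ends with the "unlucky 13!" entry, and any other list ends with "normal termination", matching the description given for the language-defined loop.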

So what’s with the “::”?

Okay, so there’s one little flaw in my argument (two actually, but I’ll deal with the second one in a later article). When dealing with user-defined control structures, how do you get data from the structure’s inner namespace back out into the closure’s namespace? When you call list.each(), how do you get the current element into the block for it to use?

This idea of control structure parameters is an entirely new thing for Java-style syntax. None of the language-defined control structures need them — the language requires you to define the variables in the usual way and deal with them yourself. Which is fine for the language, because it can see your namespace as well as its own. User-defined control structures don’t have this luxury. So we need something new, something that allows us to have parameters to a block — something that allows us to define new variables to receive the data supplied by the control structure.

In Ruby, this is done with the following syntax:


   list.each do |value|
   end

In Groovy, this is done with a similar syntax:


   list.each() { |value|
   }

And in MetroJ, the following syntax is used:


   list.each() clos(value)
   {
   }

Of these approaches, I think Ruby and MetroJ have it right — placing the parameter list inside the block is a mistake, not only for syntactic reasons (it complicates the grammar and makes the language inherently biased to a particular coding style), but also because it fails to recognize that the block parameters are an interface between the caller and the callee. They are an enhancement to the language, and shouldn’t complicate patterns that are already familiar.

So outside the block. But what should the syntax look like? Overloading an existing operator is a possibility. This is what MetroJ is doing, for instance. In a way, it is nice because the syntax looks familiar — the closure looks something like a function declaration. The problem is that it also looks like a function call, especially as it lacks a return type. For this reason, I have rejected it for Nexxi in favour of an operator. A new operator for an entirely new purpose.

Over the last several months, I’ve tried a variety of operators, most notably “==>” and “::”. I had been using “==>” for a long while, as it does have visual connotations of direction of flow and so forth, but it is very visually noisy on the line. To a certain extent, the eye organizes code into structure by visual cues. In the statement:


squares = list.collect ==> number
{
   number ^ 2;
}

the structure is (= squares (==> list.collect number)), but the visual weight of “==>” conflicts with that parsing. As a result, about a month ago, after seeing it used in OCaml (for something completely different), I decided to try out “::” instead, and I think it works nicely. It looks the same in all fonts, and separates without calling attention to itself.


squares = list.collect :: number
{
   number ^ 2;
}
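For comparison, here is roughly the same computation in Ruby, along with a sketch of how a user-defined structure hands each element to the block's parameter. `my_collect` and `squares_of` are hypothetical helpers written for illustration (Ruby's built-in `collect` already does this job). The sketch also assumes that `^` in the Nexxi example is meant as exponentiation, which Ruby spells `**`.

```ruby
# Hypothetical re-implementation of collect, to show where the block
# parameter's value comes from: each `yield(element)` binds `number` below.
def my_collect(list)
  result = []
  list.each { |element| result << yield(element) }
  result
end

def squares_of(list)
  # The value of the block's last expression becomes the collected item.
  my_collect(list) { |number| number ** 2 }
end
```

The block parameter `number` plays the role of the name after “::”: it is the only channel through which the control structure passes data into the caller's block.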

So this brings us to one last question about these parameters: are new variables created, or are existing ones used, if available? This is a tougher problem. In Ruby, which is dynamically typed, the variables are created only if a variable of that name does not already exist in the containing scope. If it does exist, that variable is reused. My experience has been that this is a major source of bugs. In my own Ruby work, I’ve wasted many hours tracking down problems related to a closure parameter stomping on a variable. Further, as Nexxi is strongly typed, there could easily be typing problems between the old use of the name and the new. Of course, the compiler could just refuse to choose, and demand, as Java does, that you rename the inner variable. That may even be the wise thing to do. But one of the reasons I dislike Java so much is that it is always trying to scold me about things that are nobody’s business but mine. So, in Nexxi, the block parameters will always create new variables, variables that hide any existing variable of that name, and that live for the length of the block. If people want to live dangerously, I’ll not be a killjoy.
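The rule chosen for Nexxi can be sketched in Ruby, with one caveat: this assumes Ruby 1.9 or later, where block parameters are always local to the block. The Ruby behaviour described above is the older one, under which the parameter would have overwritten the outer variable. The method name `shadow_demo` is hypothetical.

```ruby
def shadow_demo
  value = "outer"
  seen = []
  # In Ruby 1.9+, this `value` is a new variable that hides the outer one
  # for the length of the block, which is the rule Nexxi adopts.
  [1, 2, 3].each { |value| seen << value }
  # The outer `value` is untouched once the block has finished.
  [value, seen]
end
```

Under the older semantics the first element of the result would have been `3`, which is exactly the kind of stomping that hiding is meant to prevent.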

Conclusion

So, that’s it for starters. Sorry about the ramble — I probably went into more detail than I needed to. And, even still, there are tons of interactions that I haven’t discussed: variable typing, whether or not statements return a value, what if any shorthands should be supported for when the block is a single expression, etc. But I’ll bring these up in a later article. Hopefully a shorter one. :-) If you made it this far, thanks for reading. Any comments are welcome, as none of this stuff is set in stone yet. Thanks.


Site copyright 2007-2008 Chris Poirier.