Wednesday, June 24, 2009

More Scala Using RAISIN

Last time, we offered a minimally functional emulation of C#'s using syntax, to manage resources elegantly in Scala. We defined a curried function, whose second argument was a simple block of code. We'll refine that approach and try to bring about the remaining goals we set for ourselves for this feature.

def using[T <% Disposable]
    (resource: T)(block: => Unit) = {
  try {
    block
  }
  finally {
    resource.dispose
  }
}

One problem with our first cut was that the object encapsulating the managed resource had a larger scope than we wanted. Since we constructed our FileHandle instance outside of the block that used it, one could accidentally access it after it had been disposed.

val handle = new FileHandle("trouble")
using(handle) {
  handle.read
  handle.write(42)
}
// big trouble below!
handle.read

What we really need is to pass the using function not a block of type Unit, but a function that accepts the resource as its argument. In other words, we'd like to be able to make a useful function and pass that as an argument into the using method.

def useful_function(handle: FileHandle): Unit = {
  handle.read
  handle.write(42)
}

// pseudo-code to capture the idea
//
using(new FileHandle("good"), useful_function)

That's the gist of what we want to do, but we don't want all the cruft of declaring the useful function separately. Happily, Scala allows us to use function literals to write the above very economically.

using(new FileHandle("good")) { handle =>
  handle.read
  handle.write(42)
}
//
// handle is not visible down here and
// can't be abused, Yay

For this to work, we have to refine our using method. All we have to do is change the second argument from the by-name type => Unit to the function type T => Unit, and make sure to call the block with the expected T resource.

def using[T <% Disposable]
    (resource: T)(block: T => Unit) {
  try {
    block(resource)
  }
  finally {
    resource.dispose
  }
}
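To try the refined method in isolation, here is a minimal self-contained sketch. The Disposable trait and the FileHandle class below are hypothetical stand-ins (the real ones come from the earlier post), and the disposed flag exists only so we can observe what happened. It keeps the view bound syntax used throughout this post, which is Scala 2 syntax.

```scala
object UsingSketch {
  // Assumed shape of the trait from the earlier post.
  trait Disposable { def dispose(): Unit }

  // Hypothetical stand-in for FileHandle; it only records what happens to it.
  class FileHandle(val name: String) extends Disposable {
    var disposed = false
    private var last = 0
    def read: Int = last
    def write(i: Int): Unit = { last = i }
    def dispose(): Unit = { disposed = true }
  }

  // Same logic as the using method above.
  def using[T <% Disposable](resource: T)(block: T => Unit): Unit = {
    try {
      block(resource)
    } finally {
      resource.dispose()
    }
  }

  // Returns true if the handle was disposed even though the block threw.
  // The probe is kept outside the block only so this check can inspect it.
  def disposedAfterFailure(): Boolean = {
    val probe = new FileHandle("probe")
    try {
      using(probe) { handle =>
        handle.write(42)
        throw new IllegalStateException("boom")
      }
    } catch {
      case _: IllegalStateException => ()
    }
    probe.disposed
  }
}
```

Calling UsingSketch.disposedAfterFailure() should yield true, since the finally clause runs whether or not the block completes normally.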

Our using function is pretty powerful now. Without any modifications, it works with closures as well as function literals. Let's alter the client code a bit to demonstrate. The following is a closure and not a plain function literal because i is a free variable: it is not defined inside the curly braces delimiting the code passed into using.

def demonstrate_closure(i: Int) = {
  using (new FileHandle("simple")) { handle =>
    handle.read
    handle.write(i)
  }
}

Still, there are additional things we can do in the body of our using method. For example, we could take special action if the resource passed in were null. Alternatively, we could wrap the dispose calls inside a try-catch block to prevent them from emitting exceptions.
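As a rough sketch of those two ideas, the variant below rejects a null resource up front and swallows exceptions thrown by dispose. Both policies are assumptions chosen for illustration, not the only reasonable ones, and the Flaky class is a hypothetical resource whose dispose always fails.

```scala
object GuardedUsing {
  // Assumed shape of the trait from the earlier post.
  trait Disposable { def dispose(): Unit }

  // Hypothetical resource whose dispose always fails.
  class Flaky extends Disposable {
    def dispose(): Unit = throw new RuntimeException("dispose failed")
  }

  def using[T <% Disposable](resource: T)(block: T => Unit): Unit = {
    // Assumed policy: fail fast on null instead of running the block.
    if (resource == null)
      throw new IllegalArgumentException("resource must not be null")
    try {
      block(resource)
    } finally {
      try {
        resource.dispose()
      } catch {
        // Assumed policy: a failing dispose must not mask the block's outcome.
        case _: Exception => ()
      }
    }
  }
}
```

With this version, a block run against a Flaky resource completes without the dispose failure reaching the caller, while passing null raises an IllegalArgumentException before the block runs.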

C++ uses compile-time overloading to choose different behaviors for some functions. For example, the new operator comes in different overloaded flavors. One takes a throwaway argument of type nothrow_t to indicate that the desired version of new will return NULL when it fails, instead of throwing an exception.

In Scala, a tried and true way to choose different behaviors at compile time is through import statements. For example, if you want a mutable Set in Scala, you write

import scala.collection.mutable.Set

This inherits from the same Set trait as the immutable version, so the logic where the class is used is clean. Although the C++ nothrow_t concept is interesting, Scala's approach appears to have a better separation of concerns, and results in uncluttered code.

If we are so inclined, we can do something analogous with our using method. We could choose to import from one package where the implementation swallows Throwables emitted by dispose. Or, we could import from another where they are allowed to propagate. In other words, we can handle exceptions quite intelligently, and customize our behavior depending on context.
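One way this could look is sketched below. To keep the example in a single file, two nested objects named swallowing and propagating stand in for the two packages; those names, and the Flaky class, are hypothetical.

```scala
object ImportSketch {
  // Assumed shape of the trait from the earlier post.
  trait Disposable { def dispose(): Unit }

  // Variant 1: exceptions from dispose reach the caller.
  object propagating {
    def using[T <% Disposable](resource: T)(block: T => Unit): Unit =
      try block(resource) finally resource.dispose()
  }

  // Variant 2: Throwables from dispose are swallowed.
  object swallowing {
    def using[T <% Disposable](resource: T)(block: T => Unit): Unit =
      try block(resource)
      finally {
        try resource.dispose()
        catch { case _: Throwable => () }
      }
  }

  // Hypothetical resource whose dispose always fails.
  class Flaky extends Disposable {
    def dispose(): Unit = throw new IllegalStateException("dispose failed")
  }

  // The client code is identical; only the import differs.
  def quiet(): Boolean = {
    import swallowing.using
    using(new Flaky) { _ => () }
    true // reached because the dispose failure was swallowed
  }

  def loud(): Boolean = {
    import propagating.using
    try { using(new Flaky) { _ => () }; false }
    catch { case _: IllegalStateException => true }
  }
}
```

Both quiet() and loud() should return true: the first because nothing escapes, the second because the IllegalStateException from dispose propagates and is caught by the caller.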

Finally, let's consider whether we can avoid needing to nest using clauses, and manage the disposal of multiple resources more elegantly. This is possible, but there's one important subtlety that we have to worry about.

def using[T <% Disposable, U <% Disposable]
    (resource: T, _resource2: => U)(block: (T,U) => Unit) {
  try {
    val resource2 = _resource2
    try {
      block(resource, resource2)
    }
    finally {
      resource2.dispose
    }
  }
  finally {
    resource.dispose
  }
}

Note that the _resource2 argument is passed by name, and not by value. We don't actually access it until declaring the val resource2 inside the outer try block. This means that if the construction of resource2 fails, we will still call dispose on the other resource.

Let's demonstrate this. Suppose our first resource object constructs okay, but the second one throws an exception in its constructor. This is standard behavior for a RAISIN class, which disallows partially constructed instances.

def two_resources() = {
  using (new FileHandle("okay"), new FileHandle("bad")) {
    (first, second) =>
      second.write(first.read)
  }
}

If that second FileHandle constructor fires before entering the using method, then we have a resource leak! The first FileHandle is never disposed. But, because we pass the second argument by name, the second constructor does not fire before entering the using method. We're essentially passing a pointer to the constructor call into the using function, which calls it.
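The difference can be observed with a small experiment. In the sketch below, the FileHandle class is a hypothetical stand-in that logs its own construction and disposal and refuses to open a file named "bad"; usingEager is a deliberately flawed by-value variant added only for contrast.

```scala
object TwoResources {
  // Assumed shape of the trait from the earlier post.
  trait Disposable { def dispose(): Unit }

  val log = scala.collection.mutable.ListBuffer[String]()

  // Hypothetical stand-in: construction fails for the name "bad".
  class FileHandle(name: String) extends Disposable {
    if (name == "bad") throw new java.io.IOException("cannot open " + name)
    log += "opened " + name
    def dispose(): Unit = { log += "disposed " + name }
  }

  // Second resource passed by name, as in the post.
  def using[T <% Disposable, U <% Disposable](resource: T, _resource2: => U)(block: (T, U) => Unit): Unit = {
    try {
      val resource2 = _resource2 // the constructor fires here, inside the outer try
      try block(resource, resource2)
      finally resource2.dispose()
    } finally {
      resource.dispose()
    }
  }

  // Flawed contrast: second resource passed by value.
  def usingEager[T <% Disposable, U <% Disposable](resource: T, resource2: U)(block: (T, U) => Unit): Unit = {
    try {
      try block(resource, resource2)
      finally resource2.dispose()
    } finally {
      resource.dispose()
    }
  }

  def runByName(): List[String] = {
    log.clear()
    try using(new FileHandle("okay"), new FileHandle("bad")) { (first, second) => () }
    catch { case _: java.io.IOException => log += "caught" }
    log.toList
  }

  def runEager(): List[String] = {
    log.clear()
    try usingEager(new FileHandle("okay"), new FileHandle("bad")) { (first, second) => () }
    catch { case _: java.io.IOException => log += "caught" }
    log.toList
  }
}
```

runByName() yields List("opened okay", "disposed okay", "caught"), whereas runEager() yields List("opened okay", "caught"): in the by-value variant the exception escapes before the method body is ever entered, so the first handle leaks.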

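Why pass just the second one by name and not the first one? Did we just get lucky? No. Scala evaluates its arguments from left to right, so the first resource is constructed before anything else happens. If its constructor throws, no resource has been acquired yet, and there is nothing to dispose.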

A consequence of this choice is that we cannot access the _resource2 argument more than once inside the using method. Note that it's accessed exactly once when defining the val resource2. Otherwise, the constructor would be called again and again inside the using method. That would be an even worse resource leak, and would probably malfunction.
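The re-evaluation behavior is easy to see in miniature. In this sketch a counter stands in for an expensive constructor; makeResource, touchTwice and touchOnce are hypothetical names used only for illustration.

```scala
object ByNameSketch {
  var constructed = 0

  // Stand-in for a constructor: each call produces a new "resource".
  def makeResource(): String = {
    constructed += 1
    "resource#" + constructed
  }

  // Accessing the by-name parameter twice evaluates the argument twice.
  def touchTwice(r: => String): String = r + " / " + r

  // Capturing it in a val evaluates the argument exactly once.
  def touchOnce(r: => String): String = {
    val cached = r
    cached + " / " + cached
  }
}
```

Starting from constructed = 0, touchTwice(makeResource()) returns "resource#1 / resource#2" and leaves the counter at 2, while touchOnce(makeResource()) from the same starting point returns "resource#1 / resource#1".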

We've now shown that our C# emulation meets all but one of our goals. This is impressive because the Scala behavior is superior even to C# itself, for example with regard to limiting the scope of variables. The remaining goal is to demonstrate how our using construct can work with legacy classes such as java.io.File that do not extend Disposable. We'll take up this cause in the near future, after a detour into some decidedly non-standard C++. But the punchline is, we had the foresight to use view bounds and not upper bounds, so we're well prepared.
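As a rough preview of why the view bound helps, here is one possible shape for such an adaptation. Everything in it is an assumption: it adapts java.io.Closeable rather than java.io.File (which has no close method), and closeableToDisposable and TrackedReader are hypothetical names. The details are left for the follow-up post.

```scala
import scala.language.implicitConversions

object LegacySketch {
  // Assumed shape of the trait from the earlier post.
  trait Disposable { def dispose(): Unit }

  // Hypothetical adapter: view any java.io.Closeable as a Disposable.
  implicit def closeableToDisposable(c: java.io.Closeable): Disposable =
    new Disposable { def dispose(): Unit = c.close() }

  // Unchanged using method; the view bound accepts anything convertible to Disposable.
  def using[T <% Disposable](resource: T)(block: T => Unit): Unit =
    try block(resource) finally resource.dispose()

  // A legacy-style class that knows nothing about Disposable.
  class TrackedReader(text: String) extends java.io.StringReader(text) {
    var closed = false
    override def close(): Unit = { closed = true; super.close() }
  }

  // Reads one character, then reports it along with whether close was called.
  def demo(): (Int, Boolean) = {
    val reader = new TrackedReader("A")
    var first = -1
    using(reader) { r => first = r.read() }
    (first, reader.closed)
  }
}
```

Here demo() returns (65, true): the block reads the character 'A', and the reader is closed through the implicit view even though TrackedReader never extends Disposable. With an upper bound (T <: Disposable) the call to using would not compile.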

In summary, we've shown how to emulate the C# using syntax in Scala, to enable RAISIN style programming. We were remarkably successful at bullet-proofing our resource management with surprisingly few lines of code. We handled many edge cases, offered flexibility, and achieved ambitious goals. Along the way, we encountered function literals, closures, pass by name, generics, view bounds, import statements, and (presently) implicits.

This was a lovely exercise because so many different aspects of Scala had to come together in harmony. It's clear that API designers must master these features to produce high quality code, but even casual programmers would do well to learn them.

2 comments:

Zab Rab Oof said...

"This is impressive because the Scala behavior is superior even to C# itself, for example with regard to limiting the scope of variables."

Actually C# does this.
using (var a = new Resource())
{
a.stuff();
}
a.otherStuff(); // does not compile (a is not in scope)

Morgan Creighton said...

Aha, thanks very much Eldritch. That's a great comment!

By coincidence, I'm starting this week to do some C# professionally, so I may find an opportunity to use what you have written soon.

Thanks again for visiting my blog and for sharing.