Friday, July 15, 2011

Reachability follow up

We've been having quite an interesting series of conversations at work about this reachability problem. Today, one of my co-workers, Dan Heidinga, pointed out that Microsoft's CLR has the same issue. For the CLR, Microsoft provides a special static method called GC.KeepAlive(Object). It acts as a hint to the virtual machine and JIT to extend an object's lifetime, but is otherwise a no-op. There's a good article about the problem on an old MSDN blog here. Note that the author considers and rejects the option of automatically extending the lifetime of all function arguments to the end of their functions, on the grounds that it would affect code generation and therefore performance.

Sunday, July 10, 2011

A subtle issue of reachability

In the last few weeks I've run into two similar and very subtle problems in Java code. In some ways, these seem to illustrate an oversight in the design of the Java language and/or virtual machine. The problem has to do with objects being collected earlier than expected.

In one case a finalizer was run and in the other a PhantomReference was cleared. I'll describe an example based on finalization. It's easy enough to see how this could also apply to reference objects.

Consider a class like this:

public class Foo {
  private byte[] array = new byte[] { 1, 2, 3 };

  public void finalize() { 
    array[0] = array[1] = 
      array[2] = 0; 
  }

  public byte[] getData() { return array.clone(); }
}

This class returns a copy of its data array. (This is a common pattern, since it prevents the caller from modifying the master copy of the array.) When instances of this class are garbage collected, they wipe out the data in the array, overwriting it with zeros. (Let's ignore why it does this; it's just an example!) So far so good.

What result will we get if we invoke getData() on an instance of Foo?

  Foo f = new Foo();
  byte[] array = f.getData();
  System.out.println("array={" + 
    array[0] + ", " + 
    array[1] + ", " + 
    array[2] + "}");

Intuitively, we expect this to print "array={1, 2, 3}". And it usually does. But is it legitimate for it to print "array={0, 0, 0}" (or even "array={1, 2, 0}")? If it did, that would mean that the object was finalized while we were still using it, wouldn't it?

Actually, that can happen. It happens quite often in the IBM Java VM, and seems to happen occasionally in Oracle HotSpot, too, but less frequently.
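To see whether a particular VM does this, a small probe along the following lines can help. It is only a sketch: the class names WipedOnFinalize and EarlyFinalizationProbe and the countUnexpected helper are made up for illustration, and the first class is simply Foo under another name so that the file stands alone. It repeatedly calls getData() on a fresh instance and counts the copies that are not {1, 2, 3}. The count depends on the VM, its JIT and its garbage collector, and may well be zero. Note also that finalize() is deprecated in recent JDKs, so the compiler may warn about it.

```java
import java.util.Arrays;

// Same shape as Foo above; renamed so this sketch is self-contained.
class WipedOnFinalize {
    private final byte[] array = new byte[] { 1, 2, 3 };

    // Zeroes the master array when the object is finalized.
    @Override
    protected void finalize() {
        array[0] = array[1] = array[2] = 0;
    }

    public byte[] getData() {
        return array.clone();
    }
}

class EarlyFinalizationProbe {
    // Calls getData() on fresh instances and counts the copies
    // that differ from the expected {1, 2, 3}.
    static int countUnexpected(int iterations) {
        byte[] expected = { 1, 2, 3 };
        int unexpected = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] copy = new WipedOnFinalize().getData();
            if (!Arrays.equals(copy, expected)) {
                unexpected++;
            }
        }
        return unexpected;
    }

    public static void main(String[] args) {
        int iterations = 100_000;
        int unexpected = countUnexpected(iterations);
        System.out.println(unexpected + " of " + iterations
            + " copies were not {1, 2, 3}");
    }
}
```

A non-zero count means that at least one object was finalized while its getData() call was still in progress. A count of zero proves nothing, since the timing has to line up for the problem to show.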

The Java VM is permitted to collect (or finalize) an object when it is no longer reachable. But how could the object become unreachable when we're running the getData() function? It's easier to understand if you imagine the function in-lined in the caller, and broken up into individual statements:

  Foo f = new Foo();
  byte[] masterArray = f.array; // ignore that array
                                // is private
  // what if a garbage collection happens here?
  // e.g. System.gc(); System.runFinalization();
  byte[] copyArray = masterArray.clone();
  System.out.println("array={" + 
    copyArray[0] + ", " + 
    copyArray[1] + ", " + 
    copyArray[2] + "}");

Here we can see that if the garbage collector interrupts the program at just the right (wrong?) time, the finalize() function might run before we clone the array. Even though we don't explicitly assign null to f, a clever VM can analyze the program and determine that f is never used again. It can reclaim the memory for that object and, in this case, finalize it, before the clone() function runs. In most cases this is exactly what you want the VM to do: garbage collect objects as early as possible to recover as much memory as possible.

Ok, but is that really the same thing? Surely, the receiver of a function is kept alive until the function returns, right? In-lining the function isn't quite the same!

Actually, neither the Java language specification nor the Java Virtual Machine specification says anything about that. In the VM, the receiver of a function (i.e. this) isn't very special at all. It's just the first argument of a virtual function. Although the language doesn't allow it (keep in mind that the Java language and the Java VM have separate specifications), you can overwrite the receiver just as you can a local variable if you're writing bytecodes directly without the aid of javac:

byte[] getData() {
  byte[] masterArray = this.array;
  this = null; // not legal in Java language,
               // but is legal in class files!
  return masterArray.clone();
}

javac won't compile this, but the JVM's class file verifier won't report any problems in this function.

So, what's the right way to write your Java code so that your objects won't be finalized or collected earlier than expected? Unfortunately, I don't know the answer. You could add an extra reference to the receiver, like this:

byte[] getData() {
  byte[] result = array.clone();
  this.array = this.array;
  return result;
}

But that's a hack, not a real solution, and is unlikely to work reliably. The VM can easily determine that the dummy "this.array = this.array" statement has no effect and can be removed, leaving us exactly where we started.

Perhaps Java needs a new keyword like this:

byte[] getData() {
  keep_alive(this) {
    return array.clone();
  }
}

However, I doubt that something like that would be used correctly very often.
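For readers on newer JDKs: Java 9 and later include a library method, java.lang.ref.Reference.reachabilityFence(Object), which plays much the same role as the hypothetical keep_alive block and as GC.KeepAlive in the CLR. It postdates this post, so what follows is a sketch of how Foo might use it, not something that was available at the time. The names FencedFoo and FencedFooDemo are made up for illustration.

```java
import java.lang.ref.Reference;
import java.util.Arrays;

// Foo again, with a reachability fence in getData(). Requires Java 9+.
class FencedFoo {
    private final byte[] array = new byte[] { 1, 2, 3 };

    // Zeroes the master array when the object is finalized.
    @Override
    protected void finalize() {
        array[0] = array[1] = array[2] = 0;
    }

    public byte[] getData() {
        try {
            return array.clone();
        } finally {
            // Keeps 'this' strongly reachable up to this point, so the
            // finalizer cannot run before clone() has completed.
            Reference.reachabilityFence(this);
        }
    }
}

class FencedFooDemo {
    public static void main(String[] args) {
        byte[] copy = new FencedFoo().getData();
        System.out.println("array=" + Arrays.toString(copy));
    }
}
```

The try/finally shape ensures that the fence is reached on every path out of the method, including when the body throws. It still has to be added by hand to every method that touches state the finalizer destroys, so the concern about it being used correctly applies here too.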

Unfortunately, the best advice is probably to avoid finalization whenever possible.