The nuts and bolts of anonymous inner classes in Java
Note:
Although I've tried to keep it basically up to date, this article is about concepts that don't have so much application to modern Java programming. Please note also that I use the term 'closure' a lot, even in circumstances where a lambda expression is not strictly a closure. This is just the terminology that was used in the Java community when I wrote this article. Modernizing the article properly would take more time than I can really spare at present. Sorry.
Introduction
This article describes the use of
anonymous inner classes
in Java programming, and some of the problems that developers commonly
experience in their use.
With reference to the decompiled output of the Java compiler, it
attempts to explain that these problems are a consequence of the way
that Java had to implement inner classes without breaking backward
compatibility.
Inner classes in Java
Java has supported a notion of
inner classes since version 1.1.
An inner class is a class whose definition is nested inside that of
another class.
Named inner classes are commonly used to encapsulate
subsidiary class-based logic inside a class that uses it. For example,
here is the skeleton of a class for parsing XML documents:
import java.util.ArrayList;
import java.util.List;
public class XMLDocument
{
    public static class Node
    {
        // Child nodes of this node
        protected List<Node> nodes = new ArrayList<Node>();
        // Methods for manipulating document nodes
    }
    public void parse (String s)
    {
        // Parse document into nodes
    }
}
The class Node is defined within XMLDocument because it is subsidiary
to the document itself. Because this Node class is defined as static
and public, it is accessible to other classes, and access does not
require a specific instance of XMLDocument.
For example, in another class I could say:
XMLDocument.Node n = new XMLDocument.Node();
Had I not declared the Node class as static, I could still manipulate
instances of XMLDocument.Node from other classes, but I could not
instantiate Node instances independently of an enclosing XMLDocument.
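As a rough sketch of that difference, the following uses hypothetical names (Document, StaticNode, BoundNode and NestedDemo are mine, not part of the XMLDocument skeleton). It shows a static nested class being created on its own, and a non-static inner class that can only be created through an existing instance of its enclosing class.

```java
// Hypothetical classes for illustration only.
class Document
{
    private final String title;

    Document (String title) { this.title = title; }

    // Static nested class: can be created without any Document instance.
    static class StaticNode
    {
        String describe() { return "static node"; }
    }

    // Non-static inner class: every instance is tied to one Document,
    // and can read that Document's fields directly.
    class BoundNode
    {
        String describe() { return "node of " + title; }
    }
}

class NestedDemo
{
    static String run()
    {
        Document.StaticNode a = new Document.StaticNode();
        Document doc = new Document ("report");
        // From outside, the enclosing instance has to be supplied explicitly.
        Document.BoundNode b = doc.new BoundNode();
        return a.describe() + "; " + b.describe();
    }

    public static void main (String[] args)
    {
        System.out.println (run());
    }
}
```

The doc.new BoundNode() form is rarely seen in practice, which is one reason nested classes meant for outside use tend to be declared static.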
It should be clear that defining an inner class as public static is
almost the same as defining a global class. What we're really
doing here is introducing a new level of application packaging, rather
than encapsulating logic -- creating an inner class of this sort is
a stylistic decision rather than a logical one. Had I declared the node
class non-static and private, then I would have created a class which is
owned and managed entirely by its enclosing class; but it would still be a
named class with its own identity, of a sort.
Anonymous inner classes
Anonymous inner classes look quite different from named
classes. They have
no (programmer-defined) name, only a limited independent identity,
and typically are defined
entirely within specific programming statements.
The following code example shows a typical use of an anonymous
inner class, which defines and instantiates a subclass of
Thread
to carry out some background operation.
public class Test
{
    public static void main (String[] args)
    {
        final int[] ticks = new int[1];
        new Thread()
        {
            public void run()
            {
                while (true)
                {
                    try {Thread.sleep(1000);} catch (InterruptedException e){}
                    System.out.println ("ticks=" + (ticks[0]++));
                }
            }
        }.start ();
        System.out.println ("Thread started; carrying on...");
    }
}
The syntax is somewhat unlike any other class definition in Java.
In outline we have:
Something o = new Something()
{
// Definition of the methods of Something
};
o.someMethod();
In the previous example, since we were only calling one method
on the Thread (start()), we did not even
need to assign the new instance a variable name. We just had:
new Something()
{
// Definition of the methods of Something
}.someMethod();
Something may be a class name or an interface name; definition
of an anonymous inner class is one of the few circumstances in which we
can legitimately say new [Interface] in Java. We
can't instantiate an interface, of course, but with anonymous inner classes we
provide the implementation of the interface right in the instantiation
statement itself. Similarly, we can
instantiate a fully abstract class this way, provided that the
definition of the inner class provides definitions of all the required
abstract methods.
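To make both cases concrete, here is a small sketch. The class AnonymousInterfaceDemo and the abstract class Shape are hypothetical names of my own; Comparator and Arrays.sort are the standard library ones. The first method says new Comparator<String>() on an interface, the second says new Shape() on an abstract class.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class; only Comparator and Arrays come from the JDK.
class AnonymousInterfaceDemo
{
    // A fully abstract class, invented for this example.
    abstract static class Shape
    {
        abstract double area();
    }

    // 'new' applied to an interface: the body supplies the implementation.
    static String[] sortByLength (String[] words)
    {
        String[] copy = words.clone();
        Arrays.sort (copy, new Comparator<String>()
        {
            @Override
            public int compare (String a, String b)
            {
                return Integer.compare (a.length(), b.length());
            }
        });
        return copy;
    }

    // 'new' applied to an abstract class: the body defines the abstract method.
    static double unitSquareArea()
    {
        Shape s = new Shape()
        {
            @Override
            double area() { return 1.0; }
        };
        return s.area();
    }

    public static void main (String[] args)
    {
        System.out.println (Arrays.toString
            (sortByLength (new String[] {"ccc", "a", "bb"})));
        System.out.println (unitSquareArea());
    }
}
```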
Whether you like the anonymous inner class syntax or not, it is undeniable
that this is idiomatic Java. Partly this is because Java
relies so heavily on interfaces, and it is often much more compact simply
to provide the implementation of the interface in line with the code that
uses it.
Limitations of the use of anonymous inner classes
There are two related problems that Java developers frequently come
up against when coding with anonymous inner classes. Both concern
identifier scope, but in different ways.
Consider the following code, which is a very slight variation on the previous
example -- but this one will not compile:
public class Test
{
    public static void main (String[] args)
    {
        int ticks = 0;
        new Thread()
        {
            public void run()
            {
                while (true)
                {
                    try {Thread.sleep(1000);} catch (InterruptedException e){}
                    System.out.println ("ticks=" + (ticks++));
                }
            }
        }.start ();
        System.out.println ("Thread started; carrying on...");
    }
}
If you try to compile this, you'll get a spiteful message from the compiler:
Test.java:13: local variable ticks is accessed from within inner class; needs to be declared final
System.out.println ("ticks=" + (ticks++));
Of course, declaring the variable final is not at all what we
want in this case -- we want the thread to be able to update the
value of ticks.
The usual (wrong) explanation that is offered for this problem is that
the variable ticks is out of scope when the method main() ends, and so
is not available to the inner class. However,
the same could be said for the variable ticks[] in the first
example, and that compiles just fine. In fact, declaring a final
array containing one element is an ugly, but commonplace, workaround for
the problem described here.
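A minimal sketch of the one-element array workaround follows. CounterWorkaroundDemo is a hypothetical class name. For comparison I've also shown java.util.concurrent.atomic.AtomicInteger as a holder; that alternative is my own suggestion rather than something used in the examples above. Both methods call run() directly on the calling thread, rather than starting a Thread, so that the result is predictable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class.
class CounterWorkaroundDemo
{
    // The array reference is final, but its single element can still change.
    static int countWithArray (int n)
    {
        final int[] ticks = new int[1];
        Runnable r = new Runnable()
        {
            public void run() { ticks[0]++; }
        };
        for (int i = 0; i < n; i++) r.run();
        return ticks[0];
    }

    // Same idea with a library holder object instead of an array
    // (an alternative I am assuming is acceptable; not from the article).
    static int countWithAtomic (int n)
    {
        final AtomicInteger ticks = new AtomicInteger();
        Runnable r = new Runnable()
        {
            public void run() { ticks.incrementAndGet(); }
        };
        for (int i = 0; i < n; i++) r.run();
        return ticks.get();
    }

    public static void main (String[] args)
    {
        System.out.println (countWithArray (3));
        System.out.println (countWithAtomic (4));
    }
}
```

If the counter really is shared between threads, as in the Thread example, the AtomicInteger version has the advantage that its updates are safe to observe from another thread; a plain array element gives no such guarantee.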
The other common problem concerns scope resolution within the methods
of the inner class. In the example above, the closest enclosing scope
of the method run() is the class Thread and
not the method main(), even though the code layout
would suggest otherwise. This can lead to subtle problems with unexpected
methods being called when there are multiple methods with the same name
in different scopes.
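The following sketch shows the kind of surprise this can cause. All the names here (ScopeDemo, Base, label) are hypothetical. Both the enclosing class and the superclass of the anonymous class define a method called label(); an unqualified call from inside the anonymous class picks the inherited one, and the enclosing one has to be named explicitly.

```java
// Hypothetical classes for illustration only.
class ScopeDemo
{
    static class Base
    {
        String label() { return "Base.label"; }
    }

    // Same name as Base.label(), defined in the enclosing class.
    static String label() { return "ScopeDemo.label"; }

    static String unqualifiedCall()
    {
        Base b = new Base()
        {
            @Override
            public String toString()
            {
                // Resolves to the label() inherited from Base,
                // not to the one in the enclosing class.
                return label();
            }
        };
        return b.toString();
    }

    static String qualifiedCall()
    {
        Base b = new Base()
        {
            @Override
            public String toString()
            {
                // Naming the enclosing class selects its method instead.
                return ScopeDemo.label();
            }
        };
        return b.toString();
    }

    public static void main (String[] args)
    {
        System.out.println (unqualifiedCall());
        System.out.println (qualifiedCall());
    }
}
```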
Both these problems are hard to understand until we see how anonymous
inner classes are actually implemented.
Anonymous inner classes under the hood
To understand what's going on here, we need to look at the code generated
by the compiler. Because bytecode is not particularly easy to read,
my approach will be to compile the classes, then convert them back
to Java with a decompiler tool.
The first point to note is that the Java runtime has
no understanding of inner classes at all. Whether the inner class
is named or anonymous, a smoke-and-mirrors procedure is used to convert
the inner class to a global class. If the class has a name, then
the compiler generates class files whose names have the format
[outer]$[inner] -- $ is a legal character in Java identifiers.
For anonymous inner classes, the generated class files are simply numbered.
So when the Thread example at the start of this article
is compiled, we end up with a class file called Test$1.class.
The number '1' indicates that this is the first anonymous class
defined within the class Test.
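The generated names can be seen without looking at the class files, by asking an object for the name of its class at runtime. This is only a sketch: NamingDemo and Helper are hypothetical names, and the exact number given to the anonymous class is up to the compiler, although I would expect javac to number the first one 1.

```java
// Hypothetical demo class.
class NamingDemo
{
    // A named nested class, for comparison.
    static class Helper { }

    static String namedClassName()
    {
        return new Helper().getClass().getName();
    }

    static String anonymousClassName()
    {
        // An anonymous subclass of Object with an empty body.
        Object o = new Object() { };
        return o.getClass().getName();
    }

    public static void main (String[] args)
    {
        // Expect names of the form NamingDemo$Helper and NamingDemo$<number>.
        System.out.println (namedClassName());
        System.out.println (anonymousClassName());
    }
}
```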
Here is the code generated by the compiler for the public class
called Test.
public class Test {
    public static void main(String[] var0) {
        int[] var1 = new int[1];
        (new 1(var1)).start();
        System.out.println("Thread started; carrying on...");
    }
}
You'll notice that the entire inner class definition is missing, and
the instantiation of the inner class and the call to the start()
method are replaced by:
int[] var1 = new int[1];
(new 1(var1)).start();
The class called 1 (not normally a legal class name,
of course), is the anonymous inner class, whose implementation in the
class file Test$1.class we'll get to in a minute.
Because the decompiler loses local variable names, it takes a bit of
detective work to realize that var1 is actually the final array ticks
we declared in the main() method:
final int[] ticks = new int[1];
When the anonymous inner class is instantiated, it gets passed the
array ticks in its constructor. We did not tell the
compiler to do that -- it had to do it, because there's
really no other way for the local variable ticks to
be made accessible to the anonymous inner class which,
as we can see, is not really inner at all.
Now the inner class itself:
final class Test$1 extends Thread {
    final int[] val$ticks;
    Test$1(int[] var1) {
        this.val$ticks = var1;
    }
    public void run() {
        while(true) {
            try {
                Thread.sleep(1000L);
            } catch (InterruptedException var2) {
                ;
            }
            PrintStream var10000 = System.out;
            StringBuilder var10001 = (new StringBuilder()).append("ticks=");
            int var10005 = this.val$ticks[0];
            int var10002 = this.val$ticks[0];
            this.val$ticks[0] = var10005 + 1;
            var10000.println(var10001.append(var10002).toString());
        }
    }
}
Some of this rather tortuous code arises from the way that
string concatenation is implemented in Java -- as a bunch of
StringBuilder operations. That code isn't really
relevant here.
The first thing to note is that the class Test$1 extends
Thread -- it has to, because that's part of the definition
in the original public outer class:
new Thread()
{
// etc
}.start();
Now look at the next few lines of this class:
final int[] val$ticks;
Test$1(int[] var1) {
this.val$ticks = var1;
}
The array val$ticks is simply the counterpart in
this inner class of the array ticks that we declared
in the main() method of Test. The constructor
initializes this array from the value of ticks passed
from the enclosing class.
Thereafter, the run() method references the
elements of val$ticks, and any modifications made in this
method are reflected back in the main() method, since
ticks and val$ticks refer to the same array.
Had the inner class referred to more of the local variables of main(), then
the compiler would simply have extended the constructor of the
anonymous class to include more parameters.
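If you don't have a decompiler to hand, the synthetic field can also be glimpsed using reflection. This is a sketch only: CaptureInspector and its methods are hypothetical names, and the field name (val$ticks with the compiler I used above) is a compiler detail rather than something the language guarantees, so the code checks only the field's type.

```java
import java.lang.reflect.Field;

// Hypothetical helper class for illustration.
class CaptureInspector
{
    // Returns an anonymous Runnable that captures the local array 'ticks'.
    static Object makeCapturingInstance()
    {
        final int[] ticks = new int[1];
        return new Runnable()
        {
            public void run() { ticks[0]++; }
        };
    }

    // True if the object's class declares a field of type int[];
    // the anonymous class body above declares no fields of its own.
    static boolean hasIntArrayField (Object o)
    {
        for (Field f : o.getClass().getDeclaredFields())
        {
            if (f.getType() == int[].class) return true;
        }
        return false;
    }

    public static void main (String[] args)
    {
        Object o = makeCapturingInstance();
        // Lists whatever fields the compiler generated for the capture.
        for (Field f : o.getClass().getDeclaredFields())
        {
            System.out.println (f.getType().getSimpleName() + " " + f.getName());
        }
        System.out.println (hasIntArrayField (o));
    }
}
```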
Why the implementation leads to problems
The Java runtime has no built-in notion of inner classes. We have seen
how anonymous inner class usage is cleverly transformed into global
class operations, with a bunch of synthetic variables and constructors
forming the bridge between the inner and outer classes.
But, in the end, we are dealing with separate classes here. They have
the same scope and lifetime arrangements as any other Java classes. It's
easy to see why the run() method in the anonymous class
'sees' members of the Thread class before members of the
Test class -- at runtime the inner class is nothing more nor
less than a global class that extends Thread.
The problem in which local variables need to be declared as final
is also easily explained, when we know how the implementation works.
If I had defined ticks as a plain integer, then it would
have been passed to the constructor of the inner class by value, and
the inner class would have its own version of the variable, completely
independent of the value in the main() method. This has
the potential to be deeply confusing and error-prone, and so the
compiler rejects any attempt to create such a situation.
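To see why a copied integer would be so confusing, here is a hand-written stand-in for the class the compiler might have generated if it had allowed a plain int to be captured. Everything in it (CopySemanticsDemo, TickCounter, ticksCopy) is hypothetical; no compiler actually produces this.

```java
// Hypothetical demo of what copy-by-value capture would mean.
class CopySemanticsDemo
{
    // Stand-in for a generated class that received 'ticks' by value.
    static class TickCounter implements Runnable
    {
        private int ticksCopy;

        TickCounter (int ticks) { this.ticksCopy = ticks; }

        // Updates only this object's private copy.
        public void run() { ticksCopy++; }

        int copy() { return ticksCopy; }
    }

    // Returns { value seen by the caller, value seen by the counter }.
    static int[] demonstrate()
    {
        int ticks = 0;
        TickCounter c = new TickCounter (ticks);
        c.run();
        c.run();
        // The caller's variable is untouched; the copy has moved on.
        return new int[] { ticks, c.copy() };
    }

    public static void main (String[] args)
    {
        int[] r = demonstrate();
        System.out.println ("caller sees " + r[0] + ", counter sees " + r[1]);
    }
}
```

The two values drift apart as soon as either side makes an update, which is the situation the compiler refuses to create.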
When we refer to an array in the run() method it still
has to be declared final; but all this means in Java is
that the variable that represents the array cannot be changed to take on
the value of a different array. It does not mean that the array contents
cannot be changed. Arguably, this is an odd definition of 'final', but
it's useful here.
All these limitations could be overcome by changing the way that the
JVM deals with inner classes at runtime. So far, no such change has been
made, presumably because it would be difficult to keep the JVM
backward-compatible with earlier compiled code. Moreover, it's possible
that any plans in that direction could be overtaken by the current
work on closures.
Where closures fit into all this
It seems very likely that the way in which anonymous inner classes are
predominantly used reflects the fact that Java had for a long time no
support for
closures as first-class language elements.
Many of the things that
we do, in a rather ungainly way, with inner classes can be done in a more
elegant way with closures. In this
context, a closure is a code block that can be manipulated as an independent
language element. With closure support, our original threading example
code could be
re-written something like this:
public class Test
{
    public static void main (String[] args)
    {
        int ticks = 0;
        new Thread
        (
            { () -> while (true) { System.out.println (ticks++); } }
        ).start();
        System.out.println ("Thread started; carrying on...");
    }
}
In this example (which may, or may not, ever work in Java), I've
passed an anonymous block of code to the constructor of Thread,
which stores it, and invokes it when its start() method is called.
It's not hugely more elegant than the anonymous inner class example but, to be
fair, this isn't really the kind of situation that closures are intended
to simplify.
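For comparison, Java 8 and later do have lambda expressions, and the Thread constructor accepts one wherever it expects a Runnable. A captured local variable must still be final or effectively final, so a counter still needs some kind of holder. The sketch below uses AtomicInteger for that; LambdaTickDemo is a hypothetical name, and the loop is bounded (rather than while (true)) so that the example terminates.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; requires Java 8 or later.
class LambdaTickDemo
{
    static int runTicks (int n)
    {
        // The reference is effectively final; the value inside can change.
        AtomicInteger ticks = new AtomicInteger();
        Thread t = new Thread (() ->
        {
            for (int i = 0; i < n; i++) ticks.incrementAndGet();
        });
        t.start();
        try
        {
            // Wait for the background thread so the count is complete.
            t.join();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException (e);
        }
        return ticks.get();
    }

    public static void main (String[] args)
    {
        System.out.println ("ticks=" + runTicks (3));
    }
}
```

Writing int ticks = 0 and ticks++ inside the lambda is rejected by the compiler, for much the same reason as in the anonymous inner class case.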
Closing remarks
The limitations of anonymous inner classes can readily be understood, not
as the result of decisions in programming language theory, but
as expediencies that follow from the implementation strategy. Whether the
introduction of closures into the language will eventually
change any of these limitations remains to be seen.