The nuts and bolts of anonymous inner classes in Java
Note:
Although I've tried to keep it basically up to date, this article is about concepts that don't have so much application to modern Java programming. Please note also that I use the term 'closure' a lot, even in circumstances where a lambda expression is not strictly a closure. This is just the terminology that was used in the Java community when I wrote this article. Modernizing the article properly would take more time than I can really spare at present. Sorry.
Introduction
This article describes the use of anonymous inner classes
in Java programming, and some of the problems that developers commonly
experience in their use.
With reference to the decompiled output of the Java compiler, it
attempts to explain that these problems are a consequence of the way
that Java had to implement inner classes without breaking backward
compatibility.
Inner classes in Java
Java has supported a notion of inner classes since version 1.1. An inner class is a class whose definition is nested inside that of another class. Named inner classes are commonly used to encapsulate subsidiary class-based logic inside a class that uses it. For example, here is the skeleton of a class for parsing XML documents:
import java.util.ArrayList;
import java.util.List;

public class XMLDocument
{
    public static class Node
    {
        protected List<Node> nodes = new ArrayList<Node>();
        // Methods for manipulating document nodes
    }
    public void parse (String s)
    {
        // Parse document into nodes
    }
}
The class Node is defined within XMLDocument
because it is subsidiary to the document
itself. Because this Node class is defined as static
and public, it is accessible to other classes, and
access does not require a specific instance of XMLDocument.
For example, in another class I could say:
XMLDocument.Node n = new XMLDocument.Node();
Had I not declared the
Node class as static,
I could still manipulate instances of XMLDocument.Node
from other classes,
but I could not instantiate Node instances independently of an
enclosing XMLDocument.
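As a minimal sketch of that difference, the following uses a hypothetical class XMLDocumentSketch (not part of the original example) with one static nested class and one non-static inner class. The names XMLDocumentSketch, Attribute, and owner() are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NestedDemo {
    public static void main(String[] args) {
        // Static nested class: no enclosing instance is needed.
        XMLDocumentSketch.Node n = new XMLDocumentSketch.Node();
        System.out.println(n.nodes.size());

        // XMLDocumentSketch.Attribute bad = new XMLDocumentSketch.Attribute();
        // ^ would not compile: an enclosing instance is required.

        // Non-static inner class: created through a specific enclosing instance.
        XMLDocumentSketch doc = new XMLDocumentSketch();
        XMLDocumentSketch.Attribute a = doc.new Attribute();
        System.out.println(a.owner() == doc);
    }
}

// Hypothetical variant of XMLDocument, for illustration only.
class XMLDocumentSketch {
    public static class Node {
        protected List<Node> nodes = new ArrayList<>();
    }

    public class Attribute {
        // Every Attribute carries a reference to the document that created it.
        XMLDocumentSketch owner() { return XMLDocumentSketch.this; }
    }
}
```

The doc.new Attribute() form is what "cannot be instantiated independently" means in practice: another class can still hold and use an Attribute, but it has to obtain one via a document.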
It should be clear that defining an inner class as public static
is almost the same as defining a global class. What we're really
doing here is introducing a new level of application packaging, rather
than encapsulating logic -- creating an inner class of this sort is
rather a stylistic decision than a logical one. Had I declared the node
class non-static and private, then I would have
created a class which is
owned and managed entirely by its enclosing class; but it would still be a
named class with its own identity, of a sort.
Anonymous inner classes
Anonymous inner classes look quite different from named classes. They have no (programmer-defined) name, only a limited independent identity, and typically are defined entirely within specific programming statements. The following code example shows a typical use of an anonymous inner class, which defines and instantiates a subclass of Thread to carry out some background operation.
public class Test
{
    public static void main (String[] args)
    {
        final int[] ticks = new int[1];
        new Thread()
        {
            public void run()
            {
                while (true)
                {
                    try {Thread.sleep(1000);} catch (InterruptedException e){}
                    System.out.println ("ticks=" + (ticks[0]++));
                }
            }
        }.start ();
        System.out.println ("Thread started; carrying on...");
    }
}
The syntax is somewhat unlike any other class definition in Java.
In outline we have:
Something o = new Something()
{
    // Definition of the methods of Something
};
o.someMethod();
In the previous example, since we were only calling one method
on the Thread (start()), we did not even
need to assign the new instance a variable name. We just had:
new Something()
{
    // Definition of the methods of Something
}.someMethod();
Something may be a class name or an interface name; definition
of an anonymous inner class is one of the few circumstances in which we
can legitimately say new [Interface] in Java. We
can't instantiate an interface, of course, but with anonymous inner classes we
provide the implementation of the interface right in the instantiation
statement itself. Similarly, we can
instantiate a fully abstract class this way, provided that the
definition of the inner class provides definitions of all the required
abstract methods.
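Both cases can be sketched briefly. In the example below, Comparator is the standard java.util interface; the abstract class Shape and the method names are hypothetical, introduced only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

class AnonymousDemo {
    // Hypothetical abstract class, used only for this illustration.
    static abstract class Shape {
        abstract double area();
    }

    static String[] sortByLength(String... words) {
        String[] copy = words.clone();
        // "new Comparator<String>()" names an interface; the body supplies
        // the implementation on the spot.
        Arrays.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    static double unitSquareArea() {
        // The same syntax works for an abstract class, provided every
        // abstract method is defined in the body.
        Shape square = new Shape() {
            @Override
            double area() { return 1.0; }
        };
        return square.area();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortByLength("ccc", "a", "bb")));
        System.out.println(unitSquareArea());
    }
}
```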
Whether you like the anonymous inner class syntax or not, it is undeniable
that this is idiomatic Java. Partly this is because Java
relies so heavily on interfaces, and it is often much more compact simply
to provide the implementation of the interface in line with the code that
uses it.
Limitations of the use of anonymous inner classes
There are two related problems that Java developers frequently come up against when coding with anonymous inner classes. Both are related to identifier scope, but in different ways. Consider the following code, which is a very slight variation on the previous example -- but this one will not compile:
public class Test
{
    public static void main (String[] args)
    {
        int ticks = 0;
        new Thread()
        {
            public void run()
            {
                while (true)
                {
                    try {Thread.sleep(1000);} catch (InterruptedException e){}
                    System.out.println ("ticks=" + (ticks++));
                }
            }
        }.start ();
        System.out.println ("Thread started; carrying on...");
    }
}
If you try to compile this, you'll get a spiteful message from the compiler:
Test.java:13: local variable ticks is accessed from within inner class; needs to be declared final
System.out.println ("ticks=" + (ticks++));
Of course, declaring the variable final is not at all what we
want in this case -- we want the thread to be able to update the
value of ticks.
The usual (wrong) explanation that is offered for this problem is that
the variable ticks is out of scope when the method
main() ends, and so is not available to the inner class. However,
the same could be said for the variable ticks[] in the first
example, and that compiles just fine. In fact, declaring a final
array containing one variable is an ugly, but commonplace, workaround for
the problem described here.
The other common problem concerns scope resolution within the methods
of the inner class. In the example above, the closest enclosing scope
of the method run() is the class Thread and
not the method main(), even though the code layout
would suggest otherwise. This can lead to subtle problems with unexpected
methods being called when there are multiple methods with the same name
in different scopes.
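A small sketch of this second problem follows. To keep it free of threads, a hypothetical class Base stands in for Thread, and both Base and the enclosing class define a method called label(); all of these names are invented for the illustration.

```java
class ScopeDemo {
    // Hypothetical base class standing in for Thread.
    static class Base {
        String label() { return "Base.label"; }
    }

    // A method with the same name in the enclosing class.
    static String label() { return "ScopeDemo.label"; }

    static String unqualifiedCall() {
        return new Base() {
            // The anonymous class inherits label() from Base, so the
            // unqualified call resolves there, not to ScopeDemo.label().
            String ask() { return label(); }
        }.ask();
    }

    static String qualifiedCall() {
        return new Base() {
            // Qualifying the call selects the enclosing class's method.
            String ask() { return ScopeDemo.label(); }
        }.ask();
    }

    public static void main(String[] args) {
        System.out.println(unqualifiedCall());   // prints "Base.label"
        System.out.println(qualifiedCall());     // prints "ScopeDemo.label"
    }
}
```

Qualifying the call with the enclosing class name is one way to make the intent explicit when the two scopes share a method name.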
Both these problems are hard to understand until we see how anonymous
inner classes are actually implemented.
Anonymous inner classes under the hood
To understand what's going on here, we need to look at the code generated by the compiler. Because bytecode is not particularly easy to read, my approach will be to compile the classes, then convert them back to Java with a decompiler tool.
The first point to note is that the Java runtime has no understanding of inner classes at all. Whether the inner class is named or anonymous, a smoke-and-mirrors procedure is used to convert the inner class to a global class. If the class has a name, then the compiler generates class files whose names have the format [outer]$[inner] -- $ is a legal character in Java identifiers.
For anonymous inner classes, the generated class files are simply numbered.
So when the Thread example at the start of this article
is compiled, we end up with a class file called Test$1.class.
The number '1' indicates that this is the first anonymous class
defined within the class Test.
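The generated names can also be observed at runtime through reflection, without a decompiler. The sketch below uses hypothetical names (NameDemo, Helper); the exact number given to the anonymous class is a compiler detail, so treat the names in the comments as what javac typically produces rather than a guarantee.

```java
class NameDemo {
    static class Helper { }   // a named nested class

    static Class<?> anonymousClass() {
        // An anonymous subclass of Thread; the thread is never started.
        Object o = new Thread() { };
        return o.getClass();
    }

    public static void main(String[] args) {
        // Named nested class: [outer]$[inner], e.g. NameDemo$Helper
        System.out.println(Helper.class.getName());

        // Anonymous class: [outer]$[number], typically NameDemo$1 with javac
        Class<?> anon = anonymousClass();
        System.out.println(anon.getName());
        System.out.println(anon.isAnonymousClass());
        System.out.println(anon.getSuperclass().getName());
    }
}
```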
Here is the code generated by the compiler for the public class
called Test.
public class Test {
    public static void main(String[] var0) {
        int[] var1 = new int[1];
        (new 1(var1)).start();
        System.out.println("Thread started; carrying on...");
    }
}
You'll notice that the entire inner class definition is missing, and
the instantiation of the inner class and the call to the
start() method is replaced by:
int[] var1 = new int[1];
(new 1(var1)).start();
The class called 1 (not normally a legal class name,
of course), is the anonymous inner class, whose implementation in the
class file Test$1.class we'll get to in a minute.
Because the decompiler loses local variable names, it takes a bit of
detective work to realize that var1 is actually the
final array ticks we declared in the
main() method:
final int[] ticks = new int[1];
When the anonymous inner class is instantiated, it gets passed the array
ticks in its constructor. We did not tell the
compiler to do that -- it had to do it, because there's
really no other way for the local variable ticks to
be made accessible to the anonymous inner class which,
as we can see, is not really
inner at all.
Now the inner class itself:
final class Test$1 extends Thread {
    final int[] val$ticks;
    Test$1(int[] var1) {
        this.val$ticks = var1;
    }
    public void run() {
        while(true) {
            try {
                Thread.sleep(1000L);
            } catch (InterruptedException var2) {
                ;
            }
            PrintStream var10000 = System.out;
            StringBuilder var10001 = (new StringBuilder()).append("ticks=");
            int var10005 = this.val$ticks[0];
            int var10002 = this.val$ticks[0];
            this.val$ticks[0] = var10005 + 1;
            var10000.println(var10001.append(var10002).toString());
        }
    }
}
Some of this rather tortuous code arises from the way that
string concatenation is implemented in Java -- as a bunch of
StringBuilder operations. That code isn't really
relevant here.
The first thing to note is that the class Test$1 extends
Thread -- it has to, because that's part of the definition
in the original public outer class:
new Thread()
{
    // etc
}.start();
Now look at the next few lines of this class:
final int[] val$ticks;
Test$1(int[] var1) {
    this.val$ticks = var1;
}
The array val$ticks is simply the counterpart in
this inner class of the array ticks that we declared
in the main() method of Test. The constructor
initializes this array from the value of ticks passed
from the enclosing class.
Thereafter, the run() method references the
elements of val$ticks, and any modifications made in this
method are reflected back in the main() method, since
ticks and val$ticks refer to the same array.
Had the anonymous class used more local variables of main(), then
the compiler would simply have extended the constructor of the
anonymous class to include more parameters.
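To make the transformation concrete, here is a hand-written approximation of what the compiler does when two local variables are captured. It is a sketch, not compiler output: TickerThread, label, and runOnce() are invented names, and the loop is bounded to three iterations so that the program terminates.

```java
class DesugarDemo {
    static int runOnce() {
        final int[] ticks = new int[1];
        final String label = "ticks=";
        // Captured locals are handed over explicitly, as the generated
        // constructor call would do.
        Thread t = new TickerThread(ticks, label);
        t.start();
        try {
            t.join();   // wait, so the final count can be read safely
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ticks[0];
    }

    public static void main(String[] args) {
        System.out.println("final count: " + runOnce());
    }
}

// Hand-written stand-in for a generated class such as Test$1.
final class TickerThread extends Thread {
    private final int[] ticks;    // plays the role of val$ticks
    private final String label;   // a second captured variable, a second field

    TickerThread(int[] ticks, String label) {
        this.ticks = ticks;
        this.label = label;
    }

    @Override
    public void run() {
        for (int i = 0; i < 3; i++) {
            System.out.println(label + (ticks[0]++));
        }
    }
}
```

Because the field and the local variable refer to the same array, the count read in runOnce() after join() reflects the three increments made in run().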
Why the implementation leads to problems
The Java runtime has no built-in notion of inner classes. We have seen how anonymous inner class usage is cleverly transformed into global class operations, with a bunch of synthetic variables and constructors forming the bridge between the inner and outer classes. But, in the end, we are dealing with separate classes here. They have the same scope and lifetime arrangements as any other Java classes.
It's easy to see why the run() method in the anonymous class
'sees' members of the Thread class before members of the
Test class -- at runtime the inner class is nothing more nor
less than a global class that extends Thread.
The problem in which local variables need to be declared as final
is also easily explained, when we know how the implementation works.
If I had defined ticks as a plain integer, then it would
have been passed to the constructor of the inner class by value, and
the inner class would have its own version of the variable, completely
independent of the value in the main() method. This has
the potential to be deeply confusing and error-prone, and so the
compiler rejects any attempt to create such a situation.
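The confusing situation that the compiler is guarding against can be reproduced by hand. In this sketch the hypothetical class Counter plays the part of a generated class that received a plain int through its constructor; the names are invented for illustration.

```java
class CopyDemo {
    // Hand-written stand-in for a generated class that captured an int.
    static final class Counter {
        private int ticks;   // a private copy of the caller's value

        Counter(int ticks) { this.ticks = ticks; }

        void increment() { ticks++; }
    }

    // Returns { value of the local variable, value of the copy }.
    static int[] demo() {
        int ticks = 0;
        Counter c = new Counter(ticks);   // passed by value
        c.increment();
        c.increment();
        return new int[] { ticks, c.ticks };
    }

    public static void main(String[] args) {
        int[] result = demo();
        // prints "local=0 copy=2": the increments never reach the local
        System.out.println("local=" + result[0] + " copy=" + result[1]);
    }
}
```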
When we refer to an array in the run() method it still
has to be declared final; but all this means in Java is
that the variable that represents the array cannot be changed to take on
the value of a different array. It does not mean that the array contents
cannot be changed. Arguably, this is an odd definition of 'final', but
it's useful here.
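The distinction between the reference and the contents can be seen in a few lines. This is a sketch with invented names (FinalDemo, bumpTwice); the anonymous Runnable is run directly rather than on a thread to keep the result deterministic.

```java
class FinalDemo {
    static int bumpTwice() {
        final int[] ticks = new int[1];
        Runnable bump = new Runnable() {
            @Override
            public void run() {
                ticks[0]++;              // allowed: changes the array's contents
                // ticks = new int[1];   // rejected: a final variable cannot be reassigned
            }
        };
        bump.run();
        bump.run();
        return ticks[0];
    }

    public static void main(String[] args) {
        System.out.println(bumpTwice());   // prints 2
    }
}
```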
All these limitations could be overcome by changing the way that the
JVM deals with inner classes at runtime. So far, no such change has been
made, presumably because it would be difficult to keep the JVM
backward-compatible with earlier compiled code. Moreover, it's possible
that any plans in that direction could be overtaken by the current
work on closures.
Where closures fit into all this
It seems very likely that the way in which anonymous inner classes are predominantly used reflects the fact that Java had for a long time no support for closures as first-class language elements. Many of the things that we do, in a rather ungainly way, with inner classes can be done in a more elegant way with closures. In this context, a closure is a code block that can be manipulated as an independent language element. With closure support, our original threading example code could be re-written something like this:
public class Test
{
    public static void main (String[] args)
    {
        int ticks = 0;
        new Thread
        (
            { () -> while (true) { System.out.println (ticks++); } }
        ).start();
        System.out.println ("Thread started; carrying on...");
    }
}
In this example (which may, or may not, ever work in Java), I've
passed an anonymous block of code to the constructor of Thread,
which stores it, and invokes it when its start() method is called.
It's not hugely more elegant than the anonymous inner class example but, to be
fair, this isn't really the kind of situation that closures are intended
to simplify.
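For comparison, here is a sketch using the lambda syntax that shipped in Java 8. A lambda may only capture local variables that are final or effectively final, so a plain int ticks that is incremented inside the lambda is still rejected, and the single-element array remains in use. The class and method names are invented, and the loop is bounded so that the example terminates.

```java
class LambdaDemo {
    static int runOnce() {
        final int[] ticks = new int[1];   // the array workaround is still needed
        // Thread accepts a Runnable; the lambda supplies its run() method.
        Thread t = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                System.out.println("ticks=" + (ticks[0]++));
            }
        });
        t.start();
        try {
            t.join();   // wait, so the final count can be read safely
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ticks[0];
    }

    public static void main(String[] args) {
        System.out.println("final count: " + runOnce());
    }
}
```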
Closing remarks
The limitations of anonymous inner classes can readily be understood, not as the result of theoretical decisions in programming language theory, but as expediencies that follow from the implementation strategy. Whether the introduction of closures into the language will eventually change any of these limitations remains to be seen.