Java as a scripting language: new auto-compilation features in Java 11
Introduction
Java is traditionally a compiled language, although the output of the compiler
is not "machine code" in the conventional sense. Separate tools are needed to
compile the Java source and to execute the compiled output.
Languages like Perl and
Python, on the other hand, are traditionally interpreted -- executing the
program is a single-step operation that takes source code as its input.
Many Linux/Unix shells, like bash, offer fairly sophisticated programming features in their own right: the choice whether to use bash or Perl for a task can sometimes be a coin-tossing one.
Programming languages that support one-step execution are often referred to as "scripting languages", although that's a pretty vague term.
Developers and system administrators have typically turned to scripting languages for implementing quick (and sometimes dirty) utilities for simple, highly-specific tasks. The lack of a separate compilation operation makes it easy to debug and revise script code, and scripting languages usually have an accessible syntax.
Since Java 11, it has become possible to use Java in a very similar way to scripting languages. This new feature is particularly interesting if you want to use Java to write command-line utilities. So, while the basic principle I'm describing here does apply, to some extent, to Microsoft Windows, I would expect it to be of most interest to Linux/Unix developers and administrators.
Note:
When I use the term "Java script" -- which I'm reluctant to do -- I'm referring to Java scripting using the new auto-compile features in Java 11. Nothing in this article relates in any way to the JavaScript programming language. Sorry about this -- I don't choose these names.
Scripting in Java
Consider the old chestnut "Hello, World" program in Java:
public class Test
{
    public static void main (String[] args)
    {
        System.out.println ("Hello, World");
    }
}
Traditionally, compilation and execution would be separate steps, like this:
$ javac Test.java
$ java Test
Hello, World
In this mode of operation, javac produces one or more .class files, which contain the compiled code. java executes the compiled code, starting at a method called main() in the class Test. The java command does not take a filename as its input -- it takes a class name, and finds the corresponding .class file using class searching rules.
Since the implementation of JEP330 in JDK 11, it has become possible to run this simple Java program in a single step, like this:
$ java Test.java
Hello, World
In this mode of operation, no .class files are produced, and the compilation and execution steps are quietly combined. Running a Java program this way is very similar to running a Perl script, by entering:
$ perl my_script.pl
In fact, there are more similarities than this one-step execution, as we shall see.
What else can we do?
One of the most striking things about this new feature is that regular Java class naming conventions are bypassed. What I mean by this is that, while traditional Java usage requires a public class to have the same name as the file that contains it, auto-compilation bypasses that requirement. In my previous example, I defined the class
public class Test
but I could have called it Fred or myclass, or anything I liked; I can still execute it using the filename Test.java. I can even define the class to be in a particular package, and execute it without using the package name (although I'm not sure why I would).
The rules for auto-compilation state that the execution begins with the first class defined in the file, at the main method.
Since the implementation refers to the "first" class, you might wonder if that means you can use multiple classes in the same file and, as it turns out, you can. So you can implement a full, object-oriented program, so long as it fits into one source file. This ability to handle multiple classes opens up the possibility of doing "real" scripting in Java, of the same kind we might otherwise do with Perl or Python.
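As a minimal sketch of what this allows, the following single source file defines two classes. The names Launcher and Greeter, and the suggested filename, are my own illustrative choices and not anything the feature requires; the only thing that matters is that the class containing main comes first in the file.

```java
// Save as, say, greet.java and run with: java greet.java World
// The file and class names are hypothetical; they need not match each other.

// The first class in the file is the one the launcher executes.
class Launcher {
    public static void main(String[] args) {
        // Fall back to a default name when no argument is given.
        String name = args.length > 0 ? args[0] : "World";
        System.out.println(new Greeter("Hello").greet(name));
    }
}

// A second class in the same source file, compiled in memory along with the first.
class Greeter {
    private final String salutation;

    Greeter(String salutation) {
        this.salutation = salutation;
    }

    String greet(String name) {
        return salutation + ", " + name;
    }
}
```

Run as shown in the comment, this prints "Hello, World". Any arguments that follow the source file name on the command line are passed to main in the usual way.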
The "script" we execute need not have a name that ends in .java; in fact, there are good reasons not to name files this way. If it doesn't, you can use the --source option to force the java utility to interpret the file as source code. So, for example, if I name my file test.jsh, I can execute it as
$ java --source 11 test.jsh
If you're used to writing Perl or Python scripts for Linux, you might be familiar with invoking the script at the prompt just by its filename, without specifying the particular language interpreter. Can we do this with Java? Amazingly, yes we can.
Shebangs
Unix-like shells (and sometimes kernels) provide ways to have an interpreter launched, based on a specification in the file. For example, I could write a Perl script like this, in a file called my_script:
#!/usr/bin/perl
print ("Hello, World\n");
Then I can run it at a shell prompt like this:
$ ./my_script
without needing to specify the interpreter perl in the command.
Nearly always, I'll need to set the execute permission on the file first:
$ chmod 755 my_script
This mode of operation works because of a collaboration between the shell, perl, and the kernel's program loader. Essentially, perl has to know to avoid interpreting the first line of the script, which is
#!/usr/bin/perl
and the program loader needs to read this line (and only this line) and invoke perl. This first line is colloquially known as a "shebang". Since many scripting languages use "#" to introduce a comment, the first requirement, ignoring the shebang line, is easy to implement -- at least in bash and Perl. It's more of a problem for Java -- but not an insurmountable one.
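Here's a self-executing Java script. We could call the file test, or test.jsh or, in fact, anything that does not end in .java. The shebang line should give the full path to the java launcher; /usr/bin/java is typical on Linux, but adjust it to suit your installation.
#!/usr/bin/java --source 11
public class Test
{
    public static void main (String[] args)
    {
        System.out.println ("Hello, World");
    }
}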
Then we can run it like this:
$ ./test
Hello, World
Note that the (invisible) Java compilation has ignored the shebang line, even though Java does not use a "#" to introduce comments. It is for this reason, I think, that the compiler won't allow a shebang line in a file whose name ends in .java -- it completely violates the Java compiler's regular syntax rules about comments. This is a highly specific feature, introduced into Java to support scripting operation.
So what's going on?
It's important to understand that the new auto-compile feature does not change how Java works -- it's still a two-step process. Code is still compiled, following all the usual rules (apart from those applying to comments, as explained above). When the source has been compiled, the JVM's run-time engine is turned loose on the compiled code, exactly as before. All JVM subsystems, including the garbage collector, work as they always have.
There is therefore no gain in speed or memory efficiency from using Java in script mode. In fact, memory usage might be slightly increased, because the compiled code has to be held in memory for the whole duration of the program. Compared to the JVM itself, this contribution to memory usage is likely to be nugatory. However, repeatedly running the "script" incurs the compilation time overhead on every execution.
On my desktop system, the compilation process takes about half a second, for a "script" of a few hundred lines of Java. That's not long if I only run it once. If I run it repeatedly -- perhaps in another script -- those half-second delays soon add up. In comparison, a similar script in Perl takes about 50 msec to start execution.
This isn't a surprise -- Perl is a language that was always designed to operate in a scripting, interpreted mode; Java is not. And, once compilation is over, the Java implementation may well out-perform the Perl version -- it really depends on the specific operations.
So is this real scripting?
That the new auto-compilation feature supports shebang lines -- and thus the ability for programs to be executed easily at the prompt -- does suggest that the implementers of the new feature were aiming at full-scale scripting. However, JEP330 expressly distances itself from this kind of speculation:
"...it is not a goal to evolve the Java language into a general purpose scripting language."
It is right to take this stance because, for better or worse, Java does not really have any of the features that make Perl and bash so successful for scripting.
Both these utilities (and Python to a lesser extent) were designed to form a kind of plumbing around command-line utilities. Consider, for example, the following Perl script:
my @df = `df -h`;
chomp (@df); # Remove end-of-line marks
foreach my $line (@df)
{
    if ($line !~ /^Filesystem/) # Remove header line
    {
        print ("$line\n");
        #...
    }
}
This script is intended to do something (doesn't matter what) to mounted filesystems, based on their size. It starts by executing the command-line utility df -h, and assigning its output line-by-line to an array of strings. Then it removes all end-of-line marks from the array using chomp(). Then it iterates the array, using a regular expression match to ignore the header line in the output from df.
The purpose of this script is unimportant -- what I'm trying to illustrate is how difficult it would be to implement these ten lines of Perl in Java. You'd have to use Runtime.exec() to execute df, and set up multiple threads to consume the stdout and stderr streams from its execution. Then you'd have to parse the output, removing end-of-line markers in your own code. You'd have to create a Vector<String> or similar to hold the specific lines as you parse them. You'd need to use the Java regular expression support to remove the unwanted lines. Oh, and you'd need to deal with character set conversion in some way, because the platform's character set probably won't match the JVM's internal string format.
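For comparison, here is a rough sketch of what a Java counterpart might look like. It is only one possible approach, and it makes some simplifying assumptions: it uses ProcessBuilder instead of Runtime.exec(), and it merges stderr into stdout so that a single reader suffices and no extra threads are needed. It also uses a plain prefix test instead of a regular expression. The class name DfScript and the helper dropHeader are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class DfScript {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: merging stderr into stdout avoids needing a second reader thread.
        Process process = new ProcessBuilder("df", "-h")
                .redirectErrorStream(true)
                .start();

        List<String> lines = new ArrayList<>();
        // Decode the command's output using the platform's default character set.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), Charset.defaultCharset()))) {
            String line;
            while ((line = reader.readLine()) != null) { // readLine() strips the line ending
                lines.add(line);
            }
        }
        process.waitFor();

        for (String line : dropHeader(lines)) {
            System.out.println(line); // ... real work would go here
        }
    }

    // Keep every line that does not start with "Filesystem",
    // mirroring the Perl test: $line !~ /^Filesystem/
    static List<String> dropHeader(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.startsWith("Filesystem")) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

Even with these shortcuts, the Java version is several times longer than the Perl, and most of the extra length is plumbing for running the process and reading its output.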
In Perl or Python I can define modules -- separate program files -- that are themselves defined in Perl or Python. I don't need to compile or link them. There's no comparable way for a self-compiling Java source file to run another Java source file -- except by invoking the compiler explicitly.
The fact is that real scripting languages are good at this kind of thing -- they're good at (a) working with the platform, and (b) assembling a complex program from modules written in the same scripting language. A self-compiling Java script can run additional Java modules -- but they have to be compiled first. If the application allows for compiling some Java, there seems to be little to gain by not compiling all Java.
It's no accident that Perl is regularly voted the most hated programming language by developers, but there's no doubt that it's very good at the kinds of things it was designed to do.
In principle, where the new auto-compilation feature could be useful is in education. Auto-compilation makes Java quite accessible to experiment with -- but not as accessible as Python, because the student still has to contend with the Java boilerplate.
Moreover, beyond primary education, is it too much to ask, that a potential programmer should know what a compiler does? Is typing javac followed by java really all that much less comprehensible than just typing java?
For all that, I can see a role for auto-compilation in an educational setting. I don't really see Java replacing Perl for one-off system administration tasks, and that doesn't seem to be the focus of the new features. These features are interesting, and time will tell whether they prove to be useful.