keplerd: a new Java-based server for the Gemini protocol and others

I’ve been working on a new small-net protocol which I’m calling “Kepler”. It’s only a minor extension to the Gemini protocol – a trivial one, really – but it seeks to make it plausible to scale Gemini to a potentially large number of users and sites. Kepler supports both plaintext and TLS-encrypted communication; in a sense, it combines the features of Gemini and Spartan.

I mentioned Kepler in passing in my article in praise of HTTP. The protocol specification and supporting documents for the draft Kepler protocol are available on GitHub, and I welcome comments.

Why a new server?

Kepler is sufficiently similar to Gemini that I initially thought I could modify an existing Gemini server to handle Kepler instead of, or in addition to, Gemini. I was able to modify the well-established “Molly Brown” server easily enough – just another twenty lines of code – but not to the extent that it could support both TLS-encrypted and plaintext communication in the same process. Somebody who knows the Molly Brown code better would probably succeed where I failed, but I thought it would be quicker in the long term to write a new server just for Kepler. In any case, I wanted the new server to be multi-threaded and reasonably scalable, in a way that Molly Brown really isn’t designed to be.

Having started down this path, I figured it would require little extra work to include support for Gemini and Spartan as well.

Serving static documents is an important part of a Gemini (etc) server, but not the whole: we need a server to cater for programmatically-generated content as well. Molly Brown uses a variant of the long-standing “CGI” technique, which has a number of problems. In particular, the server has to launch a new operating-system process for each request, something that doesn’t scale well, and can lead to problems with resource contention. There is an “SCGI” variant which avoids these problems to some extent, but I don’t think it ever really caught on. For the new server, I wanted to take the same approach that large-scale application servers use on the mainstream web, and integrate content-generating application components directly into the server’s process space.

Introducing ‘keplerd’

keplerd is an all-new, Java-based server for Gemini, Spartan, and Kepler. It wouldn’t be difficult to add other protocols, like Gopher. This would open the possibility of supporting all the current small-net protocols in a single server.

keplerd is multi-threaded, so it should scale to handle many concurrent users – provided it’s running on powerful enough hardware. At the same time, the thread pool is configurable, so it shouldn’t cause resource exhaustion on weaker hosts.

keplerd is extensible in Java, in the same way as Java-based web servers like Apache Tomcat, and provides a strongly-defined API for extensions.

Why Java?

A huge proportion of all the world’s dynamic web content is generated using Java code, running on Java-based web servers and application servers. Over the last fifteen years or so, we’ve developed a good idea of how to do this efficiently and securely. The Java servlet specification defines rigorously how to extend a web server with self-contained Java modules.

Extending the server this way – with programmatic content generators in the same process space – avoids all the overheads associated with technologies like CGI, as well as many of the security hazards. Rather than launching a new process, or even a new thread, for each incoming request, the server maintains a shared pool of threads. One thread is allocated from the pool to a request when it arrives. When the request completes, the thread goes back into the pool. If the thread pool is of a fixed size, then no new threads need be created once the server is fully initialized. Of course, additional clients will be blocked when all the threads in the pool are busy, but this is a feature, not a defect – this is how the server protects itself from overloads.
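To make the thread-pool idea concrete, here is a minimal sketch using the standard java.util.concurrent executor classes. It is not taken from the keplerd source: the class and method names are hypothetical, and the "requests" are simulated tasks rather than network connections. It only illustrates that a fixed pool never uses more threads than its configured size, however many requests arrive.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative only: names here are hypothetical, not keplerd's.
class FixedPoolSketch {

    // Runs `requests` simulated requests on a pool of `poolSize` threads and
    // returns the name of the thread that handled each one.
    static List<String> handleAll(int poolSize, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                // If every pool thread is busy, the task waits in the queue
                // until a thread is returned to the pool.
                pending.add(pool.submit(() -> Thread.currentThread().getName()));
            }
            List<String> handledBy = new ArrayList<>();
            for (Future<String> result : pending) {
                handledBy.add(result.get());
            }
            return handledBy;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        List<String> names = handleAll(4, 20);
        System.out.println(new HashSet<>(names).size()
                + " distinct thread(s) handled " + names.size() + " requests");
    }
}
```

Note that Executors.newFixedThreadPool queues excess work in memory rather than refusing it. A real server would probably also want to bound that queue, or stop accepting connections, so that the overload protection described above actually applies.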

Java servlets don’t get process isolation from the operating system, the way CGI applications do. Instead, what they get is class-loader isolation, which should prevent applications from interfering with one another. They can interact with the platform only in ways allowed by the defined interface and the JVM.

It’s possible to do this kind of thing using technologies other than Java, but Java is the way most of the IT industry has gone, for better or worse.

By analogy with Java servlets, I’m calling keplerd extensions “gemlets”. The gemlet API is similar enough to the servlet API that anybody who is familiar with Java-based application servers should have no difficulty writing a gemlet. Gemlets are packaged into Java JAR files, which keplerd loads at runtime based on a configuration file.
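For readers who haven’t written a servlet, the following sketch shows the general shape of an in-process content generator. The SketchGemlet interface is invented for this illustration and is not the real gemlet API, which is defined in the keplerd source and may look quite different. The only protocol detail assumed is the standard Gemini success header: status 20, a MIME type, and a CRLF, followed by the body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class GemletSketch {
    // Stands in for the server: calls the extension in-process, without
    // launching a new operating-system process, and captures its output.
    static String dispatch(SketchGemlet gemlet, String path) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            gemlet.handle(path, buffer);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(dispatch(new HelloGemlet(), "/weather"));
    }
}

// HYPOTHETICAL interface, invented for this sketch; not the keplerd API.
interface SketchGemlet {
    void handle(String path, OutputStream out) throws IOException;
}

// A trivial content generator written against the hypothetical interface.
class HelloGemlet implements SketchGemlet {
    @Override
    public void handle(String path, OutputStream out) throws IOException {
        // Gemini success response: "20 <MIME type>\r\n", then the body.
        String response = "20 text/gemini\r\n# Hello\nYou asked for " + path + "\n";
        out.write(response.getBytes(StandardCharsets.UTF_8));
    }
}
```

The point of the design is that the server owns the connection and the thread, and the extension only fills in the response, much as a servlet does.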

Another advantage of Java, of course, is that it’s platform-independent. The same compiled code – the server and any extension gemlets – should run on any platform with a Java runtime environment. In practice, though, most of this kind of technology is hosted on Linux.

Why not Java?

There are two big problems with using Java to implement an Internet-facing server.

First, because it’s platform-neutral, Java lacks some operating-system-specific features that we’d definitely take advantage of, if we were working in C or Go. For example, we can’t change the server’s user identity at runtime. This means that we can’t start up as root with administrative privileges, for initialization, and then change to an unprivileged user. Almost all web servers and application servers use this technique to open privileged ports (like port 80 for a webserver) as root, and then drop privileges for security.

This is a problem in the mainstream web as well. That’s why we conventionally use port 8080 for Java-based web servers. In a commercial environment, we’d conceal this non-standard port from the user by putting a reverse proxy between the browser and the application server. The proxy would translate conventional ports like 80 and 443 to the high-numbered ports that the Java server prefers. In a less high-powered environment, we might just have to learn to love high-numbered ports. keplerd defaults to using port 8300 for the Spartan protocol, rather than the conventional 300. For Gemini and Kepler, the usual port numbers are already outside the restricted range, so they aren’t a problem.
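As a small illustration of the port arithmetic, the sketch below treats ports below 1024 as privileged (the usual rule on Linux and other Unix-like systems) and maps them to a high-numbered equivalent. The "add 8000" rule is an assumption chosen because it reproduces the familiar 80 to 8080 and 300 to 8300 pairs; it is not necessarily how keplerd chooses its defaults, and the helper names are hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortSketch {
    // On Linux and most Unix-like systems, binding a port below 1024
    // normally requires root privileges.
    static boolean isPrivileged(int port) {
        return port > 0 && port < 1024;
    }

    // ASSUMPTION: shift privileged ports up by 8000 (80 -> 8080, 300 -> 8300).
    // Hypothetical helper for illustration only.
    static int unprivilegedPort(int conventionalPort) {
        return isPrivileged(conventionalPort) ? conventionalPort + 8000 : conventionalPort;
    }

    public static void main(String[] args) {
        int port = unprivilegedPort(300); // 300 is Spartan's conventional port
        try (ServerSocket socket = new ServerSocket(port)) {
            System.out.println("Listening on port " + socket.getLocalPort());
        } catch (IOException e) {
            System.out.println("Could not bind port " + port + ": " + e.getMessage());
        }
    }
}
```

Gemini’s conventional port, 1965, is already above the restricted range, so the helper returns it unchanged.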

Another problem is that, Java being what it is, it’s difficult to secure the server from the rest of the platform using a sandbox. I have another article on running a Java application in a chroot jail – but I can’t deny that it’s fiddly. In the commercial world we don’t worry about this: we run our Java application servers in Docker containers and the like, which provide strict isolation. But techniques like this are a bit heavy-handed if you’re running your server on a Raspberry Pi.

It also can’t be denied that a Java application will use more memory than one written in C – at least in conditions of low load. The JVM will need, perhaps, 500 MB of RAM even at idle. As the load increases, however, the resources needed to handle that load will come to dominate the overall usage, and the JVM’s contribution will be less significant. Still, while keplerd does run on a Raspberry Pi, it will take up to ten seconds to start, and be slow for the first requests, until the JVM has done all its run-time compilation stuff.

Since one of my interests is making small-net protocols that can actually scale to a large user base, I’m not particularly worried about the security or resource usage of a Java-based server. After all, this technology works well in the mainstream web, where it has all the same potential problems, which we know how to solve.

Project status

Although there’s a long way to go, keplerd is basically functional. The source code and binaries are on GitHub. It’s not as well-documented as I’d like, but I’m working on that.

I’m sure there are many bugs I don’t know about, in addition to the ones I do. As well as fixing these, I have some tentative plans for the future:

Support for Gopher and Nightfall Express, and maybe others
Automatic “transcoding” of content from Markdown or Gemtext to plain text for clients that expect pre-formatted text
More, larger-scale applications to exercise the gemlet API more intensively

Closing remarks

The keplerd server is now live in my own capsule, serving four protocols with the same content:

Please note that if you’re reading this article on my ‘regular’ website, you won’t be able to follow any of these links using a regular web browser: you’ll need a client for small-net protocols like my own Caztor, or Alhena, or Lagrange.

As well as my static content, the server also provides a simple weather forecast application, which demonstrates the use of the “gemlet” API for extending the server. There are better weather services on the small net – this one is really only a demonstration of the technology. However, it’s available on all the supported protocols; for example here is the Gemini version.

Right now I’m maintaining the Caztor multi-protocol client, as well as the keplerd server, while working on the Kepler specification. So if anybody is interested in collaborating on any of this, please get in touch.
