How does Camel-K actually work?
Camel-K is a relatively new technology for deploying Apache Camel routes directly on an OpenShift or Kubernetes cluster. The more traditional approach to deploying Camel routes is to create an application based on Camel, perhaps using a framework like Karaf or Spring Boot, build it into a Docker image, and deploy that image on the cluster. Camel-K allows us to go directly from a route specification in some language (Java, XML, YAML) to a running pod that implements the route.
Once set up, Camel-K is very easy to use -- just define the routes using your language of choice (Java, XML, YAML...), and use the command-line tool or an IDE plug-in to deploy to a container. Of course, a huge amount goes on behind the scenes and, if something goes wrong, it can be rather awkward to find out why.
In this article I'll describe in outline what actually happens on the OpenShift/Kubernetes cluster, when deploying a Camel route using Camel-K. I should point out from the start that I'm going to over-simplify enormously, because the process is extremely complex. However, I hope that this article might serve as a starting point for understanding Camel-K better.
Prerequisites
I'm assuming that you've done the basic installation of the Camel-K operator in your OpenShift or Kubernetes environment. I'm also assuming that you have the kamel command-line utility, even if you're using an IDE to build your integration code. I'll only be using command-line tools in this article. I'm using OpenShift commands, but substituting Kubernetes commands should work fine (e.g., kubectl get pods rather than oc get pods).
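A quick way to confirm that the kamel client is installed and on your path is to ask it for its version:

$ kamel version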
I'm assuming that you know about Camel in general, and at least a little about OpenShift/Kubernetes. If you don't know what a "ConfigMap" is, or what a YAML file looks like, you might need to read more around OpenShift or Kubernetes first.
Example
For the purposes of this article, I'll be using the "Hello, World" boilerplate example, which just consumes from a timer and writes to the log:
import org.apache.camel.builder.RouteBuilder;

public class Hello extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("timer:tick")
            .setBody().constant("Hello")
            .to("log:info");
    }
}
Note that this isn't a complete application, or even a complete Java class. It's just the fragment of code that defines the Camel route. In more traditional Camel programming, you'd have to build the rest of the application around this fragment. With Camel-K there's no need to, because the framework will supply the necessary infrastructure.
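To get a sense of what Camel-K is saving us, here is a rough sketch of the kind of boilerplate that a traditional, standalone deployment might wrap around the same route, using Camel 3's Main class. The class name HelloApp is mine, and the details vary with the Camel version and the framework in use:

import org.apache.camel.main.Main;

public class HelloApp {
    public static void main(String[] args) throws Exception {
        Main main = new Main();
        // register the route fragment with the Camel runtime
        main.configure().addRoutesBuilder(new Hello());
        // run the Camel context, blocking until shutdown
        main.run(args);
    }
}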
To deploy this trivial integration from the command prompt, using defaults for all settings, we can just do this:
$ kamel run Hello.java
But then what happens?
The role of the "integration kit"
Let's look at the pods that are running in the user's namespace, assuming that the kamel run operation succeeded. There might be other pods present, of course, from other applications -- I'm only showing the ones relevant to the current Camel-K deployment.
Note:
Depending on how you installed the Camel-K infrastructure, you might have a Camel-K operator pod as well. This plays no part in the present discussion, so I'll ignore it.
$ oc get pods
NAME                                        READY   STATUS
camel-k-kit-c505rq5062cf6c1f47mg-1-build    0/1     Completed
camel-k-kit-c505rq5062cf6c1f47mg-builder    0/1     Completed
hello-7b7975957b-6bz5n                      1/1     Running
Because I did not specify a particular name for my integration, Camel-K has just taken the name of the source file, and used it as the basis for the pod name. The pod hello-XXX is, of course, the running Camel application. If we examine its logs, we should see the output of the Camel route:
$ oc logs hello-7b7975957b-6bz5n
2021-09-14 09:58:14,144 INFO  [info] (Camel (camel-1) thread #0 - timer://tick) Exchange[ExchangePattern: InOnly, BodyType: String, Body: Hello]
...
Not very exciting, but we can see that a Camel route is running. How did we get from a snippet of Java source code, to this running application?
The first step in the process is for the Camel-K operator to build an integration kit. This is, in effect, an OpenShift image that contains an application capable of constructing new pods that run Camel routes. In the pod listing above, we have two pods with "kit" in their name. Note that the names do not contain "hello". The reason for this is that the same integration kit can build multiple, different Camel applications. The contents and structure of the integration kit depend on a number of factors:
The language used to specify the integration (Java, in this case).
The dependencies specified by the user (if any) when deploying the integration source (an example follows this list).
The dependencies auto-discovered by inspection of the source.
It's entirely plausible that a single integration kit will suffice for an entire installation, even a large complex one, if the same source language is used.
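For example, a dependency that cannot be auto-discovered from the source can be declared explicitly at deployment time, using the --dependency switch; the component named here is purely illustrative:

$ kamel run Hello.java --dependency camel:http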
You can see that the integration kit corresponds to an image:
$ oc get is
NAME                               IMAGE REPOSITORY                         TAGS    UPDATED
camel-k-kit-c505rq5062cf6c1f47mg   [...]/test/camel-k-kit-c505rq5062cf6mg   33655   About an hour ago
If you look at the three pods (e.g., using oc describe pod) you'll see that all three are actually derived from the same image. That is, the integration kit creates an integration pod that is based on exactly the same code as itself. If you're familiar with the OpenShift "Source-to-Image" (S2I) process, you'll have come across cases where a pod (the "builder") creates a new image that is based on its own image, with additional data or code. The process used by Camel-K looks superficially like S2I, except that only one image is involved. What distinguishes the "integration kit" functionality of that image from the "run the integration" functionality is simply the environment provided to the pod.
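You can confirm this for yourself by comparing the Image: lines in the pod descriptions (the pod names here are taken from the listing above):

$ oc describe pod camel-k-kit-c505rq5062cf6c1f47mg-builder | grep 'Image:'
$ oc describe pod hello-7b7975957b-6bz5n | grep 'Image:'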
The integration kit generates a Kubernetes Deployment for the new integration. We can see the contents of the deployment:
$ oc get deployment hello -o yaml
Note that this deployment is specific to the integration we are deploying, not to the integration kit, and thus has the name "hello". Broadly, each new Camel route deployed will result in a new Deployment object. The contents of the Deployment are extensive, but one thing is worthy of note here.
Look at the volumes section of the Deployment:
volumes:
- configMap:
    items:
    - key: content
      path: Hello.java
    name: hello-source-000
  name: i-source-000
- configMap:
    items:
    - key: application.properties
      path: application.properties
    name: hello-application-properties
  name: application-properties
You'll see that the entire source code of the integration, Hello.java, has been inserted into a Kubernetes ConfigMap, as has a configuration file application.properties. This latter name might put you in mind of a Spring Boot application and, in fact, Camel-K did at one time use Spring Boot as its application framework. More recent versions use Quarkus, but many of the Spring Boot names have been retained for compatibility.
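Further down the same Deployment there is a matching volumeMounts entry, which makes the source visible inside the container's filesystem. The exact mount path varies between Camel-K versions, but it looks something like this (the path shown here is illustrative):

volumeMounts:
- mountPath: /etc/camel/sources/i-source-000
  name: i-source-000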
We can see the source code by examining the ConfigMap:
$ oc get configmap hello-source-000 -o yaml
apiVersion: v1
data:
  content: |+
    import org.apache.camel.builder.RouteBuilder;

    public class Hello extends RouteBuilder
    ....
The original source Hello.java is embedded in an entry in the ConfigMap called content, as the Deployment stipulates.
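If you want just the source, without the surrounding YAML, you can extract the content key directly:

$ oc get configmap hello-source-000 -o jsonpath='{.data.content}'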
It is these ConfigMaps that essentially distinguish one integration from another, when they are built using the same integration kit, and are based on exactly the same image.
In short, the integration kit has taken the supplied integration source code, put it in a ConfigMap, and then built a Deployment that will run a pod with the ConfigMap available. When the pod starts, it will be responsible for turning the integration source from the ConfigMap into a running Camel application.
Building the Quarkus application from the source
That it is the integration pod, not the integration kit or the kamel utility, that is responsible for building the Camel application can easily be seen by deploying a Camel route with a syntax error. Let's create Hello2.java with a deliberate error (it's not relevant what the error is).
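Any syntax error will do -- a stray identifier that isn't a statement, for example. This sketch is just one way to provoke the kind of compilation error shown later:

import org.apache.camel.builder.RouteBuilder;

public class Hello2 extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("timer:tick")
            .setBody().constant("Hello")
            .to("log:info");
        foo   // deliberate error: not a statement
    }
}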
Now when I deploy the source using kamel run, it gives every impression of being successful:
$ kamel run Hello2.java
$ kamel get
NAME     PHASE     KIT
hello    Running   ...
hello2   Running   ...
Eh? How can a deliberately broken integration be "Running"? It's running because the integration kit got as far as creating a Deployment:
$ oc get deployment
NAME     READY   UP-TO-DATE   AVAILABLE   AGE
hello    1/1     2            2           68m
hello2   0/1     1            0           54m
It's clear that something is wrong, though -- the deployment hello2 has zero pods running. However, so far as Camel-K is concerned, once a Deployment is successfully created, the integration is "running", even if it quite plainly isn't. This can be a little confusing if you aren't familiar with the build process.
That the Camel route isn't running is easy enough to see if we look at the pods:
$ oc get pods
NAME                      READY   STATUS
hello2-64bdd49c9d-4r9bq   0/1     CrashLoopBackOff
This pod hasn't started, and it isn't ever going to start. To find out why, we can look at its logs:
ReflectException: Compilation error:
/Hello2.java:10: error: not a statement
    foo
    ^
/Hello2.java:10: error: ';' expected
    foo
There's a trivial syntax error in the Java code, which is pretty clear in the log, so the pod can't start. OpenShift will try to restart it, but it isn't going to work any better, however many times it tries.
The problem of not being immediately aware of problems in the Camel source can be ameliorated by running kamel run with the --logs switch, which causes the integration pod's logs to be printed to the console.
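For example:

$ kamel run Hello2.java --logs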
So we can see that the integration pod is responsible for taking the source code provided by the integration kit (via a ConfigMap) and building it into a Camel route. I'll have more to say about the implications of this design decision later.
If there is any magic in Camel-K, this is where it is. The Camel-K runner in the integration pod turns the snippet of code in Hello.java into a full Java application based on the Quarkus framework, and then runs it. It does this entirely in memory -- no intermediate files are generated. Rebuilding the application on starting the integration pod is time-consuming, and will increase the start-up time of the pod. However, there are significant advantages when it comes to redeployment, as I'll explain later.
Let's see what happens when creating an integration using a different language. Here is the "Hello, World" example expressed in YAML:
- from:
    uri: "timer:yaml"
    parameters:
      period: "1000"
    steps:
    - set-body:
        constant: "Hello Camel K from yaml"
    - to: "log:info"
Deploy this in the usual way:
$ kamel run Hello3.yaml
$ kamel get
NAME     PHASE          KIT
hello    Running
hello2   Running
hello3   Building Kit
Notice that Camel-K is building a new integration kit for this integration, although it would not have done so for another Java route (with the same dependencies). When this operation is complete, you'll see (e.g., using oc get is) that a new image has been created for the integration kit (and for the integrations it deploys, as discussed above).
In fact, although Hello3.yaml is a route specified in YAML, the route is still converted into a Quarkus application, and executed using a Java JVM. The construction is almost identical between the Java and YAML examples. So it's worth asking why we need different integration kits at all. The answer seems to lie in the dependencies: the YAML integration kit installs a large number of Java software components for processing YAML; these are not needed in the Java example. In other respects, these integration kits are almost identical.
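Each integration kit is itself a custom resource, so you can compare the kits' dependency lists directly; substitute a kit name from the first command into the second:

$ oc get integrationkits
$ oc get integrationkit kit-name -o yaml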
The potential advantage of maintaining separate integration kits -- despite the complexity it adds to Camel-K -- is that the number of dependencies needed for a particular integration can be minimized. YAML support is not needed for a Camel route that is neither expressed in YAML, nor processes YAML. Consequently, using a different integration kit for YAML-based routes allows the necessary dependencies to be added only to those routes. This distinction is not hugely important when running Java code using traditional JVMs, but it may become increasingly relevant when using newer technologies like GraalVM. The underlying principle -- which is not yet fully realized -- is to match integration kits as closely as possible to classes of Camel route.
Redeployment
A significant feature of the Camel-K deployment model is that deploying new code does not require the creation of any new OpenShift/Kubernetes image -- so long as it uses the same integration kit. This is a marked difference from most Camel-based development methodologies for OpenShift/Kubernetes -- in most cases, the way to run changed code is to build and deploy a new Docker image, which is not exactly speedy.
Deploying a modified route to Camel-K amounts to changing the contents of the source in a ConfigMap, and then restarting any pods that are consuming code from that ConfigMap. The pods still have to build a Quarkus application from the new code, but that's generally a lot faster than building a new image.
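For example, after editing Hello.java, running the same kamel run command again should update the existing hello integration rather than create a new one, and you can watch the replacement pod being spun up:

$ kamel run Hello.java
$ oc get pods -w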
The kamel command-line tool even has a "developer" mode, activated by running:
$ kamel run [file] --dev
In this mode, the kamel utility runs in the background, watching for changes to the source file. When changes are detected, they are redeployed, and take effect as soon as a new integration pod can be spun up. In practice, this takes a few seconds for sources of modest complexity. Developer mode also enables --logs, so errors are more apparent.
It has to be admitted, I think, that this improvement in development turn-around time comes at the expense of an increase in pod start-up time, because the application has to be built from source each time. In practice, it's only the user's code that has to be built -- plus a little boilerplate -- since the Quarkus framework is built into the image as a series of JAR files. So pod start-up time isn't usually more than a second or so. Still, in a "serverless" environment, where the number of running pods scales up and down on demand, this additional start-up time could be significant.
Closing remarks
I've tried to provide an overview of the way that Camel-K takes snippets of Camel source code, in a variety of different languages, and generates a running application on OpenShift or Kubernetes. The infrastructure that makes this possible is fairly complex, and the design decisions involved do embody some compromises. Camel-K is a relatively new technology, and its implementation has already changed somewhat since its first release. There's still work to do, and I think we can expect further improvements in future.