AppImage from scratch
An AppImage is a single file that contains an entire Linux application. In most cases, it doesn’t require any particular installation – the user just executes the file, and the AppImage takes care of the rest. AppImage doesn’t automate the collection or installation of an application’s dependencies – the AppImage file is expected simply to provide them. All of them. Technologies like Flatpak and Snap are superficially similar, but they require some management infrastructure on the computer where the application is to run. AppImage requires only the Linux desktop (sometimes not even that), and some fundamental Linux utilities.
AppImage is increasingly popular, because it’s a very simple technology for the end user. It’s not necessarily simple for the application packager, but there are tools to help with that. Although the tools are pretty well documented, I’ve not seen a lot of documentation about how the technology operates fundamentally. It’s possible that the developers of AppImage technology think it’s just too simple to document. If so, I disagree and, in this article, I set out to explain how AppImage works from first principles.
Fundamentals of the AppImage technology
We’re probably all familiar with the ‘executable installers’ and ‘self-extracting zipfiles’ that are commonplace in the Windows world. On Linux, however, applications usually come in some kind of package, that has to be unpacked and installed. The installation process is usually coordinated by a package manager, in collaboration with a software repository. The application package states what its dependencies (often libraries) are, and the package management framework attempts to resolve, obtain, and install the dependencies.
Tools like yum and apt, and the repositories with which they interact, are very good at handling complex dependency relationships. But what should we do when dependencies are irreconcilably in conflict?
It’s also a problem that somebody has to provide packages for each Linux variant, and often each version of each variant. In practice, the maintainers of specific Linux distributions often shoulder this burden, leading to a situation in which a package exists for some Linux distributions and not for others. Sometimes the only way to get the latest version of an application is to upgrade the entire Linux installation.
AppImage solves these problems by supplying a single executable that contains the entire application, along with all its dependencies, in some executable, compressed format. It’s a bit like a self-extracting zipfile, without the actual extraction – the compressed data is loaded into memory every time the user runs the application.
So how can we supply a whole application, including all its dependencies, in a single file, requiring no installation?
AppImage technology relies on three fundamental features of Linux.
First, Linux doesn’t care what an executable file contains, so long as the start of the file is executable. An AppImage has a small executable header, followed by the application’s complete set of files in a compressed format.
Second, Linux allows a file, or part of a file, to be mounted as a filesystem. This is called loopback mounting. A filesystem image is embedded in the AppImage at some known offset; at runtime, the AppImage tells Linux to mount that filesystem on some temporary directory.
Third, on modern Linux systems we can perform the loopback mount without elevated privileges. You don’t need to be root to mount the filesystem on a temporary directory. The technology that handles the mounting is called FUSE – Filesystem in Userspace.
At runtime, the AppImage header at the start of the AppImage file locates the embedded filesystem, and uses FUSE to mount it on a directory under /tmp. The header then runs a script in that directory called AppRun, which sets up and runs the application.
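The first of these features can be tried without FUSE at all. The following is a minimal sketch, not anything taken from the real AppImage runtime: it builds a small script that ends at an exit line, appends some data after it, and has the script print its own tail. The function name make_demo and the __PAYLOAD__ marker are made up for illustration, and the marker approach only suits text payloads; a real AppImage locates its filesystem by byte offset instead.

```shell
#!/bin/bash
# Hypothetical demo: build a script that carries a payload after 'exit'.
# Usage: make_demo OUTPUT_SCRIPT PAYLOAD_FILE
make_demo() {
    out=$1
    payload=$2
    # The stub finds the marker line in its own file, prints everything
    # after it, and exits before the shell can reach the payload.
    cat > "$out" <<'EOF'
#!/bin/sh
line=$(grep -n '^__PAYLOAD__$' "$0" | head -n 1 | cut -d: -f1)
tail -n +"$((line + 1))" "$0"
exit 0
__PAYLOAD__
EOF
    # Everything appended from here on is data, not shell code.
    cat "$payload" >> "$out"
    chmod +x "$out"
}
```

Running make_demo demo.sh some_text_file and then ./demo.sh should print the text file, even though the file's contents are not valid shell code.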
Building an AppImage-style application from scratch
To explain how this all works in detail, I’ll describe how to build an AppImage-style trivial application. Of course, you can build a real AppImage, using tooling designed for that purpose. But doing it from scratch is more educational. My example will, naturally, be a lot simpler than a real AppImage, but it will use exactly the same principles.
My application will be a simple shell script, that dumps a text file. The application is AppImage-like because it embeds the script and the text file in a single executable. It has an AppImage header, but mine is just a shell script: real AppImages use a statically-linked binary as the header, which does a lot more than my simple script.
I’ll show some of the source code in this article; the whole thing is available from my GitHub repository.
Compressing the filesystem
In my example, the application’s filesystem will start life as a directory called appdir/. My directory only contains two files – the script that comprises the application (run.sh) and the text file it dumps (test.txt). In a full-scale application I would lay out the source directory like a complete root filesystem, with subdirectories /usr/lib, /usr/bin, and so on.
We’ll compress this directory into a complete filesystem image. AppImages seem to use the SquashFS format; since the entire filesystem has to be loaded into memory, and will usually be read-only, I guess it makes sense to use a compressed format like this.
Turning appdir/ into a SquashFS filesystem is easy:
$ mkdir build
$ mksquashfs appdir/ build/appdir.img
mksquashfs might not be installed by default; it’s typically part of a package called squashfs-tools. The utility has a large number of command-line switches, but the defaults are fine for this simple demonstration.
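Since several of the utilities used in this article may be absent, a build script could check for them before it starts. This is only a sketch; the function name missing_tools is my own invention, and the list of tools to check is up to the caller.

```shell
# Print the name of every tool given as an argument that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing=$(missing_tools mksquashfs squashfuse fusermount) leaves missing empty when everything needed is installed.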
appdir.img is the compressed image that mksquashfs outputs. If we wanted, we could mount this on a directory using squashfuse:
$ squashfuse build/appdir.img /tmp/some_directory
In fact, in my example, it’s the AppImage(-style) header that will run squashfuse – this has to be done when running the application, not building it.
The AppImage(-style) header
My build process appends the SquashFS filesystem image appdir.img to a file header which, in this demo, is just a shell script. Here it is, in its entirety:
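#!/bin/bash
# Mount point, made unique by appending this process's ID
my_dir=/tmp/app_mount.$$
mkdir -p "$my_dir"
# Mount the SquashFS image embedded in this file at the given offset
squashfuse -o offset=NNNN "$0" "$my_dir"
# Let the loader find bundled shared libraries first
export LD_LIBRARY_PATH="$my_dir/usr/lib:$my_dir/usr/lib64"
"$my_dir/run.sh"
# Unmount and tidy up once the application has finished
fusermount -u "$my_dir"
rmdir "$my_dir"
exit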
The SquashFS filesystem gets appended after the exit line, which is necessary to ensure that the shell doesn’t try to execute the filesystem data after the application has finished.
The first thing the script does is create a directory on which it will mount the SquashFS filesystem. To reduce the likelihood of different applications using the same directory, we append the process ID ($$).
Then the script mounts the filesystem whose data follows the exit line. It uses squashfuse to do this, with an offset argument. While building the application’s executable, we must change offset=NNNN to the actual length of the header (and thus the start of the embedded filesystem). There are many ways to do this (see build.sh for how I do it, but I don’t claim it’s optimal).
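One wrinkle is that replacing NNNN with a number of a different width changes the length of the header, and therefore the offset itself. The sketch below is one possible way to handle that, and is not necessarily what build.sh does: it repeats the substitution until the number written into the header matches the header's size. The function name patch_offset and its arguments are hypothetical.

```shell
# Replace the offset=NNNN placeholder with the final size of the header.
# Usage: patch_offset TEMPLATE_HEADER PATCHED_HEADER
patch_offset() {
    template=$1
    out=$2
    # First guess: the size of the unpatched template, in bytes.
    offset=$(( $(wc -c < "$template") ))
    tries=0
    while [ "$tries" -lt 10 ]; do
        sed "s/offset=NNNN/offset=$offset/" "$template" > "$out"
        actual=$(( $(wc -c < "$out") ))
        # Stop once the number in the file equals the file's own size.
        [ "$actual" -eq "$offset" ] && return 0
        offset=$actual
        tries=$((tries + 1))
    done
    return 1
}
```

In practice the loop settles after one or two passes, because only the number of digits in the offset can change the header's length.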
I should point out that not all Linux installations will have squashfuse by default (try apt install squashfuse or yum install squashfuse). Real AppImages don’t rely on this utility – I assume that the AppImage header replicates its functionality internally. The ‘real’ method is better, as it doesn’t rely on a Linux package that not everybody will have; but I couldn’t think of a way to replicate the behaviour of squashfuse in a shell script alone.
The script then sets the environment variable LD_LIBRARY_PATH, to tell the Linux loader where to look for shared library (.so) files. My example doesn’t actually need any such libraries, but most real applications will. The AppImage builder (whether that’s a person or a software tool) will usually put .so files in usr/lib or usr/lib64, as the maintainer of a traditional package would. The Linux loader will prefer the libraries in LD_LIBRARY_PATH over the default ones in /lib, etc., but will fall back on the defaults for libraries the AppImage doesn’t provide.
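My header simply overwrites LD_LIBRARY_PATH. If the user has already set that variable, it may be kinder to keep their directories after the bundled ones. Here is a small sketch of that idea; the function name bundled_lib_path is hypothetical, and whether preserving the existing value is desirable depends on the application.

```shell
# Print a library search path with the bundled directories first, followed
# by the existing value (second argument), if there is one.
bundled_lib_path() {
    mount_dir=$1
    existing=$2
    # ${existing:+:$existing} adds ':' plus the old value only when the old
    # value is non-empty, so the result never ends in a stray ':' (an empty
    # entry is treated by the loader as the current directory).
    printf '%s\n' "$mount_dir/usr/lib:$mount_dir/usr/lib64${existing:+:$existing}"
}
```

In the header this could be used as export LD_LIBRARY_PATH="$(bundled_lib_path "$my_dir" "${LD_LIBRARY_PATH:-}")".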
Then the header runs the application – run.sh in this case. When the application completes (or is killed) the header unmounts the filesystem and deletes the temporary directory.
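The clean-up only happens if the header script itself survives to reach those last lines. A more defensive variant, sketched below under my own assumptions, creates the directory with mktemp and registers the clean-up with trap, so that it also runs when the header is interrupted. The function name with_temp_mount is hypothetical, and the command passed to it stands in for the mount-and-run steps, so the sketch can be tried without FUSE.

```shell
# Run a command with a private mount directory, and always clean up.
# The command receives the directory as its last argument.
with_temp_mount() {
    (
        dir=$(mktemp -d "${TMPDIR:-/tmp}/app_mount.XXXXXX") || exit 1
        cleanup() {
            # Errors are ignored: the directory may never have been mounted.
            fusermount -u "$dir" 2>/dev/null
            # rmdir (not rm -rf) refuses to delete a non-empty directory.
            rmdir "$dir" 2>/dev/null
        }
        trap cleanup EXIT
        # Turn Ctrl-C or a plain kill into a normal exit, so EXIT fires.
        trap 'exit 130' INT TERM
        "$@" "$dir"
    )
}
```

The whole thing runs in a subshell, so the traps do not leak into the calling script.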
The application
My application is very simple: it just prints a text file.
my_dir="$(readlink -f "$(dirname "$0")")"
echo "Printing test.txt:"
cat "$my_dir/test.txt"
Note, though, that the application needs to work out where the text file actually is. Other than shared libraries, which are handled by setting LD_LIBRARY_PATH, the application will need to perform this computation for every file that is bundled with the application. This is potentially a significant limitation of the AppImage technology, which I’ll discuss later.
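In a larger script, this computation could be wrapped in a helper so that it’s written only once. The following sketch uses two hypothetical functions, app_dir and resource. Unlike my script above, it resolves the script’s own path first, so it also copes with the script being reached through a symbolic link. It assumes a readlink that supports -f.

```shell
# Print the directory that really holds the given script, following symlinks.
app_dir() {
    dirname "$(readlink -f "$1")"
}

# Print the full path of a bundled file, given the script path and a path
# relative to the application's directory.
resource() {
    printf '%s/%s\n' "$(app_dir "$1")" "$2"
}
```

The application would then use cat "$(resource "$0" test.txt)" wherever it needs a bundled file.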
Building the AppImage
This is just basic Linux shell scripting – please see build.sh. All the build does is concatenate the AppImage(-style) header and the SquashFS filesystem, adjusting the header to indicate the offset of the filesystem in the final file.
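The concatenation step can be sketched in a few lines. This is an illustration rather than a copy of build.sh; it assumes the header has already had its NNNN placeholder replaced, and the function name assemble is hypothetical.

```shell
# Join a patched header and a filesystem image into one executable file.
# Prints the byte offset at which the image starts in the output.
# Usage: assemble HEADER IMAGE OUTPUT
assemble() {
    header=$1
    image=$2
    out=$3
    cat "$header" "$image" > "$out" || return 1
    chmod +x "$out"
    # The image starts immediately after the last byte of the header.
    echo $(( $(wc -c < "$header") ))
}
```

A useful sanity check after building is to confirm that the bytes from that offset onwards are identical to the original image, for example with tail -c and cmp.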
AppImage in practice
My simple example works in the same way as a real AppImage, but it doesn’t have to manage any dependencies, and that’s where the real problems begin. In practice, I think that most AppImage maintainers use tools like linuxdeploy to handle the dependency management. This tool scans an executable, and tries to work out what libraries it depends on. It copies these libraries to a directory, which can then be used as the basis for the SquashFS filesystem.
This scanning process isn’t foolproof, particularly if the application loads libraries explicitly at runtime (so library information is absent from the application’s executable). Still, it’s a start.
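To give a feel for what such a scan involves, here is a rough sketch based on the output of ldd. I’m not claiming this is how linuxdeploy does it. The filter, which I’ve called resolved_libs, keeps only the libraries that ldd managed to resolve to a file.

```shell
# Read 'ldd' output on standard input and print the resolved library paths.
# Lines without a '=> /path' part (the vDSO, the loader itself, libraries
# reported as 'not found') are skipped.
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}
```

It would be used as ldd path/to/executable | resolved_libs. The output is the list of files a packager would consider copying into usr/lib, although some of them, like the C library, are usually better left to the host system.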
Another approach to managing AppImage dependencies is to leverage a platform’s existing dependency framework. If applications are available as packages (.deb, .rpm, etc.), then the package file should already contain dependency information. It should be possible for AppImage tooling to resolve the dependencies in the same way that a platform’s package manager would.
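As a small illustration for the .deb case, the Depends field of a package can be printed with dpkg-deb -f and reduced to a list of package names. The filter below, depends_names, is a hypothetical helper of mine. It ignores version constraints and treats alternatives as separate candidates, so it is only a starting point; resolving those names to files, recursively, is the hard part and isn’t shown.

```shell
# Read the value of a Debian 'Depends' field on standard input and print
# one package name per line.
depends_names() {
    # Split on ',' (separate dependencies) and '|' (alternatives), then
    # strip "(>= 1.2)" style constraints, ":any" style architecture
    # qualifiers, and surrounding whitespace.
    tr ',|' '\n\n' |
        sed -e 's/([^)]*)//g' -e 's/:[A-Za-z0-9-]*//' \
            -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -v '^$'
}
```

Typical use would be dpkg-deb -f some_package.deb Depends | depends_names, where some_package.deb is a placeholder.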
However, not all applications are easy to convert into AppImage form, even if the dependencies are clear. To work as an AppImage, the application must usually be relocatable. A relocatable application, in this context, is one that could just be unpacked into an arbitrary directory and executed there. Any application that is written to look for its own files at specific locations will need to be modified, perhaps extensively, to use locations in the mounted SquashFS filesystem.
Some of this modification can be automated, but probably not all, because some files may have to be at specific filesystem locations. If an application uses configuration files in /etc/, for example, or $HOME/.config, these references shouldn’t be changed to files in the SquashFS filesystem. Apart from anything else, it’s read-only.
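One common pattern, sketched here with made-up names, is to ship a read-only default configuration inside the image, but to prefer the user’s own copy outside it when one exists. The names myapp and settings.conf, and the function config_path, are placeholders.

```shell
# Print the configuration file to read: the user's own copy if it exists,
# otherwise the read-only default inside the mounted image.
config_path() {
    mount_dir=$1
    # Per-user configuration lives under XDG_CONFIG_HOME, or $HOME/.config
    # when that variable is unset.
    user_conf="${XDG_CONFIG_HOME:-$HOME/.config}/myapp/settings.conf"
    if [ -f "$user_conf" ]; then
        printf '%s\n' "$user_conf"
    else
        printf '%s\n' "$mount_dir/etc/myapp/settings.conf"
    fi
}
```

Anything the application writes must still go to the user’s location, never to the mounted image.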
In practice, it takes a lot of self-discipline to maintain a complex application that is completely relocatable, and most likely it’s only the authors of the application that know how to do this. Converting an existing application – particularly a large one – to be relocatable can be hugely complicated.
Closing remarks
In this article, I demonstrated how to build an AppImage-style application, using only shell scripts and commonplace utilities. My approach is conceptually similar to real AppImage technology, but a lot simpler.
Building an AppImage-like package from scratch does highlight a lot of the limitations of the technology, particularly those related to making the application relocatable. I don’t think anybody would use my all-manual approach to package a full-scale application, but I’m sure it would work, with sufficient patience. Real AppImages are typically built with the assistance of a lot of tooling.
When we see how AppImage works at the platform level, we can appreciate how inefficient this technology can be. It’s inefficient in storage because, in practice, multiple AppImage packages are likely providing copies of exactly the same dependencies. It’s inefficient in resources, because the entire AppImage has to be loaded into memory at runtime. To be fair, it’s really mapped into virtual memory, so unused parts of the application use little to no RAM. Still, the run-time decompression of the parts that are used will use some CPU. Given how Linux caches filesystem data, that overhead is probably not significant on a modern, desktop Linux. AppImage might be unsuitable for low-resource or embedded Linux systems. For the desktop, many users will likely find the inefficient use of resources a small price to pay for the simplicity.
AppImage technology is less sophisticated than Flatpak, Snap, Docker, Podman, etc. Any sophistication has to be provided by the AppImage tooling, not the platform. For example, in lieu of a framework for keeping AppImage applications up to date, tooling can incorporate a complete auto-update mechanism into each AppImage application. While this is an interesting development, I can’t help thinking that there are better ways to do what AppImage does, for users that need that kind of automation.
It should also be clear that AppImage is not a container technology. AppImage applications are not sandboxed, or isolated from one another. Concerns about security are making the use of lightweight containers a popular way to run applications that the user doesn’t entirely trust. Of course, AppImage is no worse in this respect than the traditional method of packaging Linux applications: we still have to trust the supplier of the package.