Devops4Scala - Using SBT for advanced Docker builds

Are you a Scala developer or devops engineer, interested in automating your workflow and building your Docker images?

Do you want to overcome the limitations of Dockerfiles using the ubiquitous Scala build tool SBT?

Do you want to be able to build multiple Docker images, with dependencies, compilation, documentation processing and all the artifact generation, with just one SBT command: sbt buildAll?

If you are interested in supercharging your Docker builds with SBT, read on. Of course, all of this could be of interest to anyone, but it is really aimed at people involved in Scala development.

What is wrong with Dockerfiles?

I am a Scala developer, and I routinely develop applications that I deploy in the cloud. The tool of choice for packaging cloud applications is, as you can guess, Docker. As a result, I spend a lot of time writing Dockerfiles.

As much as I love and appreciate Docker, I equally dislike Dockerfiles. I am sure that if you have ever tried to create Docker images with the standard Dockerfile syntax, you share my feeling that the format is way too limited. To be honest, I believe those limitations make sense and can be seen as a good thing: they force Docker to be a simple and well-understood tool, with a clear scope and well-defined goals.

In the real world, however, builds are complex enough that you need much more than Dockerfiles provide out of the box. In this post I will explore how to overcome some of those limitations using the mighty Scala build tool SBT with the sbt-docker plugin.

Dockerfiles are really meant to create reusable (and somewhat immutable) components to be shared on public services like the Docker Hub. The real intent of the format is to allow repeatable and portable builds.

By design, Docker assumes everything you need to build your image either exists in the source directory or can be downloaded straight away from the internet. Furthermore, it restricts you to Linux scripting tools (mostly bash) to do all the work.

I use Docker a lot, so I can often write a Dockerfile off the top of my head without checking the documentation. I am thus very familiar with what you can and cannot do (a lot) with it. While simple things are easy and fast, Dockerfiles quickly become clumsy when you need more complex builds. Here is the list of requirements I usually have when building my images:

  • I need to express dependencies between images. I routinely create a base image, then construct a number of derived images, all of them extending the base image. I have to update the base image before I actually build the derived ones, and I do not want to have to do that manually.

  • I need to collect artifacts from different sources, not just the source folder. Dockerfiles are adamant about the constraint that everything you want to place in the image must be available in the source directory. As a result, you have to copy your files into the source directory manually.

  • I need to be able to download something from the internet, but avoid repeating the download every time the build runs (as Dockerfiles usually do). Downloading the same artifacts over and over slows down the development process a lot.

  • I need to be able to add something to my image by compiling it, but without adding a whole development environment to the image and then removing it after the compilation.
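To make the download requirement concrete, here is a minimal sketch in plain Scala (not an sbt-docker API; the cachedDownload helper and its fetch parameter are names I made up for illustration) of the caching idea: fetch an artifact into a local cache only when it is missing, so repeated builds skip the network.

```scala
import java.io.File

// Hypothetical helper, not part of sbt-docker: keep a local cache of
// downloaded artifacts and invoke `fetch` only on a cache miss.
def cachedDownload(name: String, cacheDir: File)(fetch: File => Unit): File = {
  val cached = new File(cacheDir, name)
  if (!cached.exists) {      // cache miss: fetch exactly once
    cacheDir.mkdirs()
    fetch(cached)
  }
  cached                     // cache hit: reuse the local copy
}
```

In a real build the fetch function would wrap curl or a Java HTTP client; any later build evaluating the same task finds the file in place and skips the download entirely.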

I can condense all those requirements into a simple statement: I need a build tool, run before the Docker images are actually built, and tightly integrated with the Docker builds.

Luckily, Scala's SBT with the appropriate plugins is such a build tool. I will show here how you can do advanced builds with SBT and satisfy the previous requirements.

Writing your Dockerfile with sbt-docker

The starting point of my effort is sbt-docker, a really smart SBT plugin by Marcus Lonnberg. This plugin lets you describe the whole Dockerfile in SBT syntax. Think of it as an SBT DSL for Docker. The advantage of this technique is that you get the whole power of SBT at your fingertips, no longer limited to what the Dockerfile format has to offer.

The obvious disadvantage is, of course, that you need to learn SBT, which is itself a Scala DSL, so you need to know Scala too. Indeed, this technique is not for everyone. But since this article is part of the Devops4Scala series, aimed at Scala developers who already know Scala and SBT, it should not be an issue.

Let’s start learning sbt-docker by first building an image for a web server based on nginx in the classic way, then rewriting it in SBT syntax, highlighting the advantages.

All the examples described here are available in the GitHub repository sciabarra/Devops4Scala.

A classic Dockerfile for nginx

Without further ado, here is our example:

FROM alpine
RUN apk update &&\
apk add nginx &&\
echo "daemon off;" >>/etc/nginx/nginx.conf &&\
mkdir /run/nginx
COPY index.html /var/lib/nginx/html/index.html
CMD /usr/sbin/nginx

We start from an Alpine Linux image (very commonly used with Docker), add and configure nginx, copy a file into the image and start nginx. That is almost all.

However, in the Dockerfile we do not specify the name of the image. And suppose the file index.html is not actually stored in the same folder as the Dockerfile, but in its parent directory. As a result, we need (and commonly have) a build script like this one to run the build:

cp index.html nginx-classic/index.html
docker build -t devops4scala/nginx-classic:1 nginx-classic

The script is simple but it shows a few problems already:

  • you need additional scripts to collect files and tag images properly
  • there is no way to parametrize the Dockerfile, unless you play games with sed and the shell
  • it is an error-prone way to write a long command line (a single space after a trailing backslash breaks the script)

Preparing to use SBT and sbt-docker

Now let’s see how to improve with the help of Scala tools.

Before starting, you need to install SBT. Follow the installation instructions: SBT is available for almost every operating system, with multiple installation options.

Once you have installed SBT, you need to enable the sbt-docker plugin for your project.
To do so, create a folder for your project and place two files in it.

First, the file project/plugins.sbt (yes, you need a project subfolder) with the single line:

addSbtPlugin("se.marcuslonnberg" % "sbt-docker" % "1.4.0")

Second, the file plugins.sbt (in the folder you created) with the single line:

enablePlugins(DockerPlugin)

You are ready.

A Dockerfile generated in sbt

Now let’s create the equivalent Dockerfile with a build.sbt as follows:

val dest = "/var/lib/nginx/html"

dockerfile in docker := new Dockerfile {
  from("alpine")
  runRaw("""
    |apk update &&
    |apk add nginx &&
    |echo "daemon off;" >>/etc/nginx/nginx.conf &&
    |mkdir /run/nginx
    """.stripMargin.replaceAll("[\\n\\r]", ""))
  copy(baseDirectory.value.getParentFile/"index.html", s"$dest/index.html")
  cmdRaw("/usr/sbin/nginx")
}

name := "nginx"
organization := "devops4scala"
version := "0.1"
imageNames in docker := Seq(ImageName(s"${organization.value}/${name.value}:${version.value}"))

As long as you know both the Dockerfile syntax and the Scala syntax, you will notice that sbt-docker looks like a DSL expressing the Docker build file in Scala. But it is not only a syntactic change: we can immediately notice a few improvements.

For clarity, the destination folder in the container is now declared in a variable (val dest) at the beginning of the file.
The variable is used later, leveraging Scala's string interpolation syntax (s"$dest/index.html").

We can actually do better than this, placing those variables in a separate file; I will describe in another post how to do it with a specific plugin I wrote. For now, just note that we got parametrization.
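As a taste of what such parametrization could look like (a sketch under my own assumptions, not the plugin mentioned above; the docker.properties file name is hypothetical), a build.sbt can read plain Java properties and use them as build values:

```scala
// Sketch for a build.sbt fragment: load parameters from an external
// properties file ("docker.properties" is a hypothetical name) instead
// of hardcoding them in the build definition.
import java.util.Properties
import java.io.{File, FileInputStream}

val props = {
  val p = new Properties
  val f = new File("docker.properties")
  if (f.exists) p.load(new FileInputStream(f))   // missing file: use defaults
  p
}

// the default mirrors the value used in this post
val dest = props.getProperty("dest", "/var/lib/nginx/html")
```

Changing a parameter then means editing a properties file, with no need to touch the build definition itself.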

We can actually do more than interpolation. Note the long string in the runRaw command. We no longer have to worry about getting spaces and newlines wrong, and we can enjoy indentation: we appended an expression stripping the margins and newlines (.stripMargin.replaceAll("[\\n\\r]", "")).

Furthermore, note that the image name is in the same file, and that it is calculated from other settings. Actually, we used the standard SBT settings that identify the project: name, organization and version, which match the Docker image information.

Last but not least: we no longer need to copy files into the folder. The source file for the index is actually outside the source folder, and it is specified as baseDirectory.value.getParentFile. It is copied from the parent directory by SBT, instead of by our build script.

Basically, we replaced the manual build script with an all-in-one build handled by our favorite build tool. We have covered the basics; now it is time to do more.

Modelling dependencies

Now that you have SBT in the mix, you can leverage it to declare dependencies. Indeed, the ability to define tasks that must be executed before others is a key feature of every build system, and a clearly missing feature in Dockerfiles.

We now have an image with nginx, basically empty. What I want to do next is build an image including a complete website. For simplicity, I will use a static site generator as an example (I will cover dynamic websites in later posts), and I want to reuse the image I created before. So we are going to create an image that depends on two other builds: the first one is the build of the base image, and the second one is the build of the static site (it has to be processed, too).

So basically the final goal I want to reach is creating a task that can be performed by a continuous integration tool like Jenkins:

  • checking out the source code from a version control system
  • executing a buildAll command
  • getting my website image built (which also means building the static site and the base image automatically, as dependencies)

Now that the goal is clear, let's start working on it.

Top level project

Each image with sbt-docker is built using an SBT project. As a result, if we have two images, we need two projects. Because we want to model a dependency, one project will refer to another project. Unfortunately, SBT does not allow us to refer to a sibling project unless both belong to the same top-level project. As a result, you also need a top-level project including the two subprojects. So we need a layout like this:

  • root/build.sbt: the top-level build file including the subprojects
  • root/project/plugins.sbt: all the plugins we use in the project, including those used by the subprojects
  • root/nginx/build.sbt: the build file for the nginx image
  • root/website/build.sbt: the build file for generating the static site and the image including it
  • root/website/src/main/paradox/index.md: the markdown source of the website we are going to build

Note that SBT is (almost) recursive: a project can be used as a subproject of another project. I say almost because there are a few exceptions to this rule. The first exception is plugins. You cannot declare plugins in each subproject (or better, if you declare them, they are ignored); you need to declare all the plugins for all the subprojects in the top-level plugins file. So we start with the root/project/plugins.sbt file as follows:

addSbtPlugin("se.marcuslonnberg" % "sbt-docker" % "1.4.0")

addSbtPlugin("com.lightbend.paradox" % "sbt-paradox" % "0.2.0")

We are adding, obviously, the sbt-docker plugin, but also the sbt-paradox plugin, a nice static site generator available out of the box as an SBT plugin. I picked this one because it is the simplest to use in our context and clearly demonstrates what I want to show.

Now we need the top-level build file, which only refers to the subprojects and enables the plugins. File root/build.sbt as follows:

lazy val nginx = project.in(file("nginx"))
  .enablePlugins(DockerPlugin)

lazy val website = project.in(file("website"))
  .enablePlugins(DockerPlugin, ParadoxPlugin)

addCommandAlias("buildAll", "website/docker")

We are following the multi-project builds documentation. Note that really building all means building the subproject website (treated in the next paragraph), which in turn will trigger the build of the base image. The buildAll alias is just a convention; however, if we need to build multiple images, this alias is the place to list them all.
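For instance, if the root project ever aggregated a third image (say a hypothetical api subproject, not part of this post's example), the alias could simply chain the docker tasks:

```scala
// Sketch: one alias running several docker tasks in sequence;
// "api" is a hypothetical extra subproject used only for illustration.
addCommandAlias("buildAll", ";nginx/docker ;website/docker ;api/docker")
```

Listing nginx/docker explicitly is harmless here: since Docker builds are cached, an image already built by a dependency is not rebuilt.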

Configuring our build

In the file website/build.sbt we first put some declarations, as follows:

name := "website"
organization := "devops4scala"
version := "0.1"

imageNames in docker := Seq(ImageName(s"${organization.value}/${name.value}:${version.value}"))

paradoxTheme := Some(builtinParadoxTheme("generic"))

val dest = "/var/lib/nginx/html"

lazy val nginx = project.in(file("..")/"nginx").enablePlugins(DockerPlugin)

Here you can see imageNames, the declaration of the name of the image we are going to build. Furthermore, paradoxTheme is a mandatory setting for Paradox to select a theme (in this case just the default one), while dest is just a constant to more easily locate the target folder in the image.

Building with dependencies

More interesting is the project declaration. Here we need to refer to a sibling project. By design, SBT enforces isolation of the build files, so we cannot use the project declared at the top level; we have to refer to the project by explicitly declaring it. Furthermore, we need to enable (actually, make visible) the plugins we are going to use. I admit this is the part I like least, and I would like a more straightforward way to declare project dependencies. However, it is basically a mechanical copy-and-paste of the top-level build file declaration.

Now we are ready: building an image and triggering its dependencies is done by the following code, whose magic will be explained shortly:

dockerfile in docker := new Dockerfile {
  from((docker in nginx).value.toString)
  copy((paradox in Compile).value, dest)
}

First, a quick reminder. In SBT you trigger dependent builds by evaluating their keys. We have two dependent builds here. One is the paradox command, which builds the site by processing the markdown and generating a whole website; it returns the folder where the output site was placed. The other is the build of the nginx image, which we will use as a base, simply adding the generated site inside the container; evaluating its key returns the name of the image built.

Hence, (docker in nginx).value.toString will force the rebuild of the base image, so we can be sure it is already built before we build our new image, and we use its name as the base image. And copy((paradox in Compile).value, dest) will first build the static site with the static site generator; once the site is built, we take the target folder and copy it inside the image.

The key concept is that referring to the other artifacts' build keys and evaluating them ensures we actually model a dependency system, not just a script that executes actions in a given order. Note that Docker builds are incremental and cached, so if an image is already built it won't be rebuilt, and the whole build process will be very fast.
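To see why evaluating a key models a dependency rather than a scripted order, consider this sketch of a custom task for website/build.sbt (deploy is a name I made up; it is not part of sbt-docker):

```scala
// Sketch: a hypothetical task that uses the image produced by the nginx
// subproject. Because it evaluates (docker in nginx).value, SBT knows it
// must build that image first; no explicit ordering is written anywhere.
lazy val deploy = taskKey[Unit]("Log the freshly built base image")
deploy := {
  val imageId = (docker in nginx).value   // forces the nginx image build
  streams.value.log.info(s"base image ready: $imageId")
}
```

This is exactly the mechanism the dockerfile definition above relies on: the dependency graph emerges from which keys a task evaluates.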

Demo time!

Now you can see the solution in action again, with a deeper understanding of what is happening.

The final result is in this screenshot:

But wait! There is (much) more…

OK, I will stop here for now, but I am not finished. We still have to see how to configure builds with shared property files, how to download files intelligently, how to use images to build artifacts, and how to use the awesome Ammonite Scala scripting by Li Haoyi…

You can see those features already in action in my (work-in-progress) devops solution for Scala applications: Mosaico. It is now at version 0.2, which includes the plugin sbt-mosaico and a set of Docker images that you can reuse in your applications. But many more features are planned (and some are partially developed already).

Stay tuned for more installments of the Devops4Scala series of blog posts and new releases of Mosaico.