Tag Archives: python

Self-contained GoCD Environment Using Docker-Compose

Motivation

Today’s software is often connected, be it for automatic desktop updates or for implementing an internet-scale service. Developers’ tools or toys are no longer solely editors or compilers, but also databases, logging or search services, code sharing platforms, or Continuous Integration servers. Evaluating distributed on-premise software has not been as easy as desktop software, however. Arduous and error-prone installation instructions seem out of place, but are still very common. A number of open source software now comes with a one(or two)-liner installation that is always up to date. GoCD is one of them.

While discovering features of GoCD I sometimes wished for even more simplicity and automation: a one-liner for a whole Continuous Integration environment. This should include a server, several build agents, and several source repositories. With a recent push towards containerized software delivery, the path is quite clear: build, provision and configure the whole infrastructure from code and run it in containers. This way, it is easier to experiment, build and communicate Continuous Delivery prototypes.

GoCD Infrastructure as Code

There are official GoCD Docker images for both the server and base images for the build agents. The containers are configured such that the build agents can register themselves automatically with the server if the auto-registration key of the server is known.

Thus, starting from the official Docker images, to get to a one-liner self-contained infrastructure installation, the following ingredients are missing:

  • starting the server and the agents
  • provisioning the agents to contain desired compilers
  • adding source code repositories to be built

Docker Compose as a Powerful Toy

Docker Compose is a tool to define and run multi-container applications. It is so pragmatic that using it almost feels like playing.

To start the server and an agent, the following Docker Compose configuration would suffice:

Running docker-compose up -d pulls and brings up the minimal infrastructure.

A Custom Build Agent

In the configuration above, the agent is rather empty, and probably does not contain the build infrastructure we need.

Here, we extend the gocd/gocd-agent-ubuntu-16.04:v17.3.0 image, installing some Lua infrastructure:

In the docker-compose configuration, the image tag is replaced with a build one:

Note the resources that can be assigned to the build agents from the docker-compose file and get passed to the agent container as environment variable for correct automatic registration with the server.

After bringing up the infrastructure and the auto-registration key matches that of the server, the agents are registered with the server:

GoCD agents

25.3.2017 Update: in the GoCD 17.3.0 server image, the auto-registration key is a generated one, and is not set to the default above. To enable auto-registration, the provisioning step is used.

Pipeline →⇉→ as Code

The final missing piece of the self-provisioning CD infrastructure is the addition of the repositories to be built. There is a number of GoCD management libraries on Github. Each would probably serve the purpose of setting up the pipelines.

The approach taken here is to centrally manage a list of repositories to be built, while delegating the details of each pipeline to the corresponding repository. This way, the pipeline configuration is part of the repository itself (pipeline as code), and could be portable between different GoCD instances.

The three ingredients are: the pipeline configuration from source control feature #1133, the YAML Config Plugin, and the GoCD REST API, as of the time of the experiment, none of the tools I have seen could add the pipeline-as-code configuration.

Adding the Pipelines (External Provisioning)

for a simpler provisioning via mapping a config file into the container, see the next chapter

To configure the server with the pipelines, and set the auto-registration key, a separate container runs a Python script once. This way, provisioning is decoupled from the generic server configuration and can be replaced by another mechanism without rebuilding the server image. The script provisions the server with the pipelines, and sets the auto-registration key to the one agreed upon.

A common problem is the startup sequence of the containers. Trying to provision the server before it is ready to accept configuration would result in an error. Thus, the provisioning script tries to avoid pushing the configuration too early by waiting for the GoCD web UI to become available. This is accomplished by waiting using urlwait:

if not wait_for_url("http://go-server:8153", 300):
  print("""Go server did not start in a timely fashion.
           Please retry docker-compose up provisioner""")
  sys.exit()

The repositories to configure pipelines from are (here, rather crudely) added directly to the XML config as the yaml config plugin expects them. The XML is first read from the /go/api/admin/config.xml API, then simply extended with the necessary tags, and then posted to the same URL. There is still a chance of a race condition that the configuration is changed between it is read and it is written. As GoCD validates the config upon modification, and the script strives to be idempotent, re-running the container should fix the conflict.

The GoCD XML config needs the following addition for a repository to configure a pipeline:

<config-repos>
 <config-repo plugin="yaml.config.plugin">
  <git url="https://github.com/d-led/gocd-rpi-unicorn-hat-monitor.git" />
 </config-repo>
</config-repos>

In the repository itself, place a ci.gocd.yaml with a corresponding pipeline definition.

After grabbing a coffee, the infrastructure has been started, provisioned and configured, and the UI shows the result:

GoCD pipelines

Yet Simpler Self-Provisioning

With the current official Docker image for the server, it is possible to map configuration-relevant files and folders into the container. In our case this means, the plugin, and the whole server configuration can be directly mapped into the container, thus, provisioning the server without an extra provisioning step.

As GoCD keeps the pipeline configuration in a single file cruise-config.xml, we can simply track it in the same repository as the Docker Compose config. To map the configuration, and the plugin jar, the volumes are added to the container config:

When the server starts, it already has most of its configuration. As the agent auto-registration key is part of the XML config, the agents will automatically register themselves, since they are configured with the same key.

Other CI Tools

A comparable exercise can be performed with almost any tool. Some tools that are natively based on the concept of pipelines are Concourse CI and Drone that both use Docker as build agents (runners). Drone even comes with its own Docker-Compose config. Concourse can be bootstrapped via Vagrant.

A significant difference of GoCD to the more recent tools is its platform-independence. Sometimes, building inside a container is not a choice, e.g. on Windows. GoCD agents can run anywhere, where a JRE8 can run, thus, increasing its reach. A phoenix environment including Windows agents can be achieved with some effort using Chocolatey → Packer → Terraform/Vagrant.

Conclusion

This article has described an experiment to rather quickly arrive at a self-contained and self-provisioned Continuous Delivery infrastructure consisting of “Phoenix Servers” – a phoenix infrastructure.

While the result is rather humble, it demonstrates that continuous delivery techniques can be applied to a continuous delivery infrastructure itself. Using Docker Compose allowed to prototype a distributed development-supporting application and its configuration on a local developer machine with a potential to transfer the prototype into real use.

All this would be impossible without a huge network, or, I’d say, a universe, consisting of online services running and providing open source software that is created by a large number of open source communities, and a yet larger number of individuals collaborating in various ways to envision, create, maintain, and run it. Moreover, times are such that one can experience successful transition of proprietary software into open source (GoCD), and companies, building business around open source software. The OSS ecosystem is a distributed, self-directing system that catalyzes idea creation, mutation and destruction much faster that most smaller systems can do. For the moment, I hope, there is no going back.

Repository

The Github repository to run the self-contained infrastructure can be found here: gocd_docker_compose_example.

The repository can be used to bootstrap demos, further experiments and proofs of concept.

Disclaimer

This post, as any other post on my blog, is not advertisement, and no affiliation or endorsement exists. It is a write-up of my personal experiments, experiences and opinions. The results obtained here can most certainly be achieved using other tools and technologies.

Build Agent Infrastructure Testing in GoCD

In this post I would like to describe a simple technique for reducing the waiting time and stress related to build agent environment volatility when using Continuous Integration / Continuous Delivery tools like GoCD, via infrastructure testing.

The Problem

Given a modern CI server, such as GoCD, and a set of dedicated build machines (agents), it is possible to improve software development agility. Automated build/test/deploy pipelines, built to reflect the value stream, bring transparency and focus into the software delivery activities.

CI automation is software itself, and is thus susceptible to errors. Configuration management can optimize the set up of the environment, in which the build agents run. However, when computational resources are added to a CI infrastructure, i.e. to parallelize the build and thus reduce feedback times, a missing environment dependency can cause stress and pain that CI is trying to eliminate.

Consider a pipeline where a complete cycle (i.e. with slow integration tests followed by a reporting step at the end or long check-outs in the beginning) takes a significant amount of time. If one of the last tasks fails due to a configuration or an environment issue, the whole stage fails. The computational resources have been wasted just to find out that a compiler is missing. This can easily happen when there is variation in the capabilities of the build agents.

Improvement idea: fail fast! Don’t wait for environment or infrastructure mismatches

GoCD Resources as Requirements and Capabilities

If a step in a build pipeline requires a certain compiler or a particular environment, this can be conveniently expressed in the configuration of GoCD as a build agent resource. A resource can be seen as a requirement of a pipeline step that is fulfilled by a corresponding capability of a build agent.

Consider the following set-up with two build agents – one running on Windows, another one on Linux. Some tasks could be completely platform-independent, such as text processing, and thus could be potentially performed on any machine with a required interpreter installed.

agents

Build Agent is the Culprit

We have set up our environment, and have successfully tested our first commit, but the second one fails:

another_agent

The code is the same, why did the second build fail? The builds ran on different agents, but I expected them to behave similarly…

build_fails

Oh, that’s embarrassing. While yes, the script is platform-independent, there’s no executable named python2 in my Windows runner environment path.

With a one-file repository and a simple print statement this failure did not cause much damage, but as mentioned earlier, real life builds failing due to a missing executable might be costly.

Unhappy Picture

unhappy

Infrastructure Test Pipeline

In order to fail fast in situations where new agents are added to a CI infrastructure, or their environment is volatile, I propose to use a single independent pipeline that checks the assumptions that longer builds depend upon.

If a build step requires python in the path, there should be a test for it that gives this feedback in seconds without much additional waiting time. This can be as easy as calling python --version, which will fail with a non-zero return code if the binary is missing. More fine-grained assertions are possible, but should still remain fast.

If a certain binary should not be in the path, this can be asserted as well. The same goes for environment variables and file existence. Dedicated infrastructure testing tools, such as Serverspec could also be used, but having a response time under a minute is crucial in my view.

Run on All Agents

In order to validate the consistency of the CI infrastructure, the validation tasks should run on all agents that advertise a corresponding resource. This is where, in my view, the real power of GoCD comes to light, and the concepts used in it fit in the right places.

GoCD will run the test tasks on all agents that fulfill all resource requirements for the task.

run_on_all_g

Test fails

Now that we have all the tests, running them gives quick and precise feedback:

infrastructure_test_fails

Checking out the job run details reveals the offending agent. Note the test duration: under 1 second.

infrastructure_test_agent

Fixing the Infrastructure

resources_modified

Whatever the resolution of the infrastructure problem, when the infrastructure test has a good coverage of the prerequisites for a pipeline, adding new agents to the CI infrastructure should become as much fun as TDD is: write an infrastructure test, see it fail, fix the infrastructure, feel the good hormones. Add new build agents for speed — still works — great!

infrastructure_test_passes

Note how the resources that are available only on one machine are only run on one corresponding machine.

 

Happy Pictures and Developers

happy pipeline

When to Test

It is an open question, when to test the infrastructure. With the system being composed of the CI server and agents, the tests should probably run on any global state change, such as

  • added/removed/reconfigured agents
  • automatic OS updates (controversial)
  • restarts
  • network topology changes

It is also possible to schedule a regular environment check. Having the environment test pipeline be the input for other pipelines unfortunately will not do in the following sequence of events:

  • environment tests pass
  • faulty agents are added
  • downstream pipeline is triggered
  • environment failure causes a pipeline to fail

In any case, there is a REST API available for the GoCD server should automating the automation become a necessity.

Acknowledgments

I would like to thank all the great minds, authors and developers who have worked and are working to make lives of developers and software users better. Tools and ideas that work and provide value are indispensable.  The articles and the software linked in this blog entry are examples of knowledge that brings the software industry forward. I am also very grateful to my current employer for letting me learn, grow and make a positive impact.