Scality’s Release Engineering team has completed the integration of all Zenko-related repositories into its GitWaterFlow delivery model, bringing them in line with all of Scality’s other products. The GitWaterFlow model was introduced at Scality years ago to improve release quality and speed up development.
You may have noticed that the Cloudserver and Backbeat repositories now default to a branch called development/8.0, that they contain new directories called eve, and that pull requests carry comments from a bot named Bert-E. What’s going on? That’s the GitWaterFlow process in action. To understand it, we need a little history…
In the beginning, the small team of Scality engineers developing the RING product used CVS, and later Subversion, in an ad-hoc fashion, and collaboration happened ‘on the spot’. The engineering team pushed features and bug fixes into a shared trunk branch. This branch was slated to become the next ‘major’ release of the product, though overeager integration of partially-delivered features often left the branch in a non-shippable state.
The process of porting bug fixes to the relevant branches (‘back-porting’) was fully manual. Once the change rate of the codebase reached a certain level, this became a bottleneck in the development process. Like all manual processes, it was also prone to introducing accidental bugs or regressions. Creating backport commits on various version branches also destroyed the relationships between semantically equivalent changesets, which could only be recovered through information kept in commit messages or the ticketing system, again relying on humans doing the right thing.
Introducing the GitWaterFlow model
Fig. 1. Forward-porting patches on multiple development branches
That approach had too many flaws and didn’t scale. The Release Engineering team investigated options to radically change the approach, easing the workflow for developers while ensuring the correctness of meta-information. The result was announced at the end of 2016 and has kept improving since.
GitWaterFlow (GWF) is a combination of a branching model and its associated tooling, featuring a transactional view of multi-branch changesets that no existing tool or model supported. GWF bans “backporting” in favor of “(forward) porting”: the term “porting” describes the act of developing a changeset on an old, yet still active, version branch and subsequently merging it into newer ones. It is considered better than “backporting” for multiple reasons, not least because it makes merge automation trivial. Changes that are merged into an old version branch, whether fixes or improvements, must also land in newer ones, otherwise there is a risk of regression. A bot can use this assumption to prepare and then execute the merge on newer branches, thus offloading the developer.
Development Branches
GWF comes with a versioning scheme inspired by semantic versioning (semver). Version numbers take the form major.minor.patch: patch is incremented only when backward-compatible bug fixes are added, minor is incremented when backward-compatible features are added, and major is incremented for backward-incompatible changes.
In GWF, every living minor version has a corresponding development/major.minor branch, each of which must be included in newer ones. In fig. 1, development/1.0 is included in development/1.1, which in turn is included in development/2.0. Consequently, a GWF-compliant repository has a waterfall-like representation, hence the name “GitWaterFlow”.
Because GWF is based on ‘porting’, feature branches do not necessarily start from the latest development branch. In fact, before starting to code, a developer must determine the oldest development/* branch his code should land upon (refer to fig. 1.a). Once ready to merge, the developer creates a pull request that targets the development branch from which he started, as sketched below. A gating and merging bot ensures that the feature branch is merged not only into the destination but also into all subsequent development branches.
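As a quick sketch (branch and version names illustrative), a fix that must land on every version since 1.0 starts from the oldest affected branch:

# Branch from the oldest development branch the fix applies to
$> git checkout -b feature/foo origin/development/1.0
# ...commit and push, then open a pull request targeting development/1.0;
# the bot carries the change to development/1.1 and development/2.0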
Transactional Multi-Branch Changes
The fact that every pull request can concurrently target more than one mainline branch can dramatically affect the approach developers take in addressing issues. For instance, it is not uncommon for a feature branch to merge cleanly into version n but conflict with version n+1. In our setup, this class of conflicts must be detected and fixed prior to merging the pull request. The code that resolves such conflicts is considered part of the change and must be reviewed at the same time. It is also a requirement that a pull request be merged only once it has passed the tests on all targeted versions.
In short, the changes brought to the software on multiple branches are a single entity and should be developed, reviewed, tested, and merged as such.
Only The Bot Can Merge
Bert-E is the gatekeeping and merging bot Scality developed in-house to automate GWF; its purpose is to help developers merge their feature branches on multiple development branches. The tool is written in Python and designed to function as a stateless, idempotent bot. It is triggered via Bitbucket/GitHub webhooks whenever a pull request changes (creation, new commit, peer approval, comment, etc.).
Bert-E helps the developer prepare his pull request for merging. It interacts directly with the developer through GitHub’s (or Bitbucket’s) comment system via the pull-request timeline, pushing contextualized messages on the current status and the next expected actions. In Scality’s case, Bert-E ensures that the pull request has at least two approvals from peers before it merges the contribution. In the future, Bert-E will also check that the JIRA fixVersion field is correct for the target branches, to help product managers keep track of progress. Bert-E usually replies in less than 50 seconds, creating a trial-and-error process with a fast feedback loop that is ideal for onboarding newcomers to the ticketing process.
Integration Branches
In parallel with the previously described process, Bert-E begins trying to merge on the subsequent development branches by creating integration branches named w/major.minor/feature/foo, after both the originating feature branch and the target development branch (refer to fig.1.b). Every time Bert-E is triggered, it checks to ensure that the w/* branches are ahead of both the feature branch and the corresponding development branches (updating them following the same process when this is not the case).
Every change on a w/* branch triggers a build/test session. When the pull request fulfills all the requirements previously described, and when the builds are green on all the w/* branches, Bert-E fast-forwards all the development branches to point to the corresponding w/* branches in an atomic transaction, as depicted in fig.1.c.
Note that if another pull request is merged in the interim, Bert-E will not be able to push and must re-update its w/* branches and repeat the build/test process.
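Continuing the feature/foo example (names illustrative), a pull request targeting development/1.0 would produce integration branches like these:

$> git branch -r | grep 'w/'
  origin/w/1.1/feature/foo   # feature/foo merged onto development/1.1
  origin/w/2.0/feature/foo   # feature/foo merged onto development/2.0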
Better ‘Definition of DONE’ and Smoother Developer Process
After more than two years of use at Scality, we can testify that the main GWF benefit is its atomic multi-branch merge property. In this context, ‘DONE’ means merged and fully tested on all target branches, and there is no additional backporting phase in which it is discovered that the backport is more complex than the fix itself. Target-branch conflicts are detected early and dealt with prior to merging.
Peer reviews/approvals aside, the development process is smoother and allows the developer to push his changeset to completion without depending on third parties to merge. Changeset ownership stays with the author and does not vary across time; the developer is responsible for it up until the merge.
Also, the metadata in the git repository is much clearer: a simple git branch --contains <commit> now indicates which branches a change has been merged into. Thanks to gatekeeping, the development branches are always in a shippable state, which has greatly improved Scality’s accuracy in predicting delivery dates. Hand-in-hand with that, the amount of overall engineering work in progress has been reduced by the GWF deployment, and as a direct result Scality is shipping faster.
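For example (commit hash illustrative):

# List every branch that already contains a given change
$> git branch -a --contains 3f2e1ab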
Bert-E and Eve are open source
Source code and documentation for Bert-E and Eve are available on Bitbucket. If you have questions, please ask them on the Zenko forum.
One of the first recommended ways to test Zenko is to deploy it in a Docker Swarm cluster. Docker Swarm lets you join a group of servers into a cluster that gives you fault tolerance, providing Zenko with a high-availability architecture.
Since the Zenko deployment documentation lists a functioning Docker Swarm cluster as one of the prerequisites, we recorded a short video to illustrate how to create a cluster with 2 worker nodes and 3 manager nodes. Because we do this every day, we tend to forget that we had to learn it too, at some point in (recent) time. Enjoy the video and reach out on our forum if you’re having any difficulty deploying Zenko: we’re here to help!
It is always a victory for an open source project when a contributor takes the time to provide a substantial addition to it. Today, we’re very happy to introduce a Docker Registry – Zenko tutorial by GitHub user rom-stratoscale, also known as Rom Freiman, R&D Director at Stratoscale.
Thank you very much for making the Zenko community stronger, Rom! — Laure
Introduction
A Docker registry is a service for storing and distributing private Docker images among the users who need them. Although there are public offerings for that service, many organizations prefer to host a private registry for their internal use.
Private Docker Registries have multiple options for storage configuration. One of the useful ones is Zenko’s CloudServer (formerly known as S3 Server), by Scality.
Zenko’s CloudServer is an open-source standalone S3 API deployed as a docker container. In other words, it allows you to have an S3-compatible store on premise (and even on your laptop).
In this tutorial, I’ll demonstrate how to deploy a private Docker Registry with Scality as your private on-premise S3 backend.
We’ll use containers to run both services (the registry and Zenko CloudServer).
Prerequisites and assumptions
Prerequisites:
Docker daemon (v1.10.3 or above)
python (v2.7.12 or above)
AWS CLI (v1.11.123 or above)
$> pip install awscli==1.11.123
Configure your AWS credentials:
$> aws configure
AWS Access Key ID []: {{YOUR_ACCESS_KEY_ID}}
AWS Secret Access Key []: {{YOUR_SECRET_KEY}}
Default region name []: us-east-1
Default output format[]: json
Assumptions:
In this tutorial, both the S3 service and the registry run on the same physical server, to avoid dealing with networking. If you choose to run them on different servers, verify that the registry server can reach CloudServer (TCP, port 8000).
The S3 storage will use the container’s storage and will be lost once the container is stopped. For persistence, you should create appropriate volumes on the host and mount them as Docker volumes for the CloudServer container.
The Docker registry is consumed from localhost, so we won’t dive into SSL and certificate generation (but you can if you want to).
Run Zenko CloudServer container
Use the docker run utility to start Zenko CloudServer:
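A minimal sketch of the two containers, assuming CloudServer’s default credentials (accessKey1 / verySecretKey1), the registry:2 image, and host networking so the registry can reach CloudServer on 127.0.0.1:8000:

# Start CloudServer: S3 API on port 8000, default credentials
# accessKey1 / verySecretKey1
$> docker run -d --name cloudserver -p 8000:8000 zenko/cloudserver

# Create the bucket the registry will store images in
$> aws s3 --endpoint-url=http://127.0.0.1:8000 mb s3://docker-registry

# Start the registry with its S3 storage driver pointed at CloudServer;
# host networking lets it reach 127.0.0.1:8000, and it listens on port 5000
$> docker run -d --name zenkoregistry --net=host \
     -e REGISTRY_STORAGE=s3 \
     -e REGISTRY_STORAGE_S3_ACCESSKEY=accessKey1 \
     -e REGISTRY_STORAGE_S3_SECRETKEY=verySecretKey1 \
     -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
     -e REGISTRY_STORAGE_S3_REGIONENDPOINT=http://127.0.0.1:8000 \
     -e REGISTRY_STORAGE_S3_BUCKET=docker-registry \
     registry:2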
Now it’s time for testing… Let’s pull an alpine image, push it into the registry, and check that it was created:
docker pull alpine
docker tag alpine:latest 127.0.0.1:5000/alpine:latest
docker push 127.0.0.1:5000/alpine:latest # the actual submission to the newly spawned registry
docker rmi alpine:latest; docker rmi 127.0.0.1:5000/alpine:latest # local cleanup before pulling from the registry.
docker pull 127.0.0.1:5000/alpine:latest
And… voilà!!!
Now, you can list what is inside your CloudServer-hosted docker-registry bucket to check how zenkoregistry actually saved the data:
aws s3 --endpoint-url=http://127.0.0.1:8000 ls --recursive s3://docker-registry
Let’s explore how to write a simple Node.js application that uses the S3 API to write data to the Scality S3 Server. If you do not have the S3 Server up and running yet, please visit the Docker Hub page to run it easily on your laptop. First we need to create a list of the libraries needed in a file called package.json. When the node package manager (npm) is run, it will download each library for the application. For this simple application, we will only need the aws-sdk library.
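A minimal package.json sketch (the package name and version are illustrative); aws-sdk is the only dependency:

{
  "name": "s3-note-app",
  "version": "1.0.0",
  "dependencies": {
    "aws-sdk": "^2.0.0"
  }
}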
Now let’s begin coding the main application in a file called app.js with the following contents:
var aws = require('aws-sdk');

var ACCESS_KEY = process.env.ACCESS_KEY;
var SECRET_KEY = process.env.SECRET_KEY;
var ENDPOINT = process.env.ENDPOINT;
var BUCKET = process.env.BUCKET;

aws.config.update({
    accessKeyId: ACCESS_KEY,
    secretAccessKey: SECRET_KEY
});

var s3 = new aws.S3({
    endpoint: ENDPOINT,
    s3ForcePathStyle: true
});

function upload() {
    // process.argv[2] is the file name, process.argv[3] its contents
    var params = {
        Bucket: BUCKET,
        Key: process.argv[2],
        Body: process.argv[3]
    };
    s3.putObject(params, function(err, data) {
        if (err) {
            console.log('Error uploading data: ', err);
        } else {
            console.log('Successfully uploaded data to: ' + BUCKET);
        }
    });
}

if (ACCESS_KEY && SECRET_KEY && ENDPOINT && BUCKET && process.argv[2] && process.argv[3]) {
    console.log('Creating File: ' + process.argv[2] + ' with the following contents:\n\n' + process.argv[3] + '\n\n');
    upload();
} else {
    console.log('\nError: Missing S3 credentials or arguments!\n');
}
This simple application will accept two arguments on the command-line. The first argument is for the file name and the second one is for the contents of the file. Think of it as a simple note taking application.
Now that the application is written, we can install the required libraries with npm.
npm install
Before the application is started, we need to set the S3 credentials, bucket, and endpoint in environment variables.
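For example, assuming CloudServer’s default credentials and a bucket named my-bucket (both illustrative):

$> export ACCESS_KEY=accessKey1
$> export SECRET_KEY=verySecretKey1
$> export ENDPOINT=http://127.0.0.1:8000
$> export BUCKET=my-bucket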
Please ensure that the bucket specified in the BUCKET argument exists on the S3 Server. If it does not, please create it.
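With the AWS CLI, creating it could look like this (endpoint and bucket name as above):

$> aws s3 --endpoint-url=http://127.0.0.1:8000 mb s3://my-bucket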
Now we can run the application to create a simple file called “my-message” with the contents “Take out the trash at 1pm PST”:
node app.js 'my-message' 'Take out the trash at 1pm PST'
You should now see the file on the S3 Server using your favorite S3 Client:
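For instance, with the AWS CLI pointed at the server:

$> aws s3 --endpoint-url=http://127.0.0.1:8000 ls s3://my-bucket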
I hope that this tutorial will help you get started quickly to create wonderful applications that use the S3 API to store data on the Scality S3 Server. For more code samples for different SDKs, please visit the Scality S3 Server GitHub repository.
Docker Swarm is a clustering tool developed by Docker and ready to use with its containers. It allows you to start a service, which we define and use as a means to ensure s3server’s continuous availability to the end user. Indeed, a swarm defines a manager and n workers among n+1 servers. We will do a basic setup in this tutorial, with just 3 servers, which already provides strong service resiliency whilst remaining easy for an individual to do. We will use NFS through Docker to share data and metadata between the different servers.
You will see that the steps of this tutorial are labeled On Server, On Clients, or On All Machines. These refer respectively to the NFS Server, the NFS Clients, or both. In our example, the IP of the Server is 10.200.15.113, while the IPs of the Clients are 10.200.15.96 and 10.200.15.97.
Installing docker
Any version from Docker 1.13 onwards should work; we used Docker 17.03.0-ce for this tutorial.
On All Machines
On Ubuntu 14.04
The docker website has solid documentation.
We have chosen to install the aufs dependency, as recommended by Docker. Here are the required commands:
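A minimal sketch of those commands for Ubuntu 14.04 (following Docker’s documentation of that era; package names may differ for your kernel):

$> sudo apt-get update
# aufs support, as recommended by Docker
$> sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
# Docker's apt repository and the docker-ce package
$> sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
$> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
$> sudo apt-get update && sudo apt-get install docker-ce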
Your NFS Clients will mount Docker volumes over your NFS Server’s shared folders. Hence, you don’t have to mount anything manually; you just have to install the NFS commons:
On Ubuntu 14.04
Simply install the NFS commons:
$> sudo apt-get install nfs-common
On CentOS 7
Install the NFS utils, and then start the required services:
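A minimal sketch, assuming the systemd service names current on CentOS 7:

$> sudo yum install nfs-utils
$> sudo systemctl enable rpcbind nfs-server
$> sudo systemctl start rpcbind nfs-server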
Your NFS Server will be the machine that physically hosts the data and metadata. The packages we will install on it are slightly different from those we installed on the clients.
On Ubuntu 14.04
Install the NFS server specific package and the NFS commons:
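For example:

$> sudo apt-get install nfs-kernel-server nfs-common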
Choose where your shared data and metadata from your local S3 Server will be stored. We chose to go with /var/nfs/data and /var/nfs/metadata. You also need to set proper sharing permissions for these folders as they’ll be shared over NFS:
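For example, creating the folders and handing them to the NFS anonymous user (paths as chosen above):

$> sudo mkdir -p /var/nfs/data /var/nfs/metadata
$> sudo chown -R nobody:nogroup /var/nfs/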
Now you need to update your /etc/exports file. This is the file that configures network permissions and rwx permissions for NFS access. By default, Ubuntu applies the no_subtree_check option, so we declared both folders with the same permissions, even though they’re in the same tree:
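A sketch of the corresponding /etc/exports entries, using the client IPs from this tutorial:

/var/nfs/data 10.200.15.96(rw,sync,no_root_squash) 10.200.15.97(rw,sync,no_root_squash)
/var/nfs/metadata 10.200.15.96(rw,sync,no_root_squash) 10.200.15.97(rw,sync,no_root_squash)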
Finally, you need to allow NFS mounts from Docker volumes on other machines. To do that, change the Docker config in /lib/systemd/system/docker.service:
$> sudo vim /lib/systemd/system/docker.service
In this file, change the MountFlags option:
MountFlags=shared
Now you just need to restart the NFS server and docker daemons so your changes apply.
On Ubuntu 14.04
Restart your NFS Server and docker services:
$> sudo service nfs-kernel-server restart
$> sudo service docker restart
We will now set up the Docker volumes that will be mounted to the NFS Server and serve as data and metadata storage for S3 Server. These two commands have to be replicated on all machines:
$> docker volume create --driver local --opt type=nfs --opt o=addr=10.200.15.113,rw --opt device=:/var/nfs/data --name data
$> docker volume create --driver local --opt type=nfs --opt o=addr=10.200.15.113,rw --opt device=:/var/nfs/metadata --name metadata
There is no need to “docker exec” these volumes to mount them: the Docker Swarm manager will do it when the Docker service is started.
On Server
To start a Docker service on a Docker Swarm cluster, you first have to initialize that cluster (i.e.: define a manager), then have the workers/nodes join in, and then start the service. Initialize the swarm cluster, and look at the response:
$> docker swarm init --advertise-addr 10.200.15.113
Swarm initialized: current node (db2aqfu3bzfzzs9b1kfeaglmq) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-5yxxencrdoelr7mpltljn325uz4v6fe1gojl14lzceij3nujzu-2vfs9u6ipgcq35r90xws3stka \
10.200.15.113:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
On Clients
Simply copy/paste the command provided by your docker swarm init. When all goes well, you’ll get something like this:
$> docker swarm join --token SWMTKN-1-5yxxencrdoelr7mpltljn325uz4v6fe1gojl14lzceij3nujzu-2vfs9u6ipgcq35r90xws3stka 10.200.15.113:2377
This node joined a swarm as a worker.
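Back on the manager, you can now start the S3 Server service. A minimal sketch, assuming the data and metadata volumes created earlier and the scality/s3server image’s default data paths:

$> docker service create --name s3 --replicas 1 \
     --mount type=volume,source=data,target=/usr/src/app/localData \
     --mount type=volume,source=metadata,target=/usr/src/app/localMetadata \
     -p 8000:8000 scality/s3server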
If you run a docker service ls, you should have the following output:
$> docker service ls
ID NAME MODE REPLICAS IMAGE
ocmggza412ft s3 replicated 1/1 scality/s3server:latest
If your service won’t start, consider disabling apparmor/SELinux.
Testing your High Availability S3Server
On All Machines
On Ubuntu 14.04 and CentOS 7
Try to find out where your Scality S3 Server is actually running, using the docker ps command. It can be on any node of the swarm cluster, manager or worker. When you find it, you can kill it with docker stop <container id>, and you’ll see it respawn on a different node of the swarm cluster. Now you see: if one of your servers fails, or if Docker stops unexpectedly, your end user will still be able to access your local S3 Server.
Troubleshooting
To troubleshoot the service you can run:
$> docker service ps s3
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
0ar81cw4lvv8chafm8pw48wbc s3.1 scality/s3server localhost.localdomain.localdomain Running Running 7 days ago
cvmf3j3bz8w6r4h0lf3pxo6eu \_ s3.1 scality/s3server localhost.localdomain.localdomain Shutdown Failed 7 days ago "task: non-zero exit (137)"
If the error is truncated, you can get a more detailed view by inspecting the Docker task ID:
$> docker inspect cvmf3j3bz8w6r4h0lf3pxo6eu
Off you go!
Let us know what you use this functionality for, and if you’d like any specific developments around it. Or, even better: come and contribute to our GitHub repository! We look forward to meeting you!
If you wish to use https with your local Scality S3 Server, you need to set up SSL certificates. Here is a simple guide to doing it.
Deploying Zenko CloudServer
First, you need to deploy CloudServer (previously known as S3 Server). This can be done very easily via our DockerHub page (you want to run it with a file backend).
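A minimal sketch of that deployment, naming the container s3server to match the restart command used later (the name is an assumption, not a requirement):

$> docker run -d --name s3server -p 8000:8000 zenko/cloudserver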
Updating your Scality S3 Server container’s config
You’re going to add your certificates to your container. In order to do so, you need to exec inside your CloudServer container. Run a $> docker ps and find your container’s id (the corresponding image name should be zenko/cloudserver). Copy the corresponding container id (here we’ll use 894aee038c5e), and run:
$> docker exec -it 894aee038c5e bash
You’re now inside your container, using an interactive terminal 🙂
Generate SSL key and certificates
There are 5 steps to this generation. The paths where the different files are stored are defined after the -out option in each command.
# Generate a private key for your CSR
$> openssl genrsa -out ca.key 2048
# Generate a self signed certificate for your local Certificate Authority
$> openssl req -new -x509 -extensions v3_ca -key ca.key -out ca.crt -days 99999 -subj "/C=US/ST=Country/L=City/O=Organization/CN=scality.test"
# Generate a key for Scality S3 Server
$> openssl genrsa -out test.key 2048
# Generate a Certificate Signing Request for Scality S3 Server
$> openssl req -new -key test.key -out test.csr -subj "/C=US/ST=Country/L=City/O=Organization/CN=*.scality.test"
# Generate a local-CA-signed certificate for Scality S3 Server
$> openssl x509 -req -in test.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out test.crt -days 99999 -sha256
Update CloudServer config.json
Add a certFilePaths section to ./config.json with the appropriate paths:
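A sketch of that section, assuming the key and certificates generated above sit next to config.json:

"certFilePaths": {
    "key": "./test.key",
    "cert": "./test.crt",
    "ca": "./ca.crt"
}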
First, you need to exit your container. Simply run $> exit. Then, you need to restart your container. Normally, a simple $> docker restart s3server should do the trick.
Update your host config
Associate the local IP address with the hostname
In your /etc/hosts file on Linux, OS X, or Unix (with root permissions), edit the line of localhost so it looks like this:
127.0.0.1 localhost s3.scality.test
Copy the self-signed certificate from your container
In the above commands, it’s the file named ca.crt. Choose the path you want to save this file at (here we chose /root/ca.crt), and run something like:
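A minimal sketch with docker cp, assuming the certificates were generated in the container’s default working directory (/usr/src/app for the CloudServer image):

$> docker cp 894aee038c5e:/usr/src/app/ca.crt /root/ca.crt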