CloudServer can now be deployed on a Kubernetes cluster through Helm. The CloudServer Helm chart makes it easy to add an S3-compatible storage system to a K8s cluster. CloudServer can store data locally or can be used with existing S3-compatible servers by supplying credentials in the chart's values.yaml file. See the full documentation at the Helm charts GitHub repository.
To start using CloudServer on an existing Kubernetes cluster, run:
$ helm install stable/cloudserver
To connect CloudServer to S3-compatible services, fill in the cloud backend credentials in the values.yaml file. An AWS configuration may look like this:
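The exact keys depend on the chart version, so treat the following as a hypothetical sketch of a location entry backed by an existing AWS bucket (the location name, bucket name and credentials are placeholders):

locationConstraints:
  aws-location:
    type: aws_s3
    details:
      awsEndpoint: s3.amazonaws.com
      bucketName: my-existing-aws-bucket   # placeholder: a bucket that already exists on AWS
      bucketMatch: true
      credentials:
        accessKey: MY_AWS_ACCESS_KEY       # placeholder
        secretKey: MY_AWS_SECRET_KEY       # placeholder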
Minikube and Docker for Mac Edge also support single-node Kubernetes for local testing. Docker has a step-by-step guide for such a setup. CloudServer's full documentation covers other details. Try it out and let us know what you think on the Zenko forums.
It’s easy to make mistakes when developing multi-cloud applications, even when only dealing with object storage APIs. Amazon S3 and Azure Blob Storage follow similar models but with differing semantics and APIs, as does the Google Cloud Storage API. Amazon S3 is a RESTful API providing command syntax for create (PUT), access (GET), and delete (DELETE) operations on both buckets and objects, plus access to bucket metadata (HEAD).
Applications that need to support both APIs have to be developed very carefully to handle all the corner cases and the clouds' differing implementations. Luckily, Zenko's team is dedicated to finding those corner cases and solving them once for everybody. Zenko CloudServer translates standard Amazon S3 calls to Azure Blob Storage, abstracting away the complexity. The design philosophy of CloudServer's translations is:
S3 API calls follow the Amazon S3 API specification for mandatory and optional headers, and for response and error codes.
The Azure Blob Storage container is created when the application calls S3 PUT bucket, and the container is assigned the name given in the PUT bucket request.
Bucket names must follow AWS naming conventions and limitations.
Bucket naming restrictions are similar but not the same.
CloudServer returns an InvalidBucketName error for a bucket name containing ".", even though such names are allowed on AWS S3.
Canned ACLs can be sent as part of the header in an AWS S3 bucket put call.
CloudServer uses the Azure metadata header x-ms-meta-scality_md_x-amz-acl to store canned ACLs on Azure containers.
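As an illustration (the ACL value is just an example), a canned ACL supplied on the bucket put is persisted as container metadata roughly like this:

S3 request header:         x-amz-acl: public-read
Azure container metadata:  x-ms-meta-scality_md_x-amz-acl: public-read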
Get Bucket / List Blobs
The AWS S3 “Marker” parameter expects an object key value, but Azure has no way to retrieve object listings alphabetically after a given key name (it can only retrieve blobs after an opaque continuation token).
AWS S3 sends back the object owner in each listing entry's XML, but Azure does not include object owner information in listings.
Delete Bucket / Delete Container
While AWS S3 returns an error if a bucket is non-empty, Azure deletes containers regardless of their contents. Zenko CloudServer therefore lists the blobs in the container first and returns the AWS S3 BucketNotEmpty error if the container is not empty.
Put Object / Put Blob
CloudServer only allows canned ACLs, except aws-exec-read and log-delivery-write. ACLs are stored as blob metadata. On the Azure side, there are no object-level ACLs, so behavior is based on container settings.
Only the STANDARD "storage class" setting is allowed.
Setting object-level encryption through headers is not allowed. The user must set encryption through Azure on an account basis.
Delete Object / Delete Blob
AWS S3 can delete specific object versions and offers an MFA requirement for deletes. The MFA header is not supported in CloudServer.
Get Service / ListContainers
AWS S3 returns a creation date in its listing, while Azure only stores the last-modified date.
Initiate Multipart Upload (MPU) / no Azure equivalent
An MPU is treated as a regular Put Blob call in Azure. CloudServer cannot allow users to initiate more than one MPU at a time because there is no efficient way to rename or copy a committed block blob to the correct name, and any uncommitted blocks on a blob are deleted when the block blob is committed (preventing an upload to the same key name). To support initiating an MPU, Zenko CloudServer creates a "hidden" blob with a unique prefix that is used for saving the metadata/ACL/storage class/encryption of the future object and for listing ongoing MPUs.
Put Part / Put Block
Azure has a size limit of 100 MB per block; AWS S3 has a maximum part size of 5 GB.
Azure also has a 50,000-block maximum per blob. At 100 MB per block, this comes out to around 5 TB, which is also the maximum size of an AWS S3 MPU. Putting the same part number to an MPU multiple times also risks running out of blocks before the 5 TB size limit is reached.
The easiest way to write multi-cloud applications is to use the open source projects Zenko and Zenko CloudServer.
Storing data in multiple clouds without a global metadata search engine is like storing unlabeled wine bottles on random shelves: the wine may be safe, but you'll never know which bottle is right for dinner. Using a single object storage system can already become complex, but when you start uploading files to multiple clouds, things can become an inextricable mess where nobody knows what is stored where. The good thing about object storage is that objects are usually stored with metadata that describes them. For example, a video production company can include details indicating that a video file is "production ready", which department produced the file, when the raw footage was taken, or which rockstar is featured in a video. The tags we used to identify pictures of melons in the Machine Box example are metadata, too.
Zenko offers a way to search metadata on objects stored across any cloud: whether your files are in Azure, Google Cloud, Amazon, Wasabi, DigitalOcean or Scality RING, you'll be able to find all the videos classified for production or all the images of watermelons.
The global metadata search capability is one of the core design principles of Zenko: one endpoint to control all your data, regardless of where it's stored. The first implementation used Apache Spark, but the team realized it wasn't performing as expected and switched to MongoDB. Metadata searches can be performed from the command line or from the Orbit graphical user interface. Both use a common SQL-like syntax to drive a MongoDB search.
The Metadata Search feature expands on the standard S3 GET Bucket API. It allows users to conduct metadata searches by adding the custom Zenko querystring parameter, search. The search parameter is structured as a pseudo-SQL WHERE clause and supports basic SQL operators, for example "A=1 AND B=2 OR C=3". More complex queries can also be made using the nesting operators "(" and ")".
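For instance (the metadata key is a made-up example), searching for objects whose user metadata x-amz-meta-color equals "blue" means URL-encoding that expression into the search querystring parameter:

GET /zenkobucket?search=x-amz-meta-color%3D%22blue%22 HTTP/1.1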
The search process is as follows:
1. Zenko receives a GET request containing a search parameter:
GET /bucketname?search=key%3Dsearch-item HTTP/1.1
Host: 127.0.0.1:8000
Date: Thu, 18 Oct 2018 17:50:00 GMT
Authorization: <authorization string>
2. CloudServer parses and validates the search string: if the search string is invalid, CloudServer returns an InvalidArgument error; if it is valid, CloudServer parses it and generates an abstract syntax tree (AST).
3. CloudServer passes the AST to the MongoDB backend as the query filter for retrieving the objects in a bucket that satisfy the requested search conditions.
4. CloudServer parses the filtered results and returns them as the response. Search results are structured the same as GET Bucket results:
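For reference, here is a truncated sketch of the standard ListBucketResult XML that such a response follows (values are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>zenkobucket</Name>
  <Contents>
    <Key>search-item</Key>
    <LastModified>2018-10-18T17:50:00.000Z</LastModified>
    <Size>1024</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>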
You can perform metadata searches by entering a query in the Orbit Search tool or by using the search_bucket tool. The S3 Search tool is an extension to the standard AWS S3 API. S3 Search is MongoDB-native and addresses S3 search through queries encapsulated in a SQL WHERE predicate. It uses Perl-Compatible Regular Expression (PCRE) search syntax. In the following examples, Zenko is accessible at the endpoint http://127.0.0.1:8000 and contains the bucket zenkobucket.
$ node bin/search_bucket -a accessKey1 -k verySecretKey1 -b zenkobucket -q '`last-modified` LIKE "2018-03-23.*"' -h 127.0.0.1 -p 8000
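A search on user-defined metadata follows the same pattern; assuming objects were stored with a hypothetical x-amz-meta-color header, the query could be:

$ node bin/search_bucket -a accessKey1 -k verySecretKey1 -b zenkobucket -q '`x-amz-meta-color`="blue"' -h 127.0.0.1 -p 8000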
Zenko’s global metadata search capabilities play a fundamental role in guaranteeing your freedom to choose the best cloud storage solution while keeping control of your data.
Since the TV show Silicon Valley brutally made fun of image recognition powered by Artificial Intelligence in the infamous hotdog-not-hotdog episode, I decided it was my mission to do something better than that. Who eats hot dogs anyway? I'm French, and at Scality 'Eat well' is one of our core values! I wanted an app that automatically sorted melons, a delicious summer fruit with low calories, high water content and fiber! I contacted the folks at Machine Box to get their help and used Zenko to do some magic. My mission was to train an algorithm to automatically tag images of melons based on their kind (watermelon, cantaloupe, etc.) and store them in Zenko with metadata for later retrieval. "Not funny!" screamed my colleagues, but I hadn't meant to be funny.
We manipulate and store lots of data without being able to efficiently search and retrieve it later on. Google Photos and other tools brought automatic image recognition to the consumer space, but at the cost of losing control of the data. The compromises that consumers accept are often not acceptable for corporations. AI tools like Machine Box can automatically add useful metadata to the content that is uploaded to your storage. With Zenko, that metadata gets indexed so you can quickly and easily search for the content you're looking for. I prepared a demo exploring this workflow:
Teach the TagBox application to recognize melon images and differentiate between watermelons, cantaloupes and honeydews
Upload new images to Zenko via the S3 API – the ones we want Machine Box to analyze and tag
Get the TagBox application to check each image directly via S3 and tag it with a melon type, a degree of confidence, and some default built-in tags that Machine Box recognizes and returns (e.g. "Food", "Plant", "Fruit")
Upload the resulting Machine Box metadata to the object in Zenko via S3
Use Zenko Orbit to browse the metadata and search for images with a confidence greater than 0.8 of being a watermelon (a rough CLI sketch follows this list)
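As a rough sketch of the upload-and-search steps above (the endpoint, bucket, key and metadata names are placeholders), the Machine Box tags can be written back to an existing object as S3 user metadata with the AWS CLI pointed at Zenko, and then found with the metadata search tool:

$ aws s3api copy-object --endpoint-url http://127.0.0.1:8000 --copy-source melons/watermelon-042.jpg --bucket melons --key watermelon-042.jpg --metadata fruit=watermelon,confidence=0.93 --metadata-directive REPLACE
$ node bin/search_bucket -a accessKey1 -k verySecretKey1 -b melons -q '`x-amz-meta-fruit`="watermelon"' -h 127.0.0.1 -p 8000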
It's a lot easier to understand the different phases of this integration example if you just watch the demo video: upload via the S3 API into Zenko, AI teaching, AI checking, metadata indexing and search.
Zenko's multi-cloud character lets you use any of the public cloud providers (Amazon, Azure, Google) or stay on-prem on a NAS or local object storage. With the same S3-based code, you can switch from an on-prem to an Amazon-based workflow just by choosing the bucket you want to use (associated with an Amazon, Azure, Google, etc. location).
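For example (the endpoint and bucket names are illustrative), the very same call works unchanged against Zenko; only the target bucket, and therefore the location it is associated with, changes:

$ aws s3api put-object --endpoint-url http://zenko.example.com --bucket onprem-bucket --key report.pdf --body report.pdf
$ aws s3api put-object --endpoint-url http://zenko.example.com --bucket aws-backed-bucket --key report.pdf --body report.pdf

Here onprem-bucket would be associated with a local location and aws-backed-bucket with an AWS location, but the application code does not change.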
Zenko can be deployed on a managed Kubernetes cluster on Google Cloud (GKE) using the Helm charts distributed in its repository. There are many other ways to run Zenko on a Kubernetes cluster, including our favorite Kubernetes distribution, MetalK8s. The Helm charts are designed to separate how Zenko is deployed from where it is deployed: any Kubernetes cluster is good enough to get started. In that way, the charts give developers the same kind of freedom Zenko itself aims to provide: the freedom to choose the best cloud storage system, abstracting away complex choices like supporting multiple APIs or aggregating metadata. GKE is an easy way to quickly set up a cloud-agnostic storage platform.
The first step is to start a new Kubernetes cluster following the instructions in the Google Cloud documentation. For good performance, you'll need a cluster of three nodes with 2 vCPUs and 7.5 GB of RAM each. Once the cluster is running, connect to it and install Helm.
Create Role For Tiller
Google Kubernetes Engine requires Role-Based Access Control to be set up, so the first step is to create a service account for Tiller. Before that, check that the cluster nodes are ready:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-1-default-pool-9ad69bcf-4g2n Ready <none> 1m v1.8.10-gke.0
gke-cluster-1-default-pool-9ad69bcf-frj5 Ready <none> 1m v1.8.10-gke.0
gke-cluster-1-default-pool-9ad69bcf-rsbt Ready <none> 1m v1.8.10-gke.0
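With the nodes ready, the Tiller service account can be created and bound to a cluster role; a sketch for a Helm 2 setup (the cluster-admin binding is a common convention here, adjust it to your security requirements) looks like this:

$ kubectl create serviceaccount tiller --namespace kube-system
$ kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller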
Install Helm on Kubernetes Cluster
Helm is not available by default on GKE and needs to be installed.
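Assuming Helm 2 (implied by the Tiller role above), installation typically comes down to initializing Helm with that service account:

$ helm init --service-account tiller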
Clone Zenko’s repo and go into the charts directory:
$ git clone https://github.com/scality/Zenko.git
$ cd ./Zenko/charts
Once you have the repo cloned you can retrieve all dependencies:
$ helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
"incubator" has been added to your repositories
$ helm dependency build zenko/
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "incubator" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 8 charts
Downloading prometheus from repo https://kubernetes-charts.storage.googleapis.com/
Downloading mongodb-replicaset from repo https://kubernetes-charts.storage.googleapis.com/
Downloading redis from repo https://kubernetes-charts.storage.googleapis.com/
Downloading kafka from repo http://storage.googleapis.com/kubernetes-charts-incubator
Downloading zookeeper from repo http://storage.googleapis.com/kubernetes-charts-incubator
Deleting outdated charts
With your dependencies built, you can run the following shell command to deploy a three-node Zenko stack with Orbit enabled.
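The exact values depend on the chart version, so take this as a minimal sketch (the release name is a placeholder; Orbit-related settings can be overridden with --set or a custom values file, using the key names defined in the chart):

$ helm install --name zenko zenko/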
Scality's Release Engineering team has completed the integration of all Zenko-related repositories into its GitWaterFlow delivery model, like all other Scality products. The GitWaterFlow model was introduced years ago at Scality to increase release quality and development speed.
You may have noticed that the Cloudserver and Backbeat repositories now default to a branch called development/8.0. The repositories also contain new directories called eve, and pull requests contain comments from the bot Bert-E. What's going on? That's the GitWaterFlow process in action. To understand it, we need a little history…
At the start, the small team of Scality engineers developing the RING product used CVS and later Subversion in an ad-hoc fashion, and collaboration happened 'on the spot'. The engineering team pushed features and bug fixes into a shared trunk branch. This branch was slated to become the next 'major' release of the product, though overeager integration of partially-delivered features often left the branch in a non-shippable state.
The process of porting bug fixes to the relevant branches ('back-porting') was fully manual. When the change rate of the codebase reached a certain level, this became a bottleneck in the development process. As with all manual processes, it was also prone to introducing accidental bugs or regressions. Creating backport commits on various version branches also destroyed the relationships between semantically equivalent changesets, which could only be recovered from information kept in commit messages or the ticketing system, again relying on humans doing the right thing.
Introducing the GitWaterFlow model
Fig. 1. Forward-porting patches on multiple development branches
That approach had too many flaws and didn't scale. The Release Engineering team investigated options to radically change the approach, easing the workflow for developers as well as ensuring the correctness of meta-information. The results were announced at the end of 2016 and have kept improving since then.
GitWaterFlow (GWF) is a combination of a branching model and its associated tooling, featuring a transactional view of multi-branch changesets supported by none of the tools and models previously described. GWF bans "backporting" in favor of "(forward) porting". The term "porting" describes the act of developing a changeset on an old, yet active, version branch and subsequently merging it into newer ones. It is considered better than "backporting" for multiple reasons, and it also makes merge automation trivial: changes merged into an old version branch, whether fixes or improvements, must also land in newer ones, otherwise there is a risk of regression. A bot can use this assumption to prepare and then execute the merge on newer branches, offloading the developer.
Development Branches
GWF comes with a versioning scheme inspired by semantic versioning (semver). Version numbers take the form major.minor.patch: patch is incremented only when backward-compatible bug fixes are added, minor is incremented when backward-compatible features are added, and major is incremented for backward-incompatible changes.
In GWF, every living minor version has a corresponding development/major.minor branch, each of which must be included in newer ones. In fig. 1, development/1.0 is included in development/1.1, which in turn is included in development/2.0. Consequently, a GWF-compliant repository has a waterfall-like representation, hence the name "GitWaterFlow".
As GWF is based on 'porting', feature branches do not necessarily start from the latest development branch. In fact, before starting to code, a developer must determine the oldest development/* branch his code should land on (refer to fig. 1.a). Once ready to merge, the developer creates a pull request targeting the development branch from which he started. A gating and merging bot ensures that the feature branch is merged not only into the destination but also into all subsequent development branches.
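As an illustrative sketch (branch and fix names are made up), a fix that must land on 1.0 and everything newer starts from the oldest affected development branch:

$ git checkout -b bugfix/fix-listing origin/development/1.0
  # ...commit and test the fix locally...
$ git push -u origin bugfix/fix-listing
  # open a pull request targeting development/1.0; the bot then carries the change
  # forward into development/1.1 and development/2.0 automatically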
Transactional Multi-Branch Changes
The fact that every pull request can concurrently target more than one mainline branch can dramatically change how developers address issues. For instance, it is not uncommon for conflicts to exist between the feature branch targeting version n and version n+1. In our setup, this class of conflicts must be detected and fixed prior to merging the pull request. The code that resolves such conflicts is considered part of the change and must be reviewed at the same time. It is also a requirement that a pull request be merged only once it has passed the tests on all targeted versions.
In short, the changes brought to the software on multiple branches are a single entity and should be developed, reviewed, tested, and merged as such.
Only The Bot Can Merge
Bert-E is the gatekeeping and merging bot Scality developed in-house to automate GWF; its purpose is to help developers merge their feature branches into multiple development branches. The tool is written in Python and designed to function as a stateless, idempotent bot. It is triggered via Bitbucket/GitHub webhooks on each pull request event (creation, commit, peer approval, comment, etc.).
Bert-E helps the developer prepare his pull request for merging. It interacts directly with the developer through GitHub's (or Bitbucket's) comment system in the pull request timeline, posting contextualized messages on the current status and the next expected actions. In Scality's case, Bert-E ensures that the pull request has at least two approvals from peers before it merges the contribution. In the future, Bert-E will also check that the JIRA fixVersion field is correct for the target branches, to help product managers keep track of progress. Bert-E usually replies in less than 50 seconds, creating a trial-and-error process with a fast feedback loop that is ideal for onboarding newcomers to the ticketing process.
Integration Branches
In parallel with the previously described process, Bert-E begins trying to merge into the subsequent development branches by creating integration branches named w/major.minor/feature/foo, after both the originating feature branch and the target development branch (refer to fig. 1.b). Every time Bert-E is triggered, it checks that the w/* branches are ahead of both the feature branch and the corresponding development branches, updating them following the same process when this is not the case.
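For example (branch names are hypothetical), a feature/foo pull request targeting development/1.0 in the fig. 1 layout gets the integration branches:

w/1.1/feature/foo   (feature/foo merged with development/1.1)
w/2.0/feature/foo   (w/1.1/feature/foo merged with development/2.0)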
Every change on a w/* branch triggers a build/test session. When the pull request fulfills all the requirements described above, and the builds are green on all the w/* branches, Bert-E fast-forwards all the development branches to point to the corresponding w/* branches in an atomic transaction, as depicted in fig. 1.c.
Note that if another pull request is merged in the interim, Bert-E will not be able to push and must re-update its w/* branches and repeat the build/test process.
Better ‘Definition of DONE’ and Smoother Developer Process
After more than two years of use at Scality, we can testify that GWF's main benefit is its atomic multi-branch merge property. In this context, 'DONE' means merged and fully tested on all target branches, and there is no additional backporting phase in which it is discovered that the backport is more complex than the fix itself. Target-branch conflicts are detected early and dealt with prior to merging.
Peer reviews and approvals aside, the development process is smoother and allows the developer to push his changeset to completion without depending on third parties to merge. Changeset ownership reverts to the author and does not vary over time; the developer is responsible for it up until the merge.
Also, the metadata in the git repository is much clearer: a simple git branch --contains <commit> now indicates which branches a change has been merged into. Because of the gatekeeping, the development branches are always in a shippable state, which has greatly improved Scality's accuracy in predicting delivery dates. Hand in hand with that, the amount of overall engineering work in progress has been reduced since the GWF deployment, and as a direct result Scality is shipping faster.
Bert-E and Eve are open source
Source code and documentation for Bert-E and Eve are available on Bitbucket. If you have questions, please ask them on the Zenko forum.