Linux Containers and the Future Cloud

Linux-based container infrastructure is an emerging cloud technology based on fast and lightweight process virtualization. It provides its users with an environment as close as possible to a standard Linux distribution. As opposed to para-virtualization solutions (Xen) and hardware virtualization solutions (KVM), which provide virtual machines (VMs), containers do not create other instances of the operating system kernel. Because containers are more lightweight than VMs, you can achieve higher densities with containers than with VMs on the same host; practically speaking, you can deploy more instances of containers than of VMs on a given host.

Another advantage of containers over VMs is that starting and shutting down a container is much faster than starting and shutting down a VM. All containers under a host run under the same kernel, as opposed to virtualization solutions like Xen or KVM where each VM runs its own kernel. This shared-kernel constraint sometimes is considered a drawback, as is the fact that you cannot run BSD, Solaris, OS X or Windows in a Linux-based container.

The idea of process-level virtualization in itself is not new, and it already was implemented by Solaris Zones as well as BSD jails quite a few years ago. Other open-source projects implementing process-level virtualization have existed for several years. However, they required custom kernels, which was often a major setback. Full and stable support for Linux-based containers on mainstream kernels by the LXC project is relatively recent, as you will see in this article. This makes containers more attractive for the cloud infrastructure. More and more hosting and cloud services companies are adopting Linux-based container solutions. In this article, I describe some open-source Linux-based container projects and the kernel features they use, and show some usage examples. I also describe the Docker tool for creating LXC containers.

The underlying infrastructure of modern Linux-based containers consists mainly of two kernel features: namespaces and cgroups. There are six types of namespaces, which provide per-process isolation of the following operating system resources: mount points (MNT), hostname and domain name (UTS), interprocess communication resources (IPC), process IDs (PID), the network stack (network) and user and group IDs (user namespaces allow mapping of UIDs and GIDs between a user namespace and the global namespace of the host). By using network namespaces, for example, each process can have its own instance of the network stack (network interfaces, sockets, routing tables and routing rules, netfilter rules and so on).

Creating a network namespace is very simple and can be done with the following iproute command: ip netns add myns1. With the ip netns command, it also is easy to move a network interface from one network namespace to another, to monitor the creation and deletion of network namespaces, to find out to which network namespace a specified process belongs and so on. The MNT namespace behaves similarly: a filesystem mounted inside one mount namespace is not visible to processes in other namespaces. Likewise, with PID namespaces, running the ps command inside a given PID namespace shows only the processes that were created in that namespace.
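
To give a flavor of how this is used, here is a minimal sketch (the namespace, interface names and address are arbitrary examples) that creates a namespace, moves one end of a veth pair into it and inspects it:

ip netns add myns1                              # create a new network namespace
ip link add veth0 type veth peer name veth1     # create a veth pair on the host
ip link set veth1 netns myns1                   # move one end into the namespace
ip netns exec myns1 ip addr add 10.0.0.2/24 dev veth1
ip netns exec myns1 ip link set veth1 up
ip netns exec myns1 ip addr show                # only lo and veth1 are visible here
ip netns list                                   # list existing network namespaces
ip netns delete myns1                           # remove the namespace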

The cgroups subsystem provides resource management and accounting. It lets you define easily, for example, the maximum memory that a process may use. This is done by using cgroups VFS operations. The cgroups project was started by two Google developers, Paul Menage and Rohit Seth, back in 2006, and it initially was called “process containers”. Neither namespaces nor cgroups intervene in critical paths of the kernel, and thus they do not incur a high performance penalty, except for the memory cgroup, which can incur significant overhead under some workloads.
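
To make this concrete, here is a minimal sketch using the cgroup v1 VFS interface (the group name is an arbitrary example; on many distributions the memory controller is already mounted under /sys/fs/cgroup/memory, in which case the mount step is unnecessary):

mount -t cgroup -o memory memory /sys/fs/cgroup/memory                 # mount the memory controller
mkdir /sys/fs/cgroup/memory/mygroup                                    # create a new cgroup
echo 268435456 > /sys/fs/cgroup/memory/mygroup/memory.limit_in_bytes   # limit the group to 256MB
echo $$ > /sys/fs/cgroup/memory/mygroup/tasks                          # move the current shell into the group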

Linux-Based Containers

Basically, a container is a Linux process (or several processes) that has special features and that runs in an isolated environment, configured on the host. You might sometimes encounter terms like Virtual Environment (VE) and Virtual Private Server (VPS) for a container.

The features of this container depend on how the container is configured and on which Linux-based container is used, as Linux-based containers are implemented differently in several projects. I mention the most important ones in this article:

  • OpenVZ: the origins of the OpenVZ project are in a proprietary server virtualization solution called Virtuozzo, which originally was started by a company called SWsoft, founded in 1997. In 2005, a part of the Virtuozzo product was released as an open-source project, and it was called OpenVZ. Later, in 2008, SWsoft merged with a company called Parallels. OpenVZ is used for providing hosting and cloud services, and it is the basis of the Parallels Cloud Server. Like Virtuozzo, OpenVZ also is based on a modified Linux kernel. In addition, it has command-line tools (primarily vzctl) for management of containers, and it makes use of templates to create containers for various Linux distributions. OpenVZ also can run on some unmodified kernels, but with a reduced feature set. The OpenVZ project is intended to be fully mainlined in the future, but that could take quite a long time.
  • Google containers: in 2013, Google released the open-source version of its container stack, lmctfy (which stands for Let Me Contain That For You). Right now, it’s still in the beta stage. The lmctfy project is based on using cgroups. Currently, Google containers do not use the kernel namespaces feature, which is used by other Linux-based container projects, but using this feature is on the Google container project roadmap.
  • Linux-VServer: an open-source project that was first publicly released in 2001, it provides a way to partition resources securely on a host. The host should run a modified kernel.
  • LXC: the LXC (LinuX Containers) project provides a set of userspace tools and utilities to manage Linux containers. Many LXC contributors are from the OpenVZ team. As opposed to OpenVZ, it runs on an unmodified kernel. LXC is fully written in userspace and supports bindings in other programming languages like Python, Lua and Go. It is available in most popular distributions, such as Fedora, Ubuntu, Debian and more. Red Hat Enterprise Linux 6 (RHEL 6) introduced Linux containers as a technical preview. You can run Linux containers on architectures other than x86, such as ARM (there are several how-tos on the Web for running containers on Raspberry Pi, for example).

I also should mention the libvirt-lxc driver, with which you can manage containers. This is done by defining an XML configuration file and then running virsh start, virsh console and virsh destroy to run, access and destroy the container, respectively. Note that there is no common code between libvirt-lxc and the userspace LXC project.
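
For illustration, a minimal libvirt-lxc domain XML might look roughly like this (the name, memory size and init path are arbitrary examples), after which the container is handled with virsh against the LXC driver:

<domain type='lxc'>
  <name>myct</name>
  <memory>524288</memory>
  <os>
    <type>exe</type>
    <init>/sbin/init</init>
  </os>
  <devices>
    <console type='pty'/>
  </devices>
</domain>

virsh -c lxc:/// define myct.xml
virsh -c lxc:/// start myct
virsh -c lxc:/// console myct
virsh -c lxc:/// destroy myct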

LXC Container Management

First, you should verify that your host supports LXC by running lxc-checkconfig. If everything is okay, you can create a container by using one of several ready-made templates for creating containers. In lxc-0.9, there are 11 such templates, mostly for popular Linux distributions. You easily can tailor these templates according to your requirements, if needed. So, for example, you can create a Fedora container called fedoraCT with:


lxc-create -t fedora -n fedoraCT

 

The container will be created by default under /var/lib/lxc/fedoraCT. You can set a different path for the generated container by adding the --lxcpath PATH option.

The -t option specifies the name of the template to be used (fedora in this case), and the -n option specifies the name of the container (fedoraCT in this case). Note that you also can create containers of other distributions on Fedora, for example of Ubuntu (you need the debootstrap package for it). Not all combinations are guaranteed to work.

You can pass parameters to lxc-create after adding --. For example, you can create an older release of several distributions with the -R or -r option, depending on the distribution template. To create an older Fedora container on a host running Fedora 20, you can run:


lxc-create -t fedora -n fedora19 -- -R 19

 

You can remove the installation of an LXC container from the filesystem with:


lxc-destroy -n fedoraCT

 

For most templates, when a template is used for the first time, several required package files are downloaded and cached on disk under /var/cache/lxc. These files are used when creating a new container with that same template, and as a result, creating a container that uses the same template will be faster next time.

You can start the container you created with:


lxc-start -n fedoraCT

 

And stop it with:


lxc-stop -n fedoraCT

 

The signal used by lxc-stop is SIGPWR by default. In order to use SIGKILL in the earlier example, you should add -k to lxc-stop:


lxc-stop -n fedoraCT -k

 

You also can start a container as a dæmon by adding -d, and then log in to it with lxc-console, like this:


lxc-start -d -n fedoraCT
lxc-console -n fedoraCT

 

The first lxc-console that you run for a given container will connect you to tty1. If tty1 already is in use (because that’s the second lxc-console that you run for that container), you will be connected to tty2 and so on. Keep in mind that the maximum number of ttys is configured by the lxc.tty entry in the container configuration file.

You can make a snapshot of a non-running container with:


lxc-snapshot -n fedoraCT

 

This will create a snapshot under /var/lib/lxcsnaps/fedoraCT. The first snapshot you create will be called snap0; the second one will be called snap1 and so on. You can restore a snapshot at a later time with the -r option—for example:


lxc-snapshot -n fedoraCT -r snap0 restoredFedoraCT

 

You can list the snapshots with:


lxc-snapshot -L -n fedoraCT

 

You can display the running containers by running:


lxc-ls --active

 

Managing containers also can be done from scripts, using the language bindings mentioned earlier. For example, this short Python script starts the fedoraCT container:


#!/usr/bin/python3

import lxc

container = lxc.Container("fedoraCT")
container.start()

 

Container Configuration

A default config file is generated for every newly created container. This config file is created, by default, in /var/lib/lxc/<containerName>/config, but you can alter that using the --lxcpath PATH option. You can configure various container parameters, such as network parameters, cgroups parameters, device parameters and more. Here are some examples of popular configuration items for the container config file (a short sample config follows the list):

  • You can set various cgroups parameters by setting values to the lxc.cgroup.[subsystem name] entries in the config file. The subsystem name is the name of the cgroup controller. For example, configuring the maximum memory a container can use to be 256MB is done by setting lxc.cgroup.memory.limit_in_bytes to be 256MB.
  • You can configure the container hostname by setting lxc.utsname.
  • There are five types of network interfaces that you can set with the lxc.network.type parameter: empty, veth, vlan, macvlan and phys. Using veth is very common in order to be able to connect a container to the outside world. By using phys, you can move network interfaces from the host network namespace to the container network namespace.
  • There are features that can be used for hardening the security of LXC containers. You can prevent specified system calls from being invoked from within a container by setting a secure computing mode, or seccomp, policy with the lxc.seccomp entry in the configuration file. You also can remove capabilities from a container with the lxc.cap.drop entry. For example, setting lxc.cap.drop = sys_module will create a container without the CAP_SYS_MODULE capability. Trying to run insmod from inside this container will fail. You also can define AppArmor and SELinux profiles for your container. You can find examples in the LXC README and in man 5 lxc.conf.
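
Putting a few of these items together, the relevant part of a container config file might look roughly like this (the values, bridge name and capability are illustrative, and available keys can vary between LXC versions):

lxc.utsname = fedoraCT
lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.flags = up
lxc.cgroup.memory.limit_in_bytes = 256M
lxc.cap.drop = sys_module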

Docker

Docker is an open-source project that automates the creation and deployment of containers. Docker first was released in March 2013 under the Apache License Version 2.0. It started as an internal project at a Platform-as-a-Service (PaaS) company that was called dotCloud at the time and is now called Docker Inc. The initial prototype was written in Python; later the whole project was rewritten in Go, a programming language that was first developed at Google. In September 2013, Red Hat announced that it would collaborate with Docker Inc. for Red Hat Enterprise Linux and for the Red Hat OpenShift platform. Docker requires Linux kernel 3.8 (or above). On RHEL systems, Docker runs on the 2.6.32 kernel, as the necessary patches have been backported.

Docker utilizes the LXC toolkit and as such is currently available only for Linux. It runs on distributions like Ubuntu 12.04, 13.04; Fedora 19 and 20; RHEL 6.5 and above; and on cloud platforms like Amazon EC2, Google Compute Engine and Rackspace.

Docker images can be stored on a public repository and can be downloaded with the docker pull command—for example, docker pull ubuntu or docker pull busybox.

To display the images available on your host, you can use the docker images command. You can narrow the command to a specific type of image (fedora, for example) with docker images fedora.

On Fedora, running a Fedora docker container is simple; after installing the docker-io package, you simply start the docker dæmon with systemctl start docker, and then you can start a Fedora docker container with docker run -i -t fedora /bin/bash.
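
Putting those steps together, a first session on Fedora might look like this (a sketch; package and image availability can vary between releases):

yum install docker-io              # install Docker on Fedora
systemctl start docker             # start the docker daemon
docker pull fedora                 # download the Fedora image from the public registry
docker run -i -t fedora /bin/bash  # start an interactive Fedora container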

Docker has git-like capabilities for handling containers. Changes you make in a container are lost if you destroy the container, unless you commit your changes (much like you do in git) with docker commit <containerId> <containerName/containerTag>. These images can be uploaded to a public registry, and they are available for downloading by anyone who wants to download them. Alternatively, you can set up a private Docker repository.
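
For example, after changing a running container, committing it could look roughly like this (the container ID and image name are placeholders):

docker ps                                           # find the ID of the running container
docker commit d6c05a25cba7 myuser/fedora-modified   # save the container's changes as a new image
docker images                                       # the new image now appears in the local list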

Docker is able to create snapshots using the kernel device mapper feature. In earlier versions, before Docker version 0.7, this was done using AUFS (a union filesystem). Docker 0.7 adds “storage plugins”, so people can switch between device mapper and AUFS (if their kernel supports it), which allows Docker to run on RHEL releases that do not support AUFS.

You can create images by running commands manually and committing the resulting container, but you also can describe them with a Dockerfile. Just like a Makefile will compile code into a binary executable, a Dockerfile will build a ready-to-run container image from simple instructions. The command to build an image from a Dockerfile is docker build. There is a tutorial about Dockerfiles and their command syntax on the Docker Web site. For example, the following short Dockerfile is for installing the iperf package for a Fedora image:


FROM fedora
MAINTAINER Rami Rosen
RUN yum install -y iperf

 

You can upload and store your images for free on the Docker public index. Just like with GitHub, storing public images is free and just requires you to register an account.
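
Assuming the Dockerfile shown above is in the current directory, building the image and pushing it to the public index could look roughly like this (the account and repository names are placeholders):

docker build -t rami/fedora-iperf .   # build an image from the Dockerfile in the current directory
docker login                          # log in with your registry account
docker push rami/fedora-iperf         # upload the image to the public index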

The Checkpoint/Restore Feature

The CRIU (Checkpoint/Restore In Userspace) project is implemented mostly in userspace, with more than 100 small patches scattered in the kernel to support it. There were several attempts to implement Checkpoint/Restore solely in kernel space, some of them by the OpenVZ project. The kernel community rejected all of them though, as they were too complex.

The Checkpoint/Restore feature enables saving a process state in several image files and restoring this process from the point at which it was frozen, on the same host or on a different host at a later time. This process also can be an LXC container. The image files are created using Google’s protocol buffer (PB) format. The Checkpoint/Restore feature enables performing maintenance tasks, such as upgrading a kernel or hardware maintenance on that host after checkpointing its applications to persistent storage. Later on, the applications are restored on that host.

Another feature that is very important in HPC is load balancing using live migration. The Checkpoint/Restore feature also can be used for creating incremental snapshots, which can be used after a crash occurs. As mentioned earlier, some kernel patches were needed for supporting CRIU; here are some of them:

  • A new system call named kcmp() was added; it compares two processes to determine if they share a kernel resource.
  • A socket monitoring interface called sock_diag was added to UNIX sockets in order to be able to find the peer of a UNIX domain socket. Before this change, the ss tool, which relied on parsing of /proc entries, did not show this information.
  • A TCP connection repair mode was added.
  • A procfs entry was added (/proc/PID/map_files).

Let’s look at a simple example of using the criu tool. First, you should check whether your kernel supports Checkpoint/Restore, by running criu check --ms. Look for a response that says "Looks good."

Basically, checkpointing is done by:


criu dump -t <pid>

 

You can specify a folder where the process state files will be saved by adding -D folderName.

You can restore with criu restore <pid>.
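
Putting the pieces together, a checkpoint and a later restore might look roughly like this (the PID and directory are arbitrary examples, and the exact options accepted can vary between criu versions):

criu dump -t 7264 -D /tmp/criu-imgs     # checkpoint process 7264 into /tmp/criu-imgs
criu restore 7264 -D /tmp/criu-imgs     # later, restore the process from the saved images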

Summary

In this article, I’ve described what Linux-based containers are, and I briefly explained the underlying cgroups and namespaces kernel features. I have discussed some Linux-based container projects, focusing on the promising and popular LXC project. I also looked at the LXC-based Docker engine, which provides an easy and convenient way to create and deploy LXC containers. Several hands-on examples showed how simple it is to configure, manage and deploy LXC containers with the userspace LXC tools and the Docker tools.

Due to the advantages of the LXC and the Docker open-source projects, and due to the convenient and simple tools to create, deploy and configure LXC containers, as described in this article, we presumably will see more and more cloud infrastructures that will integrate LXC containers instead of using virtual machines in the near future. However, as explained in this article, solutions like Xen or KVM have several advantages over Linux-based containers and still are needed, so they probably will not disappear from the cloud infrastructure in the next few years.

Acknowledgements

Thanks to Jérôme Petazzoni from Docker Inc. and to Michael H. Warfield for reviewing this article.

Resources

Google Containers: https://github.com/google/lmctfy

OpenVZ: http://openvz.org/Main_Page

Linux-VServer: http://linux-vserver.org

LXC: http://linuxcontainers.org

libvirt-lxc: http://libvirt.org/drvlxc.html

Docker: https://www.docker.io

Docker Public Registry: https://index.docker.io

(Via LinuxJournal.com)

AWS OpsWorks in the Virtual Private Cloud

I am pleased to announce support for using AWS OpsWorks with Amazon Virtual Private Cloud (Amazon VPC). AWS OpsWorks is a DevOps solution that makes it easy to deploy, customize and manage applications. OpsWorks provides helpful operational features such as user-based ssh management, additional CloudWatch metrics for memory and load, automatic RAID volume configuration, and a variety of application deployment options. You can optionally use the popular Chef automation platform to extend OpsWorks using your own custom recipes. With VPC support, you can now take advantage of the application management benefits of OpsWorks in your own isolated network. This allows you to run many new types of applications on OpsWorks.

For example, you may want a configuration like the following, with your application servers in a private subnet behind a public Elastic Load Balancer (ELB). This lets you control access to your application servers. Users communicate with the Elastic Load Balancer which then communicates with your application servers through the ports you define. The NAT allows your application servers to communicate with the OpsWorks service and with Linux repositories to download packages and updates.

To get started, we’ll first create this VPC. For a shortcut to create this configuration, you can use a CloudFormation template. First, navigate to the CloudFormation console and select Create Stack. Give your stack a name, provide the template URL http://cloudformation-templates-us-east-1.s3.amazonaws.com/OpsWorksinVPC.template and select Continue. Accept the defaults and select Continue. Create a tag with a key of “Name” and a meaningful value. Then create your CloudFormation stack.
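
If you prefer the command line to the console, roughly the same stack can be created with the AWS CLI (the stack name and tag value here are arbitrary examples):

aws cloudformation create-stack \
    --stack-name OpsWorksVPCDemo \
    --template-url http://cloudformation-templates-us-east-1.s3.amazonaws.com/OpsWorksinVPC.template \
    --tags Key=Name,Value=OpsWorksVPCDemo

aws cloudformation describe-stacks --stack-name OpsWorksVPCDemo   # watch for CREATE_COMPLETE and read the outputs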

When your CloudFormation stack’s status shows “CREATE_COMPLETE”, take a look at the outputs tab; it contains several IDs that you will need later, including the VPC and subnet IDs.

You can now create an OpsWorks stack to deploy a sample app in your new private subnet. Navigate to the AWS OpsWorks console and click Add Stack. Select the VPC and private subnet that you just created using the CloudFormation template.

Next, under Add your first layer, click Add a layer. In the Layer type box, select PHP App Server. Select the Elastic Load Balancer created by the CloudFormation template for the layer and then click Add layer.

Next, in the layer’s Actions column click Edit. Scroll down to the Security Groups section and select the Additional Group with OpsWorksSecurityGroup in the name. Click the + symbol, then click Save.

Next, in the navigation pane, click Instances, accept the defaults, and then click Add an Instance. This creates the instance in the default subnet you set when you created the stack.

Under PHP App Server, in the row that corresponds to your instance, click start in the Actions column.

You are now ready to deploy a sample app to the instance you created. An app represents code you want to deploy to your servers. That code is stored in a repository, such as Git or Subversion. For this example, we’ll use the SimplePHPApp application from the Getting Started walkthrough. First, in the navigation pane, click Apps. On the Apps page, click Add an app. Type a name for your app, scroll down to Repository URL and set it to git://github.com/amazonwebservices/opsworks-demo-php-simple-app.git, and set Branch/Revision to version1. Accept the defaults for the other fields.

When all the settings are as you want them, click Add app. When you first add a new app, it isn’t deployed yet to the instances for the layer. To deploy your app to the instance in the PHP App Server layer, under Actions, click Deploy.

Once your deployment has finished, in the navigation pane, click Layers. Select the Elastic Load Balancer for your PHP App Server layer. The ELB page shows the load balancer’s basic properties, including its DNS name and the health status of the associated instances. A green check indicates the instance has passed the ELB health checks (this may take a minute). You can then click on the DNS name to connect to your app through the load balancer.

You can try these new features with a few clicks of the AWS Management Console. To learn more about how to launch OpsWorks instances inside a VPC, see the AWS OpsWorks Developer Guide.

You may also want to sign up for our upcoming AWS OpsWorks Webinar on September 12, 2013 at 10:00 AM PT. The webinar will highlight common use cases and best practices for how to set up AWS OpsWorks and Amazon VPC.

– Chris Barclay, Senior Product Manager

(source: Amazon Web Services blog)

 

More Database Power – 20,000 IOPS for MySQL With the CR1 Instance

If you are a regular reader of this blog, you know that I am a huge fan of the Amazon Relational Database Service (RDS). Over the course of the last couple of years, I have seen that my audiences really appreciate the way that RDS obviates a number of tedious yet mission-critical tasks that are an inherent responsibility when running a relational database. There’s no joy to be found in keeping operating systems and database engines current, creating and restoring backups, scaling hardware up and down, or creating an architecture that provides high availability.

Today we are making RDS even more powerful by adding a new high-end database instance class. The new db.cr1.8xlarge instance type gives you plenty of memory, CPU power, and network throughput to allow your MySQL 5.6 applications to perform at up to 20,000 IOPS. This is a 60% improvement over the previous high-water mark of 12,500 IOPS and opens the door to database-driven applications that are even more demanding than before. Here are the specs:

  • 64-bit platform
  • 244 GB of RAM
  • 88 ECU (32 hyperthreaded virtual cores, each delivering 2.75 ECU)
  • High-performance networking

This new instance type is available in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions and you can start using it today!

(source: Amazon Web Services blog)

OpenStack Grizzly Architecture

As OpenStack has continued to mature, it has become more complicated in some ways but radically simplified in others. From a deployer’s view, each service has become easier to deploy, with more sensible defaults and the proliferation of cloud distributions. However, the architect’s view of OpenStack has actually gotten more complicated – new services have been added and new ways of integrating them are now feasible.

As an aid to architects that are new to OpenStack, this post updates my OpenStack Folsom Architecture blog and revisits my Intro to OpenStack Architecture (Grizzly Edition) presentation from Portland, with a few clarifications and updates.

OpenStack Components

There are currently seven core components of OpenStack: Compute, Object Storage, Identity, Dashboard, Block Storage, Network and Image Service. Let’s look at each in turn:

  • Object Store (codenamed “Swift“) allows you to store or retrieve files (but not mount directories like a fileserver). Several companies provide commercial storage services based on Swift. These include KT, Rackspace (from which Swift originated) and Hewlett-Packard. Swift is also used internally at many large companies to store their data.
  • Image Store (codenamed “Glance“) provides a catalog and repository for virtual disk images. These disk images are most commonly used in OpenStack Compute.
  • Compute (codenamed “Nova“) provides virtual servers upon demand. Rackspace and HP provide commercial compute services built on Nova, and it is used internally at companies like Mercado Libre, Comcast, Best Buy and NASA (where it originated).
  • Dashboard (codenamed “Horizon“) provides a modular web-based user interface for all the OpenStack services. With this web GUI, you can perform most operations on your cloud like launching an instance, assigning IP addresses and setting access controls.
  • Identity (codenamed “Keystone“) provides authentication and authorization for all the OpenStack services. It also provides a service catalog of services within a particular OpenStack cloud.
  • Network (which used to be named “Quantum” but is in the process of being renamed due to a trademark issue) provides “network connectivity as a service” between interface devices managed by other OpenStack services (most likely Nova). The service works by allowing users to create their own networks and then attach interfaces to them. Quantum has a pluggable architecture to support many popular networking vendors and technologies.
  • Block Storage (codenamed “Cinder“) provides persistent block storage to guest VMs. This project was born from code originally in Nova (the nova-volume service, which has been deprecated). While this was originally a block-storage-only service, it has been extended to NFS shares.

In addition to these core projects, there are also a number of non-core projects that will be included in future OpenStack releases.

Conceptual Architecture

The OpenStack project as a whole is designed to “deliver a massively scalable cloud operating system.” To achieve this, each of the constituent services is designed to work with the others to provide a complete Infrastructure as a Service (IaaS). This integration is facilitated through public application programming interfaces (APIs) that each service offers (and in turn can consume). While these APIs allow each of the services to use another service, they also allow an implementer to switch out any service as long as the API is maintained. These are (mostly) the same APIs that are available to end users of the cloud.

Conceptually, you can picture the relationships between the services as so:

OpenStack Grizzly Conceptual Architecture

  • Dashboard provides a web front end to the other OpenStack services
  • Compute stores and retrieves virtual disks (“images”) and associated metadata in the Image Store (“Glance”)
  • Network provides virtual networking for Compute.
  • Block Storage provides storage volumes for Compute.
  • Image Store can store the actual virtual disk files in the Object Store
  • All the services authenticate with Identity

This is a stylized and simplified view of the architecture, assuming that the implementer is using all of the services together in the most common configuration. However, OpenStack does not mandate an all-or-nothing approach. Many implementers only deploy the pieces that they need. For example, Swift is a popular object store for cloud service providers, even if they deploy another cloud compute infrastructure.

The diagram also only shows the “operator” side of the cloud — it does not picture how consumers of the cloud may actually use it. For example, many users will access object storage heavily (and directly).

Logical Architecture

As you can imagine, the logical architecture is far more complicated than the conceptual architecture shown above. As with any service-oriented architecture, diagrams quickly become “messy” trying to illustrate all the possible combinations of service communications. The diagram below illustrates the most common architecture of an OpenStack-based cloud. However, as OpenStack supports a wide variety of technologies, it does not represent the only architecture possible.

OpenStack Grizzly Logical Architecture

This picture is consistent with the conceptual architecture above in that:

  • End users interact through a common web interface or directly to each service through their API
  • All services authenticate through a common source (facilitated through Keystone)
  • Individual services interact with each other through their APIs (except where privileged administrator commands are necessary) — including the user’s web interface

In the sections below, we’ll delve into the architecture for each of the services.

Dashboard

Horizon is a modular Django web application that provides an end user and cloud operator interface to OpenStack services.

OpenStack Horizon Screenshot

The interface has user screens for:

  • Quota and usage information
  • Instances to operate cloud virtual machines
  • Volume management to control creation, deletion and connectivity to block storage
  • Image and snapshot to upload and control virtual images, which are used to back up and boot new instances
  • Access and security to manage keypairs and security groups (firewall rules)

In addition to the user screens, it also provides an interface for cloud operators. The operator interface sees across the entire cloud and adds some configuration focused screens such as:

  • Flavors to define service catalog offerings of CPU, memory and boot disk storage
  • Projects to provide logical groups of user accounts
  • Users to administer user accounts
  • System Info to view services running in the cloud and quotas applied to projects

The Grizzly edition of Horizon adds a few new features as well as significant refactoring to the user experience:

  • Networking (see the new network topology diagrams)
  • Direct image upload to Glance
  • Support for flavor extra specs
  • Migrate instances to other compute hosts
  • User experience improvements

The Horizon architecture is fairly simple. Horizon is usually deployed via mod_wsgi in Apache. The code itself is separated into a reusable python module with most of the logic (interactions with various OpenStack APIs) and presentation (to make it easily customizable for different sites).

From a network architecture point of view, this service will need to be customer accessible as well as be able to talk to each service’s public APIs. If you wish to use the administrator functionality (i.e. for other services), it will also need connectivity to their Admin API endpoints (which should not be customer accessible).

Compute

Nova is the most complicated and distributed component of OpenStack. A large number of processes cooperate to turn end user API requests into running virtual machines. Among Nova’s more prominent features are:

  • Starting, resizing, stopping and querying virtual machines (“instances”)
  • Assigning and removing public IP addresses
  • Attaching and detaching block storage
  • Adding, modifying and deleting security groups
  • Showing instance consoles
  • Snapshotting running instances

There are several changes to the architecture in this release. These changes include the deprecation of nova-network and nova-volume, as well as the decoupling of nova-compute from the database (through the no-compute-db feature). All of these changes are optional (the old code is still available to be used), but they are slated to disappear soon.

Below is a list of these processes and their functions:

  • nova-api is a family of daemons (nova-api, nova-api-os-compute, nova-api-ec2, nova-api-metadata or nova-api-all) that accept and respond to end user compute API calls. It supports the OpenStack Compute API, Amazon’s EC2 API and a special Admin API (for privileged users to perform administrative actions). It also initiates most of the orchestration activities (such as running an instance) as well as enforces some policy (mostly quota checks). Different daemons allow Nova to implement different APIs (Amazon EC2, OpenStack Compute, Metadata) or combinations of APIs (nova-api starts both the EC2 and OpenStack APIs).
  • The nova-compute process is primarily a worker daemon that creates and terminates virtual machine instances via hypervisor’s APIs (XenAPI for XenServer/XCP, libvirt for KVM or QEMU, VMwareAPI for VMware, etc.). New to the Grizzly release is the return of Hyper-V (thanks to the Cloudbase Solutions guys for the comment). The process by which it does so is fairly complex but the basics are simple: accept actions from the queue and then perform a series of system commands (like launching a KVM instance) to carry them out while updating state in the database through nova-conductor. Please note that the use of nova-conductor is optional in this release, but does greatly increase security.
  • The nova-scheduler process is conceptually the simplest piece of code in OpenStack Nova: it takes a virtual machine instance request from the queue and determines where it should run (specifically, which compute server host it should run on). In practice, it is now one of the most complex.
  • A new service called nova-conductor has been added to this release. It mediates access to the database for other daemons (only nova-compute in this release) to provide greater security.
  • The queue provides a central hub for passing messages between daemons. This is usually implemented with RabbitMQ today, but it could be any AMQP message queue (such as Apache Qpid) or ZeroMQ.
  • The SQL database stores most of the build-time and run-time state for a cloud infrastructure. This includes the instance types that are available for use, instances in use, networks available and projects. Theoretically, OpenStack Nova can support any database supported by SQLAlchemy, but the only databases currently in wide use are SQLite3 (only appropriate for test and development work), MySQL and PostgreSQL.
  • Nova also provides console services to allow end users to access their virtual instance’s console through a proxy. This involves several daemons (nova-console, nova-xvpvncproxy, nova-spicehtml5proxy and nova-consoleauth).

Nova interacts with many other OpenStack services: Keystone for authentication, Glance for images and Horizon for the web interface. The Glance interactions are central. The API process can upload and query Glance, while nova-compute will download images for use in launching instances.

Object Store

OpenStack’s Object Store (“Swift”) is designed to provide large scale storage of data that is accessible via APIs. Unlike a traditional file server, it is completely distributed, storing multiple copies of each object to achieve greater availability and scalability. Swift provides the following user functionality:

  • Stores and retrieves objects (files)
  • Sets and modifies metadata on objects (tags)
  • Versions objects
  • Serves static web pages and objects via HTTP. In fact, the diagrams in this blog post are being served out of Rackspace’s Swift service.

The swift architecture is very distributed to prevent any single point of failure as well as to scale horizontally. It includes the following components:

  • Proxy server (swift-proxy-server) accepts incoming requests via the OpenStack Object API or just raw HTTP. It accepts files to upload, modifications to metadata or container creation. In addition, it will also serve files or container listing to web browsers. The proxy server may utilize an optional cache (usually deployed with memcache) to improve performance.
  • Account servers manage accounts defined with the object storage service.
  • Container servers manage a mapping of containers (i.e., folders) within the object store service.
  • Object servers manage actual objects (i.e. files) on the storage nodes.

There are also a number of periodic processes that run to perform housekeeping tasks on the large data store. The most important of these is the replication service, which ensures consistency and availability through the cluster. Other periodic processes include auditors, updaters and reapers. Authentication for the object store service is handled through configurable WSGI middleware (which will usually be Keystone).

To learn more about Swift, head over to the SwiftStack website and read their OpenStack Swift Architecture.

Image Store

OpenStack Image Store centralizes virtual images for users and other cloud services:

  • Stores public and private images that users can utilize to start instances
  • Users can query and list available images for use
  • Delivers images to Nova to start instances
  • Snapshots from running instances can be stored so that virtual machines can be backed up

The Glance architecture has stayed relatively stable since the Cactus release.

  • glance-api accepts Image API calls for image discovery, image retrieval and image storage.
  • glance-registry stores, processes and retrieves metadata about images (size, type, etc.).
  • A database to store the image metadata. Like Nova, you can choose your database depending on your preference (but most people use MySQL or SQLite).
  • A storage repository for the actual image files. In the diagram above, Swift is shown as the image repository, but this is configurable. In addition to Swift, Glance supports normal filesystems, RADOS block devices, Amazon S3 and HTTP. Be aware that some of these choices are limited to read-only usage.

There are also a number of periodic processes that run on Glance to support caching. The most important of these is the replication service, which ensures consistency and availability through the cluster. Other periodic processes include auditors, updaters and reapers.

As you can see from the diagram in the Conceptual Architecture section, Glance serves a central role to the overall IaaS picture. It accepts API requests for images (or image metadata) from end users or Nova components and can store its disk files in the object storage service, Swift.

Identity

Keystone provides a single point of integration for OpenStack policy, catalog, token and authentication:

  • Authenticate users and issue tokens for access to services
  • Store users and tenants for role-based access control (RBAC)
  • Provides a catalog of the services (and their API endpoints) in the cloud
  • Create policies across users and services

Architecturally, Keystone is very simple:

  • keystone handles API requests as well as providing configurable catalog, policy, token and identity services.
  • Each Keystone function has a pluggable backend which allows different ways to use the particular service. Most support standard backends like LDAP or SQL, as well as Key Value Stores (KVS).

Most people will use this as a point of customization for their current authentication services.

Network

Quantum provides “network connectivity as a service” between interface devices managed by other OpenStack services (most likely Nova). It allows users to:

  • Create their own networks and then attach server interfaces to them
  • Take advantage of a pluggable backend architecture that supports commodity gear or vendor-supported equipment
  • Use extensions for additional network services, like load balancing

Like many of the OpenStack services, Quantum is highly configurable due to its plug-in architecture. These plug-ins accommodate different networking equipment and software. As such, the architecture and deployment can vary dramatically.

  • quantum-server accepts API requests and then routes them to the appropriate quantum plugin for action.
  • Quantum plugins and agents perform the actual work, such as plugging and unplugging ports, creating networks or subnets and IP addressing. These plugins and agents differ depending on the vendor and technologies used in the particular cloud. Quantum ships with plugins and agents for Cisco virtual and physical switches, the Nicira NVP product, NEC OpenFlow products, Open vSwitch, Linux bridging and the Ryu Network Operating System. Midokura also provides a plug-in for Quantum integration. The common agents are L3 (layer 3), DHCP (dynamic host IP addressing) and a vendor-specific plug-in agent.
  • Most Quantum installations will also make use of a messaging queue to route information between the quantum-server and the various agents, as well as a database to store networking state for particular plugins.

Quantum will interact mainly with Nova, where it will provide networks and connectivity for its instances. Florian Otel has written a very thorough article on implementing Open vSwitch if you are looking for an example of Quantum in action.

Block Storage

Cinder separates out the persistent block storage functionality that was previously part of OpenStack Compute (in the form of nova-volume) into its own service. The OpenStack Block Storage API allows for manipulation of volumes, volume types (similar to compute flavors) and volume snapshots:

  • Create, modify and delete volumes
  • Snapshot or backup volumes
  • Query volume status and metadata

Its architecture follows the Quantum model, which provides for a northbound API and vendor plugins underneath it.

  • cinder-api accepts API requests and routes them to cinder-volume for action.
  • cinder-volume acts upon the requests by reading or writing to the Cinder database to maintain state, interacting with other processes (like cinder-scheduler) through a message queue and acting directly upon the block storage providing hardware or software. It can interact with a variety of storage providers through a driver architecture. Currently, there are included drivers for IBM (XIV, Storwize and SVC), SolidFire, Scality, Coraid appliances, RADOS block storage (Ceph), Sheepdog, NetApp, Windows Server 2012 iSCSI, HP (LeftHand and 3PAR), Nexenta appliances, Huawei (T series and Dorado storage systems), Zadara VPSA, Red Hat’s GlusterFS, EMC (VNX and VMAX arrays), Xen and Linux iSCSI.
  • Much like nova-scheduler, the cinder-scheduler daemon picks the optimal block storage provider node to create the volume on.
  • cinder-backup is a new service that backs up the data from a volume (not a full snapshot) to a backend service. Currently, the only shipping backend service is Swift.
  • Cinder deployments will also make use of a messaging queue to route information between the cinder processes, as well as a database to store volume state.

Like Quantum, Cinder will mainly interact with Nova, providing volumes for its instances.

Future Projects

In the next version of OpenStack (“Havana” which is due in the Fall of 2013), two new projects will be brought into the fold:

  • Ceilometer is a metering project. The project offers metering information and the ability to code more ways to know what has happened on an OpenStack cloud. While it provides metering, it is not a billing project. A full billing solution requires metering, rating, and billing. Metering lets you know what actions have taken place, rating enables pricing and line items, and billing gathers the line items to create a bill to send to the consumer and collect payment. For users that also want a billing package, BillingStack is another open source project that provides payment gateway and other billing features. Ceilometer is available as a preview now.
  • Heat provides a REST API to orchestrate multiple cloud applications implementing standards such as AWS CloudFormation.

Looking beyond the “Havana” release, OpenStack is slated to see the addition of two more projects in the Spring of 2014 (for the newly named “Icehouse” release):

  • Reddwarf is a database as a service offering that provides MySQL databases within OpenVZ containers upon demand.
  • Ironic is the aptly named project that uses OpenStack to deploy bare metal servers instead of virtualized cloud instances.

There are also a number of related but unofficial projects:

  • Moniker, which provides DNS-as-a-service for OpenStack
  • Marconi, a message queueing service
  • Savanna, which provisions Hadoop clusters on OpenStack
  • Murano, which allows a non-experienced user to deploy reliable Windows-based environments
  • Convection, a task or workflow service to execute command sequences or long-running jobs

(Source: solinea.com)

 

Storm-YARN Released as Open Source

At Yahoo! we have worked on the convergence of Storm with Hadoop, as mentioned in our earlier post. We are pleased to announce that Storm-YARN has been released as open source. Storm-YARN enables Storm applications to utilize the computational resources in a Hadoop cluster along with accessing Hadoop storage resources such as HBase and HDFS.

Motivation
Collocating real-time processing with batch processing offers a number of advantages over segregated clusters.

  • It provides a huge potential for elasticity. Real-time processing will rarely produce a constant and predictable load. As such, Storm needs more resources to keep up with spikes in demand. Collocating Storm with batch processing allows Storm to steal resources from batch jobs when needed and give them back when demand subsides. The Storm-YARN effort lays the groundwork to make this possible.
  • Many applications use Storm for low-latency processing and Map/Reduce for batch processing while sharing data between Storm and Map/Reduce. By placing Storm physically closer to the data source and/or other components in the same pipeline we can reduce network transfers and in turn the total cost of acquiring the data.

Launch Storm Cluster
To launch a Storm cluster managed by YARN, you simply execute:

storm-yarn launch <storm-yarn.yaml>

storm-yarn.yaml is the standard storm configuration file with YARN specification parameters including master.initial-num-supervisors (the initial number of supervisors to be launched) and master.container.size-mb (the memory size of the container to be allocated for each supervisor).
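
A minimal storm-yarn.yaml might therefore look roughly like the following (the values shown are arbitrary examples, and the file can carry any other standard Storm settings as well):

master.initial-num-supervisors: 2    # supervisors launched when the cluster starts
master.container.size-mb: 5120       # memory allocated to each supervisor container
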
Figure 1 illustrates the execution of the storm-yarn command. Storm-YARN asks YARN’s Resource Manager to launch a Storm Application Master. The Application Master then launches a Storm nimbus server and a Storm UI server locally. It also uses YARN to find resources for the supervisors and launch them.

Figure 1: Launch Storm Cluster with Hadoop YARN

Execute Storm Topologies
You can communicate with the Storm cluster the same as with a standalone Storm cluster, through the storm command.

storm jar <topology_jar>

Because nimbus is running on a node picked by YARN, you may need to specify that node on the command line by setting the nimbus.host config.
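
For example (a sketch with placeholder host, jar and class names), one way is to record the nimbus host in your Storm client configuration and then submit the topology as usual:

echo 'nimbus.host: "yarn-node-17.example.com"' >> ~/.storm/storm.yaml   # point the storm client at the YARN-launched nimbus
storm jar mytopology.jar com.example.MyTopology                          # submit the topology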

As illustrated in Figure 2, each Storm supervisor will launch worker processes within its container. These Storm worker processes can access Hadoop datasets stored in HDFS, HBase and so on.

Figure 2 Submit and Execute Storm Topologies

Open Source Release
Yahoo! has decided to release Storm-YARN code under the Apache 2.0 License. The code is available at https://github.com/yahoo/storm-yarn. This alpha release enables members of the community to jointly make Storm-YARN a high-quality product. Please try it out and let us know what you think.

If you are interested in contributing, please feel free to submit proposals as issues, sign an Apache style CLA and contribute your code.

Additional details on Storm-YARN will be shared during our Storm-on-YARN: Convergence of Low-Latency and Big-Data talk at the 2013 Hadoop Summit North America on June 26, 2013, 11:20 am under the Future of Apache Hadoop track. We look forward to seeing you there.

Acknowledgement
Derek Dagit has implemented a significant portion of the Storm-YARN release. We thank him for making this early release available as open source.

Biography
Bobby Evans is a software developer at Yahoo! and a Hadoop PMC member at the Apache Software Foundation.

Andy Feng is a Distinguished Architect at Yahoo! and a core contributor to the Storm project. He leads the architecture design and development of a next-generation big-data platform, which supports a variety of application patterns (batch, micro-batch, streaming and query).

(via Yahoo.com)

 

Containers—Not Virtual Machines—Are the Future Cloud

Cloud infrastructure providers like Amazon Web Services sell virtual machines. EC2 is expected to surpass $1B in revenue this year. That’s a lot of VMs.

It’s not hard to see why there is such demand. You get the ability to scale up or down, guaranteed computational resources, security isolation and API access for provisioning it all, without any of the overhead of managing physical servers.

But you are also paying for a lot of increasingly avoidable overhead in the form of running a full-blown operating system image for each virtual machine. This approach has become an unnecessarily heavyweight solution to the underlying question of how to best run applications in the cloud.

Figure 1. Traditional virtualization and paravirtualization require a full operating system image for each instance.

Until recently it has been assumed that OS virtualization is the only path to provide appropriate isolation for applications running on a server. These assumptions are quickly becoming dated, thanks to recent underlying improvements to how the Linux kernel can now manage isolation between applications.

Containers now can be used as an alternative to OS-level virtualization to run multiple isolated systems on a single host. Containers within a single operating system are much more efficient, and because of this efficiency, they will underpin the future of the cloud infrastructure industry in place of VM architecture.

Figure 2. Containers can share a single operating system and, optionally, other binary and library resources.

How We Got Here

There is a good reason why we buy by the virtual machine today: containers used to be terrible, if they existed in any useful form at all. Let’s hop back to 2005 for a moment. “chroot” certainly didn’t (and still doesn’t) meet the resource and security isolation goals for multi-tenant designs. “nice” is a winner-takes-all scheduling mechanism. The “fair” resource scheduling in the kernel is often too fair, equally balancing resources between a hungry, unimportant process and a hungry, important one. Memory and file descriptor limits offer no gradient between normal operation and crashing an application that’s overstepped its boundaries.

Virtual machines were able to partition and distribute resources viably in the hypervisor without relying on kernel support or, worse, separate hardware. For a long time, virtual machines were the only way on Linux to give Application A up to 80% of CPU resources and Application B up to 20%. Similar partitioning and sharing schemes exist for memory, disk block I/O, network I/O and other contentious resources.

Virtual machines have made major leaps in efficiency too. What used to be borderline-emulation has moved to direct hardware support for memory page mapping and other hard-to-virtualize features. We’re down to a CPU penalty of only a few percent versus direct hardware use.

The Problem with VMs

Here are the penalties we currently pay for VMs:

  1. Running a whole separate operating system to get a resource and security isolation.
  2. Slow startup time while waiting for the OS to boot.

The OS often consumes more memory and more disk than the actual application it hosts. The Rackspace Cloud recently discontinued 256MB instances because it didn’t see them as practical. Yet, 256MB is a very practical size for an application if it doesn’t need to share that memory with a full operating system.

As for slow startup time, many cloud infrastructure users keep spare virtual machines around to accelerate provisioning. Virtual machines have taken the window from requesting to receiving resources down to minutes, but having to keep spare resources around is a sign that it’s still either too slow or not reliable enough.

So What’s the Difference?

VMs are fairly standardized; a system image running on one expects mostly the same things as if it had its own bare-metal computer. Containers are not very standardized in the industry as a whole. They’re very OS- and kernel-specific (BSD jails, Solaris Zones, Linux namespaces/cgroups). Even on the same kernel and OS, options may range from security sandboxes for individual processes to nearly complete systems.

VMs don’t have to run the same kernel or OS as the host and they obtain access to resources from the host over virtualized devices—like network cards—and network protocols. However, VMs are fairly opaque to the host system; the hypervisor has poor insight into what’s been written but freed in block storage and memory. They usually leverage hardware/CPU-based facilities for isolating their access to memory and appear as a handful of hypervisor processes on the host system.

Containers have to run the same kernel as the host, but they can optionally run a completely different package tree or distribution. Containers are fairly transparent to the host system. The kernel manages memory and filesystem access the same way as if it were running on the host system. Containers obtain access to resources from the host over normal userland/IPC facilities. Some implementations even support handing sockets from the host to the container of standard UNIX facilities.

VMs are also heavyweight. They require a full OS and system image, such as EC2’s AMIs. The hypervisor runs a boot process for the VM, often even emulating BIOS. They usually appear as a handful of hypervisor processes on the host system. Containers, on the other hand, are lightweight. There may not be any persistent files or state, and the host system usually starts the containerized application directly or runs a container-friendly init dæmon, like systemd. They appear as normal processes on the host system.

Figure 3. Virtualization decoupled provisioning from hardware deployment. Containers decouple provisioning from OS deployment and boot-up.

Containers Now Offer the Same Features as VMs, but with Minimal Overhead

Compared to a virtual machine, the overhead of a container is disruptively low. They start so fast that many configurations can launch on-demand as requests come in, resulting in zero idle memory and CPU overhead. A container running systemd or Upstart to manage its services has less than 5MB of system memory overhead and nearly zero CPU consumption. With copy-on-write for disk, provisioning new containers can happen in seconds.

Containers are how we at Pantheon maintain a consistent system architecture that spans free accounts up to clustered, highly available enterprise users. We manage an internal supply of containers that we provision using an API.

We’re not alone here; virtually every large PaaS system—including Heroku, OpenShift, dotCloud and Cloud Foundry—has a container foundation for running their platform tools. PaaS providers that do rely on full virtual machines have inflexibly high infrastructure costs that get passed on to their customers (which is the case for our biggest competitors in our market).

The easiest way to play with containers and experience the difference is by using a modern Linux OS—whether locally, on a server or even on a VM. There is a great tutorial on Fedora by Dan Walsh and another one with systemd’s nspawn tool that includes the UNIX socket handoff.

Evolving Containers in Open Source

While Pantheon isn’t in the business of providing raw containers as a service, we’re working toward the open-source technical foundations. It’s in the same spirit as why Facebook and Yahoo incubated Hadoop; it’s part of the foundation we need to build the product we want. We directly contribute as co-maintainers to systemd, much of which will be available in the next Red Hat Enterprise Linux release. We also send patches upstream to enable better service “activation” support so containers can run only when needed.

We have a goal that new installations of Fedora and other major distributions, like Ubuntu, provide out-of-the-box, standard API and command-line tools for managing containers, binding them to ports or IP addresses and viewing the resource reservation and utilization levels. This capability should eventually give other companies access to a large pool of container hosts and flavors, much as Xen opened the door to today’s IaaS services.

One way we’ll measure progress toward this goal is how “vanilla” our container host machines are. If we can prepare a container host system by just installing some packages and a certificate (or other PKI integration), we’ll have achieved it. The free, open-source software (FOSS) world will be stronger for it, and Pantheon will also benefit by having less code to maintain and broader adoption of container-centric computing.

There’s also an advantage to this sort of FOSS contribution versus Cloud Foundry, OpenShift or OpenStack, which are open-source PaaS and IaaS stacks. What Pantheon is doing with projects like systemd redefines how users manage applications and resources even on single machines. The number of potential users—and, since it’s FOSS, contributors—is orders of magnitude larger than for tools that require multi-machine deployments to be useful. It’s also more sustainable in FOSS to have projects where 99% of the value reaches 99% of a large, existing user base.

Efficiency demands a future of containers running on bare-metal hardware. Virtual machines have had their decade.

(via LinuxJournal.com)

 

Google Cloud Messaging Update Boosted by XMPP

Google’s huge update to its Cloud Messaging service is the biggest announcement from this year’s Google I/O, as it combines our two core strengths. We strongly believe this is a big deal for mobile developers, and we will explain why.

What is Google Cloud Messaging (GCM)?


In case you have never heard of it, here is a quick overview.

Google Cloud Messaging was announced last year as a replacement for Google Cloud to Device Messaging. It is a service that allows mobile developers to notify mobile devices (and also the Chrome browser) about important changes in the server-side component of an application. It makes it possible for the device to stay up to date without resorting to polling. Polling means checking your server periodically for updates, and it is bad for several reasons:

  • it consumes a lot of battery, periodically waking up the mobile device’s network stack.
  • it wastes a lot of server resources, as most polling requests lead to a “no update” reply.
  • it introduces latency, as the request for new data can happen long after the data becomes available (unless you use a very short polling interval, which is even worse for the battery). You may get a news alert hours after everyone else already knows about it, which defeats the purpose of the alert in the first place.

Push notifications avoid all of those drawbacks: they save battery life, lower latency and reduce server load. The device keeps a permanently connected session to Google’s servers and receives important notification messages as soon as they happen, in a battery-efficient way.

Because the service runs at the device level, it can be optimized across the applications running on the same device, yielding further battery savings.

So this is a really critical feature to implement in any application that relies on data coming from the network (which covers a lot of applications).

GCM offers outstanding features such as the following (a sketch of a request using them follows the list):

  • Support for two use cases:
    • “send to sync”: a short notification telling the client to fetch updates from the developer’s server.
    • “send data”: a larger notification containing the data payload itself, avoiding an extra round trip from the client to the developer’s server.
  • A time to live can be used to discard a message silently if the device did not come online before it expired. Pending messages can also be replaced using the collapse feature.
  • “Delay while idle” can be used for less urgent notifications; the device will receive them when it becomes active again.
  • Multicast messages, to send the same notification to up to 1,000 devices.
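
To make those features concrete, here is a rough sketch of a classic GCM HTTP request that uses them; the API key, registration IDs and field values are placeholders, chosen only for illustration:

    import json
    import urllib.request

    API_KEY = "YOUR_GCM_API_KEY"  # placeholder

    payload = {
        "registration_ids": ["device-reg-id-1", "device-reg-id-2"],  # multicast, up to 1,000 IDs
        "collapse_key": "score_update",   # pending messages with the same key are replaced
        "time_to_live": 3600,             # seconds; dropped if the device stays offline past this
        "delay_while_idle": True,         # deliver only once the device becomes active again
        "data": {"type": "send_to_sync"}, # or a larger payload for the "send data" use case
    }

    request = urllib.request.Request(
        "https://android.googleapis.com/gcm/send",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "key=" + API_KEY,
        },
    )
    response = urllib.request.urlopen(request)
    print(response.read().decode("utf-8"))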

What does the GCM update bring to developers?

The update is a brand-new service deployment that brings a large set of new features for developers.

In short, it brings:

  • Persistent connections between the developer’s back end and Google, allowing a larger number of notifications to be sent faster.
  • Upstream messaging: this allows the device to send notifications back to the developer’s server through the Google platform.
  • Notification synchronization between devices: basically, this allows a developer to remove a notification from a device when it has been read or processed on another device.

GCM Cloud Connection Service (CCS): Persistent connections using XMPP

Google chose XMPP to let developers keep a persistent connection. It works as follows:

  • You open an XMPP client connection to Google’s GCM XMPP server (port 5235 on gcm.googleapis.com).
  • You can then keep the secure connection open for as long as you wish.
  • You then send your usual GCM JSON payloads inside XMPP message stanzas (see the stanza sketch below).
  • This is a streaming protocol, so you receive possible errors asynchronously and can match them to the original push thanks to the XMPP message id.

This alone brings a huge performance increase, allowing your server to send up to 4,000 messages per second over a persistent connection. Since you are allowed up to 10 connections, you can send a very large number of notifications fast (up to 40,000 notifications per second).
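
As a rough sketch of what that looks like on the wire (connection setup and authentication are left to whatever XMPP library you use, and the registration ID below is a placeholder), the usual GCM JSON payload is wrapped in a <gcm> element inside a normal XMPP <message> stanza:

    import json
    import uuid
    from xml.sax.saxutils import escape

    # Build a CCS <message> stanza carrying the usual GCM JSON payload.
    # TLS connection to gcm.googleapis.com:5235 and authentication are assumed
    # to be handled by your XMPP library; this only shows the stanza itself.
    def build_ccs_stanza(registration_id, data):
        message_id = str(uuid.uuid4())  # lets you match async acks/errors to this push
        payload = {
            "to": registration_id,
            "message_id": message_id,
            "data": data,
        }
        stanza = (
            '<message id="{id}">'
            '<gcm xmlns="google:mobile:data">{body}</gcm>'
            "</message>"
        ).format(id=message_id, body=escape(json.dumps(payload)))
        return message_id, stanza

    msg_id, stanza = build_ccs_stanza("device-reg-id", {"alert": "Hello from CCS"})
    print(stanza)  # send this over the persistent XMPP stream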

A few things to note:

  • You cannot (yet) use multicast messages over an XMPP persistent connection.
  • However, you can mix HTTP connections and XMPP persistent connections to optimize performance depending on the use case.

Upstream messaging

Upstream messaging allows your client to send data asynchronously to your server. Compared to HTTP POSTs, it offers several advantages:

  • It is easy for the client developer, with a simple gcm.send call. This is “fire and forget”: the client developer sends the data, and the Android GCM framework saves it locally with the commitment to deliver it.
  • It is reliable, with reliability handled transparently by the GCM framework on the client. Acknowledgement mechanisms on the client and between GCM and your server ensure that no message is lost.
  • Send timing is optimized across mobile applications, which allows significant battery savings for messages that are not very time-sensitive. Optimization is even performed based on server-side downstream needs: if either the client or the server sends a message first, the queue in the other direction is flushed at the same time.
  • It supports time to live: if the message could not be sent because the network was unavailable and it is no longer relevant, it is discarded.

As a developer, your server will receive these messages through an XMPP connection. However, be very careful about your server’s efficiency: it has to be robust and read data fast, as the GCM server will only queue 100 messages for you before starting to overwrite them (and if you are offline for 24 hours, it will discard your messages).
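
For example, part of the acknowledgement mechanism mentioned above is that your server is expected to ack each upstream message it reads from the stream. A minimal sketch of that ack payload, following the CCS conventions documented at the time (field names should be treated as illustrative), could look like this:

    import json

    # 'upstream' is the parsed JSON found inside the <gcm> element of an incoming
    # <message> stanza; it carries the sender's registration ID and a message_id
    # that must be echoed back.
    def build_ack(upstream):
        return json.dumps({
            "to": upstream["from"],              # device registration ID
            "message_id": upstream["message_id"],
            "message_type": "ack",
        })

    # Wrapped in the same <gcm xmlns="google:mobile:data"> element as a downstream
    # message, this ack is sent back over the persistent XMPP connection.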

Multi-device notification synchronization

The goal of this feature is to keep notification state up to date and consistent across a user’s devices, by propagating state changes between the application’s installations on those devices.

This feature is designed to bring several benefits to the way multiple devices are handled.

First, to indicate that your app is used by the same user on several devices, you group those devices’ registration IDs under the same notification_key. The notification key is typically a hash of the user name. You are allowed to link up to 10 devices to a given notification_key.

Once you have done that, you can use the notification_key as a user ID to send notifications to all of a given user’s devices. By doing so, you let GCM perform lots of optimizations under the hood: the notification will be sent first to the active device (or the last active one), and the other devices will get it a bit later, with a “delay while idle” type of delivery.
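
A rough sketch of the two request bodies involved, following the user-notifications documentation of the time (the endpoint behavior, field names and IDs here are illustrative placeholders, not guaranteed to match the final API), might look like this:

    import json

    # Step 1 (sketch): group several registration IDs under one notification_key.
    create_key_request = {
        "operation": "create",
        "notification_key_name": "hash-of-username",            # e.g. a hash of the user name
        "registration_ids": ["reg-id-phone", "reg-id-tablet"],  # up to 10 devices
    }

    # Step 2 (sketch): send to the whole group by addressing the notification_key
    # instead of individual registration IDs.
    send_request = {
        "notification_key": "key-returned-by-step-1",
        "data": {"type": "send_to_sync"},
    }

    print(json.dumps(create_key_request, indent=2))
    print(json.dumps(send_request, indent=2))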

Once the notification has been processed and dismissed on one device, the other devices are notified so they can remove the notification as well. If they have not been notified yet, the notification is simply canceled from their delivery queue.

Conclusion

As you can see, this update to Google Cloud Messaging is a really big one for developers, increasing the number of situations in which the platform is relevant.

We are already working on supporting these improvements in our push notification platform to help you get all of their benefits.

Stay tuned for more information on this support very soon.

(Source: Process-one.net)