Containers—Not Virtual Machines—Are the Future Cloud

Cloud infrastructure providers like Amazon Web Services sell virtual machines. EC2 revenue is expected to surpass $1B this year. That’s a lot of VMs.

It’s not hard to see why there is such demand. You get the ability to scale up or down, guaranteed computational resources, security isolation and API access for provisioning it all, without any of the overhead of managing physical servers.

But you are also paying for a lot of increasingly avoidable overhead in the form of running a full-blown operating system image for each virtual machine. This approach has become an unnecessarily heavyweight solution to the underlying question of how to best run applications in the cloud.

Figure 1. Traditional virtualization and paravirtualization require a full operating system image for each instance.

Until recently it has been assumed that OS virtualization is the only path to provide appropriate isolation for applications running on a server. These assumptions are quickly becoming dated, thanks to recent underlying improvements to how the Linux kernel can now manage isolation between applications.

Containers can now be used as an alternative to full operating system virtualization to run multiple isolated systems on a single host. Containers within a single operating system are much more efficient, and because of this efficiency, they will underpin the future of the cloud infrastructure industry in place of VM architecture.

Figure 2. Containers can share a single operating system and, optionally, other binary and library resources.

How We Got Here

There is a good reason why we buy by the virtual machine today: containers used to be terrible, if they existed in any useful form at all. Let’s hop back to 2005 for a moment. “chroot” certainly didn’t (and still doesn’t) meet the resource and security isolation goals for multi-tenant designs. “nice” is a winner-takes-all scheduling mechanism. The “fair” resource scheduling in the kernel is often too fair, equally balancing resources between a hungry, unimportant process and a hungry, important one. Memory and file descriptor limits offer no gradient between normal operation and crashing an application that’s overstepped its boundaries.

Virtual machines were able to partition and distribute resources viably in the hypervisor without relying on kernel support or, worse, separate hardware. For a long time, virtual machines were the only way on Linux to give Application A up to 80% of CPU resources and Application B up to 20%. Similar partitioning and sharing schemes exist for memory, disk block I/O, network I/O and other contentious resources.
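
As a rough illustration of the 80%/20% split described above, here is a minimal Python sketch that turns relative weights into CPU fractions, the way a proportional-share scheduler divides a contended CPU. The weight values and the function name `cpu_fractions` are assumptions made for this example; they are not taken from any particular hypervisor or kernel interface.

```python
def cpu_fractions(weights):
    """Convert relative scheduling weights into fractions of total CPU."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one positive weight is required")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical weights: Application A gets four times the weight of B.
shares = cpu_fractions({"app_a": 800, "app_b": 200})
print(shares)  # {'app_a': 0.8, 'app_b': 0.2}
```

Only the ratio between the weights matters in this sketch, so 4 and 1 would give the same split as 800 and 200.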

Virtual machines have made major leaps in efficiency too. What used to be borderline-emulation has moved to direct hardware support for memory page mapping and other hard-to-virtualize features. We’re down to a CPU penalty of only a few percent versus direct hardware use.

The Problem with VMs

Here are the penalties we currently pay for VMs:

  1. Running a whole separate operating system to get resource and security isolation.
  2. Slow startup time while waiting for the OS to boot.

The OS often consumes more memory and more disk than the actual application it hosts. The Rackspace Cloud recently discontinued 256MB instances because it didn’t see them as practical. Yet, 256MB is a very practical size for an application if it doesn’t need to share that memory with a full operating system.

As for slow startup time, many cloud infrastructure users keep spare virtual machines around to accelerate provisioning. Virtual machines have taken the window from requesting to receiving resources down to minutes, but having to keep spare resources around is a sign that it’s still either too slow or not reliable enough.

So What’s the Difference?

VMs are fairly standardized; a system image running on one expects mostly the same things as if it had its own bare-metal computer. Containers are not very standardized in the industry as a whole. They’re very OS- and kernel-specific (BSD jails, Solaris Zones, Linux namespaces/cgroups). Even on the same kernel and OS, options may range from security sandboxes for individual processes to nearly complete systems.

VMs don’t have to run the same kernel or OS as the host and they obtain access to resources from the host over virtualized devices—like network cards—and network protocols. However, VMs are fairly opaque to the host system; the hypervisor has poor insight into what’s been written but freed in block storage and memory. They usually leverage hardware/CPU-based facilities for isolating their access to memory and appear as a handful of hypervisor processes on the host system.

Containers have to run the same kernel as the host, but they can optionally run a completely different package tree or distribution. Containers are fairly transparent to the host system. The kernel manages memory and filesystem access the same way as if it were running on the host system. Containers obtain access to resources from the host over normal userland/IPC facilities. Some implementations even support handing sockets from the host to the container using standard UNIX facilities.

VMs are also heavyweight. They require a full OS and system image, such as EC2’s AMIs. The hypervisor runs a boot process for the VM, often even emulating BIOS. They usually appear as a handful of hypervisor processes on the host system. Containers, on the other hand, are lightweight. There may not be any persistent files or state, and the host system usually starts the containerized application directly or runs a container-friendly init dæmon, like systemd. They appear as normal processes on the host system.

Figure 3. Virtualization decoupled provisioning from hardware deployment. Containers decouple provisioning from OS deployment and boot-up.

Containers Now Offer the Same Features as VMs, but with Minimal Overhead

Compared to a virtual machine, the overhead of a container is disruptively low. They start so fast that many configurations can launch on-demand as requests come in, resulting in zero idle memory and CPU overhead. A container running systemd or Upstart to manage its services has less than 5MB of system memory overhead and nearly zero CPU consumption. With copy-on-write for disk, provisioning new containers can happen in seconds.

Containers are how we at Pantheon maintain a consistent system architecture that spans free accounts up to clustered, highly available enterprise users. We manage an internal supply of containers that we provision using an API.

We’re not alone here; virtually every large PaaS system—including Heroku, OpenShift, dotCloud and Cloud Foundry—has a container foundation for running their platform tools. PaaS providers that do rely on full virtual machines have inflexibly high infrastructure costs that get passed onto their customers (which is the case for our biggest competitors in our market).

The easiest way to play with containers and experience the difference is by using a modern Linux OS—whether locally, on a server or even on a VM. There is a great tutorial on Fedora by Dan Walsh and another one with systemd’s nspawn tool that includes the UNIX socket handoff.

Evolving Containers in Open Source

While Pantheon isn’t in the business of providing raw containers as a service, we’re working toward the open-source technical foundations. It’s in the same spirit as why Facebook and Yahoo incubated Hadoop; it’s part of the foundation we need to build the product we want. We directly contribute as co-maintainers to systemd, much of which will be available in the next Red Hat Enterprise Linux release. We also send patches upstream to enable better service “activation” support so containers can run only when needed.

We have a goal that new installations of Fedora and other major distributions, like Ubuntu, provide out-of-the-box, standard API and command-line tools for managing containers, binding them to ports or IP addresses and viewing the resource reservation and utilization levels. This capability should enable other companies eventual access to a large pool of container hosts and flavors, much as Xen opened the door to today’s IaaS services.

One way we’ll measure progress toward this goal is how “vanilla” our container host machines are. If we can prepare a container host system by just installing some packages and a certificate (or other PKI integration), we’ll have achieved it. The free, open-source software (FOSS) world will be stronger for it, and Pantheon will also benefit by having less code to maintain and broader adoption of container-centric computing.

There’s also an advantage to this sort of FOSS contribution versus Cloud Foundry, OpenShift or OpenStack, which are open-source PaaS and IaaS stacks. What Pantheon is doing with projects like systemd redefines how users manage applications and resources even on single machines. The number of potential users (and, since it’s FOSS, contributors) is orders of magnitude larger than for tools that require multi-machine deployments to be useful. It’s also more sustainable in FOSS to have projects where 99% of the value reaches 99% of a large, existing user base.

Efficiency demands a future of containers running on bare-metal hardware. Virtual machines have had their decade.



Google Cloud Messaging Update Boosted by XMPP

Google’s huge update to its Cloud Messaging service is the biggest announcement from this year’s Google I/O, as it mixes our two core strengths. We strongly believe this is a big deal for mobile developers, and we will explain why.

What is Google Cloud Messaging (GCM)?


In case you have never heard of it, here is a quick overview.

Google Cloud Messaging was announced last year as a replacement for Google Cloud to Device Messaging. It is a service that allows mobile developers to notify mobile devices (and also the Chrome browser) about important changes on the server-side component of an application. It makes it possible for the device to stay up to date without polling. Polling means periodically checking your server for updates, and it is bad for several reasons:

  • it consumes a lot of battery, periodically waking up the mobile device’s network stack.
  • it consumes a lot of server resources needlessly, as most of these requests lead to a “no update” reply.
  • it introduces latency, as the request for new data can happen long after the data becomes available (unless you set a very low polling interval, which is even worse for the mobile battery). You can get your news alert hours after everyone already knows about it, which defeats the purpose of the alert in the first place.
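
To make the trade-off between battery and latency concrete, here is a small Python sketch. It assumes an update becomes available at a uniformly random moment between two polls, so the expected wait is half the polling interval. The intervals and the function name `polling_cost` are chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polling_cost(interval_seconds):
    """Return (wake-ups per day, average delay in seconds) for a polling interval.

    Assumes updates arrive at a uniformly random moment between two polls,
    so the expected wait is half the interval.
    """
    wakeups_per_day = SECONDS_PER_DAY / interval_seconds
    average_delay = interval_seconds / 2
    return wakeups_per_day, average_delay


for interval in (60, 15 * 60, 4 * 60 * 60):  # hypothetical polling intervals
    wakeups, delay = polling_cost(interval)
    print(f"poll every {interval:>5}s: {wakeups:6.0f} wake-ups/day, "
          f"average delay {delay:.0f}s")
```

Shortening the interval reduces the delay but multiplies the number of network wake-ups, which is the tension the list above describes.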

Push notifications avoid all those drawbacks: they save battery life, provide lower latency and reduce server load. Because the device keeps a permanently connected session to Google’s servers, it gets important notification messages as soon as they happen, in a battery-efficient way.

Because the service runs at device level, it can be optimized across the applications running on the same device, saving even more battery.

So this is a critical feature to implement in any application that relies on data coming from the network (which actually covers a lot of applications).

GCM offers several notable features:

  • Support for two use cases:
    • “send to sync”: a short notification telling the client to fetch an update from the developer’s server.
    • “send data”: a larger notification containing the data payload, avoiding an extra round trip from the client to the developer’s server.
  • Time to live can be used to discard a message, without notification, if the device does not come online before it expires. Pending messages can also be replaced using the collapse feature.
  • Delay while idle can be used for less urgent notifications. The device gets them when it becomes active again.
  • Multicast messages send the same notification to up to 1000 devices.
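
The features above map onto fields of the JSON message a developer server sends to GCM. The Python sketch below builds such a message body without sending it. The field names (`registration_ids`, `data`, `collapse_key`, `time_to_live`, `delay_while_idle`) follow the GCM HTTP JSON format as I understand it and should be checked against the official documentation; the function name and the device IDs in the usage note are placeholders.

```python
import json

MAX_MULTICAST_RECIPIENTS = 1000  # limit quoted in the list above


def build_gcm_message(registration_ids, data=None, collapse_key=None,
                      time_to_live=None, delay_while_idle=False):
    """Build the JSON body of a GCM message as a string (nothing is sent)."""
    if not registration_ids:
        raise ValueError("at least one registration ID is required")
    if len(registration_ids) > MAX_MULTICAST_RECIPIENTS:
        raise ValueError("a multicast message is limited to 1000 devices")

    message = {"registration_ids": list(registration_ids)}
    if data is not None:
        message["data"] = data                  # "send data" use case
    if collapse_key is not None:
        message["collapse_key"] = collapse_key  # a newer message replaces a pending one
    if time_to_live is not None:
        message["time_to_live"] = time_to_live  # seconds before the message expires
    if delay_while_idle:
        message["delay_while_idle"] = True      # wait until the device is active
    return json.dumps(message, sort_keys=True)


# "Send to sync": no payload, just a collapsible hint to fetch updates.
print(build_gcm_message(["device-1", "device-2"], collapse_key="sync"))
```

A “send data” message would pass a `data` dictionary instead, for example `build_gcm_message(["device-1"], data={"score": "3-1"}, time_to_live=3600)`.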

What does the GCM update bring to developers?

The new update is a brand new service deployment bringing a large set of new features for developers.

In short, it brings:

  • Persistent connections between the developer backend and Google, making it possible to send a larger number of notifications faster.
  • Upstream messaging: the device can send notifications back to the developer’s server through the Google platform.
  • Notification synchronization between devices: a developer can remove a notification from one device when it has been read or processed on another.

GCM Cloud Connection Server (CCS): Persistent connections using XMPP

Google chose to use XMPP to allow developers to keep a persistent connection. It works as follows:

  • You open an XMPP client connection to Google XMPP GCM server (Port 5235 on
  • You can then keep the secure connection open for as long as you wish.
  • You can then send your usual GCM JSON payload in XMPP message packets.
  • This is streaming, so you receive possible errors asynchronously and can match each one to the original push thanks to the XMPP message ID.
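
The asynchronous matching described in the last step can be sketched as a table of pending pushes keyed by message ID. The Python below leaves out the XMPP transport entirely and only shows the bookkeeping. The JSON field names (`to`, `message_id`, `message_type`, `error`) are assumptions about the payload layout, and the class name and device ID are hypothetical.

```python
import json
import uuid


class PendingPushes:
    """Track pushes sent over a streaming connection until they are answered."""

    def __init__(self):
        self._pending = {}

    def prepare(self, registration_id, data):
        """Return a JSON payload carrying a fresh message ID and remember it."""
        message_id = uuid.uuid4().hex
        payload = {"to": registration_id, "message_id": message_id, "data": data}
        self._pending[message_id] = payload
        return json.dumps(payload)

    def handle_response(self, raw_response):
        """Match an asynchronous response to the push that caused it."""
        response = json.loads(raw_response)
        original = self._pending.pop(response["message_id"], None)
        if original is None:
            return None  # unknown or already answered message ID
        return original, response.get("message_type"), response.get("error")


pushes = PendingPushes()
sent = json.loads(pushes.prepare("device-1", {"kind": "sync"}))

# Simulated error arriving later on the same stream.
reply = json.dumps({"message_id": sent["message_id"],
                    "message_type": "nack",
                    "error": "BAD_REGISTRATION"})
original, message_type, error = pushes.handle_response(reply)
print(original["to"], message_type, error)  # device-1 nack BAD_REGISTRATION
```

Because responses can arrive in any order, the lookup is by ID rather than by position in the stream.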

This alone brings you a huge performance increase, allowing your server to send up to 4000 messages per second on the persistent connection. Knowing you are allowed up to 10 connections, you can possibly send many notifications fast (up to 40k notifications per second).
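
The throughput figures above can be turned into a quick capacity estimate. This Python sketch uses only the two numbers quoted in the text; the batch size of one million notifications is a hypothetical example, and real throughput will depend on more than these limits.

```python
import math

MESSAGES_PER_SECOND_PER_CONNECTION = 4000  # figure quoted above
MAX_CONNECTIONS = 10                       # figure quoted above


def seconds_to_send(notification_count, connections=MAX_CONNECTIONS):
    """Estimate the minimum whole seconds needed to push a batch of notifications."""
    if not 1 <= connections <= MAX_CONNECTIONS:
        raise ValueError("connections must be between 1 and 10")
    throughput = connections * MESSAGES_PER_SECOND_PER_CONNECTION
    return math.ceil(notification_count / throughput)


print(MAX_CONNECTIONS * MESSAGES_PER_SECOND_PER_CONNECTION)  # 40000
print(seconds_to_send(1_000_000))                            # 25
print(seconds_to_send(1_000_000, connections=1))             # 250
```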

A few things to note:

  • You cannot use multicast messages with an XMPP persistent connection (at least not yet).
  • However, you can mix HTTP connections and XMPP persistent connections to optimize performance for each use case.

Upstream messaging

Upstream messaging allows your client to send data asynchronously to your server. Compared to HTTP POSTs, it offers several advantages:

  • It is easy for the client developer, with a simple gcm.send command. This is “fire and forget”: the client developer sends the data, and the Android GCM framework saves it locally with the commitment to deliver it.
  • It is reliable, with reliability handled transparently by the GCM mobile framework on the client. There is an acknowledgement mechanism on the client and between GCM and your server that ensures no message is lost.
  • Timing of sends is optimized across mobile applications. This allows significant battery savings for messages that are not very time sensitive. Optimization even takes the server-side downstream needs into account: whichever side sends messages first, client or server, the queue for the other direction is flushed at the same time.
  • It supports time to live. If a message cannot be sent because the network is unavailable, and it is no longer relevant, it is discarded.

As a developer, your server will receive the messages through an XMPP connection. However, be very careful about your server’s efficiency: it has to be robust and read data fast, as the GCM server will queue only 100 messages for you before starting to overwrite them (and if you are offline for 24 hours it will discard your messages).
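
To see why a slow reader loses data, here is a Python sketch of a bounded queue with the 100-message limit quoted above. It assumes that “overwrite” means the oldest queued message is dropped to make room for the newest; the text does not say which message is lost, so treat that policy, and the class name `UpstreamInbox`, as assumptions.

```python
from collections import deque

QUEUE_LIMIT = 100  # figure quoted above


class UpstreamInbox:
    """Bounded queue that drops the oldest message once the limit is reached."""

    def __init__(self, limit=QUEUE_LIMIT):
        self._queue = deque(maxlen=limit)
        self.dropped = 0

    def enqueue(self, message):
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1  # the oldest message is about to be overwritten
        self._queue.append(message)

    def drain(self):
        """Return all queued messages in arrival order and empty the queue."""
        messages = list(self._queue)
        self._queue.clear()
        return messages


inbox = UpstreamInbox()
for sequence_number in range(150):  # the server reads nothing during this burst
    inbox.enqueue(sequence_number)

received = inbox.drain()
print(inbox.dropped, received[0], received[-1])  # 50 50 149
```

In this simulation a burst of 150 unread messages loses the first 50, which is the kind of loss a fast, robust reader avoids.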

Multi-device notification synchronisation

The goal of this feature is to keep notification state up to date and consistent across a user’s devices, by propagating state changes between the installations of your application on that user’s devices.

This feature is designed to bring many added benefits in the way multiple devices are handled.

First, to tell GCM which devices belong to the same user, you can group their registration IDs under the same notification_key. The notification key is typically a hash of the username. You are allowed up to 10 devices per notification_key.

Once you have done that, you can use the notification_key as a user ID to send notifications to all devices of a given user. By doing so, you let GCM perform lots of optimizations under the hood. The notification is sent first to the active device or the last active device. Other devices get it a bit later, in a “delay while idle” style of delivery.

Once the notification has been processed and dismissed, the other devices are notified and can remove it as well. If they have not received it yet, it is cancelled directly in the notification queue.
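
The grouping step can be pictured with a small Python sketch of server-side bookkeeping. It derives a key by hashing the username and enforces the limit of 10 devices quoted above. The choice of SHA-256, the class name `DeviceGroups` and the sample IDs are assumptions for illustration; the actual grouping is managed by GCM, not by local code like this.

```python
import hashlib

MAX_DEVICES_PER_KEY = 10  # limit quoted above


def notification_key_for(username):
    """Derive a stable key from a username (a SHA-256 hex digest here)."""
    return hashlib.sha256(username.encode("utf-8")).hexdigest()


class DeviceGroups:
    """Local record of which registration IDs share a notification key."""

    def __init__(self):
        self._groups = {}

    def register(self, username, registration_id):
        key = notification_key_for(username)
        devices = self._groups.setdefault(key, [])
        if registration_id in devices:
            return key  # already registered, nothing to do
        if len(devices) >= MAX_DEVICES_PER_KEY:
            raise ValueError("a notification_key is limited to 10 devices")
        devices.append(registration_id)
        return key

    def devices_for(self, username):
        return list(self._groups.get(notification_key_for(username), []))


groups = DeviceGroups()
groups.register("alice", "phone-reg-id")
groups.register("alice", "tablet-reg-id")
print(groups.devices_for("alice"))  # ['phone-reg-id', 'tablet-reg-id']
```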


As you can see, this is a big update to Google Cloud Messaging for developers, increasing the number of situations in which the platform is relevant.

We are already working on supporting these improvements in our push notification platform to help you all benefit from them.

Stay tuned for more information on this support very soon.


Google I/O 2013: Services, services, services

Today was the keynote of the Google I/O developer conference. The keynote is usually where major announcements about the Google ecosystem are made.

Despite impressive announcements, the most important thing that strikes me is not what has been released, but what has not been mentioned.

Services, services and more services

First, what are the main areas of focus this year?

Whether on Android, on Chrome or on the Cloud and server engine architecture, Google is consistently pushing existing services further, adding new ones, and integrating all the pieces together. Here is the impressive list of highlights from their service stack:

  • Google Maps API V2 and location API improvements, with:
    • Fused location = faster, more accurate and more battery friendly.
    • Geofencing: the ability to save up to one hundred location triggers per application.
    • Activity recognition based on the phone accelerometer. The device can tell whether you are walking, cycling or driving. This is battery efficient, as it does not rely on GPS.
  • Google+ Sign-In, which brings deep integration between websites and Android apps through the Google+ service.
  • Google Cloud Messaging, Google’s push notification service, with three major highlights:
    • Persistent connections between the developer backend and Google, making it possible to send a larger number of notifications faster.
    • Upstream messaging: the device can send notifications back to the developer’s server through the Google platform.
    • Notification synchronization between devices: a developer can remove a notification from one device when it has been read or processed on another.

Google Cloud Messaging is one of our main areas of interest. We are already working on the new features and we will have announcements to make soon under our mobile Boxcar brand. Stay tuned 🙂

  • Google Play Game Service, with:
    • Cloud Save to synchronize your game progress across devices.
    • Achievements and Leaderboards, integrated with Google+.
    • Multiplayer API to help developers with the networking part.
    • Matchmaking to find players to play with.
    • Cross-platform experience on Android and iOS.
  • Google Wallet: low-profile (that is, not really promoted in the keynote) improvements, like Gmail payments, or easier checkout on mobile.
  • Better developer console to analyse how Android apps are doing and optimize their performance, more specifically:
    • Optimization tips
    • App Translation Service.
    • Referral tracking
    • Usage metrics (Google Analytics from the developer console).
    • Revenue graph
    • Beta testing and staged rollout management

There were also announcements focusing on improvements or additions to Google services for end users (as opposed to developers):

  • Google Play Store improvements to better promote apps to users.
  • Google Music subscription service (US only for now).
  • Huge Google Maps rework and redesign.
  • Search improvements with focus on:
    • more knowledge graph integration in search.
    • more integration of personal information and Google+, with circle-based personalisation in search results.
    • Better conversation (aka iteratively refined voice search), with conversation-based queries coming to Chrome on the desktop.
    • Google Now improvements with more cards, to anticipate your search needs on the go.
  • Google+ improvements, mostly for end users:
    • Stream redesign
    • Hangouts chat system, a merge of all Google chats. It is cross-platform and focuses on conversation, real-time messaging, photo sharing and group video calling.
    • Photos management, with impressive auto-enhancement and sorting features.

Note from an XMPP developer’s perspective: does the new Hangouts mean Gtalk and XMPP will disappear, along with interoperability? There was no word about it, but I think so.

Impressive, isn’t it?

Still, as a developer, that first day strangely leaves me with a feeling of unfulfilled expectations. Why? I think that to understand it, we need to list what Google did not talk about.

What Google did not talk about

At previous Google I/O conferences, center stage was usually taken by:

  • Android updates: none was announced today.
  • Shiny new devices, usually pre-released to developers. Nothing on this front either.
  • New unexpected projects, like Chrome, Google Glass, Google TV, or even the now dead Wave.

On this side, nothing has been announced. No mention of Android for home or TV. No successor for the now dead Nexus Q. No update on the Android Accessory Development Kit. No Glass push. No new wearable computer.

Despite a few talks on Google Glass tomorrow, there was little mention in the keynote of the progress made so far.

This year, Google is focusing on services for several (valid) reasons:

  • Those services are updated directly on the devices through Google Play Store. They can more easily push the updates to the end users.
  • Services are perceived as Apple’s Achilles heel.
  • Services are a way to put Google at the front of the stage and differentiate the Google experience from the various Android forks. They also allow Google to differentiate itself from device manufacturers that are increasingly trying to take the front of the stage with their Android devices.

But as Google focuses on services, the story they tell is increasingly about themselves.

For Android developers, the one major highlight (outside of Google services) was Android Studio, a development environment based on JetBrains IntelliJ, released today in early preview (version 0.1!).

Google has also been heavily promoting Chrome on the desktop, and now on Android and iOS too, focusing on bringing the same experience to all environments. Along with the fact that both Chrome and Android are now under the sole direction of Sundar Pichai, this leaves a strangely confusing impression.

My conclusion is that for Google, devices do not matter. When Larry Page says that he wants the technology, the device, to disappear, he means it literally. Google Glass and conversational search are steps in that direction. They are the most straightforward access to Google services. Ideally, they should not even be needed.

It does not matter if you use Chrome, Glass, Android or iOS to access Google services. What matters are the services themselves and the contextual data that can be gathered to improve the relevance and personalisation of the service.

Sundar Pichai said two days ago that Google I/O would not be centered on devices. That is because, for Google, devices are not an end but a means.

I feel that at this Google I/O, the goal of Google has never been clearer (if you look through the confusion I mentioned earlier).

At this very moment, the paths of Apple and Google may split:

Google wants to improve people’s lives with services, making the technology totally hidden. Apple wants to improve people’s lives by focusing on how people interact with the technology (touch, voice, and more). This goes through device improvements (lighter, faster, easier to use), not through making the devices disappear.

Today, I feel that we are at a turning point. I am really looking forward to WWDC to see what Apple’s move will be.

(Source: Nati Shalom’s Blog)

Amazon S3 – Two Trillion Objects, 1.1 Million Requests / Second

Last June I blogged about the first trillion objects stored in Amazon S3. On the first day of re:Invent I updated that number to 1.3 trillion.

It is time for another update!

I’m pleased to announce that there are now more than 2 trillion (2 x 10^12) objects stored in Amazon S3 and that the service is regularly peaking at over 1.1 million requests per second.

It took us six years to grow to one trillion stored objects, and less than a year to double that number.

What Does That Mean?
It is always fun to try and put these numbers into real-world terms:

I spoke at a cloud computing conference in China last week. With a population of 1.35 billion, there are 1,481 Amazon S3 objects per Chinese citizen!

Our galaxy is estimated to contain about 400 billion stars. That works out to five objects for every star in the galaxy.

The field of Paleodemography estimates that 100 billion people have been born on planet Earth. Each of them can have 20 S3 objects.

And one more — our universe is about 13.6 billion years old. If you added one S3 object every 60 hours starting at the Big Bang, you’d have accumulated almost two trillion of them by now.
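
The comparisons above are easy to check. The short Python sketch below redoes the arithmetic with the figures quoted in the text; the only added assumption is a year length of 365.25 days for the Big Bang estimate.

```python
S3_OBJECTS = 2 * 10**12  # two trillion objects

print(S3_OBJECTS // 1_350_000_000)    # objects per Chinese citizen: 1481
print(S3_OBJECTS // 400_000_000_000)  # objects per star in the galaxy: 5
print(S3_OBJECTS // 100_000_000_000)  # objects per person ever born: 20

# One object every 60 hours since the Big Bang (13.6 billion years ago).
HOURS_PER_YEAR = 365.25 * 24  # assumed average year length
objects_since_big_bang = 13.6e9 * HOURS_PER_YEAR / 60
print(f"{objects_since_big_bang:.3e}")  # just under two trillion
```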

(via Amazon Web Services)

Real-Time Ad Impression Bids Using DynamoDB

Real-time bidding (RTB) allows advertisers and publishers to buy and sell online display ad inventory in an efficient, high-speed marketplace. RTB platforms solicit bid requests for each ad impression from multiple ad exchanges (or advertisers) and pick the highest bid. Auctions occur in real-time, while end-users wait for content to load in a browser or an application. The impact on end-user latencies places strict time constraints on the decision-making latency of bidding systems (or bidders) built by ad exchanges and advertisers to compete in the RTB marketplace.

RTB promises higher ad effectiveness and reach to advertisers, and optimal pricing of inventory and better fill-rates to publishers. To deliver on this promise, bidders must make intelligent decisions based on all available information, including cookie profiles, available inventory, price history, content, and context in a super-tight time-window. To achieve this, bidders need access to a data-store that offers extremely low latency and the ability to scale to massive transaction rates. Amazon DynamoDB is specifically designed to offer consistent low-latency performance at high scale, making it a great fit for RTB systems that need to quickly access information like cookie profile data. Bidders can also leverage DynamoDB’s durable and fast writes to update dynamic information like ad counts (for particular ads) into the cookie profile without worrying about data availability and durability in the face of failures.

Latency, Latency, Latency…
Each ad served using RTB involves complex real-time interactions between content systems, ad servers, a real-time bidding platform, and multiple ad exchanges. The process has to complete under strict time constraints because it occurs while a user waits for content to load, and latency has a direct impact on end-user engagement. Typical bidder latency requirements are in the 50 to 150 ms range, to ensure that ad opportunities are not lost due to ad serving delays. Because separate companies own the bidders and the RTB platform, the systems hosting these services are typically in different data-centers. Assuming a latency requirement of 100 ms, network round-trip latency of 40 ms, and a buffer of 20 ms (to account for network jitter and queuing delays), the bidder has only 40 ms to decide on a bid indicating the ad to serve and the bid-price. Even if the application heavily optimizes computation and data access, the available time for individual data fetches and writes is on the order of single-digit milliseconds.
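
The budget arithmetic in this paragraph can be sketched in a few lines of Python. The first function uses the three figures from the text. The second splits what is left across data accesses; its compute time of 15 ms and its count of 5 sequential accesses are hypothetical values chosen only to show how a single-digit millisecond figure can arise.

```python
def bidder_time_budget_ms(latency_requirement_ms, network_round_trip_ms, buffer_ms):
    """Time left for the bidder once network latency and a safety buffer are removed."""
    remaining = latency_requirement_ms - network_round_trip_ms - buffer_ms
    if remaining <= 0:
        raise ValueError("no time left for the bidder to decide")
    return remaining


def per_access_budget_ms(budget_ms, compute_ms, sequential_accesses):
    """Split the non-compute part of the budget evenly across data accesses."""
    return (budget_ms - compute_ms) / sequential_accesses


budget = bidder_time_budget_ms(100, 40, 20)
print(budget)  # 40

# Hypothetical workload: 15 ms of computation and 5 sequential reads or writes.
print(per_access_budget_ms(budget, compute_ms=15, sequential_accesses=5))  # 5.0
```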

Here’s a graphical depiction of the ad serving time budget:

Reliability, Zero Touch Operations
In addition to raw speed, DynamoDB offers other advantages that make it a great fit for bidding systems. The elasticity of provisioned capacity allows reads and writes to be scaled up or down based on bid request rates and ad availability. Atomic increment and decrement operations allow an accurate count of events (e.g. ad views per user for frequency capping). DynamoDB’s flexible schema allows attributes to be added to a table when required, for example adding an attribute like “sports_enthusiast” to a cookie-profile table on the fly. Moreover, customers can focus on building a bidder without having to deal with the operational aspects of running a large-scale distributed datastore, like partitioning, repartitioning, setting up replication groups and cluster management. With DynamoDB, these things are done under the covers for them.
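
As one possible illustration of the atomic counters mentioned above, the Python sketch below builds the parameters of a DynamoDB UpdateItem request that adds 1 to a per-ad view counter. It only constructs a dictionary and sends nothing. The table name `cookie_profiles`, the key attribute `cookie_id` and the `views_<ad id>` naming scheme are hypothetical, and the parameter names follow the UpdateItem API as I understand it, so check them against the official reference before use.

```python
def frequency_cap_update(cookie_id, ad_id, table_name="cookie_profiles"):
    """Build UpdateItem parameters that atomically add 1 to a per-ad view counter."""
    return {
        "TableName": table_name,                             # hypothetical table
        "Key": {"cookie_id": {"S": cookie_id}},              # hypothetical key schema
        "UpdateExpression": "ADD #views :one",               # atomic increment
        "ExpressionAttributeNames": {"#views": f"views_{ad_id}"},
        "ExpressionAttributeValues": {":one": {"N": "1"}},
        "ReturnValues": "UPDATED_NEW",                       # return the new count
    }


params = frequency_cap_update("cookie-123", "ad-42")
print(params["UpdateExpression"], params["ExpressionAttributeNames"])
```

With boto3 installed and AWS credentials configured, parameters shaped like these could be passed to `boto3.client("dynamodb").update_item(**params)`. The increment happens on the server, so concurrent bidders do not overwrite each other’s counts.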

AdRoll: Building a 7 billion Impressions/Day Platform on DynamoDB
AdRoll, which was named by Inc. Magazine as the fastest-growing advertising company in 2012, uses DynamoDB for their retargeting platform. According to Valentino Volonghi, AdRoll’s chief architect – “We use DynamoDB to bid on more than 7 billion impressions per day on the Web and FBX. AdRoll’s bidding system accesses more than a billion cookie profiles stored in DynamoDB, and sees uniform low-latency response. In addition, the availability of DynamoDB in all AWS regions allows our lean team to meet the rigorous low latency demands of real-time bidding in countries across the world without having to worry about infrastructure management.”

In a competitive world of real-time bidding, where a delayed bid results in revenue lost and unconsidered data can lead to over-bidding, the choice of the right data-store is critical. DynamoDB provides a high-speed, flexible, highly available data-store, which provides the ideal platform for building a bidding system.

If you’d like to build an ad serving system of your own, we recommend that you start out with our Ad Serving Reference Architecture:

— Prashant and Pravin

(via Amazon Web Services Blog)

CloudSigma goes all-SSD to boost HPC performance in the public cloud

Public clouds offer lots of flexibility, but not necessarily the sort of performance you need for handling big data. The Zurich-based provider CloudSigma has felt this pinch more than most, as it is a supplier to Europe’s performance-hungry science cloud, Helix Nebula, and now it says it has found the solution: going all-SSD. Well, that and rolling its own stack.

CloudSigma, which operates out of both Switzerland (Zurich) and the U.S. (Las Vegas), was one of a handful of infrastructure-as-a-service (IaaS) providers that signed up last November for SolidFire’s all-SSD storage system. The result is now here: CloudSigma has ditched all its hard-disk drives and, as a result, it now feels confident enough to offer a service-level agreement (SLA) for performance, as well as uptime.

What’s more, despite the fact that solid-state storage costs about eight times as much as hard-disk, CloudSigma hasn’t changed its pricing – its SSD-based utility service costs $0.14 per GB per month, same as the HDD-based service did. Customers can also pick up the SSD storage service unbundled from CPU and RAM if they so choose.

HPC in the public cloud

According to CloudSigma COO Bernino Lind, the shift to SSD is a major help when it comes to handling high-performance computing (HPC) workloads, such as those of Helix Nebula users CERN, the European Space Agency (ESA) and the European Molecular Biology Laboratory (EMBL):

“They want to go to opex instead of capex, but the problem is there is no-one really who does public infrastructure-as-a-service which works well enough for HPC. There is contention — variable performance on compute power and, even worse, really variable performance on IOPS [Input/Output Operations Per Second]. When you have a lot of I/O operations, then you get all over the spectrum from having a couple of hundred to having 1,000 and it just goes up and down. It means that, once you run a large big data setup, you get iowaits and your entire stack normally just stops and waits.”

Lind pointed out that, while aggregated spinning-disk setups will only allow up to 10,000 IOPS, one SSD will allow 100,000-1.5 million IOPS. That mitigates that particular contention problem. “There should be a law that public IaaS shouldn’t run on magnetic disks,” he said. “The customer buys something that works sometimes and doesn’t work other times – it shouldn’t be possible to sell something that has that as a quality.”

CloudSigma has also resolved another contention point around RAM, Lind claimed:

“A modern CPU can ask for a lot of data because it’s fast and efficient, so it is possible to saturate and make contention on your memory bus. That has been solved with NUMA topology, which is like a multiplexer to get access to memory banks. You get asynchronous access, which means you don’t have contention on accessing the RAM.

“However, public cloud service providers turn this off so the actual instance doesn’t have access to NUMA. We figured out a way to pass on the NUMA topology so, when you run really extensive compute jobs, you won’t hit a kind of contention when you want access to RAM. This is really important for big data workloads.”
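
Whether a given instance can actually see a NUMA topology is something a customer can check from inside a Linux guest. The sketch below is not CloudSigma's mechanism; it only assumes the guest exposes the standard sysfs layout under /sys/devices/system/node, and the function names are made up for illustration.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as '0-3,8' into CPU numbers."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_topology(root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map each NUMA node ID to its CPUs; empty if no node info is exposed."""
    base = Path(root)
    if not base.is_dir():
        return {}
    topology: dict[int, list[int]] = {}
    for node_dir in base.glob("node[0-9]*"):
        suffix = node_dir.name[4:]
        cpulist = node_dir / "cpulist"
        if suffix.isdigit() and cpulist.is_file():
            topology[int(suffix)] = parse_cpulist(cpulist.read_text())
    return dict(sorted(topology.items()))


if __name__ == "__main__":
    for node, cpus in numa_topology().items():
        print(f"node {node}: CPUs {cpus}")
```

A guest that reports only a single node (or none) is most likely not being shown the host's topology.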

In-house stack

Speaking of things that public cloud providers tend to turn off, CloudSigma’s stack – apart from the underlying KVM hypervisor, everything was written in-house – makes it possible to access all the instruction set goodies that are built into modern processors, such as the AES encryption instruction set.

Public clouds may run on a variety of physical hosts that encompass a range of CPU generations, only some of which will have certain instruction sets hard-coded onto the silicon. Providers will often turn off these instruction sets to make their platform homogeneous, but that means losing out on the performance benefits offered by hard-coding. According to Lind, CloudSigma’s stack allows a heterogeneous cloud based on allocation pools – say, one of older Intel chips and another of newer AMD 6380 chips – that customers can choose according to their performance needs.
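
One way to see whether a provider has masked an instruction set is to look at the CPU flags the guest reports. This is a minimal sketch assuming an x86 Linux guest, where /proc/cpuinfo lists hardware AES support as the flag "aes"; the helper names are hypothetical.

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_aes(cpuinfo_text: str) -> bool:
    """True if the guest advertises the AES instruction set."""
    return "aes" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print("AES instructions visible:", has_aes(fh.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```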

What does all this mean in practice? Lind cited the example of augmented-reality gaming outfit Ogmento, which recently used CloudSigma’s all-SSD setup to power a mobile, location-based version of a popular title. “They [said] all their I/O-heavy stuff, databases and so on, saw a x8-x12 performance increase,” he noted. “Their entire stack saw a x2-x4 performance increase. That means they need to use less compute power in order to run their system.”

With the budgetary constraints faced by European scientists these days, it’s not hard to see how that same kind of effect could make a real difference in more serious applications too.


Amazon RDS Scales Up – Provision 3 TB and 30,000 IOPS Per DB Instance

The Amazon Relational Database Service (RDS) handles all of the messy low-level aspects of setting up, managing, and scaling MySQL, Oracle Database, and SQL Server databases. You can simply create an RDS DB instance with the desired processing power and storage space and RDS will take care of the rest.

The RDS Provisioned IOPS feature (see my recent blog post for more information) gives you the power to specify the desired number of I/O operations per second when you create each DB instance. This allows you to set up instances with the desired level of performance while keeping your costs as low as possible.

Today we are introducing three new features to make Amazon RDS even more powerful and more scalable:

  1. Up to 3 TB of storage and 30,000 Provisioned IOPS.
  2. Conversion from Standard Storage to Provisioned IOPS storage.
  3. Independent scaling of IOPS and storage.

Let’s take an in-depth look at each of these new features.

Up to 3 TB of Storage and 30,000 Provisioned IOPS
We are tripling the amount of storage that you can provision for each DB instance, and we’re also tripling the number of IOPS for good measure.

You can now create DB instances (MySQL or Oracle) with up to 3 TB of storage (the previous limit was 1 TB) and 30,000 IOPS (previously, 10,000). SQL Server DB instances can be created with up to 1 TB of storage and 10,000 IOPS.

For a workload with 50% reads and 50% writes running on an m2.4xlarge instance, you can realize up to 25,000 IOPS for Oracle and 12,500 IOPS for MySQL. However, by provisioning up to 30,000 IOPS, you may be able to achieve lower latency and higher throughput. Your actual realized IOPS may vary from what you have provisioned based on your database workload, instance type, and choice of database engine. Refer to the Factors That Affect Realized IOPS section of the Amazon RDS User Guide to learn more.

Obviously, you can work with larger datasets, and you can read and write the data faster than before. You might want to start thinking about scaling PIOPS up and down over time in response to seasonal variations in load. You could also use a CloudWatch alarm to make sure that you are the first to know

You can modify the storage of existing instances that are running MySQL or Oracle Database. When you do this you can grow storage by 10% or more, and you can raise and lower PIOPS in units of 1,000. There will be a performance impact while the scaling process is underway.
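
The modification rules above can be sketched as a small pre-flight check. This is an illustration only, not part of any AWS SDK: the function name is hypothetical, "units of 1,000" is read as "a multiple of 1,000", and the 30,000 ceiling is the MySQL/Oracle limit quoted earlier.

```python
def check_modification(current_gb: int, new_gb: int, new_piops: int,
                       max_piops: int = 30000) -> list[str]:
    """Return rule violations; an empty list means the request looks valid."""
    problems = []
    # Integer arithmetic avoids float rounding at exactly 10% growth.
    if new_gb != current_gb and new_gb * 10 < current_gb * 11:
        problems.append("storage must grow by at least 10%")
    if new_piops % 1000 != 0:
        problems.append("PIOPS must change in units of 1,000")
    if not 0 < new_piops <= max_piops:
        problems.append(f"PIOPS must be positive and at most {max_piops}")
    return problems
```

For example, growing a 1,000 GB instance to 1,100 GB passes, while a request for 1,050 GB is flagged.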

Conversion from Standard Storage to Provisioned IOPS Storage
You can convert DB instances with Standard Storage to Provisioned IOPS in order to gain the benefits of fast and predictable performance. You can do this from the AWS Management Console, the command line, or through the RDS APIs. Simply modify the instance and specify the desired number of PIOPS.

There will be a brief impact on availability when the modification process starts. If you are running a Multi-AZ deployment of RDS, the availability impact will be limited to the amount of time needed for the failover to complete (typically three minutes). The conversion may take several hours to complete and there may be a moderate performance degradation during this time.

Note: This feature is applicable to DB instances running MySQL or Oracle Database.
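
As a rough sketch of the API route, the conversion could look like this with the present-day boto3 SDK, which is an assumption here since the post does not name a client library. The call used is RDS ModifyDBInstance with its Iops and ApplyImmediately parameters; the helper names are hypothetical, and the multiple-of-1,000 check simply carries over the unit mentioned earlier. Depending on the API version, a StorageType value may also be needed.

```python
def piops_conversion_request(instance_id: str, piops: int,
                             apply_immediately: bool = True) -> dict:
    """Build keyword arguments for an RDS ModifyDBInstance call."""
    if piops <= 0 or piops % 1000 != 0:
        raise ValueError("PIOPS must be a positive multiple of 1,000")
    return {
        "DBInstanceIdentifier": instance_id,
        "Iops": piops,
        "ApplyImmediately": apply_immediately,
    }


def convert_to_piops(instance_id: str, piops: int, region: str) -> dict:
    """Send the request; needs boto3 and AWS credentials (not exercised here)."""
    import boto3  # imported lazily so the builder above works without it

    rds = boto3.client("rds", region_name=region)
    return rds.modify_db_instance(**piops_conversion_request(instance_id, piops))
```

Keeping the request builder separate from the network call makes it easy to inspect what would be sent before touching a live instance.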

Independent Scaling of IOPS and Storage
You can now scale Provisioned IOPS and storage independently. In general, you will want to have between 3.0 and 10.0 IOPS per GB of storage. You can modify the ratio over time as your needs change.

Again, this feature is applicable to DB instances running MySQL or Oracle Database.
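
To make the 3.0 to 10.0 guideline concrete, the following sketch checks a storage/PIOPS pair against it and lists the PIOPS settings that fit a given storage size. The function names are made up, and combining the ratio with 1,000-IOPS steps is an assumption drawn from the scaling unit mentioned earlier, not a documented rule.

```python
def ratio_ok(storage_gb: int, piops: int, low: int = 3, high: int = 10) -> bool:
    """True if piops falls within low..high IOPS per GB of storage."""
    return low * storage_gb <= piops <= high * storage_gb


def piops_choices(storage_gb: int, step: int = 1000,
                  low: int = 3, high: int = 10) -> list[int]:
    """List PIOPS values (multiples of step) inside the recommended ratio."""
    # Round the lower bound up to the next multiple of step.
    first = -(-low * storage_gb // step) * step
    return list(range(first, high * storage_gb + 1, step))
```

With 1,000 GB of storage this yields settings from 3,000 to 10,000 PIOPS, while very small volumes may have no setting that satisfies both constraints.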

Available Now
All three of these features are available now, and they are available in every AWS Region where Provisioned IOPS are supported (all Regions except AWS GovCloud (US)).

You can use these features in conjunction with RDS Multi-AZ deployments and RDS Read Replicas.

(via Amazon Web Services blog)