AWS OpsWorks in the Virtual Private Cloud

I am pleased to announce support for using AWS OpsWorks with Amazon Virtual Private Cloud (Amazon VPC). AWS OpsWorks is a DevOps solution that makes it easy to deploy, customize and manage applications. OpsWorks provides helpful operational features such as user-based SSH management, additional CloudWatch metrics for memory and load, automatic RAID volume configuration, and a variety of application deployment options. You can optionally use the popular Chef automation platform to extend OpsWorks with your own custom recipes. With VPC support, you can now take advantage of the application management benefits of OpsWorks in your own isolated network. This allows you to run many new types of applications on OpsWorks.

For example, you may want a configuration like the following, with your application servers in a private subnet behind a public Elastic Load Balancer (ELB). This lets you control access to your application servers. Users communicate with the Elastic Load Balancer which then communicates with your application servers through the ports you define. The NAT allows your application servers to communicate with the OpsWorks service and with Linux repositories to download packages and updates.

To get started, we’ll first create this VPC. For a shortcut to create this configuration, you can use a CloudFormation template. First, navigate to the CloudFormation console and select Create Stack. Give your stack a name, provide the template URL http://cloudformation-templates-us-east-1.s3.amazonaws.com/OpsWorksinVPC.template and select Continue. Accept the defaults and select Continue. Create a tag with a key of “Name” and a meaningful value. Then create your CloudFormation stack.
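If you prefer to script the console steps above, the same stack can be created through the CloudFormation API. The sketch below only builds the CreateStack parameters as plain data; the stack name and tag value are illustrative, and the commented-out boto3 call is an assumption that requires configured AWS credentials:

```python
# Template URL from the post; the stack name and Name tag below are examples.
TEMPLATE_URL = ("http://cloudformation-templates-us-east-1.s3.amazonaws.com"
                "/OpsWorksinVPC.template")

def build_create_stack_params(stack_name, name_tag):
    """Build the keyword arguments for a CloudFormation CreateStack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": TEMPLATE_URL,
        "Tags": [{"Key": "Name", "Value": name_tag}],
    }

params = build_create_stack_params("opsworks-vpc", "OpsWorks VPC demo")
# To actually create the stack (requires AWS credentials):
#   import boto3
#   boto3.client("cloudformation").create_stack(**params)
print(params["StackName"])
```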

When your CloudFormation stack’s status shows “CREATE_COMPLETE”, take a look at the outputs tab; it contains several IDs that you will need later, including the VPC and subnet IDs.
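The outputs tab corresponds to the Outputs list in a DescribeStacks API response. A small helper like the following can flatten those outputs into a dict for later use; the output key names and resource IDs in the sample response are hypothetical, since they depend on the template:

```python
def stack_outputs(describe_stacks_response):
    """Flatten the Outputs of a DescribeStacks response into a key->value dict."""
    stack = describe_stacks_response["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack["Outputs"]}

# Sample DescribeStacks-shaped response with illustrative keys and IDs.
sample = {
    "Stacks": [{
        "StackStatus": "CREATE_COMPLETE",
        "Outputs": [
            {"OutputKey": "VPC", "OutputValue": "vpc-11111111"},
            {"OutputKey": "PrivateSubnet", "OutputValue": "subnet-22222222"},
        ],
    }]
}
ids = stack_outputs(sample)
print(ids["PrivateSubnet"])  # subnet-22222222
```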

You can now create an OpsWorks stack to deploy a sample app in your new private subnet. Navigate to the AWS OpsWorks console and click Add Stack. Select the VPC and private subnet that you just created using the CloudFormation template.
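The same stack creation can be expressed against the OpsWorks CreateStack API, which takes the VPC and default subnet directly. This is a sketch only; the ARNs and IDs are placeholders, and the actual call (via boto3, commented out) assumes configured credentials:

```python
def build_opsworks_stack_params(name, region, vpc_id, subnet_id,
                                service_role_arn, instance_profile_arn):
    """Keyword arguments for an OpsWorks CreateStack call targeting a VPC."""
    return {
        "Name": name,
        "Region": region,
        "VpcId": vpc_id,               # VPC ID from the CloudFormation outputs
        "DefaultSubnetId": subnet_id,  # the private subnet created above
        "ServiceRoleArn": service_role_arn,
        "DefaultInstanceProfileArn": instance_profile_arn,
    }

params = build_opsworks_stack_params(
    "php-demo", "us-east-1", "vpc-11111111", "subnet-22222222",
    "arn:aws:iam::111122223333:role/aws-opsworks-service-role",
    "arn:aws:iam::111122223333:instance-profile/aws-opsworks-ec2-role")
# boto3.client("opsworks").create_stack(**params)  # requires credentials
print(params["DefaultSubnetId"])
```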

Next, under Add your first layer, click Add a layer. In the Layer type box, select PHP App Server. Attach the Elastic Load Balancer created by the CloudFormation template to the layer, and then click Add layer.

Next, in the layer’s Actions column click Edit. Scroll down to the Security Groups section and select the Additional Group with OpsWorksSecurityGroup in the name. Click the + symbol, then click Save.

Next, in the navigation pane, click Instances, accept the defaults, and then click Add an Instance. This creates the instance in the default subnet you set when you created the stack.

Under PHP App Server, in the row that corresponds to your instance, click start in the Actions column.

You are now ready to deploy a sample app to the instance you created. An app represents the code you want to deploy to your servers. That code is stored in a repository, such as Git or Subversion. For this example, we’ll use the SimplePHPApp application from the Getting Started walkthrough. First, in the navigation pane, click Apps. On the Apps page, click Add an app. Type a name for your app, set Repository URL to git://github.com/amazonwebservices/opsworks-demo-php-simple-app.git, and set Branch/Revision to version1. Accept the defaults for the other fields.

When all the settings are as you want them, click Add app. When you first add a new app, it isn’t yet deployed to the instances for the layer. To deploy your app to the instance in the PHP App Server layer, under Actions, click Deploy.
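The add-app and deploy steps above map onto the OpsWorks CreateApp and CreateDeployment API calls. The sketch below builds only the request parameters (the stack and app IDs are placeholders); the repository URL and revision come from the post:

```python
GIT_URL = "git://github.com/amazonwebservices/opsworks-demo-php-simple-app.git"

def build_app_params(stack_id, name):
    """Keyword arguments for an OpsWorks CreateApp call for SimplePHPApp."""
    return {
        "StackId": stack_id,
        "Name": name,
        "Type": "php",
        "AppSource": {"Type": "git", "Url": GIT_URL, "Revision": "version1"},
    }

def build_deploy_params(stack_id, app_id):
    """Keyword arguments for an OpsWorks CreateDeployment call."""
    return {"StackId": stack_id, "AppId": app_id,
            "Command": {"Name": "deploy"}}

app = build_app_params("stack-id-placeholder", "SimplePHPApp")
print(app["AppSource"]["Revision"])  # version1
```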

Once your deployment has finished, in the navigation pane, click Layers. Select the Elastic Load Balancer for your PHP App Server layer. The ELB page shows the load balancer’s basic properties, including its DNS name and the health status of the associated instances. A green check indicates the instance has passed the ELB health checks (this may take a minute). You can then click on the DNS name to connect to your app through the load balancer.

You can try these new features with a few clicks of the AWS Management Console. To learn more about how to launch OpsWorks instances inside a VPC, see the AWS OpsWorks Developer Guide.

You may also want to sign up for our upcoming AWS OpsWorks Webinar on September 12, 2013 at 10:00 AM PT. The webinar will highlight common use cases and best practices for how to set up AWS OpsWorks and Amazon VPC.

– Chris Barclay, Senior Product Manager

(source: Amazon Web Services blog)

 

More Database Power – 20,000 IOPS for MySQL With the CR1 Instance

If you are a regular reader of this blog, you know that I am a huge fan of the Amazon Relational Database Service (RDS). Over the course of the last couple of years, I have seen that my audiences really appreciate the way that RDS obviates a number of tedious yet mission-critical tasks that are an inherent responsibility when running a relational database. There’s no joy to be found in keeping operating systems and database engines current, creating and restoring backups, scaling hardware up and down, or creating an architecture that provides high availability.

Today we are making RDS even more powerful by adding a new high-end database instance class. The new db.cr1.8xlarge instance type gives you plenty of memory, CPU power, and network throughput to allow your MySQL 5.6 applications to perform at up to 20,000 IOPS. This is a 60% improvement over the previous high-water mark of 12,500 IOPS and opens the door to database-driven applications that are even more demanding than before. Here are the specs:

  • 64-bit platform
  • 244 GB of RAM
  • 88 ECU (32 hyperthreaded virtual cores, each delivering 2.75 ECU)
  • High-performance networking

This new instance type is available in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions and you can start using it today!
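Provisioning a database at this new ceiling comes down to a CreateDBInstance call with the new instance class and 20,000 provisioned IOPS. The sketch below only assembles the parameters; the 2,000 GB storage figure is an assumption chosen to keep the IOPS-to-storage ratio at 10:1, which Provisioned IOPS requires to stay within RDS limits:

```python
def build_db_instance_params(identifier, username, password):
    """Keyword arguments for an RDS CreateDBInstance call at 20,000 IOPS."""
    return {
        "DBInstanceIdentifier": identifier,
        "DBInstanceClass": "db.cr1.8xlarge",  # the new high-end class
        "Engine": "mysql",                    # MySQL 5.6 supports cr1.8xlarge
        "AllocatedStorage": 2000,             # GB; assumed 10:1 IOPS ratio
        "Iops": 20000,
        "MasterUsername": username,
        "MasterUserPassword": password,
    }

params = build_db_instance_params("bigdb", "admin", "change-me")
# boto3.client("rds").create_db_instance(**params)  # requires credentials
print(params["Iops"])  # 20000
```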

(source: Amazon Web Services blog)

OpenStack Grizzly Architecture

As OpenStack has continued to mature, it has become more complicated in some ways but radically simplified in others. From a deployer’s view, each service has become easier to deploy, with more sensible defaults and a proliferation of cloud distributions. However, the architect’s view of OpenStack has actually gotten more complicated – new services have been added and new ways of integrating them are now feasible.

As an aid to architects who are new to OpenStack, this post updates my OpenStack Folsom Architecture blog and revisits my Intro to OpenStack Architecture (Grizzly Edition) presentation from Portland, with a few clarifications and updates.

OpenStack Components

There are currently seven core components of OpenStack: Compute, Object Storage, Identity, Dashboard, Block Storage, Network and Image Service. Let’s look at each in turn:

  • Object Store (codenamed “Swift”) allows you to store or retrieve files (but not mount directories like a fileserver). Several companies provide commercial storage services based on Swift, including KT, Rackspace (from which Swift originated) and Hewlett-Packard. Swift is also used internally at many large companies to store their data.
  • Image Store (codenamed “Glance”) provides a catalog and repository for virtual disk images. These disk images are most commonly used in OpenStack Compute.
  • Compute (codenamed “Nova”) provides virtual servers upon demand. Rackspace and HP provide commercial compute services built on Nova, and it is used internally at companies like Mercado Libre, Comcast, Best Buy and NASA (where it originated).
  • Dashboard (codenamed “Horizon”) provides a modular web-based user interface for all the OpenStack services. With this web GUI, you can perform most operations on your cloud, like launching an instance, assigning IP addresses and setting access controls.
  • Identity (codenamed “Keystone”) provides authentication and authorization for all the OpenStack services. It also provides a catalog of the services within a particular OpenStack cloud.
  • Network (which used to be named “Quantum” but is in the process of being renamed due to a trademark issue) provides “network connectivity as a service” between interface devices managed by other OpenStack services (most likely Nova). The service works by allowing users to create their own networks and then attach interfaces to them. Quantum has a pluggable architecture to support many popular networking vendors and technologies.
  • Block Storage (codenamed “Cinder”) provides persistent block storage to guest VMs. This project was born from code originally in Nova (the nova-volume service, which has since been deprecated). While this was originally a block-storage-only service, it has been extended to NFS shares.

In addition to these core projects, there are also a number of non-core projects that will be included in future OpenStack releases.

Conceptual Architecture

The OpenStack project as a whole is designed to “deliver(ing) a massively scalable cloud operating system.” To achieve this, each of the constituent services is designed to work with the others to provide a complete Infrastructure as a Service (IaaS). This integration is facilitated through public application programming interfaces (APIs) that each service offers (and in turn can consume). While these APIs allow each of the services to use another service, they also allow an implementer to switch out any service as long as the API is maintained. These are (mostly) the same APIs that are available to end users of the cloud.

Conceptually, you can picture the relationships between the services as so:

OpenStack Grizzly Conceptual Architecture

  • Dashboard provides a web front end to the other OpenStack services
  • Compute stores and retrieves virtual disks (“images”) and associated metadata in the Image Store (“Glance”)
  • Network provides virtual networking for Compute.
  • Block Storage provides storage volumes for Compute.
  • Image Store can store the actual virtual disk files in the Object Store
  • All the services authenticate with Identity

This is a stylized and simplified view of the architecture, assuming that the implementer is using all of the services together in the most common configuration. However, OpenStack does not mandate an all-or-nothing approach. Many implementers only deploy the pieces that they need. For example, Swift is a popular object store for cloud service providers, even if they deploy another cloud compute infrastructure.

The diagram also only shows the “operator” side of the cloud — it does not picture how consumers of the cloud may actually use it. For example, many users will access object storage heavily (and directly).

Logical Architecture

As you can imagine, the logical architecture is far more complicated than the conceptual architecture shown above. As with any service-oriented architecture, diagrams quickly become “messy” when trying to illustrate all the possible combinations of service communications. The diagram below illustrates the most common architecture of an OpenStack-based cloud. However, as OpenStack supports a wide variety of technologies, it does not represent the only possible architecture.

OpenStack Grizzly Logical Architecture

This picture is consistent with the conceptual architecture above in that:

  • End users interact through a common web interface or directly with each service through its API
  • All services authenticate through a common source (facilitated through Keystone)
  • Individual services interact with each other through their APIs (except where privileged administrator commands are necessary) — including the user’s web interface

In the sections below, we’ll delve into the architecture for each of the services.

Dashboard

Horizon is a modular Django web application that provides an end user and cloud operator interface to OpenStack services.

OpenStack Horizon Screenshot

The interface has user screens for:

  • Quota and usage information
  • Instances to operate cloud virtual machines
  • Volume management to control creation, deletion and connectivity to block storage
  • Image and snapshot to upload and control virtual images, which are used to backup and boot new instances
  • Access and security to manage keypairs and security groups (firewall rules)

In addition to the user screens, it also provides an interface for cloud operators. The operator interface provides a view across the entire cloud and adds some configuration-focused screens such as:

  • Flavors to define service catalog offerings of CPU, memory and boot disk storage
  • Projects to provide logical groups of user accounts
  • Users to administer user accounts
  • System Info to view services running in the cloud and quotas applied to projects

The Grizzly edition of Horizon adds a few new features as well as significant refactoring to the user experience:

  • Networking (see the new network topology diagrams)
  • Direct image upload to Glance
  • Support for flavor extra specs
  • Migrate instances to other compute hosts
  • User experience improvements

The Horizon architecture is fairly simple. Horizon is usually deployed via mod_wsgi in Apache. The code itself is separated into a reusable Python module containing most of the logic (interactions with the various OpenStack APIs) and a presentation layer (making it easily customizable for different sites).

From a network architecture point of view, this service will need to be customer accessible as well as be able to talk to each service’s public APIs. If you wish to use the administrator functionality (i.e. for other services), it will also need connectivity to their Admin API endpoints (which should not be customer accessible).

Compute

Nova is the most complicated and distributed component of OpenStack. A large number of processes cooperate to turn end-user API requests into running virtual machines. Among Nova’s more prominent features are:

  • Starting, resizing, stopping and querying virtual machines (“instances”)
  • Assigning and removing public IP addresses
  • Attaching and detaching block storage
  • Adding, modifying and deleting security groups
  • Show instance consoles
  • Snapshot running instances

There are several changes to the architecture in this release. These changes include the deprecation of nova-network and nova-volume, as well as the decoupling of nova-compute from the database (through the no-compute-db feature). All of these changes are optional (the old code is still available to be used), but the old paths are slated to disappear soon.

Below is a list of these processes and their functions:

  • nova-api is a family of daemons (nova-api, nova-api-os-compute, nova-api-ec2, nova-api-metadata or nova-api-all) that accept and respond to end-user compute API calls. It supports the OpenStack Compute API, Amazon’s EC2 API and a special Admin API (for privileged users to perform administrative actions). It also initiates most of the orchestration activities (such as running an instance) and enforces some policy (mostly quota checks). Different daemons allow Nova to implement different APIs (Amazon EC2, OpenStack Compute, Metadata) or combinations of APIs (nova-api starts both the EC2 and OpenStack APIs).
  • The nova-compute process is primarily a worker daemon that creates and terminates virtual machine instances via hypervisor APIs (XenAPI for XenServer/XCP, libvirt for KVM or QEMU, VMwareAPI for VMware, etc.). New to the Grizzly release is the return of Hyper-V support (thanks to the Cloudbase Solutions guys for the comment). The process by which it does so is fairly complex, but the basics are simple: accept actions from the queue and then perform a series of system commands (like launching a KVM instance) to carry them out, while updating state in the database through nova-conductor. Please note that the use of nova-conductor is optional in this release, but it does greatly increase security.
  • The nova-scheduler process is conceptually the simplest piece of code in OpenStack Nova: it takes a virtual machine instance request from the queue and determines where it should run (specifically, which compute server host it should run on). In practice, it is now one of the most complex.
  • A new service called nova-conductor has been added to this release. It mediates access to the database for other daemons (only nova-compute in this release) to provide greater security.
  • The queue provides a central hub for passing messages between daemons. This is usually implemented with RabbitMQ today, but it could be any AMQP message queue (such as Apache Qpid), or ZeroMQ.
  • The SQL database stores most of the build-time and run-time state for a cloud infrastructure. This includes the instance types that are available for use, instances in use, networks available and projects. Theoretically, OpenStack Nova can support any database supported by SQL-Alchemy but the only databases currently being widely used are sqlite3 (only appropriate for test and development work), MySQL and PostgreSQL.
  • Nova also provides console services to allow end users to access their virtual instance’s console through a proxy. This involves several daemons (nova-console, nova-xvpvncproxy, nova-spicehtml5proxy and nova-consoleauth).

Nova interacts with many other OpenStack services: Keystone for authentication, Glance for images and Horizon for the web interface. The Glance interactions are central: the API process can upload to and query Glance, while nova-compute downloads images for use in launching instances.
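The end-user API requests that nova-api accepts are plain JSON. As a concrete illustration, the minimal request body for booting an instance through the OpenStack Compute v2 API (POST /v2/{tenant_id}/servers) can be built like this; the name, image and flavor values are placeholders:

```python
def build_server_create_body(name, image_ref, flavor_ref):
    """Minimal JSON body for POST /v2/{tenant_id}/servers (Compute API v2)."""
    return {"server": {"name": name,
                       "imageRef": image_ref,     # a Glance image UUID
                       "flavorRef": flavor_ref}}  # a flavor ID from the catalog

body = build_server_create_body("web-1", "image-uuid", "1")
print(body["server"]["name"])  # web-1
```

nova-api validates a request like this, enforces quota, and drops a message on the queue for nova-scheduler and nova-compute to act on.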

Object Store

OpenStack’s Object Store (“Swift”) is designed to provide large scale storage of data that is accessible via APIs. Unlike a traditional file server, it is completely distributed, storing multiple copies of each object to achieve greater availability and scalability. Swift provides the following user functionality:

  • Stores and retrieves objects (files)
  • Sets and modifies metadata on objects (tags)
  • Versions objects
  • Serves static web pages and objects via HTTP. In fact, the diagrams in this blog post are being served out of Rackspace’s Swift service.

The Swift architecture is very distributed, both to prevent any single point of failure and to scale horizontally. It includes the following components:

  • Proxy server (swift-proxy-server) accepts incoming requests via the OpenStack Object API or just raw HTTP. It accepts files to upload, modifications to metadata or container creation. In addition, it will also serve files or container listing to web browsers. The proxy server may utilize an optional cache (usually deployed with memcache) to improve performance.
  • Account servers manage accounts defined with the object storage service.
  • Container servers manage a mapping of containers (i.e., folders) within the object store service.
  • Object servers manage actual objects (i.e. files) on the storage nodes.

There are also a number of periodic processes that run to perform housekeeping tasks on the large data store. The most important of these is the replication service, which ensures consistency and availability across the cluster. Other periodic processes include auditors, updaters and reapers. Authentication for the object store service is handled through configurable WSGI middleware (usually Keystone).
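Requests to swift-proxy-server are ordinary authenticated HTTP: an object upload is a PUT to {storage_url}/{container}/{object} with an X-Auth-Token header. The helper below just assembles that URL and header pair; the storage URL, container and token are hypothetical examples:

```python
def build_object_put(storage_url, container, object_name, auth_token):
    """URL and headers for uploading an object through swift-proxy-server."""
    url = "%s/%s/%s" % (storage_url.rstrip("/"), container, object_name)
    headers = {"X-Auth-Token": auth_token}  # token issued by Keystone
    return url, headers

url, headers = build_object_put(
    "https://swift.example.com/v1/AUTH_tenant", "diagrams", "arch.png", "token123")
print(url)  # https://swift.example.com/v1/AUTH_tenant/diagrams/arch.png
```

An HTTP client (e.g. requests) would then PUT the file bytes to that URL with those headers.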

To learn more about Swift, head over to the SwiftStack website and read their OpenStack Swift Architecture.

Image Store

OpenStack Image Store centralizes virtual images for users and other cloud services:

  • Stores public and private images that users can utilize to start instances
  • Allows users to query and list available images for use
  • Delivers images to Nova to start instances
  • Stores snapshots from running instances so that virtual machines can be backed up

The Glance architecture has stayed relatively stable since the Cactus release.

  • glance-api accepts Image API calls for image discovery, image retrieval and image storage.
  • glance-registry stores, processes and retrieves metadata about images (size, type, etc.).
  • A database to store the image metadata. Like Nova, you can choose your database depending on your preference (but most people use MySQL or SQLite).
  • A storage repository for the actual image files. In the diagram above, Swift is shown as the image repository, but this is configurable. In addition to Swift, Glance supports normal filesystems, RADOS block devices, Amazon S3 and HTTP. Be aware that some of these choices are limited to read-only usage.

There are also a number of periodic processes that run on Glance to support caching. The most important of these is the replication service, which ensures consistency and availability through the cluster. Other periodic processes include auditors, updaters and reapers.

As you can see from the diagram in the Conceptual Architecture section, Glance serves a central role to the overall IaaS picture. It accepts API requests for images (or image metadata) from end users or Nova components and can store its disk files in the object storage service, Swift.

Identity

Keystone provides a single point of integration for OpenStack policy, catalog, token and authentication:

  • Authenticates users and issues tokens for access to services
  • Stores users and tenants for role-based access control (RBAC)
  • Provides a catalog of the services (and their API endpoints) in the cloud
  • Creates policies across users and services

Architecturally, Keystone is very simple:

  • keystone handles API requests as well as providing configurable catalog, policy, token and identity services.
  • Each Keystone function has a pluggable backend which allows different ways to use the particular service. Most support standard backends like LDAP or SQL, as well as Key Value Stores (KVS).

Most people will use this as a point of customization for their current authentication services.
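Token issuance in the Identity v2.0 API is a POST to /v2.0/tokens with a small JSON body. The helper below builds that body; the username, password and tenant name are placeholders:

```python
def build_token_request(username, password, tenant_name):
    """JSON body for POST /v2.0/tokens in the Keystone Identity v2.0 API."""
    return {"auth": {"passwordCredentials": {"username": username,
                                             "password": password},
                     "tenantName": tenant_name}}

body = build_token_request("demo", "secret", "demo-tenant")
print(body["auth"]["tenantName"])  # demo-tenant
```

The response contains both a token (used as X-Auth-Token by the other services) and the service catalog of API endpoints.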

Network

Quantum provides “network connectivity as a service” between interface devices
managed by other OpenStack services (most likely Nova):

  • Users can create their own networks and then attach server interfaces to them
  • A pluggable backend architecture lets users take advantage of commodity gear or vendor-supported equipment
  • Extensions allow additional network services, such as load balancing

Like many of the OpenStack services, Quantum is highly configurable due to its
plug-in architecture. These plug-ins accommodate different networking equipment
and software. As such, the architecture and deployment can vary dramatically.

  • quantum-server accepts API requests and then routes them to the
    appropriate quantum plugin for action.
  • Quantum plugins and agents perform the actual work, such as plugging and
    unplugging ports, creating networks or subnets and IP addressing. These
    plugins and agents differ depending on the vendor and technologies used in the
    particular cloud. Quantum ships with plugins and agents for: Cisco virtual and
    physical switches, the Nicira NVP product, NEC OpenFlow products, Open vSwitch,
    Linux bridging and the Ryu Network Operating System. Midokura also provides a plug-in for Quantum integration. The common agents are L3 (layer 3), DHCP (dynamic host IP addressing) and vendor-specific plug-in agent(s).
  • Most Quantum installations will also make use of a messaging queue to route
    information between the quantum-server and various agents as well as a
    database to store networking state for particular plugins.

Quantum will interact mainly with Nova, where it will provide networks and
connectivity for its instances. Florian Otel has written a very thorough article on implementing Open vSwitch if you are looking for an example of Quantum in action.
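The user-facing workflow (create a network, then attach an interface) maps onto two small JSON bodies in the Quantum v2.0 API. The sketch below builds them as plain data; the network name and IDs are placeholders:

```python
def build_network_body(name):
    """JSON body for POST /v2.0/networks against quantum-server."""
    return {"network": {"name": name, "admin_state_up": True}}

def build_port_body(network_id, device_id):
    """JSON body for POST /v2.0/ports, attaching a device (e.g. a Nova
    instance) to an existing network."""
    return {"port": {"network_id": network_id, "device_id": device_id}}

net = build_network_body("private-net")
print(net["network"]["name"])  # private-net
```

quantum-server routes requests like these to the configured plugin, which carries out the work through its agents.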

Block Storage

Cinder separates out the persistent block storage functionality that was
previously part of OpenStack Compute (in the form of nova-volume) into its own
service. The OpenStack Block Storage API allows for manipulation of volumes,
volume types (similar to compute flavors) and volume snapshots:

  • Create, modify and delete volumes
  • Snapshot or backup volumes
  • Query volume status and metadata

Its architecture follows the Quantum model, which provides for a northbound API and vendor plugins underneath it.

  • cinder-api accepts API requests and routes them to cinder-volume
    for action.
  • cinder-volume acts upon the requests by reading or writing to the
    Cinder database to maintain state, interacting with other processes (like
    cinder-scheduler) through a message queue and acting directly upon the
    block storage hardware or software that it fronts. It can interact with a variety of
    storage providers through a driver architecture. Currently, there are included drivers for IBM (XIV, Storwize and SVC), SolidFire, Scality, Coraid appliances, RADOS block storage (Ceph), Sheepdog, NetApp, Windows Server 2012 iSCSI, HP (LeftHand and 3PAR), Nexenta appliances, Huawei (T series and Dorado storage systems), Zadara VPSA, Red Hat’s GlusterFS, EMC (VNX and VMAX arrays), Xen and Linux iSCSI.
  • Much like nova-scheduler, the cinder-scheduler daemon picks the optimal
    block storage provider node to create the volume on.
  • cinder-backup is a new service that backs up the data from a volume (not a full snapshot) to a backend service. Currently, the only shipping backend service is Swift.
  • Cinder deployments will also make use of a messaging queue to route
    information between the cinder processes as well as a database to store volume state.

Like Quantum, Cinder will mainly interact with Nova, providing volumes for its
instances.
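A volume creation through cinder-api is another small JSON body, a POST to /v1/{tenant_id}/volumes. The helper below builds it; the size and display name are placeholders:

```python
def build_volume_body(size_gb, display_name):
    """JSON body for POST /v1/{tenant_id}/volumes against cinder-api."""
    return {"volume": {"size": size_gb,            # size in GB
                       "display_name": display_name}}

body = build_volume_body(10, "data-vol")
print(body["volume"]["size"])  # 10
```

cinder-api hands a request like this to cinder-scheduler via the queue, which picks the block storage node where cinder-volume creates the volume.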

Future Projects

In the next version of OpenStack (“Havana” which is due in the Fall of 2013), two new projects will be brought into the fold:

  • Ceilometer is a metering project. The project offers metering information and the ability to code more ways to know what has happened on an OpenStack cloud. While it provides metering, it is not a billing project. A full billing solution requires metering, rating, and billing. Metering lets you know what actions have taken place, rating enables pricing and line items, and billing gathers the line items to create a bill to send to the consumer and collect payment. For users that also want a billing package, BillingStack is another open source project that provides payment gateway and other billing features. Ceilometer is available as a preview now.
  • Heat provides a REST API to orchestrate multiple cloud applications implementing standards such as AWS CloudFormation.

Looking beyond the “Havana” release, OpenStack is slated to see the addition of two more projects in the Spring of 2014 (for the newly named “Icehouse” release):

  • Reddwarf is a database as a service offering that provides MySQL databases within OpenVZ containers upon demand.
  • Ironic is the aptly named project that uses OpenStack to deploy bare metal servers instead of virtualized cloud instances.

There are also a number of related but unofficial projects:

  • Moniker, which provides DNS-as-a-service for OpenStack
  • Marconi, which is a message queueing service
  • Savanna, which provisions Hadoop clusters on OpenStack
  • Murano, which allows a non-experienced user to deploy reliable Windows-based environments
  • Convection, a task or workflow service to execute command sequences or long-running jobs

(Source: solinea.com)