
Cloud Application Architectures : Machine Image Design

10/12/2010 4:08:02 PM

Two indirect benefits of the cloud are:

  • It forces discipline in deployment planning

  • It forces discipline in disaster recovery

Thanks to the way virtualized servers launch from machine images, your first step in moving into any cloud infrastructure is to create a repeatable deployment process that handles all the issues that could come up as the system starts up. To ensure that it does, you need to do some deployment planning.

The machine image (in Amazon, the AMI) is a raw copy of your operating system and core software for a particular environment on a specific platform. When you start a virtual server, it copies its operating environment from the machine image and boots up. If your machine image contains your installed application, deployment is nothing more than the process of starting up a new virtual instance.

1. Amazon Machine Image Data Security

When you create an Amazon machine image, it is encrypted and stored in an Amazon S3 bundle. One of two keys can subsequently decrypt the AMI:

  • Your Amazon key

  • A key that Amazon holds

Only your user credentials have access to the AMI. Amazon needs the ability to decrypt the AMI so it can actually boot an instance from the AMI.

Don’t Store Sensitive Data in an AMI

Even though your AMI is encrypted, I strongly recommend never storing any sensitive information in an AMI. Not only does Amazon have theoretical access to decrypt the AMI, but there also are mechanisms that enable you to make your AMI public and thus perhaps accidentally share whatever sensitive data you were maintaining in the AMI.

For example, if one company sues another Amazon customer, a court may subpoena the other Amazon customer’s data. Unfortunately, it is not uncommon for courts to step outside the bounds of common sense and require a company such as Amazon to make available all Amazon customer data. If you want to make sure your data is never exposed as the result of a third-party subpoena, you should not store that data in an Amazon AMI.

Instead, encrypt it separately and load it into your instance at launch. Amazon then never holds the decryption keys, so the data cannot be accessed unless you yourself are a party to the subpoena.


2. What Belongs in a Machine Image?

A machine image should include all of the software necessary for the runtime operation of a virtual instance based on that image and nothing more. The starting point is obviously the operating system, but the choice of components is absolutely critical. The full process of establishing a machine image consists of the following steps:

  1. Create a component model that identifies what components and versions are required to run the service that the new machine image will support.

  2. Separate out stateful data in the component model. You will need to keep it out of your machine image.

  3. Identify the operating system on which you will deploy.

  4. Search for an existing, trusted baseline public machine image for that operating system.

  5. Harden your system using a tool such as Bastille.

  6. Install all of the components in your component model.

  7. Verify the functioning of a virtual instance using the machine image.

  8. Build and save the machine image.

The starting point is to know exactly what components are necessary to run your service. Figure 1 shows a sample model describing the runtime components for a MySQL database server.

Figure 1. Software necessary to support a MySQL database server


In this case, the stateful data exists in the MySQL directory, which is externally mounted as a block storage device. Consequently, you will need to make sure that your startup scripts mount your block storage device before starting MySQL.

Because the stateful data is assumed to be on a block storage device, this machine image is useful in starting any MySQL databases, not just a specific set of MySQL databases.
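One way to make steps 1 and 2 concrete is to write the component model down as data and split it by statefulness. The following is a minimal sketch, not the contents of Figure 1: the `Component` class, the `split_model` helper, and the component names and placeholder versions are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    stateful: bool = False  # True if it holds data that must outlive the instance


def split_model(components):
    """Return (image_components, external_components) split by statefulness."""
    image = [c for c in components if not c.stateful]
    external = [c for c in components if c.stateful]
    return image, external


# Hypothetical model for a MySQL server; names and versions are placeholders.
MYSQL_MODEL = [
    Component("operating system", "placeholder"),
    Component("mysql server", "placeholder"),
    Component("mysql data directory", "n/a", stateful=True),
]

image_parts, external_parts = split_model(MYSQL_MODEL)
print("bake into image:", [c.name for c in image_parts])
print("keep on block storage:", [c.name for c in external_parts])
```

Anything that lands in the second list stays out of the machine image and has to be attached or fetched at startup.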

The services you want to run on an instance generally dictate the operating system on which you will base the machine image. If you are deploying a .NET application, you will probably use one of the Amazon Windows images. A PHP application, on the other hand, will probably target a Linux environment. Either way, I recommend searching some of the more trusted prebuilt, basic AMIs for your operating system of choice and customizing from there.


Warning: Avoid using “kitchen sink” Linux distributions. Each machine image should be a hardened operating system and have only the tools absolutely necessary to serve its function.

Hardening an operating system is the act of minimizing attack vectors into a server. Among other things, hardening involves the following activities:

  • Removing unnecessary services.

  • Removing unnecessary accounts.

  • Running all services as a role account (not root) when possible.

  • Running all services in a restricted jail when possible.

  • Verifying proper permissions for necessary system services.

The best way to harden your Linux system is to use a proven hardening tool such as Bastille.
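A hardening tool does the real work, but it can help to check the result against your component model. The sketch below is an assumption-laden illustration rather than a substitute for a tool such as Bastille: `audit_services` is a hypothetical helper, and the service names and accounts in the example are made up. It flags services that the model does not require and required services that still run as root.

```python
def audit_services(running, allowed):
    """Report hardening findings.

    running: dict mapping service name -> account it runs as
    allowed: set of service names the component model requires
    """
    findings = []
    for name, account in sorted(running.items()):
        if name not in allowed:
            findings.append(f"{name}: not required, remove or disable")
        elif account == "root":
            findings.append(f"{name}: runs as root, prefer a role account")
    return findings


# Hypothetical snapshot of a freshly launched instance.
example_running = {"mysqld": "root", "cupsd": "root", "ntpd": "ntp"}
example_allowed = {"mysqld", "ntpd"}

for finding in audit_services(example_running, example_allowed):
    print(finding)
```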

Now that you have a secure base from which to operate, it is time to actually install the software that this system will support. In the case of the current example, it’s time to install MySQL.

When installing your server-specific services, you may have to alter the way you think about the deployment thanks to the need to keep stateful data out of the machine image. For a MySQL server, you would probably keep stateful data on a block device and mount it at system startup. A web server, on the other hand, might store stateful media assets out in a cloud storage system such as Amazon S3 and pull it over into the runtime instance on startup.

Different applications will definitely require different approaches based on their unique requirements. Whatever the situation, you should structure your deployment so that the machine image has the intelligence to look for its stateful data upon startup and provide your machine image components with access to that data before they need it.

Once you have the deployment structured the right way, you will need to test it. That means testing the system from launch through shutdown and recovery. Therefore, you need to take the following steps:

  1. Build a temporary image from your development instance.

  2. Launch a new instance from the temporary image.

  3. Verify that it functions as intended.

  4. Fix any issues.

  5. Repeat until the process is robust and reliable.

At some point, you will end up with a functioning instance from a well-structured machine image. You can then build a final image and go have a beer (or coffee).

3. A Sample MySQL Machine Image

The trick to creating a machine image that supports database servers is knowing how your database engine of choice stores its data. In the case of MySQL, the database engine has a data directory for its stateful data. This data directory may actually be called any number of things (/usr/local/mysql/data, /var/lib/mysql, etc.), but it is the only thing other than the configuration file that must be separated from your machine image. In a typical custom build, the data directory is /usr/local/mysql/data.


Note: If you are going to be supporting multiple machine images, it often helps to first build a hardened machine image with no services and then build each service-oriented image from that base.

Once you start an instance from a standard image and harden it, you need to create an Elastic Block Store (EBS) volume and mount it. The standard Amazon approach is to mount the volume under /mnt (e.g., /mnt/database). Where you mount it is technically unimportant, but it can help reduce confusion to keep the same directory for each image.


You can then install MySQL, making sure to install it within the instance’s root filesystem (e.g., /usr/local/mysql). At that point, move the data over into the block device using the following steps:

  1. Stop MySQL if the installation process automatically started it.

  2. Move your data directory over into your mount and give it a name more suited to mounting on a separate device (e.g., /mnt/database/mysql).

  3. Change your my.cnf file to point to the new data directory.
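Step 3 can be scripted so that it is repeatable across image builds. The following is a rough sketch that assumes the data directory is set with a `datadir` option in the `[mysqld]` section of my.cnf; the `set_datadir` helper and the sample file contents are illustrative, not taken from a real installation. It works on text rather than on the file itself, so you can inspect the result before writing it back.

```python
def set_datadir(cnf_text, new_datadir):
    """Return cnf_text with datadir in the [mysqld] section set to new_datadir."""
    out = []
    section = None
    replaced = False
    for line in cnf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [mysqld] without having seen datadir: add it first.
            if section == "mysqld" and not replaced:
                out.append(f"datadir={new_datadir}")
                replaced = True
            section = stripped[1:-1].strip().lower()
        elif section == "mysqld" and stripped.split("=", 1)[0].strip() == "datadir":
            line = f"datadir={new_datadir}"
            replaced = True
        out.append(line)
    if section == "mysqld" and not replaced:
        out.append(f"datadir={new_datadir}")
        replaced = True
    if not replaced:
        # No [mysqld] section at all: create one.
        out.extend(["[mysqld]", f"datadir={new_datadir}"])
    return "\n".join(out) + "\n"


# Illustrative my.cnf contents.
sample_cnf = (
    "[client]\n"
    "port=3306\n"
    "[mysqld]\n"
    "datadir = /usr/local/mysql/data\n"
    "skip-networking\n"
)
updated_cnf = set_datadir(sample_cnf, "/mnt/database/mysql")
print(updated_cnf)
```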

You now have a curious challenge on your hands: MySQL cannot start up until the block device has been mounted, but a block device under Amazon EC2 cannot be attached to an instance of a virtual machine until that instance is running. As a result, you cannot start MySQL through the normal boot-up procedures. However, you can end up where you want by enforcing the necessary order of events: boot the virtual machine, mount the device, and finally start MySQL. You should therefore carefully alter your MySQL startup scripts so that the system will no longer start MySQL on startup, but will still shut the MySQL engine down on shutdown.


Warning: Do not simply axe MySQL from your startup scripts. Doing so will prevent MySQL from cleanly shutting down when you shut down your server instance. You will thus end up with a corrupt database on your block storage device.

The best way to effect this change is to edit the MySQL startup script to wait for the presence of the MySQL data directory before starting the MySQL executable.
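The wait itself is a simple polling loop. Here is a minimal sketch in Python, assuming the data directory (for example, /mnt/database/mysql) exists only once the block device is mounted; the function name, timeout, and polling interval are arbitrary choices for illustration, and a real startup script would more likely express the same loop in shell.

```python
import os
import time


def wait_for_data_dir(path, timeout=300.0, poll_interval=2.0):
    """Poll until path exists as a directory.

    Returns True as soon as the directory is present, or False if it has
    not appeared once timeout seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.isdir(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

The startup script would call `wait_for_data_dir("/mnt/database/mysql")` and launch the MySQL executable only when it returns True. The shutdown half of the script stays untouched, so MySQL still stops cleanly when the instance shuts down.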

4. Amazon AMI Philosophies

In approaching AMI design, you can follow one of two core philosophies:

  • A minimalist approach in which you build a few multipurpose machine images.

  • A comprehensive approach in which you build numerous purpose-specific machine images.

I am a strong believer in the minimalist approach. The minimalist approach has the advantage of being easier for rolling out security patches and other operating-system-level changes. On the flip side, it takes a lot more planning and EC2 skills to structure a multipurpose AMI capable of determining its function after startup and self-configuring to support that function. If you are just getting started with EC2, it is probably best to take the comprehensive approach and use cloud management tools to eventually help you evolve into a library of minimalist machine images.
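To give a feel for what "determining its function after startup" can look like, here is a small sketch. It assumes the role is handed to the instance as a line of launch-time text such as `role=database` (on EC2, user data supplied at launch is one place such text could come from). The `role=` convention, the role names, and the setup steps are all hypothetical.

```python
# Hypothetical mapping from role to the setup steps a multipurpose image runs.
ROLE_SETUP = {
    "database": ["mount data volume", "start mysql"],
    "web": ["fetch media assets", "start web server"],
}


def parse_launch_settings(text):
    """Parse key=value lines, ignoring blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def plan_for(text):
    """Return the setup steps for the role named in the launch settings."""
    role = parse_launch_settings(text).get("role")
    if role not in ROLE_SETUP:
        raise ValueError(f"unknown or missing role: {role!r}")
    return ROLE_SETUP[role]


print(plan_for("# launch settings\nrole = database\n"))
```

Failing loudly on an unknown role is deliberate: an instance that cannot work out its function should not quietly come up half-configured.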

For a single application installation, you won’t likely need many machine images, and thus the difference between a comprehensive approach and a minimalist approach is negligible. SaaS applications are a different matter: they require a runtime deployment of application software, especially when they are not multitenant.

Runtime deployment means uploading the application software—such as the MySQL executable discussed in the previous section—to a newly started virtual instance after it has started, instead of embedding it in the machine image. A runtime application deployment is more complex (and hence the need for cloud management tools) than simply including the application in the machine image, but it does have a number of major advantages:

  • You can deploy and remove applications from a virtual instance while it is running. As a result, in a multiapplication environment, you can easily move an application from one cluster to another.

  • You end up with automated application restoration. The application is generally deployed at runtime using the latest backup image. When you embed the application in an image, on the other hand, your application launch is only as good as the most recent image build.

  • You can avoid storing service-to-service authentication credentials in your machine image and instead move them into the encrypted backup from which the application is deployed.
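The "latest backup" selection in a runtime deployment can be sketched as follows. This is only an illustration under stated assumptions: the naming scheme `<app>-<YYYYMMDDTHHMMSS>.tar.gz.enc`, the `latest_backup` helper, and the example names are hypothetical, and fetching and decrypting the chosen backup is left to whatever cloud management tooling you use.

```python
from datetime import datetime

SUFFIX = ".tar.gz.enc"  # hypothetical suffix for encrypted application backups


def latest_backup(names, app):
    """Return the newest backup name for app, or None if there is none."""
    prefix = f"{app}-"
    candidates = []
    for name in names:
        if not (name.startswith(prefix) and name.endswith(SUFFIX)):
            continue
        stamp = name[len(prefix):-len(SUFFIX)]
        try:
            when = datetime.strptime(stamp, "%Y%m%dT%H%M%S")
        except ValueError:
            continue  # not a timestamped backup, skip it
        candidates.append((when, name))
    return max(candidates)[1] if candidates else None


# Hypothetical listing of a backup location.
example_names = [
    "shop-20100901T010000.tar.gz.enc",
    "shop-20101011T233000.tar.gz.enc",
    "blog-20101012T000000.tar.gz.enc",
    "shop-notes.txt",
]
print(latest_backup(example_names, "shop"))
```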
