Moving to blob storage can be tricky at times. The differences
between Windows Azure blob storage and a normal filesystem are
significant, and there are smaller but still important differences
between Azure blobs and other cloud blob services such as Amazon’s
S3. These issues are amplified by how similar all these services appear.
Windows Azure blobs make every blob look like a file, but you must keep in
mind that, underneath, they’re not files, and that they don’t always
behave like files. Similarly, users of other cloud storage services expect
similar design patterns and architecture to work unmodified, and they
consequently run into “gotchas” because of underlying differences.
Let’s take a look at some characteristics of blob storage that often
trip up users.
1. Requests Could Fail
Think about a typical file server. It is sitting in a corner
somewhere (or on a rack) connected by a gigabit Ethernet connection to
your server running your application. Now, think about the same scenario
when your application talks to Windows Azure blob storage.
First, blob storage is spread out over hundreds (if not thousands) of
individual nodes, with far more networking between them than a single
file server needs. Also, if your application is not hosted on Windows Azure, you
must make several networking hops to even get to the Windows Azure data
centers and talk to blob storage. As anyone who has spent time in
networking can tell you, requests can fail from time to time. Even if
your application is hosted on Windows Azure, you might see blips when
your application and your storage account are in different data centers,
or even in the same data center, because of network or software issues.
These errors can manifest in several forms. Timeout errors can
show up when there is an issue between your code and the Windows Azure
data centers. Errors can occur even when the request reaches Windows
Azure—the service might be experiencing an outage or might experience an
internal error. Your application must be resilient to such errors. The
right way to deal with them is to back off and retry the request. When
you see repeated errors and you’re confident that the problem isn’t in
your code, back off exponentially: wait longer and longer between
successive retries.
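The back-off-and-retry advice above can be sketched in a few lines of Python. This is a minimal illustration, not a storage client API: `TransientError`, `retry_with_backoff`, and all of the parameter values are hypothetical names and numbers chosen for the example, and `operation` stands for whatever call your code makes to blob storage.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, server errors)."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait before retry n is base_delay * 2**(n - 1), capped at max_delay,
    plus up to 10% random jitter so many clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

Only errors you consider transient should be retried this way. An error caused by a bad request will fail identically on every attempt, so it should be raised as some other exception type and surfaced immediately.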
Timeout errors can also show up when you transfer large amounts of
data in a single request. When dealing with large blobs, split
interactions with them into smaller chunks. When uploading a big blob,
split it into multiple blocks. When reading a large blob, use HTTP range
requests, and keep track of progress so that you can resume reading a
blob if you experience an error.
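Reading a large blob in ranges while keeping track of progress might look like the following sketch. It assumes a hypothetical callable, `fetch_range(start, end)`, that returns the bytes from `start` to `end` inclusive (for example, by sending an HTTP `Range: bytes=start-end` header) and raises `OSError` when a request fails. The function name, chunk size, and retry count are illustrative choices only.

```python
def download_in_ranges(fetch_range, total_size, chunk_size=4 * 1024 * 1024,
                       max_retries_per_chunk=3):
    """Read total_size bytes by requesting one byte range at a time."""
    data = bytearray()
    offset = 0  # progress marker: every byte before this is already read
    while offset < total_size:
        end = min(offset + chunk_size, total_size) - 1  # inclusive end
        for attempt in range(max_retries_per_chunk):
            try:
                chunk = fetch_range(offset, end)
                break
            except OSError:
                if attempt == max_retries_per_chunk - 1:
                    raise  # give up on this chunk after repeated failures
        if not chunk:
            raise OSError("empty response for range %d-%d" % (offset, end))
        data.extend(chunk)
        offset += len(chunk)  # a failure later on resumes from here
    return bytes(data)
```

Because `offset` only advances after a chunk arrives, a failed request costs at most one chunk of repeated work rather than the whole transfer. A real program would likely also write each chunk to disk and add a delay between retries.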
Finally, if you consistently see errors, contact Microsoft, since
they could be caused by a bug in the blob service itself, or by an
ongoing outage with Windows Azure. Such problems are extremely rare,
but they do happen! You can look at the status of the
various Windows Azure services at http://www.microsoft.com/windowsazure/support/status/servicedashboard.aspx.
2. Changes Are Reflected Instantly
If you haven’t used any other cloud storage service, you can
safely skip this discussion. If you are familiar with other cloud
storage services, you have probably dealt with some of them being
eventually consistent. Essentially, changes you make take some time to
propagate. The Windows Azure blob service is different in that all
changes are reflected instantly. Whatever your operation—creating a blob
or a container, writing data, or deleting a blob or a container—every
client accessing your storage account will see those changes
instantly.
This does not necessarily mean that your data is deleted from
Windows Azure’s disks instantly. Windows Azure reflects changes
instantly, so when you delete a blob or a container, no one can access
it from that moment on. The space itself is recovered lazily, so the
underlying data is physically removed a little later. This delay is very
short, so you don’t need to worry about your data lingering on
Microsoft’s servers.
3. Compressed Content
You may often want to compress content to reduce bandwidth
costs and to improve performance. Fewer bytes sent over the wire means
less time spent in the network. Typically, you do this by enabling an
option on your web server. However, the Windows Azure blob service
doesn’t support compressing content on the fly.
In other words, you can’t make it gzip-compress bytes as it serves
them. One workaround you can use is to compress content before you store
it in Windows Azure blob storage. Later in this chapter, you’ll learn
how to do this in a manner where browsers can automatically decompress
this data.
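As a rough illustration of the compress-before-storing workaround, the sketch below gzip-compresses content in Python before it would be uploaded. The helper name `gzip_for_upload` is hypothetical and the upload call itself is not shown. Browsers generally rely on the standard HTTP `Content-Encoding: gzip` header to know that a response must be decompressed, so the sketch returns that header value alongside the bytes; how to attach it to a blob is outside the scope of this example.

```python
import gzip


def gzip_for_upload(content: bytes):
    """Return (compressed_bytes, headers) ready to hand to an upload call.

    The headers dict is only a suggestion of what to store with the blob;
    it is not tied to any particular storage client library.
    """
    compressed = gzip.compress(content)
    headers = {"Content-Encoding": "gzip"}
    return compressed, headers


# Example: repetitive text such as HTML or CSS usually shrinks considerably.
page = b"<li>item</li>\n" * 1000
body, headers = gzip_for_upload(page)
```

Content that is already compressed, such as JPEG images or ZIP archives, gains little from this step and can be stored as-is.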