In
the previous section, we looked at how you could represent static data
for the purposes of data retrieval. This is only one side of the
picture, because typically you’ll want to associate the static data with
some dynamic data.
For example, people viewing the product details web page shown in figure 1 will hopefully purchase the shirt displayed. When they do, how should you store that data so that you can easily retrieve it?
Depending on the implementation
of the web page, you can either allow the user to purchase the item
directly or add the item to a shopping cart. In either case, the method
of storing the data will be the same, so let’s look at storing items in a
shopping cart.
1. Representing the shopping cart in SQL Azure
You’re probably familiar with
relational databases, so we’ll first look at how you can store shopping
cart data in SQL Azure.
The Shopping Cart Data Model
The shopping cart can be
persisted across sessions, and if the user is a registered logged-in
user, you can associate this account with their user ID. Figure 1 represents a typical data model for a shopping cart.
In figure 1,
the shopping cart is represented as two tables (ShoppingCart and
ShoppingCartItems). The ShoppingCart table represents the shopping cart
for each user, and each item in the shopping cart is stored in the
ShoppingCartItems table. For each item in the cart, the ShirtId,
MaterialId, and SizeId are stored, and the appropriate foreign-key
relationships between tables are established.
If the website user is a
registered user, the UserId would be stored in the ShoppingCart table;
if the user is unregistered, you could use the session ID.
Note
Although this data model represents a shopping cart, you could model an orders table using the same data structure.
Retrieving the Data
Because SQL Azure is a relational database, you could query across multiple tables by using JOIN clauses in your SQL statements, if required. For example, to return all items in the shopping cart for UserId 12345, including the shirt name, size, and material, you could issue the following SQL query:
SELECT sci.ShoppingCartItemId,
s.ShirtName,
s.Description,
s.Price,
sz.SizeId,
sz.SizeName,
m.MaterialId,
m.MaterialName
FROM ShoppingCart sc
JOIN ShoppingCartItems sci ON sci.ShoppingCartId = sc.ShoppingCartId
JOIN Shirts s ON s.ShirtId = sci.ShirtId
JOIN SizeTypes sz ON sz.SizeId = sci.SizeId
JOIN Materials m ON m.MaterialId = sci.MaterialId
WHERE sc.UserId = '12345'
Because the shopping cart
table will never hold a large amount of data, and the static data is
held in an in-memory cache, there’s no need to return the static data as
part of your SQL query. You can, therefore, use the following query
instead:
SELECT sci.ShoppingCartItemId,
s.ShirtName,
s.Description,
s.Price,
sci.SizeId,
sci.MaterialId,
FROM ShoppingCart sc
JOIN ShoppingCartItems sci
ON sci.ShoppingCartId = sc.ShoppingCartId
JOIN Shirts s ON s.ShirtId = sci.ShirtId
WHERE sc.UserId = '12345'
Not returning the
static data (SizeTypes and Materials) both improves the performance and
reduces the complexity of the query by reducing the number of joins.
Once the data is returned from the database, you can combine it with the in-memory cached data using LINQ to Objects.
Note
This strategy is
perfectly acceptable when dealing with static reference data because the
cached data tends to be very small. This would not be an acceptable
strategy when dealing with millions of rows of data.
2. Partitioning the SQL Azure shopping cart
The data model in figure 14.5
looks like it’s only usable for single-server databases, but this isn’t
strictly the case. That model will easily scale out, as we’ll explain
in a second.
One of the issues with using
SQL Azure is that currently there’s no built-in method of partitioning
data across multiple servers. To avoid bottlenecking your application on a single database, you need to partition (or shard) your data across multiple servers in your application layer.
Splitting the Data Model Across Multiple Servers
As we said, the data model in figure 14.5
will work across multiple servers—you just have to be a little smart
about how you do this. Because the shopping cart tables have no
dependencies on any other part of the application, you can separate
those tables into their own separate database.
In the application
layer, you could therefore separate the shopping cart’s data-access
layer methods and connection strings into their own layer. By keeping
the functionality logically separated, you can split your application
across multiple databases if required. In the Hawaiian Shirt Shop
example, you could easily maintain separate databases for the shopping
cart, orders, customers, products, and static data, if required.
Partitioning Data Further
In your web application, you
might get to the point where the shopping cart database is bottlenecking
and you need to partition the data further. This isn’t as difficult as
it sounds.
Because the shopping cart
data is only used by a single user and you never query across multiple
users, you can easily partition out the data by user. For example, you
might have 100,000 registered users who maintain a shopping cart. If you
currently have a single database and wish to split it into two or more
databases, you could use a partitioning function in your application:
If (userId.ToString()[0] < 'N')
{
// Use Connection String for Database 1
}
else
{
// Use Connection String for Database 2
}
In the preceding example, if the first character of the userId begins with N
or later, they would use the second shopping cart database; otherwise,
they’d use the first database. You could break this down into further
partitions if required.
As
stated earlier, this type of partitioning can only be done in the
application layer at present. It’s planned that in future releases of
SQL Azure there’ll be some partitioning built-in on the server side.
Let’s transition over to how we would implement the same concept, but using the Azure Table service instead.
3. Representing the shopping cart’s static data in the Table service
In the previous section, we
looked at how you could store the shopping cart in SQL Azure and how you
could scale it out horizontally if necessary. Although it’s possible to
scale out a SQL Azure database, it still requires you to add some
manual partitioning code
at the application layer. We’ll now look at how you could represent the
shopping cart table in the Table service, and at how you can use the
built-in partitioning model to scale out your tables.
Due to the architecture
of the Table service, there’s no facility to perform a server-side join
between a Shirts table and the ShoppingCart table. This effectively
leaves you two options when you have data that resides on two different
tables:
For now, we’ll ignore the
duplicate data option (we’ll cover that in the next section), and we’ll
focus on client-side data joining. But before we look at joining the
data on the client side, let’s take a peek at how you could represent
the ShoppingCart table in the Table service.
Shoppingcart Entity
In the SQL Azure
implementation of our shopping cart data model, there were two tables
(ShoppingCart and ShoppingCartItems). Because the Table service isn’t a
relational database and doesn’t support joining between tables, you can
represent the shopping cart as a single table (ShoppingCart). Within the
ShoppingCart table you can store the entity as follows:
public class ShoppingCartItem : TableServiceEntity
{
public string Shirt {get;set;}
public string Material { get; set; }
public string Size { get; set; }
}
Consider the definition of the ShoppingCartItem
entity. Because both the material and size data are cached in memory,
you can simply store a reference to the data (the row key for the
material and size) and then perform a client-side join between the
shopping cart entity and the cached versions of the Material and
SizeType entities. Because the cached data is a small set of static
reference data, and it’s being joined to a small set of shopping cart
data, this technique is appropriate.
To
join these two sets of data, you can define a new entity that will
represent a strong version of your shopping cart item, as follows:
public class StrongShoppingCartItem
{
public Shirt SelectedShirt { get; set; }
public SizeType Size { get; set; }
public Material Material { get; set; }
}
As you can see, this code
represents the shopping cart item with a reference to each entity rather
than the using an ID reference.
Now that you have the stronger version of the entity, you need to populate it, like this:
var materials = (IEnumerable<Material>)Cache["materials"];
var sizeTypes = (IEnumerable<SizeType>)Cache["sizeTypes"];
var shoppingCartItemContext = new ShoppingCartItemContext ();
var shoppingCartItems =
shoppingCartItemContext.ShoppingCartItem.ToList();
var q = from shoppingCartItem in shoppingCartItems
join sizeType in sizeTypes
on shoppingCartItem.Size.RowKey equals sizeType.RowKey
join material in materials
on shoppingCartItem.Material.RowKey equals material.RowKey
select new StrongShoppingCartItem
{
SelectedShirt = new Shirt(),
Material = material,
Size = sizeType
};
The cool thing about the
preceding query is that because the size type and materials are cached,
you don’t need to make any extra table service calls to get the
reference data. This is all performed in-memory using LINQ to Objects.
Warning
This technique is super cool
for small datasets. Make sure you check your performance on large
datasets, such as when you have millions of rows, because it may not
meet your needs.
The
previous example improved performance and saved money by joining static
data. Although this works well for static data, it doesn’t work so well
for nonstatic data—dynamic data or infrequently changing data.