Worker
roles can receive the messages they need to process in either a push or
a pull way. Pushing a message to the worker instance is an active
approach, where you’re directly giving it work to do. The alternative is
to have the role instances call out to some shared source to gather
work to do, in a sense pulling in the messages they need. When pulling
messages in, remember that several instances may be pulling in work at
the same time. You’ll need a mechanism like the one the Azure Queue
service provides to keep the different worker role instances from
processing the same work.
Keep in mind the
difference between roles and role instances, which we covered earlier.
Although it’s sometimes convenient to think of a worker as a single
entity, a role never runs as a single unit; what runs is one or more
instances of that role. When you’re designing and developing your worker roles, keep
this duality in mind. Think of the role as a unit of deployment and
management, and the role instance as the unit of work assignment. This
will help reduce the number of problems in your architecture.
One advantage that worker
roles have over web roles is that they can have as many service
endpoints as they like, using almost any transport protocol and port.
Web roles are limited to HTTP/S and can have two endpoints at most.
We’ll use the worker role’s flexibility to provide several ways to send
it messages.
We’ll cover three approaches to sending messages to a worker role instance:
A pull model, where each worker role instance polls a queue for work to be completed
A push model, where a producer outside Azure sends messages to the worker role instance
A push model, where a producer inside the Azure application sends messages to the worker role instance
Let’s look first at the pull model.
1. Consuming messages from a queue
The most common way for a worker role to receive messages is through a queue.
The general model is to have a while
loop that never quits. This approach is so common that the standard
worker role template in Visual Studio provides one for you. The role
instance tries to get a new message from the queue it’s polling on each
iteration of the loop. If it gets a message, it’ll process the message.
If it doesn’t, it’ll wait a period of time (perhaps 5 seconds) and then
poll the queue again.
The core of the loop calls
the business code. Once the loop has a message, it passes the message
off to the code that does the work. Once that work is done, the message
is deleted from the queue, and the loop polls the queue again.
while (true)
{
    CloudQueueMessage msg = queue.GetMessage();
    if (msg != null)
    {
        DoWorkHere(msg);
        queue.DeleteMessage(msg);
    }
    else
    {
        Thread.Sleep(5000);
    }
}
You
might jump to the conclusion that you could easily poll an Azure Table
for work instead of polling a queue. Perhaps you have a property in your
table called Status that defaults to new. The worker role could poll the table, looking for all entities whose Status property equals new. Once a list is returned, the worker could process each entity and set its Status to complete. At first glance, this sounds like a simple approach.
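In code, this naive table-polling approach might look something like the following sketch, which uses the v1 StorageClient library. The WorkEntity class, its Status property, the table name, and ProcessEntity are all illustrative assumptions, not part of the original text:

```csharp
// An illustrative table entity with the Status property described above.
public class WorkEntity : TableServiceEntity
{
    public string Status { get; set; }
}

public void PollTableForWork(CloudTableClient tableClient)
{
    TableServiceContext context = tableClient.GetDataServiceContext();

    // Query for all entities still marked "new".
    var newEntities = context.CreateQuery<WorkEntity>("WorkItems")
        .Where(e => e.Status == "new")
        .ToList();

    foreach (WorkEntity entity in newEntities)
    {
        ProcessEntity(entity);          // hypothetical business code
        entity.Status = "complete";
        context.UpdateObject(entity);
    }

    context.SaveChanges();
}
```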
Unfortunately, this
approach is a red herring. It suffers from some severe drawbacks that
you might not find until you’re in testing or production because they
won’t show up until you have multiple instances of your role running.
The first problem is concurrency. If you have multiple instances of your worker role polling a
table, they could each retrieve the same entities in their queries.
This would result in those entities being processed multiple times,
possibly leading to status updates getting entangled. This is the exact
concurrency problem the Azure Queue service was designed to avoid.
The other, more important,
issue is one of recoverability and durability. You want your system to
be able to recover if there’s a problem processing a particular entity.
Perhaps you have each worker role set the status
property to the name of the instance to track that the entity is being
worked on by a particular instance. When the work is completed, the
instance would then set the status property to done.
On the surface, this approach seems to make sense. The flaw is that
when an instance fails during processing (which will happen), the entity
will never be recovered and processed. It’ll remain flagged with the
instance name of the worker processing the item, so it’ll never be
cleared and will never be picked up in the query of the table to be
processed. It will, in effect, be “hung.” The system administrator would
have to go in and manually reset the status property back to new. There isn’t a way for the entity to be recovered from a failure and be reassigned to another instance.
It would take a fair amount
of code to overcome the issues of polling a table with multiple
consumers, and you’d end up rebuilding the Azure Queue service. The
Queue service is designed to play this role, and it removes the need to
write all of this dirty plumbing code.
The Queue service provides a way for work to be distributed among
multiple worker instances, and to easily recover that work if the
instance fails. A key concept of cloud architecture is to design for
failure recoverability in an application. It’s to be expected that nodes
go down (for one reason or another) and will be restarted and
recovered, possibly on a completely different server.
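To make that recoverability concrete, here’s a sketch of how the queue’s visibility timeout handles a failed consumer. The 30-second timeout and the DoWorkHere method are illustrative assumptions:

```csharp
// Ask for a message and make it invisible to other consumers
// for 30 seconds while we work on it.
CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromSeconds(30));
if (msg != null)
{
    DoWorkHere(msg);

    // Deleting the message marks the work as complete. If this
    // instance crashes before the delete, the message becomes
    // visible again when the timeout expires, and another
    // instance will pick it up — no manual recovery needed.
    queue.DeleteMessage(msg);
}
```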
Queues are the easiest way to
get messages into a worker role. Next, though, we’ll look at exposing a
service endpoint, which lets a worker role receive messages from outside of Azure.
2. Exposing a service to the outside world
Web roles are built to
receive traffic from outside of Azure. Their whole point in life is to
receive messages from the internet (usually from a browser) and respond
with some message (usually HTML). The great thing is that when you have
multiple web role instances, they’re automatically enrolled in a load
balancer. This load balancer automatically distributes the load across
the different instances you have running.
Worker
roles can do much the same thing, but because you aren’t running in IIS
(which isn’t available on a worker role), you have to host the service
yourself. The only real option is to build the service as a WCF service.
Our goal is to convert our
little string-reversal method into a WCF service, and then expose that
externally so that customers can call the service. The first step is to
remove the loop that polls the queue and put in some service plumbing.
When you host a service in a worker role, regardless of whether it is
for external or internal use, you need to declare an endpoint. How you
configure this endpoint will determine whether it allows traffic from
sources internal or external to the application. The two types of
endpoints are shown in figure 1.
If it’s configured to run externally, it will use the Azure load
balancers and distribute service calls across all of the role instances
running the service, much as the web role does. We’ll look at
internal service endpoints in the next section.
The next step in the process is
to define the endpoint. You can do this the macho way in the
configuration of the role, or you can do it in the Visual Studio
Properties window. If you right-click on the Worker-Process String
worker role in the Azure project and choose Properties, you’ll see the
window in figure 2.
Name the service endpoint StringReverserService and set it to be an input endpoint, using TCP on port 2202. There’s no need to use any certificates or security at this time.
After you save these settings, you’ll find the equivalent settings in the ServiceDefinition.csdef file:
<Endpoints>
<InputEndpoint name="StringReverserService" protocol="tcp" port="2202" />
</Endpoints>
You might normally host your
service in IIS or WAS, but those aren’t available in a worker role. In
the future, you might be able to use Windows Server AppFabric, but that
isn’t available yet, so you’ll have to do this the old-fashioned way.
You’ll have to host the WCF service using ServiceHost,
which is exactly what it sounds like: a host that acts as a container to
run your service in. It contains the service, manages the endpoints and
configuration, and handles the incoming service requests.
Next you need to add a method called StartStringReversalService. This method will wire up the service to the ServiceHost and the endpoint you defined. The contents of this method are shown in the following listing.
Listing 1. The StartStringReversalService method wires up the service
Listing 1
is an abbreviated version of the real method, shortened so that it fits
into the book better. We didn’t take out anything that’s super
important. We took out a series of trace
commands that let us watch the startup and status of the service. We
also abbreviated some of the error handling, which you would
definitely want to beef up in a production environment.
Most of this code is normal for setting up a ServiceHost. You first have to tell the service host the type of the service that’s going to be hosted. In this case, it’s the ReverseStringTools type.
When you go to add the service
endpoint to the service host, you’re going to need three things, the
ABCs of WCF: address, binding, and contract. The contract is provided by
your code, IReverseString, and it’s a
class file that you can reference to share service contract information
(or use MEX like a normal web service). The binding is a normal TCP
binary binding, with all security turned off. (We would only run with
security off for debug and demo purposes!)
Then the address is needed. You
can set up the address by referencing the service endpoint from the
Azure project. You won’t know the real IP address the service will be
running under until runtime, so you’ll have to build it on the fly by
accessing the RoleEnvironment.CurrentRoleInstance.InstanceEndpoints collection.
The collection is a dictionary, so you can pull out the endpoint you
want to reference with the name you used when setting it up—in this
case, StringReverserService. Once you have a reference to the endpoint, you can access the IP address that you need to set up the service host.
After you have that wired up,
you can start the service host. This will plug in all the components,
fire them up, and start listening for incoming messages. This is done
with the Open method.
Once the service is up, you’ll
want the main execution thread to sleep forever so that the host stays
up and running. If you didn’t include the sleep loop, execution would
fall out of the method, and you’d lose your context, taking the service
host down with it. At this point, the worker role instance is sitting
there, sleeping, while the service host is running, listening for and
responding to messages.
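Based on the description above, the body of StartStringReversalService looks roughly like the following sketch. The address format and the use of NetTcpBinding are assumptions consistent with the text, not the verbatim listing:

```csharp
private void StartStringReversalService()
{
    // Tell the host which service implementation to run.
    var host = new ServiceHost(typeof(ReverseStringTools));

    // Look up the endpoint we declared, by name, to get the
    // IP address and port assigned at runtime.
    var externalEndpoint =
        RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["StringReverserService"];

    // A plain TCP binding with security turned off (debug/demo only).
    var binding = new NetTcpBinding(SecurityMode.None);

    // The ABCs of WCF: address, binding, and contract.
    host.AddServiceEndpoint(
        typeof(IReverseString),
        binding,
        String.Format("net.tcp://{0}/StringReverserService",
            externalEndpoint.IPEndpoint));

    // Plug everything in and start listening for messages.
    host.Open();

    // Sleep forever so the method never returns and the host stays up.
    while (true)
    {
        Thread.Sleep(TimeSpan.FromSeconds(30));
    }
}
```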
We wired up a simple WPF test client, as shown in figure 3, to see if our service is working. There are several ways you could write this test harness. If you’re using .NET 4, it’s
very common to use unit tests to test your services instead of an
interactive WPF client. Your other option would be to use
WCFTestClient.exe, which comes with Visual Studio.
Exposing public service
endpoints is useful, but there are times when you’ll want to expose
services for just your use, and you don’t want them made public. In this
case, you’ll want to use inter-role communication, which we’ll look at
next.
3. Inter-role communication
Exposing service input
endpoints, as we just discussed, can be useful. But many times, you just
need a way to communicate between your role instances. Usually you
could use a queue, but at times there might be a need for direct
communication, either for performance reasons or because the process is
synchronous in nature.
You can enable
communication directly from one role instance to another, but there are
some issues you should be aware of first. The biggest issue is that
you’ll have direct access to an individual role instance, which means
there’s no separation that can deal with load balancing. Similarly, if
you’re communicating with an instance and it goes down, your work is
lost. You’ll have to write code to handle this possibility on the client
side.
To set up inter-role communication, you need to add an internal endpoint
in the same way you add an input endpoint, but in this case you’ll set
the type to Internal (instead of Input), as shown in figure 4. The port will automatically be set to dynamic and will be managed for you under the covers by Azure.
Using an internal endpoint is a
lot like using an external endpoint, from the point of view of your
service. Either way, your service doesn’t know about any other instances
running the service in parallel. The load balancing is handled outside
of your code when you’re using an external endpoint, and internal
endpoints don’t have any available load balancing. This places the
choice of which service instance to consume on the shoulders of the
service consumer itself.
Most
of the work involved with internal endpoints is handled on the client
side, your service consumer. Because there can be a varying number of
instances of your service running at any time, you have to be prepared
to decide which instance to talk to, if not all of them. You also have
to be wily enough to not call yourself if calling the service from a
sibling worker role instance.
You can access the set of instances running, and their exposed internal endpoints, with the RoleEnvironment static class:
foreach (var instance in RoleEnvironment.CurrentRoleInstance.Role.Instances)
{
    // Compare instance IDs so we don't send a message to ourselves.
    if (instance.Id != RoleEnvironment.CurrentRoleInstance.Id)
    {
        SendMessage(instance.InstanceEndpoints["MyServiceEndpointName"]);
    }
}
The preceding sample code
loops through all of the available instances of the current role.
Because that collection includes the instance the code is running in,
the code checks each instance to see whether it is itself. If it
isn’t, the code sends that instance a message. If it is, the code
skips it, because sending a message to
oneself is usually not productive.
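The SendMessage helper used in that loop isn’t shown in the text. A minimal sketch using a WCF ChannelFactory might look like this; reusing the IReverseString contract and its ReverseString operation here is an assumption for illustration:

```csharp
private void SendMessage(RoleInstanceEndpoint endpoint)
{
    // Internal endpoints aren't load balanced, so we address
    // one specific instance directly by its runtime IP and port.
    var binding = new NetTcpBinding(SecurityMode.None);
    var address = new EndpointAddress(
        String.Format("net.tcp://{0}", endpoint.IPEndpoint));

    var factory = new ChannelFactory<IReverseString>(binding, address);
    IReverseString proxy = factory.CreateChannel();
    try
    {
        proxy.ReverseString("hello from a sibling instance");
        ((IClientChannel)proxy).Close();
    }
    catch (CommunicationException)
    {
        // The target instance may have gone down; with no load
        // balancer in the middle, the caller must handle this itself.
        ((IClientChannel)proxy).Abort();
    }
}
```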
All three ways of
communicating with a worker role have their advantages and
disadvantages, and each has a role to play in your architecture:
Use a queue for complete separation of your instances from the service consumers.
Use input endpoints to expose your service publicly and leverage the Azure load balancer.
Use internal endpoints for direct and synchronous communication with a specific instance of your service.
Now that we’ve covered how you
can communicate with a worker role, we should probably talk about what
you’re likely to want to do with a worker role.