Federated authentication is
neither a new nor a unique concept. For instance, users of TweetPhoto do
not need to create a separate account to log in; we can instead use our
account from one of several popular social sites to log in at
TweetPhoto, even though they are all separate and distinct companies.
When the Sign in with Twitter
button is clicked, we're transferred to Twitter, and the URL contains
an authentication token in the querystring.
As an additional
step, Twitter asks the user to confirm that the partner site may
access the account, as seen in the following screenshot. This is a
very good idea when there is user interaction, but for unattended
systems this won't be possible. Fortunately, Access Control can be
preconfigured to provide access using shared keys.
As the logins of these services are joined together, they are said to be federated.
We see federation all over the Internet, where a single OpenID, Google,
Twitter, Facebook, or Windows Live login can be used to access dozens
or more additional services, or a single site can support logins from
multiple other sites. The sites where we create an account are called identity providers,
as our identity information is stored and provided by these services.
Knowing that each identity provider is queried and returns data
differently, we can appreciate the amount of work it would take to
support a number of popular sites, add new identity
providers, and keep up with changes to currently supported providers.
One distinct difference
between the Twitter example and Access Control is that a single Twitter
account can be used on multiple websites, while Access Control is
designed to integrate multiple identity providers into a single
application.
If we don't actually log in to Access Control, what is it used for? Access Control is a security token service (STS),
a trusted application that issues security tokens via a standard
interface. A security token is a small piece of text that contains
identifying information and a cryptographic signature that is used to
verify the contents of the token.
So how do identity
providers and Access Control fit together? Every identity provider
returns identity data in a different format, which our application would
need to parse correctly. That could mean a lot of work and rework as
APIs change or are added. As application developers, we use Access
Control to configure which identity providers we trust, and remap the
properties from each into a single format our application can consume.
The goal of Access Control is to federate identity providers into a
single common format that our application can understand. If we're
developing a very private application, Access Control may not be all
that interesting, and there are other means we could use (such as the
standard ASP.NET login provider). However, if we're developing a public
application (like a forum site), this is a very exciting service. As new
providers emerge or current ones change their API, we do not need to
make any changes to our application code; we only need to make
configuration changes in Access Control so that our application can
integrate with the latest popular Internet sites.
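To make the remapping idea concrete, here is a minimal sketch of what our application would otherwise have to maintain itself. The provider names and field names below are hypothetical, invented only for illustration; they are not the actual formats of any real identity provider or of Access Control.

```python
# Hypothetical per-provider field names (assumptions for illustration only).
# Each entry maps a provider-specific field to the single common name our
# application wants to consume.
PROVIDER_FIELD_MAPS = {
    "provider_a": {"screen_name": "username", "given": "first_name", "family": "last_name"},
    "provider_b": {"login": "username", "firstName": "first_name", "lastName": "last_name"},
}


def normalize_identity(provider, raw):
    """Remap a provider-specific record into one common format."""
    try:
        field_map = PROVIDER_FIELD_MAPS[provider]
    except KeyError:
        # We only accept data from providers we have explicitly configured.
        raise ValueError(f"unknown or untrusted identity provider: {provider}")
    # Keep only the fields we know how to map; ignore everything else.
    return {common: raw[source] for source, common in field_map.items() if source in raw}


from_a = normalize_identity("provider_a", {"screen_name": "jdoe", "given": "Jane", "family": "Doe"})
from_b = normalize_identity("provider_b", {"login": "jdoe", "firstName": "Jane", "lastName": "Doe", "extra": 1})
print(from_a == from_b)
```

With Access Control in place, a table like this lives in the service's configuration rather than in our code, so adding a provider does not require redeploying the application.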
The little pieces of
information about a user such as the username, first name, last name,
and so on are called "claims". An identity is the full set of claims
that represents a user. When a user accesses a site and logs in, the
user ID is a claim being made by that user. By entering an ID while
logging in to a site, the user is asserting,
"I am this identity on this site." The verification for that claim
is entering the correct password.
These days, most
sites require us to create a separate account, and each site
then stores our identity separately from the others. By
contrast, Access Control does not store any identity information. Our
service trusts the information from the identity provider because we
told Access Control to trust the claims from the provider, and our
application trusts Access Control.
Because Access Control is
RESTful, it can be used by any application, on any
platform that can consume REST data. It's important to note that Access
Control can be the only Azure service we use, and the application that
consumes Access Control tokens can be written in any language.
Even though we just said
that we don't log in to Access Control, Access Control does support
simple symmetric key logins, which can be used as a rudimentary user
ID/password system; however, it is not the main purpose behind Access
Control and its capability for authentication services is very limited.
Authentication versus authorization
We've mentioned the
terms authentication and authorization, and it's worth discussing the
difference between the two. In a claims-based identity model like Access
Control, authentication and authorization are separated from each other
and the rest of the application code.
Authentication establishes the
identity of the user. This can be as simple as a username/password, or
as secure as a retina scan. Once the user is authenticated, our
application can determine what actions the user is then authorized to
perform.
In Access Control, we do not
configure authentication; we configure trust relationships with the
identity providers. We do configure authorization rules (discussed
later), which our application consumes and respects.
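The separation can be sketched as follows: by the time our application sees a set of claims, authentication has already happened elsewhere, and the application only decides what the user may do. The claim type `action` and its comma-separated values are assumptions made for this sketch, not names defined by Access Control.

```python
def is_authorized(claims, required_action):
    """Decide whether an already-authenticated user may perform an action.

    `claims` is a dict of claim type -> value. The "action" claim type is a
    hypothetical name chosen for this example.
    """
    allowed = claims.get("action", "")
    permitted = [value.strip() for value in allowed.split(",") if value.strip()]
    return required_action in permitted


claims = {"username": "jdoe", "action": "read,post"}
print(is_authorized(claims, "post"))
print(is_authorized(claims, "delete"))
```

Nothing in this function knows how the user proved who they are; that is the point of keeping the two concerns apart.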
Basics of Access Control configuration
When it comes to
configuring Access Control, there is some good news and some bad news.
First, the good news: one day, Access Control may be a very useful
service for a considerable number of applications, whether or not these
applications are hosted on Azure. Now the bad news: at the time of
writing, only symmetric key and Active Directory Federation Services 2.0
(ADFS) are supported. The long-term goal is to support every major
identity provider, but we're not there yet. At the rate Microsoft is
developing Azure, it may not be too far, but no promises have been made.
More good news:
all configuration changes can be accomplished through a REST interface.
More bad news: at the time of writing, there were no online tools; all
configuration changes are handled through local tools that wrap the REST
requests. One tool is a command-line utility in the SDK called ACM.EXE; the
other is a sample called ACMBROWSER.
The lowest level of
configuration in Access Control is a rule. Each rule specifies something
about a supported identity provider, such as the name, ID, algorithm,
key, or which actions users from this provider can perform. Rules are
not configured or used on their own, but always as part of a RuleSet.
Alongside a RuleSet, we also configure a Token Policy,
which sets the timeout for security tokens issued by Access Control.
Together, the RuleSet and Token Policy form the Scope. A service (such
as a web role) can have multiple scopes, and the scopes associated with a
service are collectively known as the Service Policy.
A Service Policy is applied to a Service Namespace, which organizes the
rules for a resource and segregates transactions on the billing
statement (allowing us to use one AppFabric account for many
applications).
Currently, only one RuleSet
is allowed per scope, but each service can have multiple scopes. Also,
at the time of writing, a RuleSet cannot be shared across different
scopes; even if they contain the same information, the RuleSet must be
created for each scope.
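The relationships among these pieces can be pictured with a small data model. This is only a sketch of the hierarchy described above, not the Access Control configuration API: the class and field names (such as `applies_to` and `token_timeout_seconds`) are assumptions, and for brevity the Service Policy is modeled simply as the list of scopes held by a namespace.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    name: str
    description: str = ""


@dataclass
class RuleSet:
    name: str
    rules: List[Rule] = field(default_factory=list)


@dataclass
class TokenPolicy:
    name: str
    token_timeout_seconds: int  # how long issued tokens remain valid


@dataclass
class Scope:
    applies_to: str            # hypothetical: the resource this scope covers
    rule_set: RuleSet          # exactly one RuleSet per scope
    token_policy: TokenPolicy


@dataclass
class ServiceNamespace:
    name: str
    service_policy: List[Scope] = field(default_factory=list)  # all scopes together

    def add_scope(self, scope):
        # A RuleSet cannot be shared across scopes, even with identical contents.
        if any(existing.rule_set is scope.rule_set for existing in self.service_policy):
            raise ValueError("a RuleSet cannot be shared; create one per scope")
        self.service_policy.append(scope)


policy = TokenPolicy("default", token_timeout_seconds=3600)
namespace = ServiceNamespace("myforum")
namespace.add_scope(Scope("http://localhost/forum", RuleSet("forum-rules", [Rule("allow-post")]), policy))
namespace.add_scope(Scope("http://localhost/admin", RuleSet("admin-rules", [Rule("allow-admin")]), policy))
print(len(namespace.service_policy))
```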
Requests and Simple Web Tokens
The point of this configuration
is to be able to retrieve access tokens from Access Control, so it's
necessary to discuss the tokens themselves. Access Control returns a
type of token called a Simple Web Token (SWT).
SWT is a recent token specification developed by a group of Internet
leaders, including Microsoft and Google. SWT was developed to be
structurally and cryptographically simple, and to be compact. The small
size of an SWT means it can be easily transmitted in HTTP headers or as
part of a querystring.
SWTs are a type of access token
used in the Web Resource Authorization Protocol (WRAP), which
is itself an extension of OAuth called OAuth-WRAP, and in the upcoming
OAuth 2.0 specification.
For a little history,
OAuth was developed in part by some of the developers of OpenID. OpenID
works great for an individual to log in to many websites with a single
credential; however, with the rise of web services and APIs, a better
system was needed. OAuth was designed to allow third-party applications
to access secured resources (such as TweetPhoto being able to post a new
photo upload on the user's Twitter account). OAuth was comprehensive,
but also complicated to implement. OAuth-WRAP is a subsequent
implementation of OAuth, and served to rectify some of the complaints
about OAuth. On the other hand, OAuth 2.0 is an upcoming upgrade to
OAuth intended to simplify the protocol further.
Requests can be made for either a plaintext token or a simple web token (SWT).
All token requests sent to Access Control are made using the HTTPS protocol
and the form POST method. Request data are form-encoded, and the scope
parameter (named wrap_scope in the request) is URL-encoded as well. SWTs are signed with an HMACSHA256 signature.
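As a rough illustration, a token request can be assembled as shown below. Only `wrap_scope` is named in the text above; the `wrap_name` and `wrap_password` parameter names, and the use of the WRAPv0.9 URL as the request endpoint, are assumptions made for this sketch and should be checked against the SDK samples. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(service_namespace, scope, name, key):
    """Build (but do not send) an HTTPS form POST asking for a token.

    Assumptions: the endpoint URL pattern and the wrap_name / wrap_password
    parameter names are illustrative and not taken from the text.
    """
    url = f"https://{service_namespace}.accesscontrol.windows.net/WRAPv0.9/"
    # urlencode form-encodes the body and URL-encodes each value,
    # including the scope URI.
    body = urlencode({
        "wrap_name": name,
        "wrap_password": key,
        "wrap_scope": scope,
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_token_request("myforum", "http://localhost/forum", "client1", "not-a-real-key")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the actual call; it is left out here so the sketch runs without network access.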
The SWTs returned from
Access Control are URL-encoded and are part of a longer return string.
To use them, we must parse them from the return string and decode them.
Additionally, we should ensure the token is valid before allowing the
user to access any secured resources. Validations should include
confirming the signature is a valid HMACSHA256 signature, whether or not
the token has expired, and that the issuer and audience values match
what was requested.
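The parsing step might look like the following sketch. The response field name `wrap_access_token` is an assumption (the text only says the token is part of a longer return string), and the sample response is made up for the example.

```python
from urllib.parse import parse_qs


def extract_token(response_body, field="wrap_access_token"):
    """Pull the token out of the form-encoded response and URL-decode it once.

    The default field name is an assumption made for this sketch.
    """
    fields = parse_qs(response_body, strict_parsing=True)
    values = fields.get(field, [])
    if len(values) != 1:
        raise ValueError(f"expected exactly one {field} field in the response")
    return values[0]


# A made-up response body: the token is URL-encoded inside the outer string.
sample = ("wrap_access_token=Issuer%3Dhttps%253a%252f%252fns%26ExpiresOn%3D1"
          "%26HMACSHA256%3Dabc%253D&wrap_access_token_expires_in=3600")
token = extract_token(sample)
print(token)
```

Note that the values inside the token remain URL-encoded after this step; only the outer layer of encoding has been removed.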
A raw token has the following format:
Issuer=https://<serviceNamespace>.accesscontrol.windows.net/WRAPv0.9/&Audience=<requested appliesto>&
<claim type1>=<claim value1>,<claim value2>...<claim valueN>&ExpiresOn=<expires date>&HMACSHA256=<hmac signature>
Essentially, a token
is just a series of name/value pairs. The HMACSHA256 signature should
always be the last name/value pair in the token.
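Given that layout, the validation checks can be sketched as below. Several details are assumptions not stated in the text: that the signature is the Base64-encoded HMAC-SHA256 of everything before `&HMACSHA256=`, that it is URL-encoded inside the token, and that `ExpiresOn` is expressed in seconds since the Unix epoch. The key, issuer, and audience values are placeholders.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import parse_qsl, quote, unquote

SIGNATURE_LABEL = "&HMACSHA256="


def sign(unsigned_token, key):
    """Base64-encoded HMAC-SHA256 over the unsigned part (assumed scheme)."""
    digest = hmac.new(key, unsigned_token.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def validate_token(token, key, issuer, audience, now=None):
    """Check signature, expiry, issuer, and audience of a decoded token."""
    # The signature is expected to be the last name/value pair.
    unsigned, separator, signature = token.rpartition(SIGNATURE_LABEL)
    if not separator:
        return False
    expected = sign(unsigned, key).encode("utf-8")
    if not hmac.compare_digest(expected, unquote(signature).encode("utf-8")):
        return False
    claims = dict(parse_qsl(unsigned))  # also URL-decodes each value
    try:
        expires_on = int(claims["ExpiresOn"])  # assumed: seconds since epoch
    except (KeyError, ValueError):
        return False
    now = time.time() if now is None else now
    return (expires_on > now
            and claims.get("Issuer") == issuer
            and claims.get("Audience") == audience)


# Build a sample token locally so the sketch runs without Access Control.
key = b"placeholder-shared-key"
issuer = "https://myforum.accesscontrol.windows.net/WRAPv0.9/"
audience = "http://localhost/forum"
unsigned = (f"Issuer={quote(issuer, safe='')}&Audience={quote(audience, safe='')}"
            "&action=read,post&ExpiresOn=2000")
sample_token = unsigned + SIGNATURE_LABEL + quote(sign(unsigned, key), safe="")
print(validate_token(sample_token, key, issuer, audience, now=1000))
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information when comparing signatures, which is a reasonable default for any check of this kind.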