Thus far, our discussion has been limited to transactions
on a single database. What if more than one database is involved? What
if more than one database server is involved? What if a nondatabase
operation, such as modifying an in-memory cache, is involved? Can you
use BEGIN TRANSACTION to bind other such operations within a single transaction? Unfortunately, you cannot. BEGIN TRANSACTION
works only on local transactions dealing with data in a database. These
transactions do not apply to an in-memory cache, because no transaction
logging mechanism is available. In this section, we’ll look at a deeper theory of transactions and explore the Transaction Management API in the .NET Framework System.Transactions
namespace. You’ll also see why a transaction that is inherently
expensive due to its overhead becomes even more expensive when it is
being managed by an external entity—a transaction coordinator. This
discussion addresses a scope broader than database-only transactions. 1. Distributed Transaction TerminologyYou will frequently encounter the terms resource manager, transaction manager, transaction coordinator, and two-phase commit when discussing distributed transactions. Let’s look at those terms more closely. Transactions
(database or otherwise) manage a resource. Any operation that needs to
be made transactional is managed by a logical entity—a subroutine, a
function, a dynamic-link library (DLL), an executable, a machine, or
anything else that is capable of supporting transactions. Any such
logical entity that is eventually responsible for managing the resource
in a transactional manner is called a resource manager (RM). Thus,
an RM has the ability and responsibility to enlist itself in a current
running transaction and thereby supports transactional capabilities. Transaction Manager or Transaction CoordinatorIf
you have an RM that manages its resources in a transactional manner by
enlisting in a current running transaction, by definition you need an
external entity that manages the transaction itself. This external
entity is responsible for listening to and coordinating between several
RMs that are all enlisted within the same transaction. It acknowledges
requests for new transactions and listens for and sends notifications
in the event of success and failure. This entity is referred to as a transaction manager (TM), transaction coordinator (TC), or distributed transaction coordinator
(DTC). Two common transaction coordinators that ship with Microsoft
Windows are the Lightweight Transaction Manager (LTM) and the Microsoft
Distributed Transaction Coordinator (MS DTC). Do
note that a TC is not necessarily a DTC. In fact, if the RM itself has
transactional capabilities built in, it might not need a TC at all. For
instance, in the case of SQL Server, if a transaction is limited to a
single database, SQL Server is fully capable of managing the
transaction on its own. Thus, for local transactions,
SQL Server chooses not to consult the MS DTC. There are good reasons
for this, which will become evident once you read about the typical
implementation of a distributed transaction—namely, the two-phase commit process. A
distributed transaction can be implemented in a number of ways. One of
the most common ways is through the two-phase commit process. Here is
the typical flow of a two-phase transaction involving two RMs and a DTC: The
transaction initiator requests a transaction from the DTC. This
transaction initiator can be the application that interacts with the
two RMs, or it can be one of the RMs itself. The
transaction initiator requests that the RMs to do their work as a part
of the same transaction. The RMs register themselves with the DTC as a
part of the same transaction, thus expressing an interest in receiving
notifications about the success or failure of the transaction as a
whole. This process is referred to as “enlisting within a transaction.” The
RMs go ahead and do their work and notify the DTC of a success, while
keeping a rollback mechanism in place. This is the first phase of a
two-phase commit process, also called the prepare phase. Once
the DTC receives a success notification for the prepare phases from
each of the enlisted RMs, the DTC issues a notification to go ahead and
make the changes permanent. Upon receiving such a notification, all RMs
make their changes permanent by committing their transient states. This
is also known as the commit phase. The system has now gone from one stable state to another, and the distributed transaction is complete.
2. Rules and Methods of EnlistmentAs
you might have surmised, the DTC on a machine running Windows, MS DTC,
engages in a lot of chatty communication with the various RMs involved
in a transaction. Due to the network roundtrips involved, this chatting
affects the performance
of the application in general and might also be blocked by a firewall.
In addition, RMs that enlist themselves in a distributed transaction
often use the serializable isolation level. This architecture is the
easiest to implement because when using the serializable isolation
level, you have a perfect transaction—no dirty reads, no phantom reads,
and no nonrepeatable reads. Of course, the downside is a serious
performance impact. But
SQL Server itself is capable of managing transactions, so why should it
have to escalate the isolation level to serializable in every
circumstance? After all, depending on your logic, you might want to
take advantage of an MS DTC–based transaction if and only if
your transaction ends up involving more than one RM. But as long as
only one database connection is involved, you shouldn’t have to pay the
extra cost of involving the DTC. As it turns out, the Microsoft
engineers thought of this situation. And to rectify the situation, an
RM can enlist within a transaction in different ways. An
RM that deals with resources that are volatile (not permanent) is a
good candidate for volatile enlistment. Typically, in a volatile
enlistment scenario, if the RM cannot perform the second (commit) phase
of a distributed
transaction for any reason, it doesn’t explicitly need to recover the
first (prepare) phase. This means that if an RM crashes in the middle
of a transaction, the RM doesn’t need to provide an explicit recovery
contract to the TC. Volatile enlistment doesn’t need the implementation
of MS DTC, so it is usually managed by the LTM,
which is a much lighter weight TC designed to work with volatile
enlistment scenarios. An example of such an RM might be one that
manages an in-memory cache. The cache data isn’t meant to be
permanent—it only lasts for a relatively short duration. Durable
enlistment is necessary if the RM has permanent (durable) data that
depends on the transaction for consistency. A good example is a
transaction that involves disk I/O. Say you are writing to a file on
disk. If the transaction fails, or if the RM crashes, the file that was
written as a part of the prepare phase will need to be deleted. Thus
the RM will need to prepare a transaction log and record the history of
changes since the transaction was begun. In the event of a requested
recovery, the RM needs sufficient information to perform a graceful
rollback. NoteSQL
Server’s FILESTREAM feature implements a transparent coordination
between database transactions and NTFS file system transactions.
Promotable Single-Phase EnlistmentIn
many situations, the nature of a transaction can change as new RMs
continue to enlist. For instance, a SQL Server database is perfectly
capable of managing a transaction on its own, as long as the
transaction is limited to one database. Or say, for instance, that an
RM that manages an in-memory cache doesn’t need the implementation of
MS DTC because the cache by nature is temporary anyway. But if there is
a transaction containing an in-memory cache RM or a SQL Server
connection being managed by the LTM, and a second SQL Server connection
enlists itself in the same transaction, the transaction will be
promoted to MS DTC because the RMs are no longer capable of managing
the transaction on their own. It
is
important to note that along with the promotion comes what are
sometimes considered (necessary) disadvantages of MS DTC—a higher
isolation level and a more expensive and chatty transaction in general.
Promotable single-phase enlistment (PSPE) offers a huge advantage in
that as long as you don’t really need MS DTC, you don’t use it, so you
don’t pay the penalty for it. (However, when you truly need MS DTC,
it’s a godsend.) There
are well-defined rules for the promotion of a transaction from LTM to
MS DTC. A transaction is escalated from LTM to MS DTC if any of the
following happens: A durable resource that doesn’t support single-phase notifications is enlisted in the transaction. Two durable resources that support single-phase notification enlist in the same transaction. The TC receives a request to marshal a transaction to a different .NET AppDomain or Windows process.
NoteSQL Server 2000 connections are always promoted to MS DTC, and SQL Common Language Runtime (CLR) connections inside a System.Transactions.TransactionScope are promoted to MS DTC even if only one of them is enlisted in the transaction scope. With
a good theory and a common terminology in hand, we can look at the
support for distributed transactions in SQL Server and the .NET
Framework in general.
|