Identifying Tracked Changes
After Change Tracking is
enabled for a table, any data modification statements that affect rows
in the table cause Change Tracking information for each modified row to
be recorded. To query for the rows that have changed and to obtain
information about the changes, you can use the built-in Change Tracking
functions.
Unless you enabled the TRACK_COLUMNS_UPDATED
option, only the values of the primary key column are recorded with the
change information to allow you to identify the rows that have been
changed. To identify the changed rows, use the CHANGETABLE (CHANGES ...) Change Tracking function. The CHANGETABLE (CHANGES ...) function takes two parameters: the first is the table name, and the second is the last synchronization version number.
If you pass 0
for the last synchronization version parameter, you get a list of all
the rows that have been modified since version 0, which means all the
changes to the table since first enabling Change Tracking. Typically,
however, you do not want all the rows that have changed from the
beginning of Change Tracking, but only those rows that have changed
since the last time you retrieved the changed rows.
Rather than having to keep track of the version numbers, you can use the CHANGE_TRACKING_CURRENT_VERSION()
function to obtain the current version that will be used the next time
you query for changes. The version returned represents the version of
the last committed transaction.
Before an application can
obtain changes for the first time, the application must first execute a
query to obtain the initial data from the table and a query to retrieve
the initial synchronization version using CHANGE_TRACKING_CURRENT_VERSION() function. The version number that is retrieved is passed to the CHANGETABLE(CHANGES ...) function the next time it is invoked.
The following example illustrates how to obtain the initial synchronization version and initial data set:
USE AdventureWorks2008R2
Go
declare @synchronization_version bigint
Select change_tracking_version = CHANGE_TRACKING_CURRENT_VERSION();
-- Obtain initial data set.
select CustomerID, TerritoryID, @synchronization_version as version
from MyCustomer
where CustomerID <= 5
go
change_tracking_version
-----------------------
0
CustomerID TerritoryID
----------- -----------
1 1
2 1
3 4
4 4
5 4
As you can see, because no updates have been performed since Change Tracking was enabled, the initial version is 0.
Now let’s perform some updates on these rows to effect some changes:
update MyCustomer
set TerritoryID = 5
where CustomerID = 4
update MyCustomer
set TerritoryID = 4
where CustomerID = 5
Now you can use the CHANGETABLE(CHANGES ...) function to find the rows that have changed since the last version (0):
declare @last_synchronization_version bigint
set @last_synchronization_version = 0
SELECT
CT.CustomerID as CustID, CT.SYS_CHANGE_OPERATION,
CT.SYS_CHANGE_COLUMNS, CT.SYS_CHANGE_CONTEXT
FROM
CHANGETABLE(CHANGES MyCustomer, @last_synchronization_version) AS CT
Go
CustID SYS_CHANGE_OPERATION SYS_CHANGE_COLUMNS SYS_CHANGE_CONTEXT
------ -------------------- ------------------ ------------------
4 U 0x0000000004000000 NULL
5 U 0x0000000004000000 NULL
You can see in these
results that this query returns the CustomerIDs of the two rows that
were changed. However, most applications also want the data from these
rows as well. To return the data, you can join the results from CHANGETABLE(CHANGES ...) with the data in the user table. For example, the following query joins with the MyCustomer table to obtain the values for the PersonID, StoredID, and TerritoryID columns. Note that the query uses an OUTER JOIN
to make sure that the change information is returned for any rows that
may have been deleted from the user table. Also, at the same time you
are retrieving the data rows, you also want to retrieve the current
version as well to use the next time the application comes back to
retrieve the latest changes:
declare @last_synchronization_version bigint
set @last_synchronization_version = 0
select current_version = CHANGE_TRACKING_CURRENT_VERSION()
SELECT
CT.CustomerID as CustID,
C.PersonID,
C.StoreID,
C.TerritoryID,
CT.SYS_CHANGE_OPERATION,
CT.SYS_CHANGE_COLUMNS, CT.SYS_CHANGE_CONTEXT
FROM
MyCustomer C
RIGHT OUTER JOIN
CHANGETABLE(CHANGES MyCustomer, @last_synchronization_version) AS CT
on C.CustomerID = CT.CustomerID
go
current_version
--------------------
2
CustID PersonID StoreID TerritoryID
SYS_CHANGE_OPERATION SYS_CHANGE_COLUMNS SYS_CHANGE_CONTEXT
----------- ----------- ----------- -----------
-------------------- ------------------ -------------------
4 NULL 932 5
U 0x0000000004000000 NULL
5 NULL 1026 4
U 0x0000000004000000 NULL
You can see in the output
from this query that the current version is now 2. The next time the
application issues a query to identify the rows that have been changed
since this query, it will pass the value of 2 as the @last_synchronization_version to the CHANGETABLE(CHANGES ...) function.
Caution
The version number is NOT
specific to a table or user session. The Change Tracking version number
is maintained across the entire database for all users and change
tracked tables. Whenever a data modification is performed by any user on
any table that has Change Tracking enabled, the version number is
incremented.
For example, immediately
after running an update on change tracked table A in the current
application and incrementing the version to 3, another application could
run an update on change tracked table B and increment the version to 4,
and so on. This is why you should always capture the current version
number whenever you are retrieving the latest set of changes from the
change tracked tables.
If an application has not
synchronized with the database in a while, the stored version number
could no longer be valid if the Change Tracking retention period has
expired for any row modifications that have occurred since that version.
To validate the version number, you can use the CHANGE_TRACKING_MIN_VALID_VERSION() function. This function returns the minimum valid version that a client can have and still obtain valid results from CHANGETABLE().
Your client applications should check the last synchronization version
obtained against the value returned by this function and if the last
synchronization version is less than the version returned by this
function, that version is invalid. The client application has to
reinitialize all the data rows from the table. The following T-SQL code
snippet can be used to validate the last_synchronization_version:
-- Check individual table.
IF (@last_synchronization_version <
CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('MyCustomer')))
BEGIN
-- Handle invalid version and do not enumerate changes.
-- Client must be reinitialized.
END
Identifying Changed Columns
In addition to information
about which rows were changed and the operation that caused the change
(insert, update, or delete—reported as I, U, or D in the SYS_CHANGE_OPERATION), the CHANGETABLE(CHANGES ...) function also provides information on which columns were modified if you enabled the TRACK_COLUMNS_UPDATED
option. You can use this information to determine whether any action is
needed in your client application based on which columns changed.
To identify whether a specific column has changed, you can use the CHANGE_TRACKING_IS_COLUMN_IN_MASK (column_id, change_columns) function. This function interprets the SYS_CHANGE_COLUMNS bitmap value returned by the CHANGETABLE(CHANGES ...) function and returns a 1 if the column was modified or 0 if it was not:
declare @last_synchronization_version bigint
set @last_synchronization_version = 0
SELECT
CT.CustomerID as CustID,
TerritoryChanged = CHANGE_TRACKING_IS_COLUMN_IN_MASK
(COLUMNPROPERTY(OBJECT_ID('MyCustomer'),
'TerritoryID', 'ColumnId'),
CT.SYS_CHANGE_COLUMNS),
CT.SYS_CHANGE_OPERATION,
CT.SYS_CHANGE_COLUMNS
FROM
CHANGETABLE(CHANGES MyCustomer, @last_synchronization_version) AS CT
go
CustID TerritoryChanged SYS_CHANGE_OPERATION SYS_CHANGE_COLUMNS
----------- ---------------- -------------------- ------------------
4 1 U 0x0000000004000000
5 1 U 0x0000000004000000
In the query results, you can see that both update operations (SYS_CHANGE_OPERATION = 'U') modified the TerritoryID column (TerritoryChanged = 1).
Change Tracking Overhead
Although Change Tracking has
been optimized to minimize the performance overhead on DML operations,
it is important to know that there are some performance overhead and
space requirements within the application databases when implementing
Change Tracking.
The performance
overhead associated with using Change Tracking on a table is similar to
the index maintenance overhead incurred for insert, update, and delete
operations. For each row changed by a DML operation, a row is added to
the internal Change Tracking table. The amount of overhead incurred
depends on various factors, such as
The number of primary key columns
The amount of data being changed in the user table row
The number of operations being performed in a transaction
Whether column Change Tracking is enabled
Change Tracking also
consumes some space in the databases where it is enabled as well. Change
Tracking data is stored in the following types of internal tables:
These internal tables affect storage requirements in the following ways:
For each change to
each row in the user table, a row is added to the internal change table.
This row has a small fixed overhead plus a variable overhead equal to
the size of the primary key columns. The row can contain optional
context information set by an application. In addition, if column
tracking is enabled, each changed column requires an additional 4 bytes
per row in the tracking table.
For each committed transaction, a row is added to an internal transaction table.
If you are concerned
about the space usage requirements of the internal Change Tracking
tables, you can determine the space they use by executing the sp_spaceused stored procedure. The internal transaction table is called sys.syscommittab. The names of the internal change tables for each table are in the form change_tracking_object_id. The following example returns the size of the internal transaction table and internal change table for the MyCustomer table:
exec sp_spaceused 'sys.syscommittab'
declare @tablename varchar(128)
set @tablename = 'sys.change_tracking_'
+ CONVERT(varchar(16), object_id('MyCustomer'))
exec sp_spaceused @tablename