If you have worked with replication, you know that
prior planning and design are important. First-time replicators often
think that having the same data in two places and needing to keep that
data in sync is an easy task. But consider just two examples of the
unexpected complexity involved in replication.
SQL Server provides a capability called triggers; a change in TableX automatically causes a change to occur in TableY. But SQL Server CE does not support triggers. If TableX and TableY are replicated down to a SQL Server CE database, a change in TableX does not cause an automatic change to TableY. Thus, the same statement executed on the “same” table, TableX, produces different results in TableY, and the two copies of TableY are not in sync.
If
you replicate a table that has a foreign key, you need to replicate the
primary key table as well to validate the rows inserted into the
foreign key table. Perhaps you wish to replicate only some of the rows
in the foreign key table—say, only the rows for delivery route West92.
You want to replicate only those rows from the primary key table (the Products table) that are relevant to delivery route West92. But delivery route code probably is not a column in the Products table, so how do you specify which Products
table rows should be replicated? (As we will see later in this chapter,
you can accomplish this by publishing all the tables involved in
enforced relationships or by adding a horizontal filter that extends
outward to include related rows from other tables.)
At first, such
considerations may make you wish for the perceived simplicity of RDA.
But after you examine some key differences between Merge Replication and
RDA, you will realize that Merge Replication is quite attractive. Merge
Replication requires less code than RDA. Merge Replication provides the
ability to have the server control much of the application logic,
instead of having the logic distributed to the device. For example,
Ranged Identity columns and dynamic horizontal partitions are two key
areas where the logic resides on the server and not on the device. This
can substantially reduce the amount of code on the device and allow
application maintenance to be performed on the server rather than on the
device.
Any
replication schema must be well planned and well designed, including
the ones you plan to use with your SQL Server CE applications.
Therefore, we take a moment to review the basic concepts of SQL Server’s
Merge Replication and then follow up with a detailed examination of
design considerations.
In Merge Replication, as it is used with a SQL Server CE application, the SQL Server is referred to as the publisher with one or more defined publications on one or more databases, and the SQL Server CE databases are referred to as the subscribers. Publications are made up of a defined set of tables, columns, and filters. These chosen tables are referred to as the articles of a publication; and the definition of a subset, such as “only columnA, columnB, columnC of TableX” or “only those rows of TableY where route code equals 'West92'” are referred to as filters.
Filters allow you to replicate a vertical subset of a table (such as
the first example in the previous sentence) or replicate a horizontal
subset of a table (such as the second example) or both. They also can be
dynamic, such as “only rows in the Orders table where EmployeeID equals the employee ID of the user performing the synchronization”.
With this flexibility
in publication design comes the need to ensure that changes made to
this subset of the data residing on the device will synchronize
correctly when propagated back to the server.
1. Using Good Design to Avoid Synchronization Failures
Most mobile applications that have a Merge Replication component consist of three repetitively executed steps.
1. | Subscribe to a publication located at the server and synchronize the data to the device.
|
2. | Make updates to that data as it resides on the device.
|
3. | Synchronize with the server to submit the changes made locally back to the server database.
|
All mobile application
designers hope that step 3 succeeds so that modifications are applied at
the server without errors and that those modifications do not overwrite
any modifications made by other users. Many mobile application
designers end up wondering why step 3 failed. The purpose of this
subsection is to explain why publication design decisions can cause
synchronization failures and to provide a blueprint on how to avoid
those failures.
A
mobile application that uses Merge Replication for the transfer of data
between SQL Server and SQL Server CE does so by subscribing to a
publication that resides on the server. Since SQL Server CE provides
only for anonymous subscriptions, the entire publication must be
subscribed to. But mobile applications usually do not want to work with
large amounts of data; rather, they want just the data that applies to
the subject at hand and to the user of the device. They almost always
want less than the entire server database. This means that almost all
publications are defined as subsets of the database and therefore that
all mobile applications work with a subset of the database.
Working with a subset of data always creates the possibility of data
integrity problems occurring in the subset that will not be detected
until the data is synchronized with the full set of data residing at the
server. For example, the Northwind database that comes with SQL Server
contains two tables that are related by a foreign key constraint: Products, which contains the foreign key, and Categories, which contains the primary key. Now suppose that you define a publication that includes the Products table but not the Categories table. When you synchronize, the Products table is brought to the device. That table has a CategoryID
column both on the server and on the device. At the server, the list of
possible values that can be entered into that column is known; by
definition it is the list of values in the Categories table’s primary key column. But on the device, that list is not known. The local application cannot insert rows into the Products table or update the CategoryID
field in rows that are already there because the application does not
know what values will be accepted or rejected when the data is
synchronized to the server.
Not all developers
understand that each decision made during the design of the publication
results in a specific and definable limitation that must be placed on
the application for local data modifications to synchronize successfully
at the server. As we look at publication design decisions and the
resulting restrictions that those decisions place on your local
application, let us be clear about something: The design of a
publication does not place automatic restrictions on your application;
rather, it creates situations that force you to design limitations into
your application as you develop it.
To illustrate this
“choice and impact” world in which you must develop applications based
on Merge Replication, we develop a sample publication and mobile
application design. For our server database, we pick the Northwind
database. Our Merge Replication–based mobile application synchronizes
order data for one employee with the Northwind database. In the
publication definition, a row filter specifies that only order
information for the employee who is performing the synchronization
should be delivered to the device. Our definition of “order information”
is illustrated by the SQL Server data diagram in Figure 1.
Thus, our publication includes the employee row, all orders related to
that employee, all customers and order details connected to those
orders, and all products for the order details. We included only those
columns we determined to be essential to our application.
Our publication (and
therefore the data delivered to the device) is a four-way subset of the
host database; it contains some of the tables, some of the columns,
some of the rows, and some of the constraints of the Northwind database.
Each of these reductions in content places limitations on the
modifications that the application can make to that data on the device.
For the sake of illustration, the employee named Nancy Davolio is performing the synchronization, unless otherwise stated.
Table 1 relates publication design decisions with their impacts on the application.
Table 1. Impacts of Publication Design
Type of Subset | Impact on Local Application | Example |
---|
Not all columns of a table are included | We
cannot insert rows unless the database at the server contains default
value definitions or permits nulls for the missing columns
Deleting rows will result in the loss of unknown information | The sample application should not add new customers or delete existing customers |
Not all rows of a table are included | Inserted rows must adhere to the row filter specified by the publication | The
sample application should not add a new employee or any orders for that
new employee. The new rows would successfully update to the server, but
they would disappear from the device during the next synchronization
because they would not be Nancy Davolio’s data |
A table that has a foreign key is included, but the table containing the matching primary key is not | New rows cannot be inserted into the table, nor can the foreign key column be updated | The sample application should not insert rows into the ProductsCategoryID table or update its column because the domain of allowable values is not known |
A table that has a primary key column is included, but the table with the corresponding foreign key column is not | The primary key column cannot be updated (which isn’t allowed anyway). Rows cannot be deleted from the table | The sample application should not delete a customer because that customer may have data in the CustomerCustomerDemo table on the server |
Some
rows are not unique to a specific synchronization (i.e., they will be
propagated to more than one device). This is an inevitable result of a
many-to-many relationship, such as employees and customers. For
instance, a customer might have placed orders with several employees | The
table whose rows can synchronize to more than one device cannot be
updated unless ownership of rows and columns can be assigned to a single
subscriber. Rows cannot be deleted from the table
Updating the row on more than one device will result in
conflicts during synchronization. The last update synchronized will be
the winner | The sample application should not update customer information (however, if the Customers table had a Responsible-Employee foreign key column, the sample application could permit only that employee to make modifications) |
A column that has the IDENTITY property is included in the publication | SQL
Server’s Ranged Identity capability must be specified in the
publication definition. The application should not try to solve this
problem | If, for example, an Identity column were added to the Orders
table on the server, Ranged Identity would assign a range of possible
identity values to each subscriber, ensuring that two devices did not
give different orders the same identifying number |
Derived (calculated) columns are included in the publication definition | These
columns normally should not be updated. Their values are derived from
other values in the database. They can be updated only if all their
source values are included in the publication | The sample application should not recalculate Product.UnitsInStock
based on subscriber activity. Any such updates would reflect only the
activity of the local user. During synchronization, each subscriber’s
value would overwrite the previous subscriber’s value. All values would
be wrong; the last subscriber’s value would persist and propagate |
A recursive relationship on a table with a row filter specified is included in the publication | The table will not synchronize to the device | When Nancy Davolio synchronizes, her row is the only employee row brought to the device. In the Reports-To
column of that table is the employee ID of her manager. The foreign key
constraint mandates that her manager’s row be on the device before her
row can be inserted. But her manager’s row is not on her device because
it was filtered out by the publication |
Many of the restrictions in Table 1
sound like nothing more than common sense, but they are grounded on the
basic principles of relational algebra. These restrictions are not
arbitrary or optional; they cannot be circumvented by clever application
code or expensive hardware. Common sense is not always common. When you
are involved in a design-time discussion, it’s nice to know that your
recommendations are based on solid ground, especially when those of
others are not.
Having studied the
restrictions that will be placed on our application by our publication,
we decide that the restrictions are acceptable. It is time now to
configure the publication.