Creating and Managing Views in SQL Server 2008 : Partitioned Views

7/21/2011 3:15:06 PM

Partitioned views are used to access data that has been horizontally split, or partitioned, across multiple tables. These tables can be in the same or different databases—or even spread across multiple servers. Partitioning of tables is done to spread the I/O and processing load of large tables across multiple disks or servers.

You combine the tables in a partitioned view by using a UNION ALL statement that causes the data from the separate tables to appear as if they were one table. These separate tables are referred to as member tables or base tables. The member tables in a SELECT statement of the view must all be structured in the same way, and the view must adhere to the following restrictions:

All the columns from the member tables should be included in the view definition.
Columns with the same ordinal position in the SELECT list should have the same data type.
The same column cannot be used multiple times in the SELECT list.
A partitioning column that segments the data must be identified and needs to have the same ordinal position across all the member table SELECT statements.
The partitioning column cannot be a computed column, an identity, a default, or a time stamp.
The data values in the partitioning column cannot overlap in the underlying tables.
The partitioning column must be part of the primary key of the member table.
The member tables in the partitioned view need a CHECK constraint on the partitioning column.
A table can appear only once as part of the UNION ALL statement.
The member tables cannot have indexes created on computed columns in the table.
The number of columns in the member table primary key constraints should be the same.
All member tables should have the same ANSI PADDING setting when created.

The list of restrictions for creating partitioned views is extensive, but the creation of a partitioned view is relatively straightforward and intuitive. Consider, for example, the Sales.SalesOrderHeader table in the Adventureworks2008 database. This table is relatively small, but it is the type of table that could have a large number of rows and experience heavy utilization. To balance the workload against this table, you could use a partitioned view that utilizes base tables that each contain a separate year’s data. Listing 1 shows the CREATE TABLE statements to create the base tables for each year. The yearly tables are intended to hold summarized daily numbers, and each contains only a subset of the columns in the Sales.SalesOrderHeader table.

Listing 1. Creating the Base Tables for a Partitioned View

CREATE TABLE Sales.Sales_2001
(
    OrderDay datetime NOT NULL
        CHECK (OrderDay BETWEEN '20010101' AND '20011231'),
    SubTotal money NOT NULL ,
    TaxAmt money not null,
    Freight money not null,
 CONSTRAINT PK_Sales_2001_OrderDay PRIMARY KEY CLUSTERED (OrderDay ASC)
)

CREATE TABLE Sales.Sales_2002
(
    OrderDay datetime NOT NULL,
        CHECK (OrderDay BETWEEN '20020101' AND '20021231'),
    SubTotal money NOT NULL ,
    TaxAmt money not null,
    Freight money not null,
 CONSTRAINT PK_Sales_2002_OrderDay PRIMARY KEY CLUSTERED (OrderDay ASC)
)

CREATE TABLE Sales.Sales_2003
(
    OrderDay datetime NOT NULL
        CHECK (OrderDay BETWEEN '20030101' AND '20031231'),
    SubTotal money NOT NULL ,
    TaxAmt money not null,
    Freight money not null,
 CONSTRAINT PK_Sales_2003_OrderDay PRIMARY KEY CLUSTERED (OrderDay ASC)
)

CREATE TABLE Sales.Sales_2004
(
    OrderDay datetime NOT NULL
        CHECK (OrderDay BETWEEN '20040101' AND '20041231'),
    SubTotal money NOT NULL ,
    TaxAmt money not null,
    Freight money not null,
 CONSTRAINT PK_Sales_2004_OrderDay PRIMARY KEY CLUSTERED (OrderDay ASC)
)

Notice that each table has a primary key on OrderDay, the partitioning column. Also notice that a CHECK constraint is defined for each table; it ensures that only orders for the given year can be stored in the table.

To demonstrate the power of a partitioned view, it is best to populate the base tables that will be used by the view. Listing 2 contains a series of INSERT statements that select from the Sales.SalesOrderHeader table and populate the base tables. The SELECT statements summarize several key columns by day and contain a WHERE clause that limits the result to orders for the respective years.

Listing 2. Populating the Base Tables for a Partitioned View

INSERT Sales.Sales_2001
   SELECT CONVERT(VARCHAR(8),OrderDate,112),
      SUM(SubTotal), SUM(TaxAmt), SUM(Freight)
    FROM Sales.SalesOrderHeader
    WHERE OrderDate between '20010101' AND '20011231'
    GROUP BY CONVERT(VARCHAR(8),OrderDate,112)
INSERT Sales.Sales_2002
   SELECT CONVERT(VARCHAR(8),OrderDate,112),
      SUM(SubTotal), SUM(TaxAmt), SUM(Freight)
    FROM Sales.SalesOrderHeader
    WHERE OrderDate between '20020102' AND '20021231'
    GROUP BY CONVERT(VARCHAR(8),OrderDate,112)

INSERT Sales.Sales_2003
   SELECT CONVERT(VARCHAR(8),OrderDate,112),
      SUM(SubTotal), SUM(TaxAmt), SUM(Freight)
    FROM Sales.SalesOrderHeader
    WHERE OrderDate between '20030101' AND '20031231'
    GROUP BY CONVERT(VARCHAR(8),OrderDate,112)

INSERT Sales.Sales_2004
   SELECT CONVERT(VARCHAR(8),OrderDate,112),
      SUM(SubTotal), SUM(TaxAmt), SUM(Freight)
    FROM Sales.SalesOrderHeader
    WHERE OrderDate between '20040102' AND '20041231'
    GROUP BY CONVERT(VARCHAR(8),OrderDate,112)

Now that you have the populated base table, you can create a partitioned view and ensure that the view is selecting only from the base tables that it needs.

Two types of partitioned views are discussed in this article: local and distributed. A local partitioned view utilizes base tables found on the same server. A distributed partitioned view contains at least one base table that resides on a different (remote) server. The focus in the section is on local partitioned views. The T-SQL for creating a local partitioned view named Sales.vw_Sales_Daily is shown in Listing 3.

Listing 3. Creating a Local Partitioned View

Create View Sales.vw_Sales_Daily
 as
      SELECT * FROM Sales.Sales_2001
       UNION ALL
      SELECT * FROM Sales.Sales_2002
       UNION ALL
      SELECT * FROM Sales.Sales_2003
       UNION ALL
      SELECT * FROM Sales.Sales_2004

The best way to validate that a partitioned view is working properly is to run a conditional SELECT against the view and display the execution plan. If the partitioned view is functioning properly, it should be accessing only the base tables it needs to satisfy the SELECT and should not access all the tables in the view unless it needs to. The following example shows a sample SELECT against the new partitioned view:

SELECT * FROM Sales.vw_Sales_Daily
 WHERE OrderDay > '20040701'
   and SubTotal > 2000

If you execute this statement and review the actual execution plan, you see that an index seek is performed against the Sales.Sales_2004 table. This is the correct result, given that the SELECT statement is targeting order data from 2004.

Modifying Data Through a Partitioned View

You can modify data via a partitioned view if the SQL statement performing the modification meets certain conditions, as described here:

All columns in the partitioned view must be specified in the INSERT statement. Columns that include a DEFAULT constraint or allow nulls are also subject to this requirement.
The DEFAULT keyword cannot be used on inserts to partitioned views or on updates to partitioned views.
UPDATE statements cannot modify PRIMARY KEY columns if the member tables have text, ntext, or image columns.
Inserts and updates to a partitioned view are not allowed if the view contains a time stamp.
Identity columns in a partitioned view cannot be modified by an INSERT or UPDATE statement.
INSERT, UPDATE, and DELETE statements are not allowed against a partitioned view if there is a self-join with the same view or with any of the member tables in the statement.

Note

Data can be modified through partitioned views only in the Enterprise and Developer Editions of SQL Server 2008.

In addition to the conditions shown in this list, you must also satisfy any restrictions that apply to the member tables. Check constraints, foreign key constraints, and any other table-level restrictions must be accounted for in the modification statement. The user executing the modification against the partitioned view must have the appropriate INSERT, UPDATE, or DELETE permissions on the member tables for the update to succeed.

Distributed Partitioned Views

Microsoft provides distributed partitioned views (DPVs) as a primary means to scale out a database server. Scalability allows an application or a database to utilize additional resources, which allows it to perform more work. There are two kinds of scalability: scaleup and scaleout. A scaleup solution focuses on a single server scaled to provide more processing power than its predecessor. An example of scaleup would be migrating from a server with a single dual-core processor to a machine with 4-quad-core processor. Scaleout solutions include the addition of servers to augment the overall processing power.

DPVs are similar to local partitioned views, but they utilize one or more tables located on a remote server. The placement of partitioned data on remote servers allows the processing power of more than one server to be utilized. The partitioning is intended to be transparent to the application and allow for additional partitions and servers as the application’s needs scale.

The following list outlines the basic requirements for creating a DPV:

A linked server definition is added to each member server that will contain the partitioned data. The linked server contains the connection information required to run distributed queries on another member server.
The lazy schema validation option is set to true on each of the member servers, using sp_serveroption. This option is set for performance reasons and allows the query processor to skip schema checking of remote tables if the query can be satisfied on a single member server.
A DPV is created on each member server. This DPV references the local tables in addition to the tables found on the other member servers.

Listing 3 shows SQL commands that can be used to satisfy the requirements in the preceding list. The DPV created in the last portion of the script is similar to the local partitioned view created in the previous section. The key difference in this DPV example is the inclusion of a distributed query that retrieves records for Sales.Sales_2002 from a remote server. The remote server in this example is named DbSvrXP.

Listing 3. Creating a Distributed Partitioned View

Exec sp_addlinkedserver @server='dbsvrxp',
      @srvproduct='',
      @provider='MSDASQL',
      @provstr='DRIVER={SQL Server};
SERVER=dbsvrxp;UID=linklogin;PWD=pw;Initial Catalog=Adventureworks2008'
—Set the server option for improved DPV performance
exec sp_serveroption dbsvrxp, 'lazy schema validation', true

Create View Sales.vw_Sales_Daily
 as
      SELECT * FROM Sales.Sales_2001
       UNION ALL
      SELECT * FROM dbsvrxp.Adventureworks2008.Sales.Sales_2002
       UNION ALL
      SELECT * FROM Sales.Sales_2003
       UNION ALL
      SELECT * FROM Sales.Sales_2004

The DPV created in Listing 3 contains only one remote table. The example could be further expanded to have each table in the UNION clause on a different remote server. Keep in mind that the DPV CREATE statement needs to be adjusted when run on the remote server(s). The tables that are local on one server are now remote on the other server, and those that are remote can now be local.

If the DPVs are properly defined, SQL Server 2008 attempts to optimize their performance by minimizing the amount of data transferred between member servers. The query processor retrieves the CHECK constraint definitions from each member table. This allows the query processor to map the specified search arguments to the appropriate table(s). The query execution plan then accesses only the necessary tables and retrieves only the remote rows needed to complete the SQL statement.

Data can be modified through a DPV as well. Updatable DPVs, which were introduced in SQL Server 2000, are still available in SQL Server 2008. Data modifications are performed against a view, allowing true transparency. The view is accessed as if it were a base table, and the user or application is unaware of the actual location of the data. If it is configured properly, SQL Server determines via the WHERE clause specified in the update query which partition defined in the view must be updated rather than updating all tables in the join.

Note

Data can be modified through distributed partitioned views only in the Enterprise and Developer Editions of SQL Server 2008.

Other

Creating and Managing Views in SQL Server 2008 : Managing Views, Data Modifications and Views

Creating and Managing Views in SQL Server 2008 : Creating Views

Creating and Managing Views in SQL Server 2008 : Definition of Views & Using Views

Transact-SQL in SQL Server 2008 : Insert over DML

Transact-SQL in SQL Server 2008 : MERGE Statement

SQL Server 2008 : Transact-SQL Programming - The TABLESAMPLE Clause

SQL Server 2008 : Transact-SQL Programming - TRY...CATCH Logic for Error Handling

SQL Server 2008 : Transact-SQL Programming - The APPLY Operator

SQL Server 2008 : Transact-SQL Programming - PIVOT and UNPIVOT

SQL Server 2008 : Transact-SQL Programming - Ranking Functions