Using SQL Server 2005 Integration Services : Working with Integration Services Packages (part 1) - Control Flow

6/5/2012 9:18:42 AM

History of Data Transfer in SQL Server

Before we jump right into Integration Services, it is important to review the historical context for the Integration Services design goals. Ten years ago in SQL Server 6.5, one of the few ways to import data to or export data from SQL Server outside of T-SQL and Bulk Insert was through the command-line bcp utility. Although bcp did the job, it exported in a simple text format by default and was difficult to automate. To master bcp, you had to go to the command line, set many parameters, and usually set up schema (format) files. In addition, bcp provided no easy way to manipulate data going into or out of SQL Server.

When SQL Server 7.0 shipped in 1998, we were first introduced to Data Transformation Services. DTS was a robust and easy-to-use replacement for bcp built on OLE DB technology. (Of course, bcp still ships with SQL Server, as does DTS.) With DTS, you could easily import and export between SQL Server and any OLE DB data source. You could also move data from one non-SQL Server–based data source to another, so moving data from a text file to SQL Server or from Microsoft Access to Oracle was simple—all you had to do was use DTS on your local machine or remotely on a server. In addition, you could apply transformations to your data and save the functionality into one location called a DTS Package. DTS shipped with a wizard and interface that made the package creation process pretty straightforward. This opened a new door that bcp had kept nailed shut.

DTS Packages

DTS was a huge improvement over its predecessors and whatever else was available in the marketplace at the time, so millions of DTS packages were developed around the world. Developers set up packages to process sets of data in a succession of steps. SQL Server 2000 expanded on this methodology with performance and wizard improvements, but packages were still created in SQL Server Enterprise Manager. Developers now had many packages with several steps in them, but they quickly learned that logic and control flow was still far from perfect, and no form of debugging was available. Developers could make steps in a package dependent on each other and run simple ActiveX scripts as steps in a package, but other than that, they had to turn to third-party components or fancy T-SQL hacks to get the control flow that they desired.

Working with Integration Services Packages

SQL Server 2005 ships with SQL Server Integration Services. Integration Services is a replacement for DTS that extends the same metaphor of a package, but the designer has been radically changed to use Visual Studio and to focus on controlling the flow of the data, giving the developer more control over the design of the package. In this section, we introduce the concepts you must understand to work with Integration Services.

Control Flow

Control flow is the new model for representing the process embodied by the package. When you open a package in the Integration Services designer, the first thing you see is the control flow design surface. The visual representation of control flow is a flowchart. The shapes in the flowchart are called tasks. Tasks are the atomic units of execution in an Integration Services package. The lines connecting tasks are the precedence constraints, which govern the order in which tasks are executed.

In addition to tasks, the control flow can contain a special type of object called a container. You can think of containers as very tightly built-in tasks; they are internally integrated with the package execution engine and can provide some interesting features above and beyond what is possible for tasks. The Sequence container supports nesting of the control flow. When you add a Sequence container to your package, it is as if you have another package within the current package—you can add tasks and precedence constraints, all scoped within that container. In fact, the package itself is a top-level Sequence container. The For Loop container supports iterative execution of the control flow in the container until the loop expression remains true. The Foreach Loop container supports looping over a collection of objects supplied by the chosen Foreach Loop enumerator. Figure 1 shows a control flow containing a Sequence container and use of precedence constraints.

Figure 1. Containers and precedence constraints in a control flow

The control flow can also contain a few other things to help you control execution of your process. The first thing you need to do with almost any package is to define connections to be used by your tasks. Structurally, there is a Connections collection associated with the package, where you can define and manage all the connections that the package uses. When you have a complete package, which as experience shows often becomes large to handle the complex extract, transform, and load (ETL) processes that must be performed, you can appreciate the benefit of centrally controlling properties of your connections in one spot. Compare this to DTS, where DTS tasks must often be individually configured with connection information.

Variables are another important feature. For every Sequence container you have a collection of variables. You can set the data type, the initial value, and a few other properties on the variable. During execution, tasks can read and write variables. Precedence constraints that control ordering of execution of tasks can also refer to variables. Using variables, you can specify that a particular task be executed if the preceding task had succeeded and the value of a specific variable meets a certain condition.

Major Tasks

Integration Services comes with more than 20 items in the Control Flow toolbox. Let’s take a look at some of the most significant and commonly used tasks.

  • Execute SQL task Runs queries against databases. This task is very flexible and supports many features, such as parameterized queries, input and output parameters for stored procedures, and multiple ways of returning results.

  • Execute Process task Executes a program. This is helpful for launching various command-line utilities from the package.

  • Execute Package task Executes another Integration Services package, either in the same or a different process space.

  • File System task Performs various file system operations.

  • FTP task Transfers files between systems using File Transfer Protocol (FTP).

  • Script task Executes custom scripts written in Microsoft Visual Basic .NET.

  • Data Flow task Transfers and transforms data between multiple sources and destinations. This is the biggest task in the product and is the subject of the next section.

Video tutorials
- How To Install Windows 8

- How To Install Windows Server 2012

- How To Install Windows Server 2012 On VirtualBox

- How To Disable Windows 8 Metro UI

- How To Install Windows Store Apps From Windows 8 Classic Desktop

- How To Disable Windows Update in Windows 8

- How To Disable Windows 8 Metro UI

- How To Add Widgets To Windows 8 Lock Screen

- How to create your first Swimlane Diagram or Cross-Functional Flowchart Diagram by using Microsoft Visio 2010
programming4us programming4us