Programming Microsoft SQL Server 2005: Using the Data Mining Wizard and Data Mining Designer (part 2) - Creating a Mining Model

1/14/2011 5:31:14 PM

Creating a Mining Model

To create the initial mining model, follow these steps:

Right-click the Mining Structures node in Solution Explorer and choose New Mining Structure. This starts the Mining Structure Wizard.

On the Welcome page, click Next.

On the Select the Definition Method page, select From Existing Relational Database or Data Warehouse and then click Next.

On the Select the Data Mining Technique page, select a data mining technique for your initial mining model, which you must configure when you create a mining structure. After the mining structure is created, you can add additional mining models to that structure. Select Microsoft Clustering from the drop-down list and then click Next.

On the Select Data Source View page, select the DSV that you created earlier and then click Next.

On the Specify Table Types page, you will see vCustomerProfitability from your DSV along with check boxes for specifying the table type. Select the Case check box next to vCustomerProfitability. Click Next.

Tables for Mining Models

SSAS has two types of tables for use in mining models: case tables and nested tables. Every mining structure must have a case table. Simply put, the case table defines the entities to be analyzed for your mining model. In the customer profitability analysis, the basic unit of analysis is the customer; therefore, we will define vCustomerProfitability, which has one row per customer, as the case table. Nested tables are used to provide additional detail about your cases. Not every mining structure will have a nested table. For now, our mining structure will have only a case table.

On the Specify The Training Data page, you will see a list of all the columns in your case table. To the right of each column is a set of three check boxes for specifying whether each column is a Key, Input, or Predictable column in your model, as shown in Figure 2.

Figure 2. The Data Mining Wizard’s Specify The Training Data page

The Key column uniquely identifies each record in your case table. Input columns form the basis on which data pattern discoveries are made, whether they be descriptive or predictive patterns. For example, the clustering algorithm creates clusters based on the values of the input columns. The Decision Trees algorithm decides tree splits based on how well the values of an input column predict a particular outcome.

Predictive and Descriptive Mining Models

From a technical standpoint, marking a column as predictable means that it can be selected from the model in a DMX prediction query. This is true of the predictable columns in any mining model, even those that do not have prediction as a goal. This definition doesn’t really convey what predictable columns are for. For a common-sense definition, we need to distinguish between predictive and descriptive mining models.

Predictive mining models are designed to forecast (predict) the value of their predictable columns. For descriptive mining models, predicted columns have a technical usage, and that usage varies. In a clustering model, input columns are used to determine the clusters. Selecting only the Predict check box for a column means that the column will not be used to determine the clusters; however, including the column as a predictable column in the model allows you to view its distribution within each cluster and compare its distribution across clusters after the model has been processed.

Back on the Specify the Training Data page, select CustomerKey as the Key column of your case table and select the Input check box for the following columns: Age, CommuteDistance, EnglishEducation, Gender, HasKidsAtHome, IncomeGroup, IsCarOwner, IsHomeOwner, IsNewCustomer, MaritalStatus, NumProdGroup, RecencyGroup, and Region.

Select the Predict check box for the ProfitCategory column, and then click Next.

On the Specify Columns’ Data Type and Content page, click Detect.

This forces SSAS to determine whether your input and predictable columns are discrete (categorical) or continuous. You can also modify the content and data type by clicking in the cell and selecting the type from the drop-down list that appears, as shown in Figure 3.

Figure 3. Content and data types of columns are specified at the mining structure level in the Data Mining Wizard.

Verify that the content type is continuous for the Age column, key for the CustomerKey column, and discrete for all other columns. Click Next.

Name the mining structure CustomerProfitCategory and the mining model CustomerProfitCategory_CL and then click Finish.

This brings you to the Mining Structure tab of the data mining structure designer, as shown in Figure 4.

                                       Figure 4. The data mining structure designer’s Mining Structure tab shows the columns in the mining structure.

The left pane of the tab contains a treeview display of our mining structure. From here, you can add and delete columns and nested tables by using the tab’s toolbar or by right-clicking anywhere in the treeview pane (on the left) and selecting options from the context menu. The Data Source View pane (on the right) allows you to explore and edit the DSV.
Top 10
SG50 Ferrari F12berlinetta : Prancing Horse for Lion City's 50th
The latest Audi TT : New angles for TT
Era of million-dollar luxury cars
Game Review : Hearthstone - Blackrock Mountain
Game Review : Battlefield Hardline
Google Chromecast
Keyboards for Apple iPad Air 2 (part 3) - Logitech Ultrathin Keyboard Cover for iPad Air 2
Keyboards for Apple iPad Air 2 (part 2) - Zagg Slim Book for iPad Air 2
Keyboards for Apple iPad Air 2 (part 1) - Belkin Qode Ultimate Pro Keyboard Case for iPad Air 2
Michael Kors Designs Stylish Tech Products for Women
- First look: Apple Watch

- 3 Tips for Maintaining Your Cell Phone Battery (part 1)

- 3 Tips for Maintaining Your Cell Phone Battery (part 2)
Popular Tags
Video Tutorail Microsoft Access Microsoft Excel Microsoft OneNote Microsoft PowerPoint Microsoft Project Microsoft Visio Microsoft Word Active Directory Exchange Server Sharepoint Sql Server Windows Server 2008 Windows Server 2012 Windows 7 Windows 8 Adobe Flash Professional Dreamweaver Adobe Illustrator Adobe Photoshop CorelDRAW X5 CorelDraw 10 windows Phone 7 windows Phone 8 Iphone