programming4us
programming4us
DATABASE

SQL Server 2005 : Extending Your Database System with Data Mining - Data Mining Applied (part 1)

2/13/2012 5:17:18 PM
SSAS data mining is no less programmable than OLAP, and although SSAS data mining can sometimes be more straightforward than OLAP development, for the most part the two are comparable in complexity.

Then again, data mining development is different from OLAP programming. To start with, data mining queries are typically not drilldown-oriented, making Excel PivotTables far less compelling as a front-end development tool. On the other hand, the model content browsers available in Visual Studio and SQL Server Management Studio are available to developers as Windows Forms controls, and building on that visualization programmability, the data mining team at Microsoft has also included sample code that implements a few of those browsers as ASP.NET server controls. Browsing models from within your applications using these controls is important and merits discussion.

Whether it be ADO MD.NET, AMO, or XMLA used programmatically or from Management Studio, virtually everything you’ve learned in the context of OLAP cubes works with SSAS data mining models. This is true for both client-side development and managed code running on the server. In fact, you can easily create stored procedures that execute DMX prediction queries and return result sets to the client, and you can do so without using the AdomdClient library and without granting the assembly Unrestricted permissions.

We will provide a few applied examples of using model content browser controls and AdomdClient code in both Windows Forms and ASP.NET 2.0 environments. We’ll also show you how easy it is to create a server-side stored procedure that executes a DMX prediction query and returns the results, and an XMLA query that does likewise. We’ll finish up by showing you how to take advantage of the direct support for data mining prediction queries provided by SQL Server Reporting Services.

Data Mining and API Programming

As we just said, ADO MD.NET provides full support for data mining. This is true on the metadata side as well as the query side. For example, the AdomdConnection object has both a MiningStructures and a MiningModels collection. Each member of the MiningStructures collection has its own MiningModels collection as well. The MiningModel object also has a Parent property pointing back to the MiningStructure object that contains it. This flexibility in the ADO MD.NET object model allows your code to browse the models in a database as a flat database-wide collection or through the hierarchical collection of structures and models. The MiningModel object also contains an Algorithm property and an Attributes collection, allowing you to determine programmatically the input and predictable columns in a model and the algorithmic context in which they are used.

Using the WinForms Model Content Browser Controls

One really good application of all of this metadata is to provide an interface where users can select a model and display its contents in the appropriate content browser control, in effect providing much of the same functionality that SQL Server Management Studio provides via its Object Explorer window and the Browse shortcut menu command, but without requiring users to install and use Management Studio itself.

You can get to the content browser controls in a couple of ways: if you have Management Studio installed on your machine, the controls are already there—you simply set a reference to the Microsoft.AnalysisServices.Viewers.DLL assembly, which by default is installed in the C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\PrivateAssemblies folder. If your development machine does not have Management Studio installed, you can download the controls from the SQL Server 2005 “feature pack” page at microsoft.com.

More Info

At the time of this book’s publication, the SQL Server 2005 feature pack page was located at http://www.microsoft.com/downloads/details.aspx?familyid=d09c1d60-a13c-4479-9b91-9e8b9d835cdc&displaylang=en. An Internet shortcut to this URL, which will save you the trouble of typing it in manually, is provided with the sample code. If the link is broken when you attempt to navigate to it, try performing a Web search on “Feature Pack for Microsoft SQL Server 2005” or “Analysis Services 2005 Data Mining Viewer Controls.”

Each component of the feature pack is available as a separately downloadable file; the content model viewer controls require first downloading and installing the Analysis Services 9.0 OLE DB Provider, which is also part of the feature pack.


After installing the controls, if necessary, you might want to add them to the Visual Studio Toolbox. To do so, use the Browse button in the Choose Toolbox Items dialog box in Visual Studio and select the controls’ assembly. Each of the constituent controls will be selected, and you can just click OK to add them to the Toolbox.

Using the controls is a snap; simply drag one onto a form and then set its ConnectionString and MiningModelName properties through the Visual Studio Properties window or in code (making sure the algorithm of the model you specify matches the viewer type you selected from the Toolbox). The controls use OLE DB, so make sure the ConnectionString value you assign includes the clause “Provider=MSOLAP.3;” in addition to Data Source and Initial Catalog clauses. Then all you do is call the LoadViewerData method on the control, passing null as the required parameter value. The following three lines of code show how to do this, assuming that objViewer is an instance of one of the viewer controls, that strConnectionString contains an ADO MD.NET–compliant connection string (specifying Data Source and Initial Catalog) as its value, and that cboMiningModels is a ComboBox control containing a list of mining models on the server, with one having been selected by the user:

objViewer.ConnectionString = "Provider=MSOLAP.3;" + strConnectionString;
objViewer.MiningModelName = cboMiningModels.Text;
objViewer.LoadViewerData(null);

The Chapter20WinClient project includes a multiple-document interface (MDI) child form called frmModelViewer (file name DMMViewerDemo.cs) that allows the user to specify a server, database, and model to view and then displays the model’s content in the appropriate viewer control type. Rather than using a control created at design time, before it executes the code shown previously, the code in the form declares objViewer as type Microsoft.AnalysisServices.Viewers.MiningModelViewerControl (the class from which each viewer control inherits) and then assigns a new instance of the appropriate algorithm-specific viewer type to that variable, based on the Algorithm property of the selected MiningModel object. It then adds the object to the form’s Controls collection and sizes, positions, and anchors it within the form.

The UI of the form lets the user click a Show Databases button after specifying a server, the Click event handler for which opens an AdomdConnection to the server and retrieves a Schema DataSet containing all of the catalogs (databases) on the server, which it uses to populate a ComboBox control. Here’s a simplified version of that code, in which txtServer is a TextBox control containing the server name, and cboCatalogs is the target ComboBox control:

AdomdConnection conn = new AdomdConnection("Data Source=" +
txtServer.Text);
conn.Open();
foreach (DataRow dr in conn.GetSchemaDataSet(AdomdSchemaGuid.Catalogs,
null).Tables[0].Rows)
{
cboCatalogs.Items.Add((string)dr[0]);
}
conn.Close();

Notice that only a Data Source clause, and not an Initial Catalog clause, is supplied in the connection string. We need server-level information only, so this is sufficient.

When the user makes a selection from cboCatalogs, the code fetches a list of, and populates a ComboBox control with, names of all the models belonging to the selected database. Here’s a simplified version of that code (with cboMiningModels as the target ComboBox):

strConnectionString = "Data Source=" + txtServer.Text + ";Initial Catalog="
+ cboCatalogs.Text;
AdomdConnection conn = new AdomdConnection(strConnectionString);
conn.Open();
foreach (MiningModel mm in conn.MiningModels)
{
cboMiningModels.Items.Add(mm.Name);
}
conn.Close();

As soon as a model is selected from cboMiningModels, the code that instantiates a viewer control and connects it to a specific model can be executed.

Executing Prediction Queries with ADO MD.NET

The programming techniques we have explored so far are fairly straightforward, but executing prediction queries is even easier. You can use the ADO MD.NET AdomdCommand and AdomdDataAdapter objects to execute MDX queries and return flattened result sets in the form of an AdomdDataReader or a DataSet/DataTable. Similarly, you can use them to execute DMX prediction queries and return their result sets in tabular form. The following code, a simplified version of the code in the Chapter20WinClient project’s form frmDMXAdHoc (file name DMXAdHoc.cs), executes a DMX query, retrieves the results as a DataTable, and then binds the DataTable to dgvResults (a DataGridView control):

AdomdConnection conn = new AdomdConnection("Data Source=" + txtServer.Text
+ ";Initial Catalog=" + cboCatalogs.Text);
AdomdCommand cmd = new AdomdCommand(txtQuery.Text, conn);
AdomdDataAdapter da = new AdomdDataAdapter(cmd);
DataTable dt = new DataTable();
da.Fill(dt);
dgvResults.DataSource = dt;

In the code, txtServer and cboCatalogs are equivalent to their namesakes in the form frmModelViewer (discussed previously), and txtQuery is a multiline TextBox control containing the DMX query.

Note that frmDMXAdHoc is functionally equivalent to Management Studio’s DMX window. This means that almost all of the DMX prediction queries in the sample code’s Management Studio project can be run from this form (and even the comments can be left in the query text). One exception is that the code will not properly handle result sets with nested columns (such as the one in Cluster5ProbabilityDist.dmx); be sure to use the FLATTENED keyword to avoid this problem (as was done in Cluster5ProbabilityDistFlattened.dmx).

Model Content Queries

Other DMX query types can be handled by the ADO MD code used in the viewer in frmDMXAdHoc. For example, each of the DML queries in the sample code’s Management Studio project can be executed in the form. However, none of these returns any data, and it would be more reasonable to run them using the AdomdCommand object’s ExecuteNonQuery method. Finally, model content queries (which we briefly discussed earlier in the context of DMX templates) can be executed as well. For example, the following query returns textual information about the CustomerProfitCategory_CL clustering model:

SELECT NODE_CAPTION, NODE_DESCRIPTION, NODE_PROBABILITY
FROM [CustomerProfitCategory_CL].CONTENT

This query can be run from a DMX Query window in Management Studio or from frmDMXAdHoc in Chapter20WinClient. (Try it and see.) The content data in text form is much less elegant than the visualization of that data, but its programmatic use is intriguing. Your application could, for example, query the model in advance, learn of certain patterns by parsing the NODE_DESCRPTION column, and then execute predictive business logic upon entry of new data.

ADO MD.NET and ASP.NET

Most of what we’ve covered so far in the context of Windows Forms development can be implemented similarly in ASP.NET. The Web site Chapter20WebClient, which is included with the sample code’s Visual Studio solution, shows an example of how to code an ASP.NET equivalent of Chapter20WinClient. The model content viewers must be handled in a different way from their Windows Forms counterparts (as we will discuss shortly). However, the execution of DMX queries is handled in much the same way as shown previously.

For example, you can populate a GridView control called gvResults with the non-hierarchical results of a DMX query, as follows:

AdomdConnection conn = new AdomdConnection("Data Source=" + txtServer.Text
+ ";Initial Catalog=" + ddlCatalogs.Text);
AdomdCommand cmd = new AdomdCommand(txtQuery.Text, conn);
AdomdDataAdapter da = new AdomdDataAdapter(cmd);
DataTable dt = new DataTable();
da.Fill(dt);
gvResults.DataSource = dt;
gvResults.DataBind();

In this snippet (which is a simplified version of the code on the DMXAdHoc.aspx page in the Chapter20WebClient application), ddlCatalogs is an ASP.NET DropDownList control containing a list of the databases on the server whose name is specified in the TextBox control txtServer. This code is strikingly similar to its Windows Forms equivalent and is just as versatile at handling DMX queries.

Using the Data Mining Web Controls

It’s good to know that DMX querying can be handled much the same way in an ASP.NET application and in Windows Forms, but what about the visualization of model content? The content viewer controls are designed for the Windows Forms environment, so we can’t use them in an ASP.NET application. However, Microsoft provides sample .NET code that implements dynamic Hypertext Markup Language (DHTML) versions of the Cluster, Naïve Bayes, and Decision Trees viewers as ASP.NET server controls. Specifically, the Cluster/Attribute Characteristics and Cluster/Attribute Discrimination tabs of the Cluster and Naïve Bayes viewers and the Decision Tree tab of the Tree Viewer are implemented. This means that only certain models can be browsed and only certain views of those models are possible.

Although they provide only a subset of the functionality supplied by the Windows Forms controls, these ASP.NET components are still extremely helpful, especially given the reach of the Internet and the Web. However, these controls are provided as sample code only and are therefore a little harder to use in your applications than their Windows Forms equivalents.

You can obtain the source code for the controls (in the form of a C# class library project) by installing the Microsoft SQL Server 2005 Samples. (The setup package for these samples is available in the Documentation and Tutorials\Samples submenu of the SQL Server 2005 main menu group on the Windows Start menu. The installer places the Visual Studio source code for the controls, by default, in the folder C:\Program Files\Microsoft SQL Server\90\Samples\Analysis Services\DataMining\Data Mining Web Controls. If you open the WebControls.sln solution file in that folder and build the project, you can reference the generated assembly in an ASP.NET Web site in Visual Studio and add the three viewer controls to the Visual Studio Toolbox window.

You can use the controls by placing them on a Web form and setting their Server, Database, and Model properties, each of which is string-typed, and store or return the name of an SSAS server, database, and mining model whose content you want to display. As an alternative to assigning values to the Server and DatabaseOleDbConnection object, connect it to a specific SSAS database, and then assign it to the controls’ Connection property. properties, you can create an

The Tree Viewer has a ViewType property that you must set to one of two enumerated constants: Tree or NodeDetails. If you set it to Tree, you must also set the TargetTree property to the name of a predictable column contained in the mining model. If you set it to NodeDetails, you must set the TargetNode property to a specific node in the decision tree. The Naïve Bayes and Cluster viewer controls both feature a ViewerMode property that also must be set to one of two enumerated constants: Discrimination or Characteristics, depending on the functionality of the tab of the Windows Forms viewers you want to duplicate.


Developing Managed Stored Procedures

Applying your OLAP development skills to data mining doesn’t end with ADO MD.NET on the client; you can use the AdomdServer library to execute server-side DMX prediction or content queries and return the results. For example, Listing 1 provides all the code necessary to implement a server-side version of the prediction query in the LeastLikelyCar.dmx Management Studio script covered earlier.

Listing 1. Using AdomdServer for DMX queries
using System;
using System.Data;
using Microsoft.AnalysisServices.AdomdServer;

namespace Chapter20ServerAssembly
{
public class StoredProcedures
{
public DataTable GetData()
{
DataTable dt = new System.Data.DataTable();
dt.Columns.Add("FirstName", typeof(string));
dt.Columns.Add("LastName", typeof(string));
dt.Columns.Add("Phone", typeof(string));
dt.Columns.Add("ProbNotCarOwner", typeof(double));

object[] row = new object[4];

AdomdCommand comContext = new AdomdCommand("SELECT TOP 250 t.FirstName,
t.LastName, t.Phone, PredictProbability([Is Car Owner],'N') AS ProbNotCarOwner
FROM CarOwner_DT PREDICTION JOIN OPENQUERY([Adventure Works DW],
'SELECT c.FirstName,c.LastName,tc.* FROM vCustomerProfitability tc INNER JOIN
dimCustomer c ON tc.CustomerKey = c.CustomerKey') AS t ON t.Gender = CarOwner_DT.Gender
AND t.IncomeGroup = CarOwner_dt.[Income Group] AND t.MaritalStatus =
CarOwner_dt.[Marital Status] AND t.Region = CarOwner_dt.Region ORDER BY
PredictProbability([Is Car Owner],'N') DESC");
try
{
AdomdDataReader dr = comContext.ExecuteReader();
while (dr.Read())
{
row[0] = dr[0];
row[1] = dr[1];
row[2] = dr[2];
row[3] = dr[3];
dt.Rows.Add(row);
}
dr.Close();
return dt;
}
catch
{
return null;
}
}
}
}


Although it is not possible to execute an MDX query using AdomdServer, it is permissible to execute a DMX query. In the case of this stored procedure, we simply create the AdomdServer.AdomdCommand object comContext, passing the DMX query to its constructor.

We make sure to type the function as returning a DataTable, and we populate a local DataTable variable with the contents of an AdomdDataReader. The latter is obtained by calling the comContext object’s ExecuteReader method.

The code in Listing 1 is taken from the Class Library project Chapter20ServerAssembly in the sample code’s Visual Studio solution. When deploying the assembly, you should assign it ExternalAccess permissions; assigning it Unrestricted permissions is not necessary, and assigning it Safe permissions will prevent the DMX query from executing and returning data properly.

Once deployed, the stored procedure can be called using the following DMX code:

CALL
Chapter20ServerAssembly.Chapter20ServerAssembly.StoredProcedures.GetData()

This query can be executed in a SQL Server Management Studio DMX Query window (it is contained in the Management Studio project script ManagedCode.dmx) or in the Chapter20WinClient and Chapter20WebClient applications.

Keep in mind that your stored procedures can also execute DDL queries using DMX (by using the AdomdCommand object’s ExecuteNonQuery method) or perform administrative or DDL tasks using AMO. The AMO Database object contains a MiningStructures collection, each member of which contains its own MiningModels collection. The AMO MiningStructure and MiningModel objects have an array of properties, methods, and child objects that are useful for performing any number of administrative or DDL tasks against your mining structures and models.

Other  
 
video
 
Video tutorials
- How To Install Windows 8

- How To Install Windows Server 2012

- How To Install Windows Server 2012 On VirtualBox

- How To Disable Windows 8 Metro UI

- How To Install Windows Store Apps From Windows 8 Classic Desktop

- How To Disable Windows Update in Windows 8

- How To Disable Windows 8 Metro UI

- How To Add Widgets To Windows 8 Lock Screen

- How to create your first Swimlane Diagram or Cross-Functional Flowchart Diagram by using Microsoft Visio 2010
programming4us programming4us
programming4us
 
 
programming4us