How to query prediction results from a data mining model in SQL Server Analysis Services
By: Siddharth Mehta | Comments | Related: > Analysis Services Development
SQL Server Analysis Services (SSAS) contains features for developing data mining models using various data mining algorithms for predictive analysis. Once these models are deployed on SSAS, they can be queried using Data Mining Extensions. Often there’s a need to retrieve the predicted results and report it to the end users on demand. In this tip we will learn how to retrieve these results by using DMX.
Prediction Join and Predict function can be used in DMX to execute the data mining model on filtered test cases to retrieve predicted values.
Developing Prediction Query in DMX
In order to focus on the prediction related constructs in DMX, we would need some sample data mining models with sample data. You can learn data mining from this tutorial and develop your own data mining algorithm or you can use the sample AdventureWorks SSAS database that contains data mining models.
For our example, we will be using the AdventureWorks Data Warehouse as well as a SSAS Database installed on our SQL Server instance. Open SQL Server Management Studio and navigate to the SSAS Database. Under the Targeted Mailing data mining structure, you should be able to find four different data mining models as shown below. These models are designed to predict whether a given customer is predicted to buy bikes. Our intention is to query a data mining model along with prediction values.
The Target Mailing structure contains the fields shown below. The CustomerKey is the key attribute and the Bike Buyer is the prediction attribute. In simple words, this model analyzes the data and predicts the value for Bike Buyer attribute as either 1 or 0. It reads data from vTargetedMail view from the AdventureWorks Data Warehouse. Let’s say that we intend to query the prediction values for customers whose Gender is Male and Age is less than 31 in Europe. Below are these customers from the vTargetMail view in the AdventureWorks Data Warehouse.
Open a new DMX Query window in SSMS and type the below query. Here we are using two different DMX constructs – Prediction Join and Predict function. The prediction join statement matches the data from the external data source with the fields in the “TM Decision Tree” mining model. In order to predict the value for the prediction attribute, Predict function can be used for the same.
select TM.[CustomerKey], TM.[Age], TM.[Gender], TM.[Region], Predict([Bike Buyer]) as [Prediction] from [TM Decision Tree] prediction join openquery ([AdventureWorksDW2014], 'select CustomerKey, Age, Gender, Region from vTargetMail where Gender = ''M'' and Age < 31 and Region = ''Europe'' ') as TM on [TM Decision Tree].[Age] = TM.[Age] and [TM Decision Tree].[Gender] = TM.[Gender] and [TM Decision Tree].[Region] = TM.[Region] and [TM Decision Tree].[Customer Key] = TM.[CustomerKey]
In the above query, we are selecting the dataset from the data warehouse using OpenQuery and joining the same using the prediction join to match this dataset with the fields from the data mining model. After that, the query is using the Predict function on the Bike Buyer attribute to predict values for the same, and we are giving it an alias column name named “Prediction”. The value 0 means that all these customers are unlikely to buy bikes according to the Decision Trees Data Mining Model.
In this way we can use Prediction Join to map the data from the source to the mining model and use Predict function to extract the predicted values from the model.
- Consider reading more details about the different uses of Prediction Join from here.
- Consider reading more about the Predict function from here.
About the author
View all my tips