
Data Science Scenarios and Solution Templates
SQL Server 2016 and later

Updated: April 18, 2016


Applies To: SQL Server 2016
Templates are sample solutions that demonstrate best practices and provide building blocks to help you implement a solution
quickly. Each template is designed to solve a specific problem, and includes sample data, R code (Microsoft R Server), and SQL
stored procedures. The tasks in each template extend from data preparation and feature engineering to model training and
scoring. The code can be run in an R IDE, with computations done in SQL Server, or by using a SQL client tool such as SQL Server
Management Studio.
You can use these templates to learn how R Services (In-Database) works, and to build and deploy your own solution by
customizing the template to fit your own scenario.
For download and setup instructions, see How to Use the Templates at the end of this topic.

Fraud Detection
Online Fraud Detection Template (SQL Server R Services)
One of the important tasks for an online business is to detect fraudulent transactions, and to identify transactions made with
stolen payment instruments or credentials, in order to reduce chargeback losses. When fraudulent transactions are discovered,
businesses typically take measures to block certain accounts as soon as possible, to prevent further losses.
In this template, you'll learn how to use data from online purchase transactions to identify likely fraud. Fraud detection is solved
as a binary classification problem. The methodology used in this template can be easily applied to fraud detection in other
domains.
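To make the binary classification framing concrete, here is a minimal sketch in Python. It is not the template's code (the template uses R and T-SQL): the two features, the sample values, and the helper names train_logistic and fraud_probability are all hypothetical, and plain logistic regression stands in for whatever model the template actually trains.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(rows, labels, lr=0.2, epochs=2000):
    """Fit logistic regression weights (bias first) by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y  # gradient of the log loss with respect to the score
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def fraud_probability(weights, x):
    """Return the modeled probability that transaction x is fraudulent."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


# Hypothetical features per transaction:
# (amount in hundreds of dollars, 1 if shipping and billing addresses differ)
transactions = [(0.2, 0), (0.5, 0), (0.3, 0), (4.0, 1), (5.5, 1), (3.8, 1)]
is_fraud = [0, 0, 0, 1, 1, 1]  # binary label: 1 = fraudulent, 0 = legitimate

weights = train_logistic(transactions, is_fraud)
flagged = [fraud_probability(weights, x) >= 0.5 for x in transactions]
```

The 0.5 cutoff is only a default for illustration. In practice the threshold would likely be chosen by weighing the cost of a missed fraud against the cost of blocking a legitimate account.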

Customer Churn
Customer Churn Prediction Template (SQL Server R Services)
Analyzing and predicting customer churn is important in any industry where the loss of customers to competitors must be
managed and prevented: banking, telecommunications, and retail, to name a few. The goal of churn analysis is to identify which
customers are likely to churn, and then take appropriate actions to retain such customers and keep their business.
This template gets you started with churn prevention by formulating the churn problem as a binary classification problem. It
uses sample data from two sources, customer demographics and customer transactions, to classify customers as likely or unlikely
to churn.
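As a rough illustration of combining the two data sources into one labeled row per customer, consider the Python sketch below. The field names, the sample records, and the labeling rule (no purchase within churn_gap_days of the cutoff counts as churn) are assumptions made for this example, not details taken from the template.

```python
from collections import defaultdict

# Hypothetical demographic records keyed by customer id.
demographics = {
    "C1": {"age": 34, "region": "West"},
    "C2": {"age": 51, "region": "East"},
    "C3": {"age": 27, "region": "West"},
}

# Hypothetical transaction records: (customer id, day index, amount).
transactions = [
    ("C1", 3, 20.0), ("C1", 40, 35.0), ("C1", 88, 15.0),
    ("C2", 5, 80.0), ("C2", 12, 60.0),
    ("C3", 70, 10.0),
]


def build_churn_table(demographics, transactions, cutoff_day, churn_gap_days):
    """Join both sources into one feature row per customer with a churn label.

    Assumed labeling rule: a customer is churned (1) when they have no
    purchase within churn_gap_days before cutoff_day, otherwise 0.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    last_day = {}
    for customer_id, day, amount in transactions:
        if day > cutoff_day:
            continue  # ignore activity after the observation cutoff
        totals[customer_id] += amount
        counts[customer_id] += 1
        last_day[customer_id] = max(day, last_day.get(customer_id, day))

    table = {}
    for customer_id, profile in demographics.items():
        last = last_day.get(customer_id)
        inactive = last is None or (cutoff_day - last) > churn_gap_days
        table[customer_id] = {
            "age": profile["age"],
            "region": profile["region"],
            "purchase_count": counts[customer_id],
            "total_spend": totals[customer_id],
            "churned": 1 if inactive else 0,
        }
    return table


churn_table = build_churn_table(demographics, transactions,
                                cutoff_day=90, churn_gap_days=30)
```

Each resulting row pairs demographic and transactional features with a 0/1 label, which is the shape a binary classifier expects as training input.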

Predictive Maintenance
Predictive Maintenance Template (SQL Server 2016)
The goal of "data-driven" predictive maintenance is to increase the efficiency of maintenance tasks by capturing past failures and
using that information to predict when or where a device might fail. The ability to forecast device obsolescence is particularly
important for applications that rely on distributed data or sensors, as exemplified by the Internet of Things (IoT).
This template focuses on answering the question "When will an in-service machine fail?" The input data represents simulated
sensor measurements for aircraft engines. Data obtained from monitoring the engine's current operating conditions, such as the
current working cycle, settings, sensor measurements, and so forth, are used to create three types of predictive models:
Regression models, to predict how much longer an engine will last before it fails. The sample model predicts the metric
Remaining Useful Life (RUL), also called Time to Failure (TTF).
Classification models, to predict whether an engine is likely to fail.
    The binary classification model predicts whether an engine will fail within a certain time frame (number of days).
    The multiclass classification model predicts whether a particular engine will fail, and if it will fail, provides a probable
    time window of failure. For example, for a given day, you can predict whether any device is likely to fail on the given day,
    or in some time period following the given day.
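The three model types differ mainly in how the training target is derived from an engine's run-to-failure history. The Python sketch below shows one plausible way to derive all three targets. The function name, the use of working cycles as the time unit, and both window sizes are assumptions chosen for illustration, not values taken from the template.

```python
def label_engine_history(cycles, binary_window, early_window):
    """Derive regression, binary, and multiclass targets for one engine.

    cycles: cycle numbers observed for one engine, where the largest
            value is the cycle at which the engine failed.
    binary_window: the engine counts as "failing soon" when its remaining
                   useful life is at most this many cycles (assumed value).
    early_window: a narrower window (early_window < binary_window) used to
                  split the multiclass target (assumed value).
    """
    failure_cycle = max(cycles)
    labeled = []
    for cycle in cycles:
        rul = failure_cycle - cycle                     # regression target
        fails_soon = 1 if rul <= binary_window else 0   # binary target
        if rul <= early_window:
            failure_class = 2   # fails within the narrow window
        elif rul <= binary_window:
            failure_class = 1   # fails within the wider window only
        else:
            failure_class = 0   # not expected to fail within either window
        labeled.append({
            "cycle": cycle,
            "rul": rul,
            "fails_soon": fails_soon,
            "failure_class": failure_class,
        })
    return labeled


# Hypothetical engine observed for ten cycles, failing at cycle 10.
history = label_engine_history(list(range(1, 11)),
                               binary_window=4, early_window=2)
```

With these labels in place, the same sensor features can feed a regression model (rul), a binary classifier (fails_soon), or a multiclass classifier (failure_class).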

Energy Demand Forecasting
Energy Demand Forecasting Template with SQL Server R Services
This template demonstrates how to use SQL Server R Services to predict demand for electricity. The solution includes a demand
simulator, all the R and T-SQL code needed to train a model, and stored procedures that you can use to generate and report
predictions.

How to Use the Templates
To download the files included in each template, you can use GitHub commands, or you can open the link and click Download
Zip to save all files to your computer. When downloaded, the solution typically contains these folders:
Data: Contains the sample data for each application.
R: Contains all the R development code you need for the solution. The solution requires the libraries provided by
Microsoft R Server, but can be opened and edited in any R IDE. The R code has been optimized so that computations are
performed "in-database", by setting the compute context to a SQL Server instance.
SQLR: Contains multiple .sql files that you can run in a SQL environment such as SQL Server Management Studio to create
the stored procedures that perform related tasks such as data processing, feature engineering, and model deployment.
The folder also contains a PowerShell script that you can run to invoke all scripts and create the end-to-end environment.
Be sure to edit the script to suit your environment.

See Also
SQL Server R Services Tutorials
Announcing the Templates in Azure ML
New Predictive Maintenance Template
© 2016 Microsoft
