The traditional BI methodology has focused on sourcing data from disparate source systems and augmenting the data in a data warehouse / data lake. This data repository acts as the master source for purposes such as data marts, reporting, and data mining. All these forms of data analysis require the end user to apply analytical thinking to interpret the results. Machine Learning is an advanced form of analysis in which the system / model learns from the data fed to it and derives intelligence for predictive analysis. The analysis depends on the machine learning model development process, which is composed of exploratory data analysis, data transformation / modeling, model development, model training, model testing and model improvement.
Many professionals think that their experience with databases already covers exploratory data analysis skills. Database professionals are fluent in a kind of data analysis that is mostly an assessment of the database model and query logic. The exploratory data analysis involved in machine learning systems, by contrast, is statistical in nature and is often referred to as data science.
Machine Learning has deep roots in statistics, so a solid foundation in statistics is required to learn the basics of data science for exploratory data analysis. We can classify statistics into two broad categories: descriptive and inferential. Both are widely used in machine learning model development.
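To make the two categories concrete, here is a minimal sketch in R. It is an illustration only and not part of the original tutorial material: it assumes R's built-in `mtcars` sample dataset, and the variable names (`mpg`, `desc`, `auto`, `manual`, `result`) are chosen here for the example. The first half summarizes the sample itself (descriptive); the second half uses the sample to draw a conclusion about a wider population (inferential).

```r
# Assumption: the built-in mtcars dataset stands in for real business data.
data(mtcars)

# Descriptive statistics: summarize the data we actually have.
mpg <- mtcars$mpg
desc <- c(mean = mean(mpg), median = median(mpg), sd = sd(mpg))
print(round(desc, 2))

# Inferential statistics: use the sample to reason beyond it.
# Here a two-sample t-test asks whether fuel economy differs between
# automatic (am == 0) and manual (am == 1) cars in general.
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]
result <- t.test(manual, auto)

# A small p-value suggests the observed difference is unlikely to be chance alone.
print(result$p.value)
```

The same R code could, in principle, be run against SQL Server hosted data through the `sp_execute_external_script` procedure, but that wiring is left out of this sketch.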
Data hosted in SQL Server has the advantage of a pre-defined schema and T-SQL constructs. ETL tools like SSIS make it possible to transform the data at a faster pace and on a broader scale. Assuming the data is properly structured and has been treated for data capture / data quality errors, exploratory data analysis can be applied to it, which is the first step in machine learning model development. Model development, model training and model testing follow this analysis.
In this tutorial, we will learn the basics of machine learning, including the data science needed to examine data for machine learning model development. We will use R in SQL Server 2017 to apply machine learning techniques and analysis. In case you are new to R, you can quickly get up to speed by following the R Tutorial here.
The structure of this tutorial is listed below.
After reading the course agenda, I am sure you are excited to get started with the basics of machine learning. So let's start with the question of "What is machine learning?".