Machine Learning Terms and Concepts
By: Siddharth Mehta
Machine Learning and Artificial Intelligence are industry buzzwords these days. Almost everyone has some perception of what they mean from their own perspective. Before we start with the basics of machine learning, I recommend approaching it without any prejudice or influence from what you may already know. This will help you build your fundamentals with a clear thought process. Let's try to understand machine learning with a basic example.
What is Machine Learning and Why Learn It?
Consider the image below as a dataset that you parse manually. From experience, our brain can easily interpret these objects. Given any object, such as an apple or a dog, we can easily identify it and classify it into the group where it belongs. Now suppose a machine has to do the same. How will it learn to do this? To answer this question, we first need to understand how the human brain processes these images. The brain first parses the images and learns from them, and then groups them. When a new object is given for identification, the brain matches it with the nearest matching object in the learned repository and classifies it in the same group.
When we train a machine to learn from a given dataset so that it can apply what it has learned to tasks such as classification and prediction, we call this concept machine learning. Note that "machine" here does not mean a physical machine. For simplicity, you can think of a machine as a program or a data model.
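The "nearest matching object" idea described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R used later in this tutorial. The feature values (weight in kilograms, number of legs), the group names, and the function name `classify` are all invented for the example.

```python
import math

# Hypothetical "learned repository": (features, group) pairs.
# Features are (weight in kg, number of legs); the values are made up.
LEARNED = [
    ((0.15, 0), "fruit"),   # apple
    ((0.12, 0), "fruit"),   # orange
    ((30.0, 4), "animal"),  # dog
    ((4.0, 4), "animal"),   # cat
]


def classify(features, repository=LEARNED):
    """Return the group of the stored example closest to `features`."""
    # Find the stored example with the smallest Euclidean distance.
    _, nearest_group = min(
        repository, key=lambda item: math.dist(item[0], features)
    )
    return nearest_group


print(classify((0.2, 0)))   # a light object with no legs
print(classify((25.0, 4)))  # a heavy object with four legs
```

Real models are rarely this simple, but the pattern is the same: learn from known examples first, then place a new object in the group of whatever it most resembles.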
Some key points and definitions related to Machine Learning are mentioned below:
- Machine Learning is concerned with computer programs that automatically improve their performance through experience.
- Machine Learning is a type of Artificial Intelligence that gives computers the ability to learn without being explicitly programmed.
- Machine Learning focuses on the development of computer programs that can change when exposed to new data.
- The process of machine learning is similar to data mining. Both search through data to look for patterns. However, instead of extracting data for human comprehension, as in the case of data mining, machine learning uses the patterns it detects to adjust program actions accordingly.
To answer why machine learning is worth learning, let's look at some of its applications.
Applications of Machine Learning
Below are a few applications of machine learning that make it clear why we need to learn about it.
- Web Search: Ranking a page based on the likelihood of user clicks.
- Finance: Deciding which users to target for new credit card offers.
- E-commerce: Predicting fraudulent transactions.
- Space exploration: Space probes and radio astronomy.
- Robotics: Handling uncertainty in new environments, as in self-driving cars.
- Social Networks: Analyzing data on relationships and preferences.
- Computational: Suggesting bugs in applications based on cognitive processing.
Machine Learning deals with advanced predictive / prescriptive analytics, which makes it a natural extension for data professionals seeking to enhance their skills.
Types of Machine Learning
You may find different classifications of machine learning in different reference materials. Generally, machine learning is classified into three categories – Supervised, Unsupervised and Reinforcement Learning.
- Supervised Machine Learning – This form of machine learning learns from labeled data and takes actions. For example, consider a dataset that records different attributes of a set of flowers, along with the category each flower belongs to. Using these attributes, one can identify the category of a new flower. Here the model learns from labeled historical data and predicts the outcome. In this tutorial, we will focus on this form of machine learning.
- Unsupervised Machine Learning – This form of machine learning learns from unlabeled data and takes actions. For example, consider a dataset containing attributes of all the houses in a given country, state or city, such as size in square feet, amenities, nearby parks and nearby schools. There is no labeled outcome here, so there is nothing to train a model to predict. Instead, the model looks for structure in the data, such as groups of similar houses. Predicting the price of a house from its attributes, by contrast, requires houses with known prices and is therefore a supervised task.
- Reinforcement Learning – This form of machine learning learns from a reward-based system, where rewards depend on the actions performed by the model. It is often regarded as the most advanced form of machine learning and is applied in artificial intelligence based systems such as robotics and recommendation engines, frequently in combination with neural networks.
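To make the unsupervised case above more concrete, here is a minimal sketch of grouping houses by size without any labels. It uses a simplified one-dimensional k-means written in Python for illustration (the tutorial itself uses R). The house sizes and the function name `k_means_1d` are assumptions made up for this example, and the initial centers are chosen deterministically to keep the output reproducible.

```python
def k_means_1d(values, k=2, iterations=10):
    """Group numbers into k clusters (k >= 2); returns (centers, assignments)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [float(ordered[i * (len(ordered) - 1) // (k - 1)]) for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        assignments = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, assignments


# Hypothetical house sizes in square feet; note that there are no labels.
house_sizes = [850, 900, 1000, 2400, 2600, 2500]
centers, groups = k_means_1d(house_sizes, k=2)
print(centers)
print(groups)
```

Nobody told the algorithm that there are "small" and "large" houses; it only discovers that the values fall into two groups. A supervised model would instead need a known answer (a label) for every training row.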
Machine Learning Support in Microsoft Technology Stack
Microsoft acquired Revolution Analytics, a commercial provider of software and services for the open source R language, in 2015 with a vision of enabling advanced analytics within Microsoft data platforms on-premises, in hybrid cloud environments and on Microsoft Azure. Post-acquisition, Microsoft has integrated R with SQL Server, Power BI, Azure and Cortana Analytics. In addition, Revolution R Open has been renamed to Microsoft R Open and Revolution R Enterprise to SQL Server R Services (SQL Server Machine Learning Services in SQL Server 2017) and/or Microsoft R Server.
SQL Server R Services / SQL Server Machine Learning Services installs an open source distribution of R, as well as packages provided by Microsoft that support distributed and/or parallel processing. The architecture is designed such that external scripts using R run in a separate process from SQL Server. R Services integrates the R language with SQL Server, which helps in performing analytics close to the data and eliminates the costs and security risks associated with data movement.
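The idea of running an external script in a separate process can be illustrated with a small sketch. This is not how SQL Server implements the feature. It is a hypothetical Python analogy in which a host program hands rows to a child process and reads back a result, so that the script never runs inside the host itself. The names `EXTERNAL_SCRIPT` and `run_external_script`, and the use of JSON to exchange data, are assumptions made for this example.

```python
import json
import subprocess
import sys

# Stand-in for an external analytics script (R code in the real feature).
# It reads rows from standard input and writes a summary to standard output.
EXTERNAL_SCRIPT = """
import json, sys
rows = json.load(sys.stdin)
print(json.dumps({"mean": sum(rows) / len(rows)}))
"""


def run_external_script(script, rows):
    """Run `script` in a child process, passing `rows` in and parsing the result."""
    completed = subprocess.run(
        [sys.executable, "-c", script],  # separate interpreter process
        input=json.dumps(rows),          # data sent to the child via stdin
        capture_output=True,
        text=True,
        check=True,                      # raise if the child process fails
    )
    return json.loads(completed.stdout)


result = run_external_script(EXTERNAL_SCRIPT, [10, 20, 30])
print(result)
```

One consequence of this design is isolation: if the script crashes, the child process fails and the host only sees an error, rather than going down with it.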
Traditional data analytics methodology relies on transporting and transforming data from OLTP databases > Data Warehouses > Data Marts, using PowerShell for administration, SSIS for ETL, SSAS for multi-dimensional / in-memory analytics, and SSRS for reporting. For data stored in OLTP databases, T-SQL with its set-based operations and mathematical algebra has been the best available option for data manipulation. Using R with T-SQL extends the power of data science, statistical computing, machine learning and other advanced predictive analytics capabilities to OLTP systems.
In the following chapters, we will perform hands-on exercises using R and T-SQL for exploratory data analysis and machine learning. It is assumed that you have already installed SQL Server 2017 and Machine Learning Services, as well as R. In case you have not, you can learn how to do it here.
- Consider reading more about Machine Learning here to deepen your understanding.