By: Siddharth Mehta
Overview
Python is an interpreted, interactive, object-oriented programming language. It incorporates modules, exceptions, dynamic typing, very high level dynamic data types, and classes. Python has packages that encapsulates different categories of functionality in libraries (also called packages). For applying the statistical analysis one often needs sample data. Assuming that we already have sample datasets at hand, we can use Python for analyzing the statistical computations and algorithms. To develop these computations, one needs to know regular programming constructs like variables, data types, operators, loops, etc.
Explanation
Most of the programming constructs that are available in Python are also available in T-SQL. Our intention is not to learn Python in full detail, but to learn Python constructs that enable us to consume the unique Python libraries and data processing / computation mechanisms that are not available in T-SQL. In this lesson we will be learning the basic concepts of Python, just sufficient enough for us to apply Python functions and packages, so that we can apply the concepts for the data passed from the SQL Server data repository.
Python version, packages and datasets
We already learned in the last lesson how to check the version of Python runtime that the database engine is communicating. It necessary to know the version of Python you are working with, as that can be considered as the basis of what is supported by a particular version of Python. Using the sp_execute_external_script, with a simple Python property “sys.version”, we can check the details of Python version as shown below. Print function prints the output on the SQL Server Management Studio (SSMS) message console. In this lesson our focus is developing the fundamentals of Python. We will discuss the details of sp_execute_external_script in the next lesson. Until then just consider this procedure as execution wrapper.
The next step is to explore the different libraries available with Python installation by default. You can explore them from here. You can load any given library by using the import function. We already looked at an example of the use of this function in the above code.
Variables, Comments and Printing Data
In Python, a variable is created by using the assignment operator “=“. Python variables do not need explicit declaration. The data type of the variable is determined by the data stored in R. The code can be commented in Python using the # character. The operand to the left of the = operator is the name of the variable and the operand to the right of the = operator is the value stored in the variable. Let’s understand these concepts by an example.
execute sp_execute_external_script @language = N'Python', @script = N' var1 = "Siddharth" Var1 = "Sid" var2 = 100 var3 = 50.5 var4 = "TRUE" print(var1) print(Var1) print(var2 + var3) print(var4) print(type(var1)) print(type(var2)) print(type(var3)) print(type(var4)) '
When executing the above code and the output should look as shown below. Below are the points you can derive from the above example:
- Variables can be created using the “=“ (assignment) operator.
- Variables are case-sensitive. Var1 and var1 are considered different variables.
- Data-type of the variable is determined by the type of data stored in the variable.
- You can get the value of variables using the print function
- Type function can be used on variables to determine the data type of the variables which is classified in five major types – numbers, string, list, tuple and dictionary.
- There are other data structure types too, but we are limiting our discussion to the basic types.
Arithmetic, Operators, Loops
The table below shows a list of arithmetic and logical operators in Python. It’s not an exhaustive list, but covers major operators that you may use normally when you start learning Python.
Operator | Description |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
** | Exponentiation |
% | Modulus |
< | Less than |
<= | Less than or equal to |
> | Greater than |
>= | Greater than or equal to |
== | Exactly equal to |
!= | Not equal to |
! | NOT |
| | OR |
& | AND |
Though these operators are self-explanatory, below is a basic example of how you can use these operators. Here we have used these operators on actual values. You can use these operators in the same way on variables too.
Finally, considering we have the data with us, there is a high possibility that we may have to loop into the data for applying some statistical computations. So we need to learn at least one looping technique in Python. Below is a simple example of a while loop.
execute sp_execute_external_script @language = N'Python', @script = N' count = 0 while (count < 9): print ("The count is:", count) count = count + 1 if (count == 5) : break '
In this example we are assigning the value of 0 to a variable “count”. We are printing the value of “count” in the loop and incrementing the value of i. We are also placing a condition that if value of count reaches 5, then break out of the loop using the “break” statement.
We have not learned anything in this lesson that we cannot achieve with T-SQL. But once the data is submitted from SQL Server data tables / views to Python runtime for processing the data, we need to use programming constructs in Python language. Now that we have a basic idea of how to apply arithmetic on the data in Python, let's learn how to accept and receive data between T-SQL and Python in the next lesson.
Additional Information
- Consider implementing all the operators and practice to use the same on variables in Python.