By: Nat Sundar | Updated: 2018-05-24
Problem
I have experience as a SQL Server data professional and want to expand my skills by learning Python to become a data science engineer. How do I learn the Python basics needed to become a data science professional?
Solution
Python Basics
Python is a simple yet very powerful language for data science. Developers and data engineers prefer Python because it is easy to program in and has a gentle learning curve.
Python has a rich collection of libraries that can be leveraged for machine learning and that accelerate the development process. Python is a general-purpose language, so in addition to machine learning it can be used in other areas, such as extracting and cleansing data. Python is an ideal choice if you are planning to integrate data analysis with a web application.
Python Libraries
These are some of the most widely used Python libraries:
- NumPy / SciPy - Scientific computing
- pandas - Data manipulation
- matplotlib - Make graphics
- scikit-learn - Machine learning
Python programs can be developed in many IDEs. In this tip, I will be using the Jupyter Notebook IDE to develop Python programs.
Advantages of using Jupyter Notebook
Jupyter Notebook is a web application for development built on a client-server architecture. Using Jupyter we can create and edit notebooks (documents). It has a very simple but elegant interface for writing programs. In addition, Jupyter is a great presentation tool, and it is a good IDE for beginners.
We can also embed HTML components (images and videos) and easily view and edit the source code to create great presentations. These presentations can include output from data visualization libraries like Matplotlib and Seaborn. A notebook can also be exported to PDF, HTML and .py formats.
Jupyter can be downloaded from this link. Once the installation is complete, you can launch the notebook as mentioned here. This starts the web server and opens the home page in the default browser. Now click on New > Python 3 to create a new Python notebook.
The below image confirms that a new notebook has been created.
Sample Python Program
Now let’s start writing some sample Python programs. Let’s print a string "Hello World" in our first example.
Type the below code in the given text box. We will be using Python version 3. In Python 3, print is a function (the name is lowercase, and Python is case-sensitive), so we call print() to display the value of a string literal. The value to display is supplied as an argument to the print() function.
print ("Hello World!")
Now let’s click on the "Run" button on the menu bar to execute this sample program. This will simply print the string "Hello World!". Congrats. You have successfully developed a simple Python program and started your journey for data science.
Python Defining an Integer
As a next step, let's define an integer variable. In Python, a variable is created by assigning a value to it; no separate declaration is needed. As Python is a dynamically typed language, the variable type is decided at run time. The value of the variable can be printed using the print() function as mentioned in the below image.
# Integer variable example
noOfRecords = 500
print(noOfRecords)
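To see what "decided at run time" means in practice, the following minimal sketch reuses the noOfRecords variable and inspects its type with the built-in type() function. Rebinding the name to a string is done purely for illustration, and the helper names type_before and type_after are not part of the original example.

```python
# Dynamic typing: the type belongs to the value, not to the variable name
noOfRecords = 500
type_before = type(noOfRecords).__name__
print(type_before)   # int

# Rebinding the same name to a string is allowed (illustration only)
noOfRecords = "five hundred"
type_after = type(noOfRecords).__name__
print(type_after)    # str
```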
Save Changes
Let’s save the source code that we have typed so far. Click on the title (where it says "Untitled2"). A popup will appear to rename the notebook, and you can enter a suitable name as mentioned in the below image.
Python Integer and Decimal Calculations
In the next example, let's understand the usage of a decimal (float) variable. In the below example, an integer variable and a decimal variable are defined and values are assigned to them.
Notice that the example adds an integer value to a float value. In this case the result is automatically converted to float. You can also type cast a float to an integer using the function int(), which drops the fractional part, and type cast an integer to a float using the function float().
SalePrice = 1000    # Integer
UnitPrice = 35.75   # Decimal
TotalValue = SalePrice + UnitPrice   # Result has been converted to float
print(TotalValue)
print(int(UnitPrice))   # Type cast float value to integer
print(float(1000))      # Type cast integer value to float
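One detail worth checking when casting is that int() truncates rather than rounds. The short sketch below reuses the UnitPrice value from the example; the variable names truncated, rounded and negative are introduced here only for illustration.

```python
UnitPrice = 35.75

truncated = int(UnitPrice)   # drops the fractional part
rounded = round(UnitPrice)   # rounds to the nearest integer
negative = int(-35.75)       # truncates toward zero, not downward

print(truncated, rounded, negative)   # 35 36 -35
```

If the nearest whole number is what you need, round() is usually the safer choice than int().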
A string value can be enclosed in single quotes, double quotes or even three double quotes. The value of the string variable can be printed using the print() function as mentioned in the below image.
databaseName = "Staging"
print(databaseName)
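As a minimal sketch of the quoting styles described above, the snippet below shows that all three produce the same string, and that triple quotes can additionally span several lines. The variable names are hypothetical and chosen only for this illustration.

```python
single = 'Staging'
double = "Staging"
triple = """Staging"""
print(single == double == triple)   # True

# Triple quotes can also span several lines; the line break becomes "\n"
multiLine = """Staging
Reporting"""
print(multiLine.splitlines())       # ['Staging', 'Reporting']
```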
Python String Functions
There are many useful functions available in Python for string manipulation. Let’s have a quick look at some of these functions. The capitalize() function returns a copy of the string with its first character in upper case and the rest in lower case.
The replace() function replaces a character or substring with another string. The split() function splits the string into a list of strings based on a separator.
In the below example, the Split function has been used to split the string based on the character ",".
# String handling examples
strProductName = 'bike'                       # string variables can be represented using single quotes
strProductGroup = "Mountain"                  # string variables can also be represented using double quotes
strProductDescription = """mountain bike"""   # or three double quotes
strProductList = "car,bike,boat,scooter"

# String functions
print(strProductDescription)
print(strProductDescription.capitalize())
print(strProductDescription.replace("bike", "bike accessory"))
print(strProductList.split(","))
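Two points about these functions are easy to miss: they return a new string and leave the original unchanged, and split() can be reversed with join(). The sketch below illustrates both. The mixed-case input and the " | " separator are assumptions made for the demonstration, not values from the original example.

```python
strProductDescription = "mountain BIKE"
capitalized = strProductDescription.capitalize()
print(capitalized)             # Mountain bike
print(strProductDescription)   # mountain BIKE (the original is unchanged)

strProductList = "car,bike,boat,scooter"
products = strProductList.split(",")   # list of four strings
rejoined = " | ".join(products)        # join() is the inverse of split()
print(len(products))           # 4
print(rejoined)                # car | bike | boat | scooter
```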
Python Boolean Variables
A boolean variable is defined using the value True or False (both are case-sensitive keywords). A boolean value can be type cast to an integer using the int() function. In the below example, True is cast to 1 and False is cast to 0.
The equivalent of the null keyword in Python is None. A variable can have the value of "None" as mentioned in the below example.
# Boolean and None values
discountedFlag = True
productAvailable = False
productName = None
print(int(discountedFlag))
print(int(productAvailable))
print(str(productAvailable))
print(productName)
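In practice, None is usually tested with the "is" operator, and the fact that True and False behave like 1 and 0 makes it easy to count flags. The following is a minimal sketch under those assumptions; the names status, flags and trueCount are hypothetical.

```python
productName = None

# "is None" is the idiomatic way to test for a missing value
if productName is None:
    status = "name missing"
else:
    status = "name present"
print(status)      # name missing

# Booleans act as 1 and 0, so sum() counts the True values
flags = [True, False, True]
trueCount = sum(flags)
print(trueCount)   # 2
```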
Python Conditional Statement "If"
The below examples show the usage of the conditional "if..elif..else" statement. The "else if" condition is written as "elif" in Python, and the statements that belong to each branch are indented beneath it.
# Conditional statements
intProductCode = "TGJ7679"
if intProductCode == "TGJ7679":
    print("Product is available")
else:
    print("Product is not available")

# Multiple if conditions
intSaleAmount = 2500
if intSaleAmount <= 2000:
    print("No discount applicable")
elif intSaleAmount > 2000 and intSaleAmount < 3000:
    print("5 % discount applicable")
else:
    print("None")
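To try the discount logic with several amounts, it can be wrapped in a function. The sketch below is one possible way to do that; the function name discount_message is hypothetical, and the chained comparison 2000 < saleAmount < 3000 is simply a shorter way to write the same "and" condition.

```python
def discount_message(saleAmount):
    """Return the discount message for a sale amount (hypothetical helper)."""
    if saleAmount <= 2000:
        return "No discount applicable"
    elif 2000 < saleAmount < 3000:   # chained comparison
        return "5 % discount applicable"
    else:
        return "None"

# Exercise each branch once
for amount in (1500, 2500, 3000):
    print(amount, discount_message(amount))
```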
Python While Loop
A while loop in Python repeatedly executes a block of statements as long as the given condition is true.
# While loop example
intLoopCount = 0
while intLoopCount < 5:
    print("Loop iteration {0}".format(intLoopCount))
    intLoopCount += 1
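When the number of iterations is known in advance, the same result is often written with a for loop over range(). The sketch below collects the messages in lists (whileOutput and forOutput, both hypothetical names) so the two approaches can be compared directly.

```python
# Counter-controlled while loop, as in the example above
whileOutput = []
intLoopCount = 0
while intLoopCount < 5:
    whileOutput.append("Loop iteration {0}".format(intLoopCount))
    intLoopCount += 1

# Equivalent for loop: range(5) yields 0, 1, 2, 3, 4
forOutput = []
for i in range(5):
    forOutput.append("Loop iteration {0}".format(i))

print(whileOutput == forOutput)   # True
```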
Summary
In this tip, we learned about some of the basics of Python. In future tips, we will learn more advanced topics in Python.
Next Steps
- Stay tuned to read the next tip on Python
- Read and understand the differences of data science and other BI jargon here
- Learn more about the data science process here
About the author
This author pledges the content of this article is based on professional experience and not AI generated.