INTRODUCTION TO
PYTHON FOR DATA
ANALYTICS
BY:
JAVAN JUMA
MSC. DATA SCIENCE
LECTURER: UNIVERSITY OF RWANDA
ROAD MAP: PROGRAMMING FOR
DATA SCIENTISTS
PYTHON:
• Python Data Structures.
• Operations.
• Functions.
• Return Multiple Values etc.
PANDAS AND DATA ANALYSIS:
• Creating data sets.
• I/O Tools.
• Creating data frames.
• Reading and exporting datasets.
• Wrangling/Munging datasets.
• Pre-processing data.
• Descriptive Statistics.
• Time Series.
PYTHON NUMPY:
• Create NumPy Arrays.
• Indexing, Slicing and Iterating.
• Fancy Indexing.
• Descriptive Statistics.
• Boolean Masking of Arrays.
• I/O NumPy etc.
ROAD MAP: PROGRAMMING FOR
DATA SCIENTISTS.
Matplotlib, Seaborn, Geopandas.
Types of plots.
Choosing the right plots for the data.
Adding Style to Data Visualizations.
File Formats of Graphic Outputs etc.
WHAT IS LANGUAGE?
Human Languages: Kiswahili, English, French, Chinese, Spanish.
Computer Programming Languages: C, C++, Java, Python, R, HTML, PHP.
HOW DOES A COMPUTER UNDERSTAND
DIFFERENT LANGUAGES?
ANSWER: Through translators.
Computer language: binary language (0s and 1s).
Types of Translators
For C: Compiler (reads the whole program, identifies errors and translates it before it runs).
For Python: Interpreter (runs the code one statement at a time and stops at the first error).
For Assembly Programming Language: Assembler (assembly is a low-level language, close to machine level).
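A minimal sketch of the "stops at the first error" behaviour described for the Python interpreter. The names steps_run and error_name are hypothetical, chosen only for this illustration.

```python
# Statements run in order; the second one fails, so the third never runs.
steps_run = []
error_name = None

try:
    steps_run.append("step 1")   # runs normally
    result = 1 / 0               # raises ZeroDivisionError at run time
    steps_run.append("step 3")   # never reached
except ZeroDivisionError as exc:
    error_name = type(exc).__name__

print(steps_run)    # ['step 1']
print(error_name)   # ZeroDivisionError
```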
PYTHON INTRODUCTION
Python was created by Guido van Rossum, who began work on it in 1989; it was first released in 1991.
It is easy to learn.
Python is an interpreted, object-oriented, high-level programming
language with dynamic semantics. It supports both object-oriented (OOP) and
procedure-oriented programming.
Procedure-oriented: programs are written in small parts using functions,
e.g. the C language.
Object-oriented: programs are written in small parts using objects, e.g. Java, with
the concepts of encapsulation, classes and objects.
In OOP languages such as Java, data is encapsulated in objects, i.e. data is
hidden inside the object (abstraction); there is more concern for data than for
functions.
Procedure-oriented programming: a step-by-step approach that
breaks a task down into a collection of variables and functions, executed as a
sequence of instructions; functions are used to store code; there is less
concern for data.
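Since Python supports both styles, the contrast can be sketched in a few lines. The function rectangle_area and the class Rectangle are hypothetical names used only for this illustration.

```python
# Procedure-oriented style: data is passed to a stand-alone function.
def rectangle_area(width, height):
    return width * height

area_procedural = rectangle_area(3, 4)


# Object-oriented style: the data and the function that uses it live in one object.
class Rectangle:
    def __init__(self, width, height):
        # Leading underscore: by convention, these are internal to the object.
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


area_oop = Rectangle(3, 4).area()

print(area_procedural, area_oop)   # 12 12
```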
WHY PYTHON
Simple and Easy to Learn.
Great popularity and high salary.
Used greatly in Data Science.
Used with Big Data.
Computer Graphics in Python.
Used in Web Development.
Portable and Extensible.
WHY PYTHON: BIG DATA/AI
Big data: huge sizes, e.g. petabytes; complex; high-dimensional;
real-time data (requiring high-speed processing).
Sources of big data: Facebook, IoT devices (e.g. sensors), satellites.
These generate data all the time.
Big data essentially means complex data.
Machine learning is used for predictions. Example of AI: Sophia
(the first robot to be granted citizenship, by Saudi Arabia).
Who uses Python: Netflix, for video recommendation using
machine learning.
PYTHON ENVIRONMENT
IDE: INTEGRATED DEVELOPMENT ENVIRONMENT.
We will use ANACONDA.
Search for "Anaconda Navigator" on Google.
We will use Jupyter Notebook as a web-based environment for
Python.
Each Jupyter Notebook has its own namespace, i.e. variables defined in one
notebook are not visible in another.
BEFORE GETTING STARTED
PYTHON ENVIRONMENT SETUP
Install Anaconda Navigator for data science:
• https://www.anaconda.com/download/
LESSON 1: VARIABLES AND DATA
TYPES
A variable is a name that refers to a value stored in memory.
A variable is created the moment you first assign a value to it.
Variables do not need to be declared with any particular type and
can even change type after they have been set.
Example:
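A minimal sketch of the points above. The variable name x and its values are hypothetical.

```python
x = 5                          # x is created here; no type declaration is needed
first_type = type(x).__name__
print(first_type)              # int

x = "five"                     # the same name can later refer to a different type
second_type = type(x).__name__
print(second_type)             # str
```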
Rules for Python variables:
A variable name must start with a letter or the underscore
character
A variable name cannot start with a number
A variable name can only contain alpha-numeric characters
and underscores (A-z, 0-9, and _ )
Variable names are case-sensitive (age, Age and AGE are three
different variables)
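The rules above can be checked with the built-in string method isidentifier(). This is a small sketch; the candidate names are made up for illustration.

```python
# True means the text is a legal variable name, False means it is not.
candidates = ["age", "_age", "age2", "2age", "my-age", "my age"]
is_valid = {name: name.isidentifier() for name in candidates}

for name, ok in is_valid.items():
    print(name, ok)

# Case sensitivity: these are three different variables.
age = 20
Age = 30
AGE = 40
print(age, Age, AGE)   # 20 30 40
```

Note that isidentifier() does not reject reserved words such as class, which also cannot be used as variable names.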
Data Types
Python is a dynamically typed language. Therefore, there is no need to define the data type of
a variable or to declare it before using it; the type comes from the value assigned.
N/B. Immutable: cannot be changed after creation, e.g. tuples.
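A short sketch of what immutable means in practice, comparing a tuple with a list. The names point, numbers and tuple_changed are hypothetical.

```python
point = (1, 2)          # tuple: immutable
try:
    point[0] = 10       # not allowed, raises TypeError
except TypeError:
    tuple_changed = False
else:
    tuple_changed = True

numbers = [1, 2]        # list: mutable
numbers[0] = 10         # allowed

print(tuple_changed)    # False
print(point)            # (1, 2)
print(numbers)          # [10, 2]
```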
For numeric variables, we may have the following data types: int, float and complex.
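As an illustration, Python's built-in numeric types are int, float and complex. The variable names and values below are hypothetical.

```python
count = 10        # int: whole number
price = 2.5       # float: number with a decimal point
z = 3 + 4j        # complex: real part 3, imaginary part 4

type_names = [type(v).__name__ for v in (count, price, z)]
print(type_names)                 # ['int', 'float', 'complex']

total = count + price             # mixing int and float gives a float
print(total)                      # 12.5
print(abs(z))                     # 5.0
```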
STRINGS
A string is a sequence of characters (each character is itself a one-character string).
Example: Str="Welcome to Python Workshop" Or Str='Welcome to
Python Workshop'
Multi-line strings can be denoted using triple quotes, ''' or """
Example: Str="""Welcome to Python Workshop"""
N/B. Python is Case Sensitive!!!
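A minimal sketch of the three quoting styles and of case sensitivity. The variable names are hypothetical, and the line break inside the triple-quoted string is added here to show a multi-line value.

```python
single = 'Welcome to Python Workshop'
double = "Welcome to Python Workshop"
multi = """Welcome to
Python Workshop"""

print(single == double)            # True: the quote style does not matter
print(len(multi.splitlines()))     # 2: the triple-quoted string spans two lines
print("python" == "Python")        # False: comparisons are case sensitive
```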
STRING OPERATIONS
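Concatenation: "Jack" + "son" = "Jackson"
Repetition: "Miguna" * 2 = "MigunaMiguna"
Slicing: Str = "Miguna"
Str[1:5] = "igun"
Indexing: Str = "Miguna"
Str[-1] + Str[1] = "ai"
find(): Str.find("igun") returns 1, the index where the substring starts
replace(): Str.replace('Mi', 'J') returns "Jguna"
strip(): Str.strip() removes leading and trailing whitespace
count(): Str.count('i') returns 1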
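The operations above can be tried together in one cell. This is a small sketch; the padded string used for strip() is a made-up example.

```python
Str = "Miguna"

joined = "Jack" + "son"              # concatenation
repeated = Str * 2                   # repetition
sliced = Str[1:5]                    # characters at index 1 up to, not including, 5
indexed = Str[-1] + Str[1]           # last character plus second character
position = Str.find("igun")          # index where the substring starts
replaced = Str.replace("Mi", "J")    # returns a new string; Str is unchanged
stripped = "  Miguna  ".strip()      # leading and trailing spaces removed
occurrences = Str.count("i")         # how many times "i" appears

print(joined)        # Jackson
print(repeated)      # MigunaMiguna
print(sliced)        # igun
print(indexed)       # ai
print(position)      # 1
print(replaced)      # Jguna
print(stripped)      # Miguna
print(occurrences)   # 1
```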