Introduction to the Python Programming Language
What is Python?
Python is an interpreted, object-oriented programming language.
It was created by Guido van Rossum and first released in 1991.
It is an open-source language.
Why Python?
Python is very easy to install, learn and use.
It is a very high-level language that is easy to understand.
Most of the syntax is based on simple English words.
Powerful programs can be created with a few lines of code.
A comparison between Python, Java and C++
It only takes one line of code to print “Hello World” in Python
It is free to use; it is open-source
Many large organizations use it, including Google, Yahoo and NASA.
It has many readily available library packages that can be imported and used
(in Python, an importable library file is called a module, and a collection of modules is called a package).
It is widely used today and is likely to remain in use for years to come.
If you are a student in information technology or computer science, you will
likely use it a lot in the future!
Contents
Installation
Running a Python Program
Variables
Comments
Strings
Lists
Tuples
Dictionaries
Sets
If Statements
Loops
Functions
Installation
Python can be downloaded from the official website, python.org/downloads.
At the time of writing, the latest versions were Python 3.4.2 and Python 2.7.9. We will be using
version 3.4.2.
The Natural Language Toolkit (NLTK)
Introduction to NLTK
NLTK is a set of Python modules for carrying out many common natural
language processing (NLP) tasks.
It is a collection of:
Python functions and objects for accomplishing NLP tasks
sample texts (corpora)
The Natural Language Toolkit (NLTK) provides:
Basic classes for representing data relevant to natural language processing.
Standard interfaces for performing tasks, such as tokenization, tagging, and
parsing.
Standard implementations of each task, which can be combined to solve complex
problems.
NLTK: Example Modules
nltk.tokenize: splitting text into individual elements, such as words or
sentences.
nltk.probability: modeling frequency distributions and
probabilistic systems.
nltk.tag: tagging tokens with supplemental information, such as
parts of speech or WordNet sense tags.
nltk.parse: high-level interface for parsing texts.
nltk.parse.chart: a chart-based implementation of the parser
interface.
nltk.chunk: a regular-expression based surface (shallow) parser.
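As a rough illustration of how two of these modules can be combined, here is a minimal sketch. It assumes NLTK is already installed and uses the module names from current NLTK releases (nltk.tokenize and nltk.probability). The sample sentence is made up, and wordpunct_tokenize is chosen here because it is regular-expression based and should not need any downloaded NLTK data.

```python
from nltk.tokenize import wordpunct_tokenize
from nltk.probability import FreqDist

# A made-up sample text for illustration
text = "Python is easy. Python is powerful."

# Split the text into word and punctuation tokens
tokens = wordpunct_tokenize(text)
print(tokens)

# Count how often each token occurs
freq = FreqDist(tokens)
print(freq["Python"])  # 2
```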
Installing NLTK
Download NLTK from https://pypi.python.org/pypi/nltk#downloads
To install packages/modules in Python
Using the pip installer
Open a command prompt (cmd) and type:
pip install (package/module name), if pip is on your PATH.
C:/Python34/Scripts/pip install (package/module name), if it is not; this uses the full path to pip.
For NLTK, the package name is nltk.
Installing NLTK Data
NLTK comes with many corpora, toy grammars, trained models, etc.
http://www.nltk.org/nltk_data/
Use NLTK’s data downloader as described below.
Run the Python interpreter and type the commands:
>>> import nltk
>>> nltk.download()
In the window that opens, click on the File menu and select Change Download Directory. For central
installation, set this to C:\nltk_data (Windows).
Select “all” and click the Download button.
Running a Python Program
IDLE, the development environment that ships with Python, includes an interactive
interpreter. This means that each statement or command is executed individually, as soon as it is entered.
[Screenshot: a command typed into IDLE and the output it produces]
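To give a feel for this, here is a small sketch of two statements you might type one at a time at the IDLE prompt (>>>). The variable name total is made up for illustration.

```python
# In IDLE, each of these lines would be typed after the >>> prompt
# and executed as soon as you press Enter.
total = 2 + 3   # this statement stores a result; nothing is displayed
print(total)    # this statement displays the stored value: 5
```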
Variables in Python
To create a variable in Python, the equal sign (=) is used.
A variable has a name and a value.
var = 10
This statement creates a new variable called “var”, which has a value of 10.
You can use the built-in “type()” function to check the data type of a variable.
The basic data types include:
int means integer (a whole number)
str means string
bool means Boolean
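A short sketch of checking data types with type(). Only var comes from the text above; the names name and is_easy are made up for illustration.

```python
var = 10            # an integer
name = "Python"     # a string (hypothetical variable)
is_easy = True      # a Boolean (hypothetical variable)

print(type(var))      # <class 'int'>
print(type(name))     # <class 'str'>
print(type(is_easy))  # <class 'bool'>
```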
Comments
Comments let programmers write notes and explanations within
the code, so that they can remember what certain parts of the code are for.
Comments are ignored by the Python interpreter.
Comments in Python are very easy to write:
Use a “#” before the comment for a single-line comment.
For a multi-line comment, start each line with “#”, or use a triple-quoted string that is not assigned to anything:
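Both comment styles might look like this. This is a minimal sketch, and the variable price is made up for illustration.

```python
# This is a single-line comment
price = 20  # a comment can also follow a statement on the same line

"""
This triple-quoted string is not assigned to anything,
so it has no effect and can serve as a multi-line note.
"""
print(price)
```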
Strings
Strings are just sequences of characters and symbols. Words and sentences
are strings.
Let's create the following string:
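The original example string is not shown here, so the sketch below assumes a name (string1) and a value purely for illustration.

```python
# A string is written between quotes; single or double quotes both work
string1 = "Hello World"   # assumed example value
print(string1)
print(type(string1))      # <class 'str'>
```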
Indexing strings
Indexing is accessing a certain character of a string:
For example, the word “Python” is a sequence of characters. Each character
has an index number that is automatically created.
P y t h o n
0 1 2 3 4 5
Create a new string called string2 with a value of “Python”
Type string2[0] to access the first character
Type string2[1] to access the second character
Which character would string2[4] output?
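The indexing steps above can be sketched as follows. The names first and second are made up, and the last line is left for you to predict before running it.

```python
string2 = "Python"

first = string2[0]    # index 0 is the first character
second = string2[1]   # index 1 is the second character
print(first)          # P
print(second)         # y

print(string2[4])     # try to predict this one before running it
```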
Slicing strings
Slicing is a technique in Python to cut a string into a substring.
It uses the index numbers of the string to mark where to cut the string.
P y t h o n
0 1 2 3 4 5
Let's slice the string “Python” to get the substring “tho”, using string2[2:5].
Note: We use 5 and not 4 because the end index of a slice is not included in the result.
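A minimal sketch of this slice. The name substring is made up for illustration.

```python
string2 = "Python"

# Start at index 2 and stop before index 5,
# so the characters at indexes 2, 3 and 4 are kept
substring = string2[2:5]
print(substring)   # tho
```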
Creating a multi-line string
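A string that spans several lines can be written between triple quotes. The sketch below uses a made-up variable name and text.

```python
# Triple quotes allow the string to continue over several lines
poem = """Roses are red,
Violets are blue."""
print(poem)
```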
String functions
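There are many string methods and built-in functions that can be used on strings.
For example, lower() converts a string to lowercase.
Other functions are:
upper() : string is converted to UPPERCASE
min() : the smallest character (the one that comes first in character order) is output
max() : the largest character (the one that comes last in character order) is output
len() : length of string is output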
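Here is how these functions might be tried on a sample string. The variable word is made up for illustration. Note that lower() and upper() are called on the string itself, while min(), max() and len() take the string as an argument.

```python
word = "Python"

print(word.lower())  # python
print(word.upper())  # PYTHON
print(min(word))     # P  (uppercase letters come before lowercase ones)
print(max(word))     # y
print(len(word))     # 6
```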
Other data types in Python
Other than numbers, strings, and Booleans, there are other data types that
can be assigned to variables in Python. These are VERY important in Python.
They are:
Lists
Tuples
Sets
Dictionaries
We will define each type and work through a simple example for each.
Lists
A list is similar to an array. It is a sequence of values, called elements. Each element has an
index number.
A list is very similar to a string. The main differences are that a string is a
sequence of CHARACTERS, while a list is a sequence of values of ANY TYPE, and that a list can be changed after it is created.
A list can consist of numbers, strings, or a combination of different types.
To define a list in Python, use square brackets:
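A minimal sketch of defining lists. The names and values are made up for illustration.

```python
numbers = [1, 2, 3, 4, 5]        # a list of numbers
mixed = [1, "two", 3.0, True]    # a list can mix different types

print(numbers)
print(mixed)
print(len(mixed))   # 4
```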
Indexing and Slicing
Just like strings, lists support indexing and slicing.
In a list of six elements, the index numbers are:
0 1 2 3 4 5
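Indexing and slicing a six-element list might look like this. It is a sketch with a made-up list.

```python
letters = ["a", "b", "c", "d", "e", "f"]   # index numbers 0 to 5

print(letters[0])     # a
print(letters[2:5])   # ['c', 'd', 'e']  (index 5 is not included)

# Unlike a string, a list can be changed in place
letters[0] = "z"
print(letters)        # ['z', 'b', 'c', 'd', 'e', 'f']
```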
Tuples
A tuple is very similar to a list. However, the elements in a tuple cannot be
changed.
To define a tuple in Python, use parentheses (round brackets):
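A short sketch of a tuple, with made-up values. The try/except part is only there to show, without stopping the program, that changing an element raises an error.

```python
point = (3, 4)     # a tuple with two elements
print(point[0])    # 3  (indexing works as in lists)

try:
    point[0] = 10  # tuples cannot be changed
except TypeError as error:
    print("Cannot change a tuple:", error)

print(point)       # (3, 4)
```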
Dictionary
A dictionary is a more complex data type that consists of key-value pairs: each value is stored under a unique key (name) and looked up by that key.
To define a dictionary, use curly braces:
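A minimal sketch of a dictionary. The keys and values are made up for illustration.

```python
# Each entry is written as key: value
student = {"name": "Sara", "age": 20}

print(student["name"])   # look up a value by its key: Sara

student["age"] = 21      # change the value stored under an existing key
print(student["age"])    # 21
```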
Sets
A set is an unordered group that does not contain any duplicate elements.
Unordered means that there are no index numbers for the elements.
To define a set:
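A sketch of a set named fruits. The exact elements are assumed, apart from “apple” being added twice.

```python
# Curly braces with plain values (no key: value pairs) create a set
fruits = {"apple", "banana", "apple", "orange"}

print(fruits)        # the order may vary, but "apple" appears only once
print(len(fruits))   # 3
```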
If a set is defined with identical elements, the duplicates are
removed automatically. For example, if the set fruits is defined with two “apple” elements,
printing it shows only one.
Data types
So the data types that can be assigned to variables in Python are:
Numbers
Strings
Booleans
Lists
Tuples
Dictionaries
Sets
They are very useful when dealing with larger programs, and every data type has
its own benefits.
Be careful with indentation
Python does not use brackets to separate code blocks. Unlike most other programming
languages, it uses whitespace (indentation) instead, so whitespace is very important.
For example,
if (x > y):
    print("x is greater than y")
    print("x has a greater value")
    # another statement
    # final statement
else:
    print("x is not greater than y")
Python uses an indentation of whitespace to separate code blocks. Everything with an
indentation under the if line will be executed if the condition (x > y) is satisfied.
If you are new to this, it will take some time to get used to it.
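To see indentation at work, here is a runnable sketch along the same lines. The values of x and y and the variable message are assumptions added for illustration.

```python
x = 10   # assumed value
y = 5    # assumed value

if x > y:
    # both indented lines belong to the if block
    message = "x is greater than y"
    print(message)
else:
    # both indented lines belong to the else block
    message = "x is not greater than y"
    print(message)

# this line is not indented, so it runs whichever branch was taken
print("comparison finished")
```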
If statements
The if statement in Python allows the execution of statements based on
conditions.
This example requires multiple lines, so we will not use the interactive IDLE window for this
exercise. It is easier to open a new file.
Go to FILE > NEW FILE
Type in the code for this example (here we have two conditions)
Save the file to the desktop as “test.py”
To execute this code, go to RUN > RUN MODULE
The output should appear in the IDLE window
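The original code for this exercise is not shown, so the following is only a sketch of what a test.py with two conditions might contain. The variable names, values and messages are assumptions.

```python
# test.py: an if statement with two conditions (if and elif)
x = 7   # assumed value

if x > 10:
    result = "x is greater than 10"
elif x > 5:
    result = "x is greater than 5 but not greater than 10"
else:
    result = "x is 5 or less"

print(result)
```

Changing the value of x and running the module again shows how each branch is selected.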
Loops
Loops in Python are used for iteration. There are two kinds of loops:
While Loop
For Loop
While Loop
while (condition):
    statements
    statements
    statements
Everything with an indentation will be executed if the condition is satisfied, and repeated
for as long as it remains satisfied. Make sure you add an indentation to all the statements
to be executed.
Open a new file from IDLE and try this while
loop:
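A minimal while loop to try. The counter name and limit are made up for illustration.

```python
count = 1

# The indented statements repeat as long as the condition is true
while count <= 5:
    print(count)
    count = count + 1   # without this line the loop would never end

print("Loop finished")
```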
For Loop
The other type of loop is called a for loop:
for item in sequence:
    statements
    statements
    statements
The loop repeats its statements once for each item in the sequence.
range() provides a range of numbers for the loop to work with.
Open a new file and try this for loop:
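A small for loop using range(). The names number and total are made up for illustration.

```python
total = 0

# range(5) provides the numbers 0, 1, 2, 3 and 4
for number in range(5):
    print(number)
    total = total + number

print("Sum:", total)   # Sum: 10
```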
Functions
A function is a convenient way to call a block of code when needed, so that
we do not have to keep writing the same code many times.
To define a function in Python:
def function_name():
    function_statements
To call a function:
function_name()
Try this exercise:
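The original exercise is not shown, so here is one possible sketch. The function names greet and add are made up for illustration, and add also shows a function that takes values and returns a result.

```python
def greet():
    # a function with no parameters
    print("Hello from a function!")

def add(a, b):
    # a function that takes two values and returns their sum
    return a + b

greet()            # call the function
greet()            # it can be called again without rewriting the code
print(add(2, 3))   # 5
```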
Python Tutorials Links
https://wiki.python.org/moin/BeginnersGuide
http://www.afterhoursprogramming.com/tutorial/Python/Overview/
http://datasentimentanalysis.com/python-review-2/