Data Mining I: Introduction to Python
DWS Group | Data Mining 1 1
Python
• Started in 1989 by Guido van Rossum
– The name is a tribute to the British comedy group Monty Python
• Multi-paradigm programming language
– object-oriented, structured, functional, aspect-oriented
programming
– further paradigms are supported through extensions
• Design goals
– Be extensible, simple, and readable
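To make "multi-paradigm" concrete, here is a minimal sketch that computes the same sum of squares in a structured, a functional, and an object-oriented style. The names values and SquareSummer are made up for illustration and do not come from the slides.

```python
from functools import reduce

values = [1, 2, 3, 4]

# Structured (imperative) style: an explicit loop and an accumulator.
total_loop = 0
for v in values:
    total_loop += v * v

# Functional style: combine map and reduce instead of mutating state.
total_functional = reduce(lambda acc, v: acc + v, map(lambda v: v * v, values), 0)


# Object-oriented style: bundle the data and the behaviour in a class.
class SquareSummer:
    def __init__(self, numbers):
        self.numbers = list(numbers)

    def total(self):
        return sum(n * n for n in self.numbers)


total_oo = SquareSummer(values).total()

print(total_loop, total_functional, total_oo)  # 30 30 30
```

All three variants give the same result; which style reads best depends on the task.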
Installation
• Install Anaconda (Python Distribution)
– https://www.anaconda.com/download/
– Use Python 3.7
• If you don't have at least 3 GB of disk space
– Option 1 (better): Get a bigger disk!
– Option 2: install miniconda
• https://docs.conda.io/en/latest/miniconda.html
Popularity
https://www.kdnuggets.com/2018/05/poll-tools-analytics-data-science-machine-learning-results.html
What does it look like?
> print("Hello, world!")
Hello, world!

> 1 + 2
3

> x = 5
> 3*x
15

> if x > 3:
>     print("greater than 3")
> else:
>     print("less or equal 3")
greater than 3

> list = [1,2,3]
> for x in list:
>     x*5
5
10
15
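The interactive prompt echoes the value of each expression, but a script does not. As a rough sketch, the same examples written as a script file need explicit print() calls. The variable names message, numbers and results are illustrative choices, not part of the slides.

```python
# Script version of the interactive examples above.
print("Hello, world!")

x = 5
print(3 * x)  # 15

if x > 3:
    message = "greater than 3"
else:
    message = "less or equal 3"
print(message)

# Named "numbers" rather than "list" so the built-in list type is not hidden.
numbers = [1, 2, 3]
results = []
for n in numbers:
    results.append(n * 5)
    print(n * 5)  # prints 5, 10, 15 on separate lines
```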
How do I do it?
• After installation, you should have python on your path
– Just type “python” in your command line to start it
• In the exercises, we will use Jupyter Notebooks
– Type “jupyter notebook” in your command line
– More on the following slides!
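Once Python starts, a quick check like the following can confirm which interpreter is actually on the path. This is only a sketch; the printed paths and version numbers depend on the local installation.

```python
import sys

# Version of the interpreter that is currently running, e.g. 3 and 7.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")

# Path of the interpreter executable; with Anaconda this usually points
# into the Anaconda installation folder.
print(sys.executable)

# The slides ask for Python 3, so anything else hints at a wrong installation.
is_python3 = major == 3
print("Python 3 detected:", is_python3)
```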
Hands on
• Start Jupyter - Option 1 (Windows)
– Click on the Jupyter Notebook icon in the start menu
– The Jupyter Notebook App can access only files within its start-up
folder (including any sub-folder)
• default is your home folder (usually C:\Users\{username} )
– To change this folder:
• Copy the Jupyter Notebook launcher from the menu to the desktop.
• Right-click the new launcher, open its properties, and in the Target field
replace %USERPROFILE% with the full path of the folder which will contain all
the notebooks.
• Use the Jupyter Notebook desktop launcher to start the notebook
Hands on
• Start Jupyter – Option 2 (Linux and Windows)
– Run “jupyter notebook” in the command line
• First navigate to the folder that you want to access!
• Or (Windows): “Shift + Right Click” in the corresponding folder and then
“open command window/power shell here”
• Start Jupyter – Option 3 (Mac OS)
– Click on spotlight, type “terminal” to open a terminal window
– Enter the startup folder by typing “cd /some_folder_name”.
– Type “jupyter notebook” to launch the Jupyter Notebook App
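To check from inside Python which folder you ended up in, a small sketch like this prints the current working directory and the notebook files it contains. It assumes the working directory is the folder of interest; in a notebook this is normally the folder of the notebook file itself.

```python
from pathlib import Path

# Folder the current Python process is working in.
current_folder = Path.cwd()
print(current_folder)

# Notebook files (*.ipynb) directly inside that folder, sorted by name.
notebooks = sorted(p.name for p in current_folder.glob("*.ipynb"))
print(notebooks)
```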
Hands on
• Open the URL on screen in your browser, if not already opened
Jupyter Home Screen
• Start screen in the browser
– Works like a file explorer
– You can upload files to the notebook folder and create new files/notebooks
Now try it out
• In the browser, click “New” -> “Python 3”
https://xkcd.com/353/
Jupyter Notebook
• Every notebook is composed of cells
– Cells contain a specific type of content
– markdown cells (for documentation and structure)
– code cells
(Figure: annotated notebook screenshot with the labels: name of the notebook (also the filename), menu bar, shortcuts, cell, type of the current cell)
Jupyter Cells
• Code cell:
– You can type Python code (because you created a Python notebook)
• Hit “Ctrl + Enter” to run the code
• Hit “Shift + Enter” to run it and move to the next cell (a new cell is created if there is none)
• Try it and type 1 + 2
– The output is shown below the cell
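One detail worth knowing: a code cell automatically displays only the value of its last expression, and earlier values need an explicit print(). The following sketch shows what a single cell might contain; the variable names first and second are illustrative.

```python
# Contents of one code cell.
first = 1 + 2
print(first)      # printed explicitly

second = first * 10
second            # as the last line of a cell, this value is shown below the cell
```

Run as a plain script, the last line produces no output; the automatic display is a feature of the interactive environment.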
Jupyter Cells
• Each “code cell” can be re-evaluated (the number next to the cell shows the execution order)
– All previous results / variables are kept (like in an R workspace)
– Example: change the code of a cell and run it again: [1] -> [3]; then run the next cell again: [2] -> [4]
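The effect of re-running cells can be sketched in plain Python, with comments marking where each cell would start. The variables counter, doubled and stale are hypothetical; the point is that a dependent cell keeps its old result until it is run again.

```python
# Cell A (first run, shown as [1]): define a variable.
counter = 10

# Cell B (first run, shown as [2]): uses the variable defined by cell A.
doubled = counter * 2
print(doubled)   # 20

# Cell A is edited and run again (now shown as [3]); counter changes ...
counter = 50
# ... but "doubled" still holds the old result until cell B is run again.
stale = doubled

# Cell B is run again (now shown as [4]) and picks up the new value.
doubled = counter * 2
print(doubled)   # 100
```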
Jupyter Cells
• Autocomplete by pressing <tab> when writing
• Signature of function by pressing <shift>+<tab>
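The information shown by the keyboard shortcut can also be queried from code. This sketch uses the standard inspect module on a small example function; scale is a hypothetical function defined here only for illustration.

```python
import inspect


def scale(values, factor=2):
    """Multiply every element of values by factor."""
    return [v * factor for v in values]


# Roughly the information that the tooltip displays: signature and docstring.
signature_text = str(inspect.signature(scale))
print(signature_text)          # (values, factor=2)
print(inspect.getdoc(scale))   # Multiply every element of values by factor.

# help(scale) prints the same information in a longer form.
```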
Jupyter Cells
• What makes a notebook a notebook?
– Markdown cells let you add documentation and notes
– Create a new cell (“Insert -> Insert Cell Below”)
– Change the type to Markdown (using the drop-down that shows the type of the current cell)
Jupyter Cells
• What makes a notebook a notebook?
– Type “# Test”, which creates a heading (add more “#” for a smaller heading)
• A space is required after the “#”
– Evaluate the cell and see the result
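The heading rule (more "#" characters give a smaller heading, followed by one space) can be written down as a tiny helper. This is an illustrative sketch; markdown_heading is a made-up function, not part of Jupyter.

```python
def markdown_heading(title, level=1):
    """Return a Markdown heading line; level 1 is the largest heading."""
    if not 1 <= level <= 6:
        raise ValueError("Markdown supports heading levels 1 to 6")
    # The space between the '#' characters and the title is required.
    return "#" * level + " " + title


for level in (1, 2, 3):
    print(markdown_heading("Test", level))
```

Pasting one of the printed lines into a Markdown cell and evaluating it renders the corresponding heading.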
Jupyter Cells - Markdown
• Different possibilities to structure
– Header
    # H1
    ## H2
    ### H3
– Unordered list (use "*", "+", or "-" in front)
    - Item
    - Item
– Ordered list
    1. Item one
    2. Item two
– Links
    [link to google](https://www.google.com)
– Image
– Quote
    > This is a quotation
Shut down Jupyter
• Closing the browser (or the tab) will not close the Jupyter App
– To completely shut it down, close the associated terminal
– Or press “Ctrl” + “C” in that terminal