Python How To Installation
md 2/28/2021
Python Installation
This document provides a guide for installing Python 3 onto your system, along with all the third-party packages
necessary to complete each course. It is important that you follow this guide precisely so that your system is
set up to run all of the code.
Preparing for Dunder Data Courses
Begin by creating the folder Dunder Data Courses in your file system. Move the entire contents of the folder
containing this document inside of the Dunder Data Courses folder. Move any other course downloads into
the Dunder Data Courses folder.
conda
There are a number of ways to set up your system to run Python. This tutorial relies upon a tool called conda
which is both a package manager and an environment manager.
Package manager
A package manager is a tool that installs, updates, and removes computer programs. For us, these computer
programs will be third-party Python packages. Third-party Python packages are not part of the Python
standard library (the libraries that come installed with every version of Python). There are many thousands of
third-party packages available to be installed onto your system.
Another popular package manager is pip, which existed long before conda and is the default package manager
for new Python installations. Although pip is a good tool, we will not use it, as conda contains more features and
resolves dependencies better. Additionally, pip is only a package manager and not an environment manager.
Packages are like phone applications
You can think of Python packages like applications on a phone. There are some applications that come pre-
installed on your phone, such as your contact list or text messaging. You can analogize these built-in
applications to the Python standard library. Any applications that you install after purchasing your phone can be
analogized to third-party Python packages.
What's the difference between 'package' and 'library'?
The terms package and library are closely related and for our current purposes refer to the same concept. They
are files of Python code contained within a single folder. The term package has a technical definition, but it's
not necessary to know at this point.
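To see that a package really is a folder of Python files, you can inspect one that ships with Python. This sketch uses the standard library package json as the example and assumes a regular Python installation where the source files are stored on disk.

```python
import json
from pathlib import Path

# json ships with Python; its code lives in a folder named "json".
package_folder = Path(json.__file__).parent
python_files = sorted(path.name for path in package_folder.glob("*.py"))

print("Package folder:", package_folder)
print("Python files in the folder:", python_files)
```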
Environment manager
An environment manager is a tool that creates an environment (sometimes referred to as a virtual environment),
an isolated section of your computer with its own installation of Python and own third-party packages that are
independent from any other Python installation on your machine.
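Each environment has its own Python executable and its own root folder. As a rough illustration, the sketch below reports which installation is running the current code; the function name environment_summary is hypothetical.

```python
import sys


def environment_summary() -> dict:
    """Describe the Python installation that is running this code."""
    return {
        # Root folder of the running installation or environment.
        "prefix": sys.prefix,
        # Full path of the Python executable interpreting this code.
        "executable": sys.executable,
        # Version as major.minor.micro.
        "version": ".".join(str(part) for part in sys.version_info[:3]),
    }


for key, value in environment_summary().items():
    print(f"{key}: {value}")
```

Running this from two different environments should show two different prefix and executable paths.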
Miniconda
Miniconda is a distribution of Python that includes conda and a small number of other packages that conda
depends on.
What is a 'distribution'?
A distribution of Python is any group of files that contains software to install the Python programming language
onto your machine. Distributions usually have other files too. For instance, I could make a distribution called
"Ted's Python Distribution" which comes with Python along with third-party packages A and C. Someone else
could create "Penelope's Python Distribution" containing Python and third-party packages A, B, and E.
Python installation
Python comes preinstalled on macOS and most Linux operating systems, but it is not a good idea to use this
preinstalled version for development as some critical system software might rely on it. It's also usually not the
latest version of Python. Instead, we will install a completely new version of Python in a different location. You
can go directly to Python.org, but we will obtain it through the Miniconda distribution, which also installs conda.
Miniconda download
DOWNLOAD MINICONDA HERE - Choose the Python 3 installation for your operating system. Both Windows
and macOS have graphical installers (.pkg file for macOS). macOS and Linux both have command line installers
(.sh file). There will be instructions for both the graphical and command line installers. Choose the graphical
installer if you have limited experience using the command line.
Many people will be aware of Anaconda, a different distribution of Python that contains hundreds of packages
and other software. Both Anaconda and Miniconda are maintained by a company called Anaconda. They both
install the same version of Python and conda. The reason I recommend installing Miniconda is that Anaconda
installs many unnecessary packages and software that you will likely never use. Any software contained in the
Anaconda distribution, but not in Miniconda, may be installed at a later date.
Keeping Anaconda
If you already have Anaconda, you can either uninstall it or keep it. Even if you are happy with the current
status, you might consider uninstalling it as there is quite a lot of excess software and it does not take too much
effort to get a minimal clean installation.
If you are keeping Anaconda
This section is only for those that would like to keep their Anaconda distribution. Do NOT install Miniconda. If
you are on a Windows machine, open up the program Anaconda Prompt. For macOS and Linux users, open up
the terminal program.
Run the following from the command line:
conda update conda
This updates conda to the latest version, which is necessary to ensure that your system is set
up properly. Skip down to the section titled Python and conda installation complete.
Uninstalling Anaconda
This section is only for those that would like to uninstall the Anaconda distribution. Open the official Uninstalling
Anaconda page. For macOS and Linux, use option B first and then complete the simple remove with option A.
Miniconda installation
We will now continue with the Miniconda installation assuming you have downloaded the correct file for your
operating system.
Windows
The name of the Windows file begins with Miniconda3-latest-Windows. Start the setup and after agreeing to
the license you will be given the choice to install for 'Just Me' or 'All Users'. It's best to select 'Just Me' as this will
give you full control of the installation without needing to have administrator rights to install new packages.
The next step asks you for a file location for the installation. Select the default location,
C:\Users\<UserName>\Miniconda3, where <UserName> is the name of your user folder.
Add to PATH?
The next screen asks whether you'd like to add to PATH and recommends that you do not do so. The main
advantage of adding Miniconda to the path is to have access to it directly from the Command Prompt program.
This is unnecessary as Miniconda provides a small program called Anaconda Prompt that does add the
necessary file location to the PATH. Keep the defaults as they are and complete the installation.
macOS
Graphical installer
The graphical installer file begins with Miniconda3-latest-MacOSX and ends with .pkg. After agreeing to the
license, you will be asked to choose either 'Install for me only' or 'Install on a specific disk'. Choose 'Install
for me only' and select the default location, which is /Users/<UserName>/Miniconda3, where <UserName>
is your specific user name. This completes the installation.
Command line installer
The command line installer file ends in .sh. Open up your terminal and change directories to the location where
you downloaded the installer and then run the following command. Make sure to use bash regardless of the
shell you are using.
bash Miniconda3-latest-MacOSX-x86_64.sh
A stream of text appears; press Enter to move through it. You'll be prompted to agree to the
license and then to accept the default location for the installation, which will be
/Users/<UserName>/Miniconda3. Press Enter exactly once to accept it. The installation will appear to have paused,
which may tempt you to press Enter again. Do not do this. Instead, just wait patiently.
You will then be prompted to initialize Miniconda3. Enter 'yes' to complete the installation. Exit the terminal
program.
Linux
If you aren't able to use a web browser, you should be able to download the installer with the following
command:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Open up your terminal and change directories to the location where you downloaded the installer and then run
the following command making sure to use bash regardless of the shell you are using.
bash Miniconda3-latest-Linux-x86_64.sh
A stream of text appears; press Enter to move through it. You'll be prompted to agree to the
license and then to accept the default location for the installation, which will be
/home/<UserName>/miniconda3. Press Enter exactly once to accept it. The installation will appear to have paused,
which may tempt you to press Enter again. Do not do this. Instead, just wait patiently.
You will then be prompted to initialize Miniconda3. Enter 'yes' to complete the installation. Exit the terminal
program.
Python and conda installation complete
After completing the steps above, you will have finished installing both Python and conda along with a small
number of other Python packages.
Test installation
Let's test that we have a successful installation.
Windows users
Windows users MUST begin by starting the program Anaconda Prompt. You can find it by
pressing the Windows key and then typing the exact name 'Anaconda Prompt'.
macOS/Linux users
Start your terminal program.
All users
Run the command conda list which returns the name, version, build number, and channel for each package
currently installed.
This is a list of all the third-party packages that come installed with the default Miniconda distribution. Notice
that Python itself is considered a package. This might appear to be a large number of packages, but it is a fraction of
what is installed with the full Anaconda distribution.
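A rough Python-side counterpart to conda list can be sketched with the standard library module importlib.metadata. This is only an approximation: it sees Python packages that carry installation metadata, so it will usually show fewer entries than conda list, and it reports no build number or channel. The function name installed_packages is made up for illustration.

```python
from importlib.metadata import distributions


def installed_packages() -> dict:
    """Map each installed Python package name to its version string."""
    packages = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries without usable metadata
            packages[name] = dist.version
    return packages


for name, version in sorted(installed_packages().items(), key=lambda item: item[0].lower()):
    print(f"{name:30} {version}")
```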
Verifying the Python Installation
By default, Miniconda creates an environment with name 'base' that has all of the packages displayed from
conda list installed. This is the only environment that it creates. Notice that '(base)' has been prepended to
the prompt to indicate that this environment is active. When an environment is active, it means that your Python
code will be interpreted by the Python executable in that environment.
To verify that the correct version of Python runs, we will find the exact location in our file system where the
Python executable is located.
Windows users
Run the command where python. It should be in the Miniconda folder that was just created.
macOS/Linux users
Run the command type python. It should be in the Miniconda folder that was just created.
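The same lookup can be sketched from inside Python. shutil.which searches the PATH in much the same way as where python or type python, and sys.executable shows the interpreter that is actually running. The helper name python_on_path is hypothetical.

```python
import shutil
import sys


def python_on_path():
    """Return the path of the first 'python' found on PATH, or None."""
    return shutil.which("python")


print("Running interpreter:  ", sys.executable)
print("First python on PATH: ", python_on_path())
```

With the base environment active, both paths would be expected to point inside the Miniconda folder.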
To deactivate the base environment, run the following:
conda deactivate
You no longer have an active environment, and '(base)' will be removed from the prompt. Run where python
(Windows) or type python (macOS/Linux) and you will see either an error such as "command not found" or
the location of your system Python. To activate the environment again, run the following:
conda activate base
Always make sure your base environment is active before running any of the course material.
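If you want to confirm the active environment from Python rather than from the prompt, one option is to read the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated. This is a minimal sketch; the function name active_conda_environment is an assumption for illustration.

```python
import os


def active_conda_environment():
    """Return the name of the active conda environment, or None."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return os.environ.get("CONDA_DEFAULT_ENV")


name = active_conda_environment()
if name is None:
    print("No conda environment is active.")
else:
    print(f"Active conda environment: {name}")
```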
Installing third-party packages
We will now use conda to install third-party data science packages into the base environment. Some of the
most popular and powerful data science libraries are:
numpy - Numerical computation on arrays of data. Forms the base of many scientific computing packages.
pandas - Data exploration and analysis. Does not do machine learning.
scikit-learn - Supervised and unsupervised machine learning.
matplotlib - Data visualization.
seaborn - Simplified data visualization.
We will also install a package called sqlalchemy allowing us to connect to external databases.
Jupyter Notebooks
Jupyter Notebooks are a tool that allows you to execute code and embed detailed notes all in one document.
They are great for learning and teaching. We use Jupyter Notebooks often throughout the courses and will
install them as well.
Installation command
Install all of these packages with the following command:
Notice that not all of the packages are listed in the install command. Those packages not listed are required
dependencies for one or more of the packages that are listed. For instance, numpy, pandas, and matplotlib are
all dependencies of seaborn. Take a look at the packages to be installed before confirming. Conveniently, this
list shows the name of the package, the version number, and the size.
Notice the extra -c conda-forge in the command above. The -c option stands for channel. A channel is a
repository of third-party packages. By default, conda always installs packages from the defaults channel. The
Jupyter Notebook Extensions package is not available on the defaults channel. There is a different channel
named conda-forge that hosts this package.
You and anyone else can create a channel on Anaconda.org and upload packages to it. The team at Anaconda
maintains the defaults channel so you can have confidence that the packages you install from there are in good
working order. The conda-forge channel has many more packages than the defaults channel and is maintained
independently by a community of Python developers.
Environment setup complete
Your environment should now be set up correctly to run Python within a Jupyter Notebook.
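One way to check the setup from Python is to confirm that each package named in this guide can be found by the import system. The sketch below does not import anything; it only looks each module up. Note that scikit-learn is imported under the name sklearn. The list COURSE_MODULES and the function missing_modules are illustrative names, not part of any library.

```python
from importlib.util import find_spec

# Import names for the packages named in this guide.
# scikit-learn is imported as "sklearn".
COURSE_MODULES = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn", "sqlalchemy"]


def missing_modules(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_modules(COURSE_MODULES)
if missing:
    print("Not found in this environment:", ", ".join(missing))
else:
    print("All course packages were found.")
```

Run it with the base environment active; any name it reports can be installed with conda.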
Other Considerations
A few other items not mentioned above are worth discussing.
Installing new packages
As you expand your skills and complete new projects, it is highly likely that you will need a package that you
don't currently have installed. To install it, go to your command line and enter the following (replacing
packagename with the name of the package).
conda install packagename
Updating packages
Python and most of its popular third-party packages are under constant development and make new releases
from time to time. You can update all of the packages at once with the following command:
conda update --all
Before the update happens, conda shows you a list of all the packages that it will update. This allows you to
view all the latest versions of each package and make a decision on whether to update or not. To update an
individual package run conda update packagename.
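Before or after an update, it can be useful to see which version of a package is currently installed. A minimal sketch using the standard library module importlib.metadata follows; the function name installed_version is hypothetical, and packagename is the same placeholder used above, so it is expected to report None.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name: str):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


for name in ["numpy", "pandas", "packagename"]:
    print(name, installed_version(name))
```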
Updating conda
It's also important to update the tool conda itself from time to time. To update it, run the following command:
conda update conda