Best Python libraries for Machine Learning
Difficulty Level : Easy ● Last Updated : 27 Sep, 2021
Machine Learning, as the name suggests, is the science of programming computers so
that they can learn from different kinds of data. A more general definition given by
Arthur Samuel is: “Machine Learning is the field of study that gives computers the
ability to learn without being explicitly programmed.” It is typically used to solve a
wide range of real-life problems.
In the past, people performed Machine Learning tasks by manually coding all the
algorithms and the mathematical and statistical formulas. This made the process
time-consuming, tedious, and inefficient. Today it has become much easier and more
efficient thanks to various Python libraries, frameworks, and modules. Python is now
one of the most popular programming languages for this task and has replaced many
other languages in the industry, one reason being its vast collection of libraries.
Python libraries that are used in Machine Learning include:
NumPy
SciPy
Scikit-learn
Theano
TensorFlow
Keras
PyTorch
Pandas
Matplotlib
NumPy
NumPy is a very popular Python library for processing large multi-dimensional arrays
and matrices, backed by a large collection of high-level mathematical functions. It is
very useful for fundamental scientific computations in Machine Learning, particularly
for its linear algebra, Fourier transform, and random number capabilities. High-level
libraries such as TensorFlow interoperate with NumPy arrays when manipulating
tensors.
Python3
# Python program using NumPy
# for some basic mathematical
# operations
import numpy as np
# Creating two arrays of rank 2
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
# Creating two arrays of rank 1
v = np.array([9, 10])
w = np.array([11, 12])
# Inner product of vectors
print(np.dot(v, w), "\n")
# Matrix and Vector product
print(np.dot(x, v), "\n")
# Matrix and matrix product
print(np.dot(x, y))
Output :
219
[29 67]
[[19 22]
[43 50]]
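The linear algebra, Fourier transform, and random number capabilities mentioned above can be sketched briefly as follows. This is a minimal illustration only; the matrix, signal, and seed values are arbitrary choices made for this example.

```python
import numpy as np

# Linear algebra: solve the system A @ x = b
# (3x + y = 9 and x + 2y = 8, whose solution is x = 2, y = 3)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = np.linalg.solve(A, b)
print(solution)

# Fourier transform: find the dominant frequency of a 5 Hz sine wave
# sampled at 64 points over one second
t = np.arange(64) / 64
signal = np.sin(2 * np.pi * 5 * t)
spectrum = np.abs(np.fft.rfft(signal))
peak = int(np.argmax(spectrum))
print(peak)

# Random numbers: draw reproducible samples from a normal distribution
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)
print(samples.mean(), samples.std())
```

Using `np.random.default_rng` with a fixed seed keeps the random draws reproducible between runs.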
For more details refer to NumPy.
SciPy
SciPy is a very popular library among Machine Learning enthusiasts, as it contains
modules for optimization, linear algebra, integration, and statistics. There is a
difference between the SciPy library and the SciPy stack: the SciPy library is one of
the core packages that make up the SciPy stack. SciPy is also very useful for image
manipulation.
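Python3
# Python script using SciPy
# for image manipulation
# Note: scipy.misc.imread, imsave and imresize have been removed from
# recent SciPy releases. This version assumes the imageio package is
# installed for file I/O and uses scipy.ndimage for resizing.
import numpy as np
import imageio.v2 as imageio
from scipy import ndimage
# Read a JPEG image into a numpy array
img = imageio.imread('D:/Programs/cat.jpg') # path of the image
print(img.dtype, img.shape)
# Tinting the image by scaling the red, green and blue channels
img_tint = (img * [1, 0.45, 0.3]).astype(np.uint8)
# Saving the tinted image
imageio.imwrite('D:/Programs/cat_tinted.jpg', img_tint)
# Resizing the tinted image to be 300 x 300 pixels
zoom_factors = (300 / img_tint.shape[0], 300 / img_tint.shape[1], 1)
img_tint_resize = ndimage.zoom(img_tint, zoom_factors, order=1)
# Saving the resized tinted image
imageio.imwrite('D:/Programs/cat_tinted_resized.jpg', img_tint_resize)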
Original image :
Tinted image :
Resized tinted image :
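The optimization, integration, and statistics modules mentioned at the start of this section can be sketched in a few lines. This is a minimal illustration; the objective function and the sample data are made up for this example.

```python
import numpy as np
from scipy import integrate, optimize, stats

# Optimization: minimize (x - 2)^2, whose minimum lies at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)
print(result.x)

# Integration: the integral of sin(x) from 0 to pi equals 2
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)
print(area)

# Statistics: summary statistics of a small illustrative sample
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
summary = stats.describe(data)
print(summary.mean, summary.variance)
```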
For more details refer to documentation.
Scikit-learn
Scikit-learn is one of the most popular ML libraries for classical ML algorithms. It is
built on top of two basic Python libraries, viz., NumPy and SciPy. Scikit-learn supports
most of the supervised and unsupervised learning algorithms. It can also be used for
data mining and data analysis, which makes it a great tool for anyone who is starting
out with ML.
Python3
# Python script using Scikit-learn
# for Decision Tree Classifier
# Sample Decision Tree Classifier
from sklearn import datasets
from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier
# load the iris datasets
dataset = datasets.load_iris()
# fit a CART model to the data
model = DecisionTreeClassifier()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
Output :
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
            max_features=None, max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False, random_state=None,
            splitter='best')
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00        50
           2       1.00      1.00      1.00        50

   micro avg       1.00      1.00      1.00       150
   macro avg       1.00      1.00      1.00       150
weighted avg       1.00      1.00      1.00       150

[[50  0  0]
 [ 0 50  0]
 [ 0  0 50]]
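The perfect scores above arise because the model is evaluated on the same data it was trained on. A minimal sketch of a held-out evaluation, assuming a 70/30 split and a fixed `random_state` (both arbitrary choices for this example), might look like this:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the iris features and labels
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing, keeping class proportions equal
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit on the training portion only
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Score on samples the model has not seen during training
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(test_accuracy)
```

The test accuracy is usually a little below 1.0, which gives a more realistic picture of how the classifier generalizes.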
For more details refer to documentation.
Theano
Machine Learning is, at its core, mathematics and statistics. Theano is a popular
Python library used to define, evaluate, and optimize mathematical expressions
involving multi-dimensional arrays efficiently. It achieves this by optimizing the
utilization of the CPU and GPU. It also performs extensive unit-testing and
self-verification to detect and diagnose different types of errors. Theano is a very
powerful library that has long been used in large-scale, computationally intensive
scientific projects, yet it is simple and approachable enough for individuals to use in
their own projects.
Python3
# Python program using Theano
# for computing a Logistic
# Function
import theano
import theano.tensor as T
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
logistic = theano.function([x], s)
logistic([[0, 1], [-1, -2]])
Output :
array([[0.5, 0.73105858],
[0.26894142, 0.11920292]])
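For readers who do not have Theano installed, the same logistic function can be checked with plain NumPy. This is only a sketch of the formula s = 1 / (1 + exp(-x)) used above, not a replacement for Theano's symbolic graph compilation; the function name `logistic` mirrors the one in the Theano example.

```python
import numpy as np


def logistic(x):
    """Element-wise logistic (sigmoid) function: 1 / (1 + exp(-x))."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-x))


# Same input matrix as in the Theano example
values = logistic([[0, 1], [-1, -2]])
print(values)
```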
For more details refer to documentation.
TensorFlow
TensorFlow is a very popular open-source library for high-performance numerical
computation developed by the Google Brain team at Google. As the name suggests,
TensorFlow is a framework for defining and running computations involving tensors.
It can train and run deep neural networks that can be used to develop a variety of AI
applications. TensorFlow is widely used in deep learning research and applications.
Python3
# Python program using TensorFlow
# for multiplying two arrays
# import `tensorflow`
import tensorflow as tf
# Initialize two constants
x1 = tf.constant([1, 2, 3, 4])
x2 = tf.constant([5, 6, 7, 8])
# Multiply
result = tf.multiply(x1, x2)
# Print the result
# (TensorFlow 2.x runs eagerly, so no Session is needed; under
# TensorFlow 1.x, use tf.Session() and sess.run(result) instead)
print(result.numpy())
Output :
[ 5 12 21 32]
For more details refer to documentation.
Keras
Keras is a very popular Machine Learning library for Python. It is a high-level neural
networks API capable of running on top of TensorFlow, CNTK, or Theano. It can run
seamlessly on both CPU and GPU. Keras makes it really easy for ML beginners to build
and design a neural network. One of the best things about Keras is that it allows for
easy and fast prototyping.
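As a rough illustration of this kind of fast prototyping, the sketch below builds and trains a tiny network on synthetic data. It assumes the Keras API bundled with TensorFlow 2.x (`from tensorflow import keras`); the layer sizes, number of epochs, and the random data are arbitrary choices made for this example.

```python
import numpy as np
from tensorflow import keras

# Synthetic data: 200 samples with 4 features each; the label is 1
# when the features sum to a positive value (an illustrative rule only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small fully connected network defined layer by layer
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Choose the optimizer, loss and metric, then train for a few epochs
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# Predict probabilities for the first three samples
predictions = model.predict(X[:3], verbose=0)
print(predictions.shape)
```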
For more details refer to documentation.
PyTorch
PyTorch is a popular open-source Machine Learning library for Python based on Torch,
an open-source Machine Learning library implemented in C with a wrapper in Lua. It
offers an extensive choice of tools and libraries that support Computer Vision,
Natural Language Processing (NLP), and many other ML applications. It allows
developers to perform computations on tensors with GPU acceleration and also helps
in creating computational graphs.
Python3
# Python program using PyTorch
# for defining tensors fit a
# two-layer network to random
# data and calculating the loss
import torch
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and output data
x = torch.randn(N, D_in, device = device, dtype = dtype)
y = torch.randn(N, D_out, device = device, dtype = dtype)
# Randomly initialize weights
w1 = torch.randn(D_in, H, device = device, dtype = dtype)
w2 = torch.randn(H, D_out, device = device, dtype = dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min = 0)
    y_pred = h_relu.mm(w2)
    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)
    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
Output :
0 47168344.0
1 46385584.0
2 43153576.0
...
...
...
497 3.987660602433607e-05
498 3.945609932998195e-05
499 3.897604619851336e-05
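The example above computes the gradients by hand. PyTorch can also record the computational graph and derive gradients automatically through its autograd mechanism. A minimal sketch with an arbitrary three-element tensor:

```python
import torch

# Ask PyTorch to track operations on x so gradients can be computed
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x^2), so the gradient dy/dx is 2 * x
y = (x ** 2).sum()

# Walk the recorded graph backwards and populate x.grad
y.backward()
print(x.grad)
```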
For more details refer to documentation.
Pandas
Pandas is a popular Python library for data analysis. It is not directly related to
Machine Learning, but a dataset must be prepared before training, and this is where
Pandas comes in handy, as it was developed specifically for data extraction and
preparation. It provides high-level data structures and a wide variety of tools for
data analysis, including many built-in methods for grouping, combining, and filtering
data.
Python3
# Python program using Pandas for
# arranging a given set of data
# into a table
# importing pandas as pd
import pandas as pd
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252, 1357, 52.98]}
data_table = pd.DataFrame(data)
print(data_table)
Output :
For more details refer to Pandas.
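The filtering, grouping, and combining methods mentioned in this section can be sketched on the same country data. The `size_class` column, the threshold values, and the second `capitals` table are hypothetical additions made only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    "population": [200.4, 143.5, 1252, 1357, 52.98],
})

# Filtering: keep only rows whose population value exceeds 1000
large = df[df["population"] > 1000]
print(large)

# Grouping: label each country by area (hypothetical threshold of 5)
# and sum the population within each label
df["size_class"] = df["area"].apply(lambda a: "large" if a > 5 else "small")
population_by_size = df.groupby("size_class")["population"].sum()
print(population_by_size)

# Combining: left-join a second table on the shared "country" column
capitals = pd.DataFrame({
    "country": ["Brazil", "India"],
    "capital": ["Brasilia", "New Delhi"],
})
merged = df.merge(capitals, on="country", how="left")
print(merged)
```

Countries missing from the second table keep their rows in a left join and receive `NaN` in the `capital` column.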
Matplotlib
Matplotlib is a very popular Python library for data visualization. Like Pandas, it is
not directly related to Machine Learning. It comes in particularly handy when a
programmer wants to visualize the patterns in the data. It is a 2D plotting library used
for creating 2D graphs and plots. A module named pyplot makes plotting easy for
programmers, as it provides features to control line styles, font properties, axis
formatting, etc. It provides various kinds of graphs and plots for data visualization,
viz., histograms, error charts, bar charts, etc.
Python3
# Python program using Matplotlib
# for forming a linear plot
# importing the necessary packages and modules
import matplotlib.pyplot as plt
import numpy as np
# Prepare the data
x = np.linspace(0, 10, 100)
# Plot the data
plt.plot(x, x, label ='linear')
# Add a legend
plt.legend()
# Show the plot
plt.show()
Output :
For more details refer to documentation.
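Two of the other plot types mentioned in this section, a histogram and a bar chart with error bars, can be sketched as follows. The data values and the output file name `example_plots.png` are made up for this example, and the non-interactive `Agg` backend is selected so the script also runs on machines without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file instead of opening a window
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: 500 draws from a standard normal distribution
rng = np.random.default_rng(0)
samples = rng.normal(size=500)

# One figure with two side-by-side axes
fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram of the samples using 20 bins
counts, bin_edges, _ = ax_hist.hist(samples, bins=20)
ax_hist.set_title("Histogram")

# Bar chart with error bars supplied through yerr
bars = ax_bar.bar(["a", "b", "c"], [3, 7, 5], yerr=[0.5, 1.0, 0.8])
ax_bar.set_title("Bar chart with error bars")

fig.tight_layout()
fig.savefig("example_plots.png")
```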
Article Contributed By : Rahul_Roy
Improved By : Akanksha_Rai, adnanirshad158, rs1686740, sagar0719kumar