Unit III Python


Unit – III Python for Data Science

UNIT III: (10 hours) NumPy Basics: Arrays and Vectorized Computation - The
NumPy ndarray - Creating ndarrays - Data Types for ndarrays - Arithmetic with
NumPy Arrays - Basic Indexing and Slicing - Boolean Indexing - Transposing Arrays
and Swapping Axes. Universal Functions: Fast Element-Wise Array Functions -
Mathematical and Statistical Methods - Sorting - Unique and Other Set Logic

NumPy:
NumPy, short for Numerical Python, is one of the most important
foundational packages for numerical computing in Python. Here are some of the
things NumPy provides:
• ndarray, an efficient multidimensional array providing fast array-oriented
arithmetic operations and flexible broadcasting capabilities.
• Mathematical functions for fast operations on entire arrays of data without
having to write loops.
• Tools for reading/writing array data to disk and working with memory-mapped
files.
• Linear algebra, random number generation, and Fourier transform
capabilities.
• A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.
One of the reasons NumPy is so important for numerical computations in
Python is that it is designed for efficiency on large arrays of data. There are a
number of reasons for this:
• NumPy internally stores data in a contiguous block of memory,
independent of other built-in Python objects. NumPy's library of algorithms
written in the C language can operate on this memory without any type checking
or other overhead.
• NumPy arrays also use much less memory than built-in Python sequences.
• NumPy operations perform complex computations on entire arrays without
the need for Python for loops.
To get an idea of the performance difference, consider a NumPy array of one
million integers and the equivalent Python list:
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))

B V Raju College Page 1



Now let’s multiply each sequence by 2:


%time for _ in range(10): my_arr2 = my_arr * 2
CPU times: user 20 ms, sys: 8 ms, total: 28 ms
Wall time: 26.5 ms

%time for _ in range(10): my_list2 = [x * 2 for x in my_list]


CPU times: user 408 ms, sys: 64 ms, total: 472 ms
Wall time: 473 ms
NumPy-based algorithms are generally 10 to 100 times faster (or more)
than their pure Python counterparts and use significantly less memory.
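The %time command used above is an IPython feature and is not available in a plain Python script. As a rough sketch, the same comparison can be made with the standard timeit module. The variable names numpy_seconds, list_seconds and same_values are chosen here for illustration, and the measured times will differ from machine to machine.

```python
import timeit

import numpy as np

my_arr = np.arange(1_000_000)
my_list = list(range(1_000_000))

# Time 10 repetitions of each approach, as in the %time runs above.
numpy_seconds = timeit.timeit(lambda: my_arr * 2, number=10)
list_seconds = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print(f"NumPy array : {numpy_seconds:.4f} s")
print(f"Python list : {list_seconds:.4f} s")

# Both approaches compute the same values.
same_values = np.array_equal(my_arr * 2, [x * 2 for x in my_list])
print("Same result:", same_values)
```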

Difference Between List and Array:

List | Array
A list can have elements of different data types, for example [1, 3.4, 'hello', 'a@']. | All elements of an array are of the same data type; for example, an array of floats may be [1.2, 5.4, 2.7].
Lists do not support element-wise operations such as addition and multiplication, because the elements may not be of the same type. | Arrays support element-wise operations. For example, if A1 is an array, A1/3 divides each element of the array by 3.
Because a list can contain objects of different data types, Python must store the type information for every element along with its value. Lists therefore take more space in memory and are less efficient. | A NumPy array takes up less space in memory than a list because it does not need to store the data type of each element separately.
The list is a part of core Python. | The array (ndarray) is a part of the NumPy library.


THE NUMPY NDARRAY: A MULTIDIMENSIONAL ARRAY OBJECT

One of the key features of NumPy is its N-dimensional array object, or
ndarray, which is a fast, flexible container for large datasets in Python. Arrays
enable us to perform mathematical operations on whole blocks of data using
similar syntax to the equivalent operations between scalar elements.
import numpy as np
# Generate some random data
data = np.random.randn(2, 3)
print(data)

data = data * 10
print(data)

data = data + data
print(data)

output (the values are random, so they differ on each run):
[[ 1.02769038 -0.45400781 -1.09134785]
 [-0.74483404 -0.89984109 -0.04883344]]

[[ 10.27690378  -4.54007811 -10.91347855]
 [ -7.44834037  -8.99841092  -0.48833438]]

[[ 20.55380756  -9.08015622 -21.82695709]
 [-14.89668073 -17.99682185  -0.97666877]]


In the first example, all of the elements have been multiplied by 10. In the
second, the corresponding values in each “cell” in the array have been added to
each other.
An ndarray is a generic multidimensional container for homogeneous data;
that is, all of the elements must be the same type.
Every array has a shape, a tuple indicating the size of each dimension, and
a dtype, an object describing the data type of the array:

print(data.shape)
(2, 3)

print(data.dtype)
float64
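As a small sketch of these attributes, the example below inspects an array with known contents and shows how mixed input is converted to a single type. The names grid and mixed are illustrative only.

```python
import numpy as np

grid = np.zeros((2, 3))      # 2 rows, 3 columns of float64 zeros

print(grid.shape)      # (2, 3)
print(grid.ndim)       # 2
print(grid.size)       # 6
print(grid.dtype)      # float64
print(grid.itemsize)   # 8 bytes per element
print(grid.nbytes)     # 48 bytes in total

# Homogeneous data: integers mixed with a float are all stored as float64.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)     # float64
```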

Arrays:
Example:
import numpy as np
a = np.arange(15).reshape(3, 5)
print("array is", a)
print("array shape is", a.shape)
print("array dimensions", a.ndim)
print("item size is", a.itemsize)
print("type of array", type(a))

O/P:
array is [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
array shape is (3, 5)
array dimensions 2
item size is 4
type of array <class 'numpy.ndarray'>

The item size is 4 bytes here because the default integer type on this machine
is int32. Where the default is int64, the item size is 8.

CREATING NDARRAYS
The easiest way to create an array is to use the array function. This accepts
any sequence-like object (including other arrays) and produces a new NumPy
array containing the passed data.

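Example:
import numpy as np

a = np.array([2, 3, 4])
print("array is", a)
print("data type", a.dtype)

b = np.array([1.2, 3.5, 5.1])
print("array b", b.dtype)

O/P:
array is [2 3 4]
data type int32
array b float64

The default integer dtype depends on the platform: int32 as shown here, or int64
on most Linux and macOS systems.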

The array function transforms sequences of sequences into two-dimensional arrays,
sequences of sequences of sequences into three-dimensional arrays, and so on.
Example:
import numpy as np
b = np.array([(1.5, 2, 3), (4, 5, 6)])
print("two dim array", b)

O/P:
two dim array [[1.5 2.  3. ]
 [4.  5.  6. ]]

The type of the array can also be explicitly specified at creation time:

Example:
import numpy as np
c = np.array( [ [1,2], [3,4] ], dtype=complex )
print("complex array",c)

O/P:
complex array [[1.+0.j 2.+0.j]
[3.+0.j 4.+0.j]]

To create sequences of numbers, NumPy provides a function analogous to
range that returns arrays instead of lists.
np.arange( 10, 30, 5 )
array([10, 15, 20, 25])

np.arange( 0, 2, 0.3 ) # it accepts float arguments
array([ 0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

The function zeros creates an array full of zeros, the function ones creates
an array full of ones, and the function empty creates an array whose initial
content is uninitialized: it holds whatever values happen to be in memory. By
default, the dtype of the created array is float64.
Example:
import numpy as np
a = np.zeros((3, 4))
print("array a is", a)
b = np.ones((2, 3, 4), dtype=np.int16)
print("array b is", b)
O/P:
array a is [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
array b is [[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]

Function | Description
array | Convert input data (list, tuple, array, or other sequence type) to an ndarray either by inferring a dtype or explicitly specifying a dtype; copies the input data by default
asarray | Convert input to ndarray, but do not copy if the input is already an ndarray
arange | Like the built-in range but returns an ndarray instead of a list
ones, ones_like | Produce an array of all 1s with the given shape and dtype; ones_like takes another array and produces a ones array of the same shape and dtype
zeros, zeros_like | Like ones and ones_like but producing arrays of 0s instead
empty, empty_like | Create new arrays by allocating new memory, but do not populate with any values like ones and zeros
full, full_like | Produce an array of the given shape and dtype with all values set to the indicated "fill value"; full_like takes another array and produces a filled array of the same shape and dtype
eye, identity | Create a square N × N identity matrix (1s on the diagonal and 0s elsewhere)

Table 4-1. Array creation functions
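The following sketch tries out several of the functions in Table 4-1 on a small array. The variable names (source, copied, aliased and so on) are illustrative only.

```python
import numpy as np

source = np.array([[1, 2, 3], [4, 5, 6]])

copied = np.array(source)      # array() copies its input by default
aliased = np.asarray(source)   # asarray() returns the same ndarray, no copy

zeros = np.zeros_like(source)  # same shape and dtype as source, filled with 0
sevens = np.full((2, 2), 7)    # 2 x 2 array filled with the value 7
identity = np.eye(3)           # 3 x 3 identity matrix (float64)

print(copied is source)    # False
print(aliased is source)   # True
print(zeros)
print(sevens)
print(identity)
```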

Data Types for ndarrays:


NumPy supports a much greater variety of numerical types than Python
does. The primitive types supported are tied closely to those in C.
By default, Python has these data types:
• strings - used to represent text data; the text is given in quote marks,
e.g. "ABCD"
• integer - used to represent integer numbers, e.g. -1, -2, -3
• float - used to represent real numbers, e.g. 1.2, 42.42
• boolean - used to represent True or False
• complex - used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j
NumPy has some extra data types and refers to data types with one character,
like i for integers, u for unsigned integers, etc.
Below is a list of the kinds of data types in NumPy and the characters used to
represent them.
• i - integer
• b - boolean
• u - unsigned integer
• f - float
• c - complex float
• m - timedelta
• M - datetime
• O - object
• S - string (bytes)
• U - unicode string
• V - fixed chunk of memory for other type (void)

Type | Type code | Description
int8, uint8 | i1, u1 | Signed and unsigned 8-bit (1 byte) integer types
int16, uint16 | i2, u2 | Signed and unsigned 16-bit integer types
int32, uint32 | i4, u4 | Signed and unsigned 32-bit integer types
int64, uint64 | i8, u8 | Signed and unsigned 64-bit integer types
float16 | f2 | Half-precision floating point
float32 | f4 or f | Standard single-precision floating point; compatible with C float
float64 | f8 or d | Standard double-precision floating point; compatible with C double and Python float object
float128 | f16 or g | Extended-precision floating point
complex64, complex128, complex256 | c8, c16, c32 | Complex numbers represented by two 32, 64, or 128 floats, respectively
bool | ? | Boolean type storing True and False values
object | O | Python object type; a value can be any Python object
string_ | S | Fixed-length ASCII string type (1 byte per character); for example, to create a string dtype with length 10, use 'S10'
unicode_ | U | Fixed-length Unicode type (number of bytes platform specific); same specification semantics as string_ (e.g., 'U10')

Note: in NumPy 2.0 and later, string_ and unicode_ are named bytes_ and str_.
float128 and complex256 are available only on some platforms.

Table 4-2. NumPy data types
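As a brief sketch of how the type codes in Table 4-2 relate to the named types, the example below builds dtype objects from code strings and inspects them. The list of codes is an arbitrary selection made for illustration.

```python
import numpy as np

# A type code string and the corresponding NumPy type describe the same dtype.
print(np.dtype("i4") == np.int32)     # True
print(np.dtype("f8") == np.float64)   # True
print(np.dtype("?") == np.bool_)      # True

# itemsize is the number of bytes used by one element.
for code in ["i1", "u2", "f4", "f8", "c16", "S10"]:
    dt = np.dtype(code)
    print(code, dt.name, dt.itemsize)

# The dtype limits the range of values an integer array can hold.
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)   # -128 127
```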


Checking the Data Type of an Array:


The NumPy array object has a property called dtype that returns the data
type of the array:
Example 1:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

O/P:
int64

Example 2:
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
O/P:
<U6
Creating Arrays With a Defined Data Type:
We use the array() function to create arrays. This function can take an
optional argument, dtype, that allows us to define the expected data type of the
array elements:
Example:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='S')
print("array is", arr)
print("array type is", arr.dtype)
O/P:
array is [b'1' b'2' b'3' b'4']
array type is |S1
We can explicitly convert or cast an array from one dtype to another
using ndarray's astype method:

import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr.dtype)
#Output: int64

float_arr = arr.astype(np.float64)
print(float_arr.dtype)
#Output: float64

In this example, integers were cast to floating point. If we cast some
floating-point numbers to be of integer dtype, the decimal part will be truncated:

import numpy as np
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
print(arr)
#Output: [ 3.7 -1.2 -2.6  0.5 12.9 10.1]
print(arr.astype(np.int32))
#Output: [ 3 -1 -2  0 12 10]

If we have an array of strings representing numbers, we can use astype to
convert them to numeric form:

import numpy as np
# np.bytes_ is the fixed-length byte string type (named np.string_ before NumPy 2.0)
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
print(numeric_strings.astype(float))
#Output: [ 1.25 -9.6  42.  ]
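A few further properties of astype are worth trying out. The sketch below, with illustrative variable names, shows that astype can reuse another array's dtype, that it returns a new array rather than modifying the original, and that a cast fails when a string cannot be parsed as a number.

```python
import numpy as np

int_array = np.arange(5)
calibers = np.array([0.22, 0.270, 0.357], dtype=np.float64)

# astype can take another array's dtype attribute instead of a type name.
as_float = int_array.astype(calibers.dtype)
print(as_float.dtype)                  # float64

# astype returns a new array; the original is left unchanged.
as_float[0] = 99.0
print(int_array)                       # [0 1 2 3 4]

# Casting raises ValueError if a string cannot be parsed as a number.
try:
    np.array(["1.5", "abc"]).astype(float)
    cast_failed = False
except ValueError:
    cast_failed = True
print("cast failed:", cast_failed)     # cast failed: True
```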

Arithmetic with NumPy Arrays:


NumPy is an open-source Python library that provides a wide range of
arithmetic operations such as addition, subtraction, multiplication, and
division, which can be performed on NumPy arrays. Each operation is applied
element by element.
Arithmetic between two arrays is possible only if they have the same shape or
if their shapes satisfy the rules of broadcasting.
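To make the broadcasting rule concrete, here is a minimal sketch with arrays of different but compatible shapes, followed by one incompatible case. The names matrix, row and column are illustrative only.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

print(matrix + 1)        # the scalar is applied to every element
print(matrix + row)      # row is added to each row of matrix
print(matrix + column)   # column is added to each column of matrix

# Shapes (2, 3) and (2,) are not compatible, so this raises ValueError.
try:
    matrix + np.array([1, 2])
    incompatible = False
except ValueError:
    incompatible = True
print("incompatible shapes:", incompatible)
```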
Addition
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing addition using arithmetic operator
add_ans = a+b
print(add_ans)

output:
[ 7 77 23 130]

Subtraction
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing subtraction using arithmetic operator
sub_ans = a-b
print(sub_ans)

output:
[ 3 67  3 70]

Multiplication
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing multiplication using arithmetic operator
mul_ans = a*b
print(mul_ans)

output:
[  10  360  130 3000]

Division
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing division using arithmetic operator
div_ans = a/b
print(div_ans)

output:
[ 2.5        14.4         1.3         3.33333333]

Comparisons between arrays of the same shape yield Boolean arrays, and
operations with a scalar apply the scalar to every element:
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Element-wise comparison, division by a scalar, and squaring
print(a>b)
print(a/2)
print(a**2)

output:
[ True  True  True  True]
[ 2.5 36.   6.5 50. ]
[   25  5184   169 10000]

Basic Indexing and Slicing:


Indexing and slicing apply to ordered, sequence-like data. In a sequence
type, the order in which elements are inserted is maintained, and this allows
us to access its elements by position.
The sequence types in Python are list, tuple, string, range, bytes, and
bytearray, and indexing and slicing apply to all of these types. NumPy arrays
are indexed and sliced in the same way.
NumPy indexing is used for accessing an element of an array by giving its
index value, which starts from 0.
Slicing a NumPy array means extracting the elements in a specific range,
just as slicing a string, tuple, or list obtains a substring, subtuple, or
sublist.
There are two types of indexing: basic and advanced. Advanced indexing is
further divided into Boolean and purely integer indexing. Negative index
values count from the end of the array.
Indexing an Array
Indexing is used to access individual elements. It is also possible to extract
entire rows, columns, or planes from multi-dimensional arrays with numpy
indexing. Indexing starts from 0. Let's see an array example below to understand
the concept of indexing:

Element of array | 2 | 3 | 11 | 9 | 6 | 4 | 10 | 12
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7

Indexing in one dimension:
One-dimensional arrays behave much like Python lists: an integer index selects
one element and a slice selects a range. NumPy arrays can also be indexed with
arrays or with any other sequence such as a list; this is called indexing
using index arrays.
Example:
import numpy as np
arr = np.arange(10)
print(arr)
#Output: [0 1 2 3 4 5 6 7 8 9]

print(arr[5])
#Output: 5

print(arr[5:8])
#Output: [5 6 7]

arr[5:8] = 12
print(arr)
#Output: [ 0  1  2  3  4 12 12 12  8  9]

As you can see, if you assign a scalar value to a slice, as in arr[5:8] = 12, the
value is propagated (or broadcast henceforth) to the entire selection. An
important first distinction from Python's built-in lists is that array slices
are views on the original array. This means that the data is not copied, and
any modifications to the view will be reflected in the source array.

To give an example of this, I first create a slice of arr:

arr_slice = arr[5:8]
print(arr_slice)

Output:
[12 12 12]

Now, when I change values in arr_slice, the mutations are reflected in the
original array arr:

arr_slice[1] = 12345
print(arr)

Output:
[    0     1     2     3     4    12 12345    12     8     9]

The "bare" slice [:] will assign to all values in an array:

arr_slice[:] = 64
print(arr)

Output:
[ 0  1  2  3  4 64 64 64  8  9]
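When an independent copy of a slice is wanted instead of a view, the slice can be copied explicitly with the copy method. The sketch below contrasts the two; the names view and copy are illustrative, and np.shares_memory is used only to confirm which array shares data with the original.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]          # basic slice: a view on arr's data
copy = arr[5:8].copy()   # explicit copy: independent data

view[:] = 64             # changes arr as well
copy[:] = -1             # leaves arr untouched

print(arr)                            # [ 0  1  2  3  4 64 64 64  8  9]
print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, copy))    # False
```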

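Indexing in 2 Dimensions
Example:
import numpy as np
arr = np.arange(12)
arr1 = arr.reshape(3, 4)
print("Array arr1:\n", arr1)
print("Element at row 0, column 0 of arr1 is:", arr1[0, 0])
print("Element at row 1, column 2 of arr1 is:", arr1[1, 2])
O/P:
Array arr1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Element at row 0, column 0 of arr1 is: 0
Element at row 1, column 2 of arr1 is: 6

Picking a Row or Column in 2-D NumPy Array
A single index selects a whole row; a colon in the row position together
with a column index selects a whole column.
import numpy as np
arr = np.arange(12)
arr1 = arr.reshape(3, 4)
print("Array arr1:\n", arr1)
print("\n")
print("Row at index 1:\n", arr1[1])
print("Column at index 1:\n", arr1[:, 1])

O/P:
Array arr1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


Row at index 1:
 [4 5 6 7]
Column at index 1:
 [1 5 9]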

Consider a two-dimensional array, arr2d. Slicing this array is
a bit different:

import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[:2])

Output:
[[1 2 3]
 [4 5 6]]

As you can see, it has sliced along axis 0, the first axis. A slice, therefore,
selects a range of elements along an axis. It can be helpful to read the
expression arr2d[:2] as “select the first two rows of arr2d.”

You can pass multiple slices just like you can pass multiple indexes:

import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[:2, 1:])

Output:
[[2 3]
 [5 6]]

When slicing like this, you always obtain array views of the same number of
dimensions. By mixing integer indexes and slices, you get lower dimensional
slices.

For example, I can select the second row but only the first two columns like
so:

import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[1, :2])

Output:
[4 5]

Note that a colon by itself means to take the entire axis, so you can slice
only higher dimensional axes by doing:

import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[:, :1])

Output:
[[1]
 [4]
 [7]]

Of course, assigning to a slice expression assigns to the whole selection:

import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[:2, 1:] = 0
print(arr2d)

Output:
[[1 0 0]
 [4 0 0]
 [7 8 9]]

Indexing in 3 Dimensions
A 3-D array has three dimensions, so an element is addressed by three
indexes (i, j, k), where i is the index along the 1st dimension, j along the
2nd dimension, and k along the 3rd dimension.
Example:
import numpy as np
arr = np.arange(12)
arr1 = arr.reshape(2, 2, 3)
print("Array arr1:\n", arr1)
print("Element:", arr1[1, 0, 2])
O/P:
Array arr1:
 [[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]
Element: 8

Explanation: We want to access the element of the array at index (1, 0, 2).
Indexing starts from 0.
• The 1st index, 1, selects along the 1st dimension, which holds two 2-D
arrays: [[0, 1, 2], [3, 4, 5]] and [[6, 7, 8], [9, 10, 11]]. Index 1 selects
the second one: [[6, 7, 8], [9, 10, 11]].
• The 2nd index, 0, selects along the 2nd dimension, which holds two rows:
[6, 7, 8] and [9, 10, 11]. Index 0 selects the first row: [6, 7, 8].
• The 3rd index, 2, selects along the 3rd dimension, which holds three
values: 6, 7, 8. Index 2 selects the last value, so the output is 8.

Basic Slicing and indexing


Using basic indexing and slicing we can access a particular element or group
of elements from an array.
Basic indexing and slicing return views of the original array.
Basic slicing occurs when the index in arr[index] is:
• a slice object (constructed by start:stop:step)
• an integer
• or a tuple of slice objects and integers
Example:
import numpy as np
arr = np.arange(12)
print(arr)
print("Element at index 6 of an array arr:", arr[6])
print("Elements from index 3 to 7 of an array arr:", arr[3:8])
O/P:
[ 0  1  2  3  4  5  6  7  8  9 10 11]
Element at index 6 of an array arr: 6
Elements from index 3 to 7 of an array arr: [3 4 5 6 7]

Slicing a 2D Array
In a 2-D array, we have to specify start:stop twice: once for the rows and
once for the columns.
Example:
import numpy as np
arr = np.arange(12)
arr1 = arr.reshape(3, 4)
print("Array arr1:\n", arr1)
print("\n")
print("rows from index 1 onward, columns at index 1 to 3:\n", arr1[1:, 1:4])
O/P:
Array arr1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


rows from index 1 onward, columns at index 1 to 3:
 [[ 5  6  7]
 [ 9 10 11]]

The first slice, 1:, selects the rows: slicing starts at row index 1 and
runs to the last row because no stop index is given. The second slice, 1:4,
selects the columns at indexes 1, 2, and 3. The stop index 4 is not included.

Negative Slicing and Indexing


Negative indexes count from the end of the array: the last element has
index -1, the one before it has index -2, and so on. Slicing works with
negative indexes in the same way.
Example:
import numpy as np
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
print("Element at index -7 (same as index 2) of an array arr:", arr[-7])
print("Elements from index -8 up to, but not including, index -3 of an array arr:", arr[-8:-3])
O/P:
Element at index -7 (same as index 2) of an array arr: 30
Elements from index -8 up to, but not including, index -3 of an array arr: [20 30 40 50 60]

Boolean Indexing:
Boolean indexing occurs when the index object is a Boolean array, i.e., an
array of True and False values, usually produced by a condition on the array.
• The elements for which the Boolean expression is True are returned.
• This is used to filter the values of the desired elements.
Example:
import numpy as np
arr = np.array([11, 6, 41, 10, 29, 50, 55, 45])
print(arr[arr > 35])
O/P:
[41 50 55 45]

Elements that satisfy the given condition, i.e., greater than 35, are printed
as output.

Boolean arrays can also select rows of a two-dimensional array. Suppose each
name in names corresponds to a row of data. Entering the following expressions
one at a time in an interactive session (such as IPython) displays the results
shown below; the values in data are random and differ on each run.

import numpy as np
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)

names
data
names == 'Bob'
data[names == 'Bob']
data[names == 'Bob', 2:]
data[names == 'Bob', 3]

Output:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')
array([[ 0.0929,  0.2817,  0.769 ,  1.2464],
       [ 1.0072, -1.2962,  0.275 ,  0.2289],
       [ 1.3529,  0.8864, -2.0016, -0.3718],
       [ 1.669 , -0.4386, -0.5397,  0.477 ],
       [ 3.2489, -1.0212, -0.5771,  0.1241],
       [ 0.3026,  0.5238,  0.0009,  1.3438],
       [-0.7135, -0.8312, -2.3702, -1.8608]])
array([ True, False, False,  True, False, False, False])
array([[ 0.0929,  0.2817,  0.769 ,  1.2464],
       [ 1.669 , -0.4386, -0.5397,  0.477 ]])
array([[ 0.769 ,  1.2464],
       [-0.5397,  0.477 ]])
array([1.2464, 0.477 ])
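Conditions can also be combined or negated, and a Boolean mask can be used on the left-hand side of an assignment. The sketch below uses fixed values instead of random ones so that the result is reproducible; the variable names are illustrative only.

```python
import numpy as np

names = np.array(["Bob", "Joe", "Will", "Bob", "Will", "Joe", "Joe"])
data = np.arange(28).reshape(7, 4)     # fixed values instead of random ones

# Combine conditions with & (and) and | (or); negate with ~.
# The Python keywords `and` and `or` do not work on Boolean arrays.
bob_or_will = (names == "Bob") | (names == "Will")
not_bob = ~(names == "Bob")

selected = data[bob_or_will]   # rows 0, 2, 3, 4 (a copy, not a view)
others = data[not_bob]         # rows 1, 2, 4, 5, 6

print(selected)
print(others)

# A Boolean mask on the left-hand side assigns to the selected elements.
data[data > 20] = 0
print(data)
```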

Transposing Arrays and Swapping Axes:


The numpy.transpose() function is frequently used in matrix computations.
This function permutes or reverses the dimensions of the given array and
returns the modified array.
For a two-dimensional array, numpy.transpose() changes the row elements into
column elements and the column elements into row elements.
Syntax
numpy.transpose(arr, axes=None)
Transposing is a special form of reshaping that similarly returns a view on
the underlying data without copying anything. Arrays have the transpose
method and also the special T attribute:


Example:
import numpy as np
a = np.arange(6).reshape((2, 3))
print("array", a)
b = np.transpose(a)
print("transpose array", b)
O/P:
array [[0 1 2]
 [3 4 5]]
transpose array [[0 3]
 [1 4]
 [2 5]]

numpy.transpose() with axes
For higher dimensional arrays, transpose will accept a tuple of axis numbers to
permute the axes (for extra mind bending). Here the first two axes are
exchanged and the last axis is left unchanged:
import numpy as np
a = np.arange(16).reshape(2, 2, 4)
print("array is\n", a)
print("\n")
b = a.transpose(1, 0, 2)
print("transpose with axes\n", b)

O/P:
array is
 [[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[ 8  9 10 11]
  [12 13 14 15]]]


transpose with axes
 [[[ 0  1  2  3]
  [ 8  9 10 11]]

 [[ 4  5  6  7]
  [12 13 14 15]]]

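Simple transposing with .T is just a special case of swapping axes. ndarray
has the method swapaxes, which takes a pair of axis numbers:
import numpy as np
a = np.arange(16).reshape(2, 2, 4)
print("array is\n", a)
print("\n")
b = a.swapaxes(1, 2)
print("transpose with swapaxes\n", b)
O/P:
array is
 [[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[ 8  9 10 11]
  [12 13 14 15]]]


transpose with swapaxes
 [[[ 0  4]
  [ 1  5]
  [ 2  6]
  [ 3  7]]

 [[ 8 12]
  [ 9 13]
  [10 14]
  [11 15]]]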
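The sketch below checks a few of the statements above on a small array: that .T and np.transpose agree, that the transpose is a view on the same data, and that swapaxes(0, 1) matches .T for a 2-D array. It also shows one common use of the transpose, an inner matrix product. The variable names are illustrative only.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# .T and np.transpose give the same result for a 2-D array.
same_transpose = np.array_equal(arr.T, np.transpose(arr))
print(same_transpose)          # True

# The transpose is a view: writing through it changes the original array.
transposed = arr.T
transposed[0, 1] = 99          # the same element as arr[1, 0]
print(arr[1, 0])               # 99

# A common use: the inner matrix product of arr with itself.
product = np.dot(arr.T, arr)
print(product.shape)           # (3, 3)

# swapaxes(0, 1) is equivalent to .T for a 2-D array.
same_swap = np.array_equal(arr.swapaxes(0, 1), arr.T)
print(same_swap)               # True
```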

Universal Functions: Fast Element-Wise Array Functions

A universal function, or ufunc, is a function that performs element-wise
operations on data in ndarrays. You can think of them as fast vectorized wrappers
for simple functions that take one or more scalar values and produce one or more
scalar results.
There are two types of universal functions:
1. Unary ufuncs, such as sqrt or exp, which take one array and return a single
array as the result.
2. Binary ufuncs, such as add or maximum, which take two arrays and return a
single array as the result.


Unary ufuncs:

Function | Description
abs, fabs | Compute the absolute value element-wise for integer, floating-point, or complex values
sqrt | Compute the square root of each element (equivalent to arr ** 0.5)
square | Compute the square of each element (equivalent to arr ** 2)
exp | Compute the exponent e^x of each element
log, log10, log2, log1p | Natural logarithm (base e), log base 10, log base 2, and log(1 + x), respectively
sign | Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative)
ceil | Compute the ceiling of each element (i.e., the smallest integer greater than or equal to that number)
floor | Compute the floor of each element (i.e., the largest integer less than or equal to each element)
rint | Round elements to the nearest integer, preserving the dtype
modf | Return fractional and integral parts of array as separate arrays
isnan | Return boolean array indicating whether each value is NaN (Not a Number)
isfinite, isinf | Return boolean array indicating whether each element is finite (non-inf, non-NaN) or infinite, respectively
cos, cosh, sin, sinh, tan, tanh | Regular and hyperbolic trigonometric functions
arccos, arccosh, arcsin, arcsinh, arctan, arctanh | Inverse trigonometric functions
logical_not | Compute truth value of not x element-wise (equivalent to ~arr)

import numpy as np
arr = np.arange(10)
print("array is\n",arr)
print("\n")
a=np.sqrt(arr)
print("square root\n",a)
print("\n")
b=np.exp(arr)
print("exponent is\n",b)
O/p:


Binary ufuncs:

Function | Description
add | Add corresponding elements in arrays
subtract | Subtract elements in second array from first array
multiply | Multiply array elements
divide, floor_divide | Divide or floor divide (truncating the remainder)
power | Raise elements in first array to powers indicated in second array
maximum, fmax | Element-wise maximum; fmax ignores NaN
minimum, fmin | Element-wise minimum; fmin ignores NaN
mod | Element-wise modulus (remainder of division)
copysign | Copy sign of values in second argument to values in first argument
greater, greater_equal, less, less_equal, equal, not_equal | Perform element-wise comparison, yielding boolean array (equivalent to infix operators >, >=, <, <=, ==, !=)
logical_and, logical_or, logical_xor | Compute element-wise truth value of logical operation (equivalent to infix operators &, |, ^)

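import numpy as np
arr = np.arange(5)
arr1 = np.arange(5, 10)
print("arr", arr)
print("\n")
print("arr1", arr1)
print("add is\n", np.add(arr, arr1))
print("div is\n", np.divide(arr, arr1))
O/P:
arr [0 1 2 3 4]


arr1 [5 6 7 8 9]
add is
 [ 5  7  9 11 13]
div is
 [0.         0.16666667 0.28571429 0.375      0.44444444]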
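A few details from the ufunc tables can be tried out directly. The sketch below, with illustrative variable names, compares maximum and fmax on data containing NaN, shows modf returning two arrays, and uses the optional out argument to store a result in an existing array.

```python
import numpy as np

x = np.array([1.5, -2.25, 3.0, np.nan])
y = np.array([2.0, -3.0, 1.0, 4.0])

# Binary ufuncs produce one result element per pair of input elements.
with_nan = np.maximum(x, y)    # NaN propagates into the result
without_nan = np.fmax(x, y)    # NaN is ignored where the other value is valid
print(with_nan)
print(without_nan)

# A unary ufunc can return more than one array: modf splits each value into
# its fractional and integral parts.
fractional, integral = np.modf(np.array([1.5, -2.25, 3.0]))
print(fractional)
print(integral)

# The optional out argument writes the result into an existing array.
values = np.array([1.0, 4.0, 9.0])
np.sqrt(values, out=values)
print(values)
```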

Mathematical and Statistical Methods:


Math Methods
NumPy contains a large number of various mathematical operations.
NumPy provides standard trigonometric functions, functions for arithmetic
operations, handling complex numbers, etc.


Trigonometric Functions
NumPy has standard trigonometric functions which return trigonometric
ratios for a given angle in radians.

FUNCTION | DESCRIPTION
sin() | Compute sine element-wise.
cos() | Compute cosine element-wise.
tan() | Compute tangent element-wise.
degrees() | Convert angles from radians to degrees.
rad2deg() | Convert angles from radians to degrees.
deg2rad() | Convert angles from degrees to radians.
radians() | Convert angles from degrees to radians.

numpy.sin(x): computes the trigonometric sine of each element of the array x.
numpy.cos(x): computes the trigonometric cosine of each element of the array x.
numpy.tan(x): computes the trigonometric tangent of each element of the array x.

Example
import numpy as np
a = np.array([0, 30, 45, 60, 90])
print('Sine of different angles:')
# Convert to radians by multiplying with pi/180
print(np.sin(a*np.pi/180))
print('\n')

print('Cosine values for angles in array:')
print(np.cos(a*np.pi/180))
print('\n')

print('Tangent values for given angles:')
print(np.tan(a*np.pi/180))

Here is its output:
Sine of different angles:
[0.         0.5        0.70710678 0.8660254  1.        ]

Cosine values for angles in array:
[1.00000000e+00 8.66025404e-01 7.07106781e-01 5.00000000e-01
 6.12323400e-17]

Tangent values for given angles:
[0.00000000e+00 5.77350269e-01 1.00000000e+00 1.73205081e+00
 1.63312394e+16]

Because pi/2 cannot be represented exactly in floating point, the cosine of 90
degrees comes out as a tiny number rather than exactly 0, and the tangent as a
very large number rather than infinity.
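Instead of multiplying by pi/180 by hand, the conversion functions listed in the table can be used. A brief sketch, with illustrative variable names:

```python
import numpy as np

angles_deg = np.array([0, 30, 45, 60, 90])

# np.deg2rad and np.radians perform the same conversion.
angles_rad = np.deg2rad(angles_deg)
matches_manual = np.allclose(angles_rad, angles_deg * np.pi / 180)
matches_radians = np.allclose(np.radians(angles_deg), angles_rad)
print(matches_manual, matches_radians)   # True True

# np.rad2deg (or np.degrees) converts back to degrees.
round_trip = np.rad2deg(angles_rad)
print(round_trip)

# The inverse trigonometric functions return angles in radians.
angle = np.rad2deg(np.arcsin(0.5))
print(angle)                             # approximately 30
```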

NumPy also provides functions for rounding:

FUNCTION | DESCRIPTION
rint() | Round to the nearest integer.
fix() | Round to the nearest integer towards zero.
floor() | Return the floor of the input, element-wise.
ceil() | Return the ceiling of the input, element-wise.
trunc() | Return the truncated value of the input, element-wise.

numpy.round(arr, decimals=0, out=None): this mathematical function rounds
an array to the given number of decimals. Values exactly halfway between two
rounded values are rounded to the nearest even value. (Older NumPy versions
also offered the alias numpy.round_, which was removed in NumPy 2.0.)

# Python program explaining
# the round() function
import numpy as np

in_array = [.5, 1.5, 2.5, 3.5, 4.5, 10.1]
print("Input array : \n", in_array)

round_off_values = np.round(in_array)
print("\nRounded values : \n", round_off_values)

in_array = [.53, 1.54, .71]
print("\nInput array : \n", in_array)

round_off_values = np.round(in_array)
print("\nRounded values : \n", round_off_values)

in_array = [.5538, 1.33354, .71445]
print("\nInput array : \n", in_array)

round_off_values = np.round(in_array, decimals=3)
print("\nRounded values : \n", round_off_values)

Output :
Input array :
[0.5, 1.5, 2.5, 3.5, 4.5, 10.1]

Rounded values :
[ 0. 2. 2. 4. 4. 10.]

Input array :
[0.53, 1.54, 0.71]

Rounded values :
[ 1. 2. 1.]

Input array :
[0.5538, 1.33354, 0.71445]

Rounded values :
[ 0.554 1.334 0.714]

Statistics Methods
NumPy provides various statistical functions which are used to perform
statistical data analysis:
• Mean
• Median
• Range (peak to peak)
• Standard Deviation
• Variance

np.mean
Computes the arithmetic mean of an array along a specified axis. By default,
the mean is taken over the flattened array. For example:

import numpy as np
array_for_mean = np.array([[1, 2], [3, 4]])
m = np.mean(array_for_mean)  # default: mean of all elements
print(array_for_mean)
print("mean", m)

y = np.mean(array_for_mean, axis=1)  # mean of each row
print("mean axis", y)

output:
[[1 2]
[3 4]]

mean 2.5
mean axis [1.5 3.5]

np.median
Computes the median of the array along a specified axis. The median is the
middle value in a sorted list of numbers.
import numpy as np
array_for_median = np.array([[10, 7, 4], [3, 2, 1]])
x = np.median(array_for_median)  # default: median of all elements
print(array_for_median)
print("Median", x)
output:
[[10  7  4]
 [ 3  2  1]]
Median 3.5

Ordering the elements of the array in ascending order we get 1, 2, 3, 4, 7,
10. The list has an even number of elements (6), so there is no single middle
element; the median is the average of the 3rd and 4th elements:
(3+4)/2 = 7/2 = 3.5

np.ptp
Measures the range ("peak to peak") of an array along a specified axis. The
range is the difference between the maximum and minimum values in the array.
import numpy as np
array_for_range = np.array([[-85, 60, 94, 53],
                            [3, -12, 54, 14],
                            [32, 45, -66, 36]])

x = np.ptp(array_for_range)
print(x)
output:
179
The maximum value is 94 while the minimum value is -85. The range is:
94 - (-85) = 94 + 85 = 179

np.std
The standard deviation measures the spread of the values around their mean.
np.std is the NumPy function which computes the standard deviation of an
array along a specified axis.
import numpy as np
array_for_stddev = np.array([[7, 8, 9], [10, 11, 12]])

x = np.std(array_for_stddev)
print(array_for_stddev)
print("standard deviation", x)

output:
[[ 7  8  9]
 [10 11 12]]
standard deviation 1.707825127659933

np.var
np.var is the NumPy function which measures the variance of an array along
a specified axis. Variance is the average of the squared deviations from the
mean of all observed values.
import numpy as np
array_for_variance = np.array([[1, 2, 3], [6, 7, 8]])

x = np.var(array_for_variance)  # default: variance of all elements
print(array_for_variance)
print("variance", x)

output:
[[1 2 3]
 [6 7 8]]
variance 6.916666666666667
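The statistical functions above accept an axis argument, and std and var also accept ddof to change the denominator. The sketch below illustrates both on the array from the variance example; the name scores is chosen here for illustration.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [6, 7, 8]])

# axis=0 aggregates down the rows (one result per column);
# axis=1 aggregates across the columns (one result per row).
print(np.mean(scores, axis=0))     # [3.5 4.5 5.5]
print(np.mean(scores, axis=1))     # [2. 7.]
print(np.ptp(scores, axis=1))      # [2 2]

# By default var and std divide by n (population statistics).
# ddof=1 divides by n - 1 instead (sample statistics).
population_var = np.var(scores)          # 41.5 / 6
sample_var = np.var(scores, ddof=1)      # 41.5 / 5
print(population_var, sample_var)

# The standard deviation is the square root of the variance.
std_matches = np.isclose(np.std(scores), np.sqrt(population_var))
print(std_matches)                       # True
```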

Method | Description
sum | Sum of all the elements in the array or along an axis; zero-length arrays have sum 0
mean | Arithmetic mean; zero-length arrays have NaN mean
std, var | Standard deviation and variance, respectively, with optional degrees of freedom adjustment (default denominator n)
min, max | Minimum and maximum
argmin, argmax | Indices of minimum and maximum elements, respectively
cumsum | Cumulative sum of elements starting from 0
cumprod | Cumulative product of elements starting from 1

Table 4-5. Basic array statistical methods
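The methods in Table 4-5 can be called directly on an array. A short sketch on a 3 x 3 array (the name arr is illustrative):

```python
import numpy as np

arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])

print(arr.sum())             # 36
print(arr.sum(axis=0))       # column sums: [ 9 12 15]
print(arr.min(), arr.max())  # 0 8
print(arr.argmax())          # 8, the index into the flattened array

# cumsum and cumprod return an array of the same size with running results.
running_sum = arr.cumsum(axis=0)       # running sum down each column
running_product = arr.cumprod(axis=1)  # running product along each row
print(running_sum)
print(running_product)
```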

Sorting:
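Like Python's built-in list type, NumPy arrays can be sorted in place using the
sort method. For comparison, the list method has the following syntax:
Syntax
list.sort(reverse=True|False, key=myFunc)

Parameter | Description
reverse | Optional. reverse=True will sort the list descending. Default is reverse=False
key | Optional. A function to specify the sorting criteria

The sort method of a NumPy array does not take reverse or key. It accepts an
optional axis argument instead, as in arr.sort(axis=-1), and always sorts in
ascending order. The function np.sort(arr) returns a sorted copy and leaves
the original array unchanged.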

Example:
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
print("\n")
arr1 = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr1))
print("\n")
arr2 = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr2))
O/P:
[0 1 2 3]


['apple' 'banana' 'cherry']


[[2 3 4]
 [0 1 5]]
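The sketch below contrasts the sorted copy returned by np.sort with the in-place sort method, and shows one way to obtain descending order and to sort along a chosen axis. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)        # new array; arr is unchanged at this point
descending = np.sort(arr)[::-1]   # reverse an ascending sort
order = np.argsort(arr)           # indices that would sort arr

arr.sort()                        # sorts in place and returns None

table = np.array([[3, 2, 4], [5, 0, 1]])
by_column = np.sort(table, axis=0)   # sort the values within each column
by_row = np.sort(table, axis=1)      # sort the values within each row

print(sorted_copy, descending, order, arr)
print(by_column)
print(by_row)
```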

Unique and Other Set Logic


NumPy has some basic set operations for one-dimensional ndarrays. Probably the
most commonly used one is np.unique, which returns the sorted unique values in
an array.
Example:
import numpy as np
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
print(np.unique(names))
ints = np.array([3, 3, 3, 2, 2, 1, 1, 4, 4])
print(np.unique(ints))
O/P:
['Bob' 'Joe' 'Will']
[1 2 3 4]

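Another function, np.in1d, tests membership of the values in one array in
another, returning a boolean array. In recent NumPy versions np.in1d is
deprecated in favor of np.isin, which takes the same arguments here.
Example:

import numpy as np
values = np.array([6, 0, 0, 3, 2, 5, 6])
print(np.in1d(values, [2, 3, 6]))
O/P:
[ True False False  True  True False  True]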
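NumPy offers several other set functions for one-dimensional arrays besides unique. The sketch below runs a few of them on two small arrays; the names x and y and the result names are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 4, 5])
y = np.array([3, 4, 5, 6, 7])

common = np.intersect1d(x, y)    # sorted values present in both arrays
either = np.union1d(x, y)        # sorted values present in x or y
only_x = np.setdiff1d(x, y)      # values in x that are not in y
one_side = np.setxor1d(x, y)     # values in exactly one of the arrays
membership = np.isin(x, y)       # Boolean array, one entry per element of x

print(common)
print(either)
print(only_x)
print(one_side)
print(membership)
```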

Array set operations
