arshdeep_numpy
Definition          Array slicing allows you to extract specific parts of an array. It works
                    similarly to list slicing in Python.
Example             arr = np.array([0, 1, 2, 3, 4, 5])
Slicing syntax      arr[start:stop:step]
Basic slicing       slice_1 = arr[1:4]    # [1, 2, 3]
                    slice_2 = arr[:3]     # [0, 1, 2]
                    slice_3 = arr[3:]     # [3, 4, 5]
Negative indexing   slice_4 = arr[-3:]    # [3, 4, 5]
                    slice_5 = arr[:-2]    # [0, 1, 2, 3]
Step slicing        slice_6 = arr[::2]    # [0, 2, 4]
                    slice_7 = arr[1::2]   # [1, 3, 5]
Reverse array       slice_8 = arr[::-1]   # [5, 4, 3, 2, 1, 0]
Slicing 2D arrays   arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
                    slice_9 = arr_2d[:2, 1:]   # [[2, 3], [5, 6]]

Performance Tips and Tricks
Vectorization             Utilize NumPy's built-in vectorized operations whenever possible. These
                          operations are optimized and significantly faster than equivalent scalar
                          operations.
Avoiding Loops            Minimize the use of Python loops when working with NumPy arrays. Instead,
                          try to express operations as array operations. Loops in Python can be slow
                          compared to vectorized operations.
Use Broadcasting          Take advantage of NumPy's broadcasting rules to perform operations on
                          arrays of different shapes efficiently. Broadcasting allows NumPy to work
                          with arrays of different sizes without making unnecessary copies of data.
Avoid Copies              Be mindful of unnecessary array copies, especially when working with large
                          datasets. NumPy arrays share memory when possible, but certain operations
                          may create copies, which can impact performance and memory usage.
Use In-Place Operations   Whenever feasible, use in-place operations (+=, *=, etc.) to modify arrays
                          without creating new ones. This reduces memory overhead and can improve
                          performance.
Memory Layout             Understand how memory layout affects performance, especially for large
                          arrays. NumPy arrays can be stored in different memory orders (C-order vs.
                          Fortran-order). Choosing the appropriate memory layout can sometimes lead
                          to better performance, especially when performing operations along
                          specific axes.
Data Types                Choose appropriate data types for your arrays to minimize memory usage and
                          improve performance. Using smaller data types (e.g., np.float32 instead of
                          np.float64) can reduce memory overhead and may lead to faster
                          computations, especially on platforms with limited memory bandwidth.
NumExpr and Numba         Consider using specialized libraries like NumExpr or Numba for
                          performance-critical sections of your code. These libraries can often
                          provide significant speedups by compiling expressions or functions to
                          native machine code.
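The tips on vectorization, broadcasting, in-place operations, copies, and data types can be illustrated with a small sketch. The function and variable names below are hypothetical and chosen only for this example; it checks behavior rather than measuring speed.

```python
import numpy as np

def loop_sum_of_squares(values):
    """Sum of squares with an explicit Python loop (the slow pattern)."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def vectorized_sum_of_squares(values):
    """Same result, expressed as whole-array operations."""
    return float(np.sum(values * values))

data = np.arange(1000, dtype=np.float64)
loop_result = loop_sum_of_squares(data)
vec_result = vectorized_sum_of_squares(data)

# Broadcasting: a (3, 4) matrix plus a length-4 row, no manual tiling needed.
matrix = np.ones((3, 4))
row = np.arange(4)
broadcast_sum = matrix + row

# In-place operations reuse the existing buffer instead of allocating a new one.
buffer = np.zeros(5)
buffer += 1.0
buffer *= 2.0

# Basic slicing returns a view; integer-array ("fancy") indexing returns a copy.
base = np.arange(10)
view = base[2:6]
copy = base[[2, 3, 4, 5]]
view_shares = np.shares_memory(base, view)
copy_shares = np.shares_memory(base, copy)

# A smaller dtype halves the memory footprint here.
as64 = np.zeros(1000, dtype=np.float64)
as32 = as64.astype(np.float32)
print(as64.nbytes, as32.nbytes)
```

To compare actual run times on your own machine, wrap the two functions in `timeit`; the size of the speedup depends on array size and hardware.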
Parallelism   NumPy does not parallelize most operations itself (although routines backed by
              BLAS/LAPACK, such as matrix multiplication, may run multi-threaded). You can
              leverage multi-threading or multi-processing libraries like concurrent.futures
              or joblib to parallelize certain operations, especially when working with large
              datasets or computationally intensive tasks.
Profiling     Use profiling tools like cProfile or specialized profilers such as line_profiler
              or memory_profiler to identify performance bottlenecks in your code. Optimizing
              code based on actual profiling results can lead to more significant performance
              improvements.

Addition              array1 + array2
Subtraction           array1 - array2
Multiplication        array1 * array2
Division              array1 / array2
Floor Division        array1 // array2
Modulus               array1 % array2
Exponentiation        array1 ** array2
Absolute              np.abs(array)
Negative              -array
Sum                   np.sum(array)
Minimum               np.min(array)
Maximum               np.max(array)
Mean                  np.mean(array)
Median                np.median(array)
Standard Deviation    np.std(array)
Variance              np.var(array)
Dot Product           np.dot(array1, array2)
Cross Product         np.cross(array1, array2)

Concatenation                       array1 = np.array([[1, 2, 3], [4, 5, 6]])
                                    array2 = np.array([[7, 8, 9]])
                                    concatenated_array = np.concatenate((array1, array2), axis=0)
                                    # vertically
                                    print(concatenated_array)
numpy.concatenate()                 Concatenates arrays along a specified axis.
numpy.vstack() and numpy.hstack()   Stack arrays vertically and horizontally, respectively.
numpy.dstack()                      Stack arrays depth-wise.
Splitting                           split_arrays = np.split(concatenated_array,

Definition                              NumPy provides a wide range of mathematical functions that
                                        operate element-wise on arrays, allowing for efficient
                                        computation across large datasets.
Trigonometric Functions                 np.sin(), np.cos(), np.tan(), np.arcsin(), np.arccos(),
                                        np.arctan()
Hyperbolic Functions                    np.sinh(), np.cosh(), np.tanh(), np.arcsinh(), np.arccosh(),
                                        np.arctanh()
Exponential and Logarithmic Functions   np.exp(), np.log(), np.log2(), np.log10()
Rounding                                np.round(), np.floor(), np.ceil(), np.trunc()
Absolute Value                          np.abs()
Factorial and Combinations              NumPy has no np.factorial() or np.comb(). Use
                                        math.factorial() and math.comb() for scalars, or
                                        scipy.special.factorial() and scipy.special.comb() for
                                        arrays.

NaN Handling
Identifying NaNs                      Use the np.isnan() function to check for NaN values in an array.
Removing NaNs                         Use np.isnan() to create a boolean mask, then use boolean
                                      indexing to select non-NaN values.
Replacing NaNs                        Use np.nan_to_num() to replace NaNs with a specified value.
Interpolating NaNs                    NumPy has no dedicated NaN-interpolation function. A common
                                      approach is to call np.interp() on the positions of the
                                      non-NaN values.
Ignoring NaNs in Operations           Many NumPy functions have NaN-aware counterparts, like
                                      np.nanmean(), np.nanmedian(), np.nansum(), etc., that ignore
                                      NaNs in computations.
Handling NaNs in Aggregations         Aggregation functions (np.sum(), np.mean(), etc.) return NaN
                                      if any NaNs are present in the input array. In pandas,
                                      DataFrame aggregation methods skip NaNs through the
                                      skipna=True parameter (the default).
Dealing with NaNs in Linear Algebra   NumPy's linear algebra functions do not treat NaNs
                                      specially. Some routines (e.g. np.linalg.eig()) raise
                                      LinAlgError on non-finite input, while others (e.g.
                                      np.linalg.inv(), np.linalg.solve()) may silently propagate
                                      NaNs into the result, so check inputs with np.isnan() first.
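The NaN-handling entries can be tried together on one small array. This is a minimal sketch with made-up sample values; the interpolation step uses np.interp() over array positions, which is one common approach and an assumption of this example rather than a dedicated NumPy feature.

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

# Identifying NaNs: boolean mask, True where the value is NaN.
nan_mask = np.isnan(data)

# Removing NaNs: boolean indexing with the inverted mask.
clean = data[~nan_mask]

# Replacing NaNs with a chosen value (0.0 here).
filled = np.nan_to_num(data, nan=0.0)

# Plain aggregation propagates NaN; the NaN-aware version ignores it.
plain_mean = np.mean(data)
nan_aware_mean = np.nanmean(data)

# Interpolating NaNs linearly from the neighboring valid values.
positions = np.arange(data.size)
interpolated = data.copy()
interpolated[nan_mask] = np.interp(
    positions[nan_mask], positions[~nan_mask], data[~nan_mask]
)

print(clean, filled, nan_aware_mean, interpolated)
```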
Array Creation
numpy.array()   Create an array from a Python list or tuple.
Example         arr = np.array([1, 2, 3])
numpy.empty()   Create an uninitialized array (values are not set, might be arbitrary).
Example         empty_arr = np.empty((2, 2))

Linear Algebra
Matrix Multiplication          np.dot() or @ operator for matrix multiplication.
Transpose                      np.transpose() or .T attribute for transposing a matrix.
Inverse                        np.linalg.inv() for calculating the inverse of a matrix.
Determinant                    np.linalg.det() for computing the determinant of a matrix.
Eigenvalues and Eigenvectors   np.linalg.eig() for computing eigenvalues and eigenvectors.
Matrix Decompositions          Functions like np.linalg.qr(), np.linalg.svd(), and
                               np.linalg.cholesky() for various matrix decompositions.
Solving Linear Systems         np.linalg.solve() for solving systems of linear equations.
Vectorization                  Leveraging NumPy's broadcasting and array operations for efficient
                               linear algebra computations.

                          functions for mathematical operations, whereas Python lists are more
                          flexible but slower due to their dynamic nature.
Vectorized Operations     NumPy allows for vectorized operations, which means you can perform
                          operations on entire arrays without the need for explicit looping. This
                          leads to concise and efficient code compared to using loops with Python
                          lists.
Multidimensional Arrays   NumPy supports multidimensional arrays, whereas Python lists are limited
                          to one-dimensional sequences or nested lists, which can be less intuitive
                          for handling multi-dimensional data.
Broadcasting              NumPy arrays support broadcasting, which enables operations between
                          arrays of different shapes and sizes. In contrast, performing similar
                          operations with Python lists would require explicit looping or list
                          comprehensions.
Type Stability            NumPy arrays have a fixed data type, which leads to better performance
                          and memory efficiency. Python lists can contain elements of different
                          types, leading to potential type conversion overhead.
Rich Set of Functions     NumPy provides a wide range of mathematical and statistical functions
                          optimized for arrays, whereas Python lists require manual implementation
                          or the use of external libraries for similar functionality.
Memory Usage              NumPy arrays typically consume less memory compared to Python lists,
                          especially for large datasets, due to their fixed data type and efficient
                          storage format.
Indexing and Slicing      NumPy arrays offer more powerful and convenient indexing and slicing
                          capabilities compared to Python lists, making it easier to manipulate and
                          access specific elements or subarrays.
Parallel Processing       NumPy operations can leverage parallel processing capabilities of modern
                          CPUs through libraries like Intel MKL or OpenBLAS, resulting in
                          significant performance gains for certain operations compared to Python
                          lists.
Interoperability          NumPy arrays can be easily integrated with other scientific computing
                          libraries in the Python ecosystem, such as SciPy, Pandas, and Matplotlib,
                          allowing seamless data exchange and interoperability.

Why?                            Masked arrays in NumPy allow you to handle missing or invalid data
                                efficiently.
What are Masked Arrays?         Masked arrays are arrays with a companion boolean mask array, where
                                elements that are marked as "masked" are ignored during
                                computations.
Creating Masked Arrays          You can create masked arrays using the numpy.ma.masked_array
                                function, specifying the data array and the mask array.
Masking                         Masking is the process of marking certain elements of an array as
                                invalid or missing. You can manually create masks or use functions
                                like numpy.ma.masked_where to create masks based on conditions.
Operations with Masked Arrays   Operations involving masked arrays automatically handle masked
                                values by ignoring them in computations. This allows for easy
                                handling of missing data without explicitly removing or replacing
                                them.
Masked Array Methods            NumPy provides methods for masked arrays to perform various
                                operations like calculating statistics, manipulating data, and
                                more. These methods are similar to regular array methods but handle
                                masked values appropriately.
Applications                    Masked arrays are useful in scenarios where datasets contain
                                missing or invalid data points. They are commonly used in
                                scientific computing, data analysis, and handling time series data
                                where missing values are prevalent.
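A short sketch of the masked-array workflow may help make it concrete. It assumes a hypothetical sensor feed in which -999.0 marks a missing reading; that sentinel and the variable names are illustrative only.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical readings; -999.0 is an assumed "missing value" sentinel.
readings = np.array([12.5, -999.0, 14.0, 13.5, -999.0])

# Create a masked array from data plus an explicit boolean mask.
masked = ma.masked_array(readings, mask=(readings == -999.0))

# Alternatively, build the mask from a condition in one step.
masked_alt = ma.masked_where(readings < 0, readings)

# Statistics skip the masked entries automatically.
masked_mean = masked.mean()
valid_count = masked.count()

# Recover plain ndarrays: only the valid values, or masked slots filled in.
valid = masked.compressed()
filled = masked.filled(np.nan)

print(masked, masked_mean, valid)
```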
np.random.rand                                Generates random numbers from a uniform distribution
                                              over [0, 1).
np.random.randn                               Generates random numbers from a standard normal
                                              distribution (mean 0, standard deviation 1).
np.random.randint                             Generates random integers from a specified low
                                              (inclusive) to high (exclusive) range.
np.random.random_sample or np.random.random   Generates random floats in the half-open interval
                                              [0.0, 1.0).
np.random.choice                              Generates random samples from a given 1-D array or
                                              list.
np.random.shuffle                             Shuffles the elements of an array in place.
np.random.seed                                Sets the random seed to ensure reproducibility of
                                              results.

Filtering Arrays
Filtering Arrays       NumPy provides powerful tools for filtering arrays based on certain
                       conditions. Filtering allows you to select the elements that meet given
                       criteria.
Syntax                 filtered_array = array[condition]
Example                import numpy as np
                       arr = np.array([1, 2, 3, 4, 5])
                       filtered = arr[arr > 2]
                       # Select elements greater than 2
                       print(filtered)
                       # Output: [3 4 5]
Combining Conditions   Conditions can be combined using logical operators like & (and) and | (or).
                       Wrap each condition in parentheses.
Example                arr = np.array([1, 2, 3, 4, 5])
                       filtered = arr[(arr > 2) & (arr < 5)]
                       # Select elements strictly between 2 and 5
                       print(filtered)
                       # Output: [3 4]
Using Functions        NumPy also provides functions like np.where() and np.extract() for more
                       complex filtering.
Example                arr = np.array([1, 2, 3, 4, 5])
                       filtered = np.where(arr % 2 == 0, arr, 0)
                       # Replace odd elements with 0
                       print(filtered)
                       # Output: [0 2 0 4 0]

For Loops        Iterate over arrays using traditional for loops. This is useful for basic
                 iteration but might not be the most efficient method for large arrays.
nditer           The nditer object allows iterating over arrays in a more efficient and flexible
                 way. It provides options to specify the order of iteration, data type casting,
                 and external loop handling.
Flat Iteration   The flat attribute of NumPy arrays returns an iterator that iterates over all
                 elements of the array as if it were a flattened 1D array. This is useful for
                 simple element-wise operations.
Broadcasting     When performing operations between arrays of different shapes, NumPy
                 automatically broadcasts the arrays to compatible shapes. Understanding
                 broadcasting rules can help efficiently iterate over arrays without explicit
                 loops.
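The iteration options described in this part of the sheet (for loops, nditer, flat, and broadcasting in place of a loop) can be compared on one small array. This is an illustrative sketch with hypothetical variable names, not a performance benchmark.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Plain for loop: iterating a 2D array yields one row at a time.
row_sums = [int(row.sum()) for row in arr]

# nditer visits every element; the order argument controls traversal order.
c_order = [int(x) for x in np.nditer(arr, order="C")]
f_order = [int(x) for x in np.nditer(arr, order="F")]

# nditer can also modify elements in place when opened as read-write.
doubled = arr.copy()
with np.nditer(doubled, op_flags=["readwrite"]) as it:
    for x in it:
        x[...] = 2 * x

# flat iterates as if the array were one-dimensional (row-major order).
flat_values = [int(x) for x in arr.flat]

# Broadcasting replaces an explicit loop over rows: add one offset per column.
offsets = np.array([10, 20, 30])
shifted = arr + offsets

print(row_sums, c_order, f_order, flat_values)
```

The in-place nditer loop gives the same result as the whole-array expression `arr * 2`, which is usually the simpler choice.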
Vectorized Operations   Instead of explicit iteration, utilize NumPy's built-in vectorized
                        operations which operate on entire arrays rather than individual elements.
                        This often leads to faster and more concise code.

Array Reshaping
Array Reshaping   Reshaping arrays in NumPy allows you to change the shape or dimensions of an
                  existing array without changing its data. This is useful for tasks like
                  converting a 1D array into a 2D array or vice versa, or for preparing data for
                  certain operations like matrix multiplication.
reshape()         The reshape() function in NumPy allows you to change the shape of an array to a
                  specified shape.
For example:      import numpy as np
                  arr = np.array([1, 2, 3, 4, 5, 6])
                  reshaped_arr = arr.reshape((2, 3))
Explanation       This will reshape the array arr into a 2x3 matrix.
resize()          Similar to reshape(), resize() changes the shape of an array, but it can also
                  change the total number of elements. np.resize() returns a new array, while the
                  ndarray.resize() method modifies the original array in place.
Example           arr = np.array([[1, 2], [3, 4]])
                  resized_arr = np.resize(arr, (3, 2))
Explanation       If the new shape requires more elements than the original array has,
                  np.resize() repeats the original array to fill in the new shape (the
                  arr.resize() method pads with zeros instead).
flatten()         The flatten() method collapses a multi-dimensional array into a 1D array by
                  iterating over all elements in row-major (C-style) order. It always returns a
                  copy.
Example           arr = np.array([[1, 2], [3, 4]])
                  flattened_arr = arr.flatten()
ravel()           Similar to flatten(), ravel() also flattens multi-dimensional arrays into a 1D
                  array, but it returns a view of the original array whenever possible.
Example           arr = np.array([[1, 2], [3, 4]])
                  raveled_arr = arr.ravel()
Explanation       This method can be more efficient in terms of memory usage than flatten().
transpose()       The transpose() method rearranges the dimensions of an array. For 2D arrays, it
                  effectively swaps rows and columns.
Example           arr = np.array([[1, 2], [3, 4]])
                  transposed_arr = arr.transpose()
Explanation       This will transpose the 2x2 matrix, swapping rows and columns.

Sorting Arrays
np.sort(arr)              Returns a sorted copy of the array.
arr.sort()                Sorts the array in-place.
np.argsort(arr)           Returns the indices that would sort the array.
np.lexsort()              Performs an indirect sort using a sequence of keys.
np.sort_complex(arr)      Sorts the array of complex numbers based on the real part first, then
                          the imaginary part.
np.partition(arr, k)      Rearranges the elements in such a way that the kth element will be in
                          its correct position in the sorted array, with all smaller elements to
                          its left and all larger elements to its right.
np.argpartition(arr, k)   Returns the indices that would partition the array.
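The sorting functions listed in this part of the sheet behave as sketched below. The arrays (scores, labels, dept, age) are made-up sample data used only for illustration.

```python
import numpy as np

scores = np.array([40, 10, 30, 20, 50])
labels = np.array(["d", "a", "c", "b", "e"])

# np.sort returns a sorted copy; the original array is unchanged.
sorted_copy = np.sort(scores)

# ndarray.sort sorts in place (done on a copy here to keep scores intact).
in_place = scores.copy()
in_place.sort()

# np.argsort gives the indices that sort the array; reuse them on a parallel array.
order = np.argsort(scores)
labels_by_score = labels[order]

# np.partition only guarantees position k: smaller-or-equal values come before it,
# larger-or-equal values after it, in no particular order.
k = 2
partitioned = np.partition(scores, k)
partition_idx = np.argpartition(scores, k)

# np.lexsort sorts by the LAST key first: here by dept, then by age within dept.
dept = np.array([2, 1, 2, 1])
age = np.array([30, 40, 25, 20])
lex_order = np.lexsort((age, dept))

print(sorted_copy, order, labels_by_score, partitioned[k], lex_order)
```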
Array Indexing