Exercises-Linear Algebra
Course Content
II Semester
Text Books:
1. Linear Algebra and its Applications, Gilbert Strang, 4th Edition, Cengage India Private Limited, 2005.
2. Linear Algebra and Its Applications, David C. Lay, 5th Edition, Pearson, 2023.
Reference Books:
1. Boyd, Stephen, and Lieven Vandenberghe. "Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares", Cambridge University Press, 2018.
2. Thulasidas, M. "Linear Algebra for Computer Science" (Essential Mathematics for Computer Scientists), Singapore Management University, 2021.
1. Matrix Operations
Objective: Understand and implement matrix operations including matrix addition and scalar
multiplication using NumPy. Develop skills in manipulating matrices by applying these operations to image
processing.
A colour image can be represented as a matrix of values. Now suppose you are working with grayscale images, each represented as a matrix where every element corresponds to a pixel's intensity. You will apply matrix operations to modify and enhance such images.
Image Representation:
Consider a grayscale image represented by a 5x5 matrix where each value represents the intensity of a
pixel (0 is black, 255 is white).
Transformation Matrix:
We will use a transformation matrix T to change the brightness of the image. Consider the following matrix
to add to the original image:
Steps
Use NumPy to create the matrices for the original image and the transformation matrix.
import numpy as np

# Original image matrix (fill in the 5x5 grayscale values given in the exercise)
I = np.array([
])

# Transformation matrix (fill in the 5x5 brightness offsets given in the exercise)
T = np.array([
])
Perform matrix addition: Compute the transformed image by adding the transformation matrix T to the original image matrix I.
Apply scalar multiplication: Use NumPy to multiply the original image matrix I by the scalar value α.
# Scalar value
alpha = 1.5
# Scalar multiplication
I_scaled = I * alpha
Ensure pixel values are within valid range: As grayscale images are used, pixel values should be
between 0 and 255. After scalar multiplication, clip the values to this range.
Code:
import numpy as np

# Original image matrix (fill in the 5x5 grayscale values given in the exercise)
I = np.array([
])

# Transformation matrix (fill in the 5x5 brightness offsets given in the exercise)
T = np.array([
])

# Scalar value
alpha = 1.5

# Matrix addition
I_adjusted = I + T

# Scalar multiplication
I_scaled = I * alpha

# Clip pixel values to the valid grayscale range [0, 255]
I_adjusted = np.clip(I_adjusted, 0, 255)
I_scaled = np.clip(I_scaled, 0, 255)

# Display results
print("Transformed image:\n", I_adjusted)
print("Scaled image:\n", I_scaled)
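The 5x5 pixel values and the transformation matrix are given as figures in the exercise and are not reproduced here, so the following minimal, self-contained sketch uses assumed sample data (a random 5x5 image and a constant +30 brightness offset, both assumptions purely for illustration) to show the effect of the two operations:

import numpy as np

# Assumed sample data for illustration only (not the values from the exercise figure)
rng = np.random.default_rng(0)
I = rng.integers(0, 200, size=(5, 5))    # random 5x5 grayscale image
T = np.full((5, 5), 30)                  # constant brightness offset
alpha = 1.5                              # scalar value from the exercise

I_adjusted = np.clip(I + T, 0, 255)      # matrix addition brightens every pixel
I_scaled = np.clip(I * alpha, 0, 255)    # scalar multiplication stretches intensities

print("Original image:\n", I)
print("After matrix addition (I + T):\n", I_adjusted)
print("After scalar multiplication (alpha * I):\n", I_scaled)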
Discussion
Compare the original image matrix with the transformed and scaled images using matrix operations.
How do the operations affect the image?
Discuss the effect of matrix addition and scalar multiplication on the pixel values.
Try different values for the transformation matrix T and the scalar α, and observe how changes in these values affect the image.
2. Solving Systems of Linear Equations and the Matrix Inverse
Objective: Solve a system of linear equations to model supply and demand, and find and verify the inverse of a matrix using NumPy.
Example: Suppose you are a data analyst tasked with determining the equilibrium quantities of two goods in a market. The supply and demand conditions form a system of linear equations that you need to solve. Additionally, you will find and verify the inverse matrix to understand its role in solving the system.
Consider a market where two goods, Goods X and Goods Y, are being analyzed. The supply and demand conditions for these goods are given by:

2x + 3y = 18
4x + 5y = 30

where x and y are the quantities of Goods X and Goods Y.

Matrix Formulation:
These equations can be represented in matrix form as Ax = b, where A = [[2, 3], [4, 5]] is the coefficient matrix, x = [x, y]ᵀ is the vector of unknown quantities, and b = [18, 30]ᵀ is the constant vector.
import numpy as np
# Coefficient matrix
A = np.array([
[2, 3],
[4, 5]
])
# Constant matrix
b = np.array([18, 30])
2. Solve the system: Use NumPy to solve for x (the quantities of Goods X and Goods Y).
x = np.linalg.solve(A, b)
3. Find and verify the inverse: Compute the inverse of A; the product A_inv · A should give the identity matrix.
A_inv = np.linalg.inv(A)
I = np.dot(A_inv, A)
print("Identity matrix:\n", I)
4. Solve using the inverse: The solution can also be obtained as x = A⁻¹b.
x_inv_method = np.dot(A_inv, b)
Full Code:
import numpy as np
# Coefficient matrix
A = np.array([
    [2, 3],
    [4, 5]
])
# Constant matrix
b = np.array([18, 30])

# Solve the system
x = np.linalg.solve(A, b)
print("Solution [Goods X, Goods Y]:", x)

# Inverse of A and verification: A_inv @ A should give the identity matrix
A_inv = np.linalg.inv(A)
I = np.dot(A_inv, A)
print("Identity matrix:\n", I)
print("Inverse verified:", np.allclose(I, np.identity(A.shape[0])))

# Solve again using the inverse: x = A^(-1) b
x_inv_method = np.dot(A_inv, b)
print("Solution via inverse:", x_inv_method)
Discussion
What do the values of x and y represent in the context of the supply and demand
model?
How does the inverse matrix method help in solving systems of linear equations?
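To connect the discussion question about the inverse to a concrete computation, here is a small self-contained sketch using the explicit 2x2 inverse formula A⁻¹ = (1 / (ad - bc)) [[d, -b], [-c, a]] and comparing it with NumPy's result:

import numpy as np

A = np.array([[2, 3], [4, 5]])
b = np.array([18, 30])

# Explicit 2x2 inverse: A_inv = (1 / det) * [[d, -b], [-c, a]]
a_, b_, c_, d_ = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
det = a_ * d_ - b_ * c_
A_inv = np.array([[d_, -b_], [-c_, a_]]) / det

print("A_inv:\n", A_inv)
print("Matches np.linalg.inv:", np.allclose(A_inv, np.linalg.inv(A)))
print("Solution x = A_inv @ b:", A_inv @ b)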
3. Vector Operations
Objective: Compute and understand dot and cross products involving one-hot vectors, and apply these concepts to analyze
relationships between categories.
One-hot vectors are a popular method for encoding categorical data in machine learning and data processing tasks: they represent categorical data in a format that models can process. Each category in a dataset is represented as a binary vector in which exactly one element is set to 1 (indicating the presence of that category) and all other elements are set to 0.
In a dataset with three categories ['pedestrian', 'car', 'motorcycle'], these could be represented as shown below.
Given Data:
Consider a dataset with three categories: 'pedestrian', 'car', 'motorcycle'. Represent each category using one-hot vectors:
Category A (pedestrian): A = [1, 0, 0]ᵀ
Category B (car): B = [0, 1, 0]ᵀ
Category C (motorcycle): C = [0, 0, 1]ᵀ
Steps
1. Measure Similarity: Compute the dot product between different pairs of one-hot vectors to determine the similarity between categories. Since one-hot vectors are orthogonal, the dot product should be 0 between different categories and 1 when comparing a vector with itself.
2. Analyze Orthogonality: Compute the cross product between pairs of one-hot vectors to check whether they produce vectors that indicate orthogonality in the context of high-dimensional space.
Code
import numpy as np

# One-hot vectors for the three categories
A = np.array([1, 0, 0])
B = np.array([0, 1, 0])
C = np.array([0, 0, 1])

# Dot products between different categories (expected value: 0, since the vectors are orthogonal)
dot_AB = np.dot(A, B)
dot_AC = np.dot(A, C)
dot_BC = np.dot(B, C)

# Dot products of each vector with itself (expected value: 1)
dot_AA = np.dot(A, A)
dot_BB = np.dot(B, B)
dot_CC = np.dot(C, C)

# Cross products between pairs of vectors
cross_AB = np.cross(A, B)
cross_BC = np.cross(B, C)
cross_CA = np.cross(C, A)

print("Dot products (different categories):", dot_AB, dot_AC, dot_BC)
print("Dot products (same category):", dot_AA, dot_BB, dot_CC)
print("Cross products:", cross_AB, cross_BC, cross_CA)
Discussion
Verify the dot products and their values. Discuss the similarity between categories and the concept of orthogonality.
Analyze the cross products to understand their significance in high-dimensional space. Discuss how they might be used to
understand relationships between categories.
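A compact way to check all the dot products at once is the following small self-contained sketch: stack the one-hot vectors as rows of a matrix M; entry (i, j) of M·Mᵀ is then the dot product of vector i with vector j, and the product equals the identity matrix exactly when the vectors are orthonormal.

import numpy as np

M = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])

# Entry (i, j) of M @ M.T is the dot product of vector i with vector j
gram = M @ M.T
print("Gram matrix:\n", gram)
print("Orthonormal:", np.array_equal(gram, np.identity(3)))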
4. Linear Independence, Basis, and Dimension
Example: In machine learning, feature selection is important to identify a set of features that capture the most information while avoiding redundancy. This involves determining a basis for the feature space, which consists of linearly independent vectors.
You are working with a dataset where each feature is represented as a vector. Your goal is to determine how many of the feature vectors are linearly independent, find a basis for the space they span, and determine its dimension.
Given Data:
Feature Vector 1: f1 = [1, 2, 3]ᵀ
Feature Vector 2: f2 = [2, 4, 6]ᵀ
Feature Vector 3: f3 = [0, 1, 2]ᵀ
Feature Vector 4: f4 = [1, 0, 1]ᵀ
Steps
Create a Matrix: Combine the feature vectors into a matrix where each row represents a feature vector.
Compute the Rank: The rank of the matrix will give the number of linearly independent vectors. You can use NumPy's
np.linalg.matrix_rank function to find this.
Find the Basis: Use NumPy's np.linalg.svd (Singular Value Decomposition) to compute a basis for the vector space. Alternatively,
use np.linalg.qr to perform QR decomposition, which will also give you a basis.
Determine the Dimension: The dimension of the feature space is equal to the rank of the matrix, which is the number of linearly
independent vectors.
Code
import numpy as np
f1 = np.array([1, 2, 3])
f2 = np.array([2, 4, 6])
f3 = np.array([0, 1, 2])
f4 = np.array([1, 0, 1])

# Combine the feature vectors into a matrix (one vector per row)
matrix = np.array([f1, f2, f3, f4])

# Rank = number of linearly independent feature vectors
rank = np.linalg.matrix_rank(matrix)

# QR decomposition of the transpose; the first `rank` columns of Q form a basis
Q, R = np.linalg.qr(matrix.T)
basis = Q[:, :rank]

# The dimension of the feature space equals the rank
dimension = rank
print("Rank:", rank)
print("Basis (as columns):\n", basis)
print("Dimension:", dimension)
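The steps above also mention np.linalg.svd as an alternative way to obtain a basis. A minimal sketch, assuming the same feature matrix as in the code above:

import numpy as np

matrix = np.array([[1, 2, 3], [2, 4, 6], [0, 1, 2], [1, 0, 1]])
rank = np.linalg.matrix_rank(matrix)

# In the SVD, the first `rank` rows of Vt form an orthonormal basis of the row space,
# i.e. of the space spanned by the feature vectors.
U, S, Vt = np.linalg.svd(matrix)
basis = Vt[:rank]
print("Orthonormal basis from SVD:\n", basis)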
Discussion
Discuss which vectors are linearly independent and why some vectors are dependent.
Analyze the computed basis and how it represents the feature space.
5. Orthogonal Projections
Objective: Compute orthogonal projections, understand and apply orthogonality in vector spaces, and explore practical applications of projections in data fitting and dimensionality reduction.
In machine learning, techniques like Principal Component Analysis (PCA) use orthogonal projections to reduce the dimensionality of data while preserving as much variance as possible. This exercise involves projecting data points onto a subspace spanned by a set of basis vectors, which is fundamental for dimensionality reduction and feature extraction.
You have a dataset with 2-dimensional feature vectors and want to project these vectors onto a 1-dimensional subspace. This subspace
is spanned by a given vector, representing a principal component or direction of interest. The goal is to:
1. Compute the orthogonal projection of each data point onto this subspace.
2. Understand how orthogonality helps in simplifying the data while retaining key features.
Steps
1. Normalize the Basis Vector: Compute the unit vector in the direction of u. This is necessary for projecting onto the direction of u.
2. Compute Projections: For each data point x, compute the orthogonal projection onto the direction of u:
proj_u(x) = ((x · u) / (u · u)) u
3. Compute the Orthogonal Component: For each data point x, compute the component that is orthogonal to u. This is the difference between the original vector and its projection.
4. Visualize the Data: Plot the original data points, their projections, and the orthogonal components to visualize how the data is simplified by projecting onto the subspace.
Code
import numpy as np
import matplotlib.pyplot as plt

# Direction (basis) vector and 2-D data points
u = np.array([1, 1])
x1 = np.array([1, 2])
x2 = np.array([3, 4])
x3 = np.array([5, 6])

# Normalize the basis vector
u_norm = u / np.linalg.norm(u)

def project(x, u):
    return (np.dot(x, u) / np.dot(u, u)) * u

def orthogonal_component(x, u):
    return x - project(x, u)

points = [x1, x2, x3]
projections = [project(x, u) for x in points]

# Plot the basis direction, the original points, and their projections
plt.figure(figsize=(10, 6))
plt.quiver(0, 0, u_norm[0], u_norm[1], angles='xy', scale_units='xy', scale=1, color='k', label='Basis Vector u')
for x, p in zip(points, projections):
    plt.scatter(*x, color='b')
    plt.scatter(*p, color='r')
    plt.plot([x[0], p[0]], [x[1], p[1]], 'g--')
plt.xlim(-1, 7)
plt.ylim(-1, 7)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.grid(True)
plt.show()
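To make the orthogonality property in the discussion below concrete, here is a small self-contained check (same vectors as above) showing that each orthogonal component has zero dot product with u:

import numpy as np

u = np.array([1, 1])
points = [np.array([1, 2]), np.array([3, 4]), np.array([5, 6])]

for x in points:
    proj = (np.dot(x, u) / np.dot(u, u)) * u   # projection onto u
    comp = x - proj                            # orthogonal component
    print(x, "-> projection", proj, ", orthogonal component", comp, ", comp . u =", np.dot(comp, u))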
Discussion
Analyze how the projections simplify the representation of the data.
Discuss the orthogonal components and how they relate to the original data.
6. Gram-Schmidt Orthogonalization
Objective: Apply the Gram-Schmidt process to network traffic data to construct orthonormal bases for feature vectors, and analyze
the implications for network traffic pattern recognition and anomaly detection.
Example: In network traffic analysis, different traffic patterns (e.g., normal and anomalous behaviors) can be represented as feature
vectors. Orthogonalizing these vectors can help in identifying and analyzing patterns more effectively, improving network security
and performance.
You have network traffic data represented as feature vectors, where each vector captures various attributes (e.g., packet size, protocol type, etc.) of network traffic. The goal is to orthogonalize these feature vectors, verify the orthogonality of the result, and use the resulting basis to check whether new traffic samples are consistent with known patterns.
Given Data:
Consider a simplified dataset of network traffic where each feature vector represents a different network traffic sample:
Feature Vector 1: v1 = [3, 7, 2]ᵀ
Feature Vector 2: v2 = [6, 2, 4]ᵀ
Feature Vector 3: v3 = [1, 5, 8]ᵀ
Steps
Implement the Gram-Schmidt Process: Write a function to apply the Gram-Schmidt process to the given feature vectors to
obtain an orthonormal basis.
Check Orthogonality: Compute the dot products between pairs of orthonormal vectors to verify that they are orthogonal.
Project New Data Points: Project a new network traffic vector onto the orthonormal basis to determine if it is consistent with
the known patterns.
Code
import numpy as np

# Feature vectors from the given data
v1 = np.array([3, 7, 2])
v2 = np.array([6, 2, 4])
v3 = np.array([1, 5, 8])
vectors = np.array([v1, v2, v3], dtype=float)

def gram_schmidt(vectors):
    orthonormal_basis = []
    for v in vectors:
        v = v.astype(float)            # work on a float copy
        for u in orthonormal_basis:
            v -= np.dot(v, u) * u      # remove the component along u
        v = v / np.linalg.norm(v)      # normalize to unit length
        orthonormal_basis.append(v)
    return np.array(orthonormal_basis)

def check_orthogonality(vectors):
    num_vectors = vectors.shape[0]
    for i in range(num_vectors):
        for j in range(i + 1, num_vectors):
            dot_product = np.dot(vectors[i], vectors[j])
            print(f"Dot product of vector {i+1} and vector {j+1}:", dot_product)

orthonormal_basis = gram_schmidt(vectors)
check_orthogonality(orthonormal_basis)

# One possible anomaly check (an assumption, not specified in the exercise):
# reconstruct a new vector from its projections onto the basis and flag it if the
# residual is large. With a full 3-vector basis of R^3 the residual is always ~0,
# so in practice fewer basis vectors than dimensions would be kept.
def detect_anomaly(x, basis, threshold=1e-6):
    projection = sum(np.dot(x, u) * u for u in basis)
    return np.linalg.norm(x - projection) > threshold

# Example new traffic sample (assumed values for illustration)
new_vector = np.array([2.0, 3.0, 3.0])

# Detect anomaly
is_anomaly = detect_anomaly(new_vector, orthonormal_basis)
print("Is the new vector an anomaly?", is_anomaly)
Discussion
Discuss how the Gram-Schmidt process transformed the feature vectors into an orthonormal basis.
Analyze how the orthonormal basis helps in identifying anomalies in network traffic data.
7. Eigenvalues, Eigenvectors, and the PageRank Algorithm
Objective: Compute and interpret eigenvalues and eigenvectors in the context of the PageRank algorithm. Understand how these concepts are used to rank web pages based on their link structure.
Example: PageRank is a system used by Google to rank web pages in search engine results based on their importance. It uses the link
structure of the web to assign a rank to each page. The PageRank algorithm involves computing the dominant eigenvector of a
transition matrix, where the eigenvalues and eigenvectors help in understanding the importance of each page.
You are given a simplified web with four pages and their link structure. You will compute the PageRank vector by finding the
dominant eigenvector of the corresponding transition matrix.
Given Data:
Consider a web with four pages A, B, C, and D with the following link structure:
Steps
Build the Transition Matrix: Construct the transition matrix M where Mij represents the probability of transitioning from page i to page j. Each element of the matrix should be 0 or the reciprocal of the number of links from a given page.
Compute Eigenvalues and Eigenvectors: Find the eigenvalues and eigenvectors of the transition matrix and identify the dominant eigenvector (the one associated with the largest eigenvalue).
Analyze the PageRank Vector: Interpret the normalized dominant eigenvector as the PageRank vector to understand the relative importance of each page. Higher values in the PageRank vector indicate more important pages.
Code
import numpy as np

# Transition matrix M built from the link structure (the figure is not reproduced
# here; an assumed example is given in the sketch below)
# M = np.array([...])

# Eigenvalues and eigenvectors of M
eigenvalues, eigenvectors = np.linalg.eig(M)
dominant_index = np.argmax(np.real(eigenvalues))

# Dominant eigenvector
dominant_eigenvector = np.real(eigenvectors[:, dominant_index])
dominant_eigenvector = dominant_eigenvector / np.sum(dominant_eigenvector)  # Normalize
print("Eigenvalues:\n", eigenvalues)
print("Dominant Eigenvector (PageRank Vector):\n", dominant_eigenvector)
Discussion
Discuss how the eigenvalues and eigenvectors are used to determine the PageRank vector.
Analyze the PageRank values to identify the most important pages in the web structure.
Modify the link structure (e.g., add or remove links) and observe how the PageRank vector changes.
8. Matrix Diagonalization and Principal Component Analysis (PCA)
Objective: Understand and implement matrix diagonalization to reduce the dimensionality of software metrics data. Apply PCA to analyze and visualize software metrics to uncover key patterns and relationships.
Software engineers often collect various metrics (e.g., code complexity, test coverage, defect counts, code churn) from software
projects. Analyzing these metrics can help identify patterns, manage project performance, and make data-driven decisions.
Diagonalization and PCA can be used to reduce the number of metrics while retaining the most important information, making it
easier to analyze and visualize the data.
You have a dataset containing various metrics from different software projects. You will apply PCA to reduce the dimensionality of
this dataset, helping to identify the principal components that explain the most variance in the data.
Given Data:
Assume you have a dataset with the following software metrics collected from several projects:
Project   Code Complexity   Test Coverage   Defect Count   Code Churn
P1        50                0.80            5              100
P2        70                0.60            15             200
P3        30                0.90            2              50
P4        60                0.70            10             150
P5        90                0.50            25             300
Exercise
1. Data Preparation
2. Create the DataFrame
3. Extract Features and Standardize
4. Compute the Covariance Matrix
5. Matrix Diagonalization - Perform Eigen Decomposition
6. Sort Eigenvalues and Eigenvectors
7. Project Data onto Principal Components
8. Select Top Principal Components
9. Plot the Principal Components
Code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# data: a dictionary holding the metrics from the table above
# (a complete runnable sketch of all nine steps is given below)
df = pd.DataFrame(data)
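A minimal end-to-end sketch of the nine steps listed above, assuming the metrics from the table and the choice of the top two principal components; the variable names are illustrative, not prescribed by the exercise:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Steps 1-2: data preparation and DataFrame (values from the table above)
data = {
    'Project': ['P1', 'P2', 'P3', 'P4', 'P5'],
    'Code Complexity': [50, 70, 30, 60, 90],
    'Test Coverage': [0.80, 0.60, 0.90, 0.70, 0.50],
    'Defect Count': [5, 15, 2, 10, 25],
    'Code Churn': [100, 200, 50, 150, 300],
}
df = pd.DataFrame(data)

# Step 3: extract features and standardize (zero mean, unit variance)
X = df[['Code Complexity', 'Test Coverage', 'Defect Count', 'Code Churn']].values.astype(float)
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 4: covariance matrix
cov_matrix = np.cov(X_std, rowvar=False)

# Step 5: eigen decomposition (matrix diagonalization)
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

# Step 6: sort eigenvalues (and eigenvectors) in decreasing order
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
print("Explained variance ratio:", eigenvalues / eigenvalues.sum())

# Steps 7-8: project the data onto the top two principal components
k = 2
X_pca = X_std @ eigenvectors[:, :k]

# Step 9: plot the projected data
plt.scatter(X_pca[:, 0], X_pca[:, 1])
for i, name in enumerate(df['Project']):
    plt.annotate(name, (X_pca[i, 0], X_pca[i, 1]))
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('Software Metrics Projected onto the First Two Principal Components')
plt.grid(True)
plt.show()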
Discussion
Discuss the proportion of variance explained by each principal component.
Analyze the scatter plot. Identify any patterns or clusters.
Interpret the results based on the principal components.
9. Experiment: Testing Positive Definiteness of a Matrix in Community Detection for Social Network Analysis
Objective:
To test if matrices related to community detection in a social network (such as similarity matrices) are positive definite, and
understand the implications of this property for identifying and analyzing communities. Community detection algorithms utilize
similarity matrices to identify clusters or communities within a network. Positive definiteness of these matrices can affect the stability
and reliability of the community detection process.
Given Data: the similarity (adjacency) matrix of the network:
0, 1, 0, 0
1, 0, 1, 1
0, 1, 0, 1
0, 1, 1, 0
Steps:
1. Data Preparation
Code
Running the Code: the required libraries must be installed first: pip install numpy pandas networkx matplotlib
import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from networkx.algorithms.community import louvain_communities
from matplotlib.colors import ListedColormap

# Similarity (adjacency) matrix from the given data
similarity_matrix = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0]
])

# Compute eigenvalues
eigenvalues, _ = np.linalg.eig(similarity_matrix)
print("Eigenvalues:", eigenvalues)

# Positive definiteness requires all eigenvalues to be strictly positive
if np.all(eigenvalues > 0):
    print("Matrix is positive definite.")
else:
    print("Matrix is not positive definite.")
plt.figure(figsize=(8, 6))
nx.draw(G, pos, with_labels=True, node_color=node_colors, edge_color='gray', node_size=500, font_size=16, font_color='black',
cmap=cmap)
plt.title("Network with Detected Communities")
plt.show()
Discussion
Positive definiteness of the similarity matrix implies that the relationships between nodes are well-defined and non-redundant,
leading to stable and reliable community detection results.
If the matrix is not positive definite, it may indicate issues such as redundancy among the attributes used in the network representation.
10. Singular Value Decomposition (SVD)
Objective:
To apply Singular Value Decomposition (SVD) to a 4x4 user-movie rating matrix. SVD helps in dimensionality reduction and feature extraction. By visualizing the original matrix, the singular values, the compressed user values, and the reconstructed matrix, we gain insight into how SVD reduces dimensionality and captures the essential features of the data, which is useful for recommendation systems and other data analysis tasks.
Experiment Steps:
1. Define a 4x4 User-Movie Rating Matrix
2. Apply SVD:
o Perform SVD to decompose the matrix into U, Σ, and V
3. Reconstruct the Matrix:
o Use the decomposed matrices to reconstruct the original matrix.
4. Visualize the Results:
o Display the original matrix, singular values, compressed user values, and the reconstructed matrix.
Code
import numpy as np
import matplotlib.pyplot as plt
A = np.array([
[5, 4, 0, 2],
[3, 0, 0, 5],
[0, 2, 4, 0],
[1, 0, 5, 4]
])
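# --- Added sketch: the plotting code below uses users, movies, S, U_reduced, k and
# --- A_reconstructed, so they are defined here. The user/movie labels are assumed
# --- for illustration; they are not given in the exercise.
users = ['User 1', 'User 2', 'User 3', 'User 4']
movies = ['Movie 1', 'Movie 2', 'Movie 3', 'Movie 4']

# Apply SVD: A = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(A)

# Keep only the top k = 2 singular values (dimensionality reduction)
k = 2
U_reduced = U[:, :k]

# Rank-k reconstruction of the rating matrix (use all factors for an exact reconstruction)
A_reconstructed = U_reduced @ np.diag(S[:k]) @ Vt[:k, :]

# Figure wide enough for the four panels below
plt.figure(figsize=(16, 4))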
# Original Matrix
plt.subplot(1, 4, 1)
plt.title('Original Matrix')
plt.imshow(A, cmap='viridis', interpolation='none')
plt.colorbar()
plt.xticks(ticks=np.arange(len(movies)), labels=movies, rotation=45)
plt.yticks(ticks=np.arange(len(users)), labels=users)
plt.xlabel('Movies')
plt.ylabel('Users')
# Singular Values
plt.subplot(1, 4, 2)
plt.title('Singular Values')
plt.bar(range(len(S)), S, color='orange')
plt.xlabel('Index')
plt.ylabel('Singular Value')
# Compressed User Values
plt.subplot(1, 4, 3)
plt.title('Compressed User Values')
plt.imshow(U_reduced, cmap='viridis', interpolation='none')
plt.colorbar()
plt.xticks(ticks=np.arange(k), labels=[f'Feature {i+1}' for i in range(k)])
plt.yticks(ticks=np.arange(len(users)), labels=users)
plt.xlabel('Compressed Features')
plt.ylabel('Users')
# Reconstructed Matrix
plt.subplot(1, 4, 4)
plt.title('Reconstructed Matrix')
plt.imshow(A_reconstructed, cmap='viridis', interpolation='none')
plt.colorbar()
plt.xticks(ticks=np.arange(len(movies)), labels=movies, rotation=45)
plt.yticks(ticks=np.arange(len(users)), labels=users)
plt.xlabel('Movies')
plt.ylabel('Users')
plt.tight_layout()
plt.show()
Discussion
Reduce Dimensions:
We keep only the top k = 2 singular values. This reduces the representation of users and movies to two dimensions.
Reducing the dimensionality of the user-movie matrix helps in compressing the data.
The reduced dimensions highlight the most important features that explain the majority of the variance in the data.
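To quantify how much of the matrix the top two singular values capture, here is a small self-contained check (assuming the rating matrix defined above):

import numpy as np

A = np.array([[5, 4, 0, 2],
              [3, 0, 0, 5],
              [0, 2, 4, 0],
              [1, 0, 5, 4]])
U, S, Vt = np.linalg.svd(A)

# Fraction of the total "energy" (sum of squared singular values) captured by the top k = 2
k = 2
print("Singular values:", S)
print("Energy captured by the top", k, "singular values:", np.sum(S[:k]**2) / np.sum(S**2))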