Data Science Practical Problems

The document contains a series of exercises involving NumPy and Pandas programming tasks, such as creating null vectors, converting arrays to float types, and performing data analysis on the Pima Indians Diabetes dataset. Each exercise includes a program, expected output, and explanations for operations like reshaping arrays, selecting specific rows and columns, and performing statistical analyses. The final exercises focus on univariate and bivariate analyses using linear and logistic regression modeling.

Uploaded by

soundaravalli


Ex no: 1 a

Write a NumPy program to create a null vector of size 10 and update sixth value to 11

Program

import numpy as np

# Create a null vector of size 10

null_vector = np.zeros(10)

# Update the sixth value to 11 (indexing starts from 0)

null_vector[5] = 11

print("Updated vector:", null_vector)

Output:

Updated vector: [ 0. 0. 0. 0. 0. 11. 0. 0. 0. 0.]

Ex no : 1 b
Write a NumPy program to convert an array to a float type

Program :

import numpy as np

# Create an example array (you can replace this with your own array)

integer_array = np.array([1, 2, 3, 4, 5])

# Convert the array to float type

float_array = integer_array.astype(float)

print("Original array (integer):", integer_array)

print("Converted array (float):", float_array)

Output:

Original array (integer): [1 2 3 4 5]

Converted array (float): [1. 2. 3. 4. 5.]

Ex no : 1 c
Write a NumPy program to create a 3x3 matrix with values ranging from 2 to 10

Program :

import numpy as np

# Create a 1D array with values ranging from 2 to 10

values_array = np.arange(2, 11)

# Reshape the 1D array into a 3x3 matrix

matrix_3x3 = values_array.reshape(3, 3)

print("3x3 Matrix with values ranging from 2 to 10:")

print(matrix_3x3)

Output :

3x3 Matrix with values ranging from 2 to 10:

[[ 2 3 4]

[ 5 6 7]

[ 8 9 10]]
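
The stop value of 11 in the program follows from how np.arange works: the stop value is excluded, so arange(2, 11) yields 2 through 10. A minimal sketch of this, where the use of -1 in reshape is an illustrative variation that is not part of the program above (it asks NumPy to infer that dimension from the array size):

```python
import numpy as np

# np.arange excludes the stop value, so the range 2..10 needs stop=11
values = np.arange(2, 11)
print(values.size)  # 9 elements, enough to fill a 3x3 matrix

# -1 lets NumPy infer the number of rows from the total size
matrix = values.reshape(-1, 3)
print(matrix.shape)  # (3, 3)
```

If the number of elements is not divisible by 3, reshape raises a ValueError rather than padding or truncating.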

Ex no : 1 d
Write a NumPy program to convert a list of numeric values into a one-dimensional NumPy array

Program :

import numpy as np

# Create a list of numeric values

numeric_list = [1, 2, 3, 4, 5]

# Convert the list to a one-dimensional NumPy array

numpy_array = np.array(numeric_list)

print("List of numeric values:", numeric_list)

print("One-dimensional NumPy array:", numpy_array)

Output :

List of numeric values: [1, 2, 3, 4, 5]

One-dimensional NumPy array: [1 2 3 4 5]

Ex no : 2 a
Write a NumPy program to convert an array to a float type

Program :

import numpy as np

# Create an example array (you can replace this with your own array)

original_array = np.array([1, 2, 3, 4, 5])

# Convert the array to float type

float_array = original_array.astype(float)

print("Original array:", original_array)

print("Converted array (float):", float_array)

Output :

Original array: [1 2 3 4 5]

Converted array (float): [1. 2. 3. 4. 5.]

Ex no : 2 b
Write a NumPy program to create an empty and a full array

Program :

import numpy as np

# Create an empty array

empty_array = np.empty((3, 3)) # Specify the shape of the empty array (3x3 in this case)

# Create a full array with a specified value

full_array = np.full((2, 4), 7) # Specify the shape and the value (2x4 array with value 7)

print("Empty Array:")

print(empty_array)

print("\nFull Array with Value 7:")

print(full_array)

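Output :

Empty Array:

[[0. 0. 0.]

[0. 0. 0.]

[0. 0. 0.]]

(np.empty does not initialize memory, so the values printed for the empty array are arbitrary and may differ from run to run.)

Full Array with Value 7: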

[[7 7 7 7]

[7 7 7 7]]
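
A short sketch contrasting np.empty with np.zeros, since the two are easy to confuse. The variable names here are illustrative and not taken from the program above. Only the shape and dtype of an empty array are predictable; np.zeros is the call to use when the contents must be 0.

```python
import numpy as np

# np.empty only allocates memory; the contents are whatever was already there
scratch = np.empty((3, 3))
print(scratch.shape, scratch.dtype)

# np.zeros guarantees that every element is 0.0
zeros = np.zeros((3, 3))
print(zeros)

# np.full with a float fill value produces a float array
sevens = np.full((2, 4), 7.0)
print(sevens.dtype)
```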

Ex no : 2 c

Write a NumPy program to convert a list and tuple into arrays

Program :

import numpy as np

# Convert a list to a NumPy array

list_values = [1, 2, 3, 4, 5]

array_from_list = np.array(list_values)

# Convert a tuple to a NumPy array

tuple_values = (6, 7, 8, 9, 10)

array_from_tuple = np.array(tuple_values)

print("List to Array:")

print(array_from_list)

print("\nTuple to Array:")

print(array_from_tuple)

Output :

List to Array:

[1 2 3 4 5]

Tuple to Array:

[ 6 7 8 9 10]
Ex no : 2 d
Write a NumPy program to find the real and imaginary parts of an array of complex numbers

Program :

import numpy as np

# Create an array of complex numbers

complex_array = np.array([1 + 2j, 3 - 4j, 5 + 6j])

# Find the real and imaginary parts

real_parts = np.real(complex_array)

imaginary_parts = np.imag(complex_array)

print("Array of Complex Numbers:")

print(complex_array)

print("\nReal Parts:")

print(real_parts)

print("\nImaginary Parts:")

print(imaginary_parts)

Output :

Array of Complex Numbers:

[1.+2.j 3.-4.j 5.+6.j]

Real Parts:

[1. 3. 5.]

Imaginary Parts:

[ 2. -4. 6.]
Ex no : 3
Write a Pandas program to get the powers of an array values element-wise.
Note: First array elements raised to powers from second array
Sample data: {'X':[78,85,96,80,86], 'Y':[84,94,89,83,86],'Z':[86,97,96,72,83]}
Expected Output:
X Y Z
0 78 84 86
1 85 94 97
2 96 89 96
3 80 83 72
4 86 86 83

Program :

import pandas as pd

# Sample data

data = {'X': [78, 85, 96, 80, 86], 'Y': [84, 94, 89, 83, 86], 'Z': [86, 97, 96, 72, 83]}

# Create a DataFrame from the sample data

df = pd.DataFrame(data)

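# Raise each row to the power of its position + 1 (row 0 to power 1, row 1 to power 2, ...)

result_df = df.pow(df.index + 1, axis=0)

# Display the result

print(result_df)

Output :

X Y Z

0 78 84 86

1 7225 8836 9409

2 884736 704969 884736

3 40960000 47458321 26873856

4 4704270176 4704270176 3939040643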
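
The note in the exercise ("first array elements raised to powers from second array") can also be read as one array supplying the bases and a second array supplying the exponents. A minimal sketch of that reading, using small hypothetical arrays (bases and exponents are not part of the sample data) so the results stay readable:

```python
import numpy as np

bases = np.array([2, 3, 4])       # hypothetical first array
exponents = np.array([3, 2, 1])   # hypothetical second array

# Each base is raised to the exponent at the same position
powers = np.power(bases, exponents)
print(powers)  # [8 9 4]
```

Applying the same idea to the sample columns (for example X raised to Y) would overflow 64-bit integers, which is one reason to try it on small values first.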
Ex no : 4
Write a Pandas program to select the specified columns and rows from a given data frame.
Sample Python dictionary data and list labels:
Select 'score' and 'qualify' columns in rows 1, 3, 5, 6 from the following data frame.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew',
'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Select specific columns and rows:
score qualify
b 9.0 no
d NaN no
f 20.0 yes
g 14.5 yes

Program :

import numpy as np

import pandas as pd

# Sample data

exam_data = {

'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin',
'Jonas'],

'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],

'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],

'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']
}

# Create a DataFrame from the sample data

df = pd.DataFrame(exam_data, index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])

# Select 'score' and 'qualify' columns in rows 1, 3, 5, 6 (labels 'b', 'd', 'f', 'g')

selected_data = df.loc[['b', 'd', 'f', 'g'], ['score', 'qualify']]

# Display the result

print("Select specific columns and rows:")

print(selected_data)

Output :

Select specific columns and rows:

score qualify

b 9.0 no

d NaN no

f 20.0 yes

g 14.5 yes
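
The rows in the exercise are given as positions (1, 3, 5, 6), while the program selects them by label. A small sketch showing that both routes reach the same rows, using a shortened, hypothetical version of the exam data:

```python
import numpy as np
import pandas as pd

# A shortened, hypothetical version of the exam data
df = pd.DataFrame(
    {"score": [12.5, 9, 16.5, np.nan], "qualify": ["yes", "no", "yes", "no"]},
    index=["a", "b", "c", "d"],
)

by_label = df.loc[["b", "d"], ["score", "qualify"]]   # label-based selection
by_position = df.iloc[[1, 3], [0, 1]]                  # position-based selection

print(by_label.equals(by_position))  # True
```

loc is the safer choice when the row labels are meaningful; iloc is convenient when the task is phrased in terms of row numbers.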

Ex no : 5
Write a Pandas program to count the number of rows and columns of a DataFrame. Sample
Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew',
'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Number of Rows: 10
Number of Columns: 4

Program :

import numpy as np

import pandas as pd

# Sample data

exam_data = {

'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin',
'Jonas'],

'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],

'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],

'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']
}

# Create a DataFrame from the sample data

df = pd.DataFrame(exam_data, index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])

# Count the number of rows and columns

num_rows, num_columns = df.shape

# Display the result

print("Number of Rows:", num_rows)

print("Number of Columns:", num_columns)

Output :

Number of Rows: 10

Number of Columns: 4

Ex no : 6
Reading data from text files, Excel and the web and exploring various commands for doing
descriptive analytics on the Iris data set
(In Record )

Ex no : 7

Use the diabetes data set from Pima Indians Diabetes data set for performing the following:

Apply Univariate analysis:

 Frequency
 Mean,
 Median,
 Mode,
 Variance
 Standard Deviation
 Skewness and Kurtosis
Program :

import pandas as pd

import numpy as np

from scipy.stats import skew, kurtosis

# Load the Pima Indians Diabetes dataset

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"

column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI",

"DiabetesPedigreeFunction", "Age", "Outcome"]

diabetes_data = pd.read_csv(url, names=column_names)

# Display the first few rows of the dataset

print("Dataset Head:")

print(diabetes_data.head())

# Univariate Analysis

for column in diabetes_data.columns:
    print("\nColumn:", column)
    print("Frequency:\n", diabetes_data[column].value_counts())
    print("Mean:", diabetes_data[column].mean())
    print("Median:", diabetes_data[column].median())
    print("Mode:", diabetes_data[column].mode().values)
    print("Variance:", diabetes_data[column].var())
    print("Standard Deviation:", diabetes_data[column].std())
    print("Skewness:", skew(diabetes_data[column]))
    print("Kurtosis:", kurtosis(diabetes_data[column]))

Output:

Dataset Head:

Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
0 6 148 72 35 0 33.6 0.627 50 1

1 1 85 66 29 0 26.6 0.351 31 0

2 8 183 64 0 0 23.3 0.672 32 1

3 1 89 66 23 94 28.1 0.167 21 0

4 0 137 40 35 168 43.1 2.288 33 1

Column: Pregnancies

Frequency:

1 135

0 111

2 103

3 75

4 68

5 57

6 50

7 45

8 38

9 28

10 24

11 11

13 10

12 9

14 2

15 1

17 1

Name: Pregnancies, dtype: int64

Mean: 3.8450520833333335

Median: 3.0

Mode: [1]

Variance: 11.35405632062147

Standard Deviation: 3.3695780626988623

Skewness: 0.9016739791518586

Kurtosis: 0.1592197711542494

...

Column: Outcome

Frequency:

0 500

1 268

Name: Outcome, dtype: int64

Mean: 0.3489583333333333

Median: 0.0

Mode: [0]

Variance: 0.22850161570824634

Standard Deviation: 0.4780286376712976

Skewness: 0.6350166433325007

Kurtosis: -1.601715582922407
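
A note on the skewness figures: scipy.stats.skew and scipy.stats.kurtosis return the biased (population) estimates by default, whereas the pandas methods Series.skew() and Series.kurt() apply a sample-size correction, so the two libraries can give slightly different numbers for the same column. The sketch below illustrates the relationship for skewness on a small hypothetical sample, computing the biased value directly with NumPy instead of calling SciPy:

```python
import numpy as np
import pandas as pd

x = np.array([0, 1, 1, 2, 3, 5, 8], dtype=float)  # hypothetical sample
n = x.size

# Biased (population) skewness: third central moment over variance**1.5
dev = x - x.mean()
g1 = np.mean(dev**3) / np.mean(dev**2) ** 1.5

# pandas applies the sample-size adjustment sqrt(n(n-1)) / (n-2)
adjusted = pd.Series(x).skew()

print(g1, adjusted)
print(np.isclose(adjusted, g1 * np.sqrt(n * (n - 1)) / (n - 2)))
```

For a dataset of several hundred rows the two estimates are close, but it helps to state which one is being reported.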

Ex no : 8

Use the diabetes data set from Pima Indians Diabetes data set for performing the
following:

Apply Bivariate analysis:

 Linear and logistic regression modeling

Program :

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression, LogisticRegression

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the Pima Indians Diabetes dataset

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"

column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI",

"DiabetesPedigreeFunction", "Age", "Outcome"]

diabetes_data = pd.read_csv(url, names=column_names)

# Separate features (X) and target variable (y)

X = diabetes_data.drop("Outcome", axis=1)

y = diabetes_data["Outcome"]

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Linear Regression

linear_model = LinearRegression()

linear_model.fit(X_train, y_train)

# Print Linear Regression results

print("\nLinear Regression Coefficients:")

for feature, coef in zip(X.columns, linear_model.coef_):
    print(f"{feature}: {coef}")

print("Intercept:", linear_model.intercept_)

linear_predictions = linear_model.predict(X_test)

print("\nLinear Regression Predictions (first 10):", linear_predictions[:10])

# Logistic Regression

logistic_model = LogisticRegression()

logistic_model.fit(X_train, y_train)

# Print Logistic Regression results

logistic_predictions = logistic_model.predict(X_test)

accuracy = accuracy_score(y_test, logistic_predictions)

conf_matrix = confusion_matrix(y_test, logistic_predictions)

classification_rep = classification_report(y_test, logistic_predictions)

print("\nLogistic Regression Accuracy:", accuracy)

print("\nConfusion Matrix:")

print(conf_matrix)

print("\nClassification Report:")

print(classification_rep)

Output:

Linear Regression Coefficients:

Pregnancies: 0.0208

Glucose: 0.0056

BloodPressure: -0.0032

SkinThickness: 0.0001

Insulin: -0.0002

BMI: 0.0124

DiabetesPedigreeFunction: 0.1472

Age: 0.0051

Intercept: -0.8254

Linear Regression Predictions (first 10):

[ 0.3216 0.2154 0.7811 0.1891 0.4727 0.2375 0.6484 0.4686 0.6511 0.5670]

Logistic Regression Accuracy: 0.7597

Confusion Matrix:

[[89 14]

[24 27]]
Classification Report:

precision recall f1-score support

0 0.79 0.86 0.82 103

1 0.66 0.53 0.59 51

accuracy 0.76 154

macro avg 0.73 0.70 0.71 154

weighted avg 0.75 0.76 0.75 154
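
The figures in the classification report can be derived from the confusion matrix by hand. The sketch below does this for class 1, assuming the scikit-learn layout in which rows are true classes and columns are predicted classes, so the matrix reads [[TN, FP], [FN, TP]]:

```python
# Confusion matrix as printed above: rows are true classes, columns are predictions
tn, fp = 89, 14
fn, tp = 24, 27

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted positives, how many are correct
recall = tp / (tp + fn)      # of the actual positives, how many are found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The precision, recall, and F1 values computed this way round to the class 1 row of the report. The accuracy implied by this matrix is 116/154, about 0.753, which is slightly lower than the 0.7597 printed above, so those two figures may come from different runs.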

Ex no : 9

Use the diabetes data set from Pima Indians Diabetes data set for performing the
following:

Apply Bivariate analysis:

 Multiple Regression analysis

Program :

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset (replace 'diabetes.csv' with the actual file name)

data = pd.read_csv('C:/Users/Student/Downloads/diabetes.csv')

# Select relevant features (e.g., Glucose, BMI, BloodPressure, Insulin, Age)

X = data[['Glucose', 'BMI', 'BloodPressure', 'Insulin', 'Age']]

y = data['Outcome'] # Outcome: 1 for diabetes, 0 for non-diabetes

# Split data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the linear regression model

model = LinearRegression()

model.fit(X_train, y_train)

# Make predictions on the testing data

y_pred = model.predict(X_test)

# Evaluate the model

mse = mean_squared_error(y_test, y_pred)

r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse:.2f}")

print(f"R-squared: {r2:.2f}")

Output :

Mean Squared Error: 0.18

R-squared: 0.20
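
To make the two metrics concrete, the sketch below computes them from their definitions with NumPy. The label and prediction arrays are hypothetical and unrelated to the diabetes results above.

```python
import numpy as np

y_true = np.array([0, 1, 1, 0, 1], dtype=float)   # hypothetical labels
y_pred = np.array([0.2, 0.7, 0.6, 0.4, 0.9])      # hypothetical predictions

# Mean squared error: the average squared residual
mse = np.mean((y_true - y_pred) ** 2)

# R-squared: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"MSE: {mse:.3f}, R-squared: {r2:.3f}")
```

An R-squared near 0 means the model explains little more of the variation than predicting the mean would; a low value is common when a linear model is fitted to a 0/1 outcome.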

Ex no : 10

Apply and explore various plotting functions on UCI data set for performing the following:

a) Normal values
b) Density and contour plots
c) Three-dimensional plotting

Program :

import seaborn as sns

import matplotlib.pyplot as plt

import numpy as np

# Load a sample dataset (e.g., Iris dataset)

iris = sns.load_dataset("iris")

# a) Normal values plot

# Set the style

sns.set(style="whitegrid")

# Create subplots for each variable

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8))

# Plot kernel density estimate for each variable

sns.kdeplot(data=iris, x="sepal_length", fill=True, ax=axes[0, 0], color="skyblue")

axes[0, 0].set_title("Kernel Density Plot - Sepal Length")

sns.kdeplot(data=iris, x="sepal_width", fill=True, ax=axes[0, 1], color="salmon")

axes[0, 1].set_title("Kernel Density Plot - Sepal Width")

sns.kdeplot(data=iris, x="petal_length", fill=True, ax=axes[1, 0], color="green")

axes[1, 0].set_title("Kernel Density Plot - Petal Length")

sns.kdeplot(data=iris, x="petal_width", fill=True, ax=axes[1, 1], color="orange")

axes[1, 1].set_title("Kernel Density Plot - Petal Width")

plt.suptitle("Normal Values Plot for Iris Dataset")

plt.tight_layout()

plt.show()

# b) Density and Contour Plots

plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)

sns.kdeplot(data=iris, x="sepal_length", y="sepal_width", fill=True, cmap="viridis", thresh=0.15)

plt.subplot(1, 2, 2)

sns.kdeplot(data=iris, x="petal_length", y="petal_width", fill=True, cmap="viridis", thresh=0.15)

plt.suptitle("Density and Contour Plots")

plt.show()

# c) Three-dimensional plotting

from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(10, 8))

ax = fig.add_subplot(111, projection='3d')

colors = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}

ax.scatter(iris['sepal_length'], iris['petal_length'], iris['petal_width'], c=iris['species'].map(colors))

ax.set_xlabel('Sepal Length')

ax.set_ylabel('Petal Length')

ax.set_zlabel('Petal Width')

ax.set_title('Three-dimensional Plot')

plt.show()

Output :
Ex no : 11

Apply and explore various plotting functions on UCI data set for performing the following:

a) Correlation and scatter plots

b) Histograms
c) Three-dimensional plotting

Program :

import seaborn as sns

import matplotlib.pyplot as plt

from mpl_toolkits.mplot3d import Axes3D

# Load a sample dataset (e.g., Iris dataset)

iris = sns.load_dataset("iris")

# a) Correlation and Scatter Plots

sns.set(style="ticks")

sns.pairplot(iris, hue="species", markers=["o", "s", "D"], palette="Set2")

plt.suptitle("Correlation and Scatter Plots")

plt.show()

# b) Histograms

plt.figure(figsize=(12, 6))

plt.subplot(1, 3, 1)

sns.histplot(iris['sepal_length'], kde=True, color="skyblue")

plt.title("Sepal Length Histogram")

plt.subplot(1, 3, 2)

sns.histplot(iris['sepal_width'], kde=True, color="salmon")

plt.title("Sepal Width Histogram")

plt.subplot(1, 3, 3)

sns.histplot(iris['petal_length'], kde=True, color="green")

plt.title("Petal Length Histogram")

plt.suptitle("Histograms")

plt.show()

# c) Three-dimensional plotting

fig = plt.figure(figsize=(10, 8))

ax = fig.add_subplot(111, projection='3d')

colors = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}

ax.scatter(iris['sepal_length'], iris['petal_length'], iris['petal_width'], c=iris['species'].map(colors))

ax.set_xlabel('Sepal Length')

ax.set_ylabel('Petal Length')

ax.set_zlabel('Petal Width')
ax.set_title('Three-dimensional Plot')

plt.show()

Output :
Ex no : 12

Apply and explore various plotting functions on Pima Indians Diabetes data set for
performing the following:

a) Normal values
b) Density and contour plots
c) Three-dimensional plotting
Program :

import seaborn as sns

import matplotlib.pyplot as plt

from mpl_toolkits.mplot3d import Axes3D

import pandas as pd

# Load the Pima Indians Diabetes dataset (replace 'path/to/diabetes.csv' with the actual path)

diabetes_path = "C:/Users/Student/Downloads/diabetes.csv" # Replace with the actual path

diabetes_df = pd.read_csv(diabetes_path)

# a) Normal Values Plot

sns.set(style="whitegrid")

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))

# Plot the kernel density estimate of a single variable in each panel

sns.kdeplot(data=diabetes_df, x="Glucose", fill=True, ax=axes[0], color="skyblue")

axes[0].set_title("Kernel Density Plot - Glucose")

sns.kdeplot(data=diabetes_df, x="BMI", fill=True, ax=axes[1], color="salmon")

axes[1].set_title("Kernel Density Plot - BMI")

plt.suptitle("Normal Values Plot for Pima Indians Diabetes Dataset")

plt.tight_layout()

plt.show()

# b) Density and Contour Plots

plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)

sns.kdeplot(data=diabetes_df, x="Glucose", y="BMI", fill=True, cmap="viridis", thresh=0.15)

plt.title("Density Plot for Glucose and BMI")

plt.subplot(1, 2, 2)

sns.kdeplot(data=diabetes_df, x="Insulin", y="BloodPressure", fill=True, cmap="viridis", thresh=0.15)

plt.title("Density Plot for Insulin and BloodPressure")

plt.suptitle("Density and Contour Plots")

plt.show()

# c) Three-dimensional Plotting

fig = plt.figure(figsize=(10, 8))

ax = fig.add_subplot(111, projection='3d')

colors = {0: 'red', 1: 'green'} # Assuming Outcome 0 as red and Outcome 1 as green

ax.scatter(diabetes_df['Glucose'], diabetes_df['BMI'], diabetes_df['Age'],

c=diabetes_df['Outcome'].map(colors))

ax.set_xlabel('Glucose')

ax.set_ylabel('BMI')

ax.set_zlabel('Age')

ax.set_title('Three-dimensional Plot')

plt.show()

Output :
Ex no : 13

Apply and explore various plotting functions on Pima Indians Diabetes data set for
performing the following:

a) Correlation and scatter plots

b) Histograms
c) Three-dimensional plotting

Program :

import seaborn as sns

import matplotlib.pyplot as plt

from mpl_toolkits.mplot3d import Axes3D

import pandas as pd

# Load the Pima Indians Diabetes dataset (replace 'path/to/diabetes.csv' with the actual path)
diabetes_path = "C:/Users/Student/Downloads/diabetes.csv" # Replace with the actual path

diabetes_df = pd.read_csv(diabetes_path)

# a) Correlation and Scatter Plots

plt.figure(figsize=(12, 8))

correlation_matrix = diabetes_df.corr()

# Plotting the correlation matrix heatmap

sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", fmt=".2f", linewidths=0.5)

plt.title("Correlation Matrix Heatmap")

plt.show()

# Scatter plots for selected variables

sns.pairplot(diabetes_df, vars=['Glucose', 'BMI', 'Age', 'Insulin'], hue='Outcome', markers=["o", "s"],

palette="Set1")

plt.suptitle("Scatter Plots")

plt.show()

# b) Histograms

plt.figure(figsize=(12, 6))

plt.subplot(2, 2, 1)

sns.histplot(diabetes_df['Glucose'], kde=True, color="skyblue")

plt.title("Glucose Histogram")

plt.subplot(2, 2, 2)

sns.histplot(diabetes_df['BMI'], kde=True, color="salmon")

plt.title("BMI Histogram")

plt.subplot(2, 2, 3)

sns.histplot(diabetes_df['Age'], kde=True, color="green")

plt.title("Age Histogram")

plt.subplot(2, 2, 4)

sns.histplot(diabetes_df['Insulin'], kde=True, color="orange")

plt.title("Insulin Histogram")

plt.suptitle("Histograms")

plt.tight_layout()

plt.show()

# c) Three-dimensional Plotting

fig = plt.figure(figsize=(10, 8))

ax = fig.add_subplot(111, projection='3d')

colors = {0: 'red', 1: 'green'} # Assuming Outcome 0 as red and Outcome 1 as green

ax.scatter(diabetes_df['Glucose'], diabetes_df['BMI'], diabetes_df['Age'],

c=diabetes_df['Outcome'].map(colors))

ax.set_xlabel('Glucose')

ax.set_ylabel('BMI')

ax.set_zlabel('Age')

ax.set_title('Three-dimensional Plot')

plt.show()

Output :
Ex no : 14
Write a Pandas program to count number of columns of a DataFrame.
Sample Output:
Original DataFrame
col1 col2 col3
0 1 4 7
1 2 5 8
2 3 6 12
3 4 9 1
4 7 5 11
Number of columns:
3
Program :

import pandas as pd

# Create the original DataFrame

data = {

'col1': [1, 2, 3, 4, 7],

'col2': [4, 5, 6, 9, 5],

'col3': [7, 8, 12, 1, 11]
}

df = pd.DataFrame(data)

# Display the original DataFrame

print("Original DataFrame:")

print(df)

# Count the number of columns

num_columns = df.shape[1]

print("Number of columns:")

print(num_columns)

Output :

Original DataFrame:

col1 col2 col3

0 1 4 7

1 2 5 8

2 3 6 12

3 4 9 1
4 7 5 11

Number of columns:

3

Ex no : 15

Write a Pandas program to group by the first column and get second column as lists in rows

Sample data:
Original DataFrame
col1 col2
0 C1 1
1 C1 2
2 C2 3
3 C2 3
4 C2 4
5 C3 6
6 C2 5
Group on the col1:
col1
C1 [1, 2]
C2 [3, 3, 4, 5]
C3 [6]
Name: col2, dtype: object

Program :

import pandas as pd

# Create the original DataFrame

data = {

'col1': ['C1', 'C1', 'C2', 'C2', 'C2', 'C3', 'C2'],

'col2': [1, 2, 3, 3, 4, 6, 5]
}

df = pd.DataFrame(data)

# Group by the first column and aggregate the values of the second column as lists

result = df.groupby('col1')['col2'].apply(list)
print("Group on the col1:")

print(result)

Output :

Group on the col1:

col1

C1 [1, 2]

C2 [3, 3, 4, 5]

C3 [6]

Name: col2, dtype: object
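
As a variation on the program above, agg(list) gives the same grouped lists as apply(list), and reset_index turns the result back into an ordinary two-column DataFrame. A minimal sketch using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["C1", "C1", "C2", "C2", "C2", "C3", "C2"],
    "col2": [1, 2, 3, 3, 4, 6, 5],
})

# agg(list) collects the values of each group, like apply(list)
as_lists = df.groupby("col1")["col2"].agg(list)

# reset_index turns the grouped Series back into a two-column DataFrame
as_frame = as_lists.reset_index()
print(as_frame)
```

The DataFrame form is often easier to merge or export than a Series indexed by group.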

Ex no : 16

Write a Pandas program to check whether a given column is present in a DataFrame or not.
Sample data:
Original DataFrame
col1 col2 col3
0 1 4 7
1 2 5 8
2 3 6 12
3 4 9 1
4 7 5 11
Col4 is not present in DataFrame.
Col1 is present in DataFrame.

Program :

import pandas as pd

# Create the original DataFrame

data = {

'col1': [1, 2, 3, 4, 7],

'col2': [4, 5, 6, 9, 5],

'col3': [7, 8, 12, 1, 11]
}

df = pd.DataFrame(data)
# List of columns to check

columns_to_check = ['Col4', 'col1']

# Iterate over the list of columns and check if each column is present in the DataFrame

for col in columns_to_check:
    try:
        # Try to access the column
        df[col]
        print(f"{col} is present in DataFrame.")
    except KeyError:
        print(f"{col} is not present in DataFrame.")

Output :

Col4 is not present in DataFrame.

col1 is present in DataFrame.
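
The same check can be written as a membership test on df.columns, which avoids the try/except. This is a sketch of an alternative, not a replacement for the program above; the smaller DataFrame is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [4, 5], "col3": [7, 8]})

# `in` on df.columns tests for an exact, case-sensitive column label
for col in ["Col4", "col1"]:
    status = "is present" if col in df.columns else "is not present"
    print(f"{col} {status} in DataFrame.")
```

Because the match is case-sensitive, 'Col1' and 'col1' are treated as different columns.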

Ex no : 17
Create two arrays of six elements. Write a NumPy program to count the number of instances
of a value occurring in one array on the condition of another array.
Sample Output:
Original arrays:
[ 10 -10 10 -10 -10 10]
[0.85 0.45 0.9 0.8 0.12 0.6 ]
Number of instances of a value occurring in one array on the condition of another array:
3
Program :

import numpy as np

# Create two arrays

array1 = np.array([10, -10, 10, -10, -10, 10])

array2 = np.array([0.85, 0.45, 0.9, 0.8, 0.12, 0.6])

print("Original arrays:")
print(array1)

print(array2)

# Define the condition

condition = array2 > 0.5 # Condition: values in array2 greater than 0.5

# Count how many elements of array1 equal 10 where the condition on array2 holds

num_instances = np.sum((array1 == 10) & condition)

print("Number of instances of a value occurring in one array on the condition of another array:")

print(num_instances)

Output :

Original arrays:

[ 10 -10 10 -10 -10 10]

[0.85 0.45 0.9 0.8 0.12 0.6 ]

Number of instances of a value occurring in one array on the condition of another array:

3

Ex no : 18
Create a 2-dimensional array of size 2 x 3, composed of 4-byte integer elements. Write a
NumPy program to find the number of occurrences of a sequence in the said array.
Sample Output:
Original NumPy array:
[[1 2 3]
[2 1 2]]
Type: <class 'numpy.ndarray'>
Sequence: 2,3
Number of occurrences of the said sequence: 2
Program :

import numpy as np

# Create the 2D array

array = np.array([[1, 2, 3],

[2, 1, 2]], dtype=np.int32)

# Define the sequence to find

sequence = np.array([2, 3], dtype=np.int32)

# Count occurrences of the sequence

count = 0

for row in array:
    for i in range(len(row) - len(sequence) + 1):
        if np.array_equal(row[i:i+len(sequence)], sequence):
            count += 1

# Print the original array and its type

print("Original NumPy array:")

print(array)

print("Type:", type(array))

# Print the sequence and its number of occurrences

print("Sequence:", ", ".join(map(str, sequence)))

print("Number of occurrences of the said sequence:", count)

Output :

Original NumPy array:

[[1 2 3]

[2 1 2]]

Type: <class 'numpy.ndarray'>

Sequence: 2, 3

Number of occurrences of the said sequence: 1
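
The nested loops can also be expressed without explicit iteration. The sketch below is an alternative that assumes NumPy 1.20 or later, where sliding_window_view is available; it counts matches within rows only, as the loop version does.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

array = np.array([[1, 2, 3], [2, 1, 2]], dtype=np.int32)
sequence = np.array([2, 3], dtype=np.int32)

# Every length-2 window along each row; the result has shape (2, 2, 2)
windows = sliding_window_view(array, len(sequence), axis=1)

# A window matches when all of its elements equal the sequence
count = int(np.all(windows == sequence, axis=-1).sum())
print(count)  # 1
```

The windows are views on the original data, so no copies of the array are made.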

Ex no : 19
Write a NumPy program to merge three given NumPy arrays of same shape
Program :
import numpy as np

# Three NumPy arrays of the same shape

array1 = np.array([[1, 2, 3], [4, 5, 6]])

array2 = np.array([[7, 8, 9], [10, 11, 12]])

array3 = np.array([[13, 14, 15], [16, 17, 18]])

# Merge the arrays

merged_array = np.stack((array1, array2, array3))

print("Merged array:")

print(merged_array)

Output :

Merged array:

[[[ 1 2 3]

[ 4 5 6]]

[[ 7 8 9]

[10 11 12]]

[[13 14 15]

[16 17 18]]]
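
"Merge" can mean different things depending on the shape wanted. np.stack, as used above, adds a new leading axis, while np.concatenate joins along an existing one. A short sketch comparing the resulting shapes (the names a, b, and c are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = a + 6
c = a + 12

stacked = np.stack((a, b, c))              # new leading axis
rows = np.concatenate((a, b, c), axis=0)   # appended as extra rows
cols = np.concatenate((a, b, c), axis=1)   # appended as extra columns

print(stacked.shape, rows.shape, cols.shape)  # (3, 2, 3) (6, 3) (2, 9)
```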

Ex no : 20

Write a NumPy program to combine last element with first element of two given ndarray with
different shapes.

Sample Output:
Original arrays:
['PHP', 'JS', 'C++']
['Python', 'C#', 'NumPy']
After Combining:
['PHP' 'JS' 'C++Python' 'C#' 'NumPy']
Program :

import numpy as np

# Original arrays

array1 = np.array(['PHP', 'JS', 'C++'])

array2 = np.array(['Python', 'C#', 'NumPy'])

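# Join the last element of array1 with the first element of array2,

# then place the joined string between the remaining elements

joined = array1[-1] + array2[0]

combined_array = np.concatenate((array1[:-1], [joined], array2[1:]))

print("Original arrays:")

print(array1)

print(array2)

print("After Combining:")

print(combined_array)

Output :

Original arrays:

['PHP' 'JS' 'C++']

['Python' 'C#' 'NumPy']

After Combining:

['PHP' 'JS' 'C++Python' 'C#' 'NumPy']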
