Python 1

The document outlines a Python workflow for analyzing a tabular dataset. It names the UCI Heart Disease dataset, although the code loads a sample people CSV and works with name, age, and salary columns. The steps cover handling missing values, cleaning a text column, converting columns to NumPy arrays, splitting the data into training and testing sets, fitting a linear regression model, and evaluating it with mean squared error. The script ends by printing a confirmation that the model was built and evaluated.

Uploaded by

Shobha Hiremath


import pandas as pd

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Step 1: Load the dataset
# pd.read_csv() accepts a local file path or a URL.
# The URL below is a sample people CSV; replace it with your own dataset URL.
# The later steps assume the data has 'name', 'age' and 'salary' columns.
url = "https://raw.githubusercontent.com/datablist/sample-csvfiles/main/files/people/people-100.csv"
df = pd.read_csv(url)

# Step 2: Data Cleaning - Handling missing values

# Checking for missing values
print("Missing values before cleaning:\n", df.isnull().sum())

# Fill missing values in numerical columns with each column's mean.
# numeric_only=True skips text columns, which would otherwise raise a TypeError.
df.fillna(df.mean(numeric_only=True), inplace=True)

print("Missing values after cleaning:\n", df.isnull().sum())

# Step 3: String Manipulation - Clean text columns

# Convert all text in 'name' column to lowercase and remove extra spaces
df['name'] = df['name'].str.lower().str.strip()

# Step 4: Use NumPy - Convert numerical columns to NumPy arrays

# Convert 'age' and 'salary' columns into NumPy arrays and calculate basic statistics
age_array = df['age'].to_numpy()
salary_array = df['salary'].to_numpy()

# Calculate basic statistics: mean and median

age_mean = np.mean(age_array)
age_median = np.median(age_array)
salary_mean = np.mean(salary_array)
salary_median = np.median(salary_array)

print(f"Age - Mean: {age_mean}, Median: {age_median}")

print(f"Salary - Mean: {salary_mean}, Median: {salary_median}")

# Step 5: Data Splitting - Split dataset into training and testing sets
# We will predict salary based on age
X = df[['age']] # Feature (Independent variable)
y = df['salary'] # Target (Dependent variable)

# Splitting the dataset (80% training, 20% testing)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 6: Build a Model - Using Linear Regression

# Initialize the model
model = LinearRegression()

# Train the model using the training data

model.fit(X_train, y_train)

# Make predictions on the test data

y_pred = model.predict(X_test)

# Step 7: Evaluate the Model - Calculate Mean Squared Error (MSE)

mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

# Step 8: Report the results

print("\nModel successfully built and evaluated.")
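
With a single feature, the fit in Step 6 and the metric in Step 7 reduce to short closed-form expressions. The sketch below works them out by hand using only the standard library, as a way to see what the model and the error score compute. The helper names fit_line and mse and the age/salary values are made up for illustration; they do not come from the dataset or from scikit-learn.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope = covariance of x and y divided by the variance of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line passes through the point of means
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values, chosen so that salary = 1000 * age + 5000 exactly
ages = [25, 30, 35, 40, 45]
salaries = [30000, 35000, 40000, 45000, 50000]

slope, intercept = fit_line(ages, salaries)
predictions = [slope * a + intercept for a in ages]
error = mse(salaries, predictions)

print(f"Slope: {slope}, Intercept: {intercept}")
print(f"Mean Squared Error: {error}")
```

Because the made-up points lie exactly on a line, the error here is zero. On real data the residuals are non-zero, and the MSE is expressed in squared units of the target (squared salary units in this script), so taking its square root often makes it easier to interpret.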
