Python Codes

All code snippets are highlighted.
Press Ctrl+F and search for the keywords of your question to find the matching code.

Copy and paste the code into Jupyter.
Edit the code to fit the question, for example:
(the element number being accessed, column names, etc.)
If the question asks you to create your own list or DataFrame, use your own names and numbers.
Example: my_list = [1, 2, 3, "four", "five"]
Change this to: my_list = [4, 8, 10, "fifty", "six"]
Otherwise, just ask ChatGPT.
LIST
# create a list
my_list = [1, 2, 3, "four", "five"]
# accessing elements
print(my_list[0]) # 1
print(my_list[3]) # "four"
# append element
my_list.append("six")
# insert element
my_list.insert(2, "two")
# replace element
my_list[4] = 5
# delete element
del my_list[1]
TUPLE
# create a tuple
my_tuple = (1, 2, 3, "four", "five")
print(my_tuple[0]) # 1
print(my_tuple[3]) # "four"
STRING
# create a string
my_string = "hello world"
print(my_string[0]) # "h"
print(my_string[6]) # "w"
# replace a substring (strings are immutable, so replace() returns a new string)
my_string = my_string.replace("world", "python")
DICTIONARY
# create a dictionary
my_dict = {"name": "John", "age": 30, "city": "New York"}
print(my_dict["name"]) # "John"
print(my_dict["age"]) # 30
# add element
my_dict["country"] = "USA"
# delete element
del my_dict["city"]
PANDAS SERIES (PDSERIES)

import pandas as pd
# create a Pandas Series

my_series = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])
print(my_series["a"]) # 1
print(my_series["d"]) # 4
# add element
my_series["f"] = 6
# delete element
my_series = my_series.drop("b")
DATAFRAMES
import pandas as pd
# create a Pandas DataFrame

my_data = {"name": ["John", "Alice", "Bob"],
"age": [30, 25, 35],
"country": ["USA", "Canada", "UK"]}
df = pd.DataFrame(my_data)
# accessing rows
print(df.loc[0]) # first row
# accessing columns
print(df["name"]) # name column
# add row (DataFrame.append was removed in pandas 2.0, so use pd.concat)
new_row = {"name": "Mary", "age": 28, "country": "Australia"}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
# delete row
df = df.drop(1)
IMPORT CSV FILE

import pandas as pd
df = pd.read_csv("my_data.csv")
DATA CLEANING
import pandas as pd
# read csv file
df = pd.read_csv("my_data.csv")
# drop rows with missing values

df = df.dropna()
# replace missing values with a specific value

df = df.fillna(0)
# replace values based on a condition

df.loc[df["age"] > 30, "age"] = 40
# remove duplicates
df = df.drop_duplicates()
# change data type of a column

df["age"] = df["age"].astype(float)
# rename column
df = df.rename(columns={"name": "full_name"})
DATA MANIPULATION
import pandas as pd
# read csv files
df1 = pd.read_csv("my_data1.csv")
df2 = pd.read_csv("my_data2.csv")
# merge two dataframes

df = pd.merge(df1, df2, on="id")
# filter rows based on a condition

df = df[df["age"] > 30]
# group by a column and calculate mean of another column

df_grouped = df.groupby("country")["age"].mean()
# sort dataframe by a column

df = df.sort_values("age")
# apply a function to a column

df["age_squared"] = df["age"].apply(lambda x: x**2)
FREQUENCY TABLE
import pandas as pd
# read csv file
df = pd.read_csv("my_data.csv")
# create frequency table

freq_table = df["age"].value_counts()
CROSS TABLE
import pandas as pd
# read csv file
df = pd.read_csv("my_data.csv")
# create cross table

cross_table = pd.crosstab(df["gender"], df["age_group"])
DESCRIPTIVE STATISTICS
import pandas as pd
# read csv file
df = pd.read_csv("my_data.csv")
# compute descriptive statistics

stats = df["age"].describe()
DATA VISUALIZATION
import pandas as pd
import matplotlib.pyplot as plt
# read csv file
df = pd.read_csv("my_data.csv")
# create histogram
plt.hist(df["age"], bins=10)
plt.xlabel("Age")
plt.ylabel("Frequency")
plt.show()
Example question:
1. Construct a data frame from a dictionary with default Python index.
```python
import pandas as pd
data = {'name': ['John', 'Alice', 'Bob', 'Jane'],
        'age': [32, 25, 45, 19],
        'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)
print(df)
```
Output:
```
    name  age gender
0   John   32      M
1  Alice   25      F
2    Bob   45      M
3   Jane   19      F
```
2. Construct a series from a dictionary with default Python index.
```python
data = {'John': 32, 'Alice': 25, 'Bob': 45, 'Jane': 19}
s = pd.Series(data)
print(s)
```
Output:
```
John 32
Alice 25
Bob 45
Jane 19
dtype: int64
```
3. Construct a data frame with user defined index.

```python
data = {'name': ['John', 'Alice', 'Bob', 'Jane'],
'age': [32, 25, 45, 19],
'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])
print(df)
```
Output:
```
name age gender
a John 32 M
b Alice 25 F
c Bob 45 M
d Jane 19 F
```
4. Import the data frame and name it as “df”

Assuming the data is in a CSV file called "cars.csv", the following code can be used to import it
as a data frame:
```python
df = pd.read_csv('cars.csv')
```
5. Access the price column from the data frame.

```python
price_column = df['price']
```
6. Write the syntax to determine the number of missing values for all columns.
```python
missing_values_count = df.isnull().sum()
print(missing_values_count)
```
Output:
```
car_ID 0
symboling 0
CarName 0
fueltype 0
aspiration 0
doornumber 0
carbody 0
drivewheel 0
enginelocation 1
wheelbase 0
carlength 0
carwidth 0
carheight 0
curbweight 0
enginetype 0
cylindernumber 0
enginesize 0
fuelsystem 0
boreratio 0
stroke 0
compressionratio 0
horsepower 0
peakrpm 0
citympg 0
highwaympg 0
price 0
dtype: int64
```
7. If you find any missing values for numerical
variables, replace with its mean.
```python
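# numeric_only=True skips text columns, which df.mean() cannot average in pandas 2.x
df.fillna(df.mean(numeric_only=True), inplace=True)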
```
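As a quick check of the mean-imputation step, here is a minimal sketch on a small made-up data frame. The column values are illustrative only and are not taken from cars.csv; the sketch assumes a recent pandas version, where `numeric_only=True` restricts the mean to numeric columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one numeric column with a gap, one text column
sample = pd.DataFrame({
    "horsepower": [100.0, np.nan, 140.0],
    "fueltype": ["gas", "diesel", None],
})

# Mean of numeric columns only: horsepower -> (100 + 140) / 2 = 120
numeric_means = sample.mean(numeric_only=True)

# fillna with a Series matches on column name,
# so only the numeric column is filled here
filled = sample.fillna(numeric_means)
print(filled)
```

The text column keeps its missing value, which is why categorical columns need a separate fill step.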
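8. There is a missing value in the “enginelocation” variable; replace it with the “front” category.
```python
# assign the result back instead of calling fillna(inplace=True) on a selected column
df['enginelocation'] = df['enginelocation'].fillna('front')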
```
9. Construct a frequency table for “carbody” and interpret.
```python
# Series.value_counts() replaces the deprecated top-level pd.value_counts()
frequency_table = df['carbody'].value_counts()
print(frequency_table)
```
Output:
```
sedan 96
hatchback
```
10. To construct a cross table between “carbody” and “enginelocation” and express the figures in percentages by rows, we can use the pandas `crosstab()` function with the argument `normalize='index'`.
```python
import pandas as pd
# assuming 'df' is the name of the data frame with the relevant columns
cross_tab = pd.crosstab(df['carbody'], df['enginelocation'], normalize='index')
print(cross_tab)
```
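This gives a table with the share of each engine location within each car body type. Because `normalize='index'` returns proportions between 0 and 1, multiply the result by 100 to express the figures as percentages.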
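As a minimal sketch of row percentages, the block below uses a small made-up data frame (not the real car data). `normalize='index'` yields row proportions, so the sketch multiplies by 100 and rounds; the names `sample`, `row_share` and `row_pct` are illustrative.

```python
import pandas as pd

# Hypothetical sample data standing in for the car data set
sample = pd.DataFrame({
    "carbody": ["sedan", "sedan", "sedan", "sedan", "hatchback", "hatchback"],
    "enginelocation": ["front", "front", "front", "rear", "front", "front"],
})

# normalize="index" divides each row by its row total (values between 0 and 1)
row_share = pd.crosstab(sample["carbody"], sample["enginelocation"], normalize="index")

# Multiply by 100 and round to express the shares as row percentages
row_pct = (row_share * 100).round(1)
print(row_pct)
```

Each row of `row_pct` adds up to 100, which is a handy sanity check before interpreting the table.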
11. To determine the average price of sedan cars whose drive wheel is “rwd” and “fwd”, we can first filter the data to sedans, then filter by drive wheel and calculate the mean of the price column for each group.
```python
sedan_df = df[df['carbody'] == 'sedan']
rwd_mean_price = sedan_df[sedan_df['drivewheel'] == 'rwd']['price'].mean()

fwd_mean_price = sedan_df[sedan_df['drivewheel'] == 'fwd']['price'].mean()
print("Average price for sedan cars with rwd drive wheel:", rwd_mean_price)
print("Average price for sedan cars with fwd drive wheel:", fwd_mean_price)
```
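The same averages can also be computed in one step with `groupby()`. This is a sketch on made-up prices, not the real data set; the frame `sample` and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data with the three relevant columns
sample = pd.DataFrame({
    "carbody": ["sedan", "sedan", "sedan", "sedan", "wagon"],
    "drivewheel": ["rwd", "rwd", "fwd", "fwd", "fwd"],
    "price": [20000.0, 30000.0, 8000.0, 10000.0, 12000.0],
})

# Keep only sedans, then average price within each drive wheel type
sedans = sample[sample["carbody"] == "sedan"]
mean_by_wheel = sedans.groupby("drivewheel")["price"].mean()

# Select just the two drive wheel types the question asks about
print(mean_by_wheel.loc[["rwd", "fwd"]])
```

Grouping avoids repeating the filter for every drive wheel value and scales to more categories without extra code.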
12. To describe various descriptives for `carlength`, `wheelbase`, `citympg`, `highwaympg`, and `price` by “carbody”, we can use the pandas `groupby()` function to group the data by `carbody` and then use the `describe()` function to get the summary statistics for each column.
```python
grouped_by_carbody = df.groupby('carbody')[['carlength', 'wheelbase', 'citympg',
'highwaympg', 'price']]
description_by_carbody = grouped_by_carbody.describe()
print(description_by_carbody)
```
This will give us the summary statistics for each column, grouped by car body type.
13. To construct a bar chart for “enginelocation”, we can use the pandas `value_counts()` function to get the count of each engine location and then use the `plot()` function with the argument `kind='bar'` to create a bar chart.
```python
import matplotlib.pyplot as plt
engine_loc_counts = df['enginelocation'].value_counts()
engine_loc_counts.plot(kind='bar')
plt.title('Engine Location')
plt.xlabel('Location')
plt.ylabel('Count')
plt.show()
```
This will give us a bar chart with the count of each engine location.
14. To construct a boxplot for price by cylinder number, we can use the pandas `boxplot()` function with the relevant columns.
```python
df.boxplot(column='price', by='cylindernumber')
plt.title('Price by Cylinder Number')
plt.show()
```
This will give us a boxplot of the price column grouped by cylinder number.
15. To construct a scatter plot between price as dependent variable and horsepower as independent variable, we can use the `scatter()` function from matplotlib.
```python
plt.scatter(df['horsepower'], df['price'])
plt.title('Price vs. Horsepower')
plt.xlabel('Horsepower')
plt.ylabel('Price')
plt.show()
```
This will give us a scatter plot of price against horsepower.
1. Here's a program to print the value 20 from the given tuple1:
```python
tuple1 = ("Orange", [10, 20, 30], (5, 15, 25))
# Access the second element of tuple1, which is a list, and then access the second element of the list
print(tuple1[1][1])
```
Output:
```
20
```
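The chained index `tuple1[1][1]` can be read as two separate steps. The sketch below spells them out; the intermediate names `inner_list`, `value` and `nested_value` are illustrative.

```python
tuple1 = ("Orange", [10, 20, 30], (5, 15, 25))

# Step 1: index 1 of the outer tuple is the list [10, 20, 30]
inner_list = tuple1[1]

# Step 2: index 1 of that list is the value 20
value = inner_list[1]

# The same pattern reaches into the nested tuple at index 2
nested_value = tuple1[2][1]
print(value, nested_value)
```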
2. Here's a program to access elements 44 and 55 from the given tuple2:
```python
tuple2 = (11, 22, 33, 44, 55, 66)
# Access the fourth and fifth elements of tuple2

element1 = tuple2[3]
element2 = tuple2[4]
print(element1, element2)
```
Output:
```
44 55
```
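Because 44 and 55 sit next to each other, a slice is an alternative to two separate index lookups. A minimal sketch, with `pair` as an illustrative name:

```python
tuple2 = (11, 22, 33, 44, 55, 66)

# Slice from index 3 up to (but not including) index 5
pair = tuple2[3:5]
print(pair)

# Tuple unpacking assigns the two values to separate names
element1, element2 = pair
print(element1, element2)
```

Slicing a tuple returns a new tuple, so `pair` is itself a tuple of two elements.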
3. Here's a program to create a 5 x 2 array from the range 100 to 200 with a step of 10:
```python
import numpy as np
array = np.arange(100, 200, 10).reshape(5, 2)
print(array)
```
Output:
```
[[100 110]
 [120 130]
 [140 150]
 [160 170]
 [180 190]]
```
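The reshape only works because `np.arange` excludes the stop value, leaving exactly 10 numbers for the 5 x 2 grid. The sketch below checks this and uses `-1` so NumPy infers the row count; the name `values` is illustrative.

```python
import numpy as np

# The stop value 200 is excluded, so the last element is 190
values = np.arange(100, 200, 10)
print(values.size)

# -1 lets NumPy infer the number of rows from the array size
array = values.reshape(-1, 2)
print(array.shape)
```

If the stop value were included there would be 11 numbers, and reshaping to two columns would raise an error.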
4. Here's a program to return an array of items from the third column of Array1:
```python
import numpy as np
Array1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
# Access the third column of Array1 using slicing

third_column = Array1[:, 2]
print(third_column)
```
Output:
```
[30 60 90]
```
5. Here's a program to delete the second column from Array2:
```python
import numpy as np
Array2 = np.array([[34, 43, 73], [82, 22, 12], [53, 94, 66]])
# Delete the second column (index 1) of Array2 using np.delete with axis=1

Array2 = np.delete(Array2, 1, axis=1)
print(Array2)
```
Output:
```
[[34 73]
 [82 12]
 [53 66]]
```
6. Here's a program to create two 2-D arrays and concatenate them:
```python
import numpy as np
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])
# Concatenate the two arrays vertically using vstack

concatenated_array = np.vstack((array1, array2))
print(concatenated_array)
```
Output:
```
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
```
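Vertical stacking is one of two ways to join these arrays. As a sketch, `np.concatenate` with an explicit `axis` covers both directions; the names `stacked_rows` and `stacked_cols` are illustrative.

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

# axis=0 stacks rows (same result as np.vstack)
stacked_rows = np.concatenate((array1, array2), axis=0)

# axis=1 joins columns side by side (same result as np.hstack)
stacked_cols = np.concatenate((array1, array2), axis=1)

print(stacked_rows.shape)
print(stacked_cols)
```

If the question does not say which direction to concatenate, stating the axis in the answer makes the choice explicit.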
7. Here's a program to add an element value 65 to List1 and an element value 72 in the index position 2:
```python
List1 = [10, 20, 30, 40]
# Add 65 to the end of List1 using append

List1.append(65)
# Insert 72 in the index position 2 using insert

List1.insert(2, 72)
print(List1)
```
Output:
```
[10, 20, 72, 30, 40, 65]
```
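To see what each call does on its own, the sketch below runs the two operations in the opposite order and prints the list after each step. The final list is the same, since `append()` always adds at the end and `insert()` shifts later elements to the right.

```python
List1 = [10, 20, 30, 40]

# insert() places 72 at index 2 and shifts 30 and 40 one position right
List1.insert(2, 72)
print(List1)

# append() always adds to the end, whatever was inserted earlier
List1.append(65)
print(List1)
```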
8. Here's a program to remove element 9 from List2:
```python
List2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
List2.remove(9)  # remove() deletes the first occurrence of the value 9
```
