Python Codes


All codes are highlighted.

Press Ctrl+F and search for the key words of your question to find the code.


Copy and paste the code into Jupyter.
Edit the code as per the question requirement, for example:
(element number to access, column names, etc.)
If asked to create your own list or dataframe, edit the code and give
your own names and numbers.
Example: my_list = [1, 2, 3, "four", "five"]
Change this to my_list = [4, 8, 10, "fifty", "six"]
Or else just use ChatGPT.

LIST
# create a list

my_list = [1, 2, 3, "four", "five"]

# accessing elements
print(my_list[0]) # 1

print(my_list[3]) # "four"

# append element
my_list.append("six")

# insert element
my_list.insert(2, "two")

# replace element
my_list[4] = 5

# delete element
del my_list[1]
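
The list operations above change the list in place, so indices shift as elements are inserted or deleted. A small sketch that traces the same list after each step (values taken from the section above):

```python
my_list = [1, 2, 3, "four", "five"]

my_list.append("six")      # add to the end
print(my_list)             # [1, 2, 3, 'four', 'five', 'six']

my_list.insert(2, "two")   # insert before index 2; later items shift right
print(my_list)             # [1, 2, 'two', 3, 'four', 'five', 'six']

my_list[4] = 5             # index 4 now holds "four", which becomes 5
print(my_list)             # [1, 2, 'two', 3, 5, 'five', 'six']

del my_list[1]             # remove the item at index 1 (the value 2)
print(my_list)             # [1, 'two', 3, 5, 'five', 'six']
```
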
TUPLE
# create a tuple
my_tuple = (1, 2, 3, "four", "five")

# accessing elements
print(my_tuple[0]) # 1

print(my_tuple[3]) # "four"
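
Unlike a list, a tuple cannot be changed after it is created, which is why this section has no append, insert or delete step. A minimal sketch of what happens on assignment, and one way to build a modified copy (the name `new_tuple` is only illustrative):

```python
my_tuple = (1, 2, 3, "four", "five")

try:
    my_tuple[0] = 100          # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# To "change" a tuple, build a new one from slices of the old one
new_tuple = my_tuple[:3] + (4, 5)
print(new_tuple)               # (1, 2, 3, 4, 5)
```
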

STRING
# create a string
my_string = "hello world"

# accessing elements
print(my_string[0]) # "h"

print(my_string[6]) # "w"

# replace element
my_string = my_string.replace("world", "python")
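
Strings are also immutable: `replace()` returns a new string and leaves the original untouched unless the result is assigned back, as done above. A short sketch (the names `replaced` and `first_word` are illustrative):

```python
my_string = "hello world"

replaced = my_string.replace("world", "python")  # returns a new string
print(my_string)       # hello world (unchanged)
print(replaced)        # hello python

first_word = my_string[0:5]   # slicing: characters at index 0 to 4
print(first_word)      # hello
```
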

DICTIONARY
# create a dictionary
my_dict = {"name": "John", "age": 30, "city": "New York"}

# accessing elements
print(my_dict["name"]) # "John"

print(my_dict["age"]) # 30
# add element
my_dict["country"] = "USA"

# delete element
del my_dict["city"]
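
Accessing a key that does not exist with square brackets raises a `KeyError`. If the question does not guarantee the key is present, `get()` is a safer alternative. A minimal sketch using the same dictionary (the key "email" is a made-up example of a missing key):

```python
my_dict = {"name": "John", "age": 30, "city": "New York"}

print(my_dict.get("city"))              # New York
print(my_dict.get("email"))             # None, because the key is missing
print(my_dict.get("email", "unknown"))  # unknown (fallback value)

removed = my_dict.pop("city")           # delete a key and keep its value
print(removed)                          # New York
print("city" in my_dict)                # False
```
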

PANDAS SERIES (PDSERIES)


import pandas as pd

# create a Pandas Series


my_series = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# accessing elements
print(my_series["a"]) # 1

print(my_series["d"]) # 4

# add element
my_series["f"] = 6

# delete element
my_series = my_series.drop("b")

DATAFRAMES
import pandas as pd

# create a Pandas DataFrame


my_data = {"name": ["John", "Alice", "Bob"],
"age": [30, 25, 35],
"country": ["USA", "Canada", "UK"]}

df = pd.DataFrame(my_data)

# accessing rows
print(df.loc[0]) # first row

# accessing columns
print(df["name"]) # name column
# add row (DataFrame.append was removed in pandas 2.0; use pd.concat)
new_row = {"name": "Mary", "age": 28, "country": "Australia"}

df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

# delete row
df = df.drop(1)
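
After `drop(1)` the row labels are no longer consecutive, so `loc` (by label) and `iloc` (by position) can give different rows. A small sketch with a shortened version of the data above:

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alice", "Bob"],
                   "age": [30, 25, 35]})
df = df.drop(1)                 # removes the row labelled 1 (Alice)

print(df.index.tolist())        # [0, 2]
print(df.iloc[1]["name"])       # Bob: second row by position
print(df.loc[2, "name"])        # Bob: row with label 2

df = df.reset_index(drop=True)  # renumber the rows 0, 1, ...
print(df.index.tolist())        # [0, 1]
```
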

IMPORT CSV FILE


import pandas as pd

df = pd.read_csv("my_data.csv")

DATA CLEANING
import pandas as pd

# read csv file


df = pd.read_csv("my_data.csv")

# drop rows with missing values


df = df.dropna()

# replace missing values with a specific value


df = df.fillna(0)

# replace values based on a condition


df.loc[df["age"] > 30, "age"] = 40

# remove duplicates
df = df.drop_duplicates()

# change data type of a column


df["age"] = df["age"].astype(float)

# rename column
df = df.rename(columns={"name": "full_name"})
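
Note that `dropna()` and `fillna()` are usually alternatives: once rows with missing values are dropped, there is nothing left to fill. A minimal sketch on a small made-up data frame (the values stand in for my_data.csv and are only an assumption):

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for my_data.csv
df = pd.DataFrame({"name": ["John", "Alice", "Alice", "Bob"],
                   "age": [30, 25, 25, np.nan]})

print(df.isnull().sum())        # missing values per column

dropped = df.dropna()           # the row with a missing age is removed
filled = df.fillna(0)           # the missing age is replaced by 0
deduped = df.drop_duplicates()  # the second Alice row is removed

print(len(dropped), len(filled), len(deduped))  # 3 4 3
```
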

DATA MANIPULATION
import pandas as pd
# read csv files
df1 = pd.read_csv("my_data1.csv")

df2 = pd.read_csv("my_data2.csv")

# merge two dataframes


df = pd.merge(df1, df2, on="id")

# filter rows based on a condition


df = df[df["age"] > 30]

# group by a column and calculate mean of another column


df_grouped = df.groupby("country")["age"].mean()

# sort dataframe by a column


df = df.sort_values("age")

# apply a function to a column


df["age_squared"] = df["age"].apply(lambda x: x**2)
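
By default `pd.merge()` keeps only the ids found in both tables (an inner join). The `how` argument changes that. A short sketch with two made-up tables in place of the CSV files:

```python
import pandas as pd

# Hypothetical tables standing in for my_data1.csv and my_data2.csv
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["John", "Alice", "Bob"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "age": [25, 35, 28]})

inner = pd.merge(df1, df2, on="id")             # default how="inner"
left = pd.merge(df1, df2, on="id", how="left")  # keep every row of df1

print(inner["id"].tolist())         # [2, 3]
print(left["id"].tolist())          # [1, 2, 3]
print(left["age"].isnull().sum())   # 1 (id 1 has no match in df2)
```
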

FREQUENCY TABLE
import pandas as pd

# read csv file


df = pd.read_csv("my_data.csv")

# create frequency table


freq_table = df["age"].value_counts()
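
If the question asks for proportions instead of counts, `value_counts()` accepts `normalize=True`. A minimal sketch on made-up ages:

```python
import pandas as pd

# Hypothetical ages, used only for illustration
ages = pd.Series([25, 30, 25, 35, 30, 25], name="age")

counts = ages.value_counts()                 # sorted by count, largest first
shares = ages.value_counts(normalize=True)   # fractions that sum to 1

print(counts.loc[25])   # 3
print(shares.loc[25])   # 0.5
```
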

CROSS TABLE
import pandas as pd

# read csv file


df = pd.read_csv("my_data.csv")

# create cross table


cross_table = pd.crosstab(df["gender"], df["age_group"])

DESCRIPTIVE STATISTICS
import pandas as pd

# read csv file


df = pd.read_csv("my_data.csv")

# compute descriptive statistics


stats = df["age"].describe()

DATA VISUALIZATION
import pandas as pd

import matplotlib.pyplot as plt

# read csv file

df = pd.read_csv("my_data.csv")

# create histogram
plt.hist(df["age"], bins=10)
plt.xlabel("Age")
plt.ylabel("Frequency")
plt.show()

Example question:
1. Construct a data frame from a dictionary with default
Python index.
```python
import pandas as pd

data = {'name': ['John', 'Alice', 'Bob', 'Jane'],
        'age': [32, 25, 45, 19],
        'gender': ['M', 'F', 'M', 'F']}

df = pd.DataFrame(data)

print(df)
```

Output:
```
    name  age gender
0   John   32      M
1  Alice   25      F
2    Bob   45      M
3   Jane   19      F
```

2. Construct a series from a dictionary with default
Python index.
```python
data = {'John': 32, 'Alice': 25, 'Bob': 45, 'Jane': 19}

s = pd.Series(data)

print(s)
```

Output:
```
John     32
Alice    25
Bob      45
Jane     19
dtype: int64
```
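
Note that when a series is built from a dictionary, the keys become the index, as the output above shows. If a 0, 1, 2, ... index is what the question means by "default", one possible way is to pass only the values (the names `s_keys` and `s_default` are illustrative):

```python
import pandas as pd

data = {'John': 32, 'Alice': 25, 'Bob': 45, 'Jane': 19}

s_keys = pd.Series(data)                     # dictionary keys become the index
s_default = pd.Series(list(data.values()))   # values only: index 0, 1, 2, 3

print(s_keys.index.tolist())      # ['John', 'Alice', 'Bob', 'Jane']
print(s_default.index.tolist())   # [0, 1, 2, 3]
```
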

3. Construct a data frame with user defined index.


```python
data = {'name': ['John', 'Alice', 'Bob', 'Jane'],
'age': [32, 25, 45, 19],
'gender': ['M', 'F', 'M', 'F']}

df = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])

print(df)
```

Output:
```
    name  age gender
a   John   32      M
b  Alice   25      F
c    Bob   45      M
d   Jane   19      F
```

4. Import the data frame and name it as “df”


Assuming the data is in a CSV file called "cars.csv", the following code can be used to import it
as a data frame:

```python
df = pd.read_csv('cars.csv')
```

5. Access the price column from the data frame.


```python
price_column = df['price']
```

6. Write the syntax to determine the number of
missing values for all columns.
```python
missing_values_count = df.isnull().sum()

print(missing_values_count)
```

Output:
```
car_ID 0
symboling 0
CarName 0
fueltype 0
aspiration 0
doornumber 0
carbody 0
drivewheel 0
enginelocation 1
wheelbase 0
carlength 0
carwidth 0
carheight 0
curbweight 0
enginetype 0
cylindernumber 0
enginesize 0
fuelsystem 0
boreratio 0
stroke 0
compressionratio 0
horsepower 0
peakrpm 0
citympg 0
highwaympg 0
price 0
dtype: int64
```
7. If you find any missing values for numerical
variables, replace with its mean.

```python
# numeric_only=True skips text columns, which have no mean
df.fillna(df.mean(numeric_only=True), inplace=True)
```
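
An alternative sketch that selects the numerical columns explicitly before filling, so text columns are left alone. The two-column data frame here is made up and only stands in for the cars data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one numerical and one text column, each with a gap
df = pd.DataFrame({"price": [10.0, np.nan, 30.0],
                   "fueltype": ["gas", None, "diesel"]})

numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

print(df["price"].tolist())            # [10.0, 20.0, 30.0]
print(df["fueltype"].isnull().sum())   # 1 (text column is not filled)
```
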

8. There is a missing value in “enginelocation”
variable, replace it with “front” category.
```python
# assign the result back; fillna(inplace=True) on a selected column may not update df
df['enginelocation'] = df['enginelocation'].fillna('front')
```

9. Construct a frequency table for “carbody” and
interpret.
```python
frequency_table = df['carbody'].value_counts()

print(frequency_table)
```

Output:
```
sedan 96
hatchback
```

10. To construct a cross table between “carbody”
and “enginelocation” and express the figures in
percentages by rows, we can use the pandas
`crosstab()` function with the argument
`normalize='index'` and multiply the result by 100.
```python
import pandas as pd

# assuming 'df' is the name of the data frame with the relevant columns
# normalize='index' returns row proportions (0 to 1); multiply by 100 for percentages
cross_tab = pd.crosstab(df['carbody'], df['enginelocation'], normalize='index') * 100

print(cross_tab)
```

This will give us a table with the percentage of each engine location for each car body type, with each row summing to 100.
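
A small self-contained sketch of row percentages, using a made-up six-row data frame in place of the cars data so the figures can be checked by hand:

```python
import pandas as pd

# Hypothetical rows, for illustration only
df = pd.DataFrame({
    "carbody": ["sedan", "sedan", "sedan", "sedan", "hatchback", "hatchback"],
    "enginelocation": ["front", "front", "front", "rear", "front", "front"],
})

row_pct = pd.crosstab(df["carbody"], df["enginelocation"],
                      normalize="index") * 100

print(row_pct.round(1))
# sedan: 75.0 front, 25.0 rear; hatchback: 100.0 front, 0.0 rear
```
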

11. To determine the average price of sedan cars
whose drive wheel is “rwd” and “fwd”, we can
filter the rows with boolean conditions, first on
`carbody` and then on `drivewheel`, and calculate
the mean of the price column for each subset.

```python
sedan_df = df[df['carbody'] == 'sedan']

rwd_mean_price = sedan_df[sedan_df['drivewheel'] == 'rwd']['price'].mean()


fwd_mean_price = sedan_df[sedan_df['drivewheel'] == 'fwd']['price'].mean()

print("Average price for sedan cars with rwd drive wheel:", rwd_mean_price)
print("Average price for sedan cars with fwd drive wheel:", fwd_mean_price)
```
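
The same averages can also be obtained in one step with `groupby()`. A sketch on a made-up five-row data frame (the prices are invented for illustration):

```python
import pandas as pd

# Hypothetical rows standing in for the cars data
df = pd.DataFrame({
    "carbody": ["sedan", "sedan", "sedan", "sedan", "wagon"],
    "drivewheel": ["rwd", "rwd", "fwd", "fwd", "fwd"],
    "price": [20000, 30000, 8000, 10000, 12000],
})

sedan_df = df[df["carbody"] == "sedan"]
mean_by_wheel = sedan_df.groupby("drivewheel")["price"].mean()

print(mean_by_wheel.loc["rwd"])   # 25000.0
print(mean_by_wheel.loc["fwd"])   # 9000.0
```
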

12. To compute descriptive statistics for `carlength`,
`wheelbase`, `citympg`, `highwaympg`, and `price`
by “carbody”, we can use the pandas `groupby()`
function to group the data by `carbody` and then
use the `describe()` function to get the summary
statistics for each column.
```python
grouped_by_carbody = df.groupby('carbody')[['carlength', 'wheelbase', 'citympg',
'highwaympg', 'price']]

description_by_carbody = grouped_by_carbody.describe()

print(description_by_carbody)
```

This will give us the summary statistics for each column, grouped by car body type.

13. To construct a bar chart for “enginelocation”, we
can use the pandas `value_counts()` function to get
the count of each engine location and then use the
`plot()` function with the argument `kind='bar'` to
create a bar chart.

```python
import matplotlib.pyplot as plt

engine_loc_counts = df['enginelocation'].value_counts()

engine_loc_counts.plot(kind='bar')
plt.title('Engine Location')
plt.xlabel('Location')
plt.ylabel('Count')
plt.show()
```

This will give us a bar chart with the count of each engine location.

14. To construct a boxplot for price by cylinder
number, we can use the pandas `boxplot()` function
with the relevant columns.
```python
df.boxplot(column='price', by='cylindernumber')
plt.title('Price by Cylinder Number')
plt.show()
```

This will give us a boxplot of the price column grouped by cylinder number.

15. To construct a scatter plot between price as
dependent variable and horsepower as
independent variable, we can use the `scatter()`
function from matplotlib.
```python
plt.scatter(df['horsepower'], df['price'])
plt.title('Price vs. Horsepower')
plt.xlabel('Horsepower')
plt.ylabel('Price')
plt.show()
```

This will give us a scatter plot of price against horsepower.

1. Here's a program to print the value 20 from the
given tuple1:
```python
tuple1 = ("Orange", [10, 20, 30], (5, 15, 25))

# Access the second element of tuple1, which is a list,
# and then access the second element of the list
print(tuple1[1][1])
```

Output:
```
20
```

2. Here's a program to access elements 44 and 55
from the given tuple2:
```python
tuple2 = (11, 22, 33, 44, 55, 66)

# Access the fourth and fifth elements of tuple2


element1 = tuple2[3]

element2 = tuple2[4]

print(element1, element2)
```

Output:
```
44 55
```

3. Here's a program to create a 5 x 2 array from a
range between 100 and 200 with a step of 10:
```python
import numpy as np

array = np.arange(100, 200, 10).reshape(5, 2)

print(array)
```

Output:
```
[[100 110]
 [120 130]
 [140 150]
 [160 170]
 [180 190]]
```
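
`np.arange()` stops before the end value, so 200 itself is not included and exactly 10 values are produced, which is what lets them fit a 5 x 2 shape. A short check (the name `values` is illustrative):

```python
import numpy as np

values = np.arange(100, 200, 10)   # 100, 110, ..., 190 (200 is excluded)
print(values.size)                 # 10
print(values[-1])                  # 190

array = values.reshape(5, -1)      # -1 lets NumPy work out the column count
print(array.shape)                 # (5, 2)
```
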

4. Here's a program to return an array of items from
the third column of Array1:
```python
import numpy as np

Array1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# Access the third column of Array1 using slicing


third_column = Array1[:, 2]

print(third_column)
```

Output:
```
[30 60 90]
```

5. Here's a program to delete the second column
from Array2:
```python
import numpy as np

Array2 = np.array([[34, 43, 73], [82, 22, 12], [53, 94, 66]])

# Delete the second column (index 1) of Array2 using np.delete


Array2 = np.delete(Array2, 1, axis=1)
print(Array2)
```

Output:
```
[[34 73]
 [82 12]
 [53 66]]
```
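
`np.delete()` returns a new array and leaves the original unchanged unless the result is assigned back. The same result can also be reached by selecting the columns to keep. A sketch comparing the two (the names `kept` and `deleted` are illustrative):

```python
import numpy as np

Array2 = np.array([[34, 43, 73], [82, 22, 12], [53, 94, 66]])

deleted = np.delete(Array2, 1, axis=1)   # drop column index 1
kept = Array2[:, [0, 2]]                 # keep column indexes 0 and 2

print(np.array_equal(kept, deleted))     # True
print(Array2.shape)                      # (3, 3): the original is unchanged
```
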

6. Here's a program to create two 2-D arrays and
concatenate them:
```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])

array2 = np.array([[5, 6], [7, 8]])

# Concatenate the two arrays vertically using vstack


concatenated_array = np.vstack((array1, array2))

print(concatenated_array)
```

Output:
```
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
```
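
If the question asks for side-by-side concatenation instead, `np.concatenate()` with the `axis` argument covers both directions. A minimal sketch with the same two arrays:

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

vertical = np.concatenate((array1, array2), axis=0)    # same as np.vstack
horizontal = np.concatenate((array1, array2), axis=1)  # same as np.hstack

print(vertical.shape)        # (4, 2)
print(horizontal.tolist())   # [[1, 2, 5, 6], [3, 4, 7, 8]]
```
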

7. Here's a program to add an element value 65 to
List1 and an element value 72 in the index position
2:
```python
List1 = [10, 20, 30, 40]

# Add 65 to the end of List1 using append


List1.append(65)

# Insert 72 in the index position 2 using insert


List1.insert(2, 72)

print(List1)
```

Output:
```
[10, 20, 72, 30, 40, 65]
```

8. Here's a program to remove element 9 from
List2:
```python
List2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Remove the first occurrence of the value 9 using remove
List2.remove(9)

print(List2)
```
