Pandas and NumPy Interview Questions & Answers
1. How can you perform a groupby operation and aggregate data in a pandas DataFrame?
import pandas as pd
df = pd.DataFrame({
    'Department': ['Sales', 'Sales', 'HR', 'HR'],
    'Revenue': [1000, 1500, 800, 1200]
})
# Group rows by department, then sum the revenue within each group
grouped = df.groupby('Department')['Revenue'].sum()
print(grouped) # HR 2000, Sales 2500
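A follow-up that often comes with this question is how to compute several aggregates at once. The sketch below reuses the same sample data and uses named aggregation; the output column names (total_revenue, mean_revenue, n_rows) are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['Sales', 'Sales', 'HR', 'HR'],
    'Revenue': [1000, 1500, 800, 1200],
})

# Named aggregation: each keyword becomes an output column,
# built from a (source column, aggregation function) pair.
summary = df.groupby('Department').agg(
    total_revenue=('Revenue', 'sum'),
    mean_revenue=('Revenue', 'mean'),
    n_rows=('Revenue', 'count'),
)
print(summary)
```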
2. What is the difference between the loc[] and iloc[] selectors in pandas?
df = pd.DataFrame({'A': [10, 20, 30]}, index=['x', 'y', 'z'])
df.loc['x'] # Access by label
df.iloc[0] # Access by position
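One practical difference worth mentioning in an answer is slicing: a loc slice includes its end label, while an iloc slice excludes its stop position, as ordinary Python slicing does. A minimal sketch using the same DataFrame (the variable names by_label and by_position are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30]}, index=['x', 'y', 'z'])

# Label-based slice: both endpoints 'x' and 'y' are included
by_label = df.loc['x':'y']

# Position-based slice: the stop position 1 is excluded
by_position = df.iloc[0:1]

print(by_label)
print(by_position)
```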
3. How would you convert a column of a pandas DataFrame to a datetime type and extract
the year, month, and day?
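df = pd.DataFrame({'date_column': ['2024-01-15', '2024-02-20']}) # Sample data (assumed for illustration)
df['date'] = pd.to_datetime(df['date_column']) # Parse strings into datetime64 values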
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
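Real columns often contain values that cannot be parsed. One way to handle this, sketched here with made-up sample data, is to pass errors='coerce' so that unparseable entries become NaT instead of raising an exception.

```python
import pandas as pd

# Hypothetical sample data with one invalid entry
df = pd.DataFrame({'date_column': ['2024-01-15', 'not a date', '2023-12-31']})

# Unparseable strings become NaT rather than raising an error
df['date'] = pd.to_datetime(df['date_column'], errors='coerce')

# The .dt accessor yields NaN for NaT rows, so the column is float
df['year'] = df['date'].dt.year
print(df)
```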
4. How do you create a NumPy array with values ranging from 0 to 99 and reshape it into a
10x10 matrix?
import numpy as np
arr = np.arange(0, 100).reshape(10, 10) # 100 values, 0 through 99; the stop value is excluded
print(arr)
5. Can you explain how broadcasting works in NumPy for array operations?
a = np.array([1, 2, 3]) # Shape (3,)
b = np.array([[10], [20], [30]]) # Shape (3, 1)
result = b + a # Both operands are stretched to shape (3, 3)
print(result)
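To make the rule concrete: NumPy compares shapes from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The sketch below shows the resulting shape for the example above and one incompatible pair; the compatible flag is only an illustrative helper variable.

```python
import numpy as np

a = np.array([1, 2, 3])            # shape (3,)
b = np.array([[10], [20], [30]])   # shape (3, 1)

# (3, 1) and (3,) broadcast to (3, 3): every row of b meets every element of a
result = b + a
print(result.shape)
print(result)

# Shapes (3,) and (2,) differ in the trailing dimension and neither is 1,
# so NumPy raises a ValueError.
try:
    a + np.array([1, 2])
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```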
6. How would you find the index of the maximum value in a NumPy array?
arr = np.array([10, 50, 30])
index = np.argmax(arr)
print(index) # Output: 1
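For arrays with more than one dimension, np.argmax returns an index into the flattened array unless an axis is given. A short sketch, using a made-up 2x3 array, of how to turn that flat index back into row and column coordinates:

```python
import numpy as np

# Hypothetical 2x3 array for illustration
grid = np.array([[10, 50, 30],
                 [70, 20, 60]])

# Without an axis, argmax indexes the flattened array
flat_index = np.argmax(grid)

# Convert the flat index back to (row, column) coordinates
row, col = np.unravel_index(flat_index, grid.shape)

# With axis=1, argmax returns the column of the maximum in each row
per_row = np.argmax(grid, axis=1)

print(flat_index, (row, col), per_row)
```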
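7. What is the difference between deep copy and shallow copy in the context of NumPy
arrays? How do you create each type?
a = np.array([1, 2, 3])
alias = a # Plain assignment: no copy, just a second name for the same array
shallow = a.view() # Shallow copy (view): new array object, same underlying data
deep = a.copy() # Deep copy: independent data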
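The difference shows up once the original array is modified. The following sketch contrasts plain assignment, a view, and a copy, and uses np.shares_memory to check whether two arrays use the same buffer; the variable names alias and view are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
alias = a          # same object, no copy at all
view = a.view()    # new array object sharing a's data buffer
deep = a.copy()    # new array object with its own data

a[0] = 99

# The alias and the view see the change; the copy does not
print(alias[0], view[0], deep[0])

# Identity and memory-sharing checks
print(alias is a, view is a)
print(np.shares_memory(a, view), np.shares_memory(a, deep))
```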
8. How can you handle NaN values in a NumPy array for computations?
arr = np.array([1, 2, np.nan, 4])
mean = np.nanmean(arr)
print(mean) # Output: 2.333...
cleaned = arr[~np.isnan(arr)] # Remove NaNs
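It can help to show why the NaN-aware functions are needed and what the alternatives to dropping values look like. In this sketch, filling NaNs with the mean of the remaining values is just one possible strategy, chosen for illustration.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4])

# Ordinary reductions propagate NaN
plain_mean = np.mean(arr)

# NaN-aware reductions skip the missing value
total = np.nansum(arr)

# Replace NaNs instead of removing them, which keeps the array length
filled = np.where(np.isnan(arr), np.nanmean(arr), arr)

print(plain_mean, total, filled)
```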