Day 2 Python Interview QnA

Uploaded by

spandushetty28

### Basic Python Questions

1. **What is Python?**
- Python is a high-level, interpreted programming language known for its readability and
simplicity. It's widely used in various fields, including data analysis.

2. How do you install Python?

- You can install Python from the official Python website or use package managers like `apt`,
`brew`, or `conda`.

3. What are lists and tuples in Python?

- Lists are mutable, ordered collections of items. Tuples are immutable, ordered collections.
Lists use square brackets (`[]`), while tuples use parentheses (`()`).
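
A minimal sketch of the mutability difference described above; the variable names are illustrative only:

```python
# Lists are mutable: items can be changed, added, or removed in place.
scores = [70, 85, 90]
scores[0] = 75        # replace an item
scores.append(100)    # grow the list

# Tuples are immutable: item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_modified = True
except TypeError:
    tuple_modified = False

print(scores)          # [75, 85, 90, 100]
print(tuple_modified)  # False
```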

4. What are dictionaries in Python?

- Dictionaries are mutable collections of key-value pairs with unique, hashable keys. Since
Python 3.7 they preserve insertion order. They are defined using curly braces (`{}`).
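
For illustration, a short sketch of common dictionary operations; the `stock` data is made up for this example:

```python
# Hypothetical inventory data, used only for illustration.
stock = {"apples": 3, "pears": 5}

stock["plums"] = 7             # add a new key-value pair
stock["apples"] += 1           # update an existing value
removed = stock.pop("pears")   # remove a key and return its value

print(stock)                  # {'apples': 4, 'plums': 7}
print(stock.get("kiwis", 0))  # 0 (default when the key is missing)
```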

5. How do you handle exceptions in Python?

- Use the `try` and `except` blocks to catch and handle exceptions. Optionally, you can use
`finally` for cleanup actions.
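
A small sketch of `try`, `except`, and `finally` working together. The `safe_divide` helper and the `calls` log are hypothetical names introduced for this example:

```python
calls = []  # records every attempted division

def safe_divide(a, b):
    """Return a / b, or None when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        # Handle only the error we expect.
        return None
    finally:
        # Runs whether or not an exception occurred.
        calls.append((a, b))

print(safe_divide(10, 2))  # 5.0
print(safe_divide(1, 0))   # None
```

Catching a specific exception type, rather than a bare `except`, keeps unrelated errors visible.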

### Data Manipulation Questions

6. **What is NumPy?**
- NumPy is a Python library for numerical computations, providing support for arrays,
matrices, and a wide range of mathematical functions.

7. How do you create a NumPy array?

- Use `numpy.array()`, `numpy.zeros()`, or `numpy.ones()` functions to create arrays.

8. What are the advantages of using Pandas?

- Pandas is excellent for data manipulation and analysis, providing DataFrame structures,
handling missing data, and easy data filtering.

9. How do you read a CSV file in Pandas?

- Use `pandas.read_csv('filename.csv')` to read a CSV file into a DataFrame.

10. How do you handle missing data in Pandas?

- Use `DataFrame.dropna()` to remove missing values or `DataFrame.fillna(value)` to replace
them with a specified value.

### Data Analysis Questions

11. **What is data wrangling?**
- Data wrangling is the process of cleaning and transforming raw data into a format suitable
for analysis.

12. What is the difference between a Series and a DataFrame in Pandas?

- A Series is a one-dimensional labeled array, while a DataFrame is a two-dimensional
labeled data structure with columns that can be of different types.

13. How do you group data in Pandas?

- Use the `groupby()` method to group data based on specific columns.

14. What is a pivot table in Pandas?

- A pivot table is a data summarization tool that aggregates data based on one or more keys.
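
As a rough sketch, `pd.pivot_table` can summarize a small table like this. The `sales` data and column names are assumptions made for the example:

```python
import pandas as pd

# Made-up sales records for illustration.
sales = pd.DataFrame({
    "region": ["N", "N", "S", "S"],
    "product": ["a", "b", "a", "b"],
    "amount": [10, 20, 30, 40],
})

# One row per region, one column per product, cells hold summed amounts.
table = pd.pivot_table(
    sales, index="region", columns="product", values="amount", aggfunc="sum"
)
print(table)
```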

15. How do you merge two DataFrames in Pandas?

- Use `pd.merge(df1, df2, on='key_column')` to merge two DataFrames based on a common
column.

### Statistical Analysis Questions

16. What is the purpose of the `describe()` method in Pandas?

- The `describe()` method provides summary statistics of the DataFrame, including count,
mean, std, min, and quantiles.

17. How do you calculate correlation in Pandas?

- Use the `DataFrame.corr()` method to compute pairwise correlation of columns.
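
A brief sketch with made-up columns: `y` rises with `x` and `z` falls with `x`, so the Pearson coefficients (the default method) come out at the two extremes:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8], "z": [4, 3, 2, 1]})

corr = df.corr()  # pairwise Pearson correlation by default
print(corr.round(2))
```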

18. What is hypothesis testing?

- Hypothesis testing is a statistical method for deciding, from sample data, whether there is
enough evidence to reject a null hypothesis in favor of an alternative hypothesis.

19. What are p-values?

- A p-value is the probability of observing results at least as extreme as those in the sample,
assuming the null hypothesis is true. A low p-value (commonly below 0.05) suggests that the
null hypothesis may be rejected.

20. What is linear regression?

- Linear regression is a statistical method used to model the relationship between a
dependent variable and one or more independent variables.

### Data Visualization Questions

21. **What libraries are commonly used for data visualization in Python?**
- Common libraries include Matplotlib, Seaborn, and Plotly.

22. **How do you create a simple line plot using Matplotlib?**
- Use:
```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]   # sample data
y = [2, 4, 6, 8]
plt.plot(x, y)
plt.show()
```

23. What is Seaborn?

- Seaborn is a Python data visualization library based on Matplotlib that provides a high-level
interface for drawing attractive statistical graphics.

24. How do you create a scatter plot using Seaborn?

- Use:
```python
import seaborn as sns
sns.scatterplot(data=df, x='column1', y='column2')
```

25. What is a box plot?

- A box plot is a graphical representation of the distribution of a dataset, highlighting the
median, quartiles, and potential outliers.

### Advanced Python Questions

26. What are lambda functions in Python?

- Lambda functions are small anonymous functions defined with the `lambda` keyword. They
can take any number of arguments but only have one expression.
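
For illustration, a lambda bound to a name and a lambda used as a sort key; the sample data is made up:

```python
# A lambda bound to a name behaves like a one-expression function.
square = lambda x: x * x

# Lambdas are most often passed inline, for example as a sort key.
pairs = [("b", 2), ("a", 3), ("c", 1)]
by_count = sorted(pairs, key=lambda p: p[1])

print(square(4))  # 16
print(by_count)   # [('c', 1), ('b', 2), ('a', 3)]
```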

27. What is list comprehension?

- List comprehension is a concise way to create lists in Python using a single line of code.
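
A short sketch comparing a list comprehension with the equivalent loop:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Transform every item.
squares = [n * n for n in numbers]

# Transform only the items that pass a filter.
even_squares = [n * n for n in numbers if n % 2 == 0]

# The same filter-and-transform written as an explicit loop.
loop_result = []
for n in numbers:
    if n % 2 == 0:
        loop_result.append(n * n)

print(even_squares)                 # [4, 16, 36]
print(even_squares == loop_result)  # True
```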

28. What is the purpose of the `apply()` function in Pandas?

- The `apply()` function is used to apply a function along the axis of the DataFrame or to each
element of a Series.
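
A minimal sketch of the three common uses of `apply()`; the DataFrame contents are made up for the example:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (default): the function receives each column as a Series.
col_ranges = df.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row as a Series.
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# On a Series, the function receives each element.
doubled = df["a"].apply(lambda v: v * 2)

print(col_ranges.tolist())  # [2, 20]
print(row_sums.tolist())    # [11, 22, 33]
print(doubled.tolist())     # [2, 4, 6]
```

For simple arithmetic like this, plain vectorized expressions such as `df["a"] * 2` are usually faster than `apply()`.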

29. How do you install external libraries in Python?

- Use `pip install library_name` to install external libraries.

30. **What is the difference between deep copy and shallow copy?**
- A shallow copy creates a new object but inserts references into it to the objects found in the
original. A deep copy creates a new object and recursively adds copies of nested objects found
in the original.
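
The difference shows up with nested objects. A small sketch using the standard `copy` module:

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)      # new outer list, shared inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

original[0].append(99)             # mutate a nested object

print(shallow[0])  # [1, 2, 99]  (shares the inner list with original)
print(deep[0])     # [1, 2]      (fully independent)
```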

### Data Analytics Concepts

31. What is data normalization?

- Data normalization is the process of scaling data to fit within a specific range, often [0, 1] or
[-1, 1].

32. What is feature engineering?

- Feature engineering is the process of using domain knowledge to create new features from
raw data to improve model performance.

33. What is the difference between supervised and unsupervised learning?

- Supervised learning uses labeled data to train models, while unsupervised learning finds
patterns in unlabeled data.

34. What are outliers, and how can they be detected?

- Outliers are data points that differ significantly from the rest of the data. They can be
detected using statistical methods such as Z-scores or IQR.
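
A rough sketch of Z-score detection. The data is made up, and the cutoff of 2.5 is an assumption for this example (cutoffs of 2 to 3 are common choices):

```python
import pandas as pd

# Illustrative data with one obvious extreme value.
s = pd.Series([10, 11, 9, 10, 12, 10, 11, 9, 10, 50])

# Z-score: distance from the mean in units of standard deviation.
z = (s - s.mean()) / s.std(ddof=0)

threshold = 2.5  # assumed cutoff, tune for your data
outliers = s[z.abs() > threshold]
print(outliers.tolist())  # [50]
```

In small samples a single extreme value inflates the standard deviation, so the IQR method can be more robust.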

35. What is the purpose of data validation?

- Data validation ensures that data is accurate, complete, and meets the specified criteria
before being used for analysis.

### SQL Integration Questions

36. How can you connect Python to a SQL database?

- Use libraries like `sqlite3`, `SQLAlchemy`, or `pyodbc` to connect to SQL databases.

37. What is the purpose of the `pandas.read_sql()` function?

- The `read_sql()` function is used to read SQL query results into a Pandas DataFrame.
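
A self-contained sketch using an in-memory SQLite database, so no external server is needed. The `sales` table and its columns are invented for this example:

```python
import sqlite3
import pandas as pd

# Build a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("N", 10.0), ("S", 20.0), ("N", 5.0)],
)
conn.commit()

# Run a query and load the result set into a DataFrame.
query = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"
df = pd.read_sql(query, conn)
conn.close()

print(df)
```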

38. How do you perform a SQL join in Pandas?

- Use `pd.merge(df1, df2, on='key_column', how='join_type')` to perform SQL-like joins in
Pandas.
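
To illustrate how the `how` argument changes the result, here is a sketch with two made-up tables:

```python
import pandas as pd

# Illustrative tables; customer 2 has no orders and order cust_id 4 has no customer.
customers = pd.DataFrame({"cust_id": [1, 2, 3], "name": ["Ann", "Bo", "Cy"]})
orders = pd.DataFrame({"cust_id": [1, 1, 3, 4], "total": [5.0, 7.5, 2.0, 9.0]})

inner = pd.merge(customers, orders, on="cust_id", how="inner")  # matching keys only
left = pd.merge(customers, orders, on="cust_id", how="left")    # keep all customers
outer = pd.merge(customers, orders, on="cust_id", how="outer")  # keep all keys

print(len(inner), len(left), len(outer))  # 3 4 5
```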

39. What is a primary key in a database?

- A primary key is a unique identifier for records in a database table, ensuring that no two
records can have the same value.

40. What is a foreign key?

- A foreign key is a field (or set of fields) in one table that references the primary key of
another table, establishing a relationship between the two.

### Machine Learning Questions

41. **What is the purpose of the `train_test_split()` function?**
- The `train_test_split()` function splits a dataset into training and testing sets to evaluate
model performance.

42. What is overfitting?

- Overfitting occurs when a model learns the training data too well, capturing noise and
fluctuations rather than the underlying trend.

43. What are decision trees?

- Decision trees are a type of supervised learning algorithm that splits data into branches
based on feature values to make predictions.

44. What is cross-validation?

- Cross-validation is a technique used to assess the performance of a model by dividing the
data into subsets and training/testing multiple times.
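
The splitting step can be sketched in a few lines of NumPy. `k_fold_indices` is a hypothetical helper written for this example, and it does not shuffle; in practice, library utilities such as scikit-learn's `KFold` are typically used:

```python
import numpy as np

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    indices = np.arange(n_samples)
    folds = np.array_split(indices, k)
    for i in range(k):
        test_idx = folds[i]
        # Every fold except the i-th one is used for training.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(10, 5))
for train_idx, test_idx in splits:
    print(test_idx.tolist())  # each sample appears in exactly one test fold
```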

45. What is a confusion matrix?

- A confusion matrix is a table used to evaluate the performance of a classification model by
comparing predicted and actual classifications.
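
For a binary classifier the four cells can be counted directly. A sketch with made-up labels, where 1 is the positive class:

```python
from collections import Counter

# Made-up labels for illustration.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

counts = Counter(zip(actual, predicted))
tp = counts[(1, 1)]  # true positives
fn = counts[(1, 0)]  # false negatives
fp = counts[(0, 1)]  # false positives
tn = counts[(0, 0)]  # true negatives

accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print([[tn, fp], [fn, tp]])         # [[3, 1], [1, 3]]
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```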

### Data Ethics Questions

46. What is data privacy?

- Data privacy refers to the proper handling and protection of sensitive data, ensuring
individuals' rights and freedoms are respected.

47. What is bias in data analysis?

- Bias refers to systematic errors that can lead to incorrect conclusions or unfair treatment of
certain groups in data analysis.

48. How can you ensure data integrity?

- Data integrity can be ensured through validation rules, access controls, and regular audits of
data sources and processes.

49. What is GDPR?

- The General Data Protection Regulation (GDPR) is a regulation in the EU that governs data
protection and privacy, giving individuals greater control over their personal data.

50. Why is data transparency important?

- Data transparency builds trust, allows for verification of findings, and ensures accountability
in data handling and analysis.

### More Advanced Topics

51. **What is the difference between K-means and hierarchical clustering?**
- K-means: This is a partitioning method that divides the data into a specified number of
clusters (k). It initializes k centroids, assigns each data point to the nearest centroid, and
then updates the centroids based on the mean of the assigned points. This process iterates
until convergence.
- Hierarchical clustering: This method creates a tree-like structure (dendrogram) of clusters.
It can be agglomerative (bottom-up approach) or divisive (top-down approach). Agglomerative
starts with each point as its own cluster and merges them based on similarity, while divisive
starts with one cluster and splits it.
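
The K-means loop described above (assign to nearest centroid, update centroids, repeat) can be sketched in NumPy. This is a teaching sketch, not a production implementation: the `kmeans` function is hypothetical, and it seeds the centroids with the first k points for reproducibility, whereas real libraries usually use random or k-means++ initialization:

```python
import numpy as np

def kmeans(points, k, n_iter=10):
    # Simplifying assumption: use the first k points as initial centroids.
    centroids = points[:k].copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Two well-separated groups of made-up 2-D points.
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, centroids = kmeans(points, k=2)
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```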

### Theory Questions

1. What is the difference between Python lists and arrays?

- Lists can hold different data types and are dynamic in size, while arrays (from the `numpy`
library) are fixed in size and hold homogeneous data types for better performance in numerical
computations.

2. Explain the concept of DataFrames in Pandas.

- DataFrames are two-dimensional, size-mutable, and potentially heterogeneous tabular data
structures with labeled axes (rows and columns), ideal for data manipulation and analysis.

3. What is the purpose of the `groupby()` function in Pandas?

- The `groupby()` function is used to split the data into groups based on some criteria, allowing
for operations like aggregation, transformation, or filtration.

4. How does the `apply()` function work in Pandas?

- The `apply()` function allows you to apply a function along the axis of a DataFrame or to
each element of a Series, enabling complex data manipulations.

5. What are some common methods to handle missing data in a dataset?

- Common methods include removing rows/columns with missing values (`dropna()`), filling
them with specific values (`fillna()`), or using interpolation methods.

### Coding Questions

#### 1. Data Manipulation

**Question:** Write a function that takes a DataFrame and a column name, and returns the
mean of that column.

```python
import pandas as pd

def mean_of_column(df, column_name):
    return df[column_name].mean()

# Example usage
data = {'A': [1, 2, 3, 4], 'B': [5, 6, None, 8]}
df = pd.DataFrame(data)
print(mean_of_column(df, 'A'))  # Output: 2.5
```

#### 2. Filtering Data

**Question:** Write a function to filter rows in a DataFrame where a specified column’s values
are greater than a given threshold.

```python
import pandas as pd

def filter_above_threshold(df, column_name, threshold):
    # Boolean indexing keeps only rows where the condition is True
    return df[df[column_name] > threshold]

# Example usage
data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}
df = pd.DataFrame(data)
print(filter_above_threshold(df, 'A', 2))  # Rows where A is 3 or 4
```

#### 3. Grouping Data

**Question:** Write a function that returns the sum of values in a specific column grouped by
another column.

```python
import pandas as pd

def sum_grouped_by(df, group_column, sum_column):
    # Returns a Series indexed by the group values
    return df.groupby(group_column)[sum_column].sum()

# Example usage
data = {'Category': ['A', 'B', 'A', 'B'], 'Values': [1, 2, 3, 4]}
df = pd.DataFrame(data)
print(sum_grouped_by(df, 'Category', 'Values'))  # Output: A 4, B 6
```

#### 4. Handling Missing Values

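**Question:** Write a function that replaces missing values in a DataFrame with the mean of
their respective columns.

```python
import pandas as pd

def fill_missing_with_mean(df):
    # numeric_only=True avoids errors when the DataFrame has non-numeric columns
    return df.fillna(df.mean(numeric_only=True))

# Example usage
data = {'A': [1, None, 3], 'B': [None, 2, 3]}
df = pd.DataFrame(data)
print(fill_missing_with_mean(df))  # NaN in A becomes 2.0, NaN in B becomes 2.5
```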

#### 5. Data Visualization

**Question:** Write code to create a bar plot of the average values of a column grouped by
another column.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_average_bar(df, group_column, value_column):
    # Mean of value_column for each group, drawn as one bar per group
    averages = df.groupby(group_column)[value_column].mean()
    averages.plot(kind='bar')
    plt.title(f'Average {value_column} by {group_column}')
    plt.xlabel(group_column)
    plt.ylabel(f'Average {value_column}')
    plt.show()

# Example usage
data = {'Category': ['A', 'B', 'A', 'B'], 'Values': [1, 2, 3, 4]}
df = pd.DataFrame(data)
plot_average_bar(df, 'Category', 'Values')
```

### Additional Theory Questions

6. What is the purpose of normalization and standardization in data preprocessing?

- Normalization scales data to a specific range, while standardization centers the data around
the mean with a unit variance.
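
Both transformations reduce to one line each. A sketch on made-up values, using min-max scaling to [0, 1] as the normalization variant:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])  # illustrative data

# Normalization (min-max): rescale to the range [0, 1].
normalized = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): zero mean, unit standard deviation.
standardized = (x - x.mean()) / x.std()

print(normalized)           # values between 0 and 1
print(standardized.mean())  # approximately 0
print(standardized.std())   # approximately 1
```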

7. Explain the importance of exploratory data analysis (EDA).

- EDA is crucial for understanding data distributions, identifying patterns, detecting anomalies,
and informing feature selection for modeling.

8. What is a correlation matrix?

- A correlation matrix is a table showing correlation coefficients between variables, helping to
understand relationships and dependencies.

9. What are the benefits of using Python for data analytics?

- Python offers extensive libraries (e.g., Pandas, NumPy, Matplotlib), ease of use, community
support, and flexibility for various data manipulation tasks.

10. How do you handle categorical variables in machine learning?

- Categorical variables can be handled using encoding techniques like one-hot encoding or
label encoding to convert them into a numerical format.
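
A short sketch of both encodings in pandas. The `color` column is made up, and `cat.codes` is used here as one simple way to get label encoding:

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "red"]})  # illustrative data

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df, columns=["color"])

# Label encoding: one integer per category (categories are sorted, so blue=0, red=1).
df["color_code"] = df["color"].astype("category").cat.codes

print(list(one_hot.columns))      # ['color_blue', 'color_red']
print(df["color_code"].tolist())  # [1, 0, 1]
```

Label encoding implies an ordering between categories, so one-hot encoding is often the safer choice for nominal data.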

### Additional Coding Challenges

#### 6. Outlier Detection

**Question:** Write a function that detects outliers in a DataFrame column using the IQR
method.

```python
import pandas as pd

def detect_outliers_iqr(df, column_name):
    Q1 = df[column_name].quantile(0.25)
    Q3 = df[column_name].quantile(0.75)
    IQR = Q3 - Q1
    # Values beyond 1.5 * IQR from the quartiles are flagged as outliers
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    return df[(df[column_name] < lower_bound) | (df[column_name] > upper_bound)]

# Example usage
data = {'Values': [1, 2, 3, 4, 100]}
df = pd.DataFrame(data)
print(detect_outliers_iqr(df, 'Values'))  # Output: the row where Values is 100
```

#### 7. Date and Time Manipulation

**Question:** Write a function that adds a specified number of days to a date column in a
DataFrame.

```python
import pandas as pd

def add_days_to_date(df, date_column, days):
    # Note: this modifies the passed DataFrame in place and also returns it
    df[date_column] = pd.to_datetime(df[date_column]) + pd.Timedelta(days=days)
    return df

# Example usage
data = {'Date': ['2023-01-01', '2023-01-02']}
df = pd.DataFrame(data)
print(add_days_to_date(df, 'Date', 5))  # Dates become 2023-01-06 and 2023-01-07
```

### Basic Python Questions

1. **What is Python?**
- Python is a high-level, interpreted programming language known for its readability and
versatility. It is widely used in data analytics, web development, automation, and more.

2. What are Python lists?

- Lists are mutable sequences in Python that can hold a collection of items. They are defined
using square brackets `[]`.

3. How do you create a function in Python?

- A function is defined using the `def` keyword followed by the function name and
parentheses. For example:
```python
def my_function():
    return "Hello, World!"
```

4. What are tuples in Python?

- Tuples are immutable sequences, defined using parentheses `()`, that can store a collection
of items.

5. How do you handle exceptions in Python?

- Exceptions are handled using `try` and `except` blocks:
```python
try:
    result = 10 / 0  # code that may cause an exception
except ZeroDivisionError:
    print("Cannot divide by zero")  # code to handle the exception
```

### Data Manipulation with Pandas

6. **What is Pandas?**
- Pandas is a powerful data manipulation and analysis library for Python. It provides data
structures like Series and DataFrames.

7. How do you read a CSV file into a Pandas DataFrame?

- Use `pd.read_csv('filename.csv')` to read a CSV file.

8. How do you filter rows in a DataFrame?

- You can filter rows using boolean indexing:
```python
filtered_df = df[df['column_name'] > value]
```

9. How do you handle missing data in Pandas?

- You can use `df.dropna()` to remove missing values or `df.fillna(value)` to fill them with a
specified value.

10. How do you group data in Pandas?

- Use the `groupby()` method:
```python
grouped = df.groupby('column_name').mean(numeric_only=True)  # skip non-numeric columns
```

### Data Visualization

11. What libraries can be used for data visualization in Python?

- Common libraries include Matplotlib, Seaborn, and Plotly.

12. How do you create a simple line plot using Matplotlib?

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]   # sample data
y = [2, 4, 6, 8]
plt.plot(x, y)
plt.show()
```

13. What is Seaborn, and how does it relate to Matplotlib?

- Seaborn is a statistical data visualization library built on top of Matplotlib, offering a
high-level interface for drawing attractive graphics.

14. How do you create a scatter plot using Seaborn?

```python
import seaborn as sns
sns.scatterplot(data=df, x='column_x', y='column_y')
```

15. What is a histogram, and how do you create one in Python?

- A histogram is a graphical representation of the distribution of numerical data. You can
create one using:
```python
plt.hist(data, bins=10)
```

### Advanced Python Questions

16. What are lambda functions in Python?

- Lambda functions are anonymous functions defined using the `lambda` keyword. They can
take any number of arguments but can only have one expression.

17. How do you merge two DataFrames in Pandas?

- Use `pd.merge(df1, df2, on='column_name')`.

18. **What are the differences between `loc` and `iloc` in Pandas?**
- `loc` is label-based indexing, while `iloc` is position-based indexing. For example:
```python
df.loc[0]   # Row whose index label is 0 (not necessarily the first row)
df.iloc[0]  # First row by position, whatever its label
```

19. What is a DataFrame in Pandas?

- A DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data
structure with labeled axes (rows and columns).

20. Explain the concept of "vectorization" in Python.

- Vectorization refers to the process of applying operations on entire arrays rather than
individual elements, which enhances performance.
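
A small sketch contrasting an element-by-element loop with the vectorized NumPy equivalent; the arrays are made up:

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
qty = np.array([1, 2, 3])

# Element-by-element loop in Python.
totals_loop = [p * q for p, q in zip(prices, qty)]

# Vectorized: one expression operates on whole arrays in compiled code.
totals_vec = prices * qty

print(totals_vec.tolist())  # [10.0, 40.0, 90.0]
```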

### Statistical Analysis

21. What is NumPy?

- NumPy is a fundamental library for numerical computing in Python, providing support for
arrays, matrices, and a collection of mathematical functions.

22. **How do you calculate the mean and standard deviation using NumPy?**
```python
import numpy as np
mean = np.mean(data)
std_dev = np.std(data)  # population std (ddof=0); use ddof=1 for the sample estimate
```

23. **What is linear regression, and how can you implement it in Python?**
- Linear regression is a method to model the relationship between a dependent variable and
one or more independent variables. It can be implemented using `scikit-learn`:
```python
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(X, y)
```

24. How do you perform hypothesis testing in Python?

- You can use libraries like `SciPy` to perform various tests (e.g., t-tests, chi-square tests):
```python
from scipy import stats
t_statistic, p_value = stats.ttest_ind(sample1, sample2)
```

25. What is the Central Limit Theorem?

- The Central Limit Theorem states that the distribution of sample means approaches a
normal distribution as the sample size increases, regardless of the shape of the original
distribution, provided the observations are independent and the population has finite variance.

### SQL and Data Queries

26. How can you connect to a SQL database using Python?

- You can use libraries like `sqlite3` or `SQLAlchemy` to connect to databases.

27. What is the purpose of the `GROUP BY` clause in SQL?

- The `GROUP BY` clause groups rows that have the same values in specified columns into
summary rows, like finding the average or sum.

28. How do you perform a SQL JOIN in Pandas?

- You can use the `merge()` function to perform SQL-like joins:
```python
result = pd.merge(df1, df2, on='key', how='inner')
```

29. What is a primary key in a database?

- A primary key is a unique identifier for a record in a table, ensuring that no two rows have
the same value in that column.

30. How do you handle SQL injections in Python?

- Use parameterized queries or ORM frameworks like SQLAlchemy to prevent SQL injection
attacks.
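
A minimal sketch of a parameterized query with the standard `sqlite3` module. The `users` table and the hostile input string are invented for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)

user_input = "bob' OR '1'='1"  # hostile input

# Unsafe pattern (do not use): building the SQL with string formatting,
# e.g. f"... WHERE name = '{user_input}'", would match every row.

# Safe: the ? placeholder passes the value as data, never as SQL.
rows = conn.execute("SELECT name FROM users WHERE name = ?", (user_input,)).fetchall()
safe_rows = conn.execute("SELECT name FROM users WHERE name = ?", ("bob",)).fetchall()
conn.close()

print(rows)       # []
print(safe_rows)  # [('bob',)]
```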

### Machine Learning Basics

31. What is the difference between supervised and unsupervised learning?

- Supervised learning uses labeled data to train models, while unsupervised learning
identifies patterns in unlabeled data.

32. **How do you split data into training and testing sets?**
- You can use `train_test_split` from `scikit-learn`:
```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
```

33. What is overfitting in machine learning?

- Overfitting occurs when a model learns the noise in the training data rather than the actual
underlying patterns, leading to poor performance on new data.

34. What are decision trees?

- Decision trees are a type of supervised learning algorithm used for classification and
regression that splits data into branches based on feature values.

35. How do you evaluate the performance of a machine learning model?

- Performance can be evaluated using metrics such as accuracy, precision, recall, F1-score,
and ROC-AUC for classification tasks, and mean squared error (MSE) for regression tasks.

### Data Wrangling and Transformation

36. What is data wrangling?

- Data wrangling is the process of cleaning and transforming raw data into a usable format for
analysis.

37. How do you pivot a DataFrame in Pandas?

- You can use the `pivot()` method:
```python
pivot_df = df.pivot(index='column1', columns='column2', values='column3')
```

38. What is one-hot encoding?

- One-hot encoding is a technique to convert categorical variables into a binary matrix format,
allowing algorithms to work with categorical data.

39. How do you concatenate DataFrames in Pandas?

- Use the `concat()` function:
```python
result = pd.concat([df1, df2])
```

40. How do you normalize data in Python?

- You can normalize data using the `MinMaxScaler` from `scikit-learn`:
```python
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
normalized_data = scaler.fit_transform(data)
```

### Final Questions and Scenarios

41. Can you explain the importance of data visualization?

- Data visualization helps communicate insights effectively, making complex data more
understandable and facilitating decision-making.

42. How would you handle imbalanced datasets?

- Techniques include resampling (over-sampling the minority class or under-sampling the
majority class), using different evaluation metrics, and employing algorithms that handle
imbalance naturally.

43. What is feature engineering, and why is it important?

- Feature engineering involves creating new features from existing data to improve model
performance. It


4. **How does the `apply()` function work in Pandas?**

5. **What are some common methods to handle missing data in a dataset?**

### Coding Questions

#### 1. Data Manipulation

#### 2. Filtering Data

#### 3. Grouping Data

#### 4. Handling Missing Values

#### 5. Data Visualization

def plot_average_bar(df, group_column, value_column):
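The signature above appears in this outline without a body. A minimal sketch of what such a function might do, assuming it groups `df` by `group_column`, averages `value_column`, and draws the result as a bar chart with Matplotlib. The sample data and the choice to return the axes object are illustrative assumptions, not taken from the original.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_average_bar(df, group_column, value_column):
    # Assumed behavior: mean of value_column per group, drawn as a bar chart.
    averages = df.groupby(group_column)[value_column].mean()
    fig, ax = plt.subplots()
    ax.bar(averages.index.astype(str), averages.values)
    ax.set_xlabel(group_column)
    ax.set_ylabel(f"Average {value_column}")
    ax.set_title(f"Average {value_column} by {group_column}")
    return ax  # returned so the caller can adjust or save the figure


# Hypothetical sample data, for illustration only
sample = pd.DataFrame({"team": ["a", "a", "b"], "score": [1.0, 3.0, 5.0]})
ax = plot_average_bar(sample, "team", "score")
```

In an interactive session, calling `plt.show()` after the function (and dropping the `Agg` backend line) would display the chart.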

### Additional Theory Questions

6. **What is the purpose of normalization and standardization in data preprocessing?**

7. **Explain the importance of exploratory data analysis (EDA).**

8. **What is a correlation matrix?**

9. **What are the benefits of using Python for data analytics?**

10. **How do you handle categorical variables in machine learning?**

### Additional Coding Challenges

#### 6. Outlier Detection

#### 7. Date and Time Manipulation

### Basic Python Questions

2. What are Python lists?

3. How do you create a function in Python?

4. What are tuples in Python?

5. How do you handle exceptions in Python?

7. How do you read a CSV file into a Pandas DataFrame?

8. How do you filter rows in a DataFrame?

9. How do you handle missing data in Pandas?

10. How do you group data in Pandas?

11. What libraries can be used for data visualization in Python?

12. How do you create a simple line plot using Matplotlib?

13. What is Seaborn, and how does it relate to Matplotlib?

14. How do you create a scatter plot using Seaborn?

15. What is a histogram, and how do you create one in Python?

16. What are lambda functions in Python?

17. How do you merge two DataFrames in Pandas?

19. What is a DataFrame in Pandas?

20. Explain the concept of "vectorization" in Python.

21. What is NumPy?

24. How do you perform hypothesis testing in Python?

25. What is the Central Limit Theorem?

26. How can you connect to a SQL database using Python?

27. What is the purpose of the `GROUP BY` clause in SQL?

28. How do you perform a SQL JOIN in Pandas?

29. What is a primary key in a database?

30. How do you handle SQL injections in Python?

31. What is the difference between supervised and unsupervised learning?

33. What is overfitting in machine learning?