Revision (1)
REVISION NOTES
Unit 1
Definitions:
• Data: Raw, unprocessed facts and figures (e.g., numbers, text, images). Data
itself has no context or meaning.
• Information: Data that has been processed, organized, or structured in a way
that adds meaning and context, making it useful for decision-making.
Key Differences:
• Raw vs. Processed: Data is raw, whereas information is processed and ready
to be used.
• Meaning: Information has meaning and can be acted upon, while data may
need to be processed to derive meaning.
3. How is data science transforming different industries? Elaborate on its role in personalized
recommendations, targeted advertising, smart devices, health monitoring applications, and
fraud detection.
NOTE : Only pointers / hints are given below each question. Students are expected to elaborate
the answers and give examples or code wherever necessary to get maximum marks.
• Smart Devices: Internet of Things (IoT) devices use data science to optimize
performance, automate tasks, and enhance user experience (e.g., smart
thermostats, wearable devices).
• Health Monitoring Applications: Apps like Fitbit and Apple Health track
physical activities and vital signs, using data science to provide personalized
health insights.
• Fraud Detection: In finance, data science helps detect fraudulent activities by
analyzing transaction patterns and identifying anomalies (e.g., unusual spending
patterns).
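The anomaly idea behind fraud detection can be sketched in a few lines of Python. This is only an illustration, assuming a simple z-score rule; the function name flag_anomalies, the threshold of 2.0, and the transaction amounts are all made up for the example, and real systems use far richer models.

```python
from statistics import mean, stdev

def flag_anomalies(amounts, threshold=2.0):
    """Return the amounts whose z-score exceeds the threshold (assumed rule)."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return []  # all amounts identical, nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

# Hypothetical card transactions: mostly small, one unusually large
transactions = [25.0, 40.0, 32.5, 18.0, 29.0, 35.0, 22.0, 950.0]
suspicious = flag_anomalies(transactions)
print(suspicious)
```

Only the 950.0 transaction lies far enough from the average to be flagged as unusual spending.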
Relationship:
1. Finance:
2. Politics:
• Sentiment Analysis: Analyzes public opinion from social media and polls.
• Voter Behavior Prediction: Predicts voting patterns based on demographics
and previous elections.
• Campaign Targeting: Identifies key voter groups to target for election
campaigns.
3. Health Care:
4. Urban Planning:
UNIT II
• Data Validation: A feature in Excel that restricts the type of data or values users
can enter into a cell. It ensures data consistency and accuracy.
• Importance:
2. How can you use Data Validation to restrict input to a specific range of numbers in
Excel? Provide the steps involved.
3. Explain how Data Validation can be used to create a drop-down list in Excel. Why
would you use a drop-down list in a data entry form?
4. What is Conditional Formatting in Excel, and why is it useful for data analysis?
Example: Highlight all cells with sales greater than $1000 using "Highlight Cells
Rules."
6. Explain the steps to apply Conditional Formatting to highlight cells that contain
values greater than a specific number.
Steps:
• Choose Highlight Cells Rules → Greater Than.
• Enter the value (e.g., 50).
• Choose a format (e.g., bold red text).
• Click OK to apply the formatting.
7. How can you use Conditional Formatting to highlight duplicate values in a dataset?
What are the practical benefits of this feature?
Practical Benefits:
Example Scenario:
• In a sales dataset, you want to highlight the top 10% of sales performers.
Steps:
• Select the data range.
• Go to Conditional Formatting → Top/Bottom Rules → Top 10%.
• Choose a color format (e.g., green fill).
• Click OK.
9. What is a Nested IF function in Excel? Explain its syntax and provide an example of
how it can be used to evaluate multiple conditions.
• If a student’s score is 90 or above, assign "A"; if between 70-89, assign
"B"; otherwise, assign "C".
• =IF(A1>=90, "A", IF(A1>=70, "B", "C"))
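For comparison, the same grading logic can be written as a small Python function. This is a sketch only; the function name grade is chosen for illustration, and the thresholds are the ones used in the formula above.

```python
def grade(score):
    """Mirror the nested IF: test the highest threshold first."""
    if score >= 90:
        return "A"
    elif score >= 70:   # reached only when score < 90
        return "B"
    else:               # everything below 70
        return "C"

for s in (95, 82, 60):
    print(s, grade(s))
```

As in the nested IF, the order of the tests matters: each later condition is evaluated only when every earlier one is false.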
10. Explain the difference between the IF function and the IFS function in Excel. In
which scenarios is the IFS function preferred over Nested IF?
• IF Function: Used to test a single condition and return one value if TRUE,
another if FALSE.
• IFS Function: Allows testing multiple conditions without needing to nest
IF statements.
• IFS is Preferred:
11. What is the purpose of the COUNTIF function in Excel? Explain how it can be used
to count cells that meet specific criteria, with an example.
• =COUNTIF(B1:B10, "Completed").
• COUNTIFS is useful when you want to count sales greater than $1000
made by a specific salesperson:
• =COUNTIFS(Sales, ">1000", Salesperson, "John").
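The counting logic of COUNTIF and COUNTIFS can be illustrated with a short Python sketch. The lists below are hypothetical sample data, not taken from any worksheet; the sketch also uses exact, case-sensitive text comparison, which is a simplification.

```python
# Hypothetical task statuses (stands in for the range B1:B10)
statuses = ["Completed", "Pending", "Completed", "In Progress", "Completed"]

# COUNTIF: count cells that satisfy one criterion
completed_count = sum(1 for s in statuses if s == "Completed")

# Hypothetical paired columns (stand in for the Sales and Salesperson ranges)
sales = [1200, 800, 1500, 2000, 950]
salespeople = ["John", "John", "Mary", "John", "Mary"]

# COUNTIFS: count rows where every criterion holds at the same time
john_big_sales = sum(
    1
    for amount, person in zip(sales, salespeople)
    if amount > 1000 and person == "John"
)

print(completed_count, john_big_sales)
```

A row is counted by COUNTIFS only when all criteria are true for that same row, which is why the two lists are walked in parallel.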
13. What are Filters in Excel? How can filters be used to organize and analyze large
datasets? Provide an example.
• Filters: Tools that allow you to view only the data that meets specific
criteria while hiding the rest.
• Usage: Organizes data by showing relevant records, simplifying analysis.
• Example:
• Filter a sales report to display only records where sales are greater than
$5000.
• Steps: Select data → Data tab → Filter → Choose filter criteria.
14. How do you create a Pivot Table in Excel, and what are some of the common
operations (such as filtering, sorting, and grouping) that can be performed on Pivot
Table data?
• Common Operations:
15. How can you use Excel's Solver to solve a system of simultaneous equations? Walk
through the steps required to set up and solve the equations.
16. Explain how the Solver tool in Excel can be used to solve Linear Programming
Problems (LPP). Provide an example.
17. Explain how the LEFT, MID, and RIGHT functions are used in Excel to extract parts
of a text string. Provide examples of each function.
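As a rough illustration of what the three functions return, the following Python sketch mimics them with string slicing. The helper names left, mid, and right and the sample string are assumptions made for this example; note that Excel counts character positions from 1, while Python counts from 0.

```python
def left(text, n):
    """First n characters, like LEFT(text, n)."""
    return text[:n]

def mid(text, start, n):
    """n characters beginning at 1-based position start, like MID(text, start, n)."""
    return text[start - 1:start - 1 + n]

def right(text, n):
    """Last n characters, like RIGHT(text, n)."""
    return text[-n:] if n > 0 else ""

sample = "Order1234567"  # hypothetical sample string
print(left(sample, 5))    # leading characters
print(mid(sample, 6, 4))  # characters from the middle
print(right(sample, 3))   # trailing characters
```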
18. In what scenarios would you use the MID function over the LEFT or RIGHT
function? Provide an example where the MID function is essential.
• Scenario: When you need to extract characters from the middle of a string,
where neither the leading nor the trailing characters are needed.
• Example: Extracting part of an order number from a string such as "Order1234567"
(=MID("Order1234567", 6, 4) returns "1234", i.e., four characters starting from the 6th character).
19. What is the INDEX MATCH combination in Excel, and how does it work?
20. Explain how the VLOOKUP function works. What are its limitations, and how can
these limitations be overcome using other functions?
21. What is the difference between VLOOKUP and HLOOKUP in Excel? Provide
examples of when each function is applicable.
• VLOOKUP: Looks for a value in the first column and retrieves corresponding
values from the same row.
• HLOOKUP: Looks for a value in the first row and retrieves corresponding values
from the same column.
• Example:
• Use VLOOKUP for a vertical table (e.g., finding prices by product name).
• Use HLOOKUP for a horizontal table (e.g., finding values across different
years).
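The difference in search direction can be sketched in Python. This is a simplified model assuming exact matches only; the function names vlookup and hlookup, the tables, and the use of None for a missing key are choices made for this illustration (Excel itself reports #N/A when no match is found).

```python
# Vertical table: each row is (product, price); keys run down the first column
price_table = [("Pen", 1.50), ("Notebook", 3.25), ("Stapler", 7.00)]

def vlookup(key, table, col_index):
    """Search the first column; return the value from col_index (1-based) in that row."""
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return None  # assumption: None stands in for Excel's #N/A

# Horizontal table: keys run across the first row, values sit in the rows below
yearly_table = [[2021, 2022, 2023], [120, 150, 180]]

def hlookup(key, table, row_index):
    """Search the first row; return the value from row_index (1-based) in that column."""
    for col, header in enumerate(table[0]):
        if header == key:
            return table[row_index - 1][col]
    return None

print(vlookup("Notebook", price_table, 2))
print(hlookup(2022, yearly_table, 2))
```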
22. Explain the purpose of the AND and OR functions in Excel. How can they be used to
evaluate multiple conditions in a formula?
23. Provide an example where you would use the AND function to return TRUE if
multiple conditions are met.
You want to check if a student passes based on two conditions: the student must score
above 50 in both Math and Science to pass.
Formula:
=AND(A2>50, B2>50)
Explanation:
• This formula returns TRUE if both conditions (Math > 50 and Science > 50) are
met, indicating the student passed both subjects.
• If one or both conditions are false, it returns FALSE.
24. How can AND and OR functions be combined with IF statements in Excel to create
more complex logical tests? Provide an example.
• Check if a student passes either subject A or B, and if the total score is greater
than 150:
• Formula: =IF(AND(A1+B1>150, OR(A1>40, B1>40)), "Pass", "Fail")
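The same combined test can be written in Python to make the grouping of conditions explicit. This is a sketch; the function name result and the sample scores are assumptions for illustration.

```python
def result(a, b):
    """Pass only if the total exceeds 150 AND at least one subject score exceeds 40."""
    total_ok = (a + b) > 150
    either_subject_ok = (a > 40) or (b > 40)
    return "Pass" if (total_ok and either_subject_ok) else "Fail"

print(result(80, 75))   # total 155, both subjects above 40
print(result(100, 40))  # total 140, so the AND fails
```

With these particular thresholds, a total above 150 already implies that at least one score is above 40, so the OR test never changes the outcome; different thresholds would make it matter.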
UNIT III
1. Define machine learning. How does it differ from traditional programming? Provide
examples of real-world applications of machine learning.
• Examples:
• Classification: Predicting categories (e.g., spam detection in emails).
• Regression: Predicting continuous values (e.g., predicting house prices).
• Algorithms:
1. Classification:
2. Regression:
Key Differences:
• Advantages:
• Limitations:
• Expensive and time-consuming to label data.
7. What is the K-Nearest Neighbour (KNN) algorithm? Explain how the algorithm
works with the help of an example.
• How It Works:
• For a given test point, KNN identifies the "k" closest points (neighbors)
from the training data.
• In classification, it assigns the most common class among these neighbors.
• Example:
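A minimal Python sketch of the steps above, assuming Euclidean distance and a small made-up 2-D dataset. The points, labels, and the function name knn_predict are illustrative only and do not come from the notes.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Pair each training point's distance to the query with its label, nearest first
    distances = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common class among the k neighbours wins
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training data: two well-separated groups
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["red", "red", "red", "blue", "blue", "blue"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (6, 5)))
```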
8. Discuss how the value of K (in KNN) influences the model’s performance. What
happens if K is too small or too large?
• Influence of K:
• Small K (e.g., K=1): The model may be sensitive to noise and overfit, leading
to poor generalization.
• Large K (e.g., K=20): The model may generalize too much, leading to
underfitting and losing the fine details in the data.
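The effect of K can be seen on a tiny 1-D example. This sketch assumes a made-up dataset in which one point (2.4) carries a noisy label; the function name knn_1d and all values are hypothetical.

```python
from collections import Counter

def knn_1d(xs, labels, query, k):
    """Majority vote among the k training values closest to query."""
    nearest = sorted(zip(xs, labels), key=lambda pair: abs(pair[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Class "A" sits near 1-3, class "B" near 7-10; the "B" at 2.4 is a noisy label
xs = [1.0, 2.0, 3.0, 2.4, 7.0, 8.0, 9.0, 10.0]
labels = ["A", "A", "A", "B", "B", "B", "B", "B"]

query = 2.5
predictions = {k: knn_1d(xs, labels, query, k) for k in (1, 3, 8)}
print(predictions)
```

With K=1 the single noisy neighbour decides the answer ("B"); with K=3 the local majority corrects it ("A"); with K=8 every point votes, so the overall majority class "B" wins regardless of where the query lies.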
9. Explain the concept of linear regression. What is the role of the slope and intercept in
the linear regression equation?
• Linear Regression:
• Equation: Y = b0 + b1X
• Slope (b1): The rate of change of Y with respect to X. It shows how much Y changes
for a one-unit change in X.
• Intercept (b0): The value of Y when X is 0. It is the point where the line
crosses the Y-axis.
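To see how the slope and intercept are obtained from data, here is a short Python sketch using the ordinary least-squares formulas. The function name fit_line and the sample data are assumptions for illustration; the data were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b1 = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y)
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data generated from Y = 1 + 2X
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = fit_line(xs, ys)
print(f"Intercept b0 = {b0}, slope b1 = {b1}")
```

Here each one-unit increase in X raises Y by 2 (the slope), and the line meets the Y-axis at 1 (the intercept).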
10. What is K-Means clustering? Describe the algorithm and explain how it works to
group data points into clusters.
• K-Means Clustering:
• How It Works:
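The notes do not spell out the steps here, so the following is only a sketch of the standard assign-and-update loop on 1-D data. It assumes the starting centroids are passed in explicitly (in practice they are usually chosen at random); the function name k_means_1d and the values are hypothetical.

```python
def k_means_1d(values, centroids, max_iters=100):
    """Group values into len(centroids) clusters by repeated assign/update steps."""
    centroids = list(centroids)
    for _ in range(max_iters):
        # Assignment step: put each value in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]  # keep an empty cluster's centroid
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so the clusters are stable
            break
        centroids = new_centroids
    return centroids, clusters

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, final_clusters = k_means_1d(values, centroids=[1.0, 12.0])
print(final_centroids, final_clusters)
```

The loop stops once the centroids no longer change between iterations, which here happens after the centroids settle at the means of the two groups.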
UNIT IV
• Python:
2. Explain the advantages of Python over other programming languages like Java or
C++. Why is Python popular in data science and machine learning?
3. Explain the different data types available in Python. Provide examples for each type
(e.g., integer, float, string, boolean).
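A short sketch of the four basic types named in the question. The variable names and values are made up for illustration; type() is used to show the type Python assigns to each value.

```python
age = 21             # int: whole number
height = 1.75        # float: number with a decimal point
name = "Asha"        # str: text enclosed in quotes
is_enrolled = True   # bool: either True or False

# Collect the type name of each variable
type_names = [type(v).__name__ for v in (age, height, name, is_enrolled)]
print(type_names)
```

Python infers each type from the assigned value, so no type declaration is needed.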
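Code:
x = 10  # sample value (assumed)
if x > 10:
    print("Greater than 10")  # assumed message; the original line is missing
elif x == 10:
    print("Equal to 10")
else:
    print("Less than 10")  # assumed message; the original line is missing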
5. Write a Python code snippet that uses nested conditional statements to classify an
integer as positive, negative, or zero.
x = -5  # sample value (assumed)
if x > 0:
    print("Positive")
elif x < 0:
    print("Negative")
else:
    print("Zero")
• R Programming:
• Primary Applications:
• Data Analysis: Used for exploring, visualizing, and analyzing large datasets.
• Statistics: Used in hypothesis testing, linear models, and regression analysis.
• Data Visualization: Extensive libraries like ggplot2 make it easy to create
informative and aesthetically pleasing graphs and charts.
• Machine Learning: R is widely used in predictive modeling and machine
learning algorithms (e.g., randomForest, caret).
8. Write Python code to perform basic mathematical operations such as addition,
subtraction, multiplication, and division.
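One possible sketch, assuming two sample numbers (10 and 5) and a hypothetical helper named basic_operations. Division by zero is guarded by returning None, which is a choice made for this example.

```python
def basic_operations(a, b):
    """Return the sum, difference, product, and quotient of a and b."""
    results = {
        "sum": a + b,
        "difference": a - b,
        "product": a * b,
    }
    # Guard against division by zero (assumption: report None in that case)
    results["quotient"] = a / b if b != 0 else None
    return results

ops = basic_operations(10, 5)
for name, value in ops.items():
    print(f"{name.capitalize()}: {value}")
```

The / operator always produces a float in Python 3, so the quotient of 10 and 5 is 2.0 rather than 2.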
num = 7  # sample value (assumed)
if num % 2 == 0:
    print(f"{num} is an even number.")
else:
    print(f"{num} is an odd number.")
11. Write Python code to find simple interest using the formula I = PRT/100.
P, R, T = 1000, 5, 2  # sample principal, annual rate (%), and time in years (assumed values)
SI = (P * R * T) / 100
print(f"The Simple Interest is: {SI}")
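14. Write R code to perform basic mathematical operations such as addition, subtraction,
multiplication, and division.
num1 <- 10
num2 <- 5
total <- num1 + num2          # named 'total' to avoid masking base R's sum()
difference <- num1 - num2
product <- num1 * num2
quotient <- num1 / num2
# "\n" ends each line; cat() does not add a newline by itself
cat("Sum:", total, "\n")
cat("Difference:", difference, "\n")
cat("Product:", product, "\n")
cat("Quotient:", quotient, "\n")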
16. Write R code to find the largest and smallest numbers in a list.
cat("Count of numbers:", count)
18. Write R code to find the mean and median of numbers in a list.
numbers <- c(10, 20, 30, 40, 50)  # sample values (assumed)
cat("Mean:", mean(numbers), "\n")
cat("Median:", median(numbers), "\n")