ASSIGNMENT 2
CHAPTER 2: UNDERSTANDING DATA
MULTIPLE CHOICE QUESTIONS (MCQs)
1. What is the primary purpose of collecting data?
A. To decorate webpages
B. To store data indefinitely
C. To make decisions based on analysis
D. To reduce computer memory usage
Answer: C. To make decisions based on analysis
2. Which of the following is an example of structured data?
A. A newspaper article
B. An image on a website
C. A table showing inventory items in a shop
D. A video file
Answer: C. A table showing inventory items in a shop
3. What does metadata represent?
A. Multimedia content
B. Data about data
C. Only numerical data
D. None of the above
Answer: B. Data about data
4. Which of the following is an outlier?
A. A value that is identical to the mean
B. A value that occurs most frequently
C. An extremely high or low value compared to others
D. A value with zero frequency
Answer: C. An extremely high or low value compared to others
5. What statistical technique is most affected by outliers?
A. Median
B. Mode
C. Mean
D. Standard Deviation
Answer: C. Mean
6. In structured data, what is an attribute?
A. The format of an image
B. A column representing a characteristic
C. A row of values
D. A description of metadata
Answer: B. A column representing a characteristic
7. Which of the following is not a measure of central tendency?
A. Mean
B. Median
C. Mode
D. Range
Answer: D. Range
8. What is the correct order of the data processing cycle?
A. Input → Output → Processing
B. Collection → Preparation → Entry → Storage → Processing → Output
C. Output → Processing → Storage
D. Entry → Output → Collection
Answer: B. Collection → Preparation → Entry → Storage → Processing → Output
9. Which of the following is used to summarize data for easy understanding?
A. Audio processing
B. Metadata generation
C. Statistical techniques
D. Data hiding
Answer: C. Statistical techniques
10. What is the standard deviation used for?
A. To find the middle value
B. To count data entries
C. To measure the spread of data
D. To identify structured data
Answer: C. To measure the spread of data
11. What is the mode of the dataset [34, 34, 27, 28, 27, 34, 34]?
A. 27
B. 34
C. 28
D. 33
Answer: B. 34
12. Which of the following storage devices is volatile?
A. Pen Drive
B. CD/DVD
C. HDD
D. None of the above
Answer: D. None of the above
13. Unstructured data can include all except:
A. Audio files
B. Web pages with multimedia
C. Tables with rows and columns
D. Social media messages
Answer: C. Tables with rows and columns
14. Which of the following tasks represents data collection?
A. Calculating standard deviation
B. Retrieving data from a file
C. Filling student details in an online form
D. Printing a report card
Answer: C. Filling student details in an online form
15. Which tool is suggested for data processing in future chapters?
A. Java
B. MySQL
C. Python
D. HTML
Answer: C. Python
16. Which statistical technique is suitable for finding income disparity?
A. Mode
B. Median
C. Mean
D. Standard Deviation
Answer: D. Standard Deviation
17. What does the median represent in a dataset?
A. The average of all values
B. The value that occurs most frequently
C. The value that appears at the centre after sorting
D. The maximum value in the list
Answer: C. The value that appears at the centre after sorting
18. Which of the following best describes unstructured data?
A. Can be easily stored in tables
B. Has a clear format with fixed fields
C. Lacks predefined structure
D. Cannot be stored digitally
Answer: C. Lacks predefined structure
19. What is the formula for standard deviation (σ) as per the chapter?
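A. $\sigma = \text{Mean of values}$
B. $\sigma = \text{Maximum} - \text{Minimum}$
C. $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
D. $\sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i + \bar{x})$
Answer: C. $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$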
FILL IN THE BLANKS
1. Data which is organised and can be recorded in a well-defined format is called ________.
Answer: Structured Data
2. Data which do not follow any fixed structure or format are called ________.
Answer: Unstructured Data
3. The singular form of the word ‘data’ is ________.
Answer: Datum
4. The process of collecting, storing, and analysing data for decision making is known as
________.
Answer: Data Processing
5. ________ is a measure of central tendency that represents the average of a set of values.
Answer: Mean
6. The ________ is the middle value in a sorted list of data values.
Answer: Median
7. The value that occurs most frequently in a data set is called the ________.
Answer: Mode
8. The difference between the maximum and minimum values in a data set is called the
________.
Answer: Range
9. The standard deviation is represented by the Greek letter ________.
Answer: Sigma (σ)
10. The data describing other data is referred to as ________.
Answer: Metadata
11. Examples of digital storage devices include HDD, SSD, CD/DVD, Pen Drive, and ________.
Answer: Memory Card
12. Statistical techniques used to summarise data include mean, median, mode, range, and
________.
Answer: Standard Deviation
13. The process of obtaining data from reliable sources before processing is called ________.
Answer: Data Collection
14. ICT revolution has led to the generation of ________ volume of data at a very fast pace.
Answer: Large
15. The structured data is generally stored in a ________ format in computers.
Answer: Tabular
2 MARKS QUESTIONS
1. What is the difference between data and information?
Answer:
Data refers to unorganised facts that need to be processed, while information is the
processed form of data that is meaningful and useful for decision making.
2. Define the term ‘datum’.
Answer:
‘Datum’ is the singular form of the word ‘data’. It represents a single piece of information or
value.
3. What is metadata? Give an example.
Answer:
Metadata is data about data. For example, in an image file, metadata may include image size,
type (JPEG, PNG), and resolution.
4. Differentiate between structured and unstructured data.
Answer:
- Structured Data: Organised in rows and columns (e.g., database tables).
- Unstructured Data: Lacks a defined format (e.g., emails, web pages, videos).
5. Give two examples of structured data.
Answer:
1. School fee payment records with fields like StudentName, RollNo, and FeesAmount.
2. Inventory table with fields like ProductName, UnitPrice, and Quantity.
6. Mention two examples of unstructured data.
Answer: 1. Social media posts with text, images, and videos.
2. Email content with body text and attachments.
7. What is the significance of data in decision making?
Answer:
Data helps identify trends, draw conclusions, and support decisions in areas such as
business, education, healthcare, and governance.
8. Define mean and write its formula.
Answer:
Mean is the average of numeric values. Formula: $\text{Mean} = \frac{x_1 + x_2 + \dots + x_n}{n}$
9. What is median? How is it calculated for an even number of values?
Answer:
Median is the middle value in an ordered list. For an even number of values, it is the
average of the two middle values.
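The even-count rule can be checked with a minimal Python sketch. The list `values` is an illustrative assumption, not data from the chapter; `statistics.median` is part of the Python standard library.

```python
import statistics

# Illustrative sorted list with an even number of values (assumed data).
values = [85, 90, 100, 110]

# Pick the two middle values and average them by hand.
mid = len(values) // 2
middle_pair = values[mid - 1 : mid + 1]
manual_median = sum(middle_pair) / 2

# The standard library applies the same rule for even-length lists.
library_median = statistics.median(values)

print(middle_pair, manual_median, library_median)
```

Both approaches give the same result because the two middle values, 90 and 100, are averaged in each case.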
10. Define mode with an example.
Answer:
Mode is the value that appears most frequently in a dataset. Example: In [34, 34, 28, 27, 34],
mode is 34.
11. What is meant by range in a dataset?
Answer:
Range is the difference between the maximum and minimum values in a dataset. Formula:
Range = Maximum - Minimum
12. What does standard deviation measure?
Answer:
Standard deviation measures the spread or dispersion of values around the mean. It
considers all data points in the dataset.
13. Name two commonly used digital storage devices.
Answer:
1. Hard Disk Drive (HDD)
2. Solid State Drive (SSD)
14. Mention any two statistical techniques used for data summarisation.
Answer:
1. Measures of Central Tendency (Mean, Median, Mode)
2. Measures of Variability (Range, Standard Deviation)
15. Differentiate between range and standard deviation.
Answer:
- Range: Difference between the maximum and minimum values.
- Standard Deviation: Measures the average spread of all values from the mean.
16. What type of data is stored in an electronic voting machine?
Answer:
Structured data such as votes cast, which are accumulated and processed for quick
result declaration.
17. Give two scenarios where data is used for making decisions.
Answer:
1. Meteorological data used to predict cyclones.
2. Sales data used by businesses to offer discounts or change product placements.
18. How can Python help in data processing and analysis?
Answer: Python provides libraries that allow efficient data processing, statistical analysis,
and visualisation of large data sets.
19. What are the steps involved in data processing?
Answer:
1. Data Collection
2. Data Preparation
3. Data Entry
4. Storage and Retrieval
5. Classification and Update
6. Generation of Reports/Results
3 MARKS QUESTIONS
1. Explain the three commonly used measures of central tendency with examples.
Answer:
- Mean is the average of all values.
Example: Mean of [90, 100, 110] = (90+100+110)/3 = 100
- Median is the middle value in a sorted list.
Example: Median of [85, 90, 100, 110, 115] = 100
- Mode is the value that occurs most frequently.
Example: Mode of [90, 110, 110, 110, 100] = 110
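The three examples above can be reproduced with Python's standard `statistics` module. This is only a quick check of the worked values; the variable names are chosen for illustration.

```python
import statistics

# Same lists as in the worked examples above.
mean_value = statistics.mean([90, 100, 110])
median_value = statistics.median([85, 90, 100, 110, 115])
mode_value = statistics.mode([90, 110, 110, 110, 100])

print("Mean:", mean_value)
print("Median:", median_value)
print("Mode:", mode_value)
```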
2. Differentiate between structured data and unstructured data with examples.
Answer:
- Structured Data: Organised in rows and columns, easy to store and analyse.
Example: Table of student records with Roll No, Name, Marks.
- Unstructured Data: Lacks predefined format; difficult to analyse.
Example: Social media posts with images and text.
3. Define standard deviation. Write its formula and explain its significance.
Answer:
Standard deviation (σ) measures the spread of data around the mean.
Formula: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
It gives insight into data variability. A smaller σ means values are closer to the mean;
a larger σ indicates more spread.
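To illustrate the significance of σ, the sketch below compares two made-up lists that share the same mean but differ in spread. The lists `tight` and `wide` are assumptions for illustration. `statistics.pstdev` divides by n, as in the formula for σ in this answer, whereas `statistics.stdev` divides by n - 1.

```python
import statistics

# Two illustrative datasets with the same mean (100) but different spread.
tight = [98, 100, 102]
wide = [80, 100, 120]

# pstdev is the population standard deviation (divides by n).
sigma_tight = statistics.pstdev(tight)
sigma_wide = statistics.pstdev(wide)

print(statistics.mean(tight), round(sigma_tight, 2))
print(statistics.mean(wide), round(sigma_wide, 2))
```

The means are identical, so only σ reveals that the second list is far more spread out.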
4. What are metadata? Give three examples from different digital files.
Answer:
Metadata are data about data.
They describe content and structure.
Examples:
- In an image file: resolution, format (JPEG/PNG)
- In an email: subject, recipient, date sent
- In a document: author name, word count, creation date
5. Describe the role of data in business decision-making with any two examples.
Answer:
Businesses use data to understand market trends and improve performance.
Examples: 1. Analysing customer feedback to improve products.
2. Using sales data to implement dynamic pricing (e.g., discount in happy
hours based on past data).
6. Explain the data processing cycle with the help of a diagram or steps.
Answer:
The data processing cycle includes the following steps:
1. Input: Collecting and entering data.
2. Processing: Manipulating data to produce results.
3. Output: Presenting the results.
4. Storage and Retrieval: Saving for future use.
These steps convert raw data into useful information.
7. Distinguish between range and standard deviation with formula and example.
Answer:
- Range: Difference between the maximum and minimum values.
Formula: Range = Max - Min
Example: For [85, 90, 115],
Range = 115 - 85 = 30
- Standard Deviation: Measures the average spread of values from the mean.
Formula: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Example: For [90, 100, 110], the mean is 100 and the squared deviations are 100, 0, and 100,
so $\sigma = \sqrt{200/3} \approx 8.16$.
8. Give three different scenarios of data collection and describe the method to convert them
into digital format.
Answer:
1. Manual Record (e.g., shopkeeper’s diary): Enter data into spreadsheet manually.
2. Digital File (e.g., CSV): Directly use data for analysis using software tools.
3. No prior data: Develop software (e.g., in Python or MySQL) to store and manage
sales digitally.
9. What are the limitations of file processing and how does DBMS help overcome them?
Answer:
File Processing Limitations:
- Difficult to handle large data
- Poor data integrity and high data redundancy
- No concurrent access or security control
DBMS Benefits:
- Centralised management
- Easy retrieval and update
- Ensures data consistency, security, and reduces redundancy
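A small sketch of these DBMS benefits, assuming Python's built-in `sqlite3` module as a stand-in for a full DBMS such as MySQL. The table `student` and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a lightweight stand-in for a DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, marks INTEGER)"
)

# Hypothetical records stored in one central table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 74)],
)

# Easy update: change one student's marks with a single statement.
conn.execute("UPDATE student SET marks = ? WHERE roll_no = ?", (78, 2))

# Consistency: the primary key rejects a second record with roll_no 1.
try:
    conn.execute("INSERT INTO student VALUES (?, ?, ?)", (1, "Duplicate", 50))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Easy retrieval: fetch all records in roll number order.
rows = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
).fetchall()
conn.close()

print(rows, duplicate_rejected)
```

With plain files, the duplicate check and the update would have to be written by hand in every program that touches the data.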
10. A teacher wants to compare students’ test results from five months. Which statistical
technique is suitable and why?
Answer:
Mean is the suitable technique to compare average performance over five months. It
provides a quick understanding of how the class performed each month and highlights
trends in overall class performance.
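A minimal sketch of this comparison in Python. The marks in `monthly_marks` are hypothetical values invented for illustration; only the use of the mean comes from the answer above.

```python
import statistics

# Hypothetical marks of the same four students over five months.
monthly_marks = {
    "Month 1": [55, 60, 65, 70],
    "Month 2": [58, 62, 68, 72],
    "Month 3": [60, 66, 70, 76],
    "Month 4": [57, 63, 69, 71],
    "Month 5": [62, 68, 74, 80],
}

# One mean per month summarises the class performance for that month.
monthly_mean = {
    month: statistics.mean(marks) for month, marks in monthly_marks.items()
}

# The month with the highest mean is the best overall performance.
best_month = max(monthly_mean, key=monthly_mean.get)

for month, average in monthly_mean.items():
    print(month, average)
print("Best month:", best_month)
```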
5 MARKS QUESTIONS
1. What are the different types of data? Explain structured and unstructured data with
examples.
Answer:
Data can be broadly categorised into:
1. Structured Data: Organised in a defined format like tables (rows and columns). Each
column represents an attribute and each row represents an observation.
Examples:
- School records (RollNo, Name, Marks)
- ATM withdrawal data (AccountNo, Date, Amount)
2. Unstructured Data: Data that is not arranged in a predefined format and lacks a fixed structure.
Examples:
- Social media posts
- Email content
- News articles with images, videos, and text
2. Explain the role of data in various real-life sectors. Give at least five examples.
Answer: Data plays a crucial role in decision-making across various domains.
Examples:
1. Education: Placement data helps students choose colleges.
2. Government: Census data is used for planning policies.
3. Healthcare: Hospitals collect patient data for treatment analysis.
4. Meteorology: Weather offices use satellite data to predict storms.
5. Business: Sales data is analysed for discounts, inventory planning, and marketing
decisions.
3. Define and differentiate Mean, Median, and Mode. Include examples.
Answer:
- Mean: Average of numeric values.
Formula: Mean = (Sum of all values) / Number of values
Example: [90, 100, 110] → Mean = (90+100+110)/3 = 100
- Median: Middle value in a sorted list.
Example: [85, 90, 100, 110, 115] → Median = 100
- Mode: Most frequently occurring value.
Example: [90, 90, 100, 110, 110, 110] → Mode = 110
Difference:
- Mean considers all values.
- Median is less sensitive to outliers.
- Mode shows the most common value.
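The point about outliers can be seen in a short Python sketch. The list `marks` and the added outlier value 5 are assumptions made up for illustration.

```python
import statistics

# Illustrative marks, then the same marks with one extreme low value added.
marks = [70, 72, 75, 78, 80]
with_outlier = marks + [5]

mean_before = statistics.mean(marks)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(marks)
median_after = statistics.median(with_outlier)

print("Mean:", mean_before, "->", round(mean_after, 2))
print("Median:", median_before, "->", median_after)
```

A single extreme value pulls the mean down by more than 11 marks, while the median moves by only 1.5.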
4. What is standard deviation? How is it calculated? Explain with an example.
Answer:
Standard deviation measures the dispersion or spread of data around the mean.
Formula: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Steps:
1. Find mean of the dataset.
2. Subtract the mean from each value.
3. Square the differences.
4. Find average of squared differences.
5. Take square root of the result.
Example: For heights: [90, 102, 110, 115, 85, 90, 100, 110, 110]
- Mean = 912/9 ≈ 101.33
- Sum of squared differences ≈ 938
- $\sigma \approx \sqrt{938/9} \approx 10.2$
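The five steps can be followed one by one in Python for the heights dataset. This is a plain sketch of the population formula (dividing by n); the variable names are chosen for illustration.

```python
import math

heights = [90, 102, 110, 115, 85, 90, 100, 110, 110]
n = len(heights)

# Step 1: find the mean of the dataset.
mean = sum(heights) / n

# Step 2: difference between each value and the mean.
deviations = [x - mean for x in heights]

# Step 3: square the differences.
squared = [d ** 2 for d in deviations]

# Step 4: average of the squared differences.
variance = sum(squared) / n

# Step 5: square root of the result.
sigma = math.sqrt(variance)

print(round(mean, 2), round(sum(squared)), round(sigma, 2))
```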
5. Differentiate between Range and Standard Deviation. Explain with examples and
formula.
Answer:
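Feature               | Range                                        | Standard Deviation
----------------------|----------------------------------------------|-------------------------------------------
Definition            | Difference between highest and lowest values | Average spread of all values from the mean
Formula               | Max - Min                                    | $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Data Used             | Only two values (max and min)                | All values
Sensitive to Outliers | Highly sensitive                             | Less affected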
Example:
Data: [85, 90, 90, 100, 102, 110, 110, 110, 115]
- Range = 115 - 85 = 30
- σ ≈ 10.2 (calculated using mean and all values)
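A short Python sketch makes the "Data Used" row concrete. The list `data` is the example dataset above; `clustered` is an assumed second list with the same maximum and minimum, added only to show that range ignores the values in between.

```python
import statistics

data = [85, 90, 90, 100, 102, 110, 110, 110, 115]
# Assumed comparison list: same extremes, but the rest sit at 100.
clustered = [85, 100, 100, 100, 100, 100, 100, 100, 115]

# Range looks only at the two extreme values.
data_range = max(data) - min(data)
clustered_range = max(clustered) - min(clustered)

# Population standard deviation uses every value.
data_sigma = statistics.pstdev(data)
clustered_sigma = statistics.pstdev(clustered)

print(data_range, round(data_sigma, 1))
print(clustered_range, round(clustered_sigma, 1))
```

Both lists have the same range, yet their standard deviations differ because σ takes every value into account.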
6. Explain the data processing cycle with a real-life example.
Answer:
The data processing cycle includes:
1. Data Collection: Gather raw data.
2. Data Preparation: Organise, clean, and validate data.
3. Data Entry: Input data into the system.
4. Processing: Apply algorithms and logic to get results.
5. Storage/Retrieval: Save and retrieve data as needed.
6. Output/Reporting: Present results in a meaningful form.
Example: In online exam registration:
- Collect name, marks, payment details
- Check eligibility
- Generate roll numbers and admit cards
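The registration example can be sketched in Python as follows. Everything specific here is hypothetical: the applicant records, the cut-off `MIN_MARKS`, the fee check, and the starting roll number 1001 are assumptions made only to show collection, processing, and output in sequence.

```python
# Collection and entry: hypothetical applicant records.
applicants = [
    {"name": "Asha", "marks": 72, "fee_paid": True},
    {"name": "Ravi", "marks": 45, "fee_paid": True},
    {"name": "Meena", "marks": 88, "fee_paid": False},
]

MIN_MARKS = 50  # assumed eligibility cut-off


def is_eligible(applicant):
    """Processing: an applicant needs enough marks and a paid fee."""
    return applicant["marks"] >= MIN_MARKS and applicant["fee_paid"]


eligible = [a for a in applicants if is_eligible(a)]

# Output: generate roll numbers for the admit cards.
admit_cards = [
    {"roll_no": 1001 + index, "name": a["name"]}
    for index, a in enumerate(eligible)
]

for card in admit_cards:
    print(card["roll_no"], card["name"])
```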
7. How is data collected in digital environments? Explain three different scenarios.
Answer:
1. Manual to Digital:
- Shopkeeper keeps records in a diary.
- Data is entered into a spreadsheet or software.
2. Already in Digital Format:
- Data is in CSV or database format.
- Can be directly imported and processed.
3. Fresh Data Collection:
- A new system is developed (e.g., using Python/MySQL) to record and store sales or
transactions digitally.
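For the second scenario, a minimal sketch of importing CSV data with Python's standard `csv` module. The inventory rows are made up, and `io.StringIO` stands in for a real file so that the example is self-contained.

```python
import csv
import io

# Hypothetical inventory data as it might appear in a CSV file.
csv_text = """ProductName,UnitPrice,Quantity
Pen,10,50
Notebook,40,20
Eraser,5,100
"""

# DictReader uses the header row as the field names.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# Process the imported data: value of stock held for each product.
stock_value = {
    row["ProductName"]: int(row["UnitPrice"]) * int(row["Quantity"])
    for row in rows
}

print(stock_value)
```

With a real file, `open("inventory.csv", newline="")` would replace `io.StringIO(csv_text)`; that file name is only an example.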
8. Explain metadata with three examples. How is it useful in processing unstructured data?
Answer:
Metadata is data about data.
It helps identify, describe, and process unstructured data.
Examples:
1. Image File: Metadata includes image size, type, resolution.
2. Email: Subject, recipient, time sent.
3. Document: Author name, date created, word count.
Usefulness:
Metadata helps organise, search, and process unstructured content like emails, images, and
documents.
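As an illustration, the sketch below reads some metadata that the operating system keeps for a file, using Python's standard `os` and `tempfile` modules. The temporary text file and the chosen fields are assumptions; image or email metadata would normally need format-specific tools.

```python
import os
import tempfile
from datetime import datetime

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("Data about data")
    path = handle.name

# os.stat returns metadata about the file, not its content.
info = os.stat(path)
metadata = {
    "extension": os.path.splitext(path)[1],
    "size_bytes": info.st_size,
    "modified": datetime.fromtimestamp(info.st_mtime).isoformat(timespec="seconds"),
}
os.remove(path)  # clean up the temporary file

print(metadata)
```

Fields like these let files be searched or sorted without opening them.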
9. List and explain any five real-life applications of statistical techniques in data processing.
Answer:
1. Education: Teachers analyse marks using mean and median.
2. Business: Use mode to identify popular products.
3. Health: Use standard deviation to assess variability in patient recovery times.
4. Elections: Calculate vote share using mean/percentages.
5. Weather Forecasting: Analyse range of temperatures to predict extremes.
10. Compare the use of Mean, Median, and Mode with suitable scenarios.
Answer:
Measure | Best Used When                            | Example
--------|-------------------------------------------|-------------------------------------
Mean    | No extreme outliers, want overall average | Average marks of students
Median  | Outliers present, want central tendency   | Income data where few are very rich
Mode    | Need most frequent value                  | Popular shoe size sold in a store
Each measure helps summarise data based on the context of variability, frequency, or central
position.
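The three scenarios can be tried out in Python. All three lists below are hypothetical data invented for illustration; the pairing of each measure with its scenario follows the comparison above.

```python
import statistics
from collections import Counter

# Hypothetical data for each scenario.
marks = [62, 70, 75, 81, 88]                      # no extreme outliers
incomes = [20000, 22000, 25000, 27000, 900000]    # one very high income
shoe_sizes = [7, 8, 8, 9, 8, 7, 10, 8]            # sizes sold in a store

average_marks = statistics.mean(marks)
typical_income = statistics.median(incomes)

# Counter counts how often each size was sold; most_common(1) gives the mode.
popular_size, times_sold = Counter(shoe_sizes).most_common(1)[0]

print("Average marks:", average_marks)
print("Typical income:", typical_income)
print("Most popular shoe size:", popular_size, "sold", times_sold, "times")
```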