Unit 7 - Data Interpretation
UNIT - 7
NET/JRF/SLET/SET/PhD Entrance
The term ‘data’ is a plural form of the Latin word ‘datum,’ and literally, it means anything that
is given.
Different sources have defined the word in different ways. According to the Oxford
Encyclopedic English Dictionary, “data are known facts or things used as a basis for
inference or reckoning.”
In social sciences, “data are stated as values or facts, together with their accompanying
study design, codebooks, research reports, etc. and are used by researchers for the purpose
of secondary analysis.”
In the humanities, a text such as Biblical material or a Shakespearean drama represents a fixed
quantity of data: a finite amount of text to be interpreted.
In brief, data are unorganised statistical facts and figures collected for a specific purpose,
such as analysis.
Types of Data
As in the sciences, data in the social sciences are organised into different types so that their
nature can be easily understood. The following categorisation is normally observed in the social
sciences:
i) Data with reference to scale of measurement: Based on the scale of measurement, data can
be categorised as follows:
• Nominal data
• Ordinal data
• Interval data
• Ratio data
ii) Data with reference to the number of characteristics: Based on the number of characteristics
observed, data can be categorised as follows:
• Univariate data – Univariate data are obtained when a single characteristic is observed,
e.g., the performance of students in a given class.
• Bivariate data – Bivariate data result when instead of one, two characteristics are
measured simultaneously, e.g., height and weight of tenth class students.
• Multivariate data – Multivariate data consist of observations on three or more
characteristics, e.g., family size, income, and savings in a metropolitan city in India.
iv) Data with reference to time: There are two types of data under this category. These are:
a) Time series data: Data recorded in chronological order across time are referred to as
time-series data. They take different values at different times, e.g., the number of books added
to a library in different years, the monthly production of steel in a plant, or the yearly intake
of students in a university.
b) Cross-sectional data: These are data for the same unit or for different units at a single point
in time, e.g., data across sections of people, regions, or segments of society.
v) Data with reference to origin: Data under this category can be put as follows:
a) Primary data: Data obtained firsthand from individuals by direct observation, counting, and
measurement, or by interviews or a mailed questionnaire, are called primary data. They may be
collected by complete enumeration or by sampling, e.g., data collected from a market survey.
b) Secondary data: Data collected initially for one purpose and already published in books or
reports, but used later for some other purpose, are referred to as secondary data. Examples
include data taken from census reports, books, and data monographs.
vi) Data with reference to characteristic: Data can be categorised on the basis of the
characteristics as follows:
Static data: Static data are data that do not change during processing. Once written or printed,
data of this type cannot be changed.
Dynamic data: Dynamic data are data that change during processing; they are updated as and when
necessary. Such data are not expected to be the same when re-input.
Primary source: A primary source is an original document that contains firsthand information
about a topic or an event. Primary sources exist on a spectrum, and different fields of study may
use different types of primary source documents.
For example, the field of History may use diary entries and letters as primary source evidence,
while the Sciences may use a publication of original research as a primary source.
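Secondary source: A secondary source describes, interprets, or analyses primary sources. Examples
include:
• Biographies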
• Indexes, Abstracts, Bibliographies
• Journal articles
• Literary criticism
• Monographs, written about the topic
• Reviews of books, movies, musical recordings, works of arts, etc
• Newsletters and professional news sources
The acquisition or collection of data can be categorised by the source of the data. There are
two types of sources: primary sources and secondary sources.
Data obtained from primary sources are called primary data; similarly, data obtained from secondary
sources are called secondary data. The sources that fall under each type are described above.
Note: You can also read the topic of data collection in the Research Aptitude of UGC NET
Paper 1
The representation of data is the foundation of any field of study. As data collection proceeds
and the volume of data grows rapidly, an efficient and convenient technique for representing the
data is needed, because time, effort, and resources are limited. Top-level management does not
have time to go through whole reports, yet no significant point in the data should escape their
attention. The data must therefore be presented in a manner that enables the reader to interpret
the essential information with minimum time and effort.
Data presentation and data representation are two terms with similar meaning and importance.
There are several techniques for data presentation, which are broadly categorised in two ways:
(a) Tabular Form: This is better known as numerical data tables. The tabular form is the most
commonly used technique for data presentation. This technique provides a correlation or
measurement of two values/variables at a time.
(b) Case Form: This technique is rarely used. Data is presented in the form of paragraphs and
follows a rigid protocol to examine a limited number of variables.
Data represented in tabular form can be displayed in pictorial form by using a graph. A graphical
representation is a visual display of data and statistical results using plots and charts. It is
an easy way to depict a given set of data and is often more effective than presenting the data in
tabular form. There are different types of graphical representation, and which one is used depends
on the nature of the data and the type of statistical results.
Graphical representation helps to quantify, sort, and present data in a way that is understandable
to a wide variety of audiences.
Visualization techniques are ways of creating and manipulating graphical representations of data.
Several types of media are used for expressing graphics, including plots, charts, and diagrams.
In the literature, the words diagram, chart, and graph are commonly used interchangeably, but
their meanings are as follows:
Diagram: A diagram can be defined as a figure, generally consisting of lines, made to accompany
and illustrate a geometrical theorem, mathematical demonstration, etc. A drawing, sketch, or plan
that outlines and explains the parts of something is also a type of diagram, for example, a diagram
of an engine. In simple words, a pictorial representation of a quantity or of a relationship is
termed a diagram.
Chart: A sheet exhibiting information in tabulated or methodical form is known as a chart.
A chart is also a graphical representation, by lines, curves, bars, etc., of a dependent variable,
e.g., temperature, price, etc.
Graph: A graph is simply a diagram in a mathematical or scientific area of study. A drawing
representing the relationship between a certain set of numbers or quantities by means of a series
of dots, lines, bars, etc., plotted with reference to a set of axes, is called a graph.
Histogram
Bar Graph
• A bar graph is a chart that uses either horizontal or vertical bars to show comparisons
among categories.
• One axis of the chart shows the specific categories being compared, and the other axis
represents a discrete value.
• Some bar graphs present bars clustered in groups of more than one (grouped bar graphs),
and others show the bars divided into subparts to show the cumulative effect (stacked bar graphs).
Pie Chart
• A pie chart is a circular graph that represents the total value as the whole circle and the
components as sectors of it.
• It is useful for comparing the components with the total value.
• Data are expressed as percentages of the total value.
• The total value is equated to 360 degrees.
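The rule that the total value is equated to 360 degrees can be turned into a short calculation.
The following is a minimal Python sketch, assuming hypothetical expenditure figures (the category
names and values are illustrative and do not come from the text). It converts each component into
its percentage of the total and its central angle in the circle.

```python
def pie_slices(values):
    """Return {label: (percentage_of_total, central_angle_in_degrees)}."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {
        label: (100 * value / total, 360 * value / total)
        for label, value in values.items()
    }


# Hypothetical monthly expenditure, for illustration only.
expenditure = {"Food": 450, "Rent": 300, "Transport": 150, "Savings": 100}

slices = pie_slices(expenditure)
for label, (percentage, angle) in slices.items():
    print(f"{label}: {percentage:.1f}% of total, {angle:.1f} degrees")
```

Because every angle is the same fraction of 360 that the component is of the total, the angles
always add up to the full circle, whatever the input values are.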
Line Graph or Stick Graph or Line Chart or line plot or Curve Chart
A line chart is the most basic type of chart used in finance. It is created by connecting a
series of recorded data points with a line. Line charts are ideal for representing trends over time.
Other graphical representations include:
• Frequency polygon
• Cumulative frequency curve or Ogive
• Pictogram
• Stem leaf diagram
• Scatter diagram
Mapping of Data
Data mapping is, in the simplest terms, knowing where your information is stored.
In its simplest form, data mapping is about relationships. In particular, it is the process of
specifying how one information set relates, or maps, to another. Consider an information set that
includes a list of people and their contact information. The list contains names, addresses, city,
province or state, and postal code for each person. Also, consider a second information set that
includes a list of people and their music preferences. This list includes listener, artist, album name,
and song name for each listener. The lists are self-contained, somewhat related, but distinct.
Suppose that you wanted to create a mailing list of people who like a particular artist. You can't
quickly get this information because there is no direct way to relate one information set to the
other. The solution is to create a mapping between the name in the first information set and the
listener in the second information set. The specification of the relationship is called a data mapping.
From there, you simply search the related or combined information set for all listeners in the list
that like that particular artist. This gives you the corresponding mailing addresses.
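The mailing-list example can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the people, addresses, artists, and the helper name mailing_list are all
hypothetical, and the mapping is taken to be an exact match between the "name" field of the first
set and the "listener" field of the second.

```python
# Hypothetical information sets, for illustration only.
contacts = [
    {"name": "Asha", "address": "12 Lake Road", "city": "Pune",
     "state": "MH", "postal_code": "411001"},
    {"name": "Ravi", "address": "7 Hill Street", "city": "Bhopal",
     "state": "MP", "postal_code": "462001"},
]
preferences = [
    {"listener": "Asha", "artist": "Artist X", "album": "Album 1", "song": "Song A"},
    {"listener": "Ravi", "artist": "Artist Y", "album": "Album 2", "song": "Song B"},
    {"listener": "Asha", "artist": "Artist Y", "album": "Album 2", "song": "Song C"},
]


def mailing_list(contacts, preferences, artist):
    """Relate the two sets through the mapping contacts.name <-> preferences.listener."""
    # Step 1: find every listener who likes the artist (second information set).
    fans = {p["listener"] for p in preferences if p["artist"] == artist}
    # Step 2: follow the mapping back to the contact details (first information set).
    return [c for c in contacts if c["name"] in fans]


result = mailing_list(contacts, preferences, "Artist Y")
for person in result:
    print(person["name"], person["address"], person["city"], person["postal_code"])
```

Without the name-to-listener mapping there would be no way to move from a music preference to a
postal address; the mapping is the only link between the two lists.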
Data mapping, in its simplest terms, maps source data fields to their related target data
fields. For example, the value of a source data field, say A, goes into a target data field X.
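A field-level mapping of this kind can be written down as a small lookup table. The sketch below
is a minimal illustration, not a definitive implementation: the field names B, Y, and C, the
constant FIELD_MAP, and the function apply_mapping are assumptions introduced here; only the
A-to-X pair comes from the text.

```python
# Hypothetical mapping specification: source field -> target field.
FIELD_MAP = {"A": "X", "B": "Y"}


def apply_mapping(record, field_map):
    """Copy each mapped source field into its target field.

    A KeyError is raised if the source record lacks a mapped field, so a
    mismatch surfaces immediately instead of silently losing data.
    Source fields that are not in the mapping are left out of the result.
    """
    return {target: record[source] for source, target in field_map.items()}


source_record = {"A": 10, "B": "hello", "C": "not mapped"}
target_record = apply_mapping(source_record, FIELD_MAP)
print(target_record)
```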
Data mapping is essentially a way to surface and prevent issues ahead of time, before they create
bigger problems later. The benefits are:
• It neutralizes the potential for data errors and mismatches.
• It aids the data standardization process.
• It makes intended data destinations clearer and easier to understand.
The following are a few of the significant challenges that can arise with data mapping:
• Inaccuracy: Any process undertaken by humans can turn into a liability since the potential
for errors and misinformed decisions is so high. Inaccurate, duplicate or otherwise decayed
data has little use to the various teams in your organization as it can provide false insights
that take the company further from its goals, not closer.
• Time-wasting: In-house teams already have enough responsibility on their plates. Tasking
them with mapping data means time spent double-checking and re-working scripts and
schemas to approach a high level of accuracy and certainty. And if fields are mapped
incorrectly, it can result in significant data loss and even more re-work.
• Changes: Rarely can you "set it and forget it" with a data map. Changes can occur at any
time (to standards, reporting requirements, software processes, and systems), which
makes any prior data map obsolete.
Data Interpretation is an extension of mathematical skill and accuracy: it is the drawing of
conclusions and inferences from comprehensive data presented numerically in tabular form or
through an illustration, viz. graphs, pie charts, etc. In other words, the act of organising and
interpreting data to get meaningful information is Data Interpretation.
Data Interpretation aims to test not only quantitative skill but also relative, comparative, and
analytical ability.
Important tips:
• Before solving the questions of Data Interpretation, you must know the different
types of representation of data and the basic mathematical calculation.
• Data Interpretation is an estimation of results based on some data in tabular as well as
graphical form.
• The questions are based on the information given in tables and graphs. You have to
interpret the information presented and to select the appropriate data for answering
the questions.
• Get a general picture of the information before reading the question. Read the given
titles carefully and try to understand the nature of the data.
• Data Interpretation questions do not require extensive calculations and
computations. Most questions simply require reading the data correctly and carefully
and putting them to use directly with common sense.
• Be careful while dealing with units.
• To make reading easier and to avoid errors, keep graphs straight while observing them.
• Be prepared to apply basic mathematical rules, principles, and formulae.
• Since one of the major benefits of graphs and tables is that they present data in a form
that enables you to readily make comparisons, use this visual attribute of graphs and
tables to help you answer the questions. Where possible, use your eyes instead of
your computational skills.
Example: The following bar graph shows the number of tickets sold by six students A, B, C,
D, E, and F during a fair. Observe the graph and answer the questions based on it.
Ans: (d) From the graph given in the question, the least number of tickets was sold by D, who
sold 7 tickets.
Example: Study the following table and answer the questions given below it.
Q. 1: Which of the following units shows a continuous increase in the production of sugar over
the months?
a) B b) A c) C d) D
Ans. (b)
Q. 2: In the case of Unit E, in which of the following pairs of months was the production of
sugar equal?
a) June & July b) April & June c) July & August d) April & May
Ans. (d)
Q. 3: In the month of June, how many units have a share of more than 25% of the total
production of sugar?
a) One b) Three c) Two d) Four
Ans. (a)
Ans. (b)
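The three kinds of question above (continuous increase, equal months, and share of the total)
each reduce to a simple check on a table. The sketch below shows those checks in Python on a
hypothetical table: the units P, Q, R, S and all production figures are invented for illustration,
they are not the table from the example, and the results are not meant to match the answers given
above.

```python
from itertools import combinations

# Hypothetical production table (units of sugar per month), for illustration only.
months = ["April", "May", "June", "July"]
production = {
    "P": [100, 120, 150, 160],
    "Q": [200, 180, 210, 190],
    "R": [90, 90, 40, 110],
    "S": [60, 70, 100, 80],
}


def is_continuously_increasing(series):
    """True if every month's value is higher than the previous month's."""
    return all(earlier < later for earlier, later in zip(series, series[1:]))


def equal_month_pairs(series, month_names):
    """Return every pair of months in which the values were equal."""
    labelled = list(zip(month_names, series))
    return [(m1, m2) for (m1, v1), (m2, v2) in combinations(labelled, 2) if v1 == v2]


def units_above_share(table, month_names, month, threshold_pct):
    """Return the units whose share of that month's total exceeds threshold_pct."""
    i = month_names.index(month)
    total = sum(values[i] for values in table.values())
    return [unit for unit, values in table.items()
            if 100 * values[i] / total > threshold_pct]


increasing = [unit for unit, series in production.items()
              if is_continuously_increasing(series)]
equal_pairs = equal_month_pairs(production["R"], months)
large_share = units_above_share(production, months, "June", 25)

print("Continuously increasing units:", increasing)
print("Months with equal production for unit R:", equal_pairs)
print("Units with more than 25% of June production:", large_share)
```

In an examination the same checks are done by eye: scan a row for any drop, look for repeated
values, and compare each entry with one quarter of the column total instead of computing every
percentage.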
Data Governance refers to the organizational bodies, rules, decision rights, and accountabilities
of people and information systems as they perform information-related processes.
It is an umbrella term for an emerging discipline that consists of a number of different practices
for data management, data quality, business process management, and risk management. The
goal of data governance is to ensure that the data serves the organisational purposes in a
sustainable way.
According to Wikipedia, “data governance encompasses the people, processes, and information
technology required to create a consistent and proper handling of an organisation’s data across
the business enterprise. Goals may be defined at all levels of the enterprise and doing so may aid
in acceptance of processes by those who will use them.”
In summary, “Data Governance includes the people, processes, and technologies
needed to manage and protect the company’s data assets in order to guarantee generally
understandable, correct, complete, trustworthy, secure and discoverable corporate data.”
Data quality, data management, and data migration initiatives are booming as a result of the
growth in data, demand, and regulation. As these data initiatives increase, they need governance
to ensure they fit the needs of the organisation and work with one another.
Effective data governance creates a framework for the use of data that fits each organisation.
Data governance improves operational efficiency and application effectiveness, and minimizes risk.
Because of data governance, the right people not only get the right information at the right
time, but also get it in the right way, both for their immediate purposes and in a way that
works with the data framework for the whole organisation.
The following are the most common barriers to success for data governance initiatives:
• Organisational: Different groups within an organization must communicate and coordinate well
with one another.
• Data quality, data management, and data migration integration: Applications and data must
speak to one another, and this must be addressed upfront and planned for in any integration
initiative.
• Accountability and ownership of data: People must be held accountable for information assets
and supported with technology to ensure the integrity of the assets.
• Cost: Data governance initiatives must be implemented in such a way that costs are recouped
and business value is proven.
The following are the steps using a repeatable technological framework to ensure effective data
governance: