Dadm Module-1
Croxton and Cowden: “Statistics is defined as the collection, presentation, analysis and
interpretation of numerical data.”
Scope:
In education and psychology, statistics has found wide application, such as in determining the
reliability and validity of a test, factor analysis, etc.
Marketing
As per Philip Kotler and Gary Armstrong, marketing “identifies customer needs and wants,
determines which target markets the organisation can serve best, and designs appropriate
products, services and programs to serve these markets.”
Marketing is all about creating and growing customers profitably. Statistics is used in almost
every aspect of creating and growing customers profitably. Statistics is extensively used in
making decisions regarding how to sell products to customers. Also, intelligent use of statistics
helps managers to design marketing campaigns targeted at the potential customers. Marketing
research is the systematic and objective gathering, recording and analysis of data about aspects
related to marketing.
The use of statistics is indispensable in forecasting sales, market share and demand for various
types of industrial products.
Factor analysis, conjoint analysis and multidimensional scaling are invaluable tools which are
based on statistical concepts, for designing of products and services based on customer
response.
Finance
Uncertainty is the hallmark of the financial world. All financial decisions are based on
“Expectation” that is best analysed with the help of the theory of probability and statistical
techniques. Probability and statistics are used extensively in designing new insurance
policies and in fixing their premiums. Statistical tools and techniques are
used for analysing and quantifying risk, for valuing derivative instruments, and for
comparing return on investment in two or more instruments or companies.
Beta of a stock or equity is a statistical tool for comparing volatility, and is highly useful for
selection of portfolio of stocks.
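As a rough illustration of how beta compares volatility, it is commonly computed as the covariance
of the stock's returns with the market's returns, divided by the variance of the market's returns.
The sketch below assumes that definition; the function name `beta` and the return figures are
hypothetical and are not taken from these notes.

```python
from statistics import mean

def beta(stock_returns, market_returns):
    """Beta = Cov(stock, market) / Var(market)."""
    if len(stock_returns) != len(market_returns) or len(stock_returns) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    s_bar = mean(stock_returns)
    m_bar = mean(market_returns)
    # Sums of cross-products and squares; the (n - 1) divisors cancel in the ratio.
    cov_sum = sum((s - s_bar) * (m - m_bar)
                  for s, m in zip(stock_returns, market_returns))
    var_sum = sum((m - m_bar) ** 2 for m in market_returns)
    if var_sum == 0:
        raise ValueError("market returns have zero variance")
    return cov_sum / var_sum

# Hypothetical monthly returns (in %), for illustration only.
market = [1.0, 2.0, -1.0, 3.0, 0.0]
stock = [2.0, 4.0, -2.0, 6.0, 0.0]  # moves twice as much as the market
stock_beta = beta(stock, market)
print(stock_beta)
```

A beta above 1 suggests the stock has moved more than the market over the sample period, and a
beta below 1 suggests it has moved less.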
The most sophisticated traders in today’s stock markets are those who trade in “derivatives”, i.e.
financial instruments whose price depends on the price of some other underlying asset.
Economics
Statistical data and methods render valuable assistance in the proper understanding of
economic problems and the formulation of economic policies. Most economic phenomena and
indicators can be quantified and dealt with using statistically sound logic.
In fact, Statistics became so integrated with Economics that it led to the development of a new
subject called Econometrics, which deals with economic issues through the use of
Statistics.
Operations
The field of operations is about transforming various resources into products and services in the
place, quantity, cost, quality and time required by the customers. Statistics plays a very
useful role at the input stage through sampling inspection and inventory management, in the
process stage through statistical quality control and the Six Sigma method, and in the output stage
through sampling inspection. The term Six Sigma quality refers to a situation where there are only
3.4 defects per million opportunities.
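The defects-per-million-opportunities figure can be computed directly from inspection counts. The
following is a minimal sketch; the function name `dpmo` and the inspection numbers are hypothetical
and chosen only to land on the 3.4 threshold.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities (DPMO)."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("total opportunities must be positive")
    return defects * 1_000_000 / total_opportunities

# Hypothetical data: 17 defects found in 500,000 units,
# each unit offering 10 opportunities for a defect.
rate = dpmo(defects=17, units=500_000, opportunities_per_unit=10)
meets_six_sigma = rate <= 3.4
print(rate, meets_six_sigma)
```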
3. List out the types of bar diagrams?
The following are the different types of bar diagrams:
If data based on a single variable are to be represented, the simple bar diagram can be
used. For example, figures of production, profits, sales, etc. for various years may be
represented with the help of simple bar diagrams. From a simple bar diagram the reader can easily see the
variation in the characteristic under study with respect to time or some other given factor, because
the width of each bar is the same and only the lengths of the bars vary. In this representation the length
of the bars is taken along the vertical axis and the other given factor along the horizontal axis. Simple
bar diagrams are very popular in regular practice.
In a multiple bar diagram, two or more bars are constructed together. The
multiple bars are constructed either for the different components of the total or for
the magnitudes of the variables. All the bars of one group of data are drawn together
so that the bars of different groups can be compared properly. The
height of each bar represents the magnitude of the component, just
as in a simple bar diagram. In this diagram a space is left between the vertical axis
and the first bar of the first group, but no space is left between the
bars of the same group. A space must also be left between
two different groups of bars. In multiple bar diagrams two or more groups of
interrelated data are presented. The technique of drawing such diagrams is
the same as that of the simple bar diagram. The only difference is that, since more than
one component is represented in each group, different shades, colours, dots
or cross-hatching are used to distinguish between the bars of the same group, and the same
symbols are used for the corresponding components of the other groups.
Multiple bar diagrams are very useful when either the number of related
components is large or the change in the values of the components of one variable
is important.
A subdivided bar diagram drawn on the basis of the percentage of the total is known
as a percentage bar diagram. When such diagrams are drawn, the length of all the
bars is kept equal to 100 and segments are formed in these bars to represent the
components on the basis of their percentage of the aggregate. First, the total of the
given variable is taken as equal to 100. Then the percentage is calculated for each
component of the variable. After that, the cumulative percentages are
calculated for every component. Finally, the bars are subdivided according to the
cumulative percentages and presented like subdivided bar diagrams.
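The percentage and cumulative-percentage steps for one bar can be sketched as follows. The
function name `percentage_segments` and the cost figures are hypothetical; only the calculation is
shown, not the drawing of the bar.

```python
def percentage_segments(components):
    """Return (name, percentage, cumulative percentage) for one bar."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total must be positive")
    segments = []
    cumulative = 0.0
    for name, value in components.items():
        pct = value * 100 / total       # share of the aggregate
        cumulative += pct               # where this segment ends on the bar
        segments.append((name, round(pct, 2), round(cumulative, 2)))
    return segments

# Hypothetical cost breakdown for one year.
costs = {"materials": 50, "labour": 30, "overheads": 20}
segments = percentage_segments(costs)
for name, pct, cum in segments:
    print(f"{name:10s} {pct:6.2f}%  cumulative {cum:6.2f}%")
```

The last cumulative value should come to 100, which is the common length of every bar in the
diagram.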
For representing net quantities (excess or deficit), e.g. net profit, net loss, net exports,
net imports, etc., deviation bar diagrams are used. With this kind of bar
we can represent both positive and negative values. Positive values
are drawn above the base line and negative values below it.
Tables are classified into different types on different bases. On the basis of the objective or
purpose of the study, tables are classified into two types, viz. (1) General purpose table and
(2) Specific purpose table or summary table. On the basis of the nature of the data, tables are
classified into two types, viz. (1) Primary or original table and (2) Secondary or derived table.
Further, on the basis of the elements or characteristics covered, tables are broadly classified
into two types, viz. (1) Simple and (2) Complex table.
A complex table contains comprehensive information and presents it in
two or more interrelated categories. For example, if there are two coordinate factors, the table
is called a two-way table or bivariate table; if the number of coordinate groups is three, it is a
case of three-way tabulation. Similarly, if it is based on more than three coordinate groups, the
table is known as a higher-order tabulation or a manifold tabulation.
Classification means grouping of related facts into different classes. Information in one class
differs from that in other classes with respect to some characteristic. Sorting particulars
according to one basis of classification and then on another basis is called cross-classification.
This process can be repeated as many times as there are possible bases of classification.
Types of Classification
Broadly, data can be classified under the following categories:
In geographical classification, data are classified on the basis of location, region, etc. For
example, data regarding the production of sugarcane, wheat or rice may be presented according to
the four main regions of India.
Geographical classification is usually listed in alphabetical order for easy reference. Items may
also be listed by size to emphasise the magnitude of the areas under consideration, such as
ranking the states based on population.
Time series data are usually listed in chronological order, normally in ascending order of time,
such as 2001, 2002, …. When the major emphasis falls on the most recent events, a reverse time
order may be used.
Today, efficient planning is a must for the economic development of almost all countries,
particularly the developing economies. To achieve this end, statistical data relating to
production, consumption, birth, death, investment and income are of paramount importance.
Statistics of public finance help in deciding the taxes to impose, the subsidies to provide, the
spending on various heads, the amount of money to be borrowed or lent, etc. So we cannot think of
Statistics without Economics or Economics without Statistics.
Sampling techniques and estimation theory are very powerful and indispensable tools for
conducting any social survey pertaining to any stratum of society, and then analysing the results
and drawing valid inferences. The most important application of statistics in sociology is in the
field of demography, for studying mortality (death rates), fertility (birth rates), marriages,
population growth and so on.
Changes in demand, supply, habits, fashion etc. can be anticipated with the help of statistics.
Statistics is of utmost significance in determining prices of the various products, determining
the phases of boom and depression etc. Use of statistics helps in smooth running of the business,
in reducing the uncertainties and thus contributes towards the success of business.
A researcher is required to lean upon his knowledge and skills in statistical methods.
7. List out the applications of measures of central tendency and dispersion for business decision
making?
Arithmetic Mean
Definition
The most important measure of location is the mean, or average value, of a variable. The mean
provides a measure of central location for the data. If the data are for a sample, the mean is
denoted by x̄; if the data are for a population, the mean is denoted by the Greek letter μ. (David
R. Anderson et al.)
The arithmetic mean is considered an ideal average. It is frequently used in all aspects of
business, e.g. the number of items produced per day on a large assembly line or the number of orders
received per month by a firm. Further, in economic analysis the arithmetic mean is used extensively
to calculate average production, average wage, average cost, per capita income, exports,
imports, consumption, prices, etc.
Median
Definition
The median is another measure of central location. The median is the value in the middle when
the data are arranged in ascending order. With an odd number of observations, the median is
the middle value. An even number of observations has no single middle value. In this case, we
follow convention and define the median as the average of the values for the middle two
observations.
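The odd and even cases described above can be sketched in a few lines. The function below is an
illustrative implementation written for these notes (Python's standard `statistics.median` follows
the same convention), and the sample values are hypothetical.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of middle two

odd_case = median([7, 1, 5])       # sorted: 1, 5, 7
even_case = median([3, 1, 4, 2])   # sorted: 1, 2, 3, 4
print(odd_case, even_case)
```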
Geometric Mean
Definition
Geometric mean is well defined only for sets of positive real numbers. It is calculated by
multiplying all the numbers (call the number of numbers n) and taking the nth root of the product.
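A minimal sketch of this calculation is given below. The function name `geometric_mean` and the
growth factors are hypothetical; averaging growth factors is a commonly cited use of the geometric
mean and is assumed here only for illustration.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty set of positive numbers")
    return math.prod(values) ** (1 / len(values))

# Hypothetical growth factors: sales doubled in year 1 and grew eightfold in year 2.
growth_factors = [2.0, 8.0]
gm = geometric_mean(growth_factors)  # square root of 16
print(gm)
```

Growing by the geometric mean factor in each of the two years would give the same overall growth
as the two actual factors combined.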
Harmonic Mean
Definition
Harmonic mean is another way to calculate the average of a set of numbers. Here the number of
elements is divided by the sum of the reciprocals of the elements. For positive values, the
harmonic mean is never greater than the geometric mean or the arithmetic mean.
Harmonic mean is applied in the problems where small items must get more relative
importance than the large ones. It is useful in cases where time, speed, values given in
quantities, rate and prices are involved. But in practice, it has little applicability.
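A standard speed example may make this concrete. The sketch below assumes the usual formula
(number of values divided by the sum of their reciprocals); the function name `harmonic_mean` and
the speeds are hypothetical.

```python
def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty set of positive numbers")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: the same distance is covered at 40 km/h going and 60 km/h returning.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)           # correct average speed for the round trip
arithmetic_average = sum(speeds) / len(speeds)  # overstates the average speed
print(average_speed, arithmetic_average)
```

Because more time is spent travelling at the slower speed, the true average speed is pulled below
the arithmetic mean of the two speeds.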
A measure of dispersion shows the spread of data. It explains how the data values differ from one
another, delivering a precise picture of the distribution of data. Dispersion is the degree to
which values in a distribution differ from the average of the distribution.
A measure of spread, also called a measure of dispersion, is used to describe the variability in
a sample or population. Measures of dispersion are used to estimate “normal” values of a
dataset and are important for describing the spread of the data, or its
variation around a central value. A measure of dispersion is usually used in conjunction with a
measure of central tendency, such as the mean or median, to provide an overall description of a
set of data.
Data and numerical information have played a very vital role in the growth and development
of agriculture, especially in the developed countries.
In an agrarian country like India, having about 70.5 million operational holdings over an
aggregate of 162 million hectares, the utility of agricultural statistics is even more important,
though it has not been utilized adequately so far. Quantitative agricultural research is, in
fact, largely based on statistical data.
The advent of modern data processing equipment has enabled agricultural land use
planners to utilize new techniques and methodologies, and has created demand for still more data.
The agriculture of a place, in fact, is the result of many physical, social, cultural, economic,
institutional, technological, political and psychological forces interacting upon each other and,
therefore, the growth, development and problems of agriculture cannot be solved by fractional
and isolated approaches. In overcoming these problems, a multidisciplinary approach is
required and a large body of data is to be incorporated in any project of research.
Thus, the researchers and planners have become increasingly aware of the utility of data. The
facts and figures about agriculture, whether they relate to land use, irrigation, forestry,
agricultural production, yield and prices of the agricultural commodities are called agricultural
data.
The agricultural data refer to information presented quantitatively, that is, figures on the various
aspects of agriculture of a macro or micro region. The region may be a country as a whole or a
state, or district, or block, or village, or farm, or the field itself. The agricultural data are helpful
in estimating, planning and forecasting the agricultural operation of a given unit of area at a
given point of time.
Agricultural statistics has a very wide coverage and its scope is widening. Detailed
agricultural statistics are required from the national level down to the village and farm levels
for agricultural policy decisions, planning agricultural development and estimating agricultural
and national income.
(i) Land utilization and irrigation, including the net area sown, gross cultivated area, current
fallow, cultivable waste, fallow other than current fallow, other uncultivated land, irrigated area
in kharif and rabi seasons, etc.
(ii) Forestry.
(v) Statistics relating to agricultural organization and farming structure, e.g., persons employed
in agriculture, their status, land held under various tenure, number of draught animals,
implements, farm building, etc.
(vi) Statistics and economics of production and marketing, e.g., cost of production, input-output
ratio, marketing changes, marketing spread over, etc.
(vii) General statistics, literacy among those employed in agriculture, health, sanitation.
(viii) Statistics relating to weather and climate, rainfall and its distribution, temperature and its
range, soil and its pH value, etc.
9. Enumerate the various methods of data collection and explain the questionnaire
method?
Quantitative techniques for market research and demand forecasting usually make use of
statistical tools. In these techniques, demand is forecast based on historical data. These methods
of primary data collection are generally used to make long-term forecasts. Statistical methods
are highly reliable as the element of subjectivity in them is minimal.
Smoothing Techniques
In cases where the time series lacks significant trends, smoothing techniques can be used. They
eliminate random variation from the historical demand, which helps in identifying patterns and
demand levels to estimate future demand. The most common smoothing techniques used in
demand forecasting are the simple moving average method and the weighted
moving average method.
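The two moving average methods can be sketched as below. The function names, the demand figures
and the weights are all hypothetical, and the weights are assumed to be listed from oldest to
newest and to sum to 1.

```python
def simple_moving_average(demand, window):
    """Next-period forecast: plain average of the last `window` observations."""
    if window <= 0 or window > len(demand):
        raise ValueError("window must be between 1 and the number of observations")
    return sum(demand[-window:]) / window

def weighted_moving_average(demand, weights):
    """Next-period forecast: weighted average of the most recent observations."""
    if not weights or len(weights) > len(demand):
        raise ValueError("need between 1 and len(demand) weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    recent = demand[-len(weights):]
    return sum(w * d for w, d in zip(weights, recent))

# Hypothetical monthly demand (units), oldest first.
demand = [100, 120, 110, 130]
sma_forecast = simple_moving_average(demand, window=3)
wma_forecast = weighted_moving_average(demand, weights=[0.2, 0.3, 0.5])
print(sma_forecast, wma_forecast)
```

Giving the largest weight to the most recent month makes the weighted forecast respond faster to
recent changes than the simple average does.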
Barometric Method
Also known as the leading indicators approach, this method is used by researchers to speculate on
future trends based on current developments. When past events are considered to predict future
events, they act as leading indicators.
Qualitative Methods:
Qualitative methods are especially useful in situations where historical data are not available, or
where there is no need for numbers or mathematical calculations. Qualitative research is closely
associated with words, sounds, feelings, emotions, colors, and other elements that are
non-quantifiable. These techniques are based on experience, judgment, intuition, conjecture,
emotion, etc.
Quantitative methods do not provide the motive behind participants’ responses, often don’t
reach underrepresented populations, and span long periods to collect the data. Hence, it is best
to combine quantitative methods with qualitative methods.
Surveys
Surveys are used to collect data from the target audience and gather insights into their
preferences, opinions, choices, and feedback related to products and services. Most survey
software offers a wide range of question types to select from.
Polls
A poll comprises a single question, either single-choice or multiple-choice. When a quick
pulse of the audience’s sentiments is required, you can go for polls. Because they are short, it is
easier to get responses from people.
Similar to surveys, online polls, too, can be embedded into various platforms. Once the
respondents answer the question, they can also be shown how they stand compared to others’
responses.
Interviews
In this method, the interviewer asks questions either face-to-face or through telephone to the
respondents. In face-to-face interviews, the interviewer asks a series of questions to the
interviewee in person and notes down responses. In case it is not feasible to meet the person,
the interviewer can go for a telephonic interview. This form of data collection is suitable when
there are only a few respondents. It is too time-consuming and tedious to repeat the same
process if there are many participants.
Questionnaire
A questionnaire is a printed set of questions, either open-ended or closed-ended. The
respondents are required to answer based on their knowledge and experience with the issue
concerned. The questionnaire is a part of the survey, whereas the questionnaire’s end-goal may
or may not be a survey.
In statistics, for a moderately skewed distribution, there exists a relation between mean, median
and mode. This mean median and mode relationship is known as the “empirical
relationship” which is defined as Mode is equal to the difference between 3 times the
median and 2 times the mean. This relation has been discussed in detail below.
To recall,
● Mean is the average of the data set, which is calculated by adding all the data values
together and dividing the sum by the total number of data values.
● Median is the middle value among the observed set of values and is calculated by
arranging the values in ascending order or in descending order and then choosing the
middle value.
● Mode is the number from a data set which has the highest frequency and is calculated
by counting the number of times each data value occurs.
In case of a moderately skewed distribution, the difference between mean and mode is almost
equal to three times the difference between the mean and median. Thus, the empirical mean
median mode relation is given as:
Mean - Mode = 3 (Mean - Median)
Or
Mode = 3 Median - 2 Mean
Either of these two equations can be used as per convenience, since by expanding
the first representation we get the second one as shown below:
Mean - Mode = 3 Mean - 3 Median
Mode = 3 Median - 2 Mean
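The empirical relationship stated above (mode equals three times the median minus two times the
mean) can be used to estimate the mode when only the mean and median are known. The function name
`estimate_mode` and the figures below are hypothetical.

```python
def estimate_mode(mean_value, median_value):
    """Empirical relation for a moderately skewed distribution."""
    return 3 * median_value - 2 * mean_value

# Hypothetical summary figures for a moderately skewed distribution.
mean_value = 30
median_value = 28
mode_estimate = estimate_mode(mean_value, median_value)
print(mode_estimate)

# Check the equivalent form: Mean - Mode = 3 (Mean - Median).
print(mean_value - mode_estimate == 3 * (mean_value - median_value))
```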
However, we can define the relation between mean, median and mode for different types of
distributions as explained below:
If a frequency distribution graph has a symmetrical frequency curve, then mean, median and
mode will be equal.
Mean = Median = Mode
In case of a positively skewed frequency distribution, the mean is generally greater than the median
and the median is generally greater than the mode.
Mean > Median > Mode
In case of a negatively skewed frequency distribution, the mean is generally less than the median
and the median is generally less than the mode.
Mean < Median < Mode
11. Measures of dispersion are considered to be the statistical device to measure the variability
in a series. In this context describe the various types of measures of dispersion?
Measures of Dispersion
In statistics, the measures of dispersion help to interpret the variability of data, i.e. to know how
homogeneous or heterogeneous the data are. In simple terms, they show how squeezed or
scattered the variable is.
Types of Measures of Dispersion
There are two main types of dispersion methods in statistics which are:
Absolute Measure of Dispersion
Relative Measure of Dispersion
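To make the distinction concrete, the sketch below computes two absolute measures (range and
standard deviation, expressed in the units of the data) and one relative measure (coefficient of
variation, a unit-free percentage). These particular measures are assumed here as commonly cited
examples of each type; the function name `dispersion_summary` and the data are hypothetical.

```python
from statistics import mean, pstdev

def dispersion_summary(values):
    """Absolute measures (range, standard deviation) and a relative one (CV)."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    sd = pstdev(values)  # population standard deviation
    return {
        "range": max(values) - min(values),  # absolute: same units as the data
        "std_dev": sd,                       # absolute: same units as the data
        "cv_percent": sd / avg * 100,        # relative: unit-free percentage
    }

# Hypothetical daily output figures for one machine.
output = [2, 4, 4, 4, 5, 5, 7, 9]
summary = dispersion_summary(output)
print(summary)
```

Because the coefficient of variation has no units, it can be used to compare the variability of
series measured in different units or with very different averages.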
Census Method
A census method is that process of the statistical list where all members of a population are
analysed. The population relates to the set of all observations under concern. For instance, if
you want to carry out a study to find out student’s feedback about the amenities of your school,
then all the students of your school would form a component of the ‘population’ for your study.
(A) Census method
● A statistical investigation in which the data are collected for each and every
element/unit of the population is termed the census method.
● It is also known as ‘complete enumeration’ or ‘100% enumeration’ or ‘complete
survey’.
● Examples:
1. Demographic data on birth and death rates, literacy, workforce, life expectancy, size
and composition of a population
Merits
(1) Intensive study
● It provides intensive and in-depth information covering many facets of the problems.
● Example: In a population census, not only is the number of persons counted, but
information is also collected on various other parameters like the number of males and
females, age, education, marital status, occupational level, income, health conditions, etc.
(2) Results are more accurate and reliable
● Since, in this type of investigation, every item of the universe is taken into account,
the conclusions are more accurate and reliable.
Demerits
(1) Costly method
● Since the data are obtained for or from each and every unit of the population, it is a
very expensive method of investigation, especially in case of a large population size.
(2) Needs more time and manpower
● Since a large volume of data is to be collected, more time and manpower are required
for its collection, analysis, and interpretation.
(3) Not suitable for a large population
● This method is meaningless in the case of an infinite universe where the number of
items is unlimited.
In our country, the Government conducts the Census of India every ten years. The Census
collects information from households regarding their incomes, the earning members, the
total number of children, members of the family, etc. This method must take into account all
the units. It cannot leave out anyone in collecting data. Once collected, the Census of India
reveals demographic information such as birth rates, death rates, total population, population
growth rate of our country, etc. The last census was conducted in the year 2011.
13. Under which circumstances would it be ideal to use mean, median and mode?
1. Mean is the most frequently used measure of central tendency and is generally considered the
best measure of it. However, there are some situations where either the median or the mode is
preferred.
2. Median is the preferred measure of central tendency when:
   a. There are a few extreme scores in the distribution of the data. (NOTE:
      Remember that a single outlier can have a great effect on the mean.)
   b. There is an open-ended distribution. (For example, if you have a data field which
      measures number of children and your options are 0, 1, 2, 3, 4, 5 or
      “6 or more,” then the “6 or more” field is open-ended and makes calculating
      the mean impossible, since we do not know the exact values for this field.)
3. Mode is the preferred measure when data are measured on a nominal (and sometimes even
ordinal) scale.
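The effect of a single outlier on the mean, and the use of the mode for nominal data, can be seen
in a small sketch using Python's standard `statistics` module. The salary figures and colour
labels are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical monthly salaries (in thousands); one extreme value is then added.
salaries = [30, 32, 35, 36, 40]
salaries_with_outlier = salaries + [500]

mean_before, median_before = mean(salaries), median(salaries)
mean_after, median_after = mean(salaries_with_outlier), median(salaries_with_outlier)
print(mean_before, median_before)  # mean and median are close together
print(mean_after, median_after)    # the mean jumps; the median barely moves

# Nominal data: an average is meaningless, but the most frequent category is not.
preferred_colours = ["red", "blue", "red", "green", "red", "blue"]
most_common_colour = mode(preferred_colours)
print(most_common_colour)
```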
Meaning of Statistics:
Statistics, a branch of applied Mathematics, is regarded as mathematics applied to
observational data. Conceivably everything dealing with the collection, processing, analysis
and interpretation of numerical data belongs to the domain of statistics.
The second meaning of the term statistics refers to the statistical principles and methods
employed in the collection, processing, analysis and interpretation of any kind of data. In this
sense, it is a branch of applied mathematics and helps us to know the complex social
phenomena in a better way and lends precision to our ideas.
Another characteristic of statistics is that the data should be collected in a systematic manner.
Data collected in a haphazard manner will lead to difficulties in the process of analysis
and to wrong conclusions. A proper plan should be made and trained investigators should be used
to collect the data. If this is not done, the reliability of the
data decreases. So to get correct results the data must be collected in a precise manner.