570.assignment 2 Frontsheet - Fall2020
Student declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I understand that
making a false declaration is a form of malpractice.
Table of Contents
Introduction
I. Analysing and evaluating qualitative raw business data from a range of examples using appropriate
statistical methods
The differences between qualitative and quantitative analysis could be varied and they would be clarified as
follows
II. Applying a range of statistical methods used in business planning for quality, inventory and capacity
management
III. Using appropriate charts/tables to communicate findings of given variables
Frequency distribution table
Pie chart
Bar chart
Histogram
Histogram with normal curve
Histogram with scatter plot
Conclusion
Reference list
This assignment evaluates and analyses business data (financial knowledge, stock market), whether
microeconomic or macroeconomic, to show an understanding of current concerns and future trends or
plans by using a number of statistical methods. The variables may be nominal, ordinal, interval or ratio.
The nature and process of business and economic data and information from a variety of published
sources are explained and assessed in depth throughout the project. In addition, data, information and
knowledge are easier to break down using three main approaches: the descriptive, confirmatory and
exploratory approaches. Finally, the application of statistical methods in business planning is examined
by critically evaluating their pros and cons.
I. Analysing and evaluating qualitative raw business data from a range of examples using
appropriate statistical methods
The differences between qualitative and quantitative analysis could be varied and they would be
clarified as follows.
To begin with, Qualitative study is empirical research where the data are not in the form of numbers. It
is multimethod in focus, requiring an interpretive, naturalistic approach to its subject matter. This means
that qualitative researchers investigate objects in their natural environments, trying to make sense of, or
perceive, phenomena in terms of the meanings people bring to them (McLeod, 2019).
Interest in qualitative data arose from the dissatisfaction of some psychologists (e.g., Carl Rogers) with
the scientific research of psychologists such as the behaviorists. Because psychologists study people, the
traditional approach to science is not seen as an appropriate way of conducting research, since it fails to
capture the totality of human experience and the essence of what it is to be human. The exploration of
participants' experience is known as a phenomenological approach (Shields and Twycross, 2003).
The aim of qualitative research is to understand the social reality of individuals, groups and cultures as
nearly as possible as its participants feel it or live it. Thus, people and groups are studied in their natural
setting. Research following a qualitative approach is exploratory and aims to understand 'how' and 'why' a
specific phenomenon, or action, works as it does in a particular context (McLeod, 2019).
On the other hand, quantitative research collects data in numerical form that can be placed into
categories, put in rank order, or measured in units of measurement. This type of data can be used to
construct graphs and tables of raw data. The goal of quantitative researchers is to establish general laws
of behaviour and phenomena across various contexts. Research is used to test a hypothesis and
ultimately support or reject it (Streefkerk, 2019).
Experiments usually produce quantitative data, as they are concerned with measuring things. However,
other research methods, such as monitored observations and questionnaires, can also produce
quantitative information. For example, a rating scale or closed questions on a questionnaire generate
quantitative data, as these produce either numerical data or data that can be placed into categories
(e.g., "yes" or "no" answers).
Experimental methods restrict the ways in which a research participant can react to and express
appropriate social behaviour. Therefore, results are likely to be context-bound and simply a reflection
of the assumptions brought to the investigation by the researcher (Smeyers, 2001).
The main differences between Quantitative research and Qualitative research ((Raimo Streefkerk,
Descriptive statistics are now applied to the business data. In statistics, three terms that often come up
are the mean, the median and the mode, which are measures of central tendency. The mean (average) of
a data set is found by adding all numbers in the data set and then dividing by the number of values in
the set. The median is the middle value when a data set is ordered from least to greatest. The mode is
the number that occurs most often in a data set (Khan, n.d.).
Source: (Byjus, n.d.)
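As a minimal sketch of the three measures described above, the following Python snippet uses the
standard statistics module. The employee counts are hypothetical and are not taken from the survey
data used in this assignment.

```python
from statistics import mean, median, mode

# Hypothetical sample: number of employees at seven firms (illustrative only).
employees = [12, 20, 20, 26, 35, 48, 119]

average = mean(employees)      # sum of the values divided by their count
middle = median(employees)     # middle value of the sorted data
most_common = mode(employees)  # value that occurs most often

print("mean:", average, "median:", middle, "mode:", most_common)
```

In this made-up sample the single large firm pulls the mean above the median, which is the same
pattern reported later for the real data.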
A measure of variability is a summary statistic that reflects the amount of dispersion in a dataset: how
spread out are the values? While a measure of central tendency describes the typical value, measures of
variability describe how far the data points tend to fall from the centre. Variability is discussed in the
context of a distribution of values. Low dispersion means that the data points tend to be closely
clustered around the centre. High dispersion means they tend to fall further away (Frost, 2018).
Measures of variability are mathematical procedures that describe how the data are spread out
(Catherine, 2020).
They are:
-Range: a single number that represents the spread of the data. The range is found by subtracting
the smallest data value from the largest data value. Here, the smallest data value is 100 and the largest is
297. Therefore, the range is: 297 − 100 = 197.
-Standard deviation: a number reflecting how far, on average, each score is from the mean.
-Variance: the square of the standard deviation, indicating how the data are spread out around the mean.
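The three measures can be sketched in Python as follows. Only the smallest value (100) and the
largest value (297) come from the example above; the values in between are invented for
illustration, and the population formulas (pvariance, pstdev) are an assumed choice, since the text
does not say whether sample or population formulas were used.

```python
from statistics import pstdev, pvariance

# Smallest and largest values match the range example; the rest are made up.
values = [100, 150, 180, 210, 297]

data_range = max(values) - min(values)  # largest value minus smallest value
variance = pvariance(values)            # mean squared deviation from the mean
std_dev = pstdev(values)                # square root of the variance

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```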
Based on the above data table, the average number of employees across the three industries, namely
Non-metallic mineral products, Fabricated metal products and Wholesale services, is 117 people. Fifty
percent of firms, however, have a workforce of fewer than 26 employees. The most frequent value (the
mode) was 20 employees per company. The range of 6998 employees indicates great variation. The mean
of the second half minus the mean of the first half of the data set is 74 individuals. The standard
deviation is 451 individuals, showing a very strong dispersion around the mean value. A total of 348
companies agreed to answer questions regarding the
According to the one-sample t-test, the average operating time is approximately 55.55 hours per week,
which is larger than 52 hours. I therefore concur with the questionnaire statement.
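The logic of a one-sample t-test can be sketched with the Python standard library. The hours below
are hypothetical, not the survey responses, and the function name one_sample_t is introduced only
for this illustration. The p-value uses a normal approximation to the t distribution, which is an
assumption that is reasonable for a large sample such as the 348 firms mentioned above but only
rough for the ten made-up values here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and an approximate two-sided p-value."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t = (mean(sample) - mu0) / standard_error
    # Normal approximation to the t distribution (assumption: large n).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical weekly operating hours for ten firms (illustrative only).
hours = [48, 52, 55, 60, 56, 58, 50, 62, 54, 57]
t_stat, p_value = one_sample_t(hours, mu0=52)
print("t =", t_stat, "approximate p =", p_value)
```

A positive t statistic with a small p-value would support the claim that the mean operating time
exceeds the test value of 52 hours.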
Comparing the two categories of product services, there is a substantial difference in operating
hours between Non-metallic mineral products and Fabricated metal products in terms of the mean,
standard deviation, variance and minimum. However, the maximum is the same for both, at exactly
168 hours. In conclusion, the first and third P-values are accepted; the second, however, breaks
the similarity between the two categories, so a difference does exist.
As can be seen from the above table, the relationship between sales and labour is assessed by the
P-value. However, the P-value is 0.1873, equivalent to a confidence level of only 81.27%, which
indicates that there is no significant relationship between Sales and
II. Applying a range of statistical methods used in business planning for quality, inventory and
capacity management
A probability distribution is a statistical function that describes all the possible values and
probabilities that a random variable can take within a given range. This range is bounded by the
minimum and maximum possible values, but precisely where a possible value is likely to be
plotted on the probability distribution depends on a number of factors. These factors include
the mean (average), standard deviation, skewness, and kurtosis of the distribution (Hayes, 2020).
Typically, the data generating process of some phenomenon will dictate its probability
distribution. For a continuous variable, this distribution is described by its probability density
function. Probability distributions can also be used to create cumulative distribution functions
(CDFs), which add up the probability of occurrences cumulatively and always start at zero
and end at 100%.
Source: (Jaiswal, 2018)
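To illustrate how a cumulative distribution function adds up probabilities, the sketch below builds
a CDF from a small discrete distribution. The probabilities are hypothetical (the number of
defective items found in a batch) and were chosen only so that they sum to 100%.

```python
from itertools import accumulate

# Hypothetical distribution: probability of finding 0 to 4 defective items.
pmf = {0: 0.50, 1: 0.25, 2: 0.15, 3: 0.07, 4: 0.03}

# A running total of the probabilities gives the cumulative distribution.
cdf = dict(zip(pmf, accumulate(pmf.values())))

for outcome, cumulative in cdf.items():
    print(f"P(X <= {outcome}) = {cumulative:.2f}")
```

The cumulative values never decrease, and the last one reaches 1, that is, 100%.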
The binomial distribution is a probability distribution that summarizes the likelihood that a value
will take one of two independent values under a given set of parameters or assumptions.
The fundamental assumptions of the binomial distribution are that there is only one outcome for
each trial, that each trial has the same probability of success, and that the trials are mutually
exclusive, or independent of one another (Barone, 2020). The binomial distribution summarizes
the number of trials, or observations, when each trial has the same probability of attaining one
particular value. It gives the probability of observing a specified number of successful outcomes
in a specified number of trials. In social science statistics, the binomial distribution is
also used as a building block for models of dichotomous outcome variables, such as whether a
Republican or Democrat will win an upcoming election or whether a person will die within a
certain period of time.
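A short sketch of the binomial probability formula, P(X = k) = C(n, k) p^k (1 - p)^(n - k), is given
below. The quality-control scenario (10 items, each defective with probability 0.1) is a
hypothetical example and does not come from the assignment data.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical quality check: 10 items, each defective with probability 0.1.
n_items, p_defect = 10, 0.1
prob_no_defects = binomial_pmf(0, n_items, p_defect)
prob_at_most_one = sum(binomial_pmf(k, n_items, p_defect) for k in range(2))

print("P(no defects) =", prob_no_defects)
print("P(at most one defect) =", prob_at_most_one)
```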
The normal distribution, also known as the Gaussian distribution, is a probability distribution that
is symmetric about the mean, which indicates that data near the mean occur more frequently
than data far from the mean. The normal distribution appears as a bell curve in graph form
(Chen, 2019). Let's look at an example of a pizza delivery. Assume that a pizza restaurant has a
mean delivery time of 30 minutes and a standard deviation of 5 minutes. Using the Empirical
Rule, we can estimate that 68 percent of the delivery times are between 25 and 35 minutes
(30 ± 5), 95 percent are between 20 and 40 minutes (30 ± 2 × 5), and 99.7 percent are between
15 and 45 minutes (30 ± 3 × 5). The chart below graphically illustrates this property.
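The Empirical Rule figures for the pizza example can be checked with the NormalDist class from the
Python standard library. The mean of 30 and standard deviation of 5 come from the example above;
the helper name share_within is introduced only for this sketch.

```python
from statistics import NormalDist

MEAN, SD = 30, 5  # delivery time parameters from the pizza example
delivery = NormalDist(mu=MEAN, sigma=SD)

def share_within(k):
    """Share of deliveries within k standard deviations of the mean."""
    low, high = MEAN - k * SD, MEAN + k * SD
    return delivery.cdf(high) - delivery.cdf(low)

for k in (1, 2, 3):
    print(f"{MEAN - k * SD}-{MEAN + k * SD} minutes: {share_within(k):.1%}")
```

The computed shares are close to the rounded 68, 95 and 99.7 percent values quoted by the rule.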
Inference population mean for working hours/week
The pros and cons of a frequency distribution table vary. Within a data set, it can help identify obvious
patterns, and it can be used to compare data between data sets of the same kind. However, frequency
tables are not ideal for every use. They can mask extreme values (more than X or less than Y), and they
do not lend themselves to analysis of the skew and kurtosis of the data.
To begin with the advantages, frequency tables can quickly reveal outliers and even significant trends
within a data set with little more than a cursory inspection. For instance, a teacher might display
students' grades for a midterm on a frequency table in order to get a quick look at how her class is
doing overall. The number in the frequency column represents the number of students receiving that
grade; for a class of 25 students, the frequency distribution of letter grades might look like this:
A: 7; B: 13; C: 3; D: 2. In addition, frequency tables can help researchers assess the relative abundance
of each particular target value within their sample. Relative abundance represents how much of the data
set is made up of the target value. It is often represented as a frequency histogram, but can just as
easily be shown in a frequency table. Consider the same frequency distribution of midterm grades.
Relative abundance is simply the percentage of students who scored a specific grade, and it can be
helpful for conceptualizing results without overthinking them. For example, with an added column
showing the percentage occurrence of each grade, you can easily see that more than half of the class
scored a B, without having to scrutinize the data in much detail (Reid, 2018).
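The frequency table with its added percentage column can be sketched in Python using the grade
counts from the example above (7 A, 13 B, 3 C, 2 D). The column layout is an arbitrary choice for
this illustration.

```python
from collections import Counter

# Grades for the class of 25 described above: 7 A, 13 B, 3 C and 2 D.
grades = ["A"] * 7 + ["B"] * 13 + ["C"] * 3 + ["D"] * 2

frequency = Counter(grades)
total = sum(frequency.values())
relative = {grade: count / total for grade, count in frequency.items()}

print("Grade  Frequency  Relative")
for grade in sorted(frequency):
    print(f"{grade:<6} {frequency[grade]:<10} {relative[grade]:.0%}")
```

The relative column shows 52% for grade B, which confirms that more than half of the class scored
a B.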
On the other hand, this approach has some unavoidable shortcomings. One disadvantage is that complex
data sets displayed on a frequency table can be hard to interpret. Large data sets can be divided into
interval classes so that they are easier to visualize in a frequency table. For example, if you asked the
next 100 people you see what their age was, you would probably get a wide variety of answers ranging
from 3 to 93. Instead of including a row for each age in your frequency table, you could group the data
into intervals, such as 0-10 years, 11-20 years, 21-30 years and so on. This is also referred to as a
grouped frequency distribution.
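Grouping ages into the intervals named above (0-10, 11-20, 21-30 and so on) might be sketched as
follows. The list of ages is hypothetical, and the helper name age_group is introduced only for
this example.

```python
from collections import Counter

def age_group(age):
    """Label the interval an age falls into: 0-10, 11-20, 21-30, ..."""
    if age <= 10:
        return "0-10"
    low = ((age - 1) // 10) * 10 + 1
    return f"{low}-{low + 9}"

# Hypothetical ages from a small survey (made up for illustration).
ages = [3, 8, 15, 19, 22, 25, 30, 34, 41, 58, 67, 93]
grouped = Counter(age_group(a) for a in ages)

for interval, count in grouped.items():
    print(f"{interval:>6} years: {count}")
```

Twelve individual ages collapse into eight rows here; with 100 respondents the saving in table
length would be much larger.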
Furthermore, the skewness and kurtosis of the data may not be readily apparent in a frequency table
unless they are displayed on a histogram. Skewness tells you the direction in which your data tend to
lean. If the grades of our 25 students above were displayed along the X-axis of a graph showing the
frequency of midterm grades, the distribution would skew towards the A's and B's. Kurtosis tells you
about the central peak of your data: whether it falls in line with a normal distribution, which is a nice
smooth bell curve, or whether it is tall and sharp. In our example, if you graph the midterm grades, you
will find a tall peak at B with a sharp drop-off in the lower grades (Reid, 2018).
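Skewness and kurtosis can also be computed directly rather than read off a graph. The sketch below
uses the moment-based definitions and assumes a numeric coding of the letter grades (A = 4, B = 3,
C = 2, D = 1); that coding is an assumption made for this example and is not part of the source.

```python
from statistics import fmean

def skewness_and_excess_kurtosis(data):
    """Moment-based skewness and excess kurtosis (0 for a normal curve)."""
    n = len(data)
    centre = fmean(data)
    m2 = sum((x - centre) ** 2 for x in data) / n
    m3 = sum((x - centre) ** 3 for x in data) / n
    m4 = sum((x - centre) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

# Assumed coding: A = 4, B = 3, C = 2, D = 1 for the class of 25 students.
points = [4] * 7 + [3] * 13 + [2] * 3 + [1] * 2
skew, excess_kurtosis = skewness_and_excess_kurtosis(points)
print("skewness:", skew, "excess kurtosis:", excess_kurtosis)
```

Under this coding the skewness is negative, meaning the bulk of the class sits at the high grades
with a tail towards the lower ones, and the excess kurtosis is slightly positive, which is in line
with the tall peak at B described above.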
Pie chart
One of the most commonly used charts is the pie chart. A pie chart is a type of graph that displays
data in a circular graph. The pieces of the graph are proportional to the fraction of the whole in each
category. In other words, each slice of the pie is relative to the size of that category in the group as a
whole. The entire "pie" constitutes 100% of the whole, while the pie "slices" represent parts of the whole.
In terms of benefits, a pie chart presents data as a clear and easy-to-understand picture. It can be an
effective communication tool even for an uninformed audience, because it visually represents data as a
fractional part of a whole. Readers or viewers see a data comparison at a glance, allowing them to make
an immediate analysis or to understand information quickly. This form of data visualization removes the
need for readers to examine or measure underlying numbers themselves, so it is a good way to display
data that would otherwise appear in a table. You can also manipulate pieces of data in the pie circle to
emphasize points that you want to make (Finch, 2010).
On the other hand, the pie chart is no exception when it comes to drawbacks. A pie chart becomes less
effective if it uses too many pieces of data. For example, a chart with four slices is easy to read; one
with more than 10 becomes less so, particularly if it contains several slices of a similar size. Adding
data labels and numbers may not help here, as they themselves may become crowded and difficult to
read. This kind of chart also represents only one data set; to compare different sets, you need a series
of pie charts. This can make it harder for readers to analyze and assimilate information quickly.
Comparing data slices in a circle is also difficult, since the reader has to factor in angles and compare
non-adjacent slices. Manipulation of data within the design of the chart can lead readers to draw
inaccurate conclusions or to make decisions based on visual impact rather than analysis of data
(Finch, 2010).
A better choice might be other charts and graphs, especially if you are managing several pieces of data or
want to make comparisons between data sets. Doughnut charts share the circular shape of pie charts
and the overall functionality, but add the ability to view several sets of data. In the doughnut's hole, you
can also place data labels and totals, making it easier to compare segments. Bar graphs represent
information by length, enabling fast comparison and measurement. If you need to present many pieces
of data at a time or want to compare different sets of data in a single graph, they may be easier to read.
Bar chart
A bar graph is a chart that plots data using rectangular bars or columns (called bins) that represent the
total number of observations in the data for that category. Bar charts can be displayed with vertical
columns, horizontal bars, comparative bars (multiple bars to show a comparison between values) or
stacked bars (bars containing multiple types of information). Bar graphs are widely used in financial
analysis for displaying data. A stock volume chart is a commonly used type of vertical bar graph
(Mitchell, 2020).
in a single year by three service industry companies in Vietnam.
As seen in the graph, Wholesale had the lowest number of days of inventory, with just 127 days. By
contrast, the number of days of inventory kept by the non-metallic mineral products sector was much
higher, at 5317 days: roughly 42 times that of wholesale services and 2018 days more than fabricated
metal products, which had approximately 3300
As regards the pros of the bar chart, bar charts are very straightforward to understand, and the simple
relationship between size and value enables easy comparison. They are also easy to create, and most
people have experience of making and reading them at school. In addition, they can help express very
large or very small values more emphatically.
However, some shortcomings may arise when using a bar chart. For example, bar charts that try to
represent wide ranges of numbers will struggle to convey their message effectively. A bar chart for the
numbers 5, 6, 10 and 378, for example, would give extreme visual weight to the highest value and make
the relative values of the other bars seem meaningless. An alternative would be to use an adjusted
scale for the bars, but this complicates the visual aspect of the presentation and violates the intuitive
sense that size corresponds directly to value. Bar graphs also tend to be locked into a single data set,
making it difficult to show multiple values or changes over time unless the graph is modified, for
example by making the bars layered and three-dimensional.
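The effect of one extreme value on a bar chart can be sketched with text bars for the numbers 5, 6,
10 and 378. The logarithmic scale used here is only one possible adjusted scale and is an assumption
of this example; the text does not name a specific adjustment.

```python
from math import log10

values = [5, 6, 10, 378]  # the numbers from the example above
width = 40                # maximum bar length in characters (arbitrary choice)

def bar_lengths(data, scale=lambda v: v):
    """Bar length for each value, proportional to scale(value)."""
    top = max(scale(v) for v in data)
    return [round(width * scale(v) / top) for v in data]

linear = bar_lengths(values)
logarithmic = bar_lengths(values, scale=log10)

print("value | linear scale | log scale")
for v, lin, lg in zip(values, linear, logarithmic):
    print(f"{v:>5} | {'#' * lin:<{width}} | {'#' * lg}")
```

On the linear scale the three small values all shrink to a single character, while on the log scale
they become distinguishable, at the cost that bar length no longer corresponds directly to value.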
Histogram comparison is a common approach to matching images, as most features extracted from images
are represented as histogram values, such as colour histograms, texture histograms, bag-of-words and so
on. The two main ways of comparing histograms are bin-wise comparison and cross-bin comparison. In
bin-wise comparison, two histograms are compared bin by bin, which makes computing the
(dis)similarity between two histograms quicker. One of the main disadvantages of this strategy is its
failure to account for the similarity between bins. The bin-wise comparison therefore always disregards
the association between bins and produces a higher matching cost even with minor distortions, such as
lighting variations, where histogram values are slightly shifted. Cross-bin comparison, on the other hand,
takes bin similarity into account and is thus more robust to minor variations in histograms. Cross-bin
methods of comparison, however, have a higher computational cost (United Nations Conference on Trade
and Development, 2013).
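The difference between the two comparison styles can be sketched with two simple distance measures.
The L1 distance stands in for bin-wise comparison and the one-dimensional earth mover's distance
stands in for cross-bin comparison; both are chosen here as representative examples and are not
named in the source. The histograms are hypothetical.

```python
from itertools import accumulate

def bin_wise_distance(h1, h2):
    """L1 distance: each bin is compared only with the bin at the same position."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def cross_bin_distance(h1, h2):
    """1-D earth mover's distance for histograms with equal total counts."""
    return sum(abs(a - b) for a, b in zip(accumulate(h1), accumulate(h2)))

base = [0, 10, 0, 0, 0]
small_shift = [0, 0, 10, 0, 0]  # all counts moved by one bin
large_shift = [0, 0, 0, 0, 10]  # all counts moved by three bins

for name, other in (("small shift", small_shift), ("large shift", large_shift)):
    print(name, bin_wise_distance(base, other), cross_bin_distance(base, other))
```

The bin-wise distance is 20 for both shifts, so it cannot tell a minor distortion from a major one,
whereas the cross-bin distance is 10 for the small shift and 30 for the large one.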
The key benefits of a histogram are its simplicity and flexibility. It can be used in many different
situations to give an informative look at a frequency distribution. In sales and marketing, for instance,
it can be used to build the most effective pricing strategies and marketing campaigns. Over time,
histograms illustrate what the normal distribution is for a process that runs smoothly, so any deviation
is easily identified by generating histograms regularly. This is a big benefit for organisations because
it helps them find and deal with process variations quickly. The normal distribution is typically
indicated by a bell-shaped curve over the bar graph. Spikes in the graph signify variations that should
be dealt with. These spikes may also indicate a trend that can be capitalized on.
The chart illustrates the proportion of operating hours in three different types of product services in
Vietnam, gathered in a single year.
There were clearly large differences in organisations' preferred working hours, as a comparison
between the largest and smallest categories shows. People in Vietnam tended to spend around 40 to
60 hours per week at their workplace.
In this particular year, approximately 82% of people worked for 40 to 60 hours per week, compared
to less than 10% for other working periods. This may explain why people spent more of their time at
the workplace than at home. The second highest figure belongs to the category of 60-80 hours per
week; this proportion shows that there are people who devote all their time, day after day, to work.
They might even eat, sleep and live their daily lives at the workplace. This may be a minor part of
society in which people are not tied to any kind of
any points above 3.75. On the other hand, things are a little evener for the x-axis, except for the outliers
on the far right.
In a nutshell, there has been a detailed review of business data with the aid of descriptive,
confirmatory, exploratory methods through the use of data, details and expertise from written
Reference list
Barone, A. (2020). How Binomial Distribution Works. [online] Investopedia. Available at:
BYJUS. (n.d.). Mean Median Mode - Formulas | Solved Examples. [online] Available at:
[Accessed 15 Oct. 2020].
Catherine, S. (2020). Measures of Variability: Range, Variance & Standard Deviation - Video & Lesson
Transcript | Study.com. [online] Study.com. Available at:
[Accessed 14 Oct. 2020].
Finch, C. (2010). Advantages & Disadvantages of a Pie Chart. [online] Bizfluent. Available at:
Frost, J. (2018). Measures of Variability: Range, Interquartile Range, Variance, and Standard
Deviation. [online] Statistics By Jim. Available at: https://statisticsbyjim.com/basics/variability-range-
Hayes, A. (2020). What Are the Odds? How Probability Distribution Works. [online] Investopedia.
Available at:
Khan, S. (n.d.). Statistics intro: Mean, median, & mode (video). [online] Khan Academy. Available at:
Mitchell, C. (2020). Bar Graph Definition and Examples. [online] Investopedia. Available at:
Streefkerk, R. (2019). Qualitative vs. Quantitative Research | Definitions, Differences & Methods.
[online] Scribbr. Available at: https://www.scribbr.com/methodology/qualitative-quantitative-
Reid, A. (2018). Advantages & Disadvantages of a Frequency Table. [online] Sciencing. Available at:
Seif, G. (2019). Everything you need to know about Scatter Plots for Data Visualisation. [online]
Medium. Available at: https://towardsdatascience.com/everything-you-need-to-know-about-scatter-
Shields, L. and Twycross, A. (2003). The difference between quantitative and qualitative research.
Paediatric Nursing, 15(9), pp.24–24.
Smeyers, P. (2001). Qualitative Versus Quantitative Research Design: A Plea for Paradigmatic
Tolerance in Educational Research. Journal of Philosophy of Education, 35(3), pp.477–495.
United Nations Conference On Trade And Development (2013). UNCTAD handbook of statistics 2013
= Manuel de statistiques de la CNUCED 2013. New York ; Geneva: United Nations = Nations Unies.