Research Unit 4


Submitted To:
Dr. Vinith Kumar Nair
Associate Dean
TKMIM

Submitted By:
Mohammed Nishad
Navya Chandran
Nidhin Kumar N
Nithin N Kuttan
Parvathy S
Praveena Wilson
R Balamuralikrishna
Rojini A R
S Faizal
Sambhu Chandrasen
CONTENT - UNIT 4
❏ Field work in research and data processing - Rojini A R
❏ Classification and Tabulation - Mohammed Nishad
❏ Analysis and interpretation of data - R Balamuralikrishna
❏ Testing of hypothesis - Nithin N Kuttan
❏ An overview of Parametric And Non parametric Tests - Parvathy S & Nidhin Kumar N
❏ Essential ideas of multivariate analysis of Data - S Faizal
❏ An overview of dependence and independence Methods - Praveena Wilson & Navya
Chandran
❏ Statistical Packages : SPSS - Sambhu Chandrasen
● Field research is defined as a
qualitative method of data
collection that aims to observe,
interact and understand people
while they are in a natural
environment.
METHODS OF FIELD RESEARCH
Examples of Field Research: Study animal migration patterns

Field research is used extensively to study flora and fauna. A major use case is scientists
monitoring and studying animal migration patterns as the seasons change. Field
research helps collect data across years, which helps draw conclusions about how to
ensure the safe passage of animals.

Advantages of Field Research :

● It is conducted in a real-world and natural environment where there is no
tampering of variables and the environment is not doctored.
● Due to the study being conducted in a comfortable environment, data can be
collected even about ancillary topics.
● The researcher gains a deep understanding of the research subjects due to the
proximity to them and hence the research is extensive, thorough and accurate.
Disadvantages of Field Research

The disadvantages of field research are:

● The studies are expensive and time-consuming and can take years to complete.
● It is very difficult for the researcher to distance themselves from a bias in the research
study.
● It is an interpretive method and this is subjective and entirely dependent on the ability
of the researcher.
● In this method, it is impossible to control external variables and this constantly alters
the nature of the research.
● Data processing is a series of actions or steps performed on
data to verify, organize, transform,
integrate, and extract it in an appropriate
output form for subsequent use.
● Methods of processing must be rigorously
documented to ensure the utility and
integrity of the data.

Examples:

● A stock trading software that converts millions of stock data points into a simple graph
● An e-commerce company that uses the search history of customers to recommend similar
products
Six stages of data processing
1. Data collection
2. Data preparation
3. Data input
4. Processing
5. Data output/interpretation
6. Data storage and Report Writing
STEPS IN DATA PROCESSING

Data processing in research consists of five important steps:

1. Editing of data

2. Coding of data

3. Classification of data

4. Tabulation of data

5. Data diagrams
Classification and Tabulation

Classification is arranging data in groups/classes on the basis of certain properties. The
main purpose of classification is condensation of the large mass of data to get the
general nature of the data.

Tabulation is the continuation of classification. It is the process of arranging the
classified data in the form of a table consisting of rows and columns. The purpose of
tabulation is presentation of the classified data in a form which is most convenient for
understanding the general features.
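
The two ideas above can be sketched in a few lines of Python. This is only an illustration: the `responses` list is hypothetical data, and `tabulate` is a helper name introduced here, not something from the slides. Classification groups the raw values into classes, and tabulation prints the class frequencies as rows and columns.

```python
from collections import Counter

# Hypothetical raw data: the region recorded for each surveyed unit.
responses = ["North", "South", "North", "East", "South", "North"]

def tabulate(values):
    """Classify values into groups and return (class, frequency) rows."""
    counts = Counter(values)
    # Sort by class label so the table has a stable row order.
    return sorted(counts.items())

table = tabulate(responses)

# Present the classified data as a table with headed columns.
print(f"{'Region':<10}{'No. of units':>12}")
for region, frequency in table:
    print(f"{region:<10}{frequency:>12}")
print(f"{'Total':<10}{sum(f for _, f in table):>12}")
```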
Classification and Types of classification
Classification is done based on the characteristic under consideration. It may be

1. Geographical characteristics

The table is prepared by giving the number of units belonging to each geographical
region.

2. Chronological characteristics

Classification is done based on the period with which each value is associated.

3. Qualitative characteristics

When the characteristic on which classification is based is a quality
like sex, colour, religion, literacy etc., which cannot be numerically measured,
the classification is said to be qualitative.

4. Quantitative characteristics

A characteristic like height or weight, which can be numerically measured, is
called a quantitative characteristic.
Tabulation
The purpose of tabulation is presentation of the classified data in a form which is
most convenient for understanding the general features.

The following are some points to be remembered in tabulation.

● The table should have a self-explanatory heading.
● Each column and row should have a heading.
● The units of measurement should be stated in the heading.
● If certain figures are to be emphasised they should be given in bold type or in
a box or in a circle.
● Necessary footnotes should be given.
● Overloading a table with too many details should be avoided.
Types of Tabulation
In general, tabulation is classified into two types: simple tabulation and
complex tabulation.

● Simple tabulation gives information regarding one or more independent
questions.
● Complex tabulation gives information regarding two or more mutually dependent
questions.
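
A complex tabulation of two interrelated questions is commonly laid out as a cross tabulation. The sketch below assumes hypothetical survey `records` and a helper named `cross_tabulate` introduced purely for illustration; it counts how often each pair of answers occurs.

```python
from collections import Counter

# Hypothetical survey records: (gender, literacy status) per respondent.
records = [
    ("Female", "Literate"), ("Male", "Literate"), ("Female", "Illiterate"),
    ("Male", "Literate"), ("Female", "Literate"), ("Male", "Illiterate"),
]

def cross_tabulate(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    cells = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, cells

rows, cols, cells = cross_tabulate(records)

# Print the two-way table with a heading for every row and column.
print(f"{'':<10}" + "".join(f"{c:>12}" for c in cols))
for label, row in zip(rows, cells):
    print(f"{label:<10}" + "".join(f"{n:>12}" for n in row))
```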
Data Analysis and interpretation

● It is the process of gathering, modelling and transforming data so as to get
useful information, suggestions and conclusions in decision making.
● Types: 1) Descriptive Analysis

2) Inferential Analysis
Descriptive analysis
● A form of quantitative analysis.
● It describes the data in the observed sample, either
graphically or numerically.
● Based on the number of variables, data analysis can be classified as:

1)Univariate Analysis

2)Bi-Variate Analysis

3)Multivariate Analysis
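
As a rough illustration of the first two levels, the sketch below summarises one variable at a time (univariate) and then measures how two variables move together (bivariate). The `hours` and `scores` values are hypothetical, and `pearson_r` is a helper written here for the example.

```python
import math
import statistics

# Hypothetical sample: hours studied and exam score for five students.
hours = [2, 4, 6, 8, 10]
scores = [50, 60, 65, 80, 95]

# Univariate analysis: describe each variable on its own.
mean_hours = statistics.mean(hours)
median_score = statistics.median(scores)
sd_hours = statistics.stdev(hours)  # sample standard deviation (n - 1)

def pearson_r(x, y):
    """Bivariate analysis: Pearson correlation between two variables."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(hours, scores)
print(f"mean hours={mean_hours}, median score={median_score}, r={r:.3f}")
```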
Inferential Analysis
● Draws conclusions about a population from sample data. It is a quantitative
technique and should not be confused with qualitative analysis.
● Used by researchers when they have acquired the data from the sample
through a random procedure (using a probability method) and with a high
response rate.
● Qualitative data, by contrast, include interview notes, transcripts of focus groups,
answers to open-ended questions, transcriptions of video recordings, news
articles, etc.

● Types : 1) Estimation of parameter values.

2) Hypothesis Testing.
Phases of Data Analysis
1) Preliminary data analysis
● Applied before hypothesis testing.
● Clarifies how well the coding, inputting, scaling are done.
● Outcome influences the result and conclusion.

2) Hypothesis testing
● Finds the validity of the assumption with a view to choosing between two
opposite hypotheses about the population parameter.
Data interpretation
● Process of reviewing data through some predefined processes.
● Assigns a meaning to the information analyzed and
determines its significance and implications.
● Helps people make sense of numerical data that has been
collected, analyzed and presented.

Types : 1) Quantitative data interpretation

2) Qualitative data interpretation
Quantitative data interpretation
● Numerical
● Interpreted by visually presenting the results of correlation tests between two or more
variables of significance.

Interpretation processes of quantitative data include:

● Regression analysis
● Cohort analysis
● Predictive and prescriptive analysis
Qualitative data Interpretation
● Categorical
● data is described through the use of descriptive context.

The techniques include:

● Observations
● Documents
● Interviews
TESTING OF HYPOTHESIS
Hypothesis testing is an act in statistics
whereby an analyst tests an assumption
regarding a population parameter. The
methodology employed by the analyst depends
on the nature of the data used and the reason
for the analysis.
STEPS IN HYPOTHESIS TESTING
● State the hypothesis
● Formulate analysis plan
● Carry out the plan
● Analysis of the result
Parametric Test

● Statistical tests based on the assumption that the population from which the sample
is drawn is normally distributed are called parametric tests.
● The data used in parametric tests come from interval and ratio
measurements.
● The important parametric tests are :

Z-test

T-test

F-test

ANOVA
Z-test
● It is a parametric test of hypothesis testing.
● It is used to determine whether the means are different when the population
variance is known and the sample size is large.
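
A minimal sketch of a one-sample Z-test, assuming the population standard deviation is known. The function name `z_test` and the numbers passed to it are hypothetical; the p-value uses the standard normal distribution from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Two-tailed one-sample Z-test with known population SD."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Probability of a value at least this extreme under the null hypothesis.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical figures: a sample of 100 with mean 52, tested against
# a population mean of 50 and a known population SD of 10.
z, p = z_test(sample_mean=52, pop_mean=50, pop_sd=10, n=100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these figures z is 2.0 and the p-value is about 0.0455, so the null hypothesis would be rejected at the 5% level.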
T-test

● It is a parametric test of hypothesis testing based on Student’s t-distribution.
● It essentially tests the significance of the difference between mean values when
the sample size is small and the population standard deviation is not available.
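
The following sketch computes a one-sample t statistic for a small sample, estimating the standard deviation from the sample itself. The data and the helper name `one_sample_t` are hypothetical. The critical value in the comment is taken from a standard t table (two-tailed, 5% level, 4 degrees of freedom).

```python
import math
import statistics

def one_sample_t(sample, hypothesised_mean):
    """Return the t statistic and its degrees of freedom."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD, divides by n - 1
    t = (mean - hypothesised_mean) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical small sample tested against a hypothesised mean of 5.
sample = [5, 7, 8, 6, 9]
t, df = one_sample_t(sample, hypothesised_mean=5)

# Table value assumed here: t(0.05, two-tailed, df = 4) = 2.776.
T_CRITICAL = 2.776
reject_null = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_null}")
```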
F-test
● It is a parametric test of hypothesis testing based on Snedecor F-distribution.
● The F-test is named after its test statistic, F, which was named in honour of Sir
Ronald Fisher.
● It is a test of the null hypothesis that two normal populations have the same
variance.
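
As a sketch, the F statistic for comparing two variances is simply the ratio of the two sample variances. The helper `variance_ratio_f` and both samples are hypothetical; placing the larger variance in the numerator is a common convention assumed here, and the result still has to be compared with an F table for the stated degrees of freedom.

```python
import statistics

def variance_ratio_f(sample_a, sample_b):
    """Return F (larger variance over smaller) and its degrees of freedom."""
    var_a = statistics.variance(sample_a)  # sample variance, n - 1
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples with the same mean but different spread.
f, df_num, df_den = variance_ratio_f([5, 7, 8, 6, 9], [4, 10, 7, 1, 13])
print(f"F = {f:.2f} with ({df_num}, {df_den}) degrees of freedom")
```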
ANOVA
● Also called Analysis of Variance, it is a parametric test of hypothesis testing.
● It was developed by Ronald Fisher, also referred to as Fisher’s ANOVA.
● It is an extension of T-test and Z-test.
● It is used to test the significance of the differences of the mean values among more
than two sample groups.
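
A one-way ANOVA can be sketched by splitting the total variation into a between-groups part and a within-groups part. The groups below are hypothetical and `one_way_anova_f` is a name introduced for this example. The critical value in the comment comes from a standard F table (5% level, 2 and 6 degrees of freedom).

```python
import statistics

def one_way_anova_f(groups):
    """Return the F statistic and (between, within) degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three sample groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_between, df_within = one_way_anova_f(groups)

# Table value assumed here: F(0.05; 2, 6) = 5.14.
print(f"F = {f:.2f}, df = ({df_between}, {df_within}), reject H0: {f > 5.14}")
```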
Non-Parametric Test

● Non-parametric tests are tests that do not require assumptions about the
distribution of the underlying population.
● They do not rely on data belonging to any particular parametric family of distributions.
● Non-parametric tests are used when the data are skewed.
● The most common non-parametric tests are :

Chi-Square Test

Mann-Whitney U Test

Wilcoxon Signed Rank Test

The Kruskal-Wallis Test


Chi-Square Test

● A chi-square test is a statistical test used to compare observed results with expected
results.
● The Chi-Square test is a statistical procedure used by researchers to examine the
differences between categorical variables in the same population.
● The Chi-Square test is most useful when analyzing cross tabulations of survey
response data.
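
For a cross tabulation, the chi-square statistic compares each observed cell count with the count expected if the two variables were unrelated. The sketch below uses a hypothetical 2 x 2 table and a helper named `chi_square`; the critical value is the standard table value for 1 degree of freedom at the 5% level.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical survey cross tabulation: rows are groups, columns are answers.
observed = [[30, 20], [20, 30]]
chi2, df = chi_square(observed)

# Table value assumed here: chi-square(0.05, df = 1) = 3.841.
print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {chi2 > 3.841}")
```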
Mann-Whitney U Test

● The Mann-Whitney U Test is a nonparametric version of the independent
samples t-test.
● The test primarily deals with two independent samples that contain ordinal
data.
● Mann Whitney U test is used to compare the continuous outcomes in the two
independent samples.
● It compares whether the distribution of the dependent variable is the same for
the two groups and therefore from the same population.
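
The U statistic can be sketched by ranking the pooled observations and summing the ranks of one group. Both helper names (`average_ranks`, `mann_whitney_u`) and the data are hypothetical; the sketch returns only the statistic, which would still need to be compared with a U table or a normal approximation.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    ordered = sorted(values)
    # First 1-based position is index + 1, last is index + count.
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Hypothetical outcomes for two independent samples.
group_a = [3, 5, 8, 9]
group_b = [1, 2, 4, 6]
u = mann_whitney_u(group_a, group_b)
print(f"U = {u}")
```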
Wilcoxon Signed Rank Test

● The Wilcoxon Signed Rank Test is a nonparametric counterpart of the paired
samples t-test.
● The test compares two dependent samples with ordinal data.
● Wilcoxon signed-rank test is used to compare the continuous outcome in the two
matched samples or the paired samples.
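
A sketch of the signed-rank calculation for paired data: take the differences, drop the zeros, rank the absolute differences, and sum the ranks separately for positive and negative differences. The `before` and `after` values and both helper names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

def wilcoxon_signed_rank(before, after):
    """Return (W+, W-), the rank sums of positive and negative differences."""
    # Pairs with no change carry no sign and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical paired measurements on the same five subjects.
before = [10, 12, 9, 14, 11]
after = [12, 15, 8, 18, 11]
w_plus, w_minus = wilcoxon_signed_rank(before, after)
print(f"W+ = {w_plus}, W- = {w_minus}, test statistic = {min(w_plus, w_minus)}")
```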
The Kruskal-Wallis Test

● The Kruskal-Wallis test is used to compare the continuous outcome in more than two
independent samples.
● The Kruskal-Wallis Test is a nonparametric alternative to the one-way ANOVA.
● The Kruskal-Wallis test is used to compare more than two independent groups
with ordinal data.
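
The H statistic can be sketched from the rank sums of each group in the pooled ranking. This version assumes no correction for ties, uses hypothetical data, and introduces the helper names `average_ranks` and `kruskal_wallis_h` for illustration. H is compared with a chi-square table using (number of groups - 1) degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical outcomes for three independent groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)

# Table value assumed here: chi-square(0.05, df = 2) = 5.991.
print(f"H = {h:.2f}, reject H0: {h > 5.991}")
```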
Essential ideas of multivariate analysis of Data
● Multivariate means involving multiple dependent variables resulting in one outcome.
● This explains that the majority of the problems in the real world are Multivariate.
● For example, we cannot predict the weather of any year based on the season alone. There are multiple
factors like pollution, humidity, precipitation, etc.
● Here, we will introduce you to multivariate analysis, its history, and its application in different fields.
● Multivariate analysis (MVA) is a Statistical procedure for analysis of data involving more than one
type of measurement or observation. It may also mean solving problems where more than one
dependent variable is analyzed simultaneously with other variables.
The Objective of multivariate analysis

● Data reduction or structural simplification
● Sorting and grouping
● Investigation of dependence among variables
● Prediction of relationships between variables
● Hypothesis construction and testing
Advantages and Disadvantages of Multivariate Analysis

Advantages

● The main advantage of multivariate analysis is that, since it considers more than one independent
variable influencing the variability of the dependent variables, the conclusion drawn is more accurate.
● The conclusions are more realistic and nearer to the real-life situation.

Disadvantages

● The main disadvantage of MVA includes that it requires rather complex computations to arrive at a
satisfactory conclusion.
● Many observations for a large number of variables need to be collected and tabulated; it is a rather time-
consuming process.
Popcorn time!
Scientific experiment

Your friend.. 🤣 Meanwhile, you… 😱

Comedy Horror
Your friend wins!!! 🥳 (Unless, you're a stress eater!) 😅
Variable
● Something that can either be changed or measured in an experiment.
● A variable is any entity that can take on different values.
● Variables aren’t always ‘quantitative’ or numerical.

● In an experiment, the researcher is looking for the possible effect on the dependent
variable that might be caused by changing the independent variable.
● The dependent variable is the variable a researcher is interested in. The changes to
the dependent variable are what the researcher is trying to measure with all their
fancy techniques.
Popcorn variables

● In the example:

Outcome, the variable being measured: here,

the amount of popcorn eaten - Dependent variable

What makes the research groups different: in this case,

the type of movie - Independent variable
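
To make the two roles concrete, the sketch below groups a dependent variable (popcorn eaten) by the independent variable (type of movie) and compares the group means. All figures are hypothetical and the variable names are introduced only for this example.

```python
import statistics
from collections import defaultdict

# Hypothetical observations: (type of movie, grams of popcorn eaten).
observations = [
    ("Comedy", 120), ("Horror", 60), ("Comedy", 100), ("Horror", 80),
]

# Group the dependent variable by each level of the independent variable.
popcorn_by_movie = defaultdict(list)
for movie_type, grams_eaten in observations:
    popcorn_by_movie[movie_type].append(grams_eaten)

# The effect of the independent variable shows up as a difference in means.
mean_popcorn = {
    movie: statistics.mean(grams) for movie, grams in popcorn_by_movie.items()
}
for movie, mean_grams in sorted(mean_popcorn.items()):
    print(f"{movie}: {mean_grams} g on average")
```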


Dependent variable

● A dependent variable relies on another variable for its value.
● It is the variable which is being measured or affected in a scientific
experiment. Hence, it is also known as the responding variable.
● The dependent variable is the one which the experimenter observes, measures or
tests in a scientific experiment.
● In a scientific experiment, you cannot have a dependent variable without an
independent variable.
Dependent variable
Meaning - The dependent variable is what is being measured, or what changes when the
independent variable changes.

Function - Shows the effect of change

Reflects - Response

Relation - Observed effect

Regarded as - Experiment measure

Values - Observed by the Researcher

Denoted by - Y-axis
Interdependence Method

● Interdependence methods apply to relationships in which variables
cannot be classified as either dependent or independent.
● They aim to unravel relationships between variables and/or subjects
without explicitly assuming specific distributions for the
variables. The idea is to describe the patterns in the data without
making (very) strong assumptions about the variables.
Types of Interdependence Method

1.Factor Analysis

2.Cluster Analysis

3.Multidimensional Scaling

4.Correspondence Analysis
Difference Between Dependence &
Interdependence Method
About SPSS
● SPSS stands for Statistical Package for the Social Sciences.

● SPSS Incorporated is a leading worldwide provider of predictive
analytics software and solutions.

● The first version of SPSS was released in 1968, after being developed by
Norman H. Nie, Dale H. Bent and C. Hadlai Hull.

● The company announced on July 28, 2009 that it was being acquired by
IBM.
Introduction to SPSS
SPSS is a powerful and flexible system for statistical and information analysis.
With SPSS we can analyze data in three basic ways:

❏ Describe data using descriptive statistics. Example: frequency, mean,
minimum and maximum.

❏ Examine relationships between variables. Example: correlation,
regression, factor analysis etc.

❏ Compare groups to determine if there are significant differences between
these groups. Example: t-test, ANOVA etc.
Core Functions of SPSS

❖ Statistics Program
❖ Modeler Program
❖ Text Analytics for Surveys Program
❖ Visualization Designer
THANK YOU
