Collecting and Analysing Data


Unit 7.5
Collecting and analysing data

Learning objectives
In this unit, you will:
■ carry out systematic studies using relevant data for English Language studies (AO4)
■ develop the skills to analyse and synthesise language information from a variety of sources (AO5)
■ learn about the guidelines which govern how research is carried out in a fair and appropriate
manner (AO4)
■ apply these principles to research in English Language topics (AO4).

The Cambridge International AS & A Level course does not require you to carry out your
own research project, but it is important that you are aware of the standard research
techniques. This will allow you to better understand research papers that you read.

Before you start


1 Work with a partner to consider the topics you will be learning about at A Level. In what
situations might you need to collect, analyse and report on data related to English Language?
2 Take two of these situations and discuss the possible research methods you might use to gather
and analyse the data.
Data collection for English Language
It may surprise you that there is data to be collected for English Language study. You may also be surprised
that you are likely to be carrying out research, individually or as a group, and then using a reliable
procedure to collate and analyse results. From these results you will have your own data to be able to draw
conclusions about elements of English language.
Your own research findings will be a valuable addition to your learning about existing studies as you will
have original data, which may or may not have the same findings as published studies. Research
techniques must follow a common investigative procedure, and you may already be familiar with the
following ideas and techniques in your other studies, particularly in the Social Sciences.
The study of English involves collecting and analysing language data. For this, you should use the following
established procedure for scientific research:
• formulation of a hypothesis
• design of the most suitable method of data collection and handling
• analysis of the data
• conclusion and evaluation
• bibliography.
Focussing your area of investigation
English Language is an extensive area to study and you will have limited time in which to carry out your
investigation. Narrow your focus to a particular field of study, such as child language acquisition, spoken
language or language and gender. From the topic you have chosen, you should narrow the focus of your
investigation to a specific topic from which you can create a hypothesis. An example might be:

ACTIVITY 1

Discuss with a partner whether the following topics are suitable for A
Level English Language investigation. For any suitable topics, suggest
a method of investigation. Suggest why some topics are unsuitable.
For example, you might think that the topic is impractical to
investigate, or too general.
• analysis of one minute of a sporting commentary to assess what
techniques of unscripted discourse are used
• analysis of two front-page newspapers from non-English-speaking
areas of the world, to see the extent of English Language lexis
• comparison of the lyrics of two songs from different time periods
to assess syntax and lexical differences
• comparison of two pieces of travel writing from different
times/centuries to assess different language styles of writing
• recording two minutes of an infant’s speech at monthly intervals
from 18–24 months to assess language acquisition
• comparison of two cosmetic or household products, from different
time periods, aimed at women to assess contrasts in the language
of persuasion and any features of language and gender
• analysis of two Facebook posts – one male and one female – to
assess whether there are lexical and stylistic differences between
genders.
Research topics and data sources
This section outlines the research methods you are most likely to use for working with English Language
data.
Copies of spoken and written texts as they are used naturally are now stored electronically. This collection
of texts is known as a corpus and the information stored is corpus data. More information on the use of
corpus data is found in Section 7.
The following is a list of some of the most popular topic areas for English Language research studies.
• lexis: distinctive jargon, relevant to a particular topic (e.g. sporting commentaries, or professions such as
education)
• neologisms: new words / acronyms, particularly those used in social media and advertising (e.g. ‘lol’,
‘btw’, ‘404’, ‘tweet cred’)
• features of style in a particular text (e.g. rhetorical questions, metaphor, puns, modification by
adjectives and adverbs)
• syntax: a text’s composition regarding the length and structure of sentences as well as their types (e.g.
imperative, exclamative, interrogative)
• semantics: meanings associated with particular words or phrases which have generally accepted
associations (e.g. ‘home’ does mean a living place, but it also has associations of warmth, security and
belonging)
• the form and layout of the text (e.g. brochures, posters, speeches)
• unscripted discourse features including conversational features, accents and dialects, varieties of world
English, and language and gender
• tracking diachronic changes to word meanings and their usage.

KEY CONCEPT

Diversity
The diversity of English offers a rich opportunity for analysis,
comparison and exploration. Data relevant to English Language study
must be collected and processed according to ethical guidelines,
before it is analysed and presented in a systematic way. Discuss what
you understand by ethical guidelines and where they should be used
in the analysis of English Language data.

ACTIVITY 2

Work with a partner to suggest possible research topics from the
following sources of data:
• a social media site such as Facebook
• a copy of a local/regional newspaper and a copy of a national
newspaper, both published on the same day
• an article published in a newspaper compared with the same topic
viewed on a news website
• a children’s TV programme
• tweets.

Sources of data
There is a wealth of written data from such sources as advertisements, brochures, leaflets, editorials, news
stories, articles, reviews, blogs, investigative journalism, letters, podcasts, (auto)biographies, children’s
books, diaries, essays, scripted speech and narrative/descriptive writing.
Spoken data is a very interesting source to investigate, and recording and transcribing it are essential for
careful analysis. The main categories are:
• real speech (e.g. friends talking; a teacher giving a lesson; an infant/child talking to friends or to adults)
• represented speech, such as a TV or film drama or a scripted speech
• media (e.g. TV; film advertisements; news)
• digital data where the boundaries between spoken and written language become blurred (e.g. social
networking sites).
It is easy to gather much more spoken data than you actually need. Transcribing speech can be very time-
consuming and laborious, as you should write down not only every word, but all hesitations and pauses.
Just two minutes of discourse can require a lot of transcription time! If you are analysing how something is
said, rather than what is said, you may need to use phonetic spelling. When you are analysing a variety of
world English or a dialect, specialist books and online sources will teach you the symbols that match the
sounds.

Use of corpus linguistics


One way of analysing language is through corpus linguistics. This works with a corpus: a collection of
authentic texts, such as newspapers, blogs, speeches, tweets and advertisements. The common assumption is
that these texts have been computerised and so are available for research investigations. Usually, the
analysis is performed with the help of a computer (i.e. with specialised software) and takes into account the
frequency of the particular linguistic feature being investigated. If you wanted to look at the references to
‘peace’ or ‘joy’ or ‘love’ used in the lyrics of your favourite singer, you could gather the lyrics together in
one file. This becomes your corpus, in which you can analyse the frequency of the words you have chosen.
Software tools for this kind of analysis are available online, often without cost.
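
The lyrics example can be tried on a very small scale without specialised software. The following is a minimal sketch in Python, not a substitute for a proper corpus tool. The three 'lyric' lines are invented for illustration, and the names corpus, targets and word_frequencies are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: invented lines standing in for song lyrics.
corpus = [
    "Love is all we need, love is all we know",
    "Peace and joy will find us, peace will stay",
    "No joy without love",
]

# The words whose frequency we want to compare.
targets = ["peace", "joy", "love"]


def word_frequencies(texts):
    """Count every lower-cased word across all texts in the corpus."""
    counts = Counter()
    for text in texts:
        # A 'word' here is any run of letters or apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


frequencies = word_frequencies(corpus)
target_counts = {word: frequencies[word] for word in targets}
print(target_counts)  # {'peace': 2, 'joy': 2, 'love': 3}
```

With a real corpus, the list would be replaced by the contents of the file in which the lyrics were gathered. Raw counts are only a starting point: longer texts contain more words, so frequencies are usually compared relative to the size of the corpus.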

Methods of data collection


English Language investigations follow similar procedures to other systematic research. When the
researcher has decided on the objective and created a hypothesis, then the most appropriate method of
data collection is chosen. Invariably a sample, a smaller number of responses than the total, must be
taken from the data available. A random sample ensures that every possible respondent has an equal
chance of selection.
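
Drawing a random sample can be sketched in a few lines of Python. This is an illustration only: the list of 40 numbered respondents, the sample size of 10 and the seed value are all assumptions made for the example.

```python
import random

# Hypothetical sampling frame: 40 possible respondents, identified by number.
population = [f"respondent_{n:02d}" for n in range(1, 41)]

# A fixed seed makes the draw repeatable, so the selection can be checked later.
rng = random.Random(2024)

# Draw 10 respondents without replacement; each one has an equal chance.
sample = rng.sample(population, k=10)
print(sorted(sample))
```

Because the computer makes the selection, the researcher cannot favour friends or family, whether deliberately or not.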
A Level English Language research favours the following methods of data collection:
• recording and transcribing spoken language from the original source
• collecting different texts, such as adverts and speeches, and annotating them for comparison
• searching online for the specific data needed in videos and websites
• creating a questionnaire and interviewing respondents, or allowing respondents to complete the
questionnaires themselves
• observing participants, such as babies and toddlers, and conversationalists
• tracking diachronic changes (see Unit 7.6) – how word usage and meaning can change over time.

ACTIVITY 3

Work in a small group to collect small amounts of information from:


• original discourse between two different sets of participants
• different texts of the same genre (e.g. adverts, a social media
source)
• media sources, such as a sport commentary or TV drama.

Questionnaire design
A questionnaire is a set of questions, often, but not always, containing a choice of answers, that a sample
of respondents will complete. The answers are then analysed for results.
Designing a questionnaire and asking people questions seem deceptively easy. But it is important to ensure
that the respondents understand the questions and complete them honestly and according to their views.
You will find a lot more information online about questionnaire design. The following points are given as
general guidelines:
• The questionnaire should be simple in design, polite and friendly. It should clearly explain the aims of
the survey.
• Early questions should engage the participants’ interest and should be straightforward.
• Important questions requiring thought and extended answers should be in the middle of the
questionnaire.
• Any questions likely to cause offence are to be avoided.
• Technical questions, if they are to be given to a non-specialist audience, are to be avoided.
• Open-ended questions, which require a lot of time to complete, should be kept to a minimum.
• ‘Loaded’ questions, which suggest the required answer to the respondents, are to be avoided.

ACTIVITY 4

Work with a partner to think of a topic which could be researched by
a questionnaire for each of the four English Language A Level areas
of study:
• English in the world
• Language and self
• Language acquisition
• Language change

Elements of questionnaire design and use


There is no single perfect questionnaire design, as the style of questions asked depends on the objective of
the questionnaire and the type of material the researcher wishes to collect. For example, if the researcher
wishes to collect descriptive information, the questions may well be open questions, where a free choice
of answer can be given; if the researcher is collecting material where responses can be measured, then
closed questions allow only a limited number of replies.
A well-structured questionnaire should ask questions about the research objectives. This may sound very
obvious but some surveys fail to make this the focal point. Respondents should be able to understand the
questions being asked and, through clear phrasing of the questionnaire, give accurate and complete
information. A pilot survey, which tests questions and the analysis procedure, should be carried out, and
any faults found in the questionnaire design and analysis should be put right before time and money are
wasted on a set of questions which do not give reliable and valid results.

ACTIVITY 5

Read the pilot survey questions a–d, then answer the questions which
follow.
a How much do you earn?
b Do you agree or disagree with the advertiser’s untruthful claim
that ‘women will be more beautiful’ after using their face
cream?
c How old are you?
d Do you agree that synthetic personalisation in language helps
media institutions reinforce their linguistic control over their
audience?
1 Why would these questions be inappropriate where the
respondents complete the survey without an interviewer?
2 Rephrase each question to be more appropriate or better phrased
for the respondents to answer.

Data analysis
Your research is likely to have data which can be measured in different ways, and specialist statistical
books and online tutorials will give additional information and help. The following is a list of the most likely
scales of measurement you will use:
1 Nominal: data which is allocated to a particular category (e.g. ‘yes/no’; ‘number of virtuous
errors used’). (Virtuous errors are errors made by young children as they try to apply the regular rules of
the language they hear around them to irregular forms – e.g. they may say ‘runned’ instead of the
standard ‘ran’. See Unit 8.4.)
2 Ordinal: data which can be ranked in order (e.g. results showing which second languages people speak,
with English ranked against other languages).
3 Interval: data where the difference between values can be measured (e.g. temperature).
4 Ratio: similar to interval, but the scale must have a true zero (e.g. height).
Note: you are unlikely to need to use interval and ratio data in English Language studies.
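
How nominal and ordinal data might be summarised can be sketched in Python. The answers below are invented for illustration. The 1–5 agreement scale is an assumed example of ordinal data, not one taken from this unit.

```python
from collections import Counter
from statistics import median

# Hypothetical answers to a closed yes/no question (nominal data).
# Categories can only be counted, not ranked or averaged.
nominal_answers = ["yes", "no", "yes", "yes", "no", "yes"]
tally = Counter(nominal_answers)
percent_yes = 100 * tally["yes"] / len(nominal_answers)

# Hypothetical agreement ratings on a 1-5 scale (ordinal data).
# The order is meaningful, but the gaps between points are not assumed
# to be equal, so the median is reported rather than the mean.
ordinal_ratings = [4, 2, 5, 3, 4, 4, 1]
middle_rating = median(ordinal_ratings)

print(tally["yes"], tally["no"])   # 4 2
print(round(percent_yes, 1))       # 66.7
print(middle_rating)               # 4
```

The choice of summary follows from the scale of measurement: counts and percentages suit nominal data, while ranked data can also be described by its middle value.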

THINK LIKE … A DATA COLLECTOR

You work for the government department concerned with children’s
health and social development in your country. Work with a partner to
create five questions which could be used as part of a questionnaire
survey on the language skills of preschoolers aged three to four
years. The questionnaires will be distributed to parents and carers of
children attending preschool institutions such as nursery and child
daycare.
Using the guidelines on data collection, create five questions which
will provide reliable and valid data for your department to use to
promote strategies to improve children’s language development.
Ethics in research
Investigations of English Language data, just as in other disciplines, require guidelines. All research
involving people (and animals) must be carried out according to internationally recognised correct practice.
The benefits of gaining information and understanding in the subject must be balanced against the welfare
of the participants.
The information given in this unit will allow you to proceed with confidence and integrity in your English
Language investigations.
Broad ethical guidelines to ensure best research practice involve the following safeguards:
• Participants must give their informed consent for the research project. In the case of children, informed
consent should be given by a responsible guardian.
• Observations of people’s behaviour, including language, in a public place may imply that the people
agree to being observed, although they should be informed that observation has taken place.
• Participants should not be subjected to physical or mental stress. Some infamous experiments, such
as the 1971 Stanford Prison Experiment in the US, caused a number of participants to suffer
extreme stress through the cruelty of the role-playing which was required.
• There should be no deception of participants and they should not be forced to take part; participants
should be free to withdraw at any time.
• Participants should be thoroughly informed and debriefed about the purpose of the investigation.
• All data must be subject to strict confidentiality.
In turn, researchers have the right to expect participants to give honest information
about themselves that is relevant to the study.
There are also guidelines covering your role as a researcher which are summarised as follows:
1 All data gathered from participants should be kept confidential.
2 No data should be falsified.
3 Any references should be acknowledged and sources given in a bibliography.
4 No work from any other source should be copied and passed off as the researcher’s own work. This is
plagiarism which, with modern detection techniques, is quite easy to trace and results in work being
destroyed. It can also lead to expulsion from the educational institution attended by the researcher.
All of these guidelines define acceptable behaviour for carrying out research, and English Language
research is no exception. Research ethics must be part of the planning, the implementation and the
reporting of research.

KEY CONCEPT

Diversity
The diversity of English offers a rich opportunity for analysis which
must be carried out according to best practice and ethics. What
ethical issues might arise in an analysis of English Language data?

Much of the research carried out in English Language topics is done through corpus linguistics. But where
people are observed or questioned, for example in studies of children using language or in the measurement
of attitudes about language, the welfare of the participants must be respected.

ACTIVITY 6

How would the following fail to meet best practice in a research
investigation into ‘Language changes which are taking place in the
digital world’? Discuss your answers with a partner.
• only sampling female respondents
• only sampling respondents aged 30 and under
• only sampling from people that you know and/or your family
• sharing the information you have received from your respondents
with your friends
• trying to stop respondents who want to pull out halfway through
the investigation
• adding your friends as additional respondents to make up the
sample.

Self-assessment checklist
Reflect on what you’ve learnt in this unit and indicate your confidence level between 1 and 5. If you
score below 3, revisit that section. Come back to this list later in your course. Has your confidence
grown?

Statement | Confidence level | Revisited?
I understand the common process of research techniques | |
I know how to carry out independent research studies for English Language data | |
I am aware of ways of gathering data and analysis | |
I understand the concept and use of corpus linguistics | |
I can design research tools, such as questionnaires and interview schedules | |
I understand the ethical research guidelines essential for investigation | |
I know the rights which must be given to participants in a research study | |
I understand the responsibilities of the researcher in a research study | |
