
Dynamics of data science skills
How can all sectors benefit from data science talent?
Issued: May 2019 DES5847
ISBN: 978-1-78252-395-6

The text of this work is licensed under the terms of the Creative Commons Attribution License, which permits unrestricted use, provided the original author and source are credited.

The license is available at: creativecommons.org/licenses/by/4.0

Photographs are not covered by this license.

This report can be viewed online at: royalsociety.org/dynamics-of-data-science-skills

Cover image © georgeclerk.


CONTENTS

Contents
Foreword by Professor Andrew Blake FREng FRS
Executive summary
About this report
Our vision and recommended actions
Introduction
Data science demand: what the data tells us
Dynamics of data science: career paths and talent flows between sectors
Area for action: Developing foundational knowledge and skills
Area for action: Advancing professional skills and nurturing talent
Area for action: Enabling movement and sharing of talent
Area for action: Widening access to data in a well-governed way
Conclusion
Acknowledgements
Glossary
Data appendix

Dynamics of Data Science: How can all sectors benefit From data science talent? 3
FOREWORD

Foreword
Professor Andrew Blake FREng FRS

In the 40 years since I began a doctorate in artificial intelligence vision at the University of Edinburgh, AI has changed out of all recognition. Where once a costly computer would spend hours in contemplation of one image, now a mobile phone can track faces in real time, and is backed by the immense power of the cloud to search millions of documents, and recognise speech at sometimes human-level performance. That sort of computational facility, combined with the power of statistical analysis, also gives us unprecedented analytical power to understand large and complex data sets. This combination of capabilities, that we term data science, is effecting a revolution in the way we do business, access knowledge, communicate, and understand the world.

The Royal Society has encouraged the development and use of science for the benefit of humanity since 1660. We commissioned this project, Dynamics of Data Science Skills, because we share the vision of the UK as a leading data science research nation with a sustainable flow of expertise. We believe that data science can be an exciting and fulfilling career, that also addresses society’s needs. That requires the right higher education and training to be made available. More broadly, users, analysts and citizens of the future will need to be comfortable with the application of data science to societally pressing questions. This calls for data science skills to be thoroughly integrated into the school curriculum too.

Data science and engineering are growing fast and broadening in scope. No longer just the preserve of highly technical STEM- or finance-orientated roles in London, data science increasingly pervades modern business, scientific endeavour and public affairs. Children in primary school today will enter the workforce in roles that don’t exist yet because of the way data, and data-enabled technologies such as artificial intelligence, are transforming the economy. This is leading to a dramatic shift in the demand for data science skills nationwide, and the need for data scientists working across all sectors. Our analysis shows that over the last five-and-a-half years there has been a sharp rise in UK job listings for ‘Data Scientists and Advanced Analysts’ (+231%), driven predominantly by increased numbers of vacancies for Data Scientists and Data Engineers.

This report provides more evidence for the case the Royal Society has already made, in Changing education: Creating the conditions for a broad, balanced and connected curriculum, to change post-16 education within the next ten years [1]. The report also builds upon findings and recommendations of our work in data management (Data management and use: governance in the 21st century, with the British Academy) [2], artificial intelligence (Machine Learning: the power and promise of computers that learn by example) [3], and computing education (After the Reboot: computing education in UK schools) [4].

1. Royal Society. 2019 Jobs are changing, so should education. See https://royalsociety.org/~/media/policy/Publications/2019/12-02-19-jobs-are-changing-so-should-education.pdf?la=en-GB (accessed 15 April 2019).
2. Royal Society. 2017 Data management and use: Governance in the 21st century. See https://royalsociety.org/-/media/policy/projects/data-governance/data-management-governance.pdf (accessed 15 April 2019).
3. Royal Society. 2017 Machine learning: the power and promise of computers that learn by example. See https://royalsociety.org/~/media/policy/projects/machine-learning/publications/machine-learning-report.pdf (accessed 15 April 2019).
4. Royal Society. 2017 After the reboot: computing education in UK schools. See https://royalsociety.org/~/media/events/2018/11/computing-education-1-year-on/after-the-reboot-report.pdf (accessed 15 April 2019).


The case to upgrade data analysis education and skills provision has also been made by other organisations, most notably in the ‘Analytic Britain’ briefing that was published by Nesta and Universities UK in 2015 [5]. This briefing drew on two research reports on the state of supply and demand for broad analytical skills in the UK to make recommendations spanning the whole analytical talent pipeline, including schools, colleges, universities and the labour market and industry. We hope that our report chimes with and reiterates the messages of Analytic Britain, whilst adding further evidence and new data to remedy skills shortages and ensure movement of talent. We also hope to continue championing the good work that is already taking place across sectors.

Our report is an extensive exploration of the current UK data science landscape. It looks at the demand for data professionals (including data analysts, data engineers, and data scientists), how this has changed in recent years, and how it varies across industrial sectors and UK regions. We use the analysis to identify four major areas for action: developing foundational knowledge and skills; advancing professional skills and nurturing talent; enabling movement and sharing of talent; and widening access to data in a well-governed way. Within these areas for action we identify priority needs and make some recommendations for addressing them.

We also share examples of exciting or innovative models and mechanisms that are already in place around the country that could be spread more widely. Those examples are available as a separate booklet, Dynamics of data science: models and mechanisms, that institutions and individuals can use to read about opportunities and resources in data science training and practice. A further companion booklet, Dynamics of data science: what do data professionals say about data science, collects some of the fascinating personal stories of career paths that we encountered across academia, industry, charities, and government. Our interviewees include a recent apprentice, an international entrepreneur, a physician who is also a data scientist, self-taught researchers, and data scientists working in finance and global development. The case studies include their reflections on their careers, their experiences of moving across sectors, their observations about their role models, mentors, and professional communities, and their suggestions for improving the way things are done in data science.

Finally, I would like to say that working on this report has been a fascinating experience. The Royal Society policy staff have been a pleasure to work with, and have driven the programme with a sure touch. I also want to thank my colleagues on the steering group for their insights, and the dozens of people who have contributed at roundtables and workshops and with helpful comments and contributions.

Andrew Blake
April 2019

5. Mateos-Garcia J, Windsor G and Roseveare S. 2015 Analytic Britain: Securing the right skills for the data-driven economy. See https://media.nesta.org.uk/documents/analytic_britain.pdf (accessed 15 April 2019).

EXECUTIVE SUMMARY

Executive summary
The skills of data scientists and engineers are in high demand. They enable organisations to extract valuable insights from data, and use them for substantial societal benefit. As data analysis methodology grows in power, and the volume of data collected increases rapidly, the number and variety of roles in data science are also growing significantly.

However, with major industry players hiring many of the most experienced data scientists and AI researchers, media reports have suggested that the natural flow of researchers from academia to industry may be reaching unsustainable levels [6,7]. Incentives available from industry compete hard with those offered by academia, and large tech companies now even allow and encourage researchers to publish, once a particular advantage of working in universities. It is one further factor in drawing talent away from even the strongest university research groups.

There is considerable strength in UK data science in academic, industrial, charitable and government sectors. We were able to draw on the experience of representative institutions to arrive at what we hope will be helpful recommendations for employers, practitioners, and decision-makers in the private, public and university sectors. Research commissioned for this report shows an increasing need for people with data science skills, with a sharp rise in demand for Data Scientists and Data Engineers in the last five-and-a-half years. The demand spans all sectors, with specialists sought after everywhere from government departments to technology start-ups. These findings suggest that further skills gap analysis is needed to quantify the number of employed workers per opening.

Foundational training in data skills begins at school, and there is an opportunity and a need to consult broadly on a future curriculum that addresses the breadth of data skills across mathematics and science, the arts and humanities. Students need to be well informed about the ever-widening range of opportunities working with data. New hybrid forms of education in data science at school age and later, such as apprenticeships, are becoming more prevalent.

Across all sectors, those with a foundation of skills and capabilities need opportunities to deepen them and acquire new ones. Professional-level courses should be flexible and responsive. Training may need to be industry-approved and accredited, and coordination is needed between industry and universities. More informal mechanisms such as online material are also needed to allow people to (re)train through self-learning. Universities will need more than ever to retain and create the trained staff to meet these teaching needs.

6. Sample I. 2017 ‘We can’t compete’: why universities are losing their best AI scientists. The Guardian. 1 November 2017. See https://www.theguardian.com/science/2017/nov/01/cant-compete-universities-losing-best-ai-scientists (accessed 15 April 2019).
7. Boland H. 2018 Britain faces an AI brain drain as tech giants raid top universities. The Telegraph. 2 September 2018. See https://www.telegraph.co.uk/technology/2018/09/02/britain-faces-artificial-intelligence-brain-drain/ (accessed 15 April 2019).


Data science particularly lends itself to movement of talent between sectors, including on shorter timescales. The ability to do this will be enhanced by recognising the value of cross-sectoral working and braided careers (reciprocal arrangements that enable an individual to pursue dual, or even multiple, employment opportunities). This requires each sector to broaden its criteria and incentives to recognise and welcome more diverse forms of experience, for example academics gaining experience in the public sector or in start-ups. Attention must be paid to the needs of small and medium size businesses, charities and the public sector, addressing the challenge of large multinationals that have resources and appetite to absorb data science talent at scale. Joint positions between sectors have an important role to play in sharing talent sustainably, and in promoting diversity of experience.

Availability of data and computing power are major draws for talent. Industry dominates, offering both at considerable scale. The public and university sectors can also become more attractive to talented workers by investing more in the particular computing equipment that data science needs. Public data resources are considerable and could become uniquely compelling with more investment in curation, standardisation and careful attention to ethics and the governance of data use. In addition, access needs to be widened and opened, which also fits well with the increasing emphasis on reproducibility in research.

The challenge for UK data science is to reinforce the landscape by deploying the UK’s substantial skills base flexibly, exploiting the strength of its institutions to generate and nurture more talent, and mobilising its valuable data resources ever more openly and effectively.

We are still a long way from realising the full potential of data-enabled technologies. This report highlights some of the challenges and opportunities of data science careers and some ways forward in addressing skills and mobility challenges.

ABOUT THIS REPORT

About this report


This report sets out what is distinctive about data science as a discipline and offers some key statistics from our commissioned research to show the level of growth in demand for a range of data skills. We explore the different drivers and blockers around data science roles in different sectors, and highlight examples from a body of case studies of data science careers and career mobility across sectors (the complete portfolio of case studies is available as a separate publication). Engagement with data scientists, analysts and other data users informed us about some opportunities to foster talent and to enable movement between sectors. The interdisciplinary nature of data science lends itself to joint appointments, and its applied nature fits with apprenticeship mechanisms.

The following sections set out recommendations and activities across four major areas for action, with recommendations targeted across government, funders, universities, industry and the public sector to make progress towards a thriving data science landscape. They also contain a range of mechanisms for developing and sharing skills across sectors, highlighting examples of models that could potentially be replicated, scaled up and expanded.

Accompanying this report is a detailed set of case studies featuring career stories of data scientists working in a wide range of roles, levels and sectors, including Accenture, the Alan Turing Institute, Channel 4, Cambridge University, DeepZen, GCHQ, Government Digital Service, HSBC, The One Campaign, the Office for National Statistics, UCL’s Institute of Neurology and the University of Warwick. (See Dynamics of data science: what data scientists say about data science.)

There is a clear need for collaborative, sustainable mechanisms to develop talent and this report promotes a vision for the sharing of data science talent across all sectors. We have identified a range of models and mechanisms to enable this vision, such as outreach programmes, enrichment and fellowship schemes, capability-building programmes, informal/peer-to-peer mechanisms, collaborative events and partnerships, data stores and data centres/institutes, all of which are explored in more detail in the following chapters. The models can also be found in an accompanying document. (See Dynamics of data science: models and mechanisms.)

Examples of good practice have been collected with the input of members of the data science community from across academia, industry and the public sector. They feature a variety of tried and tested ideas from across the UK, which require minimal to major resource support and can be led by individuals as well as institutions. The aim of the models is to inspire scale-up and cohesion.

The models and mechanisms can be used by people who are:

• concerned about the recruitment of data scientists, data analysts and domain experts;
• involved in developing data science talent at all levels;
• considering (re)training as data scientists, data analysts and domain experts;
• making decisions around skills funding on a local, regional and national scale; or
• seeking to ensure that data they hold is used for societal benefit.

VISION

Our vision and recommended actions


At the heart of this report is our vision of a healthy data science skills landscape for the UK:

VISION
The UK is a leading data science research nation with a sustainable
flow of expertise. Diverse data science skills are integrated into curricula
in order to develop future users, developers and citizens. Data science
provides an exciting and fulfilling career choice. Data skills and appropriate
infrastructure are available across sectors. Data science is applied to
achieve broad societal benefit.

To achieve this vision, we focus on the four major areas for action:

• Developing foundational knowledge and skills;
• Advancing professional skills and nurturing talent;
• Enabling movement and sharing of talent; and
• Widening access to data in a well-governed way.

These ideas are sketched out across the next pages and set out in more detail in the following sections, along with priority needs, models and mechanisms, and recommended actions for addressing these needs.

RECOMMENDED ACTIONS

Area for action: Developing foundational knowledge and skills

Need:
Building knowledge and skills from school level to degree level.

Recommended action

Data skills for everyone
At school level, data science knowledge and skills would benefit from greater integration across the primary and secondary curriculum. Mathematics and computing communities, businesses and education professionals have a key role here.

Recommended action

Support teachers to teach data skills
Develop resources, training and support for teachers and appropriate data and computing infrastructure for schools. This requires combined effort from mathematics and computing communities, businesses, and education professionals, including the new National Centre for Computing Education and National Centre for Excellence in the Teaching of Mathematics, working with the support and input of partner organisations and government departments.

Recommended action

Curricula fit for the future
Post-16 curriculum change within the next ten years is vital to ensure young people leave education with the broad and balanced range of skills they will need to flourish in a changing world of work. This should start with a review into post-16 learning in the next parliament to inform future curriculum development. An analysis of the future data skills needs of students, industry and academia is needed to inform such a review.

In higher education further consideration is needed about how universities can teach data science effectively, and integrate it into university curricula as a developing and interdisciplinary set of skills and methodologies.


Need:
Widening access to data science education.

Recommended action

Raise awareness of data science careers
Data professionals work across a wide range of roles. Greater awareness of career paths could help to attract a wider pool of students. Employers could also offer work experience, host teacher Inset days and speak in schools, colleges and universities so that students, their teachers and careers advisers gain an understanding of possible career pathways.

Recommended action

Address underrepresentation and evaluate diversity
Women make up a disproportionately small fraction of the educational pipeline associated with data science positions, and further efforts are needed by all stakeholders to address diversity, and not only of gender. This is particularly relevant as the development of data science talent needs a wider set of skills, including those involved in identifying, understanding and interpreting real-world problems. A diverse pipeline of data scientists is more likely to pick up or be concerned by inadvertent biases in algorithms that can impact on many different types of people. The Hall and Pesenti review into the growth of the UK artificial intelligence industry (2017) called for government, industry and academia to embrace the value and importance of a diverse workforce, and the recommendations of this review should continue to be pursued.


Area for action: Advancing professional skills and nurturing talent

Need:
Developing skills in the workforce

Recommended action

Engagement between universities and employers
Universities with good industry links play a key role in developing appropriate professional training. By working in collaboration with employers they can help address regional skills gaps and productivity needs.

Recommended action

Offer nimble and responsive training opportunities
Data science is fast-moving and requires innovative ways to enable the development of advanced skills. To meet the growing demand for data scientists, universities need to be agile and responsive to offer new ways of upskilling. This could potentially be achieved through MicroMasters, conversion courses and high-quality Massive Open Online Courses (MOOCs) for continued professional development.

Recommended action

Develop data science as a profession
Developing a professional framework for data scientists with shared codes of practice, including appropriate governance of data collection and use, and ethics training, is an important short-term goal. In the longer term, professional bodies such as the British Computer Society and the Royal Statistical Society could work with employers and universities to identify the skills needed for data scientists and consider how to address accreditation to ensure that students and professionals can be confident in the quality of new courses.


Need
Creating the right research and working culture for data science

Recommended action

Build diverse teams
Universities and the public sector in
particular must work to create a culture that
nurtures and retains data science talent,
which can include building and supporting
interdisciplinary data science teams.


Area for action: Enabling movement and sharing of talent

Need
Enable movement through braided careers

Recommended action

Create and fund joint positions across academia and industry
Funding bodies such as UKRI could support positions for joint appointments for a pool of the UK’s most talented researchers, whose interests attract them equally to academia and industry, so that excellence can be fostered at the interface of academia, industry and government. Universities and funders should give urgent attention to enhancing mechanisms to accommodate outstanding industrial research leaders in machine learning within the academic sector.

Need
Recognising diverse research outputs

Recommended action

Commercialise research
The ways that universities encourage and support researchers in commercialising research and building spin-outs can influence researchers’ abilities to hold joint appointments between industry and academia. Universities may wish to consider their strategies for research commercialisation and policies on intellectual property in order to build an environment that better supports cross-sector roles.

Recommended action

Recognise diverse research outputs
Government departments and industry are likely to benefit when they enable data scientists in research roles to publish their work wherever possible; conversely, universities need to recognise the value of a breadth of experience and outputs. Alternative outputs could be recognised on academic CVs. Changes to the Research Excellence Framework that focus on institutions rather than individuals could allow universities to better recognise the contribution of data science to broader research output.


Need
Establishing a coherent approach to policy

Recommended action

Make skills a core part of the National Data Strategy
Responsibility for data policy is distributed
across DCMS, GDS, Cabinet Office and
DfE, but DCMS leads on delivering the
National Data Strategy. This Strategy should
enable departments to work closely together
on data skills, building a coherent approach
to delivering a healthy data science skills
landscape. This will be important for the
wider adoption of artificial intelligence.


Area for action: Widening access to data in a well-governed way

Need
Opening data and providing secure access

Recommended action

Encourage data sharing where possible
Greater transparency of private sector data could help build public trust in the use of data and how it is used for decision-making purposes. The public sector could usefully consider how to widen access to its data, including sharing data, and data challenges to researchers. Journal editors should normally ensure that data is being made available to other researchers in its original form, or via appropriate summary statistics where sensitive personal information is involved. The Royal Society has published a report on Privacy Enhancing Technologies which sets out how greater use of data could potentially be enabled by PETs [8].

Recommended action

Donate data science talent
There is value in enabling data scientists to donate their time to applying data science to societal challenges, for example through pro bono project work along the lines of DataKind UK, RSS Statisticians for Society and hackathons.

Need
Providing the computing power for use by the growing data science community

Recommended action

Provide access to computing power
Improving the UK’s computing research infrastructure will better enable data scientists to access the necessary computing power to release the value from data and address research challenges, and will be vital for the UK to remain competitive with other countries such as the US and China. BEIS and UKRI could usefully consider the need for continuing to improve access for data scientists working across all disciplines to high-power computing, and this could helpfully be included as part of the UKRI Infrastructure Roadmap.

8. The Royal Society. 2019 Protecting privacy in practice: The current use, development and limits of Privacy Enhancing Technologies in data analysis. See https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/privacy-enhancing-technologies-report.pdf (accessed 15 April 2019).

INTRODUCTION

Introduction
Data science is a rapidly developing field, and in some ways a relatively new and emerging discipline. Its development out of different disciplines, and its potential impact on society and the economy, requires a workforce with new skills. This section illustrates this critical moment and outlines our methodology for analysing how these needs can be met in practice.

Data science as a developing discipline
Making sense of data has a long history. Historically, the notion of finding useful patterns has been given a variety of names such as data mining, knowledge extraction, information harvesting and data pattern processing. It was performed by scientists, statisticians, librarians, computer scientists and others [9]. For example, in 1854, John Snow's map of cholera cases alerted him to the cause of the cholera outbreak. A cluster of dots located close to a single water supply on a map changed how we analyse and visualise data today [10].

The term ‘data science’ emerged in the 1960s to designate a new profession that was expected to make sense of increasingly large stores of data. It wasn’t until the early 2000s that the first data science journal was launched by the Committee on Data for Science and Technology (CODATA) of the International Council for Science [11].

Definitions of data science have evolved over time, partly as a reflection of changes in technology and data handling. In 1962, John Tukey called for a reformation of academic statistics, although there was also a move to resist it. In The future of data analysis, he pointed to the existence of an as-yet unrecognised science, whose subject of interest was learning from data, or ‘data analysis’ [12]. Tukey worked between academia and industry, and provides a notable precedent for ‘braided careers’ in data science, working jointly at Bell Labs and Princeton University Statistics department. Over the last 50 years, statisticians, data analysts and computer scientists have played a part in the invention and development of computational environments for data analysis [13].

“A lot of what we are doing in data science involves the same computers, maths, stats, data and computationally based research that we have been doing for as long as I have been involved in research, with the current trendy label for it.”
Dr James Hetherington, Director of Research Engineering at the Alan Turing Institute.

A new workforce
In 2012, Harvard Business Review described the modern day data scientist as “a high ranking professional with the training and curiosity to make discoveries in the world of big data” [14]. D J Patil, LinkedIn Chief Scientist and author of Data Scientist: the sexiest job of the 21st century, explained that the focus of teams at LinkedIn was to work on data applications that would have an immediate and massive impact on the business. The term that seemed to fit best was data scientist: those who use data and science to create something new.

9. Fayyad U et al. 1996 From Data Mining to Knowledge Discovery in Databases. American Association for Artificial
Intelligence 17, 37-54.
10. Rogers S. 2013 John Snow’s data journalism: the cholera map that changed the world. The Guardian. 15 March
2013. See https://www.theguardian.com/news/datablog/2013/mar/15/john-snow-cholera-map (accessed 27
November 2018).
11. In 2013, Forbes technology contributor Gil Press put together a timeline which traces the evolution of the term
‘data science’ and its use, attempts to define it, and related terms. Press G. 2013 A Very Short History Of Data
Science. Forbes. 28 May 2013. See https://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-
science/#30ad96c655cf (accessed 27 November 2018).
12. Donoho D. 2017 50 Years of Data Science. Journal of Computational and Graphical Statistics 26:4, 745-766.
13. Donoho D. 2017 50 Years of Data Science. Journal of Computational and Graphical Statistics 26:4, 745-766.
14. Davenport T and Patil DJ. 2012 Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review.
October 2012. See https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century (accessed 27
November 2018).


In 2007, Microsoft Professor Jim Gray named the era of massive data the ‘Fourth Paradigm,’ stating that our capacity for collecting data has outstripped our present capacity to analyse it and so our focus should be on developing people with the skills to make sense of it15.

However, there is not yet a consistent definition of data science or the role of a data scientist16. Data science can cover a range of activities from rapid analysis of real-time data to long-term evidence collection in the sciences. There is a wide variety of skills under the label ‘data science’ and people with relevant skills may associate with other disciplines.

Figure 1

Some key areas in data science as a discipline since the 1940s

This diagram shows some key moments and developments in the emergence of data science as an academic discipline.

Pre-1940s
People have been looking for patterns in numbers for centuries. For example, the gathering of mortality statistics and its predictive use (for public health and insurance) by John Graunt, Edmond Halley, Richard Price in the 1600s – 1700s; Thomas Bayes’ probability theorem and Charles Babbage’s engines to eliminate errors in storing and reproducing astronomical data in the 1700s. Automatic recording of meteorological data by Christopher Wren and Robert Hooke, applied to weather forecasting by Robert Fitzroy in the 1900s.

1940s
• New scientific field of Operational Research emerged during WW2.

1950s – 2000s
• Emergence of term ‘data science’.
• Tukey's pioneering work on data analysis.
• Data engineering and processing.
• First international conference on knowledge discovery and data mining.
• Statistics and operational research.
• Donoho's vision of data science as a new field.
• First data science journal.
• Open source data storage.

2010s
• Greater exchange and use of data.
• Greater potential for innovation and efficiency from data.
• New methods of analysis.
• Data science fuelled by popularity of 'data scientist' job.
• Gray and 'The Fourth Paradigm' = the era of massive data.

15. Hey T, Tansley S and Tolle K. 2009. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research.
16. Press G. 2013 A Very Short History Of Data Science. Forbes. 28 May 2013. See https://www.forbes.com/sites/
gilpress/2013/05/28/a-very-short-history-of-data-science/#30ad96c655cf (accessed 27 November 2018).


Background to this study: Why now?
Greater exchange, storage and use of data:
Commercial, industrial and academic uses of data have expanded considerably over recent years. Changes to the volume, variety, and velocity of data collection have created a potentially rich resource for the digital economy. One estimate suggests that open data could help create $3 trillion of value each year. In the current age, an increasing volume of information is being collected from a greater range of sources and at greater speed than ever before17. According to research from McKinsey, the volume of data continues to double every three years from digital platforms, wireless sensors and mobile phones18.

“To be a good data scientist you really do need a hell of a lot of skills”
Chanuki Illuska Seresinhe, Senior Data Scientist, Channel 4.

Figure 2

The greater exchange and use of data and the greater potential for efficiency and innovation has led to the development of a unique, interdisciplinary workforce with new and evolving skills.

• Greater exchange, storage and use of accessible data
• More decision making based on data science techniques
• Increasing analytical and computing power

Together these lead to: An interdisciplinary workforce with new and evolving skills

17. Manyika J et al. 2013 Open data: unlocking innovation and performance with liquid information. McKinsey Global
Institute. See http://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/open-data-unlocking-
innovation-and-performance-with-liquid-information (accessed 27 November 2018).
18. Henke N et al. 2016 The age of analytics: Competing in a data-driven world. See https://www.mckinsey.com/
business-functions/mckinsey-analytics/our-insights/the-age-of-analytics-competing-in-a-data-driven-world
(accessed 27 November 2018).


Increasing analytical and computing power:
In the past, the complexity of big data limited the effectiveness of existing methods of analysis. Now, data scientists have unprecedented computing power at their disposal. In the commercial sector, analysis of data has become central to achieving competitive advantage in some sectors and has led to new applications of predictive analytics and machine learning to address business problems19. The data scientists we interviewed gave some examples of what this means for data science.

More decision making based on data science techniques:
In the public sector, combining data with analytics can, for example, allow a better understanding of population needs and help design and deliver services accordingly. Better use of data can improve the design, efficiency and outcomes of services20.

The Royal Society has responded to this moment in its recent reports. Machine learning: the power and promise of computers that learn by example highlighted, among a range of recommendations, the need for data skills at all levels – from foundational data skills to machine learning expertise at the leading edge of research21.

This report identified the conditions for enabling the UK to benefit from the opportunities presented by machine learning. These included the need for an amenable data environment and to generate a healthy skills pipeline from school level through intelligent users of machine learning to high level research. However, at the time of its publication there were media reports of a brain drain from academia into industry. This led the Royal Society to consider whether there are aspects of the industry/academia/public sector interface in data science that are unique compared with other sciences.

Data management and use: Governance in the 21st century, produced with the British Academy, highlighted the need for new ways to govern new uses of data, including the need for responsible sharing of data and using data analysis for public benefit22. This report addresses some of the ways that these needs can be met in practice.

“Whether people are in the meteorological office, in the City looking at financial data, or in government statistics looking at sociological data, they have been confronted with questions and problems that go beyond traditional notions of statistics or data analysis. This has forced them to grab this notion of data science. To me, that is what really justifies its existence.”
Professor Graham Cormode, Professor in Computer Science at the University of Warwick.

19. McKinsey Analytics. 2018 Analytics comes of age. See https://www.mckinsey.com/~/media/McKinsey/Business%20
Functions/McKinsey%20Analytics/Our%20Insights/Analytics%20comes%20of%20age/Analytics-comes-of-age.ashx
(accessed 29 November 2018).
20. Timmis S, Heselwood L and Harwich E. 2018 Sharing the benefits: How to use data effectively in the public sector.
See https://reform.uk/research/sharing-benefits-how-use-data-effectively-public-sector (accessed 27 November 2018).
21. Royal Society. 2017 Machine learning: the power and promise of computers that learn by example.
See https://royalsociety.org/~/media/policy/projects/machine-learning/publications/machine-learning-report.pdf
(accessed 15 April 2019).
22. Royal Society. 2017 Data management and use: Governance in the 21st century. See https://royalsociety.org/-/
media/policy/projects/data-governance/data-management-governance.pdf (accessed 15 April 2019).


Workshops
To explore this further, we held scoping roundtables in 2018 to discuss key questions with a range of stakeholders, including:

• what draws data science talent to particular sectors and organisations?
• how can universities and the public sector learn from industry’s success?
• what kinds of models should we explore to enable a thriving landscape?
• how can we promote collaboration and cross-sector careers?

Participants highlighted a number of existing and potential new models to attract, educate and retain data science talent, and ensure a healthy research landscape.

We also explored the drivers for individual data scientists to pursue careers in different sectors, further interrogated the models for upskilling, retaining and sharing data science talent, and sought to determine drivers for the movement of data scientists. Participants also discussed the application challenges and conditions needed for success, such as infrastructure and funding.

Statistics
The Royal Society commissioned Burning Glass Technologies, an analytics software company that provides real-time data on job growth, skills in demand, and labour market trends, to provide data and analyses on UK job adverts related to data science and analytics. This was in order to determine demand trends, salary changes, location of positions, skills and experience requirements. This analysis also sets out a taxonomy of the various, interconnected occupations related to generation and use of data.

Interviews with data scientists from academia, charities, private and public sectors
To understand the drivers and blockers in more detail, a series of interviews was carried out with researchers working within the broad field of data science and particularly those who have worked across academia, government and industry. They revealed a complex network of factors that impacted on career choices. Interviewees also spoke about the mechanisms that have allowed them to thrive in multiple roles across the landscape, or move between them, and highlighted where they perceived skills gaps and gave suggestions for how to fill them.

The full set of career case studies is available as a separate publication at royalsociety.org/dynamics-of-data-science-skills. The stories illustrate the richness of the data science landscape and can be used to tell stories of data science careers.

“The pace of technology evolution is ever increasing. I think we should all be doing everything we can to enable people to be part of that. I would support training up of technical individuals who are open about intending to take careers in industry. I think that should be a strategic objective for a country.”
Dr Ilya Zheludev, Chief Data Officer at Jasmine22.


Highlights from the career stories

The full set of career case studies, Dynamics of data science: What data professionals say about data science, is available to download at royalsociety.org/dynamics-of-data-science-skills.

“Data science is an interdisciplinary career where you have to have a bit of maths knowledge, a little bit of coding knowledge and computers, but you also have to be quite innovative and curious about wanting to hack things.”
Alexis Fernquest, Data Scientist at the Office for National Statistics.

“I am motivated by the impact that use of data in a safe, confined, reusable, scalable way can have on the end person. I am motivated by introducing the next shift of efficiency, in terms of how you can derive a value from data. That could be anything from the ability to collect data to the ability to build a product off the back of it.”
Dr Ilya Zheludev, Chief Data Officer at Jasmine22.

“I consider myself to be a data scientist and a data science researcher, because I am still quite in love with the research side of data science. I am starting to understand that some of it might require two different skill sets, but there is definitely a lot of overlap. Data science seems to be quite multidisciplinary. You have to have a very scientific mindset and to be very logical with how you might approach your problem. But at the same time you are also interfacing with businesses and the public, so you have to be very good at communication.”
Chanuki Illuska Seresinhe, Senior Data Scientist, Channel 4.

“I wholeheartedly, absolutely, consider myself to be a data scientist. When I was a maths undergraduate, I was doing these sorts of data exercises and simple experiments. I was doing trials on friends. We would collect our opinions and guesses; run perceptual experiments and psychometric experiments, and do elementary data analyses. I loved that. But people who really like doing stuff with data did not fit into anywhere in the undergraduate curriculum at the time.”
Dr Damon Wischik, Lecturer, Department of Computer Science and Technology, University of Cambridge and former Royal Society University Research Fellow.

“My scientific background in physical sciences helped me get a leg up at the start of my career, in terms of the maths and statistical principles behind the work. A sizeable proportion of data scientists and operational researchers within government now have postgraduate degrees in physical sciences; astrophysics PhDs, particle physics PhDs. This is not an uncommon route.”
Nick Manton, Head of Data Science at the Government Digital Service and cross-government Head of Community for Data Scientists within the Digital Data and Technology profession.

“As a data scientist, I feel better when I do not know in advance how to solve something; it is more exciting. I think that is the reason that most data scientists, including me, jump from one industry to another after four or five years.”
Milton Luaces, Senior Manager at Accenture in the Fraud and Risk practice.

Chapter one

Data science demand: what the data tells us

Image © monsitj.

The history of the data science discipline shows that it has deep roots, developing over time from a landscape of interrelated subjects. However, in recent years the demand for specialists has grown significantly, calling for new combinations of skills. There has also been an increase in the need for data skills across the board. The pervasiveness of data is rewriting the rules of many professions.

To better understand the economy’s changing needs, the Royal Society commissioned the analytics software company Burning Glass Technologies to analyse the demand for data skills across the UK. Burning Glass Technologies analysed 9.2 million UK-based job postings to compare changes in demand over a five-and-a-half-year period (2013 – 2017/18). Their algorithms removed duplicates, scams and junk content from the analysis23.

This section presents key findings from the Burning Glass Technologies data. It covers trends, salaries, skills, industry-specific demand and regional variation. Alongside this are the methodology, parameters and caveats for understanding the data, and a discussion around supply, demand and the skills gap.

See the data appendix for further information and tables.

Methodology
These findings are based on data provided to the Royal Society by Burning Glass Technologies. Burning Glass Technologies track real-time demand by collecting job postings from more than 7,500 UK online job sites to develop a comprehensive portrait. Table 1 sets out the framework categories used by Burning Glass Technologies to narrow down a set of sample occupations to include in this analysis. The table provides examples of specific occupations that fit under each of the broad framework categories and a description of the functional role.

Burning Glass Technologies have developed a taxonomy of more than 500 skill clusters by grouping skills that often travel together in job postings. In this report, clustering has been focused on the most frequently occurring skills across a range of industries, in addition to skills in emerging areas. Burning Glass Technologies use software that extracts topline information about each job vacancy (such as title, employer, and industry) and ‘reads’ each job description to identify specific skills and qualifications that employers are seeking. A benefit of using real-time data is that the evolution of skills over time can be tracked, which is important for understanding new and developing skills such as data science24.
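As an illustration only, the idea of 'reading' job descriptions for skills and then looking at which skills travel together can be sketched in a few lines of Python. This is not the Burning Glass Technologies software: the skill vocabulary, the example adverts and the function names below are all hypothetical, and simple substring matching stands in for whatever text processing the real system uses.

```python
from collections import Counter
from itertools import combinations

# Hypothetical skill vocabulary; a real taxonomy would be far larger.
SKILL_TERMS = ["python", "sql", "machine learning", "hadoop", "excel"]


def extract_skills(description: str) -> set[str]:
    """Return the vocabulary terms that appear in a job description."""
    text = description.lower()
    return {term for term in SKILL_TERMS if term in text}


def co_occurrence(descriptions: list[str]) -> Counter:
    """Count how often each pair of skills appears in the same advert."""
    pairs = Counter()
    for description in descriptions:
        skills = sorted(extract_skills(description))
        pairs.update(combinations(skills, 2))
    return pairs


# Invented adverts, for illustration only.
adverts = [
    "Data Scientist needed: Python, SQL and machine learning.",
    "Data Engineer with Python, SQL and Hadoop experience.",
    "Analyst role using Excel and SQL.",
]
pair_counts = co_occurrence(adverts)

for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

Pairs with high counts are candidates for the same skill cluster. Naive substring matching would also match 'excel' inside 'excellent', so a production system would need proper tokenisation.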

23. Source: Burning Glass Technologies. 2019


24. Burning Glass Technologies. 2019 Frequently Asked Questions. See https://www.burning-glass.com/about/faq/
(accessed 15 April 2019).


Parameters and caveats
There are challenges to providing definitive statistics on labour market demand and a cautious approach is recommended when interpreting the results. Some caveats to the data findings are presented below:

• Inconsistent job titles: Job titles are not consistent across many of these positions. An employee called a ‘Data Scientist’ at one company may have a distinctly different skill profile from a ‘Data Scientist’ at another firm, making it difficult to analyse the overall profile across all roles.
• Missing jobs: Many jobs are not advertised online and therefore are not included in the data, or are advertised within closed platforms such as the Civil Service portal.
• Totals: Not all jobs contain all information and so overall totals can differ.
• Salary gap: Many roles are advertised without salary details. Across all occupations in the Data Science and Advanced Analyst category, 37 – 50% of postings contained salary information in 2017/18.
• Re-labelling of jobs: As data science has become widely recognised some jobs are likely to have been ‘re-branded’ and this may warrant further investigation.
• Data periods: Two twelve-month time periods have been included in this report, covering January – December 2013 and July 2017 – June 2018. These represent the earliest and latest available figures at the time the data was commissioned.

Supplementing the data
Despite the caveats listed above, the data provided by Burning Glass Technologies matches reasonably well with vacancy statistics from the Office for National Statistics. Further analysis is needed to quantify the number of employed workers per opening. We suggest therefore that this data is looked at alongside alternative data sources, such as the UK labour market survey25.

Another way to supplement the data provided by Burning Glass Technologies is to track emerging fields using occupations or skills as a proxy for the importance of certain roles in new industries. For example, ‘Artificial Intelligence’ could be examined as a skill in order to show which roles are most commonly calling for it. This may be a good way to recognise new and emerging industries, such as those in the technology sector.

Some work has been done by other organisations to help measure the demand for emerging skills, such as AI. For example, Nesta has built a new skills taxonomy which shows the skill groups needed by workers in the UK26. The taxonomy can be used as a framework to measure the demand for certain skills among employers, the current supply of those skills from workers, and the potential supply based on courses offered by education providers and employers27. Their research finds that out of 143 clusters of skills, ‘Data Engineering’ stands out as the skill cluster with the highest annual median salary and growth in demand28. This complements our finding that Data Engineering is in high demand.

25. Office for National Statistics. 2019 Labour market overview, UK Statistical bulletins. See https://www.ons.gov.
uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/uklabourmarket/
previousReleases (accessed 15 April 2019).
26. Djumalieva J & Sleeman C. 2018 Linking skills to occupations: Using big data to build a new occupational
taxonomy for the UK. See https://www.nesta.org.uk/blog/linking-skills-to-occupations-using-big-data-to-build-a-new-
occupational-taxonomy-for-the-uk/ (accessed 15 April 2019).
27. Djumalieva J and Sleeman C. 2018 Making Sense of Skills. See http://data-viz.nesta.org.uk/skills-taxonomy/index.
html (accessed 15 April 2019).
28. ibid.


Table 1

Framework categories showing increasing levels of analytical rigour across all Data Science and Analytics (DSA) jobs.

Framework | Functional role | Sample occupations
Data Scientists and Advanced Analysts | Create sophisticated analytical models used to build new datasets and derive new insights from data | Data Scientist; Economist; Data Engineer; Biostatistician; Statistician; Financial Quantitative Analyst
Data Analysts | Leverage data analysis and modelling techniques to solve problems and glean insight across functional domains | Data Analyst; Business Intelligence Analyst
Data Systems Developers | Design, build and maintain an organisation’s data and analytical infrastructure | Systems Analyst; Database Administrator
Analytics Managers | Oversee analytical operations and communicate insights to executives | Chief Analytics Officer; Marketing Analytics Manager
Functional Analysts | Utilise data and analytical models to inform specific functions and business decisions | Business Analyst; Financial Analyst
Data-Driven Decision Makers | Leverage data to inform strategic and operational decisions | IT Project Manager; Marketing Manager


BOX 1

Understanding the Burning Glass Technologies methodology: Classification

Burning Glass Technologies identified occupations that commonly require some mix of analytical skills and grouped them into job categories based upon similarities in skillsets and functional roles within the broader ‘Data Science and Analytics’ landscape. The framework is based on grouping similar occupations based on the skills and experience required. An occupation is a person's regular activity, performed in exchange for payment. Every job advert in Burning Glass Technologies’ database has been assigned an occupation from the Burning Glass Technologies occupation taxonomy.

The mapping is performed by a logical rules-based system that assigns an occupation based on the information extracted from a posting. Each job advert is assigned to one occupation. The rules system compares the information found against a list of prioritised criteria, such as the clean title, the skills and certifications mentioned in the job text. It is a linearised decision tree with rules that take the following format: ‘if condition 1 and condition 2 and condition 3 […] then outcome’. Every rule uses at least one condition which is usually based on the job title; other conditions that are based on skills, certifications, or industry are optional.

• Framework categories: The categories are: Data Systems Developers, Functional Analysts, Data Analysts, Data Scientists and Advanced Analysts, Analytics Managers and Data-Driven Decision Makers. Functional Analysts and Data-Driven Decision Makers are less analytical than the other four categories, and may be thought of as data-enabled, rather than pure analytics, roles. Nonetheless, they require many overlapping skillsets with other analytics roles, and are important for organisations consuming and interpreting data.

• Occupational categories: These six framework categories are further broken down into 40 occupations. Roles within each DSA framework category may vary in terms of required skills and experience, but are grouped together based on their function within an organisation. Data Scientists and Economists, for example, belong to the Data Scientists and Advanced Analysts category. Although these roles entail different skillsets, offer different salaries, and require different experience, employees in both are expected to create sophisticated analytical models, work with large datasets, and derive insights from data. The context in which a Data Scientist performs these tasks may differ from that of an Economist, but both roles serve similar high-level functions within an organisation and may require similar levels of analytical rigour.
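The rules-based mapping described in this box (a prioritised list of 'if condition 1 and condition 2 […] then outcome' rules, each with at least one job-title condition and optional skill conditions) can be sketched as follows. This is a minimal illustration under our own assumptions, not the Burning Glass Technologies rule set: the rules, their order and the names `Rule` and `classify` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    occupation: str                               # the outcome of the rule
    title_keywords: tuple[str, ...]               # at least one title-based condition
    required_skills: frozenset[str] = frozenset()  # optional skill conditions

    def matches(self, title: str, skills: set[str]) -> bool:
        """All conditions must hold: every keyword in the title, every skill present."""
        lowered = title.lower()
        return (all(keyword in lowered for keyword in self.title_keywords)
                and self.required_skills <= skills)


# Hypothetical rules, listed in priority order; the first match wins.
RULES = [
    Rule("Data Engineer", ("data", "engineer")),
    Rule("Data Scientist", ("data", "scientist")),
    Rule("Data Scientist", ("analyst",), frozenset({"machine learning", "python"})),
    Rule("Data Analyst", ("analyst",)),
]


def classify(title: str, skills: set[str]) -> Optional[str]:
    """Assign exactly one occupation, or None if no rule applies."""
    for rule in RULES:
        if rule.matches(title, skills):
            return rule.occupation
    return None


print(classify("Senior Data Scientist", set()))
print(classify("Analyst", {"python", "machine learning", "sql"}))
print(classify("Reporting Analyst", {"excel"}))
```

Because the rules are evaluated in order, an advert titled 'Analyst' that asks for Python and machine learning is caught by the more specific rule before the general 'analyst' rule is reached. This is what makes the list behave like a linearised decision tree.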


Overall trends (see Table 2)
The data covers the period 2013 to 2017/18. Two time periods were compared, January – December 2013 and July 2017 – June 2018. In 2013, 6.7 million UK-based job postings were listed. In 2017/18 there were 9.2 million postings (an increase of 36%). Growth for all Data Science and Analytics (DSA) jobs was similar (35%).

However, during the period there was a sharp rise in job adverts calling specifically for Data Scientists and Advanced Analysts (DSAA) (231%). This was particularly driven by a rise in 'Data Scientists' (1,287%) and 'Data Engineers' (452%).

Salary findings (see Table 2)
The data shows that the salary changes for different categories do not necessarily correlate with growth in demand. The salary figures are indicative only, however, as many job postings do not advertise a salary, and it is possible that this is especially the case with roles at the higher end of the salary scale. See the data appendix for further information.

Skills and skills clusters

Table 3 – Skills
The most frequently occurring skills in 2017-18 were Data Science, Python and SQL. These and other skills most frequently requested for DSAA job postings were generally open source, could be applied across sectors and harness high levels of computing power.

This differed from the skills most frequently requested in 2013, which included Microsoft Excel and domain-specific skills such as Economics (Appendix Table 3).

Table 4 – Skills clusters
Structuring and organising the skill-level information into clusters can help with analysis. This is particularly important in emerging areas, where jobs are less well defined.

The most frequently occurring skills clusters in DSAA job adverts were Data Science, Scripting Languages and Big Data. The Data Science skills cluster featured in 25,042 DSAA job postings (93%), suggesting high demand for this skill across occupations beyond Data Scientist.

In 2013, the most frequently requested skills clusters in DSAA job adverts were Statistics and Statistical Software skills (Appendix Table 4). Domain-specific expertise such as Economics and Medical Research was in greater relative demand in 2013 compared to 2017-18. These shifts in demand from employers likely reflect a constantly evolving data science landscape.


Regional demand (see Figures 3 and 4, and Appendix Table 2)
Regional breakdowns show the dominance of London for Data Scientist and Advanced Analysts, accounting for 58% of all postings in 2017/18. However, for Data Scientists and Advanced Analyst job vacancies (fig 4), growth was larger (relative to the base amount) in Northern Ireland (563%), the North West (269%) and the East of England (250%).
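As a worked check, the 58% London share appears to be consistent with the 2017/18 DSAA counts in Figure 4 when the denominator is the sum of postings that specify a region, rather than the UK total of 27,033 (which includes adverts with no location). The sketch below uses the Figure 4 counts; the interpretation of the denominator is our assumption, and the variable names are illustrative.

```python
# DSAA job postings by region, 2017/18, as shown in Figure 4.
dsaa_2017_18 = {
    "Scotland": 1_114,
    "North East": 175,
    "North West": 1_182,
    "Yorkshire and the Humber": 746,
    "East Midlands": 474,
    "Northern Ireland": 305,
    "East of England": 1_928,
    "West Midlands": 748,
    "Wales": 216,
    "Greater London": 14_066,
    "South West": 903,
    "South East": 2_305,
}
UK_TOTAL = 27_033  # includes adverts that did not state a location

located_total = sum(dsaa_2017_18.values())
london = dsaa_2017_18["Greater London"]

# Assumption: the quoted share is taken over located postings only.
london_share_located = london / located_total * 100
london_share_all = london / UK_TOTAL * 100

print(f"Located postings: {located_total:,}")
print(f"London share of located postings: {london_share_located:.0f}%")
print(f"London share of all UK postings: {london_share_all:.0f}%")
```

The two denominators give noticeably different shares, so it is worth stating which one is used whenever regional percentages are quoted.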

Table 2

Job postings and salary information for data science and analytics jobs (including a detailed look at the Data Scientists and Advanced Analysts category).

Category | Postings 2013 | Postings 2017 – 18 | Postings change % | Average salary 2013 | Average salary 2017 – 18 | Salary change %
Data Scientists and Advanced Analysts | 8,157 | 27,033 | 231% | £52,766 | £64,376 | 22%
– Data Scientists | 768 | 10,655 | 1287% | £63,053 | £65,188 | 3%
– Data Engineers | 1,213 | 6,699 | 452% | £53,449 | £73,559 | 38%
– Biostatistician | 936 | 1,823 | 95% | £43,601 | £41,411 | -5%
– Financial Quantitative Analyst | 1,458 | 2,799 | 92% | £68,324 | £71,079 | 4%
– Economist | 2,185 | 2,929 | 34% | £47,512 | £49,959 | 5%
– Statistician | 1,597 | 2,128 | 33% | £44,944 | £44,910 | 0%
Data Analysts | 79,903 | 114,327 | 43% | £47,524 | £46,531 | -2%
Data Systems Developers | 150,041 | 174,504 | 16% | £53,775 | £55,126 | 3%
Analytics Managers | 23,625 | 32,325 | 37% | £54,623 | £58,835 | 8%
Data-Driven Decision Makers | 280,218 | 390,328 | 39% | £50,264 | £50,794 | 1%
Functional Analysts | 194,667 | 257,745 | 32% | £45,606 | £43,878 | -4%
Total (DSA jobs) | 736,611 | 996,262 | 35% | | |
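The 'Change %' columns in Table 2 follow the usual percentage-change formula, (later − earlier) / earlier × 100. The short sketch below recomputes a few of the posting figures from the table; the function and variable names are illustrative rather than taken from the underlying analysis.

```python
def percent_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


# (2013, 2017-18) job postings, taken from Table 2.
postings = {
    "Data Scientists and Advanced Analysts": (8_157, 27_033),
    "Data Scientists": (768, 10_655),
    "Data Engineers": (1_213, 6_699),
    "Total (DSA jobs)": (736_611, 996_262),
}

growth = {
    name: round(percent_change(before, after))
    for name, (before, after) in postings.items()
}

for name, change in growth.items():
    print(f"{name}: {change}%")
```

Because the 2013 base for 'Data Scientists' is small (768 postings), its percentage growth is very large even though the absolute increase is under 10,000 postings. This is the same base effect noted in the regional figures.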


Table 3

The top 10 skills listed in DSAA job adverts (2017/18).

This table shows the skills which occurred the most in 2017/18. This is measured in terms of the proportion of Data Science and Advanced Analyst (DSAA) job adverts which specified the skill as a requirement for the role. There were a total of 682 skills included in this analysis. This table displays the top ten most frequently occurring skills.

Rank | Skill | Number of DSAA job adverts requiring this skill, 2013* | Number, 2017 – 18** | Percentage of DSAA job adverts requiring this skill, 2013 | Percentage, 2017 – 18
1 | Data Science | 738 | 11,989 | 9% | 44%
2 | Python | 681 | 11,647 | 8% | 43%
3 | SQL | 1,048 | 7,226 | 13% | 27%
4 | Machine Learning | 352 | 7,089 | 4% | 26%
5 | Big Data | 566 | 6,770 | 7% | 25%
6 | Research | 2,042 | 5,279 | 25% | 20%
7 | Apache Hadoop | 533 | 5,199 | 7% | 19%
8 | Communication Skills | 1,843 | 4,849 | 23% | 18%
9 | Java | 703 | 4,111 | 9% | 15%
10 | Scala | 58 | 3,276 | 1% | 12%

*8,157 job adverts included in this analysis. **27,033 job adverts included in this analysis.
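The percentages in Table 3 are each skill's advert count divided by the total number of DSAA adverts in that period. As a rough sketch (names are illustrative), the calculation below also shows why counts and shares can move in opposite directions: 'Research' appears in more adverts in 2017/18 than in 2013, yet its share of adverts falls because the total grew faster.

```python
# Total DSAA adverts analysed in each period (Table 3 footnotes).
TOTAL_ADVERTS = (8_157, 27_033)  # (2013, 2017-18)

# (2013, 2017-18) advert counts for selected skills, from Table 3.
SKILL_COUNTS = {
    "Data Science": (738, 11_989),
    "Python": (681, 11_647),
    "Research": (2_042, 5_279),
}


def share(count: int, total: int) -> float:
    """Percentage of adverts that mention a skill."""
    return count / total * 100


shares = {
    skill: (share(c13, TOTAL_ADVERTS[0]), share(c18, TOTAL_ADVERTS[1]))
    for skill, (c13, c18) in SKILL_COUNTS.items()
}
# Change in share, in percentage points.
shifts = {skill: later - earlier for skill, (earlier, later) in shares.items()}

for skill, (earlier, later) in shares.items():
    print(f"{skill}: {earlier:.0f}% -> {later:.0f}% ({shifts[skill]:+.1f} points)")
```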


Table 4

The top 10 skills clusters listed in DSAA job adverts (2017/18).

This table shows the skills clusters which occurred the most in 2017/18. This is measured in terms of the proportion of Data Science and Advanced Analyst (DSAA) job adverts which specified the skills clusters as a requirement for the role. There were a total of 278 skills clusters included in this analysis. This table displays the top ten most frequently occurring skills clusters.

Rank | Skills cluster | Number of DSAA job adverts requiring this skill cluster, 2013* | Number, 2017 – 18** | Percentage of DSAA job adverts requiring this skill cluster, 2013 | Percentage, 2017 – 18
1 | Data Science | 1,056 | 25,042 | 13% | 93%
2 | Scripting Languages | 829 | 23,450 | 10% | 87%
3 | Big Data | 745 | 17,090 | 9% | 63%
4 | SQL Databases and Programming | 1,151 | 15,532 | 14% | 57%
5 | Machine Learning | 474 | 15,236 | 6% | 56%
6 | Data Analysis | 1,628 | 10,636 | 20% | 39%
7 | Statistical Software | 1,797 | 8,894 | 22% | 33%
8 | Java | 703 | 8,258 | 9% | 31%
9 | Statistics | 1,908 | 7,406 | 23% | 27%
10 | Software Development Principles | 485 | 6,956 | 6% | 26%

*Out of 8,157 job adverts included in this analysis. **Out of 27,033 job adverts included in this analysis.


Figure 3

Map showing the growth of all data job postings across the UK from 2013 to 2017 – 18.

Region | 2013 | 2017/2018 | Growth since 2013
Total for UK* | 736,611 | 996,262 | 35%
Scotland | 36,388 | 41,117 | 13%
North East | 7,444 | 11,095 | 49%
North West | 45,286 | 61,871 | 37%
Yorkshire and the Humber | 37,885 | 47,014 | 24%
East Midlands | 30,957 | 37,743 | 22%
Northern Ireland | 4,987 | 11,902 | 139%
East of England | 53,295 | 61,725 | 16%
West Midlands | 41,771 | 86,563 | 107%
Wales | 9,198 | 14,507 | 58%
Greater London | 283,817 | 345,164 | 22%
South West | 41,066 | 59,581 | 45%
South East | 112,721 | 126,551 | 12%

Home nations | 2013 | 2017/2018 | Growth since 2013
Northern Ireland | 4,987 | 11,902 | 139%
Scotland | 36,388 | 41,117 | 13%
Wales | 9,198 | 14,507 | 58%
England | 654,242 | 837,307 | 25%

*Regional figures do not add up to the UK total because job adverts that did not include location (eg remote working) have been excluded.


Figure 4

Map showing the growth of Data Science and Advanced Analyst (DSAA) job postings across the UK from 2013 to 2017–18.

Area                        2013       2017/2018   Growth since 2013
Total for UK*               8,157      27,033      231%

Scotland                    528        1,114       111%
North East                  54         175         224%
North West                  320        1,182       269%
Yorkshire and the Humber    256        746         191%
East Midlands               200        474         137%
Northern Ireland            46         305         563%
East of England             551        1,928       250%
West Midlands               256        748         192%
Wales                       121        216         79%
Greater London              4,131      14,066      240%
South West                  276        903         227%
South East                  1,086      2,305       112%

Home nations
Northern Ireland            46         305         563%
Scotland                    528        1,114       111%
Wales                       121        216         79%
England                     7,130      22,527      216%

*Regional figures do not add up to the UK total because job adverts that did not include location (eg remote working) have been excluded.
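
The growth percentages in Figure 4 appear to be the change in advert numbers relative to the 2013 count. The sketch below reproduces a few of them under that assumption; the function name `growth_pct` is hypothetical, and the counts are copied from the figure.

```python
def growth_pct(start: int, end: int) -> float:
    """Percentage growth from start to end, relative to start."""
    return 100 * (end - start) / start


# (2013, 2017/2018) DSAA job advert counts taken from Figure 4.
dsaa_adverts = {
    "UK": (8_157, 27_033),
    "Greater London": (4_131, 14_066),
    "Scotland": (528, 1_114),
    "Wales": (121, 216),
    "Northern Ireland": (46, 305),
}

for area, (start, end) in dsaa_adverts.items():
    print(
        f"{area}: {growth_pct(start, end):.0f}% growth, "
        f"{end - start:,} more adverts"
    )
```

Printing the absolute change alongside the percentage is a deliberate choice: a small base can produce a very large growth rate. Northern Ireland's 563% corresponds to 259 additional adverts, while Greater London's 240% corresponds to 9,935.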


Discussion of the data and understanding the skills gap

Since 2012, the data science industry has moved extremely quickly. It is possible that some of the jobs included in this analysis have been relabelled as ‘Data Scientist’ or associated roles, possibly leading to inflated or exaggerated growth rates. This is likely to reflect wider changes in the language used to describe data science and data scientists.

It is very hard to estimate the true gap between market demand and supply. Vacancies are not necessarily a good proxy for the number of data science jobs. One reason for this is that if job turnover increases, more vacancies will be advertised even if there has been no change in the total number of people working as data scientists. Additional analysis of the Labour Force Survey could help to aid understanding of the actual number of jobs.

There are also a number of challenges to understanding the supply of data science graduates. The Higher Education Statistics Agency (HESA) codes that are used to classify subjects studied do not include data science. In any case, only a very small minority of data scientists will come from data science courses.

Ultimately, more needs to be done to determine the existence, size and scale of the skills gap. Further analysis is needed of the way statistics on higher education are captured, so as to understand better the challenges faced by employers. The lack of traditional labour market data for these roles has created an information gap that is unhelpful to educators, employers, and policymakers as they attempt to build a workforce with the skills needed across the landscape.

The Data Skills Taskforce is currently looking at the gap between the supply of students coming through with the required knowledge and skills, and the demand required by employers. No doubt data science techniques will themselves need to be applied to predict future workforce needs.

BOX 2

The Data Skills Taskforce

The Data Skills Taskforce, chaired by Accenture and the Alan Turing Institute, sets an agenda for change to inspire, educate and upskill data talent, drawing on best practice from the UK’s leading institutions. The taskforce was established to review, promote and take forward key elements of Analytic Britain, across schools, universities and the labour market at large29. It comprises UK businesses, data skills stakeholders and the Department for Digital, Culture, Media and Sport.

The main aims of the Data Skills Taskforce are to promote the importance of data skills, highlight critical skills gaps and monitor progress against the recommendations of Analytic Britain. The Data Skills Taskforce is building a platform to help SMEs develop their data science capabilities, working to quantify the UK data skills gap, which is currently significant and growing, and considering whether a data science foundation course for all undergraduates is required.
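
The point made in this discussion, that vacancies are an imperfect proxy for the number of jobs, can be illustrated with a toy calculation. The sketch below assumes a deliberately simple model in which advertised vacancies equal replacement hiring (headcount multiplied by a turnover rate) plus any net new posts. The model, the function name and all the numbers are hypothetical and are not drawn from the report's data.

```python
def vacancies_advertised(headcount: int, turnover_rate: float, net_new_jobs: int = 0) -> int:
    """Toy model: adverts = leavers who must be replaced + newly created posts."""
    return round(headcount * turnover_rate) + net_new_jobs


HEADCOUNT = 10_000  # hypothetical number of people working as data scientists

# Same workforce size in both years; only turnover differs.
year_one = vacancies_advertised(HEADCOUNT, turnover_rate=0.10)
year_two = vacancies_advertised(HEADCOUNT, turnover_rate=0.20)

print(f"Year one adverts: {year_one:,}")
print(f"Year two adverts: {year_two:,}")
print(f"Change in number of jobs: {HEADCOUNT - HEADCOUNT}")
```

In this illustration the number of adverts doubles although the number of jobs is unchanged, which is why a source that counts people in work, such as the Labour Force Survey, is a useful complement to advert counts.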

29. Mateos-Garcia J, Windsor G and Roseveare S. 2015 Analytic Britain: Securing the right skills for the data-driven
economy. See https://media.nesta.org.uk/documents/analytic_britain.pdf (accessed 15 April 2019).


Chapter two

Dynamics of data science: career paths and talent flows between sectors

Image © Laurence Dutton.


At the heart of this report is our vision for a healthy data science skills landscape. Integral to this are practical models and mechanisms, which can allow the flow of data science talent across academia, industry, government and the charity sector. This will ensure that the data science ecosystem is balanced and sustainable, with all sectors benefitting from the promise of data science capability and expertise, and an attractive and exciting diversity of career opportunities and developmental challenges for data scientists.

In this section we examine the drivers for movement of data science professionals across academia and the private and public sectors, and what can be done to facilitate greater career mobility and exchange between research in data science, and the application of data science research in the private and public sectors.

Data science professionals and those who employ them identified the following drivers: the reasons why they might choose to move around in their careers, and what might stop them from doing so. The drivers came from a series of workshops with data experts across sectors. Some of the key trends for each sector have been explored in more detail in a series of career interviews (available in an accompanying booklet).


Figure 5

Examples of drivers of movement between sectors.

[Diagram linking three sectors: academia, industry and the public sector. Drivers shown: intellectual freedom; joint appointments; recognition; access to data and computing power; impact on policy and public services; salary; access to data; culture; career development.]

Note: Some of these drivers are not the exclusive preserve of any one sector but are highlighted here for illustrative purposes.


Figure 6

Examples of drivers and blockers by sector.

Public sector: KEY DRIVERS

Societal impact
- Directly beneficial applications of data science, eg improving healthcare and treatment discovery.
- Interesting opportunities to collaborate with others, eg in security and defence.
- Interesting challenges that benefit society.

Access to data
- Good access to large datasets which may otherwise be inaccessible.
- Data with a high level of integrity.
- The potential to combine large datasets provides opportunity for real change.

Academia: KEY DRIVERS

Culture
- High levels of expertise and access to very talented people.
- Enabling data science through the development of theory and practice.
- Data scientists now ‘on the bridge’ and feeling valued.
- Freedom to choose research topic and research is ‘open’.

Career development
- Department Heads recognising the value of data science is essential to keeping data scientists in academia.
- Spin-out culture: where universities encourage spin-outs there is an incentive to stay in academia.
- Fellowships and awards create opportunities for researchers to build their career in academia.

Industry: KEY DRIVERS

Start-up culture
- Competes with academia on time to focus on science.
- Multidisciplinary teams, less siloed.
- When there is an ability to publish, industry competes with academia.
- Working directly on real-world applications (benefit to society).
- Attractive working environments at some large companies, with space for creativity.

Salary
- Salary is a pull factor towards industry plus other related financial perks, ie more permanent contracts and clearer career development paths.
- Salary is also a pull factor to international destinations, which has implications for the UK digital economy.


Public sector: BLOCKERS
- Struggle to match industry/market rate salaries to bring in the right talent and skills.

Academia: KEY DRIVERS (continued)
- Academia provides the ability to publish, important to furthering a research career.
- Growing opportunities for joint appointments with industry.

Academia: BLOCKERS
- Struggle to match industry/market rate salaries to bring in the right talent and skills.
- Non-standard research outputs are not always valued in academia.
- Lack of clear career paths for data scientists who want to specialise.
- Access to good quality datasets.
- Lack of or limited permanent positions.

Industry: KEY DRIVERS (continued)

Access to data and computing power
- Data which can be used for commercial purposes.
- Large datasets can provide a competitive advantage.
- Access to infrastructure: computing power is more readily available in certain organisations.

Industry: BLOCKERS
- Research and development is subject to demand within a company and can change as strategies or economic conditions change.
- Where publishing is not an option, industry can be less attractive to data scientists.
- Salaries are not ubiquitously good across industry.


How to free up movement

It can be difficult for data scientists in industry to move into academia, but there are ways to make moving between sectors a natural part of the data science career path. This matters on a variety of timescales, from a career move to a week-long period of study through a secondment or internship.

“It was feasible for me to move from industry to academia due to, in part, circumstance; the industry research environment that I was in happened to be one that allowed me to build up a strong published research portfolio. It also allowed me to do work that was public rather than a corporate secret.”
Professor Graham Cormode, Professor in Computer Science at the University of Warwick.

Publishing
In particular, there is a need to address the problem of the ‘one-way door’ out of academia, which makes it difficult for researchers to return after spending time in industry or government. One enabler of this movement is the ability of researchers to keep publishing when they are outside of academia.

In 2016, Apple announced that it would begin publishing its machine learning research to help attract and retain top talent in the company30.

However, publishing is not always possible outside of academia. In those cases, it is important for universities to recognise the value of work which does not result in an academic publication, and conversely, industry should recognise the skills involved in higher level research.

Outputs
Another key enabler depends on universities being more willing to recognise the value of data science experience, other than traditional published science, obtained outside universities, in the private and public sectors. Alternative outputs are increasingly recognised as an important aspect of research and should be an asset to researchers in their careers.

Experience
There is also a broader policy debate about how data science can support innovation. Researchers in fields like machine learning can have ‘a foot in each camp’: academic jobs alongside involvement with start-ups, or new agreements to work between sectors.

Fellowships and grants at the interface of industry and academia are one way of enabling people to gain experience that is valuable to all sectors. The adoption of data science is enabling government to unlock the value of the data it holds. It is being used to visualise and understand data, build tools that help policymakers access and use information, and carry out analysis that helps improve a service or drive efficiency. There are also specific challenges for movement between academia and the public sector, and fellowship models to overcome these.

30. Shead S. 2016 Apple is finally going to start publishing its AI research. Business Insider. See https://www.businessinsider.com/apple-is-finally-going-to-start-publishing-its-artificial-intelligence-research-2016-12?utm_source=feedly&utm_medium=webfeeds&r=US&IR=T (accessed 15 April 2019).


Highlights from the career stories


The full set of career case studies, Dynamics of data science: What data professionals say about data science, is available to download at royalsociety.org/dynamics-of-data-science-skills.

“From the point of view of academic career development, my six years in Silicon Valley were completely wasted. It does not translate into anything tangible on the academic career ladder. But, I suspect that is part of why Cambridge University decided to hire me. This kind of experience must be recognised. It did not lead to papers. It did not lead to cutting-edge stuff. But it led to loads of ideas.”
Dr Damon Wischik, Lecturer, Department of Computer Science and Technology, University of Cambridge.

“Once you have left academia and you are not publishing high‑quality papers, it is very difficult to go back. If there was more funding for projects that required one commercial partner and one academic partner, that would be interesting.”
Chanuki Illushka Seresinhe, Lead Data Scientist at Popsa.

“The counterpoint is that mobility can be harder in either direction if the gap is wider. It is often said that it is difficult for people from industry to move into academia if they have not maintained a research profile.

Equally, I think if you have been working in an academic environment focused on publishing and you want to go to a more industrial employer, one of the things you will face is that they will say, ‘Before we get started, can you do some programming exercises or fairly hands-on stuff?’ For someone who has been working on equations and theory, it can actually be a bit of a wrench to reorient to those more hands-on expectations that organisations have.”
Professor Graham Cormode, Professor in Computer Science at the University of Warwick.

“This is a major data science/AI challenge for government, innovative companies and universities. In my experience, the boundary between universities and government is potentially difficult: the timescales and measures of success are very different on either side. But there are various experiments, for example the Policy Fellowships pioneered by the Centre for Science and Policy, which show how the two sides can learn from each other to the enormous benefit of our society as a whole.”
Professor Frank Kelly CBE FRS, Professor of the Mathematics of Systems at the University of Cambridge.

Chapter three

Area for action: Developing foundational knowledge and skills

Image © jacoblund.


Education should provide a grounding to ensure that all young people develop underpinning data science knowledge and skills. The data experts that we spoke to highlighted a range of core skills and disciplines that need to be developed early on, including coding, computer science, mathematics, machine learning, statistics, and more.

“It is not common for my generation to be able to program. There are loads of really great online resources, but I think it can still be difficult for people to see why it would be useful.”
Dr Amy Nelson, Senior Research Associate, UCL Institute of Neurology.

Last year the Royal Society published a review of how data science skills are nurtured in England's curriculum31. The review was written by a curriculum expert, Dr Vanessa Pittard, and it identified some barriers to embedding data skills into the curriculum as well as some opportunities for further development.

The review found that at primary level, pupils are likely to gain a reasonable introduction to underpinning elements of data science if they are taught well. However, at secondary level, the lack of systematic progression in relevant aspects of computing is a major curriculum challenge.

Last year the Royal Society also conducted a study identifying a range of initiatives that seek to increase the proportion of young people studying computing (particularly girls)32. The study, co-funded by Microsoft and Google, found that 54% of English schools do not offer Computer Science GCSE, and put recommendations in place to encourage greater take up and increase teachers’ confidence. The report also found that barriers exist around teacher confidence in applying mathematics to domain questions, and in conceptual and technical aspects of computing.

Alongside the technical skills, interviewees in this project also highlighted a range of other core skills for working well with data, including adaptability, curiosity, empathy, problem-solving and story-telling, which supports the case for embedding data science in an array of subjects.

There are pools of potential talent which could be reached to address local needs and there could be more courses, apprenticeships and work placements outside of London and the South East. Employer-led Trailblazer groups and public sector bodies should further support and expand existing programmes, such as those run by the Office for National Statistics and the BBC, and work together to resolve knotty delivery issues.

This chapter sets out two priority needs and offers specific recommendations for developing foundational knowledge and skills for data science. It also highlights successful or innovative models and mechanisms for developing these foundations and addressing these needs.

31. Pittard V. 2017 The integration of data science in the primary and secondary curriculum. See https://royalsociety.org/~/media/policy/Publications/2018/18-07-18-The%20Integration%20of%20Data%20Science%20in%20the%20primary%20and%20secondary%20curriculum.pdf?la=en-GB (accessed 15 April 2019).
32. Royal Society. 2017 After the reboot: computing education in UK schools. See https://royalsociety.org/~/media/events/2018/11/computing-education-1-year-on/after-the-reboot-report.pdf (accessed 15 April 2019).


Needs and recommended actions

Need: Widening access to data science education

Our work on the educational data science pipeline, discussions with data scientists and analysis of job market data showed that there is a need to follow up a number of priority needs in order to develop the foundational level knowledge and skills required for professional roles working with data. These needs cover building core skills at school level, supporting teachers, expanding apprenticeships and promoting outreach initiatives to introduce young people to data skills.

Recommended action: Data skills for everyone

At school level, data science knowledge and skills would benefit from greater integration across the primary and secondary curriculum to ensure that everyone leaves school with foundational data skills. Mathematics and computing communities, businesses and education professionals have a key role in ensuring that relevant insights are built into the education curriculum and associated enrichment activity in schools, ensuring that data science skills are appropriately embedded across subjects and phases.

Recommended action: Support teachers to teach data skills

Developing the resources, training and support for teachers requires combined effort from mathematics and computing communities, businesses, and education professionals, including the new National Centre for Computing Education and National Centre for Excellence in the Teaching of Mathematics, working with the support and input of government departments and other partners. This can build on important work already carried out by the Department for Education and others in this area, and examples such as the Royal Geographical Society’s Data Skills in Geography programme (see Models and mechanisms).

Issues around teacher confidence could be addressed through professional development opportunities and more provision of detailed guidance. Subject specific knowledge could be delivered in collaboration with universities and professional bodies (eg subject associations). There are possibilities for incorporating real uses of data across the sciences and humanities. One area that is relatively underplayed in the current curriculum is developing investigation using GDS/GIS data, which could be incorporated into humanities subjects. Appropriate computing and data infrastructure for schools will be crucial to support the teaching of data skills.

“In my apprenticeship there was a whole unit dedicated to how you work with other people in an office. You had to learn to express your opinion and convince people of your ideas. Apprenticeships could be run across different sectors so that alumni can build professional networks.”
Alexis Fernquest, Data Scientist, Office for National Statistics and former apprentice.


Recommended action: Curricula fit for the future

Looking further ahead, an analysis of the future data science needs of students, industry and academia should be undertaken to inform future curriculum development. Curriculum content groups should consider the place of data science within the curricula they are developing now, ahead of the next curriculum review. In addition to the relevant areas of mathematics, computer science and data literacy, the ethical and social implications of machine learning should be included within teaching activities in related fields, such as Personal, Social and Health Education.

Post-16 curriculum change within the next ten years will be vital to ensure young people leave education with the broad and balanced range of skills they will need to flourish in a changing world of work33. A review into post-16 learning should consider the many ways in which the post-16 curriculum could be improved and the factors which affect the options open to young people.

At university level, further consideration is needed about how universities can teach data science effectively, as a developing and interdisciplinary discipline.

“I would like to see more people with developed skills in computational statistics and machine learning coming out of undergraduate degrees that are not computer science and physics. For example, the social sciences, life sciences and environmental sciences.”
Dr James Hetherington, Director of Research Engineering.

Need: Widening access to data science education

There is also a need to address the underrepresentation of women in the talent pipeline. Across the mathematical sciences, 37.1% of university students were female in 2018. Computer science was the subject with the widest gender gap across all degree levels in 2018 (82.8% male). This was most pronounced at undergraduate level: just 15.1% of first degree undergraduates were female. Data published by Forbes from a study of the US technical education provider General Assembly found that female participation in US data science programmes and boot camps lags, with women making up 35.3% of students enrolled over a five-month period (2016/17)34.

In the UK, just 4% of the tech industry workforce is from a black, Asian and minority ethnic (BAME) background, compared with 11% in the working population as a whole, at the last census. With low diversity in academic programmes and industry bootcamps, it is unlikely that diversity gaps will close in the near future without significant investment35.


33. Royal Society. 2019 Jobs are changing, so should education. See https://royalsociety.org/~/media/policy/Publications/2019/12-02-19-jobs-are-changing-so-should-education.pdf?la=en-GB (accessed 15 April 2019).
34. Priceonomics. 2017 The Data Science Diversity Gap. Forbes. 28 September 2017. See https://www.forbes.com/sites/priceonomics/2017/09/28/the-data-science-diversity-gap/#56ccb88c5f58 (accessed 15 April 2019).
35. Douglas R. 2018 Lack of diversity increases risk of tech product failures. Financial Times. 14 November 2018. See https://www.ft.com/content/0ef656a8-cd8a-11e8-8d0b-a6539b949662 (accessed 16 April 2019).


Recommended action: Raise awareness of data science careers

Data professionals work across a wide range of roles. Greater awareness could help to attract a wider pool of students. Employers could also offer work experience, host teacher inset days and speak in schools, colleges and universities so that students and their teachers and careers advisers gain an understanding of emerging career pathways.

We have published a range of case studies alongside this report to highlight the wide range of roles in data science and related fields.

The Institute for Apprenticeships’ occupational maps are an example of how to explain career pathways to students, parents and teachers, and organisations recruiting data scientists can share stories of what they do and the impact of their work.

Recommended action: Address underrepresentation and evaluate diversity

Women make up a disproportionately small fraction of the educational pipeline associated with data science positions, and further efforts are needed by all stakeholders to address the shortage. Diversity, and not just of gender, is relevant as data science talent requires a wide set of skills, including those involved in identifying, understanding and interpreting real-world problems. A diverse pipeline of data scientists is more likely to pick up or be concerned by inadvertent biases in algorithms that can impact on many different types of people. The Hall and Pesenti review into the growth of the UK artificial intelligence industry (2017) called for government, industry and academia to embrace the value and importance of a diverse workforce and the recommendations of this review should continue to be pursued.

A significant barrier to improving diversity is the lack of access to data on diversity statistics in industry and in academia. We encourage institutions to be more transparent about diversity statistics in an effort to improve diversity in the field. Funding for diversity initiatives could be expanded to augment the presence of minority backgrounds and ethnicities.

“There should definitely be more diversity in role models. It shows you that ‘this is a career for people like me.’”
Aimee, Government Communications Headquarters (GCHQ).


Models and mechanisms

Through our workshops and interviews we identified some key models and mechanisms that can help to address these priority needs. The mechanisms include integration of data skills into the school curriculum, apprenticeships to develop practical skills, alternative degree pathways and outreach programmes. The specific examples below serve as models of what can work, and has worked, in specific contexts.

“You have to be quite resilient and open to change. In 10 years’ time it will be different again because they have only really just started trying to get coding into schools. You are going to have all these people coming out who are already excellent at coding.”
Alexis Fernquest, Data Scientist at the Office for National Statistics.

Mechanism
Integration of data skills into the school curriculum.

Model
Data Skills in Geography programme, Royal Geographical Society with the Institute of British Geographers
Recent changes in curricula in schools and at university, along with a recognised skills gap, have brought renewed emphasis on students being trained in data skills (the collection, analysis and presentation of data) in geography at GCSE, A Level and in undergraduate courses. Within Higher Education, geography has been recognised by HEFCE as a ‘part-STEM’ subject. The shift is presenting new challenges for many school teachers, particularly those with little prior experience of such skills. In response to these changes and challenges, the Royal Geographical Society (with IBG) is leading a programme, Data Skills in Geography, supported by funding from the Nuffield Foundation. RGS has also established new networks and strengthened existing ones, including creating a Data Champions scheme made up of teachers who are data enthusiasts with different levels of experience.

Mechanism
Apprenticeships to develop practical skills.

Model
The Office for National Statistics: Data Science Campus
The Office for National Statistics (ONS) is leading the way in developing a degree level apprenticeship in data science on behalf of the public and private sectors. The course is one of a number of new offerings at the new Data Science Campus36. The campus is based at the ONS headquarters in Newport and is part of a £17 million investment in statistics and data by the UK government to modernise and improve the statistics it produces. The funding was allocated after the Bean Review highlighted concerns around the use of administrative data37.

All the training programmes at the campus cover three levels: raising awareness, embedding core skills and developing expertise. The ONS set up the scheme because it felt that there were no suitable existing programmes available in the university sector. The need to develop its own courses was important to equip data scientists with better insights into the UK economy for policymakers. All the applicants to the campus come from very different and diverse backgrounds. Some are mathematicians, physicists, social scientists, and artists. School-leavers and career-changers are also welcome. The ONS wanted a diverse array of educational pathways so that everyone can fulfil their potential, realising that work-based learning has a key role in delivering that ambition.

36. Poluck M. 2017 ONS opens £10 million hub to exploit digital data. UK authority. 27 March 2017. See http://www.ukauthority.com/data4good/entry/7009/ons-opens-%C2%A310-million-hub-to-exploit-digital-data# (accessed 18 April 2018).
37. Chu B. 2015 Bean review tells ONS to be more “intellectually curious”. Independent. 3 December 2015. See https://www.independent.co.uk/news/business/news/bean-review-tells-ons-to-be-more-intellectually-curious-a6758226.html (accessed 18 April 2018).


Mechanism
Alternative degree pathways.

Model
Q-Step, Nuffield Foundation, ESRC and HEFCE (now Office for Students (OfS))
The Q-Step programme, launched in 2013/14, aims to address the quantitative skills gap in social sciences in the UK. With an initial six-year £19.5 million investment from the Nuffield Foundation, the ESRC and HEFCE, around 60 new academic staff were employed to use innovative approaches to embed quantitative methods and data analyses into social science teaching. There are currently 18 universities across the UK funded to deliver specialist undergraduate programmes, including new courses, work placements and pathways to postgraduate study.

Over 1,000 undergraduates a year start a Q-Step degree pathway, whilst almost ten times as many experience enhanced quantitative teaching by taking one or more Q-Step modules. Over 420 employers are involved with offering placements to more than 860 students, ranging from private/public to the third sector and government departments. The programme is having a notable and positive impact on the data skills of social scientists taking up postgraduate research studentships. Q-Step will continue to be funded by the Nuffield Foundation and the ESRC for two more years (2019/20 and 2020/21) and is undergoing an independent impact evaluation.

“Q-Step academics have enriched social science teaching by integrating data skills into courses. Most students are now handling data in R. It demonstrates that what we refer to as STEM skills don’t just come from taking STEM degrees.”
Dr Simon Gallacher, Head of Student Programmes at the Nuffield Foundation.

Mechanism
Outreach programmes.

Model
Code First: Girls
Code First: Girls supports young adult and working age women to develop further their personal and professional skills. Code First: Girls runs free coding courses; it also connects women to a community of other talented and like-minded women and companies who can support and accompany them through their professional development. Code First: Girls helps companies train their people, recruit new people, and develop their talent management policies and processes so they don’t miss out on female tech talent.

Chapter four

Area for action: Advancing professional skills and nurturing talent

Image © FS-Stock.

Data science is fast-moving and needs innovative mechanisms to develop advanced skills quickly. When people are in employment, there are a number of needs to be met in enabling them to keep their data science skills up to date and to ensure that they can work most effectively. These include:
• developing teams and a workplace culture that enables data scientists to make the best contribution in their sector;
• enabling individuals to keep their skills up to date whether they are data science specialists or people in a range of professions who increasingly need to work with data; and
• supporting career changes to meet individual, organisation and regional needs.

In the longer term, problem solving, resilience, and continuous learning will also be necessary to enable people to adapt to change, particularly as technology changes jobs and the opportunities to collect more and more data continue to grow.

There is a need to develop researchers and employees with good data engineering skills who have knowledge and experience of handling ‘big data,’ huge unstructured datasets, data sources and/or real-time data. Across all sectors there are massive, high volume and high velocity datasets that need to be stored, processed and analysed in real time, and this requires the creation and maintenance of infrastructure for big data. Typically this is work that would be undertaken by data engineers, but currently it is often done by data scientists because there are not enough data engineers to take on these roles.

“There are four main technical skills that a data scientist needs: statistics, machine learning, data and computer science. What is in short supply is having all four of them together, plus the soft skills.”
Milton Luaces, Senior Manager at Accenture – Applied Intelligence.

New technologies are changing the roles that data scientists are performing and this is having an impact on the depth and breadth of skills that are being sought after. New technologies are performing roles that previously required highly skilled employees, making people with more junior data qualifications increasingly effective.

However, the commercial sector could usefully do more to help train and develop employees with the right skillsets. Employers have a role in upskilling the workforce by training existing employees, particularly those at risk of losing their jobs through automation, and can work with universities to co-produce training. Higher education institutions with good industry links play a key role in developing appropriate professional training. By working in collaboration with employers they can potentially address regional skills gaps and productivity needs.

“Industry can play a slightly bigger role in being involved in the training of people, regardless of what age or stage they are at. At the moment, it is an employer and employee relationship. But I can see it being a lot more collaborative and I have come across institutions that are starting to recognise this. They are moving towards a mindset of upskilling people, exposing them to their company, giving them a grounding and capabilities to help solve the problems that particular institution faces. I was sponsored by a company during my PhD and I would encourage more institutions to do that.”
Ilya Zheludev, Chief Data Officer for Jasmine 22.


In the public sector, collaboration is already helping to upskill the workforce. At the Office for National Statistics there are joint PhD/MSc projects, or placements, for example the joint Turing-Campus PhD programme. The ONS is also involved with several of the Centres for Doctoral Training in big data, and runs CPD courses for analysts – typically short intensive courses linked to specific needs.

Beyond this, specialist skills are often developed through self-learning programmes. An increasing proportion of job adverts are calling for software skills such as Hadoop, Python and R. Informal learning and the open source movement are enabling people to develop these skills in non-traditional ways.

In 2017, Bernard Marr, writing for Forbes, put together a list of free online data science courses with links and endorsements from well-established institutions38. For example:
• Data Science Specialization (Johns Hopkins University and Coursera): One of the longest-established online data science courses;
• Data Science Essentials (EdX and Microsoft): Part of the Microsoft Professional Program Certificate in Data Science;
• Become a Data Scientist (Dataquest with endorsements from Uber, Amazon and Spotify): Independent online training provider offering three pathways (analyst, scientist and engineer); and
• Data Mining Course (KDNuggets): Well-known business and data science website that has compiled its own free data-mining syllabus.

“I started off with Tableau, storytelling and data visualisation. On the side, I was starting to learn R and Python for more of the deep dive work on the data.”
Kevin Koene, former Junior Data Scientist, The One campaign.

Whilst Massive Open Online Courses (MOOCs) suffer from high dropout rates and irregular provision, they also enable people to undertake independent study and upskilling alongside other more formal training programmes.

Other informal mechanisms include peer learning such as forums and meet-ups, competitions and clusters. One example is Kaggle, an education platform that also holds competitions39. The aim of meet-ups is to promote free, open dissemination of data science knowledge. They encourage data science peer-to-peer learning and sharing, and collaboration among data scientists and data start-ups. They also promote open source data science tools40.

Some meet-ups focus on underrepresented groups, such as the London-based ‘Inspiring Women in Data Science’, which has almost 1,000 members and runs several events throughout the year41. In addition, there are other groups that focus on specialisms within data science including Women in ML, Black in AI, Queer in AI, LatinX in AI, the AI Club for Gender Minorities, RLadies and PyLadies, an international mentorship group for women who code in Python with chapters in London, Edinburgh and Dublin.

Community knowledge-sharing events such as panel/roundtable events on specific hot topics also bring together data scientists to discuss, share and take advantage of community knowledge.

38. Marr B. 2017 The 9 Best Free Online Big Data And Data Science Courses. Forbes. 6 June 2017. See https://www.forbes.com/sites/bernardmarr/2017/06/06/the-9-best-free-online-big-data-and-data-science-courses/#466bc8543cdb (accessed 15 April 2019).
39. Kaggle (no date) Kaggle is the place to do data science projects. See https://www.kaggle.com/ (accessed 15 April 2019).
40. Meetup (no date) Inspiring Women in Data Science. See https://www.meetup.com/IWDSuk/ (accessed 15 April 2019).
41. Sekinah T. 2017 Info on Women in Data groups. DataIQ. 6 December 2017. See https://www.dataiq.co.uk/blog/info-women-data-groups (accessed 15 April 2019).


Needs and recommended actions

Need: Developing skills in the workforce.
Through our workshops and interviews we explored the need to ensure that data science skills can be nurtured so that organisations can retain talented individuals and get the most out of their skills, and to enable them to keep those skills up to date in a fast-moving field.

“If you want a senior data scientist with more than 10 years’ experience, we are short on that and we will still be short for another three or four years.”
Milton Luaces, Senior Manager at Accenture – Applied Intelligence.

Recommended action
Engagement between universities and employers
Universities with good industry links play a key role in developing appropriate professional training. By working in collaboration with employers they can help address regional skills gaps and productivity needs.

Employers have a role in upskilling the workforce by training existing employees, particularly those at risk of losing their jobs through automation, and can work with universities to co-produce training. By working in collaboration with employers, universities can potentially address regional skills gaps and productivity needs. This could involve working across professional disciplines to understand the type and level of data science skills that will be needed by professionals in fields such as law, healthcare, and finance. The Royal Society’s Machine learning report highlighted the need for training informed users of machine learning techniques, for example42.

Recommended action
Offer nimble and responsive training opportunities
Data science is fast-moving and requires innovative ways to enable the development of advanced skills. To meet the growing demand for data scientists, universities need to be agile and responsive to offer new ways of upskilling.

This could potentially be achieved through MicroMasters, conversion courses and high-quality Massive Open Online Courses (MOOCs) for continued professional development.

In the short term, a strong and effective pipeline of practitioners is likely to be established through government support for advanced courses – namely Masters degrees – which those working across a range of sectors could use to develop data science skills at a high level. Encouragingly, the Office for AI, the British Computer Society and the Institute for Coding are already looking at improving the supply of skills through a new series of Masters courses, in consultation with The Alan Turing Institute. Options such as MOOCs should be considered as a vehicle for developing skills ranging from informed users through to expert data engineers.

42. Royal Society. 2017 Machine learning: the power and promise of computers that learn by example.
See https://royalsociety.org/~/media/policy/projects/machine-learning/publications/machine-learning-report.pdf
(accessed 15 April 2019).


Recommended action
Develop data science as a profession
Developing a professional framework for data scientists with shared codes of practice, including appropriate governance of data collection and use and ethics training, is an important short-term goal. In the longer term, professional bodies such as the British Computer Society and the Royal Statistical Society could work with employers and universities to identify the skills needed for data scientists and consider how to address accreditation to ensure that students and professionals can be confident in the quality of new courses.

Need: Creating the right research and working culture for data science.
Our interviews with data scientists also highlighted key aspects of workplace culture that can be addressed to ensure that we have the right skills where they are needed. These included the view that data scientists flourish when working together in teams. The following are recommended actions for ensuring that data scientists and others already in work can do the best work with data that they can.

Recommended action
Build diverse teams
Universities and the public sector in particular must work to create a culture that nurtures and retains data science talent, which can include building and supporting interdisciplinary data science teams.

Universities in particular could recognise data science roles, create appropriate job titles and support permanent roles and career progression for data scientists to create a collaborative, cross-disciplinary culture. There are many examples of good practice, such as Research Software Engineering Groups, which provide a home for research programmers who collaborate with researchers on multiple research projects43.

43. Research Software Engineers Association. 2019 Welcome to the UK Research Software Engineer Association. See https://rse.ac.uk/ (accessed 15 April 2019).


Models and mechanisms

To address these priority needs, mechanisms including enrichment and fellowship schemes, capability-building programmes and informal/peer-to-peer mechanisms such as online courses, meet-ups and forums can prove effective. The following models are illustrations of how they can work in practice.

“At the Office for National Statistics, they have a training programme and they were willing to develop me for the next role. They also encourage us to move around every two years to get experience working on different types of analysis. I noticed the investment in people and I feel proud of the work I do.”
Alexis Fernquest, Data Scientist at the Office for National Statistics.

Mechanism
Enrichment and fellowship schemes

Model
Alan Turing Institute, Enrichment Scheme and Data Study Groups
One of the major goals of the Alan Turing Institute is to train new generations of data science and artificial intelligence leaders with the necessary breadth and depth of technical and ethical skills to match the UK’s growing industrial and societal needs. The Enrichment Scheme offers students currently enrolled on a doctoral programme at a UK university the opportunity to join its research body. Doctoral students typically in their second or third year of study can undertake a 6, 9 or 12 month placement at the Institute’s headquarters in London. Joining a community of more than 400 senior academics, early career researchers and PhD students, enrichment students have the opportunity to boost their skills and experience, enrich their research and make new collaborations during their time at the Alan Turing Institute.

The Data Study Groups are five-day ‘collaborative hackathons’, which bring together organisations from industry, government and the third sector, with talented multi-disciplinary researchers from academia. At each event, several organisations bring their real-world problems to be tackled by small groups of highly talented, carefully selected researchers, with a diversity of thought. Researchers brainstorm and engineer data science solutions, presenting their work at the end of the week. Organisations get to quickly prototype possible solutions to their data science challenges, and researchers get an opportunity to put knowledge into practice and go beyond individual fields of research to solve real-world problems. Knowledge is exchanged among groups, and participants from both academia and the organisations posing challenges rapidly learn new skills during the week – from how to work in secure analysis environments to learning new data science methods and techniques, and tools for doing data science collaboratively in groups.

Mechanism
Graduate development programmes.

Model
Office for National Statistics advanced training
For the more advanced level training (eg PhD level), partnerships were made between the ONS, the Alan Turing Institute and several universities. The ONS has built relationships with many departments of the UK government which have set up Data Science ‘hubs’, allowing them to regularly exchange and communicate on the needs and skills needed. This should improve the flow of data science talent across the UK.


Model
Faculty Fellowship
The Faculty Fellowship (formerly ASI Data Science Fellowship) exists to ensure that the brightest academics get a chance to immerse themselves in working life, learn about artificial intelligence (AI) in business and help build the future of operational AI.

Since its founding in 2014, Faculty has trained and transitioned over 250 PhD STEM graduates into data scientist roles in industry. Taking place three times a year (January, May and September), the Fellowship is highly competitive and receives applications from 5 to 10% of the UK's physics, mathematics and engineering postgraduate research students. In part, this is because alumni go on to work for big names like Google, DeepMind, Facebook and Deliveroo.

Faculty believes that AI is the most important technology of our age, but that it is only valuable when applied in the real world – enhancing products, improving services, and saving lives. To apply AI, organisations need the right strategy, software and skills. This is why fellows are given the opportunity to build their AI skills while working on tangible, real-world problems during a six-week placement.

Model
Pivigo data science training
Pivigo is a data science marketplace and training company based in London. It helps organisations to innovate through data science by connecting them with its own community of data scientists. Its Science to Data Science programme trains and graduates some of the world’s top scientific PhD talent in data science, with three programmes each year. Pivigo runs a five-week programme at its London campus or online. It works with large multinationals, charities, SMEs, and start-ups to help learners gain practical experience with data science technologies and technical skills in a commercial environment. The scheme offers students the opportunity to boost their skills, grow their networks and work alongside researchers.

Mechanism
Capability-building programmes

Model
The Cross-Government Data Science Accelerator
The Data Science Accelerator is a 12-week skills-building programme that gives analysts and aspiring data scientists from across the public sector, including central and local government, the opportunity to develop their data science skills. Created in 2015, this award-winning programme has been recognised for its impact on increasing data capability across the Civil Service. More than 150 participants from across the country have delivered a variety of projects, many of which have made a substantial difference to their public sector organisations44.

44. Government Digital Service, Government Office for Science and Office for National Statistics. 2019. Introduction to the Data Science Accelerator programme. See https://www.gov.uk/government/publications/data-science-accelerator-programme/introduction-to-the-data-science-accelerator (accessed 15 April 2019).


Projects range from a ‘geospatial risk and impact’ tool designed to help local government strategically focus its service delivery to automating the process of categorising businesses using large sets of free text descriptions through machine learning algorithms45.

Throughout the programme, participants are paired with a mentor who coaches them through the data challenges – and opportunities – that their projects bring. The programme is open for application four times a year and is free. Participants dedicate at least one day per week, usually attending one of the Accelerator’s hubs in London, Manchester, Newcastle, Newport, Sheffield or Taunton. Examples of previous projects can be found in the project directory and the Data Science Accelerator blog46.

Model
Internal governmental initiatives
Some government departments have their own internal initiatives to upskill their workforces. For example, HMRC has a capability-building initiative that gives analysts and data scientists from across the public sector the opportunity to develop their data science skills. HMRC has also created a specific job role ‘Data Science Capability Building Manager’ to learn and disseminate applied data science knowledge. The collaborative work has been particularly beneficial for HMRC, which has borrowed methods from financial services, a sector that shares its objectives of managing customer relationships and mitigating non-compliance in a high-volume financial transaction environment. Its masterclass initiative has led to further collaborative opportunities for thought leadership in data science and to hold data science problem-based forums. The masterclass courses have been delivered by academics from Edinburgh and Imperial College London, and delegates have attended from many UK government departments as well as other tax authorities across Europe.

“The exchange of ideas of an applied nature via a dialogue between experienced academics and practitioners has been key to the success of the learning program.”
Shakeel Khan, Data Science Capability Building Manager at HMRC

Model
Prosperity Partnerships
Rolls-Royce is leading a large UKRI-EPSRC ‘Prosperity Partnership’ in Advanced Simulation and Modelling of Virtual Systems (ASiMoV), which is helping train the data scientists needed to process and analyse the petabytes of data generated by ultra-high fidelity exascale simulations. The programme combines leading-edge research with PhD training and the opportunity to work on business-led challenges. By combining several of their highly regarded University Technology Centres with other leading UK universities and SME companies, Rolls-Royce is developing the skills needed for the integration of data, computational and physical sciences across a wide range of application areas.

45. Roberts S. 2018 Data Team Wins Future Policy Network Award. Government Digital Service blog. 19 February 2018.
See https://gds.blog.gov.uk/2018/02/19/data-team-wins-future-policy-network-award/ (accessed 15 April 2019).
46. Government Digital Service, Government Office for Science and Office for National Statistics. 2019. Previous Data Science Accelerator projects. See https://www.gov.uk/government/publications/data-science-accelerator-programme/previous-data-science-accelerator-projects (accessed 15 April 2019).

Chapter five

Area for action: Enabling movement and sharing of talent

Image © baona.

To perform at their best, scientists and engineers need the right environment in which to work, one based on freedom to collaborate and access to different kinds of resource. Typically, industry and academia offer different experiences as working environments. Industry tends to offer access to funding and to large amounts of data in real time. Academia tends to offer freedom to explore and easy routes to cross-fertilisation with other subjects. The best research may well be done using both kinds of resource, and without forcing a long-term choice of sector upon the most talented people. Nowhere is this more urgent than in data science, which is an area of explosive growth and rapid change that is attracting talent from around the globe.

There are a number of ways to make moving between sectors a natural part of the data science career path. This is important on a variety of timescales, from a career move, through a secondment or internship, to a week-long study.

In particular, there is a need to address the problem of the ‘one-way door’ out of academia, which makes it difficult for researchers to return after spending time in industry or government. One enabler of this movement is the ability of researchers to keep publishing when they are outside of academia.

“I can see a transition from a lot of individual contributor work, especially in the ingesting of data (cleaning it, making it available to use in a downstream process, for example modelling), to distribution of labour by specialism, and some of that labour is computer driven and not human driven.”
Ilya Zheludev, Chief Data Officer for Jasmine22.

Universities and the public sector in particular need to work to create a culture that nurtures and retains data science talent. This can include building and supporting cross-disciplinary data science teams, potentially disrupting traditional organisational structures.

Big internet companies are adding to pressure on universities, which are already struggling to retain professors and other employees. For sectors facing particular challenges with retention, there is a need to offer more incentives and address common barriers. One factor impacting on retention is that UK universities are seeing more and more interest in their intellectual property from big tech companies47.

“Large institutions are slowly starting to understand how they can build IP and then retain it and monetise it for their own needs. If the UK government is interested in furthering the narrative of commercialisation of IP, it is natural progression that individuals working in academia could be encouraged or could wish to participate actively in commercialisation of knowledge. I have definitely seen a trend where young people are interested in going into academia and being part of a university that allows them to retain IP of anything they uncover themselves. That builds frameworks around allowing those individuals to turn that into commercial operations. I think that is a great step forward.”
Ilya Zheludev, Chief Data Officer for Jasmine 22.

47. Carey S. 2016 How UK universities are leveraging their intellectual property to benefit from the tech boom. Techworld. 8 November 2016. See https://www.techworld.com/data/how-uk-universities-are-leveraging-their-ip-benefit-from-tech-boom-3648861/ (accessed 15 April 2019).


Needs and recommended actions

Need: Enable movement through braided careers
Supporting different ways of building a career is important as data science develops and is valuable across disciplines – as the Royal Society’s work on research culture shows48. When data scientists are in high demand, enabling them to work across sectors and roles is valuable.

“There is definitely space for the best researchers from academia in industry. If you can take a leave of absence, keeping your level in academia, to spend time in industry, you could enrich your own research with knowledge in the field. Similarly, if you are in a really innovative sector and you see that for disrupting innovation you do not have all you need in your current role, you could benefit from the cooperation, input and autonomy of academia.”
Milton Luaces, Senior Manager at Accenture – Applied Intelligence.

Recommended action
Create and fund joint positions across academia and industry
Funding bodies such as UKRI could support positions for joint appointments for the UK’s most talented researchers, who would be strongly in favour of a joint approach, so that a pool of excellence can be fostered at the interface of academia, industry and government. Universities and funders should give urgent attention to enhancing mechanisms to accommodate outstanding industrial research leaders in machine learning within the academic sector. This academic leadership is critical to inspiring and training the next generation of research leaders.

“One of the things that I love about being in the UK is that collaboration is very important. It is a smaller environment with a lot of smart people.”
Kerem Sozugecer, Chief Technology Officer and co-founder of DeepZen.

Need: Recognising diverse outputs
Recognising diverse outputs is important to support these braided careers. This means that work done in universities can be valued by industry and vice versa.

Recommended action
Commercialise research
The ways that universities encourage and support researchers in commercialising research and building spin-outs can influence researchers’ abilities to hold joint appointments between industry and academia. Universities may wish to consider their strategies for research commercialisation and policies on intellectual property in order to build an environment that supports cross-sector roles more freely.

48. Royal Society. (no date) Research culture. See https://royalsociety.org/topics-policy/projects/research-culture/ (accessed 15 April 2019).


Recommended action
Recognise diverse research outputs
Government departments and industry are likely to benefit when they enable data scientists in research roles to publish their work wherever possible; conversely, universities need to recognise the value of data science experience outside universities, developed in the private and public sectors. Alternative outputs could be recognised on academic CVs. Changes to the Research Excellence Framework that focus on institutions rather than individuals could allow universities to better recognise the contribution of data science to broader research output.

Need: Establishing a coherent approach to data policy
Skills needs should be linked to the government’s data strategy to ensure a joined-up approach.

Recommended action
Make skills a core part of the National Data Strategy
Responsibility for data policy is distributed across DCMS, GDS, Cabinet Office and DfE, but DCMS leads on delivering the National Data Strategy. This Strategy should enable departments to work closely together on data skills, building a coherent approach to delivering a healthy data science skills landscape. This will be important for the wider adoption of artificial intelligence.


Models and mechanisms

There are a number of ways to make moving
between sectors a natural part of the data
science career path. Some mechanisms
exist to foster long-term collaborative and
business engagement networks and close
interdisciplinary links. These include centres
for doctoral training, industry fellowships, data
residencies and innovative approaches to
commercialisation. Here are some examples
of existing models which highlight how the
mechanisms are working in practice.

“Close university and business
collaboration resulted in more
capability building, expansion of R&D
activities of major industry – bringing
new jobs and getting projects live faster.”
Professor Peter Buneman FRS,
University of Edinburgh

Mechanism
Centres for doctoral training

Model
Centre for Doctoral Training in Data Science
and AI, The University of Edinburgh, and
beyond
In 2013, the Engineering and Physical
Sciences Research Council (EPSRC) invested
£500 million in 115 Centres for Doctoral
Training (CDTs), matched by more than
£450 million from business, universities
and other stakeholders. The University of
Edinburgh has hosted a CDT in Data Science
since 2014.

The CDT has two types of studentships
including one with a PhD project in
collaboration with an industry partner. For
both courses the first year provides Masters
level training in the core areas of data
science along with a significant project. In
years 2 – 4, students carry out PhD research
in data science. In 2017, the UK’s Science and
Technology Facilities Council announced £10
million to train the next generation through
supporting eight new CDTs in data intensive
science. The centres include industrial
partners and will offer comprehensive training
in data intensive science through cutting
edge research projects and a targeted
academic training programme.

Model
DISCnet
The Data Intensive Science Centre is an
STFC Centre for Doctoral Training providing
a platform to train a new generation of data
intensive scientists. The innovative education,
training and research is delivered by a
consortium of five universities from the South
East Physics Network (SEPnet). The centre
trains postgraduate students via world-
leading research projects in particle physics
and astrophysics and explores the untapped
potential of these big data skills in diverse
applications across a spectrum of industries.

DISCnet currently has 70 non-academic
partners.

Mechanism
Industry fellowships

Model
EPSRC Research Software Engineer
Fellowships
The Research Software Engineering
Fellowship is awarded to exceptional
individuals in the software field, who
demonstrate leadership and have combined
expertise in programming and a solid
knowledge of the research environment. As
well as having expertise in computational
software development and engineering,
RSE Fellows should be ambassadors for the
research software community and have the
potential to be a future research leader in the
RSE community.


Model
UKRI Future Leaders Fellowships
The Future Leaders Fellowship (FLF) is UKRI’s
flagship talent scheme, which aims to develop
the next generation of research and innovation
leaders in the UK. It will recruit and retain
rising stars by attracting the brightest and
best from at home and across the world. The
FLF scheme will provide long-term funding
for each Fellow (up to £1.2 million over an
initial four years, with an option to extend to
seven), allowing them to tackle difficult and
novel challenges. A total of 550 FLFs will be
awarded from 2019/20 to 2021/22 across
six separate rounds, marking a significant
investment to grow the UK’s research and
innovation base. Although the FLF scheme is
not prescriptive of the research and innovation
areas it supports, and Fellowships will be
awarded on a competitive basis, there are
several features that should make it attractive
to those working in data science and AI.
Whereas most existing Fellowship schemes
fund only academic researchers, FLF also
supports individuals in industry, as well as
those working at the interface of academia,
industry and the public sector, encouraging
a new paradigm in career path that is mobile
across all three. Operating across the breadth
of UKRI will allow Fellows to take the most
cross-cutting and interdisciplinary approaches
to research and innovation. The open remit
of the call allows for Fellowships to be held
across a spectrum – from those with a
background in AI wishing to apply their skills
to a wide range of disciplines and challenges,
to those who are from different disciplinary
backgrounds, where AI could make a
transformational contribution to that discipline
or where that discipline could be brought to
bear on the development of AI.

Model
Royal Society Entrepreneur in Residence
scheme
The Royal Society Entrepreneur in Residence
(EiR) scheme aims to increase the knowledge
and awareness in UK universities of
cutting-edge industrial science, research
and innovation49. The scheme provides
opportunities for outstanding industrial
scientists and entrepreneurs to spend time
working in a university to expose university
staff and students to state-of-the-art industrial
research and development, and the scientific
challenges faced by industry. The scheme
also allows universities to gain from expert
advice aimed at promoting innovation and
the translation of research by universities,
and build confidence and understanding of
business and entrepreneurship among staff
and students.

Model
Royal Society Industry Fellowship scheme
The Royal Society Industry Fellowship is
a paid secondment scheme for academic
scientists who want to work on a collaborative
project with industry and for scientists in
industry who want to work on a collaborative
project with an academic organisation50.
Providing a basic salary for the researcher
and a contribution towards research costs,
the Fellowship aims to enhance knowledge
transfer in science and technology between
those in industry and those in academia in
the UK. The scheme supports researcher-
mobility and has run for over 30 years,
bridging industry and academia for hundreds
of scientists.

49. Royal Society. (no date) Entrepreneur in Residence. See https://royalsociety.org/grants-schemes-awards/grants/entrepreneur-in-residence/ (accessed 15 April 2019).
50. Royal Society. (no date) Industry Fellowships. See https://royalsociety.org/grants-schemes-awards/grants/industry-fellowship/ (accessed 15 April 2019).


Model
Royal Commission for the Exhibition of 1851
The Royal Commission for the Exhibition of
1851 awards three-year research fellowships
to early career postdoc scientists or
engineers of exceptional promise51. The
Fellowship, which was founded in 1891 and
has initiated the careers of thirteen Nobel
laureates to date, is open to all nationalities
and fields of science, including physical or
biological sciences, mathematics, applied
science, and any branch of engineering.
The Commission also awards Industrial
Fellowships to encourage industry –
academia collaboration at doctoral research
level, Industrial Design Studentships for
postgraduates and, in partnership with the
Royal Academy of Engineering, graduate
Enterprise Fellowships for entrepreneurs.

Mechanism
Residencies

Model
Microsoft AI Residency program
(US/UK)
The Microsoft AI Residency program is a
12-month role designed to advance a career
in machine learning research and engineering.
The goal of the AI Residency is to help
residents become creative and productive
AI researchers, scientists and engineers.
Residencies are open to BSc, MSc, and PhD
graduates with substantial coursework in, but
not limited to: computer science, electrical
engineering, data science, mathematics,
physics, economics, human – computer
interaction, and computational biology.

Model
Uber AI Residency (US)
Established in 2018, the Uber AI Residency
is a 12-month training programme for recent
college and Master’s graduates, professionals
who are looking to reinforce their AI skills,
and those with quantitative skills and interest
in becoming an AI researcher at Uber AI Labs
or Uber Advanced Technologies Group (ATG).
Uber AI Residents have the opportunity to
pursue interests across academic and applied
research. Uber is committed to an open and
inclusive research mission that benefits the
community at large, including contributing
papers to top conferences and taking part in
open-source projects.

Mechanism
Approaches to commercialisation

Model
IP free zone, Department of
Computer Science and Technology,
the University of Cambridge and beyond
The IP free zone at the University of
Cambridge is part of a more general
framework set up by the former Head of
Department of Computer Science and
Technology, Professor Andy Hopper FRS. The
strategy has been to minimise barriers to the
formation of new companies while aligning
incentives for staff and students, avoiding IP
issues, providing mentoring, being helpful in
every possible way, and not picking winners.
Furthermore, this has been a cradle-to-
grave approach ranging from undergraduate
lectures to the maintenance of an industrial
business club beyond the department. A
total of 270 companies have been formed by
staff and students (including Raspberry Pi),
of which 50% are active with revenues of $1
billion, and 18% sold for over $40 billion.

51. Royal Commission for the Exhibition of 1851 (no date). Our awards. See https://www.royalcommission1851.org/
(accessed 15 April 2019).


In the US, Carnegie Mellon and the University
of Washington are currently working on a
set of recommendations for commercial
companies meant to provide a way for
universities and companies to share talent
more equally.

Model
UKRI Impact Acceleration Funding
Impact Acceleration Accounts (IAAs) are
strategic awards provided to research
organisations to support knowledge
exchange and accelerate the impact of
research. IAAs allow organisations to
respond in more flexible, responsive and
creative ways appropriate to their strategic
priorities, enabling impact to be achieved in
an effective and timely manner, for example,
through secondments and exchanges, user
engagement, proof of concept, and by
building capacity for work across disciplines.

Chapter six

Area for action:
Widening access to data
in a well-governed way

Image © CasarsaGuru.

Access to good data can ensure that data
scientists get necessary experience with ‘real
world’ problems that is so important in data
science. But more importantly, this will enable
the use of data science skills for public and
commercial benefit. However, it is important
that this is done in an ethical and well-
governed way.

Previous work by the Royal Society and the
British Academy in Data Management and
Use: Governance for the 21st Century, argues
for a principled approach to the governance
of data use. Central to this is that we should
govern the use of data in such a way that
makes human flourishing central, with data
used to benefit people and communities. Data
scientists should aim to open up data for social
good and do so in a well-governed way.

Ways to create a professional ethos for data
scientists, to share good practice in use of
data and to agree shared codes for ethical
collection and use of data will be important
areas to explore further to ensure that data
can be used to procure diverse benefits for
society. Professional bodies such as the Royal
Statistical Society, with its data science section,
and the British Computer Society are already
exploring options for accreditation of data
science courses to demonstrate the quality
of course content. It will also be valuable to
explore the development of shared codes of
conduct or practice for data science to raise
awareness of the need for an ethical and
well-governed approach to data collection and
use and to ensure that a concern with ethics
is appropriately reflected in course content.
Such professional bodies – working with the
Centre for Data Ethics and Innovation, the Ada
Lovelace Institute, the Open Data Institute
and the many other groups promoting ethical
use of data – can help explore and establish
a positive professional ethos and ethics for
data science.

“The more sensitive the industries are,
the more difficult the partnership with
academia. There are sectors that are
quite sensitive on data: for instance,
financial services, pharmaceuticals
and government. One challenge
is that usually they want people
working on premises, which
is difficult for academics.”
Milton Luaces, Senior Manager at
Accenture – Applied Intelligence.

“Making [data] available between academia
and industry is really hard, because a lot of
the time data is really sensitive, and it can
also have personal information in there.
Mechanisms that enable that sharing, in a
legal, compliant way are really important.
How can someone push the boundaries of
knowledge in a domain if they cannot work
from real data in that domain? I would say
that governments can have quite a lot of
involvement here. It is about collaboration
and being able to work problem statements
that are based on proprietary data.”
Ilya Zheludev, Chief Data Officer for
Jasmine 22.


Needs and recommended actions

Need: Opening up data and providing
secure access.
Getting the best value from data for the
widest range of organisations means
opening data in a secure and well-governed
way, so that societal benefit can be
accessed most easily.

Recommended action
Encourage data sharing
where possible
Greater transparency of private sector data
could help build public trust in the use of data
and how their data is used for decision-making
purposes. The public sector could usefully
consider how to widen access to its data,
including sharing data and data challenges with
researchers. Journal editors should normally
ensure that data are being made available to
other researchers in their original form, or via
appropriate summary statistics where sensitive
personal information is involved.

Standardisation of open data can enable the
sharing of data and challenges, providing an
amenable data environment for researchers.
However, the majority of data held by
organisations is not open, and unlikely to be
so due to personal disclosure and commercial
restrictions. In areas where there are datasets
unsuitable for general release, further progress
in supporting access to public sector data
could be driven by creating policy frameworks
or agreements which make data available to
specific users under clear and binding legal
constraints to safeguard their use, and set out
acceptable uses. Government should further
consider the form and function of such new
models of data sharing.

The public sector needs to work out how to
widen access to give university researchers
access to better public data. Continuing to
ensure that data generated by charity-funded
and publicly funded research are open by
default will be critical in supporting wider uses
of research data. Journals should normally
insist, as a condition of publication, on data
being made available to other researchers in
their original form, or via appropriate summary
statistics where personal information is
involved. Of course such a policy has
additional potential benefits of enhancing
reproducibility in research and increasing
transparency of decision making, in both the
public and private sectors.

The Royal Society has recently published
the report, Protecting privacy in practice,
which sets out how use of government data
could be enabled by Privacy Enhancing
Technologies (PETs).

Recommended action
Donate data science talent
There is value in enabling data scientists to
donate their time to applying data science
to societal challenges. For example, through
pro bono project work along the lines of
DataKind UK, RSS Statisticians for Society
and hackathons.

“We have amazing data in the NHS and
that is definitely a resource that is
worth staying for.”
Dr Amy Nelson, Senior Research Associate
at UCL Institute of Neurology and
a junior doctor.


Need: Providing the computing power for
use by the growing data science community.
As the data science community grows there
will be a need for greater access to high
power computing, and to GPUs for artificial
intelligence and machine learning, so that
data scientists can realise their potential.

Recommended action
Provide access to computing
power
Improving the UK’s computing research
infrastructure will better enable data scientists
to access the necessary computing power
to release the value from data and address
research challenges, and will be vital for the
UK to remain competitive with other countries
such as the US and China. BEIS and UKRI
could usefully consider the need for continuing
to improve access for data scientists working
across all disciplines to high-power computing,
and this could helpfully be included as part of
the UKRI Infrastructure Roadmap.


Models and mechanisms

Mechanisms such as collaborative events and
partnerships, data stores and APIs, offices of
data analytics and data centres/institutes are
important ways to bring data scientists and
data together and the models below show
how this can work, even for organisations
that do not have the resources to hire data
scientists themselves.

Mechanism
Collaborative events and partnerships

Model
Charity DataDives, DataKind UK
A hackathon brings together a range of
people to generate an outcome, usually
software projects. They can be associated
with computer programmers, software
developers, data scientists and, often,
subject-matter-experts. The charity DataKind
supports charities and social enterprises large
and small across a variety of issue areas. It
runs hackathon events called ‘DataDives’
where charities and social enterprises work
alongside teams of volunteer data scientists,
analysts, developers and designers using
data to gain insight into their programmes
and to increase their impact.

Hackathon style events can stimulate
more engagement between the academic
community and social projects as there are
lots of skills within universities that are both
expensive and in short supply within the
third sector.

DataKind also run longer-term engagements
over 6 – 9 months to build a data science
solution (DataCorps) and have monthly office
hours which any non-profit or social change
organisation can sign up to for advice.

Model
Royal Statistical Society – Statisticians
for Society
The Statisticians for Society initiative was
launched in 2014, to help statisticians offer
their skills to charities and other socially
useful initiatives that need their professional
expertise52. Many third sector organisations
are keen to explore the use of data for
decision making and service improvements.
There is a growing need for them to provide
evidence of their impact, but due to lack of
capacity and appropriate skills needed for
data analysis, some charities are unable to
fully demonstrate the value of their work.
As one of the leading voices for promoting
the importance of data and evidence, the
Royal Statistical Society supports statisticians
in helping charities in making a difference.
Volunteers can provide the tools and
guidance for undertaking data analysis.
Statisticians collect, analyse and interpret
data across a wide range of industries and
topics; they are skilled at designing methods
for collecting data and regularly tasked with
analysing data to spot patterns and trends;
and they can manipulate data to identify
relationships and make future predictions.
Following this, they produce reports and
summaries that communicate their findings.

52. Royal Statistical Society. 2019 Statisticians for Society. See http://www.rss.org.uk/RSS/Get_involved/Statisticians_for_Society/RSS/Get_involved/Statisticians_for_Society.aspx?hkey=c7977c58-1558-495a-9e5a-e99d64ea9cfd (accessed 15 April 2019).


Mechanism
Data stores

Model
The London Datastore
The London Datastore is a free and open data-
sharing portal where anyone can access data
relating to the capital53. It is one of the Greater
London Authority’s (GLA) flagship projects
and is a platform through which many of the
Smart London Plan objectives are delivered.
Researchers are encouraged to visualise or
build apps from the data available on the site.

Model
ONS API
The Office for National Statistics API
makes datasets and other data available
programmatically, allowing researchers to
filter datasets and directly access specific
data points.

Model
UK Data Service
The UK Data Service is the UK’s only
nationally funded research infrastructure
for the curation and provision of access
to social science data and its practices,
especially around secure access to data and
data curation, have been influential across
the world. Funded by the Economic and
Social Research Council (ESRC) to meet the
data needs of researchers, students and
teachers from all sectors, its unique collection
of social science data resources includes
major UK government-sponsored surveys,
cross-national surveys, longitudinal studies,
UK census data, international aggregate,
business data, and qualitative data. It brings
together several important past investments
including the Economic and Social Data
Service, Question Bank, Qualidata and
Census Programme.

Climate change, ageing, security threats,
the provision of better public services and a
more productive society all call for policies
and solutions that are informed by evidence-
based research and innovation. Access and
training services through the UK Data Service
enable impactful research to influence national
and regional policy and develop research
excellence across all sectors. Its excellence
in collecting, storing, analysing and sharing
collections of complex data has helped it build
a trusted reputation and become a critical part
of the UK’s research infrastructure.
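
To illustrate what filtering a dataset programmatically, as the ONS API model describes, might look like, here is a minimal sketch in Python. It only builds a request URL and does not contact any service. The base URL, the path layout, the dataset identifier and the dimension names and codes are all assumptions made for illustration and are not taken from this report; the ONS developer documentation should be checked before relying on any of them.

```python
from urllib.parse import urlencode

# Assumed base URL of the ONS API; not given in the report.
ONS_API_BASE = "https://api.beta.ons.gov.uk/v1"


def build_observations_url(dataset_id, edition, version, dimensions):
    """Build a URL asking for observations filtered by dimension values.

    The path layout and dimension names are assumptions for illustration.
    A value of "*" is treated here as a wildcard for one dimension.
    """
    path = (
        f"{ONS_API_BASE}/datasets/{dataset_id}"
        f"/editions/{edition}/versions/{version}/observations"
    )
    # Keep "*" unescaped so a wildcard dimension stays readable in the URL.
    query = urlencode(dimensions, safe="*")
    return f"{path}?{query}"


# Hypothetical dataset identifier and dimension codes.
example_url = build_observations_url(
    "cpih01",
    "time-series",
    6,
    {"time": "*", "aggregate": "cpih1dim1A0", "geography": "K02000001"},
)
print(example_url)
```

Fixing every dimension except one is what lets a researcher pull a single series, or a single data point, rather than downloading a whole dataset.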

53. London Datastore 2019. Welcome to the Datastore. See https://data.london.gov.uk/ (accessed 15 April 2019).


BOX 2
Open Data Skills Framework
(The ODI)

Open data is a relatively new field. Its
potential is being realised increasingly as
it is slowly integrated into the strategies of
organisations. Those working with open
data or on open data initiatives often
have to learn the skills as they go. There
is no clear language to describe the
knowledge and skills of those working
with open data; nothing with which to
benchmark a single individual’s expertise,
and point to where they are in their
learning journey.

This is why the Open Data Institute has
created the open data skills framework,
a simple, three-tier framework that
describes the knowledge and skills of
anyone interacting with open data, from
beginner through to expert level56.

The skills framework enables learners to
identify where they are in their learning
journey. Everybody starts their journey as
an ‘explorer’. They may then wish to focus
on either strategic or practical skills, or
both. Ideally, every learner will eventually
apply their learning back to their sector to
help solve sector-specific challenges, and
drive change in their domain.

Mechanism
Data stores

Model
Transport for London API
Transport for London (TfL) is the local public body
responsible for public transport in London54. Every
year, TfL ensures the transportation of 1.37 billion
people, with a network length of 402 km, which
is equivalent to 83.6 million km travelled per
year55. Over the past ten years TfL has made a
significant amount of data accessible to the public
free of charge, including timetables, service status
and disruption information. This has allowed
the market to develop exponentially with the
introduction of new products and services. TfL is
now considered a leader in publishing open
data through APIs, the Cloud, the internet and
across its physical network. It has created over
700 jobs and brought £14 million per year in GVA,
enabling development of the UK’s skills in data.

Mechanism
Offices of Data Analytics

Model
Nesta’s programme of Offices
of Data Analytics
The Offices of Data Analytics (ODA) programme
helps cities and regions join up, analyse and act
upon data from multiple sources to reform public
services57. As of December 2018, there were nine
initiatives that classify as Offices of Data Analytics
across the UK. The model allows multiple
organisations to join up, analyse and act upon
data sourced from multiple public sector bodies
to improve services and make better decisions.
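
As a small illustration of how openly published service status data, such as that described in the Transport for London API model, can be turned into a simple summary, here is a hedged sketch in Python. The endpoint URL and the field names (`name`, `lineStatuses`, `statusSeverityDescription`) are assumptions about the response shape rather than details from this report, and the sample payload is hand-written so the parsing step can be tried without calling the live service.

```python
import json
from urllib.request import urlopen

# Assumed endpoint for tube line status; not taken from the report.
TFL_TUBE_STATUS_URL = "https://api.tfl.gov.uk/Line/Mode/tube/Status"


def fetch_line_status(url=TFL_TUBE_STATUS_URL):
    """Download the raw JSON payload (requires network access)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def summarise_status(lines):
    """Map each line name to its list of status descriptions."""
    return {
        line["name"]: [
            status["statusSeverityDescription"]
            for status in line.get("lineStatuses", [])
        ]
        for line in lines
    }


# Hand-written sample in the assumed response shape (not real data).
SAMPLE_PAYLOAD = [
    {
        "name": "Victoria",
        "lineStatuses": [{"statusSeverityDescription": "Good Service"}],
    },
    {
        "name": "Central",
        "lineStatuses": [{"statusSeverityDescription": "Minor Delays"}],
    },
]

summary = summarise_status(SAMPLE_PAYLOAD)
print(summary)
```

In practice, `summarise_status(fetch_line_status())` would apply the same parsing to a live response, assuming the endpoint and fields match the sketch.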

54. Transport for London 2019. Facts & figures. See https://tfl.gov.uk/corporate/about-tfl/what-we-do/london-underground/facts-and-figures (accessed 15 April 2019).
55. Deloitte 2017. Assessing the value of TfL’s open data and digital partnerships. See http://content.tfl.gov.uk/deloitte-report-tfl-open-data.pdf (accessed 15 April 2019).
56. Open Data Institute. 2016 Open Data Skills Framework. See https://theodi.org/article/open-data-skills-framework/ (accessed 15 April 2019).
57. Eaton M and Bertocin C 2018. State of Offices of Data Analytics (ODA) in the UK. See: https://media.nesta.org.uk/documents/State_of_Offices_of_Data_Analytics_ODA_in_the_UK_WEB_v5.pdf (accessed 15 April 2019).


Mechanism
Data Centres / Institutes

Model
Health Data Research UK (HDR UK)
Health Data Research UK is uniting the
UK’s health data to make discoveries that
improve people’s lives. By bringing together
the sharpest scientific minds, and providing
safe and secure access to rich health data,
it aims to better understand diseases and
discover new ways to prevent, treat and cure
them58. Its vision is for large-scale data and
advanced analytics to benefit every patient
interaction, clinical trial and biomedical discovery,
and to improve public health. To achieve this,
HDR UK is leading an ambitious training and
talent programme, and will create a cohort
and network of thousands of health data
scientists spanning all career stages, from
school-leaver to senior research manager
and international opinion leaders. The UK
has a rich and diverse scientific talent base,
thanks to the strength of the NHS, its academic
institutions and innovative scientific and digital
industries. HDR UK plans to harness this, and to
bring on board international peers, to create an
intelligent cohort of health data scientists that
will dramatically change medical research,
and open up new, faster, smarter pathways to
patient care.

Model
National Innovation Centre for Data (NICD)
The National Innovation Centre for Data
(NICD) is a unique new facility that delivers
data analytics skills into industry and the
public sector by exploiting the knowledge and
expertise currently locked within universities.
A flexible rolling programme of collaborative
projects focused on organisations’ specific
challenges and opportunities will transfer
practical data skills into the workforce of
those organisations. These projects will be
supported by a range of related activities,
including awareness-raising events, themed
business and technical seminars and technical
training courses. As a result of engagement
with the Centre, organisations will be able to
increase their productivity by optimising their
existing operations, and to grow by launching
new data-driven products and services.

58. Medical Research Council. 2019. Health Data Research Institute. See https://mrc.ukri.org/about/institutes-units-centres/uk-institute-for-health-and-biomedical-informatics-research/ (accessed 15 April 2019).

Conclusion

Image © PeopleImages.

Data professionals are in high demand from
employers. Over the last five-and-a-half years,
there has been a sharp rise in UK job-listings
for ‘Data Scientists and Advanced Analysts’
(+231%) driven predominantly by increased
numbers of vacancies for Data Scientists
(+1287%) and Data Engineers (+452%).

This report has focused on Data Scientists and
Advanced Analysts at the top end of analytical
rigour because this is where demand has
grown the most. However, the data shows
interesting results for data professionals across
the spectrum. Moreover, our findings are likely
to underestimate the demand for data skills as
many jobs are not advertised online. Further
analysis is needed to quantify the number of
employed workers per opening.

There is a clear need for collaborative,
sustainable mechanisms to develop data talent
in academia, and the charity, private and public
sectors, and to allow data scientists to move
across these sectors. This report promotes
a vision for the sharing of data science
talent across all sectors. By identifying four
major areas of action with recommendations
for addressing priority needs across the
data science talent pipeline, from school to
advanced professionals, we are hopeful that
we can achieve our vision for the UK to be a
leading data science research nation with a
sustainable flow of expertise and a healthy
data science skills landscape.

This report also sets out a wide variety of
existing models and mechanisms that could
be used more widely, from fellowship schemes
to data stores, that represent good practice or
innovation in supporting data science career
development and mobility. The report features
several contributions from data scientists from
a range of backgrounds, organisations and
roles sharing their career experiences: these
are also available as a separate publication,
Dynamics of Data Scientists: what data
professionals say about data science.
across these sectors. This report promotes

Appendices


Appendix 1:

Acknowledgements
Thank you to Will Markow and Jonathan Coutinho at Burning Glass Technologies for
providing the labour market data and advice with the analysis, the project steering group
for providing advice and guidance, the reviewers and all of those involved in the workshops
and case study interviews.

Working Group members


The members of the Working Group involved in this report are listed below. Members acted in
an individual and not a representative capacity, and declared any potential conflicts of interest.
Members contributed to the project on the basis of their own expertise and good judgement.

Professor Andrew Blake FREng FRS, AI consultant and Chairman of Samsung’s AI Research
Centre in Cambridge UK (Chair)

Professor Peter Buneman MBE, FRSE, FRS, Professor of Database Systems, School
of Informatics, University of Edinburgh

Sherry Coutu CBE, serial entrepreneur, former CEO, angel investor and non-executive director
Matthew Fryer, VP, Chief Data Science Officer, Hotels.com
Dr Robert Hercock, Chief Research Scientist in the British Telecommunications Security
Research Practice
Professor Marta Kwiatkowska FRS, Department of Computer Science, Oxford University
and Professorial Fellow of Trinity College
Professor Emma McCoy FRSS FIMA, Vice-Dean (Education), Faculty of Natural Science,
Imperial College London
Dr Tom Smith, Managing Director of the Data Science Campus, Office for National Statistics


Royal Society staff

Royal Society Secretariat


Dr Natasha McCarthy, Head of Policy (Data)

Jennifer Panting, Policy Adviser and Project Lead

Staff from across the Royal Society contributed to the development of the project
Dr Frances Bird, Policy Adviser

Connie Burdge, Project Coordinator


Dr Claire Craig CBE, Chief Science Policy Officer
Dr Frank Fourniol, Policy Adviser
Omar Jamshed, Press Officer
David Montagu, Policy Adviser
Karen Newman, Design Manager
Selina Patel, Policy Intern
Dr Mahlet Zimeta, Senior Policy Adviser

Previous Royal Society staff who contributed to the development of the project
Lara Gardellini, Programme Coordinator

Louise Pakseresht, Senior Policy Adviser


Reviewers
This report has been reviewed by an independent panel of experts. The Review Panel
members were not asked to endorse the conclusions or recommendations of the report, but
to act as independent referees of its technical content and presentation. The Royal Society
gratefully acknowledges the contribution of the reviewers.

Professor Dame Wendy Hall, DBE, FREng, FRS, Regius Professor of Computer Science at the
University of Southampton, UK
Professor Frank Kelly CBE, FRS, Professor of the Mathematics of Systems at the Statistical
Laboratory, University of Cambridge, UK
Professor Tom McLeish FRSC, FRS, Professor of Natural Philosophy, University of York
Dame Jil Matheson DCB FAcSS, former National Statistician of the United Kingdom
Hetan Shah, Executive Director of the Royal Statistical Society
Sir Bernard Silverman FRS, former Chief Scientific Adviser at the Home Office

Expert reader
Dr Jonathan Shaw, Technical Specialist in the Economics Department at the Financial
Conduct Authority and Research Fellow at the Institute for Fiscal Studies


Interview and case study participants


The Royal Society gratefully acknowledges the contribution of the case study participants.

Aimee, Government Communications Headquarters (GCHQ)
Professor Graham Cormode, Professor in Computer Science at the University of Warwick
Alexis Fernquest, Data Scientist at the Office for National Statistics (ONS), and former Data
Analytics Apprentice
Dr James Hetherington, Director of Research Engineering at the Alan Turing Institute
Professor Frank Kelly FRS, Emeritus Professor of the Mathematics of Systems at the
University of Cambridge
Kevin Koene, former Junior Data Scientist at The One Campaign
Milton Luaces, Senior Manager at Accenture – Applied Intelligence
Nick Manton, Head of Data Science at the Government Digital Service, and cross-government
Head of Community for Data Scientists within the DDaT (Digital, Data and Technology)
profession
Dr Amy Nelson, Senior Research Associate at UCL Institute of Neurology and a junior doctor
Chanuki Illushka Seresinhe, Lead Data Scientist at Popsa, a start-up using AI to automatically
curate photo content into designed physical products
Dr Tom Smith, Director of the Data Science Campus at the Office for National Statistics
Kerem Sozugecer, Chief Technology Officer and co-founder of DeepZen, a start-up which
specialises in creating emotional and expressive human voice with AI
Dr Damon Wischik, Lecturer in the University of Cambridge's Computer Laboratory and
former recipient of a University Research Fellowship from the Royal Society
Dr Maria Wolters, Reader in Design Informatics at the University of Edinburgh
Ilya Zheludev, Chief Data Officer for Jasmine22, a new technology venture being built by
HSBC based in Hong Kong


Workshop participants
The Royal Society would like to thank all those who contributed to the development of this
project through submission of evidence and attendance at the following workshops.

March 2018: What’s different about data science? scoping roundtable


This roundtable asked ‘what’s different about data science?’ to gather evidence on attracting
and cultivating data science talent, skills and the models and mechanisms to enable a thriving
landscape. Guests discussed whether the UK has the skills capacity to deliver the potential of
data science and AI and how to ensure a healthy landscape that enables talent to grow and
flow between sectors.

May 2018: Contextualising the disruption evidence gathering workshop


The aim of this workshop was to determine patterns in the drivers for the movement of data
scientists and interrogate a number of previously identified models for upskilling, retaining and
sharing data science talent.

September 2018: Data talent in Newcastle


This workshop was to discuss the demand for data professionals across Newcastle and North
East England (in all sectors – public, industry and academia), how skills gaps are currently being
met and what more could be done to develop a pipeline of talent. Guests represented local
universities, training centres and businesses in data/analytics roles or the recruitment/training
of data professionals.

January 2019: Reviewing the emerging policy recommendations


A lunch discussion was held at the Royal Society to discuss the outcomes of the project on the
UK data science workforce with key stakeholders. Guests were asked to review the project
findings and suggest ways to improve, refine and revise the content, particularly the areas of
action and recommendations for change.

March 2019: Data talent in Wales


This workshop was to discuss the demand for data professionals across Wales (in all sectors –
public, industry and academia), how skills gaps are currently being met and what more could
be done to develop a pipeline of talent.


Appendix 2:

Glossary
Algorithm: A set of rules a computer follows to solve a problem.

Artificial intelligence (AI): An umbrella term for the science of making machines smart.

API: Application Programming Interface.

BEIS: The Department for Business, Energy and Industrial Strategy.

Big data: Large and heterogeneous forms of data that have been collected without strict
experimental design. Big data is becoming more common due to the proliferation of digital
storage, the greater ease of acquisition of data (eg through mobile phones) and the higher
degree of interconnection between our devices (ie the internet).

Data: Numbers, characters or images that designate an attribute of a phenomenon.

DCMS: The Department for Digital, Culture, Media and Sport.

DfE: The Department for Education.

DSAA: Classification code used by Burning Glass Technologies to group together Data Science
and Advanced Analytics job vacancies.

DSA: Classification code used by Burning Glass Technologies to group together Data Science
and Analytics job vacancies.

GDS: The Government Digital Service.

Hadoop: An open source framework that manages data processing and storage for big data
applications.

Java: A programming language.

Machine learning: A set of rules that allows systems to learn directly from examples, data
and experience.

Metadata: ‘Data about data’, contains information about a dataset. For example, this
information could include why and how the original data was generated, who created it and
when. It may also be technical, describing the original data’s structure, licensing terms,
and the standards to which it conforms.

MOOC: A Massive Open Online Course (MOOC) is an online course aimed at unlimited
participation and open access via the web.

Python: A programming language.

UKRI: UK Research and Innovation (UKRI) is the national funding agency investing in science
and research in the UK.

R: A programming language.

SAS: A software suite developed by the SAS Institute for analytics purposes.

Scala: A programming language.

SIC-1: The Standard Industrial Classification (SIC) is a system for classifying industries
according to their economic activity.

STEM: A term used to group together Science, Technology, Engineering and Mathematics.


Appendix 3:

Data
This appendix includes additional data tables, explanations about the Burning Glass
Technologies methodology and other findings from the data.

Understanding the Burning Glass Technologies methodology: Skills clusters
Skills clusters are groupings of related skills. Burning Glass Technologies has developed
a taxonomy of more than 500 skills clusters by grouping skills that often travel together
in job postings. Clustering has been focused on the most frequently occurring skills across
a range of industries, in addition to skills in emerging areas. Burning Glass Technologies
labour market analysts used the following three criteria to group skills:

• related skills eg the skill cluster ‘Statistical Software’ includes skills such as R, SAS,
  and SPSS;

• skills that travel together eg the skill cluster ‘Administrative Support’ includes skills
  such as meeting planning/facilitation, calendar management, travel arrangements, and
  appointment setting; and

• skills that are trained together eg the skill cluster ‘Lean Manufacturing’ includes skills
  such as Kanban, Kaizen, Six Sigma, and Lean Six Sigma.

Skills are grouped for ease of analysis of broader talent requirements. The Burning Glass
Technologies skills hierarchy and grouping has been created using a combination of
hierarchical clustering algorithms and assessment of skills similarity based on job
postings. Using a variety of distance metrics including Cosine, Dice, Jaccard, and others,
the similarity of all skills combinations is determined. Depending on how close two skills
clusters are according to the similarity measures, they are assigned to the same skills
cluster. The final step included manual reviews of the clusters to resolve any unclear cases.

Missing salary information
The number of postings and the average salary have been calculated for the year 2013 and
the last four full quarters (July 1, 2017 – June 30, 2018). However, a large proportion of
postings have no salary attached so these results should be seen as indicative rather than
definitive. Intriguingly the salary for the ‘Data Scientists’ occupation showed only a small
increase, despite the huge increase in postings.


Numerous drivers might explain the salary/demand anomaly for data scientists. Perhaps
employers are leaving some 'wiggle room' in negotiations. Perhaps the nature of the jobs
has shifted as data science transitioned from a sophisticated role into one that leverages
different skillsets. Perhaps employers are adjusting their requirements because of the
supply and demand dynamics that are not in their favour – eg asking for a BA instead of
a PhD. Perhaps the jobs are newer and so there is a limited supply of people in senior
roles with higher paying salaries.

There has been a reduction in the level of experience requested since 2013, with nearly
half of all Data Scientist postings requesting just 0 – 2 years of experience. Looking at
skillsets within occupations would help to try to answer this question, but is beyond the
scope of this project.

Qualification levels
Qualification levels requested for the 'Data Scientists and Advanced Analysts' category
have broadly increased. In 2013, 34% of such postings required Level 6 (first degree)
or Level 7 (MSc or upwards) skills, but by 2017/18 this had increased to 42%. This was
most acutely seen in the ‘Data Scientist’ occupation, where half of all postings
required Level 6 or 7.

Certifications
Data science jobs are not as heavily certificated as other fields. This could be because
there is no supply of industry-recognised credentials at present. For example, in 2017 – 18
while 14.9% of ‘Budget Analyst’ roles mentioned relevant accounting certifications, only
1.1% of ‘Data Engineer’ roles mentioned the relevant Electrotechnical Card Scheme (ECS)
certification. The IT field has more certification demand because the community has
long-standing credentials, for example ITIL (a set of detailed practices for IT service
management) and CompTIA (performance-based exams that certify foundational IT skills
across a variety of devices and operating systems).
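
The skills clusters methodology described earlier in this appendix names Cosine, Dice and
Jaccard as measures of skills similarity based on job postings. The following is a minimal
Python sketch of how such measures could be computed, assuming each skill is represented by
the set of postings that mention it. The toy postings, the function names, the 0.4 threshold
and the simple single-linkage grouping are all illustrative assumptions; they are not Burning
Glass Technologies' data or implementation, which also involved manual review.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each posting is the set of skills it mentions.
postings = [
    {"R", "SAS", "SPSS"},
    {"R", "SAS", "Statistics"},
    {"SAS", "SPSS"},
    {"Kanban", "Kaizen", "Six Sigma"},
    {"Kanban", "Six Sigma"},
    {"Kaizen", "Six Sigma", "R"},
]


def postings_with(skill):
    """Return the indices of the postings that mention a skill."""
    return {i for i, posting in enumerate(postings) if skill in posting}


def jaccard(a, b):
    """Shared postings divided by postings mentioning either skill."""
    return len(a & b) / len(a | b)


def dice(a, b):
    """Twice the shared postings divided by the sum of both counts."""
    return 2 * len(a & b) / (len(a) + len(b))


def cosine(a, b):
    """Cosine similarity of the two binary skill-occurrence vectors."""
    return len(a & b) / sqrt(len(a) * len(b))


def cluster(skills, similarity, threshold):
    """Single-linkage grouping: link any two skills whose similarity
    meets the threshold, then return the connected groups."""
    parent = {s: s for s in skills}

    def find(s):
        # Follow parent links to the group representative.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(skills, 2):
        if similarity(postings_with(a), postings_with(b)) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in skills:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


skills = sorted(set().union(*postings))
clusters = cluster(skills, jaccard, threshold=0.4)
for group in clusters:
    print(group)
```

On this toy data the statistical software skills and the lean manufacturing skills fall into
separate groups. A different measure or threshold would give different groupings, which is
one reason a manual review step is useful.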


Additional data tables

Appendix Table 1

The classification system used by Burning Glass Technologies to group data science
and analytics jobs, listed in order of increasing analytical rigour.

KEY: each heading below is a framework category; each bullet is an occupation.

Data-Driven Decision Makers
• Chief Executive Officer
• Chief Information Officer / Director of Information Technology
• Compensation and Benefits Manager
• Financial Manager
• Human Resources Manager
• IT Project Manager
• Logistics Manager
• Marketing Manager
• Operations Manager
• Procurement Manager
• Product Manager
• Quality Control Systems Manager
• Talent Acquisition Manager

Functional Analysts
• Actuary
• Budget Analyst
• Business / Management Analyst
• Clinical Analyst / Clinical Documentation and Improvement Specialist
• Clinical Data Manager
• Compensation / Benefits Analyst
• Credit Analyst / Authoriser
• E-Commerce Analyst
• Financial Analyst
• Financial Examiner / Auditor
• Fraud Examiner / Analyst
• Geographer / GIS Specialist
• HRIS Analyst / Specialist
• Human Resources Analyst
• Logistics / Supply Chain Analyst
• Market Research Analyst
• Operations Analyst
• Pricing Analyst
• Researcher / Research Associate
• Risk Analyst
• Risk Consultant
• Search Engine Optimisation Specialist
• Security / Defence Intelligence Analyst
• Social Science Researcher
• Survey Researcher

Analytics Managers
• Chief Executive Officer
• Chief Information Officer / Director of Information Technology
• Compensation and Benefits Manager
• Director of Risk Management
• Financial Manager
• Human Resources Manager
• IT Project Manager
• Marketing Manager
• Product Manager
• Risk Manager
• Talent Acquisition Manager

Data Systems Developers
• Business Intelligence Architect / Developer
• Computer Systems Engineer / Architect
• Data Warehousing Specialist
• Database Administrator
• Database Architect
• Hadoop Developer
• Systems Analyst

Data Analysts
• Business Intelligence Analyst
• Data / Data Mining Analyst

Data Scientists and Advanced Analysts
• Biostatistician
• Data Engineer
• Data Scientist
• Economist
• Financial Quantitative Analyst
• Statistician


Appendix Table 2

Regional data and Data Science and Advanced Analytics (DSAA) jobs.

Region/Nation             Data jobs  Data jobs  Data jobs   DSAA jobs  DSAA jobs  DSAA jobs
                          2013       2018       % increase  2013       2018       % increase

Northern Ireland          4,987      11,902     139%        46         305        563%
Scotland                  36,388     41,117     13%         528        1,114      111%
Wales                     9,198      14,507     58%         121        216        79%
England                   654,242    837,307    25%         7,130      22,527     216%
East Midlands             30,957     37,743     22%         200        474        137%
East of England           53,295     61,725     16%         551        1,928      250%
Greater London            283,817    345,164    22%         4,131      14,066     240%
North East                7,444      11,095     49%         54         175        224%
North West                45,286     61,871     37%         320        1,182      269%
South East                112,721    126,551    12%         1,086      2,305      112%
South West                41,066     59,581     45%         276        903        227%
West Midlands             41,771     86,563     107%        256        748        192%
Yorkshire and the Humber  37,885     47,014     24%         256        746        191%
TOTAL                     704,815    904,833    28%         7,825      24,162     209%
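
As a worked illustration of how the ‘% increase’ columns relate to the job counts, the
following minimal Python sketch recomputes them for three rows of Appendix Table 2. It
assumes the simple definition (2018 count minus 2013 count) divided by the 2013 count; the
function name is hypothetical and the report does not state the exact formula or rounding
used.

```python
def percent_increase(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100


# Rows copied from Appendix Table 2:
# (data jobs 2013, data jobs 2018, DSAA jobs 2013, DSAA jobs 2018)
rows = {
    "Northern Ireland": (4987, 11902, 46, 305),
    "Wales": (9198, 14507, 121, 216),
    "TOTAL": (704815, 904833, 7825, 24162),
}

for region, (data_2013, data_2018, dsaa_2013, dsaa_2018) in rows.items():
    data_pct = round(percent_increase(data_2013, data_2018))
    dsaa_pct = round(percent_increase(dsaa_2013, dsaa_2018))
    print(f"{region}: data jobs +{data_pct}%, DSAA jobs +{dsaa_pct}%")
```

Under this assumption the rounded results for these three rows agree with the percentages
printed in the table.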


Appendix Table 3

The top 10 skills listed in DSAA job adverts (2013).


This table shows the skills which occurred the most in 2013. This is measured in terms of the
proportion of Data Science and Advanced Analyst (DSAA) job adverts which specified the skill
as a requirement for the role. There were a total of 682 skills included in this analysis. This table
displays the top ten most frequently occurring skills.

                              Number of DSAA job adverts     Percentage of DSAA job adverts
                              requiring this skill           requiring this skill
Rank  Skill                   2013*     2017 – 18**          2013      2017 – 18
1     Research                2,042     5,279                25%       20%
2     Communication Skills    1,843     4,849                23%       18%
3     Statistics              1,657     2,800                20%       10%
4     Economics               1,340     1,984                16%       7%
5     SAS                     1,275     2,323                16%       9%
6     Microsoft Excel         1,126     1,585                14%       6%
7     SQL                     1,048     7,226                13%       27%
8     Data Analysis           841       2,858                10%       11%
9     Statistical Analysis    811       1,630                10%       6%
10    C++                     754       2,166                9%        8%

*Out of 8,157 job adverts included in this analysis. **Out of 27,033 job adverts included in this analysis.
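
The percentage columns in Appendix Table 3 are described as the proportion of DSAA job
adverts that specified each skill. A minimal Python sketch of that calculation follows,
assuming the share is the number of adverts requiring the skill divided by the total number
of adverts analysed in that period (8,157 and 27,033, from the table footnote). The function
and variable names are hypothetical.

```python
def share_of_adverts(adverts_with_skill, total_adverts):
    """Percentage of adverts that list a given skill."""
    return adverts_with_skill / total_adverts * 100


# Totals from the footnote to Appendix Table 3.
TOTAL_2013 = 8157
TOTAL_2017_18 = 27033

# (adverts in 2013, adverts in 2017-18), copied from Appendix Table 3.
skill_counts = {
    "Research": (2042, 5279),
    "Statistics": (1657, 2800),
    "SQL": (1048, 7226),
}

for skill, (count_2013, count_2017_18) in skill_counts.items():
    share_2013 = round(share_of_adverts(count_2013, TOTAL_2013))
    share_2017_18 = round(share_of_adverts(count_2017_18, TOTAL_2017_18))
    print(f"{skill}: {share_2013}% in 2013, {share_2017_18}% in 2017-18")
```

Because the total number of adverts more than tripled between the two periods, a skill can
appear in more adverts in absolute terms while its share falls, as with Research and
Statistics here.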


Appendix Table 4

The top 10 skills clusters listed in DSAA job adverts (2013).


This table shows the skills clusters which occurred the most in 2013. This is measured in terms
of the proportion of Data Science and Advanced Analyst (DSAA) job adverts which specified
the skills clusters as a requirement for the role. There were a total of 278 skills clusters
included in this analysis. This table displays the top ten most frequently occurring skills clusters.

                                     Number of DSAA job adverts     Percentage of DSAA job adverts
                                     requiring this skill cluster   requiring this skill cluster
Rank  Skills cluster                 2013*     2017 – 18**          2013      2017 – 18
1     Statistics                     1,908     7,406                23%       27%
2     Statistical Software           1,797     8,894                22%       33%
3     Data Analysis                  1,628     10,636               20%       39%
4     Economics                      1,532     5,086                19%       19%
5     SQL Databases and Programming  1,151     15,532               14%       57%
6     Data Science                   1,056     25,042               13%       93%
7     Project Management             918       4,046                11%       15%
8     Medical Research               867       2,200                11%       8%
9     Scripting Languages            829       23,450               10%       87%
10    C and C++                      757       4,344                9%        16%

*Out of 8,157 job adverts included in this analysis. **Out of 27,033 job adverts included in this analysis.

The Royal Society is a self-governing Fellowship of many
of the world’s most distinguished scientists drawn from all
areas of science, engineering, and medicine. The Society’s
fundamental purpose, as it has been since its foundation
in 1660, is to recognise, promote, and support excellence
in science and to encourage the development and use of
science for the benefit of humanity.

The Society’s strategic priorities emphasise its commitment
to the highest quality science, to curiosity-driven research,
and to the development and use of science for the benefit
of society. These priorities are:
• Promoting excellence in science
• Supporting international collaboration
• Demonstrating the importance of science to everyone

For further information


The Royal Society
6 – 9 Carlton House Terrace
London SW1Y 5AG
T +44 20 7451 2500
E science.policy@royalsociety.org
W royalsociety.org

Registered Charity No 207043


ISBN: 978-1-78252-395-6
Issued: May 2019 DES5847
