
Implementing generative AI with speed and safety


Generative AI poses both risks and opportunities. Here’s a road map
to mitigate the former while moving to capture the latter from day one.
This article is a collaborative effort by Oliver Bevan, Michael Chui, Ida Kristensen, Brittany Presten,
and Lareina Yee, representing views from McKinsey’s Risk & Resilience Practice and QuantumBlack,
AI by McKinsey.

March 2024
Generative AI (gen AI) presents a once-in-a-generation opportunity for companies, with the potential for transformative impact across innovation, growth, and productivity. The technology can now produce credible software code, text, speech, high-fidelity images, and interactive videos. It has identified the potential for millions of new materials through crystal structures and even developed molecular models that may serve as the base for finding cures for previously untreated diseases.

McKinsey research has estimated that gen AI has the potential to add up to $4.4 trillion in economic value to the global economy while enhancing the impact of all AI by 15 to 40 percent.1 While many corporate leaders are determined to capture this value, there's a growing recognition that gen AI opportunities are accompanied by significant risks. In a recent flash survey of more than 100 organizations with more than $50 million in annual revenue, McKinsey finds that 63 percent of respondents characterize the implementation of gen AI as a "high" or "very high" priority.2 Yet 91 percent of these respondents don't feel "very prepared" to do so in a responsible manner.

That unease is understandable. The risks associated with gen AI range from inaccurate outputs and biases embedded in the underlying training data to the potential for large-scale misinformation and malicious influence on politics and personal well-being. There are also broader debates on both the possibility and desirability of developing AI in general. These issues could undermine the judicious deployment of gen AI, potentially leading companies to pause experimentation until the risks are better understood—or even deprioritize the technology because of concerns over an inability to manage the novelty and complexity of these issues.

However, by adapting proven risk management approaches to gen AI, it's possible to move responsibly and with good pace to capture the value of the technology. Doing so will also allow companies to operate effectively while the regulatory environment around AI continues to evolve, such as with President Biden's executive order regarding gen AI development and use and the EU AI Act (see sidebar, "The United States moves to regulate AI"). In addition, most organizations are likely to see the use of gen AI increase "inbound" threats (risks likely to affect organizations regardless of whether they deploy gen AI), particularly in fraud and cyber domains (early indications are that gen AI will be able to defeat standard antifraud biometric checks3). Building fit-for-purpose risk management will help guard against these threats.

The United States moves to regulate AI

On October 30, 2023, the Biden administration released a long-awaited executive order aimed at addressing concerns related to AI development in economic, national-security, and social domains. The order establishes principles, tasks federal agencies with AI-testing methods, codifies government oversight of private AI development, and outlines AI's impact on national security and foreign policy:

— Holistic AI governance. The order establishes a comprehensive framework for AI governance, emphasizing ethics, safety, and security. It addresses the importance of responsible innovation, collaboration, and competition in the AI industry.

— Private sector accountability. The order mandates that private companies involved in AI adhere to industry standards, report on compliance, and implement best practices. This includes meeting specific guidelines on transparency and accountability, especially for dual-use foundation models and large-scale computing clusters.

— Cross-sector impact. The order addresses various sectors affected by AI, including critical infrastructure, cybersecurity, education, healthcare, national security, and transportation. It promotes interagency collaboration to integrate AI responsibly and securely across these sectors, aligning government and industry efforts for societal benefit.

1 "The economic potential of generative AI: The next productivity frontier," McKinsey, June 14, 2023.
2 Unpublished data from McKinsey survey results.

In practical terms, enterprises looking to address gen AI risk should take the following four steps:

1. Launch a sprint to understand the risk of inbound exposures related to gen AI.

2. Develop a comprehensive view of the materiality of gen-AI-related risks across domains and use cases, and build a range of options (including both technical and nontechnical measures) to manage risks.

3. Establish a governance structure that balances expertise and oversight with an ability to support rapid decision making, adapting existing structures whenever possible.

4. Embed the governance structure in an operating model that draws on expertise across the organization and includes appropriate training for end users.

The specifics of how to implement these steps and the degree of change required to make them effective will vary with an organization's gen AI aspirations and nature. For instance, it could be looking to be a maker of the foundation models, a shaper that customizes and scales foundation models, or a taker that adopts foundation models through off-the-shelf applications with little or no customization (for example, standard office productivity software).4

This article provides a blueprint for developing an approach to implementing gen AI responsibly. Following these steps helps organizations move quickly to scale the technology and capture its benefits while minimizing their exposure to the potential downsides.

Understanding and responding to inbound risks

In our experience, including through building McKinsey's own gen AI application, gen-AI-related risks can be captured in eight main categories (Exhibit 1). These categories consider both inbound risks and risks that directly result from the adoption of gen AI tools and applications. Every company should develop some version of this core taxonomy to support understanding and communication on the risks arising from the implementation of gen AI.


3 "AI may soon defeat biometric security, even facial recognition software," blog entry by Mike Elgan, Security Intelligence, January 31, 2019.
4 For more, see "Technology's generational moment with generative AI: A CIO and CTO guide," McKinsey, July 11, 2023.



Exhibit 1: Half of eight basic categories of generative AI risk apply to all organizations regardless of their deployment of related use cases. Each category can present as an inbound risk, as a risk arising from gen AI adoption, or as both.

— Impaired fairness: algorithmic bias resulting from unrepresentative training data or model performance, or misrepresentation of AI-generated content as human created

— Intellectual property (IP) infringement: infringement on copyrighted or otherwise legally protected materials, inadvertent leakage of IP into the public domain, or both

— Data privacy and quality: unauthorized use or disclosure of personal or sensitive information, or use of incomplete or inaccurate data for model training

— Malicious use: malicious or harmful AI-generated content (eg, falsehoods/deepfakes, scams/phishing, hate speech)

— Security threats: vulnerabilities in gen AI systems (eg, payload splitting to bypass safety filters, manipulability of open-source models)

— Performance and "explainability": inability to explain model outputs appropriately, or model inaccuracies (eg, factually incorrect or outdated answers, hallucinations)

— Strategic: risk of noncompliance with standards or regulations, societal risk, and reputational risk

— Third party: risks associated with use of third-party AI tools (eg, proprietary data being used by public models)

McKinsey & Company

Deciding how to respond to inbound risks is a focus for many executive teams and boards. This decision should serve as a foundation for how an organization communicates about gen AI to its employees and stakeholders. It should also inform the approach to use cases.

We see four primary sources of inbound risk from the adoption of gen AI:

— security threats, resulting from the increased volume and sophistication of attacks from gen-AI-enabled malware

— third-party risk, resulting from challenges in understanding where and how third parties may be deploying gen AI, creating potential unknown exposures

— malicious use, resulting from the potential for bad actors to create compelling deepfakes of company representatives or branding that result in significant reputational damage

— intellectual property (IP) infringement, resulting from IP (such as images, music, and text) being scraped into training engines for underlying large language models and made accessible to anyone using the technology

Most organizations will benefit from a focused sprint to investigate how gen AI is changing their external environment, with two primary objectives. The first is to understand potential exposures to inbound risks, anchored in the organization's risk profile (for example, how many third parties have access to sensitive or confidential data that need to be restricted from training external gen AI models). The second objective is to understand the maturity and readiness of the control environment—the technical and nontechnical capabilities the organization has in place to prevent, detect, and ultimately respond to inbound risks. These include cyber and fraud defenses, third-party diligence to identify where critical third parties may be deploying gen AI, and the ability to limit the scraping of company IP by engines used to train large language models.



The outcome of these efforts should be an understanding of where the organization faces the largest potential inbound exposures, as well as the maturity and readiness of its current defense system. Having conducted this exercise, the organization should have a clear road map of where to harden defenses and what the potential ROI from these efforts would be in potential risk mitigation.

Given the evolving nature of the technology underlying gen AI and its applications, organizations will need to repeat the effort to identify their exposure with some regularity. For most organizations, refreshing this exercise at least semiannually will be important until the pace of change has moderated and the control environments and defenses have matured.

Tethering Prometheus: Managing the risks produced by gen AI adoption

Organizations with ambitions to deploy gen AI will need to undertake additional, ongoing efforts to understand and manage the risks of the technology's adoption. This will likely require an investment of time and resources and a shift in ways of working. Yet it's essential if organizations are to achieve long-term, sustainable, and transformative benefits from gen AI. Missteps and failures can erode the confidence of executives, employees, and customers and trigger scaling back in the level of ambition to ultrasafe use cases that generate limited risk but are also unlikely to capitalize on the technology's true potential.

Organizations looking to deploy high-potential use cases for gen AI to drive productivity and innovation; provide better, more consistent customer service; and boost creativity in marketing and sales must address the challenge of responsible implementation. These use cases have varying risk profiles, reflecting both the nature of the technology itself and company-specific context concerning the specifics of the use case (for example, deployment of a gen AI chatbot to certain at-risk populations has a very different risk profile from that of a B2B deployment) (Exhibit 2).

Exhibit 2: Different generative AI use cases are associated with different kinds of risk. The exhibit maps four use case families (customer journeys, eg, chatbots for customer services; concision, eg, generating content summaries; coding, eg, generating or debugging code; and creative content, eg, developing marketing content) against their primary risks across seven categories: impaired fairness, IP infringement, data privacy and quality, malicious use, security threats, performance and "explainability," and strategic.



Identify risks across use cases

The essential starting point for organizations deploying gen AI use cases is to map the potential risks associated with each case across key risk categories to assess the potential risk severity. For example, use cases that support customer journeys, such as gen-AI-enabled chatbots for customer service, may raise risks such as bias and inequitable treatment across groups (for example, by gender and race), privacy concerns from users inputting sensitive information, and inaccuracy risks from model hallucination or outdated information (Exhibit 3).

When conducting this analysis, it's important to develop a rubric to calibrate expectations of what constitutes a high versus a medium risk across categories. Otherwise, organizations may run into disagreements driven more by individual comfort on risk levels than by objective factors. To take the example of data privacy, we typically see higher-risk examples as requiring personal or sensitive information for accurate training of the model (or higher potential for users to enter personal information in interacting with the technology). Lower-risk use cases would exhibit neither of these characteristics.

Exhibit 3: Organizations that deploy generative AI use cases can create a heat map ranking the potential severity (low, medium, or high) of various categories of risk (impaired fairness, IP infringement, data privacy and quality, malicious use, security threats, performance and "explainability," strategic, and third party). Sample use cases include AI financial advisers for individualized advice and AI bots for businesses (eg, SMEs) to track targets (customer journeys); mining financial reports to derive important insights and detecting or preventing fraud by aggregating and interpreting payment documentation (concision); model risk management (eg, testing, review, documentation) and reducing tech delivery timelines via automated coding and testing (coding); and personalized content offerings (eg, credit card offers) and automated contract drafting (creative content).

McKinsey & Company
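A heat map like the one in Exhibit 3 can be kept as structured data so that cross-functional reviewers rate every use case against the same taxonomy and rubric. A minimal sketch follows; the category names come from Exhibit 1, while the specific severity assignments and function names are illustrative assumptions.

```python
# The eight risk categories from Exhibit 1.
CATEGORIES = [
    "impaired fairness", "IP infringement", "data privacy and quality",
    "malicious use", "security threats", "performance and explainability",
    "strategic", "third party",
]
SEVERITIES = ["low", "medium", "high"]  # calibrated by a shared rubric

def validate(heat_map: dict) -> None:
    """Reject ratings outside the agreed taxonomy or severity scale."""
    for use_case, ratings in heat_map.items():
        for category, severity in ratings.items():
            assert category in CATEGORIES, f"unknown category: {category}"
            assert severity in SEVERITIES, f"unknown severity: {severity}"

def highest_risk(heat_map: dict) -> list:
    """List use cases carrying at least one high-severity rating."""
    return [uc for uc, r in heat_map.items() if "high" in r.values()]

heat_map = {
    "AI financial adviser": {"data privacy and quality": "high",
                             "performance and explainability": "high",
                             "impaired fairness": "medium"},
    "automated contract drafting": {"IP infringement": "medium",
                                    "data privacy and quality": "low"},
}
validate(heat_map)
print(highest_risk(heat_map))  # ['AI financial adviser']
```

Encoding the rubric this way makes the article's point concrete: the adviser use case ranks higher in privacy exposure than contract automation, and the comparison rests on shared definitions rather than individual comfort levels.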



Using this logic, developing an application that supports an adviser in providing tailored financial advice would tend to rank higher in privacy risk exposure than would an application that automates basic contract templates.

It's essential that the executive in charge of the use case leads the initial assessment of the risks associated with it (as part of the role of the product manager in an effective operating model). This fosters the appropriate awareness of potential risks and accountability for managing them when the use case is approved for ultimate development. In addition, a cross-functional group, including business heads and members of legal and compliance functions, should review and validate the risk assessments for all use cases—and use the results as input when making decisions about use case prioritization.

Consider options for managing risks at each touchpoint

Once an organization maps the gen-AI-related risks, it must develop strategies to manage exposures through a combination of mitigation and robust governance. Many (but not all) mitigations are technical in nature and can be implemented across the life cycle of the process. Importantly, these controls don't all need to be embedded in the underlying foundation model itself (which many organizations won't have access to). Some can be overlays built in the local environment, as is the case of a gen-AI-enabled chatbot designed by an HR department to field employee queries about benefits (Exhibit 4).

In that use case, across the life cycle of a query, once a user asks a question, many possible mitigations can occur. They include having the chatbot ask clarifying questions to generate additional necessary user inputs, having the user confirm that the chatbot has properly understood the query, limiting the types of data sets that the chatbot can access (for example, excluding personal information), and designing the chatbot to provide citations to explain its answers and allow for fact-checking of its responses. Organizations implementing this use case can take steps (such as limiting repeated interactions) to frustrate the attack vectors and jailbreaking that are known to create challenges for chatbots. They can also develop classifiers to identify and reject out-of-scope queries (such as requesting calculations).
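Two of the overlay controls described above, rejecting out-of-scope queries and limiting repeated interactions, can sit entirely outside the foundation model. The sketch below is hedged accordingly: the keyword list is a stand-in for a trained classifier, and the per-user turn cap stands in for whatever rate limiter an organization actually deploys.

```python
from collections import defaultdict

# Illustrative stand-in for a trained out-of-scope classifier.
OUT_OF_SCOPE_MARKERS = ("calculate", "compute", "sum of")

def in_scope(query: str) -> bool:
    """Reject queries the HR chatbot was not designed to answer."""
    q = query.lower()
    return not any(marker in q for marker in OUT_OF_SCOPE_MARKERS)

class InteractionLimiter:
    """Frustrate jailbreak attempts by capping repeated interactions per user."""
    def __init__(self, max_turns: int = 20):
        self.max_turns = max_turns
        self.turns = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        self.turns[user_id] += 1
        return self.turns[user_id] <= self.max_turns

limiter = InteractionLimiter(max_turns=2)
queries = ["What are my vision care benefits?", "Calculate my payroll tax", "And dental?"]
for q in queries:
    if not limiter.allow("employee-42"):
        print("rate limited:", q)
    elif not in_scope(q):
        print("out of scope:", q)
    else:
        print("answered:", q)
```

Because both controls wrap the model rather than modify it, a taker organization with no access to model weights can still apply them in its local environment.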

Exhibit 4: Generative AI risk can be mitigated at multiple points across a user interaction. A sample HR chatbot interaction includes built-in checkpoints to catch potential misfires:

1. User asks a question: "What are my vision care benefits?"

2. Chatbot clarifies the question. The model seeks to better understand the user's needs by asking for more data: Permanent or temporary employee? Eyewear or checkups? State of residence? Benefits plan subscription?

3. Prompt is enriched. The model enriches the query by retrieving templates from a prompt library (all templates have been tested against performance benchmarking); after receiving the updated prompt, the user is asked to confirm that the query has been parsed accurately.

4. Bot searches relevant knowledge. The model searches for relevant information to answer the prompt.

5. Bot generates a response. The model generates a response (explicitly citing the HR documents used to compose it) and conducts a final set of risk and quality checks.

6. Answer is consumed. The user receives the response and, thanks to citations, can easily verify the answer and dig deeper if necessary.
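The six checkpoints in Exhibit 4 amount to a pipeline in which each stage can refuse to pass the query along. The following is a minimal sketch under stated assumptions: the function names, the stubbed UI callbacks, and the retrieval and generation lambdas are illustrative placeholders, not a real chatbot implementation.

```python
def clarify(question: str, ask_user) -> dict:
    """Checkpoint 2: gather the details needed to disambiguate the question."""
    return {"question": question,
            "employee_type": ask_user("Permanent or temporary employee?"),
            "state": ask_user("State of residence?")}

def enrich(details: dict, confirm) -> str:
    """Checkpoint 3: apply a benchmarked prompt template, then confirm parsing."""
    prompt = (f"{details['question']} "
              f"[type={details['employee_type']}, state={details['state']}]")
    if not confirm(f"Did I understand correctly? {prompt}"):
        raise ValueError("user rejected the parsed query")
    return prompt

def answer(prompt: str, search, generate) -> dict:
    """Checkpoints 4-6: retrieve approved documents, generate, and cite sources."""
    documents = search(prompt)  # limited to approved, non-personal data sets
    response = generate(prompt, documents)
    return {"response": response, "citations": [d["id"] for d in documents]}

# Stubbed dependencies standing in for real UI, retrieval, and model calls.
answers = {"Permanent or temporary employee?": "permanent",
           "State of residence?": "NY"}
details = clarify("What are my vision care benefits?", ask_user=answers.get)
prompt = enrich(details, confirm=lambda q: True)
result = answer(prompt,
                search=lambda p: [{"id": "HR-vision-plan.pdf", "text": "..."}],
                generate=lambda p, docs: "Your plan covers annual eye exams.")
print(result["citations"])  # ['HR-vision-plan.pdf']
```

Passing the UI, retrieval, and generation steps in as callables mirrors the article's overlay idea: each checkpoint can be swapped or hardened without touching the foundation model.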



There are important categories of additional nontechnical mitigations that organizations should consider when developing use cases. At this stage of gen AI maturity, most organizations are maintaining humans in the loop to guard against the technology being able to put outputs directly into production or to engage directly with end customers. As previously referenced, contractual provisions to guard against problematic use of data from third parties are important. As a third example, organizations should develop coding standards and libraries to capture appropriate metadata and methodological standards to support reviews.

Many of the initial mitigating strategies for gen AI span multiple use cases, allowing organizations to get scaled benefits from their technical mitigations rather than having to create bespoke approaches for each case. For example, in the HR chatbot example, the ability to produce sources as part of the query answer could also be applied in use cases of an employee trying to explain a product to a customer or building analyses of peer companies. In both cases, this will go some way to addressing challenges of "explainability" and overall confidence in output.

Balancing speed to scale with judicious risk management through governance

Using gen AI will place new demands on most organizations to adapt governance structures to respond to demands on approvals and exercise oversight. However, most organizations should be able to adapt what they have today by expanding mandates or coverage (Exhibit 5). This will limit the

Exhibit 5: Moving with speed while mitigating risk often requires revised governance. Three moves anchor responsible gen AI governance:

— Establish a cross-functional, responsible gen AI steering group.

— Hire an AI governance officer to propel the centralization needed for consistent AI policies and standards and to keep the internal control system updated.

— Develop AI guidelines and policies, agreed upon by the executive team and board, to guide responsible company-wide AI adoption and use cases.

These moves rest on governance bodies and policies: governance bodies with a consistent structure and delegation that meet regularly, each with a clearly defined mandate and processes; a consistent and comprehensive framework of decision authority with structured and transparent delegation rules; clear definitions of the mandates and roles of management positions, stringently cascaded through all organizational layers; and structured codification of rules, particularly around authority, ownership, and processes. The enablers are an internal control system, governance tools, three lines of defense (controls, monitoring, and audit), and a governance culture that develops a responsible culture and talent base to promote responsible gen AI (eg, practicing "ethics and responsibility by design").



potential disruption of establishing an entirely new phalanx of committees and approval bodies that could add friction to decision making and confusion over accountability.

Gen AI will likely require organizations to make changes to three core elements of governance:

— A cross-functional, responsible gen AI steering group with at least a monthly cadence. This group should include business and technology leaders, as well as data, privacy, legal, and compliance members. It should have a mandate for making critical decisions on managing gen AI risks, covering assessment of exposures and mitigating strategies for both inbound and adoption-based risks. It should review foundational strategy decisions, such as the selection of foundational models and compatibility with the organization's risk posture. This steering group ideally has a single individual empowered to handle coordination and agenda setting. In industries with established regulatory expectations and a long history of managing model and algorithmic risk (such as financial services), this person will typically already be on staff (and may be the head of model risk). Organizations facing a sudden increase in regulatory expectations from gen AI may need to hire an AI governance officer or a similar role to discharge these responsibilities.

— Responsible AI guidelines and policies. Organizations should develop a set of guiding principles, agreed on by the executive team and the board, that will guide AI adoption and serve as a guardrail for acceptable use cases. Principles that we've seen debated include questions on the degree to which gen AI can or should be used to drive personalized marketing or customer outreach, the use of gen AI to support employment decisions (including hiring and performance reviews), and the conditions under which gen AI outputs can be put directly into production without human review. Existing policies typically need to be refreshed to account for gen AI development and use (for example, covering misrepresentation and IP infringement).

— Responsible AI talent and culture. A commitment to responsible AI can't rest solely in the executive ranks. Instead, it needs to cascade throughout the organization, with accountability, capability building, and awareness tailored to each role's degree of exposure to the technologies. Basic organization-wide training on responsible AI should be developed and rolled out to foster a broad understanding of the dynamics of inbound risk and of how to engage with the technology safely. For example, given the potential for the models to hallucinate, users should be told, as part of their training, that they shouldn't accept an answer just because their machine has provided it (in contrast to how they may have experienced prior office productivity technologies). Those engaged in the development and scaling of use cases should have a deep understanding of ethics and "responsibility by design" to embed risk considerations early in the design and engineering processes. Talent considerations include embedding a mix of nontechnical and technical talent—and ideally, technical talent with risk expertise to support the identification and design of user query workflows and controls.

Implementing responsible gen AI: It's all about governance and people

Establishing the right governance is a necessary but not sufficient step in driving responsible adoption of gen AI use cases at scale. As referenced in the preceding section, embedding responsibility by design into the development process is essential for judicious deployment of the technology. Four critical roles are required for successful implementation across use cases; the responsibilities of these roles are tied closely to their talent and to the actions expected of them in pushing use cases forward:

— Designers. Designers, or product managers, steer the direction of gen AI deployment by identifying new use cases with an awareness of how they fit into the organization's overall gen AI strategy and road map. They're typically drawn from within the businesses and functions for which the organization has the most



conviction that gen AI can have significant impact. The product managers should be accountable for identifying and mitigating relevant risks. They will have an important role in driving the cultural changes required to adopt gen AI, including building trust in the proposition that business value can be achieved responsibly and safely for employees and customers.

— Engineers. Engineers are technical experts who understand the mechanics of gen AI. They develop or customize the technology to support the gen AI use cases. Just as important, they're responsible for advising on the technical feasibility of mitigations and ultimately coding the mitigations to limit risk, as well as developing technical-monitoring strategies.

— Governors. Governors make up the teams that help establish the necessary governance, processes, and capabilities to drive responsible and safe implementation practices for gen AI. These include establishing the core risk frameworks, guardrails, and principles to guide the work of designers and engineers and challenging risk evaluation and mitigation effectiveness (especially for higher-risk use cases). The AI governance officer is a prime example of this persona, although the role will need to be complemented with others, given the range of potential risks. These roles will ideally cover data risk, data privacy, cybersecurity, regulatory compliance, and technology risk. Given the nascency of gen AI, governors will often need to coordinate with engineers to launch "red team" tests of emerging use cases built on gen AI models to identify and mitigate potential challenges.

— Users. Users represent the end users of new gen AI tools or use cases. They will need to be trained and acculturated to the dynamics and potential risks of the technology (including their role in responsible usage). They also play a critical role in helping identify risks from gen AI use cases, as they may experience problematic outputs in their interactions with the model.

An operating model should account for how the different personas will interact at different stages of the gen AI life cycle. There will be natural variations for each organization, depending on the specific capabilities embedded in each of the personas. For example, some organizations will have more technical capabilities in designers, meaning they may have a more active delivery role. But the intent of the operating model is to show how engagement varies at each stage of deployment.

Gen AI has the potential to redefine how people work and live. While the technology is fast developing, it comes with risks that range from concerns over the completeness of the training data to the potential of generating inaccurate or malicious outputs. Business leaders need to revise their technology playbooks and drive the integration of effective risk management from the start of their engagement with gen AI. This will allow for the application of this exciting new technology in a safe and responsible way, helping companies manage known risks (including inbound risks) while building the muscles to adapt to unanticipated risks as the capabilities and use cases of the technology expand. With major potential uplift in productivity at stake, working to scale gen AI sustainably and responsibly is essential in capturing its full benefits.

Oliver Bevan is a partner in McKinsey’s Chicago office; Michael Chui is a partner in the Bay Area office, where Brittany Presten
is an associate partner and Lareina Yee is a senior partner; and Ida Kristensen is a senior partner in the New York office.

Designed by McKinsey Global Publishing


Copyright © 2024 McKinsey & Company. All rights reserved.

