On the steerability of large language models toward data-driven personas

Li, Junyi; Mehrabi, Ninareh; Peris, Charith; Goyal, Palash; Chang, Kai-Wei; Galstyan, Aram; Zemel, Richard; Gupta, Rahul

arXiv:2311.04978 (cs)

[Submitted on 8 Nov 2023 (v1), last revised 2 Apr 2024 (this version, v2)]

Abstract: Large language models (LLMs) are known to generate biased responses in which the opinions of certain groups and populations are underrepresented. Here, we present a novel approach to controllable generation of specific viewpoints using LLMs that can be leveraged to produce multiple perspectives and reflect diverse opinions. Moving beyond the traditional reliance on demographics such as age, gender, or party affiliation, we introduce a data-driven notion of persona grounded in collaborative filtering, defined as either a single individual or a cohort of individuals manifesting similar views across specific inquiries. Because individuals in the same demographic group may have different personas, our data-driven persona definition allows for a more nuanced understanding of the different (latent) social groups present in the population. We also explore an efficient method to steer LLMs toward the personas that we define. We show that our data-driven personas significantly enhance model steerability, with improvements of $57\%$ to $77\%$ over our best-performing baselines.
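
To make the idea of a persona as "a cohort of individuals manifesting similar views across specific inquiries" concrete, here is a minimal sketch. The abstract does not specify the collaborative filtering model, so this substitutes a simple memory-based variant: respondents are linked when they agree on enough of the questions they both answered, and each connected group is treated as a cohort. The survey data, the `agreement` and `cohorts` helpers, and the `threshold` value are all hypothetical and serve only as illustration.

```python
from itertools import combinations

# Hypothetical survey responses: respondent -> {question_id: answer_choice}.
# A missing key means the respondent did not answer that question.
responses = {
    "r1": {"q1": "A", "q2": "B", "q3": "A"},
    "r2": {"q1": "A", "q2": "B", "q4": "C"},
    "r3": {"q1": "B", "q2": "A", "q3": "B"},
    "r4": {"q2": "A", "q3": "B", "q4": "A"},
}


def agreement(a, b):
    """Fraction of co-answered questions on which two respondents agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[q] == b[q] for q in shared) / len(shared)


def cohorts(responses, threshold=0.75):
    """Group respondents into connected components of the agreement graph.

    Two respondents are linked when their agreement meets the (assumed)
    threshold; linked respondents end up in the same cohort.
    """
    parent = {r: r for r in responses}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(responses, 2):
        if agreement(responses[u], responses[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for r in responses:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(g) for g in groups.values())


print(cohorts(responses))
```

Note that cohorts here are formed purely from answer patterns, with no demographic attributes involved, which is the distinction the abstract draws between data-driven and demographic personas.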

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2311.04978 [cs.CL]
	(or arXiv:2311.04978v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.04978

Submission history

From: Ninareh Mehrabi
[v1] Wed, 8 Nov 2023 19:01:13 UTC (2,939 KB)
[v2] Tue, 2 Apr 2024 18:29:52 UTC (7,083 KB)
