Building Markovian Generative Architectures over Pretrained LM Backbones for Efficient Task-Oriented Dialog Systems

Liu, Hong; Cai, Yucheng; Ou, Zhijian; Huang, Yi; Feng, Junlan

arXiv:2204.06452 (cs)

[Submitted on 13 Apr 2022 (v1), last revised 14 Oct 2022 (this version, v2)]

Abstract: Recently, Transformer-based pretrained language models (PLMs), such as GPT2 and T5, have been leveraged to build generative task-oriented dialog (TOD) systems. A drawback of existing PLM-based models is their non-Markov architecture across turns, i.e., the whole history is used as the conditioning input at each turn. First, this brings inefficiencies in memory and computation. Second, using the whole history increases model complexity and may hurt training efficiency, especially when only small amounts of labeled training data are available (the low-resource setting). In this paper, motivated by the observation that dialog states can be viewed as Markov states, we propose to build Markovian Generative Architectures (MGA) over PLM backbones for efficient TOD systems. Experiments on MultiWOZ2.1 show that in the rich-resource setting, the proposed Markov models reduce memory and time costs without performance degradation; in the low-resource setting, the training-efficiency advantage of the Markov models is even more significant.
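
To make the contrast in the abstract concrete, the following is a minimal sketch of how the conditioning input might be assembled in the two cases. It is not the authors' implementation. The abstract does not spell out the exact Markov conditioning input, so the sketch assumes it consists of the previous dialog state, the previous system response, and the current user utterance. The `Turn` class, both function names, and the example dialog are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One completed dialog turn (hypothetical container for illustration)."""
    user: str      # user utterance
    state: str     # dialog state after this turn, serialized as text
    response: str  # system response


def full_history_context(past: List[Turn], user: str) -> str:
    """Non-Markov conditioning: every previous turn is concatenated,
    so the input grows with the number of turns."""
    parts: List[str] = []
    for turn in past:
        parts += [turn.user, turn.state, turn.response]
    parts.append(user)
    return " ".join(parts)


def markov_context(past: List[Turn], user: str) -> str:
    """Markov-style conditioning (ASSUMED form, not taken from the abstract):
    only the previous dialog state and previous response are kept,
    followed by the current user utterance."""
    parts: List[str] = []
    if past:
        parts += [past[-1].state, past[-1].response]
    parts.append(user)
    return " ".join(parts)


# Made-up example dialog, used only to compare input lengths.
past = [
    Turn("i need a cheap hotel",
         "[hotel] price cheap",
         "what area do you prefer ?"),
    Turn("the north please",
         "[hotel] price cheap area north",
         "i found 2 options . want to book ?"),
]
user = "yes , for 2 nights"

full = full_history_context(past, user)
markov = markov_context(past, user)

# Whitespace tokens stand in for subword tokens here.
print("full history length:", len(full.split()))
print("markov length:", len(markov.split()))
```

Under this assumption, the Markov input stays roughly constant in length as the dialog proceeds, because the dialog state summarizes earlier turns, while the full-history input keeps growing. That is the source of the memory and time savings the abstract reports.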

Comments:	Accepted by SLT 2022
Subjects:	Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2204.06452 [cs.CL]
	(or arXiv:2204.06452v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2204.06452

Submission history

From: Zhijian Ou
[v1] Wed, 13 Apr 2022 15:21:34 UTC (268 KB)
[v2] Fri, 14 Oct 2022 01:09:08 UTC (696 KB)
