DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation

Wang, Hanqing; Liang, Wei; Van Gool, Luc; Wang, Wenguan

arXiv:2308.07498 (cs)

[Submitted on 14 Aug 2023]

Abstract: VLN-CE is a recently released embodied task in which AI agents must navigate a freely traversable environment to reach a distant target location, given language instructions. It poses great challenges due to the huge space of possible strategies. Driven by the belief that the ability to anticipate the consequences of future actions is crucial for the emergence of intelligent and interpretable planning behavior, we propose DREAMWALKER, a world-model-based VLN-CE agent. The world model is built to summarize the visual, topological, and dynamic properties of the complicated continuous environment into a discrete, structured, and compact representation. DREAMWALKER can simulate and evaluate possible plans entirely in this internal abstract world before executing costly actions. In contrast to existing model-free VLN-CE agents, which simply make greedy decisions in the real world and are therefore prone to shortsighted behavior, DREAMWALKER can plan strategically through a large number of "mental experiments." Moreover, the imagined future scenarios reflect the agent's intention, making its decision-making process more transparent. Extensive experiments and ablation studies on the VLN-CE dataset confirm the effectiveness of the proposed approach and outline fruitful directions for future work.
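
The contrast the abstract draws between greedy decisions and planning inside an internal model can be illustrated with a toy sketch. This is not the DREAMWALKER planner, whose details the abstract does not give. The graph, the per-node scores, and the names `imagine`, `plan`, and `greedy_step` are all hypothetical, and exhaustive enumeration over a tiny hand-written graph stands in for whatever search the real agent uses.

```python
# Toy illustration only: a hand-written discrete graph stands in for the
# "discrete, structured, and compact" world representation. All names and
# numbers here are hypothetical and are not taken from the paper.
GRAPH = {
    "start": ["hall", "closet"],
    "hall": ["start", "kitchen"],
    "closet": ["start"],
    "kitchen": ["hall"],
}
# Hypothetical score of how well each node matches the instruction.
SCORE = {"start": 0.0, "hall": 0.2, "closet": 0.5, "kitchen": 1.0}


def greedy_step(graph, node, score):
    """Model-free baseline: move to the best-scoring immediate neighbour."""
    return max(graph[node], key=lambda n: score[n])


def imagine(graph, start, horizon):
    """Enumerate all loop-free paths of up to `horizon` moves from `start`."""
    paths = [[start]]
    frontier = [[start]]
    for _ in range(horizon):
        frontier = [
            path + [nxt]
            for path in frontier
            for nxt in graph[path[-1]]
            if nxt not in path
        ]
        paths.extend(frontier)
    return paths


def plan(graph, start, score, horizon=2):
    """Evaluate every imagined path and return the one ending at the best node."""
    candidates = [p for p in imagine(graph, start, horizon) if len(p) > 1]
    return max(candidates, key=lambda p: score[p[-1]])


greedy_choice = greedy_step(GRAPH, "start", SCORE)
planned_path = plan(GRAPH, "start", SCORE, horizon=2)
```

In this toy graph the greedy step moves to `closet`, a dead end with the better immediate score, while the two-step lookahead returns the path through `hall` to `kitchen`. No real action is taken until the imagined paths have been compared, and the chosen path doubles as a readable record of what the agent intends to do.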

Comments:	Accepted at ICCV 2023; Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2308.07498 [cs.CV]
	(or arXiv:2308.07498v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.07498

Submission history

From: Wenguan Wang
[v1] Mon, 14 Aug 2023 23:45:01 UTC (15,436 KB)
