
fix(cli): add scrolling, Ctrl+C pause/resume, and enhanced model detection (#1887) #2593


Open
wants to merge 6 commits into
base: main



@MagellaX MagellaX commented Aug 3, 2025

PR Description

Addresses CLI issues in #1887 and feedback from unmerged #1891: fixes TUI scrolling conflicts, makes Ctrl+C handling non-destructive, and corrects model parameter errors.

  • Scrolling: Set auto_scroll=False on RichLog in tasks panel; conditional scroll_end() in update_tasks_panel() only when agent.running == True, enabling manual history navigation during execution.

  • Pause/Resume: Remapped Ctrl+C to pause_agent (calls agent.pause() if running, else quits); added Enter binding for resume_agent (handles async/sync agent.resume()); second Ctrl+C during pause invokes action_quit(). Updated on_key() to route Enter for resume when input unfocused and agent.state.paused == True.

  • Model Support: Prioritized API key detection (env > config) for OpenAI/Anthropic/Google/DeepSeek/Groq; load preferences from ~/.config/browseruse/config.json; fixed model vs model_name kwargs in Chat* constructors.
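
The "handles async/sync agent.resume()" part of the Pause/Resume item can be illustrated with a minimal sketch. It is not the code from this PR: `SyncAgent`, `AsyncAgent`, and the standalone `resume_agent` coroutine are hypothetical stand-ins, and the only assumption is that `resume()` may return either a plain value or an awaitable.

```python
import asyncio
import inspect


class SyncAgent:
    """Hypothetical agent whose resume() is a plain method."""

    def __init__(self):
        self.paused = True

    def resume(self):
        self.paused = False


class AsyncAgent:
    """Hypothetical agent whose resume() is a coroutine function."""

    def __init__(self):
        self.paused = True

    async def resume(self):
        await asyncio.sleep(0)  # stands in for real asynchronous work
        self.paused = False


async def resume_agent(agent):
    """Call agent.resume() and await the result only if it is awaitable."""
    result = agent.resume()
    if inspect.isawaitable(result):
        await result
```

Checking the return value with `inspect.isawaitable` lets one handler serve both kinds of agent without knowing in advance how `resume()` is defined.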

Verified via test_cli_improvements.py (bindings, detection, config loading) and manual runs. Passes ruff lint/format.

Related: #1887, #1891.


Summary by cubic

Improved the CLI by adding manual scrolling in the tasks panel, non-destructive Ctrl+C pause/resume controls, and better model/API key detection for multiple providers.

  • New Features
    • Manual scroll in the tasks panel with auto-scroll only when the agent is running.
    • Ctrl+C now pauses the agent (press again to quit); Enter resumes if paused.
    • Enhanced model selection and API key detection for OpenAI, Anthropic, Google, DeepSeek, and Groq.


cubic-dev-ai bot commented Aug 3, 2025

📚 Feature History

This PR modifies 1 file (+106 -22 lines).

🔗 Related Pull Requests

| PR | Title | Merged | Author |
|---|---|---|---|
| #2442 | Fix incorrect pip install prompt for CLI addon | Jul 14, 2025 | @fureigh |
| #2358 | hot-fix-Increase-timeouts & fix remove message | Jul 8, 2025 | @MagMueller |
| #2347 | Always use subprocess + CDP for local browser inst... | Jul 8, 2025 | @pirate |

📊 Summary

The command-line interface (CLI) agent control and model detection logic in browser_use/cli.py has undergone several key changes since early 2024, particularly around user interaction (scrolling, pause/resume), API key/model selection, and configuration handling.

MagellaX’s PR #1887 (May 2025) originally introduced a richer TUI with task panel scrolling and attempted to unify model selection, but users reported issues with scrolling conflicts and Ctrl+C causing abrupt exits. This led to feedback and further iteration in unmerged PR #1891 (June 2025), where the team debated how to make agent pausing/resuming and manual scroll coexist smoothly.

In PR #2358 (Jul 2025), MagMueller increased LLM call timeouts and improved message management, which indirectly affected the CLI’s responsiveness and error handling during agent operations. This change made it clearer that the CLI needed more graceful interruption and recovery mechanisms, especially when users pressed Ctrl+C during long-running tasks.

The model and API key detection logic has evolved from a static, config-based approach to a more dynamic system that prioritizes environment variables. Earlier versions only checked config files; as of late July 2025, the CLI checks environment variables first, then config files, and finally the user config at ~/.config/browseruse/config.json, as seen in the current diff and recent commits.
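
The environment-first lookup described above might look roughly like the following sketch. The environment variable names, the `api_keys` config layout, and the `load_config` and `detect_provider` helpers are all assumptions made for illustration; the PR text does not spell them out.

```python
import json
from pathlib import Path

# Assumed environment variable names, one per provider named in the PR.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "groq": "GROQ_API_KEY",
}


def load_config(path):
    """Read a JSON config file, returning {} if it is missing or invalid."""
    try:
        data = json.loads(Path(path).read_text())
    except (OSError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def detect_provider(env, config):
    """Return (provider, source), preferring environment keys over config keys."""
    for provider, var in PROVIDER_ENV_VARS.items():
        if env.get(var):
            return provider, "env"
    # Hypothetical config layout: {"api_keys": {"openai": "..."}}
    config_keys = config.get("api_keys", {})
    for provider in PROVIDER_ENV_VARS:
        if config_keys.get(provider):
            return provider, "config"
    return None, None
```

In use, the caller would pass `os.environ` and the result of `load_config(Path.home() / ".config/browseruse/config.json")`. Passing both mappings in explicitly keeps the function easy to test.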

Manual scrolling in the tasks panel was previously tied to auto-scroll, which frustrated users who wanted to review history while the agent was running; PR #2593 (Aug 2025) decouples this by setting auto_scroll=False and only triggering scroll-to-end when the agent is actively running. This addresses longstanding usability complaints and aligns with feedback from #1887 and #1891.
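
A minimal sketch of the conditional scroll-to-end behavior, written without any TUI dependency. `FakeLog` is a hypothetical stand-in for the log widget and `update_tasks_panel` here is a simplified free function, not the method from browser_use/cli.py.

```python
class FakeLog:
    """Stand-in for a log widget; records writes and scroll-to-end requests."""

    def __init__(self):
        self.lines = []
        self.scroll_end_calls = 0

    def write(self, line):
        self.lines.append(line)

    def scroll_end(self):
        self.scroll_end_calls += 1


def update_tasks_panel(log, new_lines, agent_running):
    """Append new lines, then jump to the end only while the agent is running."""
    for line in new_lines:
        log.write(line)
    if agent_running:
        log.scroll_end()
```

With automatic scrolling disabled on the widget, the explicit `scroll_end()` call becomes the only thing that moves the view, so the scroll position is left alone whenever the agent is not running.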

Ctrl+C handling has shifted from immediate application exit to a two-stage pause/quit flow, with Enter now mapped to resume the agent. This was a direct response to user confusion and accidental session loss, as discussed in #1887 and #1891. Pressing Ctrl+C once pauses the agent, a second press quits, and Enter resumes, improving both safety and clarity.
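
The two-stage flow can be modeled as a small state machine. This is an illustrative sketch only: the `Session` dataclass and the `on_ctrl_c` and `on_enter` functions are hypothetical names, not the key bindings or actions defined in the PR.

```python
from dataclasses import dataclass


@dataclass
class Session:
    running: bool = False
    paused: bool = False
    quit_requested: bool = False


def on_ctrl_c(session):
    """First Ctrl+C pauses a running agent; otherwise Ctrl+C requests quit."""
    if session.running and not session.paused:
        session.paused = True
        return "pause"
    session.quit_requested = True
    return "quit"


def on_enter(session):
    """Enter resumes the agent only if it is currently paused."""
    if session.paused:
        session.paused = False
        return "resume"
    return "ignored"
```

For example, with a running agent the sequence Ctrl+C, Enter, Ctrl+C, Ctrl+C yields pause, resume, pause, quit, so a single accidental Ctrl+C never ends the session.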

This history is relevant now because PR #2593 consolidates and finalizes these iterative changes, resolving prior usability issues and implementing the pause/resume and scrolling behaviors that were debated and prototyped across multiple earlier PRs.



@cubic-dev-ai cubic-dev-ai bot left a comment


cubic analysis

No issues found across 1 file. Review in cubic
