[lldb-dap] Add new optional argument time-to-live when using --connection (#156803)
Conversation
@llvm/pr-subscribers-lldb

Author: Roy Shi (royitaqi)

Changes

Usage

Benefits

Automatic release of resources when lldb-dap is no longer being used (e.g. release memory used by module cache).

Test

TBD. I will find a test file to add tests.

Full diff: https://github.com/llvm/llvm-project/pull/156803.diff

2 Files Affected:
diff --git a/lldb/tools/lldb-dap/Options.td b/lldb/tools/lldb-dap/Options.td
index 867753e9294a6..754b8c7d03568 100644
--- a/lldb/tools/lldb-dap/Options.td
+++ b/lldb/tools/lldb-dap/Options.td
@@ -61,3 +61,10 @@ def pre_init_command: S<"pre-init-command">,
def: Separate<["-"], "c">,
Alias<pre_init_command>,
HelpText<"Alias for --pre-init-command">;
+
+def time_to_live: S<"time-to-live">,
+ MetaVarName<"<ttl>">,
+ HelpText<"When using --connection, the number of milliseconds to wait "
+ "for new connections at the beginning and after all clients have "
+ "disconnected. Not specifying this argument or specifying "
+ "non-positive values will wait indefinitely.">;
diff --git a/lldb/tools/lldb-dap/tool/lldb-dap.cpp b/lldb/tools/lldb-dap/tool/lldb-dap.cpp
index b74085f25f4e2..8b53e4d5cda83 100644
--- a/lldb/tools/lldb-dap/tool/lldb-dap.cpp
+++ b/lldb/tools/lldb-dap/tool/lldb-dap.cpp
@@ -258,7 +258,7 @@ validateConnection(llvm::StringRef conn) {
static llvm::Error
serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
Log *log, const ReplMode default_repl_mode,
- const std::vector<std::string> &pre_init_commands) {
+ const std::vector<std::string> &pre_init_commands, int ttl) {
Status status;
static std::unique_ptr<Socket> listener = Socket::Create(protocol, status);
if (status.Fail()) {
@@ -283,6 +283,21 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
g_loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
});
+ static MainLoopBase::TimePoint ttl_time_point;
+ static std::mutex ttl_mutex;
+ if (ttl > 0) {
+ std::scoped_lock<std::mutex> lock(ttl_mutex);
+ MainLoopBase::TimePoint future =
+ std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl);
+ ttl_time_point = future;
+ g_loop.AddCallback(
+ [future](MainLoopBase &loop) {
+ if (ttl_time_point == future) {
+ loop.RequestTermination();
+ }
+ },
+ future);
+ }
std::condition_variable dap_sessions_condition;
std::mutex dap_sessions_mutex;
std::map<MainLoop *, DAP *> dap_sessions;
@@ -291,6 +306,12 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
&dap_sessions_mutex, &dap_sessions,
&clientCount](
std::unique_ptr<Socket> sock) {
+ if (ttl > 0) {
+ // Reset the keep alive timer, because we won't be killing the server
+ // while this connection is being served.
+ std::scoped_lock<std::mutex> lock(ttl_mutex);
+ ttl_time_point = MainLoopBase::TimePoint();
+ }
std::string client_name = llvm::formatv("client_{0}", clientCount++).str();
DAP_LOG(log, "({0}) client connected", client_name);
@@ -327,6 +348,23 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
std::unique_lock<std::mutex> lock(dap_sessions_mutex);
dap_sessions.erase(&loop);
std::notify_all_at_thread_exit(dap_sessions_condition, std::move(lock));
+
+ if (ttl > 0) {
+ // Start the countdown to kill the server at the end of each connection.
+ std::scoped_lock<std::mutex> lock(ttl_mutex);
+ MainLoopBase::TimePoint future =
+ std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl);
+      // We don't need to take the max of `ttl_time_point` and `future`,
+      // because `future` must be the latest.
+ ttl_time_point = future;
+ g_loop.AddCallback(
+ [future](MainLoopBase &loop) {
+ if (ttl_time_point == future) {
+ loop.RequestTermination();
+ }
+ },
+ future);
+ }
});
client.detach();
});
@@ -509,6 +547,17 @@ int main(int argc, char *argv[]) {
}
if (!connection.empty()) {
+ int ttl = 0;
+ llvm::opt::Arg *time_to_live = input_args.getLastArg(OPT_time_to_live);
+ if (time_to_live) {
+ llvm::StringRef time_to_live_value = time_to_live->getValue();
+ if (time_to_live_value.getAsInteger(10, ttl)) {
+ llvm::errs() << "'" << time_to_live_value
+ << "' is not a valid time-to-live value\n";
+ return EXIT_FAILURE;
+ }
+ }
+
auto maybeProtoclAndName = validateConnection(connection);
if (auto Err = maybeProtoclAndName.takeError()) {
llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(),
@@ -520,7 +569,7 @@ int main(int argc, char *argv[]) {
std::string name;
std::tie(protocol, name) = *maybeProtoclAndName;
if (auto Err = serveConnection(protocol, name, log.get(), default_repl_mode,
- pre_init_commands)) {
+ pre_init_commands, ttl)) {
llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(),
"Connection failed: ");
return EXIT_FAILURE;
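A note on how the change works, for readers skimming the diff: every time the TTL is (re)armed, a shared time point is overwritten, and the scheduled callback terminates the server only if the deadline it captured is still the current one. Below is a minimal, self-contained sketch of that pattern, not code from the patch; all names are made up, and plain `std::thread`/`std::puts` stand in for lldb's `MainLoop` scheduling and `RequestTermination()`.

```cpp
// Minimal, self-contained sketch of the deadline-token pattern used by the
// patch (not code from the patch). All names here are made up; std::thread and
// std::puts stand in for lldb's MainLoop scheduling and RequestTermination().
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>

using TimePoint = std::chrono::steady_clock::time_point;

static std::mutex g_ttl_mutex;
static TimePoint g_ttl_deadline; // TimePoint() means "no deadline armed".

// Arm (or re-arm) the TTL deadline `ttl_ms` milliseconds from now.
static void ArmDeadline(int ttl_ms) {
  TimePoint deadline =
      std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl_ms);
  {
    std::scoped_lock<std::mutex> lock(g_ttl_mutex);
    g_ttl_deadline = deadline;
  }
  // Stand-in for g_loop.AddCallback(..., deadline): wake at the deadline and
  // act only if this timer is still the active one.
  std::thread([deadline] {
    std::this_thread::sleep_until(deadline);
    std::scoped_lock<std::mutex> lock(g_ttl_mutex);
    if (g_ttl_deadline == deadline)
      std::puts("deadline still current: would request termination");
    else
      std::puts("deadline cleared or superseded: do nothing");
  }).detach();
}

// What the patch does when a client connects: clear the deadline entirely.
static void ClearDeadline() {
  std::scoped_lock<std::mutex> lock(g_ttl_mutex);
  g_ttl_deadline = TimePoint();
}

int main() {
  ArmDeadline(100); // initial TTL before any client connects
  ClearDeadline();  // a client connects, so the server must stay alive
  ArmDeadline(100); // the last client disconnects, so re-arm the TTL
  std::this_thread::sleep_for(std::chrono::milliseconds(300));
  return 0;
}
```

The same idea appears twice in the diff: once for the initial wait before any client connects, and once after each client disconnects.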
lldb/tools/lldb-dap/Options.td (Outdated)

  HelpText<"When using --connection, the number of milliseconds to wait "
           "for new connections at the beginning and after all clients have "
           "disconnected. Not specifying this argument or specifying "
You have to be more descriptive here. You need to mention what happens when the timeout is hit, both at the "beginning" and after clients disconnect. If there's any reset of the timer, you need to mention that here as well.
You should also elaborate more in a new section in the documentation of lldb-dap, which is in lldb/tools/lldb-dap/README.md
Can you talk a bit about what's motivating this change? I'm a little worried about the proliferation of new options and I'm especially wary about the ones that need to be specified on the command line instead of over the protocol. Every new option extends the matrix of things we need to support.
@JDevlieghere Thanks for the good questions. I appreciate all of them. Below are my thoughts for discussion.
Main motivation is memory pressure. Other ways to counter memory pressure are either to monitor from the lldb-dap VS Code extension and send SIGINT to the lldb-dap process to stop it (drawback: IIUC this will terminate existing connections), or to use "memory pressure detected" from within the lldb-dap process (drawback: the timing is tricky). A side product is that the periodic restart will also clean up bad states that may happen.
IIUC, there are two aspects to it: lifetime, and where to live. The following are my thoughts, but I have no strong opinion.

Lifetime of the option: I feel this option is better tied to the lifetime of the lldb-dap process, because it should be consistent across different connections. Or maybe I'm missing a scenario where a per-connection or per-target value works better.

I think the above decides "where does the option live". If we decide that the option is better for the whole process, then I think we need both a command line argument and a VS Code setting. If it's going to be per connection, then we need a VS Code setting.

EDIT: Did you mean that we can implement this in the lldb-dap extension instead of here in the lldb-dap binary? We have use cases where we start the lldb-dap binary either manually or through other scripts, so it's good to have this functionality supported in the lldb-dap binary (and we can have a setting in the lldb-dap extension which passes down into this command line option).
Updated the patch. Now it's seconds.
Yes, definitely (see "I will find a test file to add tests." in the PR description). Yeah, probably set the TTL to 1 second and wait for it to terminate correctly within 3 seconds.
@walter-erquinigo and @ashgti: Thank you for your review. I think most of your comments have been addressed in c78a38f and 13b1358. Feel free to reopen the comments if you want me to address them in a different way.

--

@walter-erquinigo: I will first finish the discussion with @JDevlieghere, and if we are aligned on adding a VS Code setting for the lldb-dap extension, I will implement that. At that time, I will also add to the lldb-dap documentation. SG? BTW, I just checked, there is no section for VS Code settings in that doc. Which section did you want me to add it to?
We have a memory monitor in DAP that should take care of that. Unfortunately the metrics are pretty different across platforms so making this configurable might be tricky. That said, it would be interesting to understand why that's not kicking in.
Fair enough. We could implement something where each connection specifies its own TTL but that feels very overengineered. Given that it's tied to the server mode, I can see the argument for a command line option.
No, but I do think this should be configurable from the extension, like how folks can pick server mode there. TL;DR You convinced me that this warrants being a command line option. Minor feedback:
@JDevlieghere: Thank you for your reply. I appreciate it a lot. I will:
-- (minor clarification:)
I wonder if you have any preference about whether this should be in this same patch or be a separate follow-up. I have no strong opinion (it shouldn't be too much work), so I will just go with whichever you prefer.
connection-timeout seems to be a better name. I'll wait for the current discussions to finish before reviewing again.
@@ -327,6 +366,11 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
    std::unique_lock<std::mutex> lock(dap_sessions_mutex);
    dap_sessions.erase(&loop);
    std::notify_all_at_thread_exit(dap_sessions_condition, std::move(lock));

    // Start the countdown to kill the server at the end of each connection.
Why are we killing the server at the end of each connection? There can still be other live connections. I do not think this is the behavior we want. You want to kill the server only after the last connection.
@jeffreytan81: Thanks for the review. I think you misread the patch. The connection timeout (not the termination of the server) is started at the end of each connection. If a new connection is established before the timeout finishes, it will reset the timeout in `ResetTimeToLive()`, so only the timeout from the last connection will activate the actual termination of the server. LMK if you think that's not what the code does.
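(For context, `ResetTimeToLive()` is a helper from a later revision of the patch that is not in the diff above. Below is only a guess at its shape, factored out of the inline reset shown in the diff; the actual helper may differ.)

```cpp
// Hypothetical sketch of ResetTimeToLive(): clearing the shared time point
// invalidates any pending timeout callback, because each callback compares its
// captured deadline against this variable before requesting termination.
static void ResetTimeToLive() {
  std::scoped_lock<std::mutex> lock(ttl_mutex);
  ttl_time_point = MainLoopBase::TimePoint();
}
```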
I see. `TrackTimeToLive` checks whether the global variable has been updated before calling `RequestTermination()`. This works, but I find it inefficient that every connection creates a callback. For example, if you have 10 connections in parallel, 10 `loop.AddCallback()` calls are made just to check `should_request_terimation`, which is wasteful. I would suggest only scheduling the callback for the last connection alive, so that only one `loop.AddCallback()` is called to check during the timeout (it internally still has to check for a new connection resetting it, though).
Also, there seems to be a race condition here:
- When the last connection times out, `should_request_terimation` is set to true.
- Before `loop.RequestTermination()` is called, a new client starts debugging: it is `Accept()`ed, calls `ResetTimeToLive()`, and moves forward.
- The previous last connection then calls `loop.RequestTermination()` and shuts down the server.
- The newly accepted client gets killed due to the server shutdown.
@jeffreytan81: Thanks for the suggestions / good points.

> I would suggest only schedule callback for the last connections alive

Got it. I think the optimization makes sense (as you said, the existing logic is still needed). I will try to use `dap_sessions` to detect "I was the last connection alive".

--

> there seems to be a race condition here

I understand what you are saying. See below. I can be wrong (I'm new to `MainLoopBase` and `MainLoopPosix`).

TL;DR: IIUC, there is no such race condition. This is guaranteed by `MainLoopPosix`, because:
- It handles all callbacks and read objects in sequence (see `MainLoopPosix::Run()`).
- It checks `m_terminate_request` before processing each read object (see `MainLoopPosix::RunImpl::ProcessReadEvents()`).

Details: one of the following will happen.

Case 1: The last connection times out first. In this case, the timeout callback invokes `loop.RequestTermination()` and sets `m_terminate_request`. Then the socket detects a new connection, but the callback given to `Accept()` is never invoked, because `m_terminate_request` is already set and the object read from the socket is discarded.

Case 2: The new client connects first. In this case, the callback given to `Accept()` is invoked. It resets the global variable before spinning the rest of the initialization off into a separate thread. Then the "last" connection's timeout callback is invoked. It sees that the global variable has been reset, so it won't request termination.

Kindly LMK if I missed anything.

With that said, I will move the `loop.RequestTermination()` call into the scoped lock, because it's a cheap operation anyway.
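A rough sketch of that last point, i.e. what moving the call into the scoped lock might look like (names follow the diff above; the final revision may differ):

```cpp
// Check the deadline token and request termination while holding ttl_mutex,
// so the check and the termination request cannot interleave with a reset.
g_loop.AddCallback(
    [future](MainLoopBase &loop) {
      std::scoped_lock<std::mutex> lock(ttl_mutex);
      if (ttl_time_point == future)
        loop.RequestTermination();
    },
    future);
```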