Skip to content

Conversation

royitaqi
Copy link
Contributor

@royitaqi royitaqi commented Sep 4, 2025

Usage

--time-to-live <ttl>    When using --connection, the number of milliseconds to wait for new connections
                        at the beginning and after all clients have disconnected. Not specifying this
                        argument or specifying non-positive values will wait indefinitely.

Benefits

Automatic release of resources when lldb-dap is no longer being used (e.g. release memory used by module cache).

Test

TBD. I will find a test file to add tests.

@llvmbot
Copy link
Member

llvmbot commented Sep 4, 2025

@llvm/pr-subscribers-lldb

Author: Roy Shi (royitaqi)

Changes

Usage

--time-to-live &lt;ttl&gt;    When using --connection, the number of milliseconds to wait for new connections
                        at the beginning and after all clients have disconnected. Not specifying this
                        argument or specifying non-positive values will wait indefinitely.

Benefits

Automatic release of resources when lldb-dap is no longer being used (e.g. release memory used by module cache).

Test

TBD. I will find a test file to add tests.


Full diff: https://github.com/llvm/llvm-project/pull/156803.diff

2 Files Affected:

  • (modified) lldb/tools/lldb-dap/Options.td (+7)
  • (modified) lldb/tools/lldb-dap/tool/lldb-dap.cpp (+51-2)
diff --git a/lldb/tools/lldb-dap/Options.td b/lldb/tools/lldb-dap/Options.td
index 867753e9294a6..754b8c7d03568 100644
--- a/lldb/tools/lldb-dap/Options.td
+++ b/lldb/tools/lldb-dap/Options.td
@@ -61,3 +61,10 @@ def pre_init_command: S<"pre-init-command">,
 def: Separate<["-"], "c">,
   Alias<pre_init_command>,
   HelpText<"Alias for --pre-init-command">;
+
+def time_to_live: S<"time-to-live">,
+      MetaVarName<"<ttl>">,
+      HelpText<"When using --connection, the number of milliseconds to wait "
+        "for new connections at the beginning and after all clients have "
+        "disconnected. Not specifying this argument or specifying "
+        "non-positive values will wait indefinitely.">;
diff --git a/lldb/tools/lldb-dap/tool/lldb-dap.cpp b/lldb/tools/lldb-dap/tool/lldb-dap.cpp
index b74085f25f4e2..8b53e4d5cda83 100644
--- a/lldb/tools/lldb-dap/tool/lldb-dap.cpp
+++ b/lldb/tools/lldb-dap/tool/lldb-dap.cpp
@@ -258,7 +258,7 @@ validateConnection(llvm::StringRef conn) {
 static llvm::Error
 serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
                 Log *log, const ReplMode default_repl_mode,
-                const std::vector<std::string> &pre_init_commands) {
+                const std::vector<std::string> &pre_init_commands, int ttl) {
   Status status;
   static std::unique_ptr<Socket> listener = Socket::Create(protocol, status);
   if (status.Fail()) {
@@ -283,6 +283,21 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
     g_loop.AddPendingCallback(
         [](MainLoopBase &loop) { loop.RequestTermination(); });
   });
+  static MainLoopBase::TimePoint ttl_time_point;
+  static std::mutex ttl_mutex;
+  if (ttl > 0) {
+    std::scoped_lock<std::mutex> lock(ttl_mutex);
+    MainLoopBase::TimePoint future =
+        std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl);
+    ttl_time_point = future;
+    g_loop.AddCallback(
+        [future](MainLoopBase &loop) {
+          if (ttl_time_point == future) {
+            loop.RequestTermination();
+          }
+        },
+        future);
+  }
   std::condition_variable dap_sessions_condition;
   std::mutex dap_sessions_mutex;
   std::map<MainLoop *, DAP *> dap_sessions;
@@ -291,6 +306,12 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
                                           &dap_sessions_mutex, &dap_sessions,
                                           &clientCount](
                                              std::unique_ptr<Socket> sock) {
+    if (ttl > 0) {
+      // Reset the keep alive timer, because we won't be killing the server
+      // while this connection is being served.
+      std::scoped_lock<std::mutex> lock(ttl_mutex);
+      ttl_time_point = MainLoopBase::TimePoint();
+    }
     std::string client_name = llvm::formatv("client_{0}", clientCount++).str();
     DAP_LOG(log, "({0}) client connected", client_name);
 
@@ -327,6 +348,23 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
       std::unique_lock<std::mutex> lock(dap_sessions_mutex);
       dap_sessions.erase(&loop);
       std::notify_all_at_thread_exit(dap_sessions_condition, std::move(lock));
+
+      if (ttl > 0) {
+        // Start the countdown to kill the server at the end of each connection.
+        std::scoped_lock<std::mutex> lock(ttl_mutex);
+        MainLoopBase::TimePoint future =
+            std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl);
+        // We don't need to take the max of `keep_alive_up_to` and `future`,
+        // because `future` must be the latest.
+        ttl_time_point = future;
+        g_loop.AddCallback(
+            [future](MainLoopBase &loop) {
+              if (ttl_time_point == future) {
+                loop.RequestTermination();
+              }
+            },
+            future);
+      }
     });
     client.detach();
   });
@@ -509,6 +547,17 @@ int main(int argc, char *argv[]) {
   }
 
   if (!connection.empty()) {
+    int ttl = 0;
+    llvm::opt::Arg *time_to_live = input_args.getLastArg(OPT_time_to_live);
+    if (time_to_live) {
+      llvm::StringRef time_to_live_value = time_to_live->getValue();
+      if (time_to_live_value.getAsInteger(10, ttl)) {
+        llvm::errs() << "'" << time_to_live_value
+                      << "' is not a valid time-to-live value\n";
+        return EXIT_FAILURE;
+      }
+    }
+
     auto maybeProtoclAndName = validateConnection(connection);
     if (auto Err = maybeProtoclAndName.takeError()) {
       llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(),
@@ -520,7 +569,7 @@ int main(int argc, char *argv[]) {
     std::string name;
     std::tie(protocol, name) = *maybeProtoclAndName;
     if (auto Err = serveConnection(protocol, name, log.get(), default_repl_mode,
-                                   pre_init_commands)) {
+                                   pre_init_commands, ttl)) {
       llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(),
                                   "Connection failed: ");
       return EXIT_FAILURE;

Comment on lines 67 to 69
HelpText<"When using --connection, the number of milliseconds to wait "
"for new connections at the beginning and after all clients have "
"disconnected. Not specifying this argument or specifying "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have to be more descriptive here. You need to mention what happens when the time out is hit both at the "beginning" and after clients disconnect. If there's any reset of the timer, you need to mention that here as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also elaborate more in a new section in the documentation of lldb-dap, which is in lldb/tools/lldb-dap/README.md

@JDevlieghere
Copy link
Member

Can you talk a bit about what's motivating this change? I'm a little worried about the proliferation of new options and I'm especially wary about the ones that need to be specified on the command line instead of over the protocol. Every new option extends the matrix of things we need to support.

  • Can this be a UI setting instead of a command line option?
  • Milliseconds seems like the wrong granularity for this . Seconds or minutes seems like a more meaningful order of magnitude.
  • This definitely needs a test. Unfortunately, tests that rely on timing tend to be less reliable, but here there should be a lower bound (e.g. exceeds X seconds) rather than an upper bound, so hopefully that should be fine.

@JDevlieghere JDevlieghere requested a review from ashgti September 4, 2025 16:28
@royitaqi
Copy link
Contributor Author

royitaqi commented Sep 4, 2025

@JDevlieghere Thanks for the good questions. I appreciate all of them. Below are my thoughts for discussion.

proliferation of new options
motivation

Main motivation is memory pressure. Other ways to counter memory pressure is either to monitor from lldb-dap VS Code extension and SIGINT for lldb-dap process to stop (drawback: IIUC this will terminate existing connections), OR use "memory pressure detected" from within lldb-dap process (drawback: the timing is tricky).

Side product is that the periodical restart will also clean up bad states that may happen.

command line option vs. over the protocol vs. VS Code settings

IIUC, there are two aspects to it: lifetime, and where to live. The following is my thoughts but no strong opinion.

Lifetime of the option: I feel this option is better for the lifetime of the lldb-dap process, because it should be consistent across different connections. Or maybe I'm missing a scenario where per-connection or per-target value works better.

I think the above decides "where does the option live". If we decide that the option is better for the whole process, then I think we need both a command line argument and a VS Code setting. If it's going to be per connection, then we need a VS Code setting.

EDIT: Did you mean that we can implement this in the lldb-dap extension instead of here in lldb-dap binary? We have use cases where we start lldb-dap binary either manually or through other scripts, so it's good to have this functionality supported into lldb-dap binary (and we can have a setting in the lldb-dap extension which passes down into this command line option).

milliseconds

Yeah I thought seconds/minutes will be more intuitive, too.

FWIW, did a calculation, max int in milliseconds is about 24.8 days. I don't think anyone would want that long of a TTL, so the range seems big enough. But yeah, seconds/minutes would probably make more sense (who is gonna use <1s TTL, right?).

Updated the patch. Now it's seconds.

This definitely needs a test

Yes, definitely (see "I will find a test file to add tests." in PR description).

Yeah probably set TTL to be a 1 second and wait for it to terminate correctly within 3 seconds.

@royitaqi
Copy link
Contributor Author

royitaqi commented Sep 4, 2025

@walter-erquinigo and @ashgti: Thank you for your review. I think most of your comments have been addressed in c78a38f and 13b1358. Feel free to reopen the comments if you want me to address them in a different way.

--

@walter-erquinigo: I will first finish the discussion with @JDevlieghere , and if we are aligned on adding a VS Code setting for the lldb-dap extension, I will implement that. At that time, I will also add to the lldb-dap documentation. SG?

BTW, I just checked, there is no such section for VS Code settings in that doc. Which section did you want me to add into?

@JDevlieghere
Copy link
Member

Main motivation is memory pressure. Other ways to counter memory pressure is either to monitor from lldb-dap VS Code extension and SIGINT for lldb-dap process to stop (drawback: IIUC this will terminate existing connections), OR use "memory pressure detected" from within lldb-dap process (drawback: the timing is tricky).

We have a memory monitor in DAP that should take care of that. Unfortunately the metrics are pretty different across platforms so making this configurable might be tricky. That said, it would be interesting to understand why that's not kicking in.

command line option vs. over the protocol vs. VS Code settings

IIUC, there are two aspects to it: lifetime, and where to live. The following is my thoughts but no strong opinion.

Lifetime of the option: I feel this option is better for the lifetime of the lldb-dap process, because it should be consistent across different connections. Or maybe I'm missing a scenario where per-connection or per-target value works better.

I think the above decides "where does the option live". If we decide that the option is better for the whole process, then I think we need both a command line argument and a VS Code setting. If it's going to be per connection, then we need a VS Code setting.

Fair enough. We could implement something where each connection specifies its own TTL but that feels very overengineered. Given that it's tied to the server mode, I can see the argument for a command line option.

EDIT: Did you mean that we can implement this in the lldb-dap extension instead of here in lldb-dap binary? We have use cases where we start lldb-dap binary either manually or through other scripts, so it's good to have this functionality supported into lldb-dap binary (and we can have a setting in the lldb-dap extension which passes down into this command line option).

No, but I do think this should be configurable from the extension, like how folks can pick server mode there.

TL;DR You convinced me that this warrants being a command line option.

Minor feedback:

  • At the risk of entering bike-shedding territory, would something like "connection-timeout" be a better name for this option? When I hear TTL I associate it with data that expires (e.g. in the context of DNS and networking).
  • Could you catch the situation where this option is provided but connection is not and print a warning saying that essentially says this only works in server mode?

@royitaqi
Copy link
Contributor Author

royitaqi commented Sep 4, 2025

@JDevlieghere: Thank you for your reply. I appreciate it a lot.

I will:

  1. Learn about the memory monitor in DAP.
    • Context: It's not failing - We are just starting to experiment with the server mode. This connection timeout feature is a precaution before we can do large scale rollout. Some folks liked the concept of auto-cleanup of the lldb-dap process, just to have the peace of mind that the memory issue isn't caused by the debugger. Good to know about the memory monitor in DAP. That adds more confidence.
  2. Rename the option to "connection-timeout". I like this name a lot, esp. because the other option is "--connection", so this has a nice synergy in the names.
  3. Catch that case of --connection-timeout is given while --connection isn't. Print a warning.
  4. Add tests.

--

(minor clarification:)

I do think this should be configurable from the extension, like how folks can pick server mode there.

I wonder if you have any preference about whether this should be in this same patch, or should be a separate follow-up. I have no strong opinion (shouldn't be too much work), so I will just go with whichever you prefer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants