[lldb-dap] Add new optional argument time-to-live when using --connection (#156803)
Conversation
@llvm/pr-subscribers-lldb

Author: Roy Shi (royitaqi)

Changes

Usage

Benefits

Automatic release of resources when lldb-dap is no longer being used (e.g. release memory used by module cache).

Test

TBD. I will find a test file to add tests.

Full diff: https://github.com/llvm/llvm-project/pull/156803.diff

2 Files Affected:
diff --git a/lldb/tools/lldb-dap/Options.td b/lldb/tools/lldb-dap/Options.td
index 867753e9294a6..754b8c7d03568 100644
--- a/lldb/tools/lldb-dap/Options.td
+++ b/lldb/tools/lldb-dap/Options.td
@@ -61,3 +61,10 @@ def pre_init_command: S<"pre-init-command">,
def: Separate<["-"], "c">,
Alias<pre_init_command>,
HelpText<"Alias for --pre-init-command">;
+
+def time_to_live: S<"time-to-live">,
+ MetaVarName<"<ttl>">,
+ HelpText<"When using --connection, the number of milliseconds to wait "
+ "for new connections at the beginning and after all clients have "
+ "disconnected. Not specifying this argument or specifying "
+ "non-positive values will wait indefinitely.">;
diff --git a/lldb/tools/lldb-dap/tool/lldb-dap.cpp b/lldb/tools/lldb-dap/tool/lldb-dap.cpp
index b74085f25f4e2..8b53e4d5cda83 100644
--- a/lldb/tools/lldb-dap/tool/lldb-dap.cpp
+++ b/lldb/tools/lldb-dap/tool/lldb-dap.cpp
@@ -258,7 +258,7 @@ validateConnection(llvm::StringRef conn) {
static llvm::Error
serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
Log *log, const ReplMode default_repl_mode,
- const std::vector<std::string> &pre_init_commands) {
+ const std::vector<std::string> &pre_init_commands, int ttl) {
Status status;
static std::unique_ptr<Socket> listener = Socket::Create(protocol, status);
if (status.Fail()) {
@@ -283,6 +283,21 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
g_loop.AddPendingCallback(
[](MainLoopBase &loop) { loop.RequestTermination(); });
});
+ static MainLoopBase::TimePoint ttl_time_point;
+ static std::mutex ttl_mutex;
+ if (ttl > 0) {
+ std::scoped_lock<std::mutex> lock(ttl_mutex);
+ MainLoopBase::TimePoint future =
+ std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl);
+ ttl_time_point = future;
+ g_loop.AddCallback(
+ [future](MainLoopBase &loop) {
+ if (ttl_time_point == future) {
+ loop.RequestTermination();
+ }
+ },
+ future);
+ }
std::condition_variable dap_sessions_condition;
std::mutex dap_sessions_mutex;
std::map<MainLoop *, DAP *> dap_sessions;
@@ -291,6 +306,12 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
&dap_sessions_mutex, &dap_sessions,
&clientCount](
std::unique_ptr<Socket> sock) {
+ if (ttl > 0) {
+ // Reset the keep alive timer, because we won't be killing the server
+ // while this connection is being served.
+ std::scoped_lock<std::mutex> lock(ttl_mutex);
+ ttl_time_point = MainLoopBase::TimePoint();
+ }
std::string client_name = llvm::formatv("client_{0}", clientCount++).str();
DAP_LOG(log, "({0}) client connected", client_name);
@@ -327,6 +348,23 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
std::unique_lock<std::mutex> lock(dap_sessions_mutex);
dap_sessions.erase(&loop);
std::notify_all_at_thread_exit(dap_sessions_condition, std::move(lock));
+
+ if (ttl > 0) {
+ // Start the countdown to kill the server at the end of each connection.
+ std::scoped_lock<std::mutex> lock(ttl_mutex);
+ MainLoopBase::TimePoint future =
+ std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl);
+      // We don't need to take the max of `ttl_time_point` and `future`,
+      // because `future` must be the latest.
+ ttl_time_point = future;
+ g_loop.AddCallback(
+ [future](MainLoopBase &loop) {
+ if (ttl_time_point == future) {
+ loop.RequestTermination();
+ }
+ },
+ future);
+ }
});
client.detach();
});
@@ -509,6 +547,17 @@ int main(int argc, char *argv[]) {
}
if (!connection.empty()) {
+ int ttl = 0;
+ llvm::opt::Arg *time_to_live = input_args.getLastArg(OPT_time_to_live);
+ if (time_to_live) {
+ llvm::StringRef time_to_live_value = time_to_live->getValue();
+ if (time_to_live_value.getAsInteger(10, ttl)) {
+ llvm::errs() << "'" << time_to_live_value
+ << "' is not a valid time-to-live value\n";
+ return EXIT_FAILURE;
+ }
+ }
+
auto maybeProtoclAndName = validateConnection(connection);
if (auto Err = maybeProtoclAndName.takeError()) {
llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(),
@@ -520,7 +569,7 @@ int main(int argc, char *argv[]) {
std::string name;
std::tie(protocol, name) = *maybeProtoclAndName;
if (auto Err = serveConnection(protocol, name, log.get(), default_repl_mode,
- pre_init_commands)) {
+ pre_init_commands, ttl)) {
llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(),
"Connection failed: ");
return EXIT_FAILURE;
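A note on how the change works, for readers skimming the diff: every time the TTL is (re)armed, a shared time point is overwritten, and the scheduled callback terminates the server only if the deadline it captured is still the current one. Below is a minimal, self-contained sketch of that pattern, not code from the patch; all names are made up, and plain `std::thread`/`std::puts` stand in for lldb's `MainLoop` scheduling and `RequestTermination()`.

```cpp
// Minimal, self-contained sketch of the deadline-token pattern used by the
// patch (not code from the patch). All names here are made up; std::thread and
// std::puts stand in for lldb's MainLoop scheduling and RequestTermination().
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>

using TimePoint = std::chrono::steady_clock::time_point;

static std::mutex g_ttl_mutex;
static TimePoint g_ttl_deadline; // TimePoint() means "no deadline armed".

// Arm (or re-arm) the TTL deadline `ttl_ms` milliseconds from now.
static void ArmDeadline(int ttl_ms) {
  TimePoint deadline =
      std::chrono::steady_clock::now() + std::chrono::milliseconds(ttl_ms);
  {
    std::scoped_lock<std::mutex> lock(g_ttl_mutex);
    g_ttl_deadline = deadline;
  }
  // Stand-in for g_loop.AddCallback(..., deadline): wake at the deadline and
  // act only if this timer is still the active one.
  std::thread([deadline] {
    std::this_thread::sleep_until(deadline);
    std::scoped_lock<std::mutex> lock(g_ttl_mutex);
    if (g_ttl_deadline == deadline)
      std::puts("deadline still current: would request termination");
    else
      std::puts("deadline cleared or superseded: do nothing");
  }).detach();
}

// What the patch does when a client connects: clear the deadline entirely.
static void ClearDeadline() {
  std::scoped_lock<std::mutex> lock(g_ttl_mutex);
  g_ttl_deadline = TimePoint();
}

int main() {
  ArmDeadline(100); // initial TTL before any client connects
  ClearDeadline();  // a client connects, so the server must stay alive
  ArmDeadline(100); // the last client disconnects, so re-arm the TTL
  std::this_thread::sleep_for(std::chrono::milliseconds(300));
  return 0;
}
```

The same idea appears twice in the diff: once for the initial wait before any client connects, and once after each client disconnects.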
lldb/tools/lldb-dap/Options.td (Outdated)

  HelpText<"When using --connection, the number of milliseconds to wait "
           "for new connections at the beginning and after all clients have "
           "disconnected. Not specifying this argument or specifying "
You have to be more descriptive here. You need to mention what happens when the timeout is hit, both at the "beginning" and after clients disconnect. If there's any reset of the timer, you need to mention that here as well.
You should also elaborate more in a new section in the documentation of lldb-dap, which is in lldb/tools/lldb-dap/README.md
Can you talk a bit about what's motivating this change? I'm a little worried about the proliferation of new options and I'm especially wary about the ones that need to be specified on the command line instead of over the protocol. Every new option extends the matrix of things we need to support.
@JDevlieghere Thanks for the good questions. I appreciate all of them. Below are my thoughts for discussion.
Main motivation is memory pressure. Other ways to counter memory pressure are either to monitor from the lldb-dap VS Code extension and send SIGINT to the lldb-dap process to stop it (drawback: IIUC this will terminate existing connections), or to use "memory pressure detected" from within the lldb-dap process (drawback: the timing is tricky). A side product is that the periodic restart will also clean up bad states that may happen.
IIUC, there are two aspects to it: lifetime, and where to live. The following are my thoughts, but I have no strong opinion.

Lifetime of the option: I feel this option is better tied to the lifetime of the lldb-dap process, because it should be consistent across different connections. Or maybe I'm missing a scenario where a per-connection or per-target value works better.

I think the above decides "where does the option live". If we decide that the option is better for the whole process, then I think we need both a command line argument and a VS Code setting. If it's going to be per connection, then we need a VS Code setting.

EDIT: Did you mean that we can implement this in the lldb-dap extension instead of here in the lldb-dap binary? We have use cases where we start the lldb-dap binary either manually or through other scripts, so it's good to have this functionality supported in the lldb-dap binary (and we can have a setting in the lldb-dap extension which passes down into this command line option).
Updated the patch. Now it's seconds.
Yes, definitely (see "I will find a test file to add tests." in the PR description). Yeah, probably set the TTL to 1 second and wait for it to terminate correctly within 3 seconds.
@walter-erquinigo and @ashgti: Thank you for your review. I think most of your comments have been addressed in c78a38f and 13b1358. Feel free to reopen the comments if you want me to address them in a different way.

--

@walter-erquinigo: I will first finish the discussion with @JDevlieghere, and if we are aligned on adding a VS Code setting for the lldb-dap extension, I will implement that. At that time, I will also add to the lldb-dap documentation. SG? BTW, I just checked, there is no section for VS Code settings in that doc. Which section did you want me to add it to?
We have a memory monitor in DAP that should take care of that. Unfortunately the metrics are pretty different across platforms so making this configurable might be tricky. That said, it would be interesting to understand why that's not kicking in.
Fair enough. We could implement something where each connection specifies its own TTL but that feels very overengineered. Given that it's tied to the server mode, I can see the argument for a command line option.
No, but I do think this should be configurable from the extension, like how folks can pick server mode there. TL;DR You convinced me that this warrants being a command line option. Minor feedback:
@JDevlieghere: Thank you for your reply. I appreciate it a lot. I will:
-- (minor clarification:)
I wonder if you have any preference about whether this should be in this same patch or be a separate follow-up. I have no strong opinion (it shouldn't be too much work), so I will just go with whichever you prefer.
connection-timeout seems to be a better name. I'll wait for the current discussions to finish before reviewing again.
@@ -327,6 +366,11 @@ serveConnection(const Socket::SocketProtocol &protocol, const std::string &name,
    std::unique_lock<std::mutex> lock(dap_sessions_mutex);
    dap_sessions.erase(&loop);
    std::notify_all_at_thread_exit(dap_sessions_condition, std::move(lock));

    // Start the countdown to kill the server at the end of each connection.
Why are we killing the server at the end of each connection? There can still be other live connections. I do not think this is the behavior we want. You want to kill the server only after the last connection.
@jeffreytan81: Thanks for the review. I think you misread the patch. The connection timeout (not the termination of the server) is started at the end of each connection. If a new connection is established before the timeout finishes, it will reset the timeout in `ResetTimeToLive()`, so only the timeout from the last connection will activate the actual termination of the server. LMK if you think that's not what the code does.
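(For context, `ResetTimeToLive()` is a helper from a later revision of the patch that is not in the diff above. Below is only a guess at its shape, factored out of the inline reset shown in the diff; the actual helper may differ.)

```cpp
// Hypothetical sketch of ResetTimeToLive(): clearing the shared time point
// invalidates any pending timeout callback, because each callback compares its
// captured deadline against this variable before requesting termination.
static void ResetTimeToLive() {
  std::scoped_lock<std::mutex> lock(ttl_mutex);
  ttl_time_point = MainLoopBase::TimePoint();
}
```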
I see. `TrackTimeToLive` checks whether the global variable has been updated before calling `RequestTermination()`. This works, but I find it inefficient that every connection creates a callback. For example, if you have 10 connections in parallel, 10 `loop.AddCallback()` calls are made just to check `should_request_terimation`, which is wasteful. I would suggest only scheduling the callback for the last connection alive, so that only one `loop.AddCallback()` is called to check during the timeout (it internally still has to check for a new connection resetting it, though).
Also, there seems to be a race condition here:
- When the last connection times out, `should_request_terimation` is set to true.
- Before `loop.RequestTermination()` is called, a new client starts debugging: it is `Accept()`ed, calls `ResetTimeToLive()`, and moves forward.
- The previous last connection then calls `loop.RequestTermination()` and shuts down the server.
- The newly accepted client gets killed due to the server shutdown.
@jeffreytan81: Thanks for the suggestions / good points.

> I would suggest only schedule callback for the last connections alive

Got it. I think the optimization makes sense (as you said, the existing logic is still needed). I will try to use `dap_sessions` to detect "I was the last connection alive".

--

> there seems to be a race condition here

I understand what you are saying. See below. I can be wrong (I'm new to `MainLoopBase` and `MainLoopPosix`).

TL;DR: IIUC, there is no such race condition. This is guaranteed by `MainLoopPosix`, because:
- It handles all callbacks and read objects in sequence (see `MainLoopPosix::Run()`).
- It checks `m_terminate_request` before processing each read object (see `MainLoopPosix::RunImpl::ProcessReadEvents()`).

Details: one of the following will happen.

Case 1: The last connection times out first. In this case, the timeout callback invokes `loop.RequestTermination()` and sets `m_terminate_request`. Then the socket detects a new connection, but the callback given to `Accept()` is never invoked, because `m_terminate_request` is already set and the object read from the socket is discarded.

Case 2: The new client connects first. In this case, the callback given to `Accept()` is invoked. It resets the global variable before spinning the rest of the initialization off into a separate thread. Then the "last" connection's timeout callback is invoked. It sees that the global variable has been reset, so it won't request termination.

Kindly LMK if I missed anything.

With that said, I will move the `loop.RequestTermination()` call into the scoped lock, because it's a cheap operation anyway.
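A rough sketch of that last point, i.e. what moving the call into the scoped lock might look like (names follow the diff above; the final revision may differ):

```cpp
// Check the deadline token and request termination while holding ttl_mutex,
// so the check and the termination request cannot interleave with a reset.
g_loop.AddCallback(
    [future](MainLoopBase &loop) {
      std::scoped_lock<std::mutex> lock(ttl_mutex);
      if (ttl_time_point == future)
        loop.RequestTermination();
    },
    future);
```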