-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Unify process termination on POSIX & Windows (+ tests) #1044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
9f88ea7
to
a27fcef
Compare
caa5628
to
1d495ca
Compare
9e8d74b
to
3f6c472
Compare
cbaee2e
to
e8954dd
Compare
e8954dd
to
90f463e
Compare
The stdio cleanup was hanging indefinitely when processes ignored termination signals or took too long to exit. This caused the MCP client to freeze during shutdown, especially with servers that don't handle SIGTERM properly. The fix introduces a 2-second timeout for process termination. If a process doesn't exit gracefully within this window, it's forcefully killed. This ensures the client always completes cleanup in bounded time while still giving well-behaved servers a chance to exit cleanly. This resolves hanging issues reported when MCP servers ignore standard termination signals or perform lengthy cleanup operations. Also adds regression tests for #559. resolves #555 Co-authored-by: Cristian Pufu <cristian.pufu@uipath.com>
The MCP specification recommends closing stdin first to allow servers to exit gracefully before resorting to signals. This approach gives well-behaved servers the opportunity to detect stdin closure and perform clean shutdown without forceful termination. The shutdown sequence now follows a graceful escalation path: first closing stdin and waiting 2 seconds for voluntary exit, then sending SIGTERM if needed, and finally using SIGKILL as a last resort. This minimizes the risk of data loss or corruption while ensuring cleanup always completes. This unified approach works consistently across all platforms and improves compatibility with MCP servers that monitor stdin for lifecycle management. resolves #765 Co-authored-by: davenpi <davenport.ianc@gmail.com>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <jingxu8885@qq.com> Co-authored-by: Surya Prakash Susarla <susarla.surya.prakash.1998@gmail.com>
Refactor the MCP spec-compliant stdio shutdown sequence into a separate _shutdown_process function to document that this is an MCP specific shutdown sequence. Logic remains unchanged. Co-authored-by: davenpi <davenport.ianc@gmail.com>
90f463e
to
50559f7
Compare
The test now demonstrates that: 1. The cleanup issue affects all platforms, not just Windows 2. Closing stdin and waiting for graceful exit allows cleanup to run This confirms that PR #1044's approach of implementing the MCP spec's stdio shutdown sequence (close stdin first, then wait, then terminate) could address issue #1027. Github-Issue:#1027
The test now demonstrates that: 1. The cleanup issue affects all platforms, not just Windows 2. Closing stdin and waiting for graceful exit allows cleanup to run This confirms that PR #1044's approach of implementing the MCP spec's stdio shutdown sequence (close stdin first, then wait, then terminate) could address issue #1027. Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
This test shows that MCP server cleanup code in lifespan doesn't run when the process is terminated, but does run when stdin is closed first (as implemented in PR #1044). The test includes: - Demonstration of current broken behavior (cleanup doesn't run) - Verification that stdin closure allows graceful shutdown - Windows-specific ResourceWarning handling - Detailed documentation of the issue and solution Github-Issue:#1027
Verification of Windows Process Tree Termination Fix Scenario: We observed the following process hierarchy:
Failure:
...as orphaned/zombie processes on Windows. Validation:
Conclusion: |
Motivation and Context
#555, #559, #765, #729, and #850 address important but different ways MCP servers can hang or fail to successfully terminate on Windows & POSIX systems.
This PR adds regressions tests to clearly demonstrate the edge cases addressed + unifies the approaches from the PRs listed above.
How Has This Been Tested?
New regression tests added (each fails without the relevant fix). These were developed and tested on both POSIX and Windows systems.
For #555 and #559 (guard for MCP servers that don't react to SIGTERM):
For #765 (align shutdown sequence with MCP spec):
For #729 and #850 (properly terminate processes and their children across platforms):
Breaking Changes
None.
Types of changes
Checklist
Additional context