-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Rework crt_fd to be more aligned with io-safety #5789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
1cdc344
to
0a201cd
Compare
0a201cd
to
cadba1b
Compare
WalkthroughThis update refactors file descriptor handling throughout the codebase. It introduces explicit Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant PythonVM
participant RustAPI
participant OS
User->>PythonVM: Call os.open/close/read/write/stat/etc.
PythonVM->>RustAPI: Convert Python args to Owned/Borrowed fd
RustAPI->>OS: Perform syscall with fd.as_raw()
OS-->>RustAPI: Return result or error code
RustAPI->>PythonVM: Return result or raise exception
PythonVM-->>User: Return value or Python exception
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (11)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🔭 Outside diff range comments (1)
vm/src/stdlib/os.rs (1)
1328-1334
:⚠️ Potential issue
copy_file_range
passesBorrowed
directly tolibc::syscall
, breaking the expectedlong
ABI
libc::syscall
expects its variadic arguments to be integral types widened toc_long
.
Passing a zero-sized-phantom wrapper struct (crt_fd::Borrowed
) relies onrepr(transparent>
and an implicit struct-to-integer cast, which is undefined in the C ABI and currently rejected by LLVM in some optimisation levels.- libc::SYS_copy_file_range, - args.src, + libc::SYS_copy_file_range, + args.src.as_raw() as libc::c_long, @@ - args.dst, + args.dst.as_raw() as libc::c_long,(The explicit
as libc::c_long
makes the intent unmistakable.)Without this change the build may succeed with
--release
on glibc but fail on musl/clang.
🧹 Nitpick comments (6)
vm/src/stdlib/posix_compat.rs (1)
29-36
: Lifetime-annotatedDirFd
looks good, but consider a more expressive lifetime nameAdding
'a
toSymlinkArgs
and propagating it toDirFd
aligns with the newBorrowed/Owned
model – no functional issues spotted.
For readability, you might rename'a
to something domain-specific ('fd
or'dirfd
) so that future readers instantly understand the intent.common/src/fileutils.rs (1)
98-105
: Windows: minor style – avoid shadowingh
twiceThe double-binding is perfectly valid, but a small
match
keeps the happy path flatter and setsSetLastError
only when needed:- let h = crt_fd::as_handle(fd); - if h.is_err() { - unsafe { SetLastError(ERROR_INVALID_HANDLE) }; - } - let h = h?; - let h = h.as_raw_handle(); + let h = match crt_fd::as_handle(fd) { + Ok(h) => h, + Err(e) => { + unsafe { SetLastError(ERROR_INVALID_HANDLE) }; + return Err(e); + } + }; + let h = h.as_raw_handle();Pure style – feel free to ignore.
vm/src/stdlib/posix.rs (1)
167-177
: Consider strengthening the safety documentation for file descriptor conversions.The safety comment acknowledges that Python's file descriptor API isn't io-safe, but this could be better documented. Consider adding more detailed safety requirements or invariants that callers must uphold.
- // SAFETY: none, really. but, python's os api of passing around file descriptors - // everywhere isn't really io-safe anyway, so, this is passed to the user. + // SAFETY: This conversion is inherently unsafe as we cannot guarantee: + // 1. The file descriptor is valid and open + // 2. The file descriptor won't be closed while this BorrowedFd exists + // 3. No other code will modify the file descriptor's state + // This matches Python's os module behavior which doesn't provide io-safety guarantees. Ok(unsafe { BorrowedFd::borrow_raw(fd) })vm/src/ospath.rs (1)
107-114
: Consider documenting the safety requirements for try_borrow_raw.While the unsafe block correctly handles the conversion from a raw file descriptor to a borrowed one, it would be beneficial to document the safety requirements more explicitly.
Some(int) => { let fd = int?.try_to_primitive(vm)?; + // SAFETY: We're creating a borrowed file descriptor from a raw fd provided by Python. + // The safety of this operation depends on the Python code properly managing the fd lifetime. unsafe { crt_fd::Borrowed::try_borrow_raw(fd) } .map(Self::Fd) .map_err(|e| e.into_pyexception(vm)) }vm/src/stdlib/io.rs (1)
3883-3885
:fcntl()
can takeBorrowedFd
directly — avoid raw FD conversionYou already have a typed
Borrowed
in scope:if let Ok(OsPathOrFd::Fd(fd)) = … { nix::fcntl::fcntl(fd.as_raw(), nix::fcntl::F_GETFD) }
nix::fcntl::fcntl
is generic overAsRawFd
; passingfd
directly removes
the extra syscall-unsafe integer step:- nix::fcntl::fcntl(fd.as_raw(), nix::fcntl::F_GETFD) + nix::fcntl::fcntl(&fd, nix::fcntl::F_GETFD)Minor, but keeps the code consistently io-safe.
common/src/crt_fd.rs (1)
356-359
: Preferc::read
(alias) for consistency and to avoid double-dispatchThe rest of the module consistently funnels raw libc calls through the
c::*
alias, but_read
goes directly tolibc::read
. For symmetry (and potential platform shims) it should do the same:- let n = cvt(unsafe { suppress_iph!(libc::read(fd, buf.as_mut_ptr() as _, count as _)) })?; + let n = cvt(unsafe { suppress_iph!(c::read(fd, buf.as_mut_ptr() as _, count as _)) })?;Minor, yet it keeps the abstraction layer uniform.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
common/src/crt_fd.rs
(1 hunks)common/src/fileutils.rs
(3 hunks)vm/src/ospath.rs
(3 hunks)vm/src/stdlib/io.rs
(8 hunks)vm/src/stdlib/msvcrt.rs
(3 hunks)vm/src/stdlib/nt.rs
(5 hunks)vm/src/stdlib/os.rs
(25 hunks)vm/src/stdlib/posix.rs
(15 hunks)vm/src/stdlib/posix_compat.rs
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
vm/src/stdlib/msvcrt.rs (2)
vm/src/stdlib/os.rs (2)
errno_err
(39-41)std
(1032-1032)common/src/crt_fd.rs (1)
as_handle
(337-341)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (11)
common/src/fileutils.rs (1)
11-21
:❌ Incorrect review comment
Unix: verify the public helper is really called
as_raw()
fd.as_raw()
compiles only ifBorrowed<'_>
exposes that exact method. In the standard library the idiom isas_raw_fd()
; drifting from that spelling makes grepping harder for people accustomed to the std traits.- let ret = libc::fstat(fd.as_raw(), stat.as_mut_ptr()); + let ret = libc::fstat(fd.as_raw_fd(), stat.as_mut_ptr());If
as_raw()
is the canonical helper incrt_fd
, ignore this.Why? A quick search in the repo will confirm which spelling is available.
🏁 Script executed:
#!/bin/bash # list all definitions ending with "fn as_raw" rg -n "fn as_raw" | headLength of output: 721
🏁 Script executed:
#!/bin/bash # Show the impl block for Borrowed<'a> to see which methods it defines rg -n "impl<'a> Borrowed<'a>" -A15 -B5 common/src/crt_fd.rsLength of output: 67
🏁 Script executed:
#!/bin/bash # Locate the struct definition for Borrowed and its impl blocks rg -n "struct Borrowed" -A5 -B5 common/src/crt_fd.rs rg -n "impl Borrowed" -A10 -B0 common/src/crt_fd.rsLength of output: 917
🏁 Script executed:
#!/bin/bash # Show all `as_raw` method definitions with context rg -n "pub fn as_raw" -A5 -B5 common/src/crt_fd.rsLength of output: 1107
No change needed:
Borrowed<'_>
defines.as_raw()
, not.as_raw_fd()
TheBorrowed<'_>
wrapper incommon/src/crt_fd.rs
implements:
pub fn as_raw(&self) -> Raw
pub fn as_raw(self) -> Raw
It does not expose an
as_raw_fd()
method. Callingfd.as_raw_fd()
will fail to compile, so the existingfd.as_raw()
invocation incommon/src/fileutils.rs
is correct.Likely an incorrect or invalid review comment.
vm/src/stdlib/msvcrt.rs (3)
24-26
: Public helpers now takeBorrowed<'_>
– good upgrade
setmode_binary
no longer risks raw-fd misuse. Implementation stays unchanged – looks correct.
75-76
: Safety check: extern_setmode
relies onBorrowed
being#[repr(transparent)]
Passing
Borrowed<'_>
directly over FFI is fine only if the wrapper is transparent over anint
andCopy
. Confirm the type definition:#[repr(transparent)] #[derive(Copy, Clone)] pub struct Borrowed<'a>(c::c_int, PhantomData<&'a ()>);If either attribute is missing, the ABI may break.
103-107
: Nice simplification ofget_osfhandle
error handlingDelegating to
crt_fd::as_handle
and usingIntoPyException
keeps the function tight and idiomatic.vm/src/stdlib/posix.rs (2)
335-336
: Good use of lifetime parameters for borrowed file descriptors.The addition of lifetime parameters to structs and functions that handle borrowed file descriptors improves type safety and makes the borrowing relationships explicit. This aligns well with Rust's ownership model.
Also applies to: 341-342, 345-345, 350-350, 412-412, 441-441, 443-443, 582-582, 594-594, 617-617, 896-896, 915-916, 924-926, 938-939, 957-957, 977-980, 983-983, 1208-1209, 1979-1979, 2200-2221
350-350
: Consistent API update from raw file descriptors to safe wrappers.The changes consistently replace raw file descriptor access methods (
.get_opt()
) with the new API (.raw_opt()
,.as_raw()
,.as_raw_fd()
), which provides better type safety and clearer semantics.Also applies to: 412-412, 416-417, 594-594, 915-916, 958-958, 1959-1959, 1983-1983
vm/src/stdlib/nt.rs (2)
65-66
: Consistent lifetime parameter additions for DirFd.The lifetime parameters added to
DirFd
usage maintain consistency with the changes in other modules, ensuring proper tracking of borrowed file descriptors.Also applies to: 71-72, 75-75, 114-115
93-100
: Improved type safety for set_inheritable function.The change from accepting a raw
i32
tocrt_fd::Borrowed<'_>
improves type safety and makes the borrowing relationship explicit. The error handling for the handle conversion is also properly implemented.vm/src/ospath.rs (2)
100-103
: Well-structured lifetime parameter addition to OsPathOrFd.The addition of the lifetime parameter
'fd
toOsPathOrFd
properly tracks the lifetime of borrowed file descriptors, maintaining safety invariants across the type system.Also applies to: 105-105, 119-120, 125-125
149-151
: Clean API update for IOErrorBuilder to support lifetime parameters.The updates to
IOErrorBuilder
methods to acceptimpl Into<OsPathOrFd<'a>>
provide a flexible and ergonomic API while maintaining proper lifetime tracking.Also applies to: 154-156, 162-165
vm/src/stdlib/io.rs (1)
4467-4469
: Verifycrt_fd::close()
does not double-drop the descriptor
close()
constructs anOwned
withfrom_raw(fd)
and passes it by value to
crt_fd::close()
. Ifclose()
internally closes the FD and then allows
Owned
to be dropped (itsDrop
impl also closes), the descriptor will be
closed twice.Please confirm the implementation contract:
- If
crt_fd::close()
consumes theOwned
andmem::forget
s it, current code
is fine.- If it merely forwards to
libc::close
and returns, the subsequentDrop
will execute a secondclose
.Unit-test with
dup()
plusfcntl(F_GETFD)
or auditcrt_fd::close()
.
let fd = unsafe { crt_fd::Borrowed::borrow_raw(fd) }; | ||
let filename = filename.unwrap_or(OsPathOrFd::Fd(fd)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consolidate repeated unsafe { borrow_raw(..) }
into a single, safe helper
borrow_raw
is invoked in several places (init
, get_fd
, path-vs-fd
resolution, etc.). Centralising the only unsafe
block behind a small
fn borrow_fd(raw: i32) -> Borrowed<'_>
greatly limits the surface for future
unsoundness and avoids copy-pasted unsafe
code.
- let fd = unsafe { crt_fd::Borrowed::borrow_raw(fd) };
+ let fd = borrow_fd(fd); // new safe helper
This also makes it trivial to add additional validation (e.g. negative FD
check) in exactly one place.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In vm/src/stdlib/io.rs around lines 4249 to 4251, the unsafe call to
borrow_raw(fd) is repeated in multiple places. To fix this, create a single safe
helper function, e.g., borrow_fd(raw: i32) -> Borrowed<'_>, that encapsulates
the unsafe block and performs any necessary validation like checking for
negative FDs. Replace all direct calls to unsafe { borrow_raw(..) } with calls
to this new helper function to centralize unsafe code and reduce duplication.
fn get_fd(&self, vm: &VirtualMachine) -> PyResult<crt_fd::Borrowed<'_>> { | ||
self.fileno(vm) | ||
.map(|fd| unsafe { crt_fd::Borrowed::borrow_raw(fd) }) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against race-to-close before creating an unsafe Borrowed
get_fd()
snapshots the raw descriptor, then immediately performs
unsafe { crt_fd::Borrowed::borrow_raw(fd) }
.
Because the borrow’s lifetime is tied to &self
, it is sound only while the
descriptor remains open.
close()
concurrently sets self.fd
to -1
and calls crt_fd::close()
; there
is no synchronisation primitive preventing another thread from:
- Calling
get_fd()
and receiving aBorrowed
. - Having the owning thread call
close()
before the borrow is used. - The borrow now aliases a closed descriptor → UB.
Even with the GIL, Rust code can run on multiple OS threads (e.g. spawn
without holding the GIL).
- self.fileno(vm)
- .map(|fd| unsafe { crt_fd::Borrowed::borrow_raw(fd) })
+ let raw = self.fileno(vm)?;
+ // Re-read the atomic after fileno() to ensure the FD is still valid.
+ if self.fd.load() != raw {
+ return Err(io_closed_error(vm));
+ }
+ Ok(unsafe { crt_fd::Borrowed::borrow_raw(raw) })
Alternatively hold an internal mutex during get_fd()
/close()
.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In vm/src/stdlib/io.rs around lines 4349 to 4352, the get_fd() method unsafely
creates a Borrowed descriptor without guarding against concurrent close() calls
that can invalidate the descriptor and cause undefined behavior. To fix this,
introduce a synchronization primitive such as an internal mutex that is locked
during both get_fd() and close() to ensure the descriptor cannot be closed while
a Borrowed is active, preventing race conditions and ensuring soundness of the
borrow.
use windows_sys::Win32::Foundation::{HANDLE, INVALID_HANDLE_VALUE}; | ||
unsafe extern "C" { | ||
fn _get_osfhandle(fd: Borrowed<'_>) -> c::intptr_t; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_get_osfhandle FFI signature should take a plain i32
instead of Borrowed
_get_osfhandle
is declared in the CRT as intptr_t _get_osfhandle(int fd);
.
Passing a Borrowed<'_>
(even though repr(transparent)
) in the FFI boundary:
- Introduces a generic‐parametrised type into the C ABI (discouraged and may be rejected by future rustc versions).
- Forces all callers to perform an implicit struct → integer cast, which is less clear than an explicit
.as_raw()
.
- fn _get_osfhandle(fd: Borrowed<'_>) -> c::intptr_t;
+ fn _get_osfhandle(fd: i32) -> c::intptr_t;
and at the call-site:
- let handle = unsafe { suppress_iph!(_get_osfhandle(fd)) };
+ let handle = unsafe { suppress_iph!(_get_osfhandle(fd.as_raw())) };
This keeps the ABI crystal-clear and avoids relying on repr(transparent)
for safety across the language boundary.
🤖 Prompt for AI Agents
In common/src/crt_fd.rs around lines 338 to 341, the FFI declaration of
_get_osfhandle currently takes a Borrowed<'_> type, which is incorrect for the C
ABI. Change the function signature to accept a plain i32 instead of Borrowed<'_>
to match the CRT declaration intptr_t _get_osfhandle(int fd). Also, update all
call sites to explicitly convert Borrowed values to i32 using .as_raw() before
passing them to _get_osfhandle, ensuring clear and safe ABI boundaries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, impressive, but irrelevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coolreader18 Fair enough! You're focused on the bigger io-safety refactoring here, and that FFI detail can certainly be addressed separately if needed. Thanks for the clarification on priorities.
Summary by CodeRabbit
Refactor
Bug Fixes
Style
Tests