Commit edafcce

io_uring: add support for pre-mapped user IO buffers
If we have fixed user buffers, we can map them into the kernel when we set up the io_uring. That avoids the need to do get_user_pages() for each and every IO.

To utilize this feature, the application must call io_uring_register() after having set up an io_uring instance, passing in IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to an iovec array, and nr_args should contain how many iovecs the application wishes to map.

If successful, these buffers are now mapped into the kernel, eligible for IO. To use these fixed buffers, the application must use the IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then set sqe->index to the desired buffer index. sqe->addr..sqe->addr+sqe->len must point to somewhere inside the indexed buffer.

The application may register buffers throughout the lifetime of the io_uring instance. It can call io_uring_register() with IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of buffers, and then register a new set. The application need not unregister buffers explicitly before shutting down the io_uring instance.

It's perfectly valid to set up a larger buffer, and then sometimes only use parts of it for an IO. As long as the range is within the originally mapped region, it will work just fine.

For now, buffers must not be file-backed. If file-backed buffers are passed in, the registration will fail with -1/EOPNOTSUPP. This restriction may be relaxed in the future.

RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat arbitrary 1G per buffer size is also imposed.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
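The constraints described in the message (an IO range must fall inside the indexed buffer, each buffer is capped at 1G, and the total is bounded by RLIMIT_MEMLOCK) can be sketched as userspace pre-checks. This is only an illustration under stated assumptions, not the kernel's code: the helper names `fixed_buf_contains` and `fixed_bufs_within_limits` and the macro `FIXED_BUF_MAX_BYTES` are hypothetical, and the byte-based total is an approximation of whatever accounting the kernel actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Per-buffer cap named in the commit message ("1G per buffer"). */
#define FIXED_BUF_MAX_BYTES ((size_t)1 << 30)

/*
 * Hypothetical helper: returns true if [addr, addr + len) lies entirely
 * inside the registered buffer described by iov.
 */
static bool fixed_buf_contains(const struct iovec *iov, uint64_t addr, uint64_t len)
{
    assert(iov != NULL);

    uint64_t base = (uint64_t)(uintptr_t)iov->iov_base;
    uint64_t size = (uint64_t)iov->iov_len;

    if (addr < base)
        return false;

    /* Compare offsets instead of end addresses so nothing can overflow. */
    uint64_t off = addr - base;
    return off <= size && len <= size - off;
}

/*
 * Hypothetical helper: returns true if every buffer is at most 1G and the
 * summed length fits in memlock_bytes. The caller supplies the limit,
 * e.g. rlim_cur from getrlimit(RLIMIT_MEMLOCK). Summing bytes is an
 * approximation chosen for this sketch.
 */
static bool fixed_bufs_within_limits(const struct iovec *iovs, size_t nr,
                                     uint64_t memlock_bytes)
{
    assert(iovs != NULL || nr == 0);

    uint64_t total = 0; /* invariant: total <= memlock_bytes */
    for (size_t i = 0; i < nr; i++) {
        uint64_t n = (uint64_t)iovs[i].iov_len;

        if (n > FIXED_BUF_MAX_BYTES)
            return false;
        if (n > memlock_bytes - total)
            return false;
        total += n;
    }
    return true;
}
```

An application could run `fixed_bufs_within_limits()` before registering and `fixed_buf_contains()` before queueing each fixed read or write, so that mistakes show up as local errors rather than failed registrations or IOs.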
1 parent 6d0c48a commit edafcce

7 files changed, +381 -15 lines changed

arch/x86/entry/syscalls/syscall_32.tbl

Lines changed: 1 addition & 0 deletions
@@ -400,3 +400,4 @@
 386 i386 rseq sys_rseq __ia32_sys_rseq
 425 i386 io_uring_setup sys_io_uring_setup __ia32_sys_io_uring_setup
 426 i386 io_uring_enter sys_io_uring_enter __ia32_sys_io_uring_enter
+427 i386 io_uring_register sys_io_uring_register __ia32_sys_io_uring_register

arch/x86/entry/syscalls/syscall_64.tbl

Lines changed: 1 addition & 0 deletions
@@ -345,6 +345,7 @@
 334 common rseq __x64_sys_rseq
 425 common io_uring_setup __x64_sys_io_uring_setup
 426 common io_uring_enter __x64_sys_io_uring_enter
+427 common io_uring_register __x64_sys_io_uring_register

 #
 # x32-specific system call numbers start at 512 to avoid cache impact
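
Since the tables above assign syscall number 427 to io_uring_register on both i386 and x86-64, a userspace program without a libc wrapper could reach it through syscall(2). The following is a minimal sketch under assumptions: the wrapper names are hypothetical, and the opcode values 0 and 1 are taken to match the upstream uapi header, which is not part of this excerpt, so they are defined locally under `SKETCH_` names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/* Syscall number from the tables above (427 on i386 and x86-64). */
#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427
#endif

/* Assumed opcode values; the header defining them is not shown here. */
enum {
    SKETCH_IORING_REGISTER_BUFFERS   = 0,
    SKETCH_IORING_UNREGISTER_BUFFERS = 1,
};

/*
 * Hypothetical raw wrapper: returns 0 on success, or -1 with errno set
 * (for example ENOSYS on a kernel without this syscall, or EBADF for a
 * descriptor that is not open).
 */
static int io_uring_register_raw(int ring_fd, unsigned int opcode,
                                 const void *arg, unsigned int nr_args)
{
    assert(arg != NULL || nr_args == 0);
    return (int)syscall(__NR_io_uring_register, ring_fd, opcode, arg, nr_args);
}

/* Register a single buffer as fixed buffer index 0. */
static int register_one_buffer(int ring_fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };

    return io_uring_register_raw(ring_fd, SKETCH_IORING_REGISTER_BUFFERS,
                                 &iov, 1);
}

/* Drop the currently registered buffer set; takes no argument array. */
static int unregister_buffers(int ring_fd)
{
    return io_uring_register_raw(ring_fd, SKETCH_IORING_UNREGISTER_BUFFERS,
                                 NULL, 0);
}
```

Here `ring_fd` would be the descriptor returned by io_uring_setup. Real applications would more likely use a library wrapper, if one is available, than a hard-coded syscall number.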
