Commit 00a3d66

Revert "fs: do not prefault sys_write() user buffer pages"
This reverts commit 998ef75. The commit itself does not appear to be buggy per se, but it is exposing a bug in ext4 (and Ted thinks ext3 too, but we solved that by getting rid of it).

It's too late in the release cycle to really worry about this, even if Dave Hansen has a patch that may actually fix the underlying ext4 problem. We can (and should) revisit this for the next release.

The problem is that moving the prefaulting later now exposes a special case with partially successful writes that isn't handled correctly. And the prefaulting likely isn't normally even that much of a performance issue - it looks like at least one reason Dave saw this in his performance tests is that he also ran them on Skylake, which now supports the new SMAP code, which makes the normally very cheap user space prefaulting noticeably more expensive.

Bisected-and-acked-by: Ted Ts'o <tytso@mit.edu>
Analyzed-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
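For context, the deadlock that the restored up-front prefault guards against is the one described in the comment brought back in the diff below: the source buffer of a write() can itself be a not-yet-uptodate mapping of the very page being written. The following is only a rough userspace sketch of that situation (the file name "testfile" and the 4096-byte size are arbitrary choices for illustration, not part of this change):

/*
 * Sketch of the self-referential write the prefault guards against:
 * the source buffer for the write is an mmap() of the same file range
 * that the write targets. Error handling is deliberately minimal.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        int fd = open("testfile", O_RDWR | O_CREAT, 0644);

        if (fd < 0 || ftruncate(fd, 4096) < 0) {
                perror("setup");
                return 1;
        }

        /* Map the first page of the file read-only. */
        char *buf = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
        if (buf == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /*
         * Use that mapping as the source of a write into the very same
         * page of the same file. If the kernel copied from 'buf' while
         * holding the destination page locked, faulting 'buf' in would
         * require that same page to become up-to-date - hence the
         * prefault (or the atomic usercopy with a fallback) in
         * generic_perform_write().
         */
        if (pwrite(fd, buf, 4096, 0) != 4096)
                perror("pwrite");

        munmap(buf, 4096);
        close(fd);
        return 0;
}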
1 parent f670268 commit 00a3d66

1 file changed, 16 insertions(+), 18 deletions(-)

mm/filemap.c

Lines changed: 16 additions & 18 deletions
@@ -2473,24 +2473,30 @@ ssize_t generic_perform_write(struct file *file,
                                                 iov_iter_count(i));
 
 again:
+                /*
+                 * Bring in the user page that we will copy from _first_.
+                 * Otherwise there's a nasty deadlock on copying from the
+                 * same page as we're writing to, without it being marked
+                 * up-to-date.
+                 *
+                 * Not only is this an optimisation, but it is also required
+                 * to check that the address is actually valid, when atomic
+                 * usercopies are used, below.
+                 */
+                if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
+                        status = -EFAULT;
+                        break;
+                }
+
                 status = a_ops->write_begin(file, mapping, pos, bytes, flags,
                                                 &page, &fsdata);
                 if (unlikely(status < 0))
                         break;
 
                 if (mapping_writably_mapped(mapping))
                         flush_dcache_page(page);
-                /*
-                 * 'page' is now locked. If we are trying to copy from a
-                 * mapping of 'page' in userspace, the copy might fault and
-                 * would need PageUptodate() to complete. But, page can not be
-                 * made Uptodate without acquiring the page lock, which we hold.
-                 * Deadlock. Avoid with pagefault_disable(). Fix up below with
-                 * iov_iter_fault_in_readable().
-                 */
-                pagefault_disable();
+
                 copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
-                pagefault_enable();
                 flush_dcache_page(page);
 
                 status = a_ops->write_end(file, mapping, pos, bytes, copied,
@@ -2513,14 +2519,6 @@ ssize_t generic_perform_write(struct file *file,
                          */
                         bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset,
                                                 iov_iter_single_seg_count(i));
-                        /*
-                         * This is the fallback to recover if the copy from
-                         * userspace above faults.
-                         */
-                        if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
-                                status = -EFAULT;
-                                break;
-                        }
                         goto again;
                 }
                 pos += copied;
