Skip to content

Commit ac765f8

Browse files
fdmananakdave
authored andcommitted
Btrfs: fix data corruption due to cloning of eof block
We currently allow cloning a range from a file which includes the last block of the file even if the file's size is not aligned to the block size. This is fine and useful when the destination file has the same size, but when it does not and the range ends somewhere in the middle of the destination file, it leads to corruption because the bytes between the EOF and the end of the block have undefined data (when there is support for discard/trimming they have a value of 0x00). Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ export foo_size=$((256 * 1024 + 100)) $ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo $ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar $ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar $ od -A d -t x1 /mnt/bar 0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c * 0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00 0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 1048576 The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527 (512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead of 0xb5. This is similar to the problem we had for deduplication that got recently fixed by commit de02b9f ("Btrfs: fix data corruption when deduplicating between different files"). Fix this by not allowing such operations to be performed and return the errno -EINVAL to user space. This is what XFS is doing as well at the VFS level. This change however now makes us return -EINVAL instead of -EOPNOTSUPP for cases where the source range maps to an inline extent and the destination range's end is smaller then the destination file's size, since the detection of inline extents is done during the actual process of dropping file extent items (at __btrfs_drop_extents()). Returning the -EINVAL error is done early on and solely based on the input parameters (offsets and length) and destination file's size. This makes us consistent with XFS and anyone else supporting cloning since this case is now checked at a higher level in the VFS and is where the -EINVAL will be returned from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1 by commit 07d19dc ("vfs: avoid problematic remapping requests into partial EOF block"). So this change is more geared towards stable kernels, as it's unlikely the new VFS checks get removed intentionally. A test case for fstests follows soon, as well as an update to filter existing tests that expect -EOPNOTSUPP to accept -EINVAL as well. CC: <stable@vger.kernel.org> # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
1 parent 11023d3 commit ac765f8

File tree

1 file changed

+10
-2
lines changed

1 file changed

+10
-2
lines changed

fs/btrfs/ioctl.c

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4279,9 +4279,17 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src,
42794279
goto out_unlock;
42804280
if (len == 0)
42814281
olen = len = src->i_size - off;
4282-
/* if we extend to eof, continue to block boundary */
4283-
if (off + len == src->i_size)
4282+
/*
4283+
* If we extend to eof, continue to block boundary if and only if the
4284+
* destination end offset matches the destination file's size, otherwise
4285+
* we would be corrupting data by placing the eof block into the middle
4286+
* of a file.
4287+
*/
4288+
if (off + len == src->i_size) {
4289+
if (!IS_ALIGNED(len, bs) && destoff + len < inode->i_size)
4290+
goto out_unlock;
42844291
len = ALIGN(src->i_size, bs) - off;
4292+
}
42854293

42864294
if (len == 0) {
42874295
ret = 0;

0 commit comments

Comments
 (0)