Commit 856fc29

Hugh Dickins authored and Linus Torvalds committed
[PATCH] hugetlb: fix prio_tree unit
hugetlb_vmtruncate_list was misconverted to prio_tree: its prio_tree is in units of PAGE_SIZE (PAGE_CACHE_SIZE) like any other, not HPAGE_SIZE (whereas its radix_tree is kept in units of HPAGE_SIZE, otherwise slots would be absurdly sparse).

At first I thought the error benign, just calling __unmap_hugepage_range on more vmas than necessary; but on 32-bit machines, when the prio_tree is searched correctly, it happens to ensure the v_offset calculation won't overflow. As it stood, when truncating at or beyond 4GB, it was liable to discard pages COWed from lower offsets; or even to clear pmd entries of preceding vmas, triggering exit_mmap's BUG_ON(nr_ptes).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
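
The wrap-around described above can be reproduced in user space. The following is only a sketch under stated assumptions, not kernel code: `uint32_t` stands in for a 32-bit `unsigned long`, the shift values assume 4KB base pages and 4MB huge pages, and `struct toy_vma`, `toy_search_hits`, `old_v_offset` and `new_v_offset` are hypothetical names introduced here for illustration. The search helper only models the overlap test that a prio_tree lookup over `[index, ULONG_MAX]` performs; it is not the prio_tree itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4KB base pages, 4MB huge pages. */
#define PAGE_SHIFT  12
#define HPAGE_SHIFT 22

/* Stand-in for unsigned long on a 32-bit architecture. */
typedef uint32_t ulong32;

/* Hypothetical toy vma: file offset and length, both in PAGE_SIZE units. */
struct toy_vma {
	ulong32 vm_pgoff;
	ulong32 nr_pages;
};

/*
 * Models a prio_tree search over [index, ULONG_MAX]: a vma is returned
 * when its last page index is at or beyond the search start.
 */
static bool toy_search_hits(const struct toy_vma *vma, ulong32 index)
{
	return vma->vm_pgoff + vma->nr_pages - 1 >= index;
}

/* Pre-patch arithmetic: offsets compared in HPAGE_SIZE units. */
static ulong32 old_v_offset(const struct toy_vma *vma, uint64_t offset)
{
	ulong32 h_pgoff = (ulong32)(offset >> HPAGE_SHIFT);
	ulong32 h_vm_pgoff = vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT);
	/* Wraps modulo 2^32 once the byte distance reaches 4GB. */
	ulong32 v_offset = (h_pgoff - h_vm_pgoff) << HPAGE_SHIFT;

	if (h_vm_pgoff >= h_pgoff)
		v_offset = 0;
	return v_offset;
}

/* Post-patch arithmetic: offsets compared in PAGE_SIZE units. */
static ulong32 new_v_offset(const struct toy_vma *vma, uint64_t offset)
{
	ulong32 pgoff = (ulong32)(offset >> PAGE_SHIFT);

	if (vma->vm_pgoff < pgoff)
		return (pgoff - vma->vm_pgoff) << PAGE_SHIFT;
	return 0;
}
```

Take a toy vma mapping the first 8MB of the file (`vm_pgoff = 0`, `nr_pages = 2048`) and truncate at exactly 4GB. The pre-patch search index is 1024, which the PAGE_SIZE-unit tree treats as page 1024, so the vma is returned; `old_v_offset` then wraps to 0, meaning the whole vma would be unmapped although it lies entirely below the truncation point. With the corrected index (1 << 20) the same vma is never returned, and for a vma that does straddle the 4GB mark the distance passed to the shift is bounded by the vma's own length, so it cannot wrap.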
1 parent b9d7e6a commit 856fc29

1 file changed, 11 additions and 13 deletions


fs/hugetlbfs/inode.c

Lines changed: 11 additions & 13 deletions
@@ -271,26 +271,24 @@ static void hugetlbfs_drop_inode(struct inode *inode)
 	hugetlbfs_forget_inode(inode);
 }
 
-/*
- * h_pgoff is in HPAGE_SIZE units.
- * vma->vm_pgoff is in PAGE_SIZE units.
- */
 static inline void
-hugetlb_vmtruncate_list(struct prio_tree_root *root, unsigned long h_pgoff)
+hugetlb_vmtruncate_list(struct prio_tree_root *root, pgoff_t pgoff)
 {
 	struct vm_area_struct *vma;
 	struct prio_tree_iter iter;
 
-	vma_prio_tree_foreach(vma, &iter, root, h_pgoff, ULONG_MAX) {
-		unsigned long h_vm_pgoff;
+	vma_prio_tree_foreach(vma, &iter, root, pgoff, ULONG_MAX) {
 		unsigned long v_offset;
 
-		h_vm_pgoff = vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT);
-		v_offset = (h_pgoff - h_vm_pgoff) << HPAGE_SHIFT;
 		/*
-		 * Is this VMA fully outside the truncation point?
+		 * Can the expression below overflow on 32-bit arches?
+		 * No, because the prio_tree returns us only those vmas
+		 * which overlap the truncated area starting at pgoff,
+		 * and no vma on a 32-bit arch can span beyond the 4GB.
 		 */
-		if (h_vm_pgoff >= h_pgoff)
+		if (vma->vm_pgoff < pgoff)
+			v_offset = (pgoff - vma->vm_pgoff) << PAGE_SHIFT;
+		else
 			v_offset = 0;
 
 		__unmap_hugepage_range(vma,
@@ -303,14 +301,14 @@ hugetlb_vmtruncate_list(struct prio_tree_root *root, unsigned long h_pgoff)
  */
 static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
 {
-	unsigned long pgoff;
+	pgoff_t pgoff;
 	struct address_space *mapping = inode->i_mapping;
 
 	if (offset > inode->i_size)
 		return -EINVAL;
 
 	BUG_ON(offset & ~HPAGE_MASK);
-	pgoff = offset >> HPAGE_SHIFT;
+	pgoff = offset >> PAGE_SHIFT;
 
 	inode->i_size = offset;
 	spin_lock(&mapping->i_mmap_lock);
