Commit 9bc9ccd

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "All kinds of stuff this time around; some more notable parts:

   - RCU'd vfsmounts handling
   - new primitives for coredump handling
   - files_lock is gone
   - Bruce's delegations handling series
   - exportfs fixes

  plus misc stuff all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
  ecryptfs: ->f_op is never NULL
  locks: break delegations on any attribute modification
  locks: break delegations on link
  locks: break delegations on rename
  locks: helper functions for delegation breaking
  locks: break delegations on unlink
  namei: minor vfs_unlink cleanup
  locks: implement delegations
  locks: introduce new FL_DELEG lock flag
  vfs: take i_mutex on renamed file
  vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  vfs: don't use PARENT/CHILD lock classes for non-directories
  vfs: pull ext4's double-i_mutex-locking into common code
  exportfs: fix quadratic behavior in filehandle lookup
  exportfs: better variable name
  exportfs: move most of reconnect_path to helper function
  exportfs: eliminate unused "noprogress" counter
  exportfs: stop retrying once we race with rename/remove
  exportfs: clear DISCONNECTED on all parents sooner
  exportfs: more detailed comment for path_reconnect
  ...

2 parents f023029 + bdd3536 commit 9bc9ccd

159 files changed, 2099 additions and 2491 deletions

Documentation/filesystems/directory-locking

Lines changed: 22 additions & 9 deletions
@@ -2,6 +2,10 @@
 kinds of locks - per-inode (->i_mutex) and per-filesystem
 (->s_vfs_rename_mutex).
 
+When taking the i_mutex on multiple non-directory objects, we
+always acquire the locks in order by increasing address. We'll call
+that "inode pointer" order in the following.
+
 For our purposes all operations fall in 5 classes:
 
 1) read access. Locking rules: caller locks directory we are accessing.
@@ -12,8 +16,9 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
 locks victim and calls the method.
 
 4) rename() that is _not_ cross-directory. Locking rules: caller locks
-the parent, finds source and target, if target already exists - locks it
-and then calls the method.
+the parent and finds source and target. If target already exists, lock
+it. If source is a non-directory, lock it. If that means we need to
+lock both, lock them in inode pointer order.
 
 5) link creation. Locking rules:
 * lock parent
@@ -30,7 +35,9 @@ rules:
 fail with -ENOTEMPTY
 * if new parent is equal to or is a descendent of source
 fail with -ELOOP
-* if target exists - lock it.
+* If target exists, lock it. If source is a non-directory, lock
+it. In case that means we need to lock both source and target,
+do so in inode pointer order.
 * call the method.
 
 
@@ -56,19 +63,25 @@ objects - A < B iff A is an ancestor of B.
 renames will be blocked on filesystem lock and we don't start changing
 the order until we had acquired all locks).
 
-(3) any operation holds at most one lock on non-directory object and
-that lock is acquired after all other locks. (Proof: see descriptions
-of operations).
+(3) locks on non-directory objects are acquired only after locks on
+directory objects, and are acquired in inode pointer order.
+(Proof: all operations but renames take lock on at most one
+non-directory object, except renames, which take locks on source and
+target in inode pointer order in the case they are not directories.)
 
 Now consider the minimal deadlock. Each process is blocked on
 attempt to acquire some lock and already holds at least one lock. Let's
 consider the set of contended locks. First of all, filesystem lock is
 not contended, since any process blocked on it is not holding any locks.
 Thus all processes are blocked on ->i_mutex.
 
-Non-directory objects are not contended due to (3). Thus link
-creation can't be a part of deadlock - it can't be blocked on source
-and it means that it doesn't hold any locks.
+By (3), any process holding a non-directory lock can only be
+waiting on another non-directory lock with a larger address. Therefore
+the process holding the "largest" such lock can always make progress, and
+non-directory objects are not included in the set of contended locks.
+
+Thus link creation can't be a part of deadlock - it can't be
+blocked on source and it means that it doesn't hold any locks.
 
 Any contended object is either held by cross-directory rename or
 has a child that is also contended. Indeed, suppose that it is held by
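
Reviewer sketch: the "inode pointer" ordering rule added to this file can be illustrated in userspace. This is a minimal model and not the kernel's implementation: struct toy_inode, toy_first(), toy_lock_two() and toy_unlock_two() are hypothetical names, and POSIX mutexes stand in for ->i_mutex. It only aims to show that sorting by address gives every caller the same acquisition order, whatever order the arguments arrive in.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-directory inode; only the lock matters. */
struct toy_inode {
	pthread_mutex_t i_mutex;
};

/* Return whichever of the two objects has the lower address. */
static struct toy_inode *toy_first(struct toy_inode *a, struct toy_inode *b)
{
	return ((uintptr_t)a <= (uintptr_t)b) ? a : b;
}

/*
 * Lock both objects in increasing address order. If both pointers name
 * the same object, its lock is taken only once.
 */
static void toy_lock_two(struct toy_inode *a, struct toy_inode *b)
{
	struct toy_inode *first = toy_first(a, b);
	struct toy_inode *second = (first == a) ? b : a;

	assert(a != NULL && b != NULL);
	pthread_mutex_lock(&first->i_mutex);
	if (second != first)
		pthread_mutex_lock(&second->i_mutex);
}

/* Release the locks taken by toy_lock_two(); unlock order does not matter. */
static void toy_unlock_two(struct toy_inode *a, struct toy_inode *b)
{
	pthread_mutex_unlock(&a->i_mutex);
	if (b != a)
		pthread_mutex_unlock(&b->i_mutex);
}
```

With this helper, one thread calling toy_lock_two(&x, &y) and another calling toy_lock_two(&y, &x) both take the lower-addressed lock first, so neither can hold one lock while waiting for the other in the opposite order. Older toolchains may need -pthread when building.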

Documentation/filesystems/porting

Lines changed: 8 additions & 0 deletions
@@ -455,3 +455,11 @@ in your dentry operations instead.
 vfs_follow_link has been removed. Filesystems must use nd_set_link
 from ->follow_link for normal symlinks, or nd_jump_link for magic
 /proc/<pid> style links.
+--
+[mandatory]
+iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be
+called with both ->i_lock and inode_hash_lock held; the former is *not*
+taken anymore, so verify that your callbacks do not rely on it (none
+of the in-tree instances did). inode_hash_lock is still held,
+of course, so they are still serialized wrt removal from inode hash,
+as well as wrt set() callback of iget5_locked().

arch/arm64/kernel/signal32.c

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ static inline int get_sigset_t(sigset_t *set,
 	return 0;
 }
 
-int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 {
 	int err;
 

arch/ia64/kernel/elfcore.c

Lines changed: 4 additions & 8 deletions
@@ -11,8 +11,7 @@ Elf64_Half elf_core_extra_phdrs(void)
 	return GATE_EHDR->e_phnum;
 }
 
-int elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
-			       unsigned long limit)
+int elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset)
 {
 	const struct elf_phdr *const gate_phdrs =
 		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);
@@ -35,15 +34,13 @@ int elf_core_write_extra_phdrs(struct file *file, loff_t offset, size_t *size,
 			phdr.p_offset += ofs;
 		}
 		phdr.p_paddr = 0; /* match other core phdrs */
-		*size += sizeof(phdr);
-		if (*size > limit || !dump_write(file, &phdr, sizeof(phdr)))
+		if (!dump_emit(cprm, &phdr, sizeof(phdr)))
 			return 0;
 	}
 	return 1;
 }
 
-int elf_core_write_extra_data(struct file *file, size_t *size,
-			      unsigned long limit)
+int elf_core_write_extra_data(struct coredump_params *cprm)
 {
 	const struct elf_phdr *const gate_phdrs =
 		(const struct elf_phdr *) (GATE_ADDR + GATE_EHDR->e_phoff);
@@ -54,8 +51,7 @@ int elf_core_write_extra_data(struct file *file, size_t *size,
 			void *addr = (void *)gate_phdrs[i].p_vaddr;
 			size_t memsz = PAGE_ALIGN(gate_phdrs[i].p_memsz);
 
-			*size += memsz;
-			if (*size > limit || !dump_write(file, addr, memsz))
+			if (!dump_emit(cprm, addr, memsz))
 				return 0;
 			break;
 		}
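
Reviewer sketch: the hunks above drop the caller-side "*size += n; if (*size > limit || !dump_write(...))" pattern in favour of a single dump_emit(cprm, ...) call. The body of dump_emit() is not part of this file's diff, so the following is only a userspace model of the idea, assuming the size accounting moved behind the helper. struct toy_cprm, its fields and toy_dump_emit() are hypothetical names, and a memory buffer stands in for the core file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct coredump_params. */
struct toy_cprm {
	unsigned char *buf;	/* destination, standing in for the core file */
	size_t written;		/* bytes emitted so far */
	size_t limit;		/* maximum bytes allowed (also the size of buf) */
};

/*
 * Emit nr bytes, keeping the size accounting in one place.
 * Returns 1 on success and 0 if the write would exceed the limit,
 * matching the 1/0 convention visible at the call sites above.
 */
static int toy_dump_emit(struct toy_cprm *cprm, const void *addr, size_t nr)
{
	assert(cprm != NULL && cprm->written <= cprm->limit);
	if (nr > cprm->limit - cprm->written)
		return 0;
	memcpy(cprm->buf + cprm->written, addr, nr);
	cprm->written += nr;
	return 1;
}
```

The design point is that callers no longer thread a size pointer and a limit through every helper signature; they pass one context object and check a single return value.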

arch/ia64/kernel/signal.c

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ restore_sigcontext (struct sigcontext __user *sc, struct sigscratch *scr)
 }
 
 int
-copy_siginfo_to_user (siginfo_t __user *to, siginfo_t *from)
+copy_siginfo_to_user (siginfo_t __user *to, const siginfo_t *from)
 {
 	if (!access_ok(VERIFY_WRITE, to, sizeof(siginfo_t)))
 		return -EFAULT;
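
Reviewer sketch: this change, repeated across the arch signal files in this commit, const-qualifies the source argument of the siginfo copy helpers. A small userspace illustration of what that buys, using hypothetical names (struct toy_siginfo and toy_copy_siginfo() are not kernel types or functions): the callee promises not to modify the source, and callers can pass a pointer to a const object without a cast.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for siginfo_t. */
struct toy_siginfo {
	int si_signo;
	int si_code;
};

/*
 * Copy *from into *to. The const qualifier on 'from' means the compiler
 * rejects any accidental write to the source inside this function.
 */
static int toy_copy_siginfo(struct toy_siginfo *to,
			    const struct toy_siginfo *from)
{
	assert(to != NULL && from != NULL);
	memcpy(to, from, sizeof(*to));
	return 0;
}
```

Without the const on the parameter, passing the address of a const object would draw a discarded-qualifier diagnostic from the compiler.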

arch/mips/kernel/signal32.c

Lines changed: 1 addition & 1 deletion
@@ -314,7 +314,7 @@ SYSCALL_DEFINE3(32_sigaction, long, sig, const struct compat_sigaction __user *,
 	return ret;
 }
 
-int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 {
 	int err;
 

arch/parisc/kernel/signal32.c

Lines changed: 1 addition & 1 deletion
@@ -319,7 +319,7 @@ copy_siginfo_from_user32 (siginfo_t *to, compat_siginfo_t __user *from)
 }
 
 int
-copy_siginfo_to_user32 (compat_siginfo_t __user *to, siginfo_t *from)
+copy_siginfo_to_user32 (compat_siginfo_t __user *to, const siginfo_t *from)
 {
 	compat_uptr_t addr;
 	compat_int_t val;

arch/parisc/kernel/signal32.h

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ struct compat_ucontext {
 
 /* ELF32 signal handling */
 
-int copy_siginfo_to_user32 (compat_siginfo_t __user *to, siginfo_t *from);
+int copy_siginfo_to_user32 (compat_siginfo_t __user *to, const siginfo_t *from);
 int copy_siginfo_from_user32 (siginfo_t *to, compat_siginfo_t __user *from);
 
 /* In a deft move of uber-hackery, we decide to carry the top half of all

arch/powerpc/include/asm/spu.h

Lines changed: 2 additions & 1 deletion
@@ -235,14 +235,15 @@ extern long spu_sys_callback(struct spu_syscall_block *s);
 
 /* syscalls implemented in spufs */
 struct file;
+struct coredump_params;
 struct spufs_calls {
 	long (*create_thread)(const char __user *name,
 					unsigned int flags, umode_t mode,
 					struct file *neighbor);
 	long (*spu_run)(struct file *filp, __u32 __user *unpc,
 						__u32 __user *ustatus);
 	int (*coredump_extra_notes_size)(void);
-	int (*coredump_extra_notes_write)(struct file *file, loff_t *foffset);
+	int (*coredump_extra_notes_write)(struct coredump_params *cprm);
 	void (*notify_spus_active)(void);
 	struct module *owner;
 };

arch/powerpc/kernel/signal_32.c

Lines changed: 1 addition & 1 deletion
@@ -893,7 +893,7 @@ static long restore_tm_user_regs(struct pt_regs *regs,
 #endif
 
 #ifdef CONFIG_PPC64
-int copy_siginfo_to_user32(struct compat_siginfo __user *d, siginfo_t *s)
+int copy_siginfo_to_user32(struct compat_siginfo __user *d, const siginfo_t *s)
 {
 	int err;
 