
Commit e385139

Davidlohr Bueso authored and Ingo Molnar committed
locking/rwsem: Rework zeroing reader waiter->task
Readers that are awoken will expect a nil ->task, indicating that a wakeup has occurred. Because of the way readers are implemented, there is a small chance that the waiter will never block in the slowpath (rwsem_down_read_failed()), so some form of reference counting is required to avoid the following scenario:

    rwsem_down_read_failed()                rwsem_wake()
      get_task_struct();
      spin_lock_irq(&wait_lock);
      list_add_tail(&waiter.list)
      spin_unlock_irq(&wait_lock);
                                            raw_spin_lock_irqsave(&wait_lock)
                                            __rwsem_do_wake()
      while (1) {
        set_task_state(TASK_UNINTERRUPTIBLE);
                                              waiter->task = NULL
        if (!waiter.task) // true
          break;
        schedule() // never reached
      }
      __set_task_state(TASK_RUNNING);
      do_exit();
                                              wake_up_process(tsk); // boom

... and therefore the waker can race with do_exit() once the caller returns.

There is also a mismatch between the smp_mb() and its documentation, in that the serialization is actually done between reading the task and the nil store. Furthermore, in addition to the overlapping loads and stores to waiter->task being guaranteed to be ordered within that CPU, both wake_up_process() originally and wake_q_add() now already imply a barrier upon a successful call, which is all the comment really requires.

Now, as an alternative to inverting the checks on the blocker side (which has its own penalty, in that a schedule() becomes unavoidable), lockless wakeups address this situation naturally: we can simply rely on the reference implicitly held by wake_q_add() instead of taking one explicitly. Of course, we must guarantee that the nil store is done as the _last_ operation, i.e. only after the task has already been queued for wakeup, so as not to fall into the race above. Spurious wakeups are also handled transparently, in that the task's reference is only dropped when wake_up_q() is actually called, _after_ the nil store.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman.Long@hpe.com
Cc: dave@stgolabs.net
Cc: jason.low2@hp.com
Cc: peter@hurleysoftware.com
Link: http://lkml.kernel.org/r/1463165787-25937-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
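To make the resulting waker-side pattern concrete, here is a minimal sketch in C (an illustration assuming the wake_q API that the previous patch in this series wired into rwsem; it is not the literal patched function, and the surrounding waiter-selection logic is elided): wake_q_add() takes its own reference on the task and wake_up_q() drops it after the wakeup, so the explicit get_task_struct()/put_task_struct() pair becomes unnecessary and the nil store can safely be the final operation on the waiter.

    WAKE_Q(wake_q);                              /* on-stack wake queue */

    raw_spin_lock_irqsave(&sem->wait_lock, flags);
    /* ... select the reader waiter(s) to be woken ... */
    wake_q_add(&wake_q, waiter->task);           /* implies get_task_struct() */
    smp_store_release(&waiter->task, NULL);      /* nil store is the last operation */
    raw_spin_unlock_irqrestore(&sem->wait_lock, flags);

    wake_up_q(&wake_q);                          /* wake_up_process() + put_task_struct() */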
1 parent 133e89e commit e385139

1 file changed: +7 −10 lines changed


kernel/locking/rwsem-xadd.c

Lines changed: 7 additions & 10 deletions
@@ -194,17 +194,15 @@ __rwsem_mark_wake(struct rw_semaphore *sem,
 		waiter = list_entry(next, struct rwsem_waiter, list);
 		next = waiter->list.next;
 		tsk = waiter->task;
+
+		wake_q_add(wake_q, tsk);
 		/*
-		 * Make sure we do not wakeup the next reader before
-		 * setting the nil condition to grant the next reader;
-		 * otherwise we could miss the wakeup on the other
-		 * side and end up sleeping again. See the pairing
-		 * in rwsem_down_read_failed().
+		 * Ensure that the last operation is setting the reader
+		 * waiter to nil such that rwsem_down_read_failed() cannot
+		 * race with do_exit() by always holding a reference count
+		 * to the task to wakeup.
 		 */
-		smp_mb();
-		waiter->task = NULL;
-		wake_q_add(wake_q, tsk);
-		put_task_struct(tsk);
+		smp_store_release(&waiter->task, NULL);
 	} while (--loop);
 
 	sem->wait_list.next = next;
@@ -228,7 +226,6 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 	/* set up my own style of waitqueue */
 	waiter.task = tsk;
 	waiter.type = RWSEM_WAITING_FOR_READ;
-	get_task_struct(tsk);
 
 	raw_spin_lock_irq(&sem->wait_lock);
 	if (list_empty(&sem->wait_list))
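For context, the smp_store_release() in the first hunk pairs with the reader's slowpath wait loop in this same file. A minimal sketch of that loop (simplified from rwsem_down_read_failed(); the lock-grant path and surrounding setup are omitted):

    /* wait to be given the lock; the waker's nil store ends the loop */
    while (true) {
        set_task_state(tsk, TASK_UNINTERRUPTIBLE);
        if (!waiter.task)   /* observes smp_store_release(&waiter->task, NULL) */
            break;
        schedule();
    }
    __set_task_state(tsk, TASK_RUNNING);

Since wake_q_add() already holds a task reference by the time waiter->task is cleared, the task may exit immediately after breaking out of this loop without the deferred wake_up_q() touching freed memory.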
