Skip to content

Commit bf378d3

Browse files
Peter ZijlstraIngo Molnar
authored andcommitted
perf: Fix perf ring buffer memory ordering
The PPC64 people noticed a missing memory barrier and crufty old comments in the perf ring buffer code. So update all the comments and add the missing barrier. When the architecture implements local_t using atomic_long_t there will be double barriers issued; but short of introducing more conditional barrier primitives this is the best we can do. Reported-by: Victor Kaplansky <victork@il.ibm.com> Tested-by: Victor Kaplansky <victork@il.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: michael@ellerman.id.au Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: anton@samba.org Cc: benh@kernel.crashing.org Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent cd65718 commit bf378d3

File tree

2 files changed

+34
-9
lines changed

2 files changed

+34
-9
lines changed

include/uapi/linux/perf_event.h

Lines changed: 7 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -456,13 +456,15 @@ struct perf_event_mmap_page {
456456
/*
457457
* Control data for the mmap() data buffer.
458458
*
459-
* User-space reading the @data_head value should issue an rmb(), on
460-
* SMP capable platforms, after reading this value -- see
461-
* perf_event_wakeup().
459+
* User-space reading the @data_head value should issue an smp_rmb(),
460+
* after reading this value.
462461
*
463462
* When the mapping is PROT_WRITE the @data_tail value should be
464-
* written by userspace to reflect the last read data. In this case
465-
* the kernel will not over-write unread data.
463+
* written by userspace to reflect the last read data, after issueing
464+
* an smp_mb() to separate the data read from the ->data_tail store.
465+
* In this case the kernel will not over-write unread data.
466+
*
467+
* See perf_output_put_handle() for the data ordering.
466468
*/
467469
__u64 data_head; /* head in the data section */
468470
__u64 data_tail; /* user-space written tail */

kernel/events/ring_buffer.c

Lines changed: 27 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -87,10 +87,31 @@ static void perf_output_put_handle(struct perf_output_handle *handle)
8787
goto out;
8888

8989
/*
90-
* Publish the known good head. Rely on the full barrier implied
91-
* by atomic_dec_and_test() order the rb->head read and this
92-
* write.
90+
* Since the mmap() consumer (userspace) can run on a different CPU:
91+
*
92+
* kernel user
93+
*
94+
* READ ->data_tail READ ->data_head
95+
* smp_mb() (A) smp_rmb() (C)
96+
* WRITE $data READ $data
97+
* smp_wmb() (B) smp_mb() (D)
98+
* STORE ->data_head WRITE ->data_tail
99+
*
100+
* Where A pairs with D, and B pairs with C.
101+
*
102+
* I don't think A needs to be a full barrier because we won't in fact
103+
* write data until we see the store from userspace. So we simply don't
104+
* issue the data WRITE until we observe it. Be conservative for now.
105+
*
106+
* OTOH, D needs to be a full barrier since it separates the data READ
107+
* from the tail WRITE.
108+
*
109+
* For B a WMB is sufficient since it separates two WRITEs, and for C
110+
* an RMB is sufficient since it separates two READs.
111+
*
112+
* See perf_output_begin().
93113
*/
114+
smp_wmb();
94115
rb->user_page->data_head = head;
95116

96117
/*
@@ -154,9 +175,11 @@ int perf_output_begin(struct perf_output_handle *handle,
154175
* Userspace could choose to issue a mb() before updating the
155176
* tail pointer. So that all reads will be completed before the
156177
* write is issued.
178+
*
179+
* See perf_output_put_handle().
157180
*/
158181
tail = ACCESS_ONCE(rb->user_page->data_tail);
159-
smp_rmb();
182+
smp_mb();
160183
offset = head = local_read(&rb->head);
161184
head += size;
162185
if (unlikely(!perf_output_space(rb, tail, offset, head)))

0 commit comments

Comments
 (0)