Commit a693c46

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 - add RCU torture scripts/tooling
 - static analysis improvements
 - update RCU documentation
 - miscellaneous fixes

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  rcu: Remove "extern" from function declarations in kernel/rcu/rcu.h
  rcu: Remove "extern" from function declarations in include/linux/*rcu*.h
  rcu/torture: Dynamically allocate SRCU output buffer to avoid overflow
  rcu: Don't activate RCU core on NO_HZ_FULL CPUs
  rcu: Warn on allegedly impossible rcu_read_unlock_special() from irq
  rcu: Add an RCU_INITIALIZER for global RCU-protected pointers
  rcu: Make rcu_assign_pointer's assignment volatile and type-safe
  bonding: Use RCU_INIT_POINTER() for better overhead and for sparse
  rcu: Add comment on evaluate-once properties of rcu_assign_pointer().
  rcu: Provide better diagnostics for blocking in RCU callback functions
  rcu: Improve SRCU's grace-period comments
  rcu: Fix CONFIG_RCU_FANOUT_EXACT for odd fanout/leaf values
  rcu: Fix coccinelle warnings
  rcutorture: Stop tracking FSF's postal address
  rcutorture: Move checkarg to functions.sh
  rcutorture: Flag errors and warnings with color coding
  rcutorture: Record results from repeated runs of the same test scenario
  rcutorture: Test summary at end of run with less chattiness
  rcutorture: Update comment in kvm.sh listing typical RCU trace events
  rcutorture: Add tracing-enabled version of TREE08
  ...
2 parents 6ffbe7d + 73a7ac2 commit a693c46

127 files changed, +3768 -198 lines changed

Documentation/RCU/trace.txt

Lines changed: 13 additions & 9 deletions
@@ -396,14 +396,14 @@ o Each element of the form "3/3 ..>. 0:7 ^0" represents one rcu_node

 The output of "cat rcu/rcu_sched/rcu_pending" looks as follows:

-0!np=26111 qsp=29 rpq=5386 cbr=1 cng=570 gpc=3674 gps=577 nn=15903
-1!np=28913 qsp=35 rpq=6097 cbr=1 cng=448 gpc=3700 gps=554 nn=18113
-2!np=32740 qsp=37 rpq=6202 cbr=0 cng=476 gpc=4627 gps=546 nn=20889
-3 np=23679 qsp=22 rpq=5044 cbr=1 cng=415 gpc=3403 gps=347 nn=14469
-4!np=30714 qsp=4 rpq=5574 cbr=0 cng=528 gpc=3931 gps=639 nn=20042
-5 np=28910 qsp=2 rpq=5246 cbr=0 cng=428 gpc=4105 gps=709 nn=18422
-6!np=38648 qsp=5 rpq=7076 cbr=0 cng=840 gpc=4072 gps=961 nn=25699
-7 np=37275 qsp=2 rpq=6873 cbr=0 cng=868 gpc=3416 gps=971 nn=25147
+0!np=26111 qsp=29 rpq=5386 cbr=1 cng=570 gpc=3674 gps=577 nn=15903 ndw=0
+1!np=28913 qsp=35 rpq=6097 cbr=1 cng=448 gpc=3700 gps=554 nn=18113 ndw=0
+2!np=32740 qsp=37 rpq=6202 cbr=0 cng=476 gpc=4627 gps=546 nn=20889 ndw=0
+3 np=23679 qsp=22 rpq=5044 cbr=1 cng=415 gpc=3403 gps=347 nn=14469 ndw=0
+4!np=30714 qsp=4 rpq=5574 cbr=0 cng=528 gpc=3931 gps=639 nn=20042 ndw=0
+5 np=28910 qsp=2 rpq=5246 cbr=0 cng=428 gpc=4105 gps=709 nn=18422 ndw=0
+6!np=38648 qsp=5 rpq=7076 cbr=0 cng=840 gpc=4072 gps=961 nn=25699 ndw=0
+7 np=37275 qsp=2 rpq=6873 cbr=0 cng=868 gpc=3416 gps=971 nn=25147 ndw=0

 The fields are as follows:

@@ -432,6 +432,10 @@ o "gpc" is the number of times that an old grace period had
 o "gps" is the number of times that a new grace period had started,
   but this CPU was not yet aware of it.

+o "ndw" is the number of times that a wakeup of an rcuo
+  callback-offload kthread had to be deferred in order to avoid
+  deadlock.
+
 o "nn" is the number of times that this CPU needed nothing.

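For readers who want to consume the rcu_pending lines shown in the first hunk programmatically, the following is a minimal C sketch, not part of this commit. The names `struct rcu_pending_row` and `parse_rcu_pending()` are hypothetical, and the sketch assumes the post-change layout in which `ndw=` is the last field on the line. The character after the CPU number (`!` or a space) is recorded without interpreting it.

```c
#include <stdio.h>

/* One parsed line of rcu_pending output; members mirror the field labels. */
struct rcu_pending_row {
    int cpu;
    int marked;   /* 1 if '!' follows the CPU number, else 0 */
    unsigned long np, qsp, rpq, cbr, cng, gpc, gps, nn, ndw;
};

/*
 * Parse one line such as
 *   "3 np=23679 qsp=22 rpq=5044 cbr=1 cng=415 gpc=3403 gps=347 nn=14469 ndw=0"
 * Returns 0 on success, -1 if the line does not match the assumed layout
 * (for example, an older line that lacks the trailing ndw= field).
 */
static int parse_rcu_pending(const char *line, struct rcu_pending_row *row)
{
    char flag;
    int n = sscanf(line,
                   "%d%cnp=%lu qsp=%lu rpq=%lu cbr=%lu cng=%lu"
                   " gpc=%lu gps=%lu nn=%lu ndw=%lu",
                   &row->cpu, &flag, &row->np, &row->qsp, &row->rpq,
                   &row->cbr, &row->cng, &row->gpc, &row->gps,
                   &row->nn, &row->ndw);

    if (n != 11)
        return -1;
    row->marked = (flag == '!');
    return 0;
}
```

A caller would read the file line by line (for instance with `fgets()`) and pass each line to `parse_rcu_pending()`, treating a return of -1 as a format mismatch rather than guessing at missing fields.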

@@ -443,7 +447,7 @@ The output of "cat rcu/rcuboost" looks as follows:
 balk: nt=0 egt=6541 bt=0 nb=0 ny=126 nos=0

 This information is output only for rcu_preempt. Each two-line entry
-corresponds to a leaf rcu_node strcuture. The fields are as follows:
+corresponds to a leaf rcu_node structure. The fields are as follows:

 o "n:m" is the CPU-number range for the corresponding two-line
   entry. In the sample output above, the first entry covers

Documentation/circular-buffers.txt

Lines changed: 27 additions & 18 deletions
@@ -160,6 +160,7 @@ The producer will look something like this:
     spin_lock(&producer_lock);

     unsigned long head = buffer->head;
+    /* The spin_unlock() and next spin_lock() provide needed ordering. */
     unsigned long tail = ACCESS_ONCE(buffer->tail);

     if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
@@ -168,9 +169,8 @@ The producer will look something like this:

         produce_item(item);

-        smp_wmb(); /* commit the item before incrementing the head */
-
-        buffer->head = (head + 1) & (buffer->size - 1);
+        smp_store_release(buffer->head,
+                          (head + 1) & (buffer->size - 1));

         /* wake_up() will make sure that the head is committed before
          * waking anyone up */
@@ -183,9 +183,14 @@ This will instruct the CPU that the contents of the new item must be written
 before the head index makes it available to the consumer and then instructs the
 CPU that the revised head index must be written before the consumer is woken.

-Note that wake_up() doesn't have to be the exact mechanism used, but whatever
-is used must guarantee a (write) memory barrier between the update of the head
-index and the change of state of the consumer, if a change of state occurs.
+Note that wake_up() does not guarantee any sort of barrier unless something
+is actually awakened. We therefore cannot rely on it for ordering. However,
+there is always one element of the array left empty. Therefore, the
+producer must produce two elements before it could possibly corrupt the
+element currently being read by the consumer. Therefore, the unlock-lock
+pair between consecutive invocations of the consumer provides the necessary
+ordering between the read of the index indicating that the consumer has
+vacated a given element and the write by the producer to that same element.


 THE CONSUMER
@@ -195,21 +200,20 @@ The consumer will look something like this:

     spin_lock(&consumer_lock);

-    unsigned long head = ACCESS_ONCE(buffer->head);
+    /* Read index before reading contents at that index. */
+    unsigned long head = smp_load_acquire(buffer->head);
     unsigned long tail = buffer->tail;

     if (CIRC_CNT(head, tail, buffer->size) >= 1) {
-        /* read index before reading contents at that index */
-        smp_read_barrier_depends();

         /* extract one item from the buffer */
         struct item *item = buffer[tail];

         consume_item(item);

-        smp_mb(); /* finish reading descriptor before incrementing tail */
-
-        buffer->tail = (tail + 1) & (buffer->size - 1);
+        /* Finish reading descriptor before incrementing tail. */
+        smp_store_release(buffer->tail,
+                          (tail + 1) & (buffer->size - 1));
     }

     spin_unlock(&consumer_lock);
@@ -218,12 +222,17 @@ This will instruct the CPU to make sure the index is up to date before reading
 the new item, and then it shall make sure the CPU has finished reading the item
 before it writes the new tail pointer, which will erase the item.

-
-Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
-This prevents the compiler from discarding and reloading its cached value -
-which some compilers will do across smp_read_barrier_depends(). This isn't
-strictly needed if you can be sure that the opposition index will _only_ be
-used the once.
+Note the use of ACCESS_ONCE() and smp_load_acquire() to read the
+opposition index. This prevents the compiler from discarding and
+reloading its cached value - which some compilers will do across
+smp_read_barrier_depends(). This isn't strictly needed if you can
+be sure that the opposition index will _only_ be used the once.
+The smp_load_acquire() additionally forces the CPU to order against
+subsequent memory references. Similarly, smp_store_release() is used
+in both algorithms to write the thread's index. This documents the
+fact that we are writing to something that can be read concurrently,
+prevents the compiler from tearing the store, and enforces ordering
+against previous accesses.


 ===============
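As a rough userspace illustration of the release/acquire pairing that the updated circular-buffers.txt text describes, the following C11 sketch models a ring with one producer and one consumer. It is not the kernel code. `struct ring`, `ring_put()`, `ring_get()` and `RING_SIZE` are hypothetical names, and C11 `atomic_store_explicit()` and `atomic_load_explicit()` stand in for `smp_store_release()` and `smp_load_acquire()`. The spinlocks are omitted on the assumption that exactly one thread produces and exactly one consumes. Unlike the documented producer, which relies on the unlock-lock pair, this sketch reads the tail with an acquire load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u   /* hypothetical capacity; one slot always stays empty */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
              "RING_SIZE must be a power of two for the index mask");

struct ring {
    _Atomic unsigned long head;   /* written only by the producer */
    _Atomic unsigned long tail;   /* written only by the consumer */
    int slots[RING_SIZE];
};

/* Producer side. Returns false if the ring is full. */
static bool ring_put(struct ring *r, int value)
{
    unsigned long head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire: the consumer's reads of a vacated slot happen before our write. */
    unsigned long tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    unsigned long next = (head + 1) & (RING_SIZE - 1);

    if (next == tail)
        return false;

    r->slots[head] = value;
    /* Release: commit the item before publishing the new head index. */
    atomic_store_explicit(&r->head, next, memory_order_release);
    return true;
}

/* Consumer side. Returns false if the ring is empty. */
static bool ring_get(struct ring *r, int *value)
{
    /* Acquire: read the index before reading the contents at that index. */
    unsigned long head = atomic_load_explicit(&r->head, memory_order_acquire);
    unsigned long tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

    if (head == tail)
        return false;

    *value = r->slots[tail];
    /* Release: finish reading the slot before handing it back. */
    atomic_store_explicit(&r->tail, (tail + 1) & (RING_SIZE - 1),
                          memory_order_release);
    return true;
}
```

With `RING_SIZE` set to 8 the ring holds at most seven items, which corresponds to the "one element of the array left empty" property that the new documentation text relies on.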

Documentation/kernel-parameters.txt

Lines changed: 4 additions & 5 deletions
@@ -2627,7 +2627,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 for RCU-preempt, and "s" for RCU-sched, and "N"
 is the CPU number. This reduces OS jitter on the
 offloaded CPUs, which can be useful for HPC and
-
 real-time workloads. It can also improve energy
 efficiency for asymmetric multiprocessors.

@@ -2643,8 +2642,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 periodically wake up to do the polling.

 rcutree.blimit= [KNL]
-Set maximum number of finished RCU callbacks to process
-in one batch.
+Set maximum number of finished RCU callbacks to
+process in one batch.

 rcutree.rcu_fanout_leaf= [KNL]
 Increase the number of CPUs assigned to each
@@ -2663,8 +2662,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 value is one, and maximum value is HZ.

 rcutree.qhimark= [KNL]
-Set threshold of queued
-RCU callbacks over which batch limiting is disabled.
+Set threshold of queued RCU callbacks beyond which
+batch limiting is disabled.

 rcutree.qlowmark= [KNL]
 Set threshold of queued RCU callbacks below which

MAINTAINERS

Lines changed: 6 additions & 0 deletions
@@ -7104,6 +7104,12 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
 F: Documentation/RCU/torture.txt
 F: kernel/rcu/torture.c

+RCUTORTURE TEST FRAMEWORK
+M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
+S: Supported
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+F: tools/testing/selftests/rcutorture
+
 RDC R-321X SoC
 M: Florian Fainelli <florian@openwrt.org>
 S: Maintained

drivers/net/bonding/bond_main.c

Lines changed: 1 addition & 1 deletion
@@ -1763,7 +1763,7 @@ static int __bond_release_one(struct net_device *bond_dev,
     }

     if (all) {
-        rcu_assign_pointer(bond->curr_active_slave, NULL);
+        RCU_INIT_POINTER(bond->curr_active_slave, NULL);
     } else if (oldcurrent == slave) {
         /*
          * Note that we hold RTNL over this sequence, so there
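The commit message gives "better overhead" as a reason for this one-line change. A plausible reading is that storing NULL publishes no object contents, so the ordering that `rcu_assign_pointer()` provides has nothing to protect. The C11 sketch below is a userspace analogy under that assumption, not the kernel macros. `struct node`, `publish_node()`, `clear_node()` and `read_node()` are hypothetical names, and the acquire load is a stronger stand-in for the dependency ordering a kernel reader would normally use.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int id;
};

/* Shared pointer that readers may load concurrently. */
static _Atomic(struct node *) current_node;

/*
 * Analogue of rcu_assign_pointer(): a release store, so that a reader
 * that observes the new pointer also observes the initialised fields.
 */
static void publish_node(struct node *n)
{
    atomic_store_explicit(&current_node, n, memory_order_release);
}

/*
 * Analogue of RCU_INIT_POINTER(p, NULL): a relaxed store. No object is
 * being published, so there are no prior writes that must be ordered
 * before the pointer update.
 */
static void clear_node(void)
{
    atomic_store_explicit(&current_node, NULL, memory_order_relaxed);
}

/* Reader side: acquire load pairs with the release store above. */
static struct node *read_node(void)
{
    return atomic_load_explicit(&current_node, memory_order_acquire);
}
```

The relaxed store is only appropriate for the NULL case sketched here; storing a pointer to a freshly initialised object still needs the release variant.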

include/linux/rculist.h

Lines changed: 2 additions & 2 deletions
@@ -55,8 +55,8 @@ static inline void __list_add_rcu(struct list_head *new,
     next->prev = new;
 }
 #else
-extern void __list_add_rcu(struct list_head *new,
-        struct list_head *prev, struct list_head *next);
+void __list_add_rcu(struct list_head *new,
+                    struct list_head *prev, struct list_head *next);
 #endif

 /**
