Commit Graph

1289 Commits

Author SHA1 Message Date
Paul E. McKenney
6b3de7a172 rcutorture: Break up too-long rcu_torture_fwd_prog() function
This commit splits rcu_torture_fwd_prog_nr() and rcu_torture_fwd_prog_cr()
functions out of rcu_torture_fwd_prog() in order to reduce indentation
pain and because rcu_torture_fwd_prog() was getting a bit too long.
In addition, this will enable easier conditional execution of the
rcu_torture_fwd_prog_cr() function, which can give false-positive
failures in some NO_HZ_FULL configurations due to overloading the
housekeeping CPUs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-12-01 12:45:34 -08:00
Paul E. McKenney
fc6f9c5778 rcutorture: Remove cbflood facility
Now that the forward-progress code does a full-bore continuous callback
flood lasting multiple seconds, there is little point in also posting a
mere 60,000 callbacks every second or so.  This commit therefore removes
the old cbflood testing.  Over time, it may be desirable to concurrently
do full-bore continuous callback floods on all CPUs simultaneously, but
one dragon at a time.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-12-01 12:45:33 -08:00
Paul E. McKenney
4871848531 rcutorture: Add call_rcu() flooding forward-progress tests
This commit adds a call_rcu() flooding loop to the forward-progress test.
This emulates tight userspace loops that force call_rcu() invocations,
for example, the infamous loop containing close(open()) that instigated
the addition of blimit.  If RCU does not make sufficient forward progress
in invoking the resulting flood of callbacks, rcutorture emits a warning.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-12-01 12:45:32 -08:00
Paul E. McKenney
eaaf055f27 Merge branches 'bug.2018.11.12a', 'consolidate.2018.12.01a', 'doc.2018.11.12a', 'fixes.2018.11.12a', 'initrd.2018.11.08b', 'sil.2018.11.12a' and 'srcu.2018.11.27a' into HEAD
bug.2018.11.12a:  Get rid of BUG_ON() and friends
consolidate.2018.12.01a:  Continued RCU flavor-consolidation cleanup
doc.2018.11.12a:  Documentation updates
fixes.2018.11.12a:  Miscellaneous fixes
initrd.2018.11.08b:  Automate creation of rcutorture initrd
sil.2018.11.12a:  Remove more spin_unlock_wait() calls
2018-12-01 12:43:16 -08:00
Paul E. McKenney
aacb5d91ab srcu: Use "ssp" instead of "sp" for srcu_struct pointer
In RCU, the distinction between "rsp", "rnp", and "rdp" has served well
for a great many years, but in SRCU, "sp" vs. "sdp" has proven confusing.
This commit therefore renames SRCU's "sp" pointers to "ssp", so that there
is "ssp" for srcu_struct pointer, "snp" for srcu_node pointer, and "sdp"
for srcu_data pointer.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-27 09:24:17 -08:00
Dennis Krein
eb4c238227 srcu: Lock srcu_data structure in srcu_gp_start()
The srcu_gp_start() function is called with the srcu_struct structure's
->lock held, but not with the srcu_data structure's ->lock.  This is
problematic because this function accesses and updates the srcu_data
structure's ->srcu_cblist, which is protected by that lock.  Failing to
hold this lock can result in corruption of the SRCU callback lists,
which in turn can result in arbitrarily bad results.

This commit therefore makes srcu_gp_start() acquire the srcu_data
structure's ->lock across the calls to rcu_segcblist_advance() and
rcu_segcblist_accelerate(), thus preventing this corruption.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reported-by: Sebastian Kuzminsky <seb.kuzminsky@gmail.com>
Signed-off-by: Dennis Krein <Dennis.Krein@netapp.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Tested-by: Dennis Krein <Dennis.Krein@netapp.com>
Cc: <stable@vger.kernel.org> # 4.16.x
2018-11-27 09:23:57 -08:00
Paul E. McKenney
5f1a6ef374 rcu: Avoid signed integer overflow in rcu_preempt_deferred_qs()
Subtracting INT_MIN can be interpreted as unconditional signed integer
overflow, which according to the C standard is undefined behavior.
Therefore, kernel build arguments notwithstanding, it would be good to
future-proof the code.  This commit therefore substitutes INT_MAX for
INT_MIN in order to avoid undefined behavior.
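
As a userspace illustration of the overflow (not the kernel code itself), the sketch below checks whether a - b fits in an int and round-trips a small nesting count through an INT_MAX bias.  The names sub_would_overflow(), bias_round_trip(), and NEST_BIAS are hypothetical and chosen only for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical name for the bias applied to a nesting counter. */
#define NEST_BIAS INT_MAX

/* Return true if computing a - b in type int would overflow. */
static bool sub_would_overflow(int a, int b)
{
	if (b > 0)
		return a < INT_MIN + b;
	return a > INT_MAX + b;
}

/* Apply the bias to a small non-negative nesting count, then undo it. */
static int bias_round_trip(int nesting)
{
	int biased = nesting - NEST_BIAS;	/* Safe for nesting >= -1. */

	return biased + NEST_BIAS;
}
```

For any non-negative a, sub_would_overflow(a, INT_MIN) is true, whereas subtracting INT_MAX from the same values stays within range.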

While in the neighborhood, this commit also creates some meaningful names
for INT_MAX and friends in order to improve readability, as suggested
by Joel Fernandes.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
117f683c6e rcu: Replace this_cpu_ptr() with __this_cpu_read()
Because __this_cpu_read() can be lighter weight than equivalent uses of
this_cpu_ptr(), this commit replaces the latter with the former.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
05f415715c rcu: Speed up expedited GPs when interrupting RCU reader
In PREEMPT kernels, an expedited grace period might send an IPI to a
CPU that is executing an RCU read-side critical section.  In that case,
it would be nice if the rcu_read_unlock() directly interacted with the
RCU core code to immediately report the quiescent state.  And this does
happen in the case where the reader has been preempted.  But it would
also be a nice performance optimization if immediate reporting also
happened in the preemption-free case.

This commit therefore adds an ->exp_hint field to the task_struct structure's
->rcu_read_unlock_special field.  The IPI handler sets this hint when
it has interrupted an RCU read-side critical section, and this causes
the outermost rcu_read_unlock() call to invoke rcu_read_unlock_special(),
which, if preemption is enabled, reports the quiescent state immediately.
If preemption is disabled, then the report is required to be deferred
until preemption (or bottom halves or interrupts or whatever) is re-enabled.

Because this is a hint, it does nothing for more complicated cases.  For
example, if the IPI interrupts an RCU reader, but interrupts are disabled
across the rcu_read_unlock(), but another rcu_read_lock() is executed
before interrupts are re-enabled, the hint will already have been cleared.
If you do crazy things like this, reporting will be deferred until some
later RCU_SOFTIRQ handler, context switch, cond_resched(), or similar.
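
The hint's behavior in the simple cases can be sketched with a toy userspace model.  This is an assumption-laden illustration rather than the kernel implementation: struct reader, its fields other than exp_hint, and the functions below are hypothetical, and the model treats "preemption disabled" as a single flag.

```c
#include <stdbool.h>

struct reader {
	int nesting;		/* Read-side critical-section depth. */
	bool exp_hint;		/* Set by the modeled IPI handler. */
	bool preempt_disabled;	/* Simplified stand-in for the CPU's state. */
	bool qs_reported;	/* Quiescent state reported immediately. */
	bool qs_deferred;	/* Report left for a later event. */
};

static void reader_lock(struct reader *r)
{
	r->nesting++;
}

/* Modeled IPI from an expedited grace period. */
static void exp_ipi(struct reader *r)
{
	if (r->nesting > 0)
		r->exp_hint = true;	/* Interrupted a reader: leave a hint. */
	else
		r->qs_reported = true;	/* Not in a reader (model assumption). */
}

static void reader_unlock(struct reader *r)
{
	if (--r->nesting > 0 || !r->exp_hint)
		return;			/* Nested, or nothing special to do. */
	r->exp_hint = false;
	if (r->preempt_disabled)
		r->qs_deferred = true;
	else
		r->qs_reported = true;
}
```

Only the outermost reader_unlock() acts on the hint, which mirrors the description above.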

Reported-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
0a89e5a402 rcu: Trace end of grace period before end of grace period
Currently, rcu_gp_cleanup() traces the end of the old grace period after
the old grace period has officially ended.  This might make intuitive
sense, but it also makes for confusing event-trace output because the
"end" trace displays not the old but instead the new grace-period number.
This commit therefore traces the end of an old grace period just before
that grace period officially ends.

Reported-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Zhouyi Zhou
2320bda26d rcu: Adjust the comment of function rcu_is_watching
Because RCU avoids interrupting idle CPUs, rcu_is_watching() is used to
test whether or not it is currently legal to run RCU read-side critical
sections on this CPU.  However, the first and last sentences of the
current header comment for rcu_is_watching() state the opposite of what
is intended.  This commit therefore fixes this header comment.

Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
c669c014d1 rcu: Add jiffies-since-GP-activity to show_rcu_gp_kthreads()
This commit adds a printout of the number of jiffies since the last time
that the RCU grace-period kthread did any processing.  This can be useful
when tracking down forward-progress issues.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
691960197e rcu: Add state name to show_rcu_gp_kthreads() output
This commit adds the name of the RCU grace-period state to
the show_rcu_gp_kthreads() output in order to ease debugging.
This commit also moves gp_state_getname() up in the code so that
show_rcu_gp_kthreads() can use it.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
791416c471 rcu: Parameterize rcu_check_gp_start_stall()
In order to debug forward-progress stalls, it is necessary to check
for excessively delayed grace-period starts.  This is currently done
for RCU CPU stall warnings by rcu_check_gp_start_stall(), which checks
to see if the start of a requested grace period has been delayed by an
RCU CPU stall warning period.  Because rcutorture will need to check
for the time consumed by an RCU forward-progress delay, this commit
promotes gpssdelay from a local variable to a formal parameter.  It is
not necessary to export rcu_check_gp_start_stall() because rcutorture
will access it via a wrapper function.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
b3c1d9ec7c rcu: Avoid double multiply by HZ
The rcu_check_gp_start_stall() function multiplies the return value
from rcu_jiffies_till_stall_check() by HZ, but the units are already
in jiffies.  This commit therefore avoids the need for introduction of
a jiffies-squared unit by removing the extraneous multiplication.
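
The units problem can be shown with a small sketch.  The HZ value and all three function names here are assumptions made for illustration and are not taken from the kernel source.

```c
#define HZ 250UL	/* Assumed tick rate, for illustration only. */

/* Hypothetical helper: a stall timeout already expressed in jiffies. */
static unsigned long stall_timeout_jiffies(unsigned long seconds)
{
	return seconds * HZ;
}

/* Buggy: multiplies a jiffies value by HZ a second time. */
static unsigned long gp_start_delay_buggy(unsigned long seconds)
{
	return stall_timeout_jiffies(seconds) * HZ;
}

/* Fixed: the helper's return value is used as is. */
static unsigned long gp_start_delay_fixed(unsigned long seconds)
{
	return stall_timeout_jiffies(seconds);
}
```

With these assumed values, a 21-second timeout is 5250 jiffies, but the buggy variant yields 1312500 jiffies, which corresponds to 5250 seconds.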

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 09:03:59 -08:00
Paul E. McKenney
f0ad56e876 rcu: Eliminate BUG_ON() for kernel/rcu/update.c
The update.c file has a number of calls to BUG_ON(), which panics the
kernel, which is not a good strategy for devices (like embedded) that
don't have a way to capture console output.  This commit therefore
converts these BUG_ON() calls to WARN_ON_ONCE() and WARN_ONCE().
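
The general pattern of such a conversion can be sketched in userspace as follows.  This is a stand-in rather than the kernel's WARN_ON_ONCE() macro: warn_on_once() and set_depth() are hypothetical names, and the point is only that the check complains once and lets the caller recover instead of halting.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace stand-in for a warn-once check: report the first time the
 * condition is true, stay quiet afterwards, and return the condition so
 * that the caller can bail out instead of halting.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical caller: reject a bad argument but keep running. */
static int set_depth(int *depth, int new_depth)
{
	static bool warned;

	if (warn_on_once(new_depth < 0, &warned, "negative depth"))
		return -1;
	*depth = new_depth;
	return 0;
}
```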

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:15:59 -08:00
Paul E. McKenney
9213784b48 rcu: Eliminate BUG_ON() for kernel/rcu/tree_plugin.h
The tree_plugin.h file has a number of calls to BUG_ON(), which panics
the kernel, which is not a good strategy for devices (like embedded)
that don't have a way to capture console output.  This commit therefore
converts these BUG_ON() calls to WARN_ON_ONCE() and WARN_ONCE().

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Fix typo: s/rcuo/rcub/. ]
2018-11-12 08:15:16 -08:00
Paul E. McKenney
9cac83a57e rcu: Stop expedited grace periods from relying on stop-machine
The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
to prevent the cpu_online_mask from changing.  However, this relies on
the stop-machine mechanism in the CPU-hotplug offline code, which is not
desirable (it would be good to someday remove the stop-machine mechanism).

This commit therefore instead uses the relevant leaf rcu_node structure's
->ffmask, which has a bit set for all CPUs that are fully functional.
A given CPU's bit is cleared very early during offline processing by
rcutree_offline_cpu() and set very late during online processing by
rcutree_online_cpu().  Therefore, if a CPU's bit is set in this mask, and
preemption is disabled, we have to be before the synchronize_sched() in
the CPU-hotplug offline code, which means that the CPU is guaranteed to be
workqueue-ready throughout the duration of the enclosing preempt_disable()
region of code.

This also has the side-effect of using WORK_CPU_UNBOUND if all the CPUs for
this leaf rcu_node structure are offline, which is an acceptable difference
in behavior.
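
A minimal sketch of the selection logic, assuming that bit i of the mask corresponds to the i-th CPU covered by the leaf and that the lowest set bit is chosen.  CPU_UNBOUND is a hypothetical stand-in for WORK_CPU_UNBOUND, and pick_workqueue_cpu() is not a kernel function.

```c
#define CPU_UNBOUND (-1)	/* Hypothetical stand-in for WORK_CPU_UNBOUND. */

/*
 * Pick the lowest-numbered fully functional CPU covered by a leaf node.
 * Bit i of ffmask corresponds to CPU grplo + i.
 */
static int pick_workqueue_cpu(unsigned long ffmask, int grplo, int grphi)
{
	int i;

	for (i = 0; i <= grphi - grplo; i++)
		if (ffmask & (1UL << i))
			return grplo + i;
	return CPU_UNBOUND;	/* Every CPU in this leaf is offline. */
}
```

For a leaf covering CPUs 16 through 31 with only bit 2 set, the sketch selects CPU 18; with an empty mask it falls back to the unbound case.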

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-11 11:23:01 -08:00
Paul E. McKenney
0607ba8403 srcu: Prevent __call_srcu() counter wrap with read-side critical section
Ever since cdf7abc461 ("srcu: Allow use of Tiny/Tree SRCU from
both process and interrupt context"), it has been permissible
to use SRCU read-side critical sections in interrupt context.
This allows __call_srcu() to use SRCU read-side critical sections to
prevent a new SRCU grace period from ending before the call to either
srcu_funnel_gp_start() or srcu_funnel_exp_start() completes, thus preventing
SRCU grace-period counter overflow during that time.

Note that this does not permit removal of the counter-wrap checks in
srcu_gp_end().  These checks are necessary to handle the case where
a given CPU does not interact at all with SRCU for an extended time
period.

This commit therefore adds an SRCU read-side critical section to
__call_srcu() in order to prevent grace period counter wrap during
the funnel-locking process.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:54:14 -08:00
Paul E. McKenney
d3ff3891b2 rcu: Consolidate the RCU update functions invoked by sync.c
This commit retains all the various gp_ops[] entries, but makes their
update functions all be synchronize_rcu(), call_rcu() and rcu_barrier().
The read-side checks remain consistent with the various RCU flavors,
which still exist on the read side.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2018-11-08 21:43:20 -08:00
Paul E. McKenney
309ba859b9 rcu: Eliminate synchronize_rcu_mult()
Now that synchronize_rcu() waits for both RCU read-side critical
sections and preempt-disabled regions of code, the sole caller of
synchronize_rcu_mult() can be replaced by synchronize_rcu().
This patch makes this change and removes synchronize_rcu_mult().
Note that _wait_rcu_gp() still supports synchronize_rcu_mult(),
and thus might be simplified in the future to take only
a single call_rcu() function rather than the current list of them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:43:20 -08:00
Joel Fernandes (Google)
adbccddb4a rcu: Fix rcu_{node,data} comments about gp_seq_needed
Recent changes have removed the old ->gp_seq_needed field from the
rcu_state structure, which in turn obsoleted a couple of comments in
the rcu_node and rcu_data structures.  This commit therefore updates
these comments accordingly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:43:20 -08:00
Joel Fernandes (Google)
75a8f72245 rcu: Remove unused rcu_state externs
The rcu_bh_state and rcu_sched_state variables were removed during the
RCU flavor consolidations, but external declarations remain in tree.h.
This commit therefore removes these obsolete declarations.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:43:20 -08:00
Paul E. McKenney
08543bda42 rcu: Eliminate BUG_ON() for kernel/rcu/tree.c
The tree.c file has a number of calls to BUG_ON(), which panics the
kernel, which is not a good strategy for devices (like embedded) that
don't have a way to capture console output.  This commit therefore
converts these BUG_ON() calls to WARN_ON_ONCE() and WARN_ONCE().

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:41:57 -08:00
Paul E. McKenney
042d4c70a2 rcu: Eliminate BUG_ON() for sync.c
The sync.c file has a number of calls to BUG_ON(), which panics the
kernel, which is not a good strategy for devices (like embedded) that
don't have a way to capture console output.  This commit therefore
changes these BUG_ON() calls to WARN_ON_ONCE(), but does so quite naively.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2018-11-08 21:41:57 -08:00
Paul E. McKenney
b56ada1209 Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' and 'torture.2018.08.29a' into HEAD
doc.2018.08.30a: Documentation updates
dynticks.2018.08.30b: RCU flavor consolidation updates and cleanups
srcu.2018.08.30b: SRCU updates
torture.2018.08.29a: Torture-test updates
2018-08-30 16:12:53 -07:00
Paul E. McKenney
4e6ea4ef56 srcu: Make early-boot call_srcu() reuse workqueue lists
Allocating a list_head structure that is almost never used, and, when
used, is used only during early boot (rcu_init() and earlier), is a bit
wasteful.  This commit therefore eliminates that list_head in favor of
the one in the work_struct structure.  This is safe because the work_struct
structure cannot be used until after rcu_init() returns.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-30 16:10:49 -07:00
Paul E. McKenney
e0fcba9ac0 srcu: Make call_srcu() available during very early boot
Event tracing is moving to SRCU in order to take advantage of the fact
that SRCU may be safely used from idle and even offline CPUs.  However,
event tracing can invoke call_srcu() very early in the boot process,
even before workqueue_init_early() is invoked (let alone rcu_init()).
Therefore, call_srcu()'s attempts to queue work fail miserably.

This commit therefore detects this situation, and refrains from attempting
to queue work before rcu_init() time, but does everything else that it
would have done, and in addition, adds the srcu_struct to a global list.
The rcu_init() function now invokes a new srcu_init() function, which
is empty if CONFIG_SRCU=n.  Otherwise, srcu_init() queues work for
each srcu_struct on the list.  This all happens early enough in boot
that there is but a single CPU with interrupts disabled, which allows
synchronization to be dispensed with.
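
The deferral scheme described above can be modeled in userspace roughly as follows.  Everything here is hypothetical naming (struct domain, early_post(), late_init()); the sketch only shows the "remember now, queue at init time" idea and, like the early-boot environment it imitates, uses no locking.

```c
#include <stdbool.h>
#include <stddef.h>

struct domain {
	struct domain *next_boot;	/* Link on the early-boot list. */
	bool on_boot_list;
	int work_queued;		/* Stand-in for queued work items. */
};

static bool init_done;			/* Set once the modeled init has run. */
static struct domain *boot_list;	/* Domains waiting for init. */

/* Modeled call path: queue work if possible, else remember the domain. */
static void early_post(struct domain *d)
{
	if (init_done) {
		d->work_queued++;
		return;
	}
	if (!d->on_boot_list) {
		d->on_boot_list = true;
		d->next_boot = boot_list;
		boot_list = d;
	}
}

/* Modeled init: queue work once for every domain on the list. */
static void late_init(void)
{
	init_done = true;
	while (boot_list) {
		struct domain *d = boot_list;

		boot_list = d->next_boot;
		d->on_boot_list = false;
		d->work_queued++;
	}
}
```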

Of course, the queued work won't actually be invoked until after
workqueue_init() is invoked, which happens shortly after the scheduler
is up and running.  This means that although call_srcu() may be invoked
any time after per-CPU variables have been set up, there is still a very
narrow window when synchronize_srcu() won't work, and this window
extends from the time that the scheduler starts until the time that
workqueue_init() returns.  This can be fixed in a manner similar to
the fix for synchronize_rcu_expedited() and friends, but until someone
actually needs to use synchronize_srcu() during this window, this fix
is added churn for no benefit.

Finally, note that Tree SRCU's new srcu_init() function invokes
queue_work() rather than the queue_delayed_work() function that is
invoked post-boot.  The reason is that queue_delayed_work() will (as you
would expect) post a timer, and timers have not yet been initialized.
So use of queue_work() avoids the complaints about use of uninitialized
spinlocks that would otherwise result.  Besides, some delay is already
provided by the aforementioned fact that the queued work won't actually
be invoked until after the scheduler is up and running.

Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-30 16:10:19 -07:00
Mike Galbraith
894d45bbf7 rcu: Convert rcu_state.ofl_lock to raw_spinlock_t
1e64b15a4b ("rcu: Fix grace-period hangs due to race with CPU offline")
added spinlock_t ofl_lock to the rcu_state structure, then takes it with
preemption disabled during CPU offline, which gives the -rt patchset's
sleeping spinlock heartburn.

This commit therefore converts ->ofl_lock to raw_spinlock_t.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2018-08-30 16:03:54 -07:00
Paul E. McKenney
8d8a9d0e7e rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed
The rcu_data structure's ->dynticks_fqs field is incremented but never
accessed.  Its ->cond_resched_completed field isn't used at all.
This commit therefore removes both fields.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:53 -07:00
Paul E. McKenney
dc5a4f2932 rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks
This commit moves ->dynticks from the rcu_dynticks structure to the
rcu_data structure, replacing the field of the same name.  It also updates
the code to access ->dynticks from the rcu_data structure and to use the
rcu_data structure rather than following the now-gone ->dynticks field
to the now-gone rcu_dynticks structure.  While in the area, this commit
also fixes up comments.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:52 -07:00
Paul E. McKenney
4c5273bf2b rcu: Switch dyntick nesting counters to rcu_data structure
This commit removes ->dynticks_nesting and ->dynticks_nmi_nesting from
the rcu_dynticks structure and updates the code to access them from the
rcu_data structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:51 -07:00
Paul E. McKenney
2dba13f0b6 rcu: Switch urgent quiescent-state requests to rcu_data structure
This commit removes ->rcu_need_heavy_qs and ->rcu_urgent_qs from the
rcu_dynticks structure and updates the code to access them from the
rcu_data structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:50 -07:00
Paul E. McKenney
c458a89e96 rcu: Switch lazy counts to rcu_data structure
This commit removes ->all_lazy, ->nonlazy_posted and ->nonlazy_posted_snap
from the rcu_dynticks structure and updates the code to access them from
the rcu_data structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:49 -07:00
Paul E. McKenney
5998a75adb rcu: Switch last accelerate/advance to rcu_data structure
This commit removes ->last_accelerate and ->last_advance_all from the
rcu_dynticks structure and updates the code to access them from the
rcu_data structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:48 -07:00
Paul E. McKenney
0fd79e7521 rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure
This commit removes ->tick_nohz_enabled_snap from the rcu_dynticks
structure and updates the code to access it from the rcu_data
structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:47 -07:00
Paul E. McKenney
cc72046cc3 rcu: Merge rcu_dynticks structure into rcu_data structure
Now that there is only ever one rcu_data structure per CPU, there is no
need for a separate rcu_dynticks structure.  This commit therefore adds
the rcu_dynticks fields into the rcu_data structure in preparation for
removing the rcu_dynticks structure entirely.  Note that the ->dynticks
field will be handled specially because there is a field by that name
in both structures.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:47 -07:00
Paul E. McKenney
df63fa5bc1 rcu: Convert "1UL << x" to "BIT(x)"
This commit saves a few characters by converting "1UL << x" to "BIT(x)".
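
For illustration, a definition assumed to have the same effect as the kernel's BIT() helper, together with a hypothetical mask_with_cpu() user:

```c
/* Assumed equivalent of the kernel helper: one bit in an unsigned long. */
#define BIT(nr) (1UL << (nr))

/* Hypothetical example: mark one CPU within a mask. */
static unsigned long mask_with_cpu(unsigned long mask, int cpu)
{
	return mask | BIT(cpu);
}
```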

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:46 -07:00
Paul E. McKenney
fced9c8cfe rcu: Avoid resched_cpu() when rescheduling the current CPU
The resched_cpu() interface is quite handy, but it does acquire the
specified CPU's runqueue lock, which does not come for free.  This
commit therefore substitutes the following when directing resched_cpu()
at the current CPU:

	set_tsk_need_resched(current);
	set_preempt_need_resched();

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2018-08-30 16:03:45 -07:00
Paul E. McKenney
d3052109c0 rcu: More aggressively enlist scheduler aid for nohz_full CPUs
Because nohz_full CPUs can leave the scheduler-clock interrupt disabled
even when in kernel mode, RCU cannot rely on rcu_check_callbacks() to
enlist the scheduler's aid in extracting a quiescent state from such CPUs.
This commit therefore more aggressively uses resched_cpu() on nohz_full
CPUs that fail to pass through a quiescent state in a timely manner.
By default, the resched_cpu() beating starts 300 milliseconds into the
quiescent state.

While in the neighborhood, add a ->last_fqs_resched field to the rcu_data
structure in order to rate-limit resched_cpu() calls from the RCU
grace-period kthread.
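
Rate limiting of this kind can be sketched with a wraparound-safe tick comparison.  This is a userspace model under stated assumptions: struct cpu_state, ticks_after(), maybe_nudge(), and the interval parameter are hypothetical, and only the ->last_fqs_resched field name comes from the description above.

```c
#include <stdbool.h>

/* Wraparound-safe "a is after b" for free-running tick counters. */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

struct cpu_state {
	unsigned long last_fqs_resched;	/* Tick of the most recent nudge. */
	int nudges;			/* Stand-in for resched_cpu() calls. */
};

/* Nudge the CPU only if more than "interval" ticks passed since the last one. */
static bool maybe_nudge(struct cpu_state *cs, unsigned long now,
			unsigned long interval)
{
	if (!ticks_after(now, cs->last_fqs_resched + interval))
		return false;
	cs->nudges++;
	cs->last_fqs_resched = now;
	return true;
}
```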

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:44 -07:00
Paul E. McKenney
c06aed0e31 rcu: Compute jiffies_till_sched_qs from other kernel parameters
The jiffies_till_sched_qs value is used to determine how old a grace period
must be before RCU enlists the help of the scheduler to force a quiescent
state on the holdout CPU.  Currently, this defaults to HZ/10 regardless of
system size and may be set only at boot time.  This can be a problem for
very large systems, because if the values of the jiffies_till_first_fqs
and jiffies_till_next_fqs kernel parameters are left at their defaults,
they are calculated to increase as the number of CPUs actually configured
on the system increases.  Thus, on a sufficiently large system, RCU would
enlist the help of the scheduler before the grace-period kthread had a
chance to scan for idle CPUs, which wastes CPU time.

This commit therefore allows jiffies_till_sched_qs to be set, if desired,
but if left as default, computes it as jiffies_till_first_fqs plus twice
jiffies_till_next_fqs, thus allowing three force-quiescent-state scans
for idle CPUs.  This scales with the number of CPUs, providing sensible
default values.
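
The default computation described above amounts to the following arithmetic.  This sketch covers only what the message states; SCHED_QS_UNSET and effective_sched_qs() are hypothetical names, and any additional clamping the kernel may apply is not modeled.

```c
#include <limits.h>

#define SCHED_QS_UNSET ULONG_MAX	/* Hypothetical "left at default" marker. */

static unsigned long effective_sched_qs(unsigned long sched_qs,
					unsigned long first_fqs,
					unsigned long next_fqs)
{
	if (sched_qs != SCHED_QS_UNSET)
		return sched_qs;		/* Explicitly set: use as is. */
	return first_fqs + 2 * next_fqs;	/* Room for three FQS scans. */
}
```

For example, with both scan intervals at 3 jiffies the computed default is 9 jiffies, and with both at 8 jiffies it is 24.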

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:43 -07:00
Paul E. McKenney
74de6960c9 rcu: Provide functions for determining if call_rcu() has been invoked
This commit adds rcu_head_init() and rcu_head_after_call_rcu() functions
to help RCU users detect when another CPU has passed the specified
rcu_head structure and function to call_rcu().  The rcu_head_init()
should be invoked before making the structure visible to RCU readers,
and then the rcu_head_after_call_rcu() may be invoked from within
an RCU read-side critical section on an rcu_head structure that
was obtained during a traversal of the data structure in question.
The rcu_head_after_call_rcu() function will return true if the rcu_head
structure has already been passed (with the specified function) to
call_rcu(), otherwise it will return false.

If rcu_head_init() has not been invoked on the rcu_head structure
or if the rcu_head (AKA callback) has already been invoked, then
rcu_head_after_call_rcu() will do WARN_ON_ONCE().
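
The intended usage can be modeled in userspace as shown below.  This is a sketch, not the kernel implementation: struct head, head_init(), post_callback(), and head_after_post() are hypothetical stand-ins for rcu_head, rcu_head_init(), call_rcu(), and rcu_head_after_call_rcu(), a sentinel function marks the "initialized but not posted" state, and a flag takes the place of WARN_ON_ONCE().

```c
#include <stdbool.h>
#include <stddef.h>

struct head;
typedef void (*callback_t)(struct head *);

struct head {
	callback_t func;
};

/* Sentinel marking a head that is initialized but not yet posted. */
static void head_not_posted(struct head *hp)
{
	(void)hp;
}

static void head_init(struct head *hp)
{
	hp->func = head_not_posted;
}

/* Stand-in for call_rcu(): record the function; invocation is not modeled. */
static void post_callback(struct head *hp, callback_t f)
{
	hp->func = f;
}

/* True once hp has been posted with f; *warn is set on unexpected states. */
static bool head_after_post(const struct head *hp, callback_t f, bool *warn)
{
	*warn = false;
	if (hp->func == f)
		return true;
	*warn = hp->func != head_not_posted;
	return false;
}

/* Hypothetical callback used by the example. */
static int freed;
static void free_cb(struct head *hp)
{
	(void)hp;
	freed++;
}
```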

Reported-by: NeilBrown <neilb@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Apply neilb naming feedback. ]
2018-08-30 16:03:42 -07:00
Paul E. McKenney
7e28c5af4e rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure
The ->rcu_qs_ctr counter was intended to allow providing a lightweight
report of a quiescent state to all RCU flavors.  But now that there is
only one flavor of RCU in any one running kernel, there is no point in
having this feature.  This commit therefore removes the ->rcu_qs_ctr
field from the rcu_dynticks structure and the ->rcu_qs_ctr_snap field
from the rcu_data structure.  This results in the "rqc" option to the
rcu_fqs trace event no longer being used, so this commit also removes the
"rqc" description from the header comment.

While in the neighborhood, this commit also causes the forward-progress
request .rcu_need_heavy_qs to be set one jiffies_till_sched_qs interval
later in the grace period than the first setting of .rcu_urgent_qs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:42 -07:00
Paul E. McKenney
c5bacd9417 rcu: Motivate Tiny RCU forward progress
If a long-running CPU-bound in-kernel task invokes call_rcu(), the
callback won't be invoked until the next context switch.  If there are
no other runnable tasks (which is not an uncommon situation on deep
embedded systems), the callback might never be invoked.

This commit therefore causes rcu_check_callbacks() to ask the scheduler
for a context switch if there are callbacks posted that are still waiting
for a grace period.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:41 -07:00
Paul E. McKenney
c116dba68d rcutorture: Dump reader protection sequence if failures or close calls
Now that RCU can have readers with multiple segments, it is quite
possible that a specific sequence of reader segments might result in
an rcutorture failure (reader spans a full grace period as detected
by one of the grace-period primitives) or an rcutorture close call
(reader potentially spans a full grace period based on reading out
the RCU implementation's grace-period counter, but with no ordering).
In such cases, it would clearly ease debugging if the offending specific
sequence was known.  For the first reader encountering a failure or a
close call, this commit therefore dumps out the segments, delay durations,
and whether or not the reader was preempted.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Mark variables static, as suggested by kbuild test robot. ]
2018-08-30 16:03:40 -07:00
Paul E. McKenney
a0ef9ec241 rcu: Provide improved interrupt-from-idle check in rcu_check_callbacks()
The patch making need_resched() respond to urgent RCU-QS needs used
is_idle_task(current) to detect an interrupt from idle, which does work
reasonably, but is (in theory at least) vulnerable to loops containing
need_resched() invoked from within RCU_NONIDLE() or its tracepoint
equivalent.  This commit therefore moves rcu_is_cpu_rrupt_from_idle()
to a place from which rcu_check_callbacks() can invoke it and replaces
the is_idle_task(current) with rcu_is_cpu_rrupt_from_idle().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:39 -07:00
Paul E. McKenney
92aa39e9dc rcu: Make need_resched() respond to urgent RCU-QS needs
The per-CPU rcu_dynticks.rcu_urgent_qs variable communicates an urgent
need for an RCU quiescent state from the force-quiescent-state processing
within the grace-period kthread to context switches and to cond_resched().
Unfortunately, such urgent needs are not communicated to need_resched(),
which is sometimes used to decide when to invoke cond_resched(), for
but one example, within the KVM vcpu_run() function.  As of v4.15, this
can result in synchronize_sched() being delayed by up to ten seconds,
which can be problematic, to say nothing of annoying.

This commit therefore checks rcu_dynticks.rcu_urgent_qs from within
rcu_check_callbacks(), which is invoked from the scheduling-clock
interrupt handler.  If the current task is not an idle task and is
not executing in usermode, a context switch is forced, and either way,
the rcu_dynticks.rcu_urgent_qs variable is set to false.  If the current
task is an idle task, then RCU's dyntick-idle code will detect the
quiescent state, so no further action is required.  Similarly, if the
task is executing in usermode, other code in rcu_check_callbacks() and
its called functions will report the corresponding quiescent state.
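
The decision made at each scheduling-clock tick can be summarized by the following userspace model.  It is a sketch under stated assumptions: struct tick_ctx and tick_check_urgent() are hypothetical, and the idle and user arguments stand in for the kernel's own checks of what the interrupt arrived on top of.

```c
#include <stdbool.h>

struct tick_ctx {
	bool urgent_qs;		/* Grace-period kthread wants a quiescent state. */
	bool need_resched;	/* Set to force a context switch. */
};

/* Modeled tick; "idle" and "user" describe what was interrupted. */
static void tick_check_urgent(struct tick_ctx *c, bool idle, bool user)
{
	if (!c->urgent_qs)
		return;
	/* Idle and usermode are reported as quiescent states by other paths. */
	if (!idle && !user)
		c->need_resched = true;
	c->urgent_qs = false;	/* Cleared either way. */
}
```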

Reported-by: Marius Hillenbrand <mhillenb@amazon.de>
Reported-by: David Woodhouse <dwmw2@infradead.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:39 -07:00
Paul E. McKenney
dd46a7882c rcu: Inline _rcu_barrier() into its sole remaining caller
Because rcu_barrier() is a one-line wrapper function for _rcu_barrier()
and because nothing else calls _rcu_barrier(), this commit inlines
_rcu_barrier() into rcu_barrier().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:39 -07:00
Paul E. McKenney
395a2f097e rcu: Define rcu_all_qs() only in !PREEMPT builds
Now that rcu_all_qs() is used only in !PREEMPT builds, move it to
tree_plugin.h so that it is defined only in those builds.  This in
turn means that rcu_momentary_dyntick_idle() is only used in !PREEMPT
builds, but it is simply marked __maybe_unused in order to keep it
near the rest of the dyntick-idle code.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:37 -07:00
Paul E. McKenney
06462efc80 rcu: Clean up flavor-related definitions and comments in update.c
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:36 -07:00
Paul E. McKenney
0ae86a2726 rcu: Clean up flavor-related definitions and comments in tree_plugin.h
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:35 -07:00
Paul E. McKenney
8fa946d428 rcu: Clean up flavor-related definitions and comments in tree_exp.h
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:35 -07:00
Paul E. McKenney
49918a54e6 rcu: Clean up flavor-related definitions and comments in tree.c
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:34 -07:00
Paul E. McKenney
679d3f3092 rcu: Clean up flavor-related definitions and comments in tiny.c
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:34 -07:00
Paul E. McKenney
6eb95cc450 rcu: Clean up flavor-related definitions and comments in srcutree.h
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:34 -07:00
Paul E. McKenney
62a1a94536 rcu: Clean up flavor-related definitions and comments in rcutorture.c
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:33 -07:00
Paul E. McKenney
7f87c036fe rcu: Clean up flavor-related definitions and comments in rcu.h
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:33 -07:00
Paul E. McKenney
8c1cf2da6f rcu: Clean up flavor-related definitions and comments in Kconfig
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:32 -07:00
Paul E. McKenney
de3875d302 rcu: Remove now-unused rcutorture APIs
This commit removes rcu_sched_get_gp_seq(), rcu_bh_get_gp_seq(),
rcu_exp_batches_completed_sched(), rcu_sched_force_quiescent_state(),
and rcu_bh_force_quiescent_state(), which are no longer used because
rcutorture no longer does "rcu_bh" and "rcu_sched" torture types.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:30 -07:00
Paul E. McKenney
620d246065 rcuperf: Remove the "rcu_bh" and "sched" torture types
Now that the RCU-bh and RCU-sched update-side functions are simple
wrappers around their RCU counterparts, there isn't a whole lot of point
in testing them.  This commit therefore removes the "rcu_bh" and "sched"
torture types from rcuperf.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:30 -07:00
Paul E. McKenney
c770c82a23 rcutorture: Remove the "rcu_bh" and "sched" torture types
Now that the RCU-bh and RCU-sched update-side functions are simple
wrappers around their RCU counterparts, there isn't a whole lot of point
in testing them.  This commit therefore removes the "rcu_bh" and "sched"
torture types from rcutorture.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:29 -07:00
Paul E. McKenney
72ce30dd1f rcu: Stop testing RCU-bh and RCU-sched
Now that the RCU-bh and RCU-sched update-side functions are simple
wrappers around their RCU counterparts, there isn't a whole lot of
point in testing them.  This commit therefore removes the self-test
capability and removes the corresponding kernel-boot parameters.
It also updates the various rcutorture .boot files to remove the
kernel boot parameters that call for testing RCU-bh and RCU-sched.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:29 -07:00
Paul E. McKenney
2ceebc0350 rcutorture: Add RCU-bh and RCU-sched support for extended readers
Since there is now a single consolidated RCU flavor, rcutorture
needs to test extending of RCU readers via rcu_read_lock_bh() and
rcu_read_lock_sched().  This commit adds this support, with added checks
(just like for local_bh_enable()) to ensure that rcu_read_unlock_bh()
will not be invoked while interrupts are disabled.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:27 -07:00
Paul E. McKenney
a8bb74acd8 rcu: Consolidate RCU-sched update-side function definitions
This commit saves a few lines by consolidating the RCU-sched function
definitions at the end of include/linux/rcupdate.h.  This consolidation
also makes it easier to remove them all when the time comes.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:26 -07:00
Paul E. McKenney
4c7e9c1434 rcu: Consolidate RCU-bh update-side function definitions
This commit saves a few lines by consolidating the RCU-bh function
definitions at the end of include/linux/rcupdate.h.  This consolidation
also makes it easier to remove them all when the time comes.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:25 -07:00
Paul E. McKenney
c3854a055b rcu: Pull rcu_gp_kthread() FQS loop into separate function
The rcu_gp_kthread() function is long and deeply indented, so this
commit pulls the loop that repeatedly invokes rcu_gp_fqs() into a new
rcu_gp_fqs_loop() function.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:24 -07:00
Paul E. McKenney
4e95020cdd rcu: Inline increment_cpu_stall_ticks() into its sole caller
Consolidation of the RCU flavors into one makes increment_cpu_stall_ticks()
a trivial one-line function with only one caller.  This commit therefore
inlines it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:23 -07:00
Paul E. McKenney
8ff0b90780 rcu: Fix typo in force_qs_rnp()'s parameter's parameter
Pointers to rcu_data structures should be named rdp, not rsp.  This
commit therefore makes this change.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:23 -07:00
Paul E. McKenney
eb7a665388 rcu: Eliminate initialization-time use of rsp
Now that there is only one rcu_state structure, there is less point in
maintaining a pointer to it.  This commit therefore replaces rsp with
&rcu_state in rcu_cpu_starting() and rcu_init_one().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:22 -07:00
Paul E. McKenney
ec9f5835f7 rcu: Eliminate RCU-barrier use of rsp
Now that there is only one rcu_state structure, there is less point
in maintaining a pointer to it.  This commit therefore replaces rsp
with &rcu_state in rcu_barrier_callback(), rcu_barrier_func(), and
_rcu_barrier().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:22 -07:00
Paul E. McKenney
67a0edbf3c rcu: Eliminate quiescent-state and grace-period-nonstart use of rsp
Now that there is only one rcu_state structure, there is less point in
maintaining a pointer to it.  This commit therefore replaces rsp with
&rcu_state in rcu_report_qs_rnp(), force_quiescent_state(), and
rcu_check_gp_start_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:21 -07:00
Paul E. McKenney
3c779dfef2 rcu: Eliminate callback-invocation/invocation use of rsp
Now that there is only one rcu_state structure, there is less point in
maintaining a pointer to it.  This commit therefore replaces rsp with
&rcu_state in rcu_do_batch(), invoke_rcu_callbacks(), and __call_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:21 -07:00
Paul E. McKenney
9cbc5b9702 rcu: Eliminate grace-period management code use of rsp
Now that there is only one rcu_state structure, there is less point
in maintaining a pointer to it.  This commit therefore replaces
rsp with &rcu_state in rcu_start_this_gp(), rcu_accelerate_cbs(),
__note_gp_changes(), rcu_gp_init(), rcu_gp_fqs(), rcu_gp_cleanup(),
rcu_gp_kthread(), and rcu_report_qs_rsp().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:20 -07:00
Paul E. McKenney
4c6ed43708 rcu: Eliminate stall-warning use of rsp
Now that there is only one rcu_state structure, there is less point
in maintaining a pointer to it.  This commit therefore replaces rsp
with &rcu_state in print_other_cpu_stall(), print_cpu_stall(), and
check_cpu_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:20 -07:00
Paul E. McKenney
7cba4775ba rcu: Restructure rcu_check_gp_kthread_starvation()
This commit removes the rsp and gpa local variables, repurposes the j
local variable and adds a gpk (GP kthread) local to improve readability.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:19 -07:00
Paul E. McKenney
f7dd7d44fd rcu: Simplify rcutorture_get_gp_data()
This commit restructures rcutorture_get_gp_data() to take advantage of
the fact that there is only one flavor of RCU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:19 -07:00
Paul E. McKenney
b97d23c51c rcu: Remove for_each_rcu_flavor() flavor-traversal macro
Now that there is only ever a single flavor of RCU in a given kernel
build, there isn't a whole lot of point in having a flavor-traversal
macro.  This commit therefore removes it and converts calls to it to
straightline code, inlining trivial functions as appropriate.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:18 -07:00
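
The kind of conversion made in commit b97d23c51c can be illustrated with a small userspace sketch. Everything here is an assumption for illustration: struct flavor_model, any_pending_old(), any_pending_new(), and sole_flavor are hypothetical names, and the real macro walked a list of rcu_state structures rather than this simplified type.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one RCU flavor's state. */
struct flavor_model {
	int pending;                /* nonzero if this flavor has work */
	struct flavor_model *next;  /* next flavor in the traversal list */
};

/* Before: walk every flavor, as a flavor-traversal macro would. */
static int any_pending_old(const struct flavor_model *head)
{
	const struct flavor_model *f;

	for (f = head; f != NULL; f = f->next)
		if (f->pending)
			return 1;
	return 0;
}

/* The single remaining flavor. */
static struct flavor_model sole_flavor;

/* After: straightline code that consults the sole flavor directly. */
static int any_pending_new(void)
{
	return sole_flavor.pending != 0;
}
```

With only one flavor, the loop degenerates to a single iteration, so removing it changes no behavior and drops one level of indirection.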
Paul E. McKenney
564a9ae604 rcu: Remove last non-flavor-traversal rsp local variable from tree_plugin.h
This commit removes the last non-flavor-traversal rsp local variable from
kernel/rcu/tree_plugin.h in favor of &rcu_state.  The flavor-traversal
locals will be removed with the removal of flavor traversal.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:17 -07:00
Paul E. McKenney
88d1bead85 rcu: Remove rcu_data structure's ->rsp field
Now that there is only one rcu_state structure, there is no need for the
rcu_data structure to indicate which it corresponds to.  This commit
therefore removes the rcu_data structure's ->rsp field, replacing all
remaining uses of it with &rcu_state.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:17 -07:00
Paul E. McKenney
aedf4ba984 rcu: Remove rsp parameter from rcu_node tree accessor macros
There now is only one rcu_state structure in a given build of the Linux
kernel, so there is no need to pass it as a parameter to RCU's rcu_node
tree's accessor macros.  This commit therefore removes the rsp parameter
from those macros in kernel/rcu/rcu.h, and removes some now-unused rsp
local variables while in the area.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:16 -07:00
Paul E. McKenney
63d4c8c979 rcu: Remove rsp parameter from expedited grace-period functions
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to
RCU's functions.  This commit therefore removes the rsp parameter
from the code in kernel/rcu/tree_exp.h, and removes all of the
rsp local variables while in the area.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:14 -07:00
Paul E. McKenney
4580b0541b rcu: Remove rsp parameter from no-CBs CPU functions
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to
RCU's functions.  This commit therefore removes the rsp parameter
from rcu_nocb_cpu_needs_barrier(), rcu_spawn_one_nocb_kthread(),
rcu_organize_nocb_kthreads(), and rcu_nohz_full_cpu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:13 -07:00
Paul E. McKenney
b21ebed951 rcu: Remove rsp parameter from print_cpu_stall_info()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
print_cpu_stall_info().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:12 -07:00
Paul E. McKenney
6dbfdc1409 rcu: Remove rsp parameter from rcu_spawn_one_boost_kthread()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_spawn_one_boost_kthread().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:11 -07:00
Paul E. McKenney
81ab59a3ad rcu: Remove rsp parameter from dump_blkd_tasks() and friend
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
dump_blkd_tasks() and rcu_preempt_blocked_readers_cgp().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:10 -07:00
Paul E. McKenney
a2887cd85f rcu: Remove rsp parameter from rcu_print_detail_task_stall()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_print_detail_task_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:09 -07:00
Paul E. McKenney
b8bb1f63cf rcu: Remove rsp parameter from rcu_init_one() and friends
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_init_one() and rcu_dump_rcu_node_tree().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:09 -07:00
Paul E. McKenney
53b46303da rcu: Remove rsp parameter from rcu_boot_init_percpu_data() and friends
There now is only one rcu_state structure in a given build of
the Linux kernel, so there is no need to pass it as a parameter
to RCU's functions.  This commit therefore removes the rsp
parameter from rcu_boot_init_percpu_data(), rcu_init_percpu_data(),
rcu_cleanup_dying_idle_cpu(), and rcu_migrate_callbacks().  While in
the neighborhood, inline the last three into rcutree_prepare_cpu(),
rcu_report_dead() and rcutree_migrate_callbacks(), respectively.
This also gets rid of the for_each_rcu_flavor() calls that were in
those tree functions.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:08 -07:00
Paul E. McKenney
8344b871b1 rcu: Remove rsp parameter from _rcu_barrier() and friends
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
_rcu_barrier_trace() and _rcu_barrier().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:08 -07:00
Paul E. McKenney
98ece508b5 rcu: Remove rsp parameter from __rcu_pending()
There now is only one rcu_state structure in a given build of the Linux
kernel, so there is no need to pass it as a parameter to RCU's functions.
This commit therefore removes the rsp parameter from __rcu_pending(),
and also inlines it into rcu_pending(), removing the for_each_rcu_flavor()
while in the neighborhood.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:07 -07:00
Paul E. McKenney
5c7d89676b rcu: Remove rsp parameter from __call_rcu() and friend
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
__call_rcu_core() and __call_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:07 -07:00
Paul E. McKenney
b049fdf8e3 rcu: Remove rsp parameter from __rcu_process_callbacks()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
__rcu_process_callbacks(), and also inlines it into rcu_process_callbacks(),
removing the for_each_rcu_flavor() while in the neighborhood.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:06 -07:00
Paul E. McKenney
b96f9dc4fb rcu: Remove rsp parameter from rcu_check_gp_start_stall()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_check_gp_start_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:06 -07:00
Paul E. McKenney
e9ecb780fe rcu: Remove rsp parameter from force-quiescent-state functions
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
force_qs_rnp() and force_quiescent_state().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:05 -07:00
Paul E. McKenney
5bb5d09cc4 rcu: Remove rsp parameter from rcu_do_batch()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_do_batch().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:05 -07:00
Paul E. McKenney
780cd59083 rcu: Remove rsp parameter from CPU hotplug functions
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_cleanup_dying_cpu() and rcu_cleanup_dead_cpu().  And, as long as
we are in the neighborhood, inlines them into rcutree_dying_cpu() and
rcutree_dead_cpu(), respectively.  This also eliminates a pair of
for_each_rcu_flavor() loops.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:04 -07:00
Paul E. McKenney
8087d3e3c4 rcu: Remove rsp parameter from rcu_check_quiescent_state()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_check_quiescent_state().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:04 -07:00
Paul E. McKenney
0854a05c9f rcu: Remove rsp parameter from rcu_gp_kthread() and friends
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_gp_init(), rcu_gp_fqs_check_wake(), rcu_gp_fqs(), rcu_gp_cleanup(),
and rcu_gp_kthread().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:03 -07:00
Paul E. McKenney
22212332c1 rcu: Remove rsp parameter from rcu_gp_slow()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_gp_slow().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:03 -07:00
Paul E. McKenney
15cabdffbb rcu: Remove rsp parameter from note_gp_changes()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
note_gp_changes().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:02 -07:00
Paul E. McKenney
c7e48f7ba3 rcu: Remove rsp parameter from __note_gp_changes()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
__note_gp_changes().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:01 -07:00
Paul E. McKenney
834f56bf54 rcu: Remove rsp parameter from rcu_advance_cbs()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_advance_cbs().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:01 -07:00
Paul E. McKenney
c6e09b97b9 rcu: Remove rsp parameter from rcu_accelerate_cbs_unlocked()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_accelerate_cbs_unlocked().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:00 -07:00
Paul E. McKenney
02f501423d rcu: Remove rsp parameter from rcu_accelerate_cbs()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_accelerate_cbs().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:00 -07:00
Paul E. McKenney
532c00c97f rcu: Remove rsp parameter from rcu_gp_kthread_wake()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_gp_kthread_wake().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:59 -07:00
Paul E. McKenney
3481f2eab0 rcu: Remove rsp parameter from rcu_future_gp_cleanup()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_future_gp_cleanup().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:59 -07:00
Paul E. McKenney
ea12ff2b7d rcu: Remove rsp parameter from check_cpu_stall()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
check_cpu_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:58 -07:00
Paul E. McKenney
4e8b8e08f9 rcu: Remove rsp parameter from print_cpu_stall()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
print_cpu_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:58 -07:00
Paul E. McKenney
a91e7e58b1 rcu: Remove rsp parameter from print_other_cpu_stall()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
print_other_cpu_stall().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:57 -07:00
Paul E. McKenney
e1741c69d4 rcu: Remove rsp parameter from rcu_stall_kick_kthreads()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_stall_kick_kthreads().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:57 -07:00
Paul E. McKenney
33dbdbf025 rcu: Remove rsp parameter from rcu_dump_cpu_stacks()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_dump_cpu_stacks().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:56 -07:00
Paul E. McKenney
8fd119b652 rcu: Remove rsp parameter from rcu_check_gp_kthread_starvation()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_check_gp_kthread_starvation().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:56 -07:00
Paul E. McKenney
ad3832e974 rcu: Remove rsp parameter from record_gp_stall_check_time()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
record_gp_stall_check_time().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:55 -07:00
Paul E. McKenney
336a4f6c45 rcu: Remove rsp parameter from rcu_get_root()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_get_root().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:55 -07:00
Paul E. McKenney
de8e87305a rcu: Remove rsp parameter from rcu_gp_in_progress()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_gp_in_progress().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:54 -07:00
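
The long run of "Remove rsp parameter" commits, including de8e87305a, all follow the same before/after shape. The sketch below shows that shape in userspace C under stated assumptions: struct rcu_state_model, GP_STATE_MASK, and both function names are hypothetical, and the grace-period test is a simplification rather than the kernel's actual sequence-number helpers.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the kernel's rcu_state. */
struct rcu_state_model {
	unsigned long gp_seq;  /* low-order bits nonzero while a GP runs */
};

#define GP_STATE_MASK 0x3UL  /* assumed width of the state field */

/* The single global instance, mirroring the one remaining rcu_state. */
static struct rcu_state_model rcu_state_model;

/* Before: every caller had to supply the structure pointer. */
static bool gp_in_progress_old(struct rcu_state_model *rsp)
{
	return (rsp->gp_seq & GP_STATE_MASK) != 0;
}

/* After: the function refers to the sole global instance directly. */
static bool gp_in_progress_new(void)
{
	return (rcu_state_model.gp_seq & GP_STATE_MASK) != 0;
}
```

Because only one instance exists, the pointer argument carried no information, and dropping it shortens every call site.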
Paul E. McKenney
33085c469a rcu: Remove rsp parameter from rcu_report_qs_rdp()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_report_qs_rdp().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:53 -07:00
Paul E. McKenney
139ad4da5a rcu: Remove rsp parameter from rcu_report_unblock_qs_rnp()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_report_unblock_qs_rnp(), which is particularly appropriate in
this case given that this parameter is no longer used.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:53 -07:00
Paul E. McKenney
aff4e9ede5 rcu: Remove rsp parameter from rcu_report_qs_rsp()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_report_qs_rsp().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:52 -07:00
Paul E. McKenney
b50912d0b5 rcu: Remove rsp parameter from rcu_report_qs_rnp()
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions.  This commit therefore removes the rsp parameter from
rcu_report_qs_rnp().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:51 -07:00
Paul E. McKenney
2280ee5a7d rcu: Remove rcu_data_p pointer to default rcu_data structure
The rcu_data_p pointer references the default set of per-CPU rcu_data
structures, that is, those that call_rcu() uses, as opposed to
call_rcu_bh() and sometimes call_rcu_sched().  But there is now only one
set of per-CPU rcu_data structures, so that one set is by definition
the default, which means that the rcu_data_p pointer no longer serves
any useful purpose.  This commit therefore removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:51 -07:00
Paul E. McKenney
16fc9c600b rcu: Remove rcu_state_p pointer to default rcu_state structure
The rcu_state_p pointer references the default rcu_state structure,
that is, the one that call_rcu() uses, as opposed to call_rcu_bh()
and sometimes call_rcu_sched().  But there is now only one rcu_state
structure, so that one structure is by definition the default, which
means that the rcu_state_p pointer no longer serves any useful purpose.
This commit therefore removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:50 -07:00
Paul E. McKenney
da1df50d16 rcu: Remove rcu_state structure's ->rda field
The rcu_state structure's ->rda field was used to find the per-CPU
rcu_data structures corresponding to that rcu_state structure.  But now
there is only one rcu_state structure (creatively named "rcu_state")
and one set of per-CPU rcu_data structures (creatively named "rcu_data").
Therefore, uses of the ->rda field can always be replaced by "rcu_data",
and this commit makes that change and removes the ->rda field.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:49 -07:00
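
The effect of commit da1df50d16 can be sketched without kernel per-CPU machinery. In this hedged model a plain array stands in for the per-CPU rcu_data structures; NR_CPUS_MODEL, struct rcu_data_model, struct rcu_state_old, and the two accessors are hypothetical names introduced only for illustration.

```c
#define NR_CPUS_MODEL 4  /* assumed CPU count for the model */

/* Hypothetical stand-in for the per-CPU rcu_data structure. */
struct rcu_data_model {
	long n_cbs;  /* number of queued callbacks, for illustration */
};

/* One set of "per-CPU" structures, modeled as an ordinary array. */
static struct rcu_data_model rcu_data_model[NR_CPUS_MODEL];

/* Before: the state structure carried a pointer to its per-CPU data. */
struct rcu_state_old {
	struct rcu_data_model *rda;
};

static struct rcu_data_model *get_rdp_old(struct rcu_state_old *rsp, int cpu)
{
	return &rsp->rda[cpu];
}

/* After: the one set of per-CPU data is named directly. */
static struct rcu_data_model *get_rdp_new(int cpu)
{
	return &rcu_data_model[cpu];
}
```

Both accessors return the same element when ->rda points at the sole array, which is why the field became redundant.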
Paul E. McKenney
ec5dd444b6 rcu: Eliminate rcu_state structure's ->call field
The rcu_state structure's ->call field references the corresponding RCU
flavor's call_rcu() function.  However, now that there is only ever one
rcu_state structure in a given build of the Linux kernel, and that flavor
uses plain old call_rcu(), there is not a lot of point in continuing to
have the ->call field.  This commit therefore removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:48 -07:00
Paul E. McKenney
358be2d368 rcu: Remove RCU_STATE_INITIALIZER()
Now that a given build of the Linux kernel has only one set of rcu_state,
rcu_node, and rcu_data structures, there is no point in creating a macro
to declare and compile-time initialize them.  This commit therefore
just does normal declaration and compile-time initialization of these
structures.  While in the area, this commit also removes #ifndefs of
the no-longer-ever-defined preprocessor macro RCU_TREE_NONCORE.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:47 -07:00
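
The trade-off behind commit 358be2d368 can be seen in a small sketch. The macro and structure below are hypothetical simplifications (RCU_STATE_INITIALIZER_MODEL and struct rcu_state_model are not the kernel's definitions): a declaring macro pays off when several flavors need identical boilerplate, and an ordinary declaration is easier to read once only one remains.

```c
/* Hypothetical, simplified stand-in for the kernel's rcu_state. */
struct rcu_state_model {
	const char *name;
	unsigned long gp_seq;
};

/* Before: a macro stamps out one initialized structure per flavor. */
#define RCU_STATE_INITIALIZER_MODEL(sname) \
	static struct rcu_state_model sname##_state_old = { \
		.name = #sname, \
		.gp_seq = 0, \
	}

RCU_STATE_INITIALIZER_MODEL(rcu_sched);
RCU_STATE_INITIALIZER_MODEL(rcu_bh);

/* After: one structure, declared and initialized the ordinary way. */
static struct rcu_state_model rcu_state_new = {
	.name = "rcu",
	.gp_seq = 0,
};
```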
Paul E. McKenney
709fdce754 rcu: Express Tiny RCU updates in terms of RCU rather than RCU-sched
This commit renames Tiny RCU functions so that the lowest level of
functionality is RCU (e.g., synchronize_rcu()) rather than RCU-sched
(e.g., synchronize_sched()).  This provides greater naming compatibility
with Tree RCU, which will in turn permit more LoC removal once
the RCU-sched and RCU-bh update-side API is removed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix Tiny call_rcu()'s EXPORT_SYMBOL() in response to a bug
  report from kbuild test robot. ]
2018-08-30 16:02:46 -07:00
Paul E. McKenney
45975c7d21 rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds
Now that RCU-preempt knows about preemption disabling, its implementation
of synchronize_rcu() works for synchronize_sched(), and likewise for the
other RCU-sched update-side API members.  This commit therefore confines
the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines
RCU-sched's update-side API members in terms of those of RCU-preempt.

This means that any given build of the Linux kernel has only one
update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds
and RCU-sched for CONFIG_PREEMPT=n builds.  This in turn means that kernels
built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
2018-08-30 16:02:45 -07:00
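
Defining one flavor's update-side API in terms of another, as commit 45975c7d21 describes, amounts to thin wrappers. The following is a minimal userspace sketch under that reading; the _model suffixes mark hypothetical stand-ins, and the counter exists only so the forwarding can be observed.

```c
/* Counts calls that reach the underlying implementation. */
static int sync_rcu_calls_model;

/* Stand-in for the one real grace-period wait primitive. */
static void synchronize_rcu_model(void)
{
	sync_rcu_calls_model++;
}

/* The RCU-sched entry point becomes a simple forwarding wrapper. */
static inline void synchronize_sched_model(void)
{
	synchronize_rcu_model();
}
```

Callers of the wrapper keep compiling unchanged, which is what lets the duplicate implementation be removed first and the wrapper names be retired later.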
Paul E. McKenney
4cf439a200 rcu: Fix typo in rcu_get_gp_kthreads_prio() header comment
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:43 -07:00
Paul E. McKenney
2bbfc25b09 rcu: Drop "wake" parameter from rcu_report_exp_rdp()
The rcu_report_exp_rdp() function is always invoked with its "wake"
argument set to "true", so this commit drops this parameter.  The only
potential call site that would use "false" is in the code driving the
expedited grace period, and that code uses rcu_report_exp_cpu_mult()
instead, which therefore retains its "wake" parameter.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:43 -07:00
Paul E. McKenney
82fcecfa81 rcu: Update comments and help text for no more RCU-bh updaters
This commit updates comments and help text to account for the fact that
RCU-bh update-side functions are now simple wrappers for their RCU or
RCU-sched counterparts.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:42 -07:00
Paul E. McKenney
65cfe3583b rcu: Define RCU-bh update API in terms of RCU
Now that the main RCU API knows about softirq disabling and softirq's
quiescent states, the RCU-bh update code can be dispensed with.
This commit therefore removes the RCU-bh update-side implementation and
defines RCU-bh's update-side API in terms of that of either RCU-preempt or
RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option.

In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect
of reducing by one the number of rcuo kthreads per CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:40 -07:00
Paul E. McKenney
ba1c64c272 rcu: Report expedited grace periods at context-switch time
This commit reduces the latency of expedited RCU grace periods by
reporting a quiescent state for the CPU at context-switch time.
In CONFIG_PREEMPT=y kernels, if the outgoing task is still within an
RCU read-side critical section (and thus still blocking some grace
period, perhaps including this expedited grace period), then that task
will already have been placed on one of the leaf rcu_node structures'
->blkd_tasks list.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:38 -07:00
Paul E. McKenney
d28139c4e9 rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe
One necessary step towards consolidating the three flavors of RCU is to
make sure that the resulting consolidated "one flavor to rule them all"
correctly handles networking denial-of-service attacks.  One thing that
allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every
so often, and so something similar has to happen for consolidated RCU.

This must be done carefully.  For example, if a preemption-disabled
region of code takes an interrupt which does softirq processing before
returning, consolidated RCU must ignore the resulting rcu_bh_qs()
invocations -- preemption is still disabled, and that means an RCU
reader for the consolidated flavor.

This commit therefore creates a new rcu_softirq_qs() that is called only
from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region
problem.  This new rcu_softirq_qs() function invokes rcu_sched_qs(),
rcu_preempt_qs(), and rcu_preempt_deferred_qs().  The latter call handles
any deferred quiescent states.

Note that __do_softirq() still invokes rcu_bh_qs().  It will continue to
do so until a later stage of cleanup when the RCU-bh flavor is removed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix !SMP issue located by kbuild test robot. ]
2018-08-30 16:02:38 -07:00
Paul E. McKenney
e11ec65cc8 rcu: Add warning to detect half-interrupts
RCU's dyntick-idle code is written to tolerate half-interrupts, that is,
either an interrupt that invokes rcu_irq_enter() but never invokes the
corresponding rcu_irq_exit() on the one hand, or an interrupt that never
invokes rcu_irq_enter() but does invoke the "corresponding" rcu_irq_exit()
on the other.  These things really did happen at one time, as evidenced
by this ca-2011 LKML post:

http://lkml.kernel.org/r/20111014170019.GE2428@linux.vnet.ibm.com

The reason why RCU tolerates half-interrupts is that usermode helpers
used exceptions to invoke a system call from within the kernel such that
the system call did a normal return (not a return from exception) to
the calling context.  This caused rcu_irq_enter() to be invoked without
a matching rcu_irq_exit().  However, usermode helpers have since been
rewritten to make much more housebroken use of workqueues, kernel threads,
and do_execve(), and therefore should no longer produce half-interrupts.
No one knows of any other source of half-interrupts, but then again,
no one seems insane enough to go audit the entire kernel to verify that
half-interrupts really are a relic of the past.

This commit therefore adds a pair of WARN_ON_ONCE() calls that will
trigger in the presence of half interrupts, which the code will continue
to handle correctly.  If neither of these WARN_ON_ONCE() calls triggers by
mid-2021, then perhaps RCU can stop handling half-interrupts, which
would be a considerable simplification.
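
A much-simplified userspace sketch of the idea follows: count interrupt
entries and exits, and warn once if an exit arrives with no matching
entry, or if the count is nonzero when the CPU is about to go idle.
All names here (irq_nesting, model_irq_enter(), WARN_ONCE_MODEL() and
so on) are hypothetical and are not the kernel's actual code.

```c
#include <stdbool.h>
#include <stdio.h>

static long irq_nesting;      /* hypothetical: outstanding irq entries */
static int warnings;          /* number of distinct warnings emitted */

/* Warn the first time a given call site sees a true condition. */
#define WARN_ONCE_MODEL(cond, msg)                             \
	do {                                                   \
		static bool warned;                            \
		if ((cond) && !warned) {                       \
			warned = true;                         \
			warnings++;                            \
			fprintf(stderr, "WARNING: %s\n", msg); \
		}                                              \
	} while (0)

static void model_irq_enter(void)
{
	irq_nesting++;
}

static void model_irq_exit(void)
{
	WARN_ONCE_MODEL(irq_nesting <= 0, "exit without matching enter");
	if (irq_nesting > 0)
		irq_nesting--;
}

static void model_idle_enter(void)
{
	WARN_ONCE_MODEL(irq_nesting != 0, "enter without matching exit");
	irq_nesting = 0;      /* keep tolerating the half-interrupt */
}
```

In this model, as in the commit, the warning is purely diagnostic: the
counter is still repaired so that execution continues correctly.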

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2018-08-30 16:02:36 -07:00
Paul E. McKenney
fcc878e4df rcu: Remove now-unused ->b.exp_need_qs field from the rcu_special union
The ->b.exp_need_qs field is now set only to false, so this commit
removes it.  The job this field used to do is now done by the rcu_data
structure's ->deferred_qs field, which is a consequence of a better
split between task-based (the rcu_node structure's ->exp_tasks field) and
CPU-based (the aforementioned rcu_data structure's ->deferred_qs field)
tracking of quiescent states for RCU-preempt expedited grace periods.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:36 -07:00
Paul E. McKenney
27c744e32a rcu: Allow processing deferred QSes for exiting RCU-preempt readers
If an RCU-preempt read-side critical section is exiting, that is,
->rcu_read_lock_nesting is negative, then it is a good time to look
at the possibility of reporting deferred quiescent states.  This
commit therefore updates the checks in rcu_preempt_need_deferred_qs()
to allow exiting critical sections to report deferred quiescent states.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:35 -07:00
Paul E. McKenney
c0335743c5 rcutorture: Test extended "rcu" read-side critical sections
This commit makes the "rcu" torture type test extended read-side
critical sections in order to test the deferral of RCU-preempt
quiescent-state testing.

In CONFIG_PREEMPT=n kernels, this simply duplicates the setup already
in place for the "sched" torture type.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:35 -07:00
Paul E. McKenney
3e31009898 rcu: Defer reporting RCU-preempt quiescent states when disabled
This commit defers reporting of RCU-preempt quiescent states at
rcu_read_unlock_special() time when any of interrupts, softirq, or
preemption are disabled.  These deferred quiescent states are reported
at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug
offline operation.  Of course, if another RCU read-side critical
section has started in the meantime, the reporting of the quiescent
state will be further deferred.

This also means that disabling preemption, interrupts, and/or
softirqs will act as an RCU-preempt read-side critical section.
This is enforced by checking preempt_count() as needed.
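
The deferral rule described above can be modeled in userspace roughly
as follows.  This is only a sketch under stated assumptions: struct
cpu_model and the model_*() functions are hypothetical stand-ins, and
the three booleans stand in for what the kernel derives from
preempt_count() and the interrupt flags.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's state; not the kernel's data layout. */
struct cpu_model {
	bool irqs_disabled;
	bool bh_disabled;
	bool preempt_disabled;
	bool deferred_qs;     /* a quiescent state is waiting to be reported */
	int qs_reported;      /* count of quiescent states actually reported */
};

static bool anything_disabled(const struct cpu_model *c)
{
	return c->irqs_disabled || c->bh_disabled || c->preempt_disabled;
}

/* End of a read-side critical section: report now, or defer. */
static void model_read_unlock_special(struct cpu_model *c)
{
	if (anything_disabled(c))
		c->deferred_qs = true;
	else
		c->qs_reported++;
}

/* Called later, for example at context switch or idle entry. */
static void model_deferred_qs(struct cpu_model *c)
{
	if (c->deferred_qs && !anything_disabled(c)) {
		c->deferred_qs = false;
		c->qs_reported++;
	}
}
```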

Some special cases must be handled on an ad-hoc basis, for example,
context switch is a quiescent state even though both the scheduler and
do_exit() disable preemption.  In these cases, additional calls to
rcu_preempt_deferred_qs() override the preemption disabling.  Similar
logic overrides disabled interrupts in rcu_preempt_check_callbacks()
because in this case the quiescent state happened just before the
corresponding scheduling-clock interrupt.

In theory, this change lifts a long-standing restriction that required
that, if interrupts were disabled across a call to rcu_read_unlock(),
the matching rcu_read_lock() also be contained within that
interrupts-disabled region of code.  Because the reporting of the
corresponding RCU-preempt quiescent state is now deferred until
after interrupts have been enabled, it is no longer possible for this
situation to result in deadlocks involving the scheduler's runqueue and
priority-inheritance locks.  This may allow some code simplification that
might reduce interrupt latency a bit.  Unfortunately, in practice this
would also defer deboosting a low-priority task that had been subjected
to RCU priority boosting, so real-time-response considerations might
well force this restriction to remain in place.

Because RCU-preempt grace periods are now blocked not only by RCU
read-side critical sections, but also by disabling of interrupts,
preemption, and softirqs, it will be possible to eliminate RCU-bh and
RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels.  This may
require some additional plumbing to provide the network denial-of-service
guarantees that have been traditionally provided by RCU-bh.  Once these
are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh
into RCU-sched.  This would mean that all kernels would have but
one flavor of RCU, which would open the door to significant code
cleanup.

Moving to a single flavor of RCU would also have the beneficial effect
of reducing the NOCB kthreads by at least a factor of two.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback
  from Joel Fernandes. ]
[ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in
  response to bug reports from kbuild test robot. ]
[ paulmck: Fix bug located by kbuild test robot involving recursion
  via rcu_preempt_deferred_qs(). ]
2018-08-30 16:02:34 -07:00
Byungchul Park
cf7614e13c rcu: Refactor rcu_{nmi,irq}_{enter,exit}()
When entering or exiting irq or NMI handlers, the current code uses
->dynticks_nmi_nesting to detect if it is in the outermost handler,
that is, the one interrupting or returning to an RCU-idle context (the
idle loop or nohz_full usermode execution).  When entering the outermost
handler via an interrupt (as opposed to NMI), it is necessary to invoke
rcu_dynticks_task_exit() just before the CPU is marked non-idle from an
RCU perspective and to invoke rcu_cleanup_after_idle() just after the
CPU is marked non-idle.  Similarly, when exiting the outermost handler
via an interrupt, it is necessary to invoke rcu_prepare_for_idle() just
before marking the CPU idle and to invoke rcu_dynticks_task_enter()
just after marking the CPU idle.

The decision to execute these four functions is currently taken in
rcu_irq_enter() and rcu_irq_exit() as follows:

   rcu_irq_enter()
      /* A conditional branch with ->dynticks_nmi_nesting */
      rcu_nmi_enter()
         /* A conditional branch with ->dynticks */
      /* A conditional branch with ->dynticks_nmi_nesting */

   rcu_irq_exit()
      /* A conditional branch with ->dynticks_nmi_nesting */
      rcu_nmi_exit()
         /* A conditional branch with ->dynticks_nmi_nesting */
      /* A conditional branch with ->dynticks_nmi_nesting */

   rcu_nmi_enter()
      /* A conditional branch with ->dynticks */

   rcu_nmi_exit()
      /* A conditional branch with ->dynticks_nmi_nesting */

This works, but the conditional branches in rcu_irq_enter() and
rcu_irq_exit() are redundant with those in rcu_nmi_enter() and
rcu_nmi_exit(), respectively.  Redundant branches are not something
we want in the to/from-idle fastpaths, so this commit refactors
rcu_{nmi,irq}_{enter,exit}() so they use a common inlined function passed
a constant argument as follows:

   rcu_irq_enter() inlining rcu_nmi_enter_common(irq=true)
      /* A conditional branch with ->dynticks */

   rcu_irq_exit() inlining rcu_nmi_exit_common(irq=true)
      /* A conditional branch with ->dynticks_nmi_nesting */

   rcu_nmi_enter() inlining rcu_nmi_enter_common(irq=false)
      /* A conditional branch with ->dynticks */

   rcu_nmi_exit() inlining rcu_nmi_exit_common(irq=false)
      /* A conditional branch with ->dynticks_nmi_nesting */

The combination of the constant function argument and the inlining allows
the compiler to discard the conditionals that previously controlled
execution of rcu_dynticks_task_exit(), rcu_cleanup_after_idle(),
rcu_prepare_for_idle(), and rcu_dynticks_task_enter().  This reduces both
the to-idle and from-idle path lengths by two conditional branches each,
and improves readability as well.
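
The constant-argument technique can be illustrated with a small
userspace sketch.  The counters and the cpu_idle flag are hypothetical
stand-ins for the real functions and for the ->dynticks check; only
the shape of the code is meant to match the description above.

```c
#include <stdbool.h>

static int task_exit_calls;   /* stands in for rcu_dynticks_task_exit() */
static int eqs_exit_calls;    /* stands in for rcu_dynticks_eqs_exit() */
static int cleanup_calls;     /* stands in for rcu_cleanup_after_idle() */
static bool cpu_idle = true;  /* hypothetical stand-in for the ->dynticks check */

/* Common body; "irq" is a compile-time constant at every call site. */
static inline void enter_common(const bool irq)
{
	if (cpu_idle) {               /* the one remaining run-time branch */
		if (irq)              /* can be folded away after inlining */
			task_exit_calls++;
		eqs_exit_calls++;
		cpu_idle = false;
		if (irq)              /* likewise a compile-time decision */
			cleanup_calls++;
	}
}

static void sketch_nmi_enter(void)
{
	enter_common(false);
}

static void sketch_irq_enter(void)
{
	enter_common(true);
}
```

Because each wrapper passes a literal true or false, an optimizing
compiler is free to specialize the inlined body and drop the "if (irq)"
tests entirely.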

This commit also changes the order of execution from this:

	rcu_dynticks_task_exit();
	rcu_dynticks_eqs_exit();
	trace_rcu_dyntick();
	rcu_cleanup_after_idle();

To this:

	rcu_dynticks_task_exit();
	rcu_dynticks_eqs_exit();
	rcu_cleanup_after_idle();
	trace_rcu_dyntick();

In other words, the calls to rcu_cleanup_after_idle() and
trace_rcu_dyntick() are reversed.  This has no functional effect because
the real concern is whether a given call is before or after the call to
rcu_dynticks_eqs_exit(), and this patch does not change that.  Before the
call to rcu_dynticks_eqs_exit(), RCU is not yet watching the current
CPU and after that call RCU is watching.

A similar switch in calling order happens on the idle-entry path, with
similar lack of effect for the same reasons.

Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Applied Steven Rostedt feedback. ]
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-30 16:00:46 -07:00
Paul E. McKenney
7c590fcca6 rcutorture: Maintain self-propagating CB only during forward-progress test
The current forward-progress testing maintains a self-propagating
callback during the full test.  This could result in false negatives
for stutter-end checking, where it might appear that RCU was clearing
out old callbacks only because it was being continually motivated by
the self-propagating callback.  This commit therefore shuts down the
self-propagating callback at the end of each forward-progress test
interval.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
474e59b476 rcutorture: Check GP completion at stutter end
The rcu_torture_writer() function invokes stutter_wait() at the end of
each writer pass, which occasionally blocks for an extended time period
in order to ensure that RCU can handle intermittent loads.  But part of
handling a busy period is invoking all the callbacks before the end of
the idle period induced by stutter_wait().

This commit therefore adds a return value to stutter_wait() indicating
whether stutter_wait() actually waited.  In addition, this commit causes
rcu_torture_writer() to test this value and if set, checks that all the
elements of the rcu_tortures[] array have been freed up.
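
A minimal userspace sketch of this pattern, assuming a hypothetical
stutter_pause flag and an elem_in_use[] array standing in for the
state of the rcu_tortures[] elements:

```c
#include <stdbool.h>
#include <stddef.h>

#define N_ELEMS 4

static bool stutter_pause;          /* hypothetical: true while a pause is due */
static bool elem_in_use[N_ELEMS];   /* stands in for rcu_tortures[] state */

/* Returns true only if the caller actually had to wait. */
static bool model_stutter_wait(void)
{
	if (!stutter_pause)
		return false;
	stutter_pause = false;      /* the real code would sleep here */
	return true;
}

/* Count elements that were not freed by the end of the idle period. */
static int model_check_after_stutter(void)
{
	int leaked = 0;

	if (model_stutter_wait())
		for (size_t i = 0; i < N_ELEMS; i++)
			if (elem_in_use[i])
				leaked++;
	return leaked;
}
```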

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
f4de46ed5b rcutorture: Print forward-progress test interval on error
This commit prints the duration of the forward-progress test interval in
the case that no forward progress was observed as an aid to debugging.
When forward progress does happen, it prints out the number of
rcu_torture_writer() versions and grace periods that elapsed during the
forward-progress test.  At the end of the run, it also prints the number
of attempted and actual forward-progress tests.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
c04dd09bd3 rcutorture: Adjust number of reader kthreads per CPU-hotplug operations
Currently, rcutorture provisions rcu_torture_reader() kthreads based
on the initial number of CPUs.  This can be problematic when CPU hotplug
is enabled, as a system with a very large number of CPUs will provision
a very large number of rcu_torture_reader() kthreads.  All of these
kthreads will continue running even if the CPU-hotplug operations result
in only one remaining online CPU.  This can result in all sorts of strange
artifacts due simply to massive overload.

This commit therefore causes the rcu_torture_reader() kthreads to start
blocking as the number of online CPUs decreases.  This is accomplished
by numbering these kthreads, and having each check to make sure that the
number of online CPUs is at least as large as its assigned number.
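
The numbering check might look something like the following sketch.
The function names are hypothetical, and numbering the readers from 1
is an assumption made here for illustration.

```c
#include <stdbool.h>

/* Reader "myid" may run only while at least that many CPUs are online. */
static bool reader_may_run(int myid, int online_cpus)
{
	return online_cpus >= myid;
}

/* How many of n_readers (numbered 1..n_readers) would be runnable. */
static int runnable_readers(int n_readers, int online_cpus)
{
	int n = 0;

	for (int id = 1; id <= n_readers; id++)
		if (reader_may_run(id, online_cpus))
			n++;
	return n;
}
```

Under these assumptions, a 64-reader run that is reduced to a single
online CPU leaves one reader running rather than all 64.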

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
fecad5091f rcutorture: Reduce priority of forward-progress testing
On !SMP tests, the forward-progress kthread might prevent RCU's
grace-period kthread from running, which would defeat RCU's
forward-progress measures.  On PREEMPT tests without RCU priority
boosting, the forward-progress kthread might preempt a reader for an
extended time period, which would also defeat RCU's forward-progress
measures.  This commit therefore reduces rcutorture's forward-progress
kthread's priority in those cases.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
1e69676592 rcutorture: Limit reader duration if irq or bh disabled
There are debug checks in some environments that will complain if the
duration of a bh-disabled region of code exceeds about 50 milliseconds.
Because rcu_read_delay() can produce a 50-millisecond delay and because
there could be up to eight reader segments with such delays, this commit
limits the maximum delay to 10 milliseconds if either interrupts or
softirqs are disabled.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
3cff54a830 rcutorture: Increase rcu_read_delay() longdelay_ms
RCU now takes certain actions 100 and 200 milliseconds into a grace period
by default, but rcutorture only runs RCU read-side critical sections
with durations up to 50 milliseconds.  This commit therefore increases
test coverage by increasing the maximum critical-section duration to
300 milliseconds.  Note that the existing code automatically dials down
the probability of long delays based on the maximum duration, which means
that this change should not significantly change the rate of execution
of RCU read-side critical sections.
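
To see why the rate need not change, assume (purely for illustration)
that a long delay of longdelay_ms is taken on one read in every
k * longdelay_ms, for some constant k.  The expected long-delay time
per read is then independent of longdelay_ms, as this small sketch
computes.  Both k and the function name are hypothetical.

```c
/*
 * Expected long-delay time per read, in nanoseconds, assuming a delay of
 * longdelay_ms is taken on one read in every (k * longdelay_ms).
 * Both "k" and this formula are illustrative assumptions.
 */
static unsigned long expected_delay_ns(unsigned long k,
				       unsigned long longdelay_ms)
{
	unsigned long delay_ns = longdelay_ms * 1000000UL;
	unsigned long one_in = k * longdelay_ms;

	return delay_ns / one_in;
}
```

With k = 2000, both a 50-millisecond and a 300-millisecond maximum
give the same expected cost of 500 nanoseconds per read.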

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
9fdcb9afe0 rcutorture: Add self-propagating callback to forward-progress testing
If rcutorture is run on a quiet system with the rcutorture.stutter module
parameter set high, then there can legitimately be an extended period
during which no RCU forward progress takes place.  This can result
in false-positive no-forward-progress splats.  This commit therefore
makes rcu_torture_fwd_prog() create a self-propagating RCU callback
to ensure that grace periods are in progress for the duration of the
forward-progress test.

Note that the RCU flavor under test must define ->call(), ->sync(),
and ->cb_barrier() for this self-propagating callback to be created.
If one or more of those rcu_torture_ops fields are NULL, then the
rcu_torture_fwd_prog() function will silently proceed without creating
the self-propagating callback.
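
A self-propagating callback can be sketched in userspace as follows.
The one-slot queue, model_call(), and model_grace_period() are
hypothetical stand-ins for the flavor's ->call() operation and for the
end of a grace period; this is not the rcutorture implementation.

```c
#include <stdbool.h>
#include <stddef.h>

struct cb {
	void (*func)(struct cb *);
};

static struct cb *pending;    /* hypothetical one-slot callback queue */
static bool keep_going;       /* true for the duration of the test */
static int invocations;

/* Stand-in for ->call(): queue cbp for the next grace period. */
static void model_call(struct cb *cbp, void (*func)(struct cb *))
{
	cbp->func = func;
	pending = cbp;
}

/* The callback re-posts itself, so a grace period is always needed. */
static void self_propagate(struct cb *cbp)
{
	invocations++;
	if (keep_going)
		model_call(cbp, self_propagate);
}

/* Stand-in for a grace period ending; returns false if nothing was queued. */
static bool model_grace_period(void)
{
	struct cb *cbp = pending;

	if (!cbp)
		return false;
	pending = NULL;
	cbp->func(cbp);
	return true;
}
```

Clearing keep_going lets the chain die out after one more invocation,
which is how a test interval would shut the callback down in this model.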

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
08a7a2ec68 rcutorture: Vary forward-progress test interval
Some of the Linux kernel's RCU implementations provide several mechanisms
to promote forward progress that operate over different timeframes.
This commit therefore causes rcu_torture_fwd_prog() to vary the duration
of its forward-progress testing in order to test each such mechanism.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
152f4afbfd rcutorture: Avoid no-test complaint if too few forward-progress tries
In a too-short test, random delays can cause each attempt to do
forward-progress testing to fail to complete, thus resulting in
spurious splats.  This commit therefore requires at least five tries
before complaining about rcutorture runs that failed to produce at
least one valid forward-progress testing attempt.  Note that actual
forward-progress failures will splat regardless of the number of tries.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
119248bec9 rcutorture: Also use GP sequence to judge forward progress
Currently, rcutorture relies solely on the progress of
rcu_torture_writer() to judge grace-period forward progress.  In theory,
this is the gold standard of forward progress, but in practice rcutorture
separately detects and reports rcu_torture_writer() stalls.  This commit
therefore adds the grace-period sequence number (when provided) to the
judgment of grace-period forward progress, which makes it easier to
distinguish between failure of actual grace periods to progress on the
one hand and downstream forward-progress failures on the other.

For example, given this change, if rcu_torture_writer() stalls,
but rcu_torture_fwd_prog() does not complain, then the grace-period
computation is working, which is a hint that the failure lies in callback
processing, wakeup of the rcu_torture_writer() kthread, or similar.
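
The diagnostic reasoning in the example above can be written down as a
small classifier.  This is an illustrative sketch only: struct
fwd_snap, enum fwd_verdict, and judge() are hypothetical names, not
part of rcutorture.

```c
#include <stdbool.h>

/* Snapshot taken at one end of a forward-progress test interval. */
struct fwd_snap {
	unsigned long writer_versions;  /* rcu_torture_writer() progress */
	unsigned long gp_seq;           /* grace-period sequence number */
};

enum fwd_verdict { FWD_OK, FWD_DOWNSTREAM_STALLED, FWD_GP_STALLED };

/* Classify what happened between two snapshots. */
static enum fwd_verdict judge(struct fwd_snap start, struct fwd_snap end)
{
	bool writer_moved = end.writer_versions != start.writer_versions;
	bool gp_moved = end.gp_seq != start.gp_seq;

	if (writer_moved)
		return FWD_OK;
	/* Grace periods advance but the writer does not: look downstream. */
	return gp_moved ? FWD_DOWNSTREAM_STALLED : FWD_GP_STALLED;
}
```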

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
1b27291b1e rcutorture: Add forward-progress tests for RCU grace periods
This commit adds a kthread that loops going into and out of RCU
read-side critical sections, but also including a cond_resched(),
optionally guarded by a check of need_resched(), in that same loop.
This commit relies solely on rcu_torture_writer() progress to judge
the forward progress of grace periods.

Note that Tasks RCU and SRCU are exempted from forward-progress testing
due to their (intentionally) less-robust forward-progress guarantees.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
f028806442 rcuperf: Warn on bad perf type for built-in tests
When running a built-in rcuperf test, specifying an invalid perf type
results in what looks like a hard hang, with the error messages hidden
by other boot-time output.  This commit therefore executes a WARN_ON()
in this case so that the splat appears just following the error messages.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
e746b55857 rcutorture: Warn on bad torture type for built-in tests
When running a built-in rcutorture test, specifying an invalid torture
type results in what looks like a hard hang, with the error messages
hidden by other boot-time output.  This commit therefore executes a
WARN_ON() in this case so that the splat appears just following the
error messages.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
444da518fd rcutorture: Force occasional reader waits
Deferred quiescent states can interact with the scheduler, but
rcu_torture_reader() does not force such interaction all that frequently.
This commit therefore blocks for one jiffy after ten jiffies of read-side
runtime.  This has the beneficial effect of being most likely to block
just after long-running readers, and it is exactly these readers that
are most likely to have been preempted (in CONFIG_PREEMPT=y kernels).
This in turn helps increase the probability that a deferred quiescent
state will be seen by RCU's context-switch hooks.
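
One way to model the "block for one jiffy after ten jiffies" rule is
sketched below.  The struct and function names are hypothetical, and
time is passed in explicitly so that the logic can be exercised
without a real clock.

```c
#include <stdbool.h>

/* Hypothetical per-reader bookkeeping, in units of jiffies. */
struct reader_model {
	unsigned long last_block;   /* time at which this reader last blocked */
	int blocks;                 /* how many times it has blocked so far */
};

/* Call after each read; returns true if the reader should sleep one jiffy. */
static bool reader_should_block(struct reader_model *r, unsigned long now)
{
	if (now - r->last_block < 10)
		return false;
	r->last_block = now;
	r->blocks++;
	return true;
}
```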

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Linus Torvalds
f7951c33f0 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:

 - Cleanup and improvement of NUMA balancing

 - Refactoring and improvements to the PELT (Per Entity Load Tracking)
   code

 - Watchdog simplification and related cleanups

 - The usual pile of small incremental fixes and improvements

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  watchdog: Reduce message verbosity
  stop_machine: Reflow cpu_stop_queue_two_works()
  sched/numa: Move task_numa_placement() closer to numa_migrate_preferred()
  sched/numa: Use group_weights to identify if migration degrades locality
  sched/numa: Update the scan period without holding the numa_group lock
  sched/numa: Remove numa_has_capacity()
  sched/numa: Modify migrate_swap() to accept additional parameters
  sched/numa: Remove unused task_capacity from 'struct numa_stats'
  sched/numa: Skip nodes that are at 'hoplimit'
  sched/debug: Reverse the order of printing faults
  sched/numa: Use task faults only if numa_group is not yet set up
  sched/numa: Set preferred_node based on best_cpu
  sched/numa: Simplify load_too_imbalanced()
  sched/numa: Evaluate move once per node
  sched/numa: Remove redundant field
  sched/debug: Show the sum wait time of a task group
  sched/fair: Remove #ifdefs from scale_rt_capacity()
  sched/core: Remove get_cpu() from sched_fork()
  sched/cpufreq: Clarify sugov_get_util()
  sched/sysctl: Remove unused sched_time_avg_ms sysctl
  ...
2018-08-13 11:25:07 -07:00
Paul E. McKenney
18952651da Merge branches 'fixes1.2018.07.12b' and 'torture1.2018.07.12b' into HEAD
fixes1.2018.07.12b: Post-gp_seq miscellaneous fixes
torture1.2018.07.12b: Post-gp_seq torture-test updates
2018-07-12 15:42:41 -07:00
Joel Fernandes (Google)
bf5b64355a rcutorture: Fix rcu_barrier successes counter
The rcutorture test module currently increments both successes and error
for the barrier test upon error, which results in misleading statistics
being printed.  This commit therefore changes the code to increment the
success counter only when the test actually passes.
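
The corrected accounting amounts to the following, shown here as a
sketch with a hypothetical struct barrier_stats rather than the
module's actual counters:

```c
#include <stdbool.h>

/* Hypothetical statistics block for the barrier test. */
struct barrier_stats {
	long attempts;
	long successes;
	long errors;
};

/* Record one barrier test: exactly one of successes/errors is bumped. */
static void record_barrier_result(struct barrier_stats *s, bool passed)
{
	s->attempts++;
	if (passed)
		s->successes++;
	else
		s->errors++;
}
```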

This change was tested by returning from the barrier callback without
incrementing the callback counter, thus introducing what appeared to
rcutorture to be rcu_barrier() failures.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:08 -07:00
Joel Fernandes (Google)
4babd855fd rcutorture: Add support to detect if boost kthread prio is too low
When rcutorture is built in to the kernel, an earlier patch detects
that and raises the priority of RCU's kthreads to allow rcutorture's
RCU priority boosting tests to succeed.

However, if rcutorture is built as a module, those priorities must be
raised manually via the rcutree.kthread_prio kernel boot parameter.
If this manual step is not taken, rcutorture's RCU priority boosting
tests will fail due to kthread starvation.  One approach would be to
raise the default priority, but that risks breaking existing users.
Another approach would be to allow runtime adjustment of RCU's kthread
priorities, but that introduces numerous "interesting" race conditions.
This patch therefore instead detects too-low priorities, and prints a
message and disables the RCU priority boosting tests in that case.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:08 -07:00
Arnd Bergmann
622be33fcb rcutorture: Use monotonic timestamp for stall detection
The get_seconds() call is deprecated because it overflows on 32-bit
architectures. The algorithm in rcu_torture_stall() can deal with
the overflow, but another problem here is that using a CLOCK_REALTIME
stamp can lead to a false-positive stall warning when a settimeofday()
happens concurrently.

Using ktime_get_seconds() instead avoids those issues and will never
overflow. The added cast to 'unsigned long' however is necessary to
make ULONG_CMP_LT() work correctly.
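
The role of the cast can be seen in a userspace sketch.  The macro
below is believed to mirror the kernel's wrap-tolerant comparison,
while before_deadline() is a hypothetical helper introduced only for
this illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Wrap-tolerant "a < b" for unsigned long counters. */
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))

/*
 * True while "now" is still before "stop_at".  The casts matter: given a
 * signed 64-bit time value, the subtraction inside the macro may be done
 * in signed arithmetic (for example where unsigned long is 32 bits), and
 * a negative difference would then never exceed ULONG_MAX / 2.
 */
static bool before_deadline(long long now, long long stop_at)
{
	return ULONG_CMP_LT((unsigned long)now, (unsigned long)stop_at);
}
```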

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:07 -07:00
Joel Fernandes (Google)
3b745c8969 rcutorture: Make boost test more robust
Currently, with RCU_BOOST disabled, I get no failures when forcing
rcutorture to test RCU boost priority inversion. The reason seems to be
that we don't check for failures if the callback never ran at all for
the duration of the boost-test loop.

Further, the 'rtb' and 'rtbf' counters seem to be used inconsistently.
'rtb' is incremented at the start of each test and 'rtbf' is incremented
per-cpu on each failure of call_rcu. So it's possible that 'rtbf' > 'rtb'.

To test the boost with rcutorture, I did the following on a 4-CPU x86 machine:

modprobe rcutorture  test_boost=2
sleep 20
rmmod rcutorture

With patch:
rtbf: 8 rtb: 12

Without patch:
rtbf: 0 rtb: 2

In summary this patch:
 - Increments failed and total test counters once per boost-test.
 - Checks for failure cases correctly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:06 -07:00
Joel Fernandes (Google)
450efca718 rcutorture: Disable RT throttling for boost tests
Currently rcutorture is not able to torture RCU boosting properly. This
is because the rcutorture's boost threads which are doing the torturing
may be throttled due to RT throttling.

This patch makes rcutorture use the right torture technique (unthrottled
rcutorture boost tasks) for torturing RCU so that the test fails
correctly when no boost is available.

Currently this requires accessing sysctl_sched_rt_runtime directly, but
that should be OK since rcutorture is test code. Such direct access is
also only possible if rcutorture is used as a built-in so make it
conditional on that.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:06 -07:00
Paul E. McKenney
bf1bef50be rcutorture: Emphasize testing of single reader protection type
For RCU implementations supporting multiple types of reader protection,
rcutorture currently randomly selects the combinations of types of
protection for each phase of each reader.  The problem with this is
that, given for example the four kinds of protection for RCU-sched
(local_irq_disable(), local_bh_disable(), preempt_disable(), and
rcu_read_lock_sched()), the reader will be protected by a single
mechanism only 25% of the time.  We really need heavier testing of
single read-side mechanisms.

This commit therefore uses only a single mechanism about 60% of the time,
half of the time explicitly and one-eighth of the time by chance.
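
The percentages above can be checked with a little arithmetic,
sketched here under the assumption that the "by chance" case draws a
uniformly random mask over the available protection types.  The
function names are hypothetical.

```c
/* Number of masks over nbits protection types with exactly one bit set. */
static int single_bit_masks(int nbits)
{
	int count = 0;

	for (unsigned int mask = 0; mask < (1u << nbits); mask++)
		if (mask && !(mask & (mask - 1)))
			count++;
	return count;
}

/*
 * Probability, in parts per 1000, of a single mechanism when half of the
 * picks force one bit and the other half use a uniformly random mask.
 */
static int single_mechanism_permille(int nbits)
{
	int total = 1 << nbits;
	int by_chance = 1000 * single_bit_masks(nbits) / total;

	return 500 + by_chance / 2;
}
```

For four protection types, 4 of the 16 possible masks have a single
bit set (25%), so the combined figure is 50% + 12.5% = 62.5%.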

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:05 -07:00
Paul E. McKenney
2397d072f7 rcutorture: Handle extended read-side critical sections
This commit enables rcutorture to test whether RCU properly aggregates
different types of read-side critical sections into a larger section
covering the set.  It does this by extending an initial read-side
critical section randomly for a random number of extensions.  There is
a new rcu_torture_ops field ->extendable that specifies what extensions
are permitted for a given flavor of RCU (for example, SRCU does not
permit any extensions, while RCU-sched permits all types).  Note that
if a given operation (for example, local_bh_disable()) extends an RCU
read-side critical section, then rcutorture feels free to also start
and end the critical section with that operation's type of disabling.

Disabling operations include local_bh_disable(), local_irq_disable(),
and preempt_disable().  This commit also adds a new "busted_srcud"
torture type, which verifies rcutorture's ability to detect extensions
of RCU read-side critical sections that are not handled.  Gotta test
the test, after all!

Note that it is not legal to invoke local_bh_disable() with interrupts
disabled, and this transition is avoided by overriding the random-number
generator when it wants to call local_bh_disable() while interrupts
are disabled.  The code instead leaves both interrupts and bh/softirq
disabled in this case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:05 -07:00
Paul E. McKenney
241b42522a rcutorture: Make rcu_torture_timer() use rcu_torture_one_read()
This commit saves a few lines of code by making rcu_torture_timer()
invoke rcu_torture_one_read(), thus completing the consolidation of
code between rcu_torture_timer() and rcu_torture_reader().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:04 -07:00
Paul E. McKenney
3025520ec4 rcutorture: Use per-CPU random state for rcu_torture_timer()
Currently, the rcu_torture_timer() function uses a single global
torture_random_state structure protected by a single global lock.
This conflicts to some extent with performance and scalability,
but even more with the goal of consolidating read-side testing
with rcu_torture_reader().  This commit therefore creates a per-CPU
torture_random_state structure for use by rcu_torture_timer() and
eliminates the lock.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Make rcu_torture_timer_rand static, per 0day Test Robot report. ]
2018-07-12 15:42:04 -07:00
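A rough userspace picture of the per-CPU approach in the commit above: each CPU slot owns its own generator state, so no lock is needed as long as a slot is only touched from its own CPU. The array size, the names, and the generic 64-bit linear congruential step are all assumptions for this sketch and do not reproduce the kernel's torture_random().

```c
#include <stdint.h>

#define MODEL_NR_CPUS 4  /* assumed CPU count for the model */

struct model_rand_state {
	uint64_t s;
};

/* One independent state per CPU instead of one global state plus a lock. */
struct model_rand_state model_per_cpu_rand[MODEL_NR_CPUS];

/* Generic LCG step (illustrative constants, not the kernel's generator). */
uint32_t model_random(struct model_rand_state *rs)
{
	rs->s = rs->s * 6364136223846793005ULL + 1442695040888963407ULL;
	return (uint32_t)(rs->s >> 32);
}

/* Draw a value using only the state that belongs to the given CPU. */
uint32_t model_random_on_cpu(unsigned int cpu)
{
	return model_random(&model_per_cpu_rand[cpu % MODEL_NR_CPUS]);
}
```

Because a draw on one CPU never writes another CPU's state, the draws do not contend with each other.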
Paul E. McKenney
8da9a59523 rcutorture: Use atomic increment for n_rcu_torture_timers
Currently, rcu_torture_timer() relies on a lock to guard updates to
n_rcu_torture_timers.  Unfortunately, consolidating code with
rcu_torture_reader() will dispense with this lock.  This commit
therefore makes n_rcu_torture_timers be an atomic_long_t and uses
atomic_long_inc() to carry out the update.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:03 -07:00
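The lock-to-atomic conversion in the commit above has a close userspace analogue in C11 atomics. This sketch uses <stdatomic.h> rather than the kernel's atomic_long_t API, and the names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the n_rcu_torture_timers counter. */
atomic_long n_timer_invocations;

/* Lock-free increment; safe to call concurrently from several threads. */
void note_timer_invocation(void)
{
	atomic_fetch_add_explicit(&n_timer_invocations, 1,
				  memory_order_relaxed);
}

/* Read the current count for a statistics printout. */
long read_timer_invocations(void)
{
	return atomic_load_explicit(&n_timer_invocations,
				    memory_order_relaxed);
}
```

Relaxed ordering is assumed to suffice here because the counter is only a statistic and orders no other memory accesses.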
Paul E. McKenney
6b06aa723e rcutorture: Extract common code from rcu_torture_reader()
This commit extracts the code executed on each pass through the loop
in rcu_torture_reader() into a new rcu_torture_one_read() function.
This new function will also be used by rcu_torture_timer().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:02 -07:00
Paul E. McKenney
2d3625841d rcuperf: Remove unused torturing_tasks() function
The torturing_tasks() function in rcuperf.c is not used, so this commit
removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:02 -07:00
Paul E. McKenney
6bea2cc5a9 rcu: Remove rcutorture test version and sequence number
Back when RCU had a debugfs interface, there was a test version and
sequence number that allowed associating debugfs data with a particular
test run, where the test run started with modprobe and ended with rmmod,
which was how tests were run back on the old ABAT system within IBM.
But rcutorture testing no longer runs on ABAT, and there is no longer an
RCU debugfs interface, so there is no longer any need for test versions
and sequence numbers.

This commit therefore removes the rcutorture_record_test_transition()
and rcutorture_record_progress() functions, and along with them the
rcutorture_testseq and rcutorture_vernum variables that they update.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:01 -07:00
Paul E. McKenney
028be12b29 rcutorture: Change units of onoff_interval to jiffies
Some RCU bugs have been sensitive to the frequency of CPU-hotplug
operations, which have been gradually increased over time.  But this
frequency is now at the one-second lower limit that can be specified using
the rcutorture.onoff_interval kernel parameter.  This commit therefore
changes the units of rcutorture.onoff_interval from seconds to jiffies,
and also sets the value specified for this kernel parameter in the TREE03
rcutorture scenario to 200, which is 200 milliseconds for HZ=1000.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:42:01 -07:00
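Because the parameter in the commit above is now in jiffies, the wall-clock interval depends on HZ. A minimal conversion sketch follows; the helper name is hypothetical and is not a kernel function.

```c
/* Convert a jiffies count to milliseconds for a given tick rate (HZ). */
unsigned long model_jiffies_to_ms(unsigned long jiffies, unsigned long hz)
{
	return jiffies * 1000UL / hz;
}
```

With onoff_interval=200 this gives 200 ms at HZ=1000, but 800 ms at HZ=250.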
Joel Fernandes (Google)
c7cd161ecb rcu: Assign higher prio to RCU threads if rcutorture is built-in
The rcutorture RCU priority boosting tests fail even with CONFIG_RCU_BOOST
set because rcutorture's threads run at the same priority as the default
RCU kthreads (RT class with priority of 1).

This patch checks if RCU torture is built into the kernel and if so,
assigns RT priority 1 to the RCU threads, allowing the rcutorture boost
tests to pass.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:26 -07:00
Paul E. McKenney
52e17ba1d0 srcu: Add grace-period number to rcutorture statistics printout
This commit adds the SRCU grace-period number to the rcutorture statistics
printout, which allows it to be compared to the rcutorture "Writer stall
state" message.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:25 -07:00
Paul E. McKenney
89b4cd4b9e rcu: Print stall-warning NMI dyntick state in hexadecimal
The ->dynticks_nmi_nesting field records the nesting depth of both
interrupt and NMI handlers.  Because the kernel can enter interrupts
and never leave them (and vice versa) and because NMIs can interrupt
manipulation of the ->dynticks_nmi_nesting field, the values in this
field must be both chosen and manipulated very carefully.  As a result,
although the value is zero when the corresponding CPU is executing
neither an interrupt nor an NMI handler, it is 4,611,686,018,427,387,906
on 64-bit systems when there is a single level of interrupt/NMI handling
in progress.

This number is difficult to remember and interpret, so this commit
switches the output to hexadecimal, resulting in the much nicer
0x4000000000000002.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:24 -07:00
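The decimal and hexadecimal values quoted in the commit above can be reproduced in userspace. The helper name below is hypothetical; the format string uses standard printf conversions rather than anything taken from the kernel source.

```c
#include <stdio.h>

/* Format a nesting value the way a hexadecimal diagnostic would show it. */
int format_nesting_hex(char *buf, size_t len, unsigned long long nesting)
{
	return snprintf(buf, len, "%#llx", nesting);
}
```

Passing 4611686018427387906ULL yields the string 0x4000000000000002, that is, 2^62 plus 2.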
Paul E. McKenney
2ee5aca546 rcu: Make rcu_seq_diff() more exact
The current implementation of rcu_seq_diff() follows tradition in
providing a rough-and-ready approximation of the number of elapsed grace
periods between the two rcu_seq values.  However, this difference is
used to flag RCU-failure "near misses", which can be a valuable debugging
aid, so more exactitude would be an improvement.  This commit therefore
improves the accuracy of rcu_seq_diff().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:23 -07:00
Byungchul Park
67abb96cbf rcu: Check the range of jiffies_till_{first,next}_fqs when setting them
Currently, the ranges of jiffies_till_{first,next}_fqs are checked and
adjusted over and over in the rcu_gp_kthread loop at run time.

However, it is enough to check them only when they are set, not every
time through the loop.  So check and adjust them when they are set via sysfs.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:22 -07:00
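The idea in the commit above, validating a tunable once when it is written instead of on every read, can be sketched as a clamping setter. The variable, the setter, and the bounds below are assumptions for illustration; the kernel's actual limits and parameter plumbing may differ.

```c
/* Assumed upper bound for illustration; the kernel's limit may differ. */
#define MODEL_FQS_JIFFIES_MAX 1000UL

unsigned long model_fqs_jiffies = 1;  /* hypothetical tunable */

/* Validate and clamp once, at set time, so readers can use the value as-is. */
void model_set_fqs_jiffies(unsigned long requested)
{
	if (requested > MODEL_FQS_JIFFIES_MAX)
		requested = MODEL_FQS_JIFFIES_MAX;
	if (requested == 0)
		requested = 1;
	model_fqs_jiffies = requested;
}
```

The loop that consumes the value then needs no range check of its own.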
Paul E. McKenney
47199a0812 rcu: Add diagnostics for rcutorture writer stall warning
This commit adds any in-the-future ->gp_seq_needed fields to the
diagnostics for an rcutorture writer stall warning message.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:22 -07:00
Steven Rostedt (VMware)
cd23ac8ddb rcu: Add comment to the last sleep in the rcu tasks loop
At the end of rcu_tasks_kthread() there's a lonely
schedule_timeout_uninterruptible() call with no apparent rationale for
its existence. But there is. It is to keep the thread from going into
a tight loop if there's some anomaly. That really needs a comment.

Link: http://lkml.kernel.org/r/20180524223839.GU3803@linux.vnet.ibm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:21 -07:00
Steven Rostedt (VMware)
c03be752d3 rcu: Speed up calling of RCU tasks callbacks
Joel Fernandes found that the synchronize_rcu_tasks() was taking a
significant amount of time. He demonstrated it with the following test:

 # cd /sys/kernel/tracing
 # while [ 1 ]; do x=1; done &
 # echo '__schedule_bug:traceon' > set_ftrace_filter
 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real	0m1.064s
user	0m0.000s
sys	0m0.004s

Where it takes a little over a second to perform the synchronize,
because there's a loop that waits 1 second at a time for tasks to get
through their quiescent points when there's a task that must be waited
for.

After discussion we came up with a simple way to wait for holdouts:
increase the wait time on each iteration of the loop, but to no more
than a full second.

With the new patch we have:

 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real	0m0.131s
user	0m0.000s
sys	0m0.004s

Which drops it down to 13% of what the original wait time was.

Link: http://lkml.kernel.org/r/20180523063815.198302-2-joel@joelfernandes.org
Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Suggested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:21 -07:00
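One way to picture the growing wait described in the commit above is a divisor that shrinks on each pass, so that the wait starts at a fraction of a second and tops out at one second. The helper name and the starting divisor of 10 are assumptions for this sketch, not details taken from the patch.

```c
/* Hypothetical helper: wait for 1/divisor of a second, in jiffies. */
unsigned long holdout_wait_jiffies(unsigned long hz, unsigned int *divisor)
{
	unsigned long wait = hz / *divisor;

	if (*divisor > 1)
		(*divisor)--;  /* next wait is longer, capped at hz */
	return wait;
}
```

Starting from a divisor of 10 with HZ=1000, successive calls return 100, 111, 125, and so on up to 1000, after which every call returns 1000.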
Joel Fernandes (Google)
0d805a70a6 rcu: Add comment documenting how rcu_seq_snap works
rcu_seq_snap may be tricky to decipher. Let's document how it works with
an example to make it easier.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Shrink comment as suggested by Peter Zijlstra. ]
2018-07-12 15:39:20 -07:00
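For readers without the source at hand, here is a userspace sketch of the snapshot arithmetic that the commit above documents. It assumes the two low-order bits of the sequence number hold grace-period state and the remaining bits count grace periods; the macro and function names are local to this sketch.

```c
#define SEQ_CTR_SHIFT  2
#define SEQ_STATE_MASK ((1UL << SEQ_CTR_SHIFT) - 1)

/*
 * Return the sequence value that the counter must reach before a full
 * grace period is known to have elapsed since the snapshot was taken.
 */
unsigned long seq_snap(unsigned long seq)
{
	return (seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}
```

If no grace period is in progress (seq is 0), the result is 4, the end of the next grace period. If one is already in progress (seq is 1), the result is 8, because the in-progress grace period might have started too early to count.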
Paul E. McKenney
b06ae25a1e rcu: Use RCU CPU stall timeout for rcu_check_gp_start_stall()
Currently, rcu_check_gp_start_stall() waits for one second after the first
request before complaining that a grace period has not yet started.  This
was desirable while testing the conversion from ->future_gp_needed[] to
->gp_seq_needed, but it is a bit on the hair-trigger side for production
use under heavy load.  This commit therefore makes this wait time be
exactly that of the RCU CPU stall warning, allowing easy adjustment of
both timeouts to suit the distribution or installation at hand.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:20 -07:00
Paul E. McKenney
51fbb910f5 rcu: Remove __maybe_unused from rcu_cpu_has_callbacks()
The rcu_cpu_has_callbacks() function is now used in all configurations,
so this commit removes the __maybe_unused.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:19 -07:00
Paul E. McKenney
9622179519 rcu: Remove "inline" from rcu_perf_print_module_parms()
This function is in rcuperf.c, which is not an include file, so there
is no problem dropping the "inline", especially given that this function
is invoked only twice per rcuperf run.  This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:19 -07:00
Paul E. McKenney
eac45e586c rcu: Remove "inline" from rcu_torture_print_module_parms()
This function is in rcutorture.c, which is not an include file, so there
is no problem dropping the "inline", especially given that this function
is invoked only twice per rcutorture run.  This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:18 -07:00
Paul E. McKenney
95394e69c4 rcu: Remove "inline" from panic_on_rcu_stall() and rcu_blocking_is_gp()
These functions are in kernel/rcu/tree.c, which is not an include file,
so there is no problem dropping the "inline", especially given that these
functions are nowhere near a fastpath.  This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:18 -07:00
Paul E. McKenney
ab6b82147f rcu: Remove unused local variable "cpu"
One danger of using __maybe_unused is that the compiler doesn't yell
at you when you remove the last reference, witness rcu_bind_gp_kthread()
and its local variable "cpu".  This commit removes this local variable.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:17 -07:00
Paul E. McKenney
164ba3fc48 rcu: Remove unused rcu_kick_nohz_cpu() function
The rcu_kick_nohz_cpu() function is no longer used, and the functionality
it used to provide is now provided by a call to resched_cpu() in the
force-quiescent-state function rcu_implicit_dynticks_qs().  This commit
therefore removes rcu_kick_nohz_cpu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:17 -07:00
Paul E. McKenney
c7037ff524 rcu: Clarify and correct the rcu_preempt_qs() header comment
The rcu_preempt_qs() function only applies to the CPU, not the task.
A task really is allowed to invoke this function while in an RCU-preempt
read-side critical section, but only if it has first added itself to
some leaf rcu_node structure's ->blkd_tasks list.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:16 -07:00
Paul E. McKenney
3b57a3994f rcu: Inline rcu_dynticks_momentary_idle() into its sole caller
The rcu_dynticks_momentary_idle() function is invoked only from
rcu_momentary_dyntick_idle(), and neither function is particularly
large.  This commit therefore saves a few lines by inlining
rcu_dynticks_momentary_idle() into rcu_momentary_dyntick_idle().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:16 -07:00
Paul E. McKenney
15651201fa rcu: Mark task as .need_qs less aggressively
If any scheduling-clock interrupt interrupts an RCU-preempt read-side
critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs
field is set.  This causes the outermost rcu_read_unlock() to incur the
extra overhead of calling into rcu_read_unlock_special().  This commit
reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only
if the grace period has been in effect for more than one second.

Why one second?  Because this is comfortably smaller than the minimum
RCU CPU stall-warning timeout of three seconds, but long enough that the
.need_qs marking should happen quite rarely.  And if your RCU read-side
critical section has run on-CPU for a full second, it is not unreasonable
to invest some CPU time in ending the grace period quickly.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:15 -07:00
Paul E. McKenney
6f56f714db rcu: Improve RCU-tasks naming and comments
The naming and comments associated with some RCU-tasks code make
the faulty assumption that context switches due to cond_resched()
are voluntary.  As several people pointed out, this is not the case.
This commit therefore updates function names and comments to better
reflect current reality.

Reported-by: Byungchul Park <byungchul.park@lge.com>
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:15 -07:00
Joe Perches
a7538352da rcu: Use pr_fmt to prefix "rcu: " to logging output
This commit also adjusts some whitespace while in the area.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Revert string-breaking %s as requested by Andy Shevchenko. ]
2018-07-12 15:39:13 -07:00
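The prefixing in the commit above relies on compile-time concatenation of adjacent string literals. A userspace imitation follows; log_fmt() and log_to_buf() are hypothetical names, the message in the usage note is made up, and this variant requires at least one argument after the format string.

```c
#include <stdio.h>

/* Every format string passed through log_fmt() gains the subsystem prefix. */
#define log_fmt(fmt) "rcu: " fmt

/* Userspace stand-in for a pr_*() call: formats into a caller's buffer. */
#define log_to_buf(buf, len, fmt, ...) \
	snprintf((buf), (len), log_fmt(fmt), __VA_ARGS__)
```

For example, log_to_buf(buf, sizeof(buf), "stall on CPU %d", 3) stores "rcu: stall on CPU 3" in buf.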
Byungchul Park
07f27570dc rcu: Improve rcu_note_voluntary_context_switch() reporting
We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs()
is called, regardless of whether a reschedule actually happens. However,
it currently doesn't report the quiescent state when the task enters
__schedule(), because it is called with preempt = true. So make it report
the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is
called.

And in TINY_RCU, the quiescent state of rcu_bh should also be reported
when the tick interrupt comes from user mode, but currently is not. So
report it.

Lastly, in TREE_RCU, rcu_note_voluntary_context_switch() should be
invoked when the tick interrupt comes from not only user mode but also
idle, as an extended quiescent state.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Simplify rcutiny portion given no RCU-tasks for !PREEMPT. ]
2018-07-12 15:39:12 -07:00
Paul E. McKenney
3949fa9bac rcu: Make rcu_read_unlock_special() static
Because rcu_read_unlock_special() is no longer used outside of
kernel/rcu/tree_plugin.h, this commit makes it static.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:11 -07:00
Paul E. McKenney
f2e2df5978 rcu: Add diagnostics for offline CPUs failing to report QS
CPUs are expected to report quiescent states when coming online and
when going offline, and grace-period initialization is supposed to
handle any race conditions where a CPU's ->qsmask bit is set just after
it goes offline.  This commit adds diagnostics for the case where an
offline CPU nevertheless has a grace period waiting on it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:10 -07:00
Paul E. McKenney
fea3f222d3 rcu: Record ->gp_state for both phases of grace-period initialization
Grace-period initialization first processes any recent CPU-hotplug
operations, and then initializes state for the new grace period.  These
two phases of initialization are currently not distinguished in debug
prints, but the distinction is valuable in a number of debug situations.
This commit therefore introduces two new values for ->gp_state,
RCU_GP_ONOFF and RCU_GP_INIT, in order to make this distinction.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:09 -07:00
Paul E. McKenney
5773894231 rcu: Add CPU online/offline state to dump_blkd_tasks()
Interactions between CPU-hotplug operations and grace-period
initialization can result in dump_blkd_tasks() being invoked.  One of the first
debugging actions in this case is to search back in dmesg to work
out which of the affected rcu_node structure's CPUs are online and to
determine the last CPU-hotplug operation affecting any of those CPUs.
This can be laborious and error-prone, especially when console output
is lost.

This commit therefore causes dump_blkd_tasks() to dump the state of
the affected rcu_node structure's CPUs and the last grace period during
which the last offline and online operation affected each of these CPUs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:09 -07:00
Paul E. McKenney
ff3cee3908 rcu: Add up-tree information to dump_blkd_tasks() diagnostics
This commit updates dump_blkd_tasks() to print out quiescent-state
bitmasks for the rcu_node structures further up the tree.  This
information helps debugging of interactions between CPU-hotplug
operations and RCU grace-period initialization.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:08 -07:00
Paul E. McKenney
e05121ba5b rcu: Remove CPU-hotplug failsafe from force-quiescent-state code path
Now that quiescent states for newly offlined CPUs are reported either
when that CPU goes offline or at the end of grace-period initialization,
the CPU-hotplug failsafe in the force-quiescent-state code path is no
longer needed.

This commit therefore removes this failsafe.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:07 -07:00
Paul E. McKenney
17a8212b8d rcu: Remove failsafe check for lost quiescent state
Now that quiescent-state reporting is fully event-driven, this commit
removes the check for a lost quiescent state from force_qs_rnp().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:06 -07:00
Paul E. McKenney
f34f2f5852 rcu: Move grace-period pre-init delay after pre-init
The main race with the early part of grace-period initialization appears
to be with CPU hotplug.  To more fully open this race window, this commit
moves the rcu_gp_slow() from the beginning of the early initialization
loop to follow that loop, thus widening the race window, especially for
the rcu_node structures that are initialized last.  This commit also
expands rcutree.gp_preinit_delay from 3 to 12, giving the same overall
delay in the grace period, but concentrated in the spot where it will
do the most good.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:06 -07:00
Paul E. McKenney
1f3e5f51b9 rcu: Add RCU-preempt check for waiting on newly onlined CPU
RCU should only be waiting on CPUs that were online at the time that the
current grace period started.  Failure to abide by this rule can result
in confusing splats during grace-period cleanup and initialization.
This commit therefore adds a check to RCU-preempt's preempted-task
queuing that checks for waiting on newly onlined CPUs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:05 -07:00
Paul E. McKenney
1e64b15a4b rcu: Fix grace-period hangs due to race with CPU offline
Without special fail-safe quiescent-state-propagation checks, grace-period
hangs can result from the following scenario:

1.	CPU 1 goes offline.

2.	Because CPU 1 is the only CPU in the system blocking the current
	grace period, the grace period ends as soon as
	rcu_cleanup_dying_idle_cpu()'s call to rcu_report_qs_rnp()
	returns.

3.	At this point, the leaf rcu_node structure's ->lock is no longer
	held: rcu_report_qs_rnp() has released it, as it must in order
	to awaken the RCU grace-period kthread.

4.	At this point, that same leaf rcu_node structure's ->qsmaskinitnext
	field still records CPU 1 as being online.  This is absolutely
	necessary because the scheduler uses RCU (in this case on the
	wake-up path while awakening RCU's grace-period kthread), and
	->qsmaskinitnext contains RCU's idea as to which CPUs are online.
	Therefore, invoking rcu_report_qs_rnp() after clearing CPU 1's
	bit from ->qsmaskinitnext would result in a lockdep-RCU splat
	due to RCU being used from an offline CPU.

5.	RCU's grace-period kthread awakens, sees that the old grace period
	has completed and that a new one is needed.  It therefore starts
	a new grace period, but because CPU 1's leaf rcu_node structure's
	->qsmaskinitnext field still shows CPU 1 as being online, this new
	grace period is initialized to wait for a quiescent state from the
	now-offline CPU 1.

6.	Without the fail-safe force-quiescent-state checks, there would
	be no quiescent state from the now-offline CPU 1, which would
	eventually result in RCU CPU stall warnings and memory exhaustion.

It would be good to get rid of the special fail-safe quiescent-state
propagation checks, and thus it would be good to fix things so that
the above scenario cannot happen.  This commit therefore adds a new
->ofl_lock to the rcu_state structure.  This lock is held by rcu_gp_init()
across the applying of buffered online and offline operations to the
rcu_node tree, and it is also held by rcu_cleanup_dying_idle_cpu()
when buffering a new offline operation.  This prevents rcu_gp_init()
from acquiring the leaf rcu_node structure's lock during the interval
between when rcu_cleanup_dying_idle_cpu() invokes rcu_report_qs_rnp(),
which releases ->lock, and the re-acquisition of that same lock.
This in turn prevents the failure scenario outlined above, and will
hopefully eventually allow removal of the offline-CPU checks from the
force-quiescent-state code path.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:04 -07:00
Paul E. McKenney
ec2c29765a rcu: Fix grace-period hangs from mid-init task resume
Without special fail-safe quiescent-state-propagation checks, grace-period
hangs can result from the following scenario:

1.	A task running on a given CPU is preempted in its RCU read-side
	critical section.

2.	That CPU goes offline, and there are now no online CPUs
	corresponding to that CPU's leaf rcu_node structure.

3.	The rcu_gp_init() function does the first phase of grace-period
	initialization, and sets the aforementioned leaf rcu_node
	structure's ->qsmaskinit field to all zeroes.  Because there
	is a blocked task, it does not propagate the zeroing of either
	->qsmaskinit or ->qsmaskinitnext up the rcu_node tree.

4.	The task resumes on some other CPU and exits its critical section.
	There is no grace period in progress, so the resulting quiescent
	state is not reported up the tree.

5.	The rcu_gp_init() function does the second phase of grace-period
	initialization, which results in the leaf rcu_node structure
	being initialized to expect no further quiescent states, but
	with that structure's parent expecting a quiescent-state report.

	The parent will never receive a quiescent state from this leaf
	rcu_node structure, so the grace period will hang, resulting in
	RCU CPU stall warnings.

It would be good to get rid of the special fail-safe quiescent-state
propagation checks.  This commit therefore checks the leaf rcu_node
structure's ->wait_blkd_tasks field during grace-period initialization.
If this flag is set, rcu_report_qs_rnp() is invoked to immediately
report the possible quiescent state.  While in the neighborhood, this
commit also reports quiescent states for any CPUs that went offline between
the two phases of grace-period initialization, thus reducing grace-period
delays and hopefully eventually allowing removal of offline-CPU checks
from the force-quiescent-state code path.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:04 -07:00
Paul E. McKenney
0b107d24d9 rcu: Suppress false-positive splats from mid-init task resume
Consider the following sequence of events in a PREEMPT=y kernel:

1.	All CPUs corresponding to a given leaf rcu_node structure are
	offline.

2.	The first phase of the rcu_gp_init() function's grace-period
	initialization runs, and sets that rcu_node structure's
	->qsmaskinit to zero, as it should.

3.	One of the CPUs corresponding to that rcu_node structure comes
	back online.  Note that because this CPU came online after the
	grace period started, this grace period can safely ignore this
	newly onlined CPU.

4.	A task running on the newly onlined CPU enters an RCU-preempt
	read-side critical section, and is then preempted.  Because
	the corresponding rcu_node structure's ->qsmask is zero,
	rcu_preempt_ctxt_queue() leaves the rcu_node structure's
	->gp_tasks field NULL, as it should.

5.	The rcu_gp_init() function continues running the second phase of
	grace-period initialization.  The ->qsmask field of the parent of
	the aforementioned leaf rcu_node structure is set to not expect
	a quiescent state from the leaf, as is only right and proper.

	However, when rcu_gp_init() reaches the leaf, it invokes
	rcu_preempt_check_blocked_tasks(), which sees that the leaf's
	->blkd_tasks list is non-empty, and therefore sets the leaf's
	->gp_tasks field to reference the first task on that list.

6.	The grace period ends before the preempted task resumes, which
	is perfectly fine, given that this grace period was under no
	obligation to wait for that task to exit its late-starting
	RCU-preempt read-side critical section.  Unfortunately, the
	leaf's ->gp_tasks field is non-NULL, so rcu_gp_cleanup() splats.
	After all, it appears to rcu_gp_cleanup() that the grace period
	failed to wait for a task that was supposed to be blocking that
	grace period.

This commit avoids this false-positive splat by adding a check of both
->qsmaskinit and ->wait_blkd_tasks to rcu_preempt_check_blocked_tasks().
If both ->qsmaskinit and ->wait_blkd_tasks are zero, then the task must
have entered its RCU-preempt read-side critical section late (after all,
the CPU that it is running on was not online at that time), which means
that the upper-level rcu_node structure won't be waiting for anything
on the leaf anyway.

If ->wait_blkd_tasks is non-zero, then there is at least one task on
this rcu_node structure's ->blkd_tasks list whose RCU read-side
critical section predates the current grace period.  If ->qsmaskinit
is non-zero, there is at least one CPU that was online at the start
of the current grace period.  Thus, if both are zero, there is nothing
to wait for.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:03 -07:00
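The check added by the commit above reduces to a two-field test. The structure below is a pared-down, hypothetical model of a leaf rcu_node structure that keeps only the two fields the commit message discusses; the function name is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical, pared-down stand-in for a leaf rcu_node structure. */
struct leaf_model {
	unsigned long qsmaskinit;  /* CPUs online when the grace period started */
	bool wait_blkd_tasks;      /* tasks blocked since before this grace period */
};

/* True if the current grace period has anything to wait for on this leaf. */
bool leaf_blocks_gp(const struct leaf_model *leaf)
{
	return leaf->qsmaskinit != 0 || leaf->wait_blkd_tasks;
}
```

When this returns false, any task on the leaf's blocked list must have entered its read-side critical section after the grace period started, so the grace period need not track it.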
Paul E. McKenney
99990da1b3 rcu: Suppress more involved false-positive preempted-task splats
Consider the following sequence of events in a PREEMPT=y kernel:

1.	All but one of the CPUs corresponding to a given leaf rcu_node
	structure go offline.  Each of these CPUs clears its bit in that
	structure's ->qsmaskinitnext field.

2.	A new grace period starts, and rcu_gp_init() scans the leaf
	rcu_node structures, applying CPU-hotplug changes since the
	start of the previous grace period, including those changes in
	#1 above.  This copies each leaf structure's ->qsmaskinitnext
	to its ->qsmaskinit field, which represents the CPUs that this new
	grace period will wait on.  Each copy operation is done holding
	the corresponding leaf rcu_node structure's ->lock, and at the
	end of this scan, rcu_gp_init() holds no locks.

3.	The last CPU corresponding to #1's leaf rcu_node structure goes
	offline, clearing its bit in that structure's ->qsmaskinitnext
	field, but not touching the ->qsmaskinit field.  Note that
	rcu_gp_init() is not currently holding any locks!  This CPU does
	-not- report a quiescent state because the grace period has not
	yet initialized itself sufficiently to have set any bits in any
	of the leaf rcu_node structures' ->qsmask fields.

4.	The rcu_gp_init() function continues initializing the new grace
	period, copying each leaf rcu_node structure's ->qsmaskinit
	field to its ->qsmask field while holding the corresponding ->lock.
	This sets the ->qsmask bit corresponding to #3's CPU.

5.	Before the grace period ends, #3's CPU comes back online.
	Because the grace period has not yet done any force-quiescent-state
	scans (which would report a quiescent state on behalf of any
	offline CPUs), this CPU's ->qsmask bit is still set.

6.	A task running on the newly onlined CPU is preempted while in
	an RCU read-side critical section.  Because this CPU's ->qsmask
	bit is set, not only does this task queue itself on the leaf
	rcu_node structure's ->blkd_tasks list, it also sets that
	structure's ->gp_tasks pointer to reference it.

7.	The grace period started in #2 above comes to an end.  This
	results in rcu_gp_cleanup() being invoked, which, among other
	things, checks to make sure that there are no tasks blocking the
	just-ended grace period, that is, that all ->gp_tasks pointers
	are NULL.  The ->gp_tasks pointer corresponding to the task
	preempted in #6 above is non-NULL, which results in a splat.

This splat is a false positive.  The task's RCU read-side critical
section cannot have begun before the just-ended grace period because
this would mean either: (1) The CPU came online before the grace period
started, which cannot have happened because the grace period started
before that CPU went offline, or (2) The task started its RCU read-side
critical section on some other CPU, but then it would have had to have
been preempted before migrating to this CPU, which would mean that it
would have instead queued itself on that other CPU's rcu_node structure.
RCU's grace periods thus are working correctly.  Or, more accurately,
any remaining bugs in RCU's grace periods are elsewhere.

This commit eliminates this false positive by adding code to the end
of rcu_cpu_starting() that reports a quiescent state to RCU, which has
the side-effect of clearing that CPU's ->qsmask bit, preventing the
above scenario.  This approach has the added benefit of more promptly
reporting quiescent states corresponding to offline CPUs.  Nevertheless,
this commit does -not- remove the need for the force-quiescent-state
scans to check for offline CPUs, given that a CPU might remain offline
indefinitely.  And without the checks in the force-quiescent-state scans,
the grace period would also persist indefinitely, which could result in
hangs or memory exhaustion.

Note well that the call to rcu_report_qs_rnp() reporting the quiescent
state must come -after- the setting of this CPU's bit in the leaf rcu_node
structure's ->qsmaskinitnext field.  Otherwise, lockdep-RCU will complain
bitterly about quiescent states coming from an offline CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:03 -07:00
Paul E. McKenney
fece27760f rcu: Suppress false-positive preempted-task splats
Consider the following sequence of events in a PREEMPT=y kernel:

1.	All CPUs corresponding to a given rcu_node structure go offline.
	A new grace period starts just after the CPU-hotplug code path
	does its synchronize_rcu() for the last CPU, so at least this
	CPU is present in that structure's ->qsmask.

2.	Before the grace period ends, a CPU comes back online, and not
	just any CPU, but the one corresponding to a non-zero bit in
	the leaf rcu_node structure's ->qsmask.

3.	A task running on the newly onlined CPU is preempted while in
	an RCU read-side critical section.  Because this CPU's ->qsmask
	bit is set, not only does this task queue itself on the leaf
	rcu_node structure's ->blkd_tasks list, it also sets that
	structure's ->gp_tasks pointer to reference it.

4.	The grace period started in #1 above comes to an end.  This
	results in rcu_gp_cleanup() being invoked, which, among other
	things, checks to make sure that there are no tasks blocking the
	just-ended grace period, that is, that all ->gp_tasks pointers
	are NULL.  The ->gp_tasks pointer corresponding to the task
	preempted in #3 above is non-NULL, which results in a splat.

This splat is a false positive.  The task's RCU read-side critical
section cannot have begun before the just-ended grace period because
this would mean either: (1) The CPU came online before the grace period
started, which cannot have happened because the grace period started
before that CPU was all the way offline, or (2) The task started its
RCU read-side critical section on some other CPU, but then it would
have had to have been preempted before migrating to this CPU, which
would mean that it would have instead queued itself on that other CPU's
rcu_node structure.

This commit eliminates this false positive by adding code to the end
of rcu_cleanup_dying_idle_cpu() that reports a quiescent state to RCU,
which has the side-effect of clearing that CPU's ->qsmask bit, preventing
the above scenario.  This approach has the added benefit of more promptly
reporting quiescent states corresponding to offline CPUs.

Note well that the call to rcu_report_qs_rnp() reporting the quiescent
state must come -before- the clearing of this CPU's bit in the leaf
rcu_node structure's ->qsmaskinitnext field.  Otherwise, lockdep-RCU
will complain bitterly about quiescent states coming from an offline CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:02 -07:00
Paul E. McKenney
5554788e1d rcu: Suppress false-positive offline-CPU lockdep-RCU splat
The rcu_lockdep_current_cpu_online() function currently checks only the
RCU-sched data structures to determine whether or not RCU believes that a
given CPU is offline.  Unfortunately, there are multiple flavors of RCU,
which means that there is a short window of time during which the various
flavors disagree as to whether or not a given CPU is offline.  This can
result in false-positive lockdep-RCU splats in which some other flavor
of RCU tries to do something based on its view that the CPU is online,
only to get hit with a lockdep-RCU splat because RCU-sched instead
believes that the CPU is offline.

This commit therefore changes rcu_lockdep_current_cpu_online() to scan
all RCU flavors and to consider a given CPU to be online if any of the
RCU flavors believe it to be online, thus preventing these false-positive
splats.
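
The "online if any flavor says so" rule can be sketched in userspace C
as shown below.  This is only an illustrative model: the model_flavor
structure, its online_mask field, and the function names are assumptions
of the sketch, not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-flavor view of which CPUs are online, one bit per CPU. */
struct model_flavor {
	unsigned long online_mask;
};

/*
 * A CPU counts as online if any flavor believes it to be online, so a
 * flavor that already considers the CPU offline cannot by itself
 * trigger a complaint.
 */
static bool model_cpu_online_any(const struct model_flavor *flavors,
				 size_t nflavors, unsigned int cpu)
{
	size_t i;

	for (i = 0; i < nflavors; i++)
		if (flavors[i].online_mask & (1UL << cpu))
			return true;
	return false;
}

/* Two flavors disagree about CPU 1 during a hotplug window. */
static void model_online_example(void)
{
	struct model_flavor flavors[2] = { { 0x1 }, { 0x3 } };

	assert(model_cpu_online_any(flavors, 2, 1));
	assert(!model_cpu_online_any(flavors, 2, 2));
}
```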

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:02 -07:00
Paul E. McKenney
928164351e rcu: Prevent useless FQS scan after all CPUs have checked in
The force_qs_rnp() function checks for ->qsmask being all zero, that is,
all CPUs for the current rcu_node structure having already passed through
quiescent states.  But with RCU-preempt, this is not sufficient to report
quiescent states further up the tree, so there are further checks that
can initiate RCU priority boosting and also for races with CPU-hotplug
operations.  However, if neither of these further checks apply, the code
proceeds to carry out a useless scan of an all-zero ->qsmask.

This commit therefore adds code to release the current rcu_node
structure's lock and continue on to the next rcu_node structure, thereby
avoiding this useless scan.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:01 -07:00
Paul E. McKenney
91f63ced7d rcu: Replace smp_wmb() with smp_store_release() for stall check
This commit gets rid of the smp_wmb() in record_gp_stall_check_time()
in favor of an smp_store_release().
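
As a rough userspace analogy, the sketch below uses C11 atomics to show
a release store publishing an earlier store.  The kernel's
smp_store_release() is defined by the Linux-kernel memory model rather
than by C11, and the structure, field layout, and function names here
are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the fields involved in stall checking. */
struct model_stall_state {
	_Atomic unsigned long gp_start;		/* Published by the store below. */
	_Atomic unsigned long jiffies_stall;	/* Publication point. */
};

/*
 * Writer: the release store orders the earlier store to ->gp_start
 * before the store to ->jiffies_stall, which is the job that a
 * separate write barrier between the two stores would otherwise do.
 */
static void model_record_stall_time(struct model_stall_state *sp,
				    unsigned long now, unsigned long timeout)
{
	atomic_store_explicit(&sp->gp_start, now, memory_order_relaxed);
	atomic_store_explicit(&sp->jiffies_stall, now + timeout,
			      memory_order_release);
}

/*
 * Reader: a reader whose acquire load sees the new deadline is
 * guaranteed to also see the matching (or a later) ->gp_start.
 */
static unsigned long model_read_stall_time(struct model_stall_state *sp,
					   unsigned long *gp_start)
{
	unsigned long js = atomic_load_explicit(&sp->jiffies_stall,
						memory_order_acquire);

	*gp_start = atomic_load_explicit(&sp->gp_start, memory_order_relaxed);
	return js;
}

/* Single-threaded usage check; real use would involve concurrent readers. */
static void model_stall_example(void)
{
	struct model_stall_state st = { 0 };
	unsigned long start;

	model_record_stall_time(&st, 100, 21);
	assert(model_read_stall_time(&st, &start) == 121);
	assert(start == 100);
}
```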

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:01 -07:00
Paul E. McKenney
77cfc7bf24 rcu: Fix typo and add additional debug
This commit fixes a typo and adds some additional debugging to the
message emitted when a task blocking the current grace period is listed
as blocking it when either that grace period ends or the next grace
period begins.  This commit also reformats the console message for
readability.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:00 -07:00
Paul E. McKenney
c74859d1eb rcu: Make rcu_report_unblock_qs_rnp() warn on violated preconditions
If rcu_report_unblock_qs_rnp() is invoked on something other than
preemptible RCU or if there are still preempted tasks blocking the
current grace period, something went badly wrong in the caller.
This commit therefore adds WARN_ON_ONCE() to these conditions, while
leaving the legitimate reason for early exit (rnp->qsmask != 0)
unwarned.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:59 -07:00
Paul E. McKenney
8d672fa6bf rcu: Make rcu_init_new_rnp() stop upon already-set bit
Currently, rcu_init_new_rnp() walks up the rcu_node combining tree,
setting bits in the ->qsmaskinit fields on the way up.  It walks up
unconditionally, regardless of the initial state of these bits.  This is
OK because only the corresponding RCU grace-period kthread ever tests
or sets these bits during runtime.  However, it is also pointless, and
it increases both memory and lock contention (albeit only slightly), so
this commit stops the walk as soon as an already-set bit is encountered.
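
The early-exit walk might look roughly like the following userspace
model.  It omits all locking, and the model_node structure and function
name are assumptions of this sketch; it also assumes that "already-set
bit" means that the ancestor's mask was nonzero before the update.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, lock-free stand-in for a node of the combining tree. */
struct model_node {
	unsigned long qsmaskinit;	/* Children with at least one online CPU. */
	unsigned long grpmask;		/* This node's bit in its parent's mask. */
	struct model_node *parent;	/* NULL for the root. */
};

/*
 * Propagate a newly populated leaf up the tree, stopping at the first
 * ancestor whose mask was already nonzero, because that ancestor is
 * already represented further up.  Returns the number of ancestors
 * updated.
 */
static int model_init_new_node(struct model_node *leaf)
{
	struct model_node *np = leaf;
	int updated = 0;

	for (;;) {
		unsigned long mask = np->grpmask;
		unsigned long oldmask;

		np = np->parent;
		if (np == NULL)
			return updated;
		oldmask = np->qsmaskinit;
		np->qsmaskinit |= mask;
		updated++;
		if (oldmask)
			return updated;
	}
}

/* The first leaf walks to the root, its sibling stops one level up. */
static void model_walk_example(void)
{
	struct model_node root = { 0, 0, NULL };
	struct model_node mid = { 0, 0x1, &root };
	struct model_node leaf0 = { 0x1, 0x1, &mid };
	struct model_node leaf1 = { 0x1, 0x2, &mid };

	assert(model_init_new_node(&leaf0) == 2);
	assert(model_init_new_node(&leaf1) == 1);
	assert(mid.qsmaskinit == 0x3);
	assert(root.qsmaskinit == 0x1);
}
```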

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:59 -07:00
Paul E. McKenney
c50cbe535c rcu: Fix an obsolete ->qsmaskinit comment
Back in the old days, when grace-period initialization blocked CPU
hotplug, the ->qsmaskinit mask was indeed updated at the time that
a given CPU went offline.  However, with the deferral of these updates
until the beginning of the next grace period in commit 0aa04b055e
("rcu: Process offlining and onlining only at grace-period start"),
it is instead ->qsmaskinitnext that gets updated at that time.

This commit therefore updates the obsolete comment.  It also fixes
punctuation while on the topic of comments mentioning ->qsmaskinit.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:58 -07:00
Paul E. McKenney
962aff03c3 rcu: Clean up handling of tasks blocked across full-rcu_node offline
Commit 0aa04b055e ("rcu: Process offlining and onlining only at
grace-period start") deferred handling of CPU-hotplug events until the
start of the next grace period, but consider the following sequence
of events:

1.	A task is preempted within an RCU-preempt read-side critical
	section.

2.	The CPU that this task was running on goes offline, along with all
	other CPUs sharing the corresponding leaf rcu_node structure.

3.	The task resumes execution.

4.	One of those CPUs comes back online before a new grace period starts.

In step 2, the code in the next rcu_gp_init() invocation will (correctly)
defer removing the leaf rcu_node structure from the upper-level bitmasks,
and will (correctly) set that structure's ->wait_blkd_tasks field.  During
the ensuing interval, RCU will (correctly) track the tasks preempted on
that structure because they must block any subsequent grace period.

In step 3, the code in rcu_read_unlock_special() will (correctly) remove
the task from the leaf rcu_node structure.  From this point forward, RCU
need not pay attention to this structure, at least not until one of the
corresponding CPUs comes back online.

In step 4, the code in the next rcu_gp_init() invocation will
(incorrectly) invoke rcu_init_new_rnp().  This is incorrect because
the corresponding rcu_cleanup_dead_rnp() was never invoked.  This is
nevertheless harmless because the upper-level bits are still set.
So, no harm, no foul, right?

At least, all is well until a little further into rcu_gp_init()
invocation, which will notice that there are no longer any tasks blocked
on the leaf rcu_node structure, conclude that there is no longer anything
left over from step 2's offline operation, and will therefore invoke
rcu_cleanup_dead_rnp().  But this invocation of rcu_cleanup_dead_rnp()
is for the beginning of the earlier offline interval, and the previous
invocation of rcu_init_new_rnp() is for the end of that same interval.
That is right, they are invoked out of order.

That cannot be good, can it?

It turns out that this is not a (correctness!) problem because
rcu_cleanup_dead_rnp() checks to see if any of the corresponding CPUs
are online, and refuses to do anything if so.  In other words, in the
case where rcu_init_new_rnp() and rcu_cleanup_dead_rnp() execute out of
order, they both have no effect.

But this is at best an accident waiting to happen.

This commit therefore adds logic to rcu_gp_init() so that
rcu_init_new_rnp() and rcu_cleanup_dead_rnp() are always invoked in
order, and so that neither is invoked at all in cases where RCU had to
pay attention to the leaf rcu_node structure during the entire time that
all corresponding CPUs were offline.

And, while in the area, this commit reduces confusion by using formal
parameters rather than local variables that just happen to have the same
value at that particular point in the code.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:58 -07:00
Joel Fernandes (Google)
226ca5e766 rcu: Identify grace period is in progress as we advance up the tree
There's no need to keep checking the same starting node for whether a
grace period is in progress as we advance up the funnel lock loop. It's
sufficient if we just checked it in the start, and then subsequently
checked the internal nodes as we advanced up the combining tree. This
also makes sense because the grace-period updates propagate from the
root to the leaf, so there's a chance we may find a grace period has
started as we advance up, so let's check for the same.

Reported-by: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:57 -07:00
Joel Fernandes (Google)
df2bf8f7f7 rcu: Use better variable names in funnel locking loop
The funnel locking loop in rcu_start_this_gp uses rcu_root as a
temporary variable while walking the combining tree. This causes a
tiresome exercise of a code reader reminding themselves that rcu_root
may not be root. Let's just call it rnp, and rename other variables as
well to be more appropriate.

Original patch: https://patchwork.kernel.org/patch/10396577/

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix name in comment as well. ]
2018-07-12 15:38:57 -07:00
Joel Fernandes
b73de91d6a rcu: Rename the grace-period-request variables and parameters
The name 'c' is used for variables and parameters holding the requested
grace-period sequence number.  However it is no longer very meaningful
given the conversions from ->gpnum and (especially) ->completed to
->gp_seq. This commit therefore renames 'c' to 'gp_seq_req'.

Previous patch discussion is at:
https://patchwork.kernel.org/patch/10396579/

Signed-off-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:56 -07:00
Paul E. McKenney
3d18469a2b rcu: Regularize resetting of rcu_data wrap indicator
The rcu_data structure's ->gpwrap indicator is currently reset only
when the CPU in question detects a new grace period.  This is in theory
sufficient because any CPU that has been out of action for long enough
that its ->gpwrap indicator is set is guaranteed to see both the end
of an old grace period and the start of a new one.

However, the current code leaves a short window during which the ->gpwrap
indicator has been reset but the corresponding ->gp_seq counter has not
yet been brought up to date.  This is harmless because interrupts are
disabled, but it is likely to (at the very least) cause confusion.

This commit therefore moves the resetting of ->gpwrap to follow the
updating of ->gp_seq.  While in the area, it also resets ->gp_seq_needed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:56 -07:00
Paul E. McKenney
d72193123c rcutorture: Correctly handle grace-period sequence wrap
The new ->gp_seq grace-period sequence numbers must be shifted down,
which gives artifacts when these numbers wrap.  This commit therefore
enables rcutorture and rcuperf to handle grace-period sequence numbers
even if they do wrap.  It does this by allowing a special subtraction
function to be specified, and this function subtracts before shifting.
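
The difference between the two orders of operation can be seen in the
following userspace sketch.  The two-bit shift and the function names
are assumptions of the sketch rather than the actual rcutorture code.

```c
#include <assert.h>
#include <limits.h>

/* Assumed number of low-order state bits in the sequence number. */
#define MODEL_SEQ_CTR_SHIFT	2

/* Shift first, then subtract: breaks when the raw counter wraps. */
static unsigned long model_diff_shift_first(unsigned long cur,
					    unsigned long old)
{
	return (cur >> MODEL_SEQ_CTR_SHIFT) - (old >> MODEL_SEQ_CTR_SHIFT);
}

/* Subtract first, then shift: modular subtraction absorbs the wrap. */
static unsigned long model_diff_subtract_first(unsigned long cur,
					       unsigned long old)
{
	return (cur - old) >> MODEL_SEQ_CTR_SHIFT;
}

/* Two grace periods elapse across a wrap of the raw counter. */
static void model_wrap_example(void)
{
	unsigned long old = ULONG_MAX - 3;	/* State bits zero, count at maximum. */
	unsigned long cur = old + 8;		/* Wraps around to 4. */

	assert(cur == 4);
	assert(model_diff_subtract_first(cur, old) == 2);
	assert(model_diff_shift_first(cur, old) != 2);
}
```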

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:55 -07:00
Paul E. McKenney
2e3e5e5501 rcu: Make rcu_start_this_gp() check for grace period already started
In the old days of ->gpnum and ->completed, the code requesting a new
grace period checked to see if that grace period had already started,
bailing early if so.  The new-age ->gp_seq approach instead checks
whether the grace period has already finished.  A compensating change
pushed the requested grace period down to the bottom of the tree, thus
reducing lock contention and even eliminating it in some cases.  But why
not further reduce contention, especially on large systems, by doing both,
especially given that the cost of doing both is extremely small?

This commit therefore adds a new rcu_seq_started() function that checks
whether a specified grace period has already started.  It then uses
this new function in place of rcu_seq_done() in the rcu_start_this_gp()
function's funnel locking code.
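
To illustrate the distinction between "already started" and "already
finished", here is a userspace model of the two checks.  It follows the
general shape of the kernel's sequence-number helpers, but the model_
names, the two-bit state field, and the absence of READ_ONCE() and
memory barriers are simplifying assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_SEQ_CTR_SHIFT	2
#define MODEL_SEQ_STATE_MASK	((1UL << MODEL_SEQ_CTR_SHIFT) - 1)

/* Wrap-tolerant comparisons on unsigned long counters. */
#define MODEL_ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define MODEL_ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/* Counter value at which a full grace period will have elapsed from now. */
static unsigned long model_seq_snap(unsigned long cur)
{
	return (cur + 2 * MODEL_SEQ_STATE_MASK + 1) & ~MODEL_SEQ_STATE_MASK;
}

/* Has the grace period that will end at snapshot s already started? */
static bool model_seq_started(unsigned long cur, unsigned long s)
{
	return MODEL_ULONG_CMP_LT((s - 1) & ~MODEL_SEQ_STATE_MASK, cur);
}

/* Has the grace period that will end at snapshot s already finished? */
static bool model_seq_done(unsigned long cur, unsigned long s)
{
	return MODEL_ULONG_CMP_GE(cur, s);
}

/* A request made while idle: started at 1, done only at 4. */
static void model_started_example(void)
{
	unsigned long s = model_seq_snap(0);

	assert(s == 4);
	assert(!model_seq_started(0, s));
	assert(model_seq_started(1, s) && !model_seq_done(1, s));
	assert(model_seq_done(4, s));
}
```

In this model, a requester that sees the counter at 1 can stop early
under the "started" check, whereas the "done" check would not let it
stop until the counter reaches 4.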

Reported-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:54 -07:00
Joel Fernandes (Google)
5ca0905f67 rcu: Fix cpustart tracepoint gp_seq number
The "cpustart" trace event shows a stale gp_seq. This is because it uses
rdp->gp_seq, which is updated only at the end of the __note_gp_changes()
function. This commit therefore instead uses rnp->gp_seq.

An alternative fix would be to update rdp->gp_seq earlier, but this would
break RCU's detection of the beginning of a new-to-this-CPU grace period.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:53 -07:00
Joel Fernandes (Google)
5b55072f22 rcu: Produce last "CleanupMore" trace only if late-breaking request
Currently Tree RCU's clean-up code emits a "CleanupMore" trace event in
response to late-arriving grace-period requests even if the grace period
was already requested. This makes "CleanupMore" show up an extra time (in
addition to once for each rcu_node structure that was previously marked
with the request), and for no good reason.  This commit therefore avoids
emitting this trace message unless the only request for this next
grace period arrived during or after the cleanup scan of the rcu_node
structures.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:53 -07:00
Paul E. McKenney
a2165e4168 rcu: Don't funnel-lock above leaf node if GP in progress
The old grace-period start code would acquire only the leaf's rcu_node
structure's ->lock if that structure believed that a grace period was
in progress.  The new code advances to the leaf's parent in this case,
needlessly acquiring the leaf's parent's ->lock.  This commit therefore
checks the grace-period state after marking the leaf with the need for
the specified grace period, and if the leaf believes that a grace period
is in progress, takes an early exit.

Reported-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Add "Startedleaf" tracing as suggested by Joel Fernandes. ]
2018-07-12 15:38:52 -07:00
Paul E. McKenney
e44e73ca47 rcu: Make simple callback acceleration refer to rdp->gp_seq_needed
Now that the rcu_data structure contains ->gp_seq_needed, create an
rcu_accelerate_cbs_unlocked() helper function that locklessly checks to
see if new callbacks' required grace period has already been requested.
If so, update the callback list locally and again locklessly.  (Though
interrupts must be and are disabled to avoid racing with conflicting
updates in interrupt handlers.)

Otherwise, call rcu_accelerate_cbs() as before.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:49 -07:00
Paul E. McKenney
ff3bb6f4d0 rcu: Remove ->gpnum and ->completed
Now that everything has been converted to use ->gp_seq instead of
->gpnum and ->completed, this commit removes ->gpnum and ->completed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:48 -07:00
Paul E. McKenney
fee5997c17 rcu: Convert rcu_fqs tracepoint to ->gp_seq
This commit makes the rcu_fqs tracepoint use ->gp_seq instead of ->gpnum.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:47 -07:00
Paul E. McKenney
db023296f0 rcu: Convert rcu_quiescent_state_report tracepoint to ->gp_seq
This commit makes the rcu_quiescent_state_report tracepoint use ->gp_seq
instead of ->gpnum.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:47 -07:00
Paul E. McKenney
865aa1e08d rcu: Convert rcu_unlock_preempted_task tracepoint to ->gp_seq
This commit makes the rcu_unlock_preempted_task tracepoint use ->gp_seq
instead of ->gpnum.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:46 -07:00
Paul E. McKenney
598ce09480 rcu: Convert rcu_preempt_task tracepoint to ->gp_seq
This commit makes the rcu_preempt_task tracepoint use ->gp_seq instead
of ->gpnum.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:45 -07:00
Paul E. McKenney
abd13fdd95 rcu: Convert rcu_future_grace_period tracepoint to gp_seq
This commit makes the rcu_future_grace_period tracepoint use gp_seq
instead of ->gpnum and ->completed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:44 -07:00
Paul E. McKenney
477351f782 rcu: Convert rcu_grace_period tracepoint to gp_seq
This commit makes the rcu_grace_period tracepoint use gp_seq instead
of ->gpnum or ->completed.  It also introduces a "cpuofl-bgp" string to
less obscurely indicate when a CPU has gone offline while a grace period
is waiting on it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:43 -07:00
Paul E. McKenney
ab5e869c1f rcu: Make rcu_nocb_wait_gp() check if GP already requested
This commit makes rcu_nocb_wait_gp() check rdp->gp_seq_needed to see
if the current CPU already knows about the needed grace period having
already been requested.  If so, it avoids acquiring the corresponding
leaf rcu_node structure's ->lock, thus decreasing contention.  This
optimization is intended for cases where either multiple leader rcuo
kthreads are running on the same CPU or these kthreads are running on
a non-offloaded (e.g., housekeeping) CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Move lock release past "if" as suggested by Joel Fernandes. ]
[ paulmck: Fix caching of furthest-future requested grace period. ]
2018-07-12 15:38:42 -07:00
Paul E. McKenney
7a1d0f23ad rcu: Move from ->need_future_gp[] to ->gp_seq_needed
One problem with the ->need_future_gp[] array is that the grace-period
assignment of each element changes as the grace periods complete.
This means that it is necessary to hold a lock when checking this
array to learn if a given grace period has already been requested.
This increases lock contention, which is the opposite of helpful.
This commit therefore replaces the ->need_future_gp[] with a single
->gp_seq_needed value and keeps it updated in the rcu_data structure.

This will enable reliable lockless checking of whether or not a given
grace period has already been requested.
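
A single monotonically increasing value permits a check along the
following lines.  This is a userspace sketch only: the model_cpu_data
structure and function names are assumptions, and kernel code would
additionally need READ_ONCE() and would run with interrupts disabled
where required.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wrap-tolerant "a >= b" for unsigned long sequence numbers. */
#define MODEL_ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical per-CPU record of the furthest-future request seen so far. */
struct model_cpu_data {
	unsigned long gp_seq_needed;
};

/*
 * Return true if a grace period ending at gp_seq_req (or a later one)
 * is already known to have been requested, in which case the caller
 * can skip acquiring any lock.
 */
static bool model_gp_already_requested(const struct model_cpu_data *dp,
				       unsigned long gp_seq_req)
{
	return MODEL_ULONG_CMP_GE(dp->gp_seq_needed, gp_seq_req);
}

/* Record a request, keeping only the furthest-future value. */
static void model_note_gp_request(struct model_cpu_data *dp,
				  unsigned long gp_seq_req)
{
	if (!model_gp_already_requested(dp, gp_seq_req))
		dp->gp_seq_needed = gp_seq_req;
}

/* Earlier requests are covered by later ones, but not vice versa. */
static void model_request_example(void)
{
	struct model_cpu_data d = { 8 };

	assert(model_gp_already_requested(&d, 4));
	assert(!model_gp_already_requested(&d, 12));
	model_note_gp_request(&d, 12);
	model_note_gp_request(&d, 8);
	assert(d.gp_seq_needed == 12);
}
```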

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:37:48 -07:00
Paul E. McKenney
aebc82644b rcutorture: Convert rcutorture_get_gp_data() to ->gp_seq
SRCU has long used ->srcu_gp_seq, and now RCU uses ->gp_seq.  This
commit therefore moves the rcutorture_get_gp_data() function from
a ->gpnum / ->completed pair to ->gp_seq.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:57 -07:00
Paul E. McKenney
471f87c3d9 rcu: Make RCU CPU stall warnings use ->gp_seq
This commit makes the RCU CPU stall-warning code in print_other_cpu_stall(),
print_cpu_stall(), and check_cpu_stall() use ->gp_seq instead of ->gpnum
and ->completed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:56 -07:00
Paul E. McKenney
29365e563b rcu: Convert grace-period requests to ->gp_seq
This commit converts the grace-period request code paths from ->completed
and ->gpnum to ->gp_seq.  The need_future_gp_element() macro encapsulates
the shift operation required to use ->gp_seq as an index to the
->need_future_gp[] array.  The rcu_cbs_completed() function is removed
in favor of the rcu_seq_snap() function.  The rcu_start_this_gp()
gets some temporary consistency checks and uses rcu_seq_done(),
rcu_seq_current(), rcu_seq_state(), and rcu_gp_in_progress() in place
of the earlier open-coded comparisons of ->gpnum and ->completed.
The rcu_future_gp_cleanup() function replaces use of ->completed
with ->gp_seq.  The rcu_accelerate_cbs() function replaces a call to
rcu_cbs_completed() with one to rcu_seq_snap().  The rcu_advance_cbs()
function replaces an access to ->completed with one to ->gp_seq and adds
some temporary warnings.  The rcu_nocb_wait_gp() function replaces a
call to rcu_cbs_completed() with one to rcu_seq_snap() and an open-coded
comparison with rcu_seq_done().

The temporary warnings will be removed when the various ->gpnum and
->completed fields are removed.  Their purpose is to locate code that
might still be using ->gpnum and ->completed.  (Much easier that way
than trying to trace down the causes of too-short grace periods and
grace-period hangs!)

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:55 -07:00
Paul E. McKenney
d43a5d32e1 rcu: Convert ->completedqs to ->gp_seq
This commit switches the quiescent-state no-backtracking checks from
->gpnum and ->completed to ->gp_seq.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:54 -07:00
Paul E. McKenney
8aa670cdac rcu: Convert ->rcu_iw_gpnum to ->gp_seq
This commit switches the interrupt-disabled detection mechanism to
->gp_seq.  This mechanism is used as part of RCU CPU stall warnings,
and detects cases where the stall is due to a CPU having interrupts
disabled.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:53 -07:00
Paul E. McKenney
ba04107fc9 rcu: Move rcu_gp_in_progress() to ->gp_seq
This commit makes rcu_gp_in_progress() use ->gp_seq instead of
->completed and ->gpnum.  The READ_ONCE() invocations are buried
in rcu_seq_current().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:52 -07:00
Paul E. McKenney
e0da2374c3 rcu: Move rcu_nocb_gp_get() to ->gp_seq
This commit makes rcu_nocb_gp_get() use ->gp_seq.  It uses
rcu_seq_ctr() in order to shift away the state bits, so that the
low-order bits of the result may safely be used to index ->nocb_gp_wq[].

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:52 -07:00
Paul E. McKenney
03c8cb765a rcu: Move rcu_try_advance_all_cbs() to ->gp_seq
This commit makes rcu_try_advance_all_cbs() use ->gp_seq, with the
exception of tracing, which will be converted later.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:51 -07:00
Paul E. McKenney
e05720b097 rcu: Move rcu_implicit_dynticks_qs() to ->gp_seq
This commit makes rcu_implicit_dynticks_qs() use ->gp_seq, with the
exception of tracing, which will be converted later.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:51 -07:00
Paul E. McKenney
a66ae8ae35 rcu: Convert rcu_gpnum_ovf() to ->gp_seq
This commit converts rcu_gpnum_ovf() to use ->gp_seq instead of ->gpnum.
Same size unsigned long, so same approach.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:50 -07:00
Paul E. McKenney
67e14c1e39 rcu: Move RCU's grace-period-change code to ->gp_seq
This commit moves __note_gp_changes(), note_gp_changes(), and
__rcu_pending() to ->gp_seq, creating new rcu_seq_completed_gp() and
rcu_seq_new_gp() functions for this purpose.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Reinstate "cpuend" trace as suggested by Joel Fernandes. ]
2018-07-12 14:27:50 -07:00
Paul E. McKenney
e4be81a2ed rcu: Convert conditional grace-period primitives to ->gp_seq
This commit converts get_state_synchronize_rcu(), cond_synchronize_rcu(),
get_state_synchronize_sched(), and cond_synchronize_sched() from ->gpnum
and ->completed to ->gp_seq.  Note that this also introduces a full
memory barrier in the already-done paths of cond_synchronize_rcu() and
cond_synchronize_sched(), as work with LKMM indicates that the earlier
smp_load_acquire() invocations were insufficiently strong in some situations where
these two functions were called just as the grace period ended.  In such
cases, these two functions would not gain the benefit of memory ordering
at the end of the grace period.

Please note that the performance impact is negligible, as you shouldn't
be using either function anywhere near a fastpath in any case.
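
The shape of the already-done path can be sketched in userspace with
C11 atomics, as below.  This is an analogy rather than the kernel code:
the global counter, the stand-in for the blocking wait, and all model_
names are assumptions, and C11 fences are not the same thing as the
kernel's smp_mb().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Wrap-tolerant "a >= b" for unsigned long sequence numbers. */
#define MODEL_ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))

static _Atomic unsigned long model_gp_seq;	/* Hypothetical global counter. */
static int model_full_waits;			/* Counts slow-path waits. */

/* Stand-in for a blocking grace-period wait: finish one grace period. */
static void model_synchronize(void)
{
	atomic_fetch_add(&model_gp_seq, 4);	/* Sequentially consistent. */
	model_full_waits++;
}

/* Cookie that will be satisfied once a full grace period has elapsed. */
static unsigned long model_get_state(void)
{
	return (atomic_load(&model_gp_seq) + 7) & ~3UL;
}

/*
 * Wait only if the cookie's grace period has not yet ended.  The
 * already-done path still executes a full fence so that the caller's
 * later accesses are ordered after the end of that grace period.
 */
static void model_cond_synchronize(unsigned long oldstate)
{
	if (!MODEL_ULONG_CMP_GE(atomic_load(&model_gp_seq), oldstate))
		model_synchronize();
	else
		atomic_thread_fence(memory_order_seq_cst);
}

/* The first call must wait, the second takes the fence-only path. */
static void model_cond_example(void)
{
	unsigned long cookie = model_get_state();

	model_cond_synchronize(cookie);
	model_cond_synchronize(cookie);
	assert(model_full_waits == 1);
}
```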

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:49 -07:00
Paul E. McKenney
c9a24e2d0c rcu: Make quiescent-state reporting use ->gp_seq
This commit switches the functions reporting quiescent states from
use of ->gpnum to ->gp_seq.  In either case, the point is to handle
races where a given grace period ends before a quiescent state can
be reported.  Failing to catch these races would result in too-short
grace periods, hence the checking.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:48 -07:00
Paul E. McKenney
78c5a67f17 rcu: Convert rcu_check_gp_kthread_starvation() to GP sequence number
This commit switches rcu_check_gp_kthread_starvation() from printing
->gpnum and ->completed to printing ->gp_seq upon detecting a starving
RCU grace-period kthread during an RCU CPU stall warning.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:48 -07:00
Paul E. McKenney
17ef2fe97c rcu: Make rcutorture's batches-completed API use ->gp_seq
The rcutorture test invokes rcu_batches_started(),
rcu_batches_completed(), rcu_batches_started_bh(),
rcu_batches_completed_bh(), rcu_batches_started_sched(), and
rcu_batches_completed_sched() to do grace-period consistency checks,
and rcuperf uses the _completed variants for statistics.
These functions use ->gpnum and ->completed.  This commit therefore
replaces them with rcu_get_gp_seq(), rcu_bh_get_gp_seq(), and
rcu_sched_get_gp_seq(), adjusting rcutorture and rcuperf to make
use of them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:47 -07:00
Paul E. McKenney
dee4f42298 rcu: Move rcu_gp_slow() to ->gp_seq
This commit moves rcu_gp_slow() to ->gp_seq.  This function only uses
the grace-period number to modulate delay, so rcu_seq_ctr(rsp->gp_seq)
gets the same effect, at least in cases where the delay is to happen
more than four times per wrap of an unsigned long.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:46 -07:00
Paul E. McKenney
de30ad512a rcu: Introduce grace-period sequence numbers
This commit adds grace-period sequence numbers (->gp_seq) to the
rcu_state, rcu_node, and rcu_data structures, and updates them.
It also checks for consistency between rsp->gpnum and rsp->gp_seq.
These ->gp_seq counters will eventually replace the existing ->gpnum
and ->completed counters, allowing a single memory access to determine
whether or not a grace period is in progress and if so, which one.
This in turn will enable changes that will reduce ->lock contention on
the leaf rcu_node structures.
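
As a rough illustration of how one counter can answer both questions,
the following userspace sketch packs a grace-period count and a small
state field into a single unsigned long.  The two-bit state field and
the model_ names are assumptions of the sketch, which also omits the
memory barriers and marked accesses that kernel code would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: low-order bits hold state, the rest hold the count. */
#define MODEL_SEQ_CTR_SHIFT	2
#define MODEL_SEQ_STATE_MASK	((1UL << MODEL_SEQ_CTR_SHIFT) - 1)

/* Grace-period count with the state bits shifted away. */
static unsigned long model_seq_ctr(unsigned long s)
{
	return s >> MODEL_SEQ_CTR_SHIFT;
}

/* Nonzero state means that a grace period is in progress. */
static bool model_seq_in_progress(unsigned long s)
{
	return (s & MODEL_SEQ_STATE_MASK) != 0;
}

/* Mark the start of a grace period: idle -> in progress. */
static void model_seq_start(unsigned long *sp)
{
	assert(!model_seq_in_progress(*sp));
	*sp += 1;
}

/* Mark the end of a grace period: clear state, advance the count. */
static void model_seq_end(unsigned long *sp)
{
	assert(model_seq_in_progress(*sp));
	*sp = (*sp | MODEL_SEQ_STATE_MASK) + 1;
}
```

A single read of the counter then yields both the in-progress
indication and the grace-period count, with no window in which two
separate counters could be observed in inconsistent states.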

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 14:27:46 -07:00
Paul E. McKenney
609af1cdf0 Merge branches 'expedited.2018.07.12a', 'fixes.2018.07.12a', 'srcu.2018.06.25b' and 'torture.2018.06.25b' into HEAD
expedited.2018.07.12a: Expedited grace-period updates.
fixes.2018.07.12a: Pre-gp_seq miscellaneous fixes.
srcu.2018.06.25b: SRCU updates.
torture.2018.06.25b: Pre-gp_seq torture-test updates.
2018-07-12 14:26:14 -07:00