kernel_optimize_test/kernel/trace
Rasmus Villemoes 0f5e5a3ab7 tracing: Eliminate const char[] auto variables
Automatic const char[] variables cause unnecessary code
generation. For example, the this_mod variable leads to

    3f04:       48 b8 5f 5f 74 68 69 73 5f 6d   movabs $0x6d5f736968745f5f,%rax # __this_m
    3f0e:       4c 8d 44 24 02                  lea    0x2(%rsp),%r8
    3f13:       48 8d 7c 24 10                  lea    0x10(%rsp),%rdi
    3f18:       48 89 44 24 02                  mov    %rax,0x2(%rsp)
    3f1d:       4c 89 e9                        mov    %r13,%rcx
    3f20:       b8 65 00 00 00                  mov    $0x65,%eax # e
    3f25:       48 c7 c2 00 00 00 00            mov    $0x0,%rdx
                        3f28: R_X86_64_32S      .rodata.str1.1+0x18d
    3f2c:       be 48 00 00 00                  mov    $0x48,%esi
    3f31:       c7 44 24 0a 6f 64 75 6c         movl   $0x6c75646f,0xa(%rsp) # odul
    3f39:       66 89 44 24 0e                  mov    %ax,0xe(%rsp)

i.e., the string gets built on the stack at runtime. Similar code can be
found for the other instances I'm replacing here. Putting the string
in .rodata reduces the combined .text+.rodata size and saves time and
stack space at runtime.

The simplest fix, and what I've done for the this_mod case, is to just
make the variable static.
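
As a rough illustration of the difference (a userspace sketch, not the
kernel code; format_auto() and format_static() are hypothetical names,
and snprintf stands in for whatever consumes the string), compare the
two declarations below. With the automatic array the compiler typically
has to copy the bytes onto the stack on every call, while the static
array is emitted once into read-only data:

```c
#include <stdio.h>
#include <stddef.h>

/* Automatic array: the initializer is usually materialized on the
 * stack each time the function runs, as in the disassembly above. */
int format_auto(char *buf, size_t len, const char *mod)
{
	const char this_mod[] = "__this_module";

	return snprintf(buf, len, "%s:%s", mod, this_mod);
}

/* Static array: the bytes are placed in read-only data once, and the
 * function only passes their address. */
int format_static(char *buf, size_t len, const char *mod)
{
	static const char this_mod[] = "__this_module";

	return snprintf(buf, len, "%s:%s", mod, this_mod);
}
```

Both functions produce identical output; only the generated code and
the stack usage differ, and the exact savings depend on compiler and
optimization level.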

However, in the "<faulted>" case the same string is used twice, and
making each copy static would prevent the linker from merging the two
literals. So use a macro instead; that also keeps the two instances
automatically in sync (rather than only the compile-time strlen
expression).
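
A minimal sketch of the macro approach, assuming illustrative names
(FAULTED_STR, FAULTED_SIZE and write_faulted() are chosen here for the
example and are not quoted from the patch): the literal and its length
come from a single definition, so they cannot drift apart, and
identical literals in one translation unit remain eligible for merging.

```c
#include <string.h>
#include <stddef.h>

/* One definition for both the text and its compile-time length. */
#define FAULTED_STR  "<faulted>"
#define FAULTED_SIZE (sizeof(FAULTED_STR) - 1)	/* exclude the '\0' */

/* Copy the marker into dst if it fits; return the number of
 * characters written (excluding the terminator), or 0 if not. */
size_t write_faulted(char *dst, size_t dst_len)
{
	if (dst_len < FAULTED_SIZE + 1)
		return 0;

	memcpy(dst, FAULTED_STR, FAULTED_SIZE);
	dst[FAULTED_SIZE] = '\0';
	return FAULTED_SIZE;
}
```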

Finally, for the two runs of spaces, it turns out that building the
strings on the stack is not the worst part of what gcc does -
it turns print_func_help_header_irq() into "if (tgid) { /*
print_event_info + five seq_printf calls */ } else { /*
print_event_info + another five seq_printf calls */ }". Taking
inspiration from a suggestion from Al Viro, use %.*s so that the output
either stops after the first two spaces or includes the whole string.
As a bonus, the seq_printfs now fit on single lines (at least, they are
not longer than the existing ones in the function just above), making
it easier to see that the ascii art lines up.
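
The %.*s technique can be sketched in userspace as follows. This is an
assumption-laden example: format_header_line() is a hypothetical helper,
snprintf stands in for seq_printf, and the header text is shortened.
The precision argument limits how many characters of the padding string
are printed, so a single call covers both layouts without branching
around the print calls:

```c
#include <stdio.h>
#include <stddef.h>

/* Format one header line. With tgid set, the full run of ten spaces is
 * printed to leave room for an extra column; otherwise only two. */
int format_header_line(char *buf, size_t len, int tgid)
{
	static const char space[] = "          ";	/* ten spaces */
	int prec = tgid ? 10 : 2;

	/* %.*s prints at most 'prec' characters of 'space'. */
	return snprintf(buf, len, "#%.*s  _-----=> irqs-off\n", prec, space);
}
```

Because the precision is a runtime value, the compiler only needs to
emit one sequence of print calls rather than duplicating it in each
branch.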

x86-64 defconfig + CONFIG_FUNCTION_TRACER:

$ scripts/stackdelta /tmp/stackusage.{0,1}
./kernel/trace/ftrace.c ftrace_mod_callback     152     136     -16
./kernel/trace/trace.c  trace_default_header    56      32      -24
./kernel/trace/trace.c  tracing_mark_raw_write  96      72      -24
./kernel/trace/trace.c  tracing_mark_write      104     80      -24

bloat-o-meter

add/remove: 1/0 grow/shrink: 0/4 up/down: 14/-375 (-361)
Function                                     old     new   delta
this_mod                                       -      14     +14
ftrace_mod_callback                          577     542     -35
tracing_mark_raw_write                       444     374     -70
tracing_mark_write                           616     540     -76
trace_default_header                         600     406    -194

Link: http://lkml.kernel.org/r/20190320081757.6037-1-linux@rasmusvillemoes.dk

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-08 12:15:12 -04:00
blktrace.c blkcg: annotate implicit fall through 2019-03-13 14:31:12 -06:00
bpf_trace.c Merge branch 'linus' into perf/core, to pick up fixes 2019-02-28 08:27:17 +01:00
fgraph.c tracing: Fix ftrace_graph_get_ret_stack() to use task and not current 2018-12-22 08:21:03 -05:00
ftrace_internal.h ftrace: Create new ftrace_internal.h header 2018-12-08 20:54:06 -05:00
ftrace.c tracing: Eliminate const char[] auto variables 2019-05-08 12:15:12 -04:00
Kconfig kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig 2019-02-27 21:43:20 +09:00
Makefile tracing: Add unified dynamic event framework 2018-12-08 20:54:09 -05:00
power-traces.c
preemptirq_delay_test.c tracing: Use trace_clock_local() for looping in preemptirq_delay_test.c 2018-10-17 15:35:33 -04:00
ring_buffer_benchmark.c ring-buffer: Fix mispelling of Calculate 2019-05-08 12:15:12 -04:00
ring_buffer.c ring-buffer: Fix ring buffer size in rb_write_something() 2019-04-02 18:24:06 -04:00
rpm-traces.c
trace_benchmark.c
trace_benchmark.h
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: initialize variable in create_dyn_event() 2019-03-26 08:35:36 -04:00
trace_dynevent.h tracing: Add unified dynamic event framework 2018-12-08 20:54:09 -05:00
trace_entries.h tracing: Change the function format to display function names by perf 2019-02-11 14:53:43 -05:00
trace_event_perf.c tracing/perf: Use strndup_user() instead of buggy open-coded version 2019-02-21 10:35:10 -05:00
trace_events_filter_test.h
trace_events_filter.c tracing: Have the error logs show up in the proper instances 2019-04-08 09:22:44 -04:00
trace_events_hist.c tracing: Have the error logs show up in the proper instances 2019-04-08 09:22:44 -04:00
trace_events_trigger.c tracing: Add trace_array parameter to create_event_filter() 2019-04-08 09:22:28 -04:00
trace_events.c tracing: Kernel access to Ftrace instances 2019-04-02 18:24:06 -04:00
trace_export.c
trace_functions_graph.c tracing: Put a margin between flags and duration for wakeup tracers 2019-02-06 11:56:19 -05:00
trace_functions.c
trace_hwlat.c
trace_irqsoff.c The biggest change for this release is in the histogram code. 2019-03-11 17:01:32 -07:00
trace_kdb.c tracing: kdb: Allow ftdump to skip all but the last few entries 2019-05-02 21:32:55 -04:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_kprobe.c tracing: Use tracing error_log with probe events 2019-04-02 18:24:07 -04:00
trace_mmiotrace.c
trace_nop.c
trace_output.c tracing: Simplify printf'ing in seq_print_sym 2018-12-22 08:21:06 -05:00
trace_output.h
trace_preemptirq.c kprobes: Prohibit probing on hardirq tracers 2019-02-13 08:16:40 +01:00
trace_printk.c tracing: Trivia spelling fix containerof() -> container_of() 2018-09-26 12:21:00 +03:00
trace_probe_tmpl.h tracing: probeevent: Do not accumulate on ret variable 2019-05-08 12:15:11 -04:00
trace_probe.c tracing: probeevent: Fix to make the type of $comm string 2019-05-08 12:15:11 -04:00
trace_probe.h tracing: uprobes: Re-enable $comm support for uprobe events 2019-05-08 12:15:11 -04:00
trace_sched_switch.c
trace_sched_wakeup.c tracing: Add conditional snapshot 2019-02-20 13:51:06 -05:00
trace_selftest_dynamic.c
trace_selftest.c function_graph: Have selftest also emulate tr->reset() as it did with tr->init() 2019-04-21 19:46:56 -04:00
trace_seq.c
trace_stack.c tracing: Use the return of str_has_prefix() to remove open coded numbers 2018-12-22 22:52:30 -05:00
trace_stat.c
trace_stat.h
trace_syscalls.c
trace_uprobe.c tracing: uprobes: Re-enable $comm support for uprobe events 2019-05-08 12:15:11 -04:00
trace.c tracing: Eliminate const char[] auto variables 2019-05-08 12:15:12 -04:00
trace.h tracing: Add trace_total_entries() / trace_total_entries_cpu() 2019-05-02 21:32:31 -04:00
tracing_map.c
tracing_map.h