Commit 90405aa

Andi Kleen authored and Ingo Molnar committed
perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode
In callstack mode the LBR is not a ring buffer, but a stack that grows up and down. This means in this case we don't need to access all LBRs, only the ones up to TOS.

Do this optimization for the normal LBR read, and the context switch save/restore code. For save/restore it can be done unconditionally, as it only runs when call stack mode is active.

This recovers some of the cost of going to 32 LBRs on Skylake.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: eranian@google.com
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/1432786398-23861-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent e057336 commit 90405aa
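
As an illustration of the ring-versus-stack distinction in the commit message, here is a minimal userspace sketch of the index walk. It is not kernel code: `lbr_slot` and `lbr_walk` are hypothetical helpers that only mirror the `(tos - i) & mask` expression and the loop bounds that appear in the diff, and they assume the number of LBRs is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): slot of the i-th most
 * recent entry, mirroring `(tos - i) & mask` from the diff.
 * Assumes lbr_nr is a power of two so that lbr_nr - 1 is a valid mask.
 */
static unsigned int lbr_slot(uint64_t tos, unsigned int i, unsigned int lbr_nr)
{
	uint64_t mask = (uint64_t)lbr_nr - 1;

	return (unsigned int)((tos - i) & mask);
}

/*
 * Hypothetical helper: record the slots visited for a given loop bound.
 * Passing bound == lbr_nr models the old loop, bound == tos the new one.
 * The caller must provide room for `bound` entries in out[].
 */
static size_t lbr_walk(uint64_t tos, unsigned int lbr_nr, unsigned int bound,
		       unsigned int *out)
{
	size_t n = 0;

	for (unsigned int i = 0; i < bound; i++)
		out[n++] = lbr_slot(tos, i, lbr_nr);

	return n;
}
```

With 32 LBRs and a TOS of 3, `lbr_walk(3, 32, 32, out)` visits all 32 slots and wraps around through slot 31, while `lbr_walk(3, 32, 3, out)` visits only slots 3, 2 and 1.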

1 file changed: +7 -3 lines changed


arch/x86/kernel/cpu/perf_event_intel_lbr.c

Lines changed: 7 additions & 3 deletions
@@ -240,7 +240,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 
 	mask = x86_pmu.lbr_nr - 1;
 	tos = intel_pmu_lbr_tos();
-	for (i = 0; i < x86_pmu.lbr_nr; i++) {
+	for (i = 0; i < tos; i++) {
 		lbr_idx = (tos - i) & mask;
 		wrmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]);
 		wrmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]);
@@ -263,7 +263,7 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
 
 	mask = x86_pmu.lbr_nr - 1;
 	tos = intel_pmu_lbr_tos();
-	for (i = 0; i < x86_pmu.lbr_nr; i++) {
+	for (i = 0; i < tos; i++) {
 		lbr_idx = (tos - i) & mask;
 		rdmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]);
 		rdmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]);
@@ -425,8 +425,12 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
 	u64 tos = intel_pmu_lbr_tos();
 	int i;
 	int out = 0;
+	int num = x86_pmu.lbr_nr;
 
-	for (i = 0; i < x86_pmu.lbr_nr; i++) {
+	if (cpuc->lbr_sel->config & LBR_CALL_STACK)
+		num = tos;
+
+	for (i = 0; i < num; i++) {
 		unsigned long lbr_idx = (tos - i) & mask;
 		u64 from, to, mis = 0, pred = 0, in_tx = 0, abort = 0;
 		int skip = 0;
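
The bound selection in the `intel_pmu_lbr_read_64` hunk can be sketched outside the kernel as follows. This is a rough model only: `SKETCH_CALL_STACK` is a made-up flag bit standing in for `LBR_CALL_STACK` (whose real value is not shown in this diff), and `lbr_read_bound` and `lbr_iterations_skipped` are hypothetical names. The sketch assumes `tos` is a valid index below `lbr_nr`.

```c
#include <stdint.h>

/* Hypothetical flag bit; stands in for LBR_CALL_STACK, real value not shown here. */
#define SKETCH_CALL_STACK (1ULL << 0)

/*
 * Hypothetical helper: loop bound for the read path, mirroring the hunk.
 * Defaults to all LBRs and drops to tos when the call-stack flag is set.
 */
static unsigned int lbr_read_bound(uint64_t config, uint64_t tos,
				   unsigned int lbr_nr)
{
	unsigned int num = lbr_nr;

	if (config & SKETCH_CALL_STACK)
		num = (unsigned int)tos;

	return num;
}

/*
 * Hypothetical helper: loop iterations avoided relative to visiting
 * every LBR. Assumes tos < lbr_nr.
 */
static unsigned int lbr_iterations_skipped(uint64_t config, uint64_t tos,
					   unsigned int lbr_nr)
{
	return lbr_nr - lbr_read_bound(config, tos, lbr_nr);
}
```

For example, with 32 LBRs and a TOS of 6, the call-stack case runs 6 iterations and skips 26, while the non-call-stack case still runs all 32.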
