Skip to content

Debugging

Sergey Bronnikov edited this page Dec 15, 2023 · 23 revisions

Debugging Tarantool

Status: Work-in-progress

Table of Contents

General Recommendations

Try to reproduce a problem on the latest version. Probably it is already fixed.

Debugging environment

Github Actions

Sometimes Tarantool has a problem that triggered in CI runner and cannot be reproduced locally. There is a helpful Github Action action-tmate (doc) that helps to debug your GitHub Actions via SSH by using tmate.

By default a failed step will cause all following steps to be skipped. You can specify that the tmate session only starts if a previous step failed.

  - name: Setup tmate session
    if: ${{ failure() }}
    uses: mxschmitt/action-tmate@v3

This approach works on Linux workers only.

Docker

NOTE: It is a Linux-specific method.

Reproduced locally? If so, nice. If not, whether it reproduces on a given OS or distro? If so, good. strace and gdb requires ptrace capacity of a Docker container: use docker run --privileged -it blah:blah (or --cap-add SYS_PTRACE). If you setup everything already in a container, but forgot --privileged, stop it, add ptrace capacity into json, start again.

grep -o CapAdd /var/lib/docker/containers/6b842b1290f5ca4e7b6a46bc6b985a93ab52f70d68804c75cc5c180e113126f8/hostconfig.json`
CapAdd:["SYS_PTRACE"]

QEMU

How to run FreeBSD in QEMU:

FreeBSD project publishes prepared QEMU images in https://download.freebsd.org/releases/VM-IMAGES/.

Download an image and resize it:

curl -O https://download.freebsd.org/releases/VM-IMAGES/13.2-RELEASE/amd64/Latest/FreeBSD-13.2-RELEASE-amd64.qcow2.xz
unxz FreeBSD-13.2-RELEASE-amd64.qcow2.xz
qemu-img resize FreeBSD-13.2-RELEASE-amd64.qcow2 +10G

Run QEMU with image: qemu-system-x86_64 -enable-kvm -smp 1 -m 2048 -drive file=FreeBSD-13.2-RELEASE-amd64.qcow2,format=qcow2 -enable-kvm -serial mon:stdio -net nic,model=virtio -net user

Inside FreeBSD guest OS:

login: root
echo 'console="comconsole"' > /boot/loader.conf
service growfs onestart
halt -p

Run QEMU virtual machine again: qemu-system-x86_64 -enable-kvm -smp 1 -m 2048 -drive file=FreeBSD-13.2-RELEASE-amd64.qcow2,format=qcow2 -enable-kvm -serial mon:stdio -net nic,model=virtio -net user -nographic

These links are helpful:

Using Git bisection

If you suspect that the problem is a regression, let's check and, if so, just bisect it. It is semi-automatic action that may give much more information about the problem without any real investigation.

There are good examples, where git bisect allows to better understand the problem in a short time:

$ git clone --recurse-submodules git@github.com:tarantool/tarantool.git
$ cd tarantool

$ git bisect start
$ git clean -xffd; git submodule foreach --recursive 'git clean -xffd'
$ cmake . -DCMAKE_BUILD_TYPE=Debug -DENABLE_BACKTRACE=ON -DENABLE_DIST=ON -DENABLE_FEEDBACK_DAEMON=OFF -DENABLE_BUNDLED_LIBCURL=OFF && make -j
$ ./src/tarantool ../repro.lua # !! BUG !!
$ git bisect bad

$ git clean -xffd; git submodule foreach --recursive 'git clean -xffd'
$ git checkout <..commit hash..>
$ git submodule update --recursive --init
$ cmake . -DCMAKE_BUILD_TYPE=Debug -DENABLE_BACKTRACE=ON -DENABLE_DIST=ON -DENABLE_FEEDBACK_DAEMON=OFF -DENABLE_BUNDLED_LIBCURL=OFF && make -j
$ ./src/tarantool ../repro.lua # !! OK !!
$ git bisect good
w
<..repeat git clean, cmake && make, repro.lua..>

Using logging

Use say_debug in C, io.stderr:write() or log-methods in Lua.

See https://www.tarantool.io/en/doc/latest/reference/reference_lua/log/

Enabling verbose mode in third party components:

Curl

Consider options below for enabling verbose mode in libcurl:

Using debug build

A weird someone-breaks-a-memory problem may be caught early on some assertion in the Debug build. Our packages are built with -DCMAKE_BUILD_TYPE=RelWithDebInfo: it has debugging info, but has no assertions and built with -O2. Debug mode could be enabled with -DCMAKE_BUILD_TYPE=Debug. Use debug build for a module if the problem may be there.

Enabling debug mode in third party components:

Curl

Note that CMake build does not allow build curl and libcurl debug enabled, see docs/INSTALL.cmake. However, CMAKE_BUILD_TYPE=Debug has worked for me.

Using Address Sanitizer/Valgrind

Build Tarantool with enabled Address Sanitizer (ENABLE_ASAN) or Valgrind (ENABLE_VALGRIND).

ASAN vs Valgrind: https://developers.redhat.com/blog/2021/05/05/memory-error-checking-in-c-and-c-comparing-sanitizers-and-valgrind and https://github.com/google/sanitizers/wiki/AddressSanitizerComparisonOfMemoryTools

Note, Valgrind cannot run executables that were built by Clang compiler. It was fixed in Valgrind 3.19, see https://bugs.kde.org/show_bug.cgi?id=452758.

Tarantool has a suppression list with known memory leaks, this list should be specified in environment variable on Tarantool start:

LSAN_OPTIONS=suppressions=${PWD}/asan/lsan.supp tarantool

Otherwise Addres Sanitizer will report about memory leaks to stdout.

==532476== Memcheck, a memory error detector
==532476== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==532476== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==532476== Command: /home/sergeyb/sources/MRG/tarantool/build/src/tarantool /home/sergeyb/sources/MRG/tarantool-ee/tarantool/test-run/lib/luatest/bin/luatest -c test/integration/general_test.lua
==532476== 
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x23
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x1b
==532476== Valgrind: debuginfo reader: ensure_valid failed:
==532476== Valgrind:   during call to ML_(img_get)
==532476== Valgrind:   request for range [299270124, +4) exceeds
==532476== Valgrind:   valid image size of 23591872 for image:
==532476== Valgrind:   "/home/sergeyb/sources/MRG/tarantool/build/src/tarantool"
==532476== 
==532476== Valgrind: debuginfo reader: Possibly corrupted debuginfo file.
==532476== Valgrind: I can't recover.  Giving up.  Sorry.
==532476==

Valgrind limitation: https://github.com/tarantool/tarantool/issues/3071

Fun fact: there is a built in Lua variable tarantoool.build.asan that controls max slice check, see commit 232c28f333ac75be26abb54dc308e2e5813b7acd:

commit 232c28f333ac75be26abb54dc308e2e5813b7acd
Author: Nikolay Shirokovskiy <nshirokovskiy@tarantool.org>
Date:   Tue Jul 11 17:36:37 2023 +0300

    fiber: turn off max slice check with ASAN

Using debugger

Using Lua debugger

It used to be very hard to debug Lua sources using standard Tarantool console facilities. Usually one ends up with a number of debugging prints inserted here and there. In this release we introduce console debugger shell which significantly simplify debugging scenario for Lua modules.

Lua debugger supports all basic operations you may expect from debugger: step-by-step execution, investigations of variables in local or global context, traversing stack frames, setting breakpoints, and so on.

local dbg = require("luadebug")

dbg()

local a = 1

local function b()
    print("hello")

    local m = 1
    local b = 2
    local a = 1
    local c = m + b
    if c == 2 then
        b = 22
    end
end

if a == 1 then
    b()
    a = 2
    b()
end
$ tarantool dbg-example.lua 
luadebug: Loaded for 2.11.0-entrypoint-1031-gdc451913f
break via dbg() => debug-dbg.lua:3 in chunk at debug-dbg.lua:0
   1    local debugger = require 'luadebug'
   2    debugger()
   3 => local date = require 'datetime'
   4
   5    local T = date.new{hour = 3, tzoffset = '+0300'}
   6    print(T)
luadebug>

Screencast: https://asciinema.org/a/560039

In addition to the instrumented approach of calling debugger via require('luadebug')() call mentioned above, one may use a less invasive approach, using the new -d option. This way you may debug Lua scripts directly, without any prior instrumentation:

$ tarantool -d debug.lua
Tarantool debugger 2.11.0-entrypoint-1031-gdc451913f
type 'help' for interactive help
luadebug: Loaded for 2.11.0-entrypoint-1031-gdc451913f
break via debug.lua => debug.lua:1 in chunk at debug.lua:0
   1 => local date = require 'datetime'
   2
   3    local T = date.new{hour = 3, tzoffset = '+0300'}
   4    print(T)

Note, Lua debugger has a limitations - there is no fiber support yet, see https://github.com/orgs/tarantool/discussions/5857.

Using GDB/LLDB with extensions

Note, SMALL allocators poisons deallocated memory regions by certain symbols. It could be helpful on debugging session.

There are two debugger extensions: LuaJIT-specific luajit-gdb.py and Tarantool-specific tarantool-gdb.py. NOTE: В ближайшее время расширения luajit-gdb.py и luajit_lldb.py будут объединены.

These GDB settings are useful on debugging:

$ cat ~/.gdbinit

# Cause GDB to print structures in an indented format with one member per line.
set print pretty on

set history save on
set history size unlimited

# Control the removal of duplicate history entries in the command history list.
set history remove-duplicates 100

set history filename ~/.gdb_history

LuaJIT-specific

Load extension:

(gdb) source ~/tools/tarantool/luajit/luajit-gdb.py

lj-arch

(gdb) help lj-arch 

lj-arch

The command requires no args and dumps values of LJ_64 and LJ_GC64
compile-time flags. These values define the sizes of host and GC
pointers respectively.
    
(gdb) lj-arch 
LJ_64: True, LJ_GC64: False, LJ_DUALNUM: False
(gdb) 

lj-gc

(gdb) help lj-gc                

lj-gc

The command requires no args and dumps current GC stats:
* total: <total number of allocated bytes in GC area>
* threshold: <limit when gc step is triggered>
* debt: <how much GC is behind schedule>
* estimate: <estimate of memory actually in use>
* stepmul: <incremental GC step granularity>
* pause: <pause between successive GC cycles>
* sweepstr: <sweep position in string table>
* root: <number of all collectable objects>
* gray: <number of gray objects>
* grayagain: <number of objects for atomic traversal>
* weak: <number of weak tables (to be cleared)>
* mmudata: <number of udata|cdata to be finalized>
    
(gdb) lj-gc 
GC stats: PAUSE
        total: 2163422
        threshold: 4325600
        debt: 0
        estimate: 2162874
        stepmul: 200
        pause: 200
        sweepstr: 16384/16384
        root: 14034 objects
        gray: 0 objects
        grayagain: 14023 objects
        weak: 6230 objects
        mmudata: 0 objects
(gdb) 

lj-stack

(gdb) help lj-stack                                                             
                                                                               
lj-stack [<lua_State *>]                                                       
                                       
The command receives a lua_State address and dumps the given Lua
coroutine guest stack:

<snipped>

(gdb) lj-stack                                                                 
----------- Red zone:  5 slots -----------
0x401d2828            [    ] VALUE: nil                           
0x401d2820            [    ] VALUE: nil 
0x401d2818            [    ] VALUE: nil                       
0x401d2810            [    ] VALUE: nil               
0x401d2808            [    ] VALUE: nil                             
----------- Stack:    62 slots -----------
0x401d2800            [   M] VALUE: nil                            
0x401d2620:0x401d27f8 [    ] 60 slots: Free stack slots
0x401d2618            [ BT ] VALUE: number 19
0x401d2610            [S   ] FRAME: dummy L                           

lj-state

(gdb) help lj-state 

lj-state
The command requires no args and dumps current VM and GC states
* VM state: <INTERP|C|GC|EXIT|RECORD|OPT|ASM|TRACE>
* GC state: <PAUSE|PROPAGATE|ATOMIC|SWEEPSTRING|SWEEP|FINALIZE|LAST>
* JIT state: <IDLE|ACTIVE|RECORD|START|END|ASM|ERR>
    
(gdb) lj-state 
VM state: CFUNC
GC state: PAUSE
JIT state: IDLE

lj-str

(gdb) help lj-str 

lj-str <GCstr *>

The command receives a <gcr> of the corresponding GCstr object and dumps
the payload, size in bytes and hash.

*Caveat*: Since Python 2 provides no native Unicode support, the payload
is replaced with the corresponding error when decoding fails.
    
(gdb) 

lj-tab

(gdb) help lj-tab 

lj-tab <GCtab *>

The command receives a GCtab adress and dumps the table contents:
* Metatable address whether the one is set
* Array part <asize> slots:
  <aslot ptr>: [<index>]: <tv>
* Hash part <hsize> nodes:
  <hnode ptr>: { <tv> } => { <tv> }; next = <next hnode ptr>
    
(gdb) 

lj-tv

(gdb) help lj-tv 

lj-tv <TValue *>

The command receives a pointer to <tv> (TValue address) and dumps
the type and some info related to it.

* LJ_TNIL: nil
* LJ_TFALSE: false
* LJ_TTRUE: true
* LJ_TLIGHTUD: light userdata @ <gcr>
* LJ_TSTR: string <string payload> @ <gcr>
* LJ_TUPVAL: upvalue @ <gcr>
* LJ_TTHREAD: thread @ <gcr>
* LJ_TPROTO: proto @ <gcr>
* LJ_TFUNC: <LFUNC|CFUNC|FFUNC>
  <LFUNC>: Lua function @ <gcr>, <nupvals> upvalues, <chunk:line>
  <CFUNC>: C function <mcode address>
  <FFUNC>: fast function #<ffid>
* LJ_TTRACE: trace <traceno> @ <gcr>
* LJ_TCDATA: cdata @ <gcr>
* LJ_TTAB: table @ <gcr> (asize: <asize>, hmask: <hmask>)
* LJ_TUDATA: userdata @ <gcr>
* LJ_TNUMX: number <numeric payload>

Whether the type of the given address differs from the listed above, then
error message occurs.
    
(gdb) 

Tarantool-specific

Load extension:

(gdb) source ~/sources/MRG/tarantool/tools/tarantool-gdb.py 

info tt-fibers

Display currently known tarantool fibers of the current cord.
Usage: info tt-fibers [ID]...
If ID is given, it is a space-separated list of IDs of fibers to display.
Otherwise, all fibers of the current cord are displayed.

tt-fiber

(gdb) help tt-fiber 
tt-fiber [FIBER_ID]
Use this command to select fiber which stack need to be explored.
FIBER_ID must be currently known, if omitted displays current fiber info.

Please, be aware that the outermost frame (level #0) is filtered out for
any fiber other than the one that is currently running, so backtrace in
this case starts with level #1 frame. It's because GDB always starts frame
sequence with a frame that matches actual value of the stack pointer register,
but this frame only makes sense for the currently running fiber.

Please, note that this command does NOT change currently running fiber,
it just selects stack to explore w/o changing any register/data.

(gdb) info tt-fibers 
  Id     Target   Name                               Stack              Function
<snipped>
  107    TtFiber  "memtx.gc"                         0x7f0acb87fec8     0x55e699f61130 <memtx_engine_gc_f(typedef __va_list_tag __va_list_tag *)>
  106    TtFiber  "checkpoint_daemon"                0x7f0ad13ffdd8     0x55e69a01d8d0 <gc_checkpoint_fiber_f>
  105    TtFiber  "gc"                               0x7f0ad137fec8     0x55e69a01b960 <gc_cleanup_fiber_f>
  104    TtFiber  "console/unix/:/var/run/tarantool/tarantool_pushmequeuetar2.control"   0x7f0ad127fd58     0x55e69a0d8970 <lua_fiber_run_f>
  102    TtFiber  "on_shutdown"                      0x7f0ad117ff98     0x55e699f20b80 <on_shutdown_f(typedef __va_list_tag __va_list_tag *)>
  101    TtFiber  "lua"                            * 0x7f0ad10ffe98     0x55e69a0d8970 <lua_fiber_run_f>
* 1      TtFiber  "sched"                            0x7fff9efa5588     0x0
(gdb) 
(gdb) tt-fiber 105
#1  __pthread_kill_internal (signo=6, threadid=139852930820864) at ./nptl/pthread_kill.c:78
78      in ./nptl/pthread_kill.c
(gdb) bt
#1  0x000055e69a103b3b in fiber_yield_timeout (delay=<optimized out>) at /build/usr/src/debug/tarantool-2.11.0/src/lib/core/fiber.c:750
#2  0x000055e69a104095 in fiber_sleep (delay=<optimized out>) at /build/usr/src/debug/tarantool-2.11.0/src/lib/core/fiber.c:778
#3  0x000055e69a01b9e0 in gc_cleanup_fiber_f (ap=<error reading variable: value has been optimized out>)
    at /build/usr/src/debug/tarantool-2.11.0/src/box/gc.c:298
#4  0x000055e699f20b60 in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=<optimized out>, ap=<optimized out>)
    at /build/usr/src/debug/tarantool-2.11.0/src/lib/core/fiber.h:1227
#5  0x000055e69a104136 in fiber_loop (data=<optimized out>) at /build/usr/src/debug/tarantool-2.11.0/src/lib/core/fiber.c:998
#6  0x000055e69a3928cc in coro_init () at /build/usr/src/debug/tarantool-2.11.0/third_party/coro/coro.c:108

tt-list

tt-list                                                                        
Display entries of the LIST_EXP.                                               
If LIST_EXP refers to the concrete item rather than the entire list, only items 
started with the specified one are displayed (up to the specified one in case
of reverse direction).                                                         
                                                                               
Usage: tt-list [[OPTION]... --] LIST_EXP                               
                                                                               
Options:

<snipped>

For example let's look at the list of engines (-f id,name is used to reduce the output)

Example: The whole list.

(gdb) tt-list -pretty -f id,name -- engines
rlist<engine::link> of length 5, ref=*(rlist*)0x55e69a7bb290 = {
  [0] ((engine*)0x55e69b29ecd0) = {
    id = 0,
    name = 0x55e69a3a81df "memtx"
  },
  [1] ((engine*)0x55e69b29e640) = {
    id = 1,
    name = 0x55e69a3a9c81 "sysview"
  },
  [2] ((engine*)0x55e69b29e6e0) = {
    id = 2,
    name = 0x55e69a3a9e9a "service"
  },
  [3] ((engine*)0x55e69b29e710) = {
    id = 3,
    name = 0x55e69a3a9e55 "blackhole"
  },
  [4] ((engine*)0x55e69b2a03f0) = {
    id = 4,
    name = 0x55e69a3aa095 "vinyl"
  }
}

Example: Display entries with the specific id.

(gdb) tt-list -pretty -f id,name -p '$entry->id == 2' -- engines
rlist<engine::link> of length 5, ref=*(rlist*)0x55e69a7bb290 = {
  [2] ((engine*)0x55e69b29e6e0) = {
    id = 2,
    name = 0x55e69a3a9e9a "service"
  }
}

Example: Display entries with the id less than some number.

(gdb) tt-list -pretty -f id,name -p '$entry->id < 2' -- engines
rlist<engine::link> of length 5, ref=*(rlist*)0x55e69a7bb290 = {
  [0] ((engine*)0x55e69b29ecd0) = {
    id = 0,
    name = 0x55e69a3a81df "memtx"
  },
  [1] ((engine*)0x55e69b29e640) = {
    id = 1,
    name = 0x55e69a3a9c81 "sysview"
  }
}

Example: Display entries with the id greater than some number.

(gdb) tt-list -pretty -f id,name -p '$entry->id > 2' -- engines
rlist<engine::link> of length 5, ref=*(rlist*)0x55e69a7bb290 = {
  [3] ((engine*)0x55e69b29e710) = {
    id = 3,
    name = 0x55e69a3a9e55 "blackhole"
  },
  [4] ((engine*)0x55e69b2a03f0) = {
    id = 4,
    name = 0x55e69a3aa095 "vinyl"
  }
}

tt-list-walk

tt-list-walk
This command implements 'walk' functionality, that is it iterates
through the list but unlike 'tt-list' it displays one entry per
invocation (sometimes it may be more convenient to see entries
one-by-one instead of all-at-once). To move through the list you just
need to repeat continuously the last command with the same arguments
(that is just enter blank line) until the list is exhausted and message
'No items left' is displayed.
There can be only one active walk at a time. Once LIST_EXP or any option
that affects entries sequence is changed, current active walk is dropped
and the new one is started automatically.

Usage: tt-list-walk [[OPTION]... --] LIST_EXP

Options:

<snipped>

tt-mp

(gdb) help tt-mp 

Decode and print MsgPack referred by EXP in a human-readable form
Usage: tt-mp [OPTIONS]... EXP

Options:
<snipped>

tt-tuple

(gdb) help tt-tuple 

Decode and print tuple referred by EXP
Usage: tt-tuple [OPTIONS]... EXP

Options:

<snipped>

Посмотреть гостевой стек в файбере по известному fiber id (в примере 101)

Ищем файбер (подробнее про поиск можно посмотреть в help tt-list):

(gdb) tt-list -p '$entry->fid == 101' -- cord_ptr->alive
rlist<fiber::link> of length 2149, ref=*(rlist*)0x55e69a8054e0 = {[2148] ((fiber*)0x7f0ad1000080) = {ctx = {sp = 0x7f0ad10ffe98}, 
    stack_slab = 0x7f0ad1080000, stack = 0x7f0ad1082000, stack_watermark = 0x7f0ad10f0030, stack_size = 516096, max_slice = {warn = 0, err = 0}, 
    stack_id = 0, gc = {cache = 0x55e69a805610 <main_cord+400>, slabs = {slabs = {prev = 0x7f0ad10000c8, next = 0x7f0ad10000c8}, stats = {
          used = 0, total = 0}}, on_alloc_cb = 0x0, on_truncate_cb = 0x0, cb_arg = 0x0}, first_alloc_bt = 0x0, gc_initial_size = 0, 
    caller = 0x55e69a805860 <main_cord+992>, csw = 1, fid = 101, flags = 0, clock_stat = {acc = 0, delta = 0, prev_delta = 0, cputime = 0},
<snipped>

Смотрим гостевой стек (выражение для найденного файбера берем из результата предыдущей команды):

(gdb) lj-stack ((fiber*)0x7f0ad1000080)->storage.lua.stack
<snipped>

Using network sniffing

Tarantool speaks with clients using IPROTO protocol based on MsgPack. See IPROTO description in Tarantool Documentation. Inspecting binary traffic is a boring task. There are two network sniffers that supports IPROTO decoding: Wireshark and Tcpdump. Wireshark supports custom dissectors written in Lua, tcpdump has a work-in-progress IPROTO support.

Wireshark

image

To enable the plugin, follow the steps below:

Clone the tarantool-dissector repository:

git clone https://github.com/tarantool/tarantool-dissector.git

Copy or symlink the plugin files into the Wireshark plugin directory:

mkdir -p ~/.local/lib/wireshark/plugins
cd ~/.local/lib/wireshark/plugins
ln -s /path/to/tarantool-dissector/MessagePack.lua ./
ln -s /path/to/tarantool-dissector/tarantool.dissector.lua ./

(For the location of the plugin directory on macOS and Windows, please refer to the Plugin folders chapter in the Wireshark documentation.)

Run the Wireshark GUI and ensure that the plugins are loaded:

Open Help > About Wireshark > Plugins.
Find MessagePack.lua and tarantool.dissector.lua in the list.

Now you can inspect incoming and outgoing Tarantool packets with user-friendly annotations.

Visit the project page for details: https://github.com/tarantool/tarantool-dissector.

Tcpdump

To use tcpdump with IPROTO support one need to build tcpdump with applied patch. See steps to build in https://github.com/the-tcpdump-group/tcpdump/blob/master/INSTALL.md.

Example of output:

283  16:57:32.821404 IP 127.0.0.1.3309 > 127.0.0.1.40078: Flags [P.], seq 21552:21655, ack 765, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 103: IPROTO
	1) size 103: response: OK, SYNC: 17, SCHEMA_VERSION: 78
284  16:57:32.821418 IP 127.0.0.1.40078 > 127.0.0.1.3309: Flags [P.], seq 765:784, ack 21655, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 19: IPROTO
	1) size 19: request: CALL_16, SYNC: 18
285  16:57:32.821431 IP 127.0.0.1.3309 > 127.0.0.1.40078: Flags [P.], seq 21655:21759, ack 784, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 104: IPROTO
	1) size 104: response: OK, SYNC: 18, SCHEMA_VERSION: 78
286  16:57:32.821444 IP 127.0.0.1.40078 > 127.0.0.1.3309: Flags [P.], seq 784:803, ack 21759, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 19: IPROTO
	1) size 19: request: CALL_16, SYNC: 19
287  16:57:32.821457 IP 127.0.0.1.3309 > 127.0.0.1.40078: Flags [P.], seq 21759:21864, ack 803, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 105: IPROTO
	1) size 105: response: OK, SYNC: 19, SCHEMA_VERSION: 78
288  16:57:32.821473 IP 127.0.0.1.40078 > 127.0.0.1.3309: Flags [P.], seq 803:822, ack 21864, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 19: IPROTO
	1) size 19: request: CALL_16, SYNC: 20
289  16:57:32.821487 IP 127.0.0.1.3309 > 127.0.0.1.40078: Flags [P.], seq 21864:21969, ack 822, win 512, options [nop,nop,TS val 1977542710 ecr 1977542710], length 105: IPROTO
	1) size 105: response: OK, SYNC: 20, SCHEMA_VERSION: 78

Using profiling

  • TODO: Это общая рекомендация: отрезать крупные куски, затыкать их заглушками и смотреть, что изменилось (по RPS, по профилю — но не одновременно). Можно делать заглушки с сопоставимыми по размеру данными, можно намеренно с маленькими. Тут клево оторвать вообще все, кроме сетевых походов, и посмотреть на недостижимый идеал. Потом отрезаниями кода понять, где сколько до идеала скрывается.
  • TODO: require('fiber').top_enable()
  • TODO: Большие readahead.
  • TODO: Много таплов в runtime arena.
  • TODO: Если в Lua не видно явно горячей функции, то тут возможны варианты: bottleneck вообще не там (см. пункт 6) или нагрузка размазана (см. пункт 7).

Poor man's profiler

C

Note that these receipts are described in Tarantool Documentation.

pstack <pid>

To use this tool, first install it with a package manager that comes with your Linux distribution. This command prints an execution stack trace of a running process specified by the PID. You might want to run this command several times in a row to pinpoint the bottleneck that causes the slowdown.

Once installed, say:

$ pstack $(pidof tarantool INSTANCENAME.lua)

Next, say:

$ echo $(pidof tarantool INSTANCENAME.lua)

to show the PID of the Tarantool instance that runs the INSTANCENAME.lua file.

You should get similar output:

Thread 19 (Thread 0x7f09d1bff700 (LWP 24173)):
#0 0x00007f0a1a5423f2 in ?? () from /lib64/libgomp.so.1
#1 0x00007f0a1a53fdc0 in ?? () from /lib64/libgomp.so.1
#2 0x00007f0a1ad5adc5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f0a1a050ced in clone () from /lib64/libc.so.6
Thread 18 (Thread 0x7f09d13fe700 (LWP 24174)):
#0 0x00007f0a1a5423f2 in ?? () from /lib64/libgomp.so.1
#1 0x00007f0a1a53fdc0 in ?? () from /lib64/libgomp.so.1
#2 0x00007f0a1ad5adc5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f0a1a050ced in clone () from /lib64/libc.so.6
<...>
Thread 2 (Thread 0x7f09c8bfe700 (LWP 24191)):
#0 0x00007f0a1ad5e6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000045d901 in wal_writer_pop(wal_writer*) ()
#2 0x000000000045db01 in wal_writer_f(__va_list_tag*) ()
#3 0x0000000000429abc in fiber_cxx_invoke(int (*)(__va_list_tag*), __va_list_tag*) ()
#4 0x00000000004b52a0 in fiber_loop ()
#5 0x00000000006099cf in coro_init ()
Thread 1 (Thread 0x7f0a1c47fd80 (LWP 24172)):
#0 0x00007f0a1a0512c3 in epoll_wait () from /lib64/libc.so.6
#1 0x00000000006051c8 in epoll_poll ()
#2 0x0000000000607533 in ev_run ()
#3 0x0000000000428e13 in main ()

gdb -ex “bt” -p <pid>

As with pstack, the GNU debugger (also known as gdb) needs to be installed before you can start using it. Your Linux package manager can help you with that.

Once the debugger is installed, say:

$ gdb -ex "set pagination 0" -ex "thread apply all bt" --batch -p $(pidof tarantool INSTANCENAME.lua)

Next, say:

$ echo $(pidof tarantool INSTANCENAME.lua)

to show the PID of the Tarantool instance that runs the INSTANCENAME.lua file.

After using the debugger, your console output should look like this:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

<snipped>

Thread 1 (Thread 0x7f72289ba940 (LWP 20535)):
#0 _int_malloc (av=av@entry=0x7f7226e0eb20 <main_arena>, bytes=bytes@entry=504) at malloc.c:3697
#1 0x00007f7226acf21a in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3234
#2 0x00000000004631f8 in vy_merge_iterator_reserve (capacity=3, itr=0x7f72264af9e0) at /usr/src/tarantool/src/box/vinyl.c:7629
#3 vy_merge_iterator_add (itr=itr@entry=0x7f72264af9e0, is_mutable=is_mutable@entry=true, belong_range=belong_range@entry=false) at /usr/src/tarantool/src/box/vinyl.c:7660
#4 0x00000000004703df in vy_read_iterator_add_mem (itr=0x7f72264af990) at /usr/src/tarantool/src/box/vinyl.c:8387
#5 vy_read_iterator_use_range (itr=0x7f72264af990) at /usr/src/tarantool/src/box/vinyl.c:8453
#6 0x000000000047657d in vy_read_iterator_start (itr=<optimized out>) at /usr/src/tarantool/src/box/vinyl.c:8501
#7 0x00000000004766b5 in vy_read_iterator_next (itr=itr@entry=0x7f72264af990, result=result@entry=0x7f72264afad8) at /usr/src/tarantool/src/box/vinyl.c:8592
#8 0x000000000047689d in vy_index_get (tx=tx@entry=0x7f7226468158, index=index@entry=0x2563860, key=<optimized out>, part_count=<optimized out>, result=result@entry=0x7f72264afad8) at /usr/src/tarantool/src/box/vinyl.c:5705
#9 0x0000000000477601 in vy_replace_impl (request=<optimized out>, request=<optimized out>, stmt=0x7f72265a7150, space=0x2567ea0, tx=0x7f7226468158) at /usr/src/tarantool/src/box/vinyl.c:5920
#10 vy_replace (tx=0x7f7226468158, stmt=stmt@entry=0x7f72265a7150, space=0x2567ea0, request=<optimized out>) at /usr/src/tarantool/src/box/vinyl.c:6608
#11 0x00000000004615a9 in VinylSpace::executeReplace (this=<optimized out>, txn=<optimized out>, space=<optimized out>, request=<optimized out>) at /usr/src/tarantool/src/box/vinyl_space.cc:108
#12 0x00000000004bd723 in process_rw (request=request@entry=0x7f72265a70f8, space=space@entry=0x2567ea0, result=result@entry=0x7f72264afbc8) at /usr/src/tarantool/src/box/box.cc:182
#13 0x00000000004bed48 in box_process1 (request=0x7f72265a70f8, result=result@entry=0x7f72264afbc8) at /usr/src/tarantool/src/box/box.cc:700
#14 0x00000000004bf389 in box_replace (space_id=space_id@entry=513, tuple=<optimized out>, tuple_end=<optimized out>, result=result@entry=0x7f72264afbc8) at /usr/src/tarantool/src/box/box.cc:754
#15 0x00000000004d72f8 in lbox_replace (L=0x413c5780) at /usr/src/tarantool/src/box/lua/index.c:72
#16 0x000000000050f317 in lj_BC_FUNCC ()
#17 0x00000000004d37c7 in execute_lua_call (L=0x413c5780) at /usr/src/tarantool/src/box/lua/call.c:282
#18 0x000000000050f317 in lj_BC_FUNCC ()
#19 0x0000000000529c7b in lua_cpcall ()
#20 0x00000000004f6aa3 in luaT_cpcall (L=L@entry=0x413c5780, func=func@entry=0x4d36d0 <execute_lua_call>, ud=ud@entry=0x7f72264afde0) at /usr/src/tarantool/src/lua/utils.c:962
#21 0x00000000004d3fe7 in box_process_lua (handler=0x4d36d0 <execute_lua_call>, out=out@entry=0x7f7213020600, request=request@entry=0x413c5780) at /usr/src/tarantool/src/box/lua/call.c:382
#22 box_lua_call (request=request@entry=0x7f72130401d8, out=out@entry=0x7f7213020600) at /usr/src/tarantool/src/box/lua/call.c:405
#23 0x00000000004c0f27 in box_process_call (request=request@entry=0x7f72130401d8, out=out@entry=0x7f7213020600) at /usr/src/tarantool/src/box/box.cc:1074
#24 0x000000000041326c in tx_process_misc (m=0x7f7213040170) at /usr/src/tarantool/src/box/iproto.cc:942
#25 0x0000000000504554 in cmsg_deliver (msg=0x7f7213040170) at /usr/src/tarantool/src/cbus.c:302
#26 0x0000000000504c2e in fiber_pool_f (ap=<error reading variable: value has been optimized out>) at /usr/src/tarantool/src/fiber_pool.c:64
#27 0x000000000041122c in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=<optimized out>, ap=<optimized out>) at /usr/src/tarantool/src/fiber.h:645
#28 0x00000000005011a0 in fiber_loop (data=<optimized out>) at /usr/src/tarantool/src/fiber.c:641
#29 0x0000000000688fbf in coro_init () at /usr/src/tarantool/third_party/coro/coro.c:110

Run the debugger in a loop a few times to collect enough samples for making conclusions about why Tarantool demonstrates suboptimal performance. Use the following script:

$ rm -f stack-trace.txt
$ watch -n 0.5 "gdb -ex 'set pagination 0' -ex 'thread apply all bt' --batch -p $(pidof tarantool INSTANCENAME.lua) | tee -a stack-trace.txt"

Structurally and functionally, this script is very similar to the one used with fiber.info() above.

Warning: Use the poor man’s profilers with caution: each time they attach to a running process, this stops the process execution for about a second, which may leave a serious footprint in high-load services.

Lua

"Programming in Lua" describes hooks mechanism in debug library:

The hook mechanism of the debug library allows us to register a function that will be called at specific events as your program runs. There are four kinds of events that can trigger a hook: call events happen every time Lua calls a function; return events happen every time a function returns; line events happen when Lua starts executing a new line of code; and count events happen after a given number of instructions. Lua calls hooks with a single argument, a string describing the event that generated the call: "call", "return", "line", or "count". Moreover, for line events, it also passes a second argument, the new line number. We can always use debug.getinfo to get more information inside a hook.

As a simple example, the following code installs a primitive tracer, which prints the number of each new line the interpreter executes:

debug.sethook(print, "l")

It simply installs print as the hook function and instructs Lua to call it only at line events. A more elaborated tracer can use getinfo to add the current file name to the trace:

function trace (event, line)
    local s = debug.getinfo(2).short_src
    print(s .. ":" .. line)
end
    
debug.sethook(trace, "l")

jit.p

LuaJIT has a builtin profiler. It generates simple textual summaries or source code annotations. It can be accessed with the -jp command line option or from Lua code by loading the underlying jit.p module.

Note: Sampling works for both interpreted and JIT-compiled code. The results for JIT-compiled code may sometimes be surprising. LuaJIT heavily optimizes and inlines Lua code — there's no simple one-to-one correspondence between source code lines and the sampled machine code.

require('jit.p').start('f', 'profile.txt')

<generate workload>

require('jit.p').stop()

It is possible to generate flamegraph using bundled profiler:

  • require('jit.p').start('FG','output')
  • flamegraph.pl output > output.svg

flamegraph.pl is a script that can be obtained in repository https://github.com/brendangregg/FlameGraph.

See notes about jit.p in Tarantool Documentation.

Sysprof

The default profiling options for LuaJIT are not fine enough to get an understanding of performance. For example, perf only able to show host stack, so all the Lua calls are seen as single pcall. Oppositely, the jit.p module provided with LuaJIT is not able to give any information about the host stack. Here comes the sysprof: it is able to capture both guest and host stacks simultaneously, along with virtual machine states.

Sysprof has the following Lua API:

misc.sysprof.start(opts)
misc.sysprof.stop()
misc.sysprof.report()

First two functions return boolean res and err, which is nil on success and contains an error message on failure.

misc.sysprof.report returns a Lua table with counters.

Parameter opts for the misc.sysprof.start can contain the following parameters:

{
  mode = 'D'/'L'/'C', -- 'D' = DEFAULT, 'L' = LEAF, 'C' = CALLGRAPH
  interval = 10, -- sampling interval in msec.
  path = '/path/to/file' -- location to store profile data.
}

where mode MUST be provided always, interval and path are optional. The default interval is 10 msec, path -- sysprof.bin.

The default profile data parser is provided. Its output is flamegraph.pl-suitable, so one can do this:

$ luajit-parse-sysprof /path/to/profile/data > tmp
$ flamegraph.pl tmp > graph.svg

See also:

Memprof

See https://www.tarantool.io/en/doc/latest/reference/tooling/luajit_memprof/

perf

perf record -a -g and perf report. Тут можно увидеть, например, упираемся ли мы в декодинг tls на nginx, принимающем входящие соединения. Хотя процессными мерами (см. пункт 7) это тоже можно выяснить.

https://www.tarantool.io/en/doc/latest/book/admin/server_introspection/#perf

gperftools

see https://www.tarantool.io/en/doc/latest/book/admin/server_introspection/#gperftools

LuaJIT

General Recommendations

is LuaJIT-specific? попробовать выполнить скрипт в PUC Rio Lua

is LuaJIT the Problem?

is it reproduced on the latest version of tarantool's luajit and upstream luajit?

Find unused code

Sorry for the shameless plug, but I use luatrace to find unused code that can (possibly) be removed, again reducing the size of the test case. Run lua -luatrace.profile <file.lua> and look at annotated-source.txt - you’ll see which lines aren’t executed. Of course not all unused code can be removed.

Using logging

Don’t use print!

print isn't implemented as a fast function, so traces containing print are aborted. So if you put a print in the middle of the problem trace, you’ll stop the trace being compiled and hide your problem.

If you want to get some output use io.write which doesn’t abort traces, but beware - it does tickle the heuristics.

Component localisation

Perhaps your problem is component-specific. Disable and enable LuaJIT components to get this and find out how they influence on the problem:

GC64: CMake option LUAJIT_ENABLE_GC64. GC is enabled by default in Tarantool for Linux and disabled on macOS.

HW/OS: LuaJIT supports different platforms (operating system or architecture), probably your problem is OS-specific or architecture-specific.

JIT: LuaJIT could interpret and compile Lua code. JIT compiler is a huge and sophisticated part of LuaJIT. Try to disable JIT and reproduce a problem without it. JIT could be disabled in command line -joff and in Lua with jit.off, see https://luajit.org/running.html. Note, you can disable JIT compilation for the project and for a certain modules.

Flags: LuaJIT has a number of optimization levels and allows to control these levels as well as different optimizations. These levels and flags can move, hide or expose the problem, so it is worth to use these options. Optimization level could be set using command line option -O[0-3] and optimization flag using command line option -O[+|-]flag (+ - enabled, - - disabled). See https://luajit.org/running.html

Bisecting LuaJIT optimization level and flags could be automated with a script below:

#!/bin/bash

set -e

lua_bin=$1
lua_code=$2

echo "Lua runtime: $lua_bin"
echo "Lua code:    $lua_code"

$lua_bin $lua_code > /dev/null
echo "lua OK"

$lua_bin -joff $lua_code > /dev/null
echo "$lua_bin -joff OK"

for i in {1..50}
do
    echo -Ohotloop=$i
    $lua_bin -Ohotloop=$i $lua_code > /dev/null
    if [ $? -ne 0 ]
    then
        echo "Error on " $lua_bin -Ohotloop=$i $lua_code "> /dev/null"
        for o in "" "1" "2" "3" "-fold" "-cse" "-dce" "-narrow" "-loop" "-fwd" "-dse" "-abc" "-fuse"
        do
            echo -O$o -Ohotloop=$i
            $lua_bin -O$o -Ohotloop=$i $lua_code > /dev/null
        done
        break
    fi
done

Debug build

Set CMAKE_BUILD_TYPE to Debug, this will enable debugging in LuaJIT build. Additionally it is useful to enable LuaJIT assertions by setting CMake variables LUA_USE_APICHECK, assertions for the Lua/C API, and LUA_USE_ASSERT, assertions for the whole LuaJIT VM, to ON. On building LuaJIT as a part of Tarantool these variables will be enabled automatically in cmake/luajit.cmake.

Reducing test case

Reducing testcase is a popular task on debugging LuaJIT problems. This section describes reducing step-by-step.

Lua code amalgamation

Sometimes LuaJIT problem is triggered in a Lua project or module that consists from a number of files. In such cases it is recommended to amalgamate files with Lua code to a single one.

There is a number of tools for doing this:

reducing testcase

There is a number of tools that reduce data automatically. Al this tools works in a similar way: you create a script that executes program with reproducer and then pass script within data to a reducer tool. creduce is good enough in reducing Lua snippets. However, one could replace creduce with another one reducer. Consider these - alternatives: cvise, treereduce and jubnzv/dd. Note, testcases produced by fuzzing tests could be reduced automatically by test itself, pass -minimize_crash=1 option and wait.

Let's consider an example of using creduce. Create a script interestingness.sh:

#!/bin/sh

<fullpath>/src/luajit crash.lua 2>&1 | tee output.txt
grep "LuaJIT ASSERT lj_asm.c:147: checkmclim: red zone overflow:" output.txt

Run creduce:

$ creduce interestingness.sh crash.lua

formatting

Use LuaFormatter, StyLua or any other tool to format reduced Lua code.

References

Developer Guidelines ↗

Architecture

How To ...?

Recipes

Upgrade instructions

Useful links

Old discussions

Personal pages

Clone this wiki locally