Backtraces fail when debugging inside a pipeline. #8440

Open
mcourteaux opened this issue Oct 18, 2024 · 4 comments

Comments

@mcourteaux
Contributor

mcourteaux commented Oct 18, 2024

I'm using x86-64-linux AOT-generated pipelines, statically linked into my binary. When I hit a halide_assert() while running in the debugger, the backtrace never shows me where the pipeline was actually called. It seems that calling conventions are not respected, so the backtrace algorithms fail. I get something like this:

#0  0x00007ffff747a664 in __pthread_kill_implementation () at /usr/lib64/libc.so.6
#1  0x00007ffff7421c4e in raise () at /usr/lib64/libc.so.6
#2  0x00007ffff7409902 in abort () at /usr/lib64/libc.so.6
#3  0x0000000000980549 in halide_default_error ()
#4  0x0000000000989cbe in Halide::Runtime::Internal::(anonymous namespace)::HeapPrinter<(Halide::Runtime::Internal::PrinterType)1, 1024ul>::~HeapPrinter() [clone .120] ()
#5  0x000000000098a096 in halide_error_access_out_of_bounds ()
#6  0x0000000000873c87 in neonraw_bilateral_grid_loglum_constructor_8-x86-64-linux-avx2-fma-profile-cuda-no_bounds_query-no_runtime ()
#7  0x00000000008723a5 in neonraw_bilateral_grid_loglum_constructor_8 ()
#8  0x0000000000fcf040 in ??? ()
#9  0x3f0758b53ca0a0a1 in ??? ()
#10 0xbe651cc63f605ac1 in ??? ()
#11 0x00000000300040c8 in ??? ()
#12 0x0000000000000000 in ??? ()

So everything up to frame #7 seems fine, but everything after it is total gibberish. Note that the function does return correctly and there are no bugs in control flow. It's just not debuggable if you can't get to the call site.

@alexreinking
Member

What C++ compiler and version are you using?

@abadams
Member

abadams commented Oct 18, 2024

It's a standard function call, but I think we're doing the equivalent of -fomit-frame-pointer, because that's the default behavior for O3. I can't figure out how to turn it off though in the LLVM API...
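
To make the frame-pointer hypothesis concrete, here is a minimal toy model (not Halide or LLVM code) of how an unwinder that relies on the saved frame-pointer chain loses the callers once one frame stores unrelated data where the saved frame pointer is expected. The stack layout, the `example_stack` call chain (main -> app_code -> pipeline), and all addresses are made up for illustration. Real debuggers usually prefer DWARF unwind info and only fall back on frame pointers or heuristics, so this sketches one possible cause, not a diagnosis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stack: a flat array of 64-bit words indexed by "address".
// A frame that keeps a frame pointer stores two adjacent words:
//   stack[fp]     = the caller's frame pointer (saved rbp)
//   stack[fp + 1] = the return address into the caller
using Stack = std::vector<std::uint64_t>;

// Write one frame record at index fp.
void push_frame(Stack &stack, std::size_t fp,
                std::uint64_t saved_fp, std::uint64_t return_address) {
    assert(fp + 1 < stack.size());
    stack[fp] = saved_fp;
    stack[fp + 1] = return_address;
}

// Follow the saved-frame-pointer chain and collect return addresses.
// Stops at a null frame pointer, at a pointer outside the modeled stack,
// or after max_frames entries.
std::vector<std::uint64_t> walk_frame_pointers(const Stack &stack,
                                               std::uint64_t fp,
                                               std::size_t max_frames = 16) {
    std::vector<std::uint64_t> return_addresses;
    while (fp != 0 && fp < stack.size() && fp + 1 < stack.size() &&
           return_addresses.size() < max_frames) {
        return_addresses.push_back(stack[fp + 1]);
        fp = stack[fp];
    }
    return return_addresses;
}

// Hypothetical call chain: main -> app_code -> pipeline (innermost, at 4).
// When the pipeline does not keep a frame pointer, the slot that should
// hold the caller's frame pointer contains arbitrary data instead.
Stack example_stack(bool pipeline_keeps_frame_pointer) {
    Stack stack(16, 0);
    push_frame(stack, 12, 0, 0x1000);  // main: outermost frame
    push_frame(stack, 8, 12, 0x2000);  // app_code: returns into main
    const std::uint64_t unrelated_data = 0x3f0758b53ca0a0a1ULL;
    push_frame(stack, 4, pipeline_keeps_frame_pointer ? 8 : unrelated_data,
               0x3000);                // pipeline: returns into app_code
    return stack;
}
```

With an intact chain, walking from index 4 recovers all three return addresses. With the corrupted slot, the walk yields only the first one and the callers are lost; a real debugger might instead print garbage frames. In LLVM IR, frame-pointer retention is controlled per function by the string attribute "frame-pointer" (values include "none", "non-leaf", and "all"); whether setting it would fix the backtraces in this issue is untested here.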

@mcourteaux
Contributor Author

> It's a standard function call, but I think we're doing the equivalent of -fomit-frame-pointer, because that's the default behavior for O3.

Yeah, but then why does the stack trace work within the AOT-compiled pipeline and the AOT-compiled runtime? It makes me think that only the entry code is doing something weird with frame pointers.

@mcourteaux
Contributor Author

> What C++ compiler and version are you using?

My project (and I believe Halide too) is being compiled with this:

❯ clang-18 --version
clang version 18.1.8 (Fedora 18.1.8-1.fc40)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg
