[C++] Extracting files failed when creating database for chrome #19238

Closed
mcc0612mcc0612 opened this issue Apr 7, 2025 · 3 comments
Labels
question Further information is requested

Comments

@mcc0612mcc0612

mcc0612mcc0612 commented Apr 7, 2025

Description of the issue

I'm trying to create a CodeQL database for Chrome. Although the database was created successfully, errors occurred during the extraction phase and many files are missing from it.

Partial error log output I found in database-create.log:

 [ERROR] dataset import> 37876644_0.trap.zst for no link target, 1: java.io.IOException: Not enough input bytes
                              io.airlift.compress.zstd.ZstdInputStream.read(ZstdInputStream.java:86)
                              com.semmle.util.trap.CompressedFileInputStream$WrappedZstdInputStream.read(CompressedFileInputStream.java:54)
                              com.semmle.inmemory.trap.TrapInputStream.read(TrapInputStream.java:60)
                              com.semmle.inmemory.trap.TrapScanner.fill(TrapScanner.java:451)
                              com.semmle.inmemory.trap.TrapScanner.ensureNext(TrapScanner.java:428)
                              com.semmle.inmemory.trap.TrapScanner.nextToken(TrapScanner.java:61)
                              com.semmle.inmemory.trap.TRAPReader.scanTuplesAndLabels(TRAPReader.java:504)
                              com.semmle.inmemory.trap.TRAPReader.importTuples(TRAPReader.java:425)
                              com.semmle.inmemory.trap.ImportTasksProcessor.process(ImportTasksProcessor.java:262)
                              com.semmle.inmemory.trap.ImportTasksProcessor.lambda$importTrap$1(ImportTasksProcessor.java:179)
                              com.semmle.util.concurrent.FutureUtils.lambda$mapAsync_$8(FutureUtils.java:161)
                              java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
                              java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
                              java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
                              java.base/java.lang.Thread.run(Unknown Source)
                              at (start of line)
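
To gauge how many TRAP files are affected by import errors like the one above, the log can be scanned for matching lines. The following is a minimal sketch, not part of the original report: it assumes every import error line has the same shape as the single line quoted above, and `summarize_import_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the one log line quoted above:
#   [ERROR] dataset import> <name>.trap.zst <context>: <exception and message>
IMPORT_ERROR = re.compile(r"\[ERROR\] dataset import> (\S+\.trap\.zst)[^:]*: (.*)")


def summarize_import_errors(lines):
    """Return (set of affected TRAP file names, Counter of error reasons)."""
    files = set()
    reasons = Counter()
    for line in lines:
        match = IMPORT_ERROR.search(line)
        if match:
            files.add(match.group(1))
            reasons[match.group(2).strip()] += 1
    return files, reasons


# Example with the line from this report; for a real log, pass an open file
# object instead, e.g. open("database-create.log", errors="replace").
sample = [
    " [ERROR] dataset import> 37876644_0.trap.zst for no link target, 1: "
    "java.io.IOException: Not enough input bytes",
]
files, reasons = summarize_import_errors(sample)
print(len(files), "TRAP file(s) affected")
for reason, count in reasons.most_common():
    print(count, reason)
```

Collecting distinct file names rather than counting lines avoids overcounting if the same TRAP file is reported more than once.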

A snippet of the errors in build-tracer.log, which all seem to relate to files in libc++ and chromium/src/base:

"../../base/hash/hash.h", line 38: error: no instance of overloaded function "base::as_byte_span" matches the argument list
            argument types are: (std::__Cr::string_view)
    return FastHash(as_byte_span(str));
                    ^
"../../base/containers/span.h", line 1699: note: number of parameters of function template "base::as_byte_span<ExplicitArgumentBarrier...,ElementType,Extent>(base::allow_nonunique_obj_t, const ElementType (&)[Extent])" does not match the call
  constexpr auto as_byte_span(allow_nonunique_obj_t,
                 ^
"../../base/containers/span.h", line 1694: note: candidate function template "base::as_byte_span<ExplicitArgumentBarrier...,ElementType,Extent>(const ElementType (&)[Extent])" failed deduction
  constexpr auto as_byte_span(const ElementType (&arr LIFETIME_BOUND)[Extent]) {
                 ^
"../../base/containers/span.h", line 1686: note: number of parameters of function template "base::as_byte_span<ExplicitArgumentBarrier...,T>(base::allow_nonunique_obj_t, const T &)" does not match the call
  constexpr auto as_byte_span(allow_nonunique_obj_t, const T& t) {
                 ^
"../../base/containers/span.h", line 1680: note: constraint on candidate function template "base::as_byte_span<ExplicitArgumentBarrier...,T>(const T &)" not satisfied
  constexpr auto as_byte_span(const T& t) {
                 ^
"../../base/containers/span.h", line 1674: note: number of parameters of function template "base::as_byte_span<ExplicitArgumentBarrier...,T>(base::allow_nonunique_obj_t, const T &)" does not match the call
  constexpr auto as_byte_span(allow_nonunique_obj_t, const T& t LIFETIME_BOUND) {
                 ^
"../../base/containers/span.h", line 1669: note: constraint on candidate function template "base::as_byte_span<ExplicitArgumentBarrier...,T>(const T &)" not satisfied
  constexpr auto as_byte_span(const T& t LIFETIME_BOUND) {
                 ^

I found a similar issue in #16449 (comment). According to the reply there, that issue was fixed in CodeQL 2.21.0, but I still get these errors with that version.

My configuration:
CodeQL: 2.21.0
Platform: Debian GNU/Linux 6.1.0-32-cloud-amd64
RAM: 128G
CPU: 32 cores

Reproduction
To reproduce the issue, steps are as follows:

  1. Build Chrome following https://chromium.googlesource.com/chromium/src/+/main/docs/linux/build_instructions.md
  2. Delete all object files and static libraries under out/Default/obj/content/browser/ with: find out/Default/obj/content/browser/ -type f \( -name "*.o" -o -name "*.a" \) -delete
  3. Run: codeql database create /path/to/database -J-Xmx80G --overwrite --language=cpp --command='autoninja -C out/Default chrome'
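
Step 2 removes build outputs so that the traced build in step 3 has to recompile them. For anyone who wants to preview what would be removed first, here is a rough Python equivalent of that find command. It is only a sketch: `delete_build_outputs` and its `dry_run` parameter are hypothetical names introduced for illustration, not part of CodeQL or the Chromium tooling.

```python
from pathlib import Path


def delete_build_outputs(root, suffixes=(".o", ".a"), dry_run=True):
    """Find regular files under root ending in one of suffixes.

    Returns the sorted list of matching paths. Files are only deleted
    when dry_run is False. Symlinks are skipped, like find's -type f.
    """
    matched = sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file()
        and not path.is_symlink()
        and path.name.endswith(tuple(suffixes))
    )
    if not dry_run:
        for path in matched:
            path.unlink()
    return matched


# Preview first, then delete (path taken from step 2 above):
# for path in delete_build_outputs("out/Default/obj/content/browser"):
#     print(path)
# delete_build_outputs("out/Default/obj/content/browser", dry_run=False)
```

Defaulting to a dry run is a deliberate choice here, since deleting from the wrong directory forces a long rebuild.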

The log files are too big to upload here, so I have shared them on Google Drive. I appreciate your help.
build-trace.log
database-create.log

@mcc0612mcc0612 mcc0612mcc0612 added the question Further information is requested label Apr 7, 2025
@mcc0612mcc0612 mcc0612mcc0612 changed the title Extracting files failed when creating database for chrome [C++] Extracting files failed when creating database for chrome Apr 7, 2025
@jketema
Contributor

jketema commented Apr 8, 2025

Hi @mcc0612mcc0612,

Thanks for your report. This is likely a different issue from the one reported in #16449, and it looks like there are actually multiple issues here:

  1. Our extraction process crashes somewhere, leading to the Java stack trace near the top of your message
  2. Our C/C++ frontend is not able to parse all code from the Chromium codebase

Looking at the logs you provided, (1) only seems to occur for 94 files, and (2) seems to relate mostly to code deep in libraries used by Chromium. Given this, I believe that the database you created will still be of high quality.

For (1), the logs do not provide any clear information on where the crash might be occurring, and given that this only occurs in a few cases, looking further into it cannot really take priority. With regard to (2), we are currently working on an update to the C/C++ frontend, which should improve things. Unfortunately, it is not clear yet when this will be available. In the meantime, I've added your report to our internal issue for tracking Chromium-related problems.

@mcc0612mcc0612
Author

Thanks for the information. As you pointed out, the database is indeed of good quality. However, one key file we’re trying to analyze, render_frame_host_impl.cc, is missing.

I attempted to compile a new database while excluding only a few object files under out/Default/obj/content/browser/, specifically: render_frame_host.o, frame_tree_node.o, render_frame_host_manager.o, and navigator.o. With this setup, render_frame_host_impl.cc can now be analyzed properly, although the logs still contain some of the same bug-related messages.
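
One way to check whether a given source file made it into a database is to look for it in the database's source archive. The sketch below rests on an assumption that is not stated in this thread: that the finalized database keeps its source archive as src.zip in the database root. `sources_in_archive` is a hypothetical helper name.

```python
import zipfile


def sources_in_archive(archive_path, filename):
    """Return all archive entries whose base name equals filename."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            entry
            for entry in archive.namelist()
            if entry.rsplit("/", 1)[-1] == filename
        ]


# Assumed location of the source archive; adjust to the actual database path:
# hits = sources_in_archive("/path/to/database/src.zip",
#                           "render_frame_host_impl.cc")
# print(hits or "not found in source archive")
```

An empty result suggests the file was never captured by the extractor. A non-empty result only shows that the source text was archived and does not by itself prove that extraction of that file succeeded.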

Do you have any thoughts on this?

@jketema
Contributor

jketema commented Apr 16, 2025

> Do you have any thoughts on this?

No, I don't. In part because the approach you describe is one we do not actively support.
