By default, OpenBLAS uses as many threads as are available. You can set the thread count at runtime with `void goto_set_num_threads(int num_threads);` or `void openblas_set_num_threads(int num_threads);` (see https://github.com/xianyi/OpenBLAS#setting-the-number-of-threads-at-runtime).

The default for `--threads` is `std::thread::hardware_concurrency()`, which reports the number of hardware threads, hyper-threads included. That is not the same as the number of physical CPU cores. Setting threads equal to cores usually gives the best performance. Here is how you can determine the number of CPU cores: https://github.com/ggerganov/llama.cpp/blob/d783f7982e0e823a2626a9956359c0d36c1a7e21/examples/common.cpp#L34-L68
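
For illustration, here is a rough sketch of one way to count physical cores on Linux. It is not a copy of the linked code. The helper name `count_physical_cores` is made up for this example, and it assumes the kernel exposes `/sys/devices/system/cpu/cpuN/topology/thread_siblings`. Where that file is missing (other operating systems, some containers), it simply falls back to the logical thread count.

```cpp
#include <fstream>
#include <string>
#include <thread>
#include <unordered_set>

// Hypothetical helper (not part of llama.cpp or OpenBLAS).
// Hyper-threads that share a physical core report the same sibling mask,
// so the number of distinct masks approximates the number of physical cores.
inline int count_physical_cores() {
    std::unordered_set<std::string> sibling_masks;

    // Walk cpu0, cpu1, ... and stop at the first CPU without topology info.
    // The 4096 upper bound is an arbitrary safety limit for this sketch.
    for (unsigned int cpu = 0; cpu < 4096; ++cpu) {
        std::ifstream file("/sys/devices/system/cpu/cpu" + std::to_string(cpu) +
                           "/topology/thread_siblings");
        if (!file.is_open()) {
            break;
        }
        std::string mask;
        if (std::getline(file, mask)) {
            sibling_masks.insert(mask);
        }
    }

    if (!sibling_masks.empty()) {
        return static_cast<int>(sibling_masks.size());
    }

    // Fallback: sysfs was not readable, so use the logical thread count.
    // hardware_concurrency() may return 0 when the value is unknown.
    const unsigned int logical = std::thread::hardware_concurrency();
    return logical > 0 ? static_cast<int>(logical) : 1;
}
```

You could then pass the result to `--threads`, or to `openblas_set_num_threads(count_physical_cores());` when linking against OpenBLAS. Whether that actually beats the default is worth measuring on your own machine.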