Vtune Profiler Get Started Guide 2023.1 769038 773630
Contents
Chapter 1: Get Started with Intel® VTune™ Profiler
  Get Started with Intel® VTune™ Profiler for Windows* OS
    Example: Profile an OpenMP* Application on Windows*
    Example: Profile a SYCL* Application on Windows*
  Get Started with Intel® VTune™ Profiler for Linux* OS
    Example: Profile an OpenMP* Application on Linux*
    Example: Profile a SYCL* Application on Linux*
  Get Started with Intel® VTune™ Profiler for macOS*
  Learn More
  Notices and Disclaimers
Get Started with Intel® VTune™ Profiler 1
NOTE
Documentation for versions of Intel® VTune™ Profiler prior to the 2021 release is available for
download only. For a list of available documentation downloads by product version, see these pages:
• Download Documentation for Intel Parallel Studio XE
• Download Documentation for Intel System Studio
NOTE You do not need to run setvars.bat when using Intel® VTune™ Profiler within Microsoft* Visual
Studio*.
Source                          Start VTune Profiler
Standalone (GUI)                1. Run the vtune-gui command or run Intel® VTune™ Profiler from the Start menu.
                                2. When the GUI opens, click
Microsoft* Visual Studio* IDE   Open your solution in Visual Studio. The VTune Profiler toolbar is automatically
                                enabled and your Visual Studio project is set as an analysis target.
NOTE You do not need to create a project when running Intel® VTune™ Profiler from the command line
or within Microsoft* Visual Studio.
1. In the Launch Application section, browse to the location of your application executable file.
2. Click Start to run Performance Snapshot on your application. This analysis presents a general overview
of issues affecting the performance of your application on the target system.
A flagged metric indicates a value outside the acceptable or normal operating range. Use
tooltips to understand how to improve a flagged metric.
See guidance on other analyses you should consider running next. The Analysis Tree
highlights these recommendations.
Next Steps
Performance Snapshot is a good starting point to get an overall assessment of application performance with
VTune Profiler. Next, check if your algorithm requires tuning.
1. Follow a tutorial to analyze common performance bottlenecks.
2. Once your algorithm is well-tuned, run Performance Snapshot again to calibrate results and identify
potential performance improvements in other areas.
See Also
Microarchitecture Exploration
Prerequisites
• Make sure your system is running Microsoft* Windows 10 or a newer version.
• Use one of these versions of Intel Processor Graphics:
• Gen 8
• Gen 9
• Gen 11
• Your system should be running on one of these Intel processors:
• 7th Generation Intel® Core™ i7 Processors (code name Kaby Lake)
• 8th Generation Intel® Core™ i7 Processors (code name Coffee Lake)
• 10th Generation Intel® Core™ i7 Processors (code name Ice Lake)
• Install Intel VTune Profiler from one of these sources:
• Standalone product download
• Intel® oneAPI Base Toolkit
• Intel® System Bring-up Toolkit
• Download the Intel® oneAPI HPC Toolkit, which contains the Intel® oneAPI DPC++/C++ Compiler (icx/icpx)
that you need to profile OpenMP applications.
• Set up environment variables. Execute the vars.bat script located in the <vtune-install-dir>\env
directory.
• Set up your system for GPU analysis.
NOTE To install Intel VTune Profiler in the Microsoft* Visual Studio environment, see the VTune Profiler
User Guide.
cd <sample_dir>/DirectProgramming/C++/StructuredGrids/iso3dfd_omp_offload
3. Compile the OpenMP Offload application.
mkdir build
cd build
icx /std:c++17 /EHsc /Qiopenmp /I../include\ /Qopenmp-targets:spir64 /DUSE_BASELINE /DEBUG ..\src\iso3dfd.cpp ..\src\iso3dfd_verify.cpp ..\src\utils.cpp
Prerequisites
• Make sure you have Microsoft* Visual Studio (v2017 or newer) installed on your system.
• Install Intel VTune Profiler from the Intel® oneAPI Base Toolkit or the Intel® System Bring-up Toolkit.
These toolkits contain the Intel® oneAPI DPC++/C++ Compiler (icpx -fsycl) required for the
profiling process.
• Set up environment variables. Execute the vars.bat script located in the <vtune-install-dir>\env
directory.
• Ensure that the Intel oneAPI DPC++ Compiler (installed with the Intel oneAPI Base toolkit) is integrated
into Microsoft Visual Studio.
• Compile the code using the -gline-tables-only and -fdebug-info-for-profiling options for Intel
oneAPI DPC++ Compiler.
• Set up your system for GPU analysis.
For information on installing Intel VTune Profiler in the Microsoft* Visual Studio environment, see VTune
Profiler User Guide.
4. Click the Start button to launch the analysis with the predefined options.
Run GPU Analysis from Command Line:
1. Open the sample directory:
<sample_dir>\VtuneProfiler\matrix_multiply_vtune
2. In this directory, open the Visual Studio* solution file named matrix_multiply.sln.
3. The multiply.cpp file contains several versions of matrix multiplication. Select a version by editing
the corresponding #define MULTIPLY line in multiply.hpp
4. Build the entire project with a Release configuration.
This generates an executable called matrix_multiply.exe.
5. Prepare the system to run a GPU analysis. See Set Up System for GPU Analysis.
6. Set VTune Profiler environment variables by running the batch file:
<install_dir>\env\vars.bat
7. Run the analysis command:
vtune.exe -collect gpu-offload -- matrix_multiply.exe
VTune Profiler collects data and displays analysis results in the GPU Compute/Media Hotspots viewpoint.
In the Summary window, see statistics on CPU and GPU resource usage to understand if your application is
GPU-bound. Switch to the Graphics window to see basic CPU and GPU metrics representing code execution
over time.
2. Build your application with symbol information and in Release mode with all optimizations enabled. For
detailed information on compiler settings, see the VTune Profiler online user guide.
You can also use the matrix sample application available in <install-dir>/sample/matrix.
You can see sample results in <install-dir>/sample (matrix).
3. Set up the environment variables:
source <install-dir>/setvars.sh
By default, the <install-dir> is:
A flagged metric indicates a value outside the acceptable or normal operating range. Use
tooltips to understand how to improve a flagged metric.
See guidance on other analyses you should consider running next. The Analysis Tree
highlights these recommendations.
Next Steps
Performance Snapshot is a good starting point to get an overall assessment of application performance with
VTune Profiler. Next, check if your algorithm requires tuning.
1. Follow a tutorial to analyze common performance bottlenecks.
2. Once your algorithm is well-tuned, run Performance Snapshot again to calibrate results and identify
potential performance improvements in other areas.
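The before-and-after comparison described in these steps can also be driven from a terminal. The following is a minimal sketch, assuming the vtune command is on your PATH after you run the environment script. The snapshot function, the ps_ directory prefix, and ./my_app are hypothetical names chosen for illustration; only the performance-snapshot analysis type and the -collect and -result-dir options come from the VTune Profiler command line.

```shell
# Command used for collection. Override it (for example VTUNE=echo) to preview
# the arguments without collecting any data.
VTUNE=${VTUNE:-vtune}

# Run Performance Snapshot into a fresh, labeled result directory.
# $1 = label such as "before" or "after"; remaining arguments = application and its arguments.
snapshot() {
  if [ "$#" -lt 2 ]; then
    echo "usage: snapshot <label> <app> [args...]" >&2
    return 2
  fi
  label=$1
  shift
  # A timestamp keeps each run in its own directory so earlier results are never reused.
  result_dir="ps_${label}_$(date +%Y%m%d_%H%M%S)"
  "$VTUNE" -collect performance-snapshot -result-dir "$result_dir" -- "$@" &&
    printf 'Results saved in %s\n' "$result_dir"
}
```

A typical round would be to call snapshot before ./my_app, tune and rebuild the application, and then call snapshot after ./my_app, opening both result directories in the GUI to compare the flagged metrics.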
See Also
Microarchitecture Exploration
Prerequisites
• Make sure your system is running Linux* OS kernel 4.14 or a newer version.
• Use one of these versions of Intel Processor Graphics:
• Gen 8
• Gen 9
• Gen 11
• Your system should be running on one of these Intel processors:
• 7th Generation Intel® Core™ i7 Processors (code name Kaby Lake)
• 8th Generation Intel® Core™ i7 Processors (code name Coffee Lake)
• 10th Generation Intel® Core™ i7 Processors (code name Ice Lake)
• For the Linux GUI, use:
• GTK+ version 2.10 or newer (2.18 and newer versions are recommended)
• Pango version 1.14 or newer
• X.Org version 1.0 or newer (1.7 and newer versions are recommended)
• Install Intel VTune Profiler from one of these sources:
• Standalone product download
• Intel® oneAPI Base Toolkit
• Intel® System Bring-up Toolkit
• Download the Intel® oneAPI HPC Toolkit, which contains the Intel® oneAPI DPC++/C++ Compiler (icx/icpx)
that you need to profile OpenMP applications.
• Set up environment variables. Execute the vars.sh script.
• Set up your system for GPU analysis.
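Some of the prerequisites above can be checked quickly from a terminal. The sketch below is illustrative only: the function names are hypothetical, and it verifies just the kernel version and whether the vtune, icx, and icpx commands are on the PATH. It does not check the processor generation, the graphics version, or the GPU setup.

```shell
# Succeed if kernel release $1 (for example "5.15.0-91-generic") is at least $2.$3.
kernel_at_least() {
  release=$1
  want_major=$2
  want_minor=$3
  major=${release%%.*}          # text before the first dot
  rest=${release#*.}            # text after the first dot
  minor=${rest%%[!0-9]*}        # leading digits of the remainder
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# Report which of the named commands are available; fail if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf 'found:   %s\n' "$tool"
    else
      printf 'missing: %s\n' "$tool"
      missing=1
    fi
  done
  return "$missing"
}

if kernel_at_least "$(uname -r)" 4 14; then
  echo "kernel OK: $(uname -r)"
else
  echo "kernel older than 4.14: $(uname -r)"
fi
check_tools vtune icx icpx || echo "Run the environment script, then check again."
```

If a tool is reported missing right after installation, the most likely cause is that the environment script has not been sourced in the current shell.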
cd <sample_dir>/DirectProgramming/C++/StructuredGrids/iso3dfd_omp_offload
mkdir build
cd build
cmake -DVERIFY_RESULTS=0 ..
make -j
This generates a src/iso3dfd executable.
make clean
This removes the executable and object files that you created with the make command.
• Utilizing the compute resources of your system inefficiently
• Use the information in the Platform window to see basic CPU and GPU metrics.
• Investigate specific computing tasks in the Graphics window.
For a deeper analysis, see a related recipe in the VTune Profiler Performance Analysis Cookbook. You can also
continue your profiling with the GPU Compute/Media Hotspots analysis.
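If you prefer a text view of a finished collection alongside the GUI windows, the command line can print a summary report from the result directory. This is a sketch under assumptions: the function names and the my_result and summary.csv names are hypothetical, and the exact contents of the report depend on the analysis that produced the result.

```shell
# Command used for reporting. Override it (for example VTUNE=echo) to preview
# the arguments without opening a result.
VTUNE=${VTUNE:-vtune}

# Print the summary report of an existing result directory to the terminal.
# $1 = result directory.
report_summary() {
  "$VTUNE" -report summary -result-dir "$1"
}

# Write the same summary to a comma-separated file.
# $1 = result directory, $2 = output file.
report_summary_csv() {
  "$VTUNE" -report summary -result-dir "$1" -format csv -csv-delimiter comma -report-output "$2"
}
```

For example, report_summary my_result prints the overview, and report_summary_csv my_result summary.csv saves it for a spreadsheet or a script.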
Prerequisites
• Install VTune Profiler and Intel® oneAPI DPC++/C++ Compiler from the Intel® oneAPI Base Toolkit or the
Intel® System Bring-up Toolkit.
• Set up environment variables by executing the vars.sh script.
• Set up your system for GPU analysis.
cd <sample_dir>/VtuneProfiler/matrix_multiply
2. The multiply.cpp file in the src folder contains several versions of matrix multiplication. Select a
version by editing the corresponding #define MULTIPLY line in multiply.h.
3. Build the app using the existing Makefile:
cmake .
make
This should generate a matrix.dpcpp executable.
make clean
This removes the executable and object files that were created by the make command.
Browse button and select GPU Compute/Media Hotspots analysis from the Accelerators group in
the Analysis Tree.
6. Click the Start button at the bottom to launch the analysis with the pre-selected options.
Run GPU Analysis from Command Line:
1. Prepare the system to run a GPU analysis. See Set Up System for GPU Analysis.
2. Set up environment variables for Intel software tools:
source $ONEAPI_ROOT/setvars.sh
3. Run the GPU Compute/Media Hotspots analysis:
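The command for this step might look like the following sketch. The gpu-hotspots analysis type is the command-line name for GPU Compute/Media Hotspots; the gpu_hotspots wrapper, the r_gpu result directory, and the application path used in the usage note are hypothetical placeholders, so substitute the executable you built.

```shell
# Command used for collection. Override it (for example VTUNE=echo) to preview
# the arguments without collecting any data.
VTUNE=${VTUNE:-vtune}

# Collect GPU Compute/Media Hotspots data for an application.
# $1 = result directory; remaining arguments = application and its arguments.
gpu_hotspots() {
  if [ "$#" -lt 2 ]; then
    echo "usage: gpu_hotspots <result-dir> <app> [args...]" >&2
    return 2
  fi
  result_dir=$1
  shift
  "$VTUNE" -collect gpu-hotspots -result-dir "$result_dir" -- "$@"
}
```

With the environment set up as in step 2, a call such as gpu_hotspots r_gpu ./my_gpu_app starts the collection and stores the data in r_gpu.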
Step 1: Start VTune Profiler
1. Launch VTune Profiler with the vtune-gui command.
1. In the WHERE pane, select Remote Linux (SSH) and specify the target Linux system using
username@hostname[:port].
VTune Profiler connects to the Linux system and installs the target package.
2. In the WHAT pane, provide the path to your application on the target Linux system.
3. Click the Start button to run Performance Snapshot on the application.
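The same remote collection can be sketched for the command line, assuming the vtune command-line tool is installed on your host and SSH access to the target already works. The valid_target and remote_snapshot functions and the example target are hypothetical; the -target-system option takes the same username@hostname[:port] form as the WHERE pane.

```shell
# Command used for collection. Override it (for example VTUNE=echo) to preview
# the arguments without connecting to the target.
VTUNE=${VTUNE:-vtune}

# Succeed only if $1 looks like username@hostname or username@hostname:port.
valid_target() {
  case $1 in *@*) ;; *) return 1 ;; esac
  user=${1%%@*}
  rest=${1#*@}
  host=${rest%%:*}
  [ -n "$user" ] && [ -n "$host" ] || return 1
  case $rest in
    *:*)
      port=${rest#*:}
      # The port, when given, must be a non-empty string of digits.
      case $port in ''|*[!0-9]*) return 1 ;; esac
      ;;
  esac
  return 0
}

# Run Performance Snapshot on a remote Linux target.
# $1 = username@hostname[:port]; remaining arguments = application path on the target and its arguments.
remote_snapshot() {
  if [ "$#" -lt 2 ] || ! valid_target "$1"; then
    echo "usage: remote_snapshot username@hostname[:port] <app-on-target> [args...]" >&2
    return 2
  fi
  target=$1
  shift
  "$VTUNE" -target-system="ssh:$target" -collect performance-snapshot -- "$@"
}
```

For example, remote_snapshot dev@linuxbox:2222 /home/dev/my_app profiles the application at that path on the target. Checking the target string first gives a clear usage message instead of a failed connection attempt.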
Step 3: View and Analyze Performance Data
When data collection completes, VTune Profiler displays analysis results on the macOS system. Start your
analysis in the Summary window. Here, you see a performance overview of your application.
The overview typically includes several metrics along with their descriptions.
A flagged metric indicates a value outside the acceptable or normal operating range. Use
tooltips to understand how to improve a flagged metric.
See guidance on other analyses you should consider running next. The Analysis Tree
highlights these recommendations.
Next Steps
Performance Snapshot is a good starting point to get an overall assessment of application performance with
VTune Profiler. Next, check if your algorithm requires tuning.
1. Run Hotspots Analysis on your application.
2. Follow a Hotspots tutorial. Learn techniques to get the most out of your Hotspots analysis.
3. Once your algorithm is well-tuned, run Performance Snapshot again to calibrate results and identify
potential performance improvements in other areas.
See Also
Microarchitecture Exploration
Learn More
Document              Description
User Guide            The User Guide is the primary documentation for VTune Profiler.
                      NOTE You can also download an offline version of the VTune Profiler documentation.
Online Training       The online training site is an excellent resource to learn the basics of VTune Profiler
                      with Getting Started guides, videos, tutorials, webinars, and technical articles.
Cookbook              Performance analysis cookbook that contains recipes to identify and solve popular
                      performance problems using analysis types in VTune Profiler.
Installation Guide for Windows | Linux | macOS hosts
                      The Installation Guide contains basic installation instructions for VTune Profiler and
                      post-installation configuration instructions for the various drivers and collectors.
Tutorials             VTune Profiler tutorials guide a new user through basic features with a short
                      sample application.
Release Notes         Find information about the latest version of VTune Profiler, including a
                      comprehensive description of new features, system requirements, and technical
                      issues that were resolved.
                      For the standalone and toolkit versions of VTune Profiler, understand the current
                      System Requirements.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its
subsidiaries. Other names and brands may be claimed as the property of others.
Intel, the Intel logo, Intel Atom, Intel Core, Intel Xeon Phi, VTune and Xeon are trademarks of Intel
Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation
in the United States and/or other countries.
Java is a registered trademark of Oracle and/or its affiliates.
OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
This software and the related documents are Intel copyrighted materials, and your use of them is governed
by the express license under which they were provided to you (License). Unless the License provides
otherwise, you may not use, modify, copy, publish, distribute, disclose or transmit this software or the
related documents without Intel's prior written permission.
This software and the related documents are provided as is, with no express or implied warranties, other
than those that are expressly stated in the License.
© Intel Corporation.