Intel® Xe Super Sampling (XeSS) API Developer Guide v1.1
Game Development
Use this guide to understand how to optimize image quality and performance
without impacting frame rates.
Contents
• Introduction
• XeSS Components
• Versioning
• Compatibility
• Naming Conventions
• TAA and XeSS
• XeSS Game Setting Recommendations
• Inputs and Outputs
• Initialization
• Execution
• Debug and Logging Capabilities
• Recommended Practices
• Visual Quality
• Driver Verification
• Debugging Tips
• Additional Resources

Introduction
Xe Super Sampling (XeSS) is implemented as a sequence of Microsoft* Direct3D* 12
(D3D12) compute shader passes, executed before the rendering engine's
post-processing stage (as described in the section entitled 'TAA and XeSS'). The
rendering engine initializes XeSS by passing the D3D12 device that is used for
the main rendering, and a pointer to a descriptor heap, where XeSS creates all
its internal resource descriptors. XeSS allocates GPU resources in one of two
categories:
• Persistent allocations, such as network weights and other constant data.
• Temporary allocations, such as network activations.
The game engine can control the location where XeSS makes its temporary
allocations by passing the XESS_INIT_FLAG_EXTERNAL_DESCRIPTOR_HEAP
initialization flag to the xessD3D12Init call and a pointer to an external
resource heap in the xessD3D12Execute call. To ensure optimal game performance
with XeSS when the game engine provides an external resource heap, this heap
should have HIGH memory residency priority. Persistent allocations are always
owned by the XeSS library.

XeSS Components
XeSS is accessible through the XeSS SDK, which provides a D3D12-based API for
integration into a game engine, and includes the following D3D12 components:
• An HLSL-based cross-vendor implementation that runs on any GPU supporting
  SM 6.4. Hardware acceleration for DP4a or equivalent is recommended.
• An Intel implementation optimized to run on Intel® Arc™ Graphics, and
  Intel® Iris® Xe Graphics.
• An implementation dispatcher, which loads either the XeSS runtime shipped
  with the game, the version provided with the Intel graphics drivers, or the
  cross-vendor implementation.

Figure 1. XeSS SDK components for both Intel-specific, and cross-vendor solutions.

Versioning
XeSS uses the major.minor.patch version format, and a numeric 90+ scheme for
development-stage builds. The XeSS version is specified by the 64-bit
xess_version_t structure, in which:
• A major version increment indicates a new API, and potentially a break in
  functionality.
• A minor version increment indicates incremental changes such as optional
  inputs or flags. This does not change existing functionality.
• A patch version increment may include performance or quality tweaks, or fixes
  for known issues. There is no change in the interfaces. Versions beyond 90 are
  used for development builds to change the interface for the next release.

Compatibility
• The loader will check the compatibility of the XeSS version installed with
  the game, and the installed driver on the system.
• If compatible, the loader will use the game title installed version of XeSS.
• If not compatible, and the driver is newer, the loader will ignore the game
  title version of XeSS, and use the version distributed with the driver.
• If not compatible and the driver is older, the loader will return a failure
  code, and XeSS will not initialize.

Naming Conventions
The XeSS API uses the following naming conventions:
• All functions must be prefixed with xess
• All functions must use camel case xessObjectAction convention
• All macros must use all caps XESS_NAME convention
• All flag enumerations must end with flags_t

TAA and XeSS
XeSS is a temporally amortized super-sampling/up-
sampling technique that drops in place of the Temporal
Anti-Aliasing (TAA) stage in the game renderer, achieving
significantly better image quality than current state-of-
the-art techniques in games.
[Figure: pipeline diagram. Labels: Frame N, Jitter, Velocity, History, Warp,
Heuristics (Color Clamping; Object ID, Velocity, and Depth Comparisons).]

XeSS Game Setting Recommendations
There are also guidelines for the font, official naming, and descriptions of the
XeSS options.

Minimum Description: Intel Xe Super Sampling (XeSS) technology uses machine
learning to deliver higher performance with exceptional image quality.

Preset:
• Balanced: Delivers optimal performance and image quality. 1080p and above.

Default preset:
• Resolution Specific: Your game adjusts the XeSS default preset based on the
  output resolution. 1080p and lower set to 'Balanced'; 1440p and higher set to
  'Performance'.
• General: Your game selects one XeSS preset as default. Intel XeSS ON set to
  'Performance'.
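The 'Resolution Specific' default can be sketched in a few lines of C++. This is only an illustration: the `DefaultPreset` enum and `defaultPresetForOutputHeight` function are hypothetical names, not part of the XeSS API, and the guide does not say what to do for output heights between 1080 and 1440, so the sketch assumes 'Balanced' for that range.

```cpp
#include <cassert>

// Hypothetical enum for illustration; not an XeSS API type.
enum class DefaultPreset { Balanced, Performance };

// Resolution-specific default: 1080p and lower -> Balanced,
// 1440p and higher -> Performance.
// Assumption: heights strictly between 1080 and 1440 fall back to Balanced,
// because the guide does not cover that range.
DefaultPreset defaultPresetForOutputHeight(int outputHeight) {
    assert(outputHeight > 0);
    return outputHeight >= 1440 ? DefaultPreset::Performance
                                : DefaultPreset::Balanced;
}
```

A game that follows the 'General' strategy instead would skip this lookup and always start from 'Performance'.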
You should use the approved naming conventions for XeSS in your settings menus
and descriptions. The official font for XeSS-related communication is
IntelOneText-Regular. Please use the official superscripted e in XeSS, unless
the font system does not support superscript, in which case XeSS is acceptable.
For the smaller e in XeSS, you can reduce the font size for just that character
to keep the proportions.

The entries below are the recommended default settings.

    Intel® XeSS        < Performance >
    Anti-Aliasing      Off

Figure 4. Example of game UI with XeSS settings.

Inputs and Outputs
• Jitter
• Input color
• Dilated high-res motion vectors
In place of the high-res motion vectors, the renderer can provide the motion
vectors at the input resolution, along with the depth values.

[Figure: jitter offset within a pixel, with ticks at .25, .5, and .75, and the
example offset Jy = -.375.]

The jitter offset is applied to the projection matrix:

    ProjectionMatrix.M[2][1] -= Jy * 2.0f / InputHeight
Jitter Sequence
A quasi-random sampling sequence with good spatial distribution
characteristics is required to get the best quality from the XeSS
algorithm (a Halton sequence is a fair choice). The scaling factor
should be considered when using such a sequence, to modify the
length of the repeated pattern. For example, if the game uses a
Halton sequence of length eight in native rendering, the length
must become 8 * scale^2 when used with XeSS upscaling, to
ensure a good distribution of samples in the area covered
by a single low-resolution pixel. Sometimes, increasing
the length even more leads to an additional quality
improvement, so we encourage experimentation with the
sequence length. However, avoid sampling techniques that
bias the jitter sample distribution with regard to the input
pixel.
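As a minimal sketch of the length rule above, the following C++ generates a Halton jitter pattern whose length is nativeLength * scale^2. The function names and the choice of bases 2 and 3 are assumptions for illustration; the guide only suggests Halton as a fair choice. Offsets are centered so that they stay within half a pixel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Radical inverse of `index` in `base`: one coordinate of a Halton point.
float halton(unsigned index, unsigned base) {
    assert(base >= 2);
    float result = 0.0f;
    float fraction = 1.0f / static_cast<float>(base);
    while (index > 0) {
        result += static_cast<float>(index % base) * fraction;
        index /= base;
        fraction /= static_cast<float>(base);
    }
    return result;
}

// Pattern length for upscaling: nativeLength * scale^2, rounded up.
std::size_t jitterSequenceLength(std::size_t nativeLength, float scale) {
    return static_cast<std::size_t>(
        std::ceil(static_cast<float>(nativeLength) * scale * scale));
}

// Jitter offsets in pixels, centered on the pixel (each component in
// (-0.5, 0.5)). Bases 2 and 3 are an assumption of this sketch.
std::vector<std::pair<float, float>> makeJitterSequence(std::size_t nativeLength,
                                                        float scale) {
    const std::size_t length = jitterSequenceLength(nativeLength, scale);
    std::vector<std::pair<float, float>> jitter;
    jitter.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        // Start at index 1: index 0 maps to the degenerate point (0, 0).
        const unsigned index = static_cast<unsigned>(i) + 1u;
        jitter.emplace_back(halton(index, 2) - 0.5f, halton(index, 3) - 0.5f);
    }
    return jitter;
}
```

With a native length of 8 and 2x scaling, `makeJitterSequence(8, 2.0f)` produces 32 offsets; the frame index modulo that length selects the offset for each frame.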
Color
XeSS accepts both SDR and HDR input colors in any linear
color format, for example DXGI_FORMAT_R16G16B16A16_FLOAT,
DXGI_FORMAT_R11G11B10_FLOAT, or DXGI_FORMAT_R8G8B8A8_UNORM.
The input colors are expected to be in the scRGB color space,
which is scene-referred; that is, the color values represent
luminance levels. A value of (1.0,1.0,1.0) encodes D65 white at
80 nits and represents the maximum luminance for SDR displays.
The color values can exceed (1.0,1.0,1.0) for HDR content.
If the input color values have not been adjusted for the
exposure, or if the input color values are scaled differently
from the sRGB space, a separate scale value can be provided.
These scale values are applied to the input as shown below:

    if (autoexposure)
    {
        scale = XeSSCalculatedExposure(…)
    }
    else if (useExposureScaleTexture)
    {
        scale = exposureScaleTexture.Load(int3(0, 0, 0)).x
    }
    else
    {
        scale = inputScale
    }

    inputColor *= scale

The output is in the same color space as the input. It can be any three- or
four-channel linear color format, similar to the input. If a scale value is
applied to the input, as shown above, the inverse of this scale is applied to
the output color. XeSS maintains an internal history state to perform temporal
accumulation of incoming samples. That means the history should be dropped if
the scene or view suddenly changes. This is achieved by setting the
historyReset flag in xess_xxx_execute_params_t.

Motion Vectors
Motion vectors specify the screen-space motion in pixels from the previous
frame to the current frame. XeSS accepts motion vectors in the format
DXGI_FORMAT_R16G16_FLOAT, where the R channel encodes the motion in x, and the
G in y. The motion vectors do not include motion induced by the camera jitter.
Motion vectors can be low-res (default), or high-res
(XESS_INIT_FLAG_HIGH_RES_MV). Low-res motion vectors are represented by a 2D
texture at the input resolution, whereas high-res motion vectors are
represented by a 2D texture at the target resolution.

In the case of high-res motion vectors, the velocity component resulting from
camera animations is computed at the target resolution in a deferred pass,
using the camera transformation and depth values. However, the velocity
component related to particles and object animations is typically computed at
the input resolution and stored in the G-Buffer. This velocity component is
upsampled and combined with the camera velocity to produce the texture for
high-res motion vectors. XeSS also expects the high-res motion vectors to be
dilated. That is, the motion vectors represent the motion of the foremost
surface in a small neighborhood of input pixels (such as 3 * 3). High-res
motion vectors can be computed in a separate pass by the user.

Low-res motion vectors are not dilated, and directly represent the velocity
sampled at each jittered pixel position. XeSS internally up-samples motion
vectors to the target grid and uses the depth texture to dilate them.

Figure 6. Convention for specifying the low-res and high-res motion vector to XeSS.
In the example, the same motion is Vx = -.5, Vy = -.25 in input pixels and
Vx = -1, Vy = -.5 in target pixels.
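To make the notion of dilated motion vectors concrete, here is a CPU reference sketch of a 3 * 3 dilation: each pixel takes the velocity of the foremost surface in its neighborhood. It is not how XeSS implements dilation internally; the `Velocity` struct, the function name, and the clamp-to-edge border handling are assumptions of this sketch. In an engine this would normally be a compute or pixel shader pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical velocity type for illustration: motion in pixels (x, y).
struct Velocity {
    float x;
    float y;
};

// For every pixel, pick the velocity of the closest surface in its 3x3
// neighborhood. By default smaller depth means closer; set invertedDepth
// for engines where larger depth values are closer to the camera.
std::vector<Velocity> dilateVelocity(const std::vector<Velocity>& velocity,
                                     const std::vector<float>& depth,
                                     int width, int height,
                                     bool invertedDepth = false) {
    assert(width > 0 && height > 0);
    assert(velocity.size() == depth.size());
    assert(velocity.size() == static_cast<std::size_t>(width) *
                                  static_cast<std::size_t>(height));

    auto at = [width](int px, int py) {
        return static_cast<std::size_t>(py) * static_cast<std::size_t>(width) +
               static_cast<std::size_t>(px);
    };

    std::vector<Velocity> dilated(velocity.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::size_t best = at(x, y);
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    // Clamp to the image border (an assumption of this sketch).
                    const int nx = std::clamp(x + dx, 0, width - 1);
                    const int ny = std::clamp(y + dy, 0, height - 1);
                    const std::size_t candidate = at(nx, ny);
                    const bool closer = invertedDepth
                                            ? depth[candidate] > depth[best]
                                            : depth[candidate] < depth[best];
                    if (closer) {
                        best = candidate;
                    }
                }
            }
            dilated[at(x, y)] = velocity[best];
        }
    }
    return dilated;
}
```

The effect is that thin foreground objects "grow" by one pixel in the velocity buffer, so their edges are reprojected with the foreground motion rather than the background motion.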
Table 4. Format conversions applied by XeSS.

Input format                         Converted format
DXGI_FORMAT_R32G32B32A32_TYPELESS    DXGI_FORMAT_R32G32B32A32_FLOAT
DXGI_FORMAT_R32G32B32_TYPELESS       DXGI_FORMAT_R32G32B32_FLOAT
DXGI_FORMAT_R16G16B16A16_TYPELESS    DXGI_FORMAT_R16G16B16A16_FLOAT
DXGI_FORMAT_R32G32_TYPELESS          DXGI_FORMAT_R32G32_FLOAT
DXGI_FORMAT_R32G8X24_TYPELESS        DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS
DXGI_FORMAT_R10G10B10A2_TYPELESS     DXGI_FORMAT_R10G10B10A2_UNORM
DXGI_FORMAT_R8G8B8A8_TYPELESS        DXGI_FORMAT_R8G8B8A8_UNORM
DXGI_FORMAT_R16G16_TYPELESS          DXGI_FORMAT_R16G16_FLOAT
DXGI_FORMAT_R32_TYPELESS             DXGI_FORMAT_R32_FLOAT
DXGI_FORMAT_R24G8_TYPELESS           DXGI_FORMAT_R24_UNORM_X8_TYPELESS
DXGI_FORMAT_R8G8_TYPELESS            DXGI_FORMAT_R8G8_UNORM
DXGI_FORMAT_R16_TYPELESS             DXGI_FORMAT_R16_FLOAT
DXGI_FORMAT_R8_TYPELESS              DXGI_FORMAT_R8_UNORM
DXGI_FORMAT_B8G8R8A8_TYPELESS        DXGI_FORMAT_B8G8R8A8_UNORM
DXGI_FORMAT_B8G8R8X8_TYPELESS        DXGI_FORMAT_B8G8R8X8_UNORM
DXGI_FORMAT_D16_UNORM                DXGI_FORMAT_R16_UNORM
DXGI_FORMAT_D32_FLOAT                DXGI_FORMAT_R32_FLOAT
DXGI_FORMAT_D24_UNORM_S8_UINT        DXGI_FORMAT_R24_UNORM_X8_TYPELESS
DXGI_FORMAT_D32_FLOAT_S8X24_UINT     DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS
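When validating inputs on the engine side, it can help to mirror the table above in a small lookup, for example to log which typed format a typeless resource will be read as. The sketch below uses format names as strings so that it compiles without the Windows SDK; `typedFormatFor` is a hypothetical helper, not an XeSS function, and an engine would more likely switch on the `DXGI_FORMAT` enum directly.

```cpp
#include <array>
#include <cassert>
#include <string_view>
#include <utility>

using FormatPair = std::pair<std::string_view, std::string_view>;

// Rows copied from the conversion table above: {input format, converted format}.
inline const std::array<FormatPair, 19> kFormatConversions = {{
    {"DXGI_FORMAT_R32G32B32A32_TYPELESS", "DXGI_FORMAT_R32G32B32A32_FLOAT"},
    {"DXGI_FORMAT_R32G32B32_TYPELESS", "DXGI_FORMAT_R32G32B32_FLOAT"},
    {"DXGI_FORMAT_R16G16B16A16_TYPELESS", "DXGI_FORMAT_R16G16B16A16_FLOAT"},
    {"DXGI_FORMAT_R32G32_TYPELESS", "DXGI_FORMAT_R32G32_FLOAT"},
    {"DXGI_FORMAT_R32G8X24_TYPELESS", "DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS"},
    {"DXGI_FORMAT_R10G10B10A2_TYPELESS", "DXGI_FORMAT_R10G10B10A2_UNORM"},
    {"DXGI_FORMAT_R8G8B8A8_TYPELESS", "DXGI_FORMAT_R8G8B8A8_UNORM"},
    {"DXGI_FORMAT_R16G16_TYPELESS", "DXGI_FORMAT_R16G16_FLOAT"},
    {"DXGI_FORMAT_R32_TYPELESS", "DXGI_FORMAT_R32_FLOAT"},
    {"DXGI_FORMAT_R24G8_TYPELESS", "DXGI_FORMAT_R24_UNORM_X8_TYPELESS"},
    {"DXGI_FORMAT_R8G8_TYPELESS", "DXGI_FORMAT_R8G8_UNORM"},
    {"DXGI_FORMAT_R16_TYPELESS", "DXGI_FORMAT_R16_FLOAT"},
    {"DXGI_FORMAT_R8_TYPELESS", "DXGI_FORMAT_R8_UNORM"},
    {"DXGI_FORMAT_B8G8R8A8_TYPELESS", "DXGI_FORMAT_B8G8R8A8_UNORM"},
    {"DXGI_FORMAT_B8G8R8X8_TYPELESS", "DXGI_FORMAT_B8G8R8X8_UNORM"},
    {"DXGI_FORMAT_D16_UNORM", "DXGI_FORMAT_R16_UNORM"},
    {"DXGI_FORMAT_D32_FLOAT", "DXGI_FORMAT_R32_FLOAT"},
    {"DXGI_FORMAT_D24_UNORM_S8_UINT", "DXGI_FORMAT_R24_UNORM_X8_TYPELESS"},
    {"DXGI_FORMAT_D32_FLOAT_S8X24_UINT", "DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS"},
}};

// Returns the converted format for a listed input format. Formats that are
// not in the table are returned unchanged (an assumption of this sketch).
std::string_view typedFormatFor(std::string_view format) {
    assert(!format.empty());
    for (const FormatPair& entry : kFormatConversions) {
        if (entry.first == format) {
            return entry.second;
        }
    }
    return format;
}
```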
Figure 6 shows the same motion specified with low-res and high-res motion
vectors.

Some game engines only render objects into the G-Buffer, and quickly compute
the camera velocity in the TAA shader. In such cases, an additional pass is
required before XeSS execution to merge object and camera velocities and
generate a flattened velocity buffer. In such scenarios, high-res motion
vectors might be a better choice, as the flattening pass can be executed at the
target resolution.

Depth
If XeSS is used with low-res motion vectors, it also requires a depth texture
for velocity dilation. Any depth format, such as D32_FLOAT or D24_UNORM, is
supported. By default, XeSS assumes that smaller depth values are closer to the
camera. However, several game engines use inverted depth, and this can be
enabled by setting XESS_INIT_FLAG_INVERTED_DEPTH.

Responsive Pixel Mask
You can provide a responsive pixel mask with a mask value of 1 to force XeSS to
ignore information from previous frames. Although XeSS is a generalized
technique that should handle a wide range of rendering scenarios, there may be
rare cases where objects without valid motion vectors, for example particles,
produce artifacts. In such cases, a responsive pixel mask can be set for these
objects. Any texture format can be used for the mask, as long as the mask value
is in the R channel.

Resource States
XeSS expects all input textures to be in the state
D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE, and the output texture to be in
the state D3D12_RESOURCE_STATE_UNORDERED_ACCESS.

Resource Formats
XeSS expects all input textures to be typed. For typeless formats, XeSS
performs a conversion according to Table 4.

Mip Bias
To preserve texture details at the target resolution, XeSS requires an
additional mip bias of
$\log_2\left(\frac{\text{Input Width}}{\text{Target Width}}\right)$.
For example, a mip bias of -1 should be applied for 2x resolution scaling. In
certain cases, increasing mip bias even more leads to an additional visual
quality improvement; this comes with a potential performance overhead, however,
due to increased memory bandwidth requirements, and potentially lower temporal
stability resulting in flickering and moire. You are, of course, free to
experiment with more or less aggressive texture LOD biases to find the right
balance.

Initialization
First, create an XeSS context, as shown below. On Intel GPUs, this step loads
the latest Intel-optimized implementation of XeSS. The returned context handle
can then be used for initialization and execution.

    xess_context_handle_t context;
    xessD3D12CreateContext(pD3D12Device, &context);

Before initializing XeSS, the user can request a pipeline pre-build process to
avoid costly kernel compilation and pipeline creation during initialization.

    xessD3D12BuildPipelines(context, NULL, false, initFlags);

The xessD3D12Init function is then called to initialize XeSS. During
initialization, XeSS can create staging buffers and copy queues to upload
weights. These will be destroyed at the end of initialization. The XeSS storage
and layer specializations are determined by the target resolution. Therefore,
the target width and height must be set during initialization.

    xess_d3d12_init_params_t initParams;
    initParams.outputWidth = 3840;
    initParams.outputHeight = 2160;
    initParams.initFlags = XESS_INIT_FLAG_HIGH_RES_MV;
    initParams.pTempStorageHeap = NULL;

    xessD3D12Init(&context, &initParams);

XeSS includes three types of storage:
• Persistent Output-Independent Storage: persistent storage, such as weights,
  is internally allocated and uploaded by XeSS during initialization.
• Persistent Output-Dependent Storage: persistent storage such as the internal
  history texture.
• Temporary Storage: temporary storage only has valid data during the execution
  of XeSS.

Temporary storage is allocated either internally in a library-managed heap
(default), or in a heap provided by the user in the pTempStorageHeap field of
the xess_d3d12_init_params_t structure. If you allocate the temporary storage,
it can be reused outside of XeSS execution.

    ComPtr<ID3D12Heap> pHeap;
    CD3DX12_HEAP_DESC heapDesc(xessProp.tempHeapSize, D3D12_HEAP_TYPE_DEFAULT);

    d3dDevice->CreateHeap(&heapDesc, IID_PPV_ARGS(&pHeap));

    initParams.tempStorageOffset = 0;
    initParams.pTempStorageHeap = pHeap.Get();

    xessD3D12Init(&context, &initParams);

You can specify the XESS_INIT_FLAG_EXTERNAL_DESCRIPTOR_HEAP initialization flag
to use the external descriptor heap later at the execution stage.

You can also re-initialize XeSS if there is a change in the target resolution,
or any other initialization parameter. However, pending XeSS command lists must
be completed before re-initialization. When temporary XeSS storage is
allocated, it is your responsibility to de-allocate, or reallocate, the heap.
Quality preset changes are free, but any other parameter change may lead to
longer xessD3D12Init execution times.

Execution
The XeSS execution function does not involve any GPU workloads; rather, it
records XeSS commands into the specified command list. The command list is then
enqueued by the user. That means it is your responsibility to make sure all
input/output resources are alive at the time of the actual GPU execution.

By default, XeSS creates an internal descriptor heap, but if you have specified
XESS_INIT_FLAG_EXTERNAL_DESCRIPTOR_HEAP at the initialization stage, you can
pass the pointer to the external descriptor heap and its offset in execution
parameters.
    params.inputWidth = 1920;
    params.inputHeight = 1080;

    // xess records commands into the command list
    xessD3D12Execute(&context, pd3dCommandList, &params);

Recommended Practices
Visual Quality
We recommend running XeSS at the beginning of the post-processing chain, before
tone mapping. Execution after tone mapping is possible in certain scenarios;
however, this mode is experimental, and good quality is not guaranteed.
• … signal significantly hurts reconstruction quality.
• Use fp16 precision for the color buffer in scene linear HDR space.
• Use fp16 precision for the velocity buffer.
• Adjust mip bias to maximize image quality and keep overhead under control.
• Provide an appropriate scene exposure value. Correct exposure is essential
  for minimizing ghosting of moving objects and blurriness, and for precise
  brightness reconstruction.

Driver Verification
For the best performance and quality, install the latest driver. To facilitate
this, after initialization with xessD3D12CreateContext, call the
xessIsOptimalDriver function to verify that the installed driver will provide
the best possible experience. If XESS_RESULT_WARNING_OLD_DRIVER is returned
from this function, an advisory message or notice should be displayed to the
user recommending they install a newer driver. XESS_RESULT_WARNING_OLD_DRIVER
is not a fatal error, and the user should be allowed to continue.

Debugging Tips
Motion Vectors Debugging
If XeSS is producing an aliased or shaky image, it is worth concentrating on
static scene debugging:
• Emulate zero time-delta between frames in the engine to maintain a fully
  static scene.
• Set the motion vector scale to 0 to exclude potential issues with motion
  vectors.
• Significantly increase the length of the repeated jitter pattern.
XeSS should produce high-quality, super-sampled images. If this does not
happen, there might be problems with the jitter sequence or the input textures'
contents; otherwise, the problem is most likely in the decoding of motion
vectors. Make sure that the motion vector buffer contents correspond to the
currently set units (NDC or pixels), and that the axis directions are correct.
Try motion vector scale factors of plus or minus 1 to align the coordinate axes
appropriately.

Jitter Offset Debugging
If the static scene does not look good, try a jitter offset scaling of plus or
minus 1 to align the coordinate axes appropriately. Make sure the jitter does
not fall outside of the [-0.5, 0.5] bounds.
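Two of the checks mentioned in this guide are easy to script as debug-time helpers: the mip bias formula from the Mip Bias section and the jitter bounds from the paragraph above. The function names below are hypothetical and are not part of the XeSS API; this is a sketch to be adapted to the engine's own math types.

```cpp
#include <cassert>
#include <cmath>

// Additional texture mip bias: log2(inputWidth / targetWidth).
// Negative when upscaling, for example -1 for 2x resolution scaling.
float recommendedMipBias(float inputWidth, float targetWidth) {
    assert(inputWidth > 0.0f && targetWidth > 0.0f);
    return std::log2(inputWidth / targetWidth);
}

// True when both jitter components lie within the [-0.5, 0.5] bounds.
bool jitterInBounds(float jitterX, float jitterY) {
    return std::fabs(jitterX) <= 0.5f && std::fabs(jitterY) <= 0.5f;
}
```

For instance, `recommendedMipBias(1920.0f, 3840.0f)` gives a bias of -1, and a debug build could assert `jitterInBounds(jx, jy)` once per frame before filling the execution parameters.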
You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel
products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which
includes subject matter disclosed herein.
Performance varies by use, configuration, and other factors. Learn more at intel.com/performanceindex.
No product or component can be absolutely secure.
All product plans and roadmaps are subject to change without notice.
Your costs and results may vary.
Intel technologies may require enabled hardware, software, or service activation.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service
activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at
intel.com.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a
particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in
trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to
change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications, and roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published
specifications. Current characterized errata are available on request.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by
visiting intel.com/design/literature.htm.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation
in the United States and/or other countries.
Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names
and brands may be claimed as the property of others.
© Intel Corporation 0323/ RMK/RHM3/PDF Please Recycle