Arm GPU Datasheet
API Support Mali-G71 Mali-G72 Mali-G31 Mali-G51 Mali-G52 Mali-G76 Mali-G57 Mali-G77 Mali-G78 Mali-G710 Mali-G510 Mali-G310 Mali-G715 Immortalis-G715 Mali-G720 Immortalis-G720 Mali-G725 Immortalis-G925
Core Features Mali-G71 Mali-G72 Mali-G31 Mali-G51 Mali-G52 Mali-G76 Mali-G57 Mali-G77 Mali-G78 Mali-G710 Mali-G510 Mali-G310 Mali-G715 Immortalis-G715 Mali-G720 Immortalis-G720 Mali-G725 Immortalis-G925
ASTC ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
AFBC ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
AFBC – RGBA16 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
AFRC ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Shader framebuffer access ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Multiple Render Target[1] ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
2xMSAA Automatically promoted to 4xMSAA ✓ ✓ ✓ ✓
4xMSAA ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
8xMSAA ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
16xMSAA ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
8-bit integer dot product ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
FP16 / R11G11B10 accelerated blending [5] ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Conservative rasterization ✓ ✓ ✓ ✓ ✓ ✓
Variable Rate Shading ✓ ✓ ✓ ✓ ✓ ✓
Ray tracing ○ ✓ ○ ✓ ○ ✓
Microarchitecture Features Mali-G71 Mali-G72 Mali-G31 Mali-G51 Mali-G52 Mali-G76 Mali-G57 Mali-G77 Mali-G78 Mali-G710 Mali-G510 Mali-G310 Mali-G715 Immortalis-G715 Mali-G720 Immortalis-G720 Mali-G725 Immortalis-G925
Transaction elimination ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Hidden surface removal ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
IDVS geometry pipeline ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
DVS geometry pipeline ✓ ✓ ✓ ✓
Texturing Mali-G71 Mali-G72 Mali-G31 Mali-G51 Mali-G52 Mali-G76 Mali-G57 Mali-G77 Mali-G78 Mali-G710 Mali-G510 Mali-G310 Mali-G715 Immortalis-G715 Mali-G720 Immortalis-G720 Mali-G725 Immortalis-G925
Bifrost ISA Config                       Mali-G71  Mali-G72  Mali-G31  Mali-G51  Mali-G52  Mali-G76
Thread count (max)                       384       384       256/512   512/768   768       768
Max work registers (32b)                 64        64        64        64        64        64
Thread count with 0-32 work registers    384       384       256/512   512/768   768       768
Thread count with 33-64 work registers   384       384       128/256   256/384   384       384

Valhall ISA Config                       Mali-G57  Mali-G77  Mali-G78  Mali-G710  Mali-G510  Mali-G310  Mali-G715  Immortalis-G715
Thread count (max)                       1024      1024      1024      2048       1536-2048  512-2048   2048       2048
Max work registers (32b)                 64        64        64        64         64         64         64         64
Thread count with 0-32 work registers    1024      1024      1024      2048       1536-2048  512-2048   2048       2048
Thread count with 33-64 work registers   512       512       512       1024       768-1024   256-1024   1024       1024

1. OpenGL ES has 4 render targets and Vulkan 8
2. Tile storage per pixel may be able to exceed this, but with reduced tile size. The theoretical limit is higher from Mali-G710 onward, but 256 is the recommendation
3. Worst-case anisotropic filtering performance with MAX_ANISOTROPY = N
4. Mali-G72 r0p3 / Mali-G51 r1p1 or higher required
5. All have float blending. Valhall adds hardware acceleration for standard blend operations
6. Only fp16 and UNORM10 formats fully achieve x1
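The thread-count rows of the ISA Config tables can be read as a lookup keyed on how many work registers a shader uses. The following is a minimal sketch of that lookup, not an Arm tool: the names `THREADS_BY_REGISTER_BAND` and `threads_available` are hypothetical, and only GPUs with a single fixed value in the tables are included. Mali-G31, Mali-G51, Mali-G510 and Mali-G310 are left out because their counts depend on the implementation.

```python
# Thread counts copied from the ISA Config tables, stored as
# (threads with 0-32 work registers, threads with 33-64 work registers).
# GPUs whose counts are implementation-dependent ranges are omitted.
THREADS_BY_REGISTER_BAND = {
    "Mali-G71": (384, 384),
    "Mali-G72": (384, 384),
    "Mali-G52": (768, 384),
    "Mali-G76": (768, 384),
    "Mali-G57": (1024, 512),
    "Mali-G77": (1024, 512),
    "Mali-G78": (1024, 512),
    "Mali-G710": (2048, 1024),
    "Mali-G715": (2048, 1024),
    "Immortalis-G715": (2048, 1024),
}

MAX_WORK_REGISTERS = 64  # "Max work registers (32b)" row


def threads_available(gpu, work_registers):
    """Return the shader core thread capacity for a given register usage."""
    if not 0 <= work_registers <= MAX_WORK_REGISTERS:
        raise ValueError("work_registers must be between 0 and 64")
    low_band, high_band = THREADS_BY_REGISTER_BAND[gpu]
    # Shaders using 33 or more registers fall into the second table row.
    return low_band if work_registers <= 32 else high_band


if __name__ == "__main__":
    for registers in (32, 33):
        print("Mali-G77", registers, threads_available("Mali-G77", registers))
```

On most of the listed GPUs, crossing from 32 to 33 work registers halves the available thread count. Mali-G71 and Mali-G72 are the exceptions in the table, staying at 384 in both bands.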
Specific Architecture pages:
Bifrost (Mali-G71 to Mali-G76)
Valhall (Mali-G57 to Immortalis-G715)
5th Gen (Mali-G720 to Immortalis-G720)
Performance Counters

For further reference on the technologies mentioned in the sheet, please refer to these webpages:
ASTC (Adaptive Scalable Texture Compression)
AFBC (Arm FrameBuffer Compression)
MSAA (Multi-Sample Anti-Aliasing)
Transaction Elimination
Hidden Surface Removal
IDVS (Index-Driven Vertex Shading)
DVS (Deferred Vertex Shading)
Shader Framebuffer Access (GLES)
Shader Framebuffer Access (Vulkan)

For free GPU profiling tools, see:
Arm Mobile Studio

The Core Config table details the specifications of each GPU rather than just whether features are available. For each GPU it lists threads per warp, total threads, and operations or texels per clock cycle, as well as cache sizes. Note that tile write rate on Arm GPUs covers both the fragments written into the tile and the pixels written back out of the tile. Thread count is the total shader core hardware capacity; note that for OpenGL ES only 128 threads are exposed. For Mali-G310 and Mali-G510, Core Config gives ranges that depend on the implementation; please check with the device manufacturer for the exact specification.

For Texturing, to work out cycles/sample for filters more complicated than bilinear, apply the multipliers in the tables on top of the bilinear performance to build up the required filter. Remember to invert the bilinear samples/cycle figure to get cycles/sample. For example, a simple trilinear filter costs 2 x 1 cycles/sample on a Mali-G72 and 2 x 0.25 cycles/sample on a Mali-G77. To add 4x anisotropic filtering, multiply by a further 4x. Note that the anisotropic filter scaling is a worst-case number set by the maximum number of sample taps; the real cost will usually be lower. Texture performance will differ from Image performance. Depth performance with/without reference distinguishes, for example, a shadow sampler with reference comparison returning a weighted bool from a normal sample returning the actual depth value.
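The cycles/sample arithmetic in the Texturing note can be written out as a short calculation. This is an illustrative sketch only: the function name `cycles_per_sample` is hypothetical, the trilinear multiplier of 2 is taken from the worked example in the text, and the bilinear rates used below (1 sample/cycle for Mali-G72, 4 samples/cycle for Mali-G77) are the values implied by that example. The anisotropic result is a worst-case figure.

```python
def cycles_per_sample(bilinear_samples_per_cycle, trilinear=False, max_anisotropy=1):
    """Estimate worst-case texture filtering cost in cycles per sample.

    bilinear_samples_per_cycle: bilinear throughput from the Texturing table.
    trilinear: apply the 2x trilinear multiplier from the worked example.
    max_anisotropy: N for Nx anisotropic filtering (1 means none).
    """
    if bilinear_samples_per_cycle <= 0:
        raise ValueError("bilinear_samples_per_cycle must be positive")
    if max_anisotropy < 1:
        raise ValueError("max_anisotropy must be at least 1")

    # Invert samples/cycle to get cycles/sample for plain bilinear.
    cycles = 1.0 / bilinear_samples_per_cycle
    if trilinear:
        cycles *= 2
    # Worst case: every sample needs the maximum number of taps.
    cycles *= max_anisotropy
    return cycles


if __name__ == "__main__":
    print("Mali-G72 trilinear:", cycles_per_sample(1, trilinear=True))
    print("Mali-G77 trilinear:", cycles_per_sample(4, trilinear=True))
    print("Mali-G77 trilinear + 4x aniso:",
          cycles_per_sample(4, trilinear=True, max_anisotropy=4))
```

Because the anisotropic multiplier assumes the maximum tap count on every sample, the value returned for `max_anisotropy > 1` is best treated as an upper bound rather than an expected cost.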