RDNA 3: Difference between revisions

Revision as of 21:16, 31 October 2023 by 67.84.203.109 (→Integrated graphics processors (iGPs))
Latest revision as of 20:41, 7 October 2024 by Jonesey95 (m Replace template per TFD)
(41 intermediate revisions by 20 users not shown)
Line 5: \| name = RDNA 3 \| image = AMD RDNA-3 Logo.png \| caption = ~~<!-- A caption for the image -->~~ \| alt = <!-- Mouse over text for the image --> \| launched = {{Start date and age\|2022\|12\|13}} \| discontinued = \| ~~soldby~~ designfirm = [[~~Advanced Micro Devices\|~~AMD]] ~~\| designfirm = [[Advanced Micro Devices\|AMD]]~~ \| manuf1 = [[TSMC]] \| process = {{ubl \|[[TSMC]] [[5 nm process\|N4]] (APUs)\|[[TSMC]] [[5 nm process\|N5]] (GCD)\|[[TSMC]] [[7 nm process\|N6]] (Navi 33 and MCD)}} \| codename = {{ubl \|Navi 31: Plum Bonito \|Navi 32: Wheat Nas \|Navi 33: Hotpink Bonefish}} <!------------------ Product Series -------------------> \| products-desktop1 = [[Radeon RX 7000 series]] \| products-hedt1 = [[Radeon Pro#Radeon Pro W7000 series\|Radeon Pro W7000 series]] \| products-server1 = <!-- (1..5) Server/datacenter product series that use the architecture, e.g. AMD Instinct MI300 --> Line 33 ⟶ 32: <!------------------ Supported Compute APIs -------------------> ~~\| opengl-compute-version = <!-- Version number of OpenGL Compute supported by the GPU architecture -->~~ \| cuda-compute-version = <!-- Version number of CUDA Compute supported by the GPU architecture --> \| directcompute-version = <!-- Version number of DirectCompute supported by the GPU architecture --> <!------------------ Specifications -------------------> \| compute = {{ubl \|Up to 122.8{{nbsp}}TFLOPS (FP16) \|Up to 61.42{{nbsp}}TFLOPS (FP32) \|Up to 1.919{{nbsp}}TFLOPS (FP64)}} \| slowest = 1500 \| slow-unit = MHz Line 47 ⟶ 45: \| l1-cache = 256{{nbsp}}KB (per array) \| l2-cache = 6{{nbsp}}MB \| l3-cache = up to 96{{nbsp}}MB (16{{nbsp}}MB per MCD) \| memory-support = [[GDDR6 SDRAM\|GDDR6]] \| memory-clock = 20up to 20 Gbps \| pcie-support = [[PCI Express#PCI Express 4.0\|PCIe 4.0]] Line 62 ⟶ 60: \| predecessor = [[RDNA 2]] \| variant = CDNA 3 (datacenter) \| successor = [[RDNA 4]] \| support_status = Supported }} '''RDNA 3''' is a [[Graphics processing unit\|GPU]] microarchitecture 
designed by [[~~Advanced Micro Devices\|~~AMD]], released with the [[Radeon]] [[Radeon RX 7000 series\|RX 7000 series]] on December 13, 2022. Alongside powering the RX 7000 series, RDNA 3 is also featured in the SoCs designed by AMD for the [[Asus ROG Ally]] ~~and~~, [[Lenovo Legion\|Lenovo Legion Go]], and the [[PlayStation 5\|PlayStation 5 Pro]] consoles. == Background == Line 73 ⟶ 72: Full details for the RDNA 3 architecture were unveiled on November 3, 2022 at an event in [[Las Vegas]].<ref>{{Cite press release \|title=AMD Unveils World's Most Advanced Gaming Graphics Cards, Built on Groundbreaking AMD RDNA 3 Architecture with Chiplet Design \|url=https://www.amd.com/en/press-releases/2022-11-03-amd-unveils-world-s-most-advanced-gaming-graphics-cards-built \|website=AMD \|location=Las Vegas, NV \|language=en-US \|date=November 3, 2022 \|access-date=April 8, 2023}}</ref> == ~~Architectural details~~Architecture == === Chiplet packaging === For the first time ever in a consumer GPU, RDNA 3 utilizes modular [[Multi-chip module\|chiplets]] rather than a single large monolithic [[Die (integrated circuit)\|die]]. 
AMD previously had great success with its use of chiplets in its [[Ryzen]] desktop and [[Epyc]] server processors.<ref>{{Cite web \|last1=James \|first1=Dave \|date=June 24, 2022 \|title=AMD suggests a Ryzen-like design for RDNA 3 chiplets would be 'a reasonable inference' \|url=https://www.pcgamer.com/amd-suggests-a-ryzen-like-chiplet-design-for-rdna-3-gpus-would-be-a-reasonable-inference/ \|website=PC Gamer \|language=en-US \|access-date=April 8, 2023}}</ref> The decision to move to a chiplet-based GPU microarchitecture was led by AMD Senior Vice President [[Sam Naffziger]], who had also led the chiplet initiative with Ryzen and Epyc.<ref>{{Cite web \|last1=Alcorn \|first1=Paul \|last2=Walton \|first2=Jarred \|date=June 23, 2022 \|title=Into the GPU Chiplet Era: An Interview With AMD's Sam Naffziger \|url=https://www.tomshardware.com/features/gpu-chiplet-era-interview-amd-sam-naffziger \|website=Tom's Hardware \|language=en-US \|access-date=April 8, 2023}}</ref> The development of RDNA 3's chiplet architecture began towards the end of 2017, with Naffziger leading the AMD graphics team in the effort.<ref name="Brosdahl">{{Cite web \|last1=Brosdahl \|first1=Peter \|date=November 22, 2022 \|title=AMD Lead Engineer Sam Naffziger Explains Advantages of RDNA3 Chiplet Design \|url=https://www.thefpsreview.com/2022/11/22/amd-lead-engineer-sam-naffziger-explains-advantages-of-rdna3-chiplet-design/ \|website=The FPS Review \|language=en-US \|access-date=April 8, 2023}}</ref> The benefit of using chiplets is that dies can be [[Semiconductor device fabrication\|fabricated]] on different process nodes depending on their functions and intended purpose. According to Naffziger, cache and SRAM do not scale as linearly as logic does on advanced nodes like N5 in terms of density and power consumption, so they can instead be fabricated on the cheaper, more mature N6 node. 
The use of smaller dies rather than one large monolithic die is beneficial for maximizing wafer yields as more dies can be fitted onto a single wafer.<ref name="Brosdahl"/> Alternatively, a large monolithic RDNA 3 die built on N5 would be more expensive to produce with lower yields. ~~The decision to move to a chiplet-based GPU microarchitecture was led by AMD Senior Vice President [[Sam Naffziger]] who had also lead the chiplet initiative with Ryzen and Epyc.<ref>{{Cite web \|last1=Alcorn \|first1=Paul \|last2=Walton \|first2=Jarred \|date=June 23, 2022 \|title=Into the GPU Chiplet Era: An Interview With AMD's Sam Naffziger \|url=https://www.tomshardware.com/features/gpu-chiplet-era-interview-amd-sam-naffziger \|website=Tom's Hardware \|language=en-US \|access-date=April 8, 2023}}</ref> The development of RDNA 3's chiplet architecture began towards the end of 2017 with Naffziger leading the AMD graphics team in the effort.<ref name="Brosdahl">{{Cite web \|last1=Brosdahl \|first1=Peter \|date=November 22, 2022 \|title=AMD Lead Engineer Sam Naffziger Explains Advantages of RDNA3 Chiplet Design \|url=https://www.thefpsreview.com/2022/11/22/amd-lead-engineer-sam-naffziger-explains-advantages-of-rdna3-chiplet-design/ \|website=The FPS Review \|language=en-US \|access-date=April 8, 2023}}</ref>~~
RDNA 3 uses two types of chiplets: the Graphics Compute Die (GCD) and Memory Cache Dies (MCDs). On Ryzen and Epyc processors, AMD used its [[PCI Express\|PCIe]]-based Infinity Fabric protocol with the package's dies connected via traces on an organic substrate. This approach is easily scalable in a cost-effective manner but has the drawbacks of increased [[Latency (engineering)\|latency]], increased power consumption when moving data between dies at around 1.5 picojoules per bit, and it cannot achieve the connection density needed for high-bandwidth GPUs.<ref>{{cite web \|last1=Walton \|first1=Jarred \|date=June 5, 2023 \|title=AMD RDNA 3 GPU Architecture Deep Dive: The Ryzen Moment for GPUs \|url=https://www.tomshardware.com/news/amd-rdna-3-gpu-architecture-deep-dive-the-ryzen-moment-for-gpus#section-amd-rdna-3-and-gpu-chiplets \|website=Tom's Hardware \|language=en-US \|access-date=April 29, 2024}}</ref> An organic package could not host the number of wires that would be needed to connect multiple dies in a GPU.<ref>{{cite web \|last1=Ridley \|first1=Jacob \|date=November 14, 2022 \|title=AMD's Infinity Links is the unsung hero of RDNA 3 and chiplet gaming GPUs \|url=https://www.pcgamer.com/amd-infinity-links-rdna-3/ \|website=PC Gamer \|language=en-US \|access-date=April 29, 2024}}</ref> RDNA 3's dies are instead connected using [[TSMC]]'s Integrated Fan-Out Re-Distribution Layer (InFO-RDL) packaging technique which provides a silicon bridge for high bandwidth and high density die-to-die communication.<ref name="TechPowerUp">{{Cite web \|title=AMD Explains the Economics Behind Chiplets for GPUs \|url=https://www.techpowerup.com/301071/amd-explains-the-economics-behind-chiplets-for-gpus \|website=TechPowerUp \|language=en-US \|date=November 14, 2022 \|access-date=April 8, 2023}}</ref> InFO allows dies to be connected without the use of a more costly silicon [[interposer]] such as the one used in AMD's Instinct MI200 and MI300 datacenter accelerators. Each Infinity Fanout link has 9.2 Gbps in bandwidth. Naffziger explains that "The bandwidth density that we achieve is almost 10x" with the Infinity Fanout rather than the wires used by Ryzen and Epyc processors. 
The chiplet interconnects in RDNA 3 achieve a cumulative bandwidth of 5.3{{nbsp}}TB/s.<ref name="TechPowerUp"/> ~~==== Memory Cache Dies (MCDs) ====~~ With 2.05 billion transistors, each Memory Cache Die (MCD) contains large blocks of L3 cache and two physical 32-bit GDDR6 memory interfaces for a combined 64-bit interface per MCD.<ref name="Walton">{{Cite web \|last1=Walton \|first1=Jarred \|date=November 14, 2022 \|title=AMD RDNA 3 GPU Architecture Deep Dive: The Ryzen Moment for GPUs \|url=https://www.tomshardware.com/news/amd-rdna-3-gpu-architecture-deep-dive-the-ryzen-moment-for-gpus \|website=Tom's Hardware \|language=en-US \|access-date=April 8, 2023}}</ref> The Radeon RX 7900 XTX has a 384-bit memory bus through the use of six MCDs while the RX 7900 XT has a 320-bit bus due to its five MCDs. ==== ~~Chiplet~~Memory ~~interconnects~~Cache Dies (MCDs) ==== With 2.05 billion transistors, each Memory Cache Die (MCD) contains 16{{nbsp}}MB of L3 cache. Theoretically, additional L3 cache could be added to the MCDs via AMD's 3D V-Cache die stacking technology as the MCDs contain unused [[Through-silicon via\|TSV]] connection points.<ref>{{Cite web \|last1=Klotz \|first1=Aaron \|date=January 29, 2023 \|title=AMD GPU Appears to Leave Room for Future 3D V-Cache \|url=https://www.tomshardware.com/news/3d-vcache-rdna3-amd-gpu \|website=Tom's Hardware \|language=en-US \|access-date=April 8, 2023}}</ref><ref>{{Cite web \|last1=Ridley \|first1=Jacob \|date=January 30, 2023 \|title=Tiny spots on AMD's RDNA 3 GPU hint at massive cache potential \|url=https://www.pcgamer.com/tiny-spots-on-amds-rdna-3-gpu-hint-at-massive-cache-potential/ \|website=PC Gamer \|language=en-US \|access-date=April 8, 2023}}</ref> Also present on each MCD are two physical 32-bit GDDR6 memory interfaces for a combined 64-bit interface per MCD.<ref name="Walton">{{Cite web \|last1=Walton \|first1=Jarred \|date=November 14, 2022 \|title=AMD RDNA 3 GPU Architecture Deep Dive: The 
Ryzen Moment for GPUs \|url=https://www.tomshardware.com/news/amd-rdna-3-gpu-architecture-deep-dive-the-ryzen-moment-for-gpus \|website=Tom's Hardware \|language=en-US \|access-date=April 8, 2023}}</ref> The Radeon RX 7900 XTX has a 384-bit memory bus through the use of six MCDs while the RX 7900 XT has a 320-bit bus due to its five MCDs. The chiplet interconnects have a bandwidth of 5.3{{nbsp}}TB/s.<ref>{{Cite web \|title=AMD Explains the Economics Behind Chiplets for GPUs \|url=https://www.techpowerup.com/301071/amd-explains-the-economics-behind-chiplets-for-gpus \|website=TechPowerUp \|language=en-US \|date=November 14, 2022 \|access-date=April 8, 2023}}</ref> ==== ~~Process~~Graphics ~~node~~Compute Die (GCD) ==== ==== Compute Units ==== According to Naffziger, cache and SRAM do not scale as linearly as logic does on advanced nodes like N5 in terms of density and power consumption so they can instead be fabricated on the cheaper, more mature N6 node. The use of smaller chiplet dies rather than one large monolithic die is beneficial for maximizing wafer yields as more dies can be fitted onto a single wafer.<ref name="Brosdahl"/> RDNA 3's Compute Units (CUs) for graphics processing are organized in dual CU Work Group Processors (WGPs). Rather than including a very large number of WGPs in RDNA 3 GPUs, AMD instead focused on improving per-WGP throughput. This is done with improved [[Superscalar processor\|dual-issue]] shader ALUs with the ability to execute two instructions per cycle. 
It can contain up to 96 graphics Compute Units that can provide up to 61 TFLOPS of compute.<ref name="Gula">{{Cite web \|last1=Gula \|first1=Damien \|date=November 3, 2022 \|title=AMD's RDNA 3 GPUs are Way Cheaper Than the RTX 4090 \|url=https://gizmodo.com/amd-rdna-3-gpu-rtx-4090-4080-rx-7900-xtx-xt-price-date-1849741000 \|website=Gizmodo \|language=en-US \|access-date=April 8, 2023}}</ref> While RDNA 3 doesn't include dedicated execution units for AI acceleration like the Matrix Cores found in AMD's compute-focused [[CDNA (microarchitecture)\|CDNA]] architectures, the efficiency of running inference tasks on [[half-precision floating-point format\|FP16]] execution resources is improved with Wave MMA ([[matrix multiplication\|matrix]] [[multiply–accumulate operation\|multiply–accumulate]]) instructions. This results in increased inference performance compared to RDNA 2.<ref name="tomshw-rdna3">{{cite web \|last1=Walton \|first1=Jarred \|title=AMD RDNA 3 and Radeon RX 7000-series GPUs: Everything we know \|url=https://www.tomshardware.com/features/amd-radeon-rx-7000-rdna-3-price-performance-benchmarks-release-date#section-amd-rdna-3-architecture-ai-accelerators \|website=Tom's Hardware \|access-date=20 July 2024 \|date=15 June 2024}}</ref><ref name="tomshw-interview">{{cite web \|last1=Walton \|first1=Jarred \|last2=Alcorn \|first2=Paul \|title=Into the GPU Chiplet Era: An Interview With AMD's Sam Naffziger \|url=https://www.tomshardware.com/features/gpu-chiplet-era-interview-amd-sam-naffziger \|website=Tom's Hardware \|access-date=20 July 2024 \|date=23 June 2022 \|quote=We asked whether AMD would include some form of tensor core or matrix core in the architecture, similar to what both Nvidia and Intel are doing with their GPUs. 
He responded that the split between RDNA and CDNA means stuffing a bunch of specialized matrix cores into consumer graphics products really isn't necessary for the target market, plus the FP16 support that already exists in previous RDNA architectures should prove sufficient for inference-type workloads.}}</ref> WMMA supports FP16, BF16, INT8, and INT4 data types.<ref name="wmma-gpuopen">{{cite web \|last=Vasishta \|first=Aaryaman \|date=January 10, 2023 \|title=How to accelerate AI applications on RDNA 3 using WMMA \|url=https://gpuopen.com/learn/wmma_on_rdna3/ \|website=GPUOpen \|language=en-US \|access-date=August 14, 2023 \|archive-url=https://web.archive.org/web/20230110194947/https://gpuopen.com/learn/wmma_on_rdna3/ \|url-status=live \|archive-date=January 10, 2023}}</ref> ''[[Tom's Hardware]]'' found that AMD's fastest RDNA 3 GPU, the RX 7900 XTX, was capable of generating 26 images per minute with [[Stable Diffusion]], compared to only 6.6 images per minute of the RX 6950 XT, the fastest RDNA 2 GPU.<ref name="tomshw-sd">{{cite web \|last1=Walton \|first1=Jarred \|title=Stable Diffusion Benchmarks: 45 Nvidia, AMD, and Intel GPUs Compared \|url=https://www.tomshardware.com/pc-components/gpus/stable-diffusion-benchmarks \|website=Tom's Hardware \|access-date=20 July 2024 \|date=15 December 2023}}</ref> ~~=== Compute Units ===~~ RDNA 3 includes improved [[superscalar processor\|dual-issue]] shader ALUs with the ability to execute two instructions per cycle. 
It can contain up to 96 graphics Compute Units that can provide up to 61 TFLOPS of compute.<ref name="Gula">{{Cite web \|last1=Gula \|first1=Damien \|date=November 3, 2022 \|title=AMD's RDNA 3 GPUs are Way Cheaper Than the RTX 4090 \|url=https://gizmodo.com/amd-rdna-3-gpu-rtx-4090-4080-rx-7900-xtx-xt-price-date-1849741000 \|website=Gizmodo \|language=en-US \|access-date=April 8, 2023}}</ref> ==== Ray tracing ==== RDNA 3 has dedicated AI acceleration with Wave MMA ([[matrix multiplication\|matrix]] [[multiply-accumulate]]) instructions,<ref name=wmma-gpuopen>{{cite web\|url=https://gpuopen.com/learn/wmma_on_rdna3/\|title=How to accelerate AI applications on RDNA 3 using WMMA\|first=Aaryaman\|last=Vasishta\|date=January 10, 2023\|publisher=GPUOpen.com\|access-date=2023-08-14\|archive-url=https://web.archive.org/web/20230110194947/https://gpuopen.com/learn/wmma_on_rdna3/\|url-status=live\|archive-date=2023-01-10}}</ref> which can improve AI-based performance by 2.7x and also benefits ray tracing instructions, similar to [[Nvidia]]'s Tensor cores.<ref name="Gula"/> RDNA 3 features second generation ray-tracing accelerators. Each Compute Unit contains one ray tracing accelerator. The overall number of ray tracing accelerators is increased due to the higher number of Compute Units, though the number of ray tracing accelerators per Compute Unit has not increased over RDNA 2. ==== ~~Ray~~Clock ~~tracing~~speeds ==== Each RDNA 3 Compute Unit contains one ray tracing accelerator. The overall number of ray tracing accelerators is increased due to the higher number of Compute Units, though the number of ray tracing accelerators per Compute Unit has not increased over RDNA 2. ~~=== Clock speeds ===~~ RDNA 3 was designed to support high clock speeds. On RDNA 3, clock speeds have been decoupled with the front end operating at a 2.5{{nbsp}}GHz frequency while the shaders operate at 2.3{{nbsp}}GHz. 
Running the shaders at a lower clock speed gives up to 25% power savings according to AMD, and RDNA 3's shader clock speed is still 15% faster than RDNA 2's.<ref>{{Cite web \|last1=Olšan \|first1=Jan \|date=November 7, 2022 \|title=AMD RDNA 3 details: architecture changes, AI acceleration, DP 2.1 \|url=https://www.hwcooling.net/en/amd-rdna-3-details-architecture-changes-ai-acceleration-dp-2-1-en/ \|website=HWCooling \|language=en-GB \|access-date=April 8, 2023}}</ref> ==== Cache and memory subsystem ==== RDNA 3 increased the capacity of L1 and L2 caches. The 16-way associative L1 cache shared across a shader array is doubled in RDNA 3 to 256{{nbsp}}KB. The L2 cache increased from 4{{nbsp}}MB on [[RDNA 2]] to 6{{nbsp}}MB on RDNA 3. The L3 Infinity Cache has been lowered in capacity from 128{{nbsp}}MB to 96{{nbsp}}MB and latency has increased as it is physically present on the MCDs rather than being closer to the WGPs within the GCD.<ref name="CAC">{{cite web \|title=Microbenchmarking AMD’s RDNA 3 Graphics Architecture \|url=https://chipsandcheese.com/2023/01/07/microbenchmarking-amds-rdna-3-graphics-architecture/ \|website=Chips and Cheese \|language=en-US \|date=January 7, 2023 \|access-date=April 29, 2024}}</ref> The Infinity Cache capacity was decreased because RDNA 3 has a wider memory interface of up to 384 bits, whereas RDNA 2 used memory interfaces of up to 256 bits. With the wider 384-bit memory interface providing more memory bandwidth, the cache hit rate does not have to be as high to avoid bandwidth bottlenecks.<ref name="CAC"/> RDNA 3 GPUs use [[GDDR6 SDRAM\|GDDR6]] memory rather than faster [[GDDR6X SDRAM\|GDDR6X]] due to the latter's increased power consumption. ~~RDNA 3 GPUs use [[GDDR6 SDRAM\|GDDR6]] memory rather than faster [[GDDR6X SDRAM\|GDDR6X]] due to the latter's increased power consumption.~~ Each MCD includes 16{{nbsp}}MB of Infinity Cache. 
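The per-MCD figures quoted above (16 MB of Infinity Cache and two 32-bit GDDR6 interfaces per die) can be turned into a quick arithmetic check. The following is a minimal sketch, not an AMD tool: the function name `memory_config` is hypothetical, and the 20 Gbps per-pin data rate is the infobox's "up to" value, so the bandwidth it returns should be read as an upper bound.

```python
# Per-MCD figures taken from the text: 16 MB of Infinity Cache and
# two 32-bit GDDR6 interfaces (64 bits in total) per Memory Cache Die.
L3_MB_PER_MCD = 16
BUS_BITS_PER_MCD = 2 * 32


def memory_config(mcd_count, gbps_per_pin=20):
    """Return (bus width in bits, Infinity Cache in MB, peak bandwidth in GB/s).

    gbps_per_pin is an assumed GDDR6 data rate per pin; 20 Gbps is the
    "up to" figure from the infobox, not a per-product specification.
    """
    bus_bits = mcd_count * BUS_BITS_PER_MCD
    l3_mb = mcd_count * L3_MB_PER_MCD
    bandwidth_gb_s = bus_bits * gbps_per_pin / 8  # divide by 8: bits to bytes
    return bus_bits, l3_mb, bandwidth_gb_s


# MCD counts for the two cards named in the text.
for name, mcds in [("RX 7900 XTX", 6), ("RX 7900 XT", 5)]:
    bus, l3, bw = memory_config(mcds)
    print(f"{name}: {bus}-bit bus, {l3} MB Infinity Cache, up to {bw:.0f} GB/s")
```

With six MCDs the sketch reproduces the 384-bit bus and 96 MB of Infinity Cache given for the RX 7900 XTX, and with five MCDs the 320-bit bus of the RX 7900 XT.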
Theoretically, additional L3 cache could be added to the MCDs via AMD's 3D V-Cache die stacking technology as the MCDs contain unused [[Through-silicon via\|TSV]] connection points.<ref>{{Cite web \|last1=Klotz \|first1=Aaron \|date=January 29, 2023 \|title=AMD GPU Appears to Leave Room for Future 3D V-Cache \|url=https://www.tomshardware.com/news/3d-vcache-rdna3-amd-gpu \|website=Tom's Hardware \|language=en-US \|access-date=April 8, 2023}}</ref><ref>{{Cite web \|last1=Ridley \|first1=Jacob \|date=January 30, 2023 \|title=Tiny spots on AMD's RDNA 3 GPU hint at massive cache potential \|url=https://www.pcgamer.com/tiny-spots-on-amds-rdna-3-gpu-hint-at-massive-cache-potential/ \|website=PC Gamer \|language=en-US \|access-date=April 8, 2023}}</ref> ~~=== Power efficiency ===~~ ~~AMD claims that RDNA 3 achieves a 54% increase in performance-per-watt which is in line with their previous claims of 50% performance-per-watt increases for both RDNA and RDNA 2.~~ ==== Media engine ==== RDNA 3 is the first RDNA architecture to have a dedicated media engine. 
It is built into the GCD and is based on the [[Video Core Next\|VCN 4.0]] encoding and decoding core.<ref>{{cite web \|last1=Shilov \|first1=Anton \|date=May 4, 2022 \|title=First Details About AMD’s Next Generation Video Engine Revealed \|url=https://www.tomshardware.com/news/next-amd-video-engine-may-lack-av1 \|website=Tom's Hardware \|language=en-US \|access-date=April 10, 2023}}</ref> AMD's AMF [[AV1]] encoder is comparable in quality to Nvidia's [[Nvidia NVENC\|NVENC]] AV1 encoder but can handle more simultaneous encoding streams than the limit of three on the [[GeForce 40 series\|GeForce RTX 40 series]].<ref>{{Cite web \|last1=Klotz \|first1=Aaron \|date=December 12, 2022 \|title=AMD's Radeon RX 7900 AV1 encoder is almost on par with Intel Arc and Nvidia's RTX 40 series \|url=https://www.techspot.com/news/96945-amd-radeon-rx-7900-av1-encoder-almost-par.html \|website=TechSpot \|language=en-US \|access-date=April 8, 2023}}</ref> {\| class="wikitable" style="text-align: center;" Line 118 ⟶ 110: ! [[AV1]] \|- ! style="text-align: left;" \| {{~~midsize~~resize\|~~1080p~~1080p60}} \| 360 \| 360 \| 360 \|- ! style="text-align: left;" \| {{~~midsize~~resize\|~~1440p~~1440p60}} \| 360 \| 360 \| 360 \|- ! style="text-align: left;" \| {{~~midsize~~resize\|~~4K~~4K60}} \| 180 \| 180 \| 240 \|- ! style="text-align: left;" \| {{~~midsize~~resize\|~~8K~~8K60}} \| 48 \| 48 \| 60 \|- \|} === Display engine === RDNA 3 GPUs feature a new display engine called the "Radiance Display Engine". 
AMD touted its support for [[DisplayPort#2.1\|DisplayPort 2.1]] UHBR 13.5, delivering up to 54Gbps bandwidth for high refresh rates at [[4K resolution\|4K]] and [[8K resolution\|8K]] resolutions.<ref>{{Cite web \|last1=Sag \|first1=Anshel \|date=November 14, 2022 \|title=AMD's New Radeon RX 7900XTX And 7900XT Put The Pressure On NVIDIA \|url=https://www.forbes.com/sites/moorinsights/2022/11/14/amds-new-radeon-rx-7900xtx-and-7900xt-put-the-pressure-on-nvidia/?sh=21776d571aa3 \|website=Forbes \|language=en-US \|access-date=April 8, 2023}}</ref> The Radeon Pro W7900 and W7800 support the 80Gbps UHBR20 standard. DisplayPort 2.1 can support 4K at 480{{nbsp}}Hz and 8K at 165{{nbsp}}Hz with [[Display Stream Compression]] (DSC). The previous DisplayPort 1.4 standard with DSC was limited to 4K at 240{{nbsp}}Hz and 8K at 60{{nbsp}}Hz. === Power efficiency === RDNA 3 GPUs feature a new display engine called the "Radiance Display Engine". AMD touted its support for [[DisplayPort#2.1\|DisplayPort 2.1]] UHBR 13.5, delivering up to 54 Gbit/s bandwidth for high refresh rates at [[4K resolution\|4K]] and [[8K resolution\|8K]] resolutions.<ref>{{Cite web \|last1=Sag \|first1=Anshel \|date=November 14, 2022 \|title=AMD's New Radeon RX 7900XTX And 7900XT Put The Pressure On NVIDIA \|url=https://www.forbes.com/sites/moorinsights/2022/11/14/amds-new-radeon-rx-7900xtx-and-7900xt-put-the-pressure-on-nvidia/?sh=21776d571aa3 \|website=Forbes \|language=en-US \|access-date=April 8, 2023}}</ref> DisplayPort 2.1 can support 4K at 480{{nbsp}}Hz and 8K at 165{{nbsp}}Hz with [[Display Stream Compression]] (DSC). The previous DisplayPort 1.4 standard with DSC was limited to 4K at 240{{nbsp}}Hz and 8K at 60{{nbsp}}Hz. AMD claims that RDNA 3 achieves a 54% increase in performance-per-watt which is in line with their previous claims of 50% performance-per-watt increases for both RDNA and RDNA 2. == Navi 3x dies == {\| class="wikitable" style="text-align: center;" ! rowspan="2" colspan="3" \| ! 
colspan="3" \| Graphics Compute Die ~~<br>~~ (GCD) ! rowspan="2" \| Memory Cache Die <br /> (MCD) \|- ! style="width:8em" \| [https://www.techpowerup.com/gpu-specs/amd-navi-31.g998 Navi 31]<ref name="Walton" /> ! style="width:8em" \| [https://www.techpowerup.com/gpu-specs/amd-navi-32.g1000 Navi 32]<ref>https://www.tomshardware.com/news/amd-rdna-3-gpu-architecture-deep-dive-the-ryzen-moment-for-gpus</ref> ! style="width:8em" \| [https://www.techpowerup.com/gpu-specs/amd-navi-33.g1001 Navi 33] \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|~~{{Tooltip\|Ref.\|Reference(s)}}~~Launch}} ~~! <ref name="Walton"/>~~ ! ! ! \|- ~~! style="text-align: left;" colspan="3" \| {{midsize\|Launch}}~~ \| {{dts\|2022\|December\|13\|format=my\|abbr=on}} \| {{dts\|2023\|~~August~~September\|2506\|format=my\|abbr=on}} \| {{dts\|2023\|January\|404\|format=my\|abbr=on}} \| {{dts\|2022\|December\|13\|format=my\|abbr=on}} \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Codename}} \| ''Plum Bonito'' \| ''Wheat Nas'' Line 172 ⟶ 159: \| rowspan="2" {{NA}} \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Compute units <br/> (Stream processors) <br/> [FP32 cores]}} \| 96 <br/> (6144) <br/> [12288] \| 60 <br/> (3840) <br/> [7680] \| 32 <br/> (2048) <br/> [4096] \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Process}} \| colspan="2" \| [[TSMC]] [[5 nm process\|N5]] \| colspan="2" \| [[TSMC]] [[7 nm process\|N6]] \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Transistors}} \| 45.4{{nbsp}}<small>bn.</small> ~~\| 45.4B~~ \| 28.1{{nbsp}}<small>bn.</small> ~~\| {{unk}}~~ \| 13.3{{nbsp}}<small>bn.</small> ~~\| 13.3B~~ \| 2.05{{nbsp}}<small>bn.</small> ~~\| 2.05B~~ \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Transistor density}} \| 150.2 MTr/mm<sup>2</sup> \| 143.4 MTr/mm<sup>2</sup> ~~\| {{unk}}~~ \| 65.2 MTr/mm<sup>2</sup> \| 54.64 MTr/mm<sup>2</sup> \|- ! 
style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Die size}} \| 304.35 mm<sup>2</sup> \| ~~200~~196 mm<sup>2</sup> \| 204 mm<sup>2</sup> \| 37.52 mm<sup>2</sup> \|- ! style="text-align: left;" colspan="3" \| {{~~midsize~~resize\|Max TDP}} \| 405{{nbsp}}W \| 263{{nbsp}}W Line 205 ⟶ 192: \| {{NA}} \|- ! style="text-align: left;" rowspan="4" \| {{~~midsize~~resize\|Products}} ! style="text-align: left;" rowspan="2" \| {{~~midsize~~resize\|Consumer}} ! style="text-align: left;" \| {{~~midsize~~resize\|Desktop}} \| {{ubl\|RX 7900 GRE\|RX 7900 XT\|RX 7900 XTX}} \| {{ubl\|RX 7700 XT\|RX 7800 XT}} \| {{ubl\|RX 7600\|RX 7600 XT}} \| {{ubl\|RX 7700 XT (3×)\|RX 7800 XT (4×)\|RX 7900 GRE (4×)\|RX 7900 XT (5×)\|RX 7900 XTX (6×)}} \|- ! style="text-align: left;" \| {{~~midsize~~resize\|Mobile}} \| {{ubl\|RX 7900M}} \| {{NA}} Line 219 ⟶ 206: \| {{ubl\|RX 7900M (4×)}} \|- ! style="text-align: left;" rowspan="2" \| {{~~midsize~~resize\|Workstation}} ! style="text-align: left;" \| {{~~midsize~~resize\|Desktop}} \| {{ubl\|W7800\|W7900}} \| {{NAubl\|W7700}} \| {{ubl\|W7500\|W7600}} \| {{ubl\|W7700 (4×)\|W7800 (4×)\|W7900 (6×)}} \|- ! style="text-align: left;" \| {{~~midsize~~resize\|Mobile}} \| {{NA}} \| {{NA}} \| {{NA}} \| {{NA}} \|- \|} == Products == === ~~Desktop~~Gaming === ==== Desktop ==== {{AMD Radeon RX 7000}} ==== Mobile ==== {{AMD Radeon RX 7000M}} === Workstation === {{Main\|Radeon Pro#Radeon Pro W7000 series\|l1 = Radeon Pro W7000 series}} ==== Desktop ~~Workstation~~workstation ==== {{AMD Radeon Pro W7000}} === Integrated graphics ~~processors~~processing units (~~iGPs~~iGPUs) === {\| class="wikitable" style="text-align: center; font-size: 85%; white-space:nowrap;" ! rowspan="2" \| Model ! rowspan="2" \| Launch ! rowspan="2" \| Codename ! rowspan="2" \| [[Microarchitecture\|Architecture]] <br /> & [[Semiconductor device fabrication\|fab]] ! rowspan="2" \| Die <br /> size ! colspan="2" \| Core ! 
colspan="2" \| [[Fillrate]]{{efn\|name="Boost"}}{{efn\|name="Texture fill"}}{{efn\|name="Pixel fill"}} Line 257 ⟶ 246: ! colspan="3" \| [[Cache (computing)#GPU cache\|Cache]] ! rowspan="2" \| [[Thermal design power\|TDP]] ~~! rowspan="2" \| Bus <br /> interface~~ \|- ! Config{{efn\|name="Core config"}}{{efn\|name="Stream processors"}} ! style="width:4em;" \| Clock{{efn\|name="Boost"}} <br /> ([[Hertz\|MHz]]) ! Texture <br /> ([[Texel (graphics)\|GT]]/s) ! Pixel <br /> ([[Pixel\|GP]]/s) ! style="width:4em;" \| [[Half-precision floating-point format\|Half]] <br/> [FP16] ! style="width:4em;" \| [[Single-precision floating-point format\|Single]] <br/> [FP32] ! style="width:4em;" \| [[Double-precision floating-point format\|Double]] <br/> [FP64] ! style="width:4em;" \| L0 ~~! L0~~ ! style="width:4em;" \| L1 ~~! L1~~ ! style="width:4em;" \| L2 ~~! L2~~ \|- !colspan=16\|RDNA 3 \|- ! style="text-align:left; height:3em;" \| ~~{{Nowrap\|~~[https://www.amd.com/en/products/apu/amd-ryzen-5-7540u Radeon 740M]}} \| rowspan="53" \| {{dts\|2023\|April\|format=mdy\|abbr=on}} \| rowspan="5" \| Phoenix<br/>Hawk Point \| rowspan="5" \| [[RDNA 3]] <br /> [[TSMC]]{{nbsp}}[[5 nm process\|N4]] \| rowspan="5" \| 178{{nbsp}}mm<sup>2</sup> ~~\| rowspan="2"~~ \| 4 CUCUs <br /> 256:16:8:4 \| 2,500 \| 40.0 Line 282 ⟶ 274: \| 2,560 \| 80.0 ~~\| rowspan="2"~~ \| 64{{nbsp}}KB ~~\| rowspan="2"~~ \| 512{{nbsp}}KB \| rowspan="5" \| 2{{nbsp}}MB \| 15–30{{nbsp}}W ~~\| rowspan="5" \| [[PCI Express#PCI Express 4.0\|PCIe 4.0]]<br />×8~~ \|- ~~! style="text-align:left;" \| {{Nowrap\|[https://www.amd.com/en/products/apu/amd-ryzen-z1 Ryzen Z1]}}~~ ! 
style="text-align:left; height:3em;" \| {{Nowrap\|[https://www.amd.com/en/products/apu/amd-ryzen-5-7640hs Radeon 760M]}} ~~\| 2,735~~ \| 8 CUs <br/> 512:32:16:8 ~~\| 43.7~~ \| 1,000 <br/> ''2,600'' ~~\| 21.8~~ \| 32.0 <br/> ''83.2'' ~~\| 5,600~~ \| 21.3 <br/> ''55.5'' ~~\| 2,800~~ \| 4,096 <br/> ''10,649'' ~~\| 87.5~~ \| 2,048 <br/> ''5,324'' ~~\| 9–30{{nbsp}}W~~ \| 64.0 <br/> ''166.4'' \|- ~~! style="text-align:left;" \| {{Nowrap\|[https://www.amd.com/en/products/apu/amd-ryzen-5-7640hs Radeon 760M]}}~~ ~~\| 8 CU <br /> 512:32:16:8~~ ~~\| 1,000 <br /> ''2,600''~~ ~~\| 32.0 <br /> ''83.2''~~ ~~\| 21.3 <br /> ''55.5''~~ ~~\| 4,096 <br /> ''10,649''~~ ~~\| 2,048 <br /> ''5,324''~~ ~~\| 64.0 <br /> ''166.4''~~ \| 128{{nbsp}}KB \| 1{{nbsp}}MB \| ~~35–54~~rowspan="2" \| 15–65{{nbsp}}W \|- ~~! style="text-align:left;" \|{{Nowrap\|[https://www.amd.com/en/products/apu/amd-ryzen-9-7940hs Radeon 780M]}}~~ ! style="text-align:left; height:3em;" \| [https://www.amd.com/en/products/apu/amd-ryzen-9-7940hs Radeon 780M] ~~\| rowspan="2" \| 12 CU <br />768:48:24:12~~ \| 12 CUs <br/> 768:48:24:12 ~~\| 2,700~~ \| 1,000 <br/> ''2,800'' ~~\| 129.6~~ \| 6440.80 \| ~~16,588~~20.0 \| 6,144 <br/> ''17,203'' ~~\| 8,294~~ \| 3,072 <br/> ''8,601'' ~~\| 259.2~~ \| 192 <br/> ''537.6'' ~~\| rowspan="2" \| 192{{nbsp}}KB~~ \| ~~rowspan="2" \| 1.5~~192{{nbsp}}MBKB \| ~~35–54~~1.5{{nbsp}}WMB \|- ~~! style="text-align:left;" \|{{Nowrap\|[https://www.amd.com/en/products/apu/amd-ryzen-z1-extreme Ryzen Z1 Extreme]}}~~ ! style="text-align:left; height:3em;" \| [https://www.amd.com/en/products/apu/amd-ryzen-z1 Ryzen Z1] \| rowspan="2" \| {{dts\|2023\|June\|13\|format=mdy\|abbr=on}}<!-- Release date of the Asus ROG Ally handheld that the Z1 was designed for --> \| 4 CUs <br/> 256:16:8:4 \| 2,500 \| 40.0 \| 20.0 \| 5,120 \| 2,560 \| 80.0 \| 64{{nbsp}}KB \| 512{{nbsp}}KB \| rowspan="2" \| 9–30{{nbsp}}W \|- ! 
style="text-align:left; height:3em;" \| [https://www.amd.com/en/products/apu/amd-ryzen-z1-extreme Ryzen Z1 Extreme] \| 12 CUs <br/> 768:48:24:12 \| 2,800 \| 134.4 Line 328 ⟶ 327: \| 8,600 \| 268.8 \| ~~9–30~~192{{nbsp}}WKB \| 1.5{{nbsp}}MB \|- !colspan=16\|RDNA 3.5 \|- ! style="text-align:left; height:3em;" \| [https://www.amd.com/en/products/processors/laptop/ryzen/300-series/amd-ryzen-ai-9-365.html Radeon 880M] \| rowspan="2" \| {{dts\|2024\|July\|format=mdy\|abbr=on}} \| rowspan="2" \| Strix Point \| rowspan="2" \| RDNA 3.5 <br/> [[TSMC]]{{nbsp}}[[5 nm process\|N4P]] \| rowspan="2" \| 232.5{{nbsp}}mm<sup>2</sup> \| 12 CUs <br/> 768:48:24:12 \| 2,900 \| 139.2 \| 69.6 \| 17,818 \| 8,909 \| 278.4 \| 192{{nbsp}}KB \| 1.5{{nbsp}}MB \| rowspan="2" \| 2{{nbsp}}MB \| rowspan="2" \| 15–54{{nbsp}}W \|- ! style="text-align:left; height:3em;" \| [https://www.amd.com/en/products/processors/laptop/ryzen/300-series/amd-ryzen-ai-9-hx-375.html Radeon 890M] \| 16 CUs <br/> 1024:64:32:16 \| 2,900 \| 185.6 \| 92.8 \| 23,757 \| 11,878 \| 371.2 \| 256{{nbsp}}KB \| 2{{nbsp}}MB \|} {{notelist\|refs= Line 336 ⟶ 368: {{efn\|name="Pixel fill"\|Pixel fillrate is calculated as the number of '''Render Output Units''' multiplied by the base (or boost) core clock speed.}} {{efn\|name="FLOPS"\|Precision performance is calculated from the base (or boost) core clock speed based on a [[Multiply–accumulate operation#Fused multiply–add\|FMA]] operation.}} {{efn\|name="Core config"\|[[Graphics Core Next#Compute units\|Compute Units (CUs)]] <br/> [[Unified shader model\|~~Unified~~Stream ~~shaders~~Processors]] : [[Texture mapping unit\|Texture mapping units]]s : [[Render output unit\|Render output units]]s : [[Ray tracing (graphics)\|Ray accelerators~~]] and [[Graphics Core Next#Compute units\|Compute units (CU)~~]]}} {{efn\|name="Stream processors"\|GPUs based on [[RDNA (microarchitecture)#RDNA 3\|RDNA 3]] have dual-issue '''stream processors''' so that up to two shader instructions can be executed 
per [[Instructions per cycle\|clock cycle]] under certain [[Instruction-level parallelism\|parallelism]] conditions.}} }} Line 342 ⟶ 374: == References == {{reflist}} {{AMD graphics}} [[Category:AMD microarchitectures]]
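The table footnotes above describe how the fillrate and processing-power columns are derived: render output units times clock for pixel fillrate, and an FMA operation per dual-issue stream processor for FP32. The sketch below applies those rules to a core configuration from the integrated graphics table. The function name `theoretical_rates` is hypothetical, the texture fillrate formula (texture mapping units times clock) is assumed by analogy with the pixel fillrate footnote, and the 2× FP16 factor is inferred from the table values, not from a stated rule.

```python
def theoretical_rates(stream_processors, tmus, rops, clock_mhz):
    """Theoretical peak rates for one GPU configuration.

    Assumptions (see lead-in): texture fillrate = TMUs x clock,
    FP32 = stream processors x 2 (dual-issue) x 2 (FMA counts as two
    FLOPs) x clock, and FP16 = 2 x FP32.
    """
    # Multiply integers first and divide once to limit rounding error.
    fp32_gflops = stream_processors * 2 * 2 * clock_mhz / 1000
    return {
        "texture_gt_s": tmus * clock_mhz / 1000,
        "pixel_gp_s": rops * clock_mhz / 1000,
        "fp16_gflops": 2 * fp32_gflops,
        "fp32_gflops": fp32_gflops,
    }


# Radeon 890M row from the table: config 1024:64:32:16 at 2,900 MHz.
rates = theoretical_rates(stream_processors=1024, tmus=64, rops=32, clock_mhz=2900)
for key, value in rates.items():
    print(f"{key}: {value:.1f}")
```

For the Radeon 890M row this gives figures matching the table's 185.6 GT/s, 92.8 GP/s, 23,757 GFLOPS (FP16) and 11,878 GFLOPS (FP32) after rounding. Applying the same FP32 rule to 6,144 stream processors at an assumed 2,500 MHz gives 61,440 GFLOPS, in line with the "up to 61 TFLOPS" figure quoted for the largest GCD.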