Please note that this post is tagged as a rumor.

NVIDIA GeForce RTX 4090 with over 100 TFLOPS of power

NVIDIA's next-gen flagship GPU is rumored to deliver over 2.5 times the raw compute power of RTX 3090 Ti.

While rumors about AMD's RDNA3 flagship processor indicate it will offer over 4 times the single-precision compute power of RDNA2, NVIDIA is supposedly preparing a similar upgrade for its upcoming flagship. The AD102 GPU, based on the Ada Lovelace architecture, is expected to deliver over 100 TFLOPS, which is 2.5 times the 40 TFLOPS offered by RTX 3090 Ti and 2.8 times that of RTX 3090. Higher FP32 (single-precision) throughput does not automatically guarantee better gaming performance, though.

To be honest, I don't have much information about AMD. Maybe Lisa and Jensen's competition will give us a 100TFLOPS gaming war in a few months.

— kopite7kimi (@kopite7kimi) April 29, 2022

Both Greymon55 and Kopite7kimi agree that the TFLOPS war will only be a part of the battle for the fastest desktop graphics. There is more at play, such as raytracing acceleration, supported super-resolution tech and other features that may tip the scales in favor of any architecture.

I can only say that the two products have improved a lot compared to their predecessors, but if you want to ask me directly which one is better, I'm sorry I can't answer, because no one knows the specific improvement by percentage.

— Greymon55 (@greymon55) April 30, 2022

To reach 100 TFLOPS, the full AD102 GPU with 18432 CUDA cores would have to be clocked at about 2.7 GHz, but it's almost certain that RTX 4090 will ship with a partially disabled GPU. With fewer active cores, the clock speed would have to be higher to hit the same figure. According to Greymon55, next-gen flagship cards might ship with very similar clock speeds, which in the case of the AMD Navi 31 GPU means 3.0 GHz, and that's assuming the full GPU is used.
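The arithmetic behind these figures can be sketched in a few lines. This is a rough illustration, assuming the usual convention that each FP32 core retires one fused multiply-add (counted as 2 floating-point operations) per clock; the function names are made up for this example, and the 16384-core configuration is a purely hypothetical cut-down part, not a rumored specification.

```python
# Theoretical peak FP32 throughput, assuming 2 FLOPs (one FMA) per core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def fp32_tflops(cores: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS for a given core count and clock."""
    # cores * FLOPs/clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


def required_clock_ghz(target_tflops: float, cores: int) -> float:
    """Clock in GHz needed to reach target_tflops with the given core count."""
    return target_tflops * 1000.0 / (cores * FLOPS_PER_CORE_PER_CLOCK)


if __name__ == "__main__":
    # Full AD102 (rumored): 18432 CUDA cores at about 2.7 GHz.
    print(f"AD102 @ 2.7 GHz: {fp32_tflops(18432, 2.7):.2f} TFLOPS")
    print(f"Clock for 100 TFLOPS, 18432 cores: {required_clock_ghz(100, 18432):.2f} GHz")

    # Hypothetical partially disabled part (16384 cores is an assumption).
    print(f"Clock for 100 TFLOPS, 16384 cores: {required_clock_ghz(100, 16384):.2f} GHz")

    # Navi 31 (rumored): 15360 Stream Processors at about 3.0 GHz.
    print(f"Navi 31 @ 3.0 GHz: {fp32_tflops(15360, 3.0):.2f} TFLOPS")
```

Under these assumptions, a full AD102 at 2.7 GHz lands just under 100 TFLOPS, and any configuration with fewer active cores needs a proportionally higher clock to reach the same number.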

It is true that we already know a lot about next-gen GPUs; some rumors have been around for months. But this does not mean that we know everything yet. The FP32 CUDA/Stream Processor count can still change. What all leakers appear to agree on is that next-gen GPUs will require a lot of power.

Next-gen Flagship GPU Comparison (RUMORED)
| VideoCardz.com | GeForce RTX 3090 Ti | AD102 | Navi 31 |
| Fabrication Node | Samsung 8N | TSMC N5 | TSMC N5/N6 |
| Architecture | NVIDIA Ampere | NVIDIA Ada | AMD RDNA3 |
| GPU Package | Monolithic | Monolithic | Multi-Chip-Module (MCM) |
| Estimated GPU Size | 628 mm² | ~600 mm² | ~800 mm² |
| Graphics Dies | 1 | 1 | 2 GCD + 4 MCD + 1 IOD |
| GPU Mega Clusters | 7 Graphics Processing Clusters (GPC) | 12 Graphics Processing Clusters (GPC) | 2×3 Shader Engines |
| GPU Super Clusters | 42 Texture Processing Clusters (TPC) | 72 Texture Processing Clusters (TPC) | 2×30 RDNA Workgroups (WGP) |
| GPU Clusters | 84 Streaming Multiprocessors (SM) | 144 Streaming Multiprocessors (SM) | 120 Compute Units |
| FP32 Cores | 10752 CUDA cores | 18432 CUDA cores | 15360 Stream Processors |
| GPU Clock | 2.6 GHz | ~2.7 GHz | ~3.0 GHz |
| Memory Type | 24 GB GDDR6X | 24 GB GDDR6X | TBC GB GDDR6 |
| Memory & Bus | 21 Gbps 384-bit | 21 Gbps 384-bit | TBC Gbps 256-bit |
| Cache | 6 MB (L2 Cache) | 96 MB (L2 Cache) | 256 or 512 MB Infinity Cache |
| Power Consumption | 450 W | 600 W | TBC |
| Release Date | Q1 2022 | Q3/Q4 2022 | Q3/Q4 2022 |
| FP32 Performance | 40 TFLOPS | ~100 TFLOPS | ~92 TFLOPS |

Source: @kopite7kimi, @greymon55 via Wccftech


