The highly anticipated NVIDIA RTX 40 series has now officially launched. The first three models are the RTX 4090, RTX 4080 16GB and RTX 4080 12GB. The RTX 40 series is based on the Ada Lovelace GPU architecture, which brings a large performance improvement over the previous generation, particularly in AI neural processing and ray tracing.
The main innovations and improvements of the RTX 40 series are:
1. Streaming Multiprocessor (SM) – Shader performance reaches up to 83 TFLOPS (83 trillion floating-point operations per second), more than twice the throughput of the previous generation.
2. Third-generation RT Cores (ray tracing cores) – Effective ray tracing compute reaches 191 TFLOPS (191 trillion operations per second), 2.8 times that of the previous generation, and ray-triangle intersection performance is doubled.
Two important new hardware units are also added:
The first is the Opacity Micromap engine, which can double ray tracing performance on alpha-tested geometry.
The second is the Micro-Mesh engine, which generates micro-meshes on the fly to add geometric detail, increasing geometric richness without the performance and storage costs of traditional complex geometry processing.
NVIDIA also cited “Cyberpunk 2077” as an example, claiming that the RTX 40 series can perform more than 600 ray tracing calculations per pixel to determine lighting, up to 16 times as many as in the first ray-traced games four years ago.
3. Fourth-generation Tensor Cores – FP8 tensor processing performance is as high as 1.32 PFLOPS (1,320 trillion operations per second), which NVIDIA says is more than 5 times the previous generation thanks to FP8 acceleration.
4. Shader Execution Reordering (SER) support – Reschedules shader workloads on the fly to improve execution efficiency and make better use of GPU resources, bringing up to a 3x performance improvement for ray tracing and up to a 25% improvement in overall game performance.
5. Integrated Optical Flow Accelerator – Delivers a 2x performance improvement. With DLSS 3, it predicts motion in the scene so that the neural network can raise the frame rate while maintaining image quality.
6. Two integrated eighth-generation NVIDIA encoders (NVENC) – Cut export time by up to half and support AV1 video encoding, which is being adopted by OBS, Blackmagic Design DaVinci Resolve, Discord and others.
There are also three updates to the NVIDIA Broadcast SDK: facial expression estimation, eye tracking, and virtual green screen quality improvements.
7. Energy efficiency – Architectural improvements combined with TSMC’s custom 4N (4nm-class) process technology double the energy efficiency.
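To give a rough feel for why the reordering described in item 4 helps, here is a minimal toy model in Python. It is only an illustrative sketch, not NVIDIA's actual mechanism: it assumes threads run in lockstep groups of 32, that a group must run once for every distinct shader its threads need, and that reordering simply groups identical shaders together. The names `shader_passes` and `reorder_by_shader`, and the choice of 8 shaders, are hypothetical.

```python
import random

WARP_SIZE = 32  # threads assumed to execute in lockstep (illustrative)


def shader_passes(shader_ids, warp_size=WARP_SIZE):
    """Count serialized passes: each group of threads runs once per
    distinct shader found among its work items."""
    total = 0
    for start in range(0, len(shader_ids), warp_size):
        group = shader_ids[start:start + warp_size]
        total += len(set(group))
    return total


def reorder_by_shader(shader_ids):
    """Toy stand-in for reordering: place work items that need the
    same shader next to each other."""
    return sorted(shader_ids)


# 1024 ray hits, each needing one of 8 hypothetical hit shaders.
rng = random.Random(0)
hits = [rng.randrange(8) for _ in range(1024)]

before = shader_passes(hits)
after = shader_passes(reorder_by_shader(hits))
print(f"passes without reordering: {before}")
print(f"passes with reordering:    {after}")
```

In this toy model the reordered workload needs far fewer passes because almost every group ends up running a single shader. Real hardware involves many other factors, so the ratio printed here should not be read as a performance prediction.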
RTX 4090:
The flagship and the first product of this generation integrates 76 billion transistors and 16384 CUDA cores, and is equipped with 24GB GDDR6X video memory. Its power consumption is 450W, the same as the RTX 3090 Ti, but NVIDIA claims twice the game performance and up to four times the game performance with DLSS 3. It can also deliver frame rates of more than 100FPS in 4K games. It will be available on October 12, priced at US$1,599 (approximately RM7288).
RTX 4080 16GB:
With 9728 CUDA cores and 16GB GDDR6X video memory, its game performance is claimed to be 2 times that of the RTX 3080 Ti and to surpass the RTX 3090 Ti. It will be available in November, priced at US$1,199 (approximately RM5465).
RTX 4080 12GB:
This card, which was later renamed the RTX 4070 Ti, comes with 7680 CUDA cores and 12GB GDDR6X video memory, and its performance is also claimed to surpass the previous generation RTX 3090 Ti. It will be available in November, priced at US$899 (approximately RM4097).
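As a quick sanity check on the ringgit figures quoted above, the short Python sketch below works out the exchange rate implied by each US dollar and RM pair. The pairs are taken directly from this article; the variable names are only illustrative, and the rate is whatever the article's conversions imply rather than an official figure.

```python
# (USD list price, approximate MYR figure) pairs as quoted in this article.
quoted = [(1599, 7288), (1199, 5465), (899, 4097)]

# Exchange rate implied by each pair (MYR per USD).
implied = [myr / usd for usd, myr in quoted]

for (usd, myr), rate in zip(quoted, implied):
    print(f"US${usd} -> RM{myr}: implied rate {rate:.4f}")

# How far apart the implied rates are from one another.
spread = max(implied) - min(implied)
print(f"spread between implied rates: {spread:.5f}")
```

All three pairs imply roughly RM4.56 per US dollar, so the conversions are consistent with one another.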
Top graphics card brands such as ASUS, Colorful, Gainward, Galaxy, GIGABYTE, INNO3D, MSI, Palit, PNY, and ZOTAC will launch RTX 4090 and RTX 4080 series graphics cards, including standard and overclocked versions. NVIDIA will also launch its own Founders Edition (FE) versions of the RTX 4090 and RTX 4080 16GB in limited quantities.