The GM206 Pro GPU
The GM206 GPU used in the Geforce GTX 960 and GTX 950 features all the key architectural innovations first introduced in the Geforce GTX 980. Maxwell GPUs feature a new SM design, the SMM, that has been tailored to improve efficiency.
The Maxwell SMM is partitioned into four distinct 32-CUDA core processing blocks (128 CUDA cores total per SMM), each with its own dedicated resources for scheduling and instruction buffering. By giving each processing block its own dedicated resources for instruction scheduling and dispatch, Nvidia keeps the GPU’s CUDA cores fully utilised more often. This improves workload efficiency and cuts the power that would otherwise go to waste.
The GM206 GPU used in the Geforce GTX 950 ships with 6 SMM units for a total of 768 CUDA cores, with 48 texture units available for texture processing (compared to 1024 CUDA cores in the GTX 960).
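The core counts quoted above follow directly from the SMM layout, and a short Python sketch can multiply them out as a sanity check. The function and constant names are illustrative only. The figure of 8 texture units per SMM is inferred from the 48 units quoted for the 6-SMM GTX 950, and the SMM count of 8 for the GTX 960 is inferred from its 1024 CUDA cores.

```python
# Illustrative arithmetic only; names are hypothetical, not an Nvidia API.
BLOCKS_PER_SMM = 4         # processing blocks per Maxwell SMM
CORES_PER_BLOCK = 32       # CUDA cores per processing block
TEXTURE_UNITS_PER_SMM = 8  # assumption: inferred from 48 units / 6 SMMs


def cuda_cores(smm_count: int) -> int:
    """Total CUDA cores for a GPU with the given number of SMM units."""
    return smm_count * BLOCKS_PER_SMM * CORES_PER_BLOCK


def texture_units(smm_count: int) -> int:
    """Total texture units, assuming 8 per SMM."""
    return smm_count * TEXTURE_UNITS_PER_SMM


if __name__ == "__main__":
    # GTX 950: 6 SMMs; GTX 960: 8 SMMs (inferred from 1024 / 128).
    for name, smms in (("GTX 950", 6), ("GTX 960", 8)):
        print(f"{name}: {cuda_cores(smms)} CUDA cores, "
              f"{texture_units(smms)} texture units")
```

With 6 SMMs the sketch reproduces the 768 CUDA cores and 48 texture units quoted for the GTX 950.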
To improve the efficiency of the GPU’s onboard caches, Nvidia has made several changes to the cache hierarchy in Maxwell. Each of GM206’s SMM units features its own dedicated 96KB of shared memory, while the L1 and texture caching functions are combined into a 24KB pool of memory per pair of processing blocks (48KB per SMM). Previous-generation Kepler GPUs had a smaller 64KB block that was split between shared memory and L1 cache.
GM206 ships with 1MB of L2 cache that’s shared across the GPU. With more built-in cache, fewer requests to graphics DRAM are needed, which improves performance and reduces power consumption. In addition, Nvidia’s third-generation delta colour compression engine offers new modes for colour compression, allowing the GPU to use its available memory bandwidth more effectively. The GM206 uses roughly 25% fewer bytes per frame than prior-generation Kepler GPUs.
Because of these changes, each GM206 CUDA core can deliver roughly 1.4x the performance of a GK106 Kepler CUDA core (GK106 being the direct predecessor of GM206), and GM206 offers 2x the performance per watt.
The memory subsystem of the Geforce GTX 950 and GTX 960 consists of two 64-bit memory controllers (128-bit in total) with 2GB of GDDR5 memory. The GTX 950’s memory runs at a 6.6GHz effective clock, while the GTX 960’s runs at 7GHz. Thanks to the lossless compression technology that reduces memory bandwidth usage, both the GTX 950 and GTX 960 should perform well at 1080p resolution.
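To put these memory figures in perspective, theoretical peak bandwidth can be estimated as bus width times effective data rate, divided by 8 bits per byte. The Python sketch below does this for both cards. The function names are made up for illustration, and the "equivalent" figure rests on a simplifying assumption: that the roughly 25% reduction in bytes per frame translates directly into the extra uncompressed bandwidth that would otherwise be needed.

```python
# Illustrative arithmetic only; function names are hypothetical.

def peak_bandwidth_gb_s(bus_width_bits: int, effective_clock_ghz: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    Each pin transfers one bit per effective clock, so
    bandwidth = bus width (bits) * effective rate (GHz) / 8 bits per byte.
    """
    return bus_width_bits * effective_clock_ghz / 8


def equivalent_bandwidth_gb_s(peak_gb_s: float, bytes_saved_fraction: float) -> float:
    """Uncompressed bandwidth that would move the same frames.

    Assumption: if compression cuts bytes per frame by the given fraction,
    the same traffic would need peak / (1 - fraction) without compression.
    """
    return peak_gb_s / (1.0 - bytes_saved_fraction)


if __name__ == "__main__":
    for name, clock in (("GTX 950", 6.6), ("GTX 960", 7.0)):
        peak = peak_bandwidth_gb_s(128, clock)
        equiv = equivalent_bandwidth_gb_s(peak, 0.25)
        print(f"{name}: {peak:.1f} GB/s peak, ~{equiv:.1f} GB/s equivalent")
```

For the GTX 950 this works out to 105.6 GB/s of peak bandwidth, and for the GTX 960 to 112 GB/s. The equivalent figures are only a rough guide, since real savings depend on how compressible each frame is.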
The GM206 GPU supports all the key innovations first introduced in the Geforce GTX 980, including the DirectX 12 API with Feature Level 12.1. DirectX 12 offers lower-level access to hardware and promises better use of both the CPU and GPU. DirectX 12 features include conservative raster and raster ordered views.
All GM2xx Maxwell GPUs support volume tiled resources. This feature can allow game developers to produce higher-fidelity graphics with less memory. Owners of the GTX 950 should benefit from this technology, since the card implements only a 128-bit memory interface running at a 6.6GHz effective memory clock and is outfitted with 32 ROPs and 2GB of GDDR5 memory.
New features introduced with Maxwell include real-time voxel illumination, MFAA (multi-frame sampled anti-aliasing), Dynamic Super-Resolution (DSR), Turf Effects, VR Direct, PhysX Flex, G-Sync and ShadowPlay.
MFAA, for example, can provide image quality similar to 4xMSAA at a cost that’s closer to 2xMSAA. Nvidia’s G-Sync display technology delivers a smooth and fast gaming experience, free from screen tearing, VSync input lag, and display-induced stuttering (similar to what you can have with FreeSync on AMD graphics cards).
Both Nvidia and AMD are building ecosystems around their GPUs that are meant to provide users with a lot more than just advanced gaming features. Nvidia’s policy is driven by three key principles: offer the most advanced technology to its users, provide the best gaming experiences, and deliver more ways to play than anywhere else.
With Geforce Experience, gamers don’t have to sort through the myriad graphics settings many games ship with. With the click of a button, Geforce Experience automatically picks the right game settings for over 250 top games to deliver the best performance for your particular GPU. We think this is a good feature, especially for owners of the GTX 950 and similar graphics cards, which are not powerful enough to run games at every graphics detail level.
ShadowPlay allows gamers to share their favourite gaming moments with the world. You can record at up to 4K resolution at 60 fps, and even broadcast directly to Twitch. With Nvidia GameStream technology, Geforce GTX gamers can stream games from their PC to their Shield device.
New Geforce Experience features launching with the GTX 950 offer more ways for gamers to play with their friends. GameStream Co-Op allows gamers to stream their games over the internet to a friend and play with them cooperatively. You can invite friends to join the game either by sending them an email invite or by copying and pasting an invite URL into a chat program. Currently you need the Chrome web browser for this to work. It supports DirectX 9 or higher games running in fullscreen exclusive mode. Nvidia recommends at least 7Mbps upstream and downstream for both the host and guest PCs.
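The GameStream Co-Op conditions listed above can be summarised as a simple checklist. The Python sketch below is purely illustrative: it is not part of Geforce Experience or any Nvidia tool, all names in it are hypothetical, and it treats the 7Mbps recommendation as a threshold only for the sake of the example.

```python
from dataclasses import dataclass

# Figures taken from the text; every name here is hypothetical.
RECOMMENDED_MBPS = 7.0  # recommended upstream and downstream rate
MIN_DIRECTX_MAJOR = 9   # DirectX 9 or higher


@dataclass
class Link:
    """Measured connection speeds for one PC, in Mbps."""
    upstream_mbps: float
    downstream_mbps: float

    def meets_recommendation(self) -> bool:
        return (self.upstream_mbps >= RECOMMENDED_MBPS
                and self.downstream_mbps >= RECOMMENDED_MBPS)


def coop_checklist(host: Link, guest: Link,
                   directx_major: int, fullscreen_exclusive: bool) -> dict:
    """Return a pass/fail flag for each condition mentioned in the text."""
    return {
        "host_link": host.meets_recommendation(),
        "guest_link": guest.meets_recommendation(),
        "directx": directx_major >= MIN_DIRECTX_MAJOR,
        "fullscreen_exclusive": fullscreen_exclusive,
    }


if __name__ == "__main__":
    # Example values are made up: the guest's 5Mbps upstream falls short.
    result = coop_checklist(Link(10.0, 50.0), Link(5.0, 20.0), 11, True)
    for item, ok in result.items():
        print(f"{item}: {'ok' if ok else 'check this'}")
```

Returning one flag per condition rather than a single yes or no makes it clear which side of the connection, or which game setting, needs attention.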
To make the recording and broadcasting functions more accessible to Geforce Experience users, the new version of the software integrates a new in-game share overlay menu. With a press of a hotkey (Alt+Z), users can launch the new in-game overlay.