Review: Intel's new generation sees the light
Let's get started and check out what the new CPUs can do.
Gigabyte P67A-UD4 (B2) (provided by Gigabyte)
ASRock 890GX Extreme 3 (provided by ASRock)
MSI P55-GD65 (provided by MSI)
Intel Core i3-2100T (provided by Mindfactory)
Intel Core i5-750, i5-2500K, i7-2600K (provided by Intel)
AMD Phenom II X4 840/X6 1100T (provided by AMD)
Scythe Mugen 2 (provided by Scythe-Europe)
G.Skill Eco 4GB Kit PC3-12800 (provided by G.Skill)
1333MHz CL7-7-7-20 CR1T 1.35V
AMD Radeon HD 6870 1GB (provided by)
Enermax Modu87+ 500W
Mushkin Callisto Deluxe 60GB (provided by Mushkin)
Cooler Master Stacker 831 Lite (provided by Cooler Master)
Windows 7 Ultimate SP1 x64
A good while after Conroe hit the shelves, Intel has struck again with a new architecture. While Lynnfield was a mild update of the previous generation, the new Sandy Bridge is more than just evolution.
The first thing Intel did was to change the Level 0 cache, which now stores µOPs instead of full instructions. CPUs used to execute whole instructions, but Intel changed that a while ago because of the speed advantage of RISC (Reduced Instruction Set Computing). Any complicated instruction is split into more than one µOP, each of which is in effect a RISC instruction. When the CPU fetches a new instruction, the decoder checks whether the cache already holds the decoded form and, if so, avoids decoding the instruction again. With room for about 1.5k µOPs, the cache is big enough to keep pretty much any frequently used instruction decoded, which increases performance. The L1 caches did not change: instructions and data still get 32kB each. The µOP cache is also included in the L1 cache and achieves a hit rate of approximately 80% according to Intel.
The next improvement is the redesigned branch prediction. It is especially improved for long branches, although it has its shortcomings with shorter ones such as loops and else-if constructions. This is quite complicated, so let us just say that throughput has been improved, which readies the CPU for the new AVX instructions. They are an improvement over SSE4 with a width of 256 bits per instruction. Apart from new instructions, the important news is that the destructive two-operand form a:=a+b can now be replaced by a non-destructive three-operand form c:=a+b, which preserves both source operands and will reduce code size significantly.
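To make the two-operand versus three-operand point concrete, here is a minimal sketch using a toy register model. It is not real machine code: the register names, the helper functions and the "one function call equals one instruction" counting are all assumptions made for illustration.

```python
# Toy register model (illustration only): every helper call stands for
# one machine instruction, registers are entries in a dictionary.

def move(regs, dst, src):
    """Copy instruction: dst := src."""
    regs[dst] = regs[src]

def add_two_operand(regs, dst, src):
    """Destructive form: dst := dst + src, the old value of dst is lost."""
    regs[dst] = regs[dst] + regs[src]

def add_three_operand(regs, dst, src1, src2):
    """Non-destructive form: dst := src1 + src2, both sources survive."""
    regs[dst] = regs[src1] + regs[src2]

def sum_two_operand_style(regs):
    """Compute c := a + b while keeping a and b; needs an extra copy."""
    move(regs, "c", "a")
    add_two_operand(regs, "c", "b")
    return 2  # instructions issued

def sum_three_operand_style(regs):
    """Compute c := a + b while keeping a and b; a single instruction."""
    add_three_operand(regs, "c", "a", "b")
    return 1  # instructions issued

regs_old = {"a": 3, "b": 4, "c": 0}
regs_new = {"a": 3, "b": 4, "c": 0}
count_old = sum_two_operand_style(regs_old)
count_new = sum_three_operand_style(regs_new)
print("two-operand style:  ", count_old, "instructions", regs_old)
print("three-operand style:", count_new, "instructions", regs_new)
```

Both variants end with the same register contents, but the two-operand style needs an additional copy whenever the original value of a is still required afterwards.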
Also notable is that Turbo has been improved. The new CPU always clocks 100MHz higher than specified, so an i5-2500 will always clock at 3.4GHz because the thermal budget is never fully used at the specified clock. In single-threaded applications the CPU increases the clock by 400MHz, with two cores under load by 300MHz and with three cores under load by 200MHz.
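The Turbo steps described above can be written down as a small lookup. This is only a sketch of the numbers in this review: the function name is made up, and we assume the bonuses are counted from the specified base clock (3.3GHz for the i5-2500). Other models may use different steps.

```python
# Turbo bonus per number of loaded cores, as described in the text.
# Four loaded cores get the "always 100MHz higher" step.
TURBO_BONUS_MHZ = {1: 400, 2: 300, 3: 200, 4: 100}

def turbo_clock_mhz(base_mhz, active_cores):
    """Return the expected clock in MHz for a quad-core with Turbo.

    Assumes the bonus is added to the specified base clock and that the
    CPU stays inside its thermal budget.
    """
    if active_cores not in TURBO_BONUS_MHZ:
        raise ValueError("active_cores must be between 1 and 4")
    return base_mhz + TURBO_BONUS_MHZ[active_cores]

I5_2500_BASE_MHZ = 3300  # specified clock of the i5-2500

for cores in range(1, 5):
    print(cores, "core(s) under load:", turbo_clock_mhz(I5_2500_BASE_MHZ, cores), "MHz")
```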
With the higher throughput of the CPU, Intel also needed to improve the memory controller. Lynnfield could handle two memory requests per cycle, which resulted in a 16-byte load and a 16-byte store per cycle. The new memory controller can now handle two read requests and one write request per clock. Faster memory does not improve performance the way it did in previous generations, which makes faster modules nearly useless. The standard support has been increased to 1333MHz; anything beyond that is officially not supported, except for some notebook CPUs.
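As a rough back-of-the-envelope check of the per-cycle figures, the following sketch compares both designs. The 16 bytes per request for Lynnfield come from the text; that each Sandy Bridge request is also 16 bytes wide is our assumption, and the function name is invented for this example.

```python
BYTES_PER_REQUEST = 16  # from the text for Lynnfield; assumed for Sandy Bridge

def bytes_per_cycle(loads, stores, width=BYTES_PER_REQUEST):
    """Peak bytes moved per clock for a given number of loads and stores."""
    return (loads + stores) * width

lynnfield = bytes_per_cycle(loads=1, stores=1)
sandy_bridge = bytes_per_cycle(loads=2, stores=1)

print("Lynnfield:   ", lynnfield, "bytes per cycle")
print("Sandy Bridge:", sandy_bridge, "bytes per cycle")
print("Ratio:", sandy_bridge / lynnfield)
```

These are peak numbers only; real applications will rarely sustain two loads and one store in every cycle.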
Besides that, Intel improved many things inside the CPU, improved power management and also accelerated the GPU architecture. The GPU is now twice as fast as the previous generation and, in the case of the HD3000, it has twice as many execution units, which should give you four times the performance of its predecessor. It's still too slow for any serious gaming but should be fine for HD playback and everyday work.
However, the new architecture has two downsides. First, Intel changed the socket, forcing every customer to buy a new motherboard. Second, overclocking is very restricted with any normal CPU. CPUs without Turbo can't be overclocked at all. The clock generator is now within the CPU and the base clock can only be raised to about 106MHz, which also increases the clocks for PCIe and PCI. If you have a P67 or Z68 chipset, Turbo can overclock by another 400MHz as long as your CPU stays inside the TDP budget. H65/Q6x/B65 don't allow for any overclocking besides the GPU. Of course the chipset Intel sells to you is always the same chip, just with some fuses blown or not.
For overclockers Intel offers the "K" CPUs, which are slightly pricier than their non-K counterparts. Of course there is a downside to that, because Intel refuses to provide its customers with VT-d on these models, which is important if you want faster virtual machines. You do get the fastest GPU core, but it's quite useless for gamers.
The naming is still a mess with meaningless numbers and without structure. The i5-2390T is especially confusing because it isn't a quad-core CPU, as you would expect, but just a dual-core with Hyperthreading enabled. They should have named it i3-2390T, but luckily it's not in stores yet.
For our tests we have three CPUs. The i3-2100T is a dual-core CPU clocked at 2.5GHz with Hyperthreading but no Turbo, 3MB L3 cache and only 35W TDP, but with AES support fused off. The second one is the i5-2500K, a quad-core clocked at 3.3GHz with 6MB L3 cache, an unlocked multiplier, no Hyperthreading and a TDP of 95W. The last one is the i7-2600K, which is basically the same as the i5-2500K but clocked at 3.4GHz, with Hyperthreading enabled and 8MB L3 cache.
The new 60 Series chipset is just an upgrade to the existing 50 Series. The only notable point is the dual 6Gbps SATA controller while the rest stayed the same.
The Z68 is the high-end chipset that supports graphics and CPU overclocking as well as 2x PCIe 2.0 x8 support. The P67 is the same without graphics, while the H67 can just overclock the GPU but is limited to 1x PCIe 2.0 x16.
Because Intel wants to get rid of the PCI bus, they have blown the fuse for that part on all higher-end SKUs, leaving just the Q67 and B65 with native PCI support. If you own professional video or audio cards you may run into trouble due to the increased lag of the additional PCIe-to-PCI bridge chips, which even Intel mounts on its own boards. It's more than silly to force vendors to use a separate PCIe-to-PCI bridge because this increases costs for the end customer. The H61 and B65 can't overclock and also lose RAID support, and while the B65 supports at least one 6Gb SATA connection, the H61 has just four 3Gb connectors and even loses dual-channel memory support. We think that cutting features is easy to overdo, but the company that makes 24 billion in turnover and 6 billion in profit does tend to screw its customers on a constant basis.
As already mentioned, if your CPU has no Turbo you are out of luck. We managed a mere 150MHz increase with our 2100T using the BCLK. Going over 106MHz is nearly impossible, but maybe your board will manage. Because we thought it would be fun, we reduced the VCore to 0.9V.
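The arithmetic behind BCLK overclocking is simple: the core clock is the multiplier times the base clock. A minimal sketch follows; the multiplier of 25 is our assumption, derived from the 2100T's 2.5GHz at the stock 100MHz base clock, and the function name is made up.

```python
STOCK_BCLK_MHZ = 100.0

def core_clock_mhz(multiplier, bclk_mhz=STOCK_BCLK_MHZ):
    """Core clock in MHz for a fixed multiplier and a given base clock."""
    return multiplier * bclk_mhz

I3_2100T_MULTIPLIER = 25  # assumption: 2.5GHz / 100MHz

stock = core_clock_mhz(I3_2100T_MULTIPLIER)
overclocked = core_clock_mhz(I3_2100T_MULTIPLIER, bclk_mhz=106.0)

print("Stock:      ", stock, "MHz")
print("BCLK 106MHz:", overclocked, "MHz")
print("Gain:       ", overclocked - stock, "MHz")
```

With a locked multiplier, the roughly 6% headroom of the base clock is all you get, which matches the mere 150MHz we managed.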
If you have one of the K CPUs you can set the multiplier manually. Depending on the clock, you might either leave Turbo as it is or disable it altogether. The higher the clock, the less useful Turbo is. You can leave all energy savings enabled up to 3.9/4GHz and you will need no VCore increase or just a slight one. We set our CPU to a +0.05V offset while the default VCore is about 1.25V. At 4.5GHz you need to disable the C6 power state, and depending on the board, it may shut down all power savings. Of course, without power savings the idle power consumption will increase significantly. As you can see, at 4.5GHz we are bordering on 75°C, which may cause the thermal protection to kick in.
We could push it to 4GHz without even touching the VCore setting, which the board sets at about 1.225V. Prime95 was stable at 4.5GHz, but the thermal protection kicked in sometimes. At 4.8GHz it was not possible to run Prime95 because our CPU cooler could not keep up with the heat: with all cores under load, the thermal protection kicked in regularly. We will see if this is valid for all boards or just a problem with the Gigabyte board. Next week we expect a new CPU cooler, so we will keep you posted. Generally, always keep the CPU core temperatures under 75°C to prevent the CPU from downclocking due to heat.
It's quite useless to reduce the VCore with these CPUs. Even a quite massive drop of 0.1V decreases power by only about 8W. So the CPU apparently compensates for the lower VCore with increased current.
All benchmarks are 64-bit applications. We deviated from our usual benchmarks by using more CPU-intensive applications, so we hope WinRAR and TrueCrypt will give you a better idea. All benches with AVX optimizations show a rather dramatic improvement in performance, especially compared to the predecessor or AMD. TrueCrypt also supports AES-NI, but Intel has fused off these extensions on the 2100T, which hurts the CPU dramatically, but at least you get the new AVX instruction set.
As you can see, AVX support does show.
We stayed with the FarCry2 benchmark, but now with DX10. As you can see, even at 1920x1080 and ultra-high settings the GPU reaches its limit, and there is almost no difference between an i7-2600K at stock speed and at 4.8GHz. Still, even a 2100T offers very playable framerates, but the same is true of the L3-cacheless quad-core AMD Phenom X4 840.
As usual, Hyperthreading hurts performance in the first pass on a CPU with four cores and HT enabled. In the second pass HT does help, and HT enabled and disabled end up with nearly the same results. If you have a batch to encode, we would recommend disabling HT for all first passes and, after they finish, enabling HT and running all second passes. In our slow test we use a GPU filter, which runs completely on the graphics card and loads our 6870 to about 50%. A faster card may gain you some fps, but the 6870 was sufficient. We stayed with our Babylon 5 DVD episode and list the settings we used below. For these benches we used build 1913 of the x264 64-bit edition but increased the bitrate to 1220kbps. Of course, encoding 720p or 1080p material will slow down the encoding considerably: 720p will run at about 45% and 1080p at about 16.6% of the DVD encoding speed. We ran this bench three times and used the fastest result.
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\DGDecode.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\LeakKernelDeint.dll")
x264_x64_1913 --pass 1 --bitrate 1220 --stats "D:\work\b5_x64_fast.stats" --quiet --profile high --preset fast --tune film --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output NUL "D:\work\B5_x64_4_fast.avs"
x264_x64_1913 --pass 2 --bitrate 1220 --stats "D:\work\b5_x64_fast.stats" --quiet --profile high --preset fast --tune film --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output "D:\work\B5-x264-fast.mp4" "D:\work\B5_x64_4_fast.avs"
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\DGDecode.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\Undot.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\FFT3DGPU.dll")
LoadCPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\Yadif.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\TIVTC.dll")
x264_x64_1913 --pass 1 --bitrate 1220 --stats "D:\work\B5-AVC-slow.stats" --quiet --profile high --preset fast --tune film --rc-lookahead 64 --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output NUL "D:\work\B5_x64_4_slow.avs"
x264_x64_1913 --pass 2 --bitrate 1220 --stats "D:\work\B5-AVC-slow.stats" --quiet --profile high --preset fast --tune film --rc-lookahead 64 --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output "D:\work\B5-x264-slow.mp4" "D:\work\B5_x64_4_slow.avs"
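If you want to translate our DVD numbers to HD material, the rough slowdown factors mentioned above (about 45% for 720p, about 16.6% for 1080p) can be applied like this. The function names and the example frame rate of 100 fps are hypothetical and not taken from our results; only the factors and the frame count of 62932 come from this review.

```python
# Rough speed factors relative to DVD encoding, as quoted in the text.
SPEED_FACTOR = {"dvd": 1.0, "720p": 0.45, "1080p": 0.166}

def estimated_fps(dvd_fps, target):
    """Scale a measured DVD encoding speed to another resolution."""
    return dvd_fps * SPEED_FACTOR[target]

def estimated_minutes(frames, fps):
    """Time in minutes needed to encode a number of frames at a given speed."""
    return frames / fps / 60

EXAMPLE_DVD_FPS = 100.0  # hypothetical value, plug in a measured result
EPISODE_FRAMES = 62932   # frame count used in the x264 command lines

for target in ("dvd", "720p", "1080p"):
    fps = estimated_fps(EXAMPLE_DVD_FPS, target)
    minutes = estimated_minutes(EPISODE_FRAMES, fps)
    print(f"{target}: about {fps:.1f} fps, about {minutes:.1f} minutes per pass")
```

Treat the output as a ballpark figure only, since filters and presets shift the balance considerably.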
Even with AVX extensions the i3-2100T can't keep up with the L3-cacheless Phenom II X4, which is surprising. With four real cores Intel is way ahead, but with AMD optimizations the Phenoms aren't that bad either.
TrueCrypt is a very CPU-intensive program. It encrypts parts of your HDD and can slow down your HDD performance considerably. With the i5 and higher you have the AES-NI extensions, which enable the program to use special instructions for encrypting and decrypting. We used AES-Twofish, which needs both AES and raw CPU power. The i3 series does not offer AES-NI, and it shows. Even a Phenom II 840 without L3 cache and without AES-NI is nearly twice as fast.
WinRAR is one of the most popular compression programs, and since version 3.90 it supports multi-threading and 64-bit. The results seem to indicate it's optimized for Intel. We used various files, mostly our benches, which consist of 6849 files and 408 directories weighing in at 724MB.
We also stayed with LAME, but this time we compared single threaded vs. dual threaded. Unfortunately, there is no encoder which can really do multi-threading encoding. You may encode two, four, six or even eight files in a batch, so one file per core, but it's not the same compared to real multi-threading.
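The "one file per core" approach can be sketched as follows. This is only an illustration, not the script we used for the benchmark: the function names are invented, and it assumes a `lame` binary in the PATH that is called with its default settings as `lame input.wav output.mp3`.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def lame_command(wav_path):
    """Build the command line for one file: lame input.wav output.mp3."""
    mp3_path = os.path.splitext(wav_path)[0] + ".mp3"
    return ["lame", wav_path, mp3_path]

def encode_batch(wav_paths, workers=None, runner=subprocess.run):
    """Encode several files at once, by default one encoder per logical core.

    Each worker thread only waits for its own single-threaded encoder
    process, so the real work is spread across the cores by the OS.
    The runner parameter exists so the function can be tried without LAME.
    """
    workers = workers or os.cpu_count() or 1
    commands = [lame_command(path) for path in wav_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(runner, commands))
    return commands

if __name__ == "__main__":
    # Usage: python encode_batch.py track01.wav track02.wav ...
    encode_batch(sys.argv[1:])
```

This keeps all cores busy with a batch, but a single file still encodes no faster than on one core.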
The new CPUs do not really use less power than their predecessors but about the same, while delivering much more performance. This shows especially in CPU-intensive scenarios. As always, the TDP is not reached even under full load. At 4GHz with no changes to the VCore, the i7-2600K is the most efficient.
The better efficiency shows with CineBench:
We have calculated all benches relative to the i5-2500(K). Values above 100% are better than the i5-2500K at stock clocks and with HT enabled, while lower values of course mean less performance.
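For reference, a relative rating of this kind can be computed as in the sketch below. The function name is made up and the example numbers are placeholders, not our results. Inverting the ratio for benchmarks where a lower value (such as a run time) is better is our assumption about a sensible way to do it.

```python
def relative_percent(score, reference_score, higher_is_better=True):
    """Express a result as a percentage of the reference CPU's result.

    For scores (fps, points) a higher value is better. For run times a
    lower value is better, so the ratio is inverted to keep
    "above 100%" meaning "faster than the reference".
    """
    if higher_is_better:
        return score / reference_score * 100
    return reference_score / score * 100

# Placeholder numbers for illustration only.
print(relative_percent(150, 100))                          # a score, e.g. fps
print(relative_percent(50, 100, higher_is_better=False))   # a run time in seconds
```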
To make the comparison fair, we have chosen more expensive motherboards. All boards cost between €99 and €107, but of course, especially with AMD, you can buy a €50 board, which will improve AMD's standing. The H61 chipset does offer lower costs, but performance will also suffer with only single-channel memory. We did not worry about the CPU's graphics performance at this time, since we will test with H67 later.
As was expected, the new generation of CPUs blows away its predecessors and AMD as well. However, AMD keeps its sockets for years, whereas Intel changes them all the time, which is more than annoying. There is no real reason to do so, besides getting the customers' money, and with a profit of 33%, saying that Intel is doing great is an understatement.
The i3 is a completely different die compared to the quad-cores. With only two cores and Hyperthreading, this CPU generation loses AES-NI, TXT and VT-d. Especially for business customers these features would be helpful to build cheaper desktops. For casual gamers, even an i3 is enough for anything you can think of. The 2100T with a TDP of 35W is also quite nice for HTPCs, but note that AMD's E350 can do this job very well too. Both the i3-2100 and the i3-2100T are available for less than €100, which is not cheap but affordable. If you can live with a CPU with 65W TDP you get the i3-2100, which is clocked at 3.1GHz and costs about €94,-.
The i5-2000 CPUs are the successors of the i5-700 series, which was the most popular CPU. Prices start at about €145,-, which is about 50% more than an i3, but you get four real cores. You should be aware of Intel's confusing naming scheme by now, so remember that the i5-2390T is actually an i3 with only two cores. The i5-2000 series can take anything you throw at it and is suitable for anything a normal customer might wish for. As long as AMD's Bulldozer is not out, that's the CPU to go with. The only downside is that you need a new board, which renders your previous investment nearly null and void, but at least your PSU and memory kit can be recycled. Our freely overclockable i5-2500K is a bit more expensive, but at about €166,- it's not as expensive as previous K processors.
As usual, Intel also offers its quad-cores with Hyperthreading. With four real cores, most applications will not benefit from it, and some, especially games, may even slow down. We only noticed it during Win7 startup, which was slower with the i7-2600K than with the i5-2500K. Prices start at about €225, and with HT as the main addition this is really expensive, because enabling it doesn't cost Intel a dime. You can see that in the Xeon E3-1200 series prices. The E3-1220 clocks at 3.1GHz and costs about €165, but the E3-1230, which clocks 100MHz faster and does support HT, costs only €190. So, for €25 more you get HT, which is reasonable pricing. So if you need all the bells and whistles and don't care about overclocking, the Xeon E3-1230 is a much better choice. The i7-2600K now costs about €241,-.
While we are not happy with the new socket, the performance speaks for itself. So, if you're not keen on waiting for AMD's Bulldozer cores, the i5-2500K is the CPU of our choice.