Monday, 09 May 2011 11:02

Intel Sandy Bridge put to the test - Benchmarks

Written by Eliot Kucharik
i3i5i7_small recommended08_75

Review:
Intel's new generation sees the light
Firstly, we must apologize for taking so long with the Sandy Bridge review. Changing all the benches to Windows 7 SP1, testing with various benches and determining which to use took some time. If you have any suggestions to improve it, please let us know.

Let's get started and check out what the new CPUs can do.


i7_2600K_front_normal


Testbed:
Motherboard:
Gigabyte P67A-UD4 (B2) (provided by Gigabyte)
Intel P67
ASRock 890GX Extreme 3 (provided by ASRock)
AMD 890GX/SB850
MSI P55-GD65 (provided by MSI)
Intel P55

CPU:
Intel Core i3-2100T (provided by Mindfactory)
Intel Core i5-750, i5-2500K, i7-2600K (provided by Intel)
AMD Phenom II X4 840/X6 1100T (provided by AMD)

CPU-Cooler:
Scythe Mugen 2 (provided by Scythe-Europe)

Memory:
G.Skill Eco 4GB Kit PC3-12800 (provided by G.Skill)
1333MHz CL7-7-7-20 CR1T 1.35V

Graphics Card:
AMD Radeon HD 6870 1GB (provided by Mindfactory)

Power supply:
Enermax Modu87+ 500W

Hard disk:
Mushkin Callisto Deluxe 60GB (provided by Mushkin)

Case:
Cooler Master Stacker 831 Lite (provided by Cooler Master)

OS:
Windows 7 Ultimate SP1 x64

 


 

Overview:

Some time after Conroe hit the shelves, Intel has struck again with a new architecture. While Lynnfield was a mild update of the previous generation, the new Sandy Bridge is more than just an evolution.

The first thing Intel did was to change the Level 0 cache, which now stores µOPs instead of full instructions. While it used to be common to execute whole instructions, Intel changed that a while ago due to the speed advantage of RISC (Reduced Instruction Set Computing). Any complicated instruction is split into more than one µOP, each of which is in fact a RISC instruction. When the CPU fetches a new instruction, the decoder checks whether the cache already holds this instruction and avoids decoding it again. The cache size of about 1.5k µOPs is big enough to contain pretty much any instruction, which will increase performance. The L1 caches did not change; instructions and data get 32kB each. The µOP cache is also included in the L1 cache, which gives a hit rate of approximately 80% according to Intel.

The next improvement is the redesigned branch prediction. It is especially improved with long branches, although it has its shortcomings with shorter ones such as loops and else-if constructions. This is quite complicated, so let us just say that the throughput has been improved, which prepares the CPU for the new AVX instructions. They are an improvement over SSE4, with a width of 256 bits per instruction. Apart from the new instructions, the important news is that the destructive two-operand form a:=a+b can now be replaced by a non-destructive three-operand form c:=a+b, which preserves both source operands and will reduce code size significantly.

Turbo has also been improved. The new CPUs always clock 100MHz higher than specified, so an i5-2500 will always run at 3.4GHz, because the thermal budget is never used up at the specified clock. In single-threaded applications the CPU increases the clock by 400MHz, with two cores under load by 300MHz and with three cores under load by 200MHz.
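The turbo steps for the i5-2500 can be written down as a small lookup. This is only a sketch of the arithmetic in the paragraph above; it assumes a specified clock of 3.3GHz and that every bonus is counted from that specified clock, and the names are made up for illustration.

```python
# Clock bins for the i5-2500 as described in the text above.
# Assumption: every bonus is counted from the specified 3.3GHz clock.
SPECIFIED_MHZ = 3300
TOTAL_CORES = 4
ALL_CORE_BONUS_MHZ = 100                     # the "always 100MHz higher" case
TURBO_BONUS_MHZ = {1: 400, 2: 300, 3: 200}   # bonus by number of loaded cores


def turbo_clock_mhz(active_cores: int) -> int:
    """Return the expected clock in MHz for a given number of loaded cores."""
    if not 1 <= active_cores <= TOTAL_CORES:
        raise ValueError("active_cores must be between 1 and 4")
    bonus = TURBO_BONUS_MHZ.get(active_cores, ALL_CORE_BONUS_MHZ)
    return SPECIFIED_MHZ + bonus


for cores in range(1, TOTAL_CORES + 1):
    print(f"{cores} core(s) under load: {turbo_clock_mhz(cores)} MHz")
```

With these assumptions a single loaded core lands at 3.7GHz and all four cores at 3.4GHz.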

With the higher throughput of the CPU, Intel also needed to improve the memory controller. Lynnfield could handle two memory requests per cycle, which resulted in 16 bytes load and 16 bytes store per cycle. The new memory controller can now handle two read requests and one write request per clock. Faster memory will not improve your performance as in previous generations, making faster modules nearly useless. The standard support has been increased to 1333MHz; anything beyond that is officially not supported, except for some notebook CPUs.

Besides that, Intel improved many things inside the CPU, improved power management and also accelerated the GPU architecture. The GPU is now twice as fast as the previous generation, and in the case of the HD3000 it has twice as many execution units, which should give you four times the performance of its predecessor. It's still too slow for any serious gaming but should be fine for HD playback and everyday work.

However, the new architecture has two downsides. First, Intel changed the socket, forcing every customer to buy a new motherboard. Second, overclocking is very restricted with any normal CPU. CPUs without Turbo can't be overclocked at all. The clock generator is now within the CPU and can only be overclocked to about 106MHz, which also increases the clocks for PCIe and PCI. If you have a P67 or Z68 chipset, turbo can overclock by another 400MHz as long as your CPU stays inside the TDP budget. H65/Q6x/B65 don't allow for any overclocking besides the GPU. Of course the chipset Intel sells to you is always the same chip, just with some fuses blown or not.

For overclockers Intel offers the "K" CPUs which are slightly pricier than the non-K counterparts. Of course there is a downside to that, because Intel refuses to provide their customers with VT-d, which is important if you want to have faster virtual machines. You do get the fastest GPU core but it's quite useless for gamers.

The naming is still a mess, with meaningless numbers and no structure. The i5-2390T is especially confusing because it isn't a quad-core CPU, as you would expect, but just a dual-core with Hyperthreading enabled. They should have named it i3-2390T, but luckily it's not in stores yet.

For our tests we have three CPUs. The first is the i3-2100T, a dual-core CPU clocked at 2.5GHz with Hyperthreading but no Turbo, 3MB L3 cache and only 35W TDP, but with AES support fused off. The second is the i5-2500K, a quad-core clocked at 3.3GHz with 6MB L3 cache, an unlocked multiplier, no Hyperthreading and a TDP of 95W. The last one is the i7-2600K, which is basically the same as the i5-2500K but clocked at 3.4GHz, with Hyperthreading enabled and 8MB L3 cache.


Chipset:

The new 60 Series chipset is just an upgrade to the existing 50 Series. The only notable point is the dual 6Gbps SATA controller while the rest stayed the same.

The Z68 is the high-end chipset that supports graphics and CPU overclocking as well as 2x PCIe 2.0 x8 support. The P67 is the same without graphics, while the H67 can just overclock the GPU but is limited to 1x PCIe 2.0 x16.

Because Intel wants to get rid of the PCI bus, they have blown the fuse for that part on all higher-end SKUs, leaving just the Q67 and B65 with native PCI support. If you own professional video or audio cards you may run into trouble due to the increased lag of the additional PCIe-PCI bridge chips, which even Intel mounts on its boards. It's more than silly to force vendors to use a separate PCIe-to-PCI bridge, because this increases costs for the end customer. The H61 and B65 can't overclock and also lose RAID support, and while the B65 supports at least one 6Gbps SATA connection, the H61 has just four 3Gbps connectors and even loses dual-channel memory support. We think that cutting features is easy to overdo, but the company that makes 24 billion in turnover and 6 billion in profit does tend to screw its customers on a constant basis.

 


 

Overclocking:

i3-2100T:
As already mentioned, if your CPU has no Turbo you are out of luck. We managed a mere 150MHz increase with our 2100T using the BCLK. Going over 106MHz is nearly impossible, but maybe your board will manage. Because we thought it would be fun, we reduced the VCore to 0.9V.
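The 150MHz figure follows directly from the BCLK limit. The sketch below assumes a nominal BCLK of 100MHz and a fixed multiplier of 25 for the i3-2100T (derived from its 2.5GHz clock, not stated in the review); the function names are only illustrative.

```python
NOMINAL_BCLK_MHZ = 100.0
I3_2100T_MULTIPLIER = 25  # assumption: 2.5GHz divided by the 100MHz base clock


def core_clock_mhz(multiplier: int, bclk_mhz: float = NOMINAL_BCLK_MHZ) -> float:
    """Core clock is simply the multiplier times the base clock."""
    return multiplier * bclk_mhz


def bclk_gain_mhz(multiplier: int, bclk_mhz: float) -> float:
    """Clock gained over stock when only the BCLK is raised."""
    return core_clock_mhz(multiplier, bclk_mhz) - core_clock_mhz(multiplier)


print(core_clock_mhz(I3_2100T_MULTIPLIER, 106.0))  # clock at a 106MHz BCLK
print(bclk_gain_mhz(I3_2100T_MULTIPLIER, 106.0))   # gain over the stock clock
```

Since the multiplier is locked, the roughly 6% BCLK headroom is all a CPU without Turbo can gain.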

i3-2100T_oc_uv_cpuz

 

i5-2500K:
If you have one of the K CPUs you can set the multiplier manually. Depending on the clock, you might either leave Turbo as it is or disable it altogether. The higher the clock, the less useful Turbo becomes. You can leave all energy savings enabled up to 3.9GHz/4GHz and you will need no VCore increase, or just a slight one. We set our CPU to a +0.05V VCore offset, while the default VCore is about 1.25V. At 4.5GHz you need to disable the C6 power state, but depending on the board, it may shut down all power savings on its own. Of course, without power savings the idle power consumption will increase significantly. As you can see, at 4.5GHz we are bordering on 75°C, which may cause the thermal protection to kick in.

i5-2500k_cpuz

 

i5-2500k_oc_3.9G_prime95_normal

 

i5-2500k_oc_4.5G_prime95_normal

 

i7-2600K:
We could push it to 4GHz without even touching the VCore setting, which the board sets at about 1.225V. Prime95 was stable at 4.5GHz, but the thermal protection kicked in sometimes. At 4.8GHz it was not possible to run Prime95 and our CPU cooler could not keep up with the heat. Namely, with all the cores under load, the heat protection kicked in regularly. We will see if this is valid for all boards or just a problem with the Gigabyte board. Next week we expect a new CPU cooler, so we will keep you posted. Generally, always keep the CPU core temperatures under 75°C to prevent the CPU from downclocking due to heat.

i5-2600k_cpuz

 

i7-2600K_4.0G_prime95_normal

 

i7-2600K_4.5G_prime95_normal

 

i5-2600k_4.8G_cpuz

 

 


 

Undervoltage:

It's quite useless to reduce the VCore with these CPUs. Even a quite massive drop of 0.1V decreases power by just about 8W. So the CPU appears to compensate for the lower VCore with increased current.

i5-2500k_uv_1.05V_prime95_normal

 

i7-2600K_uv_1.07V_prime95_normal

 

 


 

Benchmarks:


All benchmarks are 64-bit applications. We deviated from our usual benchmarks by using more CPU-intensive applications, so we hope WinRAR and TrueCrypt will give you a better idea. All benches with AVX optimizations show a rather dramatic performance improvement, especially compared to the predecessor or AMD. TrueCrypt also supports AES-NI, but Intel has fused off these extensions on the 2100T, which hurts the CPU dramatically, but at least you get the new AVX instruction set.

CineBench R11.5:
As you can see, AVX support does show.

SandyBridge_CB

SandyBridge_CBOGL

 

FarCry2:
We stayed with the FarCry2 benchmark, but now with DX10. As you can see, even at 1920x1080 and ultra-high settings the GPU reaches its limit, and there is almost no difference between an i7-2600K at stock speeds and at 4.8GHz. Still, even a 2100T offers very playable framerates, but the same is true of the L3-cacheless quad-core AMD Phenom II X4 840.

SandyBridge_FarCry2

 

x264:
As usual, Hyperthreading hurts first-pass performance on a CPU with four cores and HT enabled. In the second pass HT does help, so HT enabled and HT disabled reach nearly the same results overall. If you have a batch to encode, we recommend disabling HT for the first pass and, after finishing, enabling HT and doing all second passes. In our slow test we use a GPU filter, which runs completely inside the graphics card and loads our 6870 to about 50%. A faster card may gain you some fps, but the 6870 was sufficient. We stayed with our Babylon 5 DVD episode and list the settings we used below. For these benches, we used build 1913 of the x264 64-bit edition but increased the bitrate to 1220kbps. Of course encoding 720p or 1080p material will slow down the encoding considerably - 720p will run at about 45% of the DVD encoding speed, while 1080p would end up at about 16.6%. We ran this bench three times and used the fastest result.
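For a rough feel of where such slowdown figures come from, one can compare pixel counts per frame. The sketch below assumes that encoding speed scales inversely with the number of pixels, and it assumes DVD frame sizes of 720x576 (PAL) and 720x480 (NTSC); neither assumption is stated in the review, and real encodes will deviate.

```python
# Frame sizes in pixels (width, height). The DVD sizes are assumptions,
# since the review does not say which DVD standard the source uses.
FRAME_SIZES = {
    "DVD PAL": (720, 576),
    "DVD NTSC": (720, 480),
    "720p": (1280, 720),
    "1080p": (1920, 1080),
}


def pixels(name: str) -> int:
    """Pixels per frame for a named format."""
    width, height = FRAME_SIZES[name]
    return width * height


def relative_speed(source: str, target: str) -> float:
    """Estimated speed of encoding `target` relative to `source`,
    assuming speed is inversely proportional to pixels per frame."""
    return pixels(source) / pixels(target)


for dvd in ("DVD PAL", "DVD NTSC"):
    for hd in ("720p", "1080p"):
        print(f"{hd} vs {dvd}: {relative_speed(dvd, hd):.1%}")
```

Under this simple model, 720p comes out at 45% of a PAL DVD and 1080p at about 16.7% of an NTSC DVD, so the quoted figures are at least in the range a pixel-count estimate suggests.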

fast.avs:
# PLUGINS
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\DGDecode.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\LeakKernelDeint.dll")
# SOURCE
MPEG2Source("D:\work\b5_64bit.d2v")
# FILTERS
LeakKernelDeint(order=1)

fast encode:
x264_x64_1913 --pass 1 --bitrate 1220 --stats "D:\work\b5_x64_fast.stats" --quiet --profile high --preset fast --tune film --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output NUL "D:\work\B5_x64_4_fast.avs"
x264_x64_1913 --pass 2 --bitrate 1220 --stats "D:\work\b5_x64_fast.stats" --quiet --profile high --preset fast --tune film --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output "D:\work\B5-x264-fast.mp4" "D:\work\B5_x64_4_fast.avs"

slow.avs:
# PLUGINS
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\DGDecode.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\Undot.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\FFT3DGPU.dll")
LoadCPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\Yadif.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins64\TIVTC.dll")
# SOURCE
MPEG2Source("D:\work\b5_64bit.d2v")
# FILTERS
tfm(d2v="D:\work\b5_64bit.d2v")
MT("Yadif(mode=0,order=1)",4,4)
MT("Undot()",4,4)
FFT3DGPU(bt=4,sigma=2,precision=2,mode=2,bw=48,bh=48)

slow encode:
x264_x64_1913 --pass 1 --bitrate 1220 --stats "D:\work\B5-AVC-slow.stats" --quiet --profile high --preset fast --tune film --rc-lookahead 64 --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output NUL "D:\work\B5_x64_4_slow.avs"
x264_x64_1913 --pass 2 --bitrate 1220 --stats "D:\work\B5-AVC-slow.stats" --quiet --profile high --preset fast --tune film --rc-lookahead 64 --bframes 16 --b-pyramid strict --direct auto --sar 16:11 --threads 4 --thread-input --frames 62932 --output "D:\work\B5-x264-slow.mp4" "D:\work\B5_x64_4_slow.avs"

Even with AVX extensions the i3-2100T can't keep up with the L3-cacheless Phenom II X4, which is surprising. With four real cores Intel is way ahead, but with AMD optimizations the Phenoms aren't that bad either.

SandyBridge_x264

 

TrueCrypt:
TrueCrypt is a very CPU-intensive program. It encrypts parts of your HDD and can slow down your HDD performance considerably. With the i5 and higher you have the AES-NI extensions, which enable the program to use special instructions for encryption and decryption. We used AES-Twofish, which needs both AES and CPU power. The i3 series does not offer AES-NI, and it shows. Even a Phenom II 840 without L3 cache and without AES is nearly twice as fast.

SandyBridge_TrueCrypt

 

WinRAR:
WinRAR is one of the most popular compression programs, and since version 3.90 it supports multi-threading and 64-bit. The results seem to indicate it's optimized for Intel. We used various files, mostly our benches, which consist of 6849 files and 408 directories weighing in at 724MB.

SandyBridge_Winrarbench

 

SandyBridge_RARenc

 

SandyBridge_RARdec

 

LAME:
We also stayed with LAME, but this time we compared single-threaded vs. dual-threaded. Unfortunately, there is no encoder which can really do multi-threaded encoding. You may encode two, four, six or even eight files in a batch, so one file per core, but it's not the same as real multi-threading.
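The "one file per core" batch approach can be sketched as follows. This is not the script used for the review; it assumes a `lame` executable on the PATH called in its basic `lame input.wav output.mp3` form, and the helper names and the injectable `run` parameter are made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def lame_command(wav_path: str) -> list:
    """Build a basic LAME call that writes an .mp3 next to the input file."""
    wav = Path(wav_path)
    return ["lame", str(wav), str(wav.with_suffix(".mp3"))]


def encode_batch(wav_paths, workers: int, run=subprocess.run) -> list:
    """Run one single-threaded encoder process per file, `workers` at a time.

    Threads are sufficient here because each one only waits for an
    external process. `run` can be swapped out for a dry run.
    """
    commands = [lame_command(path) for path in wav_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(run, commands))  # consume results so errors surface
    return commands


# Example (not executed here): four files on a quad-core CPU.
# encode_batch(["a.wav", "b.wav", "c.wav", "d.wav"], workers=4)
```

Each file is still encoded by a single thread, so this raises throughput for a batch but does nothing for the encoding time of one file.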

SandyBridge_Lame

 

 

 


 

Power-Consumption:

The new CPUs do not really use less power than their predecessors; they draw about the same, but with much more juice. This shows especially in CPU-intensive scenarios. As always, the TDP will not be reached even under full load. At 4GHz and with no changes to the VCore, the i7-2600K is the most efficient.

SandyBridge_power

 

The better efficiency shows with CineBench:

SandyBridge_CBpower

SandyBridge_CBeff

 


Benchmark Summary:


We have calculated all benches relative to the i5-2500(K). Higher than 100% is better than the i5-2500K at stock clocks and HT enabled, while lower of course means less performance.
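Normalizing against a baseline CPU can be sketched like this. The review does not say how time-based results (where lower is better) were handled, so inverting the ratio for those is an assumption, and the numbers in the example are hypothetical, not measured values.

```python
def relative_percent(result: float, baseline: float,
                     higher_is_better: bool = True) -> float:
    """Express a result as a percentage of the baseline CPU's result.

    For scores (fps, points) the ratio is result/baseline. For times
    (seconds) the ratio is inverted so that above 100% still means faster.
    """
    if result <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    ratio = result / baseline if higher_is_better else baseline / result
    return 100.0 * ratio


# Hypothetical numbers, for illustration only.
print(relative_percent(6.0, 5.0))                            # a score in points
print(relative_percent(80.0, 100.0, higher_is_better=False))  # a time in seconds
```

Averaging such percentages across benchmarks then gives a single summary figure per CPU, with the baseline sitting at 100% by definition.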

SandyBridge_Benchmarks

 

Costs:

To make the comparison fair, we have chosen more expensive motherboards. All boards cost between €99 and €107, but of course, especially with AMD, you can buy a €50 board, which will improve AMD's standing. The H61 chipset does offer lower costs, but performance will also suffer with only single-channel memory. We did not worry about the CPU's graphics performance at this time, since we will test with H67 later.

SandyBridge_Costs

 

Conclusion:

As was expected, the new generation of CPUs blows away its predecessors and AMD as well. However, AMD keeps its sockets for years, whereas Intel changes them all the time, which is more than annoying. There is no real reason to do so, besides getting the customers' money, and with a profit of 33%, saying that Intel does great is an understatement.

i3-2000 Series:
This CPU is a completely different die compared to the quad-cores. With only two cores and Hyperthreading, this CPU generation loses AES-NI, TXT and VT-d. Especially for business customers these features would be helpful for building cheaper desktops. For casual gamers, even an i3 is enough for anything you can think of. The 2100T with a TDP of 35W is also quite nice for HTPCs, but note that AMD's E350 can do this job very well too. Both the i3-2100 and the i3-2100T are available for less than €100, which is not cheap but affordable. If you can live with a CPU with 65W TDP you get the i3-2100, which is clocked at 3.1GHz and costs about €94,-.

i5-2000 Series:
These are the successors of the i5-700 series, which was the most popular CPU. Prices start at about €145,-, which is about 50% more than an i3, but you get four real cores. You should be aware of Intel's confusing naming scheme by now, so the i5-2390T is actually an i3 with only two cores. The i5-2000 series can take anything you throw at it and is suitable for anything a normal customer might wish for. As long as AMD's Bulldozer is not out, that's the CPU to go with. The only downside is that you need a new board, which renders your previous investments nearly null and void, but at least your PSU and memory kit can be recycled. Our unlocked i5-2500K is a bit more expensive, but at about €166,- it's not as expensive as previous K processors.

i7-2000 series:
As usual, Intel also offers its quad-cores with Hyperthreading. With four real cores, most applications will not benefit from it, and some, especially games, may even slow down. We only noticed it with Win7 startup, which was slower with the i7-2600K compared to the i5-2500K. Prices start at about €225, and with HT as the main addition this is really expensive, because it doesn't cost Intel a dime. You can see that in the Xeon E3-1200 series prices. The E3-1220 clocks at 3.1GHz and costs about €165, but the E3-1230, which clocks 100MHz faster and does support HT, costs only €190. So, for €25 more you get HT, which is reasonable pricing. So if you need all the bells and whistles and don't care about overclocking, the Xeon E3-1230 is a much better choice. The i7-2600K now costs about €241,-.
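The price argument can be put into numbers using only the prices quoted in this review. The sketch below is a simple comparison; note that each step up also brings 100MHz more clock, and the i7-2600K has more L3 cache than the i5-2500K, so the premium is not for HT alone.

```python
# Approximate street prices in euros, as quoted in this review.
PRICES_EUR = {
    "Xeon E3-1220": 165,
    "Xeon E3-1230": 190,
    "i5-2500K": 166,
    "i7-2600K": 241,
}


def premium(cheaper: str, pricier: str) -> tuple:
    """Return the absolute (EUR) and relative (%) price premium."""
    low, high = PRICES_EUR[cheaper], PRICES_EUR[pricier]
    return high - low, 100.0 * (high - low) / low


for pair in (("Xeon E3-1220", "Xeon E3-1230"), ("i5-2500K", "i7-2600K")):
    extra_eur, extra_pct = premium(*pair)
    print(f"{pair[0]} -> {pair[1]}: +{extra_eur} EUR ({extra_pct:.1f}%)")
```

On these prices the step up costs €25 on the Xeon side and €75 on the desktop side.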

While we are not happy with the new socket, the performance speaks for itself. So, if you're not keen on waiting for AMD's Bulldozer Cores, the i5-2500K is the CPU of our choice.

 

FudzillaRecommended-2011
Last modified on Tuesday, 10 May 2011 02:44