Instead of the traditional ARM client roadmap event in Cambridge, UK, ARM held a virtual event. We got to sit in our cozy homes and listen to the top ARM managers present the CPU, GPU, and NPU updates.
After the presentation, thanks to the analyst relations and PR team at ARM, yours truly got to chat with Vincent Risson, CPU product manager, Client Line of Business; Chris Abernathy, distinguished engineer, Client Line of Business; Stefan Rosinger, director, product management, Client Line of Business; Peter Greenhalgh, vice president of technology and fellow, Arm; and Ian Smythe, vice president of solutions marketing, Client Line of Business.
Cortex A78 Core
Cortex A77, in its base design, came with 7nm FinFet and landed in some premium SoCs, including the Snapdragon 865, which powers most of the 5G Android-based high-end phones on the market. Interestingly, both Samsung's Exynos and Huawei's Kirin ended up using Cortex A76 for their 2020 phones, probably due to their time-to-market plans.
The Cortex A78 is designed for high-end performance and best efficiency, and it is built on the successful Cortex A76 and Cortex A77 design template.
Chris Abernathy, distinguished engineer, Client Line of Business, mentioned improvements in branch prediction, including better accuracy and improved branch target processing, as well as many other microarchitecture improvements.
The Cortex A78 shares the same architecture as the previous generation, with an extended scalability feature set. It supports the DynamIQ Shared Unit (DSU), which is compatible with the Cortex A55 for big.LITTLE configurations. The Cortex A78 supports Armv8-A (the v8.2 instruction set) for 32-bit or 64-bit operation and comes with a Neon SIMD engine and an FPU. It supports a 32KB or 64KB L1 I-cache with parity and a 32KB or 64KB L1 D-cache with ECC, along with a 256KB or 512KB private L2 with ECC. The SoC can have between 512KB and 4MB of shared L3 cache with ECC, and it will be up to the customer to choose the right amount of L3 cache. It also comes with support for asynchronous bridges, SCU, PP, and the Accelerator Coherency Port (ACP).
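To make the configuration choices above concrete, here is a minimal sketch that checks a proposed cache setup against the sizes quoted in this article. The class and field names are hypothetical and are not part of any ARM tool, and the sketch assumes that any L3 size inside the quoted range is acceptable, which the presentation did not spell out.

```python
from dataclasses import dataclass

# Sizes quoted in the article, in KB. Names below are illustrative only.
L1_OPTIONS_KB = (32, 64)
L2_OPTIONS_KB = (256, 512)
L3_MIN_KB, L3_MAX_KB = 512, 4096  # 512KB to 4MB shared L3


@dataclass(frozen=True)
class CacheConfig:
    l1i_kb: int  # L1 instruction cache (parity)
    l1d_kb: int  # L1 data cache (ECC)
    l2_kb: int   # private L2 (ECC)
    l3_kb: int   # shared L3 (ECC)

    def problems(self):
        """Return a list of human-readable issues; empty means it fits."""
        issues = []
        if self.l1i_kb not in L1_OPTIONS_KB:
            issues.append(f"L1 I-cache {self.l1i_kb}KB not in {L1_OPTIONS_KB}")
        if self.l1d_kb not in L1_OPTIONS_KB:
            issues.append(f"L1 D-cache {self.l1d_kb}KB not in {L1_OPTIONS_KB}")
        if self.l2_kb not in L2_OPTIONS_KB:
            issues.append(f"L2 {self.l2_kb}KB not in {L2_OPTIONS_KB}")
        # Assumption: only the range is checked, not specific step sizes.
        if not L3_MIN_KB <= self.l3_kb <= L3_MAX_KB:
            issues.append(f"L3 {self.l3_kb}KB outside {L3_MIN_KB}-{L3_MAX_KB}KB")
        return issues


if __name__ == "__main__":
    print(CacheConfig(64, 64, 512, 4096).problems())
    print(CacheConfig(48, 64, 512, 8192).problems())
```

The first configuration uses the largest quoted sizes everywhere, while the second one is deliberately out of range for both the L1 I-cache and the L3.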
Cortex A78 performance
Now let’s look at some numbers. Compared with the Cortex A77 at 2.6GHz on 7nm FinFet, the Cortex A78 at 3GHz on 5nm FinFet, at 1W per core, is about 20 percent faster. The performance numbers compare Cortex-A78 to Cortex-A77 at 1W, the energy consumption numbers are for 30 SPECint2006, and both include architectural and process improvements (compared to 2019 devices). It might look like an unequal competition to compare a 2.6GHz core to a 3.0GHz one. Still, ARM’s Stefan Rosinger, director of product management, Client Line of Business, explained that ARM expects customers to reach higher clocks with the new Cortex A78 cores on 5nm FinFet.
This is expected to show up on the market once customers launch new SoCs based on the Cortex A78. The result is better sustained performance than 2019 devices, with up to a 20 percent performance increase.
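To put the clock difference in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above and assumes, naively, that performance is simply clock speed multiplied by per-clock throughput. That simple model ignores the 1W power budget, so treat the split as an illustration rather than an ARM number. The variable names are ours.

```python
# Figures quoted in the article; variable names are illustrative only.
A77_CLOCK_GHZ = 2.6
A78_CLOCK_GHZ = 3.0
QUOTED_SPEEDUP = 1.20  # "about 20 percent faster" at 1W per core

clock_ratio = A78_CLOCK_GHZ / A77_CLOCK_GHZ

# Assumption: performance = clock x per-clock throughput, so whatever the
# clock ratio does not explain is attributed to per-clock gains.
per_clock_ratio = QUOTED_SPEEDUP / clock_ratio

print(f"clock increase:     {100 * (clock_ratio - 1):.1f}%")
print(f"per-clock increase: {100 * (per_clock_ratio - 1):.1f}%")
```

Under this simple model, most of the quoted gain tracks the higher clock, which is why the 2.6GHz versus 3.0GHz comparison deserves the explanation ARM gave.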
Energy efficiency shows an even bigger delta. An efficiency-optimized Cortex A78 clocked at 2.1GHz on 5nm FinFet has 50 percent lower energy consumption at the same performance as the Cortex A77 clocked at 2.3GHz on 7nm FinFet. Cutting energy consumption by 50 percent compared to 2019 devices is quite an achievement.
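It is worth spelling out what "50 percent lower energy at the same performance" means in terms of efficiency. The short sketch below normalizes the Cortex A77 to 1.0 for both performance and energy. The function name is ours, and efficiency is taken here to mean performance per unit of energy, which is an assumption about how the figure should be read.

```python
def efficiency_ratio(relative_performance, relative_energy):
    """Performance per unit of energy, relative to a baseline of 1.0 / 1.0."""
    return relative_performance / relative_energy


# Cortex A77 at 2.3GHz on 7nm is the normalized baseline.
baseline = efficiency_ratio(1.0, 1.0)

# Cortex A78 at 2.1GHz on 5nm: same performance, 50 percent lower energy.
a78 = efficiency_ratio(1.0, 0.5)

print(f"Efficiency versus baseline: {a78 / baseline:.1f}x")
```

In other words, halving the energy at equal performance doubles the work done per unit of energy under this definition.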
Partners are expected to use 5nm for the Cortex A78. Our information, gathered from industry sources, implies that Huawei is expected to use 5nm for its next-generation Kirin series. Qualcomm's next-generation Snapdragon is expected to launch this year and become available in devices in the Q1/Q2 2021 timeframe. Since Samsung dropped its custom Exynos M core program, the 2021 Exynos might use Cortex A78 cores too.
The Cortex A78 gives you seven percent more performance than the Cortex A77 in an ISO-process comparison, with four percent less power and a five percent smaller area. The delta is much higher compared to the Cortex A76. All of these scores are normalized to 7nm FinFet. In other words, this is what the architecture optimization alone can deliver, as the remaining 13 percent of the performance gain comes from 5nm FinFet.
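The split between architecture and process can be worked out in two ways, and the sketch below shows both using only the numbers quoted in this article. The 13 percent figure is a simple subtraction. If the gains are assumed to compound multiplicatively instead, which is our assumption and not something ARM stated, the process share comes out slightly lower.

```python
# Figures quoted in the article, expressed as ratios.
TOTAL_GAIN = 1.20         # Cortex A78 on 5nm vs Cortex A77 on 7nm
ARCHITECTURE_GAIN = 1.07  # ISO-process comparison, normalized to 7nm

# Additive view: percentage points simply subtract.
additive_process_pct = (TOTAL_GAIN - ARCHITECTURE_GAIN) * 100

# Multiplicative view (assumption): total = architecture x process.
multiplicative_process_pct = (TOTAL_GAIN / ARCHITECTURE_GAIN - 1) * 100

print(f"process share, additive:       {additive_process_pct:.1f}%")
print(f"process share, multiplicative: {multiplicative_process_pct:.1f}%")
```

Either way, the move to 5nm FinFet accounts for the larger part of the quoted 20 percent.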
ARM also made a surprise announcement, Core X, which is even faster than the Cortex A78. We will tell you all about it in the next part.