May 12, 2019

My Best Friends, also a note about Fury X vs Vega 64 'clock for clock'

Radeon VII isn't here because he is in my PC rendering muh video gamez

My favourite crew and best friends in life. Top left: Sapphire R9 380X Nitro. Top right: Sapphire R9 280X Vapor-X. Bottom left: Sapphire RX 590 Nitro Special Edition. Bottom right: Sapphire RX 570 'Bane of GTX 1650' Pulse ITX 4GB.

Yes they are all Sapphire, because I really do like Sapphire. I am currently in the process of acquiring a Sapphire R9 390X Nitro with Hawaii GPU (GCN2) for my future testing. That will complete my full collection of all GCN revisions! Then I will be able to die happy.

I kinda wanna do a Fury X (GCN3, Fiji XT) versus RX Vega 64 (GCN5, Vega 10 XT) clock for clock comparison in some specific tests. But... there are a few things that I think people don't understand when they compare Fiji to Vega 10. I saw some people getting very small (single digit %) gains in these comparisons and then concluding that Vega had almost no architectural gains over Fiji. Well, that's really not true. Take a look at the way Vega's ROP / L2 cache system works: the ROPs are tied directly into the L2$ on Vega, and that L2 is doubled in size to 4MB over Fiji's 2MB. That means more of the data used by those ROP units stays on-chip, which increases bandwidth efficiency.

So why such small gains with both cores at 1 GHz? Because Vega was designed with much, much higher frequencies in mind (Polaris runs much closer to Tonga than Vega does to Fiji). A big success of Vega is that it can be up to 35-40% faster than Fiji whilst having less memory bandwidth. That's a pretty impressive thing for GCN, which historically has required a lot of bandwidth to scale its performance (especially versus NVIDIA parts). So that alone shows much higher bandwidth efficiency and a real architectural gain when both GPUs are operating in their designed clock ranges.

Also, I did hear that some of Vega's functional elements relating to its ability to discard primitives don't function properly at clock rates that are too low or too high. Something like this was observed when Vega first launched: you'd get some insane benchmark score, but you'd notice that the GPU wasn't actually drawing a lot of the scene, or it wasn't drawing certain bits, or the colours were all wrong. So my assumption is that the Discard engine was throwing out geometry that it shouldn't have, artificially increasing FPS... you know, because it's not rendering the entire scene properly. At speeds that are too low, the Discard engine might not be functioning at all, or at least much less effectively. Hence the lower % gains, even in geometry heavy scenes. Vega's 'Draw Stream Binning Rasterizer' is also basically about allowing higher performance with lower bandwidth reliance (or lower power at the same performance). So its benefit only shows if you pump the core speed up and keep the raw bandwidth the same (or less, in Vega's case).

If both cores are running at 1 GHz, you're hitting other bottlenecks in Vega that reduce its gains over Fiji. The increased bandwidth efficiency afforded by the ROP / L2 configuration is being masked by the fact that the ROPs are likely hard limited (i.e. the GPU is pixel-fillrate bound) rather than being limited by memory bandwidth. So would Vega 64 be 35-40% faster than Fury X, with 5% less memory bandwidth, if it didn't have significant architectural improvements? I don't think it would be.
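To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. It is not a benchmark, just arithmetic on the figures in this post (35-40% faster, 5% less bandwidth). The 64 ROPs per chip and the one-pixel-per-ROP-per-clock simplification are my own assumptions for illustration, not something measured here.

```python
# Back-of-the-envelope sketch, not a benchmark.
# Assumption (not from the post): Fiji XT and Vega 10 XT both have 64 ROPs,
# and each ROP writes one pixel per clock in the simplest case.
ROPS = {"Fury X": 64, "Vega 64": 64}


def peak_fillrate_gpix(rops, clock_ghz):
    """Theoretical peak pixel fillrate in Gpixels/s."""
    return rops * clock_ghz


def perf_per_bandwidth_gain(speedup, bandwidth_ratio):
    """Relative performance delivered per unit of memory bandwidth."""
    return speedup / bandwidth_ratio


# Clock for clock at 1 GHz, the theoretical pixel fillrate is identical.
fill_1ghz = {name: peak_fillrate_gpix(r, 1.0) for name, r in ROPS.items()}

# Figures from the post: 35-40% faster with 5% less memory bandwidth.
low = perf_per_bandwidth_gain(1.35, 0.95)
high = perf_per_bandwidth_gain(1.40, 0.95)

print(fill_1ghz)
print(f"Performance per unit of bandwidth: {low:.2f}x to {high:.2f}x")
```

Under those assumptions both cards have the same theoretical fillrate at 1 GHz, so a fillrate-bound test can't show much of a gap, while at designed clocks Vega is getting roughly 1.42x to 1.47x the performance out of each unit of bandwidth.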

That's why I have a bit of a problem with 'Fury X vs Vega 64 clock for clock' comparisons, where people conclude that Vega is just a 'die-shrunk Fiji with higher clock speeds' (it's not)... But then why did I do a Tahiti vs Tonga vs Polaris comparison? Because, as I said, Polaris 10/20 was built for clock rates much closer to those two. GCN5 was a massive increase in stock operating speeds.

All that said, I would still like those two cards to run some tests. For Science!
