A simple test to show that the 16-core Ryzen 9 5950X processor can indeed benefit significantly from higher memory clock (MCLK) speeds, even at the cost of the extra latency that comes from desynchronised clock domains between the Infinity Fabric and the Unified Memory Controller (UMC).
This video uses a 4K/30/22 Mbps (H.264) -> 720p/24/4 Mbps (H.265) transcode, from my recording format to my standard archival format for bulk video storage.
I was curious to see if the Ryzen 9 5950X was memory bandwidth bottlenecked during video transcoding in this case, after observing higher "DRAM Read Bandwidth" and "DRAM Write Bandwidth" readings in HWiNFO64. I have noticed an (anecdotal) correlation between bandwidth-intensive workloads and higher readings on those two sensors. The high core frequency combined with relatively low package power also suggests the cores are waiting a lot, likely on memory: with the execution units gated, the core simply uses the spare headroom to run the rest of the core at higher clocks.
This test essentially pits the standard DDR4-3200 sync setting that the 5950X supports, with all 3 domains (Fabric, Memory Controller, and Memory) locked at 1600 MHz, against the less popular option of running a much higher memory speed (4200 MT/s effective) but with the Fabric running out of sync and the memory controller at 50% of the memory clock. Because the UMC can (likely) signal multiple bits per clock to the memory, the half-rate memory controller should be able to rely on pipelining to achieve higher bandwidth without bottlenecking the RAM. This is observed (take it with a grain of salt) in the AIDA64 bandwidth test, where the 4200 De-sync setup achieves ~59-60 GB/s vs ~46-48 GB/s on the 3200 Sync setup.
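To put the AIDA64 numbers in context, here is a rough sketch of the clock and peak-bandwidth arithmetic for both setups. It assumes a dual-channel configuration with a 64-bit (8-byte) bus per channel, which is my assumption about the test system rather than something measured here. The function names are made up for illustration, and the Fabric clock is left out because its exact value in the De-sync setup is not stated above.

```python
# Rough sketch: theoretical peak DRAM bandwidth vs the AIDA64 readings quoted above.
# Assumptions (not measured): dual-channel DDR4, 8 bytes (64 bits) per channel.

def memory_clocks_mhz(mt_per_s, uclk_ratio):
    """Return (MCLK, UCLK) in MHz for a given DDR transfer rate.

    DDR transfers twice per memory clock, so MCLK = MT/s / 2.
    uclk_ratio is 1.0 for the synced setup and 0.5 for the De-sync setup.
    """
    mclk = mt_per_s / 2
    return mclk, mclk * uclk_ratio

def peak_bandwidth_gbs(mt_per_s, channels=2, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return mt_per_s * 1e6 * bytes_per_transfer * channels / 1e9

# (name, transfer rate in MT/s, UCLK:MCLK ratio, AIDA64 range in GB/s from the text)
setups = [
    ("3200 Sync", 3200, 1.0, (46, 48)),
    ("4200 De-sync", 4200, 0.5, (59, 60)),
]

for name, rate, ratio, (low, high) in setups:
    mclk, uclk = memory_clocks_mhz(rate, ratio)
    peak = peak_bandwidth_gbs(rate)
    print(f"{name}: MCLK {mclk:.0f} MHz, UCLK {uclk:.0f} MHz, "
          f"peak {peak:.1f} GB/s, measured {low / peak:.0%}-{high / peak:.0%} of peak")
```

Under those assumptions the theoretical peaks are 51.2 GB/s and 67.2 GB/s, so both setups land at very roughly 90% of peak in AIDA64. That would be consistent with the half-rate memory controller not holding the RAM back much, though it is not proof of it.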
The video clearly shows the De-sync 4200 memory outperforming the 3200 Sync setup, transcoding the video in around 14% less time with ~30% more memory bandwidth, which is reasonable scaling.
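As a quick sanity check on what "reasonable scaling" means here, the sketch below converts the ~14% time reduction into a speedup and compares it with the ~30% bandwidth gain. The last step uses a simple Amdahl-style model (part of the run time scales perfectly with bandwidth, the rest does not scale at all), which is my own simplification and not something the test measures directly. The function names are illustrative.

```python
# Rough sketch: how well does the transcode scale with memory bandwidth?
# Inputs are the approximate figures quoted above (~14% less time, ~30% more bandwidth).

def speedup_from_time_reduction(time_reduction):
    """Convert 'X% less time' into a throughput speedup factor."""
    return 1.0 / (1.0 - time_reduction)

def bandwidth_bound_fraction(time_reduction, bandwidth_ratio):
    """Assumed model: new_time = (1 - f) + f / bandwidth_ratio.

    Solves for f, the fraction of the original run time that would have to be
    purely bandwidth-bound to explain the observed time reduction.
    """
    return time_reduction / (1.0 - 1.0 / bandwidth_ratio)

time_reduction = 0.14    # ~14% less transcode time
bandwidth_ratio = 1.30   # ~30% more memory bandwidth

speedup = speedup_from_time_reduction(time_reduction)
fraction = bandwidth_bound_fraction(time_reduction, bandwidth_ratio)

print(f"Speedup: {speedup:.2f}x")                           # about 1.16x
print(f"Gain per unit of extra bandwidth: {(speedup - 1) / (bandwidth_ratio - 1):.0%}")
print(f"Implied bandwidth-bound share of run time: {fraction:.0%}")  # about 61%
```

In other words, roughly half of the extra bandwidth shows up as extra transcode throughput, and under this simple model about 60% of the original run time would have been limited by memory bandwidth. Treat both numbers as ballpark figures only.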