Memory Bandwidth Performance for Modern Machines

Posted by porgarmingduod on Stack Overflow See other posts from Stack Overflow or by porgarmingduod
Published on 2010-03-18T01:38:16Z Indexed on 2010/03/18 1:41 UTC
Read the original article Hit count: 333

Filed under:
|
|
|

I'm designing a real-time system that occasionally has to duplicate a large amount of memory. The memory consists of non-tiny regions, so I expect the copying performance will be fairly close to the maximum bandwidth the relevant components (CPU, RAM, MB) can do. This led me to wonder what kind of raw memory bandwidth modern commodity machine can muster?

My aging Core2Duo gives me 1.5 GB/s if I use 1 thread to memcpy() (and understandably less if I memcpy() with both cores simultaneously.) While 1.5 GB is a fair amount of data, the real-time application I'm working on will have have something like 1/50th of a second, which means 30 MB. Basically, almost nothing. And perhaps worst of all, as I add multiple cores, I can process a lot more data without any increased performance for the needed duplication step.

But a low-end Core2Due isn't exactly hot stuff these days. Are there any sites with information, such as actual benchmarks, on raw memory bandwidth on current and near-future hardware?

Furthermore, for duplicating large amounts of data in memory, are there any shortcuts, or is memcpy() as good as it will get?

Given a bunch of cores with nothing to do but duplicate as much memory as possible in a short amount of time, what's the best I can do?

© Stack Overflow or respective owner

Related posts about memory

Related posts about Performance