The best way to predict performance without actually porting the code?

Posted by ardiyu07 on Stack Overflow
Published on 2012-12-20T10:07:58Z Indexed on 2012/12/20 11:03 UTC

I believe other people have had the same experience as me: having to give an (estimated) performance report for porting a program from sequential to parallel on some designated multicore hardware, with very little time to do it.

For instance, suppose a 10K LoC sequential program is given that executes on an Intel i7-3770K (not vectorized) in 100 ms. How long would it take to run if one parallelized the code for a Tesla C2075 with NVIDIA CUDA, assuming all applicable parallel optimization techniques were applied? (But you're only given 2-4 days to report the performance, and assume that you don't know the algorithm at all. Or perhaps it'd be safer to just assume that it's impossible to finish the job in that time.)

Therefore, I'm wondering: what is most likely the fastest way to produce such a performance report? Is it safe to calculate it solely from the hardware's capabilities, such as peak GFLOPS and memory bandwidth? Is there a mathematical way to calculate it? If there is, please demonstrate your method with a corresponding problem description and algorithm, along with the target hardware's specifications.
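To make the hardware-only calculation concrete, here is a minimal roofline-style sketch of the kind of estimate I have in mind. It is not a validated method: the FLOP and byte counts are hypothetical placeholders (they would have to come from profiling or counting the real code), and the Tesla C2075 figures are nominal single-precision peak and memory bandwidth values as I understand them from the datasheet, so please check them before relying on the numbers.

```python
def roofline_time_ms(flops, bytes_moved, peak_gflops, peak_gbps):
    """Lower bound on run time in ms: the slower of pure compute and pure memory traffic."""
    compute_s = flops / (peak_gflops * 1e9)      # time if only arithmetic mattered
    memory_s = bytes_moved / (peak_gbps * 1e9)   # time if only DRAM traffic mattered
    return max(compute_s, memory_s) * 1e3


# Hypothetical workload counts (assumptions, not measured from any real program).
flops = 2.0e9         # floating-point operations per run
bytes_moved = 1.0e9   # bytes read from and written to device memory per run

# Nominal Tesla C2075 figures (assumed: ~1030 GFLOPS single precision, ~144 GB/s).
gpu_ms = roofline_time_ms(flops, bytes_moved, peak_gflops=1030.0, peak_gbps=144.0)

measured_cpu_ms = 100.0  # the sequential run time from the example above
print(f"GPU lower bound: {gpu_ms:.2f} ms")
print(f"Best-case speedup: {measured_cpu_ms / gpu_ms:.1f}x")
```

This only gives a floor on the run time, so the speedup it prints is a ceiling. It ignores host-to-device transfers, kernel launch overhead, branch divergence, and any part of the program that stays serial, which is exactly why I doubt whether such a calculation is safe on its own.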

Or perhaps a tool already exists to (roughly) estimate the performance of ported code?

(Please don't answer: 'killing yourself is the fastest way.')

© Stack Overflow or respective owner
