Solving problems involving more complex data structures with CUDA

Posted by Nils on Stack Overflow See other posts from Stack Overflow or by Nils
Published on 2010-05-10T22:43:48Z Indexed on 2010/05/11 1:04 UTC
Read the original article Hit count: 484

Filed under:

So I read a bit about CUDA and GPU programming. I noticed a few things such that access to global memory is slow (therefore shared memory should be used) and that the execution path of threads in a warp should not diverge.

I also looked at the (dense) matrix multiplication example, described in the programmers manual and the nbody problem. And the trick with the implementation seems to be the same: Arrange the calculation in a grid (which it already is in case of the matrix mul); then subdivide the grid into smaller tiles; fetch the tiles into shared memory and let the threads calculate as long as possible, until it needs to reload data from the global memory into shared memory.

In case of the nbody problem the calculation for each body-body interaction is exactly the same (page 682):

bodyBodyInteraction(float4 bi, float4 bj, float3 ai)

It takes two bodies and an acceleration vectors. The body vector has four components it's position and the weight. When reading the paper, the calculation is understood easily.

But what is if we have a more complex object, with a dynamic data structure? For now just assume that we have an object (similar to the body object presented in the paper) which has a list of other objects attached and the number of objects attached is different in each thread. How could I implement that without having the execution paths of the threads to diverge?

I'm also looking for literature which explains how different algorithms involving more complex data structures can be effectively implemented in CUDA.

Developer IT

Solving problems involving more complex data structures with CUDA - Developer IT

Solving problems involving more complex data structures with CUDA

cuda

gpu

Related posts about cuda

CUDA Driver API vs. CUDA runtime

Updating a Cuda 4.0 project to Cuda 4.2

How to solve CUDA crash when run CUDA example fluidsGL?

Context migration in CUDA.NET

CUDA on GeForce 8600GT

Related posts about gpu

CUDA: Memory copy to GPU 1 is slower in multi-GPU

Software for monitoring/measuring GPU temperature (WinXP/7)

Port scientific software to GPU and publish it

Intel Merges GPU, CPU for Netbooks

Yellow Dog Enterprise Linux for GPU computing

Categories cloud