CUDA: accumulate data into a large histogram of floats

Posted by shoosh on Stack Overflow
Published on 2010-05-31T16:31:51Z, indexed on 2010/05/31 16:43 UTC


I'm trying to think of a way to implement the following algorithm using CUDA:

Working on a large volume of voxels, for each voxel I calculate an index i and a value c. After the calculation I need to perform histogram[i] += c.
c is a float value, and the histogram can have up to 15,000 bins.

I'm looking for a way to implement this efficiently using CUDA. The first obvious problem is that with compute capability 1.3, which is what I'm using, I can't even do an atomicAdd() of floats. So how can I accumulate anything reliably?
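For reference, a common workaround on hardware without a native float atomicAdd() is to emulate it with a compare-and-swap loop on the float's 32-bit pattern, since atomicCAS() on 32-bit words in global memory is available from compute capability 1.1. Below is a minimal sketch, not a tuned implementation. It assumes the per-voxel i and c have already been written to two arrays, and the names atomicAddFloat, accumulate, and accumulateOnDevice are made up for illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: emulates a float atomic add with a CAS loop on the raw bits.
__device__ float atomicAddFloat(float* address, float val)
{
    unsigned int* addr = reinterpret_cast<unsigned int*>(address);
    unsigned int old = *addr;
    unsigned int assumed;
    do {
        assumed = old;
        float sum = __int_as_float((int)assumed) + val;
        // Store the new sum only if the bin still holds the value we read.
        old = atomicCAS(addr, assumed, (unsigned int)__float_as_int(sum));
    } while (assumed != old);  // another thread updated the bin first; retry
    return __int_as_float((int)old);
}

// One thread per voxel; idx[k] and val[k] stand in for the per-voxel i and c.
__global__ void accumulate(const int* idx, const float* val, int n, float* hist)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k < n)
        atomicAddFloat(&hist[idx[k]], val[k]);
}

// Host wrapper (assumed name): uploads the inputs, runs the kernel,
// and copies the finished histogram back. Error checking is omitted.
void accumulateOnDevice(const int* idx, const float* val, int n,
                        float* hist, int bins)
{
    int* dIdx = 0;
    float* dVal = 0;
    float* dHist = 0;
    cudaMalloc((void**)&dIdx, n * sizeof(int));
    cudaMalloc((void**)&dVal, n * sizeof(float));
    cudaMalloc((void**)&dHist, bins * sizeof(float));
    cudaMemcpy(dIdx, idx, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dVal, val, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dHist, 0, bins * sizeof(float));  // all-zero bits are 0.0f

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    accumulate<<<blocks, threads>>>(dIdx, dVal, n, dHist);

    cudaMemcpy(hist, dHist, bins * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIdx);
    cudaFree(dVal);
    cudaFree(dHist);
}
```

Because the order in which threads win the CAS is not fixed, the floating-point sums may differ slightly from run to run, and bins that receive many voxels will serialize on the retry loop.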

This example by NVIDIA does something somewhat simpler: the histograms are kept in shared memory (which I can't do because of the histogram's size), and it only accumulates integers. Can this approach be generalized to my case?
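One possible generalization, sketched here under assumptions rather than taken from the NVIDIA sample: 15,000 floats occupy 60,000 bytes, which does not fit in the 16 KB of shared memory on compute capability 1.x, but a slice of the histogram does. Each block can own one slice, accumulate only the voxels that fall into it using a CAS-based float add in shared memory (shared-memory atomics need compute capability 1.2 or later), and then flush the slice to global memory. SLICE_BINS, atomicAddFloat, accumulateSliced, and accumulateSlicedOnDevice are illustrative names and values.

```cuda
#include <cuda_runtime.h>

#define SLICE_BINS 2048  // assumed slice size: 8 KB of floats per block

// Hypothetical helper: float atomic add emulated with a CAS loop on the raw bits.
__device__ float atomicAddFloat(float* address, float val)
{
    unsigned int* addr = reinterpret_cast<unsigned int*>(address);
    unsigned int old = *addr;
    unsigned int assumed;
    do {
        assumed = old;
        float sum = __int_as_float((int)assumed) + val;
        old = atomicCAS(addr, assumed, (unsigned int)__float_as_int(sum));
    } while (assumed != old);
    return __int_as_float((int)old);
}

// blockIdx.y selects the histogram slice; blocks along x stride over the voxels.
__global__ void accumulateSliced(const int* idx, const float* val, int n,
                                 float* hist, int bins)
{
    __shared__ float local[SLICE_BINS];
    int first = blockIdx.y * SLICE_BINS;  // first global bin of this slice

    for (int b = threadIdx.x; b < SLICE_BINS; b += blockDim.x)
        local[b] = 0.0f;
    __syncthreads();

    // Accumulate only the voxels whose bin lies inside this block's slice.
    for (int k = blockIdx.x * blockDim.x + threadIdx.x; k < n;
         k += gridDim.x * blockDim.x) {
        int b = idx[k] - first;
        if (b >= 0 && b < SLICE_BINS)
            atomicAddFloat(&local[b], val[k]);
    }
    __syncthreads();

    // Flush non-empty bins of the slice to the global histogram.
    for (int b = threadIdx.x; b < SLICE_BINS && first + b < bins; b += blockDim.x)
        if (local[b] != 0.0f)
            atomicAddFloat(&hist[first + b], local[b]);
}

// Host wrapper (assumed name); error checking is omitted.
void accumulateSlicedOnDevice(const int* idx, const float* val, int n,
                              float* hist, int bins)
{
    int* dIdx = 0;
    float* dVal = 0;
    float* dHist = 0;
    cudaMalloc((void**)&dIdx, n * sizeof(int));
    cudaMalloc((void**)&dVal, n * sizeof(float));
    cudaMalloc((void**)&dHist, bins * sizeof(float));
    cudaMemcpy(dIdx, idx, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dVal, val, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dHist, 0, bins * sizeof(float));

    int threads = 256;
    int blocksX = (n + threads - 1) / threads;
    if (blocksX > 64) blocksX = 64;  // arbitrary cap; the kernel strides over the rest
    dim3 grid(blocksX, (bins + SLICE_BINS - 1) / SLICE_BINS);
    accumulateSliced<<<grid, threads>>>(dIdx, dVal, n, dHist, bins);

    cudaMemcpy(hist, dHist, bins * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIdx);
    cudaFree(dVal);
    cudaFree(dHist);
}
```

The trade-off in this sketch is that every slice rereads all the voxel indices, so a 15,000-bin histogram with 2,048-bin slices makes eight passes over the input. Whether that beats accumulating directly in global memory depends on how the indices are distributed and would need to be measured.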

© Stack Overflow or respective owner
