C++11: thread_local or array of OpenCL 1.2 cl_kernel objects?

Posted by user926918 on Stack Overflow
Published on 2012-10-07T21:35:22Z

I need to run several C++11 threads (GCC 4.7.1) in parallel on the host. Each of them needs to use a device, say a GPU. According to the OpenCL 1.2 specification (p. 357):

All OpenCL API calls are thread-safe except clSetKernelArg.
clSetKernelArg is safe to call from any host thread, and is safe
to call re-entrantly so long as concurrent calls operate on different
cl_kernel objects. However, the behavior of the cl_kernel object is
undefined if clSetKernelArg is called from multiple host threads on
the same cl_kernel object at the same time.

An elegant way would be to use thread_local cl_kernel objects. The other way I can think of is to use an array of these objects, so that the i-th thread uses the i-th object. As I have not implemented either of these before, I was wondering whether one of the two is a good choice, or whether there are better ways of getting this done.
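
To make the two options concrete, here is a minimal sketch of both. It is not real OpenCL code: FakeKernel, make_kernel and set_kernel_arg are hypothetical stand-ins for cl_kernel, clCreateKernel and clSetKernelArg, so that the example builds without an OpenCL SDK or a device. Only the ownership pattern (one kernel object per host thread) is the point. Also note that, as far as I know, the thread_local keyword needs GCC 4.8 or newer, so with GCC 4.7.1 the array variant may be the only one that compiles as written.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for cl_kernel. In real code this would be a
// cl_kernel handle returned by clCreateKernel.
struct FakeKernel {
    int arg = 0;           // models the state written by clSetKernelArg
    bool created = false;
};

// Hypothetical stand-ins for clCreateKernel and clSetKernelArg.
FakeKernel make_kernel() {
    FakeKernel k;
    k.created = true;
    return k;
}
void set_kernel_arg(FakeKernel& k, int value) { k.arg = value; }

// Option 1: thread_local. Each thread creates its own kernel object
// the first time it calls my_kernel().
FakeKernel& my_kernel() {
    thread_local FakeKernel k = make_kernel();
    return k;
}

std::vector<int> run_with_thread_local(std::size_t n_threads) {
    assert(n_threads > 0);
    std::vector<int> results(n_threads, -1);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n_threads; ++i) {
        threads.emplace_back([i, &results] {
            FakeKernel& k = my_kernel();   // private to this thread
            set_kernel_arg(k, static_cast<int>(i) * 10);
            results[i] = k.arg;            // each thread writes only its own slot
        });
    }
    for (auto& t : threads) t.join();
    return results;
}

// Option 2: an array of kernel objects created up front; the i-th
// thread only ever touches the i-th object.
std::vector<int> run_with_array(std::size_t n_threads) {
    assert(n_threads > 0);
    std::vector<FakeKernel> kernels;
    kernels.reserve(n_threads);
    for (std::size_t i = 0; i < n_threads; ++i) kernels.push_back(make_kernel());

    std::vector<int> results(n_threads, -1);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n_threads; ++i) {
        threads.emplace_back([i, &kernels, &results] {
            set_kernel_arg(kernels[i], static_cast<int>(i) * 10);
            results[i] = kernels[i].arg;
        });
    }
    for (auto& t : threads) t.join();
    return results;
}
```

In both variants no two threads ever set arguments on the same kernel object, which is the condition the quoted paragraph requires. One difference I can see: with the array, the main thread owns all kernel objects and can release them in one place, whereas with thread_local each thread would have to release its own object (clReleaseKernel in real code) before it exits.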

TIA, S

