cuda - Number of threads calculating a single value


I am using CUDA with compute capability 1.2. My CUDA code compares two matrices element by element, and each thread calculates the value of one element. I want to know whether it is possible to use 2 threads to calculate a single value. If it is, can someone tell me how to use 2 different threads of the same block to calculate that single value?

  q = m2 [i] [k] + m2 [(k +, then 

Use a wide variable and fewer iterations rather than two cores, for example int2:

  __shared__ int2 m2[N][N], p1[N], q;
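One possible reading of the wide-variable advice is that a single thread works on two packed values at once, so half as many iterations are needed. The sketch below is only an illustration under several assumptions: N = 4, the kernel name wide_q_kernel, the host helper compute_wide_q and the test data are all made up, the arrays live in global memory instead of __shared__ to keep the example short, and the two terms from the post are assumed to be summed into q.

```cuda
#include <cuda_runtime.h>

#define N 4  // assumed matrix size; the original post does not give one

// One thread computes both lanes (.x and .y) of the packed result q.
// m2 is an N x N matrix stored row-major; i >= 1 and k + 1 < N are assumed.
__global__ void wide_q_kernel(const int2 *m2, const int2 *p1, int2 *q,
                              int i, int j, int k)
{
    int2 a = m2[i * N + k];
    int2 b = m2[(k + 1) * N + j];
    int2 c = p1[i - 1], d = p1[k], e = p1[j];
    q->x = a.x + b.x + c.x * d.x * e.x;  // first packed value
    q->y = a.y + b.y + c.y * d.y * e.y;  // second packed value
}

// Hypothetical host helper: builds simple test data and launches one thread.
int2 compute_wide_q(int i, int j, int k)
{
    int2 h_m2[N * N], h_p1[N];
    for (int r = 0; r < N * N; ++r) h_m2[r] = make_int2(r, 2 * r);
    for (int r = 0; r < N; ++r)     h_p1[r] = make_int2(r + 1, r + 2);

    int2 *d_m2, *d_p1, *d_q;
    cudaMalloc(&d_m2, sizeof(h_m2));
    cudaMalloc(&d_p1, sizeof(h_p1));
    cudaMalloc(&d_q, sizeof(int2));
    cudaMemcpy(d_m2, h_m2, sizeof(h_m2), cudaMemcpyHostToDevice);
    cudaMemcpy(d_p1, h_p1, sizeof(h_p1), cudaMemcpyHostToDevice);

    wide_q_kernel<<<1, 1>>>(d_m2, d_p1, d_q, i, j, k);

    int2 h_q = make_int2(0, 0);
    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(&h_q, d_q, sizeof(int2), cudaMemcpyDeviceToHost);
    cudaFree(d_m2);
    cudaFree(d_p1);
    cudaFree(d_q);
    return h_q;
}
```

Calling compute_wide_q(1, 2, 1) with this test data yields two results from one thread, which is the point of the wide type: the work per element stays the same, but no second thread or extra synchronization is involved.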

You can use two cores, but not two threads. If you insist on two threads, then

  qThread1 = m2[i][k] + m2[(k + 1)][j];    // in a kernel
  ...
  qThread2 = p1[(i - 1)] * p1[k] * p1[j];  // in another kernel

Then you add them to q in another thread. Synchronization and kernel launch overheads, poorer cache utilization, and reduced instruction-level parallelism can all hurt performance. Kernel occupancy may increase, but it is not certain that this makes up for the drawbacks mentioned above.
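The split described above could be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: N = 4, the kernel names, the host helper compute_q_split and the flat row-major layout of m2 are assumptions, and the final step is assumed to be a plain sum of the two partial results. Each kernel runs a single thread, and launches on the default stream execute in order, so no explicit synchronization is written between them.

```cuda
#include <cuda_runtime.h>

#define N 4  // assumed matrix size; the original post does not give one

// First partial result: the sum term from the post.
__global__ void sum_term_kernel(const int *m2, int *qThread1,
                                int i, int j, int k)
{
    *qThread1 = m2[i * N + k] + m2[(k + 1) * N + j];
}

// Second partial result: the product term from the post.
__global__ void product_term_kernel(const int *p1, int *qThread2,
                                    int i, int j, int k)
{
    *qThread2 = p1[i - 1] * p1[k] * p1[j];
}

// A third thread combines the two partial results into q.
__global__ void combine_kernel(const int *qThread1, const int *qThread2, int *q)
{
    *q = *qThread1 + *qThread2;
}

// Hypothetical host helper: h_m2 holds N*N ints (row-major), h_p1 holds N ints.
// Assumes i >= 1 and k + 1 < N so every index stays in range.
int compute_q_split(const int *h_m2, const int *h_p1, int i, int j, int k)
{
    int *d_m2, *d_p1, *d_parts;  // d_parts: [0] qThread1, [1] qThread2, [2] q
    cudaMalloc(&d_m2, N * N * sizeof(int));
    cudaMalloc(&d_p1, N * sizeof(int));
    cudaMalloc(&d_parts, 3 * sizeof(int));
    cudaMemcpy(d_m2, h_m2, N * N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_p1, h_p1, N * sizeof(int), cudaMemcpyHostToDevice);

    // Three launches for one value: this is the launch overhead the answer warns about.
    sum_term_kernel<<<1, 1>>>(d_m2, d_parts + 0, i, j, k);
    product_term_kernel<<<1, 1>>>(d_p1, d_parts + 1, i, j, k);
    combine_kernel<<<1, 1>>>(d_parts + 0, d_parts + 1, d_parts + 2);

    int q = 0;
    // The blocking copy also waits for all three kernels to finish.
    cudaMemcpy(&q, d_parts + 2, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_m2);
    cudaFree(d_p1);
    cudaFree(d_parts);
    return q;
}
```

Even in this toy form the cost is visible: three launches and two intermediate global-memory values replace what one thread could compute in a single expression, which is why the split is unlikely to pay off for such a small amount of work per element.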
