Add uchar values to a ushort array with SSE2 or SSE3
Posted by pompolus on Stack Overflow
        
        
        
        
        Published on 2012-11-09T18:06:59Z
        
I have an unsigned short dst[16][16] matrix and a larger unsigned char src[m][n] matrix.
Now I need to read a 16x16 submatrix from src and add it to dst, using SSE2 or SSE3.
In an older implementation, I was sure that the summed values would always fit in 8 bits (never above 255), so I could do this:
for (int row = 0; row < 16; ++row)
{
    __m128i subMat = _mm_lddqu_si128(reinterpret_cast<const __m128i*>(src));
    dst[row] = _mm_add_epi8(dst[row], subMat);
    src += W; // step to the next row to add
}
where W is the offset needed to reach the next desired row. This code works, but now the values in src are larger and the sums can exceed 255, so I need to store them as ushort.
I've tried this:
for (int row = 0; row < 16; ++row)
{
    __m128i subMat = _mm_lddqu_si128(reinterpret_cast<const __m128i*>(src));
    dst[row] = _mm_add_epi16(dst[row], subMat);
    src += W; // step to the next row to add
}
but it doesn't work. I'm not very experienced with SSE, so any help would be appreciated.
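
A likely reason the second attempt fails: _mm_add_epi16 treats the 16 loaded bytes as eight 16-bit lanes, so each pair of neighbouring src bytes is added as one combined value instead of as two separate elements. One common SSE2-only approach is to zero-extend the bytes to 16 bits first with _mm_unpacklo_epi8 / _mm_unpackhi_epi8 and then add. The following is only a minimal sketch under some assumptions: dst is stored contiguously as 16 rows of 16 ushort values, each src row has at least 16 readable bytes, and the helper name add_block_u8_to_u16 and its parameters are hypothetical (stride plays the role of W).

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstddef>
#include <cstdint>

// Hypothetical helper (name and signature are assumptions, not from the post).
// Adds the 16x16 block of bytes starting at src to dst, widening each byte
// to 16 bits first. stride is the byte distance between src rows (W above).
void add_block_u8_to_u16(std::uint16_t dst[16][16],
                         const std::uint8_t* src,
                         std::size_t stride)
{
    const __m128i zero = _mm_setzero_si128();
    for (int row = 0; row < 16; ++row)
    {
        // Load 16 unsigned bytes (unaligned load, available in SSE2).
        __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));

        // Interleave with zero: each byte becomes a zero-extended 16-bit lane.
        __m128i lo = _mm_unpacklo_epi8(bytes, zero);  // elements 0..7
        __m128i hi = _mm_unpackhi_epi8(bytes, zero);  // elements 8..15

        // One dst row is 16 ushort values = two 128-bit vectors.
        __m128i* out = reinterpret_cast<__m128i*>(dst[row]);
        __m128i accLo = _mm_loadu_si128(out);
        __m128i accHi = _mm_loadu_si128(out + 1);

        // 16-bit adds; these wrap modulo 65536 if the sums ever grow that large.
        _mm_storeu_si128(out,     _mm_add_epi16(accLo, lo));
        _mm_storeu_si128(out + 1, _mm_add_epi16(accHi, hi));

        src += stride;  // step to the next row to add
    }
}
```

With the names from the question, a call could look like add_block_u8_to_u16(dst, src, W). If dst is known to be 16-byte aligned, the unaligned loads and stores on dst could be replaced with _mm_load_si128 / _mm_store_si128.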
© Stack Overflow or respective owner