Efficiency of data structures in C99 (possibly affected by endianness)
- by Ninefingers
Hi All,
I have a couple of questions that are all inter-related. Basically, in the algorithm I am implementing a word w is defined as four bytes, so it can be contained whole in a uint32_t. 
However, during the operation of the algorithm I often need to access the various parts of the word. Now, I can do this in two ways:
uint32_t w = 0x11223344;
uint8_t a = (w & 0xff000000) >> 24;
uint8_t b = (w & 0x00ff0000) >> 16;
uint8_t b = (w & 0x0000ff00) >>  8;
uint8_t d = (w & 0x000000ff);
However, part of me thinks that isn't particularly efficient. I thought a better way would be to use union representation like so:
typedef union
{
    struct
    {
        uint8_t d;
        uint8_t c;
        uint8_t b;
        uint8_t a;
    };
    uint32_t n;
} word32;
Using this method I can assign word32 w = 0x11223344; then I can access the various 
parts as I require (w.a=11 in little endian).
However, at this stage I come up against endianness issues, namely, in big endian systems my struct is defined incorrectly so I need to re-order the word prior to it being passed in.
This I can do without too much difficulty. My question is, then, is the first part (various bitwise ands and shifts) efficient compared to the implementation using a union? Is there any difference between the two generally? Which way should I go on a modern, x86_64 processor? Is endianness just a red herring here?
I could inspect the assembly output of course, but my knowledge of compilers is not brilliant. I would have thought a union would be more efficient as it would essentially convert to memory offsets, like so:
mov eax, [r9+8]
Would a compiler realise that is what happening in the bit-shift case above?
If it matters, I'm using C99, specifically my compiler is clang (llvm).
Thanks in advance.