Memory read/write access efficiency

Posted by wolfPack88 on Programmers See other posts from Programmers or by wolfPack88
Published on 2014-06-13T13:32:34Z Indexed on 2014/06/13 15:39 UTC
Read the original article Hit count: 258

Filed under:
|

I've heard conflicting information from different sources, and I'm not really sure which one to believe. As such, I'll post what I understand and ask for corrections. Let's say I want to use a 2D matrix. There are three ways that I can do this (at least that I know of).

1:

int i;
char **matrix;
matrix = malloc(50 * sizeof(char *));
for(i = 0; i < 50; i++)
    matrix[i] = malloc(50);

2:

int i;
int rowSize = 50;
int pointerSize = 50 * sizeof(char *);
int dataSize = 50 * 50;
char **matrix;
matrix = malloc(dataSize + pointerSize);
char *pData = matrix + pointerSize - rowSize;
for(i = 0; i < 50; i++) {
    pData += rowSize;
    matrix[i] = pData;
}

3:

//instead of accessing matrix[i][j] here, we would access matrix[i * 50 + j]
char *matrix = malloc(50 * 50); 

In terms of memory usage, my understanding is that 3 is the most efficient, 2 is next, and 1 is least efficient, for the reasons below:

3: There is only one pointer and one allocation, and therefore, minimal overhead.

2: Once again, there is only one allocation, but there are now 51 pointers. This means there is 50 * sizeof(char *) more overhead.

1: There are 51 allocations and 51 pointers, causing the most overhead of all options.

In terms of performance, once again my understanding is that 3 is the most efficient, 2 is next, and 1 is least efficient. Reasons being:

3: Only one memory access is needed. We will have to do a multiplication and an addition as opposed to two additions (as in the case of a pointer to a pointer), but memory access is slow enough that this doesn't matter.

2: We need two memory accesses; once to get a char *, and then to the appropriate char. Only two additions are performed here (once to get to the correct char * pointer from the original memory location, and once to get to the correct char variable from wherever the char * points to), so multiplication (which is slower than addition) is not required. However, on modern CPUs, multiplication is faster than memory access, so this point is moot.

1: Same issues as 2, but now the memory isn't contiguous. This causes cache misses and extra page table lookups, making it the least efficient of the lot.

First and foremost: Is this correct? Second: Is there an option 4 that I am missing that would be even more efficient?

© Programmers or respective owner

Related posts about optimization

Related posts about c