Optimizing a shared buffer in a producer/consumer multithreaded environment

Posted by Etan on Stack Overflow See other posts from Stack Overflow or by Etan
Published on 2010-05-05T10:50:53Z Indexed on 2010/05/05 12:08 UTC
Read the original article Hit count: 348

I have some project where I have a single producer thread which writes events into a buffer, and an additional single consumer thread which takes events from the buffer. My goal is to optimize this thing for a single machine to achieve maximum throughput.

Currently, I am using some simple lock-free ring buffer (lock-free is possible since I have only one consumer and one producer thread and therefore the pointers are only updated by a single thread).

#define BUF_SIZE 32768

struct buf_t { volatile int writepos; volatile void * buffer[BUF_SIZE]; 
    volatile int readpos;) };

void produce (buf_t *b, void * e) {
    int next = (b->writepos+1) % BUF_SIZE;
    while (b->readpos == next); // queue is full. wait
    b->buffer[b->writepos] = e; b->writepos = next;
}

void * consume (buf_t *b) {
    while (b->readpos == b->writepos); // nothing to consume. wait
    int next = (b->readpos+1) % BUF_SIZE;
    void * res = b->buffer[b->readpos]; b->readpos = next;
    return res;
}

buf_t *alloc () {
    buf_t *b = (buf_t *)malloc(sizeof(buf_t));
    b->writepos = 0; b->readpos = 0; return b;
}

However, this implementation is not yet fast enough and should be optimized further. I've tried with different BUF_SIZE values and got some speed-up. Additionaly, I've moved writepos before the buffer and readpos after the buffer to ensure that both variables are on different cache lines which resulted also in some speed.

What I need is a speedup of about 400 %. Do you have any ideas how I could achieve this using things like padding etc?

© Stack Overflow or respective owner

Related posts about c

    Related posts about Performance