Re: [linux-audio-dev] Lock Free FIFOs

New Message Reply About this list Date view Thread view Subject view Author view Other groups

Subject: Re: [linux-audio-dev] Lock Free FIFOs
From: Jack O'Quin (joq_AT_io.com)
Date: Wed Dec 06 2000 - 22:01:13 EET


Paul Sladen <paul_AT_sladen.org> writes:

> Does anyone have an example of a Lock Free FIFO. Puesdo code /quite/
> acceptable.

Depending on the system environment in which the FIFO is used, the
"ancient and venerable" ring queue is an obvious solution. It is very
efficient in the case of a single producer thread and a single
consumer thread per queue. It also works for signal handlers and
interrupt handlers.

The basic idea is straightforward, each queue is a circular buffer
with the following header:

typedef int Q_element; /* could be anything */

struct ring_Q { /* queue header */
    Q_element *first; /* start of buffer */
    Q_element *in; /* next available slot */
    Q_element *out; /* first element in queue */
    Q_element *limit; /* last slot + 1 */
};

The producer thread does something like this:

int enqueue(struct ring_Q *rq, Q_element new_element)
{
    Q_element *next;

    next = rq->in + 1;
    if (next == rq->limit) /* buffer wrap? */
        next = rq->first;

    if (next == rq->out) /* buffer full? */
        return ENOMEM;

    *rq->in = new_element;
    rq->in = next;
    return 0;
}

The consumer thread removes elements as follows:

Q_element dequeue(struct ring_Q *rq)
{
    Q_element result;
    Q_element *next = rq->out;

    if (rq->in == next) /* buffer empty? */
        return -1; /* not a Q_element */

    result = *next++;
    if (next == rq->limit) /* buffer wrap? */
        next = rq->first;
    rq->out = next;
    return result;
}

DISCLAIMER: This code is untested and probably contains errors.

In the single-producer, single-consumer case this needs no locks,
because rq->in is only written by the producer thread, and rq->out is
only written by the consumer thread. Each thread reads the other's
pointer to detect overflow or underflow, but depends only on the
machine ensuring that reads and writes of a single pointer location
are coherent with respect to interrupts and multiple processors. This
property, called "load/store atomicity" is supported by almost all MP
hardware, as spin locks generally depend on it. For some NUMA
(non-uniform memory architecture) implementations, one may need an
assembler routine that uses special instructions. But, for normal
cache-coherent SMP it's not a problem.

Note that this implementation does *not* correctly handle either
multiple producer or multiple consumer threads accessing a single
queue without a lock. There is no problem supporting several
consumers (each with its own queue), or a single consumer reading
several producers' queues. For small numbers of producers and
consumers, I could even imagine an N_PRODUCERS by N_CONSUMERS matrix
of queues so any thread could queue to any other. But, that would
quickly become absurd. :-)

Perhaps this simple algorithm meets your need, perhaps not.
Generalizing it to handle multiple threads all reading or writing a
single queue without locks looks harder. It can probably be done on
machines that support some kind of primitive operation like "test-
and-set", "exchange", "compare-and-swap", or "conditional-store". If
that's what you really need, let me know. That would be fun to think
about. But, the solution is not obvious to me at the moment.

Regards,

-- 
  Jack O'Quin
  Austin, Texas, USA


New Message Reply About this list Date view Thread view Subject view Author view Other groups

This archive was generated by hypermail 2b28 : Wed Dec 06 2000 - 22:30:33 EET