Re: [linux-audio-dev] MidiShare Linux

New Message Reply About this list Date view Thread view Subject view Author view Other groups

Subject: Re: [linux-audio-dev] MidiShare Linux
From: Benno Senoner (sbenno_AT_gardena.net)
Date: Thu Jul 20 2000 - 14:27:55 EEST


On Thu, 20 Jul 2000, Stephane Letz wrote:
> >That means if you want to be sure that it works everywhere
> >you have to ship the SMP version.
>
> I think using the "lock" instruction is costly when used in UP. You should
> better provide
> two versions, actually by using the __SMP__ flag.
> We will publish some benchmark to show that.

I looking forward to the benchmark results.
Regarding SMP: Do you mean checking the presence of SMP at runtime,
and then use the appropriate function ?

> >
> >Why implement a lock-free fifo on top of two lock-free lifos ?
>
> Just a temporary solution.

was quite obvious , who writes suboptimal code these days except M$ ? :-)

> Will must implement the fifo directly using inlined assembly language ,
> using the "magic" instructions
> "cmpxchg8b" and so on.
>
> >About the "only one reader" issue: yes the lock-free fifo I'm using , assumes
> >this, but we colud add the following:
> >
> >let's say we always know the number of readers:
> >each reader reads form the FIFO but does not increase the read pointer.
> >reach reader (atomically) increases a member (count) of the dataset with was
> >just read, and as soon as it reaches NUM_READERS we will increase the read
> >pointer and reset to zero.
>
> So it means that all readers read the same element?

yes, but I have not done any research on the multiple reader isse just
brainstorming.
I do not need multiple readers, I have only one reader and one writer.

>
> Actually i don't really understand the way your fifo (with ringbuffer) works.
> How can you garantee that the write operations works correctly in a
> multi-thread context?
> Can you explain the design ?

The code works correctly only in an enviroment, where only one reading thread
and one writing thread are accessing the structure.

Basically the reader increases only the read pointer and the writer only the
write pointer.
That's all, no complex design behind :-)

I have wrapped these assignments into atomic_read() and atomic_set() macros
so it should be completely SMP safe.
I make not use of the atomic_add() / atomic_sub() macros because
I must do write_ptr = ( write_ptr + written_elements) & size_mask
in order to clip at buffer boundaries,
looking at the asm code, the SMP and UP versions are exactly the same because
atomic_set and atomic_read do not make use of the lock instruction.
This may not be true on other architectures.

As for the "multiple concurrent writers" situation I'm not sure if there would
be a way to adapt this code to make it work correctly.
The problem is as soon as you increase the write pointer, the reader
assumes that new elements are available.
So the writer has first to get the actual write position,
write his stuff in the buffer and when it is finished move the write pointer so
that the reader knows that new data is available.

Currently I use one fifo per reader/writer pair so in the case of one reader and
multiple writer threads,
at each iteration, I simply let the reading thread check all fifos in
sequence if there is new data available, while the writers all write to their
private fifo. simple and race-free :-)

But obviously there may be situations where one single fifo with multiple
writers and one reader is prefered.
( I assume Midishare does so)

How did you solve this exactly ?

For those that are interested I looked at the asm code outputted by the version
with and without atomic_t (see old disksamp versions)

look at this testfunction:

void Voice::testfunc(void) {
  curr_offset += 255;
  current_sample=stream->rb->get_read_ptr();
  curr_offset += 254;
  stream->rb->increment_read_ptr(BUFFER_GRANULE);
  curr_offset += 253;
}

(the curr_offset instructions are only useful to see where the
get_read_ptr() and increment_read_ptr() functions start and end:

without atomic_t, current_sample=stream->rb->get_read_ptr(); translates into:

        movl %esi,12(%ecx)
        movl 44(%ecx),%ebx
        movl (%ebx),%edx
        movl 12(%edx),%edi
        leal 0(,%edi,2),%eax
        addl 4(%edx),%eax
        movl %eax,(%ecx)

with atomic_t:

        movl %esi,12(%ecx)
        movl 44(%ecx),%ebx
        movl (%ebx),%edx
        movl 12(%edx),%edi
        leal 0(,%edi,2),%eax
        addl 4(%edx),%eax
        movl %eax,(%ecx)

you can see only the lea instruction is different and the performance
hit should not be that big

stream->rb->increment_read_ptr(BUFFER_GRANULE);
is identical on both methods (with and witout atomic_t ):

        movl %esi,12(%ecx)
        movl (%ebx),%edx
        movl 12(%edx),%eax
        addl BUFFER_GRANULE,%eax
        andl 16(%edx),%eax
        movl %eax,12(%edx)

With the inlining, you achieve efficiency which is close
to assembly programming, therefore it is not true that C++ programs
are always inefficient. :-)

Notice that I use the
get_read_ptr() and increment_read_ptr() (and the *write* counterparts)
almost exclusively over the more general read() write() methods,
because by passing data by reference you avoid memory copies,
plus since the fifo size is a power of two, when you write a number
of elements which is a power of two , then you can safe the boundary
crossing check, where you would have to split up the data in two parts.

Since the diskthread of my disksampler writes the data in large chunks
and the audiothread reads/updates the ringbuffer's read buffer position only
every 32 samples (BUFFER_GRANULE) which additionally is very lightweight
(the above few asm instructions), it makes virtually no difference if you stream
from RAM or from disk. Efficiency close to the ideal case ! :-)

I attached the new ringbuffer.h code (with atomic_t) please look at it,
and tell me if it looks ok to you.

PS: just for cursiousity, does windoze provide atomic functions to userspace
apps ?

Benno.

-------
#ifndef RINGBUFFER_H
#define RINGBUFFER_H

#include <stdio.h>
#include <asm/atomic.h>

#include <sys/types.h>
#include <pthread.h>

template<class T>
class RingBuffer
{
public:
    RingBuffer (int sz) {
            int power_of_two;

            for (power_of_two = 1;
                 1<<power_of_two < sz;
                 power_of_two++);

            size = 1<<power_of_two;
            size_mask = size;
            size_mask -= 1;
            atomic_set(&write_ptr, 0);
            atomic_set(&read_ptr, 0);
            buf = new T[size];
    };

    virtual ~RingBuffer() {
            delete [] buf;
    }

    __inline int read (T *dest, int cnt);
    __inline int write (T *src, int cnt);
    __inline T *get_read_ptr();
    __inline T *get_write_ptr();
    __inline void increment_read_ptr(int cnt) {
               atomic_set(&read_ptr , (atomic_read(&read_ptr) + cnt) & size_mask);
             }

    __inline void increment_write_ptr(int cnt) {
               atomic_set(&write_ptr, (atomic_read(&write_ptr) + cnt) & size_mask);
             }
    __inline int write_space_to_end() {
                  return(size - atomic_read(&write_ptr));
            }
    __inline int read_space_to_end() {
                  return(size - atomic_read(&read_ptr));
            }
    __inline void init() {
                   atomic_set(&write_ptr, 0);
                   atomic_set(&read_ptr, 0);
            }

    int write_space () {
            int w, r;

            w = atomic_read(&write_ptr);
            r = atomic_read(&read_ptr);

            if (w > r) {
                    return ((r - w + size) & size_mask) - 1;
            } else if (w < r) {
                    return (r - w) - 1;
            } else {
                    return size - 1;
            }
    }

    int read_space () {
            int w, r;

            w = atomic_read(&write_ptr);
            r = atomic_read(&read_ptr);

            if (w >= r) {
                    return w - r;
            } else {
                    return (w - r + size) & size_mask;
            }
    }

    int size;
    
  protected:
    T *buf;
    atomic_t write_ptr;
    atomic_t read_ptr;
    int size_mask;
};

template<class T> T *
RingBuffer<T>::get_read_ptr (void) {
//printf("GET: read_ptr=%d\n",read_ptr);
  return(&buf[atomic_read(&read_ptr)]);
}

template<class T> T *
RingBuffer<T>::get_write_ptr (void) {
  return(&buf[atomic_read(&write_ptr)]);
}

template<class T> int
RingBuffer<T>::read (T *dest, int cnt)

{
        int free_cnt;
        int cnt2;
        int to_read;
        int n1, n2;
        int priv_read_ptr;

        priv_read_ptr=atomic_read(&read_ptr);

        if ((free_cnt = read_space ()) == 0) {
                return 0;
        }

        to_read = cnt > free_cnt ? free_cnt : cnt;
        
        cnt2 = priv_read_ptr + to_read;

        if (cnt2 > size) {
                n1 = size - priv_read_ptr;
                n2 = cnt2 & size_mask;
        } else {
                n1 = to_read;
                n2 = 0;
        }
        
        memcpy (dest, &buf[priv_read_ptr], n1 * sizeof (T));
        priv_read_ptr = (priv_read_ptr + n1) & size_mask;

        if (n2) {
                memcpy (dest+n1, buf, n2 * sizeof (T));
                priv_read_ptr = n2;
        }

        atomic_set(&read_ptr, priv_read_ptr);
        return to_read;
}

template<class T> int
RingBuffer<T>::write (T *src, int cnt)

{
        int free_cnt;
        int cnt2;
        int to_write;
        int n1, n2;
        int priv_write_ptr;

        priv_write_ptr=atomic_read(&write_ptr);

        if ((free_cnt = write_space ()) == 0) {
                return 0;
        }

        to_write = cnt > free_cnt ? free_cnt : cnt;
        
        cnt2 = priv_write_ptr + to_write;

        if (cnt2 > size) {
                n1 = size - priv_write_ptr;
                n2 = cnt2 & size_mask;
        } else {
                n1 = to_write;
                n2 = 0;
        }

        memcpy (&buf[priv_write_ptr], src, n1 * sizeof (T));
        priv_write_ptr = (priv_write_ptr + n1) & size_mask;

        if (n2) {
                memcpy (buf, src+n1, n2 * sizeof (T));
                priv_write_ptr = n2;
        }
        atomic_set(&write_ptr, priv_write_ptr);
        return to_write;
}

#endif /* RINGBUFFER_H */

 


New Message Reply About this list Date view Thread view Subject view Author view Other groups

This archive was generated by hypermail 2b28 : Thu Jul 20 2000 - 16:02:08 EEST