hi everyone!
for the benefit of those JACK devs who don't follow linux-audio-* (if
any), and for posterity in the JACK mailing list archives:
there has been a very insightful discussion on LAU/LAD about ringbuffer
peculiarities and a possible bug. olivier guilyardi has written test
cases that pinpoint those problems.
he has made them available at http://svn.samalyse.com/misc/rbtest.
can i suggest that the rest of this discussion take place on or at least
be cc:ed to jack-devel?
best,
jörn (who has no idea at all about concurrent programming, but knows a
bit about community engineering and data mining :-D)
Olivier Guilyardi wrote:
> Jack O'Quin wrote:
>> On Sun, Oct 19, 2008 at 2:55 AM, Paul Davis <paul@email-addr-hidden> wrote:
>>
>>> i don't know whether to shoot myself or eat another couple of the
>>> oft-promised hats.
>> Don't beat yourself up too badly. Multiple threads accessing shared memory
>> is *very* tricky. Even smart people (like you) get it wrong sometimes.
>
> I agree with Jack. I remember a friend of mine once qualifying multi-threading
> of "kernel quantization". This is complex but also challenging :)
>
> Although your understanding of these matters is far beyond mine, I think that I
> have a good professional experience of coding methodology. And to me the problem
> here is indeed methodological: there is a lack of unit tests in JACK.
>
>>> so the next question is how best to prevent it. as far as i can see we
>>> have a couple of proposals:
>>>
>>> 1) fons' design, which never actually wraps readptr or writeptr, but
>>> masks the address used to access the data buffer
>>>
>>> 2) removing the intermediate state's visibility
>>>
>>> i admit to preferring (2) even though i know that with a 64 bit index,
>>> not wrapping the ptrs is not really a problem.
>> I also prefer (2).
>>
>>> however, it is not totally clear to me how to prevent an optimizing
>>> compiler from doing the wrong thing here. unlike the claims made by
>>> someone involved with portaudio, i believe that it is correct to declare
>>> the read_ptr and write_ptr volatile, so that the compiler knows that it
>>> cannot try to be clever about it accesses of the "other" ptr (i.e.
>>> read_ptr for the writing thread, and vice versa). maybe the comment on
>>> the use of volatile was based on some idea that i thought it made the
>>> variables thread safe, which is and never was the case.
>>>
>>> i suspect that the safest code looks like:
>>>
>>> size_t tmp = (ptr + incr) & mask;
>>> barrier();
>>> ptr = tmp;
>>>
>>> but i am not sure whether barrier needs to be read, write or both.
>>>
>>> i think that the simpler code:
>>>
>>> ptr = (ptr + incr) & mask;
>>>
>>> is subject to potential compiler and/or processor "optimization" that
>>> might reduce it back to the problem case of two ops without an
>>> intermediate load/store location. the volatile declaration ought to
>>> prevent the compiler from doing this, and i don't see why a processor
>>> would do this, ever, but clearly i've already been deeply wrong about
>>> this. does anybody know for certain?
>> It is best to avoid assumptions about what some future compiler may
>> consider an "optimization". If the register pressure is high at some
>> point in the program, it may decide to store some value just to free up
>> register space for other variables.
>>
>> The "volatile" declaration should remove any need for the compiler
>> barrier() statement, AFAICT. Note that barrier() is a compiler
>> directive, and has no effect on the CPU's ability to reorder cache
>> operations in an SMP memory hierarchy.
>>
>> Here is a fairly clear and complete description of the memory barrier
>> issues: http://lxr.linux.no/linux/Documentation/memory-barriers.txt
>>
>> As best I can tell, both ring buffer threads require "general" memory
>> barriers, because they both read and write (different) shared data. In
>> Linux kernel terms, they'd need to use smp_mb(), since the problems
>> they address are multiprocessor-only. But, since JACK is not compiled
>> differently for UP and SMP, the full mb() seems more appropriate.
>>
>> That stuff makes my head hurt.
>
> Maybe that is because your approach is too theoretical. I've read many theories
> about whether memory barriers are needed or not in lock-free ring buffers, and I
> think it might lead nowhere without experimenting, that is: write test cases.
>
> Until now, I just couldn't write a test that fail because of the lack of memory
> barrier on x86. However, I think I found another bug in Jack ringbuffer, by
> writing another test.
>
> It's a bit of a weird test, I call it "bit circle": I start 16 "node" threads.
> Each node :
> - communicate with its adjacent node through 2 ring buffers
> - is responsible for shifting an integer by one bit
> - send the shifted result to the next node through a ring buffer
> - checks that the value read from previous node is correct.
>
> On my Debian Quad Core and Mac OS X Core Duo boxes, this test fails with Jack's
> ringbuffer even with my patch applied. But it succeeds with Portaudio (both with
> and without memory barriers), and also with Fons' implementation.
>
> I wish someone could eventually run my test suite on a PowerPC, especially SMP.
>
> The usage is unchanged:
> svn co http://svn.samalyse.com/misc/rbtest
> cd rbtest
> make test
>
> The run time is now shorter, about 2 minutes. Below is the output. You can see
> that jack (test-bit-circle-jack) fails the bit circle test, even when patched
> (test-bit-circle-jack-fix1). "|" is printed when a node thread starts, and "-"
> when a node has read 1000 values, to ensure data is really flowing.
>
> The test-int-array-* tests are the same as my original test. All *-lfq tests use
> Fons's Lfq ringbuffer (modified to use char instead of uint32_t) as backend.
>
> ./run-tests.sh bit-circle 512 jack jack-fix1 portaudio portaudio-nobarrier lfq
>
> test-bit-circle-jack - starting (5s max) - buffer size: 512
> || FAILURE: 2 != 514
>
> test-bit-circle-jack-fix1 - starting (5s max) - buffer size: 512
> || FAILURE: 2 != 514
>
> test-bit-circle-portaudio - starting (5s max) - buffer size: 512
> ||-|-|-||-|-|-|-|-||--|||--|---- SUCCESS
>
> test-bit-circle-portaudio-nobarrier - starting (5s max) - buffer size: 512
> |||--|-|||---||-||-|-|--||--|--- SUCCESS
>
> test-bit-circle-lfq - starting (5s max) - buffer size: 512
> |||--|||-|||-|||--|||-------|--- SUCCESS
>
> ./run-tests.sh int-array 16 jack jack-fix1 portaudio portaudio-nobarrier lfq
>
> test-int-array-jack - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] 52572 != 52568 at offset 0
> FAILURE in chunk 91822
>
> test-int-array-jack-fix1 - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] SUCCESS
>
> test-int-array-portaudio - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] SUCCESS
>
> test-int-array-portaudio-nobarrier - starting (20s max) - array/buffer size: 8/16
> [reader started] [writer started] [flowing] SUCCESS
>
> test-int-array-lfq - starting (20s max) - array/buffer size: 8/16
> [writer started] [reader started] [flowing] SUCCESS
>
-- jörn nettingsmeier home://germany/45128 essen/lortzingstr. 11/ http://spunk.dnsalias.org phone://+49/201/491621 Kurt is up in Heaven now. _______________________________________________ Linux-audio-dev mailing list Linux-audio-dev@email-addr-hidden http://lists.linuxaudio.org/mailman/listinfo/linux-audio-devReceived on Mon Oct 20 20:15:04 2008
This archive was generated by hypermail 2.1.8 : Mon Oct 20 2008 - 20:15:05 EEST