[linux-audio-dev] RT watchdog .. Was: Re: Lowlatency rpm problem


Subject: [linux-audio-dev] RT watchdog .. Was: Re: Lowlatency rpm problem
From: Benno Senoner (sbenno_AT_gardena.net)
Date: Wed Oct 11 2000 - 17:57:57 EEST


On Wed, 11 Oct 2000, David Olofson wrote:

>
> One thing that worries me is this; How to find out if a thread is
> deadlocked, or if it's just running/sleeping with very short
> intervals, without a kernel hack? Do the existing APIs provide
> reliable info in such cases? (I don't want to rely on the threads
> reporting to the daemon in any explicit way! Not reliable.)
>
> Preferably, this should be done without any kernel hacks, extra
> modules or other special requirements.

I'm not sure what reaction speed you want from your watchdog
(e.g. whether a task with 5 msec timeslices should be killed after no more
than 5-6 msec in order to cause minimal audio disruption).

Anyway, I think such a fast reaction speed is not needed: if
a task freezes, then it is buggy, and if you merely overbook the CPU for a few
msec, you might not intend to kill the softsynth that caused it.

I think the best way for the watchdog to see if all realtime processes are
still alive would be:

- the RT process must periodically set a flag to 1 in a shared mem area, which is
then checked periodically (let's say every 1-2 secs) by the watchdog and
reset to zero.
If the watchdog finds that the flag is still zero at the next check, it means
that the RT process froze in some part of the code (without executing the
main loop).
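
A minimal sketch of the flag scheme above, in C. The names heartbeat_t,
heartbeat_create, heartbeat_beat and heartbeat_check are made up for
illustration, and the use of C11 atomics and an anonymous shared mapping
(inherited across fork()) is an assumption; unrelated processes would need
a named segment such as shm_open() instead.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical layout of the shared area: one flag per watched RT process. */
typedef struct {
    atomic_int alive;            /* set to 1 by the RT process, cleared by the watchdog */
} heartbeat_t;

/* Create the shared area. The mapping is shared, so a child created with
 * fork() sees the same flag as its parent. Returns NULL on failure. */
heartbeat_t *heartbeat_create(void)
{
    void *p = mmap(NULL, sizeof(heartbeat_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    heartbeat_t *hb = p;
    atomic_init(&hb->alive, 0);
    return hb;
}

/* RT side: call once per main-loop iteration. A single atomic store,
 * so it never blocks the audio thread. */
void heartbeat_beat(heartbeat_t *hb)
{
    atomic_store(&hb->alive, 1);
}

/* Watchdog side: call every 1-2 seconds. Reads the flag and resets it to
 * zero in one step. Returns 1 if the RT process set the flag since the
 * previous check, 0 if it gave no sign of life. */
int heartbeat_check(heartbeat_t *hb)
{
    return atomic_exchange(&hb->alive, 0);
}
```

The watchdog would call heartbeat_check() from its own periodic loop and
treat a return value of 0 as "the RT process did not reach its main loop
during the last interval". What to do then (kill, lower priority, log) is
left open here.
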
On the other hand, you might fear that the RT process did not freeze but is
in a state where the CPU is constantly overbooked, so that the audio write
loop never blocks, which in practice freezes the non-RT threads.
In this case I would adopt the same flagging strategy, but this time using a
second, non-RT thread (which belongs to the watchdog).
If the non-RT thread does not give any signs of life
(basically it does while(1) { i_am_alive=1; sleep(1); } )
then assume that we are likely to be in a CPU overbooking situation
(assuming that the RT threads still periodically inform the watchdog that they
are running OK). In this case it is hard to figure out WHO is causing the mess.
Perhaps one could simply see which thread consumes the most
CPU (like top does), and then kill that thread.
But it is not 100% foolproof.
Anyway, such a solution would be much more than a windoze solution can offer
(and windows does not have a concept of stability, so the watchdog talk
on that platform would be useless :-)) ).
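
A rough sketch of how the two flags could be combined, in C. The names
canary_main, watchdog_classify, watchdog_poll and the wd_verdict_t enum are
made up for illustration. It assumes the polling part of the watchdog runs
at a higher RT priority than the watched threads (otherwise it would be
starved too), and that it polls less often than the non-RT thread sets its
flag.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

typedef enum {
    WD_OK,              /* RT process and non-RT canary both reported in  */
    WD_RT_FROZEN,       /* RT process did not set its flag                */
    WD_CPU_OVERBOOKED   /* RT process is alive but the canary was starved */
} wd_verdict_t;

/* Flag set by the watchdog's own non-RT thread. */
static atomic_int canary_alive;

/* Body of the non-RT thread: the while(1) loop from the text above.
 * Intended to be started with pthread_create() at normal priority. */
void *canary_main(void *arg)
{
    (void)arg;
    for (;;) {
        atomic_store(&canary_alive, 1);
        sleep(1);
    }
    return NULL;
}

/* Pure decision step: a missing RT flag is reported first, because a
 * frozen RT process is the simpler and more certain diagnosis. */
wd_verdict_t watchdog_classify(int rt_alive, int canary_was_alive)
{
    if (!rt_alive)
        return WD_RT_FROZEN;
    if (!canary_was_alive)
        return WD_CPU_OVERBOOKED;
    return WD_OK;
}

/* One watchdog check: read and reset both flags, then classify. */
wd_verdict_t watchdog_poll(atomic_int *rt_flag)
{
    int rt = atomic_exchange(rt_flag, 0);
    int canary = atomic_exchange(&canary_alive, 0);
    return watchdog_classify(rt, canary);
}
```

A WD_CPU_OVERBOOKED verdict only says that something is hogging the CPU,
not who. Picking the victim (for example the thread with the highest CPU
usage, as top would show it) is a separate step and is not covered by this
sketch.
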

cheers,
Benno.

>
> (Please, CC to do_AT_reologica.se, as that address isn't subscribed to
> these lists.)



This archive was generated by hypermail 2b28 : Wed Oct 11 2000 - 17:00:44 EEST