Re: [linux-audio-dev] HD-recording frustration: linux makes it almost impossible :-(


Subject: Re: [linux-audio-dev] HD-recording frustration: linux makes it almost impossible :-(
From: Paul Barton-Davis (pbd_AT_Op.Net)
Date: Mon Apr 10 2000 - 05:31:25 EEST


Benno - further down in this message is a very important thing that
happened to me when I was working on ardour last week that *might*
explain the behaviour you are seeing without it being anything to do
with the buffer cache.

>The 2nd CPU isn't needed, since SCHED_FIFO processes not calling disk
>I/O routines will run just fine, therefore the audio thread doesn't
>suffer much during heavy disk I/O (about 50-150ms on normal
>kernels, about 1-3ms on a lowlatency kernel)

As I pointed out, I think it's important for the butler thread to be
SCHED_FIFO as well. If it's not, and it's issuing disk i/o requests
serially, then it can effectively reduce the disk throughput because
of the delay between a read request completing and the butler
thread running again. However, it obviously has to run at a lower
priority than the audio thread.
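For concreteness, here is a minimal sketch of how both threads might
acquire SCHED_FIFO scheduling, with the butler at a lower priority.
The priority values and the helper name are mine, not ardour's actual
code (and SCHED_FIFO requires root privileges):

  /* give the calling thread SCHED_FIFO at the given priority */
  #include <pthread.h>
  #include <sched.h>
  #include <string.h>

  static int
  set_fifo_priority (int prio)
  {
          struct sched_param sp;
          memset (&sp, 0, sizeof (sp));
          sp.sched_priority = prio;
          return pthread_setschedparam (pthread_self (), SCHED_FIFO, &sp);
  }

  /* audio thread:  set_fifo_priority (20);
     butler thread: set_fifo_priority (10);  -- lower, so a long run
     of disk i/o can never preempt audio processing */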

>Actually hitting the boundary or coming very close doesn't make a big
>difference, because when the buffer is almost full the read size is small,
>thus leading to a disk performance decrease (for that request),
>therefore available space will always fluctuate until the system gets in a
>stationary state.
>( A ringbuffer is in my opinion easier/not harder to manage than a ring of
>buffers, plus if you want you can read/write any amount of data)

Right, but that's a feature that we don't actually
want. Reading/writing anything but suitably-sized chunks of data is
bad for disk throughput, and since data is being used and produced at
a regular rate, it seems much more natural to me to use the ring of
buffers approach. Note that the ring of buffers approach doesn't have
to use a distinct set of buffers - if you wanted to, you could do it
just like soundcard h/w does, and use fragment boundaries as
semantic locations, not breaks in the contiguity of the memory.
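To illustrate the fragment-boundary idea (every name and size here is
invented for the example, not taken from ardour):

  /* one contiguous region, logically divided into NFRAGS fragments
     of FRAG_SIZE frames each */
  #include <stdlib.h>

  #define NFRAGS    32
  #define FRAG_SIZE 4096                    /* frames per fragment */

  typedef struct {
          float    *data;                   /* NFRAGS * FRAG_SIZE frames */
          unsigned  read_frag;              /* next fragment audio reads */
          unsigned  write_frag;             /* next fragment butler fills */
  } frag_ring;

  static int
  frag_ring_init (frag_ring *r)
  {
          r->data = malloc (NFRAGS * FRAG_SIZE * sizeof (float));
          r->read_frag = r->write_frag = 0;
          return r->data ? 0 : -1;
  }

  static float *
  frag_ptr (frag_ring *r, unsigned frag)
  {
          return r->data + (frag % NFRAGS) * FRAG_SIZE;
  }

The butler always transfers exactly FRAG_SIZE frames into
frag_ptr (r, r->write_frag) and then bumps write_frag; the audio
thread consumes whole fragments and bumps read_frag. The memory is
contiguous, but all i/o happens in fixed, disk-friendly units.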

>The problem is not having the buffer refilled up to 2k samples or up to 10k
>samples, the problem is when your disk thread gets blocked by the
>damn buffer flushing.

Well, the problems on my system came from using non-contiguous files,
too small a chunk size for i/o and other things that all led to not
having the buffers filled in time. Once I got that solved, it's been
easy since then (bar the read/write ordering issue, and I think that's
easily solved by throwing more memory at the problem).

But as I said in a previous message, I think the problem may be that
you are allocating fs blocks as you go, which is a performance killer
for more than one reason. If you preallocate the files, there is no
(or almost no) metadata to be written.
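A minimal sketch of what I mean - just write zeroed blocks through the
whole file once, up front. The chunk size and the helper itself are my
own invention, not ardour's code:

  /* preallocate 'bytes' bytes for 'path' so that recording never
     allocates new fs blocks (and hence writes almost no metadata) */
  #include <sys/types.h>
  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>

  static int
  preallocate (const char *path, off_t bytes)
  {
          char buf[65536];
          int  fd = open (path, O_WRONLY | O_CREAT, 0644);

          if (fd < 0) return -1;
          memset (buf, 0, sizeof (buf));
          while (bytes > 0) {
                  ssize_t n = write (fd, buf,
                                     bytes > (off_t) sizeof (buf)
                                          ? sizeof (buf) : (size_t) bytes);
                  if (n <= 0) { close (fd); return -1; }
                  bytes -= n;
          }
          return close (fd);
  }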

>> I didn't say that varispeed was painful. I said that doing it the way
>> you sketched out (determine space per track; sort tracks; refill each
>> track) will not work, or more precisely, it will not work if you are at
>> all close to the limits of the disk throughput. If you are operating
>> comfortably within the disk throughput limit, then sure, it's just
>> fine.
>
>Why will it not work ?
>Do you have any alternatives ?
>Anyway the algorithm adapts itself to the actual conditions.

If you refill an extremely empty track, it requires more data than a
"regular" track. This will take more time to fetch, which delays the
subsequent tracks being filled. When you then refill the next one, it will
also take longer than you "expect", and so on. If this occurs to too
great an extent, you will fail to refill the "last" tracks at all
before the audio thread catches up with you.

The way to avoid this is to always read the same size chunk of data
for each track, BUT then, if that didn't fill all the tracks, continue
around the loop again to fill them up. That way, slowed-down or
regular-speed tracks get filled in time, and you use what would
otherwise be slack time in the butler thread (the time between
finishing a refill and the next "signal" from the audio thread/wakeup
from usleep) to "top up" (this sounds very english - does it make
sense?) the sped-up tracks.
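In (very rough) code - track_t, space_available() and read_chunk()
are hypothetical stand-ins, not ardour's actual interfaces:

  #define CHUNK_FRAMES 16384     /* one fixed-size disk read */

  typedef struct track track_t;
  extern int  space_available (track_t *t);   /* writable frames */
  extern void read_chunk (track_t *t, int frames);

  void
  refill_all (track_t *tracks, int ntracks)
  {
          int i, did_work = 1;

          while (did_work) {
                  did_work = 0;
                  /* every pass reads the same chunk size per track, so
                     a nearly-empty (slowed-down) track just gets more
                     passes, not one huge read that starves the rest */
                  for (i = 0; i < ntracks; i++) {
                          if (space_available (&tracks[i]) >= CHUNK_FRAMES) {
                                  read_chunk (&tracks[i], CHUNK_FRAMES);
                                  did_work = 1;
                          }
                  }
          }
  }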

>Ok you can use a clean disk and prealloc the files etc, but I want it to
>work on the average disk too)

I can almost guarantee you from my own mistakes that this will not
work.

Until last week, I did not realize that I had stupidly created
ardour's tape directory on a partition that already had a bunch of
stuff in it. I could not, under any circumstances, meet the deadlines
for 24 track playback. Specific numbers: judging from some benchmarks
I ran, it should have been taking about 0.5sec to refill all the
buffers. Sometimes, it would take close to 2secs, or even more!

I would always get an underrun within a couple of seconds. Then at
some point, I was running a test program on the same files, and I
noticed that the seek+read performance really picked up in the last
30secs or so of the file. It was then I remembered that this was a
dirty filesystem, because it occurred to me that this was a section of
the files that had ended up being fairly contiguous. I cleaned the fs
out, remade the filesystem, and recreated ardour's directory and track
files.

Bingo! Almost perfectly predictable performance.

So, it's also possible that your problems are not caused by the buffer
cache at all, but by using fragmented files.

>PS: can you play 38 tracks off your SCSI disk with your actual code ?
> (my crappy IDE disk + my "inefficient" sorting algorithm can do this :-)

When I was printing out precise times for the butler thread, I was
refilling 1.3secs worth of audio for all tracks in about 0.48secs,
reliably. If that scaled, I could probably easily do 48 tracks in
playback-only mode. This is also using a 2-year-old drive with a
maximum effective Linux throughput of about 17MB/sec; measured
within ardour, I get rates of 10MB/sec to 13MB/sec. 48 tracks of
playback requires about 8.8MB/sec, so it seems pretty achievable.
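For reference, the arithmetic behind that figure (assuming 48kHz,
4-byte samples per track - the exact format is my assumption):

  48 tracks * 48000 frames/sec * 4 bytes/frame = 9,216,000 bytes/sec
                                               ~ 8.8 MB/sec (1MB = 2^20)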

I have not tested it, but if I find time (ha, ha, ha), I'll try it out.

>PS3: Paul you said writing was not a problem, while reading was,
>but I see it as the opposite: reading is just fine, and with 1MB /
>track buffers, I can playback 24 tracks while surfing, using gimp etc.

No, my point about writing/reading was merely that if writing is too
erratic in its performance, you just throw more buffering (memory)
at it, and the problem is solved. However, as you are discovering, if
reading is too erratic, there's nothing you can do - you *have* to
have the data there on time.

--p


