Subject: [linux-audio-dev] ardour now operating smoothly with 24 channel playback, etc.
From: Paul Barton-Davis (pbd_AT_Op.Net)
Date: Sat Apr 08 2000 - 05:51:04 EEST
The CVS repository at ardour.sourceforge.net now contains a version of
ardour that smoothly and happily (*) plays back 24 channels of
non-interleaved WAV-format audio from an ext2 filesystem on a single
disk. It records correctly as well.
Because of the new design, I know how to do dynamic x-fade as well, but
that's not implemented at this time. Tomorrow, I'll take it over to the
studio to check its actual performance with a digital output system,
since I haven't done that in a while. My testing right now consists of
playing back the WAV files it generates.
Some comments:
1) there are no longer any ringbuffers in the code
* ringbuffers are bad, because they tend to split the i/o
to/from them into two pieces. When this is just a memcpy, it's
no big deal, but when the data is going to/from a disk, it's
inefficient and measurably slows things down.
2) the buffering system is basically identical to any audio interface
(soundcard). Hey, it works there, so why not use it?
3) You need a minimum of 9MB/sec sustained throughput to record 24
channels (4.4MB/sec to playback 24 channels) at 48kHz.
My experience suggests that if the track files are at all
fragmented, you will not sustain the required disk i/o rate. So,
create them on a fresh partition that doesn't see any other file
creation/deletion activity.
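
As a rough check on the figures in 3), the sketch below computes the raw
data rate for 24 channels at 48kHz. The sample width is not stated above;
4 bytes per sample is an assumption, chosen because it reproduces the
4.4MB/sec playback figure when a MB is taken as 2^20 bytes. Reading the
9MB/sec recording figure as playback plus recording at the same time is
also an assumption. All names are illustrative.

```python
# Assumption (not stated in the post): each sample occupies 4 bytes on disk.
BYTES_PER_SAMPLE = 4
SAMPLE_RATE_HZ = 48_000
CHANNELS = 24
MIB = 1024 * 1024  # the post's "MB" is taken here to mean 2**20 bytes


def stream_rate(channels, rate_hz, bytes_per_sample):
    """Raw data rate in bytes per second for uncompressed audio."""
    return channels * rate_hz * bytes_per_sample


playback_rate = stream_rate(CHANNELS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
# Assumption: recording 24 tracks while playing back 24 doubles the traffic.
record_and_play_rate = 2 * playback_rate

print(f"playback:        {playback_rate / MIB:.2f} MiB/s")
print(f"record+playback: {record_and_play_rate / MIB:.2f} MiB/s")
```

With these assumptions the two lines print 4.39 and 8.79, close to the
4.4MB/sec and 9MB/sec quoted above. With 2-byte samples both figures
would be halved.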
4) I don't know what the actual limit on recording channels is at this
point. It's not very common to track more than 8-12, and I know we
can handle that pretty easily. But the disk write speed and
behaviour is distinct from the read speed, and I need to do more
investigation.
5) It's very important when writing an application like
this to get every little detail correct. Some subtle
points:
* Benno's nice-sounding scheme for handling varispeed does not work
correctly if your disk does not have a *lot* of headroom in its
sustained throughput rate. This is because "fast running" tracks
will require more I/O when the butler thread services them; if
the butler does this blindly, later tracks run the risk of
an underrun.
The correct method is to always fetch the same amount of data for
each track, but then to not pause the butler if some tracks were
not satisfied by that iteration.
* The butler thread needs to be SCHED_FIFO as well, because it is
critical that it not be delayed in scheduling whenever an I/O
request is complete. It should, however, be of lower priority
than the audio thread. It takes my disk about 20ms to read
256kB; if the butler thread were delayed in scheduling, the delay
could be at least this long, and that's equivalent to a noticeable
reduction in disk throughput. Note that it, like the audio thread,
is asleep most of the time.
* Getting the correct disk I/O chunk size is important. Disk
throughput generally goes up as the chunk size increases until we
reach the limit of the disk's capabilities. I have found that
chunks of about 256kB work well with 170msec latency on the card.
If you reduce the chunk size, you will see a decrease in the
throughput rate, so if you want to reduce the latency, you need
to increase the buffering done in user space. I will experiment
with this to see what suitable figures are. The problem is
essentially identical to that of a soundcard's fragment sizes,
but with the added problem that going down to small "fragment
size" (i.e. small chunks of disk i/o) reduces the
throughput. When you do this on a soundcard, the rate is not
affected at all.
* You shouldn't write behind+read ahead for each track. That
causes a lot more slow seeks than doing read ahead for all
tracks, then write behind for all tracks. If there were no latency
in the system, then theoretically, there would be no seeking
between the end of the write behind for a track and the start of
the read ahead. But there is latency, and this means that the
data that gets written to disk does not end where the read ahead
data begins.
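
The two butler rules in 5) (fetch the same amount for every track and
repeat without pausing while any track is unsatisfied; do read ahead for
all tracks before write behind for all tracks) can be sketched as a small
simulation. This is not ardour's code: Track, butler_pass and
butler_service are hypothetical names, disk I/O is replaced by counter
updates, and the audio thread draining buffers between passes is ignored.
Only the 256kB chunk size comes from the text above.

```python
from dataclasses import dataclass

CHUNK = 256 * 1024  # bytes moved per track per pass (figure from the post)


@dataclass
class Track:
    name: str
    capacity: int           # size of the playback buffer, in bytes
    filled: int = 0         # bytes currently buffered for playback
    pending_write: int = 0  # captured bytes waiting to go to disk

    def room(self):
        return self.capacity - self.filled


def butler_pass(tracks, log):
    """One iteration: at most one CHUNK per track. True if all are satisfied."""
    # Read ahead for every track first ...
    for t in tracks:
        if t.room() >= CHUNK:
            t.filled += CHUNK            # stands in for one disk read
            log.append(("read", t.name))
    # ... and only then write behind for every track.
    for t in tracks:
        if t.pending_write >= CHUNK:
            t.pending_write -= CHUNK     # stands in for one disk write
            log.append(("write", t.name))
    return all(t.room() < CHUNK and t.pending_write < CHUNK for t in tracks)


def butler_service(tracks, log, max_passes=100):
    """Run passes back to back (no sleep) until every track is satisfied."""
    passes = 0
    while passes < max_passes:
        passes += 1
        if butler_pass(tracks, log):
            break
    return passes


tracks = [
    Track("normal", capacity=4 * CHUNK, filled=3 * CHUNK, pending_write=CHUNK),
    Track("fast", capacity=4 * CHUNK, filled=1 * CHUNK),  # varispeed: drains faster
]
log = []
passes = butler_service(tracks, log)
print(passes, log)
```

Here the "fast" track needs three chunks, but it gets one per pass like
every other track, so "normal" is serviced before the butler goes back
for more; the butler simply runs three passes in a row instead of
sleeping after the first.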
6) As explained by Linus, using mmap/mlock is not a particularly
useful trick for streaming media. It works well if you can map and
lock something entirely in RAM, which is normally OK for a single
soundfile. But if you have to map/lock/unlock/map fairly
frequently, it's actually slower than using read/write.
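
For the read side of 6), a minimal sketch of streaming a file with plain
reads into one buffer that is allocated once and reused, rather than
mapping and unmapping regions of the file. The function name stream_file
is hypothetical, the 256kB chunk size is the figure mentioned in 5), and
this Python loop only stands in for a read(2) loop; it says nothing about
how ardour itself is written.

```python
import os
import tempfile

CHUNK = 256 * 1024  # chunk size from the post; tune for your own disk


def stream_file(path, chunk_size=CHUNK):
    """Yield successive chunks of a file using unbuffered reads.

    The same buffer is reused for every chunk, so each yielded view is
    only valid until the next iteration.
    """
    buf = bytearray(chunk_size)
    view = memoryview(buf)
    with open(path, "rb", buffering=0) as f:  # buffering=0 gives raw reads
        while True:
            n = f.readinto(view)
            if not n:          # 0 bytes read means end of file
                break
            yield view[:n]


# Small demonstration on a temporary file of 600 KiB.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * (600 * 1024))
    demo_path = tmp.name

total = sum(len(chunk) for chunk in stream_file(demo_path))
os.remove(demo_path)
print(total)
```

Because the buffer is reused, a consumer has to copy or process each
chunk before asking for the next one.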
It's been a long and difficult battle to get this to work, and I thank
everyone on the list for their input. I didn't call it "ardour" for
nothing, you know :)
--regards,
--p
(*) On my system, a dual PII-450 using a 2-year-old Seagate 10000rpm
Ultra-2 SCSI drive.