Re: [linux-audio-dev] HD-recording frustration: linux makes it almost impossible :-(


Subject: Re: [linux-audio-dev] HD-recording frustration: linux makes it almost impossible :-(
From: Roger Larsson (roger.larsson_AT_norran.net)
Date: Mon Apr 10 2000 - 20:49:31 EEST


Hi,

You could also try opening the output file with O_SYNC; the writer
will then block until the data is written - but you would probably need
another thread... (this is close to raw I/O for disk output, but the
write time will include multiple seeks if the file is fragmented...
multiple writer threads might help, since fragmented writes might be
merged)

  read butler - audio processing - write butler
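
For example, a minimal sketch of the write butler in C, assuming a
hypothetical file name, chunk size and ringbuffer interface (rb_get()):

  /* Write butler sketch: pulls audio from a ringbuffer and pushes it
   * to an O_SYNC file descriptor, so this thread eats the seeks, not
   * the audio thread.  rb_get() and the path are hypothetical. */
  #include <fcntl.h>
  #include <unistd.h>

  #define CHUNK (256 * 1024)

  extern size_t rb_get(void *dst, size_t len);  /* assumed rb reader */

  void *write_butler(void *arg)
  {
      /* O_SYNC: write() does not return until the data has reached
       * the disk, so the writer blocks here instead of stalling the
       * audio thread inside the buffer cache. */
      int fd = open("/rec/track0.raw", O_WRONLY | O_CREAT | O_SYNC, 0644);
      static char buf[CHUNK];

      (void)arg;
      for (;;) {
          size_t n = rb_get(buf, CHUNK);
          if (n > 0 && write(fd, buf, n) < 0)
              break;                    /* real code: report the error */
      }
      close(fd);
      return 0;
  }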

/RogerL

Benno Senoner wrote:
>
> On Sun, 09 Apr 2000, Paul Barton-Davis wrote:
> > Silly title.
> :-)
> yes, but the frustration is big, and I think the main problem is the
> updating of metadata. It would be really nice to be able to turn off
> metadata updating completely for selected files. :-)
> I will try playing with the bdflush parameters, as suggested by Karl Millar.
> But I am still unsure whether it is better to update metadata very seldom
> or very frequently. (too seldom = too many queued requests = lots of disk
> seeking)
> >
> > >The problem is when there is a buffer cache flush (kernel disk write
> > >buffers); then things get really ugly. Even with the disk thread
> > >using SCHED_FIFO, it doesn't get rescheduled for 4-8 secs
> > >SOMETIMES (since I am calling read() and write()). ARRGH!
> > >Hopefully SCSI disks will make this better, but on Windoze you can
> > >record painlessly on EIDE disks, therefore HD recording has to work
> > >on Linux on IDE disks as well, or I am forced to call it an UNUSABLE
> > >OS. :-)
> >
> > Linux 2.3 has rawio built in (I think there are 2.2.x patches), so if
> > you want that kind of operation, you should use it. Assuming that you
> > don't have lots of other disk i/o going on that will still require
> > buffer cache flushing, you'll get Windows-like performance from it. It
> > would be nice if there were a filesystem or something layered above
> > rawio, but unfortunately, such a concept doesn't exist in Linux:
> > a filesystem by definition uses the buffer cache. The best you could
> > do is to provide a library that understands the data layout on your
> > raw partition. This is probably the area where BeOS wins the biggest
> > prizes for streaming media applications - several people pointed out
> > on linux-kernel recently that for big data transfers, BeOS will do DMA
> > straight into/from user space. It's not really faster that way, but
> > there's no buffer cache to complicate matters, and you still get the
> > data going to/from a "real filesystem". Sigh. I'm sure someone will
> > implement something like this for Linux at some point.
>
> Yes, a raw I/O a la SGI, where you simply open() with an O_DIRECT flag,
> is really needed. The 2.3 kernel contains the O_DIRECT flag, but it is ignored:
> ./include/asm-i386/fcntl.h:#define O_DIRECT 040000 /* direct disk access hint - currently ignored */
> sigh :-(
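>
> The desired usage would look something like this (only a sketch, since
> 2.3 ignores the flag; the path and alignment constants are assumptions):
>
>   /* Open a file for direct (buffer-cache-bypassing) reads. */
>   #define _GNU_SOURCE             /* for O_DIRECT */
>   #include <fcntl.h>
>   #include <stdlib.h>
>   #include <unistd.h>
>
>   #define ALIGN 4096              /* assumed page/sector alignment */
>   #define CHUNK (256 * 1024)
>
>   int main(void)
>   {
>       int fd = open("/rec/track0.raw", O_RDONLY | O_DIRECT);
>       void *buf;
>
>       /* DMA straight between disk and user space bypasses the buffer
>        * cache, so buffer address and transfer size must be aligned */
>       if (fd < 0 || posix_memalign(&buf, ALIGN, CHUNK) != 0)
>           return 1;
>       read(fd, buf, CHUNK);
>       free(buf);
>       close(fd);
>       return 0;
>   }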
>
> >
> > Having said all that, I suspect that there are also problems with the
> > IDE drivers that exacerbate this issue. Think about it: the data rate
> > to/from the disk is well known. The buffer cache doesn't change it,
> > but it makes it bursty. This isn't a problem with the SCSI drivers,
> > but it's seemed fairly clear from lots of reports by different people
> > with different angles that the IDE drivers cause interrupt blocking
> > for relatively long periods of time, and this can prevent even a
> > SCHED_FIFO thread from getting proper access to the processor. So
> > perhaps it's fair to say that Linux has unusable device drivers for
> > streaming media on IDE devices.
>
> I wouldn't say that the IDE devices are the cause of the problems:
> the lowlatency patches clearly demonstrate that even under very heavy disk
> writing a SCHED_FIFO process gets rescheduled with millisecond precision.
>
> The problem is IMHO the bursty behaviour: during buffer flushing (I suspect
> that metadata is one of the main causes), every application currently
> issuing a read() or write() has to wait SEVERAL seconds, and that leads to
> buffers of 1-2 MB in size being MISSED. :-(
>
> SCSI may (partially) overcome this problem, due to the deeper/better
> request queue, but with this bursty disk I/O Linux doesn't exploit the
> full capabilities of IDE disks. (BeOS, as you said, wins here)
>
> >
> > If you want to continue to use the buffer cache, you'll need to play
> > games either with the kernel or with your hardware configuration to
> > deal with its existence. For example, if you don't want to use rawio,
> > then use a dual CPU machine and use SCSI with more than 1 disk. Using
> > 2 CPU's will allow you to have a much bigger comfort zone when doing
> > this kind of real-time stuff, and using SCSI with more than 1 disk
> > will more or less guarantee throughput.
>
> The 2nd CPU isn't needed, since SCHED_FIFO processes that don't call
> disk I/O routines will run just fine; therefore the audio thread doesn't
> suffer much during heavy disk I/O (about 50-150 ms on normal kernels,
> about 1-3 ms on a lowlatency kernel)
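>
> (For reference, putting a thread into SCHED_FIFO is as simple as this
> minimal sketch; the priority value 50 is arbitrary and root is needed:)
>
>   #include <sched.h>
>
>   int make_realtime(void)
>   {
>       struct sched_param p;
>       p.sched_priority = 50;
>       return sched_setscheduler(0, SCHED_FIFO, &p);  /* 0 = this process */
>   }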
>
> >
> > > Even with 2MB buffer per track we miss a deadline sometimes.
> > >And that is independent of the algorithm, I think Ardour will
> > >encounter the same problems.
> >
> > Given that I just recorded 45 minutes of 4-channel 32/48 audio this
> > morning in the studio, without any problems, and with clear
> > improvements over the Alesis M20 ADATs we have there as well, I think
> > not :) And to be perfectly honest, I don't care if it doesn't work on
> > uniprocessor IDE systems. I'm writing it because I want a professional
> > hard disk recorder at my friend's studio, and in my estimation, a
> > uniprocessor with IDE disks and a preemptive multitasking OS are not
> > the right combination. If you can get it to work that way, good for
> > you.
>
> I know that you can't call IDE hardware "professional", but the hardware
> can get the job done; it's actually Linux that is unable to exploit its
> full capabilities.
> (but I think that either SGI or Stephen Tweedie will implement raw I/O on
> regular filesystems in the not-so-distant future, so that even the poor
> IDE users can get their satisfaction)
> >
> > >Paul, your statement that ringbuffers == bad is wrong, since
> > >the boundary condition occurs very seldom (a small % of cases),
> > >plus I am doing page alignment of read()s and write()s, therefore
> > >it should be pretty well optimized.
> >
> > That's not true as a general case. If in fact i/o to your ringbuffer
> > always takes place with a single i/o operation, then there is little
> > benefit and some overhead in using a ring buffer as opposed to a ring
> > of buffers.
>
> actually I am using the ringbuffer in an optimized fashion:
> the read_ahead() read()s directly into the ringbuffer without any
> additional memcpy(), and I take care not to split a request at the buffer
> boundary (wrap-around crossing), reading in multiples of 4096 bytes
> (PAGE_SIZE).
> The audio thread does the same. (This can be guaranteed by using a buffer
> size which is a power of two and always reading with the same request
> length, again a power of two.)
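>
> Roughly like this sketch (names and sizes made up; a short read()
> would break the alignment invariant and must be handled in real code):
>
>   #include <unistd.h>
>
>   #define RB_SIZE (1 << 20)            /* power of two */
>   #define REQ     (64 * 4096)          /* fixed page-multiple request,
>                                           divides RB_SIZE */
>   struct rb {
>       char          data[RB_SIZE];
>       unsigned long head, tail;        /* free-running counters */
>   };
>
>   ssize_t rb_refill(struct rb *rb, int fd)
>   {
>       if (RB_SIZE - (rb->head - rb->tail) < REQ)
>           return 0;                    /* not enough free space yet */
>
>       /* read() lands directly in the buffer - no extra memcpy().
>        * head stays a multiple of REQ and REQ divides RB_SIZE, so
>        * offset + REQ never crosses the wrap-around point. */
>       ssize_t n = read(fd, rb->data + (rb->head & (RB_SIZE - 1)), REQ);
>       if (n > 0)
>           rb->head += n;
>       return n;
>   }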
>
> >
> > In my experience, if I refill a ringbuffer with the space that's
> > actually available when I come to refill it, as opposed to how much
> > was there when the butler thread woke up, I hit boundary conditions
> > virtually all the time. If I just assume that when the butler wakes
> > up, there is a static amount to refill, then sure, in general refills
> > can be made to miss the boundaries, but then why use a ringbuffer at
> > all? There are virtually no cache benefits because of the amount of
> > data we're talking about.
>
> Actually, hitting the boundary or coming very close doesn't make a big
> difference, because when the buffer is almost full the read size is
> small, leading to a disk performance decrease (for that request);
> therefore the available space will always fluctuate until the system
> reaches a stationary state.
> (A ringbuffer is, in my opinion, easier, not harder, to manage than a
> ring of buffers; plus, if you want, you can read/write any amount of
> data.)
>
> The problem is not having the buffer refilled up to 2k samples or up to
> 10k samples; the problem is when your disk thread gets blocked by the
> damn buffer flushing.
>
> >
> > >PS2: I will demonstrate to Paul that varispeed is PAINLESS (no or
> > >little performance degradation) with my code, (first demo will not contain
> > >varispeed).
> >
> > I didn't say that varispeed was painful. I said that doing it the way
> > you sketched out (determine space per track; sort tracks; refill each
> > track) will not work, or more precisely, it will not work if you are
> > at all close to the limits of the disk throughput. If you are operating
> > comfortably within the disk throughput limit, then sure, it's just
> > fine.
>
> Why will it not work?
> Do you have any alternatives?
> Anyway, the algorithm adapts itself to the actual conditions.
> But I do not sort N tracks and then refill all N tracks in a "most empty
> buffer first" fashion.
>
> I sort, then refill only a few tracks (4 in my case), and then I sort
> again; that means my code reacts very quickly to changed conditions.
> (That means you could drag wildly on a track's pitch control slider
> without confusing the disk thread.)
> The sorting overhead is zero compared to the time the disk thread spends
> in a single read() call.
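>
> In C the idea looks roughly like this (a sketch, not my actual code;
> track_space() and track_refill() are hypothetical):
>
>   #include <stdlib.h>
>
>   #define NTRACKS      38
>   #define REFILL_BATCH 4        /* re-sort after this many refills */
>
>   extern long track_space(int t);   /* free bytes in track t's buffer */
>   extern void track_refill(int t);  /* one page-aligned read() for t */
>
>   static int order[NTRACKS];
>
>   static int by_most_empty(const void *a, const void *b)
>   {
>       /* emptiest buffer (most free space) first */
>       long d = track_space(*(const int *)b) - track_space(*(const int *)a);
>       return (d > 0) - (d < 0);
>   }
>
>   void disk_thread_loop(void)
>   {
>       int i, t;
>
>       for (t = 0; t < NTRACKS; t++)
>           order[t] = t;
>
>       for (;;) {
>           /* sorting is cheap next to a single read(), and re-sorting
>            * every few refills reacts quickly to varispeed changes */
>           qsort(order, NTRACKS, sizeof order[0], by_most_empty);
>           for (i = 0; i < REFILL_BATCH; i++)
>               track_refill(order[i]);
>       }
>   }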
>
> Anyway, using sequential refill doesn't buy us anything over the sorting
> method, since you don't know where the data actually resides (physically)
> on the disk (fragmentation, block allocation, etc.).
> (OK, you can use a clean disk, preallocate the files, etc., but I want it
> to work on the average disk too)
>
> PS: can you play 38 tracks off your SCSI disk with your current code?
> (my crappy IDE disk + my "inefficient" sorting algorithm can do this :-)
>
> PS2: I am making gnuplot graphs of the available buffer space during disk
> I/O, and it's amazing how many scheduling "holes" the disk thread faces
> during kernel buffer flushing. (I will publish these too, maybe stressing
> a few kernel folks as well :-) )
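>
> (The measurement itself can be as simple as this sketch - the disk
> thread appends "seconds.microseconds free-bytes" lines that gnuplot can
> plot directly with: plot "space.log" using 1:2 with lines)
>
>   #include <stdio.h>
>   #include <sys/time.h>
>
>   extern long track_space(int t);   /* hypothetical, as above */
>
>   void log_space(FILE *log, int track)
>   {
>       struct timeval tv;
>       gettimeofday(&tv, NULL);
>       fprintf(log, "%ld.%06ld %ld\n",
>               (long)tv.tv_sec, (long)tv.tv_usec, track_space(track));
>   }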
>
> PS3: Paul, you said writing was not a problem while reading was,
> but I see it as exactly the opposite: reading is just fine, and with
> 1 MB per-track buffers, I can play back 24 tracks while surfing, using
> the GIMP, etc.
>
> Benno.

--
Home page:
  http://www.norran.net/nra02596/


