Subject: [linux-audio-dev] lowish-lat for 2.4.0-test10-pre3
From: Andrew Morton (andrewm_AT_uow.edu.au)
Date: Sat Oct 21 2000 - 16:58:00 EEST
Patch is at
http://www.uow.edu.au/~andrewm/linux/2.4.0-test10-pre3-low-latency.patch
Changes:
- Un-inlined the set_current_state()/schedule() code. Saves
a couple of hundred bytes.
- Simplified some code in dcache.c - the performance benefit
wasn't worth the ugliness.
- Put a couple of rescheduling points in the /proc files. This
is because reading /proc/meminfo is part of Benno's test suite :)
- There were some complaints about the decision to disable this
patch for SMP. So it can be reenabled via CONFIG_LOLAT_SMP (Kernel
Hacking menu).
But no promises here at all. It may seem to work, but if you
start getting heavy spinlock contention in the kernel it
goes bad.
- Fixed (I believe) the reschedule and signal race in
arch/i386/kernel/entry.S. This is the only x86-specific
part of this patch.
I simply stuck a `cli' in there:
--- linux-2.4.0-test10-pre3/arch/i386/kernel/entry.S Sat Oct 14 17:02:03 2000
+++ linux-akpm/arch/i386/kernel/entry.S Sat Oct 21 22:15:47 2000
@@ -215,21 +215,27 @@
 	jne handle_softirq
 ret_with_reschedule:
-	cmpl $0,need_resched(%ebx)
-	jne reschedule
-	cmpl $0,sigpending(%ebx)
-	jne signal_return
+	cli
+	movl need_resched(%ebx),%eax
+	orl sigpending(%ebx),%eax
+	jne signal_or_resched
 restore_all:
 	RESTORE_ALL
 	ALIGN
-signal_return:
+signal_or_resched:
+	cmpl $0,need_resched(%ebx)
+	jne reschedule
+	# Must be a pending signal
 	sti		# we can get here from an interrupt handler
 	testl $(VM_MASK),EFLAGS(%esp)
 	movl %esp,%eax
 	jne v86_signal_return
 	xorl %edx,%edx
 	call SYMBOL_NAME(do_signal)
+	cli
+	cmpl $0,need_resched(%ebx)
+	jne reschedule
 	jmp restore_all
 	ALIGN
@@ -285,6 +291,7 @@
 	ALIGN
 reschedule:
+	sti
 	call SYMBOL_NAME(schedule)	# test
 	jmp ret_from_sys_call
On the Mendocino the `cli' adds ~15 cycles to system calls. The
cunning removal of a conditional jump from the syscall and interrupt
fastpath reduces this to 13 cycles.
So `getpid()' now takes 1.005x as long as it used to. So shoot me.
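The post does not spell out the race, so the following is only one plausible reading of it, written as a toy user-space model in C rather than kernel code. It assumes that an interrupt taken while already in kernel mode returns without re-testing need_resched, so a wakeup that lands after the old check is not acted on before the return to user mode. The names `task_model`, `irq_point`, `old_path_loses_resched` and `new_path_loses_resched` are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>

/* Where a simulated interrupt lands relative to the exit-path checks. */
enum irq_point { IRQ_BEFORE_CHECKS, IRQ_BETWEEN_CHECKS, IRQ_AFTER_CHECKS };

/* Hypothetical stand-in for the two task flags tested in entry.S. */
struct task_model {
    int need_resched;
    int sigpending;
};

/* Old path: both checks run with interrupts enabled.
 * Assumption: an interrupt taken in kernel mode sets need_resched
 * but its return path does not test the flag again.
 * Returns true if user mode is reached with need_resched still set. */
bool old_path_loses_resched(enum irq_point when)
{
    struct task_model t = { 0, 0 };

    if (when == IRQ_BEFORE_CHECKS)
        t.need_resched = 1;
    if (t.need_resched != 0)        /* cmpl $0,need_resched ; jne reschedule */
        t.need_resched = 0;         /* schedule() ran and consumed the request */
    if (when == IRQ_BETWEEN_CHECKS)
        t.need_resched = 1;
    if (t.sigpending != 0)          /* cmpl $0,sigpending ; jne signal_return */
        t.sigpending = 0;           /* never taken: no signal in this model */
    if (when == IRQ_AFTER_CHECKS)
        t.need_resched = 1;
    /* RESTORE_ALL: back to user mode */
    return t.need_resched != 0;
}

/* Interrupt taken from user mode: its return path does test the flag. */
static void irq_from_user_mode(struct task_model *t)
{
    t->need_resched = 1;            /* handler wakes a higher-priority task */
    if (t->need_resched != 0)
        t->need_resched = 0;        /* schedule() runs before user code resumes */
}

/* Patched path: cli first, so a later interrupt stays pending until
 * RESTORE_ALL re-enables interrupts on the way back to user mode. */
bool new_path_loses_resched(enum irq_point when)
{
    struct task_model t = { 0, 0 };
    bool irq_deferred = false;

    if (when == IRQ_BEFORE_CHECKS)
        t.need_resched = 1;         /* arrives before cli, handled at once */
    else
        irq_deferred = true;        /* arrives after cli, held pending */

    if ((t.need_resched | t.sigpending) != 0) {  /* movl ; orl ; jne */
        if (t.need_resched != 0)
            t.need_resched = 0;     /* reschedule: sti ; schedule() */
    }

    if (irq_deferred)
        irq_from_user_mode(&t);     /* delivered once interrupts are back on */

    return t.need_resched != 0;
}
```

Under these assumptions the old path loses the request whenever the interrupt lands after the need_resched test, while the patched path never does. The model covers only the reschedule side and leaves out signal delivery.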
- Some testing results on 2.4.0-test9.
These tests were run with the separate tcp_minisocks patch applied,
because the reaping of TIME_WAIT sockets otherwise gets in the way
when running lmbench.
See http://www.uow.edu.au/~andrewm/linux/schedlat.html#ddt
Machine:  566 MHz UP Celeron, 256M RAM, UDMA66
Workload: 2 instances of bonnie++
          2 instances of lmbench
          1 instance of mmap001, mmap002, mmap001, mmap002, ...
          1 instance of netperf
          1 SCHED_FIFO process handling a 1024 Hz signal stream
After 20 hours the latency histogram was:
 0-1 milliseconds:  ~78,000,000
 1-2 milliseconds:  3
 5-8 milliseconds:  8  (kmem_cache_reap->kmem_slab_destroy->avl_remove)
9-10 milliseconds:  3  (ide_intr)
The kmem_cache_reap thing is due to the insane number of
inodes and dentries which bonnie++ leaves around. It doesn't
happen normally.
Not sure why the IDE ISR had those glitches.
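As a rough sanity check on the histogram, the arithmetic can be sketched in C. The helper names `expected_samples` and `tail_ppm` are made up for this illustration, and the inputs are the approximate figures quoted above (1024 Hz, 20 hours, ~78,000,000 samples in the first bucket), so the results are only indicative.

```c
/* Signals expected from a fixed-rate source over a whole number of hours. */
long expected_samples(long rate_hz, long hours)
{
    return rate_hz * hours * 3600L;
}

/* Samples slower than 1 ms, in parts per million of all samples. */
double tail_ppm(long tail, long total)
{
    return 1e6 * (double)tail / (double)total;
}
```

expected_samples(1024, 20) gives 73,728,000, the same order as the ~78,000,000 reported, which fits given that both the run length and the bucket count are approximate. The three slow buckets hold 3 + 8 + 3 = 14 samples, so tail_ppm(14, 78000014) comes to roughly 0.18 samples per million above 1 ms.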