author     rtm <rtm>  2006-08-15 22:18:20 +0000
committer  rtm <rtm>  2006-08-15 22:18:20 +0000
commit     350e63f7a9b1be695c0cf69e380bd96733524f25 (patch)
tree       782259566c4596eef115590ff3e054fc7c5f3718 /Notes
parent     69332d1918fda38b25fc3ec8c786d16bb17e9e68 (diff)
download   xv6-labs-350e63f7a9b1be695c0cf69e380bd96733524f25.tar.gz
           xv6-labs-350e63f7a9b1be695c0cf69e380bd96733524f25.tar.bz2
           xv6-labs-350e63f7a9b1be695c0cf69e380bd96733524f25.zip
no more proc[] entry per cpu for idle loop
each cpu[] has its own gdt and tss
no per-proc gdt or tss, re-write cpu's in scheduler (you win, cliff)
main0() switches to cpu[0].mpstack
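The per-cpu layout this commit moves toward (each cpu[] entry carrying its own gdt, tss, and the mpstack that main0() switches onto) can be sketched roughly as follows. The field names, array sizes, and stand-in descriptor types here are illustrative assumptions, not the actual xv6 definitions:

```c
#include <stdint.h>

/* Hypothetical sketch: one gdt and one tss per cpu, not per process,
   plus the per-cpu mpstack used by startup and scheduler(). */

#define NSEGS   6
#define MPSTACK 512

struct segdesc   { uint32_t lo, hi; };   /* stand-in for an x86 descriptor */
struct taskstate { uint32_t ss0, esp0; };/* just the kernel-stack fields */

struct cpu {
  uint8_t mpstack[MPSTACK];   /* per-cpu stack for scheduler()/startup */
  struct taskstate ts;        /* one tss per cpu */
  struct segdesc gdt[NSEGS];  /* one gdt per cpu, rewritten in scheduler */
};

struct cpu cpus[8];

/* On each process switch the scheduler refreshes this cpu's tss so that
   traps land on the chosen process's kernel stack. */
void set_kstack(struct cpu *c, uint32_t kstack_top) {
  c->ts.esp0 = kstack_top;
}
```

Rewriting the cpu's own gdt/tss in the scheduler avoids the `ltr` busy-bit fault and the stale-gdt problems described in the notes below.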
Diffstat (limited to 'Notes')
-rw-r--r--  Notes  285
1 file changed, 22 insertions(+), 263 deletions(-)
@@ -22,32 +22,14 @@ no kernel malloc(), just kalloc() for user core
 user pointers aren't valid in the kernel
 
-setting up first process
-  we do want a process zero, as template
-    but not runnable
-  just set up return-from-trap frame on new kernel stack
-  fake user program that calls exec
-
-map text read-only?
-shared text?
-
-what's on the stack during a trap or sys call?
-  PUSHA before scheduler switch? for callee-saved registers.
-  segment contents?
-  what does iret need to get out of the kernel?
-  how does INT know what kernel stack to use?
-
-are interrupts turned on in the kernel? probably.
-
-per-cpu curproc
-one tss per process, or one per cpu?
-one segment array per cpu, or per process?
+are interrupts turned on in the kernel? yes.
 
 pass curproc explicitly, or implicit from cpu #?
   e.g. argument to newproc()?
   hmm, you need a global curproc[cpu] for trap() &c
 
-test stack expansion
+no stack expansion
+
 test running out of memory, process slots
 
 we can't really use a separate stack segment, since stack addresses
@@ -56,16 +38,6 @@ data vs text. how can we have a gap between data and stack, so that
 both can grow, without committing 4GB of physical memory?
 does this mean we need paging?
 
-what's the simplest way to add the paging we need?
-  one page table, re-write it each time we leave the kernel?
-  page table per process?
-  probably need to use 0-0xffffffff segments, so that
-    both data and stack pointers always work
-  so is it now worth it to make a process's phys mem contiguous?
-  or could use segment limits and 4 meg pages?
-    but limits would prevent using stack pointers as data pointers
-  how to write-protect text? not important?
-
 perhaps have fixed-size stack, put it in the data segment?
 
 oops, if kernel stack is in contiguous user phys mem, then moving
@@ -87,19 +59,6 @@ test children being inherited by grandparent &c
 
 some sleep()s should be interruptible by kill()
 
-cli/sti in acquire/release should nest!
-  in case you acquire two locks
-
-what would need fixing if we got rid of kernel_lock?
-  console output
-  proc_exit() needs lock on proc *array* to deallocate
-  kill() needs lock on proc *array*
-  allocator's free list
-  global fd table (really free-ness)
-  sys_close() on fd table
-  fork on proc list, also next pid
-  hold lock until public slots in proc struct initialized
-
 locks
   init_lock
     sequences CPU startup
@@ -110,37 +69,17 @@ locks
   memory allocator
   printf
 
-wakeup needs proc_table_lock
-  so we need recursive locks?
-  or you must hold the lock to call wakeup?
-
 in general, the table locks protect both
   free-ness and public variables of table elements
 in many cases you can use table elements w/o a lock
   e.g. if you are the process, or you are using an fd
 
-lock code shouldn't call cprintf...
-
-nasty hack to allow locks before first process,
-  and to allow them in interrupts when curproc may be zero
-
-race between release and sleep in sys_wait()
-race between sys_exit waking up parent and setting state=ZOMBIE
-race in pipe code when full/empty
-
 lock order
   per-pipe lock
   proc_table_lock fd_table_lock kalloc_lock console_lock
 
-condition variable + mutex that protects it
-  proc * (for wait()), proc_table_lock
-  pipe structure, pipe lock
-
-systematic way to test sleep races?
-  print something at the start of sleep?
-
-do you have to be holding the mutex in order to call wakeup()?
+do you have to be holding the mutex in order to call wakeup()? yes
 
 device interrupts don't clear FL_IF
   so a recursive timer interrupt is possible
@@ -156,202 +95,11 @@ inode->count counts in-memory pointers to the struct
 blocks and inodes have ad-hoc sleep-locks
   provide a single mechanism?
 
-need to lock bufs in bio between bread and brelse
-
 test 14-character file names
 and file arguments longer than 14
-and directories longer than one sector
 
 kalloc() can return 0; do callers handle this right?
 
-why directing interrupts to cpu 1 causes trouble
-  cpu 1 turns on interrupts with no tss!
-    and perhaps a stale gdt (from boot)
-    since it has never run a process, never called setupsegs()
-  but does cpu really need the tss?
-    not switching stacks
-  fake process per cpu, just for tss?
-    seems like a waste
-  move tss to cpu[]?
-    but tss points to per-process kernel stack
-    would also give us a gdt
-  OOPS that wasn't the problem
-
-wait for other cpu to finish starting before enabling interrupts?
-  some kind of crash in ide_init ioapic_enable cprintf
-move ide_init before mp_start?
-  didn't do any good
-  maybe cpu0 taking ide interrupt, cpu1 getting a nested lock error
-
-cprintfs are screwed up if locking is off
-  often loops forever
-  hah, just use lpt alone
-
-looks like cpu0 took the ide interrupt and was the last to hold
-the lock, but cpu1 thinks it is nested
-cpu0 is in load_icode / printf / cons_putc
-  probably b/c cpu1 cleared use_console_lock
-cpu1 is in scheduler() / printf / acquire
-
- 1: init timer
- 0: init timer
- cpu 1 initial nlock 1
- ne0s:t iidd el_occnkt rc
- onsole cpu 1 old caller stack 1001A5 10071D 104DFF 1049FE
- panic: acquire
- ^CNext at t=33002418
- (0) [0x00100091] 0008:0x00100091 (unk. ctxt): jmp .+0xfffffffe ; ebfe
- (1) [0x00100332] 0008:0x00100332 (unk. ctxt): jmp .+0xfffffffe
-
-why is output interleaved even before panic?
-
-does release turn on interrupts even inside an interrupt handler?
-
-overflowing cpu[] stack?
-  probably not, change from 512 to 4096 didn't do anything
-
- 1: init timer
- 0: init timer
- cnpeus te11 linnitki aclo nnoolleek cp1u
- ss oarltd sccahleldeul esrt aocnk cpu 0111 Ej6 buf1 01A3140 C5118
- 0
- la anic1::7 0a0c0 uuirr e
- ^CNext at t=31691050
- (0) [0x00100373] 0008:0x00100373 (unk. ctxt): jmp .+0xfffffffe ; ebfe
- (1) [0x00100091] 0008:0x00100091 (unk. ctxt): jmp .+0xfffffffe ; ebfe
-
-cpu0:
-
-0: init timer
-nested lock console cpu 0 old caller stack 1001e6 101a34 1 0
-  (that's mpmain)
-panic: acquire
-
-cpu1:
-
-1: init timer
-cpu 1 initial nlock 1
-start scheduler on cpu 1 jmpbuf ...
-la 107000 lr ...
-  that is, nlock != 0
-
-maybe a race; acquire does
-  locked = 1
-  cpu = cpu()
-what if another acquire calls holding w/ locked = 1 but
-  before cpu is set?
-
-if I type a lot (kbd), i get a panic
-cpu1 in scheduler: panic "holding locks in scheduler"
-cpu0 also in the same panic!
-recursive interrupt?
-  FL_IF is probably set during interrupt... is that correct?
-again:
-  olding locks in scheduler
-  trap v 33 eip 100ED3 c  (that is, interrupt while holding a lock)
-  100ed3 is in lapic_write
-again:
-  trap v 33 eip 102A3C cpu 1 nlock 1  (in acquire)
-  panic: interrupt while holding a lock
-again:
-  trap v 33 eip 102A3C cpu 1 nlock 1
-  panic: interrupt while holding a lock
-OR is it the cprintf("kbd overflow")?
-  no, get panic even w/o that cprintf
-OR a release() at interrupt time turns interrupts back on?
-  of course i don't think they were off...
-OK, fixing trap.c to make interrupts turn off FL_IF
-  that makes it take longer, but still panics
-  (maybe b/c release sets FL_IF)
-
-shouldn't something (PIC?) prevent recursive interrupts of same IRQ?
-  or should FL_IF be clear during all interrupts?
-
-maybe acquire should remember old FL_IF value, release should restore
-  if acquire did cli()
-
-DUH the increment of nlock in acquire() happens before the cli!
-  so the panic is probably not a real problem
-  test nlock, cli(), then increment?
-
-BUT now userfs doesn't do the final cat README
-
-AND w/ cprintf("kbd overflow"), panic holding locks in scheduler
-  maybe also simulataneous panic("interrupt while holding a lock")
-
-again (holding down x key):
-  kbd overflow
-  kbd oaaniicloowh
-  olding locks in scheduler
-  trap v 33 eip 100F5F c^CNext at t=32166285
-  (0) [0x0010033e] 0008:0010033e (unk. ctxt): jmp .+0xfffffffe (0x0010033e) ; ebfe
-  (1) [0x0010005c] 0008:0010005c (unk. ctxt): jmp .+0xfffffffe (0x0010005c) ; ebfe
-cpu0 paniced due to holding locks in scheduler
-cpu1 got panic("interrupt while holding a lock")
-  again in lapic_write.
-  while re-enabling an IRQ?
-
-again:
-cpu 0 panic("holding locks in scheduler")
-  but didn't trigger related panics earlier in scheduler or sched()
-  of course the panic is right after release() and thus sti()
-  so we may be seeing an interrupt that left locks held
-cpu 1 unknown panic
-why does it happen to both cpus at the same time?
-
-again:
-cpu 0 panic("holding locks in scheduler")
-  but trap() didn't see any held locks on return
-cpu 1 no apparent panic
-
-again:
-cpu 0 panic: holding too many locks in scheduler
-cpu 1 panic: kbd_intr returned while holding a lock
-
-again:
-cpu 0 panic: holding too man
-  la 10d70c lr 10027b
-  those don't seem to be locks...
-  only place non-constant lock is used is sleep()'s 2nd arg
-  maybe register not preserved across context switch?
-  it's in %esi...
-  sched() doesn't touch %esi
-  %esi is evidently callee-saved
-  something to do with interrupts? since ordinarily it works
-cpu 1 panic: kbd_int returned while holding a lock
-  la 107340 lr 107300
-  console_lock and kbd_lock
-
-maybe console_lock is often not released due to change
-  in use_console_lock (panic on other cpu)
-
-again:
-cpu 0: panic: h...
-  la 10D78C lr 102CA0
-cpu 1: panic: acquire FL_IF  (later than cpu 0)
-
-but if sleep() were acquiring random locks, we'd see panics
-in release, after sleep() returned.
-actually when system is idle, maybe no-one sleeps at all.
-  just scheduler() and interrupts
-
-questions:
-  does userfs use pipes? or fork?
-    no
-  does anything bad happen if process 1 exits? eg exit() in cat.c
-    looks ok
-  are there really no processes left?
-  lock_init() so we can have a magic number?
-
-HMM maybe the variables at the end of struct cpu are being overwritten
-  nlocks, lastacquire, lastrelease
-  by cpu->stack?
-  adding junk buffers maybe causes crash to take longer...
-  when do we run on cpu stack?
-    just in scheduler()?
-    and interrupts from scheduler()
-  OH! recursive interrupts will use up any amount of cpu[].stack!
-    underflow and wrecks *previous* cpu's struct
@@ -360,15 +108,26 @@ mkdir
 sh arguments
 sh redirection
 indirect blocks
 
-two bugs in unlink: don't just return if nlink > 0,
-  and search for name, not inum
 
 is there a create/create race for same file name?
   resulting in two entries w/ same name in directory?
 
+why does shell often ignore first line of input?
+
 test: one process unlinks a file while another links to it
-test: simultaneous create of same file
 test: one process opens a file while another deletes it
-
-wdir should use writei, to avoid special-case block allocation
-  also readi
-  is dir locked? probably
+test: mkdir. deadlock d/.. vs ../d
+
+make proc[0] runnable
+cpu early tss and gdt
+how do we get cpu0 scheduler() to use mpstack, not proc[0].kstack?
+when iget() first sleeps, where does it longjmp to?
+maybe set up proc[0] to be runnable, with entry proc0main(), then
+  have main() call scheduler()?
+  perhaps so proc[0] uses right kstack?
+  and scheduler() uses mpstack?
+ltr sets the busy bit in the TSS, faults if already set
+  so gdt and TSS per cpu?
+  we don't want to be using some random process's gdt when it changes it.
+maybe get rid of per-proc gdt and tss
+  one per cpu
+  refresh it when needed
+  setupsegs(proc *)
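The cli/sti problems the notes circle ("cli/sti in acquire/release should nest", "maybe acquire should remember old FL_IF value", "DUH the increment of nlock in acquire() happens before the cli") converge on the scheme xv6 later adopted as pushcli/popcli: disable interrupts before touching any per-cpu bookkeeping, save the caller's FL_IF on the outermost cli, and restore it only when the outermost release happens. A minimal single-threaded model, with a plain variable standing in for the hardware EFLAGS.IF bit and no real cli/sti:

```c
#include <assert.h>

static int fl_if = 1;       /* simulated interrupt-enable flag (EFLAGS.IF) */
static int ncli = 0;        /* depth of pushcli nesting */
static int saved_if = 0;    /* FL_IF value before the first cli */

void pushcli(void) {
  int was = fl_if;
  fl_if = 0;                /* cli first, before touching shared state --
                               fixes the nlock-before-cli window */
  if (ncli == 0)
    saved_if = was;         /* remember caller's interrupt state */
  ncli++;
}

void popcli(void) {
  assert(ncli > 0);
  if (--ncli == 0 && saved_if)
    fl_if = 1;              /* sti only when the outermost hold ends */
}
```

Because release() only re-enables interrupts when FL_IF was set before the first acquire, a release() inside an interrupt handler no longer turns interrupts back on, which is the failure mode several of the panics above suggest.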
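The "race between release and sleep in sys_wait()" and the answer "do you have to be holding the mutex in order to call wakeup()? yes" describe the same lost-wakeup window: sleep() tests its condition and commits to sleeping in two steps, and if wakeup() runs between them the sleeper is never woken. A deterministic single-threaded model of the window, with entirely hypothetical names (not xv6 code), splitting sleep() into its two steps so the interleavings can be replayed by hand:

```c
#include <assert.h>

enum pstate { RUNNING, SLEEPING };

static enum pstate state = RUNNING;
static int condition = 0;   /* e.g. "a child has exited" */
static int saw_false;       /* result of the sleeper's condition test */

/* step 1 of sleep(): test the condition */
static void sleep_check(void)  { saw_false = (condition == 0); }

/* step 2 of sleep(): actually go to sleep if the test said to */
static void sleep_commit(void) { if (saw_false) state = SLEEPING; }

/* wakeup(): make the condition true and wake any committed sleeper */
static void wakeup(void) {
  condition = 1;
  if (state == SLEEPING)
    state = RUNNING;
}

static void reset(void) { state = RUNNING; condition = 0; saw_false = 0; }
```

Requiring wakeup() callers to hold the same lock (proc_table_lock) that sleep() holds across check and commit makes the pair atomic, so the bad interleaving below cannot occur.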