author | Austin Clements <[email protected]> | 2011-09-07 11:49:14 -0400
committer | Austin Clements <[email protected]> | 2011-09-07 11:49:14 -0400
commit | 01a6c054d548d9fff8bbdfac4d3f3de4ae8677a1 (patch)
tree | 4320eb3d09f31f4a628b80d45482a72ee7c3956b /web/l-lock.html
parent | 64a03bd7aa5c03a626a2da4730a45fcceea75322 (diff)
download | xv6-labs-01a6c054d548d9fff8bbdfac4d3f3de4ae8677a1.tar.gz, xv6-labs-01a6c054d548d9fff8bbdfac4d3f3de4ae8677a1.tar.bz2, xv6-labs-01a6c054d548d9fff8bbdfac4d3f3de4ae8677a1.zip
Remove web directory; all cruft or moved to 6.828 repo
Diffstat (limited to 'web/l-lock.html')
-rw-r--r-- | web/l-lock.html | 322
1 file changed, 0 insertions, 322 deletions
diff --git a/web/l-lock.html b/web/l-lock.html
deleted file mode 100644
index eea8217..0000000
--- a/web/l-lock.html
+++ /dev/null
@@ -1,322 +0,0 @@
-<title>L7</title>
-<html>
-<head>
-</head>
-<body>
-
-<h1>Locking</h1>
-
-<p>Required reading: spinlock.c
-
-<h2>Why coordinate?</h2>
-
-<p>Mutual-exclusion coordination is an important topic in operating systems, because many operating systems run on multiprocessors. Coordination techniques protect variables that are shared among multiple threads and updated concurrently. These techniques allow programmers to implement atomic sections so that one thread can safely update the shared variables without having to worry that another thread will intervene. For example, processes in xv6 may run concurrently on different processors and, in kernel mode, share kernel data structures. We must ensure that these updates happen correctly.
-
-<p>List and insert example:
-<pre>
-struct List {
-  int data;
-  struct List *next;
-};
-
-List *list = 0;
-
-insert(int data) {
-  List *l = new List;
-  l->data = data;
-  l->next = list; // A
-  list = l;       // B
-}
-</pre>
-
-<p>What needs to be atomic? The two statements labeled A and B should always be executed together, as an indivisible fragment of code. If two processors execute A and B interleaved, then we end up with an incorrect list. To see that this is the case, draw out the list after the sequence A1 (statement A executed by processor 1), A2 (statement A executed by processor 2), B2, and B1.
-
-<p>How could this erroneous sequence happen? The variable <i>list</i> lives in physical memory shared among multiple processors, connected by a bus. The accesses to the shared memory will be ordered in some total order by the bus/memory system. If the programmer doesn't coordinate the execution of statements A and B, any order can happen, including the erroneous one.
-
-<p>The erroneous case is called a race condition.
The problem with races is that they are difficult to reproduce. For example, if you put print statements in to debug the incorrect behavior, you might change the timing and the race might not happen anymore.
-
-<h2>Atomic instructions</h2>
-
-<p>The programmer must be able to express that A and B should be executed as a single atomic unit. We generally use a concept like locks to mark an atomic region, acquiring the lock at the beginning of the section and releasing it at the end:
-
-<pre>
-void acquire(int *lock) {
-  while (TSL(lock) != 0)
-    ;
-}
-
-void release(int *lock) {
-  *lock = 0;
-}
-</pre>
-
-<p>Acquire and release, of course, need to be atomic too, which can, for example, be done with a hardware atomic TSL (test-and-set-lock) instruction.
-
-<p>The semantics of TSL are:
-<pre>
-  R <- [mem]   // load content of mem into register R
-  [mem] <- 1   // store 1 in mem
-</pre>
-
-<p>In a hardware implementation, the bus arbiter guarantees that both the load and the store are executed without any other loads or stores coming in between.
-
-<p>We can use locks to implement an atomic insert, or we can use TSL directly:
-<pre>
-int insert_lock = 0;
-
-insert(int data) {
-  /* acquire the lock: */
-  while (TSL(&insert_lock) != 0)
-    ;
-
-  /* critical section: */
-  List *l = new List;
-  l->data = data;
-  l->next = list;
-  list = l;
-
-  /* release the lock: */
-  insert_lock = 0;
-}
-</pre>
-
-<p>It is the programmer's job to make sure that locks are respected. If a programmer writes another function that manipulates the list, the programmer must make sure that the new function acquires and releases the appropriate lock. If the programmer doesn't, race conditions occur.
-
-<p>This code assumes that stores commit to memory in program order and that all stores by other processors started before insert got the lock are observable by this processor. That is, after the other processor has released the lock, all its previous stores are committed to memory.
If a processor executes instructions out of order, this assumption won't hold, and we must, for example, insert a barrier instruction that makes the assumption true.
-
-<h2>Example: Locking on x86</h2>
-
-<p>Here is one way we can implement acquire and release using the x86 xchgl instruction:
-
-<pre>
-struct Lock {
-  unsigned int locked;
-};
-
-acquire(Lock *lck) {
-  while (TSL(&(lck->locked)) != 0)
-    ;
-}
-
-release(Lock *lck) {
-  lck->locked = 0;
-}
-
-int
-TSL(int *addr)
-{
-  register int content = 1;
-  // xchgl content, *addr
-  // xchgl exchanges the values of its two operands, while
-  // locking the memory bus to exclude other operations.
-  asm volatile ("xchgl %0,%1" :
-                "=r" (content),
-                "=m" (*addr) :
-                "0" (content),
-                "m" (*addr));
-  return content;
-}
-</pre>
-
-<p>The instruction "xchgl %eax, (content)" works as follows:
-<ol>
-<li> freeze other CPUs' memory activity
-<li> temp := content
-<li> content := %eax
-<li> %eax := temp
-<li> un-freeze other CPUs
-</ol>
-
-<p>Steps 1 and 5 are what make xchgl special: it is "locked", using special signal lines on the inter-CPU bus for bus arbitration.
-
-<p>This implementation doesn't scale to a large number of processors; in a later lecture we will see how we could do better.
-
-<h2>Lock granularity</h2>
-
-<p>Acquire/release is ideal for short atomic sections: increment a counter, search in the i-node cache, allocate a free buffer.
-
-<p>What are spin locks not so great for? Long atomic sections may waste waiters' CPU time, and it is unsafe to sleep while holding a spin lock. In xv6 we try to avoid long atomic sections by careful coding (can you find an example?). xv6 doesn't release the processor when holding a lock, but it has an additional set of coordination primitives (sleep and wakeup), which we will study later.
-
-<p>My list_lock protects all lists; inserts to different lists are blocked.
A lock per list would waste less time spinning, so you might want "fine-grained" locks, one for every object. BUT acquire/release are expensive (about 500 cycles on my 3 GHz machine) because they need to talk off-chip.
-
-<p>Also, "correctness" is not that simple with fine-grained locks if we need to maintain global invariants; e.g., "every buffer must be on exactly one of the free list and the device list". Per-list locks are irrelevant for this invariant. So you might want "large-grained" locks, which reduce overhead but also reduce concurrency.
-
-<p>This tension is hard to get right. One often starts out with "large-grained" locks and measures the performance of the system on some workloads. When more concurrency is desired (to get better performance), an implementor may switch to a more fine-grained scheme. Operating system designers fiddle with this all the time.
-
-<h2>Recursive locks and modularity</h2>
-
-<p>When designing a system we desire clean abstractions and good modularity. We would like a caller not to have to know how a callee implements a particular function. Locks make achieving modularity more complicated. For example, what do we do when the caller holds a lock, then calls a function that also needs the lock to perform its job?
-
-<p>There are no transparent solutions that allow the caller and callee to be unaware of which locks they use. One transparent, but unsatisfactory, option is recursive locks: if a callee asks for a lock that its caller already holds, then we allow the callee to proceed. Unfortunately, this solution is not ideal either.
-
-<p>Consider the following. If lock x protects the internals of some struct foo, then when the caller acquires lock x, it knows that the internals of foo are in a sane state and it can fiddle with them. The caller must then restore the internals to a sane state before releasing lock x, but until then anything goes.
-
-<p>This assumption doesn't hold with recursive locking.
After acquiring lock x, the acquirer knows only that either it is the first to get the lock, in which case the internals are in a sane state, or some function up the call stack already holds the lock, may have messed up the internals, and didn't realize when calling down that the callee was going to look at them too. So the fact that a function acquired lock x doesn't guarantee anything at all. In short, locks protect against callers and callees just as much as they protect against other threads.
-
-<p>Since transparent solutions aren't ideal, it is better to consider locks part of the function's specification. The programmer must arrange that a caller doesn't invoke another function while holding a lock that the callee also needs.
-
-<h2>Locking in xv6</h2>
-
-<p>xv6 runs on a multiprocessor and is programmed to allow multiple threads of computation to run concurrently. In xv6 an interrupt might run on one processor while a process in kernel mode runs on another, sharing a kernel data structure with the interrupt routine. xv6 uses locks, implemented using an atomic instruction, to coordinate these concurrent activities.
-
-<p>Let's see why xv6 needs locks by following what happens when we start a second processor:
-<ul>
-<li>1516: mp_init (called from main0)
-<li>1606: mp_startthem (called from main0)
-<li>1302: mpmain
-<li>2208: scheduler.
-<br>Now we have several processors invoking the scheduler function. xv6 had better ensure that multiple processors don't run the same process! Does it?
-<br>Yes: if multiple schedulers run concurrently, only one will acquire proc_table_lock and proceed to look for a runnable process. If it finds one, it will mark it running and longjmp to it, and the process will release proc_table_lock. The next instance of scheduler will skip this entry, because it is marked running, and look for another runnable process.
-</ul>
-
-<p>Why hold proc_table_lock during a context switch?
It protects p->state; the process has to hold some lock to avoid a race with wakeup() and yield(), as we will see in the next lectures.
-
-<p>Why not a lock per proc entry? It might be expensive in whole-table scans (in wait, wakeup, and scheduler). proc_table_lock also protects some larger invariants; for example, it might be hard to get proc_wait() right with just per-entry locks. Right now the check for exited children and the sleep are atomic -- that would be hard with per-entry locks. One could have both, but that would probably be neither clean nor fast.
-
-<p>Of course, only one processor at a time searches the proc table only if acquire is implemented correctly. Let's check out acquire in spinlock.c:
-<ul>
-<li>1807: no recursive locks!
-<li>1811: why disable interrupts on the current processor? (If interrupt code itself tries to take a held lock, xv6 will deadlock; the panic will fire on 1808.)
-<ul>
-<li>can a process on a processor hold multiple locks?
-</ul>
-<li>1814: the (hopefully) atomic instruction.
-<ul>
-<li>see sheet 4, line 0468.
-</ul>
-<li>1819: make sure that stores issued on other processors before we got the lock are observed by this processor. These may be stores to the shared data structure that is protected by the lock.
-</ul>
-
-<h2>Locking in JOS</h2>
-
-<p>JOS is meant to run on single-CPU machines, so its plan can be simple: disable/enable interrupts in the kernel (the IF flag in the EFLAGS register). Thus, in the kernel, threads release the processor only when they want to, and they can ensure that they don't release the processor during a critical section.
-
-<p>In user mode, JOS runs with interrupts enabled, but Unix user applications don't share data structures. The data structures that must be protected, however, are the ones shared in the library operating system (e.g., pipes). In JOS we will use special-case solutions, as you will find out in lab 6.
For example, to implement a pipe we will assume there is one reader and one writer. The reader and writer never update each other's variables; they only read each other's variables. By programming carefully with this rule, we can avoid races.