path: root/web/l-lock.html
authorAustin Clements <[email protected]>2011-09-07 11:49:14 -0400
committerAustin Clements <[email protected]>2011-09-07 11:49:14 -0400
commit01a6c054d548d9fff8bbdfac4d3f3de4ae8677a1 (patch)
tree4320eb3d09f31f4a628b80d45482a72ee7c3956b /web/l-lock.html
parent64a03bd7aa5c03a626a2da4730a45fcceea75322 (diff)
Remove web directory; all cruft or moved to 6.828 repo
Diffstat (limited to 'web/l-lock.html')
-rw-r--r--web/l-lock.html322
1 files changed, 0 insertions, 322 deletions
diff --git a/web/l-lock.html b/web/l-lock.html
deleted file mode 100644
index eea8217..0000000
--- a/web/l-lock.html
+++ /dev/null
@@ -1,322 +0,0 @@
-<html>
-<head>
-<title>L7</title>
-</head>
-<body>
-
-<h1>Locking</h1>
-
-<p>Required reading: spinlock.c
-
-<h2>Why coordinate?</h2>
-
-<p>Mutual-exclusion coordination is an important topic in operating
-systems, because many operating systems run on
-multiprocessors. Coordination techniques protect variables that are
-shared among multiple threads and updated concurrently. These
-techniques allow programmers to implement atomic sections so that one
-thread can safely update the shared variables without having to worry
-about another thread intervening. For example, processes in xv6 may
-run concurrently on different processors and, in kernel mode, share
-kernel data structures. We must ensure that these updates happen
-correctly.
-
-<p>List and insert example:
-<pre>
-
-struct List {
-  int data;
-  struct List *next;
-};
-
-struct List *list = 0;
-
-void insert(int data) {
-  struct List *l = malloc(sizeof(struct List));
-  l->data = data;
-  l->next = list;   // A
-  list = l;         // B
-}
-</pre>
-
-<p>What needs to be atomic? The two statements labeled A and B should
-always be executed together, as an indivisible fragment of code. If
-two processors execute A and B interleaved, then we end up with an
-incorrect list. To see that this is the case, draw out the list after
-the sequence A1 (statement A executed by processor 1), A2 (statement A
-executed by processor 2), B2, and B1.
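<p>To make this concrete, here is a single-threaded C sketch (hypothetical, not part of the notes) that replays the A1, A2, B2, B1 order by hand; the helper name simulate_race is invented for illustration:

```c
#include <stdlib.h>

// Single-threaded simulation of the interleaving A1, A2, B2, B1:
// both "processors" read the same old value of `list` in their A
// steps, so one insertion is lost.
struct List { int data; struct List *next; };

struct List *list = 0;

void simulate_race(void) {
    struct List *l1 = malloc(sizeof(struct List));  // processor 1's node
    struct List *l2 = malloc(sizeof(struct List));  // processor 2's node
    l1->data = 1;
    l2->data = 2;

    l1->next = list;   // A1: l1->next = 0
    l2->next = list;   // A2: l2->next = 0 (list not yet updated!)
    list = l2;         // B2: list -> node 2
    list = l1;         // B1: list -> node 1, overwriting B2
}
```

After this runs, only node 1 is reachable from <i>list</i>; node 2 has been lost.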
-
-<p>How could this erroneous sequence happen? The variable <i>list</i>
-lives in physical memory shared among multiple processors, connected
-by a bus. The accesses to the shared memory will be ordered in some
-total order by the bus/memory system. If the programmer doesn't
-coordinate the execution of the statements A and B, any order can
-happen, including the erroneous one.
-
-<p>The erroneous case is called a race condition. The problem with
-races is that they are difficult to reproduce. For example, if you
-put print statements in to debug the incorrect behavior, you might
-change the time and the race might not happen anymore.
-
-<h2>Atomic instructions</h2>
-
-<p>The programmer must be able to express that A and B should be
-executed as a single atomic unit. We generally use a concept like
-locks to mark an atomic region, acquiring the lock at the beginning
-of the section and releasing it at the end:
-
-<pre>
-void acquire(int *lock) {
- while (TSL(lock) != 0) ;
-}
-
-void release (int *lock) {
- *lock = 0;
-}
-</pre>
-
-<p>Acquire and release, of course, need to be atomic too, which can,
-for example, be done with a hardware atomic TSL (test-and-set-lock)
-instruction:
-
-<p>The semantics of TSL are:
-<pre>
- R <- [mem] // load content of mem into register R
- [mem] <- 1 // store 1 in mem.
-</pre>
-
-<p>In a hardware implementation, the bus arbiter guarantees that both
-the load and the store are executed without any other loads or stores
-coming in between.
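<p>As a concrete sketch (assuming GCC), the builtin __sync_lock_test_and_set has exactly these TSL semantics -- it atomically stores 1 and returns the old value -- so acquire and release can be written as:

```c
// Sketch of acquire/release on top of GCC's __sync_lock_test_and_set
// builtin, which atomically stores 1 into *l and returns the old
// value -- the TSL semantics given above.
int lock = 0;

void acquire(int *l) {
    while (__sync_lock_test_and_set(l, 1) != 0)
        ;  // spin until the old value was 0, i.e. we took a free lock
}

void release(int *l) {
    __sync_lock_release(l);  // atomically store 0
}
```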
-
-<p>We can use locks to implement an atomic insert, or we can use
-TSL directly:
-<pre>
-int insert_lock = 0;
-
-void insert(int data) {
-
-  /* acquire the lock: */
-  while(TSL(&insert_lock) != 0)
-    ;
-
-  /* critical section: */
-  struct List *l = malloc(sizeof(struct List));
-  l->data = data;
-  l->next = list;
-  list = l;
-
- /* release the lock: */
- insert_lock = 0;
-}
-</pre>
-
-<p>It is the programmer's job to make sure that locks are respected. If
-a programmer writes another function that manipulates the list, the
-programmer must make sure that the new function acquires and
-releases the appropriate lock. If the programmer doesn't, race
-conditions occur.
-
-<p>This code assumes that stores commit to memory in program order and
-that all stores by other processors that started before insert got the
-lock are observable by this processor. That is, after another
-processor releases a lock, all of its previous stores are committed to
-memory. If a processor executes instructions out of order, this
-assumption won't hold and we must, for example, insert a barrier
-instruction that makes the assumption true.
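<p>One hedged sketch of where such barriers would go, using GCC's __sync_synchronize() full memory barrier (the placement shown is illustrative):

```c
int lock = 0;
int shared = 0;

void acquire(int *l) {
    while (__sync_lock_test_and_set(l, 1) != 0)
        ;
    __sync_synchronize();  // full barrier: the critical section's
                           // loads/stores can't move before the acquire
}

void release(int *l) {
    __sync_synchronize();  // full barrier: stores in the critical section
                           // commit before the lock appears free
    *l = 0;
}
```

With the barriers in place, a processor that later acquires the lock is guaranteed to observe all stores made inside the previous holder's critical section.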
-
-
-<h2>Example: Locking on x86</h2>
-
-<p>Here is one way we can implement acquire and release using the x86
-xchgl instruction:
-
-<pre>
-struct Lock {
-  int locked;
-};
-
-void acquire(struct Lock *lck) {
-  while(TSL(&(lck->locked)) != 0)
-    ;
-}
-
-void release(struct Lock *lck) {
-  lck->locked = 0;
-}
-
-int
-TSL(int *addr)
-{
- register int content = 1;
- // xchgl content, *addr
- // xchgl exchanges the values of its two operands, while
- // locking the memory bus to exclude other operations.
- asm volatile ("xchgl %0,%1" :
- "=r" (content),
- "=m" (*addr) :
- "0" (content),
- "m" (*addr));
- return(content);
-}
-</pre>
-
-<p>The instruction "XCHG %eax, (content)" works as follows:
-<ol>
-<li> freeze other CPUs' memory activity
-<li> temp := content
-<li> content := %eax
-<li> %eax := temp
-<li> un-freeze other CPUs
-</ol>
-
-<p>Steps 1 and 5 make XCHG special: it is "locked"; special signal
- lines on the inter-CPU bus are used for bus arbitration.
-
-<p>This implementation doesn't scale to a large number of processors;
- in a later lecture we will see how we could do better.
-
-<h2>Lock granularity</h2>
-
-<p>Acquire/release is ideal for short atomic sections: incrementing a
-counter, searching the i-node cache, allocating a free buffer.
-
-<p>What are spin locks not so great for? Long atomic sections may
- waste waiters' CPU time, and it is dangerous to sleep while holding
- locks. In xv6 we try to avoid long atomic sections by careful coding
- (can you find an example?). xv6 doesn't release the processor when
- holding a lock, but has an additional set of coordination primitives
- (sleep and wakeup), which we will study later.
-
-<p>A single list_lock protects all lists; inserts to different lists
- are blocked unnecessarily. A lock per list would waste less time
- spinning, so you might want "fine-grained" locks, one for every
- object. BUT acquire/release are expensive (500 cycles on my 3 GHz
- machine) because they need to talk off-chip.
-
-<p>Also, "correctness" is not that simple with fine-grained locks if
- you need to maintain global invariants; e.g., "every buffer must be
- on exactly one of the free list and the device list". Per-list locks
- are irrelevant for this invariant. So you might want "coarse-grained"
- locking, which reduces overhead but also reduces concurrency.
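<p>A sketch of the fine-grained alternative, with a lock embedded in each list (names are illustrative):

```c
struct List { int data; struct List *next; };

// Fine-grained locking: each list carries its own lock, so inserts
// into different lists no longer serialize on one global lock.
struct LockedList {
    int lock;               // per-list lock, 0 = free
    struct List *head;
};

void insert(struct LockedList *ll, struct List *l) {
    while (__sync_lock_test_and_set(&ll->lock, 1) != 0)
        ;                   // spin on this list's lock only
    l->next = ll->head;
    ll->head = l;
    __sync_lock_release(&ll->lock);
}
```

Two processors inserting into different LockedLists now never contend, but an invariant spanning several lists would still need a lock that covers all of them.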
-
-<p>This tension is hard to get right. One often starts out with
- "large-grained locks" and measures the performance of the system on
- some workloads. When more concurrency is desired (to get better
- performance), an implementor may switch to a more fine-grained
- scheme. Operating system designers fiddle with this all the time.
-
-<h2>Recursive locks and modularity</h2>
-
-<p>When designing a system we desire clean abstractions and good
- modularity. We would like a caller not to have to know how a callee
- implements a particular function. Locks make achieving modularity
- more complicated. For example, what do we do when the caller holds a
- lock, then calls a function that also needs the lock to perform
- its job?
-
-<p>There are no transparent solutions that allow the caller and callee
- to be unaware of which locks they use. One transparent, but
- unsatisfactory, option is recursive locks: if a callee asks for a
- lock that its caller already holds, then we allow the callee to
- proceed. Unfortunately, this solution is not ideal either.
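<p>For concreteness, a minimal recursive-lock sketch (invented for illustration; <i>me</i> stands in for a thread or processor id, and this single-threaded demo ignores the races a real implementation must handle):

```c
// Minimal recursive-lock sketch: the lock records its owner, and a
// repeated acquire by the same owner just bumps a depth counter.
struct RLock {
    int lock;    // 0 = free
    int owner;   // valid only while held
    int depth;
};

void racquire(struct RLock *r, int me) {
    if (r->lock && r->owner == me) {
        r->depth++;          // already ours: silently proceed
        return;
    }
    while (__sync_lock_test_and_set(&r->lock, 1) != 0)
        ;
    r->owner = me;
    r->depth = 1;
}

void rrelease(struct RLock *r) {
    if (--r->depth == 0)
        __sync_lock_release(&r->lock);
}
```

Note that the second racquire succeeds silently: the callee cannot tell whether the protected state is sane, which is exactly the problem discussed next.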
-
-<p>Consider the following. If lock x protects the internals of some
- struct foo, then when the caller acquires lock x, it knows that the
- internals of foo are in a sane state and it can fiddle with them.
- The caller must restore them to a sane state before releasing
- lock x, but until then anything goes.
-
-<p>This assumption doesn't hold with recursive locking. After
- acquiring lock x, the acquirer knows that either it is the first to
- get this lock, in which case the internals are in a sane state, or
- some caller up the stack holds the lock, has messed up the internals,
- and didn't realize when calling the callee that the callee was going
- to look at them too. So the fact that a function acquired lock x
- doesn't guarantee anything at all. In short, locks protect against
- callers and callees just as much as they protect against other
- threads.
-
-<p>Since transparent solutions aren't ideal, it is better to consider
- locks part of the function specification. The programmer must
- arrange that a caller doesn't invoke another function while holding
- a lock that the callee also needs.
-
-<h2>Locking in xv6</h2>
-
-<p>xv6 runs on a multiprocessor and is programmed to allow multiple
-threads of computation to run concurrently. In xv6 an interrupt might
-run on one processor and a process in kernel mode may run on another
-processor, sharing a kernel data structure with the interrupt routine.
-xv6 uses locks, implemented using an atomic instruction, to coordinate
-concurrent activities.
-
-<p>Let's check out why xv6 needs locks by following what happens when
-we start a second processor:
-<ul>
-<li>1516: mp_init (called from main0)
-<li>1606: mp_startthem (called from main0)
-<li>1302: mpmain
-<li>2208: scheduler.
-  <br>Now we have several processors invoking the scheduler
-  function. xv6 had better ensure that multiple processors don't run
-  the same process! Does it?
-  <br>Yes: if multiple schedulers run concurrently, only one will
-  acquire proc_table_lock and proceed looking for a runnable
-  process. If it finds a process, it will mark it running, longjmp to
-  it, and the process will release proc_table_lock. The next instance
-  of scheduler will skip this entry, because it is marked running, and
-  look for another runnable process.
-</ul>
-
-<p>Why hold proc_table_lock during a context switch? It protects
-p->state; the process has to hold some lock to avoid a race with
-wakeup() and yield(), as we will see in the next lectures.
-
-<p>Why not a lock per proc entry? It might be expensive in
-whole-table scans (in wait, wakeup, and scheduler). proc_table_lock
-also protects some larger invariants; for example, it might be hard to
-get proc_wait() right with just per-entry locks. Right now the check
-to see if there are any exited children and the sleep are atomic --
-but that would be hard with per-entry locks. One could have both, but
-that would probably be neither clean nor fast.
-
-<p>Of course, there is only one processor searching the proc table at
-a time if acquire is implemented correctly. Let's check out acquire
-in spinlock.c:
-<ul>
-<li>1807: no recursive locks!
-<li>1811: why disable interrupts on the current processor? (if
-interrupt code itself tries to take a held lock, xv6 will deadlock;
-the panic will fire on 1808.)
-<ul>
-<li>can a process on a processor hold multiple locks?
-</ul>
-<li>1814: the (hopefully) atomic instruction.
-<ul>
-<li>see sheet 4, line 0468.
-</ul>
-<li>1819: make sure that stores issued on other processors before we
-got the lock are observed by this processor. These may be stores to
-the shared data structure that is protected by the lock.
-</ul>
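<p>The checks walked through above can be sketched as follows. This is not xv6's actual code: cli(), sti(), and cpu_id() are hypothetical stand-ins for the real interrupt-disable and CPU-identification primitives, stubbed here so the sketch runs.

```c
#include <assert.h>

int intr_enabled = 1;                       // stub state for the stand-ins
void cli(void) { intr_enabled = 0; }        // stand-in: disable interrupts
void sti(void) { intr_enabled = 1; }        // stand-in: enable interrupts
int  cpu_id(void) { return 0; }             // stand-in: which CPU are we?

struct Lock { int locked; int cpu; };

void acquire(struct Lock *lk) {
    cli();  // an interrupt handler that grabbed this lock would deadlock
    assert(!(lk->locked && lk->cpu == cpu_id()));  // no recursive locks!
    while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
        ;   // the (hopefully) atomic instruction
    __sync_synchronize();  // observe stores made under the lock by others
    lk->cpu = cpu_id();
}

void release(struct Lock *lk) {
    lk->cpu = -1;
    __sync_synchronize();
    __sync_lock_release(&lk->locked);
    sti();
}
```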
-
-
-<h2>Locking in JOS</h2>
-
-<p>JOS is meant to run on single-CPU machines, and so the plan can be
-simple. The simple plan is disabling/enabling interrupts in the
-kernel (the IF flag in the EFLAGS register). Thus, in the kernel,
-threads release the processor only when they want to and can ensure
-that they don't release the processor during a critical section.
-
-<p>In user mode, JOS runs with interrupts enabled, but Unix user
-applications don't share data structures. The data structures that
-must be protected, however, are the ones shared in the library
-operating system (e.g., pipes). In JOS we will use special-case
-solutions, as you will find out in lab 6. For example, to implement a
-pipe we will assume there is one reader and one writer. The reader
-and writer never update each other's variables; they only read each
-other's variables. By carefully programming using this rule we can
-avoid races.
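<p>A sketch of that one-reader/one-writer discipline (names and buffer size are illustrative): the writer advances only wpos, the reader only rpos, and each side merely reads the other's counter, so no lock is needed.

```c
#define N 8
struct Pipe {
    char buf[N];
    unsigned rpos;   // advanced only by the reader
    unsigned wpos;   // advanced only by the writer
};

int pipe_write(struct Pipe *p, char c) {
    if (p->wpos - p->rpos == N)
        return -1;               // full
    p->buf[p->wpos % N] = c;     // fill the slot first...
    p->wpos++;                   // ...then publish it to the reader
    return 0;
}

int pipe_read(struct Pipe *p, char *c) {
    if (p->rpos == p->wpos)
        return -1;               // empty
    *c = p->buf[p->rpos % N];
    p->rpos++;                   // free the slot for the writer
    return 0;
}
```

Because each counter has a single writer, the worst a stale read of the other side's counter can do is report "full" or "empty" a moment too long; it never corrupts data.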