<html>
<head>
<title>Lab: locks</title>
<link rel="stylesheet" href="homework.css" type="text/css" />
</head>
<body>

<h1>Lab: locks</h1>

<p>In this lab you will try to avoid lock contention for certain
workloads.

<h2>Lock contention</h2>

<p>The program user/kalloctest stresses xv6's memory allocator: three
  processes grow and shrink their address spaces, which results in
  many calls to <tt>kalloc</tt> and <tt>kfree</tt>,
  respectively.  Both <tt>kalloc</tt> and <tt>kfree</tt>
  acquire <tt>kmem.lock</tt>.  To see whether there is lock contention for
  <tt>kmem.lock</tt>, replace the call to <tt>acquire</tt>
  in <tt>kalloc</tt> with the following code:

  <pre>
    while(!tryacquire(&kmem.lock)) {
      printf("!");
    }
  </pre>

<p><tt>tryacquire</tt> tries to acquire <tt>kmem.lock</tt>: if the
  lock is already taken, it returns false (0); otherwise, it returns true (1)
  with the lock acquired.  Your first job is to
  implement <tt>tryacquire</tt> in kernel/spinlock.c.

<p>A few hints:
  <ul>
    <li>Look at <tt>acquire</tt>.
    <li>Don't forget to restore interrupts when acquisition fails.
    <li>Add <tt>tryacquire</tt>'s signature to defs.h.
  </ul>
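
<p>The hints above can be sketched as a small user-space model.  This
  is only an illustration, not the kernel code: <tt>push_off</tt>
  and <tt>pop_off</tt> are stand-ins that merely count nesting
  (assumed here to mirror how <tt>acquire</tt> disables interrupts),
  and the <tt>struct spinlock</tt> has only the one field the sketch
  needs.

```c
#include <assert.h>

/* User-space model of a spinlock; the real xv6 struct has more fields. */
struct spinlock {
  unsigned int locked;   /* 0 = free, 1 = held */
};

/* Stand-ins (assumption) for the kernel's interrupt bookkeeping:
   here they only track how deeply "interrupts off" is nested. */
static int intr_off_depth = 0;
static void push_off(void) { intr_off_depth++; }
static void pop_off(void)  { assert(intr_off_depth > 0); intr_off_depth--; }

/* Try once to take the lock.
   Returns 1 with the lock held, or 0 if someone else holds it. */
int tryacquire(struct spinlock *lk)
{
  push_off();                        /* turn interrupts off first */
  if (__sync_lock_test_and_set(&lk->locked, 1) != 0) {
    pop_off();                       /* failed: restore interrupt state */
    return 0;
  }
  __sync_synchronize();              /* critical section starts after this */
  return 1;
}

void release(struct spinlock *lk)
{
  assert(lk->locked);                /* releasing an unheld lock is a bug */
  __sync_synchronize();              /* critical section ends before this */
  __sync_lock_release(&lk->locked);
  pop_off();
}
```

<p>The important detail is the failure path: a failed attempt must
  leave the interrupt state exactly as it found it, otherwise every
  "!" printed by the retry loop would leave interrupts disabled one
  level deeper.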

<p>Run usertests to check that you didn't break anything.  Note that
  usertests never prints "!"; there is never contention
  for <tt>kmem.lock</tt>.  The caller can always acquire the lock
  immediately and never has to wait for another process to release
  it.

<p>Now run kalloctest.  You should see quite a number of "!" on the
  console.  kalloctest causes many processes to contend for
  <tt>kmem.lock</tt>.  This lock contention is a bit artificial,
  because qemu is simulating 3 processors, but it is likely that
  there would be contention on real hardware too.
  
<h2>Removing lock contention</h2>

<p>The root cause of lock contention in kalloctest is that there is a
  single free list, protected by a single lock.  To remove lock
  contention, you will have to redesign the memory allocator to avoid
  a single lock and list.  The basic idea is to maintain a free list
  per CPU, each list with its own lock. Allocations and frees on each
  CPU can run in parallel, because each CPU will operate on a
  different list.
  
<p>The main challenge will be to deal with the case in which one CPU runs
  out of memory while another CPU still has free memory; in that case,
  the first CPU must "steal" part of the other CPU's free list.
  Stealing may introduce lock contention, but that may be acceptable
  because stealing should happen infrequently.

<p>Your job is to implement per-CPU free lists and stealing when one
  CPU is out of memory.  Run kalloctest to see if your
  implementation has removed lock contention.

<p>Some hints:
  <ul>
    <li>You can use the constant <tt>NCPU</tt> in kernel/param.h.
    <li>Let <tt>freerange</tt> give all free memory to the CPU
      running <tt>freerange</tt>.
    <li>The function <tt>cpuid</tt> returns the current core, but note
    that you can only use it safely while interrupts are turned off, so
    you will need to turn interrupts off and on in your solution.
  </ul>
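
<p>One possible shape for the per-CPU design is sketched below as a
  user-space model.  It is an illustration under several assumptions,
  not the expected solution: the CPU number is passed in explicitly
  (in the kernel it would come from <tt>cpuid</tt> with interrupts
  off), <tt>NCPU</tt> is defined locally, the names <tt>kfree_on</tt>,
  <tt>kalloc_on</tt> and <tt>pop</tt> are hypothetical, and stealing
  takes a single page rather than part of a list.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 3   /* assumption for this sketch; xv6 defines NCPU in kernel/param.h */

struct run { struct run *next; };

/* Minimal spinlock model built on gcc's __sync builtins. */
struct spinlock { unsigned int locked; };
static void acquire(struct spinlock *lk) {
  while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
    ;                                /* spin until the lock is free */
  __sync_synchronize();
}
static void release(struct spinlock *lk) {
  __sync_synchronize();
  __sync_lock_release(&lk->locked);
}

/* One free list per CPU, each with its own lock. */
static struct {
  struct spinlock lock;
  struct run *freelist;
} kmem[NCPU];

/* Put a page on the free list of the given CPU. */
void kfree_on(int cpu, void *pa)
{
  assert(cpu >= 0 && cpu < NCPU);
  struct run *r = pa;
  acquire(&kmem[cpu].lock);
  r->next = kmem[cpu].freelist;
  kmem[cpu].freelist = r;
  release(&kmem[cpu].lock);
}

/* Pop one page from a CPU's list, or return NULL if that list is empty. */
static void *pop(int cpu)
{
  acquire(&kmem[cpu].lock);
  struct run *r = kmem[cpu].freelist;
  if (r != NULL)
    kmem[cpu].freelist = r->next;
  release(&kmem[cpu].lock);
  return r;
}

/* Allocate from the caller's own list; if it is empty, steal from the others. */
void *kalloc_on(int cpu)
{
  assert(cpu >= 0 && cpu < NCPU);
  void *p = pop(cpu);
  for (int i = 1; p == NULL && i < NCPU; i++)
    p = pop((cpu + i) % NCPU);       /* only one lock is held at a time */
  return p;
}
```

<p>Because the sketch never holds two list locks at once, stealing
  cannot deadlock; the price is that a steal moves only one page, so
  a CPU that starts with an empty list will steal on every
  allocation until pages are freed on it.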

<p>Run usertests to check that you didn't break anything.

<h2>More scalable bcache lookup</h2>


<p>Several processes reading different files repeatedly will
  bottleneck in the buffer cache, bcache, in bio.c.  Replace the
  acquire in <tt>bget</tt> with
  
  <pre>
    while(!tryacquire(&bcache.lock)) {
      printf("!");
    }
  </pre>

  and run test0 from bcachetest and you will see "!"s.

<p>Modify <tt>bget</tt> so that a lookup for a buffer that is in the
  bcache doesn't need to acquire <tt>bcache.lock</tt>.  This is
  trickier than the kalloc assignment, because bcache buffers are truly
  shared among processes. You must maintain the invariant that each
  block is cached in at most one buffer.

<p> There are several races that <tt>bcache.lock</tt> protects
against, including:
  <ul>
    <li>A <tt>brelse</tt> may set <tt>b->ref</tt> to 0
      while a concurrent <tt>bget</tt> is incrementing it.
    <li>Two <tt>bget</tt> calls may both see <tt>b->ref = 0</tt>; one may re-use
    the buffer while the other may replace it with another block.
    <li>A concurrent <tt>brelse</tt> modifies the list
      that <tt>bget</tt> traverses.
  </ul>

<p>A challenge is testing whether your code is still correct.  One way
  to do this is to artificially delay certain operations
  using <tt>sleepticks</tt>.  <tt>test1</tt> thrashes the buffer cache
  and exercises more code paths.

<p>Here are some hints:
  <ul>
    <li>Read the description of the buffer cache in the xv6 book (Section 7.2).
    <li>Use a simple design: i.e., don't design a lock-free implementation.
    <li>Use a simple hash table with locks per bucket.
    <li>Searching the hash table for a buffer and allocating an entry
      for that buffer when it is not found must be atomic.
    <li>It is fine to acquire <tt>bcache.lock</tt> in <tt>brelse</tt>
      to update the LRU/MRU list.
  </ul>
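
<p>The hash-table hints can be sketched as the following user-space
  model.  It is a simplified illustration, not the expected solution:
  <tt>NBUCKET</tt>, <tt>NBUF</tt>, <tt>pool</tt>
  and <tt>pool_lock</tt> are hypothetical names, buffers are handed
  out from a fixed pool and never evicted (a real <tt>bget</tt> would
  recycle an unused buffer), and <tt>brelse</tt> does not maintain an
  LRU list.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKET 13   /* hypothetical bucket count */
#define NBUF     8   /* hypothetical pool size */

/* Minimal spinlock model built on gcc's __sync builtins. */
struct spinlock { unsigned int locked; };
static void acquire(struct spinlock *lk) {
  while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
    ;
  __sync_synchronize();
}
static void release(struct spinlock *lk) {
  __sync_synchronize();
  __sync_lock_release(&lk->locked);
}

struct buf {
  unsigned int dev, blockno;
  int ref;              /* reference count, protected by the bucket lock */
  struct buf *next;     /* chain within one bucket */
};

struct bucket { struct spinlock lock; struct buf *head; };

static struct bucket table[NBUCKET];
static struct buf pool[NBUF];
static int nused;                    /* protected by pool_lock */
static struct spinlock pool_lock;

/* Return the buffer for (dev, blockno), creating it on a miss.
   Returns NULL when the pool is exhausted (the sketch has no eviction). */
struct buf *bget(unsigned int dev, unsigned int blockno)
{
  struct bucket *bk = &table[blockno % NBUCKET];
  struct buf *b;

  acquire(&bk->lock);
  for (b = bk->head; b != NULL; b = b->next) {
    if (b->dev == dev && b->blockno == blockno) {
      b->ref++;                      /* hit: only the bucket lock is needed */
      release(&bk->lock);
      return b;
    }
  }
  /* Miss: keep holding the bucket lock so that lookup and insert are
     atomic and the block cannot be inserted twice. */
  acquire(&pool_lock);
  b = (nused < NBUF) ? &pool[nused++] : NULL;
  release(&pool_lock);
  if (b != NULL) {
    b->dev = dev;
    b->blockno = blockno;
    b->ref = 1;
    b->next = bk->head;
    bk->head = b;
  }
  release(&bk->lock);
  return b;
}

void brelse(struct buf *b)
{
  struct bucket *bk = &table[b->blockno % NBUCKET];
  acquire(&bk->lock);
  assert(b->ref > 0);
  b->ref--;
  release(&bk->lock);
}
```

<p>Lookups for blocks that hash to different buckets take different
  locks and so do not contend.  The sketch always takes the bucket
  lock before <tt>pool_lock</tt>; a design that adds eviction needs
  an equally fixed lock order to avoid deadlock.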

<p>Check that your implementation has less contention
  on <tt>test0</tt>.

<p>Make sure your implementation passes bcachetest and usertests.

<p>Optional:
  <ul>
  <li>Make the buffer cache more scalable (e.g., avoid
  acquiring <tt>bcache.lock</tt> in <tt>brelse</tt>).
  <li>Make lookup lock-free (Hint: use gcc's <tt>__sync_*</tt>
    functions). How do you convince yourself that your implementation is correct?
  </ul>
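
<p>As an illustration of the kind of building block
  the <tt>__sync_*</tt> hint points at, the sketch below manipulates
  a reference count without a lock.  It is not a lock-free lookup by
  itself, and the function names are hypothetical; it only shows a
  compare-and-swap retry loop that refuses to revive a count that has
  already dropped to 0.

```c
#include <assert.h>

/* Take a reference only if the count is currently nonzero.
   Returns 1 on success, 0 if the count was 0. */
int ref_get_if_nonzero(volatile int *ref)
{
  for (;;) {
    int old = *ref;
    if (old == 0)
      return 0;                      /* not in use: do not revive it */
    if (__sync_bool_compare_and_swap(ref, old, old + 1))
      return 1;                      /* count went from old to old + 1 */
    /* another thread changed the count in between; retry */
  }
}

/* Drop a reference atomically and return the new count. */
int ref_put(volatile int *ref)
{
  int now = __sync_sub_and_fetch(ref, 1);
  assert(now >= 0);                  /* more puts than gets is a bug */
  return now;
}
```

<p>A caller that gets 0 back from <tt>ref_get_if_nonzero</tt> would
  have to fall back to a slower, locked path; deciding what that path
  must re-check is part of convincing yourself the design is correct.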
  
  
</body>
</html>