diff options
Diffstat (limited to 'labs')
-rw-r--r-- | labs/cow.html | 109 | ||||
-rw-r--r-- | labs/fs.html | 360 | ||||
-rw-r--r-- | labs/fs1.html | 215 | ||||
-rw-r--r-- | labs/lazy.html | 132 | ||||
-rw-r--r-- | labs/lock.html | 148 | ||||
-rw-r--r-- | labs/mmap.html | 171 | ||||
-rw-r--r-- | labs/syscall.html | 443 | ||||
-rw-r--r-- | labs/xv6.html | 238 |
8 files changed, 1816 insertions, 0 deletions
diff --git a/labs/cow.html b/labs/cow.html new file mode 100644 index 0000000..2cc18fa --- /dev/null +++ b/labs/cow.html @@ -0,0 +1,109 @@ +<html> +<head> +<title>Lab: Copy-on-Write Fork for xv6</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: Copy-on-Write Fork for xv6</h2> + +<p> +Your task is implement copy-on-write fork in the xv6 kernel. You are +done if your modified kernel executes both the cow and usertests +programs successfully. + +<h2>The problem</h2> + +The fork() system call in xv6 copies all of the parent process's +user-space memory into the child. If the parent is large, copying can +take a long time. In addition, the copies often waste memory; in many +cases neither the parent nor the child modifies a page, so that in +principle they could share the same physical memory. The inefficiency +is particularly clear if the child calls exec(), since then most of +the copied pages are thrown away without ever being used. Of course, +sometimes both child and parent modify memory at the same virtual +address after a fork(), so for some pages the copying is truly needed. + +<h2>The solution</h2> + +The goal of copy-on-write (COW) fork() is to defer allocating and +copying physical memory pages for the child until they are actually +needed, in the hope that they may never be needed. + +<p> +COW fork() creates just a pagetable for the child, with PTEs for user +memory pointing to the parent's physical pages. COW fork() marks all +the user PTEs in both parent and child as read-only. When either +process tries to write one of these COW pages, the CPU will force a +page fault. The kernel page-fault handler detects this case, allocates +a page of physical memory for the faulting process, copies the +original page into the new page, and modifies the relevant PTE in the +faulting process to refer to the new page, this time with the PTE +marked writeable. When the page fault handler returns, the user +process will be able to write its copy of the page. + +<p> +COW fork() makes freeing of the physical pages that implement user +memory a little trickier. A given physical page may be referred to by +multiple processes' page tables, and should be freed when the last +reference disappears. + +<h2>The cow test program</h2> + +To help you test your implementation, we've provided an xv6 program +called cow (source in user/cow.c). cow runs various tests, but +even the first will fail on unmodified xv6. Thus, initially, you +will see: + +<pre> +$ cow +simple: fork() failed +$ +</pre> + +The "simple" test allocates more than half of available physical +memory, and then fork()s. The fork fails because there is not enough +free physical memory to give the child a complete copy of the parent. + +<p> +When you are done, your kernel should be able to run both cow and +usertests. That is: + +<pre> +$ cow +simple: ok +simple: ok +three: zombie! +ok +three: zombie! +ok +three: zombie! +ok +file: ok +ALL COW TESTS PASSED +$ usertests +... +ALL TESTS PASSED +$ +</pre> + +<h2>Hints</h2> + +Here's one reasonable plan of attack. Modify uvmcopy() to map the +parent's physical pages into the child, instead of allocating new +pages, and clear PTE_W in the PTEs of both child and parent. +Modify usertrap() to recognize a page fault. When a page fault occurs +on a COW page, allocate a new page with kalloc(), copy the old page to +the new page, and install the new page in the PTE with PTE_W set. +Next, ensure that each physical page is freed when the last PTE +reference to it goes away (but not before!), perhaps by implementing +reference counts in kalloc.c. Finally, modify copyout() to use the +same scheme as page faults when it encounters a COW page. + +<p> +It may be useful to have a way to record, for each PTE, whether it is +a COW mapping. You can use the RSW (reserved for software) bits in +the RISC-V PTE for this. + +</body> +</html> diff --git a/labs/fs.html b/labs/fs.html new file mode 100644 index 0000000..a21e61f --- /dev/null +++ b/labs/fs.html @@ -0,0 +1,360 @@ +<html> +<head> +<title>Lab: file system</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: file system</h1> + +<p>In this lab you will add large files and <tt>mmap</tt> to the xv6 file system. + +<h2>Large files</h2> + +<p>In this assignment you'll increase the maximum size of an xv6 +file. Currently xv6 files are limited to 268 blocks, or 268*BSIZE +bytes (BSIZE is 1024 in xv6). This limit comes from the fact that an +xv6 inode contains 12 "direct" block numbers and one "singly-indirect" +block number, which refers to a block that holds up to 256 more block +numbers, for a total of 12+256=268. You'll change the xv6 file system +code to support a "doubly-indirect" block in each inode, containing +256 addresses of singly-indirect blocks, each of which can contain up +to 256 addresses of data blocks. The result will be that a file will +be able to consist of up to 256*256+256+11 blocks (11 instead of 12, +because we will sacrifice one of the direct block numbers for the +double-indirect block). + +<h3>Preliminaries</h3> + +<p>Modify your Makefile's <tt>CPUS</tt> definition so that it reads: +<pre> +CPUS := 1 +</pre> + +<b>XXX doesn't seem to speedup things</b> +<p>Add +<pre> +QEMUEXTRA = -snapshot +</pre> +right before +<tt>QEMUOPTS</tt> +<p> +The above two steps speed up qemu tremendously when xv6 +creates large files. + +<p><tt>mkfs</tt> initializes the file system to have fewer +than 1000 free data blocks, too few to show off the changes +you'll make. Modify <tt>param.h</tt> to +set <tt>FSSIZE</tt> to: +<pre> + #define FSSIZE 20000 // size of file system in blocks +</pre> + +<p>Download <a href="big.c">big.c</a> into your xv6 directory, +add it to the UPROGS list, start up xv6, and run <tt>big</tt>. +It creates as big a file as xv6 will let +it, and reports the resulting size. It should say 140 sectors. + +<h3>What to Look At</h3> + +The format of an on-disk inode is defined by <tt>struct dinode</tt> +in <tt>fs.h</tt>. You're particularly interested in <tt>NDIRECT</tt>, +<tt>NINDIRECT</tt>, <tt>MAXFILE</tt>, and the <tt>addrs[]</tt> element +of <tt>struct dinode</tt>. Look Figure 7.3 in the xv6 text for a +diagram of the standard xv6 inode. + +<p> +The code that finds a file's data on disk is in <tt>bmap()</tt> +in <tt>fs.c</tt>. Have a look at it and make sure you understand +what it's doing. <tt>bmap()</tt> is called both when reading and +writing a file. When writing, <tt>bmap()</tt> allocates new +blocks as needed to hold file content, as well as allocating +an indirect block if needed to hold block addresses. + +<p> +<tt>bmap()</tt> deals with two kinds of block numbers. The <tt>bn</tt> +argument is a "logical block" -- a block number relative to the start +of the file. The block numbers in <tt>ip->addrs[]</tt>, and the +argument to <tt>bread()</tt>, are disk block numbers. +You can view <tt>bmap()</tt> as mapping a file's logical +block numbers into disk block numbers. + +<h3>Your Job</h3> + +Modify <tt>bmap()</tt> so that it implements a doubly-indirect +block, in addition to direct blocks and a singly-indirect block. +You'll have to have only 11 direct blocks, rather than 12, +to make room for your new doubly-indirect block; you're +not allowed to change the size of an on-disk inode. +The first 11 elements of <tt>ip->addrs[]</tt> should be +direct blocks; the 12th should be a singly-indirect block +(just like the current one); the 13th should be your new +doubly-indirect block. + +<p> +You don't have to modify xv6 to handle deletion of files with +doubly-indirect blocks. + +<p> +If all goes well, <tt>big</tt> will now report that it +can write sectors. It will take <tt>big</tt> minutes +to finish. + +<b>XXX this runs for a while!</b> + +<h3>Hints</h3> + +<p> +Make sure you understand <tt>bmap()</tt>. Write out a diagram of the +relationships between <tt>ip->addrs[]</tt>, the indirect block, the +doubly-indirect block and the singly-indirect blocks it points to, and +data blocks. Make sure you understand why adding a doubly-indirect +block increases the maximum file size by 256*256 blocks (really -1), +since you have to decrease the number of direct blocks by one). + +<p> +Think about how you'll index the doubly-indirect block, and +the indirect blocks it points to, with the logical block +number. + +<p>If you change the definition of <tt>NDIRECT</tt>, you'll +probably have to change the size of <tt>addrs[]</tt> +in <tt>struct inode</tt> in <tt>file.h</tt>. Make sure that +<tt>struct inode</tt> and <tt>struct dinode</tt> have the +same number of elements in their <tt>addrs[]</tt> arrays. + +<p>If you change the definition of <tt>NDIRECT</tt>, make sure to create a +new <tt>fs.img</tt>, since <tt>mkfs</tt> uses <tt>NDIRECT</tt> too to build the +initial file systems. If you delete <tt>fs.img</tt>, <tt>make</tt> on Unix (not +xv6) will build a new one for you. + +<p>If your file system gets into a bad state, perhaps by crashing, +delete <tt>fs.img</tt> (do this from Unix, not xv6). <tt>make</tt> will build a +new clean file system image for you. + +<p>Don't forget to <tt>brelse()</tt> each block that you +<tt>bread()</tt>. + +<p>You should allocate indirect blocks and doubly-indirect + blocks only as needed, like the original <tt>bmap()</tt>. + +<p>Optional challenge: support triple-indirect blocks. + +<h2>Writing with a Log</h2> + +Insert a print statement in bwrite (in bio.c) so that you get a +print every time a block is written to disk: + +<pre> + printf("bwrite block %d\n", b->blockno); +</pre> + +Build and boot a new kernel and run this: +<pre> + $ rm README +</pre> + +<p>You should see a sequence of bwrite prints after the <tt>rm</tt>.</p> + +<div class="question"> +<ol> +<li>Annotate the bwrite lines with the kind of information that is +being written to the disk (e.g., "README's inode", "allocation +bitmap"). If the log is being written, note both that the log is being +written and also what kind of information is being written to the log. +<li>Mark with an arrow the first point at which, if a +crash occured, README would be missing after a reboot +(after the call to <tt>recover_from_log()</tt>). +</ol> +</p> +</div> + + +<h2>Crash safety</h2> + +<p>This assignment explores the xv6 log in two parts. +First, you'll artificially create a crash which illustrates +why logging is needed. Second, you'll remove one +inefficiency in the xv6 logging system. + +<p> +Submit your solution before the beginning of the next lecture +to <a href="https://6828.scripts.mit.edu/2018/handin.py/">the submission +web site</a>. + +<h3>Creating a Problem</h3> + +<p> +The point of the xv6 log is to cause all the disk updates of a +filesystem operation to be atomic with respect to crashes. +For example, file creation involves both adding a new entry +to a directory and marking the new file's inode as in-use. +A crash that happened after one but before the other would +leave the file system in an incorrect state after a reboot, +if there were no log. + +<p> +The following steps will break the logging code in a way that +leaves a file partially created. + +<p> +First, replace <tt>commit()</tt> in <tt>log.c</tt> with +this code: +<pre> +#include "kernel/proc.h" +void +commit(void) +{ + int pid = myproc()->pid; + if (log.lh.n > 0) { + write_log(); + write_head(); + if(pid > 1) // AAA + log.lh.block[0] = 0; // BBB + install_trans(); + if(pid > 1) // AAA + panic("commit mimicking crash"); // CCC + log.lh.n = 0; + write_head(); + } +} +</pre> + +<p> +The BBB line causes the first block in the log to be written to +block zero, rather than wherever it should be written. During file +creation, the first block in the log is the new file's inode updated +to have non-zero <tt>type</tt>. +Line BBB causes the block +with the updated inode to be written to block 0 (whence +it will never be read), leaving the on-disk inode still marked +unallocated. The CCC line forces a crash. +The AAA lines suppress this buggy behavior for <tt>init</tt>, +which creates files before the shell starts. + +<p> +Second, replace <tt>recover_from_log()</tt> in <tt>log.c</tt> +with this code: +<pre> +static void +recover_from_log(void) +{ + read_head(); + printf("recovery: n=%d but ignoring\n", log.lh.n); + // install_trans(); + log.lh.n = 0; + // write_head(); +} +</pre> + +<p> +This modification suppresses log recovery (which would repair +the damage caused by your change to <tt>commit()</tt>). + +<p> +Finally, remove the <tt>-snapshot</tt> option from the definition +of <tt>QEMUEXTRA</tt> in your Makefile so that the disk image will see the +changes. + +<p> +Now remove <tt>fs.img</tt> and run xv6: +<pre> + % rm fs.img ; make qemu +</pre> +<p> +Tell the xv6 shell to create a file: +<pre> + $ echo hi > a +</pre> + +<p> +You should see the panic from <tt>commit()</tt>. So far +it is as if a crash occurred in a non-logging system in the middle +of creating a file. + +<p> +Now re-start xv6, keeping the same <tt>fs.img</tt>: +<pre> + % make qemu +</pre> + +<p> +And look at file <tt>a</tt>: +<pre> + $ cat a +</pre> + +<p> + You should see <tt>panic: ilock: no type</tt>. Make sure you understand what happened. +Which of the file creation's modifications were written to the disk +before the crash, and which were not? + +<h3>Solving the Problem</h3> + +Now fix <tt>recover_from_log()</tt>: +<pre> +static void +recover_from_log(void) +{ + read_head(); + cprintf("recovery: n=%d\n", log.lh.n); + install_trans(); + log.lh.n = 0; + write_head(); +} +</pre> + +<p> +Run xv6 (keeping the same <tt>fs.img</tt>) and read <tt>a</tt> again: +<pre> + $ cat a +</pre> + +<p> +This time there should be no crash. Make sure you understand why +the file system now works. + +<p> +Why was the file empty, even though you created +it with <tt>echo hi > a</tt>? + +<p> +Now remove your modifications to <tt>commit()</tt> +(the if's and the AAA and BBB lines), so that logging works again, +and remove <tt>fs.img</tt>. + +<h3>Streamlining Commit</h3> + +<p> +Suppose the file system code wants to update an inode in block 33. +The file system code will call <tt>bp=bread(block 33)</tt> and update the +buffer data. <tt>write_log()</tt> in <tt>commit()</tt> +will copy the data to a block in the log on disk, for example block 3. +A bit later in <tt>commit</tt>, <tt>install_trans()</tt> reads +block 3 from the log (containing block 33), copies its contents into the in-memory +buffer for block 33, and then writes that buffer to block 33 on the disk. + +<p> +However, in <tt>install_trans()</tt>, it turns out that the modified +block 33 is guaranteed to be still in the buffer cache, where the +file system code left it. Make sure you understand why it would be a +mistake for the buffer cache to evict block 33 from the buffer cache +before the commit. + +<p> +Since the modified block 33 is guaranteed to already be in the buffer +cache, there's no need for <tt>install_trans()</tt> to read block +33 from the log. Your job: modify <tt>log.c</tt> so that, when +<tt>install_trans()</tt> is called from <tt>commit()</tt>, +<tt>install_trans()</tt> does not perform the needless read from the log. + +<p>To test your changes, create a file in xv6, restart, and make sure +the file is still there. + +<b>XXX Does this speedup bigfile?</b> + +<b>XXX Maybe support lseek and modify shell to append to a file?</b> + + +</body> +</html> diff --git a/labs/fs1.html b/labs/fs1.html new file mode 100644 index 0000000..45d3e0c --- /dev/null +++ b/labs/fs1.html @@ -0,0 +1,215 @@ +<html> +<head> +<title>Lab: mount/umount</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: mount/umount</h1> + +<p>In this lab you will add support for mounting/unmounting of file +systems to xv6. This lab will expose you to many parts of the xv6 +file system, including pathname lookup, inodes, logging/recovery, disk +driver, concurrency, etc. + +<p>Your job is modify xv6 so that your modified kernel passes the + tests in mounttest. You will have to implement two system + calls: <tt>mount(char *source, char *target)</tt> + and <tt>umount(char *target)</tt>. Mount attaches the device + referenced by <tt>source</tt> (e.g., <tt>/disk1</tt>) at the + location specified by <tt>target</tt>. For + example, <tt>mount("/disk1", "/m")</tt> will attach <tt>disk1</tt> + at the directory <tt>/m</tt>. After this mount call, users can use + pathnames such as <tt>/m/README</tt> to read the + file <tt>README</tt> stored in the root directory + on <tt>disk1</tt>. <tt>Umount</tt> removes the attachment. For + example, <tt>umount("/m")</tt> unmounts disk1 from <tt>/m</tt>. + +<p>There are several major challenges in implementing the mount system +calls: + + <ul> + + <li>Adding the actual system calls so that user programs can call + them. This is similar to previous labs in which you added + systems calls xv6. + + <li>Supporting several disks. You will have generalize to + virtio_disk.c to support at least two disks. + + <li>Logging file system modifications to the right disk. xv6 + assumes there is only disk and file system calls typically start + with <tt>begin_op</tt> and end with <tt>end_op</tt>, logging all + modifications between these two calls to the log on the one + disk. With mount, modifications to the file system on the + second disk must be logged to the second disk. + + <li>Modifying pathname lookup (<tt>namex</tt>) so that when a + lookup cross a mount point, it continues at the root inode of + the attached disk. + + </ul> + +<p>The rest of this assignment provides some hints how you might go +about the above challenges. + +<h2>Adding system calls</h2> + +<p>Add the stubs for the two systems calls to xv6 so that you can +compile mounttest and add two empty functions for the two system calls +to sysfile.c. Run mounttest and it will fail on the first call +to <tt>mount</tt>. + + +<h2>Adding a second disk</h2> + +<p>To be able to mount another disk, you need to extend xv6 to support +at least two disks. Modify virtio_disk.c to support an array of two +disks instead of a single disk. The address of the second disk +is <tt>0x10002000</tt>; modify the macro <tt>R</tt> to take a disk +number (0, 1,..) and read/write to the memory address for that disk. + +<p>All functions in <tt>virtio_disk.c</tt> need to take the disk +number as an argument to update the state of the disk that is +read/written to or to receive an interrupt from the disk. +Modify <tt>virtio_disk_init</tt> to take a disk number as an argument +and update is to that it initializes that disk. Similar, go through +the other functions; make these changes should be most mechanical +(i.e., text substitutions). + +<p>The second disk interrupts at IRQ 2; modify trap.c to receive that +interrupt and <tt>virtio_disk_intr</tt> with the number of the disk +that generated the interrupt. + +<p>Modify the file Makefile to tell qemu to provide a second +disk. Define the variable <tt>QEMUEXTRA = -drive +file=fs1.img,if=none,format=raw,id=x1 -device +virtio-blk-device,drive=x1,bus=virtio-mmio-bus.1</tt> and +add <tt>$(QEMUEXTRA)</tt> to the end of <tt>QEMUOPTS</tt>. + +<p>Create a second disk image <tt>fs1.img</tt>. Easiest thing to do + is just copy the file <tt>fs.img</tt>. You might want to add rules + to the Makefile to make this image and remove it on <tt>make + clean</tt>. + +<p>Add to the user program init a call to create a device for the new + disk. For example, add the line <tt>mknod("disk1", DISK, 1);</tt> to + init.c. This will create an inode of type device in the root + directory with major number <tt>DISK</tt> and minor number 1. + +<p>The first argument of the <tt>mount</tt> system call ("disk1") will + refer to the device you created using <tt>mknod</tt> above. In your + implementation of the mount system call, + call <tt>virtio_disk_init</tt> with the minor number as the argument + to initialize the second disk. (We reserve minor number 0 for the + first disk.) + +<p>Boot xv6, run mounttest, and make sure <tt>virtio_disk_init</tt> + gets called (e.g., add print statement). You won't know if your + changes are correct, but your code should compile and invoke the + driver for the second disk. + +<h2>Modify the logging system</h2> + +<p>After calling <tt>virtio_disk_init</tt>, you need to also + call <tt>loginit</tt> to initialize the logging system for the + second disk (and restore the second disk if a power failure happened + while modifying the second disk). Generalize the logging system to + support to two logs, one on disk 0 and one disk 1. These changes + are mostly mechanical (e.g., <tt>log.</tt> changes + to <tt>log[n].</tt>), similar to generalizing the disk driver to + support two disks. + +<p>To make xv6 compile, you need to provide a disk number + to <tt>begin_op</tt> and <tt>end_op</tt>. It will be a challenge to + figure out what the right value is; for now just specify the first + disk (i.e., 0). This isn't correct, since modifications to the + second disk should be logged on the second disk, but we have no way + yet to read/write the second disk. Come back to this later when you + have a better idea how things will fit together, but make sure that + xv6 compiles and still runs. + +<h2>Pathname lookup</h2> + +<p>Modify <tt>namex</tt> to traverse mount points: when <tt>namex</tt> + sees an inode to which a file system is attached, it should traverse + to the root inode of that file system. Hint: modify the in-memory + inode in file.h to keep some additional state, and initialize that + state in the mount system call. Note that the inode already has a + field for disk number (i.e., <tt>dev</tt>), which is initialized and + passed to reads and writes to the driver. <tt>dev</tt> corresponds + to the minor number for disk devices. + +<p>Your modified xv6 should be able to pass the first tests in + mounttest (i.e., <tt>stat</tt>). This is likely to be challenging, + however, because now your kernel will be reading from the second + disk for the first time, and you may run into many issues. + +<p>Even though <tt>stat</tt> may return correctly, your code is likely + to be incorrect, because in <tt>namex</tt> + because <tt>iunlockput</tt> may modify the second disk (e.g., if + another process removes the file or directory) and those + modifications must be written to the second disk. Your job is to + fix the calls to <tt>begin_op</tt> and <tt>end_op</tt> to take the + right device. One challenge is that <tt>begin_op</tt> is called at + the beginning of a system call but then you don't know the device + that will be involved; you will have to postpone this call until you + know which inode is involved (which tells you will which device is + involved). Another challenge is that you cannot postpone + calling <tt>begin_op</tt> passed <tt>ilock</tt> because that + violates lock ordering in xv6; you should not be + calling <tt>begin_op</tt> while holding locks on inodes. (The log + system allows a few systems calls to run; if a system call that + holds an inode lock isn't admitted and one of the admitted system + calls needs that inode to complete, then xv6 will deadlock.) + +<p>Once you have implemented a plan for <tt>begin_op</tt> + and <tt>end_op</tt>, see if your kernel can pass <tt>test0</tt>. It + is likely that you will have to modify your implementation of the + mount system call to handle several corner cases. See the tests + in <tt>test0</tt>. + +<p>Run usertests to see if you didn't break anything else. Since you + modified <tt>namex</tt> and <tt>begin/end_op</tt>, which are at the + core of the xv6 file system, you might have introduced bugs, perhaps + including deadlocks. Deadlocks manifest themselves as no output + being produced because all processes are sleeping (hit ctrl-p a few + times). Your kernel might also suffer kernel panics, because your + changes violate invariants. You may have to iterate a few times to + get a good design and implementation. + +<h2>umount</h2> + +<p>Once your kernel passes usertests and test0 of mounttest, implement + umount. The main challenge is that umount of a file system should + fail if the file system is still in use; that is, if there is an + inode on the mounted device that has a <tt>ref > 0</tt>. + Furthermore, this test and unmounting should be an atomic + operation. (Hint: lock the inode cache.) Make sure your kernel + passes test1 of mounttest. + +<p>Test2 of mounttest stresses <tt>namex</tt> more; if you have done + everything right above, your kernel should pass it. Test3 tests + concurrent mount/unmounts with file creation. + +<h2>crash safety</h2> + +<p>One of the main goals of the file system is to provide crash + safety: if there is a power failure during a file system operation, + xv6 should recover correctly. It is difficult to introduce power + failure at the critical steps of logging; instead, we added a system + call that causes a kernel panic after committing an operation but + before installing the operation. Test4 with crashtest tests if your + xv6 recovers the mounted disk correctly. + + +</body> +</html> + +<h2>Optional challenges</h2> + +<p>Modify xv6 so that init mounts the first disk on the root inode. + This will allow you to remove some code specific for the first disk + from the kernel. + +<p>Support mounts on top of mounts. diff --git a/labs/lazy.html b/labs/lazy.html new file mode 100644 index 0000000..9d97cab --- /dev/null +++ b/labs/lazy.html @@ -0,0 +1,132 @@ +<html> +<head> +<title>Lab: xv6 lazy page allocation</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: xv6 lazy page allocation</h1> + +<p> +One of the many neat tricks an O/S can play with page table hardware +is lazy allocation of heap memory. Xv6 applications ask the kernel for +heap memory using the sbrk() system call. In the kernel we've given +you, sbrk() allocates physical memory and maps it into the process's +virtual address space. There are programs that allocate memory but +never use it, for example to implement large sparse arrays. +Sophisticated kernels delay allocation of each page of memory until +the application tries to use that page -- as signaled by a page fault. +You'll add this lazy allocation feature to xv6 in this lab. + +<h2>Part One: Eliminate allocation from sbrk()</h2> + +Your first task is to delete page allocation from the sbrk(n) system +call implementation, which is the function sys_sbrk() in sysproc.c. The +sbrk(n) system call grows the process's memory size by n bytes, and +then returns the start of the newly allocated region (i.e., the old +size). Your new sbrk(n) should just increment the process's size +(myproc()->sz) by n and return the old size. It should not allocate memory +-- so you should delete the call to growproc() (but you still need to +increase the process's size!). + +<p> +Try to guess what the result of this modification will be: what will +break? + +<p> +Make this modification, boot xv6, and type <tt>echo hi</tt> to the shell. +You should see something like this: + +<pre> +init: starting sh +$ echo hi +usertrap(): unexpected scause 0x000000000000000f pid=3 + sepc=0x00000000000011dc stval=0x0000000000004008 +va=0x0000000000004000 pte=0x0000000000000000 +panic: unmappages: not mapped +</pre> + +The "usertrap(): ..." message is from the user trap handler in trap.c; +it has caught an exception that it does not know how to handle. Make +sure you understand why this page fault occurs. The "stval=0x0..04008" +indicates that the virtual address that caused the page fault is +0x4008. + +<h2>Part Two: Lazy allocation</h2> + +Modify the code in trap.c to respond to a page fault from user space +by mapping a newly-allocated page of physical memory at the faulting +address, and then returning back to user space to let the process +continue executing. You should add your code just before +the <tt>printf</tt> call that produced the "usertrap(): ..." +message. + +<p> +Hint: look at the printf arguments to see how to find the virtual +address that caused the page fault. + +<p> +Hint: steal code from allocuvm() in vm.c, which is what sbrk() +calls (via growproc()). + +<p> +Hint: use PGROUNDDOWN(va) to round the faulting virtual address +down to a page boundary. + +<p> +Hint: <tt>usertrapret()</tt> in order to avoid +the <tt>printf</tt> and the <tt>myproc()->killed = 1</tt>. + +<p> +Hint: you'll need to call mappages(). + +<p>Hint: you can check whether a fault is a page fault by r_scause() + is 13 or 15 in trap(). + +<p>Hint: modify unmappages() to not free pages that aren't mapped. + +<p>Hint: if the kernel crashes, look up sepc in kernel/kernel.asm + +<p>Hint: if you see the error "imcomplete type proc", include "proc.h" + (and "spinlock.h"). + +<p>Hint: the first test in sbrk() allocates something large, this + should succeed now. + +<p> +If all goes well, your lazy allocation code should result in <tt>echo +hi</tt> working. You should get at least one page fault (and thus lazy +allocation) in the shell, and perhaps two. + +<p>If you have the basics working, now turn your implementation into + one that handles the corner cases too: + +<ul> + + <li> Handle negative sbrk() arguments. sbrktest() in usertests will + tests this. + + <li> Handle fork correctly. sbrktst() will test this. + + <li> Make sure that kernel use of not-yet-allocated user addresses + works; for example, if a program passes an sbrk()-allocated + address to write(). sbrktest() will test this. + + <li> Handle out of memory correctly. sbrktst() will test this. + + <li> Handle faults on the invalid page below the stack. stacktest() + in usertests will tests this. + +</ul> + +<p>Run all tests in usertests() to make sure your solution doesn't +break other tests. + +<p> +<div class="question"> +<p><b>Submit</b>: The code that you added to trap.c in a file named <em>hwN.c</em> where <em>N</em> is the homework number as listed on the schedule. +</div> + + +</body> +</html> diff --git a/labs/lock.html b/labs/lock.html new file mode 100644 index 0000000..707d6c4 --- /dev/null +++ b/labs/lock.html @@ -0,0 +1,148 @@ +<html> +<head> +<title>Lab: locks</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: locks</h1> + +<p>In this lab you will try to avoid lock contention for certain +workloads. + +<h2>lock contention</h2> + +<p>The program user/kalloctest stresses xv6's memory allocator: three + processes grow and shrink there address space, which will results in + many calls to <tt>kalloc</tt> and <tt>kfree</tt>, + respectively. <tt>kalloc</tt> and <tt>kfree</tt> + obtain <tt>kmem.lock</tt>. To see if there is lock contention for + <tt>kmem.lock</tt> replace the call to <tt>acquire</tt> + in <tt>kalloc</tt> with the following code: + + <pre> + while(!tryacquire(&kmem.lock)) { + printf("!"); + } + </pre> + +<p><tt>tryacquire</tt> tries to acquire <tt>kmem.lock</tt>: if the + lock is taking it returns false (0); otherwise, it returns true (1) + and with the lock acquired. Your first job is to + implement <tt>tryacquire</tt> in kernel/spinlock.c. + +<p>A few hints: + <ul> + <li>look at <tt>acquire</tt>. + <li>don't forget to restore interrupts when acquision fails + <li>Add tryacquire's signature to defs.h. + </ul> + +<p>Run usertests to see if you didn't break anything. Note that + usertests never prints "!"; there is never contention + for <tt>kmem.lock</tt>. The caller is always able to immediately + acquire the lock and never has to wait because some other process + has the lock. + +<p>Now run kalloctest. You should see quite a number of "!" on the + console. kalloctest causes many processes to contend on + the <tt>kmem.lock</tt>. This lock contention is a bit artificial, + because qemu is simulating 3 processors, but it is likely on real + hardware, there would be contention too. + +<h2>Removing lock contention</h2> + +<p>The root cause of lock contention in kalloctest is that there is a + single free list, protected by a single lock. To remove lock + contention, you will have to redesign the memory allocator to avoid + a single lock and list. The basic idea is to maintain a free list + per CPU, each list with its own lock. Allocations and frees on each + CPU can run in parallel, because each CPU will operate on a + different list. + +<p> The main challenge will be to deal with the case that one CPU runs + out of memory, but another CPU has still free memory; in that case, + the one CPU must "steal" part of the other CPU's free list. + Stealing may introduce lock contention, but that may be acceptable + because it may happen infrequently. + +<p>Your job is to implement per-CPU freelists and stealing when one + CPU is out of memory. Run kalloctest() to see if your + implementation has removed lock contention. + +<p>Some hints: + <ul> + <li>You can use the constant <tt>NCPU</tt> in kernel/param.h + <li>Let <tt>freerange</tt> give all free memory to the CPU + running <tt>freerange</tt>. + <li>The function <tt>cpuid</tt> returns the current core, but note + that you can use it when interrupts are turned off and so you will + need to turn on/off interrupts in your solution. + </ul> + +<p>Run usertests to see if you don't break anything. + +<h2>More scalabale bcache lookup</h2> + + +<p>Several processes reading different files repeatedly will + bottleneck in the buffer cache, bcache, in bio.c. Replace the + acquire in <tt>bget</tt> with + + <pre> + while(!tryacquire(&bcache.lock)) { + printf("!"); + } + </pre> + + and run test0 from bcachetest and you will see "!"s. + +<p>Modify <tt>bget</tt> so that a lookup for a buffer that is in the + bcache doesn't need to acquire <tt>bcache.lock</tt>. This is more + tricky than the kalloc assignment, because bcache buffers are truly + shared among processes. You must maintain the invariant that a + buffer is only once in memory. + +<p> There are several races that <tt>bcache.lock</tt> protects +against, including: + <ul> + <li>A <tt>brelse</tt> may set <tt>b->ref</tt> to 0, + while concurrent <tt>bget</tt> is incrementing it. + <li>Two <tt>bget</tt> may see <tt>b->ref = 0</tt> and one may re-use + the buffer, while the other may replaces it with another block. + <li>A concurrent <tt>brelse</tt> modifies the list + that <tt>bget</tt> traverses. + </ul> + +<p>A challenge is testing whether you code is still correct. One way + to do is to artificially delay certain operations + using <tt>sleepticks</tt>. <tt>test1</tt> trashes the buffer cache + and exercises more code paths. + +<p>Here are some hints: + <ul> + <li>Read the description of buffer cache in the xv6 book (Section 7.2). + <li>Use a simple design: i.e., don't design a lock-free implementation. + <li>Use a simple hash table with locks per bucket. + <li>Searching in hash table for a buffer and allocating an entry + for that buffer when the buffer is not found must be atomic. + <li>It is fine to acquire <tt>bcache.lock</tt> in <tt>brelse</tt> + to update the LRU/MRU list. + </ul> + +<p>Check that your implementation has less contention + on <tt>test0</tt> + +<p>Make sure your implementation passes bcachetest and usertests. + +<p>Optional: + <ul> + <li>make the buffer cache more scalable (e.g., avoid taking + out <tt>bcache.lock</tt> on <tt>brelse</tt>). + <li>make lookup lock-free (Hint: use gcc's <tt>__sync_*</tt> + functions.) How do you convince yourself that your implementation is correct? + </ul> + + +</body> +</html> diff --git a/labs/mmap.html b/labs/mmap.html new file mode 100644 index 0000000..6f779c4 --- /dev/null +++ b/labs/mmap.html @@ -0,0 +1,171 @@ +<html> +<head> +<title>Lab: mmap</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: mmap</h1> + +<p>In this lab you will use </tt>mmap</tt> on Linux to demand-page a +very large table and add memory-mapped files to xv6. + +<h2>Using mmap on Linux</h2> + +<p>This assignment will make you more familiar with how to manage virtual memory +in user programs using the Unix system call interface. You can do this +assignment on any operating system that supports the Unix API (a Linux Athena +machine, your laptop with Linux or MacOS, etc.). + +<p>Download the <a href="mmap.c">mmap homework assignment</a> and look +it over. The program maintains a very large table of square root +values in virtual memory. However, the table is too large to fit in +physical RAM. Instead, the square root values should be computed on +demand in response to page faults that occur in the table's address +range. Your job is to implement the demand faulting mechanism using a +signal handler and UNIX memory mapping system calls. To stay within +the physical RAM limit, we suggest using the simple strategy of +unmapping the last page whenever a new page is faulted in. + +<p>To compile <tt>mmap.c</tt>, you need a C compiler, such as gcc. On Athena, +you can type: +<pre> +$ add gnu +</pre> +Once you have gcc, you can compile mmap.c as follows: +<pre> +$ gcc mmap.c -lm -o mmap +</pre> +Which produces a <tt>mmap</tt> file, which you can run: +<pre> +$ ./mmap +page_size is 4096 +Validating square root table contents... +oops got SIGSEGV at 0x7f6bf7fd7f18 +</pre> + +<p>When the process accesses the square root table, the mapping does not exist +and the kernel passes control to the signal handler code in +<tt>handle_sigsegv()</tt>. Modify the code in <tt>handle_sigsegv()</tt> to map +in a page at the faulting address, unmap a previous page to stay within the +physical memory limit, and initialize the new page with the correct square root +values. Use the function <tt>calculate_sqrts()</tt> to compute the values. +The program includes test logic that verifies if the contents of the +square root table are correct. When you have completed your task +successfully, the process will print “All tests passed!”. + +<p>You may find that the man pages for mmap() and munmap() are helpful references. +<pre> +$ man mmap +$ man munmap +</pre> + + +<h2>Implement memory-mapped files in xv6</h2> + +<p>In this assignment you will implement memory-mapped files in xv6. + The test program <tt>mmaptest</tt> tells you what should work. + +<p>Here are some hints about how you might go about this assignment: + + <ul> + <li>Start with adding the two systems calls to the kernel, as you + done for other systems calls (e.g., <tt>sigalarm</tt>), but + don't implement them yet; just return an + error. run <tt>mmaptest</tt> to observe the error. + + <li>Keep track for each process what <tt>mmap</tt> has mapped. + You will need to allocate a <tt>struct vma</tt> to record the + address, length, permissions, etc. for each virtual memory area + (VMA) that maps a file. Since the xv6 kernel doesn't have a + memory allocator in the kernel, you can use the same approach has + for <tt>struct file</tt>: have a global array of <tt>struct + vma</tt>s and have for each process a fixed-sized array of VMAs + (like the file descriptor array). + + <li>Implement <tt>mmap</tt>: allocate a VMA, add it to the process's + table of VMAs, fill in the VMA, and find a hole in the process's + address space where you will map the file. You can assume that no + file will be bigger than 1GB. The VMA will contain a pointer to + a <tt>struct file</tt> for the file being mapped; you will need to + increase the file's reference count so that the structure doesn't + disappear when the file is closed (hint: + see <tt>filedup</tt>). You don't have worry about overlapping + VMAs. Run <tt>mmaptest</tt>: the first <tt>mmap</tt> should + succeed, but the first access to the mmaped- memory will fail, + because you haven't updated the page fault handler. + + <li>Modify the page-fault handler from the lazy-allocation and COW + labs to call a VMA function that handles page faults in VMAs. + This function allocates a page, reads a 4KB from the mmap-ed + file into the page, and maps the page into the address space of + the process. To read the page, you can use <tt>readi</tt>, + which allows you to specify an offset from where to read in the + file (but you will have to lock/unlock the inode passed + to <tt>readi</tt>). Don't forget to set the permissions correctly + on the page. Run <tt>mmaptest</tt>; you should get to the + first <tt>munmap</tt>. + + <li>Implement <tt>munmap</tt>: find the <tt>struct vma</tt> for + the address and unmap the specified pages (hint: + use <tt>uvmunmap</tt>). If <tt>munmap</tt> removes all pages + from a VMA, you will have to free the VMA (don't forget to + decrement the reference count of the VMA's <tt>struct + file</tt>); otherwise, you may have to shrink the VMA. You can + assume that <tt>munmap</tt> will not split a VMA into two VMAs; + that is, we don't unmap a few pages in the middle of a VMA. If + an unmapped page has been modified and the file is + mapped <tt>MAP_SHARED</tt>, you will have to write the page back + to the file. RISC-V has a dirty bit (<tt>D</tt>) in a PTE to + record whether a page has ever been written too; add the + declaration to kernel/riscv.h and use it. Modify <tt>exit</tt> + to call <tt>munmap</tt> for the process's open VMAs. + Run <tt>mmaptest</tt>; you should <tt>mmaptest</tt>, but + probably not <tt>forktest</tt>. + + <li>Modify <tt>fork</tt> to copy VMAs from parent to child. Don't + forget to increment reference count for a VMA's <tt>struct + file</tt>. In the page fault handler of the child, it is OK to + allocate a new page instead of sharing the page with the + parent. The latter would be cooler, but it would require more + implementation work. Run <tt>mmaptest</tt>; make sure you pass + both <tt>mmaptest</tt> and <tt>forktest</tt>. + + </ul> + +<p>Run usertests to make sure you didn't break anything. + +<p>Optional challenges: + <ul> + + <li>If two processes have the same file mmap-ed (as + in <tt>forktest</tt>), share their physical pages. You will need + reference counts on physical pages. + + <li>The solution above allocates a new physical page for each page + read from the mmap-ed file, even though the data is also in kernel + memory in the buffer cache. Modify your implementation to mmap + that memory, instead of allocating a new page. This requires that + file blocks be the same size as pages (set <tt>BSIZE</tt> to + 4096). You will need to pin mmap-ed blocks into the buffer cache. + You will need worry about reference counts. + + <li>Remove redundancy between your implementation for lazy + allocation and your implementation of mmapp-ed files. (Hint: + create an VMA for the lazy allocation area.) + + <li>Modify <tt>exec</tt> to use a VMA for different sections of + the binary so that you get on-demand-paged executables. This will + make starting programs faster, because <tt>exec</tt> will not have + to read any data from the file system. + + <li>Implement on-demand paging: don't keep a process in memory, + but let the kernel move some parts of processes to disk when + physical memory is low. Then, page in the paged-out memory when + the process references it. Port your linux program from the first + assignment to xv6 and run it. + + </ul> + +</body> +</html> diff --git a/labs/syscall.html b/labs/syscall.html new file mode 100644 index 0000000..2281f2e --- /dev/null +++ b/labs/syscall.html @@ -0,0 +1,443 @@ +<html> +<head> +<title>Lab: Alarm and uthread</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: Alarm and uthread</h1> + +This lab will familiarize you with the implementation of system calls +and switching between threads of execution. In particular, you will +implement new system calls (<tt>sigalarm</tt> and <tt>sigreturn</tt>) +and switching between threads in a user-level thread package. + +<h2>Warmup: RISC-V assembly</h2> + +<p>For this lab it will be important to understand a bit of RISC-V assembly. + +<p>Add a file user/call.c with the following content, modify the + Makefile to add the program to the user programs, and compile (make + fs.img). The Makefile also produces a binary and a readable + assembly a version of the program in the file user/call.asm. +<pre> +#include "kernel/param.h" +#include "kernel/types.h" +#include "kernel/stat.h" +#include "user/user.h" + +int g(int x) { + return x+3; +} + +int f(int x) { + return g(x); +} + +void main(void) { + printf(1, "%d %d\n", f(8)+1, 13); + exit(); +} +</pre> + +<p>Read through user/call.asm and understand it. The instruction manual + for RISC-V is in the doc directory (doc/riscv-spec-v2.2.pdf). Here + are some questions that you should answer for yourself: + + <ul> + <li>Which registers contain arguments to functions? Which + register holds 13 in the call to <tt>printf</tt>? Which register + holds the second argument? Which register holds the third one? Etc. + + <li>Where is the function call to <tt>f</tt> from main? Where + is the call to <tt>g</tt>? + (Hint: the compiler may inline functions.) + + <li>At what address is the function <tt>printf</tt> located? + + <li>What value is in the register <tt>ra</tt> just after the <tt>jalr</tt> + to <tt>printf</tt> in <tt>main</tt>? + </ul> + +<h2>Warmup: system call tracing</h2> + +<p>In this exercise you will modify the xv6 kernel to print out a line +for each system call invocation. It is enough to print the name of the +system call and the return value; you don't need to print the system +call arguments. + +<p> +When you're done, you should see output like this when booting +xv6: + +<pre> +... +fork -> 2 +exec -> 0 +open -> 3 +close -> 0 +$write -> 1 + write -> 1 +</pre> + +<p> +That's init forking and execing sh, sh making sure only two file descriptors are +open, and sh writing the $ prompt. (Note: the output of the shell and the +system call trace are intermixed, because the shell uses the write syscall to +print its output.) + +<p> Hint: modify the syscall() function in kernel/syscall.c. + +<p>Run the xv6 programs you wrote in earlier labs and inspect the system call + trace. Are there many system calls? Which system calls correspond + to code in the applications you wrote? + +<p>Optional: print the system call arguments. + + +<h2>Alarm</h2> + +<p> +In this exercise you'll add a feature to xv6 that periodically alerts +a process as it uses CPU time. This might be useful for compute-bound +processes that want to limit how much CPU time they chew up, or for +processes that want to compute but also want to take some periodic +action. More generally, you'll be implementing a primitive form of +user-level interrupt/fault handlers; you could use something similar +to handle page faults in the application, for example. + +<p> +You should add a new <tt>sigalarm(interval, handler)</tt> system call. +If an application calls <tt>sigalarm(n, fn)</tt>, then after every +<tt>n</tt> "ticks" of CPU time that the program consumes, the kernel +should cause application function +<tt>fn</tt> to be called. When <tt>fn</tt> returns, the application +should resume where it left off. A tick is a fairly arbitrary unit of +time in xv6, determined by how often a hardware timer generates +interrupts. + +<p> +You'll find a file <tt>user/alarmtest.c</tt> in your xv6 +repository. Add it to the Makefile. It won't compile correctly +until you've added <tt>sigalarm</tt> and <tt>sigreturn</tt> +system calls (see below). + +<p> +<tt>alarmtest</tt> calls <tt>sigalarm(2, periodic)</tt> in <tt>test0</tt> to +ask the kernel to force a call to <tt>periodic()</tt> every 2 ticks, +and then spins for a while. +You can see the assembly +code for alarmtest in user/alarmtest.asm, which may be handy +for debugging. +When you've finished the lab, +<tt>alarmtest</tt> should produce output like this: + +<pre> +$ alarmtest +test0 start +......................................alarm! +test0 passed +test1 start +..alarm! +..alarm! +..alarm! +.alarm! +..alarm! +..alarm! +..alarm! +..alarm! +..alarm! +..alarm! +test1 passed +$ +</pre> + +<p>The main challenge will be to arrange that the handler is invoked + when the process's alarm interval expires. You'll need to modify + usertrap() in kernel/trap.c so that when a + process's alarm interval expires, the process executes + the handler. How can you do that? You will need to understand + how system calls work (i.e., the code in kernel/trampoline.S + and kernel/trap.c). Which register contains the address to which + system calls return? + +<p>Your solution will be only a few lines of code, but it may be tricky to + get it right. +We'll test your code with the version of alarmtest.c in the original +repository; if you modify alarmtest.c, make sure your kernel changes +cause the original alarmtest to pass the tests. + +<h3>test0: invoke handler</h3> + +<p>Get started by modifying the kernel to jump to the alarm handler in +user space, which will cause test0 to print "alarm!". Don't worry yet +what happens after the "alarm!" output; it's OK for now if your +program crashes after printing "alarm!". Here are some hints: + +<ul> + +<li>You'll need to modify the Makefile to cause <tt>alarmtest.c</tt> +to be compiled as an xv6 user program. + +<li>The right declarations to put in <tt>user/user.h</tt> are: +<pre> + int sigalarm(int ticks, void (*handler)()); + int sigreturn(void); +</pre> + +<li>Update user/sys.pl (which generates user/usys.S), + kernel/syscall.h, and kernel/syscall.c + to allow <tt>alarmtest</tt> to invoke the sigalarm and + sigreturn system calls. + +<li>For now, your <tt>sys_sigreturn</tt> should just return zero. + +<li>Your <tt>sys_sigalarm()</tt> should store the alarm interval and +the pointer to the handler function in new fields in the <tt>proc</tt> +structure, defined in <tt>kernel/proc.h</tt>. + +<li>You'll need to keep track of how many ticks have passed since the +last call (or are left until the next call) to a process's alarm +handler; you'll need a new field in <tt>struct proc</tt> for this +too. You can initialize <tt>proc</tt> fields in <tt>allocproc()</tt> +in <tt>proc.c</tt>. + +<li>Every tick, the hardware clock forces an interrupt, which is handled +in <tt>usertrap()</tt>; you should add some code here. + +<li>You only want to manipulate a process's alarm ticks if there's a a + timer interrupt; you want something like +<pre> + if(which_dev == 2) ... +</pre> + +<li>Only invoke the alarm function if the process has a + timer outstanding. Note that the address of the user's alarm + function might be 0 (e.g., in alarmtest.asm, <tt>periodic</tt> is at + address 0). + +<li>It will be easier to look at traps with gdb if you tell qemu to +use only one CPU, which you can do by running +<pre> + make CPUS=1 qemu +</pre> + +<li>You've succeeded if alarmtest prints "alarm!". + +</ul> + +<h3>test1(): resume interrupted code</h3> + +Chances are that alarmtest crashes at some point after it prints +"alarm!". Depending on how your solution works, that point may be in +test0, or it may be in test1. Crashes are likely caused +by the alarm handler (<tt>periodic</tt> in alarmtest.c) returning +to the wrong point in the user program. + +<p> +Your job now is to ensure that, when the alarm handler is done, +control returns to +the instruction at which the user program was originally +interrupted by the timer interrupt. You must also ensure that +the register contents are restored to values they held +at the time of the interrupt, so that the user program +can continue undisturbed after the alarm. + +<p>Your solution is likely to require you to save and restore + registers---what registers do you need to save and restore to resume + the interrupted code correctly? (Hint: it will be many). + Several approaches are possible; for this lab you should make + the <tt>sigreturn</tt> system call + restore registers and return to the original + interrupted user instruction. + The user-space alarm handler + calls sigreturn when it is done. + + Some hints: + <ul> + <li>Have <tt>usertrap</tt> save enough state in + <tt>struct proc</tt> when the timer goes off + that <tt>sigreturn</tt> can correctly return to the + interrupted user code. + + <li>Prevent re-entrant calls to the handler----if a handler hasn't + returned yet, the kernel shouldn't call it again. + </ul> + +<p>Once you pass <tt>test0</tt> and <tt>test1</tt>, run usertests to + make sure you didn't break any other parts of the kernel. + +<h2>Uthread: switching between threads</h2> + +<p>Download <a href="uthread.c">uthread.c</a> and <a + href="uthread_switch.S">uthread_switch.S</a> into your xv6 directory. +Make sure <tt>uthread_switch.S</tt> ends with <tt>.S</tt>, not +<tt>.s</tt>. Add the +following rule to the xv6 Makefile after the _forktest rule: + +<pre> +$U/_uthread: $U/uthread.o $U/uthread_switch.o + $(LD) $(LDFLAGS) -N -e main -Ttext 0 -o $U/_uthread $U/uthread.o $U/uthread_switch.o $(ULIB) + $(OBJDUMP) -S $U/_uthread > $U/uthread.asm +</pre> +Make sure that the blank space at the start of each line is a tab, +not spaces. + +<p> +Add <tt>_uthread</tt> in the Makefile to the list of user programs defined by UPROGS. + +<p>Run xv6, then run <tt>uthread</tt> from the xv6 shell. The xv6 kernel will print an error message about <tt>uthread</tt> encountering a page fault. + +<p>Your job is to complete <tt>uthread_switch.S</tt>, so that you see output similar to +this (make sure to run with CPUS=1): +<pre> +~/classes/6828/xv6$ make CPUS=1 qemu +... +$ uthread +my thread running +my thread 0x0000000000002A30 +my thread running +my thread 0x0000000000004A40 +my thread 0x0000000000002A30 +my thread 0x0000000000004A40 +my thread 0x0000000000002A30 +my thread 0x0000000000004A40 +my thread 0x0000000000002A30 +my thread 0x0000000000004A40 +my thread 0x0000000000002A30 +... +my thread 0x0000000000002A88 +my thread 0x0000000000004A98 +my thread: exit +my thread: exit +thread_schedule: no runnable threads +$ +</pre> + +<p><tt>uthread</tt> creates two threads and switches back and forth between +them. Each thread prints "my thread ..." and then yields to give the other +thread a chance to run. + +<p>To observe the above output, you need to complete <tt>uthread_switch.S</tt>, but before +jumping into <tt>uthread_switch.S</tt>, first understand how <tt>uthread.c</tt> +uses <tt>uthread_switch</tt>. <tt>uthread.c</tt> has two global variables +<tt>current_thread</tt> and <tt>next_thread</tt>. Each is a pointer to a +<tt>thread</tt> structure. The thread structure has a stack for a thread and a +saved stack pointer (<tt>sp</tt>, which points into the thread's stack). The +job of <tt>uthread_switch</tt> is to save the current thread state into the +structure pointed to by <tt>current_thread</tt>, restore <tt>next_thread</tt>'s +state, and make <tt>current_thread</tt> point to where <tt>next_thread</tt> was +pointing to, so that when <tt>uthread_switch</tt> returns <tt>next_thread</tt> +is running and is the <tt>current_thread</tt>. + +<p>You should study <tt>thread_create</tt>, which sets up the initial stack for +a new thread. It provides hints about what <tt>uthread_switch</tt> should do. +Note that <tt>thread_create</tt> simulates saving all callee-save registers +on a new thread's stack. + +<p>To write the assembly in <tt>thread_switch</tt>, you need to know how the C +compiler lays out <tt>struct thread</tt> in memory, which is as +follows: + +<pre> + -------------------- + | 4 bytes for state| + -------------------- + | stack size bytes | + | for stack | + -------------------- + | 8 bytes for sp | + -------------------- <--- current_thread + ...... + + ...... + -------------------- + | 4 bytes for state| + -------------------- + | stack size bytes | + | for stack | + -------------------- + | 8 bytes for sp | + -------------------- <--- next_thread +</pre> + +The variables <tt>&next_thread</tt> and <tt>¤t_thread</tt> each +contain the address of a pointer to <tt>struct thread</tt>, and are +passed to <tt>thread_switch</tt>. The following fragment of assembly +will be useful: + +<pre> + ld t0, 0(a0) + sd sp, 0(t0) +</pre> + +This saves <tt>sp</tt> in <tt>current_thread->sp</tt>. This works because +<tt>sp</tt> is at +offset 0 in the struct. +You can study the assembly the compiler generates for +<tt>uthread.c</tt> by looking at <tt>uthread.asm</tt>. + +<p>To test your code it might be helpful to single step through your +<tt>uthread_switch</tt> using <tt>riscv64-linux-gnu-gdb</tt>. You can get started in this way: + +<pre> +(gdb) file user/_uthread +Reading symbols from user/_uthread... +(gdb) b *0x230 + +</pre> +0x230 is the address of uthread_switch (see uthread.asm). When you +compile it may be at a different address, so check uthread_asm. +You may also be able to type "b uthread_switch". <b>XXX This doesn't work + for me; why?</b> + +<p>The breakpoint may (or may not) be triggered before you even run +<tt>uthread</tt>. How could that happen? + +<p>Once your xv6 shell runs, type "uthread", and gdb will break at +<tt>thread_switch</tt>. Now you can type commands like the following to inspect +the state of <tt>uthread</tt>: + +<pre> + (gdb) p/x *next_thread + $1 = {sp = 0x4a28, stack = {0x0 (repeats 8088 times), + 0x68, 0x1, 0x0 <repeats 102 times>}, state = 0x1} +</pre> +What address is <tt>0x168</tt>, which sits on the bottom of the stack +of <tt>next_thread</tt>? + +With "x", you can examine the content of a memory location +<pre> + (gdb) x/x next_thread->sp + 0x4a28 <all_thread+16304>: 0x00000168 +</pre> +Why does that print <tt>0x168</tt>? + +<h3>Optional challenges</h3> + +<p>The user-level thread package interacts badly with the operating system in +several ways. For example, if one user-level thread blocks in a system call, +another user-level thread won't run, because the user-level threads scheduler +doesn't know that one of its threads has been descheduled by the xv6 scheduler. As +another example, two user-level threads will not run concurrently on different +cores, because the xv6 scheduler isn't aware that there are multiple +threads that could run in parallel. Note that if two user-level threads were to +run truly in parallel, this implementation won't work because of several races +(e.g., two threads on different processors could call <tt>thread_schedule</tt> +concurrently, select the same runnable thread, and both run it on different +processors.) + +<p>There are several ways of addressing these problems. One is + using <a href="http://en.wikipedia.org/wiki/Scheduler_activations">scheduler + activations</a> and another is to use one kernel thread per + user-level thread (as Linux kernels do). Implement one of these ways + in xv6. This is not easy to get right; for example, you will need to + implement TLB shootdown when updating a page table for a + multithreaded user process. + +<p>Add locks, condition variables, barriers, +etc. to your thread package. + +</body> +</html> + diff --git a/labs/xv6.html b/labs/xv6.html new file mode 100644 index 0000000..13d581e --- /dev/null +++ b/labs/xv6.html @@ -0,0 +1,238 @@ +<html> +<head> +<title>Lab: xv6</title> +<link rel="stylesheet" href="homework.css" type="text/css" /> +</head> +<body> + +<h1>Lab: xv6</h1> + +This lab makes you familiar with xv6 and its system calls. + +<h2>Boot xv6</h2> + +<p>Login to Athena (e.g., ssh -X athena.dialup.mit.edu) and attach the course +locker: (You must run this command every time you log in; or add it to your +~/.environment file.) + +<pre> +$ add -f 6.828 +</pre> + +<p>Fetch the xv6 source: + +<pre> +$ mkdir 6.828 +$ cd 6.828 +$ git clone git://github.com/mit-pdos/xv6-riscv.git +Cloning into 'xv6-riscv'... +... +$ +</pre> + +<p>XXX pointer to an update tools page + +<p>Build xv6 on Athena: +<pre> +$ cd xv6-public +$ makeriscv64-linux-gnu-gcc -c -o kernel/entry.o kernel/entry.S +riscv64-linux-gnu-gcc -Wall -Werror -O -fno-omit-frame-pointer -ggdb -MD -mcmodel=medany -ffreestanding -fno-common -nostdlib -mno-relax -I. -fno-stack-protector -fno-pie -no-pie -c -o kernel/start.o kernel/start.c +... +$ make qemu +... +mkfs/mkfs fs.img README user/_cat user/_echo user/_forktest user/_grep user/_init user/_kill user/_ln user/_ls user/_mkdir user/_rm user/_sh user/_stressfs user/_usertests user/_wc user/_zombie user/_cow +nmeta 46 (boot, super, log blocks 30 inode blocks 13, bitmap blocks 1) blocks 954 total 1000 +balloc: first 497 blocks have been allocated +balloc: write bitmap block at sector 45 +qemu-system-riscv64 -machine virt -kernel kernel/kernel -m 3G -smp 3 -nographic -drive file=fs.img,if=none,format=raw,id=x0 -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 +hart 0 starting +hart 2 starting +hart 1 starting +init: starting sh +$ +</pre> + +<p> +If you type <tt>ls</tt> at the prompt, you should output similar to the following: +<pre> +$ ls +. 1 1 1024 +.. 1 1 1024 +README 2 2 2181 +cat 2 3 21024 +echo 2 4 19776 +forktest 2 5 11456 +grep 2 6 24512 +init 2 7 20656 +kill 2 8 19856 +ln 2 9 19832 +ls 2 10 23280 +mkdir 2 11 19952 +rm 2 12 19936 +sh 2 13 38632 +stressfs 2 14 20912 +usertests 2 15 106264 +wc 2 16 22160 +zombie 2 17 19376 +cow 2 18 27152 +console 3 19 0 +</pre> +These are the programs/files that <tt>mkfs</tt> includes in the +initial file system. You just ran one of them: <tt>ls</tt>. + +<h2>sleep</h2> + +<p>Implement the UNIX program sleep for xv6; your sleep should pause + for a user-specified number of ticks. + +<p>Some hints: + <ul> + <li>Look at some of the other programs in <tt>user/</tt> to see + how you can obtain the command-line arguments passed to a program. If the user + forgets to pass an argument, sleep should print an error message. + + <li>The command-line argument is passed as a string; you can convert it to an + integer using <tt>atoi</tt> (see user/ulib.c). + + <li>Use the system call <tt>sleep</tt> (see user/usys.S and kernel/sysproc.c). + + <li>Make sure <tt>main</tt> calls <tt>exit()</tt> in order to exit + your program. + + <li>Add the program to <tt>UPROGS</tt> in Makefile and compile + user programs by typing <tt>make fs.img</tt>. + + </ul> + + <p>Run the program from the xv6 shell: + <pre> + $ make qemu + ... + init: starting sh + $ sleep 10 + (waits for a little while) + $ + </pre> + + <p>Optional: write an uptime program that prints the uptime in terms + of ticks using the <tt>uptime</tt> system call. + +<h2>pingpong</h2> + +<p> Write a program that uses UNIX system calls to ``ping-pong'' a + byte between two processes over a pair of pipes, one for each + direction. The parent sends by writing a byte to <tt>fd[1]</tt> and + the child receives it by reading from <tt>fd[0]</tt>. After + receiving a byte from parent, the child responds with its own byte + by writing to <tt>fd[1]</tt>, which the parent then reads. + +<p>Some hints: + <ul> + <li>Use <tt>pipe</tt> to create a pipe. + <li>Use <tt>fork</tt> to create a child. + <li>Use <tt>read</tt> to read from the pipe, and <tt>write</tt> to write to the pipe. + </ul> + +<h2>primes</h2> + + <p>Write a concurrent version of prime sieve using pipes. This idea + is due to Doug McIlroy, inventor of Unix pipes. The picture + halfway down <a href="http://swtch.com/~rsc/thread/">the page</a> + and the text surrounding it explain how to do it. + + <p>Your goal is to use <tt>pipe</tt> and <tt>fork</tt> to set up + the pipeline. The first process feeds the numbers 2 through 35 + into the pipeline. For each prime number, you will arrange to + create one process that reads from its left neighbor over a pipe + and writes to its right neighbor over another pipe. Since xv6 has + limited number of file descriptors and processes, the first + process can stop at 35. + +<p>Some hints: + <ul> + <li>Be careful to close file descriptors that a process doesn't + need, because otherwise your program will run xv6 out of resources + before the first process reaches 35. + + <li>Once the first process reach 35, you should arrange that the + pipeline terminates cleanly (Hint: read will return an end-of-file + when the write-side of the pipe is closed). + </ul> + +<h2>find</h2> + +<p>Write a simple version of the UNIX find program: find all the files + in a directory tree whose name matches a string. For example if the + file system contains a file <tt>a/b</tt>, then running find as + follows should produce: + <pre> + $ find . b + ./a/b + $ + </pre> + +<p>Some hints: + <ul> + <li>Look at user/ls.c to see how to read directories. + <li>Use recursion to run find in sub-directories. + <li>Don't recurse into "." and "..". + </ul> + +<p>Optional: support regular expressions in name matching. Grep has some + primitive support for regular expressions. + +<h2>xargs</h2> + +<p>Write a simple version of the UNIX xargs program: read lines from + standard in and run a command for each line, supplying the line as + arguments to the command. The following example illustrates xarg's + behavior: + <pre> + $ xargs echo bye + hello too + bye hello too + <ctrl-d> + $ + </pre> + Note that the command here is "echo bye" and the additional + arguments are "hello too", making the command "echo bye hello too", + which outputs "bye hello too". + +<p>xargs and find combine well: + <pre> + find . b | xargs grep hello + </pre> + will run "grep hello" on each file named b in the directories below ".". + +<p>Some hints: + <ul> + <li>Use <tt>fork</tt> and <tt>exec</tt> system call to invoke the + command on each line of input. Use <tt>wait</tt> in the parent + to wait for the child to complete running the command. + <li>Read from stdin a character at the time until the newline + character ('\n'). + <li>kernel/param.h declares MAXARG, which may be useful if you need + to declare an argv. + </ul> + +<h2>Optional: modify the shell</h2> + +There are endless ways in which the shell could be extended. Here are +some suggestions: + +<ul> + +<li>Modify the shell to support wait. + +<li>Modify the shell to support lists of commands, separated by ";" + +<li>Modify the shell to support sub-shells by implementing "(" and ")" + +<li>Modify the shell to allow users to edit the command line + +</ul> + +</body> +</html> + + |