diff options
Diffstat (limited to 'labs/fs.html')
-rw-r--r-- | labs/fs.html | 360 |
1 files changed, 0 insertions, 360 deletions
diff --git a/labs/fs.html b/labs/fs.html deleted file mode 100644 index a21e61f..0000000 --- a/labs/fs.html +++ /dev/null @@ -1,360 +0,0 @@ -<html> -<head> -<title>Lab: file system</title> -<link rel="stylesheet" href="homework.css" type="text/css" /> -</head> -<body> - -<h1>Lab: file system</h1> - -<p>In this lab you will add large files and <tt>mmap</tt> to the xv6 file system. - -<h2>Large files</h2> - -<p>In this assignment you'll increase the maximum size of an xv6 -file. Currently xv6 files are limited to 268 blocks, or 268*BSIZE -bytes (BSIZE is 1024 in xv6). This limit comes from the fact that an -xv6 inode contains 12 "direct" block numbers and one "singly-indirect" -block number, which refers to a block that holds up to 256 more block -numbers, for a total of 12+256=268. You'll change the xv6 file system -code to support a "doubly-indirect" block in each inode, containing -256 addresses of singly-indirect blocks, each of which can contain up -to 256 addresses of data blocks. The result will be that a file will -be able to consist of up to 256*256+256+11 blocks (11 instead of 12, -because we will sacrifice one of the direct block numbers for the -double-indirect block). - -<h3>Preliminaries</h3> - -<p>Modify your Makefile's <tt>CPUS</tt> definition so that it reads: -<pre> -CPUS := 1 -</pre> - -<b>XXX doesn't seem to speedup things</b> -<p>Add -<pre> -QEMUEXTRA = -snapshot -</pre> -right before -<tt>QEMUOPTS</tt> -<p> -The above two steps speed up qemu tremendously when xv6 -creates large files. - -<p><tt>mkfs</tt> initializes the file system to have fewer -than 1000 free data blocks, too few to show off the changes -you'll make. Modify <tt>param.h</tt> to -set <tt>FSSIZE</tt> to: -<pre> - #define FSSIZE 20000 // size of file system in blocks -</pre> - -<p>Download <a href="big.c">big.c</a> into your xv6 directory, -add it to the UPROGS list, start up xv6, and run <tt>big</tt>. -It creates as big a file as xv6 will let -it, and reports the resulting size. It should say 140 sectors. - -<h3>What to Look At</h3> - -The format of an on-disk inode is defined by <tt>struct dinode</tt> -in <tt>fs.h</tt>. You're particularly interested in <tt>NDIRECT</tt>, -<tt>NINDIRECT</tt>, <tt>MAXFILE</tt>, and the <tt>addrs[]</tt> element -of <tt>struct dinode</tt>. Look Figure 7.3 in the xv6 text for a -diagram of the standard xv6 inode. - -<p> -The code that finds a file's data on disk is in <tt>bmap()</tt> -in <tt>fs.c</tt>. Have a look at it and make sure you understand -what it's doing. <tt>bmap()</tt> is called both when reading and -writing a file. When writing, <tt>bmap()</tt> allocates new -blocks as needed to hold file content, as well as allocating -an indirect block if needed to hold block addresses. - -<p> -<tt>bmap()</tt> deals with two kinds of block numbers. The <tt>bn</tt> -argument is a "logical block" -- a block number relative to the start -of the file. The block numbers in <tt>ip->addrs[]</tt>, and the -argument to <tt>bread()</tt>, are disk block numbers. -You can view <tt>bmap()</tt> as mapping a file's logical -block numbers into disk block numbers. - -<h3>Your Job</h3> - -Modify <tt>bmap()</tt> so that it implements a doubly-indirect -block, in addition to direct blocks and a singly-indirect block. -You'll have to have only 11 direct blocks, rather than 12, -to make room for your new doubly-indirect block; you're -not allowed to change the size of an on-disk inode. -The first 11 elements of <tt>ip->addrs[]</tt> should be -direct blocks; the 12th should be a singly-indirect block -(just like the current one); the 13th should be your new -doubly-indirect block. - -<p> -You don't have to modify xv6 to handle deletion of files with -doubly-indirect blocks. - -<p> -If all goes well, <tt>big</tt> will now report that it -can write sectors. It will take <tt>big</tt> minutes -to finish. - -<b>XXX this runs for a while!</b> - -<h3>Hints</h3> - -<p> -Make sure you understand <tt>bmap()</tt>. Write out a diagram of the -relationships between <tt>ip->addrs[]</tt>, the indirect block, the -doubly-indirect block and the singly-indirect blocks it points to, and -data blocks. Make sure you understand why adding a doubly-indirect -block increases the maximum file size by 256*256 blocks (really -1), -since you have to decrease the number of direct blocks by one). - -<p> -Think about how you'll index the doubly-indirect block, and -the indirect blocks it points to, with the logical block -number. - -<p>If you change the definition of <tt>NDIRECT</tt>, you'll -probably have to change the size of <tt>addrs[]</tt> -in <tt>struct inode</tt> in <tt>file.h</tt>. Make sure that -<tt>struct inode</tt> and <tt>struct dinode</tt> have the -same number of elements in their <tt>addrs[]</tt> arrays. - -<p>If you change the definition of <tt>NDIRECT</tt>, make sure to create a -new <tt>fs.img</tt>, since <tt>mkfs</tt> uses <tt>NDIRECT</tt> too to build the -initial file systems. If you delete <tt>fs.img</tt>, <tt>make</tt> on Unix (not -xv6) will build a new one for you. - -<p>If your file system gets into a bad state, perhaps by crashing, -delete <tt>fs.img</tt> (do this from Unix, not xv6). <tt>make</tt> will build a -new clean file system image for you. - -<p>Don't forget to <tt>brelse()</tt> each block that you -<tt>bread()</tt>. - -<p>You should allocate indirect blocks and doubly-indirect - blocks only as needed, like the original <tt>bmap()</tt>. - -<p>Optional challenge: support triple-indirect blocks. - -<h2>Writing with a Log</h2> - -Insert a print statement in bwrite (in bio.c) so that you get a -print every time a block is written to disk: - -<pre> - printf("bwrite block %d\n", b->blockno); -</pre> - -Build and boot a new kernel and run this: -<pre> - $ rm README -</pre> - -<p>You should see a sequence of bwrite prints after the <tt>rm</tt>.</p> - -<div class="question"> -<ol> -<li>Annotate the bwrite lines with the kind of information that is -being written to the disk (e.g., "README's inode", "allocation -bitmap"). If the log is being written, note both that the log is being -written and also what kind of information is being written to the log. -<li>Mark with an arrow the first point at which, if a -crash occured, README would be missing after a reboot -(after the call to <tt>recover_from_log()</tt>). -</ol> -</p> -</div> - - -<h2>Crash safety</h2> - -<p>This assignment explores the xv6 log in two parts. -First, you'll artificially create a crash which illustrates -why logging is needed. Second, you'll remove one -inefficiency in the xv6 logging system. - -<p> -Submit your solution before the beginning of the next lecture -to <a href="https://6828.scripts.mit.edu/2018/handin.py/">the submission -web site</a>. - -<h3>Creating a Problem</h3> - -<p> -The point of the xv6 log is to cause all the disk updates of a -filesystem operation to be atomic with respect to crashes. -For example, file creation involves both adding a new entry -to a directory and marking the new file's inode as in-use. -A crash that happened after one but before the other would -leave the file system in an incorrect state after a reboot, -if there were no log. - -<p> -The following steps will break the logging code in a way that -leaves a file partially created. - -<p> -First, replace <tt>commit()</tt> in <tt>log.c</tt> with -this code: -<pre> -#include "kernel/proc.h" -void -commit(void) -{ - int pid = myproc()->pid; - if (log.lh.n > 0) { - write_log(); - write_head(); - if(pid > 1) // AAA - log.lh.block[0] = 0; // BBB - install_trans(); - if(pid > 1) // AAA - panic("commit mimicking crash"); // CCC - log.lh.n = 0; - write_head(); - } -} -</pre> - -<p> -The BBB line causes the first block in the log to be written to -block zero, rather than wherever it should be written. During file -creation, the first block in the log is the new file's inode updated -to have non-zero <tt>type</tt>. -Line BBB causes the block -with the updated inode to be written to block 0 (whence -it will never be read), leaving the on-disk inode still marked -unallocated. The CCC line forces a crash. -The AAA lines suppress this buggy behavior for <tt>init</tt>, -which creates files before the shell starts. - -<p> -Second, replace <tt>recover_from_log()</tt> in <tt>log.c</tt> -with this code: -<pre> -static void -recover_from_log(void) -{ - read_head(); - printf("recovery: n=%d but ignoring\n", log.lh.n); - // install_trans(); - log.lh.n = 0; - // write_head(); -} -</pre> - -<p> -This modification suppresses log recovery (which would repair -the damage caused by your change to <tt>commit()</tt>). - -<p> -Finally, remove the <tt>-snapshot</tt> option from the definition -of <tt>QEMUEXTRA</tt> in your Makefile so that the disk image will see the -changes. - -<p> -Now remove <tt>fs.img</tt> and run xv6: -<pre> - % rm fs.img ; make qemu -</pre> -<p> -Tell the xv6 shell to create a file: -<pre> - $ echo hi > a -</pre> - -<p> -You should see the panic from <tt>commit()</tt>. So far -it is as if a crash occurred in a non-logging system in the middle -of creating a file. - -<p> -Now re-start xv6, keeping the same <tt>fs.img</tt>: -<pre> - % make qemu -</pre> - -<p> -And look at file <tt>a</tt>: -<pre> - $ cat a -</pre> - -<p> - You should see <tt>panic: ilock: no type</tt>. Make sure you understand what happened. -Which of the file creation's modifications were written to the disk -before the crash, and which were not? - -<h3>Solving the Problem</h3> - -Now fix <tt>recover_from_log()</tt>: -<pre> -static void -recover_from_log(void) -{ - read_head(); - cprintf("recovery: n=%d\n", log.lh.n); - install_trans(); - log.lh.n = 0; - write_head(); -} -</pre> - -<p> -Run xv6 (keeping the same <tt>fs.img</tt>) and read <tt>a</tt> again: -<pre> - $ cat a -</pre> - -<p> -This time there should be no crash. Make sure you understand why -the file system now works. - -<p> -Why was the file empty, even though you created -it with <tt>echo hi > a</tt>? - -<p> -Now remove your modifications to <tt>commit()</tt> -(the if's and the AAA and BBB lines), so that logging works again, -and remove <tt>fs.img</tt>. - -<h3>Streamlining Commit</h3> - -<p> -Suppose the file system code wants to update an inode in block 33. -The file system code will call <tt>bp=bread(block 33)</tt> and update the -buffer data. <tt>write_log()</tt> in <tt>commit()</tt> -will copy the data to a block in the log on disk, for example block 3. -A bit later in <tt>commit</tt>, <tt>install_trans()</tt> reads -block 3 from the log (containing block 33), copies its contents into the in-memory -buffer for block 33, and then writes that buffer to block 33 on the disk. - -<p> -However, in <tt>install_trans()</tt>, it turns out that the modified -block 33 is guaranteed to be still in the buffer cache, where the -file system code left it. Make sure you understand why it would be a -mistake for the buffer cache to evict block 33 from the buffer cache -before the commit. - -<p> -Since the modified block 33 is guaranteed to already be in the buffer -cache, there's no need for <tt>install_trans()</tt> to read block -33 from the log. Your job: modify <tt>log.c</tt> so that, when -<tt>install_trans()</tt> is called from <tt>commit()</tt>, -<tt>install_trans()</tt> does not perform the needless read from the log. - -<p>To test your changes, create a file in xv6, restart, and make sure -the file is still there. - -<b>XXX Does this speedup bigfile?</b> - -<b>XXX Maybe support lseek and modify shell to append to a file?</b> - - -</body> -</html> |