| author | rsc <rsc> | 2008-09-03 04:50:04 +0000 | 
|---|---|---|
| committer | rsc <rsc> | 2008-09-03 04:50:04 +0000 | 
| commit | f53494c28e362fb7752bbc83417b9ba47cff0bf5 (patch) | |
| tree | 7a7474710c9553b0188796ba24ae3af992320153 /web/l2.html | |
| parent | ee3f75f229742a376bedafe34d0ba04995a942be (diff) | |
DO NOT MAIL: xv6 web pages
Diffstat (limited to 'web/l2.html')
| -rw-r--r-- | web/l2.html | 494 | 
1 files changed, 494 insertions, 0 deletions
| diff --git a/web/l2.html b/web/l2.html new file mode 100644 index 0000000..e183d5a --- /dev/null +++ b/web/l2.html @@ -0,0 +1,494 @@ +<html> +<head> +<title>L2</title> +</head> +<body> + +<h1>6.828 Lecture Notes: x86 and PC architecture</h1> + +<h2>Outline</h2> +<ul> +<li>PC architecture +<li>x86 instruction set +<li>gcc calling conventions +<li>PC emulation +</ul> + +<h2>PC architecture</h2> + +<ul> +<li>A full PC has: +  <ul> +  <li>an x86 CPU with registers, execution unit, and memory management +  <li>CPU chip pins include address and data signals +  <li>memory +  <li>disk +  <li>keyboard +  <li>display +  <li>other resources: BIOS ROM, clock, ... +  </ul> + +<li>We will start with the original 16-bit 8086 CPU (1978) +<li>CPU runs instructions: +<pre> +for(;;){ +	run next instruction +} +</pre> + +<li>Needs work space: registers +	<ul> +	<li>four 16-bit data registers: AX, CX, DX, BX +        <li>each in two 8-bit halves, e.g. AH and AL +	<li>very fast, very few +	</ul> +<li>More work space: memory +	<ul> +	<li>CPU sends out address on address lines (wires, one bit per wire) +	<li>Data comes back on data lines +	<li><i>or</i> data is written to data lines +	</ul> + +<li>Add address registers: pointers into memory +	<ul> +	<li>SP - stack pointer +	<li>BP - frame base pointer +	<li>SI - source index +	<li>DI - destination index +	</ul> + +<li>Instructions are in memory too! +	<ul> +	<li>IP - instruction pointer (PC on PDP-11, everything else) +	<li>increment after running each instruction +	<li>can be modified by CALL, RET, JMP, conditional jumps +	</ul> + +<li>Want conditional jumps +	<ul> +	<li>FLAGS - various condition codes +		<ul> +		<li>whether last arithmetic operation overflowed +		<li> ... was positive/negative +		<li> ... was [not] zero +		<li> ... carry/borrow on add/subtract +		<li> ... overflow +		<li> ... etc. +		<li>whether interrupts are enabled +		<li>direction of data copy instructions +		</ul> +	<li>JP, JN, J[N]Z, J[N]C, J[N]O ... 
+	</ul> + +<li>Still not interesting - need I/O to interact with outside world +	<ul> +	<li>Original PC architecture: use dedicated <i>I/O space</i> +		<ul> +		<li>Works the same as memory accesses, but sets the I/O signal +		<li>Only 1024 I/O addresses +		<li>Example: write a byte to line printer: +<pre> +#define DATA_PORT    0x378 +#define STATUS_PORT  0x379 +#define   BUSY 0x80 +#define CONTROL_PORT 0x37A +#define   STROBE 0x01 +void +lpt_putc(int c) +{ +  /* wait for printer to consume previous byte */ +  while((inb(STATUS_PORT) & BUSY) == 0) +    ; + +  /* put the byte on the parallel lines */ +  outb(DATA_PORT, c); + +  /* tell the printer to look at the data */ +  outb(CONTROL_PORT, STROBE); +  outb(CONTROL_PORT, 0); +} +</pre> +		</ul> + +<li>Memory-Mapped I/O +	<ul> +	<li>Use normal physical memory addresses +		<ul> +		<li>Gets around limited size of I/O address space +		<li>No need for special instructions +		<li>System controller routes to appropriate device +		</ul> +	<li>Works like ``magic'' memory: +		<ul> +		<li>	<i>Addressed</i> and <i>accessed</i> like memory, +			but ... +		<li>	... does not <i>behave</i> like memory! +		<li>	Reads and writes can have ``side effects'' +		<li>	Read results can change due to external events +		</ul> +	</ul> +</ul> + + +<li>What if we want to use more than 2^16 bytes of memory? 
+	<ul> +        <li>8086 has 20-bit physical addresses, can have 1 Meg RAM +        <li>each segment is a 2^16 byte window into physical memory +        <li>virtual to physical translation: pa = va + seg*16 +        <li>the segment is usually implicit, from a segment register +	<li>CS - code segment (for fetches via IP) +	<li>SS - stack segment (for load/store via SP and BP) +	<li>DS - data segment (for load/store via other registers) +	<li>ES - another data segment (destination for string operations) +        <li>tricky: can't use the 16-bit address of a stack variable as a pointer +        <li>but a <i>far pointer</i> includes full segment:offset (16 + 16 bits) +	</ul> + +<li>But 8086's 16-bit addresses and data were still painfully small +  <ul> +  <li>80386 added support for 32-bit data and addresses (1985) +  <li>boots in 16-bit mode, boot.S switches to 32-bit mode +  <li>registers are 32 bits wide, called EAX rather than AX +  <li>operands and addresses are also 32 bits, e.g. ADD does 32-bit arithmetic +  <li>prefix 0x66 gets you 16-bit mode: MOVW is really 0x66 MOVW +  <li>the .code32 in boot.S tells assembler to generate 0x66 for e.g. MOVW +  <li>80386 also changed segments and added paged memory... 
+  </ul> + +</ul> + +<h2>x86 Physical Memory Map</h2> + +<ul> +<li>The physical address space mostly looks like ordinary RAM +<li>Except some low-memory addresses actually refer to other things +<li>Writes to VGA memory appear on the screen +<li>Reset or power-on jumps to ROM at 0x000ffff0 +</ul> + +<pre> ++------------------+  <- 0xFFFFFFFF (4GB) +|      32-bit      | +|  memory mapped   | +|     devices      | +|                  | +/\/\/\/\/\/\/\/\/\/\ + +/\/\/\/\/\/\/\/\/\/\ +|                  | +|      Unused      | +|                  | ++------------------+  <- depends on amount of RAM +|                  | +|                  | +| Extended Memory  | +|                  | +|                  | ++------------------+  <- 0x00100000 (1MB) +|     BIOS ROM     | ++------------------+  <- 0x000F0000 (960KB) +|  16-bit devices, | +|  expansion ROMs  | ++------------------+  <- 0x000C0000 (768KB) +|   VGA Display    | ++------------------+  <- 0x000A0000 (640KB) +|                  | +|    Low Memory    | +|                  | ++------------------+  <- 0x00000000 +</pre> + +<h2>x86 Instruction Set</h2> + +<ul> +<li>Two-operand instruction set +	<ul> +	<li>Intel syntax: <tt>op dst, src</tt> +	<li>AT&T (gcc/gas) syntax: <tt>op src, dst</tt> +		<ul> +		<li>uses b, w, l suffix on instructions to specify size of operands +		</ul> +	<li>Operands are registers, constant, memory via register, memory via constant +	<li>	Examples: +		<table cellspacing=5> +		<tr><td><u>AT&T syntax</u> <td><u>"C"-ish equivalent</u> +		<tr><td>movl %eax, %edx <td>edx = eax; <td><i>register mode</i> +		<tr><td>movl $0x123, %edx <td>edx = 0x123; <td><i>immediate</i> +		<tr><td>movl 0x123, %edx <td>edx = *(int32_t*)0x123; <td><i>direct</i> +		<tr><td>movl (%ebx), %edx <td>edx = *(int32_t*)ebx; <td><i>indirect</i> +		<tr><td>movl 4(%ebx), %edx <td>edx = *(int32_t*)(ebx+4); <td><i>displaced</i> +		</table> +	</ul> + +<li>Instruction classes +	<ul> +	<li>data movement: MOV, PUSH, POP, ... 
+	<li>arithmetic: TEST, SHL, ADD, AND, ... +	<li>i/o: IN, OUT, ... +	<li>control: JMP, JZ, JNZ, CALL, RET +	<li>string: REP MOVSB, ... +	<li>system: IRET, INT +	</ul> + +<li>Intel architecture manual Volume 2 is <i>the</i> reference + +</ul> + +<h2>gcc x86 calling conventions</h2> + +<ul> +<li>x86 dictates that stack grows down: +	<table cellspacing=5> +	<tr><td><u>Example instruction</u> <td><u>What it does</u> +	<tr><td>pushl %eax +		<td> +		subl $4, %esp <br> +		movl %eax, (%esp) <br> +	<tr><td>popl %eax +		<td> +		movl (%esp), %eax <br> +		addl $4, %esp <br> +	<tr><td>call $0x12345 +		<td> +		pushl %eip <sup>(*)</sup> <br> +		movl $0x12345, %eip <sup>(*)</sup> <br> +	<tr><td>ret +		<td> +		popl %eip <sup>(*)</sup> +	</table> +	(*) <i>Not real instructions</i> + +<li>GCC dictates how the stack is used. +	Contract between caller and callee on x86: +	<ul> +	<li>after call instruction: +		<ul> +		<li>%eip points at first instruction of function +		<li>%esp+4 points at first argument +		<li>%esp points at return address +		</ul> +	<li>after ret instruction: +		<ul> +		<li>%eip contains return address +		<li>%esp points at arguments pushed by caller +		<li>called function may have trashed arguments +		<li>%eax contains return value +			(or trash if function is <tt>void</tt>) +		<li>%ecx, %edx may be trashed +		<li>%ebp, %ebx, %esi, %edi must contain contents from time of <tt>call</tt> +		</ul> +	<li>Terminology: +		<ul> +		<li>%eax, %ecx, %edx are "caller save" registers +		<li>%ebp, %ebx, %esi, %edi are "callee save" registers +		</ul> +	</ul> + +<li>Functions can do anything that doesn't violate contract. 
+	By convention, GCC does more: +	<ul> +	<li>each function has a stack frame marked by %ebp, %esp +		<pre> +		       +------------+   | +		       | arg 2      |   \ +		       +------------+    >- previous function's stack frame +		       | arg 1      |   / +		       +------------+   | +		       | ret %eip   |   / +		       +============+    +		       | saved %ebp |   \ +		%ebp-> +------------+   | +		       |            |   | +		       |   local    |   \ +		       | variables, |    >- current function's stack frame +		       |    etc.    |   / +		       |            |   | +		       |            |   | +		%esp-> +------------+   / +		</pre> +	<li>%esp can move to make stack frame bigger, smaller +	<li>%ebp points at saved %ebp from previous function, +		chain to walk stack +	<li>function prologue: +		<pre> +			pushl %ebp +			movl %esp, %ebp +		</pre> +	<li>function epilogue: +		<pre> +			movl %ebp, %esp +			popl %ebp +		</pre> +		or +		<pre> +			leave +		</pre> +	</ul> + +<li>Big example: +	<ul> +	<li>C code +		<pre> +		int main(void) { return f(8)+1; } +		int f(int x) { return g(x); } +		int g(int x) { return x+3; } +		</pre> +	<li>assembler +		<pre> +		_main: +					<i>prologue</i> +			pushl %ebp +			movl %esp, %ebp +					<i>body</i> +			pushl $8 +			call _f +			addl $1, %eax +					<i>epilogue</i> +			movl %ebp, %esp +			popl %ebp +			ret +		_f: +					<i>prologue</i> +			pushl %ebp +			movl %esp, %ebp +					<i>body</i> +			pushl 8(%esp) +			call _g +					<i>epilogue</i> +			movl %ebp, %esp +			popl %ebp +			ret + +		_g: +					<i>prologue</i> +			pushl %ebp +			movl %esp, %ebp +					<i>save %ebx</i> +			pushl %ebx +					<i>body</i> +			movl 8(%ebp), %ebx +			addl $3, %ebx +			movl %ebx, %eax +					<i>restore %ebx</i> +			popl %ebx +					<i>epilogue</i> +			movl %ebp, %esp +			popl %ebp +			ret +		</pre> +	</ul> + +<li>Super-small <tt>_g</tt>: +	<pre> +		_g: +			movl 4(%esp), %eax +			addl $3, %eax +			ret +	</pre> + +<li>Compiling, linking, loading: +	<ul> +	<li>	
<i>Compiler</i> takes C source code (ASCII text), +		produces assembly language (also ASCII text) +	<li>	<i>Assembler</i> takes assembly language (ASCII text), +		produces <tt>.o</tt> file (binary, machine-readable!) +	<li>	<i>Linker</i> takes multiple '<tt>.o</tt>'s, +		produces a single <i>program image</i> (binary) +	<li>	<i>Loader</i> loads the program image into memory +		at run-time and starts it executing +	</ul> +</ul> + + +<h2>PC emulation</h2> + +<ul> +<li>	An emulator like Bochs works by +	<ul> +	<li>	doing exactly what a real PC would do, +	<li>	only implemented in software rather than hardware! +	</ul> +<li>	Runs as a normal process in a "host" operating system (e.g., Linux) +<li>	Uses normal process storage to hold emulated hardware state: +	e.g., +	<ul> +	<li>	Hold emulated CPU registers in global variables +		<pre> +		int32_t regs[8]; +		#define REG_EAX 1 +		#define REG_EBX 2 +		#define REG_ECX 3 +		... +		int32_t eip; +		int16_t segregs[4]; +		... +		</pre> +	<li>	<tt>malloc</tt> a big chunk of (virtual) process memory +		to hold emulated PC's (physical) memory +	</ul> +<li>	Execute instructions by simulating them in a loop: +	<pre> +	for (;;) { +		read_instruction(); +		switch (decode_instruction_opcode()) { +		case OPCODE_ADD: { +			int src = decode_src_reg(); +			int dst = decode_dst_reg(); +			regs[dst] = regs[dst] + regs[src]; +			break; +		} +		case OPCODE_SUB: { +			int src = decode_src_reg(); +			int dst = decode_dst_reg(); +			regs[dst] = regs[dst] - regs[src]; +			break; +		} +		... 
+		} +		eip += instruction_length; +	} +	</pre> + +<li>	Simulate PC's physical memory map +	by decoding emulated "physical" addresses just like a PC would: +	<pre> +	#define KB		1024 +	#define MB		(1024*1024) + +	#define LOW_MEMORY	(640*KB) +	#define EXT_MEMORY	(10*MB) + +	uint8_t low_mem[LOW_MEMORY]; +	uint8_t ext_mem[EXT_MEMORY]; +	uint8_t bios_rom[64*KB]; + +	uint8_t read_byte(uint32_t phys_addr) { +		if (phys_addr < LOW_MEMORY) +			return low_mem[phys_addr]; +		else if (phys_addr >= 960*KB && phys_addr < 1*MB) +			return bios_rom[phys_addr - 960*KB]; +		else if (phys_addr >= 1*MB && phys_addr < 1*MB+EXT_MEMORY) +			return ext_mem[phys_addr-1*MB]; +		else ... +	} + +	void write_byte(uint32_t phys_addr, uint8_t val) { +		if (phys_addr < LOW_MEMORY) +			low_mem[phys_addr] = val; +		else if (phys_addr >= 960*KB && phys_addr < 1*MB) +			; /* ignore attempted write to ROM! */ +		else if (phys_addr >= 1*MB && phys_addr < 1*MB+EXT_MEMORY) +			ext_mem[phys_addr-1*MB] = val; +		else ... +	} +	</pre> +<li>	Simulate I/O devices, etc., by detecting accesses to +	"special" memory and I/O space and emulating the correct behavior: +	e.g., +	<ul> +	<li>	Reads/writes to emulated hard disk +		transformed into reads/writes of a file on the host system +	<li>	Writes to emulated VGA display hardware +		transformed into drawing into an X window +	<li>	Reads from emulated PC keyboard +		transformed into reads from X input event queue +	</ul> +</ul> | 
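The last point of the notes, routing accesses to "special" memory into device emulation, can be sketched in C roughly as follows. This is an illustrative sketch only, not code from the commit. The VGA window bounds (640KB to 768KB) are taken from the physical memory map in the notes; the names `vga_write`, `vga_mem`, and `vga_redraws`, and the choice to return 0xFF for unmapped reads, are assumptions invented for the example. Where a real emulator would draw into a host window, the sketch just bumps a counter.

```c
#include <assert.h>
#include <stdint.h>

#define KB        1024u
#define VGA_BASE  (640u * KB)   /* 0x000A0000, per the memory map in the notes */
#define VGA_END   (768u * KB)   /* 0x000C0000 */

uint8_t low_mem[640u * KB];            /* ordinary RAM below 640KB */
uint8_t vga_mem[VGA_END - VGA_BASE];   /* backing store for the VGA window */
unsigned vga_redraws;                  /* hypothetical stand-in for "redraw the window" */

/* Hypothetical device hook: the write has a side effect beyond storing a byte. */
void vga_write(uint32_t offset, uint8_t val) {
    vga_mem[offset] = val;
    vga_redraws++;   /* a real emulator would update the host display here */
}

/* Decode the emulated physical address and route it to RAM or to the device. */
void write_byte(uint32_t phys_addr, uint8_t val) {
    if (phys_addr < VGA_BASE)
        low_mem[phys_addr] = val;
    else if (phys_addr < VGA_END)
        vga_write(phys_addr - VGA_BASE, val);
    /* all other regions are omitted from this sketch */
}

uint8_t read_byte(uint32_t phys_addr) {
    if (phys_addr < VGA_BASE)
        return low_mem[phys_addr];
    if (phys_addr < VGA_END)
        return vga_mem[phys_addr - VGA_BASE];
    return 0xFF;     /* assumption: unmapped reads return all ones */
}
```

With these definitions, `write_byte(0x1000, 'x')` is a plain RAM store, while `write_byte(0xB8000, 'A')` falls inside the VGA window and also increments `vga_redraws`. The emulated software uses the same store in both cases; only the emulator's address decoding differs.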
