Diffstat (limited to 'web/l2.html')
-rw-r--r-- | web/l2.html | 494 |
1 files changed, 0 insertions, 494 deletions
diff --git a/web/l2.html b/web/l2.html deleted file mode 100644 index e183d5a..0000000 --- a/web/l2.html +++ /dev/null @@ -1,494 +0,0 @@ -<html> -<head> -<title>L2</title> -</head> -<body> - -<h1>6.828 Lecture Notes: x86 and PC architecture</h1> - -<h2>Outline</h2> -<ul> -<li>PC architecture -<li>x86 instruction set -<li>gcc calling conventions -<li>PC emulation -</ul> - -<h2>PC architecture</h2> - -<ul> -<li>A full PC has: - <ul> - <li>an x86 CPU with registers, execution unit, and memory management - <li>CPU chip pins include address and data signals - <li>memory - <li>disk - <li>keyboard - <li>display - <li>other resources: BIOS ROM, clock, ... - </ul> - -<li>We will start with the original 16-bit 8086 CPU (1978) -<li>CPU runs instructions: -<pre> -for(;;){ - run next instruction -} -</pre> - -<li>Needs work space: registers - <ul> - <li>four 16-bit data registers: AX, CX, DX, BX - <li>each in two 8-bit halves, e.g. AH and AL - <li>very fast, very few - </ul> -<li>More work space: memory - <ul> - <li>CPU sends out address on address lines (wires, one bit per wire) - <li>Data comes back on data lines - <li><i>or</i> data is written to data lines - </ul> - -<li>Add address registers: pointers into memory - <ul> - <li>SP - stack pointer - <li>BP - frame base pointer - <li>SI - source index - <li>DI - destination index - </ul> - -<li>Instructions are in memory too! - <ul> - <li>IP - instruction pointer (PC on PDP-11, everything else) - <li>increment after running each instruction - <li>can be modified by CALL, RET, JMP, conditional jumps - </ul> - -<li>Want conditional jumps - <ul> - <li>FLAGS - various condition codes - <ul> - <li>whether last arithmetic operation overflowed - <li> ... was positive/negative - <li> ... was [not] zero - <li> ... carry/borrow on add/subtract - <li> ... overflow - <li> ... etc. - <li>whether interrupts are enabled - <li>direction of data copy instructions - </ul> - <li>JP, JN, J[N]Z, J[N]C, J[N]O ... 
- </ul> - -<li>Still not interesting - need I/O to interact with outside world - <ul> - <li>Original PC architecture: use dedicated <i>I/O space</i> - <ul> - <li>Works same as memory accesses but set I/O signal - <li>Only 1024 I/O addresses - <li>Example: write a byte to line printer: -<pre> -#define DATA_PORT 0x378 -#define STATUS_PORT 0x379 -#define BUSY 0x80 -#define CONTROL_PORT 0x37A -#define STROBE 0x01 -void -lpt_putc(int c) -{ - /* wait for printer to consume previous byte */ - while((inb(STATUS_PORT) & BUSY) == 0) - ; - - /* put the byte on the parallel lines */ - outb(DATA_PORT, c); - - /* tell the printer to look at the data */ - outb(CONTROL_PORT, STROBE); - outb(CONTROL_PORT, 0); -} -</pre> - </ul> - -<li>Memory-Mapped I/O - <ul> - <li>Use normal physical memory addresses - <ul> - <li>Gets around limited size of I/O address space - <li>No need for special instructions - <li>System controller routes to appropriate device - </ul> - <li>Works like ``magic'' memory: - <ul> - <li> <i>Addressed</i> and <i>accessed</i> like memory, - but ... - <li> ... does not <i>behave</i> like memory! - <li> Reads and writes can have ``side effects'' - <li> Read results can change due to external events - </ul> - </ul> -</ul> - - -<li>What if we want to use more than 2^16 bytes of memory? 
- <ul> - <li>8086 has 20-bit physical addresses, can have 1 Meg RAM - <li>each segment is a 2^16 byte window into physical memory - <li>virtual to physical translation: pa = va + seg*16 - <li>the segment is usually implicit, from a segment register - <li>CS - code segment (for fetches via IP) - <li>SS - stack segment (for load/store via SP and BP) - <li>DS - data segment (for load/store via other registers) - <li>ES - another data segment (destination for string operations) - <li>tricky: can't use the 16-bit address of a stack variable as a pointer - <li>but a <i>far pointer</i> includes full segment:offset (16 + 16 bits) - </ul> - -<li>But 8086's 16-bit addresses and data were still painfully small - <ul> - <li>80386 added support for 32-bit data and addresses (1985) - <li>boots in 16-bit mode, boot.S switches to 32-bit mode - <li>registers are 32 bits wide, called EAX rather than AX - <li>operands and addresses are also 32 bits, e.g. ADD does 32-bit arithmetic - <li>prefix 0x66 gets you 16-bit mode: MOVW is really 0x66 MOVW - <li>the .code32 in boot.S tells assembler to generate 0x66 for e.g. MOVW - <li>80386 also changed segments and added paged memory... 
- </ul> - -</ul> - -<h2>x86 Physical Memory Map</h2> - -<ul> -<li>The physical address space mostly looks like ordinary RAM -<li>Except some low-memory addresses actually refer to other things -<li>Writes to VGA memory appear on the screen -<li>Reset or power-on jumps to ROM at 0x000ffff0 -</ul> - -<pre> -+------------------+ <- 0xFFFFFFFF (4GB) -| 32-bit | -| memory mapped | -| devices | -| | -/\/\/\/\/\/\/\/\/\/\ - -/\/\/\/\/\/\/\/\/\/\ -| | -| Unused | -| | -+------------------+ <- depends on amount of RAM -| | -| | -| Extended Memory | -| | -| | -+------------------+ <- 0x00100000 (1MB) -| BIOS ROM | -+------------------+ <- 0x000F0000 (960KB) -| 16-bit devices, | -| expansion ROMs | -+------------------+ <- 0x000C0000 (768KB) -| VGA Display | -+------------------+ <- 0x000A0000 (640KB) -| | -| Low Memory | -| | -+------------------+ <- 0x00000000 -</pre> - -<h2>x86 Instruction Set</h2> - -<ul> -<li>Two-operand instruction set - <ul> - <li>Intel syntax: <tt>op dst, src</tt> - <li>AT&T (gcc/gas) syntax: <tt>op src, dst</tt> - <ul> - <li>uses b, w, l suffix on instructions to specify size of operands - </ul> - <li>Operands are registers, constant, memory via register, memory via constant - <li> Examples: - <table cellspacing=5> - <tr><td><u>AT&T syntax</u> <td><u>"C"-ish equivalent</u> - <tr><td>movl %eax, %edx <td>edx = eax; <td><i>register mode</i> - <tr><td>movl $0x123, %edx <td>edx = 0x123; <td><i>immediate</i> - <tr><td>movl 0x123, %edx <td>edx = *(int32_t*)0x123; <td><i>direct</i> - <tr><td>movl (%ebx), %edx <td>edx = *(int32_t*)ebx; <td><i>indirect</i> - <tr><td>movl 4(%ebx), %edx <td>edx = *(int32_t*)(ebx+4); <td><i>displaced</i> - </table> - </ul> - -<li>Instruction classes - <ul> - <li>data movement: MOV, PUSH, POP, ... - <li>arithmetic: TEST, SHL, ADD, AND, ... - <li>i/o: IN, OUT, ... - <li>control: JMP, JZ, JNZ, CALL, RET - <li>string: REP MOVSB, ... 
- <li>system: IRET, INT - </ul> - -<li>Intel architecture manual Volume 2 is <i>the</i> reference - -</ul> - -<h2>gcc x86 calling conventions</h2> - -<ul> -<li>x86 dictates that stack grows down: - <table cellspacing=5> - <tr><td><u>Example instruction</u> <td><u>What it does</u> - <tr><td>pushl %eax - <td> - subl $4, %esp <br> - movl %eax, (%esp) <br> - <tr><td>popl %eax - <td> - movl (%esp), %eax <br> - addl $4, %esp <br> - <tr><td>call $0x12345 - <td> - pushl %eip <sup>(*)</sup> <br> - movl $0x12345, %eip <sup>(*)</sup> <br> - <tr><td>ret - <td> - popl %eip <sup>(*)</sup> - </table> - (*) <i>Not real instructions</i> - -<li>GCC dictates how the stack is used. - Contract between caller and callee on x86: - <ul> - <li>after call instruction: - <ul> - <li>%eip points at first instruction of function - <li>%esp+4 points at first argument - <li>%esp points at return address - </ul> - <li>after ret instruction: - <ul> - <li>%eip contains return address - <li>%esp points at arguments pushed by caller - <li>called function may have trashed arguments - <li>%eax contains return value - (or trash if function is <tt>void</tt>) - <li>%ecx, %edx may be trashed - <li>%ebp, %ebx, %esi, %edi must contain contents from time of <tt>call</tt> - </ul> - <li>Terminology: - <ul> - <li>%eax, %ecx, %edx are "caller save" registers - <li>%ebp, %ebx, %esi, %edi are "callee save" registers - </ul> - </ul> - -<li>Functions can do anything that doesn't violate contract. - By convention, GCC does more: - <ul> - <li>each function has a stack frame marked by %ebp, %esp - <pre> - +------------+ | - | arg 2 | \ - +------------+ >- previous function's stack frame - | arg 1 | / - +------------+ | - | ret %eip | / - +============+ - | saved %ebp | \ - %ebp-> +------------+ | - | | | - | local | \ - | variables, | >- current function's stack frame - | etc. 
| / - | | | - | | | - %esp-> +------------+ / - </pre> - <li>%esp can move to make stack frame bigger, smaller - <li>%ebp points at saved %ebp from previous function, - chain to walk stack - <li>function prologue: - <pre> - pushl %ebp - movl %esp, %ebp - </pre> - <li>function epilogue: - <pre> - movl %ebp, %esp - popl %ebp - </pre> - or - <pre> - leave - </pre> - </ul> - -<li>Big example: - <ul> - <li>C code - <pre> - int main(void) { return f(8)+1; } - int f(int x) { return g(x); } - int g(int x) { return x+3; } - </pre> - <li>assembler - <pre> - _main: - <i>prologue</i> - pushl %ebp - movl %esp, %ebp - <i>body</i> - pushl $8 - call _f - addl $1, %eax - <i>epilogue</i> - movl %ebp, %esp - popl %ebp - ret - _f: - <i>prologue</i> - pushl %ebp - movl %esp, %ebp - <i>body</i> - pushl 8(%esp) - call _g - <i>epilogue</i> - movl %ebp, %esp - popl %ebp - ret - - _g: - <i>prologue</i> - pushl %ebp - movl %esp, %ebp - <i>save %ebx</i> - pushl %ebx - <i>body</i> - movl 8(%ebp), %ebx - addl $3, %ebx - movl %ebx, %eax - <i>restore %ebx</i> - popl %ebx - <i>epilogue</i> - movl %ebp, %esp - popl %ebp - ret - </pre> - </ul> - -<li>Super-small <tt>_g</tt>: - <pre> - _g: - movl 4(%esp), %eax - addl $3, %eax - ret - </pre> - -<li>Compiling, linking, loading: - <ul> - <li> <i>Compiler</i> takes C source code (ASCII text), - produces assembly language (also ASCII text) - <li> <i>Assembler</i> takes assembly language (ASCII text), - produces <tt>.o</tt> file (binary, machine-readable!) - <li> <i>Linker</i> takes multiple '<tt>.o</tt>'s, - produces a single <i>program image</i> (binary) - <li> <i>Loader</i> loads the program image into memory - at run-time and starts it executing - </ul> -</ul> - - -<h2>PC emulation</h2> - -<ul> -<li> Emulator like Bochs works by - <ul> - <li> doing exactly what a real PC would do, - <li> only implemented in software rather than hardware! 
- </ul> -<li> Runs as a normal process in a "host" operating system (e.g., Linux) -<li> Uses normal process storage to hold emulated hardware state: - e.g., - <ul> - <li> Hold emulated CPU registers in global variables - <pre> - int32_t regs[8]; - #define REG_EAX 1 - #define REG_EBX 2 - #define REG_ECX 3 - ... - int32_t eip; - int16_t segregs[4]; - ... - </pre> - <li> <tt>malloc</tt> a big chunk of (virtual) process memory - to hold emulated PC's (physical) memory - </ul> -<li> Execute instructions by simulating them in a loop: - <pre> - for (;;) { - read_instruction(); - switch (decode_instruction_opcode()) { - case OPCODE_ADD: { - int src = decode_src_reg(); - int dst = decode_dst_reg(); - regs[dst] = regs[dst] + regs[src]; - break; - } - case OPCODE_SUB: { - int src = decode_src_reg(); - int dst = decode_dst_reg(); - regs[dst] = regs[dst] - regs[src]; - break; - } - ... - } - eip += instruction_length; - } - </pre> - -<li> Simulate PC's physical memory map - by decoding emulated "physical" addresses just like a PC would: - <pre> - #define KB 1024 - #define MB (1024*1024) - - #define LOW_MEMORY (640*KB) - #define EXT_MEMORY (10*MB) - - uint8_t low_mem[LOW_MEMORY]; - uint8_t ext_mem[EXT_MEMORY]; - uint8_t bios_rom[64*KB]; - - uint8_t read_byte(uint32_t phys_addr) { - if (phys_addr < LOW_MEMORY) - return low_mem[phys_addr]; - else if (phys_addr >= 960*KB && phys_addr < 1*MB) - return bios_rom[phys_addr - 960*KB]; - else if (phys_addr >= 1*MB && phys_addr < 1*MB+EXT_MEMORY) - return ext_mem[phys_addr-1*MB]; - else ... - } - - void write_byte(uint32_t phys_addr, uint8_t val) { - if (phys_addr < LOW_MEMORY) - low_mem[phys_addr] = val; - else if (phys_addr >= 960*KB && phys_addr < 1*MB) - ; /* ignore attempted write to ROM! */ - else if (phys_addr >= 1*MB && phys_addr < 1*MB+EXT_MEMORY) - ext_mem[phys_addr-1*MB] = val; - else ... 
- } - </pre> -<li> Simulate I/O devices, etc., by detecting accesses to - "special" memory and I/O space and emulating the correct behavior: - e.g., - <ul> - <li> Reads/writes to emulated hard disk - transformed into reads/writes of a file on the host system - <li> Writes to emulated VGA display hardware - transformed into drawing into an X window - <li> Reads from emulated PC keyboard - transformed into reads from X input event queue - </ul> -</ul> |
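Editor's sketch, not part of the deleted file: the segment translation rule quoted in the notes above (pa = va + seg*16) can be checked with a few lines of C. The helper names real_mode_pa and same_location are hypothetical and were introduced only for this illustration. The mask to 20 bits reflects the 8086's 20 address lines mentioned in the notes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the notes): 8086 real-mode translation.
 * Computes pa = seg*16 + off and keeps only the low 20 bits, because the
 * 8086 drives just 20 address lines and the sum wraps past 1 MB. */
uint32_t real_mode_pa(uint16_t seg, uint16_t off)
{
    uint32_t pa = ((uint32_t)seg << 4) + off;
    return pa & 0xFFFFF;
}

/* Hypothetical helper: two far pointers (segment:offset pairs) refer to
 * the same byte exactly when they translate to the same physical address. */
int same_location(uint16_t seg_a, uint16_t off_a,
                  uint16_t seg_b, uint16_t off_b)
{
    return real_mode_pa(seg_a, off_a) == real_mode_pa(seg_b, off_b);
}
```

For instance, 0x07C0:0x0000 and 0x0000:0x7C00 both translate to physical address 0x7C00, and 0xF000:0xFFF0 gives 0xFFFF0, the reset address shown in the memory map. This aliasing is why a bare 16-bit offset does not identify a location without its segment.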