6.828 Lab 1: Booting a PC (Report)

Main Text

Preface

Toolchain

Ubuntu is the most convenient choice:

$ sudo apt install -y gcc binutils
$ sudo apt install -y build-essential gdb
$ sudo apt install -y gcc-multilib
$ sudo apt install -y qemu

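Software Setup

First comes setting up git and learning its basic usage.
Then there is the project's Makefile. make handin submits the lab, but I am not an MIT student; make grade runs the grading tests.

System Environment

I originally ran this on a host with Ubuntu 18.04, but because of word-size (32-bit vs. 64-bit) issues I switched to a 32-bit Ubuntu 16.04 virtual machine on Windows 10, connected over ssh.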

Part 1: PC Bootstrap

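x86 Assembly

The assembly background was already covered in the post before last. Because 6.828 uses gcc throughout, the assembly is naturally in AT&T syntax, unlike NASM.

Simulating the x86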

make

Output:

+ as kern/entry.S
+ cc kern/entrypgdir.c
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
+ ld obj/kern/kernel
ld: warning: section `.bss' type changed to PROGBITS
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 390 bytes (max 510)
+ mk obj/kern/kernel.img

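The linker warns about the .bss section, but it is not a real problem.

make qemu or make qemu-nox runs the kernel inside QEMU.
The system comes with two built-in commands, help and kerninfo. Press Ctrl-A, then X, to exit QEMU.

The PC's Physical Address Space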

+------------------+  <- 0xFFFFFFFF (4GB)
|      32-bit      |
|  memory mapped   |
|     devices      |
|                  |
/\/\/\/\/\/\/\/\/\/\

/\/\/\/\/\/\/\/\/\/\
|                  |
|      Unused      |
|                  |
+------------------+  <- depends on amount of RAM
|                  |
|                  |
| Extended Memory  |
|                  |
|                  |
+------------------+  <- 0x00100000 (1MB)
|     BIOS ROM     |
+------------------+  <- 0x000F0000 (960KB)
|  16-bit devices, |
|  expansion ROMs  |
+------------------+  <- 0x000C0000 (768KB)
|   VGA Display    |
+------------------+  <- 0x000A0000 (640KB)
|                  |
|    Low Memory    |
|                  |
+------------------+  <- 0x00000000

Note that JOS is limited to the first 256MB of the PC's physical memory.

The ROM BIOS

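Run make qemu-gdb or make qemu-nox-gdb in one terminal; QEMU stops just before the processor executes its first instruction. Then run make gdb in a second terminal to debug the system running inside QEMU.

Tracing from there, the first instruction is: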

[f000:fff0]    0xffff0:	ljmp   $0xf000,$0xe05b

The processor starts at CS = 0xf000, IP = 0xfff0 (physical address 0xffff0); this far jump then sets CS = 0xf000, IP = 0xe05b.
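
To make the [f000:fff0] notation concrete, here is a small sketch of the real-mode address calculation: the segment is shifted left by 4 bits and the offset is added. The function name real_mode_address is my own and does not come from the lab code.

```c
#include <stdint.h>

/* Real-mode physical address: segment * 16 + offset.
 * The name is hypothetical; the lab code has no such helper. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With this, real_mode_address(0xf000, 0xfff0) gives 0xffff0, the address gdb prints next to the first instruction, and the jump target f000:e05b works out to 0xfe05b.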

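When the BIOS runs, it sets up an interrupt descriptor table and initializes various devices.
After initializing the PCI bus and the important devices, it searches the disks for a bootable device so that it can start the boot loader.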

Part 2: The Boot Loader

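PC floppy and hard disks are divided into 512-byte sectors; a sector is the disk's minimum transfer granularity. The first sector of a bootable disk is called the boot sector and holds the boot loader code. When the BIOS finds a bootable disk, it loads the boot sector into physical memory at 0x7c00~0x7dff (512 bytes) and then uses a jmp to transfer control to 0000:7c00.
CD-ROMs use 2048-byte sectors, so the boot loader can be larger and the BIOS reads more accordingly. See the “El Torito” Bootable CD-ROM Format Specification; skipped here.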
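
The build output earlier says the boot block may be at most 510 bytes even though a sector holds 512. As far as I know, the remaining two bytes hold the 0x55 0xAA signature that the BIOS looks for at the end of a boot sector. The post itself does not cover this, so treat the following as a minimal sketch; has_boot_signature is a hypothetical helper name.

```c
#include <stdint.h>

#define BOOT_SECTOR_SIZE 512

/* Returns 1 if the last two bytes of the sector are 0x55, 0xAA.
 * Hypothetical helper, not part of the lab sources. */
int has_boot_signature(const uint8_t sector[BOOT_SECTOR_SIZE])
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```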

Next come two source files, boot/boot.S and boot/main.c. According to 叔's article, both of them must be understood completely.

boot/boot.S

#include <inc/mmu.h>             # MMU (Memory Management Unit) definitions

# Start the CPU: switch to 32-bit protected mode, jump into C.
# The BIOS loads this code from the first sector of the hard disk into
# memory at physical address 0x7c00 and starts executing in real mode
# with %cs=0 %ip=7c00.
# Note: this code is the boot loader itself; it starts executing
# in real mode at 0000:7c00.

.set PROT_MODE_CSEG, 0x8         # kernel code segment selector
.set PROT_MODE_DSEG, 0x10        # kernel data segment selector
.set CR0_PE_ON,      0x1         # protected mode enable flag in CR0 (see Exercise 2)

.globl start
start:
  .code16                     # Assemble for 16-bit mode
  cli                         # Disable interrupts
  cld                         # String operations increment (DF = 0)

  # Set up the important data segment registers (DS, ES, SS).
  xorw    %ax,%ax             # ax = 0
  movw    %ax,%ds             # ds = 0
  movw    %ax,%es             # es = 0
  movw    %ax,%ss             # ss = 0

  # Enable A20:
  #   For backwards compatibility with the earliest PCs, physical
  #   address line 20 is tied low, so that addresses higher than
  #   1MB wrap around to zero by default.  This code undoes this.
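  # Enabling A20:
  #   This also came up in Exercise 2.
  #   Address line 20 is held low by default for backward compatibility;
  #   the code below turns it on through the keyboard controller.
  #
# seta20.1 waits for the keyboard controller, then sends command 0xd1 to port 0x64;
# that command announces a write to the controller's output port (P2).
seta20.1:
  inb     $0x64,%al               # Wait for not busy
  # testb sets ZF when (al & 0x2) == 0,
  # i.e. when bit 1 (input buffer full) is clear.
  testb   $0x2,%al
  # jnz jumps when ZF == 0, i.e. while bit 1 is still set,
  # so this loops until the input buffer is empty.
  # Reading port 0x64 returns the keyboard controller's status register.
  jnz     seta20.1

  # Write 0xd1 to port 0x64.
  # Port 0x60 is the data port of the PS/2 controller.
  # Port 0x64 is the status register when read and the command register when written.
  # Command 0xd1 tells the controller that the next byte written to
  # port 0x60 goes to its output port.
  movb    $0xd1,%al               # 0xd1 -> port 0x64
  outb    %al,$0x64

seta20.2:
  # Same as above: wait until the input buffer is empty.
  inb     $0x64,%al               # Wait for not busy
  testb   $0x2,%al
  jnz     seta20.2

  # Write 0xdf to port 0x60, enabling the A20 address line.
  # Exercise 2 describes this as setting bit 1 of the output port;
  # 0xdf is binary 11011111, which has bit 1 set, so the two agree.
  movb    $0xdf,%al               # 0xdf -> port 0x60
  outb    %al,$0x60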

  # Switch from real to protected mode, using a bootstrap GDT
  # and segment translation that makes virtual addresses 
  # identical to their physical addresses, so that the 
  # effective memory map does not change during the switch.
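  # Preparing the switch from real mode to protected mode:
  # load gdtdesc (at the end of this file) into GDTR.
  # The GDT is the global descriptor table; GDTR is the global descriptor table register.
  # Addressing memory in protected mode requires a GDT. Each GDT entry is a segment descriptor
  # that records the attributes of one memory segment and occupies 8 bytes.
  # The CPU uses GDTR to hold the GDT's location in memory and its length,
  # which is exactly what gdtdesc contains.
  lgdt    gdtdesc
  # Set bit 0 of CR0 to enter protected mode.
  movl    %cr0, %eax
  orl     $CR0_PE_ON, %eax
  movl    %eax, %cr0

  # Jump to next instruction, but in 32-bit code segment.
  # Switches processor into 32-bit mode.
  # Far jump to protcseg; $PROT_MODE_CSEG is the code segment selector defined with .set above.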
  ljmp    $PROT_MODE_CSEG, $protcseg

  .code32                     # Assemble for 32-bit mode
protcseg:
  # Set up the protected-mode data segment registers
  # Load the protected-mode data segment registers, much as in Exercise 2.
  movw    $PROT_MODE_DSEG, %ax    # Our data segment selector
  movw    %ax, %ds                # -> DS: Data Segment
  movw    %ax, %es                # -> ES: Extra Segment
  movw    %ax, %fs                # -> FS
  movw    %ax, %gs                # -> GS
  movw    %ax, %ss                # -> SS: Stack Segment

  # Set up the stack pointer and call into C.
  # Put the stack top at $start (0x7c00).
  # The stack grows toward lower addresses, so it does not touch the code above it.
  movl    $start, %esp
  # Call the C function bootmain.
  call bootmain

  # If bootmain returns (it shouldn't), loop.
  # Spin forever.
spin:
  jmp spin

# Bootstrap GDT
.p2align 2                                # force 4 byte alignment
# The bootstrap GDT: a null entry plus flat code and data segments
gdt:
  SEG_NULL                              # null seg
  SEG(STA_X|STA_R, 0x0, 0xffffffff)     # code seg
  SEG(STA_W, 0x0, 0xffffffff)           # data seg

gdtdesc:
  .word   0x17                            # sizeof(gdt) - 1
  .long   gdt                             # address gdt
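
To see what the three GDT entries and the 0x17 in gdtdesc amount to, here is a C sketch that packs a descriptor the way I understand the SEG macro in inc/mmu.h to do it. The STA_* values are quoted from memory of that header, and make_segment_descriptor and selector_for_index are hypothetical names, so treat this as an illustration rather than the lab's own code.

```c
#include <stdint.h>

/* Segment type bits, as I recall them from inc/mmu.h (assumption). */
#define STA_X 0x8   /* executable segment */
#define STA_W 0x2   /* writeable (data segments) */
#define STA_R 0x2   /* readable (code segments) */

/* Pack base, limit and type into an 8-byte segment descriptor. */
uint64_t make_segment_descriptor(uint8_t type, uint32_t base, uint32_t limit)
{
    uint64_t d = 0;
    d |= (uint64_t)((limit >> 12) & 0xffff);              /* limit 15:0, in 4KB units */
    d |= (uint64_t)(base & 0xffff) << 16;                 /* base 15:0 */
    d |= (uint64_t)((base >> 16) & 0xff) << 32;           /* base 23:16 */
    d |= (uint64_t)(0x90 | type) << 40;                   /* present, DPL 0, code/data */
    d |= (uint64_t)(0xC0 | ((limit >> 28) & 0xf)) << 48;  /* 4KB granularity, 32-bit, limit 19:16 */
    d |= (uint64_t)((base >> 24) & 0xff) << 56;           /* base 31:24 */
    return d;
}

/* A selector for ring 0 in the GDT is the entry index times 8. */
uint16_t selector_for_index(unsigned index)
{
    return (uint16_t)(index << 3);
}
```

Entry 1 (code) and entry 2 (data) give the selectors 0x8 and 0x10 used for PROT_MODE_CSEG and PROT_MODE_DSEG, and three 8-byte entries give a table limit of 3 * 8 - 1 = 0x17.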

boot/main.c

#include <inc/x86.h>
#include <inc/elf.h>

/**********************************************************************
 * This is a dirt simple boot loader, whose sole job is to boot
 * an ELF kernel image from the first IDE hard disk.
 *
 * DISK LAYOUT
 *  * This program (boot.S and main.c) is the bootloader.  It should
 *    be stored in the first sector of the disk.
 *
 *  * The 2nd sector onward holds the kernel image.
 *
 *  * The kernel image must be in ELF format.
 *
 * BOOT UP STEPS
 *  * when the CPU boots it loads the BIOS into memory and executes it
 *
 *  * the BIOS initializes devices, sets up the interrupt routines, and
 *    reads the first sector of the boot device (e.g., hard-drive)
 *    into memory and jumps to it.
 *
 *  * Assuming this boot loader is stored in the first sector of the
 *    hard-drive, this code takes over...
 *
 *  * control starts in boot.S -- which sets up protected mode,
 *    and a stack so C code can then run, then calls bootmain()
 *
 *  * bootmain() in this file takes over, reads in the kernel and jumps to it.
 **********************************************************************/

/**********************************************************************
This comment block is lifted straight from 叔's article [2].
// The definition of struct Elf.
struct Elf {
    uint32_t e_magic;       // must equal ELF_MAGIC: the four bytes "\x7FELF", used to check that this is an ELF header
    uint8_t  e_elf[12];     // remaining identification bytes (class, byte order, version, padding) that say how to decode the file
    uint16_t e_type;        // type of this file
    uint16_t e_machine;     // architecture this file requires
    uint32_t e_version;     // file version
    uint32_t e_entry;       // entry point address of the program
    uint32_t e_phoff;       // offset of the program header table in the file (in bytes)
    uint32_t e_shoff;       // offset of the section header table in the file (in bytes)
    uint32_t e_flags;       // 0 on IA-32
    uint16_t e_ehsize;      // size of the ELF header
    uint16_t e_phentsize;   // size of one program header table entry
    uint16_t e_phnum;       // number of program header table entries
    uint16_t e_shentsize;   // size of one section header table entry
    uint16_t e_shnum;       // number of section header table entries
    uint16_t e_shstrndx;    // index of the section that holds the section name strings
};

// The definition of struct Proghdr.
struct Proghdr {
    uint32_t p_type;        // type of this segment
    uint32_t p_offset;      // offset of the segment's first byte in the file
    uint32_t p_va;          // virtual address of the segment's first byte in memory
    uint32_t p_pa;          // physical address of the segment's first byte in memory, used on systems where physical addressing matters
    uint32_t p_filesz;      // size of the segment in the file
    uint32_t p_memsz;       // size of the segment in memory
    uint32_t p_flags;       // flags associated with the segment
    uint32_t p_align;       // how the segment is aligned in the file and in memory
};
 **********************************************************************/

// sector size
#define SECTSIZE        512
// The ELF header is read into scratch space at 0x10000.
#define ELFHDR          ((struct Elf *) 0x10000) // scratch space

void readsect(void*, uint32_t);
void readseg(uint32_t, uint32_t, uint32_t);

void
bootmain(void)
{
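        // ph walks the program header table, pointing at the current program header.
        // eph marks the end of the program header table.
        struct Proghdr *ph, *eph;

        // read 1st page off disk
        // Read 8 sectors (8 * 512 = 4096 bytes), starting at offset 0 of the kernel image, into ELFHDR.
        readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);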

        // is this a valid ELF?
        // Check the magic number to confirm this is an ELF image.
        if (ELFHDR->e_magic != ELF_MAGIC)
                goto bad;

        // load each program segment (ignores ph flags)
        // The program header table sits at ELFHDR + e_phoff.
        ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
        // eph points one past the last entry of the program header table.
        eph = ph + ELFHDR->e_phnum;
        // Load the segment described by each program header into memory.
        for (; ph < eph; ph++)
                // p_pa is the load address of this segment (as well
                // as the physical address)
                // Read p_memsz bytes from file offset p_offset to p_pa.
                readseg(ph->p_pa, ph->p_memsz, ph->p_offset);

        // call the entry point from the ELF header
        // note: does not return!
        // Jump to the kernel's entry point.
        ((void (*)(void)) (ELFHDR->e_entry))();

bad:
        // outw(port, data): write 0x8A00, then 0x8E00, to port 0x8A00.
        // This turns on I/O debugging;
        // the state can be inspected in the Bochs debugger.
        outw(0x8A00, 0x8A00);
        outw(0x8A00, 0x8E00);
        while (1)
                /* do nothing */;
}

// Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
// Might copy more than asked
// Here 'offset' is a byte offset from the start of the kernel image on disk.
void
readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
        uint32_t end_pa;

        end_pa = pa + count;

        // round down to sector boundary
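        // Clear the low bits so that pa sits on a sector boundary.
        // With SECTSIZE = 512, ~(SECTSIZE - 1) has its low 9 bits clear and all higher bits set,
        // so the AND zeroes the low 9 bits of pa.
        pa &= ~(SECTSIZE - 1);

        // translate from bytes to sectors, and kernel starts at sector 1
        // Convert offset from bytes to sectors; the kernel starts at sector 1, hence the + 1.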
        offset = (offset / SECTSIZE) + 1;

        // If this is too slow, we could read lots of sectors at a time.
        // We'd write more to memory than asked, but it doesn't matter --
        // we load in increasing order.
        while (pa < end_pa) {
                // Since we haven't enabled paging yet and we're using
                // an identity segment mapping (see boot.S), we can
                // use physical addresses directly.  This won't be the
                // case once JOS enables the MMU.
                // Read sector number 'offset' from the disk into physical address pa.
                readsect((uint8_t*) pa, offset);
                pa += SECTSIZE;
                offset++;
        }
}

void
waitdisk(void)
{
        // wait for disk ready
        // Poll the status port 0x1F7 until the disk is idle:
        // bit 7 (0x80, busy) must be clear and bit 6 (0x40, ready) must be set.
        while ((inb(0x1F7) & 0xC0) != 0x40)
                /* do nothing */;
}

// Read a single sector
void
readsect(void *dst, uint32_t offset)
{
        // wait for disk to be ready
        waitdisk();

        // outb(port, data);
        // writes one byte to an I/O port
        outb(0x1F2, 1);         // count = 1
        // The sector number is wider than one byte, so it is written 8 bits at a time.
        outb(0x1F3, offset);
        outb(0x1F4, offset >> 8);
        outb(0x1F5, offset >> 16);
        outb(0x1F6, (offset >> 24) | 0xE0);
        outb(0x1F7, 0x20);      // cmd 0x20 - read sectors

        // wait for disk to be ready
        waitdisk();

        // read a sector
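        // 0x1F0 is the disk controller's data port (a 16-bit port).
        // Once the disk is idle and ready, data can be read from or written to it repeatedly.
        // insl transfers 32-bit doublewords, so the count is SECTSIZE/4.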
        insl(0x1F0, dst, SECTSIZE/4);
}
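
The arithmetic in readseg, readsect and waitdisk is easy to check on its own. The following sketch pulls it out into plain functions; the helper names and the lba_regs struct are mine and exist only for illustration.

```c
#include <stdint.h>

#define SECTSIZE 512u

/* Round a physical address down to a sector boundary, as readseg does. */
uint32_t round_down_to_sector(uint32_t pa)
{
    return pa & ~(SECTSIZE - 1);
}

/* Byte offset inside the kernel image -> disk sector (kernel starts at sector 1). */
uint32_t kernel_offset_to_sector(uint32_t offset)
{
    return offset / SECTSIZE + 1;
}

/* The values readsect writes to ports 0x1F3..0x1F6 for a given sector number. */
struct lba_regs {
    uint8_t port_1f3, port_1f4, port_1f5, port_1f6;
};

struct lba_regs split_sector_number(uint32_t sector)
{
    struct lba_regs r;
    r.port_1f3 = (uint8_t)sector;
    r.port_1f4 = (uint8_t)(sector >> 8);
    r.port_1f5 = (uint8_t)(sector >> 16);
    r.port_1f6 = (uint8_t)((sector >> 24) | 0xE0);
    return r;
}

/* The condition waitdisk polls for: busy bit clear, ready bit set. */
int disk_is_ready(uint8_t status)
{
    return (status & 0xC0) == 0x40;
}
```

For example, the second LOAD segment of the kernel sits at file offset 0x9000, so kernel_offset_to_sector(0x9000) is 0x9000 / 512 + 1 = 73.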

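After Exercise 3, the lab poses several questions:

Honestly, I cannot answer any of them yet, so I am leaving them for now. Since they are required, I will fill them in later.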

Loading the Kernel

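Right, this part opens with Exercise 4.
ELF stands for Executable and Linkable Format. The compiler turns each C source file (*.c) into an object file (*.o) containing assembly instructions encoded in the binary format the hardware expects. Full information about this format is available in the ELF specification on the course reference page. ELF is complicated and is not the point of this course, so I simply refer to the Wikipedia page.

An ELF binary starts with a fixed-length ELF header; its C definition is in inc/elf.h. The program sections we care about are .text, .rodata and .data.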

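When the linker computes a program's memory layout, it reserves space for uninitialized global variables in the .bss section, which immediately follows .data. C requires uninitialized globals to start out as zero, so the contents of .bss need not be stored in the ELF file; the linker records only its address and size, and the loader or the program itself must zero it.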
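
A generic sketch of what that zeroing looks like in a loader: copy the bytes that exist in the file, then clear the rest of the in-memory size. Note that bootmain in this lab does not do this (it simply reads p_memsz bytes from disk), and load_segment is a hypothetical name, so this only illustrates the idea.

```c
#include <stdint.h>
#include <string.h>

/* Copy filesz bytes from the file image and zero the remaining
 * (memsz - filesz) bytes, which is where .bss would live.
 * Hypothetical helper; not the lab's bootmain. */
void load_segment(uint8_t *dst, const uint8_t *src,
                  uint32_t filesz, uint32_t memsz)
{
    memcpy(dst, src, filesz);
    if (memsz > filesz)
        memset(dst + filesz, 0, memsz - filesz);
}
```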
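The lab then asks us to inspect the program's sections with objdump, using the -h option. For each section (.text, for example), VMA is the link address and LMA is the load address. Taking the boot loader file used as the example in the lab: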

$ objdump -h obj/boot/boot.out 

obj/boot/boot.out:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000186  00007c00  00007c00  00000074  2**2
                  CONTENTS, ALLOC, LOAD, CODE
  1 .eh_frame     000000a8  00007d88  00007d88  000001fc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         00000720  00000000  00000000  000002a4  2**2
                  CONTENTS, READONLY, DEBUGGING
  3 .stabstr      0000088f  00000000  00000000  000009c4  2**0
                  CONTENTS, READONLY, DEBUGGING
  4 .comment      00000035  00000000  00000000  00001253  2**0
                  CONTENTS, READONLY

The -x option prints all headers, including the program headers and the symbol table used by the linker.

$ objdump -x obj/kern/kernel

obj/kern/kernel:     file format elf32-i386
obj/kern/kernel
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0010000c

Program Header:
    LOAD off    0x00001000 vaddr 0xf0100000 paddr 0x00100000 align 2**12
         filesz 0x00007120 memsz 0x00007120 flags r-x
    LOAD off    0x00009000 vaddr 0xf0108000 paddr 0x00108000 align 2**12
         filesz 0x0000a948 memsz 0x0000a948 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rwx

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001871  f0100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00000714  f0101880  00101880  00002880  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         000038d1  f0101f94  00101f94  00002f94  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      000018bb  f0105865  00105865  00006865  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data         0000a300  f0108000  00108000  00009000  2**12
                  CONTENTS, ALLOC, LOAD, DATA
  5 .bss          00000648  f0112300  00112300  00013300  2**5
                  CONTENTS, ALLOC, LOAD, DATA
  6 .comment      00000035  00000000  00000000  00013948  2**0
                  CONTENTS, READONLY
SYMBOL TABLE:
f0100000 l    d  .text	00000000 .text
f0101880 l    d  .rodata	00000000 .rodata
f0101f94 l    d  .stab	00000000 .stab
f0105865 l    d  .stabstr	00000000 .stabstr
f0108000 l    d  .data	00000000 .data
f0112300 l    d  .bss	00000000 .bss
00000000 l    d  .comment	00000000 .comment
00000000 l    df *ABS*	00000000 obj/kern/entry.o
f010002f l       .text	00000000 relocated
f010003e l       .text	00000000 spin
00000000 l    df *ABS*	00000000 entrypgdir.c
00000000 l    df *ABS*	00000000 init.c
00000000 l    df *ABS*	00000000 console.c
f0100177 l     F .text	0000001f serial_proc_data
f0100196 l     F .text	00000043 cons_intr
f0112320 l     O .bss	00000208 cons
f01001d9 l     F .text	00000119 kbd_proc_data
f0112300 l     O .bss	00000004 shift.1407
f0101a60 l     O .rodata	00000100 shiftcode
f0101960 l     O .rodata	00000100 togglecode
f0101940 l     O .rodata	00000010 charcode
f01002f2 l     F .text	000001e9 cons_putc
f0112528 l     O .bss	00000002 crt_pos
f011252c l     O .bss	00000004 crt_buf
f0112530 l     O .bss	00000004 addr_6845
f0112534 l     O .bss	00000001 serial_exists
f0112200 l     O .data	00000100 normalmap
f0112100 l     O .data	00000100 shiftmap
f0112000 l     O .data	00000100 ctlmap
00000000 l    df *ABS*	00000000 monitor.c
f0101d44 l     O .rodata	00000018 commands
00000000 l    df *ABS*	00000000 printf.c
f01008bf l     F .text	00000013 putch
00000000 l    df *ABS*	00000000 kdebug.c
f010090c l     F .text	000000f6 stab_binsearch
00000000 l    df *ABS*	00000000 printfmt.c
f0100bd3 l     F .text	000000af printnum
f0100c82 l     F .text	0000001d sprintputch
f0101f68 l     O .rodata	0000001c error_string
00000000 l    df *ABS*	00000000 readline.c
f0112540 l     O .bss	00000400 buf
00000000 l    df *ABS*	00000000 string.c
f010000c g       .text	00000000 entry
f010129c g     F .text	00000020 strcpy
f01004f7 g     F .text	00000012 kbd_intr
f010076e g     F .text	0000000a mon_backtrace
f01000e6 g     F .text	00000057 _panic
f0100094 g     F .text	00000052 i386_init
f010142e g     F .text	00000068 memmove
f0101170 g     F .text	0000001a snprintf
f0100cbc g     F .text	00000466 vprintfmt
f0100509 g     F .text	0000004a cons_getc
f01008f8 g     F .text	00000014 cprintf
f0101496 g     F .text	00000013 memcpy
f010118a g     F .text	000000d9 readline
f0111000 g     O .data	00001000 entry_pgtable
f0100040 g     F .text	00000054 test_backtrace
f0101122 g     F .text	0000004e vsnprintf
f0112300 g       .bss	00000000 edata
f0100553 g     F .text	00000108 cons_init
f0105864 g       .stab	00000000 __STAB_END__
f0105865 g       .stabstr	00000000 __STABSTR_BEGIN__
f0101720 g     F .text	00000151 .hidden __umoddi3
f01004db g     F .text	0000001c serial_intr
f01015f0 g     F .text	00000122 .hidden __udivdi3
f010067c g     F .text	0000000a iscons
f0101505 g     F .text	000000de strtol
f010127b g     F .text	00000021 strnlen
f01012bc g     F .text	00000022 strcat
f0112944 g     O .bss	00000004 panicstr
f0112940 g       .bss	00000000 end
f010013d g     F .text	0000003a _warn
f01013c5 g     F .text	0000001c strfind
f0101871 g       .text	00000000 etext
0010000c g       .text	00000000 _start
f010130b g     F .text	0000003b strlcpy
f010136c g     F .text	00000038 strncmp
f01012de g     F .text	0000002d strncpy
f01014a9 g     F .text	00000039 memcmp
f010065b g     F .text	00000010 cputchar
f01013e1 g     F .text	0000004d memset
f010066b g     F .text	00000011 getchar
f0100c9f g     F .text	0000001d printfmt
f010711f g       .stabstr	00000000 __STABSTR_END__
f0101346 g     F .text	00000026 strcmp
f0100a02 g     F .text	000001d1 debuginfo_eip
f01008d2 g     F .text	00000026 vcprintf
f0110000 g       .data	00000000 bootstacktop
f0110000 g     O .data	00001000 entry_pgdir
f0108000 g       .data	00000000 bootstack
f0101f94 g       .stab	00000000 __STAB_BEGIN__
f0101263 g     F .text	00000018 strlen
f01013a4 g     F .text	00000021 strchr
f01006be g     F .text	000000b0 mon_kerninfo
f0100778 g     F .text	00000147 monitor
f01014e2 g     F .text	00000023 memfind
f0100686 g     F .text	00000038 mon_help

In the Program Header part, vaddr is the virtual address, paddr is the physical address, and memsz/filesz give the size of the loaded area.

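In boot/main.c, the ph->p_pa field of each program header holds the segment's destination physical address.
The BIOS loads the boot sector into memory starting at 0x7c00, which is therefore the boot sector's load address. That leads to Exercise 5.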
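
As a quick check of how p_pa and p_memsz determine where the kernel ends up, the sketch below walks a table of segments and returns the highest physical address touched. The struct keeps only the two fields needed here, and image_end is a hypothetical name; the numbers in the usage note are the two LOAD entries from the objdump -x output above.

```c
#include <stdint.h>

/* Only the two Proghdr fields this sketch needs. */
struct segment {
    uint32_t p_pa;     /* destination physical address */
    uint32_t p_memsz;  /* size in memory */
};

/* One past the highest physical address covered by any segment. */
uint32_t image_end(const struct segment *segs, unsigned count)
{
    uint32_t end = 0;
    for (unsigned i = 0; i < count; i++) {
        uint32_t seg_end = segs[i].p_pa + segs[i].p_memsz;
        if (seg_end > end)
            end = seg_end;
    }
    return end;
}
```

With the segments {0x00100000, 0x7120} and {0x00108000, 0xa948}, the loaded image ends at 0x00112948, a little above the 1MB mark where it starts.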

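The ELF header has another very important field, e_entry, which we already met in bootmain(). It holds the link address of the program's entry point, that is, the address in the text section at which the program starts executing.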

$ objdump -f obj/kern/kernel

obj/kern/kernel:     file format elf32-i386
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0010000c

Next, complete Exercise 6.

Part 3: The Kernel

Using virtual memory to work around position dependence

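Operating system kernels are usually linked at a high virtual address (see kern/kernel.ld), such as 0xf0100000, leaving the lower part of the processor's virtual address space for user programs.
Many machines have no physical memory at such a high address, so the processor's memory management hardware is used to map virtual address 0xf0100000 to physical address 0x00100000. The kernel is therefore loaded at physical address 1MB, just above the BIOS ROM.
For now only the first 4MB of physical memory is mapped; see Exercise 7. Note that without the 6.828-specific QEMU, the system reboots endlessly here.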
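
The mapping boils down to a fixed offset. The sketch below assumes KERNBASE = 0xF0000000, which is what the 0xf0100000 to 0x00100000 pair implies, and a 4MB window; the function names are my own and are not the macros JOS defines.

```c
#include <stdint.h>

#define KERNBASE   0xF0000000u          /* assumed from 0xf0100000 -> 0x00100000 */
#define BOOT_MAPPED (4u * 1024 * 1024)  /* the first 4MB are mapped at this stage */

/* Kernel virtual address -> physical address under the early mapping. */
uint32_t kva_to_pa(uint32_t va)
{
    return va - KERNBASE;
}

/* Physical address -> kernel virtual address. */
uint32_t pa_to_kva(uint32_t pa)
{
    return pa + KERNBASE;
}

/* Is this kernel virtual address inside the initially mapped 4MB window? */
int in_boot_mapping(uint32_t va)
{
    return va >= KERNBASE && va - KERNBASE < BOOT_MAPPED;
}
```

This also explains the two spellings of the entry point in the symbol table: entry is at f010000c, and kva_to_pa(0xf010000c) is 0x0010000c, the start address objdump reports.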

Formatted Printing to the Console

At long, long last we reach my favorite part: the C code!

Read through kern/printf.c, lib/printfmt.c, and kern/console.c, and make sure you understand their relationship. It will become clear in later labs why printfmt.c is located in the separate lib directory.

In other words: read kern/printf.c, lib/printfmt.c and kern/console.c, then work out how they relate.

kern/printf.c:

// Simple implementation of cprintf console output for the kernel,
// based on printfmt() and the kernel console's cputchar().

#include <inc/types.h>
#include <inc/stdio.h>
#include <inc/stdarg.h>


static void
putch(int ch, int *cnt)
{
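	// print one character
	cputchar(ch);
	// The handed-out source reads `*cnt++`, which parses as *(cnt++): it bumps the
	// local pointer and never updates the counter. Parenthesized so the count is incremented.
	(*cnt)++;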
}

int
vcprintf(const char *fmt, va_list ap)
{
	int cnt = 0;
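	// Pass the static putch above, a counter, the format string and the variable argument list.
	vprintfmt((void*)putch, &cnt, fmt, ap);
	// Return the counter: the number of characters printed, like the C library's printf.
	return cnt;
}

// This function is just a variadic wrapper around vcprintf
int
cprintf(const char *fmt, ...)
{
	// For how variable arguments work, see C Primer Plus; I already know it, so I won't go over it here.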
	va_list ap;
	int cnt;

	va_start(ap, fmt);
	cnt = vcprintf(fmt, ap);
	va_end(ap);

	return cnt;
}
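
The counter update in putch hinges on operator precedence: postfix ++ binds tighter than unary *, so `*cnt++` means *(cnt++). A minimal sketch of the difference, using two hypothetical functions of my own:

```c
/* Parses as *(cnt++): advances the local copy of the pointer and
 * discards the value read, so the caller's counter is unchanged.
 * The (void) cast only silences the unused-value warning. */
void count_with_bare_increment(int *cnt)
{
    (void)*cnt++;
}

/* Dereferences first, then increments the pointed-to counter. */
void count_with_parentheses(int *cnt)
{
    (*cnt)++;
}
```

Calling the first function on a counter that holds 0 leaves it at 0; calling the second one makes it 1.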

lib/printfmt.c:

// Stripped-down primitive printf-style formatting routines,
// used in common by printf, sprintf, fprintf, etc.
// This code is also used by both the kernel and user programs.

#include <inc/types.h>
#include <inc/stdio.h>
#include <inc/string.h>
#include <inc/stdarg.h>
#include <inc/error.h>

/*
 * Space or zero padding and a field width are supported for the numeric
 * formats only.
 *
 * The special format %e takes an integer error code
 * and prints a string describing the error.
 * The integer may be positive or negative,
 * so that -E_NO_MEM and E_NO_MEM are equivalent.
 */

// String constants for the error messages.
static const char * const error_string[MAXERROR] =
{
	[E_UNSPECIFIED]	= "unspecified error",
	[E_BAD_ENV]	= "bad environment",
	[E_INVAL]	= "invalid parameter",
	[E_NO_MEM]	= "out of memory",
	[E_NO_FREE_ENV]	= "out of environments",
	[E_FAULT]	= "segmentation fault",   // the stuff of nightmares?
};

/*
 * Print a number (base <= 16) in reverse order,
 * using specified putch function and associated pointer putdat.
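 * A static helper.
 * Despite the wording above, the digits come out most significant first: the recursion handles the higher digits before the lowest one is printed. The base must be at most 16.
 * It uses the supplied putch function and its associated pointer putdat; both show up again in vprintfmt.
 * The putch function pointer is effectively the same thing as putch in printf.c. More on that later.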
 */
static void
printnum(void (*putch)(int, void*), void *putdat,
	 unsigned long long num, unsigned base, int width, int padc)
{
	// first recursively print all preceding (more significant) digits
	// recurse on the higher digits
	if (num >= base) {
		printnum(putch, putdat, num / base, base, width - 1, padc);
	} else {
		// print any needed pad characters before first digit
		// this is what pads the number to a fixed width; padc is the padding character
		while (--width > 0)
			putch(padc, putdat);
	}

	// then print this (the least significant) digit
	putch("0123456789abcdef"[num % base], putdat);
}
// Having read it through, one word: neat.

// Get an unsigned int of various possible sizes from a varargs list,
// depending on the lflag parameter.
static unsigned long long
getuint(va_list *ap, int lflag)
{
	// lflag is determined by the number of l's in front of the conversion (e.g. %lu, %llu).
	if (lflag >= 2)
		return va_arg(*ap, unsigned long long);
	else if (lflag)
		return va_arg(*ap, unsigned long);
	else
		return va_arg(*ap, unsigned int);
}

// Same as getuint but signed - can't use getuint
// because of sign extension
static long long
getint(va_list *ap, int lflag)
{
	// Same here: determined by the number of l's.
	if (lflag >= 2)
		return va_arg(*ap, long long);
	else if (lflag)
		return va_arg(*ap, long);
	else
		return va_arg(*ap, int);
}


// Main function to format and print a string.
void printfmt(void (*putch)(int, void*), void *putdat, const char *fmt, ...);

void
vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)
{
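	// pointer to a constant string
	register const char *p;
	// ch holds the character just consumed from fmt
	// err is the error number used by %e
	register int ch, err;
	// unsigned 64-bit integer
	unsigned long long num;
	// base (radix), count of l prefixes, output width, precision, and altflag (set by '#', used by %s)
	int base, lflag, width, precision, altflag;
	// padding character used when the value is narrower than the width
	char padc;

	while (1) {
		// An endless loop, left via return when '\0' is reached.
		// Consume characters of fmt. The cast to unsigned char keeps bytes >= 0x80 from being sign-extended into negative ints.
		// Until a % (the start of a conversion specification) turns up, print the character and keep going.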
		while ((ch = *(unsigned char *) fmt++) != '%') {
			if (ch == '\0')
				return;
			putch(ch, putdat);
		}
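		// A % was consumed above, so carry on below
		// Process a %-escape sequence
		// reset the per-conversion state
		padc = ' ';
		width = -1;
		precision = -1;
		lflag = 0;
		altflag = 0;
	// re-enter the switch; needed for the flags and prefixes of a conversion specification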
	reswitch:
		switch (ch = *(unsigned char *) fmt++) {

		// flag to pad on the right
		case '-':
			padc = '-';
			goto reswitch;

		// flag to pad with 0's instead of spaces
		case '0':
			padc = '0';
			goto reswitch;

		// width field
		case '1':
		case '2':
		case '3':
		case '4':
		case '5':
		case '6':
		case '7':
		case '8':
		case '9':
			// On a digit, read all the digits that follow and accumulate them as a decimal number
			for (precision = 0; ; ++fmt) {
				precision = precision * 10 + ch - '0';
				ch = *fmt;
				if (ch < '0' || ch > '9')
					break;
			}
			goto process_precision;

		case '*':
			// '*' takes the value from the variable argument list (a width, or a precision after '.')
			precision = va_arg(ap, int);
			goto process_precision;

		case '.':
			// If width is still -1, set it to 0,
			// so that process_precision treats the number that follows as the precision.
			if (width < 0)
				width = 0;
			goto reswitch;

		case '#':
			altflag = 1;
			goto reswitch;

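		// I was just wondering how the decimal point would be handled, and the solution here is neat;
		// I could not use goto this well myself.
		// If width has not been assigned yet, i.e. no '.' has been seen,
		// the number just read cannot be the precision, so it is moved into width and precision is reset.
		// Neat!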
		process_precision:
			if (width < 0)
				width = precision, precision = -1;
			goto reswitch;

		// long flag (doubled for long long)
		// the l prefix mentioned several times above
		case 'l':
			lflag++;
			goto reswitch;

		// character
		// print a single character
		case 'c':
			putch(va_arg(ap, int), putdat);
			break;

		// error message
		// Error message; in the standard library's printf, %e is floating point in exponent notation
		case 'e':
			// err is an integer
			err = va_arg(ap, int);
			// take the absolute value
			if (err < 0)
				err = -err;
			if (err >= MAXERROR || (p = error_string[err]) == NULL)
				// no matching entry in error_string: print the number
				printfmt(putch, putdat, "error %d", err);
			else
				// otherwise print the error message
				printfmt(putch, putdat, "%s", p);
			break;

		// string
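		// print a string
		case 's':
			// a NULL pointer is printed as "(null)"
			if ((p = va_arg(ap, char *)) == NULL)
				p = "(null)";
			// width is positive and the output is right-aligned
			if (width > 0 && padc != '-')
				// take the length of p, capped at precision
				// quite a neat piece of code.
				for (width -= strnlen(p, precision); width > 0; width--)
					putch(padc, putdat);
			// keep going while the string has not ended and (no precision is set, or fewer than precision characters have been printed).
			for (; (ch = *p++) != '\0' && (precision < 0 || --precision >= 0); width--)
				if (altflag && (ch < ' ' || ch > '~'))
					putch('?', putdat);
				else
					putch(ch, putdat);
				// altflag is 0 unless '#' was given, so by default the '?' branch is not taken
				// and control characters and extended ASCII are printed as they are
			// for left alignment the remaining width is still there; pad with that many spaces.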
			for (; width > 0; width--)
				putch(' ', putdat);
			break;

		// (signed) decimal
		// print a signed decimal integer
		case 'd':
			num = getint(&ap, lflag);
			// every integer size is widened to 64 bits, so one code path can test the sign and negate safely.
			if ((long long) num < 0) {
				putch('-', putdat);
				num = -(long long) num;
			}
			base = 10;
			goto number;

		// unsigned decimal
		// print an unsigned decimal integer
		case 'u':
			num = getuint(&ap, lflag);
			base = 10;
			goto number;

		// (unsigned) octal
		// print an unsigned octal integer
		case 'o':
			// Replace this with your code.
			// (this part gets changed in Exercise 8)
			putch('X', putdat);
			putch('X', putdat);
			putch('X', putdat);
			break;

		// pointer
		// print a pointer in hexadecimal, prefixed with the 0x marker
		case 'p':
			putch('0', putdat);
			putch('x', putdat);
			num = (unsigned long long)
				(uintptr_t) va_arg(ap, void *);
			base = 16;
			goto number;

		// (unsigned) hexadecimal
		// hexadecimal output
		case 'x':
			num = getuint(&ap, lflag);
			base = 16;
		// common output path for all integers
		number:
			printnum(putch, putdat, num, base, width, padc);
			break;

		// escaped '%' character
		// print a literal %
		case '%':
			putch(ch, putdat);
			break;

		// unrecognized escape sequence - just print it literally
		default:
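			// Unrecognized conversion after the %: print a literal %, then rewind fmt to just past the % so that what follows is printed as ordinary text.
			putch('%', putdat);
			// rewind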
			for (fmt--; fmt[-1] != '%'; fmt--)
				/* do nothing */;
			break;
		}
	}
}

// A variadic wrapper around vprintfmt.
void
printfmt(void (*putch)(int, void*), void *putdat, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vprintfmt(putch, putdat, fmt, ap);
	va_end(ap);
}

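// buffer state for printing into a string
struct sprintbuf {
	char *buf;       // next free position in the buffer (starts at the head)
	char *ebuf;      // last byte of the buffer, reserved for the terminating '\0'
	int cnt;         // count of characters printed so far
};

// putch callback that stores characters into a sprintbuf
static void
sprintputch(int ch, struct sprintbuf *b)
{
	b->cnt++;                 // bump the counter
	if (b->buf < b->ebuf)     // if the end has not been reached
		*b->buf++ = ch;       // store the character in buf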
}

// The bounded version of vsprintf; n guards against overflow.
int
vsnprintf(char *buf, int n, const char *fmt, va_list ap)
{
	// ebuf is buf+n-1, so n is meant to be the full buffer size (like strlen + 1):
	// the last byte is kept free for the terminating '\0'
	struct sprintbuf b = {buf, buf+n-1, 0};

	if (buf == NULL || n < 1)
		return -E_INVAL;

	// the call ends up back in vprintfmt
	// print the string to the buffer
	vprintfmt((void*)sprintputch, &b, fmt, ap);

	// null terminate the buffer
	// it then writes a '\0' at the end,
	// which confirms that the minus one above is there for sizeof of a char array
	*b.buf = '\0';

	return b.cnt;
}


// A variadic wrapper around vsnprintf
int
snprintf(char *buf, int n, const char *fmt, ...)
{
	va_list ap;
	int rc;

	va_start(ap, fmt);
	rc = vsnprintf(buf, n, fmt, ap);
	va_end(ap);

	return rc;
}
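
To convince myself how printnum's recursion and padding interact, here is a standalone sketch that formats into a caller-supplied buffer instead of going through a putch callback. format_number, format_digits and emit are hypothetical names of mine, and the caller is assumed to provide a buffer large enough for the result.

```c
#include <stddef.h>

/* Store one character and advance the cursor. */
static void emit(char **cursor, char c)
{
    *(*cursor)++ = c;
}

/* Same shape as printnum: recurse on the higher digits first,
 * pad once the most significant digit is reached, then print
 * the current (least significant) digit on the way back. */
static void format_digits(char **cursor, unsigned long long num,
                          unsigned base, int width, char padc)
{
    if (num >= base) {
        format_digits(cursor, num / base, base, width - 1, padc);
    } else {
        while (--width > 0)
            emit(cursor, padc);
    }
    emit(cursor, "0123456789abcdef"[num % base]);
}

/* Format num into out (NUL-terminated) and return its length.
 * Assumes 2 <= base <= 16 and that out is large enough. */
size_t format_number(char *out, unsigned long long num,
                     unsigned base, int width, char padc)
{
    char *cursor = out;
    format_digits(&cursor, num, base, width, padc);
    *cursor = '\0';
    return (size_t)(cursor - out);
}
```

Each recursion level passes width - 1 down, so by the time the most significant digit is reached, width counts only the padding positions left in front of it.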

kern/console.c: I can't follow it and don't feel like reading it, so I'm leaving it alone for now.

/* See COPYRIGHT for copyright information. */

#include <inc/x86.h>
#include <inc/memlayout.h>
#include <inc/kbdreg.h>
#include <inc/string.h>
#include <inc/assert.h>

#include <kern/console.h>

static void cons_intr(int (*proc)(void));
static void cons_putc(int c);

// Stupid I/O delay routine necessitated by historical PC design flaws
static void
delay(void)
{
	inb(0x84);
	inb(0x84);
	inb(0x84);
	inb(0x84);
}

/***** Serial I/O code *****/

#define COM1		0x3F8

#define COM_RX		0	// In:	Receive buffer (DLAB=0)
#define COM_TX		0	// Out: Transmit buffer (DLAB=0)
#define COM_DLL		0	// Out: Divisor Latch Low (DLAB=1)
#define COM_DLM		1	// Out: Divisor Latch High (DLAB=1)
#define COM_IER		1	// Out: Interrupt Enable Register
#define   COM_IER_RDI	0x01	//   Enable receiver data interrupt
#define COM_IIR		2	// In:	Interrupt ID Register
#define COM_FCR		2	// Out: FIFO Control Register
#define COM_LCR		3	// Out: Line Control Register
#define	  COM_LCR_DLAB	0x80	//   Divisor latch access bit
#define	  COM_LCR_WLEN8	0x03	//   Wordlength: 8 bits
#define COM_MCR		4	// Out: Modem Control Register
#define	  COM_MCR_RTS	0x02	// RTS complement
#define	  COM_MCR_DTR	0x01	// DTR complement
#define	  COM_MCR_OUT2	0x08	// Out2 complement
#define COM_LSR		5	// In:	Line Status Register
#define   COM_LSR_DATA	0x01	//   Data available
#define   COM_LSR_TXRDY	0x20	//   Transmit buffer avail
#define   COM_LSR_TSRE	0x40	//   Transmitter off

static bool serial_exists;

static int
serial_proc_data(void)
{
	if (!(inb(COM1+COM_LSR) & COM_LSR_DATA))
		return -1;
	return inb(COM1+COM_RX);
}

void
serial_intr(void)
{
	if (serial_exists)
		cons_intr(serial_proc_data);
}

static void
serial_putc(int c)
{
	int i;

	for (i = 0;
	     !(inb(COM1 + COM_LSR) & COM_LSR_TXRDY) && i < 12800;
	     i++)
		delay();

	outb(COM1 + COM_TX, c);
}

static void
serial_init(void)
{
	// Turn off the FIFO
	outb(COM1+COM_FCR, 0);

	// Set speed; requires DLAB latch
	outb(COM1+COM_LCR, COM_LCR_DLAB);
	outb(COM1+COM_DLL, (uint8_t) (115200 / 9600));
	outb(COM1+COM_DLM, 0);

	// 8 data bits, 1 stop bit, parity off; turn off DLAB latch
	outb(COM1+COM_LCR, COM_LCR_WLEN8 & ~COM_LCR_DLAB);

	// No modem controls
	outb(COM1+COM_MCR, 0);
	// Enable rcv interrupts
	outb(COM1+COM_IER, COM_IER_RDI);

	// Clear any preexisting overrun indications and interrupts
	// Serial port doesn't exist if COM_LSR returns 0xFF
	serial_exists = (inb(COM1+COM_LSR) != 0xFF);
	(void) inb(COM1+COM_IIR);
	(void) inb(COM1+COM_RX);

}



/***** Parallel port output code *****/
// For information on PC parallel port programming, see the class References
// page.

static void
lpt_putc(int c)
{
	int i;

	for (i = 0; !(inb(0x378+1) & 0x80) && i < 12800; i++)
		delay();
	outb(0x378+0, c);
	outb(0x378+2, 0x08|0x04|0x01);
	outb(0x378+2, 0x08);
}




/***** Text-mode CGA/VGA display output *****/

static unsigned addr_6845;
static uint16_t *crt_buf;
static uint16_t crt_pos;

static void
cga_init(void)
{
	volatile uint16_t *cp;
	uint16_t was;
	unsigned pos;

	cp = (uint16_t*) (KERNBASE + CGA_BUF);
	was = *cp;
	*cp = (uint16_t) 0xA55A;
	if (*cp != 0xA55A) {
		cp = (uint16_t*) (KERNBASE + MONO_BUF);
		addr_6845 = MONO_BASE;
	} else {
		*cp = was;
		addr_6845 = CGA_BASE;
	}

	/* Extract cursor location */
	outb(addr_6845, 14);
	pos = inb(addr_6845 + 1) << 8;
	outb(addr_6845, 15);
	pos |= inb(addr_6845 + 1);

	crt_buf = (uint16_t*) cp;
	crt_pos = pos;
}



static void
cga_putc(int c)
{
	// if no attribute given, then use black on white
	if (!(c & ~0xFF))
		c |= 0x0700;

	switch (c & 0xff) {
	case '\b':
		if (crt_pos > 0) {
			crt_pos--;
			crt_buf[crt_pos] = (c & ~0xff) | ' ';
		}
		break;
	case '\n':
		crt_pos += CRT_COLS;
		/* fallthru */
	case '\r':
		crt_pos -= (crt_pos % CRT_COLS);
		break;
	case '\t':
		cons_putc(' ');
		cons_putc(' ');
		cons_putc(' ');
		cons_putc(' ');
		cons_putc(' ');
		break;
	default:
		crt_buf[crt_pos++] = c;		/* write the character */
		break;
	}

	// What is the purpose of this?
	if (crt_pos >= CRT_SIZE) {
		int i;

		memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
		for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
			crt_buf[i] = 0x0700 | ' ';
		crt_pos -= CRT_COLS;
	}

	/* move that little blinky thing */
	outb(addr_6845, 14);
	outb(addr_6845 + 1, crt_pos >> 8);
	outb(addr_6845, 15);
	outb(addr_6845 + 1, crt_pos);
}


/***** Keyboard input code *****/

#define NO		0

#define SHIFT		(1<<0)
#define CTL		(1<<1)
#define ALT		(1<<2)

#define CAPSLOCK	(1<<3)
#define NUMLOCK		(1<<4)
#define SCROLLLOCK	(1<<5)

#define E0ESC		(1<<6)

static uint8_t shiftcode[256] =
{
	[0x1D] = CTL,
	[0x2A] = SHIFT,
	[0x36] = SHIFT,
	[0x38] = ALT,
	[0x9D] = CTL,
	[0xB8] = ALT
};

static uint8_t togglecode[256] =
{
	[0x3A] = CAPSLOCK,
	[0x45] = NUMLOCK,
	[0x46] = SCROLLLOCK
};

static uint8_t normalmap[256] =
{
	NO,   0x1B, '1',  '2',  '3',  '4',  '5',  '6',	// 0x00
	'7',  '8',  '9',  '0',  '-',  '=',  '\b', '\t',
	'q',  'w',  'e',  'r',  't',  'y',  'u',  'i',	// 0x10
	'o',  'p',  '[',  ']',  '\n', NO,   'a',  's',
	'd',  'f',  'g',  'h',  'j',  'k',  'l',  ';',	// 0x20
	'\'', '`',  NO,   '\\', 'z',  'x',  'c',  'v',
	'b',  'n',  'm',  ',',  '.',  '/',  NO,   '*',	// 0x30
	NO,   ' ',  NO,   NO,   NO,   NO,   NO,   NO,
	NO,   NO,   NO,   NO,   NO,   NO,   NO,   '7',	// 0x40
	'8',  '9',  '-',  '4',  '5',  '6',  '+',  '1',
	'2',  '3',  '0',  '.',  NO,   NO,   NO,   NO,	// 0x50
	[0xC7] = KEY_HOME,	      [0x9C] = '\n' /*KP_Enter*/,
	[0xB5] = '/' /*KP_Div*/,      [0xC8] = KEY_UP,
	[0xC9] = KEY_PGUP,	      [0xCB] = KEY_LF,
	[0xCD] = KEY_RT,	      [0xCF] = KEY_END,
	[0xD0] = KEY_DN,	      [0xD1] = KEY_PGDN,
	[0xD2] = KEY_INS,	      [0xD3] = KEY_DEL
};

static uint8_t shiftmap[256] =
{
	NO,   033,  '!',  '@',  '#',  '$',  '%',  '^',	// 0x00
	'&',  '*',  '(',  ')',  '_',  '+',  '\b', '\t',
	'Q',  'W',  'E',  'R',  'T',  'Y',  'U',  'I',	// 0x10
	'O',  'P',  '{',  '}',  '\n', NO,   'A',  'S',
	'D',  'F',  'G',  'H',  'J',  'K',  'L',  ':',	// 0x20
	'"',  '~',  NO,   '|',  'Z',  'X',  'C',  'V',
	'B',  'N',  'M',  '<',  '>',  '?',  NO,   '*',	// 0x30
	NO,   ' ',  NO,   NO,   NO,   NO,   NO,   NO,
	NO,   NO,   NO,   NO,   NO,   NO,   NO,   '7',	// 0x40
	'8',  '9',  '-',  '4',  '5',  '6',  '+',  '1',
	'2',  '3',  '0',  '.',  NO,   NO,   NO,   NO,	// 0x50
	[0xC7] = KEY_HOME,	      [0x9C] = '\n' /*KP_Enter*/,
	[0xB5] = '/' /*KP_Div*/,      [0xC8] = KEY_UP,
	[0xC9] = KEY_PGUP,	      [0xCB] = KEY_LF,
	[0xCD] = KEY_RT,	      [0xCF] = KEY_END,
	[0xD0] = KEY_DN,	      [0xD1] = KEY_PGDN,
	[0xD2] = KEY_INS,	      [0xD3] = KEY_DEL
};

#define C(x) (x - '@')

static uint8_t ctlmap[256] =
{
	NO,      NO,      NO,      NO,      NO,      NO,      NO,      NO,
	NO,      NO,      NO,      NO,      NO,      NO,      NO,      NO,
	C('Q'),  C('W'),  C('E'),  C('R'),  C('T'),  C('Y'),  C('U'),  C('I'),
	C('O'),  C('P'),  NO,      NO,      '\r',    NO,      C('A'),  C('S'),
	C('D'),  C('F'),  C('G'),  C('H'),  C('J'),  C('K'),  C('L'),  NO,
	NO,      NO,      NO,      C('\\'), C('Z'),  C('X'),  C('C'),  C('V'),
	C('B'),  C('N'),  C('M'),  NO,      NO,      C('/'),  NO,      NO,
	[0x97] = KEY_HOME,
	[0xB5] = C('/'),		[0xC8] = KEY_UP,
	[0xC9] = KEY_PGUP,		[0xCB] = KEY_LF,
	[0xCD] = KEY_RT,		[0xCF] = KEY_END,
	[0xD0] = KEY_DN,		[0xD1] = KEY_PGDN,
	[0xD2] = KEY_INS,		[0xD3] = KEY_DEL
};

static uint8_t *charcode[4] = {
	normalmap,
	shiftmap,
	ctlmap,
	ctlmap
};

/*
 * Get data from the keyboard.  If we finish a character, return it.  Else 0.
 * Return -1 if no data.
 */
static int
kbd_proc_data(void)
{
	int c;
	uint8_t stat, data;
	static uint32_t shift;

	stat = inb(KBSTATP);
	if ((stat & KBS_DIB) == 0)
		return -1;
	// Ignore data from mouse.
	if (stat & KBS_TERR)
		return -1;

	data = inb(KBDATAP);

	if (data == 0xE0) {
		// E0 escape character
		shift |= E0ESC;
		return 0;
	} else if (data & 0x80) {
		// Key released
		data = (shift & E0ESC ? data : data & 0x7F);
		shift &= ~(shiftcode[data] | E0ESC);
		return 0;
	} else if (shift & E0ESC) {
		// Last character was an E0 escape; or with 0x80
		data |= 0x80;
		shift &= ~E0ESC;
	}

	shift |= shiftcode[data];
	shift ^= togglecode[data];

	c = charcode[shift & (CTL | SHIFT)][data];
	if (shift & CAPSLOCK) {
		if ('a' <= c && c <= 'z')
			c += 'A' - 'a';
		else if ('A' <= c && c <= 'Z')
			c += 'a' - 'A';
	}

	// Process special keys
	// Ctrl-Alt-Del: reboot
	if (!(~shift & (CTL | ALT)) && c == KEY_DEL) {
		cprintf("Rebooting!\n");
		outb(0x92, 0x3); // courtesy of Chris Frost
	}

	return c;
}

void
kbd_intr(void)
{
	cons_intr(kbd_proc_data);
}

static void
kbd_init(void)
{
}



/***** General device-independent console code *****/
// Here we manage the console input buffer,
// where we stash characters received from the keyboard or serial port
// whenever the corresponding interrupt occurs.

#define CONSBUFSIZE 512

static struct {
	uint8_t buf[CONSBUFSIZE];
	uint32_t rpos;
	uint32_t wpos;
} cons;

// called by device interrupt routines to feed input characters
// into the circular console input buffer.
static void
cons_intr(int (*proc)(void))
{
	int c;

	while ((c = (*proc)()) != -1) {
		if (c == 0)
			continue;
		cons.buf[cons.wpos++] = c;
		if (cons.wpos == CONSBUFSIZE)
			cons.wpos = 0;
	}
}

// return the next input character from the console, or 0 if none waiting
int
cons_getc(void)
{
	int c;

	// poll for any pending input characters,
	// so that this function works even when interrupts are disabled
	// (e.g., when called from the kernel monitor).
	serial_intr();
	kbd_intr();

	// grab the next character from the input buffer.
	if (cons.rpos != cons.wpos) {
		c = cons.buf[cons.rpos++];
		if (cons.rpos == CONSBUFSIZE)
			cons.rpos = 0;
		return c;
	}
	return 0;
}

// output a character to the console
static void
cons_putc(int c)
{
	serial_putc(c);
	lpt_putc(c);
	cga_putc(c);
}

// initialize the console devices
void
cons_init(void)
{
	cga_init();
	kbd_init();
	serial_init();

	if (!serial_exists)
		cprintf("Serial port does not exist!\n");
}


// `High'-level console I/O.  Used by readline and cprintf.

void
cputchar(int c)
{
	cons_putc(c);
}

int
getchar(void)
{
	int c;

	while ((c = cons_getc()) == 0)
		/* do nothing */;
	return c;
}

int
iscons(int fdnum)
{
	// used by readline
	return 1;
}

Next, Exercise 8.

Then come a few questions:

Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?

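Explain the interface between printf.c and console.c.
It's really just one function, cputchar(). I don't feel like digging into the details. According to Shu's article, console.c is what drives output to the CRT, i.e. the display interface.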

Explain the following from console.c:

if (crt_pos >= CRT_SIZE) {
        int i;
        memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
        for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
                crt_buf[i] = 0x0700 | ' ';
        crt_pos -= CRT_COLS;
}

CRT_SIZE and CRT_COLS are defined in console.h:

#define CRT_ROWS	25
#define CRT_COLS	80
#define CRT_SIZE	(CRT_ROWS * CRT_COLS)

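crt_buf is a uint16_t* pointer. I couldn't work out what this code meant at first, so I checked Meng's Zhihu post: when the screen is full, it drops the first row, moves every other row up by one, and writes new content into the freed last row. Looking at it again, CRT_SIZE - CRT_COLS (every cell except one row) does fit that reading.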

For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC’s calling convention on the x86.
Trace the execution of the following code step-by-step:

int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);
  • In the call to cprintf(), to what does fmt point? To what does ap point?
  • List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

I don't really feel like debugging this... such a hassle.
A gdb breakpoint can be set at a specific line of a specific file; here, for example:

(gdb) b kern/init.c:38
Breakpoint 1 at 0xf01000c8: file kern/init.c, line 38.
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf01000c8 <i386_init+52>:	push   $0x4

Breakpoint 1, i386_init () at kern/init.c:39
39		cprintf("x %d, y %x, z %d\n", x, y, z);

Then step through it:

(gdb) si
=> 0xf01000ca <i386_init+54>:	push   $0x3
0xf01000ca	39		cprintf("x %d, y %x, z %d\n", x, y, z);
(gdb) si
=> 0xf01000cc <i386_init+56>:	push   $0x1
0xf01000cc	39		cprintf("x %d, y %x, z %d\n", x, y, z);
(gdb) si
=> 0xf01000ce <i386_init+58>:	push   $0xf0101912
0xf01000ce	39		cprintf("x %d, y %x, z %d\n", x, y, z);
(gdb) si
=> 0xf01000d3 <i386_init+63>:	call   0xf0100906 <cprintf>
0xf01000d3	39		cprintf("x %d, y %x, z %d\n", x, y, z);

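So fmt points to 0xf0101912, where the format string is stored. Since this is a mapped virtual address, we can also check the contents at the corresponding physical address: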

(gdb) x/s 0xf0101912
0xf0101912:	"x %d, y %x, z%d\n"
(gdb) x/s 0x00101912
0x101912:	"x %d, y %x, z%d\n"

Step into cprintf() and continue.

(gdb) si
=> 0xf010090f <cprintf+9>:	push   %eax
32		cnt = vcprintf(fmt, ap);
(gdb) si
=> 0xf0100910 <cprintf+10>:	pushl  0x8(%ebp)
0xf0100910	32		cnt = vcprintf(fmt, ap);
(gdb) si
=> 0xf0100913 <cprintf+13>:	call   0xf01008e0 <vcprintf>
0xf0100913	32		cnt = vcprintf(fmt, ap);

While eax is still unchanged, let's look at the memory it points to:

(gdb) x/3d $eax
0xf010ffd4:	1	3	4

From the debug output above, eax holds the pointer to the variable-argument list, i.e. ap.
I'm skipping the second question; it's all time-consuming busywork.

Run the following code.

   unsigned int i = 0x00646c72;
   cprintf("H%x Wo%s", 57616, &i);

What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here’s an ASCII table that maps bytes to characters.
The output depends on the fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

Here’s a description of little- and big-endian and a more whimsical description.

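First, the output: He110 World.
It's simple: 57616 is 0xe110 in hex, and matching the bytes of i against an ASCII table gives 0x72 'r', 0x6c 'l', 0x64 'd', 0x00 '\0'.
With big-endian byte order the constant in front is definitely unaffected, but the bytes of i would be read in the opposite order. Meng's and Shu's articles both say the output would be He110 Wodlr, but I think the 00 byte would come first: the byte at the lowest address would be '\0', so nothing after it would be printed, and the output should be He110 Wo. To get the same output on a big-endian machine, i would have to be 0x726c6400; 57616 would not need to change.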

In the following code, what is going to be printed after ‘y=’? (note: the answer is not a specific value.) Why does this happen?

   cprintf("x=%d y=%d", 3);

This is undefined behavior (UB). The output is x=3, y=1600. Let's look at the debugger output.

(gdb) si
=> 0xf01000cb <i386_init+55>:	push   $0x3
0xf01000cb	37		cprintf("x=%d, y=%d", 3);
(gdb) si
=> 0xf01000cd <i386_init+57>:	push   $0xf0101912
0xf01000cd	37		cprintf("x=%d, y=%d", 3);
(gdb) si
=> 0xf01000d2 <i386_init+62>:	call   0xf0100907 <cprintf>
0xf01000d2	37		cprintf("x=%d, y=%d", 3);
(gdb) si
=> 0xf0100907 <cprintf>:	push   %ebp
cprintf (fmt=0xf0101912 "x=%d, y=%d") at kern/printf.c:27

Once cprintf's stack frame is set up, let's look at the stack:

(gdb) x/8d $ebp
0xf010ffd8:	-267321352	-267386665	-267380462	3
0xf010ffe8:	1600	0	0	0

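All we know is that the word after 3 on the stack is 1600: va_arg just reads whatever sits above the last real argument. Why that word happens to be 1600, I don't know.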

Let’s say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

This looks like a hassle, but I think something like this would still work:

int cprintf(..., fmt);

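I haven't tested it, so I'm not sure whether it would actually work. (It isn't valid standard C, where ... must come last, so take it as pseudocode.)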

Challenge Enhance the console to allow text to be printed in different colors. The traditional way to do this is to make it interpret ANSI escape sequences embedded in the text strings printed to the console, but you may use any mechanism you like. There is plenty of information on the 6.828 reference page and elsewhere on the web on programming the VGA display hardware. If you’re feeling really adventurous, you could try switching the VGA hardware into a graphics mode and making the console draw text onto the graphical frame buffer.

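The Challenge really is a challenge. Let's find a way to change the colors the VGA hardware displays.
Back in console.c, anything color-related is probably a statement that ORs in a hexadecimal constant, so for now just look for those.
Sure enough, at lines 165~167 of console.c there is a comment and a couple of statements: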

// if no attribute given, then use black on white
if (!(c & ~0xFF))
	c |= 0x0700;

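First, c is an int. The condition means roughly: if every bit of c above the low byte is 0 (no attribute was given), OR in 0x0700. Luckily, years ago (about 8, I'd guess) I wrote Windows batch scripts, and I remember that the help text of the color command has a table, excerpted here: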

    0 = Black       8 = Gray
    1 = Blue        9 = Light Blue
    2 = Green       A = Light Green
    3 = Aqua        B = Light Aqua
    4 = Red         C = Light Red
    5 = Purple      D = Light Purple
    6 = Yellow      E = Light Yellow
    7 = White       F = Bright White

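I also guessed that the third hex digit (counting from the right) is the foreground color and the fourth is the background color. I changed the constant to 0x2700 and, sure enough, everything printed after Booting from Hard Disk had a green background.
That more or less completes the challenge. It wasn't that hard after all.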

The Stack

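Lab 1 is nearly over, but 4 more Exercises are still waiting for us.
In this part we look more closely at how C uses the stack on x86, and finally reach the high point of lab1: writing a new function that prints a stack backtrace, a list of saved IP (Instruction Pointer) values. Each IP value comes from one of the nested call instructions that led to the current point of execution.
On to Exercise 9.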

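The x86 stack pointer (the ESP register) points to the lowest address of the stack currently in use; within the region reserved for the stack, everything below ESP is free. Pushing a value first decrements the stack pointer and then stores the value at the address it points to. Popping does the reverse. In 32-bit mode the stack holds only 32-bit values, so the stack pointer moves in units of 4. This is what we've been seeing all along in the NASM tutorial.
By contrast, the base pointer (aka the EBP register) is tied to the stack mainly by software convention. We also saw this in the part of the NASM tutorial on calling interfaces. On entry to a C function, at the position gdb shows as { (the original text calls this the prologue code), the old EBP value is pushed first, and then ESP is copied into EBP. If every function follows this convention, then, in one long sentence: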

It is possible to trace back through the stack by following the chain of saved ebp pointers and determining exactly what nested sequence of function calls caused this particular point in the program to be reached.

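Put simply, function calls form a chain, which makes one feature possible: tracing back through the calls, the backtrace mentioned above.
A backtrace helps us trace the root cause of a failed assert or of a bad argument that triggered a panic; it is also the basis of the exception chains (stack traces) common in other high-level languages.
Start Exercise 10.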

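The original text hints that mon_backtrace() can be written in pure C, but first we need to look at the read_ebp() function in inc/x86.h, and hook the new function into the monitor so users can invoke it.
The backtrace function must print output in the following format: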

Stack backtrace:
  ebp f0109e58  eip f0100a62  args 00000001 f0109e80 f0109e98 f0100ed2 00000031
  ebp f0109ed8  eip f01000d6  args 00000000 00000000 f0100058 f0109f28 00000061
  ...

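Each line contains ebp, eip and args. ebp is the base pointer of the stack frame, the snapshot we mentioned earlier in the Exercise. eip is the return instruction pointer, i.e. the address execution returns to; no more needs saying. args are the first five arguments passed to the function; a function may of course take fewer than five, but that doesn't matter, the extra values are just garbage.
That's the introduction; on to Exercise 11.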

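With such a function in place, much like the exception chain I mentioned above, users can more easily find which level of the call chain an error started from; it's a small exception-chain system of its own.
The last big assignment of lab1 is to integrate this function into the system's commands. We are given a function, debuginfo_eip(), that looks up an eip in the symbol table and returns debugging information for that address. It is defined in kern/kdebug.c. See Exercise 12 later.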

That's the end of the main text; see you in lab2.

Exercises

Exercise 1

Exercise 1. Familiarize yourself with the assembly language materials available on the 6.828 reference page. You don’t have to read them now, but you’ll almost certainly want to refer to some of this material when reading and writing x86 assembly.

We do recommend reading the section “The Syntax” in Brennan’s Guide to Inline Assembly. It gives a good (and quite brief) description of the AT&T assembly syntax we’ll be using with the GNU assembler in JOS.

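As mentioned in the main text, I wrote a separate post on an assembly book. Assembly is a core course anyway, so I spent some time on it; by my count it came to 70k characters.
The other item, Brennan's Guide to Inline Assembly, is an article on inline assembly in C. I didn't read it; see Shu's article$^2$.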

Exercise 2

Exercise 2. Use GDB’s si (Step Instruction) command to trace into the ROM BIOS for a few more instructions, and try to guess what it might be doing. You might want to look at Phil Storrs I/O Ports Description, as well as other materials on the 6.828 reference materials page. No need to figure out all the details - just the general idea of what the BIOS is doing first.

Step with the si command and guess what each instruction does.

[f000:fff0]    0xffff0:	ljmp   $0xf000,$0xe05b
[f000:e05b]    0xfe05b:	cmpl   $0x0,%cs:0x6c48      # cs == 0xf000
[f000:e062]    0xfe062:	jne    0xfd2e1
[f000:e066]    0xfe066:	xor    %dx,%dx              # dx = 0
[f000:e068]    0xfe068:	mov    %dx,%ss              # ss = dx = 0
[f000:e06a]    0xfe06a:	mov    $0x7000,%esp         # esp = 0x7000
[f000:e070]    0xfe070:	mov    $0xf3691,%edx        # edx = 0xf3691
[f000:e076]    0xfe076:	jmp    0xfd165
[f000:d165]    0xfd165:	mov    %eax,%ecx            # ecx = eax
[f000:d168]    0xfd168:	cli                         # 
[f000:d169]    0xfd169:	cld                         # DF = 0, addresses increment

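CLI: Clear Interrupt flag, which disables interrupts. STI: Set Interrupt flag, which enables them. CLI and STI are used to mask and restore interrupts. For example, when setting the stack segment SS and the offset SP you need CLI: if the two instructions get separated, SS may already be modified when an interrupt sends execution elsewhere before SP has been updated, and things can go wrong.
CLD: Clear Direction flag. STD: Set Direction flag. They are used for string (block) transfers and decide the transfer direction: CLD makes transfers go from low to high addresses, STD the opposite. $^3$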

[f000:d16a]    0xfd16a:	mov    $0x8f,%eax           # eax = 0x8f
[f000:d170]    0xfd170:	out    %al,$0x70            # write al to port 0x70
[f000:d172]    0xfd172:	in     $0x71,%al            # read port 0x71 into al

The out and in instructions operate on I/O ports.

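The CPU usually communicates with an external device by reading and modifying registers in the device controller. These registers are also called I/O ports. For ease of management, the 80x86 CPU addresses I/O ports separately: all device ports are assigned addresses in one I/O port address space, which is independent of the memory address space. Ports must therefore be accessed with different instructions from those used for memory.
Ports 0x70 and 0x71 control a device called the CMOS, a low-power storage device that keeps some information while the computer is powered off; it is powered by its own battery.
The CMOS controls several PC-related functions, the most important being the real-time clock (RTC); it can also control whether the non-maskable interrupt (NMI) is responded to.
Accessing CMOS memory takes two ports, 0x70 and 0x71. Port 0x70 can be called the index register. The top bit of this 8-bit register is the NMI control bit: if you set it to 1, NMI is not responded to. The low 7 bits select the CMOS storage cell address. So if you want to access cell 1 with NMI disabled during the access, you write 0b10000001 = 0x81 to port 0x70. $^4$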

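Here 0x8f is moved into eax and written to port 0x70 so that the value of storage cell 0xf can be read through port 0x71 (in $0x71,%al), with NMI disabled. But the value in al is never used, so I take these three lines as simply disabling NMI.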

[f000:d174]    0xfd174:	in     $0x92,%al            # read port 0x92 into al
[f000:d176]    0xfd176:	or     $0x2,%al             # set bit 1 of al (counting from the right, starting at bit 0)
[f000:d178]    0xfd178:	out    %al,$0x92            # write al back

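These three lines set bit 1 of port 0x92 to 1.
Port 0x92 is PS/2 System Control Port A$^5$, and bit 1 = 1 indicates A20 active: bit 1 is the A20 bit, so the 21st address line is enabled. Enabling A20 does not by itself switch to protected mode; it stops addresses from wrapping around at 1 MB, which protected-mode code needs. The boot loader, however, still has to start in real mode, so this is probably only for probing the available memory.$^6$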

[f000:d17a]    0xfd17a:	lidtw  %cs:0x6c38           # load the 6 bytes starting at %cs:0x6c38 into IDTR

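The lidt instruction loads the interrupt descriptor table register (IDTR). Here it reads the 6 bytes starting at address 0xf6c38 into IDTR. Interrupts are a very important part of an operating system; only with interrupts can an OS really implement processes. Every kind of interrupt has its own handler, and the start address of that handler is called the interrupt vector for that interrupt. The interrupt vector table is, naturally, the table that stores all the interrupt vectors. $^4$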

[f000:d180]    0xfd180:	lgdtw  %cs:0x6bf4           # 

This loads the 6 bytes starting at address 0xf6bf4 into the global descriptor table register (GDTR). GDTR is introduced in the boot loader part.

[f000:d186]    0xfd186:	mov    %cr0,%eax
[f000:d189]    0xfd189:	or     $0x1,%eax
[f000:d18d]    0xfd18d:	mov    %eax,%cr0            # cr0 |= 0x1

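This sets bit 0 of the control register CR0 to 1. The processor has four control registers, CR0~CR3; bit 0 of CR0 is the PE (Protection Enable) bit, and setting it to 1 turns on protected mode.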

[f000:d190]    0xfd190:	ljmpl  $0x8,$0xfd198
The target architecture is assumed to be i386
=> 0xfd198:	mov    $0x10,%eax                       # eax = 0x10
=> 0xfd19d:	mov    %eax,%ds                         # ds = eax
=> 0xfd19f:	mov    %eax,%es                         # es = eax
=> 0xfd1a1:	mov    %eax,%ss                         # ss = eax
=> 0xfd1a3:	mov    %eax,%fs                         # fs = eax
=> 0xfd1a5:	mov    %eax,%gs                         # gs = eax
=> 0xfd1a7:	mov    %ecx,%eax                        # eax = ecx
=> 0xfd1a9:	jmp    *%edx
=> 0xf3691:	push   %ebx                             # a familiar subroutine begins...
...

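These register settings follow the rules: right after loading GDTR, all segment registers must be reloaded$^7$, and the CS segment register has to be reloaded with a far jump (ljmp), which in effect makes the new GDTR take effect$^4$.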

F Segment (FS). Pointer to more extra data (‘F’ comes after ‘E’).
G Segment (GS). Pointer to still more extra data (‘G’ comes after ‘F’).$^2$

Exercise 3

Exercise 3. Take a look at the lab tools guide, especially the section on GDB commands. Even if you’re familiar with GDB, this includes some esoteric GDB commands that are useful for OS work.

Set a breakpoint at address 0x7c00, which is where the boot sector will be loaded. Continue execution until that breakpoint. Trace through the code in boot/boot.S, using the source code and the disassembly file obj/boot/boot.asm to keep track of where you are. Also use the x/i command in GDB to disassemble sequences of instructions in the boot loader, and compare the original boot loader source code with both the disassembly in obj/boot/boot.asm and GDB.

Trace into bootmain() in boot/main.c, and then into readsect(). Identify the exact assembly instructions that correspond to each of the statements in readsect(). Trace through the rest of readsect() and back out into bootmain(), and identify the begin and end of the for loop that reads the remaining sectors of the kernel from the disk. Find out what code will run when the loop is finished, set a breakpoint there, and continue to that breakpoint. Then step through the remainder of the boot loader.

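Use GDB to set a breakpoint at 0x7c00, the start address of the boot sector. Use x/i to view the instructions in the boot loader, then compare them with the disassembly in obj/boot/boot.asm.
In the boot.S part, where CR0 is set, the code shown by GDB and the disassembly differ:
GDB: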

   0x7c1a:	mov    $0xdf,%al
   0x7c1c:	out    %al,$0x60
   0x7c1e:	lgdtw  0x7c64
   0x7c23:	mov    %cr0,%eax

boot.asm:

  movl    %cr0, %eax
    7c24:       20 c0                   and    %al,%al
  orl     $CR0_PE_ON, %eax
    7c26:       66 83 c8 01             or     $0x1,%ax
  movl    %eax, %cr0
    7c2a:       0f 22 c0                mov    %eax,%cr0

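I didn't track down the cause. A likely explanation is that objdump decodes this .code16 part of boot.S as 32-bit code, so the instruction boundaries shift (20 c0 is the tail of 0f 20 c0, mov %cr0,%eax). The code does what it should either way, so I moved on.
Next, on to 0x7d15: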

# void bootmain(void) {
    0x7d15:	push   %ebp
    0x7d16:	mov    %esp,%ebp
    0x7d18:	push   %esi
    0x7d19:	push   %ebx

    # push the three arguments
    0x7d1a:	push   $0x0                  # offset = 0
    0x7d1c:	push   $0x1000               # count  = SECTSIZE * 8 = 0x200 << 3
    0x7d21:	push   $0x10000              # pa     = 0x10000
    # call readseg
    0x7d26:	call   0x7cdc
        # void readseg(uint32_t pa, uint32_t count, uint32_t offset) {
        0x7cdc:	push   %ebp
        0x7cdd:	mov    %esp,%ebp
        0x7cdf:	push   %edi
        0x7ce0:	push   %esi

        0x7ce1:	mov    0x10(%ebp),%edi   # edi = offset
        0x7ce4:	push   %ebx              # save ebx; it is used two lines below
        0x7ce5:	mov    0xc(%ebp),%esi    # esi = count
        0x7ce8:	mov    0x8(%ebp),%ebx    # ebx = pa

        0x7ceb:	shr    $0x9,%edi         # offset = (offset / SECTSIZE)
        0x7cee:	add    %ebx,%esi         # count += pa, esi(count) --> esi(end_pa)
        0x7cf0:	inc    %edi              # offset += 1   // together with the shr two lines up: offset = (offset / SECTSIZE) + 1;
        0x7cf1:	and    $0xfffffe00,%ebx  # pa &= ~(SECTSIZE - 1);

        # while (pa < end_pa)
        0x7cf7:	cmp    %esi,%ebx
        0x7cf9:	jae    0x7d0d

        # push the arguments before the call
        0x7cfb:	push   %edi              # offset
        0x7cfc:	push   %ebx              # pa

        0x7cfd:	inc    %edi              # offset++;
        0x7cfe:	add    $0x200,%ebx       # pa += SECTSIZE;
        # call readsect
        0x7d04:	call   0x7c7c
            # void readsect(void *dst, uint32_t offset) {
            0x7c7c:	push   %ebp
            0x7c7d:	mov    %esp,%ebp

            0x7c7f:	push   %edi                          # save edi
            0x7c80:	mov    0xc(%ebp),%ecx                # ecx = offset
            # call waitdisk
            0x7c83:	call   0x7c6a
                # void waitdisk(void) {
                0x7c6a:	push   %ebp
                0x7c6b:	mov    $0x1f7,%edx           # edx = 0x1f7
                0x7c70:	mov    %esp,%ebp

                0x7c72:	in     (%dx),%al             # al = inb(0x1F7)
                0x7c73:	and    $0xffffffc0,%eax      # eax & 0xC0
                # while ((inb(0x1F7) & 0xC0) != 0x40)
                0x7c76:	cmp    $0x40,%al
                0x7c78:	jne    0x7c72

                0x7c7a:	pop    %ebp
                0x7c7b:	ret
                # }

            # outb(0x1F2, 1);
            0x7c88:	mov    $0x1f2,%edx               # edx = 0x1f2
            0x7c8d:	mov    $0x1,%al                  # al = 1
            0x7c8f:	out    %al,(%dx)
            # outb(0x1F3, offset);
            0x7c90:	mov    $0x1f3,%edx               # edx = 0x1f3
            0x7c95:	mov    %cl,%al                   # al = cl; ecx == offset
            0x7c97:	out    %al,(%dx)
            # outb(0x1F4, offset >> 8);
            0x7c98:	mov    %ecx,%eax                 # eax = ecx = offset
            0x7c9a:	mov    $0x1f4,%edx               # edx = 0x1f4
            0x7c9f:	shr    $0x8,%eax                 # eax >>= 8
            0x7ca2:	out    %al,(%dx)
            # outb(0x1F5, offset >> 16);
            0x7ca3:	mov    %ecx,%eax                 # eax = ecx = offset 
            0x7ca5:	mov    $0x1f5,%edx               # edx = 0x1f5
            0x7caa:	shr    $0x10,%eax                # eax >>= 16
            0x7cad:	out    %al,(%dx)
            # outb(0x1F6, (offset >> 24) | 0xE0);
            0x7cae:	mov    %ecx,%eax                 # eax = ecx = offset
            0x7cb0:	mov    $0x1f6,%edx               # edx = 0x1f6
            0x7cb5:	shr    $0x18,%eax                # eax >>= 24
            0x7cb8:	or     $0xffffffe0,%eax          # eax |= 0xe0 (only al is written out)
            0x7cbb:	out    %al,(%dx)
            # outb(0x1F7, 0x20)
            0x7cbc:	mov    $0x1f7,%edx               # edx = 0x1f7
            0x7cc1:	mov    $0x20,%al                 # al = 0x20
            0x7cc3:	out    %al,(%dx)

            # call waitdisk (omitted)
            0x7cc4:	call   0x7c6a

            # insl(0x1F0, dst, SECTSIZE/4);
            0x7cc9:	mov    0x8(%ebp),%edi            # dst
            0x7ccc:	mov    $0x80,%ecx                # SECTSIZE/4
            0x7cd1:	mov    $0x1f0,%edx               # 0x1F0
            0x7cd6:	cld                              # DF = 0
            0x7cd7:	repnz insl (%dx),%es:(%edi)

            0x7cd9:	pop    %edi
            0x7cda:	pop    %ebp
            0x7cdb:	ret
            # }

        0x7d0d:	lea    -0xc(%ebp),%esp               # restore esp
        0x7d10:	pop    %ebx
        0x7d11:	pop    %esi
        0x7d12:	pop    %edi
        0x7d13:	pop    %ebp
        0x7d14:	ret
        # }

    # pop the arguments
    0x7d2b:	add    $0xc,%esp
    # if (ELFHDR->e_magic != ELF_MAGIC) {
    0x7d2e:	cmpl   $0x464c457f,0x10000
    # goto bad;
    0x7d38:	jne    0x7d71
    # }

        # bad: {
        # outw(0x8A00, 0x8A00);
        0x7d71:	mov    $0x8a00,%edx
        0x7d76:	mov    $0xffff8a00,%eax
        0x7d7b:	out    %ax,(%dx)
        # outw(0x8A00, 0x8E00);
        0x7d7d:	mov    $0xffff8e00,%eax
        0x7d82:	out    %ax,(%dx)
        # while (1);
        0x7d84:	jmp    0x7d84
        # }

    # ebx == ph, esi == eph
    # ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
    0x7d3a:	mov    0x1001c,%eax                      # eax = *(uint32_t *) 0x1001c == ELFHDR->e_phoff
    0x7d3f:	movzwl 0x1002c,%esi                      # esi = *(uint16_t *) 0x1002c == ELFHDR->e_phnum
    0x7d46:	lea    0x10000(%eax),%ebx                # ph = ebx = eax + ELFHDR
    # eph = ph + ELFHDR->e_phnum;
    0x7d4c:	shl    $0x5,%esi                         # esi <<= 5; esi == e_phnum * sizeof(struct Proghdr), 32 bytes each
    0x7d4f:	add    %ebx,%esi                         # eph = esi = ph + ELFHDR->e_phnum (pointer arithmetic)

    # for (; ph < eph; ph++) {
    0x7d51:	cmp    %esi,%ebx
    0x7d53:	jae    0x7d6b

    # push the arguments
    0x7d55:	pushl  0x4(%ebx)                         # ph->p_offset
    0x7d58:	pushl  0x14(%ebx)                        # ph->p_memsz
    0x7d5b:	add    $0x20,%ebx                        # ph++
    0x7d5e:	pushl  -0x14(%ebx)                       # ph->p_pa
    0x7d61:	call   0x7cdc                            # call readseg (omitted)
    0x7d66:	add    $0xc,%esp                         # pop the arguments
    0x7d69:	jmp    0x7d51
    # }

    # ((void (*)(void)) (ELFHDR->e_entry))();
    0x7d6b:	call   *0x10018
# } // bootmain quit

I didn't do the comparison in the end; I spent a long time reading the code and matching it against the C source.

Exercise 4

Exercise 4. Read about programming with pointers in C. The best reference for the C language is The C Programming Language by Brian Kernighan and Dennis Ritchie (known as ‘K&R’). We recommend that students purchase this book (here is an Amazon Link) or find one of MIT’s 7 copies.

Read 5.1 (Pointers and Addresses) through 5.5 (Character Pointers and Functions) in K&R. Then download the code for pointers.c, run it, and make sure you understand where all of the printed values come from. In particular, make sure you understand where the pointer addresses in printed lines 1 and 6 come from, how all the values in printed lines 2 through 4 get there, and why the values printed in line 5 are seemingly corrupted.

There are other references on pointers in C (e.g., A tutorial by Ted Jensen that cites K&R heavily), though not as strongly recommended.

Warning: Unless you are already thoroughly versed in C, do not skip or even skim this reading exercise. If you do not really understand pointers in C, you will suffer untold pain and misery in subsequent labs, and then eventually come to understand them the hard way. Trust us; you don’t want to find out what “the hard way” is.

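We get two classics recommended: K&R and a tutorial on pointers in C. The exercise tells us to read up on C pointers. It's too basic, so we skip it.
There's also a warning at the end: even if you're an old hand, skipping or skimming the recommended reading is not advised, because old hands easily crash in the later labs too. We simply ignore it.
The code of pointers.c: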

#include <stdio.h>
#include <stdlib.h>

void
f(void)
{
    int a[4];
    int *b = malloc(16);
    int *c;
    int i;

    printf("1: a = %p, b = %p, c = %p\n", a, b, c);

    c = a;
    for (i = 0; i < 4; i++)
        a[i] = 100 + i;
    c[0] = 200;
    printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
	      a[0], a[1], a[2], a[3]);

    c[1] = 300;
    *(c + 2) = 301;
    3[c] = 302;
    printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
	      a[0], a[1], a[2], a[3]);

    c = c + 1;
    *c = 400;
    printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
	      a[0], a[1], a[2], a[3]);

    c = (int *) ((char *) c + 1);
    *c = 500;
    printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
	      a[0], a[1], a[2], a[3]);

    b = (int *) a + 1;
    c = (int *) ((char *) a + 1);
    printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}

int
main(int ac, char **av)
{
    f();
    return 0;
}

Exercise 5

Exercise 5. Trace through the first few instructions of the boot loader again and identify the first instruction that would “break” or otherwise do the wrong thing if you were to get the boot loader’s link address wrong. Then change the link address in boot/Makefrag to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens. Don’t forget to change the link address back and make clean again afterward!

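For this exercise I looked at myk's and Shu's articles: Meng changed the address to 0x8c00 and Shu changed it to 0x6c00, so I'll try 0x1c00 and see what happens.
Edit the address in boot/Makefrag (change it freely; the whole project is under version control anyway), replacing 0x7c00 with 0x1c00, run make clean once, then make once, and look at the result.
The make output is the same as before. One aside, though: according to Shu's article, the line ld: warning: section '.bss' type changed to PROGBITS did not appear for him with 0x7c00, whereas I got it both times. Quoting his article: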

如果.bss是NOBITS的,那么链接器会在输出的文件里告诉操作系统当这个程序被加载的时候,根据提供的信息,将某一块内存给分配出来,并置0,但是如果是PROGBITS的话,就是告诉系统从文件里取出一块已经被置0的数据段存入内存中,所以区别就在NOBITS的文件中,.bbs数据段是不占用空间的,但是PROGBITS的数据段是占用空间的。虽然最后对运行的程序没什么影响,最大的影响是可执行文件多了一块被置零的数据段,需要占用更多的空间。$^2$

然后, 先用 objdump 检查一下地址:

$ objdump -h obj/boot/boot.out 

obj/boot/boot.out:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000186  00001c00  00001c00  00000074  2**2
                  CONTENTS, ALLOC, LOAD, CODE
  1 .eh_frame     000000a8  00001d88  00001d88  000001fc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         00000720  00000000  00000000  000002a4  2**2
                  CONTENTS, READONLY, DEBUGGING
  3 .stabstr      0000088f  00000000  00000000  000009c4  2**0
                  CONTENTS, READONLY, DEBUGGING
  4 .comment      00000035  00000000  00000000  00001253  2**0
                  CONTENTS, READONLY

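We can see that the VMA and LMA of .text have indeed moved to 0x1c00.
A gdb breakpoint at 0x1c00 gave me nothing to trace, because the BIOS still loads the boot sector at 0x7c00 regardless of the link address. So I simply ran it as a test: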

$ make qemu-nox
sed "s/localhost:1234/localhost:26000/" < .gdbinit.tmpl > .gdbinit
***
*** Use Ctrl-a x to exit qemu
***
qemu-system-i386 -nographic -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::26000 -D qemu.log 
main-loop: WARNING: I/O thread spun for 1000 iterations

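It hangs right here. Switching to the Ubuntu desktop inside the VM and running make qemu, the QEMU window flickers continuously and the text Booting from hard disk is faintly visible. Next I tested an address above 0x7c00 and picked 0xac00; the behavior is exactly the same. Debugging with gdb, with a breakpoint at 0x7c00: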

(gdb) b *0x7c00
Breakpoint 1 at 0x7c00
(gdb) c
Continuing.
[   0:7c00] => 0x7c00:	cli    

Breakpoint 1, 0x00007c00 in ?? ()
(gdb) x/50
   0x7c01:	cld    
   0x7c02:	xor    %ax,%ax
   0x7c04:	mov    %ax,%ds
   0x7c06:	mov    %ax,%es
   0x7c08:	mov    %ax,%ss
   0x7c0a:	in     $0x64,%al
   0x7c0c:	test   $0x2,%al
   0x7c0e:	jne    0x7c0a
   0x7c10:	mov    $0xd1,%al
   0x7c12:	out    %al,$0x64
   0x7c14:	in     $0x64,%al
   0x7c16:	test   $0x2,%al
   0x7c18:	jne    0x7c14
   0x7c1a:	mov    $0xdf,%al
   0x7c1c:	out    %al,$0x60
   0x7c1e:	lgdtw  -0x539c
   0x7c23:	mov    %cr0,%eax
   0x7c26:	or     $0x1,%eax
   0x7c2a:	mov    %eax,%cr0
   0x7c2d:	ljmp   $0x8,$0xac32
   0x7c32:	mov    $0xd88e0010,%eax
   0x7c38:	mov    %ax,%es
   0x7c3a:	mov    %ax,%fs
   0x7c3c:	mov    %ax,%gs
   0x7c3e:	mov    %ax,%ss
   0x7c40:	mov    $0xac00,%sp
   0x7c43:	add    %al,(%bx,%si)
   0x7c45:	call   0x7d13
   0x7c48:	add    %al,(%bx,%si)
   0x7c4a:	jmp    0x7c4a
   0x7c4c:	add    %al,(%bx,%si)
   0x7c4e:	add    %al,(%bx,%si)
   0x7c50:	add    %al,(%bx,%si)
   0x7c52:	add    %al,(%bx,%si)
   0x7c54:	(bad)  
   0x7c55:	incw   (%bx,%si)
   0x7c57:	add    %al,(%bx,%si)
   0x7c59:	lcall  $0xffff,$0xcf
   0x7c5e:	add    %al,(%bx,%si)
   0x7c60:	add    %dl,0xcf(%bp,%si)
   0x7c64:	pop    %ss
   0x7c65:	add    %cl,-0x54(%si)
   0x7c68:	add    %al,(%bx,%si)
   0x7c6a:	push   %bp
   0x7c6b:	mov    $0x1f7,%dx
   0x7c6e:	add    %al,(%bx,%si)
   0x7c70:	mov    %sp,%bp
   0x7c72:	in     (%dx),%al
   0x7c73:	and    $0xffc0,%ax
   0x7c76:	cmp    $0x40,%al

For reference, here are the same 50 instructions from the original (0x7c00) build:

(gdb) x/50
   0x7c01:	cld    
   0x7c02:	xor    %ax,%ax
   0x7c04:	mov    %ax,%ds
   0x7c06:	mov    %ax,%es
   0x7c08:	mov    %ax,%ss
   0x7c0a:	in     $0x64,%al
   0x7c0c:	test   $0x2,%al
   0x7c0e:	jne    0x7c0a
   0x7c10:	mov    $0xd1,%al
   0x7c12:	out    %al,$0x64
   0x7c14:	in     $0x64,%al
   0x7c16:	test   $0x2,%al
   0x7c18:	jne    0x7c14
   0x7c1a:	mov    $0xdf,%al
   0x7c1c:	out    %al,$0x60
   0x7c1e:	lgdtw  0x7c64
   0x7c23:	mov    %cr0,%eax
   0x7c26:	or     $0x1,%eax
   0x7c2a:	mov    %eax,%cr0
   0x7c2d:	ljmp   $0x8,$0x7c32
   0x7c32:	mov    $0xd88e0010,%eax
   0x7c38:	mov    %ax,%es
   0x7c3a:	mov    %ax,%fs
   0x7c3c:	mov    %ax,%gs
   0x7c3e:	mov    %ax,%ss
   0x7c40:	mov    $0x7c00,%sp
   0x7c43:	add    %al,(%bx,%si)
   0x7c45:	call   0x7d13
   0x7c48:	add    %al,(%bx,%si)
   0x7c4a:	jmp    0x7c4a
   0x7c4c:	add    %al,(%bx,%si)
   0x7c4e:	add    %al,(%bx,%si)
   0x7c50:	add    %al,(%bx,%si)
   0x7c52:	add    %al,(%bx,%si)
   0x7c54:	(bad)  
   0x7c55:	incw   (%bx,%si)
   0x7c57:	add    %al,(%bx,%si)
   0x7c59:	lcall  $0xffff,$0xcf
   0x7c5e:	add    %al,(%bx,%si)
   0x7c60:	add    %dl,0xcf(%bp,%si)
   0x7c64:	pop    %ss
   0x7c65:	add    %cl,0x7c(%si)
   0x7c68:	add    %al,(%bx,%si)
   0x7c6a:	push   %bp
   0x7c6b:	mov    $0x1f7,%dx
   0x7c6e:	add    %al,(%bx,%si)
   0x7c70:	mov    %sp,%bp
   0x7c72:	in     (%dx),%al
   0x7c73:	and    $0xffc0,%ax
   0x7c76:	cmp    $0x40,%al

I had no text-comparison tool at hand, so I just used git diff:

diff --git a/ex5 b/ex5
index 3955d53..76c3fd6 100644
--- a/ex5
+++ b/ex5
@@ -14,17 +14,17 @@
    0x7c18:     jne    0x7c14
    0x7c1a:     mov    $0xdf,%al
    0x7c1c:     out    %al,$0x60
-   0x7c1e:     lgdtw  0x7c64
+   0x7c1e:     lgdtw  -0x539c
    0x7c23:     mov    %cr0,%eax
    0x7c26:     or     $0x1,%eax
    0x7c2a:     mov    %eax,%cr0
-   0x7c2d:     ljmp   $0x8,$0x7c32
+   0x7c2d:     ljmp   $0x8,$0xac32
    0x7c32:     mov    $0xd88e0010,%eax
    0x7c38:     mov    %ax,%es
    0x7c3a:     mov    %ax,%fs
    0x7c3c:     mov    %ax,%gs
    0x7c3e:     mov    %ax,%ss
-   0x7c40:     mov    $0x7c00,%sp
+   0x7c40:     mov    $0xac00,%sp
    0x7c43:     add    %al,(%bx,%si)
    0x7c45:     call   0x7d13
    0x7c48:     add    %al,(%bx,%si)
@@ -40,7 +40,7 @@
    0x7c5e:     add    %al,(%bx,%si)
    0x7c60:     add    %dl,0xcf(%bp,%si)
    0x7c64:     pop    %ss
-   0x7c65:     add    %cl,0x7c(%si)
+   0x7c65:     add    %cl,-0x54(%si)
    0x7c68:     add    %al,(%bx,%si)
    0x7c6a:     push   %bp
    0x7c6b:     mov    $0x1f7,%dx

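Some fairly serious problems show up. The operand of lgdtw is printed as -0x539c, which is just the 16-bit address 0xac64 shown as a signed number: the instruction now reads the GDT descriptor from 0xac64, but the boot sector still sits at 0x7c00, so whatever is at 0xac64 is not our gdtdesc. The long jump target moved along with the link address as well: it should land on the very next instruction (0x7c32), but now it goes to 0xac32, where no boot loader code has been loaded. This ljmp is the first instruction that visibly breaks, so there is no point looking any further.
I made the change on a separate branch; now I switch back to the lab1 branch and discard the modification.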

Exercise 6

Exercise 6. We can examine memory using GDB’s x command. The GDB manual has full details, but for now, it is enough to know that the command x/Nx ADDR prints N words of memory at ADDR. (Note that both ‘x’s in the command are lowercase.) Warning: The size of a word is not a universal standard. In GNU assembly, a word is two bytes (the ‘w’ in xorw, which stands for word, means 2 bytes).

Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)

On entry to the boot loader (breakpoint at 0x7c00), the data printed is all zeros.

(gdb) x/8x 0x00100000
0x100000:	0x00000000	0x00000000	0x00000000	0x00000000
0x100010:	0x00000000	0x00000000	0x00000000	0x00000000

Next we set a breakpoint at the call to e_entry:

(gdb) b *0x7d6b
Breakpoint 1 at 0x7d6b
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x7d6b:	call   *0x10018

Breakpoint 1, 0x00007d6b in ?? ()
(gdb) x/8x 0x0010000
0x10000:	0x464c457f	0x00010101	0x00000000	0x00000000
0x10010:	0x00030002	0x00000001	0x0010000c	0x00000034

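Note that the command above is one zero short: it dumps 0x10000, where the boot loader placed the ELF header it read from disk (the first word, 0x464c457f, is the ELF magic), not 0x100000. At this point 0x00100000 itself already holds the beginning of the kernel image; the same words appear in the Exercise 7 dumps further down (0x1badb002, 0x00000000, 0xe4524ffe, ...).
Execution has reached call *0x10018. After one more si the program jumps to 0x10000c. The word stored at 0x10018 is indeed 0x0010000c, as the dump shows, so this is consistent.

From this we can conclude that 0x0010000c is the address of the kernel's entry code. The two dumps of 0x00100000 differ because the boot loader loaded the kernel into memory in between.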

Following Shu's article, we also notice a problem here:

(gdb) x/10i 0x100000
   0x100000:	add    0x1bad(%eax),%dh
   0x100006:	add    %al,(%eax)
   0x100008:	decb   0x52(%edi)
   0x10000b:	in     $0x66,%al
   0x10000d:	movl   $0xb81234,0x472
   0x100017:	add    %dl,(%ecx)
   0x100019:	add    %cl,(%edi)
   0x10001b:	and    %al,%bl
   0x10001d:	mov    %cr0,%eax
   0x100020:	or     $0x80010001,%eax

Disassembling from 0x100000, no instruction shows up at address 0x10000c, which is clearly wrong. Let us try another way:

(gdb) x/5i 0x10000c
=> 0x10000c:	movw   $0x1234,0x472
   0x100015:	mov    $0x110000,%eax
   0x10001a:	mov    %eax,%cr3
   0x10001d:	mov    %cr0,%eax
   0x100020:	or     $0x80010001,%eax

Next, look at the kernel's disassembly listing, obj/kern/kernel.asm:

f0100000:       02 b0 ad 1b 00 00       add    0x1bad(%eax),%dh
f0100006:       00 00                   add    %al,(%eax)
f0100008:       fe 4f 52                decb   0x52(%edi)
f010000b:       e4                      .byte 0xe4

f010000c <entry>:
f010000c:       66 c7 05 72 04 00 00    movw   $0x1234,0x472
f0100013:       34 12
        # sufficient until we set up our real page table in mem_init
        # in lab 2.

        # Load the physical address of entry_pgdir into cr3.  entry_pgdir
        # is defined in entrypgdir.c.
        movl    $(RELOC(entry_pgdir)), %eax
f0100015:       b8 00 00 11 00          mov    $0x110000,%eax
        movl    %eax, %cr3
f010001a:       0f 22 d8                mov    %eax,%cr3
        # Turn on paging.
        movl    %cr0, %eax
f010001d:       0f 20 c0                mov    %cr0,%eax
        orl     $(CR0_PE|CR0_PG|CR0_WP), %eax

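The 12 bytes from 0xf0100000 through 0xf010000b are not instructions at all. They are the Multiboot header data (magic 0x1badb002, flags 0, checksum 0xe4524ffe), and the e4 byte at 0xf010000b is the last byte of the checksum. When gdb is asked to disassemble from 0x100000 it decodes this data as code, and the bogus instruction starting at 0x10000b swallows the first byte of entry. That is why we ran into the problem above.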

Exercise 7

Exercise 7. Use QEMU and GDB to trace into the JOS kernel and stop at the movl %eax, %cr0. Examine memory at 0x00100000 and at 0xf0100000. Now, single step over that instruction using the stepi GDB command. Again, examine memory at 0x00100000 and at 0xf0100000. Make sure you understand what just happened.

What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren’t in place? Comment out the movl %eax, %cr0 in kern/entry.S, trace into it, and see if you were right.

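I first set a breakpoint at 0x7c00 and used x/50 to look for the call to ELFHDR->e_entry(), but somehow failed to spot it. So I reused the result from Exercise 6, set the breakpoint at *0x7d6b, and let the kernel load. Single-stepping from there, I arrived at the instruction named in the exercise: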

=> 0x100025:	mov    %eax,%cr0

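Then we examine the memory the exercise asks about (I copied a bit more context so it is easier to compare with the second transcript):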

(gdb) si
=> 0x100020:	or     $0x80010001,%eax
0x00100020 in ?? ()
(gdb) si
=> 0x100025:	mov    %eax,%cr0
0x00100025 in ?? ()
(gdb) x/8x 0x00100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x0000b812	0x220f0011	0xc0200fd8
(gdb) x/8x 0xf0100000
0xf0100000 <_start+4026531828>:	0x00000000	0x00000000	0x00000000	0x00000000
0xf0100010 <entry+4>:	0x00000000	0x00000000	0x00000000	0x00000000

Then single-step once and check again:

(gdb) si
=> 0x100028:	mov    $0xf010002f,%eax
0x00100028 in ?? ()
(gdb) x/8x 0x00100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x0000b812	0x220f0011	0xc0200fd8
(gdb) x/8x 0xf0100000
0xf0100000 <_start+4026531828>:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0xf0100010 <entry+4>:	0x34000004	0x0000b812	0x220f0011	0xc0200fd8

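After the step, 0xf0100000 shows the same contents as 0x00100000: with paging on, the high virtual addresses are mapped onto the low physical memory where the kernel was loaded. The next instruction loads the immediate 0xf010002f into EAX. According to obj/kern/kernel.asm this is the address of the relocated label, and it is the target of the jmp *%eax that follows.

Now for the second part of the question: what happens if we comment out movl %eax, %cr0 in kern/entry.S.
As before, I create a new git branch before changing the code.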

(gdb) si
=> 0x100020:	or     $0x80010001,%eax
0x00100020 in ?? ()
(gdb) si
=> 0x100025:	mov    $0xf010002c,%eax
0x00100025 in ?? ()
(gdb) x/8x 0x00100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x0000b812	0x220f0011	0xc0200fd8
(gdb) x/8x 0xf0100000
0xf0100000 <_start+4026531828>:	0x00000000	0x00000000	0x00000000	0x00000000
0xf0100010 <entry+4>:	0x00000000	0x00000000	0x00000000	0x00000000

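With paging never enabled, 0xf0100000 still reads as zeros. The first instruction that should fail is therefore the jmp *%eax that follows, since it jumps to a high address (0xf010002c in this build) that is not backed by anything. Switch back to the lab1 branch, run make clean, and this exercise is done. One question remains, though: what exactly was written into CR0? Let us look again at the code before 0x100025: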

(gdb) si
=> 0x10001d:	mov    %cr0,%eax
0x0010001d in ?? ()
(gdb) si
=> 0x100020:	or     $0x80010001,%eax
0x00100020 in ?? ()
(gdb) si
=> 0x100025:	mov    %eax,%cr0
0x00100025 in ?? ()

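Before this, the value of CR0 is loaded into EAX, EAX is ORed with a constant, and the result is written back to CR0. This is the same masking operation on CR0 that we met in Exercise 2, where we already noted that setting bit 0 of CR0 enables protected mode. Here is a table borrowed from Shu's article:

bit   label  description
0     PE     Protected mode enable
1     MP     Monitor co-processor
2     EM     Emulation
3     TS     Task switched
4     ET     Extension type
5     NE     Numeric error
16    WP     Write protect
18    AM     Alignment mask
29    NW     Not write-through
30    CD     Cache disable
31    PG     Paging

Working it out, the purpose of this mask (0x80010001) is to turn on protected mode (bit 0), write protection (bit 16), and paging (bit 31). This reminds me of the Turn on paging. comment above these three instructions in kern/entry.S. According to Shu's article, PG is the paging enable bit: it controls whether the on-chip paging unit is allowed to work. The following passage is what he quoted from Wikipedia: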

Paging is a system which allows each process to see a full virtual address space, without actually requiring the full amount of physical memory to be available or present. In fact, current implementations of x86-64 have a limit of between 4 GiB and 256 TiB of physical address space (and an architectural limit of 4 PiB of physical address space).

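As Wikipedia puts it, paging is a mechanism that lets a process see a full virtual address space without requiring that much physical memory to be present. This is what makes the 4MB mapping above possible.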

Exercise 8

Exercise 8. We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form “%o”. Find and fill in this code fragment. Be able to answer the following questions:

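This exercise asks us to write a piece of formatted-output code ourselves: the %o format.
In lib/printfmt.c, find the octal case around line 206. Modeling it on the %x case, change it to the following:

case 'o':
    // (unsigned) octal: same as the %x case, but with base 8
    num = getuint(&ap, lflag);
    base = 8;
    goto number;

Then make a commit.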

Exercise 9

Exercise 9. Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which “end” of this reserved area is the stack pointer initialized to point to?

This exercise gives no hints, and there is quite a lot to it.
Looking through the comments in obj/kern/kernel.asm, we find this passage:

relocated:

        # Clear the frame pointer register (EBP)
        # so that once we get into debugging C code,
        # stack backtraces will be terminated properly.
        movl    $0x0,%ebp                       # nuke frame pointer
f010002f:       bd 00 00 00 00          mov    $0x0,%ebp

        # Set the stack pointer
        movl    $(bootstacktop),%esp
f0100034:       bc 00 00 11 f0          mov    $0xf0110000,%esp

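From this we can see that the stack is located at virtual address 0xf0110000, that is, physical address 0x00110000. Since 0xf0100000 is where the kernel image begins, my first guess was that the usable stack range is 0xf0110000 down to 0xf0100001 (the stack grows toward lower addresses), i.e. $2^{16}$ bytes. (What follows shows that this guess is not correct.)

As for how the kernel reserves this space, the general answer is one we already know: the stack pointer starts at the high-address end and the stack grows toward lower addresses. (This looked backwards to me at first compared with my mental picture of a stack.) The details, as reflected in the source, follow.

We need to find the definition of bootstacktop. In kern/entry.S, the source behind kernel.asm, it is defined in the .data section: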

.data
###################################################################
# boot stack
###################################################################
        .p2align        PGSHIFT         # force page alignment
        .globl          bootstack
bootstack:
        .space          KSTKSIZE
        .globl          bootstacktop
bootstacktop:

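We see that the stack area is reserved inside the data section.
Two new constants show up here, though: PGSHIFT and KSTKSIZE. Following the #include lines at the top of this file leads to these macro definitions in inc/memlayout.h: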

// Kernel stack.
#define KSTACKTOP       KERNBASE
#define KSTKSIZE        (8*PGSIZE)              // size of a kernel stack
#define KSTKGAP         (8*PGSIZE)              // size of a kernel stack guard

And in inc/mmu.h:

#define PGSIZE          4096            // bytes mapped by a page
#define PGSHIFT         12              // log2(PGSIZE)

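PGSIZE is the page size, and PGSHIFT, its base-2 logarithm, is naturally the corresponding left-shift amount (hence the name).
KSTKSIZE is the size of the stack: $8\times 4096=2^{15}$ bytes. Judging from its comment, KSTKGAP is the size of a guard region for the kernel stack, and it has the same size as the stack. KSTACKTOP is defined as KERNBASE, so back to memlayout.h: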

// All physical memory mapped at this address
#define KERNBASE        0xF0000000

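So here are the proper answers to this exercise:

  1. The stack is initialized in kern/entry.S, and the top of the stack is at physical address 0x00110000 (virtual 0xf0110000).
  2. To reserve the area, the kernel places it in the .data section and allocates 8 pages (KSTKSIZE bytes) with .space.
  3. These 8 pages occupy physical addresses 0x00108000 to 0x00110000. The stack pointer is initialized to the high end of this reserved area, bootstacktop (0x00110000), and the stack grows down toward 0x00108000.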

Exercise 10

Exercise 10. To become familiar with the C calling conventions on the x86, find the address of the test_backtrace function in obj/kern/kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?

Note that, for this exercise to work properly, you should be using the patched version of QEMU available on the tools page or on Athena. Otherwise, you’ll have to manually translate all breakpoint and memory addresses to linear addresses.

In obj/kern/kernel.asm we find:

f0100040 <test_backtrace>:
#include <kern/console.h>

// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
f0100040:       55                      push   %ebp
f0100041:       89 e5                   mov    %esp,%ebp

So test_backtrace is at address 0xf0100040. Before debugging with breakpoints, let us look at the source of test_backtrace().
kern/init.c:

// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
        cprintf("entering test_backtrace %d\n", x);
        if (x > 0)
                test_backtrace(x-1);
        else
                mon_backtrace(0, 0, 0);
        cprintf("leaving test_backtrace %d\n", x);
}

kern/monitor.c:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
        // Your code here.
        return 0;
}

This should be the last big problem of lab1.
Start debugging.

(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf0100040 <test_backtrace>:	push   %ebp

Breakpoint 1, test_backtrace (x=5) at kern/init.c:13
13	{

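test_backtrace is called with an initial argument of 5. (I checked: the call is in i386_init().)
First look at what esp points to (ebp is only ever a snapshot of esp, so we do not need to track it separately):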

(gdb) x/2d $esp
0xf010ffdc:	-267386668	5

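The first number is the function's return address (0xf01000d4 when read as unsigned hex), and the second is the first argument of test_backtrace.
At this point $esp = 0xf010ffdc.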

(gdb) x/5i 0xf0100040
   0xf0100040 <test_backtrace>:	  push   %ebp
   0xf0100041 <test_backtrace+1>:	mov    %esp,%ebp
   0xf0100043 <test_backtrace+3>:	push   %ebx
   0xf0100044 <test_backtrace+4>:	sub    $0xc,%esp
=> 0xf0100047 <test_backtrace+7>:	mov    0x8(%ebp),%ebx

We printed the first 5 instructions of test_backtrace, which make up its prologue code. The stack has changed by now. Here is the current value of esp:

(gdb) x/x $esp
0xf010ffc8:	0x00

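Let us add up the stack changes here: pushing ebp is -4, pushing ebx is -4, and the sub is -12, for a total of 0x14 (20) bytes. The purpose of subtracting 12 is unclear for now. Checking obj/kern/kernel.asm shows that this sequence is shared by every invocation of test_backtrace. From basic C knowledge, six calls to test_backtrace should then use 6*(20+4+4)=168 bytes of stack. (The first +4 accounts for the argument of each call, the second +4 for the return address of each call.)

Just before the innermost call returns, we print the stack from esp up to the initial argument 5 at 0xf010ffdc + 0x4 = 0xf010ffe0.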

(gdb) x/49d $esp
0xf010ff20:	-267380548	0	0	0
0xf010ff30:	-267384641	1	-267321512	-267386776
0xf010ff40:	0	1	-267321480	0
0xf010ff50:	-267384641	2	-267321480	-267386776
0xf010ff60:	1	2	-267321448	0
0xf010ff70:	-267384641	3	-267321448	-267386776
0xf010ff80:	2	3	-267321416	0
0xf010ff90:	-267384641	4	-267321416	-267386776
0xf010ffa0:	3	4	0	0
0xf010ffb0:	0	5	-267321384	-267386776
0xf010ffc0:	4	5	0	65684
0xf010ffd0:	65684	65684	-267321352	-267386668
0xf010ffe0:	5

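The two ends are 192 bytes apart, not 168. Things seem to be getting more tangled, so let us sort out what we know (cnt counts the bytes by which the stack has grown):

  1. test_backtrace(5) is called from i386_init(). The argument and the return address are pushed: cnt = 0 + 8 = 8.
  2. Prologue of test_backtrace(5). ebp and ebx are pushed and esp moves by -12: cnt = 8 + 20 = 28.
  3. The first cprintf() and the if. Two arguments for cprintf() are pushed, and after it returns esp moves back by 16: cnt = 28 + 2*4 - 16 = 20.
  4. Preparing the recursive call test_backtrace(x-1). esp moves by -12: cnt = 20 + 12 = 32. From here steps 1 to 4 repeat.
  5. The recursion repeats 5 more times: cnt = 32 + 32*5 = 192.
  6. The innermost level calls mon_backtrace in the else branch. esp moves by -4, three arguments are pushed, and after the return esp moves back by 16: cnt = 192 + 4 + 3*4 - 16 = 192.
  7. The innermost level then calls the second cprintf. esp moves by -8, two arguments are pushed, and after the return esp moves back by 16: cnt = 192 + 8 + 2*4 - 16 = 192.
  8. This is the point I had single-stepped to.

I am not sure whether the bookkeeping above misses something, but the total comes out right. So the answer should be that each nesting level uses 32 bytes (of which 20 come from the prologue), that is, eight 32-bit words. As for their contents, the above already covers it: the argument, the return address, the saved ebp and ebx, and some unused padding.
As an afterthought, I do not know for sure why 3+3 words (24 bytes) go unused at every level. It is probably alignment: GCC by default keeps the stack 16-byte aligned at call sites.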

Exercise 11

Exercise 11. Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn’t. After you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.

If you use read_ebp(), note that GCC may generate “optimized” code that calls read_ebp() before mon_backtrace()’s function prologue, which results in an incomplete stack trace (the stack frame of the most recent function call is missing). While we have tried to disable optimizations that cause this reordering, you may want to examine the assembly of mon_backtrace() and make sure the call to read_ebp() is happening after the function prologue.

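Before writing code, read the hint in the exercise carefully. If you use read_ebp(), GCC may generate "optimized" code that calls it before the prologue code of mon_backtrace(). In short, you cannot be sure whether the ebp returned by read_ebp() is the value before or after the prologue updated it; this can be verified by looking at the generated assembly.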
In kern/monitor.c:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
        cprintf("Stack backtrace:\n");
        uint32_t *ebp = (uint32_t *)read_ebp();
        while (ebp != NULL) {
                uint32_t eip = *(ebp + 1);
                uint32_t *args = ebp + 2;
                cprintf("  ebp %08x  eip %08x  args", ebp, eip);
                for (int i = 0; i < 4; ++i) {
                        cprintf(" %08x", *(args + i));
                }
                cprintf("\n");
                ebp = (uint32_t *)(*ebp);
        }
        return 0;
}

Exercise 12

Exercise 12. Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.

In debuginfo_eip, where do __STAB_* come from? This question has a long answer; to help you to discover the answer, here are some things you might want to do:

look in the file kern/kernel.ld for __STAB_*
run objdump -h obj/kern/kernel
run objdump -G obj/kern/kernel
run gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c, and look at init.s.
see if the bootloader loads the symbol table in memory as part of loading the kernel binary
Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.

Add a backtrace command to the kernel monitor, and extend your implementation of mon_backtrace to call debuginfo_eip and print a line for each stack frame of the form:

K> backtrace  
Stack backtrace:  
  ebp f010ff78  eip f01008ae  args 00000001 f010ff8c 00000000 f0110580 00000000  
         kern/monitor.c:143: monitor+106  
  ebp f010ffd8  eip f0100193  args 00000000 00001aac 00000660 00000000 00000000  
         kern/init.c:49: i386_init+59  
  ebp f010fff8  eip f010003d  args 00000000 00000000 0000ffff 10cf9a00 0000ffff  
         kern/entry.S:70: <unknown>+0  
K>   

Each line gives the file name and line within that file of the stack frame’s eip, followed by the name of the function and the offset of the eip from the first instruction of the function (e.g., monitor+106 means the return eip is 106 bytes past the beginning of monitor).

Be sure to print the file and function names on a separate line, to avoid confusing the grading script.

Tip: printf format strings provide an easy, albeit obscure, way to print non-null-terminated strings like those in STABS tables. printf("%.*s", length, string) prints at most length characters of string. Take a look at the printf man page to find out why this works.

You may find that some functions are missing from the backtrace. For example, you will probably see a call to monitor() but not to runcmd(). This is because the compiler in-lines some function calls. Other optimizations may cause you to see unexpected line numbers. If you get rid of the -O2 from GNUMakefile, the backtraces may make more sense (but your kernel will run more slowly).

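The final problem, with a statement so long nobody wants to read it.

First: for the debuginfo_eip() function, where do the __STAB_* symbols it uses come from, and what are they for?
Take a look at kern/kernel.ld: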

/* Include debugging information in kernel memory */
.stab : {
        PROVIDE(__STAB_BEGIN__ = .);
        *(.stab);
        PROVIDE(__STAB_END__ = .);
        BYTE(0)         /* Force the linker to allocate space
                            for this section */
}

.stabstr : {
        PROVIDE(__STABSTR_BEGIN__ = .);
        *(.stabstr);
        PROVIDE(__STABSTR_END__ = .);
        BYTE(0)         /* Force the linker to allocate space
                            for this section */
}

This shows that .stab is the region of kernel memory that holds debugging information.

Next, run the commands it suggests:

$ objdump -h obj/kern/kernel

obj/kern/kernel:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001941  f0100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00000748  f0101960  00101960  00002960  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         00003a15  f01020a8  001020a8  000030a8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      000018e6  f0105abd  00105abd  00006abd  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data         0000a300  f0108000  00108000  00009000  2**12
                  CONTENTS, ALLOC, LOAD, DATA
  5 .bss          00000648  f0112300  00112300  00013300  2**5
                  CONTENTS, ALLOC, LOAD, DATA
  6 .comment      00000035  00000000  00000000  00013948  2**0
                  CONTENTS, READONLY

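Here we see a .stab section and a .stabstr section.
From the LMA and size columns, .stab occupies 0x001020a8 to 0x00105abc and .stabstr occupies 0x00105abd to 0x001073a2; the space after that, up to the page-aligned start of .data at 0x00108000, is padding. The two sections are laid out back to back, so .stabstr begins right where .stab ends.
Note that these addresses may change every time the kernel is rebuilt; I got bitten by that.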

Continuing:

$ objdump -G obj/kern/kernel | more

obj/kern/kernel:     file format elf32-i386

Contents of .stab section:

Symnum n_type n_othr n_desc n_value  n_strx String

-1     HdrSym 0      1231   000018cd 1     
0      SO     0      0      f0100000 1      {standard input}
1      SOL    0      0      f010000c 18     kern/entry.S
2      SLINE  0      44     f010000c 0      
3      SLINE  0      57     f0100015 0      
4      SLINE  0      58     f010001a 0      
5      SLINE  0      60     f010001d 0      
6      SLINE  0      61     f0100020 0      
7      SLINE  0      62     f0100025 0      
8      SLINE  0      67     f0100028 0      
9      SLINE  0      68     f010002d 0      
10     SLINE  0      74     f010002f 0      
11     SLINE  0      77     f0100034 0      
12     SLINE  0      80     f0100039 0      
13     SLINE  0      83     f010003e 0      
14     SO     0      2      f0100040 31     kern/entrypgdir.c
15     OPT    0      0      00000000 49     gcc2_compiled.
16     LSYM   0      0      00000000 64     int:t(0,1)=r(0,1);-2147483648;2147483647;
17     LSYM   0      0      00000000 106    char:t(0,2)=r(0,2);0;127;
18     LSYM   0      0      00000000 132    long int:t(0,3)=r(0,3);-2147483648;2147483647;
19     LSYM   0      0      00000000 179    unsigned int:t(0,4)=r(0,4);0;4294967295;
20     LSYM   0      0      00000000 220    long unsigned int:t(0,5)=r(0,5);0;4294967295;
21     LSYM   0      0      00000000 266    __int128:t(0,6)=r(0,6);0;-1;
22     LSYM   0      0      00000000 295    __int128 unsigned:t(0,7)=r(0,7);0;-1;
23     LSYM   0      0      00000000 333    long long int:t(0,8)=r(0,8);-0;4294967295;
24     LSYM   0      0      00000000 376    long long unsigned int:t(0,9)=r(0,9);0;-1;
25     LSYM   0      0      00000000 419    short int:t(0,10)=r(0,10);-32768;32767;
26     LSYM   0      0      00000000 459    short unsigned int:t(0,11)=r(0,11);0;65535;
27     LSYM   0      0      00000000 503    signed char:t(0,12)=r(0,12);-128;127;
--More--

I had meant to look this up earlier and forgot, so let us man objdump here:

OBJDUMP(1)                                            GNU Development Tools                                            OBJDUMP(1)

NAME
       objdump - display information from object files.

SYNOPSIS
       objdump [-a|--archive-headers]
               [-b bfdname|--target=bfdname]
               [-C|--demangle[=style] ]
               [-d|--disassemble]
               [-D|--disassemble-all]
               [-z|--disassemble-zeroes]
               [-EB|-EL|--endian={big | little }]
               [-f|--file-headers]
               [-F|--file-offsets]
               [--file-start-context]
               [-g|--debugging]
               [-e|--debugging-tags]
               [-h|--section-headers|--headers]
               [-i|--info]
               [-j section|--section=section]
               [-l|--line-numbers]
               [-S|--source]
               [-m machine|--architecture=machine]
               [-M options|--disassembler-options=options]
               [-p|--private-headers]
               [-P options|--private=options]
               [-r|--reloc]
               [-R|--dynamic-reloc]
               [-s|--full-contents]
               [-W[lLiaprmfFsoRt]|
                --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames]
                        [=aranges,=macro,=frames,=frames-interp,=str,=loc]
                        [=Ranges,=pubtypes,=trace_info,=trace_abbrev]
                        [=trace_aranges,=gdb_index]
               [-G|--stabs]
               [-t|--syms]
               [-T|--dynamic-syms]
               [-x|--all-headers]
               [-w|--wide]
               [--start-address=address]
               [--stop-address=address]
               [--prefix-addresses]
               [--[no-]show-raw-insn]
               [--adjust-vma=offset]
               [--special-syms]
               [--prefix=prefix]
               [--prefix-strip=level]
               [--insn-width=width]
               [-V|--version]
               [-H|--help]
               objfile...

Here we can see the two options we used: -h displays the section headers and -G displays the stabs. One more command remains:

$ gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c
$ more ./init.s
	.file	"init.c"
	.stabs	"kern/init.c",100,0,2,.Ltext0
	.text
.Ltext0:
	.stabs	"gcc2_compiled.",60,0,0,0
	.stabs	"int:t(0,1)=r(0,1);-2147483648;2147483647;",128,0,0,0
	.stabs	"char:t(0,2)=r(0,2);0;127;",128,0,0,0
	.stabs	"long int:t(0,3)=r(0,3);-2147483648;2147483647;",128,0,0,0
	.stabs	"unsigned int:t(0,4)=r(0,4);0;4294967295;",128,0,0,0
	.stabs	"long unsigned int:t(0,5)=r(0,5);0;4294967295;",128,0,0,0
	.stabs	"__int128:t(0,6)=r(0,6);0;-1;",128,0,0,0
	.stabs	"__int128 unsigned:t(0,7)=r(0,7);0;-1;",128,0,0,0
	.stabs	"long long int:t(0,8)=r(0,8);-0;4294967295;",128,0,0,0
	.stabs	"long long unsigned int:t(0,9)=r(0,9);0;-1;",128,0,0,0
	.stabs	"short int:t(0,10)=r(0,10);-32768;32767;",128,0,0,0
	.stabs	"short unsigned int:t(0,11)=r(0,11);0;65535;",128,0,0,0
	.stabs	"signed char:t(0,12)=r(0,12);-128;127;",128,0,0,0
	.stabs	"unsigned char:t(0,13)=r(0,13);0;255;",128,0,0,0
	.stabs	"float:t(0,14)=r(0,1);4;0;",128,0,0,0
	.stabs	"double:t(0,15)=r(0,1);8;0;",128,0,0,0
	.stabs	"long double:t(0,16)=r(0,1);12;0;",128,0,0,0
	.stabs	"_Decimal32:t(0,17)=r(0,1);4;0;",128,0,0,0
	.stabs	"_Decimal64:t(0,18)=r(0,1);8;0;",128,0,0,0
	.stabs	"_Decimal128:t(0,19)=r(0,1);16;0;",128,0,0,0
	.stabs	"void:t(0,20)=(0,20)",128,0,0,0
	.stabs	"./inc/stdio.h",130,0,0,0
	.stabs	"./inc/stdarg.h",130,0,0,0
	.stabs	"va_list:t(2,1)=(2,2)=*(0,2)",128,0,0,0
	.stabn	162,0,0,0
	.stabn	162,0,0,0
	.stabs	"./inc/string.h",130,0,0,0
	.stabs	"./inc/types.h",130,0,0,0
	.stabs	"bool:t(4,1)=(4,2)=eFalse:0,True:1,;",128,0,0,0
	.stabs	" :T(4,3)=efalse:0,true:1,;",128,0,0,0
	.stabs	"int8_t:t(4,4)=(0,12)",128,0,0,0
	.stabs	"uint8_t:t(4,5)=(0,13)",128,0,0,0
--More--(19%)

Finally, the exercise asks us to check whether the boot loader loads the symbol table into memory:

(gdb) x/10s 0x00105abd
0x105abd:	""
0x105abe:	"{standard input}"
0x105acf:	"kern/entry.S"
0x105adc:	"kern/entrypgdir.c"
0x105aee:	"gcc2_compiled."
0x105afd:	"int:t(0,1)=r(0,1);-2147483648;2147483647;"
0x105b27:	"char:t(0,2)=r(0,2);0;127;"
0x105b41:	"long int:t(0,3)=r(0,3);-2147483648;2147483647;"
0x105b70:	"unsigned int:t(0,4)=r(0,4);0;4294967295;"
0x105b99:	"long unsigned int:t(0,5)=r(0,5);0;4294967295;"

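The strings are in memory, so the symbol table is loaded along with the rest of the kernel. Let us stop here and not dig any deeper.

Second step: we have to add a piece of code to this function that uses binary search:
kern/kdebug.c: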

	// Your code here.
	stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
	if (lline <= rline)
		info->eip_line = stabs[lline].n_desc;
	else
		return -1;

I originally planned to paste the whole debuginfo_eip() here, but after annotating it I found there was not much worth showing, so I removed it.
Finally, modify kern/monitor.c:

--- a/kern/monitor.c
+++ b/kern/monitor.c
@@ -24,6 +24,7 @@ struct Command {
 static struct Command commands[] = {
        { "help", "Display this list of commands", mon_help },
        { "kerninfo", "Display information about the kernel", mon_kerninfo },
+       { "backtrace", "looks up eip in the symbol table and returns the debugging information for that address", mon_backtrace },
 };
 
 /***** Implementations of basic kernel monitor commands *****/
@@ -66,8 +67,16 @@ mon_backtrace(int argc, char **argv, struct Trapframe *tf)
                for (int i = 0; i < 4; ++i) {
                        cprintf(" %08x", *(args + i));
                }
-               cprintf("\n");
+               cprintf("\n         ");
                ebp = (uint32_t *)(*ebp);
+               struct Eipdebuginfo info;
+               if (debuginfo_eip(eip, &info) != 0) {
+                       panic("Error read stabs!");
+                       return -1;
+               }
+               cprintf("%s:%d: ", info.eip_file, info.eip_line);
+               cprintf("%.*s", info.eip_fn_namelen, info.eip_fn_name);
+               cprintf("+%d\n", (uint32_t)eip - (uint32_t)info.eip_fn_addr);
        }
        return 0;
 }

Commit, and call it a day.

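Summary

lab1 covers a lot; a glance at the following labs suggests none of them is this long. I have been grinding hard for a bit more than a week and finished ahead of MIT's planned two weeks, which deserves some encouragement. (Though that NASM guide book alone took more than two weeks :p)

There are still plenty of shortcomings. During the lab I leaned heavily on Shu's and Meng's articles, and finished parts of it without reading much of the source material myself. Throughout the lab I was short on independent thinking and on exploring the extended material; I hope to improve on that in later labs.

P.S.: Only when I reached Exercise 10 did I realize I had been misreading the output of gdb's si (I thought the line si prints had just been executed, when it is actually the one about to be executed). Fortunately this did not affect completing the lab much. It is probably the biggest flaw of this lab, and I will be more careful from now on.

Finally, the output of ./grade-lab1, to commemorate finishing lab1: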

running JOS: (0.6s) 
  printf: OK 
  backtrace count: OK 
  backtrace arguments: OK 
  backtrace symbols: OK 
  backtrace lines: OK 
Score: 50/50

References

  1. https://pdos.csail.mit.edu/6.828/2018/labs/lab1/
  2. https://github.com/Spdwal/LearningLanuages/blob/master/OperatingSystem/6.828/lab1.md
  3. https://blog.csdn.net/qq_32473685/article/details/93626548#1%20%C2%A0%E4%B8%BB%E8%A6%81%E9%98%85%E8%AF%BB%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80%E8%B5%84%E6%96%99%E3%80%82
  4. https://www.cnblogs.com/fatsheep9146/p/5078179.html
  5. http://bochs.sourceforge.net/techspec/PORTS.LST
  6. http://kernelx.weebly.com/a20-address-line.html
  7. https://en.wikibooks.org/wiki/X86_Assembly/Global_Descriptor_Table
  8. https://zhuanlan.zhihu.com/p/36926462
  9. https://pdos.csail.mit.edu/6.828/2018/readings/elf.pdf
  10. http://en.wikipedia.org/wiki/Executable_and_Linkable_Format