Project 4 is a bare-metal S-mode SMP payload running on 4 RISC-V harts (2 cores × 2 harts each). It demonstrates the fundamental multi-hart mechanisms: atomic primary election, HSM-based secondary hart launch, per-hart timer interrupts via SBI, cross-hart IPIs, an atomic shared counter, and a contention-safe UART spinlock.
| Decision | Why |
|---|---|
| Atomic lottery for primary | The OpenSBI boot hart is non-deterministic. The lottery elects exactly one primary on every run, whichever hart OpenSBI happens to pick as its boot hart. |
| HSM to start secondaries | Secondaries that lost the lottery are in wfi with no stack. HSM gives them a clean S-mode entry with correct registers. |
| amoswap for UART lock | An LR/SC reservation can be cancelled when another hart touches the same cache line, forcing a retry. amoswap is a single indivisible instruction with no reservation to lose. |
| uart_lock in padded struct | Ensures uart_lock occupies its own 64-byte cache line so adjacent variables cannot interfere with the spinlock. |
| All shared vars in .data | GCC puts zero-initialised vars in .bss, and the runtime BSS clear had a GP-relative bug. Forcing them into .data means the ELF loader initialises them correctly before _start runs. |
| wfi in wait loops | hart_ready[] shared a cache line with uart_lock, so tight spinning on it cancelled LR/SC reservations. wfi parks the hart until an interrupt arrives, which removes that bus traffic. |
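
The two UART-lock rows above can be sketched in C as follows. This is a minimal illustration, not the project's actual code: the names `padded_lock_t`, `uart_lock_sketch`, `lock_try_acquire`, `lock_acquire` and `lock_release` are hypothetical, and it assumes the GCC `__atomic` builtins, which on a target with the A extension are typically lowered to an `amoswap.w` for the exchange.

```c
#include <stdint.h>

#define CACHE_LINE 64  /* cache-line size stated in the table above */

/* Lock word aligned to, and padded out to, one full cache line so that
 * no neighbouring variable can share the line with it. */
typedef struct {
    _Alignas(CACHE_LINE) volatile uint32_t lock;
    uint8_t pad[CACHE_LINE - sizeof(uint32_t)];
} padded_lock_t;

_Static_assert(sizeof(padded_lock_t) == CACHE_LINE,
               "lock must occupy exactly one cache line");

/* Hypothetical lock instance; explicitly initialised to 0 (unlocked). */
static padded_lock_t uart_lock_sketch = { .lock = 0 };

/* One atomic swap: write 1 and look at the old value.
 * Old value 0 means this caller took the lock. */
static inline int lock_try_acquire(padded_lock_t *l)
{
    return __atomic_exchange_n(&l->lock, 1u, __ATOMIC_ACQUIRE) == 0;
}

/* Spin until the swap observes an unlocked word. */
static inline void lock_acquire(padded_lock_t *l)
{
    while (!lock_try_acquire(l)) {
        /* busy-wait */
    }
}

/* Release with a plain atomic store of 0. */
static inline void lock_release(padded_lock_t *l)
{
    __atomic_store_n(&l->lock, 0u, __ATOMIC_RELEASE);
}
```

A caller would wrap each UART write in `lock_acquire(&uart_lock_sketch)` and `lock_release(&uart_lock_sketch)`. The static assertion makes the padding claim checkable at compile time.
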
When QEMU boots, you see Domain0 Region entries in the OpenSBI banner. Each one is a PMP entry programmed by OpenSBI that defines which physical addresses S-mode (your payload) can access and how. Here is every region explained.

| Notation | Meaning |
|---|---|
| M: (F,R,W) | What M-mode (OpenSBI) can do |
| S/U: (R,W) | What S-mode (your payload) can do |
| F, R, W | F = fetch (execute), R = read, W = write |
| () | No access |

S-mode requests timer interrupts through sbi_set_timer() (SBI_EXT_TIME ecall) and IPIs through sbi_send_ipi() (SBI_EXT_IPI ecall). OpenSBI handles the actual CLINT register writes in M-mode.
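
As a rough sketch of that path, the code below computes the next timer deadline and, when compiled for RV64, issues the TIME extension ecall. The names `timer_next_deadline` and `sbi_set_timer_sketch` are hypothetical, and the `long` field types of `sbi_ret_t` are an assumption. The register usage follows the SBI calling convention (EID in a7, FID in a6, arguments from a0, error and value returned in a0 and a1). The ecall wrapper is guarded so the file still compiles on a non-RISC-V host.

```c
#include <stdint.h>

#define SBI_EXT_TIME   0x54494D45L   /* "TIME" in ASCII */
#define TIMER_INTERVAL 10000000ULL   /* ticks between timer interrupts */

/* Every SBI call returns an (error, value) pair; field types assumed. */
typedef struct {
    long error;
    long value;
} sbi_ret_t;

/* Absolute deadline for the next interrupt, given the current time. */
static inline uint64_t timer_next_deadline(uint64_t now)
{
    return now + TIMER_INTERVAL;
}

#if defined(__riscv) && (__riscv_xlen == 64)
/* TIME extension, function ID 0: program the next timer event.
 * OpenSBI performs the actual timer register write in M-mode. */
static inline sbi_ret_t sbi_set_timer_sketch(uint64_t deadline)
{
    register long a0 __asm__("a0") = (long)deadline;
    register long a1 __asm__("a1");
    register long a6 __asm__("a6") = 0;             /* FID */
    register long a7 __asm__("a7") = SBI_EXT_TIME;  /* EID */

    __asm__ volatile("ecall"
                     : "+r"(a0), "=r"(a1)
                     : "r"(a6), "r"(a7)
                     : "memory");

    return (sbi_ret_t){ .error = a0, .value = a1 };
}
#endif
```

On the target, a timer handler would typically read the current time, call `timer_next_deadline()`, and pass the result to the ecall wrapper to re-arm the timer.
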
M-mode shows () because Smepmp is not active on this build. Without Smepmp, M-mode bypasses all unlocked PMP entries anyway. With Smepmp enabled (MML=1), M-mode would need explicit entries for its own regions, and Region06 would then effectively block M-mode from payload memory, creating strong isolation.
The assembly entry point. Runs before any C code. Sets up the execution environment for every hart: stack pointer, global pointer, trap vector, and the primary election lottery.
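
The election itself runs in assembly in entry.S. Purely as an illustration of the idea, here is a C rendering that assumes a fetch-and-add ticket counter. The names `lottery_ticket_sketch` and `hart_wins_lottery` are hypothetical, and the real code may use a different AMO instruction.

```c
#include <stdint.h>

/* Shared ticket counter, explicitly initialised to 0. */
static volatile uint32_t lottery_ticket_sketch = 0;

/* Each hart atomically takes a ticket. Exactly one hart observes the
 * old value 0, and that hart becomes the primary. On a target with
 * the A extension, GCC typically lowers this builtin to amoadd.w. */
static inline int hart_wins_lottery(volatile uint32_t *ticket)
{
    return __atomic_fetch_add(ticket, 1u, __ATOMIC_ACQ_REL) == 0;
}
```

Because the read and the increment happen in one atomic operation, two harts can never both see 0, no matter how closely they race.
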
All the C code: shared variables, UART driver, SBI ecall wrappers, trap handler, and the primary/secondary main functions.
GCC treats a global initialised with = 0 as equivalent to .bss (zero at runtime). But our runtime BSS clear had a GP-relative addressing bug where the loop used the wrong end address. Forcing these variables into .data means the ELF loader sets all values to zero before _start runs, so no runtime clear is needed and the bug cannot occur.
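
One way to force such a placement is GCC's `section` attribute. This is a sketch under assumptions: the source does not say which mechanism the project uses, and the macro `FORCE_DATA` and the variable names below are hypothetical. GCC's `-fno-zero-initialized-in-bss` option is another way to get a similar effect for the whole file.

```c
#include <stdint.h>

#define NUM_HARTS 4

/* Place a variable in .data even though its initialiser is zero. */
#define FORCE_DATA __attribute__((section(".data")))

/* Hypothetical shared variables; their zeros are stored in the image
 * itself instead of relying on a runtime BSS clear. */
FORCE_DATA volatile uint32_t shared_counter_sketch = 0;
FORCE_DATA volatile uint32_t hart_ready_sketch[NUM_HARTS] = { 0, 0, 0, 0 };
```

The trade-off is a slightly larger image, since the zeros now occupy space in the file.
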
The shared header included by both entry.S (for bit mask defines) and payload.c. Defines all constants, types, and external symbols.
| Constant | Value | Meaning |
|---|---|---|
| NUM_HARTS | 4 | Total harts in the system |
| UART_BASE | 0x10000000 | NS16550 UART MMIO base address on QEMU virt |
| UART_THR | offset +0 | Transmit Holding Register — write byte here to send |
| UART_LSR | offset +5 | Line Status Register — read bit 5 to check TX ready |
| UART_LSR_THRE | 0x20 | Bit 5 of LSR — Transmit Holding Register Empty |
| SIE_STIE | 1<<5 | Supervisor Timer Interrupt Enable bit in sie CSR |
| SIE_SSIE | 1<<1 | Supervisor Software Interrupt Enable (IPI) in sie CSR |
| SSTATUS_SIE | 1<<1 | Global S-mode interrupt enable in sstatus CSR |
| TIMER_INTERVAL | 10,000,000 | Timer ticks between interrupts. At a 10 MHz timebase this is 1 second |
| TIMER_FIRES_PER_HART | 3 | Each hart fires its timer 3 times. Total = 12 across 4 harts |
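
The UART constants in the table combine into a polled transmit routine along these lines. This is a minimal sketch: the function name `uart_putc_at` is hypothetical, and it takes the register base as a parameter so the same logic can be run against a fake register array on a host.

```c
#include <stdint.h>

#define UART_BASE     0x10000000UL  /* NS16550 MMIO base on QEMU virt */
#define UART_THR      0             /* Transmit Holding Register offset */
#define UART_LSR      5             /* Line Status Register offset */
#define UART_LSR_THRE 0x20          /* LSR bit 5: THR empty */

/* Wait until the transmitter can accept a byte, then write it. */
static void uart_putc_at(volatile uint8_t *base, char c)
{
    while ((base[UART_LSR] & UART_LSR_THRE) == 0) {
        /* poll until the holding register is empty */
    }
    base[UART_THR] = (uint8_t)c;
}
```

On the target the call would be `uart_putc_at((volatile uint8_t *)UART_BASE, c)`, made while holding the UART lock so output from different harts does not interleave.
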
| Item | Purpose |
|---|---|
| sbi_ret_t | Two-field struct {error, value} — every SBI ecall returns this pair. error=0 means success. |
| trap_frame_t | Struct with one field per register saved by _trap_entry. Layout matches exactly — C code accesses frame->sepc to modify the return address. |
| SBI_EXT_BASE 0x10 | Base extension — version queries, feature probing |
| SBI_EXT_TIME 0x54494D45 | "TIME" in ASCII — timer extension |
| SBI_EXT_IPI 0x735049 | "sPI" — inter-processor interrupt extension |
| SBI_EXT_HSM 0x48534D | "HSM" — Hart State Management extension |
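
The extension IDs in the table are ASCII strings packed into an integer, most significant byte first. A small helper makes that encoding explicit. The function name `sbi_eid_from_ascii` is hypothetical and exists only to illustrate the mapping.

```c
#include <stdint.h>

#define SBI_EXT_BASE 0x10
#define SBI_EXT_TIME 0x54494D45
#define SBI_EXT_IPI  0x735049
#define SBI_EXT_HSM  0x48534D

/* Pack up to four ASCII characters into an extension ID,
 * first character in the most significant position. */
static uint32_t sbi_eid_from_ascii(const char *name)
{
    uint32_t eid = 0;

    while (*name != '\0') {
        eid = (eid << 8) | (uint8_t)*name;
        name++;
    }
    return eid;
}
```

For example, 'T' is 0x54, 'I' is 0x49, 'M' is 0x4D and 'E' is 0x45, which concatenate to 0x54494D45.
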
The header also declares extern void _trap_entry(void) and extern void _secondary_entry(void) so payload.c can take their addresses and pass them to HSM and stvec writes.
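
A hedged sketch of how the primary might hand an entry address to HSM for each secondary is shown below. The start call is passed in as a function pointer so the loop can be exercised on a host. `start_secondaries`, `hart_start_fn` and `fake_hart_start` are hypothetical names, and passing the hart ID as the opaque argument is an assumption. On the target, the callback would wrap the SBI HSM hart_start ecall, which takes (hartid, start_addr, opaque).

```c
#include <stdint.h>

#define NUM_HARTS 4

/* Mirrors the argument list of SBI HSM hart_start; returns the SBI
 * error code, where 0 means success. */
typedef long (*hart_start_fn)(unsigned long hartid,
                              unsigned long start_addr,
                              unsigned long opaque);

/* Ask the SBI layer to start every hart except the primary.
 * Returns how many harts reported success. */
static int start_secondaries(unsigned long primary_hartid,
                             unsigned long entry_addr,
                             hart_start_fn start)
{
    int started = 0;

    for (unsigned long h = 0; h < NUM_HARTS; h++) {
        if (h == primary_hartid)
            continue;               /* the primary is already running */
        if (start(h, entry_addr, h) == 0)
            started++;
    }
    return started;
}

/* Host-side stand-in for the real ecall: records which harts were
 * requested and always reports success. */
static unsigned int fake_started_mask;

static long fake_hart_start(unsigned long hartid,
                            unsigned long start_addr,
                            unsigned long opaque)
{
    (void)start_addr;
    (void)opaque;
    fake_started_mask |= 1u << hartid;
    return 0;
}
```

In payload.c the entry address would be something like `(unsigned long)&_secondary_entry`, which is why the header needs the extern declaration.
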
Tells the linker where to place each section in physical memory and defines the symbols (_stack_top, _bss_start etc.) that entry.S uses.
sp = _stack_top - hartid × 4096
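
The same formula as a C helper, for illustration only: the real computation happens in entry.S before any C code runs. The name `hart_stack_top` and the example `_stack_top` address used below are hypothetical.

```c
#include <stdint.h>

#define HART_STACK_SIZE 4096UL  /* bytes of stack per hart */

/* Stacks grow downward, so hart 0 starts at the very top and each
 * further hart starts one 4096-byte slot lower. */
static inline uintptr_t hart_stack_top(uintptr_t stack_top,
                                       unsigned long hartid)
{
    return stack_top - (uintptr_t)(hartid * HART_STACK_SIZE);
}
```

With a made-up `_stack_top` of 0x80210000, hart 0 gets 0x80210000, hart 1 gets 0x8020F000, and hart 3 gets 0x8020D000, so four harts need 16 KiB of stack space in total.
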
Builds the payload, links it, and launches QEMU with the correct 4-hart topology.
| Flag | Why it is needed |
|---|---|
| -march=rv64imac | Compatible with GCC 10.2. Selects I (base integer), M (multiply/divide), A (atomics) and C (compressed instructions). The A extension is required for amoswap and amoadd. |
| -mabi=lp64 | long and pointers are 64-bit (int stays 32-bit), with a soft-float calling convention that passes no arguments in FP registers. Matches the hardware (no FPU used). |
| -mcmodel=medany | PC-relative addressing for all symbol references. Required for code loaded at 0x80200000: the compiler emits auipc+addi pairs instead of absolute addresses. |
| -ffreestanding | No standard library. No startup code. No hidden calls to malloc or printf. |
| -fno-stack-protector | No __stack_chk_guard — that symbol does not exist in bare metal. |
| -fno-builtin | Stop GCC from treating functions such as memset/memcpy as compiler built-ins, so calls to them are not silently replaced with inline code or a different library call. |
| -nostdlib -nostartfiles | Do not link libc or crt0. Our entry.S is the only startup code. |
| -fno-common | Place uninitialised globals in .bss instead of a common block, so an accidental duplicate definition in two files becomes a link error. |
| -O1 | Basic optimisation. Higher levels can reorder memory accesses in ways that break bare-metal code without careful use of volatile. |
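
One consequence of -ffreestanding and -nostdlib is worth a sketch: GCC may still emit calls to memcpy, memmove, memset and memcmp (for example when copying or zeroing a struct), so a bare-metal image has to supply any of these that end up referenced. The routine below is a minimal byte-wise version under a hypothetical name, `fw_memset`. In a real image it would be named `memset`, and whether this project defines one is not stated in the source.

```c
#include <stddef.h>

/* Fill n bytes at dst with the low 8 bits of value.
 * The volatile pointer keeps the optimiser from recognising the loop
 * and turning it back into a call to memset. */
void *fw_memset(void *dst, int value, size_t n)
{
    volatile unsigned char *p = dst;

    while (n--) {
        *p++ = (unsigned char)value;
    }
    return dst;
}
```

A byte loop is slow but has no alignment requirements, which is usually acceptable for the small buffers a payload like this touches.
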
Add -S -s to the QEMU command line to pause at startup and open a GDB server on port 1234.