DaedalusOS Documentation

DaedalusOS is a bare-metal Rust kernel for the Raspberry Pi 4 Model B, developed as a learning project to explore OS internals and low-level ARM hardware programming.

Project Scope

  • Target Hardware: Raspberry Pi 4 Model B only (BCM2711, Cortex-A72)
  • Language: Rust 2024 edition, nightly toolchain
  • Architecture: AArch64 (ARMv8-A)
  • Environment: #![no_std], bare-metal (no operating system)

Current Status

Phase 4 In Progress: Networking Stack

Milestone #12 Complete: Ethernet Driver Foundation (GENET v5 + PHY)

  • Working REPL with command parsing and shell history
  • Exception vector table with register dumps
  • 8 MB heap with bump allocator
  • Full alloc crate support (Box, Vec, String, collections)
  • System timer driver with microsecond precision delays
  • GIC-400 interrupt controller with interrupt-driven UART
  • MMU with 39-bit virtual address space (identity mapped)
  • Caching enabled for performance
  • GPIO driver with BCM2711 pull-up/down support
  • NetworkDevice trait abstraction for hardware portability
  • Ethernet frame and ARP protocol implementation (30 unit tests)
  • GENET v5 MAC driver with MDIO/PHY management
  • Hardware diagnostics command (eth-diag)

Next Milestone: Frame transmission and reception (TX/RX paths)

Documentation Structure

This documentation is organized as a reference wiki, not a linear tutorial. Jump to any topic:

For Implementation Work

For Understanding Design

For Reference

Quick Commands

# Build kernel
cargo build

# Run in QEMU
cargo run

# Run tests
cargo test

# Run tests with deterministic timing (slower, but more reproducible in CI)
QEMU_DETERMINISTIC=1 cargo test

# Generate kernel8.img for hardware
cargo build --release
cargo objcopy --release -- -O binary kernel8.img

Timing Tests in CI

If timing tests become flaky in GitHub Actions or other CI environments, you can enable deterministic timing mode using QEMU_DETERMINISTIC=1. This uses QEMU’s -icount flag to decouple the guest clock from the host, making timing perfectly reproducible at the cost of 10-100x slower execution (disables KVM hardware acceleration).

Current timing tests use 25% tolerance to handle normal CI variability without this flag. See src/drivers/clocksource/bcm2711.rs:231 for details.

Design Tenets

  1. Pi-Only, Tutorial-Inspired - Port patterns from Phil Opp’s Blog OS when useful
  2. Document One-Way Doors - Major architecture decisions require ADRs
  3. Hardware Facts Over Assumptions - Every magic number must reference datasheets
  4. Keep Build/Test Simple - One target spec, one QEMU command
  5. Tight Feedback Loop - Every milestone must build and run in QEMU

See Design Decisions for detailed rationale.

Memory Map

Physical and virtual address layout for Raspberry Pi 4 (BCM2711).

Virtual Memory

Status: MMU enabled with identity mapping (VA = PA)

After kernel initialization, the MMU is active with:

  • 39-bit virtual address space (512 GB)
  • 4 KB page granule, 2 MB block mappings
  • Identity mapping: virtual address equals physical address

See MMU & Paging for details.
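
To make the 39-bit layout concrete, the sketch below splits a virtual address into its table indices. It assumes the standard AArch64 arrangement for a 4 KB granule, where each level indexes 9 bits (512 entries), a level-1 entry covers 1 GB, and a level-2 block entry covers 2 MB. The function name `table_indices` is hypothetical and not taken from the kernel source.

```rust
/// Split a virtual address into (level-1 index, level-2 index, offset in 2 MB block).
/// Hypothetical helper for illustration; assumes a 4 KB granule and 39-bit VA.
pub fn table_indices(va: u64) -> (usize, usize, u64) {
    let l1 = ((va >> 30) & 0x1FF) as usize; // bits 38:30 select a 1 GB region
    let l2 = ((va >> 21) & 0x1FF) as usize; // bits 29:21 select a 2 MB block
    let offset = va & ((1 << 21) - 1);      // bits 20:0 are the offset within the block
    (l1, l2, offset)
}
```

Under these assumptions the UART0 base 0xFE201000 resolves to level-1 entry 3 (the 3-4 GB range) and level-2 entry 497, with offset 0x1000 inside the block.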

Address Ranges

| Start | End | Size | Purpose |
|---|---|---|---|
| 0x00000000 | 0x3FFFFFFF | 1 GB | DRAM (varies by Pi 4 model: 1/2/4/8 GB) |
| 0x00080000 | - | - | Kernel load address (firmware entry point) |
| 0xC0000000 | 0xFFFFFFFF | ~1 GB | Reserved (future expansion) |
| 0xFE000000 | 0xFF800000 | ~24 MB | MMIO peripherals window |

MMIO Base Address

CRITICAL: Pi 4 uses 0xFE000000 for ARM CPU peripheral access.

Why Multiple Addresses Exist

The BCM2711 chip supports different address mappings:

  • ARM physical access: 0xFE000000 (use this for bare-metal)
  • Bus addressing: 0x7E000000 (appears in datasheets)
  • Pi 3 legacy: 0x3F000000 (BCM2837, not applicable to Pi 4)

When reading BCM2711 documentation showing bus addresses (0x7E00xxxx), translate to ARM physical by replacing 0x7E with 0xFE.
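
The translation rule above can be sketched as a small helper. The name `bus_to_arm_phys` is hypothetical; the function only covers the 0x7E peripheral window described here and returns `None` for anything else.

```rust
/// Translate a BCM2711 bus address (0x7Exx_xxxx, as printed in the datasheet)
/// to the ARM physical address (0xFExx_xxxx) used for bare-metal access.
/// Hypothetical helper, not part of the kernel source.
pub fn bus_to_arm_phys(bus: u32) -> Option<u32> {
    if bus >> 24 == 0x7E {
        // Keep the low 24 bits and replace the top byte 0x7E with 0xFE.
        Some(0xFE00_0000 | (bus & 0x00FF_FFFF))
    } else {
        // Not a peripheral bus address covered by this rule.
        None
    }
}
```

For example, the datasheet address 0x7E201000 for UART0 maps to 0xFE201000, which matches the table below.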

Key Peripheral Addresses

All MMIO regions are mapped as device memory (non-cacheable, strictly ordered).

| Peripheral | Base Address | Datasheet Reference |
|---|---|---|
| UART0 (PL011) | 0xFE201000 | BCM2711 §2.1 |
| GPIO | 0xFE200000 | BCM2711 §5.2 |
| System Timer | 0xFE003000 | BCM2711 §10.2 |
| GIC-400 Distributor (GICD) | 0xFF841000 | BCM2711 §6.1 |
| GIC-400 CPU Interface (GICC) | 0xFF842000 | BCM2711 §6.1 |
| GENET v5 Ethernet | 0xFD580000 | Linux bcmgenet driver |
| Mailbox | 0xFE00B880 | BCM2711 §1.3 |

Kernel Memory Layout

Detailed layout defined in linker.ld:

| Section | Address | Size | Description |
|---|---|---|---|
| .text.boot | 0x00080000 | ~256 B | Assembly entry point |
| .text.exceptions | 0x00080800 | 2 KB | Exception vector table (2 KB aligned) |
| .text | 0x00081000+ | ~50-100 KB | Rust kernel code |
| .rodata | After .text | ~10-20 KB | Read-only data, string literals |
| .data | After .rodata | ~1-5 KB | Initialized global variables |
| .bss | After .data | ~20-30 KB | Zero-initialized data + page tables |
| Heap | __heap_start | 8 MB | Dynamic allocations (String, Vec, etc.) |
| Stack | _stack_start | 2 MB | Call stack (grows downward) |

Kernel Image

  • Loaded at 0x00080000 by firmware
  • Total size: ~100 KB (debug), ~50 KB (release)
  • Entry point: _start in src/arch/aarch64/boot.s

Page Tables (in .bss)

  • L1_TABLE: 4 KB (512 entries, 1 GB per entry)
  • L2_TABLE_LOW: 4 KB (maps 0-1 GB normal memory)
  • L2_TABLE_MMIO: 4 KB (maps 3-4 GB device memory)
  • Total: 12 KB for translation tables

Heap

  • Defined by __heap_start and __heap_end symbols in linker script
  • Size: 8 MB reserved
  • Active: Bump allocator implemented (Phase 2 complete)
  • Location: After BSS section, aligned to 16 bytes
  • Used for: String, Vec, shell command history, dynamic allocations

Stack

  • Defined by _stack_start symbol in linker script
  • Size: 2 MB (currently only core 0 uses it)
  • Location: After heap, grows downward toward heap
  • Alignment: 16 bytes (ARM AAPCS requirement)
  • Future: Per-core stacks when multi-core support is added

Memory Attributes

After MMU initialization:

| Region | Type | Cacheable | Shareable | Permissions |
|---|---|---|---|---|
| 0x00000000-0x3FFFFFFF | Normal | Yes (WB) | Inner | EL1 RW |
| 0xFE000000-0xFF800000 | Device | No | No | EL1 RW |

Normal Memory (kernel code/data):

  • Write-Back, Read/Write-Allocate caching
  • Inner Shareable for multi-core coherency
  • ~100x faster than uncached access

Device Memory (MMIO):

  • Device-nGnRnE (non-Gathering, non-Reordering, no Early-ack)
  • Strictly ordered, every access reaches hardware
  • Required for correct peripheral operation

See MMU & Paging for MAIR_EL1 configuration details.

Future Memory Regions

Planned for future milestones:

| Region | Purpose | Milestone |
|---|---|---|
| Higher-half kernel | Kernel at 0xFFFF_8000_0000_0000+ | Phase 5-6 |
| Per-core stacks | 2 MB × 4 cores | Phase 3 (Multi-core) |
| Userspace | Lower 256 TB for user programs | Phase 4 (EL0 Userspace) |

Code References

  • Linker script: linker.ld
  • Page tables: src/arch/aarch64/mmu.rs (L1_TABLE, L2_TABLE_*)
  • UART base: src/drivers/uart.rs (UART_BASE)
  • GPIO base: src/drivers/gpio.rs (GPIO_BASE)
  • GIC base: src/drivers/gic.rs (GICD_BASE, GICC_BASE)
  • GENET base: src/drivers/genet.rs (GENET_BASE)

UART PL011

ARM PrimeCell UART (PL011) driver reference for Raspberry Pi 4.

Hardware Configuration

| Parameter | Value | Notes |
|---|---|---|
| Base Address | 0xFE201000 | See Memory Map |
| Clock Frequency | 48 MHz | Pi 4 hardware (QEMU uses 54 MHz) |
| Target Baud Rate | 115200 | Standard serial console speed |
| Data Format | 8N1 | 8 data bits, no parity, 1 stop bit |
| GPIO Pins | 14 (TXD0), 15 (RXD0) | Must be configured to Alt0 function |

Register Map

| Register | Offset | Name | Purpose |
|---|---|---|---|
| DR | 0x00 | Data Register | Read/write data bytes |
| FR | 0x18 | Flag Register | Status flags (TXFF, RXFE, BUSY) |
| IBRD | 0x24 | Integer Baud Rate Divisor | Integer part of baud divisor |
| FBRD | 0x28 | Fractional Baud Rate Divisor | Fractional part (6 bits) |
| LCRH | 0x2C | Line Control | Data format, FIFO enable |
| CR | 0x30 | Control Register | Enable UART, TX, RX |
| IMSC | 0x38 | Interrupt Mask | Mask interrupt sources |
| ICR | 0x44 | Interrupt Clear | Clear pending interrupts |

GPIO Configuration (Required on Hardware)

Pi 4 firmware doesn’t always configure GPIO pins for UART. The driver must explicitly set GPIO 14 and 15 to Alt0 function:

// Configure GPIO 14 (TXD0) and 15 (RXD0) for UART0
use core::ptr::{read_volatile, write_volatile};

const GPFSEL1: usize = 0xFE200004;  // GPIO Function Select 1

// SAFETY: GPFSEL1 is a valid, device-mapped MMIO register on BCM2711.
unsafe {
    let mut fsel = read_volatile(GPFSEL1 as *const u32);
    fsel &= !((0b111 << 12) | (0b111 << 15));  // Clear function bits
    fsel |= (0b100 << 12) | (0b100 << 15);     // Set to Alt0
    write_volatile(GPFSEL1 as *mut u32, fsel);
}

GPIO Function Select Encoding:

  • Bits 12-14: GPIO 14 function (000=Input, 001=Output, 100=Alt0/UART0_TXD)
  • Bits 15-17: GPIO 15 function (000=Input, 001=Output, 100=Alt0/UART0_RXD)

Initialization Sequence

// Pseudocode: each `UART_x = value` stands for a volatile 32-bit write to that register.

// 1. Configure GPIO pins (see above)

// 2. Disable UART during configuration
UART_CR = 0x0000;
small_delay();  // Hardware needs time to disable

// 3. Mask all interrupts
UART_IMSC = 0x0000;

// 4. Clear pending interrupts
UART_ICR = 0x07FF;
small_delay();  // Let FIFOs flush

// 5. Calculate and set baud rate divisors
// Formula: Clock / (16 × BaudRate) = 48000000 / (16 × 115200) = 26.0416...
UART_IBRD = 26;      // Integer part
UART_FBRD = 3;       // Fractional: int(0.0416 × 64 + 0.5)

// 6. Configure line control (8N1, enable FIFO)
UART_LCRH = (1 << 4) | (1 << 5) | (1 << 6);  // 0x70
// Bit 4: Enable FIFOs
// Bits 5-6: Word length = 8 bits

small_delay();  // Stabilize before enabling

// 7. Enable UART, transmitter, receiver
UART_CR = (1 << 0) | (1 << 8) | (1 << 9);  // 0x301
// Bit 0: UART enable
// Bit 8: Transmit enable
// Bit 9: Receive enable

small_delay();  // Let UART stabilize after enabling

Hardware Stabilization Delays: Real hardware requires small delays between configuration steps. A simple delay of ~150 NOPs is sufficient:

fn small_delay() {
    for _ in 0..150 {
        unsafe { core::arch::asm!("nop", options(nomem, nostack)) };
    }
}

These delays are harmless on QEMU but critical for hardware stability.

Transmit (Polling Mode)

pub fn write_byte(&mut self, byte: u8) {
    // Wait until transmit FIFO has space
    while (self.registers.fr.read() & (1 << 5)) != 0 {
        // Bit 5 = TXFF (Transmit FIFO Full)
        core::hint::spin_loop();
    }

    // Write byte to data register
    self.registers.dr.write(byte as u32);
}

Receive (Polling Mode)

pub fn read_byte(&mut self) -> u8 {
    // Wait until receive FIFO has data
    while (self.registers.fr.read() & (1 << 4)) != 0 {
        // Bit 4 = RXFE (Receive FIFO Empty)
        core::hint::spin_loop();
    }

    // Read byte from data register
    (self.registers.dr.read() & 0xFF) as u8
}

Baud Rate Calculation

Formula: Divisor = Clock / (16 × BaudRate)

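For a 54 MHz UART clock (the QEMU value in the Hardware Configuration table) at 115200 baud:

  • Divisor = 54,000,000 / (16 × 115,200) = 29.296875
  • IBRD (integer) = 29
  • FBRD (fractional) = int(0.296875 × 64 + 0.5) = 19

Note: A 48 MHz clock (Pi 4 hardware per the Hardware Configuration table, and also the Pi 3) requires different divisors (IBRD=26, FBRD=3), as used in the initialization sequence above.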
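
The divisor arithmetic can be checked with a short integer-only sketch. The helper name `baud_divisors` is hypothetical and is not the driver's API. It rounds the divisor scaled by 64 to the nearest integer, which matches int(frac × 64 + 0.5) except that a carry out of the fraction increments IBRD instead of yielding FBRD = 64.

```rust
/// Compute (IBRD, FBRD) for a PL011 from the UART clock and target baud rate.
/// Hypothetical helper for illustration only.
pub fn baud_divisors(clock_hz: u32, baud: u32) -> (u32, u32) {
    // divisor * 64 = clock * 64 / (16 * baud) = clock * 4 / baud.
    // Compute twice that value, then add 1 and halve to round to nearest.
    let scaled = (u64::from(clock_hz) * 8 / u64::from(baud) + 1) / 2;
    // The upper part is the integer divisor, the low 6 bits are the fraction.
    ((scaled / 64) as u32, (scaled % 64) as u32)
}
```

With a 48 MHz clock this yields (26, 3), and with a 54 MHz clock it yields (29, 19), matching both sets of values quoted in this section.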

Synchronization

The UART driver is wrapped in spin::Mutex<UartDriver> to allow safe concurrent access from:

  • Print macros (print!, println!)
  • Shell input/output
  • Future interrupt handlers

See src/lib.rs for WRITER global definition.

Known Issues & Quirks

  1. Clock frequency depends on the environment: 48 MHz on Pi 4 hardware, 54 MHz in QEMU (see Hardware Configuration); the baud divisors must match the clock in use
  2. MMIO base differs from Pi 3: 0xFE201000 vs 0x3F201000
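  3. Polling in the examples above: the transmit and receive paths shown here poll the flag register; interrupt-driven receive requires the GIC setup described under GIC-400 Interrupt Controller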

Code References

  • Implementation: src/drivers/tty/serial/amba_pl011.rs
  • Print macros: src/lib.rs (print!, println!, _print)
  • Shell I/O: src/shell.rs

GPIO

GPIO (General Purpose Input/Output) driver for BCM2711.

Overview

The BCM2711 provides 58 GPIO pins (GPIO 0-57) for general-purpose digital I/O. Each pin can be configured as input, output, or one of six alternate functions (for hardware peripherals like UART, SPI, I2C, etc.).

Key Features:

  • 58 GPIO pins (BCM2711 specific - BCM2835 had 54)
  • Configurable as input, output, or 6 alternate functions
  • Built-in pull-up/pull-down resistors (BCM2711 uses new register mechanism)
  • 3.3V logic levels (NOT 5V tolerant!)
  • Fast digital I/O for bit-banging protocols

Hardware Reference

Register Map

All offsets are from GPIO_BASE = 0xFE200000.

Function Select Registers (GPFSEL0-5)

Control pin modes (input, output, alternate functions).

| Register | Offset | Controls Pins | Description |
|---|---|---|---|
| GPFSEL0 | 0x00 | GPIO 0-9 | Function select for pins 0-9 |
| GPFSEL1 | 0x04 | GPIO 10-19 | Function select for pins 10-19 |
| GPFSEL2 | 0x08 | GPIO 20-29 | Function select for pins 20-29 |
| GPFSEL3 | 0x0C | GPIO 30-39 | Function select for pins 30-39 |
| GPFSEL4 | 0x10 | GPIO 40-49 | Function select for pins 40-49 |
| GPFSEL5 | 0x14 | GPIO 50-57 | Function select for pins 50-57 |

Format: Each pin uses 3 bits (10 pins per 32-bit register).

Function Codes:

  • 000 (0) - Input
  • 001 (1) - Output
  • 100 (4) - Alternate Function 0
  • 101 (5) - Alternate Function 1
  • 110 (6) - Alternate Function 2
  • 111 (7) - Alternate Function 3
  • 011 (3) - Alternate Function 4
  • 010 (2) - Alternate Function 5

Output Set Registers (GPSET0-1)

Set pins HIGH (write-only, reads return 0).

| Register | Offset | Controls Pins | Description |
|---|---|---|---|
| GPSET0 | 0x1C | GPIO 0-31 | Write 1 to bit N to set GPIO N high |
| GPSET1 | 0x20 | GPIO 32-57 | Write 1 to bit (N-32) to set GPIO N high |

Usage: Writing 1 to a bit sets the corresponding pin HIGH. Writing 0 has no effect.

Output Clear Registers (GPCLR0-1)

Set pins LOW (write-only, reads return 0).

| Register | Offset | Controls Pins | Description |
|---|---|---|---|
| GPCLR0 | 0x28 | GPIO 0-31 | Write 1 to bit N to set GPIO N low |
| GPCLR1 | 0x2C | GPIO 32-57 | Write 1 to bit (N-32) to set GPIO N low |

Usage: Writing 1 to a bit sets the corresponding pin LOW. Writing 0 has no effect.

Pin Level Registers (GPLEV0-1)

Read current pin state (read-only).

| Register | Offset | Controls Pins | Description |
|---|---|---|---|
| GPLEV0 | 0x34 | GPIO 0-31 | Read bit N to get GPIO N level |
| GPLEV1 | 0x38 | GPIO 32-57 | Read bit (N-32) to get GPIO N level |

Usage: Returns actual pin voltage level (0 = low, 1 = high) regardless of pin mode.

Pull-up/down Control Registers (BCM2711 Only!)

IMPORTANT: BCM2711 uses a completely different mechanism than BCM2835/BCM2836/BCM2837!

The old GPPUD and GPPUDCLK registers are not connected on BCM2711. Use these instead:

| Register | Offset | Controls Pins | Description |
|---|---|---|---|
| GPIO_PUP_PDN_CNTRL_REG0 | 0xE4 | GPIO 0-15 | Pull control for pins 0-15 |
| GPIO_PUP_PDN_CNTRL_REG1 | 0xE8 | GPIO 16-31 | Pull control for pins 16-31 |
| GPIO_PUP_PDN_CNTRL_REG2 | 0xEC | GPIO 32-47 | Pull control for pins 32-47 |
| GPIO_PUP_PDN_CNTRL_REG3 | 0xF0 | GPIO 48-57 | Pull control for pins 48-57 |

Format: Each pin uses 2 bits (16 pins per register).

Pull Codes:

  • 00 (0) - No pull resistor
  • 01 (1) - Pull-up resistor enabled
  • 10 (2) - Pull-down resistor enabled
  • 11 (3) - Reserved

Example: To enable pull-up on GPIO 5 (in REG0, bits 10-11):

let reg = gpio.read(GPIO_PUP_PDN_CNTRL_REG0);
gpio.write(GPIO_PUP_PDN_CNTRL_REG0, (reg & !(0x3 << 10)) | (0x1 << 10));

Reset State

At power-on/reset:

  • All pins configured as INPUT (GPFSEL = 0x0)
  • All pins have PULL-DOWN enabled (GPIO_PUP_PDN_CNTRL = 0x2 for each pin)
  • Except: Pins used by firmware (UART, I2C) may be configured differently

Common GPIO Pins

On Raspberry Pi 4, common GPIO usage:

| GPIO | Alt Func | Common Use |
|---|---|---|
| 14 | ALT0 | UART0 TXD (console) |
| 15 | ALT0 | UART0 RXD (console) |
| 2 | ALT0 | I2C1 SDA (HAT EEPROM) |
| 3 | ALT0 | I2C1 SCL (HAT EEPROM) |
| 42 | - | Activity LED (active low) |
| 18 | ALT5 | PWM0 (audio, servo control) |

Warning: Avoid GPIOs 0-8 (used for SD card boot), and GPIOs 14-15 if using serial console.

Electrical Characteristics

  • Logic Levels: 3.3V (HIGH = 3.3V, LOW = 0V)
  • Absolute Maximum: 3.6V (do NOT connect 5V signals directly!)
  • Pull Resistors: ~50-60kΩ (exact value varies)
  • Drive Strength: Configurable, default ~8mA
  • Maximum Current: 16mA per pin, 50mA total for all GPIO

Usage Pattern

Typical sequence for GPIO operations:

1. Configure as Output:

// Set GPIO 42 (Activity LED) as output
gpio.set_function(42, Function::Output);
gpio.set_pull(42, Pull::None);  // LEDs don't need pull resistors

2. Set Output HIGH/LOW:

gpio.set(42);    // Drive pin high
gpio.clear(42);  // Drive pin low
// Which level lights the LED depends on its polarity (listed above as active low)

3. Configure as Input:

// Set GPIO 17 (button) as input with pull-up
gpio.set_function(17, Function::Input);
gpio.set_pull(17, Pull::Up);  // Button to ground, pull-up holds high

4. Read Input:

let pressed = !gpio.read(17);  // Active low (button pulls to ground)

Implementation Notes

Function Select Calculation:

  • Register index: pin / 10
  • Bit offset: (pin % 10) * 3
  • Mask: 0b111 << bit_offset

Output Set/Clear Calculation:

  • Register index: pin / 32
  • Bit offset: pin % 32
  • Mask: 1 << bit_offset

Pull Control Calculation (BCM2711):

  • Register index: pin / 16
  • Bit offset: (pin % 16) * 2
  • Mask: 0b11 << bit_offset
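
The three calculations above can be written as pure functions, which keeps the bit arithmetic testable on a host without touching MMIO. All names here (`fsel_location`, `level_location`, `pull_location`, `with_field`) are hypothetical and are not the driver's actual API.

```rust
/// (register index, bit offset) of a pin's 3-bit field in GPFSELn.
pub fn fsel_location(pin: u32) -> (usize, u32) {
    ((pin / 10) as usize, (pin % 10) * 3)
}

/// (register index, bit offset) of a pin's bit in GPSETn / GPCLRn / GPLEVn.
pub fn level_location(pin: u32) -> (usize, u32) {
    ((pin / 32) as usize, pin % 32)
}

/// (register index, bit offset) of a pin's 2-bit field in GPIO_PUP_PDN_CNTRL_REGn.
pub fn pull_location(pin: u32) -> (usize, u32) {
    ((pin / 16) as usize, (pin % 16) * 2)
}

/// Return `reg` with the `width`-bit field at `shift` replaced by `value`.
/// This is the pure half of a read-modify-write; the caller does the MMIO.
/// `width` must be less than 32.
pub fn with_field(reg: u32, shift: u32, width: u32, value: u32) -> u32 {
    let mask = ((1u32 << width) - 1) << shift;
    (reg & !mask) | ((value << shift) & mask)
}
```

For GPIO 14 and 15 the function-select fields land in GPFSEL1 at bits 12 and 15, and for GPIO 5 the pull field lands in REG0 at bit 10, which agrees with the examples earlier in this chapter.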

System Timer

The BCM2711 System Timer provides a stable 64-bit free-running counter for timing, delays, and scheduling.

Overview

The System Timer is a simple but crucial peripheral:

  • 64-bit counter at 1 MHz (1 microsecond per tick)
  • Cannot be stopped - always running
  • Cannot be reset - counter value is read-only
  • Hardware guarantees: Runs at 1 MHz regardless of CPU/GPU clock speeds
  • Overflow: Won’t wrap for ~584,942 years (2^64 microseconds)

This makes it ideal for:

  • Accurate microsecond delays
  • Performance measurement
  • Scheduler tick source (future)
  • Timestamps and uptime tracking

Hardware Characteristics

| Property | Value |
|---|---|
| Base Address (ARM) | 0xFE003000 |
| Bus Address | 0x7E003000 (in datasheets) |
| Counter Width | 64-bit |
| Frequency | 1 MHz (fixed) |
| Resolution | 1 microsecond |
| Compare Channels | 4 (C0-C3) |
| ARM-Available Channels | 2 (C1, C3) |
| GPU-Reserved Channels | 2 (C0, C2) |

Register Map

The System Timer has 7 registers:

| Offset | Name | Width | Access | Description |
|---|---|---|---|---|
| +0x00 | CS | 32-bit | R/W | Control/Status (interrupt flags) |
| +0x04 | CLO | 32-bit | R | Counter Lower 32 bits |
| +0x08 | CHI | 32-bit | R | Counter Higher 32 bits |
| +0x0C | C0 | 32-bit | R/W | Compare 0 (used by GPU firmware) |
| +0x10 | C1 | 32-bit | R/W | Compare 1 (available for ARM) |
| +0x14 | C2 | 32-bit | R/W | Compare 2 (used by GPU firmware) |
| +0x18 | C3 | 32-bit | R/W | Compare 3 (available for ARM) |

CS Register (Control/Status)

The CS register contains interrupt match flags:

| Bit | Name | Description |
|---|---|---|
| 0 | M0 | Timer 0 match detected (GPU) |
| 1 | M1 | Timer 1 match detected (ARM available) |
| 2 | M2 | Timer 2 match detected (GPU) |
| 3 | M3 | Timer 3 match detected (ARM available) |
| 31:4 | - | Reserved |

Write 1 to a bit to clear the corresponding interrupt flag.

Counter Registers (CLO/CHI)

The 64-bit counter is split across two 32-bit registers:

  • CLO: Lower 32 bits (bits 31:0)
  • CHI: Upper 32 bits (bits 63:32)

Reading the 64-bit counter safely:

  1. Read CHI
  2. Read CLO
  3. Read CHI again
  4. If CHI changed, the CLO value may belong to either CHI value, so repeat from step 1 until two consecutive CHI reads match

This handles the rare case where CLO rolls over between reads.
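
A minimal sketch of a tear-free read follows. It is written against two closures standing in for volatile reads of CHI and CLO so it can run on a host; the function name `read_counter64` and this closure-based shape are assumptions, not the driver's real signature. Note that it retries the whole sequence when CHI changes, because the CLO value read in between could belong to either side of the rollover.

```rust
/// Read a 64-bit counter exposed as two 32-bit halves without tearing.
/// `read_hi` and `read_lo` stand in for volatile reads of CHI and CLO.
pub fn read_counter64<H, L>(mut read_hi: H, mut read_lo: L) -> u64
where
    H: FnMut() -> u32,
    L: FnMut() -> u32,
{
    loop {
        let hi = read_hi();
        let lo = read_lo();
        // If CHI is unchanged, CLO did not roll over between the two CHI reads.
        if read_hi() == hi {
            return (u64::from(hi) << 32) | u64::from(lo);
        }
        // Otherwise CLO wrapped mid-sequence; retry so both halves are consistent.
    }
}
```

On hardware the loop almost always exits on the first pass, since CLO only rolls over about once every 71.6 minutes at 1 MHz.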

Compare Registers (C0-C3)

Each compare register can trigger an interrupt when the lower 32 bits of the counter match:

  • When counter[31:0] == Cx, bit Mx in CS is set
  • GPU firmware uses C0 and C2 for its own purposes
  • ARM can safely use C1 and C3

Note: The driver currently uses the counter only for delays. Timer compare interrupts on C1/C3 are listed under Future Enhancements.

Usage Example

use daedalus::drivers::timer::SystemTimer;

// Get current timestamp in microseconds
let start = SystemTimer::timestamp_us();

// Delay for 1 millisecond
SystemTimer::delay_ms(1);

// Measure elapsed time
let end = SystemTimer::timestamp_us();
let elapsed = end - start;
println!("Operation took {} microseconds", elapsed);

// Get uptime in seconds
let uptime = SystemTimer::uptime_seconds();
println!("System has been running for {} seconds", uptime);

Implementation Details

Delay Functions

The driver provides two delay functions:

  • delay_us(n) - Busy-wait for n microseconds
  • delay_ms(n) - Busy-wait for n milliseconds (calls delay_us(n * 1000))

Accuracy:

  • Resolution: 1 microsecond
  • Overhead: ~2-5 microseconds per call
  • For delays < 10μs, actual delay may be longer due to overhead

Implementation: Simple busy-wait loop that polls the counter.

Wrap-Around Handling

The 64-bit counter will eventually wrap (after ~584,942 years), though this is unlikely in practice. The delay functions handle wrap-around using wrapping_add:

let start = SystemTimer::read_counter();
let target = start.wrapping_add(microseconds);

if target < start {
    // Wrapped - wait for counter to wrap first
    while SystemTimer::read_counter() >= start {}
}
while SystemTimer::read_counter() < target {}
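
An alternative formulation, shown here only as a sketch and not as the driver's implementation, compares elapsed time instead of absolute targets. Because `wrapping_sub` yields the correct distance even when the counter has wrapped between the two samples, no special case is needed. The names `elapsed_us` and `has_elapsed` are hypothetical.

```rust
/// Microseconds elapsed from `start` to `now`, correct across a counter wrap.
pub fn elapsed_us(start: u64, now: u64) -> u64 {
    now.wrapping_sub(start)
}

/// True once at least `duration_us` microseconds have passed since `start`.
pub fn has_elapsed(start: u64, now: u64, duration_us: u64) -> bool {
    elapsed_us(start, now) >= duration_us
}
```

A busy-wait built on this would sample the counter once for `start` and then spin until `has_elapsed(start, read_counter(), n)` returns true.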

Shell Commands

The uptime command uses the System Timer:

daedalus> uptime
Uptime: 5 minutes, 32 seconds
  (332451829 microseconds)

Performance Characteristics

| Operation | Typical Time | Notes |
|---|---|---|
| read_counter() | ~100 ns | 2 volatile reads + comparison |
| timestamp_us() | ~100 ns | Alias for read_counter() |
| delay_us(1) | ~3-6 μs | Minimum realistic delay |
| delay_us(100) | ~100-105 μs | < 5% overhead |
| delay_ms(1) | ~1000-1005 μs | < 1% overhead |

Measurements taken in QEMU on Apple M1 host. Real hardware may differ slightly.

References

External Documentation

Code References

  • Driver: src/drivers/clocksource/bcm2711.rs
  • Shell command: src/shell.rs (uptime command)
  • Tests: src/drivers/clocksource/bcm2711.rs (6 tests: counter, delays, monotonicity)

Future Enhancements

Planned for Milestone #9 (GIC-400 Setup):

  • Configure timer compare interrupts (C1 or C3)
  • Replace busy-wait delays with interrupt-driven timing
  • Scheduler tick for preemptive multitasking (Phase 3)

GIC-400 Interrupt Controller

Status: ✅ Implemented (Milestone #9 complete)

ARM Generic Interrupt Controller v2 (GIC-400) driver for interrupt handling on BCM2711.

Overview

The GIC-400 is a centralized interrupt controller that:

  • Manages up to 1020 interrupt sources
  • Routes interrupts to specific CPU cores
  • Supports interrupt prioritization and masking
  • Provides acknowledge/end-of-interrupt protocol

DaedalusOS uses the GIC for interrupt-driven I/O, starting with UART receive interrupts.

Hardware Configuration

Base Addresses

| Component | Address | Purpose |
|---|---|---|
| GIC Distributor (GICD) | 0xFF841000 | Global interrupt configuration |
| GIC CPU Interface (GICC) | 0xFF842000 | Per-CPU interrupt handling |

Source: BCM2711 ARM Peripherals, Section 6

Important Note: enable_gic Required

The GIC must be enabled by the firmware. Add to config.txt:

enable_gic=1

Without this setting, the GIC will not function in bare metal mode on Pi 4.

Interrupt IDs

Interrupt numbering follows ARM GIC-400 specification:

| Range | Type | Description |
|---|---|---|
| 0-15 | SGI | Software Generated Interrupts (inter-core communication) |
| 16-31 | PPI | Private Peripheral Interrupts (per-CPU timers, etc.) |
| 32-1019 | SPI | Shared Peripheral Interrupts (UART, GPIO, etc.) |
| 1020-1023 | Special | Reserved/spurious interrupt IDs |

BCM2711 Peripheral Interrupts

| Peripheral | Device Tree ID | Actual ID | Type |
|---|---|---|---|
| UART0 (PL011) | GIC_SPI 121 | 153 | Level, active high |

Note: SPI IDs in device tree are offset by +32 to get the actual GIC interrupt ID.

Source: Linux device tree arch/arm/boot/dts/broadcom/bcm2711.dtsi
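
The +32 offset can be captured in a constant function so that interrupt IDs are derived from device tree numbers instead of being hard-coded. The names `SPI_BASE`, `spi_to_gic_id`, and `UART0_IRQ` are hypothetical; the driver's own constant is referred to elsewhere in this chapter as `drivers::gic::irq::UART0`.

```rust
/// First Shared Peripheral Interrupt ID; IDs 0-31 are SGIs and PPIs.
pub const SPI_BASE: u32 = 32;

/// Convert a device-tree `GIC_SPI n` number to the GIC interrupt ID.
pub const fn spi_to_gic_id(spi: u32) -> u32 {
    spi + SPI_BASE
}

/// Hypothetical constant mirroring the UART0 row in the table above.
pub const UART0_IRQ: u32 = spi_to_gic_id(121);
```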

Architecture

The GIC has two main components:

Distributor (GICD)

Manages global interrupt state:

  • Enable/disable individual interrupts
  • Set interrupt priorities (0 = highest, 255 = lowest)
  • Configure trigger type (level-sensitive or edge-triggered)
  • Route interrupts to specific CPUs

Key registers:

  • GICD_CTLR (0x000): Enable/disable distributor
  • GICD_TYPER (0x004): Reports number of interrupt lines
  • GICD_ISENABLERn (0x100+): Enable interrupts (set-enable)
  • GICD_ICENABLERn (0x180+): Disable interrupts (clear-enable)
  • GICD_IPRIORITYRn (0x400+): Set interrupt priorities
  • GICD_ITARGETSRn (0x800+): Route to CPUs
  • GICD_ICFGRn (0xC00+): Configure trigger type
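
Each of the banked registers above is indexed by interrupt ID. The sketch below shows the usual GICv2 arithmetic: one enable bit per interrupt (32 per register), one priority byte and one target byte per interrupt, and a 2-bit configuration field per interrupt (16 per register). The function names are hypothetical, and the returned offsets are relative to the GICD base.

```rust
pub const GICD_ISENABLER: usize = 0x100;
pub const GICD_IPRIORITYR: usize = 0x400;
pub const GICD_ITARGETSR: usize = 0x800;
pub const GICD_ICFGR: usize = 0xC00;

/// Byte offset of the 32-bit set-enable register and the bit index for `id`.
pub fn isenabler_location(id: u32) -> (usize, u32) {
    (GICD_ISENABLER + (id as usize / 32) * 4, id % 32)
}

/// Byte offset of the 8-bit priority field for `id`.
pub fn ipriorityr_offset(id: u32) -> usize {
    GICD_IPRIORITYR + id as usize
}

/// Byte offset of the 8-bit CPU target field for `id`.
pub fn itargetsr_offset(id: u32) -> usize {
    GICD_ITARGETSR + id as usize
}

/// Byte offset of the 32-bit config register and the shift of the 2-bit field.
pub fn icfgr_location(id: u32) -> (usize, u32) {
    (GICD_ICFGR + (id as usize / 16) * 4, (id % 16) * 2)
}
```

For the UART0 interrupt (ID 153) this gives the enable bit 25 in the register at offset 0x110, the priority byte at 0x499, the target byte at 0x899, and the configuration field at bit 18 of the register at 0xC24.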

CPU Interface (GICC)

Per-CPU interrupt handling:

  • Acknowledge pending interrupts
  • Signal end-of-interrupt (EOI)
  • Configure priority masking

Key registers:

  • GICC_CTLR (0x000): Enable/disable CPU interface
  • GICC_PMR (0x004): Priority mask (accept only interrupts with priority higher than this)
  • GICC_IAR (0x00C): Interrupt acknowledge (read to get pending interrupt ID)
  • GICC_EOIR (0x010): End of interrupt (write interrupt ID when done)

Initialization Sequence

The GIC is initialized in drivers::gic::Gic::init():

  1. Disable distributor while configuring
  2. Read GICD_TYPER to get number of interrupt lines
  3. Configure all SPIs (ID >= 32):
    • Priority: 0xA0 (medium)
    • Target: CPU 0
    • Trigger: Level-sensitive (default for BCM2711 peripherals)
  4. Enable distributor (both Group 0 and Group 1)
  5. Configure CPU interface:
    • Priority mask: 0xFF (accept all)
    • Binary point: 0 (all priority bits for preemption)
    • Enable both interrupt groups

Interrupt Flow

Enabling an Interrupt

// Enable UART0 interrupt in GIC
let gic = drivers::gic::GIC.lock();
gic.enable_interrupt(drivers::gic::irq::UART0); // ID 153

// Enable RX interrupt in UART hardware
drivers::uart::WRITER.lock().enable_rx_interrupt();

// Unmask IRQs at CPU level (clear DAIF.I bit)
enable_irqs();

Handling an Interrupt

When an interrupt fires:

  1. CPU takes IRQ exception → jumps to vector table offset 0x280
  2. Assembly stub saves context → calls exception_handler_el1_spx
  3. Rust handler calls handle_irq():
    let int_id = gic.acknowledge_interrupt(); // Read GICC_IAR
    // Route to peripheral handler based on int_id
    gic.end_of_interrupt(int_id); // Write GICC_EOIR
  4. Assembly stub restores context → executes eret
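
One detail the flow above leaves implicit is that the value read from GICC_IAR is not always a real interrupt. In GICv2 the interrupt ID occupies bits 9:0, and the special IDs 1020-1023 (1023 being the spurious ID) do not correspond to a device. The sketch below decodes a raw IAR value; `decode_iar` and `FIRST_SPECIAL_ID` are hypothetical names, not functions from the driver.

```rust
/// Lowest of the special interrupt IDs (1020-1023) that never map to a device.
pub const FIRST_SPECIAL_ID: u32 = 1020;

/// Extract the interrupt ID from a raw GICC_IAR value.
/// Returns None for special IDs such as 1023 (spurious), in which case the
/// handler should return without writing GICC_EOIR.
pub fn decode_iar(iar: u32) -> Option<u32> {
    let id = iar & 0x3FF; // bits 9:0 hold the interrupt ID
    if id < FIRST_SPECIAL_ID {
        Some(id)
    } else {
        None
    }
}
```

A handler built on this would dispatch on `Some(id)` and write the original IAR value back to GICC_EOIR afterwards.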

Priority and Nesting

Current configuration:

  • Priority mask: 0xFF (lowest priority, accept all interrupts)
  • Binary point: 0 (all 8 priority bits used for preemption)
  • UART priority: 0xA0 (medium, higher value = lower priority)

Nested interrupts are not currently supported (DAIF.I is set while handling IRQs).
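
Because lower numeric values mean higher priority, the mask comparison is easy to get backwards. In GICv2 an interrupt is signalled to the CPU only when its priority value is strictly less than GICC_PMR. A one-line sketch, with the hypothetical name `passes_priority_mask`:

```rust
/// True if an interrupt with `priority` is allowed through a mask of `pmr`.
/// Lower values are higher priority, so the comparison is strict less-than.
pub fn passes_priority_mask(priority: u8, pmr: u8) -> bool {
    priority < pmr
}
```

With the configuration listed above, the UART priority 0xA0 passes the mask 0xFF, while an interrupt left at priority 0xFF would never be delivered.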

Implementation Details

Location: src/drivers/irqchip/gic_v2.rs (356 lines)

Key functions:

  • Gic::init() - Initialize GIC hardware
  • Gic::enable_interrupt(int_id) - Enable specific interrupt
  • Gic::disable_interrupt(int_id) - Disable specific interrupt
  • Gic::acknowledge_interrupt() - Get pending interrupt ID
  • Gic::end_of_interrupt(int_id) - Signal completion

All register access uses volatile reads/writes to prevent compiler optimization.

Testing

The GIC is initialized during kernel startup and tested by:

  1. Enabling UART RX interrupts
  2. Typing characters in QEMU console
  3. Verifying interrupt handler is called (characters are echoed)

Current Limitations

  1. Single CPU only - Interrupts routed to CPU 0
  2. No interrupt nesting - IRQs disabled during handler execution
  3. Level-sensitive only - Edge-triggered mode not tested
  4. No SGI/PPI support - Only SPIs (peripheral interrupts) configured

Future Enhancements

Potential improvements for later milestones:

  • Multi-core support (Phase 3):

    • Route interrupts to specific CPUs
    • Use SGIs for inter-processor communication
    • Implement per-CPU local timer interrupts (PPIs)
  • Priority-based preemption (Phase 3):

    • Allow higher-priority interrupts to preempt lower-priority handlers
    • Configure binary point for priority grouping
  • Edge-triggered interrupts:

    • Support GPIO interrupts (rising/falling edge)
    • Test edge-triggered configuration
  • Interrupt statistics:

    • Track interrupt counts per source
    • Measure interrupt latency

Code References

  • GIC driver: src/drivers/irqchip/gic_v2.rs
  • IRQ handler: src/arch/aarch64/exceptions.rs (handle_irq())
  • UART interrupt: src/drivers/tty/serial/amba_pl011.rs (handle_interrupt())
  • Initialization: src/lib.rs (init())

External References

  • ARM GIC Architecture Specification v2.0: IHI 0048 (GIC-400 implements GICv2; IHI 0069 covers GICv3/v4)

    • Section 2: Programmer’s model
    • Section 3: Distributor registers
    • Section 4: CPU Interface registers
    • Section 5: Interrupt configuration
  • BCM2711 Documentation: BCM2711 ARM Peripherals

    • Section 6: Interrupt controller
  • Linux Device Tree: bcm2711.dtsi

    • Interrupt IDs for BCM2711 peripherals

GENET v5 Ethernet Controller

Hardware: Broadcom GENET v5 (Gigabit Ethernet MAC)
SoC: BCM2711 (Raspberry Pi 4)
Driver: src/drivers/net/ethernet/broadcom/genet.rs
Status: Hardware detection and PHY management implemented


Overview

The GENET (Gigabit Ethernet) controller is an integrated MAC (Media Access Control) layer device in the BCM2711 SoC. It handles Ethernet frame transmission and reception, communicates with the external PHY chip via MDIO, and provides DMA engines for efficient packet transfer.

Architecture

┌─────────────────────────────────────────────────────────┐
│                    DaedalusOS Driver                     │
│                  (src/drivers/genet.rs)                  │
└───────────────────────────┬──────────────────────────────┘
                            │
            ┌───────────────┴───────────────┐
            │                               │
            ▼                               ▼
┌───────────────────────┐       ┌───────────────────────┐
│   GENET Controller    │       │    MDIO/MDC Bus       │
│   (MAC Layer)         │       │   (Management)        │
│                       │       │                       │
│ • UMAC (UniMAC)       │       │ • PHY Register Access │
│ • RX/TX Buffers       │◄──────┤ • Clause 22 Protocol  │
│ • DMA Engines         │       │ • 1 MHz Clock         │
│ • Interrupt Control   │       │                       │
│ • Statistics Counters │       │                       │
└───────────┬───────────┘       └───────────┬───────────┘
            │                               │
            │ Frame Data                    │ Management
            │                               │
            └───────────────┬───────────────┘
                            │
                            ▼
                ┌───────────────────────┐
                │   BCM54213PE PHY      │
                │   (Physical Layer)    │
                │                       │
                │ • Auto-negotiation    │
                │ • Link Detection      │
                │ • 10/100/1000 Mbps    │
                │ • MII Registers       │
                └───────────┬───────────┘
                            │
                            ▼
                    RJ45 Ethernet Port

Understanding “MAC” - Two Different Meanings

Terminology Confusion: “MAC” has two completely different meanings in networking:

| Term | Meaning | What It Is |
|---|---|---|
| MAC Address | Media Access Control Address | 48-bit hardware address (e.g., B8:27:EB:12:34:56) |
| MAC Chip | Media Access Controller | Physical chip that implements Layer 2 ethernet protocol |

When this document refers to “GENET MAC” or “MAC controller,” it means the controller chip, not the address!

PHY vs MAC: The Two-Chip Architecture

Ethernet on the Pi 4 is split across two devices with different responsibilities: a MAC block inside the SoC and a separate PHY chip on the board:

MAC Chip (GENET) - Inside BCM2711 SoC

  • Location: Integrated into the CPU die
  • Layer: OSI Layer 2 (Data Link)
  • Works with: Digital signals (bits)
  • Responsibilities:
    • Build/parse ethernet frames (14-byte header)
    • Add/check CRC-32 (only CRC - IP/TCP checksums are software!)
    • MAC address filtering (accept frames for our address)
    • DMA to/from system memory
    • Send/receive bits to/from PHY via RGMII bus
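
The 14-byte header mentioned above has a fixed layout: 6 bytes of destination address, 6 bytes of source address, and a 2-byte EtherType in network byte order. The sketch below assembles one; `build_header` and the constants are hypothetical names and do not reflect the project's own Ethernet frame implementation.

```rust
/// Length of an Ethernet II header: destination (6) + source (6) + EtherType (2).
pub const ETH_HEADER_LEN: usize = 14;

/// EtherType value for ARP.
pub const ETHERTYPE_ARP: u16 = 0x0806;

/// Assemble an Ethernet II header from its three fields.
pub fn build_header(dst: [u8; 6], src: [u8; 6], ethertype: u16) -> [u8; ETH_HEADER_LEN] {
    let mut hdr = [0u8; ETH_HEADER_LEN];
    hdr[0..6].copy_from_slice(&dst);
    hdr[6..12].copy_from_slice(&src);
    // EtherType is transmitted big-endian (network byte order).
    hdr[12..14].copy_from_slice(&ethertype.to_be_bytes());
    hdr
}
```

A broadcast ARP frame, for instance, would use the destination FF:FF:FF:FF:FF:FF with `ETHERTYPE_ARP`.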

PHY Chip (BCM54213PE) - External on Pi 4 Board

  • Location: Separate chip on the board (not in SoC)
  • Layer: OSI Layer 1 (Physical)
  • Works with: Analog signals (voltages on copper wire)
  • Responsibilities:
    • Convert digital bits ↔ electrical signals
    • Auto-negotiation (determine speed/duplex with partner)
    • Link detection (is cable plugged in?)
    • Signal encoding (1000BASE-T, 100BASE-TX, etc.)
    • Cable equalization, echo cancellation

How They Communicate

Data Path (RGMII): 12-wire parallel bus transfers packet bits

  • 4-bit TX data, 4-bit RX data, clocks, control signals
  • 125 MHz for Gigabit (1000 Mbps)
  • Carries actual ethernet frame data

Management Path (MDIO): 2-wire serial bus for PHY configuration

  • MDC (clock), MDIO (data)
  • ~1 MHz, slow but sufficient for configuration
  • Read PHY ID, configure speed, check link status

What the Driver Actually Does

95% of driver code = MAC chip (GENET)

  • Configure DMA rings
  • Build ethernet frames (header + payload)
  • Enable TX/RX
  • Handle interrupts
  • Manage buffers

5% of driver code = PHY chip (BCM54213PE)

  • Reset PHY
  • Enable auto-negotiation
  • Read link status

The PHY mostly “just works” once configured. The MAC is where driver complexity lives.

Key Features

  • MAC Layer: Handles frame encapsulation, CRC, and media access control
  • MDIO Controller: Manages communication with the PHY chip
  • DMA Engines: Separate RX and TX DMA for efficient packet transfer (not yet implemented)
  • Hardware Filtering: Can filter packets by MAC address (not yet implemented)
  • Statistics: Hardware counters for packets, bytes, errors (not yet implemented)
  • Interrupts: RX/TX completion, link changes, errors (not yet implemented)

Memory Map

Base Address

Address Type | Value           | Note
-------------|-----------------|--------------------------------------
Bus Address  | 0x7D580000      | As seen in device tree
ARM Physical | 0xFD580000      | What the CPU uses (bus + 0x8000_0000)
Size         | 0x10000 (64 KB) | Register space

CRITICAL: Always use 0xFD580000 as the base address. The device tree uses bus addresses, which differ from ARM physical addresses by a fixed offset.

Source: BCM2711 device tree (bcm2711.dtsi)

Register Block Offsets

All offsets are from GENET_BASE (0xFD580000):

Block     | Offset | Size    | Description
----------|--------|---------|-------------------------
SYS       | 0x0000 | 64 B    | System control registers
GR_BRIDGE | 0x0040 | 64 B    | GR bridge registers
EXT       | 0x0080 | 384 B   | Extension block
INTRL2_0  | 0x0200 | 64 B    | Interrupt controller 0
INTRL2_1  | 0x0240 | 64 B    | Interrupt controller 1
RBUF      | 0x0300 | 768 B   | RX buffer control
TBUF      | 0x0600 | 512 B   | TX buffer control
UMAC      | 0x0800 | 3588 B  | UniMAC (the actual MAC)
RDMA      | 0x2000 | 8192 B  | RX DMA engine
TDMA      | 0x4000 | 8192 B  | TX DMA engine
HFB       | 0x8000 | 32768 B | Hardware filter block

Source: Linux kernel driver (bcmgenet.h)
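
As a minimal sketch of how the base address and block offsets combine into an absolute register address: the constant values come from the tables above, but the helper name `reg_addr` and the bounds check are illustrative assumptions, not necessarily what the driver does.

```rust
/// ARM physical base of the GENET register space (from the Memory Map table).
pub const GENET_BASE: usize = 0xFD58_0000;
/// Size of the register window (64 KB).
pub const GENET_SIZE: usize = 0x1_0000;

// A few block offsets from the table above.
pub const SYS_OFF: usize = 0x0000;
pub const UMAC_OFF: usize = 0x0800;
pub const RDMA_OFF: usize = 0x2000;

/// Hypothetical helper: absolute address of a register given its block
/// offset and its offset inside that block. Returns None when the result
/// would fall outside the 64 KB window.
pub const fn reg_addr(block_off: usize, reg_off: usize) -> Option<usize> {
    let off = block_off + reg_off;
    if off < GENET_SIZE {
        Some(GENET_BASE + off)
    } else {
        None
    }
}
```

For example, UMAC_CMD sits 0x008 bytes into the UMAC block, which gives the documented offset 0x0808 from GENET_BASE.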


Register Reference

System Registers (SYS_OFF = 0x0000)

SYS_REV_CTRL (Offset 0x0000)

System revision control register. Contains version information.

Format:

Bits [31:28]: Reserved
Bits [27:24]: Major version (4 bits)
Bits [23:20]: Reserved
Bits [19:16]: Minor version (4 bits)
Bits [15:0]:  Reserved

⚠️ CRITICAL VERSION QUIRK: The BCM2711 GENET hardware reports major version 6 (not 5), which corresponds to the GENET v5 IP block. This naming inconsistency exists across all drivers:

  • Hardware register value: 0x06000000 (major=6, minor=0)
  • IP block name: GENET v5
  • Linux enum: GENET_V5 = 5
  • U-Boot validation: Only accepts major version 6

Examples:

  • 0x06000000 = GENET hardware v6.0 (GENET v5 IP block) ← Raspberry Pi 4
  • 0x05020000 = GENET hardware v5.2 (GENET v4 IP block)

Usage: The is_present() function checks that bits [27:24] == 6 for BCM2711.

Sources:

  • U-Boot: major = (reg >> 24) & 0x0f; if (major != 6) reject;
  • Linux: Maps hardware version 6 → GENET_V5 enum
  • EDK2: SYS_REV_MAJOR = BIT27|BIT26|BIT25|BIT24
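
The bit layout above can be sketched as a pure decoding function. The function names here are assumptions for illustration; only the field positions and the "major == 6" rule come from this section.

```rust
/// Decode SYS_REV_CTRL into (major, minor) using the documented bit layout.
pub fn decode_rev(reg: u32) -> (u8, u8) {
    let major = ((reg >> 24) & 0x0F) as u8; // bits [27:24]
    let minor = ((reg >> 16) & 0x0F) as u8; // bits [19:16]
    (major, minor)
}

/// True when the hardware reports major version 6, i.e. the GENET v5 IP block.
pub fn is_genet_v5(reg: u32) -> bool {
    decode_rev(reg).0 == 6
}
```

Keeping the decode separate from the MMIO read makes it testable on the host, where the register itself is not available.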

UMAC Registers (UMAC_OFF = 0x0800)

The UMAC (Unified MAC) is the core MAC layer implementation within GENET.

UMAC_CMD (Offset 0x0808)

Command register. Controls MAC enable, reset, and operating modes.

Key Bits:

  • Bit 0: TX_EN - Enable transmit
  • Bit 1: RX_EN - Enable receive
  • Bit 13: SW_RESET - Software reset (self-clearing)

Usage:

// Enable TX and RX
self.write_reg(UMAC_CMD, CMD_TX_EN | CMD_RX_EN);

// Reset UMAC
self.write_reg(UMAC_CMD, CMD_SW_RESET);
// Wait for reset to complete (bit clears automatically)

UMAC_MAC0 (Offset 0x080C)

MAC address bytes 0-3 (network byte order).

Format:

Bits [31:24]: MAC byte 0
Bits [23:16]: MAC byte 1
Bits [15:8]:  MAC byte 2
Bits [7:0]:   MAC byte 3

UMAC_MAC1 (Offset 0x0810)

MAC address bytes 4-5 (network byte order).

Format:

Bits [31:16]: Reserved
Bits [15:8]:  MAC byte 4
Bits [7:0]:   MAC byte 5

Usage:

// Set MAC address B8:27:EB:12:34:56
// Explicit u32 types keep the shifted literals from being inferred as i32
let mac0: u32 = (0xB8 << 24) | (0x27 << 16) | (0xEB << 8) | 0x12;
let mac1: u32 = (0x34 << 8) | 0x56;
self.write_reg(UMAC_MAC0, mac0);
self.write_reg(UMAC_MAC1, mac1);
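
The same packing can be written for an arbitrary address. This is a sketch under the byte layout documented above; `pack_mac` and `unpack_mac` are hypothetical helper names.

```rust
/// Pack a 6-byte MAC address into the (UMAC_MAC0, UMAC_MAC1) register values.
pub fn pack_mac(mac: [u8; 6]) -> (u32, u32) {
    // Bytes 0-3 fill MAC0 from the most significant byte down.
    let mac0 = u32::from_be_bytes([mac[0], mac[1], mac[2], mac[3]]);
    // Bytes 4-5 fill the low 16 bits of MAC1; the upper 16 bits stay zero.
    let mac1 = u32::from(u16::from_be_bytes([mac[4], mac[5]]));
    (mac0, mac1)
}

/// Inverse of `pack_mac`: recover the 6 bytes from the two register values.
pub fn unpack_mac(mac0: u32, mac1: u32) -> [u8; 6] {
    let hi = mac0.to_be_bytes();
    let lo = (mac1 as u16).to_be_bytes();
    [hi[0], hi[1], hi[2], hi[3], lo[0], lo[1]]
}
```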

UMAC_MODE (Offset 0x084C)

Mode register. Controls speed (10/100/1000 Mbps) and duplex.

⚠️ HARDWARE QUIRK: This register is write-only. Reading returns garbage. Must track state in software.

Key Bits:

  • Bits [1:0]: Speed selection
    • 00 = 10 Mbps
    • 01 = 100 Mbps
    • 10 = 1000 Mbps
  • Bit 4: Full duplex enable

UMAC_MDIO_CMD (Offset 0x0E14)

MDIO command and data register. Used to read/write PHY registers.

Format:

Bit 29:       MDIO_START_BUSY - Start operation / operation in progress
Bit 28:       MDIO_READ_FAIL - Read failed
Bits [27:26]: Operation - 10 = read, 01 = write
Bits [25:21]: PHY address (5 bits)
Bits [20:16]: Register address (5 bits)
Bits [15:0]:  Data (read or write)

Read Sequence:

  1. Write: MDIO_START_BUSY | MDIO_RD | (phy_addr << 21) | (reg_addr << 16)
  2. Poll bit 29 until clear (timeout ~1ms)
  3. Check bit 28 (MDIO_READ_FAIL)
  4. Read bits [15:0] for data

Write Sequence:

  1. Write: MDIO_START_BUSY | MDIO_WR | (phy_addr << 21) | (reg_addr << 16) | data
  2. Poll bit 29 until clear (timeout ~1ms)

See: MDIO Protocol section below for details.
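
The read and write sequences above both start by composing a command word. A minimal sketch of that composition, kept free of MMIO so it can be checked on the host, might look like this. The bit positions come from the format table; the function names are assumptions.

```rust
pub const MDIO_START_BUSY: u32 = 1 << 29;
pub const MDIO_READ_FAIL: u32 = 1 << 28;
pub const MDIO_RD: u32 = 0b10 << 26;
pub const MDIO_WR: u32 = 0b01 << 26;

/// Command word for reading `reg_addr` of the PHY at `phy_addr`.
pub fn mdio_read_cmd(phy_addr: u8, reg_addr: u8) -> u32 {
    MDIO_START_BUSY
        | MDIO_RD
        | ((phy_addr as u32 & 0x1F) << 21) // 5-bit PHY address
        | ((reg_addr as u32 & 0x1F) << 16) // 5-bit register address
}

/// Command word for writing `data` to `reg_addr` of the PHY at `phy_addr`.
pub fn mdio_write_cmd(phy_addr: u8, reg_addr: u8, data: u16) -> u32 {
    MDIO_START_BUSY
        | MDIO_WR
        | ((phy_addr as u32 & 0x1F) << 21)
        | ((reg_addr as u32 & 0x1F) << 16)
        | data as u32
}

/// Interpret a polled status word: None while busy or on read failure,
/// otherwise the 16 data bits.
pub fn mdio_read_result(status: u32) -> Option<u16> {
    if status & (MDIO_START_BUSY | MDIO_READ_FAIL) != 0 {
        None
    } else {
        Some((status & 0xFFFF) as u16)
    }
}
```

Masking the addresses to 5 bits is a defensive choice in this sketch so that an out-of-range argument cannot spill into the opcode field.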


MDIO Protocol

MDIO (Management Data Input/Output) is the bus used to communicate with the PHY chip. It’s a simple serial protocol with two signals:

  • MDC: Management Data Clock (~1 MHz)
  • MDIO: Management Data (bidirectional)

Clause 22 Protocol

The GENET controller implements IEEE 802.3 Clause 22 MDIO protocol:

  1. Preamble: 32 bits of 1
  2. Start: 01
  3. Opcode: 10 (read) or 01 (write)
  4. PHY Address: 5 bits
  5. Register Address: 5 bits
  6. Turnaround: 2 bits
  7. Data: 16 bits

Timing: Each bit takes one MDC clock cycle. The GENET controller handles the protocol automatically - we just write to UMAC_MDIO_CMD and poll for completion.
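
To make the frame layout concrete, here is a sketch that assembles the 32 bits following the preamble for a write and estimates the wire time. The GENET hardware does this itself; the code is purely illustrative and its names are assumptions. The turnaround value 10 applies to writes, where the station drives both bits.

```rust
/// Build the 32 bits that follow the preamble for a Clause 22 write.
pub fn clause22_write_frame(phy_addr: u8, reg_addr: u8, data: u16) -> u32 {
    const START: u32 = 0b01;
    const OP_WRITE: u32 = 0b01;
    const TURNAROUND: u32 = 0b10; // driven by the station on a write
    (START << 30)
        | (OP_WRITE << 28)
        | ((phy_addr as u32 & 0x1F) << 23)
        | ((reg_addr as u32 & 0x1F) << 18)
        | (TURNAROUND << 16)
        | data as u32
}

/// Bits on the wire per transaction: 32 preamble bits plus the 32-bit frame.
pub const CLAUSE22_BITS: u32 = 32 + 2 + 2 + 5 + 5 + 2 + 16;

/// Wire time of one transaction in microseconds at the given MDC frequency.
pub fn transaction_us(mdc_hz: u32) -> u32 {
    CLAUSE22_BITS * 1_000_000 / mdc_hz
}
```

At an MDC of about 1 MHz one transaction takes roughly 64 µs, which is why a 1 ms polling timeout leaves a comfortable margin.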

MDIO Operations

Reading a PHY Register

pub fn mdio_read(&self, phy_addr: u8, reg_addr: u8) -> Option<u16> {
    // Build command: read operation
    let cmd = MDIO_START_BUSY
        | MDIO_RD
        | ((phy_addr as u32) << 21)
        | ((reg_addr as u32) << 16);

    // Start operation
    self.write_reg(UMAC_MDIO_CMD, cmd);

    // Wait for completion (poll START_BUSY bit)
    for _ in 0..1000 {
        let status = self.read_reg(UMAC_MDIO_CMD);

        if (status & MDIO_START_BUSY) == 0 {
            // Check for read failure
            if (status & MDIO_READ_FAIL) != 0 {
                return None;
            }

            // Return data from bits [15:0]
            return Some((status & 0xFFFF) as u16);
        }

        SystemTimer::delay_us(1);
    }

    None // Timeout
}

Writing a PHY Register

pub fn mdio_write(&self, phy_addr: u8, reg_addr: u8, value: u16) -> bool {
    // Build command: write operation with data
    let cmd = MDIO_START_BUSY
        | MDIO_WR
        | ((phy_addr as u32) << 21)
        | ((reg_addr as u32) << 16)
        | (value as u32);

    // Start operation
    self.write_reg(UMAC_MDIO_CMD, cmd);

    // Wait for completion
    for _ in 0..1000 {
        let status = self.read_reg(UMAC_MDIO_CMD);

        if (status & MDIO_START_BUSY) == 0 {
            return true;
        }

        SystemTimer::delay_us(1);
    }

    false // Timeout
}

Timeout: 1000 iterations × 1 µs = 1 ms maximum wait time.


PHY Management (BCM54213PE)

The BCM54213PE is the external Gigabit Ethernet PHY chip on the Raspberry Pi 4. It handles the physical layer: link detection, auto-negotiation, and signal encoding.

PHY Constants

Constant         | Value      | Source
-----------------|------------|--------------------
MDIO Address     | 0x01       | Pi 4 device tree
PHY ID           | 0x600D84A2 | Linux kernel driver
PHY ID1 Register | 0x600D     | Upper 16 bits
PHY ID2 Register | 0x84A2     | Lower 16 bits

MII Register Map (IEEE 802.3)

These are standard registers that all Ethernet PHYs must implement:

Register  | Address | Name                           | Description
----------|---------|--------------------------------|--------------------------
BMCR      | 0x00    | Basic Mode Control             | Control register
BMSR      | 0x01    | Basic Mode Status              | Status register
PHYSID1   | 0x02    | PHY ID 1                       | Upper 16 bits of PHY ID
PHYSID2   | 0x03    | PHY ID 2                       | Lower 16 bits of PHY ID
ADVERTISE | 0x04    | Auto-Negotiation Advertisement | Capabilities to advertise
LPA       | 0x05    | Link Partner Ability           | Partner’s capabilities
CTRL1000  | 0x09    | 1000BASE-T Control             | Gigabit control
STAT1000  | 0x0A    | 1000BASE-T Status              | Gigabit status

Source: IEEE 802.3 Clause 22

BMCR - Basic Mode Control Register (0x00)

Controls PHY operation and initiates actions.

Key Bits:

  • Bit 15: RESET - Software reset (self-clearing)
  • Bit 12: ANENABLE - Enable auto-negotiation
  • Bit 9: ANRESTART - Restart auto-negotiation
  • Bit 8: DUPLEX - Full duplex (if auto-neg disabled)
  • Bits [6,13]: Speed selection (if auto-neg disabled)

Usage:

// Reset PHY
self.mdio_write(PHY_ADDR, MII_BMCR, BMCR_RESET);
SystemTimer::delay_ms(10); // Wait for reset

// Enable auto-negotiation
self.mdio_write(PHY_ADDR, MII_BMCR, BMCR_ANENABLE | BMCR_ANRESTART);
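
The split speed field (bits 6 and 13) is easy to get wrong, so here is a sketch of how a forced-mode BMCR value could be composed when auto-negotiation is disabled. The bit values follow the standard MII definitions; the `Speed` enum and function name are assumptions, and the 1000 Mbps case is shown only to illustrate the encoding.

```rust
pub const BMCR_SPEED1000: u16 = 1 << 6; // speed select, MSB
pub const BMCR_FULLDPLX: u16 = 1 << 8; // full duplex
pub const BMCR_SPEED100: u16 = 1 << 13; // speed select, LSB

pub enum Speed {
    Mbps10,
    Mbps100,
    Mbps1000,
}

/// BMCR value for a forced speed/duplex setting (auto-negotiation off).
pub fn forced_mode_bmcr(speed: Speed, full_duplex: bool) -> u16 {
    let speed_bits = match speed {
        Speed::Mbps10 => 0,               // bits [6,13] = 00
        Speed::Mbps100 => BMCR_SPEED100,  // bits [6,13] = 01
        Speed::Mbps1000 => BMCR_SPEED1000, // bits [6,13] = 10
    };
    let duplex_bit = if full_duplex { BMCR_FULLDPLX } else { 0 };
    speed_bits | duplex_bit
}
```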

BMSR - Basic Mode Status Register (0x01)

Read-only register indicating PHY status and capabilities.

Key Bits:

  • Bit 5: ANEGCOMPLETE - Auto-negotiation complete
  • Bit 2: LSTATUS - Link status (1 = link up)
  • Bit 3: Auto-negotiation capable
  • Bits [15:11]: Supported speeds (100BASE-T4, 100BASE-X, 10BASE-T)

Usage:

// Check link status
if let Some(bmsr) = self.mdio_read(PHY_ADDR, MII_BMSR) {
    let link_up = (bmsr & BMSR_LSTATUS) != 0;
    let autoneg_done = (bmsr & BMSR_ANEGCOMPLETE) != 0;
}

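⚠️ NOTE: Some BMSR bits are latching. In particular, link status (bit 2) latches low: after a link drop it keeps reading 0 until the register has been read once. The first read can therefore report stale state, so read BMSR twice and use the second value to get the current state.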
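
A sketch of the double-read pattern for the latching bits noted above. The closure parameter stands in for `mdio_read(PHY_ADDR, MII_BMSR)` so the logic can be exercised without hardware; the function name is an assumption.

```rust
pub const BMSR_LSTATUS: u16 = 1 << 2;
pub const BMSR_ANEGCOMPLETE: u16 = 1 << 5;

/// Read BMSR twice and report (link_up, autoneg_done) from the second read.
/// Returns None if either MDIO read fails.
pub fn current_link_state(
    mut read_bmsr: impl FnMut() -> Option<u16>,
) -> Option<(bool, bool)> {
    let _stale = read_bmsr()?; // first read clears any latched value
    let bmsr = read_bmsr()?; // second read reflects the current state
    let link_up = (bmsr & BMSR_LSTATUS) != 0;
    let autoneg_done = (bmsr & BMSR_ANEGCOMPLETE) != 0;
    Some((link_up, autoneg_done))
}
```

In the driver this would be called as something like `current_link_state(|| self.mdio_read(PHY_ADDR, MII_BMSR))`.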

PHY Initialization Sequence

  1. Read PHY ID to verify presence:

    let id1 = self.mdio_read(PHY_ADDR, MII_PHYSID1)?;
    let id2 = self.mdio_read(PHY_ADDR, MII_PHYSID2)?;
    let phy_id = ((id1 as u32) << 16) | (id2 as u32);
    assert_eq!(phy_id, 0x600D84A2); // BCM54213PE

  2. Software Reset:

    self.mdio_write(PHY_ADDR, MII_BMCR, BMCR_RESET);
    SystemTimer::delay_ms(10); // Wait for reset to complete

  3. Configure Auto-Negotiation (optional, usually done by firmware):

    // Read current advertisement
    let advertise = self.mdio_read(PHY_ADDR, MII_ADVERTISE)?;
    // Advertise 10/100 capabilities...

    // Enable Gigabit advertisement
    let ctrl1000 = self.mdio_read(PHY_ADDR, MII_CTRL1000)?;
    // Set Gigabit capabilities...

  4. Start Auto-Negotiation:

    self.mdio_write(PHY_ADDR, MII_BMCR, BMCR_ANENABLE | BMCR_ANRESTART);

  5. Wait for Link:

    // Poll for up to 3000 × 1 ms = 3 seconds
    for _ in 0..3000 {
        if let Some(bmsr) = self.mdio_read(PHY_ADDR, MII_BMSR) {
            if (bmsr & BMSR_ANEGCOMPLETE) != 0 {
                // Auto-negotiation complete
                break;
            }
        }
        SystemTimer::delay_ms(1);
    }

  6. Read Link Parameters:

    let lpa = self.mdio_read(PHY_ADDR, MII_LPA)?;
    let stat1000 = self.mdio_read(PHY_ADDR, MII_STAT1000)?;
    // Determine speed and duplex from partner ability
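
Step 6 leaves the actual resolution open. One common approach, sketched here as an assumption rather than as the driver's method, is to intersect what we advertised with what the partner reported and pick the fastest common mode. The bit values are the standard MII ones; the type and function names are hypothetical.

```rust
// ADVERTISE / LPA share these bit positions.
pub const ADVERTISE_10HALF: u16 = 0x0020;
pub const ADVERTISE_10FULL: u16 = 0x0040;
pub const ADVERTISE_100HALF: u16 = 0x0080;
pub const ADVERTISE_100FULL: u16 = 0x0100;
// CTRL1000 advertisement bits. The partner's matching bits in STAT1000
// sit two positions higher (bits 10 and 11).
pub const CTRL1000_ADV_1000HALF: u16 = 0x0100;
pub const CTRL1000_ADV_1000FULL: u16 = 0x0200;

#[derive(Debug, PartialEq, Eq)]
pub struct LinkMode {
    pub speed_mbps: u32,
    pub full_duplex: bool,
}

/// Pick the best mode supported by both ends, or None if nothing matches.
pub fn resolve_link(advertise: u16, lpa: u16, ctrl1000: u16, stat1000: u16) -> Option<LinkMode> {
    let gig = ctrl1000 & (stat1000 >> 2); // align partner bits with ours
    let common = advertise & lpa;
    let (speed_mbps, full_duplex) = if (gig & CTRL1000_ADV_1000FULL) != 0 {
        (1000, true)
    } else if (gig & CTRL1000_ADV_1000HALF) != 0 {
        (1000, false)
    } else if (common & ADVERTISE_100FULL) != 0 {
        (100, true)
    } else if (common & ADVERTISE_100HALF) != 0 {
        (100, false)
    } else if (common & ADVERTISE_10FULL) != 0 {
        (10, true)
    } else if (common & ADVERTISE_10HALF) != 0 {
        (10, false)
    } else {
        return None;
    };
    Some(LinkMode { speed_mbps, full_duplex })
}
```

This needs the ADVERTISE and CTRL1000 values read in step 3 in addition to the two registers read in step 6.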

Hardware Quirks and Limitations

1. UMAC_MODE is Write-Only

Problem: Reading UMAC_MODE register returns garbage, not the written value.

Impact: Cannot verify speed/duplex settings by reading back.

Workaround: Track the current mode in software (in the GenetController struct).

Source: U-Boot driver comments, community reports

2. No PHY Link-Change Interrupts

Problem: The PHY doesn’t generate interrupts on link state changes.

Impact: Cannot rely on interrupts for link detection.

Workaround: Poll BMSR register periodically (e.g., every 1 second) to detect link changes.

Source: Linux kernel driver uses polling

3. Some Registers Are Write-Once After Reset

Problem: Certain configuration registers only accept the first write after a hardware reset.

Impact: Must get initialization right the first time.

Workaround: Carefully plan initialization sequence. Test thoroughly.

Source: Broadcom vendor documentation (not public)

4. MDIO Timing is Critical

Problem: MDIO operations need proper delays between operations.

Impact: Too fast = operation fails. Too slow = waste time.

Workaround: Use 1 µs polling intervals with 1 ms timeout (current implementation).

Source: IEEE 802.3 timing requirements

5. Auto-Negotiation Takes Time

Problem: Link auto-negotiation can take 1-3 seconds.

Impact: Boot time increases if waiting for link.

Workaround:

  • Option 1: Don’t wait during init, just start negotiation
  • Option 2: Wait with timeout and continue even if incomplete
  • Option 3: Background polling task

Source: IEEE 802.3 specification (auto-negotiation protocol)


QEMU Limitations

CRITICAL: QEMU 9.0’s raspi4b machine does not fully emulate GENET.

Observed Behavior

Reading from GENET registers (0xFD580000) in QEMU causes a Data Abort exception. This happens because:

  1. QEMU doesn’t implement the GENET controller
  2. The address is not mapped to any device
  3. ARM generates a data abort for unmapped addresses

Detection

The is_present() function checks for the hardware by reading the version register:

pub fn is_present(&self) -> bool {
    let version = self.read_reg(SYS_REV_CTRL);
    // Major version lives in bits [27:24]; BCM2711 reports 6 (GENET v5 IP block)
    let major_version = (version >> 24) & 0x0F;
    major_version == 6
}

In QEMU, this will either:

  • Return false (if the read succeeds but returns garbage)
  • Cause a data abort exception (current QEMU behavior)

Workaround

Wrap all GENET access in exception handlers or presence checks:

if genet.is_present() {
    // Safe to use GENET
    genet.diagnostic();
} else {
    println!("GENET hardware not present (QEMU?)");
}

Testing Strategy

  • Unit Tests: Test pure functions (parsing, encoding) in QEMU
  • Integration Tests: Mark as #[ignore], run on real hardware only
  • Interactive Tests: Use shell commands on real Pi 4

Initialization Flowchart

Complete initialization sequence for GENET + PHY:

START
  │
  ├─► Check GENET present (read SYS_REV_CTRL)
  │   ├─► Major version != 6 (not GENET v5) → ERROR: Hardware not found
  │   └─► Major version == 6 (GENET v5) → Continue
  │
  ├─► Soft reset UMAC (write UMAC_CMD)
  │   └─► Wait 10 µs
  │
  ├─► Detect PHY via MDIO
  │   ├─► Read PHYSID1 (0x02)
  │   ├─► Read PHYSID2 (0x03)
  │   └─► Verify ID == 0x600D84A2
  │
  ├─► Reset PHY
  │   ├─► Write BMCR[15] = 1 (reset)
  │   └─► Wait 10-100 ms
  │
  ├─► Configure Auto-Negotiation
  │   ├─► Write ADVERTISE (10/100 capabilities)
  │   ├─► Write CTRL1000 (1000 capabilities)
  │   └─► Write BMCR (enable auto-neg, restart)
  │
  ├─► Wait for Link (optional)
  │   ├─► Poll BMSR[5] (auto-neg complete)
  │   ├─► Poll BMSR[2] (link status)
  │   └─► Timeout after 3 seconds
  │
  ├─► Read Link Parameters
  │   ├─► Read LPA (partner ability)
  │   ├─► Read STAT1000 (Gigabit status)
  │   └─► Determine speed and duplex
  │
  ├─► Configure UMAC
  │   ├─► Write MAC address (UMAC_MAC0/MAC1)
  │   ├─► Write speed/duplex (UMAC_MODE)
  │   └─► Enable TX/RX (UMAC_CMD)
  │
  └─► READY

Diagnostic Output

The diagnostic() function performs a comprehensive hardware check. Expected output on real Pi 4:

With Ethernet Cable Unplugged

[DIAG] Ethernet Hardware Diagnostics
[DIAG] ================================
[DIAG] Step 1: GENET Controller Detection
[DIAG]   Reading SYS_REV_CTRL @ 0xFD580000...
[DIAG]   Raw register value: 0x06000000
[PASS]   GENET hardware v6.0 detected (GENET v5 IP block)
[PASS]   Register: 0x06000000

[DIAG] Step 2: PHY Detection
[DIAG]   Scanning MDIO address 1...
[DIAG]   Reading PHY_ID1 @ addr 1, reg 0x02...
[DIAG]     Value: 0x600D
[DIAG]   Reading PHY_ID2 @ addr 1, reg 0x03...
[DIAG]     Value: 0x84A2
[PASS]   PHY found at address 1: BCM54213PE (ID: 0x600D84A2)

[DIAG] Step 3: PHY Status
[DIAG]   Reading BMSR (Basic Mode Status Register)...
[DIAG]     BMSR: 0x7949
[DIAG]       Link status: DOWN
[DIAG]       Auto-negotiation: IN PROGRESS
[DIAG]   Reading BMCR (Basic Mode Control Register)...
[DIAG]     BMCR: 0x1140
[DIAG]       Auto-negotiation: ENABLED

[PASS] ================================
[PASS] Hardware diagnostics complete!
[PASS] GENET hardware v6.0 (GENET v5 IP) and BCM54213PE PHY detected

Note: Link status shows DOWN when no ethernet cable is plugged in. This is normal - plug in a cable to see “Link status: UP” and “Auto-negotiation: COMPLETE”.

In QEMU

[DIAG] Ethernet Hardware Diagnostics
[DIAG] ================================
[DIAG] Step 1: GENET Controller Detection
[DIAG]   Reading SYS_REV_CTRL @ 0xFD580000...
[DIAG]   Raw register value: 0x00000000
[WARN]   Unexpected version: 0.0 (expected 6.x for GENET v5)
[INFO]   Hardware not present (running in QEMU?)
[SKIP] Diagnostics completed (no hardware detected)


Next Steps

Current Implementation Status

Implemented:

  • Register definitions and constants
  • MMIO read/write functions
  • MDIO read/write protocol
  • PHY ID detection
  • Hardware presence checking
  • Comprehensive diagnostics

Not Yet Implemented:

  • Frame transmission (TX path)
  • Frame reception (RX path)
  • DMA engine configuration
  • Interrupt handling
  • MAC address configuration
  • Link state monitoring
  • Speed/duplex configuration

Future Milestones

  • Milestone #13: Frame TX/RX (simple polling mode)
  • Milestone #14: Interrupt-driven RX
  • Milestone #15: ARP responder
  • Milestone #16: TCP/IP stack integration (smoltcp)

Boot Sequence

Complete boot flow from firmware to Rust kernel.

Overview

Pi 4 Firmware → Assembly Stub → Rust Entry → Kernel Main
  (EL2/EL1)      (boot.s)         (_start_rust)   (kernel_main)

Stage 1: Firmware

The Raspberry Pi 4 firmware (start4.elf) performs initial hardware setup:

  1. Initializes CPU cores, memory, and basic peripherals
  2. Loads kernel8.img from SD card FAT partition
  3. Copies kernel to physical address 0x00080000
  4. Jumps to _start (first instruction in kernel)

State at firmware handoff:

  • MMU disabled (identity addressing, no virtual memory)
  • Data and instruction caches disabled
  • Interrupts masked (DAIF bits set - D, A, I, F all masked)
  • Stack pointer undefined (must be set by boot code)
  • Exception level: EL2 (both QEMU and real hardware boot at EL2)
  • All cores running (core 0 continues, cores 1-3 must be parked)

IMPORTANT: The boot stub immediately drops from EL2 to EL1 before jumping to Rust. This ensures atomic instructions and spin locks work correctly on both QEMU and hardware.

Stage 2: Assembly Stub (boot.s)

Located at src/arch/aarch64/boot.s, linked first via .text.boot section.

Entry Point (_start)

.section .text.boot
.global _start
_start:
    // 1. Check core ID
    mrs x0, mpidr_el1
    and x0, x0, #0xFF        // Extract Aff0 field (core number)
    cbnz x0, park_core       // Park non-zero cores

primary_core:
    // 2. Drop from EL2 to EL1 if currently at EL2
    mrs x0, CurrentEL
    and x0, x0, #0xC         // Bits [3:2] contain EL
    cmp x0, #8               // EL2 = 0b10 << 2 = 8
    b.ne setup_stack         // If not EL2, skip transition

    // Initialize EL1 system registers (have UNKNOWN values before first entry)
    // Reference: ARM Trusted Firmware lib/el3_runtime/aarch64/context_mgmt.c

    // SCTLR_EL1: Set RES1 bits, MMU/caches disabled
    ldr x0, =0x30D00800
    msr sctlr_el1, x0

    // Initialize MMU registers to safe disabled state
    msr tcr_el1, xzr
    msr mair_el1, xzr
    msr ttbr0_el1, xzr
    msr ttbr1_el1, xzr

    // Enable FP/SIMD at EL1 (CPACR_EL1.FPEN = 0b11)
    // LLVM may use SIMD for memory operations
    mov x0, #(0b11 << 20)
    msr cpacr_el1, x0

    // Initialize exception vector table
    ldr x0, =exception_vector_table
    msr vbar_el1, x0
    isb

    // Configure EL1 execution state
    mov x0, #(1 << 31)       // RW bit: EL1 is AArch64
    msr hcr_el2, x0

    // Set exception level and mask interrupts
    mov x0, #0x3C5           // EL1h mode, all interrupts masked
    msr spsr_el2, x0

    // Set return address to setup_stack
    adr x0, setup_stack
    msr elr_el2, x0

    // Exception return to EL1
    eret

setup_stack:
    // 3. Set up stack
    ldr x0, =_stack_start
    mov sp, x0

    // 4. Clear BSS section
    ldr x0, =__bss_start
    ldr x1, =__bss_end
clear_bss:
    cmp x0, x1
    b.ge clear_bss_done
    str xzr, [x0], #8
    b clear_bss
clear_bss_done:

    // 5. Jump to Rust
    bl _start_rust

    // Should never return
hang:
    wfe
    b hang

Core Parking

park_core:
    wfe          // Wait for event
    b park_core  // Loop forever

Cores 1-3 are parked in low-power mode. Future milestones will wake them for SMP support.

Stage 3: Rust Entry (main.rs)

The _start_rust function in src/main.rs:

// Rust 2024 edition requires the unsafe(...) form of this attribute
#[unsafe(no_mangle)]
pub extern "C" fn _start_rust() -> ! {
    // Initialize kernel subsystems
    daedalus::init();  // MMU, UART, exception vectors, GIC, heap

    #[cfg(test)]
    test_main();       // Run tests if in test mode

    #[cfg(not(test))]
    daedalus::shell::run();  // Launch interactive shell

    // Never returns
    loop {
        core::hint::spin_loop();
    }
}

Stage 4: Kernel Initialization (lib.rs)

The daedalus::init() function performs subsystem setup in a specific order:

pub fn init() {
    // 1. Initialize MMU first (before UART or any other subsystem)
    //    - Sets up L1 and L2 translation tables (L2 entries map 2 MB blocks)
    //    - Identity maps 0-1 GB (normal memory) and 3-4 GB (MMIO)
    //    - Configures MAIR_EL1, TCR_EL1, TTBR0_EL1
    //    - Enables MMU, data cache, and instruction cache
    unsafe {
        arch::aarch64::mmu::init();
    }

    // 2. Initialize UART driver
    //    - Firmware already initialized it, we just take control
    //    - Now we can print boot messages
    drivers::uart::WRITER.lock().init();

    // 3. Print boot sequence header
    println!("DaedalusOS v{} booting...", VERSION);
    println!("[  OK  ] MMU initialized (virtual memory enabled)");

    // 4. Install exception vector table
    exceptions::init();
    println!("[  OK  ] Exception vectors installed");

    // 5. Initialize GIC-400 interrupt controller
    //    - Configure distributor and CPU interface
    //    - Enable UART0 interrupt (ID 153)
    {
        // Scope the guard so the GIC lock is released before IRQs are
        // unmasked, in case the IRQ handler also takes this lock
        let mut gic = drivers::gic::GIC.lock();
        gic.init();
        gic.enable_interrupt(drivers::gic::irq::UART0);
    }
    println!("[  OK  ] GIC-400 interrupt controller initialized");

    // 6. Enable UART RX interrupts and unmask IRQs at CPU level
    drivers::uart::WRITER.lock().enable_rx_interrupt();
    enable_irqs();  // Unmasks I bit in DAIF register
    println!("[  OK  ] IRQs enabled (interrupt-driven I/O active)");

    // 7. Initialize heap allocator
    //    - 8 MB region defined in linker.ld
    //    - Simple bump allocator for String/Vec support
    unsafe {
        // Rust 2024 edition requires extern blocks to be marked unsafe
        unsafe extern "C" {
            static __heap_start: u8;
            static __heap_end: u8;
        }
        // Only the symbol addresses are used; the bytes are never read
        let heap_start = core::ptr::addr_of!(__heap_start) as usize;
        let heap_end = core::ptr::addr_of!(__heap_end) as usize;
        ALLOCATOR.init(heap_start, heap_end);
    }
    println!("[  OK  ] Heap allocator initialized (8 MB)");

    // 8. Print final boot message
    println!("Boot complete. Running at EL{}.", current_el());
}

Initialization Order Rationale

Why MMU first?

  • Identity mapping (VA = PA) means all existing addresses remain valid
  • Enables caching for performance boost throughout boot
  • Must happen before any significant memory operations

Why UART second?

  • Need UART working to print boot status messages
  • Firmware already initialized it, we just configure our driver

Why exceptions before interrupts?

  • Exception vectors must be installed before any interrupts can occur
  • IRQ handler is part of exception vector table

Why GIC before enabling IRQs?

  • GIC must be configured before CPU accepts interrupts
  • UART interrupt must be enabled in GIC before unmasking CPU IRQs

Why heap last?

  • Not needed for early initialization
  • Its only inputs are linker symbols, which are available at any point during boot, so nothing forces it earlier
  • Allocations only needed for shell and runtime features

Memory Layout During Boot

Defined in linker.ld:

0x00080000: .text.boot       (assembly entry point)
0x00080800: .text.exceptions (exception vector table, 2KB aligned)
0x00081xxx: .text            (Rust code)
0x000xxxxx: .rodata          (read-only data, string literals)
0x000xxxxx: .data            (initialized globals)
0x000xxxxx: .bss             (zero-initialized globals)
0x000xxxxx: __heap_start     (8 MB heap region)
0x00xxxxxx: __heap_end
0x00xxxxxx: (2 MB stack, grows downward)
0x00xxxxxx: _stack_start     (initial SP points here)

[Page Tables - allocated in .bss by MMU module]
L1_TABLE:       4 KB (512 entries × 8 bytes)
L2_TABLE_LOW:   4 KB (maps 0-1 GB)
L2_TABLE_MMIO:  4 KB (maps 3-4 GB)

Note: After MMU initialization, all addresses are virtual, but identity-mapped (VA = PA).
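
To illustrate how an address selects entries in these tables, here is a sketch of the index computation. It assumes a 4 KB translation granule with 2 MB block entries at L2, which is consistent with the 512-entry, 4 KB tables listed above but is not stated explicitly; the function names are hypothetical.

```rust
pub const ENTRIES_PER_TABLE: usize = 512;
/// Each table is 512 eight-byte descriptors.
pub const TABLE_BYTES: usize = ENTRIES_PER_TABLE * core::mem::size_of::<u64>();

/// Index into the L1 table: each entry covers 1 GB (VA bits [38:30]).
pub fn l1_index(va: u64) -> usize {
    ((va >> 30) & 0x1FF) as usize
}

/// Index into an L2 table: each entry covers 2 MB (VA bits [29:21]).
pub fn l2_index(va: u64) -> usize {
    ((va >> 21) & 0x1FF) as usize
}
```

Under these assumptions the kernel load address 0x80000 falls in L1 entry 0 (the 0-1 GB table) and the GENET base 0xFD580000 falls in L1 entry 3 (the 3-4 GB MMIO table).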

Exception Level Transition (EL2 → EL1)

Both QEMU and Pi 4 hardware boot at EL2 (hypervisor mode). The boot stub transitions to EL1 (kernel mode) before jumping to Rust for the following reasons:

Why EL1?

  1. Atomic instructions work correctly - At EL2, exclusive load/store semantics are undefined without proper hypervisor setup
  2. Spin locks function - Rust’s spin::Mutex (used throughout the kernel) requires working atomics
  3. Standard OS privilege level - Linux and other OSes run at EL1, not EL2
  4. Simpler exception handling - No need to manage both EL1 and EL2 exception vectors

EL1 Register Initialization: The boot stub initializes all EL1 system registers before the ERET instruction:

  • SCTLR_EL1: RES1 bits set (0x30D00800 from ARM Trusted Firmware)
  • TCR_EL1, MAIR_EL1, TTBR0_EL1, TTBR1_EL1: Zeroed (safe disabled state)
  • CPACR_EL1: FP/SIMD enabled (LLVM uses SIMD for memory operations)
  • VBAR_EL1: Exception vector table pointer

Without this initialization, EL1 registers hold UNKNOWN values on first entry to EL1, which can cause crashes.
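
The magic numbers the boot stub writes before ERET can be derived from their bit fields. The sketch below spells that out; the constant names are assumptions chosen for readability, while the bit positions are the architectural ones.

```rust
// SPSR_EL2 fields used by the boot stub.
pub const SPSR_D: u64 = 1 << 9; // debug exceptions masked
pub const SPSR_A: u64 = 1 << 8; // SError masked
pub const SPSR_I: u64 = 1 << 7; // IRQ masked
pub const SPSR_F: u64 = 1 << 6; // FIQ masked
pub const SPSR_M_EL1H: u64 = 0b0101; // return to EL1 using SP_EL1

/// Value written to SPSR_EL2: EL1h with D, A, I, F all masked.
pub const SPSR_EL2_VALUE: u64 = SPSR_D | SPSR_A | SPSR_I | SPSR_F | SPSR_M_EL1H;

/// HCR_EL2.RW: EL1 executes in AArch64 state.
pub const HCR_EL2_RW: u64 = 1 << 31;

/// Extract the exception level from a CurrentEL register value (bits [3:2]).
pub fn exception_level(current_el: u64) -> u8 {
    ((current_el >> 2) & 0b11) as u8
}
```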

Verification

Expected serial output after successful boot:

DaedalusOS v0.1.0 booting...

[  OK  ] MMU initialized (virtual memory enabled)
[  OK  ] Exception vectors installed
[  OK  ] GIC-400 interrupt controller initialized
[  OK  ] IRQs enabled (interrupt-driven I/O active)
[  OK  ] Heap allocator initialized (8 MB)

Boot complete. Running at EL1.

Welcome to DaedalusOS!
Type 'help' for available commands.

daedalus>

Boot Time

On QEMU with KVM acceleration, boot typically completes in <100ms. Real hardware boot time depends on firmware initialization (~1-2 seconds before kernel starts).

Code References

  • Assembly stub: src/arch/aarch64/boot.s
  • Rust entry: src/main.rs (_start_rust)
  • Init function: src/lib.rs (init)
  • Linker script: linker.ld

Boot Sequence Diagram

Firmware (start4.elf) @ EL2
  ↓
Load kernel8.img @ 0x80000
  ↓
Jump to _start (boot.s) @ EL2
  ├─→ Core 1-3: park in WFE loop
  └─→ Core 0: continue
       ↓
    Check CurrentEL (should be EL2)
       ↓
    Initialize EL1 system registers:
       ├─→ SCTLR_EL1 (RES1 bits)
       ├─→ TCR_EL1, MAIR_EL1, TTBR0_EL1, TTBR1_EL1 (zero)
       ├─→ CPACR_EL1 (enable FP/SIMD)
       └─→ VBAR_EL1 (exception vectors)
       ↓
    Configure HCR_EL2, SPSR_EL2, ELR_EL2
       ↓
    ERET to EL1 (drop privilege level)
       ↓
    Set SP = _stack_start @ EL1
       ↓
    Clear BSS section
       ↓
    Jump to _start_rust (main.rs)
       ↓
    Call daedalus::init()
       ├─→ MMU init (enable virtual memory + caches)
       ├─→ UART init (configure GPIO + baud rate)
       ├─→ Exception vectors (already installed)
       ├─→ GIC init (configure interrupt controller)
       ├─→ IRQ enable (unmask interrupts)
       └─→ Heap init (setup allocator)
       ↓
    Launch shell (shell::run())
       ↓
    Read-Eval-Print Loop

Performance Optimizations

After MMU initialization:

  • Data cache enabled: ~100x faster memory access for hot data
  • Instruction cache enabled: ~10-100x faster instruction fetch
  • TLB active: Fast virtual-to-physical address translation

These optimizations make the shell responsive and enable real-time interrupt handling.

Exception Handling

Status: Implemented (Milestone #7 complete)

ARMv8-A exception handling with vector table, context save/restore, and register dumps.

Overview

The exception handling system provides:

  • 16-entry exception vector table (aligned to 2048 bytes)
  • Full context save/restore (all GPRs + system registers)
  • ESR (Exception Syndrome Register) decoding
  • FAR (Fault Address Register) reporting
  • Complete register dumps on exceptions

Vector Table Structure

ARM ARM D1.10.2 specifies 16 vectors: 4 exception types × 4 sources (the exception level, stack pointer, or execution state the exception was taken from)

Offset  | Exception Type      | Taken From
--------|--------------------|-----------------
0x000   | Synchronous        | Current EL, SP0
0x080   | IRQ                | Current EL, SP0
0x100   | FIQ                | Current EL, SP0
0x180   | SError             | Current EL, SP0
0x200   | Synchronous        | Current EL, SPx  ← Normal kernel exceptions
0x280   | IRQ                | Current EL, SPx
0x300   | FIQ                | Current EL, SPx
0x380   | SError             | Current EL, SPx
0x400   | Synchronous        | Lower EL, AArch64
0x480   | IRQ                | Lower EL, AArch64
0x500   | FIQ                | Lower EL, AArch64
0x580   | SError             | Lower EL, AArch64
0x600   | Synchronous        | Lower EL, AArch32
0x680   | IRQ                | Lower EL, AArch32
0x700   | FIQ                | Lower EL, AArch32
0x780   | SError             | Lower EL, AArch32

Each vector is 128 bytes (0x80), aligned to 128-byte boundary.
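
The table follows a regular pattern: each source group spans 0x200 bytes and each exception type within it spans 0x80 bytes. A small sketch of that arithmetic, with enum and function names that are illustrative assumptions:

```rust
/// Where the exception was taken from (rows grouped by 0x200 in the table).
#[derive(Clone, Copy)]
pub enum Source {
    CurrentElSp0 = 0,
    CurrentElSpx = 1,
    LowerElAArch64 = 2,
    LowerElAArch32 = 3,
}

/// Exception type within a group (0x80 apart).
#[derive(Clone, Copy)]
pub enum Kind {
    Synchronous = 0,
    Irq = 1,
    Fiq = 2,
    SError = 3,
}

/// Byte offset of a vector from the table base (VBAR).
pub const fn vector_offset(source: Source, kind: Kind) -> usize {
    (source as usize) * 0x200 + (kind as usize) * 0x80
}
```

With 4-byte instructions, each 0x80-byte slot holds at most 32 instructions, which is why the slots normally just branch to a larger handler.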

Exception Types

  • Synchronous: Instruction aborts, data aborts, syscalls (SVC), breakpoints (BRK)
  • IRQ: Normal interrupts (disabled until GIC configured)
  • FIQ: Fast interrupts (disabled)
  • SError: Asynchronous system errors

Context Save/Restore

Assembly macros save all registers to stack-allocated ExceptionContext:

#[repr(C)]
pub struct ExceptionContext {
    // General purpose registers (fields x2 through x29 elided)
    x0: u64, x1: u64, ..., x30: u64,

    // System registers
    elr_el1: u64,   // Exception Link Register (return address)
    spsr_el1: u64,  // Saved Program Status Register
}

Size: 33 registers × 8 bytes = 264 bytes per exception
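
The size claim can be checked at compile time. The sketch below uses an array for the 31 general-purpose registers instead of individual fields, which has the same `repr(C)` layout; the struct name is hypothetical and this is not the kernel's actual definition.

```rust
use core::mem::size_of;

/// Stand-in for ExceptionContext with the GPRs collapsed into an array.
#[repr(C)]
pub struct ExceptionContextSketch {
    pub gpr: [u64; 31], // x0 through x30
    pub elr_el1: u64,   // return address
    pub spsr_el1: u64,  // saved program status
}

// Fails to compile if the layout ever drifts from 33 × 8 bytes.
const _: () = assert!(size_of::<ExceptionContextSketch>() == 264);
```

Note that 264 is not a multiple of 16. Since AArch64 requires SP to be 16-byte aligned when used for memory access, the assembly save macro would likely reserve 272 bytes of stack for this structure.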

ESR Decoding

Exception Syndrome Register (ESR_EL1) bits [31:26] contain exception class:

EC Value | Exception Class
---------|-----------------------------------
0x00     | Unknown reason
0x01     | Trapped WFI/WFE
0x07     | SVE/SIMD/FP access
0x15     | SVC (syscall) from AArch64
0x18     | Trapped MSR/MRS/system instruction
0x20     | Instruction abort from lower EL
0x21     | Instruction abort from same EL
0x24     | Data abort from lower EL
0x25     | Data abort from same EL
0x2C     | Floating point exception
0x3C     | BRK instruction (breakpoint)

Full list: 40+ exception classes decoded in src/arch/aarch64/exceptions.rs.
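
A cut-down sketch of the decoding, covering only the classes listed in the table. The function names are deliberately different from the kernel's `exception_class_str`, since this is an illustration rather than a copy of it.

```rust
/// Extract the exception class (EC) from ESR bits [31:26].
pub fn exception_class(esr: u64) -> u8 {
    ((esr >> 26) & 0x3F) as u8
}

/// Human-readable name for the subset of classes in the table above.
pub fn exception_class_name(ec: u8) -> &'static str {
    match ec {
        0x00 => "Unknown reason",
        0x01 => "Trapped WFI/WFE",
        0x07 => "SVE/SIMD/FP access",
        0x15 => "SVC (syscall) from AArch64",
        0x18 => "Trapped MSR/MRS/system instruction",
        0x20 => "Instruction abort from lower EL",
        0x21 => "Instruction abort from same EL",
        0x24 => "Data abort from lower EL",
        0x25 => "Data abort from same EL",
        0x2C => "Floating point exception",
        0x3C => "BRK instruction (breakpoint)",
        _ => "Class not covered by this sketch",
    }
}
```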

FAR (Fault Address Register)

For memory access exceptions (instruction/data aborts, alignment faults):

  • FAR_EL1 contains the faulting virtual address
  • FAR_EL2 used when running at EL2 (QEMU)

Installation

Vector table installed during daedalus::init():

use core::arch::asm;

pub unsafe fn install_vector_table() {
    // Address of the table symbol defined in the exception assembly
    let vbar_addr = &exception_vector_table as *const _ as u64;

    // Set VBAR_EL1 or VBAR_EL2 based on current exception level
    if current_el() == 2 {
        asm!("msr vbar_el2, {}", in(reg) vbar_addr);
    } else {
        asm!("msr vbar_el1, {}", in(reg) vbar_addr);
    }
}

Exception Flow

  1. Exception occurs (e.g., BRK instruction, data abort)
  2. CPU jumps to appropriate vector (e.g., offset 0x200 for synchronous at current EL)
  3. Assembly stub saves all registers to stack
  4. Rust handler called with &ExceptionContext
  5. Handler prints exception info: type, ESR, FAR, all registers
  6. Panic (current behavior - no recovery yet)

Testing

Shell Command

daedalus> exception

Triggers BRK instruction, prints full exception dump.

Test Suite

cargo test

Runs the full test suite, including the exception vector installation test.

Known Issues

EL2 vs EL1 Discrepancy:

  • QEMU boots at EL2, real hardware boots at EL1
  • Assembly hardcodes EL1 register saves (ELR_EL1, SPSR_EL1)
  • In QEMU: ELR/SPSR show as zero (GPRs and FAR are correct)
  • Code checks current_el() and reads FAR_EL2 when at EL2

Future Fix: Make exception assembly EL-agnostic or drop to EL1 during boot.

Code References

  • Vector table: src/arch/aarch64/exceptions.s
  • Context struct: src/arch/aarch64/exceptions.rs
  • ESR decoding: src/arch/aarch64/exceptions.rs (exception_class_str)
  • Installation: src/arch/aarch64/exceptions.rs (install_vector_table)

Linker Script

File: linker.ld

The linker script controls how the kernel binary is laid out in memory at link time.

Key Decisions

Entry Address: 0x80000

Pi 4 firmware loads kernel8.img to physical address 0x80000 and jumps there. This is the firmware's default for 64-bit kernels, not a design choice made by DaedalusOS.

Why this address?

  • Historical: ARM bootloaders have used 0x8000 or 0x80000
  • Pi firmware convention: kernel8.img → 64-bit mode → 0x80000
  • Well below MMIO window (0xFE000000), plenty of room for DRAM

Exception Vector Alignment: 2048 bytes

The ARM architecture requires exception vector tables be aligned to 2048 bytes (0x800).

Why?

  • VBAR_EL1 register ignores low 11 bits when setting vector base
  • ARM ARM D1.10.2: “aligned to 0x800 (2048 bytes)”
  • Linker enforces this: .text.exceptions : ALIGN(0x800)

Memory Layout Order

.text.boot        ← First! Firmware jumps to 0x80000
.text.exceptions  ← Exception vectors (aligned to 0x800)
.text             ← Main Rust code
.rodata           ← String literals, const data
.data             ← Initialized globals
.bss              ← Zero-initialized globals
Heap (8 MB)       ← Managed by the bump allocator
Stack (2 MB)      ← Grows downward from top

Why this order?

  • Boot code must be first (entry point)
  • Exceptions need special alignment, easier before main code
  • Standard ELF convention: code → rodata → data → bss
  • Stack at end makes overflow detection easier (future)

Heap Size: 8 MB

This region backs the bump allocator described in the Heap Allocator section.

Why 8 MB?

  • Enough for shell history, command buffers, future features
  • Small enough not to waste RAM on a 1 GB Pi 4
  • Can be adjusted based on actual usage

Stack Size: 2 MB

Why 2 MB?

  • Conservative estimate for deep call stacks
  • Exception handlers save ~264 bytes per exception
  • Future: Will split per-core when enabling SMP

Symbol Exports

The linker script defines symbols used by the boot code and the heap allocator:

Symbol                      Used By          Purpose
__bss_start, __bss_end      boot.s           Clear BSS loop bounds
__heap_start, __heap_end    Heap allocator   Heap memory region
_stack_start                boot.s           Initial stack pointer

How they’re used:

  • Boot assembly reads the BSS symbols to know which range to zero
  • The allocator reads the heap bounds during init() to set up its region
  • No runtime overhead: these are link-time addresses

Alignment Requirements

  • BSS: 16-byte aligned (ARM AAPCS calling convention)
  • Stack: 16-byte aligned (ARM AAPCS requirement for function calls)
  • Heap: 16-byte aligned (allocation efficiency)
  • Exception vectors: 2048-byte aligned (ARM architectural requirement)

Future Changes

When Adding MMU (Phase 2/3)

Will need to align sections to page boundaries:

  • 4 KiB alignment for small pages
  • 2 MB alignment for large pages/sections

Example: . = ALIGN(4096); before each major section.

When Adding Multi-Core (Phase 3)

Will need per-core stacks:

.stack (NOLOAD) : {
    . = ALIGN(16);
    . += (0x200000 * 4);  /* 2 MB × 4 cores */
    _stack_start = .;
}

Debugging Tips

“Kernel doesn’t boot”:

  • Check entry address is 0x80000: readelf -h kernel.elf | grep Entry
  • Verify .text.boot is first: readelf -S kernel.elf | head -20

“Exception handling broken”:

  • Check vector alignment: readelf -SW kernel.elf | grep exceptions
  • The section address must be a multiple of 0x800, and the alignment column (Al) should read 2048

“Stack overflow”:

  • Add guards: __stack_end symbol for overflow detection
  • Reduce stack size or increase in linker script

External References

  • ARM AAPCS (calling convention): Requires 16-byte stack alignment
  • ARM ARM D1.10.2: Vector table alignment requirement
  • ELF specification: Standard section ordering

Heap Allocator

DaedalusOS uses a simple bump allocator for heap memory management, providing dynamic allocation capabilities for Rust’s alloc crate.

Overview

The bump allocator is the simplest form of memory allocator:

  • Maintains a pointer that “bumps” forward with each allocation
  • Individual deallocations are no-ops
  • Memory is only reclaimed when the entire allocator is reset
  • Fast O(1) allocation time
  • Minimal overhead and complexity

This design is ideal for kernel workloads where:

  • Memory is frequently allocated but rarely freed individually
  • Simple, predictable behavior is preferred over complex memory management
  • Performance and code size matter more than memory reuse

Implementation

Location: src/mm/allocator.rs

The BumpAllocator struct implements:

  • GlobalAlloc trait for Rust’s standard allocator interface
  • Thread-safe access using spin::Mutex
  • Proper alignment handling for all allocations
  • Memory tracking (total size, used, free)

Memory Layout

Heap Start (0x00880000)                          Heap End (0x01080000)
    |                                                        |
    v                                                        v
    [=============== Allocated ===============][=== Free ===]
                                               ^
                                               |
                                            next pointer

The allocator manages an 8 MB region defined by linker symbols:

  • __heap_start: Beginning of heap region (after BSS section)
  • __heap_end: End of heap region (before stack)
  • next: Current allocation pointer (bumps forward)

Initialization

The heap is initialized during kernel startup in lib.rs::init():

unsafe {
    // Rust 2024 requires extern blocks to be marked unsafe
    unsafe extern "C" {
        static __heap_start: u8;
        static __heap_end: u8;
    }
    // Only the addresses of the linker symbols are used, never their values
    let heap_start = &__heap_start as *const u8 as usize;
    let heap_end = &__heap_end as *const u8 as usize;
    ALLOCATOR.init(heap_start, heap_end);
}

Safety Invariants

The allocator relies on several safety invariants:

  1. init() is called exactly once before any allocations
  2. The heap region [heap_start, heap_end) is valid, properly aligned memory
  3. The region is reserved exclusively for heap use (no overlap with code/stack)
  4. heap_start < heap_end (enforced by linker script)

Debug assertions catch configuration errors during development.

Allocation Strategy

Alignment

All allocations are properly aligned according to the requested Layout:

let alloc_start = (next + layout.align() - 1) & !(layout.align() - 1);

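Because layout.align() is always a power of two, this rounds next up to the nearest multiple of the requested alignment, so every returned pointer satisfies the Layout passed to alloc().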
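
The rounding expression can be tried in isolation. This is a small sketch with an illustrative helper name (align_up is not necessarily what the allocator calls it), and it assumes align is a non-zero power of two, which Layout guarantees.

```rust
/// Round `addr` up to the next multiple of `align`.
/// `align` must be a non-zero power of two.
pub const fn align_up(addr: usize, align: usize) -> usize {
    // Adding align - 1 pushes any unaligned address past the next boundary;
    // masking off the low bits then snaps it back onto that boundary.
    (addr + align - 1) & !(align - 1)
}
```

An address that is already aligned is returned unchanged, so no space is wasted when consecutive allocations share the same alignment. A production allocator would typically also guard the addition against overflow.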

Out of Memory

When the heap is exhausted:

  1. alloc() returns a null pointer
  2. The caller (for example Box or Vec) calls handle_alloc_error, which invokes the #[alloc_error_handler]
  3. The kernel panics with allocation error details
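
The bump-and-check logic behind this behaviour can be sketched with plain address arithmetic. BumpSketch below is a hypothetical stand-in that only tracks addresses and never touches memory; the real BumpAllocator implements GlobalAlloc and returns a null pointer where this sketch returns None.

```rust
/// Address-only model of a bump allocator.
pub struct BumpSketch {
    next: usize, // next free address
    end: usize,  // one past the last usable address
}

impl BumpSketch {
    pub const fn new(start: usize, end: usize) -> Self {
        Self { next: start, end }
    }

    /// Reserve `size` bytes aligned to `align` (a power of two).
    /// Returns the start address, or None when the region is exhausted.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let new_next = start.checked_add(size)?;
        if new_next > self.end {
            return None; // out of memory: the caller reports the error
        }
        self.next = new_next;
        Some(start)
    }

    /// Bytes still available.
    pub fn free(&self) -> usize {
        self.end - self.next
    }
}
```

Using the heap bounds from the memory layout diagram, BumpSketch::new(0x0088_0000, 0x0108_0000) starts with exactly 8 MB free, and a request larger than the remaining space fails without moving the pointer.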

Deallocation

The bump allocator does not free individual allocations:

unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
    // No-op: bump allocator never deallocates individual allocations
}

This is a deliberate design choice that trades memory efficiency for simplicity and speed.

Usage Examples

The allocator enables Rust’s standard collections:

use alloc::boxed::Box;
use alloc::vec::Vec;
use alloc::string::String;

// Heap-allocated value
let value = Box::new(42);

// Dynamic array
let mut vec = Vec::new();
vec.push(1);
vec.push(2);

// Owned string
let mut s = String::from("Hello");
s.push_str(", World!");

Shell History

The interactive shell uses the allocator for command history:

use alloc::string::String;
use alloc::vec::Vec;
use spin::Mutex;

static HISTORY: Mutex<Vec<String>> = Mutex::new(Vec::new());

fn handle_line(line: &str) {
    let mut history = HISTORY.lock();
    history.push(String::from(line));
}

Monitoring

The allocator provides runtime statistics:

ALLOCATOR.heap_size()  // Total heap capacity (8 MB)
ALLOCATOR.used()       // Bytes allocated so far
ALLOCATOR.free()       // Bytes remaining

These are exposed through the meminfo shell command.

Testing

The allocator is tested in src/lib.rs with:

  • test_box_allocation - Box allocation
  • test_vec_allocation - Vec creation and push
  • test_string_allocation - String concatenation
  • test_vec_with_capacity - Pre-allocated capacity
  • test_allocator_stats - Usage tracking

All tests run in QEMU during cargo test.

Future Improvements

Potential enhancements for later phases:

  • Free list allocator - Reuse deallocated memory
  • Slab allocator - Fixed-size pools for common allocations
  • Per-CPU allocators - Reduce contention in SMP
  • Memory pressure callbacks - Allow cleanup when low on memory

For now, the bump allocator provides a solid foundation for Phase 2 development.

References

  • Code: src/mm/allocator.rs (164 lines)
  • Linker symbols: linker.ld defines __heap_start and __heap_end
  • Integration: src/lib.rs - initialization and global allocator registration
  • Rust allocator API: GlobalAlloc trait documentation

MMU & Paging

Status: ✅ Implemented (Phase 3, Milestone #10)

Memory Management Unit and virtual memory configuration for DaedalusOS.

Overview

The MMU provides:

  • Virtual memory: 39-bit address space (512 GB)
  • Memory protection: Separate attributes for kernel and MMIO regions
  • Cache control: Cacheable normal memory, non-cacheable device memory
  • Foundation for userspace: Ready for EL0 isolation (Phase 4)

Implementation Details

Address Space Configuration

  • Virtual address size: 39 bits (512 GB)
  • Page granule: 4 KB
  • Translation levels: 3 (L1, L2, L3)
  • Mapping strategy: Identity mapping (VA = PA)

Why these choices?

  • 39-bit VA requires only 3 page table levels (vs 4 for 48-bit)
  • 4 KB pages are universally supported and efficient
  • Identity mapping simplifies boot and hardware access

Translation Table Structure

L1 Table (512 entries, each covers 1 GB):
  ├─ Entry 0 → L2_TABLE_LOW (0-1 GB region)
  ├─ Entry 1-2 → Unmapped
  └─ Entry 3 → L2_TABLE_MMIO (3-4 GB region)

L2_TABLE_LOW (512 entries, each covers 2 MB):
  ├─ Entry 0-511 → 2 MB blocks, Normal memory (0-1 GB)
  └─ Attributes: Cacheable, Inner Shareable, EL1 RW

L2_TABLE_MMIO (512 entries, each covers 2 MB):
  ├─ Entry 0-511 → 2 MB blocks, Device memory (3-4 GB)
  └─ Attributes: Device-nGnRnE, Non-shareable, EL1 RW
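
The table walk for this layout can be sketched as index extraction plus block-descriptor encoding. The constant and function names below are illustrative; the bit positions follow the ARMv8-A stage 1 block descriptor format, and the real setup_page_tables may set additional bits (for example execute-never on device memory).

```rust
const INDEX_MASK: u64 = 0x1FF; // 512 entries per table

/// L1 index: which 1 GB region the address falls in.
pub const fn l1_index(va: u64) -> usize {
    ((va >> 30) & INDEX_MASK) as usize
}

/// L2 index: which 2 MB block inside that 1 GB region.
pub const fn l2_index(va: u64) -> usize {
    ((va >> 21) & INDEX_MASK) as usize
}

const DESC_BLOCK: u64 = 0b01;           // bits [1:0]: valid block descriptor
const ATTR_INDEX_DEVICE: u64 = 0 << 2;  // AttrIndx = 0 (MAIR Attr0)
const ATTR_INDEX_NORMAL: u64 = 1 << 2;  // AttrIndx = 1 (MAIR Attr1)
const SH_INNER: u64 = 0b11 << 8;        // Inner Shareable
const ACCESS_FLAG: u64 = 1 << 10;       // AF: avoid an access-flag fault on first use
const BLOCK_2MB_MASK: u64 = !((1 << 21) - 1);

/// 2 MB block of Normal memory, EL1 read/write (AP bits left at 0b00).
pub const fn normal_block(pa: u64) -> u64 {
    (pa & BLOCK_2MB_MASK) | ACCESS_FLAG | SH_INNER | ATTR_INDEX_NORMAL | DESC_BLOCK
}

/// 2 MB block of Device memory, EL1 read/write, non-shareable encoding.
pub const fn device_block(pa: u64) -> u64 {
    (pa & BLOCK_2MB_MASK) | ACCESS_FLAG | ATTR_INDEX_DEVICE | DESC_BLOCK
}
```

With these helpers, the kernel load address 0x80000 resolves to L1 entry 0 and L2 entry 0, while an address inside the MMIO window such as 0xFE201000 resolves to L1 entry 3 and L2 entry 497.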

Memory Mappings

Virtual Address          Physical Address   Size     Type     Description
0x00000000-0x3FFFFFFF    Same (identity)    1 GB     Normal   Kernel code, data, heap, DRAM
0xFE000000-0xFF800000    Same (identity)    ~24 MB   Device   MMIO peripherals (UART, GIC, etc.)

Memory Attributes (MAIR_EL1)

Attr0 (Device):  0x00 = Device-nGnRnE
  - Non-Gathering: Each access is separate
  - Non-Reordering: Access order is preserved
  - No Early-ack: Wait for completion

Attr1 (Normal):  0xFF = Normal, Write-Back, Read/Write-Allocate
  - Inner/Outer cacheable
  - Write-back policy
  - Allocate on read and write

Reference: ARM ARM Section D4.4.4, Table D4-17

Translation Control (TCR_EL1)

T0SZ     = 25     → 2^(64-25) = 512 GB address space
TG0      = 4 KB   → Page granule size
SH0      = Inner Shareable (for SMP)
ORGN0/IRGN0 = Write-Back Write-Allocate
IPS      = 40-bit → 1 TB physical address support

Reference: ARM ARM Section D4.2.6, Table D4-11
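
The MAIR_EL1 value and the TTBR0-related TCR_EL1 fields listed above can be assembled from their bit positions. This sketch uses illustrative names and covers only the low 16 bits of TCR_EL1 (T0SZ, IRGN0, ORGN0, SH0, TG0); IPS and the TTBR1 fields are left out.

```rust
pub const MAIR_DEVICE_NGNRNE: u64 = 0x00; // Attr0
pub const MAIR_NORMAL_WB: u64 = 0xFF;     // Attr1

/// Each attribute occupies one byte: Attr0 in bits [7:0], Attr1 in bits [15:8].
pub const fn mair_value() -> u64 {
    MAIR_DEVICE_NGNRNE | (MAIR_NORMAL_WB << 8)
}

/// TTBR0 fields of TCR_EL1 for a given virtual address width.
pub const fn tcr_ttbr0_fields(va_bits: u64) -> u64 {
    let t0sz = 64 - va_bits;  // bits [5:0]: 25 for a 39-bit space
    let irgn0 = 0b01 << 8;    // inner Write-Back Write-Allocate
    let orgn0 = 0b01 << 10;   // outer Write-Back Write-Allocate
    let sh0 = 0b11 << 12;     // Inner Shareable
    let tg0 = 0b00 << 14;     // 4 KB granule
    t0sz | irgn0 | orgn0 | sh0 | tg0
}

/// Size in bytes of the address space selected by T0SZ.
pub const fn address_space_bytes(t0sz: u64) -> u64 {
    1u64 << (64 - t0sz)
}
```

The results, 0xFF00 for MAIR_EL1 and 0x3519 for the low half of TCR_EL1, agree with the low 16 bits of the register values printed by the mmu shell command.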

System Registers

The MMU uses these AArch64 system registers:

  • SCTLR_EL1: System Control Register

    • Bit 0 (M): MMU enable
    • Bit 2 (C): Data cache enable
    • Bit 12 (I): Instruction cache enable
  • TTBR0_EL1: Translation Table Base Register

    • Points to L1 translation table
    • Must be 4 KB aligned
  • MAIR_EL1: Memory Attribute Indirection Register

    • Defines 8 memory attribute encodings
    • Referenced by page table entries
  • TCR_EL1: Translation Control Register

    • Configures address space size, granule, cacheability

Reference: ARM ARM Section C5.2 (System Registers)

Initialization Sequence

The MMU is initialized during kernel startup in src/lib.rs:init():

  1. Set up translation tables (setup_page_tables)

    • Initialize L1, L2_LOW, and L2_MMIO tables
    • Create identity mappings for kernel and MMIO
  2. Configure memory attributes (MAIR_EL1)

    • Attr0: Device-nGnRnE for MMIO
    • Attr1: Normal WB for kernel/DRAM
  3. Configure translation control (TCR_EL1)

    • Set address space size (39-bit)
    • Configure granule size (4 KB)
    • Enable caching and shareability
  4. Set translation table base (TTBR0_EL1)

    • Point to L1 table physical address
  5. Enable MMU (SCTLR_EL1)

    • Set M bit (MMU enable)
    • Set C bit (data cache enable)
    • Set I bit (instruction cache enable)
  6. Synchronization barriers

    • DSB SY: Ensure all writes complete
    • ISB: Flush instruction pipeline

Code location: src/arch/aarch64/mmu.rs

Shell Commands

Use the mmu command to inspect MMU status:

daedalus> mmu
MMU (Memory Management Unit) Status:

  Status: ENABLED

  Translation Table Base (TTBR0_EL1): 0x00000000000A5000

  Translation Control (TCR_EL1): 0x0000000080803519
    Virtual address size: 39 bits (512 GB)
    Page granule: 4 KB

  Memory Attributes (MAIR_EL1): 0x000000000000FF00
    Attr0 (Device): 0x00 (Device-nGnRnE)
    Attr1 (Normal):  0xFF (Normal WB RW-Allocate)

  Memory Mappings (Identity):
    0x00000000-0x3FFFFFFF → Normal memory (kernel + DRAM)
    0xFE000000-0xFF800000 → Device memory (MMIO)

Design Decisions

Why Identity Mapping?

We use identity mapping (VA = PA) instead of higher-half kernel because:

  1. Boot simplicity: No address space switch needed during MMU enablement
  2. No relocation: Kernel code/data/linker symbols work without modification
  3. Clear debugging: Virtual address = physical hardware address
  4. Standard for bare-metal: Easier to reason about hardware access

Future work can add higher-half mapping (e.g., kernel at 0xFFFF_8000_0000_0000+) without changing MMIO access patterns.

Why 2 MB Blocks (Not 4 KB Pages)?

We use 2 MB block mappings at L2 instead of 4 KB pages at L3 because:

  1. Fewer TLB entries: Larger blocks = fewer Translation Lookaside Buffer entries
  2. Simpler page tables: No need for L3 tables (saves 2 MB per GB mapped)
  3. Sufficient granularity: We don’t need fine-grained protection yet
  4. Performance: Fewer page table walks

We can add L3 tables later for fine-grained memory protection (e.g., read-only .text, no-execute heap).

Future Enhancements

Phase 4: Userspace (EL0)

  • Add separate TTBR1_EL1 for kernel space
  • Configure EL0 access permissions
  • Map user programs with restricted permissions
  • Implement copy-on-write for processes

Phase 5: Fine-Grained Protection

  • Add L3 tables for 4 KB page granularity
  • Make .text section read-only and executable
  • Make .rodata section read-only
  • Make heap/stack non-executable (NX)

Phase 6: Higher-Half Kernel

  • Map kernel to high addresses (0xFFFF_8000_0000_0000+)
  • Keep MMIO at low addresses (identity mapped)
  • Allows full lower address space for userspace

Debugging

Common Issues

MMU doesn’t enable (SCTLR_EL1.M = 0):

  • Check TTBR0_EL1 points to valid page table
  • Verify page table entries are valid (descriptor type bits)
  • Ensure TCR_EL1 is correctly configured

Data abort on MMU enable:

  • Check page table covers all accessed addresses
  • Verify MAIR_EL1 attributes match page table AttrIndx
  • Ensure stack/heap are in mapped regions

Cache coherency issues:

  • Add DSB/ISB barriers after page table modifications
  • Invalidate TLB after changes (TLBI instruction)

Useful ARM Instructions

MRS x0, SCTLR_EL1    // Read system control
MRS x0, TTBR0_EL1    // Read table base
MRS x0, TCR_EL1      // Read translation control
MRS x0, MAIR_EL1     // Read memory attributes
TLBI VMALLE1         // Invalidate all TLB entries
DC CIVAC, x0         // Clean and invalidate data cache by VA

Testing Framework

DaedalusOS uses a custom test framework for bare-metal testing in QEMU. This document explains how the testing system works and how to write tests.

Why a Custom Test Framework?

Rust’s standard test framework (#[test]) requires the std library, which is not available in bare-metal environments (#![no_std]). DaedalusOS implements a custom framework that:

  • Runs tests directly on bare-metal (in QEMU)
  • Provides test output via UART serial console
  • Exits QEMU with success/failure status codes
  • Supports the same test patterns as standard Rust tests

Architecture

Test Harness Entry Point

The test harness is defined in src/lib.rs:

#[cfg(test)]
#[unsafe(no_mangle)]  // Rust 2024 requires the unsafe(...) form
pub extern "C" fn _start() -> ! {
    kernel_init();
    test_main();
    qemu::exit_success();
}

When running cargo test, this replaces the normal kernel entry point and:

  1. Initializes the kernel (UART, MMU, interrupts, etc.)
  2. Runs all test functions
  3. Exits QEMU with success status if all tests pass

Test Runner

The test_main() function discovers and runs all tests:

#[cfg(test)]
fn test_main() {
    println!("\nrunning {} tests\n", TEST_CASES.len());

    for test in TEST_CASES {
        test.run();
    }

    println!("\ntest result: ok. {} passed\n", TEST_CASES.len());
}

Test Case Registration

Tests are registered using the #[test_case] attribute macro (NOT #[test]):

#[test_case]
fn test_example() {
    assert_eq!(2 + 2, 4);
}

The #[test_case] attribute:

  1. Marks the function as a test
  2. Adds it to the TEST_CASES static array
  3. Wraps it with test runner logic (name printing, panic handling)

CRITICAL: Always use #[test_case], never #[test]. Using #[test] will cause compilation errors because the standard library test crate is not available.

Writing Tests

Unit Tests

Place unit tests in a tests module within the file being tested:

#[cfg(test)]
mod tests {
    use super::*;

    #[test_case]
    fn test_mac_address_new() {
        let mac = MacAddress::new([0xB8, 0x27, 0xEB, 0x12, 0x34, 0x56]);
        assert_eq!(mac.0, [0xB8, 0x27, 0xEB, 0x12, 0x34, 0x56]);
    }

    #[test_case]
    fn test_mac_address_broadcast() {
        let mac = MacAddress::broadcast();
        assert!(mac.is_broadcast());
        assert_eq!(mac.0, [0xFF; 6]);
    }
}

Integration Tests

Integration tests are placed in the main test module in src/lib.rs:

#[test_case]
fn test_println_simple() {
    println!("test_println_simple output");
}

#[test_case]
fn test_kernel_init() {
    kernel_init();  // Should not panic
}

Assertions

All standard Rust assertion macros work:

assert!(condition);
assert_eq!(left, right);
assert_ne!(left, right);
debug_assert!(condition);  // Only in debug builds

When an assertion fails, the test panics and the panic handler prints the error message.

Test Organization

Pure Function Tests

Test pure functions (no hardware interaction) extensively:

#[test_case]
fn test_ethernet_frame_parse() {
    let mut buffer = [0u8; 64];
    buffer[0..6].copy_from_slice(&[0xFF; 6]);  // Dest MAC
    buffer[6..12].copy_from_slice(&[0xB8, 0x27, 0xEB, 0x12, 0x34, 0x56]);  // Src MAC
    buffer[12..14].copy_from_slice(&[0x08, 0x06]);  // EtherType (ARP)
    buffer[14..20].copy_from_slice(b"Hello!");  // Payload

    let frame = EthernetFrame::parse(&buffer[..20]).unwrap();

    assert_eq!(frame.dest_mac, MacAddress::broadcast());
    assert_eq!(frame.ethertype, ETHERTYPE_ARP);
    assert_eq!(frame.payload, b"Hello!");
}

Why test pure functions?

  • No hardware required (works in QEMU)
  • Fast to run (no delays or I/O)
  • Deterministic (same input always gives same output)
  • High code coverage achievable

Hardware Tests

Test hardware that QEMU supports:

#[test_case]
fn test_uart_write_byte() {
    uart::write_byte(b'A');
    uart::write_byte(b'B');
    uart::write_byte(b'C');
    println!();  // Newline
}

#[test_case]
fn test_timer_counter_increments() {
    let before = SystemTimer::counter();
    SystemTimer::delay_us(100);
    let after = SystemTimer::counter();
    assert!(after > before);
}

Skipping Hardware-Only Tests

Do NOT use #[ignore] for hardware-only tests. If a test can only run on real hardware (not QEMU), don’t write it as a test case. Instead:

  1. Test the pure functions that will be used on hardware
  2. Create diagnostic commands for manual hardware testing (e.g., eth-diag)

Rationale: The project is never built or tested on actual hardware via cargo test, so ignored tests serve no purpose and create maintenance burden.

Example of the wrong approach:

#[test_case]
#[ignore]  // ❌ Don't do this
fn test_ethernet_tx_on_hardware() {
    // This test will never run in CI or during development
}

Example of the right approach:

// src/net/ethernet.rs - Test the pure functions
#[test_case]
fn test_ethernet_frame_write() {
    let frame = EthernetFrame::new(dest, src, ETHERTYPE_IPV4, payload);
    let mut buffer = [0u8; 128];
    let size = frame.write_to(&mut buffer).unwrap();
    // Verify serialization is correct
}

// src/drivers/net/ethernet/broadcom/genet.rs - Provide diagnostic command
pub fn diagnostic(&self) -> bool {
    println!("[DIAG] Checking Ethernet hardware...");
    // Step-by-step hardware validation with verbose output
}

Running Tests

All Tests

cargo test

This runs all tests in QEMU and shows output like:

running 65 tests

test daedalus::net::ethernet::tests::test_mac_address_new ... ok
test daedalus::net::ethernet::tests::test_mac_address_broadcast ... ok
test daedalus::drivers::timer::tests::test_delay_us_actually_delays ... ok
...

test result: ok. 65 passed

Specific Test Module

cargo test --test <test_name>

Test Output

Tests print to UART, which appears in the console:

  • Test names as they run
  • Assertion failures with file:line information
  • Final pass/fail summary

QEMU Exit Codes

The test framework uses QEMU semihosting to exit with status codes:

  • Exit code 0: All tests passed
  • Exit code 1: Test failure or panic
  • Exit code 2: QEMU error

See src/qemu.rs for implementation details.

Deterministic Timing Tests

Some timing tests may be flaky in CI environments due to host load. Enable deterministic mode:

QEMU_DETERMINISTIC=1 cargo test

This uses QEMU’s -icount flag to decouple the guest clock from the host, making timing reproducible at the cost of 10-100x slower execution.

Current timing tests use 25% tolerance to handle normal CI variability without needing this flag.
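
A tolerance check of this kind can be written as a small pure helper. The sketch below is an assumption about how such a check might look: the function name is hypothetical, and it treats the tolerance as symmetric around the expected value, which the actual timer tests may not do.

```rust
/// Returns true when `measured_us` is within `tolerance_percent` of `expected_us`.
pub fn within_tolerance(measured_us: u64, expected_us: u64, tolerance_percent: u64) -> bool {
    // Allowed deviation in microseconds, e.g. 250 us for 1000 us at 25%.
    let slack = expected_us * tolerance_percent / 100;
    let lower = expected_us.saturating_sub(slack);
    let upper = expected_us + slack;
    measured_us >= lower && measured_us <= upper
}
```

Keeping the comparison in a pure function means the tolerance logic itself can be tested without any timer hardware.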

Test Statistics (Milestone #12)

Current test coverage:

Category            Tests   Description
Network protocols   30      Ethernet frames, MAC addresses, ARP packets
GENET driver        4       Register offsets, MDIO encoding, PHY constants
Timer               6       Counter, delays, uptime, monotonicity
Allocator           6       Box, Vec, String, capacity, stats
UART                6       Write byte/string, newlines, locking
Shell               5       Command parsing, whitespace handling
Formatting          5       println!, integers, padding, Debug trait
Exception           1       Vector installation
Kernel init         2       Initialization, version output
Total               65      All passing in QEMU

Troubleshooting

Error: “can’t find crate for ‘test’”

Problem: You used #[test] instead of #[test_case].

Solution: Replace all #[test] with #[test_case]:

# In the affected file:
sed -i 's/#\[test\]/#[test_case]/g' src/path/to/file.rs

Error: “no tests to run”

Problem: Tests not registered in TEST_CASES array.

Solution: Ensure you’re using #[test_case] attribute, not #[test] or custom test functions.

Test Hangs in QEMU

Problem: Test enters infinite loop or waits forever.

Solution:

  1. Use cargo test with default timeout (2 minutes)
  2. Check for blocking operations (e.g., MDIO reads with no hardware)
  3. Add timeout to cargo test invocation: timeout 30 cargo test

Timing Test Flakiness

Problem: Tests like test_delay_us_actually_delays fail intermittently.

Solution: Use QEMU_DETERMINISTIC=1 or increase tolerance in assertions.

Best Practices

  1. Use #[test_case], not #[test] - This is the most common mistake
  2. Test pure functions extensively - No hardware = fast, reliable tests
  3. Use diagnostic commands for hardware - Better than ignored tests
  4. Keep tests fast - Avoid long delays unless necessary
  5. Test edge cases - Empty inputs, boundary values, invalid data
  6. Use descriptive test names - test_mac_address_broadcast not test_mac1
  7. Group related tests - One #[cfg(test)] mod tests per module
  8. Document non-obvious tests - Explain what you’re testing and why

Network Protocol Stack

Modules: src/net/ethernet.rs, src/net/arp.rs, src/drivers/net/netdev.rs

Status: Protocol parsing and device abstraction implemented

Testing: 66 unit tests passing (30 protocol + 36 other)


Overview

DaedalusOS implements a lightweight network protocol stack for Ethernet networking. The current implementation includes:

  • Device Abstraction: NetworkDevice trait for hardware portability
  • Layer 2 Protocols: Ethernet II frames and ARP
  • GENET Driver: BCM2711 Ethernet controller (Pi 4)

This provides the foundation for future IP/TCP/UDP support via smoltcp.

Architecture Layers

┌────────────────────────────────────────┐
│      Application Layer                 │
│  (Future: HTTP, DNS, DHCP, etc.)       │
└────────────────┬───────────────────────┘
                 │
┌────────────────┴───────────────────────┐
│      Transport Layer                   │
│  (Future: TCP, UDP via smoltcp)        │
└────────────────┬───────────────────────┘
                 │
┌────────────────┴───────────────────────┐
│      Network Layer                     │
│  (Future: IPv4, IPv6, ICMP)            │
└────────────────┬───────────────────────┘
                 │
┌────────────────┴───────────────────────┐
│      Data Link Layer (CURRENT)         │
│  • Ethernet II Frames                  │  ← src/net/ethernet.rs
│  • ARP (Address Resolution)            │  ← src/net/arp.rs
└────────────────┬───────────────────────┘
                 │
┌────────────────┴───────────────────────┐
│      Physical Layer                    │
│  • GENET MAC Controller                │  ← src/drivers/net/ethernet/broadcom/genet.rs
│  • BCM54213PE PHY Chip                 │
└────────────────────────────────────────┘

Current Implementation Scope

✅ Implemented:

  • Device Abstraction: NetworkDevice trait for multiple hardware implementations
  • Hardware Driver: GENET v5 controller (Pi 4) with trait implementation
  • Ethernet II frame parsing and construction
  • MAC address representation and validation
  • ARP packet parsing and construction
  • ARP request/reply generation
  • Network byte order handling (big-endian)

❌ Not Yet Implemented (Coming in Milestone #13+):

  • Actual frame transmission/reception (hardware TX/RX)
  • ARP cache management
  • IP protocol (IPv4/IPv6)
  • Transport protocols (TCP/UDP via smoltcp)
  • Application protocols

Network Device Abstraction

Module: src/drivers/net/netdev.rs

The NetworkDevice trait provides a hardware-independent interface for Ethernet network devices. This abstraction enables:

  • Hardware portability: Support multiple Ethernet controllers (Pi 4 GENET, future Pi 5, QEMU mock)
  • Testing: Mock devices for protocol testing without hardware
  • smoltcp integration: Clean interface for TCP/IP stack (Milestone #16)

See ADR-003: Network Device Abstraction for design rationale.

NetworkDevice Trait

pub trait NetworkDevice {
    /// Check if hardware is present (false in QEMU)
    fn is_present(&self) -> bool;

    /// Initialize device (reset MAC, configure PHY, set up buffers)
    fn init(&mut self) -> Result<(), NetworkError>;

    /// Transmit Ethernet frame (blocking, 60-1514 bytes)
    fn transmit(&mut self, frame: &[u8]) -> Result<(), NetworkError>;

    /// Receive frame (non-blocking, returns None if no frame available)
    fn receive(&mut self) -> Option<&[u8]>;

    /// Get device MAC address
    fn mac_address(&self) -> MacAddress;

    /// Check link status (optional, default: false)
    fn link_up(&self) -> bool { false }
}

Error Handling

pub enum NetworkError {
    HardwareNotPresent,   // Device not detected
    NotInitialized,       // init() not called yet
    TxBufferFull,         // Hardware TX queue full
    FrameTooLarge,        // Frame > 1514 bytes
    FrameTooSmall,        // Frame < 60 bytes
    HardwareError,        // MAC/PHY error
    Timeout,              // Operation timeout
    InvalidConfiguration, // Bad parameters
}

Current Implementations

GenetController (Raspberry Pi 4)

use daedalus::drivers::net::ethernet::broadcom::genet::GenetController;
use daedalus::drivers::net::netdev::NetworkDevice;

let mut netdev = GenetController::new();

// Check hardware presence (returns false in QEMU)
if netdev.is_present() {
    netdev.init()?;

    // Get MAC address
    let mac = netdev.mac_address();

    // Check link status (reads PHY BMSR register)
    if netdev.link_up() {
        // Transmit frame (Milestone #13)
        netdev.transmit(&frame)?;

        // Receive frame (Milestone #13)
        if let Some(frame) = netdev.receive() {
            // Process frame
        }
    }
}

Hardware: BCM2711 GENET v5 Ethernet MAC controller PHY: BCM54213PE Gigabit Ethernet transceiver

MockNetworkDevice (Future - Milestone #14)

Planned mock implementation for QEMU testing:

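use alloc::collections::VecDeque;
use alloc::vec::Vec;

pub struct MockNetworkDevice {
    rx_queue: VecDeque<Vec<u8>>,  // Injected RX frames (FIFO order)
    rx_current: Option<Vec<u8>>,  // Frame most recently handed to the caller
    tx_captured: Vec<Vec<u8>>,    // Captured TX frames
    mac: MacAddress,
}

impl NetworkDevice for MockNetworkDevice {
    fn is_present(&self) -> bool { true }  // Always present

    fn init(&mut self) -> Result<(), NetworkError> { Ok(()) }  // Nothing to set up

    fn transmit(&mut self, frame: &[u8]) -> Result<(), NetworkError> {
        self.tx_captured.push(frame.to_vec());  // Capture for testing
        Ok(())
    }

    fn receive(&mut self) -> Option<&[u8]> {
        // Keep the dequeued frame in rx_current so the returned slice
        // borrows from self rather than from a temporary
        self.rx_current = self.rx_queue.pop_front();
        self.rx_current.as_deref()
    }

    fn mac_address(&self) -> MacAddress { self.mac }
}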

This will enable network protocol testing in QEMU without real hardware.

Design Decisions

Why blocking transmit?

  • Simplifies initial implementation (interrupts come in Milestone #14)
  • Common pattern (Linux ndo_start_xmit, smoltcp)
  • API remains stable when adding interrupt-driven I/O

Why non-blocking receive?

  • Protocol stacks poll in loops (e.g., loop { if let Some(f) = receive() { ... } })
  • Matches smoltcp’s token-based API expectations
  • No thread blocking in bare-metal single-core environment

Why single-frame API (no queues)?

  • Implementations use hardware ring buffers internally (GENET)
  • Trait stays simple and focused
  • Protocol stacks manage their own packet buffers

Why frame size validation (60-1514 bytes)?

  • Enforces IEEE 802.3 Ethernet constraints at trait level
  • Prevents invalid frames from reaching hardware
  • Source: IEEE 802.3 Ethernet standard
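
The size check itself is small. The sketch below uses a local, hypothetical error enum so that it stands alone; in the kernel the corresponding variants are NetworkError::FrameTooSmall and NetworkError::FrameTooLarge.

```rust
/// 14-byte header + 46-byte minimum payload (CRC is appended by hardware).
pub const MIN_FRAME_LEN: usize = 60;
/// 14-byte header + 1500-byte maximum payload.
pub const MAX_FRAME_LEN: usize = 1514;

#[derive(Debug, PartialEq, Eq)]
pub enum FrameLenError {
    TooSmall,
    TooLarge,
}

/// Reject frames outside the 60-1514 byte range before they reach hardware.
pub fn check_frame_len(frame: &[u8]) -> Result<(), FrameLenError> {
    match frame.len() {
        n if n < MIN_FRAME_LEN => Err(FrameLenError::TooSmall),
        n if n > MAX_FRAME_LEN => Err(FrameLenError::TooLarge),
        _ => Ok(()),
    }
}
```

Both limits exclude the 4-byte CRC, which is why they are 4 bytes smaller than the 64-1518 byte on-wire frame sizes.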

Ethernet II Frames

Module: src/net/ethernet.rs

Ethernet II is the standard frame format for modern Ethernet networks. It consists of a 14-byte header followed by payload data.

Frame Structure

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
┌───────────────────────────────────────────────────────────────┐
│                    Destination MAC Address                    │
│                         (6 bytes)                             │
├───────────────────────────────────────────────────────────────┤
│                      Source MAC Address                       │
│                         (6 bytes)                             │
├───────────────────────────────┬───────────────────────────────┤
│        EtherType (2)          │          Payload ...          │
├───────────────────────────────┴───────────────────────────────┤
│                                                               │
│                       Payload Data                            │
│                    (46-1500 bytes)                            │
│                                                               │
├───────────────────────────────────────────────────────────────┤
│                         CRC (4 bytes)                         │
│                    (Calculated by hardware)                   │
└───────────────────────────────────────────────────────────────┘

Total: 64-1518 bytes (excluding preamble/SFD)

Field Descriptions:

  • Destination MAC: 48-bit address of the recipient (or broadcast FF:FF:FF:FF:FF:FF)
  • Source MAC: 48-bit address of the sender
  • EtherType: 16-bit protocol identifier (big-endian)
    • 0x0800 = IPv4
    • 0x0806 = ARP
    • 0x86DD = IPv6
  • Payload: Protocol data (46-1500 bytes, padded if needed)
  • CRC: Frame check sequence (calculated and verified by hardware)
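
The "padded if needed" rule for short payloads can be sketched as follows. Whether padding is done by `write_to` or left to the MAC is not stated here, so treat `write_padded` as a hypothetical helper rather than the project's API.

```rust
const HEADER_SIZE: usize = 14;
const MIN_PAYLOAD_SIZE: usize = 46;
const MAX_PAYLOAD_SIZE: usize = 1500;

/// Writes header + payload into `buf`, zero-padding the payload up to 46 bytes.
/// Returns the number of bytes used, or None if the payload is too large
/// or `buf` is too small.
pub fn write_padded(
    header: &[u8; HEADER_SIZE],
    payload: &[u8],
    buf: &mut [u8],
) -> Option<usize> {
    if payload.len() > MAX_PAYLOAD_SIZE {
        return None;
    }
    let padded_len = payload.len().max(MIN_PAYLOAD_SIZE);
    let total = HEADER_SIZE + padded_len;
    if buf.len() < total {
        return None;
    }
    let payload_end = HEADER_SIZE + payload.len();
    buf[..HEADER_SIZE].copy_from_slice(header);
    buf[HEADER_SIZE..payload_end].copy_from_slice(payload);
    // Zero the padding explicitly so stale buffer contents never reach the wire.
    buf[payload_end..total].fill(0);
    Some(total)
}
```

With a 28-byte ARP payload this yields a 60-byte frame: 14 bytes of header, 28 bytes of ARP, and 18 bytes of zero padding.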

MAC Address Representation

MacAddress Type

#![allow(unused)]
fn main() {
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct MacAddress(pub [u8; 6]);
}

A MAC address is a 48-bit (6-byte) unique hardware identifier. It’s displayed in colon-separated hexadecimal format: B8:27:EB:12:34:56.

Special Addresses

#![allow(unused)]
fn main() {
// Broadcast - send to all devices on the network
let broadcast = MacAddress::broadcast(); // FF:FF:FF:FF:FF:FF

// Zero address - used in ARP requests for unknown MAC
let zero = MacAddress::zero(); // 00:00:00:00:00:00

// Check address type
if mac.is_broadcast() { /* ... */ }
if mac.is_multicast() { /* Bit 0 of first byte is 1 */ }
if mac.is_unicast() { /* Not multicast */ }
}
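
The address-type checks above reduce to two byte-level tests. A minimal sketch, using a hypothetical stand-in type `Mac` rather than the project's actual `MacAddress` implementation:

```rust
/// Stand-in for `MacAddress`; the name `Mac` is used only in this sketch.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Mac(pub [u8; 6]);

impl Mac {
    pub const BROADCAST: Mac = Mac([0xFF; 6]);

    /// True when all six bytes are 0xFF.
    pub fn is_broadcast(&self) -> bool {
        self.0 == [0xFF; 6]
    }

    /// True when the I/G bit (least significant bit of the first byte) is set.
    /// Under this definition broadcast is a special case of multicast.
    pub fn is_multicast(&self) -> bool {
        (self.0[0] & 0x01) != 0
    }

    /// True for every address that is not multicast.
    pub fn is_unicast(&self) -> bool {
        !self.is_multicast()
    }
}
```

Because broadcast also has the I/G bit set, code that needs to tell the two apart should test `is_broadcast()` before `is_multicast()`.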

Parsing and Display

#![allow(unused)]
fn main() {
// Parse from string
let mac: MacAddress = "B8:27:EB:12:34:56".parse().unwrap();

// Display
println!("MAC: {}", mac); // Prints: "B8:27:EB:12:34:56"

// Access bytes
let bytes = mac.as_bytes(); // &[u8; 6]
}
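
One way the parsing and display behaviour above could be implemented using only `core`, so it also works under `#![no_std]`. `Mac` and `MacParseError` are hypothetical names for this sketch, and the strict two-hex-digits-per-group rule is an assumption, not a statement about the project's parser.

```rust
use core::fmt;
use core::str::FromStr;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Mac(pub [u8; 6]);

#[derive(Debug, PartialEq, Eq)]
pub struct MacParseError;

impl FromStr for Mac {
    type Err = MacParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut bytes = [0u8; 6];
        let mut parts = s.split(':');
        for byte in bytes.iter_mut() {
            let part = parts.next().ok_or(MacParseError)?;
            // Require exactly two hex digits per group ("B8", not "8" or "+B").
            if part.len() != 2 || !part.bytes().all(|b| b.is_ascii_hexdigit()) {
                return Err(MacParseError);
            }
            *byte = u8::from_str_radix(part, 16).map_err(|_| MacParseError)?;
        }
        // Reject trailing groups such as a seventh octet.
        if parts.next().is_some() {
            return Err(MacParseError);
        }
        Ok(Mac(bytes))
    }
}

impl fmt::Display for Mac {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let b = &self.0;
        write!(
            f,
            "{:02X}:{:02X}:{:02X}:{:02X}:{:02X}:{:02X}",
            b[0], b[1], b[2], b[3], b[4], b[5]
        )
    }
}
```

Implementing `FromStr` is what makes the `"...".parse()` call work, and returning a `Result` lets callers handle malformed input instead of panicking on `unwrap()`.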

Ethernet Frame Handling

EthernetFrame Type

#![allow(unused)]
fn main() {
pub struct EthernetFrame<'a> {
    pub dest_mac: MacAddress,
    pub src_mac: MacAddress,
    pub ethertype: u16,        // Host-order value; serialized big-endian on the wire
    pub payload: &'a [u8],
}
}

The frame uses a lifetime 'a because the payload is a borrowed slice - it doesn’t own the data, just references it.

Frame Constants

#![allow(unused)]
fn main() {
const HEADER_SIZE: usize = 14;         // Dest MAC + Src MAC + EtherType
const MIN_PAYLOAD_SIZE: usize = 46;    // Minimum payload (padded if needed)
const MAX_PAYLOAD_SIZE: usize = 1500;  // MTU (Maximum Transmission Unit)
const MIN_FRAME_SIZE: usize = 60;      // 14 + 46 (excluding CRC)
const MAX_FRAME_SIZE: usize = 1514;    // 14 + 1500 (excluding CRC)
}

Verification: IEEE 802.3 Ethernet standard defines minimum frame size of 64 bytes and maximum of 1518 bytes (including 4-byte FCS/CRC). Excluding the CRC: 60 bytes minimum, 1514 bytes maximum.

Creating a Frame

#![allow(unused)]
fn main() {
use daedalus::net::ethernet::*;

// Create frame
let frame = EthernetFrame::new(
    MacAddress::broadcast(),                      // Destination
    MacAddress::new([0xB8, 0x27, 0xEB, 1, 2, 3]), // Source
    ETHERTYPE_ARP,                                 // Protocol
    &payload_data,                                 // Data
);

// Serialize to buffer
let mut buffer = [0u8; 1518];
let size = frame.write_to(&mut buffer).unwrap();

// Now buffer[0..size] contains the raw frame
}

Parsing a Frame

#![allow(unused)]
fn main() {
// Receive raw frame from hardware
let raw_frame: &[u8] = receive_from_hardware();

// Parse
if let Some(frame) = EthernetFrame::parse(raw_frame) {
    println!("From: {}", frame.src_mac);
    println!("To: {}", frame.dest_mac);

    match frame.ethertype {
        ETHERTYPE_ARP => handle_arp(frame.payload),
        ETHERTYPE_IPV4 => handle_ipv4(frame.payload),
        _ => println!("Unknown protocol: {:#06X}", frame.ethertype),
    }
}
}

Byte Order Handling

CRITICAL: Network protocols use big-endian byte order, but the Pi 4's AArch64 cores run in little-endian mode, so multi-byte integer fields must be converted explicitly.

#![allow(unused)]
fn main() {
// WRONG - sends in little-endian
let ethertype = 0x0806u16;
buffer[12] = (ethertype & 0xFF) as u8;        // 0x06
buffer[13] = ((ethertype >> 8) & 0xFF) as u8; // 0x08

// CORRECT - sends in big-endian
let ethertype_bytes = ethertype.to_be_bytes(); // [0x08, 0x06]
buffer[12..14].copy_from_slice(&ethertype_bytes);
}

The EthernetFrame implementation handles this automatically:

#![allow(unused)]
fn main() {
// Write ethertype (big-endian)
let ethertype_bytes = self.ethertype.to_be_bytes();
buffer[12..14].copy_from_slice(&ethertype_bytes);

// Parse ethertype (big-endian)
let ethertype = u16::from_be_bytes([buffer[12], buffer[13]]);
}

EtherType Values

#![allow(unused)]
fn main() {
pub const ETHERTYPE_IPV4: u16 = 0x0800;  // Internet Protocol v4
pub const ETHERTYPE_ARP: u16 = 0x0806;   // Address Resolution Protocol
pub const ETHERTYPE_IPV6: u16 = 0x86DD;  // Internet Protocol v6
}

Verification: All values confirmed against official IANA IEEE 802 Numbers registry.

Sources:

  • IANA IEEE 802 Numbers
  • RFC 9542 (authoritative reference for these protocols)
  • Verified against: src/net/ethernet.rs constants

ARP (Address Resolution Protocol)

Module: src/net/arp.rs

ARP is used to map IP addresses to MAC addresses on a local network. When you want to send a packet to IP 192.168.1.1, ARP determines the MAC address of that device.

ARP Packet Structure

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
┌───────────────────────────────┬───────────────────────────────┐
│      Hardware Type (2)        │      Protocol Type (2)        │
│         (0x0001)              │         (0x0800)              │
├───────────────┬───────────────┼───────────────────────────────┤
│  HW Addr Len  │ Proto Addr Len│        Operation (2)          │
│      (1)      │      (1)      │     (1=Req, 2=Reply)          │
├───────────────┴───────────────┴───────────────────────────────┤
│                  Sender Hardware Address                      │
│                      (6 bytes - MAC)                          │
├───────────────────────────────────────────────────────────────┤
│            Sender Protocol Address (4 bytes - IPv4)           │
├───────────────────────────────────────────────────────────────┤
│                  Target Hardware Address                      │
│                      (6 bytes - MAC)                          │
├───────────────────────────────────────────────────────────────┤
│            Target Protocol Address (4 bytes - IPv4)           │
└───────────────────────────────────────────────────────────────┘

Total: 28 bytes (for Ethernet/IPv4)

Note: This packet is carried as the payload of an Ethernet frame with EtherType 0x0806.

Verification: ARP packet structure confirmed against RFC 826. Total size for Ethernet/IPv4 is 28 bytes:

  • Fixed header: 8 bytes (hardware type, protocol type, lengths, operation)
  • Addresses: 20 bytes (6+4+6+4)

ARP Operation Types

#![allow(unused)]
fn main() {
#[repr(u16)]
pub enum ArpOperation {
    Request = 1,  // "Who has IP X? Tell IP Y"
    Reply = 2,    // "IP X is at MAC Z"
}
}

Verification: Operation codes confirmed from RFC 826:

  • ares_op$REQUEST = 1
  • ares_op$REPLY = 2

ARP Constants (verified from RFC 826):

#![allow(unused)]
fn main() {
pub const ARP_HARDWARE_ETHERNET: u16 = 1;    // ares_hrd$Ethernet
pub const ARP_PROTOCOL_IPV4: u16 = 0x0800;   // ether_type$DOD_INTERNET
}

ARP Request Example

Scenario: Device A (192.168.1.100) wants to communicate with device B (192.168.1.1) but doesn’t know B’s MAC address.

Ethernet Frame:

Dest MAC: FF:FF:FF:FF:FF:FF (broadcast - everyone receives it)
Src MAC:  B8:27:EB:12:34:56 (Device A)
EtherType: 0x0806 (ARP)

ARP Packet:

Hardware Type: 0x0001 (Ethernet)
Protocol Type: 0x0800 (IPv4)
HW Addr Len: 6
Proto Addr Len: 4
Operation: 1 (Request)
Sender MAC: B8:27:EB:12:34:56 (Device A)
Sender IP: 192.168.1.100
Target MAC: 00:00:00:00:00:00 (unknown - that's what we're asking)
Target IP: 192.168.1.1 (who we're looking for)

Human Translation: “This is B8:27:EB:12:34:56 at 192.168.1.100. Who has 192.168.1.1? Please tell me!”
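
The request above maps onto the 28-byte wire format field by field. The sketch below builds those bytes directly from the RFC 826 layout; `build_arp_request` is a hypothetical free function for illustration, not the `ArpPacket` API.

```rust
/// Serializes an Ethernet/IPv4 ARP request into its 28-byte wire format.
pub fn build_arp_request(
    sender_mac: [u8; 6],
    sender_ip: [u8; 4],
    target_ip: [u8; 4],
) -> [u8; 28] {
    let mut pkt = [0u8; 28];
    pkt[0..2].copy_from_slice(&1u16.to_be_bytes());      // Hardware type: Ethernet
    pkt[2..4].copy_from_slice(&0x0800u16.to_be_bytes()); // Protocol type: IPv4
    pkt[4] = 6;                                          // Hardware address length
    pkt[5] = 4;                                          // Protocol address length
    pkt[6..8].copy_from_slice(&1u16.to_be_bytes());      // Operation: request
    pkt[8..14].copy_from_slice(&sender_mac);             // Sender MAC
    pkt[14..18].copy_from_slice(&sender_ip);             // Sender IP
    // pkt[18..24] stays zero: the target MAC is what we are asking for.
    pkt[24..28].copy_from_slice(&target_ip);             // Target IP
    pkt
}
```

For Device A's request the first eight bytes come out as `00 01 08 00 06 04 00 01`, followed by the sender MAC, sender IP, six zero bytes, and the target IP.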

ARP Reply Example

Response: Device B (192.168.1.1) sends a unicast reply to Device A.

Ethernet Frame:

Dest MAC: B8:27:EB:12:34:56 (Device A - unicast, not broadcast)
Src MAC:  AA:BB:CC:DD:EE:FF (Device B)
EtherType: 0x0806 (ARP)

ARP Packet:

Hardware Type: 0x0001 (Ethernet)
Protocol Type: 0x0800 (IPv4)
HW Addr Len: 6
Proto Addr Len: 4
Operation: 2 (Reply)
Sender MAC: AA:BB:CC:DD:EE:FF (Device B - the answer!)
Sender IP: 192.168.1.1
Target MAC: B8:27:EB:12:34:56 (Device A)
Target IP: 192.168.1.100

Human Translation: “I’m AA:BB:CC:DD:EE:FF at 192.168.1.1. This is for you, B8:27:EB:12:34:56!”

Using the ARP API

Creating an ARP Request

#![allow(unused)]
fn main() {
use daedalus::net::arp::*;
use daedalus::net::ethernet::*;

// We are 192.168.1.100, asking for 192.168.1.1
let our_mac = MacAddress::new([0xB8, 0x27, 0xEB, 0x12, 0x34, 0x56]);
let our_ip = [192, 168, 1, 100];
let target_ip = [192, 168, 1, 1];

// Create ARP request
let arp_request = ArpPacket::request(our_mac, our_ip, target_ip);

// Serialize to buffer
let mut arp_buffer = [0u8; 28];
arp_request.write_to(&mut arp_buffer).unwrap();

// Wrap in Ethernet frame (broadcast)
let eth_frame = EthernetFrame::new(
    MacAddress::broadcast(),  // Send to everyone
    our_mac,
    ETHERTYPE_ARP,
    &arp_buffer,
);

// Serialize and send
let mut frame_buffer = [0u8; 64];
let size = eth_frame.write_to(&mut frame_buffer).unwrap();
send_frame(&frame_buffer[..size]);
}

Handling an ARP Request

#![allow(unused)]
fn main() {
// Receive Ethernet frame
let eth_frame = EthernetFrame::parse(received_data)?;

// Check if it's ARP
if eth_frame.ethertype == ETHERTYPE_ARP {
    // Parse ARP packet
    if let Some(arp) = ArpPacket::parse(eth_frame.payload) {
        match arp.operation {
            ArpOperation::Request => {
                // Is this request for our IP?
                if arp.target_ip == our_ip {
                    // Send ARP reply
                    let reply = ArpPacket::reply(
                        our_mac,           // Sender MAC (us)
                        our_ip,            // Sender IP (us)
                        arp.sender_mac,    // Target MAC (them)
                        arp.sender_ip,     // Target IP (them)
                    );

                    send_arp_reply(reply);
                }
            }
            ArpOperation::Reply => {
                // Update ARP cache
                cache.insert(arp.sender_ip, arp.sender_mac);
            }
        }
    }
}
}

Creating an ARP Reply

#![allow(unused)]
fn main() {
// Responding to a request
let reply = ArpPacket::reply(
    our_mac,              // Who we are
    our_ip,
    requesters_mac,       // Who asked
    requesters_ip,
);

// Serialize
let mut buffer = [0u8; 28];
reply.write_to(&mut buffer).unwrap();

// Wrap in Ethernet frame (unicast to requester)
let frame = EthernetFrame::new(
    requesters_mac,  // Direct reply, not broadcast
    our_mac,
    ETHERTYPE_ARP,
    &buffer,
);
}

ARP Packet Display

The ArpPacket type implements Display for debugging:

#![allow(unused)]
fn main() {
println!("{}", arp_packet);

// Output:
// ARP Request - Who has 192.168.1.1? Tell 192.168.1.100 (B8:27:EB:12:34:56)
// ARP Reply - Who has 192.168.1.100? Tell 192.168.1.1 (AA:BB:CC:DD:EE:FF)
}

Network Byte Order

CRITICAL CONCEPT: Network protocols use big-endian byte order (most significant byte first), but the Pi 4's AArch64 cores run in little-endian mode (least significant byte first).

The Problem

#![allow(unused)]
fn main() {
// A 16-bit value: 0x0806 (ARP EtherType)
//
// In memory on ARM (little-endian):
//   buffer[0] = 0x06  (low byte)
//   buffer[1] = 0x08  (high byte)
//
// On the network (big-endian):
//   byte 0 = 0x08 (high byte)
//   byte 1 = 0x06 (low byte)
}

The Solution

Rust provides conversion functions:

#![allow(unused)]
fn main() {
// Convert to big-endian (for sending)
let value: u16 = 0x0806;
let bytes = value.to_be_bytes();  // [0x08, 0x06] - correct for network

// Convert from big-endian (for receiving)
let bytes: [u8; 2] = [0x08, 0x06];
let value = u16::from_be_bytes(bytes);  // 0x0806
}

In Practice

#![allow(unused)]
fn main() {
// Writing a 16-bit field to a network packet
let ethertype: u16 = 0x0806;
buffer[12..14].copy_from_slice(&ethertype.to_be_bytes());

// Reading a 16-bit field from a network packet
let ethertype = u16::from_be_bytes([buffer[12], buffer[13]]);
}

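Rule of Thumb: Any multi-byte integer field in a network protocol should be written with .to_be_bytes() and read with u16::from_be_bytes() (or the u32 equivalent). Byte arrays such as MAC and IPv4 addresses are already in wire order and are copied as-is.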


Frame Processing Pipeline

Transmission (TX) Path

Application
    │
    ├─► Create protocol packet (e.g., ARP)
    │   └─► Serialize to buffer
    │
    ├─► Wrap in Ethernet frame
    │   ├─► Set destination MAC
    │   ├─► Set source MAC (our MAC)
    │   ├─► Set EtherType
    │   └─► Add payload
    │
    ├─► Serialize frame to buffer
    │   └─► Handle byte order conversion
    │
    └─► Send to hardware
        └─► GENET TX (future implementation)

Reception (RX) Path

Hardware
    │
    ├─► GENET RX (future implementation)
    │
    ├─► Parse Ethernet frame
    │   ├─► Extract dest MAC
    │   ├─► Extract src MAC
    │   ├─► Extract EtherType
    │   └─► Extract payload
    │
    ├─► Filter by destination
    │   ├─► Is it for us? (our MAC or broadcast)
    │   └─► Ignore if not for us
    │
    ├─► Dispatch by EtherType
    │   ├─► 0x0806 → ARP handler
    │   ├─► 0x0800 → IPv4 handler (future)
    │   └─► Other → Log and drop
    │
    └─► Protocol handler
        └─► Parse protocol packet
            └─► Process and respond if needed

Testing Strategy

Unit Tests

The network protocol modules have comprehensive unit tests (30 tests total):

MAC Address Tests (12 tests):

  • Construction and constants
  • Broadcast/multicast detection
  • String parsing and display
  • Validation

Ethernet Frame Tests (6 tests):

  • Frame parsing from raw bytes
  • Frame serialization to bytes
  • Roundtrip (serialize → parse)
  • Buffer size validation
  • EtherType constants

ARP Packet Tests (12 tests):

  • Request/reply creation
  • Packet parsing
  • Packet serialization
  • Roundtrip
  • Invalid packet handling
  • Display formatting

Running Tests

# Run all tests
cargo test

# Run only network tests
cargo test --lib net

# Run specific test
cargo test test_arp_request_creation

All tests run in QEMU without requiring real hardware.

Test Coverage

What’s Tested:

  • ✅ Data structure creation and initialization
  • ✅ Parsing from raw bytes
  • ✅ Serialization to raw bytes
  • ✅ Byte order conversion
  • ✅ Validation and error handling
  • ✅ Display/formatting

What’s Not Tested (requires hardware or later milestones):

  • ❌ Actual frame transmission
  • ❌ Actual frame reception
  • ❌ ARP cache management
  • ❌ Network timeouts and retries

Future Extensions

ARP Cache

An ARP cache stores IP-to-MAC mappings to avoid repeated ARP requests:

#![allow(unused)]
fn main() {
struct ArpEntry {
    ip: [u8; 4],
    mac: MacAddress,
    timestamp: u64,  // For expiration (typical: 60 seconds)
}

struct ArpCache {
    entries: Vec<ArpEntry>,
}

impl ArpCache {
    fn lookup(&self, ip: [u8; 4]) -> Option<MacAddress> { /* ... */ }
    fn insert(&mut self, ip: [u8; 4], mac: MacAddress) { /* ... */ }
    fn remove_expired(&mut self, current_time: u64) { /* ... */ }
}
}
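
The stubbed methods above could be filled in roughly as follows. This is a sketch under stated assumptions: `[u8; 6]` stands in for `MacAddress`, timestamps are seconds since boot, `insert` takes the current time as an extra parameter, and `ARP_TTL_SECS` is a hypothetical constant based on the 60-second figure mentioned above.

```rust
/// One cached mapping. `[u8; 6]` stands in for `MacAddress` in this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ArpEntry {
    pub ip: [u8; 4],
    pub mac: [u8; 6],
    pub timestamp: u64, // Seconds since boot when the entry was last refreshed
}

/// Assumed entry lifetime in seconds.
pub const ARP_TTL_SECS: u64 = 60;

#[derive(Default)]
pub struct ArpCache {
    entries: Vec<ArpEntry>, // In the kernel this would be alloc::vec::Vec
}

impl ArpCache {
    /// Returns the cached MAC for `ip`, if any.
    pub fn lookup(&self, ip: [u8; 4]) -> Option<[u8; 6]> {
        self.entries.iter().find(|e| e.ip == ip).map(|e| e.mac)
    }

    /// Inserts a mapping, or refreshes the existing entry for the same IP.
    pub fn insert(&mut self, ip: [u8; 4], mac: [u8; 6], now: u64) {
        match self.entries.iter_mut().find(|e| e.ip == ip) {
            Some(entry) => {
                entry.mac = mac;
                entry.timestamp = now;
            }
            None => self.entries.push(ArpEntry { ip, mac, timestamp: now }),
        }
    }

    /// Drops entries whose age has reached ARP_TTL_SECS.
    pub fn remove_expired(&mut self, current_time: u64) {
        self.entries
            .retain(|e| current_time.saturating_sub(e.timestamp) < ARP_TTL_SECS);
    }
}
```

A linear scan over a `Vec` is adequate for the handful of hosts on a typical LAN segment; a bounded capacity with eviction would be worth adding before exposing the cache to untrusted traffic.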

Gratuitous ARP

A gratuitous ARP is an ARP request for your own IP address. It’s used to:

  • Announce your presence on the network
  • Update other devices’ ARP caches
  • Detect IP address conflicts

#![allow(unused)]
fn main() {
// Send gratuitous ARP (announce our presence)
let gratuitous = ArpPacket::request(our_mac, our_ip, our_ip);
send_broadcast(gratuitous);
}

IPv4 Integration

When IPv4 is implemented, ARP will be used automatically:

#![allow(unused)]
fn main() {
// Application wants to send IP packet to 192.168.1.1
fn send_ip_packet(dest_ip: [u8; 4], payload: &[u8]) {
    // Look up MAC address
    let dest_mac = match arp_cache.lookup(dest_ip) {
        Some(mac) => mac,
        None => {
            // Send ARP request and wait for reply
            send_arp_request(dest_ip);
            wait_for_arp_reply(dest_ip, timeout)
        }
    };

    // Now we can send the packet
    send_ethernet_frame(dest_mac, ETHERTYPE_IPV4, payload);
}
}

Proxy ARP

A device can respond to ARP requests on behalf of another device (used in routing):

#![allow(unused)]
fn main() {
// If we're a router, we might answer ARP for devices on other networks
if arp.operation == ArpOperation::Request {
    if should_proxy_for(arp.target_ip) {
        let reply = ArpPacket::reply(
            our_mac,           // We answer with our MAC
            arp.target_ip,     // But claim to be the target IP
            arp.sender_mac,
            arp.sender_ip,
        );
        send_reply(reply);
    }
}
}

Common Patterns

Pattern 1: Receiving and Dispatching

#![allow(unused)]
fn main() {
fn handle_received_frame(raw_data: &[u8]) {
    // Parse Ethernet frame
    let frame = match EthernetFrame::parse(raw_data) {
        Some(f) => f,
        None => {
            println!("Invalid Ethernet frame");
            return;
        }
    };

    // Filter by destination
    if !frame.dest_mac.is_broadcast() && frame.dest_mac != OUR_MAC {
        // Not for us
        return;
    }

    // Dispatch by protocol
    match frame.ethertype {
        ETHERTYPE_ARP => handle_arp(&frame),
        ETHERTYPE_IPV4 => handle_ipv4(&frame),
        _ => println!("Unknown EtherType: {:#06X}", frame.ethertype),
    }
}
}

Pattern 2: Sending a Protocol Message

#![allow(unused)]
fn main() {
fn send_arp_request(target_ip: [u8; 4]) -> Result<(), Error> {
    // Create ARP packet
    let arp = ArpPacket::request(OUR_MAC, OUR_IP, target_ip);

    // Serialize ARP
    let mut arp_buffer = [0u8; 28];
    arp.write_to(&mut arp_buffer)?;

    // Wrap in Ethernet frame
    let frame = EthernetFrame::new(
        MacAddress::broadcast(),
        OUR_MAC,
        ETHERTYPE_ARP,
        &arp_buffer,
    );

    // Serialize frame
    let mut frame_buffer = [0u8; 64];
    let size = frame.write_to(&mut frame_buffer)?;

    // Send to hardware
    genet.transmit(&frame_buffer[..size])
}
}

Pattern 3: Processing ARP Requests

#![allow(unused)]
fn main() {
fn handle_arp(frame: &EthernetFrame) {
    // Parse ARP packet
    let arp = match ArpPacket::parse(frame.payload) {
        Some(a) => a,
        None => return,
    };

    match arp.operation {
        ArpOperation::Request => {
            // Is this for us?
            if arp.target_ip == OUR_IP {
                // Send reply
                let reply = ArpPacket::reply(
                    OUR_MAC,
                    OUR_IP,
                    arp.sender_mac,
                    arp.sender_ip,
                );
                send_arp_reply(reply);
            }
        }
        ArpOperation::Reply => {
            // Update cache
            println!("Learned: {} is at {}",
                     format_ip(&arp.sender_ip),
                     arp.sender_mac);
            arp_cache.insert(arp.sender_ip, arp.sender_mac);
        }
    }
}
}

References

Implementation References

  • smoltcp: TCP/IP stack for embedded systems (planned for future integration)

  • Linux Kernel: Networking stack

    • net/ethernet/eth.c - Ethernet handling
    • net/ipv4/arp.c - ARP implementation

Debugging Tips

Viewing Raw Bytes

#![allow(unused)]
fn main() {
fn dump_frame(data: &[u8]) {
    println!("Frame dump ({} bytes):", data.len());
    for (i, chunk) in data.chunks(16).enumerate() {
        print!("{:04X}: ", i * 16);
        for byte in chunk {
            print!("{:02X} ", byte);
        }
        println!();
    }
}
}

Verifying Byte Order

#![allow(unused)]
fn main() {
// Check that the EtherType decodes correctly
let raw = [0x08, 0x06]; // Network bytes
let value = u16::from_be_bytes(raw);
assert_eq!(value, 0x0806); // ARP; if this fails, the byte order is wrong

// Decoding the same bytes as little-endian swaps them (DON'T DO THIS)
let wrong = u16::from_le_bytes(raw);
assert_eq!(wrong, 0x0608); // Backwards!
}

Packet Capture Simulation

#![allow(unused)]
fn main() {
// Save frames to analyze with Wireshark later
fn save_pcap(frames: &[Vec<u8>], filename: &str) {
    // Write PCAP file format
    // Can load in Wireshark for detailed analysis
}
}
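
The stub above could be fleshed out along these lines. Since the kernel has no filesystem, this sketch serializes into a byte buffer instead of a file; `write_pcap` is a hypothetical name, the layout is the classic libpcap file format, and the timestamps are placeholders (frame index as seconds).

```rust
/// Appends a classic libpcap global header plus one record per frame to `out`.
pub fn write_pcap(frames: &[&[u8]], out: &mut Vec<u8>) {
    // Global header (24 bytes), little-endian to match the magic number below.
    out.extend_from_slice(&0xA1B2_C3D4u32.to_le_bytes()); // Magic: microsecond timestamps
    out.extend_from_slice(&2u16.to_le_bytes());           // Version major
    out.extend_from_slice(&4u16.to_le_bytes());           // Version minor
    out.extend_from_slice(&0i32.to_le_bytes());           // Timezone offset (unused)
    out.extend_from_slice(&0u32.to_le_bytes());           // Timestamp accuracy (unused)
    out.extend_from_slice(&65_535u32.to_le_bytes());      // Snapshot length
    out.extend_from_slice(&1u32.to_le_bytes());           // Link type 1 = Ethernet

    for (i, frame) in frames.iter().enumerate() {
        let len = frame.len() as u32;
        // Record header (16 bytes), followed by the raw frame bytes.
        out.extend_from_slice(&(i as u32).to_le_bytes()); // ts_sec (placeholder)
        out.extend_from_slice(&0u32.to_le_bytes());       // ts_usec
        out.extend_from_slice(&len.to_le_bytes());        // Captured length
        out.extend_from_slice(&len.to_le_bytes());        // Original length
        out.extend_from_slice(frame);
    }
}
```

The resulting bytes could be hex-dumped over the serial console and converted back to a file on the host for inspection in Wireshark.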

Common Issues

Issue: Frames are being ignored

  • Check: Is dest MAC correct? (our MAC or broadcast)
  • Check: Is EtherType in network byte order?

Issue: ARP replies not working

  • Check: Are sender/target fields swapped correctly?
  • Check: Is the Ethernet frame using unicast dest MAC?

Issue: Byte order errors

  • Check: Using .to_be_bytes() and .from_be_bytes()?
  • Check: Not mixing up little-endian and big-endian?

Next Steps

Integration with Network Device

Once frame TX/RX is implemented (Milestone #13):

#![allow(unused)]
fn main() {
use daedalus::drivers::net::netdev::NetworkDevice;
use daedalus::drivers::net::ethernet::broadcom::genet::GenetController;

// Initialize networking (works with any NetworkDevice implementation)
let mut netdev = GenetController::new();
if netdev.is_present() {
    netdev.init()?;

    // Send frames
    fn send_frame<T: NetworkDevice>(netdev: &mut T, data: &[u8]) -> Result<(), NetworkError> {
        netdev.transmit(data)
    }

    // Receive frames (polling loop)
    loop {
        if let Some(frame_data) = netdev.receive() {
            handle_received_frame(frame_data);
        }
    }
}
}

TCP/IP Stack (smoltcp)

Future integration with smoltcp will provide:

  • IPv4 and IPv6
  • TCP and UDP
  • ICMP (ping)
  • DHCP client
  • DNS client
  • HTTP client/server

The Ethernet and ARP modules provide the foundation for smoltcp’s Device trait.

Networking Guide

Complete guide to networking in DaedalusOS

Status: Foundation complete, ready for TX/RX implementation

Last Updated: 2025-11-09 (Milestone #12)


Overview

This guide provides a complete reference for working with DaedalusOS’s network stack. It covers everything from low-level hardware drivers to high-level protocol handling.

Quick Navigation

Topic        Documentation          Code
Hardware     GENET Driver           src/drivers/net/ethernet/broadcom/genet.rs
Protocols    Ethernet & ARP         src/net/
Testing      This document (below)  cargo test
Integration  This document (below)  src/shell.rs

Architecture Overview

Component Map

┌──────────────────────────────────────────────────────────────┐
│                      User Application                         │
│                   (Future: HTTP, GPIO API)                    │
└───────────────────────────┬──────────────────────────────────┘
                            │
┌───────────────────────────┴──────────────────────────────────┐
│                      Protocol Handlers                        │
│         • ARP Responder (src/net/arp.rs)                     │
│         • Future: IP, TCP, UDP (via smoltcp)                 │
└───────────────────────────┬──────────────────────────────────┘
                            │
┌───────────────────────────┴──────────────────────────────────┐
│                    Frame Processing                           │
│         • Ethernet Frame Parser (src/net/ethernet.rs)        │
│         • Protocol Dispatch (by EtherType)                   │
└───────────────────────────┬──────────────────────────────────┘
                            │
┌───────────────────────────┴──────────────────────────────────┐
│                     GENET Driver                              │
│         • Register Control (src/drivers/genet.rs)            │
│         • MDIO Bus Protocol                                  │
│         • PHY Management (BCM54213PE)                        │
│         • TX/RX Buffers (future)                             │
│         • Interrupt Handling (future)                        │
└───────────────────────────┬──────────────────────────────────┘
                            │
┌───────────────────────────┴──────────────────────────────────┐
│                      Hardware                                 │
│    BCM2711 SoC          BCM54213PE PHY       Ethernet Port   │
│    (GENET MAC)          (Physical Layer)     (RJ45)          │
└──────────────────────────────────────────────────────────────┘

Data Flow

Transmit Path (Future)

Application
    │
    └─► Create message (e.g., HTTP response)
         │
         └─► TCP layer (smoltcp)
              │
              └─► IP layer (smoltcp)
                   │
                   └─► ARP lookup (get dest MAC)
                        │
                        └─► Ethernet frame construction
                             │
                             └─► GENET TX
                                  │
                                  └─► PHY chip
                                       │
                                       └─► Network wire

Receive Path (Future)

Network wire
    │
    └─► PHY chip
         │
         └─► GENET RX (interrupt)
              │
              └─► Ethernet frame parsing
                   │
                   ├─► EtherType 0x0806 → ARP handler
                   │    └─► Process request, send reply
                   │
                   ├─► EtherType 0x0800 → IP handler (smoltcp)
                   │    └─► TCP/UDP processing
                   │         └─► Application handler
                   │
                   └─► Unknown → Drop and log

Current Implementation Status

✅ Completed (Milestone #12)

Hardware Layer:

  • GENET v5 controller register definitions
  • MMIO read/write infrastructure
  • MDIO protocol implementation (read/write PHY registers)
  • PHY detection and ID verification (BCM54213PE)
  • Hardware diagnostics system
  • Safe hardware presence checking (QEMU compatibility)

Protocol Layer:

  • MAC address type with validation
  • Ethernet II frame parsing and construction
  • ARP packet parsing and construction
  • ARP request/reply generation
  • Network byte order handling

Testing:

  • 65 total unit tests passing
  • 30 network protocol tests
  • 4 GENET driver tests
  • All tests run in QEMU

Documentation:

  • Complete hardware reference (GENET)
  • Complete protocol guide (Networking)
  • This integration guide
  • Milestone summary with test results

❌ Not Yet Implemented

Hardware Layer (Milestone #13):

  • Frame transmission (TX path)
  • Frame reception (RX path)
  • DMA engine configuration
  • Interrupt-driven RX/TX
  • MAC address configuration
  • Link state monitoring

Protocol Layer (Milestone #14-16):

  • ARP cache management
  • ARP request timeout/retry
  • IPv4 protocol
  • ICMP (ping)
  • TCP/UDP (via smoltcp)
  • DHCP client

Application Layer (Milestone #17):

  • HTTP server
  • GPIO control API
  • DNS client

Getting Started

Prerequisites

  • Raspberry Pi 4 Model B (BCM2711)
  • Ethernet cable
  • Network with connectivity (for testing)
  • Serial console or monitor for debugging

Building

# Build kernel
cargo build --release

# Run tests
cargo test

# Build documentation
cargo doc --open

Running on QEMU

# Interactive shell (no network hardware)
cargo run

# Run diagnostics (will report no hardware)
daedalus> eth-diag
[INFO] Hardware not present (running in QEMU?)

Note: QEMU 9.0’s raspi4b machine does not emulate GENET. Network testing requires real hardware.

Running on Raspberry Pi 4

  1. Build the kernel and generate the raw binary image:

    cargo build --release
    cargo objcopy --release -- -O binary kernel8.img
    
  2. Copy to SD card:

    cp kernel8.img /path/to/sd/kernel8.img
    
  3. Boot and test:

    daedalus> eth-diag
    [PASS] GENET hardware v6.0 detected (GENET v5 IP block)
    [PASS] PHY found at address 1: BCM54213PE
    [PASS] Link status: UP
    

Working with the GENET Driver

Basic Usage

#![allow(unused)]
fn main() {
use daedalus::drivers::net::ethernet::broadcom::genet::GenetController;

// Create controller instance
let genet = GenetController::new();

// Check if hardware is present (safe in QEMU)
if !genet.is_present() {
    println!("No GENET hardware detected");
    return;
}

// Get hardware version
let (major, minor) = genet.get_version();
println!("GENET hardware version: {}.{}", major, minor);
}

MDIO Operations

#![allow(unused)]
fn main() {
use daedalus::drivers::net::ethernet::broadcom::genet::{
    BMCR_RESET, BMSR_LSTATUS, MII_BMCR, MII_BMSR, MII_PHYSID1, MII_PHYSID2, PHY_ADDR,
};

// Read PHY ID
let id1 = genet.mdio_read(PHY_ADDR, MII_PHYSID1)?;
let id2 = genet.mdio_read(PHY_ADDR, MII_PHYSID2)?;
let phy_id = ((id1 as u32) << 16) | (id2 as u32);

println!("PHY ID: {:#010X}", phy_id); // Should be 0x600D84A2

// Read link status
let bmsr = genet.mdio_read(PHY_ADDR, MII_BMSR)?;
let link_up = (bmsr & BMSR_LSTATUS) != 0;
println!("Link: {}", if link_up { "UP" } else { "DOWN" });

// Write to PHY register (example: software reset)
genet.mdio_write(PHY_ADDR, MII_BMCR, BMCR_RESET);
}
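
The link-status test above is one of several bit checks on the BMSR. A minimal sketch of a decoder, assuming the standard MII bit positions from IEEE 802.3 Clause 22; `LinkStatus` and `decode_bmsr` are hypothetical names, and the constants are redefined here so the block stands alone.

```rust
// Standard MII Basic Mode Status Register bits (IEEE 802.3 Clause 22).
pub const BMSR_LSTATUS: u16 = 0x0004;      // Bit 2: link is up
pub const BMSR_ANEGCAPABLE: u16 = 0x0008;  // Bit 3: PHY can auto-negotiate
pub const BMSR_ANEGCOMPLETE: u16 = 0x0020; // Bit 5: auto-negotiation finished

#[derive(Debug, PartialEq, Eq)]
pub struct LinkStatus {
    pub link_up: bool,
    pub autoneg_capable: bool,
    pub autoneg_complete: bool,
}

/// Extracts the link-related flags from a raw BMSR value.
pub fn decode_bmsr(bmsr: u16) -> LinkStatus {
    LinkStatus {
        link_up: (bmsr & BMSR_LSTATUS) != 0,
        autoneg_capable: (bmsr & BMSR_ANEGCAPABLE) != 0,
        autoneg_complete: (bmsr & BMSR_ANEGCOMPLETE) != 0,
    }
}
```

For example, `decode_bmsr(0x796D)` reports link up with auto-negotiation complete. The link-status bit latches low in the standard register model, so reading the BMSR twice and decoding the second value gives the current state rather than a past link drop.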

Running Diagnostics

#![allow(unused)]
fn main() {
// Run comprehensive hardware check
let success = genet.diagnostic();

if success {
    println!("Hardware ready for network operations");
} else {
    println!("Hardware issues detected, see output above");
}
}

See: GENET Hardware Reference for complete register documentation.


Working with Ethernet Frames

Sending a Frame (Conceptual - TX not yet implemented)

#![allow(unused)]
fn main() {
use daedalus::net::ethernet::*;

// Create frame
let frame = EthernetFrame::new(
    MacAddress::broadcast(),              // Destination
    MacAddress::new([0xB8, 0x27, 0xEB, 1, 2, 3]), // Source (our MAC)
    ETHERTYPE_ARP,                         // Protocol
    &payload_data,                         // Payload
);

// Serialize to buffer
let mut buffer = [0u8; 1518];
let size = frame.write_to(&mut buffer).unwrap();

// Send via GENET (future)
// genet.transmit(&buffer[..size])?;
}

Receiving a Frame (Conceptual - RX not yet implemented)

#![allow(unused)]
fn main() {
// Receive raw bytes from GENET (future)
// let raw_frame = genet.receive()?;

// Parse frame
if let Some(frame) = EthernetFrame::parse(&raw_frame) {
    println!("From: {}", frame.src_mac);
    println!("To: {}", frame.dest_mac);
    println!("Protocol: {:#06X}", frame.ethertype);

    // Dispatch by protocol
    match frame.ethertype {
        ETHERTYPE_ARP => handle_arp(frame.payload),
        ETHERTYPE_IPV4 => handle_ipv4(frame.payload),
        _ => println!("Unknown protocol"),
    }
}
}

See: Ethernet Protocol Guide for complete API reference.


Working with ARP

Sending an ARP Request

#![allow(unused)]
fn main() {
use daedalus::net::arp::*;
use daedalus::net::ethernet::*;

fn send_arp_request(target_ip: [u8; 4]) {
    // Our network configuration
    let our_mac = MacAddress::new([0xB8, 0x27, 0xEB, 1, 2, 3]);
    let our_ip = [192, 168, 1, 100];

    // Create ARP request
    let arp = ArpPacket::request(our_mac, our_ip, target_ip);

    // Serialize ARP packet
    let mut arp_buffer = [0u8; 28];
    arp.write_to(&mut arp_buffer).unwrap();

    // Wrap in Ethernet frame (broadcast)
    let frame = EthernetFrame::new(
        MacAddress::broadcast(),
        our_mac,
        ETHERTYPE_ARP,
        &arp_buffer,
    );

    // Serialize and send (future)
    let mut frame_buffer = [0u8; 64];
    let size = frame.write_to(&mut frame_buffer).unwrap();
    // genet.transmit(&frame_buffer[..size])?;
}
}

Handling ARP Requests

fn handle_arp_request(arp: &ArpPacket, our_mac: MacAddress, our_ip: [u8; 4]) {
    // Is this request for our IP?
    if arp.operation == ArpOperation::Request && arp.target_ip == our_ip {
        // Create reply
        let reply = ArpPacket::reply(
            our_mac,           // We are the sender
            our_ip,
            arp.sender_mac,    // They are the target
            arp.sender_ip,
        );

        // Serialize and send (future)
        let mut buffer = [0u8; 28];
        reply.write_to(&mut buffer).unwrap();

        // Wrap in Ethernet frame (unicast to requester)
        let frame = EthernetFrame::new(
            arp.sender_mac,  // Direct reply
            our_mac,
            ETHERTYPE_ARP,
            &buffer,
        );

        // Send via GENET (future)
        // send_frame(&frame)?;
    }
}

See: ARP Protocol Guide for complete examples.
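The 28-byte buffers above follow the fixed ARP layout for IPv4 over Ethernet from RFC 826. As a rough sketch of what serialization has to produce, the free function below writes each field at its wire offset. `arp_to_bytes` is a hypothetical helper for illustration only and is not how `ArpPacket::write_to` is necessarily implemented.

```rust
/// Serialize an IPv4-over-Ethernet ARP packet (RFC 826) into 28 bytes.
/// `operation` is 1 for a request and 2 for a reply.
fn arp_to_bytes(
    operation: u16,
    sender_mac: [u8; 6],
    sender_ip: [u8; 4],
    target_mac: [u8; 6],
    target_ip: [u8; 4],
) -> [u8; 28] {
    let mut out = [0u8; 28];
    out[0..2].copy_from_slice(&1u16.to_be_bytes());      // HTYPE: Ethernet
    out[2..4].copy_from_slice(&0x0800u16.to_be_bytes()); // PTYPE: IPv4
    out[4] = 6;                                          // HLEN: MAC length
    out[5] = 4;                                          // PLEN: IPv4 length
    out[6..8].copy_from_slice(&operation.to_be_bytes()); // OPER
    out[8..14].copy_from_slice(&sender_mac);             // SHA
    out[14..18].copy_from_slice(&sender_ip);             // SPA
    out[18..24].copy_from_slice(&target_mac);            // THA (zero in requests)
    out[24..28].copy_from_slice(&target_ip);             // TPA
    out
}
```

With the 14-byte Ethernet header added, an ARP frame occupies 42 bytes, which is why a 64-byte frame buffer is sufficient in the request example.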


Shell Commands

eth-diag - Ethernet Diagnostics

Run comprehensive hardware diagnostics.

Usage:

daedalus> eth-diag

Output on Real Pi 4 (with ethernet cable plugged in):

[DIAG] Ethernet Hardware Diagnostics
[DIAG] ================================
[DIAG] Step 1: GENET Controller Detection
[DIAG]   Reading SYS_REV_CTRL @ 0xFD580000...
[DIAG]   Raw register value: 0x06000000
[PASS]   GENET hardware v6.0 detected (GENET v5 IP block)
[PASS]   Register: 0x06000000

[DIAG] Step 2: PHY Detection
[DIAG]   Scanning MDIO address 1...
[DIAG]   Reading PHY_ID1 @ addr 1, reg 0x02...
[DIAG]     Value: 0x600D
[DIAG]   Reading PHY_ID2 @ addr 1, reg 0x03...
[DIAG]     Value: 0x84A2
[PASS]   PHY found at address 1: BCM54213PE (ID: 0x600D84A2)

[DIAG] Step 3: PHY Status
[DIAG]   Reading BMSR (Basic Mode Status Register)...
[DIAG]     BMSR: 0x796D
[DIAG]       Link status: UP
[DIAG]       Auto-negotiation: COMPLETE
[DIAG]   Reading BMCR (Basic Mode Control Register)...
[DIAG]     BMCR: 0x1140
[DIAG]       Auto-negotiation: ENABLED

[PASS] ================================
[PASS] Hardware diagnostics complete!
[PASS] GENET hardware v6.0 (GENET v5 IP) and BCM54213PE PHY detected

Output in QEMU (no ethernet hardware emulated):

[DIAG] Ethernet Hardware Diagnostics
[DIAG] ================================
[DIAG] Step 1: GENET Controller Detection
[DIAG]   Reading SYS_REV_CTRL @ 0xFD580000...
[DIAG]   Raw register value: 0x00000000
[WARN]   Unexpected version: 0.0 (expected 6.x for GENET v5)
[INFO]   Hardware not present (running in QEMU?)
[SKIP] Diagnostics completed (no hardware detected)

Implementation: src/shell.rs (line ~200), src/drivers/net/ethernet/broadcom/genet.rs (line ~373)
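The version and PHY ID lines in the output can be reproduced from the raw register values. The sketch below assumes the major version sits in bits 27:24 and the minor version in bits 19:16 of SYS_REV_CTRL (the layout the Linux bcmgenet driver appears to use; verify against the actual driver source). The function names are hypothetical.

```rust
/// Decode (major, minor) from a raw SYS_REV_CTRL value.
/// Assumed field positions: major in bits 27:24, minor in bits 19:16.
fn decode_genet_version(raw: u32) -> (u8, u8) {
    let major = ((raw >> 24) & 0x0F) as u8;
    let minor = ((raw >> 16) & 0x0F) as u8;
    (major, minor)
}

/// Combine the two 16-bit PHY identifier registers (MII registers 0x02
/// and 0x03) into one 32-bit ID, with PHY_ID1 in the upper half.
fn combine_phy_id(id1: u16, id2: u16) -> u32 {
    ((id1 as u32) << 16) | (id2 as u32)
}
```

Under these assumptions, the raw value 0x06000000 decodes to version 6.0, an all-zero register (as in QEMU) decodes to 0.0, and the ID registers 0x600D and 0x84A2 combine to 0x600D84A2.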

Future Commands (Planned)

  • eth-status - Show link status, speed, duplex, MAC address
  • eth-stats - Display RX/TX packet/byte counters
  • eth-send <dest_mac> <data> - Send raw Ethernet frame
  • arp-cache - Display ARP cache entries
  • arp-request <ip> - Send ARP request for an IP address
  • ping <ip> - Send ICMP echo request
  • dhcp - Request IP via DHCP

Testing

Running Tests

# All tests
cargo test

# Only network tests
cargo test --lib net

# Only GENET tests
cargo test --lib genet

# Specific test
cargo test test_mac_address_display

# Show test output
cargo test -- --nocapture

Test Organization

Unit Tests (run in QEMU):

  • src/net/ethernet.rs - 18 tests
  • src/net/arp.rs - 12 tests
  • src/drivers/net/ethernet/broadcom/genet.rs - 4 tests

Integration Tests (future, require hardware):

  • Marked with #[ignore]
  • Run with cargo test -- --ignored

Manual Tests (on hardware):

  • Use shell commands
  • Capture with Wireshark on connected network
  • Verify with external tools

Test Coverage

| Component | Unit Tests | Integration Tests | Manual Tests |
|---|---|---|---|
| MAC Address | ✅ 12 tests | N/A | N/A |
| Ethernet Frames | ✅ 6 tests | ❌ Planned | ❌ Planned |
| ARP Packets | ✅ 12 tests | ❌ Planned | ❌ Planned |
| GENET Registers | ✅ 4 tests | N/A | N/A |
| MDIO Protocol | ❌ Mock only | ❌ Planned | eth-diag |
| PHY Detection | ❌ Mock only | ❌ Planned | eth-diag |
| Frame TX/RX | ❌ Not impl | ❌ Planned | ❌ Planned |

Example Test

#[test_case]
fn test_arp_request_creation() {
    let our_mac = MacAddress::new([0xB8, 0x27, 0xEB, 0x12, 0x34, 0x56]);
    let our_ip = [192, 168, 1, 100];
    let target_ip = [192, 168, 1, 1];

    let request = ArpPacket::request(our_mac, our_ip, target_ip);

    assert_eq!(request.operation, ArpOperation::Request);
    assert_eq!(request.sender_mac, our_mac);
    assert_eq!(request.sender_ip, our_ip);
    assert_eq!(request.target_mac, MacAddress::zero());
    assert_eq!(request.target_ip, target_ip);
}

Debugging

QEMU Debugging

Since QEMU doesn’t emulate GENET, debugging focuses on protocol logic:

// Verify protocol handling with the unit tests (run from the shell):
//   cargo test test_ethernet_frame_roundtrip

// Test serialization manually
let frame = EthernetFrame::new(/* ... */);
let mut buffer = [0u8; 1518];
let size = frame.write_to(&mut buffer).unwrap();

// Dump hex, 16 bytes per line, each line prefixed with its offset
for (i, byte) in buffer[..size].iter().enumerate() {
    if i % 16 == 0 {
        println!();
        print!("{:04X}: ", i);
    }
    print!("{:02X} ", byte);
}

Hardware Debugging

Step 1: Verify Hardware Detection

daedalus> eth-diag

Check for:

  • GENET version reads 6.x (the GENET v5 IP block)
  • PHY ID matches 0x600D84A2
  • Link status shows UP (cable connected)
  • Auto-negotiation completes

Step 2: Monitor PHY Registers

// Read key PHY registers
let bmsr = genet.mdio_read(1, MII_BMSR)?;
let bmcr = genet.mdio_read(1, MII_BMCR)?;

println!("BMSR: {:#06X}", bmsr);
println!("  Link: {}", if (bmsr & 0x04) != 0 { "UP" } else { "DOWN" });
println!("  AN Complete: {}", if (bmsr & 0x20) != 0 { "YES" } else { "NO" });

println!("BMCR: {:#06X}", bmcr);
println!("  AN Enabled: {}", if (bmcr & 0x1000) != 0 { "YES" } else { "NO" });
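The bit tests above can be collected into a small decoder, which makes them easy to unit test in QEMU without hardware. This is a sketch: `PhyStatus` and `decode_phy_status` are hypothetical names, while the bit positions are the standard MII register bits (BMSR bit 2 link status, BMSR bit 5 auto-negotiation complete, BMCR bit 12 auto-negotiation enable).

```rust
const BMSR_LINK_STATUS: u16 = 1 << 2; // 0x0004
const BMSR_AN_COMPLETE: u16 = 1 << 5; // 0x0020
const BMCR_AN_ENABLE: u16 = 1 << 12;  // 0x1000

/// Decoded view of the two basic MII registers (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
struct PhyStatus {
    link_up: bool,
    an_complete: bool,
    an_enabled: bool,
}

/// Decode link and auto-negotiation state from raw BMSR and BMCR values.
fn decode_phy_status(bmsr: u16, bmcr: u16) -> PhyStatus {
    PhyStatus {
        link_up: (bmsr & BMSR_LINK_STATUS) != 0,
        an_complete: (bmsr & BMSR_AN_COMPLETE) != 0,
        an_enabled: (bmcr & BMCR_AN_ENABLE) != 0,
    }
}
```

Keeping the decoding separate from the MDIO access means the register reads can be mocked while the bit logic is tested directly.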

Step 3: Packet Capture (Future)

Once TX/RX is implemented, use Wireshark on a connected device:

# On a Linux machine connected to the Pi 4
sudo tcpdump -i eth0 -w capture.pcap

# Or use Wireshark GUI
wireshark

Filter for:

  • ARP: arp
  • From Pi’s MAC: eth.src == b8:27:eb:xx:xx:xx
  • To Pi’s MAC: eth.dst == b8:27:eb:xx:xx:xx

Common Issues

Issue: eth-diag reports no hardware in QEMU

  • Cause: QEMU 9.0 doesn’t emulate GENET
  • Solution: This is expected. Test on real Pi 4.

Issue: PHY ID mismatch

  • Cause: Different PHY chip (not Pi 4?) or MDIO issue
  • Solution: Verify hardware, check MDIO timing

Issue: Link status DOWN

  • Cause: Cable unplugged, bad cable, switch port down
  • Solution: Check cable, try different switch port

Issue: Auto-negotiation timeout

  • Cause: PHY configuration issue or partner doesn’t support auto-neg
  • Solution: Check BMCR/BMSR registers, verify cable/switch

Issue: Frames not received

  • Cause: MAC filtering, promiscuous mode not enabled, interrupt not firing
  • Solution: Check UMAC_CMD settings, verify interrupt registration

Network Configuration

MAC Address

The Raspberry Pi 4 has a factory-programmed MAC address stored in OTP (One-Time Programmable) memory. The driver does not read it yet; reading it from device-specific registers is planned.

Temporary: Hard-code MAC address during development:

const OUR_MAC: MacAddress = MacAddress([0xB8, 0x27, 0xEB, 0x01, 0x02, 0x03]);

Production: Read from OTP:

// Future implementation
let mac = genet.read_mac_address();

IP Address

Static IP (current approach):

const OUR_IP: [u8; 4] = [192, 168, 1, 100];

DHCP (future):

// Use smoltcp's DHCP client
let ip = dhcp_client.request_ip()?;

Network Settings

Typical development network configuration:

| Setting | Value | Configurable |
|---|---|---|
| MAC Address | Read from OTP | ❌ (hardware) |
| IP Address | 192.168.1.100 | ✅ (code constant) |
| Netmask | 255.255.255.0 | ✅ (future) |
| Gateway | 192.168.1.1 | ✅ (future) |
| DNS | 192.168.1.1 | ✅ (future) |
| Link Speed | Auto-negotiated | ❌ (PHY handles) |
| Duplex | Auto-negotiated | ❌ (PHY handles) |

Performance Considerations

MDIO Timing

MDIO operations are relatively slow (~1ms each):

  • PHY ID read: 2 MDIO reads = ~2ms
  • Link status poll: 1 MDIO read = ~1ms
  • Auto-negotiation: Can take 1-3 seconds

Optimization: Don’t poll PHY registers in performance-critical paths. Cache link state and update periodically.
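One way to apply this optimization is a small cache that only touches MDIO when its stored state is older than a chosen interval. The sketch below is an assumption about how this could look, not existing DaedalusOS code: `LinkCache` is a hypothetical type, the timestamp is supplied by the caller (for example from the system timer), and the BMSR read is passed in as a closure so it can be mocked.

```rust
/// Caches the PHY link state and refreshes it at most once per interval.
struct LinkCache {
    link_up: bool,
    last_poll_us: Option<u64>,
    interval_us: u64,
}

impl LinkCache {
    fn new(interval_us: u64) -> Self {
        Self { link_up: false, last_poll_us: None, interval_us }
    }

    /// Return the link state, calling `read_bmsr` only when the cache is stale.
    /// `now_us` is a monotonic timestamp in microseconds.
    fn link_up(&mut self, now_us: u64, read_bmsr: impl FnOnce() -> Option<u16>) -> bool {
        let stale = match self.last_poll_us {
            None => true,
            Some(t) => now_us.wrapping_sub(t) >= self.interval_us,
        };
        if stale {
            // A failed MDIO read keeps the previous state and retries next call.
            if let Some(bmsr) = read_bmsr() {
                self.link_up = (bmsr & 0x0004) != 0; // BMSR bit 2: link status
                self.last_poll_us = Some(now_us);
            }
        }
        self.link_up
    }
}
```

A caller on the hot path would then pay for an MDIO read at most once per interval, for example `cache.link_up(now, || genet.mdio_read(1, MII_BMSR))`.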

Frame Processing

Future Bottlenecks:

  • Copying data between buffers (use zero-copy where possible)
  • Protocol parsing overhead (optimize hot paths)
  • Interrupt frequency (tune interrupt coalescing)

DMA vs. Polling:

  • Polling: Simple, good for low traffic
  • DMA: Essential for high traffic (1 Gbps is up to ~1.49M packets/sec with minimum-size 64-byte frames)
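The packet rate figure follows from the per-frame cost on the wire: every frame carries an extra 8 bytes of preamble and start-of-frame delimiter plus a 12-byte minimum inter-frame gap. A small sketch of the arithmetic (the function name is hypothetical):

```rust
/// Preamble + SFD (8 bytes) and minimum inter-frame gap (12 bytes).
const WIRE_OVERHEAD_BYTES: u64 = 20;

/// Maximum frames per second at a given line rate.
/// `frame_bytes` includes the Ethernet header and FCS
/// (64 to 1518 bytes for standard untagged frames).
fn max_frames_per_second(link_bps: u64, frame_bytes: u64) -> u64 {
    link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)
}
```

At 1 Gbps this gives 1,488,095 frames per second for 64-byte frames and 81,274 for 1518-byte frames, so the per-packet budget ranges from roughly 0.67 µs to 12.3 µs depending on frame size.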

Memory Usage

Current allocations:

  • GENET driver: Minimal (no buffers yet)
  • Ethernet frames: Stack-allocated or passed by reference
  • ARP packets: 28 bytes (stack)

Future allocations:

  • RX buffer ring: ~32 KB (16 descriptors × 2 KB)
  • TX buffer ring: ~32 KB
  • ARP cache: ~1 KB (typical: 64 entries)

Roadmap

Milestone #13: Frame TX/RX

Goal: Send and receive Ethernet frames

Implementation:

  • Configure GENET TX/RX buffers (simple mode, no DMA)
  • Implement transmit(&[u8]) function
  • Implement receive() -> Option<&[u8]> function (polling)
  • Test with raw frame send/receive

Verification:

  • Send ARP request from Pi
  • Receive ARP request on Pi
  • View frames in Wireshark
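Before hardware TX/RX exists, the polling interface can be exercised against a mock. The sketch below is illustrative only: `PollingDevice` is a hypothetical trait modelled on the `transmit(&[u8])` and `receive() -> Option<&[u8]>` signatures listed above (the real `NetworkDevice` trait may differ), and `Loopback` simply hands each transmitted frame back on the next receive.

```rust
/// Hypothetical polling interface; the real `NetworkDevice` trait may differ.
trait PollingDevice {
    /// Queue one frame for transmission. Returns false if it cannot be sent.
    fn transmit(&mut self, frame: &[u8]) -> bool;
    /// Return the next received frame, if any.
    fn receive(&mut self) -> Option<&[u8]>;
}

/// Mock device that returns every transmitted frame on the next receive.
struct Loopback {
    buf: [u8; 1518],
    len: usize,
    pending: bool,
}

impl Loopback {
    fn new() -> Self {
        Self { buf: [0; 1518], len: 0, pending: false }
    }
}

impl PollingDevice for Loopback {
    fn transmit(&mut self, frame: &[u8]) -> bool {
        // Reject empty or oversized frames, and refuse while one is pending.
        if frame.is_empty() || frame.len() > self.buf.len() || self.pending {
            return false;
        }
        self.buf[..frame.len()].copy_from_slice(frame);
        self.len = frame.len();
        self.pending = true;
        true
    }

    fn receive(&mut self) -> Option<&[u8]> {
        if !self.pending {
            return None;
        }
        self.pending = false;
        Some(&self.buf[..self.len])
    }
}
```

A mock like this lets protocol code written against the trait run in QEMU unit tests, with the GENET implementation swapped in on hardware.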

Milestone #14: Interrupt-Driven RX

Goal: Replace polling with interrupts

Implementation:

  • Register GENET IRQs (157, 158) with GIC
  • Implement RX interrupt handler
  • Queue received frames for processing
  • Clear interrupt status correctly

Verification:

  • Receive frames without polling
  • Measure latency improvement

Milestone #15: ARP Responder

Goal: Respond to ARP requests

Implementation:

  • ARP cache with expiration
  • ARP request/reply handling
  • Integration with RX path

Verification:

  • ping 192.168.1.100 from another device
  • Pi responds to ARP, then to ICMP (ICMP replies need Milestone #16)
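As a rough idea of what "ARP cache with expiration" could mean, the sketch below uses a fixed array of 64 entries (the typical size mentioned under Memory Usage) so it needs no heap allocation. All names, the entry layout, and the simplistic eviction policy are assumptions for illustration, not a planned design.

```rust
/// One IP-to-MAC mapping with an absolute expiry time in seconds.
#[derive(Clone, Copy)]
struct ArpEntry {
    ip: [u8; 4],
    mac: [u8; 6],
    expires_at_s: u64,
}

/// Fixed-capacity ARP cache (hypothetical sketch).
struct ArpCache {
    entries: [Option<ArpEntry>; 64],
    ttl_s: u64,
}

impl ArpCache {
    fn new(ttl_s: u64) -> Self {
        Self { entries: [None; 64], ttl_s }
    }

    /// Insert or refresh a mapping learned at time `now_s`.
    fn insert(&mut self, ip: [u8; 4], mac: [u8; 6], now_s: u64) {
        let entry = ArpEntry { ip, mac, expires_at_s: now_s + self.ttl_s };
        // Reuse the slot for this IP, else the first empty or expired slot,
        // else overwrite slot 0 (a deliberately simple eviction policy).
        let slot = self
            .entries
            .iter()
            .position(|e| matches!(e, Some(x) if x.ip == ip))
            .or_else(|| {
                self.entries
                    .iter()
                    .position(|e| e.map_or(true, |x| x.expires_at_s <= now_s))
            })
            .unwrap_or(0);
        self.entries[slot] = Some(entry);
    }

    /// Look up a MAC address, ignoring entries that have expired.
    fn lookup(&self, ip: [u8; 4], now_s: u64) -> Option<[u8; 6]> {
        self.entries
            .iter()
            .flatten()
            .find(|e| e.ip == ip && e.expires_at_s > now_s)
            .map(|e| e.mac)
    }
}
```

Expired entries are never removed eagerly here; they are skipped on lookup and reclaimed on the next insert.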

Milestone #16: TCP/IP Stack (smoltcp)

Goal: Full TCP/IP support

Implementation:

  • Integrate smoltcp crate
  • Implement Device trait (maps to GENET)
  • Configure IP, routing, sockets
  • DHCP client

Verification:

  • Obtain IP via DHCP
  • Ping external hosts
  • TCP connection (HTTP GET)

Milestone #17: HTTP Server

Goal: Web-based GPIO control

Implementation:

  • Simple HTTP server using smoltcp
  • REST API for GPIO control
  • JSON responses

Verification:

  • curl http://192.168.1.100/gpio/21/on
  • LED turns on
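The URL in the verification step implies a simple path scheme of the form `/gpio/<pin>/<action>`. A minimal sketch of parsing such a path is shown below; the `off` action, the `GpioCommand` type, and the pin range check (0 to 57, the GPIO count on BCM2711) are assumptions, since the REST API is not yet designed.

```rust
/// Hypothetical command decoded from a request path.
#[derive(Debug, PartialEq)]
enum GpioCommand {
    On(u8),
    Off(u8),
}

/// Parse a request path such as "/gpio/21/on".
/// Returns `None` for unknown routes, bad pin numbers, or extra segments.
fn parse_gpio_path(path: &str) -> Option<GpioCommand> {
    let mut parts = path.strip_prefix("/gpio/")?.split('/');
    let pin: u8 = parts.next()?.parse().ok()?;
    let action = parts.next()?;
    // Reject trailing segments and pins outside the assumed 0..=57 range.
    if parts.next().is_some() || pin > 57 {
        return None;
    }
    match action {
        "on" => Some(GpioCommand::On(pin)),
        "off" => Some(GpioCommand::Off(pin)),
        _ => None,
    }
}
```

The parser uses only `core` string methods, so it would work in a `#![no_std]` kernel without allocation.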

API Reference

Key Types

// Hardware
pub struct GenetController { /* ... */ }

// Network
pub struct MacAddress(pub [u8; 6]);
pub struct EthernetFrame<'a> { /* ... */ }
pub struct ArpPacket { /* ... */ }
pub enum ArpOperation { Request = 1, Reply = 2 }

// Constants
pub const ETHERTYPE_ARP: u16 = 0x0806;
pub const ETHERTYPE_IPV4: u16 = 0x0800;
pub const ETHERTYPE_IPV6: u16 = 0x86DD;

Key Functions

// GENET
impl GenetController {
    pub fn new() -> Self;
    pub fn is_present(&self) -> bool;
    pub fn get_version(&self) -> (u8, u8);        // Returns (major, minor)
    pub fn get_version_raw(&self) -> u32;         // Returns raw SYS_REV_CTRL value
    pub fn mdio_read(&self, phy_addr: u8, reg_addr: u8) -> Option<u16>;
    pub fn mdio_write(&self, phy_addr: u8, reg_addr: u8, value: u16) -> bool;
    pub fn read_phy_id(&self) -> Option<u32>;
    pub fn diagnostic(&self) -> bool;
}

// Ethernet
impl EthernetFrame<'_> {
    pub fn new(dest_mac: MacAddress, src_mac: MacAddress,
               ethertype: u16, payload: &[u8]) -> Self;
    pub fn parse(buffer: &[u8]) -> Option<Self>;
    pub fn write_to(&self, buffer: &mut [u8]) -> Option<usize>;
}

// ARP
impl ArpPacket {
    pub fn new(operation: ArpOperation, sender_mac: MacAddress,
               sender_ip: [u8; 4], target_mac: MacAddress,
               target_ip: [u8; 4]) -> Self;
    pub fn request(sender_mac: MacAddress, sender_ip: [u8; 4],
                   target_ip: [u8; 4]) -> Self;
    pub fn reply(sender_mac: MacAddress, sender_ip: [u8; 4],
                 target_mac: MacAddress, target_ip: [u8; 4]) -> Self;
    pub fn parse(buffer: &[u8]) -> Option<Self>;
    pub fn write_to(&self, buffer: &mut [u8]) -> Option<usize>;
}

Complete API documentation: cargo doc --open


FAQ

Q: Why doesn’t networking work in QEMU? A: QEMU 9.0’s raspi4b machine doesn’t fully emulate the GENET controller. Network testing requires real Pi 4 hardware.

Q: Can I use a different Ethernet PHY? A: The driver is specific to BCM54213PE (Pi 4’s PHY). Porting would require changes to PHY initialization and MDIO addressing.

Q: What about Wi-Fi? A: Wi-Fi is much more complex (separate driver, firmware, WPA supplicant). Ethernet is the priority for now.

Q: Why not use smoltcp from the start? A: Understanding the hardware first makes debugging easier. We’ll integrate smoltcp once TX/RX works.

Q: How do I capture packets for debugging? A: Connect the Pi 4 to a network with another device running Wireshark or tcpdump. The Pi will be visible as a network node.

Q: What’s the maximum throughput? A: Hardware supports Gigabit (1000 Mbps). Actual throughput depends on:

  • DMA configuration (required for high speed)
  • CPU overhead (interrupt handling, context switches)
  • Buffer management (zero-copy techniques)
  • Realistic target: 100-500 Mbps with simple implementation

Q: Can I test without a network cable? A: Loopback mode (if supported by GENET) would allow testing TX→RX internally. This is not yet implemented.


Further Reading

Official Documentation

External Resources

Community


Contributing

When working on network code:

  1. Read the relevant documentation (hardware or protocol guide)
  2. Write tests first (if possible)
  3. Verify constants from datasheets or RFCs
  4. Document sources in code comments
  5. Test on hardware (not just QEMU)
  6. Capture packets for verification

Example code comment:

// MDIO Read operation bits (bits 27:26 = 0b10)
// Source: Linux kernel bcmgenet.h, line 487
const MDIO_RD: u32 = 2 << 26;

Last Updated: 2025-11-09 (Milestone #12 Complete)
Next Milestone: #13 - Frame TX/RX Implementation

ARM Documentation

ARM architecture references organized by topic for quick lookup.

When to Use

Consult when implementing low-level features: exceptions, system registers, MMU, assembly code, or debugging unexpected CPU behavior.

Core Documentation

ARM Cortex-A72 Processor (Our CPU)

Cortex-A72 MPCore Processor Technical Reference Manual

Key sections:

  • Section 2: Functional description and features
  • Section 3.3: Power management (WFE/WFI instructions for core parking)
  • Section 4: System control
    • 4.2: System control registers (SCTLR_EL1, CPACR_EL1)
    • 4.3: Memory system (caches, MMU control)
    • 4.4: Exception handling configuration
  • Section 5: Exceptions and debug
    • 5.2: Exception model
    • 5.3: Exception handling (VBAR_EL1 setup)
  • Section 6: Caches
    • 6.2: L1 cache (future optimization)
    • 6.3: L2 cache configuration
  • Section 8: Memory Management Unit
    • 8.2: Translation tables (for Phase 2/3)
    • 8.3: TLB maintenance

ARMv8-A Instruction Set Architecture

A-profile A64 Instruction Set Architecture (2024-12)

Key sections:

  • Section A1: Instruction encoding and syntax
    • A1.3: Registers (X0-X30, SP, PC)
    • A1.6: Instruction set overview
  • Section C5: System register descriptions
    • C5.2.7: MPIDR_EL1 (multiprocessor affinity - for core detection)
    • C5.2.18: VBAR_EL1 (vector base address - exception table)
    • C5.2.5: ESR_EL1 (exception syndrome - what caused exception)
    • C5.2.6: FAR_EL1 (fault address - where memory fault occurred)
    • C5.2.8: ELR_EL1 (exception link - return address)
    • C5.2.16: SPSR_EL1 (saved program status)
    • C5.2.14: SCTLR_EL1 (system control - MMU enable, cache enable)
  • Section D1: The AArch64 System Level Programmers’ Model
    • D1.2: Exception levels (EL0-EL3)
    • D1.10: Exception model and vectors
    • D1.10.2: Vector table layout (16 entries × 128 bytes)
    • D1.11: Exception syndrome register (ESR_EL1 decoding)
  • Section D4: The AArch64 Virtual Memory System Architecture
    • D4.2: Translation tables (for MMU work)
    • D4.3: Page table format
    • D4.4: Memory attributes and types

Quick references:

ARM Generic Interrupt Controller

GIC-400 Architecture Specification

Needed for Phase 3 (interrupts)

  • Section 2: Programmers’ model
  • Section 3: GIC distributor (GICD) at 0xFF841000
  • Section 4: CPU interface (GICC)
  • Section 5: Interrupt configuration

Note: Pi 4 uses GIC-400 (not GIC-500/600 found in newer ARM platforms).

Usage Patterns

Implementing Exception Handling

  1. Start with ISA Section D1.10 for exception model overview
  2. Check Cortex-A72 Section 5 for A72-specific details
  3. Use ISA Section C5 for system register bitfields (VBAR_EL1, ESR_EL1, FAR_EL1)
  4. Reference ISA Section D1.11 for ESR decoding

Debugging Unexpected Behavior

  1. Check Cortex-A72 Section 4 for reset state and defaults
  2. Verify exception level in ISA Section D1.2
  3. Review system register access permissions in ISA Section C5
  4. Compare QEMU vs hardware behavior (QEMU boots at EL2, hardware boots at EL1)

Writing Assembly Code

  1. Use ISA Section A1 for instruction syntax
  2. Check A64 Base Instructions for specific instruction details
  3. Verify register usage in ISA Section A1.3
  4. Consult Cortex-A72 Section 3 for core-specific features

Common Pitfalls

Exception Level Confusion

  • QEMU boots at EL2, real Pi 4 hardware boots at EL1
  • Affects which registers are accessible
  • Some EL1 registers (ELR_EL1, SPSR_EL1) may show zero in QEMU
  • Solution: Check current EL and use appropriate registers
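Checking the current EL comes down to reading the `CurrentEL` system register and extracting bits [3:2]. The sketch below separates the pure decoding (testable anywhere) from the register read, which is only compiled for AArch64 and must execute at EL1 or higher. The function names are hypothetical.

```rust
/// Extract the exception level (0 to 3) from a raw CurrentEL value.
/// The level is held in bits [3:2]; all other bits are reserved.
fn exception_level(current_el: u64) -> u8 {
    ((current_el >> 2) & 0b11) as u8
}

/// Read the CurrentEL system register.
/// Only compiled for AArch64; traps if executed at EL0.
#[cfg(target_arch = "aarch64")]
#[allow(dead_code)]
fn read_current_el() -> u64 {
    let value: u64;
    // SAFETY: reading CurrentEL has no side effects at EL1 or higher.
    unsafe {
        core::arch::asm!("mrs {}, CurrentEL", out(reg) value, options(nomem, nostack));
    }
    value
}
```

Boot code can branch on `exception_level(read_current_el())` to decide whether the `_EL1` or `_EL2` variants of a register apply.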

Register Access

  • System registers have specific access requirements per exception level
  • Read ISA Section C5 for each register’s access permissions
  • Accessing wrong-level registers causes undefined instruction exceptions

Vector Table Alignment

  • Exception vector table MUST be aligned to 2048 bytes (0x800)
  • Specified in ISA Section D1.10.2
  • Linker script enforces this with .align 11 (2^11 = 2048)
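The 2048-byte requirement is the table size itself: 16 entries of 128 bytes each. A small sketch of the arithmetic, with a check that could guard a `VBAR_EL1` write and a helper for entry offsets, is shown below. The names and the numeric encoding of `group` and `kind` are illustrative assumptions.

```rust
const VECTOR_ENTRY_SIZE: u64 = 0x80; // 128 bytes per entry
const VECTOR_ENTRY_COUNT: u64 = 16;
const VECTOR_TABLE_SIZE: u64 = VECTOR_ENTRY_SIZE * VECTOR_ENTRY_COUNT; // 2048

/// True if `addr` is suitably aligned for use as a vector base address.
fn is_valid_vbar(addr: u64) -> bool {
    addr % VECTOR_TABLE_SIZE == 0
}

/// Byte offset of a vector entry from the table base.
/// group: 0 = current EL with SP_EL0, 1 = current EL with SP_ELx,
///        2 = lower EL using AArch64, 3 = lower EL using AArch32
/// kind:  0 = Synchronous, 1 = IRQ, 2 = FIQ, 3 = SError
fn vector_offset(group: u64, kind: u64) -> Option<u64> {
    if group > 3 || kind > 3 {
        return None;
    }
    Some((group * 4 + kind) * VECTOR_ENTRY_SIZE)
}
```

For a kernel running at EL1 on its own stack, the IRQ entry is at offset 0x280 (group 1, kind 1).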

Implementation Checklist

When implementing ARM-specific features:

  • Cite ARM doc section number in code comments
  • Document A72-specific behavior vs generic ARMv8-A
  • Note QEMU vs hardware differences
  • Include register bitfield diagrams for complex registers
  • Cross-reference related system registers

Raspberry Pi Documentation

Raspberry Pi 4 specific documentation and resources.

Primary References

BCM2711 ARM Peripherals

BCM2711 ARM Peripherals PDF

Complete peripheral reference for the BCM2711 SoC used in Pi 4.

Key sections:

  • Section 1.2: Address map and MMIO base (0xFE000000 for ARM access)
  • Section 2: UART (PL011, mini UART)
    • 2.1: PL011 UART registers and configuration
  • Section 5: GPIO
    • 5.2: Function select and pull-up/down configuration
  • Section 6: Interrupts (GIC-400)
    • 6.1: GIC distributor base address (0xFF841000)
  • Section 10: System Timer
    • 10.2: System timer registers at 0xFE003000

Important notes:

  • Bus addresses in documentation (0x7E...) must be translated to ARM physical (0xFE...)
  • Pi 4 MMIO base changed from Pi 3’s 0x3F000000 to 0xFE000000
  • Clock frequencies differ from Pi 3 (e.g., PL011 UART: 54 MHz vs 48 MHz)
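The bus-to-physical translation in the first note is a fixed remapping of the top byte. A minimal sketch is below; it only handles the 0x7Exx_xxxx window named in the note, and the function and constant names are hypothetical.

```rust
const BUS_PERIPHERAL_BASE: u32 = 0x7E00_0000;
const ARM_PERIPHERAL_BASE: u32 = 0xFE00_0000;

/// Translate a datasheet bus address (0x7Exx_xxxx) to the ARM physical
/// address used for MMIO on the Pi 4.
/// Returns `None` for addresses outside that window.
fn bus_to_arm(bus: u32) -> Option<u32> {
    if (bus & 0xFF00_0000) != BUS_PERIPHERAL_BASE {
        return None;
    }
    Some(ARM_PERIPHERAL_BASE | (bus & 0x00FF_FFFF))
}
```

For example, a register documented at bus address 0x7E003000 maps to 0xFE003000, which matches the system timer address listed above.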

Pi 4 Schematics

Raspberry Pi 4 Reduced Schematics

Hardware schematics showing:

  • Power supply routing
  • GPIO pin connections
  • UART pin assignments (GPIO 14/15 for TXD/RXD)
  • Component placement

Device Tree Reference

Raspberry Pi Device Tree Documentation

Device tree overlays and parameters for:

  • Enabling/disabling peripherals
  • UART configuration
  • GPIO function assignment

Useful for understanding hardware defaults and firmware configuration.

Boot Configuration

config.txt Settings

For bare-metal kernel deployment to SD card:

enable_uart=1        # Enable PL011 UART for serial console
arm_64bit=1          # Boot in AArch64 mode (required)
kernel=kernel8.img   # Kernel binary to load

Boot Process

  1. GPU firmware (start4.elf) loads from SD card FAT partition
  2. Firmware initializes hardware and reads config.txt
  3. Firmware loads kernel8.img to 0x00080000
  4. Firmware jumps to kernel entry point
  5. Kernel runs in EL1 (supervisor mode)

See Boot Sequence for kernel-side boot flow.

Hardware Differences vs Pi 3

| Feature | Pi 3 (BCM2837) | Pi 4 (BCM2711) |
|---|---|---|
| MMIO Base (ARM) | 0x3F000000 | 0xFE000000 |
| UART Clock | 48 MHz | 54 MHz |
| Interrupt Controller | ARM Local | GIC-400 |
| Max RAM | 1 GB | 1/2/4/8 GB |
| USB | 4x USB 2.0 | 2x USB 2.0 + 2x USB 3.0 |

Code porting note: Always use memory-map constants, never hardcode Pi 3 addresses.

QEMU Emulation

raspi4b Machine Type

QEMU Raspberry Pi Documentation

  • QEMU 9.0+ required for raspi4b machine type
  • Emulates: CPU, RAM, UART, GPIO (partial), system timer
  • Not emulated: PCI, Ethernet, WiFi, USB, GPU

QEMU vs Real Hardware

| Aspect | QEMU | Real Hardware |
|---|---|---|
| Boot exception level | EL2 (hypervisor) | EL1 (kernel) |
| UART initialization | Pre-configured | Must initialize |
| Timing | Approximate | Cycle-accurate |
| Interrupts | Basic GIC | Full GIC-400 |

See ADR-002 for QEMU version requirements.

Useful Resources

Similar Projects & Tutorials

Learning resources and similar bare-metal Rust projects.

Rust OS Tutorials

Philipp Oppermann’s Blog OS

Writing an OS in Rust

Target: x86_64 architecture (different from our AArch64)

Useful for reference:

  • Testing framework - Custom test harness pattern
  • Print macros - print!/println! implementation using fmt::Write trait
  • Panic handling - Separate panic handlers for test vs normal mode
  • VGA text mode concepts - Console output patterns
  • Memory management - Heap allocators, paging concepts

Less relevant:

  • x86-specific code (bootloader, interrupts, APIC)
  • VGA hardware specifics
  • x86 page table format

Best use: Architecture patterns and Rust idioms, not hardware specifics.

Rust Raspberry Pi OS Tutorials

rust-raspberrypi-OS-tutorials

Target: Raspberry Pi 3 and 4 (AArch64, same as us!)

Useful for reference:

  • Pi-specific initialization - GPIO, UART, timer setup examples
  • AArch64 assembly - Boot sequence, exception handling patterns
  • Linker scripts - Section placement approaches for Pi
  • Driver patterns - MMIO register access techniques
  • Testing approaches - QEMU-based integration test examples

Differences from DaedalusOS:

  • Uses Ruby-based build tooling (we use Cargo directly)
  • Structured as progressive tutorials (we’re focused on single working kernel)
  • Supports multiple Pi models (we’re Pi 4 only)

Best use: Reference implementation for Pi 4 hardware initialization.

Embedded Rust Resources

The Embedded Rust Book

Embedded Rust Book

Topics:

  • #![no_std] development
  • Peripheral access crates (PAC pattern)
  • Memory-mapped I/O
  • Volatile operations
  • Inline assembly

Best use: General embedded Rust patterns, not Pi-specific.

C-based OS Development

OSDev Wiki

OSDev.org

Useful sections:

  • Meaty Skeleton - Project structure inspiration
  • Memory management - Paging, heaps, allocators
  • Filesystems - Future milestone reference
  • Bootloaders - Understanding boot process

Note: Most content is x86-focused. Use for concepts, not code.

OSDev Wiki - ARM

ARM-specific articles

Relevant topics:

  • Exception handling
  • MMU setup
  • Cache management
  • SMP (multi-core) bringup

Raspberry Pi 4 Bare Metal Projects

rpi4-bare-metal by rhythm16

GitHub: rhythm16/rpi4-bare-metal

Target: Raspberry Pi 4B (BCM2711, same as us!)

Useful for reference:

  • GIC-400 implementation - Interrupt controller setup and handling examples
  • PL011 UART interrupts - Interrupt-driven I/O patterns
  • Mini-UART driver - Alternative UART implementation approach
  • BCM2711-specific initialization - Hardware bringup sequence examples

Best use: Reference implementation for GIC-400 interrupt handling on Pi 4.

rpi4os.com Tutorial Series

Writing a “bare metal” OS for Raspberry Pi 4

Target: Raspberry Pi 4B

Topics covered:

  • System timer interrupts
  • Exception handling at EL1
  • Interrupt controller setup
  • Bare metal C programming patterns

Best use: Step-by-step tutorial for Pi 4 interrupt concepts.

Valvers Bare Metal Programming

Bare Metal Programming in C

Target: Raspberry Pi series (includes Pi 4)

Useful sections:

  • Part 4: Interrupts - GIC-400 explanation and setup
  • Interrupt controller architecture
  • Bare metal C patterns for Pi

Best use: Understanding interrupt flow and GIC-400 architecture.

Important note: All Pi 4 bare metal projects require enable_gic=1 in config.txt!

Project Comparisons

When to Consult Each Resource

| Need | Resource | Why |
|---|---|---|
| Rust OS patterns | Blog OS | Architecture, testing, idioms |
| Pi 4 hardware | Rust Pi OS Tutorials, rpi4-bare-metal | Hardware initialization examples |
| ARM assembly | Rust Pi OS Tutorials | AArch64 boot/exception code patterns |
| Embedded Rust | Embedded Rust Book | #![no_std] patterns |
| OS concepts | OSDev Wiki | General OS knowledge |
| ARM architecture | OSDev ARM | ARM-specific OS dev |
| GIC-400 interrupts | rpi4-bare-metal, Valvers | Interrupt handling examples |

Using Reference Implementations

  1. Understand the concept from tutorials/docs
  2. Review similar implementations in reference projects
  3. Study hardware specifications from official datasheets
  4. Implement independently for DaedalusOS constraints
  5. Document our approach in code comments and docs

Note: These projects are reference implementations to learn from, not code to directly copy. Each has different design goals and constraints.

Architecture Decision Records (ADRs)

This directory contains Architecture Decision Records - documents that capture important architectural choices made during DaedalusOS development.

What is an ADR?

An ADR documents why a significant technical decision was made. It captures:

  • The problem or choice faced
  • Alternatives considered
  • The decision and rationale
  • Consequences and trade-offs

ADRs are lightweight (typically 100-400 lines) and focus on decision rationale, not implementation details.

When to Write an ADR

Write an ADR when:

“Would future-me wonder why this design exists?”

Specific triggers:

  • One-way doors: Hard-to-reverse decisions (e.g., target platform, no multi-arch)
  • Non-obvious trade-offs: Choices where alternatives had merit (e.g., QEMU version requirement)
  • Future-facing design: Adding abstraction/complexity now for future benefit (e.g., NetworkDevice trait)
  • Breaking conventions: Deviating from common patterns (with good reason)
  • External dependencies: Requiring specific versions/tools (e.g., QEMU 9.0+)

Don’t write ADRs for:

  • ❌ Implementation details (those go in module docs)
  • ❌ Obvious choices (e.g., “use Rust for Rust project”)
  • ❌ Easily reversible decisions (refactorings, minor API changes)
  • ❌ Temporary workarounds (comment in code is sufficient)

ADR Template

# ADR-XXX: Decision Title

**Status**: Accepted | Proposed | Deprecated | Superseded by ADR-YYY
**Date**: YYYY-MM-DD
**Decision**: One-sentence summary of the decision.

## Context

What problem are we solving? What constraints exist?
What alternatives were considered?

## Decision

What did we decide to do?
(Keep this section concise - 1-3 paragraphs)

## Rationale

Why this decision over alternatives?
- Reason 1
- Reason 2
- ...

### Alternatives Considered

**Alternative 1: [Name]**
- Pros: ...
- Cons: ...
- Why rejected: ...

**Alternative 2: [Name]**
- Pros: ...
- Cons: ...
- Why rejected: ...

## Consequences

### Positive
- Benefit 1
- Benefit 2

### Negative
- Cost 1
- Cost 2

### Neutral (optional)
- Side effect 1

## Related Decisions

- [ADR-XXX: Related Decision](adr-xxx.md) - How it relates

## References

- [External source 1](https://...)
- [External source 2](https://...)

Best Practices

1. Context Before Decision

Explain the problem and show alternatives before stating what you chose. This prevents “obvious in hindsight” bias.

Good:

## Context
We need to support multiple network devices (Pi 4 GENET, future Pi 5, QEMU mock).

Three approaches:
- A) Direct GENET usage (no abstraction)
- B) Full trait abstraction now
- C) Minimal trait now, implement later

## Decision
Chose option C: Minimal trait now...

Bad:

## Decision
We're using a trait for network devices.

## Context
This lets us support multiple devices...

2. Acknowledge Trade-offs

Good ADRs admit downsides. No decision is perfect.

Good:

### Negative
- Setup complexity: Users must build QEMU from source
- CI build time: ~4 minutes on first run

Bad:

### Consequences
- Better testing
- More accurate emulation
(No admission of downsides)

3. Status Lifecycle

Proposed → Accepted → [Deprecated | Superseded]
  • Proposed: Under discussion, not yet implemented
  • Accepted: Implemented and active
  • Deprecated: No longer recommended, but code remains
  • Superseded by ADR-XXX: Replaced by new decision

Update status when circumstances change.

4. Link Related Decisions

Decisions often build on or conflict with previous ones:

## Related Decisions
- [ADR-001: Pi 4 Only](adr-001-pi-only.md) - Why we need raspi4b specifically
- [ADR-003: Network Abstraction](adr-003.md) - Plans for multi-device support

Numbering Convention

ADRs are numbered sequentially with zero-padding:

  • adr-001-pi-only.md
  • adr-002-qemu-9.md
  • adr-003-network-device-trait.md

Numbers are permanent. If ADR-002 is superseded, we create a new ADR with the next free number (we do not rename or renumber ADR-002).

File Naming

Format: adr-NNN-short-slug.md

Examples:

  • adr-001-pi-only.md
  • adr-002-qemu-9.md
  • adr-1-raspberry-pi-4-only-target-platform.md (bad: too long, no zero-padding)

Examples in This Project

ADR-001: Raspberry Pi 4 Only

Type: Platform choice (one-way door)
Demonstrates: Clear rationale for rejecting multi-platform, detailed reversal plan

ADR-002: QEMU 9.0+ Requirement

Type: External dependency requirement
Demonstrates: “Why Not” alternatives section, multiple implementation options

ADR-003: Network Device Abstraction

Type: Future-facing design (abstraction for 1 implementation)
Demonstrates: Three options with honest pros/cons, migration path, design pattern comparisons

ADR-004: Linux Kernel Filesystem Structure

Type: Code organization (maintainability choice)
Demonstrates: Practical benefits prioritized over strict conformance, clear deviation policy

ADR-005: Multi-Board Support Strategy

Type: Future-facing architecture (multi-platform preparation)
Demonstrates: Hybrid approach reasoning, four alternatives compared, deferred implementation

Anti-Patterns to Avoid

“Implementation Masquerading as ADR”

# ADR-006: UART Driver Implementation
## Decision
The UART driver uses PL011 registers at 0xFE201000...

→ This is implementation detail, belongs in module docs.

“No Alternatives Shown”

## Decision
We use Rust.

→ If there’s no real choice, don’t write an ADR.

“Bias Toward Decision”

## Alternatives
1. Direct GENET usage - terrible, inflexible, bad
2. Trait abstraction - perfect, elegant, future-proof

→ Be honest about trade-offs.

ADR Workflow

  1. Identify decision: Recognize a significant architectural choice
  2. Draft ADR: Use template, fill in context/alternatives
  3. Discuss if needed: For team projects; solo projects can skip
  4. Implement: Make the change
  5. Finalize ADR: Update with actual implementation details
  6. Commit together: ADR and implementation in same PR/commit

For DaedalusOS (solo project), ADRs can be written during or after implementation, as long as rationale is captured while fresh.

References

ADR-001: Raspberry Pi 4 Only

Status: Accepted
Date: 2025-11-08
Decision: DaedalusOS targets only Raspberry Pi 4 Model B (BCM2711, Cortex-A72). No x86, no other ARM boards.

Context

Originally inspired by Philipp Oppermann’s Blog OS (x86_64), the project faced a choice:

  1. Maintain multi-architecture support - Keep x86_64 builds alongside Pi 4
  2. Focus on single platform - Pi 4 only, adapt patterns as needed
  3. Switch to generic ARM - Target multiple ARM platforms

Supporting multiple architectures introduces complexity:

  • Different boot processes (BIOS/UEFI vs firmware)
  • Different memory maps and MMIO access
  • Different interrupt controllers (APIC vs GIC)
  • Different assembly code for each platform
  • Testing burden across platforms

Decision

Focus exclusively on Raspberry Pi 4 Model B.

This is a one-way door decision. The codebase will:

  • Use Pi 4-specific memory addresses (0xFE000000 MMIO base)
  • Rely on Pi 4 peripherals (PL011 UART, GIC-400, BCM2711 features)
  • Drop x86_64 target specification and code
  • Optimize for single platform instead of abstraction layers

Rationale

  1. Learning focus: Deep understanding of one platform > superficial knowledge of many
  2. Hardware access: Actual Pi 4 hardware available for testing
  3. Simplicity: No abstraction layers needed for hardware access
  4. Documentation: Can cite specific datasheet sections without caveats
  5. Iteration speed: One build target, one test platform, faster feedback

Why Pi 4 Specifically?

  • Modern ARM: ARMv8-A (64-bit) with contemporary features
  • Available hardware: Widely available, affordable (~$35-75)
  • Good documentation: BCM2711 peripherals PDF, ARM Cortex-A72 TRM
  • QEMU support: raspi4b machine type (QEMU 9.0+)
  • Ecosystem: Active community, learning resources

Consequences

Positive

  • Simpler codebase: No platform abstraction, direct hardware access
  • Better documentation: Can reference exact register addresses
  • Faster development: One platform to test and verify
  • Deeper learning: Master one SoC instead of many abstractions

Negative

  • Not portable: Cannot run on x86, other ARM boards, or cloud VMs
  • Historical code lost: x86 code lives only in git history, will rot
  • Limited audience: Only useful to Pi 4 owners/learners

Neutral

  • Code reuse: Patterns (print macros, testing) still portable to other projects
  • Future expansion: Could add Pi 5 later if justified (new ADR required)

Reversal Plan

If multi-architecture support becomes necessary:

  1. Create ADR-00X documenting new scope and rationale
  2. Design HAL (Hardware Abstraction Layer) separating platform code
  3. Restructure codebase:
    src/
    ├── platform/
    │   ├── rpi4/     # Pi 4 specific
    │   └── x86_64/   # New platform
    ├── drivers/      # Generic drivers
    └── kernel/       # Platform-independent code
    
  4. Test on both platforms before merging
  5. Update all documentation for multi-platform reality

Cost estimate: 2-4 weeks of refactoring, significant ongoing testing burden.

Triggers for reversal:

  • Project scope expands beyond learning (e.g., production deployment)
  • Need to support multiple Pi models with different hardware (Pi 5, CM4)
  • Community contributions require broader hardware support
  • Cloud/VM deployment becomes a requirement (x86_64)

Current assessment: Not triggered. Learning focus remains valid.

Current State

  • x86_64 code removed from main branch (2025-11-08)
  • Linker script, boot assembly, and memory map are Pi 4-specific
  • All documentation assumes Pi 4 hardware

ADR-002: QEMU 9.0+ Requirement

Status: Accepted
Date: 2025-11-09
Decision: DaedalusOS requires QEMU 9.0 or newer for emulation testing.

Context

QEMU is the primary tool for kernel development and testing:

  • Fast iteration: Test changes without SD card flashing
  • Debugging: GDB integration, semihosting for test output
  • CI/CD: Automated testing in GitHub Actions

However, QEMU’s Raspberry Pi support evolved over time:

  • QEMU 6.1: Added raspi3b (Pi 3) machine type
  • QEMU 6.2: Improved Pi 3 emulation
  • QEMU 8.x: Various improvements, but no Pi 4
  • QEMU 9.0 (April 2024): Added raspi4b machine type for Pi 4

Problem

Many Linux distributions ship older QEMU versions:

  • Ubuntu 22.04 LTS: QEMU 6.2 (no raspi4b)
  • Ubuntu 24.04 LTS: QEMU 8.2 (still no raspi4b!)
  • Ubuntu 24.10+: QEMU 9.0+ (has raspi4b)

Installing via apt install qemu-system-aarch64 on Ubuntu 22.04/24.04 results in:

qemu-system-aarch64: unsupported machine type
Use -machine help to list supported machines

Decision

Require QEMU 9.0 or newer for DaedalusOS development and testing.

Implementation

  1. Documentation: README and setup guides specify QEMU 9.0+ requirement
  2. CI/CD: GitHub Actions builds QEMU 9.2 from source with caching
  3. Verification: qemu-system-aarch64 -M help | grep raspi must show raspi4b

Rationale

Why Not Fallback to raspi3b?

Using raspi3b machine type (Pi 3 emulation) was considered but rejected:

Hardware differences:

  • Different MMIO base (0x3F000000 vs 0xFE000000)
  • Different UART clock (48 MHz vs 54 MHz)
  • Different interrupt controller (ARM local vs GIC-400)
  • Missing Pi 4-specific features

Code impact:

  • Would require conditional compilation (#[cfg]) for QEMU vs hardware
  • Breaks “one platform” philosophy (see ADR-001)
  • Tests wouldn’t validate real hardware behavior
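
To make the code impact concrete, the sketch below shows roughly what the rejected fallback would have required. The feature name `qemu_raspi3b` and the constant names are hypothetical and do not exist in DaedalusOS. The two base addresses and the PL011 address 0xFE201000 come from this documentation; the 0x20_1000 offset is derived from them.

```rust
/// Peripheral base address, selected at compile time.
/// `qemu_raspi3b` is a hypothetical Cargo feature used only for illustration.
#[cfg(feature = "qemu_raspi3b")]
pub const MMIO_BASE: usize = 0x3F00_0000; // Pi 3 layout (raspi3b machine)
#[cfg(not(feature = "qemu_raspi3b"))]
pub const MMIO_BASE: usize = 0xFE00_0000; // Pi 4 layout (hardware and raspi4b)

/// Offset of the PL011 UART from the peripheral base
/// (0xFE20_1000 - 0xFE00_0000 on the Pi 4).
pub const UART0_OFFSET: usize = 0x20_1000;

/// Address of the PL011 register block for whichever layout was selected.
pub const fn uart0_base() -> usize {
    MMIO_BASE + UART0_OFFSET
}
```

Every driver that touches MMIO would depend on a switch like this, and the test suite would exercise the Pi 3 branch rather than the code that runs on real hardware.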

Why Not Wait for Distribution Packages?

Timeline reality:

  • Ubuntu 24.04 LTS released April 2024, still ships QEMU 8.2
  • Ubuntu 26.04 LTS (April 2026) will likely have QEMU 10+
  • Can’t wait 1-2 years for package availability

Alternative: Build from source or use newer Ubuntu (24.10+).

Consequences

Positive

  • Accurate emulation: Tests run on Pi 4-equivalent environment
  • Single codebase: No QEMU-specific workarounds
  • Future-proof: Latest QEMU features available

Negative

  • Setup complexity: Users on older Ubuntu must build from source
  • CI build time: First GH Actions run takes ~4 minutes to compile QEMU
  • Storage: QEMU build artifacts ~300 MB (mitigated by caching)

Reversal Plan

This decision will naturally reverse itself as Linux distributions catch up:

When distribution packages suffice:

  1. Update README to recommend apt install qemu-system-aarch64 (1 line change)
  2. Simplify CI workflow to use apt instead of building from source
  3. Remove QEMU build caching steps from GitHub Actions
  4. Update ADR-002 status to “Superseded by standard packages”

Estimated timeline: Ubuntu 26.04 LTS (April 2026) will likely ship QEMU 10+

Cost of reversal: Minimal (simplification, not refactoring)

Triggers for early reversal:

  • Ubuntu backports QEMU 9.0+ to 24.04 LTS (check ubuntu-proposed)
  • Raspberry Pi official QEMU binaries become available
  • CI environment switches to newer Ubuntu version

This is a temporary workaround that will age out naturally.

Implementation Options

Option 1: Build QEMU from Source (Recommended)

# Install build dependencies (libslirp-dev is needed for --enable-slirp below)
sudo apt-get install -y ninja-build libglib2.0-dev libpixman-1-dev libslirp-dev

# Download and build QEMU 9.2
wget https://download.qemu.org/qemu-9.2.0.tar.xz
tar xf qemu-9.2.0.tar.xz
cd qemu-9.2.0
./configure --prefix=$HOME/qemu-install --target-list=aarch64-softmmu --enable-slirp
make -j$(nproc)
make install

# Add to PATH
export PATH="$HOME/qemu-install/bin:$PATH"

Pros: Full control, latest version
Cons: ~4 minute build time, 300 MB disk space

Option 2: Upgrade to Ubuntu 24.10+

# Check current version
lsb_release -a

# Upgrade if on 24.04 or earlier
# (Follow Ubuntu upgrade guide)

Pros: Simple apt install
Cons: Major OS upgrade, may break other tools

Option 3: Use Pre-built Binary

Status: Not available. QEMU only distributes source tarballs.

CI/CD Strategy

GitHub Actions (.github/workflows/ci.yml):

- name: Cache QEMU build
  id: cache-qemu
  uses: actions/cache@v4
  with:
    path: ~/qemu-install
    key: qemu-9.2.0-aarch64

- name: Build QEMU 9.2
  # Skip the build when the cache step restored ~/qemu-install
  if: steps.cache-qemu.outputs.cache-hit != 'true'
  run: |
    # Build from source (first run only)

- name: Run tests
  run: cargo test  # Uses cached QEMU

First run: ~8 minutes total (4 min QEMU build + 4 min tests)
Subsequent runs: ~4 minutes (cached QEMU, tests only)

Verification

Check QEMU version and raspi4b support:

$ qemu-system-aarch64 --version
QEMU emulator version 9.2.0

$ qemu-system-aarch64 -M help | grep raspi
raspi0               Raspberry Pi Zero (revision 1.2)
raspi1ap             Raspberry Pi A+ (revision 1.1)
raspi2b              Raspberry Pi 2B (revision 1.1)
raspi3ap             Raspberry Pi 3A+ (revision 1.0)
raspi3b              Raspberry Pi 3B (revision 1.2)
raspi4b              Raspberry Pi 4B (revision 1.2)  ← Must be present
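
The same check can be automated, for example in a test runner that refuses to start on an unsuitable QEMU. The sketch below only parses text that has already been captured from the two commands above; the function names are hypothetical and are not part of DaedalusOS.

```rust
/// Parse the major and minor version from the first line of
/// `qemu-system-aarch64 --version`, e.g. "QEMU emulator version 9.2.0".
pub fn parse_qemu_version(output: &str) -> Option<(u32, u32)> {
    let first_line = output.lines().next()?;
    // The version number follows the word "version".
    let rest = first_line.split("version").nth(1)?.trim();
    // Keep only the leading "9.2.0" token and ignore any trailing text.
    let token = rest.split_whitespace().next()?;
    let mut parts = token.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// Return true if the output of `-M help` lists the `raspi4b` machine.
pub fn has_raspi4b(machine_help: &str) -> bool {
    machine_help
        .lines()
        .any(|line| line.split_whitespace().next() == Some("raspi4b"))
}

/// Combined check: QEMU 9.0 or newer with `raspi4b` available.
pub fn qemu_is_usable(version_output: &str, machine_help: &str) -> bool {
    matches!(parse_qemu_version(version_output), Some((major, _)) if major >= 9)
        && has_raspi4b(machine_help)
}
```

Matching on the first whitespace-separated token avoids false positives from machine descriptions that merely mention a Pi 4.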

ADR-003: Network Device Abstraction Layer

Status: Accepted
Date: 2025-11-10
Decision: Implement NetworkDevice trait abstraction for network hardware drivers.

Context

DaedalusOS currently targets Raspberry Pi 4 exclusively (ADR-001), which uses the BCM2711 GENET v5 Ethernet controller. However, future expansion plans include:

  1. Raspberry Pi 5 support - Different Ethernet controller (when QEMU support available)
  2. QEMU mock driver - Enable network stack testing in emulation (Milestone #14)
  3. smoltcp integration - TCP/IP stack expects generic device abstraction (Milestone #16)

Two architectural approaches were considered:

Option A: Direct GENET Usage (No Abstraction)

// All network code directly uses GenetController
let mut genet = GenetController::new();
genet.init()?;
genet.transmit(frame)?;

Pros: Simpler initially, no abstraction overhead
Cons: Tight coupling, difficult to add Pi 5 or mock drivers later

Option B: Trait Abstraction Now

// Network code uses trait, implementation is pluggable
let mut netdev: Box<dyn NetworkDevice> = Box::new(GenetController::new());
netdev.init()?;
netdev.transmit(frame)?;

Pros: Future-proof, testable, aligns with smoltcp patterns
Cons: Extra abstraction layer, more upfront design

Option C: Minimal Trait Now, Full Implementation Later (Chosen)

// Trait exists, but only one implementation initially
trait NetworkDevice {
    fn init(&mut self) -> Result<(), NetworkError>;
    fn transmit(&mut self, frame: &[u8]) -> Result<(), NetworkError>;
    fn receive(&mut self) -> Option<&[u8]>;
    // ... minimal interface
}

impl NetworkDevice for GenetController { /* ... */ }

Pros: Captures design now, enables gradual implementation
Cons: None significant

Decision

Implement NetworkDevice trait abstraction in Milestone #12 (alongside protocol structures).

The trait provides:

  • Hardware detection (is_present())
  • Lifecycle management (init())
  • Frame I/O (transmit(), receive())
  • Metadata (mac_address(), link_up())

Current implementations:

  • GenetController (Pi 4 GENET v5)

Future implementations:

  • Mock device for QEMU (Milestone #14)
  • Pi 5 Ethernet controller (when hardware available)

Rationale

Why Now (Milestone #12) Instead of Later?

  1. Low cost: Trait definition is small (~100 lines), mostly documentation
  2. Captures design intent: Documents interface requirements while fresh
  3. Enables testing: Mock driver can be added in Milestone #14 without refactoring
  4. Aligns with smoltcp: Their Device trait expects similar abstraction

Why This Interface?

Blocking transmit, non-blocking receive:

  • Simplifies initial implementation (interrupts come in Milestone #14)
  • Common pattern in embedded networking (Linux ndo_start_xmit, smoltcp)
  • API remains stable when adding interrupt-driven I/O

Single-frame API (no queues):

  • Pushes buffer management to implementation (GENET has hardware rings)
  • Keeps trait simple and focused
  • Protocol stacks (smoltcp) poll in loops and manage their own buffers

Optional link_up() with default:

  • Not all devices have PHY link detection (mock drivers)
  • Default returns false (conservative)
  • Real hardware can override

Result-based error handling:

  • NetworkError enum covers all failure modes
  • Explicit errors better than silent failures in bare-metal

Consequences

Positive

  • Future-proof: Adding Pi 5 or mock drivers requires no refactoring
  • Testable: Can swap real hardware for mock in tests
  • smoltcp integration: Clean Device trait implementation (wrap our trait)
  • Clear interface: Documents exactly what network hardware must provide

Negative

  • Abstraction overhead: Extra trait layer (negligible in practice)
  • Not strictly needed: Could delay until Pi 5 support (but harder to retrofit)

Neutral

  • Current code unchanged: GENET driver gains trait implementation, no functional changes
  • API stability: Trait signature designed to remain stable through interrupt-driven I/O

Implementation Details

Module Structure

src/
├── drivers/net/
│   ├── netdev.rs                     # Trait definition, NetworkError
│   └── ethernet/broadcom/genet.rs    # impl NetworkDevice for GenetController
└── net/
    ├── ethernet.rs       # Ethernet protocol (uses trait in future)
    └── arp.rs            # ARP protocol

Trait Definition

pub trait NetworkDevice {
    fn is_present(&self) -> bool;
    fn init(&mut self) -> Result<(), NetworkError>;
    fn transmit(&mut self, frame: &[u8]) -> Result<(), NetworkError>;
    fn receive(&mut self) -> Option<&[u8]>;
    fn mac_address(&self) -> MacAddress;
    fn link_up(&self) -> bool { false }  // Default implementation
}

Error Types

pub enum NetworkError {
    HardwareNotPresent,
    NotInitialized,
    TxBufferFull,
    FrameTooLarge,
    FrameTooSmall,
    HardwareError,
    Timeout,
    InvalidConfiguration,
}
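
Because failures are explicit values, they can also be made printable for shell diagnostics. The sketch below is one way to do that, not the kernel's actual code: the enum is copied locally so the example compiles on its own, and the message strings are invented for illustration. It uses `std::fmt` so it runs on a host; in the `no_std` kernel the same impl would use `core::fmt`.

```rust
use std::fmt; // in the no_std kernel this would be core::fmt

/// Local copy of the error enum so the sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetworkError {
    HardwareNotPresent,
    NotInitialized,
    TxBufferFull,
    FrameTooLarge,
    FrameTooSmall,
    HardwareError,
    Timeout,
    InvalidConfiguration,
}

impl fmt::Display for NetworkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Illustrative wording only; the match is exhaustive, so adding a
        // variant later forces this function to be updated.
        let msg = match self {
            NetworkError::HardwareNotPresent => "network hardware not present",
            NetworkError::NotInitialized => "device not initialized",
            NetworkError::TxBufferFull => "transmit buffer full",
            NetworkError::FrameTooLarge => "frame too large",
            NetworkError::FrameTooSmall => "frame too small",
            NetworkError::HardwareError => "hardware error",
            NetworkError::Timeout => "operation timed out",
            NetworkError::InvalidConfiguration => "invalid configuration",
        };
        f.write_str(msg)
    }
}
```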

Frame Size Validation

Trait implementations enforce Ethernet frame size constraints:

  • Minimum: 60 bytes (excludes 4-byte CRC)
  • Maximum: 1514 bytes (excludes 4-byte CRC)

Source: IEEE 802.3 Ethernet standard
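
Both limits follow from the on-wire frame sizes with the 4-byte CRC removed: 64 - 4 = 60 and 1518 - 4 = 1514. A check along these lines could sit at the top of a transmit() implementation. The constant and function names are hypothetical, and the enum is reduced to the two variants the check needs.

```rust
/// Smallest frame accepted, excluding the 4-byte CRC (64 - 4).
pub const MIN_FRAME_SIZE: usize = 60;
/// Largest frame accepted, excluding the 4-byte CRC (1518 - 4).
pub const MAX_FRAME_SIZE: usize = 1514;

/// Subset of `NetworkError` needed for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum NetworkError {
    FrameTooLarge,
    FrameTooSmall,
}

/// Reject frames outside the valid size range before touching hardware.
pub fn validate_frame_len(frame: &[u8]) -> Result<(), NetworkError> {
    if frame.len() < MIN_FRAME_SIZE {
        return Err(NetworkError::FrameTooSmall);
    }
    if frame.len() > MAX_FRAME_SIZE {
        return Err(NetworkError::FrameTooLarge);
    }
    Ok(())
}
```

Both bounds are inclusive, so frames of exactly 60 and exactly 1514 bytes are accepted.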

Design Patterns

Pattern 1: Linux net_device

The Linux kernel uses struct net_device with function pointers:

struct net_device_ops {
    int (*ndo_init)(struct net_device *dev);
    int (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev);
    // ...
};

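Our trait is the Rust equivalent. Generic code (T: NetworkDevice) is dispatched at compile time, while Box<dyn NetworkDevice> dispatches through a vtable at run time, much like the function-pointer table above.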
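
How a trait call is dispatched depends on how the trait is used, not on the trait itself. The sketch below contrasts the two forms with a reduced stand-in for NetworkDevice; `CountingDevice`, `send_static`, and `send_dyn` are hypothetical names used only here.

```rust
/// Reduced stand-in for the `NetworkDevice` trait, just enough for this sketch.
pub trait NetworkDevice {
    fn transmit(&mut self, frame: &[u8]) -> Result<(), ()>;
}

/// Hypothetical device that only counts transmitted frames.
pub struct CountingDevice {
    pub sent: usize,
}

impl NetworkDevice for CountingDevice {
    fn transmit(&mut self, _frame: &[u8]) -> Result<(), ()> {
        self.sent += 1;
        Ok(())
    }
}

/// Static dispatch: the compiler generates one copy per concrete device
/// type and can inline the call; no vtable is involved.
pub fn send_static<D: NetworkDevice>(dev: &mut D, frame: &[u8]) -> Result<(), ()> {
    dev.transmit(frame)
}

/// Dynamic dispatch: a single compiled copy whose call goes through a
/// vtable, the closest analogue to `net_device_ops` function pointers.
pub fn send_dyn(dev: &mut dyn NetworkDevice, frame: &[u8]) -> Result<(), ()> {
    dev.transmit(frame)
}
```

The same implementation serves both call styles, so choosing between them can be deferred until there is more than one device.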

Pattern 2: embedded-hal

Rust embedded ecosystem uses trait abstractions:

pub trait SpiDevice {
    fn transfer(&mut self, read: &mut [u8], write: &[u8]) -> Result<(), Self::Error>;
}

Our NetworkDevice follows this pattern for bare-metal Rust.

Pattern 3: smoltcp Device

smoltcp expects a Device trait:

pub trait Device {
    fn receive(&mut self, timestamp: Instant) -> Option<(Self::RxToken, Self::TxToken)>;
}

We’ll implement smoltcp’s trait by wrapping our NetworkDevice trait in Milestone #16.

Testing Impact

Unit Tests

Added test_network_device_trait() validating:

  • Frame size validation (too small, too large)
  • Error handling (NotInitialized state)
  • MAC address retrieval

Integration Tests (Future)

#[test_case]
fn test_mock_network_device() {
    let mut mock = MockNetworkDevice::new();
    mock.init().unwrap();

    // Inject test frame
    mock.inject_rx_frame(&test_frame);
    assert!(mock.receive().is_some());

    // Capture TX frames
    mock.transmit(&outgoing_frame).unwrap();
    assert_eq!(mock.captured_tx_frames().len(), 1);
}

Migration Path

Current (Milestone #12)

// Direct usage of trait implementation
// (paths follow the module structure shown above)
use daedalus::drivers::net::ethernet::broadcom::genet::GenetController;
use daedalus::drivers::net::netdev::NetworkDevice;

let mut genet = GenetController::new();
if genet.is_present() {
    genet.init()?;
    genet.transmit(&frame)?;
}

Future (Milestone #14+)

// Runtime selection of implementation
use daedalus::drivers::net::netdev::NetworkDevice;

let mut netdev: Box<dyn NetworkDevice> = if in_qemu() {
    Box::new(MockNetworkDevice::new())
} else {
    Box::new(GenetController::new())
};

netdev.init()?;
// Same API for both implementations

smoltcp Integration (Milestone #16)

// Wrap our trait in smoltcp's Device trait
struct DaedalusDevice<T: NetworkDevice> {
    inner: T,
    rx_buffer: [u8; 1518],
}

impl<T: NetworkDevice> smoltcp::phy::Device for DaedalusDevice<T> {
    fn receive(&mut self, _timestamp: Instant) -> Option<(RxToken, TxToken)> {
        // Map our receive() to smoltcp's token API
    }
}

Alternatives Considered

Alternative 1: Delay Until Pi 5 Support

Rejected: Retrofitting abstraction later requires:

  1. Refactoring all network code
  2. Changing function signatures throughout codebase
  3. Risk of breaking working code

Cost of adding trait now is minimal, benefit is substantial.

Alternative 2: Use embedded-hal Traits

Rejected: embedded-hal doesn’t define network device traits (only SPI, I2C, GPIO, etc.). We’d need to design our own anyway.

Alternative 3: Function Pointers (C-style)

struct NetworkDevice {
    init: fn(&mut Self) -> Result<(), NetworkError>,
    transmit: fn(&mut Self, &[u8]) -> Result<(), NetworkError>,
    // ...
}

Rejected: Rust traits provide better type safety, compile-time dispatch, and zero-cost abstraction.

Reversal Plan

If the abstraction proves unnecessary (e.g., we never add Pi 5 or mock drivers):

To remove trait abstraction:

  1. Change all use NetworkDevice to direct GenetController usage
  2. Replace trait method calls with direct GENET method calls
  3. Delete src/drivers/net/netdev.rs (~290 lines)
  4. Update documentation to remove trait references
  5. Mark ADR-003 as “Deprecated - Abstraction not needed”

Cost estimate: ~2 hours (straightforward refactoring, all usage is local)

Triggers for reversal:

  • Milestone #14 skipped (no QEMU mock driver implemented)
  • Milestone #16 uses smoltcp differently (doesn’t need our trait)
  • Pi 5 support deemed out of scope permanently
  • Trait adds measurable performance overhead (unlikely but possible)

Likelihood: Low. The trait is minimal (~100 lines) and already implemented. More likely we’ll add implementations than remove the abstraction.

Current State

  • NetworkDevice trait defined (src/drivers/net/netdev.rs)
  • GenetController implements trait
  • ✅ 66 unit tests passing (added 1 new test)
  • ✅ Documentation complete
  • ⏳ Milestone #13 will use trait for TX/RX implementation

Future Work

Milestone #14: Mock Network Device

pub struct MockNetworkDevice {
    rx_queue: Vec<Vec<u8>>,
    tx_captured: Vec<Vec<u8>>,
    mac: MacAddress,
}

impl NetworkDevice for MockNetworkDevice {
    // Enable network stack testing in QEMU
}
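
To show that the interface is small enough to mock, here is one possible way the outline above could be filled in. It is a sketch under stated assumptions, not the planned implementation: the trait and a reduced error enum are redeclared locally, `MacAddress` is assumed to be a 6-byte array, the receive queue is a `VecDeque` rather than a `Vec`, and the `initialized` and `rx_current` fields are additions. It uses `std` collections so it runs on a host; the kernel would take them from `alloc`.

```rust
use std::collections::VecDeque; // in the kernel: alloc::collections::VecDeque

pub type MacAddress = [u8; 6]; // assumed representation

#[derive(Debug, PartialEq, Eq)]
pub enum NetworkError { NotInitialized, FrameTooLarge, FrameTooSmall }

pub trait NetworkDevice {
    fn is_present(&self) -> bool;
    fn init(&mut self) -> Result<(), NetworkError>;
    fn transmit(&mut self, frame: &[u8]) -> Result<(), NetworkError>;
    fn receive(&mut self) -> Option<&[u8]>;
    fn mac_address(&self) -> MacAddress;
    fn link_up(&self) -> bool { false }
}

pub struct MockNetworkDevice {
    initialized: bool,
    rx_queue: VecDeque<Vec<u8>>,
    rx_current: Option<Vec<u8>>, // owns the frame last handed out by receive()
    tx_captured: Vec<Vec<u8>>,
    mac: MacAddress,
}

impl MockNetworkDevice {
    pub fn new() -> Self {
        Self {
            initialized: false,
            rx_queue: VecDeque::new(),
            rx_current: None,
            tx_captured: Vec::new(),
            mac: [0x02, 0, 0, 0, 0, 0x01], // locally administered address
        }
    }
    /// Queue a frame that a later receive() call will return.
    pub fn inject_rx_frame(&mut self, frame: &[u8]) {
        self.rx_queue.push_back(frame.to_vec());
    }
    /// Frames passed to transmit() so far, oldest first.
    pub fn captured_tx_frames(&self) -> &[Vec<u8>] {
        &self.tx_captured
    }
}

impl NetworkDevice for MockNetworkDevice {
    fn is_present(&self) -> bool { true }
    fn init(&mut self) -> Result<(), NetworkError> {
        self.initialized = true;
        Ok(())
    }
    fn transmit(&mut self, frame: &[u8]) -> Result<(), NetworkError> {
        if !self.initialized { return Err(NetworkError::NotInitialized); }
        if frame.len() < 60 { return Err(NetworkError::FrameTooSmall); }
        if frame.len() > 1514 { return Err(NetworkError::FrameTooLarge); }
        self.tx_captured.push(frame.to_vec());
        Ok(())
    }
    fn receive(&mut self) -> Option<&[u8]> {
        // Non-blocking: move the next frame (if any) into `rx_current`
        // so a slice borrowed from `self` can be returned.
        self.rx_current = self.rx_queue.pop_front();
        self.rx_current.as_deref()
    }
    fn mac_address(&self) -> MacAddress { self.mac }
    // link_up() is not overridden, so the default (false) applies.
}
```

Keeping the popped frame in `rx_current` is what lets receive() return `Option<&[u8]>` without copying into a caller-supplied buffer.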

Milestone #16: smoltcp Integration

impl<T: NetworkDevice> smoltcp::phy::Device for DaedalusDevice<T> {
    // Bridge our trait to smoltcp's expectations
}

Pi 5 Support (Future)

pub struct Pi5EthernetController { /* ... */ }

impl NetworkDevice for Pi5EthernetController {
    // Same interface, different hardware
}

Related Decisions

  • ADR-001: Pi 4 Only - Why single platform (but plan for expansion)
  • Future ADR: Pi 5 Support (when QEMU gains raspi5b machine type)

References

Standards

  • IEEE 802.3: Ethernet frame format and size constraints
  • RFC 1122: Requirements for Internet Hosts (network layer expectations)

Implementation

  • Module: src/drivers/net/netdev.rs
  • Usage: src/drivers/net/ethernet/broadcom/genet.rs (NetworkDevice implementation)
  • Tests: src/drivers/net/ethernet/broadcom/genet.rs::tests::test_network_device_trait

ADR-004: Linux Kernel Filesystem Structure

Status: Accepted
Date: 2025-11-11
Decision: Reorganize source tree following Linux kernel subsystem conventions where they improve maintainability, simplicity, and ease of implementation.

Context

DaedalusOS currently uses a flat driver directory structure:

src/
├── drivers/
│   ├── uart.rs      # PL011 UART driver
│   ├── gpio.rs      # BCM2711 GPIO driver
│   ├── genet.rs     # Broadcom GENET ethernet
│   ├── gic.rs       # GIC-400 interrupt controller
│   ├── timer.rs     # BCM2711 system timer
│   └── netdev.rs    # NetworkDevice trait
├── net/             # Protocol stack
├── arch/aarch64/    # Architecture-specific code
├── allocator.rs     # Heap allocator
└── exceptions.rs    # Exception handling

Problems with Current Structure

  1. Poor Scalability: Flat drivers/ directory will become cluttered as we add:

    • Multiple network drivers (WiFi, USB ethernet, mock devices)
    • Additional serial devices (mini UART, console abstraction)
    • More interrupt controllers (if porting to other boards)
  2. Unclear Organization: Files like netdev.rs sit alongside hardware drivers

    • Is netdev.rs a driver or an abstraction?
    • Where would a second GPIO driver go?
  3. Generic Naming: Files like uart.rs, gpio.rs, timer.rs don’t indicate:

    • Which hardware they support (PL011? BCM2711? Generic?)
    • Platform specificity (Pi 4 only)
  4. Convention Mismatch: Structure doesn’t match established patterns:

    • Linux kernel uses subsystem directories (drivers/tty/, drivers/irqchip/)
    • Experienced developers expect familiar layout
    • AI agents trained on Linux kernel code struggle with flat structures
  5. Missing Separation: Architecture-specific and generic kernel code sit together at the top level:

    • exceptions.rs is AArch64-specific but lives in src/
    • allocator.rs is generic memory management but sits at top level

Why Use Linux-Inspired Structure?

Linux kernel structure provides a proven foundation that balances multiple goals:

  • Proven scalability: Structure handles thousands of drivers across decades
  • Clear conventions: Established patterns for subsystem organization
  • Developer familiarity: Most OS developers recognize the layout, which is a bonus
  • Agent familiarity: LLMs trained on Linux kernel code navigate similar structures naturally
  • Maintainability: Clear boundaries between subsystems reduce cognitive load

We use Linux conventions where they align with our goals (maintainability, simplicity, ease of implementation), not as a strict requirement.

Alternatives Considered

Alternative 1: Keep Current Flat Structure

Pros: Simple, no migration work
Cons: Doesn’t scale, unclear organization, convention mismatch
Rejected: Already causing confusion about where new files should go

Alternative 2: Custom Hierarchical Structure

src/
├── devices/
│   ├── serial/uart.rs
│   ├── gpio/gpio.rs
│   └── network/genet.rs
├── memory/allocator.rs
└── interrupts/gic.rs

Pros: Cleaner than flat, custom to our needs
Cons: Unfamiliar to everyone, reinventing conventions
Rejected: No benefit over established Linux structure

Alternative 3: Minimal Rust-Idiomatic Structure

src/
├── hal/              # Hardware Abstraction Layer
│   ├── uart.rs
│   └── gpio.rs
├── drivers/          # High-level drivers
│   └── network.rs
└── platform/         # Platform-specific code
    └── bcm2711/

Pros: Matches embedded Rust embedded-hal pattern
Cons: Doesn’t match OS development conventions, unclear boundaries
Rejected: DaedalusOS is an OS, not an embedded HAL library

Decision

Reorganize source tree following Linux kernel subsystem conventions for improved maintainability and developer familiarity, using deep nesting and specific chip/device naming.

Target Structure

src/
├── main.rs
├── lib.rs
├── shell.rs
├── qemu.rs
│
├── mm/                              # Memory Management (Linux: mm/)
│   ├── mod.rs
│   └── allocator.rs
│
├── arch/                            # Architecture-specific (Linux: arch/)
│   └── aarch64/
│       ├── mod.rs
│       ├── boot.s
│       ├── exceptions.s
│       ├── exceptions.rs            # ← Move from src/
│       └── mmu.rs
│
├── drivers/                         # Device Drivers (Linux: drivers/)
│   ├── mod.rs
│   │
│   ├── tty/                         # TTY subsystem (Linux: drivers/tty/)
│   │   ├── mod.rs
│   │   └── serial/
│   │       ├── mod.rs
│   │       └── amba_pl011.rs        # ← Rename uart.rs, match amba-pl011.c
│   │
│   ├── gpio/                        # GPIO subsystem (Linux: drivers/gpio/)
│   │   ├── mod.rs
│   │   └── bcm2711.rs               # ← Rename gpio.rs, chip-specific
│   │
│   ├── net/                         # Network devices (Linux: drivers/net/)
│   │   ├── mod.rs
│   │   ├── netdev.rs                # NetworkDevice trait
│   │   └── ethernet/
│   │       ├── mod.rs
│   │       └── broadcom/
│   │           ├── mod.rs
│   │           └── genet.rs         # ← Move from drivers/
│   │
│   ├── irqchip/                     # Interrupt controllers (Linux: drivers/irqchip/)
│   │   ├── mod.rs
│   │   └── gic_v2.rs                # ← Rename gic.rs, GIC-400 is v2
│   │
│   └── clocksource/                 # Timers (Linux: drivers/clocksource/)
│       ├── mod.rs
│       └── bcm2711.rs               # ← Rename timer.rs
│
└── net/                             # Network Protocol Stack (Linux: net/)
    ├── mod.rs
    ├── ethernet.rs
    └── arp.rs

File Migrations

| Current Path | New Path | Linux Reference |
|---|---|---|
| src/allocator.rs | src/mm/allocator.rs | mm/slab.c |
| src/exceptions.rs | src/arch/aarch64/exceptions.rs | arch/arm64/kernel/traps.c |
| src/drivers/uart.rs | src/drivers/tty/serial/amba_pl011.rs | drivers/tty/serial/amba-pl011.c |
| src/drivers/gpio.rs | src/drivers/gpio/bcm2711.rs | drivers/gpio/gpio-bcm2711.c |
| src/drivers/genet.rs | src/drivers/net/ethernet/broadcom/genet.rs | drivers/net/ethernet/broadcom/genet/bcmgenet.c |
| src/drivers/gic.rs | src/drivers/irqchip/gic_v2.rs | drivers/irqchip/irq-gic.c |
| src/drivers/timer.rs | src/drivers/clocksource/bcm2711.rs | drivers/clocksource/bcm2835_timer.c |
| src/drivers/netdev.rs | src/drivers/net/netdev.rs | include/linux/netdevice.h |

Naming Conventions

Use specific, clear names (happens to align with Linux patterns):

  • Use specific chip/device names: bcm2711.rs, amba_pl011.rs (not ambiguous gpio.rs, uart.rs)
  • Use underscores in Rust filenames: gic_v2.rs (Rust convention, adapted from Linux irq-gic.c)
  • Use descriptive subsystem names: irqchip/, clocksource/ (clarifies purpose better than irq/, timer/)

Rationale

Why Deep Nesting?

Objection: “Rust prefers flat modules, deep nesting is un-idiomatic”

Primary benefit - Better organization:

  • Prevents cluttered flat directories as driver count grows
  • Clear subsystem boundaries improve maintainability
  • Vendor/chip-specific directories group related code naturally
  • Easy to find where new drivers should go

Technical compatibility - Rust handles nesting well:

  • Rust’s module system handles deep nesting naturally via mod.rs files
  • No impact on compilation, borrow checking, or lifetimes
  • pub use re-exports provide clean public API when needed
  • Cargo handles nested modules automatically
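
The re-export point can be sketched with inline modules standing in for the directory tree. `Pl011` is a placeholder type with a made-up constructor, not the real driver API; the module names follow the target structure, and the short `uart` path mirrors the alias idea used during migration.

```rust
pub mod drivers {
    pub mod tty {
        pub mod serial {
            pub mod amba_pl011 {
                /// Placeholder driver type; the real driver's API is not shown here.
                pub struct Pl011 {
                    pub base: usize,
                }
                impl Pl011 {
                    pub const fn new(base: usize) -> Self {
                        Self { base }
                    }
                }
            }
            // Re-export so callers can write `drivers::tty::serial::Pl011`
            // instead of naming the chip-specific file.
            pub use self::amba_pl011::Pl011;
        }
    }

    /// Short alias path that forwards to the nested location.
    pub mod uart {
        pub use super::tty::serial::*;
    }
}

/// All three paths name the same type, so this compiles without conversion.
pub fn uart_base_via_short_path() -> usize {
    let uart: drivers::uart::Pl011 =
        drivers::tty::serial::amba_pl011::Pl011::new(0xFE20_1000);
    uart.base
}
```

Nesting therefore affects where files live, while the paths callers see remain a separate choice made with `pub use`.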

Real-world Rust OS examples also choose nested structures:

  • Redox OS: Uses nested driver structure
  • Theseus OS: Uses subsystem directories
  • Blog OS: Small project, but uses arch/ separation

Why Specific Naming (bcm2711.rs vs gpio.rs)?

Primary benefit - Eliminates ambiguity:

Generic names create confusion as the codebase grows:

  • gpio.rs - Which GPIO controller? BCM2711? RP2040? Abstract trait?
  • uart.rs - PL011? Mini UART? 16550? Multiple implementations?
  • timer.rs - System timer? ARM generic timer? Watchdog timer?

Specific chip/device names provide immediate clarity:

  • bcm2711.rs - Unmistakably the BCM2711 GPIO driver
  • amba_pl011.rs - Clearly ARM’s PL011 UART (portable to other SoCs using PL011)
  • genet.rs under broadcom/ - Broadcom’s GENET MAC, not Intel or Realtek

Secondary benefit - Enables multiple implementations naturally:

drivers/gpio/
├── mod.rs
├── bcm2711.rs        # Pi 4 GPIO
└── bcm2835.rs        # Pi 1-3 GPIO (if we add legacy support)

This also happens to match Linux naming conventions (gpio-bcm2711.c, amba-pl011.c), providing familiar patterns as a bonus.

Why Follow Linux Conventions (Not Exact Matching)?

We adopt Linux naming and organization where it improves maintainability, not for strict conformance:

Clear organization principles: Linux conventions solve real problems:

  • Subsystem boundaries (drivers/tty/ vs drivers/net/) prevent mixing concerns
  • Vendor directories (ethernet/broadcom/) naturally scale with multiple vendors
  • Function-based naming (irqchip/, clocksource/) clarifies purpose

Reduced cognitive load: Familiar patterns require less mental mapping:

  • “Where do serial drivers go?” → drivers/tty/serial/ is the obvious answer
  • New contributors don’t waste time debating structure
  • Clear precedent for where new code belongs

Better tooling support: AI agents and experienced developers benefit:

  • LLMs trained on Linux kernel suggest correct file locations
  • Agents understand context from directory structure without explanation
  • Documentation references Linux subsystems for comparison

We will deviate from Linux conventions when:

  • DaedalusOS-specific needs require different structure
  • Rust idioms suggest clearer alternatives
  • Simpler solutions exist for our single-platform scope

Why Not embedded-hal Structure?

embedded-hal is a library for hardware abstraction, not an operating system.

embedded-hal structure:

src/
├── hal/              # Abstract traits
│   ├── gpio.rs
│   └── serial.rs
└── platform/         # Platform implementations
    └── bcm2711/

Why this doesn’t fit:

  • DaedalusOS is building an OS kernel, not a HAL library
  • We need protocol stacks (net/), memory management (mm/), architecture code (arch/)
  • Linux structure proven for OS development over 30+ years

Consequences

Positive

  • Scalability: Clear place for new drivers (second network driver goes in drivers/net/ethernet/vendor/)
  • Familiarity: Experienced OS developers immediately understand structure
  • AI effectiveness: Agents trained on Linux kernel navigate codebase naturally
  • Clear boundaries: Subsystems have obvious separation (mm/, arch/, drivers/)
  • Specific naming: No ambiguity about which hardware a driver supports
  • Industry alignment: Matches conventions of Linux, FreeBSD, Zircon

Negative

  • Migration work: ~50 files touched (imports updated)
  • Deeper paths: use crate::drivers::tty::serial::amba_pl011 vs use crate::drivers::uart
  • More directories: 10+ new directories vs current 3
  • Breaking change: External users (if any) must update imports

Neutral

  • Compilation unchanged: Rust module system handles nesting transparently
  • Performance unchanged: File organization is compile-time only
  • Functionality unchanged: Pure refactoring, no behavior changes

Migration Impact

Files to move: 8 (allocator, exceptions, 6 drivers)
New directories: 10 (mm/, drivers/tty/serial/, drivers/gpio/, etc.)
Import updates: ~30-40 use statements across files
Documentation updates: CLAUDE.md, code reference sections
Estimated time: 1-2 hours (mostly mechanical)

Implementation Plan

Phase 1: Create Directory Structure

mkdir -p src/mm
mkdir -p src/drivers/{tty/serial,gpio,net/ethernet/broadcom,irqchip,clocksource}

Phase 2: Move Files with Git (Preserve History)

# Memory management
git mv src/allocator.rs src/mm/allocator.rs

# Architecture
git mv src/exceptions.rs src/arch/aarch64/exceptions.rs

# Drivers
git mv src/drivers/uart.rs src/drivers/tty/serial/amba_pl011.rs
git mv src/drivers/gpio.rs src/drivers/gpio/bcm2711.rs
git mv src/drivers/genet.rs src/drivers/net/ethernet/broadcom/genet.rs
git mv src/drivers/netdev.rs src/drivers/net/netdev.rs
git mv src/drivers/gic.rs src/drivers/irqchip/gic_v2.rs
git mv src/drivers/timer.rs src/drivers/clocksource/bcm2711.rs

Phase 3: Create mod.rs Files

Each directory needs a mod.rs for module declarations:

src/mm/mod.rs:

//! Memory Management subsystem
//! Corresponds to Linux mm/

pub mod allocator;
pub use allocator::*;

src/drivers/tty/mod.rs:

//! TTY and serial device drivers
//! Corresponds to Linux drivers/tty/

pub mod serial;

src/drivers/tty/serial/mod.rs:

//! Serial (UART) device drivers
//! Corresponds to Linux drivers/tty/serial/

pub mod amba_pl011;
pub use amba_pl011::*;

src/drivers/mod.rs (with backward compatibility):

//! Device drivers subsystem
//! Organized following Linux kernel conventions

pub mod tty;
pub mod gpio;
pub mod net;
pub mod irqchip;
pub mod clocksource;

// Backward compatibility aliases (remove in future breaking change)
pub mod uart {
    //! Deprecated: Use drivers::tty::serial instead
    pub use crate::drivers::tty::serial::*;
}

pub mod gic {
    //! Deprecated: Use drivers::irqchip::gic_v2 instead
    pub use crate::drivers::irqchip::gic_v2::*;
}

pub mod timer {
    //! Deprecated: Use drivers::clocksource instead
    pub use crate::drivers::clocksource::*;
}
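
The aliases above mark deprecation only in doc comments. A minimal sketch of how Rust's standard #[deprecated] attribute could make the compiler warn as well is shown here. The module layout mirrors the one above, but driver_name() is a hypothetical placeholder and not part of the real PL011 driver.

```rust
// Stand-in module tree; only the layout mirrors src/drivers/.
pub mod drivers {
    pub mod tty {
        pub mod serial {
            pub mod amba_pl011 {
                /// Hypothetical placeholder for a real driver item.
                pub fn driver_name() -> &'static str {
                    "amba_pl011"
                }
            }
            pub use amba_pl011::*;
        }
    }

    /// Old path, kept as an alias. The attribute asks rustc to emit a
    /// deprecation warning where `drivers::uart` is named.
    #[deprecated(note = "use drivers::tty::serial instead")]
    pub mod uart {
        pub use super::tty::serial::*;
    }
}

/// New code names the explicit path.
pub fn new_path() -> &'static str {
    drivers::tty::serial::amba_pl011::driver_name()
}

/// Old code keeps compiling through the alias. The lint is silenced
/// here only so this example builds without warnings.
#[allow(deprecated)]
pub fn old_path() -> &'static str {
    drivers::uart::driver_name()
}
```

Both paths resolve to the same item, so removing the alias later only requires deleting the deprecated module and fixing any call sites the warning has flagged.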

Phase 4: Update Imports

Automated with search/replace:

// Old imports
use crate::drivers::uart;
use crate::drivers::gic;
use crate::allocator;
use crate::exceptions;

// New imports (backward compatible via aliases)
use crate::drivers::uart;  // Still works via alias
use crate::drivers::gic;   // Still works via alias
use crate::mm::allocator;
use crate::arch::aarch64::exceptions;

Or use new paths explicitly:

use crate::drivers::tty::serial::amba_pl011;
use crate::drivers::irqchip::gic_v2;

Phase 5: Update Documentation

  • CLAUDE.md: Update file paths in “Code Organization” section
  • docs/src/hardware/*.md: Update code reference paths
  • docs/src/architecture/*.md: Update module paths
  • README.md: Update getting started examples (if any)

Phase 6: Remove Backward Compatibility (Future)

In the next breaking version (v0.2.0 or v1.0.0):

  • Remove alias modules from drivers/mod.rs
  • Force all code to use new paths
  • Update CLAUDE.md to remove old path references

Testing

Verification after migration:

./.githooks/pre-commit  # Must pass:
                        # - cargo fmt --check
                        # - cargo clippy
                        # - cargo doc
                        # - cargo test
                        # - cargo build --release

No functional changes: All 66 tests must pass identically.

Backward Compatibility

Public API impact: Low

  • DaedalusOS is not a published library (no external consumers)
  • Breaking change acceptable for v0.x versions

Alias strategy: Keep old paths working during transition:

// Old code continues working
use crate::drivers::uart::WRITER;  // Via alias

// New code uses explicit paths
use crate::drivers::tty::serial::amba_pl011::WRITER;

Deprecation timeline:

  • v0.2.0: Add aliases, warn about deprecation in docs
  • v0.3.0: Remove aliases, require new paths
  • v1.0.0: Final structure solidified

Current State

  • Status: Accepted and implemented
  • Implementation: Complete (all files moved, mod.rs created, backward compatibility added)
  • Testing: All 66 tests passing, build successful

Questions for Review

  1. Nesting depth: Is drivers/net/ethernet/broadcom/genet.rs too deep?

    • Alternative: drivers/net/genet.rs (one level)
    • Recommendation: Keep deep for future vendor expansion
  2. Backward compatibility: Keep aliases indefinitely or remove in v0.2.0?

    • Recommendation: Remove in v0.2.0 (clean break while still v0.x)
  3. Timer naming: bcm2711.rs (Pi 4-specific) or bcm2835.rs (Linux naming)?

    • Linux uses bcm2835_timer.c for backward compat even on Pi 4
    • Recommendation: bcm2711.rs (accurate for our Pi 4-only scope per ADR-001)
  4. arch/aarch64/exceptions.rs: Keep or create arch/aarch64/kernel/ subdir?

    • Linux has arch/arm64/kernel/traps.c, arch/arm64/kernel/entry.S
    • Recommendation: Keep flat for now, add kernel/ if more arch files appear

ADR-005: Multi-Board Support Strategy

Status: Accepted
Date: 2025-01-11
Decision: Use a hybrid runtime hardware detection pattern to support multiple Raspberry Pi boards (Pi 4, Pi 5) with a single kernel binary.

Context

DaedalusOS currently targets only Raspberry Pi 4 (BCM2711 SoC) as documented in ADR-001. However, Raspberry Pi 5 introduces significant architectural changes that require planning now:

Pi 5 Architectural Changes (BCM2712 + RP1)

Pi 5 moves to a disaggregated architecture with most I/O offloaded to a separate RP1 I/O controller chip (Raspberry Pi’s first custom silicon), connected via PCIe Gen 2.0:

| Peripheral | Pi 4 (BCM2711) | Pi 5 (BCM2712 + RP1) |
|---|---|---|
| Ethernet | GENET v5 (native) | RP1-Ethernet (PCIe-attached) |
| USB | DWC2 (native) | RP1-USB (PCIe-attached) |
| GPIO | BCM2711 registers | RP1 registers (PCIe-attached) |
| UART | PL011 (native) | RP1-UART (PCIe-attached) |
| I2C/SPI | BCM2711 | RP1 |

Impact: Almost all drivers will need board-specific implementations. The question is: how do we support both boards without massive rewrites?

Timing Considerations

  • Current: Pi 4 only, QEMU 9.0+ supports raspi4b machine
  • Near Future: QEMU will add BCM2712/RP1 emulation (likely 2025)
  • User Context: Developer has both Pi 4 and Pi 5 hardware, wants to support both

The Problem

How do we architect driver support to:

  1. Continue Pi 4 development without impediment
  2. Add Pi 5 support cleanly when QEMU support arrives
  3. Avoid major refactoring/rewrites when transitioning
  4. Potentially support both boards with a single binary (convenience for testing/deployment)

Decision

Use hybrid runtime hardware detection pattern (inspired by Linux driver probing):

  1. Driver Pattern: All drivers implement is_present() hardware detection
  2. Trait Abstraction: Multi-implementation categories use traits (NetworkDevice, future UsbHost)
  3. Runtime Selection: At boot, probe for hardware and instantiate correct driver
  4. Single Binary: One kernel image auto-detects board and initializes appropriate drivers

Current Phase: Document pattern now, implement multi-board support later (when QEMU adds Pi 5)

Implementation: Drivers follow the pattern starting now, enabling seamless Pi 5 addition without refactoring existing code.

Rationale

Why Hybrid Approach?

“Hybrid” means:

  • Now: Single target (Pi 4), pattern documented but not exercised for multi-board
  • Later: Same pattern enables runtime detection with zero driver changes
  • Best of both worlds: No premature complexity, future-proof architecture

Key advantages:

  1. Zero overhead now: Pattern doesn’t complicate Pi 4-only development
  2. Additive Pi 5 support: Add new driver files, no refactoring of working code
  3. Single binary convenience: One image for both boards (testing/deployment)
  4. Linux-like familiarity: Driver probing pattern is well-understood
  5. Already partially implemented: NetworkDevice trait + GENET’s is_present() already follow this

Alternatives Considered

Alternative 1: Compile-Time Board Selection

Use Cargo features to select target board at compile time:

// Exactly one of the two features is expected to be enabled per build
#[cfg(feature = "pi4")]
use crate::drivers::net::ethernet::broadcom::genet::GenetController as NetDevice;

#[cfg(feature = "pi5")]
use crate::drivers::net::ethernet::broadcom::rp1_enet::Rp1Ethernet as NetDevice;

Build separate binaries:

cargo build --features pi4    # Pi 4 kernel
cargo build --features pi5    # Pi 5 kernel

Pros:

  • Simple, zero runtime overhead
  • Smaller binaries (only one set of drivers compiled in)
  • Clear separation of concerns

Cons:

  • Need separate kernel images for each board
  • Can’t auto-detect board at boot (user must know which image to use)
  • More build/release complexity (maintain two images)
  • Testing requires rebuilding between boards

Why rejected: Inconvenient for users with multiple boards, since it requires manual image selection. The runtime cost of detection is a one-time probe at boot, which is negligible on this hardware.

Alternative 2: Pure Runtime Detection with Dynamic Dispatch

Always use trait objects with runtime dispatch:

// All drivers behind trait objects.
// `+ Send` is required so the Mutex is Sync and can live in a static.
static NETWORK: Mutex<Option<Box<dyn NetworkDevice + Send>>> = Mutex::new(None);

fn init() {
    // Always probe all drivers; the first one that initializes wins
    if let Some(genet) = try_init_genet() {
        NETWORK.lock().replace(genet);
    } else if let Some(rp1) = try_init_rp1() {
        NETWORK.lock().replace(rp1);
    }
}

Pros:

  • Maximum flexibility
  • Clean abstraction boundaries
  • Easy to add new boards

Cons:

  • Overhead of dynamic dispatch (negligible in practice)
  • All driver code compiled in (larger binary)
  • More complex initialization infrastructure

Why rejected: Over-engineered for current needs. Hybrid approach gives same flexibility with simpler implementation.

Alternative 3: Device Tree-Driven (Linux Kernel Style)

Parse device tree at boot to discover hardware:

// Read device tree to find compatible devices
for node in devicetree.nodes() {
    // Broadcom nodes use the "brcm" vendor prefix in device trees
    if node.compatible("brcm,bcm2711-genet-v5") {
        register_driver(GenetDriver);
    } else if node.compatible("raspberrypi,rp1-eth") {
        register_driver(Rp1Driver);
    }
}

Pros:

  • Very flexible, supports unknown future boards
  • Standard approach (used by Linux)
  • External configuration (no recompile for new boards)

Cons:

  • Need device tree parser (complex, ~5000+ lines in Linux)
  • Need driver registration infrastructure
  • Overkill for 2-board support
  • Firmware must provide correct device tree

Why rejected: Too much infrastructure for minimal benefit. We control both supported boards, don’t need external configuration.

Consequences

Positive

  • Future-proof: Pi 5 support is additive (new files), not refactoring
  • Single binary option: One kernel for both boards (convenience)
  • Existing pattern: NetworkDevice trait already implements this approach
  • Clear guidelines: Documented pattern prevents inconsistent implementations
  • Testable: Can test Pi 4/Pi 5 code paths in same binary (future)
  • Familiar: Linux-like driver probing pattern

Negative

  • Larger binary: Both driver sets compiled in (vs compile-time selection)
    • Mitigation: Binary size is not a practical constraint on this target; the Pi 4 ships with 1-8 GB of RAM
  • Runtime probe overhead: Checking hardware at boot (~milliseconds)
    • Mitigation: One-time cost, negligible compared to boot time
  • Pattern requirements: All drivers must follow pattern (documented in CLAUDE.md)
    • Mitigation: Pattern is simple (is_present() + trait), already partially implemented

Neutral

  • No immediate changes: Pattern documented, not yet exercised for multi-board
  • Deferred implementation: Multi-board support waits for QEMU Pi 5 support
  • Some drivers don’t need traits: Single-implementation categories (timers, GPIO) use chip-specific naming instead

Implementation Requirements

Driver Pattern (Documented in CLAUDE.md)

All drivers must implement:

  1. Hardware Detection:

impl MyDriver {
    pub fn is_present(&self) -> bool {
        // Read version/ID register to detect hardware
        let version = self.read_reg(VERSION_REG);
        version == EXPECTED_VERSION
    }
}

  2. Trait-Based Interfaces (for multi-implementation categories):

pub trait NetworkDevice {
    fn is_present(&self) -> bool;
    fn init(&mut self) -> Result<(), Error>;
    // ...
}

  3. Self-Contained Initialization:

impl MyDriver {
    pub fn new() -> Self { /* ... */ }
    pub fn init(&mut self) -> Result<(), Error> {
        if !self.is_present() {
            return Err(Error::HardwareNotPresent);
        }
        // Initialize hardware
        Ok(())
    }
}

Directory Structure Rules

Use deep structure for categories with cross-vendor diversity:

drivers/net/ethernet/broadcom/  # Multiple ethernet vendors
drivers/net/wireless/            # Multiple WiFi vendors
drivers/usb/host/                # Multiple USB controllers

Use flat structure for single-vendor version changes:

drivers/gpio/bcm2711.rs         # Pi 4
drivers/gpio/rp1.rs             # Pi 5

Future Runtime Detection (When Pi 5 Support Added)

// Detect network device at boot
let mut network_device: Box<dyn NetworkDevice> = {
    let genet = GenetController::new();
    if genet.is_present() {
        Box::new(genet)  // Pi 4
    } else {
        let rp1 = Rp1Ethernet::new();
        if rp1.is_present() {
            Box::new(rp1)  // Pi 5
        } else {
            panic!("No network hardware detected")
        }
    }
};

network_device.init()?;
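
The nested if/else above grows with every board added. One possible variation, sketched here against mock devices, walks an ordered list of candidates and returns an error instead of panicking. Everything in it is an assumption for illustration: MockNic, the name() method, and the ID values are hypothetical, and the real NetworkDevice trait in src/drivers/net/netdev.rs may differ.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    HardwareNotPresent,
}

/// Trimmed-down stand-in for the NetworkDevice trait. `name` is a
/// hypothetical addition used only to make the result observable.
pub trait NetworkDevice {
    fn is_present(&self) -> bool;
    fn init(&mut self) -> Result<(), Error>;
    fn name(&self) -> &'static str;
}

/// Mock device. `id_register` simulates the value a real driver would
/// read from a version/ID register; the values are made up.
pub struct MockNic {
    pub name: &'static str,
    pub id_register: u32,
    pub expected_id: u32,
    pub initialized: bool,
}

impl MockNic {
    pub fn new(name: &'static str, id_register: u32, expected_id: u32) -> Self {
        Self { name, id_register, expected_id, initialized: false }
    }
}

impl NetworkDevice for MockNic {
    fn is_present(&self) -> bool {
        self.id_register == self.expected_id
    }

    fn init(&mut self) -> Result<(), Error> {
        if !self.is_present() {
            return Err(Error::HardwareNotPresent);
        }
        self.initialized = true;
        Ok(())
    }

    fn name(&self) -> &'static str {
        self.name
    }
}

/// Try each candidate in order. The first one whose hardware check
/// passes is initialized and returned; an init failure is propagated.
pub fn probe(
    candidates: Vec<Box<dyn NetworkDevice>>,
) -> Result<Box<dyn NetworkDevice>, Error> {
    for mut dev in candidates {
        if dev.is_present() {
            dev.init()?;
            return Ok(dev);
        }
    }
    Err(Error::HardwareNotPresent)
}
```

With this shape, adding a board means appending one candidate to the list, and the caller decides whether a missing network device is fatal.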

Current State

  • Pattern documented: CLAUDE.md contains driver implementation guidelines
  • Partially implemented: NetworkDevice trait + GENET is_present() already follow pattern
  • Pi 5 implementation: Waiting for QEMU BCM2712/RP1 emulation support
  • Multi-board runtime detection: Not yet implemented (only Pi 4 currently supported)

References

Implementation

  • Pattern Documentation: CLAUDE.md - “Multi-Board Support Strategy” section
  • Current Implementation: src/drivers/net/netdev.rs - NetworkDevice trait
  • Example: src/drivers/net/ethernet/broadcom/genet.rs - GENET driver with is_present()

Roadmap

DaedalusOS development phases and milestones.

Project Goals

  • Primary: Learning project for OS internals and bare-metal ARM programming
  • Target: Raspberry Pi 4 exclusively (see ADR-001)
  • End Vision: Network-enabled device for remote GPIO control via HTTP
  • Development: Incremental milestones, each delivers working feature
  • Learning Focus: Hardware/driver layer (implement from scratch), protocols/algorithms (use existing no_std crates)

Current Status

Phase 4 In Progress 🔄 - Networking Stack
Milestone #13 Complete ✅ - Frame Transmission & Reception

  • Working REPL with command parsing and shell history
  • Exception vector table with register dumps
  • 8 MB heap with bump allocator
  • Full alloc crate support (Box, Vec, String, collections)
  • System timer driver with microsecond precision delays
  • GIC-400 interrupt controller with interrupt-driven UART
  • MMU with 39-bit virtual address space (identity mapped)
  • Caching enabled for performance
  • GPIO driver with BCM2711 pull-up/down support
  • Shell commands for GPIO pin control (mode, pull, set, get, toggle)
  • GENET Ethernet controller with full TX/RX capability
  • VideoCore mailbox driver for querying firmware properties
  • MAC address retrieved from OTP (One-Time Programmable memory)
  • Ethernet and ARP protocol structures with 30 unit tests
  • Shell commands: eth-diag (diagnostics), arp-probe (TX/RX test)

Next: Milestone #14 - Interrupt-driven networking

Phase 1: Interactive Shell ✅ COMPLETE

Goal: Usable REPL running in QEMU

Completed Milestones:

  1. Boot & Console - Assembly entry, UART TX
  2. Testing Infrastructure - Custom test framework with QEMU
  3. UART Input - Polling RX, line editing (backspace, Ctrl-U, Ctrl-C)
  4. Command Parser - Line buffering, argument splitting
  5. Shell Loop - REPL with prompt, built-in commands (help, echo, clear, version, meminfo)
  6. Exception Vectors - 16-entry table, context save/restore, ESR/FAR decoding
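
To illustrate the ESR decoding mentioned in milestone 6, here is a minimal sketch of splitting an ESR_ELx value into its exception class (EC, bits 31:26) and instruction-specific syndrome (ISS, bits 24:0). The enum and function names are hypothetical and cover only a handful of the architecturally defined classes; the kernel's actual handler may decode more.

```rust
/// A small subset of the exception classes encoded in ESR_ELx.EC.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExceptionClass {
    Svc64,                   // EC 0x15: SVC executed in AArch64 state
    InstructionAbortLowerEl, // EC 0x20
    InstructionAbortSameEl,  // EC 0x21
    DataAbortLowerEl,        // EC 0x24
    DataAbortSameEl,         // EC 0x25
    Brk64,                   // EC 0x3C: BRK executed in AArch64 state
    Other(u8),               // Anything this sketch does not name
}

/// Extract EC from bits 31:26 and map it to a named class.
pub fn exception_class(esr: u64) -> ExceptionClass {
    let ec = ((esr >> 26) & 0x3F) as u8;
    match ec {
        0x15 => ExceptionClass::Svc64,
        0x20 => ExceptionClass::InstructionAbortLowerEl,
        0x21 => ExceptionClass::InstructionAbortSameEl,
        0x24 => ExceptionClass::DataAbortLowerEl,
        0x25 => ExceptionClass::DataAbortSameEl,
        0x3C => ExceptionClass::Brk64,
        other => ExceptionClass::Other(other),
    }
}

/// Extract the ISS field (bits 24:0). Its meaning depends on the class;
/// for SVC the low 16 bits hold the immediate.
pub fn iss(esr: u64) -> u32 {
    (esr & 0x01FF_FFFF) as u32
}
```

For data and instruction aborts, the faulting address comes from FAR_ELx rather than from ESR, so a register dump normally prints both.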

Current Features:

  • Interactive shell prompt (daedalus>)
  • Commands: help, echo, clear, version, meminfo, exception
  • Line editing: backspace, Ctrl-U (clear line), Ctrl-C (cancel)
  • Full exception handling with register dumps

Phase 2: Memory & Interrupts ✅ COMPLETE

Goal: Dynamic allocation and interrupt-driven I/O

Milestone #7: Heap Allocator ✅ COMPLETE

  • ✅ Integrated Rust alloc crate
  • ✅ Simple bump allocator for shell history
  • ✅ Enabled String, Vec, collections
  • ✅ 8 MB heap region with proper alignment
  • ✅ Memory tracking (heap_size, used, free)
  • ✅ 6 allocator tests (Box, Vec, String, capacity, stats)
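
As a rough illustration of the bump strategy and the tracked statistics, the sketch below works on plain addresses so it can run on a host. It is not the kernel's allocator: the type and method names are assumptions, and a real implementation would sit behind the GlobalAlloc trait with appropriate locking.

```rust
/// Address-only model of a bump allocator: memory is handed out by
/// advancing `next` and is never returned.
pub struct BumpAllocator {
    heap_start: usize,
    next: usize,
    heap_end: usize,
}

impl BumpAllocator {
    pub const fn new(heap_start: usize, heap_size: usize) -> Self {
        Self {
            heap_start,
            next: heap_start,
            heap_end: heap_start + heap_size,
        }
    }

    /// Round `next` up to `align` (a power of two), then reserve `size`
    /// bytes. Returns the start address, or None if the heap is full.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.heap_end {
            return None;
        }
        self.next = end;
        Some(start)
    }

    /// Bytes consumed so far, including alignment padding.
    pub fn used(&self) -> usize {
        self.next - self.heap_start
    }

    /// Bytes still available.
    pub fn free(&self) -> usize {
        self.heap_end - self.next
    }
}
```

Because nothing is ever freed, `used` only grows; that limitation is what Milestone #19 is meant to address.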

Milestone #8: System Timer ✅ COMPLETE

  • ✅ Configured BCM2711 system timer (base 0xFE003000)
  • ✅ Implemented delay functions (delay_us, delay_ms)
  • ✅ Added timestamp and uptime tracking functions
  • ✅ Added uptime shell command
  • ✅ 6 timer tests (counter, delays, monotonicity)
  • ✅ Comprehensive hardware documentation
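
The system timer exposes its free-running counter as two 32-bit halves, so a 64-bit read has to guard against the low half wrapping between the two accesses. The sketch below shows one common way to do that, plus a busy-wait delay built on top. Register access is passed in as closures so the example runs on a host; the function signatures are assumptions and not the driver's actual API.

```rust
/// Read a 64-bit counter exposed as two 32-bit halves. The high half is
/// read before and after the low half; if it changed, the low half
/// wrapped in between and the read is retried.
pub fn read_counter(
    mut read_hi: impl FnMut() -> u32,
    mut read_lo: impl FnMut() -> u32,
) -> u64 {
    loop {
        let hi = read_hi();
        let lo = read_lo();
        if read_hi() == hi {
            return ((hi as u64) << 32) | lo as u64;
        }
    }
}

/// Busy-wait until at least `us` ticks have elapsed, assuming the
/// counter advances once per microsecond. `wrapping_sub` keeps the
/// comparison correct even if the counter itself wraps.
pub fn delay_us(mut now: impl FnMut() -> u64, us: u64) {
    let start = now();
    while now().wrapping_sub(start) < us {
        core::hint::spin_loop();
    }
}
```

A delay_ms helper would follow by multiplying the argument by 1000 before calling delay_us.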

Milestone #9: GIC-400 Setup ✅ COMPLETE

  • ✅ Initialize interrupt controller
  • ✅ Configure UART interrupts
  • ✅ Interrupt-driven I/O (replaced polling)

Milestone #10: MMU & Paging ✅ COMPLETE

  • ✅ 3-level translation tables with 2 MB block mappings
  • ✅ Identity map kernel (1 GB normal memory)
  • ✅ Identity map MMIO (device memory, non-cacheable)
  • ✅ 39-bit virtual address space (512 GB)
  • ✅ Memory attributes (cacheable normal, device-nGnRnE)
  • ✅ Shell command (mmu) for debugging MMU status
  • ✅ Comprehensive documentation
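
To make the address arithmetic concrete, the sketch below splits a virtual address the way a 4 KB-granule, 39-bit configuration with 2 MB blocks would: bits 38:30 select the level 1 entry, bits 29:21 the level 2 entry, and bits 20:0 are the offset inside the block. The 4 KB granule and the helper names are assumptions made for this example.

```rust
pub const VA_BITS: u32 = 39;
pub const BLOCK_SIZE: u64 = 2 * 1024 * 1024; // 2 MB block mapped at level 2
pub const ENTRIES_PER_TABLE: u64 = 512;      // 9 index bits per level

/// Index into the level 1 table (each entry covers 1 GB).
pub fn l1_index(va: u64) -> usize {
    ((va >> 30) & (ENTRIES_PER_TABLE - 1)) as usize
}

/// Index into the level 2 table (each entry covers 2 MB).
pub fn l2_index(va: u64) -> usize {
    ((va >> 21) & (ENTRIES_PER_TABLE - 1)) as usize
}

/// Offset of the address inside its 2 MB block.
pub fn block_offset(va: u64) -> u64 {
    va & (BLOCK_SIZE - 1)
}

/// With identity mapping, the output block address is simply the
/// virtual address rounded down to a block boundary.
pub fn identity_block_base(va: u64) -> u64 {
    va & !(BLOCK_SIZE - 1)
}
```

For example, the MMIO address 0xFE20_1000 falls in level 1 entry 3, level 2 entry 497, at offset 0x1000 within its block.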

Phase 3: Hardware I/O 🔄 IN PROGRESS

Goal: Foundation for real-world device control

Milestone #11: GPIO Driver ✅ COMPLETE

  • ✅ Pin configuration (input/output, alt functions 0-5)
  • ✅ BCM2711 pull-up/down resistor control (new register mechanism)
  • ✅ Digital I/O (read/write/toggle GPIO pins)
  • ✅ Shell commands: gpio-mode, gpio-pull, gpio-set, gpio-get, gpio-toggle
  • ✅ Support for all 58 GPIO pins (BCM2711)
  • ✅ Comprehensive hardware documentation
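
The register arithmetic behind pin configuration can be sketched as follows. Function select uses 3 bits per pin with 10 pins per GPFSEL register, while the BCM2711 pull control uses 2 bits per pin with 16 pins per GPIO_PUP_PDN_CNTRL register starting at offset 0xE4. The helper names are hypothetical and the offsets are relative to the GPIO base address; the real driver may structure this differently.

```rust
pub const GPFSEL0: usize = 0x00;
pub const GPIO_PUP_PDN_CNTRL_REG0: usize = 0xE4;

/// Pull resistor encodings used by the BCM2711 pull control registers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pull {
    None = 0b00,
    Up = 0b01,
    Down = 0b10,
}

/// (register offset, bit shift) of the 3-bit function field for `pin`.
pub fn fsel_field(pin: u32) -> (usize, u32) {
    (GPFSEL0 + 4 * (pin / 10) as usize, 3 * (pin % 10))
}

/// (register offset, bit shift) of the 2-bit pull field for `pin`.
pub fn pull_field(pin: u32) -> (usize, u32) {
    (GPIO_PUP_PDN_CNTRL_REG0 + 4 * (pin / 16) as usize, 2 * (pin % 16))
}

/// Read-modify-write helper: replace a `width`-bit field at `shift`
/// and leave every other bit of `reg` untouched.
pub fn set_field(reg: u32, shift: u32, width: u32, value: u32) -> u32 {
    let mask = ((1u32 << width) - 1) << shift;
    (reg & !mask) | ((value << shift) & mask)
}
```

A driver would read the register at the returned offset, pass the value through set_field, and write the result back, so neighbouring pins in the same register keep their configuration.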

Phase 4: Networking Stack 🔄 IN PROGRESS

Goal: Network-enabled device (the primary objective)

Milestone #12: Ethernet Driver Foundation ✅ COMPLETE

  • ✅ GENET v5 hardware detection and register access
  • ✅ MDIO protocol implementation (PHY communication)
  • ✅ BCM54213PE PHY detection and identification
  • ✅ Ethernet frame structures and parsing
  • ✅ ARP packet structures and parsing
  • ✅ Network byte order handling
  • ✅ 30 protocol unit tests passing
  • ✅ Comprehensive documentation (hardware, protocols, verification)
  • ✅ Shell command: eth-diag (hardware diagnostics)
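
As an illustration of frame parsing and network byte order handling, the sketch below reads the 14-byte Ethernet II header from a raw buffer. Multi-byte fields on the wire are big-endian, so the EtherType is assembled with u16::from_be_bytes. The struct and function names are assumptions for this example and may not match the kernel's own protocol types.

```rust
pub const HEADER_LEN: usize = 14;
pub const ETHERTYPE_ARP: u16 = 0x0806;

#[derive(Debug, PartialEq, Eq)]
pub struct EthernetHeader {
    pub dst: [u8; 6],
    pub src: [u8; 6],
    pub ethertype: u16,
}

/// Split a raw frame into its header and payload. Returns None when
/// the buffer is too short to contain a full header.
pub fn parse_header(frame: &[u8]) -> Option<(EthernetHeader, &[u8])> {
    if frame.len() < HEADER_LEN {
        return None;
    }
    let mut dst = [0u8; 6];
    let mut src = [0u8; 6];
    dst.copy_from_slice(&frame[0..6]);
    src.copy_from_slice(&frame[6..12]);
    // EtherType is transmitted most significant byte first.
    let ethertype = u16::from_be_bytes([frame[12], frame[13]]);
    Some((EthernetHeader { dst, src, ethertype }, &frame[HEADER_LEN..]))
}
```

The returned payload slice borrows from the input buffer, so no copy or heap allocation is needed before handing the payload to an ARP parser.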

Milestone #13: Frame Transmission & Reception ✅ COMPLETE

  • ✅ Frame TX implementation (polling mode with DMA descriptors)
  • ✅ Frame RX implementation (polling with ring buffers)
  • ✅ VideoCore mailbox driver for firmware communication
  • ✅ MAC address queried from OTP via mailbox (real hardware MAC)
  • ✅ Bus address translation (ARM physical → VideoCore bus)
  • ✅ Cache-line aligned message buffers (64-byte alignment)
  • ✅ Frame validation and error handling
  • ✅ Shell command: arp-probe (comprehensive TX/RX diagnostics)
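
The mailbox-related items above can be sketched as follows: a 64-byte aligned buffer, a property message requesting the board MAC address (tag 0x00010003 on channel 8), and the word written to the mailbox, which combines the buffer address with the channel number in the low 4 bits. The type and function names are hypothetical, and the address passed to mailbox_word is assumed to have already gone through whatever ARM-to-bus translation the platform needs.

```rust
pub const MBOX_CHANNEL_PROPERTY: u32 = 8;
pub const TAG_GET_BOARD_MAC: u32 = 0x0001_0003;

/// Message buffer occupying exactly one 64-byte cache line, which also
/// satisfies the mailbox's 16-byte alignment requirement.
#[repr(C, align(64))]
pub struct MailboxBuffer(pub [u32; 16]);

/// Build a property request asking the firmware for the board MAC.
pub fn build_get_mac_request() -> MailboxBuffer {
    let mut buf = MailboxBuffer([0; 16]);
    buf.0[0] = 8 * 4;             // total message size in bytes
    buf.0[1] = 0;                 // request code
    buf.0[2] = TAG_GET_BOARD_MAC; // tag identifier
    buf.0[3] = 8;                 // value buffer size (6 MAC bytes, padded)
    buf.0[4] = 0;                 // tag request indicator
    buf.0[5] = 0;                 // value buffer, filled in by firmware
    buf.0[6] = 0;
    buf.0[7] = 0;                 // end tag
    buf
}

/// Combine a 16-byte aligned address with a channel number. Returns
/// None if the address would collide with the channel bits.
pub fn mailbox_word(bus_addr: u32, channel: u32) -> Option<u32> {
    if bus_addr & 0xF != 0 || channel > 0xF {
        return None;
    }
    Some(bus_addr | channel)
}

/// Extract the MAC from a completed response. The bytes are taken in
/// memory order, which matches to_le_bytes on a little-endian CPU.
pub fn mac_from_response(buf: &MailboxBuffer) -> [u8; 6] {
    let lo = buf.0[5].to_le_bytes();
    let hi = buf.0[6].to_le_bytes();
    [lo[0], lo[1], lo[2], lo[3], hi[0], hi[1]]
}
```

Keeping the message within a single cache line means one clean before the request and one invalidate before reading the response cover the whole buffer.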

Milestone #14: Interrupt-Driven Networking

  • Register GENET interrupts with GIC
  • RX interrupt handler
  • TX completion handling
  • Frame queuing for processing

Milestone #15: ARP Responder

  • ARP cache implementation with expiration
  • ARP request/reply handling
  • Respond to ARP requests for our IP
  • Shell command: arp-cache

Milestone #16: TCP/IP Stack Integration (smoltcp)

  • Integrate smoltcp no_std TCP/IP stack
  • Implement Device trait (maps to GENET driver)
  • IPv4 packet handling
  • ICMP echo (ping support)
  • DHCP client for IP configuration
  • UDP/TCP socket support

Milestone #17: Application Protocols

  • DNS resolver (A records)
  • HTTP/1.1 client (GET/POST)
  • Simple HTTP server for device control
  • Shell commands: ping, http-get, gpio-server

Phase 5: Advanced Features (Future Self-Implementation)

Goal: Optimizations and advanced capabilities

Milestone #18: DMA Controller

  • DMA channel setup
  • Optimize Ethernet for DMA-based transfers
  • Improve SD card performance (when implemented)

Milestone #19: Better Allocator

  • Replace bump allocator with buddy or slab allocator
  • Free/reallocation support
  • Fragmentation management

Milestone #20: Multi-Core Support

  • Wake secondary cores (cores 1-3)
  • Spinlocks and synchronization primitives
  • Per-core data structures

Milestone #21: Cooperative Scheduler

  • Task switching for async I/O
  • Event-driven network processing
  • Timer-based task scheduling

Phase 6: Storage & Persistence (Optional)

Goal: Persistent storage and filesystems

Milestone #22: SD Card Driver

  • EMMC controller initialization
  • Block read/write operations
  • Interrupt-driven I/O

Milestone #23: FAT32 Filesystem

  • Parse FAT32 structures
  • File operations (open, read, write, close)
  • Directory traversal

Phase 7: Advanced Hardware (Optional)

Goal: Additional peripherals and buses

Milestone #24: I2C/SPI Drivers

  • Bus initialization
  • Multi-device support
  • Sensor integration

Milestone #25: USB Host Controller

  • xHCI/EHCI initialization
  • USB device enumeration
  • Keyboard/mouse/storage support

Phase 8: Userspace (Optional)

Goal: Process isolation and privilege separation

Milestone #26: EL0 Userspace

  • Drop to EL0 for user programs
  • System call interface (SVC handler)
  • User/kernel memory isolation

Milestone #27: Process Management

  • Process creation/termination
  • Basic IPC mechanisms
  • Resource limits and scheduling

Development Practices

Each milestone must:

  1. Pass pre-commit script with no errors or warnings (./.githooks/pre-commit)
    • This verifies: formatting, clippy, documentation, tests, and build
  2. Run in QEMU (cargo run) for interactive verification
  3. Update documentation (code docs, milestone summary, and relevant guides)

Documentation Requirements

After each milestone, update:

  • README.md - Quick start, expected output
  • Roadmap (this file) - Mark milestone complete
  • Hardware docs - New peripherals
  • Architecture docs - New features

API Reference