Linear address space: ordered set of contiguous non-negative integer addresses
Virtual address space: set of N = 2^n virtual addresses
Physical address space: set of M = 2^m physical addresses
Purpose of Virtual Memory
Uses main memory efficiently by treating DRAM as a cache for parts of the virtual address space stored on disk
Simplifies memory management by providing each process with the same linear address space abstraction
Isolates one process's memory from another's (and from the kernel's) to provide security
Cache Mapping
cached blocks are called pages
DRAM block (page) size: 4 KB (the minimum mapping granularity in the page table)
miss handling is implemented in software by the OS (unlike SRAM caches, which are handled purely in hardware)
page table: a software mapping function from virtual page to physical page
page hit: mapped to physical memory (valid bit = 1)
page fault: not mapped to physical memory (valid bit = 0; the address field is either null for an unallocated page or points to the page on swap)
allocation of new space: only the page table is written (a new entry is linked to the virtual address); nothing is copied into DRAM yet
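A minimal sketch of the single-level lookup described above, assuming a hypothetical PTE layout (just a valid bit and a PPN; real PTE formats have more fields):

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096   /* 4 KB pages, as above        */
#define NUM_PAGES 1024   /* illustrative number of PTEs */

/* Hypothetical page-table entry: valid bit + physical page number. */
typedef struct {
    bool     valid;  /* 1: page resident in DRAM; 0: on swap or unallocated */
    uint64_t ppn;    /* physical page number (meaningful only if valid)     */
} pte_t;

static pte_t page_table[NUM_PAGES];

/* Translate a virtual address. Returns 0 and fills *pa on a page hit;
 * returns -1 on a page fault (the OS would then bring the page in). */
int translate(uint64_t va, uint64_t *pa) {
    uint64_t vpn = va / PAGE_SIZE;   /* virtual page number */
    uint64_t vpo = va % PAGE_SIZE;   /* virtual page offset */
    if (vpn >= NUM_PAGES || !page_table[vpn].valid)
        return -1;                   /* page fault          */
    *pa = page_table[vpn].ppn * PAGE_SIZE + vpo;
    return 0;
}
```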
Page Fault
Key Ideas:
each process has its own virtual address space
mapping is fully associative (any virtual page can be placed in any physical page)
mapping lets processes share library code (different virtual pages map to the same physical page)
mapping simplifies linking: the compiler/linker can always place a given region of code or data at the same virtual address
mapping simplifies loading: the loader only edits the page table; memory is read and written only on demand (lazily)
notice the system reserves the low addresses, which are unsafe to access (so dereferencing them faults)
notice the system reserves the high addresses for kernel mappings
N = 2^n: number of addresses in the virtual address space
M = 2^m: number of addresses in the physical address space
P = 2^p: page size (bytes)
Virtual Address (VA)
VPO: virtual page offset (same as PPO)
VPN: virtual page number (what gets translated from)
Physical Address (PA)
PPO: physical page offset (same as VPO)
PPN: physical page number (what gets translated to)
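To make the field widths explicit (these follow directly from the N, M, P definitions above): \text{VPN} = n - p bits and \text{VPO} = p bits (the low-order bits of the VA); \text{PPN} = m - p bits and \text{PPO} = p bits (the low-order bits of the PA).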
%cr3: control register that stores the physical address of the page table
Successful Address Translation
the MMU locates the PTE using the %cr3 register (page-table base) and the virtual address, then combines the PPN with the VPO
Page Fault Address Translation
TLB (Translation Lookaside Buffer): stores a small number of page table entries in a fast cache inside the MMU
TLB Hit: the PTE is returned directly from the TLB, so no extra memory access is needed for the translation
TLB Miss: the MMU fetches the PTE from the page table in memory and installs it in the TLB
Because the page size is 4 KB and the TLB maps whole pages (only the VPN, not the VPO), a single entry covers a large region of memory, so TLB misses are rare.
Problem: with a 4 KB page size, a 48-bit address space, and 8-byte PTEs, a flat page table would need 512 GB, and most of it would map unallocated space (< 0.1% of the address space is typically in use).
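The arithmetic behind the 512 GB figure: \frac{2^{48} \text{ bytes of VA space}}{2^{12} \text{ bytes per page}} = 2^{36} \text{ PTEs}, and 2^{36} \times 2^{3} \text{ bytes per PTE} = 2^{39} \text{ bytes} = 512 \text{ GB}.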
Two-Level Page Table:
Some interesting facts
TLB: only stores the final translation (VPN → PPN), not the intermediate page-table levels
Linux: a higher-level page-table entry can map physical memory directly (a huge page), allowing a 2 MB chunk of consecutive memory to be covered by a single entry
Some portions of the page tables are shared across processes (e.g., the kernel mappings)
Address Example: 14-bit virtual address, 12-bit physical address, 64-byte page size (64 entries)
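Working out the field widths for this example: \text{VPO} = \text{PPO} = \log_2(64) = 6 bits, \text{VPN} = 14 - 6 = 8 bits, \text{PPN} = 12 - 6 = 6 bits.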
Address Example: 9+9+9+9+12 = 48-bit virtual address, 40+12 = 52-bit physical address, 4 KB page size (each page table is 2^12 bytes with an 8-byte PTE per entry, so 2^9 entries per table)
VPO = PPO = \log_2(\text{page size}) bits, because a page-table entry only stores a PPN plus some extra flag bits: the PPO is assumed to be all zeros, which forces each page table (and page) to be page-size-aligned
\log_2(\frac{\text{page size}}{\text{8-byte PTE}}) \cdot n + \text{VPO bits} = \text{virtual address bits}, where n is the number of page-table levels (here 9 \cdot 4 + 12 = 48)
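A small sketch of how the 9+9+9+9+12 split from the example above can be extracted from a 48-bit virtual address in code (the field widths follow the example; the function is illustrative, not an OS interface):

```c
#include <stdint.h>
#include <stdio.h>

#define VPO_BITS   12   /* log2(4 KB page size)                 */
#define INDEX_BITS 9    /* log2(4096-byte table / 8-byte PTE)   */
#define LEVELS     4    /* 4 levels * 9 + 12 = 48 bits          */

/* Print the per-level page-table indices and the page offset
 * for a 48-bit virtual address. */
void split_va(uint64_t va) {
    uint64_t vpo = va & ((1ULL << VPO_BITS) - 1);
    for (int level = LEVELS - 1; level >= 0; level--) {
        uint64_t idx = (va >> (VPO_BITS + level * INDEX_BITS))
                       & ((1ULL << INDEX_BITS) - 1);
        printf("level-%d index: %llu\n", LEVELS - level,
               (unsigned long long)idx);
    }
    printf("VPO: %llu\n", (unsigned long long)vpo);
}
```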
TLB Example: 16 entries, 4-way associative
TLBI = \log_2(\text{number of sets}) = \log_2(\frac{\text{entries}}{\text{associativity}}); here 16 / 4 = 4 sets, so TLBI = 2 bits and the remaining VPN bits form the TLB tag (TLBT)
Normal Cache Example: 16 entries, 4-way associative
Actual Address Translation:
L1 Cache Optimization: if the cache's tag bits coincide with the PPN (equivalently, the set index and block offset fit entirely within the page offset), then the index bits do not have to be translated before the cache access, so the set lookup can start in parallel with the TLB translation.
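A worked instance, assuming for illustration an L1 cache with 64-byte lines and 64 sets (these cache parameters are an assumption, not given in the notes): \log_2(64) + \log_2(64) = 6 + 6 = 12 = \log_2(4\text{ KB}) = VPO bits, so the block offset and set index come entirely from the untranslated page offset; the set can be selected while the TLB is still translating the VPN, and only the tag comparison has to wait for the PPN.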
Virtual Kernel Memory:
kernel memory identical in every process (kernel code and global kernel data structures)
process-specific kernel memory (e.g., the process's page tables and kernel stack)
A linked list is used by the OS to keep track of the memory areas (in Linux, vm_area_struct):
Types of Page Faults:
Memory Mapping: VM areas are initialized by associating them with objects on disk
from a regular file: e.g., an executable
from an anonymous file: the first touch allocates a physical page full of 0's (demand-zero)
(Dirty pages are copied back and forth between memory and the swap area)
Copy-on-Write (COW): if a process is about to write to a shared region of memory, the kernel creates a minimal copy (just the written page) for that process; the page table is updated to track the new private copy.
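A minimal sketch of the observable effect of COW via fork() (this shows the behavior, not the kernel bookkeeping): after fork(), parent and child share physical pages; the first write by either one faults and gets a private copy, so the other does not see the change.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* One heap page, shared copy-on-write with the child after fork(). */
    char *buf = malloc(4096);
    buf[0] = 'A';

    pid_t pid = fork();
    if (pid == 0) {                 /* child */
        buf[0] = 'B';               /* write faults; kernel copies the page */
        printf("child sees:  %c\n", buf[0]);   /* B */
        exit(0);
    }
    waitpid(pid, NULL, 0);          /* parent */
    printf("parent sees: %c\n", buf[0]);       /* still A */
    free(buf);
    return 0;
}
```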
Kernel Same-Page Merging
OS scans through all of physical memory, looking for duplicate pages
When found, merge into single copy, marked as copy-on-write
Implemented in Linux kernel in 2009
Limited to pages marked as likely candidates
Especially useful when a machine is running many virtual machines
A function to map a disk file into the virtual address space:
// Map [len] bytes starting at offset [offset] of the file specified
// by file descriptor [fd], preferably at address [start].
// Returns a pointer to the start of the mapped area (may not be [start]).
void *mmap(void *start, size_t len, int prot, int flags, int fd, off_t offset);
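A minimal usage sketch for the "reading big files" case below: map a file read-only and scan its bytes (error handling abbreviated; counting newlines is just an arbitrary workload).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    if (argc != 2) return 1;

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    fstat(fd, &st);

    /* Let the kernel pick the start address; pages are faulted in lazily. */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    long count = 0;
    for (off_t i = 0; i < st.st_size; i++)   /* count newlines */
        if (data[i] == '\n') count++;
    printf("%ld lines\n", count);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}
```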
Why mmap()
Reading big files
Shared data structures (when called with MAP_SHARED)
File-based data structures (e.g., databases)
Attacking a program by using mmap() to allocate memory when the stack is not writable
Each page table must itself be page-sized (and page-aligned): every entry stores only a PPN, which points either to the next-level page table or to the final physical page, so whatever it points to must start on a page boundary and occupy one page. Therefore, from the page size we know how big each page table is (by the equality above) and how many bits the PPO has (the final page, or block, size). From the page-table size and the PTE size we know how many entries each table holds, and therefore how many page tables (levels) are needed in total for a given virtual address space. Notice that each 9-bit slice of the virtual address selects one 8-byte entry in a 2^12-byte table; the entry supplies a 40-bit PPN, and the low 12 bits of the pointed-to table's address are known to be zero by alignment (40 + 12 = 52-bit physical address).
When malloc() does an allocation of 1028 KB or more, it uses mmap() to obtain the memory directly rather than extending the heap.