
COMPSYS 304

Computer Architecture
Memory Management Units

[Photo: Reefed down - heading for Great Barrier Island]

Memory Management Unit


Physical and Virtual Memory
Physical memory presents a flat address space
Addresses 0 to 2^p - 1
p = number of bits in an address word
User programs accessing this space conflict in
multi-user (eg Unix) systems
multi-process (eg Real-Time) systems

Virtual Address Space
Each user has a private address space
Addresses 0 to 2^q - 1
q and p are not necessarily the same!

Memory Management Unit


Virtual Address Space
Each user has a private address space

[Diagram: each user's private address space (eg User D's) mapped onto physical memory]

Virtual Addresses
Memory Management Unit
CPU (running user programs) emits Virtual Address
MMU translates it to Physical Address
Operating System allocates pages of physical memory to users

Virtual Addresses
Mappings between user space and physical memory created by OS

Memory Management Unit (MMU)


Responsible for VIRTUAL → PHYSICAL address mapping
Sits between CPU and cache
[Diagram: CPU → (VA) → MMU → (PA) → Cache (D or I) → (PA) → Main Mem]

Cache operates on Physical Addresses


(mostly - some research on VA cache)

MMU - operation
OS constructs page tables - one for each user
Page address from memory address selects a page table entry
Page table entry contains physical page address

MMU - operation

[Diagram: virtual address (q bits) split into page number (q-k bits) and offset (k bits); the page number selects a page table entry, which supplies the physical page address]
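A minimal sketch of this lookup in C, assuming a single-level table and 8 kbyte pages; page_table[] and translate() are illustrative names, not any real MMU's interface:

#include <stdint.h>

#define K 13                          /* log2(page size): 8 kbyte pages */

/* Hypothetical single-level page table, indexed by virtual page
 * number; each entry holds a physical page (frame) number.            */
extern uint32_t page_table[];

uint32_t translate(uint32_t va)
{
    uint32_t page   = va >> K;                /* page number selects PTE */
    uint32_t offset = va & ((1u << K) - 1);   /* offset within the page  */
    return (page_table[page] << K) | offset;  /* physical address        */
}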

MMU - Virtual memory space


Page Table Entries can also point to disc blocks
Valid bit
Set: page in memory - address is physical page address
Cleared: page swapped out - address is disc block address
MMU hardware generates a page fault when a swapped-out page is requested

Allows virtual memory space to be larger than physical memory
Only the working set is in physical memory
Remainder on paging disc
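One way the valid bit could be modelled in C; the field layout and names are illustrative, not any real MMU's PTE format:

#include <stdint.h>

/* Sketch of a PTE: the same address field is a physical frame
 * number when valid, or a disc block number when swapped out.   */
typedef struct {
    uint32_t valid : 1;   /* set: page in memory; clear: on disc  */
    uint32_t dirty : 1;   /* set: page modified since loaded      */
    uint32_t addr  : 30;  /* frame number or disc block number    */
} pte_t;

extern void page_fault(uint32_t disc_block);  /* trap to OS handler */

uint32_t frame_of(pte_t e)
{
    if (!e.valid)
        page_fault(e.addr);   /* swapped out: hardware raises a fault */
    return e.addr;            /* in memory: physical page address     */
}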

MMU - Page Faults


Page Fault Handler
Part of OS kernel
Finds usable physical page
LRU algorithm
Writes it back to disc if modified
Reads requested page from paging disc
Adjusts page table entries
Memory access re-tried



Can be an expensive process!


Usual to allow page tables to be swapped out too!
Page fault can be generated on the page tables!
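A sketch of the handler's steps in C; every helper (choose_victim_lru, write_page_to_disc, etc.) is a hypothetical stand-in for real kernel routines:

/* Hypothetical kernel helpers standing in for real OS internals. */
extern unsigned choose_victim_lru(void);           /* LRU algorithm     */
extern int      page_dirty(unsigned frame);
extern void     write_page_to_disc(unsigned frame);
extern void     read_page_from_disc(unsigned block, unsigned frame);
extern void     invalidate_pte(unsigned frame);    /* old owner's PTE   */
extern void     map_pte(unsigned vpn, unsigned frame);
extern unsigned disc_block_of(unsigned vpn);

/* Called by the OS kernel when the MMU raises a page fault. */
void page_fault_handler(unsigned faulting_vpn)
{
    unsigned frame = choose_victim_lru();      /* find a usable page     */
    if (page_dirty(frame))
        write_page_to_disc(frame);             /* write back if modified */
    invalidate_pte(frame);                     /* adjust old mapping     */
    read_page_from_disc(disc_block_of(faulting_vpn), frame);
    map_pte(faulting_vpn, frame);              /* adjust new mapping     */
    /* on return, the faulting memory access is re-tried */
}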

MMU - practicalities
Page size
8 kbyte pages → k = 13
p = 32
p - k = 19
So page table size:
2^19 ≈ 0.5 × 10^6 entries
Each entry 4 bytes
0.5 × 10^6 × 4 = 2 Mbytes!

Page tables can take a lot of memory!


Larger page sizes reduce page table size but can waste space
Access one byte → whole page needed!
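The arithmetic above, checked as a minimal C program:

#include <stdio.h>

int main(void)
{
    unsigned p = 32, k = 13;                 /* 32-bit VAs, 8 kbyte pages */
    unsigned long entries = 1UL << (p - k);  /* 2^19 page table entries   */
    unsigned long bytes   = entries * 4;     /* 4 bytes per entry         */
    printf("%lu entries, %lu Mbytes\n", entries, bytes >> 20);
    return 0;                                /* 524288 entries, 2 Mbytes  */
}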

MMU - practicalities
Page tables are stored in main memory
They're too large to fit in smaller memories!
MMU needs to read PT for address translation
Address translation can require additional memory accesses!

>32-bit address space?

MMU - Virtual Address Space Size


Segments, etc
Virtual address spaces >2^32 bytes are common
Intel x86 - 46 bit address space
PowerPC - 52 bit address space

[Diagram: PowerPC address formation]

MMU - Protection
Page table entries
Access control bits
Read
Write
Read/Write
Execute
Typical uses:
Execute only - program ("text" in Unix)
Read only or read/write - data
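A sketch of how such bits might be tested, with hypothetical flag values:

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical access-control bits in a PTE. */
enum { PTE_R = 1u << 0, PTE_W = 1u << 1, PTE_X = 1u << 2 };

/* Permit the access only if every attempted right is granted. */
bool access_ok(uint32_t pte_bits, uint32_t attempted)
{
    return (attempted & ~pte_bits) == 0;
}
/* Text pages might carry PTE_X (or PTE_R | PTE_X);
 * data pages PTE_R or PTE_R | PTE_W.                */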

MMU - Alternative Page Table Styles


Inverted Page Tables
One page table entry (PTE) per page of physical memory
MMU has to search for the correct VA entry
PowerPC hashes VA → PTE address
PTE address = h( VA )
h = hash function
Hashing → collisions


MMU - Alternative Page Table Styles


Hash functions in hardware
hash of n bits to produce m bits
Usually m < n
Fewer bits reduces information content
There are only 2^m distinct values now!
Some n-bit patterns will reduce to the same m-bit pattern → collisions

Trivial example: 2 bits → 1 bit with xor
h(x1 x0) = x1 xor x0

 y    h(y)
 00    0
 01    1
 10    1
 11    0
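The trivial example in C, plus a sketch of the general pattern (xor-folding n bits down to m bits); both are illustrations, not the actual PowerPC hash:

#include <stdint.h>

/* The trivial example: h(x1 x0) = x1 xor x0 */
unsigned h2(unsigned x) { return ((x >> 1) ^ x) & 1u; }

/* xor-fold a wider value down to m bits - cheap to build in hardware */
uint32_t h(uint32_t va, unsigned m)
{
    uint32_t r = 0, mask = (1u << m) - 1;
    while (va) {              /* xor successive m-bit chunks together */
        r ^= va & mask;
        va >>= m;
    }
    return r;                 /* only 2^m distinct results possible   */
}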

MMU - Alternative Page Table Styles


Inverted Page Tables
Hashing → collisions
PTEs are linked together
PTE contains tags (like cache) and link bits

MMU searches the linked list to find the correct entry
Smaller page tables, but longer searches
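A sketch of the hashed lookup with collision chains; the structure and names are illustrative, not the PowerPC PTE format:

#include <stddef.h>
#include <stdint.h>

/* Illustrative inverted PTE: a tag says which virtual page maps
 * here (like a cache tag); link chains hash collisions together. */
typedef struct ipte {
    uint32_t     vpn_tag;  /* virtual page number stored here */
    uint32_t     frame;    /* physical page number            */
    struct ipte *link;     /* next PTE with the same hash     */
} ipte_t;

extern ipte_t  *hashed_pt[];        /* indexed by h(vpn)      */
extern uint32_t h(uint32_t vpn);    /* hardware hash function */

/* Walk the chain; -1 means no mapping: page fault. */
long ipt_lookup(uint32_t vpn)
{
    for (ipte_t *e = hashed_pt[h(vpn)]; e != NULL; e = e->link)
        if (e->vpn_tag == vpn)      /* tag match              */
            return e->frame;
    return -1;
}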

MMU - Sharing
Can map virtual addresses from different user spaces to the same physical pages
Sharing of data
Commonly done for frequently used programs
Unix sticky bit
Text editors, eg vi
Compilers
Any other program used by multiple users
More efficient memory use
Less paging
Better use of cache

Address Translation - Speeding it up


Two+ memory accesses for each datum?
Page table: 1 - 3 accesses (single-level to 3-level tables)
Actual data: 1 access

System running slower than an 8088?


Solution:
Translation Look-Aside Buffer (TLB or TLAB)
Small cache of recently-used page table entries
Usually fully-associative
Can be quite small!
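A sketch of a fully-associative TLB lookup in C; in hardware all entries are compared at once, which the loop only models (sizes and field names are illustrative):

#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64   /* small: all entries compared in parallel in HW */

typedef struct {
    bool     valid;
    uint32_t vpn;        /* virtual page number (the tag) */
    uint32_t pfn;        /* physical frame number         */
} tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];

/* Fully associative: any entry may hold any page. Returns true on a hit. */
bool tlb_lookup(uint32_t vpn, uint32_t *pfn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pfn = tlb[i].pfn;
            return true;
        }
    return false;    /* miss: look up (or search) the page table */
}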

Address Translation - Speeding it up


TLB sizes

MIPS R4000            1992    48 entries
MIPS R10000           1996    64 entries
HP PA7100             1993   120 entries
Pentium 4 (Prescott)  2006    64 entries

One page table entry / page of data


Locality of reference
Programs spend a lot of time in same memory region

TLB hit rates tend to be very high - 98%
Compensates for the cost of a miss
(many memory accesses, but for only 2% of references to memory!)

TLB - Coverage
How much of the address space is covered by the TLB?
ie how large is the address space for which the TLB will ensure fast translations?
All others will need to fetch PTEs from cache, main memory or (hopefully not too often!) disc

Coverage = number of TLB entries × page size


eg 128-entry TLB, 8 kbyte pages
Coverage = 2^7 × 2^13 = 2^20 = 1 Mbyte

Critical factor in program performance


Working data set size > TLB coverage could be bad news!
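The coverage calculation as C:

/* Coverage = number of TLB entries x page size */
unsigned long tlb_coverage(unsigned entries, unsigned long page_size)
{
    return entries * page_size;
}
/* tlb_coverage(128, 8192) = 2^7 x 2^13 = 2^20 = 1 Mbyte */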

TLB - Coverage

Luckily, sequential access is fine!


Example: large (several Mbyte) array of doubles (8 byte floating point values)
8 kbyte pages → 1024 doubles/page

Sequential access, eg sum all values:


for (j = 0; j < n; j++)
    sum = sum + x[j];

First access to each page → TLB miss
but now the TLB is loaded with the entry for this page, so
Next 1023 accesses → TLB hits

TLB Hit Rate = 1023/1024 ~ 99.9%

TLB - Avoid thrashing!

n × n float matrix accessed by columns
For n ≥ 2048, what is the TLB hit rate?
Each element is on a new page!
(a row of 2048 4-byte floats fills a whole 8 kbyte page)
TLB hit rate = 0%
General rule:
Access matrices by rows if possible
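The contrast in C, assuming a row-major n × n float array (as C arrays are) and 8 kbyte pages:

/* By rows: consecutive elements share a page - TLB hits dominate */
double sum_by_rows(int n, float x[n][n])
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            sum += x[i][j];
    return sum;
}

/* By columns: for n >= 2048 each row fills at least one page, so
 * every access lands on a different page - the TLB thrashes       */
double sum_by_cols(int n, float x[n][n])
{
    double sum = 0.0;
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            sum += x[i][j];
    return sum;
}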

Memory Hierarchy - Operation


Search = look up (standard PT) or search (inverted PT)
Hit = page found in RAM
