
B-Tree

By Mahmoud Ismail

CS 600.226: Data Structures, Professor: Greg Hager JHU Spring 2010


2004 Goodrich, Tamassia

Agenda
• Review of 2-4 Tree
• B Tree
• Dictionary and Map


Multi-way Search Trees


• Each node may store multiple key-element pairs
• A node with d children (a d-node) stores d-1 key-element pairs
• Children have keys that fall either before the smallest parent key, after the largest parent key, or between two parent keys
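
The key-range rule above is what drives a search: at each node, find the first stored key that is not smaller than the target, and if it does not match, descend into the child at that position. The following is only a minimal sketch; the `Node` class, its `keys`/`children` fields, and the convention that an empty `children` list stands for all-external children are assumptions made for illustration, not part of the slides.

```python
from bisect import bisect_left

class Node:
    """Hypothetical d-node: d children and d-1 sorted keys.

    An empty `children` list means every child is an external (empty) node.
    """
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while True:
        i = bisect_left(node.keys, key)      # first index with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:                # we would step into an external node
            return False
        node = node.children[i]              # child i lies between keys[i-1] and keys[i]

root = Node([10, 15, 24], [Node([2, 8]), Node([12]), Node([18]), Node([27, 32])])
print(search(root, 18), search(root, 20))    # True False
```

Using binary search inside a node is a design choice; with only a handful of keys per node a linear scan would work just as well.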

Example Multi-way Search Tree


[Figure: example multi-way search tree storing the keys 10, 15, 20, 22, 25, 27, 30, 40, 42, 45, 50, 55, 60, 64, 66, 70, 75, 80, 85, 90]

• There is an external node between each pair of consecutive keys, plus one before the smallest key and one after the largest
  - With n keys: (n-1) + 1 + 1 = n+1 external nodes

(2,4) Trees
• A (2,4) tree (also called a 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties
  - Node-Size Property: every internal node has at most four children
  - Depth Property: all the external nodes have the same depth
• Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node

[Figure: a (2,4) tree with root (10 15 24) and children (2 8), (12), (18), (27 32)]
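
The two properties can be checked mechanically. The sketch below is illustrative only: the `Node` layout (sorted `keys`, a `children` list that is empty when all children are external) and the function names are assumptions, and the check covers only node sizes and depths, not the ordering of keys.

```python
class Node:
    """Hypothetical node: an empty `children` list means all children are external."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def leaf_depths(node, depth=0):
    """Yield the depth of every bottom-level node, checking node sizes on the way."""
    if not 1 <= len(node.keys) <= 3:                 # 2-, 3- or 4-node only
        raise ValueError(f"node {node.keys} violates the node-size property")
    if not node.children:
        yield depth
        return
    if len(node.children) != len(node.keys) + 1:     # a d-node stores d-1 keys
        raise ValueError(f"node {node.keys} has the wrong number of children")
    for child in node.children:
        yield from leaf_depths(child, depth + 1)

def is_24_tree(root):
    """True if node sizes are legal and all external nodes share one depth."""
    try:
        return len(set(leaf_depths(root))) == 1
    except ValueError:
        return False

good = Node([10, 15, 24], [Node([2, 8]), Node([12]), Node([18]), Node([27, 32])])
print(is_24_tree(good))   # True
```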

Simple Insertion (no overflow)


[Figure: Insert 15. Under the root (10), the node (12 14) becomes (12 14 15); it now holds three keys, so there is no overflow]

Overflow and Split


• We handle an overflow at a 5-node v with a split operation:
  - let v1 ... v5 be the children of v and k1 ... k4 be the keys of v
  - node v is replaced by nodes v' and v"
      - v' is a 3-node with keys k1 k2 and children v1 v2 v3
      - v" is a 2-node with key k4 and children v4 v5
  - key k3 is inserted into the parent u of v (a new root may be created)

[Figure: before the split, u = (15 24) has children (12), (18) and the overflowing node v = (27 30 32 35) with children v1 ... v5. After the split, u = (15 24 32) has children (12), (18), v' = (27 30) with children v1 v2 v3, and v" = (35) with children v4 v5]
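
The split rule can be written down almost word for word. This is a minimal sketch under an assumed `Node` representation (sorted `keys`, `children` list that is empty at the bottom level); inserting k3 into the parent u is left to the caller.

```python
class Node:
    """Hypothetical node: an empty `children` list means all children are external."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def split(v):
    """Split an overflowing 5-node v and return (v', k3, v'').

    v' is a 3-node with keys k1 k2 and children v1 v2 v3,
    v'' is a 2-node with key k4 and children v4 v5,
    and k3 is the key the caller must insert into the parent u.
    """
    assert len(v.keys) == 4, "split expects a 5-node (four keys)"
    k1, k2, k3, k4 = v.keys
    v_prime = Node([k1, k2], v.children[:3])
    v_second = Node([k4], v.children[3:])
    return v_prime, k3, v_second

left, up, right = split(Node([27, 30, 32, 35]))
print(left.keys, up, right.keys)   # [27, 30] 32 [35]
```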

Insertion with Overflow


[Figure: Insert 11. The root (10) has children (5) and (12 14 15); inserting 11 gives the overflowing node (11 12 14 15). Split: 14 moves up, giving root (10 14) with children (5), (11 12) and (15)]

Insert with Cascading Split


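• The overflow may propagate to the parent node u

[Figure: the root (6 8 10) has children (5), (7), (9) and (12 14 15). Insert 11: the last child becomes (11 12 14 15). Split: 14 moves up, giving the overflowing root (6 8 10 14) with children (5), (7), (9), (11 12), (15). Split again: 10 moves up into a new root (10) with children (6 8) and (14), above the leaves (5), (7), (9), (11 12), (15)]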

Inserting into (2,4) Tree


1. Search for position in deepest internal node
2. Insert into position
3. If # elements > 3, do a split operation
   - Split node into 2 nodes
   - Push 1 element up to parent
       - Create new root if no parent
       - If parent overflows, split parent

Analysis of Insertion
• Algorithm insert(k, o)
    1. We search for key k to locate the insertion node v
    2. We add the new entry (k, o) at node v
    3. while overflow(v)
         if isRoot(v)
           create a new empty root above v
         v ← split(v)

• Let T be a (2,4) tree with n items
  - Tree T has O(log n) height
  - Step 1 takes O(log n) time because we visit O(log n) nodes
  - Step 2 takes O(1) time
  - Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits
• Thus, an insertion in a (2,4) tree takes O(log n) time
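
The three steps of insert(k, o) can be sketched as follows. This is an illustration under stated assumptions, not the course's implementation: the `Node` and `Tree24` classes and the parent pointers are hypothetical, only keys are stored (the element o is omitted), and keys are assumed to be distinct.

```python
from bisect import bisect_left

class Node:
    """Hypothetical node with a parent pointer; empty `children` means bottom level."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []
        self.parent = None
        for child in self.children:
            child.parent = self

class Tree24:
    def __init__(self):
        self.root = Node([])

    def insert(self, k):
        # Step 1: search for k to locate the insertion node v.
        v = self.root
        while v.children:
            v = v.children[bisect_left(v.keys, k)]
        # Step 2: add the new key at node v.
        v.keys.insert(bisect_left(v.keys, k), k)
        # Step 3: while v overflows, split it and continue at its parent.
        while len(v.keys) > 3:
            v = self._split(v)

    def _split(self, v):
        k1, k2, k3, k4 = v.keys
        left = Node([k1, k2], v.children[:3])
        right = Node([k4], v.children[3:])
        u = v.parent
        if u is None:                         # v was the root: create a new root
            self.root = Node([k3], [left, right])
            return self.root
        i = u.children.index(v)
        u.keys.insert(i, k3)                  # push k3 up into the parent
        u.children[i:i + 1] = [left, right]   # replace v by v' and v''
        left.parent = right.parent = u
        return u

t = Tree24()
t.root = Node([6, 8, 10], [Node([5]), Node([7]), Node([9]), Node([12, 14, 15])])
t.insert(11)
print(t.root.keys, [c.keys for c in t.root.children])   # [10] [[6, 8], [14]]
```

The usage lines replay the cascading-split example from the slides: inserting 11 splits the leaf, then the root.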

Deletion
Simple Case: Delete item from a leaf node

[Figure: Remove 14. The root (6 8 10) has children (5), (7), (9) and (12 14 15); the last child becomes (12 15)]

Removal with Swap


Deletion from a non-leaf node: we replace the entry with its inorder successor (or, equivalently, with its inorder predecessor)

[Figure: Remove 10 from the root (6 8 10) with children (5), (7), (9) and (12 14 15). Swap: the inorder successor 12 takes its place, giving root (6 8 12) with children (5), (7), (9) and (14 15)]

Underflow and Transfer


• To handle an underflow at node v with parent u, we consider two cases
• Case 2: an adjacent sibling w of v is a 3-node or a 4-node
  - Transfer operation:
      1. we move a child of w to v
      2. we move an item from u to v
      3. we move an item from w to u
  - After a transfer, no underflow occurs

[Figure: before the transfer, u = (4 9), w = (6 8) and v is empty. After the transfer, u = (4 8), w = (6) and v = (9)]
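
A transfer can be sketched as a small function on the parent u and the index of the underflowing child v. The `Node` class and the function signature are assumptions for illustration; in the usage example, the leftmost child (2) of u is also an assumed placeholder, since only u and w are legible in the slide's figure.

```python
class Node:
    """Hypothetical node: an empty `children` list means all children are external."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def transfer(u, i):
    """Try to fix the underflow at v = u.children[i] by borrowing from a sibling.

    Returns True if an adjacent 3-node or 4-node sibling w was found,
    False if every adjacent sibling is a 2-node (a fusion is needed instead).
    """
    v = u.children[i]
    if i > 0 and len(u.children[i - 1].keys) > 1:            # w is the left sibling
        w = u.children[i - 1]
        if w.children:
            v.children.insert(0, w.children.pop())           # 1. move a child of w to v
        v.keys.insert(0, u.keys[i - 1])                      # 2. move an item from u to v
        u.keys[i - 1] = w.keys.pop()                         # 3. move an item from w to u
        return True
    if i + 1 < len(u.children) and len(u.children[i + 1].keys) > 1:   # w is the right sibling
        w = u.children[i + 1]
        if w.children:
            v.children.append(w.children.pop(0))
        v.keys.append(u.keys[i])
        u.keys[i] = w.keys.pop(0)
        return True
    return False

u = Node([4, 9], [Node([2]), Node([6, 8]), Node([])])   # v is the empty last child
print(transfer(u, 2), u.keys, [c.keys for c in u.children])
# True [4, 8] [[2], [6], [9]]
```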

Removal with Transfer


[Figure: Remove 9. The root (6 8 10) has children (5), (7), (9) and (12 14 15); removing 9 leaves an empty node. Transfer (~rotate): 10 moves down and 12 moves up, giving root (6 8 12) with children (5), (7), (10) and (14 15)]

Underflow and Fusion


• Deleting an entry from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys
• To handle an underflow at node v with parent u, we consider two cases
• Case 1: the adjacent siblings of v are 2-nodes
  - Fusion operation: we merge v with an adjacent sibling w and move an entry from u to the merged node v'
  - After a fusion, the underflow may propagate to the parent u

[Figure: before the fusion, u = (9 14) has children (2 5 7), w = (10) and the empty node v. After the fusion, u = (9) has children (2 5 7) and v' = (10 14)]
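
A fusion can be sketched in the same style. The `Node` class and the `fuse` signature are hypothetical, and the choice to prefer the left sibling when one exists is arbitrary; either adjacent sibling would do. The caller is responsible for noticing when u is left with no keys and continuing the repair one level up.

```python
class Node:
    """Hypothetical node: an empty `children` list means all children are external."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def fuse(u, i):
    """Merge v = u.children[i] with an adjacent sibling w and return the merged node v'.

    The key of u that separated v and w moves down into v'.
    """
    j = i - 1 if i > 0 else i            # j: left node of the pair, and the separating key in u
    left, right = u.children[j], u.children[j + 1]
    merged = Node(left.keys + [u.keys[j]] + right.keys,
                  left.children + right.children)
    del u.keys[j]                        # u loses one key (it may now underflow itself)
    u.children[j:j + 2] = [merged]
    return merged

u = Node([9, 14], [Node([2, 5, 7]), Node([10]), Node([])])   # v is the empty last child
v_prime = fuse(u, 2)
print(u.keys, [c.keys for c in u.children])   # [9] [[2, 5, 7], [10, 14]]
```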

Removal with Fusion


[Figure: Remove 7. The root (6 8 10) has children (5), (7), (9) and (12 14 15); removing 7 leaves an empty node. Fusion: 8 moves down and merges with (9), giving root (6 10) with children (5), (8 9) and (12 14 15)]

Removing from (2,4) Tree


1. Search for element
2. Remove element
3. If element's child is internal
   - Swap next larger element into hole (so we've removed an element above an external node)
4. If node has no elements
   - If an adjacent sibling has > 1 element: perform transfer (kind of rotation)
   - Else: perform fusion (can cascade upward)

Analysis of Deletion
• Let T be a (2,4) tree with n items
  - Tree T has O(log n) height
• In a deletion operation
  - We visit O(log n) nodes to locate the node from which to delete the entry
  - We handle an underflow with a series of O(log n) fusions, followed by at most one transfer
  - Each fusion and transfer takes O(1) time
• Thus, deleting an item from a (2,4) tree takes O(log n) time

(a,b) Trees
• Generalization of (2,4) trees
• Size property: internal node has at least a children and at most b children
  - 2 <= a <= (b+1)/2
• Depth property: all external nodes have same depth
• Height of (a,b) tree is Ω(log n / log b) and O(log n / log a)
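
The height bounds follow from counting external nodes. As a sketch, assume the tree stores n entries (so it has n+1 external nodes, all at depth h) and assume, as is usual for (a,b) trees though not stated on the slide, that the root may have as few as two children. Then the fullest tree has at most b^h external nodes and the sparsest has at least 2a^(h-1):

```latex
\[
2a^{\,h-1} \;\le\; n + 1 \;\le\; b^{\,h}
\quad\Longrightarrow\quad
\frac{\log (n+1)}{\log b} \;\le\; h \;\le\; \frac{\log \frac{n+1}{2}}{\log a} + 1
\]
```

The left inequality gives the Ω(log n / log b) bound and the right one gives the O(log n / log a) bound.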

External Memory Searching


• Memory Hierarchy
  - Registers
  - Cache
  - RAM
  - External Memory

Types of External Memory


• Hard disk
• Floppy disk
• Compact disc
• Tape
• Distributed/networked memory

Primary Motivation
• External memory access is much slower than internal memory access
  - orders of magnitude slower
  - need to minimize I/O complexity
  - can afford slightly more work on data in memory in exchange for lower I/O complexity

I/O Efficient Dictionaries


• Balanced tree structures
  - Typically O(log_2 n) transfers for query or update
  - Want to reduce height by constant factor as much as possible
  - Can be reduced to O(log_B n) = O(log_2 n / log_2 B)
      - B is number of nodes per block
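
To get a feel for the size of that constant factor, the identity log_B n = log_2 n / log_2 B can be evaluated directly. The values of n and B below are made-up illustrative numbers, and the function name is hypothetical.

```python
import math

def height_estimates(n, B):
    """Return (log_2 n, log_B n), using log_B n = log_2 n / log_2 B."""
    return math.log2(n), math.log2(n) / math.log2(B)

# Hypothetical sizes: one billion items, 1024 per block.
binary, wide = height_estimates(10**9, 1024)
print(f"log_2 n = {binary:.1f}, log_B n = {wide:.1f}")
# log_2 n = 29.9, log_B n = 3.0
```

With these numbers, a query that would touch about 30 levels in a binary-style balanced tree touches about 3 when each node fills a block.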

B+ Trees
• Choose a and b to be Θ(B)
• Data are stored at leaves
• All leaves are at the same depth

B+ Trees
• Non-leaf nodes (except the root) have between B/2 and B children
• Root has between 2 and B children
• All leaves are at the same depth and have between L/2 and L elements (where L is an arbitrary number, usually = B)

B+ Tree
Example


B+ Tree
• Best case h is O(log_B n)
• Worst case h is O(log_{B/2} n)
• I/O complexity for search is O(log_B n)
