B - Trees
Form of balanced tree
Anatomy of CPU
ALU does arithmetic
Cache lines
Disk
Ram
The alu puts in cache lines,
it will check the ram then if dont have it will check the disk and push back
After the cache check the value it will stoer it
If alu wants the same value, it will go to cache and just take it back. It does nit need to go to ther am and disk -> Cache hit
But what if we go to the cache line and it have to go to ram/ disk -> Cache miss
We want to maximise the amount of cache hits.
1. Assume that accessing the cache takes 1 second, accessing ram is 28 mins
2. SSD will take 5 days
3. HD will take 1-5 years
We try to utilise the cache.
Structure of B tree
- Can hold more than 3 children
- Sorted
- One node can hold more than one value
- Leaf nodes at same depth
- All non root nodes between b-1 to 2b-1 keys.
- Nodes can only be 2b-1 values
- The root has at most 2b-1 (b=2 means the root has at most 3 keys)
- If one node has c values, it must have c-1 subtrees
Properties:
1. All leaf nodes are of same depth
2. All non root nodes have b-1 to 2b-1 keys
3. the root has at most 2b-1 keys
4. An internal node with k keys must have k+1 subtrees
Analysis
Prove the height is logn
Each node beside root needs to have n-1 keys
first level = 1 key
2nd level = 2* (b-1)
We will see a heomteric prhression
2b^h -1 keys
which means the k has n-tree
Insertion
1. If current node has less than 2b keys, we do nothing
2. Else check our siblings, if either of them has less than 2b-1 keys, we rotate a value to them
3. Else do a split, meaning break current node into 3 parts, one node contain middle key (taken from median) and the other two contain the b keys each. Then push the middle key up into the parent
4. Recurse this upwards if we split
It will always grows up so the depth will always be the same
For top down insertion, when we do traversal, everytime we encounter a full node, we will split the node. We will only insert only after everything is done.
For top down insertion, when we do traversal, everytime we encounter a full node, we will split the node. We will only insert only after everything is done.
However, top down creates too much levels but for down up only splits when we need to
Deletion
1. If current node is still has b-1 keys, do nothing
2. Else we look at siblings, if either of them has greater than b-1 keys, we can take one from either
3. Else we pick one of the siblings and merge them using a key from the parent, then recurse on the parent node to check its size.
IF Unbalance
4. If parent is bare minimum, bring child up (successor), else if parent not bare minimum. bring parent down
No comments:
Post a Comment