Wednesday, October 2, 2019

CS2040s: DG 4

Hashset

Imagine we are taking inventory and we want to check what types of fruits is in the house so we add the differnt fruits into the hashset. We are not concern how many there are but more of the existence

HASHMAP
Lests imagine we have an inventory checklist, we want to add a variable to it. 

HASHSET VS HASHMAP
Same structure but one needs a variables



Hashtable recap

Insert -> o(1)

Lookup o(1)

Floor/ceiling-> O(n)

Max/min -> O(n)
It still unordered so we need to do a look up.

Collision resolution techniques

A good hash functions:
- Fast to compute
- Uses minimum slots as possible
- Scatter as uniform
- Minimal collision

Hashing functions

- Open addressing 

  - linear probing
     Insert:  Just iterate through
     Delete: Iterate through, change to del instead of just removing it
     Search: Using the hashing, iterate, if counter del just continue until reach the end

  - quadratic probing
Relative better than linear ...
But will cause a secondary clustering, there will be many empty slots that are not used
     Insert: Iterate but also with ^2 movement, h(x) + (s)^2 starting from the first slot not the next one\
     
  - double probing
Using h(v) + step * h2(v) %M
h2(v) > 1
Give a differnt prob sequence for similar keys. More complicated

- Close addressing

  - Chaining
   Its a linked list with an hash
   We need to traverse the list.
    Search: to search for a number, look for the position and traverse the list, return not found when empty slot is reach
Simple to implement


String Hashing

Convert to ascii, multiply each chara's ascii with p^[index number] sum everything and mod it with p
abs it so that we wont get any negative numbers
There might be a collision because are modulo M. When mod m we are dealing with smaller numbers. Use M = 10^9+9

Robin Hood Hashing

1. Hash value
2. Key
3. Value

The hash value is calculated from the key, if element hash zero is 0, we will just put it tehre an just return
- (number mod 10) to get the starting position
- Distance start from 0
- If the distance of the existing slot is less than current, increase distance by one and move to next slot
- if can replace and its not an empty, swap and continue search position for the swapped out item


But we need to check the existing distance of the current thing if its occupyied. If my current distance is more than what is there, check if its deleted,
If its already deleted, we just put the value there.
If its not, we will check whose distance is bigger and swap
if not iterate
Distance is always incremented by 1 and stored together with each node.
When we swap the values, we 
we will swap\

Deleting:
We replace value with del but do not change the distance value

Searching:
Linear search but when it reaches the max distance, should stop searching

No comments:

Post a Comment