Hash tables
HandoutHash tables: fast lookup
- A hash table stores items so you can find them very fast — usually in one step.
- It is a list of slots. A hash function turns a key into a slot number.
- Instead of searching every item, you jump straight to the slot the key belongs in.
A hash function
- A hash function takes a key and returns a slot index from
0tosize - 1. - The same key always gives the same slot, so you can find it again later.
- A simple one: add the character codes, then take
% sizeto stay in range.
def hash_key(key, size):
total = 0
for ch in key:
total += ord(ch) # ord("A") is 65, ord("B") is 66, ...
return total % size
print(hash_key("cat", 10)) # a slot from 0 to 9
print(hash_key("cat", 10)) # same key -> same slot
print(hash_key("dog", 10))
Collisions
- Two different keys can hash to the same slot. That is a collision.
- A table has limited slots, so collisions are unavoidable as it fills up.
- We need a rule for what to do when the slot we want is already taken.
Linear probing
- Linear probing: if a slot is full, try the next slot, then the next, wrapping around.
- Keep stepping
(slot + 1) % sizeuntil you find a free slot. - Below,
AandFboth want slot 0, soFis pushed to slot 1.
def hash_key(key, size):
total = 0
for ch in key:
total += ord(ch)
return total % size
def insert(table, key):
slot = hash_key(key, len(table))
while table[slot] is not None: # slot taken -> try the next one
slot = (slot + 1) % len(table)
table[slot] = key
return slot
table = [None] * 5
print(insert(table, "A")) # 0
print(insert(table, "F")) # 1 (A and F both hash to slot 0)
print(table) # ['A', 'F', None, None, None]
Looking up a key
- To find a key: hash it, then probe forward, comparing each slot to the key.
- Stop and return the index when you find it.
- If you reach an empty slot (or check every slot), the key is not there — return
-1.
Why hash tables are fast
- With few collisions, insert and find take about one step — we call this
O(1). - As the table fills, probing gets longer, so it is wise to keep some slots free.
- In the worst case (everything collides) it slows to a linear scan,
O(n).
Now you try
- Build the three parts: the hash function, insert with probing, and find.
- Each task checks your function on collisions and wrap-around cases.
- Press Check answer to test it.
Write hash_key(key, size). Add the character codes of key (use ord(ch)) and return the total % size, so the result is a slot from 0 to size - 1. Example: hash_key("AB", 10) is (65 + 66) % 10 = 1. The empty string gives 0.
Click Run to see the output here.
hash_key is provided. Write insert(table, key) using linear probing: go to the key's hash slot; while that slot is full (not None), step to (slot + 1) % len(table); put the key in the first free slot and return its index.
Click Run to see the output here.
hash_key is provided. Write find(table, key) that returns the index of key, or -1 if it is missing. Start at the hash slot and probe forward, comparing each slot. Stop at an empty slot (key not there). Make sure it ends even if the table is full — never loop forever.
Click Run to see the output here.