Complete Guide to Hash Tables in Python (Dictionaries)

A hash table is a data structure that stores data as key → value pairs and allows extremely fast lookups. In Python, the built-in dict type is implemented as a hash table. This guide explains how hash tables work, why they are so efficient, and how to use them effectively in Python.

You will learn:

How hash tables work internally
What hashing and collisions are
Why dictionary operations are O(1) on average
How to implement a simple hash table yourself in Python
Real-world applications of hash tables

1. What Is a Hash Table?

A hash table maps keys to values using a hash function. Instead of scanning through a list to find an item (O(n)), we compute an index using the key and jump directly to its position (average O(1)). This makes hash tables incredibly fast for lookups, inserts, and deletions.

key = "name"    value = "Alex"
hash("name") → 2    (store at index 2)

Index:    0    1    2            3    4    5
Buckets:  [ ]  [ ]  [name:Alex]  [ ]  [ ]  [ ]
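The difference between scanning and jumping directly to a slot can be sketched in a few lines. The names `TABLE_SIZE`, `pairs`, and `buckets` are illustrative, and this toy table ignores collisions, which are covered in section 4.

```python
# Toy comparison: linear scan of a list vs. direct indexing by hash.
# TABLE_SIZE, pairs, and buckets are illustrative names; collisions are ignored.
TABLE_SIZE = 8

pairs = [("name", "Alex"), ("age", 25)]


def scan_lookup(key):
    """O(n): check every pair until the key matches."""
    for k, v in pairs:
        if k == key:
            return v
    return None


buckets = [None] * TABLE_SIZE


def put(key, value):
    """Compute the slot from the key and store the pair there."""
    buckets[hash(key) % TABLE_SIZE] = (key, value)


def direct_lookup(key):
    """Average O(1): jump straight to the slot the key hashes to."""
    slot = buckets[hash(key) % TABLE_SIZE]
    if slot is not None and slot[0] == key:
        return slot[1]
    return None


put("name", "Alex")
print(scan_lookup("name"))    # Alex
print(direct_lookup("name"))  # Alex
```

The slot that "name" lands in depends on its hash value, so it will usually not be index 2 as in the diagram.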

Hash tables are the backbone of Python dictionaries, sets, caches, and many fast lookup systems.

2. Hash Functions

A hash function converts a key into an integer called a hash code. The hash table uses this number to decide where to store the key-value pair in an internal array.

Good hash functions should be:

Fast to compute
Deterministic: same key → same hash (in Python this holds within a single run; str hashes are randomized per process by default)
Uniform: spread keys across the table

In Python, the built-in hash() function returns the hash value of any hashable object.

To convert the hash into a valid index, the hash table uses the modulo operation:

index = hash(key) % table_size
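A short sketch of both steps, calling hash() and then reducing the result to an index. The value of `table_size` is an arbitrary choice for illustration, and the hashes of strings will differ from run to run.

```python
# hash() returns an integer; the modulo maps it to a slot in the table.
table_size = 8  # illustrative size, not what CPython uses internally

for key in ["name", 42, (1, 2)]:
    h = hash(key)
    index = h % table_size  # always in range(table_size), even when h is negative
    print(f"{key!r}: hash={h}, index={index}")

# Small integers hash to themselves in CPython, so this result is predictable.
print(hash(42) % table_size)  # 2
```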

3. Python Dictionaries as Hash Tables

Python's built-in dict type is a highly optimized hash table with:

Average O(1) lookup
Average O(1) insert and delete
Automatic resizing when needed
Collision handling using open addressing

3.1 Basic Dictionary Operations

Common dictionary operations include inserting, updating, looking up, and deleting keys, as well as membership tests and iteration.
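A minimal sketch of these operations, using a made-up `person` dictionary:

```python
person = {"name": "Alex", "age": 25}

person["country"] = "Lithuania"   # insert a new key
person["age"] = 26                # update an existing key

print(person["name"])             # Alex
print(person.get("email"))        # None (get() avoids a KeyError for missing keys)
print("age" in person)            # True

del person["age"]                 # delete a key

# Iterate over the remaining key-value pairs (insertion order is preserved).
for key, value in person.items():
    print(key, value)
```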

4. Collisions in Hash Tables

A collision occurs when two different keys hash to the same index. Collisions are unavoidable in finite-sized tables, so the hash table must handle them.

hash("apple")  → 2
hash("banana") → 2   # collision

Index:    0    1    2                3    4
Buckets:  [ ]  [ ]  ["apple", ...]   [ ]  [ ]
                     \
                      → ["banana", ...]   (simplified view)

Two popular strategies to handle collisions are:

Separate chaining – each bucket holds a list of key-value pairs.
Open addressing – find another empty slot in the table (probing).

CPython dictionaries use a form of open addressing with probing, not simple chaining.
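To make open addressing concrete, here is a toy table that uses plain linear probing. This is a simplified sketch, not CPython's actual probing scheme: the class name `LinearProbingTable` is hypothetical, and resizing and deletion are left out.

```python
class LinearProbingTable:
    """Toy open-addressing table with linear probing (no resizing, no deletion)."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size  # each slot is None or a (key, value) tuple

    def _probe(self, key):
        """Yield slot indices starting at the key's home slot, wrapping around."""
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def insert(self, key, value):
        """Store the pair in the first free or matching slot; return its index."""
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i
        raise RuntimeError("table is full")

    def get(self, key, default=None):
        """Follow the same probe sequence until the key or an empty slot is found."""
        for i in self._probe(key):
            if self.slots[i] is None:
                return default  # an empty slot ends the search
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return default


table = LinearProbingTable(size=8)
# 1 and 9 both map to slot 1 (9 % 8 == 1), so the second insert probes onward.
print(table.insert(1, "one"))   # 1
print(table.insert(9, "nine"))  # 2
print(table.get(9))             # nine
```

Integer keys are used here because small integers hash to themselves in CPython, which makes the collision reproducible.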

5. Load Factor & Resizing

The load factor of a hash table is:

load_factor = number_of_elements / table_size

If the load factor becomes too high, collisions increase and performance degrades. To fix this, the hash table automatically:

Allocates a bigger internal array
Rehashes and moves all existing key-value pairs

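Python dictionaries handle this resizing internally (CPython grows a dict once its table is about two-thirds full). A single insert that triggers a resize costs O(n), but resizes are infrequent enough that inserts remain O(1) on average.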
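The two resizing steps can be sketched for a table that stores lists of pairs in each bucket. The helper names `load_factor` and `resize` and the table sizes are assumptions made for illustration; they do not mirror CPython's internals.

```python
def load_factor(buckets):
    """Number of stored pairs divided by the number of buckets."""
    return sum(len(bucket) for bucket in buckets) / len(buckets)


def resize(buckets, new_size):
    """Allocate a bigger array and rehash every existing pair into it."""
    new_buckets = [[] for _ in range(new_size)]
    for bucket in buckets:
        for key, value in bucket:
            new_buckets[hash(key) % new_size].append((key, value))
    return new_buckets


# Fill a 4-bucket table with 6 pairs, then double its size.
buckets = [[] for _ in range(4)]
for n in range(6):
    buckets[hash(n) % 4].append((n, str(n)))

print(load_factor(buckets))  # 1.5
buckets = resize(buckets, 8)
print(load_factor(buckets))  # 0.75
```

Every pair has to be rehashed because its index depends on the table size.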

6. Time Complexity of Hash Table Operations

Average time complexities for hash table operations:

Operation	Average Time	Worst Case
Insert	O(1)	O(n)
Lookup	O(1)	O(n)
Delete	O(1)	O(n)

The worst case (O(n)) happens if many keys collide badly, but Python’s implementation is designed to make this extremely rare in real-world code.
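The worst case can be provoked on purpose with a key type whose hash is constant. `BadKey` below is a hypothetical class written only for this demonstration. It counts equality checks to show that building the dictionary degrades from linear to roughly quadratic work.

```python
class BadKey:
    """Hypothetical key type whose hash is constant, so every key collides."""

    eq_calls = 0  # class-level counter of equality comparisons

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 1  # deliberately terrible: all keys share one hash

    def __eq__(self, other):
        BadKey.eq_calls += 1
        return isinstance(other, BadKey) and self.n == other.n


def comparisons_to_build(count):
    """Insert `count` colliding keys and return how many comparisons were needed."""
    BadKey.eq_calls = 0
    d = {}
    for n in range(count):
        d[BadKey(n)] = n
    return BadKey.eq_calls


small = comparisons_to_build(50)
large = comparisons_to_build(100)
# Doubling the number of keys roughly quadruples the comparisons.
print(small, large)
```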

7. Hashable & Non-Hashable Keys

Keys in a hash table must be hashable, which means:

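They implement __hash__ (and __eq__, so that keys which compare equal also have equal hashes)
Their hash value never changes during their lifetime, which in practice means built-in keys are immutable types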
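A short sketch of what hashability looks like in practice. `Point` is a hypothetical class; a frozen dataclass is one convenient way to get matching `__eq__` and `__hash__` methods without writing them by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class Point:
    x: int
    y: int


labels = {Point(1, 2): "start"}

# An equal Point built later finds the same entry: equal objects hash equally.
print(labels[Point(1, 2)])  # start

# Mutable built-ins such as list refuse to be hashed at all.
try:
    hash([1, 2, 3])
except TypeError as exc:
    print("unhashable:", exc)
```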

7.1 Valid Dictionary Keys

Common hashable types:

int, float, bool
str
tuple (if it contains only hashable items)

7.2 Invalid Keys

Common non-hashable types:

list
dict
set

my_dict = {}

# ❌ list as a key → TypeError
# my_dict[[1, 2, 3]] = "invalid"

# ✔ tuple as a key
my_dict[(1, 2, 3)] = "valid"
print(my_dict)

8. Implementing a Simple Hash Table in Python

To understand hash tables deeply, it helps to build a simple version yourself. This example uses separate chaining (each bucket is a list of key-value pairs).

class HashTable:
    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(size)]

    def _hash(self, key):
        # Basic wrapper around Python's hash()
        return hash(key) % self.size

    def insert(self, key, value):
        index = self._hash(key)
        bucket = self.table[index]

        # Update if key exists
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value
                return

        # Otherwise, append new key-value pair
        bucket.append([key, value])

    def get(self, key):
        index = self._hash(key)
        bucket = self.table[index]
        for k, v in bucket:
            if k == key:
                return v
        return None

    def delete(self, key):
        index = self._hash(key)
        bucket = self.table[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False

    def __repr__(self):
        return str(self.table)

Usage example:

# Usage:
ht = HashTable()
ht.insert("name", "Alex")
ht.insert("age", 25)
ht.insert("country", "Lithuania")

print(ht.get("name"))      # "Alex"
print(ht.get("age"))       # 25

ht.delete("age")
print(ht.get("age"))       # None

9. Real-World Uses of Hash Tables

Hash tables are used in many systems for extremely fast lookup and mapping:

Databases – indexing and query optimization
Caches – mapping keys (URLs, IDs) to cached content
Compilers – symbol tables for variables and functions
Interpreters – mapping identifiers to values (like Python’s namespace dictionaries)
Networking – routing tables and connection tracking
Games – fast access to game entities by ID
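As one small illustration of the cache use case, a dictionary can remember results that were already computed. The functions `slow_square` and `cached_square` are hypothetical stand-ins for an expensive operation and its cached wrapper.

```python
cache = {}  # maps an argument to its previously computed result


def slow_square(n):
    """Stand-in for an expensive computation (hypothetical)."""
    return n * n


def cached_square(n):
    """Compute the result once, then serve repeats from the dictionary."""
    if n not in cache:
        cache[n] = slow_square(n)
    return cache[n]


print(cached_square(12))  # 144 (computed)
print(cached_square(12))  # 144 (served from the cache)
print(len(cache))         # 1
```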

10. Hash Tables vs Cryptographic Hashing

Hash tables use hashing for indexing and fast lookup. This is different from cryptographic hash functions, which are designed for security.

Examples of cryptographic hashes:

SHA-256
SHA-512
MD5 (not secure anymore)

Cryptographic hashes are used for data integrity checks, digital signatures, and blockchain, and inside dedicated password-hashing schemes such as PBKDF2 – not for dictionary indexing.
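The contrast shows up directly in the standard library: hashlib provides cryptographic digests, while the built-in hash() is meant only for in-memory lookup. A brief sketch, with `message` as an arbitrary example input:

```python
import hashlib

message = b"hello"

# SHA-256 yields the same 256-bit digest for the same input on every machine and run.
digest = hashlib.sha256(message).hexdigest()
print(len(digest))                                      # 64 hex characters
print(digest == hashlib.sha256(b"hello").hexdigest())   # True

# The built-in hash() of a str is randomized per process by default,
# so this value usually differs from one run to the next.
print(hash("hello"))
```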

11. Summary

In this guide, you learned:

What hash tables are and why they are so fast
How Python dictionaries implement hash tables internally
How hashing, collisions, and load factor work
Which types can be used as dictionary keys
How to implement your own simple hash table class in Python
Real-world systems that rely heavily on hash tables

Hash tables are one of the most powerful and widely used data structures. Mastering them gives you a huge advantage when designing fast and scalable applications.