Python · Hash Tables

Complete Guide to Hash Tables in Python (Dictionaries)

A hash table is a data structure that stores data as key → value pairs and allows extremely fast lookups. In Python, the built-in dict type is implemented as a hash table. This guide explains how hash tables work, why they are so efficient, and how to use them effectively in Python.

1. What Is a Hash Table?

A hash table maps keys to values using a hash function. Instead of scanning through a list to find an item (O(n)), we compute an index using the key and jump directly to its position (average O(1)). This makes hash tables incredibly fast for lookups, inserts, and deletions.

Index: 0 1 2 3 4 5 Buckets: [ ] [ ] [name:Alex] [ ] [ ] [ ] key = "name" value = "Alex" hash("name") → 2 store at index 2

Hash tables are the backbone of Python dictionaries, sets, caches, and many fast lookup systems.

2. Hash Functions

A hash function converts a key into an integer called a hash code. The hash table uses this number to decide where to store the key-value pair in an internal array.

Good hash functions should be:

In Python, you can see hash values using hash():

To convert the hash into a valid index, the hash table uses the modulo operation:

index = hash(key) % table_size

3. Python Dictionaries as Hash Tables

Python's built-in dict type is a highly optimized hash table with:

3.1 Basic Dictionary Operations

Here are some common operations with Python dictionaries:

4. Collisions in Hash Tables

A collision occurs when two different keys hash to the same index. Collisions are unavoidable in finite-sized tables, so the hash table must handle them.

hash("apple") → 2 hash("banana") → 2 # collision Index: 0 1 2 3 4 Buckets: [ ] [ ] ["apple", ...] [ ] [ ] \ → ["banana", ...] (simplified view)

Two popular strategies to handle collisions are:

CPython dictionaries use a form of open addressing with probing, not simple chaining.

5. Load Factor & Resizing

The load factor of a hash table is:

load_factor = number_of_elements / table_size

If the load factor becomes too high, collisions increase and performance degrades. To fix this, the hash table automatically:

Python dictionaries handle this resizing internally, so you still see average O(1) performance.

6. Time Complexity of Hash Table Operations

Average time complexities for hash table operations:

OperationAverage TimeWorst Case
InsertO(1)O(n)
LookupO(1)O(n)
DeleteO(1)O(n)

The worst case (O(n)) happens if many keys collide badly, but Python’s implementation is designed to make this extremely rare in real-world code.

7. Hashable & Non-Hashable Keys

Keys in a hash table must be hashable, which means:

7.1 Valid Dictionary Keys

Common hashable types:

7.2 Invalid Keys

Common non-hashable types:

my_dict = {}

# ❌ list as a key → TypeError
# my_dict[[1, 2, 3]] = "invalid"

# ✔ tuple as a key
my_dict[(1, 2, 3)] = "valid"
print(my_dict)

8. Implementing a Simple Hash Table in Python

To understand hash tables deeply, it helps to build a simple version yourself. This example uses separate chaining (each bucket is a list of key-value pairs).

Usage example:

# Usage:
ht = HashTable()
ht.insert("name", "Alex")
ht.insert("age", 25)
ht.insert("country", "Lithuania")

print(ht.get("name"))      # "Alex"
print(ht.get("age"))       # 25

ht.delete("age")
print(ht.get("age"))       # None

9. Real-World Uses of Hash Tables

Hash tables are used in many systems for extremely fast lookup and mapping:

10. Hash Tables vs Cryptographic Hashing

Hash tables use hashing for indexing and fast lookup. This is different from cryptographic hash functions, which are designed for security.

Examples of cryptographic hashes:

Cryptographic hashes are used for password storage, digital signatures, and blockchain – not for dictionary indexing.

11. Summary

In this guide, you learned:

Hash tables are one of the most powerful and widely used data structures. Mastering them gives you a huge advantage when designing fast and scalable applications.