
Master techniques to measure, analyze, and optimize data structure performance.
The `timeit` module measures execution time reliably by running code multiple times.
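Because any single timing is noisy, `timeit.repeat` runs the whole measurement several times; the minimum of the runs is the most stable estimate. A small sketch (the timed statement here is arbitrary):

```python
import timeit

# Run the measurement 5 times, 10,000 loops each; min() is the least noisy.
times = timeit.repeat('"-".join(str(n) for n in range(100))',
                      repeat=5, number=10000)
print(f"best of 5: {min(times):.4f}s")
```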
```python
import timeit

# Setup code shared by all measurements
setup = """
lst = list(range(1000))
d = {i: i for i in range(1000)}
s = set(range(1000))
"""

# List search - O(n)
list_time = timeit.timeit('500 in lst', setup=setup, number=100000)
print(f"List search: {list_time:.4f}s")   # ~0.15s

# Dict search - O(1)
dict_time = timeit.timeit('500 in d', setup=setup, number=100000)
print(f"Dict search: {dict_time:.4f}s")   # ~0.005s

# Set search - O(1)
set_time = timeit.timeit('500 in s', setup=setup, number=100000)
print(f"Set search: {set_time:.4f}s")     # ~0.005s

print(f"Dict/Set are {list_time/dict_time:.0f}x faster than list!")
```

`cProfile` shows which functions consume the most time.
```python
import cProfile
import pstats
from io import StringIO

def process_data():
    """Simulate data processing with an inefficient duplicate search."""
    data = list(range(500)) * 2              # each value appears twice
    duplicates = []
    for i in range(len(data)):               # O(n)
        for j in range(i + 1, len(data)):    # O(n) nested -> O(n^2) total
            if data[i] == data[j]:
                duplicates.append(data[i])
    return duplicates

# Profile the function
pr = cProfile.Profile()
pr.enable()
process_data()
pr.disable()

# Print the ten most expensive calls
s = StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
ps.print_stats(10)
print(s.getvalue())

# Output resembles (times are illustrative):
# ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#      1    0.150    0.150    0.150    0.150 test.py:2(process_data)
```
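For quick, throwaway checks there is also the module-level `cProfile.runctx` helper, which profiles a single statement against explicit namespaces and skips the enable/disable bookkeeping. A minimal sketch (`busy` is a hypothetical workload, not part of the lesson's example):

```python
import cProfile

def busy():
    # Hypothetical workload: sum of squares
    return sum(i * i for i in range(100_000))

# Profile one call and print stats sorted by cumulative time
cProfile.runctx('busy()', globals(), locals(), sort='cumulative')
```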
Replacing repeated list membership tests with set operations is one of the most common real-world wins; the benchmark below makes the difference concrete.

```python
import timeit

# Problem: find which users appear in both lists
users_a = list(range(10000))
users_b = list(range(5000, 15000))

# Approach 1: scan one list per element - O(n^2) - SLOW
def find_common_slow(a, b):
    result = []
    for user in a:
        if user in b:          # list membership is O(n)
            result.append(user)
    return result

# Approach 2: convert to sets - O(n) - FAST
def find_common_fast(a, b):
    return list(set(a) & set(b))

# Approach 3: set intersection directly, skipping the list conversion - O(n)
def find_common_best(a, b):
    return set(a) & set(b)

# Benchmark (number kept low because the O(n^2) version is expensive)
slow_time = timeit.timeit(
    'find_common_slow(users_a, users_b)',
    globals={'find_common_slow': find_common_slow,
             'users_a': users_a, 'users_b': users_b},
    number=10
)
fast_time = timeit.timeit(
    'find_common_fast(users_a, users_b)',
    globals={'find_common_fast': find_common_fast,
             'users_a': users_a, 'users_b': users_b},
    number=10
)
print(f"Slow (nested): {slow_time:.3f}s")
print(f"Fast (set):    {fast_time:.3f}s")
print(f"Speedup: {slow_time/fast_time:.0f}x")  # exact ratio is machine-dependent
```

Which structure to reach for:

| Task | Structure | Complexity | Reasoning |
|---|---|---|---|
| Store ordered items | List | O(1) append, O(n) search | Sequential access |
| Lookup by key | Dict | O(1) average | Fast key access |
| Membership check (large) | Set | O(1) average | Hash-based lookup |
| Fixed immutable data | Tuple | O(1) access, hashable | Memory efficient |
| Multiple operations | Custom class | Varies | Domain-specific |
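A quick sketch of the Tuple row: tuples are hashable, so they can serve as dict keys where lists cannot.

```python
# Tuples hash, so they can key a dict (e.g. grid coordinates)
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])  # point

# Lists are mutable and unhashable, so using one as a key raises TypeError
caught = None
try:
    {[0, 0]: "origin"}
except TypeError as err:
    caught = err
print("lists are unhashable:", caught)
```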
A few idiomatic micro-optimizations round out the toolkit:

```python
import math

# 1. List comprehension vs loop (faster creation)
# Slow
result = []
for i in range(10000):
    result.append(i * 2)
# Fast
result = [i * 2 for i in range(10000)]

# 2. Generator expression vs list comprehension (memory)
# Builds the whole list in memory
squares = [x**2 for x in range(1000000)]
# Lazy evaluation - memory efficient
squares = (x**2 for x in range(1000000))

# 3. Local variable access (faster than global/attribute lookup)
# Slow: attribute lookup on every iteration
# (math.pi stands in for any module-level attribute, e.g. some_module.CONSTANT)
total = 0
for i in range(1000000):
    total += math.pi
# Fast: cache the attribute in a local variable
pi = math.pi
total = 0
for i in range(1000000):
    total += pi

# 4. Built-in functions (implemented in C)
data = list(range(1000))
# Slow
total = 0
for x in data:
    total += x
# Fast
total = sum(data)
```
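Tip 4 is easy to verify with the same `timeit` technique from the start of this lesson (absolute times are machine-dependent):

```python
import timeit

setup = "data = list(range(10000))"
loop_stmt = """
total = 0
for x in data:
    total += x
"""

# Pure-Python loop vs the C-implemented built-in
loop_time = timeit.timeit(loop_stmt, setup=setup, number=1000)
sum_time = timeit.timeit("sum(data)", setup=setup, number=1000)
print(f"loop: {loop_time:.3f}s  sum(): {sum_time:.3f}s")
```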