Ojasa Mirai

Ojasa Mirai

Python

Loading...

Learning Level

🟢 BeginneršŸ”µ Advanced
šŸŽÆ Why Data Structures MatteršŸ“‹ Lists: Ordered CollectionsšŸ”‘ Dictionaries: Key-Value PairsšŸŽŖ Sets: Unique Values & SpeedšŸ“¦ Tuples: Immutable Sequencesāš–ļø Comparing All Data StructuresšŸ”§ Mastering List Methods⚔ Advanced Dictionary PatternsšŸ”„ Power of Set OperationsšŸ—ļø Building Nested Structuresāš™ļø Performance Tuning & Optimization
Python/Data Structures/Data Structure Performance

⚔ Advanced Performance Analysis — Profiling and Optimization

Master techniques to measure, analyze, and optimize data structure performance.


šŸŽÆ Profiling with timeit

The `timeit` module measures execution time reliably by running code multiple times.

import timeit

# Setup code
setup = """
lst = list(range(1000))
d = {i: i for i in range(1000)}
s = set(range(1000))
"""

# List search - O(n)
list_time = timeit.timeit('500 in lst', setup=setup, number=100000)
print(f"List search: {list_time:.4f}s")  # ~0.15s

# Dict search - O(1)
dict_time = timeit.timeit('500 in d', setup=setup, number=100000)
print(f"Dict search: {dict_time:.4f}s")  # ~0.005s

# Set search - O(1)
set_time = timeit.timeit('500 in s', setup=setup, number=100000)
print(f"Set search: {set_time:.4f}s")  # ~0.005s

print(f"Dict/Set are {list_time/dict_time:.0f}x faster than list!")

šŸ’” Profiling with cProfile

`cProfile` shows which functions consume the most time.

import cProfile
import pstats
from io import StringIO

def process_data():
    """Simulate data processing"""
    # Find duplicates - inefficient
    data = list(range(1000))
    duplicates = []
    for x in data:  # O(n)
        for y in data:  # O(n) nested
            if x == y and x != y:  # Always false, but benchmark anyway
                duplicates.append(x)
    return duplicates

# Profile the function
pr = cProfile.Profile()
pr.enable()
process_data()
pr.disable()

# Print results
s = StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
ps.print_stats(10)
print(s.getvalue())

# Output shows:
# ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#     1    0.150    0.150    0.150    0.150 test.py:2(process_data)

šŸŽØ Real-World Optimization Case Study

import timeit
from typing import List, Set, Dict

# Problem: Find which users appear in two lists
users_a = list(range(10000))
users_b = list(range(5000, 15000))

# Approach 1: Nested loops O(n²) - SLOW
def find_common_slow(a, b):
    result = []
    for user in a:
        if user in b:  # List membership O(n)
            result.append(user)
    return result

# Approach 2: Convert to set O(n) - FAST
def find_common_fast(a, b):
    return list(set(a) & set(b))

# Approach 3: Set operation directly O(n)
def find_common_best(a, b):
    return set(a) & set(b)

# Benchmark
slow_time = timeit.timeit(
    'find_common_slow(users_a, users_b)',
    globals={'find_common_slow': find_common_slow, 'users_a': users_a, 'users_b': users_b},
    number=100
)

fast_time = timeit.timeit(
    'find_common_fast(users_a, users_b)',
    globals={'find_common_fast': find_common_fast, 'users_a': users_a, 'users_b': users_b},
    number=100
)

print(f"Slow (nested): {slow_time:.3f}s")  # ~5.2s
print(f"Fast (set): {fast_time:.3f}s")     # ~0.08s
print(f"Speedup: {slow_time/fast_time:.0f}x")  # ~65x faster!

šŸ“Š Decision Matrix for Data Structure Selection

TaskStructureComplexityReasoning
Store ordered itemsListO(1) append, O(n) searchSequential access
Lookup by keyDictO(1) averageFast key access
Membership check (large)SetO(1) averageHash-based lookup
Fixed immutable dataTupleO(1) access, hashableMemory efficient
Multiple operationsCustom classVariesDomain-specific

šŸŽÆ Micro-optimization Techniques

# 1. List comprehension vs loop (faster creation)
# Slow
result = []
for i in range(10000):
    result.append(i * 2)

# Fast
result = [i * 2 for i in range(10000)]

# 2. Generator expression vs list comprehension (memory)
# Takes memory
squares = [x**2 for x in range(1000000)]

# Lazy evaluation - memory efficient
squares = (x**2 for x in range(1000000))

# 3. Local variable access (faster than global/attribute)
# Slow
total = 0
for i in range(1000000):
    total += some_module.CONSTANT

# Fast - cache in local variable
constant = some_module.CONSTANT
total = 0
for i in range(1000000):
    total += constant

# 4. Built-in functions (C implementation)
# Slow
total = 0
for x in data:
    total += x

# Fast
total = sum(data)

šŸ”‘ Key Takeaways

  • āœ… Use `timeit` for reliable performance measurements
  • āœ… Profile with `cProfile` to find bottlenecks
  • āœ… Set operations are 50-100x faster than list for membership
  • āœ… Big-O complexity matters more than micro-optimizations
  • āœ… Cache variables, use comprehensions, prefer built-ins

Ready to practice? Challenges | Quiz


Resources

Python Docs

Ojasa Mirai

Master AI-powered development skills through structured learning, real projects, and verified credentials. Whether you're upskilling your team or launching your career, we deliver the skills companies actually need.

Learn Deep • Build Real • Verify Skills • Launch Forward

Courses

PythonFastapiReactJSCloud

Ā© 2026 Ojasa Mirai. All rights reserved.

TwitterGitHubLinkedIn