Ojasa Mirai

📄 CSV Data Handling — Reading and Writing Spreadsheet Data

CSV (Comma-Separated Values) is one of the most widely used formats for sharing tabular data. This section covers reading, writing, and processing CSV files with Python's built-in `csv` module.

🎯 Understanding CSV Format

CSV files are plain text. Each line is one record (row), and the fields (columns) within a record are separated by commas. The first line often holds the column names.

# CSV format looks like this:
# name,age,city
# Alice,25,New York
# Bob,30,London
# Carol,28,Paris

# Reading CSV manually (string approach; breaks on quoted fields that contain commas)
csv_content = """name,age,city
Alice,25,New York
Bob,30,London"""

for line in csv_content.split('\n'):
    if line:  # Skip empty lines
        fields = line.split(',')
        print(f"Name: {fields[0]}, Age: {fields[1]}, City: {fields[2]}")
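The manual approach above works only while no field contains a comma. As a minimal sketch (the sample record is made up for illustration), compare a plain `split(',')` with `csv.reader` on a line that has quoted fields:

```python
import csv
import io

# Hypothetical record: two of the three fields contain a comma inside quotes
line = '"Smith, John",42,"Portland, OR"'

# Naive splitting cuts the quoted fields apart
naive_fields = line.split(',')

# csv.reader respects the quotes and returns the three intended fields
parsed_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)
print(len(parsed_fields), parsed_fields)
```

The naive split yields five fragments, while `csv.reader` returns three clean fields, which is the main reason to prefer the `csv` module for anything beyond toy data.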

📖 Reading CSV Files with csv Module

The `csv` module provides reliable CSV handling with proper delimiter and quote handling.

import csv

# Reading a CSV file (newline='' lets the csv module handle line endings itself)
with open('people.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is a list of strings

# Reading with headers using DictReader
with open('people.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row)  # Each row is a dictionary keyed by the header names
        print(f"Name: {row['name']}, Age: {row['age']}")
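Both readers return every value as a string, so numeric columns need an explicit conversion. The sketch below assumes the same `name,age,city` layout used above; the helper name `load_people` is hypothetical, and the data is read from an in-memory string so the example runs without a file on disk.

```python
import csv
import io

# In-memory stand-in for people.csv (same columns as the examples above)
csv_text = "name,age,city\nAlice,25,New York\nBob,30,London\n"

def load_people(file_obj):
    """Read rows with DictReader and convert the 'age' column from str to int."""
    people = []
    for row in csv.DictReader(file_obj):
        people.append({
            'name': row['name'],
            'age': int(row['age']),  # csv values arrive as strings
            'city': row['city'],
        })
    return people

people = load_people(io.StringIO(csv_text))
print(people)
```

Because the helper accepts any file-like object, the same function would also work with a file opened via `open('people.csv', newline='')`.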

✏️ Writing CSV Files

import csv

# Sample data
people = [
    {'name': 'Alice', 'age': 25, 'city': 'New York'},
    {'name': 'Bob', 'age': 30, 'city': 'London'},
    {'name': 'Carol', 'age': 28, 'city': 'Paris'}
]

# Writing CSV with DictWriter
with open('people.csv', 'w', newline='') as file:
    fieldnames = ['name', 'age', 'city']
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    writer.writeheader()  # Write column names
    for person in people:
        writer.writerow(person)

# Writing CSV with writer
data = [
    ['name', 'age', 'city'],
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'London']
]

with open('simple.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    for row in data:
        writer.writerow(row)
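Two details of `csv.writer` are worth seeing in action: `writerows` writes a whole list of rows in one call, and fields containing commas or quotes are quoted automatically. A small sketch with made-up rows, written to an in-memory buffer instead of a file:

```python
import csv
import io

# Hypothetical rows: one field contains a comma, another contains quotes
rows = [
    ['name', 'note'],
    ['Alice', 'Likes tea, not coffee'],
    ['Bob', 'Says "hi"'],
]

buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerows(rows)  # Write all rows in one call

output = buffer.getvalue()
print(output)

# Reading the text back recovers the original fields
round_trip = list(csv.reader(io.StringIO(output, newline='')))
print(round_trip == rows)
```

The comma-containing field is wrapped in quotes and the embedded quotes are doubled, so the data survives a write and read cycle unchanged.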

🎨 Processing CSV Data

import csv

# Count values from CSV
with open('sales.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    product_counts = {}
    for row in reader:
        product = row['product']
        product_counts[product] = product_counts.get(product, 0) + 1

print(product_counts)

# Filter CSV data (csv values are strings, so convert age before comparing)
with open('people.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    adults = [row for row in reader if int(row['age']) >= 18]

# Transform CSV data: read prices and write a new file with 10% tax added
with open('prices.csv', 'r', newline='') as file, \
        open('prices_with_tax.csv', 'w', newline='') as out:
    reader = csv.DictReader(file)
    writer = csv.DictWriter(out, fieldnames=['product', 'original_price', 'price_with_tax'])
    writer.writeheader()
    for row in reader:
        original = float(row['price'])
        with_tax = round(original * 1.1, 2)  # Round to cents to avoid float noise
        writer.writerow({
            'product': row['product'],
            'original_price': original,
            'price_with_tax': with_tax
        })
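Counting rows is one kind of aggregation; summing a numeric column per group follows the same pattern. The sketch below assumes a hypothetical `quantity` column alongside `product` (the source does not show the real layout of `sales.csv`) and uses in-memory data:

```python
import csv
import io
from collections import defaultdict

# Hypothetical sales data; the 'quantity' column is an assumption for illustration
sales_text = "product,quantity\nwidget,3\ngadget,1\nwidget,2\n"

totals = defaultdict(int)
for row in csv.DictReader(io.StringIO(sales_text)):
    # Convert the string value before adding it to the running total
    totals[row['product']] += int(row['quantity'])

print(dict(totals))
```

`defaultdict(int)` removes the need for the `.get(product, 0)` idiom, since missing keys start at zero.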

🔍 Handling Different Delimiters

import csv

# Tab-separated values (TSV)
with open('data.tsv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter='\t')
    for row in reader:
        print(row)

# Semicolon-separated (common in Europe, where the comma is the decimal separator)
with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter=';')
    for row in reader:
        print(row)

# Pipe-separated
with open('data.txt', 'r', newline='') as file:
    reader = csv.reader(file, delimiter='|')
    for row in reader:
        print(row)
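When the delimiter is not known in advance, the standard library's `csv.Sniffer` can try to guess it from a sample of the text. This is a sketch with made-up semicolon-separated data. The sniffer is a heuristic and may raise `csv.Error` or guess wrong on unusual input, so restricting the candidates with `delimiters` helps.

```python
import csv
import io

# Hypothetical semicolon-separated sample
sample = "name;age;city\nAlice;25;New York\nBob;30;London\n"

# Ask the sniffer to choose among a few likely delimiters
dialect = csv.Sniffer().sniff(sample, delimiters=';,\t|')
print(repr(dialect.delimiter))

# Pass the detected dialect to the reader
rows = list(csv.reader(io.StringIO(sample), dialect))
print(rows)
```

With a real file, one option is to read the first few kilobytes for sniffing, call `file.seek(0)`, and then create the reader with the detected dialect.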

📊 Real-World Example: Student Grades

import csv

# Sample CSV content
sample_data = """student,math,english,science
Alice,92,88,95
Bob,78,85,80
Carol,95,91,93
David,88,82,87"""

# Write sample data
with open('grades.csv', 'w', newline='') as f:
    f.write(sample_data)

# Read and analyze
with open('grades.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)

    for row in reader:
        math = int(row['math'])
        english = int(row['english'])
        science = int(row['science'])

        average = (math + english + science) / 3
        print(f"{row['student']}: Average = {average:.1f}")
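The loop above averages across each row. A per-subject (column) average needs all rows first, so one possible sketch loads them into a list and uses `statistics.mean`. It reuses the sample grades from this example, read from memory rather than from `grades.csv`:

```python
import csv
import io
from statistics import mean

# Same sample grades as above, kept in memory
grades_text = """student,math,english,science
Alice,92,88,95
Bob,78,85,80
Carol,95,91,93
David,88,82,87"""

rows = list(csv.DictReader(io.StringIO(grades_text)))
subjects = ['math', 'english', 'science']

# Average each column across all students
subject_averages = {
    subject: mean([int(row[subject]) for row in rows])
    for subject in subjects
}

for subject, average in subject_averages.items():
    print(f"{subject}: {average:.2f}")
```

Loading every row into a list is fine for small files; for large ones, running totals per column would avoid holding the whole file in memory.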

⚠️ Common CSV Issues

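import csv
import io

# Issue 1: Quoted fields with commas
data = 'name,description\n"Smith, John","A person, born in 1990"'
reader = csv.DictReader(io.StringIO(data))
for row in reader:
    print(row['name'], '|', row['description'])  # Quoted commas stay inside their fields

# Issue 2: Different line endings (Windows vs Unix)
# Use newline='' when opening CSV files in Python 3 so the csv module
# controls line endings; without it, files written on Windows can gain blank rows

# Issue 3: Encoding issues
# State the encoding explicitly; 'utf-8-sig' also strips the BOM that Excel may add
with open('data.csv', 'r', encoding='utf-8', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)  # International characters are decoded properly

# Issue 4: Empty lines in files
with open('data.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # csv.reader yields [] for a blank line; also skip whitespace-only rows
        if any(field.strip() for field in row):
            print(row)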
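One encoding problem that often surprises people is the byte order mark (BOM) that some tools, including Excel's "CSV UTF-8" export, place at the start of a file. The sketch below builds such bytes in memory (the data and the helper name `read_header` are hypothetical) and compares decoding with `utf-8` and `utf-8-sig`:

```python
import csv
import io

# Hypothetical file contents that begin with a UTF-8 byte order mark
raw = '\ufeffname,city\nZoë,Zürich\n'.encode('utf-8')

def read_header(raw_bytes, encoding):
    """Decode the bytes with the given encoding and return the first CSV row."""
    text = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding=encoding, newline='')
    return next(csv.reader(text))

plain = read_header(raw, 'utf-8')      # BOM stays attached to the first header
clean = read_header(raw, 'utf-8-sig')  # BOM is stripped during decoding

print(plain)
print(clean)
```

With plain `utf-8`, the first column name carries an invisible `\ufeff` prefix, so `row['name']` on a `DictReader` would raise `KeyError`. Opening the file with `encoding='utf-8-sig'` avoids that and also reads BOM-free files correctly.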

📊 CSV vs Other Formats

Format	Pros	Cons
CSV	Universal, simple, human-readable	No type info, delimiter issues
JSON	Structured, keeps basic types (numbers, booleans, null)	More verbose, larger files
Parquet	Efficient, columnar, typed	Binary format, needs specialized tools
Excel	Rich formatting	Proprietary, harder to parse
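The "no type info" drawback in the table can be made concrete with a round trip. In this sketch (the record is made up), the same dictionary is written and read back once as CSV and once as JSON:

```python
import csv
import io
import json

# Hypothetical record with a string, an integer, and a boolean
record = {'name': 'Alice', 'age': 25, 'active': True}

# CSV round trip: every value comes back as a string
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
buffer.seek(0)
from_csv = next(csv.DictReader(buffer))

# JSON round trip: basic types are preserved
from_json = json.loads(json.dumps(record))

print(from_csv)
print(from_json)
```

After the CSV round trip, `age` is the string `'25'` and `active` is the string `'True'`, so the reading code has to know and restore the intended types; JSON returns the original values.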

🔑 Key Takeaways

✅ Use `csv.DictReader` for header-based access

✅ Use `csv.DictWriter` to write structured data

✅ Always use `newline=''` when opening CSV files

✅ Handle different delimiters with the `delimiter` parameter

✅ CSV is supported almost everywhere, which makes it a good default for data exchange
