Class 12 • Chapter 3

File Handling in Python

File handling enables persistent storage by allowing Python to interact with the local disk for long-term data management.

11 note blocks • 5 exam topics

🎯 Exam Focus Areas

1. Difference between Read and Write modes (w, a, r+)
2. Importance of the 'with' statement for resource safety
3. Serialization vs Deserialization using pickle
4. Handling newlines in CSV files to prevent blank rows
5. Using seek() and tell() for file navigation

Introduction to Files

Data stored in variables is volatile—it disappears once the program stops running. To make data persistent, we store it in local files on the hard disk. File Handling is the process of creating, reading, writing, and manipulating these files using Python. A file is essentially a collection of bytes stored under a specific name. Python distinguishes between two primary types of files: **Text Files** (which store data in human-readable ASCII or Unicode characters) and **Binary Files** (which store data in the form of raw bytes, such as images, compiled code, or serialized objects). Understanding how to efficiently manage files is crucial for applications that need to save user settings, process large datasets, or log system events.

File Types & EOL

1. Text Files: Human-readable, uses End-of-Line (EOL) characters like \n or \r\n
2. Binary Files: Not human-readable, no specific EOL, stores data as it is in memory
3. Common Formats: .txt, .py (Text) vs .jpg, .dat, .exe (Binary)
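To make the text/binary distinction concrete, here is a minimal sketch that writes two short lines and then reads the same file back in both modes. The file name eol_demo.txt is hypothetical, chosen only for this illustration.

```python
# "eol_demo.txt" is a hypothetical file name used only for this illustration.
# newline="\n" asks Python to write "\n" exactly as given on every platform.
with open("eol_demo.txt", "w", newline="\n") as f:
    f.write("Hi\nBye\n")

# Text mode ("r"): the content comes back as a str.
with open("eol_demo.txt", "r") as f:
    as_text = f.read()

# Binary mode ("rb"): the content comes back as raw bytes.
with open("eol_demo.txt", "rb") as f:
    as_bytes = f.read()

print(type(as_text), repr(as_text))
print(type(as_bytes), as_bytes)
```

The same data appears as a string in text mode and as a bytes object in binary mode; the EOL character is visible as \n in both views.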

The File Access Flow

Regardless of the file type, the workflow for file handling in Python follows a standard three-step sequence: **Open → Process (Read/Write) → Close**. Opening a file requires the open() function, which creates a 'file object' acting as a bridge between your code and the disk. You must specify an 'access mode' (like 'r' for read or 'w' for write). Once processing is finished, it is vital to close the file using close() to free up system resources and ensure all data is correctly flushed to the disk. To automate this, Python provides the **'with' statement** (context manager), which automatically closes the file even if an error occurs during execution, making it the industry standard for safe file handling.

The 'with' Statement Pattern

```python
# Modern way to handle files safely
with open("sample.txt", "w") as f:
    f.write("Hello PyLearn!\n")
    f.write("Persistent storage is easy.")

# File is automatically closed here
```
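For comparison, the Open → Process → Close sequence can also be written out by hand. The sketch below is roughly what the 'with' statement does for you; manual_demo.txt is a hypothetical file name.

```python
# "manual_demo.txt" is a hypothetical file name used only for this illustration.

# Step 1: Open
f = open("manual_demo.txt", "w")
try:
    # Step 2: Process
    f.write("Opened, processed, and closed by hand.\n")
finally:
    # Step 3: Close (the finally block runs even if write() raises an error)
    f.close()

print(f.closed)

# The same three steps with a context manager: close() is called for us.
with open("manual_demo.txt", "r") as g:
    text = g.read()

print(g.closed)
```

Both file objects report closed as True afterwards; the 'with' version simply needs fewer lines and leaves less room to forget the close() call.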

Working with Text Files

Text files are read and written as strings. For reading, Python offers read() (reads the entire file), readline() (reads one line at a time), and readlines() (returns a list of all lines). For writing, write() takes a single string, while writelines() takes a list of strings and writes them as they are, without adding newline characters, so each string must end with \n if you want separate lines. A common pitfall is the 'w' (write) mode, which erases the existing content as soon as the file is opened; to add data without deleting the old content, you must use 'a' (append) mode. Additionally, the 'r+' mode allows both reading and writing on an existing file without erasing it, but requires careful control over the file pointer position.

Text File Read/Write Operations

```python
# Writing and then Reading lines
lines = ["Python\n", "Java\n", "C++\n"]
with open("langs.txt", "w") as f:
    f.writelines(lines)

with open("langs.txt", "r") as f:
    content = f.readlines()
    for line in content:
        print(line.strip())
```
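The difference between 'w' and 'a', and between the three read methods, is easiest to see side by side. This is a small sketch using a hypothetical file name, notes_demo.txt.

```python
# "notes_demo.txt" is a hypothetical file name used only for this illustration.

with open("notes_demo.txt", "w") as f:
    f.write("first\n")

# Opening with 'w' again erases "first" before anything is written.
with open("notes_demo.txt", "w") as f:
    f.write("second\n")

# Opening with 'a' keeps the old content and adds to the end.
with open("notes_demo.txt", "a") as f:
    f.write("third\n")

with open("notes_demo.txt", "r") as f:
    first_line = f.readline()  # one line, including its "\n"
    rest = f.read()            # everything from the pointer to the end

with open("notes_demo.txt", "r") as f:
    all_lines = f.readlines()  # list with one string per line

print(repr(first_line))  # 'second\n'
print(repr(rest))        # 'third\n'
print(all_lines)         # ['second\n', 'third\n']
```

Note that "first" never appears in the output, because the second 'w' open removed it.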

Binary Files & The pickle Module

Binary files store data as raw bytes rather than as lines of text, which makes them suitable for saving Python objects such as lists and dictionaries. To handle this, Python provides the **pickle module**, which performs Serialization (Pickling), converting an object into a byte stream, and Deserialization (Unpickling), converting those bytes back into an object. For pickling, the file must be opened in a binary mode: 'wb' (write binary) or 'ab' (append binary) for pickle.dump(), and 'rb' (read binary) for pickle.load(). This is useful for saving the state of a program, such as a high score in a game or a set of student records, because the object that is loaded back has the same type and values as the one that was saved.

Pickling and Unpickling

```python
import pickle

data = {"name": "Rahul", "score": 95, "rank": 1}

# Serialization
with open("student.dat", "wb") as f:
    pickle.dump(data, f)

# Deserialization
with open("student.dat", "rb") as f:
    loaded_data = pickle.load(f)
    print(loaded_data["name"])
```
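A binary file can hold more than one pickled object. The sketch below shows one common way to store several records and read them back until the file ends; the file name students_demo.dat and the sample records are made up for illustration.

```python
import pickle

# Sample records and file name are hypothetical, for illustration only.
students = [
    {"name": "Rahul", "score": 95},
    {"name": "Asha", "score": 91},
]

# Each dump() call appends one pickled object to the open file.
with open("students_demo.dat", "wb") as f:
    for record in students:
        pickle.dump(record, f)

# Each load() call reads the next object; EOFError signals the end of the file.
loaded = []
with open("students_demo.dat", "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))
        except EOFError:
            break

for record in loaded:
    print(record["name"], record["score"])
```

Catching EOFError is needed here because the file itself does not say how many objects it contains.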

CSV File Handling

CSV (Comma Separated Values) is a widely used, platform-independent format for tabular data (like spreadsheets). The **csv module** provides ready-made tools for handling these files. Using csv.writer() and csv.reader(), you can easily convert Python lists into rows of a spreadsheet and vice-versa. One important detail is the 'newline' parameter in the open() function: the csv writer adds its own line endings, so on Windows the default text-mode newline translation produces an extra blank row after every record. Opening the file with newline='' switches that translation off and prevents the blank rows; it is good practice to pass newline='' when reading CSV files as well. This makes Python a practical tool for data science and administrative automation tasks where spreadsheet processing is frequent.

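CSV Read/Write Operations

```python
import csv

header = ['ID', 'Name', 'Marks']
rows = [[1, 'Amit', 92], [2, 'Sumit', 88]]

# Writing CSV (newline='' prevents blank rows on Windows)
with open("results.csv", "w", newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)   # one row
    writer.writerows(rows)    # many rows from a nested list

# Reading CSV (newline='' lets the csv module handle line endings itself)
with open("results.csv", "r", newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)            # each row is a list of strings
```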
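One detail worth seeing in action: csv.reader() returns every field as a string, even if a number was written. The following sketch, using a hypothetical file marks_demo.csv, separates the header from the data and converts the marks back to integers before adding them up.

```python
import csv

# "marks_demo.csv" is a hypothetical file name used only for this illustration.
rows = [["ID", "Name", "Marks"], [1, "Amit", 92], [2, "Sumit", 88]]

with open("marks_demo.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open("marks_demo.csv", "r", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)               # first row: column names
    records = [row for row in reader]   # remaining rows: data

print(records)  # [['1', 'Amit', '92'], ['2', 'Sumit', '88']]

# Fields are strings, so convert before doing arithmetic.
total = sum(int(row[2]) for row in records)
print(total)    # 180
```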

The File Pointer and Navigation

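Every open file has a 'File Pointer' (or cursor) that indicates the current position from which reading or writing will occur. Python provides two vital methods for navigation: tell() returns the current position of the pointer (a byte offset from the start for files opened in binary mode; in text mode the number should only be treated as a bookmark to pass back to seek()), and seek(offset, whence) moves the pointer to a target location. The optional second argument sets the reference point: 0 for the start of the file (the default), 1 for the current position, and 2 for the end of the file. Because the default is 0, seek(0) moves the cursor to the very beginning of the file. In text mode, seeks must be measured from the start, apart from seek(0, 2), which jumps to the end; open the file in binary mode for the other cases. This granular control is essential for 'random access', when you need to modify or read a specific chunk of data in a large file without processing everything that comes before it.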
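The pointer movements described above can be traced with a ten-byte file. This is a minimal sketch in binary mode, where positions are plain byte offsets; pointer_demo.bin is a hypothetical file name.

```python
# "pointer_demo.bin" is a hypothetical file name used only for this illustration.
with open("pointer_demo.bin", "wb") as f:
    f.write(b"ABCDEFGHIJ")

with open("pointer_demo.bin", "rb") as f:
    start = f.tell()        # 0: a freshly opened file starts at the beginning
    first = f.read(3)       # b"ABC"
    after_read = f.tell()   # 3: reading moved the pointer forward
    f.seek(5)               # jump to byte 5, counted from the start
    middle = f.read(2)      # b"FG"
    f.seek(-2, 2)           # 2 bytes before the end (second argument 2 = end)
    tail = f.read()         # b"IJ"
    f.seek(0)               # back to the start
    again = f.read(1)       # b"A"

# 'r+b' allows reading and writing without erasing the file.
with open("pointer_demo.bin", "r+b") as f:
    f.seek(3)
    f.write(b"xy")          # overwrites bytes 3 and 4 in place
    f.seek(0)
    patched = f.read()      # b"ABCxyFGHIJ"

print(start, first, after_read, middle, tail, again, patched)
```

The second part shows why 'r+' needs pointer control: write() replaces whatever sits at the current position, so the seek() call decides which bytes are changed.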

📝 Quick Revision Points

1. open() creates a file object
2. 'w' overwrites, 'a' appends
3. pickle.dump() writes binary, pickle.load() reads binary
4. csv.writerows() takes a nested list of data
5. seek(0) takes you back to the start
