Automating Test Case Loading with Python and OpenPyXL
To interact with Microsoft Excel files within a Python environment, the openpyxl library provides robust support for reading and writing .xlsx documents. The following workflow demonstrates how to instantiate a workbook, access specific sheets, and extract data values.
import pathlib
from openpyxl import load_workbook
# Locate the workbook relative to the current script execution
current_directory = pathlib.Path(__file__).resolve().parent
workbook_path = current_directory / 'test_scenarios.xlsx'
# Initialize the workbook object
wb = load_workbook(workbook_path)
print(f"Workbook loaded: {wb}")
# Select the active worksheet by name
active_sheet = wb['login_suite']
print(f"Sheet selected: {active_sheet.title}")
# Retrieve data from specific cells
target_cell = active_sheet.cell(row=1, column=3)
print(f"Cell content type: {type(target_cell.value)}")
# Access raw value directly
row_value = active_sheet.cell(row=2, column=2).value
print(f"Row 2, Col 2 Value: {row_value}")
Iterating Through Rows and Structuring Data
For test automation frameworks, retrieving entire datasets efficiently is crucial. Accessing the .values attribute of a sheet returns a generator object representing rows as tuples. Converting this to a list allows for standard iteration.
To transform these flat rows into structured objects (dictionaries), map the header row against subsequent data rows.
# Fetch all cell values as a generator and convert to list
raw_rows = list(active_sheet.values)
headers = raw_rows[0]
# Transform remaining rows into dictionaries
data_structures = []
for row in raw_rows[1:]:
case_record = dict(zip(headers, row))
data_structures.append(case_record)
print(f"Extracted {len(data_structures)} records")
Architectural Optimization for Scalability
As test suites grow, monolithic scripts become difficult to maintain. Adopting a layered architecture ensures separation of concerns. A typical structure includes:
- Utility Layer: Encapsulates common functions like file I/O or text processing.
- Data Layer: Stores external input sources such as JSON, CSV, or Excel files.
- Test Logic Layer: Defines specific assertions and execution flows (often using pyttest).
- Report Layer: Handles logging and result generation.
- Entry Point: Orchestrates the test runner.
Refactoring the Excel handling logic into dedicated modules improves reusability across different projects.
Path Resolution Module (util/path_handler.py)
Encapsulate directory navigation logic to avoid hardcoding paths throughout the codebase.
import pathlib
def get_absolute_file_path(relative_path) -> pathlib.Path:
"""
Resolves a file path relative to the script's root directory.
Args:
relative_path (str): The filename or sub-directory path.
Returns:
pathlib.Path: Absolute path object.
"""
base = pathlib.Path(__file__).resolve()
return base.parent / relative_path
Excel Parser Module (util/excel_reader.py)
Isolate the business logic for parsing workbook data.
from openpyxl import load_workbook
from .path_handler import get_absolute_file_path
def parse_excel_data(filename: str, sheet_name: str):
"""
Reads an Excel file and returns a list of dictionaries based on headers.
"""
# Determine full path
abs_path = get_absolute_file_path(filename)
if not abs_path.exists():
raise FileNotFoundError(f"File not found: {abs_path}")
workbook = load_workbook(str(abs_path))
sheet = workbook[sheet_name]
all_values = list(sheet.values)
headers = all_values[0]
results = []
for index, row in enumerate(all_values[1:], start=2):
# Skip empty rows if necessary
if not any(row):
continue
record = {}
for h, val in zip(headers, row):
record[h] = val
results.append(record)
return results
This modular approach decouples data retrieval from test execution, facilitating easier maintenance when the test data format or location changes.