Choosing Between Append-Optimized Row and Column Storage in Greenplum
Storage Architecture Fundamentals
In the Greenplum (MPP PostgreSQL) ecosystem, Append-Optimized Row (AORO) and Append-Optimized Column (AOCO) tables represent specialized storage formats designed for analytical workloads. The core distinction lies in their physical organization:
- AORO: Data is stored sequentially by row, optimized for writing full records.
- AOCO: Data is stored sequentially by column, optimized for reading specific fields.
Unlike standard Heap tables (standard PostgreSQL tables with MVCC), these Append-Optimized formats do not support efficient in-place updates and are instead tailored for high-volume, append-only data ingestion.
Physical Layout Comparison
AORO Structure
Complete rows are written contiguously to disk:
[ Record 1 (Col1, Col2, Col3) ]
[ Record 2 (Col1, Col2, Col3) ]
[ Record 3 (Col1, Col2, Col3) ]
AOCO Structure
Each column is stored in its own separate physical file or segment:
Column_1_File: Val1, Val2, Val3, Val4
Column_2_File: Val1, Val2, Val3, Val4
Column_3_File: Val1, Val2, Val3, Val4
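This one-file-per-column layout is visible in the catalog. As a hedged sketch: Greenplum records per-column storage options for append-optimized column tables in the pg_attribute_encoding catalog (the table name sales_fact refers to the AOCO example defined later in this article; the query returns no rows for row-oriented or heap tables):

```sql
-- Inspect per-column storage/compression settings for an AOCO table.
-- pg_attribute_encoding is a Greenplum-specific catalog; entries exist
-- only for column-oriented append-optimized tables.
SELECT a.attname, e.attoptions
FROM pg_attribute_encoding e
JOIN pg_attribute a
  ON a.attrelid = e.attrelid
 AND a.attnum   = e.attnum
WHERE e.attrelid = 'sales_fact'::regclass
ORDER BY a.attnum;
```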
Detailed Use Case Analysis
Standard Heap Tables
- Role: Ideal for OLTP operations, dimension tables, and small lookup tables.
- Scale: Best for datasets under 10GB or fewer than 1 million rows.
- Behavior: High-frequency single-row updates/deletes, indexed point lookups, and concurrent small transactions.
- Examples: User dimensions, configuration settings, product catalogs.
AORO (Row-Oriented)
- Role: Storing large volumes of detailed records where full rows are often retrieved.
- Scale: Hundreds of millions to billions of rows.
- Behavior: Batch ETL inserts, minimal updates. Queries often involve SELECT * or retrieve most columns.
- Examples: Application logs, raw event streams, audit trails.
- Note: Adding a column with a default value requires a full table rewrite, which is expensive on large datasets.
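To illustrate the rewrite cost, a minimal sketch (the raw_metrics table is defined in the syntax examples later in this article; the retention_class column is hypothetical, added purely for illustration):

```sql
-- On a large AORO table, adding a column with a non-null default
-- forces Greenplum to rewrite every existing row to materialize the
-- default value -- an expensive operation at billions of rows.
ALTER TABLE raw_metrics
    ADD COLUMN retention_class text DEFAULT 'standard';
```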
AOCO (Column-Oriented)
- Role: The powerhouse for Data Warehousing fact tables and OLAP analytics.
- Scale: Terabyte or Petabyte scale, often with very wide schemas (100+ columns).
- Behavior: Queries touch only a handful of columns (e.g., aggregations on specific metrics). Writes are bulk appends.
- Examples: Financial transaction facts, user behavior analysis, wide feature tables for ML.
- Note: Supports adding nullable columns instantly without rewriting data.
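By contrast, a nullable column on an AOCO table is cheap to add. A hedged sketch (sales_fact is the AOCO example defined later; discount is a hypothetical column):

```sql
-- On AOCO, a new nullable column with no default is effectively a
-- metadata change: a new column file is created and the existing
-- column files are left untouched, so no data is rewritten.
ALTER TABLE sales_fact
    ADD COLUMN discount numeric;
```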
Decision Matrix for Workloads
When to choose AORO (Row)
- Full Row Access: You frequently need every column in the row (e.g., exporting raw logs).
- Moderate Width: Tables with fewer than 100 columns.
- Write Heavy: The workload is primarily inserting data with few updates.
- Example Query:
SELECT * FROM server_logs WHERE log_id = 12345;
When to choose AOCO (Column)
- Massive Scale: Tables ranging from Terabytes to Petabytes.
- Wide Tables: Schemas exceeding 100 columns (e.g., user profiles, tags).
- Projected Reads: Queries access only a subset of columns.
- Heavy Aggregation: Frequent use of SUM, COUNT, AVG, GROUP BY.
- Compression Needs: Significant storage savings using zstd or run-length (RLE_TYPE) encoding.
- Example Query:
SELECT SUM(revenue), AVG(latency)
FROM network_stats
WHERE capture_date BETWEEN '2023-01-01' AND '2023-01-31';
-- Only reads 3 columns out of 150
Performance Case Study
Scenario: 1 billion rows, 120 columns, querying 3 columns. An AOCO table reads only the three requested column files, while an AORO table must scan every complete row to extract the same three values.
Creation Syntax Examples
Creating an AORO Table
CREATE TABLE raw_metrics (
event_time timestamp,
source_ip inet,
payload text
)
WITH (appendonly=true, orientation=row);
Creating an AOCO Table
CREATE TABLE sales_fact (
transaction_id bigint,
amount numeric,
region text
)
WITH (appendonly=true, orientation=column);
Creating AOCO with Compression
CREATE TABLE user_analytics (
user_id bigint,
session_count int,
last_active date
)
WITH (
appendonly=true,
orientation=column,
compresstype=zstd,
compresslevel=7
);
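Once data is loaded, the achieved compression can be checked, and compression can also be tuned per column rather than table-wide. A hedged sketch using Greenplum's built-in helper function (the _enc table name is illustrative):

```sql
-- get_ao_compression_ratio reports the achieved compression ratio for
-- an append-optimized table; it returns -1 if no information is
-- available yet (e.g. before any data has been loaded).
SELECT get_ao_compression_ratio('user_analytics');

-- AOCO also accepts column-level ENCODING clauses, so each column can
-- use the codec that suits its data; an illustrative variant of the
-- table above:
CREATE TABLE user_analytics_enc (
    user_id       bigint ENCODING (compresstype=zstd, compresslevel=3),
    session_count int    ENCODING (compresstype=rle_type),
    last_active   date   ENCODING (compresstype=rle_type)
)
WITH (appendonly=true, orientation=column);
```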
Analyzing Database Workload Patterns
To determine if a table is read-heavy or write-heavy, query the statistics from pg_stat_user_tables.
SELECT
schemaname,
relname AS table_name,
seq_scan + idx_scan AS total_reads,
n_tup_ins + n_tup_upd + n_tup_del AS total_writes,
pg_size_pretty(pg_total_relation_size(relid)) AS size
FROM pg_stat_user_tables
ORDER BY total_reads DESC;
Automated Workload Classification
SELECT
schemaname,
relname,
CASE
WHEN (seq_scan + idx_scan) > 10 * (n_tup_ins + n_tup_upd + n_tup_del)
THEN 'Read Intensive'
WHEN (n_tup_ins + n_tup_upd + n_tup_del) > (seq_scan + idx_scan)
THEN 'Write Intensive'
ELSE 'Mixed Workload'
END AS workload_profile
FROM pg_stat_user_tables;
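The classification above can be combined with table size to shortlist migration candidates. A hedged extension of the same query, surfacing the largest write-intensive tables, which are the usual candidates for append-optimized storage:

```sql
-- Largest write-dominated tables: prime candidates for AORO/AOCO.
SELECT
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) AS size,
    n_tup_ins + n_tup_upd + n_tup_del AS total_writes
FROM pg_stat_user_tables
WHERE (n_tup_ins + n_tup_upd + n_tup_del) > (seq_scan + idx_scan)
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;
```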