Redis Data Structures and Objects
Redis provides three specialized data structures optimized for specific use cases:
| Structure | Use Case |
|---|---|
| Bitmaps | Space-efficient storage for binary state tracking (daily/monthly user check-ins) |
| HyperLogLog | Approximate cardinality estimation for massive datasets (UV counting) |
| GEO | Geospatial data storage and proximity queries |
Bitmaps
A bitmap is essentially a byte array where each bit represents a binary state (0 or 1). Introduced in Redis 2.2.0, these operations extend the string type rather than creating a new data type.
Core Operations
GETBIT key position // Retrieve bit value at specified position
SETBIT key position value // Set bit value (0 or 1) at position
BITOP operation dest source [source...]
// Operations: AND, OR, NOT, XOR - combines bitmaps and stores result
BITCOUNT key [start end] // Count bits set to 1
Movie Streaming Application Example
Consider a video platform tracking which movies are watched daily:
- Each day maintains a bitmap initialized to zeros
- When a movie is played, its ID determines the bit position to set
| Metric | Solution |
|---|---|
| Check if specific movie was played today | Calculate bit offset from movie ID, read the corresponding bit |
| Count movies played today | Count set bits in today's bitmap |
| Count movies in weekly/monthly/yearly range | OR all relevant bitmaps, then count resulting bits |
| Find unplayed movies in a year | OR all bitmaps, bits still at 0 indicate unwatched movies |
127.0.0.1:6379> SETBIT movies_20240301 100 1
(integer) 0
127.0.0.1:6379> SETBIT movies_20240301 205 1
(integer) 0
127.0.0.1:6379> SETBIT movies_20240302 100 1
(integer) 0
127.0.0.1:6379> BITCOUNT movies_20240301
(integer) 2
127.0.0.1:6379> BITOP OR period_result movies_20240301 movies_20240302
(integer) 1
127.0.0.1:6379> BITCOUNT period_result
(integer) 1
HyperLogLog
HyperLogLog implements an algorithm for estimating set cardinality with minimal memory footprint. It trades exactness for efficiency, making it suitable for massive-scale unique visitor counting.
Comparing approaches for unique user counting:
| Approach | Memory Usage | Trade-offs |
|---|---|---|
| SET type | Each user ID stored as string | Accurate but memory-intensive |
| Bitmap | One bit per possible user | Better space efficiency |
| HyperLogLog | Fixed 12KB per counter | 0.81% error rate, constant memory |
Commands
PFADD key element [element ...] // Add elements to counter
PFCOUNT key [key ...] // Get estimated cardinality
PFMERGE dest source [source ...] // Merge multiple counters
Characteristics
- Does not store actual data, only cardinality estimate
- Standard error approximately 0.81%
- Memory allocation grows incrementally as element are added
- Merged structures always occupy exactly 12KB
GEO (Geospatial)
Redis GEO enables storing coordinate pairs and performing location-based queries.
GEOADD locations longitude latitude member [longitude latitude member ...]
GEOPOS locations member [member ...]
GEODIST locations pointA pointB [unit]
GEORADIUS locations longitude latitude radius unit [WITHCOORD] [WITHDIST] [COUNT count]
GEORADIUSBYMEMBER locations member radius unit [WITHCOORD] [WITHDIST]
GEOHASH locations member [member ...]
127.0.0.1:6379> GEOADD cities 116.4 39.9 Beijing
(integer) 1
127.0.0.1:6379> GEOADD cities 121.4 31.2 Shanghai
(integer) 1
127.0.0.1:6379> GEOPOS cities Beijing
1) "116.39999866485596"
"39.900000092026976"
127.0.0.1:6379> GEODIST cities Beijing Shanghai km
"1067.8390"
Core Data Structures
Simple Dynamic Strings (SDS)
Structure Definition
Redis 4.0+ defines five SDS variants optimized for different string lengths:
typedef char *simple_string;
struct __attribute__ ((__packed__)) sds_header_8 {
uint8_t length; /* bytes currently used */
uint8_t allocated; /* total capacity excluding header and null */
unsigned char flags; /* type identifier (3 bits) */
char buffer[]; /* flexible array member for actual data */
};
struct __attribute__ ((__packed__)) sds_header_16 {
uint16_t length;
uint16_t allocated;
unsigned char flags;
char buffer[];
};
struct __attribute__ ((__packed__)) sds_header_32 {
uint32_t length;
uint32_t allocated;
unsigned char flags;
char buffer[];
};
struct __attribute__ ((__packed__)) sds_header_64 {
uint64_t length;
uint64_t allocated;
unsigned char flags;
char buffer[];
};
| Field | Purpose |
|---|---|
| length | Current string size (excluding terminator) |
| allocated | Total capacity reserved |
| flags | Type indicator using 3 bits |
| buffer | Actual character storage |
Advantages Over C Strings
| Aspect | C String | SDS | Benefit |
|---|---|---|---|
| Length retrieval | O(n) via strlen | O(1) via len field | Eliminates performance bottleneck |
| Buffer overflow | Possible with strcat/strcpy | Bounds checking before write | Memory safety guaranteed |
| Reallocation | Required on every modification | Pre-allocation and lazy freeing | Fewer memory operations |
| Binary data | Uses '\0' sentinel (text only) | Uses length field | Stores any binary content |
Applications
- Direct string value storage
- AOF module buffers
- Client input/output buffers
Linked Lists
Implementation
typedef struct list_node {
struct list_node *previous;
struct list_node *next;
void *payload;
} list_node;
typedef struct list_iterator {
list_node *current;
int traverse_direction;
} list_iterator;
typedef struct linked_list {
list_node *head;
list_node *tail;
void *(*duplicate_fn)(void *ptr);
void (*deallocate_fn)(void *ptr);
int (*compare_fn)(void *ptr, void *key);
unsigned long count;
} linked_list;
Properties
- Doubly-linked, circular structure
- O(1) head/tail/length access
- Generic storage via void pointers
Usage
Lists power LIST commands, publish-subscribe patterns, slow query logging, and monitoring features.
Hash Tables (Dictionaries)
Internal Structure
typedef struct hash_entry {
void *hash_key;
union {
void *value_ptr;
uint64_t unsigned_val;
int64_t signed_val;
double floating_val;
} stored_value;
struct hash_entry *collision_next;
} hash_entry;
typedef struct hash_table {
hash_entry **buckets;
unsigned long table_size;
unsigned long size_mask;
unsigned long active_entries;
} hash_table;
typedef struct dictionary {
struct dict_type *operations;
void *private_context;
hash_table tables[2]; /* primary and rehash targets */
long rehash_position; /* -1 when idle, index when active */
unsigned long active_iterators;
} dictionary;
Hash Function Evolution
Redis versions show different hashing strategies:
- 2.8-3.2: MurmurHash2 (good distribution even with patterned keys)
- 4.0+: SipHash (improved security against collision attacks)
uint64_t compute_hash(const void *key, int length) {
return siphash(key, length, hash_seed);
}
Rehashing Mechanism
Redis maintains two hash tables per dictionary. Rehashing proceeds in stages:
- Allocate space in the second table based on element count
- Gradually migrate entries from table[0] to table[1]
- Deallocate old table, swap references, prepare fresh table[1]
Expansion and Contraction
| Load Factor Formula | Redis Behavior |
|---|---|
| entries / bucket_count | Can exceed 1.0 |
- Expansion triggers: Load factor > 1 (normal) or > 5 (during BGSAVE/BGREWRITEAOF)
- Contraction triggers: Load factor < 0.1
Incremental Rehashing
To prevent service interruption during large-scale rehashing:
- Set rehashidx to 0, indicating rehash in progress
- Each CRUD operation migrates one bucket's worth of entries
- Rehash completes when rehashidx returns to -1
Lookup queries check both tables during transition. New writes target the new table.
Skip Lists
Concept
Skip lists extend sorted linked lists with layered indices, enabling O(log n) search through a space-time tradeoff.
A skip list contains multiple levels where higher levels act as express lanes:
| Level | Contains |
|---|---|
| Highest | Fewest elements, largest gaps |
| Lowest | All elements, consecutive spacing |
Time Complexities
- Search: O(log n) average, O(n) worst case
- Insert: O(log n) - find position, create random levels
- Delete: O(log n) - unlink from all levels
int generate_random_level() {
int lvl = 1;
while ((rand() & 0xFFFF) < (SKIPLIST_P * 0xFFFF))
lvl++;
return (lvl < MAX_LEVEL) ? lvl : MAX_LEVEL;
}
The constant ZSKIPLIST_P (0.25) means level n+1 is 1/4 as likely as level n.
Redis Implementation
typedef struct skip_node {
simple_string *element;
double ranking_score;
struct skip_node *backward;
struct skip_level {
struct skip_node *forward;
unsigned int span; /* distance between nodes */
} levels[];
} skip_node;
typedef struct skip_list {
skip_node *header;
skip_node *tail;
unsigned long element_count;
int current_max_level;
} skip_list;
Comparison with Alternatives
| Structure | Range Queries | Single Lookup | Memory | Implementation |
|---|---|---|---|---|
| Skip List | Efficient | O(log n) | ~1.33 pointers/node | Straightforward |
| Balanced Tree | Requires traversal | O(log n) | 2 pointers/node | Complex rotations |
| Hash Table | Not supported | O(1) avg | Varies | Moderate |
Integer Sets
Structure
typedef struct integer_collection {
uint32_t encoding; /* 16, 32, or 64-bit */
uint32_t size; /* element count */
int8_t contents[]; /* flexible array storage */
} integer_collection;
Integer sets store unique integers in sorted order using contiguous memory.
Upgrade Mechanism
When adding values requiring more bits than current encoding:
- Reallocate storage for new, larger encoding
- Convert existing elements to new format
- Insert new element while maintaining sorted order
Upgrades provide flexibility and memory efficiency—they only expand when necessary and are irreversible.
Compressed Lists
Layout
A compressed list is a sequential memory structure containing:
- Total byte length
- Offset to last element
- Element count
- Individual entries
- End terminator (0xFF)
Entry Structure
| Component | Size | Purpose |
|---|---|---|
| previous_entry_length | 1-5 bytes | Enables backward traversal |
| encoding | 1-5 bytes | Type and content length |
| content | Variable | Actual value storage |
Cascade Update Concern
Modifying entries can trigger cascade reallocations when previous_entry_length fields need expansion. This worst-case O(n²) scenario rarely occurs with random data but remains a theoretical concern.
Redis Object System
Object Overview
Redis wraps底层数据结构 with an object layer providing type information and reference counting. Keys are always strings; values can be strings, lists, sets, hashes, or sorted sets.
Object Structure
typedef struct redis_object {
unsigned type:4; /* STRING, LIST, SET, ZSET, HASH */
unsigned encoding:4; /* Actual storage format */
void *internal_data; /* Pointer to underlying structure */
int reference_count; /* For memory management */
} redis_object;
Encoding Options by Type
| Type | Encodings Available | Conditions |
|---|---|---|
| String | INT, EMBSTR, RAW | Integer values use INT encoding |
| List | ZIPLIST, LINKEDLIST | Small lists use ZIPLIST |
| Hash | ZIPLIST, HASHTABLE | Few fields use ZIPLIST |
| Set | INTSET, HASHTABLE | All integers, small cardinality |
| ZSet | ZIPLIST, SKIPLIST+HASHTABLE | Small datasets prefer ZIPLIST |
String Object Encodings
- INT: Direct 64-bit integer storage
- EMBSTR: Embedded SDS for short strings (< 44 bytes)
- RAW: Separate heap allocation for longer strings
ZSet Hybrid Structure
The SKIPLIST+HASHTABLE encoding combines both structures:
- Skip list maintains ordered iteration
- Hash table provides O(1) member lookup
Memory Management
Redis uses reference counting for garbage collection. Shared objects (integer strings) reduce memory when multiple keys reference identical values.
Integer strings up to 2^63-1 can be shared across the entire database, exemplifying the flyweight pattern.