Six Essential Python Data Structure Techniques for Better Code
1. Named Tuples
Named tuples allow you to assign meaningful names to tuple elements, significantly improving code readability.
Problem Statement
Consider a student information system where data follows a fixed format: (name, age, gender, email, ...). When dealing with large datasets, tuples are more memory-efficient than instances of custom classes. However, accessing elements via numeric indices makes the code harder to read.
student = ('jim', 15, 'male', 'jim8789@gmail.com')
# Accessing by index
print(student[0]) # name
print(student[1]) # age
Solution 1: Constants for Index Values
IDX_NAME = 0
IDX_AGE = 1
IDX_GENDER = 2
IDX_EMAIL = 3
print(student[IDX_NAME])
print(student[IDX_AGE])
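The index constants can also be numbered automatically, which avoids renumbering by hand when a field is added. A minimal sketch, assuming the same four fields in the same order:

```python
# Unpack a range so the indices are numbered automatically
IDX_NAME, IDX_AGE, IDX_GENDER, IDX_EMAIL = range(4)

student = ('jim', 15, 'male', 'jim8789@gmail.com')
print(student[IDX_NAME])   # jim
print(student[IDX_EMAIL])  # jim8789@gmail.com
```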
Solution 2: Using namedtuple
from collections import namedtuple
Student = namedtuple('Student', ['name', 'age', 'gender', 'email'])
s = Student('beck', 20, 'male', 'beck4654@gmail.com')
print(type(s))
print(s.name)
print(s.email)
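A namedtuple instance is still an ordinary tuple, so indexing and unpacking keep working alongside attribute access. The sketch below reuses the Student fields from above (the sample values are only illustrative) and shows three helpers that namedtuple provides: _replace, _asdict, and _make.

```python
from collections import namedtuple

Student = namedtuple('Student', ['name', 'age', 'gender', 'email'])
s = Student('beck', 20, 'male', 'beck4654@gmail.com')

# Still a tuple: unpacking works as before
name, age, gender, email = s

# Named tuples are immutable; _replace returns a modified copy
older = s._replace(age=21)

# _asdict converts the instance to a dict keyed by field name
as_dict = s._asdict()

# _make builds an instance from any iterable, such as a parsed row of data
row = ['jim', 15, 'male', 'jim8789@gmail.com']
jim = Student._make(row)

print(older.age, as_dict['email'], jim.name)
```

The leading underscore on these methods does not mean "private" here; it only prevents clashes with field names.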
2. Counting Element Frequencies in Sequences
Problem 1: Finding Top N Most Frequent Elements
Given a random sequence like [12, 5, 6, 4, 6, 5, 5, 7, ...], find the 3 most frequent elements.
from random import randint
data = [randint(0, 20) for _ in range(30)]
# Manual frequency counting using dictionary
freq_map = dict.fromkeys(data, 0)
for val in data:
    freq_map[val] += 1
print(freq_map)
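The manual count above stops at the frequency map. One way to finish the task without Counter is to sort the (element, count) pairs by count and keep the first three. This is a sketch on a fixed list (chosen for illustration) rather than random data:

```python
data = [12, 5, 6, 4, 6, 5, 5, 7, 6, 5]

# Build the frequency map by hand, as in the loop above
freq_map = dict.fromkeys(data, 0)
for val in data:
    freq_map[val] += 1

# Sort (element, count) pairs by count, highest first, and keep three
top_three = sorted(freq_map.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top_three)
```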
Alternatively, use the Counter class:
from collections import Counter
counter = Counter(data)
top_three = counter.most_common(3)
print(top_three)
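With fixed input the behaviour of Counter is easier to follow. The sketch below uses the first eight values from the sample sequence in the problem statement and also shows two properties that a plain dict lacks: missing keys count as zero, and update() adds to existing counts.

```python
from collections import Counter

data = [12, 5, 6, 4, 6, 5, 5, 7]
counter = Counter(data)

# most_common(n) returns (element, count) pairs, highest count first
top_two = counter.most_common(2)
print(top_two)  # [(5, 3), (6, 2)]

# A missing key counts as zero instead of raising KeyError
missing = counter[99]
print(missing)  # 0

# update() adds counts from another iterable to the existing ones
counter.update([7, 7, 7])
print(counter.most_common(1))  # [(7, 4)]
```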
Problem 2: Word Frequency Analysis
Analyze word frequency in an English text file and find the top 10 most common words.
from collections import Counter
import re
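with open('article.txt', 'r') as f:
    content = f.read()
# Lowercase so 'The' and 'the' count as one word; drop the empty
# strings that re.split() yields at the edges of the text
words = [w for w in re.split(r'\W+', content.lower()) if w]
word_counter = Counter(words)
top_words = word_counter.most_common(10)
print(top_words)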
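To try the same idea without a file on disk, the counting step can be wrapped in a small function that takes a string. The helper name top_words_in and the sample sentence are hypothetical, and the word pattern (letters and apostrophes only) is a simplifying assumption that will not suit every text.

```python
import re
from collections import Counter

def top_words_in(text, n=10):
    """Return the n most common words in text, ignoring case."""
    # findall returns only the matches, so no empty strings appear
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

sample = "The quick brown fox. The lazy dog! the end"
result = top_words_in(sample, 2)
print(result)
```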
3. Sorting Dictionaries by Values
Dictionaries store key-value pairs. Passing a dictionary directly to sorted() returns only its keys, in sorted order. To sort by values, use one of these approaches:
Approach 1: Using zip() with sorted()
from random import randint
scores = {x: randint(60, 100) for x in 'abcdefgh'}
# Convert to (value, key) tuples and sort
sorted_pairs = zip(scores.values(), scores.keys())
result = sorted(sorted_pairs)
print(result)
Approach 2: Using key Parameter
The sorted() function signature: sorted(iterable, key=None, reverse=False), where key and reverse must be passed as keyword arguments
- iterable: Any iterable object, such as dict.items() or dict.keys()
- key: A function (often a lambda) that returns the value to compare for each element
- reverse: True for descending, False for ascending (default)
from random import randint
scores = {x: randint(60, 100) for x in 'abcdefgh'}
result = sorted(scores.items(), key=lambda item: item[1])
print(result)
Key Difference
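The first approach compares (value, key) tuples, so it sorts by value, breaks ties by key, and returns (value, key) pairs. The second approach compares values only and returns (key, value) pairs; because sorted() is stable, items with equal values keep their original insertion order.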
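The difference shows up only when values tie. The sketch below uses a small fixed dictionary (chosen for illustration, with keys deliberately inserted out of alphabetical order) instead of random scores:

```python
scores = {'d': 90, 'a': 85, 'c': 90, 'b': 85}

# Approach 1: (value, key) tuples; ties on value fall back to comparing keys
by_zip = sorted(zip(scores.values(), scores.keys()))
print(by_zip)  # [(85, 'a'), (85, 'b'), (90, 'c'), (90, 'd')]

# Approach 2: key function; ties keep the dictionary's insertion order
by_key = sorted(scores.items(), key=lambda item: item[1])
print(by_key)  # [('a', 85), ('b', 85), ('d', 90), ('c', 90)]

# Highest score first
descending = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(descending)
```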
4. Finding Common Keys Across Multiple Dictionaries
from random import randint, sample
# Generate random player data
players_s1 = {x: randint(1, 4) for x in sample('abcdef', randint(3, 6))}
players_s2 = {x: randint(1, 4) for x in sample('abcdef', randint(3, 6))}
players_s3 = {x: randint(1, 4) for x in sample('abcdef', randint(3, 6))}
print(players_s1)
print(players_s2)
print(players_s3)
Method 1: Loop-based Approach
common_keys = []
for key in players_s1:
    if key in players_s2 and key in players_s3:
        common_keys.append(key)
print(common_keys)
Method 2: Set Intersection
print(players_s1.keys() & players_s2.keys() & players_s3.keys())
Method 3: Using reduce() with map()
from functools import reduce
all_keys = map(dict.keys, [players_s1, players_s2, players_s3])
common = reduce(lambda a, b: a & b, all_keys)
print(common)
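The same idea extends to any number of dictionaries. The helper below is a hypothetical convenience wrapper, not part of the standard library; it relies on set.intersection accepting several iterables at once, and the sample dictionaries are fixed for illustration.

```python
def common_keys(*dicts):
    """Return the set of keys present in every given dictionary."""
    if not dicts:
        return set()
    # Iterating a dict yields its keys; intersection accepts any iterables
    return set(dicts[0]).intersection(*dicts[1:])

s1 = {'a': 1, 'b': 2, 'c': 3}
s2 = {'b': 4, 'c': 1, 'd': 2}
s3 = {'c': 2, 'b': 3, 'e': 1}
print(sorted(common_keys(s1, s2, s3)))  # ['b', 'c']
```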
5. Maintaining Dictionary Order
Practical Scenario
A programming competition system needs to track contestants and their completion times. Each contestant completes a problem, and their time is recorded. Later, organizers need to query scores by name.
results = {
    'LiLei': (2, 43),      # rank: 2, time: 43 minutes
    'HanMeimei': (5, 52),  # rank: 5, time: 52 minutes
    'Jim': (1, 39),        # rank: 1, time: 39 minutes
}
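Before Python 3.6, dictionaries did not preserve insertion order. CPython 3.6 preserves it as an implementation detail, and from Python 3.7 onward insertion order is guaranteed by the language. OrderedDict makes the ordering explicit and also works on older versions.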
Using OrderedDict
from collections import OrderedDict
od = OrderedDict()
od['Alice'] = (1, 25)
od['Charlie'] = (2, 31)
od['Diana'] = (3, 28)
for name in od:
    print(name, od[name])
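Since plain dicts now keep insertion order too, the main reasons to reach for OrderedDict are its order-specific operations. A short sketch of three of them, using the same sample entries:

```python
from collections import OrderedDict

od = OrderedDict([('Alice', (1, 25)), ('Charlie', (2, 31)), ('Diana', (3, 28))])

# Move an existing key to the end (or to the front with last=False)
od.move_to_end('Alice')
print(list(od))  # ['Charlie', 'Diana', 'Alice']

# Pop from the front, first-in first-out
first = od.popitem(last=False)
print(first)  # ('Charlie', (2, 31))

# Equality between OrderedDicts is order-sensitive; plain dicts ignore order
a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])
print(a == b, dict(a) == dict(b))  # False True
```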
Competition Simulation
from time import time
from random import randint
from collections import OrderedDict
contestants = list('ABCDEF')
rankings = OrderedDict()
start_time = time()
for i in range(6):
    input()  # Wait for player input
    player = contestants.pop(randint(0, 5 - i))
    elapsed = time() - start_time
    print(f"Player {i+1}: {player} completed in {elapsed:.2f}s")
    rankings[player] = (i + 1, elapsed)
print("-" * 20)
for name in rankings:
    print(name, rankings[name])
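Once results are stored in finishing order, both lookups from the scenario are simple. The sketch below replaces the interactive loop with hypothetical fixed finishing times so it can run unattended:

```python
from collections import OrderedDict
from itertools import islice

# Hypothetical finishing order with fixed times in seconds
finishers = [('C', 1.8), ('A', 2.4), ('F', 3.1)]
rankings = OrderedDict()
for rank, (player, elapsed) in enumerate(finishers, start=1):
    rankings[player] = (rank, elapsed)

# Query by name: an ordinary key lookup
print(rankings['A'])  # (2, 2.4)

# Query by position: slice the keys in insertion order
top_two = list(islice(rankings, 2))
print(top_two)  # ['C', 'A']
```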
6. Implementing History Tracking
Many applications track user history: browsers store visited pages, media players remember viewed files, shells record commands.
Solution
Use a bounded queue (capacity N) to store history. The deque class from collections is a double-ended queue; when created with maxlen=N, it automatically discards the oldest entry each time a new one is appended to a full queue.
Basic deque Usage
from collections import deque
history = deque(maxlen=5)
history.extend([1, 2, 3, 4, 5])
print(history)
# Adding 6th element automatically removes the oldest
history.append(6)
print(history)  # deque([2, 3, 4, 5, 6], maxlen=5)
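Which entry gets discarded depends on the end you add to. A small sketch with a capacity of three makes this visible:

```python
from collections import deque

history = deque([1, 2, 3], maxlen=3)

# append() on a full deque drops the leftmost (oldest) item
history.append(4)
print(list(history))  # [2, 3, 4]

# appendleft() on a full deque drops the rightmost item instead
history.appendleft(0)
print(list(history))  # [0, 2, 3]

# The capacity is exposed as a read-only attribute
print(history.maxlen)  # 3
```

For a history log, always appending on the right keeps the newest entries and drops the oldest.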
Persisting History with pickle
import pickle
# Save to file; the with block closes the file even if dump() fails
with open('user_history.dat', 'wb') as f:
    pickle.dump(history, f)
# Load from file
with open('user_history.dat', 'rb') as f:
    loaded = pickle.load(f)
print(loaded)
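A pickled deque keeps its maxlen, so the restored history is still bounded. The sketch below checks this with a round trip through a temporary directory; the temporary path is only there so the example leaves no files behind.

```python
import os
import pickle
import tempfile
from collections import deque

history = deque([1, 2, 3], maxlen=5)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'user_history.dat')
    with open(path, 'wb') as f:
        pickle.dump(history, f)
    with open(path, 'rb') as f:
        loaded = pickle.load(f)

print(loaded)         # deque([1, 2, 3], maxlen=5)
print(loaded.maxlen)  # 5
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.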
Complete Example: Guessing Game with History
from random import randint
from collections import deque
import pickle
target = randint(0, 100)
print(f'Answer: {target}')  # Shown for debugging; remove in a real game
history = deque(maxlen=5)
# Load existing history if available
try:
    with open('guess_history.dat', 'rb') as f:
        history = pickle.load(f)
    print(f'Loaded history: {list(history)}')
except FileNotFoundError:
    pass
def check_guess(guess):
    if guess == target:
        print('Correct!')
        return True
    elif guess < target:
        print(f'{guess} is too low')
    else:
        print(f'{guess} is too high')
    return False
while True:
    user_input = input("Enter a number: ")
    if user_input.isdigit():
        num = int(user_input)
        history.append(num)
        if check_guess(num):
            break
    elif user_input in ('history', 'h?'):
        print(list(history))
# Save the history for the next session
with open('guess_history.dat', 'wb') as f:
    pickle.dump(history, f)
This implementation lets users review their recent guesses and persists the history between sessions.