Fundamentals of String Processing and Word Frequency Analysis with Python
Developing a Word Frequency Counter in Python
Below is a Python script that counts how often each word occurs in a given text. The function strips punctuation and converts the text to lowercase before counting.
import string


def compute_word_frequency(input_text):
    """
    Analyzes input text and returns a dictionary with word frequencies.
    Processes text by removing punctuation and converting to lowercase.
    """
    # Create a translation table to remove punctuation characters
    translator = str.maketrans('', '', string.punctuation)
    # Apply translation and convert text to lowercase
    cleaned_text = input_text.translate(translator).lower()
    # Tokenize the cleaned text into individual words
    tokens = cleaned_text.split()
    # Dictionary to store final word counts
    frequency_map = {}
    # Iterate through tokens and update frequency counts
    for token in tokens:
        if token in frequency_map:
            frequency_map[token] = frequency_map[token] + 1
        else:
            frequency_map[token] = 1
    return frequency_map


if __name__ == "__main__":
    # Sample text for demonstration
    sample = """
    Bought this stuffed panda for my daughter's birthday.
    She adores it and carries it everywhere. The material is soft,
    and its appearance is quite charming. However, the size is
    somewhat smaller relative to its cost. Perhaps other products
    offer larger dimensions at a comparable price. Delivery was
    surprisingly prompt, arriving a day ahead of schedule, which
    allowed me some personal enjoyment before handing it over.
    """
    result = compute_word_frequency(sample)
    print(result)
Output:
{'bought': 1, 'this': 1, 'stuffed': 1, 'panda': 1, 'for': 1, 'my': 1, 'daughters': 1, 'birthday': 1, 'she': 1, 'adores': 1, 'it': 3, 'and': 2, 'carries': 1, 'everywhere': 1, 'the': 2, 'material': 1, 'is': 3, 'soft': 1, 'its': 2, 'appearance': 1, 'quite': 1, 'charming': 1, 'however': 1, 'size': 1, 'somewhat': 1, 'smaller': 1, 'relative': 1, 'to': 1, 'cost': 1, 'perhaps': 1, 'other': 1, 'products': 1, 'offer': 1, 'larger': 1, 'dimensions': 1, 'at': 1, 'a': 2, 'comparable': 1, 'price': 1, 'delivery': 1, 'was': 1, 'surprisingly': 1, 'prompt': 1, 'arriving': 1, 'day': 1, 'ahead': 1, 'of': 1, 'schedule': 1, 'which': 1, 'allowed': 1, 'me': 1, 'some': 1, 'personal': 1, 'enjoyment': 1, 'before': 1, 'handing': 1, 'over': 1}
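The manual counting loop above can also be expressed with the standard library's collections.Counter. The following is a minimal sketch rather than a replacement for the original function: the name compute_word_frequency_counter and the short sample sentence are assumptions made for illustration, while the cleaning steps mirror the script above.

```python
import string
from collections import Counter


def compute_word_frequency_counter(input_text):
    """Counter-based variant of the word counter (hypothetical name)."""
    # Same cleaning steps as the original: strip punctuation, then lowercase
    translator = str.maketrans('', '', string.punctuation)
    tokens = input_text.translate(translator).lower().split()
    # Counter tallies every token in a single pass
    return Counter(tokens)


# Illustrative sample sentence (not taken from the original script)
counts = compute_word_frequency_counter("The panda is soft, and the panda is small.")
# most_common(n) returns the n highest counts; ties keep first-seen order
print(counts.most_common(3))
```

Counter is a dict subclass, so the result can be used anywhere the plain dictionary was, and most_common saves a separate sorting step when only the top words are of interest.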
Debugging Python Code in VS Code
- Open the File: Open the Python script you intend to debug in the editor.
- Set a Breakpoint: Click in the gutter (left margin) next to the desired line number. A red circle appears, indicating the program will pause execution at this point. For instance, set a breakpoint on the line containing the for loop to inspect each iteration.
- Initiate Debugging: Press F5 or click the green "Run and Debug" button in the sidebar. Alternatively, use the debug icon in the top-right corner and select "Debug Python File".
- Utilize Debugging Controls: Once execution halts at the breakpoint, use the following tools:
  - Variables Panel: Inspect the current state of variables like tokens and frequency_map.
  - Debug Console: Execute ad-hoc Python expressions or debugging commands.
  - Call Stack: View the sequence of function calls leading to the current point.
  - Step Into (F11): Move into the execution of a called function.
  - Step Over (F10): Execute the current line and move to the next one.
  - Step Out (Shift+F11): Complete execution of the current function and return to its caller.
  - Continue (F5): Resume program execution until the next breakpoint is encountered.
While paused, you can monitor how the frequency_map dictionary is constructed step-by-step as the loop processes each word from the tokens list.
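The same step-by-step view can be approximated in plain Python, which may help when a debugger is not at hand. This is a minimal sketch under stated assumptions: trace_frequency_map is a hypothetical helper that is not part of the original script, and the three-token list stands in for the real tokens list.

```python
def trace_frequency_map(tokens):
    """Yield (token, snapshot) pairs showing the dictionary after each update."""
    frequency_map = {}
    for token in tokens:
        # Same counting rule as the original loop, written with dict.get
        frequency_map[token] = frequency_map.get(token, 0) + 1
        # Copy the dictionary so later updates do not alter earlier snapshots
        yield token, dict(frequency_map)


# Illustrative tokens; in the script these would come from cleaned_text.split()
for token, snapshot in trace_frequency_map(["it", "and", "it"]):
    print(f"after {token!r}: {snapshot}")
```

Each printed line corresponds to what the Variables Panel would show after one pass through the loop body with a breakpoint set on the for loop.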