Fading Coder

Python Modules, Packages, and Standard Library Utilities

Module Fundamentals

A module is a collection of related functionality. Modules come from Python's built-in standard library, from third-party packages, or from your own scripts.

A module can take any of these forms:

  • .py scripts written in Python.
  • Compiled C/C++ extensions (shared libraries or DLLs).
  • Directories containing an __init__.py file, recognized as packages.
  • Built-in modules intrinsically linked to the Python interpreter.

Utilizing modules improves development velocity and minimizes code duplication.

Import Mechanisms

Python distinguishes between a file run as the main script and a file loaded as an imported module: __name__ is "__main__" in the former and the module's own name in the latter.
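
One way to observe this distinction is through the module-level __name__ variable. The sketch below uses a hypothetical helper, run_mode, which is not part of the original notes; it only reports which context a given __name__ value corresponds to.

```python
def run_mode(module_name: str) -> str:
    """Classify a __name__ value (run_mode is an illustrative helper)."""
    return "main script" if module_name == "__main__" else "imported module"


if __name__ == "__main__":
    # Runs only when this file is executed directly, not when it is imported.
    print(f"Running as: {run_mode(__name__)}")
```

Placing start-up logic behind the __name__ check lets the same file work both as an executable script and as an importable module.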

The import Statement

When initially importing:

  1. A dedicated namespace is generated for the module.
  2. The module's code executes, populating its namespace.
  3. The importing script receives a reference to the module's namespace within its own scope.

Subsequent imports reuse the cached module (stored in sys.modules) without re-executing its code. Names are accessed with the module prefix: module_name.function_name.

  • Pros: Prevents naming collisions.
  • Cons: Verbose syntax.

import core.utils
core.utils.validate_input()
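
The reuse of an already-imported module can be checked directly. The following is a minimal sketch, assuming a throwaway module named demo_counter (invented for this example) written to a temporary directory; it imports the module twice and shows that the second import returns the same object without running the file again.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# demo_counter is a throwaway module name invented for this example.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_counter.py").write_text("state = 'fresh'\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module("demo_counter")   # executes the file
        first.state = "modified"                          # mutate the namespace
        second = importlib.import_module("demo_counter")  # served from sys.modules
    finally:
        sys.path.remove(tmp)

print(first is second)                       # True
print(second.state)                          # modified
print(sys.modules["demo_counter"] is first)  # True
```

If the file had been executed a second time, state would have been reset to 'fresh'.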

The from ... import ... Statement

The first import follows the same steps as import, but instead of binding the module itself it binds the specified names directly into the current script's namespace. Those names are then used without a prefix.

  • Pros: Cleaner syntax.
  • Cons: Higher risk of namespace collisions.

from module import * imports every public name (those not starting with an underscore). If the module defines an __all__ list, only the names listed there are imported.
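
The effect of __all__ on a star import can be sketched as follows. The module demo_api and its functions are hypothetical; the module is built in memory with types.ModuleType and registered in sys.modules so the example needs no files on disk.

```python
import sys
import types

# Build a hypothetical module in memory and register it so it can be imported.
demo = types.ModuleType("demo_api")
exec(
    "__all__ = ['connect']\n"
    "def connect(): return 'connected'\n"
    "def reset(): return 'reset'\n"
    "_cache = {}\n",
    demo.__dict__,
)
sys.modules["demo_api"] = demo

# Run the star import in a separate namespace to see exactly what it binds.
namespace = {}
exec("from demo_api import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['connect']
```

Without the __all__ line, reset would also be imported, while _cache would still be skipped because of its leading underscore.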

Circular Dependencies

Circular imports occur when two modules attempt to import each other. Mitigation strategies:

  1. Delay the import by moving the from ... import ... statement to the end of the file.
  2. Localize the import by placing it inside a function, ensuring it only executes when needed.

# auth.py
print('Loading authentication module')

def verify():
    # Deferred import: db is loaded only when verify() runs,
    # by which point auth has finished initializing.
    from db import fetch_user
    user = fetch_user()
    return user is not None

token = 'secret'

# db.py
print('Loading database module')
from auth import token  # Safe: auth does not import db at module level

def fetch_user():
    return token
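
The first strategy (moving the import to the end of the file) can be sketched in a similar way. The example below is an illustration under stated assumptions: the four module names are hypothetical and are written to a temporary directory. The "broken" pair imports each other at the top of the file, while the "fixed" pair defines its functions first and imports last.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# All four module names are hypothetical and exist only in a temp directory.
BROKEN = {
    "ping_broken.py": "from pong_broken import pong\ndef ping(): return 'ping'\n",
    "pong_broken.py": "from ping_broken import ping\ndef pong(): return 'pong'\n",
}
FIXED = {
    "ping_fixed.py": "def ping(): return 'ping'\nfrom pong_fixed import pong\n",
    "pong_fixed.py": "def pong(): return 'pong'\nfrom ping_fixed import ping\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for filename, source in {**BROKEN, **FIXED}.items():
        Path(tmp, filename).write_text(source)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        try:
            importlib.import_module("ping_broken")
            broken_error = None
        except ImportError as exc:
            # ping_broken has no 'ping' yet when pong_broken asks for it.
            broken_error = exc
        ping_fixed = importlib.import_module("ping_fixed")
    finally:
        sys.path.remove(tmp)

print(type(broken_error).__name__)           # ImportError
print(ping_fixed.pong(), ping_fixed.ping())  # pong ping
```

The fixed pair works because each name already exists in the partially initialized module by the time the other module asks for it.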

Dynamic imports can be achieved using importlib:

import importlib
target = 'core.engine'
handler = importlib.import_module(target)
print(dir(handler))
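
Because core.engine above is a placeholder, here is a runnable variation that uses only standard library modules. The helper load_serializer and its allow-list are assumptions made for this sketch, not part of the original notes; the point is that the module name is chosen at runtime as a string.

```python
import importlib


def load_serializer(name: str):
    """Import a serializer module by name (load_serializer is illustrative)."""
    allowed = {"json", "pickle"}  # restrict which modules may be imported
    if name not in allowed:
        raise ValueError(f"unsupported serializer: {name!r}")
    return importlib.import_module(name)


serializer = load_serializer("json")
print(serializer.dumps({"ok": True}))  # {"ok": true}
```

Restricting the accepted names is a design choice worth considering whenever the module name comes from configuration or user input.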

Module Search Path Resolution

The interpreter locates modules based on this priority:

  1. Modules already loaded in memory (cached in sys.modules).
  2. Built-in modules compiled into the interpreter.
  3. Directories listed in sys.path (starting with the executing script's directory).

sys.path is shared by the whole process: every imported module resolves its own imports against the executing script's sys.path, not against the imported module's own location.
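
The lookup priority can be approximated in a few lines. This is a simplified sketch rather than the interpreter's actual algorithm, and describe_lookup is a hypothetical helper; it checks the same three places in the same order for a top-level module name.

```python
import importlib.util
import sys


def describe_lookup(name: str) -> str:
    """Report where a top-level module name would be resolved (illustrative)."""
    if name in sys.modules:
        return "cached in sys.modules"
    if name in sys.builtin_module_names:
        return "built-in module"
    spec = importlib.util.find_spec(name)  # searches without importing
    if spec is None:
        return "not found"
    return f"found via sys.path: {spec.origin}"


for candidate in ("sys", "no_such_module_xyz"):
    print(candidate, "->", describe_lookup(candidate))
print("first sys.path entry:", sys.path[0])
```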

Absolute imports are resolved starting from the directories listed in sys.path.

  • Pros: Work from any module, including the top-level script.
  • Cons: Lengthy paths.

Relative imports reference the current module's location using . (current) and .. (parent).

  • Pros: Compact syntax.
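  • Cons: Restricted to intra-package usage; invalid in a script executed directly as the top-level program. Importing beyond the top-level package boundary raises an error (ImportError in current Python 3 releases; older versions raised ValueError).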
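
Both import styles can be compared side by side. The sketch below assumes a hypothetical package named shop with a billing subpackage, created in a temporary directory; invoice.py reaches pricing.py once through a relative import and once through an absolute one.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The shop package and its contents are hypothetical, created only for this demo.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "shop")
    (pkg / "billing").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "pricing.py").write_text("TAX_RATE = 0.2\n")
    (pkg / "billing" / "__init__.py").write_text("")
    (pkg / "billing" / "invoice.py").write_text(
        "from ..pricing import TAX_RATE                 # relative: parent package\n"
        "from shop.pricing import TAX_RATE as ABS_RATE  # absolute: from sys.path\n"
        "def total(net):\n"
        "    return round(net * (1 + TAX_RATE), 2)\n"
    )
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        invoice = importlib.import_module("shop.billing.invoice")
    finally:
        sys.path.remove(tmp)

print(invoice.TAX_RATE == invoice.ABS_RATE)  # True
print(invoice.total(100))                    # 120.0
```

If the shop directory were renamed, only the absolute import line would need to change.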

Package Architecture

A package is a directory containing an __init__.py file. Importing a package effectively executes its __init__.py.

During the first import:

  1. A namespace is generated for the package.
  2. The code inside __init__.py runs, populating that namespace.
  3. The current script binds a name to the package namespace.

In Python 2, __init__.py was mandatory; Python 3 (3.3 and later) also allows implicit namespace packages, which have no __init__.py.
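
These steps can be observed with a small throwaway package. The sketch below assumes a hypothetical package named toolkit, written to a temporary directory, whose __init__.py defines a constant and re-exports a function from a submodule.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# toolkit is a hypothetical package created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "toolkit")
    pkg.mkdir()
    (pkg / "core.py").write_text("def run():\n    return 'running'\n")
    (pkg / "__init__.py").write_text(
        "from .core import run  # re-export for a flat public API\n"
        "VERSION = '0.1'\n"
    )
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        toolkit = importlib.import_module("toolkit")  # runs __init__.py
    finally:
        sys.path.remove(tmp)

print(toolkit.VERSION)                           # 0.1
print(toolkit.run())                             # running
print(toolkit.__file__.endswith("__init__.py"))  # True
```

Everything defined or imported in __init__.py becomes an attribute of the package object that the importing script receives.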

Rules:

  • In a dotted import such as import a.b.c, every name to the left of a dot must be a package.
  • Absolute imports inside a package should start from the top-level project directory.
  • Relative imports (using .) are preferred for internal package dependencies to maintain portability upon renaming top-level directories.
  • Relative imports cannot traverse beyond the package's root directory.

Standard Library Essentials

1. time

Time representations: Timestamp (seconds since epoch), Local time, UTC.

import time
current_timestamp = time.time() # Float
local_struct = time.localtime() # struct_time
utc_struct = time.gmtime()
formatted_str = time.strftime("%Y-%m-%d %H:%M:%S", local_struct)
parsed_struct = time.strptime("2023-10-05 14:30:00", "%Y-%m-%d %H:%M:%S")
timestamp_from_struct = time.mktime(parsed_struct)
time.sleep(2) # Delay execution

2. datetime

import datetime
present = datetime.datetime.now()
custom_date = datetime.datetime(2023, 5, 12, 10, 0, 0)
time_diff = present - custom_date
future_date = present + datetime.timedelta(days=7)

3. random

import random
import string
float_val = random.uniform(5.0, 10.0)
int_val = random.randint(10, 99)
even_val = random.randrange(0, 100, 2)
char_list = random.choices('xyz123', k=3) # List of 3 picks, repeats allowed
sample_str = ''.join(random.sample(string.ascii_letters + string.digits, 6))
data_list = [1, 2, 3, 4]
random.shuffle(data_list)

4. sys

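import sys

args = sys.argv # Command-line arguments (args[0] is the script path)
version_info = sys.version # Interpreter version string
platform_name = sys.platform # e.g. 'linux', 'darwin', 'win32'

def show_progress(ratio, bar_width=40, prefix='Progress: '):
    ratio = min(ratio, 1.0)
    filled = '*' * int(bar_width * ratio)
    empty = '-' * (bar_width - int(bar_width * ratio))
    # '\r' returns to the start of the line so the bar redraws in place
    print(f"\r{prefix}[{filled}{empty}] {int(ratio*100)}%", end='', flush=True)

sys.exit(0) # Exit program; placed last because nothing after it runs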

5. shutil

import shutil
import zipfile

shutil.copyfile('src.txt', 'dst.txt')
shutil.copytree('folder_src', 'folder_dst', ignore=shutil.ignore_patterns('*.tmp'))
shutil.rmtree('folder_dst')
shutil.move('src.txt', 'new_location.txt')
shutil.make_archive('archive_name', 'zip', root_dir='target_folder')

# Extracting
with zipfile.ZipFile('archive_name.zip', 'r') as zf:
    zf.extractall()

6. os

import os

current_dir = os.getcwd()
os.makedirs('new_dir/sub_dir')
os.rmdir('new_dir/sub_dir')
os.rename('old.txt', 'new.txt')

combined = os.path.join('/var', 'data', 'file.txt')
print(os.path.exists('file.txt'))
print(os.path.isfile('file.txt'))
print(os.path.isdir('my_folder'))
print(os.path.getsize('file.txt'))
print(os.path.abspath('relative_path'))

7. pickle

import pickle

data_payload = {'key': 'value'}
serialized = pickle.dumps(data_payload)
deserialized = pickle.loads(serialized)

# with open('data.pkl', 'wb') as f: pickle.dump(data_payload, f)
# with open('data.pkl', 'rb') as f: loaded = pickle.load(f)

8. json

import json
from datetime import datetime, timezone

data_map = {"active": True, "count": 5}
json_str = json.dumps(data_map)
parsed_map = json.loads(json_str)

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects json cannot serialize natively
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

# datetime.utcnow() is deprecated since Python 3.12; use an aware UTC datetime
print(json.dumps({"now": datetime.now(timezone.utc)}, cls=DateTimeEncoder))

9. shelve

import shelve

with shelve.open('persistent_db') as db:
    db['record_1'] = {'id': 1, 'status': 'open'}
    db['record_2'] = {'id': 2, 'status': 'closed'}
    print(db['record_1'])

10. xml.etree.ElementTree

import xml.etree.ElementTree as ET

tree = ET.parse('config.xml')
root = tree.getroot()

for child in root:
    print(child.tag, child.attrib)

for node in root.iter('setting'):
    node.text = 'updated_value'
    node.set('modified', 'yes')

new_elem = ET.Element('new_setting')
new_elem.text = 'added'
root.append(new_elem)
tree.write('updated_config.xml')

# Creating XML
root_elem = ET.Element("configuration")
sub_elem = ET.SubElement(root_elem, "parameter", attrib={"type": "string"})
sub_elem.text = "example"
tree_obj = ET.ElementTree(root_elem)
tree_obj.write("new_config.xml", encoding="utf-8", xml_declaration=True)

11. configparser

import configparser

parser = configparser.ConfigParser()
parser.read('setup.ini')
sections = parser.sections()
host_val = parser.get('database', 'host')
port_val = parser.getint('database', 'port')

parser.set('database', 'host', 'localhost')
with open('setup.ini', 'w') as f: # Context manager closes the file reliably
    parser.write(f)

# Creating ini
config = configparser.ConfigParser()
config['DEFAULT'] = {'timeout': '30'}
config['database'] = {'host': '127.0.0.1', 'port': '5432'}
with open('new_setup.ini', 'w') as f:
    config.write(f)

12. hashlib and hmac

import hashlib
import hmac

hash_obj = hashlib.sha256()
hash_obj.update(b'initial_data')
hash_obj.update(b'additional_data')
print(hash_obj.hexdigest())

# HMAC
mac = hmac.new(b'secret_key', b'message_data', hashlib.sha256)
print(mac.hexdigest())
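
An HMAC is usually computed so that a receiver can check it later. The following is a minimal sketch of that verification step; the sign and verify helpers and the placeholder key are assumptions made for illustration and are not part of the original notes.

```python
import hashlib
import hmac

SECRET_KEY = b"secret_key"  # placeholder key for illustration only


def sign(message: bytes) -> str:
    """Return the hex HMAC-SHA256 tag for a message."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message), signature)


tag = sign(b"message_data")
print(verify(b"message_data", tag))   # True
print(verify(b"tampered_data", tag))  # False
```

hmac.compare_digest is used instead of == because its running time does not depend on where the two strings first differ.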

13. subprocess

import subprocess

result = subprocess.run(['ls', '-l'], capture_output=True, text=True)
print(result.stdout)

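# Equivalent of the shell pipeline: ls | grep py
pipe1 = subprocess.Popen(['ls'], stdout=subprocess.PIPE)
pipe2 = subprocess.Popen(['grep', 'py'], stdin=pipe1.stdout, stdout=subprocess.PIPE)
pipe1.stdout.close() # Let pipe1 receive SIGPIPE if pipe2 exits first
output = pipe2.communicate()[0]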

14. logging

import logging
import logging.config

logging.basicConfig(filename='app.log', level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s')
logging.debug('Debug event')
logging.error('Error encountered')

# Advanced Configuration
logger = logging.getLogger("network_ops")
handler1 = logging.FileHandler('detail.log', encoding='utf-8')
handler2 = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler1.setFormatter(formatter)
handler2.setFormatter(formatter)
logger.addHandler(handler1) # Write to detail.log
logger.addHandler(handler2) # Also echo to the console (stderr)
logger.setLevel(logging.INFO)
logger.info("Operation started")

# Dictionary Configuration
LOG_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False, # Keep loggers created earlier (e.g. network_ops) active
    'formatters': {'basic': {'format': '%(asctime)s %(message)s'}},
    'handlers': {'console': {'class': 'logging.StreamHandler', 'formatter': 'basic', 'level': 'DEBUG'}},
    'loggers': {'main': {'handlers': ['console'], 'level': 'DEBUG'}}
}
logging.config.dictConfig(LOG_CONFIG)
log_inst = logging.getLogger('main')
log_inst.info('Configured via dictionary')

15. re (Regular Expressions)

Regex syntax and methods:

import re

matches = re.findall(r'\d+', 'ID: 42, Age: 25')
search_obj = re.search(r'(\d+)-(\d+)', '123-456')
if search_obj:
    print(search_obj.group(1)) # 123

split_res = re.split(r'[;,]', 'a,b;c')
sub_res = re.sub(r'old', 'new', 'old data old values', count=1)

pattern = re.compile(r'\bword\b')
pattern.findall('a word and another word')

# Named groups and group swapping
text = "apples|oranges|bananas"
grouped = re.search(r"(?P<first>.+?)\|(?P<second>.+?)\|(?P<third>.+)", text)
print(grouped.group('first'), grouped.group('third')) # apples bananas
swapped = re.sub(r"(.+?)\|(.+?)\|(.+)", r"\3|\2|\1", text) # bananas|oranges|apples
