Automated Herbal Medicine Identification and Database Integration using PyTorch and OpenCV
Implementing a real-time identification system for Chinese herbal medicine involves synchronizing live video capture, deep learning inference, and structured data storage. This system utilizes OpenCV for image acquisition, a PyTorch-based ResNet model for classification, and SQLite for maintaining an audit trail of identified specimens.
Video Acquisition and Automated Snapshot Triggering
To support hands-free identification, especially at fixed, conveyor-style stations, the system saves one video frame every three seconds by default. This yields a steady stream of samples without processing every frame.
import cv2
import time
import os

def run_capture_loop(storage_dir='captured_herbs', interval=3):
    # Create the output directory if it does not exist yet
    os.makedirs(storage_dir, exist_ok=True)
    camera = cv2.VideoCapture(0)
    if not camera.isOpened():
        print("Error: Could not access the camera.")
        return
    next_capture_timestamp = time.time() + interval
    try:
        while True:
            active, frame = camera.read()
            if not active:
                break
            current_clock = time.time()
            if current_clock >= next_capture_timestamp:
                # Generate filename based on current date and time
                timestamp_str = time.strftime('%Y%m%d_%H%M%S', time.localtime(current_clock))
                target_path = os.path.join(storage_dir, f"sample_{timestamp_str}.jpg")
                cv2.imwrite(target_path, frame)
                print(f"Image saved: {target_path}")
                next_capture_timestamp = current_clock + interval
            cv2.imshow('Herbal Identification Stream', frame)
            # Exit on 'q' key press
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
    finally:
        # Always free the camera and close the preview window
        camera.release()
        cv2.destroyAllWindows()
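Because the capture loop encodes the capture time in each filename, a later stage can recover it as a proper timestamp. The following is a minimal sketch, assuming the sample_YYYYMMDD_HHMMSS naming used above; the helper name stem_to_iso is hypothetical and not part of the original system.

```python
from datetime import datetime

def stem_to_iso(stem, prefix="sample_"):
    """Convert 'sample_20240131_093000' to '2024-01-31T09:30:00'.

    Hypothetical helper: assumes the naming scheme of the capture loop.
    Raises ValueError if the stem does not match that scheme.
    """
    if not stem.startswith(prefix):
        raise ValueError(f"unexpected filename stem: {stem!r}")
    # Parse the date/time part that follows the prefix
    captured_at = datetime.strptime(stem[len(prefix):], "%Y%m%d_%H%M%S")
    return captured_at.isoformat()
```

ISO 8601 strings sort chronologically even when compared as plain text, which suits a TEXT column such as capture_time.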
Processing Image Metadata
After images are captured, the system lists the files pending inference. Each filename stem embeds the capture timestamp, which is later stored in the capture_time column for time-based tracking in the database.
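import pathlib

def fetch_captured_filenames(target_dir):
    path_obj = pathlib.Path(target_dir)
    # Retrieve file stems (filenames without extensions) for database indexing.
    # glob() gives no ordering guarantee, so sort to keep the list chronological
    # and aligned with the order in which predictions are produced.
    return sorted(f.stem for f in path_obj.glob('*.jpg'))

# Example usage
sample_ids = fetch_captured_filenames('captured_herbs')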
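The classification step that sits between file listing and storage is not shown in the article. As a rough sketch of what it might look like, the function below turns model outputs into herb names and confidence scores. It assumes the images have already been loaded, resized, and normalized into a tensor batch in whatever way the ResNet was trained; classify_batch and class_names are hypothetical names introduced for illustration.

```python
import torch
import torch.nn.functional as F

def classify_batch(model, batch, class_names):
    """Return (pred_names, confidence_levels) for a preprocessed batch.

    model:       any torch.nn.Module mapping a batch to per-class logits
    batch:       float tensor, already preprocessed for the model (assumption)
    class_names: list mapping class index to herb label (hypothetical)
    """
    model.eval()  # inference mode for dropout and batch-norm layers
    with torch.no_grad():  # no gradients are needed for prediction
        logits = model(batch)
        probs = F.softmax(logits, dim=1)  # logits -> class probabilities
        confidences, indices = probs.max(dim=1)  # top class per image
    pred_names = [class_names[i] for i in indices.tolist()]
    confidence_levels = [round(c, 4) for c in confidences.tolist()]
    return pred_names, confidence_levels
```

Returning plain Python strings and floats keeps the results easy to pair with the filename stems and to bind as SQLite parameters.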
Batch Result Persistence with SQLite
Once the PyTorch model generates predictions (medicine name and confidence score), these results are paired with the capture timestamps and persisted into a relational database. This allows for long-term data analysis and inventory tracking using tools like Navicat or DB Browser for SQLite.
import sqlite3

def log_predictions_to_db(db_path, session_data):
    """
    session_data: List of tuples (timestamp_id, herb_label, confidence)
    """
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    # Ensure the table schema exists
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS herb_logs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            capture_time TEXT,
            medicine_name TEXT,
            confidence_score REAL  -- numeric, so scores sort and average correctly
        )
    ''')
    sql_command = "INSERT INTO herb_logs (capture_time, medicine_name, confidence_score) VALUES (?, ?, ?)"
    try:
        # Insert the whole batch, then commit once
        cursor.executemany(sql_command, session_data)
        conn.commit()
        print(f"Successfully recorded {cursor.rowcount} entries.")
    except sqlite3.Error as e:
        # Discard the partial batch so the log stays consistent
        conn.rollback()
        print(f"Database error: {e}")
    finally:
        cursor.close()
        conn.close()
# Example processing logic
# Assuming pred_names and confidence_levels are outputs from the ResNet model
results_to_store = list(zip(sample_ids, pred_names, confidence_levels))
log_predictions_to_db('herbal_inventory.db', results_to_store)
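To illustrate the kind of long-term analysis the log enables, the sketch below counts sightings and averages confidence per herb. It assumes the herb_logs table described above; the function name summarize_herb_logs is hypothetical. The CAST keeps the average numeric even if scores were stored as text.

```python
import sqlite3

def summarize_herb_logs(db_path):
    """Return (medicine_name, sightings, mean_confidence) rows, most frequent first."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            """
            SELECT medicine_name,
                   COUNT(*) AS sightings,
                   AVG(CAST(confidence_score AS REAL)) AS mean_confidence
            FROM herb_logs
            GROUP BY medicine_name
            ORDER BY sightings DESC, medicine_name
            """
        ).fetchall()
    finally:
        conn.close()  # release the file handle even if the query fails
    return rows
```

A call such as summarize_herb_logs('herbal_inventory.db') returns one tuple per herb, which can be printed or exported for inventory reports.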
By decoupling the capture, recognition, and storage phases, each stage can be run, tested, and replaced independently, and each batch of results is written to the database in a single committed transaction. This workflow transforms raw visual input from a camera into a structured digital log of botanical assets.