Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Bulk Loading Excel Data into SQLite Using Python

Tech May 13 1

Frequently, structured data stored in Excel files must be transferred into relational databases for persistence and querying. While libraries like pandas and openpyxl are prevalent, a dedicated COM‑free Excel parsing library can streamline large workbook processing and provide fine‑grained cell access. This implementation combines such a parser with Python’s built‑in SQLite, enabling automatic multi‑sheet detection, dynamic table creation, and safe batch insertion—all without external database servers.

Common Scenarios That Benefit from This Approach

  • Migrating periodic business reports from spreadsheet format into a centralized database.
  • Automating scheduled sync jobs that harvest data from email attachments or shared folders.
  • Creating a lightweight staging area where multiple Excel contributions are merged for downstream analytics.
  • Quick generation of test datasets by reading pre‑defined scenarios straight from workbooks.

Why Choose This Lightweight Strategy

  • Zero external software: Reads .xlsx files without Microsoft Office or WPS installations.
  • Sheet‑aware traversal: Iterates every worksheet automatically; you never have to name them upfront.
  • Self‑adapting schema generation: Constructs table definitions directly from header rows.
  • Transactional safety: Parameterized queries prevent SQL injection and a single commit ensures atomicity.
  • Free for moderate workloads: Ideal for small to mid‑sized Excel files with no licensing overhead.

Project Setup

The only external requirement is the Free Spire.XLS for Python library. SQLite ships with Python’s standard library, so no additional database driver is necessary.

pip install FreeSpire.XLS

Implementation Flow

The logic follows a straightforward five‑stage pipeline:

Initialize Workbook → Open SQLite Connection → For Each Sheet → Extract Headers & Create Table → Insert Rows → Commit & Dispose

Complete Script

from spire.xls import Workbook
import sqlite3
import re

def import_excel_to_db(source_excel, target_db):
    """Reads all worksheets from an Excel file and writes them into SQLite tables."""

    wb = Workbook()
    wb.LoadFromFile(source_excel)

    db = sqlite3.connect(target_db)
    cur = db.cursor()

    def safe_table_name(raw_name):
        # Replace anything that isn't a letter, digit, or underscore
        return re.sub(r'\W+', '_', raw_name).strip('_')

    for idx in range(wb.Worksheets.Count):
        ws = wb.Worksheets.get_Item(idx)
        tbl = safe_table_name(ws.Name) or f"sheet_{idx}"

        used_area = ws.AllocatedRange
        max_col = used_area.ColumnCount
        max_row = used_area.RowCount

        # Read header row (index 0 of the allocated range)
        headers = []
        for c in range(max_col):
            raw = ws.Range[1, c + 1].Value
            col_name = re.sub(r'\s+', '_', raw) if raw else f"field_{c}"
            headers.append(col_name)

        columns_def = ', '.join(f'"{h}" TEXT' for h in headers)
        ddl = f'CREATE TABLE IF NOT EXISTS "{tbl}" ({columns_def})'
        cur.execute(ddl)

        # Skip header, then insert rows
        insert_base = f'INSERT INTO "{tbl}" ({",".join(f"\"{h}\"" for h in headers)}) VALUES ({','.join(['?'] * max_col)})'
        for r in range(1, max_row):          # row 1 = second Excel row
            row_values = []
            for c in range(max_col):
                row_values.append(ws.Range[r + 1, c + 1].Value)
            cur.execute(insert_base, row_values)

    db.commit()
    db.close()
    wb.Dispose()

if __name__ == '__main__':
    import_excel_to_db('Financial_Report.xlsx', 'archive.db')

Key Engineering Decisions

Robust table name generation
Using a regular expression \W+ strips all non‑word characters, replacing them with underscores. This avoids SQL errors caused by spaces, dots, or brackets in sheet titles and also prevents empty table names by falling back to a numbered default.

Schema flexibility
All columns are typed TEXT to embrace the varied nature of spreadsheet cells (dates, numbers, strings). If bussiness logic demands stricter typing, a post‑processing ALTER TABLE or a second pass with type detection can be added.

Efficient range detection
AllocatedRange returns the tightest rectangle that contains data, so the loop runs only over meaningful rows and columns instead of the entire worksheet grid.

Parameterized insertion
Placeholder lists built with ? are paired with the value list in cursor.execute(). This guarantees proper escaping, neutralizes apostrophes inside cell content, and defends against SQL injection.

Adopting the Technique for Other RDBMS

Moving to MySQL, PostgreSQL, or similar engines requires only swapping the connection setup and adjusting quoting styles:

import pymysql
cnx = pymysql.connect(host='db.example.com', user='app_user',
                      password='s3cret', database='warehouse')
# In DDL, replace double quotes with backticks for MySQL:
# CREATE TABLE IF NOT EXISTS `{tbl}` (`col` TEXT)

By isolating the connection and quoting conventions, the rest of the logic—sheet iteration, header extraction, and batch inserts—remains identical across database backends.

Tags: PythonSQLite

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.