Logos vs Nom: Choosing a Lexical Analyzer for Rust

Tech May 15 13

Logos is a high-performance lexical analyzer library for Rust, designed to tokenize input streams with minimal overhead and maximum clarity. It leverages Rust’s type system and macro expansion to generate efficient, compile-time optimized tokenizers using a declarative syntax.

At its core, Logos defines token types as Rust enums annotated with attributes that specify matching patterns. A pattern is either a literal string (#[token]) or a regular expression (#[regex]), and a callback such as logos::skip can be attached to discard matches like whitespace. The derive macro compiles these patterns into a Deterministic Finite Automaton (DFA), enabling near-optimal scanning speed with predictable memory usage.

Basic Logos Example

Here’s a tokenizer for arithmetic expressions:

use logos::Logos;

#[derive(Logos, Debug, PartialEq)]
enum Token {
    #[token("+")]
    Add,
    #[token("-")]
    Subtract,
    #[token("*")]
    Multiply,
    #[token("/")]
    Divide,
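    #[regex(r"[0-9]+")]
    Integer,
    // logos::skip discards the match, so this variant is never emitted.
    #[regex(r"[ \t\r\n]+", logos::skip)]
    Whitespace,
    // #[error] is the Logos 0.12 syntax; later releases drop this attribute
    // and have the lexer yield Result values instead.
    #[error]
    Invalid,
}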

fn main() {
    let source = "10 + 25 * 3 - 8";
    let mut lexer = Token::lexer(source);

    while let Some(token) = lexer.next() {
        println!("{:?}", token);
    }
}

Output:

Integer
Add
Integer
Multiply
Integer
Subtract
Integer

Notice how whitespace is automatically discarded via the logos::skip directive, and malformed input triggers the Invalid variant without panicking.
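To make the state-machine idea concrete, here is a minimal hand-written sketch of the kind of scanning loop a lexer generator automates. It uses only the standard library; the Token enum and the tokenize function are illustrative names chosen for this example, and the code Logos actually generates is structured differently.

```rust
// Hand-written stand-in for a generated scanner.
// `Token` and `tokenize` are illustrative names, not part of Logos.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Token {
    Add,
    Subtract,
    Multiply,
    Divide,
    Integer,
    Invalid,
}

fn tokenize(source: &str) -> Vec<Token> {
    let bytes = source.as_bytes();
    let mut tokens = Vec::new();
    let mut pos = 0;

    while pos < bytes.len() {
        let token = match bytes[pos] {
            // Whitespace state: consume the run and emit nothing (the "skip" case).
            b' ' | b'\t' | b'\r' | b'\n' => {
                while pos < bytes.len() && matches!(bytes[pos], b' ' | b'\t' | b'\r' | b'\n') {
                    pos += 1;
                }
                continue;
            }
            // Integer state: stay here while the next byte is a digit.
            b'0'..=b'9' => {
                while pos < bytes.len() && bytes[pos].is_ascii_digit() {
                    pos += 1;
                }
                Token::Integer
            }
            b'+' => { pos += 1; Token::Add }
            b'-' => { pos += 1; Token::Subtract }
            b'*' => { pos += 1; Token::Multiply }
            b'/' => { pos += 1; Token::Divide }
            // No transition for this byte: report it and keep scanning.
            _ => { pos += 1; Token::Invalid }
        };
        tokens.push(token);
    }
    tokens
}
```

Calling tokenize("10 + 25 * 3 - 8") yields the same seven-token sequence as the Logos example, and an unexpected byte such as $ becomes an Invalid token instead of aborting the scan. Every branch here is written by hand; the appeal of Logos is that the equivalent logic is derived from the attributes.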

Comparison with Nom

Feature	Logos	Nom
Primary Purpose	Lexical analysis only	Full parsing (lexical + syntactic)
Approach	Declarative enum + attributes	Combinator functions
Performance	Extremely fast, DFA-based	Fast, but overhead increases with complexity
Error Handling	Simple skip/error variants	Rich error types, recoverable parsing
Complexity	Low: focused on tokenization	High: handles nested structures natively
Dependency Profile	Small runtime footprint; a derive macro does the work at compile time	Self-contained but larger API surface

Logos excels when you need a fast, predictable way to convert raw text into discrete tokens. Nom, by contrast, allows you to define grammars directly in code using functional combinators like recognize, many1, or delimited, enabling full parser construction without external tools.
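The combinator style is easier to judge with a small example. The sketch below imitates it using only the standard library: each parser is a function from input to an optional (remaining input, value) pair, and larger parsers are built by passing smaller ones to helper functions. The names ParseResult, digits, ch, delimited, and parenthesized_int are hypothetical and written for this illustration; they are not Nom's API, whose real combinators have richer signatures and error types.

```rust
// A parser takes input and returns (remaining input, value) on success.
// These helpers only imitate the combinator style; they are not nom's API.
type ParseResult<'a, T> = Option<(&'a str, T)>;

/// Matches one or more ASCII digits and returns them as a slice.
fn digits(input: &str) -> ParseResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

/// Builds a parser that matches one exact character.
fn ch<'a>(expected: char) -> impl Fn(&'a str) -> ParseResult<'a, char> {
    move |input: &'a str| {
        let mut chars = input.chars();
        match chars.next() {
            Some(c) if c == expected => Some((chars.as_str(), c)),
            _ => None,
        }
    }
}

/// Runs `open`, then `inner`, then `close`, keeping only `inner`'s value.
fn delimited<'a, A, B, C>(
    open: impl Fn(&'a str) -> ParseResult<'a, A>,
    inner: impl Fn(&'a str) -> ParseResult<'a, B>,
    close: impl Fn(&'a str) -> ParseResult<'a, C>,
) -> impl Fn(&'a str) -> ParseResult<'a, B> {
    move |input: &'a str| {
        let (rest, _) = open(input)?;
        let (rest, value) = inner(rest)?;
        let (rest, _) = close(rest)?;
        Some((rest, value))
    }
}

/// Parses a parenthesized integer such as "(42)".
fn parenthesized_int(input: &str) -> ParseResult<'_, i32> {
    let (rest, text) = delimited(ch('('), digits, ch(')'))(input)?;
    // Digit runs that overflow i32 are treated as a parse failure.
    let value: i32 = text.parse().ok()?;
    Some((rest, value))
}
```

Here parenthesized_int("(42) rest") returns Some((" rest", 42)). Scanning and structure live in the same functions, which is the trade-off the table summarizes: there is no separate token stream, but every rule is ordinary Rust code.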

Integration with LALRPOP

LALRPOP is a parser generator that accepts a grammar specification and outputs Rust code for an LR(1) parser (LALR(1) is available as an option). It ships with a simple built-in lexer but also accepts tokens from an external one. Logos is a natural pairing because:

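Logos produces a stream of typed tokens, which a thin adapter can turn into the (start, token, end) triples LALRPOP expects.
Logos’s error tolerance allows malformed input to be gracefully handled before reaching the grammar layer.
Both tools generate Rust code at build time, so the lexer and parser are compiled along with the rest of the crate.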

A typical workflow:

Define tokens using Logos (e.g., Ident, Number, Keyword).
Reference the same token enum in LALRPOP’s grammar file:

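use crate::Token;

grammar;

// With an external lexer, an `extern` block (omitted here) must map the
// terminals "+", "*" and Integer to variants of the Logos `Token` enum.

pub Expr: i32 = {
    <left:Expr> "+" <right:Term> => left + right,
    Term,
};

// A separate rule for "*" keeps the grammar unambiguous.
Term: i32 = {
    <left:Term> "*" <right:Integer> => left * right,
    Integer,
};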

Then adapt the Logos lexer’s output and feed it into the LALRPOP parser:

// LALRPOP expects (start, token, end) triples, so attach Logos's spans.
let tokens = Token::lexer(source)
    .spanned()
    .map(|(token, span)| (span.start, token, span.end));
let result = ExprParser::new().parse(tokens).unwrap();

When to Choose Which?

Use Logos when your goal is to separate concerns: tokenize cleanly, then pass tokens to a dedicated parser like LALRPOP or even a hand-written recursive descent parser. This separation improves maintainability and enables reuse across multiple parser backends.
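As a rough illustration of the hand-written option, the sketch below is a small recursive descent parser that consumes a slice of tokens and evaluates + and * with the usual precedence. It is standard-library Rust only; the Token enum (with Integer carrying its value), the Parser struct, and the evaluate function are assumptions made for this example rather than items from Logos or LALRPOP.

```rust
// Tokens as a lexer might deliver them; here `Integer` carries its value.
// All names are illustrative assumptions, not Logos or LALRPOP items.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Integer(i64),
    Add,
    Multiply,
}

/// Recursive descent parser that evaluates `+` and `*` over a token slice.
struct Parser<'a> {
    tokens: &'a [Token],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(tokens: &'a [Token]) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.pos).copied()
    }

    // expr := term ("+" term)*
    fn expr(&mut self) -> Result<i64, String> {
        let mut value = self.term()?;
        while self.peek() == Some(Token::Add) {
            self.pos += 1;
            value += self.term()?;
        }
        Ok(value)
    }

    // term := integer ("*" integer)*
    fn term(&mut self) -> Result<i64, String> {
        let mut value = self.integer()?;
        while self.peek() == Some(Token::Multiply) {
            self.pos += 1;
            value *= self.integer()?;
        }
        Ok(value)
    }

    fn integer(&mut self) -> Result<i64, String> {
        match self.peek() {
            Some(Token::Integer(n)) => {
                self.pos += 1;
                Ok(n)
            }
            other => Err(format!(
                "expected integer at token {}, found {:?}",
                self.pos, other
            )),
        }
    }
}

/// Parses and evaluates a complete token sequence.
fn evaluate(tokens: &[Token]) -> Result<i64, String> {
    let mut parser = Parser::new(tokens);
    let value = parser.expr()?;
    if parser.pos != tokens.len() {
        return Err(format!("unexpected token at position {}", parser.pos));
    }
    Ok(value)
}
```

Given the tokens for 10 + 25 * 3, evaluate returns Ok(85). The parser never looks at characters, so the same token slice could just as well be handed to a generated parser; that is the reuse the separation buys.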

Use Nom if you want to define the entire parsing pipeline — from character-by-character scanning to AST construction — in a single, cohesive codebase. Nom’s combinators are powerful for languages with context-sensitive syntax or when you need fine-grained control over backtracking and error messages.

For most embedded DSLs or configuration parsers, Logos + LALRPOP offers the best balance of speed, clarity, and separation of concerns. For complex programming languages or where parsing logic is tightly coupled with lexical structure, Nom may be preferable.

Tags: Rust
