Fading Coder

Logos vs Nom: Choosing a Lexical Analyzer for Rust

Tech May 15 1

Logos is a high-performance lexical analyzer library for Rust, designed to tokenize input streams with minimal overhead and maximum clarity. It leverages Rust’s type system and macro expansion to generate efficient, compile-time optimized tokenizers using a declarative syntax.

At its core, Logos defines token types as Rust enums annotated with attributes that specify matching patterns. A pattern is either a literal string (#[token]) or a regular expression (#[regex]), optionally paired with a callback such as logos::skip to discard matches like whitespace. The generated lexer operates as a deterministic finite automaton (DFA): it reads each input byte once, without backtracking, and its states are fixed at compile time, so scanning is fast and memory usage is predictable.
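To make the DFA idea concrete, here is a minimal hand-written sketch of state-driven scanning that uses only the standard library. It is not the code Logos generates; the names Tok, State, and scan are hypothetical and exist only for this illustration.

```rust
// Hand-rolled illustration of DFA-style scanning; NOT Logos output.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Tok {
    Integer,
    Plus,
    Invalid,
}

#[derive(Clone, Copy)]
enum State {
    Start,
    InInteger,
}

/// Scans `src` in one left-to-right pass, returning each token with its byte span.
fn scan(src: &str) -> Vec<(Tok, std::ops::Range<usize>)> {
    let bytes = src.as_bytes();
    let mut out = Vec::new();
    let mut state = State::Start;
    let mut start = 0;
    let mut i = 0;
    while i <= bytes.len() {
        let b = bytes.get(i).copied(); // None marks end of input
        match (state, b) {
            (State::Start, Some(b'0'..=b'9')) => {
                start = i;
                state = State::InInteger;
                i += 1;
            }
            (State::Start, Some(b'+')) => {
                out.push((Tok::Plus, i..i + 1));
                i += 1;
            }
            // Whitespace is consumed without emitting a token.
            (State::Start, Some(b' ' | b'\t' | b'\r' | b'\n')) => i += 1,
            (State::Start, Some(_)) => {
                out.push((Tok::Invalid, i..i + 1));
                i += 1;
            }
            (State::Start, None) => break,
            (State::InInteger, Some(b'0'..=b'9')) => i += 1,
            (State::InInteger, _) => {
                // Leaving the accepting state: emit the token, then
                // re-examine the same byte from the start state.
                out.push((Tok::Integer, start..i));
                state = State::Start;
            }
        }
    }
    out
}
```

Calling scan("10 + 25") yields an Integer spanning bytes 0..2, a Plus at 3..4, and an Integer at 5..7. The match arms play the role of the transition table: the current state plus the next byte decides the next state, which is the property that keeps the scan linear in the input length.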

Basic Logos Example

Here’s a tokenizer for arithmetic expressions:

use logos::Logos;

#[derive(Logos, Debug, PartialEq)]
enum Token {
    #[token("+")]
    Add,
    #[token("-")]
    Subtract,
    #[token("*")]
    Multiply,
    #[token("/")]
    Divide,
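    // One or more ASCII digits.
    #[regex(r"[0-9]+")]
    Integer,
    // Matched but never emitted: logos::skip drops the match.
    #[regex(r"[ \t\r\n]+", logos::skip)]
    Whitespace,
    // Produced for input that matches no other pattern.
    #[error]
    Invalid,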
}

fn main() {
    let source = "10 + 25 * 3 - 8";
    let mut lexer = Token::lexer(source);

    while let Some(token) = lexer.next() {
        println!("{:?}", token);
    }
}

Output:

Integer
Add
Integer
Multiply
Integer
Subtract
Integer

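Notice how whitespace is discarded by the logos::skip callback, and malformed input yields the Invalid variant rather than panicking. Note that the #[error] attribute belongs to Logos 0.12 and earlier. From 0.13 on it was removed: the lexer yields Result<Token, _> items, unmatched input surfaces as an Err, and the output above would read Ok(Integer), Ok(Add), and so on.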

Comparison with Nom

Logos excels when you need a fast, predictable way to convert raw text into discrete tokens, and it stops there: it does not build syntax trees. Nom, by contrast, is a parser combinator library. You define grammars directly in code by composing small functions such as recognize, many1, or delimited, which lets you build a full parser without external tools or a separate lexing stage.
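The combinator style can be sketched without any dependency. The functions below are hand-rolled stand-ins written for this article, not Nom's actual API: tag, number, delimited, and many1 are local, hypothetical definitions that borrow the names of the ideas they imitate, and the PResult alias is likewise an assumption of this sketch.

```rust
// Hand-rolled combinator sketch using only std; these are NOT nom's functions.
// A parser takes the remaining input and returns (rest, value) on success.
type PResult<'a, T> = Option<(&'a str, T)>;

/// Matches a literal prefix and returns the matched slice.
fn tag<'a>(lit: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| {
        input
            .strip_prefix(lit)
            .map(|rest| (rest, &input[..lit.len()]))
    }
}

/// Matches one or more ASCII digits and parses them as u32.
fn number(input: &str) -> PResult<'_, u32> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    let value = input[..end].parse().ok()?;
    Some((&input[end..], value))
}

/// Runs `open`, then `inner`, then `close`, keeping only the inner value.
fn delimited<'a, A, T, B>(
    open: impl Fn(&'a str) -> PResult<'a, A>,
    inner: impl Fn(&'a str) -> PResult<'a, T>,
    close: impl Fn(&'a str) -> PResult<'a, B>,
) -> impl Fn(&'a str) -> PResult<'a, T> {
    move |input: &'a str| {
        let (rest, _) = open(input)?;
        let (rest, value) = inner(rest)?;
        let (rest, _) = close(rest)?;
        Some((rest, value))
    }
}

/// Applies `parser` one or more times, collecting the results.
/// Assumes `parser` consumes input on success; otherwise this would loop.
fn many1<'a, T>(
    parser: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<T>> {
    move |mut input: &'a str| {
        let mut items = Vec::new();
        while let Some((rest, item)) = parser(input) {
            items.push(item);
            input = rest;
        }
        if items.is_empty() {
            None
        } else {
            Some((input, items))
        }
    }
}
```

For example, many1(delimited(tag("["), number, tag("]"))) applied to "[1][2][3]!" returns the remaining input "!" together with the vector [1, 2, 3]. There is no token enum anywhere: scanning and structure are expressed in the same layer, which is the main contrast with the Logos approach.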

Integration with LALRPOP

LALRPOP is a parser generator that accepts a grammar specification and outputs Rust code for an LR(1) parser (LALR(1) is available as an option, despite the name). It ships with a simple built-in lexer but also accepts an external one, and Logos is a natural pairing because:

  • Logos produces a stream of typed tokens with byte spans, which can be adapted into the (start, token, end) triples LALRPOP expects from an external lexer.
  • Logos’s error tolerance allows malformed input to be gracefully handled before reaching the grammar layer.
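  • Both tools generate their code at build time (Logos through a derive macro, LALRPOP usually through a build script), so no lexer or parser tables need to be built at runtime.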

A typical workflow:

  1. Define tokens using Logos (e.g., Ident, Number, Keyword).
  2. Use the same token enum in LALRPOP’s grammar file:
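use crate::Token;

grammar;

// Bind terminals to the Logos enum; assumes Integer carries an i32 payload.
extern {
    type Location = usize; type Error = ();
    enum Token { "+" => Token::Add, "*" => Token::Multiply, "int" => Token::Integer(<i32>) }
}

pub Expr: i32 = {
    <left:Expr> "+" <right:Term> => left + right,
    Term,
};
Term: i32 = {
    <left:Term> "*" <right:"int"> => left * right,
    "int",
};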

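Then feed the Logos lexer's output into the LALRPOP parser, attaching byte spans along the way (shown for the Logos 0.12 API used earlier, where the lexer yields bare tokens):

// LALRPOP consumes (start, token, end) triples, so attach each token's span.
let tokens = Token::lexer(source)
    .spanned()
    .map(|(token, span)| (span.start, token, span.end));
let parser = ExprParser::new();
let result = parser.parse(tokens).unwrap();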

When to Choose Which?

Use Logos when your goal is to separate concerns: tokenize cleanly, then pass tokens to a dedicated parser like LALRPOP or even a hand-written recursive descent parser. This separation improves maintainability and enables reuse across multiple parser backends.
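As a rough illustration of the hand-written route, the following sketch parses and evaluates a token slice with recursive descent. It assumes a Tok enum whose Integer variant carries its value, which differs from the unit variant in the earlier Logos example; Tok, Parser, and evaluate are hypothetical names, and the lexer that would produce the tokens is left out.

```rust
// Sketch of a hand-written recursive descent parser over a token slice.
// `Tok` stands in for a lexer's output; integers carry their value here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Integer(i64),
    Add,
    Subtract,
    Multiply,
    Divide,
}

struct Parser<'t> {
    tokens: &'t [Tok],
    pos: usize,
}

impl<'t> Parser<'t> {
    fn new(tokens: &'t [Tok]) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<Tok> {
        self.tokens.get(self.pos).copied()
    }

    /// expr := term (("+" | "-") term)*
    fn expr(&mut self) -> Option<i64> {
        let mut value = self.term()?;
        while let Some(op @ (Tok::Add | Tok::Subtract)) = self.peek() {
            self.pos += 1;
            let rhs = self.term()?;
            value = if op == Tok::Add { value + rhs } else { value - rhs };
        }
        Some(value)
    }

    /// term := integer (("*" | "/") integer)*
    fn term(&mut self) -> Option<i64> {
        let mut value = self.integer()?;
        while let Some(op @ (Tok::Multiply | Tok::Divide)) = self.peek() {
            self.pos += 1;
            let rhs = self.integer()?;
            value = if op == Tok::Multiply {
                value * rhs
            } else {
                value.checked_div(rhs)? // None on division by zero
            };
        }
        Some(value)
    }

    fn integer(&mut self) -> Option<i64> {
        match self.peek()? {
            Tok::Integer(n) => {
                self.pos += 1;
                Some(n)
            }
            _ => None,
        }
    }
}

/// Parses and evaluates a whole token slice; None on any syntax error.
fn evaluate(tokens: &[Tok]) -> Option<i64> {
    let mut parser = Parser::new(tokens);
    let value = parser.expr()?;
    // Reject trailing tokens such as a dangling integer.
    if parser.pos == tokens.len() {
        Some(value)
    } else {
        None
    }
}
```

Fed the tokens for 10 + 25 * 3 - 8, evaluate returns Some(77), because term binds multiplication before expr handles addition and subtraction. The parser never looks at characters, so the same code would work with any lexer that produces this token type.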

Use Nom if you want to define the entire parsing pipeline, from character-by-character scanning to AST construction, in a single, cohesive codebase. Nom's combinators are powerful for languages with context-sensitive syntax, where what comes next depends on a value parsed earlier, or when you need fine-grained control over backtracking and error messages.
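A length-prefixed field is a small example of context-sensitive syntax: the number parsed first decides how many bytes the next step consumes, which a fixed set of token patterns cannot express. The sketch below shows the idea with the standard library only; length_prefixed and the "<len>:<payload>" format are hypothetical and not taken from Nom.

```rust
// Std-only sketch of a context-sensitive rule: `<len>:<payload>`, where the
// number parsed first decides how many bytes the next step consumes.

/// Returns (rest, payload) on success, None on malformed input.
fn length_prefixed(input: &str) -> Option<(&str, &str)> {
    let (len_text, after_colon) = input.split_once(':')?;
    let len: usize = len_text.parse().ok()?;
    // `get` returns None if `len` is out of range or splits a UTF-8 character.
    let payload = after_colon.get(..len)?;
    Some((&after_colon[len..], payload))
}
```

Given "5:hello world", the function returns the payload "hello" and leaves " world" as the remaining input. In a combinator library this dependency is expressed by letting one parser's result construct the next parser, which is the kind of control a separate tokenizing stage does not offer.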

For most embedded DSLs or configuration parsers, Logos + LALRPOP offers the best balance of speed, clarity, and separation of concerns. For complex programming languages or where parsing logic is tightly coupled with lexical structure, Nom may be preferable.

Tags: Rust