Building an Interpreter in C#: Expression AST Design Based on 'Crafting Interpreters'
Abstract Syntax Tree (AST) in Interpreter Development
Once tokenization is complete, the next step in interpreter development is to represent expressions in a structured, object-oriented format. This is where the Abstract Syntax Tree (AST) comes into play. It transforms linear source code into a hierarchical tree structure that the interpreter can process effectively.
Why AST Matters
Source code, in its raw form, is a sequence of characters. To interpret and execute operations correct, we need a structured representation that reflects the logic and order of operations. For instance:
(1 + 2) * 3
As text, the operation's precedence is ambiguous. However, the AST clearly represents the structure:
*
/ \
+ 3
/ \
1 2
Core Expression Class Design
The base class Expression serves as the foundation for all expression types. Specific expressions extend this class to represent different syntactic constructs:
- Constant: Represents literal values like numbers or strings
- Operation: Handles binary operations such as addition and multiplication
- Group: Represents parenthesized exprestions
- SignChange: Handles unary operations like negation
Visitor Pattern Implementation
To enable various operations on expressions with out coupling logic to data structures, we implement the visitor pattern. This allows separate extensions for evaluation, printing, and analysis.
Visitor Interface
public interface IExpressionVisitor<T>
{
T VisitOperation(Operation expression);
T VisitGroup(Group expression);
T VisitConstant(Constant expression);
T VisitSignChange(SignChange expression);
}
Base Expression Class
public abstract class Expression
{
public abstract T Accept<T>(IExpressionVisitor<T> visitor);
}
Expression Implementations
1. Operation (Binary Expression)
public class Operation : Expression
{
public Expression LeftOperand { get; }
public Token OperatorToken { get; }
public Expression RightOperand { get; }
public Operation(Expression left, Token op, Expression right)
{
LeftOperand = left;
OperatorToken = op;
RightOperand = right;
}
public override T Accept<T>(IExpressionVisitor<T> visitor)
{
return visitor.VisitOperation(this);
}
}
2. Constant (Literal Value)
public class Constant : Expression
{
public object Value { get; }
public Constant(object value)
{
Value = value;
}
public override T Accept<T>(IExpressionVisitor<T> visitor)
{
return visitor.VisitConstant(this);
}
}
3. Group (Parenthesized Expression)
public class Group : Expression
{
public Expression Content { get; }
public Group(Expression expression)
{
Content = expression;
}
public override T Accept<T>(IExpressionVisitor<T> visitor)
{
return visitor.VisitGroup(this);
}
}
4. SignChange (Unary Expression)
public class SignChange : Expression
{
public Token OperatorToken { get; }
public Expression Operand { get; }
public SignChange(Token op, Expression right)
{
OperatorToken = op;
Operand = right;
}
public override T Accept<T>(IExpressionVisitor<T> visitor)
{
return visitor.VisitSignChange(this);
}
}
Significance of Expression Design
- The expression tree serves as the core data model for the interpreter
- Visitor pattern enables extensible operations on expressions
- Clear class hierarchy improves code maintainability and readability