Grapa Execution Pipeline Overview
Introduction
This document provides a high-level overview of Grapa's complete execution pipeline, from source text to final results. Understanding this pipeline is essential for debugging, optimization, and extending the Grapa language.
Complete Pipeline Flow
Source Text → Lexer → Parser → Executor → Results
"x + 5" → Tokens → $OP Tree → Execute → 8
Stage 1: Lexer (Text → Tokens)
The lexer converts raw text input into a stream of tokens:
Input: "x + 5"
Output: [IDENTIFIER:x] [SPACE: ] [OPERATOR:+] [SPACE: ] [NUMBER:5]
Key Features:
- Token Classification: Identifies tokens by type (identifier, operator, number, etc.)
- State Machine: Handles complex token patterns (strings, comments, etc.)
- Quote Analysis: Detects syntax issues with quote characters
- Performance Tracking: Measures tokenization speed
Debug Component: lexer
Usage: GRAPA_SESSION_DEBUG_COMPONENTS=lexer
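To make the tokenization step concrete, here is a minimal Python sketch of this kind of pattern-based lexer. It is illustrative only, not Grapa's actual implementation; the token type names mirror the example output above rather than Grapa's internal numeric type codes:

```python
import re

# Ordered token patterns; first match wins (illustrative, not Grapa's real token table)
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[+\-*/=;]"),
    ("SPACE", r"\s+"),
]

def tokenize(text):
    """Convert raw text into a list of (type, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in TOKEN_SPEC:
            m = re.match(pattern, text[pos:])
            if m:
                tokens.append((name, m.group()))
                pos += len(m.group())
                break
        else:
            raise SyntaxError(f"unrecognized character at {pos}: {text[pos]!r}")
    return tokens

print(tokenize("x + 5"))
# [('IDENTIFIER', 'x'), ('SPACE', ' '), ('OPERATOR', '+'), ('SPACE', ' '), ('NUMBER', '5')]
```

Note that the lexer preserves whitespace as SPACE tokens, matching the example output above; later stages are free to discard them.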
Stage 2: Parser (Tokens → Execution Trees)
The parser converts tokens into execution trees using BNF grammar rules:
Input: [IDENTIFIER:x] [OPERATOR:+] [NUMBER:5]
Output: $OP(add, [$VAR(x), $CONST(5)])
Key Features:
- Grammar Rules: Uses BNF grammar for syntax analysis
- Left Recursion: Handles complex recursive expressions
- Error Context: Provides detailed error information
- Tree Construction: Builds optimized execution trees
Debug Component: parser
Usage: GRAPA_SESSION_DEBUG_COMPONENTS=parser
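The tree-building step can be sketched for the simple `+` case above. This is an illustrative Python toy, not Grapa's BNF-driven parser; nested tuples stand in for the $OP, $VAR, and $CONST nodes:

```python
def parse_expression(tokens):
    """Build a tiny $OP-style tree for 'term (+ term)*' token streams.
    Tuples stand in for Grapa's $OP/$VAR/$CONST nodes (illustrative only)."""
    tokens = [t for t in tokens if t[0] != "SPACE"]  # whitespace tokens are ignored

    def term(tok):
        kind, value = tok
        if kind == "NUMBER":
            return ("CONST", int(value))
        if kind == "IDENTIFIER":
            return ("VAR", value)
        raise SyntaxError(f"unexpected token {tok}")

    tree = term(tokens[0])
    i = 1
    while i < len(tokens):
        if tokens[i] == ("OPERATOR", "+"):
            tree = ("OP", "add", [tree, term(tokens[i + 1])])  # left-associative
            i += 2
        else:
            raise SyntaxError(f"unexpected token {tokens[i]}")
    return tree

tree = parse_expression([("IDENTIFIER", "x"), ("OPERATOR", "+"), ("NUMBER", "5")])
print(tree)  # ('OP', 'add', [('VAR', 'x'), ('CONST', 5)])
```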
Stage 3: Executor (Execution Trees → Results)
The executor runs the execution trees to produce final results:
Input: $OP(add, [$VAR(x), $CONST(5)])
Output: 8
(assuming x = 3)
Key Features:
- Operation Execution: Executes $OP and $CODE nodes
- Library Loading: Dynamically loads required libraries
- Function Calls: Invokes Grapa functions and methods
- Scope Management: Handles variable scoping and binding
- Performance Metrics: Tracks execution timing
Debug Component: executor
Usage: GRAPA_SESSION_DEBUG_COMPONENTS=executor
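Evaluation of such a tree is a recursive walk with a variable scope. The sketch below continues the toy tuple representation from the parser stage; it illustrates the idea, not Grapa's executor:

```python
def execute(node, scope):
    """Walk an $OP-style tree and produce a result (illustrative sketch)."""
    kind = node[0]
    if kind == "CONST":
        return node[1]                # literal value
    if kind == "VAR":
        return scope[node[1]]         # variable lookup in the current scope
    if kind == "OP":
        op, args = node[1], node[2]
        values = [execute(a, scope) for a in args]   # evaluate operands first
        if op == "add":
            return values[0] + values[1]
        raise ValueError(f"unknown operation {op!r}")
    raise ValueError(f"unknown node kind {kind!r}")

scope = {"x": 3}
print(execute(("OP", "add", [("VAR", "x"), ("CONST", 5)]), scope))  # 8
```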
Debug Component Shorthands
For convenience, Grapa provides shorthand debug components:
Compiler Component (compiler)
- Scope: Stages 1-2 (Lexer + Parser)
- Usage: GRAPA_SESSION_DEBUG_COMPONENTS=compiler
- Equivalent: GRAPA_SESSION_DEBUG_COMPONENTS=lexer,parser
Runtime Component (runtime)
- Scope: All stages (Lexer + Parser + Executor)
- Usage: GRAPA_SESSION_DEBUG_COMPONENTS=runtime
- Equivalent: GRAPA_SESSION_DEBUG_COMPONENTS=compiler,executor
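The two equivalences above can be captured as a small expansion table. This Python sketch shows how the shorthands resolve to base components; the function and table names are illustrative, not part of Grapa's API:

```python
# Shorthand expansion rules from the sections above (illustrative)
SHORTHANDS = {
    "compiler": ["lexer", "parser"],
    "runtime": ["compiler", "executor"],
}

def expand(component):
    """Recursively expand a debug component name to its base components."""
    if component not in SHORTHANDS:
        return {component}
    result = set()
    for sub in SHORTHANDS[component]:
        result |= expand(sub)
    return result

print(sorted(expand("compiler")))  # ['lexer', 'parser']
print(sorted(expand("runtime")))   # ['executor', 'lexer', 'parser']
```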
Complete Example
Here's a complete example showing the full pipeline:
Source Code
x = 3;
result = x + 5;
result.echo();
Pipeline Execution
Stage 1: Lexer
[DEBUG-SESSION-1-lexer] LEX: Created token type=4 value='x' quote=
[DEBUG-SESSION-1-lexer] LEX: Created token type=10 value=' ' quote=
[DEBUG-SESSION-1-lexer] LEX: Created token type=10 value='=' quote=
[DEBUG-SESSION-1-lexer] LEX: Created token type=10 value=' ' quote=
[DEBUG-SESSION-1-lexer] LEX: Created token type=8 value='3' quote=
[DEBUG-SESSION-1-lexer] LEX: Created token type=10 value=';' quote=
Stage 2: Parser
[DEBUG-SESSION-1-parser] PARSE: Full token stream: [4:'x'] [10:'='] [8:'3'] [10:';']
[DEBUG-SESSION-1-parser] PARSE: Rule matching - assignment statement
[DEBUG-SESSION-1-parser] PARSE: Execution tree built - $OP(assign, [$VAR(x), $CONST(3)])
Stage 3: Executor
[DEBUG-SESSION-1-executor] EXEC: Starting execution - operation type=21
[DEBUG-SESSION-1-executor] EXEC: Processing operation type=21 (CODE=22, OP=21)
[DEBUG-SESSION-1-executor] EXEC: Library execution - hasLibrary=true
[DEBUG-SESSION-1-executor] EXEC: Function call - libParam='assign'
[DEBUG-SESSION-1-executor] EXEC: Execution completed - elapsed time=15.00 microseconds, hasResult=true
Final Output
8
Performance Characteristics
Compilation Performance
- Lexer: Typically 1-10 microseconds per token
- Parser: Typically 10-100 microseconds per statement
- Total Compilation: Usually under 1 millisecond for simple scripts
Execution Performance
- Simple Operations: 1-10 microseconds
- Function Calls: 10-100 microseconds
- Library Operations: 100-1000 microseconds
- Complex Scripts: Varies by complexity
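Microsecond-scale timings like these can be gathered with a high-resolution timer wrapped around a call. The Python sketch below shows the general measurement approach, not Grapa's internal performance metrics:

```python
import time

def time_us(fn, *args):
    """Return (result, elapsed microseconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_us = (time.perf_counter() - start) * 1_000_000
    return result, elapsed_us

# Time a trivial operation (analogous to timing one execution-tree node)
result, elapsed = time_us(sum, [3, 5])
print(f"result={result}, elapsed={elapsed:.2f} microseconds")
```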
Debug Overhead
- Minimal Impact: ShouldDebug() checks prevent unnecessary output
- Component Filtering: Only enabled components generate output
- Level Filtering: Granular control over verbosity
- Session Isolation: No interference between parallel sessions
- Stream Separation: Debug output goes to stderr, normal output to stdout
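The overhead properties above follow from a cheap guard check performed before any message is formatted. This Python sketch illustrates the pattern; the names (should_debug, ENABLED_COMPONENTS) are hypothetical stand-ins, not Grapa's actual API:

```python
import sys

ENABLED_COMPONENTS = {"executor"}   # e.g. parsed from GRAPA_SESSION_DEBUG_COMPONENTS
DEBUG_LEVEL = 3                     # verbosity ceiling (levels 1-5)

def should_debug(component, level=1):
    """Cheap check performed before any message formatting or I/O."""
    return component in ENABLED_COMPONENTS and level <= DEBUG_LEVEL

def debug(component, level, message):
    # Formatting and I/O happen only when the check passes, and debug
    # output goes to stderr so normal stdout output stays clean.
    if should_debug(component, level):
        print(f"[DEBUG-{component}] {message}", file=sys.stderr)

debug("lexer", 1, "skipped: lexer not enabled")   # produces no output
debug("executor", 1, "written to stderr")
```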
Architecture Insights
Async Pipeline Design
The lexer and parser operate as separate, parallel queues:
Raw Input → Lexer Queue → Token Queue → Parser Queue → Execution Tree
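A queue-connected pipeline of this shape can be sketched with two threads handing tokens through a queue. This is a toy illustration of the stage/queue structure, not Grapa's actual implementation:

```python
import queue
import threading

SENTINEL = None  # marks end-of-stream between stages

def lexer_stage(chars, token_q):
    """Toy lexer: forwards one token per non-space character."""
    for ch in chars:
        if not ch.isspace():
            token_q.put(ch)
    token_q.put(SENTINEL)

def parser_stage(token_q, tree_q):
    """Toy parser: drains the token queue into a 'tree' (here, a flat list)."""
    tree = []
    while (tok := token_q.get()) is not SENTINEL:
        tree.append(tok)
    tree_q.put(tree)

token_q, tree_q = queue.Queue(), queue.Queue()
threading.Thread(target=lexer_stage, args=("x + 5", token_q)).start()
threading.Thread(target=parser_stage, args=(token_q, tree_q)).start()
print(tree_q.get())  # ['x', '+', '5']
```

Because the stages communicate only through queues, the lexer can begin feeding tokens before the full input has been read.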
Session Context
- Each session has isolated debug context
- Session IDs prevent output interference
- Context passed to lexer at creation time
- Multi-processing support for parallel execution
Error Handling
- Lexer Errors: Token-level syntax issues
- Parser Errors: Grammar and structure issues
- Executor Errors: Runtime and execution issues
- Debug Context: Detailed error information at each stage
Best Practices
Debug Usage
- Start with runtime: Use GRAPA_SESSION_DEBUG_COMPONENTS=runtime for complete visibility
- Filter by component: Use specific components (lexer, parser, executor) for targeted debugging
- Adjust levels: Use debug levels (1-5) for appropriate verbosity
- Performance monitoring: Use Level 5 for performance analysis
Development Workflow
- Syntax Issues: Use the lexer component to identify token problems
- Grammar Issues: Use the parser component to debug rule matching
- Runtime Issues: Use the executor component to trace execution flow
- Performance Issues: Use Level 5 debug for timing analysis
Production Considerations
- Disable debug output in production for maximum performance
- Use session-specific debug flags for targeted debugging
- Monitor debug output volume to prevent performance impact
- Use component filtering to focus on specific areas
Conclusion
Grapa's execution pipeline provides comprehensive visibility into the entire process from source text to final results. The debug system offers granular control over what information is displayed, ensuring minimal performance impact while providing powerful debugging capabilities.
Understanding this pipeline is essential for:
- Debugging: Identifying issues at the correct stage
- Optimization: Finding performance bottlenecks
- Extension: Adding new language features
- Troubleshooting: Resolving complex execution issues
The modular design allows for targeted debugging while maintaining the ability to view the complete execution flow when needed.