Stoolap Architecture

This document provides a high-level overview of Stoolap’s architecture, including its major components and how they interact.

System Overview

Stoolap is a high-performance embedded SQL database written in pure Rust. Its architecture prioritizes:

  • Memory-first design with optional disk persistence
  • Full ACID transactions with MVCC
  • Cost-based query optimizer with adaptive execution
  • Multiple index types (B-tree, Hash, Bitmap, HNSW)
  • Parallel query execution via Rayon
  • Minimal unsafe code (only in performance-critical hot paths)

Core Components

Stoolap’s architecture consists of the following major components:

Client Interface

  • Command-Line Interface - Interactive CLI for database operations
  • Library API - Direct Rust API for embedded use
  • C FFI - C API for language bindings (--features ffi)

Query Processing Pipeline

  1. Parser - Converts SQL text into an abstract syntax tree (AST)
    • Lexical analyzer (lexer.rs)
    • Syntax parser (parser.rs)
    • AST builder (ast.rs)
  2. Planner/Optimizer - Converts AST into an optimized execution plan
    • Cost-based planning with I/O and CPU costs
    • Statistics-based optimization
    • Join order optimization (dynamic programming)
    • Adaptive query execution
  3. Executor - Executes the plan and produces results
    • Query executor (query.rs)
    • DDL executor (ddl.rs)
    • Semantic query cache (semantic_cache.rs)

Storage Engine

  • MVCC Engine - Multi-version concurrency control for transaction isolation
    • Transaction management (transaction.rs)
    • Version store (version_store.rs)
    • Visibility rules
  • Table Management - Table creation and schema handling
    • Schema validation
    • Table metadata management
    • Column type management
  • Index System - Multiple index types for different query patterns
    • B-tree indexes (btree.rs) - Range queries, sorting
    • Hash indexes (hash.rs) - O(1) equality lookups
    • Bitmap indexes (bitmap.rs) - Low-cardinality columns
    • HNSW indexes (hnsw.rs) - Vector similarity search
    • Multi-column indexes (multi_column.rs)
  • Persistence Layer - Optional disk storage
    • Write-ahead logging (WAL)
    • Immutable cold volumes (column-major storage with manifests and tombstones)
    • Backup snapshots (PRAGMA SNAPSHOT for disaster recovery)
    • Crash recovery (manifest + WAL replay)

Function System

  • Function Registry - Central registry for 129 SQL functions
    • Scalar functions (scalar/) - String, math, date, JSON, vector functions
    • Aggregate functions (aggregate/)
    • Window functions (window/)

Parallel Execution

  • Rayon Integration - Work-stealing parallelism
  • Parallel Filter - Multi-threaded WHERE evaluation
  • Parallel Join - Concurrent hash build and probe
  • Parallel Sort - Multi-threaded ORDER BY

Request Flow

When a query is executed, it flows through the system as follows:

  1. Query Submission
    • SQL text is submitted via CLI or library API
  2. Parsing and Validation
    • SQL is parsed into an AST
    • Syntax and semantic validation is performed
    • Query is prepared for execution
  3. Planning and Optimization
    • Execution plan is generated with cost estimates
    • Statistics are used to optimize the plan
    • Indexes are selected based on query patterns
    • Join ordering is optimized using dynamic programming
  4. Execution
    • For read queries:
      • Appropriate isolation level is applied
      • Storage engine provides data with visibility rules
      • Filters and projections are applied (with parallelism for large datasets)
      • Results are processed (joins, aggregations, sorting)
      • Final result set is returned
    • For write queries:
      • Transaction is started if not already active
      • Write operations are applied with MVCC rules
      • Indexes are updated atomically
      • Changes are committed or rolled back
  5. Result Handling
    • Results are formatted and returned to the client
    • Memory is released
    • Transaction state is updated

Physical Architecture

In-Memory Mode

In memory-only mode, Stoolap operates entirely in RAM:

  • All data structures reside in memory
  • No disk I/O for data access
  • Highest performance but no durability
  • Use with Database::open("memory://")

Persistent Mode

In persistent mode, Stoolap uses disk storage with memory caching:

  • Data is stored on disk with WAL for durability
  • Write-ahead logging ensures crash recovery
  • Immutable cold volumes for fast startup on large tables
  • Use with Database::open("file:///path/to/db")

Concurrency Model

Stoolap uses MVCC for concurrency:

  • True MVCC - Full version chains with history
  • Optimistic Concurrency Control - Transactions validate at commit time
  • Lock-Free Reads - Readers never block writers
  • Concurrent Writers - Multiple transactions can commit simultaneously
  • Two Isolation Levels - READ COMMITTED (default) and SNAPSHOT

Implementation Details

The core implementation is organized as follows:

src/
├── api/           # Public Database API
├── core/          # Core types (Value, Row, Schema, Error)
├── executor/      # Query execution engine
│   ├── query.rs   # Main query executor
│   ├── planner.rs # Query planner with cost estimation
│   └── ...
├── functions/     # 129 built-in functions
│   ├── scalar/    # String, math, date, JSON, vector functions
│   ├── aggregate/ # COUNT, SUM, AVG, etc.
│   └── window/    # ROW_NUMBER, RANK, LAG, etc.
├── optimizer/     # Cost-based query optimizer
│   ├── cost.rs    # Cost estimator
│   ├── join.rs    # Join optimization
│   └── ...
├── parser/        # SQL parser (lexer, AST, parser)
├── storage/       # Storage engine
│   ├── index/     # B-tree, Hash, Bitmap, HNSW indexes
│   └── mvcc/      # MVCC implementation
└── bin/           # CLI binary

Architectural Principles

Stoolap’s architecture is guided by the following principles:

  1. Performance First - Optimize for speed and memory efficiency
  2. Memory Safety - Pure Rust with minimal unsafe code (FFI and hot paths only)
  3. Modularity - Clean component interfaces for extensibility
  4. Simplicity - Favor simple solutions over complex ones
  5. Data Integrity - Ensure consistent and correct results