NULL Handling
NULL Handling
This document explains how NULL values are handled in Stoolap based on test files and implementation details.
Overview
Based on test files (/test/is_null_test.go
, /test/null_index_test.go
), Stoolap follows standard SQL semantics for NULL values, representing the absence of a value. NULL is distinct from zero, empty string, or any other value.
NULL Behavior in SQL
In Stoolap, NULL follows these standard SQL behaviors:
- NULL is not equal to any value, including another NULL
- Comparisons with NULL generally yield NULL, not TRUE or FALSE
- NULL requires special operators (IS NULL, IS NOT NULL) for testing
- Functions and expressions propagate NULL values
- Aggregate functions generally ignore NULL values
Column NULL Constraints
Columns in Stoolap can be defined as nullable (the default) or NOT NULL:
-- Create a table with both nullable and NOT NULL columns
CREATE TABLE users (
id INTEGER PRIMARY KEY, -- Primary key implies NOT NULL
name TEXT NOT NULL, -- Can never be NULL
email TEXT, -- Can be NULL (default)
age INTEGER -- Can be NULL (default)
);
Testing for NULL Values
To test for NULL values, use the IS NULL and IS NOT NULL operators:
-- Find rows with NULL values
SELECT * FROM users WHERE email IS NULL;
-- Find rows without NULL values
SELECT * FROM users WHERE email IS NOT NULL;
From /test/is_null_test.go
, these operators are properly optimized and can use indexes.
Example from Test Files
From /test/is_null_test.go
:
-- Create a test table
CREATE TABLE test_null (
id INTEGER PRIMARY KEY,
text_val TEXT,
int_val INTEGER,
float_val FLOAT,
bool_val BOOLEAN
);
-- Insert data with NULL values
INSERT INTO test_null VALUES (1, 'Text', 10, 3.14, TRUE);
INSERT INTO test_null VALUES (2, NULL, NULL, NULL, NULL);
INSERT INTO test_null VALUES (3, 'Other', NULL, 2.71, FALSE);
-- Query with IS NULL
SELECT id FROM test_null WHERE text_val IS NULL; -- Returns 2
-- Query with IS NOT NULL
SELECT id FROM test_null WHERE int_val IS NOT NULL; -- Returns 1
-- Multiple NULL conditions
SELECT id FROM test_null
WHERE text_val IS NULL AND int_val IS NULL; -- Returns 2
-- Mix of NULL and regular conditions
SELECT id FROM test_null
WHERE float_val IS NOT NULL AND bool_val = FALSE; -- Returns 3
NULL in Indexes
As demonstrated in /test/null_index_test.go
, Stoolap supports indexing columns that contain NULL values:
-- Create a table with nullable columns
CREATE TABLE test_index_null (
id INTEGER PRIMARY KEY,
category TEXT,
value INTEGER
);
-- Create an index on a nullable column
CREATE INDEX idx_category ON test_index_null(category);
CREATE INDEX idx_value ON test_index_null(value);
-- Insert data with NULL values
INSERT INTO test_index_null VALUES (1, 'A', 10);
INSERT INTO test_index_null VALUES (2, NULL, 20);
INSERT INTO test_index_null VALUES (3, 'B', NULL);
INSERT INTO test_index_null VALUES (4, NULL, NULL);
-- Query with IS NULL using index
SELECT id FROM test_index_null WHERE category IS NULL; -- Returns 2, 4
The tests confirm that IS NULL and IS NOT NULL conditions can use indexes for efficient filtering.
NULL in Expressions
NULL values propagate through expressions according to standard SQL rules:
-- Any operation with NULL generally yields NULL
SELECT 1 + NULL; -- Result: NULL
SELECT 'text' || NULL; -- Result: NULL
SELECT column1 = NULL; -- Result: NULL (not FALSE!)
-- Exceptions for logical operators
SELECT TRUE OR NULL; -- Result: TRUE
SELECT FALSE AND NULL; -- Result: FALSE
NULL in Joins
NULL values in join columns affect the matching behavior:
-- Inner join (NULL doesn't match anything)
SELECT a.id, b.id
FROM table_a a
INNER JOIN table_b b ON a.value = b.value;
-- Rows with NULL in value don't match
-- Left join (preserves all rows from left table)
SELECT a.id, b.id
FROM table_a a
LEFT JOIN table_b b ON a.value = b.value;
-- Rows with NULL in value appear with NULL for b columns
NULL in GROUP BY and DISTINCT
NULL is considered a single value for grouping and distinct operations:
-- NULLs are grouped together
SELECT category, COUNT(*)
FROM products
GROUP BY category;
-- All NULL categories form a single group
-- NULLs count as one distinct value
SELECT COUNT(DISTINCT category) FROM products;
-- Counts NULL as one distinct value if present
NULL in Aggregation Functions
NULL handling in aggregate functions:
-- COUNT(*) counts all rows regardless of NULL
SELECT COUNT(*) FROM users; -- Counts all rows
-- COUNT(column) skips NULL values
SELECT COUNT(email) FROM users; -- Counts only non-NULL emails
-- Other aggregates (SUM, AVG, MAX, MIN) ignore NULL values
SELECT AVG(age) FROM users; -- Average of non-NULL ages
COALESCE Function
To provide default values for NULL, use the COALESCE function:
-- Return the first non-NULL value
SELECT COALESCE(email, 'No Email') FROM users;
-- Can check multiple values in order
SELECT COALESCE(preferred_name, first_name, 'Anonymous') FROM users;
Implementation Details
From the test files and code inspection:
- NULL values are represented distinctly in the storage engine
- The expression evaluator handles NULL propagation
- Index structures store and retrieve NULL values efficiently
- IS NULL and IS NOT NULL operators are optimized for performance
- Comparisons with NULL follow three-valued logic (TRUE, FALSE, NULL)