Skip to content

Commit fa0debd

Browse files
committed
fix: address 5 spec review issues — missing nodes, coercion types, IS TRUE, CASE/WHEN, TRUE/FALSE literals
1 parent ffa1659 commit fa0debd

File tree

1 file changed

+60
-14
lines changed

1 file changed

+60
-14
lines changed

docs/superpowers/specs/2026-04-03-expression-evaluator-design.md

Lines changed: 60 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -60,7 +60,7 @@ The evaluator switches on `expr->type`:
6060
6161
| NodeType | Action |
6262
|---|---|
63-
| `NODE_LITERAL_INT` | Parse `expr->value()` string to int64 via `strtoll` |
63+
| `NODE_LITERAL_INT` | Check for `"TRUE"`/`"FALSE"` text → `value_bool()`. Otherwise parse via `strtoll` → `value_int()` |
6464
| `NODE_LITERAL_FLOAT` | Parse `expr->value()` string to double via `strtod` |
6565
| `NODE_LITERAL_STRING` | `value_string(expr->value())` |
6666
| `NODE_LITERAL_NULL` | `value_null()` |
@@ -69,6 +69,21 @@ The evaluator switches on `expr->type`:
6969
| `NODE_PLACEHOLDER` | `value_null()` (unresolved placeholder) |
7070
| `NODE_IDENTIFIER` | `resolve(expr->value())` (column or keyword-as-value) |
7171
72+
### Deferred node types (return value_null)
73+
74+
These node types are produced by the expression parser but require features beyond the expression evaluator's scope. They return `value_null()` with a clear comment for future implementation:
75+
76+
| NodeType | Reason deferred |
77+
|---|---|
78+
| `NODE_SUBQUERY` | Requires full executor |
79+
| `NODE_TUPLE` | Row constructor — requires row/tuple value type |
80+
| `NODE_ARRAY_CONSTRUCTOR` | ARRAY[...] — requires array value type |
81+
| `NODE_ARRAY_SUBSCRIPT` | expr[index] — requires array support |
82+
| `NODE_FIELD_ACCESS` | (expr).field — requires composite type support |
83+
| `NODE_EXPRESSION` | Wrapper node — unwrap and evaluate first child |
84+
85+
Note: `NODE_EXPRESSION` should be handled by unwrapping (evaluating its first child), not by returning NULL.
86+
7287
### Qualified name
7388
7489
`NODE_QUALIFIED_NAME` has two children (table, column). Combine into `"table.column"` and call `resolve()`.
@@ -79,14 +94,19 @@ The evaluator switches on `expr->type`:
7994
8095
```
8196
1. If operator is AND/OR: use short-circuit evaluation (see below)
82-
2. Evaluate left child → left_val
83-
3. Evaluate right child → right_val
84-
4. NULL propagation: if either NULL → return value_null()
85-
5. Find common type: CoercionRules<D>::common_type(left_val.tag, right_val.tag)
86-
6. Coerce both to common type
87-
7. Apply operator → return result
97+
2. If operator is IS/IS NOT: handle IS TRUE/FALSE/NULL (see below)
98+
3. Evaluate left child → left_val
99+
4. Evaluate right child → right_val
100+
5. NULL propagation: if either NULL → return value_null()
101+
6. Map Value::Tag → SqlType::Kind for coercion lookup
102+
7. Find common type: CoercionRules<D>::common_type(left_kind, right_kind)
103+
8. Map result SqlType::Kind back to Value::Tag
104+
9. Coerce both operands: CoercionRules<D>::coerce_value(val, target_tag, arena)
105+
10. Apply operator → return result
88106
```
89107
108+
**Value::Tag ↔ SqlType::Kind mapping:** The evaluator needs a helper to convert between these enums since `CoercionRules` operates on `SqlType::Kind` while runtime values carry `Value::Tag`. The mapping is straightforward (TAG_INT64→INT, TAG_DOUBLE→DOUBLE, TAG_STRING→VARCHAR, etc.).
109+
90110
**Short-circuit for AND/OR:**
91111
- `AND`: if left is FALSE → return FALSE without evaluating right. If left is NULL → evaluate right; if right is FALSE → FALSE, else NULL.
92112
- `OR`: if left is TRUE → return TRUE without evaluating right. If left is NULL → evaluate right; if right is TRUE → TRUE, else NULL.
@@ -96,11 +116,23 @@ The evaluator switches on `expr->type`:
96116
- Division by zero → NULL
97117
- `%` / `MOD` → integer remainder
98118
- `DIV` → integer division (truncate toward zero)
119+
- Note: `DIV` and `MOD` are MySQL keywords. If the parser doesn't currently produce them as `NODE_BINARY_OP` infix operators (they may not be in `infix_precedence`), the evaluator should handle them gracefully. They may arrive as function calls instead — the function registry already has `MOD` registered.
99120
100121
**Comparison operators** (`=`, `<>`, `!=`, `<`, `>`, `<=`, `>=`):
101122
- Compare coerced values
102123
- Return `value_bool(result)`
103124
125+
**IS / IS NOT operators** (`IS`, `IS NOT`):
126+
The parser produces `NODE_BINARY_OP` with value `"IS"` or `"IS NOT"` for `IS TRUE`, `IS FALSE`, `IS NOT TRUE`, `IS NOT FALSE`. The right child is a `NODE_LITERAL_INT` with text `"TRUE"` or `"FALSE"`.
127+
128+
- `expr IS TRUE` → evaluate expr; if it's truthy and not NULL → TRUE, else FALSE
129+
- `expr IS FALSE` → evaluate expr; if it's falsy and not NULL → TRUE, else FALSE
130+
- `expr IS NOT TRUE` → NOT (expr IS TRUE)
131+
- `expr IS NOT FALSE` → NOT (expr IS FALSE)
132+
- These NEVER return NULL — they always return TRUE or FALSE.
133+
134+
Note: `IS NULL` and `IS NOT NULL` are handled via separate `NODE_IS_NULL`/`NODE_IS_NOT_NULL` node types, not through this path.
135+
104136
**String operators:**
105137
- `||` in PostgreSQL: string concatenation
106138
- `||` in MySQL: logical OR
@@ -114,7 +146,8 @@ The evaluator switches on `expr->type`:
114146
|---|---|
115147
| `-` | Negate: int → `-int_val`, double → `-double_val`. NULL → NULL. |
116148
| `NOT` | `null_semantics::eval_not(child_val)` |
117-
| `+` | No-op (unary plus) |
149+
150+
Note: The parser does not produce `NODE_UNARY_OP` for unary `+` — it discards it and returns the inner expression directly. So no `+` case is needed.
118151
119152
### IS NULL / IS NOT NULL
120153
@@ -145,24 +178,37 @@ The evaluator switches on `expr->type`:
145178
146179
### CASE/WHEN
147180
148-
`NODE_CASE_WHEN` children are interleaved: [case_expr], when1, then1, when2, then2, ..., [else_expr].
181+
`NODE_CASE_WHEN` children are: [case_expr], when1, then1, when2, then2, ..., [else_expr].
182+
183+
**Disambiguation rule:** The parser's `parse_case()` adds the case expression as the first child ONLY when the token after CASE is not WHEN. This means:
184+
- If the child count is **even** → searched CASE (pairs of WHEN/THEN, optionally +1 ELSE makes odd, but each pair is 2)
185+
- If the child count is **odd** → either simple CASE (case_expr + pairs) or searched CASE with ELSE
149186
150-
**Searched CASE** (no case_expr — first child is a WHEN condition):
187+
**Practical approach:** Count children. Try to interpret as searched CASE first (children are WHEN/THEN pairs + optional ELSE). Evaluate the first child as a WHEN condition. If the CASE statement doesn't match any when conditions and we have the wrong form, fall back. In practice, the parser's CASE/WHEN structure is:
188+
- Searched: `[condition1, value1, condition2, value2, ..., else_value]` — odd count with ELSE, even without
189+
- Simple: `[case_expr, match1, value1, match2, value2, ..., else_value]` — even count with ELSE, odd without
190+
191+
**Simplest correct implementation:** Store a flag in `NODE_CASE_WHEN`'s `flags` field to distinguish the forms. The parser sets `flags = 1` for simple CASE, `flags = 0` for searched CASE. This requires a one-line change to the parser's `parse_case()` method and eliminates all ambiguity.
192+
193+
**Searched CASE** (`flags == 0`):
151194
```
195+
Children: when1, then1, when2, then2, ..., [else_expr]
152196
For each WHEN/THEN pair:
153197
evaluate WHEN condition
154198
if TRUE → evaluate and return THEN value
155-
If no match and ELSE exists → evaluate and return ELSE
199+
If remaining child exists → it's ELSE, evaluate and return
156200
If no match and no ELSE → NULL
157201
```
158202
159-
**Simple CASE** (first child is case_expr):
203+
**Simple CASE** (`flags == 1`):
160204
```
205+
Children: case_expr, when1, then1, when2, then2, ..., [else_expr]
161206
Evaluate case_expr
162-
For each WHEN/THEN pair:
207+
For each WHEN/THEN pair (starting from second child):
163208
evaluate WHEN value
164209
if case_expr = WHEN value → evaluate and return THEN
165-
If no match → ELSE or NULL
210+
If remaining child → ELSE
211+
If no match → NULL
166212
```
167213
168214
### Function calls

0 commit comments

Comments
 (0)