Error Handling in T

Comprehensive guide to error handling, recovery patterns, and debugging in T.


Error Philosophy

T treats errors as first-class values, not exceptions. This design enables:

  1. Explicit Error Handling: Errors don’t crash programs; they flow through pipelines
  2. Railway-Oriented Programming: Success and failure paths are explicit
  3. Composable Recovery: Error handling logic can be pipelined like data
  4. Predictable Behavior: No hidden control flow from exceptions
  5. Observability at Scale: Pipeline builds capture and persist errors as artifacts, preventing build halts while ensuring full traceability.

Key Principle: Errors are data, and data can be transformed, inspected, and recovered from.


Execution Modes: Resilient vs. Fail-Fast

T-Lang provides two modes of evaluation, controlled by the “resilience” setting:

1. Resilient Mode (Default)

Evaluation continues even when statements or pipeline nodes result in VError values. This is aligned with the “Errors are Values” philosophy, allowing you to collect as much information as possible from a single run. Residual errors are simply passed to downstream functions (which may short-circuit via |> or recover via ?|>). This is the recommended mode for complex data pipelines where you want to observe as many diagnostic outcomes as possible in a single pass.

2. Fail-Fast Mode

Evaluation stops immediately upon encountering the first VError. This is the conventional behaviour and suits critical scripts, where subsequent steps should run only if every previous step succeeded.

How to toggle:

  - CLI: Use t run --failfast script.t
  - REPL: Set t_run(failfast = true, ...)
  - Pipelines: Use t_make(failfast = true)
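As a rough illustration of the difference between the two modes, consider the short script below. The outcomes in the comments follow from the propagation rules described in this guide; the variable names are purely illustrative.

```t
-- Resilient mode (default): every statement is evaluated
a = 1 / 0              -- Error(DivisionByZero: Division by zero)
b = 2 + 2              -- still evaluated: 4
c = a |> \(x) x + 1    -- |> short-circuits, so c carries the original error

is_error(a)            -- true
is_error(c)            -- true
error_code(c)          -- "DivisionByZero"

-- Fail-fast mode: evaluation stops at the first statement (a),
-- so b and c are never evaluated.
```

Running the same file with the --failfast flag would be expected to stop at the first line instead of reaching b and c.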



Error Types

T has several categories of errors:

1. Type Errors

Occur when operations receive incompatible types:

1 + "hello"
-- Error(TypeError: Cannot add Int and String)

[1, 2] * 3
-- Error(TypeError: Cannot multiply List and Int. Hint: Use map(\(x) x * 3))

2. Name Errors

Occur when referencing undefined variables or functions:

undefined_var
-- Error(NameError: 'undefined_var' is not defined)

prnt(42)
-- Error(NameError: 'prnt' is not defined. Did you mean 'print'?)

Features:

  - Levenshtein distance-based suggestions
  - Searches current scope and standard library
  - Case-insensitive suggestions
  - Reserved Keyword & Built-in Protection: Prevents overwriting core standard library functions or keywords (such as build_log or print), triggering a NameError if assignment (=) or reassignment (:=) is attempted:

print = 42
-- Error(NameError: Cannot overwrite print: it's a reserved keyword!)

3. Value Errors

Occur when values are invalid for the operation:

sqrt(-1)
-- Error(ValueError: Cannot compute square root of negative number)

quantile([1, 2, 3], 1.5)
-- Error(ValueError: Quantile must be between 0 and 1)

4. Arity Errors

Occur when function calls have wrong number of arguments:

add = \(a, b) a + b
add(1)
-- Error(ArityError: Expected 2 arguments (a, b) but got 1)

add(1, 2, 3)
-- Error(ArityError: Expected 2 arguments (a, b) but got 3)

Features:

  - Shows expected parameter names
  - Shows actual argument count
  - Suggests correct usage

5. Division by Zero

1 / 0
-- Error(DivisionByZero: Division by zero)

10 % 0
-- Error(DivisionByZero: Modulo by zero)

6. NA Errors

Occur when NA values are encountered without explicit handling:

mean([1, NA, 3])
-- Error(AggregationError: "Function `mean` encountered NA value. Handle missingness explicitly or set `na_rm` to true.")

1 + NA
-- Error(NAPredicateError: Operation on NA: NA values do not propagate implicitly. Handle missingness explicitly.)
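Handling missingness explicitly might look like the sketch below. It reuses na_rm and is_na, both of which appear elsewhere in this guide; the helper name add_one_or and its fallback parameter are hypothetical.

```t
-- Aggregations: opt in to dropping NA values
mean([1, NA, 3], na_rm = true)     -- mean of the remaining values 1 and 3

-- Scalar arithmetic: test for NA before operating on the value
add_one_or = \(x, fallback)
  if (is_na(x)) fallback else x + 1

add_one_or(NA, 0)    -- 0
add_one_or(4, 0)     -- 5
```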

7. Assertion Errors

Occur when assertions fail:

assert(2 + 2 == 5)
-- Error(AssertionError: Assertion failed)

assert(false, "Custom message")
-- Error(AssertionError: Custom message)

8. Validation Errors

Custom errors for data validation:

validate_age = \(age) 
  if (age < 0 or age > 150)
    error("ValidationError", "Age must be between 0 and 150")
  else
    age

validate_age(-5)
-- Error(ValidationError: Age must be between 0 and 150)

Errors as Values

Creating Errors

-- Generic error
err = error("Something went wrong")

-- Error with code
err = error("NotFoundError", "File not found")

-- Conditional errors
result = if (value < 0)
  error("ValueError", "Value must be positive")
else
  value

Inspecting Errors

result = 1 / 0

-- Check if error
is_error(result)        -- true

-- Get error code
error_code(result)      -- "DivisionByZero"

-- Get error message
error_msg(result)   -- "Division by zero"

-- Get additional context
error_context(result)   -- Stack trace or additional info

Polymorphic Error Inspection on Pipeline Nodes

The standard error functions (error_code, error_msg, and error_context) are polymorphic and can also be called directly on pipeline ComputedNode variables (e.g., p.X or p.combined_df).

This allows direct programmatic inspection of pipeline failures without manually opening log files:

-- Direct inspection of a failed node's error code and message
error_code(p.X)         -- "RuntimeError"
error_msg(p.X)      -- "NameError: name 'dataset_n' is not defined"

-- Inspecting a hard-failed nix-build node
error_msg(p.combined_df)  -- Full multi-line Python/Arrow traceback

Composing and Chaining Errors

T provides powerful primitives to analyze exceptions/diagnostics and preserve their provenance:

1. collect_exceptions(pipeline)

Converts all terminal errors and warning diagnostics from computed nodes of a built pipeline into a structured four-column DataFrame (node, status, code, and message), allowing you to inspect and filter multiple pipeline issues.

To keep the printed output clean and readable:

  * Traceback Truncation: collect_exceptions automatically extracts the last non-empty line of multi-line tracebacks (like Python or Arrow error messages) to preserve the actual error class and message, and caps the text length at 100 characters.
  * Polars-Style Cell Truncation: When pretty-printing any DataFrame in the REPL, all string cells exceeding 35 characters are truncated to 32 characters followed by ... (Polars-style) to prevent wide/broken columns.

Note: Even if a build fails, the build log is written unconditionally to the _pipeline/ directory. This guarantees that collect_exceptions(pipeline) can extract the exact traceback and error code of failed nodes, and retrieve warnings for successfully built nodes.

exceptions_df = collect_exceptions(my_pipeline)

-- exceptions_df can now be processed with colcraft verbs:
exceptions_df |> filter($status == "Warning")

Explaining Collected Exceptions

T’s built-in explain() function has specialized support for collect_exceptions DataFrames:

  - Direct Explanation (1 exception): If the DataFrame contains exactly one exception row, explain(exceptions_df) maps to that diagnostic and returns a structured dictionary describing the error or warning (the originating node, diagnostic code, and description message).
  - Consolidated Explanation (zero or multiple exceptions): Otherwise, explain(exceptions_df) returns a structured representation of the exception collection itself (exceptions_list), with a count property and an exceptions list containing the mapped explanation of each individual diagnostic.
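A hedged sketch of both cases follows. It assumes my_pipeline has already been built, that a node named py_err produced exactly one diagnostic, and that fields of the returned dictionary can be read with dot access; the count and exceptions names come from the description above.

```t
exceptions_df = collect_exceptions(my_pipeline)

-- Exactly one row: explain() describes that single diagnostic
-- (assumes only one row has node == "py_err")
single = exceptions_df |> filter($node == "py_err")
explain(single)

-- Zero or several rows: explain() describes the whole collection
overview = explain(exceptions_df)
overview.count         -- number of collected diagnostics
overview.exceptions    -- one explanation per diagnostic
```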

2. error_chain(err1, err2)

Preserves the causal chain of multiple failures. Chaining sets err2 as the "cause" in err1’s context, maintaining complete traceback and causation history:

err_low_level = error("KeyError", "Key 'id' not found")
err_high_level = error("ValueError", "Validation failed")

chained = error_chain(err_high_level, err_low_level)
error_context(chained)$cause  -- returns err_low_level
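
A common use is wrapping a low-level failure in a domain-specific error without losing the original. The sketch below assumes a hypothetical load_settings helper, an illustrative file name, and a ConfigError code chosen for the example.

```t
load_settings = \(path)
  read_csv(path) ?|> \(res) match(res) {
    Error { _ } => error_chain(
      error("ConfigError", "Could not load settings"),
      res
    ),
    df => df
  }

settings = load_settings("settings.csv")

-- If loading failed, the original error is still reachable
-- through the context of the chained error
if (is_error(settings)) {
  print(error_msg(settings))
  print(error_msg(error_context(settings)$cause))
}
```

Callers can then branch on the high-level code while the low-level cause stays available for debugging.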

Error Propagation

Errors flow through computations until caught:

step1 = 1 / 0                    -- Error
step2 = step1 + 10               -- Still error (no computation)
step3 = step2 * 2                -- Still error

is_error(step3)                  -- true
error_msg(step3)             -- "Division by zero"

Pipe Operators and Errors

Conditional Pipe (|>)

Short-circuits on errors:

-- Success path
5 |> \(x) x * 2 |> \(x) x + 1   -- 11

-- Error short-circuits
error("fail") |> \(x) x * 2 |> \(x) x + 1
-- Error(GenericError: fail)
-- The functions are never called

Use case: Normal data pipelines where errors should stop processing.

df = read_csv("data.csv")
  |> filter($age > 0)
  |> select($name, $age)
  |> arrange($age)
-- If read_csv fails, rest of pipeline is skipped

Maybe-Pipe (?|>)

Always forwards values, including errors:

-- Success path (same as |>)
5 ?|> \(x) x * 2                 -- 10

-- Error is forwarded to function
error("fail") ?|> \(x) 
  if (is_error(x)) 0 else x      -- 0 (recovered!)

Use case: Error recovery, fallback values, logging.

-- Recovery pattern
result = risky_operation()
  ?|> \(x) if (is_error(x)) {
    print(str_join(["Error occurred: ", error_msg(x)]))
    default_value
  } else {
    x
  }

-- Chain recovery with normal processing
error("fail")
  ?|> \(x) if (is_error(x)) 0 else x  -- Recovery
  |> \(x) x + 1                        -- Normal processing (1)

Pattern Matching and Errors

Pattern matching is the most declarative way to handle first-class errors in T. It allows you to destructure error values and branch logic based on specific outcomes.

Basic Error Matching

Use the Error { msg } pattern to capture the error message:

result = match(1 / 0) {
  Error { msg: m } => str_sprintf("Caught error: %s", m),
  val => str_sprintf("Result: %f", val)
}
-- "Caught error: Division by zero"

Automatic Error Propagation

If you don’t explicitly handle Error in a match expression, the original error value is propagated automatically. This matches the behavior of the standard pipe (|>).

-- This match does NOT handle Error
res = match(error("Fail")) {
  [head, ..tail] => head,
  [] => 0
}
-- Result is still Error("Fail")

Error Recovery Patterns

Pattern 1: Default Values

-- Return default on error using match
safe_divide = \(a, b)
  match(a / b) {
    Error { _ } => 0.0,
    res => res
  }

safe_divide(10, 0)   -- 0.0
safe_divide(10, 2)   -- 5.0

Pattern 2: Fallback Sources

-- Try multiple data sources cleanly
data = match(read_csv("primary.csv")) {
  Error { _ } => match(read_csv("backup.csv")) {
    Error { _ } => read_csv("fallback.csv"),
    res => res
  },
  res => res
}

-- Or combining match with the maybe-pipe for flat recovery
final_data = read_csv("primary.csv")
  ?|> \(res) match(res) {
    Error { _ } => read_csv("backup.csv"),
    valid_df => valid_df
  }

Pattern 3: Validation and Recovery

-- Validate, recover if invalid
-- Test for NA first: NA does not propagate implicitly, so v < 0 on NA would itself fail
process_value = \(v)
  if (is_na(v))
    error("NAError", "NA value")
  else if (v < 0)
    error("ValueError", "Negative value")
  else
    v * 2

-- Recover from validation errors
safe_process = \(v)
  process_value(v) ?|> \(result)
    if (is_error(result)) {
      print(str_sprintf("Invalid value: %s", error_msg(result)))
      0  -- Default
    } else {
      result
    }

safe_process(-5)   -- Prints error, returns 0
safe_process(10)   -- Returns 20

Pattern 4: Error Logging

-- Log errors and continue
logged_operation = risky_function()
  ?|> \(x) if (is_error(x)) {
    write_log(str_sprintf("Error in risky_function: %s", error_msg(x)))
    x  -- Pass error along
  } else {
    x
  }

-- Or convert to success after logging
logged_with_default = logged_operation
  ?|> \(x) if (is_error(x)) default_value else x

Pattern 5: Partial Success

-- Process list, collecting errors
process_many = \(items)
  map(items, \(item)
    process_item(item) ?|> \(result)
      if (is_error(result))
        {success: false, error: error_msg(result)}
      else
        {success: true, value: result}
  )

results = process_many([1, -2, 3, 0])
-- [
--   {success: true, value: 2},
--   {success: false, error: "Negative value"},
--   {success: true, value: 6},
--   {success: false, error: "Division by zero"}
-- ]

-- Filter successes
successes = filter(results, \(r) r.success)

Pattern 6: Error Transformation

Pattern matching is excellent for converting low-level errors into domain-specific ones.

-- Transform errors using match
-- Branch on error_code(e): msg holds only the human-readable message
enhance_error = \(e)
  match(e) {
    Error { msg } =>
      if (error_code(e) == "DivisionByZero")
        error("MathError", str_sprintf("Calculation failed: %s", msg))
      else if (error_code(e) == "NAError")
        error("DataQualityError", str_sprintf("Quality issue: %s", msg))
      else
        e,
    _ => e
  }

risky_calc() ?|> enhance_error

Pipeline Diagnostics and Soft-Failures

In T-Lang, the materialization of a pipeline is a separate phase from the logic execution. When a node in a pipeline fails, it doesn’t necessarily halt the entire build.

Pipeline Node Statuses & Failure Modes

When you run build_pipeline(p) and inspect the build log using build_log(p) |> build_log_to_frame(), each node in the pipeline is marked with a distinct execution status.

| Node Status | Did it execute? | Did it fail? | Pipeline Impact | Description & Cause |
|---|---|---|---|---|
| Completed | Yes | No | Success (propagates data) | The node ran successfully and serialized its output artifact. |
| Completed with error (Soft-Fail) | Yes | Yes (Soft) | Continues (propagates error) | The script ran inside the sandbox but raised a user-space exception. T-Lang captures this as a first-class VError value so independent branches can still build. |
| Errored (Hard-Fail) | Yes | Yes (Hard) | Aborts Build | The sandbox execution crashed entirely or exited with a non-zero code (e.g., syntax errors, missing packages, memory exhaustion). |
| Skipped | No | No | Bypassed | The node was never evaluated or executed because an upstream dependency suffered a hard Errored failure. |
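
To see these statuses for a concrete build, the log can be turned into a DataFrame and filtered. This is a sketch only: it assumes p is an existing pipeline and that the log frame exposes the status in a column named status, which this guide does not confirm.

```t
build_pipeline(p)

log_df = build_log(p) |> build_log_to_frame()

-- Nodes that never ran because an upstream node hard-failed
-- (assumes a `status` column holding the status labels listed above)
log_df |> filter($status == "Skipped")
```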

Detailed Failure Mechanics

  1. Why Completed with error propagates downstream: Because T-Lang treats errors as first-class values, a soft-failure is saved as a normal serialized directory outcome. Downstream nodes are still scheduled to run, receive the VError as input, and automatically propagate it further down the pipe unless you explicitly recover from it.

  2. Why Errored halts downstream execution: When a node suffers a hard Errored nix-build failure, the Nix daemon immediately stops evaluating that branch. No output directory is produced, and the entire build process terminates.

  3. Why subsequent nodes are marked Skipped: Because the build aborted early, any downstream nodes that rely on the failed node are never invoked. They do not contain any errors themselves; they are simply bypassed entirely and marked as Skipped to provide a clean, accurate picture of the pipeline state.
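
The soft-failure path can be sketched with a small pipeline. The node names, file name, and column are illustrative; the syntax mirrors the pipeline examples elsewhere in this guide.

```t
p = pipeline {
  -- May produce a VError (soft-fail)
  raw = node(command = read_csv("daily_extract.csv"))

  -- No recovery: |> short-circuits, so a VError from raw is passed on as-is
  adults = node(command = raw |> filter($age >= 18))

  -- Explicit recovery: ?|> hands the VError to the function
  adult_count = node(command =
    adults ?|> \(x) if (is_error(x)) 0 else nrow(x)
  )
}
```

If raw soft-fails, adults would be expected to carry the same error forward, while adult_count recovers and yields 0.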

The Build Summary

After a build, T prints a summary that marks each outcome with an icon:

✖ Pipeline build captured node errors [5 succeeded, 2 captured errors, 2 had warnings]
  ! Captured error in node: r_err
  ! Captured error in node: py_err
  ? Warnings in node: r_warn
  ? Warnings in node: py_warn

Upstream Warning Propagation

When a node emits a warning, downstream nodes that depend on it automatically inherit that warning. This ensures that warnings are never silently lost in a pipeline chain.

Use warning_msg(node) to inspect a node’s own warnings and any warnings inherited from ancestor nodes:

warning_msg(p.filtered)
-- "filter() excluded 1 row because the predicate evaluated to NA"

warning_msg(p.count)        -- downstream of `filtered`
-- "Ancestor node 'filtered' reported following warning: filter() excluded 1 row because the predicate evaluated to NA"

Use inspect_node(node).warnings for structured access:

inspect_node(p.count).warnings
-- [
--   { source: "filtered", message: "filter() excluded 1 row because the predicate evaluated to NA" }
-- ]

The per-node diagnostics in read_pipeline(p).nodes carry full warning lists as well. The aggregate diagnostics.summary counts only own warnings, so the pipeline-level summary reflects which nodes originally produced warnings without double-counting inherited ones.

Investigating with explain()

When you load a node that soft-failed, you receive a T-Lang Error object. You can use the explain() builtin to see the exact cause, including tracebacks from other languages.

hu = read_node("py_err")
-- Error(RuntimeError: "Critical error in Python logic")

explain(hu)
-- Output:
-- Context:
--   runtime_traceback: "Traceback (most recent call last): ..."
--   node_name: "py_err"
--   node_status: "errored"

Polyglot Error Handling (Python/R)

T-Lang provides a “diagnostic bridge” for nodes running in other languages.

Python Nodes

R Nodes


Pattern 7: Conditional Modeling and Automated Reporting

In automated daily pipelines where data is refreshed externally, a three-way branching strategy is often necessary. You can use pattern matching to choose between a full complex model, a simplified fallback model, or a validation report based on the data’s characteristics.

p = pipeline {
  -- 1. Load the latest daily raw data
  raw_data = node(command = read_csv("daily_extract.csv"))

  -- 2. Validate and "tag" the data based on quality metrics
  quality_status = node(command = 
    raw_data ?|> \(df) {
      if (nrow(df) < 500) 
        error("CriticalError", "Insufficient rows for any modeling.")
      else if (length(levels(df.segment)) < 2)
        [type: "low_diversity", data: df]
      else
        [type: "high_quality", data: df]
    }
  )

  -- 3. Precise branching: Choose the best action for each state
  final_result = node(command = 
    quality_status ?|> \(outcome) match(outcome) {
      -- Full Case: Sufficient data and factor levels
      [type: "high_quality", data: df] => 
        lm(df, $yield ~ $temp + $segment),
      
      -- Fallback Case: Enough data, but not enough factor diversity
      -- We drop the 'segment' predictor to avoid model failure
      [type: "low_diversity", data: df] => 
        lm(df, $yield ~ $temp),
      
      -- Failure Case: Stop modeling and generate a diagnostic report
      Error { msg } => 
        generate_validation_report(msg, raw_data)
    }
  )
}

Why use this?

  * Adaptive Modeling: Your pipeline automatically scales its complexity to match the quality of the incoming data, avoiding “singular matrix” or “one-level factor” errors.
  * Operational Intelligence: Instead of the whole pipeline failing due to a minor data shift (like one category disappearing from today’s extract), the system gracefully degrades its service while still providing a result.
  * Auditability: Every run clearly states which path was taken through the use of descriptive tags like low_diversity or high_quality.


Comparing Error Handling Tools

T offers several tools for managing errors, each suited for different scenarios:

General Guidance:

  * Use |> for the majority of your data pipelines where you expect success and want to stop on the first error.
  * Use ?|> when you need to inspect or act on an error at a specific point in a pipeline, often followed by match or an if (is_error(...)) check.
  * Use match when your error recovery logic involves different actions for different error types, or when you want a clear, declarative way to distinguish between success and various error states.
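
The three tools can be combined in a single chain. In this sketch the file name and column are placeholders, and the summary strings are illustrative.

```t
load_summary = read_csv("data.csv")
  |> filter($age > 0)                 -- skipped if read_csv failed
  ?|> \(res) match(res) {             -- always called, error or not
    Error { msg: m } => str_sprintf("Load failed: %s", m),
    df => str_sprintf("Loaded %s rows", to_string(nrow(df)))
  }
```

Either way load_summary ends up as a plain string, so later steps do not need their own error checks.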


Common Errors

NA Handling Errors

Problem: Forgot to handle NA values

mean([1, 2, NA, 4])
-- Error(AggregationError: "Function `mean` encountered NA value. Handle missingness explicitly or set `na_rm` to true.")

Solution: Use na_rm = true

mean([1, 2, NA, 4], na_rm = true)  -- 2.33...

Type Mismatch Errors

Problem: Mixing incompatible types

"Age: " + 25
-- Error(TypeError: Cannot add String and Int)

Solution: Use separate print statements or to_string() for manual conversion.

print("Age: ")
print(25)

-- Or use R/Python for formatting if needed
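
For a single formatted line, one option is to combine to_string() with str_sprintf(), following the formatting style used in the Best Practices section of this guide. The variable name is illustrative.

```t
age = 25

-- Convert explicitly, then interpolate into the message
print(str_sprintf("Age: %s", to_string(age)))
```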

Empty Collection Errors

Problem: Operating on empty data

mean([])
-- Error(ValueError: Cannot compute mean of empty list)

Solution: Check before operating

safe_mean = \(data)
  if (length(data) == 0)
    error("ValueError", "Empty data")
  else
    mean(data)

-- Or use default
safe_mean_with_default = \(data)
  if (length(data) == 0) 0.0 else mean(data)

Division by Zero

Problem: Dividing by zero

revenue_per_customer = total_revenue / customer_count
-- Error if customer_count = 0

Solution: Guard condition

revenue_per_customer = if (customer_count == 0)
  0.0
else
  total_revenue / customer_count

Missing Column Errors

Problem: Referencing non-existent column

df.nonexistent_column
-- Error(NameError: Column 'nonexistent_column' not found)

Solution: Check columns first

cols = colnames(df)
has_column = length(filter(cols, \(c) c == "age")) > 0

if (has_column)
  df.age
else
  error("ColumnError", "Required column 'age' not found")

Best Practices

1. Fail Fast, Fail Explicitly

Good: Validate inputs early

process_data = \(df)
  if (nrow(df) == 0)
    error("ValidationError", "Empty dataset")
  else if (not has_required_columns(df))
    error("ValidationError", "Missing required columns")
  else
    -- Process data (the validated frame is returned here as a placeholder)
    df

Bad: Silent failures

process_data = \(df)
  -- Assumes data is valid, may fail later
  df |> filter(...) |> summarize(...)

2. Use Descriptive Error Messages

Good: Clear, actionable messages

if (age < 0 or age > 150)
  error("ValidationError", str_sprintf("Age must be between 0 and 150, got: %s", to_string(age)))

Bad: Vague messages

if (age < 0 or age > 150)
  error("Invalid age")

3. Choose the Right Pipe

Use |> for normal success paths:

df |> filter($active) |> select($name)

Use ?|> for error recovery:

load_data() ?|> \(x) if (is_error(x)) backup_data() else x

4. Document Error Conditions

-- Computes mean, errors if list is empty or contains NA
-- Use na_rm = true to skip NA values
my_mean = \(values, na_rm)
  if (length(values) == 0)
    error("ValueError", "Empty list")
  else
    -- Delegate to the built-in mean, forwarding the na_rm flag
    mean(values, na_rm = na_rm)

5. Test Error Cases

-- Test successful case
assert(safe_divide(10, 2) == 5.0)

-- Test error recovery
assert(safe_divide(10, 0) == 0.0)
assert(is_error(risky_function(-1)))

6. Log Errors in Production

production_pipeline = pipeline {
  data = read_csv("data.csv") ?|> \(x) if (is_error(x)) {
    write_log(str_sprintf("CRITICAL: Failed to load data: %s", error_msg(x)))
    x
  } else x
  
  -- Rest of pipeline with error handling
}

7. Avoid Deep Nesting

Bad: Deeply nested error checks

result = if (is_error(step1))
  handle_step1_error(step1)
else
  if (is_error(step2))
    handle_step2_error(step2)
  else
    if (is_error(step3))
      handle_step3_error(step3)
    else
      success_value

Good: Use maybe-pipe for flat composition

result = step1
  ?|> \(x) if (is_error(x)) handle_step1_error(x) else step2(x)
  ?|> \(x) if (is_error(x)) handle_step2_error(x) else step3(x)
  ?|> \(x) if (is_error(x)) handle_step3_error(x) else x

Debugging Errors

result = risky_operation()

if (is_error(result)) {
  print(str_sprintf("Error code: %s", error_code(result)))
  print(str_sprintf("Error message: %s", error_msg(result)))
  print(str_sprintf("Error context: %s", error_context(result)))
}

Trace Error Source

-- Add markers in pipeline
step1 = operation1() ?|> \(x) if (is_error(x)) {
  print("Error at step1")
  x
} else x

step2 = operation2(step1) ?|> \(x) if (is_error(x)) {
  print("Error at step2")
  x
} else x

Assert Invariants

-- Use assertions to catch logic errors
result = computation()
assert(result > 0, "Result should be positive")
assert(length(result_list) == expected_length, "Length mismatch")

See Also:

  - Examples: Error handling patterns in practice
  - API Reference: Error-related functions
  - Troubleshooting: Common issues and solutions