API Reference

Package-oriented guide to T’s standard library.

Coverage: Package overview and worked examples
Exhaustive per-function reference: docs/reference/index.md, generated from source docstrings
Auto-loaded: All standard packages are automatically available in every T session


For generated one-page documentation for every exported function, including newer Chrono, string, join, to_factor, and helper APIs, use the Function Reference.


Core Syntax: Lists, Dictionaries, and Blocks

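Lists, dictionaries, and blocks are distinct, non-overlapping forms. Lists and dictionaries are written with brackets; blocks are written with braces.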

Bracket literal rule ([...])

When parsing a bracket literal, T applies this rule:

  1. Parse comma-separated top-level items.
  2. If any top-level item is key: value, the whole literal is treated as a dictionary.
  3. Otherwise, it is treated as a list.

This means:

[]                    -- empty List
[:]                   -- empty Dict
[1, 2, 3]             -- List
[name: "alice"]      -- Dict
[name: "alice", age: 32]  -- Dict

Disallowed mixed forms

A single bracket literal cannot mix dictionary entries and plain expressions:

[name: "alice", 12]  -- Parse error
[name:]               -- Parse error
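
The classification rule above can be modeled in a few lines. This is a Python sketch of the rule, not T's parser: it assumes the items have already been split at top-level commas, that no string literal contains a colon, and that the empty-dictionary literal [:] arrives as the single item ":". The name classify_bracket_literal is hypothetical.

```python
def classify_bracket_literal(items):
    """Classify the pre-split top-level items of a bracket literal.

    `items` is a list of strings, one per comma-separated top-level item.
    Assumption: the empty-dictionary literal `[:]` is passed as [":"].
    """
    if items == []:
        return "List"   # []
    if items == [":"]:
        return "Dict"   # [:]

    def is_entry(item):
        # A complete entry has a non-empty key and a non-empty value.
        key, sep, value = item.partition(":")
        return sep == ":" and key.strip() != "" and value.strip() != ""

    if not any(":" in item for item in items):
        return "List"
    # At least one item looks like a dictionary entry, so every item
    # must be a complete `key: value` pair; anything else is rejected.
    if all(is_entry(item) for item in items):
        return "Dict"
    return "ParseError"


print(classify_bracket_literal(["1", "2", "3"]))           # List
print(classify_bracket_literal(['name: "alice"', "12"]))   # ParseError
```

Because a single key: value item switches the whole literal to dictionary mode, the mixed form fails as a whole instead of silently producing a list.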

Braces are blocks

{ ... } is reserved for block syntax (e.g., in control flow and pipeline/intent constructs). It is not used for general dictionary literals.

An empty brace block {} parses as an empty block (Block []) and evaluates to NA at runtime. Braces are never used for dictionary literals; dictionaries always use the bracket ([...]) syntax described above.


Shell Interaction

Shell Escape (?<{ ... }>)

T provides first-class support for executing shell commands using the ?<{ }> syntax.

Examples:

?<{ls -la}>             -- Prints directory listing
files = ?<{ls}>         -- Captures filenames in a string
?<{cd /tmp}>            -- Changes working directory to /tmp

Core Package

Fundamental functional programming utilities.

print(value)

Print a value to standard output.

Parameters:

Returns:

The printed value (for chaining)

Examples:

print(42)                    -- 42
print("Hello, T!")           -- Hello, T!
print([1, 2, 3])             -- [1, 2, 3]
x = 10 |> print |> \(v) v * 2  -- Prints 10, returns 20

pretty_print(value)

Pretty-print a value with detailed formatting (for DataFrames, structures, etc.).

Parameters:

Returns:

The value (for chaining)

Examples:

pretty_print(df)  -- Formatted DataFrame output

type(value)

Get the type name of a value as a string.

Parameters:

Returns:

String — Type name

Examples:

type(42)             -- "Int"
type(3.14)           -- "Float"
type(true)           -- "Bool"
type("hello")        -- "String"
type([1, 2])         -- "List"
type([x: 1])         -- "Dict"
type(NA)             -- "NA"
type(error("x"))     -- "Error"
type(df)             -- "DataFrame"

to_integer(value)

Convert a value to an integer robustly. Handles strings with spaces, percentages, and commas, and recognizes ‘TRUE’/‘FALSE’. Also vectorizes over Collections.

Parameters:

Returns:

Int, NA, or a Collection of Int/NA

Examples:

to_integer("12 300")     -- 12300
to_integer("TRUE")       -- 1
to_integer("FALSE")      -- 0
to_integer("15%")        -- 15
to_integer("3,14")       -- 3
to_integer(3.14)         -- 3
to_integer("hello")      -- NA(Int)
to_integer(["1", "2"])   -- [1, 2]
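
The conversions in these examples can be reproduced with a small model. The Python sketch below is an illustration only, not T's implementation: it assumes that a comma is read as a decimal separator, that the fractional part is truncated instead of rounded, and it uses None to stand in for NA.

```python
import math


def to_integer(value):
    """Model of the documented conversions; None stands in for NA."""
    if isinstance(value, list):
        return [to_integer(v) for v in value]   # vectorize over collections
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, (int, float)):
        return math.trunc(value)                # assumption: truncate, not round
    text = value.strip()
    if text.upper() == "TRUE":
        return 1
    if text.upper() == "FALSE":
        return 0
    # Drop inner spaces and a percent sign; read a comma as the decimal mark.
    cleaned = text.replace(" ", "").replace("%", "").replace(",", ".")
    try:
        return math.trunc(float(cleaned))
    except (ValueError, OverflowError):
        return None                             # unparseable input becomes NA


print(to_integer("12 300"))      # 12300
print(to_integer("3,14"))        # 3
print(to_integer(["1", "2"]))    # [1, 2]
```

Under the decimal-comma assumption, a string with several commas such as "1,234,567" does not parse and yields NA in this model; whether T treats that input differently is not stated in this guide.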

to_float(value)

Convert a value to a float robustly. Handles strings with spaces, percentages, and commas, and recognizes boolean strings such as ‘TRUE’ and ‘F’. Also vectorizes over Collections.

Parameters:

Returns:

Float, NA, or a Collection of Float/NA

Examples:

to_float("3,14")         -- 3.14
to_float("15%")          -- 15.0
to_float(" 1 200.5 ")    -- 1200.5
to_float("TRUE")       -- 1.0
to_float("F")          -- 0.0
to_float(42)             -- 42.0
to_float("hello")        -- NA(Float)
to_float(["1", "2"])   -- [1.0, 2.0]

to_symbol(value)

Convert a string name into a Symbol so it can be injected into quoted code with !!. Existing symbols pass through unchanged.

Parameters:

Returns:

Symbol

Examples:

to_symbol("mpg")                           -- mpg
to_expr(select(df, !!to_symbol("mpg")))       -- to_expr(select(df, mpg))
name = "result"
to_expr(f(!!to_symbol(name) := 42))           -- to_expr(f(result = 42))

args(fn)

Returns a dictionary of parameter names and their expected types for a function.


is_error(value)

Returns true if the value is an Error object.


get(target, selector = NA, default = NA)

Unified retrieval for variables, collection elements, pipeline nodes, or lens focuses.

Examples:

get("salary")                -- variable lookup
get(list, 0)                  -- indexing
get(df, col_lens("mpg"))      -- lens focus
get(val, default = 0)         -- fallback if val is NA/Error

ifelse(condition, true_val, false_val, missing = NA, out_type = NA)

Vectorized conditional selection.


case_when(...formulas, .default = NA)

Vectorized multi-condition switch. Uses condition ~ value formulas.


identical(a, b)

Deep equality check. Works for collections and complex objects.


eval(expr) / to_expr(x) / to_exprs(...)

quo(x) / quos(...) / enquo(p) / enquos(...)

Metaprogramming and quotation utilities.


body(fn) / source(fn)

Inspect function implementation.


run(cmd)

Execute a shell command and return its stdout as a string.


cat(...values, sep = " ", file = NA, append = false)

Print values to stdout or a file without a trailing newline (unless specified).


getwd() / exit(code = 0)

Environment and process control.


file_exists(path) / dir_exists(path) / list_files(path, pattern = NA)

read_file(path) / read_lines(path)

File system introspection and reading.


path_join(...) / path_abs(path)

path_basename(path) / path_dirname(path)

path_ext(path) / path_stem(path)

Cross-platform path manipulation.


show_plot(plot)

Display a built or unbuilt R/Python/Julia plot node (depends on the environment’s plot viewer).


env()

Returns a list of all variable names currently in the environment.


length(collection)

Get the number of elements in a collection.

Parameters:

Returns:

Int — Number of elements

Examples:

length([1, 2, 3])      -- 3
length("hello")        -- 5
length([])             -- 0

head(collection, n)

Get the first element(s) of a collection. For DataFrames, returns the first n rows (default 5). For Lists, returns the first element.

Parameters:

Returns:

Single element (for Lists) or DataFrame (for DataFrames)

Examples:

head([1, 2, 3, 4, 5])       -- 1
head(df)                     -- first 5 rows
head(df, 3)                  -- first 3 rows
head(df, n = 10)             -- first 10 rows

tail(collection, n)

For DataFrames, returns the last n rows (default 5). For Lists, returns all elements except the first.

Parameters:

Returns:

List (for Lists) or DataFrame (for DataFrames)

Examples:

tail([1, 2, 3, 4, 5])  -- [2, 3, 4, 5]
tail(df)                -- last 5 rows
tail(df, 3)             -- last 3 rows
tail(df, n = 10)        -- last 10 rows

map(collection, fn)

Apply a function to each element of a collection.

Parameters:

Returns:

List (or Vector) of results

Examples:

map([1, 2, 3], \(x) x * x)           -- [1, 4, 9]
map(["a", "b"], \(s) s + "!")        -- ["a!", "b!"]
map([1, 2, 3], \(x) x + 10)          -- [11, 12, 13]

filter(collection, predicate)

Keep only elements that satisfy a predicate.

Parameters:

Returns:

List (or Vector) of matching elements

Examples:

filter([1, 2, 3, 4, 5], \(x) x > 3)    -- [4, 5]
filter([1, 2, 3], \(x) x % 2 == 0)     -- [2]
filter(["a", "ab", "abc"], \(s) length(s) > 1)  -- ["ab", "abc"]

sum(collection)

Sum all numeric elements.

Parameters:

Returns:

Int or Float — Sum

Examples:

sum([1, 2, 3, 4, 5])    -- 15
sum([1.5, 2.5, 3.0])    -- 7.0
sum([])                 -- 0

seq(start, end, step = 1)

Generate a sequence of numbers.

Parameters:

Returns:

List of numbers

Examples:

seq(1, 5)       -- [1, 2, 3, 4, 5]
seq(0, 10, 2)   -- [0, 2, 4, 6, 8, 10]
seq(5, 1, -1)   -- [5, 4, 3, 2, 1]

is_error(value)

Check whether a value is an Error.

Parameters:

Returns:

Bool — true if value is an Error

Examples:

is_error(42)             -- false
is_error(error("msg"))   -- true
is_error(1 / 0)          -- true

getwd()

Returns the current working directory of the T interpreter.

Returns:

String — Working directory path


file_exists(path)

Check if a regular file exists at the given path. Returns false for directories.

Parameters:

Returns:

Bool


dir_exists(path)

Check if a directory exists at the given path.

Parameters:

Returns:

Bool


read_file(path)

Read the entire contents of a file as a string.

Parameters:

Returns:

String or Error(FileError)


list_files(path, pattern)

List files and directories in a given path.

Parameters:

Returns:

List[String] or Error(FileError)


env(name)

Get the value of an environment variable.

Parameters:

Returns:

String or NA if not found


exit(code)

Exits the T interpreter.

Parameters:


path_join(...)

Join multiple path segments using the system-specific separator.

Parameters:

Returns:

String


path_basename(path)

Get the filename/last component of a path.

Parameters:

Returns:

String


path_dirname(path)

Get the directory portion of a path.

Parameters:

Returns:

String


path_ext(path)

Get the file extension (including the dot). Returns NA if no extension is found.

Parameters:

Returns:

String or NA


path_stem(path)

Get the filename without its extension.

Parameters:

Returns:

String
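
The extension and stem rules described for path_ext and path_stem can be modeled with Python's standard posixpath module. This is a sketch under assumptions, not T's implementation: it uses "/" as the separator, returns None where T returns NA, and treats only the last dot-suffix as the extension (how T handles names such as archive.tar.gz is not stated here).

```python
import posixpath


def path_ext(path):
    """Extension including the dot, or None (standing in for NA) if absent."""
    extension = posixpath.splitext(posixpath.basename(path))[1]
    return extension if extension else None


def path_stem(path):
    """Final path component with its extension removed."""
    return posixpath.splitext(posixpath.basename(path))[0]


print(path_ext("data/report.csv"))    # .csv
print(path_stem("data/report.csv"))   # report
print(path_ext("data/Makefile"))      # None
```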


path_abs(path)

Resolves a relative path to an absolute path against the current working directory.

Parameters:

Returns:

String


rm(...)

Remove one or more variables from the environment by name. Supports bare symbols (R-style selective removal), strings, and lists of names via the list parameter.

Parameters:

Returns:

NA

Examples:

x = 10; y = 20
rm(x, y)          -- Removes x and y

z = 30
rm("z")           -- Removes z

vars = ["a", "b"]
rm(list = vars)   -- Removes variables 'a' and 'b'

Base Package

Error handling, NA values, and assertions.

error(message) / error(code, message)

Create an error value.

Parameters:

Returns:

Error value

Examples:

error("Something went wrong")
error("ValueError", "Invalid input")

e = error("custom error")
error_msg(e)  -- "custom error"

error_code(err)

Get the error code from an Error value.

Parameters:

Returns:

String — Error code

Examples:

e = 1 / 0
error_code(e)  -- "DivisionByZero"

e2 = error("TypeError", "msg")
error_code(e2)  -- "TypeError"

error_msg(err)

Get the error message from an Error value.

Parameters:

Returns:

String — Error message

Examples:

e = error("Something broke")
error_msg(e)  -- "Something broke"

e2 = 1 / 0
error_msg(e2)  -- "Division by zero"

warning_msg(node)

Get the warning message from a completed computed node (if any exists).

Parameters:

Returns:

String — Warning message, or an empty string "" if there are no warnings.

Examples:

p = pipeline { a = suppress_warnings(node()) }
build_pipeline(p)
warning_msg(p.a)  -- Returns warning message string or ""

error_context(err)

Get additional context from an Error value (if available).

Parameters:

Returns:

Dict — A dictionary of related context data.

Examples:

error_context(e)  -- Additional debugging information

error_chain(err1, err2)

Explicitly chains two Error values together to preserve their provenance. This sets err2 as the underlying cause in err1’s context.

Parameters:

Returns:

Error — The chained Error value.

Examples:

err1 = error("Primary calculation failed")
err2 = error("KeyError", "Missing key 'x'")
chained = error_chain(err1, err2)

error_context(chained).cause  -- Returns err2

assert(condition) / assert(condition, message)

Assert that a condition is true; error if false.

Parameters:

Returns:

true if condition holds

Examples:

assert(2 + 2 == 4)                -- true
assert(1 > 2)                     -- Error(AssertionError)
assert(false, "Custom message")   -- Error(AssertionError: Custom message)

assert_file_exists(path) / assert_file_exists(path, message)

Assert that a regular file exists at path.

Parameters:

Returns:

true if the file exists

Examples:

assert_file_exists("output.csv")
assert_file_exists("report.html", "report generation failed")

assert_dir_exists(path) / assert_dir_exists(path, message)

Assert that a directory exists at path.

Parameters:

Returns:

true if the directory exists

Examples:

assert_dir_exists("results")
assert_dir_exists("artifacts", "artifact directory was not created")

assert_size_of_file(path, size) / assert_size_of_file(path, size, message)

Assert that a regular file exists and has the expected size in bytes.

Parameters:

Returns:

true if the file exists and matches the expected size

Examples:

assert_size_of_file("output.csv", 128)
assert_size_of_file("report.html", 0, "report should be empty")

assert_non_empty_file(path) / assert_non_empty_file(path, message)

Assert that a regular file exists and contains at least one byte.

Parameters:

Returns:

true if the file exists and is non-empty

Examples:

assert_non_empty_file("output.csv")
assert_non_empty_file("plot.png", "plot was not written")

NA

Untyped missing value constant.

Examples:

x = NA
is_na(x)  -- true

na_int() / na_float() / na_bool() / na_string()

Create typed NA values.

Returns:

Typed NA value

Examples:

na_int()     -- NA(Int)
na_float()   -- NA(Float)
na_bool()    -- NA(Bool)
na_string()  -- NA(String)

is_na(value)

Check if a value is NA.

Parameters:

Returns:

Bool — true if value is NA

Examples:

is_na(NA)           -- true
is_na(na_int())     -- true
is_na(42)           -- false
is_na("hello")      -- false

serialize(value, path)

Serializes a value to a .tobj file.

Parameters:

Returns:

NA

See also: deserialize


deserialize(path)

Deserializes a value from a .tobj file.

Parameters:

Returns:

Any — The deserialized value

See also: serialize


t_write_json(value, path)

Serializes a T value to a JSON file. This is used as the universal baseline for object transport between runtimes.

Parameters:

Returns:

NA


t_read_json(path)

Deserializes a T value from a JSON file. Automatically handles type conversion for scalars, lists, and dictionaries.

Parameters:

Returns:

Any — The deserialized value


Math Package

Mathematical functions operating on scalars and vectors. Most functions are vectorized over Collections.

sqrt(x)

Square root.

Parameters:

Returns:

Float

Examples:

sqrt(4)      -- 2.0
sqrt(2)      -- 1.41421356237
sqrt(0)      -- 0.0
sqrt(-1)     -- Error (negative input)

abs(x)

Absolute value.

Parameters:

Returns:

Same type as input

Examples:

abs(-5)      -- 5
abs(3.14)    -- 3.14
abs(0)       -- 0

log(x) / log10(x) / log2(x)

Logarithm functions. log is natural logarithm (base e).

Parameters:

Returns:

Float

Examples:

log(10)      -- 2.30258509299
log10(100)   -- 2.0
log2(1024)   -- 10.0

exp(x)

Exponential function (e^x).

Parameters:

Returns:

Float

Examples:

exp(0)       -- 1.0
exp(1)       -- 2.71828182846

pow(base, exponent)

Power function (base^exponent).

Parameters:

Returns:

Float

Examples:

pow(2, 10)   -- 1024.0
pow(9, 0.5)  -- 3.0

sin(x) / cos(x) / tan(x)

Standard trigonometric functions (input in radians).


asin(x) / acos(x) / atan(x) / atan2(y, x)

Inverse trigonometric functions. atan2 returns the angle whose tangent is y/x.


sinh(x) / cosh(x) / tanh(x)

asinh(x) / acosh(x) / atanh(x)

Hyperbolic and inverse hyperbolic functions.


floor(x) / ceiling(x)

Round down (floor) or up (ceiling) to the nearest integer.


round(x, digits = 0) / signif(x, digits = 6)

Rounding to decimal places or significant figures.


trunc(x) / sign(x)

Truncate fractional part or get the sign (-1, 0, 1) of a value.


ndarray(data, shape = NA) / reshape(array, shape)

Create or reshape N-dimensional arrays. ndarray can infer shape from nested lists.


shape(array) / ndarray_data(array)

Get NDArray dimensions (as a List) or flat data (as a List of Floats).


matmul(a, b) / inv(matrix) / transpose(matrix)

Linear algebra operations on 2D NDArrays.


diag(x) / kron(a, b) / cbind(a, b)

Matrix creation and manipulation. diag extracts the diagonal from a 2D array or creates a diagonal matrix from a 1D array.


iota(n)

Returns a Vector of length n filled with 1.0 (a ones vector). Useful for initializing weights or masks.

Additional pow(base, exponent) examples:

pow(2, 3)    -- 8.0
pow(10, 2)   -- 100.0
pow(4, 0.5)  -- 2.0 (square root)
pow(2, -1)   -- 0.5


Stats Package

Statistical functions for data analysis. Most functions handle missingness via an na_rm parameter.

Descriptive Statistics

median(x, na_rm = false, weights = NA) / mean(x, na_rm = false, weights = NA)

min(x, na_rm = false) / max(x, na_rm = false) / range(x, na_rm = false)

Basic descriptive statistics. mean and median also accept optional non-negative observation weights. range returns a List of [min, max].


var(x, na_rm = false, weights = NA) / sd(x, na_rm = false, weights = NA) / cv(x, na_rm = false, weights = NA)

Variance, standard deviation, and coefficient of variation (sd/mean). These also accept optional non-negative observation weights. The weighted sd/var path uses the weighted population denominator (sum(weights)), while the unweighted path uses sample formulas.
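
The difference between the two denominators is easiest to see side by side. The following Python sketch illustrates the formulas as described; it is not T's code, and the function names are hypothetical.

```python
def sample_variance(xs):
    """Unweighted path: sample formula with n - 1 in the denominator."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def weighted_variance(xs, weights):
    """Weighted path: population-style denominator sum(weights)."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(xs, weights)) / total
    return sum(w * (x - mean) ** 2 for x, w in zip(xs, weights)) / total


xs = [2.0, 4.0, 6.0]
print(sample_variance(xs))                # 4.0
print(weighted_variance(xs, [1, 1, 1]))   # 8/3, about 2.667
```

One consequence of this design is that passing unit weights does not reproduce the unweighted result: the weighted path divides by 3 here, while the unweighted path divides by 2.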


iqr(x, na_rm = false, weights = NA) / mad(x, na_rm = false)

Interquartile range and Median Absolute Deviation (scaled by 1.4826). iqr accepts optional non-negative observation weights.
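
As an illustration of the scaling, the unweighted Median Absolute Deviation can be sketched in Python as below. The constant 1.4826 comes from the description above; the handling of NA and the exact median convention used by T are not modeled.

```python
from statistics import median


def mad(xs, scale=1.4826):
    """Scaled median of absolute deviations from the median."""
    center = median(xs)
    deviations = [abs(x - center) for x in xs]
    return scale * median(deviations)


print(mad([1, 2, 3, 4, 100]))   # 1.4826
```

The outlier 100 barely moves the result, which is why MAD is often preferred to the standard deviation for contaminated data.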


fivenum(x, na_rm = false, weights = NA)

Tukey’s five-number summary (min, lower-hinge, median, upper-hinge, max), with optional non-negative observation weights.


skewness(x, na_rm = false, weights = NA) / kurtosis(x, na_rm = false, weights = NA)

Skewness and excess kurtosis, with optional non-negative observation weights.


trimmed_mean(x, trim = 0.1, na_rm = false, weights = NA)

Mean calculated after trimming a fraction of observations from each end. Optional weights affect the trim cut points and the retained mean.


quantile(x, p, na_rm = false, weights = NA)

Compute quantile/percentile (p between 0 and 1), with optional non-negative observation weights.


mode(x)

Return the most frequent value. Does not currently support na_rm.


Data Transformation

normalize(x) / standardize(x) / scale(x)

Rescale or center numeric data. scale and standardize compute z-scores. normalize scales to [0, 1].


winsorize(x, limits = [0.05, 0.05], na_rm = false, weights = NA)

Clamp values using one limit or a two-element vector [lower_tail_fraction, upper_tail_fraction], with each fraction in [0, 0.5). Optional weights affect the cut points only; zero-weight observations remain in the output.


huber_loss(actual, predicted, delta = 1.0)

Compute the Huber loss between two vectors.


Distributions (CDFs)

pnorm(x) / pt(x, df) / pf(q, df1, df2) / pchisq(q, df)

Cumulative Distribution Functions for Normal, Student-t, F, and Chi-squared distributions.


Modeling

lm(data, formula, weights = NA)

Fit a linear regression model. Without weights this is OLS; with weights it performs weighted least squares. The formula is a Formula value such as mpg ~ wt + hp.


summary(model) / fit_stats(model)

summary(model) returns a Dict containing a _tidy_df DataFrame plus metadata; fit_stats(model) returns a DataFrame of model-level metrics.


predict(data, model) / score(data, model)

Perform vectorized prediction on new data. score is an alias for predict.


add_diagnostics(model, data)

Augment data with per-observation diagnostics: .fitted, .resid, .hat, .sigma, .cooksd, and .std.resid.


anova(model1, model2, ...)

Compare multiple nested models using an ANOVA table.


coef(model) / residuals(model) / vcov(model) / df_residual(model)

Extract model components.


wald_test(model, terms)

Perform a Wald test for a joint hypothesis on coefficients.


t_read_onnx(path) / t_read_pmml(path)

Import pre-trained models from ONNX or PMML formats for native scoring. Julia nodes can also consume ^onnx artifacts through ONNXRunTime.jl; ONNX export from Julia remains explicitly unsupported.


cut(x, breaks, ...) / poly(x, degree, ...)

Basis functions for modeling.



DataFrame Package

CSV I/O and DataFrame introspection.

to_dataframe(data)

Constructs a DataFrame from either a list of rows (Dictionaries) or a Dictionary of columns (Vectors/Lists).

Parameters:

Returns:

DataFrame

Examples:

-- Column-wise
df = to_dataframe([x: [1, 2], y: [3, 4]])

-- Row-wise
df = to_dataframe([
  [name: "Alice", age: 30],
  [name: "Bob", age: 25]
])

read_csv(path, separator = ",", skip_lines = 0, skip_header = false, clean_colnames = false)

Read a CSV file into a DataFrame.

Parameters:

Returns:

DataFrame


read_parquet(path)

Read a Parquet file into a DataFrame.


read_arrow(path) / write_arrow(df, path)

Read or write Arrow IPC files.


write_csv(df, path, separator = ",")

Write a DataFrame to a CSV file.


nrow(df) / ncol(df)

Get number of rows or columns.


colnames(df)

Get column names as a List of strings.


clean_colnames(x)

Standardizes column names using a snake_case convention. Works on DataFrames or Lists of strings.


glimpse(df)

Prints a summary of the DataFrame structure, including dimensions, column names, types, and first few values.


pull(df, column)

Extracts a single column as a Vector.


to_array(df, columns = NA)

Converts numeric columns of a DataFrame to a matrix (NDArray).


ncol(df)

Get number of columns.

Parameters:

Returns:

Int — Column count

Examples:

ncol(df)  -- 5

colnames(df)

Get column names.

Parameters:

Returns:

List of Strings

Examples:

colnames(df)  -- ["name", "age", "dept", "salary"]

glimpse(df)

Get a compact overview of a DataFrame, showing column names, types, and example values. Similar to dplyr’s glimpse().

Parameters:

Returns:

Dict with kind, nrow, ncol, and columns (list of column summaries)

Examples:

glimpse(df)
-- {`kind`: "dataframe", `nrow`: 100, `ncol`: 4, `columns`: ["name <String> ...", "age <Int> ...", ...]}

clean_colnames(df) / clean_colnames(names)

Normalize column names to safe identifiers.

Parameters:

Returns:

DataFrame with cleaned names, OR List of cleaned Strings

Transformations:

  1. Symbol expansion: % → percent, € → euro, $ → dollar, etc.
  2. Diacritics removal: café → cafe
  3. Lowercase
  4. Non-alphanumeric → _, collapse runs
  5. Prefix leading digits with x_: 1st → x_1st
  6. Empty → col_N
  7. Collision resolution: _2, _3, etc.

Examples:

clean_colnames(["Growth%", "MILLION€", "café"])
-- ["growth_percent", "million_euro", "cafe"]

clean_colnames(["A.1", "A-1"])
-- ["a_1", "a_1_2"]  (collision resolved)

df2 = clean_colnames(df)  -- DataFrame with cleaned column names
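
The cleaning steps can be chained as in the following Python sketch, which reproduces the List examples above. It is a model under assumptions, not T's implementation: the symbol table contains only the three symbols named in this guide, leading and trailing underscores are trimmed (implied by the examples but not stated), and collision suffixes are not re-checked against existing names.

```python
import re
import unicodedata

# Assumed symbol table: only the symbols named in this guide.
SYMBOLS = {"%": "percent", "€": "euro", "$": "dollar"}


def clean_colnames(names):
    cleaned, seen = [], {}
    for position, name in enumerate(names, start=1):
        # 1. Symbol expansion, padded so the word becomes its own token.
        for symbol, word in SYMBOLS.items():
            name = name.replace(symbol, f"_{word}_")
        # 2. Diacritics removal via Unicode decomposition.
        name = unicodedata.normalize("NFKD", name)
        name = "".join(ch for ch in name if not unicodedata.combining(ch))
        # 3. Lowercase.
        name = name.lower()
        # 4. Non-alphanumeric runs become one underscore; trim the ends (assumed).
        name = re.sub(r"[^a-z0-9]+", "_", name).strip("_")
        # 5. Prefix names that start with a digit.
        if name[:1].isdigit():
            name = "x_" + name
        # 6. Empty names fall back to col_N, N being the 1-based position.
        if name == "":
            name = f"col_{position}"
        # 7. Collision resolution: second occurrence gets _2, third _3, ...
        seen[name] = seen.get(name, 0) + 1
        if seen[name] > 1:
            name = f"{name}_{seen[name]}"
        cleaned.append(name)
    return cleaned


print(clean_colnames(["Growth%", "MILLION€", "café"]))
# ['growth_percent', 'million_euro', 'cafe']
```

The order of the steps matters: symbols are expanded before the non-alphanumeric pass, otherwise % and € would simply collapse into underscores.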

Colcraft Package

Data manipulation verbs and window functions.

Data Verbs

select(df, ...columns)

Select columns by name. Supports dollar-prefix NSE syntax.

Parameters:

Returns:

DataFrame with selected columns

Examples:

df |> select($name, $age)
df |> select($dept)

filter(df, predicate)

Filter rows by condition. Supports NSE expressions with dollar-prefix column references.

Parameters:

Returns:

DataFrame with matching rows

Examples:

df |> filter($age > 30)
df |> filter($dept == "Engineering")
df |> filter($salary > 50000 and $active == true)

mutate(df, $col = expr) / mutate(df, new_col, fn)

Add or transform a column. Supports $col = expr named-arg syntax with NSE.

Parameters (named-arg form):

  - df: DataFrame
  - $col = expr: column name taken from $col, value from the NSE expression

Parameters (positional form):

  - df: DataFrame
  - new_col: column reference ($bonus)
  - fn: function taking a row dict, \(row) ..., OR
  - value: constant value for all rows

Returns:

DataFrame with new/modified column

Examples:

-- Named-arg NSE syntax
df |> mutate($bonus = $salary * 0.1)
df |> mutate($age_next_year = $age + 1)

-- Positional NSE with lambda
df |> mutate($bonus, \(row) row.salary * 0.1)

-- Grouped mutate (broadcast group result)
df |> group_by($dept) |> mutate($dept_size, \(g) nrow(g))

arrange(df, column, direction = "asc")

Sort rows by column. Supports dollar-prefix NSE for column names.

Parameters:

Returns:

Sorted DataFrame

Examples:

df |> arrange($age)
df |> arrange($salary, "desc")

group_by(df, ...columns)

Group by one or more columns. Supports dollar-prefix NSE for column names.

Parameters:

Returns:

Grouped DataFrame

Usage:

-- Use with summarize to aggregate
df |> group_by($dept) |> summarize($avg_salary, \(g) mean(g.salary))

-- Use with mutate to broadcast group results
df |> group_by($dept) |> mutate($dept_count, \(g) nrow(g))

Examples:

df |> group_by($dept)
df |> group_by($dept, $location)

summarize(grouped_df, $col = expr) / summarize(grouped_df, new_col, fn)

Aggregate grouped data. Supports $col = expr named-arg syntax with NSE.

Parameters (named-arg form):

  - grouped_df: Grouped DataFrame (from group_by())
  - $col = expr: column name taken from $col, aggregation from the NSE expression (e.g. sum($amount))

Parameters (positional form):

  - grouped_df: Grouped DataFrame (from group_by())
  - new_col: column reference ($count)
  - fn: aggregation function, \(group) ...

Returns:

DataFrame with one row per group

Examples:

-- Named-arg NSE syntax
df |> group_by($dept) |> summarize($count = nrow($dept))
df |> group_by($dept) |> summarize($avg_salary = mean($salary))
df |> group_by($region) |> summarize($total_sales = sum($sales), $n = nrow($region))

-- Positional NSE with lambda
df |> group_by($dept) |> summarize($count, \(g) nrow(g))

ungroup(grouped_df)

Remove grouping from a DataFrame.

Parameters:

Returns:

Ungrouped DataFrame

Examples:

ungrouped = df |> group_by($dept) |> ungroup()

Join and Bind Functions

left_join(x, y, by = NA) / inner_join / full_join / semi_join / anti_join

Join two DataFrames.

Parameters:

Returns:

Joined DataFrame


bind_rows(...) / bind_cols(...)

Combine multiple DataFrames by stacking rows or placing columns side-by-side.


Wrangling Utilities

count(df, ...columns)

Count occurrences of unique values.


distinct(df, ...columns)

Keep only unique rows.


drop_na(df, ...columns)

Drop rows containing NA values in the specified columns.


replace_na(df, values)

Replace NA values with specified defaults.


rename(df, ...new_name = old_name)

Rename columns.


relocate(df, ...columns, before = NA, after = NA)

Change column order.


slice(df, ...indices) / slice_min(df, col, n = 1) / slice_max(df, col, n = 1)

Subset rows by position or extreme values.


pivot_longer(df, cols, names_to = "name", values_to = "value")

pivot_wider(df, names_from = "name", values_from = "value")

Reshape DataFrames between long and wide formats.


separate(df, col, into, sep = "[^a-zA-Z0-9]+") / unite(df, col, ...from, sep = "_")

Split a column into multiple columns, or combine multiple columns into one.


Factor Manipulation

to_factor(x, levels = NA, ordered = false)

Create factor-encoded vectors. Derives unique levels alphabetically if levels is not provided.


levels(f)

Get the level labels of a factor.


fct_recode(f, ...new = old) / fct_relevel(f, ...levels, after = 0)

Rename or reorder factor levels.


fct_lump_n(f, n, other_level = "Other") / fct_lump_min / fct_lump_prop

Collapse infrequent levels into an “Other” category.


fct_infreq(f) / fct_rev(f) / fct_reorder(f, x, .desc = false)

Reorder levels by frequency, reversal, or summary of another vector.


Aggregation Context

n()

Returns the number of rows in the current group. Only valid inside summarize().


n_distinct(x)

Returns the number of unique non-NA values.


Window Functions

Window functions compute values across rows without collapsing them.

Ranking Functions

row_number(vector)

Assign unique row numbers.

Parameters:

Returns:

Vector of row numbers (1, 2, 3, …), NA for NA positions

Examples:

row_number([10, 30, 20])     -- Vector[1, 3, 2]
row_number([3, NA, 1])       -- Vector[2, NA, 1]

min_rank(vector)

Minimum rank (gaps after ties).

Parameters:

Returns:

Vector of ranks

Examples:

min_rank([1, 1, 2, 2, 2])    -- Vector[1, 1, 3, 3, 3]
min_rank([3, NA, 1, 3])      -- Vector[2, NA, 1, 2]

dense_rank(vector)

Dense rank (no gaps).

Parameters:

Returns:

Vector of ranks

Examples:

dense_rank([1, 1, 2, 2])     -- Vector[1, 1, 2, 2]
dense_rank([10, 10, 20])     -- Vector[1, 1, 2]
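
The difference between the two tie-handling rules can be modeled as follows. This Python sketch reproduces the examples above, with None standing in for NA; it illustrates the ranking rules and is not T's implementation.

```python
def min_rank(values):
    """Rank with gaps after ties; None (NA) stays None."""
    present = [v for v in values if v is not None]
    # A value's rank is one more than the count of strictly smaller values.
    return [None if v is None else 1 + sum(other < v for other in present)
            for v in values]


def dense_rank(values):
    """Rank without gaps; None (NA) stays None."""
    distinct = sorted({v for v in values if v is not None})
    position = {v: i + 1 for i, v in enumerate(distinct)}
    return [None if v is None else position[v] for v in values]


print(min_rank([1, 1, 2, 2, 2]))   # [1, 1, 3, 3, 3]
print(dense_rank([1, 1, 2, 2]))    # [1, 1, 2, 2]
```

min_rank counts how many values are smaller, so two tied leaders push the next value to rank 3; dense_rank counts how many distinct values are smaller, so the next value gets rank 2.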

cume_dist(vector)

Cumulative distribution (proportion ≤ value).

Parameters:

Returns:

Vector of Float (0.0 to 1.0)

Examples:

cume_dist([1, 2, 3])         -- Vector[0.333..., 0.666..., 1.0]

percent_rank(vector)

Percent rank ((rank - 1) / (n - 1)).

Parameters:

Returns:

Vector of Float (0.0 to 1.0)

Examples:

percent_rank([1, 2, 3])      -- Vector[0.0, 0.5, 1.0]
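
Both proportions follow directly from counting. The Python sketch below assumes that the rank in the percent_rank formula is the minimum rank, that the input has no NA values, and that it has at least two elements; none of these points is stated explicitly above.

```python
def cume_dist(values):
    """Proportion of values less than or equal to each value."""
    n = len(values)
    return [sum(other <= v for other in values) / n for v in values]


def percent_rank(values):
    """(min_rank - 1) / (n - 1); min_rank - 1 is the count of smaller values."""
    n = len(values)
    return [sum(other < v for other in values) / (n - 1) for v in values]


print(percent_rank([1, 2, 3]))   # [0.0, 0.5, 1.0]
```

cume_dist never returns 0.0 because every value is at least equal to itself, whereas percent_rank always assigns 0.0 to the smallest value.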

ntile(vector, n)

Divide into n groups.

Parameters:

Returns:

Vector of group numbers (1 to n)

Examples:

ntile([1, 2, 3, 4], 2)       -- Vector[1, 1, 2, 2]
ntile([1, 2, 3, 4, 5], 3)    -- Vector[1, 1, 2, 2, 3]
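
The second example shows that when the length is not a multiple of n, the earlier groups receive the extra elements. A Python sketch of that bucketing rule follows; it assumes values are bucketed by ascending order and contain no NA, and it is not T's implementation.

```python
def ntile(values, n):
    """Assign each value to one of n buckets by rank; larger buckets come first."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    size, extra = divmod(len(values), n)
    buckets = [None] * len(values)
    start = 0
    for bucket in range(1, n + 1):
        # The first `extra` buckets hold one additional element.
        count = size + (1 if bucket <= extra else 0)
        for i in order[start:start + count]:
            buckets[i] = bucket
        start += count
    return buckets


print(ntile([1, 2, 3, 4, 5], 3))   # [1, 1, 2, 2, 3]
print(ntile([30, 10, 20], 3))      # [3, 1, 2]
```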

Offset Functions

lag(vector, n = 1)

Shift values forward (add NA at start).

Parameters:

Returns:

Vector with shifted values

Examples:

lag([1, 2, 3, 4])            -- Vector[NA, 1, 2, 3]
lag([1, 2, 3, 4], 2)         -- Vector[NA, NA, 1, 2]
lag([1, NA, 3])              -- Vector[NA, 1, NA]

lead(vector, n = 1)

Shift values backward (add NA at end).

Parameters:

Returns:

Vector with shifted values

Examples:

lead([1, 2, 3, 4])           -- Vector[2, 3, 4, NA]
lead([1, 2, 3, 4], 2)        -- Vector[3, 4, NA, NA]
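
Both offset functions keep the length of the input and pad with NA. The Python sketch below models that behavior with None standing in for NA; clamping n to the input length is an assumption about offsets larger than the vector.

```python
def lag(values, n=1):
    """Shift values toward the end, padding the start with None (NA)."""
    n = min(n, len(values))   # assumption: oversized offsets yield all NA
    return [None] * n + values[:len(values) - n]


def lead(values, n=1):
    """Shift values toward the start, padding the end with None (NA)."""
    n = min(n, len(values))
    return values[n:] + [None] * n


print(lag([1, 2, 3, 4]))       # [None, 1, 2, 3]
print(lead([1, 2, 3, 4], 2))   # [3, 4, None, None]
```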

Cumulative Functions

NA propagates: once NA is encountered, all subsequent values become NA.

cumsum(vector)

Cumulative sum.

Examples:

cumsum([1, 2, 3, 4])         -- Vector[1, 3, 6, 10]
cumsum([1, NA, 3])           -- Vector[1, NA, NA]

cummin(vector)

Cumulative minimum.

Examples:

cummin([3, 1, 4, 1])         -- Vector[3, 1, 1, 1]

cummax(vector)

Cumulative maximum.

Examples:

cummax([1, 3, 2, 5])         -- Vector[1, 3, 3, 5]

cummean(vector)

Cumulative mean.

Examples:

cummean([2, 4, 6])           -- Vector[2.0, 3.0, 4.0]

cumall(vector)

Cumulative AND (all true so far?).

Examples:

cumall([true, true, false])  -- Vector[true, true, false]

cumany(vector)

Cumulative OR (any true so far?).

Examples:

cumany([false, true, false]) -- Vector[false, true, true]
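
The cumulative functions above share one pattern: a running fold in which the first NA turns every later result into NA. The Python sketch below models that pattern with None standing in for NA; the helper name cumulative is hypothetical and this is not T's implementation.

```python
def cumulative(values, combine):
    """Running fold in which the first None (NA) poisons every later result."""
    out, acc, poisoned = [], None, False
    for v in values:
        if poisoned or v is None:
            poisoned = True
            out.append(None)
            continue
        # The first element starts the accumulator; later ones are combined.
        acc = v if acc is None else combine(acc, v)
        out.append(acc)
    return out


def cumsum(values):
    return cumulative(values, lambda a, b: a + b)


def cummin(values):
    return cumulative(values, min)


def cummax(values):
    return cumulative(values, max)


def cumall(values):
    return cumulative(values, lambda a, b: a and b)


def cumany(values):
    return cumulative(values, lambda a, b: a or b)


print(cumsum([1, None, 3]))           # [1, None, None]
print(cumany([False, True, False]))   # [False, True, True]
```

Only the combining function differs between the five wrappers, which is why NA handling is identical across them. cummean is omitted because it divides a running sum by a running count and does not fit the two-argument fold.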

Chrono Package

High-performance date and time manipulation, inspired by R’s lubridate.

to_date(value) / to_datetime(value)

Convert values to Date or Datetime types.

Parameters:

Returns:

Date / Datetime / Collection

Examples:

to_date("2023-05-15")  -- 2023-05-15
to_datetime("2023-05-15 14:00:00")

ymd(string) / mdy(string) / dmy(string) / ydm(string)

ymd_h(string) / ymd_hm(string) / ymd_hms(string)

Parse strings into dates or datetimes using common layouts.

Parameters:

Returns:

Date / Datetime

Examples:

ymd("2023-05-15")
mdy("05-15-2023")
ymd_hms("2023-05-15 14:30:05")

parse_date(string, format) / parse_datetime(string, format, tz = "UTC")

Parse strings into temporal values using explicit strptime-style formats.

Parameters:

Returns:

Date / Datetime


today() / now(tz = "UTC")

today() returns the current UTC date; now() returns the current datetime in the given timezone (UTC by default).

Returns:

Date / Datetime


year(x) / month(x, label = false) / day(x)

yday(x) / wday(x, label = false, week_start = 7) / week(x) / isoweek(x) / isoyear(x)

quarter(x) / semester(x)

Extract calendar components from Date or Datetime values.

Parameters:

Returns:

Int / String


hour(x) / minute(x) / second(x) / tz(x)

Extract time-of-day components or timezone labels from Datetime values.

Returns:

Int / Float / String


am(x) / pm(x)

Check whether a time is before or after noon.

Returns:

Bool


floor_date(datetime, unit) / ceiling_date(datetime, unit) / round_date(datetime, unit)

Round a date/datetime to a unit boundary (year, month, day, hour, etc.). floor_date rounds down, ceiling_date rounds up, and round_date rounds to the nearest boundary.

Parameters:

Returns:

Same as input type

Examples:

floor_date(to_date("2023-05-15"), "month")  -- 2023-05-01
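
The three rounding directions can be modelled in Python as follows. This is a sketch under stated assumptions: it supports only four units, returns a value unchanged if it already sits on a boundary, and sends ties to the upper boundary. T's actual behaviour in those edge cases is not documented here.

```python
from datetime import datetime, timedelta

def floor_date(x, unit):
    # Zero out every field smaller than the requested unit.
    if unit == "year":
        return x.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return x.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return x.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return x.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")

def next_boundary(lower, unit):
    # Start of the unit that follows `lower` (which must already be a boundary).
    if unit == "year":
        return lower.replace(year=lower.year + 1)
    if unit == "month":
        if lower.month == 12:
            return lower.replace(year=lower.year + 1, month=1)
        return lower.replace(month=lower.month + 1)
    return lower + (timedelta(days=1) if unit == "day" else timedelta(hours=1))

def ceiling_date(x, unit):
    lower = floor_date(x, unit)
    return lower if lower == x else next_boundary(lower, unit)

def round_date(x, unit):
    lower = floor_date(x, unit)
    upper = next_boundary(lower, unit)
    # Ties go to the upper boundary in this sketch.
    return lower if x - lower < upper - x else upper

x = datetime(2023, 5, 15, 14, 30)
print(floor_date(x, "month"))    # 2023-05-01 00:00:00
print(ceiling_date(x, "month"))  # 2023-06-01 00:00:00
print(round_date(x, "day"))      # 2023-05-16 00:00:00
```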

make_date(year, month, day) / make_datetime(year, month, day, hour, min, sec, tz)

Construct temporal values from numeric components.


format_date(x, format) / format_datetime(x, format)

Format temporal values as strings using strftime-style patterns.


interval(start, end) / %within%(x, interval)

Construct temporal intervals and test membership.


years(n) / months(n) / weeks(n) / days(n) / hours(n) / minutes(n) / seconds(n)

Construct Period objects for date arithmetic.


is_date(x) / is_datetime(x) / is_period(x) / is_duration(x) / is_interval(x)

Type predicates for temporal values.


is_leap_year(x) / days_in_month(x)

Calendar helpers.


with_tz(x, tz) / force_tz(x, tz)

Update the timezone label of a Datetime value.
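
In lubridate, which this package is modelled on, the two functions differ in what they preserve: with_tz keeps the instant and changes the displayed clock time, while force_tz keeps the clock time and changes the instant. The Python sketch below illustrates that distinction on the assumption that T follows the same convention. The fixed-offset EST zone is a stand-in for a real timezone name.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
EST = timezone(timedelta(hours=-5), "EST")  # fixed offset, for illustration only

def with_tz(x, tz):
    # Same instant, shown on a different clock.
    return x.astimezone(tz)

def force_tz(x, tz):
    # Same clock reading, reinterpreted in another zone (a different instant).
    return x.replace(tzinfo=tz)

noon_utc = datetime(2023, 1, 15, 12, 0, tzinfo=UTC)
print(with_tz(noon_utc, EST))   # 2023-01-15 07:00:00-05:00
print(force_tz(noon_utc, EST))  # 2023-01-15 12:00:00-05:00
```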


Strcraft Package

Modern string manipulation utilities, inspired by R’s stringr.

str_replace(string, pattern, replacement) / replace_first(string, pattern, replacement)

Replace occurrences of a pattern. str_replace replaces all occurrences (global replace); replace_first replaces only the first occurrence.
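
The difference between the two functions is only the replacement count. The Python sketch below models it with the standard `re` module, assuming patterns are regular expressions as they are in stringr. Whether T uses the same regex dialect is not stated here.

```python
import re

def str_replace(string, pattern, replacement):
    # Replace every match of the pattern.
    return re.sub(pattern, replacement, string)

def replace_first(string, pattern, replacement):
    # Replace only the leftmost match.
    return re.sub(pattern, replacement, string, count=1)

print(str_replace("a-b-c", "-", "+"))    # a+b+c
print(replace_first("a-b-c", "-", "+"))  # a+b-c
```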


str_detect(string, pattern) / contains(s, sub)

Check if a pattern or substring exists.


starts_with(s, prefix) / ends_with(s, suffix)

Check string boundaries.


str_extract(s, pattern) / str_extract_all(s, pattern)

Extract matching substrings. str_extract returns the first match; str_extract_all returns a List of all matches.


str_count(s, pattern) / str_nchar(s)

Count matches or total characters.


str_trim(s) / trim_start(s) / trim_end(s)

Remove whitespace.


str_lines(s) / str_words(s) / str_split(s, sep)

Split strings into parts. str_lines splits on newlines; str_words splits on any whitespace.


str_pad(s, width, side = "left", pad = " ")

Pad strings to a fixed width.


str_trunc(s, width, side = "right", ellipsis = "...")

Truncate strings with an ellipsis.
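
The Python sketch below models padding and truncation with the defaults shown in the two signatures above. It assumes stringr's conventions: side = "left" adds padding before the text, and the truncation width includes the ellipsis. Only the "left" and "right" sides are covered.

```python
def str_pad(s, width, side="left", pad=" "):
    # Strings already at least `width` long are returned unchanged.
    if side == "left":
        return s.rjust(width, pad)
    if side == "right":
        return s.ljust(width, pad)
    raise ValueError(f"unsupported side: {side}")

def str_trunc(s, width, side="right", ellipsis="..."):
    if len(s) <= width:
        return s
    keep = width - len(ellipsis)  # characters of the original that survive
    if keep < 0:
        raise ValueError("width is shorter than the ellipsis")
    if side == "right":
        return s[:keep] + ellipsis
    if side == "left":
        return ellipsis + s[len(s) - keep:]
    raise ValueError(f"unsupported side: {side}")

print(str_pad("7", 3, pad="0"))                  # 007
print(str_trunc("abcdefghij", 6))                # abc...
print(str_trunc("abcdefghij", 6, side="left"))   # ...hij
```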


str_flatten(values, collapse = "") / str_join(items, sep = "")

Combine multiple strings into one.


to_lower(s) / to_upper(s)

Case normalization.


str_repeat(s, n)

Repeat a string n times.


str_format(fmt, values) / str_sprintf(fmt, ...)

String interpolation and formatting. str_format uses {name} placeholders with a Dictionary or named List; str_sprintf uses C-style % specifiers.
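
The two formatting styles correspond closely to Python's own: named placeholders filled from a mapping, and positional C-style specifiers. The sketch below uses Python's built-in formatting as a stand-in, so the exact set of specifiers T accepts may differ.

```python
def str_format(fmt, values):
    # {name} placeholders are looked up by key in the mapping.
    return fmt.format(**values)

def str_sprintf(fmt, *args):
    # C-style % specifiers are filled positionally.
    return fmt % args

print(str_format("Hello, {name}! You are {age}.", {"name": "alice", "age": 32}))
# Hello, alice! You are 32.
print(str_sprintf("%s has %d items", "cart", 3))
# cart has 3 items
print(str_sprintf("%.2f", 3.14159))
# 3.14
```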



Lens Package

Composable access and update lenses for dictionaries, lists, data frames, and pipeline inspection.

For the full walkthrough and worked examples, see the Lens guide. For the generated per-function entries, see the Function Reference.

col_lens(name) / idx_lens(i) / row_lens(i)

Focus on a dictionary key/column, a list index, or a DataFrame row.


filter_lens(predicate)

Focus on elements matching a predicate (supports DataFrames, Lists, and Vectors).


node_lens(name) / node_meta_lens(name, field) / env_var_lens(node, var)

Focus on pipeline nodes, their metadata, or environment variables.


compose(...lenses)

Combine multiple lenses into a deep traversal.


get(data, lens) / set(data, lens, value) / over(data, lens, fn)

Read, write, or transform data at the focused location.


modify(data, ...pairs)

Apply a sequence of (lens, function) pairs to the same data structure.
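
A lens can be thought of as a getter paired with a setter, and composition chains them so that a nested location is read or updated in one step. The Python sketch below models that idea with dictionaries and lists. The `Lens` class and the copy-on-update behaviour are assumptions made for illustration, not a description of T's internals, and `set_` is spelled with an underscore only to avoid shadowing Python's built-in `set`.

```python
from functools import reduce

class Lens:
    def __init__(self, getter, setter):
        self.getter = getter  # getter(data) -> focused value
        self.setter = setter  # setter(data, value) -> updated copy

def col_lens(name):
    return Lens(lambda d: d[name], lambda d, v: {**d, name: v})

def idx_lens(i):
    return Lens(lambda xs: xs[i], lambda xs, v: xs[:i] + [v] + xs[i + 1:])

def compose(*lenses):
    def pair(outer, inner):
        return Lens(
            lambda d: inner.getter(outer.getter(d)),
            # Update the inner focus, then write the result back through the outer lens.
            lambda d, v: outer.setter(d, inner.setter(outer.getter(d), v)),
        )
    return reduce(pair, lenses)

def get(data, lens):
    return lens.getter(data)

def set_(data, lens, value):
    return lens.setter(data, value)

def over(data, lens, fn):
    return lens.setter(data, fn(lens.getter(data)))

def modify(data, *pairs):
    # Apply each (lens, function) pair in turn to the same structure.
    for lens, fn in pairs:
        data = over(data, lens, fn)
    return data

record = {"name": "alice", "scores": [1, 2, 3]}
second_score = compose(col_lens("scores"), idx_lens(1))

print(get(record, second_score))                       # 2
print(set_(record, second_score, 20))                  # {'name': 'alice', 'scores': [1, 20, 3]}
print(over(record, second_score, lambda x: x * 10))    # {'name': 'alice', 'scores': [1, 20, 3]}
print(modify(record, (col_lens("name"), str.upper), (second_score, lambda x: x + 1)))
# {'name': 'ALICE', 'scores': [1, 3, 3]}
```

Because every setter returns a new structure, `record` itself is left untouched by all four calls.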


Pipeline Package

Pipeline introspection and management.

pipeline(...)

Constructs a Pipeline from a Dictionary of named nodes or a List of node records.


build_pipeline(p, verbose = 0) / populate_pipeline(p, build = false)

Materialize a pipeline to Nix artifacts. build_pipeline is the primary entry point for full Nix builds and returns a BuildLog value (nodes, duration, failed_nodes, out_path). populate_pipeline with build = false generates the Nix expression without building it.


read_pipeline(p) / inspect_pipeline(p)

read_pipeline returns a dictionary with node metadata and a diagnostics summary. inspect_pipeline focuses on the DAG structure (edges).


read_node(node, which_log = NA)

Retrieves the dynamically evaluated or built artifact of a node. Strictly expects a ComputedNode object (e.g. p.node_name).


filter_node(p, predicate) / select_node(p, ...)

Subsetting nodes in a pipeline. filter_node keeps nodes matching a condition; select_node picks nodes by name.


mutate_node(p, ...) / rename_node(p, ...)

Modify nodes within a pipeline. mutate_node can redefine or add nodes; rename_node changes node labels while preserving dependencies.


arrange_node(p, ...)

Reorders nodes in the pipeline definition (does not affect execution order, which is DAG-driven).
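
To see why definition order does not change execution order, consider the Python sketch below. It derives a run order purely from the dependency edges using the standard library's `graphlib`. The `deps` mapping is a hypothetical stand-in for a pipeline, and T's scheduler is not claimed to work this way internally.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: node name -> names of the nodes it depends on.
# "z" is deliberately defined first.
deps = {"z": ["x", "y"], "y": [], "x": []}

order = list(TopologicalSorter(deps).static_order())
print(order)  # "z" always comes last, after both of its dependencies
```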


trace_nodes(p, node_names)

Returns a sub-pipeline containing only the specified nodes and all their recursive dependencies.
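
The "recursive dependencies" part amounts to a graph traversal: start from the requested nodes and keep following dependency edges until nothing new is found. The Python sketch below models this on a hypothetical dependency mapping. The node names are invented for illustration.

```python
def trace_nodes(deps, node_names):
    keep, stack = set(), list(node_names)
    while stack:
        name = stack.pop()
        if name not in keep:
            keep.add(name)
            stack.extend(deps[name])  # visit this node's dependencies next
    # Preserve the original definition order in the sub-pipeline.
    return {name: deps[name] for name in deps if name in keep}

# Hypothetical pipeline: node name -> direct dependencies.
deps = {
    "raw": [],
    "clean": ["raw"],
    "model": ["clean"],
    "report": ["model"],
    "audit": ["raw"],
}

print(trace_nodes(deps, ["model"]))
# {'raw': [], 'clean': ['raw'], 'model': ['clean']}
```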


which_nodes(p, predicate)

Filter the richer node records from read_pipeline(p).nodes without manually writing read_pipeline, compose, or an explicit lambda.


errored_nodes(p)

Convenience wrapper returning the subset of node records whose diagnostics.error is not NA.

node(command, script = NA, runtime = "T", serializer = "default", deserializer = "default", env_vars = [:], args = [:], shell = NA, shell_args = [], functions = [], include = [], noop = false)

Configure execution settings for a pipeline node, such as the runtime and custom serialization methods.

Parameters:

Returns:

A pipeline node configuration object (NodeDef). Must be used as a named binding inside a pipeline { ... } block; the node code is executed by the pipeline builder, not immediately.

Examples:

p = pipeline {
  y = node(command = x + 5, runtime = T)
  z = node(
    command = build_model(y),
    runtime = R,
    functions = ["utils.R"],
    include = "config.yml"
  )
}

py(command, script = NA, serializer = "default", deserializer = "default", env_vars = [:], functions = [], include = [], noop = false)

pyn(command, script = NA, serializer = "default", deserializer = "default", env_vars = [:], functions = [], include = [], noop = false)

Configure a Python Pipeline Node. A convenience wrapper around node() with runtime = "Python". Used directly within a pipeline { ... } block to execute Python code.

Parameters:

Returns:

A pipeline node configuration object (NodeDef). Must be used as a named binding inside a pipeline { ... } block; the Python code is executed by the pipeline builder, not immediately.


rn(command, script = NA, serializer = "default", deserializer = "default", env_vars = [:], functions = [], include = [], noop = false)

Configure an R Pipeline Node. A convenience wrapper around node() with runtime = "R". Used directly within a pipeline { ... } block to execute R code.

Parameters:

Returns:

A pipeline node configuration object (NodeDef). Must be used as a named binding inside a pipeline { ... } block; the R code is executed by the pipeline builder, not immediately.


jln(command, script = NA, serializer = "default", deserializer = "default", env_vars = [:], functions = [], include = [], noop = false)

Configure a Julia Pipeline Node. A convenience wrapper around node() with runtime = "Julia". Used directly within a pipeline { ... } block to execute Julia code.

Parameters:

Returns:

A pipeline node configuration object (NodeDef). Must be used as a named binding inside a pipeline { ... } block; the Julia code is executed by the pipeline builder, not immediately.


qn(script = NA, serializer = "default", deserializer = "default", env_vars = [:], args = [:], functions = [], include = [], noop = false)

Configure a Quarto pipeline node. A convenience wrapper around node() with runtime = "Quarto". Use it to render .qmd files inside pipeline { ... } blocks.

Parameters:

Returns:

A pipeline node configuration object (NodeDef). Must be used as a named binding inside a pipeline { ... } block; the Quarto document is rendered by the pipeline builder, not immediately.


shn(command, script = NA, serializer = "text", deserializer = "default", env_vars = [:], args = [], shell = "sh", shell_args = [], functions = [], include = [], noop = false)

Configure a shell pipeline node. A convenience wrapper around node() with runtime = "sh". Use it for CLI tools, inline shell scripts, and .sh files inside pipeline { ... } blocks.

Parameters:

Returns:

A pipeline node configuration object (NodeDef). Must be used as a named binding inside a pipeline { ... } block; the shell command is executed by the pipeline builder, not immediately.

suppress_warnings(value)

Silence diagnostic warnings for a pipeline node while maintaining auditability in the background metadata.

Parameters:

Returns:

The original value, but with a signal to the evaluator to suppress console warnings for the currently executing node.

Examples:

p = pipeline {
  -- Silence warnings from a high-noise filter
  filtered = raw 
    |> filter($amount > 100) 
    |> suppress_warnings
}

pipeline_nodes(pipeline)

Get all node names in a pipeline.

Parameters:

Returns:

List of Strings (node names)

Examples:

p = pipeline { x = 1; y = 2; z = x + y }
pipeline_nodes(p)  -- ["x", "y", "z"]

pipeline_deps(pipeline, node_name)

Get dependencies of a specific node.

Parameters:

Returns:

List of Strings (dependency names)

Examples:

p = pipeline { x = 1; y = 2; z = x + y }
pipeline_deps(p, "z")  -- ["x", "y"]
pipeline_deps(p, "x")  -- []

pipeline_node(pipeline, node_name)

Get the value of a specific node.

Parameters:

Returns:

Node value

Examples:

p = pipeline { x = 10; doubled = x * 2 }
pipeline_node(p, "x")       -- 10
pipeline_node(p, "doubled") -- 20

pipeline_run(pipeline, nix_options = NA)

Re-execute a pipeline. If nix_options is provided, triggers a cache-aware Nix build of the pipeline using the specified options. Otherwise, re-executes the pipeline dynamically in-memory.

Parameters:

Returns:

Pipeline object with updated values (or DataFrame if dry_run = true)

Examples:

p = pipeline { x = 10; y = x * 2 }
p2 = pipeline_run(p)
df = pipeline_run(p, nix_options = [dry_run: true])

populate_pipeline(pipeline, build = false, verbose = 0, nix_options = NA)

Prepare pipeline infrastructure in _pipeline/.

Parameters:

Returns:

Success message, BuildLog, or DataFrame.

Examples:

populate_pipeline(p)
populate_pipeline(p, build = true)
populate_pipeline(p, build = true, nix_options = [max_jobs: 4, cache: "rstats-on-nix"])

build_pipeline(pipeline, verbose = 0, nix_options = NA)

Shorthand for populate_pipeline(p, build = true). Recommended for scripts run with t run.

Parameters:

Returns:

BuildLog (or DataFrame if dry_run = true) with fields:

- nodes: per-node status/duration records
- duration: total build duration in seconds
- failed_nodes: list of failed/errored node names
- out_path: Nix output path for the build (migration path for previous string-return behavior)

Examples:

build_pipeline(p)
build_pipeline(p, nix_options = [dry_run: true])
build_pipeline(p, nix_options = [targets: ["c"], max_jobs: 4, cache: "rstats-on-nix", force: ["c"]])

read_node(node, which_log = NA)

Read a dynamically evaluated or materialized artifact from a pipeline build.

Parameters:

Returns:

Deserialized value.

Examples:

read_node(p.summary_stats)
read_node(p.model_v1, which_log = "20260221")

debug_node(node)

Launches an interactive guest subshell (Python, R, or Julia REPL) to debug a pipeline node using its exact build state and context.

Parameters:

Returns:

Runs interactively. Control returns to the parent T REPL once the subshell is exited.

Details: Within the subshell, all upstream build paths and companion library loaders are provided, and custom project-level variables (p_env_vars/un_env_vars) are propagated directly. To enforce strict reproducibility and prevent configuration drift, all imperative package updates (e.g., pip, install.packages, Pkg.add) are dynamically intercepted and blocked.

Examples:

p = pipeline { a = 1; b = a + 5 }
build_pipeline(p)
debug_node(p.b)

inspect_log(which_log = NA)

View build status and output paths for a pipeline build.

Parameters:

Returns:

DataFrame with columns: node, success, path, output.


list_logs()

List all available build logs in _pipeline/.

Returns:

DataFrame with columns: filename, mod_time, size_kb.


build_log(p)

Returns the BuildLog of the latest Nix build for the given pipeline. Contains detailed node-level status records, duration, failed node names, and out_path.

Parameters:

Returns:

BuildLog — A structured build log record.

Examples:

p = pipeline { a = 1 / 0 }
build_pipeline(p)
log = build_log(p)

build_log_to_frame(log)

Tabulates a BuildLog record into a structured DataFrame summarizing the build status, duration, and Nix store paths of all pipeline nodes.

Parameters:

Returns:

DataFrame — A DataFrame with columns name, status, duration, and path.

Examples:

log = build_log(p)
df = build_log_to_frame(log)
-- Returns a DataFrame:
--   name  | status     | duration | path
--   "a"   | "Errored"  | 0.02     | "/nix/store/..."

build_log_history(p, n = NA, pattern = NA)

Returns a summary DataFrame of all historical builds matching the current pipeline’s node signature, ordered from most recent to oldest.

Parameters:

Returns:

DataFrame detailing historical builds, with columns:

- build_id (1-indexed rank from most recent to oldest)
- timestamp (ISO-8601 UTC string of build time)
- duration (total duration in seconds)
- n_nodes (total number of nodes)
- n_failed (number of failed/errored nodes)
- n_warnings (number of warnings issued)
- out_path (Nix output store path for the build)
- hash (unique content hash of build input signature)

Examples:

p = pipeline { a = 1; b = 2 }
hist = build_log_history(p, n = 5)

node_diff(node_a, node_b, log_a = "latest", log_b = "latest", key = [], context = 3)

Compares the dynamic evaluations or built artifacts of node_a and node_b across two historical builds (defaults to comparing the latest build of both).

Parameters:

Returns:

Dict: a structured, type-sensitive diff dictionary whose contents depend on the artifact type:

- For DataFrames (csv, arrow, parquet): schema_changed (Bool), added_columns (List), removed_columns (List), nrows_a (Int), nrows_b (Int), and numeric_drift (DataFrame summarizing column-level mean values and shift percentages).
- For PMML Models (pmml): model_type (String), coefficients_changed (Bool), and coef_diff (DataFrame comparing regression coefficients and intercept shift deltas). Falls back to generic structural equality diff for non-regression models.
- For Text Files (text): changed (Bool), lines_added (Int), lines_removed (Int), and diff (String unified diff output).
- For Python-native artifacts (for example pickled NumPy ndarrays): kind = "python_object_diff", unified diff line counts, rendered git-like diff hunks, and shape/dtype metadata when available.
- For Julia-native artifacts (for example serialized arrays or structs): kind = "julia_object_diff", DeepDiffs-rendered summaries, captured diff lines, and type/shape metadata when available.
- For R-native artifacts (for example serialized model objects): kind = "r_object_diff", diffobj-rendered summaries, captured diff lines, and class/type metadata when available.
- For Generic/Scalars: value_a (Any), value_b (Any), changed (Bool), and delta (Float numeric difference or NA).

Native Python, Julia, and R object diffs are preserved only for artifacts using the standard default or tobj serializers. Custom serializer names use the normal artifact-loading path instead; use the companion helper package directly when a native artifact requires a custom deserializer. Julia-native diffs are executed through a fresh Julia helper process per comparison, so repeated large diffs will include Julia startup cost.
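
For the simplest case, Generic/Scalars, the result shape can be modelled in a few lines of Python. This sketch is an assumption-laden illustration: `scalar_diff` is a hypothetical name, `None` stands in for NA, and the sign convention of delta (b minus a) is a guess rather than documented behaviour.

```python
def scalar_diff(value_a, value_b):
    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    both_numeric = is_number(value_a) and is_number(value_b)
    return {
        "value_a": value_a,
        "value_b": value_b,
        "changed": value_a != value_b,
        # None stands in for NA when a numeric difference is not defined.
        "delta": value_b - value_a if both_numeric else None,
    }

print(scalar_diff(1, 3))      # {'value_a': 1, 'value_b': 3, 'changed': True, 'delta': 2}
print(scalar_diff("a", "a"))  # {'value_a': 'a', 'value_b': 'a', 'changed': False, 'delta': None}
```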

Examples:

p = pipeline { a = 1; b = 2 }
-- With the defaults, both sides read from the latest build
diff_scalar = node_diff(p.a, p.a)

-- Select the builds to compare explicitly (1-indexed ranks or regex patterns; regex shown here)
diff_model = node_diff(p.model_node, p.model_node, log_a = ".*train1.*", log_b = ".*train2.*")

collect_exceptions(p)

Collects all terminal error exceptions and non-terminal warning diagnostics from the computed nodes of a built pipeline.

Parameters:

Returns:

DataFrame — A DataFrame with columns node, status, code, and message detailing the exceptions and warnings across all nodes.

Examples:

p = pipeline { a = 1 / 0; b = a + 5 }
build_pipeline(p)
exceptions = collect_exceptions(p)
-- Returns a DataFrame with:
--   node | status  | code             | message
--   "a"  | "Error" | "DivisionByZero" | "Division by zero"
--   "b"  | "Error" | "UpstreamError"  | "Upstream dependency 'a' failed"
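
The second row of that table reflects error propagation: a node whose dependency failed is never evaluated and is reported as an upstream error instead. The Python sketch below reproduces the two rows with a hypothetical evaluation loop. The codes and messages are copied from the example above, while `run`, `nodes`, and `deps` are illustrative names and not part of T.

```python
from graphlib import TopologicalSorter

def run(nodes, deps):
    results, rows = {}, []
    for name in TopologicalSorter(deps).static_order():
        # A dependency with no stored result must have failed earlier.
        failed = [d for d in deps[name] if d not in results]
        if failed:
            rows.append((name, "Error", "UpstreamError",
                         f"Upstream dependency '{failed[0]}' failed"))
            continue
        try:
            results[name] = nodes[name](results)
        except ZeroDivisionError:
            rows.append((name, "Error", "DivisionByZero", "Division by zero"))
    return rows

# Mirrors: pipeline { a = 1 / 0; b = a + 5 }
nodes = {"a": lambda r: 1 / 0, "b": lambda r: r["a"] + 5}
deps = {"a": [], "b": ["a"]}

for row in run(nodes, deps):
    print(row)
# ('a', 'Error', 'DivisionByZero', 'Division by zero')
# ('b', 'Error', 'UpstreamError', "Upstream dependency 'a' failed")
```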

Explain Package

Introspection and LLM tooling.

explain(value)

Get detailed explanation of a value.

For DataFrames, returns a compact summary by default showing kind, nrow, ncol, and a hint. Detailed fields (schema, na_stats, example_rows) are accessible via dot notation.

Specialized support for collect_exceptions(p) DataFrames: If the input DataFrame is the diagnostics table returned by collect_exceptions(p) (detected via the columns ["node", "status", "code", "message"]), explain() behaves as follows:

- Single Exception: If the DataFrame contains exactly one row, calling explain() directly maps to that specific exception, returning a dictionary with keys kind, type ("Error" or "Warning"), error_code/warning_code, error_message/warning_message, and node.
- Multiple Exceptions: If there are zero or multiple rows, explain() returns an overarching exceptions_list dictionary containing keys kind, type, description, count, and exceptions (a list of mapped explanation dictionaries for each diagnostic element).

For pipeline node results returned by read_node(...), explain() now returns a top-level node wrapper with kind, node_name, diagnostics, and contents. The contents field is the explained payload stored in the node. In the REPL and CLI t explain ..., explain output is shown with a tree-style formatter for readability, but the runtime value remains a normal Dict.

Parameters:

Returns:

Dict with introspection data

Examples:

explain(42)
-- {`kind`: "value", `type`: "Int", `value`: 42}

explain(df)
-- {`kind`: "to_dataframe", `nrow`: 100, `ncol`: 5, `hint`: "Use explain(df).schema, ..."}

-- Access detailed fields:
explain(df).schema        -- list of column name/type pairs
explain(df).na_stats      -- NA count per column
explain(df).example_rows  -- first 5 rows as list of dicts

node_info = explain(read_node(p.model))
node_info.node_name       -- node/container metadata
node_info.diagnostics     -- node diagnostics
node_info.contents        -- explained node payload

explain_json(value)

Returns a JSON string representation of the explain output.


intent_fields(intent) / intent_get(intent, field)

Access metadata fields from an Intent object (e.g. from an intent { ... } block).


intent_fields(intent)

Get all fields from an intent block.

Parameters:

Returns:

Dict of field names to values

Examples:

i = intent { description: "Analysis", assumes: "Clean data" }
intent_fields(i)
-- {description: "Analysis", assumes: "Clean data"}

intent_get(intent, field)

Get a specific field from an intent block.

Parameters:

Returns:

Field value

Examples:

i = intent { description: "Customer analysis" }
intent_get(i, "description")  -- "Customer analysis"

Operators

Arithmetic

Operator  Description                      Example
+         Addition / String concatenation  2 + 3 -> 5, "a" + "b" -> "ab"
-         Subtraction                      5 - 2 -> 3
*         Multiplication                   4 * 5 -> 20
/         Division                         15 / 3 -> 5
%         Modulo                           7 % 3 -> 1

Comparison

Operator  Description       Example
==        Equal             5 == 5 -> true
!=        Not equal         5 != 3 -> true
<         Less than         3 < 5 -> true
>         Greater than      5 > 3 -> true
<=        Less or equal     5 <= 5 -> true
>=        Greater or equal  3 >= 2 -> true

Logical (Scalar Control Flow)

Operator  Description                  Example
&&        Logical AND (Short-circuit)  true && false -> false
||        Logical OR (Short-circuit)   true || false -> true
!         Logical NOT (Strict)         !false -> true

Bitwise / Boolean (Strict)

Operator  Description          Example
&         Bitwise/Boolean AND  true & false -> false, 3 & 1 -> 1
|         Bitwise/Boolean OR   true | false -> true, 3 | 1 -> 3

Membership

Operator  Description                      Logic
in        Check if element exists in list  x in [a, b]

Examples:

1 in [1, 2, 3]       -- true
4 in [1, 2, 3]       -- false
[1, 4] in [1, 2, 3]  -- [true, false] (Broadcasting)

Broadcasting

Standard operators can be broadcast over lists/vectors by prefixing them with a dot (.).

Operator                    Description                   Logic
.+, .-, .*, ./              Element-wise Arithmetic       [1, 2] .+ 1 -> [2, 3]
.==, .!=, .<, .>, .<=, .>=  Element-wise Comparison       [1, 2] .> 1 -> [false, true]
.&, .|                      Element-wise Logical/Bitwise  [true, false] .& true -> [true, false]

Note: in broadcasts automatically when its left-hand side is a list/vector, so you do not need .in.
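
The broadcasting rules in this section can be modelled in Python as follows. `broadcast` and `is_in` are hypothetical helper names, and the equal-length requirement for two lists is an assumption of this sketch, since the text above only shows the list-with-scalar case.

```python
import operator

def broadcast(op, left, right):
    # A scalar on the right is applied to every element on the left;
    # two lists are combined pairwise (assumed to need equal lengths).
    if isinstance(right, list):
        if len(left) != len(right):
            raise ValueError("length mismatch")
        return [op(a, b) for a, b in zip(left, right)]
    return [op(a, right) for a in left]

def is_in(left, items):
    # A list on the left is tested element by element; a scalar gives one Bool.
    if isinstance(left, list):
        return [x in items for x in left]
    return left in items

print(broadcast(operator.add, [1, 2], 1))             # [2, 3]
print(broadcast(operator.gt, [1, 2], 1))              # [False, True]
print(broadcast(operator.and_, [True, False], True))  # [True, False]
print(is_in(1, [1, 2, 3]))                            # True
print(is_in([1, 4], [1, 2, 3]))                       # [True, False]
```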

Pipes

Operator  Description       Error Handling
|>        Conditional pipe  Short-circuits on error
?|>       Maybe-pipe        Forwards errors to function
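
The difference between the two pipes is whether the right-hand function ever sees an error value. The Python sketch below models this with a minimal `Err` class standing in for T's structured Error type. The names `pipe`, `maybe_pipe`, `safe_div`, and `recover` are all hypothetical.

```python
class Err:
    # Minimal stand-in for a structured error value (not an exception).
    def __init__(self, message):
        self.message = message

def pipe(value, fn):
    # Models |> : an error skips fn and is passed along unchanged.
    return value if isinstance(value, Err) else fn(value)

def maybe_pipe(value, fn):
    # Models ?|> : fn is always called, so it can inspect or recover from an error.
    return fn(value)

def safe_div(x):
    return Err("division by zero") if x == 0 else 10 / x

def recover(v):
    return 0.0 if isinstance(v, Err) else v

ok = pipe(pipe(5, safe_div), lambda v: v + 1)
skipped = pipe(pipe(0, safe_div), lambda v: v + 1)
handled = maybe_pipe(pipe(0, safe_div), recover)

print(ok)                # 3.0
print(skipped.message)   # division by zero
print(handled)           # 0.0
```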

Type System

Type       Example           Description
Int        42                Integer numbers
Float      3.14              Floating-point numbers
Bool       true, false       Boolean values
String     "hello"           Text strings
List       [1, 2, 3]         Ordered collections
Dict       [x: 1, y: 2]      Key-value maps
Vector     Column data       Typed arrays (from DataFrames)
DataFrame  Table data        First-class tabular data
Function   \(x) x + 1        First-class functions
NA         NA, na_int()      Explicit missing values (typed)
Error      error("msg")      Structured errors (not exceptions)
Intent     intent { ... }    LLM metadata block
Pipeline   pipeline { ... }  DAG computation graph
Formula    y ~ x             Statistical model specification

Next Steps

Now that you’ve explored the API, learn how to build reproducible data pipelines:

  1. Pipeline Tutorial — Master T’s core execution model.
  2. Data Manipulation Examples — Practical examples of data wrangling.
  3. Project Development — Master T’s project structure and dependency management.
  4. Package Development — Create reusable T libraries.