T uses a first-class serializer system to manage data interchange between runtimes (T, R, Python, Julia) and to materialize pipeline nodes as persistent artifacts.
Serializers are identified by the ^ prefix. You can
specify them when defining pipeline nodes:
p = pipeline {
  -- Use the built-in Arrow serializer for a DataFrame
  data = node(command = read_csv("large.csv"), serializer = ^arrow)
  -- Use the PMML serializer for a model
  model = rn(command = <{ lm(y ~ x, data = data) }>, serializer = ^pmml)
  -- Use the JSON serializer for a simple dictionary
  config = node(command = { "debug": true, "retries": 5 }, serializer = ^json)
}
If you don’t specify a serializer, T uses the ^tlang
(internal binary) format for T-to-T communication. For other runtimes, T
attempts to infer a sensible default based on the data type or the
specific wrapper used (e.g., shn() defaults to
^text).
| Identifier | Name | Best For | Compatibility |
|---|---|---|---|
| ^tlang | T-Native | T-to-T interchange | T only |
| ^arrow | Apache Arrow | Large DataFrames | T, R, Python, Julia |
| ^pmml | PMML | Predictive Models | T, R, Python |
| ^onnx | ONNX | ML Models | T, R, Python |
| ^json | JSON | Config, lists, dicts | T, R, Python |
| ^csv | CSV | Tabular data | T, R, Python |
| ^text | Plain Text | Logs, shell output | All |
serializer Structure

A serializer is a first-class object in T. You can inspect its properties or even define your own.
type serializer = {
  format: string,
  writer: function(path: string, value: any) -> result[NA, string],
  reader: function(path: string) -> result[any, string]
}
You can create a custom serializer by defining a record that matches the required interface:
my_log_serializer = {
  format: "log",
  writer: \(path, val) {
    -- custom logic to write log
    Ok(NA)
  },
  reader: \(path) {
    -- custom logic to read log
    Ok("log content")
  }
}
-- Usage
node(command = ..., serializer = my_log_serializer)
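The writer/reader contract may be easier to see with real file I/O filled in. The following is a Python analogue, not T code and not part of T's API: the `Serializer` dataclass, the `("ok", value)` / `("error", message)` tuples standing in for T's `result` type, and the `round_trip` helper are all hypothetical names chosen for illustration.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Any, Callable, Tuple

# Stand-in for T's result type: ("ok", value) or ("error", message).
Result = Tuple[str, Any]


@dataclass(frozen=True)
class Serializer:
    """Hypothetical mirror of the three-field serializer record."""
    format: str
    writer: Callable[[str, Any], Result]
    reader: Callable[[str], Result]


def write_log(path: str, value: Any) -> Result:
    # Write the value as text; report I/O failures as an error result.
    try:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(str(value))
        return ("ok", None)
    except OSError as exc:
        return ("error", str(exc))


def read_log(path: str) -> Result:
    # Read the whole file back as a string.
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return ("ok", fh.read())
    except OSError as exc:
        return ("error", str(exc))


log_serializer = Serializer(format="log", writer=write_log, reader=read_log)


def round_trip(ser: Serializer, value: Any) -> Result:
    """Write value with ser.writer, then read it back with ser.reader."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "artifact." + ser.format)
        status, payload = ser.writer(path, value)
        if status != "ok":
            return (status, payload)
        return ser.reader(path)
```

In this sketch both functions return a result value instead of raising, mirroring the `result[...]` return types in the interface above, so a caller can handle a failed read or write without exception handling.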
One of the most powerful features of T’s serializer system is the static coherence check. When you build a pipeline, T verifies that the format produced by a source node matches the format expected by the consumer node.
node A {
  target: wn("data.csv", serializer = ^csv)
}
node B {
  source: rn("data.csv", serializer = ^arrow)
}
-- Result: Static Error
-- "Format mismatch: Node A produces ^csv, but Node B expects ^arrow."
This prevents runtime errors after long-running computations by catching interchange mismatches at the start of the build.
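The idea behind such a check can be sketched in a few lines of Python. This is an illustration of the concept only, assuming a pipeline is reduced to a map of producer formats plus a list of consumer edges. It is not T's actual implementation, and `check_coherence` and its data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def check_coherence(
    produces: Dict[str, str],
    consumes: List[Tuple[str, str, str]],
) -> List[str]:
    """Compare formats before anything runs.

    produces: node name -> format that node writes.
    consumes: (consumer, producer, expected format) for each edge.
    Returns a list of error messages; an empty list means coherent.
    """
    errors = []
    for consumer, producer, expected in consumes:
        actual = produces.get(producer)
        if actual is None:
            errors.append(
                f"Unknown source: Node {consumer} reads from undefined node {producer}."
            )
        elif actual != expected:
            errors.append(
                f"Format mismatch: Node {producer} produces {actual}, "
                f"but Node {consumer} expects {expected}."
            )
    return errors


# The A/B example above, expressed in this hypothetical layout.
errors = check_coherence({"A": "^csv"}, [("B", "A", "^arrow")])
```

Because the check only inspects declared formats, it costs nothing compared with running the nodes, which is what makes it worth doing up front.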
For cross-language nodes, serializers provide the necessary glue code
for the target runtime. For example, when using ^arrow in
an R node:
1. T adds the arrow R library to the build environment.
2. The generated R script saves the node's result with arrow::write_ipc_file().

For a serializer to work across non-T runtimes, it can optionally provide code snippets for R and Python. These snippets are strings that T injects into the generated build scripts.
You can define these by adding r_writer,
r_reader, py_writer, or py_reader
keys to your serializer dictionary. You can use standard strings or
foreign code blocks <{ ... }> for
better readability:
my_custom_ser = [
  format: "custom",
  -- T implementation
  writer: \(path, val) { Ok(NA) },
  reader: \(path) { Ok(42) },
  -- R snippets (using foreign code blocks)
  r_writer: <{ function(obj, path) { saveRDS(obj, path) } }>,
  r_reader: <{ function(path) { readRDS(path) } }>,
  -- Python snippets
  py_writer: <{ lambda obj, path: pickle.dump(obj, open(path, 'wb')) }>,
  py_reader: <{ lambda path: pickle.load(open(path, 'rb')) }>
]
When T processes a node with an R runtime and the above serializer:

1. It looks for the r_writer snippet.
2. It generates a call in the node’s R script: <r_writer>(node_result, "artifact_path").
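The two steps above amount to simple string templating. The Python sketch below shows one way such a call could be assembled. It is an assumption about the general shape of the generated code, not T's real code generator, and `render_writer_call` is a hypothetical helper.

```python
import json


def render_writer_call(snippet: str, result_var: str, artifact_path: str) -> str:
    """Build the text of a call that applies a writer snippet to a result.

    The snippet is wrapped in parentheses so that an anonymous function
    can be called directly. json.dumps is used here only to produce a
    double-quoted, backslash-escaped string literal for the path.
    """
    quoted_path = json.dumps(artifact_path)
    return f"({snippet.strip()})({result_var}, {quoted_path})"


r_writer = "function(obj, path) { saveRDS(obj, path) }"
call = render_writer_call(r_writer, "node_result", "artifact_path")
```

Here `call` holds a single line of R source text. Since the snippet is pasted in verbatim, any library it references must already be available in the target script.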
If you use a custom format name (e.g.,
format: "myformat"), you should ensure that your R or
Python scripts have the necessary libraries loaded to handle that
format. You can do this by adding the libraries to your
tproject.toml or using the functions /
includes parameters in the node definition.
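The library requirement is easy to demonstrate with the Python snippets shown earlier, which reference `pickle` without importing it. The sketch below evaluates those snippet strings in a controlled namespace. It is a hypothetical harness for illustration, since how T actually injects snippets into Python scripts is not specified here, and `load_snippet` and `round_trip` are made-up names.

```python
import os
import pickle
import tempfile

# The same lambda sources as the py_writer / py_reader snippets.
PY_WRITER = "lambda obj, path: pickle.dump(obj, open(path, 'wb'))"
PY_READER = "lambda path: pickle.load(open(path, 'rb'))"


def load_snippet(source: str, namespace: dict):
    """Turn a snippet string into a callable that sees only `namespace`."""
    return eval(source, dict(namespace))


def round_trip(value, namespace: dict):
    """Write value with the writer snippet, then read it back."""
    writer = load_snippet(PY_WRITER, namespace)
    reader = load_snippet(PY_READER, namespace)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "artifact.pkl")
        writer(value, path)
        return reader(path)


# Works when the surrounding script has pickle loaded.
restored = round_trip({"retries": 5}, {"pickle": pickle})

# Fails with NameError when it does not.
try:
    round_trip({"retries": 5}, {})
    missing_import_detected = False
except NameError:
    missing_import_detected = True
```

The snippet itself is unchanged between the two calls; only the environment differs, which is why the dependency has to be declared at the project or node level.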
For more information on how pipelines use these serializers, see the
Pipeline Tutorial. For a
model-focused walkthrough of ^pmml, see the PMML Tutorial.