T — A Language for Tabular Data and Human–LLM Collaboration


T is an experimental programming language for declarative, functional manipulation of tabular data. Inspired by R’s tidyverse and OCaml’s semantic rigor, T is designed to make data analysis explicit, inspectable, and pipeline-oriented.

Unlike traditional scripting languages, T is built from the ground up to support human–LLM collaborative programming, where humans specify intent and constraints, and language tools (including LLMs) generate localized, mechanical code.

Status: Pre-alpha. Actively designed and implemented.


Documentation


Design Goals


LLM-Native by Design

T treats large language models as first-class collaborators, not magic code generators. The language and tooling are built around a clear division of labor:

Humans define intent, assumptions, and invariants. LLMs generate localized code. T enforces semantics and correctness.


Intent Blocks

T supports intent blocks: structured comments that encode analytical goals, assumptions, and checks in a machine-readable way.

-- intent:
-- goal: "Estimate approval as a function of age and income"
-- assumptions:
--   - age is approximately linear
--   - missing income is non-random
-- checks:
--   - no negative income
--   - at least 100 observations per group
  

Intent blocks are preserved by tooling, version-controlled with code, and used as stable regeneration boundaries for LLM-assisted workflows.
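To show what "machine-readable" could mean in practice, here is a minimal Python sketch that parses the comment layout from the example above into a small record. It is not T's actual parser (T itself is implemented in OCaml). The `Intent` class, the `parse_intent` function, and the rule that list items attach to the most recent `assumptions:` or `checks:` key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Parsed form of one intent block (hypothetical structure)."""
    goal: str = ""
    assumptions: list = field(default_factory=list)
    checks: list = field(default_factory=list)


def parse_intent(source: str) -> Intent:
    """Parse the `-- intent:` comment layout into an Intent record."""
    intent = Intent()
    section = None  # list section currently being filled, if any
    for raw in source.splitlines():
        line = raw.strip()
        if not line.startswith("--"):
            continue  # not a comment line, so not part of the block
        body = line[2:].strip()
        if not body or body == "intent:":
            continue  # blank comment or the block header itself
        if body.startswith("- "):
            # List item: attach it to the most recent list section.
            if section is not None:
                getattr(intent, section).append(body[2:].strip())
            continue
        key, _, value = body.partition(":")
        key = key.strip()
        if key == "goal":
            intent.goal = value.strip().strip('"')
            section = None
        elif key in ("assumptions", "checks"):
            section = key
    return intent


example = """\
-- intent:
-- goal: "Estimate approval as a function of age and income"
-- assumptions:
--   - age is approximately linear
--   - missing income is non-random
-- checks:
--   - no negative income
--   - at least 100 observations per group
"""

intent = parse_intent(example)
print(intent.goal)
print(intent.checks)
```

Once a block is available as structured data, tooling could hand the goal and assumptions to an LLM as context and turn each check into a runtime assertion. How T maps checks to executable tests is not specified here.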


Pipelines

Pipelines are T’s core execution model. Each pipeline is a DAG of named nodes with explicit dependencies, cacheable results, and inspectable outputs.

pipeline analysis {
  raw = { read_csv("data.csv") }

  cleaned = {
    raw |> filter(age > 18)
        |> mutate(income_k = income / 1000)
  }

  model = {
    cleaned |> lm(approval ~ age + income_k)
  }
}
  

Pipelines enable local reasoning, reproducibility, and safe regeneration of individual steps without rewriting entire scripts.
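The execution model can be illustrated with a small Python sketch: named nodes, explicit dependencies, cached results, and invalidation that touches only a node and whatever depends on it. This is a toy model, not T's implementation. The `Pipeline` class and its methods are hypothetical, the input rows are made-up in-memory data rather than a CSV file, a simple mean stands in for the `lm` step, and the sketch assumes the graph is acyclic without checking for cycles.

```python
class Pipeline:
    """Toy DAG of named nodes with explicit dependencies and a result cache."""

    def __init__(self):
        self.nodes = {}  # name -> (dependency names, function)
        self.cache = {}  # name -> cached result
        self.runs = []   # node names, in the order they were computed

    def node(self, name, deps, fn):
        """Register a node whose function receives its dependencies' values."""
        self.nodes[name] = (tuple(deps), fn)

    def get(self, name):
        """Return a node's value, computing uncached dependencies first."""
        if name not in self.cache:
            deps, fn = self.nodes[name]
            args = [self.get(d) for d in deps]
            self.cache[name] = fn(*args)
            self.runs.append(name)
        return self.cache[name]

    def invalidate(self, name):
        """Drop a node's cached result and those of all downstream nodes."""
        self.cache.pop(name, None)
        for other, (deps, _) in self.nodes.items():
            if name in deps:
                self.invalidate(other)


p = Pipeline()
p.node("raw", [], lambda: [
    {"age": 17, "income": 12000},
    {"age": 34, "income": 52000},
    {"age": 51, "income": 88000},
])
p.node("cleaned", ["raw"], lambda raw: [
    {**row, "income_k": row["income"] / 1000}
    for row in raw if row["age"] > 18
])
# Stand-in for the model step: mean income in thousands.
p.node("summary", ["cleaned"], lambda cleaned:
       sum(r["income_k"] for r in cleaned) / len(cleaned))

first = p.get("summary")   # computes raw, cleaned, summary
p.invalidate("cleaned")    # e.g. after regenerating the cleaning step
second = p.get("summary")  # recomputes cleaned and summary; raw stays cached
print(p.runs)
```

After the invalidation, `raw` is served from the cache and only `cleaned` and `summary` run again, which is the property that makes regenerating a single step safe.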


Language Features


Numerical Backend

T’s numerical stack is layered.

This approach prioritizes fast development, explicit semantics, and safe defaults, while leaving room for future performance upgrades.


Standard Packages

Packages are part of the standard library and loaded by default. Each function lives in its own file.


Alpha Roadmap

The alpha version of T targets a complete, end-to-end workflow.

Performance tuning, GPUs, and distributed execution are explicitly out of scope for alpha.


Project Structure

.
├── flake.nix
├── ast.ml
├── parser.ml
├── lexer.ml
├── eval.ml
├── repl.ml
├── pipeline.ml
├── dataframe.ml
└── packages/
    ├── core/
    ├── stats/
    └── colcraft/
  

Building

nix develop
t repl
  

Contributing

Contributions should favor clarity, explicit semantics, and small, reviewable changes. Packages live in-repo during early development.


License: EUPL v1.2.

