Discover the key features that make TOON the most efficient data format for Large Language Models
Dramatically reduce token usage compared to standard JSON, cutting costs and fitting more data in context windows.
Explicit array lengths and field declarations help models validate data structure and reduce generation errors.
Removes redundant braces, brackets, and quotes. Only quotes when necessary for ambiguous values.
Uses whitespace like YAML for nested objects, making structure visually clear without extra punctuation.
Declare field names once for uniform arrays, then stream data as CSV-style rows for maximum efficiency.
Collapse single-key wrapper chains into dotted paths to reduce indentation and save even more tokens.
{
"employees": [
{
"id": 1,
"name": "Alice Johnson",
"department": "Engineering",
"salary": 95000,
"active": true
},
{
"id": 2,
"name": "Bob Smith",
"department": "Marketing",
"salary": 72000,
"active": true
},
{
"id": 3,
"name": "Carol White",
"department": "Engineering",
"salary": 88000,
"active": false
}
]
}
employees[3]{id,name,department,salary,active}:
1,Alice Johnson,Engineering,95000,true
2,Bob Smith,Marketing,72000,true
3,Carol White,Engineering,88000,false
Choose between comma, tab, or pipe delimiters based on your data characteristics for optimal tokenization.
items[2]: a,b,c
items[2 ]: a b c
items[2|]: a|b|c
TOON only quotes strings when necessary, maximizing token efficiency while maintaining clarity.
Tabular formatting applies recursively to nested arrays, maintaining efficiency at every level.
teams[2]:
- name: Alpha
members[2]{id,name}:
1,Alice
2,Bob
- name: Beta
members[3]{id,name}:
3,Carol
4,Dave
5,Eve
Non-JSON types are automatically converted to LLM-safe representations.
When you have multiple records with identical fields, TOON's tabular format shines with 30-60% token savings.
Structured data in prompts benefits from explicit lengths and fields that help models validate output.
Applications where token costs matter, especially with repeated API calls or large datasets.
Compress API responses with repeated structures before passing to LLMs.
With minimal tabular eligibility (0-30%), JSON compact may be more efficient.
When objects have varying field sets, TOON's benefits diminish significantly.
For flat tables without nesting, CSV is simpler and slightly more compact.
If your infrastructure expects JSON, adding conversion may not be worth the complexity.