Frequently Asked Questions

Everything you need to know about using TOON with Large Language Models

Getting Started

TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model specifically designed for Large Language Models. It was created to address the token inefficiency of standard JSON when used in LLM applications.

Traditional JSON, while excellent for APIs and human readability, is verbose with its braces, quotes, and repeated keys. Every token costs money in LLM applications, and context windows have limits. TOON reduces token usage by 30-60% on uniform datasets while maintaining full JSON compatibility and human readability.

Key Motivation:
  • Reduce token costs for LLM applications
  • Fit more data in context windows
  • Provide structure that helps LLMs validate data
  • Maintain human readability and editability

No, TOON is not a replacement for JSON. It's a specialized encoding format optimized for a specific use case: passing structured data to Large Language Models.

Think of it as a translation layer: you continue using JSON in your application code, APIs, and storage. When you need to pass data to an LLM, you encode it as TOON to save tokens. When the LLM returns TOON data, you decode it back to JSON.

Best Practice: Use JSON for programmatic data handling, APIs, and storage. Use TOON specifically for LLM input/output where token efficiency matters.

Token savings depend heavily on your data structure:

Best Savings (30-60%)
  • Uniform arrays of objects
  • Tabular data with many rows
  • Repeated field structures
  • Simple primitive values
Modest Savings (10-30%)
  • Semi-uniform structures
  • Mixed nested data
  • Moderate complexity
Limited Savings or Worse

Deeply nested structures, highly irregular data, or objects with mostly unique field sets may not benefit from TOON and could even use more tokens than compact JSON.

Example: 100 employee records
JSON (formatted): 126,860 tokens
TOON: 49,831 tokens
→ 60.7% reduction, ~77,000 tokens saved
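The mechanism behind this kind of reduction is simple: TOON declares the field names once in a header row instead of repeating them in every record. A toy encoder for flat, uniform arrays (a sketch only; the official implementations are @toon-format/toon and python-toon) makes the layout concrete:

```python
import json

def encode_uniform_array(key, rows):
    # Toy sketch of TOON's tabular layout: handles only flat dicts that
    # all share the same keys. Not the official encoder.
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
toon = encode_uniform_array("users", users)
print(toon)
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user

# The keys "id", "name", "role" appear once in TOON but once per record in JSON:
print(len(json.dumps({"users": users})), "chars as JSON,", len(toon), "as TOON")
```

Every extra record adds one short row to TOON but a full key-value object to JSON, which is why the savings grow with row count.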

LLMs were not trained specifically on TOON, but they generally understand it well because:

  • Familiar patterns: TOON combines YAML-like indentation with CSV-style tables, both of which are common in training data
  • Clear structure: Explicit array lengths and field declarations make the structure obvious
  • Human-readable: The format is intuitive enough that models can infer the pattern from examples

Benchmark results show LLMs achieve 73.9% accuracy on TOON data retrieval tasks compared to 69.7% for JSON, suggesting models may actually parse TOON more reliably than JSON in some contexts.

Best Practice: When generating TOON output, include the header format in your prompt (e.g., users[N]{id,name,role}:) to help the model follow the correct structure.
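A minimal sketch of that practice (the prompt wording and the field set here are hypothetical, chosen only for illustration):

```python
def build_prompt(question: str) -> str:
    # Hypothetical prompt builder: restating the expected TOON header
    # lets the model mirror it instead of inventing its own field order.
    return (
        "Respond only in TOON, using exactly this header format:\n"
        "users[N]{id,name,role}:\n"
        "  <id>,<name>,<role>\n"
        "\n"
        f"Question: {question}"
    )

print(build_prompt("Which users are admins?"))
```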

Implementation

Install the official NPM package:

npm install @toon-format/toon

Converting JSON to TOON:

import { encode } from "@toon-format/toon";

const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" }
  ]
};

const toonString = encode(data);
console.log(toonString);
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

Converting TOON back to JSON:

import { decode } from "@toon-format/toon";

const toonString = `
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
`;

const jsonObject = decode(toonString);

Install the Python package:

pip install python-toon

Encoding JSON to TOON:

from toon import encode

# A channel object
channel = {"name": "tapaScript", "age": 2, "type": "education"}
toon_output = encode(channel)
print(toon_output)
# name: tapaScript
# age: 2
# type: education

Decoding TOON back to JSON:

from toon import decode

toon_string = """
name: tapaScript
age: 2
type: education
"""

python_struct = decode(toon_string)
print(python_struct)

No, TOON is typically generated programmatically. Most TOON data will be:

  • Automatically generated by software using the encode functions
  • Converted from existing JSON data before sending to LLMs
  • Created by AI models when asked to output in TOON format

While TOON is human-readable and can be written manually, the main workflow is to convert your existing JSON data to TOON programmatically, especially when preparing prompts for LLMs.

TOON currently has official implementations for:

JavaScript/TypeScript

@toon-format/toon (NPM)

Python

python-toon (PyPI)

The format is designed to work well with Go, Rust, and other languages, with more implementations expected as the format gains adoption.

Use Cases

TOON is ideal for:

LLM Prompt Data

Passing structured data to GPT, Claude, Gemini, or other language models where token efficiency matters

Uniform Arrays

Datasets with many records sharing the same structure (users, products, transactions, logs)

Cost Optimization

Applications where reducing token costs is critical (high-volume API calls, training data preparation)

Context Window Limits

When you need to fit more data into limited context windows

TOON is being explored for:

  • Training Data: Reducing token overhead for structured training data in LLM fine-tuning
  • Agent Frameworks: Compact data exchange in multi-agent AI systems
  • MCP & AI Workflows: Faster data serialization between Model Context Protocol and AI workflow engines
  • Serverless AI APIs: Cost and speed optimization in serverless environments
  • RAG Systems: Efficient retrieval-augmented-generation with large knowledge bases
  • Few-Shot Learning: Providing more examples within the same token budget
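On the few-shot point, a back-of-envelope estimate of how many example rows fit a budget can be sketched as follows (the 4-characters-per-token ratio is a rough heuristic for English text, and the sample row is hypothetical; use your model's real tokenizer for actual numbers):

```python
# Rough capacity estimate: how many TOON data rows fit a token budget.
CHARS_PER_TOKEN = 4          # crude heuristic, not a real tokenizer
budget_tokens = 1000

example_row = "  101,Alice,admin"  # one hypothetical TOON data row
# +1 accounts for the newline separating rows
rows_that_fit = (budget_tokens * CHARS_PER_TOKEN) // (len(example_row) + 1)
print(rows_that_fit)
```

Because TOON rows omit the repeated keys, braces, and quotes, each example is shorter, so more of them fit in the same budget than with JSON objects.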

Absolutely! A hybrid approach often works best:

Recommended Strategy:
  ✓ Keep JSON for your standard API communications and data storage
  ✓ Convert to TOON when sending data to LLMs or preparing prompts
  ✓ Use JSON for configuration files and human-edited content
  ✓ Use TOON for large datasets in AI workflows

This gives you the best of both worlds: JSON's universal compatibility for regular use, and TOON's efficiency when communicating with AI models.
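A sketch of that boundary, with a stand-in encoder so the example stays self-contained (in real code the encode call would come from python-toon or @toon-format/toon):

```python
import json

def encode_stub(data):
    # Stand-in for the real TOON encoder; handles only a flat dict of scalars.
    return "\n".join(f"{k}: {v}" for k, v in data.items())

def to_llm(json_payload: str) -> str:
    # Boundary function: the app speaks JSON everywhere; TOON appears
    # only at the moment a prompt is assembled for the model.
    return encode_stub(json.loads(json_payload))

# Application code keeps using JSON for storage and APIs...
payload = json.dumps({"name": "tapaScript", "age": 2, "type": "education"})
# ...and converts only at the LLM edge:
print(to_llm(payload))
# name: tapaScript
# age: 2
# type: education
```

The rest of the system never needs to know TOON exists; only the prompt-assembly step does.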

Limitations & Considerations

JSON might be better when:

Deeply Nested Data

Complex hierarchical structures with many levels of nesting may not benefit from TOON

Irregular Data Shapes

Objects with varying fields or highly inconsistent structures across array items

Non-AI Applications

Standard web APIs, databases, configuration files, or any non-LLM use case

Strict Schema Requirements

Applications requiring type enforcement, validation, or schema definitions (use JSON Schema instead)

Yes, in certain cases TOON can be less efficient than compact JSON. This typically happens with:

  • Deeply nested structures: Multiple levels of nesting can add overhead with TOON's indentation
  • Small datasets: Very small objects might not benefit enough to offset the schema declaration
  • Highly irregular data: When objects in an array have mostly different fields

Important: Always benchmark with your actual data. Token savings vary by tokenizer, data structure, and LLM. What works efficiently for one dataset may not for another.
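A quick way to sanity-check this on your own data is to compare sizes directly. The sketch below uses character counts as a crude proxy (real benchmarks should use the model's tokenizer, e.g. tiktoken for OpenAI models) and a hand-rolled tabular form rather than the official encoder:

```python
import json

# Synthetic uniform dataset for the comparison
records = [{"id": i, "name": f"user{i}", "role": "member"} for i in range(100)]

compact_json = json.dumps(records, separators=(",", ":"))

# Hand-rolled TOON-style tabular form (illustrative, not the official encoder):
toon = "records[100]{id,name,role}:\n" + "\n".join(
    f"  {r['id']},{r['name']},{r['role']}" for r in records
)

# Character count as a crude stand-in for token count.
saving = 1 - len(toon) / len(compact_json)
print(f"~{saving:.0%} smaller than compact JSON")
```

Run the same comparison with your real payloads: if your data is deeply nested or irregular, the number can shrink to near zero or go negative.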

TOON is NOT a replacement for JSON in traditional web/API contexts:

  • Browsers don't natively parse TOON (no TOON.parse())
  • REST APIs expect JSON as the standard format
  • Databases store JSON, not TOON
  • Most tooling and libraries are built around JSON

TOON is specifically designed for LLM communication, not as a general-purpose data format. You'll need to convert between JSON and TOON programmatically when interfacing with AI models.

TOON is in active development with specification v2.0 released. Consider these factors:

Production-Ready
  ✓ Stable specification (v2.0)
  ✓ Official implementations
  ✓ Active community
  ✓ MIT licensed
Consider
  • Still evolving format
  • Limited tooling ecosystem
  • May need fallback to JSON
  • Test with your use case

Many teams are successfully using TOON in production for AI workflows, but it's wise to test thoroughly and maintain JSON as a fallback option.

Future & Community

TOON is gaining traction in the AI developer community with potential for:

  • Standardization: Just as JSON became the Web's data exchange standard, TOON could become the standard for AI data interchange
  • Native LLM Support: Future LLMs may be trained with TOON examples or have built-in TOON parsing
  • Expanded Tooling: More libraries, converters, and integrations across different languages
  • Framework Integration: Direct TOON support in popular AI frameworks like LangChain, LlamaIndex, etc.
  • Specification Evolution: Community-driven improvements through the RFC process

TOON is an open-source project with an active community:

Want to contribute? Check out the Contributing Guide in the spec repository to propose new features, report issues, or submit implementations for other languages.

Here's a quick start guide:

1. Try the Online Converter: Use our JSON to TOON converter to see token savings with your own data.

2. Install the Library: Add the package for your language (npm or pip).

3. Test with Your LLM: Convert your prompt data to TOON and compare results.

4. Benchmark & Optimize: Measure actual token savings and adjust your data structure for maximum efficiency.

Pro Tip: Next time you craft a prompt or pass structured data to an AI model, try encoding it as TOON. You may find the call uses fewer tokens and costs less.

Still Have Questions?

Can't find what you're looking for? Check out our GitHub discussions or open an issue.

Visit GitHub