Configuration

tokentrace can be configured in code via tokentrace.Config or via a tokentrace.yml file. When both are present, the YAML file takes precedence for the fields it explicitly sets; fields it omits keep their in-code values. The YAML file supports environment variable substitution using the ${VAR} syntax.
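The precedence rule can be pictured as a field-by-field overlay. The sketch below is not tokentrace's implementation: the `settings` and `fileSettings` types and the `overlay` function are hypothetical, and a nil pointer stands in for "the YAML file did not set this field".

```go
package main

import "fmt"

// settings is a hypothetical stand-in for values set in code.
type settings struct {
	TransportType string
	BatchSize     int
}

// fileSettings is a hypothetical stand-in for values parsed from YAML.
// A nil pointer means the file did not set that field.
type fileSettings struct {
	TransportType *string
	BatchSize     *int
}

// overlay returns base with every field the file explicitly set replaced.
func overlay(base settings, file fileSettings) settings {
	out := base
	if file.TransportType != nil {
		out.TransportType = *file.TransportType
	}
	if file.BatchSize != nil {
		out.BatchSize = *file.BatchSize
	}
	return out
}

func main() {
	inCode := settings{TransportType: "stdout", BatchSize: 50}
	t := "http"
	fromFile := fileSettings{TransportType: &t} // batch size not set in the file
	fmt.Printf("%+v\n", overlay(inCode, fromFile))
}
```

Here the transport type comes from the file while the batch size keeps its in-code value, because the file never mentions it.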

Loading configuration

By default, tokentrace looks for tokentrace.yml in the current working directory. You can specify a path explicitly:

tracer, err := tokentrace.NewFromFile("./config/tokentrace.yml")

Or load from an io.Reader:

f, err := os.Open("tokentrace.yml")
if err != nil {
	return err
}
defer f.Close()
tracer, err := tokentrace.NewFromReader(f)

Full schema

# tokentrace.yml

# transport — where spans are delivered
transport:
  type: file          # "file", "http", "stdout", "multi", or "noop"

  # For type: file
  file:
    path: ./traces.jsonl
    rotate: true
    max_size_mb: 100
    max_files: 7
    sync: false
    buffer_size_kb: 256

  # For type: http
  http:
    endpoint: https://collector.example.com/spans
    batch_size: 100
    flush_interval: 2s
    timeout: 10s
    max_retries: 3
    retry_backoff: 500ms
    compression: gzip       # "gzip" or "none"
    max_queue_size: 10000
    headers:
      Authorization: "Bearer ${TOKENTRACE_TOKEN}"

  # For type: multi — fan out to multiple transports
  multi:
    - type: file
      file:
        path: ./traces.jsonl
    - type: http
      http:
        endpoint: https://collector.example.com/spans

# http_server — expose metrics and ingestion API
http_server:
  enabled: true
  addr: ":9090"
  metrics_path: /metrics
  prometheus_path: /metrics/prometheus
  ingest_path: /spans
  health_path: /health
  read_timeout: 5s
  write_timeout: 10s
  auth_token: "${TOKENTRACE_API_TOKEN}"   # optional; omit to disable auth

# retention — how long spans are kept in the in-process ring buffer
# and (if using FileTransport with the HTTP server) on disk
retention:
  memory_window: 7d    # keep up to 7 days of spans in memory
  disk_days: 30        # keep rotated JSONL files for 30 days (FileTransport only)

# alerts — list of alert rules
alerts:
  - name: hourly-cost-spike
    metric: total_cost
    op: gt
    threshold: 10.00
    window: 1h
    cooldown: 4h
    min_spans: 5
    delivery:
      type: http
      url: https://hooks.example.com/alerts
      timeout: 5s
      headers:
        Authorization: "Bearer ${ALERT_TOKEN}"

  - name: p95-latency-regression
    metric: latency_p95
    op: gt
    threshold: 3000
    window: 30m
    delivery:
      type: stdout

  - name: quality-drop
    metric: quality_score
    op: lt
    threshold: 0.75
    window: 1h
    min_spans: 20
    filter:
      model: gpt-4o
    delivery:
      type: http
      url: https://hooks.example.com/alerts

  - name: error-rate-spike
    metric: error_rate
    op: gt
    threshold: 0.05
    window: 15m
    min_spans: 10
    delivery:
      type: stdout

# custom_metrics — derived metrics beyond the built-ins
custom_metrics:
  - name: legal_doc_cost
    type: sum
    field: cost
    filter_key: document_type
    filter_val: legal

  - name: agent_step_count
    type: count
    filter_key: step

# pricing — override built-in model pricing
pricing:
  models:
    my-fine-tuned-model:
      prompt_per_1m: 3.00
      completion_per_1m: 12.00
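The field names prompt_per_1m and completion_per_1m suggest prices per one million tokens. Assuming that reading, the cost of a single span could be computed as in the sketch below. `modelPrice` and `spanCost` are hypothetical names for illustration, not part of tokentrace.

```go
package main

import "fmt"

// modelPrice mirrors one entry under pricing.models (hypothetical type).
type modelPrice struct {
	PromptPer1M     float64
	CompletionPer1M float64
}

// spanCost prices a span from its token counts, assuming the *_per_1m
// fields are prices per one million tokens.
func spanCost(p modelPrice, promptTokens, completionTokens int) float64 {
	const perMillion = 1_000_000.0
	return float64(promptTokens)/perMillion*p.PromptPer1M +
		float64(completionTokens)/perMillion*p.CompletionPer1M
}

func main() {
	// Prices from the my-fine-tuned-model entry in the schema above.
	price := modelPrice{PromptPer1M: 3.00, CompletionPer1M: 12.00}
	fmt.Printf("%.4f\n", spanCost(price, 2000, 500))
}
```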

Field reference

transport

| Field | Type | Default | Description |
|---|---|---|---|
| type | string | "stdout" | Transport type: file, http, stdout, multi, noop |

transport.file

| Field | Type | Default | Description |
|---|---|---|---|
| path | string | ./traces.jsonl | Path to the output file |
| rotate | bool | false | Enable log rotation |
| max_size_mb | int | 100 | Rotate when file exceeds this size |
| max_files | int | 7 | Number of rotated files to keep |
| sync | bool | false | Call fsync after each write |
| buffer_size_kb | int | 256 | Write buffer size in KB |

transport.http

| Field | Type | Default | Description |
|---|---|---|---|
| endpoint | string | | Required. URL to POST spans to |
| batch_size | int | 100 | Maximum spans per request |
| flush_interval | duration | 2s | Maximum time between flushes |
| timeout | duration | 10s | HTTP request timeout |
| max_retries | int | 3 | Retry attempts after failure |
| retry_backoff | duration | 500ms | Initial retry backoff (doubles each attempt) |
| compression | string | "none" | "gzip" or "none" |
| max_queue_size | int | 10000 | Drop spans when queue exceeds this |
| headers | map | | Additional HTTP headers |
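To make the interaction of retry_backoff and max_retries concrete, the sketch below computes the wait before each retry under the doubling rule described in the table. It assumes no jitter and no upper cap, neither of which is specified here; `backoffSchedule` is a hypothetical helper, not a tokentrace function.

```go
package main

import (
	"fmt"
	"time"
)

// backoffSchedule returns the wait before each retry: the initial backoff,
// doubled for every subsequent attempt. Assumes no jitter and no cap.
func backoffSchedule(initial time.Duration, maxRetries int) []time.Duration {
	if maxRetries < 0 {
		maxRetries = 0
	}
	waits := make([]time.Duration, 0, maxRetries)
	wait := initial
	for i := 0; i < maxRetries; i++ {
		waits = append(waits, wait)
		wait *= 2
	}
	return waits
}

func main() {
	// Defaults from the table: retry_backoff 500ms, max_retries 3.
	fmt.Println(backoffSchedule(500*time.Millisecond, 3)) // [500ms 1s 2s]
}
```

Under these assumptions, the default settings add up to 3.5 seconds of waiting across all retries, on top of the per-request timeout.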

http_server

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Start the HTTP server |
| addr | string | ":9090" | Listen address |
| auth_token | string | | Bearer token required for all requests; omit to disable auth |
| read_timeout | duration | 5s | HTTP read timeout |
| write_timeout | duration | 10s | HTTP write timeout |

alerts

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Unique name for this rule |
| metric | string | yes | Metric to evaluate |
| op | string | yes | gt, gte, lt, lte |
| threshold | float | yes | Threshold value |
| window | duration | yes | Time window for metric evaluation |
| cooldown | duration | no | Minimum time between firings (default: window) |
| min_spans | int | no | Minimum spans required to evaluate (default: 0) |
| filter | map | no | Key/value attribute filter |
| delivery.type | string | yes | http or stdout |
| delivery.url | string | if http | Webhook URL |
| delivery.timeout | duration | no | Delivery request timeout (default: 5s) |
| delivery.headers | map | no | Additional HTTP headers for delivery |
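The way op, threshold, min_spans, and cooldown combine can be sketched as a single decision function. This is an illustration under assumptions, not tokentrace's evaluator: the `rule` type, `compare`, and `shouldFire` are hypothetical, the metric value is taken as already aggregated over the window, and filters and delivery are left out.

```go
package main

import (
	"fmt"
	"time"
)

// rule is a hypothetical in-memory form of one entry under alerts.
type rule struct {
	Op        string
	Threshold float64
	MinSpans  int
	Window    time.Duration
	Cooldown  time.Duration // zero means "use Window", per the table default
}

// compare applies one of the four documented operators.
func compare(op string, value, threshold float64) bool {
	switch op {
	case "gt":
		return value > threshold
	case "gte":
		return value >= threshold
	case "lt":
		return value < threshold
	case "lte":
		return value <= threshold
	}
	return false // an unknown operator never fires in this sketch
}

// shouldFire reports whether the rule fires for a metric value computed over
// its window. sinceLast is the time since the rule last fired; pass a
// negative value if it has never fired.
func shouldFire(r rule, value float64, spans int, sinceLast time.Duration) bool {
	if spans < r.MinSpans {
		return false // too few spans to evaluate
	}
	cooldown := r.Cooldown
	if cooldown == 0 {
		cooldown = r.Window
	}
	if sinceLast >= 0 && sinceLast < cooldown {
		return false // still cooling down
	}
	return compare(r.Op, value, r.Threshold)
}

func main() {
	// Values from the hourly-cost-spike rule in the schema above.
	r := rule{Op: "gt", Threshold: 10.00, MinSpans: 5, Window: time.Hour, Cooldown: 4 * time.Hour}
	fmt.Println(shouldFire(r, 12.5, 8, -1))          // never fired before
	fmt.Println(shouldFire(r, 12.5, 3, -1))          // below min_spans
	fmt.Println(shouldFire(r, 12.5, 8, 2*time.Hour)) // inside cooldown
}
```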

Environment variable substitution

Use ${VAR_NAME} anywhere in the YAML file to substitute an environment variable at load time. If the variable is not set, the reference is replaced with an empty string. To require a variable, use ${VAR_NAME:?error message}: if the variable is unset, tokentrace refuses to start and prints the error message.

http:
  endpoint: "${COLLECTOR_URL:?COLLECTOR_URL must be set}"
  headers:
    Authorization: "Bearer ${TOKENTRACE_TOKEN:?TOKENTRACE_TOKEN must be set}"
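The two forms behave much like shell parameter expansion. The sketch below reproduces the documented behavior with the standard library only; it is not tokentrace's loader, and `expand` is a hypothetical function. It treats only truly unset variables as missing. Whether tokentrace also treats a set-but-empty variable as missing is not specified here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"regexp"
)

// ref matches ${VAR} and ${VAR:?message}.
var ref = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(?::\?([^}]*))?\}`)

// expand replaces every reference in s using lookup. An unset ${VAR} becomes
// the empty string; an unset ${VAR:?message} makes expand return an error
// carrying the message.
func expand(s string, lookup func(string) (string, bool)) (string, error) {
	var firstErr error
	out := ref.ReplaceAllStringFunc(s, func(m string) string {
		idx := ref.FindStringSubmatchIndex(m)
		name := m[idx[2]:idx[3]]
		if val, ok := lookup(name); ok {
			return val
		}
		// idx[4] is -1 when the ":?message" part is absent.
		if idx[4] >= 0 && firstErr == nil {
			firstErr = errors.New(m[idx[4]:idx[5]])
		}
		return ""
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

func main() {
	os.Setenv("TOKENTRACE_TOKEN", "example-token")
	os.Unsetenv("COLLECTOR_URL")

	header, err := expand("Bearer ${TOKENTRACE_TOKEN}", os.LookupEnv)
	fmt.Println(header, err)

	_, err = expand("${COLLECTOR_URL:?COLLECTOR_URL must be set}", os.LookupEnv)
	fmt.Println(err)
}
```

Passing the lookup as a function keeps the sketch easy to test with a plain map instead of the real environment.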

Next steps

  • Transports — Transport implementation details and options.
  • Alerts — Alert rule semantics and delivery options.
  • HTTP API — Enable and use the HTTP server.