About Grammar Learner

Walk-state grammar inference and structural compression for repeating data streams.

What it does

Grammar Learner ingests structured records — JSON, CSV, or key=value — and infers a walk-state grammar: a compact model of how fields relate and transition across records. Once learned, the grammar compresses new records in the same stream at ratios of 10–60× by encoding only deviations from predicted state.

Ideal for IoT telemetry, log streams, sensor arrays, medical vitals, and any domain where records share a predictable schema with slowly-changing values.

How it works

  1. Records are ingested and their format is auto-detected (JSON / CSV / key=value)
  2. Field types are inferred (numeric, categorical, timestamp, boolean)
  3. A transition model is built for each field: P(value_t | value_{t-1}, field)
  4. The walk grammar is encoded as one Markov table per field
  5. Compression: only the deviation from the grammar's prediction is encoded
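
Steps 3 to 5 can be illustrated with a small Python sketch. It is not Grammar Learner's implementation: the function names (learn_transitions, probability, predict, encode, decode) are hypothetical. It assumes that the prediction is simply the most frequent successor of the previous value and that every record shares one fixed schema.

```python
from collections import Counter, defaultdict


def learn_transitions(records):
    """Count value -> next-value transitions for every field."""
    # table[field][previous_value] is a Counter of the values that followed it.
    table = defaultdict(lambda: defaultdict(Counter))
    for prev, curr in zip(records, records[1:]):
        for field, value in curr.items():
            if field in prev:
                table[field][prev[field]][value] += 1
    return table


def probability(table, field, prev_value, value):
    """Estimate P(value_t | value_{t-1}, field) from the transition counts."""
    successors = table.get(field, {}).get(prev_value)
    if not successors:
        return 0.0
    return successors[value] / sum(successors.values())


def predict(table, field, prev_value):
    """Return the most frequent successor of prev_value, or None if unseen."""
    successors = table.get(field, {}).get(prev_value)
    if not successors:
        return None
    return successors.most_common(1)[0][0]


def encode(table, prev, curr):
    """Keep only the fields whose value deviates from the prediction."""
    return {
        field: value
        for field, value in curr.items()
        if predict(table, field, prev.get(field)) != value
    }


def decode(table, prev, delta):
    """Rebuild a record from the previous record plus the stored deviations."""
    # Assumption: fixed schema, so every field appears in prev or in delta.
    fields = list(prev) + [f for f in delta if f not in prev]
    return {
        f: delta[f] if f in delta else predict(table, f, prev[f])
        for f in fields
    }
```

A record that matches the prediction in every field encodes to an empty dictionary, which is where the compression comes from in this sketch. Exact-match counting only suits values that repeat; continuous numeric fields would need binning or delta coding, which is left out here.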

Supported Data Formats

JSON Array
[{"ts": 1714000000, "val": 22.5},
 {"ts": 1714000060, "val": 22.7}]

CSV with Header
ts,method,status,bytes
1714000000,GET,200,1842
1714000001,POST,201,512
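
CSV cells arrive as strings, so the field types named under "How it works" (numeric, categorical, timestamp, boolean) have to be inferred from content. The sketch below shows one possible heuristic. The function name infer_type and the epoch-seconds range used to recognise timestamps are assumptions made for illustration, not documented Grammar Learner behaviour.

```python
def infer_type(values):
    """Classify a column of strings as boolean, timestamp, numeric, or categorical."""
    if not values:
        return "categorical"
    lowered = {v.strip().lower() for v in values}
    if lowered <= {"true", "false"}:
        return "boolean"
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        # At least one cell is not a number, so treat the column as labels.
        return "categorical"
    # Assumption: whole numbers in this range are Unix epoch seconds
    # (roughly the years 2001 to 2286).
    if all(n.is_integer() and 1_000_000_000 <= n < 10_000_000_000 for n in numbers):
        return "timestamp"
    return "numeric"
```

With the sample above, the ts column would be classed as a timestamp and method as categorical. The status column comes out as numeric under this heuristic even though HTTP status codes behave like categories, so a real classifier would likely also look at the number of distinct values.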

Key=Value Blocks
hostname=node-01
status=active
firmware=2.4.1

hostname=node-02
status=active
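
The three formats above can be told apart with a few checks on the first line of input. The following is a minimal sketch of that idea using only the Python standard library. The names detect_format and parse_records are hypothetical, and the detection rules are assumptions rather than Grammar Learner's actual logic.

```python
import csv
import io
import json
import re


def detect_format(text):
    """Guess whether text is a JSON array, CSV with header, or key=value blocks."""
    stripped = text.strip()
    if stripped.startswith("["):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    first_line = stripped.splitlines()[0] if stripped else ""
    # Assumption: a key=value line contains "=" and no commas.
    if "=" in first_line and "," not in first_line:
        return "kv"
    if "," in first_line:
        return "csv"
    return "unknown"


def parse_records(text):
    """Parse text into a list of dictionaries, one per record."""
    fmt = detect_format(text)
    if fmt == "json":
        return json.loads(text)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text.strip())))
    if fmt == "kv":
        # A blank line separates one key=value record from the next.
        blocks = re.split(r"\n\s*\n", text.strip())
        return [
            dict(line.split("=", 1) for line in block.splitlines() if "=" in line)
            for block in blocks
        ]
    raise ValueError("unrecognised record format")
```

CSV and key=value values come back as strings, while JSON values keep their native types. A key=value line whose value contains a comma would be misread as CSV by this heuristic.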

Use Cases

High-compression domains
  • IoT sensor telemetry (temp, humidity, pressure)
  • Medical vitals streams (HR, SpO2, BP)
  • SCADA/ICS process variables
  • Financial tick data
  • Edge device health metrics

Moderate-compression domains
  • Web server access logs
  • Network flow records (NetFlow/IPFIX)
  • Security event streams
  • Device configuration registries
  • Application performance metrics