About Grammar Learner
Walk-state grammar inference and structural compression for repeating data streams.
What it does
Grammar Learner ingests structured records — JSON, CSV, or key=value — and infers a walk-state grammar: a compact model of how fields relate and transition across records. Once learned, the grammar compresses new records in the same stream at ratios of 10–60× by encoding only deviations from predicted state.
Ideal for IoT telemetry, log streams, sensor arrays, medical vitals, and any domain where records share a predictable schema with slowly-changing values.
How it works
- Records ingested and format auto-detected (JSON / CSV / key=value)
- Field types inferred (numeric, categorical, timestamp, boolean)
- Transition model built: P(value_t | value_{t-1}, field)
- Walk grammar encoded as one Markov transition table per field
- Compression: encode only the deviations from the grammar's prediction
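The type-inference, transition-model, and deviation-encoding steps above can be illustrated with a minimal sketch. This is not Grammar Learner's actual implementation: every function name here (infer_type, learn_transitions, predict, encode, decode) is hypothetical, the epoch range used to spot timestamps is an assumption, and "prediction" is simplified to the most frequent successor in a first-order table. It also assumes a fixed schema and handles exact-match values only, so slowly drifting numeric fields would need quantization or delta coding that is not shown.

```python
from collections import Counter, defaultdict

def infer_type(values):
    """Classify a column as boolean, timestamp, numeric, or categorical."""
    texts = [str(v).strip().lower() for v in values]
    if not texts:
        return "categorical"
    if all(t in ("true", "false") for t in texts):
        return "boolean"
    try:
        numbers = [float(t) for t in texts]
    except ValueError:
        return "categorical"
    # Assumption: integers in a plausible Unix-epoch range count as timestamps.
    if all(n.is_integer() and 1_000_000_000 <= n < 10_000_000_000 for n in numbers):
        return "timestamp"
    return "numeric"

def learn_transitions(records):
    """Count value_{t-1} -> value_t transitions separately for each field."""
    table = defaultdict(lambda: defaultdict(Counter))
    prev = {}
    for rec in records:
        for field, value in rec.items():
            if field in prev:
                table[field][prev[field]][value] += 1
            prev[field] = value
    return table

def predict(table, field, prev_value):
    """Return the most frequent successor of prev_value, or None if unseen."""
    successors = table.get(field, {}).get(prev_value)
    if not successors:
        return None
    return successors.most_common(1)[0][0]

def encode(table, records):
    """Keep, per record, only the fields whose value differs from the prediction."""
    prev, deltas = {}, []
    for rec in records:
        delta = {}
        for field, value in rec.items():
            predicted = predict(table, field, prev[field]) if field in prev else None
            if predicted != value:
                delta[field] = value
            prev[field] = value
        deltas.append(delta)
    return deltas

def decode(table, deltas, fields):
    """Rebuild full records by filling omitted fields from the prediction."""
    prev, records = {}, []
    for delta in deltas:
        rec = {}
        for field in fields:
            rec[field] = delta[field] if field in delta else predict(table, field, prev[field])
            prev[field] = rec[field]
        records.append(rec)
    return records

training = [{"status": "active", "mode": m} for m in ("A", "B", "A", "B")]
table = learn_transitions(training)
stream = [
    {"status": "active", "mode": "A"},
    {"status": "active", "mode": "B"},
    {"status": "idle", "mode": "A"},
]
deltas = encode(table, stream)
print(deltas)  # [{'status': 'active', 'mode': 'A'}, {}, {'status': 'idle'}]
```

In this sketch the first record is always sent in full because there is no previous state to predict from. Later records shrink to the fields that broke the prediction, which is the source of the compression.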
Supported Data Formats
JSON Array
[{"ts": 1714000000, "val": 22.5},
{"ts": 1714000060, "val": 22.7}]
CSV with Header
ts,method,status,bytes
1714000000,GET,200,1842
1714000001,POST,201,512
Key=Value Blocks
hostname=node-01
status=active
firmware=2.4.1

hostname=node-02
status=active
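As a rough illustration of how the three formats above might be told apart and parsed into records, here is a small heuristic sketch. The names detect_format and parse_records are hypothetical, and the rules are assumptions rather than the tool's real detection logic: JSON is recognized by a leading bracket or brace, key=value input by every token on the first line containing "=", and CSV by a comma in the first line. Key=value records are assumed to be separated by blank lines.

```python
import csv
import io
import json
import re

def detect_format(text):
    """Guess whether text is JSON, CSV, or key=value blocks (heuristic)."""
    stripped = text.strip()
    if stripped[:1] in ("[", "{"):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    lines = stripped.splitlines()
    first_line = lines[0] if lines else ""
    tokens = first_line.split()
    if tokens and all("=" in tok for tok in tokens):
        return "kv"
    if "," in first_line:
        return "csv"
    return "unknown"

def parse_records(text):
    """Parse text into a list of dicts, one dict per record."""
    fmt = detect_format(text)
    if fmt == "json":
        data = json.loads(text)
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        # The first row is treated as the header; values stay as strings.
        return list(csv.DictReader(io.StringIO(text.strip())))
    if fmt == "kv":
        # Assumption: a blank line ends one record and starts the next.
        blocks = re.split(r"\n\s*\n", text.strip())
        return [dict(tok.split("=", 1) for tok in block.split()) for block in blocks]
    raise ValueError("unrecognized record format")

csv_text = "ts,method,status,bytes\n1714000000,GET,200,1842\n1714000001,POST,201,512"
kv_text = "hostname=node-01\nstatus=active\nfirmware=2.4.1\n\nhostname=node-02\nstatus=active"
print(detect_format(csv_text))  # csv
print(parse_records(kv_text))
```

CSV and key=value values come back as strings, so a type-inference pass would still be needed before any transition model is built.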
Use Cases
High-compression domains
- IoT sensor telemetry (temp, humidity, pressure)
- Medical vitals streams (HR, SpO2, BP)
- SCADA/ICS process variables
- Financial tick data
- Edge device health metrics
Moderate-compression domains
- Web server access logs
- Network flow records (NetFlow/IPFIX)
- Security event streams
- Device configuration registries
- Application performance metrics