JsonHilo.js
Fully-featured minimal modular ultra-fast zero-dependency low-level lossless streaming JSON parser with a high-level interface, written in runtime-independent JavaScript.
A stand-alone part of the TAO-JSON interoperability project.
Part of the TAO-Deno project.
Version
v0+2021-06-20+beta
Version 0 from 2021-06-20, beta.
Version 0 means that some of the parser’s API might still change before reaching version 1, which will signify a stable API.
However, the event API is mostly stable and the parser is suitable for use.
It passes standards-compliance tests and performs well in benchmarks.
Installation
Not necessary. Import modules directly from deno.land/x:
import {JsonHigh} from 'https://deno.land/x/jsonhilo@v0+2021-06-20+beta/mod.js'
Or from jsDelivr:
import {JsonHigh} from 'https://cdn.jsdelivr.net/gh/tree-annotation/jsonhilo@v0+2021-06-20+beta/mod.js'
This should work out of the box in Deno and the browser.
Quickstart
See a basic example in demo.js, pasted below:
import {JsonHigh, JsonHighEventType} from './JsonHigh.js'
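// The callback receives one event per structure boundary, key, or primitive value.
const stream = JsonHigh((event) => {
  switch (event.type) {
    // Structural events have no properties besides the type.
    case JsonHighEventType.openArray:
    case JsonHighEventType.openObject:
    case JsonHighEventType.closeArray:
    case JsonHighEventType.closeObject:
      console.log(event.type)
      break
    // A key event carries the object key as a JavaScript string.
    case JsonHighEventType.key:
      console.log('key:', event.key)
      break
    // A value event carries a primitive: true, false, null, a number, or a string.
    case JsonHighEventType.value:
      console.log('value:', typeof event.value, event.value)
      break
  }
})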
stream.push('{"array": [null, true, false, 1.2e-3, "[demo]"]}')
This uses the simplified high-level interface built on top of the more powerful low-level core.
Features
- TAO-style simple
- Dependency-free
- Minimal
- Runtime-independent
- Lossless
- Modular
- Fast
- Streaming-friendly
- Optionally standards-compliant
- Unicode-compatible
Runtime-independent
The code is written in modern JavaScript and relies upon some of its features, standard modules in particular.
Beyond that it does not use any runtime-specific features and should work in any modern JavaScript environment. It was tested in Deno, Node.js, and the browser.
Lossless
Unlike any other known streaming JSON parser, this one provides a low-level interface for lossless parsing, i.e. it is possible to recover the exact input, including whitespace and string escape sequences, from parser events.
This feature can be used to implement accurate translators from JSON to other representations (in particular TAO), syntax highlighters, JSON scanners that search for substrings in strings on-the-fly, without first loading them into memory, and more.
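To illustrate what "lossless" is contrasted with, here is a minimal sketch that uses only the built-in JSON.parse and JSON.stringify (not this library). It shows that a conventional parse-then-serialize round trip discards whitespace, escape sequences, and the original spelling of numbers. The sample input is made up for illustration.

```javascript
// Input with insignificant whitespace, a Unicode escape, and a number written as 1.0.
const text = '{ "a" : "\\u0041",\n  "n" : 1.0 }'

// Round-trip through the built-in, non-streaming API.
const roundTripped = JSON.stringify(JSON.parse(text))

console.log(roundTripped)          // {"a":"A","n":1}
console.log(roundTripped === text) // false: whitespace, the escape, and "1.0" are gone
```

A lossless event stream, by contrast, would retain enough detail to reproduce `text` exactly.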
Modular
The library is highly modular with a fully independent core, around which various adapters and extensions are built, including an easy-to-use high-level interface.
JsonLow
The core module is JsonLow.js. It has no dependencies, so it can be used on its own. It is very minimal and is optimized for performance and accuracy. It provides the most fine-grained control over the parsing process. The events generated by the parser carry enough information to losslessly recreate the input exactly, including whitespace.
JsonHigh
JsonHigh.js is the high-level module which provides a more convenient interface. It is composed of auxiliary modules and adapters built around the core. It is optimized for convenience and provides similar functionality and granularity to other streaming parsers, such as clarinet or creationix/jsonparse.
Events
There are 4 event types without properties which indicate start and end of structures:
- JsonHighEventType.openArray: an array started ([)
- JsonHighEventType.closeArray: an array ended (])
- JsonHighEventType.openObject: an object started ({)
- JsonHighEventType.closeObject: an object ended (})
And 2 event types with a property which capture primitives:
- JsonHighEventType.key: an object’s key ended. The key property of the event contains the key as a JavaScript string.
- JsonHighEventType.value: a primitive JSON value ended. The value property of the event contains the corresponding JavaScript value: true, false, null, a number, or a string.
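The six event types above are enough to rebuild ordinary JavaScript values. The following is a minimal sketch of such a builder, not part of the library. To keep it runnable on its own, it uses a local stand-in `EventType` object with placeholder string values; in real use you would compare against the library's JsonHighEventType instead. The names `makeTreeBuilder`, `handle`, and `results` are hypothetical.

```javascript
// Stand-in for the library's JsonHighEventType (placeholder values, an assumption).
const EventType = {
  openArray: 'openArray', closeArray: 'closeArray',
  openObject: 'openObject', closeObject: 'closeObject',
  key: 'key', value: 'value',
}

// Returns an event handler and the list of completed top-level values.
const makeTreeBuilder = () => {
  const results = [] // completed top-level values
  const stack = []   // open containers, innermost last: {container, key}

  // Put a finished value into its parent, or into results at the top level.
  const attach = (value) => {
    if (stack.length === 0) { results.push(value); return }
    const top = stack[stack.length - 1]
    if (Array.isArray(top.container)) top.container.push(value)
    else top.container[top.key] = value
  }

  const handle = (event) => {
    switch (event.type) {
      case EventType.openArray: stack.push({container: [], key: undefined}); break
      case EventType.openObject: stack.push({container: {}, key: undefined}); break
      case EventType.closeArray:
      case EventType.closeObject: attach(stack.pop().container); break
      case EventType.key: stack[stack.length - 1].key = event.key; break
      case EventType.value: attach(event.value); break
    }
  }
  return {handle, results}
}

// Hand-written events equivalent to the input {"list": [null, 1]}.
const builder = makeTreeBuilder()
const sampleEvents = [
  {type: EventType.openObject},
  {type: EventType.key, key: 'list'},
  {type: EventType.openArray},
  {type: EventType.value, value: null},
  {type: EventType.value, value: 1},
  {type: EventType.closeArray},
  {type: EventType.closeObject},
]
sampleEvents.forEach(builder.handle)
console.log(JSON.stringify(builder.results[0])) // {"list":[null,1]}
```

Since JsonHigh takes a single callback, a handler of this shape could presumably be passed to it directly once the stand-in constants are replaced with the library's.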
Fast
Preliminary benchmarks show that the low-level parser is on average at least as fast as clarinet, which is the fastest streaming JSON parser in JavaScript I could find.
Streaming-friendly
By default, the parser is streaming-friendly and accepts the following:
- Multiple consecutive top-level JSON values – it can read line-delimited JSON and concatenated JSON, e.g. JSON Lines, ndjson. Whitespace-separated primitives are also supported.
- Trailing commas – a single trailing comma in an array or an object generates no errors.
- Zero-length or whitespace-only input – generates no errors.
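Because several top-level values may arrive in one stream, a consumer typically needs to know where one value ends and the next begins. One way to do that is to track nesting depth over the high-level events, as in this minimal sketch. It is not part of the library: `EventType` is a local stand-in with placeholder values for JsonHighEventType, and `makeTopLevelCounter` is a hypothetical helper name.

```javascript
// Stand-in for the library's JsonHighEventType (placeholder values, an assumption).
const EventType = {
  openArray: 'openArray', closeArray: 'closeArray',
  openObject: 'openObject', closeObject: 'closeObject',
  key: 'key', value: 'value',
}

// Calls onTopLevelValue each time a complete top-level value has been seen.
const makeTopLevelCounter = (onTopLevelValue) => {
  let depth = 0 // number of currently open arrays and objects
  return (event) => {
    switch (event.type) {
      case EventType.openArray:
      case EventType.openObject:
        depth += 1
        break
      case EventType.closeArray:
      case EventType.closeObject:
        depth -= 1
        if (depth === 0) onTopLevelValue()
        break
      case EventType.value:
        // A primitive outside any container is itself a top-level value.
        if (depth === 0) onTopLevelValue()
        break
      // Key events never end a top-level value, so they are ignored.
    }
  }
}

// Hand-written events equivalent to the concatenated input: {"a": 1} [2] 3
let count = 0
const handle = makeTopLevelCounter(() => { count += 1 })
const concatenatedEvents = [
  {type: EventType.openObject},
  {type: EventType.key, key: 'a'},
  {type: EventType.value, value: 1},
  {type: EventType.closeObject},
  {type: EventType.openArray},
  {type: EventType.value, value: 2},
  {type: EventType.closeArray},
  {type: EventType.value, value: 3},
]
concatenatedEvents.forEach(handle)
console.log(count) // 3
```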
Standards-compliant
The streaming-friendly features can be suppressed by Ecma404.js, an adapter module which provides full ECMA-404/RFC 8259 compliance.
This is confirmed by passing the full JSON Parsing Test Suite.
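For a sense of what strict compliance rules out, the sketch below checks the streaming-friendly inputs listed earlier against the built-in JSON.parse, which follows the same grammar. It does not use Ecma404.js, whose API is not shown here; `acceptedByStrictParser` is a hypothetical helper name.

```javascript
// Returns true if the built-in strict parser accepts the text as a single JSON value.
const acceptedByStrictParser = (text) => {
  try {
    JSON.parse(text)
    return true
  } catch (error) {
    return false
  }
}

// Inputs that a streaming-friendly parser tolerates but strict JSON does not.
const streamingFriendlyInputs = [
  '[1, 2,]',             // trailing comma in an array
  '{"a": 1,}',           // trailing comma in an object
  '',                    // zero-length input
  '   ',                 // whitespace-only input
  '1 2',                 // whitespace-separated primitives
  '{"a": 1}\n{"b": 2}',  // line-delimited top-level values
]

const accepted = streamingFriendlyInputs.map(acceptedByStrictParser)
console.log(accepted.some((ok) => ok))          // false: every input is rejected
console.log(acceptedByStrictParser('[1, 2]'))   // true
```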
Unicode-compatible
The core logic operates on Unicode code points – in line with spec – rather than code units or characters.
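The difference between code points and code units matters for characters outside the Basic Multilingual Plane, which a JavaScript string stores as two UTF-16 code units. The following sketch uses only standard string methods to show the two views of the same text. It illustrates the distinction in general and is not taken from the parser's source.

```javascript
// 'a', U+1F600 (an emoji outside the Basic Multilingual Plane), 'b'
const text = 'a\u{1F600}b'

// String iteration yields whole code points.
const codePoints = Array.from(text, (ch) => ch.codePointAt(0))

// Index-based access yields UTF-16 code units, splitting the emoji in two.
const codeUnits = Array.from({length: text.length}, (_, i) => text.charCodeAt(i))

console.log(codePoints.map((n) => n.toString(16))) // [ '61', '1f600', '62' ]
console.log(codeUnits.map((n) => n.toString(16)))  // [ '61', 'd83d', 'de00', '62' ]
```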
© 2021 xtao.org