Regex Parse
Regex-parse is a simple regular expression parser: it takes a regular expression as a string and outputs an abstract syntax tree.
Usage
import { parse, Node, guard } from "https://deno.land/x/regex_parse/mod.ts";
const tree = parse("(?<name>foo)|(bar)*foo{3,4}");
function walk(s: Node) {
  if (guard.isChar(s)) {
    // handle a literal character node
  }
  if (guard.isGroup(s)) {
    // handle a group node
  }
}
walk(tree);
This module only parses regular expressions and provides no other APIs. Use the built-in guards to help you process nodes.
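As a rough illustration of processing nodes recursively, the sketch below counts how often each node type occurs in a tree. It deliberately does not import the module: AstNode, childrenOf, and countTypes are hypothetical local stand-ins, the child, left, right, and expressions fields are taken from the node shapes documented in this README, and the sample tree is the one this README shows as output for (?<number>[0-9]|a*)?.

```typescript
// Hypothetical stand-in for the module's Node type, widened so the sample
// tree below type-checks without importing anything.
interface AstNode {
  type: string;
  child?: AstNode;
  left?: AstNode;
  right?: AstNode;
  expressions?: AstNode[];
  [extra: string]: unknown;
}

// Collect the direct children of a node, whichever fields it uses.
function childrenOf(node: AstNode): AstNode[] {
  const direct = [node.child, node.left, node.right, ...(node.expressions ?? [])];
  return direct.filter((n): n is AstNode => n !== undefined);
}

// Walk the tree depth-first and count how often each node type occurs.
function countTypes(
  node: AstNode,
  counts: Map<string, number> = new Map<string, number>(),
): Map<string, number> {
  counts.set(node.type, (counts.get(node.type) ?? 0) + 1);
  for (const next of childrenOf(node)) countTypes(next, counts);
  return counts;
}

// Sample tree copied from the output documented for "(?<number>[0-9]|a*)?".
const sampleTree: AstNode = {
  type: "quntifier",
  kind: "?",
  child: {
    type: "group",
    kind: "named",
    name: "number",
    child: {
      type: "union",
      left: {
        type: "characterClass",
        negated: false,
        expressions: [{ type: "classRange", from: "0", to: "9" }],
      },
      right: { type: "quntifier", kind: "*", child: { type: "char", value: "a" } },
    },
  },
};

const typeCounts = countTypes(sampleTree);
typeCounts.forEach((count, type) => console.log(type, count));
```

With the real module, the guards exported as guard would take the place of the plain type comparisons used here.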
CLI
Installation:
deno install -n rep https://deno.land/x/regex_parse/rep.ts
Options:
rep --help
Usage:
$ rep <expression>
Commands:
<expression> regular expression to parse
For more info, run any command with the `--help` flag:
$ rep --help
Options:
-h, --help Display this message
-v, --version Display version number
rep also processes a regular expression from stdin if it receives no arguments:
echo -n "(?<number>[0-9]|a*)?" | rep | jq
{
  "type": "quntifier",
  "kind": "?",
  "child": {
    "type": "group",
    "kind": "named",
    "child": {
      "type": "union",
      "left": {
        "type": "characterClass",
        "negated": false,
        "expressions": [
          {
            "type": "classRange",
            "from": "0",
            "to": "9"
          }
        ]
      },
      "right": {
        "type": "quntifier",
        "kind": "*",
        "child": {
          "type": "char",
          "value": "a"
        }
      }
    },
    "name": "number"
  }
}
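Because rep prints plain JSON, the tree can be consumed by any tool. As a hedged sketch that is not part of the module, the TypeScript below parses that output and collects the names of all named groups. AstNode and namedGroups are hypothetical names, and the input string is the output shown above, compacted onto fewer lines.

```typescript
// Hypothetical stand-in for a parsed node; only the fields used here are typed.
interface AstNode {
  type: string;
  child?: AstNode;
  left?: AstNode;
  right?: AstNode;
  expressions?: AstNode[];
  [extra: string]: unknown;
}

// Depth-first search for group nodes whose kind is "named".
function namedGroups(node: AstNode, found: string[] = []): string[] {
  const groupName = node.name;
  if (node.type === "group" && node.kind === "named" && typeof groupName === "string") {
    found.push(groupName);
  }
  for (const next of [node.child, node.left, node.right, ...(node.expressions ?? [])]) {
    if (next) namedGroups(next, found);
  }
  return found;
}

// The JSON that the README shows for "(?<number>[0-9]|a*)?".
const cliOutput =
  '{"type":"quntifier","kind":"?","child":{"type":"group","kind":"named",' +
  '"child":{"type":"union",' +
  '"left":{"type":"characterClass","negated":false,' +
  '"expressions":[{"type":"classRange","from":"0","to":"9"}]},' +
  '"right":{"type":"quntifier","kind":"*","child":{"type":"char","value":"a"}}},' +
  '"name":"number"}}';

const groupNames = namedGroups(JSON.parse(cliOutput) as AstNode);
console.log(groupNames); // [ "number" ]
```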
API
Node
interface Node {
  type: string;
}
This interface represents any JavaScript object that has a type field. The documentation below uses it to refer to any node.
Char
interface Char {
  type: "char";
  value: string;
}
A simple literal character.
Assertion
interface Assertion {
  type: "anchor";
  value: "$" | "^";
}
This node is either a $ end marker or a ^ begin marker.
Control
A control character.
interface Control {
  type: "Control";
  value: string;
}
Currently, the possible values of the value field are \n, \t, \v, \f, and \r.
Escaped
interface Escaped {
  type: "escaped";
  value: string;
}
A character escaped with \. You can find \b for boundary and \k for backreference here if you need them.
Null
interface Null {
  type: "null";
}
This node represents nothing, or the empty string. It is used in situations such as (), where the capturing group has nothing in it, or a|, where there is nothing on one side of the pipe.
CharacterClass
interface CharacterClass {
  type: "characterClass";
  negated: boolean;
  expressions: (Char | ClassRange)[];
}
A character class, surrounded by [].
ClassRange
interface ClassRange {
  type: "classRange";
  from: string;
  to: string;
}
A class range in the format [x-y], in which the code point of x has to be smaller than that of y. This node is only found within a characterClass.
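To make the code point rule concrete, here is a minimal sketch of testing a single character against a CharacterClass node. It is an illustration under assumptions rather than anything the module provides: classMatches is a hypothetical helper, the interfaces are copied from this README, and from, to, and value are assumed to each hold one literal character.

```typescript
interface Char {
  type: "char";
  value: string;
}
interface ClassRange {
  type: "classRange";
  from: string;
  to: string;
}
interface CharacterClass {
  type: "characterClass";
  negated: boolean;
  expressions: (Char | ClassRange)[];
}

// Hypothetical helper: does the first character of `ch` belong to the class?
function classMatches(cls: CharacterClass, ch: string): boolean {
  const cp = ch.codePointAt(0);
  if (cp === undefined) return false; // empty input never matches
  const hit = cls.expressions.some((e) => {
    if (e.type === "char") return e.value === ch;
    // classRange: compare code points, both bounds inclusive
    const lo = e.from.codePointAt(0);
    const hi = e.to.codePointAt(0);
    return lo !== undefined && hi !== undefined && lo <= cp && cp <= hi;
  });
  // A negated class, written [^...], inverts the result.
  return cls.negated ? !hit : hit;
}

// The [0-9] class from the documented output, plus a negated copy.
const digitClass: CharacterClass = {
  type: "characterClass",
  negated: false,
  expressions: [{ type: "classRange", from: "0", to: "9" }],
};
const nonDigitClass: CharacterClass = { ...digitClass, negated: true };

console.log(classMatches(digitClass, "5")); // true
console.log(classMatches(digitClass, "a")); // false
console.log(classMatches(nonDigitClass, "a")); // true
```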
Star
interface Star {
  type: "quntifier";
  kind: "*";
  child: Node;
}
The Kleene star *.
Plus
interface Plus {
  type: "quntifier";
  kind: "+";
  child: Node;
}
The Kleene plus +.
Range
interface Range {
  type: "quntifier";
  kind: "{";
  from: number;
  to?: number;
}
A range quantifier of the form x{min,max}. The to field is missing if the quantifier appears as x{3,}, and likewise if it appears as x{3}.
Option
interface Option {
  type: "quntifier";
  kind: "?";
  child: Node;
}
The optional quantifier ?.
Simple
interface Simple {
  type: "group";
  kind: "simple";
  child: Node;
}
An ordinary capturing group ().
LookAhead
interface LookAhead {
  type: "group";
  kind: "lookAhead";
  negated: boolean;
  child: Node;
}
A lookahead in the form (?=), or negated (?!).
LookBehind
interface LookBehind {
  type: "group";
  kind: "lookBehind";
  negated: boolean;
  child: Node;
}
A lookbehind in the form (?<=), or negated (?<!).
NonCapturing
interface NonCapturing {
  type: "group";
  kind: "nonCapturing";
  child: Node;
}
A non-capturing group in the form (?:).
Named
interface Named {
  type: "group";
  kind: "named";
  child: Node;
  name: string;
}
A named group of the form (?<name>).
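Since all group nodes share type "group" and differ only in kind, a switch on kind is enough to tell them apart. The sketch below maps each documented kind back to the opening delimiter given in this README. GroupLike and groupPrefix are hypothetical names, and the child field is left out because this example does not need it.

```typescript
// Hypothetical union of the documented group shapes, without their child field.
type GroupLike =
  | { type: "group"; kind: "simple" }
  | { type: "group"; kind: "nonCapturing" }
  | { type: "group"; kind: "lookAhead"; negated: boolean }
  | { type: "group"; kind: "lookBehind"; negated: boolean }
  | { type: "group"; kind: "named"; name: string };

// Return the opening delimiter that corresponds to a group node.
function groupPrefix(g: GroupLike): string {
  switch (g.kind) {
    case "simple":
      return "(";
    case "nonCapturing":
      return "(?:";
    case "lookAhead":
      return g.negated ? "(?!" : "(?=";
    case "lookBehind":
      return g.negated ? "(?<!" : "(?<=";
    case "named":
      return `(?<${g.name}>`;
  }
}

console.log(groupPrefix({ type: "group", kind: "named", name: "number" })); // (?<number>
console.log(groupPrefix({ type: "group", kind: "lookBehind", negated: true })); // (?<!
```

Modelling the groups as a discriminated union lets the compiler check that every kind is handled, which is one reason to switch on kind rather than chain if statements.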