Regex Parse

Regex-parse is a simple regular expression parser: it takes a regular expression as a string and outputs an abstract syntax tree.

Usage

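import { parse, Node, guard } from "https://deno.land/x/regex_parse/mod.ts";

// Parse the expression into an abstract syntax tree.
const tree = parse("(?<name>foo)|(bar)*foo{3,4}");

function walk(s: Node) {
  if (guard.isChar(s)) {
    // handle a literal character node here
  }

  if (guard.isGroup(s)) {
    // handle a group node here, for example by walking its child
  }
}

walk(tree);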

This module only parses regular expressions and exposes no other APIs. Use the built-in guards to help you process nodes.
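As a rough illustration of guard-based processing, the sketch below collects the literal characters found inside nested groups. It does not import the module: `AstNode`, `CharNode`, `GroupNode`, `isChar`, `isGroup`, and `collectChars` are hypothetical local stand-ins that mimic the node shapes documented below, and the real `guard` helpers may behave differently.

```typescript
// Local stand-ins for the documented node shapes (assumptions, not the module's exports).
interface AstNode { type: string }
interface CharNode extends AstNode { type: "char"; value: string }
interface GroupNode extends AstNode { type: "group"; kind: string; child: AstNode }

// Hypothetical guards that check the type field, in the spirit of guard.isChar / guard.isGroup.
const isChar = (n: AstNode): n is CharNode => n.type === "char";
const isGroup = (n: AstNode): n is GroupNode => n.type === "group";

// Collect literal characters, descending through any enclosing groups.
function collectChars(node: AstNode, out: string[] = []): string[] {
  if (isChar(node)) out.push(node.value);
  if (isGroup(node)) collectChars(node.child, out);
  return out;
}

// Hand-built tree for the expression ((a)).
const a: CharNode = { type: "char", value: "a" };
const inner: GroupNode = { type: "group", kind: "simple", child: a };
const outer: GroupNode = { type: "group", kind: "simple", child: inner };

console.log(collectChars(outer));
```

A real walker would also need to visit the other container nodes (unions, quantifiers, character classes), which this sketch leaves out.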

CLI

Installation:

deno install -n rep https://deno.land/x/regex_parse/rep.ts

Options:

rep --help
Usage:
  $ rep <expression>

Commands:
  <expression>  regular expression to parse

For more info, run any command with the `--help` flag:
  $ rep --help

Options:
  -h, --help     Display this message 
  -v, --version  Display version number

rep also reads a regular expression from stdin if it receives no arguments:

echo -n "(?<number>[0-9]|a*)?" | rep | jq
{
  "type": "quntifier",
  "kind": "?",
  "child": {
    "type": "group",
    "kind": "named",
    "child": {
      "type": "union",
      "left": {
        "type": "characterClass",
        "negated": false,
        "expressions": [
          {
            "type": "classRange",
            "from": "0",
            "to": "9"
          }
        ]
      },
      "right": {
        "type": "quntifier",
        "kind": "*",
        "child": {
          "type": "char",
          "value": "a"
        }
      }
    },
    "name": "number"
  }
}
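Because the CLI emits plain JSON, the output can be consumed by any script. The sketch below tallies how often each node type appears in the tree shown above. `countTypes` and `sample` are hypothetical names, and the traversal is deliberately generic (it visits every nested object or array) rather than relying on specific field names.

```typescript
// Count nodes by their type field, visiting every nested object and array.
function countTypes(
  value: unknown,
  counts: Record<string, number> = {},
): Record<string, number> {
  if (Array.isArray(value)) {
    for (const item of value) countTypes(item, counts);
  } else if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    const t = record["type"];
    if (typeof t === "string") counts[t] = (counts[t] ?? 0) + 1;
    for (const key of Object.keys(record)) countTypes(record[key], counts);
  }
  return counts;
}

// The JSON printed above for (?<number>[0-9]|a*)?, in compact form.
const sample = `{"type":"quntifier","kind":"?","child":{"type":"group","kind":"named",
  "child":{"type":"union",
    "left":{"type":"characterClass","negated":false,
      "expressions":[{"type":"classRange","from":"0","to":"9"}]},
    "right":{"type":"quntifier","kind":"*","child":{"type":"char","value":"a"}}},
  "name":"number"}}`;

console.log(countTypes(JSON.parse(sample)));
```

For this input the tally contains two quantifier nodes and one each of group, union, characterClass, classRange, and char.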

API

Node

interface Node {
  type: string;
}

This interface represents any JavaScript object that has a type field, that is, any node. The documentation below uses it to refer to a node of any kind.

Char

interface Char {
  type: "char";
  value: string;
}

A simple literal character.

Assertion

interface Assertion {
  type: "anchor";
  value: "$" | "^";
}

This node is either the $ end-of-input marker or the ^ start-of-input marker.

Control

interface Control {
  type: "Control";
  value: string;
}

A control character. Currently, the possible values of the value field are \n, \t, \v, \f, and \r.

Escaped

interface Escaped {
  type: "escaped";
  value: string;
}

A character escaped with a backslash (\). Escapes such as \b (word boundary) and \k (backreference) are reported as this node type, if you need them.

Null

interface Null {
  type: "null";
}

This node represents nothing, that is, the empty string. It is used in situations such as (), where the capturing group has nothing in it, or a|, where there is nothing on one side of the pipe.
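To make the role of the null node concrete, the sketch below hand-builds a plausible tree for a| and renders it back to text. It is an assumption that the union node carries left and right fields (as in the CLI output above) and that the empty side is the right one; `MiniNode` and `render` are hypothetical names that cover only three node types.

```typescript
// A reduced node union for illustration only: char, null, and union.
type MiniNode =
  | { type: "char"; value: string }
  | { type: "null" }
  | { type: "union"; left: MiniNode; right: MiniNode };

// Render a node back to source text; the null node contributes the empty string.
function render(n: MiniNode): string {
  switch (n.type) {
    case "char":
      return n.value;
    case "null":
      return "";
    case "union":
      return render(n.left) + "|" + render(n.right);
  }
}

// Assumed shape of the tree for the expression a|
const aOrNothing: MiniNode = {
  type: "union",
  left: { type: "char", value: "a" },
  right: { type: "null" },
};

console.log(render(aOrNothing));
```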

CharacterClass

interface CharacterClass {
  type: "characterClass";
  negated: boolean;
  expressions: (Char | ClassRange)[];
}

A character class, surrounded by []. The negated field is true for a negated class such as [^a].

ClassRange

interface ClassRange {
  type: "classRange";
  from: string;
  to: string;
}

A class range of the form x-y, as in [x-y], in which the code point of x has to be smaller than that of y. This node is only found within a characterClass.
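The following sketch shows one way to interpret a characterClass node: testing whether a single character belongs to it by comparing code points. The interfaces are local copies of the shapes documented above, `classMatches` is a hypothetical helper, and it assumes that from, to, and value each hold a single literal character (escapes inside classes are not handled).

```typescript
// Local copies of the documented shapes (assumptions, not imported from the module).
interface CharNode { type: "char"; value: string }
interface ClassRangeNode { type: "classRange"; from: string; to: string }
interface CharacterClassNode {
  type: "characterClass";
  negated: boolean;
  expressions: (CharNode | ClassRangeNode)[];
}

// Return true if the first code point of ch is accepted by the class.
function classMatches(cls: CharacterClassNode, ch: string): boolean {
  const cp = ch.codePointAt(0);
  if (cp === undefined) return false; // empty input never matches
  const hit = cls.expressions.some((e) => {
    if (e.type === "char") return e.value === ch;
    const lo = e.from.codePointAt(0);
    const hi = e.to.codePointAt(0);
    return lo !== undefined && hi !== undefined && lo <= cp && cp <= hi;
  });
  return cls.negated ? !hit : hit;
}

// Hand-built node for [0-9], matching the CLI output shown earlier.
const digits: CharacterClassNode = {
  type: "characterClass",
  negated: false,
  expressions: [{ type: "classRange", from: "0", to: "9" }],
};

console.log(classMatches(digits, "5"), classMatches(digits, "a"));
```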

Star

interface Star {
  type: "quntifier";
  kind: "*";
  child: Node;
}

The Kleene star *, matching zero or more repetitions of child.

Plus

interface Plus {
  type: "quntifier";
  kind: "+";
  child: Node;
}

The Kleene plus +, matching one or more repetitions of child.

Range

interface Range {
  type: "quntifier";
  kind: "{";
  from: number;
  to?: number;
}

A range quantifier of the form x{min,max}. The to field is omitted both for an open-ended range such as x{3,} and for an exact count such as x{3}.

Option

interface Option {
  type: "quntifier";
  kind: "?";
  child: Node;
}

The optional quantifier ?, matching zero or one occurrence.
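The four quantifier nodes share a type and differ by kind, so a switch on kind is enough to tell them apart. The sketch below maps each one to repetition bounds. `QuantifierNode` and `bounds` are hypothetical names, the child field is left out for brevity, and the type string is spelled exactly as the parser emits it. For a range without to, max is returned as undefined, since the documented shape alone does not distinguish x{3,} from x{3}.

```typescript
// Reduced quantifier shapes (child omitted); the type string is copied from the docs as-is.
type QuantifierNode =
  | { type: "quntifier"; kind: "*" }
  | { type: "quntifier"; kind: "+" }
  | { type: "quntifier"; kind: "?" }
  | { type: "quntifier"; kind: "{"; from: number; to?: number };

// Map a quantifier to its repetition bounds; max is undefined when to is absent.
function bounds(q: QuantifierNode): { min: number; max: number | undefined } {
  switch (q.kind) {
    case "*":
      return { min: 0, max: Infinity };
    case "+":
      return { min: 1, max: Infinity };
    case "?":
      return { min: 0, max: 1 };
    case "{":
      return { min: q.from, max: q.to };
  }
}

console.log(bounds({ type: "quntifier", kind: "{", from: 3, to: 4 }));
```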

Simple

interface Simple {
  type: "group";
  kind: "simple";
  child: Node;
}

An ordinary capturing group ().

LookAhead

interface LookAhead {
  type: "group";
  kind: "lookAhead";
  negated: boolean;
  child: Node;
}

Look ahead in the form (?=) or negated (?!).

LookBehind

interface LookBehind {
  type: "group";
  kind: "lookBehind";
  negated: boolean;
  child: Node;
}

Look behind in the form (?<=) or negated (?<!).

NonCapturing

interface NonCapturing {
  type: "group";
  kind: "nonCapturing";
  child: Node;
}

A non-capturing group in the form (?:).

Named

interface Named {
  type: "group";
  kind: "named";
  child: Node;
  name: string;
}

A named group of the form (?<name>).
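All five group nodes share type "group" and are distinguished by kind. As a summary of the forms listed above, the sketch below maps a group node to the opening delimiter it corresponds to. `GroupNode` and `groupPrefix` are hypothetical names, and the child field is omitted to keep the example short.

```typescript
// Reduced group shapes (child omitted), following the interfaces documented above.
type GroupNode =
  | { type: "group"; kind: "simple" }
  | { type: "group"; kind: "lookAhead"; negated: boolean }
  | { type: "group"; kind: "lookBehind"; negated: boolean }
  | { type: "group"; kind: "nonCapturing" }
  | { type: "group"; kind: "named"; name: string };

// Return the opening delimiter that corresponds to a group node.
function groupPrefix(g: GroupNode): string {
  switch (g.kind) {
    case "simple":
      return "(";
    case "lookAhead":
      return g.negated ? "(?!" : "(?=";
    case "lookBehind":
      return g.negated ? "(?<!" : "(?<=";
    case "nonCapturing":
      return "(?:";
    case "named":
      return "(?<" + g.name + ">";
  }
}

console.log(groupPrefix({ type: "group", kind: "named", name: "number" }));
```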

Licence

MIT