Skip to main content
The Deno 2 Release Candidate is here
Learn more

leac

lint status badge test status badge License: MIT

Lexer / tokenizer.

Features

  • Lightweight. Zero dependencies. Not a lot of code.

  • Well tested - comes will tests for everything including examples.

  • Compact syntax - less boilerplate. Rule name is enough when it is the same as the lookup string.

  • No failures - it just stops when there are no matching rules and returns the information about whether it completed and where it stopped in addition to tokens array.

  • Composable lexers - instead of states within a lexer.

  • Stateless lexers - all inputs are passed as arguments, all outputs are returned in a result object.

  • No streaming - accepts a string at a time.

  • Only text tokens, no arbitraty values. It seems to be a good habit to have tokens that are trivially serializable back into valid input string. Don’t do the parser’s job. There are a couple of convenience features such as the ability to discard matches or string replacements for regular expression rules but that has to be used mindfully.

Install

> npm i leac

Examples

const lex = createLexer([
  { name: '-', str: '-' },
  { name: '+' },
  { name: 'ws', regex: /\s+/, discard: true },
  { name: 'number', regex: /[0-9]|[1-9][0-9]+/ },
]);

const { tokens, offset, complete } = lex('2 + 2');

API

What about …?

  • performance - The code is very simple but I won’t put any unverified assumptions here. I’d be grateful to anyone who can provide a good benchmark project to compare different lexers.

  • stable release - Current release is well thought out and tested. I leave a chance that some changes might be needed based on feedback. Before version 1.0.0 this will be done without a deprecation cycle.

Some other lexer / tokenizer packages