import { Tokenizer } from "https://deno.land/x/html_parser@v0.1.3/src/mod.ts";
Properties
Some behavior, e.g. decoding entities, happens while the tokenizer is in another state. This property keeps track of that other state's type.
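As a rough illustration of the idea, the sketch below remembers the enclosing state when an entity starts and restores it when the entity ends. The class name, the state names, and the `baseState` field are hypothetical and are not taken from this module's source; real entity decoding is considerably more involved.

```typescript
// Hypothetical state names, for illustration only.
type State = "Text" | "InAttributeValue" | "InEntity";

class EntityAwareState {
  private state: State;
  // Assumed name: the state to return to once the entity is finished.
  private baseState: State;

  constructor(initial: State) {
    this.state = initial;
    this.baseState = initial;
  }

  current(): State {
    return this.state;
  }

  step(c: string): void {
    if (this.state === "InEntity") {
      // A semicolon ends the entity; go back to where we came from.
      if (c === ";") this.state = this.baseState;
      return;
    }
    if (c === "&") {
      // Remember the surrounding state before switching to entity handling.
      this.baseState = this.state;
      this.state = "InEntity";
    }
  }
}
```

Feeding the characters of `a&amp;b` into an instance that starts in `InAttributeValue` passes through `InEntity` and ends up back in `InAttributeValue`.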
Data that has already been processed will be removed from the buffer occasionally. _bufferOffset keeps track of how many characters have been removed, to make sure position information is accurate.
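The bookkeeping described here can be sketched as follows. This is a minimal illustration under assumed names (`OffsetBuffer`, `advance`, `cleanup`, `absoluteIndex`), not the module's actual implementation: whenever the processed prefix is dropped, its length is added to the offset, so an absolute position can still be computed as offset plus local index.

```typescript
// Hypothetical helper class; only the offset idea comes from the docs above.
class OffsetBuffer {
  private buffer = "";
  private bufferOffset = 0; // characters already dropped from the front
  private index = 0; // read position within the current buffer

  write(chunk: string): void {
    this.buffer += chunk;
  }

  // Mark `count` more characters as processed.
  advance(count: number): void {
    this.index = Math.min(this.index + count, this.buffer.length);
  }

  // Drop everything before the read position and record how much was dropped.
  cleanup(): void {
    this.buffer = this.buffer.slice(this.index);
    this.bufferOffset += this.index;
    this.index = 0;
  }

  // Position relative to everything ever written, not just the current buffer.
  absoluteIndex(): number {
    return this.bufferOffset + this.index;
  }

  bufferedLength(): number {
    return this.buffer.length;
  }
}
```

Because the offset is updated in the same step that trims the buffer, the absolute index is unchanged by a cleanup.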
Methods
HTML only allows ASCII alpha characters (a-z and A-Z) at the beginning of a tag name.
XML allows many more characters here (see https://www.w3.org/TR/REC-xml/#NT-NameStartChar). In XML mode, we allow anything that wouldn't end the tag.
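A check along these lines could look like the sketch below. The function names, the `xmlMode` parameter, and the use of character codes are assumptions made for illustration; "anything that wouldn't end the tag" is interpreted here as anything other than whitespace, `/`, or `>`.

```typescript
// Assumed helper: ASCII letters a-z and A-Z.
function isAsciiAlpha(c: number): boolean {
  return (c >= 0x61 && c <= 0x7a) || (c >= 0x41 && c <= 0x5a);
}

// Assumed helper: space, newline, tab, form feed, carriage return.
function isWhitespace(c: number): boolean {
  return c === 0x20 || c === 0x0a || c === 0x09 || c === 0x0c || c === 0x0d;
}

// Assumed helper: characters that would terminate a tag name.
function isEndOfTagSection(c: number): boolean {
  return c === 0x2f /* "/" */ || c === 0x3e /* ">" */ || isWhitespace(c);
}

// Hypothetical signature: `xmlMode` selects the more permissive XML rule.
function isTagStartChar(c: number, xmlMode: boolean): boolean {
  return xmlMode ? !isEndOfTagSection(c) : isAsciiAlpha(c);
}
```

With this sketch, `_` is accepted as a tag start only in XML mode, while `>` and whitespace are rejected in both modes.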
Iterates through the buffer, calling the function corresponding to the current state.
States that are more likely to be hit are higher up, as a performance improvement.
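The loop described above can be sketched as a small state machine. This is a simplified illustration with three hypothetical states and a made-up `tokenize` function that returns event strings; the real tokenizer has many more states and reports results through callbacks. The most common state (plain text) is checked first, which mirrors the ordering idea.

```typescript
// Hypothetical, heavily reduced set of states.
type State = "Text" | "BeforeTagName" | "InTagName";

function tokenize(input: string): string[] {
  const events: string[] = [];
  let state: State = "Text";
  let sectionStart = 0;

  for (let index = 0; index < input.length; index++) {
    const c = input.charAt(index);

    // Most frequently hit state first.
    if (state === "Text") {
      if (c === "<") {
        if (index > sectionStart) {
          events.push(`text:${input.slice(sectionStart, index)}`);
        }
        state = "BeforeTagName";
        sectionStart = index;
      }
    } else if (state === "InTagName") {
      if (c === ">") {
        events.push(`tag:${input.slice(sectionStart, index)}`);
        state = "Text";
        sectionStart = index + 1;
      }
    } else {
      // BeforeTagName: only an ASCII letter may start a tag name here.
      if (/[a-zA-Z]/.test(c)) {
        state = "InTagName";
        sectionStart = index;
      } else {
        // Not a tag after all; sectionStart still points at "<", so it stays text.
        state = "Text";
      }
    }
  }

  // Flush any trailing text.
  if (state === "Text" && sectionStart < input.length) {
    events.push(`text:${input.slice(sectionStart)}`);
  }
  return events;
}
```

For example, `tokenize("hi<b>there")` yields `["text:hi", "tag:b", "text:there"]`.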
The current index within all of the written data.