Deno DOM Design Doc
People:
- b-fuze (author)
Overview
This is an attempt to implement the browser's DOM API as a Deno module, in order to facilitate parsing arbitrary HTML into a DOM tree, manipulating it, and (re)serializing it into HTML. The primary API is JS, in a Deno-based environment. The internal parsing and rendering facilities will be implemented in Rust, with the JS-Rust interface achieved via wasm-bindgen, and the user-land API implemented in TypeScript, hopefully mirroring the actual DOM API we're familiar with.
Motivation
This project is primarily aimed at enabling SSR in Deno. However, it should be general enough to also aid in things like web scraping, HTML manipulation, etc.
Goals
- HTML parser in Deno
- Fast
- Mirror most* supported DOM APIs as closely as possible
- Provide specific APIs in addition to DOM APIs to make certain operations more efficient, like controlling Shadow DOM (see Open Questions)
- Use cutting-edge JS features like private class members, optional chaining, etc
Non-Goals
- Headless browser implementation
- Ability to run JS embedded in documents (<script> tags, onload, etc)
- Parse CSS or JS (they're just text, but this may be supported in the future for CSSOM)
- Support older (or even not so old) JS engines. In other words, there will be no support of transpilation to ES5, no support of polyfills, etc
- Support special functionality of obsolete HTML elements (<marquee>, etc)
Open Questions
Shadow DOM
Shadow DOM is a supported feature, yet shadow DOM can't be represented in HTML and can only be constructed with DOM APIs. This is problematic for situations like SSR where you need to serialize the tree into HTML. How should it be handled?
Some ideas:
- Rendered as HTML child nodes
- Omitted entirely
- Some flag of sorts that toggles its serializability
- Inline JS that would render the respective shadow DOM for the concerned elements?
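To make the first three ideas concrete, here is a minimal sketch of a serializer with a flag that either omits shadow roots or renders them as ordinary child nodes. Everything in it (SketchEl, ShadowMode, serialize) is a hypothetical name invented for illustration, not part of the module, and it skips attributes and text escaping entirely.

```typescript
// Hypothetical sketch: none of these names exist in the module.
type ShadowMode = "omit" | "children";

interface SketchEl {
  tag: string;
  children: (SketchEl | string)[];
  // Shadow tree content, if one was attached via DOM APIs.
  shadowRoot?: (SketchEl | string)[];
}

// Serializes a tree to HTML. Text is emitted verbatim (no escaping) to keep
// the sketch short; a real serializer would escape it.
function serialize(node: SketchEl | string, shadow: ShadowMode = "omit"): string {
  if (typeof node === "string") return node;
  const light = node.children.map((c) => serialize(c, shadow)).join("");
  // With "children", the shadow tree is flattened in front of the light DOM.
  const dark = shadow === "children" && node.shadowRoot
    ? node.shadowRoot.map((c) => serialize(c, shadow)).join("")
    : "";
  return `<${node.tag}>${dark}${light}</${node.tag}>`;
}

const card: SketchEl = {
  tag: "my-card",
  children: ["Hi"],
  shadowRoot: [{ tag: "b", children: ["shadow"] }],
};
console.log(serialize(card)); // "<my-card>Hi</my-card>"
console.log(serialize(card, "children")); // "<my-card><b>shadow</b>Hi</my-card>"
```

Flattening like this loses the boundary between shadow and light DOM, which is part of why the question is still open.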
Proxies
It is known that proxies can incur runtime costs, as they are somewhat magical
objects in the context of JS engines. However, they may be necessary for things
like Element.attributes, which lets you manipulate an element's attributes
simply by setting or removing properties on a "normal JS object."
One consideration is to make it opt-in by making .attributes a getter that will
generate a Proxy on the fly when needed.
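The opt-in idea could look roughly like the sketch below: attributes live in a plain Map, and a Proxy is only created the first time .attributes is read. SketchElement and AttrBag are hypothetical names for illustration only, and key enumeration (ownKeys) is left out to keep it short.

```typescript
// Hypothetical sketch: SketchElement and AttrBag are illustrative names only.
type AttrBag = { [name: string]: string | undefined };

class SketchElement {
  // Attribute storage is a plain Map, so getAttribute/setAttribute never
  // touch a Proxy.
  #attrs = new Map<string, string>();
  // Created lazily on first access to `.attributes`, then reused.
  #attrProxy: AttrBag | null = null;

  getAttribute(name: string): string | null {
    return this.#attrs.get(name) ?? null;
  }

  setAttribute(name: string, value: string): void {
    this.#attrs.set(name, value);
  }

  get attributes(): AttrBag {
    if (this.#attrProxy === null) {
      const attrs = this.#attrs;
      this.#attrProxy = new Proxy({} as AttrBag, {
        get: (_target, key) =>
          typeof key === "string" ? attrs.get(key) : undefined,
        set: (_target, key, value) => {
          if (typeof key !== "string") return false;
          attrs.set(key, String(value));
          return true;
        },
        deleteProperty: (_target, key) => {
          if (typeof key === "string") attrs.delete(key);
          return true;
        },
        has: (_target, key) => typeof key === "string" && attrs.has(key),
      });
    }
    return this.#attrProxy;
  }
}

const el = new SketchElement();
el.attributes.title = "hello"; // goes through the Proxy's set trap
console.log(el.getAttribute("title")); // "hello"
delete el.attributes.title; // goes through deleteProperty
console.log(el.getAttribute("title")); // null
```

Code that only ever calls getAttribute and setAttribute never pays for the Proxy, which is the point of making it opt-in.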
SVG
???
Implementation
In JS land there will be typical classes like HTMLDocument, Text,
HTMLDivElement, etc. These will be built from either parsing a document or
directly from document.createElement(...).
Caching
Elements will cache various aspects of their children like class name maps, tag
name maps, etc. They will also cache their innerHTML, innerText, and
textContent.
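As a rough illustration of how such a cache might behave, the sketch below caches textContent per node and clears the cache on the mutated node and all of its ancestors. SketchNode and its members are hypothetical names, and the invalidate-up-the-tree strategy is an assumption, not a decided design.

```typescript
// Hypothetical sketch: SketchNode and its members are illustrative names only.
class SketchNode {
  parent: SketchNode | null = null;
  readonly children: SketchNode[] = [];
  #text: string;
  // null means "not computed yet" or "invalidated by a mutation".
  #textCache: string | null = null;

  constructor(text = "") {
    this.#text = text;
  }

  appendChild(child: SketchNode): void {
    child.parent = this;
    this.children.push(child);
    this.#invalidate();
  }

  setText(text: string): void {
    this.#text = text;
    this.#invalidate();
  }

  // A mutation clears the cache of this node and of every ancestor,
  // since their textContent includes this node's text.
  #invalidate(): void {
    for (let node: SketchNode | null = this; node !== null; node = node.parent) {
      node.#textCache = null;
    }
  }

  get textContent(): string {
    if (this.#textCache === null) {
      this.#textCache = this.#text +
        this.children.map((child) => child.textContent).join("");
    }
    return this.#textCache;
  }
}

const p = new SketchNode();
const b = new SketchNode();
const world = new SketchNode("World!");
p.appendChild(new SketchNode("Hello "));
p.appendChild(b);
b.appendChild(world);
console.log(p.textContent); // "Hello World!"
world.setText("Deno!");
console.log(p.textContent); // "Hello Deno!"
```

Repeated reads between mutations return the cached string without walking the subtree again. The same pattern would extend to innerHTML and to the class name and tag name maps.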
Example Usage
Example parsing a basic document
import { DOMParser } from "./deno-dom/mod.ts";
const doc = new DOMParser().parseFromString(
`
<p>Hello <b>World!</b></p>
`,
"text/html",
);
// querySelector may return null, so guard the lookup with optional chaining
const text = doc.querySelector("p")?.childNodes[0].textContent;
console.log(text); // "Hello "