Machine Learning utilities for TypeScript
class SplitTokenizer
import { SplitTokenizer } from "https://deno.land/x/vectorizer@v0.7.5/mod.ts";

Tokenizes text by splitting it on a separator (whitespace).

Constructors

new SplitTokenizer(options?: Partial<BaseTokenizerOptions & { indices: boolean; }>)

Properties

readonly lastToken: number
skipWords: "english" | false | string[]

Words to ignore from vocabulary

vocabulary: Map<string, number>

Mapping from each token to its numeric index in the vocabulary

Methods

fit(text: string | string[]): this

Construct a vocabulary from a given set of text.

split(text: string): string[]
transform(text: string | string[]): number[][]

Convert a document (string | array of strings) into vectors.
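
The fit / split / transform flow described above can be sketched with a small stand-in class. This is not the library's implementation: MiniSplitTokenizer is a hypothetical name, and the details it picks (indices starting at 0, skipWords given as a plain string array, unknown and skipped words dropped during transform) are assumptions made for illustration. The real SplitTokenizer may number tokens and handle unknown words differently.

```typescript
// Hypothetical stand-in for illustration only; not the vectorizer library's code.
class MiniSplitTokenizer {
  // Maps each known token to an integer index (assumed to start at 0).
  vocabulary = new Map<string, number>();
  // Words to ignore when building the vocabulary.
  skipWords: string[];

  constructor(skipWords: string[] = []) {
    this.skipWords = skipWords;
  }

  // Split on runs of whitespace, dropping empty strings.
  split(text: string): string[] {
    return text.split(/\s+/).filter((word) => word.length > 0);
  }

  // Add every unseen, non-skipped word to the vocabulary.
  fit(text: string | string[]): this {
    const docs = Array.isArray(text) ? text : [text];
    for (const doc of docs) {
      for (const word of this.split(doc)) {
        if (this.skipWords.includes(word)) continue;
        if (!this.vocabulary.has(word)) {
          this.vocabulary.set(word, this.vocabulary.size);
        }
      }
    }
    return this;
  }

  // Convert each document into its list of token indices.
  // Assumption: words missing from the vocabulary are dropped.
  transform(text: string | string[]): number[][] {
    const docs = Array.isArray(text) ? text : [text];
    return docs.map((doc) =>
      this.split(doc)
        .filter((word) => this.vocabulary.has(word))
        .map((word) => this.vocabulary.get(word)!)
    );
  }
}

const tokenizer = new MiniSplitTokenizer(["the"]).fit([
  "the cat sat",
  "the dog sat down",
]);
const vectors = tokenizer.transform(["cat sat down", "the bird sat"]);
console.log(vectors); // [[0, 1, 3], [1]]
```

Because fit returns this, construction and fitting can be chained as shown. With the real class, the equivalent calls would be made on an instance created via new SplitTokenizer(...) after the import shown at the top of this page.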