Fast Uint8Array to UTF-8 code point iterator for streams and array buffers, by @okikio & @jonathantneal
function bytesToCodePoint
import { bytesToCodePoint } from "https://deno.land/x/codepoint_iterator@v1.1.1/mod.ts";

Converts a sequence of bytes into a Unicode code point. This function is a key part of decoding UTF-8 encoded text, as it translates the raw bytes back into the characters they represent.

A UTF-8 character is encoded in 1 to 4 bytes. Given the byte length of the character, this function calculates the code point from the 1 to 4 byte values that make it up.
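To make the calculation concrete, here is a minimal sketch of how a code point can be assembled from 1 to 4 UTF-8 bytes using the standard bit layout. The helper name `decodeCodePoint` is hypothetical and this is not the library's source; it only mirrors the documented parameter shape (a byte length plus an array of bytes) and assumes the input is already a well-formed sequence.

```typescript
// Hypothetical helper for illustration; not the library's implementation.
// Assumes `bytes` holds a well-formed UTF-8 sequence of `byteLength` bytes.
function decodeCodePoint(byteLength: number, bytes: number[]): number {
  const [b1, b2 = 0, b3 = 0, b4 = 0] = bytes;
  switch (byteLength) {
    case 1:
      // 0xxxxxxx: the byte is the code point.
      return b1;
    case 2:
      // 110xxxxx 10xxxxxx
      return ((b1 & 0x1f) << 6) | (b2 & 0x3f);
    case 3:
      // 1110xxxx 10xxxxxx 10xxxxxx
      return ((b1 & 0x0f) << 12) | ((b2 & 0x3f) << 6) | (b3 & 0x3f);
    case 4:
      // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      return (
        ((b1 & 0x07) << 18) |
        ((b2 & 0x3f) << 12) |
        ((b3 & 0x3f) << 6) |
        (b4 & 0x3f)
      );
    default:
      throw new RangeError(`Invalid UTF-8 byte length: ${byteLength}`);
  }
}

// "€" is encoded as the three bytes E2 82 AC and has code point U+20AC.
const euro = decodeCodePoint(3, [0xe2, 0x82, 0xac]);
console.log(euro.toString(16));
```

The leading byte contributes fewer payload bits as the sequence gets longer (7, 5, 4, then 3), while every continuation byte contributes its low 6 bits.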

Because UTF-8 characters vary in length, it's faster to grab the bytes from the Uint8Array and calculate the code point directly than to decode the Uint8Array into a string and then convert that string into code points.
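For contrast, the string-based route mentioned above might look like the following sketch. The function name `codePointViaString` is hypothetical; it uses only the standard `TextDecoder` and `String.prototype.codePointAt` APIs, and it allocates an intermediate string, which is the step the direct byte calculation avoids.

```typescript
// Hypothetical illustration of the slower path: bytes -> string -> code point.
function codePointViaString(bytes: Uint8Array): number | undefined {
  // Decode the raw bytes into a JavaScript string first...
  const text = new TextDecoder().decode(bytes);
  // ...then read the first code point back out of that string.
  return text.codePointAt(0);
}

// The same "€" bytes as an example input.
const viaString = codePointViaString(new Uint8Array([0xe2, 0x82, 0xac]));
console.log(viaString?.toString(16));
```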

Parameters

byteLength: number

The number of bytes in a Uint8Array required to represent a single UTF-8 character (the number of bytes ranges from 1 to 4).

unnamed 1: number[]

An array of byteLength bytes that make up the UTF-8 character.

Returns

number

The Unicode code point of the UTF-8 character.