decancer
A portable module that removes common confusables from strings without using regular expressions. Available for Rust, Node.js, Deno, and the browser.
Pros:
- Extremely fast, no use of regex whatsoever!
- No dependencies.
- Simple to use, just one single function.
- Supports the full range of Unicode code points (up to UTF-32), such as emojis and zalgo text.
- While this project may not be perfect, it should cover the vast majority of confusables.
Con:
- This project is not perfect: false positives may happen.
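To illustrate the regex-free idea, here is a minimal sketch of a lookup-table approach in JavaScript. It is not decancer's actual implementation: the `SAMPLE_TABLE` entries and the `cureSketch` name are made up for this example, and the real project ships a much larger confusables list. The sketch walks the string one code point at a time, swaps known confusables, and lowercases everything else.

```javascript
// Hypothetical sample table: a handful of confusables, NOT decancer's real data.
const SAMPLE_TABLE = new Map([
  ['ⓡ', 'r'],
  ['𝔂', 'y'],
  ['𝔽', 'f'],
  ['𝕌', 'u'],
  ['Ň', 'n'],
  ['ℕ', 'n'],
  ['ţ', 't'],
  ['乇', 'e'],
  ['𝕏', 'x'],
  ['𝓣', 't'],
]);

// Hypothetical helper, named cureSketch to avoid confusion with the real decancer().
function cureSketch(input) {
  let output = '';
  // for...of iterates by code point, so surrogate pairs such as '𝔽' are never split.
  for (const ch of input) {
    // Use the table entry when there is one, otherwise just lowercase the character.
    output += SAMPLE_TABLE.get(ch) ?? ch.toLowerCase();
  }
  return output;
}

console.log(cureSketch('vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣')); // 'very funny text'
```

A plain index loop would see '𝔽' as two UTF-16 code units ('𝔽'.length is 2), which is why the sketch iterates by code point instead.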
installation
Rust
In your Cargo.toml:
decancer = "1.3.3"
Node.js
In your shell:
$ npm install decancer
In your code:
const decancer = require('decancer');
Deno
In your code:
import init from "https://deno.land/x/decancer@v1.3.3/mod.ts";
const decancer = await init();
Browser
In your code:
import init from "https://cdn.jsdelivr.net/gh/null8626/decancer@v1.3.3/decancer.min.js";
const decancer = await init();
examples
NOTE: cured output will ALWAYS be in lowercase.
JavaScript
const noCancer = decancer('vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣');
console.log(noCancer); // 'very funny text'
Rust
extern crate decancer;
use decancer::Decancer;
fn main() {
  let instance = Decancer::new();
  let output = instance.cure("vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣");
  assert_eq!(output, String::from("very funny text"));
}
If you want to check whether the decancered string contains a certain keyword, I recommend using the contains function instead of a plain substring search, since mistranslations can happen (e.g., mistaking the number 0 for the letter O).
JavaScript
const noCancer = decancer(someString);
if (decancer.contains(noCancer, 'no-no-word')) console.log('LANGUAGE!!!');
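The reason a tolerant check helps can be sketched like this. The code below only illustrates the idea and is not how `decancer.contains` is actually implemented: `LOOKALIKES`, `fold`, and `looseContains` are hypothetical names, and the character groups are assumptions picked for the example. Both strings are folded so that easily mistaken characters compare equal before the substring search runs.

```javascript
// Hypothetical groups of easily confused characters; the real library may group them differently.
const LOOKALIKES = new Map([
  ['0', 'o'],
  ['1', 'i'],
  ['l', 'i'],
  ['5', 's'],
]);

// Collapse every character of a lookalike group onto one representative.
function fold(text) {
  let out = '';
  for (const ch of text.toLowerCase()) {
    out += LOOKALIKES.get(ch) ?? ch;
  }
  return out;
}

// Substring search that ignores differences inside a lookalike group.
function looseContains(haystack, needle) {
  return fold(haystack).includes(fold(needle));
}

console.log('n0-n0-w0rd here'.includes('no-no-word'));        // false
console.log(looseContains('n0-n0-w0rd here', 'no-no-word')); // true
```

The trade-off is more false positives: folding makes strings such as 'hell0' and 'heiio' compare equal.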
Rust
extern crate decancer;
use decancer::Decancer;
fn main() {
  let instance = Decancer::new();
  let output = instance.cure("vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣");
  if instance.contains(output, "badwordhere") {
    println!("LANGUAGE!!!");
  }
}
contributions
All contributions are welcome. Feel free to fork the project on GitHub! <3
If you want to add, remove, modify, or view the list of supported confusables, you can clone the GitHub repository and modify the list directly with Node.js, either through a script or from the REPL.
const reader = await import('./contrib/index.mjs');
const data = reader.default('./core/bin/confusables.bin');
// do something with data...
data.save('./core/bin/confusables.bin');
special thanks
These are the primary resources that made this project possible.