prettyBenching

A simple Deno library, that gives you pretty benchmarking progress and results in the commandline

maintained

Try it out

This runs a short benchmark to showcase the module live.

deno run -r --allow-hrtime https://deno.land/x/pretty_benching/example.ts

Getting started

Add the following to your deps.ts

export {
  prettyBenchmarkResult,
  prettyBenchmarkProgress,
  prettyBenchmarkDown,
  prettyBenchingHistory
} from 'https://deno.land/x/pretty_benching@v0.3.3/mod.ts';

or just simply import it directly:

import { prettyBenchmarkResult, prettyBenchmarkProgress, prettyBenchmarkDown, prettyBenchingHistory } from 'https://deno.land/x/pretty_benching@v0.3.3/mod.ts';

Note

Using Deno’s --allow-hrtime flag when running your code will result in a more precise benchmarking, because than float milliseconds will be used for measurement instead of integer.

You can use nocolor in the options of both prettyBenchmarkProgress and prettyBenchmarkResult to turn off the coloring on the output. It doesn’t interfere with the Deno’s fmt color settings.

prettyBenchmarkProgress

Prints the Deno runBenchmarks() method’s progressCb callback values in a nicely readable format.

Usage

Simply add it to runBenchmarks() like below and you are good to go. Using silent: true is encouraged, so the default logs don’t interfere

await runBenchmarks({ silent: true }, prettyBenchmarkProgress())

The output would look something like this during running:

running

End when finished:

finished

Thresholds

You can define thresholds to specific benchmarks and than the times of the runs will be colored respectively

const thresholds: Thresholds = {
  "for100ForIncrementX1e6": {green: 0.85, yellow: 1},
  "for100ForIncrementX1e8": {green: 84, yellow: 93},
  "forIncrementX1e9": {green: 900, yellow: 800},
  "forIncrementX1e9x2": {green: 15000, yellow: 18000},
}

runBenchmarks({ silent: true }, prettyBenchmarkProgress({thresholds}))

threshold

Indicators

You can use indicators, which help you categorise your benchmarks. You can change the character which gets added before the benchmark.

const indicators: BenchIndicator[] = [
  { benches: /100/, modFn: colors.bgRed },
  { benches: /for/, modFn: colors.red },
  { benches: /custom/, modFn: () => colors.bgYellow(colors.black("%")) }, // changes indicator char
];

indicator

prettyBenchmarkResults

Prints the Deno runBenchmarks() method’s result in a nicely readable format.

Usage

Simply call prettyBenchmarkResult with the desired settings.

Setting the nocolor option to true will remove all the built in coloring. Its usefull, if you log it somewhere or save the output to a file. It won’t interfere with Deno’s fmt color settings.

Use the silent: true flag in runBenchmarks, if you dont want to see the default output

// ...add benches...

runBenchmarks({silent: true})
.then(prettyBenchmarkResult())
.catch((e: any) => {
  console.error(e.stack);
});

The output would look something like this:

example

Thresholds

You can define thresholds to specific benchmarks and than related things, like times or graph bars will be colored respectively. This can use the same thresholds object as in prettyBenchmarkProgress.

const thresholds: Thresholds = {
      "multiple-runs": { green: 76, yellow: 82 },
      "benchmark-start": { green: 2, yellow: 3 },
};

runBenchmarks().then(prettyBenchmarkResult({ thresholds }));

threshold

Indicators

You can use indicators, which help you categorise your benchmarks besides just their names. You can set what color the table should have. With modFn you can also change what color the marker should be, or even change the indicator icon like seen below (default is #). You can pass this object to prettyBenchmarkProgress too.

const indicators: BenchIndicator[] = [
  {
    benches: /multiple-runs/,
    color: colors.magenta,
    modFn: () => "🚀",
  }
];

runBenchmarks().then(prettyBenchmarkResult({ indicators }));

indicator

Parts

You can change what the result cards should contain with the parts object. Once you define it you have to set all parts you want. The default parts setting is { graph: true, graphBars: 5 }.

You can define what parts you want to use in the options, like this:

prettyBenchmarkResult(
  {
    nocolor: false,
    thresholds,
    indicators,
    parts: {
      extraMetrics: true,
      threshold: true,
      graph: true,
      graphBars: 10,
    },
  },
)

Using all options:

thresholdLine

Extra metrics `{ extraMetrics: true }`

Setting this will give you an extra row, which adds extra calculated values like min, max, mean as ((min+max)/2) , median.

extraMetrics

Threshold `{ threshold: true }`

Need to have thresholds in the root of the options object, which have a matching threshold for the specific benchmark, otherwise it wont add it to the specific card.

It simply show what the set thresholds for the benchmark. Can be usefull if nocolor is set to true.

thresholdLine

Graph `{ graph: true, graphBars: 5 }`

Adds a graph, which shows the distribution of the runs of the benchmark.

Only shows, when there are 10 or more runs set.

The graph shows the results groupped into timeframes, where the groups frame start from the value on the head of its line, and end with excluding the value on the next line.

With graphBars you can set how many bars it should show. Default is 5.

prettyBenchmarkDown

Generates a summary markdown from the results of the Deno runBenchmarks() method’s result.

Name Runs Total (ms) Average (ms) Thresholds

Rotating other things 1000 2143.992 2.144 - -

🎹 Rotating arrays 1000 2021.054 2.021 <= 3.5 ✅
<= 4.4 🔶
> 4.4 🔴 ✅

% Proving NP==P 1 4384.908 4384.908 <= 4141 ✅
<= 6000 🔶
> 6000 🔴 🔶

🚀 Standing out 1000 375.708 0.376 <= 0.3 ✅
<= 0.33 🔶
> 0.33 🔴 🔴

	Name	Runs	Total (ms)	Average (ms)	Thresholds
	Rotating other things	1000	2143.992	2.144	-	-
🎹	Rotating arrays	1000	2021.054	2.021	<= 3.5 ✅ <= 4.4 🔶 > 4.4 🔴	✅
%	Proving NP==P	1	4384.908	4384.908	<= 4141 ✅ <= 6000 🔶 > 6000 🔴	🔶
🚀	Standing out	1000	375.708	0.376	<= 0.3 ✅ <= 0.33 🔶 > 0.33 🔴	🔴

A full example output: pr_benchmark_output.md

Usage

Simply call prettyBenchmarkDown with the desired settings.

// ...add benches...

runBenchmarks()
.then(prettyBenchmarkDown(console.log))
.catch((e: any) => {
  console.error(e.stack);
});

The first parameter of this function is an output function, where you cen recieve the generated markdown’s text. In the example above it just print is to console.

Without defining any options, it will generate one markdown table with one row for each benchmark. Something like this:

Name Runs Total (ms) Average (ms)

Sorting arrays 4000 1506.683 0.377

Rotating arrays 1000 1935.981 1.936

Proving NP==P 1 4194.431 4194.431

Standing out 1000 369.566 0.370

Name	Runs	Total (ms)	Average (ms)
Sorting arrays	4000	1506.683	0.377
Rotating arrays	1000	1935.981	1.936
Proving NP==P	1	4194.431	4194.431
Standing out	1000	369.566	0.370

Writing to a file

runBenchmarks()
.then(prettyBenchmarkDown(
  (markdown: string) => { Deno.writeTextFileSync("./benchmark.md", markdown); },
  { /* ...options */ }
))
.catch((e: any) => {
  console.error(e.stack);
});

🔽 Needs –allow-write flag to run

Options

You can fully customise the generated markdown. Add text, use predefined, or custom columns or group your benchmarks and define these per group.

Here you can seen an example that showcases every option: pr_benchmark_output.md It was generated with: pr_benchmarks.ts

Extra texts

options.title: Defines a level 1 title (# MyTitle) on the top of the generated markdown
options.description: Defines a part, that is put before all of the result tables. If defined as a function, it recieves the runBenchmarks result, so it can be set dynamically. It also accepts a simple string as well.
options.afterTables: Defines a part, that is put after all of the result tables. If defined as a function, it recieves the runBenchmarks result, so it can be set dynamically. It also accepts a simple string as well.

Columns `options.columns`, `group.columns`

You can customise, what columns you want to see in each table. To see what every column type generates check out the example

If not defined, the generator uses the default columns defined by the module
If defined, you take full control, of what columns you want to see. The default columns are exported, and there are other premade columns for you to use.

defaultColumns(columns: string[]) example

columns: [
  ...defaultColumns(),
  ...defaultColumns(['name', 'measuredRunsAvgMs'])
]

It includes Name, Runs, Total (ms) and Average (ms) columns, these are the default values of the BenchmarkRunResult. Filter them with an array of propertyKeys.

indicatorColumn(indicators: BenchIndicator[]) example

columns: [
  indicatorColumn(indicators),
]

Defines a column, that contains the indicator for the given bench, if defined. Keep in mind, that it strips any color from the indicator.

thresholdsColumn(thresholds: Thresholds, indicateResult?: boolean) example

columns: [
  thresholdsColumn(thresholds), // only shows the threshold ranges
  thresholdsColumn(thresholds, true), // shows the result in the cell too
]

Defines a column, that shows the threshold ranges for the given bench, if defined. If you set indicateResult to true, it shows in what range the benchmark fell, in the same cell.

thresholdResultColumn(thresholds: Thresholds) example

columns: [
  thresholdResultColumn(thresholds),
]

Defines a column, that show into what threhold range the benchmark fell.

extraMetricsColumns(options?) example

columns: [
  ...extraMetricsColumns(),
  ...extraMetricsColumns({ ignoreSingleRuns: true }), // puts '-' in cells, where bench was only run once
  ...extraMetricsColumns({ metrics: ["max", "min", "mean", "median", "stdDeviation"] }),
]

Defines columns, that show extra calculated metrics like min, max, mean, median, stdDeviation. You can define which of these you want, in the metrics array. You can also tell it, to put - in the cells, where the benchmark was only run once with ignoreSingleRuns.

Custom columns example

columns: [
  {
    title: 'CustomTotal',
    propertyKey: 'totalMs',
    toFixed: 5,
    align: 'left'
  },
  {
    title: 'Formatter',
    formatter: (r: BenchmarkResult, cd: ColumnDefinition) => `${r.name}:${cd.title}`
  },
]

When you need something else, you can define you own columns. You can put custom ColumnDefinitions into the columns array.

The simplest way, is to give it a propertyKey, and than it shows that value of the BenchmarkResult. You can use any key here, but you will have to put these values into the results manually. If a result[propertyKey] is undefined, than it puts a - into that cell. If your returned value is a number, than you can use toFixed to tell what precision you want to see. (It’s ignored if value is not a number)
If your usecase is more complex, than you can use the formatter method, where you get the benchmark result, and you can return any value that you want from that. The predefined column types above use this method as well.

interface ColumnDefinition {
  title: string;
  propertyKey?: string;
  align?: "left" | "center" | "right";
  toFixed?: number;
  formatter?: (result: BenchmarkResult, columnDef: ColumnDefinition) => string;
}

Groups `options.groups`

groups: [
  {
    include: /array/,
    name: "A group for arrays",
    description: "The array group's description",
    afterTable: (gr: BenchmarkResult[], g: GroupDefinition, rr: BenchmarkRunResult) => `Dynamic ${g.name}, ${gr.length}, ${rr.results.length}`,
    columns: [/* ... */]
  }
]

You can group your benches, so they are separated in your generated markdown. For this, you need to define include RegExp. Right now, every benchmark, that doesnt fit any group will be put into one table at the bottom, so if you dont want some filter them before manually.

In each group you can define a name which will be a level 2 heading (## Name) before you group.

You can also define description and afterTable, which behave the same like the ones in the root of options.

If you want, you can have different columns in each group, if you define them in the groups columns array.

interface GroupDefinition {
  include: RegExp;
  name: string;
  columns?: ColumnDefinition[];
  description?: string | ((groupResults: BenchmarkResult[], group: GroupDefinition,runResults: BenchmarkRunResult ) => string);
  afterTable?: string | ((groupResults: BenchmarkResult[], group: GroupDefinition, runResults: BenchmarkRunResult ) => string);
}

As a Github Action

Use this in a github action, eg. comment benchmark results on PRs.

You can see an example Github Action for this here or see it in use in a showcase repo.

prettyBenchmarkHistory

Helps to keep track of the results of the different runBenchmarks() runs historically.

Usage

Note this module doesn’t handle the loading and saving of the data from/to the disk. See examples.

First, if you already have saved historic data, you need to load it from disk (or elsewhere). If no previous historicData is provided in the constructor, it starts a fresh, empty history.

After it was initiated with the options and data, you can simply call addResults with the new results, and save them again into a file, using getDataString() which returns the historic data in a pretty printed JSON string. If you want to work on the data itself, call getData().

You are able to set some rules in the options, like to only allow to add a result, if every benchmark was run a minimum of x times, or if no benchmark was added or removed or had its runsCount changed since the previous run.

By default it only allows to add results that were measured with --allow-hrtime flag, but this rule can be disabled.

// add benches, then

let historicData;
try {
    historicData = JSON.parse(Deno.readTextFileSync("./benchmarks/history.json"));
} catch(e) {
  // Decide whether you want to proceed with no history
  console.warn(`⚠ cant read history file. (${e.message})`);
}

const history = new prettyBenchmarkHistory(historicData, {/*options*/});

runBenchmarks().then((results: BenchmarkRunResult) => {
    history.addResults(results {id: "version_tag"});
    Deno.writeTextFileSync("./benchmarks/history.json", history.getDataString());
});

The resulting historic data would look something like this, based on the options:

{
  "history": [
    {
      "date": "2020-09-12T20:28:36.812Z",
      "id": "v1.15.2",
      "benchmarks": {
        "RotateArrays": {
          "measuredRunsAvgMs": 0.061707600000003596,
          "runsCount": 500,
          "totalMs": 30.853800000001797,
          "extras": {
            "max": 0.45420000000001437,
            "min": 0.034700000000043474,
            "mean": 0.24445000000002892,
            "median": 0.04179999999996653,
            "std": 0.04731720894389344
          }
        },
        "x3#14": {
          "measuredRunsAvgMs": 2.6682033000000036,
          "runsCount": 1000,
          "totalMs": 2668.2033000000038,
          "extras": {
            "max": 9.25019999999995,
            "min": 1.983299999999872,
...

Rules and options

easeOnlyHrTime: Allows storing low precision measurements, which where measured without --allow-hrtime flag
strict: Contains a set of rules, which are all enforced, if boolean true is set, but can be individually controlled if an object is provided:
- noRemoval: Throw an error, when previously saved benchmark is missing from the current set when calling addResults. Ignored on the very first set of benchmarks.
- noAddition: Throw an error, when previously not saved benchmark is added to the current set when calling addResults. Ignored on the very first set of benchmarks.
- noRunsCountChange: Throw an error, when the runsCount changes for a benchmark from the previous run’s runsCount. Ignored on new benchmarks.
minRequiredRuns: Throw an error, when any benchmark has lower runsCount than the set value.
saveIndividualRuns: Saves the measuredRunsMs array for each benchmark. WARNING this could result in a very big history file overtime. Consider calculating necessary values before save instead with benchExtras or runExtras.
benchExtras(result: BenchmarkResult) => T : Saves the returned object for each benchmark into it’s extras property.
runExtras(runResult: BenchmarkRunResult) => K : Saves the returned object for each run into it’s runExtras property.

Methods

addResults: Stores the run’s result into the historic data, enforces all set rules on the results. You can specify an id in the options to help identify the specific historic data besides the date. It useful for example to set it to the benchmarked module’s version number.
getDeltasFrom: Calls getDeltaForBenchmark for each benchmark in the provided BenchmarkRunResults and returns the values as one object.
getDeltaForBenchmark: Calculates deltas for given BenchmarkResult for each provided property key.
getData: Returns a copy of the historic data.
getDataString: Returns the historic data in a pretty-printed JSON string.
getBenchmarkNames: Returns an array of each benchmark’s name, which result is present in the historic data.

Usecases

Show deltas in the different formats:

prettyBenchmarkProgress: prettyBenchingHistory_progress_delta

code

const history = new prettyBenchmarkHistory(historicData, {/*options*/});

runBenchmarks({ silent: true }, prettyBenchmarkProgress(
  { rowExtras: deltaProgressRowExtra(history) }
));

prettyBenchmarkResults: prettyBenchingHistory_result_card_delta

code

const history = new prettyBenchmarkHistory(historicData, {/*options*/});

runBenchmarks().then(prettyBenchmarkResult(
    { infoCell: deltaResultInfoCell(history) }
));

prettyBenchmarkDown:

Name	Average (ms)	Change in average
x3#14	2.8319	🟢 -33% (1.3895ms)
MZ/X	5.6873	🔺 +5% (0.2468ms)
MZ/T	2.7544	-

code

const history = new prettyBenchmarkHistory(historicData, {/*options*/});

runBenchmarks().then(prettyBenchmarkDown(console.log, {
  columns: [
      ...defaultColumns(['name', 'measuredRunsAvgMs']),
      deltaColumn(history),
  ]
}));

Show each previous measurement as a column in a markdown table

Name 2020-09-12
21:54:53.706 v0.5.6 v0.8.0 Current Change in average

historic 0.0704 0.0740 0.0904 0.0650 🟢 -28% (0.0254ms)

x3#14 6.1675 2.9979 4.2214 3.6275 🟢 -14% (0.5939ms)

MZ/X - 3.3095 5.4405 7.4553 🔺 +37% (2.0147ms)

MZ/T - - - 3.7763 -

Name	2020-09-12 21:54:53.706	v0.5.6	v0.8.0	Current	Change in average
historic	0.0704	0.0740	0.0904	0.0650	🟢 -28% (0.0254ms)
x3#14	6.1675	2.9979	4.2214	3.6275	🟢 -14% (0.5939ms)
MZ/X	-	3.3095	5.4405	7.4553	🔺 +37% (2.0147ms)
MZ/T	-	-	-	3.7763	-

code

  const history = new prettyBenchmarkHistory(historicData, {/*options*/});

  runBenchmarks().then(prettyBenchmarkDown(console.log, {
    columns: [
        { title: "Name", propertyKey: "name" },
        ...historyColumns(history),
        { title: "Current", propertyKey: "measuredRunsAvgMs", toFixed: 4 },
        deltaColumn(history),
    ]
  }));

Calculate thresholds from the previous results: calculateThresholds docs
Github Actions: Save results on version tags, report benchmarking results as a comment on PR-s.
Fail/warn in CI on a PR if the delta is too big or benchmark is in red threshold with: getDeltasFrom and getThresholdResultsFrom

Roadmap

BenchmarkProgress

Add indicator options
Add nocolor option
Unify indicator option types, use color
Add overridable output function like in benchmark results

BenchmarkResults

Overrideable output function
Refactor outputting result in a single call
Add nocolor option
Fix graph
Add indicator options like in progress
Tidy up current benchmark results look
Add options to define what parts are shown in the result cards. (eg. show graph, more calculated values like mean, …)
Find a place in extraMetrics for standard deviation.
Add option to crop outlayer results from graph (maybe with a percent limit).
Add an option to have a minimalist result output, that resembles the final progress output, instead of the big cards.

Historic data

Add module to enable historic data save/read inside repo
Make use of historic module, enable automatic calculating of thresholds from previous runs
Option to use historic data, to tell if benchmarks got better or worse from previous runs.

Operational

Write README docs
Separate prettyBenchmarkResults and prettyBenchmarkProgress into independently importable modules.
Add the ability to follow the change on how the outputs look like.
Refactor how optional options are handled
Write JSDocs
Proper tests
Refactor README
Add showcase module, which helps to have consistent docs images
Make module contributor friendly

prettyBenching

Jump to

Try it out

Getting started

Note

prettyBenchmarkProgress

Usage

Thresholds

Indicators

prettyBenchmarkResults

Usage

Thresholds

Indicators

Parts

Extra metrics { extraMetrics: true }

Threshold { threshold: true }

Graph { graph: true, graphBars: 5 }

prettyBenchmarkDown

Usage

Writing to a file

Options

Extra texts

Columns options.columns, group.columns

defaultColumns(columns: string[]) example

indicatorColumn(indicators: BenchIndicator[]) example

thresholdsColumn(thresholds: Thresholds, indicateResult?: boolean) example

thresholdResultColumn(thresholds: Thresholds) example

extraMetricsColumns(options?) example

Custom columns example

Groups options.groups

As a Github Action

prettyBenchmarkHistory

Usage

Rules and options

Methods

Usecases

Roadmap

BenchmarkProgress

BenchmarkResults

Historic data

Operational

Extra metrics `{ extraMetrics: true }`

Threshold `{ threshold: true }`

Graph `{ graph: true, graphBars: 5 }`

Columns `options.columns`, `group.columns`

Groups `options.groups`