FlashInfer-Bench Dataset

Download the FlashInfer-Bench dataset.

Benchmarking

Via CLI

Run benchmarks on a local trace dataset:
flashinfer-bench run --local /path/to/flashinfer-trace

Custom Options

# Run with custom configuration
flashinfer-bench run --local /path/to/flashinfer-trace \
  --warmup-runs 10 \
  --iterations 100 \
  --num-trials 5 \
  --rtol 1e-3 \
  --atol 1e-3

# Run specific definitions or solutions
flashinfer-bench run --local /path/to/flashinfer-trace \
  --definitions gemm_n5120_k2048 rmsnorm_h128 \
  --solutions solution_name_1 solution_name_2...

# Resume interrupted runs
flashinfer-bench run --local /path/to/flashinfer-trace --resume

Via Python API

from flashinfer_bench.bench import Benchmark, BenchmarkConfig
from flashinfer_bench.data import TraceSet

# Load trace dataset
trace_set = TraceSet.from_path("/path/to/flashinfer-trace")

# Configure benchmark
config = BenchmarkConfig(
    warmup_runs=10,
    iterations=100,
    num_trials=5,
    rtol=1e-3,
    atol=1e-3,
)

# Run benchmark
benchmark = Benchmark(trace_set, config)
benchmark.run_all(save_results=True)

# Get best solution for a definition, e.g. gemm_n5120_k2048
best_trace = trace_set.get_best_trace("gemm_n5120_k2048")
if best_trace:
    print(f"Best solution: {best_trace.solution}")
    print(f"Speedup: {best_trace.evaluation.performance.speedup_factor:.2f}×")
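The rtol and atol settings control how closely a solution's output must match the reference. Assuming the standard numpy/torch-style elementwise rule |out - ref| <= atol + rtol * |ref| (an assumption; the exact check used by FlashInfer-Bench may differ), a quick illustration:

```python
import numpy as np

ref = np.array([1.000, 2.000, 3.000])
out = np.array([1.0005, 2.0005, 3.0100])

# Elementwise tolerance check: |out - ref| <= atol + rtol * |ref|
ok = np.isclose(out, ref, rtol=1e-3, atol=1e-3)
print(ok)  # the first two elements pass; the last differs by 1e-2 and fails
```

Tightening rtol/atol rejects more numerically divergent kernels at the cost of disqualifying some fast low-precision solutions.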

How to use FlashInfer-Bench to automatically trace and optimize FlashInfer operations with custom kernels.

Tracing and Apply Overview

FlashInfer-Bench provides two key capabilities:
  1. Tracing: Automatically capture workloads from your FlashInfer calls
  2. Apply: Automatically substitute optimized custom kernels for FlashInfer operations
With adapters already written for FlashInfer, you can enable these features with minimal code changes.

Basic Usage with Apply

The simplest way to use FlashInfer-Bench is through environment variables. Once you’ve installed FlashInfer-Bench, you can enable tracing and apply by:
  1. Importing flashinfer_bench before importing FlashInfer
  2. Setting environment variables to control behavior

Example: Drop-in Optimization

import flashinfer_bench  # Import to install adapters
from flashinfer.norm import fused_add_rmsnorm

# Your FlashInfer code runs as normal
# But optimized kernels are automatically applied when available
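If you prefer to configure everything from Python rather than the shell, a minimal sketch that sets the variables in-process (this assumes they are read when flashinfer_bench is imported, so they must be set first):

```python
import os

# Assumption: these must be set BEFORE importing flashinfer_bench,
# since the adapters read them at import time
os.environ["FIB_ENABLE_TRACING"] = "1"
os.environ["FIB_ENABLE_APPLY"] = "1"

# import flashinfer_bench  # then import and run your FlashInfer code as usual
print(os.environ["FIB_ENABLE_TRACING"], os.environ["FIB_ENABLE_APPLY"])
```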

Environment Variables

Control FlashInfer-Bench behavior with these environment variables:
  • FIB_ENABLE_TRACING=1: Enable workload tracing to collect performance data
  • FIB_ENABLE_APPLY=1: Enable automatic kernel substitution
  • FIB_DATASET_PATH=/path/to/dataset: Specify where trace data and custom kernels are stored (default: ~/.cache/flashinfer_bench/dataset)
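Putting the variables above together, a typical shell invocation might look like the following (the script name is a placeholder for your own entry point):

```shell
# Enable workload tracing and automatic kernel substitution
export FIB_ENABLE_TRACING=1
export FIB_ENABLE_APPLY=1
# Optional: override the default dataset location
export FIB_DATASET_PATH="$HOME/.cache/flashinfer_bench/dataset"
# Then launch your inference/serving script as usual, e.g.:
# python serve.py
```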