Commit 6047ee1

weiyilwyxuyxu
andauthored
add benchmark script (#158)
* add benchmark script
* ignore all .claude files

---------

Co-authored-by: xuyixuan.xyx <[email protected]>
1 parent 17eab4e commit 6047ee1

3 files changed, +494 -1 lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -9,4 +9,5 @@ dist/
 .DS_Store/
 .pytest_cache/
 .ruff_cache/
-CLAUDE.md
+CLAUDE.md
+.claude/

examples/model_benchmark_readme.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
# Multi-Model Performance Benchmark

This directory contains a performance benchmark script for various image generation pipelines with different optimization configurations.

## Features

- Benchmark different models:
  - FLUX
  - Qwen-Image
- Benchmark different optimization modes:
  - Basic mode (default)
  - FP8 linear optimization
  - Torch compile optimization
  - FP8 + compile combination
  - CPU offloading
- Generate detailed CUDA timeline traces using torch.profiler
- Compare performance across different configurations
- Save sample images from each benchmark run
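
Comparing configurations fairly usually means discarding warmup iterations and timing several runs per configuration. The following is a minimal sketch of such a timing loop, not the code in `model_perf_benchmark.py`; `time_runs`, its parameters, and the stand-in workloads are hypothetical. For GPU pipelines, `sync` would typically be `torch.cuda.synchronize`, because CUDA kernels launch asynchronously.

```python
import statistics
import time
from typing import Callable, Dict, Optional


def time_runs(
    fn: Callable[[], object],
    num_runs: int = 5,
    warmup: int = 1,
    sync: Optional[Callable[[], None]] = None,
) -> Dict[str, float]:
    """Time `fn` over `num_runs` calls after `warmup` untimed calls.

    `sync` is an optional barrier invoked before each clock read
    (hypothetical hook; pass a device synchronize function for GPU work).
    """
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    for _ in range(warmup):
        fn()  # absorbs one-time costs such as compilation or cache fills
    samples = []
    for _ in range(num_runs):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.fmean(samples),
        "min_s": min(samples),
        "max_s": max(samples),
        "stdev_s": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in CPU workloads; a real benchmark would call a pipeline here.
    configs = {
        "large": lambda: sum(i * i for i in range(200_000)),
        "small": lambda: sum(i * i for i in range(50_000)),
    }
    for name, fn in configs.items():
        stats = time_runs(fn, num_runs=5, warmup=1)
        print(f"{name}: mean {stats['mean_s'] * 1e3:.2f} ms, "
              f"min {stats['min_s'] * 1e3:.2f} ms")
```

Reporting the minimum alongside the mean helps separate steady-state speed from run-to-run noise.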

## Usage

```bash
# Basic FLUX benchmark
python model_perf_benchmark.py --model flux --mode basic

# Qwen-Image with FP8 optimization
python model_perf_benchmark.py --model qwen_image --mode fp8

# FLUX with Torch compile optimization
python model_perf_benchmark.py --model flux --mode compile

# Qwen-Image with all optimizations and profiling
python model_perf_benchmark.py --model qwen_image --mode all --trace-file

# FLUX profiling with auto-generated filename (includes config and GPU info)
python model_perf_benchmark.py --model flux --mode fp8 --trace-file

# Qwen-Image with custom prompt and profiling
python model_perf_benchmark.py --model qwen_image --mode fp8 --prompt "a cyberpunk cityscape" --trace-file

# Benchmark with specific loop count
python model_perf_benchmark.py --model flux --mode compile --num-runs 10
```

For Qwen-Image models, you may need to specify additional paths:
```bash
# Qwen-Image with custom model paths
python model_perf_benchmark.py --model qwen_image --model-path /path/to/model --encoder-path /path/to/encoder --vae-path /path/to/vae --mode basic
```
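
The flags used in the commands above could be parsed along the following lines. This is an illustrative sketch only: the script's real argument parser is not shown here, so `build_parser`, the `required` setting, the `--num-runs` default, and the `"auto"` placeholder are assumptions. Only the mode names that appear in the commands are listed; the script may accept others.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags in the usage examples."""
    parser = argparse.ArgumentParser(description="Image pipeline benchmark (sketch)")
    parser.add_argument("--model", choices=["flux", "qwen_image"], required=True)
    # "basic" is documented as the default mode.
    parser.add_argument("--mode", choices=["basic", "fp8", "compile", "all"],
                        default="basic")
    parser.add_argument("--prompt", default=None)
    parser.add_argument("--num-runs", type=int, default=5)  # default is an assumption
    # nargs="?" lets the flag appear bare (auto-generated name, represented
    # here by the placeholder "auto") or with an explicit path.
    parser.add_argument("--trace-file", nargs="?", const="auto", default=None)
    parser.add_argument("--model-path")
    parser.add_argument("--encoder-path")
    parser.add_argument("--vae-path")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--model", "flux", "--mode", "fp8", "--trace-file"]
    )
    print(args.model, args.mode, args.trace_file)
```

Using `nargs="?"` with `const` is one way to support a bare `--trace-file` as in the examples, while still allowing an explicit filename.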

## Output

The script generates:
- Performance timing results
- Sample images from each run
- Chrome trace files for detailed profiling (if `--trace-file` is specified)

You can view the trace files at https://ui.perfetto.dev/
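
Chrome trace files are JSON, so they can also be summarized offline. The sketch below assumes the standard Chrome Trace Event layout (a `traceEvents` array, or a bare array, in which complete events have `"ph": "X"` and a `dur` in microseconds); `load_trace` and `top_events` are hypothetical helpers, and the sample events are synthetic rather than output from this benchmark.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple


def load_trace(path: str) -> object:
    """Read a Chrome trace JSON file from disk."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def top_events(trace: object, limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` event names with the largest total duration in ms."""
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    totals: Dict[str, float] = defaultdict(float)
    for ev in events:
        # "X" marks complete events, which carry a duration in microseconds.
        if ev.get("ph") == "X" and "dur" in ev:
            totals[ev["name"]] += ev["dur"] / 1000.0
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:limit]


if __name__ == "__main__":
    # Synthetic trace for illustration; use load_trace(path) for a real file.
    sample = {"traceEvents": [
        {"name": "attention", "ph": "X", "ts": 0, "dur": 3000},
        {"name": "linear", "ph": "X", "ts": 3000, "dur": 1500},
        {"name": "attention", "ph": "X", "ts": 4500, "dur": 2000},
        {"name": "process_name", "ph": "M"},
    ]}
    for name, ms in top_events(sample):
        print(f"{name}: {ms:.1f} ms")
```

Summing by name gives a quick view of which operations dominate a run before opening the full timeline.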
