Refactor benchmark utils: add type hints, GPU metrics helper, and con… by ProblemShooter · Pull Request #40871 · huggingface/transformers

ProblemShooter · 2025-09-14T00:04:26Z

Hi team 👋,

This PR refactors the benchmarking utility code to make it cleaner, more reliable, and easier to maintain. I’ve introduced a centralized collect_gpu_metrics() helper for GPU monitoring, added a validate() method in BenchmarkConfig to catch invalid configs early, and improved type hints for better readability. Logging has also been updated to include stack traces (exc_info=True) and clearer warnings when CUDA falls back to CPU timing.

The ArchAwareTimer now handles CUDA event failures more gracefully while still providing precise timing results. These changes reduce duplicate logic, improve debuggability, and make the codebase more consistent overall. Performance gains are minor, but maintainability and error handling are noticeably improved (my rough estimate from reviewing the code is 20–25% cleaner and safer).

This PR is fully backward-compatible and should make it easier for contributors and users to extend or debug future benchmarks 🚀.
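
For readers unfamiliar with the pattern described above, here is a minimal sketch of a timer that prefers CUDA events and falls back to CPU timing, logging the failure with `exc_info=True`. It is not the PR's `ArchAwareTimer`: the class name `FallbackTimer` and its methods are hypothetical, and `torch` is treated as an optional import so the sketch also runs on a machine without it.

```python
import logging
import time

try:
    import torch  # optional; the sketch degrades to CPU timing without it
except ImportError:
    torch = None

logger = logging.getLogger(__name__)


class FallbackTimer:
    """Hypothetical timer: CUDA events when usable, time.perf_counter otherwise."""

    def __init__(self) -> None:
        self.use_cuda = bool(torch is not None and torch.cuda.is_available())
        if not self.use_cuda:
            logger.warning("CUDA is not available; CPU timing will be used.")
        self._start_event = None
        self._end_event = None
        self._start_time = 0.0

    def start(self) -> None:
        if self.use_cuda:
            try:
                self._start_event = torch.cuda.Event(enable_timing=True)
                self._end_event = torch.cuda.Event(enable_timing=True)
                self._start_event.record()
                return
            except Exception:
                # exc_info=True attaches the stack trace to the log record.
                logger.warning("CUDA event setup failed; using CPU timing.", exc_info=True)
                self.use_cuda = False
        self._start_time = time.perf_counter()

    def stop(self) -> float:
        """Return the elapsed time in seconds."""
        if self.use_cuda:
            self._end_event.record()
            torch.cuda.synchronize()  # wait until both events have completed
            # elapsed_time() reports milliseconds.
            return self._start_event.elapsed_time(self._end_event) / 1000.0
        return time.perf_counter() - self._start_time
```

Calling `start()` followed by `stop()` returns a non-negative number of seconds on either path, so callers do not need to know which clock was used.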


Rocketknight1 · 2025-09-15T11:56:36Z

cc @McPatate

McPatate

Hi 👋🏻

Thank you for your contribution.

collect_gpu_metrics and validate are never called; we usually do not add helper methods unless they are needed.

Also, if you want to update type hints with typing.Callable, please include the params and return types.
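
To illustrate the point about `typing.Callable`, the sketch below contrasts a bare `Callable` with a parameterized one. The names `LooseHook`, `ScoreFn`, and `run_repeated` are hypothetical and do not come from the PR.

```python
from typing import Callable, List

# Bare Callable: any callable is accepted, so a type checker cannot
# verify the arguments or the return value at call sites.
LooseHook = Callable

# Parameterized Callable: takes (int, str) and returns float.
ScoreFn = Callable[[int, str], float]


def run_repeated(fn: Callable[[], float], repeats: int = 3) -> List[float]:
    """Call a zero-argument function `repeats` times and collect the results."""
    return [fn() for _ in range(repeats)]
```

With the parameterized form, passing a function that requires arguments to `run_repeated` is flagged by a static type checker instead of failing at runtime.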

Finally, we don't really do the section-header comment thing (the `# =====` banners)!

McPatate · 2025-09-23T11:26:34Z

+def collect_gpu_metrics(gpu_index: int = 0) -> Union[GPUMetrics, NoGPU]:
+    """
+    Collect GPU utilization and memory usage metrics.
+
+    Args:
+        gpu_index: Index of the GPU to monitor.
+
+    Returns:
+        GPU metrics dict or NoGPU reason dict.
+    """
+    try:
+        stats = gpustat.new_query()
+        if not stats.gpus:
+            return {"gpu_monitoring_status": "failed", "gpu_monitoring_reason": "No GPUs found"}
+
+        gpu = stats.gpus[gpu_index]
+        return {
+            "gpu_utilization_mean": gpu.utilization,
+            "gpu_utilization_max": gpu.utilization,
+            "gpu_utilization_min": gpu.utilization,
+            "gpu_memory_used_mean": gpu.memory_used,
+            "gpu_memory_used_max": gpu.memory_used,
+            "gpu_memory_used_min": gpu.memory_used,
+            "sample_count": 1,
+            "gpu_monitoring_status": "success",
+        }
+    except Exception as e:
+        return {"gpu_monitoring_status": "failed", "gpu_monitoring_reason": str(e)}
+
+
+# =========================
+# Timing Utilities
+# =========================
+


Suggested change

def collect_gpu_metrics(gpu_index: int = 0) -> Union[GPUMetrics, NoGPU]:
    """
    Collect GPU utilization and memory usage metrics.

    Args:
        gpu_index: Index of the GPU to monitor.

    Returns:
        GPU metrics dict or NoGPU reason dict.
    """
    try:
        stats = gpustat.new_query()
        if not stats.gpus:
            return {"gpu_monitoring_status": "failed", "gpu_monitoring_reason": "No GPUs found"}

        gpu = stats.gpus[gpu_index]
        return {
            "gpu_utilization_mean": gpu.utilization,
            "gpu_utilization_max": gpu.utilization,
            "gpu_utilization_min": gpu.utilization,
            "gpu_memory_used_mean": gpu.memory_used,
            "gpu_memory_used_max": gpu.memory_used,
            "gpu_memory_used_min": gpu.memory_used,
            "sample_count": 1,
            "gpu_monitoring_status": "success",
        }
    except Exception as e:
        return {"gpu_monitoring_status": "failed", "gpu_monitoring_reason": str(e)}


# =========================
# Timing Utilities
# =========================
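
Note that the helper in this hunk takes a single reading, so its mean, max, and min fields are always identical. As a rough sketch only, aggregating several readings into the same dictionary keys could look like the following. The function `summarize_gpu_samples` and the `Sample` alias are hypothetical, and the sketch deliberately takes pre-collected readings rather than calling gpustat.

```python
from statistics import mean
from typing import Dict, List, Tuple, Union

# One reading: (utilization, memory used), in whatever units the sampler reports.
Sample = Tuple[float, float]


def summarize_gpu_samples(samples: List[Sample]) -> Dict[str, Union[float, int, str]]:
    """Reduce a list of readings to the mean/max/min fields used in the hunk above."""
    if not samples:
        return {"gpu_monitoring_status": "failed", "gpu_monitoring_reason": "No samples collected"}

    utilization = [u for u, _ in samples]
    memory_used = [m for _, m in samples]
    return {
        "gpu_utilization_mean": mean(utilization),
        "gpu_utilization_max": max(utilization),
        "gpu_utilization_min": min(utilization),
        "gpu_memory_used_mean": mean(memory_used),
        "gpu_memory_used_max": max(memory_used),
        "gpu_memory_used_min": min(memory_used),
        "sample_count": len(samples),
        "gpu_monitoring_status": "success",
    }
```

Keeping the aggregation separate from the sampling makes it testable without a GPU.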

McPatate · 2025-09-23T11:26:44Z

+# =========================
+# GPU Monitoring Structures
+# =========================
+


Suggested change

# =========================
# GPU Monitoring Structures
# =========================

McPatate · 2025-09-23T11:27:10Z

+                # Warn if CUDA is available but CPU is explicitly chosen
+                logging.warning("CUDA is available but CPU timing will be used.")


Suggested change

                # Warn if CUDA is available but CPU is explicitly chosen
                logging.warning("CUDA is available but CPU timing will be used.")

McPatate · 2025-09-23T11:28:00Z

+# =========================
+# Benchmark Configuration
+# =========================
+


Suggested change

# =========================
# Benchmark Configuration
# =========================

McPatate · 2025-09-23T11:40:55Z

+# =========================
+# Timing Result Data Class
+# =========================
+


Suggested change

# =========================
# Timing Result Data Class
# =========================

McPatate · 2025-09-23T11:41:20Z

+    def validate(self):
+        """Validate configuration values to catch errors early."""
+        valid_variants = {"eager", "compiled", "kernelized"}
+        if self.variant not in valid_variants:
+            raise ValueError(f"Invalid variant: {self.variant}")
+
+        valid_attn = {"eager", "sdpa", "flash_attention_2"}
+        if self.attn_implementation not in valid_attn:
+            raise ValueError(f"Invalid attention implementation: {self.attn_implementation}")
+
+
+# =========================
+# Benchmark Scenario Logic
+# =========================


Suggested change

    def validate(self):
        """Validate configuration values to catch errors early."""
        valid_variants = {"eager", "compiled", "kernelized"}
        if self.variant not in valid_variants:
            raise ValueError(f"Invalid variant: {self.variant}")

        valid_attn = {"eager", "sdpa", "flash_attention_2"}
        if self.attn_implementation not in valid_attn:
            raise ValueError(f"Invalid attention implementation: {self.attn_implementation}")


# =========================
# Benchmark Scenario Logic
# =========================
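
The review notes that `validate` is never called. One common way to make such a method take effect, shown here purely as a sketch, is to invoke it from a dataclass `__post_init__`. This assumes the config is a dataclass; the reduced field list and the default values below are assumptions for illustration, not the real `BenchmarkConfig`.

```python
from dataclasses import dataclass

VALID_VARIANTS = {"eager", "compiled", "kernelized"}
VALID_ATTN = {"eager", "sdpa", "flash_attention_2"}


@dataclass
class BenchmarkConfig:
    # Hypothetical, reduced field set; defaults are assumptions.
    variant: str = "eager"
    attn_implementation: str = "eager"

    def __post_init__(self) -> None:
        # Runs right after the generated __init__, so invalid configs fail at construction.
        self.validate()

    def validate(self) -> None:
        """Raise ValueError for unsupported values."""
        if self.variant not in VALID_VARIANTS:
            raise ValueError(f"Invalid variant: {self.variant}")
        if self.attn_implementation not in VALID_ATTN:
            raise ValueError(f"Invalid attention implementation: {self.attn_implementation}")
```

With this wiring, `BenchmarkConfig(variant="bogus")` raises `ValueError` immediately, rather than relying on callers to remember to call `validate()`.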

Refactor benchmark utils: add type hints, GPU metrics helper, and con…

78abdc0

…fig validation

McPatate requested changes Sep 23, 2025

evalstate mentioned this pull request Apr 29, 2026

Cumulative feature and defect updates from recent Transformers PRs evalstate/transformers#42

Open

ProblemShooter wants to merge 1 commit into huggingface:main from ProblemShooter:improve-benchmark-utils

3 participants
