test-backend-ops: enables perf/eval testing of composite ops #14833
Open
etasnadi wants to merge 7 commits into ggml-org:master
Pull Request Overview
This PR adds support for testing computation graphs ("composite ops") in test-backend-ops, enabling performance and correctness evaluation of fused operations against indirect implementations. This is useful for op development when the direct implementation doesn't exist on CPU but can be tested via an equivalent computation graph.
- Introduces `test_case_compare` class for comparing outputs between different operation implementations
- Adds support for composite operation performance testing with proper node duplication logic
- Implements example comparison between direct CONV_2D and im2col-based CONV_2D implementations
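
The direct versus im2col-based comparison in the last bullet relies on the two paths being numerically equivalent. The following is a minimal, self-contained sketch of that equivalence in plain C++, not the PR's code and not the ggml API: `Shape`, `conv2d_direct`, `im2col`, `conv2d_im2col` and `max_abs_diff` are hypothetical names, and the sketch assumes a single channel, stride 1, and no padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: height and width of a row-major 2D buffer.
struct Shape { int h; int w; };

// Direct path: slide the kernel over the input and accumulate in place.
std::vector<float> conv2d_direct(const std::vector<float>& in, Shape is,
                                 const std::vector<float>& k, Shape ks) {
    assert(in.size() == static_cast<std::size_t>(is.h * is.w));
    assert(k.size() == static_cast<std::size_t>(ks.h * ks.w));
    const int oh = is.h - ks.h + 1;
    const int ow = is.w - ks.w + 1;
    std::vector<float> out(static_cast<std::size_t>(oh * ow), 0.0f);
    for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x)
            for (int ky = 0; ky < ks.h; ++ky)
                for (int kx = 0; kx < ks.w; ++kx)
                    out[y * ow + x] += in[(y + ky) * is.w + (x + kx)] * k[ky * ks.w + kx];
    return out;
}

// Indirect path, step 1: unroll every kernel-sized patch into one matrix row.
std::vector<float> im2col(const std::vector<float>& in, Shape is, Shape ks) {
    const int oh = is.h - ks.h + 1;
    const int ow = is.w - ks.w + 1;
    const int cols = ks.h * ks.w;
    std::vector<float> m(static_cast<std::size_t>(oh * ow * cols));
    for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x)
            for (int ky = 0; ky < ks.h; ++ky)
                for (int kx = 0; kx < ks.w; ++kx)
                    m[(y * ow + x) * cols + ky * ks.w + kx] = in[(y + ky) * is.w + (x + kx)];
    return m;
}

// Indirect path, step 2: multiply the patch matrix by the flattened kernel.
std::vector<float> conv2d_im2col(const std::vector<float>& in, Shape is,
                                 const std::vector<float>& k, Shape ks) {
    const std::vector<float> patches = im2col(in, is, ks);
    const std::size_t cols = k.size();
    const std::size_t rows = patches.size() / cols;
    std::vector<float> out(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            out[r] += patches[r * cols + c] * k[c];
    return out;
}

// Largest element-wise absolute difference between two equally sized buffers.
float max_abs_diff(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

Calling `conv2d_direct` and `conv2d_im2col` on the same input and kernel and passing both results to `max_abs_diff` gives the kind of check a comparison test case performs. A real backend test would likely use a tolerance rather than exact equality, since accumulation order can differ between implementations.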
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test-backend-ops.cpp | Main implementation adding composite op testing infrastructure and CONV_2D comparison examples |
| ggml/src/ggml-backend.cpp | Adds new function for comparing outputs between two different computation graphs |
| ggml/include/ggml-backend.h | Declares the new graph comparison function in the public API |
CISC reviewed Jul 24, 2025
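This patch adds support for testing computation graphs ("composite ops") in `test-backend-ops`. This is useful for op development when the direct implementation does not exist on CPU but can be tested against an equivalent computation graph.

Currently, correctness is tested with out-of-tree code (#14316 or #14316), and performance with non-standardized, vibe-coded standalone gists, as in #14388 (comment).

In particular, this PR enables performance and correctness (eval) testing of composite ops.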
An example is comparing the output of `CONV_2D` (the direct conv implementation) with `ggml_conv_2d` (the indirect conv implementation, whose resulting graph contains an im2col followed by a mul_mat).

To test the output of an op against a graph, the user adds a test case for the graph and one for the actual op, then subclasses `test_case_compare : public test_case`, which accepts the two test cases in its constructor. The tensor name assignment should be defined in `test_case_compare` so that the `eval()` function knows how to copy the inputs between the two graphs before execution. The output nodes are then compared after execution.

When testing the perf of a graph, `get_input_names` of `test_case` should be overridden to return the names of the input tensors; `eval_perf` uses them to decide which nodes should be duplicated. The default implementation returns an empty list, in which case `eval_perf` assumes that the graph tests a regular op, consisting only of input nodes connected to a single output node that does the actual calculation, so only the output is duplicated.
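
To make the compare workflow concrete, here is a toy model of it in plain C++. It is a sketch under stated assumptions, not the PR's `test_case_compare`: `ToyCase`, `toy_compare`, `make_fused_case` and `make_graph_case` are hypothetical names, tensors are plain `std::vector<float>`, and the tolerance check stands in for whatever error metric the real test uses. It shows the three steps described above: copy inputs by name from one case to the other, run both, compare the outputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Tensor = std::vector<float>;
using TensorMap = std::map<std::string, Tensor>;
using NameMap = std::vector<std::pair<std::string, std::string>>;

// Hypothetical stand-in for a test case: named inputs plus a function
// that computes the output from them.
struct ToyCase {
    TensorMap inputs;
    std::function<Tensor(const TensorMap&)> run;
};

// Copies inputs from `ref` into `cand` following the (from, to) name pairs,
// runs both cases, and reports whether the outputs agree within `tol`.
// `at()` throws if a mapped name is missing from either case.
bool toy_compare(const ToyCase& ref, ToyCase cand, const NameMap& name_map, float tol) {
    for (const auto& p : name_map) {
        cand.inputs.at(p.second) = ref.inputs.at(p.first);
    }
    const Tensor a = ref.run(ref.inputs);
    const Tensor b = cand.run(cand.inputs);
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::fabs(a[i] - b[i]) > tol) return false;
    }
    return true;
}

// "Direct op" stand-in: computes a * b + a in a single step.
ToyCase make_fused_case() {
    ToyCase c;
    c.inputs["a"] = {1.0f, 2.0f, 3.0f};
    c.inputs["b"] = {4.0f, 5.0f, 6.0f};
    c.run = [](const TensorMap& in) {
        const Tensor& a = in.at("a");
        const Tensor& b = in.at("b");
        Tensor out(a.size());
        for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i] + a[i];
        return out;
    };
    return c;
}

// "Graph" stand-in: the same result as two separate steps (mul, then add),
// with differently named inputs that start out zeroed.
ToyCase make_graph_case() {
    ToyCase c;
    c.inputs["x"] = Tensor(3, 0.0f);
    c.inputs["y"] = Tensor(3, 0.0f);
    c.run = [](const TensorMap& in) {
        const Tensor& x = in.at("x");
        const Tensor& y = in.at("y");
        Tensor prod(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) prod[i] = x[i] * y[i];
        Tensor out(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) out[i] = prod[i] + x[i];
        return out;
    };
    return c;
}
```

With the name pairs `{"a", "x"}` and `{"b", "y"}`, `toy_compare(make_fused_case(), make_graph_case(), ...)` reports a match; with an empty mapping the second case keeps its zeroed inputs and the outputs differ, which is why the name assignment has to be defined.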