Conversation

tlemo (Contributor) commented Jul 24, 2025

This PR implements the suggestion described in #1308

1. Allow passing multiple comma-separated ops to test-backend-ops. This can be convenient when working on a set of ops that you want to test together, without having to run every single op. For example:

`test-backend-ops.exe test -o "ADD,RMS_NORM,ROPE,SILU,SOFT_MAX"`

2. Support full test-case variation strings in addition to basic op names. This makes it easy to select a single variation, either for testing or for benchmarking, and is particularly useful for profiling a specific variation (e.g. a CUDA kernel). For example:

`test-backend-ops.exe perf -b CUDA0 -o "MUL_MAT(type_a=f16,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=2)"`

These two can be combined, as the sketch below illustrates. As with the current `-o`, this change doesn't try to detect or report an error if a filter doesn't name an existing op (e.g. if it is misspelled).
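For illustration, here is a minimal sketch of how such a combined `-o` filter could be parsed and matched. This is not the actual test-backend-ops.cpp implementation, and the function names (`split_filters`, `filter_matches`) are hypothetical. The key detail is splitting only on top-level commas, so the commas inside a variation string's parentheses stay intact:

```cpp
#include <string>
#include <vector>

// Split the -o argument on commas that sit outside parentheses, so that
// "ADD,MUL_MAT(type_a=f16,type_b=f32,...)" yields two filters, not many.
static std::vector<std::string> split_filters(const std::string & arg) {
    std::vector<std::string> filters;
    std::string cur;
    int depth = 0;
    for (char c : arg) {
        if (c == '(') depth++;
        if (c == ')') depth--;
        if (c == ',' && depth == 0) {
            filters.push_back(cur);
            cur.clear();
        } else {
            cur += c;
        }
    }
    filters.push_back(cur);
    return filters;
}

// A filter matches a test case either as a bare op name (e.g. "ADD") or as
// the full variation string (op name plus parameters).
static bool filter_matches(const std::vector<std::string> & filters,
                           const std::string & op_name,
                           const std::string & variation) {
    for (const auto & f : filters) {
        if (f == op_name || f == variation) {
            return true;
        }
    }
    return false;
}
```

Splitting only at parenthesis depth zero is what allows bare op names and full variation strings to be mixed freely in a single `-o` argument.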
tlemo (Contributor, Author) commented Jul 24, 2025

Question for the maintainers: the GGML and llama.cpp versions of test-backend-ops.cpp appear to have diverged. Which project would be best for submitting suggestions & PRs for GGML-specific areas?

CISC (Contributor) commented Jul 24, 2025

test-backend-ops.cpp will get synced with llama.cpp (on a roughly weekly basis), so submitting the PR here is fine.

slaren (Member) commented Jul 24, 2025

It's still preferable to open the PR in the llama.cpp repository because it is likely to receive more attention, the CI is better, and it reduces the chances of merge conflicts.
@ggerganov maybe PRs in this repository should be strictly limited to the examples or other code that isn't shared.

tlemo (Contributor, Author) commented Jul 24, 2025

Thanks @CISC, @slaren. My current experiments are based on llama.cpp, and it would be easier for me to extract and submit PRs from there. Do we still want this PR here, or should I close it and reopen one in llama.cpp?

@CISC, are you sure test-backend-ops.cpp is automatically synced? From a quick glance, the two versions seem to have diverged enough that automatic merges would be difficult.

ggerganov (Member) commented Jul 24, 2025

> @ggerganov maybe PRs in this repository should be strictly limited to the examples or other code that isn't shared.

Yes, the conflicts are sometimes too difficult to resolve. We can avoid that by pushing changes only through llama.cpp (whisper.cpp does not get many core contributions either way).

@tlemo Yes, they are synced - here is the incoming latest version: #1311

In any case, let's move the PR to llama.cpp for better coverage and reviews.

tlemo (Contributor, Author) commented Jul 24, 2025

Moved to llama.cpp: ggml-org/llama.cpp#14865

@tlemo tlemo closed this Jul 24, 2025