[FEATURE] GPU: Support patching fatbin with multiple PTXs #463

@Officeyutong

Description

Some programs, such as llama.cpp, load a fatbin containing 100+ PTX files. We do not support this case yet: the current code assumes a fatbin contains only one PTX. This issue tracks adding support for fatbins with multiple PTX files.

What to do:

  • Use cuobjdump --extract-ptx to extract all PTX files
  • Patch each PTX file the same way we already patch a single PTX file
  • Use fatbinary to combine the patched PTX files back into one fatbin
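
The three steps above could be wired together roughly as follows. This is only a sketch under stated assumptions, not the project's implementation: all function names are hypothetical, patch_one stands in for the existing single-PTX patcher, cuobjdump is assumed to accept "all" as the argument to --extract-ptx and to write the extracted .ptx files into the current working directory, and the fatbinary flags (-64, --create, --image3=kind=ptx,sm=...,file=...) are modelled on what nvcc --dryrun prints, so they should be checked against fatbinary --help for the installed CUDA toolkit. The sm value is read from each file's .target directive.

```python
import re
import subprocess
from pathlib import Path
from typing import Callable, List


def extract_ptx_command(fatbin: Path) -> List[str]:
    # Assumption: "all" makes cuobjdump dump every embedded PTX file.
    return ["cuobjdump", "--extract-ptx", "all", str(fatbin)]


# Every PTX module declares its architecture, e.g. ".target sm_80".
_TARGET_RE = re.compile(r"^\s*\.target\s+sm_(\w+)", re.MULTILINE)


def ptx_target_sm(ptx_text: str) -> str:
    """Return the architecture suffix from the .target directive ("80")."""
    match = _TARGET_RE.search(ptx_text)
    if match is None:
        raise ValueError("no .target directive found in PTX")
    return match.group(1)


def patch_all(ptx_dir: Path, patch_one: Callable[[str], str]) -> List[Path]:
    """Apply the single-PTX patcher to every .ptx file in ptx_dir, in place."""
    patched = []
    for path in sorted(ptx_dir.glob("*.ptx")):
        path.write_text(patch_one(path.read_text()))
        patched.append(path)
    return patched


def fatbinary_command(ptx_files: List[Path], out: Path) -> List[str]:
    # Assumption: flag spellings follow `nvcc --dryrun` output and may
    # differ between CUDA toolkit versions.
    cmd = ["fatbinary", "-64", f"--create={out}"]
    for path in ptx_files:
        sm = ptx_target_sm(path.read_text())
        cmd.append(f"--image3=kind=ptx,sm={sm},file={path}")
    return cmd


def repack(fatbin: Path, work_dir: Path, out: Path,
           patch_one: Callable[[str], str]) -> None:
    """Extract all PTX files, patch each one, and rebuild a fatbin."""
    work_dir.mkdir(parents=True, exist_ok=True)
    # Run inside work_dir so the extracted files land there.
    subprocess.run(extract_ptx_command(fatbin.resolve()),
                   cwd=work_dir, check=True)
    patched = patch_all(work_dir, patch_one)
    subprocess.run(fatbinary_command(patched, out), check=True)
```

Keeping the command construction separate from subprocess.run lets the per-file logic be tested on machines without the CUDA toolkit. Only repack actually needs cuobjdump and fatbinary on PATH.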

Labels

enhancement