Build issue in configure: doesn't detect AMD's libdrm, and has no option to locate it #720

@dletai

Description

What version of hwloc are you using?

Compiling hwloc-2.12.1

Which operating system and hardware are you running on?

OS: RHEL 9.4
HW: x86_64
ROCM version: rocm-6.4.1.60401-83.el9.x86_64
amdgpu: libdrm-amdgpu-devel-2.4.124.60401-2164967.el9.x86_64

  • On Unix-like systems, run uname -a so that we know which operating system, distribution, and kernel version you are using.
    Linux node 5.14.0-427.42.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 18 14:35:40 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

Details of the problem

  • What happened?
    Running ./configure reports:

**** RSMI configuration
configure: using standard ROCm install path /opt/rocm ...
checking for rocm_smi/rocm_smi.h... no
**** end of RSMI configuration

However, the header exists:

ls -l /opt/rocm/include/rocm_smi/rocm_smi.h 
-rw-r--r-- 1 root root 192700 May 13 07:07 /opt/rocm/include/rocm_smi/rocm_smi.h
  • config.log shows that the check fails because rocm_smi.h indirectly includes libdrm/drm.h, which is not on the include path:
configure:29793: using standard ROCm install path /opt/rocm ...
configure:29817: checking for rocm_smi/rocm_smi.h
configure:29817: gcc -c -g -O2  -I/opt/rocm/include/ conftest.c >&5
In file included from /opt/rocm/include/rocm_smi/rocm_smi.h:57,
                 from conftest.c:173:
/opt/rocm/include/rocm_smi/kfd_ioctl.h:26:10: fatal error: libdrm/drm.h: No such file or directory
   26 | #include <libdrm/drm.h>
      |          ^~~~~~~~~~~~~~
compilation terminated.

  • drm.h is provided by the libdrm-amdgpu-devel RPM, which installs it under /opt/amdgpu:
rpm -ql libdrm-amdgpu-devel | grep 'libdrm/drm.h'
/opt/amdgpu/include/libdrm/drm.h
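
The mismatch between the two include roots can be checked without running configure. The following is a minimal sketch, assuming a hypothetical helper named find_header (not part of hwloc or ROCm) and the directories mentioned in this report; it only tests for file existence and does not invoke the compiler:

```shell
#!/bin/sh
# find_header REL_PATH DIR...
# Print the first DIR that contains REL_PATH and return 0; return 1 if none does.
# Hypothetical helper for illustration only.
find_header() {
    rel=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$rel" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Which include root provides libdrm/drm.h on this machine?
find_header libdrm/drm.h /opt/rocm/include /usr/include /opt/amdgpu/include \
    || echo "libdrm/drm.h not found in the listed directories"
```

On the system described here, this would be expected to report /opt/amdgpu/include, which is not among the directories that configure passes to gcc.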

Additional information

A simple workaround is to pass the libdrm locations explicitly:

CPPFLAGS=-I/opt/amdgpu/include LDFLAGS=-L/opt/amdgpu/lib64 ../hwloc-2.12.1/configure

**** RSMI configuration
configure: using standard ROCm install path /opt/rocm ...
checking for rocm_smi/rocm_smi.h... yes
checking for rsmi_init in -lrocm_smi64... yes
checking whether a program linked with -lrocm_smi64 can run... no
checking whether rsmi_dev_partition_id_get is declared... yes
**** end of RSMI configuration

The run check fails only because the build machine has no ROCm device, so that is not a problem. The actual issue is that configure neither searches /opt/amdgpu for libdrm nor provides a command-line option to specify a custom libdrm location.
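
Until configure handles this itself, the workaround could be wrapped so that the extra flags are only added when the amdgpu libdrm headers are actually present. This is a rough sketch under stated assumptions: drm_flags is a hypothetical helper name, the lib64 subdirectory is taken from the workaround in this report, and the printed assignments replace (rather than extend) any CPPFLAGS or LDFLAGS already set in the environment:

```shell
#!/bin/sh
# drm_flags PREFIX
# Print CPPFLAGS/LDFLAGS assignments when PREFIX/include/libdrm/drm.h exists;
# print nothing otherwise. Hypothetical helper for illustration only.
drm_flags() {
    prefix=$1
    if [ -f "$prefix/include/libdrm/drm.h" ]; then
        printf 'CPPFLAGS=-I%s/include LDFLAGS=-L%s/lib64\n' "$prefix" "$prefix"
    fi
}

# Usage (not executed here; relies on word splitting, so PREFIX must not contain spaces):
#   env $(drm_flags /opt/amdgpu) ../hwloc-2.12.1/configure
drm_flags /opt/amdgpu
```

When the header is missing, the helper prints nothing and the configure invocation runs unchanged.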
