ROCm System Management


ROCm System Management Interface

This repository includes the rocm-smi tool. This tool exposes functionality for clock and temperature management of your ROCm-enabled system.


You may find rocm-smi at the following location after installing the rocm package:


Alternatively, you may clone this repository and run the tool directly.


The SMI will report a “version” which is the version of the kernel installed:

AMD ROCm System Management Interface v$(uname)

For ROCk installations, this will be the AMDGPU module version (e.g. 5.0.71). For non-ROCk or monolithic ROCk installations, this will be the kernel version, which is equivalent to the output of the following bash command:

$(uname -a | cut -d ' ' -f 3)
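The third field of `uname -a` is the kernel release. As a minimal sketch (not the SMI's own implementation), the same value can be read from Python's standard library; the helper name `kernel_release` is hypothetical.

```python
import platform


def kernel_release() -> str:
    """Return the kernel release string, i.e. the third field of `uname -a`."""
    return platform.uname().release


if __name__ == "__main__":
    # Prints the release of the running kernel; the exact value depends on the host.
    print(kernel_release())
```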


For detailed and up-to-date usage information, we recommend consulting the help:

/opt/rocm/bin/rocm-smi -h

For convenience, the output of the -h flag is reproduced below:

AMD ROCm System Management Interface | ROCM-SMI version: 1.3.1|

usage: rocm-smi [-h] [-d DEVICE [DEVICE ...]] [--alldevices] [--showhw] [-a] [-i] [-v] [--showdriverversion] [--showfwinfo [BLOCK [BLOCK ...]]] [--showmclkrange] [--showmemvendor] [--showsclkrange] [--showproductname] [--showserial] [--showuniqueid] [--showvoltagerange] [--showbus] [--showpagesinfo] [--showpendingpages] [--showretiredpages] [--showunreservablepages] [-f] [-P] [-t] [-u] [--showmemuse] [--showvoltage] [-b] [-c] [-g] [-l] [-M] [-m] [-o] [-p] [-S] [-s] [--showmeminfo TYPE [TYPE ...]] [--showpids] [--showreplaycount] [--showrasinfo BLOCK [BLOCK ...]] [--showvc] [--showxgmierr] [-r] [--resetfans] [--resetprofile] [--resetpoweroverdrive] [--resetxgmierr] [--setsclk LEVEL [LEVEL ...]] [--setmclk LEVEL [LEVEL ...]] [--setpcie LEVEL [LEVEL ...]] [--setslevel SCLKLEVEL SCLK SVOLT] [--setmlevel MCLKLEVEL MCLK MVOLT] [--setvc POINT SCLK SVOLT] [--setsrange MINMAX SCLK] [--setmrange MINMAX SCLK] [--setfan LEVEL] [--setperflevel LEVEL] [--setoverdrive %] [--setmemoverdrive %] [--setpoweroverdrive WATTS] [--setprofile SETPROFILE] [--rasenable BLOCK ERRTYPE] [--rasdisable BLOCK ERRTYPE] [--rasinject BLOCK] [--gpureset] [--load FILE | --save FILE] [--autorespond RESPONSE] [--loglevel LEVEL] [--json]

-h, --help

show this help message and exit

--gpureset

Reset specified GPU (One GPU must be specified)

--load FILE

Load Clock, Fan, Performance and Profile settings

--save FILE

Save Clock, Fan, Performance and Profile settings

-d DEVICE [DEVICE ...], --device DEVICE [DEVICE ...]

Execute command on specified device

Display Options:

--alldevices

Execute command on non-AMD devices as well as AMD devices

--showhw

Show Hardware details

-a, --showallinfo

Show Temperature, Fan and Clock values

-i, --showid

Show GPU ID

-v, --showvbios

Show VBIOS version

--showdriverversion

Show kernel driver version

--showfwinfo [BLOCK [BLOCK ...]]

Show FW information

--showmclkrange

Show mclk range

--showmemvendor

Show GPU memory vendor

--showsclkrange

Show sclk range

--showproductname

Show SKU/Vendor name

--showserial

Show GPU’s Serial Number

--showuniqueid

Show GPU’s Unique ID

--showvoltagerange

Show voltage range

--showbus

Show PCI bus number

Pages information:

--showpagesinfo

Show retired, pending and unreservable pages

--showpendingpages

Show pending retired pages

--showretiredpages

Show retired pages

--showunreservablepages

Show unreservable pages

Hardware-related information:

-f, --showfan

Show current fan speed

-P, --showpower

Show current Average Graphics Package Power Consumption

-t, --showtemp

Show current temperature

-u, --showuse

Show current GPU use

--showmemuse

Show current GPU memory used

--showvoltage

Show current GPU voltage

Software-related/controlled information:

-b, --showbw

Show estimated PCIe use

-c, --showclocks

Show current clock frequencies

-g, --showgpuclocks

Show current GPU clock frequencies

-l, --showprofile

Show Compute Profile attributes

-M, --showmaxpower

Show maximum graphics package power this GPU will consume

-m, --showmemoverdrive

Show current GPU Memory Clock OverDrive level

-o, --showoverdrive

Show current GPU Clock OverDrive level

-p, --showperflevel

Show current DPM Performance Level

-S, --showclkvolt

Show supported GPU and Memory Clocks and Voltages

-s, --showclkfrq

Show supported GPU and Memory Clock

--showmeminfo TYPE [TYPE ...]

Show Memory usage information for given block(s) TYPE

--showpids

Show current running KFD PIDs

--showreplaycount

Show PCIe Replay Count

--showrasinfo BLOCK [BLOCK ...]

Show RAS enablement information and error counts for the specified block(s)

--showvc

Show voltage curve

--showxgmierr

Show XGMI error information since last read

Set options:

--setsclk LEVEL [LEVEL ...]

Set GPU Clock Frequency Level(s) (requires manual Perf level)

--setmclk LEVEL [LEVEL ...]

Set GPU Memory Clock Frequency Level(s) (requires manual Perf level)

--setpcie LEVEL [LEVEL ...]

Set PCIE Clock Frequency Level(s) (requires manual Perf level)

--setslevel SCLKLEVEL SCLK SVOLT

Change GPU Clock frequency (MHz) and Voltage (mV) for a specific Level

--setmlevel MCLKLEVEL MCLK MVOLT

Change GPU Memory clock frequency (MHz) and Voltage (mV) for a specific Level

--setvc POINT SCLK SVOLT

Change SCLK Voltage Curve (MHz mV) for a specific point

--setsrange MINMAX SCLK

Set min(0) or max(1) SCLK speed

--setmrange MINMAX SCLK

Set min(0) or max(1) MCLK speed

--setfan LEVEL

Set GPU Fan Speed (Level or %)

--setperflevel LEVEL

Set Performance Level

--setoverdrive %

Set GPU OverDrive level (requires manual|high Perf level)

--setmemoverdrive %

Set GPU Memory Overclock OverDrive level (requires manual|high Perf level)

--setpoweroverdrive WATTS

Set the maximum GPU power using Power OverDrive in Watts

--setprofile SETPROFILE

Specify Power Profile level (#) or a quoted string of CUSTOM Profile attributes “# # # #...” (requires manual Perf level)

--rasenable BLOCK ERRTYPE

Enable RAS for specified block and error type

--rasdisable BLOCK ERRTYPE

Disable RAS for specified block and error type

--rasinject BLOCK

Inject RAS poison for specified block (ONLY WORKS ON UNSECURE BOARDS)

Reset options:

-r, --resetclocks

Reset clocks and OverDrive to default

--resetfans

Reset fans to automatic (driver) control

--resetprofile

Reset Power Profile back to default

--resetpoweroverdrive

Set the maximum GPU power back to the device default state

--resetxgmierr

Reset XGMI error count

Auto-response options:

--autorespond RESPONSE

Response to automatically provide for all prompts (NOT RECOMMENDED)

Output options:

--loglevel LEVEL

How much output will be printed for what the program is doing, one of debug/info/warning/error/critical

--json

Print output in JSON format

Detailed Option Descriptions

--setsclk/--setmclk # [# # ...]: This allows you to set a mask for the levels. For example, if a GPU has 8 clock levels, you can set a mask to use levels 0, 5, 6 and 7 with --setsclk 0 5 6 7. This uses only the base level and the top 3 clock levels, which keeps the GPU at the base level when there is no GPU load and lets it use the top 3 levels when the GPU load increases.

--setfan LEVEL: This sets the fan speed to a value ranging from 0 to 255 (not from 0-100%). If the level ends with a %, the fan speed is calculated as pct*maxlevel/100 (maxlevel is usually 255, but is determined by the ASIC).

NOTE: While the hardware is usually capable of overriding this value when required, it is recommended to not set the fan level lower than the default value for extended periods of time.
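The level calculation described above can be sketched as follows. This is an illustration only: the function name `fan_level` is hypothetical, the default `max_level` of 255 is the usual value named in the text, and truncating the result to an integer is an assumption about rounding.

```python
def fan_level(value: str, max_level: int = 255) -> int:
    """Translate a --setfan style argument into a raw fan level.

    A trailing '%' means a percentage of max_level (pct * max_level / 100);
    otherwise the value is taken as a raw level between 0 and max_level.
    """
    value = value.strip()
    if value.endswith("%"):
        pct = float(value[:-1])
        if not 0 <= pct <= 100:
            raise ValueError(f"percentage out of range: {value}")
        # Truncation toward zero is an assumption; the tool may round differently.
        return int(pct * max_level / 100)
    level = int(value)
    if not 0 <= level <= max_level:
        raise ValueError(f"level out of range: {value}")
    return level


if __name__ == "__main__":
    print(fan_level("50%"))  # 127
    print(fan_level("128"))  # 128
```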

--setperflevel LEVEL: This lets you use the pre-defined Performance Level values, which can include:

  • auto (Automatically change PowerPlay values based on GPU workload)

  • low (Keep PowerPlay values low, regardless of workload)

  • high (Keep PowerPlay values high, regardless of workload)

  • manual (Only use values defined in sysfs values)

--setoverdrive/--setmemoverdrive #: DEPRECATED IN NEWER KERNEL VERSIONS (use --setslevel/--setmlevel instead). This sets the percentage above maximum for the max Performance Level. For example, --setoverdrive 20 will increase the top sclk level by 20%. If the maximum sclk level is 1000MHz, then --setoverdrive 20 will increase the maximum sclk to 1200MHz.
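The arithmetic in the example above can be written out as a short sketch. The function name `overdriven_clock_mhz` is hypothetical and only illustrates the percentage calculation, not how the driver applies it.

```python
def overdriven_clock_mhz(max_clock_mhz: float, overdrive_pct: float) -> float:
    """Return the top clock after raising it by overdrive_pct percent."""
    return max_clock_mhz * (100 + overdrive_pct) / 100


if __name__ == "__main__":
    print(overdriven_clock_mhz(1000, 20))  # 1200.0
```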

--setpoweroverdrive/--resetpoweroverdrive #: This allows users to change the maximum power available to a GPU package. The input value is in Watts. This limit is enforced by the hardware, and some cards allow users to set it to a higher value than the default that ships with the GPU. This Power OverDrive mode allows the GPU to run at higher frequencies for longer periods of time, though this may mean the GPU uses more power than it is allowed to use per power supply specifications. Each GPU has a model-specific maximum Power OverDrive that it will take; attempting to set a higher limit than that will cause this command to fail.

--setprofile SETPROFILE: The Compute Profile accepts 1 or n parameters, either the Profile to select (see --showprofile for a list of preset Power Profiles) or a quoted string of values for the CUSTOM profile. NOTE: These values can vary based on the ASIC, and may include:

  • SCLK_PROFILE_ENABLE - Whether or not to apply the 3 following SCLK settings (0=disable,1=enable). NOTE: This is a hidden field. If set to 0, the following 3 values are displayed as ‘-’

  • SCLK_UP_HYST - Delay before sclk is increased (in milliseconds)

  • SCLK_DOWN_HYST - Delay before sclk is decreased (in milliseconds)

  • SCLK_ACTIVE_LEVEL - Workload required before sclk levels change (in %)

  • MCLK_PROFILE_ENABLE - Whether or not to apply the 3 following MCLK settings (0=disable,1=enable). NOTE: This is a hidden field. If set to 0, the following 3 values are displayed as ‘-’

  • MCLK_UP_HYST - Delay before mclk is increased (in milliseconds)

  • MCLK_DOWN_HYST - Delay before mclk is decreased (in milliseconds)

  • MCLK_ACTIVE_LEVEL - Workload required before mclk levels change (in %)

  • BUSY_SET_POINT - Threshold for raw activity level before levels change

  • FPS - Frames Per Second

  • USE_RLC_BUSY - When set to 1, DPM is switched up as long as RLC busy message is received

  • MIN_ACTIVE_LEVEL - Workload required before levels change (in %)

When a compute queue is detected, these values will be automatically applied to the system.

Compute Power Profiles are only applied when the Performance Level is set to “auto”.

The CUSTOM Power Profile is only applied when the Performance Level is set to “manual”, so using this flag will automatically set the performance level to “manual”.

It is not possible to modify the non-CUSTOM Profiles. These are hard-coded by the kernel.

-P, --showpower: Show Average Graphics Package power consumption.

“Graphics Package” refers to the GPU plus any HBM (High-Bandwidth memory) modules, if present.

-M, --showmaxpower: Show the maximum Graphics Package power that the GPU will attempt to consume. This limit is enforced by the hardware.

--loglevel: This allows the user to set a logging level for the SMI’s actions. Currently this is only implemented for sysfs writes, but it can easily be expanded in the future to log other things from the SMI.

--showmeminfo: This allows the user to see the amount of used and total memory for a given block (vram, vis_vram, gtt). It returns the number of bytes used and the total number of bytes for each block. ‘all’ can be passed as a field to return all blocks, otherwise a quoted string is used for multiple values (e.g. “vram vis_vram”).

  • vram refers to the Video RAM, or graphics memory, on the specified device

  • vis_vram refers to Visible VRAM, which is the CPU-accessible video memory on the device

  • gtt refers to the Graphics Translation Table

-b, --showbw: This shows an approximation of the number of bytes received and sent by the GPU over the last second through the PCIe bus. Note that this will not work for APUs, since data for the GPU portion of the APU goes through the memory fabric and does not ‘enter/exit’ the chip via the PCIe interface; no accesses are generated, and the performance counters can’t count accesses that are not generated.

NOTE: It is not possible to easily grab the size of every packet that is transmitted in real time, so the kernel estimates the bandwidth by taking the maximum payload size (mps), which is the largest size a PCIe packet can be, and multiplying it by the number of packets received and sent. This means that the SMI reports the maximum estimated bandwidth; the actual usage could (and likely will) be less.
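The estimate described in the note amounts to a single multiplication. The sketch below is illustrative; the function and parameter names are hypothetical, and the packet counts and payload size would come from the kernel's counters.

```python
def estimated_pcie_bytes(received_packets: int, sent_packets: int,
                         max_payload_size: int) -> int:
    """Upper-bound estimate of bytes moved over PCIe in the sampling window.

    Every packet is assumed to carry the maximum payload size (mps),
    so the real traffic is at most this value.
    """
    return (received_packets + sent_packets) * max_payload_size


if __name__ == "__main__":
    # 1000 packets received and 500 sent with a 256-byte mps.
    print(estimated_pcie_bytes(1000, 500, 256))  # 384000
```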

--showrasinfo: This shows the RAS information for a given block. This includes enablement of the block (currently GFX, SDMA and UMC are the only supported blocks) and the number of errors:

  • ue - Uncorrectable errors

  • ce - Correctable errors

Clock Type Descriptions

DCEFCLK - DCE (Display)

FCLK - Data fabric (VG20 and later) - Data flow from XGMI, Memory, PCIe

SCLK - GFXCLK (Graphics core). Note - SOCCLK split from SCLK as of Vega10. Pre-Vega10 they were both controlled by SCLK

MCLK - GPU Memory (VRAM)

PCLK - PCIe bus. Note - This gives 2 speeds, PCIe Gen1 x1 and the highest available based on the hardware

SOCCLK - System clock (VG10 and later) - Data Fabric (DF), MM HUB, AT HUB, SYSTEM HUB, OSS, DFD. Note - DF split from SOCCLK as of Vega20. Pre-Vega20 they were both controlled by SOCCLK

--gpureset: This flag will attempt to reset the GPU for a specified device. This will invoke the GPU reset through the kernel debugfs file amdgpu_gpu_recover. Note that GPU reset will not always work, depending on the manner in which the GPU is hung.

--showdriverversion: This flag will print out the AMDGPU module version for amdgpu-pro or ROCK kernels. For other kernels, it will simply print out the name of the kernel (uname).

--showserial: This flag will print out the serial number for the graphics card. NOTE: This is currently only supported on Vega20 server cards that support it. Consumer cards and cards older than Vega20 will not support this feature.

--showproductname: This uses the pci.ids file to print out more information regarding the GPUs on the system. ‘update-pciids’ may need to be executed on the machine to get the latest PCI ID snapshot, as certain newer GPUs will not be present in the stock pci.ids file, and the file may even be absent on certain OS installation types.

--showpagesinfo | --showretiredpages | --showpendingpages | --showunreservablepages: These flags display the different “bad pages” as reported by the kernel. The three types of pages are:

  • Retired pages (reserved pages) - These pages are reserved and are unable to be used

  • Pending pages - These pages are pending for reservation, and will be reserved/retired

  • Unreservable pages - These pages are not reservable for some reason

--showmemuse | --showuse | --showmeminfo: --showuse and --showmemuse are used to indicate how busy the respective blocks are. For example, for --showuse (gpu_busy_percent sysfs file), the SMU samples every ms or so to see if any GPU block (RLC, MEC, PFP, CP) is busy. If so, that’s 1 (or high). If not, that’s 0 (low). If we have 5 high and 5 low samples, that means 50% utilization (50% GPU busy, or 50% GPU use). The windows and sampling vary from generation to generation, but that is how GPU and VRAM use is calculated in a generic sense.

--showmeminfo (and VRAM% in concise output) will show the amount of VRAM used (visible, total, GTT), as well as the total available for those partitions. The percentage shown there indicates the amount of used memory in terms of current allocations.

OverDrive settings

  • Enabling OverDrive requires both a card that supports OverDrive and a driver parameter that enables its use.

  • Because OverDrive features can damage your card, most workstation and server GPUs cannot use OverDrive.

  • Consumer GPUs that can use OverDrive must enable this feature by setting bit 14 in the amdgpu driver’s ppfeaturemask module parameter.

For OverDrive functionality, the OverDrive bit (bit 14) must be enabled (by default, the OverDrive bit is disabled on the ROCK and upstream kernels). This can be done by setting amdgpu.ppfeaturemask accordingly in the kernel parameters, or by changing the default value inside amdgpu_drv.c (if building your own kernel).
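Setting bit 14 is a plain bitwise OR on the 32-bit mask. The sketch below only illustrates the arithmetic; the helper names `enable_overdrive` and `overdrive_enabled` are hypothetical and are not part of the SMI or the driver.

```python
OVERDRIVE_BIT = 14  # OverDrive bit position in ppfeaturemask, as stated in the text


def enable_overdrive(mask: int) -> int:
    """Return mask with the OverDrive bit set, limited to 32 bits."""
    return (mask | (1 << OVERDRIVE_BIT)) & 0xFFFFFFFF


def overdrive_enabled(mask: int) -> bool:
    """Report whether the OverDrive bit is set in mask."""
    return bool(mask & (1 << OVERDRIVE_BIT))


if __name__ == "__main__":
    print(overdrive_enabled(0xFFFFBFFF))      # False
    print(hex(enable_overdrive(0xFFFFBFFF)))  # 0xffffffff
```

The resulting value is what would then be supplied as amdgpu.ppfeaturemask in the kernel parameters.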

As an example, if the ppfeaturemask is set to 0xffffbfff (11111111111111111011111111111111), then enabling the OverDrive bit would make it 0xffffffff (11111111111111111111111111111111). These are the flags that require OverDrive functionality to be enabled for the flag to work:


Testing changes

After making changes to the SMI, run the test script to ensure that all functionality remains intact before uploading the patch. This can be done using:

./ /opt/rocm/bin/rocm-smi

The test can run all flags for the SMI, or specific flags can be tested with the -s option.

Any new functionality added to the SMI should have a corresponding test added to the test script.


The information contained herein is for informational purposes only, and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD’s Standard Terms and Conditions of Sale.

AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Copyright (c) 2014-2017 Advanced Micro Devices, Inc. All rights reserved.

Programming ROCm-SMI

SYSFS Interface

Naming and data format standards for sysfs files

The libsensors library offers an interface to the raw sensors data through the sysfs interface. Since lm-sensors 3.0.0, libsensors is completely chip-independent. It assumes that all the kernel drivers implement the standard sysfs interface described in this document. This makes adding or updating support for any given chip very easy, as libsensors, and applications using it, do not need to be modified. This is a major improvement compared to lm-sensors 2.

Note that motherboards vary widely in the connections to sensor chips. There is no standard that ensures, for example, that the second temperature sensor is connected to the CPU, or that the second fan is on the CPU. Also, some values reported by the chips need some computation before they make full sense. For example, most chips can only measure voltages between 0 and +4V. Other voltages are scaled back into that range using external resistors. Since the values of these resistors can change from motherboard to motherboard, the conversions cannot be hard coded into the driver and have to be done in user space.

For this reason, even if we aim at a chip-independent libsensors, it will still require a configuration file (e.g. /etc/sensors.conf) for proper values conversion, labeling of inputs and hiding of unused inputs.
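The user-space conversion mentioned above can be sketched for the simplest case. This assumes a plain two-resistor divider (r1 between the measured rail and the chip pin, r2 between the pin and ground); real boards follow the chip datasheet, and the function name and resistor values here are hypothetical.

```python
def rail_millivolts(pin_mv: float, r1_ohms: float, r2_ohms: float) -> float:
    """Undo a two-resistor divider: V_rail = V_pin * (r1 + r2) / r2."""
    if r2_ohms <= 0:
        raise ValueError("r2_ohms must be positive")
    return pin_mv * (r1_ohms + r2_ohms) / r2_ohms


if __name__ == "__main__":
    # A 12 V rail divided by 30 kOhm / 10 kOhm appears as 3000 mV at the pin.
    print(rail_millivolts(3000, 30_000, 10_000))  # 12000.0
```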

An alternative method that some programs use is to access the sysfs files directly. This document briefly describes the standards that the drivers follow, so that an application program can scan for entries and access this data in a simple and consistent way. That said, such programs will have to implement conversion, labeling and hiding of inputs. For this reason, it is still not recommended to bypass the library.

Each chip gets its own directory in the sysfs /sys/devices tree. To find all sensor chips, it is easier to follow the device symlinks from /sys/class/hwmon/hwmon*.
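A minimal sketch of scanning the hwmon class directory, assuming each entry exposes its chip name in a `name` file (the function name `list_hwmon_chips` is hypothetical, and the class directory is a parameter so the code can be tried against any directory tree):

```python
from pathlib import Path
from typing import Dict, Optional


def list_hwmon_chips(class_dir: str = "/sys/class/hwmon") -> Dict[str, Optional[str]]:
    """Map each hwmon* entry to the contents of its `name` file.

    Entries whose `name` file cannot be read map to None.
    """
    chips: Dict[str, Optional[str]] = {}
    root = Path(class_dir)
    if not root.is_dir():
        return chips
    for entry in sorted(root.glob("hwmon*")):
        try:
            chips[entry.name] = (entry / "name").read_text().strip()
        except OSError:
            chips[entry.name] = None
    return chips


if __name__ == "__main__":
    for hwmon, chip in list_hwmon_chips().items():
        print(hwmon, chip)
```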

Up to lm-sensors 3.0.0, libsensors looks for hardware monitoring attributes in the “physical” device directory. Since lm-sensors 3.0.1, attributes found in the hwmon “class” device directory are also supported. Complex drivers (e.g. drivers for multifunction chips) may want to use this possibility to avoid namespace pollution. The only drawback will be that older versions of libsensors won’t support the driver in question.

All sysfs values are fixed point numbers.

There is only one value per file, unlike the older /proc specification. The common scheme for file naming is: <type><number>_<item>. Usual types for sensor chips are “in” (voltage), “temp” (temperature) and “fan” (fan). Usual items are “input” (measured value), “max” (high threshold), “min” (low threshold). Numbering usually starts from 1, except for voltages which start from 0 (because most data sheets use this). A number is always used for elements that can be present more than once, even if there is a single element of the given type on the specific chip. Other files do not refer to a specific element, so they have a simple name, and no number.
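The naming scheme can be parsed with a small regular expression. This is a sketch under the assumption that the type is lowercase letters and the number is decimal; the names `Attribute` and `parse_attribute` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Attribute(NamedTuple):
    kind: str    # e.g. "in", "temp", "fan"
    number: int  # element index
    item: str    # e.g. "input", "max", "min"


_ATTRIBUTE_RE = re.compile(r"^([a-z]+)(\d+)_(\w+)$")


def parse_attribute(filename: str) -> Optional[Attribute]:
    """Split a <type><number>_<item> file name; return None for other names."""
    match = _ATTRIBUTE_RE.match(filename)
    if match is None:
        return None
    return Attribute(match.group(1), int(match.group(2)), match.group(3))


if __name__ == "__main__":
    print(parse_attribute("temp1_input"))  # Attribute(kind='temp', number=1, item='input')
    print(parse_attribute("name"))         # None
```

Files without a number, such as `name`, fall through as None, which matches the rule that such files do not refer to a specific element.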

Alarms are direct indications read from the chips. The drivers do NOT make comparisons of readings to thresholds. This allows violations between readings to be caught and alarmed. The exact definition of an alarm (for example, whether a threshold must be met or must be exceeded to cause an alarm) is chip-dependent.

When setting values of hwmon sysfs attributes, the string representation of the desired value must be written. Note that strings which are not a number are interpreted as 0! For more on how written strings are interpreted see the “sysfs attribute writes interpretation” section at the end of this file.
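Because non-numeric strings are interpreted as 0, a caller may want to validate a value before writing it. The sketch below is one conservative way to do that; the helper name `encode_sysfs_value` is hypothetical, and accepting only an optional minus sign followed by decimal digits is an assumption that is stricter than the kernel's own parsing.

```python
import re

_DECIMAL_RE = re.compile(r"-?[0-9]+")


def encode_sysfs_value(value) -> str:
    """Return the decimal string to write, or raise ValueError if not an integer."""
    text = str(value).strip()
    if not _DECIMAL_RE.fullmatch(text):
        raise ValueError(f"not a decimal integer: {value!r}")
    return text


if __name__ == "__main__":
    print(encode_sysfs_value(1500))  # 1500
    try:
        encode_sysfs_value("fast")
    except ValueError as err:
        print("rejected:", err)
```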


denotes any positive number starting from 0


denotes any positive number starting from 1


read only value


write only value


read/write value

Read/write values may be read-only for some chips, depending on the hardware implementation.

All entries (except name) are optional, and should only be created in a given driver if the chip has the feature.

Global attributes


The chip name. This should be a short, lowercase string, not containing whitespace,
dashes, or the wildcard character ‘*’. This attribute represents the chip name.
It is the only mandatory attribute. I2C devices get this attribute created automatically.


The interval at which the chip will update readings.
Unit: millisecond
Some devices have a variable update rate or interval.
This attribute can be used to change it to the desired value.



Voltage min value.
Unit: millivolt


Voltage critical min value.
Unit: millivolt
If voltage drops to or below this limit, the system may take drastic action such as power
down or reset. At the very least, it should report a fault.


Voltage max value.
Unit: millivolt


Voltage critical max value.
Unit: millivolt
If voltage reaches or exceeds this limit, the system may take drastic action such as power
down or reset. At the very least, it should report a fault.


Voltage input value.
Unit: millivolt
Voltage measured on the chip pin. Actual voltage depends on the scaling resistors on the
motherboard, as recommended in the chip datasheet. This varies by chip and by motherboard.
Because of this variation, values are generally NOT scaled by the chip driver, and scaling
must be done by the application. However, some drivers (notably lm87 and via686a) do scale,
because of internal resistors built into a chip. These drivers will output the actual voltage.
Rule of thumb: drivers should report the voltage values at the “pins” of the chip.


Average voltage
Unit: millivolt


Historical minimum voltage
Unit: millivolt


Historical maximum voltage
Unit: millivolt


Reset inX_lowest and inX_highest


Reset inX_lowest and inX_highest for all sensors


Suggested voltage channel label.
Text string. Should only be created if the driver has hints about what this voltage channel
is being used for, and user-space doesn’t. In all other cases, the label is provided by
user-space.


Enable or disable the sensors.
When disabled the sensor read will return -ENODATA.
1: Enable
0: Disable


CPU core reference voltage.
Unit: millivolt
Not always correct.


Voltage Regulator Module version number.
RW (but changing it should no longer be necessary)
Originally the VRM standard version multiplied by 10, but now an arbitrary number, as not
all standards have a version number. Affects the way the driver calculates the CPU core
reference voltage from the vid pins.

Also see the Alarms section for status flags associated with voltages.



Fan minimum value
Unit: revolution/min (RPM)


Fan maximum value
Unit: revolution/min (RPM)
Only rarely supported by the hardware.


Fan input value.
Unit: revolution/min (RPM)


Fan divisor.
Integer value in powers of two (1, 2, 4, 8, 16, 32, 64, 128).
Some chips only support values 1, 2, 4 and 8.
Note that this is actually an internal clock divisor, which
affects the measurable speed range, not the read value.


Number of tachometer pulses per fan revolution.
Integer value, typically between 1 and 4.
This value is a characteristic of the fan connected to the device’s input,
so it has to be set in accordance with the fan model. Should only be created
if the chip has a register to configure the number of pulses. In the absence
of such a register (and thus attribute) the value assumed by all devices is 2 pulses
per fan revolution.


Desired fan speed
Unit: revolution/min (RPM)
Only makes sense if the chip supports closed-loop fan speed
control based on the measured fan speed.


Suggested fan channel label.
Text string
Should only be created if the driver has hints about what this fan channel is being
used for, and user-space doesn’t. In all other cases, the label is provided by user-space.


Enable or disable the sensors
When disabled the sensor read will return -ENODATA
1: Enable
0: Disable

Also see the Alarms section for status flags associated with fans.



Pulse width modulation fan control.
Integer value in the range 0 to 255
255 is max or 100%.


Fan speed control method:
0: no fan speed control (i.e. fan at full speed)
1: manual fan speed control enabled (using pwm[1-*])
2+: automatic fan speed control enabled
Check individual chip documentation files for automatic mode details.


0: DC mode (direct current)
1: PWM mode (pulse-width modulation)


Base PWM frequency in Hz.
Only possibly available when pwmN_mode is PWM, but not always present even then.


Select which temperature channels affect this PWM output in auto mode. Bitfield,
1 is temp1, 2 is temp2, 4 is temp3 etc…
Which values are possible depends on the chip used.

Define the PWM vs temperature curve. Number of trip points is chip-dependent. Use this
for chips which associate trip points to PWM output channels.

Define the PWM vs temperature curve. Number of trip points is chip-dependent.
Use this for chips which associate trip points to temperature channels.

There is a third case where trip points are associated to both PWM output channels and temperature channels: the PWM values are associated to PWM output channels while the temperature values are associated to temperature channels. In that case, the result is determined by the mapping between temperature inputs and PWM outputs. When several temperature inputs are mapped to a given PWM output, this leads to several candidate PWM values. The actual result is up to the chip, but in general the highest candidate value (fastest fan speed) wins.
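Two ideas from this section lend themselves to a short sketch: decoding the temperature-channel bitfield (1 is temp1, 2 is temp2, 4 is temp3) and picking among several candidate PWM values. Both function names are hypothetical, and taking the maximum reflects only the general behaviour described above, since the actual result is up to the chip.

```python
from typing import Iterable, List


def channels_from_bitfield(bitfield: int) -> List[int]:
    """Return the temperature channel numbers selected by the bitfield.

    Bit 0 selects temp1, bit 1 selects temp2, and so on.
    """
    return [bit + 1 for bit in range(bitfield.bit_length())
            if bitfield & (1 << bit)]


def resolve_pwm(candidates: Iterable[int]) -> int:
    """Pick the highest candidate PWM value (fastest fan speed)."""
    return max(candidates)


if __name__ == "__main__":
    print(channels_from_bitfield(5))     # [1, 3]
    print(resolve_pwm([96, 180, 128]))   # 180
```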



Sensor type selection.
Integers 1 to 6
1: CPU embedded diode
2: 3904 transistor
3: thermal diode
4: thermistor
6: Intel PECI
Not all types are supported by all chips


Temperature max value.
Unit: millidegree Celsius (or millivolt, see below)


Temperature min value.
Unit: millidegree Celsius


Temperature hysteresis value for max limit.
Unit: millidegree Celsius
Must be reported as an absolute temperature, NOT a delta from the max value.


Temperature hysteresis value for min limit.
Unit: millidegree Celsius
Must be reported as an absolute temperature, NOT a delta from the min value.


Temperature input value.
Unit: millidegree Celsius


Temperature critical max value, typically greater than
corresponding temp_max values.
Unit: millidegree Celsius


Temperature hysteresis value for critical limit.
Unit: millidegree Celsius
Must be reported as an absolute temperature, NOT a delta from the critical value.


Temperature emergency max value, for chips supporting more than two upper
temperature limits. Must be equal or greater than corresponding temp_crit values.
Unit: millidegree Celsius


Temperature hysteresis value for emergency limit.
Unit: millidegree Celsius
Must be reported as an absolute temperature, NOT a delta from the emergency value.


Temperature critical min value, typically lower than corresponding temp_min values.
Unit: millidegree Celsius


Temperature hysteresis value for critical min limit.
Unit: millidegree Celsius
Must be reported as an absolute temperature, NOT a delta from the critical min value.


Temperature offset which is added to the temperature reading by the chip.
Unit: millidegree Celsius
Read/Write value.


Suggested temperature channel label.
Text string. Should only be created if the driver has hints about what this temperature
channel is being used for, and user-space doesn’t. In all other cases, the label is
provided by user-space.


Historical minimum temperature
Unit: millidegree Celsius


Historical maximum temperature
Unit: millidegree Celsius


Reset temp_lowest and temp_highest


Reset temp_lowest and temp_highest for all sensors


Enable or disable the sensors
When disabled the sensor read will return -ENODATA
1: Enable
0: Disable

Some chips measure temperature using external thermistors and an ADC, and report the temperature measurement as a voltage. Converting this voltage back to a temperature (or the other way around for limits) requires mathematical functions not available in the kernel, so the conversion must occur in user space. For these chips, all temp* files described above should contain values expressed in millivolt instead of millidegree Celsius. In other words, such temperature channels are handled as voltage channels by the driver.

Also see the Alarms section for status flags associated with temperatures.



Current max value
Unit: milliampere


Current min value.
Unit: milliampere


Current critical low value
Unit: milliampere


Current critical high value.
Unit: milliampere


Current input value
Unit: milliampere


Average current use
Unit: milliampere


Historical minimum current
Unit: milliampere


Historical maximum current
Unit: milliampere


Reset currX_lowest and currX_highest


Reset currX_lowest and currX_highest for all sensors


Enable or disable the sensors
When disabled the sensor read will return -ENODATA
1: Enable
0: Disable

Also see the Alarms section for status flags associated with currents.



Average power use
Unit: microWatt


Power use averaging interval. A poll notification is sent to this
file if the hardware changes the averaging interval.
Unit: milliseconds


Maximum power use averaging interval
Unit: milliseconds


Minimum power use averaging interval
Unit: milliseconds


Historical average maximum power use
Unit: microWatt


Historical average minimum power use
Unit: microWatt


A poll notification is sent to power[1-*]_average when power use
rises above this value.
Unit: microWatt


A poll notification is sent to power[1-*]_average when power use
sinks below this value.
Unit: microWatt


Instantaneous power use
Unit: microWatt


Historical maximum power use
Unit: microWatt


Historical minimum power use
Unit: microWatt


Reset input_highest, input_lowest,
average_highest and average_lowest.


Accuracy of the power meter.
Unit: Percent


If power use rises above this limit, the system should take action to
reduce power use. A poll notification is sent to this file if the cap is
changed by the hardware. The *_cap files only appear if the cap is known
to be enforced by hardware.
Unit: microWatt


Margin of hysteresis built around capping and notification.
Unit: microWatt


Maximum cap that can be set.
Unit: microWatt


Minimum cap that can be set.
Unit: microWatt


Maximum power.
Unit: microWatt


Critical maximum power.
If power rises to or above this limit, the system is expected to take drastic
action to reduce power consumption, such as a system shutdown or
a forced powerdown of some devices.
Unit: microWatt


Enable or disable the sensors.
When disabled, the sensor read will return -ENODATA.
1: Enable
0: Disable

Also see the Alarms section for status flags associated with power readings.
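
The relationship between an instantaneous power reading, the cap, and the critical limit can be sketched as below. The function names and the three-way classification are illustrative assumptions; the comparisons follow the wording above ("rises above" the cap, "to or above" the critical limit), and all values are in microWatt.

```python
def microwatts_to_watts(value_uw: int) -> float:
    """Convert a power value from microWatt to Watt."""
    return value_uw / 1_000_000


def classify_power(input_uw: int, cap_uw: int, crit_uw: int) -> str:
    """Classify a power reading against the cap and the critical limit.

    Hypothetical helper; the labels are not defined by the interface.
    """
    if input_uw >= crit_uw:
        return "critical"   # at or above the critical maximum
    if input_uw > cap_uw:
        return "over_cap"   # above the cap, below critical
    return "ok"
```

For example, with a cap of 200 W and a critical limit of 300 W, a reading of 250 W is classified as over the cap.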



Cumulative energy use
Unit: microJoule


Enable or disable the sensors
When disabled, the sensor read will return -ENODATA.
1: Enable
0: Disable
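
Because the energy attribute is cumulative, average power over an interval can be derived from two readings. The sketch below assumes both readings are in microJoule and that the counter did not wrap or reset between them; the function name is hypothetical.

```python
def average_power_watts(start_uj: int, end_uj: int, elapsed_s: float) -> float:
    """Average power in Watt between two cumulative energy readings.

    start_uj and end_uj are energy values in microJoule; elapsed_s is the
    time between the two reads in seconds.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    if end_uj < start_uj:
        raise ValueError("energy counter went backwards (wrap or reset?)")
    return (end_uj - start_uj) / 1_000_000 / elapsed_s
```

For instance, readings of 1,000,000 and 151,000,000 microJoule taken two seconds apart correspond to 150 J over 2 s, i.e. 75 W on average.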



Unit: milli-percent (per cent mille, pcm)


Enable or disable the sensors
When disabled, the sensor read will return -ENODATA.
1: Enable
0: Disable


Each channel or limit may have an associated alarm file, containing a boolean value. 1 means that an alarm condition exists, 0 means no alarm.

Usually a given chip will either use channel-related alarms, or limit-related alarms, not both. The driver should just reflect the hardware implementation.

Channel alarm
0: no alarm
1: alarm


Limit alarm
0: no alarm
1: alarm

Each input channel may have an associated fault file. This can be used to report open diodes, unconnected fans, etc., where the hardware supports it. When this boolean has value 1, the measurement for that channel should not be trusted.

Input fault condition
0: no fault occurred
1: fault condition
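
A minimal sketch of how a reader might interpret these boolean files: the function names are hypothetical, and the input is assumed to be the raw text of an alarm or fault file ("0" or "1", optionally followed by a newline).

```python
def parse_flag(raw: str) -> bool:
    """Parse the content of an alarm or fault file into a boolean."""
    value = raw.strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected flag value: {value!r}")
    return value == "1"


def reading_is_trustworthy(fault_raw: str) -> bool:
    """A measurement should not be trusted when its fault flag is 1."""
    return not parse_flag(fault_raw)
```

A monitoring script could check the fault flag first and skip or mark the corresponding measurement when it is set.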

Some chips also offer the possibility to get beeped when an alarm occurs:


Master beep enable
0: no beeps
1: beeps
Channel beep
0: disable
1: enable

In theory, a chip could provide per-limit beep masking, but no such chip has been seen so far.

Old drivers provided a different, non-standard interface to alarms and beeps. These interface files are deprecated, but will be kept around for compatibility reasons:


Alarm bitmask.
Integer representation of one to four bytes.
A ‘1’ bit means an alarm.
Chips should be programmed for ‘comparator’ mode so that
the alarm will ‘come back’ after you read the register
if it is still valid.
Generally a direct representation of a chip’s internal
alarm registers; there is no standard for the position
of individual bits. For this reason, the use of this
interface file for new drivers is discouraged. Use
individual *_alarm and *_fault files instead.
Bits are defined in kernel/include/sensors.h.


Bitmask for beep.
Same format as ‘alarms’ with the same bit locations,
use discouraged for the same reason. Use individual
*_beep files instead.
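
For legacy drivers that still expose the bitmask, the set bits can be listed as sketched below. This only decodes bit positions; what each bit means is chip-specific, as noted above, so the helper (whose name is hypothetical) deliberately assigns no meaning to them. It assumes the file holds a decimal integer of at most four bytes.

```python
from typing import List


def set_bits(raw: str) -> List[int]:
    """Return the positions of the bits set in an alarm or beep bitmask."""
    mask = int(raw.strip())
    # One to four bytes means at most 32 bit positions.
    return [bit for bit in range(32) if mask & (1 << bit)]
```

A value of 5, for example, has bits 0 and 2 set.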

Intrusion detection


Chassis intrusion detection
0: OK
1: intrusion detected
Contrary to regular alarm flags which clear themselves
automatically when read, this one sticks until cleared by
the user. This is done by writing 0 to the file. Writing
other values is unsupported.


Chassis intrusion beep
0: disable
1: enable

Average Sample Configuration

Devices that allow reading {in,power,curr,temp}_average values may export attributes for controlling the number of samples used to compute the average.

Application software needs to understand the properties of the underlying hardware to leverage the performance capabilities of the platform for feature utilization and task scheduling. The sysfs topology exposes this information in a loosely hierarchical order. The information is populated by the KFD driver, which gathers it from ACPI (CRAT) and the AMDGPU base driver.

The sysfs topology is arranged hierarchically as follows. The root directory of the topology is:

Depending on the platform, this directory contains one sub-directory for each HSA Agent. A system with N HSA Agents will have N directories as shown below.


HSA Agent Information

The HSA Agent directory and its sub-directories contain all the information about that agent. The main items are described below.

Node Information

This is available in the root directory of the HSA agent. It provides information about the compute capabilities of the agent, including the number of cores or compute units, the SIMD count, and the clock speed.


Information about the memory banks attached to this agent is populated in the “mem_banks” subdirectory: /sys/devices/virtual/kfd/kfd/topology/nodes/N/mem_banks


Information about the caches available to this agent is populated in the “cache” subdirectory: /sys/devices/virtual/kfd/kfd/topology/nodes/N/cache

How to use topology information

The information provided in sysfs should not be used directly by application software. Application software should always use the Thunk library API (libhsakmt) to access topology information. Please refer to the Thunk API for more information.

The data are associated with a node ID, forming a per-node element list which references the elements contained at relative offsets within that list. A node is associated with a kernel agent or an agent. Node IDs should be 0-based, with the “0” ID representing the primary elements of the system (e.g., “boot cores”, memory) if applicable. The enumeration order and, if applicable, the values of the ID should match other information reported through mechanisms outside of the scope of the requirements.

For example, the data and enumeration order contained in the ACPI SRAT table on some systems should match the memory order and properties reported through HSA. Further detail is out of the scope of the System Architecture and outlined in the Runtime API specification.


Each of these nodes is interconnected with other nodes in more advanced systems to the level necessary to adequately describe the topology.


Where applicable, the node grouping of physical memory follows NUMA principles to leverage memory locality in software when multiple physical memory blocks are available in the system and agents have a different “access cost” (e.g., bandwidth/latency) to that memory.

KFD Topology structure for AMDGPU :


ROCm GPU/GCD Isolation


It is possible to rearrange or isolate the collection of ROCm GPU/GCD devices that are available on a ROCm platform. This can be achieved at the start of an application by way of the ROCR_VISIBLE_DEVICES environment variable.

Devices to be made visible to an application should be specified as a comma-separated list of enumerable devices. For example, to use devices 0 and 2 from a ROCm platform with four devices, set ROCR_VISIBLE_DEVICES=0,2 before launching the application. The application will then enumerate these devices as device 0 and device 1, respectively.

This can be used by cooperating applications to allocate GPUs/GCDs among themselves.
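
The renumbering described above can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and the variable is assumed to hold a comma-separated list of integer device indices, as in the example above.

```python
import os
from typing import Dict, List


def visible_device_map(value: str) -> Dict[int, int]:
    """Map the index seen by the application to the platform device index."""
    platform_ids = [int(tok) for tok in value.split(",") if tok.strip()]
    return dict(enumerate(platform_ids))


def build_env(devices: List[int]) -> Dict[str, str]:
    """Return a copy of the current environment with ROCR_VISIBLE_DEVICES set.

    The result can be passed as env= to subprocess.run() when launching
    an application that should only see the listed devices.
    """
    env = dict(os.environ)
    env["ROCR_VISIBLE_DEVICES"] = ",".join(str(d) for d in devices)
    return env
```

With ROCR_VISIBLE_DEVICES=0,2, the map is {0: 0, 1: 2}: the application's device 1 is the platform's device 2.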