Using thinkfan for fan control on Lenovo (Linux) laptops
A problem that affects high-performance laptop computers is that they tend to either run hot, or generate a lot of fan noise, or both. I have several Lenovo ThinkPad laptops, all with H-class Intel CPUs, all rated to run continuously at 65C. Some of these have discrete GPUs, with similar temperature ratings.
The default behaviour of the cooling system on my P53 is not to turn the fans on at all until the CPU/GPU temperatures reach about 60C, and then turn both the fans on at high speed. This is probably safe, given the ratings of the components, but I’d prefer to have a bit more fan noise, and not burn my fingers on the keyboard.
So I’ve started experimenting with the thinkfan utility as a way to get some control over the fan curve, that is, the relationship between temperatures and fan speed.
Note:
You can probably break a computer if you fiddle with its cooling behaviour without proper care and attention. It’s something to be done cautiously, with careful observation.
thinkfan isn’t the only utility for temperature control on Linux, and it’s probably not limited to ThinkPads. It’s reasonably well documented but, as with all Linux documentation, it’s written for people who don’t need it. thinkfan is far from a zero-configuration utility, and setting it up for particular hardware can be rather fiddly.
In this article I explain how temperature monitoring and fan control work in Linux, and then describe how I configure thinkfan, using my Lenovo P53 as an example. The P53 is a big, heavy, industrial-grade laptop that can generate an awful lot of heat.
Temperature monitoring in modern Linux systems
Temperature monitoring is part of the ‘hwmon’ subsystem, which the kernel exposes in the pseudo-directory /sys/class/hwmon. Various kernel modules, like ‘coretemp’, create pseudo-files in that directory, indicating temperature and many other things.
If you look in the hwmon directory, you’ll see it contains only subdirectories, named hwmon0, hwmon1, etc. These subdirectories are, in fact, links to other directories, usually under /sys/devices.
Unfortunately, the numbering of the directories in /sys/class/hwmon isn’t guaranteed to be stable, which can be a problem. In practice, I’ve found that it’s reasonably stable on the same hardware, but I don’t think you should really rely on this if you don’t have to – and most of the time you don’t (more below).
All the temperature sensors have pseudo-files with names of the form tempNN_input, where NN is a number that starts at 1 (not 0).
To see them all, try this:
$ find -H /sys/class/hwmon/* -name 'temp*_input'
You’ll need the -H here because the directories are symbolic links. On my Lenovo P53 I see 23 temperature sensors. Do I know what they all are? Nope – not even close. That’s also a problem.
I have some idea what they are, because some tempNN_input files are accompanied by a label. So, for example:
$ cat /sys/class/hwmon/hwmon10/temp1_label
Package id 0
This is only partially useful, because I don’t know what specific hwmon driver corresponds to hwmon10. There are two ways I might find out.
$ ls -l /sys/class/hwmon/ | grep hwmon10
lrwxrwxrwx 1 root root 0 Aug 25 19:24 hwmon10 -> ../../devices/platform/coretemp.0/hwmon/hwmon10
You’ll see coretemp in the directory name. Another method is to look at the name file in the hwmon directory:
$ cat /sys/class/hwmon/hwmon10/name
coretemp
In both cases I get coretemp as the name. It turns out that coretemp is a driver that measures temperatures on the individual cores of an Intel-like CPU. On an ARM device you might have cpu-thermal instead.
To see all the various driver names, and the directories in hwmon they correspond to (on this current boot), try this:
$ find -H /sys/class/hwmon/* -name name -printf "%p " -exec cat {} \;
/sys/class/hwmon/hwmon0/name acpitz
/sys/class/hwmon/hwmon1/name BAT0
...
In short, each subdirectory of /sys/class/hwmon has a file name whose contents indicate what its driver is, and a bunch of tempNN_input files that indicate temperatures associated with that driver. If we’re lucky, each tempNN_input file has a tempNN_label file that says what that sensor is.
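The layout just described can be walked with a few lines of shell. This is only a sketch: the function name list_sensors and its optional directory argument are my own inventions for illustration, and it assumes the usual hwmon convention that tempNN_input holds millidegrees Celsius.

```shell
# List every hwmon temperature sensor with its driver name, label and reading.
# The optional argument (default /sys/class/hwmon) is hypothetical; it only
# exists so the function can be pointed at a different directory tree.
list_sensors() {
    local base="${1:-/sys/class/hwmon}"
    local dir f driver label raw
    for dir in "$base"/hwmon*; do
        [ -r "$dir/name" ] || continue
        driver=$(cat "$dir/name")
        for f in "$dir"/temp*_input; do
            [ -e "$f" ] || continue          # this driver has no temperatures
            # Some sensors refuse to be read; skip them rather than fail.
            raw=$(cat "$f" 2>/dev/null) || continue
            [ -n "$raw" ] || continue
            label="(no label)"
            if [ -r "${f%_input}_label" ]; then
                label=$(cat "${f%_input}_label")
            fi
            # Convert millidegrees to whole degrees for display.
            printf '%s %s %s: %dC\n' "$driver" "${f##*/}" "$label" "$((raw / 1000))"
        done
    done
}
```

Calling list_sensors with no arguments reads the real /sys/class/hwmon tree and prints one line per sensor.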
So far, so good; but how do we know which are useful drivers and temperature metrics to use for fan control? The simple answer is: we don’t. The coretemp module is almost always going to be useful, because CPU temperature is so significant. But what about disk temperatures? Or GPU temperatures?
There’s really no systematic solution here, apart from a heap of web searching. NVME disks, for example, are monitored by the nvme driver. On Lenovo ThinkPads, the driver thinkpad provides a temperature indicator whose label is GPU, which does exactly what the name suggests. If you have an NVidia GPU, it’s possible to get information directly from the proprietary NVidia kernel module, if you’re using it.
The thinkpad driver on my P53 exposes a bunch of temperature metrics with no labels. What do they do? I don’t know. Some websites provide information that might be relevant to some Lenovo models. In general, enthusiasts have worked out what the sensors do on some models by trial-and-error, using tactics like directing a cooling spray at various components, while watching the temperature metrics.
What about the temperature of the wifi adapter? On my P53, this information comes from the iwlwifi_1 hwmon driver but, again, this could well be different on another system.
The sensors utility will dump a heap of useful information from hwmon in a human-readable format; I found this to be a useful starting point when figuring out what sensors to monitor. In the end, CPU temperature and GPU temperature (if your machine has a specific GPU) are likely to be the significant metrics; I’m also concerned about my NVME drives, which do tend to get quite warm under load.
There are (at least) a couple of other things to consider, when choosing which sensors to use for fan control.
First, not all sensors (not all tempNN_input files) correspond to dynamic values. Some represent other metrics of interest to the manufacturer. It’s important to realize that, just because an hwmon driver can read a sensor, that doesn’t mean its value can, or should, be interpreted.
A good example of this is the long-term temperature metrics stored by certain storage drives. Some sensors indicate average temperatures, or lifetime maximum temperatures. While these are important metrics for assessing the health of the drive, they’re useless for fan control.
Second, there’s no point using a metric for fan control if the fan speed makes little or no difference to it. If the relevant component isn’t in the airflow path of the fan, then the fan will have little effect on its temperature. For example, in my P53 the temperature of the wifi adapter only falls by a few degrees when running the fan at full speed for an extended duration. Using the wifi temperature sensor for fan control more-or-less guarantees that the fans will run all the time, even when the CPU/GPU temperatures are in the thirties, and the wifi adapter won’t run significantly cooler as a result.
In short, it’s worth monitoring all the temperatures, and seeing which ones change if you force the fan to run at full speed. If it changes only a little, question whether it’s any value for fan control. If a metric doesn’t change even when you cool the computer with a powerful external fan, then most likely it isn’t an instantaneous temperature measurement at all.
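One way to run that experiment is to snapshot every sensor, force the fan to full speed for a few minutes, snapshot again, and compare. The sketch below does the snapshot and comparison; the function names and the optional directory argument are hypothetical, and it assumes readings are in millidegrees Celsius.

```shell
# Print "path millidegrees" for every readable temperature sensor.
snapshot_temps() {
    local base="${1:-/sys/class/hwmon}" f raw
    for f in "$base"/hwmon*/temp*_input; do
        [ -e "$f" ] || continue
        raw=$(cat "$f" 2>/dev/null) || continue
        if [ -n "$raw" ]; then
            printf '%s %s\n' "$f" "$raw"
        fi
    done
}

# Compare two snapshot files and print each sensor's change in whole degrees.
compare_snapshots() {
    awk 'NR == FNR { before[$1] = $2; next }
         $1 in before { printf "%s %+d\n", $1, ($2 - before[$1]) / 1000 }' "$1" "$2"
}

# Typical use (the file names are arbitrary):
#   snapshot_temps > before.txt
#   ...run the fan at full speed for a few minutes...
#   snapshot_temps > after.txt
#   compare_snapshots before.txt after.txt
```

Sensors that barely move between the two snapshots are poor candidates for fan control.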
Fan control on modern Linux systems
Linux (at least on Intel hardware) has essentially two methods for controlling built-in fans. The more modern method uses pseudo-files in the hwmon subsystem, which you can find using essentially the same method as for temperature sensors. You’re looking for files called pwmNN, where NN is a number starting at 1. There are some subtleties here but, essentially, you set the fan speed by writing a number 0-255 to the relevant file.
The other method is to use the older interface /proc/acpi/ibm/fan. If you cat this file, you’ll get the current status of the fan, and brief instructions on what ‘commands’ to send, that is, what to write to the file, to control the fan. To set the fan speed, you’ll need to write level X to /proc/acpi/ibm/fan. X isn’t necessarily a number.
On my P53, the supported levels are 0-7, auto, disengaged, and full-speed. auto effectively puts control back in the hands of the machine’s firmware. On my machine level 7 and full-speed have the same effect. disengaged is subtly different – this disables the feedback system that regulates the fan speed. The result may (on some systems) be faster than full-speed, with the disadvantage that you could lose the ability to monitor the fan speed. Unless you’re in a very noisy environment, however, you’ll probably know that the fans are spinning at full speed, just because of the turbine roar they make.
On ThinkPad machines, fan control is done via the thinkpad_acpi module. You’ll need to load this module with the option fan_control=1 to enable non-default fan speeds.
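For manual experiments, writing a level by hand looks something like the sketch below. The helper name set_fan_level and its optional file argument are made up for illustration, and the list of accepted levels is simply the one my P53 reports; check the output of cat /proc/acpi/ibm/fan on your own machine before relying on it.

```shell
# Write a fan level to the ThinkPad fan interface, rejecting anything that is
# not one of the levels listed for the P53. Other models may differ.
set_fan_level() {
    local level="$1" file="${2:-/proc/acpi/ibm/fan}"
    case "$level" in
        [0-7]|auto|disengaged|full-speed) ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    # Writing to the real file needs root, and thinkpad_acpi loaded with
    # fan_control=1, for example via a line
    #   options thinkpad_acpi fan_control=1
    # in a file under /etc/modprobe.d/.
    echo "level $level" > "$file"
}
```

After any manual test, set_fan_level auto hands control back to the firmware.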
About thinkfan
thinkfan is a simple utility for controlling fan speed, originally on ThinkPad devices, but probably these days on others. It was originally written by Victor Mataré. It’s been around for more than ten years, and it’s still kind-of maintained, although the peak of activity was back in 2020. There are 36 open bugs on the project’s GitHub page, but that’s probably not a huge number, compared to the number of people using it. It’s in most of the standard Linux repositories.
The purpose of thinkfan is to modify the default fan/temperature response, that is, to change the fan speed for particular temperatures. It’s not clear to me whether careless use, or misbehaviour, of thinkfan could actually break a specific machine. I would imagine that thermal CPU/GPU throttling would act to reduce the amount of heat generated if the fan speed were inadequate. However, it’s not something to be complacent about, and I’ve monitored my laptops’ temperatures quite carefully when they’re working hard, before deciding that my fan control settings are adequate.
thinkfan won’t do anything useful without a YAML configuration file. Personally, I loathe YAML, because layout is part of the syntax. Still, it is what it is. You’ll need to specify what sensors to read, what method of fan control to use, and what mapping to make between the temperature and fan speed. thinkfan supports hwmon interfaces and also /proc/acpi/ibm/fan. It has a number of other capabilities, too, which I haven’t used, and can’t comment on.
thinkfan has a ‘simple’ mode of operation, and a ‘detailed’ mode. In simple mode, the temperature used for fan control is just the highest value from among all the sensors listed in the configuration. In ‘detailed’ mode, each sensor is given its own temperature range, and the fan speed is determined from the sensor that reads the highest value within its own range. You probably need ‘detailed’ mode if you have components that run at very different temperatures. My CPU and GPU can stand temperatures of above 90C, for example, but a magnetic hard drive would struggle at temperatures above 60C.
An alternative to using ‘detailed’ mode is to apply a fixed temperature correction to specific sensors. You could, for example, add 20C to the temperature of a magnetic disk, and then assume that the temperature needed to give full fan speed is 80C, not 60C, as might be the case for the CPU.
I haven’t experimented much with any of these more sophisticated modes of operation, as I have no particular reason to think that the components in my laptops have different safe temperature ranges.
thinkfan sensor configuration
For the record, this is the configuration I’m using on my P53:
sensors:
  - hwmon: /sys/class/hwmon
    name: coretemp
    indices: [1, 2, 3, 4, 5, 6, 7] # CPU package and cores
  - hwmon: /sys/class/hwmon
    name: thinkpad
    indices: [1, 2] # CPU aggregate and GPU
  - hwmon: /sys/devices/pci0000:00/0000:00:1b.0/0000:02:00.0/nvme/nvme0/hwmon2/temp1_input # SSD1
  - hwmon: /sys/devices/pci0000:00/0000:00:1d.0/0000:55:00.0/nvme/nvme1/hwmon3/temp1_input # SSD2
You’ll see that it’s possible to specify hwmon sensors in different ways. The more elegant is to use the name and indices format, with a specific base directory. With this format, thinkfan will enumerate the directories under the base directory, looking for a name file that matches the supplied name. indices refers to the specific tempXX_input file and, again, values start at 1, not 0. You’ll need to inspect the individual tempXX_label files to find the sensor indices to include. The advantage of this approach is that it won’t break if the entries in /sys/class/hwmon get renumbered.
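Inspecting the label files by hand gets tedious, so a small helper can print the index and label of every labelled sensor for one driver name. This is a sketch of the same lookup by name, not thinkfan’s own code; the function name sensor_indices and its optional base-directory argument are hypothetical.

```shell
# Print "index label" for each labelled temperature sensor of one driver.
sensor_indices() {
    local name="$1" base="${2:-/sys/class/hwmon}" dir f idx
    for dir in "$base"/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "$name" ] || continue
        for f in "$dir"/temp*_label; do
            [ -e "$f" ] || continue
            idx=${f##*/temp}      # drop the directory and the "temp" prefix
            idx=${idx%_label}     # drop the "_label" suffix, leaving the number
            printf '%s %s\n' "$idx" "$(cat "$f")"
        done
    done
}
```

Running sensor_indices coretemp shows which numbers belong in the indices list for that driver. Sensors without a label file are not shown, so unlabelled ones still need to be identified by other means.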
In my case, temp1_input in the coretemp driver is the aggregate package temperature, labeled Package id 0. In my tests, this temperature invariably reads a bit higher than the other cores, which are temp2_input, etc. I’m not entirely sure, therefore, whether it’s necessary to record the additional core temperatures, along with the package temperature. I’ve tried both ways, and it doesn’t make a huge difference.
I’m also using the CPU and GPU sensors from the thinkpad driver. The CPU temperature is always a degree or two lower than the Package id 0 value so, again, I don’t really know whether this metric needs to be included. I’d certainly want to include the GPU temperature, though.
In my P53, recording the drive temperature is somewhat fiddly. There are two NVME drives, and one SATA. I’m not worried about the SATA drive, since it doesn’t get much use, and it never seems to get warm. It doesn’t even have ventilation slots in the chassis, so there’s probably little point increasing the fan speed even if it does warm up. The NVME drives, however, do have ventilation, and they do tend to get warm under load.
The problem with having two NVME drives is that they’re handled by two different instances of the nvme driver – but both are just called nvme. It would be nice to be able to specify:
  - hwmon: /sys/class/hwmon
    name: nvme
    indices: [1]
and have thinkfan read the temperatures for both drives. Sadly, that doesn’t work: it just reports duplicate drivers.
So I’ve had to use the full path to the corresponding temp1_input files for each NVME drive. So far the pathnames have remained consistent across reboots but, as I said, this is perhaps not something to rely on.
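To find those full paths, or to check them again if they ever change, you can resolve the symbolic links for every hwmon directory with a given driver name. The helper below is a sketch under that assumption; sensor_paths and its optional base-directory argument are hypothetical names, and it relies on readlink -f as provided by GNU coreutils.

```shell
# Print the resolved path of temp1_input for every hwmon directory whose
# name file matches the given driver name (there may be more than one).
sensor_paths() {
    local name="$1" base="${2:-/sys/class/hwmon}" dir
    for dir in "$base"/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "$name" ] || continue
        if [ -e "$dir/temp1_input" ]; then
            readlink -f "$dir/temp1_input"
        fi
    done
}
```

Running sensor_paths nvme prints one path per drive, in a form that can be pasted into the hwmon entries of the configuration.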
I’m not using any of the other temperature metrics, even those I can interpret. The ones I’m using are the ones that seem to increase under load. I’ve had to figure out what these are by extensive monitoring, along with trial-and-error. In short, while the thinkfan configuration is tricky, mastering it alone doesn’t mean the job’s done.
thinkfan fan configuration
You can use the same configuration syntax as for sensors but, since my P53 has the old /proc/acpi/ibm/fan interface, it’s easier just to use that. So the entire fan configuration is this:
fans:
  - tpacpi: /proc/acpi/ibm/fan
thinkfan fan curve
We have a way to read the temperature, and a way to set the fan speed. Now we need a way to link these two metrics together. There’s plenty of scope for creativity here, and I’m still experimenting, to see what gets the best results. My feeling is that, so long as you assign maximum fan speed to any temperature that is likely to be high enough to lead to severe throttling, you have a reasonably safe configuration. The creative part lies in deciding what to do with temperatures lower than this.
In my case, I want the fan to remain off completely at temperatures below 45C, then increase speed rapidly beyond that point. With the other settings I have, I know that all the routine stuff I do – editing text, looking at websites – doesn’t increase the temperature more than this, and I want the fan to be mostly off under light load. At greater than light loads, I want the fan to be on, and working hard: I feel this will increase the lifespan of the hardware, as well as making the machine more comfortable to use. If you prefer heat to fan noise, however, you’ll probably prefer a different configuration to mine.
This is the configuration I currently have:
levels:
  - [0, 0, 45] # Fan off at less than 45C
  - ["level 1", 42, 50] # Gentle
  - ["level 3", 48, 60] # Moderate
  - ["level 5", 55, 70] # High
  - ["level 6", 68, 78] # Higher
  - ["level 7", 75, 255] # Max speed
Note that the temperature bands overlap: if they don’t, you’ll have the fans rapidly switching on and off when the temperature is near the band edges.
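To see why the overlap matters, here is a small simulation of band switching with the limits from my configuration. It is a sketch of the general idea (step up when the temperature reaches the current band’s upper limit, step down when it falls below the lower limit), not a copy of thinkfan’s actual logic; the names LOWER, UPPER, field and next_level are all invented for the example.

```shell
# Lower and upper limits of the six bands, in order, copied from the levels
# in my configuration. Bands are numbered 0 to 5.
LOWER="0 42 48 55 68 75"
UPPER="45 50 60 70 78 255"
LAST=5

# Print item number $1 (counting from 0) of the remaining arguments.
field() { shift "$(( $1 + 1 ))"; echo "$1"; }

# Given the current band number and a temperature, print the new band number.
next_level() {
    local idx="$1" temp="$2"
    # Too hot for the current band: step up until the temperature fits.
    while [ "$idx" -lt "$LAST" ] && [ "$temp" -ge "$(field "$idx" $UPPER)" ]; do
        idx=$((idx + 1))
    done
    # Cool enough to leave the current band: step down.
    while [ "$idx" -gt 0 ] && [ "$temp" -lt "$(field "$idx" $LOWER)" ]; do
        idx=$((idx - 1))
    done
    echo "$idx"
}
```

Starting in band 0 and feeding in 44, 46, 44 and 41 in turn gives bands 0, 1, 1 and 0: once the fan has started at 45C it keeps running until the temperature drops below 42C, rather than toggling every time the reading crosses 45C.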
Dual fan issues
The P53 has separate fans for the CPU and GPU. In principle, they can be controlled individually, but the thinkpad driver doesn’t expose a way to do this. Even if it did, thinkfan doesn’t support multiple fans.
In practice, setting the fan speed via /proc/acpi/ibm/fan sets the speeds of both fans. They don’t run at exactly the same speed, whatever fan level you set, but they’re similar.
Closing remarks
thinkfan isn’t perfect. Sometimes it doesn’t start at boot time, and sometimes it crashes when waking up from sleep. The configuration is fiddly and, even though I understand it, I still needed a lot of trial-and-error to get the behaviour I wanted.
Still, it seems to do what it should, and it makes my laptops more agreeable to use.