Testing the Intel TCO watchdog using Ubuntu live

This page written on

Context

What is this about? This page provides instructions on how to use an Ubuntu live system, without actually installing Linux, to test the TCO watchdog feature (see below) included on Intel chipsets and check whether it works correctly. The point is to detect those motherboards whose faulty hardware or buggy BIOS makes the watchdog dysfunctional. You will need: one desktop computer with an Intel chipset and one USB thumb drive or optical medium. No hard disk drives are required. We generally assume that the computer boots following the UEFI protocol (although indications will be given on how to perform the test with a traditional BIOS).

Background: A hardware watchdog is a feature included on many computers, whose purpose is to reboot the computer automatically in case the system hangs: once the watchdog is activated, it must receive a ping at regular intervals from the system, and, if it does not, it will cause a hardware reset. All, or nearly all, desktop and server Intel chipsets since at least the 2000s have such a hardware watchdog, which, in Intel parlance, is known as the TCO watchdog, where TCO stands for Total Cost of Ownership (whatever that means). • On some motherboards, however, the watchdog fails to reset the system properly and leaves the motherboard stuck during POST: this could be caused by faulty hardware or a buggy BIOS. The point of this page is to provide simple instructions to test whether the watchdog works. • Under Linux, the TCO watchdog driver is known as iTCO-wdt (it creates a /dev/watchdog device): we will use it to perform the test. (The instructions on this page do not install Linux: it is only used as a “live” system.)

Please write to me if you perform this test: I am greatly interested in knowing which motherboards work and which don't. Please report your motherboard manufacturer and model, your BIOS vendor and version (if known), whether booting through UEFI, and whether using a graphics card or CPU-integrated graphics chipset (and what other extension cards may be present). I am particularly interested in test cases where at least one of the following conditions holds, and even more so if several do: (i) the BIOS is written by AMI (American Megatrends), (ii) the chipset is of the Intel 100 series or C230 series (a.k.a. Sunrise Point, used for Skylake processors with an LGA1151 socket), or (iii) the system is booting under UEFI (as opposed to legacy BIOS).
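Once the live system is running (step (4) below), most of the information requested above can be read from Linux itself. The following is only a sketch: the function names dmi_report and boot_mode are made up for illustration, and it assumes the kernel exposes the firmware's DMI strings under /sys/class/dmi/id and creates /sys/firmware/efi only on a UEFI boot, which is the usual behavior on current kernels. Some boards leave these fields empty or filled with placeholders.

```shell
# dmi_report [DIR]: print board and BIOS identification strings.
# DIR defaults to /sys/class/dmi/id (assumed location of the DMI data).
dmi_report() {
    dir=${1:-/sys/class/dmi/id}
    for f in board_vendor board_name bios_vendor bios_version; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        else
            printf '%s: (unavailable)\n' "$f"
        fi
    done
}

# boot_mode [DIR]: print "UEFI" if DIR exists, "legacy BIOS" otherwise.
# DIR defaults to /sys/firmware/efi, which exists only after a UEFI boot.
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS"
    fi
}

# Usage on the live system:
#   dmi_report
#   boot_mode
```

Run it before step (6), since the machine resets at the end of the test.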

Cross-references: LKML, Reddit, LKML.

The instructions

(1) Download Ubuntu desktop from here (then click on the Download link). I will be assuming version 16.04.1 LTS desktop; get the amd64 architecture (the default). Note that Ubuntu is free: you do not need to pay to download it (although you are encouraged to contribute if you decide to actually use it), and there is a “Not now, take me to the download” link on the download page.

If you wish to check the integrity of the file ubuntu-16.04.1-desktop-amd64.iso, it should have the following cryptographic hashes: MD5=17643c29e3c4609818f26becf76d29a3, SHA-1=805337c2c3a00ac9b4a59a5c9692903ad30fe3ce and SHA-256=dc7dee086faabc9553d5ff8ff1b490a7f85c379f49de20c076f11fb6ac7c0f34 (size=1513308160).
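If the download was made on a Linux machine, the SHA-256 value can be compared mechanically rather than by eye. This is a minimal sketch, assuming the coreutils sha256sum tool is installed; the helper name verify_sha256 is hypothetical, and the expected value is simply the one quoted above.

```shell
# Expected SHA-256 of ubuntu-16.04.1-desktop-amd64.iso, as given in the text.
EXPECTED_SHA256=dc7dee086faabc9553d5ff8ff1b490a7f85c379f49de20c076f11fb6ac7c0f34

# verify_sha256 FILE EXPECTED
# Prints "OK" (status 0) if FILE hashes to EXPECTED,
# otherwise prints "MISMATCH" (status 1).
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    if [ "$actual" = "$2" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Usage, from the directory holding the download:
#   verify_sha256 ubuntu-16.04.1-desktop-amd64.iso "$EXPECTED_SHA256"
```

A mismatch usually means a truncated or corrupted download, in which case the file should be fetched again before going further.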

(2) Copy the .iso image downloaded in step (1) to a DVD or USB stick. If using a DVD, you need to burn the ISO disc image to the DVD: see here for further instructions on how to do this. If using a USB stick (formatted as VFAT), mount the ISO image so as to access the files inside (these instructions might be of use to Windows users) and simply copy all these files from the ISO image to the thumb drive (so the drive will contain files called README.diskdefines and md5sum.txt and directories called .disk and EFI and boot and casper and so on).

Note: In the case of a USB thumb drive, copying the files is enough when booting through UEFI. In the case of a legacy BIOS boot, it is not enough: the drive must additionally be made bootable. See here for full instructions on how to create a USB bootable stick on Windows. (Instructions from Linux itself: assuming the USB stick has a single partition called /dev/sdx1, rename the isolinux directory on the drive as syslinux and rename the isolinux.cfg file it contains as syslinux.cfg then unmount it and run syslinux -i -d syslinux /dev/sdx1 then dd if=/usr/lib/SYSLINUX/mbr.bin of=/dev/sdx and lastly fdisk /dev/sdx and make partition 1 bootable using the a command.) Again, this is all required only for legacy boot: for UEFI, simply copy the files to the drive.
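The Linux steps in the parenthesis above are easier to follow when laid out one per line. The sketch below only prints them, so that they can be reviewed before anything touches a disk. The function name legacy_boot_commands and the mount point /mnt/usb are illustrative assumptions; the commands themselves are the ones given in the note, and the partition is assumed to be named by appending 1 to the disk name, as with /dev/sdx1.

```shell
# legacy_boot_commands DISK MOUNTPOINT
# Print (do NOT run) the legacy-boot preparation commands for a USB stick
# whose single partition ${DISK}1 is currently mounted on MOUNTPOINT.
legacy_boot_commands() {
    disk=$1
    mnt=$2
    cat <<EOF
mv $mnt/isolinux $mnt/syslinux
mv $mnt/syslinux/isolinux.cfg $mnt/syslinux/syslinux.cfg
umount $mnt
syslinux -i -d syslinux ${disk}1
dd if=/usr/lib/SYSLINUX/mbr.bin of=$disk
fdisk $disk
EOF
}

# Usage (replace /dev/sdx by the real device, checked with great care):
#   legacy_boot_commands /dev/sdx /mnt/usb
# The final fdisk is interactive: use the a command to make partition 1
# bootable, then w to write the change.
```

Printing rather than executing is deliberate: dd and fdisk overwrite whatever device they are given, so the device name deserves a second look first.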

[Screenshot]

(3) Boot from the DVD or USB thumb drive created in step (2). (Use F8 at boot to choose which device to boot from; the key may differ depending on the motherboard.) If booting through UEFI, the screen should appear as shown on the right. Press the e key (while the menu item Try Ubuntu without installing is selected);

[Screenshot]

the screen will then appear as shown on the left. Use the arrow keys to move the cursor before quiet splash and type the following extra argument: acpi_enforce_resources=lax (leave at least one space on each side of the argument, but no spaces around the = sign). See the next screenshot.

[Screenshot]

Now press F10 to boot.

Note: If not booting through UEFI, the boot screen will be different (Syslinux instead of Grub). Press the control key to select boot options, then press enter to select English, then press F6 for Other Options, press escape to dismiss the popup, use the arrow keys to edit the boot options line, and add acpi_enforce_resources=lax (surrounded by spaces) before quiet splash, and press enter to boot.

Technical note: The acpi_enforce_resources=lax kernel parameter must be passed because, on many systems, ACPI reserves the region used by the i2c driver: see here for a discussion.
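A mistyped parameter is silently ignored by the kernel, so it can be worth confirming that it was really passed. Once a terminal is open (step (4)), the running kernel's command line can be read from /proc/cmdline. The helper below is a sketch with a made-up name, check_lax_param; it looks for the exact word, so a missing space (as in laxquiet) counts as absent.

```shell
# check_lax_param: read a kernel command line on standard input and print
# "present" if it contains the exact word acpi_enforce_resources=lax,
# "missing" otherwise.
check_lax_param() {
    if tr ' ' '\n' | grep -qx 'acpi_enforce_resources=lax'; then
        echo "present"
    else
        echo "missing"
    fi
}

# Usage on the live system:
#   check_lax_param < /proc/cmdline
```

If the answer is missing, reboot and repeat step (3) rather than continuing.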

[Screenshot]

(4) Linux will now boot. After a certain time, the screen should look as shown on the right. Click on the Ubuntu logo at the top left,

[Screenshot]

and type terminal in the Search your computer box, and press enter. This should open a terminal as shown next:

[Screenshot]

(the prompt is ubuntu@ubuntu:~$ and the cursor is a square). Now type sudo su to become administrator and the prompt should change to root@ubuntu:/home/ubuntu# (this is the administrator, or “root” prompt).

[Screenshot]

(5) Now type the following commands at the administrator prompt:

modprobe i2c-i801
modprobe i2c-smbus
modprobe iTCO-wdt

(note that capitalization matters: it's iTCO-wdt and not itco-wdt; on the other hand, the minus sign can be replaced by an underscore in any or all of these lines if it's somehow easier to type). The first two load the I²C bus drivers and the third loads the watchdog driver itself. None of these commands will produce any output.
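Since modprobe prints nothing on success, one way to confirm that the three modules are loaded is to look for them in the output of lsmod. The sketch below uses a hypothetical helper, missing_modules, and assumes the three drivers are built as loadable modules (which the modprobe commands above suggest); a driver compiled into the kernel would not be listed.

```shell
# missing_modules: read lsmod-style text on standard input and print the
# names of the modules from step (5) that are NOT listed
# (prints nothing if all three are loaded).
# lsmod always shows underscores, whichever form was typed to modprobe.
missing_modules() {
    loaded=$(awk '{ print $1 }')
    for m in i2c_i801 i2c_smbus iTCO_wdt; do
        printf '%s\n' "$loaded" | grep -qx "$m" || echo "$m"
    done
}

# Usage on the live system:
#   lsmod | missing_modules
```

An empty result means all three are present. The device created by the driver can also be checked directly with ls -l /dev/watchdog.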

Optional: check whether the watchdog has correctly been detected by typing dmesg to view the Linux kernel logs. Look for lines such as the following (preceded by numerical timestamps):

iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
iTCO_wdt: initialized. heartbeat=120 sec (nowayout=0)

If only the first appears, the watchdog was not detected (and it is useless to continue). If all three appear, the watchdog was correctly detected.
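The same check can be scripted instead of scrolling through the whole log. This is a sketch under the assumption that the driver's messages carry the iTCO_wdt: prefix exactly as in the sample lines above; the helper name classify_itco_log is made up, and treating any partial set of lines as "not detected" is my simplification of the rule just stated.

```shell
# classify_itco_log: read kernel log text on standard input and print
#   detected      if the banner, "Found a" and "initialized" lines all appear
#   not-detected  if only some iTCO_wdt lines appear (e.g. the banner alone)
#   no-driver     if there is no iTCO_wdt line at all
classify_itco_log() {
    lines=$(grep 'iTCO_wdt: ' || true)
    if [ -z "$lines" ]; then
        echo "no-driver"
        return
    fi
    for pat in 'Intel TCO WatchDog Timer Driver' 'Found a ' 'initialized'; do
        if ! printf '%s\n' "$lines" | grep -q "$pat"; then
            echo "not-detected"
            return
        fi
    done
    echo "detected"
}

# Usage on the live system:
#   dmesg | classify_itco_log
# To just see the relevant lines:
#   dmesg | grep iTCO_wdt
```

A result of no-driver most likely means the modprobe iTCO-wdt command from step (5) was not run or failed.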

(6) Finally, type the following at the administrator prompt:

cat >> /dev/watchdog

and press enter twice. This will cause the watchdog to start counting down. The prompt will not return (do not press control-c or control-d); simply wait.

After a few minutes (specifically, twice the heartbeat value indicated in the above optional step), the system will perform a hard reboot.
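To know how long to wait, the delay can be worked out from the heartbeat line of the kernel log. The sketch below applies the "twice the heartbeat" rule stated above to whatever value the driver reported; reset_delay is a hypothetical helper name, and it has to be run before step (6), for instance in a second terminal.

```shell
# reset_delay: read kernel log text on standard input, take the last
# "heartbeat=N sec" value reported by iTCO_wdt, and print 2*N (in seconds).
# Prints "unknown" if no such line is found.
reset_delay() {
    hb=$(grep 'iTCO_wdt: initialized' \
        | sed -n 's/.*heartbeat=\([0-9][0-9]*\) sec.*/\1/p' \
        | tail -n 1)
    if [ -n "$hb" ]; then
        echo $((hb * 2))
    else
        echo "unknown"
    fi
}

# Usage on the live system:
#   dmesg | reset_delay
# With heartbeat=120 as in the sample log, this prints 240 (four minutes).
```

If nothing has happened well past that delay and the system is still responsive, the watchdog was probably never started.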

Results: If the system performs a successful reboot (as if the reset button had been pressed), congratulations: the watchdog is functioning correctly. (Please report a success to me.) If the system fails to reboot (remains forever hung in POST), turn the computer off, disconnect the power supply for a few minutes, and then perform a cold boot to return to normal operation. Please report this as a failure.

Many thanks to all those who can try this test!