Intel "In-Field Scan" Coming With Sapphire Rapids As New Silicon Failure Testing Feature

Written by Michael Larabel in Intel on 2 March 2022 at 11:10 AM EST.
Intel In-Field Scan is a hardware feature the company had not talked about publicly until yesterday, when it posted a new open-source Linux driver for this hardware failure testing feature, which is being introduced with Sapphire Rapids processors.

Intel In-Field Scan is being introduced with at least some of the upcoming Xeon "Sapphire Rapids" processor SKUs and allows running circuit-level tests on a CPU core to detect hardware problems not caught by parity or ECC checks. Intel In-Field Scan (it uses the "IFS" acronym, not to be confused with Intel Foundry Services) provides hardware hooks for performing per-core tests and reporting any silicon failures from those tests. It is designed to be used by cloud providers, OEMs, and hyperscalers for running tests and finding in-field failures due to aging silicon or other hardware problems that would not otherwise be detected by existing hardware checks such as ECC memory errors or other machine check exceptions.


Intel In-Field Scan makes a lot of sense for future Xeon Scalable server processors: it can help detect silicon issues prior to deployment into production, or afterwards through routine monitoring of the aging silicon.


As for exactly which silicon-level hardware tests will be conducted, that isn't entirely clear. This proposed Intel IFS kernel driver is just the infrastructure for handling In-Field Scan, while the tests themselves will be loaded as a binary, similar to Intel CPU microcode. The Intel IFS tests are loaded from a file and are specific to a particular CPU Family/Model/Stepping. These files are authenticated prior to use and, once loaded, are stored within secure memory.
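
Since the test images are tied to a CPU Family/Model/Stepping, an administrator may want to check which signature a machine reports before looking for a matching image. The following is only a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo exposes the usual "cpu family", "model", and "stepping" fields. The function name cpu_signature is hypothetical and is not part of the IFS driver, and nothing here says how Intel names or distributes the image files.

```python
from typing import Dict, Optional, Tuple


def cpu_signature(cpuinfo_text: str) -> Tuple[int, int, int]:
    """Return (family, model, stepping) for the first CPU in /proc/cpuinfo text.

    Hypothetical helper for illustration; not part of the IFS driver.
    """
    fields: Dict[str, Optional[int]] = {
        "cpu family": None,
        "model": None,
        "stepping": None,
    }
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Exact key match, so "model name" is not mistaken for "model".
        if sep and key in fields and fields[key] is None:
            fields[key] = int(value.strip())
        if all(v is not None for v in fields.values()):
            break
    if any(v is None for v in fields.values()):
        raise ValueError("family/model/stepping not found in cpuinfo text")
    return fields["cpu family"], fields["model"], fields["stepping"]


# Example use on a live system:
#   with open("/proc/cpuinfo") as f:
#       family, model, stepping = cpu_signature(f.read())
```

The text is passed in as a parameter rather than read inside the function so the parsing can be exercised on any machine, including ones that are not x86.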

When running on a supported Intel processor with a kernel that has the Intel IFS driver and with the IFS test images available, the tests can be loaded via /sys/devices/system/cpu/ifs/reload. The IFS tests can then be triggered to execute on all available CPU cores by writing to /sys/devices/system/cpu/ifs/run_test. The IFS driver also allows testing individual CPU cores via sysfs.
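
The load-then-run sequence described above could be scripted roughly as follows. This is a sketch under stated assumptions: the sysfs paths are those given for the proposed driver and may change during review, the value written ("1") is an assumption since the exact value the driver expects is not covered here, and the function name load_and_run_ifs is hypothetical. Writing to these files would normally require root.

```python
from pathlib import Path

# Directory named for the proposed driver; it may differ in a merged version.
IFS_DIR = Path("/sys/devices/system/cpu/ifs")


def load_and_run_ifs(base: Path = IFS_DIR, token: str = "1") -> None:
    """Load the IFS test images, then start a scan on all available cores.

    ASSUMPTION: writing `token` ("1" by default) is what triggers each
    action; the value actually expected by the driver is not confirmed.
    """
    # Step 1: ask the driver to (re)load the test images.
    (base / "reload").write_text(token)
    # Step 2: trigger the tests on all available CPU cores.
    (base / "run_test").write_text(token)
```

Taking the directory as a parameter keeps the sketch usable against a scratch directory for dry runs, and makes it easy to adjust if the final sysfs location differs.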

After carrying out an In-Field Scan test, the results are written to /sys/devices/system/cpu/ifs/status, which reports whether all CPU cores passed or failed. There are also sysfs files for reporting the specific CPU cores that passed, failed, or were untested.
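
Reading the result back could look something like the sketch below. The status path is the one given for the proposed driver; the format of the file's contents is not described here, so the code only returns the raw text and does not try to interpret it. The function name read_ifs_status is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Directory named for the proposed driver; it may differ in a merged version.
IFS_DIR = Path("/sys/devices/system/cpu/ifs")


def read_ifs_status(base: Path = IFS_DIR) -> Optional[str]:
    """Return the raw contents of the IFS status file, or None if absent.

    The content format is not assumed; callers get the text with
    surrounding whitespace removed and must interpret it themselves.
    """
    status_file = base / "status"
    try:
        return status_file.read_text().strip()
    except FileNotFoundError:
        # Driver not loaded, unsupported CPU, or a different sysfs layout.
        return None
```

Returning None when the file is missing lets a monitoring script distinguish "IFS not available on this host" from an actual test result.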


These interfaces will allow OEMs and hyperscalers to easily carry out these silicon failure tests whenever desired, whether prior to deployment or on an ongoing basis to look for issues stemming from aging silicon.

The Intel In-Field Scan Linux kernel driver is currently under review on the kernel mailing list and amounts to around 1.5k lines of new code -- not counting the to-be-published CPU model-specific test files, which will seemingly come out later once Sapphire Rapids is formally launched.