AMD Continues With MCE/SMCA Linux Driver Changes Ahead Of Zen 4 CPUs

Written by Michael Larabel in AMD on 18 April 2022 at 07:19 PM EDT.
This year, AMD engineers working on Linux hardware enablement have been busy with EDAC driver improvements such as RDDR5 and LRDDR5 handling, AMD Scalable Machine Check Architecture (SMCA) additions for "future" CPUs, and various other areas outside of error detection and correction. Today brings a new patch series back in the hardware error handling space, with new SMCA code.

A new patch series posted on Monday for the AMD MCE (Machine Check Exception) driver adds support for two new "syndrome" registers used in "future AMD Scalable MCA systems" and, as part of that, implements a new FRU Text feature. Given the timing of this work and AMD's usual cadence for Linux hardware enablement, this is almost certainly for EPYC 7004 "Genoa" and "Bergamo" server processors.


AMD engineers remain very busy working on Linux support ahead of Zen 4 processors launching later this year.


The new syndrome registers, part of the SMCA IP in future AMD CPUs, are intended to provide supplemental error information. The FRU Text feature exposes a Field Replaceable Unit (FRU) string that is stored in the new syndrome registers. The FRU text string can vary based on the MCA bank and is populated dynamically for each error state. This FRU string will be included in all AMD MCE reports for hardware errors.
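To make the idea of a text string carried in two registers more concrete, here is a minimal Python sketch. It is not taken from the patch series: the function names are hypothetical, and the layout (two 64-bit values holding up to 16 ASCII bytes, first register first, little-endian, NUL-padded) is an assumption made purely for illustration.

```python
import struct
from typing import Tuple


def decode_fru_text(synd1: int, synd2: int) -> str:
    """Decode an ASCII string from two 64-bit register values.

    ASSUMPTION: the string occupies 16 bytes, with synd1 holding the
    first 8 bytes and synd2 the last 8, in little-endian byte order.
    The real hardware layout may differ.
    """
    raw = struct.pack("<QQ", synd1, synd2)
    # Treat the first NUL byte (if any) as the end of the string.
    text = raw.split(b"\x00", 1)[0]
    return text.decode("ascii", errors="replace")


def pack_fru_text(text: str) -> Tuple[int, int]:
    """Hypothetical inverse helper, used here only to build test values."""
    raw = text.encode("ascii")[:16].ljust(16, b"\x00")
    return struct.unpack("<QQ", raw)


if __name__ == "__main__":
    synd1, synd2 = pack_fru_text("DIMM_A1")
    print(f"synd1={synd1:#018x} synd2={synd2:#018x}")
    print("FRU Text:", decode_fru_text(synd1, synd2))
```

Under these assumptions, a string of up to 16 characters fits in the two registers, which is why a reporting path could print it alongside the rest of the error record without any lookup table.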

The new AMD MCE driver patches are now out for review on the kernel mailing list and, given the timing, could be merged for the v5.19 cycle if no issues turn up. Long story short, this is another patch series pointing to the seemingly larger-than-usual set of hardware error detection/reporting changes coming for next-generation EPYC server processors, all of which should be welcome improvements for server administrators dealing with hardware/system issues.
