TY - GEN
T1 - Correlation of no trouble found errors to negative bias temperature instability
AU - LiVolsi, Robert
AU - McCormick, Kevin
AU - Torres, Myra
AU - Velamala, Jyothi
AU - Zheng, Rui
AU - Cao, Yu
PY - 2011/5/13
Y1 - 2011/5/13
N2 - No Trouble Found (NTF) and Cannot Duplicate (CND) errors on modern digital electronics are increasingly prevalent and occur at a rate of 50-60% using conventional benchtop diagnostics [1]. This work correlates NTF diagnostic errors to Negative Bias Temperature Instability (NBTI), a prominent degradation mode and self-annealing mechanism in sub-100 nm CMOS technology. NBTI degradation is duplicated in laboratory experiments on 90 nm Freescale MPC7448 microprocessors. Accelerated aging via in situ thermal and voltage cycling is conducted while benchmark scripts are running on the MPC7448. Faults observed include premature program termination, corruption of system services, L1 and L2 cache errors as reported by the kernel, and total system failure. After 8 hours of rest, the system boots up normally with no indication of system degradation. Final system failure is observed after several faults. Conventional Built-In Test (BIT) fails to detect these faults upon reboot of the system. Various control tests and test profiles are used to accelerate NBTI degradation on the microprocessor samples. The challenge with faulty behavior is distinguishing a healthy system from one whose degradation is leading to failure. Analysis techniques are used to show separation between healthy and degraded data, and independent NBTI research at Arizona State University is used to correlate NBTI behavior to NTF diagnostic errors.
AB - No Trouble Found (NTF) and Cannot Duplicate (CND) errors on modern digital electronics are increasingly prevalent and occur at a rate of 50-60% using conventional benchtop diagnostics [1]. This work correlates NTF diagnostic errors to Negative Bias Temperature Instability (NBTI), a prominent degradation mode and self-annealing mechanism in sub-100 nm CMOS technology. NBTI degradation is duplicated in laboratory experiments on 90 nm Freescale MPC7448 microprocessors. Accelerated aging via in situ thermal and voltage cycling is conducted while benchmark scripts are running on the MPC7448. Faults observed include premature program termination, corruption of system services, L1 and L2 cache errors as reported by the kernel, and total system failure. After 8 hours of rest, the system boots up normally with no indication of system degradation. Final system failure is observed after several faults. Conventional Built-In Test (BIT) fails to detect these faults upon reboot of the system. Various control tests and test profiles are used to accelerate NBTI degradation on the microprocessor samples. The challenge with faulty behavior is distinguishing a healthy system from one whose degradation is leading to failure. Analysis techniques are used to show separation between healthy and degraded data, and independent NBTI research at Arizona State University is used to correlate NBTI behavior to NTF diagnostic errors.
UR - http://www.scopus.com/inward/record.url?scp=79955771952&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79955771952&partnerID=8YFLogxK
U2 - 10.1109/AERO.2011.5747585
DO - 10.1109/AERO.2011.5747585
M3 - Conference contribution
AN - SCOPUS:79955771952
SN - 9781424473502
T3 - IEEE Aerospace Conference Proceedings
BT - 2011 Aerospace Conference, AERO 2011
T2 - 2011 IEEE Aerospace Conference, AERO 2011
Y2 - 5 March 2011 through 12 March 2011
ER -