# On-Chip Delay Degradation Measurement for Aging Compensation

#### Kyung Ki Kim\*

School of Electronic and Electrical Engineering, Daegu University, Gyeongsan, South Korea; kkkim@daegu.ac.kr

#### Abstract

As technology scales down, monitoring the performance degradation of circuits under aging stress conditions such as Negative-Bias Temperature Instability (NBTI) and Hot-Carrier Injection (HCI) has become one of the most critical issues in aging-tolerant nanoscale MOSFET circuit design. Hence, this paper proposes a novel on-chip circuit that measures the delay degradation of stressed MOSFET digital circuits and digitizes the degradation for aging compensation. A 0.11 µm CMOS technology has been used to implement and evaluate the proposed circuits.

Keywords: Aging Effect, Aging Prediction, Bias Temperature Instability, Hot Carrier Injection, Reliability

#### 1. Introduction

Designing reliable circuits has become harder with each nanometer technology node. Under normal operating conditions, a transistor can be affected by various aging effects such as Hot Carrier Injection (HCI), Bias Temperature Instability (BTI), and Time-Dependent Dielectric Breakdown (TDDB), resulting in performance degradation and eventually design failure. Among these aging phenomena, the impact of Negative BTI (NBTI) on PMOSFETs and the impact of HCI on NMOSFETs have become the most critical reliability issues. Hence, accurate NBTI/HCI sensors are imperative for reliable digital circuit design: the outputs of the sensor circuits can be used as control signals for a post-silicon tuning system.

NBTI degradation affects PMOS transistors when a negative bias is applied to the gate or, equivalently, when the gate is grounded and a positive bias is applied to the source/drain, as shown in Figure 1. Hydrogenated Si bonds (Si-H) at the interface between Si and the gate oxide, boron penetration into the gate oxide, and impurities in the oxide give rise to interface and oxide charge traps. In inversion mode, holes can be injected into these traps, which leads to an increase in V<sub>th</sub> and a decrease in I<sub>dsat</sub>. An increase in V<sub>th</sub> reduces the voltage overdrive (V<sub>DD</sub>-V<sub>th</sub>), decreasing circuit stability and margins. NBTI thus degrades the performance and yield of PMOS devices. Positive BTI (PBTI) is seen in NMOS devices when a positive bias stress is applied across the gate oxide. Although the impact of NBTI is higher than that of PBTI, PBTI has become increasingly important with the use of Hf-based gate dielectrics for leakage reduction<sup>1-7</sup>.

HCI describes a degradation of the electrical parameters of MOSFETs under a dynamic stress mode. If a channel hot carrier collides with a crystal atom near the drain region, it may produce an electron-hole pair by impact ionization, also called avalanche pair production, as shown in Figure 2. Electrons from impact ionization can have enough energy to be injected into the gate oxide and charge existing oxide traps or generate new oxide-interface traps. The end result of hot carrier injection into the gate oxide is a degradation of transistor parameters such as the saturation current ($I_{dsat}$) and the threshold voltage ($V_{th}$)<sup>8-11</sup>.

The above-mentioned reliability problems can severely degrade performance and, in the worst case, cause system failure. The conventional way of handling such problems is to provide additional safety margins (called a guardband) in the circuit design phase. That is, these reliability mechanisms are considered in the early design stages to make sure that MOSFET circuits operate with enough margin to function correctly over their entire lifetime. However, since this solution excessively increases the circuit size and power dissipation of a system, an adaptive compensation technique is required to reduce the increased area and power as well as to compensate for the performance loss due to device aging. For a good adaptive technique, accurate reliability monitoring circuits are imperative: the output of each monitoring circuit can be used as a control signal of the post-silicon tuning system. In particular, the self-adaptive system has to include on-chip aging prediction circuits whose outputs are strongly correlated with the threshold-voltage degradation caused by aging stresses. The novelty of the proposed work is the development of a delay degradation measurement for aging compensation (post-silicon tuning system).

Figure 1. NBTI stress on a PMOSFET device in a nanometer technology.

Figure 2. HCI stress on an NMOSFET device in a nanometer technology.

In this paper, we propose a new on-chip circuit that measures the delay degradation by detecting guardband violations in sequential logic. The circuit is inserted between a combinational logic block and a flip-flop in a sequential digital system. The monitoring block is based on stability checking: it detects signal transitions during the guardband interval $T_g$, as shown in Figure 3. If one or more paths of the combinational logic have aged enough, the output of the combinational logic will transition inside the guardband interval, which means the circuit is very close to failure. By detecting signal transitions within the guardband interval, circuit failure due to aging effects can be predicted.
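The stability-checking idea can be illustrated with a small behavioural model. The sketch below is not the proposed circuit; it assumes that the guardband interval $T_g$ is the last part of the clock period before the capturing edge, and all names and numbers (`aged_delay_ps`, `in_guardband`, a 1000 ps clock, a 100 ps guardband, an 850 ps fresh delay) are hypothetical.

```python
def aged_delay_ps(fresh_delay_ps, degradation):
    """Path delay after aging; `degradation` is a fractional increase (0.10 = +10%)."""
    return fresh_delay_ps * (1.0 + degradation)


def in_guardband(arrival_ps, clock_period_ps, guardband_ps):
    """True if a data transition arrives inside the guardband interval T_g.

    Assumption of this sketch: T_g is the window of width `guardband_ps`
    that ends at the capturing clock edge (t = clock_period_ps).
    """
    return clock_period_ps - guardband_ps <= arrival_ps <= clock_period_ps


if __name__ == "__main__":
    period, t_g, fresh = 1000, 100, 850  # hypothetical values in picoseconds
    for pct in (0, 5, 10):
        arrival = aged_delay_ps(fresh, pct / 100)
        print(pct, round(arrival, 1), in_guardband(arrival, period, t_g))
```

With these illustrative numbers, the path stays outside the guardband at 0% and 5% degradation and enters it at 10%, which is the event the monitor is meant to flag before an actual timing failure occurs.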

Recently, a circuit failure prediction circuit was proposed in Reference <sup>12</sup>, which deploys distributed aging sensors that locally measure the timing degradation of a critical or near-critical path. The prediction circuit can detect the moment when the signals of a critical path arrive within a predefined guardband interval. However, the scheme in Reference <sup>12</sup> uses a complicated Delay Element (DE) with a large area and has only a 1-bit Stability Checker (SC) output. Another circuit failure prediction sensor, proposed in Reference <sup>13</sup>, has an improved programmable DE with low area overhead, but it still has a 1-bit SC. A 1-bit SC cannot accurately indicate how far aging has progressed, so it is not easy to apply these schemes to a self-adaptive system.

**Figure 3.** Guardband violation due to transistor aging.

In this paper, we propose a novel on-chip circuit to measure the delay degradation with a simple programmable delay element using a 0.11 $\mu$m CMOS technology, where the 3-bit outputs are strongly correlated with the V<sub>th</sub> degradation caused by BTI and HCI stress. The proposed circuits can be easily applied to a self-adaptive system for reliable operation. The new fully digital on-chip circuit has been evaluated with a 4x4 multiplier.

#### 2. On-Chip Delay Degradation Measurement

This section presents a new on-chip circuit to measure the delay degradation due to aging effects. The proposed on-chip circuit deploys a flip-flop-based delay detector to monitor guardband violations in sequential logic. It detects the moment when the critical path delay of the combinational logic in a sequential design exceeds the nominal value that guarantees correct circuit operation.

The proposed circuit has four blocks, as shown in Figure 4: a Guardband Generator (GG) that creates three guardband windows, a Path Delay Monitor (PDM) that detects data signal transitions, the bit decision circuits, and a pulse generator block. In Figure 4, a "Measure" signal (meas) turns the aging measurement circuit on or off. To reduce the power overhead, the prediction circuit operates periodically for a short time and is turned off most of the time. A PDM is placed on each flip-flop, while the GG and the "Bit Decision" circuit are shared by all the flip-flops, which leads to a smaller area overhead.

The PDM block includes edge detectors, each consisting of an XOR gate, an AND gate, and a pulse generator. The XOR gate detects the moment when the input D and the output Q of a flip-flop have different digital values (that is, when a new data signal has passed through the combinational logic and arrived at the input D of the flip-flop). The AND gate makes the PDM operate only during the low state of the CLK signal, and the pulse generator, which uses a Schmitt trigger circuit as shown in Figure 5, generates a narrow pulse that marks the transition time of each data signal. The NOR gate merges the outputs of all the edge detectors in time order.
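The gating logic of one edge detector can be sketched at the Boolean level. This is a simplified illustration under the assumption of ideal, delay-free gates; the pulse generator and the merging gate are deliberately left out, and the function name `edge_detect` is hypothetical.

```python
def edge_detect(d, q, clk):
    """Return 1 when D differs from Q while CLK is low, else 0 (inputs are 0 or 1)."""
    differs = d ^ q           # XOR: new data at D that is not yet captured in Q
    clk_low = 1 - clk         # detection is enabled only in the low phase of CLK
    return differs & clk_low  # AND: gate the XOR output with the enable window


if __name__ == "__main__":
    for d, q, clk in [(0, 0, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0)]:
        print(d, q, clk, edge_detect(d, q, clk))
```

Only the combination where D and Q differ during the low phase of CLK produces a 1, which is the condition that would trigger the narrow pulse in the actual circuit.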

The GG block consists of a buffer chain, two programmable skewed inverters, and an inverter chain. This block delays the CLK signal so that the falling transition of the delayed CLK signal is placed in the guardband region, where the delay time of CLK is set by the control inputs C0~C5. Inverters Inv1~Inv4 generate the three guardband regions, and D0~D2 are the final delayed CLK signals that define the guardband regions.

Figure 4. The proposed circuit to measure the delay degradation.

The delay time $\beta$ in Figure 4 is the propagation time from the XOR gate to the inverter of the PDM, so the pulse generated by the PDM arrives a time $\beta$ after the actual transition of the data signal. To compensate for $\beta$, a buffer chain with the same delay $\beta$ is deployed in the GG block. The outputs D0~D2 will be pulses with a small width if each pulse reaches one of the three guardband regions.
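The compensation can be written out explicitly. With $t_d$ as the actual data transition time and $t_{clk}$ as the clock edge used as the timing reference (symbols introduced here for illustration only), the PDM pulse and the compensated reference are both shifted by $\beta$, so their difference is unchanged:

$$
t_{pulse} = t_d + \beta, \qquad t_{ref} = t_{clk} + \beta \quad\Longrightarrow\quad t_{pulse} - t_{ref} = t_d - t_{clk}.
$$

This assumes that the buffer chain matches $\beta$ exactly; any mismatch would appear directly as an error in the measured position of the transition relative to the guardband regions.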

Finally, the bit decision circuits shown in Figure 6 generate the 3-bit outputs from the outputs of the PDM during the measurement mode, where NM2 is used to reset the node Na. The final outputs Z0~Z2 indicate the guardband regions and give a graded warning of circuit failure. For example, (Z0Z1Z2 = 000) is normal operation, (Z0Z1Z2 = 100) is the first warning step, (Z0Z1Z2 = 110) is the second warning step, and (Z0Z1Z2 = 111) is circuit failure. When the outputs of the bit decision circuits are 111, a Failure signal is generated. In the non-measurement mode, NM2 is off, so the proposed system acts only as a monitoring circuit. The pulse generators of the pulse generator block in Figure 5 generate narrow pulses that mark the transition times of Z0~Z2, and an OR gate merges the outputs of all the edge detectors in time order.
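The four output codes listed above can be decoded in software, for example in a controller that reads Z0~Z2. The following is a minimal sketch; the status strings, the function names, and the handling of any other code as "invalid" are assumptions made for illustration and are not specified in the text.

```python
# Codes and their meaning as listed in the text; the labels are illustrative.
STATUS_BY_CODE = {
    (0, 0, 0): "normal",
    (1, 0, 0): "warning-1",
    (1, 1, 0): "warning-2",
    (1, 1, 1): "failure",
}


def decode_status(z0, z1, z2):
    """Map the 3-bit output to a status string; unlisted codes give 'invalid'."""
    return STATUS_BY_CODE.get((z0, z1, z2), "invalid")


def failure_signal(z0, z1, z2):
    """Failure is asserted only when all three bits are set (code 111)."""
    return z0 & z1 & z2


if __name__ == "__main__":
    for code in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]:
        print(code, decode_status(*code), failure_signal(*code))
```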

Figure 7 shows a timing diagram for the proposed circuit. In this timing diagram, a data transition occurs in the first guardband region, so the primary output Z0~Z2 is "100". The Ref\_CLK signal is delayed by $\beta$ relative to the CLK signal to compensate for the delay of the PDM block. D0~D2, the final delayed CLK signals, define the guardband regions: the rising edge of D0, the falling edge of D1, and the rising edge of D2 mark guardband regions 1, 2, and 3, respectively. The transitions of PG1 and PG2 come from the difference between the input and output of the flip-flop, which means that a new data signal has passed through the combinational logic and arrived at the input D of a flip-flop. The B0~B2 signals are generated from the D0~D2 and PG1~PG2 signals. In this diagram, B0 has a pulse due to a transition of D0 and PG2, which indicates a guardband violation. As a result, Z0~Z2 change from 000 to 100 in the measurement mode, and the output (CP) of the proposed measurement circuit can be used as a control signal in a self-adaptive system using methods such as adaptive body biasing, supply voltage scaling, and frequency scaling.

**Figure 5.** Pulse generator circuit using a Schmitt trigger.

Figure 6. Bit decision circuit.

Figure 7. Timing diagram for the proposed circuit.
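The timing behaviour described for Figure 7 can be mimicked with a simple behavioural model. The sketch assumes that each guardband region begins at a known time within the clock cycle, that region 3 is the one closest to the capturing edge, and that a transition falling in a deeper region also sets the bits of the shallower ones (consistent with the codes 100, 110, and 111). The function name `z_code` and the region start times are hypothetical.

```python
def z_code(transition_ps, region_starts_ps):
    """Return (Z0, Z1, Z2) for a data transition at `transition_ps`.

    `region_starts_ps` holds the start times of guardband regions 1, 2, 3
    in increasing order. `None` means no transition was seen in the cycle.
    """
    if transition_ps is None:
        return (0, 0, 0)
    return tuple(int(transition_ps >= start) for start in region_starts_ps)


if __name__ == "__main__":
    starts = (900, 935, 970)  # hypothetical region boundaries in picoseconds
    for t in (850, 910, 940, 980):
        print(t, z_code(t, starts))
```

A transition at 910 ps in this example lands in the first region only and yields the code 100, matching the case shown in the timing diagram.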

Figure 8 shows a flowchart of a self-tuning system using a power gating structure and forward biasing. Based on the output (CP) of the proposed measurement circuit, the self-tuning system decides the number of power switches turned on in the power gating structure and the forward biasing voltage in active mode. If a failure signal is generated, an Adaptive Frequency Scaling (AFS) system changes the clock frequency of the sequential circuits to satisfy the timing requirement.
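The decision flow can be sketched as a lookup from the warning level to a tuning action. This is only an illustration of the control idea: the policy table, the switch counts, the bias voltages, and the names `TuningAction` and `self_tune` are hypothetical and do not come from the flowchart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningAction:
    switches_on: int        # number of power-gating switches turned on
    forward_bias_v: float   # forward body-bias voltage in active mode
    scale_clock: bool       # True when Adaptive Frequency Scaling is requested


# Hypothetical policy: warning level -> (switches on, forward bias in volts).
POLICY = {0: (4, 0.0), 1: (6, 0.1), 2: (8, 0.2), 3: (8, 0.2)}


def self_tune(z0, z1, z2):
    """Choose a tuning action from the 3-bit output (assumed 000, 100, 110 or 111)."""
    level = z0 + z1 + z2            # number of guardband regions flagged
    switches, bias = POLICY[level]
    failure = level == 3            # code 111 corresponds to the Failure signal
    return TuningAction(switches, bias, scale_clock=failure)


if __name__ == "__main__":
    for code in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]:
        print(code, self_tune(*code))
```

Stronger compensation is applied as more guardband regions are flagged, and frequency scaling is requested only for the failure code.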

#### 3. Simulation Result

The proposed circuits have been designed and evaluated using a 0.11 $\mu$m MOSFET technology model (VDD = 1.1 V). For the long-term NBTI stress simulation, the number of cycles of the stressed input signal (0.5 duty cycle, 2 GHz frequency) is increased, and the HCI stress time (switching time) for these experiments is 400 $\mu$s. A 4x4 multiplier has been used as the benchmark circuit in our simulation. The simulation results are compared with those of Reference <sup>12</sup> in Table 1.

In the case of delay overhead, the penalty is very small because the proposed circuit has no influence on the multiplier delay time. The power overhead in the no-measurement mode is also very small because the measurement signal is used to turn off the proposed circuit. In the measurement mode, the multiplier with the modified flip-flop consumes about 50% more power than the multiplier with a normal flip-flop. However, because the measurement mode is enabled only periodically and for a short time, the overall overhead impact on the multiplier remains small.

**Figure 8.** Flowchart of a self-tuning system using a power gating structure and forward biasing.

Table 1. Simulation results

| Item                                       | Reference <sup>12</sup> | Proposed       |
|--------------------------------------------|-------------------------|----------------|
| # of outputs                               | 1 output                | 3 outputs      |
| Technology                                 | 65 nm PTM               | 0.11 µm        |
| Test benchmark circuit                     | OpenRISC processor      | 4x4 multiplier |
| Delay overhead                             | < 1%                    | < 1%           |
| Power overhead<br>(in no-measurement mode) | 0.1%                    | ~1%            |
| Power overhead<br>(in measurement mode)    | 7.5%                    | ~50%           |
| Programmable delay buffer                  | No                      | Yes            |
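To put the two power-overhead figures for the proposed circuit in Table 1 in context, the time-averaged overhead can be estimated from the fraction of time spent in measurement mode. The sketch below uses the approximate values from the table (about 50% and about 1%); the function name and the 1% measurement fraction in the example are hypothetical, since the text does not state how often measurement is enabled.

```python
def average_power_overhead(measure_fraction, on_overhead=0.50, off_overhead=0.01):
    """Time-weighted power overhead when measuring `measure_fraction` of the time."""
    if not 0.0 <= measure_fraction <= 1.0:
        raise ValueError("measure_fraction must be between 0 and 1")
    return measure_fraction * on_overhead + (1.0 - measure_fraction) * off_overhead


if __name__ == "__main__":
    # Hypothetical case: measurement enabled for 1% of the operating time.
    print(f"{average_power_overhead(0.01):.2%}")
```

Under this assumption the average overhead stays close to the no-measurement figure, which is why a short, periodic measurement window keeps the overall cost low.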

On the other hand, the power overheads reported in Reference <sup>12</sup> are smaller than those of our proposed circuits, but the prediction circuit proposed in Reference <sup>12</sup> has only a 1-bit output to represent a guardband window. Also, the technology used in Reference <sup>12</sup> is a Predictive Technology Model (PTM), which is not a real technology model, and its buffer delay cannot be changed after chip fabrication. Our proposed measurement circuit has a programmable buffer to change the buffer delay, so the guardband window can be adjusted through control signals according to the clock cycle and technology model of the digital system, even after chip fabrication.

#### 4. Conclusion

This paper proposes a novel on-chip circuit in a 0.11 $\mu$m technology to measure the delay degradation of sequential logic caused by aging effects. The simulation results show that the proposed circuits achieve good aging failure measurement with low overhead. As part of an adaptive design technique for overcoming the performance degradation due to aging phenomena, our aging prediction circuit would be a practicable solution in nanoscale CMOS circuits. In a nanometer digital circuit operated in the ultra-low-voltage region, even the smallest variations can slow down a transistor's switching speed, and an aged device may not perform adequately at very low voltage. Until now, reliability (aging) effects have traditionally been the domain of process engineers, but in the future circuit designers will need to consider these effects in the early stages of design to make sure there are enough margins for circuits to function correctly over their entire lifetime. Resilient circuit techniques such as the one proposed in this paper may therefore contribute to a design paradigm shift in VLSI design.

#### 5. Acknowledgement

This research was supported by the Daegu University Research Grant, 2010.

#### 6. References

1. Wooters SN, Cabe AC, Qi Z, Wang J, Mann RW, Calhoun BH, Stan MR, Blalock TN. Tracking on-chip age using distributed, embedded sensors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2011; 99:1–12.
2. Qi Z, Wang J, Cabe A, Wooters S, Blalock T, Calhoun B, Stan M. SRAM-based NBTI/PBTI sensor system design. 47th ACM/IEEE Design Automation Conference; 2010 Jun.
3. Omana M, Rossi D, Bosio N, Metra C. Self-checking monitor for NBTI due degradation. IEEE 16th International Mixed-Signals, Sensors and Systems Test Workshop; 2010 Jun.
4. Vattikonda R, Wenping W, Yu C. Modeling and minimization of PMOS NBTI effect for robust nanometer design. 43rd ACM/IEEE Design Automation Conference; 2006. p. 1047–52.
5. Zafar S, et al. A comparative study of NBTI and PBTI (charge trapping) in SiO2/HfO2 stacks with FUSI, TiN, Re gates. IEEE Proceedings VLSI Technology; 2006 Jun. p. 23–5.
6. Martin-Martinez J, Rodriguez R, Nafria M, Aymerich X. Time-dependent variability related to BTI effects in MOSFETs: impact on CMOS differential amplifiers. IEEE Transactions on Device and Materials Reliability. 2009 Jun; 9(2):305–10.
7. Yang S, Yang H, Chuang C, Hwang W. Timing control degradation and NBTI/PBTI tolerant design for write-replica circuit in nanoscale CMOS SRAM. IEEE International Symposium on VLSI Design, Automation and Test; 2009. p. 162–5.
8. Bravaix A, Guerin C, Huard V, Roy D, Roux JM, Vincent E. Hot-carrier acceleration factors for low power management in DC-AC stressed 40nm NMOS node at high temperature. 47th Annual International Reliability Physics Symposium Proceedings (IEEE IRPS); 2009 Apr. p. 26–30.
9. Lu MF, Chiang S, Liu A, Huang-Lu S, et al. Hot carrier degradation in novel strained-Si nMOSFETs. 42nd Annual International Reliability Physics Symposium Proceedings (IEEE IRPS); 2004 Apr 25–29. p. 18–22.
10. Chen YY, Gardner M, Fulford J, Wristers D, Joshi AB, et al. Enhanced hot-hole degradation in P+-poly PMOSFETs with oxynitride gate dielectrics. International Symposium on VLSI Technology, Systems, and Applications (VLSI); 1999 Jun 8–10. p. 86–9.
11. Arnaud F, Liu J, Lee YM, et al. 32nm general purpose bulk CMOS technology for high performance applications at low voltage. IEEE IEDM'08; 2008 Dec. p. 1–4.
12. Agarwal M, Paul BC, Zhang M, Mitra S. Circuit failure prediction and its application to transistor aging. IEEE VLSI Test Symposium; 2007. p. 277–86.
13. Vazquez J, Champac V, Ziesemer A, Reis R, Teixeira I, Santos M, Teixeira J. Low-sensitivity to process variations aging sensor for automotive safety-critical applications. VLSI Test Symposium (VTS); 2010. p. 238–44.