# **VLSI** Design

## Sheet 1

1.6 Sketch a transistor-level schematic for a compound CMOS logic gate for each of the following functions:

a) 
$$Y = \overline{ABC + D}$$

b) 
$$Y = (\overline{AB + C}) \cdot \overline{D}$$

c) 
$$Y = \overline{AB + C \cdot (A + B)}$$

- 1.9 Sketch transistor-level schematics for the following logic functions. You may assume you have both true and complementary versions of the inputs available.
  - a) A 2:4 decoder defined by

$$Y0 = \overline{A0} \cdot \overline{A1}$$

$$Y1 = A0 \cdot \overline{A1}$$

$$Y2 = \overline{A0} \cdot A1$$

$$Y3 = A0 \cdot A1$$

- 1.10 Sketch a stick diagram for a CMOS 4-input NOR gate from Exercise 1.5.
- 1.11 Estimate the area of your 4-input NOR gate from Exercise 1.10.
- 1.15 Draw a transistor-level schematic for the latch of Figure 1.75. How does the schematic differ from Figure 1.31(b)?



FIGURE 1.75 Level-sensitive latch stick diagram

1.13 Figure 1.74 shows a stick diagram of a 2-input NAND gate. Sketch a side view (cross-section) of the gate from X to X'.



FIGURE 1.74 2-input NAND gate stick diagram

- 1.18 A 3-input majority gate returns a true output if at least two of the inputs are true. A minority gate is its complement. Design a 3-input CMOS minority gate using a single stage of logic.
  - a) sketch a transistor-level schematic
  - b) sketch a stick diagram
  - c) estimate the area from the stick diagram
- 1.19 Design a 3-input minority gate using CMOS NANDs, NORs, and inverters. How many transistors are required? How does this compare to a design from Exercise 1.18(a)?

### Reference:

Neil Weste and David Harris, *CMOS VLSI Design*, 4th edition.

1/



2/





The minimum area is 5 tracks by 5 tracks (40  $\lambda$  x 40  $\lambda$  = 1600  $\lambda$ <sup>2</sup>).

4/

6 tracks wide by 6 tracks tall, or 2304  $\lambda^2$ .

This latch is nearly identical, except that the inverter and transmission gate feedback has been replaced by a tristate feedback gate.





6/



(c) 6 tracks wide x 7 tracks high =  $(48 \times 56) = 2688 \lambda^2$ .

20 transistors, vs. 10 in 1.18(a).



# VLSI design Sheet 2

4.1 Sketch a 2-input NOR gate with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter. Compute the rising and falling propagation delays of the NOR gate driving h identical NOR gates using the Elmore delay model. Assume that every source or drain has fully contacted diffusion when making your estimate of capacitance.

Sketch a stick diagram for the 2-input NOR. Repeat Exercise 4.1 with better capacitance estimates. In particular, if a diffusion node is shared between two parallel transistors, only budget its capacitance once. If a diffusion node is between two series transistors and requires no contacts, only budget half the capacitance because of the smaller diffusion area.

- 4.4 Find the worst-case Elmore parasitic delay of an n-input NOR gate.
- 4.5 Sketch a delay vs. electrical effort graph like that of Figure 4.21 for a 2-input NOR gate using the logical effort and parasitic delay estimated in Section 4.4.2. How does the slope of your graph compare to that of a 2-input NAND? How does the y-intercept compare?
- 4.6 Let a 4x inverter have transistors four times as wide as those of a unit inverter. If a unit inverter has three units of input capacitance and parasitic delay of p<sub>inv</sub>, what is the input capacitance of a 4x inverter? What is the logical effort? What is the parasitic delay?
- 4.11 Consider four designs of a 6-input AND gate shown in Figure 4.40. Develop an expression for the delay of each path if the path electrical effort is H. What design is fastest for H = 1? For H = 5? For H = 20? Explain your conclusions intuitively.



----

4.19 Consider a process in which pMOS transistors have three times the effective resistance of nMOS transistors. A unit inverter with equal rising and falling delays in this process is shown in Figure 4.42. Calculate the logical efforts of a 2-input NAND gate and a 2-input NOR gate if they are designed with equal rising and falling delays.



- 4.20 Generalize Exercise 4.19 if the pMOS transistors have μ times the effective resistance of nMOS transistors. Find a general expression for the logical efforts of a k-input NAND gate and a k-input NOR gate. As μ increases, comment on the relative desirability of NANDs vs. NORs.
- 4.25 The clock buffer in Figure 4.43 can present a maximum input capacitance of 100 fF. Both true and complementary outputs must drive loads of 300 fF. Compute the input capacitance of each inverter to minimize the worst-case delay from input to either output. What is this delay, in  $\tau$ ? Assume the inverter parasitic delay is 1.



FIGURE 4.43 Clock buffer

## **Sheet 2 solutions**

1/

The rising delay is (R/2)\*8C + R\*(6C+5hC) = (10+5h)RC if both of the series pMOS transistors have their own contacted diffusion at the intermediate node. More realistically, the diffusion will be shared, reducing the delay to (R/2)\*4C + R\*(6C+5hC) = (8+5h)RC. Neglecting the diffusion capacitance not on the path from Y to GND, the falling delay is R\*(6C+5hC) = (6+5h)RC.
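The arithmetic above can be reproduced with a short Python sketch. It is an illustration only: the helper names `elmore` and `nor2_delays` are made up here, R and C are normalized to 1, and the sizing (two series pMOS of width 4, two parallel nMOS of width 1) is taken from the capacitance values quoted in the solution.

```python
from fractions import Fraction

def elmore(nodes):
    """Elmore delay of an RC ladder, in units of RC.

    nodes: list of (R_segment, C_node) pairs ordered from the supply rail
    toward the output. Each capacitor is charged through the sum of the
    resistances between it and the rail.
    """
    delay = Fraction(0)
    r_total = Fraction(0)
    for r_seg, c_node in nodes:
        r_total += r_seg
        delay += r_total * c_node
    return delay

def nor2_delays(h, c_internal=8, c_out=6, c_in=5):
    """Rising and falling delays of a NOR2 driving h identical NOR2 gates.

    Assumed from the solution: each series pMOS has resistance R/2,
    a single nMOS (resistance R) pulls the output down.
    """
    load = c_out + c_in * h
    half = Fraction(1, 2)
    rising = elmore([(half, c_internal), (half, load)])
    falling = elmore([(Fraction(1), load)])
    return rising, falling

print(nor2_delays(1))                 # separately contacted internal node
print(nor2_delays(1, c_internal=4))   # shared internal diffusion
```

Passing `c_internal=2, c_out=5` gives the improved estimate that uses the stick-diagram capacitances.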



2/

The rising delay is (R/2)\*2C + R\*(5C+5hC) = (6+5h)RC and the falling delay is R\*(5C+5hC) = (5+5h)RC.



The output node has 3nC. Each internal node has 2nC. The resistance through each pMOS is R/n. Hence, the propagation delay is

$$t_{pd} = R(3nC) + \sum_{i=1}^{n-1} \left(\frac{iR}{n}\right) (2nC) = (n^2 + 2n)RC$$
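As a sanity check, the sum can be evaluated term by term and compared with the closed form. This is a minimal sketch under the assumptions stated in the solution (output node 3nC, internal nodes 2nC, each pMOS R/n); the function name is hypothetical.

```python
from fractions import Fraction

def nor_parasitic_delay(n):
    """Worst-case Elmore parasitic delay of an n-input NOR, in units of RC."""
    output_term = 3 * n                      # full resistance R times 3nC
    internal_terms = sum(Fraction(i, n) * (2 * n) for i in range(1, n))
    return output_term + internal_terms

for n in range(1, 7):
    # Second and third columns should agree: summed delay vs. n^2 + 2n
    print(n, nor_parasitic_delay(n), n**2 + 2 * n)
```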

The slope (logical effort) is 5/3 rather than 4/3. The y-intercept (parasitic delay) is identical, at 2.



4/

 $C_{\rm in}$  = 12 units. g = 1. p =  $p_{\rm inv}$ . Changing the size affects the capacitance but not logical effort or parasitic delay.

4.11  $D = N(GH)^{1/N} + P$ . Compare in a spreadsheet. Design (b) is fastest for H = 1 or 5. Design (d) is fastest for H = 20 because it has a lower logical effort and more stages to drive the large path effort. (c) is always worse than (b) because it has greater logical effort, all else being equal.

### Comparison of 6-input AND gates

| Design | G                 | P       | N | D (H=1) | D (H=5) | D (H=20) |
|--------|-------------------|---------|---|---------|---------|----------|
| (a)    | 8/3 * 1           | 6+1     | 2 | 10.3    | 14.3    | 21.6     |
| (b)    | 5/3 * 5/3         | 3 + 2   | 2 | 8.3     | 12.5    | 19.9     |
| (c)    | 4/3 * 7/3         | 2+3     | 2 | 8.5     | 12.9    | 20.8     |
| (d)    | 5/3 * 1 * 4/3 * 1 | 3+1+2+1 | 4 | 11.9    | 14.3    | 17.3     |
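Instead of a spreadsheet, the comparison can be scripted. The sketch below evaluates $D = N(GH)^{1/N} + P$ for the G, P, and N values listed in the table; the dictionary layout and function names are illustrative choices, not part of the original solution.

```python
def path_delay(G, P, N, H):
    """Minimum path delay D = N*(G*H)**(1/N) + P, in units of tau."""
    return N * (G * H) ** (1.0 / N) + P

# (path logical effort G, path parasitic delay P, number of stages N)
designs = {
    "a": (8/3 * 1, 6 + 1, 2),
    "b": (5/3 * 5/3, 3 + 2, 2),
    "c": (4/3 * 7/3, 2 + 3, 2),
    "d": (5/3 * 1 * 4/3 * 1, 3 + 1 + 2 + 1, 4),
}

def fastest(H):
    """Name of the design with the smallest delay at path electrical effort H."""
    return min(designs, key=lambda name: path_delay(*designs[name], H))

for H in (1, 5, 20):
    row = {name: round(path_delay(*gpn, H), 1) for name, gpn in designs.items()}
    print("H =", H, row, "fastest:", fastest(H))
```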

6/

NAND2: g = 5/4; NOR2: g = 7/4. The inverter has a 3:1 P/N ratio and 4 units of capacitance. The NAND has a 3:2 ratio and 5 units of capacitance, while the NOR has a 6:1 ratio and 7 units of capacitance.

NAND:  $g = (\mu + k) / (\mu + 1)$ ; NOR:  $g = (\mu k + 1) / (\mu + 1)$ . As  $\mu$  increases, NOR gates get worse compared to NAND gates because the series pMOS devices become more expensive.
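The general expressions follow from sizing each gate to match the inverter's drive and dividing its input capacitance by the inverter's. A small sketch, assuming a reference inverter with nMOS width 1 and pMOS width $\mu$ (the function name and argument order are hypothetical):

```python
from fractions import Fraction

def logical_effort(kind, k, mu):
    """Logical effort of a k-input NAND or NOR with equal rise/fall delays.

    mu is the pMOS/nMOS effective-resistance ratio; the reference inverter
    has input capacitance mu + 1.
    """
    mu = Fraction(mu)
    if kind == "nand":
        n_width, p_width = k, mu        # k series nMOS, parallel pMOS
    elif kind == "nor":
        n_width, p_width = 1, mu * k    # parallel nMOS, k series pMOS
    else:
        raise ValueError("kind must be 'nand' or 'nor'")
    return (n_width + p_width) / (mu + 1)

for mu in (1, 2, 3, 5):
    print(mu, logical_effort("nand", 2, mu), logical_effort("nor", 2, mu))
```

With `mu = 3` and `k = 2` this reproduces the 5/4 and 7/4 values above, and the widening gap as `mu` grows illustrates why NORs become less attractive.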

7/

If the first upper inverter has size x and the lower 100-x and the second upper inverter has the same stage effort as the first (to achieve least delay), the least delays are:  $D = 2(300/x)^{1/2} + 2 = 300/(100-x) + 1$ . Hence x = 49.4,  $D = 6.9 \tau$ , and the sizes are 49.4 and 121.7 for the upper inverters and 50.6 for the lower inverter. Such circuits are called *forks* and are discussed in depth in [Sutherland99].
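The value of x can be found numerically. The upper-leg delay falls as x grows while the lower-leg delay rises, so the worst-case delay is smallest where the two are equal. The sketch below uses plain bisection on that condition; the function names and the search interval are assumptions made for illustration.

```python
import math

C_IN, C_LOAD, P_INV = 100.0, 300.0, 1.0   # fF, fF, parasitic delay per inverter

def upper_delay(x):
    """Two-inverter leg with input capacitance x and equal stage efforts."""
    return 2 * math.sqrt(C_LOAD / x) + 2 * P_INV

def lower_delay(x):
    """Single-inverter leg, which receives the remaining C_IN - x."""
    return C_LOAD / (C_IN - x) + P_INV

def solve_fork(lo=1.0, hi=99.0, tol=1e-9):
    """Bisection on upper_delay(x) - lower_delay(x), which decreases with x."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_delay(mid) > lower_delay(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = solve_fork()
second_upper = math.sqrt(x * C_LOAD)      # geometric mean of x and the load
print(round(x, 1), round(second_upper, 1), round(C_IN - x, 1), round(upper_delay(x), 1))
```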

- You are synthesizing a chip composed of random logic with an average activity factor of 0.1. You are using a standard cell process with an average switching capacitance of 450 pF/mm<sup>2</sup>. Estimate the dynamic power consumption of your chip if it has an area of 70 mm<sup>2</sup> and runs at 450 MHz at  $V_{DD} = 0.9$  V.
- You are considering lowering  $V_{DD}$  to try to save power in a static CMOS gate. You will also scale  $V_t$  proportionally to maintain performance. Will dynamic power consumption go up or down? Will static power consumption go up or down?
- 3 The stack effect causes the current through two series OFF transistors to be an order of magnitude less than  $I_{\text{off}}$  when DIBL is significant. Show that the current is  $I_{\text{off}}/2$  when DIBL is insignificant (e.g.,  $\eta = 0$ ). Assume  $\gamma = 0$ , n = 1.
- 4 Determine the activity factor for the signal shown in Figure 5.34. The clock rate is 1 GHz.



FIGURE 5.34 Signal for Exercise 5.4

- 7 Derive the switching probabilities in Table 5.1.
- 9 Construct a table similar to Table 5.2 for a 2-input NOR gate.
- 10 Design a header switch for a power gating circuit in a 65 nm process. Suppose the pMOS transistor has an ON resistance of about 2.5 k $\Omega \cdot \mu$ m. The block being gated has an ON current of 100 mA. How wide must the header transistor be to cause less than a 2% increase in delay?

## Solutions

- 1  $P = \alpha C V_{DD}^2 f = 0.1 \times (450 \times 10^{-12} \times 70) \times (0.9)^2 \times (450 \times 10^{6}) \approx 1.15$ W.
- 2 Dynamic power consumption will go down because it is quadratically dependent on V<sub>DD</sub>. Static power will go up because subthreshold leakage is exponentially dependent on V<sub>t</sub>.
- 3 Simplify using  $V_{DD} \gg v_T$, where $x$ is the voltage at the intermediate node of the stack:

$$\begin{split} I_{1} &= I_{ds0}\, e^{\frac{-V_{t}}{v_{T}}} \left[ 1 - e^{\frac{-V_{DD}}{v_{T}}} \right] \approx I_{ds0}\, e^{\frac{-V_{t}}{v_{T}}} \\ I_{2} &= I_{ds0}\, e^{\frac{-V_{t}}{v_{T}}} \left[ 1 - e^{\frac{-x}{v_{T}}} \right] = I_{ds0}\, e^{\frac{-V_{t}-x}{v_{T}}} \left[ 1 - e^{\frac{-(V_{DD}-x)}{v_{T}}} \right] \\ I_{2} &\approx I_{1} \left[ 1 - e^{\frac{-x}{v_{T}}} \right] = I_{1}\, e^{\frac{-x}{v_{T}}} \\ 1 - e^{\frac{-x}{v_{T}}} &= e^{\frac{-x}{v_{T}}} \Rightarrow e^{\frac{-x}{v_{T}}} = \frac{1}{2} \Rightarrow I_{2} / I_{1} = 1/2 \end{split}$$
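The result can also be checked numerically without the $V_{DD} \gg v_T$ approximation. The sketch below solves for the intermediate voltage x by bisection, using the subthreshold expression with $\eta = 0$, $\gamma = 0$, n = 1. The supply, threshold, and thermal voltages are assumed example values, and the function names are hypothetical.

```python
import math

V_THERMAL = 0.026   # thermal voltage in volts (assumed, room temperature)
VDD = 1.0           # supply voltage in volts (assumed)
V_TH = 0.3          # threshold voltage in volts (assumed; cancels in the ratio)

def i_sub(vgs, vds, ids0=1.0):
    """Subthreshold current with eta = 0, gamma = 0, n = 1."""
    return ids0 * math.exp((vgs - V_TH) / V_THERMAL) * (1.0 - math.exp(-vds / V_THERMAL))

def stack_node_voltage(tol=1e-12):
    """Bisection for the intermediate voltage x where both device currents match."""
    lo, hi = 0.0, VDD
    while hi - lo > tol:
        x = 0.5 * (lo + hi)
        bottom = i_sub(0.0, x)        # lower device: Vgs = 0, Vds = x
        top = i_sub(-x, VDD - x)      # upper device: Vgs = -x, Vds = VDD - x
        if bottom < top:
            lo = x
        else:
            hi = x
    return 0.5 * (lo + hi)

x = stack_node_voltage()
ratio = i_sub(0.0, x) / i_sub(0.0, VDD)   # stack current over single-device I_off
print(x / V_THERMAL, ratio)
```

Under these assumptions x settles near $v_T \ln 2$ and the current ratio is very close to 1/2.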

- 4 The signal makes 4 transitions in 10 cycles, so the activity factor is (1/2)(4/10) = 0.2.
- 7 AND2: Y = 1 when A = 1 and B = 1

AND3: Y = 1 when A, B, and C all are 1

OR2: Y = 1 unless A = 0 and B = 0

NAND2: Y = 1 unless A = 1 and B = 1

NOR2: Y = 1 when A = 0 and B = 0

XOR2: Y = 1 when A = 1 and B = 0 or when A = 0 and B = 1
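These conditions translate directly into output probabilities by summing over the truth table. The sketch below assumes statistically independent inputs and uncorrelated successive cycles, so that the activity factor is $P(Y=0) \cdot P(Y=1)$; the dictionary and function names are illustrative.

```python
from itertools import product

GATES = {
    "AND2": lambda a, b: a and b,
    "AND3": lambda a, b, c: a and b and c,
    "OR2": lambda a, b: a or b,
    "NAND2": lambda a, b: not (a and b),
    "NOR2": lambda a, b: not (a or b),
    "XOR2": lambda a, b: a != b,
}

def output_probability(gate, input_probs):
    """P(Y = 1) for independent inputs, summed over all input combinations."""
    func = GATES[gate]
    total = 0.0
    for bits in product((0, 1), repeat=len(input_probs)):
        weight = 1.0
        for bit, p in zip(bits, input_probs):
            weight *= p if bit else 1.0 - p
        if func(*bits):
            total += weight
    return total

def activity_factor(gate, input_probs):
    """alpha = P(Y = 0) * P(Y = 1), assuming uncorrelated cycles."""
    p1 = output_probability(gate, input_probs)
    return (1.0 - p1) * p1

for gate in GATES:
    n_inputs = 3 if gate == "AND3" else 2
    print(gate, output_probability(gate, [0.5] * n_inputs))
```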

- 9 Gate leakage through an ON nMOS transistor is 6.3 nA and through an ON pMOS transistor is negligible. Subthreshold leakage through each OFF nMOS transistor is 5.6 nA. Subthreshold leakage through a single pMOS transistor is 9.3 nA.

Table 2: NOR leakage

| State (AB) | $I_{sub}$ (nA)                                 | $I_{gate}$ (nA)  | $I_{total}$ (nA) |
|------------|------------------------------------------------|------------------|------------------|
| 00         | 5.6 * 2 (2 nMOS)                               | 0                | 11.2             |
| 01         | 9.3 (pMOS)                                     | 6.3 (1 nMOS)     | 15.6             |
| 10         | < 9.3 (pMOS with intermediate node at $V_t$)   | 6.3 (1 nMOS)     | ~ 12             |
| 11         | << 9.3 (stack effect with two OFF pMOS)        | 6.3 * 2 (2 nMOS) | ~ 13             |


# Chapter 6 Some Selected Problems

1. [E, None, 4.2] Implement the equation $X = ((\overline{A} + \overline{B})(\overline{C} + \overline{D} + \overline{E}) + \overline{F})\,\overline{G}$ using complementary CMOS. Size the devices so that the output resistance is the same as that of an inverter with an NMOS W/L = 2 and PMOS W/L = 6. Which input pattern(s) would give the worst and best equivalent pull-up or pull-down resistance?

#### Solution

Rewriting the output expression in the form $X = ((\overline{A} + \overline{B})(\overline{C} + \overline{D} + \overline{E}) + \overline{F})\,\overline{G} = \overline{((AB + CDE)F) + G}$ allows us to build the pulldown network by inspection (parallel devices implement an OR, and series devices implement an AND). The pullup network is the dual of the pulldown network.



The plot shows sizes that meet the requirement - in the worst case, the output resistance of the circuit matches the output resistance of an inverter with NMOS W/L=2 and PMOS W/L=6.

The worst case pull-up resistance occurs whenever a single path exists from the output node to Vdd. Examples of vectors for the worst case are ABCDEFG=1111100 and 0101110. The best case pull-up resistance occurs when ABCDEFG=0000000.

The worst case pull-down resistance occurs whenever a single path exists from the output node to GND. Examples of vectors for the worst case are ABCDEFG=0000001 and 0011110.

The best case pull-down resistance occurs when ABCDEFG=1111111.
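The example vectors can be checked by counting conducting series paths in each network. The sketch below takes the pull-down topology ((AB + CDE)F) + G from the solution and its dual as the pull-up. The path lists and function name are written out by hand for illustration, and the path count is only a proxy for resistance, since the actual value depends on device sizing.

```python
# Series paths of the pull-down network ((AB + CDE)F) + G.
# An nMOS path conducts when every listed input is 1.
PULL_DOWN_PATHS = [("G",), ("A", "B", "F"), ("C", "D", "E", "F")]

# Dual pull-up network: G in series with (F in parallel with (A||B)-(C||D||E)).
# A pMOS path conducts when every listed input is 0.
PULL_UP_PATHS = [("F", "G")] + [(x, y, "G") for x in "AB" for y in "CDE"]

def conducting_paths(vector, network):
    """Count conducting series paths for an input vector given as 'ABCDEFG' bits."""
    values = dict(zip("ABCDEFG", (int(ch) for ch in vector)))
    if network == "down":
        return sum(all(values[n] == 1 for n in path) for path in PULL_DOWN_PATHS)
    if network == "up":
        return sum(all(values[n] == 0 for n in path) for path in PULL_UP_PATHS)
    raise ValueError("network must be 'down' or 'up'")

for vec, net in [("1111100", "up"), ("0101110", "up"), ("0000000", "up"),
                 ("0000001", "down"), ("0011110", "down"), ("1111111", "down")]:
    print(vec, net, conducting_paths(vec, net))
```

The worst-case vectors each leave exactly one conducting path, while the best-case vectors turn on every path in the corresponding network.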

2. Implement the following expression in a full static CMOS logic fashion using no more than 10 transistors:

$$\overline{Y} = (A \cdot B) + (A \cdot C \cdot E) + (D \cdot E) + (D \cdot C \cdot B)$$

**Solution** 

Chapter 6 Problem Set

The circuit is given in the next figure.



3. Consider the circuit of Figure 6.1.



Figure 6.1 CMOS combinational logic gate.

**a.** What is the logic function implemented by the CMOS transistor network? Size the NMOS and PMOS devices so that the output resistance is the same as that of an inverter with an NMOS W/L = 4 and PMOS W/L = 8.

### **Solution**

The logic function is $Y = \overline{(A+B)CD}$. The transistor sizes are given in the figure above.

**b.** What are the input patterns that give the worst case  $t_{pHL}$  and  $t_{pLH}$? State clearly what the initial input patterns are and which input(s) must make a transition in order to achieve this maximum propagation delay. Consider the effect of the capacitances at the internal nodes.

### Solution

The worst case  $t_{pHL}$  happens when the internal node capacitances (Cx2 and Cx3) are charged before the high to low transition. The initial states that can cause this are: ABCD=[1010, 1110, 0110]. The final state is one of: ABCD=[1011, 0111].

The worst case  $t_{pLH}$  happens when Cx1 is charged before the low to high transition. The input pattern that can cause this is: ABCD=[0111] => [0011].

**c.** Verify part (b) with SPICE. Assume all transistors have minimum gate length (0.25μm).

### **Solution**

The two cases are shown below.



Figure 6.2 Best and worst t<sub>pHL</sub>.



Figure 6.3 Best and worst t<sub>pLH</sub>.

d. If P(A=1)=0.5, P(B=1)=0.2, P(C=1)=0.3 and P(D=1)=1, determine the power dissipation in the logic gate. Assume  $V_{DD}$ =2.5V,  $C_{out}$ =30fF and  $f_{clk}$ =250MHz.

### **Solution**

Since D is always 1, the circuit implements the function  $Y = \overline{(A+B)C}$. Assuming independent inputs:

$$\begin{split} P_{(A+B)=0} &= P_{A=0} \cdot P_{B=0} = 0.5 \times (1 - 0.2) = 0.4 \\ P_{(A+B)=1} &= 1 - 0.4 = 0.6 \\ P_{Y=0} &= P_{(A+B)=1} \cdot P_{C=1} = 0.6 \times 0.3 = 0.18 \\ P_{Y=1} &= 1 - 0.18 = 0.82 \\ P_{Y=0 \rightarrow 1} &= 0.18 \times 0.82 = 0.1476 \end{split}$$

So $P_{dyn} = P_{Y=0 \rightarrow 1}\, C_{out} V_{DD}^2 f_{clk} = (0.1476)(30 \times 10^{-15})(2.5^2)(250 \times 10^{6}) = 6.92\ \mu\text{W}$.
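The same numbers can be obtained by brute-force enumeration of the truth table of $Y = \overline{(A+B)CD}$. This is a minimal sketch that assumes independent inputs and uncorrelated cycles, as the hand calculation does; the variable and function names are illustrative.

```python
from itertools import product

P_ONE = {"A": 0.5, "B": 0.2, "C": 0.3, "D": 1.0}   # P(input = 1) from the problem
VDD, C_OUT, F_CLK = 2.5, 30e-15, 250e6             # volts, farads, hertz

def gate_output(a, b, c, d):
    """Y = NOT((A + B) * C * D)."""
    return int(not ((a or b) and c and d))

def prob_output_high():
    """P(Y = 1), summing the probability of every input combination with Y = 1."""
    total = 0.0
    for bits in product((0, 1), repeat=4):
        weight = 1.0
        for name, bit in zip("ABCD", bits):
            weight *= P_ONE[name] if bit else 1.0 - P_ONE[name]
        total += weight * gate_output(*bits)
    return total

p_high = prob_output_high()
alpha = (1.0 - p_high) * p_high            # probability of a 0 -> 1 transition
p_dyn = alpha * C_OUT * VDD**2 * F_CLK     # dynamic power in watts
print(p_high, alpha, p_dyn)
```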

### 4. [M, None, 4.2] CMOS Logic

a. Do the following two circuits (Figure 6.4) implement the same logic function? If yes, what is that logic function? If no, give Boolean expressions for both circuits.


### Solution

Yes, they implement the same logic function:  $F = \overline{ABCD + E} = (\overline{A} + \overline{B} + \overline{C} + \overline{D}) \cdot \overline{E}$
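By De Morgan's theorem, the complement of ABCD + E equals $(\overline{A} + \overline{B} + \overline{C} + \overline{D}) \cdot \overline{E}$. A brute-force truth-table comparison of these two forms confirms the identity; the function names below are hypothetical and the check covers only the Boolean expressions, not the circuits in the figure.

```python
from itertools import product

def f_left(a, b, c, d, e):
    """NOT(ABCD + E)."""
    return not ((a and b and c and d) or e)

def f_right(a, b, c, d, e):
    """(A' + B' + C' + D') * E'."""
    return ((not a) or (not b) or (not c) or (not d)) and (not e)

# Compare the two forms on all 32 input combinations.
equivalent = all(
    bool(f_left(*bits)) == bool(f_right(*bits))
    for bits in product((False, True), repeat=5)
)
print(equivalent)
```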

b. Will these two circuits' output resistances always be equal to each other?

### Solution

No

**c.** Will these two circuits' rise and fall times always be equal to each other? Why or why not?

### Solution

No. Circuit B appears optimized for the case where the transistor with input E is on the critical path, since it is closer to the output node than in circuit A. Therefore, if input E arrives later, circuit B will be faster than circuit A, since the internal node will already be charged and only the output capacitance needs to be switched. Even if we assume that all inputs arrive at the same time, however, the two circuits' rise and fall times will not be equal to each other. Consider an input combination where E, A, B, C, and D are all low. Circuit A has only one body-affected device while circuit B has four. Since the associated rise in  $V_t$  and fall in output resistance affects only one resistor in circuit A, but four parallel resistors in circuit B, we expect a difference in the timing waveforms.



5. [E, None, 4.2] The transistors in the circuits of the preceding problem have been sized to give an output resistance of  $13 \, k\Omega$  for the worst-case input pattern. This output resistance can vary, however, if other patterns are applied.

**a.** What input patterns (*A*–*E*) give the lowest output resistance when the output is low? What is the value of that resistance?

### **Solution**

The lowest output resistance is obtained when all inputs (A, B, C, D and E) are equal to 1. In that case, the output resistance is the parallel combination of a single nMOS of width 1 and a series stack of four equal nMOS devices of width 4. Each branch has the same resistance, equal to the worst-case output resistance of  $13 \, k\Omega$, so the output resistance in this case is half this value,  $6.5 \, k\Omega$.

**b.** What input patterns (*A*–*E*) give the lowest output resistance when the output is high? What is the value of that resistance?