**The Factors That Limit Time Resolution in Photodetectors** 

# **Dependence on Feature Size**

## Gary Drake Argonne National Laboratory

University of Chicago Workshop Apr. 29, 2011

# Outline

- Basic Notation
- Survey of Feature Sizes
- How Feature Size Affects Speed: Scaling Laws
- Consequences of Deep Submicron Technologies
- Practical Considerations
- Summary

## **Basic Notations & Principles in CMOS Design**

#### Geometry

- **W** = Gate width
- **L** = Gate length
- **t<sub>ox</sub>** = Thickness of gate oxide
- **W/L** defines transistor size
- **L<sub>min</sub>** = Minimum feature size

#### Semiconductor Construction

- **N-type**: mobile charge carriers are electrons (<u>n</u>egative charge)
- **P-type**: mobile charge carriers are holes (<u>p</u>ositive charge)
- **Doping**: add impurities to pure silicon to make the material N-type or P-type
  - N<sub>a</sub>: density of acceptor atoms → holes
  - N<sub>d</sub>: density of donor atoms → electrons
- **NMOS** or N-channel: carriers are electrons
  - Conductive channel is N-type
- **PMOS** or P-channel: carriers are holes
  - Conductive channel is P-type
- **Source & Drain**: doping implants into the substrate create n<sup>+</sup> & p<sup>+</sup> regions
- **Oxide Layer**: the gate is isolated from the substrate, which makes it high impedance, except for leakage



#### Operation

- Gate-source voltage V<sub>GS</sub> creates a conduction channel under the gate, allowing current to flow between drain & source
- The minimum gate voltage needed to create the channel is called the threshold voltage V<sub>TH</sub>

## Survey of Feature Sizes or Size <u>DOES</u> Matter!





**Big is Good?** 



**Big is Good!** 



**Big is Good!** 



Small is Good. But not too small...



**Big is Good?** 

## **Survey of Feature Sizes** IBM & TSMC CMOS Processes Offered Through MOSIS Today



# **The Basic Question:**

## How to Achieve Faster (Higher BW) ASICs?

## The Basic Answer:

• General trend:

#### The smaller the feature size, the faster the chip can operate.

- Why? A few high-level reasons:
  - Smaller size = <u>shorter distance</u> that signals need to propagate
  - Smaller size = generally <u>lower parasitic capacitance</u> (But watch out for larger capacitance/unit area...)
  - Smaller size  $\rightarrow$  <u>use lower voltage rails</u>  $\rightarrow$  reach logic levels faster



# The Basic Question:

## How to Achieve Faster (Higher BW) ASICs? (Cont.)

#### Measurements Tell the Tale:

- One measure of speed in CMOS: Ring Oscillator Frequency
  - Basic test structure used by MOSIS to measure fabrication run acceptance
  - Generally configured as an odd number of inverters, using minimum-sized devices
  - F<sub>0</sub> = 1/(2N × Inverter\_delay), where N = number of inverters (MOSIS uses 31 stages)
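The ring-oscillator relation F<sub>0</sub> = 1/(2N × Inverter\_delay) can be turned into a quick calculation. The following is a minimal sketch, not the MOSIS acceptance procedure: the function names and the per-stage delays in the loop are hypothetical values chosen only for illustration. Only the formula and the 31-stage count come from the slide.

```python
def ring_oscillator_frequency(n_stages, inverter_delay_s):
    """Return F0 = 1 / (2 * N * inverter_delay) in Hz."""
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of stages (>= 3)")
    if inverter_delay_s <= 0:
        raise ValueError("inverter delay must be positive")
    return 1.0 / (2.0 * n_stages * inverter_delay_s)


def inverter_delay_from_frequency(n_stages, f0_hz):
    """Invert the same relation: per-stage delay implied by a measured F0."""
    if f0_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / (2.0 * n_stages * f0_hz)


if __name__ == "__main__":
    N = 31  # stage count quoted for the MOSIS test structure
    # Hypothetical per-stage delays, for illustration only.
    for delay_ps in (100.0, 50.0, 20.0):
        f0 = ring_oscillator_frequency(N, delay_ps * 1e-12)
        print(f"{delay_ps:5.0f} ps/stage -> F0 = {f0 / 1e6:7.1f} MHz")
```

Because the frequency is inversely proportional to the stage delay, a measured F<sub>0</sub> gives a direct per-inverter speed figure for comparing processes.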



# **How Feature Size Affects Speed**

## Robert Dennard's Scaling Law (1974)

• If the physical parameters of an integrated circuit are scaled equally by a factor K, then the performance parameters scale as follows:

| Parameter | Symbol | Scaling | Derivation |
|---|---|---|---|
| Geometry & supply voltage | L<sub>g</sub>, W<sub>g</sub>, T<sub>ox</sub>, V<sub>dd</sub> | K | Scaling factor K: K = 0.7, for example |
| Drive current in saturation | I<sub>d</sub> | K | $I_d = v_{sat} W_g C_o (V_g - V_{th})$, with $C_o$ the gate capacitance per unit area; $W_g\, t_{ox}^{-1} (V_g - V_{th}) \rightarrow K \cdot K^{-1} \cdot K = K$ |
| I<sub>d</sub> per unit W<sub>g</sub> | I<sub>d</sub>/μm | 1 | $I_d / W_g \rightarrow K / K = 1$ |
| Gate capacitance | C<sub>g</sub> | K | $C_g = \varepsilon_o \varepsilon_{ox} L_g W_g / t_{ox} \rightarrow K \cdot K / K = K$ |
| Switching speed | τ | K | $\tau = C_g V_{dd} / I_d \rightarrow K \cdot K / K = K$ |
| Clock frequency | f | 1/K | $f = 1/\tau \rightarrow 1/K$ |
| Chip area | A<sub>chip</sub> | α | α: scaling factor; in the past, α > 1 for most cases |
| Integration (# of transistors) | N | α/K<sup>2</sup> | $N \rightarrow \alpha/K^2 = 1/K^2$ when α = 1 |
| Power per chip | P | α | $fNCV^2/2 \rightarrow K^{-1}(\alpha K^{-2})K(K^{1})^2 = \alpha$; equals 1 when α = 1 |

Slide courtesy of Hiroshi Iwai, Tokyo Institute of Technology

| Device or Circuit Parameter | Scaling Factor |
|---|---|
| Device dimension t<sub>ox</sub>, L, W | K |
| Doping concentration N<sub>a</sub> | 1/K |
| Voltage V | K |
| Current I | K |
| Capacitance εA/t | K |
| Delay time per circuit VC/I | K |
| Power dissipation per circuit VI | K<sup>2</sup> |
| Power density VI/A | 1 |

Table courtesy of Mark Bohr, IBM
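The scaling relations in the table can be checked numerically by propagating a single factor K through the defining formulas. The sketch below is an illustration under the ideal constant-field assumptions of the table, not a device model; the function name `dennard_scaling` and the dictionary keys are hypothetical.

```python
def dennard_scaling(k, alpha=1.0):
    """Return multiplicative scaling factors for a dimension/voltage scale K.

    k     : factor applied to L, W, t_ox and V_dd (K < 1 shrinks the device)
    alpha : chip-area scaling factor (1.0 keeps the die size fixed)
    """
    if k <= 0 or alpha <= 0:
        raise ValueError("K and alpha must be positive")
    dimension = k                                        # L, W, t_ox
    voltage = k                                          # V_dd, V_g - V_th
    oxide_cap_per_area = 1.0 / dimension                 # C_o ~ 1 / t_ox
    current = dimension * oxide_cap_per_area * voltage   # I_d ~ W * C_o * V
    gate_cap = dimension * dimension / dimension         # C_g ~ L * W / t_ox
    delay = gate_cap * voltage / current                 # tau = C * V / I
    frequency = 1.0 / delay
    transistors = alpha / dimension**2                   # N ~ area / (L * W)
    power_per_circuit = voltage * current
    power_density = power_per_circuit / dimension**2
    power_per_chip = frequency * transistors * gate_cap * voltage**2
    return {
        "current": current,
        "gate_cap": gate_cap,
        "delay": delay,
        "frequency": frequency,
        "transistors": transistors,
        "power_per_circuit": power_per_circuit,
        "power_density": power_density,
        "power_per_chip": power_per_chip,
    }


if __name__ == "__main__":
    for name, factor in dennard_scaling(0.7).items():
        print(f"{name:18s} x {factor:.3f}")
```

With K = 0.7 and α = 1 the delay shrinks by K while power density and power per chip stay at 1, which is the property that made constant-field scaling attractive.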

- Dennard's Scaling Law & Moore's Law
  - Gordon Moore's Law: "The number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years."
     → Basis for Industry Roadmap (ITRS)
  - Dennard: If physical dimensions are scaled by K, then the area of a given circuit scales as K<sup>2</sup>
    - For K ~ 0.7 ~ 1/√2, K<sup>2</sup> ~ ½ → 2× more transistors for a given die size
  - Industry: Ratios between successive technology steps: 350/500 = 0.7; 250/350 = 0.71; 90/130 = 0.69; 65/90 = 0.722
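The node ratios quoted above, and the transistor-density gain 1/K<sup>2</sup> that each one implies, can be reproduced with a few lines. This is a small sketch; `node_step` and `STEPS_NM` are hypothetical names, and the node pairs are the ones listed on the slide.

```python
def node_step(old_nm, new_nm):
    """Return (K, density gain) for one technology step.

    K is the linear shrink factor; 1/K^2 is the ideal increase in
    transistor count for a fixed die area.
    """
    if old_nm <= 0 or new_nm <= 0:
        raise ValueError("feature sizes must be positive")
    k = new_nm / old_nm
    return k, 1.0 / k**2


# Successive-node pairs quoted on the slide, in nm.
STEPS_NM = [(500, 350), (350, 250), (130, 90), (90, 65)]

if __name__ == "__main__":
    for old, new in STEPS_NM:
        k, gain = node_step(old, new)
        print(f"{old:3d} nm -> {new:3d} nm: K = {k:.3f}, density x {gain:.2f}")
```

Each step lands close to K = 1/√2, so each technology generation roughly doubles the transistor count per unit area.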



Moore's Law will come to an end in ~2014, when the feature size approaches a few atomic layers.

Plots Courtesy of Mark Bohr, IBM

#### Examining Speed versus Technology

• A technical answer from semiconductor physics (Courtesy of Willy Sansen):





## Examining Speed versus Technology (Cont.)

• Unity Gain Frequency:

$$f_{\rm T} = \frac{\mu}{2\pi L^2} \left( V_{\rm GS} - V_{\rm T} \right)$$

Critical Parameters:

- Mobility μ
- Channel length L
- Threshold voltage V<sub>T</sub>, through the overdrive $V_{GS}-V_{T}$

• Length L

- ⇒ Decreases as feature size decreases: L ∝ K scaling, for K < 1
- ⇒ Causes f<sub>T</sub> to increase as 1/K<sup>2</sup> for K < 1

• Carrier Mobility

- ⇒ $N_D \propto 1/K$ → doping increases for K < 1, which lowers mobility
- ⇒ f<sub>T</sub> decreases for K < 1, but slowly...

• $V_T$ & $(V_{GS} - V_T)$?

- Want $(V_{GS} - V_T)$ as large as possible
- Want $V_T$ as small as possible
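The dependence of f<sub>T</sub> on L and on the overdrive voltage can be explored numerically. The sketch below evaluates the long-channel expression given above. The mobility, channel length, and overdrive values are hypothetical, chosen only to show the trend, and the function name is not from the talk. The expression ignores velocity saturation, so absolute numbers for short channels should be read as optimistic.

```python
import math


def unity_gain_frequency(mobility_m2_per_vs, length_m, overdrive_v):
    """f_T = mu / (2 * pi * L^2) * (V_GS - V_T), returned in Hz."""
    if length_m <= 0:
        raise ValueError("channel length must be positive")
    return mobility_m2_per_vs / (2.0 * math.pi * length_m**2) * overdrive_v


if __name__ == "__main__":
    MU = 0.04          # m^2/(V*s), i.e. 400 cm^2/(V*s); assumed value
    L0 = 130e-9        # m; assumed starting channel length
    VOV0 = 0.2         # V; assumed overdrive V_GS - V_T
    K = 0.7

    base = unity_gain_frequency(MU, L0, VOV0)
    only_length = unity_gain_frequency(MU, K * L0, VOV0)
    length_and_voltage = unity_gain_frequency(MU, K * L0, K * VOV0)

    print(f"baseline f_T:               {base / 1e9:.1f} GHz")
    print(f"L scaled by K:              x {only_length / base:.2f}")
    print(f"L and overdrive scaled by K: x {length_and_voltage / base:.2f}")
```

Scaling only L improves f<sub>T</sub> by 1/K<sup>2</sup>. If the overdrive also shrinks with K, as it does when the voltages follow the supply, the net gain drops to about 1/K.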

#### Examining Speed versus Technology (Cont.)

- Generally, threshold voltage V<sub>TH</sub> gets smaller with feature size
  - Partly a consequence of decreasing V<sub>dd</sub>
  - Partly a consequence of increasing  $C_{OX}$  (aF/ $\mu$ m<sup>2</sup> gate area)
- But we are approaching a limit with submicron technologies... Why?

$$V_{\rm th} = V_{\rm fb} + 2\psi_B + \frac{\sqrt{2\varepsilon_{\rm si}qN_a\left(2\psi_B + V_{\rm bs}\right)}}{C_{\rm ox}}$$

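where $V_{fb}$ is the flatband voltage, $\psi_B = v_T \ln(N_a/n_i)$ is the difference between the Fermi potential and the intrinsic potential in the substrate, with $v_T = kT/q$, $V_{bs}$ is the substrate (body-to-source) bias voltage, $C_{ox}$ is the gate oxide capacitance per unit area, $N_a$ is the acceptor doping density, $\varepsilon_{si}$ is the permittivity of silicon, and $q$ is the electron charge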
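The threshold-voltage expression above can be evaluated directly to see how oxide thickness and doping pull in different directions. The following is a sketch under stated assumptions: the bulk potential is taken as ψ<sub>B</sub> = v<sub>T</sub> ln(N<sub>a</sub>/n<sub>i</sub>), the intrinsic carrier density and relative permittivities are the usual textbook values for silicon and SiO<sub>2</sub> near 300 K, and the flatband voltage, doping, and oxide thicknesses in the example are hypothetical.

```python
import math

Q = 1.602176634e-19        # electron charge, C
K_B = 1.380649e-23         # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
EPS_SI = 11.7 * EPS0       # silicon permittivity (textbook value)
EPS_OX = 3.9 * EPS0        # SiO2 permittivity (textbook value)
N_I = 1.0e16               # intrinsic carrier density, m^-3 (approximate, ~300 K)


def thermal_voltage(temp_k=300.0):
    """v_T = kT/q in volts."""
    return K_B * temp_k / Q


def bulk_potential(n_a_m3, temp_k=300.0):
    """psi_B = v_T * ln(N_a / n_i); assumed textbook relation."""
    return thermal_voltage(temp_k) * math.log(n_a_m3 / N_I)


def threshold_voltage(v_fb, n_a_m3, t_ox_m, v_bs=0.0, temp_k=300.0):
    """V_th = V_fb + 2*psi_B + sqrt(2*eps_si*q*N_a*(2*psi_B + V_bs)) / C_ox."""
    psi_b = bulk_potential(n_a_m3, temp_k)
    c_ox = EPS_OX / t_ox_m                      # F/m^2
    depletion_charge = math.sqrt(2.0 * EPS_SI * Q * n_a_m3 * (2.0 * psi_b + v_bs))
    return v_fb + 2.0 * psi_b + depletion_charge / c_ox


if __name__ == "__main__":
    V_FB = -0.9     # V, hypothetical flatband voltage
    N_A = 1.0e23    # m^-3 (1e17 cm^-3), hypothetical doping
    for t_ox_nm in (8.0, 4.0, 2.0):
        v_th = threshold_voltage(V_FB, N_A, t_ox_nm * 1e-9)
        print(f"t_ox = {t_ox_nm:3.0f} nm -> V_th = {v_th:.3f} V")
```

Thinner oxide raises C<sub>ox</sub> and lowers the last term, while the 2ψ<sub>B</sub> term grows only logarithmically with doping and does not shrink, which is one way to see why V<sub>th</sub> cannot keep scaling with K.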

[Plot, horizontal axis: process. Data from MOSIS]

## Examining Speed versus Technology (Cont.)

- $V_{dd}$  scales with K, so  $V_{dd}$  decreases with smaller feature size
  - Difficulty in Down-scaling of Supply Voltage: Vdd



Slide Courtesy of Hiroshi Iwai, Tokyo Institute of Technology

#### Examining Speed versus Technology (Cont.)

• Bottom Line - Unity Gain Frequency of FETs:



# So, f<sub>T</sub> ∝ 1/K (approximate), i.e., f<sub>T</sub> increases as feature size decreases

## **Consequences of Deep Submicron Technologies**

## Scaling laws worked well down to ~100 nm

- Performance generally well-predicted > 100 nm
- Below 100 nm, start to see second-order and higher-order effects that affect performance
  - Especially for analog or mixed-signal ICs

## Direct Performance Effects

- Increased leakage currents
  - Increased power consumption
- Transistor size mismatch



#### Other Performance Factors → Especially for Analog

- Reduced voltage rails
- Decreased signal-to-noise
- Decreased dynamic range
- Increased power density

Graphic Courtesy of Kaushik Roy, Purdue University

#### Leakage Currents

- Gate oxide tunneling leakage (I<sub>G</sub>)
  - Thin gate oxides allow electron tunneling from gate to substrate
  - Major source of leakage current in sub-micron CMOS
  - The thinner the oxide, the worse the leakage current, i.e. the smaller the feature size, the worse the leakage current, since t<sub>ox</sub> scales





#### Leakage Currents (Cont.)

- Subthreshold leakage (I<sub>SUB</sub>)
  - Drain-source current during weak inversion, V<sub>GS</sub> < V<sub>TH</sub>
  - Dominated by diffusion current of minority carriers, rather than drift current that dominates in strong inversion
  - Carriers move along surface
  - Very sensitive to process parameters, device size, supply voltage, and temperature

$$I_{\rm ds} = \mu_0 C_{\rm ox} \frac{W}{L} (m-1) (v_T)^2 \times e^{(V_g - V_{\rm th})/mv_T} \times \left(1 - e^{-v_{\rm DS}/v_T}\right)$$

where

$$m = 1 + \frac{C_{\rm dm}}{C_{\rm ox}} = 1 + \frac{\frac{\varepsilon_{\rm si}}{W_{\rm dm}}}{\frac{\varepsilon_{\rm ox}}{t_{\rm ox}}} = 1 + \frac{3t_{\rm ox}}{W_{\rm dm}}$$

where $V_{th}$ is the threshold voltage, $v_T = kT/q$ is the thermal voltage, $C_{ox}$ is the gate oxide capacitance per unit area, $\mu_0$ is the zero-bias mobility, and $W_{dm}$ is the maximum depletion layer width
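The exponential sensitivity of subthreshold leakage to the threshold voltage is easier to appreciate with numbers. The sketch below implements the two expressions above. The function names, the device values in the example, and the derived helper `subthreshold_swing` (the gate-voltage change per decade of current, m·v<sub>T</sub>·ln 10, which follows from the exponent) are illustrative assumptions, not values from the talk.

```python
import math


def body_factor(t_ox_m, w_dm_m):
    """m = 1 + 3 * t_ox / W_dm (uses eps_si / eps_ox ~ 3)."""
    return 1.0 + 3.0 * t_ox_m / w_dm_m


def subthreshold_current(mu0, c_ox, w_over_l, m, v_g, v_th, v_ds, v_t=0.02585):
    """I_ds in weak inversion, following the expression above (SI units)."""
    prefactor = mu0 * c_ox * w_over_l * (m - 1.0) * v_t**2
    gate_term = math.exp((v_g - v_th) / (m * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    return prefactor * gate_term * drain_term


def subthreshold_swing(m, v_t=0.02585):
    """Gate-voltage change (V) that changes I_ds by one decade."""
    return m * v_t * math.log(10.0)


if __name__ == "__main__":
    # Hypothetical device values, for illustration only.
    m = body_factor(t_ox_m=2e-9, w_dm_m=30e-9)
    c_ox = 3.9 * 8.854e-12 / 2e-9      # F/m^2 for a 2 nm oxide
    for v_th in (0.40, 0.30, 0.20):
        i_off = subthreshold_current(0.04, c_ox, 10.0, m, 0.0, v_th, 1.0)
        print(f"V_th = {v_th:.2f} V -> I_off = {i_off:.3e} A")
    print(f"swing = {subthreshold_swing(m) * 1e3:.1f} mV/decade")
```

Every reduction of V<sub>th</sub> by one swing multiplies the off-state current by ten, which is why lowering the threshold to recover speed comes at a steep leakage cost.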



Graphics Courtesy of Kaushik Roy, Purdue University

## Leakage Currents (Cont.)

- PN Junction Reverse-Bias Leakage (I<sub>REV</sub>)
  - 2 mechanisms
    - Minority-carrier drift/diffusion
    - Electron-hole generation
  - Occurs when P & N junctions are heavily doped
    - Doping (N<sub>a</sub> & N<sub>d</sub>) increases as feature size decreases
       → Leakage increases as feature size decreases
    - Causes band-to-band tunneling (BTBT) due to electric field from the doping

$$E = \sqrt{\frac{2qN_aN_d(V_{app} + V_{bi})}{\varepsilon_{si}(N_a + N_d)}}$$
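The junction-field expression above shows directly why heavier doping makes band-to-band tunneling worse. The following is a small sketch: the function name is hypothetical, the silicon permittivity is the usual textbook value, and the doping levels and voltages in the example are illustrative assumptions.

```python
import math

Q = 1.602176634e-19            # electron charge, C
EPS_SI = 11.7 * 8.8541878128e-12   # silicon permittivity, F/m (textbook value)


def junction_field(n_a_m3, n_d_m3, v_app, v_bi):
    """E = sqrt(2*q*N_a*N_d*(V_app + V_bi) / (eps_si*(N_a + N_d))), in V/m."""
    if n_a_m3 <= 0 or n_d_m3 <= 0:
        raise ValueError("doping densities must be positive")
    if v_app + v_bi < 0:
        raise ValueError("V_app + V_bi must be non-negative")
    numerator = 2.0 * Q * n_a_m3 * n_d_m3 * (v_app + v_bi)
    return math.sqrt(numerator / (EPS_SI * (n_a_m3 + n_d_m3)))


if __name__ == "__main__":
    V_APP, V_BI = 1.0, 0.9     # V, hypothetical reverse bias and built-in voltage
    for doping_cm3 in (1e17, 1e18, 1e19):
        n = doping_cm3 * 1e6   # convert cm^-3 to m^-3
        e_field = junction_field(n, n, V_APP, V_BI)
        print(f"N_a = N_d = {doping_cm3:.0e} cm^-3 -> E = {e_field / 1e8:.2f} MV/cm")
```

At fixed voltage the field grows as the square root of the doping, so the 1/K increase in doping with each generation steadily pushes the junction toward the tunneling regime.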



Graphics Courtesy of Kaushik Roy, Purdue University

[Figure: transistor cross-section in the OFF state; labels: gate, source, drain, n, p-well. Modified graphic courtesy of Kaushik Roy, Purdue University]

## Leakage Currents (Cont.)

- Gate Induced Drain Leakage (I<sub>GIDL</sub>)
  - Caused by narrowing of depletion region around drain when FET is off
    - Due to electric field between gate and drain
  - Get tunneling of minority carriers from drain to substrate
  - Strong function of
    - Doping profile at drain edge
    - Oxide thickness



#### Matching Problems

- Generally, device matching becomes worse as feature size decreases
  - Limitations of lithography & processing



Courtesy of Willy Sansen

- ⇒ Slope is gentle ...
- ⇒ Many designs may not be affected ...

## **Other Consequences of Deep Submicron Technologies**

#### Reduced Headroom

- Circuits with more than ~2-3 transistors between V<sub>dd</sub> and V<sub>ss</sub> run out of headroom due to reduced V<sub>dd</sub> (scales with K…)

#### Reduced Signal-to-Noise

- Reduced voltage rails mean smaller signals
- Noise may not improve with smaller feature size
  - Depends on design...

#### Reduced Dynamic Range

- Also a consequence of reduced voltage rails and higher noise...

#### Increased Power Density

- Consequence of leakage currents

⇒ For digital, these are mostly OK

⇒ For analog, can be a problem...
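The headroom and dynamic-range arguments can be made concrete with simple arithmetic. The sketch below is a rough illustration, not a design rule: it assumes each stacked transistor needs a fixed saturation voltage, that the remaining voltage is the usable signal swing, and that the noise floor stays constant. The supply voltages, the 0.2 V saturation voltage, and the 1 mV noise level are hypothetical.

```python
import math


def remaining_headroom(v_dd, n_stacked, v_dsat):
    """Voltage left for signal swing after n stacked devices each take v_dsat."""
    return v_dd - n_stacked * v_dsat


def dynamic_range_db(max_signal_v, noise_rms_v):
    """Dynamic range as 20*log10(max signal / rms noise)."""
    if max_signal_v <= 0 or noise_rms_v <= 0:
        raise ValueError("signal and noise must be positive")
    return 20.0 * math.log10(max_signal_v / noise_rms_v)


if __name__ == "__main__":
    V_DSAT = 0.2      # V per stacked device, assumed
    NOISE = 1e-3      # V rms, assumed constant across processes
    N_STACKED = 3
    for v_dd in (3.3, 2.5, 1.8, 1.2, 1.0):   # illustrative supply voltages
        swing = remaining_headroom(v_dd, N_STACKED, V_DSAT)
        dr = dynamic_range_db(swing, NOISE)
        print(f"V_dd = {v_dd:.1f} V -> swing = {swing:.2f} V, DR = {dr:.1f} dB")
```

Under these assumptions each halving of the available swing costs about 6 dB of dynamic range, which digital circuits tolerate but analog front ends may not.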

# **Practical Considerations**

- Availability (through MOSIS)
  - Currently, technologies between 40-100 nm only offered by TSMC
    - Even then, models sparse or not available
  - IBM: Only trusted vendors below 130 nm



| Fab House | Feature<br>Size | Process      |
|-----------|-----------------|--------------|
| IBM       | 22 nm           |              |
| IBM       | 28 nm           |              |
| IBM       | 32 nm           | 32SOI        |
| TSMC      | 40 nm           | CLN40/CMN40  |
| IBM       | 45 nm           | 12SOI        |
| TSMC      | 45 nm           | CLN45/CMN45  |
| IBM       | 65 nm           | 10SF         |
| IBM       | 65 nm           | 10LPe/RFe    |
| TSMC      | 65 nm           | CLN65/CMN65  |
| TSMC      | 65 nm           | CMN65T       |
| IBM       | 90 nm           | 9SF          |
| IBM       | 90 nm           | 9LP/RF       |
| TSMC      | 90 nm           | CLN90/CMN90  |
| TSMC      | 90 nm           | CMN65T       |
| IBM       | 0.130 µm        | 8RF-LM       |
| IBM       | 0.130 µm        | 8RF-DM       |
| TSMC      | 0.130 µm        | CL013/CM013  |
| TSMC      | 0.130 µm        | CL013LP      |
| TSMC      | 0.130 µm        | CL013LV      |
| IBM       | 0.180 µm        | 7SF          |
| IBM       | 0.180 µm        | 7RF          |
| IBM       | 0.180 µm        | 7RFSOI       |
| IBM       | 0.180 µm        | 7HV          |
| IBM       | 0.250 µm        | 6RF          |
| IBM       | 0.350 µm        |              |
| IBM       | 0.500 µm        |              |

# **Practical Considerations (Cont.)**

#### Cost

- (Costs unknown for every technology at press time...)
- Generally, leading-edge technologies very expensive, even through MOSIS

#### Models

- Models are poor to non-existent for leading-edge technologies
- Customer support is poor for leading-edge technologies
- MOSIS data not available for leading-edge technologies

#### General Difficulty

• Expect long learning curve with the more advanced technologies

# Summary

- For high-speed and high-bandwidth, generally want technologies with smaller feature size
  - Smaller distances
  - Smaller parasitic capacitances
  - Smaller voltage swings
  - Intrinsically faster devices (from scaling law principles)

#### Below ~ 100 nm, start to have higher-order problems

- Leakage currents of various kinds can affect designs
  - Higher internal power
  - May significantly affect analog circuits, in particular sample & hold circuits
- Matching of transistors is worse
- Signal headroom is reduced
- Dynamic range is reduced

#### Prospects for near future

- Scaling laws have worked beautifully for 40 years, but we're reaching a hard limit
- Industry has kept pace, but will have difficulty below 10 nm
- Leading-edge technologies difficult to use right now, but this will improve
- Not clear if higher-order process problems can be solved for analog use...
- Perhaps clever circuit design can be employed, but this will take time...

#### ⇒ 130 nm technology is not a bad place to be right now for analog work...

# Bibliography

Roy, K., Mukhopadhyay, S., Mahmoodi-Meimand, H., "Leakage Current Mechanisms and Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits," Proceedings of the IEEE, Vol. 91, No. 2, February 2003, pp. 305-327.

Sansen, W., "Analog IC Design in Nanometer CMOS Technologies," Presentation, New Delhi, January 2009.

IEEE SSCS Newsletter, Winter 2007.

Iwai, H., "Technology Roadmap for 22nm CMOS and Beyond," Presentation, IEDST 2009, Bombay, June 1, 2009.