IC Technology Trends and Computer Architecture

CS5100 Advanced Computer Architecture

Technology Trends

Prof. Chung-Ta King
Department of Computer Science
National Tsing Hua University, Taiwan

Outline
• Goal of this class:
− To understand the trends of IC technology and be able to relate the trends to computer architecture designs
− Why do we need to know the trends?
  • Learn from the history
  • Understand the possible future and know how to adapt now
• Class outline:
− Trends in technology (Sec. 1.4)
− Trends in power and energy (Sec. 1.5)
− Trends in cost (Sec. 1.6)


IC Technology and Processor Performance

Review of Transistors (MOSFET) on IC

• Number of transistors on ICs doubles every 2 years: exponential growth (source: Intel Corp.)



Technology Scaling
• Feature size:

− Minimum size of a transistor or wire in the x or y dimension
− 10 microns in 1971 to 22 nm in 2012
− New technology node every 2 years or so
− Each generation scales the feature size to ~0.7x (S) of the previous one, i.e., a ~30% reduction


10 µm – 1971
6 µm – 1974
3 µm – 1977
1.5 µm – 1982
1 µm – 1985
800 nm – 1989
600 nm – 1994
350 nm – 1995
250 nm – 1997
180 nm – 1999
130 nm – 2001
90 nm – 2004
65 nm – 2006
45 nm – 2008
32 nm – 2010
22 nm – 2012
14 nm – 2014
10 nm – 2016
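
As a rough consistency check on the numbers above, the following sketch counts how many 0.7x steps separate 10 µm (1971) from 22 nm (2012) and what cadence that implies. The function name `generations_between` is only illustrative, and the calculation assumes an idealized constant 0.7x shrink per generation.

```python
import math

def generations_between(start_nm: float, end_nm: float, scale: float = 0.7) -> float:
    """Number of `scale`-sized shrink steps needed to go from start_nm to end_nm."""
    return math.log(start_nm / end_nm) / math.log(1.0 / scale)

# 10 µm (1971) down to 22 nm (2012), assuming an ideal 0.7x shrink per node
gens = generations_between(10_000, 22)
years_per_gen = (2012 - 1971) / gens

print(f"{gens:.1f} generations, about {years_per_gen:.1f} years each")
```

Under these assumptions the result comes out at roughly 17 generations, a little over 2 years apiece, in line with "a new technology node every 2 years or so".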

Effects of Scaling
• More transistors per unit area

− Feature size reduced by 0.7x (S) → area of a transistor reduced by 0.5x (S²)
− 2X # transistors/unit area
− Fixed cost per wafer → lower cost per transistor

• Faster transistors

− Reduced time to switch transistors on/off → speed improved by 1/S (about 1.4x) per generation → exponential increase in clock rate

• Less supplied voltage and power

− Power to switch transistor reduced, but not power density
− Voltage to drive transistors reduced
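
The density and cost effects above follow from simple arithmetic on S. The sketch below tabulates them for a few generations; the helper name `after_generations` is hypothetical, and cost per transistor is assumed to track transistor area (fixed cost per wafer, as stated on the slide).

```python
S = 0.7  # linear scaling factor per generation, from the slide

def after_generations(n: int, s: float = S) -> dict:
    """Relative feature size, transistor area, density, and cost after n generations."""
    feature = s ** n        # linear dimension relative to generation 0
    area = feature ** 2     # area scales with the square of the linear dimension
    return {
        "feature": feature,
        "area": area,
        "density": 1.0 / area,        # transistors per unit area
        "cost_per_transistor": area,  # assumes a fixed cost per wafer
    }

for n in range(4):
    r = after_generations(n)
    print(n, f"feature={r['feature']:.2f}", f"area={r['area']:.2f}", f"density={r['density']:.2f}x")
```

One generation gives an area of 0.49 and a density of about 2x, which is the "2X # transistors/unit area" figure on the slide.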

Effects of Scaling
• Local wires are getting faster
• Global wires are getting slower, i.e. scale poorly
− No longer possible to cross chip in one cycle
− Computer architects need to plan around this
  • 3D stacking
  • Distributed mechanisms
[Figure: scaling of reachable radius relative to chip size]

Summary: Technology Trends
• Integrated circuit technology
− Transistor density: 35%/year
− Die size: 10-20%/year
− Integration overall: 40-55%/year (slow down after 2003!)
• DRAM capacity: 25-40%/year (slowing)
• Flash capacity: 50-60%/year
− 15-20X cheaper/bit than DRAM
• Magnetic disk capacity: 40%/year
− 15-25X cheaper/bit than Flash
− 300-500X cheaper/bit than DRAM
− But not speed
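
To make the annual growth rates in the technology summary easier to compare, they can be converted into doubling times. This is a minimal sketch assuming steady compound growth; `doubling_time` is an illustrative name, not something from the course material.

```python
import math

def doubling_time(annual_growth: float) -> float:
    """Years needed to double at a steady compound annual growth rate (0.35 = 35%/year)."""
    return math.log(2) / math.log(1.0 + annual_growth)

# Growth rates quoted on the summary slide
rates = {
    "transistor density (35%)": 0.35,
    "integration, low (40%)": 0.40,
    "integration, high (55%)": 0.55,
    "DRAM capacity, low (25%)": 0.25,
    "magnetic disk capacity (40%)": 0.40,
}
for name, rate in rates.items():
    print(f"{name}: doubles every {doubling_time(rate):.1f} years")
```

At 35%/year, density doubles roughly every 2.3 years; at 40-55%/year, overall integration doubles in about 1.6 to 2.1 years.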



Implications for Computer Architecture
• High rate of density improvements
− Used for bringing 4-bit, 8-bit, through 64-bit microprocessors in the early days of microprocessors
− Used for multiprocessors per chip, wider SIMD, …, in recent years
• Quantitative changes leading to qualitative changes
− 25K to 30K transistors per chip in early 1980s → possible to build a single-chip 32-bit microprocessor
− By mid 1980s, FP unit can be integrated
− By late 1980s, L1 cache can fit on the same chip
− Performance improvements often come in discrete steps
• Work with signal propagation delay on wires

Bandwidth versus Latency
• Bandwidth or throughput
− Total work done in a given time
− 10,000-25,000X improvement for processors
− 300-1200X improvement for memory and disks
• Latency or response time
− Time between start and completion of an event
− 30-80X improvement for processors
− 6-8X improvement for memory and disks


Bandwidth and Latency
[Figure: log-log plot of bandwidth and latency milestones]

Summary: Bandwidth and Latency
• For disk, LAN, memory & microprocessor, bandwidth improves by square of latency improvement
− In the time that bandwidth doubles, latency improves by no more than 1.2X to 1.4X
• Lag probably even larger in real systems, as BW gains multiplied by replicated components
− Multiple processors in a cluster or in a chip
− Multiple disks in a disk array
− Multiple memory modules in a large memory
− Simultaneous communication in switched LAN
• HW and SW developers should innovate assuming latency lags bandwidth
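
The "square" rule of thumb and the "1.2X to 1.4X" figure are two views of the same relation, as this short sketch shows. The function names are illustrative, and the rule is treated as exact only for the sake of the arithmetic.

```python
import math

def bandwidth_gain(latency_gain: float) -> float:
    """Rule of thumb from the slide: bandwidth improves by the square of the latency gain."""
    return latency_gain ** 2

def latency_gain_for(bw_gain: float) -> float:
    """Inverse of the rule: latency gain expected alongside a given bandwidth gain."""
    return math.sqrt(bw_gain)

for lat in (1.2, 1.4):
    print(f"latency improves {lat}x -> bandwidth improves about {bandwidth_gain(lat):.2f}x")

print(f"bandwidth doubles -> latency improves about {latency_gain_for(2.0):.2f}x")
```

A latency gain of 1.2x to 1.4x squares to 1.44x to 1.96x, i.e., at most about one doubling of bandwidth.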


Outline
• Trends in technology (Sec. 1.4)
• Trends in power and energy (Sec. 1.5)
• Trends in cost (Sec. 1.6)

Power Density Trend

$P = \alpha C V_{dd}^2 f + V_{dd} I_{st} + V_{dd} I_{leak}$

Source: Intel Corp.
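
The power equation on this slide has three terms: switching power (α C Vdd² f), a Vdd·I_st term, and leakage power (Vdd·I_leak). The sketch below evaluates them separately. It assumes I_st denotes short-circuit current, which the slide does not spell out, and every numeric value is made up purely for illustration.

```python
def total_power(alpha: float, c: float, vdd: float, f: float,
                i_st: float, i_leak: float) -> dict:
    """Evaluate P = alpha*C*Vdd^2*f + Vdd*I_st + Vdd*I_leak, term by term (watts)."""
    switching = alpha * c * vdd ** 2 * f   # dynamic (switching) power
    short_circuit = vdd * i_st             # assumed meaning of the I_st term
    leakage = vdd * i_leak                 # static (leakage) power
    return {
        "switching": switching,
        "short_circuit": short_circuit,
        "leakage": leakage,
        "total": switching + short_circuit + leakage,
    }

# Hypothetical chip: activity factor 0.1, 100 nF switched capacitance, 1.0 V, 3 GHz
p = total_power(alpha=0.1, c=100e-9, vdd=1.0, f=3e9, i_st=2.0, i_leak=10.0)
for term, watts in p.items():
    print(f"{term}: {watts:.1f} W")
```

Because Vdd appears squared in the switching term and linearly in the other two, lowering the supply voltage reduces all three components.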

Power
• Intel 80386 consumed ~2 W, but 3.3 GHz Intel Core i7 consumes 130 W
• Heat must be dissipated from the chip
• Today, power is the major limitation to using transistors, not silicon area

Power and Energy
• Pavg = Pdynamic + Pstatic
• Energy is related to power through time
• If power dissipation remains constant through time T, then E = Pavg x T



Dynamic Power and Energy
• For CMOS chips, traditional dominant energy consumption has been in switching transistors, called dynamic power
− ½ x capacitive load x voltage² x frequency switched
• For mobile devices, energy is better metric
− ½ x capacitive load x voltage²
• Reducing clock rate reduces power, but not energy
• Reducing power:
− Do nothing well: turn off clock of inactive modules
− Dynamic Voltage-Frequency Scaling (DVFS)
− Low power state for DRAM, disks
− Overclocking, turning off cores

Static Power
• Because leakage current flows even when a transistor is off, now static power is important too
− Currentstatic x Voltage
− Scales with number of transistors
− Increases as transistors shrink and # transistors increases
  • With 65nm or better technologies, leakage can account for 50% of total power if not designed properly
− To reduce: power gating
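
The dynamic power and energy formulas in this section explain why lowering the clock alone saves power but not energy, while DVFS saves both. A minimal sketch follows; the capacitance, voltage, and frequency values are hypothetical, and the 15% scaling factor is chosen only as an example.

```python
def dynamic_power(c_load: float, voltage: float, freq: float) -> float:
    """Dynamic power: 1/2 x capacitive load x voltage^2 x frequency switched."""
    return 0.5 * c_load * voltage ** 2 * freq

def dynamic_energy(c_load: float, voltage: float) -> float:
    """Dynamic energy per transition: 1/2 x capacitive load x voltage^2."""
    return 0.5 * c_load * voltage ** 2

C, V, F = 1e-9, 1.2, 3e9  # hypothetical baseline: 1 nF, 1.2 V, 3 GHz

# Case 1: halve the clock rate only
power_ratio_clock = dynamic_power(C, V, F / 2) / dynamic_power(C, V, F)
energy_ratio_clock = dynamic_energy(C, V) / dynamic_energy(C, V)

# Case 2: DVFS, scaling voltage and frequency together by 0.85
k = 0.85
power_ratio_dvfs = dynamic_power(C, k * V, k * F) / dynamic_power(C, V, F)
energy_ratio_dvfs = dynamic_energy(C, k * V) / dynamic_energy(C, V)

print(f"half clock: power x{power_ratio_clock:.2f}, energy x{energy_ratio_clock:.2f}")
print(f"DVFS 0.85:  power x{power_ratio_dvfs:.2f}, energy x{energy_ratio_dvfs:.2f}")
```

Halving the clock halves the power while leaving energy per transition unchanged; scaling voltage and frequency by 0.85 cuts power to about 0.61x and energy to about 0.72x.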


Implications for Computer Architecture
• Architectural designs for low power using metrics such as tasks per joule or performance per watt
− Use the right power/energy to do the right things
• Sometimes, doing things faster but at a higher power may be better → race to halt
− Often techniques for performance also lead to power saving

Outline
• Trends in technology (Sec. 1.4)
• Trends in power and energy (Sec. 1.5)
• Trends in cost (Sec. 1.6)
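
The race-to-halt idea can be illustrated with the relation E = P x T: finishing quickly at high power and then idling may cost less energy than running slowly for the whole interval. The sketch below compares the two; `task_energy` is a hypothetical helper and all power and time figures are invented for the example.

```python
def task_energy(active_power_w: float, active_time_s: float,
                idle_power_w: float, window_s: float) -> float:
    """Energy (joules) over a fixed window: active phase followed by an idle phase."""
    if active_time_s > window_s:
        raise ValueError("task does not finish within the window")
    idle_time_s = window_s - active_time_s
    return active_power_w * active_time_s + idle_power_w * idle_time_s

WINDOW = 4.0  # seconds available for the task (hypothetical)

# Race to halt: 20 W for 1 s, then a 1 W low-power state for the remaining 3 s
fast = task_energy(20.0, 1.0, 1.0, WINDOW)

# Slow and steady: 8 W for the full 4 s, no idle time left
slow = task_energy(8.0, 4.0, 1.0, WINDOW)

print(f"race to halt: {fast:.0f} J, slow and steady: {slow:.0f} J")
```

With these numbers race to halt uses 23 J against 32 J. The outcome depends entirely on the idle power and the speedup, so the comparison has to be redone for each system.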



VLSI Economics
• Selling price Stotal
− Stotal = Ctotal / (1-m)
• m = profit margin
• Ctotal = total cost
− Nonrecurring engineering cost (NRE)
− Recurring cost
− Fixed cost: data sheets and application notes, marketing and advertising, yield analysis

NRE
• Engineering cost
− Depends on size of design team, including benefits, training, computers
− CAD tools:
  • Digital front end: $10K
  • Analog front end: $100K
  • Digital back end: $1M
• Prototype manufacturing
− Mask costs: $500K – $1M in 130 nm process
− Test fixture and package tooling
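
The selling-price relation Stotal = Ctotal / (1-m) can be worked through numerically. This sketch assumes that total cost is the sum of the NRE cost, the fixed cost, and a per-unit recurring cost times the volume; that breakdown and all dollar figures are illustrative assumptions, not values from the slides.

```python
def total_cost(nre: float, fixed: float, recurring_per_unit: float, units: int) -> float:
    """Assumed cost model: NRE + fixed + per-unit recurring cost x volume."""
    return nre + fixed + recurring_per_unit * units

def selling_price(c_total: float, margin: float) -> float:
    """Stotal = Ctotal / (1 - m), where m is the profit margin as a fraction of the price."""
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    return c_total / (1.0 - margin)

UNITS = 100_000  # hypothetical production volume
c = total_cost(nre=1_500_000, fixed=500_000, recurring_per_unit=10.0, units=UNITS)
s = selling_price(c, margin=0.4)

print(f"total cost ${c:,.0f}, total selling price ${s:,.0f}, ${s / UNITS:.2f} per unit")
```

Since NRE is paid once, its share of the per-unit price shrinks as volume grows, which is why high-NRE designs need large production runs.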


Recurring Cost of IC
− Defects per unit area = 0.016-0.057 defects per cm² (2010)
− N = process-complexity factor = 11.5-15.5 (40 nm, 2010)

Cost and Computer Architecture
• The only control computer architects have over IC cost is die area, and hence only a portion of the cost
− What functions should be included or excluded in the design?
− Number of I/O pins
− Design complexities
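
The formulas that go with the defect-density and process-complexity parameters did not survive in this copy of the slides. The sketch below assumes they are the die-cost model from Sec. 1.6 of the course textbook (Hennessy and Patterson): dies per wafer from wafer and die geometry, and die yield = wafer yield / (1 + defects per unit area x die area)^N. The wafer cost, wafer size, and die areas are hypothetical.

```python
import math

def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Usable dies: wafer area / die area, minus dies lost along the wafer edge."""
    radius = wafer_diameter_cm / 2.0
    return (math.pi * radius ** 2 / die_area_cm2
            - math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2))

def die_yield(defects_per_cm2: float, die_area_cm2: float, n: float,
              wafer_yield: float = 1.0) -> float:
    """Assumed yield model: wafer_yield / (1 + defect density x die area)^N."""
    return wafer_yield / (1.0 + defects_per_cm2 * die_area_cm2) ** n

def cost_per_good_die(wafer_cost: float, wafer_diameter_cm: float,
                      die_area_cm2: float, defects_per_cm2: float, n: float) -> float:
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2, n))
    return wafer_cost / good_dies

# Hypothetical: $5000 for a 30 cm wafer, 0.031 defects/cm^2, N = 13.5
for area in (1.0, 2.25):
    y = die_yield(0.031, area, 13.5)
    cost = cost_per_good_die(5000.0, 30.0, area, 0.031, 13.5)
    print(f"die area {area} cm^2: yield {y:.2f}, cost per good die ${cost:.2f}")
```

Under this model, cost per good die grows much faster than die area, because a larger die both reduces the die count per wafer and lowers the yield. That is why die area is the architect's main cost lever.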



Technology and Architecture
• How to translate technology improvements into increases in computing performance?
− Basic strategies: parallelism, speculation, overlapping, monitoring/profiling
− Modular and hierarchical architectures
• Increasing clock frequency:
− Need to tackle power, heat, clock skew, wire delay
− Gap to memory and I/O devices, PC board design → multi-level cache (with on-chip cache)
− Need scalable, low-complexity designs with parallelism, e.g., multiple functional units, RISC cores
− Need good locality; avoid long-distance and rapid interaction, e.g., MP on a chip

Technology and Architecture
• Increased transistor counts:
− Constraints on power dissipation, localized communication, design and verification complexities
  • Shorter wires, lower complexity, scale with technology
  • On-chip cache/DRAM, MP on a chip, multithreading, vector processing, VLIW
− For monitoring and learning a program’s execution and subsequently recasting it for faster execution
− Self-adapting, self-management, self-healing, …
− More functionalities: multimedia, facilities for I/O and memory, bandwidth and latency improvement


Recap
• Trends in technology
• Trends in power and energy
• Do you understand the trends of IC technology?
• Can you explain the implications and relate the
trends with computer architecture designs?



