
04/14/05 Ajit Datar, Apurva Padhye

Computer Architecture

Graphics Processing Unit Architecture (GPU Arch)
With a focus on NVIDIA GeForce 6800 GPU

What is a GPU
• From Wikipedia: A specialized processor efficient at manipulating and displaying computer graphics
• 2D primitive support – bit block transfers
• Some might have video support
• And of course 3D support (a topic at the heart of this presentation)
• GPUs are optimized for raster graphics

The Graphics pipeline
Modern graphics pipeline (left) (ref: http://graphics.stanford.edu/courses/cs448a-01-fall/lectures/lecture2/walk010.html)
OpenGL 3D pipeline (right) (ref: http://www.vorlesungen.uos.de/informatik/ifc99-00/opengl/images/pipeline.gif)
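
3D graphics software interfaces
OpenGL (v2.0 as of now)
• Low level
• Specification, not an API
• Cross-platform implementations
• Popular with some games
• A simple sequence of OpenGL instructions (in C), assuming a current OpenGL context:
/* Clear the window to black */
glClearColor(0.0, 0.0, 0.0, 0.0);
glClear(GL_COLOR_BUFFER_BIT);
/* Draw in white inside a unit-square orthographic view */
glColor3f(1.0, 1.0, 1.0);
glOrtho(0.0, 1.0, 0.0, 1.0, -1.0, 1.0);
/* One square with corners at (0.25, 0.25) and (0.75, 0.75) */
glBegin(GL_POLYGON);
glVertex3f(0.25, 0.25, 0.0);
glVertex3f(0.75, 0.25, 0.0);
glVertex3f(0.75, 0.75, 0.0);
glVertex3f(0.25, 0.75, 0.0);
glEnd();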

3D graphics software interfaces
Direct 3D (v9.0c as of now)
• High level
• 3D API – part of DirectX
• Very popular in the gaming industry
• Microsoft platforms only

NVIDIA GeForce 6800
General info
• Impressive performance stats
  – 600 million vertices/s
  – 6.4 billion texels/s
  – 12.8 billion pixels/s (rendering z/stencil only)
  – 64 pixels per clock cycle (early z-cull reject rate)
• Riva series (1st DirectX compatible)
  – Riva 128, Riva TNT, Riva TNT2
• GeForce series
  – GeForce 256, GeForce 3 (DirectX 8), GeForce FX, GeForce 6 series
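
The peak figures above are per-clock throughputs multiplied by the core clock. The sketch below reproduces that arithmetic. The 400 MHz clock, the 16 texels per clock and the 32 z/stencil pixels per clock are assumptions chosen for illustration, since the slide states only the resulting rates. Only the 64 pixels per clock z-cull figure comes from the slide itself.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: 400 MHz core clock; the slide does not state the clock. */
#define ASSUMED_CORE_CLOCK_HZ 400000000ULL

/* Peak rate = work items retired per clock * clock frequency in Hz. */
uint64_t peak_rate_per_second(uint64_t work_per_clock, uint64_t clock_hz)
{
    assert(clock_hz > 0);
    return work_per_clock * clock_hz;
}

/* ASSUMPTION: 16 texels per clock. */
uint64_t texel_rate(void)
{
    return peak_rate_per_second(16, ASSUMED_CORE_CLOCK_HZ);
}

/* ASSUMPTION: 32 pixels per clock when only z/stencil is written. */
uint64_t z_stencil_rate(void)
{
    return peak_rate_per_second(32, ASSUMED_CORE_CLOCK_HZ);
}

/* From the slide: 64 pixels per clock early z-cull reject rate. */
uint64_t z_cull_reject_rate(void)
{
    return peak_rate_per_second(64, ASSUMED_CORE_CLOCK_HZ);
}
```

Under these assumptions the first two functions give the 6.4 billion texels/s and 12.8 billion pixels/s quoted on the slide.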

NVIDIA GeForce 6800
Block Diagram

NVIDIA GeForce 6800
Vertex Processor (or vertex shader)
• Allows a shader to be applied to each vertex
• Transformation and other per-vertex ops
• Allows the vertex shader to fetch texture data (6 series only)
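
As a rough illustration of "a shader applied to each vertex", the sketch below runs the same 4x4 transformation over an array of vertices on the CPU. The names `Vec4`, `transform_vertex` and `run_vertex_shader` are hypothetical, and a real vertex processor would run this work in parallel hardware units.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, w; } Vec4;

/* Multiply a row-major 4x4 matrix by a column vector. */
Vec4 transform_vertex(const float m[16], Vec4 v)
{
    Vec4 r;
    r.x = m[0]  * v.x + m[1]  * v.y + m[2]  * v.z + m[3]  * v.w;
    r.y = m[4]  * v.x + m[5]  * v.y + m[6]  * v.z + m[7]  * v.w;
    r.z = m[8]  * v.x + m[9]  * v.y + m[10] * v.z + m[11] * v.w;
    r.w = m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15] * v.w;
    return r;
}

/* Apply the same per-vertex program to every vertex in the stream. */
void run_vertex_shader(const float m[16], const Vec4 *in, Vec4 *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = transform_vertex(m, in[i]);
}
```

Each output depends only on its own input vertex, which is why the work spreads easily across several vertex units.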

NVIDIA GeForce 6800
Clipping, Z Culling and Rasterization
• Cull/clip – per-primitive operation and data preparation for rasterization
• Rasterization: primitive to pixel mapping
• Z culling: quick pixel elimination based on depth
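
"Primitive to pixel mapping" can be sketched with edge functions, a common software formulation of triangle rasterization. This is not claimed to be how the 6800 hardware does it. The sketch samples integer pixel positions and counts pixels on an edge as covered, which ignores the fill rules a real rasterizer applies. All names are hypothetical.

```c
#include <assert.h>

typedef struct { int x, y; } Point2;

/* Positive when p lies to the left of the directed edge a -> b. */
static long edge(Point2 a, Point2 b, Point2 p)
{
    return (long)(b.x - a.x) * (p.y - a.y) - (long)(b.y - a.y) * (p.x - a.x);
}

/* Returns 1 if pixel p is covered by the counter-clockwise triangle a, b, c. */
int covers(Point2 a, Point2 b, Point2 c, Point2 p)
{
    return edge(a, b, p) >= 0 && edge(b, c, p) >= 0 && edge(c, a, p) >= 0;
}

/* Count the pixels of a width x height grid covered by the triangle. */
int rasterize_count(Point2 a, Point2 b, Point2 c, int width, int height)
{
    int covered = 0;
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            Point2 p = { x, y };
            if (covers(a, b, c, p))
                ++covered;
        }
    }
    return covered;
}
```

Each covered pixel would become a fragment handed on to the fragment processor.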

NVIDIA GeForce 6800
Fragment processor and Texel pipeline
• Fragment: a candidate pixel
• Varying number of pixel pipelines
• Operates on quads – for texture LOD
• SIMD processing hides texture fetch latency
• Texture caches
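
Working on 2x2 quads lets the hardware estimate texture-coordinate derivatives by differencing neighbouring fragments, and those derivatives select the mipmap level (LOD). The sketch below is a simplified integer-level version of that idea. The quad layout, the `TexCoord` type and the function name are assumptions for illustration.

```c
#include <assert.h>

/* Texture coordinate in texel units. */
typedef struct { float u, v; } TexCoord;

/* quad[0] = top-left, quad[1] = top-right, quad[2] = bottom-left,
 * quad[3] = bottom-right (unused by this simple forward difference). */
int quad_mip_level(const TexCoord quad[4])
{
    assert(quad != 0);
    /* Finite differences across the quad approximate d(u,v)/dx and d(u,v)/dy. */
    float dudx = quad[1].u - quad[0].u;
    float dvdx = quad[1].v - quad[0].v;
    float dudy = quad[2].u - quad[0].u;
    float dvdy = quad[2].v - quad[0].v;

    /* Squared texel footprint of one pixel step in x and in y. */
    float rx2 = dudx * dudx + dvdx * dvdx;
    float ry2 = dudy * dudy + dvdy * dvdy;
    float rho2 = rx2 > ry2 ? rx2 : ry2;

    /* floor(log2(rho)): each factor of 4 in rho^2 is one mip level. */
    int level = 0;
    while (rho2 >= 4.0f) {
        rho2 /= 4.0f;
        ++level;
    }
    return level;
}
```

A quad that steps four texels per pixel selects level 2, while a one-to-one mapping stays at level 0.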

NVIDIA GeForce 6800
Fragment processor and Texel pipeline
• Texture unit can apply filters
• Shader units can perform 8 math ops (w/o texture load) or 4 math ops (with texture load) in a clock
• Fog calculation done at the end
• Pixels almost ready for framebuffer
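
The slide does not say which fog model is used. As one possibility, the sketch below applies linear fog of the kind found in fixed-function OpenGL to a single colour channel after shading. The function names and the single-channel simplification are assumptions.

```c
#include <assert.h>

/* Linear fog factor: 1 at fog_start (no fog), 0 at fog_end (fully fogged). */
float linear_fog_factor(float depth, float fog_start, float fog_end)
{
    assert(fog_end > fog_start);
    float f = (fog_end - depth) / (fog_end - fog_start);
    if (f < 0.0f) f = 0.0f;
    if (f > 1.0f) f = 1.0f;
    return f;
}

/* Blend the shaded colour toward the fog colour as the last shading step. */
float apply_fog(float shaded, float fog_color, float f)
{
    return f * shaded + (1.0f - f) * fog_color;
}
```

A fragment halfway between `fog_start` and `fog_end` keeps half of its shaded colour.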

NVIDIA GeForce 6800
Z compare and blend
• Depth testing
• Stencil tests
• Alpha operations
• Render final color to target buffer
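
A minimal sketch of this last stage for one pixel and one colour channel, assuming a less-than depth test and standard "source over" alpha blending. The stencil test is left out for brevity, and the `Pixel` type and function name are hypothetical.

```c
#include <assert.h>

/* One framebuffer location: stored depth and a single colour channel. */
typedef struct { float depth; float color; } Pixel;

/* Returns 1 if the fragment was written, 0 if the depth test rejected it. */
int z_compare_and_blend(Pixel *dst, float frag_depth, float frag_color,
                        float frag_alpha)
{
    assert(dst != 0);
    if (frag_depth >= dst->depth)
        return 0;                       /* hidden behind what is already stored */

    /* Alpha blend: the new colour covers the old one in proportion to alpha. */
    dst->color = frag_alpha * frag_color + (1.0f - frag_alpha) * dst->color;
    dst->depth = frag_depth;
    return 1;
}
```

A nearer, half-transparent white fragment over a black pixel leaves a colour of 0.5. A later fragment that lies farther away is discarded.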

NVIDIA GeForce 6800
Features – Geometry Instancing
• Vertex stream frequency
  – hardware support for looping over a subset of vertices
• Example: rendering the same object multiple times at different locations (grass, soldiers, people in a stadium)
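
The idea can be sketched on the CPU as two input streams read at different frequencies: the mesh stream advances once per vertex, while the per-instance stream (here, a position offset) advances once per object. The names below are hypothetical, and the hardware performs this looping without the CPU resubmitting the mesh.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Vec2;

/* Emit instance_count copies of the mesh, each moved by its own offset.
 * Returns the number of vertices written to out. */
size_t draw_instanced(const Vec2 *mesh, size_t vertex_count,
                      const Vec2 *offsets, size_t instance_count,
                      Vec2 *out)
{
    size_t written = 0;
    assert(mesh != NULL && offsets != NULL && out != NULL);
    for (size_t inst = 0; inst < instance_count; ++inst) {   /* per-instance stream */
        for (size_t v = 0; v < vertex_count; ++v) {          /* per-vertex stream   */
            out[written].x = mesh[v].x + offsets[inst].x;
            out[written].y = mesh[v].y + offsets[inst].y;
            ++written;
        }
    }
    return written;
}
```

The mesh is stored once. Only the small per-instance stream grows with the number of copies.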

NVIDIA GeForce 6800
Features – continued
• Early culling and clipping
  – culls non-visible primitives at a high rate
• Rasterization
  – supports point sprites, aliasing and anti-aliasing, triangles, etc.
• Z-Cull
  – allows high-speed removal of hidden surfaces
• Occlusion Query
  – keeps a record of the number of fragments passing or failing the depth test and reports it to the CPU
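
An occlusion query amounts to a counter next to the depth test. The sketch below counts the fragments of an object that would pass a less-than depth test against an existing depth buffer. The function name and the flat array layout are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Count fragments that pass a less-than test against the stored depths.
 * frag_depth[i] is tested against depth_buffer[i]. */
unsigned occlusion_query(const float *depth_buffer, const float *frag_depth,
                         size_t n)
{
    unsigned passed = 0;
    assert(depth_buffer != NULL && frag_depth != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (frag_depth[i] < depth_buffer[i])
            ++passed;
    }
    return passed;
}
```

A result of zero tells the CPU that the object is completely hidden, so it can skip drawing it in full.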

NVIDIA GeForce 6800
Features – continued
• Texturing
  – Extended support for non-power-of-two textures to match support for power-of-two textures: mipmapping, wrapping and clamping, cube map and 3D textures
• Shadow Buffer Support
  – Fetches the shadow buffer as a projective texture and performs z-compares of the shadow buffer data to distance from light
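
The shadow buffer comparison can be sketched as a projective lookup followed by a depth compare. The nearest-texel sampling, the `bias` parameter and all names below are assumptions for illustration. The hardware does the fetch and the compare together.

```c
#include <assert.h>

/* Returns 1.0 if the point is lit, 0.0 if it is in shadow.
 * (s, t, q) are projective texture coordinates in light space. */
float shadow_lookup(const float *shadow_map, int width, int height,
                    float s, float t, float q,
                    float distance_from_light, float bias)
{
    assert(shadow_map != 0 && width > 0 && height > 0 && q > 0.0f);

    /* Projective divide maps (s, t, q) to [0, 1] texture coordinates. */
    float u = s / q;
    float v = t / q;

    /* Nearest texel, clamped to the map edges. */
    int x = (int)(u * (float)(width - 1) + 0.5f);
    int y = (int)(v * (float)(height - 1) + 0.5f);
    if (x < 0) x = 0;
    if (x > width - 1) x = width - 1;
    if (y < 0) y = 0;
    if (y > height - 1) y = height - 1;

    /* z-compare: lit if nothing closer to the light was recorded. */
    float stored = shadow_map[y * width + x];
    return (distance_from_light - bias <= stored) ? 1.0f : 0.0f;
}
```

The hypothetical `bias` is a common way to avoid a surface shadowing itself through rounding error. Setting it to zero gives the plain compare described on the slide.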

NVIDIA GeForce 6800
Features – Shader Support
• Increased instruction count (up to 65535 instructions)
• Fragment processor: multiple render targets
• Dynamic flow control (branching)
• Vertex texturing
• More temporary registers

NVIDIA GeForce 6800
Features – Co-issue and Dual Issue
• Co-issue:
  – Each four-component-wide vector unit is capable of executing two independent instructions in parallel
  – More scalar computations done in less time
• Dual issue:
  – Two independent instructions can be executed on different parts of the shader pipeline
  – Makes scheduling easy and more efficient

GPGPU
• Look at the GPU as a fast SIMD processor
• It is a specialized processor, so not all programs can be run
• Example computational programs – FFT, cryptography, ray tracing, segmentation and even sound processing!

GPU from comp arch perspective
Processing units
• Focus on floating point math
• fp32 and fp16 precision support for intermediate calculations
• 6 four-wide fp32 vector MADs/clock in vertex shaders and 1 scalar multifunction op
• 16 four-wide fp32 vector MADs/clock in the fragment processor plus 16 four-wide fp32 MULs
• Dedicated fp16 normalization hardware
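
These per-clock figures translate into floating-point operations per clock if a MAD is counted as two operations (multiply plus add) and a MUL as one. That is a counting convention, not something the slide states. The sketch below does the arithmetic. The clock passed to `flops_per_second` is left to the caller, since the slide does not give one.

```c
#include <assert.h>
#include <stdint.h>

/* Operations per clock for `units` vector units of `width` lanes each. */
unsigned vector_flops_per_clock(unsigned units, unsigned width,
                                unsigned flops_per_lane)
{
    return units * width * flops_per_lane;
}

/* Fragment side: 16 four-wide MADs (2 flops) plus 16 four-wide MULs (1 flop). */
unsigned fragment_flops_per_clock(void)
{
    return vector_flops_per_clock(16, 4, 2) + vector_flops_per_clock(16, 4, 1);
}

/* Vertex side: 6 four-wide MADs plus 1 scalar multifunction op. */
unsigned vertex_flops_per_clock(void)
{
    return vector_flops_per_clock(6, 4, 2) + 1;
}

/* Scale a per-clock figure by a caller-supplied clock in Hz. */
uint64_t flops_per_second(unsigned per_clock, uint64_t clock_hz)
{
    assert(clock_hz > 0);
    return (uint64_t)per_clock * clock_hz;
}
```

With this convention the fragment processor peaks at 192 operations per clock and the vertex side at 49.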

GPU from comp arch perspective
Memory
• Uses dedicated but standard memory architectures (e.g. DRAM)
• Multiple small independent memory partitions for improved latency
• Memory used to store buffers and optionally textures
• In low-end systems (Intel 855GM) system memory is shared as the graphics memory

GPU from comp arch perspective
System Interface
• GPU interfaces with the CPU using fast buses like AGP and PCI Express
• Port speeds
  – PCI Express up to 8 GB/sec (4 + 4); practically up to (3.2 + 3.2)
  – AGP up to 2 GB/sec (for 8x AGP)
• Such bus speeds are important because textures and vertex data need to come from CPU to GPU (after that it's the internal GPU bandwidth that matters)
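
To get a feel for these numbers, the sketch below estimates how long an upload takes at a given bus bandwidth, ignoring latency and protocol overhead. It treats 1 GB as 10^9 bytes, which is an assumption, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to move `bytes` at `bytes_per_second`,
 * ignoring latency and protocol overhead. */
uint64_t transfer_time_us(uint64_t bytes, uint64_t bytes_per_second)
{
    assert(bytes_per_second > 0);
    return bytes * 1000000ULL / bytes_per_second;
}
```

For example, a 16,000,000-byte texture takes about 4000 microseconds in one direction over PCI Express at 4 GB/sec and about 8000 microseconds over 8x AGP at 2 GB/sec.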

GPU from comp arch perspective
Caches
• Texture caches (2 level)
  – Shared between vertex procs and fragment procs
  – Cache processed/filtered textures
• Vertex caches
  – cache processed and unprocessed vertices
  – improve computation and fetch performance
• Z and buffer cache and write queues
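
Why a cache of processed vertices saves computation can be sketched with a small FIFO cache in front of the vertex shader: an index that is still in the cache does not need to be transformed again. The FIFO policy and the four-entry size are assumptions for illustration, not the 6800's actual cache organization.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: hypothetical cache size, chosen small for illustration. */
#define CACHE_SLOTS 4

/* Count how many vertex shader runs an index stream needs when recently
 * processed vertices are kept in a FIFO cache. */
size_t count_vertex_shader_runs(const unsigned *indices, size_t n)
{
    unsigned cache[CACHE_SLOTS];
    size_t used = 0, next = 0, runs = 0;
    assert(indices != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        int hit = 0;
        for (size_t s = 0; s < used; ++s) {
            if (cache[s] == indices[i]) { hit = 1; break; }
        }
        if (!hit) {
            ++runs;                          /* miss: the vertex must be shaded */
            cache[next] = indices[i];        /* overwrite the oldest entry      */
            next = (next + 1) % CACHE_SLOTS;
            if (used < CACHE_SLOTS) ++used;
        }
    }
    return runs;
}
```

Two triangles sharing an edge (indices 0, 1, 2, 2, 1, 3) need four shader runs instead of six.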

Demo
• http://download.nvidia.com/downloads/nZone/videos/nvidia/nalu.wmv

References
• NVIDIA 6800 chapter from GPU Gems 2
  http://download.nvidia.com/developer/GPU_Gems_2/GPU_Gems2_ch30.pdf
• OpenGL design
  http://graphics.stanford.edu/courses/cs448a-01-fall/design_opengl.pdf
• OpenGL programming guide (ISBN: 0201604582)
• Real time graphics architectures lecture notes
  http://graphics.stanford.edu/courses/cs448a-01-fall/
• GeForce 256 overview
  http://www.nvnews.net/reviews/geforce_256/gpu_overview.shtml
• NVIDIA website
  http://nvidia.com

So long and thanks for all the fish
(Oh yeah ... any questions?)
