TPP Report

1. INTRODUCTION
CPU performance has increased because of the growing number of cores on a single chip. Modern processors follow the multiple instruction, multiple data (MIMD) model, in which each core works independently, executing different instructions for different processes. Unlike CPUs, NVIDIA GPUs are built from multiprocessors with more than eight cores, hundreds of arithmetic logic units, thousands of registers, and a small amount of shared memory.
CPU cores are designed to execute a single stream of sequential instructions at maximum performance, whereas GPU cores are designed for the fast execution of a large number of parallel instruction streams. The workload of a graphics processing unit is parallel from the outset, so parallelizing algorithms across a large number of execution units is the basis for using GPU computing resources effectively.
Given this structure of GPU cores, the question arises of how to use their resources effectively in order to achieve better execution times, and research in this area is needed. Reducing the energy consumed during computation is also a relevant and promising question.
The purpose of this work is to study the operating modes of a graphics processing unit and how its energy consumption depends on the number of operating computing units. This article describes an experiment that shows how energy consumption changes as the number of running units and threads varies.
2. PARALLEL CALCULATIONS ON GPU

By using the computing power of the video chip for parallel calculations, performance can increase by a factor of 5 to 30 compared with a central processor. For example, NVIDIA reports the following speedups:
1. For fluorescent microscopy – 12x
2. For molecular dynamics – 8 to 16x
3. For electrostatics (direct and multilevel Coulomb summation) – 40 to 120x
Currently, GPU calculations are applied in fields such as image and sound analysis and processing, physics simulation, cryptography, adaptive radiation therapy, and geographic information systems.
The fields of application vary widely. Because of this high performance and wide range of tasks, it is necessary to study energy consumption during computation and, where possible, to apply suitable algorithms to reduce costs.
3. MEASURING TOOLS
The following tools were used to measure the energy consumed by the graphics processing unit: a current sensor and a microcontroller unit (ST). To acquire data, the current sensor is connected, and once its input signal has been converted, the result is displayed.
Four pins are used for sensing the current coming from the μP / μC. The resulting analog signal is passed to the ADC, which processes it together with the DMAC. The data are then transferred to a virtual COM port on the PC for further analysis.
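The conversion from raw ADC samples to current, power, and energy can be sketched as follows. This is a minimal illustration, not the report's firmware or PC-side code. It assumes a Hall-effect sensor of the ACS712 type (listed in the references), whose output sits at half the supply voltage at zero current and changes linearly with current. The ADC resolution, reference voltage, sensitivity, rail voltage, and the function names `adc_to_current` and `energy_joules` are all assumptions chosen for the example.

```python
from typing import Sequence


def adc_to_current(counts: float,
                   adc_bits: int = 12,         # assumed ADC resolution
                   v_ref: float = 3.3,         # assumed ADC reference voltage, V
                   v_zero: float = 2.5,        # assumed zero-current output (VCC / 2 at 5 V)
                   sensitivity: float = 0.100  # assumed V per A (20 A sensor variant)
                   ) -> float:
    """Convert one raw ADC reading to a current in amperes."""
    volts = counts * v_ref / (2 ** adc_bits - 1)
    return (volts - v_zero) / sensitivity


def energy_joules(samples: Sequence[float],
                  sample_period_s: float,
                  rail_voltage: float = 12.0,  # assumed voltage of the measured rail, V
                  **adc_params: float) -> float:
    """Integrate power over time with the trapezoidal rule."""
    powers = [rail_voltage * adc_to_current(c, **adc_params) for c in samples]
    return sum((p0 + p1) / 2.0 * sample_period_s
               for p0, p1 in zip(powers, powers[1:]))


if __name__ == "__main__":
    # Hypothetical readings taken every 10 ms; real values would arrive
    # over the virtual COM port.
    readings = [3300, 3350, 3400, 3380]
    print(energy_joules(readings, sample_period_s=0.010))
```

With real hardware, the constants would have to be replaced by the values of the actual sensor variant, ADC configuration, and supply rail.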
4. EXPERIMENT
The CUDA programming model involves grouping threads. Threads are gathered into thread blocks: one-dimensional or two-dimensional arrays of threads that interact with each other through shared memory and synchronization points. The program (kernel) executes over a grid of thread blocks, as shown in Figure 1.
Figure 1 Structure of thread grouping on the GPU
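The grouping in Figure 1 can be illustrated with a small host-side emulation of how each thread derives its global index from its block index and its position inside the block. This is only a sketch in Python of the usual one-dimensional indexing rule (`blockIdx.x * blockDim.x + threadIdx.x` in CUDA); the function names `blocks_needed` and `global_thread_ids` are hypothetical and do not come from the report.

```python
from typing import Iterator, Tuple


def blocks_needed(n_items: int, block_dim: int) -> int:
    """Number of blocks required so that every item gets one thread."""
    return (n_items + block_dim - 1) // block_dim  # ceiling division


def global_thread_ids(grid_dim: int, block_dim: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (block index, thread index in block, global index) for a 1-D grid."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            yield block_idx, thread_idx, block_idx * block_dim + thread_idx


if __name__ == "__main__":
    n_items, block_dim = 10, 4
    grid_dim = blocks_needed(n_items, block_dim)
    for block_idx, thread_idx, gid in global_thread_ids(grid_dim, block_dim):
        # Threads whose global index is past the data do no work,
        # mirroring the usual bounds check inside a kernel.
        status = "active" if gid < n_items else "idle"
        print(block_idx, thread_idx, gid, status)
```

On a real GPU the loops do not exist: all threads of a block run concurrently on a multiprocessor, and the number of blocks and threads per block is exactly the launch configuration varied in the experiment.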
5. EXPERIMENT
Threads within a block interact with each other through shared memory and synchronization points. An experiment was carried out to optimize the number of blocks and threads used on the video card. It determines how energy consumption responds to the number of operating computing units.
The experiment used an NVIDIA GeForce GTX 480 graphics processing unit. Before data were collected from the computing device, a program was written that multiplies matrices using the graphics processor's resources. The multiplication is implemented as block multiplication. A block matrix can be visualized as the original matrix with a collection of horizontal and vertical lines that partition it into a collection of smaller matrices.
Multiplication formula for matrices A and B:

$$C_{ij} = \sum_{k=1}^{n} A_{ik} \times B_{kj}$$
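Block multiplication applies the same formula tile by tile: the sum over k is split into partial sums, one per pair of sub-matrices. The sketch below is a plain Python illustration of that idea on the CPU, not the CUDA program used in the experiment; the function name `matmul_blocked` and the tile size `block` are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def matmul_blocked(a: Matrix, b: Matrix, block: int = 2) -> Matrix:
    """Multiply a (n x m) by b (m x p) one tile at a time.

    Indices are 0-based here, whereas the formula counts k from 1.
    """
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):          # tile row of C
        for j0 in range(0, p, block):      # tile column of C
            for k0 in range(0, m, block):  # tiles along the shared dimension
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc      # partial sum for this tile pair
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_blocked(a, b))
```

In a GPU version, each tile of C would typically be assigned to one thread block, which is why the tile size and the number of blocks are linked in this kind of experiment.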
6. RESULTS AND DISCUSSIONS
The result of this experiment is shown in the following graph (Figure 2).
Figure 2 Energy consumption in response to the number of operating computing units
Figure 3 shows that, at the maximum number of blocks, computing time on the graphics processor does not change significantly.
Figure 3 Dependence of computing time on the number of computing blocks
Figure 4 shows how the level of energy consumption depends on the number of blocks and threads. The charts indicate that energy consumption depends on the number of GPU computing elements in use. Summarizing the results of the experiment, it can be confirmed that the optimal number of blocks for energy-efficient computing is the maximum number supported by the GPU. When determining the number of threads, a certain dependency can be observed that leads to a formula for optimizing the number of threads used on the GPU. (Figure 4 appears at the end of this report.)
7. CONCLUSION(S)
1. Energy consumption is a direct function of the number of computing blocks
2. Optimization of energy consumption can be done through
i. Software improvement (optimal algorithms)
ii. Hardware improvement (optimal internal architecture)
3. Studying and understanding external dependencies helps us improve further (i.e. more dependency knowledge directly corresponds to a more accurate model)
8. REFERENCES
[1] Technical paper – https://ieeexplore.ieee.org/document/7910995
[2] ACS712 – https://www.sparkfun.com/datasheets/BreakoutBoards/0712.pdf
[3] CUDA architecture overview – http://developer.download.nvidia.com/compute/cuda/docs/CUDA_Architecture_Overview.pdf
Figure 4 Energy consumption in response to the number of blocks and threads