GPU Computing in MATLAB

MathWorks, SoftLine

Data Analysis Tasks

Access: Files, Software, Hardware
Explore & Discover: Data Analysis & Modeling, Algorithm Development, Application Development
Share: Reporting and Documentation, Outputs for Design, Code & Applications, Deployment
Automate

NEWS

MathWorks: GPU support in MATLAB
NVIDIA GPU
GPU: 50x

MATLAB

Desktop Computer: Parallel Computing Toolbox (MATLAB with toolboxes and blocksets, local workers)
Computer Cluster: MATLAB Distributed Computing Server (workers)

Parallel Computing

Larger Compute Pool
Larger Memory Pool
[Figure: numeric matrix with entries 11 to 52]

MATLAB

parfor, distributed arrays, batch
Jobs/Tasks, spmd, MPI-interface
GPU

Single processor, Multicore, Multiprocessor, Cluster, Grid, Cloud
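
As a small illustration of the parfor construct named on this slide, the sketch below runs the same independent iterations with for and with parfor and compares the results. The loop body and variable names are illustrative assumptions, not taken from the slides. Without Parallel Computing Toolbox or an open pool, parfor simply runs the iterations serially.

```matlab
% Sketch: the same independent loop written with for and with parfor.
n = 8;

serialResult = zeros(1, n);
for k = 1:n
    serialResult(k) = sum((1:k).^2);    % each iteration is independent
end

parallelResult = zeros(1, n);
parfor k = 1:n
    parallelResult(k) = sum((1:k).^2);  % parallelResult is a sliced output
end

% Both loops should produce identical values.
resultsMatch = isequal(serialResult, parallelResult);
disp(resultsMatch);
```

The loop qualifies for parfor because no iteration reads a value written by another iteration.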


MATLAB, 2005 to 2010

v1.0: Distributed jobs, Dynamic licensing
v2.0: MPI functions, 3rd party schedulers
v3.0: Distributed arrays, Parallel math
R2007a: Local Workers
R2008a: Parallel for-loop, Optimization Toolbox
R2008b: Compilation, spmd support
R2009b: Distributed arrays, Parallelism in toolboxes
GPU Beta
R2010b: GPU arrays, GPU math

GPU ?

GPU CPU

* Parallel Computing Toolbox requires NVIDIA GPUs with compute capability 1.3 or higher, including the NVIDIA Tesla 10-series and 20-series.

GPU
3D Gaming & CAD

Scientific Computing


GPU
CUDA Community Showcase:

http://www.nvidia.com/object/cuda_showcase_html.html

GPU ?

(single/double)

IEEE


GPU in MATLAB

Three ways to use the GPU:
1) GPU arrays with built-in MATLAB functions
2) Running MATLAB code on the GPU
3) Calling CUDA kernels
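
One way to run user-written MATLAB code on the GPU is arrayfun applied to gpuArray inputs. The sketch below is an illustration under assumptions: the function handle hyp and the input vectors are made up for this example, and the code falls back to ordinary CPU arrays when no GPU is detected.

```matlab
% Sketch: element-wise user code applied with arrayfun.
% hyp, x and y are illustrative names, not from the slides.
useGPU = exist('gpuDeviceCount') ~= 0 && gpuDeviceCount > 0;

x = linspace(0, 1, 1000);
y = linspace(1, 2, 1000);
if useGPU
    x = gpuArray(x);   % inputs on the GPU make arrayfun run there
    y = gpuArray(y);
end

hyp = @(a, b) sqrt(a.^2 + b.^2);   % scalar function applied per element
r = arrayfun(hyp, x, y);

if useGPU
    r = gather(r);     % bring the result back into MATLAB
end
```

The function passed to arrayfun sees one scalar per input per call, which is what allows it to be mapped over all elements in parallel.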


GPU example:

>> A = someArray(1000, 1000);
>> G = gpuArray(A); % Push to GPU memory
>> F = fft(G);
>> x = G\b;
>> z = gather(x); % Bring back into MATLAB
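
The session above uses a placeholder someArray and an undefined b. A runnable variant might look like the sketch below; the random test data, the CPU fallback, and the residual check are assumptions added for illustration.

```matlab
% Sketch: push data to the GPU, compute there, and gather the result.
rng(0);                         % reproducible test data (assumed)
A = rand(1000, 1000);
b = rand(1000, 1);

useGPU = exist('gpuDeviceCount') ~= 0 && gpuDeviceCount > 0;
if useGPU
    G  = gpuArray(A);           % push to GPU memory
    bG = gpuArray(b);
else
    G  = A;                     % fallback: keep everything on the CPU
    bG = b;
end

F = fft(G);                     % runs on the GPU when G is a gpuArray
x = G \ bG;                     % linear solve, also on the GPU

if useGPU
    z = gather(x);              % bring back into MATLAB
else
    z = x;
end

% Sanity check on the solution, computed on the CPU.
relResidual = norm(A*z - b) / norm(b);
```

Keeping the data on the GPU between fft and the solve avoids repeated host-to-device transfers; only the final result is gathered.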


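More than 100 built-in functions support GPU arrays:

fft, fft2, ifft, ifft2
Matrix multiplication (A*B)
Left matrix division (A\b)
LU factorization
Element-wise functions: abs, acos, ..., minus, ..., plus, ..., sin, ...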
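
A few of the listed functions can be combined as in the sketch below. The matrix size and variable names are illustrative assumptions, and the code falls back to CPU arrays when no GPU is detected.

```matlab
% Sketch: built-in functions operating on a GPU array.
rng(0);
A = rand(200);

useGPU = exist('gpuDeviceCount') ~= 0 && gpuDeviceCount > 0;
if useGPU
    G = gpuArray(A);
else
    G = A;                      % fallback: ordinary CPU array
end

C = abs(sin(G)) + G*G;          % element-wise functions and A*B
[L, U, P] = lu(G);              % LU factorization with row permutation
luError = norm(P*G - L*U, 'fro');

if useGPU
    C = gather(C);              % results stay on the GPU until gathered
    luError = gather(luError);
end
```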


CPU:
D = data;
iterations = 2000;               % number of parallel iterations
stride = iterations*step;        % stride of the outer loop
M = ceil((numel(D)-W)/stride);   % iterations needed
o = cell(M, 1);                  % preallocate output
for i = 1:M
    % Start points of the windows handled in this iteration
    thisSP = (i-1)*stride:step: ...
        (min(numel(D)-W, i*stride)-1);
    % Move the data efficiently into a matrix
    X = copyAndWindowInput(D, window, thisSP);
    % Take lots of FFTs down the columns
    X = abs(fft(X));
    % Return only the first part to MATLAB
    o{i} = X(1:E, 1:ratio:end);
end
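
The slides do not show copyAndWindowInput. The sketch below is a hypothetical stand-in, assuming that the start points are zero-based offsets into the data and that each output column is one window-weighted segment of length numel(window). The helper windowIndex and the test data are also assumptions.

```matlab
% Hypothetical stand-in for copyAndWindowInput (not the original code).
% Column k of the index matrix holds the sample indices of window k.
windowIndex = @(W, sp) bsxfun(@plus, (1:W)', sp(:)');

% Gather the segments into a W-by-numel(sp) matrix and apply the window.
copyAndWindowInput = @(D, window, sp) bsxfun(@times, window(:), ...
    reshape(D(windowIndex(numel(window), sp)), numel(window), []));

% Small check with a rectangular window of length 3.
D = 1:10;
window = ones(3, 1);
thisSP = 0:2:6;                  % zero-based start points
X = copyAndWindowInput(D, window, thisSP);

% One FFT per column, as in the loop body on the slide.
S = abs(fft(X));
```

Building the whole matrix with one indexing operation is what lets fft process many windows in a single call.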


CPU and GPU

CPU:
D = data;
iterations = 2000;               % number of parallel iterations
stride = iterations*step;        % stride of the outer loop
M = ceil((numel(D)-W)/stride);   % iterations needed
o = cell(M, 1);                  % preallocate output
for i = 1:M
    % Start points of the windows handled in this iteration
    thisSP = (i-1)*stride:step: ...
        (min(numel(D)-W, i*stride)-1);
    % Move the data efficiently into a matrix
    X = copyAndWindowInput(D, window, thisSP);
    % Take lots of FFTs down the columns
    X = abs(fft(X));
    % Return only the first part to MATLAB
    o{i} = X(1:E, 1:ratio:end);
end

GPU:
D = gpuArray(data);
iterations = 2000;               % number of parallel iterations
stride = iterations*step;        % stride of the outer loop
M = ceil((numel(D)-W)/stride);   % iterations needed
o = cell(M, 1);                  % preallocate output
for i = 1:M
    % Start points of the windows handled in this iteration
    thisSP = (i-1)*stride:step: ...
        (min(numel(D)-W, i*stride)-1);
    % Move the data efficiently into a matrix
    X = copyAndWindowInput(D, window, thisSP);
    % Take lots of FFTs down the columns
    X = gather(abs(fft(X)));
    % Return only the first part to MATLAB
    o{i} = X(1:E, 1:ratio:end);
end


GPU and GPU with parfor

GPU with a for loop: identical to the listing below, except that the loop starts with "for i = 1:M".

GPU with parfor:
D = gpuArray(data);
iterations = 2000;               % number of parallel iterations
stride = iterations*step;        % stride of the outer loop
M = ceil((numel(D)-W)/stride);   % iterations needed
o = cell(M, 1);                  % preallocate output
parfor i = 1:M
    % Start points of the windows handled in this iteration
    thisSP = (i-1)*stride:step: ...
        (min(numel(D)-W, i*stride)-1);
    % Move the data efficiently into a matrix
    X = copyAndWindowInput(D, window, thisSP);
    % Take lots of FFTs down the columns
    X = gather(abs(fft(X)));
    % Return only the first part to MATLAB
    o{i} = X(1:E, 1:ratio:end);
end


Supported GPUs: NVIDIA devices with CUDA compute capability 1.3 or higher.
List of CUDA-capable GPUs:
http://www.nvidia.com/object/cuda_gpus.html
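
The compute capability of the installed device can also be checked from MATLAB. The sketch below assumes Parallel Computing Toolbox is installed and uses the 1.3 threshold stated on this slide; current MATLAB releases require a considerably newer compute capability, so the threshold is illustrative only.

```matlab
% Sketch: query the selected GPU and compare its compute capability
% against the 1.3 threshold from the slide (an illustrative value).
requiredCapability = 1.3;

if exist('gpuDeviceCount') ~= 0 && gpuDeviceCount > 0
    d = gpuDevice;                           % currently selected device
    cc = str2double(d.ComputeCapability);    % e.g. '2.0' -> 2.0
    fprintf('%s: compute capability %s\n', d.Name, d.ComputeCapability);
    supported = cc >= requiredCapability;
else
    disp('No supported GPU detected.');
    supported = false;
end
```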


MathWorks
Softline: www.sl-matlab.ru
matlab.exponenta.ru

MathWorks: www.mathworks.com
E-mail: matlab@softline.ru

Phone: +7 (495) 232 00 23, ext. 0609