Applications
Multicore Software Engineering
Multicore Hardware
Interface definition
Algorithm Engineer: expertise in programming. Tools: C, Single Assignment C (SAC). URL: www.sac-home.org
Concurrency Engineer: expertise in concurrent system design. Tools: S-Net. URL: www.snet-home.org
System Architecture
Concurrency engineer User Algorithm engineer
S-Net
Declarative language
Boxes are atomic or terminal symbols
Inductive modeling of streaming networks
CAL: Constraint Aggregation Language
Describes functional and extra-functional behavior of S-Net boxes
Defines cause-and-effect relations between box input and output; defines constraints
SAC: Single Assignment C
Functional, array-based programming language
Automatic parallelization
Any other language may be used
Constraint: statelessness
Modeling domain
Statistical analysis
Resource Mgmt
Console
attached devices
User
Admin
Device user
SVP: SANE Virtual Processor
Developed by the AETHER project
Targeted by high-level programming languages
Dynamic creation of tasks and sub-tasks (sequential, parallel or pipelined)
Static and dynamic mapping of tasks to resources (at compile time or run-time)
Self-adaptive mapping
Agenda
1. Single Assignment C (SAC)
2. S-Net
3. Statistical Performance Analysis
4. Initial Evaluation Results
5. Conclusion
Synchronization mechanisms: implicit (via streams) and explicit (via SynchroCell)
Type system determines the routing of messages (with flow inheritance as additional glue)
Works on shared-memory and distributed systems
S-Net webpage: http://www.snet-home.org
Coordinator (S-Net)
S-Net Boxes
SaC
ISO C
{A,B,<T>}
{X,<T>}
S-Net Boxes
{A,B,C,<T>}
S-Net Boxes
{X,C,<T>}
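The record types above suggest flow inheritance: a box declared on {A,B,<T>} that receives {A,B,C,<T>} passes the unconsumed field C through to its output. A minimal Python sketch of that idea follows. It is not the S-Net runtime; `apply_box` and the body of `foo` are hypothetical names introduced only for illustration.

```python
def apply_box(box_fn, input_fields, record):
    """Apply a box to a record; fields the box does not consume are inherited."""
    consumed = {k: record[k] for k in input_fields}
    inherited = {k: v for k, v in record.items() if k not in input_fields}
    # Fields produced by the box take precedence over inherited ones.
    return [{**inherited, **out} for out in box_fn(**consumed)]


def foo(A, B, T):
    # Hypothetical box body: consumes A, B and tag T, emits X and the tag.
    return [{"X": A + B, "T": T}]


result = apply_box(foo, ("A", "B", "T"), {"A": 1, "B": 2, "C": "extra", "T": 7})
print(result)
```

The box itself never sees C, which is what lets the same box be reused in networks whose records carry additional fields.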
Network Combinators
Serial Combination: net X connect foo1 .. foo2
box foo1: {A} -> {B}
box foo2: {B} -> {C}
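Serial combination can be pictured as function composition over record streams. The sketch below models boxes as Python generators; the arithmetic inside `foo1` and `foo2` is an assumption made up for the example, and real S-Net boxes would run concurrently rather than lazily in one thread.

```python
def serial(first, second):
    """Build a network that feeds every output record of `first` into `second`."""
    def network(stream):
        return second(first(stream))
    return network


def foo1(stream):
    # Hypothetical box {A} -> {B}
    for rec in stream:
        yield {"B": rec["A"] * 2}


def foo2(stream):
    # Hypothetical box {B} -> {C}
    for rec in stream:
        yield {"C": rec["B"] + 1}


net_x = serial(foo1, foo2)
out = list(net_x([{"A": 1}, {"A": 5}]))
print(out)
```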
Network Combinators
Parallel Replication: net X connect foo ! <T>
[Diagram: parallel replication ! <T> of box foo, input {A,<T>}, output {B}; replicas indexed by <T>.]
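Parallel replication can be sketched as routing each record to a replica selected by the value of its tag, creating replicas on demand. The Python below is a sequential illustration under that assumption; `replicate_by_tag` and `make_foo` are hypothetical helpers, not S-Net API, and an actual deployment would execute the replicas in parallel.

```python
def replicate_by_tag(make_box, tag):
    """Route each record to the replica indexed by the value of `tag`."""
    replicas = {}

    def network(stream):
        for rec in stream:
            key = rec[tag]
            if key not in replicas:
                replicas[key] = make_box()  # replica created on first use
            yield from replicas[key](rec)

    return network, replicas


def make_foo():
    def foo(rec):
        # Hypothetical box {A,<T>} -> {B}
        return [{"B": rec["A"] * 10}]
    return foo


net_x, replicas = replicate_by_tag(make_foo, "T")
out = list(net_x([{"A": 1, "T": 0}, {"A": 2, "T": 1}, {"A": 3, "T": 0}]))
print(out, sorted(replicas))
```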
Explicit Synchronization
The one-shot synchronization cell: [|{A,B},{C,D}|]
Pattern for continuous synchronization: [|{A,B}|] * {A,B}
* {A,B} sync
{A,B} {C,D}
sync
{A,B}
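The one-shot behaviour of a synchronization cell can be sketched as follows, assuming the usual description: the cell stores the first record matching each pattern, emits one merged record once every pattern has been matched, and afterwards forwards records unchanged. The class `SyncCell` is a hypothetical Python model, not the S-Net implementation.

```python
class SyncCell:
    """One-shot synchronization over a fixed list of field patterns."""

    def __init__(self, *patterns):
        self.patterns = [frozenset(p) for p in patterns]
        self.stored = [None] * len(self.patterns)
        self.fired = False

    def push(self, rec):
        """Return the list of records emitted in response to `rec`."""
        if self.fired:
            return [rec]  # after firing, the cell acts as the identity
        for i, pat in enumerate(self.patterns):
            if self.stored[i] is None and pat.issubset(rec):
                self.stored[i] = rec
                break
        else:
            return [rec]  # no free matching pattern: pass through
        if all(s is not None for s in self.stored):
            self.fired = True
            merged = {}
            for s in self.stored:
                merged.update(s)
            return [merged]
        return []  # record is held until the remaining patterns match


cell = SyncCell({"A", "B"}, {"C", "D"})
out1 = cell.push({"A": 1, "B": 2})
out2 = cell.push({"C": 3, "D": 4})
out3 = cell.push({"A": 9, "B": 9})
print(out1, out2, out3)
```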
S-Net Compiler
S-Net Runtime
Box Interface
Box Compiler
S-Net Module
Library / Objects
Executable
The Problem
[Partner: University of St Andrews]
Need to find statistically valid cost information
For S-Net networks composed from sub-networks
For alternative hardware configurations
Base this on sample execution data
Analysis Approach
Build a type-based cost model for S-Net networks
Determine probability distribution functions (pdfs)
For key metrics of Latency, Jitter and Throughput
For sub-networks and networks
For different abstract hardware
Combine them using a type-based analysis
Based on the cost model
Exploiting well-known type-and-effect technology
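As one illustration of what combining pdfs can mean, consider two boxes in series whose latencies are assumed statistically independent: the latency pdf of the pair is then the discrete convolution of the two box pdfs. This is a simplifying assumption for the sketch, not the project's type-based analysis, and the probabilities below are invented.

```python
def convolve(p, q):
    """pdf of X + Y for independent discrete X ~ p and Y ~ q.

    Index i of each list is the probability of a latency of i time units.
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


# Hypothetical latency pdfs (not measured data).
p_a = [0.0, 0.5, 0.5]    # box A takes 1 or 2 units
p_b = [0.0, 0.25, 0.75]  # box B takes 1 or 2 units
p_serial = convolve(p_a, p_b)
print(p_serial)
```

When the latencies are not independent, a plain convolution is no longer valid and the dependence itself has to be modelled.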
[Figure: histograms of measured pdfs. Panels: Box A on Hardware H1, p(A/H1); Box A on Hardware H2; Box B on H1, p(B/H1). Steps: Measure, then Combine pdfs, giving p(B/H1 || A/H1) and p(B/H1 || A/H2).]
Measurements
Obtained from execution on hardware platforms
Key metrics: latency, throughput and jitter
Associated with networks/boxes
Depending on physical hardware and key execution parameters
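The three metrics can be derived from timestamped samples roughly as below. The definitions are assumptions chosen for the sketch: jitter is taken as the population standard deviation of latency, and throughput as records completed per unit of elapsed time. The timestamps are invented.

```python
from collections import Counter
from statistics import mean, pstdev


def empirical_pdf(samples, bin_width):
    """Histogram estimate: map each bin's lower edge to its relative frequency."""
    counts = Counter(int(s // bin_width) for s in samples)
    n = len(samples)
    return {b * bin_width: c / n for b, c in sorted(counts.items())}


def summarize(arrivals, departures):
    """Latency, jitter and throughput from per-record timestamps."""
    latencies = [d - a for a, d in zip(arrivals, departures)]
    elapsed = max(departures) - min(arrivals)
    return {
        "latency_mean": mean(latencies),
        "jitter": pstdev(latencies),  # assumed definition of jitter
        "throughput": len(latencies) / elapsed,
    }, latencies


# Hypothetical timestamps in seconds.
arrivals = [0.0, 1.0, 2.0, 3.0]
departures = [2.0, 4.0, 4.0, 6.0]
stats, latencies = summarize(arrivals, departures)
pdf = empirical_pdf(latencies, bin_width=1.0)
print(stats, pdf)
```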
For given marginal distributions F and G and joint distribution H there is always a copula linking them (unique when F and G are continuous)
Copulas capture all kinds of dependence relations, including independence and partial dependence
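In standard copula notation, and assuming F and G denote the marginal distribution functions and H the joint distribution function (the slide does not define the symbols), the statement corresponds to Sklar's theorem:

```latex
H(x, y) = C\bigl(F(x),\, G(y)\bigr),
\qquad
C(u, v) = u\,v \quad \text{in the special case of independence.}
```

Here C is the copula; any dependence between the two quantities is carried by C alone, while F and G describe each quantity on its own.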
The structured nature of S-Net allows us to deal with pdfs at multiple levels
Boxes, sub-networks, networks can all have pdfs
Applications
Overview
Interventional X-Ray Processing
Speed-up
1.99 1.98
2.91 2.91
5.66 5.55
5.97 10.20
25.24
Conclusion: For the off-the-shelf Intel system, the SAC compiler achieves near-optimal speed-up for a real-world problem by automatically parallelizing the identification classifier calls
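"Near-optimal" can be read against the usual definitions: speed-up is sequential time divided by parallel time, and parallel efficiency is speed-up divided by the number of cores, with 1.0 being ideal. The sketch below uses invented timings, not the measurements behind the slide.

```python
def speedup(t_seq, t_par):
    """Sequential run time divided by parallel run time."""
    return t_seq / t_par


def efficiency(t_seq, t_par, cores):
    """Speed-up per core; 1.0 means ideal linear scaling."""
    return speedup(t_seq, t_par) / cores


# Hypothetical timings in seconds (not from the slides).
t_seq, t_par, cores = 15.0, 4.0, 4
s = speedup(t_seq, t_par)
e = efficiency(t_seq, t_par, cores)
print(s, e)
```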
High-Performance Inline Quality Inspection of Textured Surfaces for Manufacturing Industry, SCCH
Problem Description
In the field of quality inspection of textured surfaces, e.g., for the production of foils or industrial woven fabrics, we have to cope with a high scanning speed of up to 300 m/min, i.e. about 80 MB/sec per camera system, and a complex phenomenology of textures and defects. This requires the application of advanced, cost-intensive image processing and machine learning algorithms, the use of high-performance computational hardware such as GPUs or multi-core systems, and the exploitation of parallelization potential. Analyzing the performance of the whole processing pipeline (image acquisition, preprocessing, feature extraction, registration, defect detection and classification) with standard languages is a resource- and time-intensive challenge.
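The two figures quoted above can be related with simple arithmetic: at 300 m/min and about 80 MB/sec, each camera produces roughly 16 MB per metre of material. The per-frame time budget below additionally assumes a hypothetical 4 MB frame, which is not stated in the source.

```python
speed_m_per_min = 300   # scanning speed from the problem description
rate_mb_per_s = 80      # approximate data rate per camera system

speed_m_per_s = speed_m_per_min / 60            # metres scanned per second
mb_per_metre = rate_mb_per_s / speed_m_per_s    # data produced per metre


def frame_budget_s(frame_mb, rate=rate_mb_per_s):
    """Time available per frame if processing must keep up with acquisition."""
    return frame_mb / rate


# Hypothetical frame size in MB (an assumption for illustration only).
budget = frame_budget_s(4)
print(speed_m_per_s, mb_per_metre, budget)
```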
hpFilter, SCCH
Benchmarking Setup
CPU: DELL Precision 690, 8 Cores, 3.2 GHz
[Figure: Benchmarking CPU. Execution time in seconds for image sizes 1024x1024, 2048x2048 and 4096x4096 px. Series: OpenCV, SAC-SEQ, SAC-MT.]
OpenCV: http://opencv.willowgarage.com/wiki/
hpFilter, SCCH
Benchmarking Setup
NVIDIA GeForce GTX 480 Ultra, 480 Cores, 1.4 GHz
[Figure: Benchmarking GPU. Vertical axis from 0 to 1 (unit not given); horizontal axis: data dimension 1024x1024, 2048x2048 and 4096x4096 px.]
Series: SAC-CUDA, CUDA-manual
hpFilter, SCCH
Benchmarking Setup
UNIBENCH, 1-48 Cores, 2.2 GHz
px 2048 x 2048
THANK YOU!
The information in this document is proprietary to the following ADVANCE StatArch consortium members funded by means of European Union within the 7th Framework Program: University of Hertfordshire (HERTS), University of St Andrews (USTAN), University of Twente (TWENTE), Universiteit van Amsterdam (UvA), Philips Medical Systems Nederland BV (PHILIPS), Technion: Israeli Institute of Technology (TECHNION), University of California at Irvine (UCI), SAP AG (SAP), BioID GmbH (BioID) and Software Competence Center Hagenberg (SCCH). The information in this document is provided "as is", and no guarantee or warranty is given that the information is fit for any particular purpose. The above referenced consortium members shall have no liability for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials subject to any liability which is mandatory due to applicable law. Copyright 2011 by HERTS, USTAN, TWENTE, UvA, PHILIPS, TECHNION, UCI, SAP, BioID and SCCH. All rights reserved.