CUDA Fortran for Scientists and Engineers: Best Practices for Efficient CUDA Fortran Programming
Ebook, 486 pages


About this ebook

CUDA Fortran for Scientists and Engineers shows how high-performance application developers can leverage the power of GPUs using Fortran, the familiar language of scientific computing and supercomputer performance benchmarking. The authors presume no prior parallel computing experience, and cover the basics along with best practices for efficient GPU computing using CUDA Fortran.

To help you add CUDA Fortran to existing Fortran codes, the book explains how to understand the target GPU architecture, identify computationally intensive parts of the code, and modify the code to manage the data and parallelism and optimize performance. All of this is done in Fortran, without having to rewrite in another language. Each concept is illustrated with actual examples so you can immediately evaluate the performance of your code in comparison.

  • Leverage the power of GPU computing with PGI’s CUDA Fortran compiler
  • Gain insights from members of the CUDA Fortran language development team
  • Includes multi-GPU programming in CUDA Fortran, covering both peer-to-peer and message passing interface (MPI) approaches
  • Includes full source code for all the examples and several case studies
  • Download source code and slides from the book's companion website
Language: English
Release date: Sep 11, 2013
ISBN: 9780124169722
Author

Gregory Ruetsch

Greg Ruetsch is a Senior Applied Engineer at NVIDIA, where he works on CUDA Fortran and performance optimization of HPC codes. He holds a Bachelor’s degree in mechanical and aerospace engineering from Rutgers University and a Ph.D. in applied mathematics from Brown University. Prior to joining NVIDIA, he held research positions at Stanford University’s Center for Turbulence Research and Sun Microsystems Laboratories.

    Book preview


    CUDA Fortran for Scientists and Engineers

    Best Practices for Efficient CUDA Fortran Programming

    First Edition

    Massimiliano Fatica

    Gregory Ruetsch

    NVIDIA Corporation, Santa Clara, CA

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    Acknowledgments

    Preface

    Companion Site

    Part I: CUDA Fortran Programming

    Chapter 1. Introduction

    Abstract

    1.1 A brief history of GPU computing

    1.2 Parallel computation

    1.3 Basic concepts

    1.4 Determining CUDA hardware features and limits

    1.5 Error handling

    1.6 Compiling CUDA Fortran code

    Chapter 2. Performance Measurement and Metrics

    Abstract

    2.1 Measuring kernel execution time

    2.2 Instruction, bandwidth, and latency bound kernels

    2.3 Memory bandwidth

    Chapter 3. Optimization

    Abstract

    3.1 Transfers between host and device

    3.2 Device memory

    3.3 On-chip memory

    3.4 Memory optimization example: matrix transpose

    3.5 Execution configuration

    3.6 Instruction optimization

    3.7 Kernel loop directives

    Chapter 4. Multi-GPU Programming

    Abstract

    4.1 CUDA multi-GPU features

    4.2 Multi-GPU Programming with MPI

    Part II: Case Studies

    Chapter 5. Monte Carlo Method

    Abstract

    5.1 CURAND

    5.2 Computing π with CUF kernels

    5.3 Computing π with reduction kernels

    5.4 Accuracy of summation

    5.5 Option pricing

    Chapter 6. Finite Difference Method

    Abstract

    6.1 Nine-Point 1D finite difference stencil

    6.2 2D Laplace equation

    Chapter 7. Applications of Fast Fourier Transform

    Abstract

    7.1 CUFFT

    7.2 Spectral derivatives

    7.3 Convolution

    7.4 Poisson Solver

    Part III: Appendices

    Appendix A. Tesla Specifications

    Appendix B. System and Environment Management

    B.1 Environment variables

    B.2 nvidia-smi System Management Interface

    Appendix C. Calling CUDA C from CUDA Fortran

    C.1 Calling CUDA C libraries

    C.2 Calling User-Written CUDA C Code

    Appendix D. Source Code

    D.1 Texture memory

    D.2 Matrix transpose

    D.3 Thread- and instruction-level parallelism

    D.4 Multi-GPU programming

    D.5 Finite difference code

    D.6 Spectral Poisson Solver

    References

    Index

    Copyright

    Acquiring Editor: Todd Green

    Development Editor: Lindsay Lawrence

    Project Manager: Punithavathy Govindaradjane

    Designer: Matthew Limbert

    Morgan Kaufmann is an imprint of Elsevier

    225 Wyman Street, Waltham, MA 02451, USA

    Copyright © 2014 Gregory Ruetsch/NVIDIA Corporation and Massimiliano Fatica/NVIDIA Corporation. Published by Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    Library of Congress Cataloging-in-Publication Data

    Ruetsch, Gregory.

     CUDA Fortran for scientists and engineers : best practices for efficient CUDA Fortran programming / Gregory Ruetsch, Massimiliano Fatica.

       pages cm

     Includes bibliographical references and index.

     ISBN 978-0-12-416970-8 (alk. paper)

     1. FORTRAN (Computer program language) I. Fatica, Massimiliano. II. Title. III. Title: Best practices for efficient CUDA Fortran programming.

     QA76.73.F25R833 2013

     005.13’1--dc23

            2013022226

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    ISBN: 978-0-12-416970-8

    Printed and bound in the United States of America

    14 15 16 17 18 10 9 8 7 6 5 4 3 2 1

    For information on all MK publications visit our website at www.mkp.com

    Dedication

    To Fortran programmers, who know a good thing when they see it.

    Acknowledgments

    Writing this book has been an enjoyable and rewarding experience for us, largely due to the interactions with the people who helped shape the book into the form you have before you. There are many people who have helped with this book, both directly and indirectly, and at the risk of leaving someone out we would like to thank the following people for their assistance.

    Of course, a book on CUDA Fortran would not be possible without CUDA Fortran itself, and we would like to thank The Portland Group (PGI), especially Brent Leback and Michael Wolfe, for literally giving us something to write about. Working with PGI on CUDA Fortran has been a delightful experience.

    The authors often reflect on how computations used in their theses, which required many, many hours on large-vector machines of the day, can now run on an NVIDIA graphics processing unit (GPU) in less time than it takes to get a cup of coffee. We would like to thank those at NVIDIA who helped enable this technological breakthrough. We would like to thank past and present members of the CUDA software team, especially Philip Cuadra, Mark Hairgrove, Stephen Jones, Tim Murray, and Joel Sherpelz for answering the many questions we asked them.

    Much of the material in this book grew out of collaborative efforts in performance-tuning applications. We would like to thank our collaborators in such efforts, including Norbert Juffa, Patrick Legresley, Paulius Micikevicius, and Everett Phillips.

    Many people reviewed the manuscript for this book at various stages in its development, and we would like to thank Roberto Gomperts, Mark Harris, Norbert Juffa, Brent Leback, and Everett Phillips for their comments and suggestions.

    We would like to thank Ian Buck for allowing us to spend time at work on this endeavor, and we would like to thank our families for their understanding while we also worked at home.

    Finally, we would like to thank all of our teachers. They enabled us to write this book, and we hope in some way that by doing so, we have continued the chain of helping others.

    Preface

    This document is intended for scientists and engineers who develop or maintain computer simulations and applications in Fortran and who would like to harness the parallel processing power of graphics processing units (GPUs) to accelerate their code. The goal here is to provide the reader with the fundamentals of GPU programming using CUDA Fortran as well as some typical examples, without having the task of developing CUDA Fortran code become an end in itself.

    The CUDA architecture was developed by NVIDIA to allow use of the GPU for general-purpose computing without requiring the programmer to have a background in graphics. There are many ways to access the CUDA architecture from a programmer’s perspective, including through C/C++ from CUDA C or through Fortran using The Portland Group’s (PGI’s) CUDA Fortran. This document pertains to the latter approach. PGI’s CUDA Fortran should be distinguished from the PGI Accelerator and OpenACC Fortran interfaces to the CUDA architecture, which are directive-based approaches to using the GPU. CUDA Fortran is simply the Fortran analog to CUDA C.

    The reader of this book should be familiar with Fortran 90 concepts, such as modules, derived types, and array operations. For those familiar with earlier versions of Fortran but looking to upgrade to a more recent version, there are several excellent books that cover this material (e.g., Metcalf, 2011). Some features introduced in Fortran 2003 are used in this book, but these concepts are explained in detail. Although this book does assume some familiarity with Fortran 90, no experience with parallel programming (on the GPU or otherwise) is required. Part of the appeal of parallel programming on GPUs using CUDA is that the programming model is simple and novices can get parallel code up and running very quickly.

    Often one comes to CUDA Fortran with the goal of porting existing, sometimes rather lengthy, Fortran code to code that leverages the GPU. Because CUDA is a hybrid programming model, where both GPU and CPU are utilized, CPU code can be incrementally ported to the GPU. CUDA Fortran is also used by those porting applications to GPUs mainly using the directive-based OpenACC approach, but who want to improve the performance of a few critical sections of code by hand-coding CUDA Fortran. Both OpenACC and CUDA Fortran can coexist in the same code.

    This book is divided into two main parts. The first part is a tutorial on CUDA Fortran programming, from the basics of writing CUDA Fortran code to some tips on optimization. The second part is a collection of case studies that demonstrate how the principles in the first part are applied to real-world examples.

    This book makes use of the PGI 13.x compilers, which can be obtained from http://pgroup.com. Although the examples can be compiled and run on any supported operating system in a variety of development environments, the examples included here are compiled from the command line as one would do under Linux or Mac OS X.

    Companion Site

    Supplementary materials for readers can be downloaded from Elsevier: http://store.elsevier.com/product.jsp?isbn=9780124169708.

    Part I: CUDA Fortran Programming

    Outline

    Chapter 1 Introduction

    Chapter 2 Performance Measurement and Metrics

    Chapter 3 Optimization

    Chapter 4 Multi-GPU Programming

    Chapter 1

    Introduction

    Abstract

    After a short discussion of the history of parallel computation on graphics processing units, or GPUs, this chapter goes through a sequence of simple examples that illustrate the fundamental aspects of computation on GPUs using CUDA Fortran. The hybrid nature of CUDA Fortran programming is illustrated, which contains both host code that is run on the CPU and device code that is executed on the GPU. Ways to determine hardware features and capabilities from within CUDA Fortran code are presented, as are error handling, compilation of CUDA Fortran code, and system management.

    Keywords

    Data parallelism; Hybrid computation; Host and device code; Kernel; Execution configuration; Compute capability; Error handling; Compilation; Device management

    1.1 A brief history of GPU computing

    Parallel computing has been around in one form or another for many decades. In the early stages it was generally confined to practitioners who had access to large and expensive machines. Today, things are very different. Almost all consumer desktop and laptop computers have central processing units, or CPUs, with multiple cores. Even most processors in cell phones and tablets have multiple cores. The principal reason for the nearly ubiquitous presence of multiple cores in CPUs is the inability of CPU manufacturers to increase performance in single-core designs by boosting the clock speed. As a result, since about 2005 CPU designs have scaled out to multiple cores rather than scaled up to higher clock rates. Although CPUs are available with a few to tens of cores, this amount of parallelism pales in comparison to the number of cores in a graphics processing unit (GPU). For example, the NVIDIA Tesla® K20X contains 2688 cores. GPUs were highly parallel architectures from their beginning, in the mid-1990s, since graphics processing is an inherently parallel task.
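
    Core counts such as these vary from device to device, and, as Section 1.4 describes, hardware characteristics can be queried at runtime from CUDA Fortran. The following minimal sketch (our own illustration, not one of the book's examples) uses cudaGetDeviceProperties from the cudafor module to report a few properties of device 0:

        program device_query_sketch
          use cudafor
          implicit none
          type (cudaDeviceProp) :: prop
          integer :: istat
          ! query the properties of device 0
          istat = cudaGetDeviceProperties(prop, 0)
          write(*,"(' Device name: ', a)") trim(prop%name)
          write(*,"(' Multiprocessors: ', i0)") prop%multiProcessorCount
          write(*,"(' Compute capability: ', i0, '.', i0)") prop%major, prop%minor
        end program device_query_sketch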

    The use of GPUs for general-purpose computing, often referred to as GPGPU, was initially a challenging endeavor. One had to program to the graphics application programming interface (API), which proved to be very restrictive in the types of algorithms that could be mapped to the GPU. Even when such a mapping was possible, the programming required to make this happen was difficult and not intuitive for scientists and engineers outside the computer graphics vocation. As such, adoption of the GPU for scientific and engineering computations was slow.

    Things changed for GPU computing with the advent of NVIDIA’s CUDA® architecture in 2007. The CUDA architecture included both hardware components on NVIDIA’s GPU and a software programming environment that eliminated the barriers to adoption that plagued GPGPU. Since CUDA’s first appearance in 2007, its adoption has been tremendous, to the point where, in November 2010, three of the top five supercomputers in the Top 500 list used GPUs. In the November 2012 Top 500 list, the fastest computer in the world was also GPU-powered. One of the reasons for this very fast adoption of CUDA is that the programming model was very simple. CUDA C, the first interface to the CUDA architecture, is essentially C with a few extensions that can offload portions of an algorithm to run on the GPU. It is a hybrid approach where both CPU and GPU are used, so porting computations to the GPU can be performed incrementally.

    In late 2009, a joint effort between The Portland Group® (PGI®) and NVIDIA led to the CUDA Fortran compiler. Just as CUDA C is C with extensions, CUDA Fortran is essentially Fortran 90 with a few extensions that allow users to leverage the power of GPUs in their computations. Many books, articles, and other documents have been written to aid in the development of efficient CUDA C applications (e.g., Sanders and Kandrot, 2011; Kirk and Hwu, 2012; Wilt, 2013). Because it is newer, CUDA Fortran has relatively fewer aids for code development. Much of the material for writing efficient CUDA C translates easily to CUDA Fortran, since the underlying architecture is the same, but there is still a need for material that addresses how to write efficient code in CUDA Fortran. There are a couple of reasons for this. First, though CUDA C and CUDA Fortran are similar, there are some differences that will affect how code is written. This is not surprising, since CPU code written in C and Fortran will typically take on a different character as projects grow. Also, there are some features in CUDA C that are not present in CUDA Fortran, such as certain aspects of textures. Conversely, there are some features in CUDA Fortran, such as the device variable attribute used to denote data that resides on the GPU, that are not present in CUDA C.
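
    As a brief illustration of the device variable attribute mentioned above, the following minimal sketch (our own, not one of the book's examples; the array names a and a_d are placeholders) declares an array that resides in GPU memory and moves data between host and device with simple assignment statements:

        program device_attribute_sketch
          use cudafor
          implicit none
          integer, parameter :: n = 1024
          real :: a(n)            ! host array
          real, device :: a_d(n)  ! device array, resides in GPU memory
          a = 1.0
          a_d = a                 ! assignment performs the host-to-device transfer
          a = 0.0
          a = a_d                 ! assignment performs the device-to-host transfer
          print *, 'max value after round trip: ', maxval(a)
        end program device_attribute_sketch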

    This book is written for those who want to use parallel computation as a tool in getting other work done rather than as an end in itself. The aim is to give the reader a basic set of skills necessary for them to write reasonably optimized CUDA Fortran code that takes advantage of the NVIDIA® computing hardware. The reason for taking this approach rather than attempting to teach how to extract every last ounce of performance from the hardware is the assumption that those using CUDA Fortran do so as a means rather than an end. Such users typically value clear and maintainable code that is simple to write and performs reasonably well across many generations of CUDA-enabled hardware and CUDA Fortran software.

    But where is the line drawn in terms of the effort-performance tradeoff? In the end it is up to the developer to decide how much effort to put into optimizing code. In making this decision, we need to know what type of payoff we can expect when eliminating various bottlenecks and what effort is involved in doing so. One goal of this book is to help the reader develop an intuition needed to make such a return-on-investment assessment. To achieve this end, we discuss bottlenecks encountered in writing common algorithms in science and engineering applications in CUDA Fortran. Multiple workarounds are presented when possible, along with the performance impact of each optimization effort.

    1.2 Parallel computation

    Before jumping into writing CUDA Fortran code, we should say a few words about where CUDA fits in with other types of parallel programming models. Familiarity with and an understanding of other parallel programming models is not a prerequisite for this book, but for readers who do have some parallel programming experience, this section might be helpful in categorizing CUDA.

    We have already mentioned that CUDA is a hybrid computing model, where both the CPU and GPU are used in an application. This is advantageous for development because sections of an existing CPU code can be ported to the GPU incrementally. It is possible to overlap computation on the CPU with computation on the GPU, so this is one aspect of parallelism.

    A far greater degree of parallelism occurs within the GPU itself. Subroutines that run on the GPU are executed by many threads in parallel. Although all threads execute the same code, these threads typically operate on different data. This data parallelism is a fine-grained parallelism, where it is most efficient to have adjacent threads operate on adjacent data, such as elements of an array. This model of parallelism is very different from a model like Message Passing Interface, commonly known as MPI, which is a coarse-grained model. In MPI, data are typically divided into large segments or partitions, and each MPI process performs calculations on an entire data partition.

    A few characteristics of the CUDA programming model are very different from CPU-based parallel programming models. One difference is that there is very little overhead associated with creating GPU threads. In addition to fast thread creation, context switches, where threads change from active to inactive and vice versa, are very fast for GPU threads compared to CPU threads. The reason context switching is essentially instantaneous on the GPU is that the GPU does not have to store state, as the CPU does when switching threads between being active and inactive. As a result of this fast context switching, it is advantageous to heavily oversubscribe GPU cores—that is, have many more resident threads than GPU cores so that memory latencies can be hidden. It is not uncommon to have the number of resident threads on a GPU an order of magnitude larger than the number of cores on the GPU. In the CUDA programming model, we essentially write a serial code that is executed by many GPU threads in parallel. Each thread executing this code has a means of identifying itself in order to operate on different data, but the code that CUDA threads execute is very similar to what we would write for serial CPU code. On the other hand, the code of many parallel CPU programming models differs greatly from serial CPU code. We will revisit each of
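
    To make thread self-identification concrete, here is a minimal sketch in the spirit of the introductory examples to come (our own illustration; the kernel name increment and the module name m are placeholders). Each thread forms a global index from its block and thread IDs and updates a single array element, and the host launches the kernel with an execution configuration specified in triple chevrons:

        module m
        contains
          attributes(global) subroutine increment(a, b)
            implicit none
            integer, intent(inout) :: a(:)
            integer, value :: b
            integer :: i
            ! each thread computes a unique global index from its block and thread IDs
            i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
            if (i <= size(a)) a(i) = a(i) + b
          end subroutine increment
        end module m

        program thread_id_sketch
          use cudafor
          use m
          implicit none
          integer, parameter :: n = 1024, tPB = 256
          integer :: a(n)
          integer, device :: a_d(n)
          a = 1
          a_d = a
          ! launch enough blocks of tPB threads each to cover all n elements
          call increment<<<(n + tPB - 1)/tPB, tPB>>>(a_d, 3)
          a = a_d
          if (all(a == 4)) print *, 'Test passed'
        end program thread_id_sketch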
